Recurse Center - Batch 2 - Cycle 20241016-20241018 - Optimization, Propagation, Variation, Generation, Discrimination
Optimization, Propagation, Variation, Generation, Discrimination
Day 1
I'm still super happy about what I learned about CNNs and ResNets last week!
Here's another short explainer on how ResNets work.
I finished the optimization section in ARENA and started moving into backpropagation.
I learned a lot of new things, including:
On Optimization
- This page, A Visual Explanation of Gradient Descent Methods (Momentum, AdaGrad, RMSProp, Adam), gave a lot of lovely visual examples of how these optimizers work and how each one improves on the previous one.
- This video, Optimization for Deep Learning (Momentum, RMSprop, AdaGrad, Adam), also looked good.
- How to use Weights and Biases to run a sweep over hyperparameters and search for the values that maximize model accuracy.
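To make the differences between these optimizers concrete, here is a minimal pure-Python sketch of the four update rules, applied to a toy one-dimensional loss. The loss f(x) = (x - 3)^2, the function names, and all hyperparameter values are assumptions chosen for illustration; they are not taken from ARENA or from the linked explainers.

```python
import math

def momentum(grad_fn, x, lr=0.1, beta=0.9, steps=300):
    """SGD with momentum: step along a running velocity of past gradients."""
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad_fn(x)
        x -= lr * v
    return x

def adagrad(grad_fn, x, lr=0.5, eps=1e-8, steps=300):
    """AdaGrad: divide the step by the root of the summed squared gradients."""
    s = 0.0
    for _ in range(steps):
        g = grad_fn(x)
        s += g * g
        x -= lr * g / (math.sqrt(s) + eps)
    return x

def rmsprop(grad_fn, x, lr=0.05, decay=0.9, eps=1e-8, steps=500):
    """RMSProp: like AdaGrad, but with a decaying average of squared gradients."""
    s = 0.0
    for _ in range(steps):
        g = grad_fn(x)
        s = decay * s + (1 - decay) * g * g
        x -= lr * g / (math.sqrt(s) + eps)
    return x

def adam(grad_fn, x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: momentum plus RMSProp-style scaling, with bias correction."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g       # first moment (momentum)
        v = beta2 * v + (1 - beta2) * g * g   # second moment (scale)
        m_hat = m / (1 - beta1 ** t)          # correct the bias toward zero
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

def grad(x):
    """Gradient of the toy loss f(x) = (x - 3)^2, minimized at x = 3."""
    return 2.0 * (x - 3.0)

for name, optimizer in [("momentum", momentum), ("adagrad", adagrad),
                        ("rmsprop", rmsprop), ("adam", adam)]:
    print(name, optimizer(grad, 0.0))
```

All four should end up near x = 3. With a constant learning rate, RMSProp tends to hover in a small band around the minimum rather than settling exactly on it, which is one reason learning-rate schedules are often paired with it.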
I also spent a bit more time working on Heap, and realized that I need to learn more about Ansible and how it works in order to be better at making improvements/updates to Heap in the future.
On Backpropagation
After finishing the backpropagation section in ARENA, I started pairing with a fellow Recurser on generating a synthetic speech dataset that contains speakers saying the numbers 0-9, as a way to start building a simple audio speech classifier from scratch.
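As a reminder of the core idea behind backpropagation, here is a minimal scalar reverse-mode autodiff sketch. The `Value` class and its methods are hypothetical names made up for this illustration; this is not ARENA's implementation, only a small instance of the chain rule applied over a computation graph.

```python
class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents      # the Values this one was computed from
        self.grad_fns = grad_fns    # one function per parent: upstream grad -> parent grad

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a * b)/da = b and d(a * b)/db = a
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Build a topological order so each node is processed only after
        # every node that depends on it.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, grad_fn in zip(node.parents, node.grad_fns):
                parent.grad += grad_fn(node.grad)  # chain rule, summed over paths


x = Value(2.0)
y = Value(3.0)
z = x * y + x        # dz/dx = y + 1 = 4, dz/dy = x = 2
z.backward()
print(z.data, x.grad, y.grad)  # 8.0 4.0 2.0
```

Because `x` feeds into `z` along two paths, its gradient is the sum of both contributions, which is why `backward` accumulates with `+=` rather than assigning.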
Some resources:
Day 2
On VAEs and GANs
Today we started learning to build and train Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs).
I started on the GAN material today and will have to finish it up on the first day of my next cycle.
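For reference, the objectives behind these two model families can be sketched in a few lines of plain Python. The function names below are assumptions for illustration, and a real implementation would use a tensor library and neural networks; this only shows the VAE reparameterization trick and KL term, plus the standard GAN binary cross-entropy losses (with the commonly used non-saturating generator loss).

```python
import math
import random

def reparameterize(mu, log_var, rng=random):
    """VAE sampling step: z = mu + sigma * eps, with eps ~ N(0, 1).

    Writing the sample this way keeps z a differentiable function of
    mu and log_var, with the randomness isolated in eps.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

def bce(p, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability p."""
    return -(target * math.log(p + eps) + (1.0 - target) * math.log(1.0 - p + eps))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(real) close to 1 and D(fake) close to 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def generator_loss(d_fake):
    """Non-saturating generator loss: the generator wants D(fake) close to 1."""
    return bce(d_fake, 1.0)

# The KL term vanishes when the encoder outputs exactly the prior.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
# A generator that fools the discriminator more often gets a lower loss.
print(generator_loss(0.9) < generator_loss(0.1))  # True
```

The VAE trains one network pair against a reconstruction loss plus this KL term, while the GAN alternates between minimizing the discriminator loss and the generator loss, which is what makes GAN training a two-player game.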
Day 3
- Today I was mostly offline, tending to non-Recurse-related things.
Things for next cycle
I have a small bit of work on GANs to finish at the top of the week, which I'm looking forward to getting past, because...
Next we will learn to build Transformers and start getting into mechanistic interpretability! I've been preloading my brain with a lot of resources on the topic, linked below. Really looking forward to getting my mind blown next week :)
On AI Safety
On Mechanistic Interpretability
- Concrete Steps to Get Started in Transformer Mechanistic Interpretability
- Progress measures for grokking via mechanistic interpretability
- Accompanying website to Progress Measures for Grokking via Mechanistic Interpretability
- Mechanistic Interpretability - NEEL NANDA (DeepMind) on Machine Learning Street Talk
- Reading AI's Mind - Mechanistic Interpretability Explained [Anthropic Research]