Recurse Center - Batch 2 - Cycle 20240930-20241002 - Submersion


Submersion

This cycle was a bit intense, and I feel less like I'm immersing myself in ML/AI studies and more like I'm fully submerging myself in them, like diving into a deep, wide ocean with no intention of swimming back up to the surface. A lot happened this cycle, but strangely I don't have too much to write about it for the time being. I spent the first two days of this cycle moving through online videos on linear algebra, and all of my notes are in my notebook for now, which meant I did very little coding. But on the last day of this cycle, I did get around to some coding, through the lens of learning about einops. More details below...

Day 1

I watched 3Blue1Brown's Essence of Linear Algebra videos and took copious notes.

Day 2

I shared my Speech Emotion Recognition notebook in the Audio Hang group, and got into a conversation about audio signal feature extraction and clustering, especially around mel-frequency cepstral coefficients. We also talked a lot about concatenative synthesis and granular synthesis, and another Recurser showed off one that they built from scratch in Rust and JS :)
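As a quick reference for myself, here's roughly what MFCC extraction looks like with librosa. This is a minimal sketch, assuming a mono audio file; the file name and the choice of 13 coefficients are just placeholders, not what my notebook actually uses.

    import librosa
    import numpy as np

    # Load an audio file as a mono waveform (file name is a placeholder)
    y, sr = librosa.load("audio.wav", sr=None, mono=True)

    # Extract 13 mel-frequency cepstral coefficients per frame
    mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    print(mfccs.shape)  # (13, number_of_frames)

    # A simple per-clip summary that can feed into clustering:
    # the mean MFCC vector across all frames
    mfcc_mean = np.mean(mfccs, axis=1)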

Later on in the day I finished all of the Essence of Linear Algebra videos :) As I get deeper into Linear Algebra, I'd love to check out this book: Linear Algebra Done Right.

Day 3

For the AI ML Paper Cuts study group, we read the now-classic Attention Is All You Need, which popularized both self-attention mechanisms and transformer models. I watched a couple of videos to help complement the paper, including Illustrated Guide to Transformers Neural Network: A step by step explanation and Yannic Kilcher's video on Attention Is All You Need. A lot of it was over my head, but I was happy to expose myself to it this early, and at the very least I could follow the paper and understand its implications. I'll be excited to revisit it once I have a firmer understanding of the fundamentals that led up to it. In particular, there was some discussion on understanding some of the transformer's sublayers in a bit more depth.
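To help the core idea stick, here's a minimal sketch of scaled dot-product attention as I currently understand it from the paper: plain NumPy, single head, no masking, and the array names are my own rather than anything from the paper's reference code.

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def scaled_dot_product_attention(Q, K, V):
        # Q, K, V: (sequence_length, d_k) matrices of queries, keys, values
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)     # similarity of each query to each key
        weights = softmax(scores, axis=-1)  # attention weights sum to 1 per query
        return weights @ V                  # weighted sum of the values

    # Toy example: 4 tokens with 8-dimensional embeddings
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))
    out = scaled_dot_product_attention(x, x, x)  # self-attention: Q = K = V = x
    print(out.shape)  # (4, 8)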

Someone on the call mentioned checking out 3Blue1Brown's Neural Networks videos to get the basics in an intuitive, visual way. This annotated version of the Transformer paper also seems like a great resource, especially for its visualizations of positional encodings.
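Since the positional encodings were the part I most wanted to visualize, here's a small sketch of the sinusoidal encodings from the paper. The function name and the example shapes are my own choices, not anything from the annotated version.

    import numpy as np

    def sinusoidal_positional_encoding(seq_len, d_model):
        # PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
        # PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
        positions = np.arange(seq_len)[:, None]                    # (seq_len, 1)
        div_terms = 10000 ** (np.arange(0, d_model, 2) / d_model)  # (d_model / 2,)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(positions / div_terms)
        pe[:, 1::2] = np.cos(positions / div_terms)
        return pe

    pe = sinusoidal_positional_encoding(seq_len=50, d_model=16)
    print(pe.shape)  # (50, 16)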

Next week we are scheduled to read Learning Transferable Visual Models From Natural Language Supervision, which introduces Contrastive Language-Image Pre-training. I'm super interested in learning about this idea because it led to Contrastive Language-Audio Pre-training, introduced in the analogous paper CLAP: Learning Audio Concepts From Natural Language Supervision, which makes text-to-audio systems possible. So while I have a lot of things going on already, it would be nice to try to keep up with that paper because of its implications for other areas I'm currently interested in with respect to audio machine learning.
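From what I understand so far, the contrastive part boils down to pulling matched (image, text) or (audio, text) embedding pairs together and pushing mismatched pairs apart. Here's my own rough NumPy sketch of that symmetric cross-entropy objective, not the papers' actual code, with made-up batch and embedding sizes.

    import numpy as np

    def clip_style_loss(image_embeds, text_embeds, temperature=0.07):
        # Both inputs: (batch_size, embed_dim), already projected to a shared space
        # Normalize so the dot product is cosine similarity
        image_embeds = image_embeds / np.linalg.norm(image_embeds, axis=1, keepdims=True)
        text_embeds = text_embeds / np.linalg.norm(text_embeds, axis=1, keepdims=True)

        logits = image_embeds @ text_embeds.T / temperature  # (batch, batch) similarities
        labels = np.arange(len(logits))                      # matching pairs sit on the diagonal

        def cross_entropy(logits, labels):
            log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
            return -log_probs[np.arange(len(labels)), labels].mean()

        # Symmetric loss: image-to-text and text-to-image directions
        return (cross_entropy(logits, labels) + cross_entropy(logits.T, labels)) / 2

    rng = np.random.default_rng(0)
    loss = clip_style_loss(rng.normal(size=(8, 32)), rng.normal(size=(8, 32)))
    print(loss)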

In the afternoon I worked through an ARENA pre-requisite on einops, a Python library for tensor manipulation that prioritizes readability and explicitness. It is based on Einstein notation. I worked through the einops basics and started to develop an intuitive feel for how the library works compared to NumPy or PyTorch. This intro video to einops was also very helpful for getting a fuller sense of its usefulness. Some other resources I came across include this article on einops in 30 seconds (fast!), this Reddit post on how to read and understand einops expressions, and two resources sharing a similar, catchy title: Einsum is All you Need - Einstein Summation in Deep Learning and Einsum Is All You Need: NumPy, PyTorch and TensorFlow.
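For my own notes, this is the kind of comparison that made einops click for me: a small sketch with made-up image shapes, putting rearrange and reduce next to their NumPy equivalents.

    import numpy as np
    from einops import rearrange, reduce

    # A fake batch of 16 RGB images, channels-last: (batch, height, width, channels)
    images = np.random.rand(16, 32, 32, 3)

    # Move channels first: NumPy transpose vs. a readable einops pattern
    channels_first_np = images.transpose(0, 3, 1, 2)
    channels_first = rearrange(images, "b h w c -> b c h w")
    assert np.array_equal(channels_first_np, channels_first)

    # Flatten each image into one long vector: (16, 32*32*3)
    flat = rearrange(images, "b h w c -> b (h w c)")

    # Global average pooling over the spatial dimensions: (16, 3)
    pooled = reduce(images, "b h w c -> b c", "mean")
    print(flat.shape, pooled.shape)  # (16, 3072) (16, 3)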

As a way to encapsulate all of the ML/AI self-study I'll be doing (and have been doing), I created a monorepo to hold all of the topics I'll be learning: a one-stop shop for all things related to ML/AI.

ML/AI Self-Study Repo

Things for next cycle