Recurse Center - Batch 2 - Cycle 20241020-20241022 - Transformation


Transformation

This cycle involved a lot of pairing on transformers. For example, we implemented a Transformer block like the one diagrammed above.

Day 1

I was offline doing non-RC things :)

Day 2

Building and Training Transformers

I paired today on building an implementation of GPT-2 from scratch!

Stanford CS25: V2 I Introduction to Transformers w/ Andrej Karpathy

Gaussian Error Linear Units

LayerNorm
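To make the pieces named in these links concrete, here is a minimal NumPy sketch of GELU, LayerNorm, and a pre-LayerNorm Transformer block. It is not the code from the pairing session. The single attention head, the missing bias terms, and the helper names (`init_params`, `transformer_block`, the parameter keys) are simplifying assumptions for illustration; GPT-2 itself uses multi-head attention with biases.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU (the form used in the original GPT-2 code)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, gamma, beta, eps=1e-5):
    # normalize each row (token) over the feature axis, then scale and shift
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def softmax(x):
    # numerically stable softmax over the last axis
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def causal_self_attention(x, w_qkv, w_out):
    # x has shape (T, d): T tokens, d features. Single head, no biases (assumption).
    T, d = x.shape
    q, k, v = np.split(x @ w_qkv, 3, axis=-1)
    scores = q @ k.T / np.sqrt(d)
    # mask out future positions so token t only attends to tokens <= t
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v @ w_out

def transformer_block(x, p):
    # pre-LN layout: normalize, apply sublayer, add the residual
    x = x + causal_self_attention(layer_norm(x, p["g1"], p["b1"]), p["w_qkv"], p["w_out"])
    hidden = gelu(layer_norm(x, p["g2"], p["b2"]) @ p["w_fc"])
    return x + hidden @ p["w_proj"]

def init_params(d, rng, scale=0.02):
    # hypothetical helper: small random weights, identity LayerNorm parameters
    return {
        "g1": np.ones(d), "b1": np.zeros(d),
        "g2": np.ones(d), "b2": np.zeros(d),
        "w_qkv": rng.normal(0.0, scale, (d, 3 * d)),
        "w_out": rng.normal(0.0, scale, (d, d)),
        "w_fc": rng.normal(0.0, scale, (d, 4 * d)),
        "w_proj": rng.normal(0.0, scale, (4 * d, d)),
    }

rng = np.random.default_rng(0)
params = init_params(8, rng)
tokens = rng.normal(size=(5, 8))   # 5 tokens, 8 features each
out = transformer_block(tokens, params)
print(out.shape)
```

Because of the causal mask, changing the last token leaves the outputs for all earlier tokens unchanged, which is a handy sanity check when writing a block like this by hand.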

Day 3

Following up on yesterday, I did more pairing on transformers, this time learning different methods for sampling from a pretrained GPT-2.

How to generate text: using different decoding methods for language generation with Transformers
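As a rough illustration of what "different decoding methods" means, here is a small NumPy sketch of greedy decoding, temperature sampling, top-k, and top-p (nucleus) filtering applied to a single vector of next-token logits. The logits are made up for the example and the function names are my own; in practice the logits would come from the model's final layer.

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over a 1-D logits vector
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def greedy(logits):
    # always pick the single most likely token
    return int(np.argmax(logits))

def sample_temperature(logits, temperature, rng):
    # temperature < 1 sharpens the distribution, > 1 flattens it
    probs = softmax(logits / temperature)
    return int(rng.choice(len(probs), p=probs))

def top_k_filter(logits, k):
    # keep the k highest logits, set the rest to -inf (probability zero)
    logits = np.asarray(logits, dtype=float)
    keep = np.argsort(logits)[-k:]
    out = np.full_like(logits, -np.inf)
    out[keep] = logits[keep]
    return out

def top_p_filter(logits, p):
    # keep the smallest set of tokens whose cumulative probability reaches p
    logits = np.asarray(logits, dtype=float)
    order = np.argsort(logits)[::-1]            # most likely first
    cumulative = np.cumsum(softmax(logits)[order])
    cutoff = int(np.searchsorted(cumulative, p)) + 1
    keep = order[:cutoff]
    out = np.full_like(logits, -np.inf)
    out[keep] = logits[keep]
    return out

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, -1.0])       # made-up next-token logits

print(greedy(logits))
print(sample_temperature(logits, 0.7, rng))
print(sample_temperature(top_k_filter(logits, 2), 1.0, rng))
print(sample_temperature(top_p_filter(logits, 0.8), 1.0, rng))
```

The filters return logits rather than tokens so they can be chained with temperature sampling; a filtered-out token gets probability exactly zero after the softmax.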

I also worked more on Audrey! I'm hoping to present something around Week 7.

Data Augmentation Techniques for Audio Data in Python
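For a sense of what audio data augmentation can look like, here is a minimal NumPy-only sketch of three common waveform transforms: additive noise at a target signal-to-noise ratio, a circular time shift, and a gain change. These are generic examples rather than the specific techniques or libraries from the linked article, and the function names and the synthetic sine wave are assumptions for illustration.

```python
import numpy as np

def add_noise(wave, snr_db, rng):
    # add Gaussian noise scaled so the result has roughly the requested SNR in dB
    signal_power = np.mean(wave**2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise

def time_shift(wave, shift):
    # circular shift by `shift` samples (samples pushed off one end wrap around)
    return np.roll(wave, shift)

def apply_gain(wave, gain_db):
    # scale amplitude by a gain given in decibels
    return wave * 10 ** (gain_db / 20)

rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
wave = 0.5 * np.sin(2 * np.pi * 440.0 * t)     # one second of a 440 Hz tone

noisy = add_noise(wave, snr_db=20, rng=rng)
shifted = time_shift(wave, shift=1600)         # 0.1 s at 16 kHz
quieter = apply_gain(wave, gain_db=-6)

print(noisy.shape, shifted.shape, quieter.shape)
```

Each transform keeps the waveform length fixed, so the augmented clips can be batched alongside the originals without extra padding.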

Things for next cycle

I want to focus on ARENA, Audrey, and helping support Heap!

Automatic Speech Recognition with Transformer

Things to research

Introduction to State Space Models (SSM)

Mamba