Recurse Center - Batch 2 - Cycle 20241129-20241201 - Extension
Extension
Most of this cycle was spent with family, so I leaned into that more than anything else. I made a bit more progress on working with the Common Voice dataset (more below). More importantly, I think I've decided that I want to extend my batch at RC by six weeks! I'm going to check in with the RC faculty next cycle to see if that's possible.
Day 1
I did a silly thing and screwed up my data preprocessing for the Common Voice dataset. The dataset comes as mp3s, so I tried to convert them to wavs resampled to 16 kHz. Unfortunately, the code I wrote ended up stretching the audio out, so not only did it balloon the wav version of the dataset to about 2 terabytes (yikes), it also made all of the audio unusable. So I rewrote the code, and I'm now reprocessing all of the files. Lessons learned!
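To make the failure mode concrete, here is a minimal sketch of what "resampling without stretching" means. It is not the code I actually ran: the helper names (`resample_linear`, `write_wav`, `wav_duration`) are made up for illustration, it uses only the standard library, and the 48 kHz source rate is just an assumption for the demo. The key check is that the clip's duration (frames divided by sample rate) is the same before and after conversion.

```python
import io
import math
import struct
import wave


def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono signal by linear interpolation, preserving duration.

    Illustrative only: there is no anti-aliasing filter here, so a real
    pipeline would want a proper resampler instead.
    """
    if not samples:
        return []
    n_out = round(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source signal
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def write_wav(fileobj, samples, rate):
    """Write float samples in [-1, 1] as 16-bit mono PCM."""
    with wave.open(fileobj, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples
        ))


def wav_duration(fileobj):
    """Duration in seconds, as implied by the wav header."""
    with wave.open(fileobj, "rb") as w:
        return w.getnframes() / w.getframerate()


# One second of a 440 Hz tone at an assumed source rate of 48 kHz.
SRC_RATE, DST_RATE = 48000, 16000
tone = [math.sin(2 * math.pi * 440 * t / SRC_RATE) for t in range(SRC_RATE)]

# Correct: resample the samples, then label the file with the new rate.
good = io.BytesIO()
write_wav(good, resample_linear(tone, SRC_RATE, DST_RATE), DST_RATE)
good.seek(0)

# One way to get stretching: keep the old samples but label them 16 kHz.
bad = io.BytesIO()
write_wav(bad, tone, DST_RATE)
bad.seek(0)

print(wav_duration(good), wav_duration(bad))
```

The second file plays back three times too slowly and holds three times as many samples as it should. That is one plausible way to end up with stretched, oversized audio, though not necessarily the exact bug I hit. Comparing durations on a handful of files before converting the whole dataset would have caught it early.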
Day 2
Today was mostly spent re-processing the Common Voice dataset.
Day 3
I'm still re-processing the dataset, which is just about halfway done.
It's also December 1st, which means Advent of Code is starting. I've never done it before, so I decided to try it out. Mainly I want to practice Python and have a little fun, while hopefully stretching myself a bit as a programmer in the process.
Things for next cycle
For the next few cycles that wrap up my current 12-week batch, I want to:
- Extend my RC batch by an additional six weeks
- Work on Audrey and feature visualization examples for a presentation at RC at the end of my batch
- Continue working on Heap
- Continue chipping away at speech transformer and mech interp, knowing that I'll get to more of it during my (hopefully approved) batch extension