Recurse Center - Batch 2 - Cycle 20241129-20241201 - Extension


Extension

Most of this cycle was spent with family, so I leaned into that more than anything else. I made a bit more progress on working with the Common Voice dataset (more below). More importantly, I think I've decided to extend my batch at RC by six weeks! I'm going to check in with the RC faculty next cycle to see if that's possible.

Day 1

I did a silly thing and screwed up my data preprocessing for the Common Voice dataset. The dataset comes as mp3s, so I tried to convert them to wavs with a new sampling rate of 16 kHz. Unfortunately, the code I wrote ended up stretching the audio out, so not only did it balloon the wav version of the dataset to about 2 terabytes (yikes), it also made all of the audio unusable. So I rewrote the code, and I'm now reprocessing all of the files. Lessons learned!
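
For illustration, here is a minimal sketch of the kind of sanity check that catches stretched audio: when the sampling rate changes, the number of samples should change by the same ratio so the duration stays put. This is not the conversion code described above. The function names and the 48 kHz source rate are hypothetical, and it uses plain linear interpolation from the standard library with no anti-aliasing filter, so a real pipeline would want a proper audio resampler instead.

```python
import math


def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono signal with linear interpolation, preserving duration."""
    if not samples:
        return []
    # Output length scales with the rate ratio so the duration stays the same.
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def duration_seconds(num_samples, rate):
    """Duration implied by a sample count and the rate it is played back at."""
    return num_samples / rate


# One second of a 440 Hz tone at 48 kHz (a hypothetical source rate).
src_rate, dst_rate = 48000, 16000
tone = [math.sin(2 * math.pi * 440 * t / src_rate) for t in range(src_rate)]
resampled = resample_linear(tone, src_rate, dst_rate)

# The sample count drops by 3x, but the duration is unchanged.
assert len(resampled) == 16000
assert abs(duration_seconds(len(resampled), dst_rate) - 1.0) < 1e-9
```

Comparing the duration before and after conversion on a handful of files is a cheap way to spot stretching before processing the whole dataset.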

Day 2

Today was mostly spent re-processing the Common Voice dataset.

Day 3

I'm still re-processing the dataset, which is just about halfway done.

It's also December 1st, which means Advent of Code is starting. I've never done it before, so I decided to try it out. My main goals are to practice Python and have a little fun, while hopefully stretching myself a bit as a programmer along the way.

Things for next cycle

For the next few cycles that wrap up my current 12 week batch, I want to: