Recurse Center - Batch 2 - Cycle 20241125-20241127 - Transportation
Transportation
This cycle I flew back home to Florida to be with my family for the holidays, so I've been spending most of my time with them. I did manage to get a few things done, however, which is exciting.
Day 1
Today was mostly spent getting set up to download the Common Voice dataset and preparing to work with it. The code for downloading, extracting, and converting the mp3 files to wav files can be found in this notebook.
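As a rough illustration of the conversion step, here is a minimal sketch that shells out to ffmpeg. The notebook may well use a different tool; the helper names, the 16 kHz sample rate, and the mono downmix are my assumptions here, not something taken from it.

```python
import subprocess
from pathlib import Path


def wav_path_for(mp3_path: Path, out_dir: Path) -> Path:
    """Map some/dir/clip.mp3 to out_dir/clip.wav (hypothetical helper)."""
    return out_dir / mp3_path.with_suffix(".wav").name


def ffmpeg_command(mp3_path: Path, wav_path: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg call that writes a mono wav file (assumed settings)."""
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-i", str(mp3_path),      # input clip
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-ac", "1",               # downmix to a single channel
        str(wav_path),
    ]


def convert_clip(mp3_path: Path, out_dir: Path) -> Path:
    """Convert one clip; needs ffmpeg installed and on PATH."""
    out_dir.mkdir(parents=True, exist_ok=True)
    wav_path = wav_path_for(mp3_path, out_dir)
    subprocess.run(ffmpeg_command(mp3_path, wav_path), check=True, capture_output=True)
    return wav_path
```

Keeping the command construction separate from the subprocess call makes it easy to check the paths and flags before kicking off a long batch run.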
Day 2
I worked more on Zora and created some fun classes and functionality, including a Listener class that has two main methods, listen and interpret:
import numpy as np
import torch as t


class Listener:
    def __init__(self, model_architecture, model_weights, interpreter):
        self.model_architecture = model_architecture
        self.model_weights = model_weights
        self.interpreter = interpreter

    def load(self):
        # get our device
        device = t.device("cuda" if t.cuda.is_available() else "cpu")
        print(f"Using device: {device}")
        # load the model
        self.model_architecture.load_state_dict(t.load(self.model_weights, map_location=device))
        # load the interpreter
        self.interpreter.load(self.model_architecture)

    def listen(self, spec):
        # Pass spec into model
        outputs = self.model_architecture(spec)
        prediction = str(outputs.argmax().item())
        print("Model prediction:", prediction)

    def interpret(self, spec):
        self.interpreter.interpret(spec)
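The Listener only calls two methods on whatever interpreter it is handed: load(model) and interpret(spec). A small sketch of that implied interface, written as a typing.Protocol, is below. The names Interpreter and RecordingInterpreter are hypothetical and not part of Zora; the stand-in class just records what it receives so the wiring can be checked without a real model.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Interpreter(Protocol):
    """Hypothetical name for the interface the Listener relies on."""

    def load(self, model: Any) -> None: ...

    def interpret(self, spec: Any) -> None: ...


class RecordingInterpreter:
    """Stand-in interpreter that only records what it is given."""

    def __init__(self) -> None:
        self.model = None
        self.seen = []

    def load(self, model: Any) -> None:
        # Keep a reference to the model the Listener passes in.
        self.model = model

    def interpret(self, spec: Any) -> None:
        # A real interpreter would analyze the spectrogram here.
        self.seen.append(spec)
```

Because the check is structural, any object with these two methods can be passed as the interpreter argument, which keeps the Listener decoupled from a particular interpretability technique.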
Day 3
Today I started downloading the Common Voice dataset for my transformer-based ASR model. It's going to take... a couple of days to convert the 2,459,129 mp3 clips into wav files, so for now I'm sitting back and waiting for that to finish.
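For a sense of scale, finishing 2,459,129 clips in two days works out to roughly 14 clips per second sustained. The back-of-the-envelope sketch below makes that arithmetic explicit. The function names are mine, and the workers parameter assumes perfectly linear scaling across processes, which real conversions rarely achieve.

```python
SECONDS_PER_DAY = 86_400


def required_rate(n_clips: int, days: float) -> float:
    """Clips per second needed to finish n_clips within the given days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return n_clips / (days * SECONDS_PER_DAY)


def estimated_days(n_clips: int, clips_per_second: float, workers: int = 1) -> float:
    """Days to convert n_clips, assuming ideal linear scaling over workers."""
    if clips_per_second <= 0 or workers < 1:
        raise ValueError("rate must be positive and workers at least 1")
    return n_clips / (clips_per_second * workers) / SECONDS_PER_DAY


if __name__ == "__main__":
    n_clips = 2_459_129
    print(f"Rate for a 2-day run: {required_rate(n_clips, 2):.1f} clips/s")
```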
Things for next cycle
A little bit of the same for next cycle: working on Zora, working on ARENA, and helping out with Heap.