Recurse Center - Batch 2 - Cycle 20241125-20241127 - Transportation


Transportation

This cycle I flew back home to Florida to be with my family for the holidays, so I've been spending most of my time with them. I did manage to get a few things done, though, which is exciting.

Day 1

Today was mostly spent working on getting set up to download the Common Voice dataset and preparing myself to work with it. Some code around downloading, extracting, and converting the mp3 files to wav files can be found in this notebook.
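As a rough sketch of the conversion step (this is not the notebook's code, just one way it could look), the snippet below shells out to ffmpeg for each clip. It assumes ffmpeg is installed and on the PATH; the helper names, the 16 kHz mono output, and the directory layout are all placeholders I picked for illustration.

```python
import subprocess
from pathlib import Path


def wav_path_for(mp3_path: Path, out_dir: Path) -> Path:
    """Map clips/foo.mp3 to out_dir/foo.wav (hypothetical layout)."""
    return out_dir / (mp3_path.stem + ".wav")


def ffmpeg_command(mp3_path: Path, wav_path: Path, sample_rate: int = 16000) -> list:
    """Build the ffmpeg call: resample to sample_rate and downmix to mono."""
    return [
        "ffmpeg", "-y", "-loglevel", "error",
        "-i", str(mp3_path),
        "-ar", str(sample_rate),  # output sample rate (assumed, not from the post)
        "-ac", "1",               # one audio channel
        str(wav_path),
    ]


def convert_clip(mp3_path: Path, out_dir: Path, sample_rate: int = 16000) -> Path:
    """Convert one clip, skipping it if the wav already exists."""
    out_dir.mkdir(parents=True, exist_ok=True)
    wav_path = wav_path_for(mp3_path, out_dir)
    if not wav_path.exists():
        # check=True raises CalledProcessError if ffmpeg fails on this clip
        subprocess.run(ffmpeg_command(mp3_path, wav_path, sample_rate), check=True)
    return wav_path
```

Calling convert_clip on every path from Path("clips").glob("*.mp3") would convert a whole folder, and the exists() check lets an interrupted run pick up where it left off.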

Day 2

I worked more on Zora and created some fun classes and functionality, including a Listener class with two main methods, listen and interpret.

import numpy as np
import torch as t

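class Listener:
    def __init__(self, model_architecture, model_weights, interpreter):
        self.model_architecture = model_architecture  # instantiated model (has load_state_dict)
        self.model_weights = model_weights  # path or file object passed to t.load
        self.interpreter = interpreter
        self.device = t.device("cpu")  # updated in load()

    def load(self):
        # get our device
        self.device = t.device("cuda" if t.cuda.is_available() else "cpu")
        print(f"Using device: {self.device}")

        # load the weights, then move the model to the device and switch to inference mode
        self.model_architecture.load_state_dict(t.load(self.model_weights, map_location=self.device))
        self.model_architecture.to(self.device)
        self.model_architecture.eval()

        # load the interpreter
        self.interpreter.load(self.model_architecture)

    def listen(self, spec):
        # Pass spec into model without tracking gradients
        with t.no_grad():
            outputs = self.model_architecture(spec.to(self.device))
        # argmax runs over the flattened output, so this assumes a single example
        prediction = str(outputs.argmax().item())
        print("Model prediction:", prediction)
        return prediction

    def interpret(self, spec):
        return self.interpreter.interpret(spec)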

Day 3

Today I started downloading the Common Voice dataset for my transformer-based ASR model. It's going to take... a couple of days to convert the 2,459,129 mp3 clips into wav files, so we are now sitting back and waiting for that to happen.
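For a sense of scale, here is a back-of-the-envelope calculation. Only the clip count comes from this post; reading "a couple of days" as 48 hours is an assumption, and the function names are made up for the example. Under that assumption the conversion needs to average roughly 14 clips per second.

```python
def required_rate(n_clips: int, hours: float) -> float:
    """Clips per second needed to finish n_clips within the given hours."""
    return n_clips / (hours * 3600)


def estimated_hours(n_clips: int, clips_per_second: float) -> float:
    """Hours needed to convert n_clips at a steady clips_per_second."""
    return n_clips / clips_per_second / 3600


N_CLIPS = 2_459_129  # clip count from the post

# 48 hours is an assumed reading of "a couple of days"
print(f"{required_rate(N_CLIPS, 48):.1f} clips/s needed to finish in 48 h")
```

Timing a few hundred clips and plugging the measured rate into estimated_hours gives a quick check on whether the wait is going to be days or hours.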

Things for next cycle

A little bit of the same for next cycle: working on Zora, working on ARENA and helping out with Heap.