Recurse Center - Batch 2 - Cycle 20241117-20241119 - Gestation
This cycle involved a lot of work on starting my library, Zora: an interpretable machine listening library focused on voice and speech. You can learn more about Zora, my values and intentions for the library, and ways to contribute here!
Day 1
Today was focused mostly on non-RC-related activities.
Day 2
Today I worked on Zora and made some nice progress. I also started an implementation of a transformer-based ASR system following the Speech-Transformer paper.
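As a rough sketch of the kind of front end the Speech-Transformer paper describes — strided 2D convolutions that downsample the spectrogram before it reaches the transformer encoder — here's a minimal NumPy version. The input size, filter counts, and kernel size are illustrative assumptions, not the paper's exact configuration or anything from Zora's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w, stride=2):
    """Naive valid-mode strided 2D convolution with ReLU.
    x: (C_in, H, W) input, w: (C_out, C_in, k, k) filters."""
    c_out, c_in, k, _ = w.shape
    h_out = (x.shape[1] - k) // stride + 1
    w_out = (x.shape[2] - k) // stride + 1
    y = np.zeros((c_out, h_out, w_out))
    for o in range(c_out):
        for i in range(h_out):
            for j in range(w_out):
                patch = x[:, i * stride:i * stride + k, j * stride:j * stride + k]
                y[o, i, j] = np.sum(patch * w[o])
    return np.maximum(y, 0.0)  # ReLU

# Fake log-mel spectrogram: 1 channel, 100 frames, 80 mel bins (illustrative sizes).
spec = rng.standard_normal((1, 100, 80))

w1 = rng.standard_normal((32, 1, 3, 3)) * 0.1   # 32 filters per layer is an assumption
w2 = rng.standard_normal((32, 32, 3, 3)) * 0.1

h1 = conv2d(spec, w1)  # (32, 49, 39)
h2 = conv2d(h1, w2)    # (32, 24, 19)
print(spec.shape, h1.shape, h2.shape)
```

Each stride-2 layer roughly halves the time and frequency resolution, so the sequence handed to the transformer is much shorter than the raw frame sequence — which is the point of the conv front end.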
Day 3
Today I tested some new setup on the Heap cluster. Some other Recursers set up a new 10 TB hard drive that can be accessed from other machines, which is a huge boon for the cluster at large.
Afterward I had a nice pairing session with another Recurser around my library. I was prompted to explain how convolutions work and what we're seeing in some of the feature visualization work I'm doing. It was really nice to explain these concepts out loud, and I arrived at some language describing convolutional layers as generating a "low resolution, but highly information dense" representation of an input as it passes through the network.
Here are some first attempts at visualizing the activations in this CNN trained on spoken digits (this is for the number 6):
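The kind of visualization above can be produced by tiling each channel's activation map into one image. Here's a minimal sketch of that idea in NumPy — the `activation_grid` helper and the activation shapes are hypothetical, not Zora's API:

```python
import numpy as np

def activation_grid(acts, cols=8):
    """Tile per-channel activation maps (C, H, W) into a single 2D image.
    Each channel is min-max normalized independently so faint features stay visible."""
    c, h, w = acts.shape
    rows = int(np.ceil(c / cols))
    grid = np.zeros((rows * h, cols * w))
    for idx in range(c):
        a = acts[idx]
        lo, hi = a.min(), a.max()
        norm = (a - lo) / (hi - lo) if hi > lo else np.zeros_like(a)
        r, col = divmod(idx, cols)
        grid[r * h:(r + 1) * h, col * w:(col + 1) * w] = norm
    return grid

# Illustrative stand-in for 32 feature maps from a conv layer run on a spectrogram.
rng = np.random.default_rng(6)
acts = np.maximum(rng.standard_normal((32, 24, 19)), 0.0)
grid = activation_grid(acts)
print(grid.shape)  # (96, 152): 4 rows x 8 cols of 24x19 maps
```

With matplotlib, `plt.imshow(grid)` then renders the tiled view in one call.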
They were also excited that the library makes task-specific, low-latency, local-first machine listening possible, which has always been a goal! Working offline could make the library appealing in settings without reliable internet access.
Finally, we chatted about how the interpretability functionality lets users get a more direct view of what is going on in the model, especially with "non-visual" data like sound, versus having to guess based on what you hear (which can be unreliable).
Things for next cycle
For next cycle, it will be much more of the same: working on this library, working on ARENA, and helping out with Heap.