Published on 01.17.2020 in [rc]
Week 2: Noise to Signal
Hello! This week I really felt like I made a lot of progress towards my goals. A lot of things came together in a really great way, and I can start to see how my overall approach to RC and what I'm studying are informing each other and interleaving in the ways I wanted them to.
This week I started by getting a lot of video lectures out of the way on Sunday, including week 4 of fastai's deep learning course and ASPMA. That really set me up well to focus on programming for most of the week, instead of burning most of my time on lectures and fueling my anxiety that I'm not programming/making enough.
I also decided this week to try not to context switch as much. For now, I'm still trying to spend the mornings working on algorithms, but I'll alternate days, focusing one day on ASPMA/audio signal processing and the next on audio ML/fastai. I think it worked out really well this week, and it made me feel less anxious about rushing through something so I could switch to another related but contextually different task. So this week I did ASPMA work on Monday and Wednesday, and audio ML work on Tuesday and Thursday. I found it successful, so I'm going to try it again next week!
- Talk about audio ML work so far
This week I feel like I made a lot of progress on the audio ML front, combining some of the stuff I've been learning about in the ASPMA course with the work I've been doing on fastai's new audio library. In the ASPMA course, we learned about the short-time Fourier transform (STFT) and how it's used to generate spectrograms. I was able to use some of that knowledge to try to make a real-time spectrogram generator from the microphone. It didn't turn out super well, and it's something I want to master, so I think I'll take another crack at it next week.
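To make the STFT-to-spectrogram idea concrete, here is a minimal sketch using only NumPy. It is not the ASPMA course code or my real-time microphone attempt; the function name `stft_magnitude`, the Hann window, and the frame and hop sizes are all assumptions chosen for illustration. The signal is cut into overlapping frames, each frame is windowed, and the FFT magnitude of each frame becomes one column of the spectrogram.

```python
import numpy as np


def stft_magnitude(signal, n_fft=512, hop=128):
    """Return a magnitude spectrogram of shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # rfft keeps only the non-negative frequencies of each real-valued frame.
    spectrum = np.fft.rfft(frames, axis=1)
    # Transpose so rows are frequency bins (row 0 = 0 Hz) and columns are time.
    return np.abs(spectrum).T


sample_rate = 8000  # Hz, hypothetical value for this example
t = np.arange(sample_rate) / sample_rate  # one second of audio
tone = np.sin(2 * np.pi * 1000 * t)  # 1 kHz test tone

spec = stft_magnitude(tone)
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sample_rate / 512  # bin index -> frequency in Hz
print(spec.shape, peak_bin, peak_hz)
```

A real-time version would presumably run the same per-frame step on each new block of microphone samples instead of on a precomputed array.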
Earlier in the week, I met with Marko Stamenovic, an RC alum who works professionally on audio ML at Bose. We had an amazing conversation about audio ML, some of the current topics in the field, areas to check out related to my interests, and what it would be like to work professionally in that field.
We talked about a lot of topics that I need to go back and check out, including:
- ML Signal Processing
- Neural audio synthesis
- The MARL community out of NYU (Brian McFee, Juan Pablo, Keunwoo Choi)
- The Music Hackathon community
For audio generation, Marko pointed me to:
- WaveNet (DeepMind)
- FFTNet (Adobe, Justin Salamon)
- LPCNet (Mozilla)
He suggested first trying to generate sine waves, then speech, then field recordings with these architectures.
Marko also told me to really focus on the STFT, as it's a fundamental algorithm in audio ML. He also mentioned that being able to do deployed real-time audio ML on the phone is very in-demand, so that might be something I try to refocus on while at RC.
This week I was also able to finish my PR on fastai's audio library. The task at hand was creating a test to make sure spectrograms generated with the library always come out right-side up. I was able to use some of the skills I learned in the ASPMA class, specifically around generating an audio signal, to write a test case that creates a simple 5 Hz signal, generates a spectrogram from it, and checks that the highest energy bin in that spectrogram is at the bottom. This was such a great moment where everything felt like it came together, and I can only imagine that this will happen more and more :)
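The idea behind that test can be sketched roughly like this. This is not the code from the PR and it does not use fastai's audio API; the `spectrogram` helper is a hypothetical NumPy stand-in, and the sample rate, FFT size, and hop are assumed values picked so that 5 Hz lands on an exact bin. It also assumes row 0 of the array holds the lowest frequency, so "the bottom" means the low row indices.

```python
import numpy as np


def spectrogram(signal, n_fft, hop):
    """Magnitude spectrogram with row 0 holding the lowest frequency bin."""
    window = np.hanning(n_fft)
    starts = range(0, len(signal) - n_fft + 1, hop)
    frames = np.stack([signal[s : s + n_fft] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T


def test_spectrogram_is_right_side_up():
    sample_rate = 1000  # Hz, hypothetical value chosen to keep the test small
    t = np.arange(2 * sample_rate) / sample_rate
    signal = np.sin(2 * np.pi * 5 * t)  # the 5 Hz test signal

    spec = spectrogram(signal, n_fft=400, hop=100)
    energy_per_bin = (spec ** 2).sum(axis=1)
    loudest_bin = int(energy_per_bin.argmax())

    # 5 Hz should land at bin 5 / (1000 / 400) = 2, near the low end.
    assert loudest_bin == 2
    # A flipped spectrogram would put the peak near the last row instead.
    assert loudest_bin < spec.shape[0] // 2


test_spectrogram_is_right_side_up()
```

Using a very low frequency makes the check unambiguous: if the frequency axis were reversed, the loudest bin would show up near the top of the array rather than near row 0.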
Finally, I did more MIT 6.006 lectures on algorithms. This week was sorting, including insertion sort, merge sort and heap sort. I particularly love heap sort! I also gave a small presentation on merge sort at RC as part of our Algorithms Study Group, which forced me to really dig into merge sort and understand how it works, including writing out its recursion tree. I love forcing myself into situations that guarantee I'll have to really focus and deeply understand something so that I can present it to others. I hope to do it more in the future.
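For reference, here is a minimal merge sort sketch in Python. It is my own illustration rather than code from the 6.006 lectures or the presentation: split the list in half, sort each half recursively, then merge the two sorted halves.

```python
def merge_sort(items):
    """Return a new sorted list; each call splits the input in half."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves by repeatedly taking the smaller head.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Taking from the left on ties keeps the sort stable.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One half is exhausted; the rest of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
```

Each level of the recursion tree does a linear amount of merging work in total, and there are about log2(n) levels, which is where the O(n log n) running time comes from.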
For now, I think everything is moving well. I do want to realign what I'm working towards, and try to keep in mind the bigger goal of making something that generates sound. I do think, though, that the listening part of this is just as important, so I want to think about how to combine the two, because I think they are two sides of the same coin. I'll spend a bit more time thinking about that today, and I'll hopefully have some idea of a way forward before setting my goals for next week.
Published on 01.10.2020 in [rc]
Hello! If you are reading this, welcome! This is my attempt to be a better (technical) writer, starting with writing about my programming life at the Recurse Center. For more about me, please visit my personal website. For a quick intro, I make installations, performances, and sculptures that let you explore the world through your ears. I surface the vibratory histories of past interactions inscribed in material and embedded in space, peeling back sonic layers to reveal hidden memories and untold stories. I share my tools and techniques with others through listening tours, workshops, and open source hardware/software. During my time at RC, I want to dive deep into the world of machine listening, computational audio, and programmatic sound. To do that, I'm splitting my time, 2/3 of which will be spent on audio ML and audio signal processing. The other 1/3 of my time will be spent on getting a better foundation in computer science, algorithms, and data structures. In the following post, I'll write about my experience with those areas, and pepper in some observations along the way that I've had since being here!
On the audio ML side of things, this week I dove into fastai's new version 2 of their library, specifically so I could start working on their new audio extension! I'm really excited to contribute to this extension, as this will be the first time I've really contributed to open source. The current team seems incredibly nice and smart, so I'm really looking forward to working with them. The first thing I did was get version 2 of fastai and fastcore set up on my Paperspace machine, but then I realized that I could/should get it set up on RC's Heap cluster! This took a bit to get working, but it was pretty smooth to get everything set up, so now I feel ready to start working with it. My first project idea was to build a bird classifier, using examples of birds found around the Newtown Creek. I was able to put together a test dataset from recordings I downloaded from https://www.xeno-canto.org/. I did want to start training this week, but I think that's going to have to happen next week. This week I also finished up to week 3 of the fastai DL lectures, so that was good progress. Next week I'll tackle week 4 and use the rest of the week to actually code something.
On the audio signal processing side of things, I was able to finish week 3 of the Audio Signal Processing for Musical Applications course on Coursera, which I've really been enjoying. Week 1 and 2's homework assignments were pretty easy and straightforward, but this week's homework assignment was way more difficult! I didn't expect it to take as much time as it did, and I did have to cut some corners at the end and look at someone else's example to finish it. It wasn't the most ideal situation, and I now know going into next week to anticipate needing to spend more time with the assignments.
Finally, on the algos side, I finished Lectures 1 and 2 of Introduction to Algorithms 6.006 from MIT Open Courseware. I tried a couple of LeetCode questions related to those lectures as well. I need to find a way to make sure I actually code things related to that course, instead of just simply watching the videos. My approach has been 1) Watch a video 2) Do a couple of problems related to that, all before lunch. I think if I can get into a good flow for this, I'll be doing just fine.
Over the course of my first week, I've already had my ups and downs. One thing has been being overambitious about what I can get done in a day. I'm already spending 9am-7pm at RC, and I still have the feeling that I can't get everything done. I'm going to have to be ok with not getting everything done that I've set out to do each day.
I had a nice check-in with one of the faculty members about algo studying and project management. Two takeaways were: 1) Don't spend all your time at RC grinding on algorithm studying/cramming videos. Do some, but don't spend the entire day doing it. And 2) Once you feel like you know enough of what you need to get started on a project, start! Let the project drive what you need to learn.
One of the things I think I should start doing is creating a list of goals for the week on Sunday night, and then letting that drive what I focus on for the week, making sure I've planned out enough time and space during the week to realistically make those goals happen, knowing that I want to leave space for serendipity while at RC.
Going forward with RC, I made a list of projects I want to work on. I'm categorizing them as "Small/Known" (as in I already know how to do them or have a clear path to making them real) and "Big/Ambitious" (as in I'm not quite sure where to start and they will take a longer time to do).
For now that list looks like:
Small / Known
- NCA / Newtown Creek Bird Classifier
- Freesound multilabel classifier
- Shepard tone generator
Big / Ambitious
- Voice recognition for security
- Sonic generator with GANs
For next week, I want to:
- Week 4 of fastai
- Week 4 of ASPMA
- Lecture 3 and 4 of MIT 6.006
- Make bird classifier
- Make Shepard tone sound generator
- More LeetCode problems