For my second co-op, I interned as a Machine Learning Software Engineer at Gridspace. At the height of the LLM boom, I got the opportunity to work at the forefront of the speech AI scene. I learned so much at this place, met some of the most brilliant and motivated people, and I have some amazing memories to hold onto. Here’s a quickly put-together recap of my internship experience at Gridspace:
The office is this huge warehouse-like building within the Arts District of Downtown LA.
I was one of four interns taken into Jurassic Park — the office had giant trees, a cave, vintage rooms with retro tech, and many (many) dogs. We were given a crate containing all the tech goods and a copy of the US Constitution, since all of the interns were from Canada (👀).
The first month of the internship featured a course on speech technology and machine learning taught by Gridspacers. It gave a high-level but thorough overview of everything from audio analysis and processing to scaling ML software systems in the cloud. I found this IAP lecture series really informative and very relevant to the work I did.
Here’s a link to the IAP lecture playlist:
https://youtube.com/playlist?list=PL6owWFYBB-AohZQyFd29I-WGad3JfAwjc
In addition to learning what it’s like to work at a fast-paced startup, I got to see the AI scene in the US up close. I think large language models will heavily transform the tech world soon, and at Gridspace I mostly worked with the LLM team.
In the LLM team, I worked with multiple open-source LLMs and performed data operations using Apache Beam and Google Cloud. I also learned some full-stack web development with Django and React while taking up tasks in the Voice team. That work complemented what I did in the LLM team and gave me a good all-around machine learning software engineering internship experience, as the title says.
I also learned the ways of giving head-pats to our HR:
I initially picked up tasks on data collection and generation for fine-tuning and evaluating open-source LLMs from Hugging Face on a variety of tasks. I automated as much of the process as possible with Bash scripts and various Python libraries. To generate more data from open-source datasets and corpora, I parallelized and streamed data across GCP using Apache Beam — perhaps one of the most useful tools I learned to use during this internship.
I also dabbled in making a quick FastAPI-based microservice and tried building a Docker image for it. In the Voice team, I worked on several Django- and React-based tasks, such as building conversation tree views, a leaderboard, and a pipeline for creating Django objects from cloud logs.
My final project in the LLM team was optimizing these open-source models for inference, where I researched and worked with different optimization techniques such as DeepSpeed multi-GPU inference, ONNX Runtime, and FlashAttention. I wrote down my understanding of how these optimization techniques work on my research page.
It was one of the most fun tasks I’ve worked on, partly because it was fascinating to see how each optimization behaved on different models, but also because most of these techniques were still under development and sparsely documented. It gave me a glimpse into what the future of this field could look like, and working with cutting-edge technology was really rewarding.
In a nutshell, I worked on so many different fun projects across the software development spectrum, and I learned so much in the process. If I were to try anything different in the future, it would perhaps be working closer to the production side (in DevOps and Infra).