
Digits at CUDA Mode IRL

At Digits, we are constantly pushing the limits of production machine learning, so we were beyond excited to secure one of the rare tickets to the inaugural IRL event hosted by the CUDA Mode community in San Francisco, organized by Accel and NVIDIA on September 21st, 2024. The event brought together some of the brightest minds in the Torch and CUDA ecosystem and was a testament to the vibrant, ever-evolving world of AI.
We were thrilled to be among the attendees, eager to connect with the community and soak in the latest trends shaping the future of AI. The speaker lineup was truly impressive, featuring giants like Andrej Karpathy and Tim Dettmers, who delivered insightful talks that left us buzzing with inspiration.
Unveiling the Cutting Edge
Andrej Karpathy, the visionary and creator behind llm.c, shared his motivations for creating this tool and walked us through its intricate implementation. His passion for simplifying large language models (LLMs) and making them accessible to a wider audience was truly inspiring.
Lily Liu delved into the exciting world of vLLM, highlighting its latest advancements. Her presentation on speculative decoding particularly caught our attention. This novel technique promises to significantly speed up LLM inference, and we can't wait to experiment with it ourselves!
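The core idea behind speculative decoding is to pair a small, fast draft model with the large target model: the draft cheaply proposes several tokens ahead, and the target verifies them all in a single pass, keeping the longest agreeing prefix. Below is a minimal toy sketch of that accept/reject loop; the "models" here are deterministic stand-in functions we made up for illustration, not vLLM's actual API.

```python
# Toy speculative decoding sketch. Tokens are just integers, and the
# target's "true" continuation is last_token + 1 at every step.

def draft_model(context, k):
    """Cheap draft: proposes k tokens, deliberately wrong on the last one."""
    last = context[-1]
    proposed = []
    for i in range(k):
        last += 2 if i == k - 1 else 1  # final guess diverges from the target
        proposed.append(last)
    return proposed

def target_model(context, proposed):
    """Expensive target: verifies all proposals in one pass, keeps the
    agreeing prefix, and emits its own token at the first disagreement."""
    accepted = []
    last = context[-1]
    for tok in proposed:
        if tok != last + 1:  # target disagrees with the draft here
            break
        accepted.append(tok)
        last = tok
    if len(accepted) < len(proposed):
        accepted.append(last + 1)  # target's own correction token
    return accepted

def speculative_decode(context, steps, k=4):
    out = list(context)
    for _ in range(steps):
        proposed = draft_model(out, k)
        out.extend(target_model(out, proposed))
    return out

# Each expensive target pass yields up to k tokens (accepted prefix plus a
# correction) instead of one, which is where the speedup comes from.
print(speculative_decode([0], steps=2))  # -> [0, 1, 2, 3, 4, 5, 6, 7, 8]
```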
Supriya Rao presented her team's work on TorchAO, a game-changing tool for optimizing PyTorch models. This innovative framework empowers developers to significantly accelerate inference by enabling efficient pruning and quantization of models.
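At its core, weight quantization replaces floating-point weights with low-bit integers plus a scale factor, shrinking memory traffic during inference. The toy sketch below shows symmetric per-tensor int8 quantization, the kind of transformation TorchAO automates for real models; this is illustrative code of the underlying idea, not the TorchAO API.

```python
# Symmetric int8 quantization: map floats into [-128, 127] with one
# per-tensor scale, then recover approximate floats by multiplying back.

def quantize_int8(weights):
    """Return (int8 values, scale) for a list of float weights."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [x * scale for x in q]

w = [0.5, -1.27, 0.003, 1.27]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Large weights survive almost exactly; tiny ones (0.003) round to zero,
# which is the precision/memory trade-off quantization makes.
```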
Tim Dettmers ignited a lively discussion around the power of open-source models and their potential to compete with closed, proprietary systems. His insights into the future of AI development left us pondering the evolving landscape of innovation.
The Evening's Highlight
The event concluded with a captivating Q&A session featuring Wen-Mei Hwu, a leading figure in the field of AI hardware. His perspectives offered valuable insight into the future of AI hardware and its impact on the broader ecosystem. For example, Wen-Mei stressed that current ML hardware is often CPU bound, a restriction that is frequently overlooked.
Hackathon and Learning
Beyond the presentations, we were fortunate to participate in a hands-on hackathon led by CUDA experts. We spent the day diving deep into the technical details of optimizing inference for Meta's Llama 3.1 405B model, engaging in stimulating discussions and collaborating with fellow enthusiasts.
Community Spirit
The CUDA Mode community embraced us with open arms. We were struck by the shared passion and dedication of everyone present. The energy was infectious, and we left feeling invigorated and inspired by the collective spirit of innovation.
Moving Forward
We're incredibly grateful for the opportunity to have been part of this groundbreaking event. We are already looking forward to the next CUDA Mode IRL, where we can continue to learn, collaborate, and shape the future of AI together.
Join the Community
If you're interested in joining the CUDA Mode community and participating in future events, we encourage you to visit their Discord channel: https://discord.gg/cudamode
We can't wait to see you there!