
Digits at CUDA Mode IRL

At Digits, we are constantly pushing the limits of production machine learning, so we were beyond excited to secure one of the rare tickets to the inaugural IRL event hosted by the CUDA Mode community in San Francisco, organized by Accel and NVIDIA on September 21st, 2024. The event brought together some of the brightest minds in the Torch and CUDA ecosystem and was a testament to the vibrant, ever-evolving world of AI.
We were thrilled to be among the attendees, eager to connect with the community and soak in the latest trends shaping the future of AI. The speaker lineup was truly impressive, featuring giants like Andrej Karpathy and Tim Dettmers, who delivered insightful talks that left us buzzing with inspiration.
Unveiling the Cutting Edge
Andrej Karpathy, the visionary and creator behind llm.c, shared his motivations for creating this tool and walked us through its intricate implementation. His passion for simplifying large language models (LLMs) and making them accessible to a wider audience was truly inspiring.
Lily Liu delved into the exciting world of vLLM, highlighting its latest advancements. Her presentation on speculative decoding particularly caught our attention. This novel technique promises to significantly speed up LLM inference, and we can't wait to experiment with it ourselves!
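The core idea behind speculative decoding is to pair a small, fast draft model with the large target model: the draft cheaply proposes several tokens ahead, and the target verifies them all in a single pass, keeping the longest agreeing prefix. Below is a minimal toy sketch of that accept/reject loop; the "models" here are deterministic stand-in functions we made up for illustration, not vLLM's actual API.

```python
# Toy speculative decoding sketch. Tokens are just integers, and the
# target's "true" continuation is last_token + 1 at every step.

def draft_model(context, k):
    """Cheap draft: proposes k tokens, deliberately wrong on the last one."""
    last = context[-1]
    proposed = []
    for i in range(k):
        last += 2 if i == k - 1 else 1  # final guess diverges from the target
        proposed.append(last)
    return proposed

def target_model(context, proposed):
    """Expensive target: verifies all proposals in one pass, keeps the
    agreeing prefix, and emits its own token at the first disagreement."""
    accepted = []
    last = context[-1]
    for tok in proposed:
        if tok != last + 1:  # target disagrees with the draft here
            break
        accepted.append(tok)
        last = tok
    if len(accepted) < len(proposed):
        accepted.append(last + 1)  # target's own correction token
    return accepted

def speculative_decode(context, steps, k=4):
    out = list(context)
    for _ in range(steps):
        proposed = draft_model(out, k)
        out.extend(target_model(out, proposed))
    return out

# Each expensive target pass yields up to k tokens (accepted prefix plus a
# correction) instead of one, which is where the speedup comes from.
print(speculative_decode([0], steps=2))  # -> [0, 1, 2, 3, 4, 5, 6, 7, 8]
```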
Supriya Rao presented her team's work on TorchAO, a game-changing tool for optimizing PyTorch models. This innovative framework empowers developers to significantly accelerate inference by enabling efficient pruning and quantization of models.
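At its core, weight quantization replaces floating-point weights with low-bit integers plus a scale factor, shrinking memory traffic during inference. The toy sketch below shows symmetric per-tensor int8 quantization, the kind of transformation TorchAO automates for real models; this is illustrative code of the underlying idea, not the TorchAO API.

```python
# Symmetric int8 quantization: map floats into [-128, 127] with one
# per-tensor scale, then recover approximate floats by multiplying back.

def quantize_int8(weights):
    """Return (int8 values, scale) for a list of float weights."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [x * scale for x in q]

w = [0.5, -1.27, 0.003, 1.27]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Large weights survive almost exactly; tiny ones (0.003) round to zero,
# which is the precision/memory trade-off quantization makes.
```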
Tim Dettmers ignited a lively discussion around the power of open-source models and their potential to compete with closed, proprietary systems. His insights into the future of AI development left us pondering the evolving landscape of innovation.
The Evening's Highlight
The event concluded with a captivating Q&A session featuring Wen-Mei Hwu, a leading figure in the field of AI hardware. His perspectives offered valuable insight into the future of AI hardware and its impact on the broader ecosystem. For example, Wen-Mei stressed that current ML hardware is often CPU bound, a restriction that is frequently overlooked.
Hackathon and Learning
Beyond the presentations, we were fortunate to participate in a hands-on hackathon led by CUDA experts. We spent the day diving deep into the technical details of optimizing inference for Meta's Llama 3.1 405B model, engaging in stimulating discussions and collaborating with fellow enthusiasts.
Community Spirit
The CUDA Mode community embraced us with open arms. We were struck by the shared passion and dedication of everyone present. The energy was infectious, and we left feeling invigorated and inspired by the collective spirit of innovation.
Moving Forward
We're incredibly grateful for the opportunity to have been part of this groundbreaking event. We are already looking forward to the next CUDA Mode IRL, where we can continue to learn, collaborate, and shape the future of AI together.
Join the Community
If you're interested in joining the CUDA Mode community and participating in future events, we encourage you to visit their Discord channel: https://discord.gg/cudamode
We can't wait to see you there!