

LLM Finetuning—Digits panel discussion at Startup Grind Conference 2025
At this year's Startup Grind Conference in Redwood City, Digits had the opportunity to lead a panel discussion on LLM finetuning strategies. The session provided valuable insights into when and how to approach model finetuning, along with practical lessons learned from real-world implementations.

Understanding When to Finetune LLMs
When implementing AI solutions, one of the first questions organizations face is whether they need to finetune their language models. While finetuning can provide significant benefits, it's not always the best approach. Here are the key scenarios where finetuning makes sense:
Domain-Specific Knowledge Needs
In specialized fields like healthcare, legal, or fintech applications, general-purpose language models often lack the depth of knowledge required. Finetuning becomes crucial when accuracy in specialized terminology and domain-specific concepts is critical for your application's success.
Consistent Style and Format Requirements
Organizations often need their AI systems to maintain a consistent voice and style that aligns with their brand identity. Finetuning, as the training-data sketch after this list illustrates, can help ensure that the model's outputs consistently follow your organization's:
- Writing style and tone
- Formatting guidelines
- Brand voice conventions
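To make this concrete, here is a minimal sketch of what style-focused training examples might look like, using the chat-style JSONL format common to most supervised finetuning pipelines; the "Acme" assistant, its system prompt, and the sample exchange are all hypothetical:

```python
import json

# Hypothetical examples pairing generic prompts with responses written
# in the organization's brand voice and preferred format.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are Acme's support assistant. Be concise, friendly, and always close with a next step."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Happy to help! Go to Settings > Security and choose 'Reset password'. Next step: check your inbox for the confirmation link."},
        ]
    },
    # ... more examples covering tone, formatting, and edge cases
]

# Write one JSON object per line, the format most finetuning APIs expect.
with open("style_finetune.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```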
Performance Improvements
Finetuning can significantly enhance model performance for specific use cases by:
- Increasing accuracy for particular tasks
- Reducing hallucinations
- Improving consistency across outputs
- Handling edge cases more reliably
- Optimizing for shorter, more efficient prompts
Cost and Latency Optimization
One of the most compelling reasons to finetune is the potential for significant cost and performance improvements. Because task instructions and few-shot examples move into the model's weights, they no longer need to be repeated in every request (see the token-count sketch after this list). Through finetuning, organizations can achieve:
- Shorter input prompts
- More efficient processing
- Lower token usage
- Reduced inference time
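As a rough illustration of where the token savings come from, the sketch below compares a prompt that carries instructions and few-shot examples on every request with the short prompt a finetuned model needs. The prompt contents are made up, and tiktoken is used here only as a convenient tokenizer; any tokenizer gives a similar comparison:

```python
import tiktoken  # OpenAI's tokenizer library; any tokenizer works here

enc = tiktoken.get_encoding("cl100k_base")

# Before finetuning: instructions and examples ride along on every request.
prompt_before = (
    "You are an expert bookkeeper. Classify the transaction into one of "
    "the following 40 categories: ... Example 1: ... Example 2: ...\n"
    "Transaction: AWS us-east-1 $1,204.33"
)

# After finetuning: the instructions live in the weights, so the prompt
# carries only the new input.
prompt_after = "Transaction: AWS us-east-1 $1,204.33"

before, after = len(enc.encode(prompt_before)), len(enc.encode(prompt_after))
print(f"{before} -> {after} tokens per request ({1 - after / before:.0%} fewer)")
```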
When to Skip Finetuning
It's equally important to recognize situations where finetuning might not be necessary:
- When prompt engineering can achieve the desired results
- If you have limited access to high-quality training data
- When resources for training and inference are constrained
- In cases where time-to-market is critical
Key Lessons from the Field
Our experience has yielded several important insights about finetuning:
The Power of LoRA Adapters
LoRA (Low-Rank Adaptation) adapters have emerged as a powerful tool for efficient finetuning: rather than updating all of a model's weights, LoRA freezes the base model and trains small low-rank matrices injected into selected layers, offering a strong balance between quality and resource requirements.
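As a minimal sketch, here is how a LoRA adapter might be attached with Hugging Face's peft library; the base model, rank, and target modules below are illustrative choices, not a recipe from the panel:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative setup: the base model and hyperparameters are examples only.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor for the adapter updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of total weights
```

Because only the small adapter matrices are trained, several task-specific adapters can be kept on disk and swapped over a single frozen base model.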
Smaller Models, Bigger Impact
Recent advances have shown that smaller models, when finetuned for a focused task, can deliver impressive results while significantly reducing costs. For example, moving from Llama 3.1 70B to Llama 3.2 3B can reduce instance costs from $4,000/month to $700/month while maintaining strong task performance.
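Taking those figures at face value, the arithmetic is straightforward:

```python
# Monthly instance costs cited above.
cost_70b = 4_000  # Llama 3.1 70B, USD/month
cost_3b = 700     # Llama 3.2 3B, USD/month

savings = cost_70b - cost_3b
print(f"${savings}/month saved ({savings / cost_70b:.0%} reduction)")
# -> $3300/month saved (82% reduction)
```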
Latency Improvements
The impact on response times can be equally dramatic: with a smaller finetuned model and shorter prompts, inference times can drop from multiple seconds to sub-second responses.
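A simple way to verify latency on your own workload is to time generations directly. The sketch below uses a hypothetical model identifier; in practice, warm the model up first and average over many requests:

```python
import time
from transformers import pipeline

# "your-org/finetuned-3b" is a placeholder; substitute your own checkpoint.
generate = pipeline("text-generation", model="your-org/finetuned-3b")

prompt = "Transaction: AWS us-east-1 $1,204.33"  # hypothetical input
start = time.perf_counter()
generate(prompt, max_new_tokens=32)
print(f"inference latency: {time.perf_counter() - start:.2f}s")
```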
Looking Ahead
As AI technology continues to evolve, the decision to finetune will remain a crucial strategic choice for organizations implementing AI solutions. The key is to carefully evaluate your specific needs, available resources, and performance requirements before embarking on a finetuning project.
At Digits, we're committed to sharing our experiences and insights to help organizations make informed decisions about their AI implementations. If you have questions about finetuning or would like to learn more about our approach, we'd love to hear from you.