
Earley AI Podcast
In this podcast, host Seth Earley invites a broad array of thought leaders and practitioners to talk about what's possible in artificial intelligence, as well as what is practical, as we move toward a world where AI is embedded in all aspects of our personal and professional lives. They explore what's emerging in technology, data science, and enterprise applications for artificial intelligence and machine learning, and how to get from early-stage AI projects to fully mature applications. Seth is founder & CEO of Earley Information Science and the award-winning author of "The AI-Powered Enterprise."
Earley AI Podcast Episode 70 - AI at Scale: Why Infrastructure Matters More Than Ever
This episode features a fascinating conversation with Sid Sheth, CEO and Co-Founder of d-Matrix. With a deep background in building advanced systems for high-performance workloads, Sid and his team are at the forefront of AI compute innovation—specifically focused on making AI inference more efficient, cost-effective, and scalable for enterprise use. Host Seth Earley dives into Sid’s journey, the architectural shifts in AI infrastructure, and what it means for organizations seeking to maximize their AI investments.
Key Takeaways:
- The Evolution of AI Infrastructure: Sid breaks down how the traditional tech stack is being rebuilt to support the unique demands of AI, particularly shifting from general-purpose CPUs to specialized accelerators for inference.
- Training vs. Inference: Using a human analogy, Sid explains the fundamental difference between model training (learning) and inference (applying knowledge), emphasizing why most enterprise value comes from efficient inference.
- Purpose-built Accelerators: d-Matrix's inference-only accelerators dramatically reduce overhead, latency, energy consumption, and cost compared to traditional GPU solutions.
- Scalability & Efficiency: Learn how in-memory compute, chiplets, and innovative memory architectures enable d-Matrix to deliver up to 10x lower latency and significant gains in energy and dollar efficiency for AI applications.
- Market Trends: Sid reveals how, although today’s focus is largely on training compute, the next five to ten years will see inference dominate as organizations seek ROI from deployed AI.
- Enterprise Strategy Advice: Sid urges tech leaders not to be conservative, but to embrace a heterogeneous and flexible infrastructure strategy to future-proof their AI investments.
- Real-World Use Cases: Hear about d-Matrix’s work enabling low-latency agentic/reasoning models, which are critical for real-time and interactive AI workloads.
Insightful Quote from Sid Sheth:
“Now is not the time to be conservative. Get comfortable with choice. In the world of inference there isn’t going to be one size fits all... The world of the future is heterogeneous, where you’re going to have a compute fleet that is augmented with different types of compute to serve different needs.”
Tune in to discover how to rethink your AI infrastructure strategy and stay ahead in the rapidly evolving world of enterprise AI!
Thanks to our sponsors: