NVIDIA’s Dynamo 1.0 Is Free, Open Source Software That Makes AI Inference Up to 7x Faster

Running AI models at scale is harder than it looks. Training a model is a one-time investment. Inference, the process of actually using that model to answer questions, generate content, or power agent workflows, happens billions of times a day across every AI product in production. As agentic AI systems grow more complex and usage patterns become harder to predict, the infrastructure that manages inference has become one of the most important and least glamorous problems in the industry. NVIDIA is taking a direct swing at it with Dynamo 1.0, a production-grade open source software platform for generative and agentic inference at scale, and it is available to developers right now at no cost.

The performance numbers NVIDIA is citing are significant. In recent industry benchmarks, Dynamo boosted the inference performance of NVIDIA Blackwell GPUs by up to seven times compared with running without it. For cloud providers and enterprises operating millions of GPUs, that kind of efficiency gain translates directly into lower cost per token and higher revenue opportunity from existing hardware, without buying a single additional chip.

What Dynamo Actually Does

The core problem Dynamo solves is resource orchestration inside an AI data center. When agentic AI systems are running in production, inference requests do not arrive in a neat, predictable stream. They come in bursts, at varying sizes and levels of complexity, involving different modalities and performance requirements. Managing those requests efficiently across a large cluster of GPUs requires sophisticated traffic management and memory optimization that general-purpose infrastructure software was not built to handle.

NVIDIA describes Dynamo 1.0 as functioning like a distributed operating system for AI factories, coordinating GPU and memory resources across the cluster in the same way a computer’s operating system coordinates hardware and applications. The analogy captures something real about what Dynamo does: it abstracts away the complexity of the underlying hardware and provides an intelligent layer that routes work to where it can be done most efficiently.

In practical terms, Dynamo splits inference work across GPUs, adding smarter traffic control and the ability to move data between GPUs and lower-cost storage, which reduces wasted computation and eases memory pressure. For agentic AI workloads and long-context requests in particular, it can route incoming requests to the GPUs that already hold the most relevant context from earlier processing steps, then offload that context to storage when it is no longer needed. That kind of context-aware routing becomes especially valuable as AI agents take on longer and more complex tasks that generate large volumes of intermediate state, as the sketch below illustrates.
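The article does not show any of Dynamo's code, but the routing idea it describes can be sketched in a few lines. The Python below is purely illustrative and assumes nothing about Dynamo's real modules or interfaces: Worker, route_request, and evict_idle are hypothetical names used only to show, in principle, what it means to send a request to the GPU that already holds the most relevant context and to offload idle context to cheaper storage.

```python
# Illustrative sketch only -- not Dynamo's actual API. It shows the idea of
# context-aware routing: send each request to the worker that already holds
# the longest matching prefix of its context, and evict idle context to
# cheaper storage. All names here (Worker, route_request, evict_idle) are
# hypothetical.
import time
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    # Maps a cached context prefix (a tuple of tokens, standing in for
    # KV-cache blocks on the GPU) to the time it was last used.
    cached_prefixes: dict = field(default_factory=dict)

    def prefix_overlap(self, prompt_tokens: tuple) -> int:
        """Length of the longest cached prefix that matches this prompt."""
        best = 0
        for prefix in self.cached_prefixes:
            n = len(prefix)
            if n > best and prompt_tokens[:n] == prefix:
                best = n
        return best


def route_request(workers: list[Worker], prompt_tokens: tuple) -> Worker:
    """Pick the worker that can reuse the most previously computed context."""
    target = max(workers, key=lambda w: w.prefix_overlap(prompt_tokens))
    target.cached_prefixes[prompt_tokens] = time.monotonic()
    return target


def evict_idle(worker: Worker, cold_storage: dict, max_idle_s: float = 300.0):
    """Offload context that has not been touched recently to cheaper storage."""
    now = time.monotonic()
    for prefix, last_used in list(worker.cached_prefixes.items()):
        if now - last_used > max_idle_s:
            cold_storage[prefix] = worker.cached_prefixes.pop(prefix)


if __name__ == "__main__":
    gpus = [Worker("gpu-0"), Worker("gpu-1")]
    turn_1 = ("system", "user: plan a trip")
    turn_2 = turn_1 + ("assistant: sure", "user: add a budget")
    first = route_request(gpus, turn_1)    # first turn lands on either GPU
    second = route_request(gpus, turn_2)   # follow-up reuses turn_1's context
    print(first.name, second.name)         # same worker both times
```

In a real system the cached prefixes would be GPU-resident key-value cache blocks rather than Python tuples, but the routing decision, reuse what has already been computed instead of recomputing it, is the behavior the article attributes to Dynamo.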

Jensen Huang, founder and CEO of NVIDIA, described inference as the engine of intelligence and Dynamo as the first-ever operating system for AI factories, noting that the rapid ecosystem adoption reflects how seriously the industry is treating the agentic AI production challenge.

The Open Source Ecosystem Integration

One of the most strategically significant aspects of Dynamo 1.0 is how deeply it integrates with the existing open source inference ecosystem. Rather than requiring organizations to replace their existing frameworks, NVIDIA is integrating Dynamo and optimizations from its TensorRT-LLM library directly into popular frameworks including LangChain, llm-d, LMCache, SGLang, and vLLM.

The core building blocks of Dynamo are also available as standalone modules for developers who want to use specific components without adopting the full stack. KVBM handles smarter memory management for key-value cache data. NVIDIA NIXL manages fast GPU-to-GPU data movement across the cluster. NVIDIA Grove simplifies scaling operations for inference workloads. Each of these can be pulled in independently and integrated into existing infrastructure rather than requiring a wholesale migration.
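To make the division of labor more concrete, here is a small, hypothetical Python sketch of tiered key-value cache management in the spirit of what the article says KVBM does: keep hot cache blocks in scarce GPU memory and spill colder ones to larger, cheaper host memory so they can be reused instead of recomputed. The class and method names are invented for illustration and are not part of Dynamo's actual interfaces.

```python
# Hypothetical sketch of tiered KV-cache management, in the spirit of what
# the article describes KVBM doing. None of these names come from the real
# Dynamo modules; this only illustrates moving cache blocks between a small,
# fast GPU tier and a larger, cheaper host-memory tier.
from collections import OrderedDict


class TieredKVCache:
    def __init__(self, gpu_capacity_blocks: int):
        self.gpu_capacity = gpu_capacity_blocks
        self.gpu_tier = OrderedDict()   # block_id -> payload, kept in LRU order
        self.host_tier = {}             # overflow blocks held in cheaper memory

    def put(self, block_id: str, payload: bytes):
        """Store a KV block on the GPU tier, spilling the oldest block if full."""
        self.gpu_tier[block_id] = payload
        self.gpu_tier.move_to_end(block_id)
        if len(self.gpu_tier) > self.gpu_capacity:
            victim_id, victim = self.gpu_tier.popitem(last=False)
            self.host_tier[victim_id] = victim       # offload, don't recompute

    def get(self, block_id: str) -> bytes | None:
        """Fetch a block, promoting it back to the GPU tier if it was offloaded."""
        if block_id in self.gpu_tier:
            self.gpu_tier.move_to_end(block_id)
            return self.gpu_tier[block_id]
        if block_id in self.host_tier:
            payload = self.host_tier.pop(block_id)
            self.put(block_id, payload)              # reload on demand
            return payload
        return None                                  # block must be recomputed


if __name__ == "__main__":
    cache = TieredKVCache(gpu_capacity_blocks=2)
    for i in range(3):
        cache.put(f"block-{i}", b"kv-data")
    # block-0 was spilled to host memory, but is still reusable without recompute:
    assert cache.get("block-0") is not None
```

The same pattern extends to lower-cost storage tiers below host memory; fast GPU-to-GPU movement of these blocks across a cluster is the role the article assigns to NIXL.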

NVIDIA is also contributing TensorRT-LLM CUDA kernels to the FlashInfer project so they can be natively integrated into open source frameworks, which extends the performance benefits of NVIDIA’s inference optimizations to projects that are not directly built on Dynamo.

Who Is Already Using It

The adoption list for Dynamo spans virtually every tier of the AI infrastructure ecosystem, which is a strong signal that the performance benefits are real and the integration path is practical.

Among cloud service providers, Amazon Web Services, Microsoft Azure, Google Cloud, and Oracle Cloud Infrastructure have all integrated the NVIDIA inference platform. NVIDIA Cloud Partners including Alibaba Cloud, CoreWeave, Crusoe, DigitalOcean, Gcore, GMI Cloud, Lightning AI, Nebius, Nscale, Together AI, and Vultr are also on board.

Chen Goldberg, executive vice president of product and engineering at CoreWeave, described the challenge of moving AI from experimental pilots to continuous large-scale production as requiring infrastructure that is as dynamic as the models it supports, and said Dynamo provides the durability and high-performance orchestration required to move the industry’s most ambitious agentic workloads into global production.

Danila Shtan, chief technology officer of Nebius, pointed to the value of NVIDIA’s full software stack from Dynamo to TensorRT-LLM in delivering predictable performance and faster time to deployment, which helps customers find a simpler and higher-performance path to production AI.

Among AI-native companies, Cursor and Perplexity are using the NVIDIA inference platform, as is Hebbia. Inference endpoint providers Baseten, Deep Infra, and Fireworks are also part of the ecosystem.

The global enterprise adoption is particularly broad. AstraZeneca, BlackRock, ByteDance, Coupang, Instacart, Meituan, PayPal, Pinterest, Shopee, and SoftBank Corp are all using the NVIDIA inference platform. Pinterest’s chief technology officer Matt Madrigal described the challenge of delivering a multimodal AI experience to hundreds of millions of users as requiring real-time intelligence at global scale, and said Dynamo is helping the company expand the personalized experiences it delivers through high-performance AI infrastructure.

Together AI cofounder and CEO Vipul Ved Prakash said that combining Dynamo 1.0 with Together AI’s inference research helps deliver a high-performance stack for accelerated, cost-effective inference on large-scale production workloads, which reflects how AI-native companies are thinking about the economics of running inference at scale.

Why the Open Source Approach Matters

NVIDIA releasing Dynamo 1.0 as free and open source software is a deliberate strategic choice, and it reflects how the company is thinking about its position in the inference market. The hardware business, selling Blackwell GPUs and the systems they go into, is enormously valuable. But the value of that hardware is partly determined by how efficiently it can be used, and Dynamo is software that makes NVIDIA GPUs more efficient.

By open sourcing Dynamo and integrating it into the existing frameworks that developers are already using, NVIDIA removes friction from adoption and ensures that the performance advantages of its GPU architecture are accessible to the broadest possible range of developers and organizations. An organization running inference on NVIDIA hardware that adopts Dynamo gets better performance from hardware it is already paying for. An organization evaluating GPU infrastructure has one more reason to choose NVIDIA if the software layer that maximizes performance is free and already integrated into the tools it uses.

The headline figure of up to seven times better performance is compelling on its own, but the more durable advantage is ecosystem depth. When the frameworks developers rely on, the cloud platforms they deploy to, and the enterprises building on top of those platforms all use the same inference optimization layer, the switching costs of moving away from that ecosystem compound over time.

Availability

NVIDIA Dynamo 1.0 is available today to developers worldwide at no cost. Documentation and getting started resources are available on the Dynamo webpage, and the codebase is open source for organizations that want to inspect, modify, or contribute to it.
