NVIDIA unveils Llama Nemotron AI models for enterprises

NVIDIA has announced the launch of the open Llama Nemotron family of models, designed to provide developers and enterprises with a foundation for creating advanced AI agents capable of completing complex tasks.

The Llama Nemotron reasoning family, built on Llama models, has been refined by NVIDIA through post-training to enhance multistep maths, coding, reasoning, and decision-making capabilities. This refinement has reportedly increased model accuracy by up to 20% compared with the base models and improved inference speed fivefold compared with other open reasoning models. As a result, the models can handle more complex tasks, support better decision-making, and potentially reduce enterprise operational costs.

Companies including Accenture, Amdocs, Atlassian, Box, Cadence, CrowdStrike, Deloitte, IQVIA, Microsoft, SAP, and ServiceNow are collaborating with NVIDIA on these new reasoning models and software. These partnerships aim to transform the way industries utilise agentic AI.

"Reasoning and agentic AI adoption is incredible," said Jensen Huang, Founder and CEO of NVIDIA. "NVIDIA's open reasoning models, software, and tools give developers and enterprises everywhere the building blocks to create an accelerated agentic AI workforce."

The Llama Nemotron family is available as NVIDIA NIM microservices in Nano, Super, and Ultra sizes, each optimised for different deployment needs. The Nano model focuses on accuracy for PCs and edge devices, the Super model provides optimal accuracy and throughput on a single GPU, and the Ultra model offers maximum accuracy for multi-GPU servers.

NVIDIA has conducted extensive post-training using high-quality synthetic data generated by NVIDIA Nemotron and other open models, alongside additional curated datasets co-developed by NVIDIA. These tools, datasets, and post-training techniques will be openly available, allowing enterprises to customise their own reasoning models.

Microsoft is integrating the Llama Nemotron reasoning models and NIM microservices into Azure AI Foundry, expanding its model catalogue to enhance services such as the Azure AI Agent Service for Microsoft 365.

SAP is incorporating Llama Nemotron models into its Business AI solutions and Joule, its AI copilot. "We are collaborating with NVIDIA to integrate Llama Nemotron reasoning models into Joule to enhance our AI agents, making them more intuitive, accurate, and cost-effective," said Walter Sun, Global Head of AI at SAP. "These advanced reasoning models will refine and rewrite user queries, enabling our AI to better understand inquiries and deliver smarter, more efficient AI-powered experiences that drive business innovation."

ServiceNow is using the models to enhance enterprise productivity with more accurate AI agents. Accenture has made the models available on its AI Refinery platform to help clients develop and deploy customised AI agents for industry-specific challenges.

Deloitte plans to include Llama Nemotron models in its Zora AI platform, designed for deep functional and industry-specific knowledge and transparency in decision-making.

Developers can deploy these reasoning models with new NVIDIA agentic AI tools and software to facilitate advanced reasoning in collaborative AI systems. The NVIDIA AI Enterprise software platform includes the NVIDIA AI-Q Blueprint and NVIDIA AI Data Platform among other resources for building and optimising AI systems.

The NVIDIA Llama Nemotron Nano and Super models and NIM microservices are now available as a hosted API. Enterprises can run these microservices in production with NVIDIA AI Enterprise on accelerated data centre and cloud infrastructure.
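
For developers evaluating the hosted API, the minimal sketch below assumes the endpoint follows NVIDIA's usual OpenAI-compatible chat-completions interface. The base URL, model identifier, and the "detailed thinking" system-prompt toggle are illustrative assumptions drawn from NVIDIA's public API conventions, not details confirmed in this article; check the NIM documentation before relying on them.

```python
# Minimal sketch: calling a hosted Llama Nemotron reasoning model.
# Assumes an OpenAI-compatible chat-completions interface; the base URL,
# model name, and reasoning toggle are illustrative, not confirmed here.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",      # assumed hosted NIM endpoint
    api_key=os.environ.get("NVIDIA_API_KEY", ""),         # supply your own API key
)

response = client.chat.completions.create(
    model="nvidia/llama-3.3-nemotron-super-49b-v1",       # hypothetical Super-size model ID
    messages=[
        # Some Nemotron reasoning models expose a system-prompt switch for
        # step-by-step reasoning; treat this as an assumption to verify.
        {"role": "system", "content": "detailed thinking on"},
        {"role": "user", "content": "Outline the steps to reconcile two conflicting invoices."},
    ],
    temperature=0.6,
    max_tokens=1024,
)

print(response.choices[0].message.content)
```

Because the interface is OpenAI-compatible, the same call pattern should carry over to a self-hosted NIM microservice by pointing `base_url` at the local container instead of the hosted endpoint.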
