NVIDIA Developer Blog

165 articles archived

NVIDIA Developer Blog official-blog 13h ago

How to Govern Autonomous Agents in Enterprise AI Factories

AI agents are quickly moving beyond chat. They inspect code, run tests, read documents, search knowledge bases, query internal systems, and operate for hours on...

28
NVIDIA Developer Blog official-blog 3d ago

Deploy a Production-Ready NVIDIA AI-Q Blueprint on Oracle Cloud Infrastructure

AI agents have changed a lot in the last two years. The first could only answer one question at a time. Then came multi-turn chat, where the model could keep...

7
NVIDIA Developer Blog official-blog 3d ago

Creating the NVIDIA Nemotron 3 Ultra NVFP4 Checkpoint with NVIDIA Model Optimizer

As context windows grow longer, moving large model weights efficiently becomes critical to performance. A common way to address this is quantization, an...

37
NVIDIA Developer Blog official-blog 4d ago

Streamlining Resource Binding with End-to-End Support for Vulkan Descriptor Heaps

Shaders are GPU programs that process visual data—such as rays, pixels, geometry, and textures—to produce specific rendering effects. Shaders find necessary...

32
NVIDIA Developer Blog official-blog 4d ago

Scaling AI Inference Across Multiple GPUs Using NVIDIA TensorRT with Multi-Device Inference Support

Generative AI workloads are rapidly outgrowing the memory and compute budget of single GPUs. For inference developers building media generation pipelines, the...

38
NVIDIA Developer Blog official-blog 4d ago

How KRAFTON Built PUBG Ally, a Co-Playable Character Powered by NVIDIA ACE

AI companions in games have long been constrained by scripted behavior trees and fixed dialogue. PUBG Ally is a different kind of system. Built by KRAFTON for...

26
NVIDIA Developer Blog official-blog 5d ago

Accelerating BEV Pooling on NVIDIA GPUs for Physical AI Applications

An increasingly common design pattern for autonomous vehicles (AVs), robotics, and spatial AI systems is bird's-eye-view (BEV) perception. BEV models project...

31
NVIDIA Developer Blog official-blog 6d ago

Maximize AI Factory Energy Efficiency Through Full-Stack Inference and Training Optimizations

Power can account for 40% of the operating expenses (OpEx) to run an AI factory. Each watt can be spent on overhead, data ingestion, training, or generating...

34
NVIDIA Developer Blog official-blog 6d ago

Boost Inference Performance up to 15x on NVIDIA Blackwell Using DFlash Speculative Decoding

As AI systems move from single-turn interactions to coordinated multiagent workflows, low-latency inference becomes increasingly important. Autoregressive LLMs...

33
NVIDIA Developer Blog official-blog 6d ago

Build an AI Scientist for Life Science Discovery with NVIDIA BioNeMo Agent Toolkit

AI scientists are emerging as a new interface for scientific computing. These agents can read papers, write code, generate hypotheses, call APIs, inspect files,...

12
NVIDIA Developer Blog official-blog 6d ago

How Telcos Build Autonomous Networks with Agentic AI

Telecom operators are adopting AI across network operations, customer care, and back-office workflows, but most are still early in the journey to autonomy. In...

37
NVIDIA Developer Blog official-blog 7d ago

CCCL Runtime: A Modern C++ Runtime for CUDA

The NVIDIA CUDA Core Compute Libraries (CCCL) provides delightful and efficient abstractions for CUDA developers in C++ and Python. It features: Parallel...

8
NVIDIA Developer Blog official-blog 7d ago

Enable Real-Time AI for High-Speed Data Acquisition with DAQIRI

When AlphaFold2 revolutionized drug discovery in 2020, its success relied entirely on the roughly 170,000 protein structures collected by scientists since 1971...

33
NVIDIA Developer Blog official-blog 7d ago

Inside NVIDIA Halos for Robotics: A Full-Stack Functional Safety System for Physical AI

Physical AI—robots working autonomously alongside people in factories, warehouses, hospitals, and homes—is arriving faster than most expected. Traditional...

12
NVIDIA Developer Blog official-blog 13d ago

Building AI Agents for AR Glasses and XR Devices with NVIDIA XR AI

Developers building for AR glasses and wearable devices face an infrastructure gap. The hardware is ready, but creating AI experiences requires integrating live...

33
NVIDIA Developer Blog official-blog 13d ago

Build Your Own Transaction Foundation Model for Financial Intelligence

Every swipe, transfer, and payment on a modern financial network encodes a pattern of human behavior. Transaction data is one of the richest signals an...

16
NVIDIA Developer Blog official-blog 13d ago

Build On-Device AI Companions with the NVIDIA ACE Game Agent SDK and Unreal Engine 5 Plugins

NVIDIA RTX technologies are deeply integrated into Unreal Engine 5 through the NVIDIA RTX Branch of Unreal Engine and the NVIDIA DLSS Unreal Engine plugin. This...

23
NVIDIA Developer Blog official-blog 13d ago

How to Optimize Transformer-Based Models for Low-Precision Training

Transformer architectures are the backbone of many modern large language and generative AI models. As these models grow in size, training runs consume more GPU...

5
NVIDIA Developer Blog official-blog 13d ago

NVIDIA Blackwell Tops MLPerf Training 6.0 with Industry-Leading Scale and Performance

NVIDIA delivered a clean sweep in MLPerf Training v6.0, the latest edition of industry-standard AI training benchmarks developed by the MLCommons consortium....

17
NVIDIA Developer Blog official-blog 14d ago

Fine-Tuning Biological Foundation Models with LoRA Using NVIDIA BioNeMo Recipes

Foundation models are reshaping computational biology. Pretrained on massive corpora of protein or genomic sequences, models such as ESM2 (a protein language...

8
NVIDIA Developer Blog official-blog 14d ago

Boosting MoE Training Throughput with Advanced Fusion Kernels

Mixture-of-experts (MoE) models have quickly become a foundational component of modern, large-scale AI systems. They are widely adopted because they enable...

36
NVIDIA Developer Blog official-blog 14d ago

Pretrained to Imagine, Fine-Tuned to Act: The Rise of World-Action Models

Quick glossary for readers new to VLA/WAM terminology. VLA (Vision-Language-Action model): a robot policy that starts from a pretrained VLM backbone and adapts it...

22
NVIDIA Developer Blog official-blog 17d ago

NVIDIA Achieves Leading Agentic Coding Performance on First Agentic AI Benchmark

AI agents have fundamentally changed the complexity of inference workloads. Until now, the industry has struggled to define a standard for measuring how...

8
NVIDIA Developer Blog official-blog 17d ago

Deploy Long-Context Reasoning and Agentic Workflows with MiniMax M3 on NVIDIA Accelerated Infrastructure

As enterprise AI adoption scales, developers are increasingly forced to stitch together fragmented pipelines—separate models for text, vision, and...

25
NVIDIA Developer Blog official-blog 18d ago

One-Click Multi-Tenant Security with NVIDIA Quantum InfiniBand

NVIDIA Quantum InfiniBand now offers intent-based security profiles in Unified Fabric Manager (UFM) that enable multi-tenant fabric security in a single...

33
NVIDIA Developer Blog official-blog 19d ago

Run DiffusionGemma on NVIDIA for Developer-Ready, High-Throughput Text Generation

Developers building real-time AI—such as chat assistants, copilots, and agentic workflows—are often constrained by token-by-token generation speed. This...

6
NVIDIA Developer Blog official-blog 19d ago

Designing Production-Ready Battery Energy Storage Systems for AI Factories

AI factories are changing what data-center infrastructure must do. Unlike traditional data centers, AI factories are built to manufacture intelligence at scale....

29
NVIDIA Developer Blog official-blog 20d ago

Delivering Lifecycle Control for AI Infrastructure at Scale with NVIDIA DGX Spark Enterprise Manageability

As AI infrastructure scales, enterprise expectations for operational maturity are increasing. Organizations expect these systems to be provisionable,...

38
NVIDIA Developer Blog official-blog 20d ago

Model Quantization: Turn FP8 Checkpoints into High-Performance Inference Engines with NVIDIA TensorRT

Converting a quantized checkpoint into an NVIDIA TensorRT engine bridges the gap between model optimization and production deployment, enabling faster...

6
NVIDIA Developer Blog official-blog 20d ago

Accelerating Federated Learning Research with AI Agents and NVIDIA FLARE Auto-FL

Federated learning (FL) research often begins with a deceptively simple question: What should we try next? A new aggregation rule, a FedProx coefficient, a...

16
NVIDIA Developer Blog official-blog 20d ago

Evaluate Clinical ASR Models Faster with Agent Skills and NVIDIA Nemotron Speech

Training a speech AI model to correctly recognize or synthesize clinical terminology is surprisingly difficult. Drug names like Acetaminophen, Amlodipine,...

9
NVIDIA Developer Blog official-blog 21d ago

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell

Pre-training frontier LLMs comes down to throughput. When training spans trillions of tokens across thousands of accelerators, every percentage point of step...

34
NVIDIA Developer Blog official-blog 25d ago

NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents

Single-turn chatbots are evolving into long-running agents that can reason, maintain context, use tools, and run efficiently across many turns to complete...

33
NVIDIA Developer Blog official-blog 27d ago

Build Personal AI Agents on Windows PCs with New Tools from Microsoft and NVIDIA

AI agents are changing how you interact with your PC. Creators, developers, and AI enthusiasts are already using these agents extensively to assist with...

17
NVIDIA Developer Blog official-blog 27d ago

Deploy Self-Evolving Agents for Faster, More Secure Research with a Hermes Agent and NVIDIA NemoClaw

AI agents are a powerful tool for synthesizing data to accelerate research, summarize information, and help teams make decisions faster. But combining internal...

29
NVIDIA Developer Blog official-blog 28d ago

Deploy Agentic-Ready AI at the Edge with Memory Efficiency in NVIDIA JetPack 7.2

As AI agents move from the digital world to the physical environment, they can readily use NVIDIA Jetson to accelerate real-world deployment with optimized...

24
NVIDIA Developer Blog official-blog 28d ago

Run Local AI Agents with Faster Models and Multi-Node Clustering on NVIDIA DGX Spark

The rise of autonomous, long-running AI agents has introduced a new class of compute demand, namely tasks that maintain large context windows, spawn concurrent...

16
NVIDIA Developer Blog official-blog 29d ago

How to Post-Train Autonomous Vehicle Models in Closed-Loop with NVIDIA Alpamayo

Developing autonomous vehicle (AV) policies requires bridging an important gap between training and deployment. Vision-language-action (VLA) models that can...

26
NVIDIA Developer Blog official-blog 29d ago

Develop Physical AI Reasoning, World, and Action Models with NVIDIA Cosmos 3

Physical AI systems must understand the real world before they can act within it. Robots, autonomous vehicles, and smart spaces need to understand what's...

21
NVIDIA Developer Blog official-blog 29d ago

Advancing AI Infrastructure for Agentic AI with NVIDIA DOCA In-Silicon Security

The AI era is driving a new class of infrastructure: AI factories that transform data into intelligence for autonomous AI agents operating at unprecedented...

4
NVIDIA Developer Blog official-blog 29d ago

NVIDIA Vera CPU Sets a New Standard for Agentic Workloads in AI Factories

Each wave of AI has created a new scaling law. Pretraining scaled intelligence through larger datasets, more parameters, and massively parallel GPU systems....

4
NVIDIA Developer Blog official-blog 29d ago

NVIDIA DSX OS Delivers Open, Modular Software for Operating AI Factories at Scale

AI is now essential infrastructure, powered by AI factories that generate intelligence in the form of tokens. As demand grows, these factories must scale...

13
NVIDIA Developer Blog official-blog 1mo ago

DynoSim: Simulating the Pareto Frontier

Modern LLM serving is hard to tune because each deployment is a stack of interacting choices: model backend, tensor-parallel shape, prefill/decode split, worker...

22
NVIDIA Developer Blog official-blog 1mo ago

How to Automate AI Model Documentation with the NVIDIA MCG Toolkit

As AI models grow in complexity and regulatory scrutiny intensifies under frameworks including California’s AB-2013 and the EU AI Act, software teams...

17
NVIDIA Developer Blog official-blog 1mo ago

Run Step 3.7 Flash on NVIDIA GPUs with Enterprise-Ready Multimodal AI

AI applications are moving beyond text generation to multimodal systems that can perceive, search, and reason across images, documents, video, and...

38
NVIDIA Developer Blog official-blog 1mo ago

NVIDIA Dynamo Snapshot: Fast Startup for Inference Workloads on Kubernetes

The cold-start problem: In production inference deployments, demand fluctuates over time, requiring inference replicas to scale elastically. However,...

35
NVIDIA Developer Blog official-blog 1mo ago

NVIDIA Blackwell Sets STAC-AI Record for LLM Inference in Finance

Large language models (LLMs) are revolutionizing the financial trading landscape by enabling sophisticated analysis of vast amounts of unstructured data to...

4
NVIDIA Developer Blog official-blog 1mo ago

What’s New for Game Developers in NVIDIA RTX: DLSS 4.5 for UE5 and Multilingual AI Characters

NVIDIA RTX provides game developers with direct paths to AI-driven characters, frame generation, and ray-traced rendering. This post walks through a meaningful...

33
NVIDIA Developer Blog official-blog 1mo ago

Develop High-Performance GPU Kernels in C++ with NVIDIA CUDA Tile

Developers can now use NVIDIA CUDA Tile programming within large existing C++ GPU codebases to develop highly optimized GPU kernels using tile-based...

38
NVIDIA Developer Blog official-blog 1mo ago

NVIDIA CUDA 13.3 Enhances GPU Development with Tile Programming in C++, Compiler Autotuning, and Python Updates

NVIDIA CUDA 13.3 brings new capabilities and performance optimizations to developers across the CUDA ecosystem. The launch of NVIDIA CUDA Tile programming in...

9
