Vercel — AI · 1 min read

Nemotron 3 Ultra now available on AI Gateway

Jun 4, 2026

Nemotron 3 Ultra from Nvidia is now available on Vercel AI Gateway.

Nemotron 3 Ultra is an open Mixture-of-Experts reasoning model with a 1M token context window, built for orchestrating long-running, multi-turn agent workflows: planning, tool use, sub-agent delegation, and error recovery. Throughput is up to 350 tokens per second, with up to 30% lower cost on agentic tasks.
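To put the throughput figure in perspective, here is a rough back-of-the-envelope sketch. The helper name `estimateGenerationSeconds` is hypothetical, and because 350 tokens per second is quoted as an upper bound, the result is a best-case estimate rather than a guarantee.

```typescript
// Hypothetical helper: best-case seconds needed to generate `outputTokens`
// at a steady throughput. The default of 350 tokens/s is the "up to" figure
// quoted above, so real runs may take longer.
function estimateGenerationSeconds(
  outputTokens: number,
  tokensPerSecond: number = 350,
): number {
  if (outputTokens < 0 || tokensPerSecond <= 0) {
    throw new RangeError('outputTokens must be >= 0 and tokensPerSecond > 0');
  }
  return outputTokens / tokensPerSecond;
}

// A 7,000-token report at the peak rate: 7000 / 350 = 20 seconds.
console.log(estimateGenerationSeconds(7000)); // 20
```

The estimate covers output generation only and ignores time to first token, tool calls, and network latency.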

To use Nemotron 3 Ultra, set model to nvidia/nemotron-3-ultra-550b-a55b in the AI SDK.

import { streamText } from 'ai';

const result = streamText({
  model: 'nvidia/nemotron-3-ultra-550b-a55b',
  prompt: 'Plan and run a multi-step research task and synthesize a report.',
});
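The snippet above starts the stream but does not read it. As a minimal sketch of one way to consume the output, the hypothetical helper `collectText` below concatenates the chunks of any async-iterable text stream. It assumes `result.textStream` is async-iterable, as it is in current AI SDK versions. The `fakeTextStream` generator is a stand-in so the example runs without network access or credentials.

```typescript
// Hypothetical helper: joins every chunk of an async-iterable text stream
// into a single string.
async function collectText(stream: AsyncIterable<string>): Promise<string> {
  let text = '';
  for await (const chunk of stream) {
    text += chunk;
  }
  return text;
}

// Stand-in for a model response stream (an assumption for illustration only).
async function* fakeTextStream(): AsyncGenerator<string> {
  yield 'Step 1: plan. ';
  yield 'Step 2: report.';
}

collectText(fakeTextStream()).then((text) => console.log(text));
```

With the AI SDK, the same helper could be called as `await collectText(result.textStream)`. To display output as it arrives instead of buffering it, write each chunk inside the loop.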

AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in custom reporting, Zero Data Retention support, dynamic provider sorting by latency and cost, and more. AI Gateway reflects provider pricing with no markup and does not charge a platform fee on inference, including on Bring Your Own Key (BYOK) requests.

Learn more about AI Gateway, view the AI Gateway model leaderboard, or try it in our model playground.
