OpenAI and Broadcom unveil LLM-optimized inference chip
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
https://openai.com/index/openai-broadcom-jalapeno-inference-chip/
Quoted from the start of the blog post:
- Early testing shows that the first-generation accelerator will deliver performance per watt substantially better than current state-of-the-art
- Built from the ground up for current and future LLMs across the industry
- Developed from design to production in nine months, accelerated by OpenAI’s models
- Expands OpenAI’s full-stack platform, from products to models and now to chips
- To be deployed at gigawatt scale with data center partners, over multiple generations
The announcement has little content beyond this. The chip does not appear to be aimed at consumers, but it's worth knowing about either way.