Tag

Model releases

390 articles archived under #model-release · RSS

r/LocalLLaMA community 2h ago

24+ tok/s from ~30B MoE models on an old GTX 1080 (8 GB VRAM, 128k context)

I got Qwen 3.6 35B-A3B and Gemma 4 26B-A4B running on a $200 secondhand machine (i7-6700 / GTX 1080 / 32 GB RAM) using llama.cpp (the TurboQuant/RotorQuant KV cache quantisation allows 128k context within the 8 GB VRAM). Results (Q4_K_M models, 128k context): Model tok/s Key…

19
Don't Worry About the Vase community 3h ago

Cyber Lack of Security and AI Governance

The real recent story of AI has been the background work being done on Cybersecurity, as we process the Mythos Moment along with GPT-5.5, and figure out both how to patch the internet and what our new regulatory regime is going to look like.

31
TechCrunch — AI news-outlet 3h ago

Anthropic’s Cat Wu says that, in the future, AI will anticipate your needs before you know what they are

The head of product for Claude Code and Cowork says that the next big step for AI is proactivity.

36
r/LocalLLaMA community 4h ago

MI50s Qwen 3.6 27B @52.8 tps TG @1569 tps PP (no MTP, no Quant)

TL;DR Results from the title are for single inference with 2 prompts of 1k and 15k tokens. So no MTP (as it’s slower for big prompts), no DFlash (working too but slower for big prompts), no quant used (full precision wanted) and the results are pretty good for a 2018 card. (Bench…

27
The Information — AI news-outlet 4h ago

Latest Version of Anthropic’s Mythos AI is Even Better at Hacking, UK Researchers Say

Anthropic’s latest version of its Mythos AI showed “notable capability jumps” at finding and exploiting undiscovered software vulnerabilities compared to an earlier version of the model, researchers at the U.K.’s AI Security Institute said Wednesday. Anthropic has not released…

13
r/LocalLLaMA community 6h ago

Who is your favourite quant publisher and why?

Hey everyone, I’ve been a big fan of Unsloth for several reasons: They publish models ASAP after release. They usually offer the lowest PPL. Their website has tons of helpful tutorials and documentation. Recently, I stumbled upon this Reddit thread suggesting to try out an Apex…

10
TechCrunch — AI news-outlet 8h ago

Amazon launches an AI shopping assistant for the search bar, powered by Alexa+

Alexa for Shopping is a new personalized AI shopping assistant in the Amazon search bar that replaces its Rufus assistant.

12
TechCrunch — AI news-outlet 9h ago

Introducing the 6 stages at TechCrunch Disrupt 2026 — built for today’s tougher startup market

From October 13-15, TechCrunch Disrupt 2026 will feature 200+ sessions across six stages, led by 250+ tech leaders shaping the industry today. Register now to save up to $410, plus 50% off a second pass.

16
OpenAI official-blog 9h ago

Claude 4 announced — context window doubles, agentic tools land

Anthropic published the Claude 4 release notes today, doubling the context window to 400K tokens and shipping native tool-use across the API + Claude.ai web client.

18
r/LocalLLaMA community 9h ago

qwen3.6 just stops

https://preview.redd.it/74cj1xu9pw0h1.png?width=1229&format=png&auto=webp&s=3ae999cc3530ecb4eccf70e25f1a9eb2aa3f2d7b Sometimes qwen 3.6 just stops in the middle of a task. Is there a way to avoid it? This is qwen-code CLI, but it also happens on opencode. Running with vLLM with…

17
Google DeepMind official-blog 9h ago

GPT-5 paper drops on arXiv — scaling laws revisited

OpenAI researchers released a 47-page preprint examining how scaling laws hold up at trillion-parameter regimes, with new evidence for compute-optimal training.

27 2
The Information — AI news-outlet 10h ago

Amazon Drops ‘Rufus’ Branding on Shopping Chatbot, Adds AI in Search

Amazon is rebranding its Rufus chatbot to Alexa for Shopping, the company said on Wednesday. Amazon first announced an AI-powered chatbot for shoppers called Rufus in early 2024. A year later, Amazon launched Alexa+, the large language model-enhanced version of its decade…

19
r/LocalLLaMA community 10h ago

TextGen is now a native desktop app. Open-source alternative to LM Studio (formerly text-generation-webui).

Hi all, I have been making a lot of updates to my project, and I wanted to share them here. TextGen (previously text-generation-webui, also known as my username oobabooga or ooba) has been in development since December 2022, before LLaMa and llama.cpp existed. In the last two…

32
Microsoft AI official-blog 11h ago

Hugging Face releases open-weights model family

Three new open-weights models under Apache 2.0 — sizes from 1B to 70B — released alongside training recipes and evaluation harnesses.

21
Cohere official-blog 12h ago

Mistral AI announces fine-tuning service

A managed fine-tuning offering opens up for Mistral Large + Medium tier users, with LoRA + full-parameter options.

27
r/LocalLLaMA community 14h ago

The Trillion-Parameter Dilemma: MiMo-V2.5-Pro went open-source (1.02T params). Is self-hosting worth it when the API costs $70 for 387M tokens?

Xiaomi open-sourced MiMo-V2.5-Pro. 1.02 trillion parameters, 42B active (MoE), 1M context, MIT license. On paper, this is exciting. In practice, I'm stuck on the math. What I've been doing with it: I've been running V2.5-Pro via the API through Claude Code for autonomous coding…

13
r/LocalLLaMA community 15h ago

Does THINKING MODE significantly improve translation?

Between a solid model from Qwen or Gemma 4, when translating a text, does "thinking mode" significantly boost the quality of the translation, or is the difference negligible? Submitted by /u/Sostrene_Blue.

27
The Information — AI news-outlet 15h ago

Former Alibaba Star Researcher Starts New AI Lab, Seeks $2 Billion Valuation

Junyang Lin, former lead researcher of Alibaba’s Qwen models who left the firm earlier this year, is seeking to raise several hundred million dollars for his new AI lab, The Information reported. Lin’s new AI lab will likely be valued at around $2 billion after the funding…

32
Hacker News — AI on Front Page community 16h ago

SecurityBaseline.eu

Article URL: https://internetcleanup.foundation/2026/05/european-governments-3000-tracking-sites-1000-phpmyadmins-and-99pct-poorly-encrypted-email-introducing-securitybaseline-eu/ Comments URL: https://news.ycombinator.com/item?id=48118763 Points: 203 # Comments: 101

16
The Information — AI news-outlet 17h ago

Former Alibaba Star Researcher Starts New AI Lab, Seeks $2 Billion Valuation

Junyang Lin, former lead researcher of Alibaba’s Qwen models, is seeking to raise several hundred million dollars for his new AI lab, according to two people with direct knowledge of the matter. The new AI lab will likely be valued at around $2 billion after the funding round,…

28
arXiv — Machine Learning research 19h ago

LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models

arXiv:2605.11011v1 Announce Type: new Abstract: Looped computation shows promise in improving the reasoning-oriented performance of LLMs by scaling test-time compute. However, existing approaches typically require either training recurrent models from scratch or applying…

37
arXiv — NLP / Computation & Language research 19h ago

Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models

arXiv:2605.11887v1 Announce Type: new Abstract: Large language models have achieved remarkable capabilities across diverse tasks, yet their internal decision-making processes remain largely opaque, limiting our ability to inspect, control, and systematically improve them. This…

22
arXiv — NLP / Computation & Language research 19h ago

Caraman at SemEval-2026 Task 8: Three-Stage Multi-Turn Retrieval with Query Rewriting, Hybrid Search, and Cross-Encoder Reranking

arXiv:2605.12028v1 Announce Type: new Abstract: We describe our system for SemEval-2026 Task 8 (MTRAGEval), participating in Task A (Retrieval) across four English-language domains. Our approach employs a three-stage pipeline: (1) query rewriting via a LoRA-fine-tuned Qwen 2.5…

30
Hugging Face Daily Papers research 21h ago

LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models

AI-generated summary: LoopUS is a post-training framework that transforms pretrained LLMs into looped architectures for improved reasoning performance through latent-refinement and adaptive early exiting mechanisms. Abstract: Looped computation shows promise in improving the…

31
Simon Willison community 23h ago

datasette 1.0a29

Release: datasette 1.0a29 New TokenRestrictions.abbreviated(datasette) utility method for creating "_r" dictionaries. #2695 Table headers and column options are now visible even if a table contains zero rows. #2701 Fixed bug with display of column actions dialog on Mobile…

24
r/LocalLLaMA community 1d ago

High VRAM local coding model — still Qwen 3.6 27B?

I’ve been using Qwen 3.6 27B and it’s amazing. Not exactly your Opus replacement, but great for small tasks and checking work. But if you had 224GB of VRAM, would it still be your choice? Or is there something you consider better in the 100+B range (GPT-OSS, Deepseek, etc)…

7
Hacker News — AI on Front Page community 1d ago

Scrcpy v4.0

Article URL: https://github.com/Genymobile/scrcpy/releases/tag/v4.0 Comments URL: https://news.ycombinator.com/item?id=48114356 Points: 246 # Comments: 39

4
Ollama releases dev-tools 1d ago

v0.23.4-rc0

launch/opencode: add image modalities for vision models (#15922)

24
Ollama releases dev-tools 1d ago

v0.23.4

launch/opencode: add image modalities for vision models (#15922)

36
llama.cpp releases dev-tools 1d ago

b9123

ggml-webgpu: Enables running gpt-oss-20b (#22906). Enable to run gpt-oss-20b and refactor mulmat-q; disable test-backend-ops in ubuntu-24-webgpu. macOS/iOS: macOS Apple Silicon (arm64), macOS Apple Silicon (arm64, KleidiAI enabled), macOS Intel (x64), iOS XCFramework. Linux: Ubuntu…

13
r/LocalLLaMA community 1d ago

Luce DFlash + PFlash on AMD Strix Halo: Qwen3.6-27B at 2.23x decode and 3.05x prefill vs llama.cpp HIP

Hey fellow Llamas, keeping it short. We just shipped DFlash and PFlash support for the AMD Ryzen AI MAX+ 395 iGPU (gfx1151, Strix Halo, 128 GiB unified memory). Same Luce DFlash stack from the RTX 3090 post a couple weeks back, now running on the consumer AMD APU class. Repo:…

22
Hacker News — Front Page community 1d ago

Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model

Hey HN, Henry here from Cactus. We open-sourced Needle, a 26M parameter function-calling (tool use) model. It runs at 6000 tok/s prefill and 1200 tok/s decode on consumer devices. We were always frustrated by the little…

28
r/LocalLLaMA community 1d ago

Needle: We Distilled Gemini Tool Calling Into a 26M Model

We open-sourced Needle, a 26M parameter function-calling (tool use) model. It runs at 6000 tok/s prefill and 1200 tok/s decode on consumer devices. We were always frustrated by the little effort made towards building agentic models that run on budget phones, so we conducted…

4
Simon Willison community 1d ago

llm 0.32a2

Release: llm 0.32a2 A bunch of useful stuff in this LLM alpha, but the most important detail is this one: Most reasoning-capable OpenAI models now use the /v1/responses endpoint instead of /v1/chat/completions. This enables interleaved reasoning across tool calls for GPT-5…

22
r/LocalLLaMA community 1d ago

New Qwen3.6 27b Autoround Quant (int4) Best Recipe

I've been using the int4 Autoround quant from "Lorbus/Qwen3.6-27B-int4-AutoRound" and it has been pretty good! Great quality and performance on an RTX 5090 vllm. I decided to use a similar Autoround recipe but use the "autorund-best" preset instead, it uses more iterations to…

34
TechCrunch — AI news-outlet 1d ago

Everything Google announced at its Android Show, from Googlebooks to vibe-coded widgets

Google unveiled its new AI-first Googlebooks laptops, more agentic Gemini features, vibe-coded Android widgets, Gemini in Chrome, refreshed Android Auto, and more ahead of I/O.

32
TechCrunch — AI news-outlet 1d ago

Google adds Gemini-powered dictation to Gboard, which could be bad news for dictation startups

Google's transcription feature will initially launch with Samsung Galaxy and Google Pixel phones.

27
TechCrunch — AI news-outlet 1d ago

Google brings agentic AI and vibe-coded widgets to Android

Gemini Intelligence will also include Gboard-based dictation and form-filling capabilities.

30
r/LocalLLaMA community 1d ago

Let's build claude code from scratch!

So, I made this video about how to create claude code from scratch. Here's the video: https://youtu.be/8pDfgBEy8bg and Github: https://github.com/CohleM/nanoclaude Feedback is extremely appreciated. Submitted by /u/RoyalMaterial9614.

10
r/LocalLLaMA community 1d ago

Local LLM autocomplete + agentic coding on a single 16GB GPU + 64GB RAM

Today I set up a full coding toolbox on a single RTX 5080 (with RAM offloading) that's actually viable. Autocomplete: bartowski/Qwen2.5-Coder-7B-Instruct-GGUF:Q6_K_L; Agentic: unsloth/Qwen3.6-35B-A3B-GGUF:UD-Q8_K_XL. Why these models: Qwen2.5 is still the best model for infill…

9
LangChain releases dev-tools 1d ago

langchain==1.3.0

This release adds support for version="v3" in stream_events / astream_events for langchain agents. Refer to the event streaming guide for details.

30
r/LocalLLaMA community 1d ago

MagicQuant (v2.0) - Hybrid Mixed GGUF Models + Unsloth Dynamic Learned Quant Configurations + Benchmark table with collapsed winners and more

I spent the past 5+ months building a pipeline that creates hybrid GGUF quant mixes. I also built it to learn from Unsloth (or other) models by utilizing their quant to tensor assignment. And some architectures like Qwen3.6 27B have super weird patterns that can get genuinely…

14
r/MachineLearning community 1d ago

TabPFN-3 just released: a pre-trained tabular foundation model for up to 1M rows [R][N]

TabPFN-3 was released today, the next iteration of the tabular foundation model, originally published in Nature. Quick recap for anyone new to TabPFN: TabPFN predicts on tabular data in a single forward pass - no training, no hyperparameter search, no tuning. Built on TabPFN-2.5…

31
Vercel — AI dev-tools 1d ago

Fast mode for Opus 4.7 available on AI Gateway

Fast mode for Claude Opus 4.7 is now available on AI Gateway in research preview. Fast mode delivers ~2.5x faster output token generation with full Opus 4.7 intelligence. This is an early, experimental feature. To enable fast mode, pass speed: 'fast' in the anthropic provider…

32
Vercel — AI dev-tools 1d ago

AI Gateway production index

Ask which AI model is best, and the answer changes before the ink dries. That's what happens in an industry where new models are released weekly. Every benchmark measures a different race, and every race crowns its own winner, but Vercel has a unique view of the industry through…

38
OpenAI news 1d ago

How NVIDIA engineers and researchers build with Codex

Teams use Codex with GPT-5.5 to ship production systems and turn research ideas into runnable experiments.

18
Vercel — AI dev-tools 1d ago

Node.js 26.x now available on Vercel Sandboxes

Vercel Sandbox now supports Node.js version 26. To run a Sandbox with Node.js 26, upgrade @vercel/sandbox to 1.10.2 or later, or to 2.0.0-beta.19 or later if you're using v2, and set the runtime property to node26. Get started today and learn more in the documentation.

4
NVIDIA Developer Blog official-blog 2d ago

Introducing NVIDIA Fleet Intelligence for Real-Time GPU Fleet Visibility and Optimization

The compute capability of large GPU fleets presents unprecedented opportunities to innovate and provide value to customers in record time. Yet these...

9
Anthropic SDK (Python) releases dev-tools 2d ago

v0.101.0

0.101.0 (2026-05-11). Full Changelog: v0.100.0...v0.101.0. Features: aws: Add AWS client for Claude Platform on AWS (1e70e3a). Bug Fixes: client: add missing f-string prefix in file type error message (06d109a). Chores: examples: bump tools_runner.py to claude-sonnet-4-5-20250929 (…

18
Stack Overflow Blog news 2d ago

Introducing the Heap, the software engineering blog for everyone

If you’ve got something you’ve been dying to share with the Stack Overflow community but don’t quite have a place to share it, we've got you.

5
OpenAI news 2d ago

OpenAI launches DeployCo to help businesses build around intelligence

OpenAI launches DeployCo, a new enterprise deployment company built to help organizations bring frontier AI into production and turn it into measurable business impact.

32
vLLM releases dev-tools 3d ago

v0.20.2

vLLM v0.20.2 Highlights: This release features 6 commits from 6 contributors (0 new)! This is a small patch release with bug fixes for DeepSeek V4, gpt-oss, and Qwen3-VL. Bug Fixes: DeepSeek V4 sparse attention: Re-enable the persistent topk path on Hopper and ensure the memset…

11
Simon Willison community 5d ago

Using Claude Code: The Unreasonable Effectiveness of HTML

Thought-provoking piece by Thariq Shihipar (on the Claude Code team at Anthropic) advocating for HTML over Markdown as an output format to request from Claude. The article is crammed with interesting examples (collected…

19
Don't Worry About the Vase community 5d ago

Claude Code, Codex and Agentic Coding #8

When I started this series, everyone was going crazy for coding agents.

20
LangChain releases dev-tools 5d ago

langchain==1.2.18

Changes since langchain==1.2.17: release(langchain): 1.2.18 (#37250); revert: feat(langchain): ls_agent_type tag on create_agent calls (#37249); chore(langchain-classic): deprecate hub, limit loads/dumps (#37234); refactor(langchain-classic): retarget deprecations to…

26
Latent.Space news-outlet 5d ago

[AINews] GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

OpenAI continues deploying GPT-5 everywhere

18
Smol AI News news-outlet 5d ago

not much happened today

**OpenAI** rapidly expanded the **GPT-5.5** family with multiple variants including **gpt-image-2**, **GPT-5.5 Pro**, and **GPT-5.5 Cyber**, receiving positive feedback for efficiency and usability. **Codex** evolved into a long-running agent runtime with a new **/goal**…

35
Simon Willison community 6d ago

llm-gemini 0.31

Release: llm-gemini 0.31 gemini-3.1-flash-lite is no longer a preview. Here's my write-up of the Gemini 3.1 Flash-Lite Preview model back in March. I don't believe this new non-preview model has changed since then. Tags: llm-release, gemini, llm, google, generative-ai, ai…

10
Simon Willison community 6d ago

Behind the Scenes Hardening Firefox with Claude Mythos Preview

Fascinating, in-depth details on how Mozilla used their access to the Claude Mythos preview to locate and then fix hundreds of vulnerabilities in Firefox: Suddenly, the bugs are very good Just a few months ago,…

6
Simon Willison community 6d ago

Notes on the xAI/Anthropic data center deal

There weren't a lot of big new announcements from Anthropic at yesterday's Code w/ Claude event, but the biggest by far was the deal they've struck with SpaceX/xAI to use "all of the capacity of their Colossus data center". As I mentioned in my live blog of the keynote, that's…

8
LangChain releases dev-tools 6d ago

langchain-core==0.3.86

Changes since langchain-core==0.3.85: release(core): 0.3.86 (#37242); fix(core): backport path-traversal fix to v0.3 (CVE-2026-34070, GHSA-qh6h-p6c9-ff54) (#37233)

21
LangChain releases dev-tools 6d ago

langchain==0.3.30

Changes since langchain==0.3.29: release(langchain): release 0.3.30 (#37241); chore(langchain): backport loads/dumps harden to v0.3 and deprecate hub (#37239)

27
LangChain releases dev-tools 6d ago

langchain-classic==1.0.7

Changes since langchain-classic==1.0.6: release(langchain-classic): 1.0.7 (#37240); chore(langchain-classic): deprecate hub, limit loads/dumps (#37234)

31
OpenAI news 6d ago

Scaling Trusted Access for Cyber with GPT-5.5 and GPT-5.5-Cyber

OpenAI expands Trusted Access for Cyber with GPT-5.5 and GPT-5.5-Cyber, helping verified defenders accelerate vulnerability research and protect critical infrastructure.

38
Vercel — AI dev-tools 6d ago

Next.js May 2026 security release

Summary: We have shipped a coordinated security release for Next.js addressing 13 advisories across denial of service, middleware and proxy bypass, server-side request forgery, cache poisoning, and cross-site scripting. One advisory addresses an upstream React Server Components…

7
Smol AI News news-outlet 6d ago

GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

**OpenAI** released **GPT-Realtime-2**, a voice model with **GPT-5-class reasoning**, tool use, interruption handling, and extended context windows up to **128K tokens**, achieving top scores on **Big Bench Audio** and **Conversational Dynamics** benchmarks. They also launched a…

22
OpenAI news 6d ago

Introducing Trusted Contact in ChatGPT

Introducing Trusted Contact in ChatGPT, an optional safety feature that notifies someone you trust if serious self-harm concerns are detected.

23
Ars Technica — AI news-outlet 7d ago

Anthropic raises Claude Code usage limits, credits new deal with SpaceX

Deal follows others with Microsoft, Amazon, and more.

12
LangChain releases dev-tools 7d ago

langchain==1.3.0a2

Initial release: release(langchain): 1.3.0a2 (#37225); release(langchain): 1.3.0a2 (#37224); fix(langchain): ordered schema resolution — list replaces set so state_schema wins (#37223); release(langchain): 1.3.0a1 (#37140); feat(langchain): wire stream_events(version='v3')…

27
Ars Technica — AI news-outlet 7d ago

Anthropic's Claude Managed Agents can now "dream," sort of

Also, 5-hour usage limits will double for Pro and Max users of Claude Code.

14
Simon Willison community 7d ago

Live blog: Code w/ Claude 2026

I'm at Anthropic's Code w/ Claude event today. Here's my live blog of the morning keynote sessions. Tags: ai, generative-ai, llms, anthropic, claude, claude-code, live-blog

18
Smol AI News news-outlet 7d ago

Anthropic-SpaceXai's 300MW/$5B/yr deal for Colossus I, ARR growth is 8000% annualized

**Anthropic** announced a new **SpaceX compute partnership** to significantly increase capacity for **Claude** products, doubling **Claude Code's 5-hour rate limits** for Pro, Max, Team, and Enterprise users, removing peak-hour limit reductions, and substantially increasing API…

26
OpenAI news 7d ago

Introducing ChatGPT Futures: Class of 2026

Meet the ChatGPT Futures Class of 2026—26 student innovators using AI to build, research, and drive real-world impact. Discover how this generation is redefining learning, creativity, and opportunity with ChatGPT.

14
Zed Editor dev-tools 7d ago

Introducing Zed for Business

Zed for Business adds org-wide AI settings, policy enforcement, and centralized billing for teams already using Zed.

20
Simon Willison community 7d ago

datasette-referrer-policy 0.1

Release: datasette-referrer-policy 0.1 The OpenStreetMap tiles on the Datasette global-power-plants demo weren't displaying correctly. This turned out to be caused by two bugs. The first is that the CAPTCHA I added to that site a few weeks ago was triggering for the .json fetch…

25
Don't Worry About the Vase community 8d ago

The AI Ad-Hoc Prior Restraint Era Begins

The White House has ordered Anthropic not to expand access to Mythos, and is at least seriously considering a complete about-face of American Frontier AI policy into a full prior restraint regime, where anyone wishing to release a highly capable new model will have to ask for…

37
LangChain releases dev-tools 8d ago

langchain-core==1.3.3

Changes since langchain-core==1.3.2: release(core): 1.3.3 (#37198); fix(core): set deprecation since to 1.3.3 to match release (#37200); fix(core, langchain): harden load() against untrusted manifests (#37197); chore: bump notebook from 7.5.0 to 7.5.6 in /libs/core (#37109)…

22
LangChain releases dev-tools 8d ago

langchain-fireworks==1.3.1

Changes since langchain-fireworks==1.3.0: fix(fireworks): require api_key in FireworksEmbeddings (#37193); release(fireworks): 1.3.1 (#37189); fix(fireworks): strip non-wire keys from ToolMessage text content blocks (#37187)

33
LangChain releases dev-tools 8d ago

langchain-mistralai==1.1.4

Changes since langchain-mistralai==1.1.3: release(mistralai): 1.1.4 (#37191); fix(mistralai): strip non-wire keys from ToolMessage (#37188)

4
OpenAI news 8d ago

Unlocking large scale AI training networks with MRC (Multipath Reliable Connection)

OpenAI introduces MRC (Multipath Reliable Connection), a new supercomputer networking protocol released via OCP to improve resilience and performance in large-scale AI training clusters.

25
OpenAI news 8d ago

GPT-5.5 Instant: smarter, clearer, and more personalized

GPT-5.5 Instant updates ChatGPT’s default model with smarter, more accurate answers, reduced hallucinations, and improved personalization controls.

37
OpenAI news 8d ago

GPT-5.5 Instant System Card

May 5, 2026. Safety publication. Introduction: GPT‑5.5 Instant is our latest Instant model, and is explained in our blog. The comprehensive safety mitigation approach for this model is similar to previous…

10
Vercel — AI dev-tools 8d ago

How KIKO Milano scales for Black Friday

KIKO Milano on Vercel: eliminated 3 weeks of Black Friday infrastructure prep; 75% decrease in app build times; went from minimal releases to deploying multiple times per day. KIKO Milano’s ecommerce team used to treat peak traffic as an operations project. Weeks before Black…

37
Simon Willison community 8d ago

datasette-llm 0.1a7

Release: datasette-llm 0.1a7 Mechanism for configuring default options for specific models. Part of Datasette's evolving support mechanism for plugins that use LLMs. It's now possible to configure a model with default options, e.g. to say all enrichment operations should use a…

30
Simon Willison community 8d ago

llm-echo 0.5a0

Release: llm-echo 0.5a0 New -o thinking 1 option to help test against LLM 0.32a0 and higher. This plugin provides a fake model called "echo" for LLM which doesn't run an LLM at all - it's useful for writing automated tests. You can now do this: uvx --with llm==0.32a1 --with…

17
Simon Willison community 8d ago

Granite 4.1 3B SVG Pelican Gallery

IBM released their Granite 4.1 family of LLMs a few days ago. They're Apache 2.0 licensed and come in 3B, 8B and 30B sizes. Granite 4.1 LLMs: How They’re Built by Granite team member Yousaf Shah describes the training process in detail. Unsloth…

11
Simon Willison community 9d ago

April 2026 newsletter

I just sent out the April edition of my sponsors-only monthly newsletter. If you are a sponsor (or if you start a sponsorship now) you can access it here. In this month's newsletter: Opus 4.7 and GPT-5.5, both with price increases; Claude Mythos and LLM security research…

14
Simon Willison community 9d ago

TRE Python binding — ReDoS robustness demo

Research: TRE Python binding — ReDoS robustness demo If it's good enough for antirez to add to Redis I figured Ville Laurikari's TRE regular expression engine was worth exploring in a little more detail. I had Claude Code build an experimental Python binding (it used ctypes)…

12
vLLM releases dev-tools 9d ago

v0.20.1

vLLM v0.20.1 This is a patch release on top of v0.20.0 primarily focused on DeepSeek V4 stabilization and performance improvements, along with several important bug fixes. DeepSeek V4 Base model support (#41006). Multi-stream pre-attention GEMM (#41061), configurable…

37
Smol AI News news-outlet 9d ago

not much happened today

**OpenAI** rolled out **GPT-5.5 Instant** as the new default for ChatGPT and API, enhancing **factuality, intelligence, image understanding, and tone** with stronger personalization features like saved memories and Gmail integration. OpenAI also shared infrastructure updates on…

28
Smol AI News news-outlet 9d ago

not much happened today

**AI Twitter Recap** highlights the shift from model-centric AI to **context pipelines** and **agent orchestration** as key performance drivers. Notably, **gpt-5.2-codex** and **gpt-5.3-codex** showed significant benchmark improvements through prompt and middleware tuning. The…

16
Vercel — AI dev-tools 9d ago

Introducing deepsec: The security harness for finding vulnerabilities in your codebase

Today we’re open sourcing deepsec: a security harness powered by coding agents. It runs on your own infrastructure and surfaces hard-to-find issues in large codebases. You can run deepsec on your laptop without setting up a cloud service for privileged source code access. For…

38
Marcus on AI community 11d ago

Richard Dawkins and The Claude Delusion

The great skeptic gets taken in

17
The Algorithmic Bridge news-outlet 12d ago

Weekly Top Picks #120

Q1 earnings / Trump wants to nationalize AI / China protects its workers / ARC-AGI-3 defeats GPT-5.5 and Opus-4.7 / The "permanent underclass" / Dawkins x Claudia

25
MIT Technology Review — AI news-outlet 12d ago

A new US phone network for Christians aims to block porn and gender-related content

A new US-wide cell phone network marketed to Christians is set to launch next week. It blocks porn, which experts in network security say marks the first time a US cell plan has used network-level blocking for such content that can’t be turned off even by adult account owners.…

14
Smol AI News news-outlet 12d ago

not much happened today

**xAI released Grok 4.3**, improving cost/performance with a **53 Intelligence Index score**, 4 points higher than Grok 4.20, and significant gains on **GDPval-AA** and **τ²-Bench Telecom**. However, accuracy tradeoffs raised reliability concerns. Community opinions are mixed,…

32
Latent.Space news-outlet 12d ago

[AINews] Agents for Everything Else: Codex for Knowledge Work, Claude for Creative Work

a quiet day lets us reflect on coding agents "breaking containment"

13
ThursdAI news-outlet 12d ago

📅 ThursdAI - Apr 30 - DeepSeek V4 (1.6T MoE), Cursor SDK Wins WolfBench, Mayo's REDMOD Saves Lives, Stripe Gives Agents a Wallet & more

From Weights & Biases - one last one for April, with incredible AI news, a monthly recap and Max from Pangram as a guest + I gave OpenClaw a credit card!

21
Don't Worry About the Vase community 13d ago

AI #166: Google Sells Out

This was the week of GPT-5.5.

11
Vercel — AI dev-tools 13d ago

Grok 4.3 on AI Gateway

Grok 4.3 is now available on Vercel AI Gateway. The model has a 1M token context window and improvements in accuracy, tool calling, and instruction following. To use Grok 4.3, set model to xai/grok-4.3 in the AI SDK. AI Gateway provides a unified API for calling models,…

7
Smol AI News news-outlet 13d ago

not much happened today

**OpenAI's GPT-5.5** achieves top-tier performance in long-horizon cyber tasks, matching or surpassing **Claude Mythos Preview** with a **71.4%** pass rate and showing ongoing improvement beyond **100M tokens** inference. OpenAI also released an **Advanced Account Security**…

32
OpenAI news 13d ago

Introducing Advanced Account Security

Introducing Advanced Account Security: phishing-resistant login, stronger recovery, and enhanced protections to safeguard sensitive data and prevent account takeover.

36
OpenAI news 14d ago

Where the goblins came from

How goblin outputs spread in AI models: timeline, root cause, and fixes behind personality-driven quirks in GPT-5 behavior.

4
MIT News — AI research 14d ago

The MIT-IBM Computing Research Lab launches to shape the future of AI and quantum computing

Building on a long-standing MIT–IBM collaboration, the new lab will chart the convergence of AI, algorithms, and quantum computing.

8
Vercel — AI dev-tools 15d ago

Native Deployment Checks are now available

You can now run lint and typecheck on every Vercel deployment, in parallel with the build. Native Deployment Checks are available to every team and join your existing Deployment Checks alongside GitHub and Marketplace integrations. Once added from your project's Build and…

21
Hugging Face official-blog 15d ago

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

13
Don't Worry About the Vase community 15d ago

GPT-5.5: Capabilities and Reactions

The system card for GPT-5.5 mostly told us what we expected.

27
OpenAI Python SDK releases dev-tools 15d ago

v2.33.0

2.33.0 (2026-04-28). Full Changelog: v2.32.0...v2.33.0. Features: api: api update (18f834a). Bug Fixes: api: correct prompt_cache_retention enum value from in-memory to in_memory (#1822) (f9d2d13). Chores: ci: remove release-doctor workflow (00b2091).

37
Stack Overflow Blog news 15d ago

Turning scattered knowledge into trusted intelligence: Stack Internal 2026.3

Now generally available in the 2026.3 release, Ingestion transforms siloed content into structured, verified knowledge—optimized for both your teams and AI.

14
Smol AI News news-outlet 15d ago

not much happened today

**vLLM v0.20.0** introduces significant improvements in memory and MoE serving efficiency, including **TurboQuant 2-bit KV cache** for **4× KV capacity** and a **2.1% latency improvement**. The update supports multiple hardware platforms like **DeepSeek V4 MegaMoE on…

9
Latent.Space news-outlet 15d ago

[AINews] ImageGen is on the Path to AGI

reflecting on the continued GPT-Image-2 explosion

34
OpenAI news 15d ago

OpenAI models, Codex, and Managed Agents come to AWS

OpenAI GPT models, Codex, and Managed Agents are now available on AWS, enabling enterprises to build secure AI in their AWS environments.

18
vLLM releases dev-tools 16d ago

v0.20.0

vLLM v0.20.0 Highlights: This release features 752 commits from 320 contributors (123 new)! DeepSeek V4: Initial DeepSeek V4 support landed (#40860), with DSML token-leakage fix in DSV4/3.2 (#40806), DSA + MTP IMA fix (#40772), and a silu clamp limit on the shared expert (…

33
Don't Worry About the Vase community 16d ago

GPT 5.5: The System Card

Last week, OpenAI announced GPT-5.5, including GPT-5.5-Pro.

20
vLLM releases dev-tools 16d ago

v0.20.1rc0: Add system_fingerprint field to OpenAI-compatible API responses (#40537)

Co-authored-by: Claude [email protected]
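
For readers consuming an OpenAI-compatible endpoint, here is a minimal sketch of reading the `system_fingerprint` field named in the release title. The sample payload, the fingerprint value, and the helper name `get_system_fingerprint` are illustrative assumptions, not actual vLLM output. The field is treated as optional, since a server that predates this change may omit it.

```python
import json
from typing import Optional


def get_system_fingerprint(response_body: str) -> Optional[str]:
    """Return the top-level system_fingerprint from a JSON response, or None."""
    payload = json.loads(response_body)
    value = payload.get("system_fingerprint")
    # Treat a missing, null, or non-string value as "not provided".
    return value if isinstance(value, str) else None


# Hypothetical response body; every value here is made up for illustration.
sample = json.dumps({
    "id": "chatcmpl-example",
    "object": "chat.completion",
    "system_fingerprint": "fp_example",
    "choices": [],
})

print(get_system_fingerprint(sample))
print(get_system_fingerprint("{}"))
```

Logging the fingerprint next to each response is one way to notice when the serving configuration behind an endpoint has changed between calls.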

6
Google DeepMind official-blog 16d ago

Announcing our partnership with the Republic of Korea

Google DeepMind and Korea partner to accelerate scientific breakthroughs using frontier AI models

6
Smol AI News news-outlet 16d ago

not much happened today

**OpenAI** loosens its **Azure exclusivity**, allowing distribution across **Google TPU**, **AWS Trainium**, and **Bedrock** with commitments through **2032** and revenue share through **2030**. **GPT-5.5** shows improved benchmarks but is not uniformly dominant, ranking…

11
Latent.Space news-outlet 18d ago

[AINews] DeepSeek V4 Pro (1.6T-A49B) and Flash (284B-A13B), Base and Instruct — runnable on Huawei Ascend chips

The prodigal Tiger returns... but is no longer the benchmarks leader.

21
NVIDIA Developer Blog official-blog 18d ago

Build with DeepSeek V4 Using NVIDIA Blackwell and GPU-Accelerated Endpoints

DeepSeek just launched its fourth generation of flagship models with DeepSeek-V4-Pro and DeepSeek-V4-Flash, both targeted at enabling highly efficient...

5
The Algorithmic Bridge news-outlet 19d ago

Weekly Top Picks #119

SpaceX + Cursor + Mistral / Jensen v Jensen / The job AI can't take / GPT-5.5 and ChatGPT Images 2.0 / An anti-grammar app / Terence Tao on the future

20
Vercel — AI dev-tools 19d ago

GPT 5.5 on AI Gateway

GPT-5.5 is now available on Vercel AI Gateway. There are 2 variants: GPT-5.5 and GPT-5.5 Pro. Both models are tuned for long-running agentic work across coding, computer use, knowledge work, and scientific research, and are more token-efficient than the previous generation.…

37
Smol AI News news-outlet 19d ago

DeepSeek v4

**DeepSeek-V4** technical release features a **1.6T-parameter MoE with 49B active parameters** and **1M-token context**, showcasing hybrid attention and compressed KV schemes for major memory reductions. It ranks as the **#2 open-weights reasoning model** behind **Kimi K2.6**…
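
As a quick worked example of what "1.6T-parameter MoE with 49B active parameters" implies, the sketch below computes the share of weights used per token. The function name `active_fraction` is a hypothetical helper, and the figures are taken only from the summary above, not from any official specification.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Fraction of a MoE model's parameters that are active per token.

    Both arguments are parameter counts in billions.
    """
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


# 1.6T total = 1600B; 49B active, as quoted in the summary above.
share = active_fraction(1600.0, 49.0)
print(f"{share * 100:.2f}% of parameters active per token")
```

Under these figures roughly 3% of the weights take part in each forward pass, which is why per-token compute tracks the active count rather than the total.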

13
ThursdAI news-outlet 19d ago

📅 Apr 23: OpenAI's Week: GPT-5.5, GPT-Image-2, Codex CUA + Chronicle, + Claude Design, Kimi K2.6, Qwen 3.6-27B

From Weights & Biases, what an intense week, that's fully dominated by OpenAI, a new top LLM (5.5), a new top Image Gen (imagev2) and tons of codex releases + Claude Design and a bunch of open source

20
One Useful Thing (Ethan Mollick) community 20d ago

Sign of the future: GPT-5.5

One impressive step on the curve

26
Don't Worry About the Vase community 20d ago

AI #165: In Our Image

This was the week of Claude Opus 4.7.

15
OpenAI news 20d ago

Introducing GPT-5.5

Introducing GPT-5.5, our smartest model yet—faster, more capable, and built for complex tasks like coding, research, and data analysis across tools.

26
OpenAI news 20d ago

GPT-5.5 System Card

April 23, 2026. Safety publication. GPT‑5.5 is a new model designed for complex, real-world work, including writing code, researching online, analyzing information, creating documents and…

4
Vercel — AI dev-tools 20d ago

Deepseek V4 on AI Gateway

DeepSeek V4 is now available on Vercel AI Gateway. There are 2 model variants: DeepSeek V4 Pro and DeepSeek V4 Flash. A 1M token context window is the default across both models. DeepSeek V4 Pro focuses on agentic coding, formal mathematical reasoning, and long-horizon…

27
Smol AI News news-outlet 20d ago

GPT 5.5

**OpenAI launched GPT-5.5** as its new flagship model for "real work and powering agents," immediately available in ChatGPT and Codex but with delayed API access due to enhanced safety requirements. The model features improved token efficiency and supports longer multi-step…

14
OpenAI news 20d ago

GPT-5.5 Bio Bug Bounty

Explore the GPT-5.5 Bio Bug Bounty: a red-teaming challenge to find universal jailbreaks for bio safety risks, with rewards up to $25,000.

35
Don't Worry About the Vase community 21d ago

Opus 4.7 Part 3: Model Welfare

It is thanks to Anthropic that we get to have this discussion in the first place.

35
Latent.Space news-outlet 21d ago

Shopify’s AI Phase Transition: 2026 Usage Explosion, Unlimited Opus-4.6 Token Budget, Tangle, Tangent, SimGym — with Mikhail Parakhin, Shopify CTO

A rare interview with Shopify's CTO on -everything- that Shopify is doing to maximize AI for their customers, with exclusive data on their own AI adoption.

26
OpenAI news 21d ago

Introducing workspace agents in ChatGPT

Workspace agents in ChatGPT are Codex-powered agents that automate complex workflows, run in the cloud, and help teams scale work across tools securely.

29
Smol AI News news-outlet 21d ago

not much happened today

**Alibaba** released **Qwen3.6-27B**, a dense, Apache 2.0 open coding model with thinking and non-thinking modes, outperforming the larger Qwen3.5-397B-A17B on multiple coding benchmarks including SWE-bench and Terminal-Bench. It supports native vision-language reasoning over…

15
Latent.Space news-outlet 21d ago

[AINews] OpenAI launches GPT-Image-2

with Cursor getting a $10B contract with xAI and a right to acquire for $60B.

33
OpenAI news 21d ago

Introducing OpenAI Privacy Filter

OpenAI Privacy Filter is an open-weight model for detecting and redacting personally identifiable information (PII) in text with state-of-the-art accuracy

37
Zed Editor dev-tools 21d ago

Introducing Parallel Agents in Zed

Run multiple agents at once, in the same window.

8
Don't Worry About the Vase community 22d ago

Opus 4.7 Part 2: Capabilities and Reactions

Claude Opus 4.7 raises a lot of key model welfare related concerns.

12
OpenAI news 22d ago

Introducing ChatGPT Images 2.0

ChatGPT Images 2.0 introduces a state-of-the-art image generation model with improved text rendering, multilingual support, and advanced visual reasoning.

9
Vercel — AI dev-tools 22d ago

GPT Image 2 on AI Gateway

GPT Image 2 is now available on Vercel AI Gateway. OpenAI's newest image model supports detailed instruction following, accurate placement and relationships between objects, and rendering of dense text across multiple aspect ratios. The model can render fine-grained elements…

30
Smol AI News news-outlet 22d ago

GPT-Image-2

**OpenAI** launched **GPT-Image-2**, enhancing image generation with improved text rendering, layout fidelity, editing, multilingual support, and "thinking" capabilities. It supports generating slides, infographics, diagrams, UI mockups, and QR codes, and integrates with tools…

36
vLLM releases dev-tools 22d ago

v0.19.1

This is a patch release on top of v0.19.0 with Transformers v5.5.3 upgrade and bug fixes for Gemma4: Update to transformers v5 (#30566); [Bugfix] Fix invalid JSON in Gemma 4 streaming tool calls by stripping partial delimiters (#38992); [Bugfix][Frontend] Fix Gemma4 streaming…

17
OpenAI news 22d ago

Scaling Codex to enterprises worldwide

OpenAI launches Codex Labs, partners with Accenture, PwC, Infosys, and others to help enterprises deploy and scale Codex across the software development lifecycle, and hits 4M Codex WAU.

13
Don't Worry About the Vase community 23d ago

Opus 4.7 Part 1: The Model Card

Less than a week after completing coverage of Claude Mythos, here we are again as Anthropic gives us Claude Opus 4.7.

28
Vercel — AI dev-tools 23d ago

Kimi K2.6 on AI Gateway

Kimi K2.6 from Moonshot AI is now available on Vercel AI Gateway. The model focuses on long-horizon coding tasks, with generalization across languages such as Rust, Go, and Python and across front-end, devops, and performance optimization work. K2.6 can turn simple prompts into…

11
OpenAI news 23d ago

OpenAI helps Hyatt advance AI among colleagues

Hyatt deploys ChatGPT Enterprise across its global workforce, using GPT-5.4 and Codex to improve productivity, operations, and guest experiences.

11
Ahead of AI (Sebastian Raschka) research 25d ago

My Workflow for Understanding LLM Architectures

A learning-oriented workflow for understanding new open-weight model releases

14
Don't Worry About the Vase community 26d ago

AI #164: Pre Opus

This is a day late because, given the discourse around Dwarkesh Patel’s interview with Jensen Huang, I pushed the weekly to Friday.

34
Smol AI News news-outlet 26d ago

not much happened today

**Anthropic** launched **Claude Design**, a prototyping tool powered by **Claude Opus 4.7**, targeting design workflows and competing with **Figma** and others. Benchmarks show **Opus 4.7** leading in coding and text tasks, with improved efficiency and adaptive reasoning, though…

7
ThursdAI news-outlet 26d ago

April 16 - Codex uses your mac in the background, Opus 4.7 release not quite Mythos + 3 interviews

From Weights & Biases - 3 interviews, 3 Breaking news (Opus 4.7 and Codex Computer use) + a discussion about ZL Continuums among the cohosts. Let us catch you up!

37
Anthropic SDK (Python) releases dev-tools 27d ago

v0.96.0

0.96.0 (2026-04-16). Full Changelog: v0.95.0...v0.96.0. Features: api: add claude-opus-4-7, token budgets and user_profiles (0aa2a0d). Chores: ci: remove release-doctor workflow (1d9add3).

31
Vercel — AI dev-tools 27d ago

Claude Opus 4.7 on AI Gateway

Claude Opus 4.7 from Anthropic is now available on Vercel AI Gateway. Opus 4.7 is optimized for long-running, asynchronous agents and handles complex, multi-step tasks with reliable agentic execution. The model shows gains on knowledge-worker tasks, particularly where it needs…

6
Smol AI News news-outlet 27d ago

Anthropic's Claude Opus 4.7

**Anthropic** launched **Claude Opus 4.7**, its most capable Opus model yet, featuring stronger coding and agentic performance, a new tokenizer, and improved long-context handling with a new **xhigh** reasoning tier. Benchmarks show substantial gains, including **SWE-bench Pro…

37
OpenAI news 27d ago

Introducing GPT-Rosalind for life sciences research

OpenAI introduces GPT-Rosalind, a frontier reasoning model built to accelerate drug discovery, genomics analysis, protein reasoning, and scientific research workflows.

32
OpenAI news 27d ago

Accelerating the cyber defense ecosystem that protects us all

Leading security firms and enterprises join OpenAI’s Trusted Access for Cyber, using GPT-5.4-Cyber and $10M in API grants to strengthen global cyber defense.

19
Google DeepMind official-blog 28d ago

Gemini 3.1 Flash TTS: the next generation of expressive AI speech

Our newest audio model introduces granular audio tags that give you precise control to direct AI speech for expressive audio generation.

13
Don't Worry About the Vase community 28d ago

Claude Code, Codex and Agentic Coding #7: Auto Mode

As we all try to figure out what Mythos means for us down the line, the world of practical agentic coding continues, with the latest array of upgrades.

7
Anthropic SDK (Python) releases dev-tools 29d ago

v0.95.0

0.95.0 (2026-04-14). Full Changelog: v0.94.1...v0.95.0. Features: api: mark Sonnet and Opus 4 as deprecated (0c1e773); bedrock: use auth header for mantle client (#1644) (3b93090).

32
Vercel — AI dev-tools 29d ago

Logs filtering for Vercel Workflows now available

Workflow run log filtering is now supported on Vercel, making it easy to view all logs associated with a workflow run in one place instead of piecing them together across individual requests. You can use the “View Logs” button from the workflow run details page to jump directly…

4
NVIDIA Developer Blog official-blog 29d ago

NVIDIA Ising Introduces AI-Powered Workflows to Build Fault-Tolerant Quantum Systems

NVIDIA Ising is the world's first family of open AI models for building quantum processors, launching with two model domains: Ising Calibration and Ising...

22
Vercel — AI dev-tools 29d ago

Elastic Build Machines is now GA

Elastic build machines, released in beta on March 24, are now generally available for all Pro and Enterprise customers, and are now the default for all new Pro teams. Rather than a one-size-fits-all approach, Vercel evaluates each project individually and assigns the right…

14
OpenAI news 29d ago

Trusted access for the next era of cyber defense

OpenAI expands its Trusted Access for Cyber program, introducing GPT-5.4-Cyber to vetted defenders and strengthening safeguards as AI cybersecurity capabilities advance.

29
Marcus on AI community 1mo ago

Claude Mythos, evaluated

How afraid should we be?

23
Vercel — AI dev-tools 1mo ago

Copy-to-Prompt instructions now available for Flags

The feature flags details page now includes copy-to-prompt instructions in the instructions pane. You or your agent can install the Flags SDK, link the project using the Vercel CLI, and add the required flag definitions to the code base. Teams that prefer manual configuration…

9
Google DeepMind official-blog 1mo ago

Gemini Robotics-ER 1.6: Powering real-world robotics tasks through enhanced embodied reasoning

Gemini Robotics ER 1.6: Enhancing spatial reasoning and multi-view understanding for autonomous robotics.

32
OpenAI news 1mo ago

Enterprises power agentic workflows in Cloudflare Agent Cloud with OpenAI

Cloudflare brings OpenAI’s GPT-5.4 and Codex to Agent Cloud, enabling enterprises to build, deploy, and scale AI agents for real-world tasks with speed and security.

20
NVIDIA Developer Blog official-blog 1mo ago

MiniMax M2.7 Advances Scalable Agentic Workflows on NVIDIA Platforms for Complex AI Applications

The release of MiniMax M2.7 adds enhancements to the popular MiniMax M2.5 model, built for agentic harnesses,...

33
Marcus on AI community 1mo ago

The biggest advance in AI since the LLM

Why Claude Code changes everything

17
Vercel — AI dev-tools 1mo ago

Anomaly alert configuration now available

You can now granularly configure anomaly alerts to define exactly which unexpected spikes and errors matter to your application. Alert rules give you detection-level control, allowing you to customize which projects, alert types, metrics, HTTP status codes, and specific routes…

33
Smol AI News news-outlet 1mo ago

not much happened today

**GLM-5.1** has reached **#3 on Code Arena**, surpassing **Gemini 3.1** and **GPT-5.4**, and matching **Claude Sonnet 4.6** in coding performance. **Z.ai** now holds the **#1 open model rank** close to the top overall. The advisor pattern, combining a cheap executor with an…

12
Interconnects research 1mo ago

Claude Mythos and misguided open-weight fearmongering

Another dance around fears of open-source.

18
Marcus on AI community 1mo ago

Three reasons to think that the Claude Mythos announcement from Anthropic was overblown

No need to panic just yet

17
ThursdAI news-outlet 1mo ago

📅 ThursdAI LIVE from London - Claude Mythos, Codex Resets, Muse Spark & More | w/ Swyx and friends from OpenAI, Deepmind, LMArena and OpenClaw

From Weights & Biases: Wildest and our most European ThursdAI ever, live from London AI Engineer conference, with many friends like @swyx, Omar, VB and new friends

7
The Algorithmic Bridge news-outlet 1mo ago

What Happens When AI Gets Too Good at One Thing

Thoughts on Claude Mythos

31
Zed Editor dev-tools 1mo ago

Introducing Zed's Agent Metrics

A public, weekly view of AI agent adoption and turn times inside Zed, plus a few patterns worth watching.

19
Smol AI News news-outlet 1mo ago

not much happened today

**Meta Superintelligence Labs** launched **Muse Spark**, a natively multimodal reasoning model featuring tool use, visual chain of thought, and multi-agent orchestration. It is live on **meta.ai** and the Meta AI app with a private API preview and plans for open-sourcing future…

29
OpenAI news 1mo ago

Introducing the Child Safety Blueprint

Discover OpenAI’s Child Safety Blueprint—a roadmap for building AI responsibly with safeguards, age-appropriate design, and collaboration to protect and empower young people online.

6
Vercel — AI dev-tools 1mo ago

Opus 4.6 Fast Mode available on AI Gateway

Fast mode support for Claude Opus 4.6 is now available on AI Gateway. Fast mode is a premium high-speed option that delivers 2.5x faster output token speeds with the same model intelligence. This is an early, experimental feature. Fast mode's increased output token speeds enable…

11
Vercel — AI dev-tools 1mo ago

GLM 5.1 on AI Gateway

GLM 5.1 from Z.ai is now available on Vercel AI Gateway. Designed for long-horizon autonomous tasks, GLM-5.1 can work continuously on a single task for extended periods, handling planning, execution, testing, and iterative refinement in a closed loop. Rather than one-shot code…

38
Smol AI News news-outlet 1mo ago

Anthropic @ $30B ARR, Project GlassWing and Claude Mythos Preview — first model too dangerous to release since GPT-2

**Anthropic** strategically challenges **OpenAI** amid its upcoming IPO concerns by announcing a jump from **$19B ARR in March** to **$30B ARR in April**, highlighting a differential growth rate and higher cost efficiency. The company also revealed **Claude Mythos**, rumored as…

30
OpenAI news 1mo ago

Announcing the OpenAI Safety Fellowship

A pilot program to support independent safety and alignment research and develop the next generation of talent

14
Smol AI News news-outlet 1mo ago

not much happened today

**Google** introduced **Skills in Chrome**, enabling reusable browser workflows with Gemini prompts and a library of ready-made Skills, enhancing end-user agentization. **Tencent** teased **HYWorld 2.0**, an open-source 3D world model generating editable scenes from a single…

8
Smol AI News news-outlet 1mo ago

not much happened today

**Gemma 4** was launched by **Google** under an **Apache 2.0 license**, marking a significant open-model release focused on **reasoning, agentic workflows, multimodality, and on-device use**. It outperforms models 10x larger and has immediate ecosystem support including…

35
ThursdAI news-outlet 1mo ago

📅 ThursdAI - Apr 2 - Gemma 4 is the new LLama, Claude Code Leak, OpenAI raises $122B & more AI news

From Weights & Biases: Gemma 4 w/ Omar from Deepmind, OpenAI raises $122B largest funding round, we cover the Claude Code leak with the guy who put it on Github and got >100K stars in 24 hours & more

23
NVIDIA Developer Blog official-blog 1mo ago

Bringing AI Closer to the Edge and On-Device with Gemma 4

The Gemmaverse expands with the launch of the latest Gemma 4 multimodal and multilingual models, designed to scale across the full spectrum of deployments, from...

27
Vercel — AI dev-tools 1mo ago

Qwen 3.6 Plus on AI Gateway

Qwen 3.6 Plus from Alibaba is now available on Vercel AI Gateway. Compared to Qwen 3.5 Plus, this model adds stronger agentic coding capabilities, from frontend development to repository-level problem solving, along with improved multimodal perception and reasoning. It features…

19
Vercel — AI dev-tools 1mo ago

Gemma 4 on AI Gateway

Gemma 4 26B (MoE) and 31B (Dense) from Google are now available on Vercel AI Gateway. Built on the same architecture as Gemini 3, both open models support function-calling, agentic workflows, structured JSON output, and system instructions. Both support up to 256K context, 140+…

25
Smol AI News news-outlet 1mo ago

Gemma 4

**Google DeepMind** released **Gemma 4**, a family of open-weight, multimodal models with long-context support up to **256K tokens** under an **Apache 2.0 license**, marking a major capability and licensing shift. The lineup includes **31B dense**, **26B MoE (A4B)**, and two…

14
NVIDIA Developer Blog official-blog 1mo ago

CUDA Tile Programming Now Available for BASIC!

Note: CUDA Tile Programming in BASIC is an April Fools’ joke, but it's also real and actually works, demonstrating the flexibility of CUDA. CUDA 13.1...

5
Vercel — AI dev-tools 1mo ago

GLM 5V Turbo on AI Gateway

GLM 5V Turbo from Z.ai is now available on Vercel AI Gateway. GLM 5V Turbo is a multimodal coding model that turns screenshots and designs into code, debugs visually, and operates GUIs autonomously. It's strong at design-to-code generation, visual code generation, and…

26
Smol AI News news-outlet 1mo ago

not much happened today

**Arcee’s Trinity-Large-Thinking** was released with **open weights under Apache 2.0**, featuring a **400B total / 13B active** model size and strong agentic performance, ranking **#2 on PinchBench**. **Z.ai’s GLM-5V-Turbo** is a **vision coding model** with **native multimodal…

13
OpenAI news 1mo ago

Gradient Labs gives every bank customer an AI account manager

Gradient Labs uses GPT-4.1 and GPT-5.4 mini and nano to power AI agents that automate banking support workflows with low latency and high reliability.

18
One Useful Thing (Ethan Mollick) community 1mo ago

Claude Dispatch and the Power of Interfaces

We often lack the tools for the job, even if the AI is capable enough

16
The Algorithmic Bridge news-outlet 1mo ago

Anthropic Accidentally Leaked the Secret Roadmap of Claude Code

The source code of Claude Code reveals unreleased features, internal codenames, and the future of your new favorite AI product. Here's what it all means.

28
Vercel — AI dev-tools 1mo ago

How FLORA shipped a creative agent on Vercel's AI stack

FLORA on Vercel: 2x faster to production with their generation system; zero infrastructure debates after migration; 50+ image models orchestrated. A seasonal fashion launch is a story, not a single frame. Crafting that story is a process of exploration: It’s the same piece, worn by…

12
Smol AI News news-outlet 1mo ago

not much happened today

**Anthropic** introduced **computer use inside Claude Code** for closed-loop verification in a research preview for Pro/Max users, enhancing reliable app iteration. **OpenAI** released a **Codex plugin for Claude Code**, enabling cross-agent composition and signaling a shift…

16
Smol AI News news-outlet 1mo ago

not much happened today

**Anthropic** is reportedly introducing a new AI model tier called **Capybara**, which is larger and more intelligent than **Claude Opus 4.6**, showing improved performance in coding, academic reasoning, and cybersecurity. The model is speculated to be around **10 trillion…

38
Google DeepMind official-blog 1mo ago

Gemini 3.1 Flash Live: Making audio AI more natural and reliable

Our latest voice model has improved precision and lower latency to make voice interactions more fluid, natural and precise.

13
Google DeepMind official-blog 1mo ago

Lyria 3 Pro: Create longer tracks in more

Introducing Lyria 3 Pro, which unlocks longer tracks with structural awareness. We’re also bringing Lyria to more Google products and surfaces.

5
Vercel — AI dev-tools 1mo ago

Elastic build machines now available in beta

Elastic build machines are now available in beta for all paid plans, giving teams control over build performance without project-level micromanagement. You can configure elastic builds at the team or project level. Rather than a one-size-fits-all approach, Vercel evaluates each…

5
Smol AI News news-outlet 1mo ago

not much happened today

**Anthropic** advances agent infrastructure with a multi-agent harness emphasizing orchestration and "computer use" for complex software environments. **Figma**, **GitHub**, and **Cursor** launch design canvases with direct AI editing, showcasing tool-calling becoming…

12
Smol AI News news-outlet 1mo ago

not much happened today

**Google** launched **Gemini 3.1 Flash Live**, a realtime voice and vision agent model with **2x longer conversation memory**, supporting **70 languages** and **128k context**. **Mistral AI** released **Voxtral TTS**, a low-latency, open-weight text-to-speech model supporting…

31
Smol AI News news-outlet 1mo ago

The Claude Code Source Leak

**Anthropic's** closed-source coding product **Claude Code** experienced a significant source leak exposing over **500k lines** of orchestration logic, including autonomous modes and memory systems, but not model weights. The leak led to rapid public reverse-engineering,…

14
Smol AI News news-outlet 1mo ago

not much happened today

**Anthropic** introduced **Claude Cowork** and **Claude Code** enabling desktop control of mouse, keyboard, and screen in a **macOS research preview**, expanding agent capabilities beyond APIs and browsers. The agent ecosystem is evolving towards long-running, parallel,…

29
Vercel — AI dev-tools 1mo ago

SERHANT.'s playbook for rapid AI iteration

Impact at a glance: Started with Next.js on Vercel, which made it easier to expand to a React Native iOS app without rebuilding their backend; engineers focus on AI design and iteration instead of platform plumbing; orchestrates OpenAI, Claude, and Gemini by task to optimize cost…

12
Vercel — AI dev-tools 1mo ago

Activity Log now available in Vercel CLI

The activity log provides a list of all activities on a team, along with the user who performed the event, the type of event, and time. We've now added the vercel activity command to the CLI so that you can query for activity events. You can filter events by type, date range,…

32
Smol AI News news-outlet 1mo ago

not much happened today

**Cursor's Composer 2**, built on **Kimi K2.5**, sparked discussion over model attribution and licensing, highlighting a shift toward post-trained derivatives of open-source models with domain-specific fine-tuning and reinforcement learning. **Claude Code** is expanding into…

36
ThursdAI news-outlet 1mo ago

ThursdAI - Opus 1M, Jensen declares OpenClaw as the new Linux, GPT 5.4 Mini & Nano, Minimax 2.7, Composer 2 & more AI news

From Weights & Biases, here's what happened in AI this week. Jensen goes ClawPilled with NemoClaw, new smaller GPT 5.4s, MiniMax autoresearches 3.7 and Composer 2 from Cursor beats Opus + more AI

15
Vercel — AI dev-tools 1mo ago

Vercel is now available in Stripe Projects

You can now sign up and deploy to Vercel through Stripe Projects. Available in developer preview, this CLI-based workflow lets teams and AI agents create infrastructure environments directly from the terminal. As a launch and design partner for Stripe Projects, Vercel enables a…

18
Smol AI News news-outlet 1mo ago

not much happened today

**Cursor** launched **Composer 2**, a frontier-class coding model with major cost reductions and strong benchmark scores like **61.3 on CursorBench** and **73.7 on SWE-bench Multilingual**. The model was improved via a **first continued pretraining run** feeding into…

36
Vercel — AI dev-tools 1mo ago

MiniMax M2.7 is live on AI Gateway

MiniMax M2.7 is now available on Vercel AI Gateway in two variants: standard and high-speed. M2.7 is a major step up from previous M2-series models in software engineering, agentic workflows, and professional office tasks. The model natively supports multi-agent collaboration,…

34
Smol AI News news-outlet 1mo ago

MiniMax 2.7: GLM-5 at 1/3 cost SOTA Open Model

**MiniMax M2.7** is the headline model release, described as a "self-evolving agent" with strong performance metrics including **56.22% on SWE-Pro**, **57.0% on Terminal Bench 2**, and parity with **Sonnet 4.6**. It features recursive self-improvement in skills, memory, and…

6
Google DeepMind official-blog 1mo ago

Measuring progress toward AGI: A cognitive framework

We’re introducing a framework to measure progress toward AGI, and launching a Kaggle hackathon to build the relevant evaluations.

30
Vercel — AI dev-tools 1mo ago

Vercel Open Source Program: Winter 2026 cohort

In April, we launched the Vercel Open Source Program to give maintainers the resources, credits, and support they need to ship faster and scale confidently. The first group joined through our spring 2025 cohort. Today we are welcoming the winter 2026 cohort. From AI-native apps…

25
Vercel — AI dev-tools 1mo ago

Introducing the Vercel plugin for coding agents

Claude Code and Cursor can now further understand Vercel projects using the new Vercel plugin and a full platform knowledge graph. The plugin observes real-time activity, including file edits and terminal commands, to dynamically inject Vercel knowledge into the agent's context.…

28
Vercel — AI dev-tools 1mo ago

Use GPT 5.4 Mini and Nano on AI Gateway

GPT-5.4 Mini and GPT-5.4 Nano from OpenAI are now available on Vercel AI Gateway. Both models deliver state-of-the-art performance for their size class in coding and computer use, and are built for sub-agent workflows where multiple smaller models coordinate on parts of a…

9
Smol AI News news-outlet 1mo ago

not much happened today

**OpenAI** released **GPT-5.4 mini** and **GPT-5.4 nano**, their most capable small models optimized for coding, multimodal understanding, and subagents, featuring a **400k context window** and over **2x speed** compared to GPT-5 mini. The mini model approaches larger GPT-5.4…

32
NVIDIA Developer Blog official-blog 1mo ago

Introducing NVIDIA BlueField-4-Powered CMX Context Memory Storage Platform for the Next Frontier of AI

AI‑native organizations increasingly face scaling challenges as agentic AI workflows drive context windows to millions of tokens and models scale toward...

27
NVIDIA Developer Blog official-blog 2mo ago

Introducing Nemotron 3 Super: An Open Hybrid Mamba-Transformer MoE for Agentic Reasoning

Agentic AI systems need models with the specialized depth to solve dense technical problems autonomously. They must excel at reasoning, coding, and long-context...

6
Smol AI News news-outlet 2mo ago

not much happened today

**NVIDIA’s Nemotron 3 Super** is a **120B parameter / ~12B active** open model featuring a **hybrid Mamba-Transformer / SSM Latent MoE** architecture and **1M context window**, delivering up to **2.2x faster inference than GPT-OSS-120B** in FP4 with strong throughput gains. It…

10
Smol AI News news-outlet 2mo ago

Yann LeCun’s AMI Labs launches with a $1.03B seed to build world models around JEPA

**Yann LeCun** launched **Advanced Machine Intelligence (AMI Labs)** with a record **$1.03B seed round** at a **$3.5B pre-money valuation**, aiming to build AI models that understand the **physical world** through **world models** rather than just language prediction. The…

29
Hugging Face official-blog 2mo ago

Introducing Storage Buckets on the Hugging Face Hub

Published March 10, 2026. By Lucain Pouget (Wauplin), Eliott Coyac (coyotte508), Adrien Carreira (XciD), Victor Mustar (victor), Julien Chaumond (julien-c), Quentin Lhoest (lhoestq), Pierric Cistac…

16
Zed Editor dev-tools 2mo ago

Introducing Zed for Education

Empowering the next generation of developers with Zed's Pro features, free for one year.

12
Smol AI News news-outlet 2mo ago

not much happened today

**OpenAI** rolled out **GPT-5.4**, which ties **Gemini 3.1 Pro Preview** for **#1** on the **Artificial Analysis Intelligence Index** with a score of **57** (up from 51 for GPT-5.2 xhigh). GPT-5.4 features a larger **~1.05M token** context window and higher per-token prices…

12
ThursdAI news-outlet 2mo ago

ThursdAI - Mar 5 - OpenAI's GPT-5.4 Solves a 20-Year Math Problem, Anthropic Gets Designated a Supply Chain Risk, Qwen Drama Unfolds

From Weights & Biases, we were hoping for a chill "Anthropic vs Gov" week, but then OpenAI dropped GPT 5.4 in the middle of our live stream + Qwen models and corpo drama, Stepfun 3.5 and wolfbench.ai

27
Smol AI News news-outlet 2mo ago

GPT 5.4: SOTA Knowledge Work -and- Coding -and- CUA Model, OpenAI is so very back

**OpenAI** launched **GPT-5.4** and **GPT-5.4 Pro** with unified mainline and Codex models, featuring **native computer use**, up to **~1M token context**, and efficiency improvements including a new **Codex `/fast` mode**. Benchmarks showed strong results like…

25
Hugging Face official-blog 2mo ago

Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines

Published March 5, 2026. By YiYi Xu (YiYiXu), Alvaro Somoza (OzzyGT), Dhruv Nair (dn6), Sayak Paul (sayakpaul). Modular Diffusers introduces a new way to build…

4
Smol AI News news-outlet 2mo ago

not much happened today

**Gemini 3.1 Flash-Lite** is highlighted by **Demis Hassabis** for its speed and cost-efficiency, focusing on latency and cost per capability rather than raw performance. **NotebookLM Studio** introduces a new feature for generating immersive cinematic video overviews. Rumors…

20
Google DeepMind official-blog 2mo ago

Gemini 3.1 Flash-Lite: Built for intelligence at scale

Gemini 3.1 Flash-Lite is our fastest and most cost-efficient Gemini 3 series model yet.

8
Interconnects research 2mo ago

Latest open artifacts (#19): Qwen 3.5, GLM 5, MiniMax 2.5 — Chinese labs' latest push of the frontier

Welcome to the year of the horse!

33
Smol AI News news-outlet 2mo ago

not much happened today

**Google DeepMind** launched **Gemini 3.1 Flash-Lite**, emphasizing *dynamic thinking levels* for adjustable compute, with notable metrics like **$0.25/M input**, **$1.50/M output**, **1432 Elo on LMArena**, and **2.5× faster time-to-first-token** than Gemini 2.5 Flash. It…

35
Smol AI News news-outlet 2mo ago

not much happened today

**Alibaba** released the **Qwen 3.5** series with models ranging from **0.8B to 9B** parameters, featuring **native multimodality**, **scaled reinforcement learning**, and targeting **edge and lightweight agent** deployments. The models support very long context windows up to…

18
NVIDIA Developer Blog official-blog 2mo ago

Develop Native Multimodal Agents with Qwen3.5 VLM Using NVIDIA GPU-Accelerated Endpoints

Alibaba has introduced the new open source Qwen3.5 series built for native multimodal agents. The first model in this series is a ~400B parameter native...

25
Smol AI News news-outlet 2mo ago

Nano Banana 2 aka Gemini 3.1 Flash Image Preview: the new SOTA Imagegen model

**Google and DeepMind** launched **Nano Banana 2** (aka **Gemini 3.1 Flash Image Preview**), a leading image generation and editing model integrated across multiple Google products with features like **4K upscaling**, **multi-subject consistency**, and **real-time…

29
Ahead of AI (Sebastian Raschka) research 2mo ago

A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026

A Round Up And Comparison of 10 Open-Weight LLM Releases in Spring 2026

30
Smol AI News news-outlet 2mo ago

Agentic Engineering: WTF Happened in December 2025?

**Perplexity** launched **Computer**, an orchestration-first agent platform featuring multi-model routing, usage-based pricing, and parallel asynchronous sub-agents for distributed workflows. **Andrej Karpathy** claims a "phase change" in coding agents since December,…

21
Smol AI News news-outlet 2mo ago

Anthropic accuses DeepSeek, Moonshot, and MiniMax of "industrial-scale distillation attacks".

**Anthropic** alleges *industrial-scale* distillation attacks on its **Claude** model by **DeepSeek**, **Moonshot AI**, and **MiniMax**, involving **~24,000 fraudulent accounts** and **>16M Claude exchanges** to extract capabilities, raising concerns about competitive risks and…

21
Smol AI News news-outlet 2mo ago

Claude Code Anniversary + Launches from: Qwen 3.5, Cursor Demos, Cognition Devin 2.2, Inception Mercury 2

**Alibaba** launched the **Qwen 3.5 Medium Model Series** featuring models like **Qwen3.5-Flash**, **Qwen3.5-35B-A3B (MoE)**, and **Qwen3.5-122B-A10B (MoE)** emphasizing efficiency over scale with innovations like **1M context** and INT4 quantization. **OpenAI** released…

14
Smol AI News news-outlet 2mo ago

not much happened today

**Gemini 3.1 Pro** demonstrates strong retrieval capabilities and cost efficiency compared to **GPT-5.2** and **Opus 4.6**, though users report tooling and UI issues. The **SWE-bench Verified** evaluation methodology is under scrutiny for consistency, with updates bringing…

27
ThursdAI news-outlet 2mo ago

📅 ThursdAI - Feb 19 - Gemini 3.1 Pro Drops LIVE, Sonnet 4.6 Closes Gap, OpenClaw Goes to OpenAI

Hey, it’s Alex, let me catch you up!

12
Google DeepMind official-blog 2mo ago

Gemini 3.1 Pro: A smarter model for your most complex tasks

3.1 Pro is designed for tasks where a simple answer isn’t enough.

31
Smol AI News news-outlet 2mo ago

Gemini 3.1 Pro: 2x 3.0 on ARC-AGI 2

**Google** released **Gemini 3.1 Pro**, a developer preview integrated across the **Gemini app**, **NotebookLM**, **Gemini API / AI Studio**, and **Vertex AI**, highlighting a significant reasoning improvement with **ARC-AGI-2 = 77.1%** and strong coding and agentic-tool…

10
Google DeepMind official-blog 2mo ago

A new way to express yourself: Gemini can now create music

The Gemini app now features our most advanced music generation model Lyria 3, empowering anyone to make 30-second tracks using text or images.

27
Smol AI News news-outlet 2mo ago

not much happened today

**Anthropic** released **Claude Opus/Sonnet 4.6**, showing a significant intelligence index jump but with increased token usage and cost. **Anthropic** also shared insights on AI agent autonomy, highlighting human-in-the-loop prevalence and software engineering tool calls.…

5
Smol AI News news-outlet 2mo ago

Claude Sonnet 4.6: clean upgrade of 4.5, mostly better with some caveats

**Anthropic** launched **Claude Sonnet 4.6**, an upgrade over Sonnet 4.5, featuring broad improvements in **coding, long-context reasoning, agent planning, knowledge work, and design**, plus a **1M-token context window (beta)**. Benchmarks show Sonnet 4.6 leading on **GDPval-AA…

4
Smol AI News news-outlet 2mo ago

Qwen3.5-397B-A17B: the smallest Open-Opus class, very efficient model

**Alibaba** released **Qwen3.5-397B-A17B**, an open-weight model featuring **native multimodality**, **spatial intelligence**, and a **hybrid linear attention + sparse MoE** architecture supporting **201 languages** and **long context windows** up to **256K tokens**. The model…

35
ThursdAI news-outlet 2mo ago

📆 Open source just pulled up to Opus 4.6 — at 1/20th the price

Plus: Gemini 3 Deep Think hits 84% on ARC-AGI, OpenAI's new 1000 t/s coding model, and the video model that shattered reality.

21
Hugging Face official-blog 2mo ago

Custom Kernels for All from Codex and Claude

Published February 13, 2026. By ben burtenshaw (burtenshaw), Sayak Paul (sayakpaul), Aritra Roy Gosthipaty (ariG23498), shaun smith (evalstate). tl;dr: We built an agent skill that teaches coding agents how…

18
Google DeepMind official-blog 3mo ago

Gemini 3 Deep Think: Advancing science, research and engineering

Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.

29
Smol AI News news-outlet 3mo ago

new Gemini 3 Deep Think, Anthropic $30B @ $380B, GPT-5.3-Codex Spark, MiniMax M2.5

**Google DeepMind** is rolling out the upgraded **Gemini 3 Deep Think V2** reasoning mode to **Google AI Ultra** subscribers and opening early access to the **Vertex AI / Gemini API** for select users. Key benchmark achievements include **ARC-AGI-2 at 84.6%**, **Humanity’s Last…

31
Zed Editor dev-tools 3mo ago

Introducing Theme Builder

Create beautiful Zed themes with an interactive live preview.

7
Smol AI News news-outlet 3mo ago

Z.ai GLM-5: New SOTA Open Weights LLM

**Zhipu AI** launched **GLM-5**, an **Opus-class** model scaling from **355B to 744B parameters** with **DeepSeek Sparse Attention** integration for cost-efficient long-context serving. GLM-5 achieves **SOTA on BrowseComp** and leads on **Vending Bench 2**, focusing on office…

18
Smol AI News news-outlet 3mo ago

Qwen-Image 2.0 and Seedance 2.0

**OpenAI** advances its Responses API for multi-hour agent workflows with features like **server-side compaction**, **hosted containers**, and **Skills API**, alongside upgrading **Deep Research** to **GPT-5.2** and adding connectors. Discussions around sandbox design highlight…

6
Google DeepMind official-blog 3mo ago

Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

Research papers point to the growing impact of Deep Think across fields

28
Interconnects research 3mo ago

Opus 4.6, Codex 5.3, and the post-benchmark era

On comparing models in 2026.

34
Smol AI News news-outlet 3mo ago

not much happened today

**OpenAI** launched **GPT-5.3-Codex** with a Super Bowl ad emphasizing "You can just build things" as a product strategy, focusing on builder tooling over chat interfaces. The model is rolling out across **Cursor, VS Code, and GitHub** with phased API access and is flagged as…

34
Hugging Face official-blog 3mo ago

Transformers.js v4: Now Available on NPM!

Published February 9, 2026. By Joshua (Xenova), Nico Martin (nico-martin). We're excited to announce that Transformers.js v4 is now available on NPM! After a year of development (we started in March 2025…

34
Smol AI News news-outlet 3mo ago

not much happened today

**AI News** for early February 2026 highlights a detailed comparison between **GPT-5.3-Codex** and **Claude Opus 4.6**, with users noting **Codex's** strength in detailed scoped tasks and **Opus's** ergonomic advantage for exploratory work. Benchmarks on Karpathy's **nanochat…

11
ThursdAI news-outlet 3mo ago

📆 ThursdAI - Feb 5 - Opus 4.6 was #1 for ONE HOUR before GPT 5.3 Codex, Voxtral transcription, Codex app, Qwen Coder Next & the Agentic Internet

From Weights & Biases - a hell of a week to be covering the AI news, with 2 big model drops live during the show, 1 interview with VB from OpenAI about Codex app and the new model, Voxtral and more AI

19
Hugging Face official-blog 3mo ago

Introducing SyGra Studio

Enterprise Article. Published February 5, 2026. By Surajit Dasgupta (surajit, ServiceNow-AI), Bidyapati Pradhan (bidyapati, ServiceNow-AI), Amit Kumar Saha (amitsaha, ServiceNow-AI), Vipul Mittal (vipulmitt, ServiceNow-AI), Sriram Puttagunta…

10
Smol AI News news-outlet 3mo ago

OpenAI and Anthropic go to war: Claude Opus 4.6 vs GPT 5.3 Codex

**OpenAI** launched **GPT-5.3-Codex**, emphasizing **token efficiency**, **inference speed**, and hardware/software co-design with **GB200-NVL72** and **NVIDIA** collaboration. The new **Frontier** agent platform supports business-context agents with execution environments and…

15
Smol AI News news-outlet 3mo ago

ElevenLabs $500m Series D at $11B, Cerebras $1B Series H at $23B, Vibe Coding -> Agentic Engineering

**Google's Gemini 3** is being integrated widely, including a new **Chrome side panel** and **Nano Banana** UX features, with rapid adoption and a **78% reduction** in unit serving costs. The **Gemini app** reached **750M+ MAU** in Q4 2025, nearing ChatGPT's user base.…

23
Hugging Face official-blog 3mo ago

The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+

Team Article. Published February 3, 2026. By Adina Yakefu (AdinaY, huggingface), Irene Solaiman (irenesolaiman, huggingface). This is the third and final blog in a three-part series on China's…

10
Smol AI News news-outlet 3mo ago

Context Graphs: Hype or actually Trillion-dollar opportunity?

**Zhipu AI** launched **GLM-OCR**, a lightweight **0.9B** multimodal OCR model excelling in complex document understanding with top benchmark scores and day-0 deployment support from **lmsys**, **vllm**, and **novita labs**. **Ollama** enabled local-first usage with easy offline…

28
Smol AI News news-outlet 3mo ago

OpenAI Codex App: death of the VSCode fork, multitasking worktrees, Skills Automations

**OpenAI** launched the **Codex app** on macOS as a dedicated agent-native command center for coding, featuring **multiple agents in parallel**, **built-in worktrees** for conflict isolation, **skills** for reusable bundles, and **scheduled automations**. The app emphasizes…

19
ThursdAI news-outlet 3mo ago

📆 ThursdAI - Jan 29 - Genie3 is here, Clawd rebrands, Kimi K2.5 surprises, Chrome goes agentic & more AI news

From Weights & Biases (live from SF) - Genie 3 is finally here and made us go "whoah", Clawdbot delivers despite rebrand, Kimi K2.5 king OSS, Chrome crushes Atlas & Grok Imagine #1

26
Smol AI News news-outlet 3mo ago

xAI Grok Imagine API - the #1 Video Model, Best Pricing and Latency - and merging with SpaceX

**Google DeepMind** launched **Project Genie (Genie 3 + Nano Banana Pro + Gemini)**, a prototype for creating interactive, real-time generated worlds from text or image prompts, currently available to **Google AI Ultra subscribers in the U.S. (18+)** with noted limitations like…

29
Hugging Face official-blog 3mo ago

Introducing Daggr: Chain apps programmatically, inspect visually

Published January 29, 2026. By merve (merve), yuvraj sharma (ysharma), Abubakar Abid (abidlabs), hysts (hysts), Pedro Cuenca (pcuenq). TL;DR: Daggr is a new, open-source Python library for…

13
Smol AI News news-outlet 3mo ago

not much happened today

**AI News for 1/27/2026-1/28/2026** highlights a quiet day with deep dives into frontier model "personality split" where **GPT-5.2** excels at *exploration* and **Claude Opus 4.5** at *exploitation*, suggesting **OpenAI** suits research workflows and **Anthropic** commercial…

21
Hugging Face official-blog 3mo ago

We Got Claude to Build CUDA Kernels and teach open models!

We got Claude to teach open models how to write CUDA kernels! Published January 28, 2026. By ben burtenshaw (burtenshaw), shaun smith (evalstate), merve (merve), Pedro Cuenca (pcuenq). The best thing about agent skills is upskilling your agents on…

22
Hugging Face official-blog 3mo ago

Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek

Team Article. Published January 27, 2026. By Adina Yakefu (AdinaY, huggingface), Irene Solaiman (irenesolaiman, huggingface). This is the second blog in a three-part series on…

32
Smol AI News news-outlet 3mo ago

Moonshot Kimi K2.5 - Beats Sonnet 4.5 at half the cost, SOTA Open Model, first Native Image+Video, 100 parallel Agent Swarm manager

**MoonshotAI's Kimi K2.5** is a **32B active-1T parameter open-weights model** featuring **native multimodality** with image and video understanding, built through continual pretraining on **15 trillion mixed visual and text tokens**. It introduces a new **MoonViT vision…

22
Hugging Face official-blog 3mo ago

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Team Article. Published January 27, 2026. By Jason Zhu (JasonZhu13, LinkedIn), Hejian Sang (pb09204048, LinkedIn), Arup De (arde171, LinkedIn), Rohit Jain (rohjain, LinkedIn), Yanning Chen (m0m0chen)…

32
Smol AI News news-outlet 3mo ago

Anthropic launches the MCP Apps open spec, in Claude.ai

**Anthropic** has officially absorbed the independent MCP UI project and, collaborating with **OpenAI**, **Block**, **VS Code**, **Antigravity**, **JetBrains**, and **AWS**, released the **MCP Apps spec** and official support in **Claude.ai**. This standard aims to enable a rich…

9
Smol AI News news-outlet 3mo ago

not much happened today

**Anthropic** launches "Claude in Excel Pro" with enhanced features. **OpenAI** reveals upcoming **Codex** agent loop and cybersecurity measures. **Google** boosts **Gemini App** quotas and partners with **Sakana AI** for advanced AI Scientist projects in Japan. **Cursor**…

25
Smol AI News news-outlet 3mo ago

OpenEvidence, the ‘ChatGPT for doctors,’ raises $250m at $12B valuation, 12x from $1b last Feb

**OpenEvidence** raised **$250 million** at a **$12 billion** valuation, a 12x increase from last year, with usage by 40% of U.S. physicians and over $100 million in annual revenue. **Anthropic** released a new **Claude** model constitution under **CC0 1.0**, framing it as a living document for alignment and…

34
Hugging Face official-blog 3mo ago

One Year Since the “DeepSeek Moment”

Team Article. Published January 20, 2026. By Adina Yakefu (AdinaY, huggingface), Irene Solaiman (irenesolaiman, huggingface). This is the first blog in a series that will examine China’s open source community’s historical…

10
Hugging Face official-blog 3mo ago

Introducing Waypoint-1: Real-time interactive video diffusion from Overworld

Published January 20, 2026. By Andrew Lapp (lapp0, Overworld), Louis Castricato (LouisCastricato, Overworld), Scott Fox (ScottieFox, Overworld), Shahbuland Matiana (shahbuland, Overworld)…

15
VentureBeat — AI news-outlet 3mo ago

Claude Code costs up to $200 a month. Goose does the same thing for free.

The artificial intelligence coding revolution comes with a catch: it's expensive. Claude Code, Anthropic's terminal-based AI agent that can write, debug, and deploy code autonomously, has captured the imagination of software developers worldwide. But its pricing —…

26
ThursdAI news-outlet 3mo ago

📆 ThursdAI - Jan 15 - Agent Skills Deep Dive, GPT 5.2 Codex Builds a Browser, Claude Cowork for the Masses, and the Era of Personalized AI!

From Weights & Biases - come learn what agent skills are all about, Claude Cowork opens the door for non coders to do agentic stuff, GPT 5.2 Codex in API and Gemini get personalized! Big week!

34
Smol AI News news-outlet 3mo ago

Open Responses: explicit spec for OpenAI's Responses API supported by OpenRouter, Ollama, Huggingface, vLLM, et al

**OpenAI** launched the **Open Responses** API spec, an open-source, multi-provider standard for interoperable LLM APIs designed to simplify agent stacks and tooling. Early adopters like **ollama** and **vLLM** support the spec, while notable absences include **anthropic** and…

4
Smol AI News news-outlet 3mo ago

not much happened today.

**OpenAI** launched **GPT-5.2-Codex** API, touted as their strongest coding model for long-running tasks and cybersecurity. **Cursor** integrated GPT-5.2-Codex to autonomously run a browser for a week, producing over 3 million lines of Rust code. **GitHub** incorporated it into…

19
VentureBeat — AI news-outlet 4mo ago

Salesforce rolls out new Slackbot AI agent as it battles Microsoft and Google in workplace AI

Salesforce on Tuesday launched an entirely rebuilt version of Slackbot, the company's workplace assistant, transforming it from a simple notification tool into what executives describe as a fully powered AI agent capable of searching enterprise data, drafting documents,…

37
Smol AI News news-outlet 4mo ago

Anthropic Labs: Cowork, Claude Code, MCP, Skills incubator led by Mike Krieger and Ben Mann

**Anthropic** consolidates its AI agent products under the **Cowork** brand, integrating prior tools like **Claude Code** and **Claude for Chrome** into a unified agent with sandboxed Linux VM environments using **Apple's virtualization** and **bubblewrap** for security.…

23
VentureBeat — AI news-outlet 4mo ago

Anthropic launches Cowork, a Claude Desktop agent that works in your files — no coding required

Anthropic released Cowork on Monday, a new AI agent capability that extends the power of its wildly successful Claude Code tool to non-technical users — and according to company insiders, the team built the entire feature in approximately a week and a half, largely using Claude…

38
Smol AI News news-outlet 4mo ago

Apple picks Google's Gemini to power Siri's next generation

**Apple** has decided to power Siri with **Google's Gemini models** and cloud technology, marking a significant partnership and a setback for **OpenAI**, which was initially partnered with Apple. **Anthropic** launched "Cowork," a product preview for Claude's coding…

10
Smol AI News news-outlet 4mo ago

not much happened today

**Anthropic** tightens usage policies for **Claude Max** in third-party apps, prompting builders to adopt **model-agnostic orchestration** and **BYO-key** defaults to mitigate platform risks. The **Model Context Protocol (MCP)** is evolving into a key tooling plane with **OpenAI…

31
ThursdAI news-outlet 4mo ago

ThursdAI - Jan 8 - Vera Rubin's 5x Jump, Ralph Wiggum Goes Viral, GPT Health Launches & XAI Raises $20B Mid-Controversy

From Weights & Biases, latest ThursdAI roundup with NVIDIA CES news, Grok no guardrails, Ralph Wiggum breakdown with Ryan, GPT Health and OSS AI!

16
Smol AI News news-outlet 4mo ago

not much happened today

**Stanford paper** reveals **Claude 3.7 Sonnet** memorized **95.8% of Harry Potter 1**, highlighting copyright extraction risks compared to **GPT-4.1**. **Google AI Studio** sponsors **TailwindCSS** amid OSS funding debates. **Google** and **Sundar Pichai** launch **Gmail Gemini…

21
One Useful Thing (Ethan Mollick) community 4mo ago

Claude Code and What Comes Next

With the right tools, AI can accomplish impressive things

36
VentureBeat — AI news-outlet 4mo ago

Nous Research's NousCoder-14B is an open-source coding model landing right in the Claude Code moment

Nous Research, the open-source artificial intelligence startup backed by crypto venture firm Paradigm, released a new competitive programming model on Monday that it says matches or exceeds several larger proprietary systems — trained in just four days using 48 of…

4
Smol AI News news-outlet 4mo ago

not much happened today

**AI News for 1/6/2026-1/7/2026** highlights a quiet day with key updates on **LangChain DeepAgents** introducing **Ralph Mode** for persistent agent loops, **Cursor** improving context management by reducing token usage by **46.9%**, and operational safety measures for coding…

26
Hugging Face official-blog 4mo ago

NVIDIA Cosmos Reason 2 Brings Advanced Reasoning To Physical AI

Enterprise + Article. Published January 5, 2026. By Tsung-Yi Lin (tsungyi, nvidia), Debraj Sinha (debrajsinha, nvidia). NVIDIA today released Cosmos Reason 2, the latest advancement in open, reasoning…

17
Hugging Face official-blog 4mo ago

Introducing Falcon-H1-Arabic: Pushing the Boundaries of Arabic Language AI with Hybrid Architecture

Community Article. Published January 5, 2026. By Basma Boussaha (basma-b, tiiuae), Mohammed Alyafeai (Alyafeai, tiiuae), Ahmed Alzubaidi (amztheory, tiiuae), Leen AlQadi…

28
VentureBeat — AI news-outlet 4mo ago

The creator of Claude Code just revealed his workflow, and developers are losing their minds

When the creator of the world's most advanced coding agent speaks, Silicon Valley doesn't just listen — it takes notes. For the past week, the engineering community has been dissecting a thread on X from Boris Cherny, the creator and head of Claude Code at Anthropic.…

31
Smol AI News news-outlet 4mo ago

not much happened today

**DeepSeek** released a new paper on **mHC: Manifold-Constrained Hyper-Connections**, advancing residual-path design as a key scaling lever in neural networks. Their approach constrains residual mixing matrices to the **Birkhoff polytope** to improve stability and performance,…

13
ThursdAI news-outlet 4mo ago

ThursdAI - Jan 1 2026 - Will Brown Interview + Nvidia buys Groq, Meta buys Manus, Qwen Image 2412 & Alex New Year greetings

From Weights & Biases, Last episode of last year, first episode of this new one, Groq and Manus are picked up last second, Qwen releases a new image & interview with Will Brown from Prime Intellect

4
Smol AI News news-outlet 4mo ago

not much happened today

**South Korea's Ministry of Science** launched a coordinated program with **5 companies** to develop sovereign foundation models from scratch, featuring large-scale MoE architectures like **SK Telecom A.X-K1 (519B total / 33B active)** and **LG K-EXAONE (236B MoE / 23B…

24
Ahead of AI (Sebastian Raschka) research 4mo ago

The State Of LLMs 2025: Progress, Problems, and Predictions

A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026.

33
Smol AI News news-outlet 4mo ago

Meta Superintelligence Labs acquires Manus AI for over $2B, at $100M ARR, 9months after launch

**Manus** achieved a rapid growth trajectory in 2025, raising **$500M** from Benchmark and reaching **$100M ARR** before being acquired by **Meta** for an estimated **$4B**. The **vLLM** team launched a dedicated community site with new resources, while performance issues with…

30
Smol AI News news-outlet 4mo ago

not much happened today

**MiniMax M2.1** launches as an **open-source** agent and coding Mixture-of-Experts (MoE) model with **~10B active / ~230B total parameters**, claiming to outperform **Gemini 3 Pro** and **Claude Sonnet 4.5**, and supports local inference including on **Apple Silicon M3 Ultra**…

10
Smol AI News news-outlet 4mo ago

not much happened today

**GLM-4.7** and **MiniMax M2.1** open-weight model releases highlight day-0 ecosystem support, coding throughput, and agent workflows, with GLM-4.7 achieving a +9.5% improvement over GLM-4.6 and MiniMax M2.1 positioned as an OSS Claude-like MoE model with 230B total parameters…

18
Smol AI News news-outlet 4mo ago

not much happened today

**Zhipu AI's GLM-4.7** release marks a significant improvement in **coding, complex reasoning, and tool use**, quickly gaining ecosystem adoption via Hugging Face and OpenRouter. **Xiaomi's MiMo-V2-Flash** is highlighted as a practical, cost-efficient mixture-of-experts model…

30
Smol AI News news-outlet 4mo ago

not much happened today

**Alibaba** released **Qwen-Image-Layered**, an open-source model enabling Photoshop-grade layered image decomposition with recursive infinite layers and prompt-controlled structure. **Kling 2.6** introduced advanced motion control for image-to-video workflows, supported by a…

18
Google DeepMind official-blog 4mo ago

Gemini 3 Flash: frontier intelligence built for speed

Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.

10
Zed Editor dev-tools 4mo ago

Zed Moves Toward Secure-by-Default: Introducing Worktree Trust

We're introducing a new worktree trust mechanism while maintaining options for a low-friction experience you expect from Zed.

17
Google DeepMind official-blog 4mo ago

Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior

Open interpretability tools for language models are now available across the entire Gemma 3 family with the release of Gemma Scope 2.

11
Google DeepMind official-blog 5mo ago

Improved Gemini audio models for powerful voice experiences

By Bibo Xu, Director of Product Management, and Tara Sainath, Distinguished Research Scientist. Google enhanced Gemini 2.5 Flash Native Audio for better live voice agents. Expect…

37
Hugging Face official-blog 5mo ago

Codex is Open Sourcing AI models

Published December 11, 2025. By ben burtenshaw (burtenshaw), shaun smith (evalstate). Building on our work to get Claude Code to train open source models, we are now getting Codex to go further. We gave Codex…

23
Hugging Face official-blog 5mo ago

Introducing swift-huggingface: The Complete Swift Client for Hugging Face

Published December 5, 2025. By Mattt (mattt). Today, we're announcing swift-huggingface, a new Swift package that provides a complete client for the Hugging Face Hub.…

27
Hugging Face official-blog 5mo ago

We Got Claude to Fine-Tune an Open Source LLM

Published December 4, 2025. By ben burtenshaw (burtenshaw), shaun smith (evalstate). We gave Claude the ability to fine-tune language models using a new tool called Hugging Face Skills. Not just…

35
Ahead of AI (Sebastian Raschka) research 5mo ago

From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates

Understanding How DeepSeek's Flagship Open-Weight Models Evolved

34
Google DeepMind official-blog 5mo ago

How we’re bringing AI image verification to the Gemini app

Nov 20, 2025 · We are increasing content transparency by introducing the ability to verify if an image was generated or edited by Google AI right in the Gemini app. Pushmeet Kohli, VP,…

37
Google DeepMind official-blog 5mo ago

Build with Nano Banana Pro, our Gemini 3 Pro Image model

Here’s how developers can use Nano Banana Pro (Gemini 3 Pro Image), a powerful new image generation and editing model with advanced features and creative control. Alisa Fortin, Product…

10
Google DeepMind official-blog 5mo ago

Introducing Nano Banana Pro

Nov 20, 2025 · Turn your visions into studio-quality designs with unprecedented control, improved text rendering and enhanced world knowledge. Naina Raisinghani, Product Manager, Google DeepMind. Google…

22
Hugging Face official-blog 5mo ago

Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms

Published November 20, 2025, by Mattt (guest). LLMs have become essential tools for building software. But for Apple developers, integrating them remains…

25
Google DeepMind official-blog 5mo ago

Start building with Gemini 3

Nov 18, 2025 · Whether you’re an experienced developer or a vibe coder, Gemini 3 can help you bring any idea to life. Logan Kilpatrick, Product Lead, Google AI Studio and the Gemini API…

20
One Useful Thing (Ethan Mollick) community 5mo ago

Three Years from GPT-3 to Gemini 3

From chatbots to agents

20
Google DeepMind official-blog 5mo ago

A new era of intelligence with Gemini 3

Nov 18, 2025 · Gemini 3 is our most intelligent model that helps you bring any idea to life. Sundar Pichai, CEO, Google and Alphabet; Demis Hassabis, CEO, Google DeepMind; Koray Kavukcuoglu, CTO, Google…

21
Google DeepMind official-blog 5mo ago

Introducing Google Antigravity

11
Google DeepMind official-blog 6mo ago

SIMA 2: An Agent that Plays, Reasons, and Learns With You in Virtual 3D Worlds

Introducing SIMA 2, a Gemini-powered AI agent that can think, understand, and take actions in interactive environments.

16
Google DeepMind official-blog 6mo ago

How AI is giving Northern Ireland teachers time back

A six-month long pilot program with the Northern Ireland Education Authority’s C2k initiative found that integrating Gemini and other generative AI tools saved participating teachers an average of 10 hours per week.

38
Zed Editor dev-tools 6mo ago

Introducing Agent Extensions

Zed launches Agent Server Extensions, enabling one-click installation of ACP-compatible agents like Augment Code and OpenCode.

31
Hugging Face official-blog 6mo ago

Granite 4.0 Nano: Just how small can you go?

Published October 28, 2025, by Kate Soule and Rameswar Panda (ibm-granite). Today we are excited to share Granite 4.0 Nano, our smallest models yet, released as part…

15
Google DeepMind official-blog 6mo ago

T5Gemma: A new collection of encoder-decoder Gemma models

Introducing T5Gemma, a new collection of encoder-decoder LLMs.

26
Google DeepMind official-blog 6mo ago

MedGemma: Our most capable open models for health AI development

We’re announcing new multimodal models in the MedGemma collection, our most capable open models for health AI development.

4
Google DeepMind official-blog 6mo ago

Introducing Gemma 3n: The developer guide

Gemma 3n is designed for the developer community that helped shape Gemma.

15
Google DeepMind official-blog 6mo ago

Gemini 2.5 Flash-Lite is now ready for scaled production use

Gemini 2.5 Flash-Lite, previously in preview, is now stable and generally available. This cost-efficient model provides high quality in a small size, and includes 2.5 family features like a 1 million-token context window and multimodality.

30
Google DeepMind official-blog 6mo ago

Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad

The International Mathematical Olympiad (“IMO”) is the world’s most prestigious competition for young mathematicians, and has been held annually since 1959. Each country taking part is represented by six elite, pre-university mathematicians who compete to solve six exceptionally…

7
Google DeepMind official-blog 6mo ago

Aeneas transforms how historians connect the past

Introducing the first model for contextualizing ancient inscriptions, designed to help historians better interpret, attribute and restore fragmentary texts.

10
Google DeepMind official-blog 6mo ago

Gemini achieves gold-medal level at the International Collegiate Programming Contest World Finals

Gemini 2.5 Deep Think achieves breakthrough performance at the world’s most prestigious computer programming competition, demonstrating a profound leap in abstract problem solving.

34
Google DeepMind official-blog 6mo ago

Gemini Robotics 1.5 brings AI agents into the physical world

We’re powering an era of physical agents — enabling robots to perceive, plan, think, use tools and act to better solve complex, multi-step tasks.

7
Google DeepMind official-blog 6mo ago

Introducing CodeMender: an AI agent for code security

Using advanced AI to fix critical software vulnerabilities

24
Google DeepMind official-blog 6mo ago

Try Deep Think in the Gemini app

We're rolling out Deep Think in the Gemini app for Google AI Ultra subscribers, and we're giving select mathematicians access to the full version of the Gemini 2.5 Deep Think model entered into the IMO competition.

29
Google DeepMind official-blog 6mo ago

Introducing Gemma 3 270M: The compact model for hyper-efficient AI

Today, we're adding a new, highly specialized tool to the Gemma 3 toolkit: Gemma 3 270M, a compact, 270-million parameter model.

36
Google DeepMind official-blog 6mo ago

Image editing in Gemini just got a major upgrade

Transform images in amazing new ways with updated native image editing in the Gemini app.

34
Google DeepMind official-blog 6mo ago

Introducing the Gemini 2.5 Computer Use model

Available in preview via the API, our Computer Use model is a specialized model built on Gemini 2.5 Pro’s capabilities to power agents that can interact with user interfaces.

22
Google DeepMind official-blog 6mo ago

Introducing Veo 3.1 and advanced creative capabilities

We’re rolling out significant updates to Veo that give people even more creative control.

30
Google DeepMind official-blog 6mo ago

How a Gemma model helped discover a new potential cancer therapy pathway

We’re launching a new 27 billion parameter foundation model for single-cell analysis built on the Gemma family of open models.

36
Hugging Face official-blog 6mo ago

Building the Open Agent Ecosystem Together: Introducing OpenEnv

Published October 23, 2025, by Joseph Spisak, Davide Testuggine, Zach Wentz, Pierre Andrews, Sanyam Bhutani…

9
Hugging Face official-blog 6mo ago

Sentence Transformers is joining Hugging Face!

Published October 22, 2025, by Tom Aarsen. Today, we are announcing that Sentence Transformers is transitioning from Iryna Gurevych’s Ubiquitous Knowledge Processing (UKP) Lab at the…

9
Zed Editor dev-tools 6mo ago

Codex is Live in Zed

OpenAI's Codex AI agent is now available in Zed via the Agent Client Protocol (ACP).

23
Zed Editor dev-tools 7mo ago

How the Community is Driving ACP Forward

A progress report on the adoption of the Agent Client Protocol (ACP) since we launched it.

35
Ahead of AI (Sebastian Raschka) research 8mo ago

Understanding and Implementing Qwen3 From Scratch

A Detailed Look at One of the Leading Open-Source LLMs

14
Zed Editor dev-tools 8mo ago

Claude Code: Now in Beta in Zed

You asked, and here it is. Use Claude Code in public beta directly in Zed, built on the new Agent Client Protocol.

8
One Useful Thing (Ethan Mollick) community 8mo ago

Mass Intelligence

From GPT-5 to nano banana: everyone is getting access to powerful AI

24
Zed Editor dev-tools 8mo ago

Bring Your Own Agent to Zed — Featuring Gemini CLI

Zed now lets you use the agent of your choice through the new Agent Client Protocol, starting with Google's Gemini CLI.

12
Aider releases dev-tools 9mo ago

Aider v0.86.0

Added support for all GPT-5 models. Added support for Grok-4 via xai/grok-4 and openrouter/x-ai/grok-4 model names. Added support for gemini/gemini-2.5-flash-lite-preview-06-17 model, by Tamir Zahavi-Brunner. /clear now prints “All chat history cleared.” so you know it worked,…

13
Ahead of AI (Sebastian Raschka) research 9mo ago

From GPT-2 to gpt-oss: Analyzing the Architectural Advances

And How They Stack Up Against Qwen3

15
One Useful Thing (Ethan Mollick) community 9mo ago

GPT-5: It Just Does Stuff

Putting the AI in Charge

31
Ahead of AI (Sebastian Raschka) research 9mo ago

The Big LLM Architecture Comparison

From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design

21
Google DeepMind official-blog 10mo ago

AlphaGenome: AI for better understanding the genome

Introducing a new, unifying DNA sequence model that advances regulatory variant-effect prediction and promises to shed new light on genome function — now available via API.

34
Google DeepMind official-blog 10mo ago

Gemini Robotics On-Device brings AI to local robotic devices

We’re introducing an efficient, on-device robotics model with general-purpose dexterity and fast task adaptation.

33
Google DeepMind official-blog 11mo ago

Gemini 2.5: Updates to our family of thinking models

Explore the latest Gemini 2.5 model updates with enhanced performance and accuracy: Gemini 2.5 Pro now stable, Flash generally available, and the new Flash-Lite in preview.

32
Google DeepMind official-blog 11mo ago

We’re expanding our Gemini 2.5 family of models

Gemini 2.5 Flash and Pro are now generally available, and we’re introducing 2.5 Flash-Lite, our most cost-efficient and fastest 2.5 model yet.

29
Google DeepMind official-blog 11mo ago

How we're supporting better tropical cyclone prediction with AI

We’re launching Weather Lab, featuring our experimental cyclone predictions, and we’re partnering with the U.S. National Hurricane Center to support their forecasts and warnings this cyclone season.

12
Google DeepMind official-blog 11mo ago

Advanced audio dialog and generation with Gemini 2.5

Gemini 2.5 has new capabilities in AI-powered audio dialog and generation.

28
Nonint (James Betker) research 11mo ago

Vibe Coding

I had a pretty incredible vibe coding experience with o3 today. As I’m sure many of you have also had recently – whether with o3, or Claude or Gemini. I was iterating on a problem with it over a couple of hours. I asked it to come up with an idea for a novel…

26
Google DeepMind official-blog 11mo ago

Fuel your creativity with new generative media models and tools

Introducing Veo 3 and Imagen 4, and a new tool for filmmaking called Flow.

12
Google DeepMind official-blog 11mo ago

Announcing Gemma 3n preview: Powerful, efficient, mobile-first AI

Gemma 3n is a cutting-edge open model designed for fast, multimodal AI on devices, featuring optimized performance, unique flexibility with a 2-in-1 model, and expanded multimodal understanding with audio, empowering developers to build live, interactive applications and…

23
Google DeepMind official-blog 11mo ago

Gemini 2.5: Our most intelligent models are getting even better

Gemini 2.5 Pro continues to be loved by developers as the best model for coding, and 2.5 Flash is getting even better with a new update. We’re bringing new capabilities to our models, including Deep Think, an experimental enhanced reasoning mode for 2.5 Pro.

34
Google DeepMind official-blog 11mo ago

Advancing Gemini's security safeguards

We’ve made Gemini 2.5 our most secure model family to date.

10
Google DeepMind official-blog 11mo ago

Our vision for building a universal AI assistant

We’re extending Gemini to become a world model that can make plans and imagine new experiences by simulating aspects of the world.

28
Google DeepMind official-blog 12mo ago

Gemini 2.5 Pro Preview: even better coding performance

We’ve seen developers doing amazing things with Gemini 2.5 Pro, so we decided to release an updated version a couple of weeks early to get it into developers’ hands sooner.

31
Google DeepMind official-blog 12mo ago

Build rich, interactive web apps with an updated Gemini 2.5 Pro

Our updated version of Gemini 2.5 Pro Preview has improved capabilities for coding.

29
Google DeepMind official-blog 13mo ago

Introducing Gemini 2.5 Flash

Gemini 2.5 Flash is our first fully hybrid reasoning model, giving developers the ability to turn thinking on or off.

23
Google DeepMind official-blog 13mo ago

Generate videos in Gemini and Whisk with Veo 2

Transform text-based prompts into high-resolution eight-second videos in Gemini Advanced and use Whisk Animate to turn images into eight-second animated clips.

26
Google DeepMind official-blog 13mo ago

Gemini 2.5: Our most intelligent AI model

Gemini 2.5 is our most intelligent AI model, now with thinking built in.

32
Google DeepMind official-blog 14mo ago

Gemini Robotics brings AI into the physical world

Introducing Gemini Robotics and Gemini Robotics-ER, AI models designed for robots to understand, act and react to the physical world.

9
Google DeepMind official-blog 14mo ago

Experiment with Gemini 2.0 Flash native image generation

Native image output is available in Gemini 2.0 Flash for developers to experiment with in Google AI Studio and the Gemini API.

5
Google DeepMind official-blog 14mo ago

Introducing Gemma 3

The most capable model you can run on a single GPU or TPU.

17
Google DeepMind official-blog 14mo ago

Start building with Gemini 2.0 Flash and Flash-Lite

Gemini 2.0 Flash-Lite is now generally available in the Gemini API for production use in Google AI Studio and for enterprise customers on Vertex AI.

6
Maarten Grootendorst research 15mo ago

A Visual Guide to Reasoning LLMs

Exploring Test-Time Compute Techniques and DeepSeek-R1

9
Zed Editor dev-tools 15mo ago

How is DeepSeek-R1 for Coding? Try it right now!

How to try DeepSeek-R1 for coding via the Zed code editor's out-of-the-box support.

26
Zed Editor dev-tools 19mo ago

Zed AI: Introducing Usage-Based Billing for High-Volume Users

Some billing-related changes for Zed's AI offering

25
Zed Editor dev-tools 21mo ago

Introducing Zed AI

Powerful AI-assisted coding powered by Anthropic's Claude, now available.

35
Zed Editor dev-tools 21mo ago

Bringing Interactive Computing to Zed: Introducing REPL Support

Run code, visualize data, and iterate right in your editor with Zed's new REPL support.

34
Nonint (James Betker) research 24mo ago

GPT-4o

I’m very pleased to show the world GPT-4o. I came into the project mid-last year with Alexis Conneau with the goal of scaling up speech models and building an “AudioLM”. We knew we had something special late last year, but I don’t think either of us…

22
Zed Editor dev-tools 27mo ago

Optimizing the Metal pipeline to maintain 120 FPS in GPUI

Zed feels smoother than ever with today's release of 0.121, thanks to a series of optimizations that began on the kitchen table of popular streamer Theo Browne. In an excellent video following our open source launch, Theo gave a bunch of great feedback, but what really…

6
Zed Editor dev-tools 27mo ago

User themes now in Preview

Theme Zed just how you like it in the latest Preview release.

24
Zed Editor dev-tools 28mo ago

Introducing Channels for Collaboration

Channels: A virtual office for software teams!

26
Lil'Log (Lilian Weng) research 31mo ago

Adversarial Attacks on LLMs

The use of large language models in the real world has been strongly accelerated by the launch of ChatGPT. We (including my team at OpenAI, shoutout to them) have invested a lot of effort to build default safe behavior into the model during the alignment process (e.g. via RLHF).…

5
Maarten Grootendorst research 31mo ago

Introducing KeyLLM — Keyword Extraction with LLMs

Use KeyLLM, KeyBERT, and Mistral 7B to extract keywords from your data

15
Maarten Grootendorst research 32mo ago

3 Ways To Improve Your Large Language Model

Enhancing the power of Llama 2

19
Maarten Grootendorst research 33mo ago

Topic Modeling with Llama 2

Create easily interpretable topics with Large Language Models

29
Maarten Grootendorst research 33mo ago

Decoding Auto-GPT

The Mechanics of an Autonomous GPT-4

35
Lil'Log (Lilian Weng) research 35mo ago

LLM Powered Autonomous Agents

Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concept demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories,…

26
Eugene Yan research 49mo ago

How to Measure and Mitigate Position Bias

Introducing randomness and/or learning from inherent randomness to mitigate position bias.

19
Andrej Karpathy research 62mo ago

Short Story on AI: Forward Pass

The inspiration for this short story came to me while reading Kevin Lacker’s Giving GPT-3 a Turing Test. It is probably worth it (though not required) to skim this post to get a bit of a background on some of this story. It was probably around the 32nd layer of the 400th token…

30
Lil'Log (Lilian Weng) research 88mo ago

Generalized Language Models

[Updated on 2019-02-14: add ULMFiT and GPT-2.] [Updated on 2020-02-29: add ALBERT.] [Updated on 2020-10-25: add RoBERTa.] [Updated on 2020-12-13: add T5.] [Updated on 2020-12-30: add GPT-3.] [Updated on 2021-11-13: add XLNet, BART and ELECTRA; Also updated the Summary…

6

24+ tok/s from ~30B MoE models on an old GTX 1080 (8 GB VRAM, 128k context)

Cyber Lack of Security and AI Governance

Anthropic’s Cat Wu says that, in the future, AI will anticipate your needs before you know what they are

MI50s Qwen 3.6 27B @52.8 tps TG @1569 tps PP (no MTP, no Quant)

Latest Version of Anthropic’s Mythos AI is Even Better at Hacking, UK Researchers Say

Who is your favourite quant publisher and why?

Amazon launches an AI shopping assistant for the search bar, powered by Alexa+

Introducing the 6 stages at TechCrunch Disrupt 2026 — built for today’s tougher startup market

Claude 4 announced — context window doubles, agentic tools land

qwen3.6 just stops

GPT-5 paper drops on arXiv — scaling laws revisited

Amazon Drops ‘Rufus’ Branding on Shopping Chatbot, Adds AI in Search

TextGen is now a native desktop app. Open-source alternative to LM Studio (formerly text-generation-webui).

Hugging Face releases open-weights model family

Mistral AI announces fine-tuning service

The Trillion-Parameter Dilemma: MiMo-V2.5-Pro went open-source (1.02T params). Is self-hosting worth it when the API costs $70 for 387M tokens?

Does THINKING MODE significantly improve translation?

Former Alibaba Star Researcher Starts New AI Lab, Seeks $2 Billion Valuation

SecurityBaseline.eu

LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models

Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models

Caraman at SemEval-2026 Task 8: Three-Stage Multi-Turn Retrieval with Query Rewriting, Hybrid Search, and Cross-Encoder Reranking

datasette 1.0a29

High VRAM local coding model — still Qwen 3.6 27B?

Scrcpy v4.0

v0.23.4-rc0

v0.23.4

b9123

Luce DFlash + PFlash on AMD Strix Halo: Qwen3.6-27B at 2.23x decode and 3.05x prefill vs llama.cpp HIP

Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model

Needle: We Distilled Gemini Tool Calling Into a 26M Model

llm 0.32a2

New Qwen3.6 27b Autoround Quant (int4) Best Recipe

Everything Google announced at its Android Show, from Googlebooks to vibe-coded widgets

Google adds Gemini-powered dictation to Gboard, which could be bad news for dictation startups

Google brings agentic AI and vibe-coded widgets to Android

Let's build claude code from scratch!

Local LLM autocomplete + agentic coding on a single 16GB GPU + 64GB RAM

langchain==1.3.0

MagicQuant (v2.0) - Hybrid Mixed GGUF Models + Unsloth Dynamic Learned Quant Configurations + Benchmark table with collapsed winners and more

TabPFN-3 just released: a pre-trained tabular foundation model for up to 1M rows [R][N]

Fast mode for Opus 4.7 available on AI Gateway

AI Gateway production index

How NVIDIA engineers and researchers build with Codex

Node.js 26.x now available on Vercel Sandboxes

Introducing NVIDIA Fleet Intelligence for Real-Time GPU Fleet Visibility and Optimization

v0.101.0

Introducing the Heap, the software engineering blog for everyone

OpenAI launches DeployCo to help businesses build around intelligence

v0.20.2

Using Claude Code: The Unreasonable Effectiveness of HTML

Claude Code, Codex and Agentic Coding #8

langchain==1.2.18

[AINews] GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

not much happened today

llm-gemini 0.31

Behind the Scenes Hardening Firefox with Claude Mythos Preview

Notes on the xAI/Anthropic data center deal

langchain-core==0.3.86

langchain==0.3.30

langchain-classic==1.0.7

Scaling Trusted Access for Cyber with GPT-5.5 and GPT-5.5-Cyber

Next.js May 2026 security release

GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

Introducing Trusted Contact in ChatGPT

Anthropic raises Claude Code usage limits, credits new deal with SpaceX

langchain==1.3.0a2

Anthropic's Claude Managed Agents can now "dream," sort of

Live blog: Code w/ Claude 2026

Anthropic-SpaceXai's 300MW/$5B/yr deal for Colossus I, ARR growth is 8000% annualized

Introducing ChatGPT Futures: Class of 2026

Introducing Zed for Business

datasette-referrer-policy 0.1

The AI Ad-Hoc Prior Restraint Era Begins

langchain-core==1.3.3

langchain-fireworks==1.3.1

langchain-mistralai==1.1.4

Unlocking large scale AI training networks with MRC (Multipath Reliable Connection)
