Smol AI News

100 articles archived · Visit source ↗ · RSS

Smol AI News news-outlet 1d ago

not much happened today

**Research-level reasoning benchmarks** are advancing with **439 new math problems** from **64 mathematicians** and expanded medical benchmarks in **Medmarks v1.0** covering **30 benchmarks** and **61 models**. **Google DeepMind's AI Co-Mathematician** achieves **48% on…

15
Smol AI News news-outlet 2d ago

not much happened today

**Thinking Machines** previewed their new **native interaction models** designed for **full-duplex multimodal interaction** enabling real-time concurrent listening, speaking, watching, thinking, searching, and reacting, marking a shift beyond turn-based AI. This approach…

36
Smol AI News news-outlet 5d ago

not much happened today

**OpenAI** rapidly expanded the **GPT-5.5** family with multiple variants including **gpt-image-2**, **GPT-5.5 Pro**, and **GPT-5.5 Cyber**, receiving positive feedback for efficiency and usability. **Codex** evolved into a long-running agent runtime with a new **/goal**…

35
Smol AI News news-outlet 6d ago

GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

**OpenAI** released **GPT-Realtime-2**, a voice model with **GPT-5-class reasoning**, tool use, interruption handling, and extended context windows up to **128K tokens**, achieving top scores on **Big Bench Audio** and **Conversational Dynamics** benchmarks. They also launched a…

22
Smol AI News news-outlet 7d ago

Anthropic-SpaceXai's 300MW/$5B/yr deal for Colossus I, ARR growth is 8000% annualized

**Anthropic** announced a new **SpaceX compute partnership** to significantly increase capacity for **Claude** products, doubling **Claude Code's 5-hour rate limits** for Pro, Max, Team, and Enterprise users, removing peak-hour limit reductions, and substantially increasing API…

26
Smol AI News news-outlet 9d ago

not much happened today

**OpenAI** rolled out **GPT-5.5 Instant** as the new default for ChatGPT and API, enhancing **factuality, intelligence, image understanding, and tone** with stronger personalization features like saved memories and Gmail integration. OpenAI also shared infrastructure updates on…

28
Smol AI News news-outlet 9d ago

not much happened today

**AI Twitter Recap** highlights the shift from model-centric AI to **context pipelines** and **agent orchestration** as key performance drivers. Notably, **gpt-5.2-codex** and **gpt-5.3-codex** showed significant benchmark improvements through prompt and middleware tuning. The…

16
Smol AI News news-outlet 12d ago

not much happened today

**xAI released Grok 4.3**, improving cost/performance with a **53 Intelligence Index score**, 4 points higher than Grok 4.20, and significant gains on **GDPval-AA** and **τ²-Bench Telecom**. However, accuracy tradeoffs raised reliability concerns. Community opinions are mixed,…

32
Smol AI News news-outlet 13d ago

not much happened today

**OpenAI's GPT-5.5** achieves top-tier performance in long-horizon cyber tasks, matching or surpassing **Claude Mythos Preview** with a **71.4%** pass rate and showing ongoing improvement beyond **100M tokens** inference. OpenAI also released an **Advanced Account Security**…

32
Smol AI News news-outlet 14d ago

not much happened today

**OpenAI** is expanding **Codex** from a coding tool to a general work surface with persistent context, tools, integrations, and team rollout, including **Codex-only seats with $0 seat fee** for Business/Enterprise customers through June. Performance improvements focus on…

23
Smol AI News news-outlet 15d ago

not much happened today

**vLLM v0.20.0** introduces significant improvements in memory and MoE serving efficiency, including **TurboQuant 2-bit KV cache** for **4× KV capacity** and a **2.1% latency improvement**. The update supports multiple hardware platforms like **DeepSeek V4 MegaMoE on…

9
Smol AI News news-outlet 16d ago

not much happened today

**OpenAI** loosens its **Azure exclusivity**, allowing distribution across **Google TPU**, **AWS Trainium**, and **Bedrock** with commitments through **2032** and revenue share through **2030**. **GPT-5.5** shows improved benchmarks but is not uniformly dominant, ranking…

11
Smol AI News news-outlet 19d ago

DeepSeek v4

**DeepSeek-V4** technical release features a **1.6T-parameter MoE with 49B active parameters** and **1M-token context**, showcasing hybrid attention and compressed KV schemes for major memory reductions. It ranks as the **#2 open-weights reasoning model** behind **Kimi K2.6**…

13
Smol AI News news-outlet 20d ago

GPT 5.5

**OpenAI launched GPT-5.5** as its new flagship model for "real work and powering agents," immediately available in ChatGPT and Codex but with delayed API access due to enhanced safety requirements. The model features improved token efficiency and supports longer multi-step…

14
Smol AI News news-outlet 21d ago

not much happened today

**Alibaba** released **Qwen3.6-27B**, a dense, Apache 2.0 open coding model with thinking and non-thinking modes, outperforming the larger Qwen3.5-397B-A17B on multiple coding benchmarks including SWE-bench and Terminal-Bench. It supports native vision-language reasoning over…

15
Smol AI News news-outlet 22d ago

GPT-Image-2

**OpenAI** launched **GPT-Image-2**, enhancing image generation with improved text rendering, layout fidelity, editing, multilingual support, and "thinking" capabilities. It supports generating slides, infographics, diagrams, UI mockups, and QR codes, and integrates with tools…

36
Smol AI News news-outlet 23d ago

not much happened today

**Moonshot's Kimi K2.6** is a major open-weight **1T-parameter MoE** model featuring **32B active parameters**, **384 experts**, **MLA attention**, **256K context window**, native multimodality, and **INT4 quantization**. It supports day-0 integration with platforms like…

9
Smol AI News news-outlet 26d ago

not much happened today

**Anthropic** launched **Claude Design**, a prototyping tool powered by **Claude Opus 4.7**, targeting design workflows and competing with **Figma** and others. Benchmarks show **Opus 4.7** leading in coding and text tasks, with improved efficiency and adaptive reasoning, though…

7
Smol AI News news-outlet 27d ago

Anthropic's Claude Opus 4.7

**Anthropic** launched **Claude Opus 4.7**, its most capable Opus model yet, featuring stronger coding and agentic performance, a new tokenizer, and improved long-context handling with a new **xhigh** reasoning tier. Benchmarks show substantial gains, including **SWE-bench Pro…

37
Smol AI News news-outlet 28d ago

not much happened today

**OpenAI** expanded its Agents SDK by separating the agent harness from compute/storage, enabling long-running, durable agents with features like file/computer use, skills, memory, and compaction. The harness is now open-source and supports execution via partner sandboxes,…

37
Smol AI News news-outlet 1mo ago

not much happened today

**Harness engineering** is emerging as a key discipline in AI agent development, emphasizing components like filesystems, memory, and retries beyond just models. **OpenAI's Codex** is expanding agentic coding workflows beyond software engineering, including codebase…

32
Smol AI News news-outlet 1mo ago

not much happened today

**GLM-5.1** has reached **#3 on Code Arena**, surpassing **Gemini 3.1** and **GPT-5.4**, and matching **Claude Sonnet 4.6** in coding performance. **Z.ai** now holds the **#1 open model rank** close to the top overall. The advisor pattern, combining a cheap executor with an…

12
Smol AI News news-outlet 1mo ago

not much happened today

**Anthropic's Mythos** and **OpenAI's** upcoming restricted cyber-capable models are central to recent discussions, with debates on their security realism and evaluation methods. **LangChain's Deep Agents deploy** introduces an open memory, model-agnostic agent harness…

36
Smol AI News news-outlet 1mo ago

not much happened today

**Meta Superintelligence Labs** launched **Muse Spark**, a natively multimodal reasoning model featuring tool use, visual chain of thought, and multi-agent orchestration. It is live on **meta.ai** and the Meta AI app with a private API preview and plans for open-sourcing future…

29
Smol AI News news-outlet 1mo ago

Anthropic @ $30B ARR, Project GlassWing and Claude Mythos Preview — first model too dangerous to release since GPT-2

**Anthropic** strategically challenges **OpenAI** amid its upcoming IPO concerns by announcing a jump from **$19B ARR in March** to **$30B ARR in April**, highlighting a differential growth rate and higher cost efficiency. The company also revealed **Claude Mythos**, rumored as…

30
Smol AI News news-outlet 1mo ago

not much happened today

**Hermes Agent** is gaining attention as a leading open agent stack with features like self-improving skills, persistent memory, and a self-improvement loop. Its new **Manim skill** enables generation of math/technical animations, expanding agent capabilities. The Hermes…

19
Smol AI News news-outlet 1mo ago

not much happened today

**Google** introduced **Skills in Chrome**, enabling reusable browser workflows with Gemini prompts and a library of ready-made Skills, enhancing end-user agentization. **Tencent** teased **HYWorld 2.0**, an open-source 3D world model generating editable scenes from a single…

8
Smol AI News news-outlet 1mo ago

not much happened today

**Gemma 4** was launched by **Google** under an **Apache 2.0 license**, marking a significant open-model release focused on **reasoning, agentic workflows, multimodality, and on-device use**. It outperforms models 10x larger and has immediate ecosystem support including…

35
Smol AI News news-outlet 1mo ago

Gemma 4

**Google DeepMind** released **Gemma 4**, a family of open-weight, multimodal models with long-context support up to **256K tokens** under an **Apache 2.0 license**, marking a major capability and licensing shift. The lineup includes **31B dense**, **26B MoE (A4B)**, and two…

14
Smol AI News news-outlet 1mo ago

not much happened today

**Arcee’s Trinity-Large-Thinking** was released with **open weights under Apache 2.0**, featuring a **400B total / 13B active** model size and strong agentic performance, ranking **#2 on PinchBench**. **Z.ai’s GLM-5V-Turbo** is a **vision coding model** with **native multimodal…

13
Smol AI News news-outlet 1mo ago

not much happened today

**Anthropic** introduced **computer use inside Claude Code** for closed-loop verification in a research preview for Pro/Max users, enhancing reliable app iteration. **OpenAI** released a **Codex plugin for Claude Code**, enabling cross-agent composition and signaling a shift…

16
Smol AI News news-outlet 1mo ago

not much happened today

**Anthropic** is reportedly introducing a new AI model tier called **Capybara**, which is larger and more intelligent than **Claude Opus 4.6**, showing improved performance in coding, academic reasoning, and cybersecurity. The model is speculated to be around **10 trillion…

38
Smol AI News news-outlet 1mo ago

not much happened today

**Anthropic** advances agent infrastructure with a multi-agent harness emphasizing orchestration and "computer use" for complex software environments. **Figma**, **GitHub**, and **Cursor** launch design canvases with direct AI editing, showcasing tool-calling becoming…

12
Smol AI News news-outlet 1mo ago

not much happened today

**ARC-AGI-3** benchmark introduced by **@arcprize** and **François Chollet** resets the frontier for general agentic reasoning with humans solving 100% of tasks versus under 1% for current models, focusing on zero-preparation generalization and human-like learning efficiency.…

4
Smol AI News news-outlet 1mo ago

not much happened today

**Google** launched **Gemini 3.1 Flash Live**, a realtime voice and vision agent model with **2x longer conversation memory**, supporting **70 languages** and **128k context**. **Mistral AI** released **Voxtral TTS**, a low-latency, open-weight text-to-speech model supporting…

31
Smol AI News news-outlet 1mo ago

The Claude Code Source Leak

**Anthropic's** closed-source coding product **Claude Code** experienced a significant source leak exposing over **500k lines** of orchestration logic, including autonomous modes and memory systems, but not model weights. The leak led to rapid public reverse-engineering,…

14
Smol AI News news-outlet 1mo ago

not much happened today

**Anthropic** introduced **Claude Cowork** and **Claude Code** enabling desktop control of mouse, keyboard, and screen in a **macOS research preview**, expanding agent capabilities beyond APIs and browsers. The agent ecosystem is evolving towards long-running, parallel,…

29
Smol AI News news-outlet 1mo ago

not much happened today

**Cursor's Composer 2**, built on **Kimi K2.5**, sparked discussion over model attribution and licensing, highlighting a shift toward post-trained derivatives of open-source models with domain-specific fine-tuning and reinforcement learning. **Claude Code** is expanding into…

36
Smol AI News news-outlet 1mo ago

not much happened today

**Cursor** launched **Composer 2**, a frontier-class coding model with major cost reductions and strong benchmark scores like **61.3 on CursorBench** and **73.7 on SWE-bench Multilingual**. The model was improved via a **first continued pretraining run** feeding into…

36
Smol AI News news-outlet 1mo ago

MiniMax 2.7: GLM-5 at 1/3 cost SOTA Open Model

**MiniMax M2.7** is the headline model release, described as a "self-evolving agent" with strong performance metrics including **56.22% on SWE-Pro**, **57.0% on Terminal Bench 2**, and parity with **Sonnet 4.6**. It features recursive self-improvement in skills, memory, and…

6
Smol AI News news-outlet 1mo ago

not much happened today

**OpenAI** released **GPT-5.4 mini** and **GPT-5.4 nano**, their most capable small models optimized for coding, multimodal understanding, and subagents, featuring a **400k context window** and over **2x speed** compared to GPT-5 mini. The mini model approaches larger GPT-5.4…

32
Smol AI News news-outlet 1mo ago

not much happened today

**Moonshot's Attention Residuals** paper introduced an input-dependent attention mechanism over prior layers with a **1.25x compute advantage** and less than **2% inference latency overhead**, validated on **Kimi Linear 48B total / 3B active**. The paper sparked debate on…

26
Smol AI News news-outlet 2mo ago

not much happened today

**MCP tools** remain relevant for deterministic APIs despite ergonomic criticisms, with new **web MCP support in Chrome v146** enabling continuous browsing agents. Persistent memory is emerging as a key differentiator for agents, with IBM improving task completion rates and…

5
Smol AI News news-outlet 2mo ago

not much happened today

**Harnesses, agent infrastructure, and the MCP protocol** are central themes, with emphasis on how **harnesses, sandboxes, filesystem access, skills, memory, and observability** shape agent UI/UX and runtime environments. Despite jokes about MCP's demise, it remains vital in…

26
Smol AI News news-outlet 2mo ago

not much happened today

**NVIDIA’s Nemotron 3 Super** is a **120B parameter / ~12B active** open model featuring a **hybrid Mamba-Transformer / SSM Latent MoE** architecture and **1M context window**, delivering up to **2.2x faster inference than GPT-OSS-120B** in FP4 with strong throughput gains. It…

10
Smol AI News news-outlet 2mo ago

Yann LeCun’s AMI Labs launches with a $1.03B seed to build world models around JEPA

**Yann LeCun** launched **Advanced Machine Intelligence (AMI Labs)** with a record **$1.03B seed round** at a **$3.5B pre-money valuation**, aiming to build AI models that understand the **physical world** through **world models** rather than just language prediction. The…

29
Smol AI News news-outlet 2mo ago

Autoresearch: Sparks of Recursive Self Improvement

**RSI** covers AI developments from 3/5/2026 to 3/9/2026, highlighting the emergence of **LLMs autonomously training smaller LLMs**, marking a significant "AutoML moment" in AI progress. **Karpathy** and **Yi Tay** discuss "vibe training," where AI models fix bugs and improve…

7
Smol AI News news-outlet 2mo ago

not much happened today

**OpenAI** rolled out **GPT-5.4**, achieving tied **#1** on the **Artificial Analysis Intelligence Index** with **Gemini 3.1 Pro Preview** scoring **57** (up from 51 for GPT-5.2 xhigh). GPT-5.4 features a larger **~1.05M token** context window and higher per-token prices…

12
Smol AI News news-outlet 2mo ago

GPT 5.4: SOTA Knowledge Work -and- Coding -and- CUA Model, OpenAI is so very back

**OpenAI** launched **GPT-5.4** and **GPT-5.4 Pro** with unified mainline and Codex models, featuring **native computer use**, up to **~1M token context**, and efficiency improvements including a new **Codex `/fast` mode**. Benchmarks showed strong results like…

25
Smol AI News news-outlet 2mo ago

not much happened today

**Gemini 3.1 Flash-Lite** is highlighted by **Demis Hassabis** for its speed and cost-efficiency, focusing on latency and cost per capability rather than raw performance. **NotebookLM Studio** introduces a new feature for generating immersive cinematic video overviews. Rumors…

20
Smol AI News news-outlet 2mo ago

not much happened today

**Google DeepMind** launched **Gemini 3.1 Flash-Lite**, emphasizing *dynamic thinking levels* for adjustable compute, with notable metrics like **$0.25/M input**, **$1.50/M output**, **1432 Elo on LMArena**, and **2.5× faster time-to-first-token** than Gemini 2.5 Flash. It…

35
Smol AI News news-outlet 2mo ago

not much happened today

**Alibaba** released the **Qwen 3.5** series with models ranging from **0.8B to 9B** parameters, featuring **native multimodality**, **scaled reinforcement learning**, and targeting **edge and lightweight agent** deployments. The models support very long context windows up to…

18
Smol AI News news-outlet 2mo ago

OpenAI closes $110B raise from Amazon, NVIDIA, SoftBank in largest startup fundraise in history @ $840B post-money

**OpenAI** has closed a major funding round totaling **$110 billion** at a **$730 billion pre-money valuation**, with investments from **SoftBank ($30B)**, **NVIDIA ($30B)**, and **Amazon ($50B)**. Key user metrics include **1.6 million weekly Codex users**, **over 9 million…

29
Smol AI News news-outlet 2mo ago

Nano Banana 2 aka Gemini 3.1 Flash Image Preview: the new SOTA Imagegen model

**Google and DeepMind** launched **Nano Banana 2** (aka **Gemini 3.1 Flash Image Preview**), a leading image generation and editing model integrated across multiple Google products with features like **4K upscaling**, **multi-subject consistency**, and **real-time…

29
Smol AI News news-outlet 2mo ago

Agentic Engineering: WTF Happened in December 2025?

**Perplexity** launched **Computer**, an orchestration-first agent platform featuring multi-model routing, usage-based pricing, and parallel asynchronous sub-agents for distributed workflows. **Andrej Karpathy** claims a "phase change" in coding agents since December,…

21
Smol AI News news-outlet 2mo ago

Anthropic accuses DeepSeek, Moonshot, and MiniMax of "industrial-scale distillation attacks".

**Anthropic** alleges *industrial-scale* distillation attacks on its **Claude** model by **DeepSeek**, **Moonshot AI**, and **MiniMax**, involving **~24,000 fraudulent accounts** and **>16M Claude exchanges** to extract capabilities, raising concerns about competitive risks and…

21
Smol AI News news-outlet 2mo ago

Claude Code Anniversary + Launches from: Qwen 3.5, Cursor Demos, Cognition Devin 2.2, Inception Mercury 2

**Alibaba** launched the **Qwen 3.5 Medium Model Series** featuring models like **Qwen3.5-Flash**, **Qwen3.5-35B-A3B (MoE)**, and **Qwen3.5-122B-A10B (MoE)** emphasizing efficiency over scale with innovations like **1M context** and INT4 quantization. **OpenAI** released…

14
Smol AI News news-outlet 2mo ago

not much happened today

**Gemini 3.1 Pro** demonstrates strong retrieval capabilities and cost efficiency compared to **GPT-5.2** and **Opus 4.6**, though users report tooling and UI issues. The **SWE-bench Verified** evaluation methodology is under scrutiny for consistency, with updates bringing…

27
Smol AI News news-outlet 2mo ago

Gemini 3.1 Pro: 2x 3.0 on ARC-AGI 2

**Google** released **Gemini 3.1 Pro**, a developer preview integrated across the **Gemini app**, **NotebookLM**, **Gemini API / AI Studio**, and **Vertex AI**, highlighting a significant reasoning improvement with **ARC-AGI-2 = 77.1%** and strong coding and agentic-tool…

10
Smol AI News news-outlet 2mo ago

not much happened today

**Anthropic** released **Claude Opus/Sonnet 4.6**, showing a significant intelligence index jump but with increased token usage and cost. **Anthropic** also shared insights on AI agent autonomy, highlighting human-in-the-loop prevalence and software engineering tool calls.…

5
Smol AI News news-outlet 2mo ago

Claude Sonnet 4.6: clean upgrade of 4.5, mostly better with some caveats

**Anthropic** launched **Claude Sonnet 4.6**, an upgrade over Sonnet 4.5, featuring broad improvements in **coding, long-context reasoning, agent planning, knowledge work, and design**, plus a **1M-token context window (beta)**. Benchmarks show Sonnet 4.6 leading on **GDPval-AA…

4
Smol AI News news-outlet 2mo ago

Qwen3.5-397B-A17B: the smallest Open-Opus class, very efficient model

**Alibaba** released **Qwen3.5-397B-A17B**, an open-weight model featuring **native multimodality**, **spatial intelligence**, and a **hybrid linear attention + sparse MoE** architecture supporting **201 languages** and **long context windows** up to **256K tokens**. The model…

35
Smol AI News news-outlet 2mo ago

MiniMax-M2.5: SOTA coding, search, toolcalls, $1/hour

**MiniMax-M2.5** is now open source, featuring an "agent-native" reinforcement learning framework called **Forge** trained across **200k+ RL environments** for coding, tool use, and workflows. It boasts strong benchmark scores like **80.2% SWE-Bench Verified** and emphasizes…

20
Smol AI News news-outlet 3mo ago

new Gemini 3 Deep Think, Anthropic $30B @ $380B, GPT-5.3-Codex Spark, MiniMax M2.5

**Google DeepMind** is rolling out the upgraded **Gemini 3 Deep Think V2** reasoning mode to **Google AI Ultra** subscribers and opening early access to the **Vertex AI / Gemini API** for select users. Key benchmark achievements include **ARC-AGI-2 at 84.6%**, **Humanity’s Last…

31
Smol AI News news-outlet 3mo ago

Z.ai GLM-5: New SOTA Open Weights LLM

**Zhipu AI** launched **GLM-5**, an **Opus-class** model scaling from **355B to 744B parameters** with **DeepSeek Sparse Attention** integration for cost-efficient long-context serving. GLM-5 achieves **SOTA on BrowseComp** and leads on **Vending Bench 2**, focusing on office…

18
Smol AI News news-outlet 3mo ago

Qwen-Image 2.0 and Seedance 2.0

**OpenAI** advances its Responses API for multi-hour agent workflows with features like **server-side compaction**, **hosted containers**, and **Skills API**, alongside upgrading **Deep Research** to **GPT-5.2** and adding connectors. Discussions around sandbox design highlight…

6
Smol AI News news-outlet 3mo ago

not much happened today

**OpenAI** launched **GPT-5.3-Codex** with a Super Bowl ad emphasizing "You can just build things" as a product strategy, focusing on builder tooling over chat interfaces. The model is rolling out across **Cursor, VS Code, and GitHub** with phased API access and is flagged as…

34
Smol AI News news-outlet 3mo ago

not much happened today

**AI News** for early February 2026 highlights a detailed comparison between **GPT-5.3-Codex** and **Claude Opus 4.6**, with users noting **Codex's** strength in detailed scoped tasks and **Opus's** ergonomic advantage for exploratory work. Benchmarks on Karpathy's **nanochat…

11
Smol AI News news-outlet 3mo ago

OpenAI and Anthropic go to war: Claude Opus 4.6 vs GPT 5.3 Codex

**OpenAI** launched **GPT-5.3-Codex**, emphasizing **token efficiency**, **inference speed**, and hardware/software co-design with **GB200-NVL72** and **NVIDIA** collaboration. The new **Frontier** agent platform supports business-context agents with execution environments and…

15
Smol AI News news-outlet 3mo ago

ElevenLabs $500m Series D at $11B, Cerebras $1B Series H at $23B, Vibe Coding -> Agentic Engineering

**Google's Gemini 3** is being integrated widely, including a new **Chrome side panel** and **Nano Banana** UX features, with rapid adoption and a **78% unit-cost reduction** in serving costs. The **Gemini app** reached **750M+ MAU** in Q4 2025, nearing ChatGPT's user base.…

23
Smol AI News news-outlet 3mo ago

Context Graphs: Hype or actually Trillion-dollar opportunity?

**Zhipu AI** launched **GLM-OCR**, a lightweight **0.9B** multimodal OCR model excelling in complex document understanding with top benchmark scores and day-0 deployment support from **lmsys**, **vllm**, and **novita labs**. **Ollama** enabled local-first usage with easy offline…

28
Smol AI News news-outlet 3mo ago

OpenAI Codex App: death of the VSCode fork, multitasking worktrees, Skills Automations

**OpenAI** launched the **Codex app** on macOS as a dedicated agent-native command center for coding, featuring **multiple agents in parallel**, **built-in worktrees** for conflict isolation, **skills** for reusable bundles, and **scheduled automations**. The app emphasizes…

19
Smol AI News news-outlet 3mo ago

MoltBook takes over the timeline

**Moltbook** and **OpenClaw** showcase emergent multi-agent social networks where AI agents autonomously interact, creating an AI-native forum layer with complex security and identity challenges. **Karpathy** describes this as "takeoff-adjacent," highlighting bots…

15
Smol AI News news-outlet 3mo ago

xAI Grok Imagine API - the #1 Video Model, Best Pricing and Latency - and merging with SpaceX

**Google DeepMind** launched **Project Genie (Genie 3 + Nano Banana Pro + Gemini)**, a prototype for creating interactive, real-time generated worlds from text or image prompts, currently available to **Google AI Ultra subscribers in the U.S. (18+)** with noted limitations like…

29
Smol AI News news-outlet 3mo ago

not much happened today

**AI News for 1/27/2026-1/28/2026** highlights a quiet day with deep dives into frontier model "personality split" where **GPT-5.2** excels at *exploration* and **Claude Opus 4.5** at *exploitation*, suggesting **OpenAI** suits research workflows and **Anthropic** commercial…

21
Smol AI News news-outlet 3mo ago

Moonshot Kimi K2.5 - Beats Sonnet 4.5 at half the cost, SOTA Open Model, first Native Image+Video, 100 parallel Agent Swarm manager

**MoonshotAI's Kimi K2.5** is a **32B active-1T parameter open-weights model** featuring **native multimodality** with image and video understanding, built through continual pretraining on **15 trillion mixed visual and text tokens**. It introduces a new **MoonViT vision…

22
Smol AI News news-outlet 3mo ago

Anthropic launches the MCP Apps open spec, in Claude.ai

**Anthropic** has officially absorbed the independent MCP UI project and, collaborating with **OpenAI**, **Block**, **VS Code**, **Antigravity**, **JetBrains**, and **AWS**, released the **MCP Apps spec** and official support in **Claude.ai**. This standard aims to enable a rich…

9
Smol AI News news-outlet 3mo ago

not much happened today

**Anthropic** launches "Claude in Excel Pro" with enhanced features. **OpenAI** reveals upcoming **Codex** agent loop and cybersecurity measures. **Google** boosts **Gemini App** quotas and partners with **Sakana AI** for advanced AI Scientist projects in Japan. **Cursor**…

25
Smol AI News news-outlet 3mo ago

OpenEvidence, the ‘ChatGPT for doctors,’ raises $250m at $12B valuation, 12x from $1b last Feb

**OpenEvidence** raised **$12 billion**, a 12x increase from last year, with usage by 40% of U.S. physicians and over $100 million in annual revenue. **Anthropic** released a new **Claude** model constitution under **CC0 1.0**, framing it as a living document for alignment and…

34
Smol AI News news-outlet 3mo ago

not much happened today

**X Engineering** open-sourced its new transformer-based recommender algorithm, sparking community debate on transparency and fairness. **GLM-4.7-Flash (30B-A3B)** gains momentum as a strong local inference model with efficient KV-cache management and quantization tuning…

32
Smol AI News news-outlet 3mo ago

not much happened today

**AI News for 1/16/2026-1/19/2026** covers new architectures for scaling Transformer memory and context, including **STEM** from **Carnegie Mellon** and **Meta AI**, which replaces part of the FFN with a token-indexed embedding lookup enabling CPU offload and asynchronous…

37
Smol AI News news-outlet 3mo ago

ChatGPT starts testing ads on free tier + new $8/mo Go plan in the US

**OpenAI** announced the **ChatGPT Go** tier at **$8/month** with ads testing in the US free tier, emphasizing that ads will not influence responses and will be clearly labeled. The update includes memory improvements and a "very fast Codex" feature teased by **Sam Altman**. The…

26
Smol AI News news-outlet 3mo ago

Open Responses: explicit spec for OpenAI's Responses API supported by OpenRouter, Ollama, Huggingface, vLLM, et al

**OpenAI** launched the **Open Responses** API spec, an open-source, multi-provider standard for interoperable LLM APIs designed to simplify agent stacks and tooling. Early adopters like **ollama** and **vLLM** support the spec, while notable absences include **anthropic** and…

4
Smol AI News news-outlet 3mo ago

not much happened today.

**OpenAI** launched **GPT-5.2-Codex** API, touted as their strongest coding model for long-running tasks and cybersecurity. **Cursor** integrated GPT-5.2-Codex to autonomously run a browser for a week, producing over 3 million lines of Rust code. **GitHub** incorporated it into…

19
Smol AI News news-outlet 4mo ago

Anthropic Labs: Cowork, Claude Code, MCP, Skills incubator led by Mike Krieger and Ben Mann

**Anthropic** consolidates its AI agent products under the **Cowork** brand, integrating prior tools like **Claude Code** and **Claude for Chrome** into a unified agent with sandboxed Linux VM environments using **Apple's virtualization** and **bubblewrap** for security.…

23
Smol AI News news-outlet 4mo ago

Apple picks Google's Gemini to power Siri's next generation

**Apple** has decided to power Siri with **Google's Gemini models** and cloud technology, marking a significant partnership and a setback for **OpenAI**, which was initially partnered with Apple. **Anthropic** launched "Cowork," a product preview for Claude's coding…

10
Smol AI News news-outlet 4mo ago

not much happened today

**Anthropic** tightens usage policies for **Claude Max** in third-party apps, prompting builders to adopt **model-agnostic orchestration** and **BYO-key** defaults to mitigate platform risks. The **Model Context Protocol (MCP)** is evolving into a key tooling plane with **OpenAI…

31
Smol AI News news-outlet 4mo ago

not much happened today

**Stanford paper** reveals **Claude 3.7 Sonnet** memorized **95.8% of Harry Potter 1**, highlighting copyright extraction risks compared to **GPT-4.1**. **Google AI Studio** sponsors **TailwindCSS** amid OSS funding debates. **Google** and **Sundar Pichai** launch **Gmail Gemini…

21
Smol AI News news-outlet 4mo ago

not much happened today

**AI News for 1/6/2026-1/7/2026** highlights a quiet day with key updates on **LangChain DeepAgents** introducing **Ralph Mode** for persistent agent loops, **Cursor** improving context management by reducing token usage by **46.9%**, and operational safety measures for coding…

26
Smol AI News news-outlet 4mo ago

xAI raises $20B Series E at ~$230B valuation

**xAI**, Elon Musk's AI company, completed a massive **$20 billion Series E funding round**, valuing it at about **$230 billion** with investors like **Nvidia**, **Cisco Investments**, and others. The funds will support AI infrastructure expansion including **Colossus I and II…

36
Smol AI News news-outlet 4mo ago

not much happened today

**AI News** from early January 2026 highlights a viral economic prediction about **Vietnam** surpassing Thailand, **Microsoft**'s reported open-sourcing of **bitnet.cpp** for 1-bit CPU inference promising speed and energy gains, and a new research partnership between **Google…

36
Smol AI News news-outlet 4mo ago

not much happened today

**DeepSeek** released a new paper on **mHC: Manifold-Constrained Hyper-Connections**, advancing residual-path design as a key scaling lever in neural networks. Their approach constrains residual mixing matrices to the **Birkhoff polytope** to improve stability and performance,…

13
Smol AI News news-outlet 4mo ago

not much happened today

**South Korea's Ministry of Science** launched a coordinated program with **5 companies** to develop sovereign foundation models from scratch, featuring large-scale MoE architectures like **SK Telecom A.X-K1 (519B total / 33B active)** and **LG K-EXAONE (236B MoE / 23B…

24
Smol AI News news-outlet 4mo ago

not much happened today

**Z.ai (GLM family) IPO in Hong Kong on Jan 8, 2026**, aiming to raise **$560M** at **HK$4.35B**, marking it as the "first AI-native LLM company" public listing. The IPO highlights **GLM-4.7** as a starting point. **Meta AI** acquired **Manus** for approximately **$4–5B**, with…

15
Smol AI News news-outlet 4mo ago

Meta Superintelligence Labs acquires Manus AI for over $2B, at $100M ARR, 9months after launch

**Manus** achieved a rapid growth trajectory in 2025, raising **$500M** from Benchmark and reaching **$100M ARR** before being acquired by **Meta** for an estimated **$4B**. The **vLLM** team launched a dedicated community site with new resources, while performance issues with…

30
Smol AI News news-outlet 4mo ago

not much happened today

**MiniMax M2.1** launches as an **open-source** agent and coding Mixture-of-Experts (MoE) model with **~10B active / ~230B total parameters**, claiming to outperform **Gemini 3 Pro** and **Claude Sonnet 4.5**, and supports local inference including on **Apple Silicon M3 Ultra**…

10
Smol AI News news-outlet 4mo ago

Nvidia buys (most of) Groq for $20B cash; largest execuhire ever

**Groq** leadership team is joining **Nvidia** under a "non-exclusive licensing agreement" in a deal valued at **$20 billion cash**, marking a major acquisition in AI chip space though Nvidia states it is not acquiring Groq as a company. Jensen Huang plans to integrate Groq's…

4
Smol AI News news-outlet 4mo ago

not much happened today

**GLM-4.7** and **MiniMax M2.1** open-weight model releases highlight day-0 ecosystem support, coding throughput, and agent workflows, with GLM-4.7 achieving a +9.5% improvement over GLM-4.6 and MiniMax M2.1 positioned as an OSS Claude-like MoE model with 230B total parameters…

18
Smol AI News news-outlet 4mo ago

not much happened today

**Zhipu AI's GLM-4.7** release marks a significant improvement in **coding, complex reasoning, and tool use**, quickly gaining ecosystem adoption via Hugging Face and OpenRouter. **Xiaomi's MiMo-V2-Flash** is highlighted as a practical, cost-efficient mixture-of-experts model…

30
Smol AI News news-outlet 4mo ago

not much happened today

**Alibaba** released **Qwen-Image-Layered**, an open-source model enabling Photoshop-grade layered image decomposition with recursive infinite layers and prompt-controlled structure. **Kling 2.6** introduced advanced motion control for image-to-video workflows, supported by a…

18

not much happened today

not much happened today

not much happened today

GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

Anthropic-SpaceXai's 300MW/$5B/yr deal for Colossus I, ARR growth is 8000% annualized

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

DeepSeek v4

GPT 5.5

not much happened today

GPT-Image-2

not much happened today

not much happened today

Anthropic's Claude Opus 4.7

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

Anthropic @ $30B ARR, Project GlassWing and Claude Mythos Preview — first model too dangerous to release since GPT-2

not much happened today

not much happened today

not much happened today

Gemma 4

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

The Claude Code Source Leak

not much happened today

not much happened today

not much happened today

MiniMax 2.7: GLM-5 at 1/3 cost SOTA Open Model

not much happened today

not much happened today

not much happened today

not much happened today

not much happened today

Yann LeCun’s AMI Labs launches with a $1.03B seed to build world models around JEPA

Autoresearch: Sparks of Recursive Self Improvement

not much happened today

GPT 5.4: SOTA Knowledge Work -and- Coding -and- CUA Model, OpenAI is so very back

not much happened today

not much happened today

not much happened today

OpenAI closes $110B raise from Amazon, NVIDIA, SoftBank in largest startup fundraise in history @ $840B post-money

Nano Banana 2 aka Gemini 3.1 Flash Image Preview: the new SOTA Imagegen model

Agentic Engineering: WTF Happened in December 2025?

Anthropic accuses DeepSeek, Moonshot, and MiniMax of "industrial-scale distillation attacks".

Claude Code Anniversary + Launches from: Qwen 3.5, Cursor Demos, Cognition Devin 2.2, Inception Mercury 2

not much happened today

Gemini 3.1 Pro: 2x 3.0 on ARC-AGI 2

not much happened today

Claude Sonnet 4.6: clean upgrade of 4.5, mostly better with some caveats

Qwen3.5-397B-A17B: the smallest Open-Opus class, very efficient model

MiniMax-M2.5: SOTA coding, search, toolcalls, $1/hour

new Gemini 3 Deep Think, Anthropic $30B @ $380B, GPT-5.3-Codex Spark, MiniMax M2.5

Z.ai GLM-5: New SOTA Open Weights LLM

Qwen-Image 2.0 and Seedance 2.0

not much happened today

not much happened today

OpenAI and Anthropic go to war: Claude Opus 4.6 vs GPT 5.3 Codex

ElevenLabs $500m Series D at $11B, Cerebras $1B Series H at $23B, Vibe Coding -> Agentic Engineering

Context Graphs: Hype or actually Trillion-dollar opportunity?

OpenAI Codex App: death of the VSCode fork, multitasking worktrees, Skills Automations

MoltBook takes over the timeline

xAI Grok Imagine API - the #1 Video Model, Best Pricing and Latency - and merging with SpaceX

not much happened today

Moonshot Kimi K2.5 - Beats Sonnet 4.5 at half the cost, SOTA Open Model, first Native Image+Video, 100 parallel Agent Swarm manager

Anthropic launches the MCP Apps open spec, in Claude.ai

not much happened today

OpenEvidence, the ‘ChatGPT for doctors,’ raises $250m at $12B valuation, 12x from $1b last Feb

not much happened today