Smol AI News
Smol AI News news-outlet 4d ago
not much happened today
**OpenAI** previewed **GPT-5.6** with three variants: **Sol** (flagship), **Terra** (mid-tier), and **Luna** (lower-cost), launching under a restricted rollout mandated by the U.S. government, limiting access to trusted partners. **Sol** boasts enhanced cybersecurity and safety…
35 -
Smol AI News news-outlet 5d ago
not much happened today
**Z.ai's GLM-5.2** leads in coding and agent benchmarks with top scores like **1595** on Code Arena: Frontend and **34.29%** reasoning accuracy with zero failures. Databricks improved GLM-5.2 speed to **392 tok/s** using hardware and optimizations. **Ornith-1.0**, a new…
13 -
Smol AI News news-outlet 6d ago
not much happened today
**OpenAI** announced **Jalapeño**, its first custom AI chip for LLM inference, built with **Broadcom** in a fast 9-month design cycle, aiming to control more of the AI stack and improve compute economics. Community analysis suggests Jalapeño features **216GB HBM3E**,…
30 -
Smol AI News news-outlet 7d ago
not much happened today
**Prime Intellect's `prime-rl` v0.6.0** advances agentic reinforcement learning infrastructure supporting **1 trillion parameter MoE models** with sub-5-minute step times and a **131k context GLM-5 agentic setup**. The release includes optimizations in inference, training, and…
37 -
Smol AI News news-outlet 8d ago
not much happened today
**OpenAI** expanded its **Daybreak** program with the **GPT-5.5-Cyber** model, focusing on closed-loop patch generation for cybersecurity, scanning over 30 million commits and covering major projects like cURL and Python. The release sparked debate on policy and export controls,…
36 -
Smol AI News news-outlet 11d ago
not much happened today
**GLM-5.2** emerges as a leading open-weight coding model rivaling **Opus 4.8** and **GPT-5.5** in software engineering tasks, emphasizing the strategic importance of open models for provider competition, on-prem deployment, and fine-tuning rights. Experts like **Patrick…
17 -
Smol AI News news-outlet 12d ago
not much happened today
**GLM-5.2** from **Zhipu** emerged as a leading open-weight model with innovative **IndexShare** sparse-attention enabling efficient **1M-token inference**, praised as comparable to **GPT-5.5** and **Opus 4.8** but lacking vision support. Other notable open models include…
18 -
Smol AI News news-outlet 18d ago
not much happened today
**Anthropic** suspended access to **Claude Fable 5** and **Mythos 5** due to **US export controls**, sparking a debate on **model sovereignty** and geopolitical risks for frontier AI vendors. **Artificial Analysis** updated its coding agent benchmark, replacing **SWE-Bench Pro**…
17 -
Smol AI News news-outlet 19d ago
not much happened today
**Anthropic** reversed its covert degradation policy on **Claude Fable 5** after public backlash, sparking debates on governance, transparency, and access to frontier AI models. The model shows strong capabilities with mixed benchmark results, including **87.8% on WeirdML** and…
19 -
Smol AI News news-outlet 19d ago
not much happened today
**Anthropic's Fable/Mythos export-control crisis** dominates AI news, highlighting the intersection of **national security** and frontier model access. Technical voices like **François Chollet** criticize opaque regulatory actions and advocate for **standardized benchmarks for…
6 -
Smol AI News news-outlet 20d ago
not much happened today
**Anthropic** faced backlash for silently degrading AI research capabilities in its **Fable/Mythos** models without clear disclosure, raising concerns about trust, reproducibility, and enterprise data retention policies. Despite controversy, **Fable 5** demonstrated strong…
15 -
Smol AI News news-outlet 21d ago
Anthropic Claude Fable 5
**Anthropic** released two major models: **Claude Fable 5** for general availability and **Claude Mythos 5** for restricted access, with fallback to **Claude Opus 4.8** for sensitive queries. **Fable 5** features a **1M-token context window** and pricing at **$10/million input…
24 -
Smol AI News news-outlet 22d ago
not much happened today
**FrontierCode** benchmark by **Cognition** highlights how hard coding tasks remain: the best model, **Opus 4.8**, scores only about **13%** on the hardest subset, indicating coding is less solved than benchmarks suggest. The trend toward using **loops** as a control…
5 -
Smol AI News news-outlet 25d ago
not much happened today
**Anthropic's Mythos/Opus cycle** sparked mixed reactions with praise for **Claude Mythos**'s one-shot workflows and concerns over **Opus 4.8** benchmark regressions. **Opus 4.7** showed strong chemistry task performance, "making Claude a chemist." **Sakana AI** launched an…
23 -
Smol AI News news-outlet 26d ago
not much happened today
**NVIDIA** released **Nemotron 3 Ultra**, a fully open **550B MoE** model with **55B active parameters** and **1M context**, optimized for long-running agent tasks with up to **5x speedup** and **30% cost reduction**. It features hybrid Mamba/attention, LatentMoE, native MTP,…
7 -
Smol AI News news-outlet 28d ago
not much happened today
**Microsoft** released the detailed technical report for **MAI-Thinking-1**, a generalist reasoning model trained without third-party distillation, achieving **97% on AIME 2025** and outperforming Sonnet 4.6 in human preference tests. The report was praised for transparency,…
7 -
Smol AI News news-outlet 29d ago
not much happened today
**NVIDIA** led open-source AI model releases with **Cosmos 3**, a comprehensive omnimodal world model unifying language, image, video, audio, and action using a Mixture-of-Transformers design, and **Nemotron 3 Ultra**, a **550B** parameter open-weight model noted for high…
33 -
Smol AI News news-outlet 1mo ago
not much happened today
**Anthropic** rolled out **Claude Opus 4.8**, which shows incremental improvements but mixed benchmark results, including better cooperation and coding behavior but some regressions in document parsing. Platform updates include mid-conversation system instructions enhancing long…
29 -
Smol AI News news-outlet 1mo ago
not much happened today
**AI News for 5/23/2026-5/26/2026** highlights a shift in AI product strategy emphasizing **model + harness + workflow + UI + memory + economics** over model quality alone. **OpenAI** released a significant **Codex** update with features like **appshots** and remote computer…
16 -
Smol AI News news-outlet 1mo ago
not much happened today
**Inference optimization** is increasingly architectural, with **EAGLE 3.1** improving speculative decoding and long-context handling in collaboration with **vLLM** and **TorchSpec**. **Perplexity** open-sourced a rebuilt **Unigram tokenizer** cutting CPU use by **5–6×** and…
15 -
Smol AI News news-outlet 1mo ago
not much happened today
**RAEv2** advances representation-first tokenization with **>10x faster convergence** and improved generation, tested on **text-to-image** and **world models**. **NVIDIA's Gated DeltaNet-2** innovates linear attention with channel-wise gates, outperforming **KDA** and…
23 -
Smol AI News news-outlet 1mo ago
not much happened today
**Agent infrastructure** is advancing with **LangSmith Engine** providing CI/CD loops for agents and **SmithDB** enabling low-latency querying for observability. **Cognition's Devin Auto-Triage** offers persistent automation for bug triage with memory and subagent structures.…
27 -
Smol AI News news-outlet 1mo ago
not much happened today
**Cerebras** made headlines with its **IPO**, a significant milestone for a company known for its contrarian hardware approach. **Cerebras CFO Bob Komin** emphasized the company's capability to serve **trillion-parameter models**, including internal **OpenAI 5.4…
35 -
Smol AI News news-outlet 1mo ago
not much happened today
**OpenAI** expanded **Codex** integration with the ChatGPT mobile app enabling remote task management and introduced Remote SSH, hooks, and programmatic tokens for enterprise automation. The IDE ecosystem is shifting to "agent-first" UX with **GitHub Copilot App** preview and…
26 -
Smol AI News news-outlet 1mo ago
not much happened today
**Cline, LangChain, Notion, and Cursor** advanced agent infrastructure and developer platforms with innovations like **Cline SDK**, **LangSmith Engine**, **SmithDB** (offering **12–15×** faster observability), and Notion's External Agents API integrating third-party agents such…
14 -
Smol AI News news-outlet 1mo ago
not much happened today
**Research-level reasoning benchmarks** are advancing with **439 new math problems** from **64 mathematicians** and expanded medical benchmarks in **Medmarks v1.0** covering **30 benchmarks** and **61 models**. **Google DeepMind's AI Co-Mathematician** achieves **48% on…
15 -
Smol AI News news-outlet 1mo ago
not much happened today
**Thinking Machines** previewed their new **native interaction models** designed for **full-duplex multimodal interaction** enabling real-time concurrent listening, speaking, watching, thinking, searching, and reacting, marking a shift beyond turn-based AI. This approach…
36 -
Smol AI News news-outlet 1mo ago
not much happened today
**OpenAI** rapidly expanded the **GPT-5.5** family with multiple variants including **gpt-image-2**, **GPT-5.5 Pro**, and **GPT-5.5 Cyber**, receiving positive feedback for efficiency and usability. **Codex** evolved into a long-running agent runtime with a new **/goal**…
35 -
Smol AI News news-outlet 1mo ago
not much happened today
**OpenAI** rolled out **GPT-5.5 Instant** as the new default for ChatGPT and API, enhancing **factuality, intelligence, image understanding, and tone** with stronger personalization features like saved memories and Gmail integration. OpenAI also shared infrastructure updates on…
28 -
Smol AI News news-outlet 1mo ago
not much happened today
**AI Twitter Recap** highlights the shift from model-centric AI to **context pipelines** and **agent orchestration** as key performance drivers. Notably, **gpt-5.2-codex** and **gpt-5.3-codex** showed significant benchmark improvements through prompt and middleware tuning. The…
16 -
Smol AI News news-outlet 1mo ago
not much happened today
**OpenAI** achieved a major math breakthrough by disproving a long-standing Erdős conjecture on the unit distance problem using a **general-purpose reasoning model**, marking a milestone in AI-driven formal science and long-horizon reasoning. The result was validated by prominent mathematicians…
8 -
Smol AI News news-outlet 1mo ago
not much happened today
**AI News for 5/4/2026-5/5/2026** highlights a shift in AI product development emphasizing **model + harness + workflow + UI + memory + economics** over model quality alone, with notable updates from **OpenAI Codex** and **Claude** including new features like **Appshots**,…
26 -
Smol AI News news-outlet 2mo ago
not much happened today
**xAI released Grok 4.3**, improving cost/performance with a **53 Intelligence Index score**, 4 points higher than Grok 4.20, and significant gains on **GDPval-AA** and **τ²-Bench Telecom**. However, accuracy tradeoffs raised reliability concerns. Community opinions are mixed,…
32 -
Smol AI News news-outlet 2mo ago
not much happened today
**OpenAI's GPT-5.5** achieves top-tier performance in long-horizon cyber tasks, matching or surpassing **Claude Mythos Preview** with a **71.4%** pass rate and showing ongoing improvement beyond **100M tokens** inference. OpenAI also released an **Advanced Account Security**…
32 -
Smol AI News news-outlet 2mo ago
not much happened today
**OpenAI** is expanding **Codex** from a coding tool to a general work surface with persistent context, tools, integrations, and team rollout, including **Codex-only seats with $0 seat fee** for Business/Enterprise customers through June. Performance improvements focus on…
23 -
Smol AI News news-outlet 2mo ago
not much happened today
**vLLM v0.20.0** introduces significant improvements in memory and MoE serving efficiency, including **TurboQuant 2-bit KV cache** for **4× KV capacity** and a **2.1% latency improvement**. The update supports multiple hardware platforms like **DeepSeek V4 MegaMoE on…
9 -
Smol AI News news-outlet 2mo ago
not much happened today
**OpenAI** loosens its **Azure exclusivity**, allowing distribution across **Google TPU**, **AWS Trainium**, and **Bedrock** with commitments through **2032** and revenue share through **2030**. **GPT-5.5** shows improved benchmarks but is not uniformly dominant, ranking…
11 -
Smol AI News news-outlet 2mo ago
DeepSeek v4
**DeepSeek-V4** technical release features a **1.6T-parameter MoE with 49B active parameters** and **1M-token context**, showcasing hybrid attention and compressed KV schemes for major memory reductions. It ranks as the **#2 open-weights reasoning model** behind **Kimi K2.6**…
13 -
Smol AI News news-outlet 2mo ago
GPT 5.5
**OpenAI launched GPT-5.5** as its new flagship model for "real work and powering agents," immediately available in ChatGPT and Codex but with delayed API access due to enhanced safety requirements. The model features improved token efficiency and supports longer multi-step…
14 -
Smol AI News news-outlet 2mo ago
not much happened today
**Alibaba** released **Qwen3.6-27B**, a dense, Apache 2.0 open coding model with thinking and non-thinking modes, outperforming the larger Qwen3.5-397B-A17B on multiple coding benchmarks including SWE-bench and Terminal-Bench. It supports native vision-language reasoning over…
15 -
Smol AI News news-outlet 2mo ago
GPT-Image-2
**OpenAI** launched **GPT-Image-2**, enhancing image generation with improved text rendering, layout fidelity, editing, multilingual support, and "thinking" capabilities. It supports generating slides, infographics, diagrams, UI mockups, and QR codes, and integrates with tools…
36 -
Smol AI News news-outlet 2mo ago
not much happened today
**Moonshot's Kimi K2.6** is a major open-weight **1T-parameter MoE** model featuring **32B active parameters**, **384 experts**, **MLA attention**, **256K context window**, native multimodality, and **INT4 quantization**. It supports day-0 integration with platforms like…
9 -
Smol AI News news-outlet 2mo ago
not much happened today
**Anthropic** launched **Claude Design**, a prototyping tool powered by **Claude Opus 4.7**, targeting design workflows and competing with **Figma** and others. Benchmarks show **Opus 4.7** leading in coding and text tasks, with improved efficiency and adaptive reasoning, though…
7