Ahead of AI (Sebastian Raschka)
20 articles archived
-
Ahead of AI (Sebastian Raschka) research 25d ago
My Workflow for Understanding LLM Architectures
A learning-oriented workflow for understanding new open-weight model releases
14 -
Ahead of AI (Sebastian Raschka) research 1mo ago
Components of A Coding Agent
How coding agents use tools, memory, and repo context to make LLMs work better in practice
7 -
Ahead of AI (Sebastian Raschka) research 1mo ago
A Visual Guide to Attention Variants in Modern LLMs
From MHA and GQA to MLA, sparse attention, and hybrid architectures
38 -
Ahead of AI (Sebastian Raschka) research 2mo ago
A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026
A Round Up And Comparison of 10 Open-Weight LLM Releases in Spring 2026
30 -
Ahead of AI (Sebastian Raschka) research 3mo ago
Categories of Inference-Time Scaling for Improved LLM Reasoning
And an Overview of Recent Inference-Scaling Papers
11 -
Ahead of AI (Sebastian Raschka) research 4mo ago
The State Of LLMs 2025: Progress, Problems, and Predictions
A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026.
33 -
Ahead of AI (Sebastian Raschka) research 4mo ago
LLM Research Papers: The 2025 List (July to December)
In June, I shared a bonus article with my curated and bookmarked research paper lists to the paid subscribers who make this Substack possible.
25 -
Ahead of AI (Sebastian Raschka) research 5mo ago
From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates
Understanding How DeepSeek's Flagship Open-Weight Models Evolved
34 -
Ahead of AI (Sebastian Raschka) research 6mo ago
Beyond Standard LLMs
Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers
29 -
Ahead of AI (Sebastian Raschka) research 7mo ago
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch)
Multiple-Choice Benchmarks, Verifiers, Leaderboards, and LLM Judges with Code Examples
29 -
Ahead of AI (Sebastian Raschka) research 8mo ago
Understanding and Implementing Qwen3 From Scratch
A Detailed Look at One of the Leading Open-Source LLMs
14 -
Ahead of AI (Sebastian Raschka) research 9mo ago
From GPT-2 to gpt-oss: Analyzing the Architectural Advances
And How They Stack Up Against Qwen3
15 -
Ahead of AI (Sebastian Raschka) research 9mo ago
The Big LLM Architecture Comparison
From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design
21 -
Ahead of AI (Sebastian Raschka) research 10mo ago
LLM Research Papers: The 2025 List (January to June)
A topic-organized collection of 200+ LLM research papers from 2025
37 -
Ahead of AI (Sebastian Raschka) research 11mo ago
Understanding and Coding the KV Cache in LLMs from Scratch
KV caches are one of the most critical techniques for efficient inference in LLMs in production.
5 -
Ahead of AI (Sebastian Raschka) research 12mo ago
The State of Reinforcement Learning for LLM Reasoning
Understanding GRPO and New Insights from Reasoning Model Papers
25 -
Ahead of AI (Sebastian Raschka) research 13mo ago
First Look at Reasoning From Scratch: Chapter 1
Welcome to the next stage of large language models (LLMs): reasoning. LLMs have transformed how we process and generate text, but their success has been largely driven by statistical pattern recognition. However, new advances in reasoning methodologies now enable LLMs to tackle…
25 -
Ahead of AI (Sebastian Raschka) research 14mo ago
The State of LLM Reasoning Model Inference
Inference-Time Compute Scaling Methods to Improve Reasoning Models
26 -
Ahead of AI (Sebastian Raschka) research 15mo ago
Understanding Reasoning LLMs
Methods and Strategies for Building and Refining Reasoning Models
26