Nonint (James Betker)
-
Nonint (James Betker) research 11mo ago
Vibe Coding
I had a pretty incredible vibe coding experience with o3 today, as I’m sure many of you have had recently, whether with o3, Claude, or Gemini. I was iterating on a problem with it over a couple of hours. I asked it to come up with an idea for a novel…
26 -
Nonint (James Betker) research 12mo ago
Mixture of Experts
A Transformer is a stack of alternating Attention and MLP layers through which data, embedded as high-dimensional vectors, is fed. A Mixture of Experts (MoE) Transformer replaces the MLP layer with an “MoE Layer”. Let’s dive into what that means. The…
23 -
Nonint (James Betker) research 14mo ago
The Paradigm
Over the past decade, some of the most remarkable AI breakthroughs—AlphaGo, AlphaStar, AlphaFold1, VPT, OpenAI Five, ChatGPT—have all shared a common thread: they start with large-scale data gathering (self-supervised or imitation learning, or SSL) and then use reinforcement…
28 -
Nonint (James Betker) research 16mo ago
Beating ARC the hard way
ARC is a benchmark developed to test out-of-distribution reasoning and common sense in general solvers. It is specifically designed to be: easily solvable by most humans; not amenable to any kind of brute-force solver (e.g. trying every permutation of a solution); not able to be…
4 -
Nonint (James Betker) research 23mo ago
General Intelligence (2024)
Folks in the field of AI like to make predictions for AGI. I have thoughts, and I’ve always wanted to write them down. Let’s do that. Since this isn’t something I’ve touched on in the past, I’ll start by doing my best to define what I mean by “general…
27 -
Nonint (James Betker) research 24mo ago
GPT-4o
I’m very pleased to show the world GPT-4o. I came into the project mid-last year with Alexis Conneau with the goal of scaling up speech models and building an “AudioLM”. We knew we had something special late last year, but I don’t think either of us…
22 -
Nonint (James Betker) research 26mo ago
Research Code
At my job, I’m currently in a cycle that involves working with software engineers quite a bit. One thing that has happened a number of times is that a software engineer will bring up “research code” in a condescending tone. The implication is that…
20 -
Nonint (James Betker) research 26mo ago
Learned Structures
From 2019-2021, I was fascinated with neural network architectures. I think a lot of researchers in the field were at the time. The transformer paper had been out for a little while and it was starting to sink in how transformational it was going to be. The general question in…
27 -
Nonint (James Betker) research 28mo ago
go/rulesofthumb
Google has a neat internal website called “Rules of Thumb”, which compares the marginal cost of computational resources to the unit of a “SWE”. “SWE” refers to “Software Engineer” – which itself is the marginal cost to pay…
12 -
Nonint (James Betker) research 30mo ago
Compute Multipliers
I’ve listened to a couple of interviews with Dario Amodei, CEO of Anthropic, this year. In both of them, he dropped the term “compute multiplier” a few times. This concept is exceptionally important in the field of ML, and I don’t see it talked about…
18