AllenAI has been iterating on their MolmoAct2 models for robotics
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
r/AllenAI is cooking with MolmoAct2, a 5B vision-language-action model for robot control. They keep releasing new fine-tunes on different kinds of robotics datasets, including (but not limited to):
https://huggingface.co/allenai/MolmoAct2-LIBERO - general robotics tasks
https://huggingface.co/allenai/MolmoAct2-DROID - interactive robotics tasks
https://huggingface.co/allenai/MolmoAct2-BimanualYAM - absolute joint-pose control
https://huggingface.co/allenai/MolmoAct2-SO100_101 - also absolute joint-pose control
AllenAI has released these as fully open-source models, publishing not only the weights but also the complete training datasets (including pretraining data), the training source code, and technical papers describing the theory, training, and evaluation of these models.
If you are fiddling with robots controlled via LLM inference, give the MolmoAct2 models a look.
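For anyone who wants to pull one of these checkpoints locally, here is a minimal sketch. It assumes the `huggingface_hub` package is installed and only fetches the repository files; it does not show inference, which depends on the loading instructions in each model card. The helper names `repo_id_from_url` and `download` are my own, not part of any AllenAI tooling.

```python
from urllib.parse import urlparse

# Model pages listed in the post.
MODEL_URLS = [
    "https://huggingface.co/allenai/MolmoAct2-LIBERO",
    "https://huggingface.co/allenai/MolmoAct2-DROID",
    "https://huggingface.co/allenai/MolmoAct2-BimanualYAM",
    "https://huggingface.co/allenai/MolmoAct2-SO100_101",
]


def repo_id_from_url(url: str) -> str:
    """Turn a Hugging Face model page URL into an 'org/name' repo id."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model page URL: {url}")
    return "/".join(parts[:2])


def download(repo_id: str) -> str:
    """Download every file in the repo and return the local snapshot path."""
    # Imported lazily so the URL parsing above works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


REPO_IDS = [repo_id_from_url(url) for url in MODEL_URLS]

if __name__ == "__main__":
    # Only list the repo ids here; call download(repo_id) explicitly,
    # since each snapshot is a multi-gigabyte transfer.
    for repo_id in REPO_IDS:
        print(repo_id)
```

Calling `download(REPO_IDS[0])` would fetch the LIBERO fine-tune into the local Hugging Face cache. Check the model card first for the expected loading code and hardware requirements.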