Tag

Robotics

184 articles archived under #robotics

r/LocalLLaMA community 1mo ago

X-Post of lightweight wheely robots. How / what are they running as the brains? Local? IoT-Style? Networked?

submitted by /u/Mchanger

8
r/MachineLearning community 1mo ago

pipeline is really slow - consulting [D]

Hi, after a long debugging process and many discussions, I wanted to ask for advice from people who may have encountered similar training bottlenecks. My goal is imitation learning for robotics. Model / Pipeline Observation space: 4 RGB robot cameras image resolution: 128x128x3…

25
Hugging Face Daily Papers research 1mo ago

Minimalist Visual Inertial Odometry

Abstract A minimalist visual-inertial odometry approach uses four photodiodes with optical Gabor masks and a temporal convolutional network to achieve accurate planar motion estimation for differential-drive robots. AI-generated summary Visual-Inertial Odometry (VIO), which is…

17
arXiv — Machine Learning research 1mo ago

SCI-Defense: Defending Manipulation Attacks from Generative Engine Optimization

arXiv:2605.21948v1 Announce Type: new Abstract: LLM-based ranking systems are vulnerable to Generative Engine Optimization (GEO) attacks, where adversaries inject semantic signals into product descriptions to artificially boost rankings. We propose SCI-Defense, a three-component…

30
arXiv — NLP / Computation & Language research 1mo ago

Reducing Political Manipulation with Consistency Training

arXiv:2605.22771v1 Announce Type: new Abstract: Large language models (LLMs) exhibit systematic political bias across a variety of sensitive contexts. We find that LLMs handle counterpart topics from opposing political sides asymmetrically. We refer to this phenomenon as covert…

25
Hacker News — AI on Front Page community 1mo ago

Waymo pauses Atlanta service as its robotaxis keep driving into floods

Article URL: https://techcrunch.com/2026/05/21/waymo-pauses-atlanta-service-as-its-robotaxis-keep-driving-into-floods/ Comments URL: https://news.ycombinator.com/item?id=48225426 Points: 201 # Comments: 254

24
r/MachineLearning community 1mo ago

Looking for real world comparisons between WALL OSS, pi0.6, and OpenVLA [D]

I am choosing a baseline for a real manipulation stack and trying not to lose a month on setup that someone here has already done. Shortlist is OpenVLA, pi0.6, and WALL OSS from X Square Robot. OpenVLA is still the easiest reference point with lots of reproductions. pi0.6 looks…

21
arXiv — Machine Learning research 1mo ago

Mechanisms of Misgeneralization in Physical Sequence Modeling

arXiv:2605.20299v1 Announce Type: new Abstract: Generative sequence models are often trained to plan motion in physical domains, from robotics to mechanical simulations. When constructing a dataset to train such a model, engineers may curate demonstrations to specify how…

10
Hugging Face Daily Papers research 1mo ago

Zero-Shot Sim-to-Real Robot Learning: A Dexterous Manipulation Study on Reactive Catching

Abstract Domain-Randomized Instance Set (DRIS) enables robust policy learning for dexterous manipulation tasks by simultaneously representing multiple randomized instances, achieving strong sim-to-real transfer without extensive real-world fine-tuning. AI-generated summary…

19
Ars Technica — AI news-outlet 1mo ago

The Internet can't stop watching Figure AI's humanoid robots handling packages

Figure AI's 24/7 livestream showcases human soft spot for humanoid robots.

27
arXiv — Machine Learning research 1mo ago

EUPHORIA: Efficient Universal Planning via Hybrid Optimization for Robust Industrial Robotic Assembly

arXiv:2605.18872v1 Announce Type: new Abstract: Robotic assembly in architectural construction faces a persistent bottleneck: existing planners are either highly specialized, requiring prohibitive retraining for every new geometric design, or operationally inefficient, treating…

38
arXiv — NLP / Computation & Language research 1mo ago

DECOR: Auditing LLM Deception via Information Manipulation Theory

arXiv:2605.19270v1 Announce Type: new Abstract: Large language models can deceive by subtly manipulating truthful information -- omitting key facts, shifting focus, or obscuring meaning -- making such behavior difficult to detect. Existing black-box methods rely on…

7
TechCrunch — AI news-outlet 1mo ago

Google’s Genie world model can now simulate real streets with Street View

Google DeepMind is integrating Street View with Project Genie to create immersive, interactive world simulations for robotics, gaming, and travel, allowing users to explore environments, weather changes, and rare scenarios.

36
arXiv — Machine Learning research 1mo ago

World Model-Enabled Causal Digital Twins for Semantic Communications in Physical AI Systems

arXiv:2605.16547v1 Announce Type: new Abstract: Semantic communication has emerged as a promising paradigm for enabling goal-oriented networking. However, most existing semantic communication solutions are tailored to one-shot tasks and optimize instantaneous performance. Hence,…

27
arXiv — NLP / Computation & Language research 1mo ago

A Pilot Benchmark for NL-to-FOL Translation in Planetary Exploration

arXiv:2605.17911v1 Announce Type: new Abstract: Future planetary exploration envisions autonomous robotic agents operating under severe communication constraints, without global positioning, and with minimal human intervention. In such environments, agents must not only perceive…

35
Hugging Face official-blog 1mo ago

Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

Published May 18, 2026. By Ting-Yun Chang (ting-yunc, nvidia), Miguel Martin (miguelmartin-nv, nvidia), Jonathan Allen (nv-spectralflight, nvidia), Ke Ding kding1…

11
r/LocalLLaMA community 1mo ago

I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you.

DystopiaBench runs 36 escalating scenarios across 6 dystopia types: Petrov: Autonomous weapons, nuclear override; Orwell: Mass surveillance, truth manipulation; Huxley: Behavioral conditioning, pleasure pacification; Basaglia: Coercive therapeutic control; LaGuardia: Regulatory…

22
Hugging Face Daily Papers research 1mo ago

MobileEgo Anywhere: Open Infrastructure for long horizon egocentric data on commodity hardware

Abstract A mobile-based framework for collecting long-duration egocentric robot data using smartphone sensors, enabling large-scale training of vision-language-action models. AI-generated summary The recent advancement of Vision Language Action (VLA) models has driven a critical…

12
arXiv — Machine Learning research 1mo ago

A Unified Perturbation Framework for Analyzing Leaderboard Stability and Manipulation

arXiv:2605.15761v1 Announce Type: new Abstract: Evaluation leaderboards such as LMArena play a central role in benchmarking large language models by aggregating pairwise human preferences into model rankings, yet the robustness of these rankings remains poorly understood. We…

28
arXiv — NLP / Computation & Language research 1mo ago

PhysBrain 1.0 Technical Report

arXiv:2605.15298v1 Announce Type: cross Abstract: Vision-language-action models have advanced rapidly, but robot trajectories alone provide limited coverage for learning broad physical understanding. PhysBrain 1.0 studies a complementary route: converting large-scale human…

29
Hugging Face Daily Papers research 1mo ago

OmniHumanoid: Streaming Cross-Embodiment Video Generation with Paired-Free Adaptation

Abstract OmniHumanoid enables cross-embodiment video generation by factorizing motion transfer and embodiment-specific adaptation, allowing scalable adaptation to new humanoid embodiments using unpaired data. AI-generated summary Cross-embodiment video generation aims to…

7
Hugging Face Daily Papers research 1mo ago

DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo

Abstract DexJoCo presents a benchmark and toolkit for dexterous manipulation with 11 functional tasks evaluating tool-use, bimanual coordination, and long-horizon execution, along with a low-cost data collection system and comprehensive model evaluation. AI-generated summary…

36
r/LocalLLaMA community 1mo ago

AllenAI has been iterating on their MolmoAct2 models for robotics

r/AllenAI is cooking with MolmoAct2, a 5B vision-language-action model for robot control. They keep releasing new fine-tunes on different kinds of robotics datasets, including (but not limited to): https://huggingface.co/allenai/MolmoAct2-LIBERO…

31
r/LocalLLaMA community 1mo ago

Built a fully offline suitcase robot around a Jetson Orin NX SUPER 16GB. Gemma 4 E4B, ~200ms cached TTFT, 30+ sensors, no WiFi/BT/cellular. He has opinions.

Sparky runs entirely on the Jetson. Gemma 4 E4B at Q4_K_M via llama.cpp with q8_0 KV cache and flash attention. 12K context, native system role, sampler defaults from the model card. Cached TTFT around 200ms, sustained 14-15 tok/s. SenseVoiceSmall for STT, Piper for TTS with…

21
Hugging Face Daily Papers research 1mo ago

Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding

Abstract Multi-agent pathfinding solver enhanced with learnable communication module improves coordination and performance while maintaining scalability. AI-generated summary Multi-agent pathfinding (MAPF) is a widely used abstraction for multi-robot trajectory planning…

37
arXiv — Machine Learning research 1mo ago

WarmPrior: Straightening Flow-Matching Policies with Temporal Priors

arXiv:2605.13959v1 Announce Type: new Abstract: Generative policies based on diffusion and flow matching have become a dominant paradigm for visuomotor robotic control. We show that replacing the standard Gaussian source distribution with WarmPrior, a simple temporally grounded…

19
arXiv — Machine Learning research 1mo ago

R2R2: Robust Representation for Intensive Experience Reuse via Redundancy Reduction in Self-Predictive Learning

arXiv:2605.14026v1 Announce Type: new Abstract: For reinforcement learning in data-scarce domains like real-world robotics, intensive data reuse enhances efficiency but induces overfitting. While prior works focus on critic bias, representation-level instability in…

4
arXiv — NLP / Computation & Language research 1mo ago

IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation

arXiv:2605.14712v1 Announce Type: cross Abstract: Robot imitation data are often multimodal: similar visual-language observations may be followed by different action chunks because human demonstrators act with different short-horizon intents, task phases, or recent context.…

37
Hugging Face Daily Papers research 1mo ago

IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation

Abstract IntentVLA is a history-conditioned visual-language action framework that improves robot imitation learning stability by encoding short-horizon intents from visual observations, addressing challenges from partial observability and ambiguous observations. AI-generated…

21
Hugging Face Daily Papers research 1mo ago

From Pixels to Concepts: Do Segmentation Models Understand What They Segment?

Abstract CAFE is a new benchmark for evaluating concept-faithful segmentation in promptable models through attribute-level counterfactual manipulation, revealing that accurate mask prediction does not guarantee semantic grounding. AI-generated summary Segmentation is a…

18
arXiv — Machine Learning research 1mo ago

Ergodic Trajectory Design by Learned Pushforward Maps: Provable Coverage via Conditional Flow Matching

arXiv:2605.13063v1 Announce Type: new Abstract: Designing continuous trajectories whose time-averaged occupancy provably matches a prescribed spatial density (the ergodic coverage problem) is central to UAV-assisted data collection and sensing, robotic exploration, and…

21
Hugging Face Daily Papers research 1mo ago

RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data

Abstract RoboEvolve combines vision-language and video generation models in a co-evolutionary framework to enable scalable robotic manipulation with improved data efficiency and continuous learning capabilities. AI-generated summary The scalability of robotic manipulation is…

29
Hugging Face Daily Papers research 1mo ago

World Action Models: The Next Frontier in Embodied AI

Abstract World Action Models unify predictive state modeling with action generation for embodied policy learning, forming a cohesive framework for understanding environment dynamics and action prediction. AI-generated summary Vision-Language-Action (VLA) models have achieved…

15
Hugging Face Daily Papers research 1mo ago

World Model for Robot Learning: A Comprehensive Survey

Abstract World models as predictive representations of environmental dynamics have become essential for robot learning, supporting policy learning, planning, and simulation across various embodied applications. AI-generated summary World models, which are predictive…

12
