Tag: Robotics · 184 articles archived under #robotics

r/LocalLLaMA · community · 1mo ago
X-Post of lightweight wheely robots. How / what are they running as the brains? Local? IoT-Style? Networked? Submitted by /u/Mchanger. 8

r/MachineLearning · community · 1mo ago
pipeline is really slow - consulting [D]. Hi, after a long debugging process and many discussions, I wanted to ask for advice from people who may have encountered similar training bottlenecks. My goal is imitation learning for robotics. Model / Pipeline: Observation space: 4 RGB robot cameras; image resolution: 128x128x3… 25

Hugging Face Daily Papers · research · 1mo ago
Minimalist Visual Inertial Odometry. Abstract: A minimalist visual-inertial odometry approach uses four photodiodes with optical Gabor masks and a temporal convolutional network to achieve accurate planar motion estimation for differential-drive robots. AI-generated summary: Visual-Inertial Odometry (VIO), which is… 17

arXiv — Machine Learning · research · 1mo ago
SCI-Defense: Defending Manipulation Attacks from Generative Engine Optimization. arXiv:2605.21948v1. Announce Type: new. Abstract: LLM-based ranking systems are vulnerable to Generative Engine Optimization (GEO) attacks, where adversaries inject semantic signals into product descriptions to artificially boost rankings. We propose SCI-Defense, a three-component… 30

arXiv — NLP / Computation & Language · research · 1mo ago
Reducing Political Manipulation with Consistency Training. arXiv:2605.22771v1. Announce Type: new. Abstract: Large language models (LLMs) exhibit systematic political bias across a variety of sensitive contexts. We find that LLMs handle counterpart topics from opposing political sides asymmetrically.
We refer to this phenomenon as covert… 25

Hacker News — AI on Front Page · community · 1mo ago
Waymo pauses Atlanta service as its robotaxis keep driving into floods. Article URL: https://techcrunch.com/2026/05/21/waymo-pauses-atlanta-service-as-its-robotaxis-keep-driving-into-floods/ · Comments URL: https://news.ycombinator.com/item?id=48225426 · Points: 201 · # Comments: 254 24

r/MachineLearning · community · 1mo ago
Looking for real world comparisons between WALL OSS, pi0.6, and OpenVLA [D]. I am choosing a baseline for a real manipulation stack and trying not to lose a month on setup that someone here has already done. Shortlist is OpenVLA, pi0.6, and WALL OSS from X Square Robot. OpenVLA is still the easiest reference point with lots of reproductions. pi0.6 looks… 21

arXiv — Machine Learning · research · 1mo ago
Mechanisms of Misgeneralization in Physical Sequence Modeling. arXiv:2605.20299v1. Announce Type: new. Abstract: Generative sequence models are often trained to plan motion in physical domains, from robotics to mechanical simulations. When constructing a dataset to train such a model, engineers may curate demonstrations to specify how… 10

Hugging Face Daily Papers · research · 1mo ago
Zero-Shot Sim-to-Real Robot Learning: A Dexterous Manipulation Study on Reactive Catching. Abstract: Domain-Randomized Instance Set (DRIS) enables robust policy learning for dexterous manipulation tasks by simultaneously representing multiple randomized instances, achieving strong sim-to-real transfer without extensive real-world fine-tuning. AI-generated summary… 19

Ars Technica — AI · news-outlet · 1mo ago
The Internet can't stop watching Figure AI's humanoid robots handling packages. Figure AI's 24/7 livestream showcases human soft spot for humanoid robots.
27

arXiv — Machine Learning · research · 1mo ago
EUPHORIA: Efficient Universal Planning via Hybrid Optimization for Robust Industrial Robotic Assembly. arXiv:2605.18872v1. Announce Type: new. Abstract: Robotic assembly in architectural construction faces a persistent bottleneck: existing planners are either highly specialized, requiring prohibitive retraining for every new geometric design, or operationally inefficient, treating… 38

arXiv — NLP / Computation & Language · research · 1mo ago
DECOR: Auditing LLM Deception via Information Manipulation Theory. arXiv:2605.19270v1. Announce Type: new. Abstract: Large language models can deceive by subtly manipulating truthful information -- omitting key facts, shifting focus, or obscuring meaning -- making such behavior difficult to detect. Existing black-box methods rely on… 7

TechCrunch — AI · news-outlet · 1mo ago
Google’s Genie world model can now simulate real streets with Street View. Google DeepMind is integrating Street View with Project Genie to create immersive, interactive world simulations for robotics, gaming, and travel, allowing users to explore environments, weather changes, and rare scenarios. 36

arXiv — Machine Learning · research · 1mo ago
World Model-Enabled Causal Digital Twins for Semantic Communications in Physical AI Systems. arXiv:2605.16547v1. Announce Type: new. Abstract: Semantic communication has emerged as a promising paradigm for enabling goal-oriented networking. However, most existing semantic communication solutions are tailored to one-shot tasks and optimize instantaneous performance. Hence,… 27

arXiv — NLP / Computation & Language · research · 1mo ago
A Pilot Benchmark for NL-to-FOL Translation in Planetary Exploration. arXiv:2605.17911v1. Announce Type: new. Abstract: Future planetary exploration envisions autonomous robotic agents operating under severe communication constraints, without global positioning, and with minimal human intervention.
In such environments, agents must not only perceive… 35

Hugging Face · official-blog · 1mo ago
Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation. Published May 18, 2026. Ting-Yun Chang, ting-yunc, nvidia; Miguel Martin, miguelmartin-nv, nvidia; Jonathan Allen, nv-spectralflight, nvidia; Ke Ding, kding1… 11

r/LocalLLaMA · community · 1mo ago
I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you. DystopiaBench runs 36 escalating scenarios across 6 dystopia types: Petrov: Autonomous weapons, nuclear override; Orwell: Mass surveillance, truth manipulation; Huxley: Behavioral conditioning, pleasure pacification; Basaglia: Coercive therapeutic control; LaGuardia: Regulatory… 22

Hugging Face Daily Papers · research · 1mo ago
MobileEgo Anywhere: Open Infrastructure for long horizon egocentric data on commodity hardware. Abstract: A mobile-based framework for collecting long-duration egocentric robot data using smartphone sensors, enabling large-scale training of vision-language-action models. AI-generated summary: The recent advancement of Vision Language Action (VLA) models has driven a critical… 12

arXiv — Machine Learning · research · 1mo ago
A Unified Perturbation Framework for Analyzing Leaderboard Stability and Manipulation. arXiv:2605.15761v1. Announce Type: new. Abstract: Evaluation leaderboards such as LMArena play a central role in benchmarking large language models by aggregating pairwise human preferences into model rankings, yet the robustness of these rankings remains poorly understood. We… 28

arXiv — NLP / Computation & Language · research · 1mo ago
PhysBrain 1.0 Technical Report. arXiv:2605.15298v1. Announce Type: cross. Abstract: Vision-language-action models have advanced rapidly, but robot trajectories alone provide limited coverage for learning broad physical understanding.
PhysBrain 1.0 studies a complementary route: converting large-scale human… 29

Hugging Face Daily Papers · research · 1mo ago
OmniHumanoid: Streaming Cross-Embodiment Video Generation with Paired-Free Adaptation. Abstract: OmniHumanoid enables cross-embodiment video generation by factorizing motion transfer and embodiment-specific adaptation, allowing scalable adaptation to new humanoid embodiments using unpaired data. AI-generated summary: Cross-embodiment video generation aims to… 7

Hugging Face Daily Papers · research · 1mo ago
DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo. Abstract: DexJoCo presents a benchmark and toolkit for dexterous manipulation with 11 functional tasks evaluating tool-use, bimanual coordination, and long-horizon execution, along with a low-cost data collection system and comprehensive model evaluation. AI-generated summary… 36

r/LocalLLaMA · community · 1mo ago
AllenAI has been iterating on their MolmoAct2 models for robotics. r/AllenAI is cooking with MolmoAct2, a 5B vision-language-action model for robot control. They keep releasing new fine-tunes on different kinds of robotics datasets, including (but not limited to, and they keep releasing new ones): https://huggingface.co/allenai/MolmoAct2-LIBERO… 31

r/LocalLLaMA · community · 1mo ago
Built a fully offline suitcase robot around a Jetson Orin NX SUPER 16GB. Gemma 4 E4B, ~200ms cached TTFT, 30+ sensors, no WiFi/BT/cellular. He has opinions. Sparky runs entirely on the Jetson. Gemma 4 E4B at Q4_K_M via llama.cpp with q8_0 KV cache and flash attention. 12K context, native system role, sampler defaults from the model card. Cached TTFT around 200ms, sustained 14-15 tok/s.
SenseVoiceSmall for STT, Piper for TTS with… 21

Hugging Face Daily Papers · research · 1mo ago
Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding. Abstract: Multi-agent pathfinding solver enhanced with learnable communication module improves coordination and performance while maintaining scalability. AI-generated summary: Multi-agent pathfinding (MAPF) is a widely used abstraction for multi-robot trajectory planning… 37

arXiv — Machine Learning · research · 1mo ago
WarmPrior: Straightening Flow-Matching Policies with Temporal Priors. arXiv:2605.13959v1. Announce Type: new. Abstract: Generative policies based on diffusion and flow matching have become a dominant paradigm for visuomotor robotic control. We show that replacing the standard Gaussian source distribution with WarmPrior, a simple temporally grounded… 19

arXiv — Machine Learning · research · 1mo ago
R2R2: Robust Representation for Intensive Experience Reuse via Redundancy Reduction in Self-Predictive Learning. arXiv:2605.14026v1. Announce Type: new. Abstract: For reinforcement learning in data-scarce domains like real-world robotics, intensive data reuse enhances efficiency but induces overfitting.
While prior works focus on critic bias, representation-level instability in… 4

arXiv — NLP / Computation & Language · research · 1mo ago
IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation. arXiv:2605.14712v1. Announce Type: cross. Abstract: Robot imitation data are often multimodal: similar visual-language observations may be followed by different action chunks because human demonstrators act with different short-horizon intents, task phases, or recent context.… 37

Hugging Face Daily Papers · research · 1mo ago
IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation. Abstract: IntentVLA is a history-conditioned visual-language action framework that improves robot imitation learning stability by encoding short-horizon intents from visual observations, addressing challenges from partial observability and ambiguous observations. AI-generated… 21

Hugging Face Daily Papers · research · 1mo ago
From Pixels to Concepts: Do Segmentation Models Understand What They Segment? Abstract: CAFE is a new benchmark for evaluating concept-faithful segmentation in promptable models through attribute-level counterfactual manipulation, revealing that accurate mask prediction does not guarantee semantic grounding.
AI-generated summary: Segmentation is a… 18

arXiv — Machine Learning · research · 1mo ago
Ergodic Trajectory Design by Learned Pushforward Maps: Provable Coverage via Conditional Flow Matching. arXiv:2605.13063v1. Announce Type: new. Abstract: Designing continuous trajectories whose time-averaged occupancy provably matches a prescribed spatial density (the ergodic coverage problem) is central to UAV-assisted data collection and sensing, robotic exploration, and… 21

Hugging Face Daily Papers · research · 1mo ago
RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data. Abstract: RoboEvolve combines vision-language and video generation models in a co-evolutionary framework to enable scalable robotic manipulation with improved data efficiency and continuous learning capabilities. AI-generated summary: The scalability of robotic manipulation is… 29

Hugging Face Daily Papers · research · 1mo ago
World Action Models: The Next Frontier in Embodied AI. Abstract: World Action Models unify predictive state modeling with action generation for embodied policy learning, forming a cohesive framework for understanding environment dynamics and action prediction. AI-generated summary: Vision-Language-Action (VLA) models have achieved… 15

Hugging Face Daily Papers · research · 1mo ago
World Model for Robot Learning: A Comprehensive Survey. Abstract: World models as predictive representations of environmental dynamics have become essential for robot learning, supporting policy learning, planning, and simulation across various embodied applications. AI-generated summary: World models, which are predictive… 12