Hugging Face Daily Papers

65 articles archived

Hugging Face Daily Papers research 1h ago

Images in Sentences: Scaling Interleaved Instructions for Unified Visual Generation

Abstract INSET is a unified multimodal model that embeds images as native vocabulary within textual instructions, enabling better handling of complex interleaved inputs through transformer-based contextual locality and supporting both image generation and editing tasks.…

34
Hugging Face Daily Papers research 1h ago

Reward Hacking in Rubric-Based Reinforcement Learning

Abstract Research examines reward hacking in rubric-based reinforcement learning, identifying verifier failure and rubric-design limitations as key sources of divergence between training and evaluation metrics. AI-generated summary Reinforcement learning with verifiable rewards…

31
Hugging Face Daily Papers research 1h ago

VidSplat: Gaussian Splatting Reconstruction with Geometry-Guided Video Diffusion Priors

Abstract VidSplat is a training-free generative reconstruction framework that uses video diffusion priors to synthesize novel views and recover complete 3D scenes from sparse inputs through adaptive denoising and iterative refinement. AI-generated summary Gaussian Splatting has…

29
Hugging Face Daily Papers research 1h ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Abstract Training efficiency is improved by strategically allocating scarce labeled data through staged reinforcement learning and dense supervision, using sparse rewards for teacher model discovery and dense rewards for student model compression. AI-generated summary In…

35
Hugging Face Daily Papers research 1h ago

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment

Abstract FATE is an on-policy framework that uses failure trajectories to improve agent safety and performance through self-evolution and Pareto-aware optimization. AI-generated summary Tool-using LLM agents fail through trajectories rather than only final responses, as they may…

10
Hugging Face Daily Papers research 3h ago

LLM Agents Already Know When to Call Tools -- Even Without Reasoning

Abstract The When2Tool benchmark identifies conditions under which tool calls are necessary for LLM agents, revealing that models can predict tool necessity from hidden states but fail to act on this knowledge, leading to the development of the Probe&Prefill method that reduces…

15
Hugging Face Daily Papers research 3h ago

Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts

Abstract A local distribution-aware detection framework that amplifies micro-scale statistical irregularities to identify AI-generated images with improved accuracy. AI-generated summary Recent generative models can produce images that appear highly realistic, raising challenges…

26
Hugging Face Daily Papers research 4h ago

Solve the Loop: Attractor Models for Language and Reasoning

Abstract Attractor Models enable efficient iterative refinement through fixed-point solving with implicit differentiation, achieving superior language modeling and reasoning performance with reduced computational costs compared to traditional transformers. AI-generated summary…

5
Hugging Face Daily Papers research 4h ago

A Single Layer to Explain Them All: Understanding Massive Activations in Large Language Models

Abstract Massive activation emergence in LLMs occurs consistently across model families at a specific layer, where RMSNorm and FFN parameters jointly contribute, leading to reduced hidden representation diversity that can be mitigated through a proposed method improving…

36
Hugging Face Daily Papers research 5h ago

Urban-ImageNet: A Large-Scale Multi-Modal Dataset and Evaluation Framework for Urban Space Perception

Abstract Urban-ImageNet presents a large-scale multi-modal dataset and evaluation benchmark for urban space perception from social media imagery, organized under a hierarchical taxonomy for scene classification, cross-modal retrieval, and instance segmentation tasks.…

36
Hugging Face Daily Papers research 5h ago

Learning, Fast and Slow: Towards LLMs That Adapt Continually

Abstract A fast-slow learning framework for large language models combines fixed parameters with optimized context to achieve better sample efficiency, reduced catastrophic forgetting, and improved adaptability in continual learning scenarios. AI-generated summary Large language…

22
Hugging Face Daily Papers research 6h ago

Efficient Pre-Training with Token Superposition

Abstract Token-Superposition Training (TST) improves pre-training efficiency by combining contiguous tokens into bags during a superposition phase with multi-hot cross-entropy objective, achieving faster training times without architectural changes. AI-generated summary…

30
Hugging Face Daily Papers research 6h ago

Agent-BRACE: Decoupling Beliefs from Actions in Long-Horizon Tasks via Verbalized State Uncertainty

Abstract Agent-BRACE decomposes LLM agents into belief state and policy models, using structured textual claims with certainty labels to handle partial observability and long-term dependencies in complex environments. AI-generated summary Large language models (LLMs) are…

28
Hugging Face Daily Papers research 7h ago

ORBIT: Preserving Foundational Language Capabilities in GenRetrieval via Origin-Regulated Merging

Abstract ORBIT addresses catastrophic forgetting in large language model fine-tuning for generative retrieval by tracking parameter distances and employing weight averaging to maintain model performance. AI-generated summary Despite the rapid advancements in large language model…

7
Hugging Face Daily Papers research 9h ago

UniPath: Adaptive Coordination of Understanding and Generation for Unified Multimodal Reasoning

Abstract Unified multimodal models can improve performance by adaptively selecting coordination paths rather than using fixed patterns, enabling diverse reasoning strategies for different inputs. AI-generated summary Unified multimodal models (UMMs) aim to integrate…

19
Hugging Face Daily Papers research 9h ago

Beyond Reasoning: Reinforcement Learning Unlocks Parametric Knowledge in LLMs

Abstract Reinforcement learning improves large language model recall of parametric knowledge by redistributing probability mass toward correct answers, with gains driven primarily by reinforcing rare but learnable examples. AI-generated summary Reinforcement learning (RL) has…

14
Hugging Face Daily Papers research 10h ago

Relit-LiVE: Relight Video by Jointly Learning Environment Video

Abstract A novel video relighting framework called Relit-LiVE is presented that produces physically consistent results without requiring camera pose information by incorporating raw reference images and using environment video prediction for joint relighting and environment map…

16
Hugging Face Daily Papers research 10h ago

Reliable Chain-of-Thought via Prefix Consistency

Abstract Prefix consistency uses answer reproduction rates under trace regeneration to weight candidate responses, achieving high accuracy with significantly fewer tokens than standard majority voting. AI-generated summary Large Language Models often improve accuracy on…

29
Hugging Face Daily Papers research 10h ago

Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States

Abstract POISE enables stable and efficient policy optimization for large reasoning models by estimating baselines using internal model signals, reducing computational overhead while maintaining performance comparable to existing methods. AI-generated summary Reinforcement…

37
Hugging Face Daily Papers research 12h ago

IndustryBench: Probing the Industrial Knowledge Boundaries of LLMs

Abstract IndustryBench evaluates industrial procurement question answering systems in Chinese against national standards, revealing significant gaps in safety compliance and highlighting the need for safety-aware assessment beyond standard accuracy metrics. AI-generated summary…

5
Hugging Face Daily Papers research 12h ago

Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs

Abstract Language models can be enhanced by transitioning from sequential message-based instruction-tuning to parallel stream processing, enabling simultaneous reading and generation across multiple concurrent data flows. AI-generated summary The continued improvements in…

6
Hugging Face Daily Papers research 13h ago

Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation

Abstract Pion is a spectrum-preserving optimizer for large language model training that uses orthogonal equivalence transformations to maintain singular values during weight updates, offering stable performance comparable to standard optimizers. AI-generated summary We introduce…

34
Hugging Face Daily Papers research 13h ago

Do not copy and paste! Rewriting strategies for code retrieval

Abstract Research investigates how different text rewriting strategies impact code retrieval performance, identifying that full natural language rewriting provides the greatest improvements while proposing entropy-based diagnostics to determine when such costly rewrites are…

15
Hugging Face Daily Papers research 13h ago

Large Language Models over Networks: Collaborative Intelligence under Resource Constraints

Abstract Collaborative intelligence enables multiple distributed LLMs to work together across devices and clouds to provide high-quality responses under diverse resource constraints. AI-generated summary Large language models (LLMs) are transforming society, powering…

9
Hugging Face Daily Papers research 13h ago

Debiased Model-based Representations for Sample-efficient Continuous Control

Abstract DR.Q algorithm improves model-based representations for Q-learning by maximizing mutual information and using faded prioritized experience replay to reduce bias and overfitting in representation learning. AI-generated summary Model-based representations recently stand…

20
Hugging Face Daily Papers research 13h ago

WildRelight: A Real-World Benchmark and Physics-Guided Adaptation for Single-Image Relighting

Abstract WildRelight dataset addresses the gap between synthetic and real-world single-image relighting by providing high-resolution outdoor scenes with aligned natural illumination, enabling physics-guided domain adaptation through diffusion posterior sampling and test-time…

21
Hugging Face Daily Papers research 13h ago

PAAC: Privacy-Aware Agentic Device-Cloud Collaboration

Abstract PAAC is a privacy-aware agentic framework that aligns planner-executor decomposition with device-cloud boundaries, using typed placeholder tokens and deterministic registries to enhance privacy while maintaining accuracy in distributed language model agents.…

23
Hugging Face Daily Papers research 14h ago

FaithfulFaces: Pose-Faithful Facial Identity Preservation for Text-to-Video Generation

Abstract FaithfulFaces is a pose-faithful facial identity preservation framework that improves identity consistency in text-to-video generation through pose-shared alignment and explicit Euler angle embeddings. AI-generated summary Identity-preserving text-to-video generation…

38
Hugging Face Daily Papers research 14h ago

Implicit Preference Alignment for Human Image Animation

Abstract Implicit Preference Alignment (IPA) addresses hand motion generation challenges through data-efficient post-training that eliminates the need for paired preference data while using hand-aware local optimization for improved quality. AI-generated summary Human image…

37
Hugging Face Daily Papers research 15h ago

One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue

Abstract Multi-turn dialogue safety monitoring system detects harmful intent accumulation through turn-level analysis and evaluates performance on a new benchmark dataset. AI-generated summary Hidden malicious intent in multi-turn dialogue poses a growing threat to deployed…

18
Hugging Face Daily Papers research 15h ago

Missing Old Logits in Asynchronous Agentic RL: Semantic Mismatch and Repair Methods for Off-Policy Correction

Abstract Asynchronous reinforcement learning in large language models faces challenges with PPO-style corrections due to delayed updates and missing historical logits, which are addressed through exact and approximate correction methods including snapshot tracking and revised…

9
Hugging Face Daily Papers research 16h ago

GLiNER-Relex: A Unified Framework for Joint Named Entity Recognition and Relation Extraction

Abstract A unified model for joint named entity recognition and relation extraction that uses a shared transformer encoder to simultaneously identify entities and extract relations with zero-shot capabilities. AI-generated summary Joint named entity recognition (NER) and…

27
Hugging Face Daily Papers research 16h ago

A Causal Language Modeling Detour Improves Encoder Continued Pretraining

Abstract Switching from Masked Language Modeling to Causal Language Modeling during encoder adaptation improves downstream performance on biomedical texts through dense supervision effects in lower transformer layers. AI-generated summary When adapting an encoder to a new…

25
Hugging Face Daily Papers research 16h ago

World Action Models: The Next Frontier in Embodied AI

Abstract World Action Models unify predictive state modeling with action generation for embodied policy learning, forming a cohesive framework for understanding environment dynamics and action prediction. AI-generated summary Vision-Language-Action (VLA) models have achieved…

15
Hugging Face Daily Papers research 17h ago

Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents

Abstract A visual-native agent harness with image bank reference protocol enables reusable intermediate visual evidence and closed-loop data generation that improves multimodal deep search performance across multiple benchmarks. AI-generated summary Multimodal deep search…

33
Hugging Face Daily Papers research 17h ago

L2P: Unlocking Latent Potential for Pixel Generation

Abstract Latent-to-Pixel transfer paradigm efficiently leverages pre-trained latent diffusion models to create pixel-space models with minimal training overhead and high-resolution generation capabilities. AI-generated summary Pixel diffusion models have recently regained…

14
Hugging Face Daily Papers research 17h ago

From Web to Pixels: Bringing Agentic Search into Visual Perception

Abstract Researchers introduce WebEye, a benchmark for object localization requiring external knowledge resolution, and Pixel-Searcher, an agent-based approach that connects hidden target identities to visual annotations through search and reasoning. AI-generated summary Visual…

22
Hugging Face Daily Papers research 17h ago

SeePhys Pro: Diagnosing Modality Transfer and Blind-Training Effects in Multimodal RLVR for Physics Reasoning

Abstract SeePhys Pro benchmark reveals that current multimodal models struggle with representation-invariant reasoning when information shifts from text to visual formats, and demonstrates that blind training can improve performance through residual textual cues. AI-generated…

36
Hugging Face Daily Papers research 17h ago

MEME: Multi-entity & Evolving Memory Evaluation

Abstract MEME benchmark evaluates memory systems across multiple entities and evolving conditions, revealing persistent challenges in dependency reasoning despite advanced retrieval and prompting techniques. AI-generated summary LLM-based agents increasingly operate in…

17
Hugging Face Daily Papers research 17h ago

Lite3R: A Model-Agnostic Framework for Efficient Feed-Forward 3D Reconstruction

Abstract Lite3R addresses efficiency challenges in transformer-based 3D reconstruction through sparse attention and low-precision quantization while maintaining geometric accuracy. AI-generated summary Transformer-based 3D reconstruction has emerged as a powerful paradigm for…

22
Hugging Face Daily Papers research 18h ago

PASA: A Principled Embedding-Space Watermarking Approach for LLM-Generated Text under Semantic-Invariant Attacks

Abstract PASA is a robust watermarking algorithm for large language models that operates at the semantic level using latent embedding spaces and shared randomness for secure text detection. AI-generated summary Watermarking for large language models (LLMs) is a promising…

16
Hugging Face Daily Papers research 18h ago

FocuSFT: Bilevel Optimization for Dilution-Aware Long-Context Fine-Tuning

Abstract Training framework FocuSFT improves long-context language model performance by addressing attention allocation issues through bilevel optimization with parametric memory that focuses attention on semantically relevant content. AI-generated summary Large language models…

25
Hugging Face Daily Papers research 18h ago

ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents

Abstract ToolCUA is an end-to-end agent that learns optimal GUI-tool path selection through staged training, achieving superior performance in hybrid action space environments. AI-generated summary Computer Use Agents (CUAs) can act through both atomic GUI actions, such as click…

38
Hugging Face Daily Papers research 18h ago

AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via Decompositional Verifiable Reward

Abstract AlphaGRPO enhances multimodal generation by applying Group Relative Policy Optimization to AR-Diffusion Unified Multimodal Models through self-reflective refinement and decompositional verifiable reward mechanisms. AI-generated summary In this paper, we propose…

26
Hugging Face Daily Papers research 18h ago

AdaPreLoRA: Adafactor Preconditioned Low-Rank Adaptation

Abstract LoRA optimizers are analyzed through a unified framework based on surrogate matrices and preconditioners, with AdaPreLoRA proposing a novel approach using Adafactor diagonal Kronecker preconditioning to improve factor-space updates while maintaining low memory usage.…

38
Hugging Face Daily Papers research 18h ago

World Model for Robot Learning: A Comprehensive Survey

Abstract World models as predictive representations of environmental dynamics have become essential for robot learning, supporting policy learning, planning, and simulation across various embodied applications. AI-generated summary World models, which are predictive…

12
Hugging Face Daily Papers research 18h ago

Geometric Factual Recall in Transformers

Abstract Transformer language models use geometric memorization where embeddings encode linear superpositions of attributes and MLPs act as relation-conditioned selectors rather than associative key-value mappings. AI-generated summary How do transformer language models memorize…

6
Hugging Face Daily Papers research 18h ago

Continual Harness: Online Adaptation for Self-Improving Foundation Agents

Abstract A self-improving AI system for embodied agents autonomously refines its own prompts, skills, and memory through continuous learning without environment resets, achieving human-level performance in complex video games. AI-generated summary Coding harnesses such as Claude…

15
Hugging Face Daily Papers research 19h ago

Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values

Abstract Autonomous agents exhibit distinct value systems from underlying language models, requiring new benchmarking approaches to assess alignment across diverse execution environments. AI-generated summary Autonomous agents have rapidly matured as task executors and seen…

28
Hugging Face Daily Papers research 19h ago

MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

Abstract MCP-Cosmos integrates generative World Models into the Model Context Protocol ecosystem to enhance agent planning and execution through predictive simulation in latent space. AI-generated summary The Model Context Protocol (MCP) has unified the interface between Large…

6
Hugging Face Daily Papers research 19h ago

LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues

Abstract A new benchmark called LongMemEval-V2 is introduced to evaluate memory systems' ability to help agents acquire environment-specific experience in web environments, featuring a suite of memory methods including AgentRunbook-R and AgentRunbook-C that demonstrate varying…

23
Hugging Face Daily Papers research 19h ago

δ-mem: Efficient Online Memory for Large Language Models

Abstract A lightweight memory mechanism called δ-mem enhances large language models by augmenting a frozen attention backbone with a compact associative memory state that provides low-rank corrections to attention computations. AI-generated summary Large language models…

12
Hugging Face Daily Papers research 19h ago

Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics

Abstract Enterprise discovery agents that read system configuration at runtime outperform traditional world models in configurable environments where dynamics change over time. AI-generated summary World models enable agents to anticipate the effects of their actions by…

37
Hugging Face Daily Papers research 19h ago

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Abstract Unified vision-language models treat understanding and generation as integrated processes rather than separate tasks, demonstrating strong performance across multiple multimodal capabilities including image synthesis and action reasoning. AI-generated summary Recent…

37
Hugging Face Daily Papers research 19h ago

Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization

Abstract DRoRAE enhances visual representation by fusing multi-layer features from pretrained vision encoders through adaptive routing and incremental correction, improving reconstruction and generation quality. AI-generated summary Representation autoencoders that reuse frozen…

6
Hugging Face Daily Papers research 20h ago

The Many Faces of On-Policy Distillation: Pitfalls, Mechanisms, and Fixes

Abstract On-policy distillation and self-distillation methods for large language models exhibit varying effectiveness depending on teacher choice, loss formulation, and instance-specific privileged information availability, with identified failure mechanisms including…

32
Hugging Face Daily Papers research 20h ago

CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives

Abstract CausalCine enables interactive, multi-shot video generation by addressing limitations of autoregressive models through causal modeling, dynamic memory routing, and real-time distillation techniques. AI-generated summary Autoregressive video generation aims at real-time,…

38
Hugging Face Daily Papers research 20h ago

MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents

Abstract MemPrivacy enables privacy-preserving personalized memory in edge-cloud environments by using type-aware placeholders to protect sensitive data while maintaining semantic integrity for effective memory operations. AI-generated summary As LLM-powered agents are…

30
Hugging Face Daily Papers research 20h ago

LychSim: A Controllable and Interactive Simulation Framework for Vision Research

Abstract A simulation framework called LychSim is introduced, featuring a Python API, procedural data pipeline, and MCP integration to enable controllable and interactive environments for vision system development and evaluation. AI-generated summary While self-supervised…

23
Hugging Face Daily Papers research 20h ago

AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration -- Learning from Cheap, Optimizing Expensive

Abstract An agentic framework called AutoLLMResearch automates high-cost large language model experiment configurations by learning from multi-fidelity experimental environments and enabling efficient configuration identification through cross-fidelity extrapolation.…

5
Hugging Face Daily Papers research 20h ago

MoCam: Unified Novel View Synthesis via Structured Denoising Dynamics

Abstract MoCam addresses the challenge of generative novel view synthesis by dynamically coordinating geometric and appearance priors through structured denoising dynamics within a diffusion framework. AI-generated summary Generative novel view synthesis faces a fundamental…

13
Hugging Face Daily Papers research 20h ago

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

Abstract Deep research agents trained using RubricEM framework demonstrate superior performance on long-form research tasks through rubric-guided reinforcement learning with stage-aware planning and reflection-based meta-policy evolution. AI-generated summary Training deep…

16
Hugging Face Daily Papers research 20h ago

LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models

Abstract LoopUS is a post-training framework that transforms pretrained LLMs into looped architectures for improved reasoning performance through latent-refinement and adaptive early exiting mechanisms. AI-generated summary Looped computation shows promise in improving the…

31
Hugging Face Daily Papers research 20h ago

Teaching Language Models to Think in Code

Abstract ThinC framework enables mathematical problem solving where code serves as the primary reasoning mechanism instead of a verification tool, demonstrating superior performance on math benchmarks. AI-generated summary Tool-integrated reasoning (TIR) has emerged as a…

6
Hugging Face Daily Papers research 1d ago

TacoMAS: Test-Time Co-Evolution of Topology and Capability in LLM-based Multi-Agent Systems

Abstract Test-time co-evolution framework for multi-agent systems that jointly adapts agent capabilities and communication topology at different time scales to achieve task-conditioned stability and improved performance. AI-generated summary Multi-agent systems (MAS) have…

16
