Hugging Face Daily Papers
Hugging Face Daily Papers research 6d ago
Toward Parking Spot Occupancy Recognition: A Self-Supervised Approach
Abstract A self-supervised transfer learning approach for parking spot occupancy recognition achieves high accuracy with minimal labeled data through two-stage training and deployment strategies. Generated by Qwen/Qwen2.5-Coder-32B-Instruct As urban areas expand, automatic…
22 -
Hugging Face Daily Papers research 6d ago
Capable but Careless: Do Computer-Use Agents Follow Contextual Integrity?
Abstract Computer-use agents frequently expose inappropriate information across applications, prompting the creation of AgentCIBench to evaluate and mitigate privacy risks in cross-application contexts. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Computer-use agents (CUAs) now…
7 -
Hugging Face Daily Papers research 6d ago
Demystifying Training-Time Augmentation for Data-Constrained Language Model Pretraining
Abstract Training-time data augmentation techniques help mitigate overfitting in autoregressive language model pretraining by delaying performance deterioration and improving final model quality when training on fixed datasets for many epochs. Generated by…
28 -
Hugging Face Daily Papers research 6d ago
Toward Open Weight Models Without Risks: Separating Public and Private Capabilities in LLMs
Abstract Tiered Language Models (TLMs) provide a framework for releasing large language models with configurable capability levels through secret keys that modify computation graphs while maintaining public model integrity. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
21 -
Hugging Face Daily Papers research 6d ago
Arbor: Explicit Geometric Conditioning for Controllable 3D Asset Generation
Abstract Arbor enables explicit 3D spatial control in text-conditioned latent generation through constraint meshes that define occupancy, avoidance, and contact regions, maintaining object quality while improving constraint adherence. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
26 -
Hugging Face Daily Papers research 6d ago
Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention
Abstract Grouped Query Experts (GQE) improves Transformer efficiency by selectively activating query heads based on token content while maintaining key-value cache benefits of grouped-query attention. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Self-attention is central to…
25 -
Hugging Face Daily Papers research 6d ago
Training Open Models for Agentic Phone Use
Abstract PhoneBuddy combines real and mock app environments to improve training of open models for phone use, demonstrating enhanced task success rates through mixed reinforcement learning approaches. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Phones are becoming an important…
11 -
Hugging Face Daily Papers research 6d ago
Counsel: A Meta-Evaluation Dataset for Agentic Tasks
Abstract A large-scale dataset of human meta-evaluations of LLM critiques for agentic tasks is introduced to improve the calibration and reliability of automated evaluation methods. Generated by Qwen/Qwen2.5-Coder-32B-Instruct As agentic systems tackle increasingly complex…
22 -
Hugging Face Daily Papers research 6d ago
Notes2Skills: From Lab Notebooks to Certainty-Aware Scientific Agent Skills
Abstract The Notes2Skills framework converts laboratory notes into verifiable skills for AI agents while maintaining author uncertainty levels, addressing gaps in scientific AI development. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Scientific discovery workflows usually contain…
27 -
Hugging Face Daily Papers research 6d ago
SkillHarness: Harnessing Safe Skills for Computer-Use Agents
Abstract SkillHarness is a framework that enables computer-use agents to safely learn and execute skills in dynamic environments by incorporating safety constraints and adaptive skill selection mechanisms. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Computer-Use Agents (CUAs)…
24 -
Hugging Face Daily Papers research 6d ago
Improving Text-to-Music Generation with Human Preference Rewards
Abstract A text-to-music generation system uses reward conditioning, expert iteration, and preference tuning to improve audio quality while maintaining efficiency within a 120M-parameter model framework. Generated by Qwen/Qwen2.5-Coder-32B-Instruct We describe our entry to the…
19 -
Hugging Face Daily Papers research 7d ago
Deeper is Not Always Better: Mitigating the Alignment Tax via Confident Layer Decoding
Abstract Autoregressive generation in large language models traditionally uses the final layer for token prediction, but a new decoding strategy dynamically selects more reliable intermediate layers based on entropy-guided search, improving reasoning performance with minimal…
34 -
Hugging Face Daily Papers research 7d ago
Unlimited OCR Works
Abstract Unlimited OCR introduces Reference Sliding Window Attention to eliminate growing memory consumption during long-sequence OCR tasks, enabling efficient transcription of multiple pages in a single forward pass. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Recently,…
12 -
Hugging Face Daily Papers research 7d ago
Foresight: Failure Detection for Long-Horizon Robotic Manipulation with Action-Conditioned World Model Latents
Abstract A failure detection framework for long-horizon robotic tasks uses action-conditioned world models and functional conformal prediction to monitor manipulation trajectories with only final task labels. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Long-horizon tasks are…
8 -
Hugging Face Daily Papers research 7d ago
MeshFlow: Mesh Generation with Equivariant Flow Matching
Abstract MeshFlow generates triangle meshes directly using equivariant optimal-transport flow matching models with improved inference speed over autoregressive methods. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Meshes are among the most common 3D scene representations, but…
16 -
Hugging Face Daily Papers research 7d ago
HAKARI-Bench: A Lightweight Benchmark for Comparing Retrieval Architectures and Efficiency Settings under Unified Conditions
Abstract HAKARI-Bench provides a lightweight benchmark for comparing retrieval methods across multiple configurations and languages, enabling efficient model selection and performance analysis. Generated by Qwen/Qwen2.5-Coder-32B-Instruct With the rapid spread of…
23 -
Hugging Face Daily Papers research 7d ago
AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction
Abstract AOHP presents an Android-based operating system framework that treats AI agents as first-class entities, enhancing task completion rates and reducing execution costs through specialized agent-oriented mechanisms. Generated by Qwen/Qwen2.5-Coder-32B-Instruct AI agents…
16 -
Hugging Face Daily Papers research 7d ago
DataClaw0: Agentic Tailoring Multimodal Data from Raw Streams
Abstract The Agentic Data Tailoring paradigm uses learnable data processing to structure high-entropy multimodal streams, with the DataClaw_0-9B model achieving robust alignment through SFT and GRPO on a novel benchmark. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Massive unstructured…
19 -
Hugging Face Daily Papers research 7d ago
UniverSat: Resolution- and Modality-Agnostic Transformers for Earth Observation
Abstract UniverSat introduces a Universal Patch Encoder for Vision Transformers that enables robust, sensor-agnostic spatial feature extraction across diverse Earth Observation data types. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Vision Transformers (ViT) dominate computer…
6 -
Hugging Face Daily Papers research 7d ago
FastMix: Fast Data Mixture Optimization via Gradient Descent
Abstract FASTMIX automates optimal data mixture discovery during training by formulating mixture selection as a bilevel optimization problem that jointly optimizes mixture coefficients and model parameters through iterative updates. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
19 -
Hugging Face Daily Papers research 7d ago
CLI-Universe: Towards Verifiable Task Synthesis Engine for Terminal Agents
Abstract A principled synthesis engine generates high-quality terminal-agent tasks through multi-dimensional capability taxonomy and evidence-guided research, creating a distilled dataset that enables significant performance gains in LLM training. Generated by…
5 -
Hugging Face Daily Papers research 7d ago
PoLAR: Factorizing Extent and Mode in Latent Actions for Robot Policy Learning
Abstract PoLAR introduces a geometrically structured latent action representation in hyperbolic space that separates transition extent from transition mode, improving robotic policy learning performance. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Latent action pretraining…
12 -
Hugging Face Daily Papers research 7d ago
Dense Reward for Multi-View 3D Reasoning with Global Maps and Local Views
Abstract DR-MV3D presents a map-grounded learning framework with dense rewards to improve multi-view 3D visual question answering through global map construction, view-trajectory planning, and egocentric grounding. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Multi-view 3D…
15 -
Hugging Face Daily Papers research 7d ago
Causal Discovery in the Era of Agents
Abstract Language models should assist causal discovery workflows by providing contextual support and explanations rather than generating causal conclusions, as demonstrated through a platform that integrates data analysis and expert knowledge. Generated by…
31 -
Hugging Face Daily Papers research 7d ago
EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions
Abstract EnterpriseClawBench presents a benchmark for enterprise agents based on real-world sessions with 852 reproducible tasks, emphasizing comprehensive evaluation metrics beyond single performance scores. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Enterprise agents…
30 -
Hugging Face Daily Papers research 7d ago
PolicyTrim: Boosting Intrinsic Policy Efficiency of Vision-Language-Action Models
Abstract PolicyTrim is a reinforcement learning-based framework that enhances VLA model efficiency by extending reliable action chunk lengths and reducing redundant physical steps through dynamic exploration and redundancy-aware rewards. Generated by…
25 -
Hugging Face Daily Papers research 7d ago
Safe Few-Step Generation via Velocity Editing
Abstract VESFlow is a training-free safety method for flow matching-based text-to-image generation that edits velocity fields to ensure safe output while maintaining prompt integrity. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Flow matching has recently emerged as a strong…
16 -
Hugging Face Daily Papers research 7d ago
Tmax: A simple recipe for terminal agents
Abstract A novel RL training approach for terminal agents achieves superior performance using a simplified recipe and expanded dataset, enabling effective training with fewer parameters than previous methods. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Terminal-using agents…
36 -
Hugging Face Daily Papers research 7d ago
OpenRath: Session-Centered Runtime State for Agent Systems
Abstract OpenRath introduces a PyTorch-like programming model for multi-agent systems using Session as a central runtime abstraction that enables explicit fork, merge, and replay operations while recording comprehensive execution state. Generated by…
21 -
Hugging Face Daily Papers research 7d ago
Learning from Your Own Mistakes: Constructing Learnable Micro-Reflective Trajectories for Self-Distillation
Abstract Trajectory-Augmented Policy Optimization (TAPO) enhances large language model reasoning by creating explicit corrective trajectories that preserve erroneous reasoning while incorporating natural-language diagnoses and corrections, outperforming traditional…
31 -
Hugging Face Daily Papers research 7d ago
Connect the Dots: Training LLMs for Long-Lifecycle Agents with Cross-Domain Generalization Via Reinforcement Learning
Abstract Large language models can be trained through reinforcement learning to develop a meta-capability enabling continuous learning and adaptation across long sequences of tasks in dynamic environments. Generated by Qwen/Qwen2.5-Coder-32B-Instruct This work presents a general…
31 -
Hugging Face Daily Papers research 7d ago
PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems
Abstract PlanBench-XL evaluates large language model agents' ability to plan and adapt in complex tool-rich environments with limited visibility and dynamic disruptions. Generated by Qwen/Qwen2.5-Coder-32B-Instruct LLM agents increasingly operate in large tool ecosystems, where…
10 -
Hugging Face Daily Papers research 7d ago
World Action Models: A Survey
Abstract World Action Models are predictive-action systems that generate future states for decision-making, with designs balancing representational richness against computational constraints. Generated by Qwen/Qwen2.5-Coder-32B-Instruct World Action Models (WAMs) are embodied…
30 -
Hugging Face Daily Papers research 7d ago
HydraHead: From Head-Level Functional Heterogeneity to Specialized Attention Hybridization
Abstract HydraHead is a novel attention hybridization architecture that combines Full Attention and Linear Attention at the head level, achieving superior long-context performance with reduced training overhead through interpretability-driven selection and scale-normalized…
34 -
Hugging Face Daily Papers research 7d ago
KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking
Abstract KaLM-Reranker-V1 is a fast reranker that decouples query and passage computation using encoder-decoder architecture with Matryoshka embedding pooling and cross-attention for efficient relevance modeling. Generated by Qwen/Qwen2.5-Coder-32B-Instruct As retrieval systems…
32 -
Hugging Face Daily Papers research 7d ago
Manifold Bandits: Bayesian Curriculum Learning over the Latent Geometry of Large Language Models
Abstract Reinforcement learning approaches for improving LLM reasoning capabilities are enhanced by a Bayesian Manifold Curriculum framework that structures problem sampling based on task manifold relationships and endogenous non-stationarity. Generated by…
20 -
Hugging Face Daily Papers research 7d ago
EvoEmbedding: Evolvable Representations for Long-Context Retrieval and Agentic Memory
Abstract EvoEmbedding is a dynamic embedding model that generates adaptive representations by maintaining a continuously updated latent memory, enabling improved retrieval performance in long-context scenarios. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Existing embedding…
32 -
Hugging Face Daily Papers research 7d ago
Exploring the Design Space of Reward Backpropagation for Flow Matching
Abstract FlowBP addresses limitations in flow matching model alignment by using a surrogate trajectory framework that reduces memory usage and gradient chaining while maintaining performance across multiple text-to-image models. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
23 -
Hugging Face Daily Papers research 7d ago
DailyReport: An Open-ended Benchmark for Evaluating Search Agents on Daily Search Tasks
Abstract Search agents face challenges in real-world evaluation due to limited benchmarks and coarse metrics, necessitating more nuanced assessment approaches. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Search Agents (SAs) typically leverage large language models (LLMs) to…
14 -
Hugging Face Daily Papers research 7d ago
CalVerT: Augmenting Agents with Calibrated Verifier Telemetry Improves Action and Learning in Knowledge-Intensive Tasks
Abstract Calibrated verifier telemetry enhances LLM agents in knowledge-intensive question answering by providing confidence scores and grounding verification, reducing both over-retrieval and unsupported answers. Generated by Qwen/Qwen2.5-Coder-32B-Instruct LLM agents in…
7 -
Hugging Face Daily Papers research 7d ago
Deep Research in Physical Sciences: A Multi-Agent Framework and Comprehensive Benchmark
Abstract The PhySciBench benchmark reveals limited performance of current LLM agents in physical science research, leading to the development of the DelveAgent framework, which improves accuracy through modular design and physics-grounded mechanisms. Generated by…
5 -
Hugging Face Daily Papers research 7d ago
When, Where, and How: Adaptive Binning for Tabular Self-Supervised Learning
Abstract Adaptive Binning introduces a training-adaptive discretization method for self-supervised learning on medical tabular data, improving representation learning through feature-wise refinement and heterogeneous feature handling. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
13 -
Hugging Face Daily Papers research 7d ago
Characterizing Narrative Content in Web-scale LLM Pretraining Data
Abstract A comprehensive analysis of narrative structures in large-scale language model training data reveals measurable, multidimensional narrative patterns that vary across different content sources and topics. Generated by Qwen/Qwen2.5-Coder-32B-Instruct The narrative…
21 -
Hugging Face Daily Papers research 7d ago
SproutRAG: Attention-Guided Tree Search with Progressive Embeddings for Long-Document RAG
Abstract SproutRAG is an attention-guided hierarchical retrieval-augmented generation framework that organizes sentence-level chunks into semantically coherent units using learned inter-sentence attention, enabling multi-granularity retrieval without additional LLM calls or…
33 -
Hugging Face Daily Papers research 7d ago
MCompassRAG: Topic Metadata as a Semantic Compass for Paragraph-Level Retrieval
Abstract MCompassRAG enhances retrieval-augmented generation by using topic-level metadata to guide chunk selection, improving both efficiency and precision in complex research tasks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Retrieval-augmented generation (RAG) systems…
32 -
Hugging Face Daily Papers research 7d ago
StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs
Abstract Multimodal large language models exhibit social bias driven by specific visual attributes, with fashion style and socioeconomic cues having the greatest impact on model judgments. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Multimodal large language models (MLLMs) are…
37 -
Hugging Face Daily Papers research 7d ago
Multi-Turn Reflective Masking Elicits Reasoning in Mask Diffusion Models
Abstract Reflective Masking enables iterative local refinement in Mask Diffusion Models through lightweight post-training, supporting multi-turn reasoning without architectural changes. Generated by Qwen/Qwen2.5-Coder-32B-Instruct While reasoning on autoregressive (AR) models is…
26 -
Hugging Face Daily Papers research 7d ago
GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning
Abstract GeneralVLA-2 addresses limitations in vision-language-action systems by introducing GeoFuse-MV3D for improved 3D reconstruction and an enhanced KnowledgeBank for better memory management in robotic manipulation tasks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
32 -
Hugging Face Daily Papers research 7d ago
SpatialAvatar-0: High-Quality 4D Head Avatar with Multi-Stage Reconstruction
Abstract SpatialAvatar-0 enables high-quality 4D head avatar generation by combining feed-forward prediction with per-subject refinement through a shared Gaussian representation, achieving superior performance across multiple benchmarks. Generated by…
20