Hugging Face Daily Papers
Hugging Face Daily Papers research 18d ago
DRIFT: A Residual Flow Adapter for Decoding Continuous Outputs in Vision-Language Models
Abstract DRIFT is a framework that adapts pretrained vision-language models for continuous decoding tasks by combining coarse prediction with iterative refinement through flow matching, improving performance across perception and planning tasks. Generated by…
12 -
Hugging Face Daily Papers research 18d ago
Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models
Abstract Vision-language models can improve grounding performance under aggressive token reduction by replacing irreversible visual-token pruning with recoverable routing that allows tokens to re-enter the processing pipeline at later stages. Generated by…
16 -
Hugging Face Daily Papers research 18d ago
Adaptive Multi-Resolution Procedural Knowledge Compression for Large Language Models
Abstract SKIM is an adaptive multi-resolution soft token compression framework that efficiently compresses procedural skills while maintaining task performance and enabling lightweight offline compression for frequently updated community skills. Generated by…
6 -
Hugging Face Daily Papers research 18d ago
τ-Rec: A Verifiable Benchmark for Agentic Recommender Systems
Abstract A benchmark for agentic recommender systems is introduced that uses verifiable rewards and controlled dialogue constraints to evaluate conversational agent reliability, revealing significant performance gaps among leading models. Generated by…
6 -
Hugging Face Daily Papers research 18d ago
On Subquadratic Architectures: From Applications to Principles
Abstract xLSTM demonstrates superior performance in sequence modeling tasks compared to Mamba-2 and Gated DeltaNet due to enhanced state tracking and memory dynamics. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Transformers dominate modern sequence modeling, but their quadratic…
4 -
Hugging Face Daily Papers research 19d ago
Fine-tuning Multi-modal LLMs with ART: Art-based Reinforcement Training
Abstract ART enables parameter-efficient fine-tuning of frozen multimodal language models by optimizing raw visual input through gradient backpropagation, achieving performance comparable to LoRA while supporting pre-compiled computational graphs. Generated by…
8 -
Hugging Face Daily Papers research 19d ago
TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning
Abstract TRACE is a rollout allocation framework that improves reward contrast in multi-turn agentic reinforcement learning by dynamically distributing resources across tree-structured rollouts based on prefix-level informativeness. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
27 -
Hugging Face Daily Papers research 19d ago
FlowLet: Conditional 3D Brain MRI Synthesis using Wavelet Flow Matching
Abstract FlowLet is a conditional generative framework that synthesizes age-conditioned 3D MRIs using flow matching in an invertible 3D wavelet domain, improving brain age prediction performance for underrepresented age groups. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Brain…
18 -
Hugging Face Daily Papers research 19d ago
POISE: Position-Aware Undetectable Skill Injection on LLM Agents
Abstract POISE is a stealthy skill-poisoning attack that embeds malicious triggers within benign-looking instructions, achieving high attack success rates while avoiding detection by LLM scanners that are overly sensitive to privileged tool operations. Generated by…
16 -
Hugging Face Daily Papers research 19d ago
Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code
Abstract Grammar-constrained decoding techniques used to ensure syntactic validity in code generation can be exploited as an attack surface, leading to the development of a jailbreak method called CodeSpear and a safety alignment approach named CodeShield. Generated by…
37 -
Hugging Face Daily Papers research 19d ago
Breaking the Bubble: Asynchronous Pipeline Parallel Training with Bounded Weight Inconsistency
Abstract PACI enables efficient asynchronous pipeline training by controlling forward/backward weight inconsistency through local gradient accumulation, achieving higher throughput and shorter time-to-accuracy without sacrificing stability or increasing memory usage. Generated by…
9 -
Hugging Face Daily Papers research 19d ago
Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation
Abstract A lightweight approach combining a frozen pretrained time-series foundation model with a simple regression head achieves superior RUL prediction performance compared to various baseline methods on industrial sensor data. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
15 -
Hugging Face Daily Papers research 19d ago
Large Language Models Are Overconfident in Their Own Responses
Abstract Instruction tuning degrades calibration in large language models, with chat templates exacerbating overconfidence through ownership bias, which can be mitigated by reframing model responses as user input during confidence assessment. Generated by…
22 -
Hugging Face Daily Papers research 19d ago
Distilling LLM Feedback for Lean Theorem Proving
Abstract Feedback Distillation improves post-training of reasoning models by using self-distillation with token-level supervision and privileged feedback from language models, offering better diversity and complementary benefits when combined with GRPO. Generated by…
38 -
Hugging Face Daily Papers research 19d ago
EvoTrainer: Co-Evolving LLM Policies and Training Harnesses for Autonomous Agentic Reinforcement Learning
Abstract EvoTrainer autonomously evolves both language model policies and training harnesses through empirical feedback, demonstrating superior performance in complex reasoning and coding tasks compared to traditional handcrafted approaches. Generated by…
6 -
Hugging Face Daily Papers research 19d ago
Reason, Then Re-reason: Cross-view Revisiting Improves Spatial Reasoning
Abstract A training-free framework for spatial reasoning from egocentric videos enables revisiting conclusions through synthesized novel-view videos generated from predicted 3D geometry. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Spatial reasoning from egocentric videos…
11 -
Hugging Face Daily Papers research 19d ago
Redesign Mixture-of-Experts Routers with Manifold Power Iteration
Abstract Researchers propose a novel router redesign for Mixture-of-Experts models that aligns router rows with the principal singular directions of expert matrices using Manifold Power Iteration to improve model effectiveness. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Router…
33 -
Hugging Face Daily Papers research 19d ago
Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks
Abstract A new benchmark and adapter protocol called Claw-SWE-Bench enables fair comparison of diverse coding agents by standardizing evaluation conditions and revealing the importance of adapter design for effective code generation. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
16 -
Hugging Face Daily Papers research 19d ago
ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics
Abstract A new benchmark called ComBench is introduced to evaluate large language models' combinatorial reasoning abilities through Olympiad-level problems that test both proof construction and explicit mathematical constructions. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
37 -
Hugging Face Daily Papers research 19d ago
Toward Generalist Autonomous Research via Hypothesis-Tree Refinement
Abstract An AI framework called Arbor enables autonomous scientific research by combining strategic coordination, isolated hypothesis testing, and a persistent knowledge tree to iteratively improve research outcomes across multiple domains. Generated by…
18 -
Hugging Face Daily Papers research 19d ago
TRL-Bench: Standardizing Cross-Paradigm Representation-Level Evaluation of Tabular Encoders
Abstract TRL-Bench establishes a standardized benchmark for evaluating tabular representation learning models across multiple granularities, revealing that encoder performance varies by task type and requires capability-specific assessment rather than single leaderboard…
6 -
Hugging Face Daily Papers research 19d ago
Verifiable Environments Are LEGO Bricks: Recursive Composition for Reasoning Generalization
Abstract A recursive automated composition framework enables scalable reinforcement learning for language models by automatically combining verifiable environments through compositional operators. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Reinforcement Learning (RL) with…
11 -
Hugging Face Daily Papers research 19d ago
Beyond Scalar Rewards by Internalizing Reasoning into Score Distributions
Abstract A teacher-student framework decouples complex reasoning from efficient reward deployment in text-to-image training, achieving superior preference accuracy and optimization performance. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Reward models are central to…
22 -
Hugging Face Daily Papers research 19d ago
Agentic Environment Engineering for Large Language Models: A Survey of Environment Modeling, Synthesis, Evaluation, and Application
Abstract Large language model agents require specialized environments for training and evaluation, which can be categorized by their engineering lifecycle stages and evolved through various paradigms including neural and symbolic approaches. Generated by…
8 -
Hugging Face Daily Papers research 19d ago
Embodied-R1.5: Evolving Physical Intelligence via Embodied Foundation Models
Abstract Embodied-R1.5 is a unified embodied foundation model that integrates embodied reasoning capabilities and achieves state-of-the-art performance on embodied vision-language benchmarks through a multi-task balanced reinforcement learning approach. Generated by…
35 -
Hugging Face Daily Papers research 19d ago
InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning
Abstract InternVideo3 enhances long-horizon multimodal tasks through Multimodal Contextual Reasoning and efficient attention mechanisms, demonstrating strong performance on video understanding benchmarks and video agent capabilities. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
18 -
Hugging Face Daily Papers research 19d ago
i1: A Simple and Fully Open Recipe for Strong Text-to-Image Models
Abstract A comprehensive experimental study of text-to-image diffusion models reveals key design choices and training insights leading to the development of i1, a 3B-parameter model that matches leading performance while maintaining full openness. Generated by…
21 -
Hugging Face Daily Papers research 19d ago
World Model Self-Distillation: Training World Models to Solve General Tasks
Abstract A scalable framework combines self-distillation and reinforcement learning to transfer task-solving abilities from vision-language models to video diffusion models without requiring labeled task-video data. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Pretrained video…
15 -
Hugging Face Daily Papers research 19d ago
Breaking Entropy Bounds: Accelerating RL Training via MTP with Rejection Sampling
Abstract Bebop addresses the efficiency bottleneck in reinforcement learning training of large language models by optimizing multi-token prediction techniques through entropy-aware sampling and novel training objectives that improve acceptance rates and inference throughput.…
28 -
Hugging Face Daily Papers research 19d ago
World Pilot: Steering Vision-Language-Action Models with World-Action Priors
Abstract World Pilot enhances Vision-Language-Action models by incorporating dynamic scene evolution and trajectory priors from a World-Action Model, achieving superior performance in zero-shot out-of-distribution manipulation tasks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
10 -
Hugging Face Daily Papers research 19d ago
Lius: Translation Model Based Instructional Lingustic Using Continual Instruction Tuning In Kupang Malay
Abstract Continual Instruction Tuning enables effective fine-tuning of large language models for low-resource language translation, achieving superior performance compared to standard instruction tuning and multilingual models. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Large…
4 -
Hugging Face Daily Papers research 19d ago
DeNovoSWE: Scaling Long-Horizon Environments for Generating Entire Repositories from Scratch
Abstract A large-scale dataset called DeNovoSWE is introduced for training code agents to generate entire software repositories from documentation, significantly improving performance on long-horizon software engineering tasks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. As the…
15 -
Hugging Face Daily Papers research 19d ago
ICA Lens: Interpreting Language Models Without Training Another Dictionary
Abstract Independent component analysis (ICA) is revived as an efficient method for discovering interpretable directions in language model representations, offering a faster alternative to sparse autoencoder training while maintaining competitive performance in probing tasks.…
22 -
Hugging Face Daily Papers research 19d ago
PaperMentor: A Human-Centered Multi-Agent Writing Tutor for AI Research Papers on Overleaf
Abstract A human-centered writing assistant system called PaperMentor integrates expert research advice with specialized agents to provide actionable feedback during manuscript drafting, outperforming AI baselines in usability and relevance. Generated by…
38 -
Hugging Face Daily Papers research 19d ago
When Behavioral Safety Evaluation Fails: A Representation-Level Perspective
Abstract Behavioral safety evaluations of large language models provide incomplete insights into internal robustness, as demonstrated by the audit gap between observable outputs and latent space vulnerabilities revealed through intervention-based testing. Generated by…
38 -
Hugging Face Daily Papers research 19d ago
In-Context Multiple Instance Learning
Abstract Pretraining a Perceiver-style architecture on synthetic bag-structured data enables efficient, task-adaptive classification from few labeled examples in multiple instance learning scenarios. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Multiple Instance Learning (MIL)…
10 -
Hugging Face Daily Papers research 19d ago
MilliVid: Hierarchical Latents for Long-Range Consistency in Video Generation
Abstract Video generative models achieve improved long-range consistency through coarse-to-fine token generation using a multi-scale autoencoder and diffusion model architecture. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Video generative models have become increasingly…
28 -
Hugging Face Daily Papers research 19d ago
Decentralized Multi-Agent Systems with Shared Context
Abstract The Decentralized Language Models (DeLM) framework enables scalable large language model reasoning through parallel agents that asynchronously coordinate via a shared verified context, improving performance and efficiency over centralized approaches. Generated by…
25 -
Hugging Face Daily Papers research 19d ago
SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction
Abstract SkillHarm is a benchmark for evaluating skill-based attacks across the skill-use lifecycle, demonstrating significant vulnerabilities in current agents with attack success rates up to 86.3%. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Agent skills occupy a privileged…
36 -
Hugging Face Daily Papers research 19d ago
Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests
Abstract The CapCode framework uses randomized testing with performance caps to detect and prevent shortcut exploitation in agent evaluation, while CapReward rewards systems that adhere to intended task specifications. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. A growing failure…
21 -
Hugging Face Daily Papers research 19d ago
The Role of Feedback Alignment in Self-Distillation
Abstract Self-distillation effectiveness depends on structural alignment between feedback and solver reasoning, with step-aligned critique outperforming binary rewards and reference solutions by targeting specific reasoning failures. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…
32 -
Hugging Face Daily Papers research 19d ago
Next Forcing: Causal World Modeling with Multi-Chunk Prediction
Abstract Next Forcing introduces a multi-chunk prediction framework that accelerates training and inference for autoregressive video generation while improving accuracy and physical law adherence. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Autoregressive video generation has…
19 -
Hugging Face Daily Papers research 19d ago
FadeMem: Distance-Aware Memory Consolidation for Autoregressive Video Diffusion
Abstract FadeMem introduces a distance-aware key-value memory consolidation mechanism that organizes historical video data into a temporal hierarchy, improving long-video generation by preserving recent context and long-range anchors under fixed cache constraints. Generated by…
36 -
Hugging Face Daily Papers research 20d ago
Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning
Abstract CPPO addresses limitations in reinforcement learning with verifiable rewards by introducing position-weighted thresholds and cumulative prefix budgeting to better handle autoregressive generation challenges. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Reinforcement…
12 -
Hugging Face Daily Papers research 20d ago
Interpreting and Steering a Text-to-Speech Language Model with Sparse Autoencoders
Abstract Sparse autoencoders trained on language model representations reveal interpretable features for speech synthesis that can be manipulated to control linguistic and prosodic attributes. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Language models increasingly serve as the…
19 -
Hugging Face Daily Papers research 20d ago
Kwai Keye-VL-2.0 Technical Report
Abstract Kwai Keye-VL-2.0-30B-A3B is an open-source Mixture-of-Experts multimodal foundation model that enables long-video understanding and agentic intelligence through DeepSeek Sparse Attention and specialized training infrastructure. Generated by…
36 -
Hugging Face Daily Papers research 20d ago
IR3DE: A Linear Router for Large Language Models
Abstract A ridge regression-based routing method achieves competitive performance in selecting domain-expert LLMs for different tasks while enabling dynamic addition/removal of experts without retraining. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Foundational Large Language…
28 -
Hugging Face Daily Papers research 20d ago
PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models
Abstract A psychologically-informed refusal framework called PsychoSafe is developed for large language models to improve harmful request handling through structured supportive communication, showing enhanced refusal quality and resource referral while maintaining performance on…
14 -
Hugging Face Daily Papers research 20d ago
BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling
Abstract BrainSurgery is a tool for robust and reproducible tensor manipulation of neural network checkpoints through declarative YAML plans with built-in validation. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. As deep learning models scale, managing, inspecting, and modifying…
12 -
Hugging Face Daily Papers research 20d ago
UniPET: a universal network for high-quality PET image denoising across varied dose reduction factors
Abstract A universal PET image denoising framework addresses variability in dose reduction factors through domain generalization techniques and region-aware learning strategies. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Most existing deep learning-based PET image denoising…
26