Hugging Face Daily Papers · May 29, 2026 · 6 min read

Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection

#multimodal #reasoning #benchmark #funding

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.


We introduce Tiny but Trusted, a parameter-efficient vision-language framework for time-series anomaly detection with grounded reasoning, enabling compact VLMs to provide accurate and explainable reasoning over sequential data. Instead of treating anomaly detection as interval prediction alone, we construct VisAnomBench, a curated benchmark from public time-series datasets augmented with natural-language anomaly rationales selected using fine-grained, task-specific rewards from multiple large VLMs. Fine-tuning on this benchmark yields VisAnomReasoner, a lightweight VLM that jointly localizes abnormal temporal regions and explains the underlying pattern shifts. The model improves anomaly localization and interpretability while outperforming strong baselines on VisAnomBench and generalizing to TSB-AD-U.


arxiv:2605.30344


Published on May 28 · Submitted by Ismini Lourentzou on May 29

PLAN Lab @University of Illinois Urbana-Champaign


Authors: Xiaona Zhou, Muntasir Wahed, Tianjiao Yu, Constantin Brif, Ismini Lourentzou

Abstract

A parameter-efficient vision-language model is developed for time-series anomaly detection using a novel benchmark with natural-language rationales, achieving superior performance and generalization across multiple datasets.

AI-generated summary

Recent advances in Vision-Language Models (VLMs) have achieved impressive performance across many tasks, yet prior studies report unsatisfactory performance when applying large language or multimodal models to finding abnormal patterns in sequential data. Public anomaly detection benchmarks typically provide interval annotations but not natural-language rationales, making it difficult to fine-tune VLMs to produce grounded, interpretable decisions. To address this gap, we construct VisAnomBench, a curated benchmark built from public time-series datasets and augmented with high-quality anomaly explanations selected from multiple large VLMs using fine-grained, task-specific rewards. Through fine-tuning on this benchmark, we develop VisAnomReasoner, a parameter-efficient VLM for time-series anomaly detection. Experimental results on VisAnomBench show that VisAnomReasoner achieves more accurate anomaly localization and consistently outperforms all baselines, with improvements of at least 21.23 and 23.87 percentage points in precision and F1, respectively. Additional experiments on the TSB-AD-U benchmark demonstrate strong cross-benchmark generalization, with VisAnomReasoner improving precision and F1 by 9.57 and 13.39 percentage points, respectively.
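The abstract reports gains in precision and F1 for anomaly localization but does not say how predicted intervals are scored against the annotated ones. As a rough illustration only, the sketch below uses plain point-wise scoring: every time step inside a predicted interval counts as a predicted anomaly, and precision, recall, and F1 are computed over time steps. The function names, the inclusive `(start, end)` interval convention, and the point-wise protocol itself are assumptions made for this example, not the paper's evaluation code.

```python
from typing import List, Tuple

# Inclusive (start, end) index pairs. This convention is an assumption
# of the sketch, not something the abstract specifies.
Interval = Tuple[int, int]


def intervals_to_mask(intervals: List[Interval], length: int) -> List[bool]:
    """Mark every time step covered by at least one interval."""
    mask = [False] * length
    for start, end in intervals:
        # Clip to the series bounds so out-of-range predictions are tolerated.
        for t in range(max(start, 0), min(end, length - 1) + 1):
            mask[t] = True
    return mask


def pointwise_prf(
    predicted: List[Interval], labeled: List[Interval], length: int
) -> Tuple[float, float, float]:
    """Point-wise precision, recall, and F1 between two sets of intervals."""
    pred = intervals_to_mask(predicted, length)
    gold = intervals_to_mask(labeled, length)
    tp = sum(1 for p, g in zip(pred, gold) if p and g)
    fp = sum(1 for p, g in zip(pred, gold) if p and not g)
    fn = sum(1 for p, g in zip(pred, gold) if g and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # A labeled anomaly at steps 10..19 and a prediction shifted to 15..24:
    # 5 steps overlap, 5 are false positives, 5 are missed.
    p, r, f = pointwise_prf([(15, 24)], [(10, 19)], length=30)
    print(p, r, f)  # 0.5 0.5 0.5
```

These scores lie in [0, 1]; a difference between two models multiplied by 100 gives a gain in percentage points, the unit used for the improvements quoted above.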

arXiv: https://arxiv.org/abs/2605.30344 · Project page: https://plan-lab.github.io/projects/VisAnom

Community

isminoula

Paper submitter about 23 hours ago

librarian-bot

about 13 hours ago

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

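The following papers were recommended by the Semantic Scholar API:

* Reasoning-Guided Grounding: Elevating Video Anomaly Detection through Multimodal Large Language Models (2026), https://huggingface.co/papers/2605.02912
* LATERN: Test-Time Context-Aware Explainable Video Anomaly Detection (2026), https://huggingface.co/papers/2605.15054
* Detecting Time Series Anomalies Like an Expert: A Multi-Agent LLM Framework with Specialized Analyzers (2026), https://huggingface.co/papers/2605.05725
* AnomalyAgent: Training-Free Agentic Models for Zero-/Few-Shot Anomaly Detection (2026), https://huggingface.co/papers/2605.30140
* IndusAgent: Reinforcing Open-Vocabulary Industrial Anomaly Detection with Agentic Tools (2026), https://huggingface.co/papers/2605.20682
* CoReVAD: A Contextual Reasoning Framework for Training-Free Video Anomaly Detection (2026), https://huggingface.co/papers/2605.23116
* Anomaly-Aware Vision-Language Adapters for Zero-Shot Anomaly Detection (2026), https://huggingface.co/papers/2605.12069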

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space: https://huggingface.co/spaces/librarian-bots/recommend_similar_papers

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
