Tag

Security

500 articles archived under #security

arXiv — NLP / Computation & Language research 18d ago

SkillChain: Closing the Loop on Skill Evolution for Image-Based E-Commerce AI Assistants

arXiv:2606.12984v1 Announce Type: new Abstract: Image-based AI assistants are now deployed at production scale on e-commerce platforms, where a single uploaded image can trigger fundamentally different user intents: product search, style recommendation, visual encyclopedia, or…

18
arXiv — NLP / Computation & Language research 18d ago

No Hidden Prompts Needed! You Can Game AI Peer Review with Presentation-Only Revisions

arXiv:2606.13044v1 Announce Type: new Abstract: As AI-generated reviews move from experimental tools into peer-review infrastructure, most robustness concerns have focused on explicit attacks such as hidden instructions and prompt injection. We study a harder and more…

32
arXiv — NLP / Computation & Language research 18d ago

NTS-CoT: Mitigating Hallucinations in LLM-based News Timeline Summarization with Chain-of-Thought Reasoning

arXiv:2606.13171v1 Announce Type: new Abstract: The rapid updates of online news make tracking event developments challenging, highlighting the need for timeline summarization (TLS). Hallucinations, where LLM-generated content deviates from source news, still remain a critical…

5
arXiv — NLP / Computation & Language research 18d ago

SkMTEB: Slovak Massive Text Embedding Benchmark and Model Adaptation

arXiv:2606.13647v1 Announce Type: new Abstract: We introduce SkMTEB, the first comprehensive MTEB-style text embedding benchmark for Slovak, a low-resource West Slavic language, comprising 31 datasets across 7 task types -- nearly 4$\times$ the depth of existing multilingual…

25
arXiv — NLP / Computation & Language research 18d ago

Two Wrongs, No Right: Auditing Social-Desirability Bias in LLM Annotators for Computational Social Science

arXiv:2606.12426v1 Announce Type: cross Abstract: LLM annotators are increasingly used in computational social science (CSS), but it is unclear whether their alignment-shaped errors preserve the empirical conclusions a researcher would report. We audit three open-source 7B…

10
arXiv — NLP / Computation & Language research 18d ago

Multi-Bitwidth Quantization for LLMs Using Additive Codebooks

arXiv:2606.12876v1 Announce Type: cross Abstract: As large language models (LLMs) are increasingly deployed across heterogeneous hardware with varying resource constraints, the ability to adaptively manage the trade-off between performance and efficiency without retraining is…

9
arXiv — NLP / Computation & Language research 18d ago

Zero-source LLM Hallucination Detection with Human-like Criteria Probing

arXiv:2606.12900v1 Announce Type: cross Abstract: Large language models (LLMs) often hallucinate by generating factually incorrect or unfaithful content, posing significant risks to their safe use. Detecting such hallucinations is particularly challenging under the zero-source…

30
Hacker News — AI on Front Page community 18d ago

Codex for open source

Article URL: https://openai.com/form/codex-for-oss/ Comments URL: https://news.ycombinator.com/item?id=48497195 Points: 216 # Comments: 74

11
Hacker News — AI on Front Page community 18d ago

The RCE that AMD wouldn't fix

See also https://www.youtube.com/watch?v=4HjWHNLRMB0 Related: The RCE that AMD won't fix - https://news.ycombinator.com/item?id=46906947 - Feb 2026 (173 comments) Comments URL: https://news.ycombinator.com/item?id=48492215 Points: 225 # Comments: 99

7
Hacker News — AI on Front Page community 18d ago

MiMo Code is now released and open-source

Article URL: https://mimo.xiaomi.com/mimocode Comments URL: https://news.ycombinator.com/item?id=48490826 Points: 259 # Comments: 134

37
r/MachineLearning community 18d ago

Is Symbolic Regression still a thing, given LLMs' performance? [D]

I've been teaching myself about Symbolic Regression (SR), which looks like a super exciting field. (A great intro resource below [1]). But then I was wondering: given LLMs' increasingly-growing power in generating code, which is in a way very similar to Symbolic Regression (or…

21
r/LocalLLaMA community 18d ago

Cognitor: open-source semantic search engine. Automatically chunks, embeds and indexes the content of a target folder, making it searchable semantically.

https://github.com/tanaos/cognitor Cognitor is an open-source semantic search engine and vector database which automatically chunks, embeds and indexes the entire content of a target folder (and its subfolders), making it easily searchable by both AI agents and humans.…

15
r/LocalLLaMA community 18d ago

How I implemented ASR bias for voice transcription models [Open Source]

I've been spending the last couple of weeks building a Wispr Flow clone as an open source project. For context, it is a voice dictation app that lets you type faster, by speaking instead of actually typing. I spent the first week building the basic STT capabilities. One of the…

29
Hugging Face Daily Papers research 18d ago

Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code

Abstract Grammar-constrained decoding techniques used to ensure syntactic validity in code generation can be exploited as an attack surface, leading to the development of a jailbreak method called CodeSpear and a safety alignment approach named CodeShield. Generated by…

37
arXiv — Machine Learning research 19d ago

A prior-free blind detection of information leakage from model predictions

arXiv:2606.11267v1 Announce Type: new Abstract: Data leakage -- contamination of a model with information unavailable at baseline -- is the dominant reproducibility failure in machine-learning-based science, yet detection tools require training code, external data, or domain…

25
arXiv — Machine Learning research 19d ago

Bergson: An Open Source Library for Data Attribution

arXiv:2606.11660v1 Announce Type: new Abstract: Data attribution is a promising field in interpretability that aims to explain model behavior through the influence of its training data, with applications including debugging undesirable model behavior and training dataset…

26
arXiv — Machine Learning research 19d ago

Reinforcement Learning Disrupts Gradient-Based Adversarial Optimization

arXiv:2606.12251v1 Announce Type: new Abstract: Gradient-based adversarial attacks remain a dominant threat to deep neural networks (DNNs), as they exploit gradient information to efficiently optimize adversarial perturbations. To address this, we investigate whether…

27
arXiv — NLP / Computation & Language research 19d ago

SOMA-SQL: Resolving Multi-Source Ambiguity in NL-to-SQL via Synthetic Log and Execution Probing

arXiv:2606.11424v1 Announce Type: new Abstract: Natural language interfaces to databases aim to translate user questions into executable SQL, yet remain brittle in real-world settings where questions are underspecified and schemas are large and ambiguous. Ambiguity across user…

6
arXiv — NLP / Computation & Language research 19d ago

Lius: Translation Model Based Instructional Lingustic Using Continual Instruction Tuning In Kupang Malay

arXiv:2606.11786v1 Announce Type: new Abstract: Large Language Models (LLMs) offer new potential for translation tasks but often experience performance degradation when handling low-resource languages. To address this limitation, we propose an approach for fine-tuning LLMs on a…

37
arXiv — NLP / Computation & Language research 19d ago

GraspLLM: Towards Zero-Shot Generalization on Text-Attributed Graphs with LLMs

arXiv:2606.11898v1 Announce Type: new Abstract: Research on Text-Attributed Graphs (TAGs) has gained significant attention recently due to its broad applications across various real-world data scenarios, such as citation networks, e-commerce platforms, social media, and web…

35
arXiv — NLP / Computation & Language research 19d ago

Semantic Grading of Written Answers in Low-Resource Language Bangla Using a Fine-Tuned Lightweight Language Model

arXiv:2606.11931v1 Announce Type: new Abstract: Bangla is among the world's most widely spoken languages, yet it remains underserved in educational NLP research. In many remote and rural regions, access to qualified subject teachers is limited, and written answers are…

14
arXiv — NLP / Computation & Language research 19d ago

A Resource for Enthymeme Detection in Controversial Political Discourse

arXiv:2606.12186v1 Announce Type: new Abstract: Enthymemes, arguments with unstated premises or conclusions, are pervasive in persuasive discourse, yet their annotation remains notoriously subjective. We present a resource of 1,482 tweets from politically controversial…

6
arXiv — NLP / Computation & Language research 19d ago

From Explicit Elements to Implicit Intent: A Predefined Library for Auditable Behavioral Inference

arXiv:2606.11207v1 Announce Type: cross Abstract: We present SemantiClean, a modular framework for extracting structured semantic signals from e-commerce session data and driving pluggable inference targets including purchase intent, customer segmentation, and product affinity…

5
arXiv — NLP / Computation & Language research 19d ago

Gumbel-BEARD: Automatic Layer Selection for Self-Supervised Adaptation of Whisper in Low-Resource Domains

arXiv:2606.11429v1 Announce Type: cross Abstract: Speech foundation models often struggle in low-resource domains due to domain mismatch and data scarcity. We propose Gumbel-BEARD, a domain adaptation framework that automates Whisper encoder layer selection via an end-to-end…

29
Hugging Face Daily Papers research 19d ago

Lius: Translation Model Based Instructional Lingustic Using Continual Instruction Tuning In Kupang Malay

Abstract Continual Instruction Tuning enables effective fine-tuning of large language models for low-resource language translation, achieving superior performance compared to standard instruction tuning and multilingual models. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Large…

4
Zed Editor dev-tools 19d ago

Software Is Made Between Commits

Agents turned the conversation into the real source of our software. DeltaDB is the version control built for it.

21
r/MachineLearning community 19d ago

Pyrecall open source tool for detecting catastrophic forgetting during LLM fine-tuning [P]

Surprised there's no real tooling for this given how much research exists on continual learning. Built pyrecall to fill the gap. Snapshots skill scores before/after fine-tuning, flags regressions, rolls back LoRA adapters by name. Fully local, no external APIs. v0.1.0, MIT, pip…

17
r/LocalLLaMA community 19d ago

Best Open-Source AI coding model for my specs?

hello everyone! im looking for the most powerful open-source coding ai while still fitting my system my specs: CPU: AMD ryzen 7 7700 GPU: RTX 5070 RAM: 32 gb DDR5 OS: windows 11 use case: Writing, Coding, debugging. any recommendations would be great. thanks in advance  …

4
GitHub Blog — AI & ML official-blog 19d ago

Give GitHub Copilot CLI real code intelligence with language servers

Install and configure LSP servers for GitHub Copilot CLI, replacing brute-force grep/decompile with real code intelligence.

34
Hugging Face Daily Papers research 19d ago

Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests

Abstract CapCode framework uses randomized testing with performance caps to detect and prevent shortcut exploitation in agent evaluation, while CapReward rewards systems that adhere to intended task specifications. Generated by Qwen/Qwen2.5-Coder-32B-Instruct A growing failure…

21
r/LocalLLaMA community 19d ago

Cohere released North Mini Code: Its first Open-Source Agentic Coding Model

Small: 30 billion parameters, 3B active. Efficient: Benchmarks to 33.4 on the Artificial Analysis Coding Index, competitive among similar sized models. Open Source: Apache 2.0 license HF: https://huggingface.co/CohereLabs/North-Mini-Code-1.0

8
r/MachineLearning community 19d ago

Introducing Papers Without Code [P]

Hi, Niels here from the open-source team at Hugging Face. I've recently relaunched paperswithcode.co as a source for finding the state of the art (SOTA) across various AI domains, from 3D generation to AI agents. This is done by automatically parsing research papers published on…

36
Hugging Face Daily Papers research 20d ago

Kwai Keye-VL-2.0 Technical Report

Abstract Kwai Keye-VL-2.0-30B-A3B is an open-source Mixture-of-Experts multimodal foundation model that enables long-video understanding and agentic intelligence through DeepSeek Sparse Attention and specialized training infrastructure. Generated by…

36
Hugging Face Daily Papers research 20d ago

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

Abstract A psychologically-informed refusal framework called PsychoSafe is developed for large language models to improve harmful request handling through structured supportive communication, showing enhanced refusal quality and resource referral while maintaining performance on…

14
r/LocalLLaMA community 20d ago

Without open source LLMs, US AI companies could have already monopoled the technology

For such technology with clear importance and impact on all of us, I believe that making it open source is an ethical duty, otherwise, especially with the 1-sided politics of the US we experience today, they could have already monopoled the technology by now, maybe make it…

22
arXiv — Machine Learning research 20d ago

QSplitFL: Capability Aware Deep Q-Learning for Optimal Split Point Selection in Split Federated Learning

arXiv:2606.09869v1 Announce Type: new Abstract: Federated Learning (FL) combined with Split Learning (SL) is a privacy preserving paradigm that enables training deep neural networks (DNNs) on resource constrained devices while reducing overall training cost. However, determining…

22
arXiv — Machine Learning research 20d ago

LMT: A Bayesian Framework for Causal Discovery from Textual Alarm Records in Manufacturing Systems

arXiv:2606.09892v1 Announce Type: new Abstract: Textual event records, such as alarm logs, have become an increasingly common data source in engineering and manufacturing systems. Beyond identifying correlations or recurring patterns, engineers are often interested in…

31
arXiv — Machine Learning research 20d ago

A Source Domain is All You Need: Source-Only Cross-OS Transfer Learning for APT Anomaly Detection via Semantic Alignment and Optimal Transport

arXiv:2606.10216v1 Announce Type: new Abstract: Advanced Persistent Threats (APTs) are stealthy, multi-stage cyberattacks whose detection is difficult due to scarce labeled traces, severe class imbalance, and the challenge of generating realistic malicious behavior. These…

12
arXiv — Machine Learning research 20d ago

Baseline-Free Policy Optimization for Neural Combinatorial Optimization

arXiv:2606.10321v1 Announce Type: new Abstract: Neural combinatorial optimization (NCO) trains autoregressive policies to solve routing problems. The standard training algorithm, REINFORCE with a rollout baseline, requires maintaining and periodically updating a frozen copy of…

13
arXiv — NLP / Computation & Language research 20d ago

OpenRTLSet: A Fully Open-Source Dataset for Large Language Model-based Verilog Module Design

arXiv:2606.10285v1 Announce Type: new Abstract: OpenRTLSet introduces the largest fully open-source dataset for hardware design, offering over 131,000 diverse Verilog code samples to the research community and industry. Our dataset uniquely combines Verilog code from GitHub…

6
arXiv — NLP / Computation & Language research 20d ago

Provenance-Grounded Gating and Adaptive Recovery in Synthetic Post-Training Data Curation

arXiv:2606.11127v1 Announce Type: new Abstract: Synthetic post-training pipelines commonly filter generated samples with reward models or holistic LLM judges, yet two practices remain rarely examined together: whether the filtering signal is grounded in the source evidence that…

9
arXiv — NLP / Computation & Language research 20d ago

LLM-Based Code Documentation Generation and Multi-Judge Evaluation

arXiv:2606.09852v1 Announce Type: cross Abstract: High-quality source code documentation is vital yet often neglected, especially in critical domains like healthcare where reliability and maintainability are essential. We presented an AI powered framework that automates…

36
arXiv — NLP / Computation & Language research 20d ago

Advancing the State-of-the-Art in Empirical Privacy Auditing

arXiv:2606.10481v1 Announce Type: cross Abstract: Parameter-efficient fine-tuning of large language models (LLMs) can exhibit problematic memorization of individual training examples. Empirical privacy auditing (EPA) quantifies this risk by measuring realistic data leakage on…

23
r/LocalLLaMA community 20d ago

Without open llm competition, closed source LLM companies will become insatiable.

I can't imagine how arrogant one must be to make such a decision. People pay $200 a month for Anthropic to mess with their codebase. Imagine how they would humiliate their customers if the world didn't have an open-source model.…

6
r/LocalLLaMA community 20d ago

zai-org/SCAIL-2 · Hugging Face

SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning SCAIL-2 is an open-source model for end-to-end controlled character animation . It animates a reference character with a driving video, and also supports character replacement and…

15
Hugging Face Daily Papers research 20d ago

Where Rectified Flows Leak: Characterising Membership Signals Along the Interpolation Path

Abstract Rectified Flows retain subtle training data traces that accumulate during training and can be exploited for membership inference attacks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Understanding what generative models retain from training data remains challenging,…

12
Hugging Face Daily Papers research 20d ago

Reasoning over Grammar: Can Synthetic Linguistic Reasoning Traces Enhance Low-Resource Machine Translation?

Abstract Large language models can improve translation for low-resource languages through structured linguistic reasoning traces, with the most significant benefits occurring during inference rather than training. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Large language…

30
r/LocalLLaMA community 20d ago

Still a VERY lightweight open web-search tool for smaller local LLMs - now with SearXNG support

Hey everyone, TinySearch v0.2.0 (first stable beta) is out. The first version used DuckDuckGo directly, which worked well enough to prove the idea, but yeah.. relying on one search source was way too fragile lol. DDG started throwing limits/CAPTCHAs more often in the last 2…

25
Hugging Face Daily Papers research 21d ago

Hardening Agent Benchmarks with Adversarial Hacker-Fixer Loops

Abstract Researchers identify widespread vulnerabilities in agent benchmark verification systems and develop an automated iterative process using LLM agents to create robust verifiers that resist exploitation while maintaining legitimate task performance. Generated by…

20
r/LocalLLaMA community 21d ago

Have we reached the point where open-source LLMs are “just good enough”?

The question I’m asking myself is whether open-source LLMs are now “ just good enough ” to meet 95% of requirements. I know, of course, that they still need to and will get even better, but where does the added value of the remaining 5% come from? a) Better answer quality? Okay,…

19
