Security: 500 articles archived under #security
arXiv — NLP / Computation & Language research 18d ago SkillChain: Closing the Loop on Skill Evolution for Image-Based E-Commerce AI Assistants arXiv:2606.12984v1 Announce Type: new Abstract: Image-based AI assistants are now deployed at production scale on e-commerce platforms, where a single uploaded image can trigger fundamentally different user intents: product search, style recommendation, visual encyclopedia, or… 18 arXiv — NLP / Computation & Language research 18d ago No Hidden Prompts Needed! You Can Game AI Peer Review with Presentation-Only Revisions arXiv:2606.13044v1 Announce Type: new Abstract: As AI-generated reviews move from experimental tools into peer-review infrastructure, most robustness concerns have focused on explicit attacks such as hidden instructions and prompt injection. We study a harder and more… 32 arXiv — NLP / Computation & Language research 18d ago NTS-CoT: Mitigating Hallucinations in LLM-based News Timeline Summarization with Chain-of-Thought Reasoning arXiv:2606.13171v1 Announce Type: new Abstract: The rapid updates of online news make tracking event developments challenging, highlighting the need for timeline summarization (TLS). 
Hallucinations, where LLM-generated content deviates from source news, still remain a critical… 5 arXiv — NLP / Computation & Language research 18d ago SkMTEB: Slovak Massive Text Embedding Benchmark and Model Adaptation arXiv:2606.13647v1 Announce Type: new Abstract: We introduce SkMTEB, the first comprehensive MTEB-style text embedding benchmark for Slovak, a low-resource West Slavic language, comprising 31 datasets across 7 task types -- nearly 4$\times$ the depth of existing multilingual… 25 arXiv — NLP / Computation & Language research 18d ago Two Wrongs, No Right: Auditing Social-Desirability Bias in LLM Annotators for Computational Social Science arXiv:2606.12426v1 Announce Type: cross Abstract: LLM annotators are increasingly used in computational social science (CSS), but it is unclear whether their alignment-shaped errors preserve the empirical conclusions a researcher would report. We audit three open-source 7B… 10 arXiv — NLP / Computation & Language research 18d ago Multi-Bitwidth Quantization for LLMs Using Additive Codebooks arXiv:2606.12876v1 Announce Type: cross Abstract: As large language models (LLMs) are increasingly deployed across heterogeneous hardware with varying resource constraints, the ability to adaptively manage the trade-off between performance and efficiency without retraining is… 9 arXiv — NLP / Computation & Language research 18d ago Zero-source LLM Hallucination Detection with Human-like Criteria Probing arXiv:2606.12900v1 Announce Type: cross Abstract: Large language models (LLMs) often hallucinate by generating factually incorrect or unfaithful content, posing significant risks to their safe use. 
Detecting such hallucinations is particularly challenging under the zero-source… 30 Hacker News — AI on Front Page community 18d ago Codex for open source Article URL: https://openai.com/form/codex-for-oss/ Comments URL: https://news.ycombinator.com/item?id=48497195 Points: 216 # Comments: 74 11 Hacker News — AI on Front Page community 18d ago The RCE that AMD wouldn't fix See also https://www.youtube.com/watch?v=4HjWHNLRMB0 Related: The RCE that AMD won't fix - https://news.ycombinator.com/item?id=46906947 - Feb 2026 (173 comments) Comments URL: https://news.ycombinator.com/item?id=48492215 Points: 225 # Comments: 99 7 Hacker News — AI on Front Page community 18d ago MiMo Code is now released and open-source Article URL: https://mimo.xiaomi.com/mimocode Comments URL: https://news.ycombinator.com/item?id=48490826 Points: 259 # Comments: 134 37 r/MachineLearning community 18d ago Is Symbolic Regression still a thing, given LLMs' performance? [D] I've been teaching myself about Symbolic Regression (SR), which looks like a super exciting field. (A great intro resource below [1]). But then I was wondering: given LLMs' increasingly-growing power in generating code, which is in a way very similar to Symbolic Regression (or… 21 r/LocalLLaMA community 18d ago Cognitor: open-source semantic search engine. Automatically chunks, embeds and indexes the content of a target folder, making it searchable semantically. https://github.com/tanaos/cognitor Cognitor is an open-source semantic search engine and vector database which automatically chunks, embeds and indexes the entire content of a target folder (and its subfolders), making it easily searchable by both AI agents and humans.… 15 r/LocalLLaMA community 18d ago How I implemented ASR bias for voice transcription models [Open Source] I've been spending the last couple of weeks building a Wispr Flow clone as an open source project. 
For context, it is a voice dictation app that lets you type faster, by speaking instead of actually typing. I spent the first week building the basic STT capabilities. One of the… 29 Hugging Face Daily Papers research 18d ago Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code Abstract Grammar-constrained decoding techniques used to ensure syntactic validity in code generation can be exploited as an attack surface, leading to the development of a jailbreak method called CodeSpear and a safety alignment approach named CodeShield. Generated by… 37 arXiv — Machine Learning research 19d ago A prior-free blind detection of information leakage from model predictions arXiv:2606.11267v1 Announce Type: new Abstract: Data leakage -- contamination of a model with information unavailable at baseline -- is the dominant reproducibility failure in machine-learning-based science, yet detection tools require training code, external data, or domain… 25 arXiv — Machine Learning research 19d ago Bergson: An Open Source Library for Data Attribution arXiv:2606.11660v1 Announce Type: new Abstract: Data attribution is a promising field in interpretability that aims to explain model behavior through the influence of its training data, with applications including debugging undesirable model behavior and training dataset… 26 arXiv — Machine Learning research 19d ago Reinforcement Learning Disrupts Gradient-Based Adversarial Optimization arXiv:2606.12251v1 Announce Type: new Abstract: Gradient-based adversarial attacks remain a dominant threat to deep neural networks (DNNs), as they exploit gradient information to efficiently optimize adversarial perturbations. 
To address this, we investigate whether… 27 arXiv — NLP / Computation & Language research 19d ago SOMA-SQL: Resolving Multi-Source Ambiguity in NL-to-SQL via Synthetic Log and Execution Probing arXiv:2606.11424v1 Announce Type: new Abstract: Natural language interfaces to databases aim to translate user questions into executable SQL, yet remain brittle in real-world settings where questions are underspecified and schemas are large and ambiguous. Ambiguity across user… 6 arXiv — NLP / Computation & Language research 19d ago Lius: Translation Model Based Instructional Lingustic Using Continual Instruction Tuning In Kupang Malay arXiv:2606.11786v1 Announce Type: new Abstract: Large Language Models (LLMs) offer new potential for translation tasks but often experience performance degradation when handling low-resource languages. To address this limitation, we propose an approach for fine-tuning LLMs on a… 37 arXiv — NLP / Computation & Language research 19d ago GraspLLM: Towards Zero-Shot Generalization on Text-Attributed Graphs with LLMs arXiv:2606.11898v1 Announce Type: new Abstract: Research on Text-Attributed Graphs (TAGs) has gained significant attention recently due to its broad applications across various real-world data scenarios, such as citation networks, e-commerce platforms, social media, and web… 35 arXiv — NLP / Computation & Language research 19d ago Semantic Grading of Written Answers in Low-Resource Language Bangla Using a Fine-Tuned Lightweight Language Model arXiv:2606.11931v1 Announce Type: new Abstract: Bangla is among the world's most widely spoken languages, yet it remains underserved in educational NLP research. 
In many remote and rural regions, access to qualified subject teachers is limited, and written answers are… 14 arXiv — NLP / Computation & Language research 19d ago A Resource for Enthymeme Detection in Controversial Political Discourse arXiv:2606.12186v1 Announce Type: new Abstract: Enthymemes, arguments with unstated premises or conclusions, are pervasive in persuasive discourse, yet their annotation remains notoriously subjective. We present a resource of 1,482 tweets from politically controversial… 6 arXiv — NLP / Computation & Language research 19d ago From Explicit Elements to Implicit Intent: A Predefined Library for Auditable Behavioral Inference arXiv:2606.11207v1 Announce Type: cross Abstract: We present SemantiClean, a modular framework for extracting structured semantic signals from e-commerce session data and driving pluggable inference targets including purchase intent, customer segmentation, and product affinity… 5 arXiv — NLP / Computation & Language research 19d ago Gumbel-BEARD: Automatic Layer Selection for Self-Supervised Adaptation of Whisper in Low-Resource Domains arXiv:2606.11429v1 Announce Type: cross Abstract: Speech foundation models often struggle in low-resource domains due to domain mismatch and data scarcity. We propose Gumbel-BEARD, a domain adaptation framework that automates Whisper encoder layer selection via an end-to-end… 29 Hugging Face Daily Papers research 19d ago Lius: Translation Model Based Instructional Lingustic Using Continual Instruction Tuning In Kupang Malay Abstract Continual Instruction Tuning enables effective fine-tuning of large language models for low-resource language translation, achieving superior performance compared to standard instruction tuning and multilingual models. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Large… 4 Zed Editor dev-tools 19d ago Software Is Made Between Commits Agents turned the conversation into the real source of our software. DeltaDB is the version control built for it. 
21 r/MachineLearning community 19d ago Pyrecall open source tool for detecting catastrophic forgetting during LLM fine-tuning [P] Surprised there's no real tooling for this given how much research exists on continual learning. Built pyrecall to fill the gap. Snapshots skill scores before/after fine-tuning, flags regressions, rolls back LoRA adapters by name. Fully local, no external APIs. v0.1.0, MIT, pip… 17 r/LocalLLaMA community 19d ago Best Open-Source AI coding model for my specs? hello everyone! im looking for the most powerful open-source coding ai while still fitting my system my specs: CPU: AMD ryzen 7 7700 GPU: RTX 5070 RAM: 32 gb DDR5 OS: windows 11 use case: Writing, Coding, debugging. any recommendations would be great. thanks in advance … 4 GitHub Blog — AI & ML official-blog 19d ago Give GitHub Copilot CLI real code intelligence with language servers Install and configure LSP servers for GitHub Copilot CLI, replacing brute-force grep/decompile with real code intelligence. 34 Hugging Face Daily Papers research 19d ago Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests Abstract CapCode framework uses randomized testing with performance caps to detect and prevent shortcut exploitation in agent evaluation, while CapReward rewards systems that adhere to intended task specifications. Generated by Qwen/Qwen2.5-Coder-32B-Instruct A growing failure… 21 r/LocalLLaMA community 19d ago Cohere released North Mini Code: It's first Open-Source Agentic Coding Model Small: 30 billion parameters, 3B active. Efficient: Benchmarks to 33.4 on the Artificial Analysis Coding Index, competitive among similar sized models. 
Open Source: Apache 2.0 license HF: https://huggingface.co/CohereLabs/North-Mini-Code-1.0   submitted by  … 8 r/MachineLearning community 19d ago Introducing Papers Without Code [P] Hi, Niels here from the open-source team at Hugging Face. I've recently relaunched paperswithcode.co as a source for finding the state of the art (SOTA) across various AI domains, from 3D generation to AI agents. This is done by automatically parsing research papers published on… 36 Hugging Face Daily Papers research 20d ago Kwai Keye-VL-2.0 Technical Report Abstract Kwai Keye-VL-2.0-30B-A3B is an open-source Mixture-of-Experts multimodal foundation model that enables long-video understanding and agentic intelligence through DeepSeek Sparse Attention and specialized training infrastructure. Generated by… 36 Hugging Face Daily Papers research 20d ago PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models Abstract A psychologically-informed refusal framework called PsychoSafe is developed for large language models to improve harmful request handling through structured supportive communication, showing enhanced refusal quality and resource referral while maintaining performance on… 14 r/LocalLLaMA community 20d ago Without open source LLMs, US AI companies could have already monopoled the technology For such technology with clear importance and impact on all of us, I believe that making it open source is an ethical duty, otherwise, especially with the 1-sided politics of the US we experience today, they could have already monopoled the technology by now, maybe make it… 22 arXiv — Machine Learning research 20d ago QSplitFL: Capability Aware Deep Q-Learning for Optimal Split Point Selection in Split Federated Learning arXiv:2606.09869v1 Announce Type: new Abstract: Federated Learning (FL) combined with Split Learning (SL) is a privacy preserving paradigm that enables training deep neural networks (DNNs) on resource constrained devices while reducing overall training 
cost. However, determining… 22 arXiv — Machine Learning research 20d ago LMT: A Bayesian Framework for Causal Discovery from Textual Alarm Records in Manufacturing Systems arXiv:2606.09892v1 Announce Type: new Abstract: Textual event records, such as alarm logs, have become an increasingly common data source in engineering and manufacturing systems. Beyond identifying correlations or recurring patterns, engineers are often interested in… 31 arXiv — Machine Learning research 20d ago A Source Domain is All You Need: Source-Only Cross-OS Transfer Learning for APT Anomaly Detection via Semantic Alignment and Optimal Transport arXiv:2606.10216v1 Announce Type: new Abstract: Advanced Persistent Threats (APTs) are stealthy, multi-stage cyberattacks whose detection is difficult due to scarce labeled traces, severe class imbalance, and the challenge of generating realistic malicious behavior. These… 12 arXiv — Machine Learning research 20d ago Baseline-Free Policy Optimization for Neural Combinatorial Optimization arXiv:2606.10321v1 Announce Type: new Abstract: Neural combinatorial optimization (NCO) trains autoregressive policies to solve routing problems. The standard training algorithm, REINFORCE with a rollout baseline, requires maintaining and periodically updating a frozen copy of… 13 arXiv — NLP / Computation & Language research 20d ago OpenRTLSet: A Fully Open-Source Dataset for Large Language Model-based Verilog Module Design arXiv:2606.10285v1 Announce Type: new Abstract: OpenRTLSet introduces the largest fully open-source dataset for hardware design, offering over 131,000 diverse Verilog code samples to the research community and industry. 
Our dataset uniquely combines Verilog code from GitHub… 6 arXiv — NLP / Computation & Language research 20d ago Provenance-Grounded Gating and Adaptive Recovery in Synthetic Post-Training Data Curation arXiv:2606.11127v1 Announce Type: new Abstract: Synthetic post-training pipelines commonly filter generated samples with reward models or holistic LLM judges, yet two practices remain rarely examined together: whether the filtering signal is grounded in the source evidence that… 9 arXiv — NLP / Computation & Language research 20d ago LLM-Based Code Documentation Generation and Multi-Judge Evaluation arXiv:2606.09852v1 Announce Type: cross Abstract: High-quality source code documentation is vital yet often neglected, especially in critical domains like healthcare where reliability and maintainability are essential. We presented an AI powered framework that automates… 36 arXiv — NLP / Computation & Language research 20d ago Advancing the State-of-the-Art in Empirical Privacy Auditing arXiv:2606.10481v1 Announce Type: cross Abstract: Parameter-efficient fine-tuning of large language models (LLMs) can exhibit problematic memorization of individual training examples. Empirical privacy auditing (EPA) quantifies this risk by measuring realistic data leakage on… 23 r/LocalLLaMA community 20d ago Without open llm competition, closed source LLM companies will become insatiable. I can't imagine how arrogant one must be to make such a decision. People pay $200 a month for Anthropic to mess with their codebase. Imagine how they would humiliate their customers if the world didn't have an open-source model.… 6 r/LocalLLaMA community 20d ago zai-org/SCAIL-2 · Hugging Face SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning SCAIL-2 is an open-source model for end-to-end controlled character animation . 
It animates a reference character with a driving video, and also supports character replacement and… 15 Hugging Face Daily Papers research 20d ago Where Rectified Flows Leak: Characterising Membership Signals Along the Interpolation Path Abstract Rectified Flows retain subtle training data traces that accumulate during training and can be exploited for membership inference attacks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Understanding what generative models retain from training data remains challenging,… 12 Hugging Face Daily Papers research 20d ago Reasoning over Grammar: Can Synthetic Linguistic Reasoning Traces Enhance Low-Resource Machine Translation? Abstract Large language models can improve translation for low-resource languages through structured linguistic reasoning traces, with the most significant benefits occurring during inference rather than training. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Large language… 30 r/LocalLLaMA community 20d ago Still a VERY lightweight open web-search tool for smaller local LLMs - now with SearXNG support Hey everyone, TinySearch v0.2.0 (first stable beta) is out. The first version used DuckDuckGo directly, which worked well enough to prove the idea, but yeah.. relying on one search source was way too fragile lol. DDG started throwing limits/CAPTCHAs more often in the last 2… 25 Hugging Face Daily Papers research 21d ago Hardening Agent Benchmarks with Adversarial Hacker-Fixer Loops Abstract Researchers identify widespread vulnerabilities in agent benchmark verification systems and develop an automated iterative process using LLM agents to create robust verifiers that resist exploitation while maintaining legitimate task performance. Generated by… 20 r/LocalLLaMA community 21d ago Have we reached the point where open-source LLMs are “just good enough”? The question I’m asking myself is whether open-source LLMs are now “ just good enough ” to meet 95% of requirements. 
I know, of course, that they still need to and will get even better, but where does the added value of the remaining 5% come from? a) Better answer quality? Okay,… 19