Tag: Developer Tool
500 articles archived under #developer-tool
arXiv — NLP / Computation & Language research 25d ago Measuring the sensitivity of LLM-based structured extraction to prompt, model, and schema choices in clinical discharge summaries arXiv:2606.05970v1 Announce Type: new Abstract: Large language models are increasingly used for structured extraction from clinical free-text notes, but the sensitivity of their output to upstream configuration choices is less understood than their accuracy on fixed benchmarks.… 23
Hacker News — AI on Front Page community 25d ago Open Code Review – An AI-powered code review CLI tool Article URL: https://github.com/alibaba/open-code-review Comments URL: https://news.ycombinator.com/item?id=48406358 Points: 233 # Comments: 66 32
Hugging Face Daily Papers research 25d ago Evaluating Large Language Models in Dynamic Clinical Decision-Making with Standardized Patient Cases Abstract MedSP1000 introduces an interactive benchmark derived from standardized patients to evaluate clinical agents' dynamic performance across encounters, revealing limitations of current large language models in medical applications. Generated by… 18
llama.cpp releases dev-tools 25d ago b9503 fix(mtmd): handle Gemma 4 audio projector embedding size (#24091) mtmd: handle Gemma 4 audio projector embedding size rm projection_dim from clip_n_mmproj_embd Co-authored-by: Xuan Son Nguyen [email protected] macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64,… 28
arXiv — Machine Learning research 26d ago Early Detection of Alzheimer's Disease Using Explainable Machine Learning on Clinical Biomarkers: A Multi-Class Classification Study Using the Alzheimer's Disease Neuroimaging Initiative (ADNI) Dataset arXiv:2606.03995v1 Announce Type: new Abstract: Background: Alzheimer's disease (AD) affects over 55 million people worldwide.
Accurate, interpretable detection of normal cognition (NC), mild cognitive impairment (MCI), and AD from routine clinical assessments remains a critical… 14
arXiv — Machine Learning research 26d ago KODA: Contrastive Representation Comparison and Alignment for Vision-Language Foundation Models arXiv:2606.04180v1 Announce Type: new Abstract: Vision-language foundation models such as CLIP and SigLIP provide widely used representations for multimodal learning systems. While these models are typically compared through downstream performance, such evaluations often do not… 8
arXiv — NLP / Computation & Language research 26d ago When Clients Stop Following: A Cognitive Conceptualization Diagram-driven Framework for Strategic Counseling arXiv:2606.04389v1 Announce Type: new Abstract: Large Language Models (LLMs) show promise in psychological counseling, yet existing benchmarks rely heavily on highly cooperative simulated clients. We observe a critical counselor-following phenomenon: these clients often rapidly… 14
arXiv — NLP / Computation & Language research 26d ago Evaluating Large Language Models in Dynamic Clinical Decision-Making with Standardized Patient Cases arXiv:2606.05112v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly proposed as clinical agents, yet static, single-turn benchmarks cannot capture how a model dynamically delivers care across an encounter: gathering information, planning treatment, and… 32
Hugging Face official-blog 26d ago Designing the hf CLI as an agent-optimized way to work with the Hub Published June 4, 2026 Célina Hanouti (celinah), Lucain Pouget (Wauplin) hf is the official command-line entrypoint to the Hugging Face Hub.
Anything you can do on the Hub… 12
Ollama releases dev-tools 26d ago v0.30.4-rc1: llama-server: fix gemma4 patch wiring (#16477) This will fix the "clip.cpp:4399: Unknown projector type" crash. 4
Ollama releases dev-tools 26d ago v0.30.4: llama-server: fix gemma4 patch wiring (#16477) This will fix the "clip.cpp:4399: Unknown projector type" crash. 38
r/LocalLLaMA community 26d ago How to use audio and vision modalities in llama.cpp? How to use audio and vision modalities in llama.cpp with Gemma4 12B it? I’m on release b9494, but when I run llama-cli it shows “modalities: text” only, and crashes if I try to add an image. submitted by /u/No-Leave-4512 20
Hugging Face Daily Papers research 26d ago KletterMix: Climbing Toward High-Quality German Pretraining Data Abstract A high-quality German-language corpus for language model pretraining is introduced through careful translation of an English corpus while preserving document structure and metadata, demonstrating improved downstream performance in German-language tasks.
Generated by… 28
Hacker News — AI on Front Page community 26d ago Mouseless – keyboard-driven control of macOS/Linux/Windows Article URL: https://mouseless.click Comments URL: https://news.ycombinator.com/item?id=48383667 Points: 223 # Comments: 107 38
arXiv — Machine Learning research 27d ago Auditable Climate Risk Intelligence from Fragmented ESG Data: Deterministic Orchestration and Imbalance-Aware Learning for Scope 1-3 Validation arXiv:2606.02604v1 Announce Type: new Abstract: ESG and climate risk data remain fragmented across heterogeneous Scope 1, Scope 2, and Scope 3 reporting environments, while conventional validation pipelines lack provenance aware auditability, hidden drift detection, and… 9
arXiv — Machine Learning research 27d ago DECA: Decentralizing Block-Wise Adam for Efficient LLM Full-Parameter Fine-Tuning on Non-IID Data arXiv:2606.03209v1 Announce Type: new Abstract: Fine-tuning large language models (LLMs) in privacy-sensitive and resource-constrained environments remains challenging. Since training data are often distributed across multiple clients, decentralized fine-tuning offers a natural… 15
arXiv — Machine Learning research 27d ago Learning Temporal Causal Structure via Smooth Differentiable Optimization arXiv:2606.03227v1 Announce Type: new Abstract: Causal discovery with instantaneous effects in multivariate time series is challenging, as the instantaneous structure must be acyclic. Prior methods enforce this by either separating instantaneous and lagged estimation into… 4
arXiv — Machine Learning research 27d ago Multi-Modal Graph Neural Network with Transformer-Guided Adaptive Diffusion for Preclinical Alzheimer Classification arXiv:2606.03322v1 Announce Type: new Abstract: The graphical representation of the brain offers critical insights into diagnosing and prognosing neurodegenerative disease via relationships between regions of interest (ROIs).
Despite recent emergence of various Graph Neural… 14
arXiv — NLP / Computation & Language research 27d ago AI Rater Discrimination Depends on Scoring Protocol in Complex Clinical Decision-Making arXiv:2606.03198v1 Announce Type: new Abstract: Clinical AI evaluation increasingly delegates scoring to large language models (LLMs) acting as AI raters, yet their scoring behavior across evaluation conditions has not been quantitatively characterized. We address this gap… 17
arXiv — NLP / Computation & Language research 27d ago The Word and the Way: Strategies for Domain-Specific BERT Pre-Training in German Medical NLP arXiv:2606.03250v1 Announce Type: new Abstract: Digital healthcare generates vast amounts of clinical text that can support AI-assisted applications, yet German biomedical language models remain limited by older architectures or restricted training data. We present ChristBERT… 33
arXiv — NLP / Computation & Language research 27d ago SagaQA: A Multi-hop Reasoning Benchmark for Long-form Narrative Understanding in TV Series arXiv:2606.03301v1 Announce Type: new Abstract: We introduce SagaQA, a long-form video benchmark for multi-hop reasoning over full-length TV series. Existing video reasoning benchmarks often emphasize local understanding of adjacent frames or clips. SagaQA addresses this gap by… 33
arXiv — NLP / Computation & Language research 27d ago Selective Token-Level Cryptographic Redaction for Privacy-Preserving Clinical Deployment of Large Language Models arXiv:2606.03399v1 Announce Type: new Abstract: While large language models (LLMs) are increasingly used for clinical applications, many existing pipelines require sending raw sensitive health information to remote servers for processing, which heightens the risk of privacy… 4
arXiv — NLP / Computation & Language research 27d ago Does Language Shift Break Medical Vision-Language Models?
Indonesian Radiology Visual Question Answering Case Study arXiv:2606.03693v1 Announce Type: new Abstract: Medical Vision-Language Models (VLMs) are typically evaluated on English radiology visual question answering benchmarks, leaving their robustness under non-English clinical language largely unexplored. We introduce IndoRad-VQA, an… 10
arXiv — NLP / Computation & Language research 27d ago KletterMix: Climbing Toward High-Quality German Pretraining Data arXiv:2606.03773v1 Announce Type: new Abstract: High-quality pretraining data is a central ingredient in modern language models, but German-language resources remain far less developed than their English counterparts: they are often smaller, less carefully curated, weakly… 19
Hacker News — AI on Front Page community 27d ago MAI-Code-1-Flash https://microsoft.ai/models/mai-code-1-flash/ https://microsoft.ai/pdf/MAI-Code-1-Flash-Model-Card.PDF Launching seven new MAI models: https://microsoft.ai/news/building-a-hillclimbing-machine-la... Comments URL: https://news.ycombinator.com/item?id=48374466 Points: 228 #… 36
Hacker News — AI on Front Page community 27d ago 1-Click GitHub Token Stealing via a VSCode Bug Article URL: https://blog.ammaraskar.com/github-token-stealing/ Comments URL: https://news.ycombinator.com/item?id=48371562 Points: 220 # Comments: 30 4
Vercel — AI dev-tools 28d ago Edit Git settings for all projects in a repo Monorepos that deploy many projects can now configure all of their projects' Git settings more conveniently. Previously, if you wanted to consistently configure each project's settings for commit status, repository_dispatch events, etc., you had to click through to every… 16
Hugging Face Daily Papers research 28d ago Multi-Agent Computer Use Abstract Multi-agent computer use systems outperform single-agent approaches on complex tasks by enabling parallel execution and dynamic task decomposition through directed acyclic graphs.
AI-generated summary Computer use agents (CUAs) today are primarily deployed as single… 18
arXiv — Machine Learning research 28d ago PE-means: Improved Differentially Private $k$-means Clustering through Private Evolution arXiv:2606.00342v1 Announce Type: new Abstract: We study the problem of differentially private (DP) $k$-means clustering in Euclidean space. Previous solutions rely on summing the private data directly, which induces a sensitivity proportional to the domain. We introduce… 17
arXiv — Machine Learning research 28d ago Canonicalized Stable-List Replay for Private Federated Continual Learning over Language-Model Embeddings arXiv:2606.00426v1 Announce Type: new Abstract: Federated continual learning (FCL) lets distributed clients adapt language-model heads to evolving NLP tasks without sharing raw text. Under user-level differential privacy (DP), replay-based continual learning faces a structural… 17
arXiv — NLP / Computation & Language research 28d ago A Multi-Domain Red Teaming Framework for Safety, Robustness, and Fairness Evaluation of Medical Large Language Models arXiv:2606.00027v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly deployed across healthcare, yet existing benchmarks fail to capture model behavior under adversarial or ethically complex conditions common in clinical practice. We developed a… 37
arXiv — NLP / Computation & Language research 28d ago LLMs for Cardiovascular Risk Prediction from Structured Clinical Data arXiv:2606.00031v1 Announce Type: new Abstract: Coronary artery disease (CAD) remains one of the leading causes of death globally, highlighting the need for reliable predictive systems to support early diagnosis and risk assessment.
While traditional machine learning models… 14
arXiv — NLP / Computation & Language research 28d ago LinguIUTics at PsyDefDetect: Iterative Imbalance-Aware Fine-tuning of Qwen3-8B for Psychological Defense Mechanism Classification arXiv:2606.00647v1 Announce Type: new Abstract: Detecting psychological defense mechanisms in conversational text remains a challenging clinical NLP problem. For the PsyDefDetect 2026 shared task (nine-class utterance classification evaluated via macro F1), our team LinguIUTics… 5
arXiv — NLP / Computation & Language research 28d ago Med-HEAL: Analyzing and Mitigating Hallucinations in Medical LLMs with Hallucination-Aware In-Context Learning arXiv:2606.01301v1 Announce Type: new Abstract: Hallucinations in medical large language models (LLMs) pose serious risks for clinical decision support, particularly when models must reason over complex electronic health records (EHRs). However, existing benchmarks often lack a… 8
r/MachineLearning community 28d ago MeshFlow: production-safe multi-agent orchestration — SHA-256 audit chain, HIPAA/SOX/GDPR built in, 70-85% token cost reduction [Open Source][D] 79% of enterprises have adopted AI agents. Only 11% run them in production. We've spent the past year building agent systems for banks, clinical operations teams, and engineering orgs. The problem isn't that agents don't work — they work fine. The problem is that every framework… 12
Vercel — AI dev-tools 28d ago Build Chat SDK web UIs in Vue or Svelte The Chat SDK web adapter now has first-class support for Vue and Svelte, joining the existing React integration. Because the adapter speaks the AI SDK UI message stream protocol, the same server route works. Each framework ships its own useChat, built on the matching AI SDK… 16
Vercel — AI dev-tools 28d ago Build custom Slack runtimes Chat SDK now ships the Slack adapter's primitives as standalone imports for apps that already handle their own routing, state, or workflow execution.
Use only what you need: Request verification and payload parsing (@chat-adapter/slack/webhook) Markdown formatting (… 20
OpenAI Python SDK releases dev-tools 28d ago v2.40.0 2.40.0 (2026-06-01) Full Changelog: v2.39.0...v2.40.0 Features api: Add Amazon Bedrock Responses support Bug Fixes api: allow setting bedrock api keys on the client directly (4d5bfde) 19
Vercel — AI dev-tools 28d ago Chat SDK adds Velt support Chat SDK now supports Velt with the new vendor-official adapter. Build bots that read and reply within Velt comment threads, right where your team already works: documents, text editors, and canvases. Tag the bot, and it will answer in the same thread, grounding its reply with… 24
Vercel — AI dev-tools 28d ago Chat SDK adds AgentPhone support Chat SDK now supports AgentPhone with the new vendor-official adapter. Give your bot its own phone number so it can handle voice calls and text messages using the same handlers you already write. When a call ends, the transcript is delivered as a message, allowing your bot to… 14
Hacker News — AI on Front Page community 28d ago NPM packages from RedHat have been compromised Article URL: https://github.com/RedHatInsights/javascript-clients/issues/492 Comments URL: https://news.ycombinator.com/item?id=48356625 Points: 327 # Comments: 151 37
r/LocalLLaMA community 28d ago MTP is nice and all, but what about PP speeds? I don't know for the rest of you, but with my setup, as soon as I enable MTP, the PP performance and GPU usage drop significantly for some reason. It's not as much a memory issue for me as it is declining performance.
My setup is: 2x Radeon VII 16gb on ROCm, 1x Rtx3080 8gb Max… 28
Hugging Face Daily Papers research 28d ago One Click per Cell Type Suffices: Training-free Group Interaction for Cell Instance Segmentation Abstract Group Prompting enables efficient cell instance segmentation by leveraging per-type prompting through a training-free framework that uses multi-scale encoder features and recursive prompt expansion. AI-generated summary Cell instance segmentation models trained on… 32
Hugging Face Daily Papers research 28d ago How can embedding models bind concepts? Abstract Vision-language models like CLIP struggle with concept binding despite recognizing individual concepts, but controlled transformer models can learn low-complexity binding functions that generalize better through multiplicative interactions. AI-generated summary Humans… 11
r/LocalLLaMA community 29d ago Just found a 1-click RCE in pewdiepie's Odysseus Chat PR being submitted to help the project as we speak. Sound on for extra lols. submitted by /u/theonejvo 7
Vercel — AI dev-tools 29d ago Qwen 3.7 Plus now available on AI Gateway Qwen 3.7 Plus from Alibaba is now available on Vercel AI Gateway. The model unifies vision and language into a single agent foundation, with capabilities spanning GUI and CLI operation, coding and productivity workflows with full-modality input, and visual agent tasks including… 26
arXiv — Machine Learning research 29d ago Gait2Hip-60: A Unified Deep Learning Benchmark for Predicting Hip Muscle Forces and Joint Moments from Multi-Cadence Gait Kinematics arXiv:2605.30374v1 Announce Type: new Abstract: Estimating hip muscle forces and joint moments during gait typically relies on musculoskeletal simulation, which is informative but time-consuming and difficult to apply in clinical settings.
This study developed a deep learning… 10
arXiv — Machine Learning research 29d ago Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents arXiv:2605.30590v1 Announce Type: new Abstract: Two clinical AI systems can score nearly identically on coverage-based rubrics yet behave radically differently when their patient inputs change: one updates its recommendations to match the new clinical signal, while the other… 23
arXiv — NLP / Computation & Language research 29d ago Generalistic or Specific Embeddings, Which is Better? An Empirical Study on Search for Clinical Coding in Non-English Languages arXiv:2605.30529v1 Announce Type: new Abstract: Sentence-embedding models for semantic search are overwhelmingly developed and evaluated on English corpora. When applied to clinical retrieval in other languages -- particularly retrieval of ICD-10-CM / CIE-10 codes -- recall… 26
arXiv — NLP / Computation & Language research 29d ago Same Patient, Different Words, Different Diagnosis? Evaluating Semantic Stability in Clinical LLMs arXiv:2605.30646v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly used in clinical applications. However, their behavior remains highly sensitive to subtle linguistic variations, such as rephrasing or syntactic variation. This sensitivity poses risks… 27