Tag: Developer Tool
500 articles archived under #developer-tool

Hugging Face Daily Papers research 1mo ago ECHO: Terminal Agents Learn World Models for Free Abstract Environment cross-entropy hybrid objective combines policy-gradient loss with auxiliary environment observation prediction to provide dense supervision from terminal feedback, improving agent performance and self-improvement capabilities. AI-generated summary CLI agents… 23

r/LocalLLaMA community 1mo ago Llamacpp server : How do the -np and -c flags interact? I've been using lm studio for a few months. I want to try hermes agents with Qwen 3.6 MoE, so I'm switching to llama.cpp and I don't understand well how the server slots -np and the context size -c interact. The context for each parallel client appears to be equally distributed… 10

arXiv — Machine Learning research 1mo ago Knowledge Graph Modulated Deep Learning for Limited-Sample Clinical Data Analysis arXiv:2605.24162v1 Announce Type: new Abstract: Biological systems are governed by structured molecular interactions, where pathways, regulatory circuits, and functional gene relationships shape cellular behavior and disease progression. Much of this knowledge is naturally… 6

arXiv — Machine Learning research 1mo ago PrivFusion: A Privacy-preserving Multi-Agent Framework for Harmonizing Distributed Datasets arXiv:2605.24249v1 Announce Type: new Abstract: The growing availability of clinical data has increased the use of machine learning, yet centralized data aggregation is often infeasible for sensitive health information. Federated Learning (FL) offers a distributed alternative,… 19

arXiv — Machine Learning research 1mo ago Optimizing Digital Therapeutic Interventions: Online Learning under Endogenous Adherence arXiv:2605.24261v1 Announce Type: new Abstract: A critical challenge facing clinicians managing chronic disease interventions is sustaining long-run patient health given limited information and resources. Digital therapeutics (DTs) provide a cost-effective way to manage… 31

arXiv — Machine Learning research 1mo ago Lake Detection and Water Quality Estimation in Sentinel-2 Data arXiv:2605.24515v1 Announce Type: new Abstract: With climate change and increasing human pressure on natural landscapes, inland water resources are becoming progressively scarcer, more vulnerable, and more difficult to manage sustainably. Reliable and automated methods for… 25

arXiv — Machine Learning research 1mo ago ECHO: Terminal Agents Learn World Models for Free arXiv:2605.24517v1 Announce Type: new Abstract: CLI agents are the closest thing language models have to an embodied setting: the model emits commands, the terminal executes them, and the returned stream -- stdout, errors, files, logs, and traces -- records the consequences. We… 25

arXiv — Machine Learning research 1mo ago Hardware-Aware Federated Learning for Speech Emotion Recognition arXiv:2605.24712v1 Announce Type: new Abstract: Federated learning (FL) enables privacy-preserving collaborative training across distributed edge devices, but real deployments involve heterogeneous clients with different processing power, memory capacity, and communication… 16

arXiv — NLP / Computation & Language research 1mo ago A Multi-Probe Audit of Clinical-Interview Depression Detection Benchmarks arXiv:2605.23977v1 Announce Type: new Abstract: This paper audits benchmark evaluation in clinical-interview depression detection through four complementary probes across DAIC/E-DAIC, CMDC, ANDROIDS, MODMA, and PDCH. First, we re-evaluate E-DAIC under strict subject-disjoint… 27

arXiv — NLP / Computation & Language research 1mo ago When Reasoning Hurts: Source-Aware Evaluation of Frontier LLMs for Clinical SOAP Note Generation arXiv:2605.24902v1 Announce Type: new Abstract: Reasoning-enabled LLMs perform strongly on medical reasoning benchmarks, but it remains unclear whether these gains transfer to structured clinical documentation; we investigate this question using SOAP note generation from… 13

arXiv — NLP / Computation & Language research 1mo ago Overview of the PsyDefDetect Shared Task at BioNLP 2026: Detecting Levels of Psychological Defense Mechanisms in Supportive Conversations arXiv:2605.24907v1 Announce Type: new Abstract: We present an overview of PsyDefDetect, the shared task on detecting levels of psychological defense mechanisms in emotional support dialogues, co-located with BioNLP@ACL 2026. Grounded in the clinically validated Defense Mechanism… 20

arXiv — NLP / Computation & Language research 1mo ago TRACE: A taxonomy-grounded synthetic dataset for teaching-program generation and session interpretation in Applied Behavior Analysis arXiv:2605.25038v1 Announce Type: new Abstract: Applied Behavior Analysis (ABA) is a clinical discipline whose documentation, teaching programs and multi-session behavioral logs, is formulaic and high-volume, yet real session data is HIPAA-protected and bound by professional… 28

arXiv — NLP / Computation & Language research 1mo ago Evidence-Linked Radiology Reporting: A Human-Supervised Reference Architecture for Structured Imaging Intelligence arXiv:2605.25120v1 Announce Type: new Abstract: Radiology reports remain the primary mechanism by which imaging findings are communicated to clinical teams. However, much of the structured information behind these reports, including measurements, image evidence, prior… 8

Hugging Face Daily Papers research 1mo ago Geometry-Aware Image Flow Matching Abstract Geometry-aware generative models leveraging spherical manifolds and optimal transport techniques outperform traditional Euclidean approaches for natural image synthesis. AI-generated summary Recent advances in generative models highlight the power of geometry-aware… 29

Simon Willison community 1mo ago Notes on Pope Leo XIV's encyclical on AI Dropped this morning by the Vatican: Magnifica Humanitas of His Holiness Pope Leo XIV on Safeguarding the Human Person in the Time of Artificial Intelligence. This is a very interesting document. It's some of the clearest writing I've seen on the ethics of integrating AI into… 12

r/LocalLLaMA community 1mo ago AI content detector based on Qwen 0.8b fine-tuned on Pangram dataset I've fine-tuned Qwen 3.5 0.8B on the dataset provided by Pangram with their EditLens paper. It's available via a Chrome extension; you can just click selected text and it's going to give you the probability distribution of how likely it is AI-generated. It takes under 1s on my… 36

r/MachineLearning community 1mo ago Is AI inference platform really that saturated now? [D] I’m thinking of expanding an on-device inference SDK into a full blown AI inference platform and seeing more and more inference platform popping out. Been talking with a VC from Seattle/NY. Is this space really that saturated? submitted by /u/kampak212 … 35

TechCrunch — AI news-outlet 1mo ago What ClickUp’s mass layoff tells us about the future of work The nine-year-old startup is replacing hundreds of employees with thousands of AI agents. 18

TechCrunch — AI news-outlet 1mo ago The pope’s AI encyclical isn’t really about AI Pope Leo XIV's first encyclical uses AI as a lens to diagnose older problems: concentrated power, eroding democracy, and a tech elite that shapes the world to its own advantage. 34

Hacker News — AI on Front Page community 1mo ago Magnifica Humanitas (Encyclical Letter) Article URL: https://www.vatican.va/content/leo-xiv/en/encyclicals/documents/20260515-magnifica-humanitas.html Comments URL: https://news.ycombinator.com/item?id=48265206 Points: 229 # Comments: 63 36

r/LocalLLaMA community 1mo ago We added W8A8 activation quantization to MLX — prefill went from 2.84s to 2.52s on M5 Pro Hey, I work on inference tooling at Mininglamp AI. We needed faster prefill for a 4B VLM running on Apple Silicon. Problem was MLX only does weight-only quant — activations stay FP16 the whole way through. So we wrote Cider, a small SDK that adds W8A8 activation quant on top of… 21

arXiv — Machine Learning research 1mo ago MedExpMem: Adapting Experience Memory for Differential Diagnosis arXiv:2605.22872v1 Announce Type: new Abstract: Experienced physicians develop diagnostic expertise through clinical practice, acquiring not only disease knowledge but also the ability to differentiate confusable conditions. Current medical vision-language models (VLMs) lack… 24

arXiv — Machine Learning research 1mo ago FederatedRSF : Federated Random Survival Forests for Partially Overlapping Medical Data arXiv:2605.22954v1 Announce Type: new Abstract: Multi-center survival prediction can improve robustness and generalizability, yet privacy regulations and institutional governance often prevent pooling patient-level clinical and genomic data across institutions. In practice,… 28

arXiv — Machine Learning research 1mo ago Class-Dependent Hybrid Data Augmentation for Multiclass Migraine Classification under Severe Class Imbalance arXiv:2605.23453v1 Announce Type: new Abstract: We conducted a reproducibility-oriented re-evaluation of prior migraine classification studies, correcting for data leakage and metric bias. We then introduced (i) a clinically motivated aggregation of two hemiplegic subtypes… 23

arXiv — NLP / Computation & Language research 1mo ago When Symptoms Are Not Enough: Evidence-Weighting Patterns in Large Language Model Psychiatric Screening arXiv:2605.23148v1 Announce Type: new Abstract: As demand for mental health care outpaces clinician-delivered assessment, scalable screening tools are increasingly needed. Large language models (LLMs) may identify psychiatric risk from patient narratives, but their reliability… 9

arXiv — NLP / Computation & Language research 1mo ago ClimateChat-300K: A Multi-Modal Facebook Dataset for Understanding Diverse Perspectives in Climate Communication arXiv:2605.23326v1 Announce Type: new Abstract: We present ClimateChat-300K, a large-scale dataset of 299,329 public Facebook posts about climate change collected between May 2020 and May 2024 through the CrowdTangle platform. The dataset contains 41 metadata features including… 5

arXiv — NLP / Computation & Language research 1mo ago The Deterministic Horizon: Impossibility Results as Design Specifications for Trustworthy AI Systems arXiv:2605.23024v1 Announce Type: cross Abstract: Large language models now write software, draft legal documents, and produce clinical notes, yet fundamental limits, from Turing and Arrow to the No Free Lunch theorems, shape what computation can do. This thesis turns such… 22

arXiv — NLP / Computation & Language research 1mo ago What Does the Server See? Understanding Privacy Leakage from Large Language Models in Split Inference arXiv:2605.23158v1 Announce Type: cross Abstract: The deployment of large language models (LLMs) on resource-constrained devices remains challenging, spurring interest in split inference, where models are partitioned between client and server to reduce computational burden and… 6

r/LocalLLaMA community 1mo ago OCR, granite-docling-258m vs granite-docling-2stage-258m: has anyone actually noticed any improvements? IBM's granite-docling-2stage-258m Granite Docling 2stage builds upon the Granite Docling, but introduces a key modification: it builds a dynamic prompt that precomputes layout objects found within a page, making it more robust on out of distribution… 19

r/LocalLLaMA community 1mo ago Have we passed the peak of inflated expectations? I noticed the number of people in this sub going down a bit and checked out some google trends. Any idea what's causing this sharp decline? submitted by /u/fairydreaming 18

r/MachineLearning community 1mo ago Custom image encoder [P] Hello, I would like to know whether building my own image encoder would be a good idea instead of using models like CLIP, SigLIP/SigLIP2, or DINO. My use case is video frame classification. My pipeline is the following: the client sends me a video stream, sampled at 1 frame per… 5

arXiv — Machine Learning research 1mo ago HealthCraft: A Reinforcement Learning Safety Environment for Emergency Medicine arXiv:2605.21496v1 Announce Type: new Abstract: Frontier language models are being deployed into clinical workflows faster than the infrastructure to evaluate them safely. Static medical-QA benchmarks miss the failure modes that matter in emergency medicine: trajectory-level… 4

arXiv — Machine Learning research 1mo ago Calibration, Uncertainty Communication, and Deployment Readiness in CKD Risk Prediction: A Framework Evaluation Study arXiv:2605.21566v1 Announce Type: new Abstract: Machine learning models for chronic kidney disease (CKD) risk prediction often post strong discrimination scores on internal test sets. Calibration and uncertainty quantification get far less attention, leaving clinicians without… 9

arXiv — Machine Learning research 1mo ago ChronoMedicalWorld: A Medical World Model for Learning Patient Trajectories from Longitudinal Care Data arXiv:2605.21963v1 Announce Type: new Abstract: Long-horizon clinical simulation -- predicting how a patient's physiology evolves over years under specified interventions -- is central to chronic-disease care, yet existing electronic health record (EHR) models are predominantly… 19

arXiv — Machine Learning research 1mo ago Beyond Euclidean Proximity: Repairing Latent World Models with Horizon-Matched Trajectory Reachability Metrics arXiv:2605.22164v1 Announce Type: new Abstract: Latent world models can contain the state needed for control, yet their terminal-cost interface can expose the planner to the wrong decision-relevant information. In common latent MPC, candidate sequences are ranked by Euclidean… 20

arXiv — Machine Learning research 1mo ago Decomposing Ensemble Spread in Lorenz '96 With Learned Stochastic Parameterizations arXiv:2605.22242v1 Announce Type: new Abstract: Weather and climate forecasts are inherently uncertain due to chaotic dynamics, imperfect initial conditions, and incomplete representation of the underlying physical processes. Operational ensemble forecasts aim to represent these… 30

arXiv — Machine Learning research 1mo ago Explainable AI for Data-Driven Design of High-Dimensional Predictive Studies arXiv:2605.22243v1 Announce Type: new Abstract: Predictive modelling is important for health data analysis and data-driven clinical decision-making. However, predictive studies are challenging to design optimally by hand when tens or even hundreds of features require selection,… 19

arXiv — Machine Learning research 1mo ago No Epoch Like the Present: Robust Climate Emulation Requires Out-of-Distribution Generalisation arXiv:2605.22248v1 Announce Type: new Abstract: Climate emulation is an out-of-distribution (OOD) projection task. This is precisely the challenge where modern Machine Learning (ML) methods are most prone to failure. Consequently, while current ML emulators trained on present… 38

arXiv — Machine Learning research 1mo ago Detecting Atypical Clients in Federated Learning via Representation-Level Divergence arXiv:2605.22266v1 Announce Type: new Abstract: Federated learning enables collaborative training across distributed clients with heterogeneous data, but such heterogeneity often leads to unstable updates and degraded global performance. Moreover, in practical deployments,… 29

arXiv — NLP / Computation & Language research 1mo ago When Cases Get Rare: A Retrieval Benchmark for Off-Guideline Clinical Question Answering arXiv:2605.21807v1 Announce Type: new Abstract: Across medical specialties, clinical practice is anchored in evidence-based guidelines that codify best studied diagnostic and treatment pathways. These pathways routinely fall short for the long tail of real-world care not covered… 34

arXiv — NLP / Computation & Language research 1mo ago ChronoMedKG: A Temporally-Grounded Biomedical Knowledge Graph and Benchmark for Clinical Reasoning arXiv:2605.22734v1 Announce Type: new Abstract: Biomedical knowledge graphs (KGs) treat disease associations as static facts, but temporal information is crucial for clinical reasoning, e.g., a symptom diagnostic of one disease at age 3 may imply a different disease at age 13.… 32

arXiv — NLP / Computation & Language research 1mo ago The Double Dilemma in Multi-Task Radiology Report Generation: A Gradient Dynamics Analysis and Solution arXiv:2605.22635v1 Announce Type: cross Abstract: While multi-task learning based automatic radiology report generation (RRG) is widely adopted to ensure clinical consistency, most focus on architectural designs yet remain limited to coarse linear scalarization strategies. These… 12

Hugging Face Daily Papers research 1mo ago Training Large Language Models to Predict Clinical Events Abstract Longitudinal clinical notes are converted into temporal prediction examples using Foresight Learning, enabling improved clinical prediction through LoRA adaptation that enhances calibration and reduces uncertainty compared to base models. AI-generated summary… 34

r/LocalLLaMA community 1mo ago Gmail tie-ins hey folks. I’m looking to setup a way to give a local LLM access to google cloud SDK for Gmail functions. The goal is to be able to have an LLM once daily check a spreadsheet, and based on criteria send an email that will be structured exactly the same way each time, simply as a… 14

llama.cpp releases dev-tools 1mo ago b9276 server: expose prompt token counts in /slots endpoint (#23454) Add n_prompt_tokens, n_prompt_tokens_processed, and n_prompt_tokens_cache to the /slots JSON response. These fields are already tracked internally but were not exposed, making it impossible for clients to monitor… 15

The Information — AI news-outlet 1mo ago Workday Stock Jumps 10% After Company Reveals AI Agent Gains Workday shares climbed more than 10% in after-hours trading on Thursday after the HR application maker said the number of customers using its AI agents in the three months ended April 30 roughly doubled from the previous quarter to more than 4,000. Gerrit Kazmaier, the company’s… 38

OpenAI Python SDK releases dev-tools 1mo ago v2.38.0 2.38.0 (2026-05-21) Full Changelog: v2.37.0...v2.38.0 Features api: api update (33d1d01) api: manual updates (a21700a) api: update OpenAPI spec or Stainless config (00265c5) Chores api: docs updates (ee10152) check release PR custom code sync (2638779) remove release… 26

r/LocalLLaMA community 1mo ago Qwen3.6 35Ba3 has changed my workflows and even how I use my computer My workflow has changed basically to ask Codex to do certain tasks and then document how to do them (including errors it found on its way) into a skill. I feed that skill to pi, and suddenly my qwen3.6 gets that hard stuff done: - devops on a VPS - using docling to create epubs… 33

Google DeepMind official-blog 1mo ago We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks The Asia-Pacific region is a global engine for economic growth, but it's also highly vulnerable to climate change. While green technologies are gaining momentum, a recent report shows they aren’t scaling fast enough to keep up with the region’s rising environmental risks. To… 22

r/LocalLLaMA community 1mo ago LlamaStation v0.9 — llama.cpp GUI for Windows with multi-backend support, TurboQuant, MTP and more I've been building this for the past few months as a side project — started because I didn't want to run llama.cpp from the command line every time I wanted to try a model. I just wanted something that worked with a click. Fair warning: I'm not a developer. This is 100% vibe… 33