Tag: Developer Tool · 500 articles archived under #developer-tool
arXiv — Machine Learning research 15d ago Can Machine Learning Forecast Rice Yields in Data-Constrained Settings? Satellite Climate Data, National Crop Statistics, and Lessons from Sierra Leone arXiv:2606.13959v1 Announce Type: new Abstract: Sierra Leone's agriculture operates with almost no data-driven decision support, and no published machine learning study has examined the country's crop yields. We ask whether rice yield can be forecast from data Sierra Leone… 25
arXiv — Machine Learning research 15d ago Trust but Verify: Mitigating Medical Hallucinations via Post-Hoc Adversarial Auditing and Multi-Agent Feedback Loops arXiv:2606.14149v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly deployed in healthcare settings, yet their tendency to hallucinate poses risks when clinical decisions are involved. This study examines whether LLMs recommend recently banned or… 25
arXiv — Machine Learning research 15d ago Learning Urban Access Costs from Origin-Destination Flows via Inverse Optimal Transport arXiv:2606.14157v1 Announce Type: new Abstract: Cities deliver basic services through mixed public-private facility networks, including schools, clinics, transit providers, and subsidized service points.
In these systems, planners often observe where households go, but not the… 9
arXiv — Machine Learning research 15d ago Machine Learning for Biomedical Raman Spectroscopy: From Spectral Acquisition to Clinical Translation arXiv:2606.14169v1 Announce Type: new Abstract: Raman spectroscopy provides label-free, chemically specific characterization of biological systems and has become an important tool for cancer diagnosis, molecular subtyping, microbiological identification, and intraoperative… 13
arXiv — Machine Learning research 15d ago Federated Learning for Feature Generalization with Convex Constraints arXiv:2606.14416v1 Announce Type: new Abstract: Federated learning (FL) often struggles with generalization due to heterogeneous client data. Local models are prone to overfitting their local data distributions, and even transferable features can be distorted during aggregation.… 12
arXiv — Machine Learning research 15d ago PepALD: Macrocyclic Peptide Generation via Autoregressive Latent Diffusion arXiv:2606.14510v1 Announce Type: new Abstract: Macrocyclic peptides are promising therapeutic candidates for intracellular targets, but their design requires simultaneous control over non-natural monomer chemistry, ring topology, membrane permeability, and target binding.… 10
arXiv — Machine Learning research 15d ago Expert-Driven Survival Machines: Improving Stratification and Interpretability in Multiple Clinical Cohorts arXiv:2606.14608v1 Announce Type: new Abstract: Survival prediction plays a central role for healthcare providers and clinical researchers. Accurate risk stratification enables early intervention and improved patient management. Most existing deep survival models learn one… 9
arXiv — NLP / Computation & Language research 15d ago DLawBench: Evaluating LLMs Through Multi-Turn Legal Consultation arXiv:2606.13931v1 Announce Type: new Abstract: Lawyer-client consultation is a critical starting point for legal services.
Effective legal assistance hinges on eliciting sufficient and truthful information from clients in order to devise strategies that best protect their… 5
arXiv — NLP / Computation & Language research 15d ago Can Post-Training Turn LLMs into Good Medical Coders? An Empirical Study of Generative ICD Coding arXiv:2606.13940v1 Announce Type: new Abstract: Automated International Classification of Diseases (ICD) coding is a core medical-coding task for billing, epidemiology, and clinical decision support. Generative large language models (LLMs) are often reported as weak medical… 27
arXiv — NLP / Computation & Language research 15d ago Personal Care Utility: Health as Everyday Infrastructure arXiv:2606.14145v1 Announce Type: new Abstract: Healthcare is essential, expert, and episodic by design - built around the roughly one hour per year a person spends with a clinician. The 8,759 hours outside clinical settings, where eating, sleeping, movement, medication, and… 9
arXiv — NLP / Computation & Language research 15d ago A Computational Audit of Demographic Association Encoding in ClinicalBERT Language Predictions arXiv:2606.14460v1 Announce Type: new Abstract: Transformer-based clinical language models are increasingly integrated into high-stakes clinical decision support pipelines, yet the computational mechanisms through which demographic associations encoded in medical documentation… 35
arXiv — NLP / Computation & Language research 15d ago Spatio-Temporal Audio Language Modeling for Dynamic Sound Sources arXiv:2606.14141v1 Announce Type: cross Abstract: Sound events are entities with semantic identities, locations, and trajectories, but current audio-language models usually reason about clips as global event content.
Conversely, sound event localization models track source… 12
arXiv — NLP / Computation & Language research 15d ago ClinHallu: A Benchmark for Diagnosing Stage-Wise Hallucinations in Medical MLLM Reasoning arXiv:2606.14697v1 Announce Type: cross Abstract: Building trustworthy medical multimodal large language models (MLLMs) is critical for reliable clinical decision support. Existing medical hallucination benchmarks mainly focus on data collection, but often ignore where… 4
Vercel — AI dev-tools 15d ago Auth0 joins the Vercel Marketplace You can now add Auth0 for production-ready authentication to your Vercel app in just a few clicks. Built for modern frameworks like Next.js, Auth0 is an identity and access management platform for securing your apps and agentic workflows. This integration enables: Automatic… 26
Vercel — AI dev-tools 15d ago Chat SDK now supports rich text in Telegram Chat SDK now renders explicit markdown and ast messages as native rich messages on the Telegram adapter. Your bots get real headings, lists, tables, task lists, formulas, and separate media blocks instead of flattened text. What you get: Native formatting: headings, lists,… 5
r/LocalLLaMA community 15d ago Gemma 12b less than 10 watts 6.5pp 1.3tg Google pixel 10 pro Termux Llamacpp version: 9639 (ef8268fee) $ ./llama.cpp/build_vulkan/bin/llama-cli -m storage/downloads/gemma-4-12b-it-UD-Q3_K_XL.gguf --model-draft storage/downloads/mtp-gemma-4-12b-it.gguf --temp 1.0 --top-p 0.95 --top-k 64 --spec-type draft-mtp… 5
Hacker News — AI on Front Page community 15d ago I indexed 669 GB of my GoPro videos using my M1 Max computer and local ML models TLDR: I had 2,207 GoPro videos, and I need to rewatch them to find interesting moments from my cycling journey.
I built a project to index them locally on my M1 Max using open-source ML models, search for those moments, and send the best clips straight to my DaVinci Resolve… 28
llama.cpp releases dev-tools 15d ago b9631 cli: fix not copying preserved tokens (#24258) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64… 6
r/LocalLLaMA community 16d ago Best batteries-included harness tuned for Qwen 3.6 and Gemma 4? (little-coder, smallcode, etc...) After testing little-coder for a week now, I can confidently say that it's better and more reliable than OpenCode and Cline. What's the best harness you've used with Qwen 3.6 and Gemma 4? I'm aware that you can get better results by using pi.dev or a custom harness tuned for… 6
r/LocalLLaMA community 16d ago In your opinion, what is the best CLI-based (or other) coding tool for regular software engineering (NOT VIBE CODING)? This includes, but is not limited to: OpenCode, Command Code, Kilo Code, Cline, Claude Code, etc. Please try to include tools in which I can connect local models, so not stuff like Antigravity. submitted by /u/Potential_Top_4669 36
r/LocalLLaMA community 16d ago llama-launcher v1.3 release -> Bayesian Optimisation Hello everyone, some of you may have seen a post of mine from a few days ago about my app, llama-launcher, a lightweight point-and-click GUI to create llama-server commands without the constant need for typing them up. Well, I've just added an optimisation feature that uses… 16
r/MachineLearning community 16d ago Price is not cost: how we are using the wrong variable to measure the cost of LLMs [D] Upfront disclosure: this is my write-up (and I'll link it below), but laying out the argument here so you can strawman/steelman it without clicking anything.
Assertion 1: per-token price is the wrong metric for measuring the cost of work done by LLMs/reasoning models. Users get… 36
Vercel — AI dev-tools 17d ago Workflow SDK now runs natively in Nitro v3 Workflow SDK's native Nitro v3 integration is now in beta. Steps run inside the same bundled runtime as the rest of your app, instead of a separate bundle. Nitro's useStorage() and other server-side APIs work directly inside "use step" functions. The Nitro dev server also… 26
GitHub Blog — AI & ML official-blog 17d ago How we made GitHub Copilot CLI more selective about delegation Better orchestration, fewer handoffs, faster progress, without a single new knob. 25
The Information — AI news-outlet 17d ago SpaceX Shares Open at $150 Per Share SpaceX shares started trading at $150 per share around mid-day on Friday, up 11% from the company’s initial public offering price of $135. Shares of the company climbed shortly after trading began, reaching about $165 shortly after noon. The offering makes SpaceX CEO Elon Musk… 7
arXiv — NLP / Computation & Language research 18d ago EDEN: A Large-Scale Corpus of Clinical Notes for Italian arXiv:2606.12569v1 Announce Type: new Abstract: We present EDEN (Emergency Department Electronic Notes), a new and unique large-scale corpus of clinical notes produced in Emergency Departments of Italian hospitals. The corpus, in its current version, is composed of approximately… 25
arXiv — NLP / Computation & Language research 18d ago sebis at CRF Filling 2026: A Two-Stage Local LLM Pipeline for Medical CRF Filling arXiv:2606.13082v1 Announce Type: new Abstract: The extraction of structured clinical information from unstructured EHR notes is a persistent bottleneck in healthcare informatics.
While large language models (LLMs) offer high performance, their deployment in clinical settings is… 12
arXiv — NLP / Computation & Language research 18d ago Cross-Modal Masked Compositional Concept Modeling for Enhancing Visio-Linguistic Compositionality arXiv:2606.13288v1 Announce Type: cross Abstract: Contrastively trained vision-language models like CLIP have made remarkable progress in learning joint image-text representations, but still face challenges in compositional understanding. They often exhibit a "bag-of-words"… 38
Vercel — AI dev-tools 18d ago Program Claude Code, Codex, Pi and other agent harnesses with AI SDK AI SDK 7 introduces HarnessAgent, a single API for running established agent harnesses, including Claude Code, Codex, and Pi. AI SDK has always let you switch models without rewriting your agent. Now you can switch the harness the same way. Write the agent once. Use the best… 7
NVIDIA Developer Blog official-blog 18d ago One-Click Multi-Tenant Security with NVIDIA Quantum InfiniBand NVIDIA Quantum InfiniBand now offers intent-based security profiles in Unified Fabric Manager (UFM) that enable multi-tenant fabric security in a single... 33
r/MachineLearning community 18d ago What should context compression keep? I looked at how six agents handle it [D] I use Claude Code, Codex CLI, OpenCode, Cline, Cursor, and Amp enough to notice a pattern in how they handle long context. They are all converging on layered progressive compression, but they disagree on what to protect. Most protect recent user messages as a first-class asset.… 20
arXiv — Machine Learning research 19d ago Federated continual learning: A comprehensive survey on lifelong and privacy-preserving learning over distributed and non-stationary data arXiv:2606.11272v1 Announce Type: new Abstract: Federated Learning (FL) enables collaborative and privacy-preserving model training across distributed clients, but most existing FL systems implicitly assume data stationarity.
In real-world settings, such as healthcare, industrial… 10
arXiv — Machine Learning research 19d ago Mirror Descent Beyond Euclidean Stability: An Exponential Separation in Initialization Sensitivity arXiv:2606.11431v1 Announce Type: new Abstract: Mirror Descent (MD) extends Gradient Descent (GD) beyond Euclidean geometry and has recently reappeared as a lens for KL-regularized policy optimization in reinforcement learning and LLM post-training. This raises a basic… 10
arXiv — Machine Learning research 19d ago LSTM-Based Detection of Structural Breaks in Property Insurance Loss Reserving: A Climate-Informed Approach arXiv:2606.11463v1 Announce Type: new Abstract: Accurate loss reserving is foundational to insurer solvency, yet accelerating climate-driven catastrophes systematically violate the stability assumptions on which traditional actuarial methods depend. This white paper presents a… 30
arXiv — Machine Learning research 19d ago AI4Land: Scalable Deep Learning for Global High-Resolution Land Use Reconstruction arXiv:2606.11793v1 Announce Type: new Abstract: Uncertainty in the terrestrial carbon cycle remains a major constraint in climate projections, partly driven by the uncertainties affecting the land surface representation and variability in Earth system models. To address this… 11
arXiv — Machine Learning research 19d ago Multimodal Ordinal Modeling of Alzheimer's Disease Severity Using Structural MRI and Clinical Data arXiv:2606.11794v1 Announce Type: new Abstract: Neurodegenerative diseases such as Alzheimer's disease (AD) require accurate and scalable tools for assessing disease severity, yet current clinical staging remains time-intensive and prone to variability.
We propose an… 17
arXiv — Machine Learning research 19d ago Tabular Foundation Models for Clinical Survival Analysis via Survival-Aware Adaptation arXiv:2606.12006v1 Announce Type: new Abstract: Predicting time-to-event outcomes such as mortality is a fundamental task in clinical decision-making, commonly addressed through survival analysis. While classical statistical and deep learning approaches have been widely studied,… 33
arXiv — Machine Learning research 19d ago PCA-Enhanced Adaptive NVAR Framework for High-Resolution Sea Surface Temperature Forecasting in the East Sea arXiv:2606.12141v1 Announce Type: new Abstract: Accurate forecasting of sea surface temperature (SST) in regional seas such as the East Sea is crucial for monitoring marine ecosystems, assessing climate risks, managing fisheries, and conducting naval operations. Traditional… 34
arXiv — Machine Learning research 19d ago Using Explainability as a Training-Time Reliability Signal for Efficient ECG Classification arXiv:2606.12252v1 Announce Type: new Abstract: Training deep neural networks for clinical time-series analysis is computationally demanding, yet many healthcare settings lack the resources required for repeated model development and deployment. This challenge is particularly… 8
arXiv — NLP / Computation & Language research 19d ago BioDivergence: A Benchmark and Evaluation Framework for Hidden Contextual Contradictions in Biomedical Abstracts arXiv:2606.11208v1 Announce Type: new Abstract: Biomedical findings often seem to conflict across studies, but many of these differences are context-dependent rather than true contradictions. Variations in cohort, geography, assay protocol, disease subtype, and clinical setting… 29
arXiv — NLP / Computation & Language research 19d ago Reassessing High-Performing LLMs on Polish Medical Exams: True Competence or Bias-Driven Performance?
arXiv:2606.12250v1 Announce Type: new Abstract: Large language models (LLMs) in medicine are mainly evaluated using multiple-choice question answering (MCQA), which can overestimate real clinical ability due to guessing strategies and answer biases. To address these limitations,… 37
The Information — AI news-outlet 19d ago Xbox Plans Layoffs as Revenue, Profit Margins Decline Microsoft’s Xbox gaming unit plans to cut staff in the coming months as its financial picture worsens, according to someone with knowledge of the plans. In a note to staff Wednesday, CEO Asha Sharma said that Xbox’s “accountability margins,” a term Microsoft uses internally to… 30
TechCrunch — AI news-outlet 19d ago Fresh off bond sale, Amazon borrows $17.5B from banks as AI spending continues Companies are burning through exorbitant sums of money to keep pace in the AI arms race. Debt is climbing. 14
GitHub Blog — AI & ML official-blog 19d ago Give GitHub Copilot CLI real code intelligence with language servers Install and configure LSP servers for GitHub Copilot CLI, replacing brute-force grep/decompile with real code intelligence. 34
Hugging Face Daily Papers research 19d ago BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling Abstract: BrainSurgery is a tool for robust and reproducible tensor manipulation of neural network checkpoints through declarative YAML plans with built-in validation. Generated by Qwen/Qwen2.5-Coder-32B-Instruct. As deep learning models scale, managing, inspecting, and modifying… 12
Hugging Face Daily Papers research 20d ago Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models Abstract: Flow-DPPO replaces ratio clipping with divergence proximal constraints in flow matching models, improving training stability and multi-objective optimization through exact KL divergence computation.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct. Recent work has… 34
arXiv — Machine Learning research 20d ago TRAPS: Therapeutic Response Analysis via Pathway-informed Stratification arXiv:2606.09898v1 Announce Type: new Abstract: Cancer treatment planning requires decisions across multiple clinical dimensions at once. Clinicians must determine whether a patient should receive targeted molecular therapy, radiation therapy, and whether they are likely to… 22
arXiv — Machine Learning research 20d ago LongMoE: Longitudinal Multimodal Learning via Trajectory-Aware Mixture-of-Experts arXiv:2606.09907v1 Announce Type: new Abstract: Multimodal clinical learning is increasingly important for integrating diverse patient data, including imaging, text, and personalised health records. However, it faces two fundamental challenges: i) modality missingness, where… 28
arXiv — Machine Learning research 20d ago FedSteer: Taming Extreme Gradient Staleness in Federated Learning with Corrective Projections and Caching arXiv:2606.10124v1 Announce Type: new Abstract: Federated learning (FL) is often subject to aggregation variance if clients do not consistently participate in training rounds. While reusing stale model updates from inactive clients is a common technique to reduce this variance,… 33
arXiv — Machine Learning research 20d ago MMClima: A Framework for Multimodal Climate Science Data and Evaluation arXiv:2606.10194v1 Announce Type: new Abstract: Climate change research increasingly requires AI systems that reason across text, dynamic visual content, and scientific figures, yet existing climate QA benchmarks are small, mostly textual, and cover a narrow range of models. We… 20
Page 4 of 10