Tag

Developer Tool

500 articles archived under #developer-tool

arXiv — NLP / Computation & Language research 6d ago

RASC+: Retrieval-Constrained LLM Adjudication for Clinical Value Set Authoring

arXiv:2606.23992v1 Announce Type: new Abstract: Clinical value sets define the standardized terminology codes used in quality measurement, phenotyping, cohort construction, and clinical decision support. The recently introduced Retrieval-Augmented Set Completion (RASC) benchmark…

32
arXiv — NLP / Computation & Language research 6d ago

PORTER: Language-Grounded Event Representations for Portable Structured EHR Foundation Models

arXiv:2606.24102v1 Announce Type: new Abstract: Most electronic health record (EHR) foundation models encode clinical events as discrete event tokens from a fixed vocabulary and therefore cannot directly represent events containing unseen concepts or new combinations of concepts…

35
arXiv — Machine Learning research 6d ago

MedPCFM: Improving Medical Point Cloud Completion by Integrating Point Transformers and Flow Matching

arXiv:2606.24433v1 Announce Type: cross Abstract: Medical point cloud completion is important for anatomical reconstruction and downstream clinical workflows, yet generative modeling in this setting remains insufficiently studied. We investigate completion through…

28
arXiv — NLP / Computation & Language research 6d ago

One Year Later...The Harms Persist, But So Do We!

arXiv:2606.23884v1 Announce Type: new Abstract: General-purpose large language models (LLMs) are increasingly used for mental health-related conversations, yet safety safeguards remain inadequate and inconsistent across clinical conditions. This study evaluates six proprietary…

26
arXiv — NLP / Computation & Language research 6d ago

MedBench v5: A Dynamic, Process-Oriented, and Hallucination-Aware Benchmark for Clinical Multimodal Models

arXiv:2606.24155v1 Announce Type: new Abstract: Existing medical AI benchmarks lack process visibility, atomic skill evaluation, and integrated hallucination detection. We introduce MedBench v5, a redesigned benchmark for clinical multimodal models (language, vision-language,…

38
arXiv — NLP / Computation & Language research 6d ago

MMed-Bench-IR: A Heterogeneous Benchmark for Multilingual Medical Information Retrieval

arXiv:2606.24200v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) in clinical settings increasingly requires multilingual retrieval against predominantly English evidence corpora. Multilingual medical retrieval demands three capabilities: cross-lingual…

36
arXiv — NLP / Computation & Language research 6d ago

A specialized reasoning large language model for accelerating rare disease diagnosis: a randomized AI physician assistance trial

arXiv:2606.24510v1 Announce Type: cross Abstract: Rare diseases affect millions of individuals worldwide, yet timely diagnosis remains a major public health challenge due to scarcity of specialized clinical expertise. While large language models (LLMs) show promise to support…

28
arXiv — NLP / Computation & Language research 6d ago

Few shot chain-of-thought driven reasoning to prompt LLMs for open ended medical question answering

arXiv:2403.04890v4 Announce Type: replace Abstract: In this paper, we propose a modified version of the MedQA-USMLE dataset, named MEDQA-OPEN, which contains open-ended medical questions without options to mimic clinical scenarios, along with clinician-approved reasoned answers.…

7
Hacker News — AI on Front Page community 6d ago

Fired by Google for creating the Google workspace CLI

Article URL: https://xcancel.com/JPoehnelt/status/2069482265953087602 · Comments URL: https://news.ycombinator.com/item?id=48649011 · Points: 252 · Comments: 173

36
r/LocalLLaMA community 6d ago

650+ Apache-2.0 biomedical NER/de-id models that run on-device in MLX. Same fp32 weights, identical outputs: the clinical NER models run 30-40x faster than PyTorch-CPU on a 3-year-old M3 Max. Repro inside.

Disclosure first: I maintain OpenMed, so read this with that bias. I'm posting the numbers with the full methodology and a runnable script so you can reproduce or tear it apart. I'm here for the next couple of hours to answer methodology questions. What it is: an open-source…

25
r/LocalLLaMA community 6d ago

I benchmarked 8 LLMs for medical scribing. Hallucinations were rare; omissions need attention.

I ran a small benchmark on LLMs for medical scribing. Reason: most discussion around AI scribe safety focuses on hallucinations. That matters, but in notes I kept seeing another problem: models often leave out clinically relevant details from the conversation. So I evaluated 8…

10
MIT Technology Review — AI news-outlet 6d ago

The $400 million machine powering the future of chipmaking

Jos Benschop is climbing a ladder to get to the top of his newest machine.  It’s a bit of a schlep. The contraption is the size of a double-decker bus—more than 150 tons of gleaming precision-milled aluminum covered in thousands of snaking tubes, colored cables, and…

24
Vercel — AI dev-tools 6d ago

Deploy Node servers with zero configuration

You can now deploy a Node.js server to Vercel with zero configuration. Vercel detects a server.ts file at the project root or at src/server.ts and deploys it as a Node.js application, in addition to existing zero-configuration backends like Express, Koa, and NestJS: Vercel CLI…

12
Hugging Face Daily Papers research 7d ago

CLI-Universe: Towards Verifiable Task Synthesis Engine for Terminal Agents

Abstract: A principled synthesis engine generates high-quality terminal-agent tasks through multi-dimensional capability taxonomy and evidence-guided research, creating a distilled dataset that enables significant performance gains in LLM training. Generated by…

5
Hugging Face official-blog 7d ago

Shipping huggingface_hub every week with AI, open tools, and a human in the loop

Published June 23, 2026. By Lucain Pouget (Wauplin) and Célina Hanouti (celinah). huggingface_hub is the Python client at the base of the Hugging Face…

18
Vercel — AI dev-tools 7d ago

Redesigned trace viewer for Vercel Workflows

The trace viewer for Vercel Workflows and Workflow SDK has been redesigned to better support inspecting runs from start to finish. Search across spans, zoom into any section of the timeline, and step through with the keyboard to find what you're looking for fast, then click into…

19
Vercel — AI dev-tools 7d ago

Preserve local environment variables when linking with the Vercel CLI

The Vercel CLI now preserves your .env.local file when running vercel link. Previously, linking could overwrite variables already in the file. The CLI now updates VERCEL_OIDC_TOKEN if it exists, or appends it if missing, without touching anything else. Run pnpm i -g…

9
Vercel — AI dev-tools 7d ago

Chat SDK adds Kapso support

Chat SDK now supports Kapso with the new vendor-official adapter. Kapso connects your bot to WhatsApp through its hosted platform, handling the WhatsApp Business setup, credentials, and webhooks so you can focus on your bot's logic. Replies use the standard Chat SDK thread and…

31
Vercel — AI dev-tools 7d ago

Chat SDK adds Novu support

Chat SDK now supports Novu with the new vendor-official adapter. One handler set puts your agent on Slack, Microsoft Teams, WhatsApp, Telegram, and email. Novu handles credentials, identity, and delivery, keeping OAuth and tokens outside your app and mapping each channel to one…

32
Vercel — AI dev-tools 7d ago

Chat SDK adds Sendblue support

Chat SDK now supports Sendblue with the new vendor-official adapter. Build bots that send and receive iMessage, SMS, and RCS through Sendblue's hosted gateway, reaching people on the messaging apps they already use. Messages use iMessage-first delivery with support for…

10
Vercel — AI dev-tools 7d ago

Chat SDK adds Linq support

Chat SDK now supports Linq with the new vendor-official adapter. Build bots that send and receive texts in both direct messages and group chats, with bidirectional media and native iMessage tapback reaction support. Replies use the standard Chat SDK thread and message APIs,…

17
Hugging Face Daily Papers research 8d ago

BrainG3N: A Dual-Purpose Tokenizer for Controllable 3D Brain MRI Generation

Abstract: A 3D brain MRI generative model uses a masked-autoencoder tokenizer to create clinically informative embeddings that support both medical task performance and controlled image generation. (Summary generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Three-dimensional (3D) brain MRI is…

6
r/LocalLLaMA community 8d ago

Your Favorite Workflow to Convert PDF with Complex Structure to Markdown?

I've tried markitdown, Docling, and MinerU. Are there better tools I should try? I need to process tables, floating boxes, etc. Thanks! Submitted by /u/chibop1

30
Vercel — AI dev-tools 8d ago

WebSocket support is now in Public Beta

Vercel Functions can now serve WebSocket connections, enabling bidirectional communication between clients and server-side code on Vercel. Use WebSockets for realtime features such as interactive AI streaming, chat, and collaborative apps. WebSocket connections run Fluid compute…

23
Vercel — AI dev-tools 8d ago

Vercel CLI now supports signing blob URLs

You can now generate signed URLs for Vercel Blob directly from the Vercel CLI. A signed URL is a scoped URL with a set expiration time that lets you perform a single operation on a specific object. Each URL is scoped to one operation (get, head, put, or delete), one…

30
Vercel — AI dev-tools 8d ago

Workflow SDK now compresses run and step payloads

The Workflow SDK 5 beta now compresses all run, hook, and step inputs and outputs with zstd. Compression kicks in automatically, but only when it helps. Small payloads stay as-is, larger ones get compressed before they're persisted. Compressed payloads use less storage and are…

16
Simon Willison community 8d ago

sqlite-utils 4.0rc1 adds migrations and nested transactions

sqlite-utils is my combined Python library and CLI tool for working with SQLite databases. It provides an extensive set of higher-level operations on top of Python's default sqlite3 package, including support for complex table transformations, automatic table creation from…

13
r/LocalLLaMA community 8d ago

I pretrained and post trained a 500M parameter LLM and 330M parameter Image generator from scratch

Hey folks, hope you are doing well. I started HobbyLM as a side project last month. Initially I wrote an agent harness using the Claude SDK, which takes notes on various LLM architectures and does ablation studies to find an optimised or well-fit architecture for this model training, then I…

16
r/MachineLearning community 9d ago

Python packages for particle swarms, genetic algorithms. Scikit-opt maybe? [D]

I'm working with a client on a curve-fitting optimization problem. They are currently using a constrained Levenberg-Marquardt optimizer for their task, which is complex, slow, and sometimes gets stuck in local minima. I suggested using particle swarm optimization (PSO), and the…

17
r/LocalLLaMA community 9d ago

Qwen code companion on vscode marketplace - thoughts

I came across this extension in VS Code a few days ago and tried it with LM Studio hosted models. It really is pretty good compared to `continue`, `kilo`, `cline`, and `roo`: without much tweaking it gets straight to the point, and if any tweaks are required you could do…

36
r/LocalLLaMA community 9d ago

Gemma 4 26b a4b is genuinely the best model I have tried for language learning and scientific queries!

I know gemma 4 26b is (according to this sub) a bit behind for coding tasks, but for language learning and scientific (health/biology/medical/clinical/biochem) queries it’s unbeaten even by Qwen 3.5/3.6. Since the competition in the small MoE models is generally between Qwen…

28
Simon Willison community 10d ago

Quoting Sean Lynch

The real valuable capability MCP offers over skills/CLI is isolating the auth flow outside of the agent’s context window, and potentially out of the harness completely. [...] Maybe the idealized form of MCP is just an auth gateway for the API and nothing else. That’d still be a…

8
llama.cpp releases dev-tools 10d ago

b9730

mtmd, arg: fix utf8 handling on windows (#24779): fix utf8 handling on windows; also fix ggml_fopen; fix build fail; also fix CLI. macOS/iOS: macOS Apple Silicon (arm64); macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED; macOS Intel (x64); iOS XCFramework. Linux:…

36
Hugging Face Daily Papers research 10d ago

Configurable Clinical Information Extraction with Agentic RAG: What Works, What Breaks, and Why

Abstract: ACIE, an agentic RAG system deployed in a clinical setting, demonstrates high accuracy in extracting medical information from complex patient contexts, achieving a 96.5% acceptance rate by nuclear-medicine physicians across 7,326 judgments. Generated by…

5
arXiv — Machine Learning research 11d ago

cAPM: Continual AI-Assisted Pace-Mapping with Active Learning

arXiv:2606.19373v1 Announce Type: new Abstract: Ventricular tachycardia is a life-threatening rhythm disorder and a major cause of sudden cardiac death. Pace-mapping is a clinical procedure for identifying the intervention target during catheter ablation of VT. It requires…

15
arXiv — Machine Learning research 11d ago

Insulin4RL: Real-Time Insulin Management in the Intensive Care Unit for Offline Reinforcement Learning

arXiv:2606.19481v1 Announce Type: new Abstract: Offline reinforcement learning (ORL) offers the potential to improve the quality of clinical decision-making using historical electronic health record (EHR) data. Current training and evaluative practices in this field rely heavily…

10
arXiv — Machine Learning research 11d ago

Federated Bilevel Performative Prediction

arXiv:2606.19734v1 Announce Type: new Abstract: Federated bilevel optimization is widely used for nested learning problems across distributed clients, such as federated hyperparameter tuning and meta-learning under privacy and communication constraints. Most existing…

7
arXiv — Machine Learning research 11d ago

When, Where, and How: Adaptive Binning for Tabular Self-Supervised Learning

arXiv:2606.19827v1 Announce Type: new Abstract: Medical tabular data are ubiquitous in clinical research, but deep learning for tables remains underexplored because reliable labels often require costly expert adjudication, even though structured clinical variables are routinely…

21
arXiv — Machine Learning research 11d ago

Exploring the potential of AlphaEarth and TESSERA embeddings for Fine-scale Local Climate Zone Mapping: A case study across five cities in Switzerland

arXiv:2606.20034v1 Announce Type: new Abstract: Understanding urban spatial morphology is critical for climate modeling, risk assessment, and sustainable urban design, and Local Climate Zone (LCZ) mapping provides the basic framework for this. However, many cities still use…

10
arXiv — Machine Learning research 11d ago

Constrained hybrid modelling to predict microbial dynamics and organic matter turnover in soil systems

arXiv:2606.20329v1 Announce Type: new Abstract: Soil microorganisms control organic matter cycling and largely determine how soil systems can cope with and mitigate climate change and environmental threats. Representing microbial dynamics in process-based soil models is…

17
arXiv — NLP / Computation & Language research 11d ago

Before the Labels: How Dataset Construction Shapes Suicidality Detection in Clinical Text

arXiv:2606.19637v1 Announce Type: new Abstract: Clinical NLP increasingly relies on electronic health record (EHR) data to detect suicidal behaviors, treating clinical documentation as more reliable ground truth than social media. We argue that this framing obscures how…

36
arXiv — NLP / Computation & Language research 11d ago

Prompt, Plan, Extract: Zero-Shot Agentic LLMs Workflows for Lung Pathology Extraction from Clinical Narratives

arXiv:2606.19852v1 Announce Type: new Abstract: Information extraction from pathology reports is essential for cancer staging and tumor registry population. Yet key data remains embedded in narrative reports, making manual extraction labor-intensive and error-prone. Traditional…

26
arXiv — NLP / Computation & Language research 11d ago

Source-Grounded Data Generation for Text-to-JSON Learning

arXiv:2606.20072v1 Announce Type: new Abstract: From financial filings to clinical records, legacy industries rely heavily on long, unstructured documents to store high-value information. Reliably extracting this information into structured, machine-readable representations is a…

4
arXiv — NLP / Computation & Language research 11d ago

MedRLM: Recursive Multimodal Health Intelligence for Long-Context Clinical Reasoning, Sensor-Guided Screening, Evidence-Grounded Decision Support, and Community-to-Tertiary Referral Optimization

arXiv:2606.20164v1 Announce Type: new Abstract: Real-world clinical decision support requires reasoning over heterogeneous and longitudinal patient information rather than answering isolated medical questions. However, current medical large language models and…

29
arXiv — NLP / Computation & Language research 11d ago

Beyond the GUI Paradigm: Do Mobile Agents Need the Phone Screen?

arXiv:2606.19388v1 Announce Type: cross Abstract: Recent advances in mobile agents are dominated by the GUI paradigm, in which agents perceive UI information and emit screen interactions. However, mobile platforms also expose a command-line interface (CLI) that provides direct…

31
arXiv — NLP / Computation & Language research 11d ago

AgentFinVQA: A Deployable Multi-Agent Pipeline for Auditable Financial Chart QA

arXiv:2606.19782v1 Announce Type: cross Abstract: Financial chart question answering in regulated settings demands more than accuracy: practitioners must know which answers to trust before acting on them, and many institutions cannot send client data to external model providers.…

10
llama.cpp releases dev-tools 11d ago

b9713

mtmd: add batching for mtmd-cli, add video tests (#24778). macOS/iOS: macOS Apple Silicon (arm64); macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED; macOS Intel (x64); iOS XCFramework. Linux: Ubuntu x64 (CPU); Ubuntu arm64 (CPU); Ubuntu s390x (CPU); Ubuntu x64 (Vulkan); Ubuntu…

22
llama.cpp releases dev-tools 11d ago

b9701

mtmd: refactor preprocessor, add mtmd_image_preproc_out (#24736): add mtmd_image_preproc_out; add dev docs; remove unused clip API; rm unused clip_image_f32_batch::grid; change preprocess() call signature. macOS/iOS: macOS Apple Silicon (arm64); macOS Apple Silicon (arm64, KleidiAI…

15
arXiv — Machine Learning research 12d ago

ThousandWorlds: A benchmark for climate emulation of potentially habitable exoplanets

arXiv:2606.18338v1 Announce Type: new Abstract: The search for life beyond Earth will depend on detecting faint signatures in the atmospheres of potentially habitable exoplanets. Interpreting those signatures requires understanding the host planet's climate: the same molecule…

23
arXiv — Machine Learning research 12d ago

SCOPE-FL: A Strategy-proof Chain-based Optimal pareto efficient Federated Learning System

arXiv:2606.18384v1 Announce Type: new Abstract: Hierarchical Federated Learning (HFL) enables scalable collaborative model training across distributed devices while preserving data privacy. However, existing HFL client selection mechanisms suffer from a fundamental strategic…

31
