Hugging Face Daily Papers · 4 min read

FlowCompile: An Optimizing Compiler for Structured LLM Workflows

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

FlowCompile is an optimizing compiler for structured LLM workflows. Given a workflow graph, a validation/profile set, and a design space over sub-agent models, reasoning budgets, and optional workflow structure choices, FlowCompile performs compile-time design-space exploration and emits a reusable set of workflow-level configurations spanning accuracy-latency trade-offs.

arxiv:2605.13647

Published on May 13, 2026 · Submitted by Junyan Li on May 14, 2026
Authors: Junyan Li, Zhang-Wei Hong, Maohao Shen, Yang Zhang, Chuang Gan
Organization: University of Massachusetts Amherst
Code: https://github.com/UMass-Embodied-AGI/FlowCompile

Abstract

AI-generated summary: FlowCompile is a structured LLM workflow compiler that optimizes complex multi-agent tasks by performing compile-time exploration of workflow configurations to balance accuracy and latency without retraining.

Structured LLM workflows, where specialized LLM sub-agents execute according to a predefined graph, have become a powerful abstraction for solving complex tasks. Optimizing such workflows, i.e., selecting configurations for each sub-agent to balance accuracy and latency, is challenging due to the combinatorial design space over model choices, reasoning budgets, and workflow structures. Existing cost-aware methods largely treat workflow optimization as a routing problem, selecting a configuration at inference time for each query according to the accuracy-latency objective used during training. We argue that structured LLM workflows can also be optimized from a compilation perspective: before deployment, the system can globally explore the workflow design space and construct a reusable set of workflow-level configurations spanning diverse accuracy-latency trade-offs. Drawing inspiration from machine learning compilers, we introduce FlowCompile, a structured LLM workflow compiler that performs compile-time design space exploration to identify a high-quality, reusable trade-off set. FlowCompile decomposes a workflow into sub-agents, profiles each sub-agent under diverse configurations, and composes these measurements through a structure-aware proxy to estimate workflow-level accuracy and latency. It then identifies diverse high-quality configurations in a single compile-time pass, without retraining or online adaptation. Experiments across diverse workflows and challenging benchmarks show that FlowCompile consistently outperforms heuristically optimized workflow configurations and routing-based baselines, delivering up to 6.4x speedup. The compiled configuration set further serves as a reusable optimization artifact, enabling flexible deployment under varying runtime preferences and supporting downstream selection or routing.
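The abstract's pipeline (profile each sub-agent, compose the measurements through a proxy, keep a reusable trade-off set) can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the paper's method: the profile numbers are invented, the workflow is a simple two-step chain, the proxy multiplies sub-agent accuracies and sums their latencies, and the names `Profile`, `estimate`, `compile_workflow`, and `select` are hypothetical. FlowCompile's actual structure-aware proxy and search procedure are not described on this page.

```python
from itertools import product
from typing import NamedTuple


class Profile(NamedTuple):
    accuracy: float  # fraction correct on a validation/profile set
    latency: float   # seconds per call


# Hypothetical per-sub-agent measurements: {sub_agent: {config: Profile}}.
PROFILES = {
    "planner": {"small": Profile(0.80, 1.0), "large": Profile(0.85, 5.0)},
    "solver": {"small": Profile(0.70, 2.0), "large": Profile(0.90, 3.0)},
}


def estimate(profiles, assignment):
    """Assumed proxy for a sequential chain: accuracies multiply, latencies add."""
    acc, lat = 1.0, 0.0
    for agent, config in assignment.items():
        p = profiles[agent][config]
        acc *= p.accuracy
        lat += p.latency
    return acc, lat


def pareto_front(candidates):
    """Drop every candidate that another one beats or ties on both axes."""
    front = []
    for assignment, acc, lat in candidates:
        dominated = any(
            a >= acc and l <= lat and (a > acc or l < lat)
            for _, a, l in candidates
        )
        if not dominated:
            front.append((assignment, acc, lat))
    return sorted(front, key=lambda c: c[2])  # fastest first


def compile_workflow(profiles):
    """Enumerate the (small) design space once and return the trade-off set."""
    agents = list(profiles)
    candidates = []
    for combo in product(*(profiles[a] for a in agents)):
        assignment = dict(zip(agents, combo))
        acc, lat = estimate(profiles, assignment)
        candidates.append((assignment, acc, lat))
    return pareto_front(candidates)


def select(front, max_latency):
    """Runtime step: most accurate compiled configuration within a latency budget."""
    feasible = [c for c in front if c[2] <= max_latency]
    return max(feasible, key=lambda c: c[1]) if feasible else None


front = compile_workflow(PROFILES)
for assignment, acc, lat in front:
    print(assignment, round(acc, 3), lat)
print(select(front, max_latency=5.0))
```

With these made-up numbers, the configuration that pairs the large planner with the small solver is dominated and dropped, leaving three entries in the compiled set. Because the set is computed once, `select` can be called repeatedly with different latency budgets without re-profiling, which is the sense in which the abstract calls the compiled set a reusable artifact. Exhaustive enumeration is only practical for tiny design spaces like this one.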

Get this paper in your agent:

hf papers read 2605.13647
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
