SPIN: Structural LLM Planning via Iterative Navigation for Industrial Tasks
Authors: Yusuke Ozaki, Dhaval Patel
Organization: IBM
Published: 2026-05-13
arXiv: 2605.14051
Abstract
SPIN is a planning wrapper that combines validated DAG planning with prefix-based execution control to reduce the number of executed tasks and improve plan validity in industrial LLM agent systems.
AI-generated summary
Industrial LLM agent systems often separate planning from execution, yet LLM planners frequently produce structurally invalid or unnecessarily long workflows, leading to brittle failures and avoidable tool and API costs. We propose SPIN, a planning wrapper that combines validated Directed Acyclic Graph (DAG) planning with prefix-based execution control. SPIN enforces a strict DAG contract through _validate_plan_text and repair prompting, so that plans are executable before downstream execution begins. It then evaluates DAG prefixes incrementally and stops as soon as the current prefix is sufficient to answer the query. On AssetOpsBench, across 261 scenarios, SPIN reduces executed tasks from 1061 to 623 and improves Accomplished from 0.638 to 0.706, while reducing tool calls from 11.81 to 6.82 per run. On MCP Bench, the same wrapper improves planning, grounding, and dependency-related scores for both GPT OSS1 and Llama 4 Maverick.
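The two mechanisms named in the abstract, DAG validation with repair and prefix-based early stopping, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how `_validate_plan_text` parses plans, how repair prompts are built, or how prefix sufficiency is judged. The sketch assumes a plan is already parsed into a mapping from task id to dependency ids, and the names `validate_plan`, `plan_with_repair`, `run_prefix`, `propose`, `repair`, `execute`, and `is_sufficient` are all hypothetical.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Any, Callable, Dict, List, Optional

# Assumed plan shape: task id -> ids of the tasks it depends on.
Plan = Dict[str, List[str]]


def validate_plan(plan: Plan) -> Optional[List[str]]:
    """Return a topological order if the plan is a valid DAG, else None."""
    # Every dependency must refer to a task defined in the plan.
    for deps in plan.values():
        if any(dep not in plan for dep in deps):
            return None
    try:
        # static_order() yields each task after all of its dependencies.
        return list(TopologicalSorter(plan).static_order())
    except CycleError:
        return None


def plan_with_repair(
    propose: Callable[[], Plan],
    repair: Callable[[Plan], Plan],
    max_repairs: int = 2,
) -> List[str]:
    """Validate a proposed plan, asking for a repaired plan on failure."""
    plan = propose()  # hypothetical stand-in for the LLM planner call
    for attempt in range(max_repairs + 1):
        order = validate_plan(plan)
        if order is not None:
            return order
        if attempt < max_repairs:
            plan = repair(plan)  # hypothetical stand-in for repair prompting
    raise ValueError("no valid DAG plan after repair attempts")


def run_prefix(
    order: List[str],
    execute: Callable[[str, Dict[str, Any]], Any],
    is_sufficient: Callable[[Dict[str, Any]], bool],
) -> Dict[str, Any]:
    """Execute tasks in order and stop once the results suffice."""
    results: Dict[str, Any] = {}
    for task in order:
        results[task] = execute(task, results)
        # Any prefix of a topological order has all its dependencies met,
        # so stopping here never leaves a completed task half-supported.
        if is_sufficient(results):
            break
    return results
```

In this sketch, early stopping is what reduces the executed-task count: tasks after the first sufficient prefix are never run. The sufficiency check is passed in as a callable because the abstract leaves its definition open.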