r/MachineLearning · June 2, 2026 · 2 min read

MeshFlow: production-safe multi-agent orchestration — SHA-256 audit chain, HIPAA/SOX/GDPR built in, 70-85% token cost reduction [Open Source][D]

Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.

79% of enterprises have adopted AI agents. Only 11% run them in production. We've spent the past year building agent systems for banks, clinical operations teams, and engineering orgs. The problem isn't that agents don't work — they work fine. The problem is that every framework leaves compliance, cost governance, and crash recovery as exercises for the team. After the framework fails them in production. We built MeshFlow to close that gap. **The core idea:** treat governance as infrastructure, not middleware. Every agent step passes through a 15-step kernel that handles identity, rate limiting, budget enforcement, compliance profiles, input/output guardrails, PII detection, risk classification, tool permission, the LLM call itself, audit ledger write, and SLA recording — in that order, always, without configuration. ```python from meshflow import Workflow, CostCap, Agent wf = Workflow(cost_cap=CostCap(usd=5.00)) wf.add(Agent('researcher'), Agent('analyst'), Agent('writer')) result = wf.run('Write a competitive analysis of our market') # Compliant. Durable. Audited. Cost-capped. Done. ``` ```bash pip install meshflow ``` **What's technically interesting:** **Token optimization layer** — five compounding mechanisms that reduce LLM spend 70-85%: - `cache_control` on every system prompt and tool definition (Anthropic: 10% of normal price on cached tokens) - `ModelRouter`: task-type classification routes simple tasks to nano models (keyword + token-count heuristic, zero LLM call) - `ContextCompactor`: sliding window summarization activates at configurable token threshold - `RAGTokenBudget`: hard `max_chars` cap on knowledge injection with truncate/drop/tail strategies - `ContextDeduplicator`: shared context sent once for N parallel agents, not N times **SHA-256 audit chain** — each step record stores `prev_hash` (SHA-256 of the previous record) and `entry_hash` (SHA-256 of its own canonical fields). Modify any log entry and `verify_chain()` breaks. This is the artifact HIPAA §164.312(b) and SOC 2 CC7.2 actually want. **Durable execution** — `DurableWorkflowExecutor` with five backends (memory, SQLite, Redis, Postgres, S3). Same `run_id` on restart resumes from last checkpoint. For workflows with side effects, this is a correctness requirement, not just a convenience. **ReplayLedger interactive API** — `diff(run_a, run_b)` returns a structured `RunDiff` (changed nodes, cost delta, token delta). `fork(run_id, from_step=3)` creates a new run copying steps 0–2. `load_state(run_id, step_index)` for time-travel inspection. The ledger is append-only by design. **Framework-agnostic** — `govern(your_langgraph_graph)`, `from_crewai(your_crew)`, `from_autogen(your_agent)` add governance to any existing system without rewriting it. 4,379 passing tests. Apache 2.0. `pip install meshflow`. GitHub: https://github.com/Anteneh-T-Tessema/meshflow Docs: https://meshflow.dev Happy to answer technical questions about the architecture.

submitted by /u/Adventurous_Tank8261
[link] [comments]

Discussion (0)

No comments yet. Sign in and be the first to say something.

Discussion (0)

More from r/MachineLearning