
Building a Production-Ready Agent Workflow

This walkthrough builds a complete document review workflow in Marie AI and then brings the same workflow shape into Marie Studio as a visual starter.

The workflow does four things:

  • routes a packet to the right specialists
  • runs those specialists in parallel
  • lets one side argue for approval and one side argue for escalation
  • returns a structured DecisionEnvelope

marie-ai owns the runnable execution path; the resulting example lives in projects/marie-ai/examples/agents/document_review_workflow.py.

You can work through this in two passes. Start with the deterministic version so you can see the workflow clearly, then switch to live mode when you want real model-backed specialists and judging.

Before You Start

This tutorial is useful if you want to do three things:

  1. run a routed multi-agent workflow locally without depending on a live model provider
  2. inspect the structured output contract that downstream systems would consume
  3. load the same workflow shape into Marie Studio as a visual starter

You will need:

  • the Marie workspace checked out locally
  • the Python environment used by marie-ai
  • Marie Studio if you want to import the visual starter

By the end, you should have:

  • successful runs of both the safe and the risky sample
  • a clear understanding of why each decision was made
  • an importable prefab JSON you can inspect in Studio

What You Will Build

The example models a document review decision:

  1. inspect the packet
  2. decide which specialists need to run
  3. run those specialists concurrently
  4. let one side argue for approval and one side argue for escalation
  5. emit a structured decision envelope

That is enough to show the production shape without burying the workflow under framework-specific UI.

Why Start Deterministic

The default mode does not depend on a live model provider. That is intentional.

If your first pass requires a running LLM backend, you usually end up with three problems immediately:

  • local development becomes fragile
  • CI coverage becomes expensive and noisy
  • the workflow contract is harder to validate because model variance hides implementation mistakes

So the default path uses rule-based specialists running through Marie’s real coordination layer. That gives you something you can run locally, test in CI, and inspect without wondering whether a model variation caused the result.

Live Variant

The same example also supports a live mode.

In live mode:

  • the specialist findings are generated by a provider-backed LLM
  • the approval and escalation debate turns are generated by a provider-backed LLM
  • the final DecisionEnvelope is generated by a provider-backed LLM

The routing and fan-out stay the same. What changes is how the findings, debate, and final decision are produced.

If you want to use the live path with OpenAI-compatible APIs:

export OPENAI_API_KEY=your_key_here
python examples/agents/document_review_workflow.py --sample risky --mode live --backend openai

If you want to use a Marie-backed model instead:

python examples/agents/document_review_workflow.py --sample risky --mode live --backend marie

Both modes return the same DecisionEnvelope shape. That means you can start with the stable version, then swap in a live provider without changing the downstream contract.

Workflow Shape

The flow is simple:

Input Packet
     |
     v
   Router
     |
     +--> Extraction Agent ----+
     +--> Policy Agent --------+--> Evidence Set --> Debate --> Judge --> DecisionEnvelope
     +--> Risk Agent ----------+
     +--> History Agent -------+

Here is the shape we are building:

{
  "document_id": "DOC-2009",
  "decision": "escalate",
  "confidence": 0.97,
  "next_action": "Route the packet to a human reviewer with the specialist evidence attached.",
  "selected_specialists": ["extraction", "policy", "risk", "history"]
}

Step 0: Load The Visual Starter In Studio

This step is optional, but it makes the workflow shape concrete before you dig into the code.

This prefab mirrors the workflow shape from the tutorial. It is useful when you want to inspect the graph visually while keeping the executable logic in Python.

Studio Import Starter: Production Agent Workflow Example

A visual starter for the routed specialists, fan-out, debate, and structured decision pattern described in this tutorial.

Import path in Studio: Query Plans -> Prefabs -> Import Prefab

The prefab contains these nodes:

  • Route Specialists (BRANCH, 4 paths)
  • Extraction Specialist (COMPUTE, EXECUTOR_ENDPOINT): agent://document-review/extraction
  • Policy Specialist (COMPUTE, EXECUTOR_ENDPOINT): agent://document-review/policy
  • Risk Specialist (COMPUTE, EXECUTOR_ENDPOINT): agent://document-review/risk
  • History Specialist (COMPUTE, EXECUTOR_ENDPOINT): agent://document-review/history
  • Collect Evidence (MERGER, MERGER_ENHANCED, WAIT_ALL_ACTIVE)
  • Debate Findings (COMPUTE, EXECUTOR_ENDPOINT): agent://document-review/debate
  • Judge Decision (COMPUTE, EXECUTOR_ENDPOINT): agent://document-review/judge

After downloading the JSON:

  1. open Marie Studio
  2. go to Query Plans -> Prefabs
  3. choose Import Prefab
  4. import the downloaded file into your workspace

Once you import it, you have a graph you can inspect while you work through the rest of the example.

Step 1: Start With The Output Contract

The workflow starts with explicit models:

  • WorkflowInput
  • SpecialistFinding
  • DebateTurn
  • DecisionEnvelope

Start here because everything else in the workflow feeds this contract.

class DecisionEnvelope(BaseModel):
    document_id: str
    decision: Literal["approve", "escalate"]
    confidence: float
    summary: str
    next_action: str
    missing_evidence: list[str] = Field(default_factory=list)
    citations: list[str] = Field(default_factory=list)
    selected_specialists: list[str] = Field(default_factory=list)
    specialist_findings: list[SpecialistFinding] = Field(default_factory=list)
    debate: list[DebateTurn] = Field(default_factory=list)

This is the piece you want downstream systems to consume. Free-form text is easy to demo, but hard to route, audit, and evaluate.
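To make that concrete, here is a hedged sketch of what a downstream consumer might do with the envelope. The payload mirrors the shape shown in this tutorial, but the `route` helper and the queue names are hypothetical, not part of the example:

```python
import json

# Hypothetical downstream consumer: route on the structured fields
# rather than parsing free-form text.
raw = """
{
  "document_id": "DOC-2009",
  "decision": "escalate",
  "confidence": 0.97,
  "next_action": "Route the packet to a human reviewer.",
  "missing_evidence": ["signature"],
  "selected_specialists": ["extraction", "policy", "risk", "history"]
}
"""

envelope = json.loads(raw)

def route(envelope: dict) -> str:
    # A review queue only needs the contract fields, not the prose summary.
    if envelope["decision"] == "escalate":
        return f"human-review:{envelope['document_id']}"
    return f"auto-approve:{envelope['document_id']}"

assert route(envelope) == "human-review:DOC-2009"
```

Because every run emits the same fields, this consumer never has to change when you move from deterministic to live mode.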

Step 2: Add A Small Router

The router is intentionally narrow. It does not try to be smart with a model. It encodes a few clear branching rules:

  • always run extraction
  • always run policy
  • always run risk
  • add history when prior flags, vendor history, or customer tier justify it

def select_specialists(request: WorkflowInput) -> list[str]:
    specialists = ["extraction", "policy", "risk"]
    if (
        request.prior_flags > 0
        or request.vendor != "unknown"
        or request.customer_tier != "standard"
    ):
        specialists.append("history")
    return specialists

This keeps the decision boundary obvious. You can see exactly why history gets added.
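You can exercise the branch directly. This sketch uses a plain dataclass as a stand-in for the example's WorkflowInput model, assuming only the three fields the router actually reads:

```python
from dataclasses import dataclass

@dataclass
class WorkflowInput:  # minimal stand-in for the Pydantic model
    prior_flags: int = 0
    vendor: str = "unknown"
    customer_tier: str = "standard"

def select_specialists(request: WorkflowInput) -> list[str]:
    specialists = ["extraction", "policy", "risk"]
    if (
        request.prior_flags > 0
        or request.vendor != "unknown"
        or request.customer_tier != "standard"
    ):
        specialists.append("history")
    return specialists

# A vanilla packet runs the three always-on specialists.
assert select_specialists(WorkflowInput()) == ["extraction", "policy", "risk"]
# Prior flags are one of the signals that pull in the history specialist.
assert select_specialists(WorkflowInput(prior_flags=2))[-1] == "history"
```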

Step 3: Fan Out To Specialists

Now wire those specialists through Marie’s real coordination layer:

from marie.agent.config import CoordinationConfig
from marie.agent.coordination.fan_out import FanOutCoordinator

config = CoordinationConfig(
    topology="parallel",
    merge_strategy="aggregate",
    max_concurrent=4,
    timeout=10.0,
)
coordinator = FanOutCoordinator(config)

The specialists are simple, but the execution model is real:

  • bounded concurrency
  • coordinated results
  • consistent merge behavior
  • a runtime contract that can later move behind the gateway workflow API

That is what makes this more than a toy script. You get the real workflow mechanics even before you add a live model.
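If you want a feel for what the coordinator is doing, here is a generic sketch of bounded, order-preserving fan-out using the standard library. This is an illustration of the mechanics, not the FanOutCoordinator implementation:

```python
import asyncio

async def run_specialist(name: str, sem: asyncio.Semaphore) -> dict:
    # Each specialist runs under the semaphore, capping concurrency
    # the way max_concurrent does in the coordination config.
    async with sem:
        await asyncio.sleep(0)  # stand-in for real specialist work
        return {"specialist": name, "summary": f"{name} finding"}

async def fan_out(specialists: list[str], max_concurrent: int = 4) -> list[dict]:
    sem = asyncio.Semaphore(max_concurrent)
    tasks = [run_specialist(n, sem) for n in specialists]
    # Aggregate merge: collect every finding, preserving input order.
    return await asyncio.gather(*tasks)

findings = asyncio.run(fan_out(["extraction", "policy", "risk", "history"]))
assert [f["specialist"] for f in findings] == ["extraction", "policy", "risk", "history"]
```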

Step 4: Make Each Specialist Return Evidence

Each specialist returns a SpecialistFinding with:

  • summary
  • blockers
  • evidence
  • citations
  • risk_score

That keeps the later debate grounded in actual findings instead of vague roleplay.

For example, the policy specialist checks approval limits and regulated-customer requirements. The risk specialist scans for manual-review phrases like manual override, urgent wire, and refund outside cycle.
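The phrase scan can be as small as a substring check over a fixed list. A minimal sketch, assuming the scan is case-insensitive (the phrase list comes from the tutorial; the function name is illustrative):

```python
RISK_PHRASES = ("manual override", "urgent wire", "refund outside cycle")

def scan_risk_phrases(text: str) -> list[str]:
    # Case-insensitive scan: return every manual-review phrase present.
    lowered = text.lower()
    return [phrase for phrase in RISK_PHRASES if phrase in lowered]

hits = scan_risk_phrases("Requesting an URGENT WIRE with manual override approval.")
assert hits == ["manual override", "urgent wire"]
```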

The extraction specialist also resolves a subtle but important input issue: signature language. During verification, the first version incorrectly treated "Unsigned" as if it contained "signed". That bug was fixed by matching explicit signature terms and overriding the result when unsigned appears.
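One way to express that fix is to match whole words instead of substrings, then treat an explicit "unsigned" as an override. This is a sketch of the pattern, not the exact code in the example:

```python
import re

def has_signature(text: str) -> bool:
    lowered = text.lower()
    # An explicit "unsigned" overrides everything else.
    if re.search(r"\bunsigned\b", lowered):
        return False
    # Word-boundary match: a bare `"signed" in lowered` would
    # also match inside "unsigned", which was the original bug.
    return re.search(r"\bsigned\b", lowered) is not None

assert "signed" in "unsigned"  # the substring trap the first version fell into
assert has_signature("Signed by the account owner")
assert not has_signature("Unsigned draft attached")
```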

This is also why the deterministic pass is useful. Logic bugs are easy to spot when the model is not part of the story yet.

Step 5: Let The Two Sides Argue

Once the specialist results are collected, the workflow creates two debate turns:

  • approve_agent
  • challenge_agent

The approval side tries to justify an automatic decision. The challenge side tries to justify manual review.

Both sides work from the same evidence set. They do not call new tools or change the packet. They only interpret the findings differently.

That makes the debate cheap, explainable, and easy to inspect later.
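A minimal sketch of that pattern, using a plain dataclass as a stand-in for the example's DebateTurn model and toy findings dicts. The field names and arguments are illustrative:

```python
from dataclasses import dataclass

@dataclass
class DebateTurn:  # stand-in for the Pydantic model
    agent: str
    position: str
    argument: str

def build_debate(findings: list[dict]) -> list[DebateTurn]:
    # Both sides read the same evidence; neither calls new tools.
    blockers = [b for f in findings for b in f.get("blockers", [])]
    approve = DebateTurn(
        agent="approve_agent",
        position="approve",
        argument="No blockers remain." if not blockers else "Blockers are minor.",
    )
    challenge = DebateTurn(
        agent="challenge_agent",
        position="escalate",
        argument=(
            f"{len(blockers)} blocker(s) justify manual review."
            if blockers
            else "Evidence is thin."
        ),
    )
    return [approve, challenge]

turns = build_debate([{"blockers": ["missing signature"]}])
assert [t.agent for t in turns] == ["approve_agent", "challenge_agent"]
```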

Step 6: End With A Structured Decision

The workflow ends with a judge that computes the final DecisionEnvelope.

That envelope carries the fields production systems actually need:

  • final decision
  • confidence
  • missing evidence
  • citations
  • next action
  • the specialist findings and debate record

{
  "document_id": "DOC-1001",
  "decision": "approve",
  "missing_evidence": [],
  "selected_specialists": ["extraction", "policy", "risk", "history"]
}

This is the part that turns the example into something reusable. The result is shaped for systems, not just for a human reading the console.
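To show the shape of a deterministic judge, here is a toy aggregation over the specialist findings. The threshold, the confidence formula, and the field names are assumptions for illustration, not the example's actual rules:

```python
def judge(findings: list[dict], threshold: float = 0.5) -> dict:
    # Toy rule: escalate when any specialist reports high risk
    # or when any blockers remain; otherwise approve.
    max_risk = max((f.get("risk_score", 0.0) for f in findings), default=0.0)
    blockers = [b for f in findings for b in f.get("blockers", [])]
    decision = "escalate" if (max_risk >= threshold or blockers) else "approve"
    return {"decision": decision, "missing_evidence": blockers}

assert judge([{"risk_score": 0.9, "blockers": ["urgent wire"]}])["decision"] == "escalate"
assert judge([{"risk_score": 0.1, "blockers": []}])["decision"] == "approve"
```

The point is not the rule itself but that the judge emits the same envelope fields in every mode.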

Step 7: Run The Deterministic Version

From projects/marie-ai:

source ~/environments/marie-3.12/bin/activate
python examples/agents/document_review_workflow.py --sample safe
python examples/agents/document_review_workflow.py --sample risky

What to expect:

  • the safe sample returns an approve decision with little or no missing evidence
  • the risky sample returns an escalate decision with concrete missing evidence and citations
  • both runs emit the same DecisionEnvelope shape

When you look at the JSON, focus on these fields first:

  • decision
  • confidence
  • next_action
  • missing_evidence
  • selected_specialists

Then run the focused test:

source ~/environments/marie-3.12/bin/activate
python -m pytest tests/integration/agent/test_document_review_workflow.py

The focused workflow test currently passes.

Step 8: Run The Live Version

Once the deterministic path makes sense, rerun the same workflow in live mode:

export OPENAI_API_KEY=your_key_here
python examples/agents/document_review_workflow.py --sample safe --mode live --backend openai
python examples/agents/document_review_workflow.py --sample risky --mode live --backend openai

What changes in live mode:

  • the findings will become more natural and expressive
  • confidence values may shift from run to run
  • the debate and final summary will reflect model behavior, not just fixed rules

What should stay stable:

  • the selected workflow shape
  • the DecisionEnvelope schema
  • the downstream fields your systems consume

Step 9: Adapt It For Your Own Workflow

At this point you have a complete loop:

  1. run the backend example
  2. inspect the DecisionEnvelope
  3. import the visual starter into Studio
  4. change the routing rules, specialists, or output contract for your own domain

The safest way to adapt it is to keep the envelope stable and change one layer at a time:

  • first change the input packet and router rules
  • then change specialist logic
  • only then replace deterministic checks with live model calls or tools

Files To Look At

If you want to keep working on this example, these are the main files:

  • projects/marie-ai/examples/agents/document_review_workflow.py
  • projects/marie-ai/tests/integration/agent/test_document_review_workflow.py
  • projects/marie-studio/docs/content/tutorials/agents/production-workflow-example.mdx

Try These Next

Swap in your own routing rules

Replace the document-review heuristics with the real decision points from your domain. Claims intake, vendor onboarding, contract review, and exception handling all use the same routed-specialist shape.

Replace deterministic specialists with live model calls

Keep the same contract models, then move one specialist at a time from rule-based logic to LLM-backed or tool-backed execution. That lets you measure model variance without changing the workflow shape.

Push the decision envelope into your own downstream systems

The useful part of the example is not the printed JSON. It is the stable envelope. Feed that object into review queues, audit logs, evaluation datasets, or approval systems.

Add runtime visibility after you wire the workflow into your own runtime

Once you expose the workflow through your own API or gateway, connect it to your debugger, orchestration, or monitoring views.

Takeaway

The useful part of this pattern is not the debate by itself or the fan-out by itself. It is the combination of routing, concurrent specialists, structured output, and a final decision object that other systems can trust.
