Marie Execution Fabric
Marie Execution Fabric is the resource-aware execution layer that runs heterogeneous AI and document-processing work across Marie executors. It combines DAG semantics with global priority planning, executor-class capacity, leases, and slot-based dispatch so that long and short jobs can share the same runtime without turning the scheduler into a blind queue.
The fabric is designed for document workloads that can span thousands of pages, many request types, and many executor classes. It does not assume every job has the same cost. It continuously chooses runnable work based on dependencies, priority, and available executor capacity.
Why It Exists
Traditional workflow platforms are excellent at defining, observing, and replaying workflows. Marie needs that, but it also needs a scheduler that can keep specialized AI executors busy while respecting page-scale document pipelines, SLA pressure, and typed capacity.
Marie Execution Fabric exists to answer operational questions such as:
- What work is globally runnable right now?
- Which runnable item is most important?
- Which executor class can run it?
- Is there capacity available before dispatch?
- Can the work be leased and tracked safely?
- How do we keep short and long jobs from blocking each other unnecessarily?
Core Model
Submitted DAG
-> persisted DAG and job records
-> MemoryFrontier of runnable work
-> GlobalPriorityExecutionPlanner
-> DB lease and executor slot reservation
-> typed executor dispatch
-> terminal callback and slot release
The important boundary is that Marie does not merely start a DAG and wait for a backend to finish. The scheduler keeps a global frontier of ready work and only dispatches when the matching executor class has capacity.
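The frontier idea can be illustrated with a minimal in-memory sketch. `FrontierSketch` and its methods are hypothetical names chosen for this example, not the actual `MemoryFrontier` API; the sketch only assumes that a job becomes runnable once all of its dependencies have completed.

```python
from collections import defaultdict


class FrontierSketch:
    """Hypothetical stand-in for a ready frontier (not Marie's MemoryFrontier API)."""

    def __init__(self, dependencies):
        # dependencies: job_id -> set of job_ids that must finish first
        self._pending = {job: set(deps) for job, deps in dependencies.items()}
        # Reverse index: job_id -> jobs that are waiting on it
        self._children = defaultdict(set)
        for job, deps in dependencies.items():
            for dep in deps:
                self._children[dep].add(job)
        # Jobs with no dependencies are runnable immediately
        self.ready = {job for job, deps in self._pending.items() if not deps}

    def take(self, job):
        """Remove a job from the frontier when it is handed to dispatch."""
        self.ready.remove(job)

    def complete(self, job):
        """Record a terminal callback and return the jobs that just became ready."""
        newly_ready = []
        for child in sorted(self._children[job]):
            self._pending[child].discard(job)
            if not self._pending[child]:
                self.ready.add(child)
                newly_ready.append(child)
        return newly_ready


deps = {"ocr": set(), "classify": {"ocr"}, "extract": {"ocr", "classify"}}
frontier = FrontierSketch(deps)
frontier.take("ocr")
unlocked_by_ocr = frontier.complete("ocr")
```

Being ready only makes a job eligible. In the model above, dispatch still waits for a lease and a free slot of the matching executor class.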
Fabric Responsibilities
| Layer | Responsibility |
|---|---|
| DAG persistence | Store submitted DAGs and job state before execution |
| MemoryFrontier | Track dependency-satisfied work that is eligible to run |
| GlobalPriorityExecutionPlanner | Choose the next runnable work across DAGs and priorities |
| Slot capacity | Reserve executor capacity before work leaves the scheduler |
| DB leases | Protect work ownership across scheduler and executor boundaries |
| Executor dispatch | Send work to the matching executor class only after capacity exists |
| Runtime observability | Expose lifecycle timings, slot occupancy, queue pressure, and terminal state |
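The lease row can be illustrated with a generic lease pattern. This page does not describe Marie's lease schema, so everything here is an assumption made for illustration: the `LeaseTable` class, the time-to-live semantics, and the in-memory dictionary standing in for the database.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    owner: str
    expires_at: float


class LeaseTable:
    """Hypothetical in-memory lease table; a real system would use durable storage."""

    def __init__(self):
        self._leases = {}

    def acquire(self, job_id, owner, now, ttl_seconds):
        """Grant the lease unless another owner holds an unexpired one."""
        current = self._leases.get(job_id)
        if current is not None and current.expires_at > now and current.owner != owner:
            return False
        self._leases[job_id] = Lease(owner, now + ttl_seconds)
        return True

    def release(self, job_id, owner):
        """Only the current owner may release the lease."""
        current = self._leases.get(job_id)
        if current is not None and current.owner == owner:
            del self._leases[job_id]
            return True
        return False


table = LeaseTable()
first = table.acquire("job-1", "scheduler-a", now=0.0, ttl_seconds=30.0)
contended = table.acquire("job-1", "scheduler-b", now=10.0, ttl_seconds=30.0)
after_expiry = table.acquire("job-1", "scheduler-b", now=31.0, ttl_seconds=30.0)
```

Passing `now` explicitly keeps the sketch deterministic. Expiry is one common way to recover work whose owner has stopped responding; whether Marie uses it is not stated here.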
Resource-Aware Scheduling
Execution Fabric schedules against executor capacity, not just DAG order. A runnable job is not dispatched until the scheduler can reserve the right kind of slot.
ready work + priority + executor class + available slots -> dispatch decision
This is what allows Marie to handle mixed workloads:
- short validation tasks
- long document extraction tasks
- page-level fan-out
- LLM-backed annotation work
- layout, OCR, classifier, splitter, and custom processor stages
- pipelines with thousands of pages and many dependent job nodes
The scheduler can run these together because readiness, priority, and capacity are evaluated at the work-item level.
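A minimal sketch of that work-item-level decision follows. The names `WorkItem` and `plan_dispatch` are hypothetical, and the convention that a lower `priority` value is more urgent is an assumption; the actual policy of `GlobalPriorityExecutionPlanner` is not specified on this page.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class WorkItem:
    priority: int                                # assumption: lower value = more urgent
    job_id: str = field(compare=False)
    executor_class: str = field(compare=False)


def plan_dispatch(ready, free_slots):
    """Return job ids to dispatch now, most urgent first.

    Items whose executor class has no free slot are skipped rather than
    blocking less urgent items that target a different class.
    """
    slots = dict(free_slots)   # copy so the caller's view is not mutated
    heap = list(ready)
    heapq.heapify(heap)
    dispatched = []
    while heap:
        item = heapq.heappop(heap)
        if slots.get(item.executor_class, 0) > 0:
            slots[item.executor_class] -= 1      # reserve the slot before dispatch
            dispatched.append(item.job_id)
    return dispatched


ready = [
    WorkItem(5, "extract-long", "extract"),
    WorkItem(1, "validate-a", "validate"),
    WorkItem(2, "extract-b", "extract"),
    WorkItem(3, "ocr-1", "ocr"),
]
decision = plan_dispatch(ready, {"extract": 1, "validate": 1, "ocr": 0})
```

Here `ocr-1` and `extract-long` stay on the frontier because their classes have no free slot, while the short validation task and one extraction task are dispatched.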
Runtime Fabric vs Execution Fabric
Marie Studio uses the word fabric for related but distinct control-plane concepts.
| Concept | Scope | Purpose |
|---|---|---|
| Runtime Fabric | Gateway groups | Define homogeneous runtime targets and read/write paths |
| Execution Fabric | Scheduler and executors | Select, lease, dispatch, and complete heterogeneous work |
| LLM Dispatch Runtime Fabric | LLM request transport | Queue and execute executor-originated LLM calls through a fabric read path |
Runtime Fabric decides where runtime traffic is aimed. Execution Fabric decides which work should run next once it is inside Marie.
What Makes It Different
Marie Execution Fabric behaves more like a low-latency dispatch layer than a traditional run launcher because scheduling is tied to live executor capacity.
| Traditional DAG run orchestration | Marie Execution Fabric |
|---|---|
| Start or queue a workflow run | Persist DAG and track runnable work |
| Mark tasks ready as dependencies finish | Maintain a global ready frontier |
| Apply broad concurrency limits | Reserve typed executor slots before dispatch |
| Backend workers eventually pick up tasks | Scheduler dispatches only when capacity exists |
| Optimized for scheduled workflows and backfills | Optimized for heterogeneous AI/document execution |
This does not mean Marie cannot run long jobs. Long-running work is a first-class case. The distinction is that long jobs consume matching executor slots while the planner can continue selecting other runnable work for other available executor classes.
Throughput Estimation
For planning, estimate throughput from effective slot occupancy, not just raw model or processor runtime.
effective_runtime = slot_reserved -> slot_released
estimated_jobs_per_second = executor_count / effective_runtime_seconds
estimated_jobs_per_hour = executor_count * 3600 / effective_runtime_seconds
Use the table below as a planning placeholder. Replace {executors} with the number of available executor slots for the relevant executor class.
| Average effective runtime | Estimated throughput per executor slot | Estimated throughput for {executors} slots |
|---|---|---|
| 1 second | 3,600 jobs/hour | {executors} * 3,600 jobs/hour |
| 30 seconds | 120 jobs/hour | {executors} * 120 jobs/hour |
| 60 seconds | 60 jobs/hour | {executors} * 60 jobs/hour |
These are capacity estimates, not SLA guarantees. SLA behavior depends on arrival bursts, priority policy, executor mix, retry behavior, and the tail of each job class.
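The formulas above translate directly into a small helper. The function names are chosen for this example only, and the result is the same steady-state estimate: it ignores bursts, retries, and tail latency.

```python
def estimated_jobs_per_second(executor_count: int, effective_runtime_seconds: float) -> float:
    """Steady-state estimate: slots divided by average slot occupancy."""
    if effective_runtime_seconds <= 0:
        raise ValueError("effective_runtime_seconds must be positive")
    return executor_count / effective_runtime_seconds


def estimated_jobs_per_hour(executor_count: int, effective_runtime_seconds: float) -> float:
    """Same estimate scaled to an hour (3600 seconds)."""
    if effective_runtime_seconds <= 0:
        raise ValueError("effective_runtime_seconds must be positive")
    return executor_count * 3600 / effective_runtime_seconds


# Four slots of one executor class, 30 seconds average slot occupancy
hourly = estimated_jobs_per_hour(4, 30)  # 480.0
```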
Page-Scale Document Workloads
Execution Fabric is designed for page-scale processing. A single document can fan out into many page-level or region-level jobs, and a batch can contain thousands of pages across many submitted DAGs.
The fabric keeps this manageable by separating:
- DAG state from runnable frontier state
- readiness from dispatch
- executor-class capacity from global priority
- pre-slot scheduler latency from slot occupancy
- terminal callbacks from future scheduling decisions
This lets Marie process large document sets without forcing every page, model call, or processor step through a single serialized workflow runner.
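As an illustration of page-level fan-out, the sketch below builds a dependency map for one document. The stage names (`split`, `ocr`, `extract`, `merge`) and the id format are hypothetical; real pipelines may use different stages and a different DAG representation.

```python
def build_page_fanout(doc_id, page_count):
    """Return job_id -> set of prerequisite job_ids for a single document.

    Shape: one split job, then an independent ocr -> extract chain per page,
    then one merge job that waits for every page's extract step.
    """
    split = f"{doc_id}/split"
    merge = f"{doc_id}/merge"
    deps = {split: set(), merge: set()}
    for page in range(1, page_count + 1):
        ocr = f"{doc_id}/p{page}/ocr"
        extract = f"{doc_id}/p{page}/extract"
        deps[ocr] = {split}          # every page waits for the split
        deps[extract] = {ocr}        # extract only waits for its own page's OCR
        deps[merge].add(extract)     # merge waits for all pages
    return deps


dag = build_page_fanout("invoice-42", 3)
```

Because each page's chain depends only on its own predecessors, pages can become ready and be dispatched independently instead of moving through the pipeline in lockstep.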
Operational Signals
Use these signals to understand whether the fabric is healthy:
| Signal | What it tells you |
|---|---|
| Frontier wait | Time from DAG insertion to first planner visibility |
| Candidate planning time | Time spent choosing globally runnable work |
| DB lease time | Cost of durable ownership before dispatch |
| Slot wait | Whether executor capacity is saturated |
| Slot occupancy | Effective runtime used for throughput estimates |
| Callback-to-release time | Time to free capacity after executor completion |
| SLA pressure | Whether priority policy is keeping deadlines safe |
The most important throughput metric is slot occupancy. The most important latency metric is the end-to-end path from submit to terminal state.
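Several of these signals can be derived from per-job lifecycle timestamps. The field names below are hypothetical, and the exact boundaries of each interval are assumptions (for example, slot wait is measured here from planner visibility to slot reservation, which also includes planning and lease time).

```python
from dataclasses import dataclass


@dataclass
class Lifecycle:
    """Hypothetical per-job timestamps, in seconds since submission."""
    submitted: float
    frontier_visible: float
    slot_reserved: float
    executor_done: float
    slot_released: float
    terminal: float


def signals(t: Lifecycle) -> dict:
    """Derive interval signals from one job's lifecycle timestamps."""
    return {
        "frontier_wait": t.frontier_visible - t.submitted,
        "slot_wait": t.slot_reserved - t.frontier_visible,
        "slot_occupancy": t.slot_released - t.slot_reserved,
        "callback_to_release": t.slot_released - t.executor_done,
        "end_to_end": t.terminal - t.submitted,
    }


job = Lifecycle(
    submitted=0.0,
    frontier_visible=0.5,
    slot_reserved=2.0,
    executor_done=32.0,
    slot_released=32.5,
    terminal=32.5,
)
job_signals = signals(job)
```

Averaging `slot_occupancy` per executor class gives the effective runtime used in the throughput estimates, while `end_to_end` tracks the submit-to-terminal latency.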