Marie Execution Fabric

Marie Execution Fabric is the resource-aware execution layer that runs heterogeneous AI and document-processing work across Marie executors. It combines DAG semantics with global priority planning, executor-class capacity, leases, and slot-based dispatch so that long and short jobs can share the same runtime without turning the scheduler into a blind queue.

The fabric is designed for document workloads that can span thousands of pages, many request types, and many executor classes. It does not assume every job has the same cost; instead, it continuously chooses runnable work based on dependencies, priority, and available executor capacity.

Why It Exists

Traditional workflow platforms are excellent at defining, observing, and replaying workflows. Marie needs that, but it also needs a scheduler that can keep specialized AI executors busy while respecting page-scale document pipelines, SLA pressure, and typed capacity.

Marie Execution Fabric exists to answer operational questions such as:

What work is globally runnable right now?
Which runnable item is most important?
Which executor class can run it?
Is there capacity available before dispatch?
Can the work be leased and tracked safely?
How do we keep short and long jobs from blocking each other unnecessarily?

Core Model


Submitted DAG
  -> persisted DAG and job records
  -> MemoryFrontier of runnable work
  -> GlobalPriorityExecutionPlanner
  -> DB lease and executor slot reservation
  -> typed executor dispatch
  -> terminal callback and slot release

The important boundary is that Marie does not merely start a DAG and wait for a backend to finish. The scheduler keeps a global frontier of ready work and only dispatches when the matching executor class has capacity.
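
As a rough illustration of the frontier step in the flow above, the Python sketch below tracks which jobs in a DAG are dependency-satisfied and promotes their dependents as work completes. `FrontierSketch`, its fields, and its methods are hypothetical names chosen for this example; they are not Marie's `MemoryFrontier` API, and persistence, priorities, and leases are deliberately left out.

```python
from collections import defaultdict


class FrontierSketch:
    """Toy ready-work tracker (hypothetical; not Marie's MemoryFrontier)."""

    def __init__(self, dag: dict[str, set[str]]):
        # dag maps each job id to the set of job ids it depends on.
        self.pending = {job: set(deps) for job, deps in dag.items()}
        self.dependents: dict[str, set[str]] = defaultdict(set)
        for job, deps in dag.items():
            for dep in deps:
                self.dependents[dep].add(job)
        self.done: set[str] = set()
        # Jobs with no dependencies are runnable immediately.
        self.ready = {job for job, deps in self.pending.items() if not deps}

    def complete(self, job: str) -> set[str]:
        """Mark a job terminal and return the jobs that just became runnable."""
        if job in self.done:
            return set()  # ignore duplicate terminal callbacks
        self.done.add(job)
        self.ready.discard(job)
        newly_ready = set()
        for child in self.dependents[job]:
            self.pending[child].discard(job)
            if not self.pending[child]:
                newly_ready.add(child)
        self.ready |= newly_ready
        return newly_ready
```

With a DAG such as `{"ocr": set(), "layout": set(), "extract": {"ocr", "layout"}}`, the `extract` job enters the ready set only after both `ocr` and `layout` have completed.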

Fabric Responsibilities

Layer	Responsibility
DAG persistence	Store submitted DAGs and job state before execution
MemoryFrontier	Track dependency-satisfied work that is eligible to run
GlobalPriorityExecutionPlanner	Choose the next runnable work across DAGs and priorities
Slot capacity	Reserve executor capacity before work leaves the scheduler
DB leases	Protect work ownership across scheduler and executor boundaries
Executor dispatch	Send work to the matching executor class only after capacity exists
Runtime observability	Expose lifecycle timings, slot occupancy, queue pressure, and terminal state
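
The lease row in the table above can be pictured with a small sketch. It assumes an expiry-based lease (an owner holds a job until a deadline passes) and uses an in-memory SQLite table as a stand-in for the real store. The table layout, `open_lease_store`, and `try_lease` are hypothetical and are not taken from Marie's schema or API.

```python
import sqlite3


def open_lease_store() -> sqlite3.Connection:
    """Create an in-memory lease table (illustrative schema only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE leases ("
        "job_id TEXT PRIMARY KEY, owner TEXT NOT NULL, expires_at REAL NOT NULL)"
    )
    return conn


def try_lease(conn: sqlite3.Connection, job_id: str, owner: str,
              now: float, ttl: float) -> bool:
    """Return True if `owner` now holds the lease for `job_id`."""
    with conn:  # run both statements in one transaction
        # Take over an existing lease only if it has already expired.
        cur = conn.execute(
            "UPDATE leases SET owner = ?, expires_at = ? "
            "WHERE job_id = ? AND expires_at <= ?",
            (owner, now + ttl, job_id, now),
        )
        if cur.rowcount == 1:
            return True
        # Otherwise create the lease if no row exists yet for this job.
        cur = conn.execute(
            "INSERT OR IGNORE INTO leases (job_id, owner, expires_at) "
            "VALUES (?, ?, ?)",
            (job_id, owner, now + ttl),
        )
        return cur.rowcount == 1
```

In this sketch a second owner asking for the same job before the deadline is refused, which is the ownership guarantee the lease layer is meant to provide before dispatch.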

Resource-Aware Scheduling

Execution Fabric schedules against executor capacity, not just DAG order. A runnable job is not dispatched until the scheduler can reserve the right kind of slot.


ready work + priority + executor class + available slots -> dispatch decision

This is what allows Marie to handle mixed workloads:

short validation tasks
long document extraction tasks
page-level fan-out
LLM-backed annotation work
layout, OCR, classifier, splitter, and custom processor stages
pipelines with thousands of pages and many dependent job nodes

The scheduler can run these together because readiness, priority, and capacity are evaluated at the work-item level.
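
A minimal sketch of that per-item decision follows. It assumes a lower priority number means more urgent work and that each work item needs exactly one slot of its executor class. `WorkItem` and `plan_dispatch` are hypothetical names for illustration, not the `GlobalPriorityExecutionPlanner` interface.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class WorkItem:
    priority: int                              # assumption: lower = more urgent
    job_id: str = field(compare=False)
    executor_class: str = field(compare=False)


def plan_dispatch(ready: list[WorkItem],
                  free_slots: dict[str, int]) -> list[WorkItem]:
    """Pick ready items in priority order, reserving one typed slot per item."""
    slots = dict(free_slots)    # copy so the caller's capacity view is untouched
    heap = list(ready)
    heapq.heapify(heap)
    dispatched = []
    while heap:
        item = heapq.heappop(heap)
        if slots.get(item.executor_class, 0) > 0:
            slots[item.executor_class] -= 1    # reserve capacity before dispatch
            dispatched.append(item)
        # Items whose class is saturated are skipped and remain ready
        # for a later planning pass.
    return dispatched
```

Because capacity is checked per executor class, a saturated extractor pool does not stop a short validation task from being dispatched to a free validator slot in the same pass.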

Runtime Fabric vs Execution Fabric

Marie Studio uses the word fabric for three related but distinct control-plane concepts.

Concept	Scope	Purpose
Runtime Fabric	Gateway groups	Define homogeneous runtime targets and read/write paths
Execution Fabric	Scheduler and executors	Select, lease, dispatch, and complete heterogeneous work
LLM Dispatch Runtime Fabric	LLM request transport	Queue and execute executor-originated LLM calls through a fabric read path

Runtime Fabric decides where runtime traffic is aimed. Execution Fabric decides which work should run next once it is inside Marie.

What Makes It Different

Marie Execution Fabric behaves more like a low-latency dispatcher than a traditional run launcher because scheduling is tied to live executor capacity.

Traditional DAG run orchestration	Marie Execution Fabric
Start or queue a workflow run	Persist DAG and track runnable work
Mark tasks ready as dependencies finish	Maintain a global ready frontier
Apply broad concurrency limits	Reserve typed executor slots before dispatch
Backend workers eventually pick up tasks	Scheduler dispatches only when capacity exists
Optimized for scheduled workflows and backfills	Optimized for heterogeneous AI/document execution

Low latency does not exclude long jobs: long-running work is a first-class case. Long jobs hold their matching executor slots while the planner continues selecting other runnable work for executor classes that still have capacity.

Throughput Estimation

For planning, estimate throughput from effective slot occupancy, not just raw model or processor runtime.


effective_runtime_seconds = seconds elapsed from slot_reserved to slot_released

estimated_jobs_per_second =
  executor_count / effective_runtime_seconds

estimated_jobs_per_hour =
  executor_count * 3600 / effective_runtime_seconds

Use the table below as a planning template. Replace `{executors}` with the number of available executor slots for the relevant executor class (the executor_count in the formulas above).

Average effective runtime	Estimated throughput per executor slot	Estimated throughput for `{executors}` slots
1 second	3,600 jobs/hour	`{executors} * 3,600 jobs/hour`
30 seconds	120 jobs/hour	`{executors} * 120 jobs/hour`
60 seconds	60 jobs/hour	`{executors} * 60 jobs/hour`

These are capacity estimates, not SLA guarantees. SLA behavior depends on arrival bursts, priority policy, executor mix, retry behavior, and the tail of each job class.
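
The formulas above translate directly into a small helper. This is a sketch under the same assumption the formulas make, namely that every slot stays continuously occupied; the function name is illustrative.

```python
def estimated_jobs_per_hour(executor_count: int,
                            effective_runtime_seconds: float) -> float:
    """Upper-bound throughput estimate for one executor class.

    effective_runtime_seconds is the average time a slot is held,
    measured from slot reservation to slot release.
    """
    if executor_count < 0:
        raise ValueError("executor_count must be non-negative")
    if effective_runtime_seconds <= 0:
        raise ValueError("effective_runtime_seconds must be positive")
    return executor_count * 3600 / effective_runtime_seconds
```

For example, `estimated_jobs_per_hour(4, 30)` returns `480.0`, which matches the 30-second row of the table multiplied by four slots.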

Page-Scale Document Workloads

Execution Fabric is designed for page-scale processing. A single document can fan out into many page-level or region-level jobs, and a batch can contain thousands of pages across many submitted DAGs.

The fabric keeps this manageable by separating:

DAG state from runnable frontier state
readiness from dispatch
executor-class capacity from global priority
pre-slot scheduler latency from slot occupancy
terminal callbacks from future scheduling decisions

This lets Marie process large document sets without forcing every page, model call, or processor step through a single serialized workflow runner.

Operational Signals

Use these signals to understand whether the fabric is healthy:

Signal	What it tells you
Frontier wait	Time from DAG insertion to first planner visibility
Candidate planning time	Time spent choosing globally runnable work
DB lease time	Cost of durable ownership before dispatch
Slot wait	Whether executor capacity is saturated
Slot occupancy	Effective runtime used for throughput estimates
Callback-to-release time	Time to free capacity after executor completion
SLA pressure	Whether priority policy is keeping deadlines safe

The most important throughput metric is slot occupancy. The most important latency metric is the end-to-end path from submit to terminal state.
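
Several of these signals are differences between lifecycle timestamps. The sketch below assumes each job records the five timestamps shown. The field names are hypothetical and do not reflect Marie's actual event schema.

```python
from dataclasses import dataclass


@dataclass
class LifecycleTimestamps:
    """Hypothetical per-job timestamps, in seconds on a shared clock."""
    dag_inserted: float
    planner_visible: float
    slot_reserved: float
    executor_callback: float
    slot_released: float


def derive_signals(ts: LifecycleTimestamps) -> dict[str, float]:
    """Compute a few of the operational signals from raw timestamps."""
    return {
        # DAG insertion to first planner visibility.
        "frontier_wait": ts.planner_visible - ts.dag_inserted,
        # Slot reserved to slot released: the effective runtime.
        "slot_occupancy": ts.slot_released - ts.slot_reserved,
        # Executor completion to capacity being freed.
        "callback_to_release": ts.slot_released - ts.executor_callback,
    }
```

Averaging `slot_occupancy` per executor class gives the effective runtime used in the throughput estimates.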