Marie Execution Fabric

Marie Execution Fabric is the resource-aware execution layer that runs heterogeneous AI and document-processing work across Marie executors. It combines DAG semantics with global priority planning, executor-class capacity, leases, and slot-based dispatch so that long and short jobs can share the same runtime without turning the scheduler into a blind queue.

The fabric is designed for document workloads that can span thousands of pages, many request types, and many executor classes. It does not assume every job has the same cost. It continuously chooses runnable work based on dependencies, priority, and available executor capacity.

Why It Exists

Traditional workflow platforms are excellent at defining, observing, and replaying workflows. Marie needs that, but it also needs a scheduler that can keep specialized AI executors busy while respecting page-scale document pipelines, SLA pressure, and typed capacity.

Marie Execution Fabric exists to answer operational questions such as:

  • What work is globally runnable right now?
  • Which runnable item is most important?
  • Which executor class can run it?
  • Is there capacity available before dispatch?
  • Can the work be leased and tracked safely?
  • How do we keep short and long jobs from blocking each other unnecessarily?

Core Model

Submitted DAG
  -> persisted DAG and job records
  -> MemoryFrontier of runnable work
  -> GlobalPriorityExecutionPlanner
  -> DB lease and executor slot reservation
  -> typed executor dispatch
  -> terminal callback and slot release

The important boundary is that Marie does not merely start a DAG and wait for a backend to finish. The scheduler keeps a global frontier of ready work and only dispatches when the matching executor class has capacity.
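As a rough illustration of what a frontier of ready work involves, the sketch below tracks which jobs become runnable as their dependencies finish. It is a minimal sketch under stated assumptions: `ReadyFrontier`, its methods, and the job names are hypothetical and are not the actual MemoryFrontier API.

```python
from collections import defaultdict


class ReadyFrontier:
    """Hypothetical in-memory frontier (not Marie's MemoryFrontier API)."""

    def __init__(self):
        self.pending = {}                    # job_id -> unfinished dependencies
        self.dependents = defaultdict(set)   # job_id -> jobs waiting on it
        self.ready = set()                   # dependency-satisfied, not yet dispatched

    def add_dag(self, dag):
        """Register a DAG given as {job_id: iterable of job_ids it depends on}."""
        for job_id, deps in dag.items():
            deps = set(deps)
            for dep in deps:
                self.dependents[dep].add(job_id)
            if deps:
                self.pending[job_id] = deps
            else:
                self.ready.add(job_id)

    def take(self, job_id):
        """Remove a job from the frontier at dispatch time."""
        self.ready.remove(job_id)

    def complete(self, job_id):
        """Handle a terminal callback and return the jobs that just became ready."""
        newly_ready = set()
        for child in self.dependents.pop(job_id, set()):
            remaining = self.pending[child]
            remaining.discard(job_id)
            if not remaining:
                del self.pending[child]
                newly_ready.add(child)
        self.ready |= newly_ready
        return newly_ready


frontier = ReadyFrontier()
frontier.add_dag({
    "ocr": [],
    "layout": [],
    "extract": ["ocr", "layout"],
    "validate": ["extract"],
})
```

Here `ocr` and `layout` are ready immediately, while `extract` joins the frontier only after both have completed. Several DAGs can be registered in the same frontier, which is what makes the ready set global rather than per-run.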

Fabric Responsibilities

| Layer | Responsibility |
| --- | --- |
| DAG persistence | Store submitted DAGs and job state before execution |
| MemoryFrontier | Track dependency-satisfied work that is eligible to run |
| GlobalPriorityExecutionPlanner | Choose the next runnable work across DAGs and priorities |
| Slot capacity | Reserve executor capacity before work leaves the scheduler |
| DB leases | Protect work ownership across scheduler and executor boundaries |
| Executor dispatch | Send work to the matching executor class only after capacity exists |
| Runtime observability | Expose lifecycle timings, slot occupancy, queue pressure, and terminal state |
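To make the DB lease row more concrete, here is one common way to implement durable ownership: a conditional `UPDATE` that succeeds for exactly one claimant. This is a sketch under assumptions, not Marie's schema. The `jobs` table, its columns, the `try_lease` helper, and the idea that leases expire after a TTL are all hypothetical, and SQLite stands in for whatever database the fabric actually uses.

```python
import sqlite3


def try_lease(conn, job_id, owner, now, ttl_seconds):
    """Claim a job if it is unowned or its lease has expired (hypothetical schema)."""
    cur = conn.execute(
        "UPDATE jobs SET lease_owner = ?, lease_expires_at = ? "
        "WHERE id = ? AND state = 'ready' "
        "AND (lease_owner IS NULL OR lease_expires_at <= ?)",
        (owner, now + ttl_seconds, job_id, now),
    )
    conn.commit()
    # Exactly one row is updated when the claim wins; zero when someone else holds it.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id TEXT PRIMARY KEY, state TEXT, "
    "lease_owner TEXT, lease_expires_at REAL)"
)
conn.execute("INSERT INTO jobs VALUES ('job-1', 'ready', NULL, NULL)")
conn.commit()
```

Because the ownership check and the write happen in a single statement, two schedulers racing for the same job cannot both win. With a TTL, a lease held by a crashed owner becomes claimable again once it expires.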

Resource-Aware Scheduling

Execution Fabric schedules against executor capacity, not just DAG order. A runnable job is not dispatched until the scheduler can reserve the right kind of slot.

ready work + priority + executor class + available slots -> dispatch decision

This is what allows Marie to handle mixed workloads:

  • short validation tasks
  • long document extraction tasks
  • page-level fan-out
  • LLM-backed annotation work
  • layout, OCR, classifier, splitter, and custom processor stages
  • pipelines with thousands of pages and many dependent job nodes

The scheduler can run these together because readiness, priority, and capacity are evaluated at the work-item level.
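The dispatch decision above can be sketched as a planner that keeps one priority heap per executor class and only pops work whose class has a free slot. This is an illustrative sketch, not the GlobalPriorityExecutionPlanner itself: `SlotPlanner`, the class names `ocr` and `llm`, and the "higher number wins" priority convention are assumptions.

```python
import heapq
import itertools


class SlotPlanner:
    """Hypothetical planner: best-priority ready item whose class has a free slot."""

    def __init__(self, capacity):
        self.free = dict(capacity)                    # executor class -> free slots
        self.queues = {cls: [] for cls in capacity}   # one priority heap per class
        self._seq = itertools.count()                 # FIFO tie-breaker within a priority

    def add_ready(self, job_id, executor_class, priority):
        # heapq is a min-heap, so negate priority to pop the largest value first.
        entry = (-priority, next(self._seq), job_id)
        heapq.heappush(self.queues[executor_class], entry)

    def next_dispatch(self):
        """Reserve a slot and return (job_id, executor_class), or None if blocked."""
        best = None
        for cls, heap in self.queues.items():
            if heap and self.free[cls] > 0:
                if best is None or heap[0] < self.queues[best][0]:
                    best = cls
        if best is None:
            return None
        _, _, job_id = heapq.heappop(self.queues[best])
        self.free[best] -= 1   # the slot is reserved before the work leaves the planner
        return job_id, best

    def release(self, executor_class):
        """Return a slot after the terminal callback."""
        self.free[executor_class] += 1


planner = SlotPlanner({"ocr": 1, "llm": 1})
planner.add_ready("long-extract", "llm", priority=5)
planner.add_ready("llm-annotate", "llm", priority=9)
planner.add_ready("validate", "ocr", priority=1)
```

In this setup the first dispatch is the high-priority `llm` job. Once the single `llm` slot is taken, the low-priority `ocr` job still runs, because a saturated class does not block work for other classes.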

Runtime Fabric vs Execution Fabric

Marie Studio uses the word fabric for related but distinct control-plane concepts.

| Concept | Scope | Purpose |
| --- | --- | --- |
| Runtime Fabric | Gateway groups | Define homogeneous runtime targets and read/write paths |
| Execution Fabric | Scheduler and executors | Select, lease, dispatch, and complete heterogeneous work |
| LLM Dispatch Runtime Fabric | LLM request transport | Queue and execute executor-originated LLM calls through a fabric read path |

Runtime Fabric decides where runtime traffic is aimed. Execution Fabric decides which work should run next once it is inside Marie.

What Makes It Different

Marie Execution Fabric behaves more like a low-latency dispatcher than a traditional run launcher, because every scheduling decision is tied to live executor capacity.

| Traditional DAG run orchestration | Marie Execution Fabric |
| --- | --- |
| Start or queue a workflow run | Persist DAG and track runnable work |
| Mark tasks ready as dependencies finish | Maintain a global ready frontier |
| Apply broad concurrency limits | Reserve typed executor slots before dispatch |
| Backend workers eventually pick up tasks | Scheduler dispatches only when capacity exists |
| Optimized for scheduled workflows and backfills | Optimized for heterogeneous AI/document execution |

This does not mean Marie cannot run long jobs. Long-running work is a first-class case. The distinction is that long jobs consume matching executor slots while the planner can continue selecting other runnable work for other available executor classes.

Throughput Estimation

For planning, estimate throughput from effective slot occupancy, not just raw model or processor runtime.

effective_runtime = slot_reserved -> slot_released
estimated_jobs_per_second = executor_count / effective_runtime_seconds
estimated_jobs_per_hour = executor_count * 3600 / effective_runtime_seconds

Use the table below as a planning placeholder. Replace {executors} with the number of available executor slots for the relevant executor class, which is the executor_count in the formulas above.

| Average effective runtime | Estimated throughput per executor slot | Estimated throughput for {executors} slots |
| --- | --- | --- |
| 1 second | 3,600 jobs/hour | {executors} * 3,600 jobs/hour |
| 30 seconds | 120 jobs/hour | {executors} * 120 jobs/hour |
| 60 seconds | 60 jobs/hour | {executors} * 60 jobs/hour |

These are capacity estimates, not SLA guarantees. SLA behavior depends on arrival bursts, priority policy, executor mix, retry behavior, and the tail of each job class.
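The throughput formula translates directly into a small helper. The function name is illustrative; the arithmetic is the `estimated_jobs_per_hour` formula given in this section.

```python
def estimated_jobs_per_hour(executor_count, effective_runtime_seconds):
    """Capacity estimate: slots * 3600 / average slot occupancy in seconds."""
    if effective_runtime_seconds <= 0:
        raise ValueError("effective_runtime_seconds must be positive")
    return executor_count * 3600 / effective_runtime_seconds


# One slot with 30-second jobs, then eight slots with 60-second jobs.
single_slot = estimated_jobs_per_hour(1, 30)
eight_slots = estimated_jobs_per_hour(8, 60)
```

With these inputs, `single_slot` is 120 jobs/hour and `eight_slots` is 480 jobs/hour. Measure `effective_runtime_seconds` per executor class, since a single average across classes hides the slow ones.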

Page-Scale Document Workloads

Execution Fabric is designed for page-scale processing. A single document can fan out into many page-level or region-level jobs, and a batch can contain thousands of pages across many submitted DAGs.

The fabric keeps this manageable by separating:

  • DAG state from runnable frontier state
  • readiness from dispatch
  • executor-class capacity from global priority
  • pre-slot scheduler latency from slot occupancy
  • terminal callbacks from future scheduling decisions

This lets Marie process large document sets without forcing every page, model call, or processor step through a single serialized workflow runner.

Operational Signals

Use these signals to understand whether the fabric is healthy:

| Signal | What it tells you |
| --- | --- |
| Frontier wait | Time from DAG insertion to first planner visibility |
| Candidate planning time | Time spent choosing globally runnable work |
| DB lease time | Cost of durable ownership before dispatch |
| Slot wait | Whether executor capacity is saturated |
| Slot occupancy | Effective runtime used for throughput estimates |
| Callback-to-release time | Time to free capacity after executor completion |
| SLA pressure | Whether priority policy is keeping deadlines safe |

The most important throughput metric is slot occupancy. The most important latency metric is the end-to-end path from submit to terminal state.
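Several of these signals are simple differences between lifecycle timestamps. The sketch below assumes each job records the timestamps named in the fields. The `JobTimeline` class and its field names are hypothetical and do not reflect the fabric's actual telemetry schema.

```python
from dataclasses import dataclass


@dataclass
class JobTimeline:
    """Hypothetical per-job lifecycle timestamps, in seconds."""
    submitted_at: float
    dag_inserted_at: float
    frontier_visible_at: float
    slot_reserved_at: float
    callback_at: float
    slot_released_at: float
    terminal_at: float

    def signals(self):
        return {
            # DAG insertion to first planner visibility.
            "frontier_wait": self.frontier_visible_at - self.dag_inserted_at,
            # slot_reserved -> slot_released, the basis for throughput estimates.
            "slot_occupancy": self.slot_released_at - self.slot_reserved_at,
            # Executor completion callback to freed capacity.
            "callback_to_release": self.slot_released_at - self.callback_at,
            # Submit to terminal state.
            "end_to_end": self.terminal_at - self.submitted_at,
        }


timeline = JobTimeline(
    submitted_at=0.0,
    dag_inserted_at=0.5,
    frontier_visible_at=1.0,
    slot_reserved_at=2.0,
    callback_at=32.0,
    slot_released_at=32.5,
    terminal_at=33.0,
)
signals = timeline.signals()
```

Aggregating `slot_occupancy` per executor class gives the effective runtime used for throughput estimates, while `end_to_end` is the latency figure to compare against deadlines.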
