Sayiir is a durable workflow engine. It’s great at what it does: checkpointed execution, parallel tasks, conditional branching, loops, workflow composition, retries, signals, and crash recovery — all with zero infrastructure.
This page is about what’s missing and what we’re building next. We’re honest about gaps because that’s how you decide if Sayiir is right for your use case today, or if you should wait.
These features are stable and production-ready across Rust, Python, and Node.js:
Durable checkpointing
Continuation-based recovery. No replay, no determinism constraints.
Fork/join parallelism
Run branches in parallel, merge results with a join task.
Retries & timeouts
Exponential backoff, per-task timeouts, durable retry state.
Conditional branching
Route work based on data with route. Key-based routing with optional defaults.
Signals & events
Wait for external events with optional timeouts. Signals are durably buffered.
Delays
Durable delays that don’t tie up a worker while waiting.
Distributed workers
Multiple workers polling a shared PostgreSQL backend.
Python & Node.js bindings
Thin wrappers around the shared Rust core. Pydantic and Zod integration.
Loops with exit conditions
Iterative workflows with LoopResult (again/done). Max iterations policy, durable checkpointing per iteration.
Workflow composition
Inline child workflows with then_flow / thenFlow. Task registries merge automatically. Build modular pipelines from reusable sub-workflows.
PostgreSQL backend
Production-grade persistence with ACID transactions, claim-based distribution, and snapshot history.
Task execution context
Read-only access to workflow ID, instance ID, task ID, and task metadata from within running tasks. All three bindings.
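To make the headline claim concrete, here is a minimal, self-contained sketch of continuation-based recovery in plain Python. This is a conceptual illustration, not Sayiir's actual API or storage format: the engine checkpoints each task's output as it completes, and on restart it resumes from the last checkpoint instead of replaying earlier tasks, which is why tasks don't need to be deterministic.

```python
# Conceptual sketch (not Sayiir's API): checkpoint after each task,
# resume from the last checkpoint on restart instead of replaying.

def run_with_checkpoints(tasks, initial_input, store):
    step = store.get("step", 0)                # last completed step
    value = store.get("value", initial_input)  # last checkpointed output
    for i in range(step, len(tasks)):
        value = tasks[i](value)                # execute the next task
        store["step"], store["value"] = i + 1, value  # durable checkpoint
    return value

calls = []
def double(x): calls.append("double"); return x * 2
def add_one(x): calls.append("add_one"); return x + 1

store = {}
run_with_checkpoints([double, add_one], 5, store)
# Simulate a crash after step 1 by rolling the checkpoint back one step:
store["step"], store["value"] = 1, 10
result = run_with_checkpoints([double, add_one], 5, store)
print(result, calls)  # 11 ['double', 'add_one', 'add_one'] — double is not replayed
```

The second run skips `double` entirely; a replay-based engine would have to re-execute it and therefore require it to be deterministic.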
Workflow context
Status: Planned
Today, each task only sees the output of the previous task. Passing context through a fork requires workarounds — each branch must carry context in its output so the join task can reassemble it.
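The workaround described above can be sketched in plain Python. The task shapes here are hypothetical illustrations, not Sayiir's API: because a join task only sees branch outputs, each branch has to copy the shared context into its result so the join can reassemble it.

```python
# Hypothetical illustration of the fork/join context workaround:
# every branch echoes the shared context alongside its own result.

def branch_a(inp):
    return {"ctx": inp["ctx"], "result": inp["ctx"]["query"].upper()}

def branch_b(inp):
    return {"ctx": inp["ctx"], "result": len(inp["ctx"]["query"])}

def join(outputs):
    ctx = outputs[0]["ctx"]  # every branch carried the same context
    return {"ctx": ctx, "results": [o["result"] for o in outputs]}

inp = {"ctx": {"query": "sayiir", "depth": "detailed"}}
merged = join([branch_a(inp), branch_b(inp)])
print(merged["results"])  # ['SAYIIR', 6]
```

A workflow-scoped context would make the `"ctx"` plumbing unnecessary.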
What it enables: workflow-scoped state that any task can read and update (ctx.get / ctx.set) without threading it through every task’s output.
Proposed API:
```python
@task
def search_web(query: dict, ctx: WorkflowContext) -> list[dict]:
    depth = ctx.get("depth", "detailed")
    results = do_search(query["topic"], depth)
    ctx.set("web_results_count", len(results))
    return results
```

Observability
Status: In progress
Production observability for understanding what’s happening: which tasks are running, how long they take, where failures occur.
instance_id, task_id, worker_id on all spans
This ships as part of the open-source library, independent of Sayiir Server.
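A minimal sketch of how identifiers like instance_id, task_id, and worker_id might be attached to every span, using only the standard library. This is an assumption about the general shape, not Sayiir's tracing implementation; the variable and field names are illustrative.

```python
import contextvars
import time

# Sketch (not Sayiir's implementation): carry workflow identifiers in
# context variables so every span records who ran what, and for how long.
instance_id = contextvars.ContextVar("instance_id", default="-")
worker_id = contextvars.ContextVar("worker_id", default="-")
spans = []

def traced_task(task_id, fn, value):
    start = time.monotonic()
    out = fn(value)
    spans.append({
        "instance_id": instance_id.get(),
        "worker_id": worker_id.get(),
        "task_id": task_id,
        "duration_s": time.monotonic() - start,
    })
    return out

instance_id.set("wf-42")
worker_id.set("worker-1")
traced_task("double", lambda x: x * 2, 5)
print(spans[0]["instance_id"], spans[0]["task_id"])  # wf-42 double
```

With those attributes on every span, traces can be filtered by instance, task, or worker to answer exactly the questions above: which tasks are running, how long they take, and where failures occur.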
Streaming
Status: Exploring
LLM responses benefit from token-by-token streaming. Sayiir tasks are atomic — they run to completion and get checkpointed. Streaming conflicts with this model.
What we’re exploring:
@streaming_task decorator that streams partial results while still checkpointing the final output
This is the hardest problem on the roadmap because it touches the core execution model.
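One possible semantics for the @streaming_task idea can be sketched as follows. This is speculative, since the decorator is still being explored: chunks stream to consumers as they are produced, but only the assembled final output is checkpointed, keeping the atomic task model intact.

```python
# Speculative sketch of @streaming_task semantics (not an implemented API):
# stream partial results immediately, checkpoint only the final output.

def run_streaming_task(gen_fn, on_chunk, store):
    chunks = []
    for chunk in gen_fn():
        on_chunk(chunk)          # deliver partial result to the consumer now
        chunks.append(chunk)
    store["checkpoint"] = "".join(chunks)  # durable state: final output only
    return store["checkpoint"]

def llm_tokens():
    # Stand-in for token-by-token LLM output.
    yield from ["Dur", "able ", "stream"]

streamed = []
store = {}
final = run_streaming_task(llm_tokens, streamed.append, store)
print(final)  # Durable stream
```

The tension the page describes is visible here: if the worker crashes mid-stream, nothing was checkpointed, so the consumer may see chunks from a task that later re-runs from scratch.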
Features for high-scale and specialized production use cases.
Long-running workflows that loop indefinitely (monitoring, polling, recurring processing) without unbounded state growth.
continue_as_new(input) primitive — restart with fresh state
The checkpoint model makes this fundamentally easier than replay-based systems.
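The continue_as_new idea can be illustrated with a self-contained sketch. The control-flow shape (an exception carrying the fresh input) is an assumption for illustration, not the proposed design: a long-running loop periodically restarts itself with a compact input, so checkpoint state stays bounded no matter how long the workflow runs.

```python
# Illustrative sketch of continue_as_new semantics (proposed, not implemented):
# restart the workflow with fresh input instead of accumulating state forever.

class ContinueAsNew(Exception):
    def __init__(self, new_input):
        self.new_input = new_input

def poll_workflow(state):
    state = dict(state, polls=state["polls"] + 1)
    if state["polls"] % 100 == 0:
        # Restart, keeping only the data the next run needs.
        raise ContinueAsNew({"cursor": state["cursor"], "polls": 0})
    return state

def run(workflow, inp, steps):
    restarts = 0
    for _ in range(steps):
        try:
            inp = workflow(inp)
        except ContinueAsNew as c:
            inp = c.new_input   # fresh run: prior history can be discarded
            restarts += 1
    return inp, restarts

final, restarts = run(poll_workflow, {"cursor": "abc", "polls": 0}, 250)
print(final["polls"], restarts)  # 50 2
```

After 250 iterations the workflow has restarted twice and carries only its cursor and a small counter, rather than 250 iterations of history.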
Single-binary durable execution with zero infrastructure. For CLI tools, edge functions, embedded systems.
Commercial offering for teams that need operational tooling on top of the open-source core.
See the Sayiir Server page for details.
The roadmap is shaped by real use cases. If you’re hitting one of these gaps, or have a use case we haven’t considered: