Serialization & Migration

A codec controls how task inputs and outputs are serialized when checkpointed. The codec is set per-backend instance, not per-workflow.

JsonCodec (the default): human-readable JSON via serde_json. Types must derive serde::Serialize and serde::Deserialize.

use sayiir_runtime::prelude::*;
// JsonCodec is the default — no explicit codec needed
let workflow = workflow! {
    name: "order_pipeline",
    registry: registry,
    steps: [validate, charge, ship]
}.unwrap();

RkyvCodec: zero-copy binary serialization via rkyv. Types must derive rkyv::Archive, rkyv::Serialize, and rkyv::Deserialize.

use sayiir_runtime::serialization::RkyvCodec;
let workflow = workflow! {
    name: "order_pipeline",
    codec: RkyvCodec,
    registry: registry,
    steps: [validate, charge, ship]
}.unwrap();
| Criteria | JsonCodec | RkyvCodec |
|---|---|---|
| Readability | Human-readable, easy to inspect | Binary, requires tooling |
| Performance | Slower, larger payloads | Fast, zero-copy deserialization |
| Ecosystem | Language-agnostic (serde) | Rust-only |
| Debugging | Easy — jq on snapshots | Harder — opaque bytes |
| Best for | Development, mixed-language stacks | High-throughput Rust workloads |

Rule of thumb: Start with JsonCodec (the default). Switch to RkyvCodec when serialization shows up in profiling.

Every type that flows through a task (inputs and outputs) must be serializable by the codec you choose.

use serde::{Serialize, Deserialize};
#[derive(Serialize, Deserialize)]
struct OrderInput {
    order_id: String,
    amount: f64,
    /// New field — old snapshots without it will use the default
    #[serde(default)]
    currency: String,
}

For Python, any type that json.dumps / json.loads can handle works (dicts, lists, strings, numbers, booleans, None). For Node.js, anything that survives JSON.stringify / JSON.parse.
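A quick round-trip through the standard json module confirms whether a Python payload is safe to flow through tasks; JSON-native values survive unchanged, while types like sets, datetimes, and custom classes fail fast:

```python
import json

# JSON-native values survive a dumps/loads round trip unchanged
payload = {"order_id": "o-42", "amount": 12.5, "tags": ["rush"], "gift": True, "note": None}
restored = json.loads(json.dumps(payload))
assert restored == payload

# Non-JSON types raise instead of silently corrupting a snapshot
try:
    json.dumps({"seen": {1, 2, 3}})  # sets are not JSON-serializable
except TypeError:
    print("convert sets to lists before passing them through a task")
```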

Definition Hash — What Breaks Resumability


When a workflow is built, Sayiir computes a SHA-256 definition hash from the workflow’s structural shape. This hash is stored with every snapshot. On resume, the engine compares the current hash to the snapshot’s hash — if they differ, the workflow cannot resume and a DefinitionMismatch error is raised.

The hash covers the structure of the workflow:

  • Task IDs and their sequential order
  • Timeout values
  • Retry policies (max retries, initial delay, backoff multiplier)
  • Task version strings (from metadata)
  • Fork IDs and branch structure
  • Branch IDs, keys, and default handlers
  • Loop IDs, max iterations, and on_max policy
  • Delay durations and IDs
  • Signal names, IDs, and timeouts

It does not cover:

  • Task implementations (the function body)
  • The codec used
  • The backend used
  • Task input/output types
  • Other task metadata (tags, descriptions, display names)
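To make the rule concrete, here is a minimal sketch of hashing only structural fields. This is an illustration, not Sayiir's actual hashing code, and the field names are invented:

```python
import hashlib
import json

def definition_hash(structure: dict) -> str:
    # Canonical JSON (sorted keys) so the hash is stable across runs
    canonical = json.dumps(structure, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

v1 = {"tasks": [{"id": "charge", "timeout_secs": 30, "retries": 3}]}
v2 = {"tasks": [{"id": "charge", "timeout_secs": 60, "retries": 3}]}  # timeout changed

assert definition_hash(v1) != definition_hash(v2)   # structural change → new hash
# Key order doesn't matter — only structural content does
assert definition_hash(v1) == definition_hash({"tasks": [{"retries": 3, "timeout_secs": 30, "id": "charge"}]})
```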
| Change | Hash Impact | Safe for In-Flight? |
|---|---|---|
| Change task logic (same signature) | No change | Yes |
| Swap backend (InMemory → Postgres) | No change | Yes |
| Switch codec (Json → Rkyv) | No change | Yes* |
| Update task tags or descriptions | No change | Yes |
| Add a new task to the pipeline | Changes | No |
| Remove a task | Changes | No |
| Reorder tasks | Changes | No |
| Change a timeout value | Changes | No |
| Change retry policy | Changes | No |
| Add/remove a fork branch | Changes | No |
| Change loop max_iterations or on_max | Changes | No |
| Rename a task ID | Changes | No |
| Change a signal name | Changes | No |
| Bump task version | Changes | No |

* Switching codecs is hash-safe but existing serialized bytes must still be decodable by the new codec. In practice, switching codecs on a live system with in-flight data requires a drain first.

Task Version — Opting In to Schema Change Detection


The definition hash does not cover task input/output types. If you change the shape of a task’s data types, the hash stays the same — and in-flight workflows may fail at runtime when they try to deserialize old cached results with the new type.

To make the engine detect these schema changes, set the version field in your task metadata. The version string is included in the definition hash, so bumping it forces a new workflow version and prevents in-flight workflows from resuming with the old schema.

from sayiir import task, TaskMetadata

@task(metadata=TaskMetadata(version="2.0"))
def process_order(order: dict) -> dict:
    return {"status": "processed", "total": order["total"]}

When you change a task’s input or output type in a breaking way, bump the version string. The engine will reject resuming in-flight workflows that were started with the old version, and you can drain them before deploying.
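The effect of a version bump can be sketched with the same kind of structural hash: because the version string is part of the hashed structure, changing it alone produces a new definition hash. Illustrative only — definition_hash here is a stand-in, not Sayiir's implementation:

```python
import hashlib
import json

def definition_hash(structure: dict) -> str:
    # Stand-in for the engine's structural hash
    return hashlib.sha256(json.dumps(structure, sort_keys=True).encode()).hexdigest()

before = {"task": "process_order", "version": "1.0"}
after = {"task": "process_order", "version": "2.0"}  # only the version string changed

# Old snapshots carry the old hash, so resuming them is rejected
assert definition_hash(before) != definition_hash(after)
```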

When a hash mismatch occurs:

Workflow definition mismatch: expected hash 'a1b2c3...', found 'd4e5f6...'

In Rust this is BuildError::DefinitionMismatch (at build time) or WorkflowError::DefinitionMismatch (at runtime). Python and Node.js raise equivalent exceptions.

Even when the workflow structure doesn’t change (same definition hash), the data types flowing through tasks can evolve. Here’s how to do it safely with JsonCodec / serde.

Use #[serde(default)] so existing snapshots without the new field deserialize correctly:

#[derive(Serialize, Deserialize)]
struct OrderInput {
    order_id: String,
    amount: f64,
    #[serde(default)]
    priority: Option<String>, // new field — old data deserializes as None
}

In Python, this happens naturally with dict.get("priority").
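A minimal sketch of that Python side: a snapshot captured before the field existed simply yields the default:

```python
# Snapshot captured before "priority" was added to the schema
old_snapshot = {"order_id": "o-7", "amount": 19.99}

# .get supplies the default, so old data keeps deserializing cleanly
priority = old_snapshot.get("priority")           # None for old data
level = old_snapshot.get("priority", "normal")    # or an explicit fallback

assert priority is None
assert level == "normal"
```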

Don’t delete the field while in-flight workflows may still carry it. Instead, stop writing it:

#[derive(Serialize, Deserialize)]
struct OrderInput {
    order_id: String,
    amount: f64,
    #[serde(skip_serializing, default)]
    legacy_field: Option<String>, // still readable, no longer written
}

Once all in-flight workflows have drained, you can remove the field entirely.

When renaming a field, use #[serde(alias)] so old snapshots remain readable (#[serde(rename)] alone would break deserialization of old data):

#[derive(Serialize, Deserialize)]
struct OrderInput {
    #[serde(alias = "order_id")]
    id: String, // renamed from order_id — old snapshots still work
    amount: f64,
}
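The dynamic-language analog of #[serde(alias)] is to accept both key names while old data drains. A sketch, where parse_order is a hypothetical helper:

```python
def parse_order(payload: dict) -> dict:
    # Prefer the new key, fall back to the legacy one
    order_id = payload.get("id") or payload.get("order_id")
    return {"id": order_id, "amount": payload["amount"]}

# Old and new payloads both parse
assert parse_order({"order_id": "o-1", "amount": 5.0})["id"] == "o-1"
assert parse_order({"id": "o-2", "amount": 5.0})["id"] == "o-2"
```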
For enum types:

  • Adding a variant is safe — existing data won’t contain it.
  • Removing a variant breaks deserialization for in-flight workflows that stored it.
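One defensive pattern when a variant must eventually be removed: map unknown values to an explicit fallback instead of failing. A sketch — the status set here is hypothetical:

```python
KNOWN_STATUSES = {"pending", "processed", "shipped"}

def parse_status(raw: str) -> str:
    # Unknown (e.g. removed) variants degrade to a sentinel instead of crashing
    return raw if raw in KNOWN_STATUSES else "unknown"

assert parse_status("processed") == "processed"
assert parse_status("backordered") == "unknown"  # variant no longer in the enum
```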

Sayiir separates two concerns when deploying changes:

  1. Structural changes (adding/removing tasks, changing timeouts, reordering steps) — automatically gated by the definition hash. The engine rejects mismatches; you cannot accidentally resume an old workflow with a new structure.
  2. Schema changes (modifying the shape of task inputs or outputs) — your responsibility. The definition hash does not cover data types, so the engine won’t catch a schema mismatch until deserialization fails at runtime.

The recommended approach for both is version-pinning with draining: pin running workflow instances to their original definition, let them finish, and deploy the new version for new instances.

These changes can be deployed at any time, even with in-flight workflows:

  • Bug fixes in task logic — the function body isn’t hashed
  • Infrastructure swaps — changing backends, scaling workers
  • Metadata updates — tags, descriptions
  • Backward-compatible schema changes — see Schema Evolution above

Just deploy as usual. In-flight workflows resume normally.

When you change the workflow structure (add/remove tasks, change timeouts, reorder steps, etc.), the definition hash changes. In-flight workflows cannot resume with the new definition — the engine enforces this automatically.

Simplest approach: use a new instance ID. If your application controls instance IDs, the easiest migration path is to start new workflow instances with fresh IDs under the new definition. Old instances tied to the old definition will either drain naturally or can be cancelled. This avoids any coordination between old and new workers.

# Old definition — instances "order-100", "order-101" still running
# New definition — new instances get "order-102", "order-103", ...
# Old instances drain on their own; no conflict.

Drain-and-restart approach: If you need the same instance IDs to carry over:

  1. Stop starting new workflow instances with the old definition
  2. Wait for in-flight workflows to complete (drain)
  3. Deploy the new workflow definition
  4. Start new instances with the new definition
graph LR
    A[Stop new submissions] --> B[Drain in-flight]
    B --> C[Deploy new definition]
    C --> D[Resume traffic]

A schema change is breaking when old serialized data cannot be deserialized into the new type — for example, renaming a field without an alias, changing a field’s type, or removing a required field.

Because the definition hash does not cover data types, the engine won’t prevent you from deploying a breaking schema change. Instead, tasks will fail at runtime with a DeserializationError (Python), CODEC_ERROR (Node.js), or RuntimeError::Codec (Rust) when they try to decode a cached result that no longer matches the new type.

Recommended approach: drain before deploying breaking schema changes.

  1. Stop starting new instances
  2. Let in-flight workflows complete (they still use the old schema)
  3. Deploy the new code with the updated types
  4. Start new instances — only new data flows through the new schema

For gradual migrations, you can also run old and new workers side by side: old workers continue processing old instances while new workers handle new ones. In distributed mode, workers already skip tasks whose definition hash they don’t recognize, so this works naturally.

When running multiple workers, all workers must agree on the definition hash for workflows they process. During a rolling deployment:

  1. Workers with the old definition will skip tasks created with the new definition
  2. Workers with the new definition will skip tasks created with the old definition

This means you can:

  • Blue/green deploy — bring up new workers alongside old ones. Old workers drain old instances, new workers handle new instances. Shut down old workers once all old instances complete.
  • Drain, then deploy all workers at once — simplest if you can tolerate a brief pause in processing.
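The skipping behavior that makes blue/green deploys work can be sketched as a claim filter. Illustrative only — Sayiir's actual scheduler fields and APIs may differ:

```python
# Pending tasks tagged with the definition hash of the workflow that created them
pending = [
    {"task_id": "t1", "definition_hash": "old-hash"},
    {"task_id": "t2", "definition_hash": "new-hash"},
]

def claimable(tasks, worker_hash):
    # A worker only claims tasks whose definition hash matches its own build
    return [t for t in tasks if t["definition_hash"] == worker_hash]

# Old and new workers partition the queue without coordination
assert [t["task_id"] for t in claimable(pending, "old-hash")] == ["t1"]
assert [t["task_id"] for t in claimable(pending, "new-hash")] == ["t2"]
```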
| Scenario | Hash changes? | Automatic protection? | Recommended strategy |
|---|---|---|---|
| Bug fix in task logic | No | N/A (safe) | Deploy normally |
| Add #[serde(default)] field | No | N/A (safe) | Deploy normally |
| Add/remove/reorder a task | Yes | Engine rejects mismatch | New instance IDs, or drain first |
| Change timeout or retry policy | Yes | Engine rejects mismatch | New instance IDs, or drain first |
| Rename a field without alias | No | No — fails at runtime | Drain first |
| Change a field’s type | No | No — fails at runtime | Drain first |