Core Concepts

Papayya provides two execution paths that share the same observability and cost-control features.

Two execution paths

Local (you run, Papayya tracks)

You run your agent on your own machine or infrastructure. Papayya provides checkpointing, cost tracking, and a dashboard.

Your Process
  → papayya().run("my-agent")
    → run.task("search", fn)  ← checkpointed to Postgres
    → run.task("summarize", fn)  ← checkpointed
    → run.complete(result)
  → Dashboard shows checkpoints, cost, timing

Best for: developers with existing agents, existing infrastructure, or agents that need access to local resources.
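
To make the checkpointing idea in the local flow concrete, here is a minimal, self-contained sketch. It is not the Papayya SDK: `ToyRun`, its `task` method, and the in-memory `store` dictionary (standing in for the Postgres checkpoint table) are hypothetical names chosen for illustration. The sketch assumes that checkpoints are keyed by run and task name and that a task with a saved checkpoint is not executed again.

```python
import json
from typing import Any, Callable


class ToyRun:
    """Illustrative stand-in for a run handle; NOT the Papayya SDK."""

    def __init__(self, name: str, store: dict[str, str]) -> None:
        self.name = name
        self.store = store  # hypothetical: stands in for the Postgres checkpoint table

    def task(self, task_name: str, fn: Callable[[], Any]) -> Any:
        key = f"{self.name}:{task_name}"
        if key in self.store:
            # A checkpoint exists: reuse the saved result instead of re-running.
            return json.loads(self.store[key])
        result = fn()
        # Persist the result once the task has completed.
        self.store[key] = json.dumps(result)
        return result


calls: list[str] = []


def search() -> list[str]:
    calls.append("search")  # record each real execution
    return ["doc-1", "doc-2"]


store: dict[str, str] = {}

first = ToyRun("my-agent", store)
hits = first.task("search", search)

# Simulate a crash and restart: a new handle over the same store.
second = ToyRun("my-agent", store)
hits_again = second.task("search", search)

print(hits_again, calls)
```

After the simulated restart, the second handle returns the saved result and `search` has run only once.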

Cloud (Papayya runs your code)

You deploy agent code. Papayya runs it in isolated containers with full lifecycle management.

Your Code → deploy → S3 → Docker build → Registry
                                             ↓
Trigger run → Redis queue → Worker → Container launched
                                             ↓
                             Papayya loads agent, intercepts LLM calls
                             Reports steps + usage to Control Plane
                             Heartbeats every 15s
                                             ↓
                             Steps + status → Postgres → Dashboard

Best for: scheduled runs, webhook-triggered agents, long-running tasks that shouldn't depend on your process staying alive.

The execution model

An Agent is a definition: a model, instructions, tools, and constraints (max steps, budget).
A Run is a single execution of an agent with a given input. Runs have a lifecycle: queued → running → completed/failed/cancelled.
A Step is an atomic unit of work — an LLM call, a tool execution, or a checkpoint. Steps are numbered sequentially.
A Tool Call is a specific invocation of a tool within a step.
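
One way to picture how these four concepts relate is as a small data model. The sketch below is illustrative only and does not reflect Papayya's actual schema: all class and field names are hypothetical, step numbering is assumed to start at 1, and the transition table assumes a queued run can be cancelled before it starts.

```python
from dataclasses import dataclass, field
from enum import Enum


class RunStatus(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Allowed transitions, read off the lifecycle: queued -> running -> terminal state.
# Assumption: a queued run may also be cancelled before it starts.
TRANSITIONS = {
    RunStatus.QUEUED: {RunStatus.RUNNING, RunStatus.CANCELLED},
    RunStatus.RUNNING: {RunStatus.COMPLETED, RunStatus.FAILED, RunStatus.CANCELLED},
}


@dataclass
class ToolCall:
    tool: str
    arguments: dict


@dataclass
class Step:
    number: int
    kind: str  # "llm_call", "tool_execution", or "checkpoint"
    tool_calls: list[ToolCall] = field(default_factory=list)


@dataclass
class Agent:
    model: str
    instructions: str
    tools: list[str]
    max_steps: int
    budget: float  # unit is an assumption; the source does not specify one


@dataclass
class Run:
    agent: Agent
    input: str
    status: RunStatus = RunStatus.QUEUED
    steps: list[Step] = field(default_factory=list)

    def transition(self, new: RunStatus) -> None:
        # Terminal states have no entry in TRANSITIONS, so nothing is allowed from them.
        if new not in TRANSITIONS.get(self.status, set()):
            raise ValueError(f"illegal transition {self.status.value} -> {new.value}")
        self.status = new

    def add_step(self, kind: str) -> Step:
        # Steps are numbered sequentially within the run.
        step = Step(number=len(self.steps) + 1, kind=kind)
        self.steps.append(step)
        return step


agent = Agent("some-model", "Summarize the input.", ["search"], max_steps=10, budget=1.0)
run = Run(agent, "hello")
run.transition(RunStatus.RUNNING)
step = run.add_step("tool_execution")
step.tool_calls.append(ToolCall("search", {"q": "hello"}))
run.transition(RunStatus.COMPLETED)
```

Keeping the allowed transitions in one table makes it easy to reject invalid moves, such as restarting a run that has already completed.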

Key principles

Both paths share these guarantees:

Principle	What it means
Step-based execution	Every run is a sequence of discrete steps, not a black box
Checkpoint after every step	State is persisted to Postgres after each step completes
Resume, don't restart	If anything crashes, the run picks up from the last checkpoint
Budget is a hard cap	A run stops as soon as its budget is reached, rather than after it has been exceeded
Full trace	Every step, tool call, and token count is recorded and queryable
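
The hard-cap principle can be sketched as a check that happens before a step runs rather than after it. This is a minimal illustration under stated assumptions, not Papayya's enforcement logic: `Budget`, `reserve`, and `run_steps` are hypothetical names, costs are tracked in integer cents, and each step's cost is assumed to be known (or estimated) up front.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when the next step would push spend past the cap."""


@dataclass
class Budget:
    limit_cents: int
    spent_cents: int = 0

    def reserve(self, estimated_cents: int) -> None:
        # Check before the step runs, so spend never overshoots the limit.
        if self.spent_cents + estimated_cents > self.limit_cents:
            raise BudgetExceeded(
                f"spent {self.spent_cents} of {self.limit_cents} cents"
            )
        self.spent_cents += estimated_cents


def run_steps(step_costs_cents: list[int], budget: Budget) -> int:
    """Run steps in order; return how many completed before the cap stopped the run."""
    completed = 0
    for cost in step_costs_cents:
        try:
            budget.reserve(cost)
        except BudgetExceeded:
            break  # stop the run instead of overspending
        completed += 1
    return completed


budget = Budget(limit_cents=100)
done = run_steps([40, 40, 40], budget)
print(done, budget.spent_cents)
```

With a 100-cent limit and three 40-cent steps, the third step is refused before it executes, so recorded spend stays at or below the limit.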

Read on for details on each concept:

Runs & Steps — the execution lifecycle
Triggers — three ways to start a run: API, schedules, and webhooks
Durability — checkpointing and crash recovery
Budget Enforcement — how cost limits work
Replay — re-running from any step
Schedules & Webhooks — CLI and API reference for automated triggers
