
# Pipeline

`POST https://api.relaystation.ai/v1/pipeline` runs a **sequence of transforms in one call**. Send a file and a list of steps; the pipeline feeds each step's output into the next and returns the final result — no intermediate round-trips, no orchestration on your side. This is what turns the cputools catalog from a list of operations into a composable transform runtime.

## How it works

```json
POST /v1/pipeline
{ "file": {"inline":"<b64>"},
  "steps": [
    { "op": "data.filter",  "args": { "where": {"column":"status","op":"eq","value":"active"} } },
    { "op": "data.groupby", "args": { "by": ["country"], "aggregate": [{"fn":"count","as":"n"}] } },
    { "op": "csv.convert",  "args": { "to": "json" } }
  ],
  "to": "json" }
```

Each step is `{ "op": "<category>.<op>", "args": { ... } }`: the op's normal arguments, minus `file` (the data is threaded for you). Steps run **in order**, from the first array element to the last; the top-level `to` sets the output format.
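As a rough illustration of the request shape above, the following Python sketch assembles a pipeline body from raw file bytes and a list of steps. The helper name `build_pipeline_body` is hypothetical, and it assumes the `inline` field carries standard base64 (the example only shows `<b64>`); authentication and payment headers are not covered here.

```python
import base64
import json


def build_pipeline_body(file_bytes: bytes, steps: list, to: str) -> str:
    """Serialize a pipeline request body (hypothetical helper, not an official client)."""
    for step in steps:
        # Steps carry the op's normal arguments minus `file`: the data is threaded.
        if "file" in step.get("args", {}):
            raise ValueError(f"step {step.get('op')!r} must not pass its own file")
    body = {
        # Assumption: `inline` is standard base64 of the raw file bytes.
        "file": {"inline": base64.b64encode(file_bytes).decode("ascii")},
        "steps": steps,
        "to": to,
    }
    return json.dumps(body)


if __name__ == "__main__":
    steps = [
        {"op": "data.filter",
         "args": {"where": {"column": "status", "op": "eq", "value": "active"}}},
        {"op": "csv.convert", "args": {"to": "json"}},
    ]
    print(build_pipeline_body(b"status,country\nactive,DE\n", steps, "json"))
```

The resulting string would be sent as the body of `POST https://api.relaystation.ai/v1/pipeline`.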

## The composable op set

The pipeline composes the in-process single-stream ops — tabular **and** binary:

- **csv** — `csv.convert`, `csv.dedupe`, `csv.select`
- **data** — `data.filter`, `data.sort`, `data.groupby`, `data.cast`, `data.pivot`, `data.derive`, `data.fillna`, `data.dropna`, `data.rename`, `data.slice`, `data.explode`, plus the **Excel bridges** `data.from-xlsx` / `data.to-xlsx` (and `data.profile`, `data.validate`, `data.schema-infer` as final, report-producing steps)
- **image** — `image.resize`, `image.convert`, `image.compress`, `image.rotate` (chain image transforms in one call)
- **utils** — `utils.base64`
- **archive** — `archive.gzip`

Steps thread bytes, so a chain stays within a data type — a tabular chain, an image chain, or an **Excel-in/Excel-out** chain (`from-xlsx → … → to-xlsx`). Multi-input ops (`data.join`, `data.union`, `data.diff`), the SQL engine (`data.sql`), and the worker-backed PDF ops aren't composable in a pipeline — use them as standalone calls.
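A client can catch the two structural mistakes before sending anything. The sketch below hard-codes the op names listed above and returns the error code documented under Errors; the function name `check_steps` and the order in which the two checks run are assumptions, and the server remains the source of truth for what is composable.

```python
from typing import Optional

# Op names copied from the list above; a live deployment may differ.
COMPOSABLE_OPS = {
    "csv.convert", "csv.dedupe", "csv.select",
    "data.filter", "data.sort", "data.groupby", "data.cast", "data.pivot",
    "data.derive", "data.fillna", "data.dropna", "data.rename", "data.slice",
    "data.explode", "data.from-xlsx", "data.to-xlsx",
    "data.profile", "data.validate", "data.schema-infer",
    "image.resize", "image.convert", "image.compress", "image.rotate",
    "utils.base64", "archive.gzip",
}

# Report-producing ops are only valid as the final step.
REPORT_OPS = {"data.profile", "data.validate", "data.schema-infer"}


def check_steps(steps: list) -> Optional[str]:
    """Return a documented error code for a bad chain, or None if it looks fine."""
    last = len(steps) - 1
    for index, step in enumerate(steps):
        op = step["op"]
        if op not in COMPOSABLE_OPS:
            return "OP_NOT_COMPOSABLE"
        if op in REPORT_OPS and index != last:
            return "TERMINAL_STEP_NOT_LAST"
    return None
```

This check does not verify that a chain stays within one data type (for example, it would not reject an image op following a tabular op).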

## Billing

One charge: **the sum of the steps**. Each step is priced at its normal per-op rate (read live from the same `cputools.price.*` tunables the standalone ops use), plus an orchestration fee (`cputools.price.pipeline.run.flat_micros`, default $0). A three-step pipeline on a 5 MiB file is `Σ (step rate × 5 MiB) + fee`. An unauthenticated `POST` returns the exact summed price as a `402` challenge, so you can get a quote before you pay.
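The formula can be worked through with a small sketch. The per-op rates below are made-up numbers for illustration only, expressed in micros per MiB; the function name `estimate_price_micros` is hypothetical, and it assumes every step is billed on the input file's size, as the formula suggests. Rounding and minimum charges are not specified here, so treat the `402` challenge as the authoritative quote.

```python
MIB = 1024 * 1024


def estimate_price_micros(size_bytes: int, step_rates_micros_per_mib: list,
                          flat_fee_micros: int = 0) -> float:
    """Estimate sum(step rate x size in MiB) + flat orchestration fee."""
    size_mib = size_bytes / MIB
    return sum(rate * size_mib for rate in step_rates_micros_per_mib) + flat_fee_micros


if __name__ == "__main__":
    # Hypothetical rates for a three-step chain; real values come from cputools.price.*.
    rates = [100, 250, 50]
    print(estimate_price_micros(5 * MIB, rates))  # (100 + 250 + 50) * 5 = 2000.0
```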

## Atomic

A pipeline is all-or-nothing: if any step fails (an unknown column, a malformed intermediate, an over-cap result), the whole call fails and the charge is **refunded**. You never pay for a partial run.

## Caps

Operator-tunable: `cputools.pipeline.max_steps` (default 10) bounds the chain length; `cputools.pipeline.max_intermediate_bytes` (default 128 MiB) bounds the data threaded between steps. A step whose output crosses it returns `413`, and the charge is refunded.

## Errors

- `402 PAYMENT_REQUIRED` — no valid payment.
- `422 OP_NOT_COMPOSABLE` — a step names an op that isn't in the composable set.
- `422 TERMINAL_STEP_NOT_LAST` — a report op (`data.profile`, `data.validate`, or `data.schema-infer`) appears before the last step.
- `422` (step-level) — a step's own validation error (e.g. `UNKNOWN_COLUMN`) surfaces from that step; the charge is refunded.
- `413 INTERMEDIATE_TOO_LARGE` — a step's output exceeds the intermediate cap.
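
One way a client might act on these errors is sketched below. The function name `next_action` and the action labels are hypothetical, and how the error code is carried in the response body is not specified here, so the function simply takes the status and code as arguments.

```python
from typing import Optional

# Codes that mean the step list itself is malformed, not a step's arguments.
STRUCTURAL_CODES = {"OP_NOT_COMPOSABLE", "TERMINAL_STEP_NOT_LAST"}


def next_action(status: int, code: Optional[str] = None) -> str:
    """Map a pipeline error to a suggested client-side action (illustrative labels)."""
    if status == 402:
        return "pay"            # use the quoted price from the challenge, then retry
    if status == 413:
        return "shrink"         # a step's output exceeded the intermediate cap
    if status == 422:
        if code in STRUCTURAL_CODES:
            return "fix_steps"  # reorder or replace ops in the chain
        return "fix_args"       # step-level validation error, e.g. UNKNOWN_COLUMN
    return "unknown"
```

Because a failed pipeline is refunded, none of these branches needs to account for a partial charge.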

## Next

[Data tools](/docs/data-tools) · [Data — SQL](/docs/data-sql) · [CSV tools](/docs/csv-tools) · [Pricing](/pricing) · [API reference](/api-reference)
