
# Data — SQL

`POST https://api.relaystation.ai/v1/data/sql` runs **full SQL over your uploaded file** — a real [DuckDB](https://duckdb.org/) engine, not a partial dialect. Upload a CSV/JSON, send a `SELECT`, get the result back. The power of a database with the simplicity of one stateless call.

## Sandboxed by design

The engine is the selling point — and so is the cage around it. Your SQL runs against your file and **nothing else**: the engine is locked **read-only with external access disabled** before your query runs, so it cannot read other files, reach the network (no `httpfs` / SSRF), write files, install or load extensions, or attach other databases. Your query sees exactly one table — `input`, loaded from your upload — and the full DuckDB SQL surface over it (joins, window functions, CTEs, aggregates). Powerful *and* safe.

## Inputs and outputs

Body: `{ file, sql, from?, to? }`. The `file` is an **input source** — `{ "inline": "<base64>" }` (≤ 4 MB) or `{ "inputKey": "..." }` (≤ 50 MB, from `POST /v1/cputools/upload-url`). `from` is the input format (`csv` default, `json`, `ndjson`, `parquet`); `to` is the result format (`csv` default, `json`, `parquet`). Your file is loaded as the table **`input`** — write `... FROM input`. The result comes back in the uniform output envelope.

### Parquet, in and out

DuckDB reads [**Parquet**](https://parquet.apache.org/) directly — pass `from: "parquet"` to query a columnar file with the same `SELECT ... FROM input`. And `to: "parquet"` emits a **binary Parquet** result: `csv`/`json` results return as inline text, but Parquet is binary, so it rides the storage-ref output envelope (`{ output: { outputKey, outputUrl, … } }`) — see [receiving outputs](/docs/receiving-outputs) and [persistence tiers](/docs/persistence-tiers). The Parquet file is produced by a **safe two-step** that never touches the read-only user-SQL sandbox, so the sandbox guarantees above hold unchanged.

A handy shape: `{ from: "csv", to: "parquet", sql: "SELECT * FROM input" }` is a one-call CSV→Parquet conversion (and `from: "parquet", to: "csv"` the reverse). Gzipped CSV inputs are handled transparently — a gzip-compressed CSV upload is decompressed before the query, no extra flag needed.

## Billing

Per-MB of input, **minimum 1 MiB**, at `cputools.price.data.sql.per_mb_micros` (launch default **$0.0005 / MB** — a higher tier than the in-process ops, since the query runs on the vendored DuckDB worker). The live `402` challenge is authoritative.

## Query

```json
POST /v1/data/sql
{ "file": {"inline":"<b64 csv>"},
  "sql": "SELECT country, count(*) AS n, sum(spend) AS total FROM input GROUP BY country ORDER BY total DESC",
  "to": "json" }
```

## Sample

```bash
curl -X POST https://api.relaystation.ai/v1/data/sql \
  -H 'X-Payment: <base64 EIP-3009 auth>' \
  -H 'Idempotency-Key: top-countries-20260609' \
  -H 'Content-Type: application/json' \
  -d '{"file":{"inline":"Y291bnRyeSxzcGVuZApVUywxMApVUyw1CkRFLDcK"},"sql":"SELECT country, sum(spend) AS total FROM input GROUP BY country ORDER BY total DESC","to":"json"}'
```

## Caps

Operator-tunable, enforced per call: a memory limit (`cputools.data.sql.memory_limit_mb`), a wall-clock timeout (`cputools.data.sql.timeout_ms`, default ~25 s — a query that runs long is killed), an output-size cap (`cputools.data.sql.max_output_bytes`, 128 MiB — a larger result returns `413`), a thread cap, and an SQL-length cap. These bound a runaway or oversized query; the engine sandbox (above) bounds what the query can reach.

## Errors

- `402 PAYMENT_REQUIRED` — no valid payment.
- `422 SQL_INVALID` — a SQL syntax/semantic error (the DuckDB message is surfaced; your data is never echoed). Includes any attempt to use a blocked capability (file/network/extension) — the sandbox denies it.
- `413 OUTPUT_TOO_LARGE` — the result exceeds the output cap (incl. a large `to: "parquet"` result).
- A query that exceeds the time or memory limit is terminated and the charge refunded.

## Next

[Data tools](/docs/data-tools) · [Pipeline](/docs/pipeline) · [Pricing](/pricing) · [API reference](/api-reference)
