# PDF tools Twenty-two PDF operations, one endpoint, one uniform request/response shape. Every op is a `POST https://api.relaystation.ai/v1/pdf/`. The pure-PDF ops (merge … metadata, images, diff, bookmarks, attachments) run in-process on pdf-lib / pdf.js; the heavier ops run on dedicated engines — encrypt, decrypt, compress, repair, render, OCR, ocr-searchable, and verify-signatures on vendored **qpdf**, **poppler** (including `pdfsig`), and **Tesseract**; **from-html (HTML→PDF) on a vendored headless Chromium**; and **from-office (Office→PDF) on LibreOffice (Gotenberg)** — same request shape, same per-call billing. `from-html` **generates** a PDF from scratch (an invoice, a report, a certificate); `from-office` **converts** an existing office document (docx, xlsx, pptx, …) into one. ## Inputs and outputs — the uniform shape Every op takes its file (or files) as an **input source** and returns a uniform **output envelope**. You never stream a raw file; the body is always JSON. **Input source** — one of two shapes per file field: ```json { "inline": "" } // for files ≤ 4 MB { "inputKey": "" } // for larger files ``` For files above the 4 MB inline ceiling, mint a one-time presigned upload first: ```bash curl -X POST https://api.relaystation.ai/v1/cputools/upload-url \ -H 'Authorization: Bearer rs_live_' \ -H 'Content-Type: application/json' \ -d '{"ext":"pdf","contentType":"application/pdf"}' # → { "url": "...", "fields": { ... }, "inputKey": "...", "expiresAt": "...", "maxBytes": 52428800 } ``` Then upload with a **multipart/form-data `POST` to `url`** — include every entry from `fields` as a form field, then a `file` field carrying the bytes (the standard S3 presigned-POST form). Finally pass `{"inputKey":"..."}` as the file to the op route. The presigned POST is customer-scoped, short-lived, and size-capped at `maxBytes` (S3 rejects an oversized body at upload time). An x402 lodestone wallet can mint one too — minting is free; only the op that consumes the bytes bills. **Output envelope** — always JSON, under an `output` key: ```json { "output": { "inline": "", // present when the result ≤ 4 MB "outputKey": "...", // present when the result > 4 MB "outputUrl": "https://...", // presigned GET for the scratch key "sizeBytes": 12345, "contentType": "application/pdf", "filename": "merged.pdf" } } ``` The inline-vs-reference threshold is operator-tunable (`cputools.io.max_inline_bytes`, default 4 MB). Results at or under it come back inline; larger results are written to a scratch key and returned as a presigned `outputUrl`. The full inline / scratch / baton model is the [persistence tiers](/docs/persistence-tiers) page; recipes for downloading or chaining outputs are in [Receiving outputs](/docs/receiving-outputs). ## Billing Per-page ops bill the page count × the per-page rate, **minimum one page**. `metadata` is a flat per-call charge. The per-op rates are operator-tunable (`cputools.price.pdf..per_page_micros`); the launch defaults: | Op | Billed on | Rate | |---|---|---| | `merge` | output pages | $0.001 / page | | `split` | source pages | $0.001 / page | | `rotate` | result pages | $0.001 / page | | `pages` | result pages | $0.001 / page | | `watermark` | pages stamped | $0.001 / page | | `form` | pages | $0.001 / page | | `extract-text` | source pages | $0.002 / page | | `metadata` | per call | $0.001 flat | | `encrypt` | pages | $0.001 / page | | `decrypt` | pages | $0.001 / page | | `compress` | pages | $0.001 / page | | `repair` | pages | $0.001 / page | | `render` | output pages | $0.001 / page | | `ocr` | source pages | $0.003 / page | | `ocr-searchable` | source pages | $0.004 / page | | `verify-signatures` | per call | $0.001 flat | | `from-html` | per call | $0.003 flat | | `from-office` | per call | $0.005 flat | | `images` | pages scanned | $0.001 / page | | `diff` | combined pages of both inputs | $0.001 / page | | `bookmarks` | per call | $0.001 flat | | `attachments` | per call | $0.001 flat | These are launch defaults; the live `402` challenge is authoritative (an unauthenticated `POST` returns the exact price). Every billable call needs an `Idempotency-Key` header — omit it and the chassis generates one per call. ## merge Combine 2–50 PDFs into one, in the order given. ```json POST /v1/pdf/merge { "files": [ {"inline":""}, {"inputKey":"..."} ], "filename": "combined.pdf" } ``` ## split Split one PDF by 1-based ranges (`"1-3,5,8-10"`) into separate documents, or `burst: true` to explode every page into its own single-page PDF. Omit `ranges` (or pass `"all"`) for the whole document. ```json POST /v1/pdf/split { "file": {"inline":""}, "ranges": "1-3,7-9" } ``` Split produces multiple files, so it returns a **manifest** (not the single output envelope): each entry is a storage reference you can re-submit as an `inputKey` to another op. ```json { "files": [ {"index": 0, "outputKey": "...", "outputUrl": "https://...", "pages": 3, "sizeBytes": 12345} ] } ``` ## rotate Rotate pages by 90 / 180 / 270 degrees. `pages` is an optional 1-based range string; omit it to rotate every page. ```json POST /v1/pdf/rotate { "file": {"inline":""}, "degrees": "90", "pages": "1,3-5" } ``` ## pages Edit page structure. One of three actions: - `delete` — drop a range of pages: `{"action":"delete","file":{...},"pages":"2,4-6"}` - `reorder` — supply the new 1-based page order: `{"action":"reorder","file":{...},"order":[3,1,2]}` - `insert` — splice another PDF in at a 1-based position (`at: 1` inserts before the first page; `at: N+1` appends): `{"action":"insert","file":{...},"insert":{...},"at":2}` ## watermark Stamp text on every page (or a `pages` range). Tune `opacity` (0–1), `size` (pt, ≤ 400), `color` (`#RRGGBB`), and `position` (`center` | `top-left` | `top-right` | `bottom-left` | `bottom-right`). ```json POST /v1/pdf/watermark { "file": {"inline":""}, "text": "CONFIDENTIAL", "opacity": 0.2, "position": "center" } ``` ## form Fill or flatten an AcroForm. Pass `fields` as a map of field name → string or boolean (checkbox); set `flatten: true` to bake the values in so the form is no longer editable. ```json POST /v1/pdf/form { "file": {"inline":""}, "fields": {"name":"Ada Lovelace","subscribe":true}, "flatten": true } ``` ## extract-text Pull text + structure out of a PDF (engine: pdf.js / unpdf). Returns per-page text plus the concatenated whole. `pages` is an optional 1-based range string. ```json POST /v1/pdf/extract-text { "file": {"inline":""}, "pages": "1-5" } ``` Response (not an output envelope — this op returns parsed text directly): ```json { "totalPages": 12, "pages": [ {"page": 1, "text": "..."} ], "text": "..." } ``` ## metadata Read document metadata, or set any of `title`, `author`, `subject`, `keywords`, `creator`, `producer`. Read by omitting `set`; write by passing it. Flat per-call price. A read returns `{ "metadata": { ... } }`; a write returns the standard output envelope with the updated PDF. ```json POST /v1/pdf/metadata { "file": {"inline":""}, "set": {"title":"Q3 Report","author":"finance-bot"} } ``` ## encrypt Password-protect a PDF (qpdf, 256-bit AES). Supply `userPassword` (the open password), `ownerPassword` (permissions password), or both — at least one. Neither may begin with `-`. Supplying only an `ownerPassword` leaves the open password empty: anyone can open the document, but permissions stay owner-gated (the standard PDF model). Eight optional **permission booleans** — the canonical PDF permission set — restrict what an opened document allows: `print`, `printHighRes`, `copy` (text/graphics extraction), `modify`, `annotate`, `fillForms` (fill in form fields), `extract` (extract for accessibility), and `assemble` (insert/delete/rotate pages). An **absent** boolean means the qpdf default (allowed); pass `false` to restrict. Permissions are enforced by PDF viewers against the owner password. Two notes on the set: `print` and `printHighRes` fold into one print level — `print:false` blocks printing entirely, `print:true` with `printHighRes:false` allows only low-resolution printing, and `print:true` alone allows full-quality printing. And `copy` is the live text/graphics extraction control; `extract` is the separate *accessibility*-extraction bit, which PDF 2.0 (the 256-bit encryption used here) deprecated and always grants — so `extract:false` may be a no-op on modern viewers. Use `copy:false` to actually block extraction. ```json POST /v1/pdf/encrypt { "file": {"inline":""}, "userPassword": "hunter2", "ownerPassword": "admin-only", "print": true, "printHighRes": false, "copy": false, "modify": false, "annotate": false, "fillForms": true, "assemble": false } ``` Passwords ride only in the request body to the IAM-gated worker — they're never logged. ## decrypt Remove a password from a PDF (qpdf). The supplied `password` must open the document. ```json POST /v1/pdf/decrypt { "file": {"inline":""}, "password": "hunter2" } ``` ## compress Recompress a PDF's object and content streams (qpdf). Billed **per page, not by savings** — the compression ratio depends entirely on the document (an already-optimized PDF may not shrink). Set `linearize: true` for web "fast view" (byte-range streaming). ```json POST /v1/pdf/compress { "file": {"inline":""}, "linearize": true } ``` Looking for a "linearize" / web-optimize op? It's this flag — `linearize: true` on compress IS the linearization path (there is no separate `pdf/linearize` op). ## repair Repair a damaged PDF (qpdf). Two steps in one call: `qpdf --check` **diagnoses** the document (a findings report — broken xref, stream-length mismatches, structural warnings), then a full **rewrite** re-serializes it, reconstructing the cross-reference table and normalizing structure. Billed per page (compress's rate). Returns the repaired PDF **always as a storage ref** plus the diagnosis: ```json POST /v1/pdf/repair { "file": {"inline":""} } ``` ```json { "output": { "outputKey": "...", "outputUrl": "https://...", "sizeBytes": 51234, "contentType": "application/pdf", "filename": "repaired.pdf" }, "repair": { "exitCode": 2, "findings": ["xref not found", "Attempting to reconstruct cross-reference table", "..."] } } ``` `repair.exitCode` is qpdf's `--check` verdict: `0` clean, `2` errors found, `3` warnings — all three still produce a rewritten output. A PDF too damaged to parse at all returns `422 PDF_PARSE_FAILED` (charge reversed). ## render Rasterize PDF pages to PNG or JPEG (poppler `pdftoppm`). `pages` is an optional 1-based range (default: all); `dpi` ≤ 300 (default 150); `format` is `png` or `jpeg`. Non-embedded standard-14 fonts (Helvetica, Times, …) render correctly via a vendored DejaVu Sans substitution. **Capped at 40 pages and 300 DPI** — over either returns `422` pre-charge. Billed per **output** page. ```json POST /v1/pdf/render { "file": {"inline":""}, "pages": "1-5", "dpi": 200, "format": "png" } ``` Render produces multiple images, so it returns a **manifest** (like split), one presigned-GET image ref per page: ```json { "images": [ {"index": 0, "page": 1, "outputKey": "...", "outputUrl": "https://..."} ] } ``` ## ocr Extract text from a scanned/image PDF (poppler `pdftoppm` → Tesseract LSTM). `pages` is an optional 1-based range (default: all). **Synchronous and capped at 5 pages** — the call must clear a ~30-second window, so larger asynchronous OCR jobs are coming. Over the cap returns `422 OCR_TOO_MANY_PAGES_SYNC` (which advertises the cap) before any charge. Billed per **source** page. `lang` selects the OCR language from the **deployed 10-language roster**: `eng` (default), `spa`, `fra`, `deu`, `ita`, `por`, `nld`, `pol`, `rus`, `chi_sim`. The live roster is the operator-tunable `cputools.ocr.langs`; an off-roster `lang` returns a **free** `422 UNSUPPORTED_LANG` that names the roster. The same `lang` parameter (and roster) applies to `ocr-searchable` and [`image/ocr`](/docs/image-tools). ```json POST /v1/pdf/ocr { "file": {"inline":""}, "pages": "1-3", "lang": "spa" } ``` Response (parsed text, not an output envelope): ```json { "pages": [ {"page": 1, "text": "..."} ], "text": "..." } ``` ## ocr-searchable Turn a scanned PDF into a **searchable PDF**: per page, the worker rasterizes (`pdftoppm`), Tesseract emits a page PDF carrying the original page image plus an **invisible text layer**, and qpdf merges the pages. The result looks identical but is selectable, copyable, and indexable. Same `pages` / `lang` parameters and the same 5-page synchronous cap as `ocr`; billed per **source** page at a premium over plain OCR (it runs OCR *plus* PDF assembly). ```json POST /v1/pdf/ocr-searchable { "file": {"inline":""}, "pages": "1-3", "lang": "eng" } ``` The output is **always a storage ref** — `{ output: { outputKey, outputUrl, sizeBytes, contentType, filename } }`, never inline (the render convention; OCR'd page images are large). Fetch the PDF from the presigned `outputUrl`, or chain `outputKey` straight into another op. ## from-html **Generate a PDF from HTML** — render caller-supplied HTML/CSS to a PDF on a headless Chromium worker. This is how you produce invoices, reports, contracts, and certificates from a template + data: build the HTML, get back a print-ready PDF. Flat per-call price. ```json POST /v1/pdf/from-html { "html": "

Invoice #1042

…", "options": { "format": "A4", "margin": {"top":"2cm","bottom":"2cm"}, "printBackground": true, "headerTemplate": "

Invoice #1042

", "footerTemplate": "

Page of

", "pageRanges": "1-3" } } ``` `options` (all optional): `format` (`A4` | `Letter` | `Legal` | `A3` | `A5` | `Tabloid` | `Ledger`), `landscape` (boolean), `margin` (`top` / `right` / `bottom` / `left`, each a size string in `px`, `cm`, `mm`, or `in`), `printBackground` (boolean), `scale` (0.1–2). Plus print furniture and output controls: `headerTemplate` / `footerTemplate` (HTML for the running header/footer — use the `date`, `title`, `url`, `pageNumber`, and `totalPages` classes to inject values; leave room with `margin`), `displayHeaderFooter` (boolean — auto-enabled when you supply a template, so you rarely set it yourself; set `false` to suppress), `pageRanges` (a subset to print, e.g. `"1-5, 8"`; default all pages), and `tagged` (emit a tagged/accessible PDF — **default `true`**). The HTML itself is capped at 5 MB (`cputools.render.max_html_bytes`, operator-tunable; over → free `422 HTML_TOO_LARGE`). Returns the standard output envelope (a PDF). **Sandboxed by design.** The renderer is locked **read-only with no network and no JavaScript**: every external resource request — ``, ``, `