# CSV Input
Read a CSV file and emit one row per data line. Configure the column delimiter, choose whether the first row is a header, and skip blank lines. CSV Input streams rows as it parses, so it handles large files without loading everything into memory.
## How it works

CSV Input is a source node — it produces rows from an uploaded file but takes no input handle. Parsing uses PapaParse line by line, which means rows are emitted as the file is read; downstream streaming transforms can begin work before the upload finishes.
The parser samples the first five data rows to infer column types. If every non-empty value in a column parses as a number, the column is coerced to number; otherwise values stay as strings. Columns whose first five rows are all empty fall back to string. Type detection happens once per column on the sample window — values past row five are coerced using the type chosen from the sample.
When **Has Header Row** is unchecked, the parser generates synthetic column names `col_0`, `col_1`, … and treats every line as data.
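The sampling, fallback, and coercion rules above can be sketched in TypeScript. The names here (`inferColumnType`, `syntheticHeaders`, `coerce`) are illustrative, not the project's actual implementation:

```typescript
// Illustrative sketch of the sample-based inference described above.
// Function names are hypothetical, not the real implementation.
const SAMPLE_SIZE = 5;

type ColumnType = "number" | "string";

// Decide a column's type from the non-empty values in its sample window.
function inferColumnType(samples: string[]): ColumnType {
  const window = samples.slice(0, SAMPLE_SIZE).filter((v) => v.trim() !== "");
  if (window.length === 0) return "string"; // all-empty sample falls back to string
  return window.every((v) => !Number.isNaN(Number(v))) ? "number" : "string";
}

// Synthetic column names used when Has Header Row is unchecked.
function syntheticHeaders(width: number): string[] {
  return Array.from({ length: width }, (_, i) => `col_${i}`);
}

// Every value, including those past the sample window, is coerced with
// the type chosen from the sample.
function coerce(value: string, type: ColumnType): string | number {
  return type === "number" ? Number(value) : value;
}
```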
**Input:** A `.csv` file uploaded by the user.

**Output:** A row stream where each row is an object keyed by header (or `col_N`).
## Options

| Option | Type | Default | Description |
|---|---|---|---|
| `delimiter` | string | `,` | Single character that separates columns. Common values: `,`, `;`, `\t`, `\|`. |
| `hasHeader` | boolean | `true` | Treat the first non-empty row as column names. When `false`, columns are named `col_0`, `col_1`, … |
| `skipEmptyLines` | boolean | `true` | Skip lines that are empty after trimming. When `false`, blank lines parse as zero-column rows. |
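As a quick reference, the option set above maps to a plain config object. The TypeScript shape below is derived from the table, not the node's actual schema:

```typescript
// The three options from the table above, as a TypeScript shape.
// This type is inferred from the docs table, not the real schema.
type CsvInputOptions = {
  delimiter: string;       // exactly one character, e.g. "," ";" "\t" "|"
  hasHeader: boolean;      // treat first non-empty row as column names
  skipEmptyLines: boolean; // drop lines that are empty after trimming
};

// Defaults as listed in the table.
const defaults: CsvInputOptions = {
  delimiter: ",",
  hasHeader: true,
  skipEmptyLines: true,
};

// Example: a tab-separated file with no header row.
const tsvConfig: CsvInputOptions = { ...defaults, delimiter: "\t", hasHeader: false };
```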
## Examples

### Standard comma-separated file with header

A typical sales export — comma delimiter, first row contains column names.

Before (raw file content):

```csv
order_id,customer,amount
1001,Acme Corp,2400
1002,Beta Inc,750
1003,Gamma LLC,1200
```

Configuration: `delimiter: ","`, `hasHeader: true`, `skipEmptyLines: true`.
After:
| order_id | customer | amount |
|---|---|---|
| 1001 | Acme Corp | 2400 |
| 1002 | Beta Inc | 750 |
| 1003 | Gamma LLC | 1200 |
`order_id` and `amount` are inferred as `number`; `customer` stays a string.
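Concretely, each data line above is emitted as an object keyed by the header row, with the sampled numeric columns already coerced. This is a simplified sketch with no quote handling, unlike the real PapaParse-backed parser:

```typescript
// Build the emitted row for the first data line, using the inferred
// column types ("number", "string", "number") from the sample window.
const headers = ["order_id", "customer", "amount"];
const types = ["number", "string", "number"];
const line = "1001,Acme Corp,2400";

const row = Object.fromEntries(
  line
    .split(",")
    .map((value, i) => [headers[i], types[i] === "number" ? Number(value) : value])
);
// row is { order_id: 1001, customer: "Acme Corp", amount: 2400 }
```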
### Semicolon delimiter, no header

A European-locale export with no header row. Synthetic column names are generated.

Before (raw file content):

```csv
2024-06-01;Alice;125.50
2024-06-02;Bob;88.00
2024-06-03;Carol;212.75
```

Configuration: `delimiter: ";"`, `hasHeader: false`.
After:
| col_0 | col_1 | col_2 |
|---|---|---|
| 2024-06-01 | Alice | 125.50 |
| 2024-06-02 | Bob | 88.00 |
| 2024-06-03 | Carol | 212.75 |
## Tips and Edge Cases

- Type inference is sample-based, not full-scan. Numeric columns are detected from the first five non-empty rows. If row 7 of a column inferred as `number` contains `"N/A"`, it will be coerced via `Number("N/A")` and become `NaN` rather than reverting to string. Pre-clean numeric columns or coerce types downstream if your file has sparse non-numeric rows past the first five. See `apps/web/src/transforms/input-csv/logic.ts:45-70`.
- `delimiter` must be exactly one character. The Zod schema rejects multi-character delimiters at config-validation time. To parse files with multi-character separators, replace the separator upstream or pre-process the file. See `apps/web/src/transforms/input-csv/logic.ts:24-29`.
- PapaParse runs per line, not over the whole file. Quoted values that span multiple lines are not preserved — each physical line is parsed independently. CSVs with embedded newlines inside quoted fields will produce extra rows. See `apps/web/src/transforms/input-csv/logic.ts:101-109`.
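The first tip's failure mode is easy to reproduce; the guard at the end is one possible downstream fix, not a built-in option:

```typescript
// Rows 1–5 of the column are numeric, so the column is sampled as numeric.
const sampleWindow = ["2400", "750", "1200", "90", "15"];
const allNumeric = sampleWindow.every((v) => !Number.isNaN(Number(v))); // true

// Row 7 holds "N/A" — past the sample window, so it is still coerced.
const coerced = Number("N/A"); // NaN; the column does not revert to string

// Possible downstream guard: map NaN back to null before aggregating.
const safe = Number.isNaN(coerced) ? null : coerced; // null
```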
## Related Transforms

- Excel Input — read `.xlsx` instead of CSV; the rest of the pipeline is identical.
- JSON Input — read structured JSON or NDJSON when the source is hierarchical.
- Change Type — fix up columns whose inferred type does not match the data.