Excel Input
Read an Excel workbook and emit one row per data row from a chosen sheet. Pick the sheet by name or zero-based index, use the first row as the header, and optionally drop additional leading rows. Excel Input loads the whole workbook into memory because the underlying SheetJS parser is not streaming.
How it works
Excel Input is a source node that reads a `.xlsx` (or `.xls`) workbook via SheetJS. Unlike CSV Input, it requires the entire file in memory before any rows are emitted, so memory use scales linearly with cell count for large workbooks. Once parsed, rows are yielded one at a time so downstream streaming transforms still see a row stream.
Sheet selection follows a fixed precedence: `sheetName`, then `sheetIndex`, then the first sheet. If `sheetName` is set, it must match exactly; a missing name throws with a list of available sheets. If `sheetIndex` is out of range, it throws with the workbook's sheet count. Native Excel types are preserved: numbers stay numbers, and dates come through as Excel serial numbers unless the cell is formatted as text.
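The serial numbers mentioned above are day counts, so converting one to a calendar date is simple arithmetic. The following is a minimal sketch, not part of Excel Input: `excelSerialToIsoDate` is a hypothetical helper, and it assumes the workbook uses the 1900 date system and that the serial is 61 or later (earlier values are affected by Excel's fictitious 1900-02-29).

```ts
// Hypothetical helper for illustration; not part of Excel Input.
// Assumption: 1900 date system, serial >= 61.
const EXCEL_SERIAL_OF_UNIX_EPOCH = 25569; // serial number of 1970-01-01
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function excelSerialToIsoDate(serial: number): string {
  // Shift the day count so that 0 is the Unix epoch, then convert days to ms.
  const ms = Math.round((serial - EXCEL_SERIAL_OF_UNIX_EPOCH) * MS_PER_DAY);
  // Keep only the YYYY-MM-DD part; any fractional (time-of-day) part is dropped.
  return new Date(ms).toISOString().slice(0, 10);
}
```

For example, `excelSerialToIsoDate(45728)` returns `"2025-03-12"`. Inside a pipeline, the Format Dates transform is the intended way to do this conversion.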
When `hasHeader` is true (the default), `sheet_to_json` uses the first row as keys. When false, columns are named `col_0`, `col_1`, and so on. `skipRows` drops that many rows from the top of the JSON output, which means it skips data rows after the header (or all leading rows when there is no header).
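The header and skip behavior described in this paragraph can be modeled on a plain grid of cells. This is a sketch under stated assumptions, not the node's implementation: `gridToRows`, `Cell`, and `Row` are hypothetical names, the grid stands in for a parsed sheet, and blank cells are filled with `""` to mirror the parser's `defval: ""` setting.

```ts
// Hypothetical model of the described behavior; not the actual implementation.
type Cell = string | number | boolean;
type Row = Record<string, Cell>;

function gridToRows(grid: Cell[][], hasHeader = true, skipRows = 0): Row[] {
  // With a header, the first grid row supplies the keys and is not emitted.
  const header: Cell[] = hasHeader ? (grid[0] ?? []) : [];
  const body = hasHeader ? grid.slice(1) : grid;

  const rows = body.map((cells) => {
    const width = hasHeader ? header.length : cells.length;
    const row: Row = {};
    for (let i = 0; i < width; i++) {
      const key = hasHeader ? String(header[i]) : `col_${i}`;
      row[key] = cells[i] ?? ""; // blank cells become "", as with defval: ""
    }
    return row;
  });

  // skipRows is applied to the keyed output, so it drops rows after the header.
  return rows.slice(skipRows);
}
```

In this model, `skipRows: 1` with a header removes the first data row and leaves the keys untouched, because the header row has already been consumed by the time rows are dropped.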
Input: An .xlsx or .xls file uploaded by the user.
Output: A row stream of objects keyed by header (or col_N).
Options
| Option | Type | Default | Description |
|---|---|---|---|
| `sheetName` | `string?` | `undefined` | Name of the sheet to import. Takes priority over `sheetIndex`. Leave unset to use the first sheet. |
| `sheetIndex` | `number?` | `undefined` | Zero-based sheet index. Used only when `sheetName` is unset. |
| `hasHeader` | `boolean` | `true` | Treat the first row as column names. When false, columns are named `col_0`, `col_1`, … |
| `skipRows` | `number` | `0` | Number of rows to drop from the top of the data (after the header row, if any). |
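The precedence between `sheetName` and `sheetIndex` can be sketched as a small resolver. This is an illustration only: `resolveSheet` and `SheetOptions` are hypothetical names, the error messages are invented for the example, and the real logic may differ in detail.

```ts
// Hypothetical resolver illustrating the documented precedence.
interface SheetOptions {
  sheetName?: string;
  sheetIndex?: number;
}

function resolveSheet(sheetNames: string[], opts: SheetOptions): string {
  if (sheetNames.length === 0) {
    throw new Error("Workbook contains no sheets");
  }
  // 1. sheetName wins when set, and must match exactly.
  if (opts.sheetName !== undefined) {
    if (sheetNames.indexOf(opts.sheetName) === -1) {
      throw new Error(
        `Sheet "${opts.sheetName}" not found. Available sheets: ${sheetNames.join(", ")}`
      );
    }
    return opts.sheetName;
  }
  // 2. Otherwise fall back to a zero-based index, checked against the sheet count.
  if (opts.sheetIndex !== undefined) {
    const i = opts.sheetIndex;
    if (!Number.isInteger(i) || i < 0 || i >= sheetNames.length) {
      throw new Error(
        `sheetIndex ${i} is out of range; workbook has ${sheetNames.length} sheet(s)`
      );
    }
    return sheetNames[i];
  }
  // 3. With neither option set, use the first sheet.
  return sheetNames[0];
}
```

With sheets `["Summary", "Q2_Sales"]`, passing `{ sheetName: "Q2_Sales", sheetIndex: 0 }` resolves to `"Q2_Sales"`, since the name takes priority over the index.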
Examples
First sheet with header
An invoice export: first sheet, first row is the header.
Before (sheet contents):
| invoice_id | customer | total | issued_at |
|---|---|---|---|
| INV-1001 | Acme Corp | 4250.00 | 2025-03-12 |
| INV-1002 | Beta Inc | 1875.50 | 2025-03-13 |
| INV-1003 | Gamma LLC | 990.00 | 2025-03-14 |
Configuration: defaults — sheet 0, hasHeader: true, skipRows: 0.
After:
| invoice_id | customer | total | issued_at |
|---|---|---|---|
| INV-1001 | Acme Corp | 4250 | 2025-03-12 |
| INV-1002 | Beta Inc | 1875.5 | 2025-03-13 |
| INV-1003 | Gamma LLC | 990 | 2025-03-14 |
Named sheet with leading metadata rows
A workbook with title and timestamp rows above the actual data on a sheet called Q2_Sales.
Before (sheet contents):
| A | B | C |
|---|---|---|
| Q2 Sales Report | | |
| Generated 2025-07-01 | | |
| region | sales | reps |
| EMEA | 412000 | 14 |
| AMER | 588000 | 22 |
| APAC | 195000 | 6 |
Configuration: sheetName: "Q2_Sales", hasHeader: true, skipRows: 2.
After:
| region | sales | reps |
|---|---|---|
| EMEA | 412000 | 14 |
| AMER | 588000 | 22 |
| APAC | 195000 | 6 |
skipRows: 2 drops the title and timestamp rows. The next row after that becomes the header row, since hasHeader is true.
Tips and Edge Cases
- `sheetName` and `sheetIndex` are mutually exclusive in the UI. Setting one clears the other. The runtime checks `sheetName` first, then `sheetIndex`, then defaults to sheet 0. A `sheetName` mismatch throws with the list of available sheets, which is useful for debugging mistyped names. See `apps/web/src/transforms/input-xlsx/logic.ts:77-101`.
- Dates come through as Excel serial numbers, not ISO strings. If a cell is formatted as a date, SheetJS returns the underlying serial (days since 1900-01-00) unless the cell is explicitly typed as text. Use Format Dates downstream with the `excel-serial` source format to convert. See `apps/web/src/transforms/input-xlsx/logic.ts:108-113`.
- Empty cells become `""`, not `null`. The parser uses `defval: ""`, so missing cells are empty strings rather than nullish. Tests that expect `null` for blanks will need to coerce. See `apps/web/src/transforms/input-xlsx/logic.ts:108-113`.
Related Transforms
- CSV Input: read `.csv` files when the source is plain text.
- JSON Input: read structured JSON or NDJSON when the source is hierarchical.
- Format Dates: convert Excel serial dates to ISO or other formats.