Group By
Group rows by one or more key columns and compute aggregates over the remaining columns. Each unique combination of key values produces one output row carrying the keys plus the requested aggregations. Use it to collapse line items into per-customer totals, count records per category, or compute min/max bounds per bucket.
How it works
Section titled “How it works”Group By is a full-dataset batch transform. It buffers all input rows, partitions them by the configured groupKeys (using a JSON-stringified key tuple so multi-column groups work), then emits one output row per group with the key columns and every named aggregation.
When groupKeys is empty but aggregations is configured, the entire input collapses to a single output row holding the global aggregates. When both lists are empty, rows pass through unchanged.
Output row order follows the order each unique key combination first appeared in the input. Aggregate columns appear after the group-key columns in the order they were declared.
Input: One tabular data connection. Output: One row per unique group-key combination, with key columns first and aggregations second.
Options
Section titled “Options”| Option | Type | Default | Description |
|---|---|---|---|
groupKeys | string[] | [] | Columns whose values together identify a group. Empty list collapses everything into a single global row. |
aggregations | Array<{ inputColumn, function, outputColumn }> | [] | Each entry computes one aggregate. function is one of "sum", "count", "avg", "min", "max". outputColumn names the result column (required). |
Examples
Section titled “Examples”Total sales per customer
Section titled “Total sales per customer”You have order line items and want per-customer totals plus order counts.
Before:
| order_id | customer | amount |
|---|---|---|
| 1001 | Acme Corp | 250 |
| 1002 | Beta Inc | 750 |
| 1003 | Acme Corp | 1100 |
| 1004 | Beta Inc | 500 |
| 1005 | Acme Corp | 300 |
Configuration: group keys: ["customer"], aggregations: sum(amount) → total, count(amount) → orders.
After:
| customer | total | orders |
|---|---|---|
| Acme Corp | 1650 | 3 |
| Beta Inc | 1250 | 2 |
Multi-column group with min/max/avg
Section titled “Multi-column group with min/max/avg”You want bounds and average price per category and region.
Before:
| product | category | region | price |
|---|---|---|---|
| Widget | tool | EU | 25 |
| Widget | tool | EU | 27 |
| Widget | tool | US | 30 |
| Gadget | toy | EU | 12 |
| Gadget | toy | EU | 14 |
| Gadget | toy | US | 18 |
Configuration: group keys: ["category", "region"], aggregations: min(price) → price_min, max(price) → price_max, avg(price) → price_avg.
After:
| category | region | price_min | price_max | price_avg |
|---|---|---|---|---|
| tool | EU | 25 | 27 | 26 |
| tool | US | 30 | 30 | 30 |
| toy | EU | 12 | 14 | 13 |
| toy | US | 18 | 18 | 18 |
Global aggregate (no group keys)
Section titled “Global aggregate (no group keys)”You want a single summary row for the whole dataset.
Before:
| order_id | amount |
|---|---|
| 1001 | 250 |
| 1002 | 750 |
| 1003 | 1100 |
| 1004 | 500 |
Configuration: group keys: [], aggregations: sum(amount) → total, count(amount) → orders, avg(amount) → average.
After:
| total | orders | average |
|---|---|---|
| 2600 | 4 | 650 |
Tips and Edge Cases
Section titled “Tips and Edge Cases”countcounts rows in the group, not non-null cells. TheinputColumnis required by the schema but ignored by the executor — every row in the group counts. To count only rows where a column has a value, filter upstream first. Seeapps/web/src/transforms/group-by/logic.ts:42-43.sumandavgskip non-numeric cells silently. A cell that failsparseFloat(e.g."n/a", empty string) is excluded from the running total and from the divisor inavg. The result is the average over numeric cells only — not zero-filled. If every cell is non-numeric,avgreturnsnullandsumreturns0. Seeapps/web/src/transforms/group-by/logic.ts:44-58.minandmaxuse a hybrid numeric/string comparison. When both candidate values parse as numbers, comparison is numeric (9 < 10); otherwise it falls back tolocaleCompare. This meansminover["9", "10", "2"]returns"2"numerically, butminover mixed["banana", "10"]falls back to string order. Coerce the column with Change Type upstream to force one regime. Seeapps/web/src/transforms/group-by/logic.ts:31-38.
Related Transforms
Section titled “Related Transforms”- Pivot — produce wide cross-tabs (one column per pivot value) instead of long groups.
- Filter Rows — drop rows before grouping to exclude unwanted records from aggregates.
- Sort Rows — order the grouped output by an aggregation column (e.g. top customers by total).