Skip to content

Lookup

Aggregation

Match rows from a primary table against a lookup table by a key column and merge the matched rows together. Choose between left join (keep all primary rows, fill nulls when no match), inner join (drop unmatched), or enrich (skip columns already in primary instead of renaming). Use it to attach reference data — currency rates, product metadata, customer tiers — to a working dataset.

Lookup is a multi-input batch transform with two named input handles: primary (the rows to enrich) and lookup (the reference table). Both inputs are buffered fully; the lookup table is then indexed by lookupKey into a hash map for O(1) per-row matching.

When a primary row’s primaryKey value matches a lookup row’s lookupKey, the two rows are merged. If multiple lookup rows share the same key, the primary row is duplicated once per match — a one-to-many lookup expands the output. If no match exists, behavior depends on the mode.

In left_join and inner_join modes, lookup columns whose names collide with primary columns are renamed to lookup_{name} to preserve both values. In lookup (enrich) mode, colliding columns are skipped — the primary value wins. This makes enrich mode the right choice when you only want to add new columns from the reference table without overwriting or duplicating existing ones.

Input: Two tabular data connections — primary (left-side rows) and lookup (right-side reference table). Output: Merged rows, with row count equal to primary rows in lookup and left_join modes (or more if the lookup has duplicate keys), and equal to the number of matching primary rows in inner_join mode.

OptionTypeDefaultDescription
mode"left_join" | "inner_join" | "lookup""left_join"left_join keeps all primary rows; inner_join drops unmatched primary rows; lookup (enrich) keeps all primary rows and skips colliding columns.
primaryKeystring""Column in the primary input to match on. Required.
lookupKeystring""Column in the lookup input to match against. Required.

Enrich orders with customer details (left join)

Section titled “Enrich orders with customer details (left join)”

Your orders table has customer IDs. The customers table has the names and tiers you want to attach.

Primary input (orders):

order_idcustomer_idamount
1001C-1250
1002C-2750
1003C-31100
1004C-9500

Lookup input (customers):

customer_idnametier
C-1Acme Corpstandard
C-2Beta Incpro
C-3Gamma LLCstandard

Configuration: mode: left_join, primary key: customer_id, lookup key: customer_id.

After:

order_idcustomer_idamountlookup_customer_idnametier
1001C-1250C-1Acme Corpstandard
1002C-2750C-2Beta Incpro
1003C-31100C-3Gamma LLCstandard
1004C-9500

The unmatched order (C-9) is kept; its lookup_customer_id, name, and tier are filled with null. Note that customer_id is a column on both inputs: in left_join and inner_join modes the lookup-side copy is renamed to lookup_customer_id to preserve both. The third example below shows enrich mode’s alternative behavior, which discards the lookup-side copy instead.

You want only orders that have a matching customer record, dropping any orphans.

Primary input (orders): same as above.

Lookup input (customers): same as above.

Configuration: mode: inner_join, primary key: customer_id, lookup key: customer_id.

After:

order_idcustomer_idamountlookup_customer_idnametier
1001C-1250C-1Acme Corpstandard
1002C-2750C-2Beta Incpro
1003C-31100C-3Gamma LLCstandard

The order for C-9 is dropped because there’s no matching customer. As in the previous example, the lookup-side customer_id is preserved as lookup_customer_id.

Enrich mode preserves primary on column collisions

Section titled “Enrich mode preserves primary on column collisions”

Both primary and lookup have a region column but they mean different things. You want to keep the primary’s region and add the lookup’s other columns (without prefixing).

Primary input (orders):

order_idcustomer_idregion
1001C-1EU
1002C-2US

Lookup input (customers):

customer_idregiontier
C-1EMEAstandard
C-2AMERICASpro

Configuration: mode: lookup, primary key: customer_id, lookup key: customer_id.

After:

order_idcustomer_idregiontier
1001C-1EUstandard
1002C-2USpro

The lookup’s region column is dropped (it would collide with primary), and only tier is added. In left_join mode the same configuration would have produced a lookup_region column.

  • Duplicate keys in the lookup table multiply the primary row. If the lookup has two rows with the same lookupKey, every matching primary row is emitted twice. Use Deduplicate on the lookup input first if you need a one-to-one match. See apps/web/src/transforms/lookup/logic.ts:70-73.
  • Keys are compared as strings. The executor calls String(value ?? "") on both sides before indexing. This means 42 (number) matches "42" (string), but null and undefined both collapse to "" and match each other. If your keys are numeric and you have null cells, run a Filter Rows upstream to drop them. See apps/web/src/transforms/lookup/logic.ts:42-44 and apps/web/src/transforms/lookup/logic.ts:64.
  • The two join modes prefix collisions; enrich mode skips them. With identically named columns on both sides, left_join and inner_join rename the lookup-side column to lookup_{name}. lookup mode silently keeps the primary-side value and discards the lookup-side. There is no error or warning when this happens — read the schema preview in the editor to confirm the output shape. See apps/web/src/transforms/lookup/logic.ts:67-68 and apps/web/src/transforms/lookup/logic.ts:118-127.
  • Stack Rows — append rows from multiple inputs vertically (no key matching).
  • Group By — aggregate per-group totals after enriching with Lookup.
  • Remove Duplicates — collapse duplicate keys in the lookup table before joining.