Column matching

In the importer, users match columns from their file to the right fields in your schema. You can choose how column matching works:

Use managed column matching with { type: "managed" } when you want us to run our standard waterfall of matching steps.
Use custom column matching with { type: "custom", columnMatchHandler } when you want to run your own matching logic.

Matcher Input Context

The AI matcher runs on a compact payload with context for every unmatched source column and every unmatched schema field.

From the upload

Physical column index
Header name from the file
Sample values from the uploaded column

From your schema

Field name
Label
Field type
Description
Column name aliases

When users upload files with customer-specific header names, add context on schema fields:

.label() and .columnNameAliases() help deterministic matching and are passed to AI matching
.description() adds extra context for managed and custom AI matching

x.string()
  .label("Net revenue")
  .description("Revenue after discounts, refunds, and tax")
  .columnNameAliases(["net sales", "total after refunds"]);

Managed Matching

Our managed matching pipeline always starts with deterministic matching. When columnMatching.inference is enabled, inference then runs only for the columns that are still unmatched.

import { CSVImporter, x } from "@expresscsv/sdk";

const schema = x.row({
  customerName: x
    .string()
    .label("Customer name")
    .columnNameAliases(["client", "account name"]),
  email: x.string().email().label("Email address"),
  billingAddress: x
    .string()
    .label("Billing address")
    .description("The customer's billing street address"),
});

const importer = new CSVImporter({
  schema,
  getSessionToken: async () => fetchSessionToken(),
  importNamespace: "customer-import",
  columnMatching: { type: "managed", inference: true },
});

Plan requirements

Managed column matching is only available on paid plans. Usage is included under ExpressCSV fair-use limits.

How Matching Runs

Deterministic pass first: built-in matchers compare each uploaded header against the schema field name, label, and columnNameAliases.
Inference second (optional): when columnMatching.inference is enabled, AI matching runs only for columns still unmatched after the deterministic pass.
Context for AI: labels, descriptions, and aliases from your schema are included in the inference payload.

Managed matching exposes each step in the waterfall:

columnMatching: {
  type: "managed",
  exact: true,
  caseInsensitive: true,
  normalized: true,
  inference: true,
}

Each flag is optional. By default, deterministic steps run and inference stays off unless you enable it.
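
To make the waterfall order concrete, here is a minimal sketch of how a deterministic pass could work. It is not the ExpressCSV implementation: the matchHeader function, the FieldNames shape, and the normalization rule (lowercase, then drop everything except letters and digits) are assumptions made for illustration. Only the step names exact, caseInsensitive, and normalized come from the config above.

```typescript
// Hypothetical sketch of a deterministic matching waterfall.
// Names and the normalization rule are assumptions, not SDK internals.
type Step = "exact" | "caseInsensitive" | "normalized";

interface FieldNames {
  key: string; // schema field name
  label?: string;
  aliases?: string[];
}

interface WaterfallOptions {
  exact?: boolean;
  caseInsensitive?: boolean;
  normalized?: boolean;
}

// Assumed normalization: lowercase and keep only letters and digits.
const normalize = (s: string): string =>
  s.toLowerCase().replace(/[^a-z0-9]/g, "");

// Steps are tried in order, from strictest to loosest.
const comparators: Array<[Step, (a: string, b: string) => boolean]> = [
  ["exact", (a, b) => a === b],
  ["caseInsensitive", (a, b) => a.toLowerCase() === b.toLowerCase()],
  ["normalized", (a, b) => normalize(a) === normalize(b)],
];

function matchHeader(
  header: string,
  fields: FieldNames[],
  options: WaterfallOptions = {},
): { key: string; step: Step } | undefined {
  for (const [step, same] of comparators) {
    // A step runs unless it is explicitly switched off.
    if (options[step] === false) continue;
    for (const field of fields) {
      const candidates = [field.key, field.label, ...(field.aliases ?? [])];
      if (candidates.some((c) => c !== undefined && same(header, c))) {
        return { key: field.key, step };
      }
    }
  }
  // Still unmatched: this is where optional inference would take over.
  return undefined;
}
```

With the customer schema from the managed example, a header such as "Account_Name" would fall through the exact and case-insensitive steps and match customerName at the normalized step through the "account name" alias, under the normalization assumed here.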

Custom Matching

Pass type: "custom" when matching should run in your own backend. Your handler:

Receives unmatched source columns and unmatched target fields
Returns { matches } with sourceColumnIndex and targetField for each mapping

import {
  CSVImporter,
  type ColumnMatchHandler,
  type ColumnMatchHandlerResult,
} from "@expresscsv/sdk";

// `schema` is the x.row(...) schema you pass to the importer below.
const columnMatchHandler: ColumnMatchHandler<typeof schema> = async ({
  sessionId,
  sourceColumns,
  targetFields,
}) => {
  const response = await fetch("/your-api/ai/column-matching", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // `accessToken` is assumed to come from your own auth layer.
      Authorization: `Bearer ${accessToken}`,
    },
    body: JSON.stringify({
      sessionId,
      sourceColumns,
      targetFields,
    }),
  });

  if (!response.ok) {
    throw new Error("Column matching failed");
  }

  // Return { matches } from your backend using sourceColumnIndex and
  // targetField.
  return (await response.json()) as ColumnMatchHandlerResult<typeof schema>;
};

const importer = new CSVImporter({
  schema,
  getSessionToken: async () => fetchSessionToken(),
  importNamespace: "customer-import",
  columnMatching: {
    type: "custom",
    columnMatchHandler,
  },
});

Your handler receives:

Property	Description
`sessionId`	ID for the current import run. Send it to your backend to correlate AI requests with the import being edited.
`sourceColumns`	Unmatched source columns, including `sourceColumnIndex`, header `name`, and `sampleValues`.
`targetFields`	Unmatched schema fields, including `name`, `label`, `type`, `description`, and `aliases`.
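
As an illustration of what a backend could do with this input, the sketch below matches columns by inspecting sampleValues rather than headers. The interfaces are inferred from the table above and are assumptions (in particular, that sampleValues is a string array and that a target field's name is its schema key). The matchBySampleValues function, the detector map, and the 0.8 threshold are hypothetical and not part of the SDK.

```typescript
// Shapes inferred from the handler input table; treat them as assumptions.
interface SourceColumn {
  sourceColumnIndex: number;
  name: string;
  sampleValues: string[];
}

interface TargetField {
  name: string; // assumed to be the schema key
  label?: string;
  type: string;
  description?: string;
  aliases?: string[];
}

interface Match {
  sourceColumnIndex: number;
  targetField: string;
}

type Detector = (value: string) => boolean;

// Hypothetical heuristic: a column matches a field when at least
// `threshold` of its non-empty sample values pass that field's detector.
function matchBySampleValues(
  sourceColumns: SourceColumn[],
  targetFields: TargetField[],
  detectors: Record<string, Detector>,
  threshold = 0.8,
): { matches: Match[] } {
  const matches: Match[] = [];
  const usedColumns = new Set<number>();

  for (const field of targetFields) {
    const detect = detectors[field.name];
    if (!detect) continue; // no detector registered for this field

    for (const column of sourceColumns) {
      if (usedColumns.has(column.sourceColumnIndex)) continue;
      const values = column.sampleValues.filter((v) => v.trim() !== "");
      if (values.length === 0) continue;

      const hits = values.filter((v) => detect(v)).length;
      if (hits / values.length >= threshold) {
        usedColumns.add(column.sourceColumnIndex);
        // Report the original index, not the position in the array.
        matches.push({
          sourceColumnIndex: column.sourceColumnIndex,
          targetField: field.name,
        });
        break; // each target field is matched at most once
      }
    }
  }
  return { matches };
}
```

A detector for an email field could be as simple as a regular expression test. Columns that no detector claims are left out of the result, so the user can still map them manually.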

The returned matches must use the original sourceColumnIndex and the target field's schema key.

return {
  matches: [
    {
      // Use the original index from sourceColumns.
      sourceColumnIndex: 2,
      // Use the schema field key from targetFields.
      targetField: "email",
    },
  ],
};

Each target field should appear at most once in matches.
Leave unmapped source columns out of matches; the user can still map them manually in the importer.
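
If your backend relies on a model that may return duplicate or invalid mappings, it can help to clean the result before returning it. The following is a minimal sketch under stated assumptions: sanitizeMatches is a hypothetical helper, not an SDK function, and the rule that each source column appears at most once is an extra assumption beyond the rules above.

```typescript
interface Match {
  sourceColumnIndex: number;
  targetField: string;
}

// Hypothetical helper: keeps only matches that reference a known source
// column index and a known target field, and keeps the first match per
// target field (and, as an extra assumption, per source column).
function sanitizeMatches(
  matches: Match[],
  validIndexes: number[],
  validFields: string[],
): Match[] {
  const indexes = new Set(validIndexes);
  const fields = new Set(validFields);
  const seenFields = new Set<string>();
  const seenIndexes = new Set<number>();
  const result: Match[] = [];

  for (const match of matches) {
    const known =
      indexes.has(match.sourceColumnIndex) && fields.has(match.targetField);
    const duplicate =
      seenFields.has(match.targetField) ||
      seenIndexes.has(match.sourceColumnIndex);
    if (!known || duplicate) continue;

    seenFields.add(match.targetField);
    seenIndexes.add(match.sourceColumnIndex);
    result.push(match);
  }
  return result;
}
```

Inside a handler, the valid indexes and fields could be derived from the sourceColumns and targetFields it received, for example sourceColumns.map((c) => c.sourceColumnIndex) and targetFields.map((f) => f.name), assuming name holds the schema key.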