
Column matching

In the importer, users match columns from their file to the right fields in your schema. You can choose how column matching works:

  • Use managed column matching with { type: "managed" } when you want us to run our standard waterfall of matching steps.
  • Use custom column matching with { type: "custom", columnMatchHandler } when you want to run your own matching logic.

Matcher Input Context

The AI matcher runs on a compact payload with context for every unmatched source column and every unmatched schema field.

From the upload

  • Physical column index
  • Header name from the file
  • Sample values from the uploaded column

From your schema

  • Field name
  • Label
  • Field type
  • Description
  • Column name aliases
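To make the upload side of that payload concrete, here is a minimal sketch of how per-column context could be assembled from parsed rows. The interface name, property names other than sourceColumnIndex and sampleValues, and the buildSourceColumnContext helper are all hypothetical; the importer builds its real payload internally and its shape may differ.

```typescript
// Illustrative shape only; the SDK's real payload types may differ.
interface SourceColumnContext {
  sourceColumnIndex: number; // physical column index in the uploaded file
  header: string; // header name from the file
  sampleValues: string[]; // a few non-empty values from the column
}

// Build per-column context from parsed rows, treating the first row as headers.
function buildSourceColumnContext(
  rows: string[][],
  maxSamples = 3,
): SourceColumnContext[] {
  const [headers = [], ...dataRows] = rows;
  return headers.map((header, sourceColumnIndex) => {
    const sampleValues: string[] = [];
    for (const row of dataRows) {
      const value = (row[sourceColumnIndex] ?? "").trim();
      if (value !== "") sampleValues.push(value); // skip empty cells
      if (sampleValues.length >= maxSamples) break; // keep the payload compact
    }
    return { sourceColumnIndex, header: header.trim(), sampleValues };
  });
}

const context = buildSourceColumnContext([
  ["Client", "E-mail", "Net Sales"],
  ["Acme Ltd", "ops@acme.test", "1200.50"],
  ["Globex", "", "980.00"],
]);
console.log(JSON.stringify(context, null, 2));
```

Capping the number of sample values keeps the payload small while still giving a matcher enough signal to tell, say, an email column from a name column.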

When users upload files with customer-specific header names, add context on schema fields:

  • .label() and .columnNameAliases() help deterministic matching and are passed to AI matching
  • .description() adds extra context for managed and custom AI matching

x.string()
  .label("Net revenue")
  .description("Revenue after discounts, refunds, and tax")
  .columnNameAliases(["net sales", "total after refunds"]);

Managed Matching

Our managed matching pipeline always starts with deterministic matching. When columnMatching.inference is enabled, inference then runs only on the columns that are still unmatched.

import { CSVImporter, x } from "@expresscsv/sdk";

const schema = x.row({
  customerName: x
    .string()
    .label("Customer name")
    .columnNameAliases(["client", "account name"]),
  email: x.string().email().label("Email address"),
  billingAddress: x
    .string()
    .label("Billing address")
    .description("The customer's billing street address"),
});

const importer = new CSVImporter({
  schema,
  getSessionToken: async () => fetchSessionToken(),
  importNamespace: "customer-import",
  columnMatching: { type: "managed", inference: true }, 
});

How Matching Runs

  • Deterministic pass first: built-in matchers compare each uploaded header against the schema field name, label, and columnNameAliases.
  • Inference second (optional): when columnMatching.inference is enabled, AI matching runs only for columns still unmatched after the deterministic pass.
  • Context for AI: labels, descriptions, and aliases from your schema are included in the inference payload.

Managed matching exposes each step in the waterfall:

columnMatching: {
  type: "managed",
  exact: true,
  caseInsensitive: true,
  normalized: true,
  inference: true,
}

Each flag is optional. By default, deterministic steps run and inference stays off unless you enable it.
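As a rough mental model of the deterministic waterfall, the sketch below tries each enabled step in order and only considers headers that earlier steps left unmatched. This is not the SDK's implementation: the FieldInfo and Steps types, the matchDeterministically function, and the exact meaning given to each step (especially "normalized", assumed here to mean lowercasing and stripping non-alphanumeric characters) are assumptions for illustration.

```typescript
// Hypothetical types and helpers for illustration; not the SDK's implementation.
interface FieldInfo {
  name: string;
  label?: string;
  aliases?: string[];
}

interface Steps {
  exact?: boolean;
  caseInsensitive?: boolean;
  normalized?: boolean;
}

// Assumed meaning of each step: compare strings as-is, lowercased,
// or lowercased with all non-alphanumeric characters removed.
const transforms: Record<keyof Steps, (s: string) => string> = {
  exact: (s) => s,
  caseInsensitive: (s) => s.toLowerCase(),
  normalized: (s) => s.toLowerCase().replace(/[^a-z0-9]/g, ""),
};

function matchDeterministically(
  headers: string[],
  fields: FieldInfo[],
  steps: Steps = {},
): Map<number, string> {
  const matches = new Map<number, string>(); // header index -> field name
  const takenFields = new Set<string>();
  const order: (keyof Steps)[] = ["exact", "caseInsensitive", "normalized"];

  for (const step of order) {
    if (steps[step] === false) continue; // steps are treated as on unless disabled
    const transform = transforms[step];
    headers.forEach((header, index) => {
      if (matches.has(index)) return; // already matched by an earlier step
      const key = transform(header);
      if (key === "") return; // nothing meaningful to compare
      const hit = fields.find(
        (field) =>
          !takenFields.has(field.name) &&
          [field.name, field.label ?? "", ...(field.aliases ?? [])].some(
            (candidate) => candidate !== "" && transform(candidate) === key,
          ),
      );
      if (hit) {
        matches.set(index, hit.name);
        takenFields.add(hit.name);
      }
    });
  }
  return matches;
}

const fields: FieldInfo[] = [
  { name: "customerName", label: "Customer name", aliases: ["client", "account name"] },
  { name: "email", label: "Email address" },
  { name: "billingAddress", label: "Billing address" },
];

const result = matchDeterministically(["Client", "email_address", "Notes"], fields);
console.log(Array.from(result.entries()));
```

In this sketch "Client" is picked up by the case-insensitive step through an alias, "email_address" only matches once the normalized step runs, and "Notes" stays unmatched, which is the kind of column inference or the user would handle next.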

Custom Matching

Pass type: "custom" when matching should run in your own backend. Your handler:

  • Receives unmatched source columns and unmatched target fields
  • Returns { matches } with sourceColumnIndex and targetField for each mapping

import {
  CSVImporter,
  type ColumnMatchHandler,
  type ColumnMatchHandlerResult,
} from "@expresscsv/sdk";

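// `schema` is the x.row(...) schema passed to the importer. `accessToken` is
// assumed to be a credential for your own backend that is already in scope.
const columnMatchHandler: ColumnMatchHandler<typeof schema> = async ({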
  sessionId,
  sourceColumns,
  targetFields,
}) => {
  const response = await fetch("/your-api/ai/column-matching", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${accessToken}`,
    },
    body: JSON.stringify({
      sessionId,
      sourceColumns,
      targetFields,
    }),
  });

  if (!response.ok) {
    throw new Error(`Column matching failed with status ${response.status}`);
  }

  // Return { matches } from your backend using sourceColumnIndex and
  // targetField.
  return (await response.json()) as ColumnMatchHandlerResult<typeof schema>;
};

const importer = new CSVImporter({
  schema,
  getSessionToken: async () => fetchSessionToken(),
  importNamespace: "customer-import",
  columnMatching: { 
    type: "custom", 
    columnMatchHandler, 
  }, 
});

Your handler receives:

Property        Description
sessionId       ID for the current import run. Send it to your backend to correlate AI requests with the import being edited.
sourceColumns   Unmatched source columns, including sourceColumnIndex, header name, and sampleValues.
targetFields    Unmatched schema fields, including name, label, type, description, and aliases.

The returned matches must use the original sourceColumnIndex and the target field's schema key.

return {
  matches: [
    {
      // Use the original index from sourceColumns.
      sourceColumnIndex: 2,
      // Use the schema field key from targetFields.
      targetField: "email",
    },
  ],
};

  • Each target field should appear at most once in matches.
  • Leave unmapped source columns out of matches; the user can still map them manually in the importer.
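If your backend relies on a model or heuristic that can return duplicates or stale indexes, it may help to clean the result before returning it from the handler. The sketch below is one possible approach, not part of the SDK: the Match type mirrors the two properties described above, and sanitizeMatches is a hypothetical helper. Dropping a second match for the same source column is an extra assumption beyond the rules listed here.

```typescript
// Hypothetical shape based on the properties described above.
interface Match {
  sourceColumnIndex: number;
  targetField: string;
}

// Keep only matches that point at a known source column and a known target
// field, and keep the first match when a column or field appears twice.
function sanitizeMatches(
  matches: Match[],
  validIndexes: number[],
  validFields: string[],
): Match[] {
  const indexes = new Set(validIndexes);
  const fields = new Set(validFields);
  const usedIndexes = new Set<number>();
  const usedFields = new Set<string>();
  const result: Match[] = [];

  for (const match of matches) {
    const known = indexes.has(match.sourceColumnIndex) && fields.has(match.targetField);
    const duplicate =
      usedIndexes.has(match.sourceColumnIndex) || usedFields.has(match.targetField);
    if (!known || duplicate) continue; // dropped columns stay available for manual mapping
    usedIndexes.add(match.sourceColumnIndex);
    usedFields.add(match.targetField);
    result.push(match);
  }
  return result;
}

const cleaned = sanitizeMatches(
  [
    { sourceColumnIndex: 2, targetField: "email" },
    { sourceColumnIndex: 4, targetField: "email" }, // duplicate target field
    { sourceColumnIndex: 9, targetField: "customerName" }, // index not in sourceColumns
    { sourceColumnIndex: 0, targetField: "customerName" },
  ],
  [0, 2, 4],
  ["email", "customerName"],
);
console.log(cleaned);
```

In a handler, validIndexes and validFields would come from the sourceColumns and targetFields arguments, so anything the backend invents is filtered out before the importer sees it.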