Column matching

You choose how column matching works:

  • Use managed column matching with { type: "managed" } when you want us to run our standard waterfall of matching steps.
  • Use custom column matching with { type: "custom", match } when you want to run your own matching logic.

Matcher Input Context

The AI matcher receives a compact payload with context for each unmatched source column and each unmatched schema field.

Source column context:

  • Physical column index
  • Header name from the file
  • Sample values from the uploaded column

Schema field context:

  • Field name
  • Label
  • Field type
  • Description
  • Column name aliases

Use .label(), .description(), and .columnNameAliases() on schema fields when your users upload files with customer-specific names. Those hints help both deterministic matching and AI matching.

x.string()
  .label("Net revenue")
  .description("Revenue after discounts, refunds, and tax")
  .columnNameAliases(["net sales", "total after refunds"]);

Managed Matching

The managed matching pipeline always starts with deterministic matching. If columnMatching.inference is enabled, inference then runs only for the columns that are still unmatched.

import { useExpressCSV, x } from "@expresscsv/react";

const schema = x.row({
  customerName: x
    .string()
    .label("Customer name")
    .columnNameAliases(["client", "account name"]),
  email: x.string().email().label("Email address"),
  billingAddress: x
    .string()
    .label("Billing address")
    .description("The customer's billing street address"),
});

export function ImportCustomersButton() {
  const { open } = useExpressCSV({
    schema,
    // fetchSessionToken is an app-defined helper (not shown here).
    getSessionToken: async () => fetchSessionToken(),
    importIdentifier: "customer-import",
    columnMatching: { type: "managed", inference: true },
  });

  return (
    <button
      onClick={() =>
        open({
          onData: async (_chunk, next) => next(),
        })
      }
    >
      Import customers
    </button>
  );
}

Managed Usage Limits

Managed column matching is not available on the free plan. Paid plans include managed matching usage, subject to ExpressCSV fair-use limits.

How Matching Runs

The importer first makes a best-effort pass using ExpressCSV's built-in deterministic matchers. When columnMatching.inference is enabled, model-based matching runs only for columns that still need a match after that pass.

Managed matching exposes each step in the waterfall:

columnMatching: {
  type: "managed",
  exact: true,
  caseInsensitive: true,
  normalized: true,
  inference: true,
}

Each flag is optional. By default, the deterministic steps run and inference stays off.
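To make the waterfall idea concrete, here is a minimal sketch of how deterministic steps of this kind could be chained. It is not ExpressCSV's internal implementation: the function matchDeterministically, the types, and the exact normalization rule are all assumptions made for illustration. Only the step names (exact, caseInsensitive, normalized) come from the configuration above.

```typescript
// Illustrative sketch only: NOT ExpressCSV's internal implementation.
// All names and shapes below are hypothetical.
type SourceColumn = { sourceColumnIndex: number; header: string };
type TargetField = { name: string; label?: string; aliases?: string[] };
type Match = { sourceColumnIndex: number; targetField: string };
type Steps = { exact?: boolean; caseInsensitive?: boolean; normalized?: boolean };

// Each step compares headers and field names after applying its transform.
const transforms: Array<[keyof Steps, (s: string) => string]> = [
  ["exact", (s) => s],
  ["caseInsensitive", (s) => s.toLowerCase()],
  // Assumed normalization: lowercase, then drop everything except a-z and 0-9.
  ["normalized", (s) => s.toLowerCase().replace(/[^a-z0-9]/g, "")],
];

function matchDeterministically(
  columns: SourceColumn[],
  fields: TargetField[],
  steps: Steps = {},
): Match[] {
  const matches: Match[] = [];
  const usedColumns = new Set<number>();
  const usedFields = new Set<string>();

  for (const [step, transform] of transforms) {
    if (steps[step] === false) continue; // steps run unless explicitly disabled
    for (const column of columns) {
      if (usedColumns.has(column.sourceColumnIndex)) continue;
      const header = transform(column.header);
      if (header === "") continue; // nothing left to compare
      // Compare against the field key, its label, and any aliases.
      const field = fields.find(
        (f) =>
          !usedFields.has(f.name) &&
          [f.name, f.label ?? "", ...(f.aliases ?? [])]
            .filter((candidate) => candidate !== "")
            .some((candidate) => transform(candidate) === header),
      );
      if (field) {
        matches.push({
          sourceColumnIndex: column.sourceColumnIndex,
          targetField: field.name,
        });
        usedColumns.add(column.sourceColumnIndex);
        usedFields.add(field.name);
      }
    }
  }
  return matches;
}
```

In this sketch, later steps only see columns and fields that earlier, stricter steps left unmatched, which is the same principle the managed pipeline applies before it falls back to inference.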

Custom Matching

Pass type: "custom" when you want matching to run in your own infrastructure, such as your own backend. The React SDK keeps your handler stable across renders and delegates matching requests through the importer iframe.

import {
  useExpressCSV,
  type ColumnMatchingOptions,
} from "@expresscsv/react";

// `schema` is your x.row(...) schema, such as the one from the managed matching example.
type CustomColumnMatching = Extract<
  ColumnMatchingOptions<typeof schema>,
  { type: "custom" }
>;

const match: CustomColumnMatching["match"] = async ({
  sessionId,
  sourceColumns,
  targetFields,
}) => {
  const response = await fetch("/api/ai/column-matching", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      sessionId,
      sourceColumns,
      targetFields,
    }),
  });

  if (!response.ok) {
    throw new Error("Column matching failed");
  }

  // Return { matches } from your backend using sourceColumnIndex and targetField.
  return (await response.json()) as Awaited<ReturnType<typeof match>>;
};

const { open } = useExpressCSV({
  schema,
  getSessionToken: async () => fetchSessionToken(),
  importIdentifier: "customer-import",
  columnMatching: {
    type: "custom",
    match,
  },
});

Your handler receives:

Property         Description
sessionId        The current import session ID.
sourceColumns    Unmatched source columns, including sourceColumnIndex, header name, and sampleValues.
targetFields     Unmatched schema fields, including name, label, type, description, and aliases.
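Because each source column arrives with sample values, a custom matcher can look at content as well as headers. The sketch below shows one such heuristic, picking the column whose samples mostly look like email addresses. The helper findEmailColumn, the 0.8 threshold, the simplified email pattern, and the assumption that sampleValues is an array of strings are all hypothetical; check the SDK's types for the real shape.

```typescript
// Assumed shape: sourceColumnIndex and sampleValues are named in the table above,
// but treating sampleValues as string[] is an assumption.
type SourceColumnSample = { sourceColumnIndex: number; sampleValues: string[] };

// Deliberately loose pattern: something@something.something, no whitespace.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Returns the index of the first column whose non-empty samples are mostly
// email-like, or undefined when no column clears the threshold.
function findEmailColumn(
  columns: SourceColumnSample[],
  threshold = 0.8,
): number | undefined {
  for (const column of columns) {
    const values = column.sampleValues
      .map((value) => value.trim())
      .filter((value) => value !== "");
    if (values.length === 0) continue; // no evidence either way
    const hits = values.filter((value) => EMAIL_PATTERN.test(value)).length;
    if (hits / values.length >= threshold) {
      return column.sourceColumnIndex;
    }
  }
  return undefined;
}
```

A handler could call a helper like this before or instead of a network request, and then report the result using the column's original sourceColumnIndex.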

The returned matches must use the original sourceColumnIndex and the target field's schema key.

return {
  matches: [
    {
      // Use the original index from sourceColumns.
      sourceColumnIndex: 2,
      // Use the schema field key from targetFields.
      targetField: "email",
    },
  ],
};

Each target field should appear at most once. If a source column should stay unmapped, leave it out of matches; the user can still map it manually in the importer.
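If your matches come from a model or another untrusted source, it can help to enforce these rules before returning. The following is a minimal sketch, assuming a hypothetical helper named sanitizeMatches that you would write yourself; it is not part of the SDK. It drops any match that references an unknown column index or field key, and keeps only the first match for each target field.

```typescript
type Match = { sourceColumnIndex: number; targetField: string };

// Hypothetical helper (not part of the SDK): filters candidate matches so that
// every index and field key was actually offered to the handler, and each
// target field appears at most once. Earlier candidates win over later ones.
function sanitizeMatches(
  candidates: Match[],
  sourceColumnIndexes: number[],
  targetFieldKeys: string[],
): Match[] {
  const validColumns = new Set(sourceColumnIndexes);
  const validFields = new Set(targetFieldKeys);
  const usedFields = new Set<string>();
  const matches: Match[] = [];

  for (const candidate of candidates) {
    // Skip references to columns or fields that were not in the request.
    if (!validColumns.has(candidate.sourceColumnIndex)) continue;
    if (!validFields.has(candidate.targetField)) continue;
    // Keep only the first match for each target field.
    if (usedFields.has(candidate.targetField)) continue;
    usedFields.add(candidate.targetField);
    matches.push({
      sourceColumnIndex: candidate.sourceColumnIndex,
      targetField: candidate.targetField,
    });
  }
  return matches;
}
```

Columns that end up without a match are simply absent from the result, which leaves them for the user to map manually.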