Migrating to JSON Schema

Current Config Structure

This section gives background on the current config structure and the new JSON Schema config structure. To skip ahead to the migration steps, go straight to the Migrating to JSON Schema section.

If your organization started using Extend before April 2025, you have likely been using the Fields Array config type. You can tell whether a processor is using the Fields Array config type by looking at the processor details in the Studio. If you don’t see this section, your processor is using the JSON Schema config type and you do not have access to the legacy “Fields Array” config type.

If your processor is using the Fields Array config type, the config object in the processor has a fields array that contains the fields for the processor. Here is an example config object of this type:

{
  "type": "EXTRACT",
  "fields": [
    {
      "id": "invoice_number",
      "name": "invoice_number",
      "type": "string",
      "description": "The unique identifier for this invoice"
    },
    {
      "id": "amount",
      "name": "amount",
      "type": "currency",
      "description": "The total amount of the invoice"
    }
  ]
  // other fields...
}

This schema has worked well. Since we released it, however, the industry has standardized on JSON Schema as the way to define response schemas. To make our processors easier for developers to use, we are moving to JSON Schema as the way schemas are defined for processors.

New JSON Schema Config Structure

A JSON Schema config object equivalent of the above example is:

{
  "type": "EXTRACT",
  "schema": {
    "type": "object",
    "properties": {
      "invoice_number": {
        "type": ["string", "null"],
        "description": "The unique identifier for this invoice"
      },
      "amount": {
        "type": "object",
        "properties": {
          "amount": {
            "type": ["number", "null"]
          },
          "iso_4217_currency_code": {
            "type": ["string", "null"]
          }
        },
        "required": ["amount", "iso_4217_currency_code"],
        "additionalProperties": false
      }
    },
    "required": ["invoice_number", "amount"],
    "additionalProperties": false
  }
  // other fields...
}

You’ll notice that instead of the fields array, we have a schema object. This object is a JSON Schema object that describes the shape of the output you will receive from the processor. For more information on the JSON Schema config structure, please see the JSON Schema Config section of the API Reference.
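
To make the correspondence between the two config shapes more concrete, here is a minimal sketch of a converter for primitive fields. It is not an official Extend utility: the names `LegacyField`, `fieldsToSchema`, and `PRIMITIVE_TYPES` are hypothetical, and the conventions it applies (nullable types, every field listed in `required`, `additionalProperties: false`) are inferred only from the two examples above. It also assumes that legacy `string`, `number`, and `boolean` types map directly onto the JSON Schema primitives of the same name. Types such as `currency`, which become object schemas, are deliberately left out.

```typescript
// Hypothetical shape of one entry in the legacy `fields` array,
// inferred from the example config above.
type LegacyField = {
  id: string;
  name: string;
  type: string;
  description?: string;
};

type PropertySchema = {
  type: string[];
  description?: string;
};

type ObjectSchema = {
  type: "object";
  properties: Record<string, PropertySchema>;
  required: string[];
  additionalProperties: false;
};

// Assumption: only these legacy types are handled by this sketch.
const PRIMITIVE_TYPES = ["string", "number", "boolean"];

function fieldsToSchema(fields: LegacyField[]): ObjectSchema {
  const properties: Record<string, PropertySchema> = {};
  for (const field of fields) {
    if (PRIMITIVE_TYPES.indexOf(field.type) === -1) {
      throw new Error(`Unsupported legacy field type: ${field.type}`);
    }
    // Each property is nullable, mirroring the example schema.
    const property: PropertySchema = { type: [field.type, "null"] };
    if (field.description !== undefined) {
      property.description = field.description;
    }
    properties[field.name] = property;
  }
  return {
    type: "object",
    properties,
    required: fields.map((field) => field.name),
    additionalProperties: false,
  };
}

const schema = fieldsToSchema([
  {
    id: "invoice_number",
    name: "invoice_number",
    type: "string",
    description: "The unique identifier for this invoice",
  },
]);
```

Throwing on unknown types is a deliberate choice in this sketch: it surfaces fields that need a hand-written object schema instead of silently producing an incorrect one.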

Current Output Structure

The current output for Extraction processors is an object keyed by field name. Each field maps to an object with the following properties:

id: The unique identifier for the field
type: The type of the field
value: The value of the field
confidence: The confidence score of the field
insights: The insights for the field
references: The references for the field

Here is an example of the output:

{
  "invoice_number": {
    "id": "invoice_number",
    "type": "string",
    "value": "36995",
    "confidence": 0.98,
    "insights": [
      {
        "type": "reasoning",
        "content": "The invoice number is clearly labeled as 'Invoice #36995' at the top right of the document, making it straightforward to extract."
      }
    ],
    "references": [
      {
        "page": 1,
        "boundingBoxes": [
          {
            "left": 296.73359999999997,
            "top": 40.888799999999996,
            "right": 386.4168,
            "bottom": 52.1712
          }
        ],
        "referenceText": "Invoice #36995"
      }
    ]
  },
  "amount": {
    "id": "amount",
    "type": "number",
    "value": 15735.1,
    "confidence": 0.98,
    "insights": [
      {
        "type": "reasoning",
        "content": "The total amount is shown as '$15,735.1' in both the table summary and the bottom right of the document. The currency symbol '$' and the US address indicate the currency is USD. The value is numeric and matches the required format."
      }
    ],
    "references": [
      {
        "page": 1,
        "boundingBoxes": [
          {
            "left": 430.164,
            "top": 722.772,
            "right": 467.27279999999996,
            "bottom": 722.8296
          }
        ],
        "referenceText": "TOTAL  $15,735.1"
      }
    ]
  }
}

In this output, metadata such as confidence, insights, and references is nested inside each field’s object, right next to the value. The benefit is that the metadata for a specific field is very easy to access. The downside is that this layout does not work well for nested fields such as arrays and objects.
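
As an illustration of how code typically reads this shape, the sketch below pulls out the plain values and lists the fields whose confidence falls below a threshold. The names `LegacyFieldOutput`, `legacyValues`, and `lowConfidenceFields` are hypothetical and introduced only for this example; `insights` and `references` are typed loosely because only their example shapes are shown above.

```typescript
// Hypothetical type for one field in the legacy output,
// based on the property list above.
type LegacyFieldOutput = {
  id: string;
  type: string;
  value: unknown;
  confidence?: number;
  insights?: unknown[];
  references?: unknown[];
};

type LegacyOutput = Record<string, LegacyFieldOutput>;

// Keep only `value` for each field, dropping the per-field metadata.
function legacyValues(output: LegacyOutput): Record<string, unknown> {
  const values: Record<string, unknown> = {};
  for (const name of Object.keys(output)) {
    const field = output[name];
    if (field !== undefined) {
      values[name] = field.value;
    }
  }
  return values;
}

// Names of fields that report a confidence strictly below `threshold`.
// Fields without a confidence score are skipped.
function lowConfidenceFields(output: LegacyOutput, threshold: number): string[] {
  return Object.keys(output).filter((name) => {
    const field = output[name];
    return (
      field !== undefined &&
      typeof field.confidence === "number" &&
      field.confidence < threshold
    );
  });
}

// Trimmed copy of the example output above.
const legacyOutput: LegacyOutput = {
  invoice_number: { id: "invoice_number", type: "string", value: "36995", confidence: 0.98 },
  amount: { id: "amount", type: "number", value: 15735.1, confidence: 0.98 },
};

const plainValues = legacyValues(legacyOutput);
const needsReview = lowConfidenceFields(legacyOutput, 0.99);
```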

New JSON Schema Output Structure

The output structure for JSON Schema processors is composed of two properties: value and metadata. The value object is the actual data extracted from the document, and it conforms to the JSON Schema defined in the processor config. The metadata object holds details such as confidence scores and citations for the extracted data. Its keys are the paths to the corresponding data within the value object. For instance, if your data has value.line_items[0].name, the metadata for that name field is found under the key ‘line_items[0].name’ in the metadata object.

For more information on the metadata object, please see the Accessing Metadata section of the API Reference. Below is an example of the output you will receive from a JSON Schema processor:

{
  "value": {
    "amount": {
      "amount": 15735.1,
      "iso_4217_currency_code": "USD"
    },
    "invoice_number": "36995"
  },
  "metadata": {
    "amount": {
      "insights": [
        {
          "type": "reasoning",
          "content": "The total amount is shown as '$15,735.1' in both the table summary and the bottom right of the document. The currency symbol '$' and the US address indicate the currency is USD. The value is numeric and matches the required format."
        }
      ],
      "citations": [
        {
          "page": 1,
          "polygon": [
            {
              "x": 430.164,
              "y": 722.772
            },
            {
              "x": 467.27279999999996,
              "y": 722.8296
            },
            {
              "x": 467.2584,
              "y": 731.6351999999999
            },
            {
              "x": 430.1496,
              "y": 731.5776
            }
          ],
          "referenceText": "TOTAL  $15,735.1"
        }
      ],
      "ocrConfidence": 0.992,
      "logprobsConfidence": 1
    },
    "invoice_number": {
      "insights": [
        {
          "type": "reasoning",
          "content": "The invoice number is clearly labeled as 'Invoice #36995' at the top right of the document, making it straightforward to extract."
        }
      ],
      "citations": [
        {
          "page": 1,
          "polygon": [
            {
              "x": 296.73359999999997,
              "y": 40.888799999999996
            },
            {
              "x": 386.4168,
              "y": 40.464000000000006
            },
            {
              "x": 386.4744,
              "y": 52.1712
            },
            {
              "x": 296.7912,
              "y": 52.596000000000004
            }
          ],
          "referenceText": "Invoice #36995"
        }
      ],
      "ocrConfidence": 0.986,
      "logprobsConfidence": 1
    }
  }
}
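
Because metadata keys are path strings, looking up the metadata for a nested value means building the key from the path segments. The sketch below shows one way to do that. `metadataKey` and `getMetadata` are hypothetical helpers, not part of any Extend SDK, and the key format (dot-separated property names, with `[n]` for array indices) is assumed from the `line_items[0].name` example above.

```typescript
// Simplified metadata types for this sketch; only the confidence
// fields from the example output are included.
type MetadataEntry = {
  ocrConfidence?: number | null;
  logprobsConfidence?: number | null;
};

type Metadata = { [key: string]: MetadataEntry | undefined };

type PathSegment = string | number;

// Build a metadata key such as "line_items[0].name" from path segments.
function metadataKey(path: PathSegment[]): string {
  let key = "";
  for (const segment of path) {
    if (typeof segment === "number") {
      key += `[${segment}]`; // array index
    } else {
      key += key === "" ? segment : `.${segment}`; // property name
    }
  }
  return key;
}

// Look up the metadata entry for a path; undefined if none exists.
function getMetadata(metadata: Metadata, path: PathSegment[]): MetadataEntry | undefined {
  return metadata[metadataKey(path)];
}

// Trimmed copy of the metadata from the example output above.
const metadata: Metadata = {
  invoice_number: { ocrConfidence: 0.986, logprobsConfidence: 1 },
};

const nestedKey = metadataKey(["line_items", 0, "name"]); // "line_items[0].name"
const invoiceMetadata = getMetadata(metadata, ["invoice_number"]);
```

Returning `undefined` for a missing key keeps the lookup safe for values that have no metadata entry.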

The benefit of this output structure is that the data for a specific field is very easy to access, and it should be easy to ingest as a typed object because it conforms to the JSON Schema defined in the processor config. The TypeScript types for the output are the following:

type ExtractionOutput = {
  value: ExtractionValue;
  metadata: ExtractionMetadata;
};

type ExtractionValue = Record<string, any>; // Conforms to the schema defined in the processor config
type ExtractionMetadata = {
  [key: string]: ExtractionMetadataEntry | undefined;
};

type ExtractionMetadataEntry = {
  ocrConfidence?: number | null;
  logprobsConfidence: number | null;
  citations?: Citation[];
  insights?: OutputInsight[];
};

type Citation = {
  page?: number;
  referenceText?: string | null;
  polygon?: Point[];
};

type Point = {
  /**
   * x coordinate - relative from the left side of the page
   */
  x: number;
  /**
   * y coordinate - relative from the top of the page
   */
  y: number;
};

type OutputInsight = {
  type: "reasoning";
  content: string;
};
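
Code written against the legacy `boundingBoxes` format may expect `left`, `top`, `right`, and `bottom` values instead of a polygon. As a hedged sketch, the helper below computes the axis-aligned box that encloses a citation polygon. `BoundingBox` and `polygonToBoundingBox` are hypothetical names introduced here, and the result is not guaranteed to match what the legacy output would have reported for the same text.

```typescript
type Point = { x: number; y: number };

// Hypothetical type mirroring the legacy bounding box properties.
type BoundingBox = { left: number; top: number; right: number; bottom: number };

// Smallest axis-aligned box containing every point of the polygon.
// Returns null for an empty polygon.
function polygonToBoundingBox(polygon: Point[]): BoundingBox | null {
  if (polygon.length === 0) {
    return null;
  }
  const xs = polygon.map((point) => point.x);
  const ys = polygon.map((point) => point.y);
  return {
    left: Math.min(...xs),
    top: Math.min(...ys),
    right: Math.max(...xs),
    bottom: Math.max(...ys),
  };
}

// Polygon copied from the `amount` citation in the example output above.
const amountBox = polygonToBoundingBox([
  { x: 430.164, y: 722.772 },
  { x: 467.27279999999996, y: 722.8296 },
  { x: 467.2584, y: 731.6351999999999 },
  { x: 430.1496, y: 731.5776 },
]);
// amountBox: { left: 430.1496, top: 722.772, right: 467.27279999999996, bottom: 731.6351999999999 }
```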