Processor configs
Learn how to configure document processors through our API
Document Processor Configuration Guide
Document processors are the core components that analyze and manipulate information from your documents. This guide explains how to configure processors through our API, including detailed examples for each processor type.
Overview
We support three types of document processors:
- Extraction Processors: Extract specific fields from documents.
- Classification Processors: Categorize documents.
- Splitter Processors: Divide documents into logical sub-documents.
We generally recommend configuring processors in the UI. The API is useful in programmatic workflows, when you need to configure a large number of processors, or when you want to keep your configurations versioned in source control.
You can also use our webhook events to consume changes made to configurations in the Extend Dashboard and keep your saved configurations in sync.
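As a rough illustration of keeping configurations in source control, the sketch below writes a configuration to disk as stable, diff-friendly JSON and checks whether a saved copy is out of date. The function names, file layout, and the shape of the `config` dictionary are assumptions made for this example and are not part of the Extend API.

```python
import json
from pathlib import Path


def save_config(config: dict, path: Path) -> None:
    """Write a processor config as JSON with sorted keys so diffs stay small."""
    text = json.dumps(config, indent=2, sort_keys=True, ensure_ascii=False)
    path.write_text(text + "\n", encoding="utf-8")


def config_changed(config: dict, path: Path) -> bool:
    """Return True if no saved copy exists or the saved copy differs from `config`."""
    if not path.exists():
        return True
    saved = json.loads(path.read_text(encoding="utf-8"))
    return saved != config
```

A sync job could call `config_changed` whenever it receives a configuration update and only commit a new file when the result is `True`.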
Best Practices
- Field IDs: Use clear, lowercase, underscore-separated identifiers.
- Descriptions: Provide detailed descriptions for all fields and classifications.
- Field Types: Choose the most specific field type for your use case.
- Validation: Test your configurations with sample documents.
- Base Processors: Let Extend default to the latest version of a processor on creation, or pin a specific version instead (see the Changelog for more details).
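The field ID convention above can be checked mechanically before a configuration is submitted. The following is a minimal sketch; the exact rule (must start with a letter, single underscores between lowercase alphanumeric parts) is one reasonable reading of "clear, lowercase, underscore-separated" and is an assumption, not a rule enforced by the API.

```python
import re

# Assumed convention: lowercase words or numbers joined by single underscores,
# starting with a letter (for example "invoice_number" or "line_item_2").
FIELD_ID_PATTERN = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def is_valid_field_id(field_id: str) -> bool:
    """Return True if `field_id` follows the lowercase, underscore-separated style."""
    return FIELD_ID_PATTERN.fullmatch(field_id) is not None
```

For example, `is_valid_field_id("invoice_number")` returns `True`, while `"InvoiceNumber"` and `"invoice-number"` are rejected.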
Error Handling
The API will return validation errors if your configuration is invalid. Common issues include:
- Missing required fields
- Invalid field types
- Malformed enum options
- Invalid base processor references
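Several of these issues can be caught locally before calling the API. The sketch below assumes a hypothetical field shape with `id`, `type`, `description`, and (for enum fields) an `enum` list of strings; the key names and required keys in the real schema may differ. Base processor references are not checked, since they can only be validated by the service.

```python
# Field types listed in this guide.
SUPPORTED_TYPES = {
    "string", "number", "currency", "boolean", "date",
    "array", "enum", "object", "signature",
}
# Assumed required keys; adjust to match the actual schema.
REQUIRED_KEYS = ("id", "type", "description")


def find_field_issues(fields: list[dict]) -> list[str]:
    """Return human-readable problems found in a list of field definitions."""
    issues = []
    for index, field in enumerate(fields):
        label = field.get("id") or f"<field {index}>"
        for key in REQUIRED_KEYS:
            if not field.get(key):
                issues.append(f"{label}: missing required key '{key}'")
        field_type = field.get("type")
        if field_type and field_type not in SUPPORTED_TYPES:
            issues.append(f"{label}: invalid field type '{field_type}'")
        if field_type == "enum":
            options = field.get("enum")
            well_formed = (
                isinstance(options, list)
                and len(options) > 0
                and all(isinstance(option, str) for option in options)
            )
            if not well_formed:
                issues.append(f"{label}: enum options must be a non-empty list of strings")
    return issues
```

An empty result means none of the locally checkable problems were found; the API may still reject the configuration for other reasons.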
Schema Definitions
Base Processor Schema
All processor configurations share these base properties:
Unless otherwise specified, these base properties default to the latest available version when the processor is created. See the Changelog for more details.
Extraction Processor Schema
Classification Processor Schema
Splitter Processor Schema
Updating Processor Configurations
You can update a processor’s configuration using the following endpoint:
Extraction Processor Configuration
Extraction processors are used to extract specific fields from documents. They support a wide range of field types and nested structures.
Basic Example
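A basic extraction configuration might look roughly like the sketch below, written as a Python dictionary and serialized to JSON. The key names (`type`, `fields`, `id`, `description`) and the `"EXTRACT"` value are hypothetical and chosen only for illustration; consult the schema definitions for the actual property names.

```python
import json

# Hypothetical configuration shape, for illustration only.
basic_config = {
    "type": "EXTRACT",  # assumed processor type label
    "fields": [
        {
            "id": "invoice_number",
            "type": "string",
            "description": "The unique identifier printed on the invoice.",
        },
        {
            "id": "invoice_date",
            "type": "date",
            "description": "The date the invoice was issued.",
        },
        {
            "id": "total_amount",
            "type": "currency",
            "description": "The total amount due, including tax.",
        },
    ],
}

# Serialize to the JSON body that would be sent in a request.
basic_config_json = json.dumps(basic_config, indent=2)
```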
All Field Types
Extraction processors support these field types:
- string: Text values
- number: Numeric values
- currency: Monetary values
- boolean: True/false values
- date: Date values
- array: Lists of values
- enum: Predefined, constrained text output options
- object: Nested structures
- signature: Signature information
Example with Nested Fields
Example with Nested Arrays and Objects
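To illustrate how object and array fields can nest, the sketch below defines a small nested field list and a helper that walks it to list every field path. The `properties` and `items` keys are assumptions borrowed from common JSON-schema-style layouts and may not match the actual configuration schema.

```python
# Hypothetical nested field definitions, for illustration only.
nested_fields = [
    {
        "id": "vendor",
        "type": "object",
        "description": "The company that issued the invoice.",
        "properties": [
            {"id": "name", "type": "string", "description": "Vendor name."},
            {"id": "address", "type": "string", "description": "Vendor address."},
        ],
    },
    {
        "id": "line_items",
        "type": "array",
        "description": "Individual charges listed on the invoice.",
        "items": {
            "type": "object",
            "properties": [
                {"id": "description", "type": "string", "description": "Item text."},
                {"id": "amount", "type": "currency", "description": "Item price."},
            ],
        },
    },
]


def iter_field_paths(fields, prefix=""):
    """Yield (path, type) pairs for every field, descending into objects and arrays."""
    for field in fields:
        path = f"{prefix}{field['id']}"
        yield path, field["type"]
        if field["type"] == "object":
            yield from iter_field_paths(field.get("properties", []), prefix=path + ".")
        elif field["type"] == "array":
            item = field.get("items", {})
            if item.get("type") == "object":
                yield from iter_field_paths(item.get("properties", []), prefix=path + "[].")
```

Listing the paths this way is a quick check that a nested configuration has the structure you intended before you submit it.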
Classification Processor Configuration
Classification processors categorize documents into predefined types.
Example
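A classification configuration could be sketched as a list of predefined document types, each with an identifier and a detailed description. The key names and the `"CLASSIFY"` value below are hypothetical; the helper simply guards against duplicate identifiers, which would make the categories ambiguous.

```python
from collections import Counter

# Hypothetical configuration shape, for illustration only.
classification_config = {
    "type": "CLASSIFY",  # assumed processor type label
    "classifications": [
        {"id": "invoice", "description": "A bill requesting payment for goods or services."},
        {"id": "receipt", "description": "Proof that a payment has already been made."},
        {"id": "other", "description": "Any document that fits no other category."},
    ],
}


def duplicate_ids(classifications):
    """Return the sorted list of classification IDs that appear more than once."""
    counts = Counter(item["id"] for item in classifications)
    return sorted(id_ for id_, count in counts.items() if count > 1)
```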
Splitter Processor Configuration
Splitter processors divide documents into logical sub-documents based on defined classifications.
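One way to model a splitter configuration in code is with small dataclasses that are converted to a request payload. This is only a sketch: the class names, the `splitClassifications` key, and the `"SPLITTER"` value are assumptions for illustration and are not taken from the actual schema.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SplitClassification:
    """One kind of sub-document the splitter should recognize (hypothetical shape)."""
    id: str
    description: str


@dataclass
class SplitterConfig:
    split_classifications: list[SplitClassification] = field(default_factory=list)

    def to_payload(self) -> dict:
        """Build a JSON-serializable dict, refusing an empty classification list."""
        if not self.split_classifications:
            raise ValueError("a splitter needs at least one classification")
        return {
            "type": "SPLITTER",  # assumed processor type label
            "splitClassifications": [asdict(c) for c in self.split_classifications],
        }
```

For example, `SplitterConfig([SplitClassification("invoice", "A single invoice.")]).to_payload()` produces a dictionary with one entry under `splitClassifications`.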