This endpoint creates a new processor or clones an existing one. Processors are typically created and configured in the Extend Studio, but you can use this endpoint to create them programmatically in order to sync IDs across systems.
Body
The name of the new processor.
The type of the processor.
The ID of an existing processor to clone. If provided, a new processor will be created that clones the config of this processor.
Optionally supply a config to be used when creating a processor. Any fields not supplied will use the listed defaults.
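As a rough illustration of how the body parameters above fit together, the sketch below assembles a request body in Python without sending it. The keys `name`, `type`, and `cloneProcessorId` match names that appear elsewhere on this page; the `config` key and the helper name `build_create_processor_body` are assumptions made for this example, not confirmed parts of the API.

```python
from typing import Any, Dict, Optional

# Processor types named on this page.
PROCESSOR_TYPES = {"EXTRACT", "CLASSIFY", "SPLITTER"}


def build_create_processor_body(
    name: str,
    processor_type: str,
    clone_processor_id: Optional[str] = None,
    config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a JSON-serializable body for the create-processor request.

    Hypothetical helper: it only builds a dict and never calls the API.
    """
    if processor_type not in PROCESSOR_TYPES:
        raise ValueError(f"unsupported processor type: {processor_type!r}")

    body: Dict[str, Any] = {"name": name, "type": processor_type}
    if clone_processor_id is not None:
        # Clone the config of an existing processor.
        body["cloneProcessorId"] = clone_processor_id
    if config is not None:
        # "config" is an assumed key name for the optional config object.
        body["config"] = config
    return body
```

Omitted optional values are left out of the body entirely, so the documented defaults would apply to anything not supplied.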
Your config will take one of three shapes depending on the processor type:
Configuration for an extraction processor.
Must be "EXTRACT" for extraction processors. Other supported values:
"CLASSIFY": For classification processors
"SPLITTER": For splitter processors
The base processor to use.
The version of the base processor to use.
Array of field definitions to extract from the document.
Unique identifier for the field.
Human-readable name for the field.
Type of the field. Supported values:
string: Text values
number: Numeric values
currency: Monetary values
boolean: True/false values
date: Date values
array: Lists of values (requires schema)
enum: Values from a predefined list (requires enum)
object: Nested structure (requires schema)
signature: Signature information
Detailed description of the field, including expected content and format.
Required when type is “array” or “object”. Contains nested field definitions.
Required when type is “enum”. List of allowed values.
Description of the enum value.
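The type-dependent requirements above (a schema for array and object fields, an enum list for enum fields) can be checked before sending a request. The following is a minimal sketch, assuming each field definition is a dict with `type`, `schema`, and `enum` keys and that `schema` holds a list of nested field definitions; the helper name `field_definition_problems` is hypothetical.

```python
from typing import Any, Dict, List

# Field types listed on this page.
FIELD_TYPES = {
    "string", "number", "currency", "boolean", "date",
    "array", "enum", "object", "signature",
}


def field_definition_problems(field: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in one field definition."""
    problems: List[str] = []
    field_type = field.get("type")

    if field_type not in FIELD_TYPES:
        problems.append(f"unsupported type: {field_type!r}")
    if field_type in ("array", "object") and not field.get("schema"):
        problems.append(f"type {field_type!r} requires a schema")
    if field_type == "enum" and not field.get("enum"):
        problems.append("type 'enum' requires an enum list")

    # Assumption: "schema" is a list of nested field definitions.
    for nested in field.get("schema") or []:
        problems.extend(field_definition_problems(nested))
    return problems
```

An empty list means the definition passed these local checks; the API may still apply further validation of its own.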
Custom rules to guide the extraction process in natural language.
Advanced configuration options.
Limit processing to a specific number of pages from the beginning of the document.
Provide a hint about the document type (e.g. “invoice”, “receipt”, etc.).
Define specific key terms or concepts relevant to the document type.
modelReasoningInsightsEnabled: Enable model reasoning insights in the extraction results.
advancedMultimodalEnabled: Enable advanced multimodal processing for better handling of visual elements.
Enable citation information for extracted fields.
advancedFigureParsingEnabled: Enable advanced parsing of figures and diagrams in the document.
Options for controlling document chunking.
Strategy for chunking the document. Supported values:
standard: Default chunking strategy
semantic: Content-aware chunking based on document structure
customSemanticChunkingRules: Custom rules for semantic chunking in natural language.
Number of pages per chunk.
Strategy for selecting chunks. Supported values:
intelligent: AI-based selection
confidence: Select based on confidence score
take_first: Always use first chunk
take_last: Always use last chunk
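Because the chunking and chunk-selection strategies each accept a fixed set of values, a small guard can catch typos before a request is made. This is only a sketch: the constant names and the `require_supported` helper are invented for illustration, and the value lists are copied from the descriptions above.

```python
from typing import Tuple

# Values copied from the chunking options described above.
CHUNKING_STRATEGIES: Tuple[str, ...] = ("standard", "semantic")
CHUNK_SELECTION_STRATEGIES: Tuple[str, ...] = (
    "intelligent", "confidence", "take_first", "take_last",
)


def require_supported(value: str, supported: Tuple[str, ...], label: str) -> str:
    """Return value unchanged if it is supported, otherwise raise ValueError."""
    if value not in supported:
        options = ", ".join(supported)
        raise ValueError(f"{label} must be one of {options}; got {value!r}")
    return value
```

For example, `require_supported("semantic", CHUNKING_STRATEGIES, "chunking strategy")` returns `"semantic"`, while an unsupported value raises a `ValueError` that lists the allowed options.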
Configuration for a classification processor.
Must be "CLASSIFY" for classification processors. Other supported values:
"EXTRACT": For extraction processors
"SPLITTER": For splitter processors
The base processor to use.
The version of the base processor to use.
Array of possible classifications for the document.
Unique identifier for the classification. We recommend lowercase, underscore-separated format.
Type identifier for the classification.
Detailed description of the classification.
Custom rules to guide the classification process in natural language.
Advanced configuration options.
Limit processing to a specific number of pages from the beginning of the document.
The context to use for classification. Supported values:
default
max
advancedMultimodalEnabled: Enable advanced multimodal processing for better handling of visual elements during classification.
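The recommended identifier format for classifications (lowercase, underscore-separated) can be derived from a human-readable label. The helper below is one possible way to do that; `to_classification_id` is a hypothetical name, and the exact normalization rules are an assumption rather than anything the API enforces.

```python
import re


def to_classification_id(label: str) -> str:
    """Convert a label to a lowercase, underscore-separated identifier."""
    # Replace every run of characters other than a-z and 0-9 with one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", label.lower())
    # Drop underscores left over at either end.
    return slug.strip("_")
```

With this sketch, a label such as "Bank Statement" becomes `bank_statement`.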
Configuration for a splitter processor.
Must be "SPLITTER" for splitter processors. Other supported values:
"EXTRACT": For extraction processors
"CLASSIFY": For classification processors
The base processor to use.
The version of the base processor to use.
Array of classifications that define the possible types of document sections.
Unique identifier for the split classification.
Type identifier for the split classification.
Detailed description of the document section type.
Custom rules to guide the document splitting process in natural language.
Advanced configuration options.
Limit processing to a specific number of pages from the beginning of the document.
Method to use for splitting. Supported values:
high_precision: More accurate but potentially slower
low_latency: Faster but potentially less precise
splitExcelDocumentsBySheetEnabled: For Excel documents, split by worksheet.
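The trade-off between the two split methods can be expressed as a small helper that picks one based on whether latency matters more than precision. In this sketch, `splitExcelDocumentsBySheetEnabled` is the name given above, while the `splitMethod` key and the function name are assumptions made for illustration only; check the actual schema before relying on them.

```python
from typing import Any, Dict


def splitter_advanced_options(latency_sensitive: bool, has_excel_inputs: bool) -> Dict[str, Any]:
    """Build a dict of advanced splitter options (hypothetical layout)."""
    # Prefer speed when latency matters, accuracy otherwise.
    method = "low_latency" if latency_sensitive else "high_precision"
    return {
        "splitMethod": method,  # assumed key name, not confirmed by this page
        "splitExcelDocumentsBySheetEnabled": has_excel_inputs,
    }
```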
Response
A true or false value indicating whether the processor was created successfully or not.
Error Responses
Will be false if the request failed.
A description of the error that occurred.
Common Errors
400 Bad Request: If the request body fails schema validation.
404 Not Found: If the processor to clone (cloneProcessorId) is not found.
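A client can turn the success flag and error description into an exception so that failures are not silently ignored. The sketch below works on an already-decoded response: `success` matches the response example on this page, whereas the `error` key, the `CreateProcessorError` class, and the `ensure_created` helper are assumptions introduced for this example.

```python
from typing import Any, Dict


class CreateProcessorError(Exception):
    """Raised when the create-processor request did not succeed."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def ensure_created(status: int, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return the payload on success, otherwise raise CreateProcessorError."""
    if payload.get("success") is True:
        return payload
    # "error" is an assumed key name for the error description.
    message = payload.get("error") or "unknown error"
    raise CreateProcessorError(status, str(message))
```

Callers can then branch on `status`, for example treating 404 as a missing clone source and 400 as a body that failed schema validation.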
{
  "success": true,
  "processor": {
    "object": "document_processor",
    "id": "processor_1234",
    "name": "New Invoice Processor",
    "type": "EXTRACT",
    "createdAt": "2024-03-01T12:00:00Z",
    "updatedAt": "2024-03-01T12:00:00Z"
  }
}
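Since one stated use of this endpoint is syncing IDs across systems, the sketch below reads the processor ID out of a response shaped like the example above and stores it under a caller-chosen key. The function name `record_processor_id` and the in-memory `id_map` are hypothetical; a real integration would likely persist the mapping somewhere durable.

```python
import json
from typing import Dict


def record_processor_id(response_text: str, local_key: str, id_map: Dict[str, str]) -> str:
    """Parse a create-processor response and remember its processor ID."""
    payload = json.loads(response_text)
    # Key path taken from the response example: processor -> id.
    processor_id = payload["processor"]["id"]
    id_map[local_key] = processor_id
    return processor_id
```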