Update Processor
Update an existing processor in Extend.
curl --request POST \
  --url https://api-prod.extend.app/v1/processors/:id \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "name": "<string>",
  "config": {
    "type": "<string>",
    "baseProcessor": "<string>",
    "baseVersion": "<string>",
    "schema": {},
    "fields": [
      {
        "id": "<string>",
        "name": "<string>",
        "type": "<string>",
        "description": "<string>",
        "schema": [
          {}
        ],
        "enum": [
          {
            "value": "<string>",
            "description": "<string>"
          }
        ]
      }
    ],
    "extractionRules": "<string>",
    "advancedOptions": {
      "fixedPageLimit": 123,
      "splitMethod": "<string>",
      "splitIdentifierRules": "<string>",
      "splitExcelDocumentsBySheetEnabled": true
    },
    "classifications": [
      {
        "id": "<string>",
        "type": "<string>",
        "description": "<string>"
      }
    ],
    "classificationRules": "<string>",
    "splitClassifications": [
      {
        "id": "<string>",
        "type": "<string>",
        "description": "<string>"
      }
    ],
    "splitRules": "<string>"
  }
}'
{
  "success": true,
  "processor": {
    "object": "document_processor",
    "id": "processor_1234",
    "name": "Updated Invoice Processor",
    "type": "EXTRACT",
    "createdAt": "2024-03-01T12:00:00Z",
    "updatedAt": "2024-03-01T13:00:00Z",
    "draftVersion": {
      "id": "dpv_4567",
      "version": "draft",
      "config": {
        "fields": [
          {
            "id": "field_1234",
            "name": "invoice_number",
            "description": "The invoice number",
            "type": "string"
          }
        ]
      }
    }
  }
}
This endpoint allows you to update the properties of an existing processor.
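For readers calling the endpoint from Python rather than curl, the request can be sketched with the standard library alone. The URL and headers mirror the curl example; the helper name build_update_request, the token placeholder, and the choice to send only name in the body are assumptions made for illustration, and this page does not say whether a name-only body is accepted.

```python
import json
import urllib.request

# Base URL taken from the curl example on this page.
API_BASE = "https://api-prod.extend.app/v1"


def build_update_request(processor_id: str, token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that updates a processor."""
    return urllib.request.Request(
        url=f"{API_BASE}/processors/{processor_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_update_request(
    "processor_1234",
    "YOUR_API_TOKEN",
    {"name": "Updated Invoice Processor"},
)
print(request.get_method(), request.full_url)

# To actually send it:
#   with urllib.request.urlopen(request) as response:
#       payload = json.load(response)
```

Building the request separately from sending it keeps the sketch easy to inspect and test without network access.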
Path Parameters
id: The ID of the processor to update.
Body
name: The new name for the processor.
config: The new config to update the processor with.
For extraction processors, config accepts the following properties:
type: Must be "EXTRACT" for extraction processors.
baseProcessor: The base processor to use. For extractors, this is either "extraction_performance" or "extraction_light". See the base processor documentation for more details.
baseVersion: The version of the base processor to use (e.g. "4.0.0"). If this is provided, baseProcessor must be provided as well. See the processor changelog for available versions.
schema: The schema that defines the structure of data to extract from documents. One of schema or fields must be provided. We recommend using schema, as fields is deprecated. See the extraction processor schema documentation for more details.
fields: The deprecated alternative to schema for defining the structure of data to extract from documents. One of schema or fields must be provided. See the extraction processor schema documentation for more details on using the fields shape. Each field has the following properties:
  id: Unique identifier for the field.
  name: Human-readable name for the field.
  type: Type of the field. Supported values:
    string: Text values
    number: Numeric values
    currency: Monetary values
    boolean: True/false values
    date: Date values
    array: Lists of values (requires schema)
    enum: Values from a predefined list (requires enum)
    object: Nested structure (requires schema)
    signature: Signature information
  description: Detailed description of the field, including expected content and format.
  schema: Required when type is "array" or "object". Contains nested field definitions.
extractionRules: Custom rules to guide the extraction process in natural language.
advancedOptions: Advanced configuration options. The available options:
  Limit processing to a specific number of pages from the beginning of the document.
  Provide a hint about the document type (e.g. "invoice", "receipt", etc.).
  Define specific key terms or concepts relevant to the document type.
  Enable model reasoning insights in the extraction results.
  Enable advanced multimodal processing for better handling of visual elements.
  Enable citation information for extracted fields.
  Enable advanced parsing of figures and diagrams in the document.
  Options for controlling document chunking:
    Strategy for chunking the document. Supported values:
      standard: Default chunking strategy
      semantic: Content-aware chunking based on document structure
    Custom rules for semantic chunking in natural language.
    Number of pages per chunk.
    Strategy for selecting chunks. Supported values:
      intelligent: AI-based selection
      confidence: Select based on confidence score
      take_first: Always use first chunk
      take_last: Always use last chunk
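The constraints described for extraction configs (the fixed type value, the two base processors, baseVersion depending on baseProcessor, schema or fields being required, and the per-type requirements on fields) can be checked locally before sending a request. The following is a minimal sketch of such a check; the function names and messages are hypothetical, and the server may enforce additional rules that this page does not list.

```python
# Values copied from the parameter descriptions above.
FIELD_TYPES = {
    "string", "number", "currency", "boolean", "date",
    "array", "enum", "object", "signature",
}
NEEDS_SCHEMA = {"array", "object"}
EXTRACT_BASE_PROCESSORS = {"extraction_performance", "extraction_light"}


def check_field(field: dict) -> list:
    """Return a list of problems found in one field definition."""
    problems = []
    name = field.get("name", "<unnamed>")
    field_type = field.get("type")
    if field_type not in FIELD_TYPES:
        problems.append(f"{name}: unsupported type {field_type!r}")
    if field_type in NEEDS_SCHEMA and not field.get("schema"):
        problems.append(f"{name}: type {field_type!r} requires schema")
    if field_type == "enum" and not field.get("enum"):
        problems.append(f"{name}: type 'enum' requires enum")
    # Nested field definitions are checked with the same rules.
    for nested in field.get("schema") or []:
        problems.extend(check_field(nested))
    return problems


def check_extract_config(config: dict) -> list:
    """Return a list of problems found in an extraction config."""
    problems = []
    if config.get("type") != "EXTRACT":
        problems.append('type must be "EXTRACT"')
    base = config.get("baseProcessor")
    if base is not None and base not in EXTRACT_BASE_PROCESSORS:
        problems.append(f"unknown baseProcessor {base!r}")
    if "baseVersion" in config and base is None:
        problems.append("baseVersion requires baseProcessor")
    if "schema" not in config and "fields" not in config:
        problems.append("one of schema or fields must be provided")
    for field in config.get("fields") or []:
        problems.extend(check_field(field))
    return problems


# Hypothetical config: the second field is an array without a nested schema.
config = {
    "type": "EXTRACT",
    "baseProcessor": "extraction_performance",
    "baseVersion": "4.0.0",
    "fields": [
        {"id": "field_1234", "name": "invoice_number", "type": "string",
         "description": "The invoice number"},
        {"id": "field_5678", "name": "line_items", "type": "array",
         "description": "Line items on the invoice"},
    ],
}
problems = check_extract_config(config)
print(problems)
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a config at once.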
For classification processors, config accepts the following properties:
type: Must be "CLASSIFY" for classification processors.
baseProcessor: The base processor to use. For classifiers, must be "classification_performance" or "classification_light". See the base processor documentation for more details.
baseVersion: The version of the base processor to use (e.g. "3.2.0"). If this is provided, baseProcessor must be provided as well. See the processor changelog for available versions.
classificationRules: Custom rules to guide the classification process in natural language.
advancedOptions: Advanced configuration options. The available options:
  Limit processing to a specific number of pages from the beginning of the document.
  The context to use for classification. Supported values:
    default
    max
  Enable advanced multimodal processing for better handling of visual elements during classification.
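A classification config can be assembled with a small builder that enforces the rules stated above. This is a sketch only: the function classifier_config and its arguments are hypothetical, the key names classificationRules and fixedPageLimit are taken from the curl example at the top of the page, and it is an assumption that fixedPageLimit is the page-limit option for classifiers.

```python
CLASSIFY_BASE_PROCESSORS = ("classification_performance", "classification_light")


def classifier_config(rules, base_processor=None, base_version=None, fixed_page_limit=None):
    """Build a config dict for a classification processor."""
    if base_version is not None and base_processor is None:
        raise ValueError("baseVersion requires baseProcessor")
    if base_processor is not None and base_processor not in CLASSIFY_BASE_PROCESSORS:
        raise ValueError(f"baseProcessor must be one of {CLASSIFY_BASE_PROCESSORS}")

    config = {"type": "CLASSIFY", "classificationRules": rules}
    # Optional keys are only included when the caller supplies them.
    if base_processor is not None:
        config["baseProcessor"] = base_processor
    if base_version is not None:
        config["baseVersion"] = base_version
    if fixed_page_limit is not None:
        # Assumed to be the page-limit option described above.
        config["advancedOptions"] = {"fixedPageLimit": fixed_page_limit}
    return config


# Hypothetical rules text and version for illustration.
classify_config = classifier_config(
    "Treat any document with a remittance slip as an invoice.",
    base_processor="classification_light",
    base_version="3.2.0",
    fixed_page_limit=5,
)
print(classify_config)
```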
For splitter processors, config accepts the following properties:
type: Must be "SPLITTER" for splitter processors.
baseProcessor: The base processor to use. For splitters, this can currently only be "splitter_performance". See the base processor documentation for more details.
baseVersion: The version of the base processor to use (e.g. "1.0.0"). If this is provided, baseProcessor must be provided as well. See the processor changelog for available versions.
splitRules: Custom rules to guide the document splitting process in natural language.
advancedOptions: Advanced configuration options:
  fixedPageLimit: Limit processing to a specific number of pages from the beginning of the document.
  splitMethod: Method to use for splitting. Supported values:
    high_precision: More accurate but potentially slower
    low_latency: Faster but potentially less precise
  splitExcelDocumentsBySheetEnabled: For Excel documents, split by worksheet.
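The splitter options can be grouped in a small settings object that produces the config payload. This is an illustrative sketch: the SplitterSettings class is hypothetical, the key names come from the curl example at the top of the page, and the False default for splitting Excel documents by sheet is a choice made for this sketch, not a documented API default.

```python
from dataclasses import dataclass
from typing import Optional

# Values copied from the splitMethod description above.
SPLIT_METHODS = ("high_precision", "low_latency")


@dataclass
class SplitterSettings:
    split_rules: str
    split_method: str
    split_excel_by_sheet: bool = False  # sketch default, not an API default
    fixed_page_limit: Optional[int] = None

    def to_config(self) -> dict:
        """Convert the settings into a splitter config dict."""
        if self.split_method not in SPLIT_METHODS:
            raise ValueError(f"splitMethod must be one of {SPLIT_METHODS}")
        advanced = {
            "splitMethod": self.split_method,
            "splitExcelDocumentsBySheetEnabled": self.split_excel_by_sheet,
        }
        if self.fixed_page_limit is not None:
            advanced["fixedPageLimit"] = self.fixed_page_limit
        return {
            "type": "SPLITTER",
            # The only base processor currently listed for splitters.
            "baseProcessor": "splitter_performance",
            "splitRules": self.split_rules,
            "advancedOptions": advanced,
        }


# Hypothetical rules text for illustration.
splitter_config = SplitterSettings(
    split_rules="Start a new document at each invoice cover page.",
    split_method="low_latency",
    split_excel_by_sheet=True,
).to_config()
print(splitter_config)
```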
Response
success: A true or false value indicating whether the processor was updated successfully.
processor: A DocumentProcessor object representing the updated processor. See the DocumentProcessor object for more details.
Error Responses
success: Will be false if the request failed.
A description of the error that occurred.
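A caller will typically branch on the success flag before touching the processor object. The sketch below parses a response body shaped like the example on this page; the UpdateFailed exception and parse_update_response helper are hypothetical, and the "error" key used to read the error description is an assumption, since this page does not name that field.

```python
import json


class UpdateFailed(Exception):
    """Raised when the response reports success as false."""


def parse_update_response(raw: str) -> dict:
    """Return the processor object, or raise if the update failed."""
    payload = json.loads(raw)
    if not payload.get("success"):
        # "error" is an assumed key name for the error description.
        raise UpdateFailed(payload.get("error", "processor update failed"))
    return payload["processor"]


# Shortened copy of the example response shown on this page.
RAW_RESPONSE = """
{
  "success": true,
  "processor": {
    "object": "document_processor",
    "id": "processor_1234",
    "name": "Updated Invoice Processor",
    "type": "EXTRACT",
    "draftVersion": {
      "id": "dpv_4567",
      "version": "draft",
      "config": {
        "fields": [
          {"id": "field_1234", "name": "invoice_number", "type": "string"}
        ]
      }
    }
  }
}
"""

processor = parse_update_response(RAW_RESPONSE)
field_names = [f["name"] for f in processor["draftVersion"]["config"]["fields"]]
print(processor["id"], field_names)
```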