The BatchProcessorRun object
{
"object": "batch_processor_run",
"id": "bpr_1234",
"processorId": "dp_5678",
"processorVersionId": "dpv_91011",
"processorName": "Invoice Extractor",
"metrics": {
"numFiles": 10,
"numPages": 25,
"meanRunTimeMs": 1500,
"type": "EXTRACT",
"fieldMetrics": {
"invoice_number": {
"meanConfidence": 0.95,
"recallPerc": 98.5,
"precisionPerc": 99.2
},
"invoice_date": {
"meanConfidence": 0.92,
"recallPerc": 95.1,
"precisionPerc": 97.3
}
}
},
"status": "PROCESSED",
"source": "STUDIO",
"sourceId": "dp_5678",
"runCount": 1,
"options": {
"fuzzyMatchFields": ["invoice_number"],
"excludeFields": ["internal_notes"],
"clearPreProcessingCache": false
},
"createdAt": "2023-05-15T10:30:45Z",
"updatedAt": "2023-05-15T10:35:22Z"
}
The BatchProcessorRun object is returned by the Get Batch Processor Run endpoint.
The object represents a run of a processor over a batch of files and contains all the information about the run, including metrics, the processor that was run, and the status of the run.
object: The type of response. Always "batch_processor_run".
id: The unique identifier for this batch processor run.
processorId: The ID of the processor used for this run.
processorVersionId: The ID of the specific processor version used.
processorName: The name of the processor.
metrics: The metrics for the batch processor run.
numFiles: The total number of files that were processed.
numPages: The total number of pages that were processed.
type: The type of batch processor run. Possible values are EXTRACT, CLASSIFY, and SPLITTER.
The sections below show the fields in this object that are present for each type of run.
fieldMetrics: Record mapping field names to their respective metrics.
meanConfidence: The mean confidence score for this field across all documents.
recallPerc: The recall percentage for this field, representing how many of the expected values were correctly extracted.
precisionPerc: The precision percentage for this field, representing how many of the extracted values were correct.
For nested object fields, this contains metrics for the child fields. Has the same structure as the parent fieldMetrics.
Maps the root array field name to a number indicating how many times the array field has the correct number of rows extracted.
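As a rough illustration of how the extraction metrics might be consumed, the sketch below scans fieldMetrics and lists the top-level fields whose recall or precision falls under a threshold. The function name fields_below and the threshold values are hypothetical; only the key names fieldMetrics, recallPerc, and precisionPerc come from the sample object. Nested child metrics are not traversed, because the key that holds them is not shown here.

```python
from typing import Any


def fields_below(metrics: dict[str, Any], min_recall: float, min_precision: float) -> list[str]:
    """Return top-level field names whose recall or precision is under the given threshold."""
    flagged = []
    for name, field in metrics.get("fieldMetrics", {}).items():
        recall = field.get("recallPerc")
        precision = field.get("precisionPerc")
        low_recall = recall is not None and recall < min_recall
        low_precision = precision is not None and precision < min_precision
        if low_recall or low_precision:
            flagged.append(name)
    return flagged


# The "metrics" value from the sample BatchProcessorRun object.
sample_metrics = {
    "numFiles": 10,
    "numPages": 25,
    "meanRunTimeMs": 1500,
    "type": "EXTRACT",
    "fieldMetrics": {
        "invoice_number": {"meanConfidence": 0.95, "recallPerc": 98.5, "precisionPerc": 99.2},
        "invoice_date": {"meanConfidence": 0.92, "recallPerc": 95.1, "precisionPerc": 97.3},
    },
}

print(fields_below(sample_metrics, min_recall=96.0, min_precision=98.0))  # prints ['invoice_date']
```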
The overall accuracy percentage.
The mean confidence score.
Record mapping classification values to their counts.
Mapping from classification to accuracy percentage as calculated from the confusion matrix.
Mapping from actual class to predicted class to count. Only present when accuracy percentage is present.
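To make the relationship between the confusion matrix and the per-class accuracy concrete, here is a minimal sketch. It assumes that the accuracy for a class is the share of documents of that actual class that were predicted as that class; the API may define it differently. The function name and the matrix values are made up for illustration and are not real output.

```python
def per_class_accuracy(confusion: dict[str, dict[str, int]]) -> dict[str, float]:
    """Derive a per-class accuracy percentage from an actual -> predicted -> count mapping."""
    result = {}
    for actual, predicted_counts in confusion.items():
        total = sum(predicted_counts.values())
        if total == 0:
            continue  # no documents of this class, so accuracy is undefined
        correct = predicted_counts.get(actual, 0)
        result[actual] = 100.0 * correct / total
    return result


# Illustrative matrix: 10 actual invoices (9 predicted correctly), 8 actual receipts (6 correct).
confusion = {
    "invoice": {"invoice": 9, "receipt": 1},
    "receipt": {"invoice": 2, "receipt": 6},
}

print(per_class_accuracy(confusion))  # prints {'invoice': 90.0, 'receipt': 75.0}
```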
Number of predicted subdocuments that are in the expected set of subdocuments divided by total number of predicted subdocuments.
Number of expected subdocuments that are in the predicted set of subdocuments divided by total number of expected subdocuments.
The number of expected documents.
The number of predicted documents.
The number of correctly predicted documents.
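The splitter precision and recall described above can be reproduced from the three counts. The sketch below assumes that the number of correctly predicted documents is the numerator for both ratios, and it returns fractions between 0 and 1; whether the API reports fractions or percentages is not stated here. The function and parameter names are hypothetical.

```python
def splitter_precision_recall(num_expected: int, num_predicted: int, num_correct: int) -> tuple[float, float]:
    """Compute (precision, recall) as fractions from splitter document counts."""
    # Precision: correct predictions out of everything that was predicted.
    precision = num_correct / num_predicted if num_predicted else 0.0
    # Recall: correct predictions out of everything that was expected.
    recall = num_correct / num_expected if num_expected else 0.0
    return precision, recall


# Illustrative counts: 8 expected subdocuments, 10 predicted, 6 of them correct.
print(splitter_precision_recall(num_expected=8, num_predicted=10, num_correct=6))  # prints (0.6, 0.75)
```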
meanRunTimeMs: The mean runtime in milliseconds per document.
status: The current status of the batch processor run. Possible values are PENDING, PROCESSING, PROCESSED, and FAILED.
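A client will typically re-fetch the run until its status stops changing. The sketch below assumes that PROCESSED and FAILED are terminal states, which this page does not state explicitly. The fetch_run callable is a hypothetical placeholder for whatever code calls the Get Batch Processor Run endpoint; no real client library or URL is implied.

```python
import time
from typing import Any, Callable

# Assumption: a run never leaves these states once it reaches them.
TERMINAL_STATUSES = {"PROCESSED", "FAILED"}


def wait_for_run(
    fetch_run: Callable[[], dict[str, Any]],
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call fetch_run until the returned object has a terminal status, then return it."""
    for attempt in range(max_attempts):
        run = fetch_run()
        if run.get("status") in TERMINAL_STATUSES:
            return run
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"run did not finish after {max_attempts} attempts")


# Usage with canned responses instead of a real HTTP call.
canned = iter([
    {"id": "bpr_1234", "status": "PENDING"},
    {"id": "bpr_1234", "status": "PROCESSING"},
    {"id": "bpr_1234", "status": "PROCESSED"},
])
final = wait_for_run(lambda: next(canned), interval_s=0, sleep=lambda _seconds: None)
print(final["status"])  # prints PROCESSED
```

Injecting fetch_run and sleep keeps the helper independent of any HTTP library and makes it easy to exercise without waiting in real time.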
source: The source of the batch processor run.
The batch processor run was made from an evaluation set. In this case, the sourceId will be the ID of the evaluation set, such as ev_1234.
The batch processor run was made from the playground. The sourceId will not be set for this value.
The batch processor run was made for a processor in Studio. The sourceId will be the ID of the processor, such as dp_1234.
sourceId: The ID of the source of the batch processor run. See the source field for more details.
runCount: The number of runs that were made.
createdAt: The date and time the batch processor run was created.
updatedAt: The date and time the batch processor run was last updated.
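The two timestamps in the sample object are ISO 8601 strings with a trailing Z. As a small sketch, the helpers below parse them and compute the time between creation and last update. The helper names are hypothetical, and the result only approximates the total processing time once the run has reached a final status, since updatedAt records the last update rather than a completion time.

```python
from datetime import datetime, timedelta


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2023-05-15T10:30:45Z."""
    # Replace the trailing Z so that fromisoformat also works on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def elapsed_between_updates(run: dict) -> timedelta:
    """Return the time between createdAt and updatedAt for a run object."""
    return parse_timestamp(run["updatedAt"]) - parse_timestamp(run["createdAt"])


# Timestamps taken from the sample BatchProcessorRun object.
sample_run = {
    "createdAt": "2023-05-15T10:30:45Z",
    "updatedAt": "2023-05-15T10:35:22Z",
}

print(elapsed_between_updates(sample_run).total_seconds())  # prints 277.0
```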