Overview

A Block represents a distinct content element within a document, such as a paragraph of text, a heading, a table, or a figure. Blocks are the fundamental units that make up chunks in parsed documents.

Block Object Structure

object
string
The type of object. Always "block".
id
string
A unique identifier for the block, deterministically generated as a hash of the block content.
type
string
The type of block. Possible values include:
  • text: Regular text content
  • heading: Section or document headings
  • section_heading: Subsection headings
  • table: Tabular data with rows and columns
  • figure: Images, charts, or diagrams
content
string
The textual content of the block, formatted according to the target format specified in the parse request.
details
object
Additional details specific to the block type. The structure varies depending on the block type.
metadata
object
Metadata about the block, such as the page on which it appears (pageNumber).
polygon
array
An array of points defining the polygon that bounds the block on the page. Each point is an object with x and y coordinates.
boundingBox
object
A simplified rectangular bounding box for the block, derived from the polygon and expressed as top, left, width, and height.
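
The relationship between polygon and boundingBox can be sketched in a few lines of Python. This is only an illustration: it assumes the box is the smallest axis-aligned rectangle that encloses every polygon point, with the origin at the top-left of the page. The function name bounding_box_from_polygon is hypothetical and not part of the API.

```python
from typing import Dict, List


def bounding_box_from_polygon(polygon: List[Dict[str, float]]) -> Dict[str, float]:
    """Return the axis-aligned rectangle enclosing all polygon points.

    Assumption (not confirmed by the API docs): boundingBox is the minimum
    enclosing rectangle, and y grows downward from the top of the page.
    """
    if not polygon:
        raise ValueError("polygon must contain at least one point")

    xs = [point["x"] for point in polygon]
    ys = [point["y"] for point in polygon]
    left, top = min(xs), min(ys)

    return {
        "top": top,
        "left": left,
        "width": max(xs) - left,
        "height": max(ys) - top,
    }
```

Under these assumptions, a polygon with corners (100, 50), (500, 50), (500, 80), and (100, 80) gives top 50, left 100, width 400, and height 30.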

Block Type Examples

{
  "object": "block",
  "id": "block_1a2b3c4d5e",
  "type": "text",
  "content": "This is a paragraph of text content that appears in the document.",
  "details": {},
  "metadata": {
    "pageNumber": 1
  },
  "polygon": [
    {"x": 100, "y": 50},
    {"x": 500, "y": 50},
    {"x": 500, "y": 80},
    {"x": 100, "y": 80}
  ],
  "boundingBox": {
    "top": 50,
    "left": 100,
    "width": 400,
    "height": 30
  }
}
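
A client consuming this structure might load a block into a typed container before working with it. The sketch below is one possible approach using only the Python standard library. The Block dataclass and parse_block function are hypothetical names introduced for illustration, and the choice to treat every field except object, id, type, and content as optional is an assumption rather than documented behavior.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Block:
    """Hypothetical client-side container mirroring the documented fields."""

    id: str
    type: str
    content: str
    details: Dict[str, Any] = field(default_factory=dict)
    metadata: Dict[str, Any] = field(default_factory=dict)
    polygon: List[Dict[str, float]] = field(default_factory=list)
    bounding_box: Dict[str, float] = field(default_factory=dict)


def parse_block(raw: str) -> Block:
    """Parse a JSON string into a Block, checking the "object" discriminator."""
    data = json.loads(raw)
    if data.get("object") != "block":
        raise ValueError(f"expected object 'block', got {data.get('object')!r}")

    return Block(
        id=data["id"],
        type=data["type"],
        content=data["content"],
        # Assumption: these fields may be absent, so fall back to empty values.
        details=data.get("details", {}),
        metadata=data.get("metadata", {}),
        polygon=data.get("polygon", []),
        bounding_box=data.get("boundingBox", {}),
    )
```

The block type is kept as a plain string rather than a closed enumeration because the list of types above is given as "possible values include", which suggests other values may appear.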