{
  "object": "file",
  "id": "file_1234",
  "name": "example_file",
  "type": "PDF",
  "presignedUrl": "https://s3.example.com/file_1234.pdf",
  "parentFileId": "file_5678", // Optional, only set if this file is a derivative of another file
  "contents": {
    "rawText": "This is the raw text content of the file...",
    "pages": [
      {
        "pageNumber": 1,
        "markdown": "This is the markdown content of the page..."
      }
    ]
  },
  "metadata": {
    "parentSplit": { // Optional, only set if this file is a derivative of another file
      "id": "324kjlfsd",
      "type": "addendum",
      "identifier": "addendum_1",
      "startPage": 7,
      "endPage": 9
    }
  },
  "createdAt": "2024-01-01T00:00:00Z",
  "updatedAt": "2024-01-01T00:00:00Z"
}

The File object represents a file in Extend. A File is created for each workflow run, and Files can also be created directly via the API for use in evaluation sets.
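As a rough illustration of how a client might read this object, the sketch below parses a payload shaped like the example above and pulls out the per-page markdown and the optional parentSplit page range. The helper names (page_markdown, parent_split_range) and the SAMPLE constant are hypothetical and not part of any Extend SDK; the inline annotations from the example are dropped so the sample is valid JSON.

```python
import json
from typing import Any, Optional

# Hypothetical sample mirroring the example File object (annotations removed).
SAMPLE = """
{
  "object": "file",
  "id": "file_1234",
  "name": "example_file",
  "type": "PDF",
  "presignedUrl": "https://s3.example.com/file_1234.pdf",
  "parentFileId": "file_5678",
  "contents": {
    "rawText": "This is the raw text content of the file...",
    "pages": [
      {"pageNumber": 1, "markdown": "This is the markdown content of the page..."}
    ]
  },
  "metadata": {
    "parentSplit": {
      "id": "324kjlfsd",
      "type": "addendum",
      "identifier": "addendum_1",
      "startPage": 7,
      "endPage": 9
    }
  },
  "createdAt": "2024-01-01T00:00:00Z",
  "updatedAt": "2024-01-01T00:00:00Z"
}
"""


def page_markdown(file_obj: dict[str, Any]) -> dict[int, str]:
    """Map each page number to its markdown, reading only from contents.pages."""
    pages = file_obj.get("contents", {}).get("pages", [])
    return {page["pageNumber"]: page.get("markdown", "") for page in pages}


def parent_split_range(file_obj: dict[str, Any]) -> Optional[tuple[int, int]]:
    """Return (startPage, endPage) if the file is a derivative, else None."""
    split = file_obj.get("metadata", {}).get("parentSplit")
    if split is None:
        return None
    return (split["startPage"], split["endPage"])


file_obj = json.loads(SAMPLE)
print(page_markdown(file_obj))
print(parent_split_range(file_obj))
```

Both helpers treat contents, pages, metadata, and parentSplit as possibly absent, since parentSplit is documented as optional and the sketch makes no assumption that every file type populates pages.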

Note: Some deprecated fields are still included in the payload for backwards compatibility:

  • markdown/rawText on IMG files, where these fields are not nested under the pages array. They will remain in payloads until full deprecation in December 2024.