{
  "object": "file",
  "id": "file_1234",
  "name": "example_file",
  "type": "PDF",
  "presignedUrl": "https://s3.example.com/file_1234.pdf",
  "parentFileId": "file_5678", // Optional, only set if this file is a derivative of another file
  "contents": {
    "rawText": "This is the raw text content of the file...",
    "pages": [
      {
        "pageNumber": 1,
        "markdown": "This is the markdown content of the page..."
      }
    ]
  },
  "metadata": {
    "parentSplit": { // Optional, only set if this file is a derivative of another file
      "id": "324kjlfsd",
      "type": "addendum",
      "identifier": "addendum_1",
      "startPage": 7,
      "endPage": 9
    }
  },
  "createdAt": "2024-01-01T00:00:00Z",
  "updatedAt": "2024-01-01T00:00:00Z"
}
File Endpoints
Get File
Fetch a file by its ID to obtain additional details and the raw file content.
If set to true, the markdown content of the file will be included in the response. This is useful for indexing very clean content into RAG pipelines for files like PDFs, Word documents, etc. Only available for files with a type of PDF or IMG, or for .doc/.docx files that were auto-converted to PDFs.
If set to true, the raw text content of the file will be included in the response. This is useful for indexing text-based files like PDFs, Word Documents, etc.
If set to true, the HTML content of the file will be included in the response. This is useful for indexing HTML content into RAG pipelines for Word documents. Only available for files with a type of DOCX.
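As a rough illustration of how the optional content flags above might be passed, here is a minimal sketch that builds a Get File request URL. The base URL, the `/files/{id}` path, and the query parameter names (`markdown`, `rawText`, `html`) are assumptions made for this example, since this section does not state them; check the actual endpoint definition before relying on any of them.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the real host and path are not given in this section.
BASE_URL = "https://api.example.com"


def build_get_file_url(file_id, *, markdown=False, raw_text=False, html=False):
    """Build a Get File URL, adding only the content flags that are enabled."""
    # Assumed query parameter names, modeled on the keys seen in the File object.
    flags = {"markdown": markdown, "rawText": raw_text, "html": html}
    query = urlencode({name: "true" for name, enabled in flags.items() if enabled})
    url = f"{BASE_URL}/files/{file_id}"
    return f"{url}?{query}" if query else url


print(build_get_file_url("file_1234", markdown=True, raw_text=True))
```

Leaving a flag off keeps the corresponding content out of the request, which is presumably the cheaper default when only the file details are needed.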
A File object representing the fetched file. See the File object for more details.
{
  "object": "file",
  "id": "file_1234",
  "name": "example_file",
  "type": "PDF",
  "presignedUrl": "https://s3.example.com/file_1234.pdf",
  "parentFileId": "file_5678", // Optional, only set if this file is a derivative of another file
  "contents": {
    "rawText": "This is the raw text content of the file...",
    "pages": [
      {
        "pageNumber": 1,
        "markdown": "This is the markdown content of the page..."
      }
    ]
  },
  "metadata": {
    "parentSplit": { // Optional, only set if this file is a derivative of another file
      "id": "324kjlfsd",
      "type": "addendum",
      "identifier": "addendum_1",
      "startPage": 7,
      "endPage": 9
    }
  },
  "createdAt": "2024-01-01T00:00:00Z",
  "updatedAt": "2024-01-01T00:00:00Z"
}
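To show how a client might read such a response, the sketch below pulls the per-page markdown and the optional parent split range out of an already-parsed File object. The helper names `page_markdown` and `parent_page_range` are hypothetical, and the code assumes the response has been decoded into a Python dict with the `//` comments removed (they are not valid JSON).

```python
def page_markdown(file_obj):
    """Map pageNumber to markdown for whatever pages the response includes."""
    pages = file_obj.get("contents", {}).get("pages", [])
    return {page["pageNumber"]: page.get("markdown", "") for page in pages}


def parent_page_range(file_obj):
    """Return (startPage, endPage) for a derivative file, or None otherwise."""
    split = file_obj.get("metadata", {}).get("parentSplit")
    if split is None:
        return None
    return split["startPage"], split["endPage"]


# Trimmed copy of the example response, as a plain dict.
sample = {
    "object": "file",
    "id": "file_1234",
    "parentFileId": "file_5678",
    "contents": {
        "pages": [
            {"pageNumber": 1, "markdown": "This is the markdown content of the page..."}
        ]
    },
    "metadata": {"parentSplit": {"startPage": 7, "endPage": 9}},
}

print(page_markdown(sample))
print(parent_page_range(sample))
```

Both helpers use `.get` with defaults because `contents` is only populated when the matching flags are set, and `parentSplit` is only present for files derived from another file.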