Ingest API
The Ingest Service ingests documents into the Docbyte Vault. It provides endpoints for ingesting documents and their metadata and for updating existing records.
1. Ingest Document
Description
Creates or updates a SIP in the Archive using the provided document(s) and metadata. Can be used for new packages or as a delta SIP for existing AIPs.
Request
PUT /document
Headers
| Name | Description | Type | Required |
|---|---|---|---|
| | Bearer token (e.g., | | Yes |
| | Optional list to specify what information should be part of the result. If no value is provided, only the basic information of a package is returned. | | No |
Body Parameters
| Name | Description | Type | Required |
|---|---|---|---|
| profile | Name of the profile used to store the document in the archive. The retention policy is applied based on the profile. | | Yes |
| | Allows you to provide a custom package ID for the document. Only alphanumeric characters are allowed. If not provided, a new UUID is automatically generated. | | No |
| document | Single file to ingest (Content) | | No |
| | Multiple files to ingest (array of Content). At least one of | | No |
| metadatas | Array of metadata objects (MetadataObject) | | Yes |
Example -
{
"profile": "string",
"metadatas": [
{
"schema": "string",
"metadata": {
"employeeId": "string",
"title": "Contract",
"documentDate": "2025-02-15T00:00:00Z",
"department": "HR",
"category": "Manual",
"expiryDate": "2027-02-15T00:00:00Z"
}
}
],
"document": {
"type": "S3",
"name": "contract.pdf",
"content": "b31b7379-2864-4b74-8475-0da8f7b1148b"
}
}
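A minimal Python sketch of how the request above might be assembled. The base URL, the `Authorization` header name, and the helper names are assumptions for illustration; this section does not state them. The request is only built here, not sent.

```python
import json
import re
import urllib.request

# Hypothetical base URL; replace with the address of your Ingest Service.
BASE_URL = "https://vault.example.com/ingest"


def is_valid_package_id(package_id: str) -> bool:
    """Check a custom package ID before sending it.

    Assumption: "alphanumeric" means ASCII letters and digits only.
    """
    return re.fullmatch(r"[A-Za-z0-9]+", package_id) is not None


def build_ingest_request(token, profile, metadatas, document):
    """Build (but do not send) a PUT /document request."""
    body = {"profile": profile, "metadatas": metadatas, "document": document}
    return urllib.request.Request(
        url=f"{BASE_URL}/document",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            # Assumption: the bearer token travels in an Authorization header.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_ingest_request(
    token="<token>",
    profile="string",
    metadatas=[{"schema": "string", "metadata": {"title": "Contract", "department": "HR"}}],
    document={
        "type": "S3",
        "name": "contract.pdf",
        "content": "b31b7379-2864-4b74-8475-0da8f7b1148b",
    },
)
```

Passing `request` to `urllib.request.urlopen` would send it; a 200 response only means the package was accepted, since ingest itself runs asynchronously.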
Response
200
Returns an IngestResponse with basic SIP details (status, submission date, etc.).
default
Returns an ErrorResponse with an error code and message.
2. Update Document
Description
Updates the descriptive metadata associated with an existing package in the archive.
The ingest service creates a delta SIP with the new metadata and ingests it into the archive. The ingest process is asynchronous.
Request
POST /document/{packageId}
Headers
| Name | Description | Type | Required |
|---|---|---|---|
| | Bearer token (e.g., | | Yes |
| | Optional list to specify what information should be part of the result. If no value is provided, only the basic information of a package is returned. | | No |
Path Parameters
| Name | Description | Type | Required |
|---|---|---|---|
| packageId | The unique identifier of the AIP (Archival Information Package) that you want to update. | | Yes |
Body Parameters
| Name | Description | Type | Required |
|---|---|---|---|
| packageId | Same value as the path parameter | | Yes |
| metadatas | Array of metadata objects (MetadataObject) | | Yes |
Example -
{
"packageId": "<string>",
"metadatas": [
{
"schema": "string",
"metadata": {
"employeeId": "string",
"title": "Contract",
"documentDate": "2025-02-15T00:00:00Z",
"department": "HR",
"category": "Manual",
"expiryDate": "2027-02-15T00:00:00Z"
}
}
]
}
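The update request repeats the package ID in both the path and the body. The sketch below builds both from a single argument so the two values cannot drift apart. The base URL, the `Authorization` header name, and the function name are assumptions, not part of the documented API.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical base URL; replace with the address of your Ingest Service.
BASE_URL = "https://vault.example.com/ingest"


def build_update_request(token, package_id, metadatas):
    """Build (but do not send) a POST /document/{packageId} request."""
    # The body must carry the same packageId as the path parameter.
    body = {"packageId": package_id, "metadatas": metadatas}
    path_id = urllib.parse.quote(package_id, safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/document/{path_id}",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: the bearer token travels in an Authorization header.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


update_request = build_update_request(
    token="<token>",
    package_id="abc123",
    metadatas=[{"schema": "string", "metadata": {"department": "HR"}}],
)
```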
Response
200
Returns an IngestResponse.
default
Returns an ErrorResponse.
3. Request Upload URL
Description
Generates a URL that can be used for uploading Content which can be referenced during ingest. This can help handle large files by avoiding large direct uploads to the ingest endpoint.
Request
GET /url
Headers
| Name | Description | Type | Required |
|---|---|---|---|
| | Bearer token (e.g., | | Yes |
Response
200
Returns the URL as a string.
default
Returns an ErrorResponse.
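A small Python sketch for handling the 200 response of this endpoint. The section only says the URL is returned "as a string", so the helper below accepts either a bare URL or a JSON-encoded string; that dual handling and the function name are assumptions. How the uploaded content is then referenced during ingest is not specified here, so it is not shown.

```python
import json
import urllib.parse


def parse_upload_url(raw_body: bytes) -> str:
    """Extract the upload URL from the response body of GET /url."""
    text = raw_body.decode("utf-8").strip()
    # Assumption: the body may be a JSON string ("https://...") or a bare URL.
    if text.startswith('"'):
        text = json.loads(text)
    parts = urllib.parse.urlsplit(text)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not a usable upload URL: {text!r}")
    return text
```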
4. Ingest SIP
Description
Triggers ingest of an existing SIP package.
Request
PUT /sip
Headers
| Name | Description | Type | Required |
|---|---|---|---|
| | Bearer token (e.g., | | Yes |
| | Optional list to specify what information should be part of the result. If no value is provided, only the basic information of a package is returned. | | No |
Body Parameters
| Name | Description | Type | Required |
|---|---|---|---|
| type | How the content is provided (binary data, direct URL, or S3 reference) | | Yes |
| content | Actual content (base64-encoded data) or a reference (URL or S3 key) | | Yes |
| name | Name of the file being ingested | | Yes |
Example -
{
"type": "BINARY | URL | S3",
"content": "base64-encoded data or URL or S3 key",
"name": "filename"
}
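For the `BINARY` type, the `content` field carries base64-encoded data. A minimal Python sketch of building such a body from raw bytes follows; the function name and the sample file name are illustrative only.

```python
import base64


def make_binary_content(name: str, data: bytes) -> dict:
    """Wrap raw bytes as a Content object of type BINARY."""
    return {
        "type": "BINARY",
        # base64 output is ASCII, so it can be embedded directly in JSON.
        "content": base64.b64encode(data).decode("ascii"),
        "name": name,
    }


sip_body = make_binary_content("package.sip", b"example bytes")
```

Base64 inflates the payload by roughly a third, which is one reason to prefer a URL or S3 reference for large files.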
Response
200
Returns an IngestResponse.
default
Returns an ErrorResponse.
5. SIP Ingest Status
Description
Retrieves the status of a SIP ingest operation.
Request
GET /sip/{packageId}/{submissionName}
Headers
| Name | Description | Type | Required |
|---|---|---|---|
| | Bearer token (e.g., | | Yes |
| | Optional list to specify what information should be part of the result. If no value is provided, only the basic information of a package is returned. | | No |
Path Parameters
| Name | Description | Type | Required |
|---|---|---|---|
| packageId | the | | Yes |
| submissionName | | | Yes |
Response
200
Returns an IngestResponse.
404
SIP not found. This can mean that ingestion has finished or that the SIP was never uploaded. Check the AIP to be sure.
default
Returns an ErrorResponse.
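Because ingest is asynchronous, a client typically polls this endpoint. The sketch below shows one possible polling loop in Python. The HTTP call is injected as `fetch_status` so the logic stays independent of any HTTP library. Treating `PENDING` as the only in-progress status is an assumption: the schema lists `INGESTED` and `PENDING` only as examples.

```python
import time
from typing import Callable, Optional, Tuple


def wait_for_sip(
    fetch_status: Callable[[], Tuple[int, dict]],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll GET /sip/{packageId}/{submissionName} until the SIP leaves PENDING.

    fetch_status is a hypothetical callable returning (http_status, parsed_body).
    Returns the last IngestResponse, or None on 404.
    """
    for _ in range(max_attempts):
        http_status, body = fetch_status()
        if http_status == 404:
            # SIP not found: ingestion finished or the SIP was never uploaded.
            # The caller should check the AIP to tell the two apart.
            return None
        if http_status != 200:
            # Any other status carries an ErrorResponse (code, message).
            raise RuntimeError(f"{body.get('code')}: {body.get('message')}")
        if body.get("status") != "PENDING":
            return body
        sleep(interval_seconds)
    raise TimeoutError("SIP still pending after the maximum number of attempts")
```

A `None` result is deliberately distinct from a timeout, since a 404 is ambiguous and needs a follow-up check on the AIP.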
Schemas
The schemas below are referenced above. All are JSON-based.
IngestResponse Schema
Example of a successful response when a package is accepted for ingestion. Acceptance does not mean the ingest succeeded: ingest is an asynchronous process, so the status should be checked and followed up.
{
"packageId": "string", // Unique id for the package
"packageType": "string", // Indicates the type of package (e.g., AIP)
"pluginReports": [ // The overview of plugins executed on the package and their result. Populated when the pluginreports enricher is enabled in the header.
{
"executionDate": "string", // Date/time of plugin execution in ISO 8601 format
"pluginName": "string", // Name of the executed plugin
"message": "string", // Message returned by the plugin
"status": "SUCCESS | WARNING | FAILURE" // Status of the plugin execution
}
],
"permissions": [
"string" // The permission the current user has on the package. Populated when the permission enricher is enabled in the header.
],
"name": "string", // Name of the SIP
"status": "string", // Current status of the SIP (e.g., 'INGESTED', 'PENDING')
"submissionDate": "string" // Date/time the SIP was submitted in ISO 8601 format
}
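When plugin reports are requested, a client may want to surface only the plugins that did not succeed. A minimal Python sketch over a parsed IngestResponse follows; the function name is illustrative, and the field names come from the schema above.

```python
def plugins_with_status(ingest_response: dict, status: str) -> list:
    """Return the names of plugins whose report has the given status."""
    # pluginReports is only populated when requested, so it may be absent.
    reports = ingest_response.get("pluginReports") or []
    return [r["pluginName"] for r in reports if r.get("status") == status]


sample = {
    "packageId": "abc123",
    "status": "PENDING",
    "pluginReports": [
        {"pluginName": "virus-scan", "status": "SUCCESS", "message": "clean"},
        {"pluginName": "format-check", "status": "FAILURE", "message": "unknown format"},
    ],
}
failed = plugins_with_status(sample, "FAILURE")
```

The plugin names in `sample` are made up for the example.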
ErrorResponse Schema
Returned when an error occurs.
{
"code": "ERROR_CODE",
"message": "Description of the error"
}
Content Schema
{
"type": "BINARY | URL | S3", // How the content is provided (binary data, direct URL, or S3 reference)
"content": "string", // Actual content (base64-encoded data) or a reference (URL or S3 key)
"name": "string" // Name of the file being ingested
}
MetadataObject Schema
Represents a single piece of metadata. You can either specify metadata fields directly using the MetadataFields schema or attach a metadata file using the MetadataFile schema.
MetadataFields
| Name | Description | Type | Required |
|---|---|---|---|
| schema | The URI of the metadata schema of this object. The metadata schema should be registered in the archive. | | Yes |
| metadata | A JSON object containing the metadata values adhering to the defined schema | | Yes |
{
"schema": "string",
"metadata": {
// Key-value pairs describing the metadata
// Example:
"exampleKey": {
"subKey": "someValue" // Structure can be nested objects
}
}
}
MetadataFile
{
"schema": "string",
"file": { // Same as Content Schema
"type": "BINARY | URL | S3",
"content": "base64-encoded data or URL or S3 key",
"name": "filename"
}
}
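A client can check which of the two forms a metadata object uses before sending it. The Python sketch below reads "either ... or" as mutually exclusive, which is an interpretation of the description above and not a documented rule. The function name is illustrative.

```python
def metadata_object_kind(obj: dict) -> str:
    """Classify a MetadataObject as 'MetadataFields' or 'MetadataFile'."""
    schema = obj.get("schema")
    if not isinstance(schema, str) or not schema:
        raise ValueError("'schema' is required")
    has_fields = "metadata" in obj
    has_file = "file" in obj
    # Assumption: exactly one of 'metadata' or 'file' must be present.
    if has_fields == has_file:
        raise ValueError("provide exactly one of 'metadata' or 'file'")
    return "MetadataFields" if has_fields else "MetadataFile"
```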