Document conversion

Uploadcare can convert documents from all popular formats via the REST API. For example, you can upload a PDF and generate a thumbnail by converting its first page to PNG, convert various office formats to web-friendly PDFs, or make previews of artists' PSD/AI files.

The supported input formats include the following:

  • Microsoft: DOC, DOCX, XLS, XLSX, PPS, PPSX, PPT, PPTX, VSD, VSDX, SKW, WPD, WPS, XLR, PUB, MPP.
  • Apple: KEY, MSG, NUMBERS, PAGES.
  • OpenDocument: ODT, ODS, ODP.
  • Text: TXT, RTF, PDF, DJVU, EML.
  • Data: CSV, XPS, PS, EPS.
  • Image: PSD, AI.

The output file formats ($target_format options): doc, docx, xls, xlsx, odt, ods, txt, rtf, pdf, jpg, png, html5.

Note: When converting multi-page documents to an image format (jpg or png), the output will be a zip archive with one image per page.
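The format rules above can be checked before a request is sent. The following is a minimal sketch, not part of any Uploadcare tooling: the function names are hypothetical, and the list of formats is copied from the output formats listed above.

```shell
# Output formats copied from the list above.
SUPPORTED_TARGET_FORMATS="doc docx xls xlsx odt ods txt rtf pdf jpg png html5"

# Return 0 if $1 is one of the listed target formats, 1 otherwise.
is_supported_target_format() {
    local candidate="$1" fmt
    for fmt in $SUPPORTED_TARGET_FORMATS; do
        if [ "$fmt" = "$candidate" ]; then
            return 0
        fi
    done
    return 1
}

# Return 0 for image targets. Per the note above, converting a
# multi-page document to one of these yields a zip archive.
is_image_target() {
    case "$1" in
        jpg|png) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_supported_target_format pdf succeeds, while is_supported_target_format psd fails because PSD is an input format only.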

How it works

Document conversion works asynchronously through the REST API, unlike the image processing operations performed on the fly.

The converted output is saved as a separate file, while the original file remains intact.

  1. Start a processing job via REST API. Send an input file UUID with necessary processing operations.
  2. Wait until the processing job status becomes finished.
  3. The processed file can be addressed via its new UUID, as well as via a URL with the operations provided in the processing request.
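Step 2 above amounts to a polling loop. The sketch below shows one way to write it in shell. The helper wait_until_finished is hypothetical: it takes the name of a command that prints the current job status, so the loop can be tried without network access. Only the finished status is named in this document; treating failed and cancelled as terminal states is an assumption, as are the return codes.

```shell
# Poll a job until it reaches a terminal state.
#   $1  name of a command or function that prints the current job status
#   $2  seconds to sleep between polls (default: 2)
#   $3  maximum number of polls (default: 30)
# Returns 0 when finished, 1 when failed or cancelled (assumed status
# names), 2 if the status command itself fails, 3 on timeout.
wait_until_finished() {
    local get_status="$1" interval="${2:-2}" max_attempts="${3:-30}"
    local attempt=1 status
    while [ "$attempt" -le "$max_attempts" ]; do
        status="$("$get_status")" || return 2
        case "$status" in
            finished) return 0 ;;
            failed|cancelled) return 1 ;;
        esac
        sleep "$interval"
        attempt=$((attempt + 1))
    done
    return 3
}
```

In practice, the status command would wrap the GET request to /convert/document/status/:token/ and print the status it reports. Passing the command by name keeps the loop independent of how the status is fetched.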

Processing job

The endpoint for your requests is:

curl -X POST \
    -H "Content-Type: application/json" \
    -H "Accept: application/vnd.uploadcare-v0.7+json" \
    -H "Authorization: Uploadcare.Simple $YOUR_PUBLIC_KEY:$YOUR_SECRET_KEY" \
    -d '{"paths": ["UUID/document/-/format/:target_format/"], "store": "1"}' \
    "https://api.uploadcare.com/convert/document/"

The project identified by $YOUR_PUBLIC_KEY and $YOUR_SECRET_KEY must have the Document conversion feature enabled. The source file's UUID must belong to that project.

Input file identifiers consist of the UUID followed by /document/ (note that this is not an operation, so there must be no /-/ between them), and then the operations, each separated by /-/:

https://ucarecdn.com/:uuid/document/-/format/$target_format/

The following operations are available for the conversion job:

  • /format/:target_format/ defines the target conversion format.
  • /page/:number/ converts a single page of a multi-page document to either jpg or png. It does not work with other target formats. :number is the one-based number of the page to convert.

Operations can be omitted; the document will then be delivered as a PDF.
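The path rules above are easy to get wrong by hand, so here is a minimal sketch of a helper that assembles one entry for the paths array. The name build_conversion_path and its argument order are hypothetical. The helper only encodes the rules stated in this section: no /-/ before /document/, operations joined by /-/, and /page/ allowed only for image targets.

```shell
# Build one entry for the "paths" array of a conversion request.
# Usage: build_conversion_path UUID [TARGET_FORMAT] [PAGE]
build_conversion_path() {
    local uuid="$1" format="${2:-}" page="${3:-}"
    # /document/ is a fixed marker, not an operation: no /-/ before it.
    local path="${uuid}/document/"
    if [ -n "$page" ]; then
        # /page/ only applies to image targets. jpeg is accepted here as
        # well because this document uses both the jpg and jpeg spellings.
        case "$format" in
            jpg|jpeg|png) ;;
            *) echo "page requires a jpg or png target format" >&2
               return 1 ;;
        esac
    fi
    if [ -n "$format" ]; then
        path="${path}-/format/${format}/"
    fi
    if [ -n "$page" ]; then
        path="${path}-/page/${page}/"
    fi
    printf '%s\n' "$path"
}
```

Called as build_conversion_path UUID jpg 1, it prints UUID/document/-/format/jpg/-/page/1/. Called with a UUID only, it prints UUID/document/, which corresponds to the case where operations are omitted.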

Check out the detailed API reference for POST /convert/document/.

Job status

In response to the request, you immediately receive the UUID of the new file. The file is not accessible right away, because conversion takes time. The response also contains a :token that you can use to check the processing job status:

curl -X GET \
    -H "Content-Type: application/json" \
    -H "Accept: application/vnd.uploadcare-v0.7+json" \
    -H "Authorization: Uploadcare.Simple $YOUR_PUBLIC_KEY:$YOUR_SECRET_KEY" \
    "https://api.uploadcare.com/convert/document/status/:token/"

Check out the detailed API reference for GET /convert/document/status/:token/.

Alternatively, you can subscribe to the file.uploaded and file.infected events via the webhooks REST API endpoint.

Job result

The processed document becomes addressable via both a new UUID and a CDN URL you provided in the request:

https://ucarecdn.com/:new_uuid/
https://ucarecdn.com/:original_uuid/document/-/:operation/:parameters/

Applying the same set of conversion operations to the original document again creates a new file with a different new_UUID, but the conversion result is the same.

Create document thumbnail

To create a document thumbnail, set :target_format to jpeg or png and set the page :number to 1.

curl -X POST \
    -H "Content-Type: application/json" \
    -H "Accept: application/vnd.uploadcare-v0.7+json" \
    -H "Authorization: Uploadcare.Simple $YOUR_PUBLIC_KEY:$YOUR_SECRET_KEY" \
    -d '{"paths": ["UUID/document/-/format/jpeg/-/page/1/"], "store": "1"}' \
    "https://api.uploadcare.com/convert/document/"

Multipage conversion

You can convert a multi-page document into a group of files, one per converted page. To do so, pass the save_in_group parameter set to 1 in the request that starts the conversion job:

curl -X POST \
    -H "Content-Type: application/json" \
    -H "Accept: application/vnd.uploadcare-v0.7+json" \
    -H "Authorization: Uploadcare.Simple $YOUR_PUBLIC_KEY:$YOUR_SECRET_KEY" \
    -d '{"paths": ["UUID/document/-/format/:target_format/"], "store": "1", "save_in_group": "1"}' \
    "https://api.uploadcare.com/convert/document/"

The response will contain the UUID of the zip archive:

{"result": [{"original_source": "UUID/document/-/format/$target_format/", "token": :token, "uuid": "zip_UUID"}], "problems": {}}

A zip archive with all converted pages can be downloaded via URL with the zip_UUID:

https://ucarecdn.com/zip_UUID/
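To use the response shown above in a script, the token and UUID have to be pulled out of the JSON. The sketch below does this with sed and only handles the single-path response shape shown here; extract_uuid and extract_token are hypothetical names, and the token is assumed to be an integer. A real JSON parser such as jq is a more robust choice when one is available.

```shell
# Print the value of the "uuid" field from a single-result response.
extract_uuid() {
    printf '%s\n' "$1" | sed -n 's/.*"uuid": *"\([^"]*\)".*/\1/p'
}

# Print the value of the "token" field, assumed here to be an integer.
extract_token() {
    printf '%s\n' "$1" | sed -n 's/.*"token": *\([0-9][0-9]*\).*/\1/p'
}
```

The token feeds the status request, and the UUID forms the download URL once the job has finished. With several entries in the result array, these patterns return only the last match, so they are not suitable for multi-path requests.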

Note: The API response to the conversion request doesn't indicate the group UUID since the group identifier includes the number of files in the group, which is determined later.

To get the UUID of the converted files group:

curl -X GET \
    -H "Content-Type: application/json" \
    -H "Accept: application/vnd.uploadcare-v0.7+json" \
    -H "Authorization: Uploadcare.Simple $YOUR_PUBLIC_KEY:$YOUR_SECRET_KEY" \
    "https://api.uploadcare.com/convert/document/UUID/"

The response will contain the group UUID under format.converted_groups:

{"error":null,"format":{"name":"pdf", ... ,"converted_groups":{":target_format":"group_UUID"}}}

You can view all files in a group by URL:

https://ucarecdn.com/group_UUID/

To further work with group files, use the group REST API.
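As noted above, the group identifier includes the number of files in the group. The sketch below reads that count, assuming identifiers of the form UUID~N, where N is the file count; this document does not spell out the separator. The function name group_file_count is hypothetical.

```shell
# Print the file count embedded in a group identifier of the assumed
# form UUID~N. Prints nothing and returns 1 if the suffix is missing
# or is not a number.
group_file_count() {
    local group_id="$1" count
    case "$group_id" in
        *"~"*) count="${group_id##*"~"}" ;;
        *) return 1 ;;
    esac
    case "$count" in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$count"
}
```

This gives the number of converted pages without an extra request, which can be useful for looping over the files in a group.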

A group cannot be created if the source document has a single page or if conversion of a specific page is requested, for example, UUID/document/-/format/jpg/-/page/1/.

The files for all converted pages of a multi-page document are added to variations in the same way as if each page had been converted separately: {"document/-/format/:target_format/-/page/:page_number/": "file_uuid", ...}

Information about the groups created by conversions to different formats is available from the /convert/document/UUID/ endpoint. In its response, the group identifiers are listed at the path .format.converted_groups as a dictionary: {":target_format": "group_UUID", ...}

If the same file is converted again to the same format, the group ID in the response will be updated, but the previous group and its files will not be deleted.

API integrations

You don't have to code most of the low-level API integrations. We have high-level libraries for all popular platforms.

Billing

  • This feature is available on paid plans.
  • This operation uploads a new file.
  • Learn how we charge for this operation.