Handling large file uploads - A developer guide

August 6, 2024

Handling large file uploads is a common challenge for developers building applications that involve media storage, cloud backups, or large dataset uploads. Uploadcare’s File Uploader provides robust solutions for managing these uploads efficiently. In this article, we will consider various techniques for handling large file uploads, focusing on the API and HTTP aspects crucial for developers.

Large file uploading to Uploadcare servers

How large is large? A large file definition

Most HTTP web servers and browsers limit file uploads to 2 GB. Google Chrome and Opera are exceptions and can support files of up to 4 GB.

Limits for other services may vary significantly. For instance, a Gmail attachment can be 20-25 MB in size. If a file is bigger, the service automatically uploads it to Google Drive and offers to send a link.

If you want to enable large file uploads in your application, either for your end users or for your team, you might consider using a cloud storage provider like Google Cloud, Azure Blob, Dropbox, or Amazon S3.

Amazon S3 storage supports uploading objects of up to 5 GB in a single operation and files of up to 5 TB when divided into chunks and processed by the API. This is sufficient even for uploading very large files, such as 200+ GB, at once.

Uploadcare receives around 1,000,000 files daily from all over the world, and files larger than 10 MB are considered large. Observing the patterns, we can conclude that the size and quantity of media are expanding at a rapid pace, attributable primarily to the increase in video content.

Problems with large file uploads

1. Low upload speed and latency

The larger a file, the more bandwidth and time it takes to upload. This rule seems logical for a developer but can become a huge pain point for an end user.

Speed problems usually occur if you transfer data in a single batch to your server. In this scenario, no matter where your end user is located, all the files go to a single destination via the same road, creating gridlock like in Manhattan during rush hour.

And if the files are huge, your channel gets paralyzed: the speed goes down, and you can’t use your assets to their full potential.

2. Uploading errors

The most common upload errors are due to limitations on the user’s browser or your web server.

Generally, 2 GB is a safe maximum supported by all browser types and versions. As for a web server, it can reject a request:

If it isn’t sent within the allotted timeout period;
If memory usage limits are exceeded;
If there’s a network interruption;
If the client’s bandwidth is low or the internet connection is unstable.

Techniques for handling large file uploads

How do you handle large files and avoid these problems while still giving your users a good upload experience? Let’s consider the following techniques:

1. Chunking

Chunking involves breaking a large file into smaller parts (chunks) and uploading them individually. This method allows for easier error management, as only the failed chunk needs to be re-uploaded rather than the entire file. Additionally, chunking helps manage memory and bandwidth usage efficiently.

For example, say you have an HTML form with an input that you would like to use for uploading large files with the markup:

<form id="upload-form">
  <input type="file" id="file-upload">
  <button type="submit">Upload file</button>
</form>

Using JavaScript, you can split the large file into smaller chunks and upload them individually using the code below:

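document.getElementById('upload-form').addEventListener('submit', function (e) {
  e.preventDefault();
  // Report failures instead of leaving an unhandled promise rejection
  uploadFile().catch((error) => console.error('Upload failed:', error));
});

async function uploadFile() {
  const fileInput = document.getElementById('file-upload');
  const file = fileInput.files[0];
  const chunkSize = 10 * 1024 * 1024; // 10 MB per chunk

  if (!file) {
    alert('Please select a file!');
    return;
  }

  const totalChunks = Math.ceil(file.size / chunkSize);

  // Split the file into 10 MB chunks and upload them one after another
  for (let index = 0; index < totalChunks; index++) {
    const start = index * chunkSize;
    const chunk = file.slice(start, start + chunkSize);
    await uploadChunk(chunk, index, totalChunks, file.name);
  }
}

async function uploadChunk(chunk, index, totalChunks, fileName) {
  console.log(`Uploading chunk ${index + 1} of ${totalChunks}:`, chunk.size);
  const formData = new FormData();
  formData.append('file', chunk);
  // Hypothetical field names: the server needs some way to reassemble chunks in order
  formData.append('chunkIndex', String(index));
  formData.append('totalChunks', String(totalChunks));
  formData.append('fileName', fileName);

  try {
    // Placeholder URL: replace with your own upload endpoint
    const response = await fetch('https://upload-api.com/', {
      method: 'POST',
      body: formData,
    });

    if (!response.ok) {
      throw new Error(`Upload failed with status ${response.status}`);
    }

    const data = await response.json();
    console.log('Chunk uploaded successfully:', data);
  } catch (error) {
    console.error('Error uploading chunk:', error);
    throw error; // stop the upload instead of silently skipping this chunk
  }
}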

The code above splits a large file into smaller chunks (10 MB each) using the Blob.slice method and uploads them one at a time to an HTTP API URL. Because only one chunk is in flight at any moment, large files can be uploaded while keeping memory and bandwidth usage low.
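Since only the failed chunk needs to be re-uploaded, a retry wrapper is a natural companion to chunking. The sketch below is one possible approach, not a definitive implementation: `sendChunk` is a hypothetical async function that uploads one chunk and rejects on failure (a function that only logs errors and returns normally would never trigger a retry), and the `maxAttempts` and `baseDelayMs` defaults are arbitrary assumptions.

```javascript
// Retry a single chunk upload a few times before giving up.
// `sendChunk` is a hypothetical async function that rejects when the upload fails.
async function uploadChunkWithRetry(sendChunk, chunk, options = {}) {
  const { maxAttempts = 3, baseDelayMs = 500 } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await sendChunk(chunk);
    } catch (error) {
      if (attempt === maxAttempts) {
        throw error; // out of attempts: let the caller decide what to do
      }
      // Exponential backoff: wait baseDelayMs, then 2x, then 4x, and so on
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

With a wrapper like this, a brief network interruption costs one chunk plus a short wait rather than the whole file.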

Uploadcare’s File Uploader splits all files larger than 10 MB into 5 MB chunks and uploads them in batches of four at a time. This method makes fuller use of channel capacity, reduces upload errors, and can boost upload speed by up to 4x.

Large file chunking and simultaneous uploading with Uploadcare
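Uploading several chunks at once can be sketched with a small worker pool. This is an illustration of the general idea rather than Uploadcare's actual implementation: `sendChunk(chunk, index)` is a hypothetical async function that uploads one chunk, and the default of 4 simply mirrors the number mentioned above.

```javascript
// Upload chunks with at most `concurrency` requests in flight at once.
// `sendChunk(chunk, index)` is a hypothetical async function that uploads one chunk.
async function uploadChunksInParallel(chunks, sendChunk, concurrency = 4) {
  const results = new Array(chunks.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unclaimed chunk until none are left.
  async function worker() {
    while (nextIndex < chunks.length) {
      const index = nextIndex++;
      results[index] = await sendChunk(chunks[index], index);
    }
  }

  const workerCount = Math.min(concurrency, chunks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // results keep the original chunk order
}
```

Because chunks can finish out of order, the server needs each chunk's index (or byte offset) to reassemble the file correctly.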

2. Resumable uploads

Resumable uploads allow an upload to be paused and resumed at any time without starting over, which helps maintain upload integrity when something interrupts the transfer. This is particularly useful for large files and unstable network connections, because users can continue from where they left off instead of re-sending data that has already arrived.
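The core of resuming is knowing which parts of the file still need to be sent. The sketch below assumes the upload is chunked and that the indexes of already-uploaded chunks are available from somewhere, such as the server or local storage. The function name and the shape of `uploadedIndexes` are hypothetical.

```javascript
// Work out which byte ranges still need to be sent when an upload resumes.
// `uploadedIndexes` is assumed to list the chunk indexes that already arrived.
function getRemainingChunks(fileSize, chunkSize, uploadedIndexes) {
  const done = new Set(uploadedIndexes);
  const totalChunks = Math.ceil(fileSize / chunkSize);
  const remaining = [];

  for (let index = 0; index < totalChunks; index++) {
    if (done.has(index)) continue; // already uploaded, skip it
    const start = index * chunkSize;
    const end = Math.min(start + chunkSize, fileSize); // exclusive, as in Blob.slice
    remaining.push({ index, start, end });
  }
  return remaining;
}
```

In a browser, each returned range could then be turned back into a chunk with `file.slice(start, end)` and uploaded as usual.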

3. Streaming

Another useful method is streaming, where the file is uploaded as it is being read. This is particularly beneficial for large files because it reduces the load on both the client and the server and allows for continuous data transfer. An example of this is using the createReadStream function from the Node.js fs module to read a file piece by piece while it is being uploaded:

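// Example of streaming a large file upload from Node.js
const fs = require('fs');
const http = require('http');

// Placeholder endpoint: replace host, port, and path with your upload server
const req = http.request(
  { host: 'localhost', port: 3000, path: '/upload', method: 'POST' },
  (res) => console.log('Upload finished with status', res.statusCode)
);
req.on('error', (err) => console.error('Upload failed:', err));

// Read the file in small pieces and pipe them straight into the request body,
// so the whole file is never held in memory at once
const stream = fs.createReadStream('largefile.mp4');
stream.on('error', (err) => req.destroy(err));
stream.pipe(req);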
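Streaming helps on the receiving side as well: a server can write the incoming request body to disk as it arrives instead of buffering the whole file in memory. The following is a minimal sketch under stated assumptions: `saveUploadStream` and `destinationPath` are hypothetical names, and in an HTTP server `source` would be the request object.

```javascript
const fs = require('fs');
const { pipeline } = require('stream/promises');

// Write an incoming stream to disk piece by piece.
// `source` is any readable stream, e.g. `req` inside http.createServer.
// `destinationPath` is a hypothetical location chosen by the application.
async function saveUploadStream(source, destinationPath) {
  // pipeline handles backpressure and destroys both streams if either one fails
  await pipeline(source, fs.createWriteStream(destinationPath));
  const { size } = await fs.promises.stat(destinationPath);
  return size; // number of bytes written
}
```

Inside a request handler, this could be called as `await saveUploadStream(req, '/tmp/upload.bin')` before sending the response, where the path is again only an example.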

4. Use a CDN and upload files to the closest data center

Using a Content Delivery Network (CDN) is an effective way to handle large file uploads and keep the user experience smooth and reliable. A CDN is a distributed network of servers that delivers content to users based on their geographic location.

By using a CDN for your backend services and making sure files are uploaded to the closest data center, you can significantly speed up large file uploads in your application.

At Uploadcare, we use Amazon S3, which receives numerous batches of data simultaneously and stores each in globally distributed edge locations. To increase speed and reduce latency even further, we use an acceleration feature that enables fast transfers between a browser and an S3 bucket.

By adopting this method, you get a kind of reverse CDN effect: if a user is in Singapore, the uploaded data doesn't travel to the primary AWS server in the US but goes to the nearest data center, which is 73% faster.

A speed estimate for uploading data to AWS with and without transfer acceleration feature

Check out the speed comparison and possible acceleration for your target regions in this speed checker.

Integration with Uploadcare

Uploadcare’s File Uploader simplifies handling large file uploads with features like automatic chunking, resumable uploads, and error management. By integrating Uploadcare, you can add large file uploading to your application and leave the heavy lifting to us.

You can integrate the File Uploader components from Uploadcare into your application to enable file uploading:

<uc-config
  ctx-name="my-uploader"
  pubkey="YOUR_PUBLIC_KEY"
  img-only="true"
  multiple="true"
  max-local-file-size-bytes="524288000"
  use-cloud-image-editor="true"
  source-list="local, url, camera, dropbox"
>
</uc-config>

<uc-file-uploader-regular ctx-name="my-uploader"></uc-file-uploader-regular>
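As its name suggests, the `max-local-file-size-bytes` option takes a size in bytes, which is easy to get wrong by hand. The value used above appears to correspond to 500 MB counted in binary units. The helper below is a hypothetical convenience for working out such values, not part of the File Uploader.

```javascript
const BYTES_PER_MB = 1024 * 1024;

// Convert a size in megabytes (binary, 1 MB = 1024 * 1024 bytes) to bytes.
function megabytesToBytes(megabytes) {
  return megabytes * BYTES_PER_MB;
}

console.log(megabytesToBytes(500)); // 524288000
```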

Some features Uploadcare provides for handling large file uploads include:

Automatic splitting of large files into chunks by the File Uploader.
RESTful Upload API and SDKs for multipart uploads of large files to Uploadcare servers (up to 100 MB chunk size per file).
Distribution of uploaded files across multiple global CDNs, ensuring fast access and download speeds for your users worldwide.
Secure HTTPS for all uploads, plus fine-grained access controls and permissions to manage who can upload, view, or modify files.

Case study: Supervision Assist is an application that helps manage practicum and internship university programs. It allows university coordinators to supervise their students through live or recorded video sessions. The company needed a secure HIPAA-compliant service that would handle large uncompressed files with recorded sessions in MP4, MOV, and other formats generated by cameras. The team managed to build such a system from scratch but eventually got overwhelmed by upload errors, bugs, and overall maintenance.

If an upload didn’t complete, one of our devs would have to go look on the web server, see what data was stored and how much was there. Individually, it’s not a big deal, but over time that adds up.

— Maximillian Schwanekamp, CTO

By integrating Uploadcare, the company could seamlessly accept files of any format and as big as 5 TB without spending in-house development resources.

Apart from handling large file uploads, using a service like Uploadcare can offer additional perks such as data validation, file compression and transformations, and video encoding. The latter allows adjusting the quality, format, and size of a video, cutting it into pieces, and generating thumbnails.

Conclusion

There’s no universally accepted concrete definition of a “large file,” but every service or platform has its file handling limits. Uploading large files without respecting those limits or the individual user’s context may lead to timeouts, errors and low speed.

By using techniques such as chunking, streaming, and resumable uploads, developers can make file uploads in their applications efficient and reliable. They can also integrate a solution such as Uploadcare that implements these methods for them.

For more information, check out our blog post on secure file uploads.