Storing file references, avoiding orphaned files, etc

Hi, we’re primarily looking at Uploadcare for a use case where our users upload one or more files and then view them. As noted at https://uploadcare.com/learning/guides/files-browser/, this requires that we store information about each uploaded file ourselves; primarily, we need to know which files each user owns. We’d store this information in our database.

To avoid orphaned files, a typical file upload flow built natively would do something like this:

  1. User selects a file they want to upload
  2. User clicks “upload my file” button
  3. An API call is made which stores the file and user information in our database
  4. Only if #3 is successful, commence the actual file upload (if #3 failed, show the user an error message)

Importantly, #3 occurs before #4. That way, our own database holds a record of every file that might have been uploaded. If #3 were to come after #4, orphaned files could occur like this:

  1. User selects a file they want to upload
  2. User clicks “upload my file” button
  3. File upload commences
    [if any error occurs here, or the user closes the browser, etc., we end up with orphaned files]
  4. Once file upload completes, an API call is made which stores the file and user information in our database

As you can see, if this second flow is followed, files can be uploaded for which no record exists in the database. These are what I mean by “orphaned” files; over time, orphaned files not only increase storage costs but can also present security risks.
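To make the first flow concrete, here is a minimal TypeScript sketch of the ordering. The `createRecord` and `uploadFile` functions are hypothetical placeholders for our own API call and whatever performs the upload; nothing here is Uploadcare-specific.

```typescript
// Hypothetical dependencies; replace with your own API client and uploader.
interface RecordFirstDeps {
  createRecord(userId: string, fileName: string): Promise<string>; // resolves to a record ID
  uploadFile(recordId: string, fileName: string): Promise<void>;
}

type UploadOutcome =
  | { ok: true; recordId: string }
  | { ok: false; reason: "record-failed" | "upload-failed"; recordId?: string };

async function uploadRecordFirst(
  userId: string,
  fileName: string,
  deps: RecordFirstDeps,
): Promise<UploadOutcome> {
  // Step 3: the database record comes first.
  const recordId: string | null = await deps
    .createRecord(userId, fileName)
    .catch(() => null);
  if (recordId === null) {
    // Step 4 never starts, so nothing can be orphaned.
    return { ok: false, reason: "record-failed" };
  }
  try {
    // Step 4: only now does the upload begin.
    await deps.uploadFile(recordId, fileName);
  } catch {
    // The record still exists, so a partial upload remains traceable.
    return { ok: false, reason: "upload-failed", recordId };
  }
  return { ok: true, recordId };
}
```

Whatever happens during the upload, the database already knows the file might exist, so a cleanup job has something to reconcile against.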

How can we accomplish the best practice of the first flow, using the React/JavaScript widget? I do see that there’s a webhook which fires when an upload is completed, but in my experience that can be a bit difficult to implement on our end in a non-flaky way.

Thanks!

Hi @john2, thanks for your question, and sorry for the delayed response! As I understand it, you plan to use file uploader v3. This widget starts uploading as soon as a user selects a file, and unfortunately there’s no way to delay the upload. However, if you want to avoid orphaned files, you can set up the following workflow:

First, disable the “auto file storing” option in your project settings. All your uploads will then be marked as temporary by default, and our system will delete them automatically after 24 hours unless you change their state to “stored”. See Uploadcare Storage Workflow for more information. The upload flow then becomes:

  • User selects a file they want to upload
  • User clicks “upload my file” button
  • File upload commences
  • Once file upload completes, an API call is made which stores the file and user information in your database.
  • If that call succeeds, your app makes an API call to Uploadcare to mark the file as stored. Otherwise (e.g. if something went wrong when adding the record to your DB), the uploaded file stays temporary and is deleted automatically.
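The last two steps could look roughly like this on your backend. This is a sketch under assumptions, not a definitive implementation: `saveRecord` and `storeFile` are hypothetical functions you would supply, and the endpoint and headers in `buildStoreRequest` reflect the "store file" call of REST API v0.7 as I understand it, so please check them against the current API reference.

```typescript
// Hypothetical dependencies: your DB layer and an HTTP call to Uploadcare.
interface FinalizeDeps {
  saveRecord(userId: string, fileUuid: string): Promise<void>;
  storeFile(fileUuid: string): Promise<void>;
}

// Describes the request that marks a file as stored.
// Assumption: REST API v0.7 endpoint and Uploadcare.Simple auth scheme.
function buildStoreRequest(fileUuid: string, publicKey: string, secretKey: string) {
  return {
    url: `https://api.uploadcare.com/files/${fileUuid}/storage/`,
    method: "PUT",
    headers: {
      Accept: "application/vnd.uploadcare-v0.7+json",
      Authorization: `Uploadcare.Simple ${publicKey}:${secretKey}`,
    },
  };
}

// Call this once the widget reports a completed upload.
async function finalizeUpload(
  userId: string,
  fileUuid: string,
  deps: FinalizeDeps,
): Promise<"stored" | "left-temporary"> {
  try {
    await deps.saveRecord(userId, fileUuid);
  } catch {
    // No DB record, so we never store the file; it expires as a temporary upload.
    return "left-temporary";
  }
  // The record exists, so it is safe to keep the file.
  await deps.storeFile(fileUuid);
  return "stored";
}
```

If `storeFile` throws after the record was saved, the error propagates to the caller, which can retry while the file is still temporary. Since the request carries your secret key, it should be sent from your server, never from the browser.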

Also, it’s possible to add arbitrary metadata to files, and you can do this upon uploading. Metadata fields are passed through webhooks, which means you can use them to notify your system about successful uploads and match the uploads with particular entities in your app (e.g. user ID, order ID, session, etc.). So far, there’s no way to search/filter files by metadata, but we plan to make it possible in the future.
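As a rough illustration of matching a webhook to a user via metadata, here is a small TypeScript sketch. The payload interface lists only the fields the sketch needs and is my assumption about the shape of a `file.uploaded` notification; the `userId` metadata key and the in-memory `owners` map are hypothetical stand-ins for whatever your app uses.

```typescript
// Assumed minimal shape of a "file.uploaded" webhook payload.
interface UploadedWebhook {
  data: { uuid: string; metadata?: Record<string, string> };
}

type MatchResult = "recorded" | "duplicate" | "unmatched";

// Records which user owns the uploaded file.
// `owners` stands in for your database table (file UUID -> user ID).
function recordUpload(payload: UploadedWebhook, owners: Map<string, string>): MatchResult {
  const userId = payload.data.metadata?.userId; // hypothetical metadata key
  if (!userId) {
    // Nothing to match against; leave the file temporary.
    return "unmatched";
  }
  if (owners.has(payload.data.uuid)) {
    // Same notification delivered again: do nothing the second time.
    return "duplicate";
  }
  owners.set(payload.data.uuid, userId);
  return "recorded";
}

const owners = new Map<string, string>();
const payload: UploadedWebhook = { data: { uuid: "abc-123", metadata: { userId: "42" } } };
console.log(recordUpload(payload, owners)); // first delivery
console.log(recordUpload(payload, owners)); // repeated delivery
```

Making the handler idempotent like this means a repeated delivery is harmless, which removes one common source of flakiness in webhook handling.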

Thanks Alex, that makes sense. One question: what if we want Uploadcare to store files in S3?

In this case, is it true that the API call we make to Uploadcare (“store this file, don’t delete it after 24 hours”) is not relevant? Meaning that once Uploadcare has replicated the file to S3, it will stay there no matter what; Uploadcare will not delete it after 24 hours.

It depends on whether you want your files to be retained in Uploadcare storage. However, the “store” behavior doesn’t affect files replicated to your S3 bucket; our platform doesn’t perform any actions with files in your bucket.