Using Multipart Uploads

The Oracle Cloud Infrastructure Object Storage service supports multipart uploads for more efficient and resilient uploads, especially for large objects. You can perform multipart uploads using the API, the Software Development Kits (SDKs), or the Command Line Interface (CLI). The Console uses multipart uploads to upload objects larger than 64 MiB.

With multipart uploads, individual parts of an object can be uploaded in parallel to reduce the amount of time you spend uploading. Multipart uploads performed through the API can also minimize the impact of network failures by letting you retry a failed part upload instead of requiring you to retry an entire object upload.

Multipart uploads can accommodate objects that are too large for a single upload operation. Oracle recommends that you perform a multipart upload to upload objects larger than 100 MiB. The maximum size for an uploaded object is 10 TiB. Object parts must be no larger than 50 GiB. For large uploads performed through the API, you have the flexibility of pausing between the uploads of individual parts, and resuming the upload when your schedule and resources allow.

You can use object lifecycle policy rules to automatically delete any uncommitted or failed multipart uploads after a specified number of days. See Using Object Lifecycle Management for details.

Required IAM Policy

To use Oracle Cloud Infrastructure, you must be given the required type of access in a policy written by an administrator, whether you're using the Console or the REST API with an SDK, CLI, or other tool. If you try to perform an action and get a message that you don’t have permission or are unauthorized, confirm with your administrator the type of access you've been granted and which compartment you should work in.

If you are new to policies, see Getting Started with Policies and Common Policies.

For administrators:

  • You can create a policy that lets the specified IAM group manage Object Storage namespaces, buckets, and their associated objects in all compartments in the tenancy:

    Allow group <IAM_group_name> to manage object-family in tenancy

  • Alternatively, you can create policies that reduce the scope of access. For example, to let the specified group manage only buckets and objects in a particular compartment in the tenancy:

    Allow group <IAM_group_name> to manage buckets in compartment <compartment_name>
    Allow group <IAM_group_name> to manage objects in compartment <compartment_name>

Important

If you write more restrictive policies, ensure that you include the permissions required for multipart uploads. The user needs a policy that grants both OBJECT_CREATE and OBJECT_OVERWRITE permissions.

For more information about other alternatives for writing policies, see Details for Object Storage, Archive Storage, and Data Transfer.

Monitoring Resources

You can monitor the health, capacity, and performance of your Oracle Cloud Infrastructure resources by using metrics, alarms, and notifications. For more information, see Monitoring Overview and Notifications Overview.

For more information about monitoring multipart uploads, see Object Storage Metrics.

Using the Multipart Upload API

A multipart upload performed using the API consists of the following steps:

  1. Initiating an upload
  2. Uploading object parts
  3. Committing the upload

Before you use the multipart upload API, you are responsible for creating the parts to upload. Object Storage provides API operations for the remaining steps. The service also provides API operations for listing in-progress multipart uploads, listing the object parts in an in-progress multipart upload, and aborting in-progress multipart uploads initiated through the API. Here we provide a high-level overview of the API steps, but you can refer to the API Reference for specifics about supported API calls.

Creating Object Parts

With multipart upload, you split the object you want to upload into individual parts. Individual parts can be as large as 50 GiB. Decide what part number you want to use for each part. Part numbers can range from 1 to 10,000. You do not need to assign contiguous numbers, but Object Storage constructs the object by ordering part numbers in ascending order.
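As a rough illustration of the limits above, the following shell sketch computes how many parts an object needs for a chosen part size, and rejects a plan that would exceed 10,000 parts or 50 GiB per part. The function name plan_parts is hypothetical and is not part of any Oracle tool. Both arguments are sizes in bytes.

```shell
# Limits quoted in this topic.
MAX_PARTS=10000
MAX_PART_BYTES=$((50 * 1024 * 1024 * 1024))   # 50 GiB

# plan_parts <object_size_bytes> <part_size_bytes>   (hypothetical helper)
# Prints the number of parts needed, or returns 1 if a limit is exceeded.
plan_parts() {
    local size="$1" part="$2" count
    if [ "$part" -le 0 ] || [ "$part" -gt "$MAX_PART_BYTES" ]; then
        echo "part size must be between 1 byte and 50 GiB" >&2
        return 1
    fi
    count=$(( (size + part - 1) / part ))   # ceiling division
    if [ "$count" -gt "$MAX_PARTS" ]; then
        echo "object needs $count parts; the limit is $MAX_PARTS" >&2
        return 1
    fi
    echo "$count"
}
```

For example, plan_parts 1073741824 134217728 prints 8 (a 1 GiB object in 128 MiB parts). Once a part size passes this check, a tool such as split -b can cut the file into pieces of that size.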

Initiating an Upload

After you finish creating object parts, initiate a multipart upload by making a CreateMultipartUpload REST API call. Provide the object name and any object metadata. Object Storage responds with a unique upload ID that you must include in any requests related to this multipart upload. Object Storage also marks the upload as active. The upload remains active until you explicitly commit it or abort it.

Uploading Object Parts

Make an UploadPart request for each object part upload. In the request parameters, provide the Object Storage namespace, bucket name, upload ID, and part number. In the request body, include the object part. Object parts can be uploaded in parallel and in any order. When you commit the upload, Object Storage uses the part numbers to sequence object parts. Part numbers do not have to be contiguous. If multiple object parts are uploaded using the same upload ID and part number, the last upload overwrites the part and is committed when you call the CommitMultipartUpload API.

Object Storage returns an ETag (entity tag) value for each part uploaded. You need both the part number and corresponding ETag value for each part when you commit the upload.
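One way to keep track of these pairs is sketched below. It assumes that your own upload script appends a line of the form "<part_number> <etag>" to a local log after each successful UploadPart response; that log format and the helper name latest_parts are illustrative choices, not something Object Storage provides. The helper keeps only the most recent ETag recorded for each part number, mirroring the overwrite behavior described above, and prints the pairs in ascending part-number order.

```shell
# latest_parts   (hypothetical helper)
# Reads "part_number etag" lines in upload order on stdin.
# Prints one line per part number with the last ETag seen, sorted ascending.
latest_parts() {
    awk '{ etag[$1] = $2 } END { for (n in etag) print n, etag[n] }' | sort -n
}

# Example usage, assuming parts.log holds the recorded pairs:
#   latest_parts < parts.log
```

If part 2 was uploaded twice, only the ETag from the second upload survives, which is the value Object Storage expects at commit time.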

If you have network issues, you can restart a failed upload for an individual part. You do not need to restart the entire upload. If, for some reason, you cannot perform an upload all at once, multipart upload lets you continue uploading parts at your own pace. While a multipart upload is still active, you can keep adding parts as long as the total number does not exceed 10,000.
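Retrying a single part can be as simple as wrapping the command that uploads it. The sketch below is a generic retry helper, not an Oracle-provided utility: retry and RETRY_DELAY are hypothetical names, and upload_part in the usage comment stands in for whatever command you use to send one UploadPart request.

```shell
RETRY_DELAY="${RETRY_DELAY:-2}"   # seconds to wait between attempts

# retry <max_attempts> <command> [args...]   (hypothetical helper)
# Runs the command until it succeeds or max_attempts is reached.
retry() {
    local max="$1" try=1
    shift
    while [ "$try" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $try of $max failed: $*" >&2
        if [ "$try" -lt "$max" ]; then
            sleep "$RETRY_DELAY"
        fi
        try=$((try + 1))
    done
    return 1
}

# Example usage (upload_part is a placeholder for your own command):
#   retry 3 upload_part 7 part_0007
```

Only the failed part is sent again; parts that already succeeded keep their ETag values and do not need to be re-uploaded.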

You can check on an active multipart upload by listing all parts that have been uploaded. (You cannot list information for an individual object part in an active multipart upload.) The ListMultipartUploadParts operation requires the Object Storage namespace, bucket name, and upload ID. Object Storage responds with information about the parts associated with the specified upload ID. Parts information includes the part number, ETag value, MD5 hash, and part size (in bytes).

Similarly, if you have multiple multipart uploads occurring simultaneously, you can see which uploads are in progress. Make a ListMultipartUploads API call to list active multipart uploads in the specified Object Storage namespace and bucket.

Charges for parts storage begin accruing when you upload data.

Committing the Upload

When you have uploaded all object parts, commit the upload. Use the CommitMultipartUpload request parameters to specify the Object Storage namespace, bucket name, and upload ID. Include the part number and corresponding ETag value for each part in the body of the request. When you commit the upload, Object Storage constructs the object from its constituent parts. The object is stored in the specified bucket and Object Storage namespace. You can treat it like you would any other object. Garbage collection releases the storage space occupied by any parts that you uploaded but did not include in the CommitMultipartUpload request.
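The request body can be assembled from the part number and ETag pairs you recorded. The sketch below assumes the body uses the field names partsToCommit, partNum, and etag; confirm these against the CommitMultipartUpload entry in the API Reference before relying on them. The helper name build_commit_body is hypothetical, and the sketch assumes ETag values contain no characters that need JSON escaping.

```shell
# build_commit_body   (hypothetical helper)
# Reads "part_number etag" lines on stdin and prints a JSON body that
# lists the parts in ascending part-number order.
build_commit_body() {
    printf '{"partsToCommit":['
    sort -n | {
        sep=""
        while read -r num etag; do
            printf '%s{"partNum":%d,"etag":"%s"}' "$sep" "$num" "$etag"
            sep=","
        done
    }
    printf ']}\n'
}

# Example usage, assuming parts.log holds one "part_number etag" pair per line:
#   build_commit_body < parts.log > commit-body.json
```

Any part left out of the list is simply not committed, which matches the garbage collection behavior described above.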

You cannot list or retrieve parts from a completed upload. You also cannot append parts to, or remove parts from, a completed upload. If you want, you can replace the object by initiating a new upload.

If you decide to abort a multipart upload instead of committing it, wait for in-progress part uploads to complete and then use the AbortMultipartUpload operation. If you abort an upload while part uploads are still in progress anyway, Object Storage cleans up both completed and in-progress parts. Upload IDs from aborted multipart uploads cannot be reused.

API Documentation

For information about using the API and signing requests, see REST APIs and Security Credentials. For information about SDKs, see Software Development Kits and Command Line Interface.

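Use the following operations to manage multipart uploads:

  • AbortMultipartUpload
  • CommitMultipartUpload
  • CreateMultipartUpload
  • ListMultipartUploadParts
  • ListMultipartUploads
  • UploadPart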

Using the CLI

When you perform a multipart upload using the CLI, you do not need to split the object into parts as you are required to do by the API. Instead, you specify the part size of your choice, and the CLI splits the object into parts and uploads all the parts automatically. You can choose to set the maximum number of parts that can be uploaded in parallel. By default, the CLI limits the number of parts that can be uploaded in parallel to three. When using the CLI, you do not have to perform a commit when the upload is complete.

You can also use the CLI to list in-progress multipart uploads, and to abort multipart uploads initiated through the API.

For information about using the CLI, see Command Line Interface (CLI). For a complete list of flags and options available for CLI commands, see the Command Line Reference.

To perform a multipart upload using the CLI

To upload an object, open a command prompt and run oci os object put with the --part-size flag. The --part-size value is the size of each part in mebibytes (MiB) and must be an integer. Object Storage waives the minimum part size restriction for the last uploaded part.

Optionally, you can use the --parallel-upload-count flag to set the maximum number of parallel uploads allowed.

oci os object put --namespace <object_storage_namespace> -bn <bucket_name> --file <file_location> --name <object_name> --part-size <upload_part_size_in_MiB> --parallel-upload-count <maximum_number_parallel_uploads>

For example:

oci os object put --namespace MyNamespace -bn MyBucket --file ~/path/to/file --name MyObject --parallel-upload-count 10 --part-size 500
Upload ID: 277ffff5-e1b5-e81d-5f81-c374a8f33998
Split file into 12 parts for upload.
Uploading object ################################### 100%
{
  "etag": "861c8341-74d8-4142-8da4-28e1ce7783ba",
  "last-modified": "Wed, 25 Sep 2019 19:59:15 GMT",
  "opc-multipart-md5": "9Qn1eyou2yMiyOO9Bc7o1A==-12"
}

For more information on the oci os object put command, see To upload an object to a bucket.
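When choosing a --part-size value for a very large file, it can help to work out the smallest whole number of MiB that keeps the upload within 10,000 parts. The sketch below does only that arithmetic. The function name min_part_size_mib is hypothetical, and the result does not account for any minimum part size the service or CLI enforces, so treat it as a lower bound to check against the Command Line Reference.

```shell
# min_part_size_mib <file_size_bytes>   (hypothetical helper)
# Prints the smallest integer part size, in MiB, that fits the file
# into at most 10,000 parts.
min_part_size_mib() {
    local bytes="$1" mib part
    mib=$(( (bytes + 1048575) / 1048576 ))   # file size rounded up to whole MiB
    part=$(( (mib + 9999) / 10000 ))         # ceiling of mib / 10,000
    if [ "$part" -lt 1 ]; then
        part=1
    fi
    echo "$part"
}

# Example usage with the documented flags:
#   oci os object put -bn MyBucket --file ./big.bin --name MyObject \
#       --part-size "$(min_part_size_mib "$(wc -c < ./big.bin)")"
```

In practice a larger part size than this minimum is usually reasonable, because fewer parts means fewer requests.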

To list unfinished or failed multipart uploads

oci os multipart list -ns <object_storage_namespace> -bn <bucket_name>

For example:

oci os multipart list --bucket-name MyBucket
{
  "data": [
    {
      "bucket": "MyBucket",
      "namespace": "MyNamespace",
      "object": "MyObject",
      "time-created": "2019-07-25T21:55:21.973000+00:00",
      "upload-id": "0b7abd48-9ff2-9d5f-2034-63a02fdd7afa"
    },
    {
      "bucket": "MyBucket",
      "namespace": "MyNamespace",
      "object": "MyObject",
      "time-created": "2019-07-25T21:53:09.246000+00:00",
      "upload-id": "1293ac9d-83f8-e055-a5a7-d1e13277b5c0"
    },
    {
      "bucket": "MyBucket",
      "namespace": "MyNamespace",
      "object": "MyObject",
      "time-created": "2019-07-25T21:46:34.981000+00:00",
      "upload-id": "33e7a875-9e94-c3bc-6577-2ee5d8226b53"
    }
...

Tip

See the Command Line Reference for command options to control the pagination of the list output.

To delete an uncommitted or failed multipart upload

oci os multipart abort -ns <object_storage_namespace> -bn <bucket_name> --object-name <object_name> --upload-id <upload_ID>

For example:

oci os multipart abort --bucket-name MyBucket --object-name MyObject --upload-id 0b7abd48-9ff2-9d5f-2034-63a02fdd7afa
WARNING: Are you sure you want to permanently remove this incomplete upload? [y/N]: y

Tip

The CLI asks you to confirm the deletion request. To delete without the confirmation prompt, use the --force flag.

You can also create a lifecycle policy that automatically deletes uncommitted or failed multipart uploads. See Using Object Lifecycle Management for details.

To delete all uncommitted or failed multipart uploads in a bucket

#!/bin/bash
# Aborts every in-progress multipart upload in the bucket named by the first argument.

BUCKET=$1

# Emit one compact JSON object per upload: {"o": <object name>, "i": <upload ID>}
oci os multipart list --bucket-name "$BUCKET" | \
    jq -c '.data[] | {o: .object, i: ."upload-id"}' | \
    while read -r JSON; do
        # jq -r prints raw strings, so no sed step is needed to strip quotes
        OBJECTNAME=$(printf '%s\n' "$JSON" | jq -r '.o')
        UPLOADID=$(printf '%s\n' "$JSON" | jq -r '.i')
        echo "Removing Object name $OBJECTNAME, ID $UPLOADID"
        oci os multipart abort --bucket-name "$BUCKET" \
                --object-name "$OBJECTNAME" \
                --upload-id "$UPLOADID" \
                --force
    done
