Multipart uploads on Sia using Filebase and the AWS CLI

Learn how to store large files on Sia using S3 multipart uploads with Filebase and the AWS CLI.

One of the most commonly used tools for transferring data to object storage services is the AWS CLI (Command Line Interface). This is a tool developed by Amazon and is largely based on the Python botocore library. The CLI features several commands that allow you to manage objects and buckets. Fortunately, it can be configured to point to any S3-compatible object storage service, including Filebase.

The tool is especially popular among developers and IT administrators. Because it is invoked as a command-line program, it can easily be integrated into backup jobs and other custom scripts. A common example is a nightly backup script that runs via a cron job, backing up a server’s hard drive to a storage bucket, as sketched below.
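As a rough illustration only (the paths, bucket name, and schedule below are hypothetical and chosen purely for this sketch), such a script could look like this:

#!/bin/bash
# Illustrative nightly backup: sync a local directory to a Filebase bucket.
# Assumes the AWS CLI has already been configured with Filebase credentials
# and that the bucket already exists.
aws --endpoint https://s3.filebase.com s3 sync /var/backups s3://my-backup-bucket/$(hostname)

Saved as, say, /usr/local/bin/filebase-backup.sh, it could then be scheduled with a cron entry along the lines of:

0 2 * * * /usr/local/bin/filebase-backup.sh >> /var/log/filebase-backup.log 2>&1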

The AWS CLI is certified for use with Filebase.

To use it with Filebase, you can follow the configuration steps below. Before running through any of these steps, you will need to ensure you have the CLI properly installed. You can visit the following page for install instructions: https://aws.amazon.com/cli/

Configuration

There are three configuration values that need to be passed to the AWS CLI to use it with Filebase. They are:

  1. S3 API Access Key ID
  2. S3 API Secret Access Key
  3. S3 API Endpoint

The Access Key ID and Secret Access Key can be stored in a configuration file, but the API endpoint needs to be passed in with every command. This can be avoided with third-party plugins such as awscli-plugin-endpoint, which we will cover in detail in a future blog post.
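In rough terms (this is only a sketch; check the plugin’s own documentation for the exact syntax), the plugin is installed via pip, registered with the CLI, and the Filebase endpoint is then recorded per service in your AWS config:

pip install awscli-plugin-endpoint
aws configure set plugins.endpoint awscli_plugin_endpoint
aws configure set s3.endpoint_url https://s3.filebase.com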

To set up our access keys, open a new terminal window and run:

aws configure

This will trigger a series of prompts. Enter your Filebase Access Key ID and Secret Access Key, and leave the default region and output format blank by simply pressing Enter.
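The prompts look like this (placeholders are shown here instead of real keys):

AWS Access Key ID [None]: <your Filebase Access Key ID>
AWS Secret Access Key [None]: <your Filebase Secret Access Key>
Default region name [None]:
Default output format [None]: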

Once the above steps are complete, we are ready to move onto interacting with the Filebase S3 API.

Create a new bucket

Let’s start by creating a new bucket for this tutorial, named my-test-bucket. To create it, run the following command:

aws --endpoint https://s3.filebase.com s3 mb s3://my-test-bucket
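If the bucket is created successfully, the CLI prints a confirmation similar to:

make_bucket: my-test-bucket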

Listing Buckets

Now that we’ve created a new bucket, let’s verify it shows up under our account. The s3 ls command will list all the buckets tied to your account:

aws --endpoint https://s3.filebase.com s3 ls

Once the above command is run, we should see a list of the buckets you own, and our new bucket now appears on that list.
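The output looks something like this (the creation timestamp is illustrative):

2023-01-01 12:00:00 my-test-bucket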

List bucket contents

Next let’s try listing the contents of our new bucket. It should be empty since it’s brand new, but let’s verify that.

To list the contents of a specific bucket, run the following:

aws --endpoint https://s3.filebase.com s3 ls s3://my-test-bucket

An empty response is returned, confirming that our new bucket is indeed empty.

Uploading files to a bucket

You can upload a single file or multiple files at once when using the AWS CLI. To upload multiple files at once, we can use the s3 sync command. In this example, we will upload the contents of a local folder named my-test-folder into the root of our bucket.

aws --endpoint https://s3.filebase.com s3 sync my-test-folder/ s3://my-test-bucket
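As each file transfers, the CLI prints a progress line for it; the output is along these lines (the file names are illustrative):

upload: my-test-folder/file1.txt to s3://my-test-bucket/file1.txt
upload: my-test-folder/file2.txt to s3://my-test-bucket/file2.txt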

Once the upload has completed, we can confirm it by listing the contents of the bucket again with the same s3 ls command:
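aws --endpoint https://s3.filebase.com s3 ls s3://my-test-bucket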

And of course, all files are always available from our browser-based UI console as well.

We can use the s3 cp command to upload a single file:

aws --endpoint https://s3.filebase.com s3 cp s3-api.pdf s3://my-test-bucket

Multipart uploads

The AWS CLI takes advantage of S3-compatible object storage services that support multipart uploads. By default, the multipart_threshold of the AWS CLI is 8MB. This means any file larger than 8MB will be automatically split into separate chunks and uploaded in parallel. Multipart uploading is important because it increases performance and allows for resumable file transfers in the event of network errors.
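If needed, the threshold and the size of each chunk can be adjusted through the CLI’s s3 configuration settings. For example (the 64MB values here are arbitrary, chosen only to illustrate the syntax):

aws configure set default.s3.multipart_threshold 64MB
aws configure set default.s3.multipart_chunksize 64MB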

To upload a file using multipart, simply upload a file larger than 8MB; the AWS CLI will automatically take care of the rest. In the example below, we will upload a 1GB file.

aws --endpoint https://s3.filebase.com s3 cp 1GB.zip s3://my-test-bucket

Verifying uploaded files

The AWS CLI can also be used to interact with several other Filebase S3 APIs. For example, we can use the s3api head-object command to fetch object metadata. One of the metadata fields returned by the service is the entity tag, also known as an ETag. With Filebase, the ETag of an object is equivalent to an object’s MD5 checksum. Using the MD5 is a common practice with S3-compatible object storage services.

In the example below, we fetch the object’s metadata from the Filebase S3 API, then run a command that calculates the MD5 of the same file on our local machine. If the two MD5 values match, we can be sure that our upload was successful and the service received our data intact.

aws --endpoint https://s3.filebase.com s3api head-object --bucket my-test-bucket --key 1GB.zip
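The response is a JSON document; trimmed to the fields of interest, it looks something like this (the values other than the ETag are illustrative):

{
    "ContentLength": 1073741824,
    "ETag": "\"cd573cfaace07e7949bc0c46028904ff\"",
    "ContentType": "application/zip"
}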

To calculate the MD5 checksum of our file locally (md5sum on Linux; macOS users can use the built-in md5 command instead):

md5sum 1GB.zip
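The command prints the checksum followed by the file name:

cd573cfaace07e7949bc0c46028904ff  1GB.zip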

As you can see, the MD5 value of cd573cfaace07e7949bc0c46028904ff matches the ETag returned by the API, so we can be assured the service received and processed our file correctly.

Conclusion

Working within the command line isn’t as daunting as it may seem, and we hope this tutorial was a good introduction to using Filebase in this way.

Please let us know if you have any questions, comments, or concerns at hello@filebase.com.

For more information:
Filebase website
Filebase technical documentation

About Filebase
Filebase is the world’s first object storage platform powered by multiple decentralized storage networks. Filebase helps customers save over 90% compared to traditional cloud providers. Additionally, Filebase’s proprietary edge caching technology helps customers achieve industry-leading storage performance when fetching data from decentralized networks.

Filebase was awarded the “Most Exciting Data Storage And Sharing Project” in HackerNoon’s 2020 Noonies Awards and was a finalist in Storage Magazine’s 2019 Product of the Year Awards.

Visit our website and blog, follow us on Twitter and LinkedIn and like us on Facebook.