Object Storage management using the boto3 Python library

boto3

boto3 is the AWS SDK for Python, a library used to create, configure, and manage S3 object storage.

Installation

You may want to activate a Python virtual environment first.

pip install boto3

Credentials

There are multiple ways to configure credentials in boto3:

  • interactively via the AWS CLI
  • passing credentials as parameters to the Python client (demonstrated here)
  • setting credentials as environment variables (a short sketch is shown at the end of this section)
  • via an AWS configuration file
  • via a Boto2 configuration file
  • via the Assume Role provider
  • see more options in the official boto3 credentials documentation.

Regardless of the chosen source, you must have S3 credentials generated.

In the example below, we show how to pass credentials as parameters in your Python code.

We will not generate new S3 credentials but read them from the already created configuration file s3config.cfg. How to create a configuration file with S3 credentials is shown in detail in the Swift S3 API Overview section.
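
As an alternative to passing parameters (the approach demonstrated below), the same credentials can be exported as environment variables, which boto3 picks up automatically. The following is a minimal, illustrative sketch only: it reads s3config.cfg with python-dotenv (installed in the next section) and sets the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables; the endpoint URL is not a credential and is still passed explicitly.

import os
import boto3
from dotenv import dotenv_values

# read the existing configuration file and export the credentials
# so that boto3 finds them without explicit parameters
config = dotenv_values("s3config.cfg")
os.environ["AWS_ACCESS_KEY_ID"] = config["access_key"]
os.environ["AWS_SECRET_ACCESS_KEY"] = config["secret_key"]

# only the endpoint still needs to be passed explicitly
s3client = boto3.client('s3', endpoint_url='https://' + config['host_base'])
print(s3client.list_buckets()['Buckets'])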

Using boto3

To make service requests to S3 storage using boto3, you may use two different abstractions, namely Client and Resource.

There are a number of differences between the two abstractions. For example, Clients provide a low-level interface to AWS whose methods map close to 1:1 with service APIs, while Resources represent an object-oriented interface to AWS and provide a higher-level abstraction than the raw, low-level calls made by service clients.
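
To illustrate the difference, below is a minimal sketch that connects through the Resource abstraction using the same s3config.cfg credentials (and python-dotenv, installed further below) and lists buckets via its object-oriented collection interface. The rest of this section uses the Client abstraction.

import boto3
from dotenv import dotenv_values

config = dotenv_values("s3config.cfg")

# instantiate the higher-level Resource abstraction with the same credentials
s3 = boto3.resource('s3',
        endpoint_url='https://' + config['host_base'],
        aws_access_key_id=config['access_key'],
        aws_secret_access_key=config['secret_key'])

# Resources expose collections and sub-resources instead of raw API calls
for bucket in s3.buckets.all():
    print(bucket.name)

# upload via the Bucket sub-resource, e.g.:
# s3.Bucket('my-bucket').upload_file('local.txt', 'remote.txt')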

The example below shows how to authenticate and manage S3 storage (create a bucket, upload files) in Python using the Client object.

For more methods available for Client abstraction, refer to the Client API documentation.

To read the configuration file, install a Python library that parses key-value pair files:

pip install python-dotenv

import boto3
import os
import logging
from botocore.exceptions import ClientError
import string
import random

# read credentials from existing configuration file
from dotenv import dotenv_values
config = dotenv_values("s3config.cfg")

# instantiate boto3 s3 client with credentials
s3client = boto3.client('s3',
        endpoint_url='https://' + config['host_base'],
        aws_access_key_id=config['access_key'],
        aws_secret_access_key=config['secret_key'])

# list existing buckets
response = s3client.list_buckets()
print('Existing buckets:')
for bucket in response['Buckets']:
    print(f'  {bucket["Name"]}')

# create bucket
# due to bucket naming restrictions
# use some unique identifier e.g. $USER, with some random lowercase string
# to create unique bucket names
seed = ''.join(random.choices(string.ascii_lowercase + string.digits, k=4))
bucket_name = "-".join([os.getlogin(),"bucket",seed,"boto3"])
try:
    s3client.create_bucket(Bucket=bucket_name)
    print("Created bucket " + bucket_name)
except ClientError as e:
    logging.error(e)

# upload file to bucket
file_name = 'test_file_boto3.txt'
with open(file_name, "w") as file:
    file.write("line1\nline2")

# the file-like object must be in binary mode, hence use "rb" to open the file
with open(file_name, 'rb') as f:
    s3client.upload_fileobj(f, Bucket=bucket_name, Key=file_name)
    print('Uploaded file using upload_fileobj() method')

# another way to upload a file to the bucket
# upload_file() accepts a file name instead of an open file object
# an existing object with the same key is replaced by the local copy
s3client.upload_file(file_name, Bucket=bucket_name, Key=file_name)
print('Uploaded file using upload_file() method')
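
Continuing the example above, the sketch below reuses s3client, bucket_name and file_name to list the bucket contents, download the object, generate a temporary presigned URL (one of the many other Client methods), and finally clean up; adapt it to your own workflow as needed.

# list the objects that were just uploaded
response = s3client.list_objects_v2(Bucket=bucket_name)
for obj in response.get('Contents', []):
    print(f'  {obj["Key"]} ({obj["Size"]} bytes)')

# download the object back to a new local file
s3client.download_file(bucket_name, file_name, 'downloaded_' + file_name)
print('Downloaded file using download_file() method')

# generate a temporary (1 hour) download link
url = s3client.generate_presigned_url('get_object',
        Params={'Bucket': bucket_name, 'Key': file_name},
        ExpiresIn=3600)
print('Presigned URL: ' + url)

# clean up: delete the object, then the (now empty) bucket
s3client.delete_object(Bucket=bucket_name, Key=file_name)
s3client.delete_bucket(Bucket=bucket_name)
print('Deleted bucket ' + bucket_name)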