Backing Up an S3 Bucket With Duplicity
The following guide shows you how to create iterative backups of one S3 bucket to another, without ever needing to hold the full contents of either bucket locally. We achieve this by using s3fs to mount the source bucket as a local filesystem and pointing Duplicity at it, backing up straight to another S3 bucket.
Steps
First we need to mount the bucket we wish to back up, using s3fs, so that we can treat it like a local filesystem.
# Fill in the settings here. The AWS IAM key needs permission to access the bucket you wish to back up (the source).
ACCESS_KEY_ID=xxxxxxxxxxxxx
SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxx
BUCKET_NAME=dev-bae-bit-portal-bucket
S3_MOUNT_POINT=/mnt/s3-bucket
# Install the S3FS package
sudo apt update && sudo apt install s3fs -y
# Create your credentials file that we will use later to mount the filesystem
echo "${ACCESS_KEY_ID}:${SECRET_ACCESS_KEY}" > ${HOME}/.passwd-s3fs
chmod 600 ${HOME}/.passwd-s3fs
# Create the mount point (creating under /mnt needs root) and take ownership of it
sudo mkdir -p $S3_MOUNT_POINT
sudo chown $USER: $S3_MOUNT_POINT
# Finally, mount the S3 bucket to the mount point.
s3fs \
$BUCKET_NAME \
$S3_MOUNT_POINT \
-o passwd_file=${HOME}/.passwd-s3fs
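With the bucket mounted, it's worth a quick sanity check that its contents are visible before moving on:
# The bucket's objects should now appear as ordinary files
ls $S3_MOUNT_POINT
df -h $S3_MOUNT_POINT
# When you are finished with the mount, unmount it with:
# fusermount -u $S3_MOUNT_POINT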
Create Backups
Install duplicity if you haven't already
sudo apt-get install duplicity -y
We also need to install "boto" through pip for Duplicity to work with S3.
sudo apt-get install python3-pip -y
pip3 install boto
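A quick way to confirm both installs worked; each command should succeed before you attempt a backup:
# Prints the installed duplicity version
duplicity --version
# Exits without error if python3 can import boto
python3 -c "import boto"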
Run the following script to create the backup
#!/bin/bash
# Credentials and settings for the S3 bucket you wish to back up to.
AWS_ACCESS_KEY_ID=xxxxxxxxxxxxxxxxxxxxxx
AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxx
BUCKET_REGION=s3-eu-west-2
DAYS_BETWEEN_FULL_BACKUPS="30D"
BACKUP_LIFETIME_DAYS="400D"
BACKUP_BUCKET_NAME="dev-bae-bit-portal-backups"
LOCAL_FOLDER_TO_BACKUP=/mnt/s3-bucket
export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY
# Clean up if any previous backup run had issues.
/usr/bin/duplicity cleanup \
--force \
--no-encryption \
"s3://${BUCKET_REGION}.amazonaws.com/${BACKUP_BUCKET_NAME}/"
# Remove outdated backups
/usr/bin/duplicity remove-older-than \
--force \
--no-encryption \
${BACKUP_LIFETIME_DAYS} \
"s3://${BUCKET_REGION}.amazonaws.com/${BACKUP_BUCKET_NAME}/"
# Take the backup.
/usr/bin/duplicity \
--no-encryption \
--s3-european-buckets \
--s3-use-new-style \
--verbosity 4 \
--full-if-older-than ${DAYS_BETWEEN_FULL_BACKUPS} \
$LOCAL_FOLDER_TO_BACKUP \
"s3://${BUCKET_REGION}.amazonaws.com/${BACKUP_BUCKET_NAME}/"
Docker Containers
If you are doing this within a Docker container, you need to run the container with --cap-add SYS_ADMIN --device /dev/fuse. Alternatively, you can just run with --privileged, but privileged mode carries significant security implications.
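For example (my-backup-image is a placeholder for whatever image you have built):
docker run -it \
--cap-add SYS_ADMIN \
--device /dev/fuse \
my-backup-image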
If using docker-compose, you can specify privileged mode with:
my_service:
  privileged: true
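Alternatively, docker-compose supports cap_add and devices, so the equivalent of --cap-add SYS_ADMIN --device /dev/fuse (without full privileged mode) would be:
my_service:
  cap_add:
    - SYS_ADMIN
  devices:
    - /dev/fuse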
You also need to set the timezone non-interactively in your Dockerfile, because installing the s3fs package pulls in tzdata, which will prompt for a timezone and hang the image build.
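One way to do this in the Dockerfile, assuming a Debian/Ubuntu base image (Etc/UTC is just an example value; pick your own timezone):
# Stop tzdata (pulled in by s3fs) from prompting for a timezone during the build
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=Etc/UTC
RUN apt-get update && apt-get install -y s3fs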
Finally, you will need to add --allow-source-mismatch to the backup command, because the Docker container's hostname changes between runs. E.g.
/usr/bin/duplicity \
--allow-source-mismatch \
--no-encryption \
--s3-european-buckets \
--s3-use-new-style \
--verbosity 4 \
--full-if-older-than ${DAYS_BETWEEN_FULL_BACKUPS} \
$LOCAL_FOLDER_TO_BACKUP \
"s3://${BUCKET_REGION}.amazonaws.com/${BACKUP_BUCKET_NAME}/"
References
- Duplicity encrypted backups to Amazon S3
- Programster - Iterative Remote Backups With Duplicity
- Github - s3fs
- Github - s3fs-fuse issue - fuse: device not found, try 'modprobe fuse' first
First published: 21st August 2020