How to set up automatic backups to Swift object storage with Restic¶
This article describes how to set up backups of data stored on ScienceCloud instances or volumes to the Swift object storage using Restic.
1. Obtain service account¶
To avoid exposing your password, it is recommended that you request a Swift service account for your project (tenant). Henceforth, we will use service_acc as the service account username and myproj as the project name.
2. Configure environment¶
Download the OpenStack RC File v3 for your project using the ScienceCloud web interface. You can find it under Project > Compute > Access & Security > API Access > Download OpenStack RC File v3. For simplicity, rename the file to openrc.
Make the following modifications in the file.
- Change OS_USER_DOMAIN_NAME to internal
- Change OS_USERNAME to the name of the service account
- Comment out the password-prompting read command and the preceding echo command by placing a # sign in front of them
- Change the OS_PASSWORD value to the password of the service account
After the changes are made, the password section of the RC file should look similar to
# With Keystone you pass the keystone password.
# echo "Please enter your OpenStack Password: "
# read -sr OS_PASSWORD_INPUT
export OS_PASSWORD="xxx"
where xxx is the service account password.
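The modifications listed above can be checked mechanically before the file is used in unattended jobs. The following is a minimal sketch, not part of the original procedure: the function name check_openrc is hypothetical, and the patterns assume that the RC file uses plain export lines as in the excerpt above.

```shell
# check_openrc FILE
# Prints one message per detected problem and returns 1 if any was found.
# Hypothetical helper; assumes "export NAME=value" lines as in the RC file excerpt.
check_openrc() {
    rc_file=$1
    rc_problems=0
    # An uncommented "read" would make every unattended run wait for input.
    if grep -Eq '^[[:space:]]*read[[:space:]]' "$rc_file"; then
        echo "password prompt (read) is still active"
        rc_problems=1
    fi
    # The password has to be exported directly in the file.
    if ! grep -Eq '^[[:space:]]*export[[:space:]]+OS_PASSWORD=' "$rc_file"; then
        echo "OS_PASSWORD is not exported"
        rc_problems=1
    fi
    # The user domain of the service account is "internal".
    if ! grep -Eq '^[[:space:]]*export[[:space:]]+OS_USER_DOMAIN_NAME="?internal"?' "$rc_file"; then
        echo "OS_USER_DOMAIN_NAME is not set to internal"
        rc_problems=1
    fi
    return "$rc_problems"
}
```

For example, check_openrc openrc prints nothing when all three conditions hold. The check does not verify that the password itself is correct.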
On the instance that you want to back up, create a directory named ~/restic and place the file there.
3. Install software¶
Henceforth, all the operations are performed on the ScienceCloud instance where you plan to run the backups.
Update the package repositories¶
It is a good idea to upgrade or even dist-upgrade your instance at this point, which may require an instance reboot. However, an update by itself might be sufficient in many cases.
sudo apt update
sudo apt dist-upgrade
sudo reboot
Please make sure to log back in if you rebooted the instance.
Install packages¶
sudo apt install restic mailutils python3-swiftclient
You will be prompted for additional information during the installation of mailutils. Accept Internet site as General type of mail configuration and localhost as System mail name; i.e., press Enter each time.
Test email¶
It is important to make sure that sendmail works properly. Otherwise, you may not receive a notification if a backup fails.
echo "This is a test from `hostname`" | \
mail -s "Test" -r xxx.xxx@uzh.ch xxx.xxx@uzh.ch
In this command, -r xxx.xxx@uzh.ch specifies the return address. Without it, the message will be rejected by the recipient server. That is why the email address appears twice: first as the return address and then as the recipient.
Warning
It is strongly recommended to use your UZH email address. Many external email providers will reject or silently drop email messages that originate from a host on a private network. If you do not see your message in your inbox, it might be in your spam folder.
4. Create Swift container¶
You can find the details on Swift container usage in a different how-to article. Here, we only provide container creation commands for convenience.
Replica-2 container¶
source openrc
swift post mybackup
Note
If a segment container is needed for files larger than 5 GB, Swift will create it automatically.
Container with error correction only¶
To create a container without replication (ec-104), add the -H 'X-Storage-Policy: ec104' flag to your command. In addition, you may want to create a separate container to store the segments of files larger than 5 GB. In most cases, Restic splits files into much smaller pieces. However, if a segment container is needed and you have not created one, Swift will automatically create it as a replica-2 container.
source openrc
swift post mybackup -H 'X-Storage-Policy: ec104'
swift post mybackup_segments -H 'X-Storage-Policy: ec104'
5. Configure repository¶
Generate a strong password (e.g., using a password generator) and paste it into the ~/restic/pass file. Please note that you will need this password to access your backups, so you should save it in your personal password manager.
Important
If you do not save the password locally in your password manager and your instance is corrupted, you will not be able to restore your data.
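If no password generator is at hand, a password can be produced on the instance itself. The sketch below is one possible approach and not part of the original procedure: the function name generate_pass is hypothetical, and it assumes that head, base64, and /dev/urandom are available, as on a typical Ubuntu instance.

```shell
# generate_pass [BYTES]
# Prints a random password built from BYTES (default 32) bytes of kernel randomness,
# encoded as base64 on a single line. Hypothetical helper.
generate_pass() {
    head -c "${1:-32}" /dev/urandom | base64 | tr -d '\n'
    echo
}

# Possible usage (the umask keeps the file readable by its owner only):
#   ( umask 077; generate_pass > ~/restic/pass )
```

With the default of 32 bytes, the resulting password is 44 characters long. Remember to copy it to your password manager as well.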
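The following command initialises a repository in the /data directory of the mybackup container. The same repository location (swift:mybackup:/data) must be used in all subsequent restic commands. Run the commands from the ~/restic directory so that the relative paths openrc and pass resolve.
source openrc
eval `swift auth`
restic init -r swift:mybackup:/data -p pass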
Restic is unable to authenticate using the environment variables from the openrc file. However, it works with the authentication token that is generated by the swift auth command above.
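The swift auth command prints export statements for OS_STORAGE_URL and OS_AUTH_TOKEN, which is why its output is passed to eval. A small guard such as the following sketch can confirm that both variables are present before restic is started; the function name have_swift_token is hypothetical.

```shell
# have_swift_token
# Succeeds if both variables exported by `swift auth` are set and non-empty.
# Hypothetical helper; it does not check whether the token is still valid.
have_swift_token() {
    [ -n "${OS_STORAGE_URL:-}" ] && [ -n "${OS_AUTH_TOKEN:-}" ]
}

# Possible usage:
#   source openrc
#   eval `swift auth`
#   have_swift_token || echo "no Swift token in the environment" >&2
```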
6. Initial backup¶
Depending on how much data and what file types you have, your initial backup may take a considerable amount of time. To give you a rough idea, 778 GiB (34,800 files) were transferred in 7 hours 18 minutes during a test run. However, this speed may vary considerably depending on the data and overall network usage.
The command below creates an initial backup of the /data directory. Since the transfer may require several hours, it is advisable to use nohup so that you can safely disconnect from the session.
nohup sudo bash -c 'source openrc && eval `swift auth` && \
restic backup -r swift:mybackup:/data -p pass /data > initial.log 2>&1' \
>& nohup.out &
Warning
Restic only works with service account authentication tokens, which are only valid for 8 hours. If your initial backup takes longer, it will fail once the token expires. See the recommendations below if you have more than ~750 GB of data.
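To judge whether a backup is likely to finish before the token expires, the transfer time can be estimated from the test run quoted above (778 GiB in 7 hours 18 minutes, roughly 106.6 GiB per hour). The sketch below only performs this arithmetic. The function name estimate_hours is hypothetical, and the rate is an assumption that may not hold for your data or network load.

```shell
# estimate_hours SIZE_GIB
# Prints the expected transfer time in hours, assuming the rate of the test run
# (778 GiB in 7 h 18 min). Hypothetical helper; real throughput will vary.
estimate_hours() {
    LC_ALL=C awk -v size="$1" 'BEGIN {
        rate = 778 / (7 + 18 / 60)      # GiB per hour
        printf "%.1f\n", size / rate
    }'
}
```

For example, estimate_hours 1000 prints 9.4, which is more than the 8-hour token lifetime, so a directory of that size should be split as described in the next section.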
Large directories (> 750 GB)¶
Since an authentication token expires after 8 hours, you will not be able to transfer significantly more than 750 GB at a time. In principle, restic can recover from failures that occur during a backup. However, recovery entails time-consuming error checking that takes progressively longer as the amount of transferred data increases. A better and faster approach is therefore to back up large directories in chunks. Restic allows excluding data based on paths or patterns, so you can start by excluding some subdirectories to keep the total size below ~750 GB, and then rerun the command while progressively removing the exclude parameters.
Suppose you need to back up the /data directory.
sudo du -hsc /data/*
# 500G /data/set1
# 700G /data/set2
# 250G /data/set3
# 900G /data/set4
The total size of /data/set1 and /data/set3 is 750 GB. So, you can start by excluding /data/set2 and /data/set4.
nohup sudo bash -c 'source openrc && eval `swift auth` && \
restic backup -r swift:mybackup:/data -p pass \
--exclude /data/set2 --exclude /data/set4 /data > initial_1.log 2>&1' \
>& nohup.out &
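The choice of directories to exclude in a first run can also be derived from the sizes with a simple greedy pass: add directories in the listed order while they fit into the budget, and exclude the rest. The following sketch illustrates this. The function name plan_excludes is hypothetical, sizes are expected as plain numbers in the same unit as the limit (for example from du -s --block-size=1G with GNU du), and paths containing whitespace are not handled.

```shell
# plan_excludes LIMIT
# Reads "SIZE PATH" lines from standard input and prints an --exclude option
# for every path that does not fit into the remaining budget.
# Hypothetical helper; greedy, so the result is not necessarily optimal.
plan_excludes() {
    awk -v limit="$1" '
        total + $1 <= limit { total += $1; next }   # fits: include in this run
        { print "--exclude " $2 }                   # does not fit: postpone
    '
}

# Possible usage with the sizes from the example:
#   printf "%s\n" "500 /data/set1" "700 /data/set2" "250 /data/set3" "900 /data/set4" |
#       plan_excludes 750
```

With the sizes from the example and a limit of 750, the function prints --exclude /data/set2 and --exclude /data/set4, which matches the first backup run.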
Once this backup is complete, you can proceed by re-running the backup command without the exclusion of /data/set2.
nohup sudo bash -c 'source openrc && eval `swift auth` && \
restic backup -r swift:mybackup:/data -p pass \
--exclude /data/set4 /data > initial_2.log 2>&1' \
>& nohup.out &
However, /data/set4 is too large for a single operation. Hence, it is necessary to exclude some parts of that directory.
sudo du -hsc /data/set4/*
# 200G /data/set4/subset1
# 400G /data/set4/subset2
# 300G /data/set4/subset3
nohup sudo bash -c 'source openrc && eval `swift auth` && \
restic backup -r swift:mybackup:/data -p pass \
--exclude /data/set4/subset2 /data > initial_3.log 2>&1' \
>& nohup.out &
Finally, you can re-run the command without any exclusions.
nohup sudo bash -c 'source openrc && eval `swift auth` && \
restic backup -r swift:mybackup:/data -p pass /data > initial_4.log 2>&1' \
>& nohup.out &
Validation¶
The log file initial.log will show some statistics about the backup. If the operation was successful, there should be a line similar to the one below near the end of the file.
[7:11:23] 100.00% 731.915 GiB / 731.915 GiB 38106 / 38106 items 0 errors ETA 0:00
The log file may also show some broken pipe and EOF errors. Those are typically harmless as long as the total number of errors reported at the end is still 0. Even when no errors have been reported, you should still validate the backup.
source openrc
eval `swift auth`
restic snapshots -r swift:mybackup:/data -p pass
restic check -r swift:mybackup:/data -p pass
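The error count mentioned above can be read from the log without opening it. The sketch below extracts the number from the last "N errors" entry in the file. The function name backup_errors is hypothetical, and the pattern assumes the progress line format of the sample shown above, which may differ between restic versions.

```shell
# backup_errors LOGFILE
# Prints the error count from the last "N errors" entry in a restic log.
# Hypothetical helper; assumes the log format of the sample line above.
backup_errors() {
    grep -Eo '[0-9]+ errors' "$1" | tail -n 1 | cut -d ' ' -f 1
}

# Possible usage:
#   [ "$(backup_errors initial.log)" = "0" ] || echo "initial backup reported errors" >&2
```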
If you have used the file exclusion process to create the initial backup, the check command may report some index inconsistencies and recommend running rebuild-index. Please note that this is a time-consuming operation. A sample command is shown below.
restic rebuild-index -r swift:mybackup:/data -p pass
7. Backup script¶
A backup script performs an incremental backup, prunes the old snapshots, and checks the integrity of the remaining snapshots.
config¶
Place the following variables into a file named config. You will need to adjust the values to suit your environment.
As mentioned before, it is strongly recommended that you use your UZH email address. You can also specify multiple email addresses separated by commas as shown below.
# Email address to use as return address
returnEmail="xxx.xxx@uzh.ch"
# Email addresses where errors should be sent
adminEmails="xxx.xxx@uzh.ch,yyy.yyy@uzh.ch"
# Name of the container
container="mybackup"
# Backup path within the container
snapshotPath="/data"
# Location of the file that contains the restic password
# relative to the root's home directory
resticPassPath="restic/pass"
# Absolute path to the directory that should be backed up
srcPath="/data"
Backup script¶
Save the backup script in the file named backup. The script will keep the last 7 daily snapshots, 4 weekly snapshots, and 12 monthly snapshots. Snapshots share unchanged data; i.e., two snapshots do not necessarily need as much space as two separate copies of the data. Nevertheless, you can reduce space requirements by keeping fewer snapshots.
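#!/bin/bash
# Load the backup settings and the OpenStack credentials. The relative paths
# assume that the script runs from root's home directory (as under cron).
source restic/config
source restic/openrc
# -E: the ERR trap is inherited by functions; -u: unset variables are errors.
set -Eu
# Call onExit on SIGHUP, SIGINT, SIGQUIT, SIGTERM, and whenever a command fails.
trap onExit 1 2 3 15 ERR
function onExit() {
    local exitStatus=${1:-$?}
    if [[ $exitStatus -ne 0 ]]; then
        # Send a notification email and record the failure in the log.
        mail -s "Backup failed on $(hostname)" -r "$returnEmail" "$adminEmails" <<EOL
An error occurred during backup operation at $(hostname).
Backup failed. Please investigate. Exit status: $exitStatus
EOL
        echo "$(date +%Y%m%d-%H%M%S) Backup failed. Exit status: $exitStatus"
    fi
    exit $exitStatus
}
echo "*******"
echo "$(date +%Y%m%d-%H%M%S) Starting backup"
# Obtain a Swift authentication token for restic.
eval `swift auth`
# Create a new snapshot, remove snapshots that fall outside the retention
# policy, and verify the integrity of the repository.
restic backup -r "swift:$container:$snapshotPath" -p "$resticPassPath" "$srcPath"
restic forget -r "swift:$container:$snapshotPath" -p "$resticPassPath" \
    --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --prune
restic check -r "swift:$container:$snapshotPath" -p "$resticPassPath"
echo "$(date +%Y%m%d-%H%M%S) Backup complete"
echo "*******"
onExit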
Set permissions¶
The script needs the execute permission.
chmod 700 backup
8. Move files¶
A good place for all the backup-related files is root's home directory (i.e., /root). This is because we will be running cron jobs as the root user to ensure that the backup script has access to all files in the source directory. You can remove the initial log files and nohup.out if you no longer need them.
rm initial*.log nohup.out
cd ..
sudo mv restic /root
sudo chown -R root:root /root/restic
9. Configure log rotation¶
Save the following definition in /etc/logrotate.d/restic.
/var/log/restic.log {
monthly
missingok
rotate 6
compress
delaycompress
notifempty
create 644 root root
}
If you prefer to rotate weekly, you can change monthly to weekly. You can also change the number of log files to keep by changing the rotate parameter from 6 to your desired value.
10. Configure cron jobs¶
Insert the following entry before the last line in /etc/crontab. Make sure that the last line is still the one with #.
37 02 * * * root /root/restic/backup >> /var/log/restic.log 2>&1
The backup will run daily at 02:37 (by default the instances are using UTC). Please change this to some other value. Otherwise, all users would run their backups at the same time, which may cause resource contention.
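One way to pick a different start time is to derive it from the instance name, so that instances spread out without any coordination. The following sketch is only an illustration. The function name cron_slot is hypothetical, and restricting the hour to the range 0-5 is an assumption about a suitable night-time window.

```shell
# cron_slot NAME
# Prints "MINUTE HOUR" derived from a checksum of NAME (minute 00-59, hour 00-05).
# Hypothetical helper; the night-time hour range is an assumption.
cron_slot() {
    slot_sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    printf '%02d %02d\n' "$(( slot_sum % 60 ))" "$(( slot_sum / 60 % 6 ))"
}

# Possible usage: the output replaces "37 02" in the crontab entry.
#   cron_slot "$(hostname)"
```

The same name always yields the same slot, so the schedule stays stable if the entry has to be recreated.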
We recommend that you check /var/log/restic.log after the first run of the backup script to ensure that everything works correctly. You may want to check the file periodically in case a backup and the error notification email both fail.
Restore¶
There are two main approaches to restoring the data. You can either copy files from the mounted repository or restore a whole or partial snapshot to some location on your system.
Repository mount¶
First, create an empty directory to use as a mount point, for example ~/repository. Then mount the backup repository with restic mount. As with the other restic commands, the Swift authentication token must be available in the session (source the openrc file and run eval `swift auth` first), and you have to specify the repository password. In this case, the password is taken from ~/restic/pass.
mkdir ~/repository
restic mount -r swift:mybackup:/data -p ~/restic/pass ~/repository
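Since the mount point must be empty, it can be worth checking an existing directory before reusing it. A minimal sketch of such a check follows; the function name is_empty_dir is hypothetical.

```shell
# is_empty_dir DIR
# Succeeds only if DIR exists, is a directory, and contains no entries
# (hidden entries included). Hypothetical helper.
is_empty_dir() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Possible usage:
#   is_empty_dir ~/repository || echo "mount point is missing or not empty" >&2
```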
Now you can open a new connection to the instance and list the available snapshots as well as the content of each snapshot to find the files you want to restore.
ls ~/repository/snapshots
ls ~/repository/snapshots/2021-02-07T02:38:05Z/data/project01/input
You can copy the files or directories from that snapshot somewhere else. For example,
mkdir ~/restore
cp -R ~/repository/snapshots/2021-02-07T02:38:05Z/data/project01/input ~/restore
cp ~/repository/snapshots/latest/data/project02/*.tar.gz ~/restore
Once you are done, return to your first session and press Ctrl-c to unmount the repository. You can also remove the mount point if you no longer need it.
Occasionally you may get an unable to umount error, after which you may see a Transport endpoint is not connected error when listing the directory. If that happens, you can force the unmount with the following command. Before running the command, ensure that you are done copying the files and exit the mounted repository directory. Ideally, you should also close the session you used for file copying.
fusermount -uz ~/repository
Restore¶
To list available snapshots, use the following command. As before, we assume that the repository password is saved in ~/restic/pass.
restic snapshots -r swift:mybackup:/data -p ~/restic/pass
# ID Time Host Tags Paths
#--------------------------------------------------------------------------
#ffd704b4 2021-01-31 02:38:12 myinstance /data
#f74204d9 2021-02-07 02:37:45 myinstance /data
Then you can restore the chosen snapshot using its ID.
restic restore -r swift:mybackup:/data -p ~/restic/pass --target ~/restore ffd704b4
Depending on the size of your backup, the full restore might take a lot of time. If you only need to restore a few files, you can use the --include flag. In that case, all other files and directories are ignored. For example, you can restore the /data/project01/input directory with the following command.
restic restore -r swift:mybackup:/data -p ~/restic/pass --target ~/restore \
--include /data/project01/input ffd704b4
Summary¶
The cron job will execute the backup script on a regular basis. If the job fails, an email will be sent to the addresses specified in the configuration file. The output of the job will be saved in /var/log/restic.log and the files will be automatically rotated.
It is recommended to occasionally test the restore procedure. While the software and script make an effort to catch potential errors, it is best practice to confirm from time to time that the backups can actually be restored.
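A restore test can be as simple as restoring a directory and comparing it with the original. The sketch below wraps diff -r for this purpose. The function name verify_restore is hypothetical, and the comparison is only meaningful if the source has not changed since the snapshot was taken.

```shell
# verify_restore SOURCE RESTORED
# Compares two directory trees recursively, prints any differences,
# and returns 1 if the trees differ. Hypothetical helper.
verify_restore() {
    if diff -r "$1" "$2"; then
        echo "restore matches source"
    else
        echo "restore differs from source" >&2
        return 1
    fi
}

# Possible usage after restoring /data/project01/input to ~/restore:
#   verify_restore /data/project01/input ~/restore/data/project01/input
```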