Let's talk about GitHub backups!

June 6, 2018

Microsoft’s announcement that it will acquire GitHub shed some light on how trusting I’ve been of centralized services that offer free plans. After all, you get what you pay for.

Although I have jumped on the Microsoft hate bandwagon a few times, mostly due to their (in my opinion) terrible customer focus, I don’t totally dislike the idea of GitHub being owned by Microsoft. I will save that spiel for another day, though. Today, I want to focus on getting a bit more control over the data I keep on GitHub. After all, GitHub is almost like a resume for developers, and if I were to lose that data, I don’t know what I’d do.

python-github-backup

I began browsing for utilities and services that could back up my account info, repositories, gists, stars, and so on. I came across python-github-backup. Today, I will briefly go over how I use it.

I wanted to set up weekly backups to my NAS (which has over 40TB of free space).

Installation

Let’s begin with the installation. It’s silly simple (using root):

$ sudo pip install github-backup

Or (using root):

$ sudo pip install git+https://github.com/josegonzalez/python-github-backup.git#egg=github-backup

Or (as a regular user):

$ pip install --user github-backup

Usage

Take a look at github-backup --help, or the README in the repo linked above. The list of options is pretty extensive.
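
As a starting point, here is a minimal sketch that backs up only the repositories of a single user, using a subset of the flags from the wrapper script further down. The function name backup_repos_only and the GITHUB_BACKUP_BIN override are made up for illustration and are not part of github-backup; the override only exists so you can dry-run the command with echo before hitting the API.

```shell
#!/bin/bash
# Hypothetical helper: back up only the repositories of one user.
# GITHUB_BACKUP_BIN is an assumed override for dry runs (e.g. set it to "echo");
# when unset, the real github-backup command is used.
backup_repos_only() {
	local user="${1:?usage: backup_repos_only <user> <output-dir>}"
	local dest="${2:?usage: backup_repos_only <user> <output-dir>}"

	"${GITHUB_BACKUP_BIN:-github-backup}" \
		--token "$GH_BACKUP_TOKEN" \
		--repositories \
		--output-directory "$dest" "$user"
}
```

Running GITHUB_BACKUP_BIN=echo backup_repos_only <user> <output-dir> prints the arguments instead of starting a backup, which is a cheap way to check the flags first.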

On a cron you say?

I’ve set up a simple wrapper script, which runs the backup in a temporary directory, then archives the result and places it in the desired location:

#!/bin/bash

USER="${1:?usage: $0 <user> <archive>}"
ARCHIVE_LOC="${2:?usage: $0 <user> <archive>}"

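# Install github-backup if it isn't already on the PATH.
command -v github-backup > /dev/null 2>&1 || pip install github-backup

if [ -z "$GH_BACKUP_TOKEN" ]; then
	echo 'missing $GH_BACKUP_TOKEN...' >&2
	exit 1
fi

# Work in a private temporary directory; bail out if it can't be created.
_tmp=$(mktemp -d -p "/run/user/$(id -u)" -t "github-backup-XXXXX") || exit 1

# Single quotes defer expansion until the trap fires; the inner quotes protect the path.
trap 'rm -rf "$_tmp"' EXIT
echo "using '$_tmp'..."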

github-backup \
	--token "$GH_BACKUP_TOKEN" \
	--starred \
	--followers \
	--following \
	--issues \
	--issue-comments \
	--repositories \
	--wikis \
	--gists \
	--starred-gists \
	--private \
	--output-directory "$_tmp" "$USER"

tar -czvf "$ARCHIVE_LOC" -C "$_tmp" .

The script above is also available here.
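
Since the wrapper ends with a tar call, it can be worth checking that the resulting archive is actually readable before trusting it. The following is a small sketch, not something the original script does; verify_archive is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the file is a readable gzip'd tarball
# with at least one entry in it.
verify_archive() {
	local archive="${1:?usage: verify_archive <archive.tar.gz>}"
	local listing

	# tar -tzf lists the contents without extracting anything and exits
	# non-zero if the archive is corrupt or not gzip-compressed.
	listing=$(tar -tzf "$archive" 2>/dev/null) || return 1

	[ -n "$listing" ]
}
```

It could be called right after the tar step, e.g. verify_archive "$ARCHIVE_LOC" || exit 1, so that a broken archive makes the whole run fail.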

I’ve dropped it in ~/bin/gbackup, but you can place it anywhere. I also have two cronjobs, which run every Sunday, at ~3:00AM and ~3:15AM respectively:

0 3 * * 0 /root/bin/gbackup lrstanley /net/media/archives/github-backups/github-backup-lrstanley.$(date +\%Y.\%m.\%d-\%H.\%M.\%S).tar.gz > /dev/null
15 3 * * 0 find /net/media/archives/github-backups/ -type f -mtime +365 -delete > /dev/null

Notice that second command: it finds any archives more than 365 days old and removes them. That’s around 52 backups a year at roughly 45MB per .tar.gz, or about 2.3GB in total. Not bad!
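
Because find with -delete is unforgiving, it may help to preview what the cleanup job would remove first. This is a sketch under my own assumptions: list_expired_archives is a hypothetical helper that runs the same match as the cron job, only with -print instead of -delete.

```shell
#!/bin/bash
# Hypothetical helper: print the files the cleanup job would delete,
# without deleting anything.
list_expired_archives() {
	local dir="${1:?usage: list_expired_archives <dir> [days]}"
	local days="${2:-365}"

	# -mtime +N matches files whose age, counted in whole 24-hour periods,
	# is greater than N.
	find "$dir" -type f -mtime +"$days" -print
}
```

For example, list_expired_archives /net/media/archives/github-backups/ lists the archives that are due for removal; an empty result means nothing is old enough yet.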

Lastly, the script needs one environment variable to be set: GH_BACKUP_TOKEN, a personal access token which you can generate using this GitHub page.

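You can pass it inline when invoking the wrapper script, e.g. GH_BACKUP_TOKEN=<token> /root/bin/gbackup <user> <archive>, or put export GH_BACKUP_TOKEN=<token> in e.g. ~/.bashrc so that it is set whenever you run the script from a shell. Keep in mind that cron does not read ~/.bashrc, so for the cronjob the variable also has to be made available there (most cron implementations accept a GH_BACKUP_TOKEN=<token> line in the crontab, above the job).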
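
Another option, if you would rather not keep the token in ~/.bashrc or the crontab, is to read it from a file that only your user can read (chmod 600). The sketch below is an assumption on my part and not something github-backup requires; load_token_file is a hypothetical helper, and the file location is whatever you choose.

```shell
#!/bin/bash
# Hypothetical helper: export GH_BACKUP_TOKEN from the first line of a file.
load_token_file() {
	local file="${1:?usage: load_token_file <path>}"

	[ -r "$file" ] || { echo "cannot read token file '$file'" >&2; return 1; }

	# Only the first line is used; read strips leading and trailing whitespace.
	# "|| true" tolerates a file without a trailing newline.
	read -r GH_BACKUP_TOKEN < "$file" || true

	[ -n "$GH_BACKUP_TOKEN" ] || { echo "token file '$file' is empty" >&2; return 1; }
	export GH_BACKUP_TOKEN
}
```

Called near the top of the wrapper script, before the check for $GH_BACKUP_TOKEN, this would let the cronjob run without the token appearing in the crontab.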

Welp, now I have GitHub backups. It’s not amazingly robust, but since GitHub doesn’t have a full-account backup system in place, this is good enough for me for now. Happy committing.
