Liam Stanley
Published 8 years ago
Let's talk about Github backups
Microsoft's announcement that it would acquire GitHub shed some light on how generous I've been with centralized services offering free tiers. After all, you get what you pay for.
Although I've had a tendency to jump on the Microsoft hate bandwagon a few times, due to their (in my opinion) terrible customer focus -- I don't totally dislike the idea of GitHub being owned by Microsoft. I'll save that spiel for another post, though. Today, I want to focus on getting a bit more control over the data I house on GitHub. After all, GitHub is almost like a resume for developers, and if I were to lose that data, I don't know what I'd do.
python-github-backup
I began browsing for utilities/services that could back up my account info, repositories, gists, stars, etc., and came across python-github-backup. Today, I will briefly go over how I use it.
I wanted to set up weekly backups to my NAS (which has over 40TB of free space).
Installation
Let's begin with the installation. It's silly simple (using root):
sudo pip install github-backup
Or (using root):
sudo pip install git+https://github.com/josegonzalez/python-github-backup.git#egg=github-backup
Or (as a regular user):
pip install --user github-backup
Usage
Take a look at github-backup --help, or the README in the repo linked above. It's pretty extensive.
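As a point of reference, a minimal invocation might look something like this -- the flags mirror a subset of what my wrapper script below uses, and lrstanley is just my username, so substitute your own:

```bash
# Hypothetical minimal run: back up only repositories and gists for one
# user, writing everything under ./backup. Assumes a valid personal access
# token is available in $GH_BACKUP_TOKEN.
github-backup \
  --token "$GH_BACKUP_TOKEN" \
  --repositories \
  --gists \
  --output-directory ./backup \
  lrstanley
```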
On a cron you say?
I've set up a simple wrapper script, which runs a backup in a temporary directory, then archives it and places it in a desired location:
#!/bin/bash

USER="${1:?usage: $0 <user> <archive>}"
ARCHIVE_LOC="${2:?usage: $0 <user> <archive>}"

which github-backup > /dev/null 2>&1 || pip install github-backup

if [ -z "$GH_BACKUP_TOKEN" ]; then
    echo 'missing $GH_BACKUP_TOKEN...'
    exit 1
fi

_tmp=$(mktemp -d -p "/run/user/$(id -u)" -t "github-backup-XXXXX")
if [ "$?" != 0 ]; then exit 1; fi

trap 'rm -rf "$_tmp"' EXIT
echo "using '$_tmp'..."

github-backup \
    --token "$GH_BACKUP_TOKEN" \
    --starred \
    --followers \
    --following \
    --issues \
    --issue-comments \
    --repositories \
    --wikis \
    --gists \
    --starred-gists \
    --private \
    --output-directory "$_tmp" "$USER"

tar -czvf "$ARCHIVE_LOC" -C "$_tmp" .
The above script is also available here.
I've dropped it in ~/bin/gbackup, but you can place it anywhere. I also have two cron jobs, which run every Sunday, at ~3:00AM and ~3:15AM respectively (note that % signs must be escaped in crontabs, or cron treats them as newlines):
0 3 * * 0 /root/bin/gbackup lrstanley /net/media/archives/github-backups/github-backup-lrstanley.$(date +\%Y.\%m.\%d-\%H.\%M.\%S).tar.gz > /dev/null
15 3 * * 0 find /net/media/archives/github-backups/ -type f -mtime +365 -delete > /dev/null
Notice that second command -- it finds any archives older than 365 days and removes them. That's around 52 backups across an entire year, at roughly 45MB per .tar.gz'd backup. Not bad!
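On that note, it's worth occasionally spot-checking that an archive is actually readable before relying on it -- tar -t lists contents without extracting. A quick sketch (using a throwaway archive with made-up paths, rather than a real backup):

```bash
#!/bin/bash
# Build a tiny throwaway archive the same way the wrapper does
# (tar -cz from a scratch directory), then list its contents to
# confirm the archive is readable.
scratch=$(mktemp -d)
mkdir -p "$scratch/repositories/example-repo"
echo "hello" > "$scratch/repositories/example-repo/README.md"
tar -czf "$scratch.tar.gz" -C "$scratch" .
listing=$(tar -tzf "$scratch.tar.gz")
echo "$listing"
rm -rf "$scratch" "$scratch.tar.gz"
```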
Lastly, the script needs one environment variable set: GH_BACKUP_TOKEN, which you can generate using this GitHub page. You can invoke the script with it inline, using GH_BACKUP_TOKEN=<token> /root/bin/gbackup, or place it in e.g. ~/.bashrc, using export GH_BACKUP_TOKEN=<token>, where it will be accessible when you run the script.
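Before wiring everything into cron, a quick sanity check that the token is actually visible to the shell can save you from a silently failing job. This is just a sketch -- the VERIFY_TOKEN switch is my own invention, and the optional curl call assumes GitHub's classic "Authorization: token" header:

```bash
#!/bin/bash
# Quick sanity check: is the token visible in this shell?
: "${GH_BACKUP_TOKEN:=}"   # avoid unbound-variable errors under `set -u`
if [ -n "$GH_BACKUP_TOKEN" ]; then
  token_state="set"
else
  token_state="missing"
fi
echo "GH_BACKUP_TOKEN is $token_state"

# Optional online check (needs network access; VERIFY_TOKEN=1 enables it).
if [ "$token_state" = "set" ] && [ "${VERIFY_TOKEN:-0}" = 1 ]; then
  curl -fsS -H "Authorization: token $GH_BACKUP_TOKEN" https://api.github.com/user
fi
```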
Welp, now I have GitHub backups. It's not amazingly robust, but since GitHub doesn't have a full-account backup system in place, it's sufficient for me at this time. Happy committing.