GitLab Backup Made Easy


GitLab In Brief

GitLab is open source software for collaborating on code. It offers Git repository management, code reviews, issue tracking, activity feeds and wikis. Enterprises install GitLab on-premises and connect it to LDAP and Active Directory servers for secure authentication and authorization. A single GitLab server can handle more than 25,000 users, but it is also possible to create a high-availability setup with multiple active servers.

Why Take A Backup?

Data accumulates on your GitLab production server every day, and everything appears to work fine. But a server can crash for many reasons, and by the time you notice, your data and all your repositories may already be gone. Taking backups of your applications is therefore as important as adding new features to them. It is a part of maintenance that most people neglect. So keep your data safe: take backups.

Backups are important, yet some people still skip them because taking them by hand over and over is tedious and frustrating for a developer. Automating the whole process of backing up your GitLab repositories is therefore a good idea.

What's The Smart Way?

  • Take backups daily, weekly and monthly, so that if one fails you still have another.
  • Never keep your backups on the GitLab server itself; if the server crashes, the backups are lost too.
  • Monitor the backup process by sending a confirmation mail when it completes.
  • Taking a backup is not enough; give the restoration process the same attention.
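
To make the daily, weekly and monthly tiers concrete, here is a minimal Ruby sketch of which tiers would be due on a given date. The helper name tiers_for is hypothetical, and the rule it encodes (weekly on Sundays, monthly on the first day of the month) is an assumption for illustration, not something GitLab prescribes.

```ruby
require 'date'

# Hypothetical helper: which backup tiers are due on a given date.
# Assumption: weekly backups run on Sundays, monthly ones on the 1st.
def tiers_for(date)
  tiers = [:daily]
  tiers << :weekly if date.sunday?
  tiers << :monthly if date.day == 1
  tiers
end

puts tiers_for(Date.today).inspect
```

With three overlapping tiers, a broken daily archive still leaves you a weekly or monthly one to fall back on.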

Where To Keep Backup?

There are many options available:

  • S3 - You will need an Amazon AWS (S3) account.
  • CloudFiles - You will need a Rackspace Cloud Files account.
  • Dropbox - You will need a Dropbox account and a Dropbox app.
  • Ninefold - You will need a Ninefold account.
  • RSync - It compares the source and destination backups and transfers only the bytes that changed.
  • FTP
  • SCP
  • SFTP
  • Local

Of these, an S3 bucket on Amazon felt like the simplest, most efficient and safest place to store the backups.

How To Take The Backup?

GitLab already ships a rake task that backs up the database as well as all the repositories. Run it from the GitLab root directory.

$ cd /path/to/your/gitlab/directory
# Installation from source or cookbook
$ bundle exec rake gitlab:backup:create RAILS_ENV=production
# omnibus-gitlab installation
$ sudo gitlab-rake gitlab:backup:create

Just one command and you are done.

How To Store It In S3 Bucket?

Is there a gem for this?

Bingo, the backup gem provides an easy way to store any file in an S3 bucket. It offers a wide range of features, including various database adapters, flexible filesystem archiving, compression, encryption, backup destinations and synchronization. Each option is explained in detail in the corresponding section of the gem's documentation.

To get started, generate a model:

$ mkdir ~/backup_module && cd ~/backup_module
$ backup generate:model --trigger gitlab_backup --storages="s3" --notifiers="mail" --archives
Note: This will also create a config.rb file.

Then configure one model per schedule:

daily_gitlab_backup.rb
Model.new(:daily_gitlab_backup, 'Daily Backup for Gitlab') do

  archive :gitlab_backup do |archive|
    #  archive.use_sudo
    # Pick the most recently changed file in GitLab's backup directory.
    file_name = Dir.glob('absolute/path/to/gitlab/tmp/backups/*').max_by { |f| File.ctime(f) }
    archive.add file_name
  end

  store_with S3 do |s3|
    # Credentials come from the environment, never from this file.
    s3.access_key_id     = ENV['S3_ACCESS_KEY_ID']
    s3.secret_access_key = ENV['S3_SECRET_ACCESS_KEY']
    s3.keep              = 30 # keep the 30 most recent daily backups
    s3.region            = "****"
    s3.bucket            = "****"
    s3.path              = "/daily"
  end

  compress_with Gzip

end

weekly_gitlab_backup.rb
Model.new(:weekly_gitlab_backup, 'Weekly Backup for Gitlab') do

  archive :gitlab_backup do |archive|
    #  archive.use_sudo
    # Pick the most recently changed file in GitLab's backup directory.
    file_name = Dir.glob('absolute/path/to/gitlab/tmp/backups/*').max_by { |f| File.ctime(f) }
    archive.add file_name
  end

  store_with S3 do |s3|
    s3.access_key_id     = ENV['S3_ACCESS_KEY_ID']
    s3.secret_access_key = ENV['S3_SECRET_ACCESS_KEY']
    s3.keep              = 10 # keep the 10 most recent weekly backups
    s3.region            = "****"
    s3.bucket            = "****"
    s3.path              = "/weekly"
  end

  compress_with Gzip

end

monthly_gitlab_backup.rb
Model.new(:monthly_gitlab_backup, 'Monthly Backup for Gitlab') do

  archive :gitlab_backup do |archive|
    #  archive.use_sudo
    # Pick the most recently changed file in GitLab's backup directory.
    file_name = Dir.glob('absolute/path/to/gitlab/tmp/backups/*').max_by { |f| File.ctime(f) }
    archive.add file_name
  end

  store_with S3 do |s3|
    # No s3.keep here, so monthly backups are never rotated out.
    s3.access_key_id     = ENV['S3_ACCESS_KEY_ID']
    s3.secret_access_key = ENV['S3_SECRET_ACCESS_KEY']
    s3.region            = "****"
    s3.bucket            = "****"
    s3.path              = "/monthly"
  end

  compress_with Gzip

end

Note: all the environment variables are set in the config/schedule.rb file.
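
One thing to watch in the models above: if the backup directory is empty, Dir.glob(...).max_by returns nil and archive.add receives nil. Here is a minimal sketch of a guard, assuming you would rather fail loudly. The helper name latest_backup is hypothetical, and unlike the models it matches only *_gitlab_backup.tar files and compares File.mtime instead of File.ctime.

```ruby
# Hypothetical helper: newest GitLab archive in a directory.
# Raises instead of silently returning nil when nothing is found.
def latest_backup(dir)
  candidates = Dir.glob(File.join(dir, '*_gitlab_backup.tar'))
  newest = candidates.max_by { |f| File.mtime(f) }
  raise ArgumentError, "no backup archives found in #{dir}" if newest.nil?
  newest
end

# Inside a model it could be used like this (path is a placeholder):
#   archive.add latest_backup('/absolute/path/to/gitlab/tmp/backups')
```

A failure raised here should surface as a failed backup run rather than as an empty upload, which is usually what you want to be mailed about.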

How To Automate This Whole Process?

The whenever gem does this job very efficiently, so you don't get stuck writing cron jobs by hand. It provides a simple DSL in which you define when a job runs and which command it executes. That's it: run whenever --update-crontab and the gem adds your jobs to the server's crontab, and voila, you are good to go.

schedule.rb
BACKUP_MODULE_PATH = File.absolute_path("")
CONFIG_FILE_PATH = File.absolute_path("gitlab_config.rb").to_s

env :PATH, ENV['PATH']
set :environment, 'production'
env :S3_ACCESS_KEY_ID, "******"
env :S3_SECRET_ACCESS_KEY, "******"
env :MAIL_FROM, "******"
env :MAIL_ADDRESS, "******"
env :MAIL_DOMAIN, "******"
env :MAIL_PASSWORD, "******"

every 1.day, :at => '1:00 am' do
  command "cd /home/git/gitlab && bundle exec rake gitlab:backup:create RAILS_ENV=production"
end

every 1.day, :at => '2:00 am' do
  command "cd #{BACKUP_MODULE_PATH} && backup perform --trigger daily_gitlab_backup --config-file #{CONFIG_FILE_PATH}"
end

every :sunday, :at => "2:20 am" do
  command "cd #{BACKUP_MODULE_PATH} && backup perform --trigger weekly_gitlab_backup --config-file #{CONFIG_FILE_PATH}"
end

every 1.month, :at => '2:40 am' do
  command "cd #{BACKUP_MODULE_PATH} && backup perform --trigger monthly_gitlab_backup --config-file #{CONFIG_FILE_PATH}"
end

This creates a backup file daily at 1:00 am, which is uploaded to the S3 bucket at 2:00 am. The weekly and monthly uploads follow on Sundays at 2:20 am and once a month at 2:40 am.
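
The schedule leaves a one-hour gap between creating the archive and uploading it. If the rake task fails or runs long, the upload job would pick up an older archive. Here is a minimal sketch of a freshness check you could run before uploading. The helper name fresh?, the two-hour limit and the example path are all assumptions for illustration.

```ruby
# Hypothetical guard: true only if the file exists and was modified
# no more than max_age_seconds before `now`.
def fresh?(path, max_age_seconds, now = Time.now)
  File.exist?(path) && (now - File.mtime(path)) <= max_age_seconds
end

# Example with a made-up archive path and a two-hour limit.
archive = '/home/git/gitlab/tmp/backups/1393513186_gitlab_backup.tar'
puts fresh?(archive, 2 * 60 * 60) ? 'upload' : 'skip: archive missing or stale'
```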

The backup archive is saved in backup_path (see config/gitlab.yml). The filename is [TIMESTAMP]_gitlab_backup.tar. This timestamp can be used to restore a specific backup.
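
Since the restore task needs that timestamp, here is a small sketch that extracts it from an archive name by stripping the _gitlab_backup.tar suffix. The helper name backup_id and the example path are hypothetical.

```ruby
BACKUP_SUFFIX = '_gitlab_backup.tar'

# Hypothetical helper: the [TIMESTAMP] part of an archive's file name.
def backup_id(path)
  name = File.basename(path)
  unless name.end_with?(BACKUP_SUFFIX)
    raise ArgumentError, "not a GitLab backup archive: #{name}"
  end
  name[0, name.length - BACKUP_SUFFIX.length]
end

puts backup_id('/tmp/backups/1393513186_gitlab_backup.tar') # prints 1393513186
```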

Finally your directory structure will look like this:

  • ~/backup_module/gitlab_config.rb - contains the configuration settings that will be overridden.
  • ~/backup_module/models/daily_gitlab_backup.rb - contains the daily backup configuration: archive, S3 storage, mail notifier and compressor.
  • ~/backup_module/models/weekly_gitlab_backup.rb - contains the weekly backup configuration: archive, S3 storage, mail notifier and compressor.
  • ~/backup_module/models/monthly_gitlab_backup.rb - contains the monthly backup configuration: archive, S3 storage, mail notifier and compressor.
  • ~/backup_module/config/schedule.rb - contains the backup jobs that will be written to the crontab.

Additional Features

The backup gem can send notification mails about the backup process. It also attaches the log file to the mail if the backup generated warnings or errors, which makes debugging easier. Add the following to your config.rb:

Backup::Logger.configure do
  logfile.enabled   = true
  logfile.log_path  = File.absolute_path("log").to_s
end

Add the following to your model file (gitlab_backup.rb, or each of the daily, weekly and monthly models above):

notify_by Mail do |mail|
  mail.on_success           = true
  mail.on_warning           = true
  mail.on_failure           = true

  mail.from                 = ENV['MAIL_FROM']
  mail.to                   = "******"
  mail.address              = ENV['MAIL_ADDRESS']
  mail.port                 = 587
  mail.domain               = ENV['MAIL_DOMAIN']
  mail.user_name            = ENV['MAIL_FROM']
  mail.password             = ENV['MAIL_PASSWORD']
  mail.authentication       = "****"
  mail.encryption           = :none
end
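
Both the S3 and the mail settings read their values from ENV, and an unset variable simply comes back as nil. Here is a minimal sketch of a check that lists missing variables up front. The helper name missing_vars is hypothetical, and the list of names is taken from the schedule.rb shown earlier.

```ruby
# Variable names as they appear in schedule.rb.
REQUIRED_VARS = %w[
  S3_ACCESS_KEY_ID S3_SECRET_ACCESS_KEY
  MAIL_FROM MAIL_ADDRESS MAIL_DOMAIN MAIL_PASSWORD
].freeze

# Hypothetical helper: names that are unset or blank in the environment.
def missing_vars(env = ENV, required = REQUIRED_VARS)
  required.select { |name| env[name].to_s.strip.empty? }
end

missing = missing_vars
warn "Missing environment variables: #{missing.join(', ')}" unless missing.empty?
```

Failing early like this is easier to debug than a rejected S3 request or an SMTP login error in the middle of the night.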

How To Restore Backup?

Download the archive with the desired timestamp from S3 (or wherever you store your backups) into the gitlab_application_root/tmp/backups directory, then run the following commands.

$ cd /path/to/your/gitlab/directory
# Installation from source or cookbook
$ bundle exec rake gitlab:backup:restore RAILS_ENV=production BACKUP=timestamp_of_backup
# omnibus-gitlab installation
$ sudo gitlab-rake gitlab:backup:restore RAILS_ENV=production BACKUP=timestamp_of_backup
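
If you script the restore, a small Ruby sketch like the one below can assemble the same command lines from a timestamp. The helper name restore_command and its omnibus: keyword are hypothetical. The sketch only builds the string, so you can inspect it before running anything.

```ruby
require 'shellwords'

# Hypothetical helper: builds the restore command shown above.
# Shell-escapes the timestamp so odd characters cannot alter the command.
def restore_command(timestamp, omnibus: false)
  raise ArgumentError, 'timestamp must not be empty' if timestamp.to_s.empty?
  id = Shellwords.escape(timestamp.to_s)
  runner = omnibus ? 'sudo gitlab-rake' : 'bundle exec rake'
  "#{runner} gitlab:backup:restore RAILS_ENV=production BACKUP=#{id}"
end

puts restore_command('1393513186')
puts restore_command('1393513186', omnibus: true)
```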

Hope this has helped and encouraged you to take backups regularly. And remember, backups stop being boring once you automate them ;) Cheers!!


Published in devops | Tagged with devops, gitlab, s3, tutorials, cloud, aws, backup-gem
