Backup2s3

Backup2s3 is a RubyGem used to create backups of Ruby on Rails applications and store them on Amazon S3. This gem is written for Linux and Mac OS X systems and invokes its main backup features via rake tasks. Along with backup creation and automatic rotation, a backup restoration function is provided which allows you to restore your application and database state to that of a previous backup. Backup2s3 is also modularized such that even if your application won’t start, a backup restoration can still be performed. This backup solution is very simple and a good fit for those who need to keep backups of application files as well as the database.

Installation

Installing Backup2s3 takes four steps:

1. Install the Backup2s3 RubyGem using the gem package manager:

gem install backup2s3

2. Add the following dependencies to your application Gemfile:

gem 'backup2s3'
gem 'aws-s3'

3. Run the generator in your application root directory:

rails g backup2s3

4. Finally, change the settings (documented below) in the configuration file located here:

config/backup2s3.yml

Settings

Backup2s3 allows you to specify what you want to back up, how to make the backup, and how many backups to keep. This is all handled inside the backup2s3.yml settings file, which is located in the application’s config folder. Below is an example settings file.

:backups:
  :max_number_of_backups: 5
  :backup_database: true
  :backup_application_folders:
    - public
    - lib

:adapter:
  :type: S3Adapter
  :access_key_id: XXXXXXXXXXXXXXXXXXXX
  :secret_access_key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  :use_ssl: true
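
To show how the symbol-style keys in this file map to Ruby values, here is a minimal sketch that parses the example with Ruby’s standard YAML library. The helper name load_backup_settings is hypothetical, and this is not necessarily how Backup2s3 itself reads the file.

```ruby
require 'yaml'

# Sample settings mirroring the example file above (keys are YAML symbols).
SAMPLE_SETTINGS = <<~YAML
  :backups:
    :max_number_of_backups: 5
    :backup_database: true
    :backup_application_folders:
      - public
      - lib

  :adapter:
    :type: S3Adapter
    :access_key_id: XXXXXXXXXXXXXXXXXXXX
    :secret_access_key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    :use_ssl: true
YAML

# Hypothetical helper: parse settings text into a Hash with symbol keys.
# Symbol must be permitted explicitly because safe_load rejects it by default.
def load_backup_settings(yaml_text)
  YAML.safe_load(yaml_text, permitted_classes: [Symbol])
end

settings = load_backup_settings(SAMPLE_SETTINGS)
puts settings[:backups][:max_number_of_backups]
puts settings[:adapter][:type]
```

Because the keys are written with a leading colon, values are looked up with symbols (settings[:backups]) rather than strings.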

Backups Section (:backups:)
The :backups: configuration settings deal specifically with what to back up (application and/or database) as well as how many backups to keep.

  • max_number_of_backups – This value sets the maximum number of backups to keep for your application on S3. When a newly created backup is uploaded to S3, and the number of backups goes over this number, the oldest backup will automatically be deleted.
  • backup_database – This is a boolean value. Backup2s3 will back up your database if this is set to true and skip the database backup if it is set to false.
  • backup_application_folders – Specifies an array of the top-level application folders that Backup2s3 should tar, gzip and back up to S3.
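
The rotation rule described for max_number_of_backups can be sketched as follows. This is an illustration under stated assumptions, not the gem’s own code: the helper name ids_to_rotate_out is hypothetical, and the last id in the sample list is made up to push the count over the limit.

```ruby
# Hypothetical helper: given the backup ids currently stored and the
# configured maximum, return the ids that rotation would delete (oldest first).
# Ids are numeric timestamp strings such as "20110822182826", so sorting them
# as strings also sorts them chronologically.
def ids_to_rotate_out(backup_ids, max_number_of_backups)
  excess = backup_ids.size - max_number_of_backups
  return [] if excess <= 0

  backup_ids.sort.first(excess)
end

# Six stored backups with a limit of five: the oldest one is selected.
stored = %w[20110822182826 20110814031009 20110809230709
            20110731014224 20110726021908 20110901120000]
puts ids_to_rotate_out(stored, 5).inspect
```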

Adapter Section (:adapter:)
The adapter section specifies how Backup2s3 should connect to Amazon S3. The adapters are the tools that actually connect to S3 and do all the real work of creating, moving and deleting files. The adapter portion of this gem has been modularized such that creating a new adapter is fairly simple. To create a different adapter, you need only implement a handful of required methods. The difference between the S3Adapter and S3cmdAdapter should serve as a good example of this.

  • type – The adapter type you would like to use. There are currently two adapter types supported, S3Adapter and S3cmdAdapter.
    • S3Adapter – This adapter uses the aws-s3 gem and does not provide a percentage complete, failsafe or throttling.
    • S3cmdAdapter – This adapter uses s3cmd, a command-line tool written in Python. This adapter displays upload progress and also has failsafes and throttling in case an upload fails. The failsafe will slow the upload speed and retry the upload in case of failure.
  • access_key_id – Your Amazon S3 access key id.
  • secret_access_key – Your Amazon S3 secret access key.
  • use_ssl – Uses an SSL connection for backups if set to true.
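
To illustrate the idea of a pluggable adapter, here is a small in-memory stand-in. The required methods are not listed in this document, so the method names store, fetch and delete are hypothetical placeholders; check the S3Adapter and S3cmdAdapter sources for the real interface before writing an adapter of your own.

```ruby
# Illustrative only: store, fetch and delete are hypothetical method names,
# not Backup2s3's documented adapter interface.
class InMemoryAdapter
  def initialize
    @files = {}
  end

  # Save file contents under the given name.
  def store(name, data)
    @files[name] = data
  end

  # Return the contents saved under the given name, or nil if absent.
  def fetch(name)
    @files[name]
  end

  # Remove the named file; returns its contents, or nil if absent.
  def delete(name)
    @files.delete(name)
  end
end

adapter = InMemoryAdapter.new
adapter.store('20110822182826-application-database.sql', 'dump contents')
puts adapter.fetch('20110822182826-application-database.sql')
adapter.delete('20110822182826-application-database.sql')
```

Any class that responds to the same set of methods could be swapped in, which is the property that makes an adapter layer like this easy to extend.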

Using Backup2s3 – Create, List, Delete, Restore

Backup2s3 implements rake tasks to create, delete, list and restore backups. Using rake tasks simplifies the entire backup process. In addition, using rake tasks allows for backup automation by using cron. Below are the associated rake tasks and how to use each one.

Creating Backups
A backup can be created automatically by a cron job, as previously stated, or by manually running the rake task. The following rake task takes the optional parameter comment, which records a note or description about the particular backup. This can come in handy, for instance, if you are about to update your application to a new version and want to make a backup before deployment.

rake backup2s3:backup:create comment='optional notes about backup here'

Viewing Available Backups
In order to delete or restore a specific backup, you will need to view a list of the backups that are currently stored on S3 and find the associated backup id. This task will list the date that the backup was created and the backup id. There is also an optional parameter, details, which when set to true will also display the application and database backup filenames as well as any comments added to the backup.

rake backup2s3:backup:list details=true

Running the above command will return output that looks something like this:

--- Backups by Date ---
1. 08-22-2011 18:28:26, ID - 20110822182826
   --- App -> 20110822182826-application-application.tar.gz
   --- DB -> 20110822182826-application-database.sql
   --- Comment -> 
2. 08-14-2011 03:10:09, ID - 20110814031009
   --- App -> 20110814031009-application-application.tar.gz
   --- DB -> 20110814031009-application-database.sql
   --- Comment -> 
3. 08-09-2011 23:07:09, ID - 20110809230709
   --- App -> 20110809230709-application-application.tar.gz
   --- DB -> 20110809230709-application-database.sql
   --- Comment -> 
4. 07-31-2011 01:42:24, ID - 20110731014224
   --- App -> 20110731014224-application-application.tar.gz
   --- DB -> 20110731014224-application-database.sql
   --- Comment -> 
5. 07-26-2011 02:19:08, ID - 20110726021908
   --- App -> 20110726021908-application-application.tar.gz
   --- DB -> 20110726021908-application-database.sql
   --- Comment -> 
-----------------------

Note that none of the backups in the output above have comments.
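
If you want to pick out backup ids from this listing in a script, a sketch like the following may help. It assumes the output format shown above stays the same; the helper name backup_ids is hypothetical and is not part of the gem.

```ruby
# A shortened copy of the listing format shown above.
LISTING = <<~TEXT
  --- Backups by Date ---
  1. 08-22-2011 18:28:26, ID - 20110822182826
     --- App -> 20110822182826-application-application.tar.gz
     --- DB -> 20110822182826-application-database.sql
  2. 08-14-2011 03:10:09, ID - 20110814031009
     --- App -> 20110814031009-application-application.tar.gz
     --- DB -> 20110814031009-application-database.sql
  -----------------------
TEXT

# Hypothetical helper: return the 14-digit ids in the order they are listed.
# Only the "ID - <digits>" part of each entry line is matched, so the
# file names on the detail lines are ignored.
def backup_ids(listing)
  listing.scan(/ID - (\d{14})/).flatten
end

puts backup_ids(LISTING).inspect
```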

Removing Backups
Removing a backup is very simple as well. First, run the list rake task as shown above and find the backup and associated id that you would like to remove. Copy the id and run the delete rake task with the value as the id parameter as follows.

rake backup2s3:backup:delete id='20110809230709'

Running the above task would remove backup #3 from the list example above. Unlike the previous rake tasks, the delete task requires the id parameter.

Restoring Backups
Now on to the main purpose behind keeping backups: restoring your application to its previous state when trouble happens. Let’s say that something deleted several very important rows or even tables from your application’s database. Somehow the images and files in your public folder were also deleted. At this point, you can use the restore task to restore your application’s database and files to the state of a previous backup. Even better, Backup2s3 is modular enough that if your application is failing to start, as long as the lib and config folders are intact, Backup2s3 will still be able to perform a restore. Below is an example of the restore task.

rake backup2s3:backup:restore id='20110822182826'

Running the above task would restore the application to the state of backup #1 from the list example above. The id parameter is required for this task as well.

Amazon S3 Buckets & File Structure

Backup Buckets
The Amazon S3 buckets that hold backups are named so that you can back up multiple applications to a single Amazon S3 account. The names are also set up to make it easy to identify which bucket is associated with which application.

Bucket names are built from two pieces of information about the application you are running: the application’s database name and the name of the machine or server on which the application is running. These two values are concatenated in the following format.

database_name-ON-machine_name

The resulting bucket name is then processed to remove anything that violates the Amazon S3 bucket naming restrictions.
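
The naming scheme can be sketched in a few lines. The helper name bucket_name is hypothetical, and the clean-up rule used here (replace anything other than letters, digits and hyphens with a hyphen, then cap the length at 63 characters) is an assumption for illustration; the gem’s actual processing may differ.

```ruby
require 'socket'

# Hypothetical helper following the database_name-ON-machine_name pattern.
# The clean-up steps are assumed for illustration and are not copied from
# Backup2s3's own processing.
def bucket_name(database_name, machine_name = Socket.gethostname)
  raw = "#{database_name}-ON-#{machine_name}"
  cleaned = raw.gsub(/[^A-Za-z0-9-]/, '-') # swap other characters for hyphens
  cleaned = cleaned.gsub(/-{2,}/, '-')     # collapse runs of hyphens
  cleaned.slice(0, 63)                     # keep within a 63-character limit
end

puts bucket_name('my_app_production', 'web01.example.com')
```

The machine name defaults to the local hostname, so the same application deployed on two servers would end up with two separate buckets.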

Application Backups
The application backups that are stored on S3 are tarred and gzipped prior to upload, resulting in a smaller file and smaller bill from Amazon. The filename for application backups is created using a numeric timestamp (which also serves as the backup id) and the associated application’s database name in the following format.

numeric_timestamp-database_name-application.tar.gz

Database Backups
Database backups are stored in a .sql file and named similarly to the application backups.

numeric_timestamp-database_name-database.sql
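
The two file name formats can be reproduced with a short sketch. The helper names are hypothetical, and the timestamp pattern is inferred from the sample listing earlier in this document, where the backup created on 08-22-2011 18:28:26 has the id 20110822182826.

```ruby
# Hypothetical helper: format a Time as a 14-digit backup id.
# The pattern (year, month, day, hour, minute, second) is inferred from the
# sample listing, not taken from the gem's source.
def backup_id(time)
  time.strftime('%Y%m%d%H%M%S')
end

# Hypothetical helper: build both file names for one backup.
def backup_filenames(time, database_name)
  id = backup_id(time)
  {
    application: "#{id}-#{database_name}-application.tar.gz",
    database: "#{id}-#{database_name}-database.sql"
  }
end

names = backup_filenames(Time.new(2011, 8, 22, 18, 28, 26), 'application')
puts names[:application]
puts names[:database]
```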

Conclusion

Backup2s3 provides an easy-to-use backup solution for Ruby on Rails applications. Aside from the security of knowing that your data is backed up on Amazon S3, the included restore function provides a much-needed tool if a serious problem ever arises.

For more information, leave a comment or contact us. To take a look at the technical workings behind Backup2s3, check out the GitHub repository here: https://github.com/aricwalker/backup2s3

Credits

This project was originally branched from Xavier Shay’s db2s3 project.