Amazon S3 Backup

This is the third tier of a backup system, the last resort if everything else has been destroyed or corrupted. The script can run on a local machine or elsewhere; I chose to run it locally so the credentials never sit on a publicly accessible server. The local machine copies the data from the publicly accessible servers, stores it, then sends it to S3.

The first step is to sign up at Amazon for an S3 account, create a bucket, and create a user. Limit the privileges for the user as much as possible; for this script, the user needs only the PutObject privilege.
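A minimal IAM policy for that user might look something like this sketch (the bucket name matches the example configuration below; adjust the ARN to your own bucket):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::example.com/*"
    }
  ]
}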

The script is written in Ruby. It reads a JSON configuration file which contains all the servers, files, and databases to be backed up.

JSON file syntax:

{
  "email": "user@localhost",
  "servers": {
    "example.com": {
      "login": "username",
      "password": "password",
      "databases": [ { "name": "database_name", "dbuser": "user", "dbpass": "password" } ],
      "files": [ "backup.tgz" ]
    }
  },
  "s3": {
    "bucket": "example.com",
    "username": "user",
    "accesskeyid": "-- S3 Access Key Id --",
    "secretaccesskey": "-- S3 Secret Access Key --"
  }
}

Each server can include multiple databases and files. Be sure to limit the privileges for the database user to SELECT and LOCK TABLES, which makes the account effectively read-only, and be sure to grant it remote access so the backup machine can connect; see the sketch below.
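Something along these lines, assuming MySQL 5.x-style GRANT syntax (the database name, user, password, and the backup machine's hostname are placeholders):

mysql -h example.com -u root -p <<'SQL'
GRANT SELECT, LOCK TABLES ON database_name.* TO 'user'@'backup.host' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;
SQL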

The files are to be placed in a directory where they can be retrieved with wget; in the example above that is http://example.com/backup.tgz. The intent is that these files contain content that is already publicly available. This is NOT a place to put application configuration settings.

Each server will have a hierarchy like this:

example.com
|-- initial.tgz
`-- 20140101093022
    |-- backup.tgz
    `-- database_name.sql.tgz

Create initial.tgz manually: run the tar command at the top of the account, download it to your local machine, then upload it to S3. If you want to send it to S3 directly from the server, that's fine; just be careful never to leave your S3 credentials on the source server.
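A sketch of those steps, assuming a shell account on the server (hostnames and paths are examples):

# On the server, archive everything from the top of the account.
ssh username@example.com 'tar czf /tmp/initial.tgz -C "$HOME" .'
# Pull it down to the local machine...
scp username@example.com:/tmp/initial.tgz servers/example.com/initial.tgz
# ...then upload it to S3 with the same aws-sdk calls the script uses,
# or with any S3 client that has the PutObject credentials.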

This is the backup script. It fetches the files with wget, or with scp when a login and password are configured (which raises its own credential questions), and dumps each database with mysqldump.

Code

#!/usr/bin/env ruby
 
require 'json'
require 'net/smtp'
require 'pathname'
require 'rubygems'
require 'aws-sdk'
# net/ssh and net/scp are only needed for the scp download path.
require 'net/ssh'
require 'net/scp'
 
class ItemStatus
  def initialize(item_name, exit_status, ls_file)
    @item_name, @exit_status, @ls_file = item_name, exit_status, ls_file
  end

  def name
    @item_name
  end

  def error
    @exit_status != 0
  end

  # One line per item: source, exit status, file information.
  def to_s
    "#{@item_name}\t#{@exit_status}\t#{@ls_file}"
  end
end
 
json = File.read('config/.json')
parms = JSON.load(json)
 
if parms["email"].nil? || parms["email"].empty?
  to_email = "user@localhost"
else
  to_email = parms["email"]
end
 
s3 = AWS::S3.new(
  :access_key_id => parms['s3']['accesskeyid'],
  :secret_access_key => parms['s3']['secretaccesskey']
)
 
backup_dir = "servers"
bucket = s3.buckets[parms['s3']['bucket']]
 
parms["servers"].each_pair {|server_name, server|
  puts "Server: #{server_name}"
  backup = Array.new  # status items for this server's run only
  if !server.empty?
    # Timestamp matching the hierarchy above, e.g. 20140101093022.
    date = Time.now.strftime("%Y%m%d%H%M%S")
    dir = backup_dir + "/" + server_name + "/" + date
    `mkdir -p "#{dir}"`
    if $?.exitstatus == 0
      dir_created = true
      if !server["files"].nil? && !server["files"].empty?
        files = server["files"]
        if !server["login"].nil? && !server["password"].nil?
          # A login was configured, so copy the files over scp.
          files.each {|file_name|
            dir_file_name = "#{dir}/#{file_name}"
            Net::SSH.start(server_name, server["login"], :password => server["password"]) do |ssh|
              ssh.scp.download! file_name, dir_file_name
            end
            ls_file = `ls -l "#{dir_file_name}"`
            backup.push(ItemStatus.new(file_name, $?.exitstatus, ls_file))
            bucket.objects[dir_file_name].write(Pathname.new(dir_file_name))
          }
        else
          # No login configured: the files must be publicly reachable over HTTP.
          files.each {|file_name|
            dir_file_name = "#{dir}/#{file_name}"
            `wget -q "http://#{server_name}/#{file_name}" -O "#{dir_file_name}"`
            wget_status = $?.exitstatus
            backup.push(ItemStatus.new(file_name, wget_status, `ls -l "#{dir_file_name}"`))
            bucket.objects[dir_file_name].write(Pathname.new(dir_file_name))
          }
        end
      end
      if !server["databases"].nil? && !server["databases"].empty?
        databases = server["databases"]
        databases.each {|db|
          # Only dump a database when name, user, and password are all present.
          dbvalues = db.values_at("name", "dbuser", "dbpass").delete_if {|v| v.nil? || v.empty?}
          if dbvalues.length == 3
            dir_file_name = "#{dir}/#{db["name"]}.sql"
            `mysqldump -C -u"#{db["dbuser"]}" -p"#{db["dbpass"]}" -h"#{server_name}" "#{db["name"]}" > "#{dir_file_name}"`
            backup.push(ItemStatus.new(db["name"], $?.exitstatus, `ls -l "#{dir_file_name}"`))
            tar_file_name = dir_file_name + ".tgz"
            `tar czf "#{tar_file_name}" "#{dir_file_name}"`
            backup.push(ItemStatus.new(tar_file_name, $?.exitstatus, `ls -l "#{tar_file_name}"`))
            bucket.objects[tar_file_name].write(Pathname.new(tar_file_name))
          end
        }
      end
    else
      dir_created = false
    end
  end
  error = backup.select {|item| item.error}
  if error.length == 0
    # Everything succeeded, so prune local copies more than 8 days old.
    `find "#{backup_dir}" -mindepth 1 -mtime +8 | xargs --no-run-if-empty rm -rf`
  end
  msg = <<END_OF_MESSAGE
To: Me #{to_email}
Subject: #{server_name} backup status
 
END_OF_MESSAGE
 
  if !server.empty?
    if dir_created
      msg = msg + "Created #{dir} okay\n\n"
      if backup.length > 0
        msg = msg + "Files\n"
        backup.each {|v|
          msg = msg + "\t" + v.to_s
        }
        msg = msg + "\nColumns\n\t1. Source\n\t2. Exit Status\n\t3. File Information\n"
      end
    else
      msg = msg + "mkdir #{dir} failed"
    end
  else
    msg = msg + "No backup configuration"
  end
  msg = msg + "\n\n\n"
  begin
    Net::SMTP.start('localhost', 25) do |smtp|
      smtp.send_message msg,'amazon@localhost', to_email
    end
  rescue
    puts "Mail send failed"
  end
 
}

Finally, create a cron job to run the script as needed.
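For example, a crontab entry along these lines runs it every night at 3:30 (the path and script name are examples):

30 3 * * * cd /home/user/backup && ./backup.rb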

It is assumed that version control for the code is handled elsewhere. This backup is for data, with an emergency copy of the code; when the code changes, that copy must be refreshed manually.

A note about leaving passwords in the config file. I understand it is a security issue. That's why this is running on a local machine. Is it completely secure? No. But it isn't on a publicly accessible server either. Could I spend more time making it secure? Absolutely. Am I going to? Probably not.