toglacier

v3.2.0+incompatible
Published: Aug 11, 2017 License: MIT Imports: 15 Imported by: 1

README


Send data to Amazon Glacier service periodically.

What?

Have you ever thought that your server could have some backup in the cloud to mitigate some crazy ransomware infection? Great! Here is a piece of software to help you do that, sending your data periodically to Amazon Glacier. It uses the AWS SDK behind the scenes; all honors go to the Amazon developers.

The program will first add all modified files (compared with the last sync) to a tarball and then, if a secret was defined, it will encrypt the archive. After that it will decide to send it in one shot or use a multipart strategy for larger files. For now we will follow the AWS suggestion and send multipart when the tarball gets bigger than 100MB. When using multipart, each part will have 4MB (except for the last one). The maximum archive size is 40GB (but we can increase this).
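The size rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: planUpload and uploadPlan are hypothetical names that do not come from the toglacier code, and "MB" is taken to mean 1024*1024 bytes.

```go
package main

import "fmt"

const (
	mb = int64(1024 * 1024) // assumption: "MB" means 2^20 bytes

	multipartThreshold = 100 * mb // archives bigger than this go multipart
	partSize           = 4 * mb   // size of every part except the last one
)

// uploadPlan describes how an archive of a given size would be sent.
type uploadPlan struct {
	Multipart bool
	Parts     int64 // 1 for a single-shot upload
}

// planUpload is a hypothetical helper, not part of toglacier's API.
func planUpload(size int64) uploadPlan {
	if size <= multipartThreshold {
		return uploadPlan{Multipart: false, Parts: 1}
	}
	// Round up so the final, smaller part is counted too.
	parts := (size + partSize - 1) / partSize
	return uploadPlan{Multipart: true, Parts: parts}
}

func main() {
	for _, size := range []int64{50 * mb, 100 * mb, 250*mb + 1} {
		fmt.Printf("%d bytes -> %+v\n", size, planUpload(size))
	}
}
```

The 40GB ceiling is presumably a consequence of the part size: Glacier allows at most 10,000 parts per multipart upload, and 10,000 parts of 4MB come to roughly 40GB, so a larger part size would raise the limit.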

Old backups are also removed automatically, so that fewer files are kept in the AWS Glacier service, which saves you some money. Periodically, the tool requests the list of remote backups from AWS to synchronize the local storage.

Some cool features that you will find in this tool:

  • Backup the desired directories periodically;
  • Upload only modified files (small backup parts);
  • Detect ransomware infection (too many modified files);
  • Ignore some files or directories in the backup path;
  • Encrypt backups before sending to the cloud;
  • Automatically download and rebuild backup parts;
  • Old backups are removed periodically to save you some money;
  • List all the versions of a file that was backed up;
  • Smart backup removal, replacing references for incremental backups;
  • Periodic reports sent by e-mail.

Install

To compile and run the program you will need to download the Go compiler, set the $GOPATH, add the $GOPATH/bin to your $PATH and run the following command:

go get -u github.com/rafaeljusto/toglacier/...

If you want to encrypt sensitive parameters and improve security, you should replace the numbers of the slices in the function passwordKey of the encpass_key.go file with your own random numbers, or run the Python script (inside the internal/config package) with the command below. Remember to compile the tool again (go install).

encpass_key_generator.py -w

The program can run as a service/daemon (start command), in which case you should run it in the background. It is also good practice to add it to your system startup (you don't want your backup scheduler to stop working after a reboot).

Usage

The program can be configured with environment variables, a YAML configuration file, or both. You can find an example configuration file at cmd/toglacier/toglacier.yml; the environment variables are listed below:

Environment Variable                     Description
TOGLACIER_AWS_ACCOUNT_ID                 AWS account ID
TOGLACIER_AWS_ACCESS_KEY_ID              AWS access key ID
TOGLACIER_AWS_SECRET_ACCESS_KEY          AWS secret access key
TOGLACIER_AWS_REGION                     AWS region
TOGLACIER_AWS_VAULT_NAME                 AWS vault name
TOGLACIER_PATHS                          Paths to backup (separated by comma)
TOGLACIER_DB_TYPE                        Local backup storage strategy
TOGLACIER_DB_FILE                        Path where we keep track of the backups
TOGLACIER_LOG_FILE                       File where all events are written
TOGLACIER_LOG_LEVEL                      Verbosity of the logger
TOGLACIER_KEEP_BACKUPS                   Number of backups to keep (default 10)
TOGLACIER_BACKUP_SECRET                  Encrypt backups with this secret
TOGLACIER_MODIFY_TOLERANCE               Maximum percentage of modified files
TOGLACIER_IGNORE_PATTERNS                Regexps to ignore files in backup paths
TOGLACIER_SCHEDULER_BACKUP               Backup synchronization periodicity
TOGLACIER_SCHEDULER_REMOVE_OLD_BACKUPS   Remove old backups periodicity
TOGLACIER_SCHEDULER_LIST_REMOTE_BACKUPS  List remote backups periodicity
TOGLACIER_SCHEDULER_SEND_REPORT          Send report periodicity
TOGLACIER_EMAIL_SERVER                   SMTP server address
TOGLACIER_EMAIL_PORT                     SMTP server port
TOGLACIER_EMAIL_USERNAME                 Username for e-mail authentication
TOGLACIER_EMAIL_PASSWORD                 Password for e-mail authentication
TOGLACIER_EMAIL_FROM                     E-mail used when sending the reports
TOGLACIER_EMAIL_TO                       List of e-mails to send the report to
TOGLACIER_EMAIL_FORMAT                   E-mail content format (html or plain)

Most of these values can be retrieved from the AWS Console (My Security Credentials and Glacier Service). The AWS region identifiers are listed in the AWS documentation.

By default the tool prints everything to the standard output. If you also want the output in a log file, define the location of the file with TOGLACIER_LOG_FILE. Even with a log file defined, the messages are still written to the standard output. You can set the verbosity with the TOGLACIER_LOG_LEVEL parameter, which accepts the values debug, info, warning, error, fatal or panic. By default the error log level is used.

There are some commands in the tool to manage the backups:

  • sync: execute the backup task now
  • get: retrieve a backup from AWS Glacier service
  • list or ls: list the current backups in the local storage or remotely
  • remove or rm: remove a backup from AWS Glacier service
  • start: initialize the scheduler (will block forever)
  • report: test report notification
  • encrypt or enc: encrypt a password or secret to improve security

You can improve the security by encrypting the values (use encrypt command) of the variables TOGLACIER_AWS_ACCOUNT_ID, TOGLACIER_AWS_ACCESS_KEY_ID, TOGLACIER_AWS_SECRET_ACCESS_KEY, TOGLACIER_BACKUP_SECRET and TOGLACIER_EMAIL_PASSWORD, or the respective variables in the configuration file. The tool will detect an encrypted value when it starts with the label encrypted:.
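Detecting the label is a plain prefix check. The sketch below shows only that step; splitEncrypted is a hypothetical helper, and the actual decryption performed by the tool is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// encryptedLabel is the prefix described in the text above.
const encryptedLabel = "encrypted:"

// splitEncrypted reports whether value carries the label and returns the
// payload that follows it. Decrypting the payload is out of scope.
func splitEncrypted(value string) (payload string, encrypted bool) {
	if strings.HasPrefix(value, encryptedLabel) {
		return strings.TrimPrefix(value, encryptedLabel), true
	}
	return value, false
}

func main() {
	for _, v := range []string{"encrypted:i9dw0HZPOzNiFgtEtrr0tiY0W+YYlA==", "plain-secret"} {
		payload, enc := splitEncrypted(v)
		fmt.Printf("encrypted=%v payload=%s\n", enc, payload)
	}
}
```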

To keep track of the backups locally you can choose boltdb (BoltDB) or auditfile in the TOGLACIER_DB_TYPE variable. By default boltdb is used. The audit file is a human readable, technology free solution, and its format is defined below. It's a good idea to periodically copy the audit file or the BoltDB file somewhere else, so that if you lose your server you can recover the files faster from AWS Glacier (you don't need to wait for the inventory). If you change your mind later about which local storage format you want, you can use the toglacier-storage program to convert it.

[datetime] [vaultName] [archiveID] [checksum] [size]
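A line in this format could be parsed roughly as follows. Several details are assumptions, since the format line does not specify them: the brackets are treated as placeholders rather than literal characters, fields are assumed to be separated by whitespace, the datetime is assumed to be RFC 3339, and the size is assumed to be a byte count. The auditEntry type, the parseAuditLine function and the sample values are made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// auditEntry mirrors the five fields of an audit file line (hypothetical type).
type auditEntry struct {
	CreatedAt time.Time
	VaultName string
	ArchiveID string
	Checksum  string
	Size      int64
}

// parseAuditLine splits one line into its five whitespace-separated fields.
func parseAuditLine(line string) (auditEntry, error) {
	f := strings.Fields(line)
	if len(f) != 5 {
		return auditEntry{}, fmt.Errorf("expected 5 fields, got %d", len(f))
	}
	createdAt, err := time.Parse(time.RFC3339, f[0]) // assumed datetime layout
	if err != nil {
		return auditEntry{}, fmt.Errorf("bad datetime: %v", err)
	}
	size, err := strconv.ParseInt(f[4], 10, 64)
	if err != nil {
		return auditEntry{}, fmt.Errorf("bad size: %v", err)
	}
	return auditEntry{CreatedAt: createdAt, VaultName: f[1], ArchiveID: f[2], Checksum: f[3], Size: size}, nil
}

func main() {
	// Made-up sample line, not real tool output.
	entry, err := parseAuditLine("2017-08-11T10:00:00Z backup AbCd1234 e3b0c442 1024")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", entry)
}
```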

When running the scheduler (start command), the tool performs the actions below with the periodicity defined in the configuration. If a periodicity is not provided, a default value is used.

  • backup the files and folders;
  • remove old backups (save storage and money);
  • synchronize the local storage;
  • report all the scheduler occurrences by e-mail.
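The periodicity values used in the sample script in this README, such as "0 0 1 * * FRI", have six fields. The sketch below labels them under the assumption that they follow the common cron layout with a leading seconds field; the document itself does not state the field order, and describeSchedule is a hypothetical helper, not the scheduler the tool uses.

```go
package main

import (
	"fmt"
	"strings"
)

// fieldNames is the ASSUMED order of the six schedule fields.
var fieldNames = []string{"second", "minute", "hour", "day of month", "month", "day of week"}

// describeSchedule maps each field name to the raw value found in expr.
// It does not validate or interpret the values.
func describeSchedule(expr string) (map[string]string, error) {
	parts := strings.Fields(expr)
	if len(parts) != len(fieldNames) {
		return nil, fmt.Errorf("expected %d fields, got %d", len(fieldNames), len(parts))
	}
	out := make(map[string]string, len(parts))
	for i, name := range fieldNames {
		out[name] = parts[i]
	}
	return out, nil
}

func main() {
	fields, err := describeSchedule("0 0 1 * * FRI")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, name := range fieldNames {
		fmt.Printf("%-12s %s\n", name, fields[name])
	}
}
```

Read this way, "0 0 1 * * FRI" would mean every Friday at 01:00:00.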

A simple shell script that could help you running the program in Unix environments:

#!/bin/bash

TOGLACIER_AWS_ACCOUNT_ID="encrypted:DueEGILYe8OoEp49Qt7Gymms2sPuk5weSPiG6w==" \
TOGLACIER_AWS_ACCESS_KEY_ID="encrypted:XesW4TPKzT3Cgw1SCXeMB9Pb2TssRPCdM4mrPwlf4zWpzSZQ" \
TOGLACIER_AWS_SECRET_ACCESS_KEY="encrypted:hHHZXW+Uuj+efOA7NR4QDAZh6tzLqoHFaUHkg/Yw1GE/3sJBi+4cn81LhR8OSVhNwv1rI6BR4fA=" \
TOGLACIER_AWS_REGION="us-east-1" \
TOGLACIER_AWS_VAULT_NAME="backup" \
TOGLACIER_PATHS="/usr/local/important-files-1,/usr/local/important-files-2" \
TOGLACIER_DB_TYPE="boltdb" \
TOGLACIER_DB_FILE="/var/log/toglacier/toglacier.db" \
TOGLACIER_LOG_FILE="/var/log/toglacier/toglacier.log" \
TOGLACIER_LOG_LEVEL="error" \
TOGLACIER_KEEP_BACKUPS="10" \
TOGLACIER_BACKUP_SECRET="encrypted:/lFK9sxAXAL8CuM1GYwGsdj4UJQYEQ==" \
TOGLACIER_MODIFY_TOLERANCE="90%" \
TOGLACIER_IGNORE_PATTERNS="^.*\~\$.*$" \
TOGLACIER_SCHEDULER_BACKUP="0 0 0 * * *" \
TOGLACIER_SCHEDULER_REMOVE_OLD_BACKUPS="0 0 1 * * FRI" \
TOGLACIER_SCHEDULER_LIST_REMOTE_BACKUPS="0 0 12 1 * *" \
TOGLACIER_SCHEDULER_SEND_REPORT="0 0 6 * * FRI" \
TOGLACIER_EMAIL_SERVER="smtp.example.com" \
TOGLACIER_EMAIL_PORT="587" \
TOGLACIER_EMAIL_USERNAME="user@example.com" \
TOGLACIER_EMAIL_PASSWORD="encrypted:i9dw0HZPOzNiFgtEtrr0tiY0W+YYlA==" \
TOGLACIER_EMAIL_FROM="user@example.com" \
TOGLACIER_EMAIL_TO="report1@example.com,report2@example.com" \
TOGLACIER_EMAIL_FORMAT="html" \
toglacier "$@"

With that you can just run the following command to start the scheduler:

./toglacier.sh start

Just remember to grant write permission on the location where the stdout/stderr and audit files are going to be written (/var/log/toglacier).

Deployment

For developers that want to build a package, we already have 2 scripts to make your life easier. As Go can do some cross-compilation, you can build the desired package from any OS or architecture.

Debian

To build a Debian package you will need the Effing Package Management tool. Then just run the script with the desired version and release of the program:

./package-deb.sh <version>-<release>

FreeBSD

You can also build a package for the FreeBSD pkgng repository. No external tools needed here to build the package.

./package-txz.sh <version>-<release>

Windows

To make your life easier you can use the tool NSSM to build a Windows service to run the toglacier tool in background. The following commands would install the service:

c:\> nssm.exe install toglacier

c:\> nssm.exe start toglacier

Documentation

Overview

Package toglacier has all the functions to manage your backups in the cloud.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ErrorEqual

func ErrorEqual(first, second error) bool

ErrorEqual compares two Error objects. This is useful to compare down to the low level errors.

Types

type EmailInfo

type EmailInfo struct {
	Sender   EmailSender
	Server   string
	Port     int
	Username string
	Password string
	From     string
	To       []string
	Format   report.Format
}

EmailInfo stores all necessary information to send an e-mail.

type EmailSender

type EmailSender interface {
	SendMail(addr string, a smtp.Auth, from string, to []string, msg []byte) error
}

EmailSender is an e-mail API that makes it easy to mock the smtp.SendMail function.

type EmailSenderFunc

type EmailSenderFunc func(addr string, a smtp.Auth, from string, to []string, msg []byte) error

EmailSenderFunc is a helper type to quickly create an implementation of the EmailSender interface.

func (EmailSenderFunc) SendMail

func (r EmailSenderFunc) SendMail(addr string, a smtp.Auth, from string, to []string, msg []byte) error

SendMail sends the e-mail.

type Error

type Error struct {
	Paths []string
	Code  ErrorCode
	Err   error
}

Error stores the details of a problem that occurred while executing high level commands from toglacier.

func (Error) Error

func (e Error) Error() string

Error returns the error in a human readable format.

func (Error) String

func (e Error) String() string

String translates the error to a human readable text.

type ErrorCode

type ErrorCode string

ErrorCode stores the error type that occurred while processing commands from toglacier.

const (
	// ErrorCodeModifyTolerance error when too many files were modified between
	// backups. This is an alert for ransomware infection.
	ErrorCodeModifyTolerance ErrorCode = "modify-tolerance"
)

func (ErrorCode) String

func (e ErrorCode) String() string

String translates the error code to a human readable text.

type ToGlacier

type ToGlacier struct {
	Context context.Context
	Archive archive.Archive
	Envelop archive.Envelop
	Cloud   cloud.Cloud
	Storage storage.Storage
	Logger  log.Logger
}

ToGlacier manages backups in the cloud.

func (ToGlacier) Backup

func (t ToGlacier) Backup(backupPaths []string, backupSecret string, modifyTolerance float64, ignorePatterns []*regexp.Regexp) error

Backup creates an archive and sends it to the cloud. Optionally, it encrypts the backup with the backupSecret password; if you leave it blank, no encryption is performed. There is also an option to stop the backup if too many files were modified (ransomware detection): modifyTolerance is the percentage (0 - 100) of modified files that is tolerated. If there is no need to keep track of the modified files, set modifyTolerance to 0 or 100. You can also ignore some files or directories in the backup paths using regular expressions in the ignorePatterns parameter.
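The tolerance rule can be sketched as a simple percentage comparison. This is an illustration, not the package's code: exceedsTolerance is a hypothetical helper, and whether the real check uses "greater than" or "greater than or equal" is an assumption.

```go
package main

import "fmt"

// exceedsTolerance reports whether the share of modified files is above the
// tolerated percentage. A tolerance of 0 or 100 disables the check, as
// described for the modifyTolerance parameter.
func exceedsTolerance(modified, total int, tolerance float64) bool {
	if tolerance <= 0 || tolerance >= 100 || total == 0 {
		return false // check disabled, or nothing to compare against
	}
	percent := float64(modified) * 100 / float64(total)
	return percent > tolerance // assumption: strictly greater than
}

func main() {
	fmt.Println(exceedsTolerance(95, 100, 90))  // 95% modified, 90% tolerated
	fmt.Println(exceedsTolerance(50, 100, 90))  // 50% modified, 90% tolerated
	fmt.Println(exceedsTolerance(100, 100, 0))  // check disabled
}
```

When the check trips, the Error returned by Backup would be expected to carry the ErrorCodeModifyTolerance code documented above.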

func (ToGlacier) ListBackups

func (t ToGlacier) ListBackups(remote bool) (storage.Backups, error)

ListBackups shows the current backups. The remote flag selects between listing the backups tracked locally and retrieving the cloud inventory.

func (ToGlacier) RemoveBackups

func (t ToGlacier) RemoveBackups(ids ...string) error

RemoveBackups deletes the backups identified by ids from the cloud and from the local storage. It also tries to replace or remove references to the removed backup in other backups. When a reference can be replaced, it tries to use the file version right before the removed backup date.

func (ToGlacier) RemoveOldBackups

func (t ToGlacier) RemoveOldBackups(keepBackups int) error

RemoveOldBackups deletes old backups from the cloud. This optimizes the cloud space usage, as very old backups are no longer used.

func (ToGlacier) RetrieveBackup

func (t ToGlacier) RetrieveBackup(id, backupSecret string, skipUnmodified bool) error

RetrieveBackup recovers a specific backup from the cloud. If the backup is encrypted, it is decrypted when the backupSecret is provided. It is also possible to avoid downloading backups that contain only unmodified files with the skipUnmodified flag.

func (ToGlacier) SendReport

func (t ToGlacier) SendReport(emailInfo EmailInfo) error

SendReport sends information about the actions performed by this tool via e-mail to an administrator.

Directories

Path Synopsis
cmd
internal
archive
Package archive builds the backup archive.
cloud
Package cloud manages the backup in a specific cloud.
config
Package config stores all necessary configuration parameters for the project.
log
Package log defines an interface so that the library is able to log what is happening at each stage.
report
Package report builds a text with all actions performed by the tool.
storage
Package storage keeps track of the uploaded backups.
