Monitoring and managing your backup system

Last time we set up a sturdy backup system; now we will look at how we monitor backup sets. We need to verify that backup sets are properly cleaned up, which is handled by a delete policy, and that they remain consistent, which is checked by a consistency policy.

A backup set can consist of several file sets. A file set is a collection of backup files, residing under the same source directory of a backup set.

The following YAML config shows an example of backup sets and file sets:

backup-set-configs:
- name: Mikrotik Backups
  uri: /volume1/backupftp/mikrotik_backup
  type: DISK
  file-set:
    - name: fe-prodnet01 export
      filterPattern: '.*fe-prodnet01-.*\.rsc'
    - name: fe-prodnet11 backup
      filterPattern: '.*fe-prodnet11.*\.backup'
- name: Exchange Backups
  uri: /volume1/pex/backups
  type: DISK
  file-set:
    - name: Exchange psts
      filterPattern: '.*\.pst'
      groupByPattern: '.*\/backups\/(\d{4}-\d{2}-\d{2})\/'
      groupByPatternHasDatePattern: 'yyyy-MM-dd'
      deletePolicy:
        deleteEmptyDirectories: true
- name: Proxmox Backups
  uri: /volume1/proxmox/dump
  type: DISK
  file-set:
    - name: QEMU backups
      filterPattern: '.*vzdump-qemu.*\.vma\.lzo'
      groupByPattern: 'proxmox/dump/vzdump-qemu-(\d+)-'
      consistencyPolicy:
        numberOfBackupsRequired: 3
    - name: LXC backups
      filterPattern: '.*vzdump-lxc.*\.tar\.lzo'
      groupByPattern: 'proxmox/dump/vzdump-lxc-(\d+)-'
      consistencyPolicy:
        numberOfBackupsRequired: 3

The first backup set resides under the /volume1/backupftp/mikrotik_backup directory and contains two sets of files. You will mostly configure it this way when the files of several servers end up in the same directory. Another option is to group by server name (or any other identifier), as you will see in the third example.

A file set has a name, which is just a logical label for the GUI, and a filterPattern. This pattern filters all matching files under the path of the backup set, no matter how deep the directory tree is.

The delete policy and consistency policy will be applied to the files, ordered by modification date (on disk), in descending order.
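
To make this concrete, here is a minimal sketch of how such filtering and ordering could be done with plain Java NIO. The class and method names are our own illustrative assumptions, not the actual implementation:

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch: collect all files under a backup set root whose full path
// matches the filterPattern, ordered by modification date, newest first.
class FileSetScanner {

    static List<Path> matchingFiles(Path root, String filterPattern) throws IOException {
        Pattern pattern = Pattern.compile(filterPattern);
        try (Stream<Path> files = Files.walk(root)) { // recurses arbitrarily deep
            return files
                .filter(Files::isRegularFile)
                .filter(p -> pattern.matcher(p.toString()).matches())
                .sorted(Comparator.comparing(FileSetScanner::lastModified).reversed())
                .collect(Collectors.toList());
        }
    }

    private static FileTime lastModified(Path p) {
        try {
            return Files.getLastModifiedTime(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}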

The second backup set is for Exchange backups. Note how we use a groupByPattern in this example: it groups all the file names filtered by filterPattern. In this case the groupByPattern is also a date pattern, which is specified by groupByPatternHasDatePattern.

We end up with file sets grouped by date, following the specified date pattern, and the delete and consistency policies are applied to the matched files, ordered by the grouped dates, in descending order.
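
A grouping step along these lines could look like the following sketch, assuming a hypothetical FileSetGrouper helper:

import java.nio.file.Path;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: group matched files by the first capture group of the
// groupByPattern, parsing the group as a date so the resulting file
// sets can be ordered chronologically, newest first.
class FileSetGrouper {

    static SortedMap<LocalDate, List<Path>> groupByDate(List<Path> files,
                                                        String groupByPattern,
                                                        String datePattern) {
        Pattern pattern = Pattern.compile(groupByPattern);
        DateTimeFormatter format = DateTimeFormatter.ofPattern(datePattern);
        SortedMap<LocalDate, List<Path>> groups = new TreeMap<>(Comparator.<LocalDate>reverseOrder());
        for (Path file : files) {
            Matcher matcher = pattern.matcher(file.toString());
            if (matcher.find()) {
                LocalDate key = LocalDate.parse(matcher.group(1), format);
                groups.computeIfAbsent(key, k -> new ArrayList<>()).add(file);
            }
        }
        return groups;
    }
}

For the Exchange example above, this would yield one file set per backup day.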

The third backup set, for Proxmox backups, has files stored in the “dump” directory and mixes two types of backups. The QEMU and LXC backups are split into two file sets and grouped by the VMID (Virtual Machine Identifier), captured by the (\d+) group. Since the grouped value is numeric rather than a date, the delete and consistency policies are applied to the matched files, ordered by modification date (on disk), in descending order.

Note that we don’t always specify a deletePolicy or consistencyPolicy, because sensible defaults are used for both. Both policies execute for each backup set and for each file set in it.

The deletePolicy has two configuration settings:

  • deleteEmptyDirectories: Disabled by default, this setting is useful when you have a groupByPattern that is a date value. When the retention policy is exceeded, all files in the directory are removed, leaving you with an empty “date” directory. In this case, you can enable deleteEmptyDirectories. The directory will only be cleaned up if it is indeed empty (just in case some other log files are lingering around).
  • deleteNumberOfBackupsLargerThan: A value of 30 by default. With daily backups this amounts to roughly a 30-day retention; with weekly backups it would represent 30 weeks. You can change this value as you wish, irrespective of the number of days: it simply represents how many file sets are kept on disk. A sketch of how both settings could be enforced follows this list.
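
The following sketch shows how these two settings might be applied to groups that are already ordered newest first; it is an illustration under our own naming assumptions, not the production code:

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.SortedMap;

// Sketch: keep the newest deleteNumberOfBackupsLargerThan groups and
// delete everything older; 'groups' must be ordered newest first.
class DeletePolicyRunner {

    static <K> void apply(SortedMap<K, List<Path>> groups,
                          int deleteNumberOfBackupsLargerThan,
                          boolean deleteEmptyDirectories) throws IOException {
        int kept = 0;
        for (List<Path> group : groups.values()) {
            if (++kept <= deleteNumberOfBackupsLargerThan) {
                continue; // still within the retention window
            }
            for (Path file : group) {
                Path dir = file.getParent();
                Files.deleteIfExists(file);
                // Only remove the parent directory if it is truly empty,
                // in case other files are still lingering around.
                if (deleteEmptyDirectories && isEmpty(dir)) {
                    Files.delete(dir);
                }
            }
        }
    }

    private static boolean isEmpty(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }
}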

The consistencyPolicy has three configuration knobs:

  • numberOfBackupsRequired: A value of 7 by default. Even though we retain 30 file sets in the deletePolicy, only 7 file sets are required for the consistency check to pass. With daily backups, this means we need at least 7 days of backups for the file set to be consistent.
  • minimumFileSizeRequired: A value of 1 by default. This means that each file in the file set must be at least 1 byte, which ensures there is at least something in the file. You can set it higher, or set it to 0 to disable the check.
  • numberOfDaysOfBackupsRequired: A value of 7 by default. This checks that the most recent file in the file set (ordered by date or modification time, in descending order) is no older than 7 days. A sketch combining the three checks follows this list.
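
A sketch of the three checks combined, again with hypothetical names and newest-first ordering assumed:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;

// Sketch: the three consistency checks; 'files' must be ordered by
// modification date, newest first.
class ConsistencyPolicyChecker {

    static boolean isConsistent(List<Path> files,
                                int numberOfBackupsRequired,
                                long minimumFileSizeRequired,
                                int numberOfDaysOfBackupsRequired) throws IOException {
        if (files.size() < numberOfBackupsRequired) {
            return false; // not enough backups present
        }
        for (Path file : files) {
            if (Files.size(file) < minimumFileSizeRequired) {
                return false; // empty (or too small) backup file
            }
        }
        Instant newest = Files.getLastModifiedTime(files.get(0)).toInstant();
        Instant cutoff = Instant.now().minus(numberOfDaysOfBackupsRequired, ChronoUnit.DAYS);
        return newest.isAfter(cutoff); // most recent backup is recent enough
    }
}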

Combining all the settings makes sure that:

  1. The file sets contain files that are recent enough.
  2. At least something is written in the files of the file set.
  3. We have a set of files that span a certain period, and this period does not exceed or interfere with the delete policy.

If any check fails, the file set fails; if a file set fails, the backup set fails, and consequently the global status fails as well.

The implementations of the consistency policy and the delete policy both extend the same abstract class, AbstractFileVisitor, which in turn extends SimpleFileVisitor<Path>. This FileVisitor walks all subdirectories under the URI of the backup set twice: first executing the delete policy and afterwards the consistency policy.

The AbstractFileVisitor will then filter all files in all subdirectories matching the filters and place them in map structures.

To finalize the process, the map structures are looped and files are deleted and validated according to the policy rules.
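
We have not open-sourced the code (yet), so the following is only a condensed sketch of what such a visitor could look like; everything beyond the AbstractFileVisitor and SimpleFileVisitor<Path> names mentioned above is our own assumption:

import java.nio.file.FileVisitResult;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Condensed sketch: collect files matching each file set's filterPattern
// into a map keyed by file set name, for the policies to process later.
abstract class AbstractFileVisitor extends SimpleFileVisitor<Path> {

    private final Map<String, Pattern> fileSetFilters; // file set name -> filterPattern
    protected final Map<String, List<Path>> matches = new HashMap<>();

    protected AbstractFileVisitor(Map<String, Pattern> fileSetFilters) {
        this.fileSetFilters = fileSetFilters;
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        fileSetFilters.forEach((name, pattern) -> {
            if (pattern.matcher(file.toString()).matches()) {
                matches.computeIfAbsent(name, k -> new ArrayList<>()).add(file);
            }
        });
        return FileVisitResult.CONTINUE;
    }

    // Subclasses (the delete and consistency policies) decide what happens
    // to the collected files once the walk is complete.
    protected abstract void applyPolicy();
}

A call such as Files.walkFileTree(root, visitor) would then be made twice per backup set, matching the two passes described above.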

Are you curious about our implementation? Do you wish to fork and adapt our code? Shout out in the comments and we can talk about open-sourcing our solution.

Improvements

Improvements can always be made; here is a list of ideas we have in mind:

  • Add MD5 checksums on the source, before the backup goes over the wire, and verify those checksums in our application to make sure the backup remains consistent over time (a sketch of the verification side follows this list).
  • Parse archives (.zip, .tar, .gz) in memory in our application, walk the file entries, and see if we reach the end of the archive; this should rule out corrupt .zip files.
  • Parse mailbox backups and walk the entries to see if the archive is not corrupt.
  • Send database backups to an infrastructure which checks if the database can be restored.
  • Add the ability to check backups remotely (e.g. login via SSH to a remote server and check whether backups are also available and consistent there), useful for some offsite backup scenarios.
  • Add the ability to check backups via an API (e.g. for the status of AWS backups or Proxmox backups that are hosted on a third party system).
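
For the first item, the verification side could be as simple as the following sketch; the ChecksumVerifier helper is hypothetical:

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch: compute the MD5 checksum of a backup file, to be compared
// against the checksum recorded on the source before transfer.
class ChecksumVerifier {

    static String md5(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
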
Published on Java Code Geeks with permission by Marc Vanbrabant, partner at our JCG program. See the original article here: Monitoring and managing your backup system
