Unix Filesystem Backups at ITC (this page last updated 2008/01/30)


Software Design of the ITC Backups

Admin

     The administrative component of the ITC backups schedules all backups and associated processes. The relatively simple watcher process restarts mtr if the latter exits; this is necessary not only when mtr aborts on an error, but also when mtr purposely exits in order to rotate its current log file. Mtr starts the following processes. One tmc process is started for each target machine associated with an admin, to handle requests directed to the target. A pmg process is started to assign port numbers to pull clients. A ptm process is started to periodically contact each target machine for an updated list of available disk space for each filesystem on the target. An expnc process is started once a day to remove expired backups for filesystems which are not currently configured to receive new backups (not in the backup cycle). Mtr periodically scans configuration files for filesystems to receive backups, and appropriately starts mfp processes to run backups, create NetApp snapshots, or rebuild client machines' readme files.
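
     The watcher behaviour described above (restart mtr whenever it exits, whatever the reason) can be illustrated with a minimal Python sketch. This is not the ITC watcher itself: the function name watch, the max_restarts and delay_seconds parameters, and the stand-in child command are all assumptions made for the illustration. The restart limit exists only so that the demonstration terminates.

```python
import subprocess
import sys
import time


def watch(command, max_restarts=None, delay_seconds=0.0):
    """Run `command` and start it again every time it exits.

    Returns the list of exit codes observed. With max_restarts=None the
    loop never ends, which is how a real watcher would behave; a finite
    value (hypothetical, for demonstration only) stops after the initial
    run plus that many restarts.
    """
    exit_codes = []
    while max_restarts is None or len(exit_codes) <= max_restarts:
        # A deliberate exit (e.g. for log rotation) and an abort are
        # treated the same way: the child is simply started again.
        result = subprocess.run(command)
        exit_codes.append(result.returncode)
        time.sleep(delay_seconds)
    return exit_codes


if __name__ == "__main__":
    # Stand-in for mtr: a child process that exits immediately with status 3.
    child = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(watch(child, max_restarts=2))
```

     Because the watcher does not inspect the exit status before restarting, the supervised process can use a plain exit as its log-rotation mechanism.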

     An mfp to run a backup of a filesystem, called an fsmfp, normally starts an ssh session to user uvadumpc on a "push" client, which runs uvadump on the client. The fsmfp then monitors the progress of the backup via this session. If fsmfp is to run a backup on a "pull" client, no ssh or username is involved. In this case, fsmfp waits for a connection from the client on a TCP port number assigned when fsmfp was started. Exactly the same parameters are passed to the client as for the "push" case, and the same results are achieved. In its final state, an fsmfp contacts the appropriate target machines to remove any expired backups.
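
     The "pull" arrangement, in which the admin side listens on an assigned TCP port and the client initiates the connection, can be sketched in Python as follows. The wire format (one text line in each direction), the parameter string, and the function names are invented for the illustration and are not taken from the ITC software. The sketch also lets the operating system pick a free port, whereas in the design above pmg assigns the port number.

```python
import socket
import threading


def wait_for_pull_client(listener, parameters, timeout_seconds=5.0):
    """Accept one client connection, send it the backup parameters,
    and return the single reply line the client sends back."""
    listener.settimeout(timeout_seconds)
    conn, _addr = listener.accept()
    conn.settimeout(timeout_seconds)
    with conn, conn.makefile("rw") as stream:
        stream.write(parameters + "\n")
        stream.flush()
        return stream.readline().strip()


def pull_client(port):
    """Stand-in for a pull client: connect, read parameters, report back."""
    with socket.create_connection(("127.0.0.1", port), timeout=5.0) as sock:
        with sock.makefile("rw") as stream:
            parameters = stream.readline().strip()
            stream.write("done " + parameters + "\n")
            stream.flush()


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    client = threading.Thread(target=pull_client, args=(port,))
    client.start()
    try:
        # Hypothetical parameter string; the real parameters are not shown here.
        return wait_for_pull_client(listener, "fs=/home level=0")
    finally:
        client.join()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

     The point of the arrangement is the direction of the connection: the admin never has to open a session to the client, so it needs no ssh access or username there.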

     Admin processes run as user uvadumpa. Beyond the various configuration files, interactive control is handled with a login to this user and execution of the easy script.

     Tmc uses ssh to log in to a target machine as user uvadumpm and starts a tms process on the target. The session is kept open indefinitely. Requests from various admin processes are passed to tms and the replies are relayed back.
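
     The relay pattern used by tmc, one long-lived session over which many requests and replies pass, can be sketched in Python with a persistent child process. In this sketch a local stand-in process plays the part of the ssh session to tms; the Session class, the request strings, and the "ok ..." reply format are all hypothetical and say nothing about the real tms protocol.

```python
import subprocess
import sys

# Stand-in for the remote server: answers "ok <request>" to each request line.
FAKE_SERVER = [
    sys.executable, "-u", "-c",
    "import sys\nfor line in sys.stdin:\n    print('ok ' + line.strip(), flush=True)",
]


class Session:
    """One long-lived child process reused for many request/reply pairs."""

    def __init__(self, command):
        self.proc = subprocess.Popen(
            command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )

    def request(self, line):
        # Send one request line and wait for exactly one reply line.
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closing stdin signals end of input; the child then exits.
        self.proc.stdin.close()
        status = self.proc.wait()
        self.proc.stdout.close()
        return status


if __name__ == "__main__":
    session = Session(FAKE_SERVER)
    print(session.request("list archives"))
    print(session.request("free space"))
    session.close()
```

     Keeping the session open means the cost of establishing the login is paid once rather than once per request.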

     Only one admin can be run on a machine.

Target

     The target component of the ITC backup software stores the archives created by clients. Target software consists of tms to respond to requests from admin, rmt to write archives to disk, and rmtwrap to respond to requests from clients for restores and to act as a front-end for tms and rmt. Target processes run as user uvadumpm. The access script may be run as root on a target to do restores for any archives on the target; this operation is called a local restore. Multiple targets can be associated with an admin. The machine running a target, called a target machine, can also run an admin and a client. It is also possible for two admins to use the same target; access to archives is qualified by the name of the responsible admin.

Client

     The client component of the ITC backups runs uvadump as root to create backup archives from filesystems accessed on the local machine, writing the archives to a target. The root user of a client's machine may run the access script to restore from backup archives on a target; this operation is called a remote restore if client and target are not on the same machine. A client may be managed by only one admin. A client may use more than one target.

     Admin manages a "push" client with ssh logins to root user uvadumpc, which executes res, a limited-function login shell. "Pull" clients run a persistent process called pullc to periodically contact admin for requests, which are then run via res.

Sample Component Setup

     Figure 1 shows a hypothetical deployment of the ITC backup software and the paths taken by archive data. Each brown rectangle represents a separate machine. Names are attached to each software component instance; note that these are not machine names.

     An admin named Fruit controls the backups of a client named Apple, which writes backups to target Plate. Admin, target and client each run on a different machine. Apple writes archives to Plate during backups, and also reads them during restores. This simple scenario does not provide any backups for the admin or target machines.

     Another admin named Vegetable controls backups for clients Corn and Carrot, which write their backups to targets Plate and Bowl, respectively. Corn takes care of backing up configuration files for Vegetable and various other files on the machine. If Corn were to write to Bowl and the machine they share were destroyed, Vegetable's files would be gone forever. Therefore, Corn writes to Plate instead.
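
     The placement rule in this scenario, that a client should not write its archives to a target on its own machine, can be expressed as a small check. The Python sketch below encodes the component names from the example; the machine labels m1 through m5 are invented, and placing Carrot on a machine of its own is an assumption, since the text does not say where Carrot runs.

```python
# Component name -> machine label. The labels are hypothetical; only the
# grouping (Vegetable, Corn and Bowl on one machine) comes from the text.
MACHINE_OF = {
    "Fruit": "m1",
    "Apple": "m2",
    "Plate": "m3",
    "Vegetable": "m4",
    "Corn": "m4",
    "Bowl": "m4",
    "Carrot": "m5",  # assumption: Carrot's machine is not stated
}

# Client -> targets it writes archives to.
WRITES_TO = {"Apple": ["Plate"], "Corn": ["Plate"], "Carrot": ["Bowl"]}


def colocated_writes(machine_of, writes_to):
    """Return sorted (client, target) pairs that share a machine.

    Such a pair means one machine failure destroys both the original
    files and their backup archives.
    """
    return sorted(
        (client, target)
        for client, targets in writes_to.items()
        for target in targets
        if machine_of[client] == machine_of[target]
    )


if __name__ == "__main__":
    print(colocated_writes(MACHINE_OF, WRITES_TO))
    risky = dict(WRITES_TO, Corn=["Bowl"])
    print(colocated_writes(MACHINE_OF, risky))
```

     With the layout as described, the check finds no offending pairs; pointing Corn at Bowl instead makes the pair (Corn, Bowl) show up, which is the situation the paragraph above warns against.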

     Figure 2 depicts the direction of requests made between components. An admin makes requests to a client to create archives, or to a target to manage archives. A client makes requests to a target to write or read archives.

Note: In this document, publicly accessible on the Web, the fictitious name T.itc represents either the name of the machine handling a robotic tape library or the name of the machine handling virtual tape libraries. A.itc and H.itc represent the names of the administrative machine and the onsite backup storage machine, respectively. The corresponding actual names, which may be found in /common/ud/a/u/access, are not used here to avoid inviting intrusion attempts.

     Figure 3 diagrams the use of ITC's actual deployment of software components during backups. Most clients write to a target on T.itc, storing archives in HSM-managed RAID filesystems which are flushed to tape or virtual tape as needed. A few clients write long-term backups of some of their filesystems to the target on T.itc and short-term backups of others (mostly email inboxes) to a target on A.itc, storing them only in RAID. T.itc and A.itc are located offsite from all other machines, so that backup archives are available for disaster recovery. Backups of T.itc and A.itc are written to a target on onsite machine H.itc to provide a recovery mechanism in case of a disaster at the offsite location.

T.itc actually represents either of two machines whose functions are conceptually the same. One of the machines uses a physical tape robot. The other uses two virtual tape libraries.