Supriya Ghosh (Editor)

Incremental backup

An incremental backup is one in which each successive copy of the data contains only the portion that has changed since the preceding backup copy was made. When a full recovery is needed, the restoration process needs the last full backup plus all the incremental backups made up to the point of restoration. Incremental backups are often desirable because they reduce storage space usage and are quicker to perform than differential backups.

Incremental

The most basic form of incremental backup consists of identifying, recording and thus preserving only those files that have changed since the last backup. Since the volume of changes between backups is typically low, incremental backups are much smaller and quicker than full backups. For instance, following a full backup on Friday, a Monday backup will contain only those files that changed since Friday. A Tuesday backup contains only those files that changed since Monday, and so on. A full restoration of data will naturally be slower, since the full backup and every increment must be restored in order. Should any one of the copies fail, including the first (full) one, restoration will be incomplete.
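The selection step described above can be sketched in Python. This is a minimal illustration that assumes a file counts as changed when its modification time is later than the time of the previous backup; the function name and that criterion are assumptions, and real tools may also consult archive bits, checksums or change journals.

```python
import os


def files_changed_since(root, last_backup_time):
    """Return relative paths under root modified after last_backup_time.

    last_backup_time is a POSIX timestamp (seconds since the epoch).
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: a newer mtime means the file content changed.
            if os.path.getmtime(path) > last_backup_time:
                changed.append(os.path.relpath(path, root))
    return sorted(changed)
```

A Monday backup would call this with the timestamp of Friday's full backup, and a Tuesday backup with the timestamp of Monday's increment.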

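On Unix, rsync can produce backups of this kind. Its --link-dest option names a previous backup directory: files that are unchanged relative to that directory are hard-linked into the new backup rather than copied, so each new backup consumes space only for the files that changed.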
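As an illustration of how --link-dest is typically used, the following Python sketch assembles such an rsync invocation. The directory names in the usage note and both function names are hypothetical; only the -a and --link-dest options are taken from rsync itself.

```python
import subprocess


def rsync_link_dest_command(source, previous_backup, new_backup):
    """Build an rsync command that hard-links unchanged files to previous_backup."""
    return [
        "rsync",
        "-a",                              # archive mode: recurse and preserve metadata
        f"--link-dest={previous_backup}",  # unchanged files become hard links to this backup
        source,
        new_backup,
    ]


def run_backup(source, previous_backup, new_backup):
    """Run the backup; raises CalledProcessError if rsync exits non-zero."""
    cmd = rsync_link_dest_command(source, previous_backup, new_backup)
    subprocess.run(cmd, check=True)
```

For example, run_backup("/data/", "/backups/monday", "/backups/tuesday/") would create Tuesday's backup relative to Monday's. An absolute path for the previous backup is the safer choice, because rsync interprets a relative --link-dest path relative to the destination directory.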

Multilevel incremental

A more sophisticated incremental backup scheme involves multiple numbered backup levels. A full backup is level 0. A level n backup will back up everything that has changed since the most recent level n-1 backup. Suppose for instance that a level 0 backup was taken on a Sunday. A level 1 backup taken on Monday would include only changes made since Sunday. A level 2 backup taken on Tuesday would include only changes made since Monday. A level 3 backup taken on Wednesday would include only changes made since Tuesday. If a level 2 backup was taken on Thursday, it would include all changes made since Monday because Monday was the most recent level n-1 backup.
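The rule for choosing the reference point can be sketched as follows. This minimal Python example follows the wording above, looking for the most recent level n-1 backup; the function name and the (day, level) tuple representation are assumptions made for illustration.

```python
def reference_backup(history, level):
    """Return the (day, level) entry that a new backup at `level` builds on.

    history is a chronological list of (day, level) tuples. A level 0
    backup is full and has no reference, so None is returned for it.
    """
    if level == 0:
        return None
    # Walk backwards to find the most recent backup one level below.
    for day, past_level in reversed(history):
        if past_level == level - 1:
            return (day, past_level)
    raise ValueError(f"no level {level - 1} backup to build on")


history = [("Sunday", 0), ("Monday", 1), ("Tuesday", 2), ("Wednesday", 3)]
print(reference_backup(history, 2))  # ('Monday', 1)
```

This reproduces the Thursday case: a new level 2 backup is taken relative to Monday's level 1 backup, so it includes everything changed since Monday.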

Reverse incremental

An incremental backup of the changes made between two instances of a mirror is called a reverse incremental. Applying a reverse incremental to a mirror yields a previous version of that mirror. In other words, after the initial full backup, each successive incremental backup applies its changes to the previous full backup, creating a new synthetic full backup every time while retaining the ability to revert to previous versions. The main advantage of this type of backup is a more efficient recovery process: the most recent version of the data (which is the most frequently restored version) is a (synthetic) full backup, so no incrementals need to be applied to it during restoration. Reverse incremental backup works for both tapes and disks, but in practice tends to work better with disks. Companies using the reverse incremental backup method include Intronis and Zetta.net.
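The idea can be sketched in Python by modelling a mirror as a dictionary from path to content. This is a toy model under stated assumptions: the function names are hypothetical, file contents are never None, and None is used as the marker for "the path did not exist in the older mirror".

```python
def make_reverse_delta(old_mirror, new_mirror):
    """Record what must change in new_mirror to get old_mirror back."""
    delta = {}
    for path in old_mirror.keys() | new_mirror.keys():
        if old_mirror.get(path) != new_mirror.get(path):
            # Store the OLD content; None means the path was absent before.
            delta[path] = old_mirror.get(path)
    return delta


def apply_reverse_delta(mirror, delta):
    """Return a copy of mirror rolled back by one reverse incremental."""
    result = dict(mirror)
    for path, content in delta.items():
        if content is None:
            result.pop(path, None)  # file was created later, so remove it
        else:
            result[path] = content  # restore the earlier content
    return result


monday = {"a.txt": "v1", "b.txt": "v1"}
tuesday = {"a.txt": "v2", "c.txt": "v1"}
reverse = make_reverse_delta(monday, tuesday)
assert apply_reverse_delta(tuesday, reverse) == monday
```

The latest mirror (tuesday) is always directly restorable; older versions are reached by applying reverse deltas one after another, newest first.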

Incrementals forever

This style is similar to the synthetic backup concept. After an initial full backup, only incremental backups are sent to a centralized backup system. This server keeps track of all the incrementals and sends the proper data back to the client during restores. This can be implemented by sending each incremental directly to tape as it is taken and then refactoring the tapes as necessary. If enough disk space is available, an online mirror can be maintained along with previous incremental changes so that the current or older versions of the systems being backed up can be restored. This is a suitable method in the case of banking systems.

In modern cloud architectures, or disk-to-disk backup scenarios, this is much simpler. Data is broken into chunks and placed on a cloud storage system, such as Amazon S3. Metadata about the chunks is stored in a persistent system, which allows the system to assemble a point-in-time backup from these chunks at restore time. There is no need to refactor tape.
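A minimal in-memory sketch of the chunk-and-metadata approach follows. The ChunkStore class, the fixed chunk size and the use of SHA-256 digests as chunk identifiers are all assumptions made for illustration; a real system would keep the chunks in object storage and the manifests in a persistent database.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


class ChunkStore:
    def __init__(self):
        self.chunks = {}     # digest -> chunk bytes (stands in for object storage)
        self.manifests = {}  # backup id -> ordered list of digests (the metadata)

    def backup(self, backup_id, data):
        """Store data under backup_id and return the number of new chunks uploaded."""
        digests = []
        uploaded = 0
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # only unseen chunks are sent
                self.chunks[digest] = chunk
                uploaded += 1
            digests.append(digest)
        self.manifests[backup_id] = digests
        return uploaded

    def restore(self, backup_id):
        """Assemble a point-in-time copy from the chunks listed in the manifest."""
        return b"".join(self.chunks[d] for d in self.manifests[backup_id])
```

After the first backup, every later backup uploads only chunks that have not been seen before, yet any backup id can be restored in full from its manifest.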

Block level incremental

This method backs up only the blocks within the file that changed. This requires a higher level of integration between the sender and receiver.
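One way to picture this is with fixed-size blocks compared by hash. In the Python sketch below, the function names, the 4096-byte default block size and the SHA-256 comparison are assumptions. The sender needs the hashes of the blocks the receiver already holds, which is where the tighter integration comes in.

```python
import hashlib


def block_hashes(data, block_size):
    """Hash each fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old, new, block_size=4096):
    """Return {block_index: block_bytes} for blocks of new that differ from old."""
    old_hashes = block_hashes(old, block_size)
    changed = {}
    for index, offset in enumerate(range(0, len(new), block_size)):
        block = new[offset:offset + block_size]
        is_new_block = index >= len(old_hashes)
        if is_new_block or hashlib.sha256(block).digest() != old_hashes[index]:
            changed[index] = block
    return changed


def apply_blocks(old, changed, new_length, block_size=4096):
    """Rebuild the new file from the old file plus the changed blocks."""
    # Start from the old content, truncated or zero-padded to the new length.
    result = bytearray(old[:new_length].ljust(new_length, b"\0"))
    for index, block in changed.items():
        start = index * block_size
        result[start:start + len(block)] = block
    return bytes(result)
```

If one block in the middle of a three-block file changes, only that single block has to be transferred and stored.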

Byte level incremental

These backup technologies are similar to the block level incremental method; however, the byte (or binary) incremental method is based on the binary differences between a file and its version in the previous backup. While block-based technologies work with coarse units of change (blocks of 8K, 4K or 1K), byte-based technologies work with the minimum unit, which saves space when recording a change to a file. Another important difference is that they work independently of the file system. At present, these are the technologies that achieve the highest relative compression of the data, which is a great advantage for backups carried out over the Internet.
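A byte-level delta can be sketched with Python's standard difflib module. This is an illustration only, not how any particular product computes its deltas: the function names and the copy/data operation format are assumptions, and SequenceMatcher is far too slow for large files.

```python
from difflib import SequenceMatcher


def make_byte_delta(old, new):
    """Encode new as ('copy', start, end) and ('data', bytes) operations."""
    delta = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))      # reuse bytes from the old version
        elif tag in ("replace", "insert"):
            delta.append(("data", new[j1:j2]))  # ship only the new bytes
        # "delete" needs no operation: those old bytes are simply not copied
    return delta


def apply_byte_delta(old, delta):
    """Rebuild the new version from the old version and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


old = b"The quick brown fox"
new = b"The quick red fox jumps"
assert apply_byte_delta(old, make_byte_delta(old, new)) == new
```

Only the bytes carried in the data operations need to be stored or transmitted, which can be much less than a whole block when a change touches just a few bytes.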

Synthetic full backup

A synthetic backup is an alternative method of creating full backups. Instead of reading and backing up data directly from the disk, it synthesizes the data from the previous full backup (either a regular full backup for the first synthetic backup, or the previous synthetic full backup) and the periodic incremental backups. As only the incremental backups read data from the disk, these are the only files that need to be transferred during offsite replication. This greatly reduces the bandwidth needed for offsite replication.
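The synthesis step can be sketched in Python by treating each backup as a dictionary from path to content. The function name and the use of None to mark a file deleted since the preceding backup are assumptions made for this illustration.

```python
def synthesize_full(previous_full, incrementals):
    """Combine a full backup with later incrementals into a new full backup.

    Each backup is a dict of path -> content. In an incremental, None marks
    a file deleted since the preceding backup. Incrementals must be given
    oldest first. The inputs are not modified.
    """
    full = dict(previous_full)
    for incremental in incrementals:
        for path, content in incremental.items():
            if content is None:
                full.pop(path, None)  # file was deleted
            else:
                full[path] = content  # file was added or changed
    return full
```

Note that the function never touches the source disk: it works entirely from data the backup system already holds.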

Differential

A differential backup is a cumulative backup of all changes made since the last full or normal backup, i.e., the differences since the last full backup. The advantage of this is a quicker recovery time, since only the full backup and the last differential backup are required to restore the system. The disadvantage is that for each day elapsed since the last full backup, more data needs to be backed up, especially if a significant proportion of the data has changed.
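The difference in recovery effort can be made concrete with a short Python sketch that computes which backups a restore to the latest state needs. The function name and the (label, kind) representation are assumptions for illustration.

```python
def restore_chain(backups):
    """Return the backups needed to restore to the latest state.

    backups is a chronological list of (label, kind) tuples, where kind is
    "full", "incremental" or "differential". At least one full backup must
    be present, otherwise ValueError is raised.
    """
    last_full = max(i for i, (_, kind) in enumerate(backups) if kind == "full")
    chain = [backups[last_full]]
    for entry in backups[last_full + 1:]:
        if entry[1] == "differential":
            # A differential covers everything since the full backup,
            # so it supersedes whatever came in between.
            chain = [backups[last_full], entry]
        elif entry[1] == "incremental":
            chain.append(entry)
    return chain


week = [("Fri", "full"), ("Mon", "differential"),
        ("Tue", "differential"), ("Wed", "differential")]
print(restore_chain(week))  # [('Fri', 'full'), ('Wed', 'differential')]
```

With incrementals in place of the differentials, the same week would need all four backups to restore.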

Forward incremental-forever

A forward incremental-forever backup allows the synthetic operation that creates a new full backup to be limited to the size of the incremental file, instead of the complete size of a full backup file as would happen in a "forward mode with synthetic fulls". The overall I/O consumed is the same as for reverse incremental, but only one write I/O is used during the backup activity itself, and the snapshot of the VM stays open for less time than with reverse incremental; the remaining two I/Os are used to update the full backup file.

References

Incremental backup Wikipedia