What is backup for, and why is it important?
Backups essentially have two functions:
- To enable the recovery of data after it is lost (either through corruption or deletion); and
- To enable that recovery of data from an earlier time (before the loss).
Whilst other approaches (e.g., replication) can provide a convenient means of recovering a user’s lost file, certain incidents affecting data (e.g., malware infection) could also affect the replicated data.
The data on a device (e.g., photos on a laptop or smartphone) or within a business (e.g., customer details, accounts) is usually important, and backing it up is one of the best ways of reducing the risk of loss.
Cyber crime is increasing. Malware is growing in complexity. Hardware components do fail. These issues are all risks to data and systems that backup can help to mitigate.
Backup and Other Approaches
As we’ve started to explore above, it is important to distinguish between a backup and other ways of protecting data in certain situations.
- Backup: a backup is technically an archive copy of the data that is in use (in the sense that it is archived until you need to use it).
- Replication: copying data and moving it to another location. With services like Microsoft 365, replication is taking place more and more.
- Synchronisation: similar to replication, synchronisation is the process of copying files and folders in use to another location.
- Archiving: not a copy of data, archiving is the process of storing data that is no longer in use, but is (or may be) still needed, in another location.
Whilst replication and synchronisation have certain advantages, it is important to see them as a complement to backup, rather than a replacement.
Backups are taken, and fixed, at a point in time. This means that the data included in the backup is held in that state until needed. When further backups are completed, more versions of the ‘fixed state data’ are created, providing a range of choices if restoration is required.
Replication and synchronisation are generally iterative processes, where changes to live data are subsequently incorporated into the copy data. The additional ‘version’ of the data is updated when changes take place (rather than multiple versions being created).
Whilst it is possible to establish multiple replication or synchronisation locations, and in doing so achieve multiple versions of the data, it can require a great deal of storage (and therefore expense).
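The contrast between fixed, point-in-time versions and a single updated copy can be sketched in a few lines of Python. This is a simplified in-memory model for illustration only, not how any particular product works; the names `take_backup` and `synchronise` are hypothetical.

```python
import copy

def take_backup(live_data, backups):
    """Append a fixed, point-in-time copy of the live data to the backup set."""
    backups.append(copy.deepcopy(live_data))

def synchronise(live_data, mirror):
    """Update the single mirror copy so that it matches the live data."""
    mirror.clear()
    mirror.update(copy.deepcopy(live_data))

live = {"report.docx": "version 1"}
backups, mirror = [], {}

take_backup(live, backups)
synchronise(live, mirror)

# The file is then damaged (e.g., by malware) and both processes run again.
live["report.docx"] = "corrupted"
take_backup(live, backups)
synchronise(live, mirror)

print(backups[0]["report.docx"])  # the earlier backup still holds "version 1"
print(mirror["report.docx"])      # the mirror has been overwritten with "corrupted"
```

In this model the damage reaches the synchronised copy as soon as the next update runs, while the first backup remains available as a restore point.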
Backing up remains the recommended method of protecting data, whether local or in the cloud, from the greatest number of data loss scenarios.
Backup in a Cloud World
Synchronisation and replication (which cloud storage services do very well) are not the same as data backup. A backup is taken at a point in time, and is then ‘disconnected’ from the live data, while cloud storage services are (generally) continuously synchronising and replicating data across multiple locations.
In thinking about backup for cloud data, there are a number of scenarios to consider, including:
- Malware infection: with certain types of malware, there is the potential for an infection in the ‘live’ data to also infect the replicated or synchronised data. Countermeasures designed to prevent this are available but, as with any technology product of this nature, they cannot be 100% reliable. An infection of this type would leave the data unavailable, and a separate, offline backup would be required to restore from.
- User error: mistakes are another area to consider. Users may delete files incorrectly, sometimes without realising straight away, and permission management issues can mean that contributors to other users’ data are able to delete it, and may do so in error. It may be possible to recover incorrectly deleted files, but this is not guaranteed (see below).
- Recoverability: most cloud storage services, and of course the Windows operating system, support soft and hard delete cycles. A file that is deleted first goes to a ‘recycle bin’; in cloud systems, it is retained in the recycle bin for a defined period of time, after which time it is deleted (often irretrievably). There are also many examples of recoveries from the recycle bin not being successful, so it should not be relied upon for important data.
- Account management: when user accounts are no longer required (e.g., a user leaves the organisation), their account may be disabled, and then removed. After a defined period of time, the data stored in the user’s storage area will be deleted irretrievably.
A backup process provides a means of mitigating these and other issues.
Whenever data is created or changed, a backup requirement exists. In order to ensure an appropriate backup process is in place, certain objectives need to be considered:
- Recovery Point Objective (RPO): this is the maximum amount of data, measured in time, that could be lost following an incident. It is determined by the length of time between backups, and dictates how recent or current the restored data will be. For example, with a daily backup that starts at 18.00, an incident at 17.30 the following day could result in the loss of almost a full day’s data.
- Recovery Time Objective (RTO): this is the amount of time within which data needs to be recovered after an incident, and should be considered in the context of the impact on the organisation. For example, if an incident occurs that requires a full restoration, that restoration using a given backup technology may take two days, and the organisation needs to decide whether that is acceptable.
- Backup security: the two main techniques are encryption (encrypting the data on the backup media) and physical security (the way in which the backup media is handled and stored).
- Retention period: certain regulations will require particular data to be held for a prescribed amount of time, and other regulations will require that data be held for no longer than a prescribed time. The organisation may also want to balance confidence that restored data will deliver a suitable RPO with the cost of creating and storing backups.
A good backup process should address these objectives.
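The daily backup example given for RPO can be worked through as a short calculation. This is a minimal sketch: the calendar date is arbitrary, and the function names `data_loss_window` and `meets_rpo` are hypothetical.

```python
from datetime import datetime, timedelta

def data_loss_window(last_backup_start, incident_time):
    """Return how much work is lost: everything since the last backup began."""
    if incident_time < last_backup_start:
        raise ValueError("incident precedes the last backup")
    return incident_time - last_backup_start

def meets_rpo(loss, rpo):
    """True if the amount of lost work is within the agreed RPO."""
    return loss <= rpo

last_backup = datetime(2024, 1, 1, 18, 0)  # arbitrary date; backup started at 18.00
incident = datetime(2024, 1, 2, 17, 30)    # incident at 17.30 the following day

loss = data_loss_window(last_backup, incident)
print(loss)                                 # 23:30:00
print(meets_rpo(loss, timedelta(hours=24)))  # True
print(meets_rpo(loss, timedelta(hours=4)))   # False
```

A daily backup therefore satisfies a 24-hour RPO, but an organisation that can only tolerate four hours of lost work would need to back up more frequently.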
For a backup process to be effective, there are some generally held good practices that should be observed.
Having a regular and automated backup process in place is the key to being prepared for a data disaster. These days, backing up important data any less often than daily is likely to be an unnecessary risk. Scheduling backups to take place automatically, and to start at a prescribed time, is straightforward.
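As a rough illustration of what an automated job might run, the sketch below archives a folder into a timestamped zip file using Python's standard library. It is an assumption-laden example rather than a replacement for backup software: the function name `timestamped_backup` and the file names are hypothetical, and the temporary folders stand in for real live and backup locations.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

def timestamped_backup(source_dir, backup_root, now=None):
    """Archive source_dir into backup_root as backup-YYYYMMDD-HHMMSS.zip."""
    now = now or datetime.now()
    Path(backup_root).mkdir(parents=True, exist_ok=True)
    base_name = Path(backup_root) / f"backup-{now:%Y%m%d-%H%M%S}"
    # make_archive appends ".zip" and returns the path of the archive it created.
    return shutil.make_archive(str(base_name), "zip", root_dir=source_dir)

# Demonstration using temporary folders in place of real locations.
with tempfile.TemporaryDirectory() as live, tempfile.TemporaryDirectory() as store:
    Path(live, "accounts.txt").write_text("example data\n")
    archive = timestamped_backup(live, store, now=datetime(2024, 1, 1, 18, 0))
    archive_name = Path(archive).name
    archive_created = Path(archive).exists()

print(archive_name)  # backup-20240101-180000.zip
```

A script like this would typically be triggered by the operating system's own scheduler (e.g., Task Scheduler on Windows or cron on Linux and macOS) so that it starts at the prescribed time without anyone needing to remember it.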
The 3-2-1 Rule
The accepted best practice rule for backup is called the ‘3-2-1 backup’ rule. This means that, when backing data up, you should:
- 3: Have at least three copies of the data (so that in addition to the live data, you have at least two other copies). This is to mitigate a failure on both the main device (storing the live data) and the backup device (storing the first copy of the live data). This is the approach used in ‘disk-to-disk-to-tape’ backup.
- 2: Store backups on at least two different media. This, again, is to reduce the risk that a failure of a device impacts the recoverability of data. A failure of disks on one server could indicate an increased likelihood of disk failure in another server (e.g., they may be of similar specification and age, and subject to similar environmental influences and use patterns). By also using another medium (e.g., tape, or cloud storage), the risk is reduced.
- 1: Store one of the backups offsite. Storing one of the backups in a physically separate location is important, as it’s the only reliable way of protecting against issues like fire, flood and other disasters. This can be achieved by physically taking tapes to another (secure) location after the backup, or using a cloud storage service.
Adopting the 3-2-1 backup approach should ensure the survival of at least one copy of the data in the event of a serious incident.
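The three conditions of the rule can be expressed as a simple check. The sketch below is one reading of the rule as described here (media and offsite storage are assessed on the backups, not the live data); the `DataCopy` class and `satisfies_3_2_1` function are hypothetical names used for illustration.

```python
from dataclasses import dataclass

@dataclass
class DataCopy:
    name: str
    media: str             # e.g., "disk", "tape", "cloud"
    offsite: bool = False
    live: bool = False     # True for the live data itself

def satisfies_3_2_1(copies):
    """Check a set of copies (live data included) against the 3-2-1 rule."""
    backups = [c for c in copies if not c.live]
    three_copies = len(copies) >= 3                     # live data plus two others
    two_media = len({c.media for c in backups}) >= 2    # backups on two different media
    one_offsite = any(c.offsite for c in backups)       # one backup stored offsite
    return three_copies and two_media and one_offsite

# Disk-to-disk-to-tape, with the tape taken to another location.
d2d2t = [
    DataCopy("server", "disk", live=True),
    DataCopy("backup disk", "disk"),
    DataCopy("tape", "tape", offsite=True),
]

# A single USB disk kept next to the server.
usb_only = [
    DataCopy("server", "disk", live=True),
    DataCopy("usb disk", "disk"),
]

print(satisfies_3_2_1(d2d2t))     # True
print(satisfies_3_2_1(usb_only))  # False
```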
To have good RPO, as well as not waste computing power and storage making constant backups, it may be the case that not all data needs to be treated equally. Some data changes regularly, and is in use more than other data (sometimes called ‘cold data’) that is updated less regularly.
With some backup technologies, a lot of computing power and storage is required to be able to store a full backup of everything every day, and whilst it may be possible (depending on data volumes), if data grows it may become less practical.
Business critical data should be backed up as often as possible and practical.
Centralise Backup Operations
Being able to back up data that is in one location is likely to be simpler, and therefore more successful, than trying to back up data that exists in many different places.
Cloud storage for live data (e.g., Microsoft 365, with OneDrive and SharePoint services) not only provides a way to ease the backup and restore process, but can also simplify other operations (such as providing remote access to files).
Microsoft 365 also provides the benefit of ‘replication’ as part of a backup strategy, and additional backup technologies and processes can be attached to it to adequately protect data from loss.
Simple tests, such as creating a test file inside a folder that will be backed up, running the backup, then deleting the file from the ‘live data’ and attempting restore from backup, can prove that the backup itself is valid and that restore processes work.
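The simple test described above can be automated along the following lines. This is a sketch under stated assumptions: plain folder copies (`copy_backup` and `copy_restore`, both hypothetical) stand in for whatever backup software is actually in use, temporary folders stand in for the live data and the backup store, and Python 3.8 or later is assumed for the `dirs_exist_ok` option.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256(path):
    """Return the SHA-256 checksum of a file."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def restore_test(live_dir, backup_dir, backup, restore):
    """Create a test file, back it up, delete it, restore it and compare checksums."""
    test_file = Path(live_dir) / "backup-test.txt"
    test_file.write_text("restore test\n")
    expected = sha256(test_file)

    backup(live_dir, backup_dir)    # run the backup
    test_file.unlink()              # delete the file from the live data
    restore(backup_dir, live_dir)   # attempt the restore

    return test_file.exists() and sha256(test_file) == expected

# Stand-ins for real backup software: plain folder copies.
def copy_backup(src, dst):
    shutil.copytree(src, dst, dirs_exist_ok=True)

def copy_restore(src, dst):
    shutil.copytree(src, dst, dirs_exist_ok=True)

with tempfile.TemporaryDirectory() as live, tempfile.TemporaryDirectory() as store:
    result = restore_test(live, store, copy_backup, copy_restore)

print("restore test passed" if result else "restore test FAILED")
```

Comparing checksums, rather than only checking that the file has reappeared, confirms that the restored content matches what was originally backed up.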
More in-depth tests, involving larger and more complex data sets, are also beneficial. However, it’s not usually advisable to put healthy live systems at risk in order to test, so this may be better considered as part of a wider IT strategy (e.g., re-use plans for old hardware).
File and Image Backups
There are two main levels of backup: file-level, and image-level.
A file backup makes an archive copy of each of the files and folders that are specified. The software used to perform the backup can usually also allow the ‘applications’ and ‘system state’ to be backed up, which makes restoration in the event of a disaster possible.
An image backup is essentially a clone of the entire hard drive, and therefore contains all the data (including the applications and the system state), but captures it as a single image of the drive rather than as individual files and folders.
Image-level backups require more storage space, but are faster and more flexible for recovery. Both should allow the restoration of individual files and folders, where ‘minor’ incidents have occurred, though image-level backups will be slightly more complicated. Both should also allow full system recovery, where more serious incidents have occurred, but image-level backups can perform ‘bare metal’ restoration much more quickly, and more easily support restoration to different hardware.
There are a number of different backup types, including:
- Full backup: a full data and/or system backup, as the name suggests. A complete copy of all data is made in a single location, simplifying restoration. However, it takes longer to complete than other types of backup.
- Incremental backup: only the information that has changed since the last backup (full or incremental) is included. It can be carried out more quickly and frequently than a full backup, as less information is being processed. However, recovery can be complex, as the information that needs to be restored may be spread across multiple backups.
- Differential backup: similar to incremental, but including all information that has changed since the last full backup (whereas incremental is changed data since any backup). Quicker than full backup, and simpler to restore than incremental backup. However, it takes longer to complete and requires more storage space than incremental backup.
Full backups are required from time to time, and incremental or differential backups can be used effectively between full backups.
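The difference between the three types comes down to which files are selected on each run. The sketch below models this with file modification times expressed as day numbers; the function name `files_to_back_up`, the file names and the timings are all hypothetical.

```python
def files_to_back_up(files, kind, last_full, last_backup):
    """Select the files for a backup run.

    files:       mapping of file name -> time last modified
    kind:        "full", "incremental" or "differential"
    last_full:   time of the last full backup
    last_backup: time of the last backup of any kind
    """
    if kind == "full":
        return sorted(files)
    if kind == "incremental":
        return sorted(f for f, modified in files.items() if modified > last_backup)
    if kind == "differential":
        return sorted(f for f, modified in files.items() if modified > last_full)
    raise ValueError(f"unknown backup kind: {kind}")

# Full backup on day 0, incremental on day 2; the next run is on day 4.
files = {"accounts.xlsx": 3, "customers.csv": 1, "logo.png": -5}

print(files_to_back_up(files, "full", last_full=0, last_backup=2))
# ['accounts.xlsx', 'customers.csv', 'logo.png']
print(files_to_back_up(files, "incremental", last_full=0, last_backup=2))
# ['accounts.xlsx']
print(files_to_back_up(files, "differential", last_full=0, last_backup=2))
# ['accounts.xlsx', 'customers.csv']
```

The incremental run is the smallest, but restoring `customers.csv` would mean going back to the day 2 incremental; the differential run carries everything changed since the full backup, so a restore needs only the full backup and the latest differential.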
There are various types of different backup media, including:
- Tapes (which require a tape drive);
- Hard drives (which could be internal or external) and solid state storage (e.g. USB sticks); and
- Cloud storage.
Backup tapes are considered an older technology now, but can still be effective. A 1.5 TB tape can cost as little as £25, while a 1.5 TB USB hard disk drive is around double that. Of course, a tape drive would be necessary to operate the tape, but assuming one is in place already, tape has a comparatively low cost per GB.
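The cost per GB comparison can be worked through using the figures quoted above. This sketch assumes 1 TB is 1,000 GB, takes ‘around double’ to mean £50 for the disk, and ignores the cost of the tape drive itself; the function name `cost_per_gb` is hypothetical.

```python
def cost_per_gb(price_gbp, capacity_tb):
    """Return the media cost in pounds per GB, taking 1 TB as 1,000 GB."""
    return price_gbp / (capacity_tb * 1000)

tape = cost_per_gb(25, 1.5)  # 1.5 TB tape at 25 pounds
disk = cost_per_gb(50, 1.5)  # 1.5 TB USB disk at around double the price

print(f"tape: GBP {tape:.3f} per GB")  # tape: GBP 0.017 per GB
print(f"disk: GBP {disk:.3f} per GB")  # disk: GBP 0.033 per GB
```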
Tapes are also reasonably secure: for a start, a compatible tape drive is required in order to read the data on the tape. Many tape backup systems also support encryption of the data during the backup process.
It is important, however, that tapes are properly taken care of. They are magnetic, and can degrade over time, so are sensitive to environmental conditions and repeated use.
Disk-based backup is a widely-used approach, in both businesses and homes; USB hard disk drives and large capacity USB memory sticks are widely available and inexpensive.
For larger backup requirements, Network Attached Storage (NAS) devices are commonly used. A NAS device houses one or more hard disk drives (HDDs) in a chassis that connects to the network, allowing it to be accessed by more than one device.
Disk backups are usually quicker than tape, and the capacity of standard disks is greater than that of standard tapes. From a cost perspective, it’s usually cheaper to use disks than tapes if no tape drive is in place. If a tape drive is in place but is older and may fail, it may be a false economy to pay for more tapes rather than disks.
Cloud backup isn’t a new technology: Microsoft OneDrive (previously SkyDrive) first launched in 2007, and Google Drive in 2012. Both of these, and others, provide synchronisation services to allow local data to be copied to the cloud.
If there’s no backup technology in place already, cloud can often be a cost-effective approach, as there’s no need to buy any hardware.
It also provides one of the marks of a good backup approach by default: keeping at least one copy of the data in an off-site location.
Security is an important aspect to check, though many providers encrypt the data both in the process of backing up (‘data in transit’) and in the cloud once the backup is complete (‘data at rest’).
Cloud backup is usually slower than local backup, as the data has to travel a lot further. It’s important to use a trusted provider too, as switching can be difficult and you need to be certain you can access the data if and when you need it.
There are many different options in terms of the software required to control the backup.
For macOS users, Time Machine is included in much the same way as Windows Backup is for Windows.
For businesses, sometimes different applications present particular backup requirements. Things like SQL Server and Exchange Server, for example, may require specific ‘agents’ to ensure backup software can correctly back them up and restore them when needed.
We recommend a cloud backup solution for servers that supports ‘bare-metal’ restore (i.e., no operating system or software required on the server), testing of recovery, and is suitable for physical and virtual servers. Get in touch with us if you’d like to talk about this.
Backup for Microsoft 365
A common question: is it important to back up Microsoft 365 (which used to be called Office 365)?
Microsoft say that it is important. The Microsoft Services Agreement states that:
- “You should have a regular backup plan”; and
- “We recommend that you regularly backup Your Content and Data that you store on the [Microsoft] Services or store using Third-Party Apps and Services”.
Revisiting the guidance earlier too, note that Microsoft 365 has built-in replication (so data does exist in multiple places), but each instance of the data is essentially ‘live’: if malware or user error causes loss in one place, the loss will be replicated in the other places.
We’d say that it is absolutely important to back up Microsoft 365; all of those emails are important, and all of those files in OneDrive need to be protected.
We recommend a cloud solution that provides:
- Protection against data loss: deleted emails, OneDrive and SharePoint files can be quickly and easily restored;
- Compliance support: data retention requirements, including archival, can be more easily managed;
- Access to former employee email: mailboxes and shared documents can be obtained from the backup rather than by maintaining them within Microsoft 365 (which can attract cost);
- Data retention: email (Exchange) data is retained for seven years, and file (OneDrive and SharePoint) data is retained for one year, providing multiple potential restore points; and
- Multiple restore points: mailboxes are backed up automatically up to six times a day, and files up to four times a day, providing multiple daily restore points.
Again, just get in touch with us if you’d like to talk about this.