Backup and Recovery Solution Checklist
Companies buy solutions to back up their corporate application data in order to meet regulatory compliance requirements and protect themselves from various levels of data disaster. There are managed hosted solutions that keep the backups on the vendor's servers (i.e., Cloud backup) and solutions that maintain a local copy of the data. The evaluation criteria include meeting functional requirements, flexibility, simplicity, and the cost of deployment and management. All of these factors must also drive a solution that fits into the existing corporate IT environment and simultaneously capitalizes on the in-house skills of the technical team.
Let’s get into the details of what a backup and recovery solution has to accomplish. There are many vendors in this space, with a wide span of pricing and deployment options. By focusing on the nature of the problem instead of starting with an inventory of existing vendor solutions, we can see what features are essential to solving the problem. Then you can apply the rules to the vendor solutions, rule out the vendors that don’t actually solve the problem, and winnow down your list.
Vendor selection should encompass the following factors.
Support for Your Particular Application
There are vendors that have specialization and expertise in solutions for particular application vendors, and some that claim backup and recovery capability for many applications. Remember that backing up can be generalized somewhat, but recovery involves depth of experience with a specific application’s architecture and business rules. What’s important is that your vendor can solve your problems, not things that you don’t have or won’t have at some hypothetical future date. Any vendor who has a wide variety of supported applications is likely to have varying levels of support for recovery of any given application.
Special Consideration for Hosted Applications
Regulatory compliance for companies using SaaS solutions can be challenging. A typical Cloud application vendor backs up its database for all of its customers at once. “The database” comprises all customers on a database server in the vendor’s server farm, not an individual customer. Database recovery is an all-or-nothing proposition for everything in “the database.” Therefore, your vendor cannot restore just your set of data to a given point in time. There is also no way that your vendor can recover individual deleted or corrupted records in a reasonable amount of time. Salesforce.com offers such a service, but says it takes over a month to get your data back.
A simple and obvious solution is to set up a strictly maintained copy of the CRM data, either locally or as a Cloud-based service. However, management of the local or hosted data repository must ensure a complete audit trail and recoverability of change history and deleted records. Flat file backups cannot be done incrementally, and it is very expensive, time consuming, and error prone to take a full backup every time. Also, if the schema changes, you can’t recover multiple versions of the schema into one Cloud database.
Recovery has to take into account not only parent-child relationships, but recursive relationships between records of the same type.
The most effective way to recover user data is to back it up to an offline database, be able to recover selected or all records, and keep the original relationships in the data intact.
Both Backup and Recovery Capability
To ensure complete backup and recovery functionality, a recovery program is required to restore deleted or corrupted cloud-based application data. You will essentially need a backup program to get the data out and a recovery program to put it back, no matter how the two are packaged within the compliance application. Surprisingly, many vendors offer a backup solution without a recovery component, or haven’t resolved the full set of recovery issues. Backing up data is logically much easier than recovering it, because backup doesn’t have to solve the problems of relating records, handling business rules (such as not being able to create an order when the inventory item doesn’t exist), or figuring out which version of a record to restore.
Full or Incremental Backup
It is really easy to design a simple backup solution that copies all the data every time. But if your company deals with vast amounts of data, that may not work, especially if you’re backing up a Cloud application with millions of records. Incremental backup has definite advantages for any traffic going over the internet, whether it’s offsite backup of on-premise data or any form of backup for hosted applications.
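The core of an incremental pass is simply remembering the last run time and asking the source only for records modified since then. A minimal sketch, where `fetch_modified_since` and `store` are hypothetical stand-ins for the source application's API and the backup data store:

```python
from datetime import datetime, timezone

def incremental_backup(fetch_modified_since, store, last_run):
    """Copy only records changed since the previous run.

    fetch_modified_since and store are hypothetical callables standing in
    for the source application's query API and the backup store's upsert.
    """
    now = datetime.now(timezone.utc)
    changed = fetch_modified_since(last_run)  # e.g. WHERE LastModifiedDate >= :last_run
    for record in changed:
        store(record)                         # upsert keyed on the record's ID
    return now                                # becomes last_run for the next cycle
```

Only the changed rows cross the wire, which is what makes backup over the internet tractable at scale.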
The other issue with full backups is the amount of space required to support extensive data retention. If you have 5 GB of data and a 7-year retention period, daily full backups will consume roughly 12.8 terabytes of file storage, uncompressed. And even then you don’t get the intervening versions of records during the day, only a daily snapshot.
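The storage figure above can be checked directly:

```python
daily_snapshot_gb = 5            # full backup size from the example
retention_days = 7 * 365         # 7-year retention, ignoring leap days
total_tb = daily_snapshot_gb * retention_days / 1000
print(total_tb)                  # 12.775 TB of uncompressed daily snapshots
```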
So, unless you don’t have much data, don’t plan on growing your company, and don’t need the backup for more than a few days, you should set your sights on solutions with incremental backup capability.
Frequency of Backup
If you can stand to lose an entire day’s work, a daily backup is sufficient. But what if your accounting system is corrupted and you have to go back to yesterday’s backup? Now you have no idea whom you wrote checks or sent money to. Or maybe your sales system had some great new leads, and now you’ve lost them.
Daily backups also have the problem of not knowing what really happened between the backups. If there is an audit consideration, daily backups don’t show you the complete change history of a record. Perhaps two or more employees modified financial or healthcare information, but you only backed up the final version. Who are you going to hold accountable for process violations?
Unless your application is mostly read-only, has no compliance need, contains no critical business information, leaves a paper trail of every change, and is used by a single employee with a photographic memory, daily backups just don’t cut it. A large work force will be really annoyed when you tell them to rekey an entire day’s work because you picked a daily backup solution.
Data Mapping Automation
Application systems often allow for user customization of the schema, with custom fields and perhaps even custom objects (records). New releases of a product will also change the schema. The backup application needs to respond to application schema changes of any nature, adding new fields or record types to the backup and not crashing if fields or record types are removed from the application.
There are four levels of data mapping automation.
- Hard-wired ETL approach. This is typically used to pull source data into a relational database.
  - Determine the backup data structure based on the source application’s data model, and create the backup schema.
  - Build the backup process yourself with a general-purpose ETL (Extract, Transform, Load) tool. This may include incremental logic.
  - Some vendors have implemented predefined application-specific starter packs for their ETL products that include standard record types and fields, and can be customized by the end user to add new fields.
- One-time schema automation. This is similar to the hard-wired approach, except that creation of the initial database and mappings is automated.
  - Have a one-time process that creates a backup data structure using dynamic or hard-coded metadata.
  - Require the administrator to add new application fields to the backup data store and mappings. Until this is done, new fields are either ignored by the backup process, or the process crashes with an error message about missing fields in the mappings.
- One-time schema creation with automation of new fields and record types. Many vendors document standard fields and record types in human-readable form but not in machine-readable form, while custom schema changes made by the user are available as good machine-readable metadata.
  - Determine a backup data structure based on standard record types and fields in the source application; this part is hard coded and release-dependent.
  - Handle custom fields and record types by automating the metadata, mapping the vendor’s data types to a few standard database column types: character, date, numeric, long text, and binary. The columns are created automatically in the database and mappings.
- Fully automated. This requires really good metadata from the source application for all record types and fields.
  - Creation and ongoing maintenance of the entire backup data structure and all the data mappings is entirely automated.
  - New record types and fields are automatically created in the database as tables and columns with appropriate data types and sizes, checked on each replication cycle to ensure that all data is copied. Consideration must be given to database limits on column name length: if the database caps names at 30 characters but the source allows longer names, multiple fields may share the same first 30 characters and collide. Reserved words must also be avoided as table or column names; for instance, Oracle will not let you create tables named ACCOUNT or USER.
  - Since database table and column names may not exactly match the source application names, the mappings need to be saved as local metadata.
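The automated mapping described above can be sketched in a few lines. The source type names, the 30-character limit (Oracle's classic identifier limit), and the small reserved-word set are all illustrative assumptions:

```python
# Illustrative mapping from assumed source data types to a few standard
# database column types (character, date, numeric, long text, binary).
TYPE_MAP = {"string": "VARCHAR2(255)", "date": "DATE", "number": "NUMBER",
            "textarea": "CLOB", "base64": "BLOB"}
RESERVED = {"ACCOUNT", "USER", "ORDER"}  # a few Oracle reserved words

def column_name(field_name, taken, limit=30):
    """Truncate to the database's name limit, disambiguating ties and reserved words."""
    name = field_name.upper()[:limit]
    if name in RESERVED or name in taken:
        suffix = 1
        while True:
            candidate = f"{name[:limit - len(str(suffix)) - 1]}_{suffix}"
            if candidate not in taken and candidate not in RESERVED:
                name = candidate
                break
            suffix += 1
    taken.add(name)
    return name

def map_fields(fields):
    """Map (source_name, source_type) pairs to safe (column_name, column_type) pairs."""
    taken = set()
    return [(column_name(n, taken), TYPE_MAP.get(t, "VARCHAR2(255)"))
            for n, t in fields]
```

Two fields whose names share the same first 30 characters get distinct suffixed columns, and a field named `User` is renamed rather than colliding with the reserved word, which is exactly the intervention a non-automated solution would push onto the administrator.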
Any solution that maps to a relational database that does not have full metadata automation is going to cause a lot of job failures and intervention by both the compliance solution administrator and the database administrator. This will cause lots of emergencies, and unhappy employees who are wasting time day and night perpetually keeping up with changes to the data model.
Setup and Maintenance Effort
Product setup time is the first and most obvious factor that most companies consider in their decision to buy a solution. Yet you only have to do it once. The maintenance costs and aggravation will far outweigh the initial setup costs, so examine:
- The degree of automation in handling new vendor releases or local customization of the data model. See Data Mapping Automation, above.
- Ease of upgrade for the compliance software. This is especially important if there are significant incompatibilities between releases of the compliance software and the source application. Hosted solutions can make this transition more seamless, but the vendor must keep up to date on their releases and be able to detect when your particular application instance is upgraded, as the application vendor may roll out their upgrades over time.
Keep Deleted Records
Deleted records must be retained in the backup, marked as deleted but never physically removed. Maintaining inactive records allows restoration of any deleted or lost data in the cloud, including the entire hierarchy of a customer with its contacts, sales orders, and related data, with all relationships intact.
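This soft-delete approach can be sketched with an in-memory store (the field names are illustrative):

```python
from datetime import datetime, timezone

class Backup:
    """Backup store that flags deletions instead of removing rows."""
    def __init__(self):
        self.rows = {}

    def upsert(self, record_id, data):
        self.rows[record_id] = {"data": data, "deleted_at": None}

    def mark_deleted(self, record_id):
        # The row stays in the store, so it remains recoverable later.
        self.rows[record_id]["deleted_at"] = datetime.now(timezone.utc)

    def recoverable(self):
        return list(self.rows)  # deleted rows are still listed
```

A backup that physically removed the row at this point would have nothing left to restore.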
Deployment Options
Solutions can be either on-premise or hosted (Cloud). If the data is going to be on-premise, the compliance software has to be on-premise as well, since no Cloud application is going to be able to push data through your firewall.
There are advantages in having the compliance vendor manage the solution in the Cloud:
- No staff required to administer the solution.
- No hardware required.
- Vendor expertise included in the price.
There are several reasons that could be a bad idea though:
- Dependency on availability of vendor staff.
- Scalability of vendor support staff in case of a major event affecting many of their customers at one time.
- Your data is their data. You don’t know their employees, who may be data mining or selling your data. Shipping your sensitive data to foreign countries may heighten this problem, especially if your government requires that all versions of your data be kept in your country. The European Union is very rigid on this, for instance, as they don’t trust American companies with their data. With the regularity of worldwide data breaches, this is a reasonable concern.
- You don’t have immediate access to your backups. Downloading CSV files every time you want access can be painfully slow and cumbersome.
Storage Media
If your solution is a hosted (Cloud) backup, the storage medium really isn’t your day-to-day concern, but the storage method may still compromise the capabilities of the recovery process. Common storage media are Comma Separated Values (CSV) files and relational databases. Databases are typically used only for on-premise solutions, although there is no reason that Cloud vendors can’t use relational technology. An on-premise database gives you many opportunities to use the data for other purposes, such as a reporting warehouse or a data integration hub. It also permits the compliance software to version individual records without having to create a complete backup set each time, and makes incremental copying a lot easier. Sifting through hundreds or thousands of CSV files to find the various versions of a record, or the last version of a record before it was deleted, is impractical, and navigating changes in the data model further complicates the task.
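Versioning individual records in a relational store is straightforward; here is a minimal sketch using Python's built-in sqlite3, with an illustrative table layout (a `(source_id, version)` key, and a deletion flag rather than physical deletes):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE account_history (
    source_id TEXT, version INTEGER, name TEXT, is_deleted INTEGER,
    backed_up_at TEXT, PRIMARY KEY (source_id, version))""")

def save_version(source_id, name, is_deleted=0):
    # Append the next version for this record; versions are never overwritten.
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM account_history WHERE source_id = ?",
        (source_id,)).fetchone()
    conn.execute("INSERT INTO account_history VALUES (?, ?, ?, ?, datetime('now'))",
                 (source_id, latest + 1, name, is_deleted))

save_version("001", "Acme Corp")
save_version("001", "Acme Corporation")               # later edit: version 2
save_version("001", "Acme Corporation", is_deleted=1)  # deletion, kept as version 3
```

Finding the last version of the record before it was deleted is then a single indexed query, rather than a trawl through thousands of CSV snapshots.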
Backup Retention
Examine your backup retention requirement in light of your actual business continuity and regulatory contexts. If you’re dealing with financial data, you typically need to keep every version of every record for seven years. Not because seven is a lucky number, but because laws were written around the concept of seven years being long enough to support an audit. Your mileage may vary according to your compliance governing body.
Security
Data breaches can happen with your own staff as easily as with vendors, including inadvertent release to hackers through company email phishing attacks or other penetration techniques. If your DBA or system administrator has access to the database or server where the data is kept, or passwords are lying around in plain text, your data is not 100% safe from data breaches. If there is plain-text data on a removable hard drive, you’re not safe. These and other principles also apply to Cloud vendors, who should be able to devote more effort to security, since it’s their core business, but who can also obtain more value from mining the data of their many customers. A Cloud vendor is a target-rich environment for hackers, who can mine thousands of customer databases instead of just one.
Scalability and Performance
Scalability and performance become the most significant reliability considerations for backing up Cloud application data when there are gigabytes or terabytes of data. Cloud applications are much slower to access than on-premise applications, due to internet latency, connection speed, and the performance and governors of the application’s APIs. Asking a cloud application to retrieve millions of records in one API call is just not going to happen; you will get timeouts and failures mid-way through the retrieval process. API governors may prevent you from copying the entire dataset in one pass. Any solution has to handle API timeouts and failures gracefully, with extensive retry and restart logic that will not force a complete reload of a dataset if there is a failure mid-stream. If the backup and recovery software isn’t designed to handle these factors, the reliability will suffer and large datasets may even be impossible to copy. Multi-threaded replication will give a large performance boost. Incremental logic means you’ll have to do the full load only once. Various solutions will break as data volumes increase, so evaluate the backup solution with a similar volume of data that you’ll eventually be using.
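The retry-and-restart behavior described above amounts to paging through the dataset and retrying only the failed page. A minimal sketch, where `fetch_page(offset, limit)` is a hypothetical stand-in for the source application's paged API:

```python
import time

def fetch_all(fetch_page, page_size=200, max_retries=3, backoff=0.5):
    """Pull a large dataset in pages so one huge API call is never needed.

    fetch_page(offset, limit) is a hypothetical stand-in for the source
    application's paged API. A page that times out is retried with
    exponential backoff instead of restarting the whole extract.
    """
    records, offset = [], 0
    while True:
        for attempt in range(max_retries):
            try:
                page = fetch_page(offset, page_size)
                break
            except TimeoutError:
                if attempt == max_retries - 1:
                    raise  # give up only after all retries are exhausted
                time.sleep(backoff * 2 ** attempt)
        if not page:
            return records
        records.extend(page)
        offset += len(page)
```

A production replicator would also checkpoint `offset` durably so a crash resumes mid-stream rather than forcing a full reload, and would run several such streams in parallel for the multi-threaded boost mentioned above.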
Data Recovery
There is no value in a database backup if you cannot easily and completely recover the data. The objective is to be able to recover any or all records, and any or all fields, to any point in time, whether that means recovering lost data within a record or entire deleted records.
If your on-premise or managed hosted offering does not include professional services or at least 24-hour phone support for actual recovery, you are going to be twisting in the wind come an actual recovery event. This is not something that many people train for on a daily basis, and the circumstances and kind of recovery needed will vary from time to time. Successful application recovery depends not only on the recovery software and support, but on navigating business rules in the business applications that prevent recovery, such as needing to have inventory to recover sales orders or passing edits or duplicate checks that have been added since the original data was created. No data recovery product can know what these rules are in advance, so expect some surprises.
The least valuable data compliance services or products are those that expect you to develop your own recovery system using low-level data import tools. You’re wasting your money if you don’t have a complete solution.
Recovery of Relationships
Unless you’re just recovering one object that has no related records in your Cloud application, it’s very important to handle the relationships between records. For instance, an Account has Contacts and many other related objects. If you restore the Account record, you also need to recover the Contacts and other record types, and do it in a way that relates them to the new primary keys on the Account, which are auto-generated at recovery time. We call this a Parent-Child relationship, supported by Primary Keys in the Parent (ACCOUNT.ID) and Foreign Keys in the Child (CONTACT.ACCOUNTID) with the same value. The foreign keys in any deleted Contact records that point to the deleted version of the Account have to be changed before you can recover the child records (CONTACT.ACCOUNTID → ACCOUNT.ID). If your recovery solution doesn’t handle this automatically, you can’t recover the entire structure.
Likewise, the relationships may extend to
- Recursive relationships between the same record type (ACCOUNT.PARENTID to ACCOUNT.ID) for a subsidiary company,
- Intersection objects (PARTNER.PARTNERACCOUNTID to ACCOUNT.ID and PARTNER.ACCOUNTID to ACCOUNT.ID of a separate record), and
- Multiple relationships to the same parent type (ORDER.ACCOUNTID to ACCOUNT.ID and ORDER.PARTNERID to ACCOUNT.ID of a separate record).
If your recovery solution can’t handle all of these, chances are you can’t recover all of your data, including the relationships.
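The key-remapping step underlying all of these cases can be sketched as follows. The record shapes are illustrative, and `insert(record_type, data)` is a hypothetical stand-in for the target application's create API, which returns the auto-generated primary key:

```python
def recover(accounts, contacts, insert):
    """Re-insert parent records first, then rewrite child foreign keys.

    insert(record_type, data) is a hypothetical stand-in for the target
    application's create API; it returns the new auto-generated primary key.
    """
    id_map = {}  # old ACCOUNT.ID -> new ACCOUNT.ID assigned at recovery time
    for acct in accounts:
        id_map[acct["id"]] = insert("Account", {"name": acct["name"]})
    for contact in contacts:
        # CONTACT.ACCOUNTID must point at the *new* parent key, not the old one.
        insert("Contact", {"name": contact["name"],
                           "account_id": id_map[contact["account_id"]]})
    return id_map
```

Self-referencing keys such as ACCOUNT.PARENTID need a second pass once every new Account ID is known, since a parent may be inserted after its child; that pass is omitted here for brevity.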
Metadata Backup and Recovery
Cloud applications typically allow user customization, and may include a metadata API layer to retrieve the data structure, code, business rules, reference data, and report and screen layouts. Customers of such applications often have concerns about recovering from loss of the whole system as well as from data disasters. Backing up metadata is typically much easier than recovering it because of the interrelated nature of customizations; you can’t create code that references an object that doesn’t yet exist. Some applications require you to make code changes in a test environment and deploy them only once code-coverage testing has been done, so directly restoring code is not even allowed. Just be aware of the limitations of metadata recovery.
In summary, each of the following features and qualities of backup and recovery systems may or may not be relevant to your use case. Decide what is relevant. Then ensure that your solution meets the show-stopper requirements, and narrow it down to the best product based on the nice-to-have features.
- Supports your particular application; don’t worry about apps that you don’t have
- Handles the special considerations involved if you have a hosted application
- Has both backup and recovery capability
- Allows full or incremental backup, unless you have a very small amount of data
- Frequency of backup matches the business requirement
- Data mapping is automated initially and when the source schema changes
- Setup and maintenance effort is reasonable; setup is a one-time operation, but maintenance is ongoing
- Keeps deleted records in the backup, otherwise it can’t recover them
- Deployment options suit your needs; hosted for low hassle, on-premise if you need access to your data or don’t trust outsiders with your data
- Storage media suitable to your needs, if on-premise; databases are useful for reporting and integration too
- Backup retention meets business continuity and regulatory requirements
- Security is adequate to prevent data breaches
- Scalability and performance can handle your data volumes, especially for hosted applications
- Data recovery can be done rapidly and easily, knowing that full automation may not be possible without preparation and testing
- Recovers relationships, including parent-child, intersection, and recursive if applicable to your source schema
- Metadata recovery, or at least an audit of changes so you can put it back in a controlled manner