10 Questions you should ask your DR provider

There has been a ton of misinformation surrounding legitimate Disaster Recovery for Virtualized environments. There are many purveyors of “all-you-can-eat DR to the Cloud” for very attractive blanket rates such as $600.00/month. Other providers offer “per-VM to the Cloud” protection in the sub $20.00/month price range.

Most of these unbelievably low-cost offerings are smoke and mirrors, and they turn a profit in one of two ways:

  1. Offsite backup solutions without committed resources, masquerading as Disaster Recovery Plans. While, technically, an enterprise could recover from a remote backup, locating resources, provisioning those resources and transferring data would take far too long.
  2. Complex and incomprehensible price structures designed to lead organizations into making a commitment and then obligate them for far more than originally intended. In other industries, this is known as bait-and-switch, but the term has yet to be applied to Disaster Recovery as a Service (DRaaS), as far as I know.

Before we continue, we must understand the difference between Offsite Backup and true Disaster Recovery as a Service (DRaaS).

Offsite Backup

Offsite Backup is just that: incremental transfer of your data or VM images to another site. The provider of the service is at liberty to compress, de-duplicate or archive your data in any legitimate way of their choosing. Furthermore, some providers grant only image-level access (whole VM at a time), while others also allow file-level access to data. Offsite Backup services can also dictate the terms for returning your data in a time of need. Not only may there be upcharges for recovery, but larger environments (10+ TB) may also face having their data returned on physical disks by common carrier, courier or private jet!
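To see why large environments end up receiving disks rather than downloads, it helps to run the numbers. The sketch below is a rough estimate only: the function name `transfer_hours`, the link speeds and the 80% link-efficiency figure are illustrative assumptions, not figures from any provider.

```python
def transfer_hours(data_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move data_tb terabytes over a network link.

    data_tb    -- data size in decimal terabytes (1 TB = 10**12 bytes)
    link_mbps  -- nominal link speed in megabits per second
    efficiency -- assumed fraction of the nominal speed actually achieved
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0, efficiency in (0, 1]")
    total_bits = data_tb * 1e12 * 8                   # bytes -> bits
    effective_bps = link_mbps * 1e6 * efficiency      # usable bits per second
    return total_bits / effective_bps / 3600          # seconds -> hours


if __name__ == "__main__":
    # Hypothetical link speeds, chosen only for illustration.
    for mbps in (100, 1000, 10000):
        hours = transfer_hours(10, mbps)
        print(f"10 TB over {mbps} Mbit/s: about {hours:.1f} hours")
```

Under these assumptions, 10 TB takes more than a day even on a 1 Gbit/s link, and well over a week at 100 Mbit/s, before any time spent provisioning resources to run the restored systems.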

Some higher-end Offsite Backup providers also have Compute resources available (a Compute Cloud) onto which you could transfer your backups for recovery. There will almost always be an upcharge, not only for transferring the data, but also for powering on. Moreover, think about what will happen when the next Hurricane Sandy hits; the big players will gobble up all of the resources and the rest will be left standing in the rain (pun intended!).

Disaster Recovery as a Service

DRaaS providers are those organizations that dedicate Compute resources to customers in case of a disaster. The most legitimate of these providers use a technique called Replication to create a copy (a “Replica”) of your working systems in a Compute Cloud. With DRaaS, you have a known amount of committed resources on which you can power up your systems, and items such as allocated network bandwidth and the number of IP addresses are clearly stated in advance.

With DRaaS, it is possible to test and validate your Disaster Recovery plan on a schedule in order to maintain compliance with standards such as SSAE 16, ISO 9000 and more.

Tough questions for DR providers:

Before you commit, we would like you to consider the following questions, and how your chosen DRaaS provider would answer them:

  1. In the event of a disaster, are there committed resources on which I can power my VMs?
    • If no, this is probably not a DR solution but rather an offsite backup and Data Warehousing solution.
  2. If I need to move/transfer my data before running a DR plan, how is my data transferred to the Compute Cloud where it will run (Disks in the mail or online)?
    • Plans which require transfer of data are offsite backup and Data Warehousing solutions.
    • Plans of this type may require over 24 hours to move/transfer data. Be sure to understand the SLA for making your data operational.
    • DR testing and validation becomes nearly impossible.
  3. What is the Recovery Point Objective (RPO) into the DR solution?
    • An RPO of over 1 hour is often not acceptable for enterprises.
    • Many plans will try to maintain that “up to 24 hours” is acceptable for RPO. Can your organization sustain the loss of 24 hours of data?
  4. What is the Recovery Time Objective (RTO) during a disaster?
    • RTO should allow users to begin failover immediately in the event of a disaster.
  5. Is there a clearly stated and understandable price-structure so I understand costs if my needs change?
    • Overly complicated price and rate structures are the ruse Big Cloud providers use to get customers to sign on for one amount and end up paying something vastly different.
  6. Does this solution allow me a minimum of 1-Week (168 Hours) of scheduled DR testing at no additional charge?
    • If no, this may not actually be a DR solution.
    • If there is an additional charge for testing, is that charge and all associated expenses clearly understood? (See question #5)
  7. During scheduled DR testing, would my VMs/replicas be powered on in the Compute Cloud where I would run in case of a disaster?
    • If no, this may not actually be a DR solution.
    • If no, any reasonable test is invalid as it does not use the resources that would be committed in case of a disaster.
  8. What are the up-time statistics in the Compute Cloud where I would run in case of a disaster?
    • The standard is “Five Nines” (99.999% up-time).
    • Many providers frequently violate client SLAs. Ask if the provider is in violation of any client SLAs.
  9. Before I commit to this solution, is there a trial period during which I may initiate a call at a time of my choosing to your actual technical-support system (not a technical sales associate) to judge response time, technical proficiency and language skills?
    • Due-diligence should mandate a test-drive of this solution’s support-system at a time of your choosing.
    • You need to understand the response-times, technical proficiency and language skills of actual Technical Support Representatives.
    • The vendor is likely to try to direct your calls during a trial to a highly-educated and English-language proficient Technical Sales Engineer (TSE). Beware, this is not representative of an actual support call at 6:00 P.M. on a Sunday evening in advance of Hurricane Sandy!
  10. Is the DR provider audited to any known or acceptable standard such as SSAE 16 SOC Type 2?
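
When a provider quotes an up-time percentage in answer to question 8, it can be useful to translate it into the downtime it actually permits. The short sketch below does that arithmetic; the function name `allowed_downtime_minutes` and the choice of a 365-day year are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_minutes(availability_pct: float,
                             period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the minutes of downtime permitted by an up-time percentage."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return period_hours * 60 * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Compare "Five Nines" with two weaker commitments.
    for pct in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% up-time allows about {minutes:.2f} minutes of downtime per year")
```

By this calculation, “Five Nines” leaves a little over five minutes of downtime per year, while 99.9% leaves nearly nine hours, so it is worth asking over what period the provider measures its figure.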
John Borhek

About: John Borhek

John Borhek (VCP 3-6.5) is the IT Director and Lead Solutions Architect at VMsources Group Inc. John has soup-to-nuts experience in Mission Critical Infrastructure and GxP systems, specializing in Datacenter Infrastructure Management (DCIM) and Operational Technology (OT) all over the United States and throughout the Americas.

