Where to Start with Disaster Recovery in SQL Server
Backup and restore? Log shipping? Maybe Failover Clustered Instances or Availability Groups can be used? Oh, what about Azure or another cloud provider? Some data centers offer “push button DR.” Will that work? There are so many options. Where should we start with Disaster Recovery for our SQL Server?
The point of Disaster Recovery
Disasters happen. No matter how hard we try to keep the proverbial trains running on time, sometimes things go off the rails.
Despite our best efforts, unexpected events can and do occur that take down a network, compromise a server stack, or corrupt a database. Ransomware has become a profitable, albeit illegal, business model for many. And SQL Server-specific attacks, such as MrbMiner and Vollgar, have been found in the wild.
Some disasters can even bring down entire regions. Hurricanes and tornados, wildfires and snowstorms, even a typo in an update script can affect the infrastructure our business networks depend on.
Successful companies account for those contingencies. More importantly, they practice for them. But we’re getting ahead of ourselves.
The first question to ask when thinking about Disaster Recovery
All of the technologies mentioned before may be useful in a Disaster Recovery plan. So which is right for you?
Before addressing or selecting the technology, it’s best to fully understand what you’re trying to accomplish with your Disaster Recovery plan. What are the business requirements? How much does it cost the company if one or more of its main SQL Servers is down?
For example, if you’re a healthcare provider company, how much does it cost for your practice management system or electronic medical records system to be offline for an hour because of a disaster? If you’re a bank or a law firm, a construction company or a governmental agency, it’s the same question. How will your business, your operations, your customers, be affected if your systems are unavailable?
Start with that question.
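To make the question concrete, here is a minimal sketch of the arithmetic. It assumes outage cost is only lost revenue plus idle staff time; the function name and every figure are hypothetical, and a real estimate would likely add items such as penalties or reputational damage.

```python
def downtime_cost(hours_down, revenue_per_hour, staff_idle, loaded_hourly_rate):
    """Rough outage cost: lost revenue plus the cost of idle staff time."""
    lost_revenue = hours_down * revenue_per_hour
    lost_productivity = hours_down * staff_idle * loaded_hourly_rate
    return lost_revenue + lost_productivity


# Hypothetical figures, for illustration only.
cost = downtime_cost(
    hours_down=1,
    revenue_per_hour=12_000,
    staff_idle=40,
    loaded_hourly_rate=55,
)
print(f"Estimated cost of a one-hour outage: ${cost:,.0f}")
```

Even a rough number like this gives stakeholders something specific to react to.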
The answer to that question will lead to two additional questions. What are your Recovery Time Objectives (RTOs) and what are your Recovery Point Objectives (RPOs)?
Recovery Time Objectives (RTOs)
When a critical system is stricken by a disaster of some kind, whether it’s accidentally self-inflicted or due to an unpredictable fluke, how long can it be offline without causing a severe problem for the organization or its customers?
You’ll need to think through the question in great detail. For some businesses, the primary pain point will be loss of revenue or productivity. Maybe the damaged reputation in the marketplace is a big factor as well. For other businesses, actual lives may be at stake. 911 systems, emergency responders, and emergency rooms cannot afford to be offline.
Get the business’ perspective on it. Ask the stakeholders. Understand the consequences.
Those answers will drive your RTOs. Your RTO is the maximum amount of time the business can tolerate a system being offline while you recover from a disaster. And your RTOs will help determine the technology you’ll need to support your objectives.
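One way to sanity-check an RTO is to add up how long each recovery step is expected to take and compare the total with the objective. The sketch below assumes a simple sequential recovery; the step names and durations are hypothetical placeholders, not measurements.

```python
from datetime import timedelta


def meets_rto(rto, steps):
    """Sum the estimated recovery steps and compare the total with the RTO."""
    total = sum(steps.values(), timedelta())
    return total, total <= rto


# Hypothetical recovery steps and durations, for illustration only.
steps = {
    "detect and declare the disaster": timedelta(minutes=20),
    "bring up or fail over to standby hardware": timedelta(minutes=30),
    "restore databases": timedelta(minutes=90),
    "validate and reconnect applications": timedelta(minutes=25),
}

total, ok = meets_rto(timedelta(hours=2), steps)
print(f"Estimated recovery time: {total}, meets RTO: {ok}")
```

If the total exceeds the objective, either the objective or the approach needs to change. Timing an actual practice recovery is the best way to replace these estimates with real numbers.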
Recovery Point Objectives (RPOs)
If a critical system goes offline unexpectedly, how much data can be lost as a result of the disruptive event? For example, if your customer relationship management system (CRM) crashes, how much data loss is acceptable? Five minutes? An hour? Four hours? More?
What about your online store? Or your pharmacy system? How much data loss is acceptable?
Consider this example. If your financial system crashes during year-end processing, can you afford to re-enter 10 minutes’ worth of work? 30 minutes? 8 hours? These questions fuel your RPOs. Just how close to the time of the event will you need to recover data? And how much can be re-entered or re-imported?
As with your RTOs, your RPOs will help determine your Disaster Recovery approach, and the technology you use to implement it.
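As an illustration of how an RPO constrains the approach, consider a plain backup-and-restore strategy. Assuming the most recent transaction log cannot be salvaged after the disaster, the worst-case data loss is roughly the interval between log backups plus any delay in copying them off the server. The function names and numbers below are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(log_backup_interval, offsite_copy_delay=timedelta(0)):
    """Worst case: disaster strikes just before the next backup lands offsite."""
    return log_backup_interval + offsite_copy_delay


def meets_rpo(rpo, log_backup_interval, offsite_copy_delay=timedelta(0)):
    """True if the backup schedule keeps worst-case loss within the RPO."""
    return worst_case_data_loss(log_backup_interval, offsite_copy_delay) <= rpo


# Hypothetical schedule: 10-minute RPO, 2-minute offsite copy delay.
rpo = timedelta(minutes=10)
delay = timedelta(minutes=2)
for interval in (timedelta(minutes=15), timedelta(minutes=5)):
    loss = worst_case_data_loss(interval, delay)
    print(f"Log backups every {interval}: worst-case loss {loss}, "
          f"meets RPO: {meets_rpo(rpo, interval, delay)}")
```

Other technologies change the arithmetic, but the same comparison applies: estimate the worst-case loss and check it against the objective.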
Balancing cost, probability, and pain
In a world without financial constraints, your RPOs and RTOs will determine the technology decisions. If you really want 99.99% uptime with no data loss in the event of a disaster, you can architect a solution to support those goals. It will be expensive. Maybe not cost prohibitive if those are truly your goals, but expensive nonetheless.
But most of us don’t have the luxury of an unlimited budget. So we must consider the cost of our solution. You’ll need to consider the hardware, licensing, and storage costs of your solution. If you’re looking at multiple locations across the country, you’ll need to factor in those costs as well.
It’s a balancing act. To help with the balancing act, factor in the likelihood of the event happening. How likely is it that you’ll have a major disruption in the next 12 months? In the next 24 or 48 months?
Don’t fall into the trap of looking in the past for guidance. “Nothing happened in the last 2 years so we should be good.” That’s not good planning. Look at the industry. Consider the trends. Assess the risk. And make a decision.
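The balancing act can be sketched as a simple expected-cost comparison: the yearly cost of a solution plus the probability of a disaster multiplied by what the resulting outage would cost. Every option name, price, probability, and duration below is a made-up placeholder, and the model deliberately ignores data loss and other factors.

```python
def total_annual_cost(solution_cost, annual_probability, outage_hours, cost_per_hour):
    """Yearly solution cost plus the expected yearly loss from an outage."""
    expected_loss = annual_probability * outage_hours * cost_per_hour
    return solution_cost + expected_loss


# Hypothetical options: a 10% yearly chance of disaster, $14,200 per hour of downtime.
options = {
    "nightly backups only": total_annual_cost(5_000, 0.10, 24, 14_200),
    "log shipping to a second site": total_annual_cost(30_000, 0.10, 2, 14_200),
}

for name, cost in options.items():
    print(f"{name}: ${cost:,.0f} per year")

cheapest = min(options, key=options.get)
print(f"Lowest expected cost: {cheapest}")
```

With these placeholder inputs the more expensive solution comes out ahead, but a lower probability or a cheaper outage would flip the result. That sensitivity is the reason to assess the risk carefully rather than rely on the recent past.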
So, where do you start with Disaster Recovery in SQL Server?
Start by asking the right question. How will a disaster affect my customers and my business? Based on that answer, determine your RTOs and RPOs. And finally, select the most appropriate technology to meet your needs.