If your business is like every other organization in the 21st century, it relies on key data and systems to operate. Key systems might be email, phone, or industry-operation applications. Key data examples include customer records, payment information, and service/order records. Heck, even the most “basic” small business is responsible for hundreds-to-thousands of gigabytes in data, and highly dependent on its preferred communications or operations app.
If your enterprise matches this description, you need business continuity. Unfortunately, most small business leaders’ idea of business continuity is an external hard drive (installed three years ago) and the phone number of their IT support company in case of a natural disaster. Such an approach won’t get the job done, and could result in thousands of dollars in damage, or worse: permanently closed doors.
So, what is business continuity and how do you achieve it (realistically)? Let’s explore:
What is business continuity?
Business continuity planning or business continuity management is broader than data/network contingencies. One helpful definition from DRI International describes business continuity as a “holistic management process that identifies potential threats to an organization and the impacts to business operations those threats, if realized, might cause, and which provides a framework for building organizational resilience with the capability of an effective response that safeguards the interests of its key stakeholders, reputation, brand and value-creating activities.”
What do you do when the internet goes down, when power goes down, when a hurricane comes through, when the company president goes on extended medical leave, when one of the company service vehicles gets totaled, etc.? While this article is focused on the data/network aspects of business continuity, many of the practices listed here scale out to the overall business continuity plan.
IT-specific business continuity
Data, network, power, systems, hardware – these are some of the aspects of your business which are directly IT-related, but bleed into all other aspects of operations. The goal here is to have a defined plan/process with the right people and solutions in place to mitigate risk of two great evils: downtime and data-loss.
The cost of downtime and data-loss – results vary from organization to organization, but a 2018 study (sponsored by IBM) estimates the cost of data loss at $148 per lost record. The cost of downtime? A 2015 IDC study put estimated damage at hundreds of dollars per minute for SMBs. Given data points like these, it isn’t surprising to hear how many small businesses permanently close their doors following a disaster. The Federal Emergency Management Agency (FEMA) has previously put the share of small businesses that never reopen following a disaster at nearly 40%.
In short, the risks are grievous and most SMBs simply aren’t equipped to avoid thousands of dollars in damages when thrust into inevitable situations.
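To make those figures concrete, here is a rough back-of-the-envelope sketch in Python. The $148 per record comes from the study cited above; the $300-per-minute downtime rate and the incident itself are assumptions chosen only for illustration, so substitute your own numbers.

```python
# Illustrative figures only; replace them with estimates for your own business.
COST_PER_LOST_RECORD = 148       # USD, the 2018 per-record figure cited above
DOWNTIME_COST_PER_MINUTE = 300   # USD, an assumed stand-in for "hundreds of dollars per minute"


def estimate_incident_cost(records_lost, downtime_minutes):
    """Return a rough dollar estimate for a single incident."""
    data_loss_cost = records_lost * COST_PER_LOST_RECORD
    downtime_cost = downtime_minutes * DOWNTIME_COST_PER_MINUTE
    return data_loss_cost + downtime_cost


# Hypothetical incident: 200 records lost and one 8-hour workday offline.
total = estimate_incident_cost(records_lost=200, downtime_minutes=8 * 60)
print(f"Estimated cost: ${total:,}")  # Estimated cost: $173,600
```

Even with modest inputs, the downtime portion dominates, which is why backup alone is not the whole answer.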
Here are some of the most common causes of downtime and data-loss that lead to such damages.
Common causes of IT (i.e. business) disasters
- UPS battery failure: the power goes out (lightning strike) and the UPS doesn’t do what it’s supposed to (often because there isn’t one in place)
- Human error: accidental deletes, spills/drops, lost devices; closely related are theft and larceny
- Equipment failures: failed motherboard, RAID malfunction, Windows corruption, bad update
- Internet outages: fiber cut, cable line malfunction, area wide outage
- Cybercrime: ransomware, brute force attack, destructive virus
- Natural disasters: hurricane, tornado, flooding
While most business leaders equate disaster with hurricanes and ransomware attacks, the other listed causes happen much more frequently. In early 2019, Microsoft had to roll back updates that were causing PCs to freeze. In 2017, millions of business users experienced downtime due to a 4-hour outage at Amazon Web Services (Amazon’s popular Cloud infrastructure platform). One time, one of our clients had an A/C system overflow and leak through the ceiling onto its network rack!
The point is that disasters come in many shapes and sizes, and Murphy’s Law tends to be proven right eventually.
My business backs up its data, so I have business continuity, right?
Wrong. A properly implemented, business-grade backup solution will prevent permanent data-loss; however, it does nothing to mitigate downtime. Let’s compare some of the common IT approaches to data management and highlight best practices to mitigate data-loss:
Data backup – Most data backup solutions are what IT nerds call “flat file”. Simply put, you have the files in a backup drive, and nothing more. The fact that yours comes from a popular software company, is automated, is managed, or is “in The Cloud” doesn’t erase the fact that your business relies on a weak system. In the event your primary system is damaged, it could take days-to-weeks to restore the primary environment with flat-file data backups.
Image backup – The difference between an image and data backup is the image backup comprises the data along with applications, profiles, and settings. Restoring a system with an image backup can be completed in a fraction of the time it takes to restore with mere data backup. We’re getting better…but not there yet.
Disaster recovery – Disaster recovery (D/R) is colloquially used by IT pros to describe having a comprehensive backup in multiple locations, namely onsite and offsite. The point is to have a comprehensive image onsite for swiftest restore, but also have a backup at a separate location, e.g., public cloud or secondary data center.
In this setup, your system will still be down until you can repair or replace it, but the system image will be ready to go even if a flood takes out the local backup or ransomware encrypts the local image.
You may have read these options, and asked “um, where’s the option where we don’t lose function for hours-to-days at a time?” Now you are talking about business continuity.
Achieving IT business continuity – system design
Redundancy!! – Business continuity is all about contingencies, right? If A goes down, then where is B? The key to reducing risk of downtime and data-loss is redundancy. Too many small businesses suffer thousands of dollars in damages because they run their operations on “single point-of-failure” systems. Your business should avoid this design like the plague.
Here are some examples of how to deploy redundancy in your network:
- Locally hosted application: mirrored host at second data center or emergency Cloud service
- Executive computer: retain a “hot-spare” computer
- Internet service: back-up cable or cellular service plan
- Power: backup power sources, e.g., building generator, server UPS, desktop UPS
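One simple way to spot single points of failure is to list each critical function alongside everything that can currently provide it, then flag anything with fewer than two providers. The sketch below assumes a made-up inventory; the names and entries are hypothetical, not a recommended configuration.

```python
# Hypothetical inventory: each critical function mapped to what can provide it today.
inventory = {
    "internet": ["primary ISP", "cellular failover"],
    "power": ["server UPS"],
    "line-of-business app": ["onsite server"],
    "executive computer": ["main laptop", "hot-spare laptop"],
}


def single_points_of_failure(functions):
    """Return the functions that have fewer than two providers, sorted by name."""
    return sorted(
        name for name, providers in functions.items() if len(providers) < 2
    )


print(single_points_of_failure(inventory))
# ['line-of-business app', 'power']
```

Anything this check flags is a candidate for one of the redundancy options listed above.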
Ask your IT partner if redundancy is right for you – if they say no, fire them. Quality IT service organizations help SMBs design their infrastructure with redundancy at the forefront.
Offsite backup, server clustering, server virtualization, hybrid infrastructure, and even RAID drive redundancy are all methods to protect business operations. It’s more sophisticated than “put it in The Cloud” (which often doesn’t have redundancy).
Avoid single point-of-failure!
Achieving IT business continuity – the plan
By definition, business continuity is really a management process (rather than a product) and should be more robust than calling your IT partner. When a lightning storm kills the office power or a ransomware attack encrypts the application server, everyone needs to know their role in the process of disaster recovery. From the IT perspective, how do we mitigate downtime and data loss?
To answer that, you’ll need to determine your Recovery Point Objective (RPO) and Recovery Time Objective (RTO); an effective business continuity plan depends on both.
Recovery Point Objective – How much data can your business lose permanently before the business experiences significant damage? Before the business suffers damage from which it can’t recover?
To bring the point home, how many customer records (at $148 per record) could you permanently lose before the damage becomes significant? How many customer payment profiles could be lost before the business has to shut down permanently?
Recovery Time Objective – How long can your business be without X system or X files before it suffers significant or irreparable damage?
What would happen if your organization had an email outage for the entire day? What if it lost access to the accounting system for several days?
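Once you have an RPO and RTO for a system, you can compare them against what your current setup actually delivers. The following is a minimal sketch under a simplifying assumption: worst-case data loss equals the time between backups, and worst-case downtime equals the estimated restore time. The class, field names, and example values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class SystemPlan:
    name: str
    rpo: timedelta                # most data (measured in time) the business can afford to lose
    rto: timedelta                # longest outage the business can tolerate
    backup_interval: timedelta    # time between backups
    estimated_restore: timedelta  # expected time to bring the system back


def find_gaps(plan):
    """Return the objectives ("RPO", "RTO") that the current setup would miss."""
    gaps = []
    # Worst case, a failure just before the next backup loses a full interval of data.
    if plan.backup_interval > plan.rpo:
        gaps.append("RPO")
    # If a restore takes longer than the tolerable outage, the RTO is missed.
    if plan.estimated_restore > plan.rto:
        gaps.append("RTO")
    return gaps


# Hypothetical example: a nightly flat-file backup that takes three days to restore.
accounting = SystemPlan(
    name="accounting",
    rpo=timedelta(hours=4),
    rto=timedelta(days=1),
    backup_interval=timedelta(hours=24),
    estimated_restore=timedelta(days=3),
)
print(accounting.name, "misses:", find_gaps(accounting))
# accounting misses: ['RPO', 'RTO']
```

A gap on the RPO side points to more frequent backups; a gap on the RTO side points to image backups or redundant systems.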
The task of determining RPO and RTO for every department, system, and data set seems daunting, and would likely prevent small business leaders from taking initiative to protect their organization. Rather than succumbing to “paralysis-by-analysis”, let’s get started with something more elementary.
Your first step to business continuity
Get out a piece of paper and think about the most important systems in your business (or network if you want to focus on IT). Which people would be the most difficult to replace for a short duration (maternity leave, vacation, injury) or permanent duration? Which applications are most imperative to servicing customers or managing the company books? Organize such facets of your business into a hierarchy, get executive buy-in, and determine the RPO and RTO for the top two or three.
From there you can work with your internal team and external partners to (a) harden your systems against threats, and (b) create a plan and process for recovering when threats are realized.
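If pen and paper feels too loose, the same ranking exercise can be sketched in a few lines of Python. The systems and hourly outage costs below are invented guesses for illustration; the only idea carried over from the text is ranking what matters most and starting with the top two or three.

```python
# Hypothetical systems with a rough guess at what one hour of outage costs (USD).
hourly_outage_cost = {
    "email": 150,
    "accounting system": 400,
    "customer database": 900,
    "phone system": 250,
    "file share": 100,
}


def top_priorities(costs, count=3):
    """Return the `count` systems with the highest estimated hourly outage cost."""
    ranked = sorted(costs, key=costs.get, reverse=True)
    return ranked[:count]


print(top_priorities(hourly_outage_cost))
# ['customer database', 'accounting system', 'phone system']
```

These are the systems to define an RPO and RTO for first.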
Remember that failure to create business continuity plans costs “everyday” small businesses thousands in damages via downtime and data-loss. The causes of network disasters are numerous and managing company data is only becoming more challenging as data growth rates soar.
So, ditch that old data backup and work with executive management and your IT partner to drive redundancy in design and clarity in process. That’s business continuity.