DISASTER RECOVERY PLAN & BUSINESS CONTINUITY OF HDI
A Straightforward Approach
“Don’t put all your eggs in one basket.” This old chestnut is particularly appropriate when it comes to
managing risk. Risk represents different things depending on your audience. We often hear of the
importance of managing the risks associated with making personal investments. Until we understand
investing well enough to weigh equity against the level of tolerable risk, we risk making uninformed
decisions that could have devastating results.
The same principle holds true for organizations seeking to manage their own risk while striving to
achieve the goals and priorities set by their management. The business continuity and disaster recovery
(BC/DR) plan is a key tool in any organization’s risk management toolbox. This article is intended to
serve as a primer for creating a BC/DR plan, providing you with the essential knowledge, expertise, and
practical decision-making skills you need to be successful.
An Enterprise-Centric, Holistic Framework
Business requirements drive the criteria for quality. These requirements and their corresponding
processes are the “customer-driven” nervous system of the organization. The continuity requirements of
these processes drive the level of risk mitigation investment the organization includes in its business
strategy. These business processes, in turn, enable service delivery. The result is a BC/DR program that
provides for the needs of the business and reflects the business’s commitments to its customers.
IT provides specific technical services to various lines of business throughout an organization. These
business operations have an inside-out view of the organization, while the executive level looks from
the outside in. The diagram on the following page illustrates the holistic nature of a typical BC/DR
program with this bidirectional view. Lower-level operational activities are the key to linking specific
customer-driven requirements for risk tolerance with criteria for quality. The high-level activities set
the framework for strategy, policy, and constraints on corporate requirements.
Let’s take a closer look at the continuum model section by section to better understand the key activities
and deliverables, and the role the service desk plays in the overall delivery.
Senior Executive Direction and Commitment
Leadership is an essential element in setting the strategic vision. In today’s fast-paced business world,
most executives are concerned with developing a strategy that helps the organization achieve and
sustain strength and profitability. With this in mind, it is still possible to build a sustainable BC/DR
program and secure solid commitment at the executive level. The priorities established during strategic
planning become the goals, objectives, values, and guiding principles that support the risk management
program and protect the organization’s investment.
From a risk management perspective, senior leadership typically takes the following into consideration:
Regulatory, financial, and legal issues;
Customer obligations;
Insurance coverage (requirements and protection);
Risk mitigation requirements (as a means of protecting some business functions); and
Image and reputation.
BC/DR Strategy and Policy
Before continuing, it is important to understand the differences between BC/DR programs and BC/DR
plans. The BC/DR program is the business continuity management lifecycle that supports the risk
management process and protects the organization’s investment and assets. The BC/DR plan falls under
the control of the BC/DR program and serves as the instruction manual for the continuity and recovery
of business operations and technology services. Holistic in nature, it is a road map for a recovery
seamless enough to sustain an acceptable level of service delivery.
The strategy and policy phase sets the policies and standards required to achieve the desired results.
The key components of this phase are:
Policies and standards (for ensuring the continuity of business operations);
Maximum allowable downtime (MADT) and recovery time objectives (RTOs);
Resilience;
Methodology and tools;
Processes; and
Deliverables and results.
Typically, a business continuity (BC) manager is responsible for driving the program by providing the
oversight and direction that open the doors of communication and cooperation from upper
management down through to the lower levels of the organization. The BC manager is ultimately
responsible for the program’s success; however, he or she will require the assistance of key individuals.
A disaster recovery (DR) manager, for example, may be chosen to lead the DR portion of the program.
Each business unit within the organization also plays a key role. The service desk, for example, is
typically a mission-critical business unit. The services it provides before, during, and after a service
disruption are vital. Therefore, the service desk should be proactive and develop and maintain its own
viable BC/DR plan. Such a plan would include things like workaround procedures to mitigate the
adverse impacts of a service disruption and keep key business processes operating at an acceptable
level. It would also include provisions for handling the increased call volumes the service desk is likely
to experience during a service disruption.
The planning stage is critical, and for best results, you should follow your tactical delivery road map. The
essential goal is to design, develop, and implement the BC/DR program in a controlled manner, using the
iterative process illustrated in the diagram above. Remember, to ensure seamless implementation, a
well-established delivery model and execution plan are critical.
BC/DR Business Processes
The planning, solution development, and delivery activities that take place in this phase are
continuously improved over time, as business requirements evolve. For example, the service desk
contributes to the BC/DR program by actively participating in planning activities, and by building,
training, testing, and continuously improving its own BC/DR plan. It also collaborates with other service
units, when necessary, to ensure that the BC/DR plan is consistent and provides seamless, end-to-end
continuity.
The service desk is also responsible for provisioning the back-up facility, to be used when the primary
facility is rendered unusable. Any tools and systems lost during the disaster, such as the incident logging
tool, must be recovered. Following its BC/DR plan, the service desk would simply mobilize its staff and
any other required resources to the alternate facility and resume operations in an effort to maintain
service continuity. It may not be feasible to deploy the whole solution all at once, but pieces of the
solution can be compartmentalized and deployed selectively over time.
These are the typical elements of an established BC/DR program and plan:
Business requirements;
Methodology, policies, and guidelines;
Processes and procedures;
Tools and templates;
Risk assessment and business impact analysis (BIA);
Delivery and technical capability;
BC/DR strategy and solution;
Emergency response (includes disaster declaration, escalation and notification, call trees, etc.);
Technical recovery plan (i.e., recover the IT infrastructure);
Resumption plan;
Repatriation plan (i.e., return process activity and/or IT back to steady-state); and
Ongoing testing, maintenance, audit governance, and continuous improvement.
Building, Maintaining, and Testing Plans
Once the BC/DR program and plan are in place, they must be maintained. Having invested large amounts
of time, energy, and money in deploying the program and plan, the organization must continuously
improve them, particularly as business requirements change and evolve. This helps ensure that their
accuracy and integrity are in line with the business’s needs.
The BC manager will work with key representatives from across the organization, including the service
desk, to provide oversight and direction, and to conduct testing exercises. Customers and third-party
service providers should also be included in these exercises. Testing provides a mechanism for
identifying and correcting any deficiencies and nonconformities, and for keeping management and
customers happy. The depth and breadth of testing depend on the program’s key requirements. In
general, the program should be tested annually, though specific testing can be conducted in isolated
settings when and where it is necessary.
As noted above, the service desk must participate in the organization’s test exercises. These exercises
must test employee competency and the functionality of processes, facilities, and technology to verify
that the BC/DR plan is sufficient. In addition, a test simulation setting at the alternate facility is
absolutely necessary.
A typical test plan should consist of the following:
An executive summary;
Scope and scenario;
Dates, locations, participants, and timescales;
Assumptions and limitations;
Objectives;
Results of the objectives;
Key strengths;
Areas requiring improvement, lessons learned, and noteworthy items;
Risk and change control;
Test preparations;
Notification, procedural systems, and participants check;
Postrecovery check;
Test cases;
Technical infrastructure tier testing; and
An activity log, issues log, and action items log.
The Emergency Response Plan: BC/DR Integration
To work effectively, the design of the technology and delivery model must uphold the business’s
requirements and it must be seamless. These requirements are typically quantified by three metrics:
maximum allowable downtime (MADT), recovery time objective (RTO), and recovery point objective
(RPO). The RPO specifies the point in time to which data must be restored for the business to resume
normal operating activities; it sets the maximum amount of recent data the business can afford to
lose. The BC manager will provide the necessary oversight to ensure
that the BC/DR plan integrates the business and its technology, and will make adjustments where
necessary.
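
As a rough illustration of how these three metrics might be checked after a test exercise, here is a minimal Python sketch. The class, the function, and the sample figures are hypothetical and are not part of any standard or tool. The sketch also assumes that an RTO should never exceed the MADT for the same process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContinuityTargets:
    """Targets for one business process, in hours (illustrative values only)."""
    madt_hours: float  # maximum allowable downtime
    rto_hours: float   # recovery time objective
    rpo_hours: float   # recovery point objective (tolerable data-loss window)


def check_exercise(targets, recovery_hours, data_loss_hours):
    """Return a list of findings for one exercise; an empty list means all targets were met."""
    findings = []
    # Assumption: the RTO has to fit inside the MADT, otherwise the targets conflict.
    if targets.rto_hours > targets.madt_hours:
        findings.append("RTO exceeds MADT: targets are inconsistent")
    if recovery_hours > targets.rto_hours:
        findings.append("recovery took longer than the RTO")
    if recovery_hours > targets.madt_hours:
        findings.append("recovery took longer than the MADT")
    if data_loss_hours > targets.rpo_hours:
        findings.append("data loss exceeded the RPO")
    return findings


# Hypothetical targets for a service desk and the results of one exercise.
service_desk = ContinuityTargets(madt_hours=8, rto_hours=4, rpo_hours=1)
print(check_exercise(service_desk, recovery_hours=5, data_loss_hours=0.5))
# prints ['recovery took longer than the RTO']
```

Findings of this kind could feed the areas requiring improvement recorded in the test plan.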
When a disaster is declared, the service desk may provide extended and even additional services to
assist with the recovery. However, the service desk will also be busy recovering assets and resuming
operations specific to its own services.
Application Analysis and Asset, Incident, and Change Management
To ensure that the required assets are included in the BC/DR plan, a reliable method for maintaining
critical assets is crucial. All assets must be tracked and managed when changes are being made. At this
stage, a configuration management database (CMDB) or other compatible tool for tracking assets will
prove beneficial, as will integration with the change management process, which will ensure that the
BC/DR plan stays in sync as changes are implemented.
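
One simple way to picture keeping the plan in sync is to compare the critical assets recorded in the CMDB with the assets named in the BC/DR plan after each change. The sketch below assumes both can be exported as plain Python collections. The function and the asset names are hypothetical and do not reflect any particular CMDB product.

```python
def plan_gaps(cmdb_assets, plan_assets):
    """Compare the CMDB with the BC/DR plan.

    cmdb_assets maps an asset name to True if the asset is critical.
    plan_assets is the set of asset names covered by the BC/DR plan.
    """
    critical = {name for name, is_critical in cmdb_assets.items() if is_critical}
    return {
        # Critical assets that the plan does not cover yet.
        "missing_from_plan": sorted(critical - plan_assets),
        # Assets the plan still names although the CMDB no longer tracks them.
        "stale_in_plan": sorted(plan_assets - set(cmdb_assets)),
    }


# Hypothetical data, for illustration only.
cmdb = {"incident-logging-tool": True, "telephony": True, "wiki": False}
plan = {"incident-logging-tool", "fax-gateway"}
print(plan_gaps(cmdb, plan))
# prints {'missing_from_plan': ['telephony'], 'stale_in_plan': ['fax-gateway']}
```

Running a comparison like this as a step in the change management process is one possible design. A scheduled review would serve the same purpose.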
Incident management is also a key process, as a disaster situation is an incident and it must follow the
path to resolution set forth in the incident management process. The service desk is responsible for
logging and tracking all of the calls it receives that pertain to the service disruption, as well as logging,
tracking, and managing all first-pass resolution attempts. Likewise, the service desk is responsible for
maintaining asset information and making sure that information is available to assist with handling
service disruptions. Again, as other business and service units invoke their notification and escalation
plans, the service desk may be called upon to assist where necessary.
Business systems and applications are interdependent. Establishing a recovery capability that
synchronizes application interdependencies will reduce the recovery time window, thus lowering
recovery costs and mitigating any adverse effects on service.
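
To show what synchronizing interdependencies can look like in practice, the following sketch derives a recovery order from a dependency map using a topological sort (Python 3.9 or later, standard library `graphlib`). The application names and the dependency map are hypothetical. A real plan would take them from the application analysis.

```python
from graphlib import TopologicalSorter

# Each application maps to the applications it depends on (hypothetical names).
depends_on = {
    "database": set(),
    "telephony": set(),
    "authentication": {"database"},
    "incident-logging-tool": {"database", "authentication"},
    "self-service-portal": {"incident-logging-tool"},
}


def recovery_waves(depends_on):
    """Group applications into waves; everything in one wave can be recovered in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all dependencies already recovered
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(recovery_waves(depends_on), start=1):
    print(number, wave)
```

Recovering each wave in parallel, rather than one application at a time, is where the shorter recovery window would come from.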
DRP/Data Storage Integration
Finally, selecting the most appropriate data storage solution architecture and recovery/restoration
method, based on the DR solution, is essential to any BC/DR plan. The way data is backed up and stored
off site, whether your organization uses a simple tape-based method or a sophisticated mass storage
design, is crucial if adequate recovery is to be achieved. Synchronization between the primary data
center and the DR management hot-site must be engineered to facilitate the simplest and most effective
recovery possible in the shortest amount of time. Many organizations fail to realize the importance of
this step and find that their data storage and recovery solutions are insufficient.
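
A back-of-the-envelope check can make the link between the backup method and recovery concrete. The sketch below uses a deliberately simplified model: if a copy is taken every N hours and needs T hours to reach the off-site location, a disaster just before the next copy arrives can lose up to N + T hours of data. The function names and all figures are hypothetical.

```python
def worst_case_data_loss_hours(backup_interval_hours, offsite_transfer_hours=0.0):
    """Worst-case data loss under the simplified model: interval plus transfer time."""
    return backup_interval_hours + offsite_transfer_hours


def meets_rpo(backup_interval_hours, offsite_transfer_hours, rpo_hours):
    """True if the worst-case data loss stays within the recovery point objective."""
    return worst_case_data_loss_hours(backup_interval_hours, offsite_transfer_hours) <= rpo_hours


# Illustrative comparison against a hypothetical RPO of 4 hours.
print(meets_rpo(24, 12, rpo_hours=4))   # nightly tape, 12 hours to reach the vault
print(meets_rpo(0.25, 0, rpo_hours=4))  # replication to the hot site every 15 minutes
```

Under these assumed figures the nightly tape cycle falls well short of the RPO, while frequent replication meets it.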
Business continuity and disaster recovery management are complex disciplines. I trust the insights I’ve
provided here will help you develop your BC/DR program, produce a well-crafted BC/DR plan, and
avoid putting “all your eggs in one basket.”