Disaster Recovery Planning and Network Services Continuity

Disaster recovery planning (DRP) starts with a discussion that involves key management employees. It is important to get their support for any disaster recovery initiative. Explain what disaster recovery is and why it is required for business continuity, cost reduction, generating revenue and improving productivity. Disaster scenarios such as fire, flood, earthquake, cold weather and employee sabotage should be discussed. Alternate vendors should also be discussed as a potential business continuity issue.

Risk Assessment
The Risk Assessment is a "what if" analysis that describes the amount of risk associated with the current state of the network. The following are some things to consider before formulating any disaster recovery strategy.
• Average cost per minute that your network is unavailable.
• Cost of replacing servers, applications, circuits and devices.
• What disaster recovery plan exists, if any, and how extensive it is.
• Whether alternate vendors have been identified should primary vendors have their own disaster recovery problems.

Disaster Recovery Strategy
The disaster recovery strategy describes operational changes, design changes and failover strategies for business continuity. An action plan document is created that describes all those strategies and a detailed escalation procedure should the network become unavailable. It should document employees, responsibilities, time frames, event sequence, vendors and processes.

The following describes recommended operational changes:
1. Network Documentation
Automate the network documentation process. It is difficult to restore a network without having current documentation of the network as it was before it became unavailable. Running a network assessment will collect some information; however, you need application and device configurations as well. Find a tool that will automate this process!
Document these items:
• Current Topology
• Infrastructure
• Security Policies
• Management Strategy
• Application Configurations, Versions and Patches
• Device Configurations, IOS Versions and Firmware
2. Regular Backups rotated off-site and tested for data integrity

The following describes recommended design changes:
Review and modify design, infrastructure, configuration, security and management for improved network resiliency and availability. It is my contention that running a network assessment is an effective strategy for determining what changes should be made to your network. The argument could be made that all assessment groups have some effect on network availability and resiliency. The availability assessment will collect most of the key information; however, the security assessment must be considered since problems with company security will expose your network to attacks. When your network is being attacked it isn't available!

Management strategy assessments are key as well, since the absence of effective management policies and applications will create a tenuous situation. For instance, without any change management policies you will have employees changing application and device configurations (assuming they have security authorization) without prior approval and at any time of the day. The configuration change doesn't work as expected and it is 10 am while employees are starting their day. Guess what, your day just got longer. Proactive fault and performance monitoring strategies will indicate when a device or server is not operational or near capacity. Those situations will obviously affect network availability. The performance assessment will describe how well the network is performing, whether there are any capacity issues and what offices are affected. The infrastructure assessment will focus on issues such as media mismatches, switch port capacity, IOS version problems, router memory shortages, application software versions and protocols. Facilities are considered in an availability assessment, which focuses on rack space, temperature controls, power availability and raised floors.

Select Failover Strategies
1. On-line data synchronization between the production Data Center and a remote Data Center facility. The cutover or convergence time should be transparent to employees and all current data would be available. This requires the cost of a remote facility with routers, switches and matching servers and applications to synchronize the offices. Cisco distributed director technology can be utilized to configure both Data Centers for concurrent operation if that is required.
2. Configure the distributed director to redirect sessions to the alternate Data Center once a certain percentage of TCP sessions are running at the primary Data Center. It is still a good idea to consider standby sites as described below since both on-line Data Centers could be unavailable.
3. Configure a 48 hour standby site for the company, which is a remote facility that has all the equipment necessary for restoring a specified service level within 48 hours. This is a temporary strategy for continuing network service for a short time frame before the problems are fixed or service is cut over to a 10 day site. This can be provisioned by company employees or contracted to a third party DRP vendor.
4. Configure a 10 day standby site for the company, which is a remote facility that has all the equipment necessary for restoring all specified services within 10 days. This would be utilized in a situation where restoration of Data Center services would require months. This can be provisioned by company employees or contracted to a third party DRP vendor.

Contingency Testing
Test your disaster recovery (business continuity) strategy utilizing the action plan document from the strategy phase. There should be a meeting with specific employees and vendors to discuss responsibilities, time frames, test event sequence and processes. The company strategy and action plan should be changed as problems are identified in the testing phase or as company requirements change. Plan on regular testing of the disaster recovery plan 3 - 4 times per year.

Recommendations
The results from contingency testing will be utilized to make sound recommendations for improving the disaster recovery strategy and the testing process. The complexity of your organization will affect how difficult it is to build a workable disaster recovery plan. The recommendations will streamline your DRP and ensure it works when it is required. The on demand circuit is homed to the remote DR facility router where it converges with the company network and employees can utilize the mainframe applications. The DR mainframe should be synchronized with the company mainframe for transactions during that period before service is restored.
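
As a rough illustration of how the Risk Assessment's cost-per-minute figure can inform the choice between standby sites, the sketch below estimates the worst-case downtime cost for the 48 hour and 10 day restoration windows. The cost-per-minute value and the function names are hypothetical and only stand in for numbers that would come from your own assessment.

```python
def downtime_cost(cost_per_minute: float, outage_hours: float) -> float:
    """Return the estimated cost of an outage lasting outage_hours."""
    if cost_per_minute < 0 or outage_hours < 0:
        raise ValueError("inputs must be non-negative")
    return cost_per_minute * outage_hours * 60


# Hypothetical figure: replace with the value from your own risk assessment.
COST_PER_MINUTE = 500.0

# Worst-case restoration windows from the two standby site options.
RECOVERY_WINDOWS_HOURS = {
    "48 hour standby site": 48,
    "10 day standby site": 10 * 24,
}


def worst_case_costs(cost_per_minute: float) -> dict:
    """Map each standby option to its worst-case downtime cost."""
    return {
        site: downtime_cost(cost_per_minute, hours)
        for site, hours in RECOVERY_WINDOWS_HOURS.items()
    }


if __name__ == "__main__":
    for site, cost in worst_case_costs(COST_PER_MINUTE).items():
        print(f"{site}: ${cost:,.0f}")
```

Comparing these worst-case figures against the cost of provisioning each site gives a simple starting point for discussing failover strategies with management.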