Top 10 Tips for Disaster Recovery Planning

Essentially, the key to Disaster Recovery success is having a realistic and well-understood set of objectives that are based on the business needs. This involves planning and preparation, from the business impact analysis, to understanding and quantifying risks, to classifying and prioritising applications and data for recoverability. Additionally, systems need to be prepared so that they can be recovered, and everything needs to be documented, especially the Disaster Recovery plan. Another factor for success is to make Disaster Recovery less of an exception by integrating Disaster Recovery hardware components into production. The dynamic nature of IT requires continuous review and updates of the process and the plan. It must be part of the day-to-day operations.

Finally, investing in a solid technology base is critical. An organisation must leverage newer technologies that provide higher performance at lower cost where possible, and at a minimum it must ensure that backups are functioning well.

1. Business and IT need to be linked
Creating a Disaster Recovery plan is a compromise: while people are aware of best practices, they face issues related to cost. When best practices are pitted against cost, cost needs to be the second and not the first priority. Even more important, though, is that capabilities need to be matched to expectations. Responding to a disaster is an exception, but preparing for it should not be a burden; it should be integrated with day-to-day priorities.

2. There needs to be a Disaster Recovery plan
The Disaster Recovery plan needs to represent all functional areas within IT prior to, during, and after a disaster. It needs to include applications, networks, servers and storage. Contingencies, such as "what-if" scenarios, should be considered as part of the planning process. Decisions need to be made regarding the levels of disruption that will constitute a disaster, and regarding downtime and loss tolerances.

3. Keep the Disaster Recovery plan current
Disaster Recovery planning needs to be part of the day-to-day operations of the IT environment, and even though a disaster is an exception, the plan should always be at the forefront of people's minds. Once the Disaster Recovery plan is created, it needs to be maintained and updated every time an element within the IT environment changes. The dynamic nature of the IT environment ensures that the Disaster Recovery plan will fail if the management of the plan is not part of change management.

4. Test the Disaster Recovery plan
The Disaster Recovery plan needs to be tested regularly to ensure the business can recover operations successfully and in a timely fashion. Disaster Recovery testing is a major challenge for most IT departments, but if recovery has not been tested all the way to the application level, it is very likely that problems will occur.

Even though a Disaster Recovery test is a major operational disruption, it shouldn't be treated as a pro forma exercise; it needs to include true end-to-end testing all the way to production. The focus needs to be on recovering applications rather than servers. With today's complex client-server and web-based multi-tier applications, the components reside on multiple servers, so there are interdependencies between them.

The philosophy for Disaster Recovery testing needs to change. The approach used for software quality testing should be adopted, where finding bugs is a positive thing. Finding problems in Disaster Recovery is equally positive, as long as these issues are resolved so that they do not occur during a real disaster.

5. Set realistic recovery objectives
Frequently, organisations have established objectives and prioritised servers and applications in accordance with Disaster Recovery policies. However, an objective examination of Disaster Recovery capabilities and resources often shows that these goals are not attainable. Thus it is important to set realistic Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO).

For the RTO, the questions are when the clock starts ticking and how long an outage is tolerable. For the RPO, the question is how current the data must be prior to the disaster. These are the key metrics that need to be determined and supported. It is important to examine whether the infrastructure can support the goals.

6. Disaster Recovery Responsibilities
Disaster Recovery roles and responsibilities need to be clearly defined. In the case of a disaster, who will be there to recover the data and initiate the Disaster Recovery plan? Disasters like September 11 clearly demonstrated the risk of staff not being available to perform recovery. Even in situations where tragedy is not the issue, it might simply be a case of not being able to physically reach Disaster Recovery sites. Any Disaster Recovery planning scenario must consider redundancy of roles to ensure that people are available to cover the various responsibilities in the process.

This highlights the need for comprehensive documentation and training. Large organisations with distributed IT expertise are in the best shape as far as this is concerned, because they can leverage resources in multiple locations. There is also the possibility of contracting third-party service companies to help in the planning and preparation process.

Disaster recovery requires organisation, coordination, and execution. Perhaps the best profile for a Disaster Recovery Manager is that of a military commander. Executing a Disaster Recovery plan is analogous to a military operation. It requires that each participant understands his or her job, who they have to interact with and, most importantly, the proper chain of command. Because chaos is a given in these circumstances, being able to react to new and changing circumstances quickly and confidently is key.

Some of the factors that need to be considered are how and when a disaster is declared, the time to notify and position people at Disaster Recovery sites, equipment logistics, recovery initiation, and the overall execution process for recovery.

7. Disaster Recovery Risk
The Disaster Recovery plan needs to address the right risks. Disaster recovery is essentially an insurance policy. How much and what kind of insurance is needed? What sort of risks is the organisation willing to take?

The definition of what constitutes a disaster that is covered by the plan has to be considered. Many recent disasters were floods, but various other kinds of weather activity and fires need to be considered as well. There are elements within the organisation's environment that need to be considered from the standpoint of what constitutes a disaster. A site outage, an application outage, or even a server outage could constitute a disaster for an organisation.

8. Good Backups
What happens when the backups don't work? For many companies, tape backup is still the primary medium for disaster recovery, certainly for off-site disaster recovery. As an alternative, WAN replication is becoming more common, but it might be too costly an option for some businesses. Application recoverability must be validated by recovering backups all the way to the application level.

9. Alternative Recovery Services
It needs to be clearly defined who, in the case of a disaster, will be there to recover operations and initiate the Disaster Recovery plan. While this is an uncomfortable consideration, it needs to be considered nonetheless. Disasters like September 11 clearly highlighted the risk of staff not being available to perform recovery. Some of the organisations affected had a backup copy of their data offsite, but it was stored only a short distance from the World Trade Centre site, and staff could not access it for weeks because of the exclusion zone set up around Ground Zero. Even in situations where tragedy is not the issue, it might simply be a case of not being able to physically reach Disaster Recovery sites.

10. Disaster Recovery Cost Consideration
Data protection and recovery requirements may seem too expensive, and Disaster Recovery is considered a particularly heavy expense, one that most organisations have a great deal of difficulty absorbing. This comes back to the gap between the ideal and the practical. Addressing the IT cost of Disaster Recovery is a matter of integrating Disaster Recovery into standard operations as much as possible. Ideally, the Disaster Recovery resources and equipment are not viewed as technologies that are sitting idle. Ultimately, this comes down to making an informed decision between spending money and accepting risk. Newer technologies are emerging that make this more cost-effective. Regardless, Disaster Recovery needs to be treated as an investment. It is an insurance policy.
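
As an illustration of the advice on setting realistic recovery objectives, the following is a minimal Python sketch of checking whether stated RPO and RTO targets are supported by what the infrastructure actually delivers. The record layout, the field names and the sample figures are all hypothetical and chosen for illustration only. It assumes that the worst-case data loss is roughly one backup interval and that restore time has been measured in a real recovery test.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjective:
    """Hypothetical record of one application's targets and measured capabilities."""
    application: str
    rpo: timedelta                  # maximum tolerable data loss
    rto: timedelta                  # maximum tolerable downtime
    backup_interval: timedelta      # time between successive backups or replication points
    tested_restore_time: timedelta  # restore duration measured in the last recovery test


def unmet_objectives(obj: RecoveryObjective) -> list[str]:
    """Return the names of the objectives the current setup cannot support."""
    problems = []
    # Worst case, a disaster strikes just before the next backup, so the
    # data lost is roughly one full backup interval.
    if obj.backup_interval > obj.rpo:
        problems.append("RPO")
    # The RTO is only credible if a tested restore fits inside it.
    if obj.tested_restore_time > obj.rto:
        problems.append("RTO")
    return problems


# Sample figures are invented for illustration.
payroll = RecoveryObjective(
    application="payroll",
    rpo=timedelta(hours=4),
    rto=timedelta(hours=8),
    backup_interval=timedelta(hours=24),
    tested_restore_time=timedelta(hours=6),
)
print(unmet_objectives(payroll))  # ['RPO']
```

In this invented case the nightly backup cannot support a four-hour RPO, so either the backup frequency or the objective has to change. That is the kind of gap between stated goals and actual capability that an objective examination is meant to expose.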