Essentially, the key to Disaster Recovery success is having a realistic and well-understood set of objectives based on business needs. This involves planning and preparation, from the business impact analysis, to understanding and quantifying risks, to classifying and prioritising applications and data for recoverability. Additionally, systems need to be prepared so that they can be recovered, and everything needs to be documented, especially the Disaster Recovery plan. Another factor for success is to make Disaster Recovery less of an exception by integrating Disaster Recovery hardware components into production. The dynamic nature of IT requires continuous review and updates of the process and the plan. It must be part of day-to-day operations.

Finally, investing in a solid technology base is critical. An organisation must leverage newer technologies that provide higher performance at lower cost where possible, and at a minimum it must ensure that backups are functioning well.

1. Business and IT need to be linked
Creating a Disaster Recovery plan is a compromise: while people are aware of best practices, they face issues related to cost. When best practices are pitted against cost, cost needs to be the second priority, not the first. Even more important, though, is that capabilities need to be matched to expectations. Responding to a disaster is an exception, but preparing for it should not be a burden; it should be integrated with day-to-day priorities.

2. There needs to be a Disaster Recovery plan
The Disaster Recovery plan needs to represent all functional areas within IT prior to, during, and after a disaster. It needs to include applications, networks, servers and storage. Contingencies, such as "what-if" scenarios, should be considered as part of the planning process. Decisions need to be made about the level of disruption that constitutes a disaster, and about downtime and data loss tolerances.

3. Keep the Disaster Recovery Plan current
Disaster Recovery planning needs to be part of the day-to-day operations of the IT environment, and even though a disaster is an exception, planning for it should always be at the forefront of people's minds. Once the Disaster Recovery plan is created, it needs to be maintained and updated every time an element within the IT environment changes. The dynamic nature of the IT environment ensures that the Disaster Recovery plan will fail if the management of the plan is not part of change management.

4. Test the Disaster Recovery Plan
The Disaster Recovery plan needs to be tested regularly to ensure the business can recover operations successfully and in a timely fashion. Disaster Recovery testing is a major challenge for most IT departments, but if recovery has not been tested all the way to the application level, it is very likely that problems will occur.

Even though a Disaster Recovery test is a major operational disruption, it shouldn't be treated as a pro forma exercise; it needs to include true end-to-end testing all the way to production. The focus needs to be on recovering applications rather than servers. With today's complex client-server and web-based multi-tier applications, the components reside on multiple servers, so there are interdependencies between them.

The philosophy for Disaster Recovery testing needs to change. The approach used for software quality testing should be adopted, where finding bugs is a positive thing. Finding problems in Disaster Recovery is equally positive, as long as these issues are resolved so that they do not recur during a real disaster.

5. Set realistic recovery objectives
Frequently, organisations have established objectives and prioritised servers and applications in accordance with Disaster Recovery policies. However, an objective examination of Disaster Recovery capabilities and resources often shows that these goals are not attainable. It is therefore important to set realistic Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO).

For the RTO, the questions are when the clock starts ticking and how long an outage can be tolerated. For the RPO, the question is how current the data must be prior to the disaster. These are the key metrics that need to be determined and supported. It is important to examine whether the infrastructure can support the goals.

6. Disaster Recovery Responsibilities
Disaster Recovery roles and responsibilities need to be clearly defined. In the case of a disaster, who will be there to recover the data and initiate the Disaster Recovery plan? Disasters like September 11 clearly demonstrated the risk of staff not being available to perform recovery. Even in situations where tragedy is not the issue, it might simply be a case of not being able to physically reach Disaster Recovery sites. Any Disaster Recovery planning scenario must consider redundancy of roles to ensure that people are available to cover the various responsibilities in the process.

This highlights the need for comprehensive documentation and training. Large organisations with distributed IT expertise are in the best shape in this respect, because they can leverage resources in multiple locations. There is also the possibility of contracting third-party service companies to help in the planning and preparation process.

Disaster recovery requires organisation, coordination, and execution. Perhaps the best profile for a Disaster Recovery Manager is that of a military commander. Executing a Disaster Recovery plan is analogous to a military operation. It requires that each participant understands his or her job, who they have to interact with and, most importantly, the proper chain of command. Because chaos is a given in these circumstances, being able to react to new and changing circumstances quickly and confidently is key.

Some of the factors that need to be considered are how and when a disaster is declared, the time to notify and position people at Disaster Recovery sites, equipment logistics, recovery initiation, and the overall execution process for recovery.

7. Disaster Recovery Risk
The Disaster Recovery plan needs to address the right risks. Disaster recovery is essentially an insurance policy. How much and what kind of insurance is needed? What sort of risks is the organisation willing to take? The definition of what constitutes a disaster covered by the plan has to be considered. Many recent disasters were floods, but other kinds of weather events, as well as fires, need to be considered too. There are also elements within the organisation's own environment that need to be considered from the standpoint of what constitutes a disaster. A site outage, an application outage, or even a server outage could constitute a disaster for an organisation.

8. Good Backups
What happens when the backups don't work? For many companies, tape backup is still the primary medium for disaster recovery, certainly for off-site disaster recovery. Replication over a WAN is a growing alternative, but it might be too costly an option for some businesses. Application recoverability must be validated by recovering backups all the way to the application level.

9. Alternative Recovery Services
It needs to be clearly defined who, in the case of a disaster, will be there to recover operations and initiate the Disaster Recovery plan. While this is an uncomfortable consideration, it needs to be considered nonetheless. Disasters like September 11 clearly highlighted the risk of staff not being available to perform recovery. Some of the organisations affected had a backup copy of their data offsite; however, it was only a short distance away from the World Trade Centre site, and staff couldn't access it for weeks because of the exclusion zone set up around Ground Zero.

10. Disaster Recovery Cost Consideration
Data protection and recovery requirements may seem too expensive, and Disaster Recovery is considered a particularly heavy expense, one that most organisations have a great deal of difficulty absorbing. This returns to the gap between the ideal and the practical. Addressing the IT cost of Disaster Recovery is a matter of integrating Disaster Recovery into standard operations as much as possible. Ideally, Disaster Recovery resources and equipment are not viewed as technologies that sit idle. Ultimately, this comes down to making an informed decision between spending money and accepting risk. Newer technologies are emerging that make this more cost effective. Regardless, Disaster Recovery needs to be treated as an investment. It is an insurance policy.
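Section 5 asks whether the infrastructure can actually support the recovery objectives. One way to make that examination concrete is to compare the agreed targets with figures measured in the last recovery test. The Python sketch below is only an illustration under stated assumptions: it takes the RPO as the maximum tolerable data loss (expressed as time) and the RTO as the maximum tolerable downtime, it treats the backup interval as the worst-case data loss, and it sums the declaration, staffing, and restore times mentioned in section 6 to estimate downtime. The class, field, and function names are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Targets agreed with the business for one application."""
    rpo: timedelta  # maximum tolerable data loss, measured as time
    rto: timedelta  # maximum tolerable downtime


@dataclass(frozen=True)
class MeasuredCapability:
    """Figures from the last end-to-end recovery test (hypothetical fields)."""
    backup_interval: timedelta   # time between successful backups or replica syncs
    declaration_time: timedelta  # time to declare the disaster and notify staff
    staffing_time: timedelta     # time to position people at the recovery site
    restore_time: timedelta      # time to restore to the application level

    def worst_case_data_loss(self) -> timedelta:
        # Assumption: a disaster just before the next backup loses one full interval.
        return self.backup_interval

    def estimated_downtime(self) -> timedelta:
        # Assumption: the three phases happen one after another.
        return self.declaration_time + self.staffing_time + self.restore_time


def find_gaps(name: str, objectives: RecoveryObjectives,
              capability: MeasuredCapability) -> list[str]:
    """Return readable gaps; an empty list means the targets look attainable."""
    gaps = []
    loss = capability.worst_case_data_loss()
    if loss > objectives.rpo:
        gaps.append(f"{name}: worst-case data loss {loss} exceeds RPO {objectives.rpo}")
    downtime = capability.estimated_downtime()
    if downtime > objectives.rto:
        gaps.append(f"{name}: estimated downtime {downtime} exceeds RTO {objectives.rto}")
    return gaps


if __name__ == "__main__":
    # Illustrative numbers only: nightly tape backup, ten hours to recover.
    nightly_tape = MeasuredCapability(
        backup_interval=timedelta(hours=24),
        declaration_time=timedelta(hours=1),
        staffing_time=timedelta(hours=3),
        restore_time=timedelta(hours=6),
    )
    targets = RecoveryObjectives(rpo=timedelta(hours=4), rto=timedelta(hours=8))
    for gap in find_gaps("order-entry", targets, nightly_tape):
        print(gap)
```

With these illustrative numbers both objectives are missed, which is the situation section 5 describes: either the targets are relaxed, or the organisation invests in shorter backup intervals and faster recovery.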