What company is immune to the risks of an IT outage?
Many factors can lead to an IT incident. Whether the cause is a hardware failure, lost files, a virus or even a natural disaster, no company is safe from the damage this type of event can cause. In a business environment where, on average, IT infrastructure supports 85% of a company’s operations, collateral damage from an IT outage can seriously threaten an organization’s viability. This is why far-sighted organizations include an IT disaster recovery plan as part of their operational risk management.
First, we should note that an IT disaster recovery plan revolves around these two concepts:
The RTO (Recovery Time Objective): The maximum acceptable length of time the IT system can be unavailable following an outage.
The RPO (Recovery Point Objective): The maximum amount of data the organization can tolerate losing following an IT outage, usually expressed as the time elapsed since the last usable backup.
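To make the two concepts concrete, here is a minimal sketch that measures one outage against both objectives. The threshold values, the timestamps and the function name `assess_outage` are assumptions chosen for illustration only; they do not come from any particular tool or standard.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, chosen only for illustration.
RTO = timedelta(hours=4)  # maximum tolerated downtime
RPO = timedelta(hours=1)  # maximum tolerated data-loss window


def assess_outage(last_backup, outage_start, service_restored):
    """Return (downtime, data_loss_window, rto_met, rpo_met) for one outage."""
    downtime = service_restored - outage_start
    # Anything written after the last usable backup is assumed lost.
    data_loss_window = outage_start - last_backup
    return downtime, data_loss_window, downtime <= RTO, data_loss_window <= RPO


downtime, loss, rto_met, rpo_met = assess_outage(
    last_backup=datetime(2024, 1, 10, 8, 30),
    outage_start=datetime(2024, 1, 10, 9, 15),
    service_restored=datetime(2024, 1, 10, 12, 0),
)
print(f"Downtime: {downtime}, RTO met: {rto_met}")
print(f"Data-loss window: {loss}, RPO met: {rpo_met}")
```

In this example the service is down for 2 hours 45 minutes and 45 minutes of data are at risk, so both objectives are met. Moving the last backup back to 07:00 would break the RPO without changing the RTO result.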
Step 1 : Inventory IT equipment in a database
Keep this database up to date so that you retain control over all information relating to hardware commissioning and depreciation.
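As a rough illustration of such a database, the sketch below stores commissioning dates and costs in an in-memory SQLite table and derives a book value for each asset. The table layout, the asset tags, the figures and the straight-line depreciation method are all assumptions made for the example; your accounting rules may differ.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
conn.execute(
    """CREATE TABLE equipment (
           asset_tag TEXT PRIMARY KEY,
           description TEXT NOT NULL,
           commissioned TEXT NOT NULL,
           purchase_cost REAL NOT NULL,
           useful_life_years INTEGER NOT NULL
       )"""
)
# Hypothetical assets; 'commissioned' is the ISO date the asset entered service.
conn.executemany(
    "INSERT INTO equipment VALUES (?, ?, ?, ?, ?)",
    [
        ("SRV-001", "File server", "2021-01-01", 8000.0, 5),
        ("SW-014", "Core switch", "2023-01-01", 3000.0, 6),
    ],
)


def book_value(cost, commissioned, life_years, today):
    """Straight-line book value (an assumed method), floored at zero."""
    age_years = (today - date.fromisoformat(commissioned)).days / 365.25
    remaining = max(0.0, 1.0 - age_years / life_years)
    return round(cost * remaining, 2)


today = date(2024, 1, 1)
report = {
    tag: book_value(cost, commissioned, life, today)
    for tag, commissioned, cost, life in conn.execute(
        "SELECT asset_tag, commissioned, purchase_cost, useful_life_years "
        "FROM equipment"
    )
}
print(report)
```

A real inventory would live in a persistent database or an asset management tool; the point here is only that commissioning and depreciation data sit next to each asset and can be queried together.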
Step 2 : Make a list of data and applications
This step is necessary because it enables you to know where all data is stored, and which applications are installed. In this step, you should create a map showing connections between data centres, storage devices, data flows, applications and databases.
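One possible way to record such a map is as a dependency graph. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); every component name is hypothetical, and `impacted_by` is a helper written for this example.

```python
from graphlib import TopologicalSorter

# Each key depends on the components listed in its value (hypothetical names).
dependencies = {
    "crm-app": {"crm-db"},
    "crm-db": {"san-01"},
    "payroll-app": {"hr-db"},
    "hr-db": {"san-01"},
    "san-01": {"dc-main"},
    "dc-main": set(),
}


def impacted_by(component, graph):
    """Return everything that directly or indirectly depends on `component`."""
    impacted = set()
    frontier = [component]
    while frontier:
        current = frontier.pop()
        for node, deps in graph.items():
            if current in deps and node not in impacted:
                impacted.add(node)
                frontier.append(node)
    return impacted


# static_order() yields dependencies before the components that need them.
restart_order = list(TopologicalSorter(dependencies).static_order())
print(restart_order)
print(sorted(impacted_by("san-01", dependencies)))
```

With the map in this form, you can see both what a single storage failure would take down and the order in which components have to come back.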
Step 3 : Categorize assets critical to the company’s operations
This categorization focuses on intervention priorities that will minimize the loss of important data. To determine the real level of importance of applications, we recommend that you refer to this type of categorization:
Applications of critical importance:
Without this type of application and associated data, business operations come to a halt.
Applications of moderate importance:
Without this type of application and associated data, service would be suspended for a period, requiring certain employees to completely stop their tasks.
Applications of low importance:
Without this type of application and associated data, business operations would see little to no impact.
Once applications and their data have been classified, the IT director can draw up an optimal recovery plan based on these segmented priorities.
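The three-level categorization above can be sketched as a small data model that sorts applications into a recovery queue. The application names are invented for the example, and ordering by name within a tier is an arbitrary tie-breaker, not a recommendation.

```python
from dataclasses import dataclass
from enum import IntEnum


class Importance(IntEnum):
    CRITICAL = 1  # business operations halt without it
    MODERATE = 2  # service suspended for a period, some employees blocked
    LOW = 3       # little to no impact


@dataclass
class Application:
    name: str
    importance: Importance


# Hypothetical applications and classifications.
apps = [
    Application("intranet-wiki", Importance.LOW),
    Application("order-processing", Importance.CRITICAL),
    Application("email", Importance.MODERATE),
    Application("billing", Importance.CRITICAL),
]


def recovery_queue(applications):
    """Order applications by tier, then by name for a stable result."""
    return sorted(applications, key=lambda a: (a.importance, a.name))


for app in recovery_queue(apps):
    print(f"{app.importance.name:<9} {app.name}")
```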
Step 4 : Define recovery priorities
Based on the inventory and categorization of assets, we suggest that you create a document that contains the rules and procedures for restarting services. For this step, we recommend working with an expert who has the skills and know-how to conduct a full analysis of the risks to which the company is exposed.
Step 5 : Define the RPO/RTO thresholds
This step assesses the capacity of the IT infrastructure: it evaluates its strengths and takes stock of the weaknesses it shows during a turbulent period, so that realistic RPO and RTO thresholds can be set.
Step 6 : Analysis and approval of technical and financial solutions
After the RPO and RTO thresholds have been defined, technical recommendations may be proposed to improve performance against these two objectives. In that case, it is important to produce a report that sets out the different recovery plans and the level of risk associated with each.
Step 7 : Create and write out the procedure report
Once the budget, resources and objectives have been established for the recovery plan, the technical solutions can be implemented. When these improvements have been made, the procedures must be documented, then secured and archived in an off-site location that is easy to access.
Step 8 : Create a recovery plan and determine conditions for initiation
The recovery plan covers several scenarios, each carrying its own level of risk to applications and their data. Working through them allows you to identify the risks posed to the company and to propose corresponding interventions for the most likely scenarios.
Step 9 : Testing period and updates to procedures and recovery plan
Once the IT recovery plan has been created, it is important to test it regularly. These tests assess your technical and organizational capacity during a simulated outage, and their results confirm whether the RPO and RTO were met. If they were not, the IT director can improve the procedures to ensure that these thresholds are respected.
If modifications are made to the IT infrastructure, it is important to go back through the 9-step cycle.
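The comparison between test results and thresholds described in Step 9 could be automated along these lines. This is a sketch under assumed data: the per-application targets, the drill measurements (all in minutes) and the function name `find_shortfalls` are hypothetical.

```python
# Hypothetical targets per application, in minutes.
targets = {
    "order-processing": {"rto": 60, "rpo": 15},
    "email": {"rto": 240, "rpo": 60},
}

# Hypothetical measurements from one simulated outage, in minutes.
drill = {
    "order-processing": {"recovery_time": 75, "data_loss": 10},
    "email": {"recovery_time": 180, "data_loss": 45},
}


def find_shortfalls(targets, drill):
    """List (application, objective, measured, target) for every missed target."""
    shortfalls = []
    for app, target in targets.items():
        measured = drill.get(app)
        if measured is None:
            # An application with a target but no measurement was not tested.
            shortfalls.append((app, "not tested", None, None))
            continue
        if measured["recovery_time"] > target["rto"]:
            shortfalls.append((app, "RTO", measured["recovery_time"], target["rto"]))
        if measured["data_loss"] > target["rpo"]:
            shortfalls.append((app, "RPO", measured["data_loss"], target["rpo"]))
    return shortfalls


for app, objective, measured, target in find_shortfalls(targets, drill):
    print(f"{app}: {objective} missed (measured {measured}, target {target})")
```

Here only the recovery time of `order-processing` exceeds its target, which points to the procedure that needs work before the next test.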
A solution to plan for the unpredictable
You already know full well that running your organization hinges on the efficiency of your IT infrastructure and data storage. Without access to these resources, all of your company’s activities will be impacted. If you have no recovery plan in place when an unplanned outage occurs, the operational and financial health of your company could easily be jeopardized. This is why it is essential to manage the risks that threaten your IT infrastructure.