Booklet: Business Continuity Planning
Section: Risk Monitoring and Testing
Subsection: Principles of the Business Continuity Testing Program

Risk monitoring and testing is necessary to ensure that the business continuity planning process remains viable through the incorporation of the BIA and risk assessment into an enterprise-wide BCP and testing program. The testing program has become a key focus of banking supervisors, in light of recent, catastrophic events, and has received heightened attention within the financial services industry because such a program can be used to validate the viability of the BCP. As such, there are various principles that should be followed by financial institutions when developing a testing program.

The following principles should be addressed in the business continuity testing program of all institutions, regardless of whether they rely on service providers or process their work internally:

  • Roles and responsibilities for implementation and evaluation of the testing program should be specifically defined;
  • The BIA and risk assessment should serve as the foundation of the testing program, as well as the BCP that it validates;
  • The breadth and depth of testing activities should be commensurate with the importance of the business process to the institution, as well as to critical financial markets;
  • Enterprise-wide testing should be conducted at least annually, or more frequently, depending on changes in the operating environment;
  • Testing should be viewed as a continuously evolving cycle, and institutions should work towards a more comprehensive and integrated program that incorporates the testing of various interdependencies;
  • Institutions should demonstrate, through testing, that their business continuity arrangements have the ability to sustain the business until permanent operations are reestablished;
  • The testing program should be reviewed by an independent party; and
  • Test results should be compared against the BCP to identify any gaps between the testing program and business continuity guidelines, with notable revisions incorporated into the testing program or the BCP, as deemed necessary.

A key challenge for management is to develop a testing program that provides a high degree of assurance for the continuity of critical business processes, including supporting infrastructure, systems, and applications, without compromising production environments. Therefore, a robust testing program should incorporate roles and responsibilities; a testing policy that includes testing strategies and test planning; the execution, evaluation, independent assessment, and reporting of test results; and updates to the BCP and testing program.

Roles and Responsibilities

The board and senior management are responsible for establishing and reviewing an enterprise-wide testing program. Once the program is established, they direct the following groups to develop, implement, and evaluate the institution’s business continuity testing program:

  • Business line management, who has ownership and accountability for the testing of business operations;
  • IT management, who has ownership and accountability for testing recovery of the institution’s information technology systems, infrastructure, and telecommunications;
  • Crisis management, who has ownership and accountability for testing the institution’s event management processes;
  • Facilities management, who has ownership and accountability for testing the operational readiness of the institution’s physical plant and equipment, environmental controls, and physical security; and
  • The internal auditor (or other qualified independent party), who has the responsibility for evaluating the overall quality of the testing program and the test results.

Testing Policy

An enterprise-wide business continuity testing policy should be established by the board and senior management and should set expectations for business lines and support functions to follow in implementing testing strategies and test plans. The policy should establish a testing cycle that increases in scope and complexity over time. As such, the testing policy should continuously improve by adapting to changes in business conditions and supporting expanded integration testing.  

The testing policy should incorporate the use of a BIA and risk assessment for developing enterprise-wide and business line continuity testing strategies. The policy should identify key roles and responsibilities and establish minimum requirements for the institution’s business continuity testing, including baseline requirements for frequency, scope, and reporting test results.

Testing policies will vary depending on the size and risk profile of the institution. While all institutions should develop testing policies on an enterprise-wide basis and involve essential employees in the testing process, some considerations differ depending on whether the institution relies on service providers (serviced institutions) or whether it processes its work internally (in-house).

A serviced institution’s testing policy should include guidelines addressing tests between the financial institution and its service provider. Serviced institutions should test communication and connectivity procedures to be followed when either the financial institution’s or service provider’s systems, at their primary or alternate sites, are inoperable. Serviced institutions should participate in tests with their critical service providers to ensure that institution employees fully understand the recovery process.

The testing policy for in-house institutions should address the active involvement of personnel when systems and data files are tested. In-house institutions often send their back-up media to a recovery site to be processed by the back-up service provider’s employees. This is not a sufficient test of an institution's BCP and is considered ineffective because financial institution employees are not directly involved in the testing process. As a result, the institution cannot verify that tests were conducted properly and institution personnel may not be familiar with recovery procedures and related logistics in the event of a true disaster.

Once an institution develops the testing policy, this policy is typically implemented through the development of testing strategies that include the testing scope and objectives and test planning using various scenarios and testing methods.

Testing Strategies

The testing policy should include enterprise-wide testing strategies that establish expectations for individual business lines across the testing life cycle of planning, execution, measurement, reporting, and test process improvement. The testing strategy should include the following:

  • Expectations for business lines and support functions to demonstrate the achievement of business continuity test objectives consistent with the BIA and risk assessment;
  • A description of the depth and breadth of testing to be accomplished;
  • The involvement of staff, technology, and facilities;
  • Expectations for testing internal and external interdependencies; and
  • An evaluation of the reasonableness of assumptions used in developing the testing strategy.

Testing strategies should include the testing scope and objectives, which clearly define what functions, systems, or processes are going to be tested and what will constitute a successful test. The objective of a testing program is to ensure that the business continuity planning process is accurate, relevant, and viable under adverse conditions. Therefore, the business continuity planning process should be tested at least annually, with more frequent testing required when significant changes have occurred in business operations. Testing should include applications and business functions that were identified during the BIA. The BIA determines the recovery point objectives (RPOs) and recovery time objectives (RTOs), which then help determine the appropriate recovery strategy. Validation of the RPOs and RTOs is important to ensure that they are attainable.
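As an illustration of validating RPOs and RTOs against test results, the following minimal Python sketch compares the values observed during a test with the objectives set by the BIA. The class names, function name, and sample values are hypothetical and are not part of this guidance; an institution's own objectives and measurements would replace them.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass
class RecoveryObjective:
    """BIA-derived objectives for one business function (hypothetical structure)."""
    function: str
    rto: timedelta  # recovery time objective
    rpo: timedelta  # recovery point objective


@dataclass
class ObservedRecovery:
    """Values observed during a test of the same function (hypothetical structure)."""
    function: str
    recovery_time: timedelta
    data_loss_window: timedelta


def missed_objectives(objective: RecoveryObjective, observed: ObservedRecovery) -> List[str]:
    """Return the names of the objectives the test failed to meet (empty if all were met)."""
    if objective.function != observed.function:
        raise ValueError("objective and observation refer to different functions")
    missed = []
    if observed.recovery_time > objective.rto:
        missed.append("RTO")
    if observed.data_loss_window > objective.rpo:
        missed.append("RPO")
    return missed


# Illustrative values only; real objectives come from the institution's BIA.
wire_objective = RecoveryObjective(
    "wire transfer", rto=timedelta(hours=4), rpo=timedelta(minutes=15)
)
wire_test = ObservedRecovery(
    "wire transfer", recovery_time=timedelta(hours=5), data_loss_window=timedelta(minutes=10)
)
print(missed_objectives(wire_objective, wire_test))
```

In this example the function was restored in five hours against a four-hour RTO, so the RTO is reported as missed while the RPO is met.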

Testing objectives should start simply, and gradually increase in complexity and scope.  The scope of individual tests can be continually expanded to eventually encompass enterprise-wide testing and testing with vendors and key market participants.  Achieving the following objectives provides progressive levels of assurance and confidence in the plan.  At a minimum, the testing scope and objectives should:

  • Not jeopardize normal business operations;
  • Gradually increase the complexity, level of participation, functions, and physical locations involved; 
  • Demonstrate a variety of management and response proficiencies under simulated crisis conditions, progressively involving more resources and participants;
  • Uncover inadequacies so that testing procedures can be revised; 
  • Consider deviating from the test script to interject unplanned events, such as the loss of key individuals or services; and
  • Involve a sufficient volume of all types of transactions to ensure adequate capacity and functionality of the recovery facility.

Test Planning

The testing policy should also include test planning, which is based on the predefined testing scope and objectives established as part of management’s testing strategies. Test planning includes test plan review procedures and the development of various testing scenarios and methods. Management should evaluate the risks and merits of various types of testing scenarios and develop test plans based on identified recovery needs. Test plans should identify quantifiable measurements of each test objective and should be reviewed prior to the test to ensure they can be implemented as designed. Test scenarios should include a variety of threats, event types, and crisis management situations and should vary from isolated system failures to wide-scale disruptions. Scenarios should also promote testing alternate facilities with the primary and alternate facilities of key counterparties and third-party service providers. Comprehensive test scenarios focus attention on dependencies, both internal and external, between critical business functions, information systems, and networks. As such, test plans should include scenarios addressing local and wide-scale disruptions, as appropriate. Business line management should develop scenarios to effectively test internal and external interdependencies, with the assistance of IT staff members who are knowledgeable regarding application data flows and other areas of vulnerability. Institutions should periodically reassess and update their test scenarios to reflect changes in the institution’s business and operating environment.

Test plans should clearly communicate the predefined test scope and objectives and provide participants with relevant information, including:

  • A master test schedule that encompasses all test objectives;
  • Specific description of test objectives and methods;
  • Roles and responsibilities for all test participants, including support staff;
  • Designation of test participants;
  • Test decision makers and succession plans;
  • Test locations; and
  • Test escalation conditions and test contact information.
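
The elements listed above lend themselves to a simple completeness check before a test plan is circulated. The sketch below is one possible approach, assuming a draft plan is held as a dictionary; the key names and the function name are hypothetical labels chosen for this illustration.

```python
# Hypothetical keys, one per test plan element listed above.
REQUIRED_PLAN_ELEMENTS = (
    "master_test_schedule",
    "objectives_and_methods",
    "roles_and_responsibilities",
    "participants",
    "decision_makers_and_succession",
    "locations",
    "escalation_conditions_and_contacts",
)


def missing_plan_elements(test_plan: dict) -> list:
    """Return the required elements that are absent or empty in a draft test plan."""
    return [key for key in REQUIRED_PLAN_ELEMENTS if not test_plan.get(key)]


# Illustrative draft: participants are not yet designated and no location is recorded.
draft_plan = {
    "master_test_schedule": "annual schedule, all objectives",
    "objectives_and_methods": "recover core processing at alternate site",
    "roles_and_responsibilities": "see staffing matrix",
    "participants": [],
    "decision_makers_and_succession": "BCP coordinator, then deputy",
    "escalation_conditions_and_contacts": "call tree, version 3",
}
print(missing_plan_elements(draft_plan))
```

A check of this kind only confirms that each element is present; whether the content is adequate remains a matter for the test plan review.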

Test Plan Review

Management should prepare and review a script for each test prior to testing to identify weaknesses that could lead to unsatisfactory or invalid tests. As part of the review process, the testing plan should be revised to account for any changes to key personnel, policies, procedures, facilities, equipment, outsourcing relationships, vendors, or other components that affect a critical business function. In addition, as a preliminary step to the testing process, management should perform a thorough review of the BCP (checklist review). A checklist review involves distributing copies of the BCP to the managers of each critical business unit and requesting that they review portions of the plan applicable to their department to ensure that the procedures are comprehensive and complete.

Testing Methods

Testing methods can vary from simple to complex depending on the preparation and resources required. Each bears its own characteristics, objectives, and benefits. The type or combination of testing methods employed by a financial institution should be determined by, among other things, the institution’s age and experience with business continuity planning, size, complexity, and the nature of its business.

Testing methods include both business recovery and disaster recovery exercises. Business recovery exercises primarily focus on testing business line operations, while disaster recovery exercises focus on testing the continuity of technology components, including systems, networks, applications, and data. To test split processing configurations, in which two or more sites support part of a business line’s workload, tests should include the transfer of work among processing sites to demonstrate that alternate sites can effectively support customer-specific requirements and work volumes and site-specific business processes. A comprehensive test should involve processing a full day’s work at peak volumes to ensure that equipment capacity is available and that RTOs and RPOs can be achieved.       
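
Before a split processing test is run, the underlying capacity question can be estimated on paper. The following sketch assumes that each site has a known peak daily volume and a known processing capacity, and asks whether the surviving sites have enough spare capacity to absorb the work of any single lost site. The function name, site names, and figures are hypothetical, and such arithmetic supplements an actual transfer-of-work test rather than replacing it.

```python
def sites_without_failover_capacity(peak_volume: dict, capacity: dict) -> list:
    """Return the sites whose peak workload the remaining sites could not absorb.

    Each site is assumed lost in turn; the spare capacity of every surviving site
    (capacity minus its own peak volume) is summed and compared with the lost
    site's peak volume.
    """
    uncovered = []
    for failed_site in peak_volume:
        spare = sum(
            capacity[site] - peak_volume[site]
            for site in peak_volume
            if site != failed_site
        )
        if spare < peak_volume[failed_site]:
            uncovered.append(failed_site)
    return uncovered


# Illustrative figures in transactions per day.
peak_volume = {"site_a": 60_000, "site_b": 40_000}
capacity = {"site_a": 90_000, "site_b": 110_000}
print(sites_without_failover_capacity(peak_volume, capacity))
```

Here site_b can absorb the loss of site_a (70,000 spare against 60,000 needed), but site_a has only 30,000 spare against the 40,000 it would need if site_b were lost, so site_b is reported.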

More rigorous testing methods and greater frequency of testing provide greater confidence in the continuity of business functions. While comprehensive tests do require greater investments of time, resources, and coordination to implement, detailed testing will more accurately depict a true disaster and will assist management in assessing the actual responsiveness of the individuals involved in the recovery process. Furthermore, comprehensive testing of all critical functions and applications will allow management to identify potential problems; therefore, management should use one of the more thorough testing methods discussed in this section to ensure the viability of the BCP before a disaster occurs. Examples of testing methods in order of increasing complexity include:

Tabletop Exercise/Structured Walk-Through Test

A tabletop exercise/structured walk-through test is considered a preliminary step in the overall testing process and may be used as an effective training tool; however, it is not a preferred testing method. Its primary objective is to ensure that critical personnel from all areas are familiar with the BCP and that the plan accurately reflects the financial institution’s ability to recover from a disaster. It is characterized by:

  • Attendance of business unit management representatives and employees who play a critical role in the BCP process;
  • Discussion about each person’s responsibilities as defined by the BCP;
  • Individual and team training, which includes a walk-through of the step-by-step procedures outlined in the BCP; and
  • Clarification and highlighting of critical plan elements, as well as problems noted during testing.

Walk-Through Drill/Simulation Test

A walk-through drill/simulation test is somewhat more involved than a tabletop exercise/structured walk-through test because the participants choose a specific event scenario and apply the BCP to it. However, this test also represents a preliminary step in the overall testing process that may be used for training employees, but it is not a preferred testing methodology. It includes:

  • Attendance by all operational and support personnel who are responsible for implementing the BCP procedures;
  • Practice and validation of specific functional response capabilities;
  • Focus on the demonstration of knowledge and skills, as well as team interaction and decision-making capabilities;
  • Role playing with simulated response at alternate locations/facilities to act out critical steps, recognize difficulties, and resolve problems in a non-threatening environment;
  • Mobilization of all or some of the crisis management/response team to practice proper coordination without performing actual recovery processing; and
  • Varying degrees of actual, as opposed to simulated, notification and resource mobilization to reinforce the content and logic of the plan.

Functional Drill/Parallel Test

Functional drill/parallel testing is the first type of test that involves the actual mobilization of personnel to other sites in an attempt to establish communications and perform actual recovery processing as set forth in the BCP. The goal is to determine whether critical systems can be recovered at the alternate processing site and if employees can actually deploy the procedures defined in the BCP. It includes:

  • A full test of the BCP, which involves all employees;
  • Demonstration of emergency management capabilities of several groups practicing a series of interactive functions, such as direction, control, assessment, operations, and planning;
  • Testing medical response and warning procedures;
  • Actual or simulated response to alternate locations or facilities using actual communications capabilities;
  • Mobilization of personnel and resources at varied geographical sites, including evacuation drills in which employees test the evacuation route and procedures for personnel accountability; and
  • Varying degrees of actual, as opposed to simulated, notification and resource mobilization in which parallel processing is performed and transactions are compared to production results.

Full-Interruption/Full-Scale Test

A full-interruption/full-scale test is the most comprehensive type of test. In a full-scale test, a real-life emergency is simulated as closely as possible. Therefore, comprehensive planning should be a prerequisite to this type of test to ensure that business operations are not negatively affected. The institution implements all or portions of its BCP by processing data and transactions using back-up media at the recovery site. It involves:

  • Enterprise-wide participation and interaction of internal and external management response teams with full involvement of external organizations;
  • Validation of crisis response functions;
  • Demonstration of knowledge and skills as well as management response and decision-making capability;
  • On-the-scene execution of coordination and decision-making roles;
  • Actual, as opposed to simulated, notifications, mobilization of resources, and communication of decisions;
  • Activities conducted at actual response locations or facilities;
  • Actual processing of data using back-up media; and
  • Exercises generally extending over a longer period of time to allow issues to fully evolve as they would in a crisis and to allow realistic role-playing of all the involved groups.

Execution, Evaluation, Independent Assessment, and Reporting of Test Results

Once testing strategies and test plans are developed, the following procedures should be implemented as part of the overall testing policy:

Execution and Documentation

Testing requires centralized coordination by the BCP coordinator or team. The team or coordinator is responsible for overseeing the accomplishment of targeted objectives and ensuring the test results are appropriately documented.

Generally, it is advisable for as many as possible of the personnel involved in implementing the BCP to also participate in the test. This participation increases awareness and ownership in achieving successful BCP implementation. Management should also rotate personnel periodically during the testing process to reduce dependence on specific individuals who may leave the organization or may not be available during a disaster.

Once the tests are executed, test results should be properly documented and include the following, at a minimum:

  • Test dates and locations;
  • An executive summary detailing a comparison between the test objectives and test results;
  • Material deviations from the test plans, including whether intended participation levels were achieved;
  • Problems identified during testing; and
  • An evaluation by a qualified independent party.

Evaluation

Once tests have been executed and documented, test results should be evaluated to ensure that test objectives are achieved and that business continuity successes, failures, and lessons learned are thoroughly analyzed. Business lines and support function management should review test results to validate whether test procedures were effectively completed and adequately documented. Finally, test results, including quantitative metrics, such as achieving RTOs and RPOs, should be used to determine the effectiveness of the institution’s BCP. If test objectives were not achieved, business line and support function management should identify necessary corrective measures and determine whether a follow-up test should be conducted prior to the next regularly scheduled exercise. Exceptions to this process should be documented and approved by senior management.

Institutions are expected to evaluate testing across business lines and support functions in order to validate the BCP. An analysis of tests completed over a period of time should be conducted to determine whether the institution is capable of achieving its overall business continuity objectives.

Independent Assessment

Key tests should be observed, verified, and evaluated by independent parties. This provides assurance to the board and other stakeholders of the validity of the testing process and the accuracy of test results. This independent assessment is typically conducted by internal audit, although it can be performed by other qualified third parties. An effective practice is to include a review by both business line and IT auditors. This review should include an assessment of the testing scope and objectives, written test plans, testing methods and schedules, and communication of test results and recommendations to the board. The analysis of underlying assumptions and the results of modeling and simulation techniques should also be independently assessed to assure the board and other stakeholders of their reasonableness and validity. In addition, the board should receive and review audit reports addressing the effectiveness of the institution’s process for identifying and correcting areas of weakness, and audit recommendations should be monitored to ensure that they are implemented in a timely manner.

Reporting Test Results

Test results, gaps between the BCP and the actual test results, and the resolution of any problems should be reported to several audiences, including the board and senior management, business line management, risk management, IT management, and other stakeholders. A management assessment of the institution’s ability to meet its continuity objectives and testing program requirements should be provided to the board at least annually. The assessment should contain sufficient information so that the board can determine if the BCP meets the objectives established by the BIA. In addition, business lines and support functions should identify the validity of the test data processed, any untested aspects of production operations, and the need for additional tests. The board should receive reports more frequently when test results for critical business lines indicate an inability to meet continuity objectives.

Updating Business Continuity Plan and Test Program

After tests are executed and the results are evaluated by management, independently assessed, and reported to the board, it may be necessary to update the BCP and test program. As part of this process, the BCP and test program should be reviewed by senior management, the planning team or coordinator, team members, and the board at least annually. The team or coordinator should contact business unit managers throughout the financial institution at regular intervals to assess the nature and scope of any changes to the institution’s business, structure, systems, software, hardware, personnel, or facilities. If significant changes have occurred in the business environment, or if audit findings warrant changes to the BCP or test program, the business continuity policy guidelines and program requirements should be updated accordingly. In addition, an independent assessment of the revised BCP and test program should be performed by an auditor to ensure that both are comprehensive and updated based on the institution’s risk profile and test results.

The process of updating the BCP and the test program requires management to document, track, and ultimately resolve any necessary changes by revising the BCP or the test program, or by conducting additional tests, if deemed necessary.

Issue Tracking, Resolution and Continuity Update

Test owners, typically business line or support management, should assign responsibility for resolution of material business continuity problems identified during testing and should track issues to ensure that they are effectively addressed in a timely manner. Issues requiring resolution may stem from a number of factors, including changes in internal or external dependencies involving staff, technology, facilities, and third parties. Test results and issues should be periodically analyzed to determine whether problems encountered during testing could be traced to a common source, such as inadequate change control procedures. Software applications are commercially available to assist the BCP coordinator in identifying and tracking changes so that the BCP can be appropriately updated. Once the BCP is updated, the financial institution should ensure that the revised BCP is distributed throughout the organization.
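
The periodic analysis for a common source can be as simple as counting how often each recorded cause appears across tracked issues. The sketch below assumes each issue is logged with a root cause label; the field names, function name, and sample issues are hypothetical and do not describe any particular tracking product.

```python
from collections import Counter


def recurring_root_causes(issues: list, threshold: int = 2) -> dict:
    """Return root causes recorded on at least `threshold` issues, with their counts."""
    counts = Counter(issue["root_cause"] for issue in issues)
    return {cause: count for cause, count in counts.items() if count >= threshold}


# Illustrative issue log entries from several tests.
issues = [
    {"id": 1, "test": "data center failover", "root_cause": "change control"},
    {"id": 2, "test": "call tree drill", "root_cause": "outdated contact list"},
    {"id": 3, "test": "branch relocation", "root_cause": "change control"},
]
print(recurring_root_causes(issues))
```

In this example two of the three issues trace back to change control, which would suggest reviewing change control procedures rather than treating each issue in isolation.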

Updating Test Program and Re-Testing

Once tests have been completed, documented, and assessed, the test program should be updated to address any gaps identified during the tests. Suggestions for improving test scenarios, plans, or scripts provided by test participants should be incorporated into the testing cycle. In the event that tests do not succeed in meeting their required objectives, management should determine whether it is necessary to re-test prior to the next scheduled test. Failure to meet significant test objectives for critical business functions requires management to address re-testing based on the risk to the institution.