Booklet: Operations
Section: Risk Mitigation and Control Implementation
Subsection: Controls Implementation
ENVIRONMENTAL CONTROLS
Many financial institutions rely on IT operations that are complex, sensitive,
or critical to daily functioning. Disruptions to the IT operations environment
can pose significant operational, strategic, transaction, and reputation
risks. Consequently, management should control and monitor environmental
factors whether at the business line or the consolidated data center.
Management should carefully assess the IT operations environment and implement
relevant controls.
Computing equipment should have a continuous, uninterrupted power source.
Independent electrical feeds drawing on separate power grids are the most
reliable power source; however, they may be cost prohibitive and may not
be feasible in many geographic locations. Management should take reasonable
action to protect computing equipment power sources. Where dual feeds
or back-up power generators are used, wiring should support automatic
switching in the event one power source is disrupted. Power surges can
also damage computer equipment. Consequently, management should monitor
and condition or stabilize the voltage of electricity sources to prevent
power fluctuations.
IT operations centers should have an alternative power source independent
of local power grids. Typically, this is provided by a combination of
a battery-based uninterruptible power supply (UPS) and a generator powered
by gasoline, kerosene, natural gas, or diesel fuel. Management should
configure the UPS to provide sufficient electricity within milliseconds
to power equipment until there is an orderly shutdown or transition to
the back-up generator. The back-up generator should generate sufficient
power to meet the requirements of mission critical technology and environmental
support systems. The institution should have sufficient fuel in storage
or readily available to sustain operations for at least two to three business
days. In addition, management should make arrangements to replenish the
fuel supply in the event of an extended outage. Gasoline becomes stale
after an extended period of storage; where a generator runs on gasoline,
the tank should be drained and refilled
with fresh gasoline at least annually. A lower-cost alternative to installing
a permanent generator is to configure the operations center with an exterior
electrical box to connect a temporary generator. Under this scenario,
management should also establish reliable arrangements for the availability
and delivery of a generator and fuel within a required time frame.
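The fuel guidance above reduces to simple arithmetic: stored fuel divided by the generator's burn rate gives hours of autonomy, which can be compared with the target. The following is a minimal sketch only. The function names, the gallon and gallons-per-hour units, the constant burn rate, and the reading of a "day" as 24 hours of continuous running are all illustrative assumptions, not figures from this booklet; actual burn rates come from the generator manufacturer and vary with load.

```python
def fuel_autonomy_hours(usable_fuel_gal: float, burn_rate_gal_per_hr: float) -> float:
    """Hours the generator can run on stored fuel, assuming a constant burn rate."""
    if usable_fuel_gal < 0 or burn_rate_gal_per_hr <= 0:
        raise ValueError("fuel must be non-negative and burn rate must be positive")
    return usable_fuel_gal / burn_rate_gal_per_hr


def meets_fuel_target(usable_fuel_gal: float, burn_rate_gal_per_hr: float,
                      target_days: float = 3.0) -> bool:
    """True if stored fuel covers the target (assumption: 24 running hours per day)."""
    return fuel_autonomy_hours(usable_fuel_gal, burn_rate_gal_per_hr) >= target_days * 24


def fuel_shortfall_gal(usable_fuel_gal: float, burn_rate_gal_per_hr: float,
                       target_days: float = 3.0) -> float:
    """Additional gallons needed to reach the target; 0.0 if already covered."""
    needed = target_days * 24 * burn_rate_gal_per_hr
    return max(0.0, needed - usable_fuel_gal)
```

For example, under these assumptions a 500-gallon usable supply at 7 gallons per hour lasts a little over 71 hours, which falls 4 gallons short of a three-day target. A shortfall of this kind is what the replenishment arrangements described above are meant to cover.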
Similarly, IT operations centers should have independent telecommunication
feeds from different vendors. Wiring configurations should support rapid
switching from one provider to another without burdensome rerouting or
rewiring. Because vendors often share or sublease the same cabling
or are routed through the same central office, management should have
the vendors perform line traces to confirm that the paths are truly
redundant and that there is no single point of failure.
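Once vendors return their line traces, the comparison itself is mechanical: any conduit, cable, or central office that appears in more than one vendor's path is a potential single point of failure. The sketch below assumes each trace has been transcribed as a list of named path elements. The function name, the vendor names, and the element names are hypothetical, and the institution's own premises should be left out of the lists because every feed necessarily ends there.

```python
from itertools import combinations


def shared_path_elements(traces: dict[str, list[str]]) -> dict[tuple[str, str], set[str]]:
    """For each pair of vendors, return the path elements both traces pass through."""
    shared = {}
    for (vendor_a, path_a), (vendor_b, path_b) in combinations(sorted(traces.items()), 2):
        common = set(path_a) & set(path_b)
        if common:
            shared[(vendor_a, vendor_b)] = common
    return shared


# Hypothetical traces: two vendors enter the building separately
# but are routed through the same central office.
example_traces = {
    "vendor_a": ["building entry north", "conduit 12", "central office east"],
    "vendor_b": ["building entry south", "conduit 40", "central office east"],
}
```

With the hypothetical traces shown, the result flags "central office east" as common to both vendors. An empty result means no shared element was found in the traces as recorded, which is only as reliable as the traces themselves.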
Even small IT operations centers with modest computer equipment can contain
a significant amount of computer cabling. Management should physically
secure these cables to avoid accidental or malicious disconnection or
severing. In addition, management should document wiring strategies and
organize cables with labels or color-codes to facilitate easy troubleshooting,
repair, and upgrade.
Every operations center should have adequate heating, ventilation, and
air conditioning (HVAC) systems in order for personnel and equipment to
function properly. Older computer equipment produces a significant amount
of heat, requiring cooling capacity exceeding that of a standard office
building. Some newer models do not produce as much heat and thus do not
require as much air conditioning. Organizations should plan their HVAC
systems with the requirements of their computer systems in mind. Back-up
sources of electricity should be able to sustain HVAC systems, because
inadequate cooling could render computer equipment inoperable in a short
period of time. Also, operations personnel should be familiar with written
emergency procedures in the event of HVAC system disruption.
Personnel should also be able to function in the event utility service
is interrupted. Therefore, management should keep a one- to two-day supply
of bottled water and non-perishable food on the premises.
All operations centers should have heat and smoke detectors installed
in the ceiling, in exhaust ducts, and under raised flooring. Detectors
should not be situated near air conditioning vents or intake ducts that
can disperse smoke and prevent the triggering of alarms. Some large and
complex operations centers are beginning to use very early smoke detection
apparatus (VESDA) systems in place of conventional smoke detectors. VESDA
systems sample the air on a continuous basis and are far more sensitive.
They are capable of detecting a fire at the pre-combustion stage. Although
more expensive than conventional systems, a VESDA system can detect a
smoldering wire and alert management before a fire starts. The early notice
may also prevent suppression equipment from deploying water or foam that
can damage computer equipment.
A variety of strategies are available for fire suppression. One of the
more widely used systems was a halon gas system that deprived a fire of
oxygen. The government phased out production of halon because it determined
that halon causes ozone depletion. The phase-out deadline was December
31, 2003. Once existing reserves are depleted, additional halon may not
be purchased. Institutions still using halon systems should be prepared
to switch to another fire suppression system. Newer systems rely on the
same theory but use other gaseous agents such as Inergen, FM-200, FE-13, and
carbon dioxide. Many facilities continue to rely on water as a fire suppressant,
choosing a wet-pipe or dry-pipe configuration. In the wet-pipe configuration,
the pipes are filled with water and may be subject to leakage. In the
dry-pipe configuration, the pipes are empty until a fire is detected,
minimizing the risk of water damage from burst or leaking pipes. Ideally,
the fire suppression system should allow operators time to shut down computer
equipment and cover it with waterproof covers before releasing the suppressant.
Many facilities store waterproof covers throughout the data center to
cover sensitive equipment quickly if sprinklers are activated.
Water leaks can cause serious damage to computer equipment and cabling
under raised floors. For this reason, operations centers should be equipped
with water detectors under raised flooring to alert management to leaks
that may not be readily visible. Management should also consider installing
floor drains to prevent water from collecting beneath raised floors or
under valuable computer equipment. Furthermore, management should weigh
the trade-offs in protecting cables from water damage: running cables
under raised flooring exposes them to flooding from below, while suspending
cables overhead exposes them to leaks from the roof or floors
above.
PREVENTIVE MAINTENANCE
Preventive maintenance on equipment minimizes equipment failure and can
lead to early detection of potential problems. This includes minor maintenance
such as cleaning peripheral equipment as well as more extensive maintenance
provided by the manufacturer, vendor, or maintenance contractor. Preventive
maintenance also includes general housekeeping to keep the operations
center clean and orderly.
Unless specifically authorized by management, computer operators should
not repair equipment or perform any maintenance other than the most routine.
Even if operators have the requisite knowledge and experience, many hardware
and software warranties disclaim liability for unauthorized maintenance
or alteration. Maintenance by computer operators should be performed according
to manufacturers' recommendations. As a general rule, these duties include:
- Cleaning tape heads each shift;
- Cleaning printers daily;
- Checking and cleaning the magnetic ink character recognition (MICR) reader/sorter at the end of each shift; and
- Periodically checking and cleaning the area under raised flooring.
Maintenance schedules may vary considerably depending on the number and variety of
technology systems and the volume of work processed. All maintenance should
follow a predetermined schedule. Employees should document maintenance
in logs or other records. Management review of these records will aid
in monitoring employee and vendor performance.
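A predetermined schedule combined with a maintenance log lends itself to a simple overdue check that management could run when reviewing records. The following is a sketch under stated assumptions: the `MaintenanceTask` structure, its field names, and the use of whole-day intervals are illustrative and not prescribed by this booklet. Per-shift tasks would need a finer-grained record than the dates used here.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MaintenanceTask:
    name: str            # description of the task
    interval_days: int   # scheduled interval between completions (assumed in days)
    last_done: date      # most recent completion date recorded in the log


def overdue_tasks(tasks: list[MaintenanceTask], today: date) -> list[str]:
    """Names of tasks whose scheduled interval has elapsed since the last logged completion."""
    return [
        task.name
        for task in tasks
        if today > task.last_done + timedelta(days=task.interval_days)
    ]


# Hypothetical log entries for illustration only.
example_tasks = [
    MaintenanceTask("clean printers", 1, date(2024, 3, 1)),
    MaintenanceTask("check area under raised flooring", 30, date(2024, 2, 20)),
]
```

Evaluated against 4 March 2024, the hypothetical entries report only the printer cleaning as overdue, because the raised-flooring check is not due until 21 March.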
The manufacturer or vendor will usually perform maintenance under contract.
For leased equipment, maintenance may be part of the lease arrangement.
When equipment is owned or leased from a third party, management should
obtain a separate maintenance or service agreement between the operations
center and the equipment manufacturer. The service or maintenance agreement
should provide repair services, detail the preventive maintenance, and
include a schedule for both. When an operations center uses hardware from
more than one manufacturer, it may be desirable to enter into an arrangement
whereby one vendor takes responsibility for all repair maintenance. Under
this arrangement, the operations center would contact the designated vendor
to determine the source of the problem and to make all the necessary repairs.
In any event, management should ensure maintenance contracts guarantee
timely performance.
Management should schedule time and resources for preventive maintenance
and coordinate that schedule with production. During scheduled maintenance,
the computer operators should dismount all program and data files and
work packs, leaving only the minimum software required for the specific
maintenance task on the system. If this is impractical, management should
review system activity logs to monitor access to programs or data during
maintenance. Also, at least one computer operator should be present at
all times when the service representative is in the computer room.
Some vendors can perform computer maintenance online. Operators should
be aware of the online maintenance schedule so that it does not interfere
with normal operations and processing. Operators and information security
personnel should adhere to established security procedures to ensure they
grant remote access only to authorized maintenance personnel at predetermined
times to perform specific tasks.
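The rule that remote access is granted only to authorized maintenance personnel, at predetermined times, for specific tasks can be expressed as a check against a list of approved windows. The sketch below is illustrative only: the `MaintenanceWindow` structure, its fields, and the identifiers used are hypothetical, and a real deployment would enforce this in the institution's access control system rather than in a standalone script.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MaintenanceWindow:
    technician_id: str   # authorized maintenance person (hypothetical identifier)
    system: str          # system the approved task applies to
    start: datetime      # window opens (inclusive)
    end: datetime        # window closes (exclusive)


def access_allowed(windows: list[MaintenanceWindow], technician_id: str,
                   system: str, when: datetime) -> bool:
    """True only if an approved window covers this person, system, and time."""
    return any(
        w.technician_id == technician_id
        and w.system == system
        and w.start <= when < w.end
        for w in windows
    )


# Hypothetical approved window for illustration only.
example_windows = [
    MaintenanceWindow("vendor-tech-01", "core-host",
                      datetime(2024, 3, 9, 1, 0), datetime(2024, 3, 9, 3, 0)),
]
```

The check denies by default: a request is refused unless the person, the system, and the time all match a single approved window.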
Operators should maintain a written log of all hardware problems and downtime
encountered between maintenance sessions. A periodic report on the nature
and frequency of those problems is a necessary management tool, and can
be valuable for vendor selection, equipment benchmarking, replacement
decisions, or planning increased equipment capacity.
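The periodic report described above can be as simple as a count of incidents and total downtime for each piece of equipment. The following is a minimal sketch, assuming the written log has been transcribed into records; the `ProblemEntry` structure, its fields, and the sample entries are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ProblemEntry:
    equipment: str         # equipment that had the problem
    description: str       # nature of the problem
    downtime_minutes: int  # downtime attributed to the problem


def summarize_problems(log: list[ProblemEntry]) -> dict[str, dict[str, int]]:
    """Incident count and total downtime in minutes, keyed by equipment."""
    summary = defaultdict(lambda: {"incidents": 0, "downtime_minutes": 0})
    for entry in log:
        summary[entry.equipment]["incidents"] += 1
        summary[entry.equipment]["downtime_minutes"] += entry.downtime_minutes
    return dict(summary)


# Hypothetical log entries for illustration only.
example_log = [
    ProblemEntry("tape drive 2", "read errors", 45),
    ProblemEntry("MICR reader/sorter", "document jam", 20),
    ProblemEntry("tape drive 2", "failed to load", 30),
]
```

For the hypothetical entries, the summary shows two incidents and 75 minutes of downtime for the tape drive and one incident and 20 minutes for the reader/sorter. Recurring problems on a single device are the kind of pattern that informs replacement and vendor decisions.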