Make sure IT is Recoverable
Frank Busa
June 5, 2008
Having a comprehensive plan is essential to maintaining business continuity for the firm. The first step is to develop a Disaster Recovery (DR) Plan. This may involve the head of IT, IT staff, consultants, vendors, and management. Developing a DR plan specifically from the IT standpoint can take some time. It involves documenting contact information for your phone service provider and your telephone and computer equipment vendors, specifications detailing minimum requirements for mission-critical servers and applications, contact information for IT staff, insurance agent contact information, and a hierarchy of critical services and infrastructure to determine what should be brought online first. There are four basic components of a disaster recovery plan: data recovery, equipment, facilities, and personnel.
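The hierarchy of critical services described above can be kept in a form that is easy to test and update. The following is a minimal sketch in Python (3.9 or later), not a prescribed method: the service names, dependencies, and priority numbers are hypothetical placeholders, and a real plan would use the firm's own inventory. It orders services so that nothing is brought online before the services it depends on, and breaks ties by business priority.

```python
from graphlib import TopologicalSorter

# Hypothetical example data: each service maps to the services it depends on.
DEPENDENCIES = {
    "network": [],
    "directory": ["network"],
    "file_server": ["directory"],
    "email": ["directory"],
    "document_management": ["file_server", "directory"],
}

# Hypothetical priorities: a lower number means more critical to the business.
PRIORITY = {
    "network": 1,
    "directory": 1,
    "email": 2,
    "file_server": 2,
    "document_management": 3,
}


def recovery_order(dependencies, priority):
    """Return services in bring-up order.

    A service never appears before the services it depends on. Among
    services that become ready at the same time, the most critical
    (lowest priority number) comes first, with the name as a tie-breaker.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    order = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(), key=lambda s: (priority[s], s))
        for service in ready:
            order.append(service)
            sorter.done(service)
    return order


if __name__ == "__main__":
    for step, service in enumerate(recovery_order(DEPENDENCIES, PRIORITY), 1):
        print(step, service)
```

Keeping the dependencies separate from the priorities means either list can be revised on its own as the environment changes.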
The data recovery component includes making backups and keeping them offsite. It is advisable to make complete system backups of servers and to keep documentation and configurations for servers, routers, switches, and firewalls. Remember to keep the backup data and the documentation and configurations offsite and easily accessible.
The equipment component includes making a list of every key server, what it does, the applications and/or data it holds, and the specifications needed to replace it. The list should contain all vital information about the server, such as the vendor, processor, motherboard, hard drive, controller, power supply type, operating system, and serial number.
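One way to keep such a list consistent is to give every server the same record structure and check it for gaps. The sketch below is illustrative only: the `ServerRecord` class and the `missing_fields` helper are hypothetical names, and the fields simply mirror the items listed above.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ServerRecord:
    """One entry in the equipment list (hypothetical structure)."""
    name: str
    role: str                      # what the server does
    applications: list = field(default_factory=list)
    vendor: str = ""
    processor: str = ""
    motherboard: str = ""
    hard_drive: str = ""
    controller: str = ""
    power_supply_type: str = ""
    operating_system: str = ""
    serial_number: str = ""


def missing_fields(record):
    """Return the names of fields that are still empty in a record."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


if __name__ == "__main__":
    # Placeholder values; a real entry would come from the firm's inventory.
    example = ServerRecord(name="FS01", role="File server", vendor="ExampleVendor")
    print("Still to document:", missing_fields(example))
```

Running a check like this before the list is filed offsite helps confirm that each record holds enough detail to order a replacement.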
The facilities component covers circumstances such as a power outage, loss of climate control in the data center or computer room, fire, water damage, and obstruction of the entrance to the building. Although one or more of these circumstances might be temporary, each should be addressed so that business can continue. For example, a power outage may last several hours or days. Uninterruptible power systems will allow approximately 10-20 minutes to shut down the servers safely and in an orderly state. A fire in or near the data center or computer room can disrupt business for days, weeks, or months.
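Because the battery window is short, it is worth checking on paper that an orderly shutdown fits inside it. The following sketch assumes the servers are shut down one after another and that rough shutdown times are known. The function name, the server names, the timings, and the two-minute safety margin are all illustrative assumptions, not measured values.

```python
def shutdown_plan(ups_runtime_min, shutdown_steps, safety_margin_min=2.0):
    """Check whether a sequential shutdown fits within the UPS runtime.

    shutdown_steps is a list of (server_name, minutes) pairs in the order
    the servers will be shut down. Returns a list of
    (server_name, minutes_elapsed, fits) tuples, where fits is True if the
    server finishes shutting down before the runtime minus the margin.
    """
    budget = ups_runtime_min - safety_margin_min
    elapsed = 0.0
    plan = []
    for server, minutes in shutdown_steps:
        elapsed += minutes
        plan.append((server, elapsed, elapsed <= budget))
    return plan


if __name__ == "__main__":
    # Hypothetical servers and timings, listed in shutdown order.
    steps = [("application", 3), ("database", 4), ("file_server", 2)]
    for server, elapsed, fits in shutdown_plan(10, steps):
        status = "ok" if fits else "exceeds battery budget"
        print(f"{server}: done at {elapsed:.0f} min ({status})")
```

If a server falls outside the budget, the options include shutting servers down in parallel, changing the order, or adding battery capacity.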
The last component is personnel. Your staff is the most important component when faced with a disaster. Properly documenting your DR plan can ensure a smooth recovery. Organize your manual so that it documents all your processes. It is very important to test your DR plan.
There are several different technologies that back up, protect, and recover your data. Examples include replicating your servers, backing up data to another hard drive or a SAN (Storage Area Network), virtualizing your environment, and backing up to and restoring from tape.

Replicating servers can be done onsite or offsite to another location. Clustering can be implemented so that a server can be taken down for maintenance without interrupting service, and it can help protect against data loss if a server fails. Use hot-swappable RAID arrays for data redundancy. RAID is an acronym for Redundant Array of Independent Disks, and there are different levels of RAID. Usually the operating system drive is configured with RAID 1, or mirroring. RAID 5, which stripes data and parity across three or more disks, is typically set up for the data drive. RAID 10, also called RAID 1+0, is another option, which mirrors drives in pairs and then stripes data across the mirrored pairs. Use redundant power supplies to protect your server's uptime if the main power supply fails.

There are utilities such as NSI Double-Take software that replicate data over a high-speed internet connection to a remote disaster recovery center or co-location facility. There is also backup software such as E-Vault that can back up data and send it to a remote disaster recovery center's data vault. This makes it faster and easier to restore your data.

Another solution is to virtualize your production environment at a co-location facility or remote disaster recovery center. VMware ESX Server is an example of software that could be used. This can enable server consolidation without having the same hardware installed at the co-location or remote DR center. In other words, 10 servers from your production environment can be consolidated onto 1 server in the remote DR center. Imaging software such as Symantec's LiveState can be used to take live snapshots of your entire server without interruption or a reboot.
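The RAID levels mentioned above trade usable disk space for redundancy in different ways. As a rough sketch, assuming all disks in an array are the same size and ignoring controller and file system overhead, the usable capacity can be estimated as follows. The function name and the level labels are illustrative choices, not any vendor's interface.

```python
def usable_capacity_gb(level, disk_count, disk_size_gb):
    """Estimate usable capacity for an array of identical disks.

    RAID1:  every disk holds the same data, so capacity is one disk.
    RAID5:  one disk's worth of space is used for parity.
    RAID10: disks are mirrored in pairs, so half the space is usable.
    """
    if level == "RAID1":
        if disk_count < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_size_gb
    if level == "RAID5":
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disk_count - 1) * disk_size_gb
    if level == "RAID10":
        if disk_count < 4 or disk_count % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return (disk_count // 2) * disk_size_gb
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    # Hypothetical array: four 300 GB disks.
    for level in ("RAID5", "RAID10"):
        print(level, usable_capacity_gb(level, 4, 300), "GB usable")
```

With four disks, RAID 5 leaves three disks' worth of space and survives the loss of one disk, while RAID 10 leaves two disks' worth and survives the loss of one disk in each mirrored pair.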
Some of these technologies can be used together to speed recovery from a temporary or total disaster. Other companies offer different technologies; these are just a few.
Control physical access to your server room. A simple solution is to put a keyless lock on the server room entrance. Access control cards are the most robust way of protecting access to your server room. You should consider adopting a rule prohibiting food and beverages in the server room. Keep the server room door closed for climate control and security reasons.
Protect against critical internet security threats. These threats include viruses, spyware, brute force attacks, hacking, and password cracking. It is highly recommended to have enterprise-level anti-virus software installed on all servers and individual PCs. There are several anti-virus companies that provide such protection, including Symantec, McAfee, and Trend Micro. Cisco has software called CSA that is advertised to prevent unknown virus attacks or anomalies from penetrating your network. Anti-spyware controls should also be in place to protect your network. Spyware is a general term for software that performs certain behaviors, such as displaying advertising, collecting personal information, or changing the configuration of your computer, generally without obtaining the PC user's consent. Anti-spyware software can be installed at the server level to prevent malicious spyware from being downloaded and installed on an unsuspecting user's PC if they visit an infected website. WebSense is an internet monitoring tool that can prevent a user from visiting a website known to be infected with spyware. It does not remove spyware from a PC, but it prevents the download and the ensuing damage.
In conclusion, recovering IT from a disaster requires careful and thorough planning, a process that can be better managed and controlled with new software tools and hardware technologies such as those described above.
Reprinting permission: This article originally appeared in the Fall 2005 issue of American Legal Administrators newsletter.
For more information about this topic, please contact Frank Busa.