Sunday, August 06, 2006

Are there babies dying?

With the recent turmoil in the Middle East, there is indeed a situation where babies are dying. It is heartbreaking to watch the news and see so many innocent children caught up in this tragic violence. Whatever our political views, no one wants to see young children suffer. We see rescue workers on both sides struggling to give aid to these victims while placing their own lives at risk. This is without doubt an emergency. The point of this post is to contrast these "real life" emergencies with the IT emergencies that we deal with.

On countless occasions I have been drawn into panic situations on the IT front, where it seemed that doom was certain if the server wasn't back up right now. Managers are screaming, the phone rings off the hook, vacations are called off, and the world is on the edge of destruction, right? A few years ago I worked with a gentleman who, in the middle of one of these IT catastrophes, asked, "Are there babies dying?" This really hit home with me and helped me put things in the proper perspective. Large revenue-impacting events are definitely important and should be responded to appropriately, but they are not on the same scale as the life-and-death events we see around us. Organizations find themselves in a panic-first mentality, which is counterproductive, stressful, and not cost-effective.


Some well-thought-out procedures can carry an organization through even the most critical IT emergencies. The organization I work for had several incidents with hurricanes last year, which required shutting down offices, rerouting business, and the very human task of getting money to people while the office was out of commission. We were able to do this efficiently, without undue stress, and with some compassion. All of this was possible because we took time to plan. When the emergency arose, we knew what to do.

Here are some really basic tips on disaster planning that we often take for granted.

1. Maintain an up-to-date contact list, with roles and responsibilities. It is amazing how hard phone numbers are to find during a crisis.

2. Have a communication plan. Know who communicates with clients and who will manage internal communication. Let your staff do their jobs and have someone else manage communication. Know what you will do if all Internet and telephone communication is shut off.

3. Prepare an escalation process, and make sure everyone knows how to use it.

4. Keep up-to-date inventories of equipment and software.

5. Backups, backups, backups. Enough said.

6. Document the chain of events. Use your incident management system, if available, to keep track of all activities surrounding the emergency.

7. At the end of the event, do a postmortem. Review the timeline of events and look for things that could have been done better. Be critical of yourself; this will pay huge dividends the next time an event occurs. I have found that sharing the weaknesses in our response, along with a solid plan to correct them next time, adds credibility with clients and auditors. Don't be afraid of the truth!
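
For those who like to see things in code, tips 1, 3, and 6 can be sketched in a few lines of Python. This is only a rough illustration, not a real incident management system: the `Contact` and `Incident` classes, the `next_to_call` helper, and every name and phone number below are made up for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Contact:
    name: str
    role: str
    phone: str
    escalation_level: int  # 1 is called first; higher levels are called later


@dataclass
class Incident:
    title: str
    events: list = field(default_factory=list)

    def log(self, note, when=None):
        """Record one timestamped entry in the chain of events."""
        when = when or datetime.now(timezone.utc)
        self.events.append((when, note))

    def timeline(self):
        """Return entries oldest first, ready for the postmortem review."""
        return sorted(self.events, key=lambda entry: entry[0])


def next_to_call(contacts, current_level):
    """Return the contacts one escalation level above the current one."""
    return [c for c in contacts if c.escalation_level == current_level + 1]


# Hypothetical data, for illustration only.
contacts = [
    Contact("On-call admin", "restores service", "555-0100", 1),
    Contact("IT manager", "internal communication", "555-0101", 2),
    Contact("Account lead", "client communication", "555-0102", 2),
]

incident = Incident("Mail server down")
incident.log("Server unreachable, on-call admin paged")
incident.log("Escalated to level 2")

# Who gets the next phone call if level 1 cannot resolve the problem?
for contact in next_to_call(contacts, 1):
    print(contact.name, contact.phone)
```

The point is less the code than the habit: if the contact list and the escalation levels are written down before the crisis, nobody has to hunt for phone numbers, and the logged timeline is already waiting when the postmortem starts.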

In conclusion, I would like to say that cool heads and basic management techniques solve problems faster and more effectively than panic and finger-pointing. The next time a server crashes, remember to ask, "Are there babies dying?"

2 comments:

Anonymous said...

pppppp= proper planning prevents piss-poor-performance.

Good blog entry, hits the problem on the head.

Are there babies dying, indeed.

thanks,

Jim H.

Anonymous said...

Great information!
James A