Dirk Deimeke: The philosophy of monitoring …

This article is a must-read for system administrators:

Some monitors may have thresholds set which check for certain conditions; when those conditions are met, you may want to send some type of alert to an administrator. There are two types of notifications – Active and Passive:

• Active Notification: Immediate Action is Required: “Site is Down!” A phone call, page, or IM may be used to contact someone. Direct action expected.
• Passive Notification: Informational Purposes only: “JVM Memory usage is high.” Information is logged, and perhaps an email is sent. No direct action is expected.

It’s easy to become addicted to passive notifications – but remember, data overload can mask important information. It becomes habit to ignore notifications if they are unimportant. The question then is not so much “when should you notify,” but “when shouldn’t you?” What it really boils down to is “Can/should I do anything about it right now?”
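The active/passive split and the "Can/should I do anything about it right now?" test can be sketched as a small routing function. This is only an illustration, not something taken from the article: the names `Alert`, `actionable_now`, `classify`, and `channels_for` are hypothetical, and the channel strings simply mirror the contact methods mentioned above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Notification(Enum):
    ACTIVE = "active"    # immediate action required
    PASSIVE = "passive"  # informational purposes only


@dataclass
class Alert:
    message: str
    # Hypothetical flag answering "Can/should I do anything about it right now?"
    actionable_now: bool


def classify(alert: Alert) -> Notification:
    """Treat an alert as active only if someone can and should act on it now."""
    return Notification.ACTIVE if alert.actionable_now else Notification.PASSIVE


def channels_for(kind: Notification) -> List[str]:
    """Map a notification type to illustrative contact channels (assumed names)."""
    if kind is Notification.ACTIVE:
        # Interrupting channels: direct action is expected.
        return ["phone", "page", "im"]
    # Non-interrupting channels: log always, email is optional in practice.
    return ["log", "email"]


site_down = Alert("Site is Down!", actionable_now=True)
jvm_memory = Alert("JVM Memory usage is high.", actionable_now=False)

for alert in (site_down, jvm_memory):
    kind = classify(alert)
    print(alert.message, kind.value, channels_for(kind))
```

Keeping the decision in a single function makes the default explicit: anything nobody can act on immediately falls through to the passive path and never interrupts anyone.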

The article covers the following sections:
• Why you Monitor
• When you should NOT Notify
• Reacting Properly
• How Much is Too Much?
• Making Contact
• Afflictions
• Don’t be [A]pathetic

One mistake I’ve seen is using email as a reliable and immediate method of contact, often expecting a quick response. My favorite is when someone sends you an email, then walks down to your desk immediately after and asks, “Did you see my email?” You check and see it was sent literally less than two minutes ago. People don’t reliably check their email. Admins especially don’t, due to the sheer volume we receive.

