How to Control the Frequency of Database Down or Outage Alerts

Is there a way to tune the frequency (nag interval) of dbDown and other dbmonitor-generated alerts?

Alert Text

dbDown /db/myDB [DOWN] missing .lk file

Description

We are getting too many alerts for our database being down, especially during planned outages. How do we control this?

Corrective Action(s)

To change the alert interval for database shutdowns and crashes from the default of 1 hour (3600 seconds) to, say, 2 hours (7200 seconds), edit your PROTOPDIR/bin/localenv file.

1. First, copy bin/localenv.x to bin/localenv if bin/localenv does not already exist.

2. Edit this line in bin/localenv:

# export DBSHUTNAG=3600                 # Nag interval for db shutdowns and crashes

3. Remove the # from the beginning of the line.

4. Change 3600 to 7200. The line should now look like this:

export DBSHUTNAG=7200                 # Nag interval for db shutdowns and crashes

5. Save the file and exit.

6. Restart the database monitor by removing PROTOPDIR/tmp/dbmonitor.flg. The dbmonitor will restart on its own shortly afterward.
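
The steps above can also be scripted. The following is only a sketch, not an official ProTop tool: the function name set_dbshutnag is hypothetical, and it assumes that PROTOPDIR is available as a shell variable holding your ProTop install directory and that the DBSHUTNAG line in localenv looks like the one shown in step 2. It writes through a temporary file rather than relying on "sed -i", whose behavior differs between platforms.

```shell
#!/bin/sh
# Sketch only: set DBSHUTNAG in bin/localenv and trigger a dbmonitor restart.
# set_dbshutnag is a hypothetical helper name, not part of ProTop.
set_dbshutnag() {
    protop_dir=$1      # ProTop install directory (assumed: value of $PROTOPDIR)
    interval=$2        # nag interval in seconds, e.g. 7200 for 2 hours

    envfile="$protop_dir/bin/localenv"

    # Step 1: create localenv from the template if it does not exist yet.
    if [ ! -f "$envfile" ]; then
        cp "$protop_dir/bin/localenv.x" "$envfile" || return 1
    fi

    # Steps 2 to 4: uncomment the DBSHUTNAG line and set the new value.
    # The trailing comment on the line is left untouched.
    sed "s/^#* *export DBSHUTNAG=[0-9]*/export DBSHUTNAG=$interval/" \
        "$envfile" > "$envfile.tmp" \
        && cat "$envfile.tmp" > "$envfile" \
        && rm -f "$envfile.tmp" || return 1

    # Step 6: removing the flag file causes dbmonitor to be restarted.
    rm -f "$protop_dir/tmp/dbmonitor.flg"
}

# Example call (left commented out so that sourcing this file changes nothing):
# set_dbshutnag "$PROTOPDIR" 7200
```

After running it, check the result with grep DBSHUTNAG "$PROTOPDIR/bin/localenv" before relying on the new interval.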

Other Stuff

The frequency of these alerts is separate from the alert frequency defined in Alert Configuration Overview. That configuration file is used by the ProTop agents, which must connect to the database(s) to gather metrics, check thresholds, and send alerts to the portal. The dbmonitor, by contrast, is a standalone process that is independent of the resources ProTop is monitoring. It has to be, so that it can still check and report the status of those resources when they are down.

See also

Alert Configuration Overview

If all else fails...

Contact us at support@wss.com or use the online chat. We'll be happy to help.