How do I reduce alert "noise"?

Getting too many ProTop alerts? Here's how to fine-tune the alerts you see in your feed on the ProTop portal.

ProTop arrives with dozens of alerts ready to fire "out of the box," and many do.

These alert definitions reside in your [PROTOP]/etc/alert.[siteid].cfg file, which is a copy of etc/alert.cfg customized for your site-specific needs. If you do not have an etc/alert.[siteid].cfg, take a moment and copy etc/alert.cfg to it now. The etc/alert.cfg file is overwritten by each ProTop update, so do not make your customizations there.
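The copy step can be sketched in a few lines of Python. This is only an illustration, not part of ProTop: the function name ensure_site_alert_cfg and the arguments etc_dir and site_id are hypothetical, and you would pass your own [PROTOP]/etc path and site id.

```python
import shutil
from pathlib import Path


def ensure_site_alert_cfg(etc_dir, site_id):
    """Copy alert.cfg to alert.<site_id>.cfg unless the site copy already exists.

    etc_dir and site_id are placeholders for your [PROTOP]/etc directory
    and your own site id.
    """
    etc = Path(etc_dir)
    site_cfg = etc / f"alert.{site_id}.cfg"
    if not site_cfg.exists():
        # Never overwrite an existing site file: it holds your customizations.
        shutil.copyfile(etc / "alert.cfg", site_cfg)
    return site_cfg
```

Because the function checks for the site file first, running it again after you have customized the file leaves your changes alone.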

NOTE: The metric names at the start of each alert definition in your etc/alert.[custid].cfg do not always match the case of the metric names in the alerts you see in the portal. For instance, sqlStats in an alert title corresponds to SQLStats in your etc/alert.[custid].cfg file. So, when searching for a metric in your alert config file, use a case-insensitive method (grep -i, findstr /i, Notepad++, etc.).
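If you prefer a script to grep or findstr, a case-insensitive lookup might look like the sketch below. The function name find_metric_lines is hypothetical, and the code assumes only that the metric name is the first whitespace-separated token on a definition line, optionally preceded by a # when the line is commented out.

```python
def find_metric_lines(cfg_text, metric):
    """Return (line_number, line) pairs whose first token is the metric, ignoring case.

    Commented-out definitions (lines starting with #) are included so you
    can see whether an alert has already been disabled.
    """
    wanted = metric.lower()
    hits = []
    for number, line in enumerate(cfg_text.splitlines(), start=1):
        tokens = line.lstrip("#").split()
        if tokens and tokens[0].lower() == wanted:
            hits.append((number, line))
    return hits
```

Searching for sqlStats, SQLSTATS, or sqlstats then finds the same SQLStats lines.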

NOTE: Your etc/alert.[custid].cfg is re-read every monitoring interval. There is no need to restart anything for these changes to take effect.

0. What does this alert mean?

Before you can accurately adjust which alerts appear in your feed, you must know what they mean. Over 100 alerts are enabled by default. Each default alert has a corresponding page in this knowledgebase and can be found by searching here for the metric name, the first string in the alert subject.

The alert subject is highlighted in yellow in the image below, and the metric name in the red box is "sqlStats." Enter sqlStats (not case sensitive) in the search box above to find articles about sqlStats, including the explanatory article whose title begins with "Alert."

1. I don't care about this alert. Make it go away!

If you see an alert that you believe does not apply to you, or that you do not care about, the simplest way to reduce that noise is to comment it out in your etc/alert.[siteid].cfg file. For example, we see this alert for sqlStats -1 < 0:

If you do not use SQL clients in your environment and have no intention of doing so in the future, then edit your etc/alert.[siteid].cfg file, find the line(s) that begin with SQLStats, and add a # to the beginning of each line so it looks like this:

#SQLStats     num   <   0 "" "daily" "&1 &2 &3" alert

Save the file; you will no longer see this alert in your alerts feed on the ProTop Portal.
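For sites that script their configuration changes, the same edit can be sketched in Python. The function name comment_out_metric is hypothetical; the code assumes only what is described above, namely that a definition line begins with the metric name and that a leading # disables it.

```python
def comment_out_metric(cfg_text, metric):
    """Prefix every active definition line for the metric with '#'.

    Matching is case-insensitive. Lines that are already commented out
    are left untouched, so the function is safe to run more than once.
    """
    wanted = metric.lower()
    out = []
    for line in cfg_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].lower() == wanted:
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"
```

You would read your etc/alert.[siteid].cfg into a string, pass it through comment_out_metric(text, "SQLStats"), and write the result back to the same file.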

2. I care about this alert, but I'm getting too many (the threshold is too low for my environment).

For instance, we see here multiple alerts for LatchReq exceeding the threshold, which is set to 3000000 by default:

If you know your total latch requests regularly exceed 5000000 and it is not a problem for your application, set the threshold higher. Edit your etc/alert.[siteid].cfg, find the line that begins with latchReq (yes, with a lowercase "l"), change 3000000 to the number above which you want to be alerted, say 5500000, and save the file. You will then see alerts only when total latch requests in a monitoring interval exceed 5500000.
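A scripted version of this threshold change might look like the sketch below. It rests on an assumption: that the latchReq definition uses the same layout as the SQLStats line shown earlier, with the threshold as the fourth whitespace-separated field (metric, type, comparison, threshold). Check your own file before relying on it. The function name set_threshold is hypothetical.

```python
import re


def set_threshold(cfg_text, metric, old, new):
    """Replace the threshold on active definition lines for the metric.

    Assumes the threshold is the fourth field on the line. Only lines whose
    current threshold equals `old` are changed; commented-out lines and
    every other line are returned unchanged.
    """
    pattern = re.compile(
        r"^([ \t]*" + re.escape(metric) + r"[ \t]+\S+[ \t]+\S+[ \t]+)"
        + re.escape(str(old)) + r"(?=\s|$)",
        re.IGNORECASE | re.MULTILINE,
    )
    # A function replacement avoids ambiguity between the group reference
    # and the digits of the new threshold.
    return pattern.sub(lambda m: m.group(1) + str(new), cfg_text)
```

For the example above, the call would be set_threshold(text, "latchReq", 3000000, 5500000). Requiring the old value to match guards against silently overwriting a threshold that someone has already tuned.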

3. What are "blue" alerts, or alerts that start with a number in parentheses, and how do I make them disappear?

Blue alerts are "info" alerts. One type of "info" alert is a message that ProTop finds in the database log and passes along to the ProTop Portal as an alert. For example:

You can control how ProTop handles these db message alerts in your etc/messages.custid.cfg file.

Edit the file and look for the line that begins with the error number in parentheses that you see in the alert ("794" in the example above). Change the second parameter on the line from "info" to "ignore" and save the file. ProTop will pick up the change in the next few minutes and stop sending that message as an alert.

What about databases (or other resources) I have no configuration control over?

I'm running a vendor's app and database, and I cannot reconfigure them. Any alerts related to optimizing the configuration of these resources are noise to me. How do I stop such ProTop alerts for these resources?

Create an alert configuration file specific to the resource(s) you want to eliminate configuration-related alerts for. You can do this on a one-to-one basis or for a resource "type" according to the ProTop Configuration File Naming Hierarchy.

Scenario One - You have one resource, vendorDB1, you want no alerts for

Copy etc/alert.cfg to etc/alert.vendorDB1.cfg and comment out the alert definitions there that you no longer want to receive. On startup, the pt3agent will find this file and load those definitions instead of inheriting and loading the site-wide etc/alert.<site>.cfg.

Scenario Two - You have multiple vendor-supplied resources you want to reduce or eliminate ProTop alerts for

Let's say you are running a vendor-supplied app server and database on your production server along with databases you do manage and need alerts for. You can give the vendor's resources a "type" (aka "group" on the portal) which is an arbitrary string that you add to the resource definition. Type lives in the 6th position in each resource definition found in etc/dblist.cfg:

win10s2k|c:\db\122\sports2000|WIN10-VPN|yes||VENDOR

It is also found in the "Group" field under Resources Administration on the ProTop Portal:

Here we have added the type/group of VENDOR for this resource.
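To check which type a resource definition carries, you can split the line on the | delimiter and read the sixth field. The sketch below assumes nothing about the other fields; the function name resource_type is hypothetical.

```python
def resource_type(dblist_line):
    """Return the type (6th pipe-delimited field) of a dblist.cfg entry.

    Returns an empty string when the line has fewer than six fields.
    """
    fields = dblist_line.rstrip("\n").split("|")
    return fields[5].strip() if len(fields) > 5 else ""


# The resource definition from the example above (raw string keeps the backslashes).
entry = r"win10s2k|c:\db\122\sports2000|WIN10-VPN|yes||VENDOR"
vendor_type = resource_type(entry)
```

Note that the fifth field is empty in the example, so the type must still be counted as the sixth position.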

Now you can create an etc/alert.VENDOR.cfg. When the ProTop agents start for resources with VENDOR in the type position, they will read this file instead of the etc/alert.<site>.cfg that applies to all other resources on this server.

Edit this file and comment out the alert definitions for alerts you see in your feed that you know you can do nothing about.

CAUTION: You should not comment out all of the definitions, especially in a production environment. No one knows when a change or unexpected interaction will alter the behavior of the vendor's product for the worse, and we do want ProTop to be watching when it does. Again, eliminate only those alerts you see on the portal that you know you cannot do anything about.
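The file selection described in the two scenarios can be pictured with the sketch below. The lookup order is an assumption made for illustration (resource-specific file first, then type, then site, then the default etc/alert.cfg); the authoritative order is the ProTop Configuration File Naming Hierarchy. The function name pick_alert_cfg and its arguments are hypothetical.

```python
def pick_alert_cfg(existing, resource, res_type, site_id):
    """Pick the alert config file name an agent would load.

    `existing` is the set of file names present in etc/. The order
    (resource, type, site, default) is assumed for this illustration.
    """
    candidates = [
        f"alert.{name}.cfg" for name in (resource, res_type, site_id) if name
    ]
    candidates.append("alert.cfg")
    for name in candidates:
        if name in existing:
            return name
    return None
```

With alert.VENDOR.cfg and alert.mysite.cfg both present, a resource of type VENDOR resolves to alert.VENDOR.cfg under this assumed order, while a resource with no type falls through to alert.mysite.cfg.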