ProTop alert configuration is broad and deep. This article details all of the elements that make up the alert definitions found in your etc/alert.*.cfg file.
Because the default alert configuration file, etc/alert.cfg, changes over time and is overwritten during ProTop updates, follow the advice in the Configuration File Hierarchy article and copy the default to an appropriately named custom version. The simplest and most common choice is a configuration file named for your site.
For a site named "wss" (your site name is in etc/custid.cfg), copy etc/alert.cfg to etc/alert.wss.cfg and make all your edits to that file, leaving etc/alert.cfg untouched. Your site-named alert configuration file will not be deleted or overwritten by future ProTop updates or reinstalls.
At the top of etc/alert.cfg, and therefore at the top of any customized copy of it (e.g. etc/alert.wss.cfg), you will find this alert configuration legend:
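The copy step can be scripted so that it never clobbers an existing customized file. The following is a minimal sketch, not part of ProTop: the function name make_site_alert_cfg is hypothetical, and the site name is passed in by hand rather than read from etc/custid.cfg, whose layout is not described here.

```python
import shutil
from pathlib import Path


def make_site_alert_cfg(etc_dir, site):
    """Copy etc/alert.cfg to etc/alert.<site>.cfg unless the copy already exists.

    Hypothetical helper: 'site' is supplied by the caller, not parsed
    from etc/custid.cfg.
    """
    etc = Path(etc_dir)
    default = etc / "alert.cfg"
    custom = etc / f"alert.{site}.cfg"
    if not custom.exists():
        # Only the new file is written; alert.cfg itself is left untouched.
        shutil.copyfile(default, custom)
    return custom
```

Calling make_site_alert_cfg("etc", "wss") a second time leaves any edits already made to etc/alert.wss.cfg in place.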
# Configure ProTop Alerts
# Metric   Type  Compare  Target  Sensitivity  Notify  Message               Action(s)
# ======   ====  =======  ======  ===========  ======  ====================  ================
# LogRd    num   >        100000  3:5          Always  "Hit Ratio &1 &2 &3"  event,event,...
# Metric       The name of the field being monitored
# Type         Data type -- char or num (string or numeric).
# Compare      Operator -- >, <, =, <>, <=, >=
# Target       The threshold value of the metric to be tested.
# Sensitivity  How sensitive to be -- blank means always
#                #   = after # or more occurrences
#                #:# = after # or more occurrences over # samples
# Notify       How often to notify (aka "nag level") -- blank means always
#                always = 0
#                daily  = 86400
#                hourly = 3600
#                #      = frequency in seconds
# Message      Message text:
#                &1 = current value
#                &2 = comparison operator
#                &3 = target
#                &4 = sensitivity
#                &5 = notify
# Action       Comma delimited list of events which will be PUBLISHed with
#              message as a parameter. Current event types are:
#                z*     - adds information to the body of the alert
#                alert  - "this is something interesting"
#                alarm  - "you should act on this soon"
#                page   - "a bad thing is happening, act immediately"
#                script - executes $PROTOP/bin/metricName
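To make the eight columns of the legend concrete, here is a rough Python sketch that splits one definition line into named fields. It is an illustration only, not ProTop's own parser: it assumes the columns are separated by whitespace, that only the Message is quoted, and that all eight columns are present. The names AlertDef and parse_alert_line are hypothetical.

```python
import shlex
from typing import List, NamedTuple


class AlertDef(NamedTuple):
    metric: str
    dtype: str
    compare: str
    target: str
    sensitivity: str
    notify: str
    message: str
    actions: List[str]


def parse_alert_line(line):
    """Split one alert definition into the eight legend columns.

    Assumption: whitespace-separated columns with a double-quoted message.
    """
    fields = shlex.split(line)  # keeps the quoted message as one field
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    *head, actions = fields
    return AlertDef(*head, actions.split(","))


definition = parse_alert_line(
    'LogRd num > 100000 3:5 Always "Hit Ratio &1 &2 &3" alert,alarm'
)
print(definition.metric, definition.compare, definition.target)  # LogRd > 100000
```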
Metric is the name of one of the nearly 1000 data elements tracked by ProTop. See Alertable Metrics Overview for metrics by category and the details needed to create a new alert.
Type is the data type of the metric being evaluated in the alert definition, either "char" or "num". It tells ProTop whether to compare the metric alphabetically or numerically.
Compare is the operator used to perform the comparison, one of: >, <, =, <>, <=, >=
Target is either the limit on the value being tested, beyond which an alert is sent, or a string (word or phrase) whose presence (=) or absence (<>) is being checked. When the condition is met, the alert is triggered.
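The difference between the "char" and "num" types matters most when a value looks like a number. The sketch below shows how Type, Compare and Target could combine into a single test. It is an assumption-laden illustration rather than ProTop's implementation: the function name breached is hypothetical, and Python string comparison is case-sensitive, which may not match how ProTop compares strings.

```python
import operator

# The six operators listed in the legend, mapped to Python equivalents.
OPS = {
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
    "<>": operator.ne,
    "<=": operator.le,
    ">=": operator.ge,
}


def breached(dtype, value, compare, target):
    """Return True when 'value <compare> target' holds for the given type."""
    if dtype == "num":
        value, target = float(value), float(target)  # numeric comparison
    else:
        value, target = str(value), str(target)      # alphabetical comparison
    return OPS[compare](value, target)


print(breached("num", "9", ">", "100"))   # False: 9 is less than 100
print(breached("char", "9", ">", "100"))  # True: "9" sorts after "1"
```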
Sensitivity moderates the nag frequency (see Notify below). If Notify is set to once an hour and Sensitivity is blank, you will get an alert every hour while the target is breached. Set Sensitivity to a number, say 2, and you will be alerted every other hour. Set it to 2:2 and you will get the alert only when the breach occurs a second time within two samples. This helps with metrics such as logical database reads (LogRd), which can be jumpy or spiky: use Sensitivity to be alerted only when the metric persists above the Target, not every time it breaches.
Notify is a character string indicating how often you want to check the metric for a breach and be alerted if it has occurred. The values "always", "daily" and "hourly" are converted to a frequency in seconds before being evaluated. You can also set Notify to any frequency in seconds. For instance, to be alerted once a week, set Notify to the number of seconds in a week, "604800" (including the quotes).
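The "#:#" form of Sensitivity can be pictured as a sliding window over the most recent samples. The following sketch is one possible reading of the legend, not ProTop's actual logic: the class name SensitivityGate is hypothetical, a blank value is treated as "fire on every breach", and the plain "#" form is deliberately left out because the legend does not say over what span the occurrences are counted.

```python
from collections import deque


class SensitivityGate:
    """Fire only after 'need' breaches within the last 'window' samples."""

    def __init__(self, spec=""):
        if not spec:
            need, window = 1, 1  # blank means always
        else:
            # Only the "#:#" form is modeled in this sketch.
            need, window = (int(part) for part in spec.split(":"))
        self.need = need
        self.recent = deque(maxlen=window)  # old samples fall off automatically

    def sample(self, is_breach):
        """Record one sample and report whether the alert should fire."""
        self.recent.append(bool(is_breach))
        return sum(self.recent) >= self.need


gate = SensitivityGate("3:5")
spiky = [True, False, True, False, True, False, False, False]
print([gate.sample(b) for b in spiky])
# [False, False, False, False, True, False, False, False]
```

With 3:5, isolated spikes are ignored and the gate opens only on the fifth sample, when three of the last five samples breached the target.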
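The conversion of Notify into seconds follows directly from the legend. Here is a small sketch of that mapping. The function name notify_seconds is hypothetical, and treating the keywords as case-insensitive is an assumption based on the legend showing both "Always" and "always".

```python
# Keyword values taken from the legend at the top of alert.cfg.
NAMED_FREQUENCIES = {"always": 0, "daily": 86400, "hourly": 3600}


def notify_seconds(notify):
    """Convert a Notify value to a frequency in seconds."""
    text = notify.strip().strip('"').lower()  # tolerate quotes and mixed case
    if not text:
        return 0  # blank means always
    if text in NAMED_FREQUENCIES:
        return NAMED_FREQUENCIES[text]
    return int(text)  # any other value is a frequency in seconds


one_week = 7 * 24 * 60 * 60
print(one_week)                    # 604800
print(notify_seconds('"604800"'))  # 604800
print(notify_seconds("hourly"))    # 3600
```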
Message is a quoted string that, among other things, serves as the basis for the Alert Subject as seen in the portal. It is most often made up of:
- a small amount of descriptive text, placed within the quotes (e.g. Empty AI Extents)
- &1 is the current value of the Metric being evaluated (e.g. 7)
- &2 is the Compare Operator (e.g. < )
- &3 is the Target value (e.g. 2)
and can include:
- &4 the Sensitivity (e.g. "3:5")
- &5 the Notify value (e.g. "always")
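The placeholders listed above behave like positional substitutions. The sketch below shows one way such a template could be filled in. It is illustrative only: the function name render_message is hypothetical, and it does not reproduce how ProTop itself performs the substitution.

```python
import re


def render_message(template, value, compare, target, sensitivity="", notify=""):
    """Replace &1..&5 in a Message template with the alert's values."""
    params = {
        "1": str(value),    # current value of the metric
        "2": compare,       # comparison operator
        "3": str(target),   # target value
        "4": sensitivity,   # sensitivity setting
        "5": notify,        # notify setting
    }
    # A single pass avoids re-substituting text that came from a value.
    return re.sub(r"&([1-5])", lambda m: params[m.group(1)], template)


print(render_message("Empty AI Extents &1 &2 &3", 1, "<", 2))
# Empty AI Extents 1 < 2
```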
Here is an example Message:
"Empty AI Extents &1 &2 &3"
Action is a comma delimited list of alert definition modifiers:
- z* refers to Alert Enhancers which, if desired, should be listed first because they add information to the body of the alert
- alert marks this as a basic event of interest, shown in yellow in the portal
- alarm marks this as an alert of special interest, shown in orange in the portal. It tells the user "you should act on this soon" and tells the portal to trigger an alarm event, which can include a response such as sending an email to a list of addresses
- page means "a bad thing is happening, act immediately". It is shown in red in the portal and triggers a page event, which can include sending a text to a list of pager email addresses
- script, when included in the list, tells ProTop to look for $PROTOP/bin/Metric. This is an executable script with no file name extension, named exactly the same as the metric this alert definition is for. If the metric-named script exists, ProTop executes it
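Before relying on the script action, it can be useful to confirm that a matching script is actually in place. The following sketch checks for an executable file named after the metric under $PROTOP/bin. It is a hypothetical helper, not something ProTop ships, and it says nothing about which arguments ProTop passes to the script when it runs it.

```python
import os
from pathlib import Path


def find_metric_script(metric, protop_dir=None):
    """Return the path of $PROTOP/bin/<metric> if it exists and is executable."""
    root = protop_dir or os.environ.get("PROTOP")
    if not root:
        return None  # PROTOP is not set and no directory was given
    candidate = Path(root) / "bin" / metric  # same name as the metric, no extension
    if candidate.is_file() and os.access(candidate, os.X_OK):
        return candidate
    return None
```

For example, find_metric_script("LogRd") returns None when $PROTOP/bin/LogRd is missing or lacks execute permission.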