Removing Old Transactions (trxmon)

How do I identify and remove users who have long-running transactions from my OpenEdge database? They're causing my BI file to grow too much, too fast! [UNIX/Linux only]

CAVEAT: This is palliative care! With this approach we are simply protecting the database (technically, the BI file) from misuse, or rather from not being reused as intended. The root cause behind this sort of intervention is bad code. Such logic must be identified and refactored to prevent old transactions from wreaking havoc in your system in the first place.

Use ProTop Real-Time (RT) to help identify the procedure name and line number implicated in your long-running transaction(s), and pass that information along to your development team for analysis and correction. See Finding Code: Old Transactions for more detail.

In the meantime...

ProTop includes a feature, referred to as "trxmon", which can monitor for and remove Progress client sessions that are holding transactions open longer than is healthy for your environment.  

The components of this feature exist in subdirectories in your PROTOP installation:

  • etc/trxmon.friendlyName.cfg - set the parameters that control trxmon's behavior; see Setup below
  • bin/trxmon.sh - this is "trxmon", schedule this to run in etc/schedule.*.cfg
  • util/trxmon.p - called by trxmon.sh
  • bin/disconnect - default disconnect script called by trxmon.p
  • bin/disconn.local - called by bin/disconnect, if it exists (allows you to add custom functionality)
  • bin/disconnectx - called by trxmon if the session becomes stuck, that is, bin/disconnect sent the disconnect message and the session received it, but the session is not disconnecting; it will let you know if manual intervention is required
  • bin/disconnx.local - called by bin/disconnectx, if it exists (allows you to add custom functionality)
  • bin/killprosession.sh - not recommended for automation, but it can be run from bin/disconnx.local when you want to be more aggressive about removing the session; it makes progressively more aggressive attempts to kill the offending process; it can also be run manually; read the script for more details and cautions
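
The "called by ..., if it exists" relationship between bin/disconnect and bin/disconn.local (and between bin/disconnectx and bin/disconnx.local) is a common local-hook pattern. The sketch below is not the shipped ProTop script; it only illustrates, using a hypothetical helper named run_local_hook, how a wrapper might call an optional site-specific script and carry on when none is present. The PROTOP variable in the example call and the arguments passed to the hook are assumptions.

```shell
#!/bin/sh
# Illustrative sketch only: NOT the contents of ProTop's bin/disconnect.
# run_local_hook is a hypothetical helper name.

# Run an optional local hook script, passing along any extra arguments.
# Returns 0 when the hook is absent or not executable, otherwise the
# hook's own exit status.
run_local_hook() {
    hook="$1"
    shift
    if [ -x "$hook" ]; then
        "$hook" "$@"
    else
        return 0
    fi
}

# Example call (assumes PROTOP points at the installation directory;
# the arguments the real scripts receive are not documented here):
# run_local_hook "${PROTOP}/bin/disconn.local" "$@"
```

Keeping custom logic in the .local file rather than editing the default script means the site-specific behavior stays separate from the script that ships with the product.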

Setup

  1. copy the example etc/trxmon.s2k.cfg to etc/trxmon.friendlyName.cfg
  2. edit etc/trxmon.friendlyName.cfg and update the variables to your liking:
    1. monInt 60 - interval, in seconds, between scan loops within trxmon.p
    2. trxThreshold 1200 - transactions older than this many seconds are eligible for disconnection
    3. trxZapAfter 600 - once a transaction crosses trxThreshold, the session must be idle (no database reads or writes) for this many seconds before it will be disconnected (zapped)
    4. emailList "" - send the disconnect message to the email addresses in this list
    5. stuckList fred@flintstone.com - send the "stuck" session message to this list
    6. disconScript bin/disconnect - the script used by trxmon to disconnect the session
  3. add trxmon.sh to your etc/schedule.*.cfg; for example, to run trxmon against the resource "friendlyName" every 15 minutes, add this line:
0,15,30,45 * * * * trxmon.sh friendlyName > ${PTTMP}/trxmon.err 2>&1 [NOALERT]

Once started, trxmon loops internally every monInt seconds until it is asked to stop. When the schedule line above attempts to start trxmon and finds that it is already running, the new attempt simply exits. [NOALERT] at the end of the line suppresses the alert normally sent to the portal when a job is run.
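
To make the interaction between trxThreshold and trxZapAfter concrete, here is a minimal sketch of the decision described in the Setup list. The real check happens inside util/trxmon.p; should_zap is a hypothetical helper, the default values are the example values from the Setup list, and the exact boundary handling (strictly older than the threshold, idle for at least the zap-after period) is an assumption.

```shell
#!/bin/sh
# Illustrative sketch only: the real decision is made inside util/trxmon.p.
# should_zap is a hypothetical helper; all values are in seconds.

# usage: should_zap TRX_AGE IDLE_TIME [TRX_THRESHOLD] [TRX_ZAP_AFTER]
# Prints "zap" when the transaction is older than the threshold AND the
# session has been idle for at least the zap-after period; otherwise "keep".
should_zap() {
    trx_age="$1"
    idle_time="$2"
    trx_threshold="${3:-1200}"   # example trxThreshold value
    trx_zap_after="${4:-600}"    # example trxZapAfter value

    if [ "$trx_age" -gt "$trx_threshold" ] && [ "$idle_time" -ge "$trx_zap_after" ]; then
        echo "zap"
    else
        echo "keep"
    fi
}
```

With the example settings, a transaction that is 1500 seconds old on a session idle for 700 seconds would be disconnected, while the same transaction on a session idle for only 100 seconds would be left alone because the session is still active.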

Shut Down

If you want to stop the current run of the transaction monitor, simply remove tmp/trxmon.friendlyName.flg.  It will be restarted by the scheduler according to the configuration you provided, at the next quarter-hour in the example above.  
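
Removing the flag file can be wrapped in a small helper so the resource name is not mistyped. This is a sketch under stated assumptions: stop_trxmon is a hypothetical function, and the location of the tmp directory is passed in as an argument because it depends on your installation.

```shell
#!/bin/sh
# Illustrative sketch: stop_trxmon is a hypothetical helper, not part of ProTop.
# The tmp directory is an argument because its location depends on your install.

# usage: stop_trxmon TMP_DIR FRIENDLY_NAME
stop_trxmon() {
    flag="$1/trxmon.$2.flg"
    if [ -f "$flag" ]; then
        rm -f "$flag" && echo "removed $flag"
    else
        echo "no flag file found at $flag"
    fi
}

# Example (replace the first argument with your ProTop tmp directory):
# stop_trxmon /path/to/protop/tmp friendlyName
```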

If you want to stop trxmon permanently, comment it out or remove it from your etc/schedule.*.cfg file.