NUMA Woes

Non-Uniform Memory Access (NUMA) is a feature of most modern server machines, virtualized or not. It provides great flexibility in building your machines, but there can be a catch when your resource needs reach beyond a single node.

Symptoms of potential NUMA performance problems:

1. The system uses NUMA:

- On Linux: "numactl --hardware" or "dmesg | grep -i numa" reports more than one node.

- On Windows: NUMA information appears under Task Manager > View > CPU History, under Task Manager > right-click a process > Set affinity, or under Resource Monitor > CPU tab.

- On either: ProTop RT > OS Info (o) shows more than one NUMA node or, more sneakily, many cores but only a very small number of nodes, such as 1.

2. ProTop Portal Trend Data: Server Dashboard > CPU goes way up while Advanced Dashboard > Reads/requests stay flat or drop off.

3. ProTop Portal Trend Data: Advanced dashboard > BogoMIPS moves along quite steadily, then becomes erratic, and we see a sawtooth pattern in the chart.

4. ProTop Portal Trend Data: Latches > Waits (and possibly Timeouts) increase, particularly for the LK* and BH* latches.

5. Perceived application performance is good, then drops off suddenly.
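For symptom 1 on Linux, the node count can also be read straight from sysfs when numactl is not installed. This is a minimal sketch, assuming the standard /sys/devices/system/node layout; the function name count_numa_nodes is ours, not part of any tool. Keep in mind that a VM may not expose the host's real topology, so a count of 1 is not proof that NUMA is out of the picture.

```shell
#!/bin/sh
# Count NUMA nodes by looking for nodeN directories in sysfs.
# The optional first argument overrides the sysfs path (handy for testing).
count_numa_nodes() {
    dir="${1:-/sys/devices/system/node}"
    count=0
    for d in "$dir"/node[0-9]*; do
        # An unmatched glob stays literal and fails the -d test, so it is not counted.
        [ -d "$d" ] && count=$((count + 1))
    done
    echo "$count"
}

nodes=$(count_numa_nodes)
if [ "$nodes" -gt 1 ]; then
    echo "NUMA: $nodes nodes visible to this OS"
else
    echo "Only $nodes node visible to this OS"
fi
```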

What's happening?

NUMA has near and far memory. Near memory is attached to the same node as the CPUs being used, and far memory is attached to any other node. NUMA looks good until you are required to do "far" memory reads, at which point the performance of your OpenEdge application dives.

"Far" reads can take roughly three times as long as near reads. The slow reads can cause latches to be held longer, which slows everything down.
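How far is "far" on your hardware? On Linux, the kernel exposes the relative node distances in sysfs (the same numbers "numactl --hardware" prints under "node distances"). The sketch below assumes the standard /sys/devices/system/node layout; the function name max_distance_ratio is ours. The values are hints supplied by the firmware, not measured latencies, so treat the result as a rough indication only.

```shell
#!/bin/sh
# Turn one row of node distances into a "worst case vs. local" ratio.
# By convention the local distance is 10, so 21 reads as roughly 2.1x.
# These are firmware-supplied hints, not measured latencies.
max_distance_ratio() {
    # $1: a distance row such as "10 21"
    echo "$1" | awk '{
        max = 0
        for (i = 1; i <= NF; i++) if ($i + 0 > max) max = $i + 0
        printf "%.1f\n", max / 10
    }'
}

# Print the ratio for every node this OS can see.
for f in /sys/devices/system/node/node[0-9]*/distance; do
    [ -r "$f" ] || continue
    row=$(cat "$f")
    echo "$f: $row (worst case ~$(max_distance_ratio "$row")x local)"
done
```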

Am I a victim of "far" reads?

If you are on a NUMA system and seeing many of the symptoms above, you can use nmon on Linux to see whether OE processes run on all cores or just those within one node.
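If nmon is not available, a similar picture can be pieced together from ps and lscpu. The sketch below tallies processes per NUMA node based on the CPU each one last ran on (the psr column). It is a snapshot, so run it several times before drawing conclusions. The executable names in the pattern (_mprosrv, _progres, _proapsv) are our assumption about what your OE processes are called; OE_PATTERN and tally_by_node are hypothetical names of ours.

```shell
#!/bin/sh
# Usage: tally_by_node MAPFILE < ps_output
# MAPFILE lines: "cpu,node" as printed by `lscpu -p=CPU,NODE` (# lines are comments)
# stdin lines:   "pid psr comm" as printed by `ps -eo pid=,psr=,comm=`
# OE_PATTERN (hypothetical) overrides the assumed OpenEdge executable names.
tally_by_node() {
    awk -v pattern="${OE_PATTERN:-^_(mprosrv|progres|proapsv)}" '
        FILENAME == ARGV[1] {
            # First input: build the CPU -> node lookup table.
            if ($0 !~ /^#/) { split($0, f, ","); node[f[1]] = f[2] }
            next
        }
        # Second input (stdin): count matching processes per node.
        $3 ~ pattern { count[node[$2]]++ }
        END { for (n in count) printf "node %s: %d process(es)\n", n, count[n] }
    ' "$1" - | sort
}

# Live run, only if lscpu exists and this ps supports the psr column.
if command -v lscpu >/dev/null 2>&1 && ps -eo psr= >/dev/null 2>&1; then
    map=$(mktemp)
    lscpu -p=CPU,NODE > "$map"
    ps -eo pid=,psr=,comm= | tally_by_node "$map"
    rm -f "$map"
fi
```

If every OE process lands on one node, far reads are less likely to be the culprit; a spread across nodes is worth a closer look. "numastat -p <pid>" can then show where a given process's memory actually lives.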

What can be done?

None of the options are splendid:

- VMware or other hypervisor: set affinity so that the VM is dedicated to one node (stopping access to memory on other nodes).

- Disable cores outside the node you want your databases and clients to use (again eliminating access to the memory on the other nodes).

- If possible, use numactl --cpunodebind (with --membind for memory) to assign affinity to the same node in all your scripts. Child processes do inherit the affinity set this way.

- Reduce -spin to lower CPU demand, which in turn reduces the need to cross node boundaries.
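For the numactl option, a small wrapper keeps the binding consistent across scripts. This is a sketch only: it uses the --cpunodebind and --membind options documented in the numactl man page, while run_on_node, NUMACTL_BIN, and the database path are names we made up for illustration. Which node to pick is up to you.

```shell
#!/bin/sh
# Run a command bound to the CPUs and memory of one NUMA node.
# NUMACTL_BIN is a hypothetical override of ours: set it to "echo" for a dry run.
run_on_node() {
    node="$1"
    shift
    bin="${NUMACTL_BIN:-numactl}"
    if command -v "$bin" >/dev/null 2>&1; then
        "$bin" --cpunodebind="$node" --membind="$node" "$@"
    else
        echo "$bin not found; running unbound: $*" >&2
        "$@"
    fi
}

# Dry run: print the numactl arguments instead of executing them.
# The database path is a placeholder.
(NUMACTL_BIN=echo; run_on_node 0 proserve /db/example)
```

Note that --membind is strict: if the chosen node runs out of memory, allocations fail rather than spill onto another node. The --preferred option is the softer alternative, at the cost of allowing far memory again.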

Solve the NUMA problem before it starts!

When building new servers for your OE database applications, be sure to factor in the potential negative impact of NUMA.

If you are faced with a potential NUMA performance situation or are trying to avoid one in the first place, please get in touch with us. We can help. Use the chatbot on this page or the Questions? Comments? link in the header.