DESCRIPTION:

A Windows computer, especially a Terminal Server (or a similar XenApp server), can become slow because a user is running a program that hogs all the CPU or memory. Sometimes the cause is part of the operating system itself. The problem often comes and goes within a few minutes, so you have to log in quickly and use Task Manager or a similar tool to see the hogging process(es).

Solution: The first step is to find out which process(es) are hogging CPU or memory. This script and datasource do just that.

“CPU_Top_5” is a PowerShell script in a LogicMonitor datasource. It would typically be set to run every 2-10 minutes.

 

DISCLAIMER:

Use at your own risk. Not officially supported by LogicMonitor tech support.

MORE INFO:

--------------------

Top-5 checks CPU or memory usage on a local or remote computer. If usage is above the specified threshold, it gets the top 5 processes and optionally logs them in the System event log of the target computer.
Settings:
target_computer:     computer name of the local or remote computer
type_of_check:       CPU or memory
threshold:           percent of CPU or percent of memory (without the % sign). Details are collected only when usage exceeds the threshold.
severity (optional): error or warning, written to the System event log (EventID is 888)
--------------------
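
The check described above can be sketched as a small function. This is only an illustration and not the datasource's actual script: the function name Select-TopProcess, its parameters, and the CpuPercent property are assumptions made for the example, and the per-process numbers are passed in as sample data rather than collected from the target computer.

```powershell
# Hypothetical helper: returns the top N processes, but only when
# overall usage is above the threshold.
function Select-TopProcess {
    param(
        [Parameter(Mandatory)] [object[]] $Process,         # objects with Name and CpuPercent
        [Parameter(Mandatory)] [double]   $OverallPercent,  # overall CPU (or memory) percent
        [double] $Threshold = 90,                           # default mentioned in this post
        [int]    $Top = 5
    )

    # At or below the threshold: return nothing and skip the details.
    if ($OverallPercent -le $Threshold) { return }

    $Process |
        Sort-Object -Property CpuPercent -Descending |
        Select-Object -First $Top
}

# Sample data standing in for real measurements.
$sample = @(
    [pscustomobject]@{ Name = 'excel';    CpuPercent = 3 }
    [pscustomobject]@{ Name = 'notepad';  CpuPercent = 0.1 }
    [pscustomobject]@{ Name = 'cpustres'; CpuPercent = 94 }
    [pscustomobject]@{ Name = 'svchost';  CpuPercent = 1 }
    [pscustomobject]@{ Name = 'explorer'; CpuPercent = 0.2 }
    [pscustomobject]@{ Name = 'wmiprvse'; CpuPercent = 0.4 }
)

$hogs  = @(Select-TopProcess -Process $sample -OverallPercent 98)  # above threshold
$quiet = @(Select-TopProcess -Process $sample -OverallPercent 40)  # below threshold
```

Returning nothing below the threshold keeps the expensive per-process collection out of the normal case, which matters when the script runs every few minutes.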

By default, LogicMonitor collects the System event log and alerts on events with a severity level of error or above.
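
Because of that default, the severity setting decides whether a logged entry raises an alert. One way to assemble such an entry is to build the arguments first and then pass them to Write-EventLog, a cmdlet that ships with Windows PowerShell. The sketch below is an assumption about how this could look, not the datasource's code: the function name and the 'Top5' event source are hypothetical, and an event source has to be registered on the target computer (for example with New-EventLog) before the first write.

```powershell
# Hypothetical helper: builds the argument set for Write-EventLog.
function New-Top5EventArgument {
    param(
        [Parameter(Mandatory)] [string] $ComputerName,
        [Parameter(Mandatory)] [string] $Message,
        [Parameter(Mandatory)] [ValidateSet('Error', 'Warning')] [string] $Severity,
        [string] $Source = 'Top5'   # hypothetical source name, must be registered beforehand
    )

    @{
        ComputerName = $ComputerName
        LogName      = 'System'
        Source       = $Source
        EventId      = 888          # EventID named in the settings above
        EntryType    = $Severity
        Message      = $Message
    }
}

$entry = New-Top5EventArgument -ComputerName 'server01' -Message 'cpustres 94%' -Severity Error

# On Windows PowerShell, with the source already registered:
# Write-EventLog @entry
```

Keeping the argument set separate from the write makes it easy to inspect what would be logged before touching a real event log.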

The datasource includes a graph that shows CPU percent, the threshold, and error detection.

Compatibility:  Tested on Windows Server 2008 and newer. PowerShell version 3 or newer is required (a free download).

Requirements:  You must run this as a user that’s an administrator on both the target computer and the collector computer. Usually this isn’t a problem, because the LogicMonitor collector service is set to run with a “service account” that has these permissions.

 

OUTPUT / message in log:

Name        PID     Owner          CPU %
--------    ----    -----------    -----
cpustres    4567    domain\mike    94
excel       6543    domain\bob      3
svchost      876    SYSTEM          1
wmiprvse    9876    SYSTEM          0
explorer    6545    domain\bob      0

Memory:
Name        Memory (MB)
-------    -------------
outlook     566
explorer     64
svchost      22
excel        11
myapp         8
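
A message laid out like the CPU table above could be produced by piping process objects through Format-Table and Out-String. This is a sketch under assumptions, not the script's own formatting code: the CpuPercent property name and the sample rows are made up for the example, and the exact column spacing depends on the PowerShell version.

```powershell
# Sample rows standing in for the collected top processes.
$rows = @(
    [pscustomobject]@{ Name = 'cpustres'; PID = 4567; Owner = 'domain\mike'; CpuPercent = 94 }
    [pscustomobject]@{ Name = 'excel';    PID = 6543; Owner = 'domain\bob';  CpuPercent = 3 }
    [pscustomobject]@{ Name = 'svchost';  PID = 876;  Owner = 'SYSTEM';      CpuPercent = 1 }
)

# The calculated property renames CpuPercent to the "CPU %" column header.
$cpuColumn = @{ Label = 'CPU %'; Expression = { $_.CpuPercent } }

# Out-String turns the formatted table into one string that can be used
# as an event log message; -Width avoids truncation in non-interactive hosts.
$message = $rows |
    Format-Table -AutoSize -Property Name, PID, Owner, $cpuColumn |
    Out-String -Width 120

$message = $message.Trim()
```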

NOTES:

The challenge in this project was that, although several utilities, scripts, and methods exist for this, most report “CPU time”, which means you have to take two samples a few seconds apart, subtract one from the other, and calculate the percentage. One utility I found, PsList.exe from Microsoft Sysinternals, can do this, but it displays much more information than I needed and has neither a threshold nor a write-to-event-log capability. For some reason, it also wouldn’t exit automatically as documented when I used the -s parameter (task manager mode).
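
The two-sample arithmetic can be sketched as follows. This is a minimal illustration and not necessarily how the datasource computes its numbers: the function name and parameters are assumptions, and dividing by the number of logical processors is a design choice so that 100% means every core is busy.

```powershell
# Hypothetical helper: converts two "CPU time" readings into a percentage.
function Get-CpuPercent {
    param(
        [Parameter(Mandatory)] [double] $CpuSecondsStart,  # CPU time at the first sample
        [Parameter(Mandatory)] [double] $CpuSecondsEnd,    # CPU time at the second sample
        [Parameter(Mandatory)] [double] $IntervalSeconds,  # wall-clock time between samples
        [int] $LogicalProcessors = 1
    )

    if ($IntervalSeconds -le 0) { throw 'IntervalSeconds must be greater than zero.' }
    if ($LogicalProcessors -lt 1) { throw 'LogicalProcessors must be at least 1.' }

    # CPU seconds consumed between the samples, as a share of the
    # CPU seconds available across all logical processors.
    $delta = $CpuSecondsEnd - $CpuSecondsStart
    [math]::Round(100 * $delta / ($IntervalSeconds * $LogicalProcessors), 1)
}

# 3 CPU-seconds used during a 2-second interval on 2 logical processors.
$percent = Get-CpuPercent -CpuSecondsStart 10 -CpuSecondsEnd 13 -IntervalSeconds 2 -LogicalProcessors 2
# $percent is 75
```

On a real system the two readings could come from the CPU property of Get-Process, which reports accumulated processor seconds, sampled a few seconds apart.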

INSTRUCTIONS:

  1. Download this datasource file ( link ) and add it into your LogicMonitor account (Settings > DataSources > Add > From file )
  2. Apply it to device(s). I suggest creating a property on the device called “apply_Top_5”. You must type something in the ‘value’ field.
  3. Set the ‘threshold’ in the script or as a property on the device. The default is 90%.
  4. If you haven’t already, make sure you allow unsigned scripts to run, using the command “Set-ExecutionPolicy unrestricted” or similar.
  5. You might have to enable the “remoting” feature of Windows using this command: Enable-PSRemoting -Force
  6. Test by using the ‘Raw Data’ tab. I used the free CPUstres utility to simulate high CPU.
    If you think you have errors, I suggest copying the PowerShell script to your collector, typing in the computer name, and testing it right at the PowerShell prompt or in the ISE (PowerShell editor).