DESCRIPTION:
A Windows computer, especially a Terminal Server (or similar XenApp server), can become slow because a user is running a program that hogs all the CPU or memory. Sometimes the cause is part of the operating system. The slowdown often comes and goes within a few minutes, so you have to log in quickly and use Task Manager or a similar tool to see the hogging process(es).
Solution: The first step of the solution is to find out which process(es) are hogging CPU or memory. This script and datasource do just that.
“CPU_Top_5” is a PowerShell script in a LogicMonitor datasource. It would typically be set to run every 2-10 minutes.
DISCLAIMER:
Use at your own risk. Not officially supported by LogicMonitor tech support.
MORE INFO:
———————–
Top-5 checks CPU or memory usage on a local or remote computer. If usage is above the specified threshold, it gets the top 5 processes and optionally logs them in the event log of the target computer.
Settings:
- target_computer: computer name of the local or remote computer
- type_of_check: CPU or memory
- threshold: percent of CPU or percent of memory (without the % sign). Details are not collected unless usage is beyond the threshold.
- severity (optional): error or warning in the System event log (EventID is 888)
--------------------
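As a rough illustration of the logic described above (this is not the datasource's actual script), the CPU check could be sketched as follows. The function name `Get-TopCpu`, its parameters, the server name `SERVER01`, and the event source `Top5` are all made up for this sketch. It assumes the WMI performance classes shown are available on the target and that `Get-CimInstance` can reach it over WinRM. Owner lookup is left out to keep it short.

```powershell
# Hypothetical sketch; Get-TopCpu is an illustrative name, not the datasource's code.
function Get-TopCpu {
    param(
        [string]$TargetComputer = $env:COMPUTERNAME,
        [int]$Threshold = 90   # percent, without the % sign
    )

    # Overall CPU load as a percentage across all cores.
    $total = Get-CimInstance -ComputerName $TargetComputer -ClassName Win32_PerfFormattedData_PerfOS_Processor -Filter "Name='_Total'"
    $cpu = [int]$total.PercentProcessorTime

    # Don't gather details unless usage is beyond the threshold.
    if ($cpu -lt $Threshold) { return }

    # Per-process values are summed over all cores, so divide by the core count.
    $cores = (Get-CimInstance -ComputerName $TargetComputer -ClassName Win32_ComputerSystem).NumberOfLogicalProcessors

    Get-CimInstance -ComputerName $TargetComputer -ClassName Win32_PerfFormattedData_PerfProc_Process |
        Where-Object { $_.Name -notin @('_Total', 'Idle') } |
        Sort-Object -Property PercentProcessorTime -Descending |
        Select-Object -First 5 -Property Name,
            @{ Name = 'PID';   Expression = { $_.IDProcess } },
            @{ Name = 'CPU %'; Expression = { [math]::Round($_.PercentProcessorTime / $cores) } }
}

# Usage: log the result only when the threshold was exceeded.
# Assumes an event source named 'Top5' was registered beforehand with New-EventLog.
$top = Get-TopCpu -TargetComputer 'SERVER01' -Threshold 90
if ($top) {
    $message = $top | Format-Table -AutoSize | Out-String
    Write-EventLog -ComputerName 'SERVER01' -LogName System -Source 'Top5' -EventId 888 -EntryType Warning -Message $message
}
```

`Write-EventLog` and `New-EventLog` are Windows PowerShell cmdlets, which fits the PowerShell 3+ requirement mentioned below.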
By default, LogicMonitor collects the System event log and alerts on a severity level of error and above.
The datasource includes a graph that shows CPU percent, the threshold, and error detection.
Compatibility: Tested on Windows Server 2008 and newer. Version 3 or newer of PowerShell is required (free).
Requirements: You must run this as a user that’s an administrator on both the target computer and the collector computer. Usually this isn’t a problem because the LogicMonitor collector service is set to run with a “service account” that is a user with these permissions.
OUTPUT / message in log:
Name      PID      Owner        CPU %
-------   -------  ----------   ------
cpustres  4567     domain\mike  94
excel     6543     domain\bob   3
svchost   876      SYSTEM       1
wmiprvse  9876     SYSTEM       0
explorer  6545     domain\bob   0
Memory:
Name      Memory (MB)
-------   -------------
outlook   566
explorer  64
svchost   22
excel     11
myapp     8
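A listing shaped like the memory output above can be produced locally with a few lines of PowerShell. This is only a sketch. It assumes that “memory” means each process's working set, which may not be what the datasource measures, and the variable name `$topMem` is illustrative.

```powershell
# Assumption: "memory" here means working set; the datasource may measure it differently.
$topMem = Get-Process |
    Sort-Object -Property WorkingSet64 -Descending |
    Select-Object -First 5 -Property Name,
        @{ Name = 'Memory (MB)'; Expression = { [math]::Round($_.WorkingSet64 / 1MB) } }

# Show the five largest processes as a table.
$topMem | Format-Table -AutoSize
```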
NOTES:
The challenge I found in this project is that there are a few utilities, scripts, and methods for doing this, but most report “CPU time”, which means you have to take 2 samples a few seconds apart, subtract them, and calculate the percent. One utility I found, PSLIST.exe by Microsoft SysInternals, can do this, but it displays a lot more information than I needed, it has neither a threshold capability nor the ability to write to the event log, and for some reason it wouldn’t exit automatically as documented when I used the -s parameter (task manager mode).
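To make the two-sample arithmetic concrete, here is a minimal sketch of that approach using `Get-Process`, whose `CPU` property reports cumulative CPU seconds. The helper name `Get-CpuPercent` and the 3-second interval are assumptions for illustration, not taken from the datasource. For example, a process that used 3 CPU-seconds during a 3-second interval on a machine with 4 logical processors was running at 25%.

```powershell
# CPU seconds used between two samples, divided by the wall-clock seconds
# available across all logical processors, expressed as a percentage.
function Get-CpuPercent {
    param(
        [double]$CpuSecondsBefore,
        [double]$CpuSecondsAfter,
        [double]$IntervalSeconds,
        [int]$LogicalProcessors = [Environment]::ProcessorCount
    )
    $used = $CpuSecondsAfter - $CpuSecondsBefore
    [math]::Round(100 * $used / ($IntervalSeconds * $LogicalProcessors), 1)
}

# First sample: cumulative CPU seconds keyed by process ID.
# Processes whose CPU time cannot be read are skipped.
$interval = 3
$before = @{}
Get-Process | Where-Object { $null -ne $_.CPU } | ForEach-Object { $before[$_.Id] = $_.CPU }

Start-Sleep -Seconds $interval

# Second sample: compute the percent for every process seen in both samples.
$top5 = Get-Process |
    Where-Object { $null -ne $_.CPU -and $before.ContainsKey($_.Id) } |
    ForEach-Object {
        [pscustomobject]@{
            Name       = $_.Name
            PID        = $_.Id
            CpuPercent = Get-CpuPercent -CpuSecondsBefore $before[$_.Id] -CpuSecondsAfter $_.CPU -IntervalSeconds $interval
        }
    } |
    Sort-Object -Property CpuPercent -Descending |
    Select-Object -First 5

$top5 | Format-Table -AutoSize
```

Keying the first sample by process ID means processes that start or exit between the two samples are simply ignored.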
INSTRUCTIONS:
- Download this datasource file (link) and add it into your LogicMonitor account (Settings > DataSources > Add > From file)
- Apply it to device(s). I suggest creating a property on the device called “apply_Top_5”. You must type something in the ‘value’ field.
- Set the ‘threshold’ in the script or as a property on the device. The default is 90%.
- If you haven’t already, make sure you allow unsigned scripts to run using command “Set-ExecutionPolicy unrestricted” or similar.
- You might have to enable the “remoting” feature of Windows using this command: Enable-PSRemoting -Force
- Test by using the ‘Raw Data’ tab. I used the free CPUstres utility to simulate high CPU.
If you think you have errors, I suggest testing by copying the PowerShell script to your collector, typing in the computer name, and running it right at the PowerShell prompt or in the ISE (the PowerShell editor).
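Before running the script by hand, a few quick checks from the collector can help rule out the execution-policy and remoting prerequisites listed above. This is a sketch, and `SERVER01` is a placeholder for your target computer.

```powershell
# 'SERVER01' is a placeholder target name.
$target = 'SERVER01'

# Show the execution policy in effect at each scope on the collector.
Get-ExecutionPolicy -List

# Check whether WinRM is answering on the target.
Test-WSMan -ComputerName $target

# Check whether a command can actually run there under the current account.
Invoke-Command -ComputerName $target -ScriptBlock { $env:COMPUTERNAME }
```

If the last command prints the target's computer name, remoting and permissions are working for the account you are logged in with. Note that the collector service account may differ from that account.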