BASH Script – CPU Load Average reading and alerts in relation to the number of existing CPU Cores

The CPU load can be determined from the three decimal numbers shown by the “top” or “uptime” command. Below is an example of the load averages reported by the uptime command. The first number is the load average over the last 1 minute, the second over the last 5 minutes, and the third over the last 15 minutes.

13:59:11 up 13:13, 14 users,  load average: 0.00, 0.00, 0.00
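On Linux the same three figures are also exposed in /proc/loadavg, which is easier to parse than uptime output because the field positions never change. The sketch below is only an illustration: the function name print_load_averages is made up for this example, and the block assumes a Linux host where /proc/loadavg exists.

```shell
#!/bin/sh
# Hypothetical helper: label the first three fields of a /proc/loadavg-style
# line as the 1-, 5- and 15-minute load averages.
print_load_averages() {
    printf '%s\n' "$1" | awk '{ printf "1min=%s 5min=%s 15min=%s\n", $1, $2, $3 }'
}

# On Linux, show the live values (skipped where /proc/loadavg is absent).
if [ -r /proc/loadavg ]; then
    print_load_averages "$(cat /proc/loadavg)"
fi
```

Calling print_load_averages "0.52 0.41 0.36 1/234 5678" labels 0.52 as the 1-minute figure, 0.41 as the 5-minute figure and 0.36 as the 15-minute figure.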

The golden rule is that these numbers should not exceed the total number of CPU cores the host has. If they do, more tasks are waiting to run than the CPU can handle at once. If a number equals the number of CPU cores, the CPU is at maximum processing capacity. If it is lower than the number of CPU cores, the CPU has spare capacity. On Linux the load average also counts tasks waiting on disk I/O, so treat this as a rule of thumb.
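The rule above can be sketched as a small shell function. This is a minimal illustration rather than a finished monitor: classify_load is a made-up name, and awk does the comparison because the shell's own test operators do not handle decimal numbers.

```shell
#!/bin/sh
# Hypothetical helper: compare a load figure against the core count.
# Prints "below", "at" or "above".
classify_load() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        # Adding 0 forces a numeric comparison, not a string one.
        if (load + 0 > cores + 0)       print "above"
        else if (load + 0 == cores + 0) print "at"
        else                            print "below"
    }'
}

classify_load 0.75 4    # load well under the core count
classify_load 10.20 4   # load over the core count
```

The second call is the case a string comparison gets wrong: as text, "10.20" sorts before "4", but numerically it is well above it.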

The number of CPU cores can be determined from the “/proc/cpuinfo” file, which lists information for every CPU core (logical processor). The field called “processor” gives the numerical label of each core. Please note that the first CPU core is assigned the label zero. You can quickly find out how many CPU cores a host has by running the following command:

grep -c '^processor' /proc/cpuinfo
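There are other ways to get the same count. The sketch below tries nproc (GNU coreutils) first, then /proc/cpuinfo, then getconf _NPROCESSORS_ONLN. The function name count_cores is an assumption made for this example. Note that nproc reports the processors available to the current process, which can be lower than the total on hosts that restrict CPU affinity.

```shell
#!/bin/sh
# Hypothetical helper: print the number of logical processors.
count_cores() {
    if command -v nproc >/dev/null 2>&1; then
        nproc                               # processors available to this process
    elif [ -r /proc/cpuinfo ]; then
        grep -c '^processor' /proc/cpuinfo  # one "processor" line per logical CPU
    else
        getconf _NPROCESSORS_ONLN           # fallback where /proc is absent
    fi
}

echo "Cores: $(count_cores)"
```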

 

The following script reads the number of CPU cores from the /proc directory.
It then takes the second and third load figures that the uptime command reports (the 5- and 15-minute averages) and calculates their mean.
The result is a smoothed indicator built from those two figures.
It is not a true load average for any single window, such as 7.5 or 10 minutes.
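The averaging step on its own might look like the sketch below. The function name mean_load is hypothetical. It uses awk rather than bc so that the result keeps a leading zero and is rounded to two decimals instead of being cut off by character position.

```shell
#!/bin/sh
# Hypothetical helper: mean of two load figures, to two decimal places.
mean_load() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (a + b) / 2 }'
}

mean_load 0.50 1.50
mean_load 2.00 3.00
```

With the inputs shown, the two calls print 1.00 and 2.50.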

Please let me know if you think the logic behind it is flawed.

The script is below. Save it as a .sh file and give it executable permissions (chmod +x).

#!/bin/bash
clear
# Read the 5- and 15-minute load averages (fields 2 and 3 of /proc/loadavg).
# Fixed awk columns on uptime output shift when the uptime format changes.
read -r _ load_5min load_15min _ < /proc/loadavg
# Mean of the two figures, to two decimal places.
load_average=$(awk -v a="$load_5min" -v b="$load_15min" 'BEGIN { printf "%.2f", (a + b) / 2 }')
cpu_cores=$(grep -c '^processor' /proc/cpuinfo)
cpu_model=$(grep -m1 'model name' /proc/cpuinfo | cut -d: -f2- | sed 's/^ *//')
# Placeholder recipient, not part of the original script; change as needed.
mail_to="root"
msg="Load is at $load_average - Number of cores is $cpu_cores."
#echo "Load 5 min $load_5min"
#echo "Load 15 min $load_15min"
echo "CPU Load average $load_average"
echo "CPU Model: $cpu_model"
echo "Num of cores $cpu_cores"
#---
# [[ a > b ]] compares strings, so let bc compare the numbers (prints 1 if true).
if [[ $(echo "$load_average < $cpu_cores" | bc -l) -eq 1 ]]
then
echo "CPU Load is at Normal levels - Below the number of cores."
echo "$msg"
fi
#---
if [[ $(echo "$load_average > $cpu_cores" | bc -l) -eq 1 ]]
then
echo "CPU Load is above number of cores."
echo "$msg"
# mailx needs a message body on stdin and a recipient, or it waits for input.
echo "$msg" | mailx -s "CRITICAL: CPU Load is above number of cores." "$mail_to"
fi
#---
if [[ $(echo "$load_average == $cpu_cores" | bc -l) -eq 1 ]]
then
echo "CPU Load is at maximum capacity. Please check"
echo "$msg"
echo "$msg" | mailx -s "WARNING CPU Load is at maximum capacity. Please check" "$mail_to"
fi
exit

 

 

 

 
