Bash script to check server load and notify by email

Lately I have been dealing with high server load problems, both at work and on my own server, so I've been reading about load and trying to understand what it means and how to debug such issues. I don't mean to lecture you on the subject in this post, but I will point you to a great article that explains it in a very simple way: Understanding Linux CPU Load - when should you be worried?

After reading that you'll know a lot more about what load means and its different types. So how do you know if, for example, a sudden spike of traffic is causing high load on your server? You surely can't be logged in all day typing 'uptime'. You could, however, have some sort of monitoring tool that alerts you automatically. If you're dealing with a high-traffic production server you most certainly have a "real" monitoring tool such as Nagios or Zabbix. If you're like me and host a small website on a small VPS, then some of those solutions might be overkill.

Just for the fun of it, and to practice a little bit of scripting, I decided to create a small bash script to alert me by email whenever the server load goes above a specified threshold. I started off with a one-liner (which I quite like) but decided to organize it better. Just place the script below in a cron job that runs every 5 minutes, for example, and you'll get an email when the server is in trouble. If you have suggestions to make this better or to extend its functionality I'd love to hear them, so leave a comment!

#!/bin/bash
# uptime reports load averages for the last 1, 5 and 15 minutes.
# LC_ALL=C keeps the decimal separator a dot; splitting on the
# "load average: " label avoids depending on fixed field positions.
read -r load1m load5m load15m <<< "$(LC_ALL=C uptime | awk -F'load average: ' '{ print $2 }' | tr -d ',')"
threshold1m="2.50"
threshold5m="3.00"
threshold15m="3.80"
# bc prints 1 when the comparison is true and 0 otherwise.
result1m=$(echo "$load1m > $threshold1m" | bc)
result5m=$(echo "$load5m > $threshold5m" | bc)
result15m=$(echo "$load15m > $threshold15m" | bc)
email="youremail"
mailbody=$(mktemp)
send=0

# Report the longest window that is over its threshold.
if [ "$result15m" = 1 ]; then
  msg="Load 15 min: $load15m (threshold $threshold15m)"
  subject="ALERT: High load 15m ($load15m)"
  send=1
elif [ "$result5m" = 1 ]; then
  msg="Load 5 min: $load5m (threshold $threshold5m)"
  subject="ALERT: High load 5m ($load5m)"
  send=1
elif [ "$result1m" = 1 ]; then
  msg="Load 1 min: $load1m (threshold $threshold1m)"
  subject="ALERT: High load 1m ($load1m)"
  send=1
fi

if [ "$send" = 1 ]; then
  # Build the message body in the temp file, then feed it to mail on stdin.
  {
    echo "$msg"
    echo " "
    echo "Load average: $load1m, $load5m, $load15m"
    echo " "
    echo "Time: $(date)"
    echo "Host: $(hostname)"
  } > "$mailbody"
  mail -s "$subject" "$email" < "$mailbody"
fi

# Clean up the temp file whether or not an email was sent.
rm -f "$mailbody"
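
To run the check every 5 minutes as described above, a crontab entry along these lines should do. This is only a sketch: the path /usr/local/bin/check_load.sh is a hypothetical location for the script, so adjust it to wherever you saved yours, and make sure the file is executable.

```
# m   h  dom mon dow  command
*/5   *  *   *   *    /usr/local/bin/check_load.sh
```

You can add the line with 'crontab -e' for the user that should run the check.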

10 comments to Bash script to check server load and notify by email

  • DH

    Hello,

    I just found your post when I was looking for a tool to install on my DreamHost dedicated server, so I know when the things I do on my websites break the server.

    Do you have any ideas?

    Thank you for your help.

  • I made a modification in 2015:

    load1m=$(uptime | awk '{ print $10 }' | cut -c1-4 | sed 's/,/./g')
    load10m=$(uptime | awk '{ print $11 }' | cut -c1-4 | sed 's/,/./g')
    load15m=$(uptime | awk '{ print $12 }' | cut -c1-4 | sed 's/,/./g')
    threshold1m="2.50"
    threshold10m="3.00"
    threshold15m="3.80"
    result1m=`echo "($load1m)>($threshold1m)" | bc`
    result10m=`echo "($load10m)>($threshold10m)" | bc`
    result15m=`echo "($load15m)>($threshold15m)" | bc`

    This modification was needed because bc did not work otherwise.

    Bye.

  • Miguel Rea

    Thanks very much for the script! Very helpful!

  • Michael

    You could make your script independent of your current system configuration by calculating the threshold from the number of CPU cores available (note: a load of 4 on an 8-core machine is fine, equivalent to a load of 0.5 on a 1-core machine).

    grep 'model name' /proc/cpuinfo | wc -l
    would do for the core count.

  • thanks! saved me some time.

    There are some errors; see the comment from "Stars".

  • Dan

    Thanks! Seems to work. Interestingly, when I run the script manually at the command line I get this output:

    (standard_in) 1: syntax error
    (standard_in) 1: syntax error

    And it doesn’t finish… Seems it is waiting for input?

  • Stars

    But it won’t work! Please fix errors: line 24 msg instead of sg, line 29-34 should be >> instead of <<, line 37 should be . Also good practice is to remove temp file after use so at the end put rm $mailbody

  • Nice! Was looking for a quick utility to help babysit a server getting random load spikes, this fits the bill nicely. The only thing I thought it needed was a bit more identification in the subject line so I put the hostname at the end of the averages.

  • Thanks man. Very useful, duly stolen, and made to prevent Bash scripts running on my Raspberry Pi if the load is too high.

  • I have used bash scripts before, when I was running game servers. The servers would shut down sometimes, but a bash script got them running again in seconds. It would have been nice to have some output info like the one above, though.
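
Picking up Michael's suggestion from the comments above, here is a rough sketch of how the thresholds could be derived from the core count instead of being hard-coded. The helper names scale_threshold and load_exceeds and the per-core values are made up for illustration and are not part of the original script. The sketch uses awk rather than bc for the arithmetic, and takes the core count from nproc (falling back to getconf) instead of grepping /proc/cpuinfo.

```shell
#!/bin/bash
# Hypothetical helper: multiply a per-core threshold by the core count.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
scale_threshold() {
  LC_ALL=C awk -v per_core="$1" -v cores="$2" \
    'BEGIN { printf "%.2f\n", per_core * cores }'
}

# Hypothetical helper: exit status 0 when the load is above the limit.
load_exceeds() {
  LC_ALL=C awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load > limit) }'
}

# Number of online cores; nproc is part of coreutils, getconf is the fallback.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Illustrative per-core values only; tune them for your own server.
threshold1m=$(scale_threshold 1.00 "$cores")
threshold5m=$(scale_threshold 0.85 "$cores")
threshold15m=$(scale_threshold 0.70 "$cores")

echo "cores=$cores thresholds: 1m=$threshold1m 5m=$threshold5m 15m=$threshold15m"
```

In the alert script, a check such as 'if load_exceeds "$load15m" "$threshold15m"; then ...' could then stand in for the bc comparison.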
