Server Load Monitor

We have launched the "Server Load Monitor" that you see on your Server Panel. In this article, I will explain how you can accurately determine the health of your server's computing resources (CPUs) just by looking at the server load monitor.

Do not confuse the Server Load Monitor with the Server Health Monitor. The Server Health Monitor provides overall CPU, memory, and disk usage information, while the Server Load Monitor provides only CPU usage information, but with high accuracy.

The example you see below is from one of the servers in ServerAvatar's network. Based on what I see on this chart, I can say the following things about the server, even without knowing what kind of site it hosts:

  • It is a very stable server. This means that the application hosted on this server is very stable and well-developed, and it is running smoothly.
  • However, the server needs an upgrade as soon as possible. Recently, there were some traffic spikes or other updates going on, which can be seen in the server load monitor as spikes.

If you can see the same thing in the chart, that's awesome! If you don't know what's going on, let me explain.

The Server Load Monitor contains four series:

  1. 1-minute load average (blue series)
  2. 5-minute load average (red series)
  3. 15-minute load average (orange series)
  4. Cores (green series)

The "Cores" series shows the number of cores your server has at any given time. The server in this example has 6 cores, and no upgrades were performed recently, so that series has not changed.

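The "Load Average" is the average amount of work that all of your vCPUs had to handle over a specific time window. More precisely, it counts the processes that are running or waiting to run (on Linux, also those blocked on disk I/O), averaged over that window. It is a single non-negative number, such as 0.16, 3.45, or 76.67.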
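If you want to see the same numbers outside the panel, here is a minimal sketch in Python. It assumes a Unix-like server, since `os.getloadavg()` is not available on Windows. The helper name `read_load` is my own, and this is not how the Server Load Monitor itself collects its data.

```python
import os


def read_load():
    """Return (load1, load5, load15, cores) for the machine this runs on."""
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages (Unix only).
    load1, load5, load15 = os.getloadavg()
    # os.cpu_count() may return None when the count cannot be determined.
    cores = os.cpu_count() or 1
    return load1, load5, load15, cores


if __name__ == "__main__":
    load1, load5, load15, cores = read_load()
    print(f"1 min: {load1:.2f}  5 min: {load5:.2f}  15 min: {load15:.2f}  cores: {cores}")
```

These are the same three averages and the same core count that the four series on the chart plot over time.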

The load average should stay under a basic threshold, which is roughly the number of cores (vCPUs) on your server. How far above it you can safely go depends on your application and the server itself, but past that point your application will slow down. For example, on a server with 10 vCPUs, a load average of 11 over any time frame is still fine. But if it climbs above 13-15 (depending on your application), your server will slow down.
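The arithmetic behind that example is easier to follow when the load is expressed as a percentage of the threshold. The sketch below assumes the threshold equals the number of vCPUs. The function name `load_percent` is only for illustration.

```python
def load_percent(load, cores):
    """Express a load average as a percentage of the core count."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return 100.0 * load / cores


# The example from the text: a server with 10 vCPUs.
for load in (11, 13, 15):
    print(f"load {load} on 10 vCPUs = {load_percent(load, 10):.0f}% of the threshold")
```

So a load of 11 on 10 vCPUs is 110% of the threshold, while 13-15 corresponds to 130-150%.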

In the following screenshot, you can see many spikes above the threshold of 6, which is the number of cores on this server. To interpret them, you need to understand what the 1-minute, 5-minute, and 15-minute load averages each tell you.

If the 1-minute load average goes above 150-200% of the threshold, that is fine as long as it drops back under the threshold. Otherwise, the 5-minute load average will also start rising.

If the 5-minute load average goes above 130-140% of the threshold and stays there for longer than 5-10 minutes, you have a sustained spike that needs more computing power than you currently have. It will also cause the 15-minute load average to rise.

If the 15-minute load average goes above 110% of the threshold, your application starts slowing down. In the following screenshot, the 15-minute load average (orange) stays at or around the threshold. This means the site is occasionally slow, but the difference is negligible.
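The three rules above can be sketched as a small check. This is an illustration under assumptions, not the logic the panel uses. It takes the threshold to be the number of cores, and the default limits (2.0, 1.3, and 1.1 times the threshold) are values I picked from the rough ranges given above. Because it looks at a single snapshot, it cannot tell how long a spike has lasted, so the messages only say what to watch for.

```python
def assess_load(load1, load5, load15, cores,
                limit1=2.0, limit5=1.3, limit15=1.1):
    """Return a list of warnings for one snapshot of the three load averages.

    The limits are multiples of the core count, taken from the rough
    percentages in the text (illustrative defaults, not fixed rules).
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    warnings = []
    if load1 > limit1 * cores:
        warnings.append("1-minute load is spiking; fine if it drops back soon")
    if load5 > limit5 * cores:
        warnings.append("5-minute load is high; a problem if it lasts 5-10 minutes")
    if load15 > limit15 * cores:
        warnings.append("15-minute load is high; the application is likely slowing down")
    return warnings


# A 6-core server hovering around its threshold: no warnings.
print(assess_load(7.5, 6.4, 6.1, cores=6))
# The same server under a sustained spike: all three warnings.
print(assess_load(14, 9, 7, cores=6))
```

Adjust the limits to suit your application, since some workloads tolerate a higher load than others.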

That is why the application(s) on this server are working just fine right now. But if the site gets a large influx of visitors, it will slow down with the current configuration. Sooner or later, the owner of this application will have to upgrade the server. Right now, the owner is getting full value from the server, because nearly all of the computing power being paid for is in use.

Memory also affects server load. If your server runs short of memory, it starts moving data between memory and swap space, which slows the server down and consumes additional computing power. Therefore, it is important to ensure that your server is not swapping too much. We will add swap memory usage to the dashboard so you can monitor this as well.
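Until then, you can check swap usage yourself. The sketch below assumes a Linux server, where `/proc/meminfo` reports `SwapTotal` and `SwapFree` in kB. The function name `swap_used_percent` is my own. Note that it reports how much swap is occupied, not how actively the server is swapping, so treat a high value as a hint to investigate further.

```python
def swap_used_percent(meminfo_text):
    """Return the percentage of swap in use, given the text of /proc/meminfo."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])  # value in kB for memory fields
    total = values.get("SwapTotal", 0)
    free = values.get("SwapFree", 0)
    if total == 0:
        return 0.0  # no swap configured
    return 100.0 * (total - free) / total


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print(f"swap used: {swap_used_percent(f.read()):.1f}%")
    except FileNotFoundError:
        print("/proc/meminfo not available on this system")
```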