Scaling a Node.js application with the cluster module and the PM2 process manager, which run an instance of the process on each available CPU core, enables load balancing and improves performance.
Sat, May 31, 2025
3 min read
By default, when you run a Node.js program on a multi-CPU system, it creates a process that uses only a single CPU. Because Node.js is single-threaded, all requests must be handled by that one thread running on a single CPU. For CPU-intensive tasks, the operating system must schedule them to share this single CPU until completion. This can lead to performance issues if the process becomes overwhelmed with too many requests. If the process crashes, your application becomes inaccessible to users.
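You can see this for yourself with a short script that uses only Node's built-in os module: it reports several CPU cores but a single process ID.

```javascript
import os from "os";

// Node.js runs your JavaScript on a single thread in a single process,
// no matter how many CPU cores the machine has.
const cores = os.cpus().length;
console.log(`CPU cores available: ${cores}`);
console.log(`This process pid: ${process.pid}`);
```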
To address this, Node.js provides the cluster module. This module runs multiple copies of your application on the same machine simultaneously. The primary process distributes incoming connections among the workers, by default using a round-robin algorithm (on all platforms except Windows). If one worker crashes, the remaining processes continue serving users. This significantly improves application performance by spreading the load across multiple processes, preventing any single instance from becoming overwhelmed.
Without clustering, the application runs as a single process on one CPU core, identified by a unique process ID. All CPU-intensive tasks must run through this single process, which can lead to performance degradation or crashes when handling multiple concurrent requests.
For example, index.js:
import express from "express";

const port = 3000;
const app = express();

// A CPU-intensive route: a long synchronous loop that blocks the event loop.
app.get("/heavy", (req, res) => {
  let total = 0;
  for (let i = 0; i < 50_000_000; i++) {
    total++;
  }
  res.send(`The result of the CPU intensive task is ${total}\n`);
});

app.listen(port, () => {
  console.log(`Server is running on port ${port}`);
  console.log(`worker pid=${process.pid}`);
});
Let's test the performance using a load-testing tool such as loadtest.
The output shows the total number of errors caused by process crashes and the mean latency.
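The exact numbers depend on your machine, but as a rough, self-contained sketch of what a load test measures (assuming Node 18+ for the built-in fetch and server.closeAllConnections), the script below starts a single-process server with the same heavy handler and times a few concurrent requests:

```javascript
import http from "http";
import { once } from "events";

// A minimal stand-in for index.js: the same CPU-heavy handler,
// served by a single Node.js process.
const server = http.createServer((req, res) => {
  let total = 0;
  for (let i = 0; i < 50_000_000; i++) total++;
  res.end(`The result of the CPU intensive task is ${total}\n`);
});

server.listen(0); // pick any free port
await once(server, "listening");
const { port } = server.address();

// Fire several concurrent requests and time each one. Because a single
// process handles every request, latency grows with concurrency.
const concurrency = 5;
const latencies = await Promise.all(
  Array.from({ length: concurrency }, async () => {
    const start = Date.now();
    await fetch(`http://localhost:${port}/heavy`);
    return Date.now() - start;
  })
);

const meanLatency = latencies.reduce((a, b) => a + b, 0) / concurrency;
console.log(`mean latency: ${meanLatency.toFixed(1)} ms`);

server.close();
server.closeAllConnections();
```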
primary.js:

import cluster from "cluster";
import os from "os";
import { dirname } from "path";
import { fileURLToPath } from "url";

const __dirname = dirname(fileURLToPath(import.meta.url));
const cpuCount = os.cpus().length;

console.log(`The total number of CPUs is ${cpuCount}`);
console.log(`Primary pid=${process.pid}`);

// Tell the primary which script the workers should run.
cluster.setupPrimary({
  exec: __dirname + "/index.js",
});

// Fork one worker per CPU core.
for (let i = 0; i < cpuCount; i++) {
  cluster.fork();
}

// If a worker dies, replace it so the app keeps serving requests.
cluster.on("exit", (worker, code, signal) => {
  console.log(`worker pid=${worker.process.pid} died`);
  console.log(`Starting another worker`);
  cluster.fork();
});
In the code above, we configure the primary process to execute index.js and fork one worker for each available CPU. Additionally, if any worker fails or dies, the exit handler immediately forks a replacement, ensuring continuous operation.
Let's test the performance after implementing the cluster module.
The results show that the errors have been eliminated and the mean latency has decreased significantly.
Using pm2 for clustering achieves the same result as our primary.js implementation: it creates one process instance per CPU core (eight on this machine) and provides the same load balancing and performance benefits.
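Assuming pm2 is installed (for example via npm install -g pm2), cluster mode is enabled from the command line; -i max starts one instance per core:

```shell
# Start one instance per CPU core in cluster mode
pm2 start index.js -i max

# Inspect the running instances
pm2 list
```

pm2 also handles restarts on crash automatically, so the exit handler we wrote in primary.js is not needed.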