Scaling a Node.js application with the cluster module and the PM2 process manager, which run an instance of the process on each available CPU core, enables load balancing and improves performance.
Sat, May 31, 2025
3 min read
By default, when you run a Node.js program on a multi-CPU system, it creates a process that uses only a single CPU. Because Node.js is single-threaded, all requests must be handled by that one thread running on a single CPU. For CPU-intensive tasks, the operating system must schedule them to share this single CPU until completion. This can lead to performance issues if the process becomes overwhelmed with too many requests. If the process crashes, your application becomes inaccessible to users.
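You can see this for yourself with a short script that uses only Node's built-in os module: it reports several CPU cores but a single process ID.

```javascript
import os from "os";

// Node.js runs your JavaScript on a single thread in a single process,
// no matter how many CPU cores the machine has.
const cores = os.cpus().length;
console.log(`CPU cores available: ${cores}`);
console.log(`This process pid: ${process.pid}`);
```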
To address this, Node.js provides the cluster module. This module runs multiple copies of your application on the same machine simultaneously. The primary process distributes incoming connections among the workers, by default using a round-robin algorithm (on all platforms except Windows). If one worker crashes, the remaining processes continue serving users. This significantly improves application performance by spreading the load across multiple processes, preventing any single instance from becoming overwhelmed.
Without clustering, the application runs as a single process on one CPU core, identified by a unique process ID. All CPU-intensive tasks must run through this single process, which can lead to performance degradation or crashes when handling multiple concurrent requests.
For example, index.js:
import express from "express";

const port = 3000;
const app = express();

// A CPU-intensive route: a long synchronous loop that blocks the event loop.
app.get("/heavy", (req, res) => {
  let total = 0;
  for (let i = 0; i < 50_000_000; i++) {
    total++;
  }
  res.send(`The result of the CPU intensive task is ${total}\n`);
});

app.listen(port, () => {
  console.log(`Server is running on port ${port}`);
  console.log(`worker pid=${process.pid}`);
});
Let's test the performance using a load-testing tool such as loadtest.
The output shows the total number of errors caused by process crashes and the mean latency.
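The exact numbers depend on your machine, but as a rough, self-contained sketch of what a load test measures (assuming Node 18+ for the built-in fetch and server.closeAllConnections), the script below starts a single-process server with the same heavy handler and times a few concurrent requests:

```javascript
import http from "http";
import { once } from "events";

// A minimal stand-in for index.js: the same CPU-heavy handler,
// served by a single Node.js process.
const server = http.createServer((req, res) => {
  let total = 0;
  for (let i = 0; i < 50_000_000; i++) total++;
  res.end(`The result of the CPU intensive task is ${total}\n`);
});

server.listen(0); // pick any free port
await once(server, "listening");
const { port } = server.address();

// Fire several concurrent requests and time each one. Because a single
// process handles every request, latency grows with concurrency.
const concurrency = 5;
const latencies = await Promise.all(
  Array.from({ length: concurrency }, async () => {
    const start = Date.now();
    await fetch(`http://localhost:${port}/heavy`);
    return Date.now() - start;
  })
);

const meanLatency = latencies.reduce((a, b) => a + b, 0) / concurrency;
console.log(`mean latency: ${meanLatency.toFixed(1)} ms`);

server.close();
server.closeAllConnections();
```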
primary.js:

import cluster from "cluster";
import os from "os";
import { dirname } from "path";
import { fileURLToPath } from "url";

const __dirname = dirname(fileURLToPath(import.meta.url));
const cpuCount = os.cpus().length;

console.log(`The total number of CPUs is ${cpuCount}`);
console.log(`Primary pid=${process.pid}`);

// Tell the primary which script the workers should run.
cluster.setupPrimary({
  exec: __dirname + "/index.js",
});

// Fork one worker per CPU core.
for (let i = 0; i < cpuCount; i++) {
  cluster.fork();
}

// If a worker dies, replace it so the app keeps serving requests.
cluster.on("exit", (worker, code, signal) => {
  console.log(`worker pid=${worker.process.pid} died`);
  console.log(`Starting another worker`);
  cluster.fork();
});
In the code above, we configure the primary process to execute index.js and fork one worker for each available CPU. Additionally, if any worker fails or dies, the exit handler immediately forks a replacement, ensuring continuous operation.
Let's test the performance after implementing the cluster module.
The results show that the errors have been eliminated and the mean latency has decreased significantly.
Using pm2 for clustering achieves the same result as our primary.js implementation: it creates one process instance per CPU core (eight on this machine) and provides the same load balancing and performance benefits.
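Assuming pm2 is installed (for example via npm install -g pm2), cluster mode is enabled from the command line; -i max starts one instance per core:

```shell
# Start one instance per CPU core in cluster mode
pm2 start index.js -i max

# Inspect the running instances
pm2 list
```

pm2 also handles restarts on crash automatically, so the exit handler we wrote in primary.js is not needed.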