Understanding Node JS

What is Node and how is it different from other frameworks?

Node is a JS runtime that uses Google's V8 engine to execute JavaScript code on V8's call stack. In addition to V8, Node uses libuv, a C library that provides the event loop, handles network I/O through the operating system's non-blocking facilities, and runs blocking work such as file system calls on a pool of worker threads.

V8 engine + libuv + JS library / C++ bindings (to access OS modules) => Node environment.

Compared to plain JavaScript, Node offers additional capabilities such as accessing the file system and other OS modules through C++ bindings. Handling file systems or OS modules is not part of vanilla JavaScript, which was originally created to operate at the web browser level.

The primary idea behind Node is to eliminate process blocking; in other words, to prevent a resource or worker from waiting for a long-running task, such as an IO-bound task, to complete before it can work on other tasks.

For example, consider a worker that is occupied with a DB query and blocked until it receives the result back from the DB. Instead of waiting, the worker can pick up the next priority tasks, complete them, and resume the DB call operation once it is notified of the response to the query it made earlier.

Node achieves this with a single-threaded event loop model that builds on JavaScript's event-emitting and callback capabilities. JavaScript itself runs on the V8 engine's call stack, while the event loop and the pool of worker threads come from libuv. Blocking operations such as file system calls are offloaded to libuv's worker threads, and network calls are handled through the operating system's non-blocking I/O. Everything else runs on the single main thread. Whenever an IO or network call finishes, its completion status is reported back to the event loop through the registered callback, and the event loop processes that callback in a later iteration.
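The offloading described above can be observed with a small sketch. It assumes the asynchronous form of crypto.pbkdf2, which Node runs on libuv's thread pool; the password, salt and iteration count are placeholder values chosen only for illustration.

```javascript
const crypto = require('node:crypto');

const order = [];

// The asynchronous pbkdf2 call hands the hashing to libuv's thread pool,
// so the main thread is free while the key is being derived.
crypto.pbkdf2('secret', 'salt', 10000, 32, 'sha256', (err, key) => {
  if (err) throw err;
  order.push('hash ready');
  console.log(order, key.length);
});

// Runs right away: the call above only scheduled the work.
order.push('main thread kept going');
```

The callback always runs after the synchronous code has finished, which is why 'main thread kept going' is recorded first.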

Wait, does this mean Node is not suitable for compute-bound tasks? What happens to the single-threaded event loop when it synchronously processes a heavy computing task?

Hmm... true. Before the stable Node 12.0 release, there was no stable support for creating a new worker thread. Any long compute-bound task would block the event loop, and the entire Node process would be stuck until the compute-bound operation finished. Before Node 12.0, compute tasks were executed by spawning a child process from the main process to avoid blocking the event loop. However, creating a process is costlier than creating a thread within a process. The 'worker_threads' module was introduced to overcome this problem.
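A minimal sketch of moving a compute-bound task onto a worker thread with 'worker_threads' might look like the following. The helper name fibInWorker and the slow recursive Fibonacci are illustrative assumptions, and the worker source is kept inline (using the eval option) only so that the sketch fits in one file.

```javascript
const { Worker } = require('node:worker_threads');

// Source of the worker, kept inline so the sketch is a single file.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // Deliberately slow recursive Fibonacci, standing in for heavy computation.
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.postMessage(fib(workerData));
`;

// Hypothetical helper: computes fib(n) on a separate thread
// and resolves with the result posted back by the worker.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

fibInWorker(25).then((result) => {
  console.log('fib(25) =', result); // fib(25) = 75025
});

// Logged first: the main thread is not blocked by the computation.
console.log('event loop is still free while the worker computes');
```

In a real application the worker code would normally live in its own file, and the worker would be reused rather than created per task.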

Initially, Node was created to handle IO operations. Thread management was kept away from developers, which saves them from worrying about thread synchronization and thread management problems such as creating and destroying threads. Node's primary idea is to stay efficient and simple.

OK, how can a Node HTTP server handle multiple requests on a single-threaded event loop without the data from one request interfering or overlapping with another? This is possible through the function scope behaviour of JavaScript. Each HTTP request is handled by the callback function passed as the first parameter to createServer, and the data or variables of each request are scoped within that call of the callback.

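const http = require('node:http');

// The callback runs once per request, with its own request and response objects.
const httpServer = http.createServer((request, response) => {
    console.log(request.method, request.url);
    response.write('hello');
    response.end();
});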
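The scoping can be checked with two requests that overlap in time. This is only a sketch under a few assumptions: the helper names get and runScopeDemo, the name query parameter and the 50 ms delay are made up for the demonstration, and port 0 is used so the OS picks any free port.

```javascript
const http = require('node:http');

// Hypothetical helper: GETs a path from the local server and resolves with the body.
function get(port, path) {
  return new Promise((resolve, reject) => {
    http.get({ host: '127.0.0.1', port, path, agent: false }, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
}

// Hypothetical helper: starts a server, sends two overlapping requests
// and resolves with both response bodies.
function runScopeDemo() {
  return new Promise((resolve, reject) => {
    const server = http.createServer((request, response) => {
      // `name` belongs to this invocation only; a concurrent request has its own copy.
      const name = new URL(request.url, 'http://localhost').searchParams.get('name');
      // Reply after a short delay so that the two requests are in flight together.
      setTimeout(() => response.end(`hello ${name}`), 50);
    });
    server.on('error', reject);
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      Promise.all([get(port, '/?name=a'), get(port, '/?name=b')])
        .then(resolve, reject)
        .finally(() => server.close());
    });
  });
}

runScopeDemo().then((bodies) => console.log(bodies)); // [ 'hello a', 'hello b' ]
```

Both requests are pending at the same moment, yet each response carries only its own name, because each call of the callback has its own local variables.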

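Order of execution inside the Event Loop

The event loop follows a certain order when bringing tasks to its execution call stack. I remember the order with the acronym ETID (Events emitted, Timers - setTimeout/setInterval, IO/network-bound calls, Deferred execution - setImmediate). In the Node documentation, the phases of each loop iteration are timers, pending callbacks, idle/prepare, poll (IO), check (setImmediate) and close callbacks, with process.nextTick and promise callbacks running in between.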
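The ordering can be observed with a short sketch. The helper name runOrderDemo is hypothetical, and the callbacks are scheduled from inside an IO callback (fs.stat on the current directory) because that is the case where the relative order of setImmediate and setTimeout is deterministic.

```javascript
const fs = require('node:fs');

// Hypothetical helper: records the order in which the scheduled callbacks run.
function runOrderDemo() {
  return new Promise((resolve) => {
    const order = [];
    // Schedule everything from inside an IO callback (poll phase).
    fs.stat('.', () => {
      setTimeout(() => {
        order.push('setTimeout');
        resolve(order); // the timer fires last, so the list is complete here
      }, 0);
      setImmediate(() => order.push('setImmediate'));
      Promise.resolve().then(() => order.push('promise'));
      process.nextTick(() => order.push('nextTick'));
      order.push('sync');
    });
  });
}

runOrderDemo().then((order) => console.log(order));
// [ 'sync', 'nextTick', 'promise', 'setImmediate', 'setTimeout' ]
```

When the same calls are made from the main module instead of an IO callback, the relative order of setTimeout(fn, 0) and setImmediate is not guaranteed.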

Forking a new process, Multithreading, Running Node in Clusters

Node can create (fork) child processes. This way, many subprocesses can run in parallel, distributed across multiple cores, instead of using multiple threads. The parent and child processes communicate via IPC.

Using a child process, let's create an HTTP server that listens on localhost:3036 for incoming requests and sends back a response for each request.

// Main Node parent process
console.log('I am parent process');

const cp = require('node:child_process');
const path = require('node:path');

// fork() starts a new Node process and opens an IPC channel to it.
const childProcess = cp.fork(path.join(__dirname, 'http.js'));

childProcess.on('error', (err) => {
    console.log(err);
});

// Messages sent by the child with process.send() arrive here.
childProcess.on('message', (msg) => {
    console.log(msg);
});

childProcess.on('exit', () => {
    console.log('child process has ended now');
    console.log('To show parent process is independent of child process let us see its memory usage');
    console.log(process.memoryUsage()); // memoryUsage is a function, so it must be called
});

// Parent process receiving the signal to exit. Registering a SIGINT listener
// disables the default exit, so end the child and exit explicitly.
process.on('SIGINT', () => {
    console.log('killing parent process by issuing Signal interrupt cmd Ctrl + c');
    childProcess.kill();
    process.exit(0);
});

process.on('exit', () => {
    console.log('Parent process ending now');
});

// Child process (http.js), where the HTTP server object is created.

const http = require('node:http'); // Load the built-in http module
let globalObj = 0; // Request counter shared by all requests

const httpServer = http.createServer((request, response) => {
    console.log(request.method, request.url);
    process.send(`Data from child to parent process ${globalObj}`);
    response.write('hello ' + globalObj);
    globalObj = globalObj + 1;
    response.end();
});

// Listen on localhost:3036 so that the server can accept requests.
httpServer.listen(3036, 'localhost');

setTimeout(() => {
    invokeEndProcess();
}, 5000);

function invokeEndProcess() {
    // process.send is only defined when this file is started with fork().
    process.send('child sending completed msg to parent before terminating itself');
    // process.kill() takes a pid and a signal name.
    process.kill(process.pid, 'SIGINT');
}

The child process kills itself 5000 ms after it was started by the parent process. Until the timeout fires, we can send requests to the HTTP server in the child process. Once the timeout fires, the child process sends a message back to the parent process, which receives it through an event listener.

// Messages from the child process are intercepted in this event listener
childProcess.on('message', (msg) => {
    console.log(msg);
});

When to Choose Node vs other backend languages

My view is that for a more complex web application it is better to choose Node.js with TypeScript, which gives a powerful setup for developing an app faster and with type safety. NestJS (Node + TS) is one such powerful framework; it uses Express underneath by default for API development. It has been getting more popular with the dev community in recent times.

Have a look at my article on developing microservice using nest js in a more structured way.

My two cents on choosing Node packages: there are numerous third-party libraries available in the npm registry. Be careful when selecting npm packages. Analyze their dependencies, and avoid packages that are stale or poorly maintained.

One more thing: Node has inspired older, mature platforms such as C# and the .NET framework to go for simplicity. The .NET 6.0 minimal API was inspired by Node. Refer to this talk from a Microsoft PM for .NET 6.0: https://youtu.be/1daODFp6xvs

Stats from the TechEmpower site on Node's performance compared with other popular frameworks:

https://www.techempower.com/benchmarks/#section=data-r17&hw=ph&test=fortune&l=gcv6kd-0&p=hweg3v-qt5t8q-d1fif4-72&w=tcd33s-2c8ef&f=0-0-jz6vk-0-0-2t4w-0-0-0

To be continued..

References for further reading

Mastering Node.js by Sandro Pasquali (Packt Publishing)

http://docs.libuv.org/en/v1.x/design.html

https://nodejs.org/api/cluster.html#class-worker

https://coursehunters.online/t/jscomplete-com-node-beyond-basics-part-1/2618

https://github.com/samerbuna/efficient-node/blob/main/100-learning-node-runtime.adoc
