Monday, January 2, 2017

Some notes on NodeJS's async I/O


Many of us are sometimes confused with what it means that NodeJS is both single-threaded, and async. So, it's a bit more complicated than that.

NodeJS core contains an non-blocking loop for requests. The received requests is immediately sent for processing and the listening process resumes. This way, NodeJS's primary thread is almost always ready to accept new requests.  Data can be processed either in other threads or on other servers (like in case of a distributed system or a distributed database).



Here is a nice explanation by slebetman about the way NodeJS operates at its core:

Since you're asking a fairly low level question I'll answer with a low level answer. Hope you're comfortable with C.
First, a disclaimer: I'll be talking mostly about networking code because the only widely used database I know of that use file I/O is sqlite. Since you're asking about postgres I can assume you're interested about how socket I/O (be it TCP socket or unix local sockets) can work with only one thread.
At the core of almost all async systems and libraries is a piece of code that looks like this:
while (1)
{
  read_fd_set = active_fd_set;

  // This blocks until we receive a packet or until timeout expires:
  select(FD_SETSIZE, &read_fd_set, NULL, NULL, timeout);

  // Process timed events:
  timeout = process_timeout();

  // Process I/O:
  for (i = 0; i < FD_SETSIZE; ++i) {
    if (FD_ISSET(i, &read_fd_set)) {
      if (i == sock) {
        /* Connection arriving on listening socket */
        int new;
        size = sizeof(clientname);
        new = accept (sock,(struct sockaddr *) &clientname, &size);
        FD_SET (new, &active_fd_set);
      }
      else {
        /* Data arriving on an already-connected socket. */
        if (read_from_client(i) < 0) {
          close (i);
          FD_CLR (i, &active_fd_set);
        }
      }
    }
  }
}
(code example paraphrased from a GNU socket programming example)
As you can see, the code above uses no threading whatsoever. Yet it can handle many connections simultaneously. If you take a look at the for loop it is also obvious that it is basically a simple state machine that processes sockets one at a time if they have any packets waiting to be read (if not it is skipped by the if (FD_ISSET...) statement).
Non-I/O events can logically only come from timed events. And that's where the timeout management (details not shown for clarity) comes in. All I/O related stuff (basically almost all your async code) gets called back from the read_from_client() function (again, details omitted for clarity).
There is zero code running in parallel.

Where does the parallelization come from?

Basically the server you're connecting to. Most databases support some form of parallelism. Some support mulththreading. Some even support node.js or vert.x style parallelism by supporting asynchronous disk I/O (like postgres). Some configurations of databases allow higher level of parallelism by storing data on more than one server via partitioning and/or sharding and/or master/slave servers.
That's where the big parallelism comes from -- parallel computing. Most databases have very strong support for read parallelism but weaker support for write parallelism (master/slave setups for example allow you to write only to the master database). But this is still a big win because most apps read more data than they write.

Where does disk parallelism come from?

The hardware. Mostly this has to do with DMA which can transfer data without the CPU. DMA is not one thing. It is more like a concept. Different systems like the PCI bus, SATA, USB even the CPU RAM bus itself has various kinds of DMA to transfer data directly to RAM (and in the case of RAM, to transfer data higher up to the various levels of CPU cache) or to a faster buffer.
While waiting for the DMA to complete. The CPU is not doing anything. And while it is doing nothing and there happens to be a network packet coming in or a setTimeout() expiring the code that handles them can be executed on the CPU. All while a file is being read into RAM.

But Node.js docs keep mentioning I/O threads

Only for disk I/O. It's not impossible to do async disk I/O with a single thread. Tcl has done that for years and many other programming languages and frameworks have too. It's just very-very messy since BSD does it differently form Linux which does it differently from Windows and even OSX may be subtly different form BSD even though it is derived from it etc. etc.
For the sake of simplicity and solid reliability node developers have opted to process disk I/O in separate threads.
Note that even for socket I/O it is not as simple as the code example I gave above. Since select()has some limitations (for example, you're forced to loop over ALL sockets to check for incoming data even though most won't have incoming data), people have come up with better APIs. And obviously different OSes do it differently. That is why there are a lot of libraries created to handle cross platform event processing like libevent and libuv (the one node.js uses).

OK. But postgres still runs on my PC

Asynchronous, event-oriented systems does not automagically give you performance superpowers. What they DO give you is choice: the app server is blazing fast so where you put your database servers and what database you use us up to you.

OK. But I can do this with threads. Why async?

Benchmarks.
Since 1999, many people have run many benchmarks and in the majority of cases single threaded (or low thread count), event-oriented systems have outperformed simple multithreaded systems. It was especially true in the old days of single CPU, single core servers. It is still partly true now (since cores are still limited).
That is why Apache was re-written into Apache2 to use a thread pool of async listeners and why Nginx was written from scratch to use a thread pool of async code.
Yes, on modern servers ideally you'd still want some threads in order to use all your CPUs. The alternative is a process pool like how the cluster module works in node.js. But you'd want the number of threads/processes to be constant or as constant as possible to avoid the overhead of context switching and thread creation.


Full thread on Stack Overflow.

No comments:

Post a Comment

Ubuntu 12.04, 14.04, 16.04 - auto start an app or script before login

To run a command or application at startup, even before the user has logged in, you can use this file: /etc/rc.local The commands entered...