3 min read · Jan 1, 2021


Based on The Linux Programming Interface

Today, if you are writing a server (an HTTP daemon, say), you should already expect tens of thousands of connections, and oftentimes, once these connections are established, most of them will stay alive for minutes to come to avoid unnecessary handshakes. But how can we handle such a huge number of (long-lived) connections simultaneously?

The most natural response to this problem would be to use more processes or threads to handle new connections. However, even today, it would still be too expensive to create tens of thousands of threads, let alone processes. It would be better to use a thread pool. By “thread pool”, I mean you keep, for example, five threads around all the time, and they just stay there even when there are no connections at all. That does help, but that alone doesn’t make things a lot better, I’m afraid. After all, these threads, too, have to handle connections one by one. And don’t forget that these connections may well last longer than a few minutes before they close, which means each of these five threads will have to wait until its connection is closed or times out. Each long-lived connection is, in effect, a small denial-of-service attack.

So we can see that to make our server responsive, it’s not enough to create more threads, however lightweight they may be, especially when we are dealing with long-lived connections. The real problem with long-lived connections is that they keep our server waiting, idly. What if we could leave the current connection aside and take care of other waiting connections first? It’s definitely much better to spend our time this way, right? But how can we do this?

For Unix, there are several ways. First, you can make the connection sockets nonblocking. Then you check whether there are new requests on a connection, and if there is still nothing for you to read, you just walk away and check another connection (this method is also known as “polling”). It makes sense, but if you are talking about tens of thousands of connections, it will still take a lot of time. To do better, you can use signals, asking your system to inform you whenever there is a new request. To be honest, I really like this idea, and it’s of course extremely efficient, but I am afraid it would not be easy to handle, for example, thousands of signals at the same time.

To solve this problem, people have developed many different mechanisms, and many of them were devised to make polling more efficient by asking the kernel to do the polling for you. These mechanisms include select(), poll(), and epoll() (or kqueue on the BSD systems). The idea behind them is pretty much the same: you give the kernel a list of file/socket descriptors you care about, and the kernel does the polling for you and gives you back a list of descriptors that are ready to read; it’s the underlying algorithms of their implementations that cause the differences in efficiency.

To sum up, the secret behind a responsive server is pretty much the secret behind an efficient life: A) spend your time wisely (I/O multiplexing, signal-driven I/O, epoll/poll/select), and B) speed up — get better hardware and use the resources you have (user-level threads, a.k.a. coroutines; faster CPU/memory access; support for more threads/processes).