top 1000 websites” (see
http://w3techs.com/blog/entry/nginx_just_became_the_most_used_web_server_among_the_top_1000_websites).
The article summary says that “34.9% of the top 1000 web sites rely on
Nginx. That makes it the most trusted web server on high traffic sites, just
ahead of Apache.”
Nginx was originally designed to handle a large number of concurrent
website requests. Larger websites often have tens of thousands of clients
connected simultaneously, each one making HTTP requests that must be
answered. This challenge is commonly known as the C10K problem, and Nginx
was written to be a web server capable of serving at least 10,000 clients
simultaneously.
THE C10K PROBLEM
The canonical website for learning more about this problem is
http://www.kegel.com/c10k.html. The article provided at this link is from the
early 2000s and describes ideas for configuring operating systems and
writing code to solve the problem of serving at least 10,000 simultaneous
clients from a web server. Today, this problem is even more common, and
with the continuing maturity of Nginx, lighttpd, and other web servers,
many of the largest, highest-traffic sites have switched away from Apache.
Newer versions of the Apache web server and other modern web servers rely
on the concept of threads. Threads are similar to lightweight processes, which
deserves some explanation. A process is a specific instance of a computer
program running. The process contains both the machine code (the binary
that the computer processor can understand and obey, which is either
precompiled as in C or C++ programs or belongs to an interpreter that runs
the program, as happens with languages like Python or Perl) and the current
activity of that program, such as the calculations it is performing or the
data it holds in memory. Serving each HTTP request by starting a complete
process would quickly use up the server’s resources if the site were even
moderately popular. Process after process would be started, and they would
compete for CPU time and memory. A thread is an ordered sequence of
instructions within a program: First, do this; then, do that; finally, do this
other thing. One process may control many threads, and those threads share
the process’s memory, which is good for resource management. By using
threads instead of processes, a larger number of client requests can be served
using fewer system resources than with a process-based server.
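To make the idea of one process controlling many threads more concrete, here
is a minimal sketch in Python. It is not how Apache or any real web server is
implemented. The names handle_request and serve_with_threads are hypothetical,
and the "requests" are only numbers standing in for client connections. The
sketch hands 20 simulated requests to a small pool of threads and records
which process and which thread handled each one.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id):
    """Pretend to serve one request (hypothetical stand-in for real HTTP work).

    Returns the request number plus the IDs of the process and thread
    that handled it.
    """
    return (request_id, os.getpid(), threading.get_ident())


def serve_with_threads(num_requests, num_threads=4):
    """Serve simulated requests with a pool of threads inside this one process."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(handle_request, range(num_requests)))


if __name__ == "__main__":
    results = serve_with_threads(20)
    process_ids = {pid for _, pid, _ in results}
    thread_ids = {tid for _, _, tid in results}
    print(f"requests served: {len(results)}")
    print(f"distinct processes used: {len(process_ids)}")
    print(f"distinct threads used: {len(thread_ids)}")
```

Every request reports the same process ID, while the work is spread across at
most four threads. A process-based server would instead pay the cost of a
separate process, with its own memory, for the requests it handles at the
same time.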
Most web servers have traditionally been either process based or thread
based. There are also examples of hybrid models, where many multithreaded
processes are used. Process-based servers are great because they are stable,