Serving thousands of concurrent learners with NGINX
At SkillsLogic we develop and maintain bespoke learning management systems on various server operating systems, but our preferred platform is Linux, and we tend to use Apache because we have years of experience configuring it, optimising it and keeping it running smoothly. However, as part of a review of a client's bespoke learning management system architecture, we recommended (and implemented) a move to the high-performance web server NGINX.
The throughput we've been achieving on NGINX has been sensational. Running on a number of solid state, load balanced virtual web servers the user response times are lightning fast, allowing the system to support thousands of concurrent learners, tutors and system administrators without breaking a sweat.
Whilst serving up millions of pages and e-learning videos per month as fast as possible was a goal of the new design, we also wanted to ensure that malicious denial-of-service attacks were mitigated. The original system design used the default Apache prefork model, and we would occasionally suffer from memory issues when the service was under attack. Switching to a worker model (in either Apache or NGINX) was always going to improve memory management, but we also wanted to take the opportunity to benefit from request rate limiting features to better handle these situations.
The Leaky Bucket
NGINX rate limiting uses something called the leaky bucket algorithm, widely used in telecommunications networks to deal with burstiness when bandwidth is limited.
Imagine a bucket where water is poured into the top and leaks from holes in the bottom. If the rate at which water is poured in exceeds the rate at which it can leak out, then the bucket overflows. The bucket is the rate limiter, the water leaking out represents the requests that are allowed through, and the water spilling over the top represents the requests that are discarded.
Configuring Rate Limiting Zones
A rate limiting zone can be configured to control the number of requests permitted per second. You give the zone a friendly name to identify it and specify the maximum amount of memory it's allowed to use; the zone uses this memory to store the state for each client IP address it is tracking.
limit_req_zone $binary_remote_addr zone=app_zone_one:10m rate=10r/s;
In the above example we're creating a zone called app_zone_one with 10MB of shared memory, and limiting each client IP address (keyed on the $binary_remote_addr variable) to 10 requests per second.
Handling Bursts
When an 11th request arrives within a second from the same IP address it will be rejected with a 503 error, which may not be ideal for legitimate users. This behaviour can be softened using the burst option, which allows additional requests to be queued and, by default, released at the 10r/s rate. In the example below the login page uses our rate limiting zone and allows up to 20 extra requests beyond the base rate; because we have also specified nodelay, those burst requests are forwarded immediately rather than being spaced out, and only requests arriving over and above the burst are rejected.
location /login/ {
    limit_req zone=app_zone_one burst=20 nodelay;
}
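It's also worth noting that the 503 status itself can be changed. Since rejection here reflects client behaviour rather than a server fault, some prefer to return 429 Too Many Requests using the limit_req_status directive (available from NGINX 1.3.15). A sketch of how our login location might look with that tweak:

location /login/ {
    limit_req zone=app_zone_one burst=20 nodelay;

    # Return 429 instead of the default 503 when a request is rejected
    limit_req_status 429;

    # Log rejected requests at warn level rather than error
    limit_req_log_level warn;
}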
An extra consideration we had to include was support for many learners using the learning management system from a single institution. This particular system is used by hundreds of Further Education colleges, so when learners at a college all come online at the same time, their traffic arrives at the system from a single IP address. With some additional configuration it is possible to exclude known addresses from having limits imposed.
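The standard way to do this uses the geo and map directives: trusted addresses are mapped to an empty rate limiting key, and NGINX never limits requests whose key is empty. The sketch below would replace our earlier zone definition; the 192.0.2.0/24 network is a placeholder, not a real college address range.

geo $limit {
    default        1;
    192.0.2.0/24   0;   # placeholder college network - substitute real addresses
}

map $limit $limit_key {
    0  "";                    # trusted addresses get an empty key: no limiting
    1  $binary_remote_addr;   # everyone else is limited per IP as before
}

limit_req_zone $limit_key zone=app_zone_one:10m rate=10r/s;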