
Handling Hundreds of Thousands of Concurrent HTTP Connections on AWS

January 9, 2019


By Ivaylo Vrabchev, Cloud Services Consultant at HeleCloud™



In the era of Cloud computing we need to be able to design highly available, fault-tolerant, cost-efficient, scalable systems. Some of these systems are heavily loaded, with thousands or even millions of requests per second. Most Cloud load balancers are designed to handle such loads, but they scale up gradually, in line with traffic. So what would happen if all requests arrived at once, or within a window of a few seconds? There would be a huge spike, which a standard Cloud load balancer would not be able to handle.

Let’s imagine that we are AWS Cloud Architects who have to provide a simple Web server solution to handle more than 100,000 concurrent HTTP connections.



Graphic 1: Web Server Solution Design


The components that comprise the architecture are:

  1. VPC

  2. DMZ/Private Subnets

  3. Elastic Load Balancer

  4. EC2 Instances

  5. CloudWatch


VPC and Subnets

In this architecture we will use a standard VPC configuration with two DMZ and two private subnets situated in two Availability Zones. The DMZ subnets will be utilised only for publicly facing services, e.g. load balancers, bastion hosts, etc. The private subnets will be used to host all other services, e.g. front-end and back-end servers.


Load Balancer

We need to select which type of load balancer we are going to use. As described earlier, the standard Application Load Balancer (ALB) will not be able to handle such spikes, because it scales up gradually with traffic. ALBs can be pre-warmed, but this is a manual task that requires assistance from the AWS Support team.

As the leading public Cloud provider, AWS is always there to help. They identified this problem and released the Network Load Balancer (NLB). It is designed to handle millions of requests per second while maintaining high throughput at ultra-low latency. The NLB operates at the connection layer (OSI Layer 4), routing connections to targets based on IP protocol data. That means we are responsible for properly configuring the rest of the stack, up to Layer 7, in order to provide a fully operational HTTP application.


EC2 Instance

In our case, there are no specific requirements for the underlying operating system, so I highly recommend using Amazon Linux, as it is optimised for the AWS platform. Horizontal scaling is always better than vertical scaling if your goal is to create a highly available and fault-tolerant system.


Many concurrent connections result in high CPU usage. That's why our preference here is a compute-optimised instance type like the C5, the next generation of the Amazon EC2 Compute Optimized instance family.


Adding a few lines to /etc/sysctl.conf and /etc/security/limits.conf will prepare our OS to handle all these connections.

In /etc/sysctl.conf:

net.core.somaxconn = 65536

net.ipv4.tcp_max_tw_buckets = 1440000

net.ipv4.ip_local_port_range = 1024 65000

net.ipv4.tcp_fin_timeout = 15

net.ipv4.tcp_window_scaling = 1

net.ipv4.tcp_max_syn_backlog = 3240000

In /etc/security/limits.conf (the leading * applies the limits to all users):

* soft nofile 4096

* hard nofile 100000
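As a quick sanity check (an illustrative sketch, not part of the original setup), the tuning values can be parsed back out of a sysctl-style file; on a live instance you would instead compare them against `sysctl -n <key>`:

```shell
#!/bin/sh
# Illustrative sketch: read tuning values back out of a sysctl-style file.
# The settings are written to a temp file so the check runs anywhere;
# on a real instance you would point CONF at /etc/sysctl.conf.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
net.core.somaxconn = 65536
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_max_syn_backlog = 3240000
EOF

# Extract the value for a given key, tolerating whitespace around '='.
get() { awk -F= -v k="$1" '$1 ~ k { gsub(/[ \t]/, "", $2); print $2 }' "$CONF"; }

SOMAXCONN=$(get 'net.core.somaxconn')
FIN_TIMEOUT=$(get 'net.ipv4.tcp_fin_timeout')
echo "somaxconn=$SOMAXCONN fin_timeout=$FIN_TIMEOUT"
rm -f "$CONF"
```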


NGINX is well known for its high performance, stability, rich feature set, simple configuration, and low resource consumption. It will serve the rest of the communication, from L4 up to L7, for us. So we need to fine-tune it with the following records in nginx.conf:



worker_rlimit_nofile 100000;


worker_processes auto;


events {

   worker_connections 25000;

   use epoll;

   multi_accept on;

}



error_log /var/log/nginx/error.log;

access_log  /var/log/nginx/access.log main buffer=1024k;
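To see why these numbers fit together: worker_processes auto starts one worker per vCPU, so on a c5.xlarge (4 vCPUs) the configuration above allows 4 × 25,000 = 100,000 concurrent connections per instance, which is exactly what worker_rlimit_nofile provides for. A back-of-the-envelope check (the vCPU count for this instance size is the only assumed input):

```shell
#!/bin/sh
# Back-of-the-envelope capacity check for the NGINX settings above.
# worker_processes auto => one worker per vCPU; a c5.xlarge has 4 vCPUs.
VCPUS=4
WORKER_CONNECTIONS=25000
WORKER_RLIMIT_NOFILE=100000

MAX_CONN=$((VCPUS * WORKER_CONNECTIONS))
echo "max concurrent connections per instance: $MAX_CONN"

# Every connection holds at least one file descriptor, so the fd limit
# must cover the whole connection budget.
[ "$MAX_CONN" -le "$WORKER_RLIMIT_NOFILE" ] && echo "fd limit is sufficient"
```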



However, using the NLB instead of the ALB has some shortcomings. The NLB doesn't support access logging, so the full HTTP communication logs are stored on each NGINX instance. In order to keep all logs in one centralised place, we can use CloudWatch Logs and its agent to export the local files into CloudWatch log groups.



The installation and configuration are very simple:


            sudo yum install -y awslogs



Once the agent is installed, we have to configure it to export the desired logs by adding the following sections to /etc/awslogs/awslogs.conf:



[/var/log/nginx/error.log]

datetime_format = %b %d %H:%M:%S

file = /var/log/nginx/error.log

buffer_duration = 5000

log_stream_name = {instance_id}

initial_position = start_of_file

log_group_name = /var/log/nginx/error



[/var/log/nginx/access.log]

datetime_format = %b %d %H:%M:%S

file = /var/log/nginx/access.log

buffer_duration = 5000

log_stream_name = {instance_id}

initial_position = start_of_file

log_group_name = /var/log/nginx/access


To use CloudWatch within the same region as the instances, we need to configure the agent region:


REGION=`curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | awk -F\" '{print $4}'`

sudo sed -i -e "/region =/ s/= .*/= ${REGION}/" /etc/awslogs/awscli.conf
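The first command above parses the region out of the instance-identity document, which the metadata service returns as JSON. The same parsing can be reproduced offline against a sample document (the field values below are made up for illustration):

```shell
#!/bin/sh
# Offline reproduction of the region-parsing pipeline: grep the "region"
# line of the instance-identity JSON, then take the 4th double-quote-
# delimited field. The sample document below is hypothetical.
DOC='{
  "privateIp" : "10.0.1.25",
  "region" : "eu-west-2",
  "instanceType" : "c5.xlarge"
}'
REGION=$(echo "$DOC" | grep region | awk -F\" '{print $4}')
echo "detected region: $REGION"
```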


Then we need to start the agent:


sudo service awslogs start


After a few seconds, the two new log groups will be created and we can find all NGINX logs in them.



Now that you know how to fine-tune the configuration and which type of AWS load balancer to select, you can bring up a test environment using the following configuration:


1 x Network Load Balancer (NLB)

1 x Target Group with 2 x EC2 instances (c5.xlarge)


When all components are up and running, we are ready to validate whether our configuration meets the initial requirements. That can be done using various open-source load-testing tools such as Bees with Machine Guns, Locust, Apache JMeter, etc. There is an interesting article by BlazeMeter comparing different load-testing tools.


I will use ApacheBench (ab) for the purpose of this test because I already have an automated solution for it.


The following command will be executed from 6 nodes in order to perform the load test.


ab -k -n 3000000 -c 20000 {NLB URL address}


That will run 3,000,000 GET requests per node, processing up to 20,000 concurrent requests from each individual server, against the Network Load Balancer URL address; the -k flag makes ab reuse connections with HTTP KeepAlive.
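The aggregate load those six nodes generate can be worked out directly (these are just the numbers from the test command, multiplied up):

```shell
#!/bin/sh
# Aggregate load generated by the ab fleet described above.
NODES=6
REQUESTS_PER_NODE=3000000
CONCURRENCY_PER_NODE=20000

TOTAL_REQUESTS=$((NODES * REQUESTS_PER_NODE))
PEAK_CONCURRENCY=$((NODES * CONCURRENCY_PER_NODE))

echo "total requests:   $TOTAL_REQUESTS"
echo "peak concurrency: $PEAK_CONCURRENCY"
```

At 120,000 simultaneous connections, the peak concurrency comfortably exceeds the 100,000 concurrent connections we set out to handle.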

Once all the requests have been sent, the test is complete. Because of the nature of the tool we are using, we will have results per server similar to those presented in the table below.




As you can see, there are no failed requests, which proves that our configuration can handle hundreds of thousands of concurrent requests per second.



The AWS Network Load Balancer allows you to design your system architecture at a low, performant networking level while helping you handle millions of requests per second. It is very useful when you have to handle unpredictable spikes in network traffic. AWS allows you to design the system in a few different ways, based on your requirements and the best practices provided by their team.


Please be aware that the Application Load Balancer now supports containers and Lambda functions as targets, to support serverless architectures; this functionality was announced at re:Invent 2018. A similar assessment can be done on that front as well. Follow us for further publications.


Please get in touch if we can help with AWS solution designs, or any other aspect of the AWS platform.


