Friday, 17 January 2014

Maximum concurrent connections to the same domain for browsers

Would you be surprised if I told you that there is a limit on how many parallel connections a browser can make to the same domain?

Maximum concurrent connections to the same domain

Don't be too surprised if you have never heard about it; I have seen many web developers miss this crucial point. If you want quick figures, this table is from the book Professional Website Performance: Optimizing the Front End and the Back End by Peter Smith.

The impact of this limit 

How does this limit affect your web page? The answer is: a lot. Unless you serve a static page without any images, CSS or JavaScript, all of these resources have to queue and compete for the available connections before they can be downloaded. If you take into account that some resources depend on other resources being loaded first, it is easy to see that this limit can greatly affect page load time.

Let's analyse further how a browser loads a web page. To illustrate, I used Chrome v34 to load one article from my blog (10 ideas to improve Eclipse IDE usability). I prefer Chrome over Firebug because its Developer Tools have the best visualization of page loading. Here is what it looks like:

I have already cropped the screenshot, but you should still see a lot of requests being made. Don't be scared by the complex picture; I just want to emphasize that even a simple web page needs many HTTP requests to load. In this case, I count 52 requests, including CSS, images, JavaScript, AJAX and HTML.

If you focus on the right side of the picture, you will notice that Chrome does a decent job of highlighting different kinds of resources in different colours, and it also captures the timeline of the requests.

Let's see what Chrome tells us about this web page. In the first step, Chrome loads the main page and spends a very short time parsing it. After reading the main page, Chrome sends a total of 8 parallel requests at almost the same time to load images, CSS and JavaScript. From this trace, Chrome v34 appears to send up to 8 concurrent requests to a domain. Still, 8 requests are not enough to load the web page, and you can see that more requests are sent as connections become available.

If you want to dig further, you can see that two JavaScript files and one AJAX call (the 3 requests at the bottom) are sent only after one of the JavaScript files has loaded. This is because executing that JavaScript triggers more requests. To simplify the situation, I created this simple flowchart:

I tried my best to follow Chrome's colour convention (green for CSS, purple for images and light blue for AJAX and HTML). Here is the loading sequence:

  • Load the landing page HTML
  • Load the resources for the landing page
  • Execute JavaScript, which triggers 2 API calls to load comments and followers
  • Each comment and follower loaded triggers avatar loading
  • ...
So, at a minimum, there are 4 phases in loading this web page, and each phase depends on the result of the previous one. However, because of the limit of 8 parallel requests, one phase can be split into 2 or more smaller phases while some requests wait for an available connection. Imagine what would happen if this web page were loaded with IE6 (2 parallel connections, so a minimum of 26 rounds of loading for 52 requests).
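The IE6 figure is just a ceiling division. Here is a minimal sketch, assuming (unrealistically) that every request takes one full round and that no request depends on another. The function name `minRounds` is made up for this illustration.

```typescript
// Lower bound on loading rounds: each round can carry at most
// `limit` requests, so we need ceil(requests / limit) rounds.
function minRounds(requests: number, limit: number): number {
  if (limit <= 0) {
    throw new Error("limit must be positive");
  }
  return Math.ceil(requests / limit);
}

console.log(minRounds(52, 2)); // IE6 with 2 connections: 26
console.log(minRounds(52, 8)); // the 8 connections observed above: 7
```

Real page loads need more rounds than this bound, because the phases listed above force some requests to wait for earlier ones.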

Why do browsers have this limit?

You may ask: if this limit can have such a great impact on performance, why don't browsers give us a higher limit so that users can enjoy a better browsing experience? Most well-known browsers choose not to grant this wish, so that a server is not overloaded by a small number of browsers and does not end up classifying ordinary users as DDoS attackers.

In the past, the common limit was only 2 connections. This may have been sufficient in the early days of the web, when most content was delivered in a single page load. However, it soon became a bottleneck as CSS and JavaScript grew popular. Because of this, you can see a trend of modern browsers raising this limit. Some browsers (Opera) even allow you to modify this value, but it is better not to set it too high unless you want to load test the server.

How to handle this limit?

This limit will not slow down your website if you manage your resources well and do not hit the limit. When your page is first loaded, there is an initial request that returns the HTML content. As the browser processes the HTML, it spawns more requests to load resources such as CSS, images and JavaScript. It also executes JavaScript and sends Ajax requests to the server as you instruct it to.

Fortunately, static resources can be cached and are downloaded only the first time. If they cause slowness, it happens only on the first page load and is still tolerable. It is not rare for a user to see the page frame load first and some pictures slowly appear later. If you feel that your resources are too fragmented and consume too many requests, there are tools available that compress resources and let the browser load them all in a single request (UglifyJS, Rhino, YUI Compressor, ...).

Lack of control over Ajax requests causes more severe problems. I would like to share some examples of poor design that slow down page loading.

1. Loading page content with many Ajax requests

This approach is quite popular because it lets users see the page loading progressively and enjoy the important parts of the content while waiting for the rest. There is nothing wrong with this, but things get worse when you need more requests to load the content than the browser can serve at once. Say you create 12 Ajax requests but your browser limit is 6: in the best-case scenario, you still need to load resources in two batches. That is not too bad as long as these 12 requests are not nested or executed consecutively, because the browser can then use all available connections to serve the pending requests. The worse situation happens when one request is initiated in the callback of another (nested Ajax requests). If this happens, your web page is slowed down by your design rather than by the browser limit.
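To put rough numbers on the difference, here is a small simulation. It is only a sketch: it assumes every request takes 100 ms and that the browser hands each queued request to whichever connection frees up first. Both function names are invented for this example.

```typescript
// Simulate when the last request finishes if independent requests are
// served in order over `limit` connections (durations in ms).
function simulateFlat(durations: number[], limit: number): number {
  // freeAt[i] is the time at which connection i becomes idle.
  const freeAt: number[] = [];
  for (let i = 0; i < limit; i++) freeAt.push(0);

  let finish = 0;
  for (const d of durations) {
    // Pick the connection that becomes idle first.
    let idx = 0;
    for (let i = 1; i < limit; i++) {
      if (freeAt[i] < freeAt[idx]) idx = i;
    }
    freeAt[idx] += d;
    finish = Math.max(finish, freeAt[idx]);
  }
  return finish;
}

// Nested requests: each one starts only in the callback of the previous
// one, so the durations simply add up, whatever the limit is.
function simulateNested(durations: number[]): number {
  return durations.reduce((total, d) => total + d, 0);
}

// 12 requests of 100 ms each (assumed figures).
const twelve: number[] = [];
for (let i = 0; i < 12; i++) twelve.push(100);

console.log(simulateFlat(twelve, 6)); // 200: two batches of 6
console.log(simulateNested(twelve)); // 1200: one after another
```

With independent requests the limit costs one extra batch. With fully nested requests the page takes six times longer, and raising the limit would not help.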

A few years ago, I took over a project that was haunted by performance issues. Many factors caused the slowness, but one concern was too many Ajax requests. I opened the browser in debug mode and found more than 6 requests being sent to the servers to load different parts of the page. Moreover, it kept getting worse because the project was delivered by teams on different continents, in different time zones. Features were developed in parallel, and the developer working on a feature would conveniently add a server endpoint and an Ajax request to get the work done. Worried that the situation was getting out of control, we decided to change the direction of development. The original design was like this:

For most of the Ajax requests, the response returned a JSON model of the data. Then the Knockout framework bound the HTML controls to the models. We did not face the nested requests issue here, but the loading time could not get any faster because of the browser limit, and many HTTP threads were consumed to serve a single page load. One more problem was the lack of caching, even though the page contents were pretty static, with minimal customization on some parts of the pages.

After consideration, we decided to reset the number of requests by generating all of the page contents in one page. However, if you do not do it properly, it may end up like this:

This is even worse than the original design. It is more or less equivalent to having a limit of 1 connection to the server, with all the requests handled one by one.

The proper way to achieve similar performance is to use asynchronous programming:

Each promise can be executed in a separate thread (not an HTTP thread), and the response is returned when all the promises have completed. We also applied caching to all of the services to ensure that they return quickly. With the new design, the page response is faster and server capacity is improved as well.
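Here is a minimal sketch of the idea, written in TypeScript purely for illustration; it is not the project's actual code. The three `load...` functions and their 50 ms delays are hypothetical stand-ins for backend services. In a JavaScript runtime, `Promise.all` overlaps the waiting on the event loop rather than using separate threads, but for I/O-bound calls the effect on response time is similar: the page waits for the slowest call instead of the sum of all calls.

```typescript
// Hypothetical backend call: resolves with `value` after `ms` milliseconds.
function fakeService<T>(value: T, ms: number): Promise<T> {
  return new Promise<T>((resolve) => setTimeout(() => resolve(value), ms));
}

const loadProfile = () => fakeService({ name: "demo user" }, 50);
const loadComments = () => fakeService(["first", "second"], 50);
const loadFollowers = () => fakeService(["alice", "bob"], 50);

// Sequential: each call waits for the previous one (roughly 150 ms here).
async function buildPageSequential() {
  const profile = await loadProfile();
  const comments = await loadComments();
  const followers = await loadFollowers();
  return { profile, comments, followers };
}

// Concurrent: all calls start at once (roughly 50 ms here).
async function buildPageConcurrent() {
  const [profile, comments, followers] = await Promise.all([
    loadProfile(),
    loadComments(),
    loadFollowers(),
  ]);
  return { profile, comments, followers };
}
```

Both functions return the same page model; only the waiting time differs. The sequential version corresponds to the "one by one" design described above.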

2. Failing to manage the request queue

When you make an Ajax request in JavaScript and the browser does not have an available connection to serve it, the browser temporarily puts the request in the request queue. Disaster happens when developers fail to manage this queue properly. This often happens with rich client applications. A rich client application functions more like an application than a web page: clicking a button should not trigger loading a new web address. Instead, the page content is updated with the results of Ajax requests. The common mistake is to allow new requests to be created before the existing requests in the queue have been cleaned up.

I have worked on a web application that made more than 10 Ajax requests whenever the user changed the value of a first-level combo box. Imagine what happens if the user changes the value of the combo box 10 times in a row without any break in between. About 100 Ajax requests go into the request queue, and your page seems to hang for a few minutes. This is an intermittent issue because it only happens if the user manages to create Ajax requests faster than the browser can handle them.

The solution is simple, and you have two options. The first option is to forget about the rich client application and refresh the whole page to load new content. To persist the value of the combo box, store it as a hash attached to the current URL. In this case, the browser will clear the queue. The second option is even simpler: block the user from changing the combo box while the queue is not yet cleared. To avoid a bad experience, you can show a loading bar while the combo box is disabled.
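The second option can be sketched as a small gate that counts in-flight requests. This is an illustration under assumptions, not code from the application described: `RequestGate`, `tryRun` and the `onChange` callback are invented names, and the sketch tracks promises itself rather than inspecting the browser's real queue, which scripts cannot see.

```typescript
// Tracks in-flight requests so the UI can refuse new changes while busy.
class RequestGate {
  private pending = 0;

  // `onChange` is called with true when work starts and false when the
  // last tracked request settles (e.g. to disable or enable a combo box).
  constructor(private onChange: (busy: boolean) => void = () => {}) {}

  // True while at least one tracked request has not settled yet.
  get busy(): boolean {
    return this.pending > 0;
  }

  // Starts the given requests unless the gate is busy.
  // Returns false (and starts nothing) if earlier requests are still running.
  tryRun(starters: Array<() => Promise<unknown>>): boolean {
    if (this.busy) return false;
    if (starters.length === 0) return true;

    this.onChange(true);
    for (const start of starters) {
      this.pending++;
      const done = () => {
        this.pending--;
        if (this.pending === 0) this.onChange(false);
      };
      // Count the request as finished whether it succeeds or fails.
      start().then(done, done);
    }
    return true;
  }
}
```

In a page, the combo box change handler would call `tryRun` with the functions that fire its Ajax requests, and the `onChange` callback would toggle the combo box's disabled state and the loading bar.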

3. Nesting of Ajax requests

I have never seen a business requirement for nesting Ajax requests. Most of the time, when I saw nested requests, it was a design mistake. For example, suppose you are a lazy developer and you need to load the flags of every country in the world, sorted by continent. Disaster happens when you decide to write the code this way:
  • Load the continent list
  • For each continent, load its countries
Assuming the world has 5 continents, you spawn 1 + 5 = 6 requests. This is unnecessary, as you can return a complex data structure that contains all of this information. Making requests is expensive and making nested requests is very expensive; using the Facade pattern to get what you want in a single call is the way to go.
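As a rough sketch of such a facade, the server can assemble the whole continent-to-countries structure before responding. Everything here is hypothetical: the `rows` sample data, the flag paths and the `loadWorld` name are invented for illustration.

```typescript
interface Country {
  name: string;
  flagUrl: string;
}

interface Continent {
  name: string;
  countries: Country[];
}

// Hypothetical flat rows, as a server might read them from storage.
const rows = [
  { continent: "Asia", country: "Singapore", flagUrl: "/flags/sg.png" },
  { continent: "Europe", country: "France", flagUrl: "/flags/fr.png" },
  { continent: "Asia", country: "Japan", flagUrl: "/flags/jp.png" },
];

// Facade: group countries under their continent on the server, so the
// client needs one request instead of 1 + (number of continents).
function loadWorld(data: typeof rows): Continent[] {
  const byName: { [name: string]: Continent } = {};
  const result: Continent[] = [];
  for (const row of data) {
    let continent = byName[row.continent];
    if (!continent) {
      continent = { name: row.continent, countries: [] };
      byName[row.continent] = continent;
      result.push(continent);
    }
    continent.countries.push({ name: row.country, flagUrl: row.flagUrl });
  }
  // Sort by continent name, as the example requirement asks.
  result.sort((a, b) => a.name.localeCompare(b.name));
  return result;
}

console.log(JSON.stringify(loadWorld(rows), null, 2));
```

The client then renders all the flags from one response, and no request has to wait for the callback of another.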