Performance Research, Part 4: Maximizing Parallel Downloads in the Carpool Lane

By YUI Team
April 11th, 2007

This article, co-written by Steve Souders, is the fourth in a series of articles describing experiments conducted to learn more about optimizing web page performance (Part 1, Part 2, Part 3). You may be wondering why you’re reading a performance article on the YUI Blog. It turns out that most of a web page’s performance is determined by front-end engineering, that is, the user interface design and development.

Parallel Downloads

The biggest impact on end-user response times is the number of components in the page. Each component requires an extra HTTP request, perhaps not when the cache is primed, but definitely when the cache is empty. Knowing that the browser performs HTTP requests in parallel, you may ask why the number of HTTP requests affects response time. Can’t the browser download them all at once?

The explanation goes back to the HTTP/1.1 spec, which suggests that browsers download two components in parallel per hostname. Many web pages download all their components from a single hostname. Viewing these HTTP requests reveals a stair-step pattern, as shown in Figure 1.

Figure 1. Downloading 2 components in parallel

If a web page evenly distributed its components across two hostnames, the overall response time would be cut roughly in half. The HTTP requests would look as shown in Figure 2, with four components downloaded in parallel (two per hostname). The horizontal scale is the same as in Figure 1, to give a visual cue as to how much faster this page loads.

Figure 2. Downloading 4 components in parallel
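A back-of-envelope model makes the arithmetic concrete. The sketch below (Python) assumes every component takes the same time to download and that connections never compete for bandwidth, neither of which is quite true in practice; the 0.25-second per-component figure is purely illustrative.

    import math

    def approx_load_time(components, hostnames, per_host_limit=2,
                         time_per_component=0.25):
        """Rough load time: components arrive in waves of
        hostnames * per_host_limit parallel downloads."""
        parallel = hostnames * per_host_limit
        waves = math.ceil(components / parallel)
        return waves * time_per_component

    print(approx_load_time(10, hostnames=1))  # 5 waves -> 1.25 s (Figure 1)
    print(approx_load_time(10, hostnames=2))  # 3 waves -> 0.75 s (Figure 2)

The model overstates the benefit a little (the last wave is rarely full), but it shows why a second hostname cuts the stair-step pattern roughly in half.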

Limiting parallel downloads to two per hostname is a guideline. By default, both Internet Explorer and Firefox follow it, but users can override this behavior. Internet Explorer stores the value in the Windows registry (see Microsoft Help and Support). Firefox’s limit is controlled by the network.http.max-persistent-connections-per-server setting, accessible on the about:config page.
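For example, here is a sketch in Python (using the standard winreg module) that reads Internet Explorer’s overrides. The value names follow the Microsoft support article mentioned above; if a value is absent, IE falls back to its built-in defaults.

    import winreg

    KEY_PATH = r"Software\Microsoft\Windows\CurrentVersion\Internet Settings"

    # MaxConnectionsPerServer governs HTTP/1.1 connections;
    # MaxConnectionsPer1_0Server governs HTTP/1.0 connections.
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, KEY_PATH) as key:
        for name in ("MaxConnectionsPerServer", "MaxConnectionsPer1_0Server"):
            try:
                value, _ = winreg.QueryValueEx(key, name)
                print(f"{name} = {value}")
            except FileNotFoundError:
                print(f"{name} not set (browser default applies)")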

It’s interesting to note that for HTTP/1.0, Firefox’s default is to download eight components in parallel per hostname. Figure 3 shows what it would look like to download these ten images if Firefox’s HTTP/1.0 settings are used. It’s even faster than Figure 2, and we didn’t have to split the images across two hostnames.

Figure 3. Downloading 8 components in parallel

Most web sites today use HTTP/1.1, but the idea of increasing parallel downloads beyond two per hostname is intriguing. Instead of relying on users to modify their browser settings, front-end engineers could simply use CNAMEs (DNS aliases) to split their components across multiple hostnames. Maximizing parallel downloads doesn’t come without a cost. Depending on your bandwidth and CPU speed, too many parallel downloads can degrade performance.
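One caching pitfall is worth flagging before the experiment: if you split components across aliases, assign each component to its alias deterministically, so the same resource is always requested from the same URL and the browser cache keeps working. A minimal sketch in Python (the static1/static2.example.com aliases are hypothetical):

    import zlib

    ALIASES = ["static1.example.com", "static2.example.com"]  # hypothetical CNAMEs

    def asset_url(path, aliases=ALIASES):
        """Hash the path to pick an alias, so a given asset always maps
        to the same hostname (random picks would defeat caching)."""
        index = zlib.crc32(path.encode("utf-8")) % len(aliases)
        return f"http://{aliases[index]}{path}"

    print(asset_url("/img/logo.gif"))   # same alias on every page view
    print(asset_url("/img/promo.gif"))  # may land on the other alias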

If browsers limit the number of parallel downloads to two (per hostname over HTTP/1.1), this raises the question:

What if we use additional aliases to increase parallel downloads in our pages?

We’ve seen a couple of great blog posts and articles written recently on the subject, most notably by Ryan Breen of Gomez and Aaron Hopkins over at Google. Here’s another spin. The performance team at Yahoo! ran an experiment to measure the impact of using various numbers of hostname aliases. The test page was an otherwise empty HTML document containing 20 images, fetched from the same servers as those used by real Yahoo! pages. We ran the experiment in a controlled environment using a test harness that fetches a set of URLs repeatedly while measuring how long the page takes to load over DSL. The results are shown in Figure 4.

Figure 4. Loading an Empty HTML Document with 20 Images Using Various Numbers of Aliases

Note: Times are for page loads over DSL (~800 kbps) with the hostname aliases already cached in DNS and an empty browser file cache.
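Our harness is internal, but the methodology is easy to sketch: fetch a fixed set of URLs with a capped number of parallel workers, time the batch, and repeat to smooth out noise. In the Python sketch below the hostnames are hypothetical placeholders, and a thread pool caps total parallelism rather than per-hostname parallelism the way a browser does, so treat it as an illustration only.

    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor
    from statistics import median

    def fetch(url):
        with urllib.request.urlopen(url) as resp:
            resp.read()

    def timed_batch(urls, parallel):
        """Wall-clock time to fetch every URL with `parallel` workers."""
        start = time.monotonic()
        with ThreadPoolExecutor(max_workers=parallel) as pool:
            list(pool.map(fetch, urls))  # force completion of all fetches
        return time.monotonic() - start

    def measure(urls, parallel, repeats=10):
        return median(timed_batch(urls, parallel) for _ in range(repeats))

    # 20 images spread across two hypothetical aliases, 4 parallel downloads.
    urls = [f"http://static{i % 2 + 1}.example.com/img/{i}.gif" for i in range(20)]
    print(measure(urls, parallel=4))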

In our experiment, we vary the number of aliases: 1, 2, 4, 5, and 10. This increases the maximum number of parallel downloads to 2, 4, 8, 10, and 20 respectively. We fetch 20 small images (36 x 36 px) and, separately, 20 medium images (116 x 61 px). To our surprise, increasing the number of aliases worsens response times for the medium images once four or more aliases are used. For the small images, using more than two aliases makes little difference to the overall response time. On average, using two aliases is best.

One possible contributor to slower response times is CPU thrashing on the client: the more images downloaded in parallel, the more work the client CPU has to juggle. On my laptop at work, CPU usage jumped from 25% for 2 parallel downloads to 40% for 20 parallel downloads. These values can vary significantly across users’ computers, but they are another factor to consider before increasing the number of aliases to maximize parallel downloads.

These results are for the case where the hostnames are already cached in DNS. When they are not cached, response times get significantly worse as the number of hostname aliases increases. For pages that want to optimize the experience for first-time users, we recommend not increasing the number of domains. To optimize for the second page view, where the hostnames are most likely cached, increasing parallel downloads does improve response times. The choice depends on which scenario is most typical for your page.

Another issue to consider is that DNS lookup times vary significantly across ISPs and geographic locations. Typically, DNS lookup times for users in non-US cities are significantly higher than those for users within the US. If a significant percentage of your users come from outside the US, the benefits of increasing parallel downloads are offset by the time spent on the additional DNS lookups.
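DNS cost is easy to sanity-check. The Python sketch below times a lookup for each alias; note that resolver caching (in the OS or at the ISP) means only the first lookup shows the full cost, and the hostnames are again hypothetical.

    import socket
    import time

    def dns_lookup_ms(hostname):
        """Time one address resolution for the given hostname."""
        start = time.monotonic()
        socket.getaddrinfo(hostname, 80)
        return (time.monotonic() - start) * 1000

    for host in ("static1.example.com", "static2.example.com"):
        print(f"{host}: {dns_lookup_ms(host):.1f} ms")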

Our rule of thumb is to increase the number of parallel downloads by using at least two, but no more than four hostnames. Once again, this underscores the number one rule for improving response times: reduce the number of components in the page.

Steve Souders is also writing a series of blogs on Yahoo! Developer Network describing best practices he’s developed at Yahoo! for improving performance (Part 1, Part 2).

Comments

  1. I do think DNS lookup time will be a huge issue. DNS lookups usually aren’t fast, and many ISPs provide poor DNS service.

    With opendns.com as your DNS provider, this method could work; if not, I doubt it would do much good.

  2. Thanks for these articles, they are great. Really interesting stuff.

    On the topic of splitting requests across hosts, particularly the “front-end engineers could simply use CNAMEs (DNS aliases)”, I think it’s vital to stress the importance of canonical URLs. The principle is sound, but carries risks if URLs aren’t managed carefully.

    If you don’t ensure the same URL is used for the same resource then caching mechanisms (e.g. browsers, proxy servers) won’t recognise all requests for cached resources… if the URL is different, there will be a new HTTP request for a resource that is cached under a different URL. And that’d be bad for performance (among other things), right?

    You haven’t delved specifically into the effect of shared caching on the Internet, where one’s HTTP request today might be somebody’s proxy request tomorrow… how important is it in the quest for optimal delivery? I’d love to know more :)

    Keep up the great posts!
    Ben

  3. 20 parallel downloads of a 3.4k file will come close to maxing out an 800kbps line. Was bandwidth utilisation measured at all?

  4. In your test, you just used a page and sample images. How does this apply to mixed media, like pages in an IFrame, JavaScript and CSS files? Since most sites already request content from multiple hostnames in the form of analytics, third-party widgets, ads, and other content, it seems awfully unrealistic to limit a site to consuming resources from only 2 hostnames unless they proxy everything. I don’t see how this test corresponds to the real world.

  5. Joe P brings up a good point. With mixed-media sites, especially with the rise of sites such as YouTube and of media-rich Flash-based sites, the number of components loading onto a page is nearly irrelevant at this point. Flash files can contain all the images used for a site in a single file, or load them on command from within the file itself. With YouTube or any one of its many clones, the files are loaded from a remote server (an HTTP request to another hostname). That raises the topic of bandwidth leeching, which is a totally different thing, so I will stop.

  6. Steve Souders said:
    April 12, 2007 at 3:30 pm

    Ben’s comment about canonical URLs is right on. At Yahoo! we address that by the entity type: js and css come from “us.js2.yimg.com”, images come from “us.i1.yimg.com”. It’s not a perfect split, but is a good balance between performance and ease of implementation.

    It’s important to address how these experimental results transfer to the real world, as Joe brings up. In most sites, the bulk of the components are hosted by the site owner. For those components, a decision about how to split them across hostnames must be made, and whether they are images, js, css, or Flash, these results are generally applicable.

  7. There is an additional potential downside to increasing the number of aliases which can occur depending on how the increase is implemented. For instance, suppose when your page is generated you select the hostname for each image at random out of three possible aliases. This may increase performance for the first page, but subsequent pages that use the same images (or further requests for the same) will get a different hostname for each image two-thirds of the time, meaning users will be re-downloading images they already have in their cache. So in order to maximize both download performance and cache usage, the same image should always be served from the same hostname.

    However any partitioning scheme used to ensure the same hostnames are used for any given image runs the risk of not evenly distributing requests, which reduces the benefits of multiple hostnames.

    Also, just out of curiosity: these tests appear to have used just two sets of images with identical file sizes. What happens when images are of significantly differing sizes? It seems like the download schedule could be better optimized in some situations. For instance, a page with one 20kb image and five 2kb images should be able to download all six images in the time it takes to download the one 20kb image, because while one connection downloads the large image, the other can fetch the smaller ones. Obviously this wouldn’t be the case if the large image is one of the last to be requested, but my point is that performance may be influenced greatly by the amount of variation in image sizes and whether the requests happen in an optimal order.

  8. I just stumbled across this series of articles – thanks so much for sharing this information!

    Eric.

  9. Tenni Theurer said:
    April 23, 2007 at 1:57 pm

    @Mark: You are absolutely right with your example of the additional downside of increasing the number of aliases. At Yahoo!, we typically use us.i1.yimg.com to download images and us.js2.yimg.com to download JavaScript and CSS on our pages. This helps increase the number of parallel downloads without compromising the benefit of the browser’s cache. We did not run the experiment with varying image sizes or numbers of images, but I agree, the graph could be optimized in various situations.

    @Eric: I’m glad you like the articles. Hope you find them helpful and thanks for reading!

  10. Aren’t your figures, and your basic premise, failing to take into account that bandwidth is not unlimited?

    If I have one pipe which is fat, or divide it into 20 pipes which are thin, download time will be the same (re: your Fig. 3). The difference is in how the HTTP requests are allocated. Initiating 20 HTTP connections requires much more overhead than one.

    You need to measure bandwidth for each connection to really see whether you have a saving; your analysis is flawed otherwise. Also, the transactional overhead of HTTP has to be taken into account (which is where the differences between pipelining and not pipelining occur; or even part of the difference between HTTP/1.0 and HTTP/1.1).

  11. Great series. Steve’s talk for Google was also very interesting. We have your book at the company and I think everyone should read it.

  12. Hey Tenni and Steve,

    Thanks for this great article. It gives a lot of insight into increasing loading speed through parallel downloading.

    Can you give more info on what type of client you used to conduct the experiment in Figure 4? I’m wondering, given the improvements in people’s machines and browser technologies (e.g. FF3) since you wrote the article, whether we should lean more toward the 4- or 5-alias end to take full advantage of it?

  13. Hi,

    Thanks for such good articles!

    I think it’s advisable to use 4 hosts for US users and 2 hosts for non-US users. Ultimately that will improve page response for both kinds of user.

  14. What tool do you use to check the performance?
    I’ve heard a lot about DNS lookups but I don’t understand how to reduce them.

  15. Great article. Can anyone verify that these tests still hold true with browsers released since this article was published in 2007?

  16. “This page makes 61 parallelizable requests.” How can I fix this issue?