A URL download has four measurable phases: DNS lookup, network connection, server processing, and content transfer. The following chart shows the average time for the top-level URL across websites. Page load times are typically two to three times higher because a page contains multiple URLs.
Adding bytes to a page increases the load on the server, the network, and the web browser. Pages at the 90th percentile are 51 times larger than those at the 10th, as shown in the chart below. That massive difference significantly impacts download times.
The number of URLs per page strongly correlates with load times. Increasing the count often increases the ratio of invisible elements. For example, adding several plugins to WordPress can result in many more URLs with little visual impact. The following chart shows the distribution of URLs per page across websites.
Images consume half the bandwidth of a typical website. Optimizing their size decreases payloads while maintaining quality. The following chart shows the average number of images on a website, grouped by file type.
Reducing JPEG File Sizes: The JPEG format works best with photographs. It uses lossy compression to reduce file size significantly, but it also degrades quality. The following optimizations reduce sizes for websites (see the example after the list); however, retain the original image for reuse.
Resize the pixel height and width to fit the display area.
Reduce image quality until just before compression artifacts become visible. A reasonable starting point is 50%.
Reduce chroma subsampling to 4:2:0, or until just before the color changes become noticeable.
Use progressive encoding. It renders a low-resolution image while the browser continues to download more detail. Also, the file tends to be slightly smaller.
Strip the metadata content from the image.
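As an illustration, the list above maps onto a single ImageMagick command. This is a minimal sketch, assuming ImageMagick is available; the file names and the 2560-pixel width are placeholders.

    # Resize to the display width, set quality to 50, minimize the chroma,
    # encode progressively, and strip metadata (ImageMagick)
    convert original.jpg -resize 2560x -quality 50 \
            -sampling-factor 4:2:0 -interlace Plane -strip slider-2560.jpg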
The following chart shows the average reduction in file size for the three full-width sliders on the top-level page of this site. The desktop size is 2560 pixels wide, and the mobile size is 500. Use mobile image encoding to send an image of the appropriate pixel width to the device. The chart shows that pixel size and the quality setting have the most impact on JPEG file size.
Reducing PNG File Sizes: The PNG format is optimal for images with small color palettes, such as charts. The following steps decrease their size; an example command follows the list.
Resize the pixel height and width to fit the display area.
Minimize the color palette. A JPEG may result in a smaller file when the palette exceeds 256 colors.
When practical, remove the transparency layer.
If possible, use an SVG or another very high-resolution image as the source.
Maximize compression.
Remove metadata.
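A minimal sketch of those steps with ImageMagick; the file names, the 800-pixel width, and the 256-color limit are placeholders.

    # Resize, cap the palette at 256 colors, maximize compression, strip metadata
    # (to remove transparency first, add: -background white -alpha remove)
    convert diagram-original.png -resize 800x -colors 256 \
            -define png:compression-level=9 -strip diagram.png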
The following chart shows how the optimizations changed the file sizes for the diagrams on this page. It shows that pixel size and the size of the color palette have the most impact on PNG file sizes.
An SVG is a resolution-independent format that looks great at any size. It works best with logos and charts. It is an XML-based markup language that can live in a standalone file or be embedded into a web page. It can produce the smallest files, but it requires more testing because browsers only interpret a subset of its features.
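For example, a simple logo can be embedded directly in the page markup. This is a minimal sketch; the shape and colors are arbitrary.

    <!-- A 100x100 logo drawn as markup instead of downloaded as an image -->
    <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
      <circle cx="50" cy="50" r="40" fill="#336699"/>
    </svg>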
A GIF is an old image format. The color palette is at most 256 colors, so it does not work with photographs. Also, its compression is less effective than PNG's. It supports animation, which is useful for files under 10 KB.
An ICO file is useful for adding /favicon.ico, the default image shown in the browser tab. Adding it reduces logged error messages. However, sites should also include one or more site logos in the other supported formats.
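A minimal sketch of the corresponding markup; the SVG logo file name is a placeholder.

    <!-- Browsers request /favicon.ico by default; declaring it is explicit -->
    <link rel="icon" href="/favicon.ico">
    <link rel="icon" href="/logo.svg" type="image/svg+xml">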
HTML coding techniques can dramatically reduce load time, in contrast to optimizing images and tuning systems. The following chart shows that few websites use these simple, low-cost techniques.
Tuning images for the display area in the browser reduces their size. For example, the full-width images for widescreen displays are 14 times larger than those for mobile; that applies to the slider images on the root page of this site. In the following code, the browser downloads the image optimized to its current screen width.
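A minimal sketch using the srcset attribute; the file names match the 2560-pixel desktop and 500-pixel mobile sizes mentioned earlier and are placeholders.

    <!-- The browser picks the smallest file that covers its screen width -->
    <img src="slider-500.jpg"
         srcset="slider-500.jpg 500w, slider-2560.jpg 2560w"
         sizes="100vw"
         alt="Home page slider">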
Embedding CSS content into the root page allows the browser to start rendering while still downloading the first URL on a page. However, it is more efficient to develop content with a single CSS file shared by all pages. The trick is to create a script that delays embedding the CSS until deployment into production.
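A minimal sketch of such a script, assuming each page carries a placeholder comment where the stylesheet belongs; the comment text and file names are hypothetical.

    # Replace the placeholder with the contents of the shared stylesheet
    awk '/<!-- INLINE_CSS -->/ {
            print "<style>"
            while ((getline line < "styles.css") > 0) print line
            print "</style>"
            next
         }
         { print }' index.html > index.deployed.html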
HTML code includes indentation, spaces, comments, and blank lines to make it human-readable. However, the browser ignores them, so removing that content reduces payloads for text-based files. The following command strips the excess spaces.
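A minimal sketch using sed; it strips leading indentation and blank lines, while a dedicated minifier such as html-minifier would also remove comments.

    # Remove leading whitespace and delete blank lines
    sed -e 's/^[[:space:]]*//' -e '/^$/d' index.html > index.min.html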
Resource hints help browsers optimize downloading. The following code uses preconnect to get the IP address, establish a network connection, and negotiate an SSL session. When properly placed, it starts those tasks sooner. Determine when to use resource hints by reviewing download diagrams.
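A minimal sketch; the font host is a placeholder for any third-party origin the page uses.

    <!-- Resolve DNS, open the TCP connection, and negotiate TLS early -->
    <link rel="preconnect" href="https://fonts.example.com">
    <!-- Fallback for browsers that only support DNS lookup hints -->
    <link rel="dns-prefetch" href="https://fonts.example.com">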
The Cache-Control header defines how long to save copies of the website. Both a CDN and a web browser store content. However, Cache-Control is missing or set to zero for the top-level URL on 99% of sites. That forces the browser to connect to the web server for each page.
The entity tag (ETag) is a fingerprint of a URL's content. The browser sends the cached ETag with each request for that URL. If the client and server tags match, the server responds with no content (304 Not Modified); otherwise, it sends the content.
The Keep-Alive directive reuses existing TCP connections for additional URLs on the page. The more modern HTTP/2 protocol always reuses connections and ignores the header, although HTTP/2 is still uncommon.
Cookies store content in the web browser. They maintain sessions for shopping carts and track users across pages. Cookies attached to the top-level page make that page unique, and a CDN cannot cache a collection of unique pages. The CDN works better when the cookie attaches to a different DNS name.
Content encoding reduces network payloads by compressing files. It is only useful with text files because images have built-in compression. It is possible to reduce CPU overhead by precompressing text files, although it is more common to compress them at download time.
The following is an example of web page headers for performance.
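This is a sketch; the values are illustrative rather than recommendations.

    HTTP/1.1 200 OK
    Content-Type: text/html; charset=UTF-8
    Content-Encoding: gzip
    Cache-Control: public, max-age=86400
    ETag: "5e15153d-78b"
    Connection: Keep-Alive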
Server processing times are a significant contributor to page load times, and there are vast differences between websites. Those at the 10th percentile are 32 times faster than those at the 90th, as shown in the following diagram. The delays predominantly come from inefficient code and server overloading.
Reducing the distance between the client and server accelerates network transmission by reducing latency. Network bandwidth only becomes relevant when the browser is very close to the server and there are sufficient concurrent connections. Retransmission rates also increase the farther a packet travels, and each lost packet typically causes a three-second delay. Minimizing distances helps both the web server and the DNS server.
Latency routing selects the server with the fastest network connection for each client. The enabling technology is either DNS or BGP. The DNS design gives clients one of many IP addresses, while BGP updates the core routing tables of the internet. The DNS approach is more flexible because individual applications can apply adjustments within minutes, whereas only network providers can tune BGP.
A Content Delivery Network (CDN) reduces response times by minimizing the distance between the web browser and the server, as shown in the next diagram. It does this by serving local copies of website content. However, over 99% of sites require going back to the master server to download each page: those sites omit Cache-Control on the root page, set it to zero, or require ETag checks.
A multi-location website distributes standalone web servers to multiple locations, as shown in the following diagram. It works better for most sites because it minimizes network latency while eliminating coding dependencies. It also removes single points of failure, making it more available than a CDN.
A DNS lookup occurs when a user enters a website name. It converts names like strategic.com to IP addresses like 8.8.8.8. As shown in the following diagram, the search starts on the right with the local disk. If that fails, it progresses to the secondary DNS server, which can in turn get the address from the primary DNS server. The DNS response time is part of the URL response time, and it varies across domains, as shown in the next chart.

There are a couple of ways to improve DNS response time. Distributing the primary DNS servers around the world reduces network latency. Increasing the time to live (TTL) results in more localized lookups, which are faster. The following chart shows the distribution of TTL values across websites.
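As a quick check, the dig utility reports both the lookup latency and the TTL; the domain below is the one from the example above.

    # "Query time" shows the lookup latency; the number after the name in
    # the ANSWER section is the record's TTL in seconds.
    dig strategic.com A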
SSL stapling allows the web server to reuse the same certificate-status ticket across multiple network connections. It improves performance by minimizing the number of calls issued to the certificate authority.
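A minimal sketch of enabling stapling in nginx, assuming a certificate chain is already configured.

    # Fetch the certificate-status response once and reuse it across connections
    ssl_stapling on;
    ssl_stapling_verify on;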
Reduce page-loading latency with no load on the server.
Decrease URL download times while fully loading the server.
Recover quickly from overloading and failover.
Running the tests in the above sequence reduces the tuning effort. Use performance tests to guide hardware sizing, software selection, system configuration, and server location. They help reduce response times, minimize hardware costs, and improve availability.

There are many online testing tools. Google PageSpeed Insights detects issues like oversized image files. The timing chart shown below comes from Pingdom. It helps identify dependencies, parallelization, and efficient use of system resources. The following shows the first URL download, followed by many URLs in the second pass; that is good parallelization. Also, notice the reuse of the DNS, TCP connect, and SSL sessions.

Open-source tools like ApacheBench (ab) and curl allow website administrators to run a load test. However, hosting platforms often restrict load testing by capping server loads. Also, shared hosting platforms frequently adjust which websites are on a server. Those platforms make much of the performance analysis impractical.

The following code is an example of the ab command. Be sure to include the trailing slash. The -c is for concurrency, and the -n is for the number of requests.
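A minimal sketch; the URL is a placeholder, and the counts are small enough for a quick test.

    # 100 requests total, 10 concurrent; note the required trailing slash
    ab -c 10 -n 100 https://www.example.com/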