The Hitchhiker’s Guide to Better Web Performance

by Gerald Madlmayr / July 25th 2017

Page loading times are key for today’s web applications. Fast-loading pages encourage customers to stay longer and thus generate more revenue (https://www.fastcompany.com/1825005/how-one-second-could-cost-amazon-16-billion-sales). Fast-loading sites also result in a better search engine ranking, which is why continuously measuring and improving loading time during development is a key part of modern software development processes.

In this article, we show how web performance can be measured and what KPIs should be taken into consideration. The second section outlines how to measure these KPIs using Selenium, Chrome, and a virtual frame buffer. In closing, we demonstrate measures to improve the loading time and KPIs of your website to provide a better experience for your end user.

“What gets measured gets managed” – Peter Drucker

When we talk about “loading time”, there is often a lot of confusion about what is actually meant as there are so many variables to take into consideration.

The following loading time KPIs are the most common and provide the most suitable basis for comparing page loading times:

  • Time To First Byte (TTFB): this is the time it takes for the first byte of the HTTP response to be received by the browser after the HTTP request has been sent. This value gives an indication of how long it takes for the server to start generating a reply.
  • Time To Last Byte (TTLB): this is the time it takes for the last byte of the HTTP response to be received by the browser after the HTTP request has been sent to it. The higher the TTLB, the more time it takes for the server to compose the HTML code that is transferred to the client.
  • DOM Interactive: this is the point in time at which the entire DOM tree of the HTML is parsed and modifiable by the browser. After this point in time JavaScript can change elements in the DOM tree.
  • DOM Content Loaded: this is the point in time at which all other HTTP element requests nested in the DOM tree have been triggered. Painting of the site may have started, but does not need to have been completed. Stylesheets, images, and sub-frames will not have finished loading. This timing is a useful indicator of whether or not the HTML structure can support good page performance/short loading times.
  • OnLoad Event: the OnLoad event is triggered after all elements in the DOM tree have been loaded. This means that the browser has already processed the HTML, CSS, and JavaScript, although painting of the page may not have been completed and asynchronous calls such as tracking pixels or ads may still be loading. This KPI indicates when all non-blocking resources of the site have been collected, which makes it a suitable indicator of how fast the whole website content can be retrieved.
  • HTTP Traffic Completed: this is the point in time at which the last byte of all nested HTTP requests has been transferred to the browser. All asynchronous calls have been completed and the page is fully rendered.

The most relevant KPIs are DOM Content Loaded and OnLoad Event. Both events are also visually marked in Chrome’s Developer Tools by the blue (DOM Content Loaded) and red bars (OnLoad Event) in the network tab.

The user can interact with the site before the DOM Content Loaded or OnLoad events have completed; this point is indicated by the DOM Interactive event. Which KPI you choose to focus on is your prerogative, but DOM Content Loaded and the OnLoad Event are good indicators for benchmarking against peers.

From a user perspective, DOM Content Loaded gives an indication of when the first paints have started and the user can see the first elements. With the OnLoad Event, all the elements that are required for painting have been loaded, and only client-side processing (rendering and JavaScript execution) is required to finish. JavaScript may change the site structure and require elements to be repainted, but there is no network traffic involved in these changes (https://www.youtube.com/watch?v=LdebARb8UJk).

If you need dedicated elements on the site to be loaded (e.g. ads that load asynchronously), manual coding is required. Another option would be to look at the HTTP Traffic Completed time, which waits until the last byte has been transferred to the browser and includes asynchronous calls.
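As a sketch of such manual coding, one can poll the browser’s resource timing entries until every nested request reports a non-zero responseEnd. The helper class below is our own illustration, not part of Selenium; in a real test the list of values would be fetched from the browser via the performance API.

```java
import java.util.List;

public class TrafficCheck {

    // HTTP traffic is complete once every resource entry reports a
    // non-zero responseEnd, i.e. its last byte has been received.
    static boolean trafficComplete(List<Long> responseEnds) {
        return responseEnds.stream().allMatch(t -> t > 0);
    }

    public static void main(String[] args) {
        // In a real test these values would come from the JavascriptExecutor, e.g.:
        // js.executeScript("return performance.getEntriesByType('resource').map(e => e.responseEnd)")
        System.out.println(trafficComplete(List.of(120L, 340L, 0L)));   // false: one request still in flight
        System.out.println(trafficComplete(List.of(120L, 340L, 560L))); // true: everything transferred
    }
}
```

In practice you would call such a check in a polling loop with a timeout, since a misbehaving third-party request could otherwise block the test forever.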

There are three different technical methods for measuring KPIs.

A web browser is required to measure timings that are close to reality. In order to keep the level of resources used for testing low, the use of headless browsers such as PhantomJS (http://phantomjs.org/) or headless Chromium (https://chromium.googlesource.com/chromium/src/+/lkgr/headless/) is recommended.

Both tools can be controlled using a JavaScript API or Selenium WebDriver. Other browsers such as Firefox or Internet Explorer can also be used, but they require the opening of a GUI and so consume more resources. There are also options to operate such browsers on Linux in headless mode (using a virtual framebuffer such as xvfb).

Setting up the components

To enable measurement automation we use Google’s Chrome browser in headless mode (supported from version 59 on). This version of Chrome can be run server-side without a graphical user interface. Even before this version it was possible to run Chrome for testing in combination with a virtual framebuffer such as xvfb. A bonus side effect is that by using an X11 server (which simulates a display) alongside a remote screen viewer (VNC), you can also watch what Chrome is doing at that moment.

An alternative option would have been PhantomJS (https://github.com/macbre/phantomas), but following the announcement of headless Chrome, the maintainer of PhantomJS stepped down, so the future of PhantomJS is now unknown (https://groups.google.com/forum/#!topic/phantomjs/9aI5d-LDuNE).

So, we start off with the setup of Chrome. The installation process depends on the OS version you are using; it works on Linux and OSX (although not yet on Windows). If you run CentOS, this bug report might be helpful (https://bugs.chromium.org/p/chromium/issues/detail?id=695212).

In order to interact with Chrome, you will need the Chrome WebDriver, ChromeDriver (https://sites.google.com/a/chromium.org/chromedriver/home). This application controls Chrome through its DevTools API and exposes the standard JSON Wire Protocol to clients. The WebDriver is in turn driven by the Selenium test framework, which allows us to automate browser interaction.

To create a client with Selenium, Java is our weapon of choice. Selenium bindings and WebDrivers for other browsers are also available on platforms such as Python or NodeJS – the functionality stays the same. For Java, our project requires the corresponding dependency:

<!-- Selenium -->
<dependency>
  <groupId>org.seleniumhq.selenium</groupId>
  <artifactId>selenium-server</artifactId>
  <version>3.4.0</version>
</dependency>

The chain of communication is shown in the figure below.

The chain of communication

Our Java Selenium Client talks to the WebDriver and tells it what to do; the WebDriver controls the Chrome Headless Browser; the Browser handles communication with the external site.

Don’t mix up Chrome’s remote debugging port with the WebDriver port. For the test itself, you don’t need to start Chrome yourself; instead you pass the location of the binary to the WebDriver. The WebDriver starts Chrome for you and then executes the commands it receives from the Selenium client.

To instruct Selenium to open a webpage, we need the following snippet:

final ChromeOptions chromeOptions = new ChromeOptions();
// put the path to Google’s Chrome here
chromeOptions.setBinary("/opt/google/chrome/google-chrome");
chromeOptions.addArguments("headless");
// disable gpu-based rendering, as we are in headless mode
chromeOptions.addArguments("disable-gpu");
// if you require a proxy, pass it over to Chrome like this:
chromeOptions.addArguments("proxy-server=proxy.xyz.net:8080");
// create a Capabilities object and apply the Chrome settings
final DesiredCapabilities caps = new DesiredCapabilities();
caps.setCapability(ChromeOptions.CAPABILITY, chromeOptions);
// create a RemoteWebDriver with the Chrome parameters; the URL points to the running WebDriver
final WebDriver driver = new RemoteWebDriver(url, caps);
driver.get("http://www.prosiebensat1.com");

This opens a connection to the RemoteWebDriver, handing over the information of which binary to use (google-chrome in this case), starting it headlessly, disabling the GPU, and adding a proxy.

Measure the loading times

Selenium can run JavaScript code inside the browser and evaluate the values returned by these injected JavaScript statements. This is done with a JavascriptExecutor, which can be used just like the JavaScript console in your browser.

To determine the page loading metrics of your website, you can make use of the JavaScript performance API. The JavaScript performance object has several attributes that refer to events such as the OnLoad event or DOM Content Loaded. These attributes contain absolute timestamps, so in order to get meaningful loading times, the start timestamp (fetchStart) needs to be subtracted from the performance timing timestamps.

final long fetchStart = (long) js.executeScript("return performance.timing.fetchStart");
final long requestStart = (long) js.executeScript("return performance.timing.requestStart");
final long responseStart = (long) js.executeScript("return performance.timing.responseStart");
final long loadEventEnd = (long) js.executeScript("return performance.timing.loadEventEnd");
final long domContentLoadedEventEnd = (long) js.executeScript("return performance.timing.domContentLoadedEventEnd");
final long domInteractive = (long) js.executeScript("return performance.timing.domInteractive");

The durations can then be computed from the extracted values.

LOGGER.debug("TTFB {}", (responseStart - fetchStart));
LOGGER.debug("loadEventEnd {}", (loadEventEnd - fetchStart));
LOGGER.debug("domContentLoadedEventEnd {}", (domContentLoadedEventEnd - fetchStart));
LOGGER.debug("domInteractive {}", (domInteractive - fetchStart));
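The same subtraction can be packaged into a small, reusable helper. The class and method names below are our own, not part of Selenium; the timestamp values are purely illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TimingReport {

    // Converts absolute performance.timing timestamps (milliseconds since
    // the epoch) into durations relative to fetchStart.
    static Map<String, Long> relativeTo(long fetchStart, Map<String, Long> absolute) {
        final Map<String, Long> durations = new LinkedHashMap<>();
        absolute.forEach((name, timestamp) -> durations.put(name, timestamp - fetchStart));
        return durations;
    }

    public static void main(String[] args) {
        // illustrative timestamps, as they might come back from the performance API
        final Map<String, Long> absolute = new LinkedHashMap<>();
        absolute.put("responseStart", 1_500_000_000_300L);
        absolute.put("domInteractive", 1_500_000_000_900L);
        absolute.put("loadEventEnd", 1_500_000_002_000L);
        System.out.println(relativeTo(1_500_000_000_000L, absolute));
        // → {responseStart=300, domInteractive=900, loadEventEnd=2000}
    }
}
```

Keeping this computation in one place makes it easier to log all KPIs consistently across many measured pages.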

Overall setup

Selenium tests are commonly used during testing or run continuously to create synthetic transactions.

To scale our testing, we created a browser test cluster based on containers that run headless Chrome and the WebDriver. We also considered using a Selenium grid, but containers give us more control over the individual browsers and better options for debugging and analysis.

As the WebDriver only allows connections on localhost, we have added an instance of nginx to the container to route traffic from an outside port to the WebDriver. In the beginning we considered iptables, but nginx does the job well and provides logs that help with debugging when issues occur. With this setup, Selenium tests can run in parallel and therefore scale horizontally.
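A minimal sketch of such an nginx forwarding rule might look as follows (the ports are illustrative; 9515 is chromedriver’s default listening port):

```nginx
# Accept connections on an externally reachable port and forward
# them to the WebDriver, which only listens on localhost.
server {
    listen 4445;
    location / {
        proxy_pass http://127.0.0.1:9515;
    }
}
```

The nginx access and error logs for this server block are what make failed WebDriver sessions much easier to diagnose than a plain iptables port forward.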

Optimization

Now that we know how fast our website loads and how to measure the metrics, we can start optimizing the code. A good starting point for this is Google PageSpeed; it offers numerous hints and good advice on where to start.

Some general hints are

  • Turn on GZIP for the transfer of the files from your server.
  • Keep an eye on the amount of data that is loaded. Although browsers will cache JavaScript and images, the initial load shapes the customer’s first experience, so it should be as small and as fast as possible.
  • Make use of CDNs (CloudFront, Akamai, Cloudflare, Level3 etc.) for holding your third party content, as these CDNs are able to serve the content from a global network of data centers and usually less time is needed to fetch the data compared to using your own servers.
  • Reduce third party JavaScript and tracking pixels. Not only are you giving away your users’ data to third parties, but the loading time of such scripts also influences your page loading times – even if they load asynchronously. Do your best to keep the number of these integrations from growing.
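For the first hint, compression is typically a small change in the web server configuration. An nginx sketch (the directives are from nginx’s gzip module; the values are illustrative, not a recommendation):

```nginx
# Compress text-based responses before they leave the server.
gzip            on;
gzip_comp_level 5;
gzip_min_length 1024;   # skip very small responses
gzip_types      text/css application/javascript application/json image/svg+xml;
```

Note that `text/html` is compressed by default when gzip is on, which is why it does not appear in the list above.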

Additional technical measures to improve loading times are:

  • Use HTTP/2 instead of HTTP/1.1 and make sure your CDN supports H2.
  • If your TTFB is slow and you can’t optimize your backend to serve the content quicker, consider using full page caches such as Varnish or Squid. There are also CDN providers such as Fastly (http://www.fastly.com) that offer full-page-cache-as-a-service, which enables you to evaluate the use of caching in less than 10 minutes.
  • If you are using third party pixels, use a Tag Manager that allows you to control the pixels in a simpler way and only integrate the pixel on the pages required. Google Tag Manager also allows pixel loading after OnLoad. If your pixels can support this, then your page loading time will also benefit.

Loading time optimization is not a one-time expense. It is a continuous process of measuring and improving. Make sure you have tools for measurement automation in place and also apply these tools to your competitors’ sites so that you have a benchmark.

Performance Management at ProSiebenSat.1

At ProSiebenSat.1 we take Performance Management seriously. We have built a dedicated toolchain based on the components outlined above for measuring and tracking the loading time performance of our venture’s websites.

Besides loading times, we also continuously measure and improve other KPIs such as SSL Score, Google Page Speed, or simply the availability of the site.

Examples of high performing sites are Amorelie or Verivox. These sites also have performance tests as part of their CI tool chain to ensure a sound customer experience with every new feature release on their websites.
