Faster WordPress Websites

Website scores from GTMetrix (PageSpeed: 99%, YSlow: 95%) and Google's TestMySite (2s, Excellent)

As a freelance WordPress developer, I have used a number of techniques to make my website load faster. It has a 99% PageSpeed score on GTMetrix with a load time of 0.7s. Google’s TestMySite gave it a 2s load time on 3G and a rating of Excellent. Speed like this doesn’t happen automatically. It also affects search rankings, since search engines take loading speed into account. In this post I’ll go over everything I did to make my website blazing fast.

How fast my website loads depends on four things:

  • How much I load
  • How fast I make and send
  • How fast I receive and paint
  • How fast the connection is

The last one is out of my control, but let’s go over the first three.

How much I load

Too many bytes
Though a bit obvious, the first problem with all slow websites is that they load many more bytes than required by the content. On my website, I tackle this as follows:

  1. Optimize text
  2. Optimize images
  3. Compress: gzip, minimize, concatenate
  4. Reduce server requests
  5. Inline images
  6. Use browser cache aggressively

1. Optimize text

I’ve made the container hierarchy of my pages simple and tried to use only semantic tags. No div>div>div hierarchy if possible. Newer CSS layout features like Flexbox and CSS Grid make a deep DOM hierarchy unnecessary. So my landing page has only 150-170 nodes, depending on my Twitter feed. This doesn’t mean less content, just less markup wrapped around it.
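To get a feel for how heavy a page’s markup is, a rough count can be taken in the browser console. This is only a minimal sketch: the helper name domStats is made up for illustration, it counts elements only (not text nodes), and it relies on nothing more than the standard children property.

```javascript
// Hypothetical helper: counts elements and measures the deepest nesting
// level under a root element. Works on anything that exposes `children`.
function domStats(root) {
  var count = 0;
  var maxDepth = 0;

  function walk(node, depth) {
    count += 1;
    if (depth > maxDepth) {
      maxDepth = depth;
    }
    var kids = node.children || [];
    for (var i = 0; i < kids.length; i++) {
      walk(kids[i], depth + 1);
    }
  }

  walk(root, 1);
  return { count: count, maxDepth: maxDepth };
}

// In a browser console:
// domStats(document.documentElement);
```

A high maxDepth is usually the sign of wrapper divs that a grid or flex layout could replace.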

DOM optimization: Reduce container hierarchy by using CSS Grid and Flexbox

2. Optimize images

Staging desktop wallpapers as if they were paintings in an art gallery.

Include images of the right size

The fast and smart way is to transfer only as much image data as the browser needs. This means transferring an image that is close in size to the space in which the browser will display it. I do this explicitly, like the thumbnail images on the blog archive page or the project listing in my portfolio on the home and portfolio pages. In most places on this website, I’ve used the srcset attribute in <img> tags in addition to the src attribute so that the correctly sized image is selected and downloaded by the browser. For this to work on WordPress, insert the following code in your functions.php file to activate the post-thumbnails feature. This directs WordPress to create and store uploaded media images at the resolution specified.

  ...
  add_theme_support( 'post-thumbnails' );
  set_post_thumbnail_size( 200, 113 ); //the size I need for most of my site
  ...
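As a rough mental model of what srcset buys you, the sketch below picks the smallest candidate that still covers the display slot at the device’s pixel density. Real browsers are free to choose differently (they may also weigh cache contents or network conditions), and the function name and file names here are made up for illustration.

```javascript
// Hypothetical model of srcset selection: choose the smallest image whose
// width covers slotWidth (in CSS pixels) multiplied by the pixel ratio.
// Falls back to the largest candidate when none is big enough.
function pickCandidate(candidates, slotWidth, dpr) {
  var needed = slotWidth * (dpr || 1);
  var sorted = candidates.slice().sort(function (a, b) {
    return a.width - b.width;
  });
  for (var i = 0; i < sorted.length; i++) {
    if (sorted[i].width >= needed) {
      return sorted[i];
    }
  }
  return sorted[sorted.length - 1];
}

// Example candidates, as they might appear in a srcset attribute
var thumbs = [
  { url: 'thumb-800.jpg', width: 800 },
  { url: 'thumb-200.jpg', width: 200 },
  { url: 'thumb-400.jpg', width: 400 }
];

console.log(pickCandidate(thumbs, 200, 1).url); // 200px slot, normal screen
console.log(pickCandidate(thumbs, 200, 2).url); // same slot, 2x screen
```

The point is that a phone with a 200px slot never has to download the 800px file.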

I also handle the compression of full-sized JPG images manually. A compression/quality level of 75 is sufficient for most photo quality images on the web. If that’s too much work, consider using a plugin like the EWWW Image Optimizer.

SVG: The background image on this website adds up to 1100 bytes (of HTML, CSS, and SVG). If it were a 1920×1080 JPG image, it would’ve been 250KB even at good compression. That makes the SVG version roughly 230 times smaller. I use SVG wherever possible (like the illustration for DOM optimization above). For flat illustrations and patterns it compresses better and stays sharp at any size, unlike raster formats such as JPG and PNG.

SVG should be optimized for use on the web. Most drawing tools will add meta information to the SVG file that isn’t really image data. This information can be stripped, along with unused gradients and other definitions. Read more about how to use SVG on the web in this article by Sara Soueidan.

3. Compress: gzip, minimize, concatenate

Turn on gzip compression on your web server. This should be activated by default, but you never know. It can reduce transfer sizes by up to 70%. Text files are particularly compressible (.html, .css, .js, .xml, .svg). This useful article on GTMetrix describes how to enable gzip compression on your server by editing the .htaccess file:


  # Compress HTML, CSS, JavaScript, Text, XML and fonts
  AddOutputFilterByType DEFLATE application/javascript
  AddOutputFilterByType DEFLATE application/rss+xml
  ...
  AddOutputFilterByType DEFLATE text/plain
  AddOutputFilterByType DEFLATE text/xml
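To get a sense of how much gzip can save on text, here is a small Node.js sketch using the built-in zlib module. The sample markup is invented and far more repetitive than a real page, so treat the resulting number as an upper bound rather than something to expect from your own site.

```javascript
var zlib = require('zlib');

// Returns the fraction of bytes saved by gzipping the given text.
function gzipSavings(text) {
  var original = Buffer.byteLength(text, 'utf8');
  var compressed = zlib.gzipSync(text).length;
  return 1 - compressed / original;
}

// Invented sample: repetitive markup, which is what makes HTML compress well
var sample = '<li class="item"><a href="#">Link</a></li>\n'.repeat(200);

console.log((gzipSavings(sample) * 100).toFixed(1) + '% saved');
```

Already-compressed formats like JPG and PNG gain almost nothing from gzip, which is why the server rules above target text types only.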

I also made sure to serve only one script and one .css file for my pages. These script and .css files are minified for compression. Additionally, file concatenation helps reduce the number of server requests, which brings us to the next point.

4. Reduce server requests

On a good day, my website’s landing page makes only 14 requests. Of these, 4 are housekeeping (analytics, CDN, email protection etc.). This leads to an excellent PageSpeed score.
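A quick way to count requests yourself is the Resource Timing API, available in the browser console. The sketch below is one possible approach; the function name summarizeRequests is made up, and note that the resource list does not include the request for the HTML document itself.

```javascript
// Hypothetical helper: groups resource timing entries by what initiated
// them (img, script, link, css, and so on) and counts the total.
function summarizeRequests(entries) {
  var summary = { total: entries.length, byType: {} };
  entries.forEach(function (entry) {
    var type = entry.initiatorType || 'other';
    summary.byType[type] = (summary.byType[type] || 0) + 1;
  });
  return summary;
}

// In a browser console, once the page has finished loading:
// summarizeRequests(performance.getEntriesByType('resource'));
```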

One of the ways to reduce server requests is to concatenate all CSS and all JS files into one file each. Image files can also be “concatenated,” especially if there are several small ones on the page, by using an image sprite. The traditional way is to use CSS background-image and background-position with a blank image, but it can also be done with the <img> tag as outlined in this CSS-Tricks post.

5. Inline images

Inlining also helps reduce the number of server requests. Small images, icons in particular, can be converted to data URIs and specified directly in the src attribute or in the CSS. The background SVG pattern of pages on my website, for example, has been inlined:

#backgrounds #bg-sticks {
  background-image: url(data:image/svg+xml;charset=utf8,%3Cs ... );
}

The data URI can also be formed for binary format images by first converting them to text. This can be done by using base64 encoding. More details can be found in this post about data URIs by CSS-Tricks.
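Both kinds of data URI can be generated at build time. The following is a minimal Node.js sketch with made-up function names: binary images are base64-encoded, while SVG, being text already, only needs percent-encoding (matching the charset=utf8 form used in the CSS above).

```javascript
// Binary images (PNG, JPG, GIF): base64-encode the raw bytes.
function binaryToDataUri(bytes, mimeType) {
  return 'data:' + mimeType + ';base64,' + Buffer.from(bytes).toString('base64');
}

// SVG is text, so percent-encoding is enough and usually ends up
// smaller than base64.
function svgToDataUri(svgMarkup) {
  return 'data:image/svg+xml;charset=utf8,' + encodeURIComponent(svgMarkup);
}

console.log(svgToDataUri('<svg/>'));
```

Since encodeURIComponent leaves parentheses and single quotes untouched, it is safest to wrap the result in double quotes inside url("...") when pasting it into CSS.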

I don’t inline binary images much because all my binary images are large images. My smaller images are SVG, and they are all inlined—for example, the site logo and the social media interaction icons at the top of this post. The illustration for DOM optimization above is also inlined SVG.

6. Use browser cache aggressively

I have set my browser cache timeouts to a year for most of the resources downloaded from my website. (Everything has a version parameter so I can cause a cache invalidation if necessary.) From the second load onwards, most of the bytes for my site no longer come from the server. This is done by using the .htaccess file to configure the web server, which then specifies expiry times in the response headers of different file types. Again, GTMetrix has a nice article about it.

## EXPIRES CACHING ##

ExpiresActive On
ExpiresByType image/jpg "access plus 1 year"
ExpiresByType image/jpeg "access plus 1 year"
ExpiresByType image/gif "access plus 1 year"
ExpiresByType image/png "access plus 1 year"
ExpiresByType text/css "access plus 1 year"
ExpiresByType application/pdf "access plus 1 year"
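ExpiresByType text/x-javascript "access plus 1 year"
# Most servers now send scripts as one of these two types
ExpiresByType text/javascript "access plus 1 year"
ExpiresByType application/javascript "access plus 1 year"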
ExpiresByType image/x-icon "access plus 1 year"
ExpiresDefault "access plus 7 days"

## EXPIRES CACHING ##
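A year-long cache only works if there is a way to break it, which is the job of the version parameter mentioned above. The sketch below shows the idea with the standard URL API. The helper name withVersion and the example URLs are invented, and ver is assumed as the parameter name because that is what WordPress appends to enqueued scripts and styles.

```javascript
// Hypothetical helper: adds or replaces a version parameter on a URL.
// A changed version makes the browser treat the file as a new resource.
function withVersion(url, version) {
  var parsed = new URL(url);
  parsed.searchParams.set('ver', version);
  return parsed.toString();
}

console.log(withVersion('https://example.com/style.min.css', '1.4.2'));
```

Bumping the version on each release means returning visitors fetch the new file once and then cache it for another year.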

How fast I make and send

Once I’d cut out all the bloat, it was still important that the server create and send the page to the browser as fast as possible. Here are some options:

  1. Static server
  2. DB, object, and page caching
  3. CDN

1. Static server

If you are up to it, this is your best option. With a little knowledge of markdown and an engine like Jekyll or Hugo, you can “make” the pages after you write the content by running a script, and then deploy the generated static HTML files to your web server. Content is stored in flat files in markdown format instead of a database. This is technically the fastest you can get, but static servers are probably not for everyone. Here is an excellent post by David Walsh about static site generators.

2. DB, object, and page caching
Cars in a car-park, representing caching
Caching. Photo by Omer Rana on Unsplash

Given that the websites I design and develop for clients are in WordPress, a static server won’t work for me. The next best alternative is to cache pages when they are requested and then serve them from the cache. I do this using the W3 Total Cache plugin. In addition to caching pages, it also helps manage minification, concatenation, browser cache settings, DB caching, and CDN setup. I use it for DB caching, object caching, and page caching. I’ve set it up to use a memcache socket provided by my ISP, but a disk-based cache option is also available. (DB cache caches the result of database queries. The object cache caches intermediate objects like users, site options, and global posts. The page cache stores generated page output for each URL accessed.)
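The idea behind page caching fits in a few lines. This is only an illustrative sketch, not how W3 Total Cache is implemented: createPageCache, the render callback, and the time-to-live value are all made up for the example.

```javascript
// Hypothetical page cache: renders a URL once, then serves the stored
// HTML until the entry is older than ttlMs milliseconds.
// `now` can be injected to make the behaviour easy to test.
function createPageCache(render, ttlMs, now) {
  now = now || Date.now;
  var store = new Map();

  return function getPage(url) {
    var hit = store.get(url);
    if (hit && now() - hit.storedAt < ttlMs) {
      return hit.html; // cache hit: no rendering, no database queries
    }
    var html = render(url); // cache miss: do the expensive work once
    store.set(url, { html: html, storedAt: now() });
    return html;
  };
}

// Usage: wrap an expensive render function with a 10-minute cache
var getPage = createPageCache(function (url) {
  return '<html><body>Rendered ' + url + '</body></html>';
}, 10 * 60 * 1000);

console.log(getPage('/about'));
```

DB and object caches follow the same pattern, just with query results or intermediate objects as the stored values instead of whole pages.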

3. CDN

A content delivery network, or a CDN, caches site resources on several servers in diverse geographical locations. This way when someone in Asia requests a page from your site, they get it from a server nearby in Asia, which can save as much as a quarter of a second per server request. I use a free account on Cloudflare. Cloudflare also helps hide the email addresses on my pages from bots and protects against DDoS attacks.

How fast I receive and paint

The third, equally important part is receiving resources without blocking other resources or the rendering of the page. This is done using

  1. critical CSS,
  2. asynchronous and deferred loading of JS,
  3. non-blocking CSS,
  4. lazy-loaded images,
  5. non-blocking execution of JavaScript.

A line of people in snow, indicating slow loading of queued resources in a browser
Slow: Browser loading resources in a queue

1. Critical CSS

The way we perceive page-load speed is the time taken for something meaningful to appear on screen and whether it appears in bits and pieces. So it is important that at least some of the page is already styled when it gets loaded. The way to do this is to separate out the most important bits of CSS, the bits that control the appearance of the first screenful of information, and inline them in a <style> tag in the <head>. This ensures that the content that appears is already rendered as it should.

I haven’t yet gotten around to doing this for my website, because it is incredibly hard to do after SCSS has been written, modularized, and separated for responsiveness. But I am doing it for all my new projects and it really, really helps.

2. Asynchronous script loading

Because this article is about WordPress, I’m going to assume that any JS that needs to be loaded can wait until after the page has been parsed. This means that the JS can be loaded asynchronously or deferred until parsing is done. Where earlier, just putting the <script> tag at the end of the body was the best we could do, now we have two attributes, async and defer, that help us. When both are present, browsers that support async use it, and older browsers fall back to defer:

...
<script src="script.min.js" async defer type="text/javascript"></script>
</body>
</html>

An asynchronous script needs a different way to run its on-load code. A DOMContentLoaded listener gets called only if the script runs before the document has finished parsing, which might not always happen in that order. The way around this is to check document.readyState and call the function directly when the document has already been parsed:

var onloadAlreadyRun = false;
function ready(e){
  //guard against running twice
  if (onloadAlreadyRun){
    return;
  }
  onloadAlreadyRun=true;
  //run code that should run when the document is loaded
}
if (document.readyState === 'loading'){
  //the document is still being parsed, so wait for the event
  document.addEventListener('DOMContentLoaded', ready);
} else {
  //the event has already fired, so run right away
  ready();
}

3. Non-blocking, asynchronous, preload-enabled CSS
Stop sign indicating that CSS is render blocking
CSS is render blocking

CSS is a render-blocking resource. So that <link> tag on the page that downloads the stylesheet prevents the page from rendering until the CSS file it specifies has been downloaded, parsed, and applied. If you have a large CSS file like I do, your time to first render will suffer.

Browsers do this because, without the CSS, plain HTML pages are unusable. But if you’re using critical CSS this is obviously not the case. So it should be OK to make the non-critical CSS non-blocking. One thing that helps is to divide up the non-critical CSS into different files for different media queries and link them independently using the correct media="###" attribute. The non-matching stylesheets are then downloaded in a non-blocking manner, and the matching ones consequently block for a shorter time.

Another way around this is to tell the browser that the linked CSS isn’t a stylesheet yet, and to mark it for preload using the rel="preload" attribute. Browsers that support this directive will download the file asynchronously and fire an onload event on that node. At this point you can tell the browser to parse the file and apply it as a stylesheet by changing the rel attribute to stylesheet.

<head>
...
<link href="style.min.css" id="mainStyleSheet" rel="preload" as="style" media="screen" onload="this.rel='stylesheet'" type="text/css">
...
</head>
<body>
...
<!-- The whole of body -->
...
<!-- end of body tag -->
<noscript><link rel='stylesheet' href="style.min.css"></noscript>

For browsers that don’t support preload, the loadCSS polyfill by the Filament Group can be used.
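Whether a browser needs the fallback can be detected with the standard relList.supports() method. The sketch below shows one way to do the check; it is a simplified illustration rather than the loadCSS code itself, and the function name supportsPreload is made up.

```javascript
// Returns true when the given document's <link> elements understand
// rel="preload". Any error (for example, no relList at all in an old
// browser) is treated as "not supported".
function supportsPreload(doc) {
  try {
    return doc.createElement('link').relList.supports('preload');
  } catch (e) {
    return false;
  }
}

// In a browser, fall back to a plain stylesheet when preload is missing:
// var link = document.getElementById('mainStyleSheet');
// if (!supportsPreload(document)) { link.rel = 'stylesheet'; }
```

Browsers without preload support simply ignore the original link, so switching rel by hand is what triggers the download there.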

4. Lazy-load images

Images are already loaded asynchronously by browsers. However, if you have large banner images on your page, they will appear to load after a flash of empty space, giving the impression that the page is slow. You can mitigate this effect by loading very low-res images in the initial load and then lazy-loading full resolution images once the rest of the page has been rendered. I’ll do a blog post about how to do this at some point, but until then this pen by @derekmorash demonstrates this technique nicely:

See the Pen Lazy Load Images by Derek Morash (@derekmorash) on CodePen.
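For reference, the low-res-first idea can be sketched as follows. This is not the code from the pen above: the data-full attribute and the function name are assumptions made for the example, where each <img> starts with a tiny placeholder in src and carries the full-resolution URL in data-full.

```javascript
// Hypothetical helper: for each image with a data-full attribute,
// download the full-resolution file in the background and swap it in
// only once it has finished loading.
// `createImage` can be injected for testing; browsers use new Image().
function upgradeImages(images, createImage) {
  createImage = createImage || function () { return new Image(); };
  images.forEach(function (img) {
    var fullSrc = img.getAttribute('data-full');
    if (!fullSrc) {
      return; // nothing to upgrade
    }
    var loader = createImage();
    loader.onload = function () {
      img.src = fullSrc; // swap happens after the download completes
    };
    loader.src = fullSrc; // starts the background download
  });
}

// In a browser, start upgrading once the rest of the page has loaded
if (typeof window !== 'undefined') {
  window.addEventListener('load', function () {
    var imgs = document.querySelectorAll('img[data-full]');
    upgradeImages(Array.prototype.slice.call(imgs));
  });
}
```

Because the swap waits for the full image to arrive, visitors see the placeholder until the sharp version is ready instead of an empty box.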

5. Asynchronous scroll-event handling

Scroll jank is another perception-related performance issue for web pages. If a scroll listener makes DOM changes (adding/removing/modifying nodes, attributes, classes etc.) that result in a repaint, the browser has to finish that work before it can scroll, making the mousewheel (or touch drag) appear non-responsive. This is called scroll jank, and it creates the perception that the website is slow loading. Google’s search ranking algorithms detect it and penalize for bad user experience.

On my website the navigation header is scroll-linked, reducing in size and sticking to the top on scrolling. I made sure to fix the jank.
Jank is eliminated by delinking repaint from the scroll event handler by using passive event listeners and the requestAnimationFrame API.

var latestKnownScrollY = 0, ticking = false;

//detect whether passive event listeners are supported
var supportsPassive = false;
try {
  var opts = Object.defineProperty({}, 'passive', {
    get: function() {
      supportsPassive = true;
    }
  });
  window.addEventListener("test", null, opts);
} catch (e) {}

// Use ready function from async, deferred js example earlier
function ready(){
  //add a scroll listener in a passive manner if supported
  window.addEventListener("scroll", onScrollWrap, supportsPassive ? {passive: true} : false);
}

// This is the wrapper listener that is called on scroll event. It does no DOM manipulation
function onScrollWrap(){
  latestKnownScrollY = window.scrollY;
  requestTick();
}

// Request an animation frame to execute our real listener so that 
// its changes are incorporated in the next repaint. Set a flag
function requestTick(){
  if (!ticking){
    requestAnimationFrame(onScrollInner);
  }
  ticking=true;
}

// The actual scroll listener where we perform DOM manipulation. 
function onScrollInner(){
  ticking=false;

  // Perform actual scroll related DOM manipulation below
  // using the latestKnownScrollY value set in the wrapper listener
}

Explanation
JavaScript event listeners are executed before the default behavior for the event. The browser waits for all the listeners to finish execution just in case one of them cancels the default action. If one of the registered listeners performs a “big” job on a frequently fired event (like a scroll event), it delays the default action of moving the viewport, causing the jank. Marking the listener passive promises the browser that it will never call preventDefault(), so the browser can go ahead with the default action without waiting for the listener to finish. This is the first part of the solution.

The second part is to not process every scroll event, but only the latest one before each repaint. This is done using the requestAnimationFrame API. The wrapper listener just records the scroll position and requests an animation frame if one is not already pending; the ticking flag tracks this and is cleared when the real listener runs.

Conclusion

That’s it. These are all the performance optimizations that have made this a 99% PageSpeed website. Let me know in the comments if you have any queries or tweet at me @laaltoofan.

Shameless Plug

I am a designer and developer who makes WordPress websites. Hire me to speed up your existing website or to get yourself a brand new website that’s blazing fast.
