Friday, June 18, 2010

Costs and Benefits of Website Performance Optimization

My last post covered the how, and supplied just enough explanation of the benefits to keep people interested. Here I will try to go in depth with my analysis of the costs and benefits of the website performance optimization steps I have taken this week.

Costs

I spent the better part of 4 days this week focusing on site performance enhancements. Since we launched the new site in November of last year, I may have spent as much as another 4 days on similar tasks. That time was spread out an hour here and there rather than concentrated into most of a work week. Let’s say in total I’ve spent 2 work weeks setting up these various tweaks. Setup is a one-time cost.

Some of the costs stretch back to developing the redesign. It took a lot more time to separate content, presentation, and behavior than it would have to just export from Photoshop to Dreamweaver and push the results up to the server. At the time I was thinking more in terms of quality than performance, but it would paint an incomplete picture to ignore that end of the equation. There are benefits above and beyond performance for that sort of attention to detail as well. The overall code footprint for the new site is about 4% that of the old site, and normal site maintenance takes a fraction of the time it once did. I’ve blogged on that topic before, although that entry is on the WordPress server which is currently down for maintenance. (3.0 upgrade! w00t!)

My previous entry included a step-by-step altered workflow for doing something as simple as adding a new icon to our footer. But to be honest most of the extra steps are trivial: a couple of extra “Save As” operations in Dreamweaver and minifying the results. These things take seconds or even a fraction of a second. That’s nothing compared to the normal edit/test/debug/publish cycle. Occasionally we’ll hit an image sprite edit that is complex, but most will be as simple as sticking the new element at the bottom of the existing file and saving under a new version number. I’d be hard pressed to assign a number to this cost. As a percentage of my time, it’s probably somewhere in the single digits, low single digits at that.

Costs in a Nutshell

Two weeks up front to set up. Another two to four weeks per year in recurring maintenance costs. And the human capital investment to have the people around to pull it off. (That means pay enough to hire someone who knows what they are doing, or put forth the time and effort to educate yourself.)

Benefits

To gauge our progress, I ran tests on WebPageTest.org as I completed each website performance optimization milestone. Full results for each test are publicly available and linked from a previous post.

As a sample, I selected 3 pages. The home page represents only itself, but alone it accounts for 35% of our page views. The Current Students page represents our role-based navigation pages and the A to Z index. There are only 6 such pages, but together they account for another 28% of our page views. The Library home page represents around 250 department/office home pages which taken together account for about 21% of our page views. I didn’t worry about the remaining 16% of our traffic since it’s distributed across about 1,000 other pages.

Definitions

Start Render

The time it takes for the browser to receive enough information to start rendering the page. In other words, the time our users spend staring at a blank screen.

Load Time

Time spent loading all requests, including AJAX calls made after onLoad (when the status bar says “Done”). For us this includes loading Google Analytics.

Total Requests

The number of calls between the browser and the web server. Think of the server as something like a switchboard operator answering around 7 calls per second, non-stop. Each request is one of these calls. Our server must get really annoyed with us.

Total kb

Total kilobytes transferred throughout the entire process. This should translate directly to bandwidth cost savings.

Estimates & Assumptions

Based on Google Analytics data collected over the past 7 months, we average around 3,000,000 page views per year. Traffic calculations for the pages represented by our test sample are outlined above. Approximately 25% of our page views are from first time visitors, placing the other 75% into the repeat views category.

Home Page

First Visit

3,000,000 total page views times 35% represented in sample times 25% first time visitors = 262,500 first views per year. (The yearly savings figures below multiply that count by the per-view time saved and convert to days; for example, 262,500 views times 0.235s saved is roughly 61,700 seconds, or about 0.7 days.)

Metric         | Monday’s Test | Thursday’s Test | Difference | Percent Change   | Yearly Savings
Start Render   | 1.089s        | 0.854s          | -0.235s    | 21.579% decrease | 0.7 days
Load Time      | 6.350s        | 4.250s          | -2.100s    | 33.071% decrease | 6.4 days
Total Requests | 41            | 26              | -15        | 36.585% decrease |
Total kb       | 524           | 389             | -135       | 25.763% decrease |

Repeat Visit

3,000,000 total page views times 35% represented in sample times 75% repeat visitors = 787,500 per year.

Metric         | Monday’s Test | Thursday’s Test | Difference | Percent Change   | Yearly Savings
Start Render   | 0.920s        | 0.523s          | -0.397s    | 43.152% decrease | 3.6 days
Load Time      | 2.572s        | 1.462s          | -1.110s    | 43.157% decrease | 10.1 days
Total Requests | 38            | 4               | -34        | 89.474% decrease |
Total kb       | 59            | 52              | -7         | 11.864% decrease |

Current Students Page

First Visit

3,000,000 total page views times 28% represented in sample times 25% first time visitors = 210,000 per year.

Metric         | Monday’s Test | Thursday’s Test | Difference | Percent Change   | Yearly Savings
Start Render   | 1.075s        | 0.785s          | -0.290s    | 26.977% decrease | 0.7 days
Load Time      | 7.162s        | 4.342s          | -2.820s    | 39.374% decrease | 6.9 days
Total Requests | 38            | 26              | -12        | 31.579% decrease |
Total kb       | 521           | 382             | -139       | 26.679% decrease |

Repeat Visit

3,000,000 total page views times 28% represented in sample times 75% repeat visitors = 630,000 per year.

Metric         | Monday’s Test | Thursday’s Test | Difference | Percent Change   | Yearly Savings
Start Render   | 1.222s        | 0.519s          | -0.703s    | 57.529% decrease | 5.1 days
Load Time      | 2.625s        | 1.475s          | -1.150s    | 43.810% decrease | 8.4 days
Total Requests | 35            | 4               | -31        | 88.571% decrease |
Total kb       | 66            | 63              | -3         | 4.545% decrease  |

Library Home Page

First Visit

3,000,000 total page views times 21% represented in sample times 25% first time visitors = 157,500 per year.

Metric         | Monday’s Test | Thursday’s Test | Difference | Percent Change   | Yearly Savings
Start Render   | 1.379s        | 0.943s          | -0.436s    | 31.617% decrease | 0.8 days
Load Time      | 4.994s        | 4.983s          | -0.011s    | 0.220% decrease  | 0.02 days
Total Requests | 33            | 22              | -11        | 33.333% decrease |
Total kb       | 470           | 369             | -101       | 21.489% decrease |

Repeat Visit

3,000,000 total page views times 21% represented in sample times 75% repeat visitors = 472,500 per year.

Metric         | Monday’s Test | Thursday’s Test | Difference | Percent Change   | Yearly Savings
Start Render   | 1.249s        | 0.515s          | -0.734s    | 58.767% decrease | 4.0 days
Load Time      | 2.266s        | 1.472s          | -0.794s    | 35.040% decrease | 4.3 days
Total Requests | 30            | 3               | -27        | 90.000% decrease |
Total kb       | 70            | 80              | +10        | 14.286% increase |

Benefits in a Nutshell

Over the next year, visitors to our site will spend about 15 fewer days staring at a blank screen waiting for our server to respond and about 36 fewer days waiting for our pages to completely load.

Guide to Intermediate Website Performance Optimization

If you need a primer on website performance optimization, check out these best practices from Google and these best practices from Yahoo.

The Easy Stuff

As long as we’re designing our sites with web standards, some of these practices will already be familiar.

  • External CSS and JS files
  • Avoid CSS expressions and filters
  • Lighter markup compared to table based layouts (fewer DOM elements)
  • Specify a character set early
  • Avoid empty tags and duplicate scripts

Other best practices are super easy to implement given clean, efficient, and (mostly) standard coding practices.

  • Use <link> over @import
  • Put CSS at the top, scripts at the bottom
  • Combine external CSS and JS files
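
To illustrate that last point, combining files just means serving one stylesheet (and one script) instead of several. Here is a made-up sketch of the CSS side, not our actual files:

    /* screen.css: formerly reset.css, layout.css, and typography.css,
       now concatenated into a single file (names are hypothetical) */

    /* ---- reset ---- */
    html, body { margin: 0; padding: 0; }

    /* ---- layout ---- */
    #wrap { margin: 0 auto; width: 960px; }

    /* ---- typography ---- */
    body { font: 13px/1.5 Arial, Helvetica, sans-serif; }

One file means one HTTP request, and the section comments keep it manageable.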

The Not-So-Easy Stuff

Image Optimization

Unfortunately this isn’t as simple as using “Save for Web” in Photoshop. To really squeeze every byte out of your image bandwidth without quality loss requires a little research into image formats and which one is appropriate for a given situation. Optimization tools like Smush.it are nice too. You can see Google’s suggestions and Yahoo’s suggestions but I’ll summarize my understanding below.

  • JPG and PNG files handle almost everything quite well
  • JPG for photos, PNG for everything else (icons, backgrounds, etc.)
  • PNG-8 does everything GIF can do except for animations
  • The last browser support hurdle for PNG is alpha channel transparency, which is not possible with GIF anyway

Minification

Things start to get a bit more involved at this stage. Even “advanced” image optimization is a fire-and-forget sort of process. You do it right the first time and reap the benefits forever after. Minification effectively adds a compile step to our workflow and a lot of us webbies aren’t used to that sort of thing.

But it’s not as difficult as it may sound. I keep a non-minified version of code around to edit and test with, then minify it before putting it into production. I started out using Yahoo’s command line compressor. Since Google Page Speed is part of my standard testing routine, it’s trivial to save the minified version straight from its report once I’m satisfied with the changes.
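
To give a sense of what minification actually does, here’s a made-up rule (not our production CSS) in its readable form and again after minification:

    /* readable, un-minified source */
    #footer .social a {
        display: inline-block;
        height: 16px;
        margin-right: 4px;
        width: 16px;
    }

    /* the same rule minified: comments, whitespace, and the final semicolon stripped */
    #footer .social a{display:inline-block;height:16px;margin-right:4px;width:16px}

All of the savings come from stripped comments and whitespace, which is why keeping a commented, readable copy around for editing costs nothing in production.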

Visitors to www.volstate.edu are served minified JavaScript (40% smaller) and CSS (19% smaller).

CSS Sprites

You can find a ton of resources on this topic with a quick Google search, but I prefer the classics. This technique can get quite confusing, and I recommend taking the time to really understand it before implementing it. That being said, there are many online sprite tools to make the job easier. I used SpriteMe and found it useful enough. Just make sure you create a test page to run it on that makes use of every image you plan to sprite, and you may want to adjust the suggested sprites to minimize white space.
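
The core idea, boiled down to a made-up example (the file name, icon size, and selectors are all hypothetical): many small images become one image, and background-position picks out the piece you want.

    /* one combined image instead of three separate ones;
       each 16px icon gets its own row in icons-v1.png */
    #footer a.facebook,
    #footer a.twitter,
    #footer a.youtube {
        background-image: url(/images/icons-v1.png);
        background-repeat: no-repeat;
        display: inline-block;
        height: 16px;
        width: 16px;
    }

    /* shift the sprite so the right icon shows through the 16px window */
    #footer a.facebook { background-position: 0 0; }
    #footer a.twitter  { background-position: 0 -16px; }
    #footer a.youtube  { background-position: 0 -32px; }

Three icons, one download, one HTTP request.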

The downside is a certain loss of agility when changing or adding images to the site. If we want to add LinkedIn to our social media icons in the footer of our site, we’ll have to edit the Icons sprite rather than just uploading a new icon to the server and linking it in separately. But there are also benefits.

  • Fewer HTTP requests (reduced by about a third on our site)
  • Sprites are often no bigger than the combined file size of the individual files, sometimes even smaller
  • Commonly used images are effectively preloaded even if not used on the initial page visited

Be warned that sprites aren’t appropriate for everything. I tried to combine our PNG logo with the JPG RODP logo and the resulting sprite was 27k larger than the individual files. I think that was due to things like gradients and shadows in the RODP logo greatly increasing the number of colors required. I stuck with JPG for that.

Compression

Now we’re moving to server configuration. I’m lucky enough to have access to and admin rights on the server, so I could do this stuff myself. In another organization I might have to rely on the server admin group for this.

I used a combination of blogs and official documentation to set this up on our IIS server. It seems to be easier on Apache. Since I’ve never set it up on Apache myself I can’t vouch for the quality of the available documentation, but once again a quick Google search should turn up a wealth of information. On either platform, the important things to remember are:

  • Static, plain text content (HTML, CSS, JavaScript, etc.) should be heavily compressed and the resulting file cached server side to save on CPU cycles
  • Dynamic content (PHP, ASP, etc.) should be freshly compressed with each request — no caching — and not quite as heavily compressed as static content
  • Compressing binary files such as images, PDFs, Word documents, etc. costs CPU cycles on the server with little to no benefit — in some cases compressing such files can even make them larger

Like image optimization, this is something you set up right the first time and then reap the benefits with no recurring cost. It’s worth it to buddy up with the server admins to get this done. Compression has netted us around a 70% reduction in transferred file size for both static and dynamic content, and our hardware isn’t remotely taxed even when serving up around 10,000 visits per day.

Browser & Proxy Caching

I took this step only after some heavy thought. This involves more server configuration and has a dramatic impact on workflow. Essentially we set up the server to tell the browser to not even bother asking for new copies of static content until a given date or until after X time has passed. This content is loaded from the local cache. The benefit is nearly instant page renders for repeat visitors. Of course, that means if the file has changed since it was saved in the local cache, the user won’t see that change.

On the Server

After snooping around with Live HTTP Headers I determined that our server was already sending last-modified information, although our plain text documents seem to always use the create date. Could be because I edit on a Mac and upload to a Windows server. Binary files like images are fine. Setting up content expiration on IIS was pretty easy. Be mindful that all binary files will be cached. The browser won’t even request the file for however long you specify (or until after whatever date you specify). For us that was a problem because we have a lot of PDF newsletters that are updated by simply saving over the old file on the server. So rather than enabling content expiration on the entire site, I enabled it on just the CSS, JavaScript, and image directories. That covers the vast majority of the content that will benefit the most from client side caching.

Workflow Changes

To get users to updated content, we have to change the file name. That means we also have to change the markup, CSS, and/or JavaScript that references these files. HTML pages won’t be cached, so in the case of markup we just have the one extra step. Of course, if a link is hard-coded into a few thousand pages, that one extra step can get pretty massive. Server side includes help us avoid those kinds of problems. CSS and JavaScript files themselves will be cached, and that complicates things. To return to a previous example, let’s say we add a new social media icon to our footer. Here’s the process to do something like that, with a sketch of the resulting CSS after the list.

  1. Edit sprite image to incorporate the new icon
  2. Save the updated sprite with a new file name — I recommend version numbering
  3. Edit un-minified CSS to reference the new image and adjust other styles as appropriate
  4. Save CSS with a new file name
  5. Test and debug cycle
  6. Minify CSS
  7. Update <link> in universal header include to point to updated CSS file
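
The end result looks roughly like this (file names, version numbers, and offsets are made up for illustration; this isn’t our production CSS):

    /* screen-v8.css
       Change log:
         v8 - added LinkedIn icon; depends on icons-v2.png
         v7 - previous release; depended on icons-v1.png */

    #footer a.linkedin {
        /* the new icon sits in a new row added to the bottom of the re-versioned sprite */
        background: url(/images/icons-v2.png) no-repeat 0 -48px;
        display: inline-block;
        height: 16px;
        width: 16px;
    }

The universal header include then points at screen-v8.css, so anyone with the old files cached keeps a consistent v7 stylesheet and v1 sprite until their cache expires.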

That process is a bit more complex than uploading the new icon and editing the CSS file as needed. If I miss something in the test and debug cycle, I can’t just make a quick edit and call it a day. I have to start the whole process over again, because anyone who hit the site in the meantime will be stuck with a cached copy of those files for the next 6 months. My CSS and JavaScript files now have a change log in the comments at the top, complete with file dependencies. These sorts of workflow changes are things we should be doing anyway, but getting serious about website performance forces us to treat web development a bit more like software engineering.

Benefits

After incorporating all these changes, first time page loads for first time visitors are about 2 seconds faster. Subsequent page loads, even for first time visitors, are about 1 second faster. That means in a year’s time our users will spend about 1 month less time waiting for our pages to load. Plus the gains that brings to the user experience (a bit harder to quantify). Plus the associated reduction in bandwidth costs (a question for our IT department). Plus possibly gaining favor with search engines.

Monday, June 14, 2010

Optimizing Site Performance

Purpose & Goal

Right now I’m running some tests through webpagetest.org to establish baselines for site performance. This week I hope to generate some CSS sprites for our common images and set up gzip and expiration headers on the server. Those are the last 3 big things pointed out by YSlow (actually I’m using Google Page Speed, but YSlow is cool too). Hopefully soon I'll be able to run a new set of tests to compare to the baselines I'm running now.

Sampling

I'm trying to get a representative sample of pages to test. The home page gets the most traffic, followed by the Current Students page. Pride Online has the most popular departmental home page, but it's utilitarian and not representative of other pages. That moves us down to the home page for the library.

Those 3 pages realistically represent only about 1/6 of our total pages, but it’s the most heavily visited 1/6. I could add one more test case and extend coverage to a majority of our pages, but we’d see diminishing returns in terms of traffic (only about 16% of total site traffic). Also, the remaining 5/6 of pages are mostly text heavy and image light. The benefits to those pages will come from optimizing the overall site template, which will be partially reflected in the 3 chosen test cases.

Thanks to Google Analytics for the ability to access that info easily.

The Site Performance widget in Google Webmaster Tools says we gained nearly half a second in our rendering time around June 6th. That half second was enough to move us up from around the 73rd percentile to the 82nd percentile. I have no idea why we saw this jump in overall site performance but I'm glad to have it. Maybe we can even hold onto it.

The Results Are (Mostly) In

The 3rd and final test is still running as I type this. I’ll link out to the results as they become available.

Step 1: CSS Sprites

I'm using SpriteMe and my own best judgement to create some CSS sprites for commonly used background images. I'm assuming SpriteMe already has something like Smush.it built in.

Results

Ok, maybe I need to smush my sprites after all. I've added about 60kb to each page load while freeing up about 15 HTTP requests. Smush.it can’t find anything to optimize, but the image optimizer in YSlow can shave about 5k total from 2 of the sprites. SpriteMe makes some less-than-optimal choices, even occasionally introducing bugs, but like all software it has to make certain assumptions and those don’t always pan out. So I need to improve upon the default sprites while I’m at it. With these newly optimized sprites, let’s look at the trade-offs.

jQuery UI Theme Icons for Play Controls

The first sprite I made was actually already made for me thanks to the jQuery UI Theme Roller. I used the blue and red versions of the UI sprite to replace the play control icons under the featured stories on the home page. The bad news is that it almost doubles the raw file size. In exchange, I pre-load a lot of UI elements that we're not currently using anywhere else. But I plan to make more use of jQuery UI in the future, so I think this is a worthwhile change.

Metric        | Baseline | With Sprites | Change
HTTP Requests | 8        | 2            | -6
Total Bytes   | 5,310    | 9,724        | +4,414

Stripes Sprite

Both the header and the footer have background stripes that go all the way across the page. These were tiny, 1 pixel wide repeated images. But they were easy enough to combine.

Metric        | Baseline | With Sprites | Change
HTTP Requests | 2        | 1            | -1
Total Bytes   | 905      | 122          | -783
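
Horizontally repeating backgrounds can still be sprited by stacking them vertically and shifting with background-position. A sketch of the idea with made-up heights, file name, and selectors:

    /* stripes-v1.png: 1px wide, 30px tall; header stripe in rows 0-19,
       footer stripe in rows 20-29 (hypothetical values) */
    #header-stripe {
        background: url(/images/stripes-v1.png) repeat-x 0 0;
        height: 20px;
    }
    #footer-stripe {
        /* shift the image up 20px so the footer stripe lines up with the top edge;
           this only works if the element is no taller than its stripe */
        background: url(/images/stripes-v1.png) repeat-x 0 -20px;
        height: 10px;
    }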

Image & Badge Drop Shadows

Some day I’ll be able to pull this off with pure CSS but in the meantime I’m stuck with background images. The one for the standard large image gets used on most pages. The one for smaller images at the same aspect ratio and the badges gets used fairly often too. The other 2 don’t see as much use. Some pages don’t use any of these, but the majority of our high traffic pages do.

Metric        | Baseline | With Sprites | Change
HTTP Requests | 4        | 1            | -3
Total Bytes   | 4,018    | 5,092        | +1,074

Icons

The image produced by SpriteMe left extra white space to the right of the 16px icons, screwing up the placement. I had to edit the sprite in Photoshop. After that, Smush.it actually did a better job compressing the resulting image than YSlow.

Metric        | Baseline | With Sprites | Change
HTTP Requests | 18       | 1            | -17
Total Bytes   | 14,589   | 13,677       | -912

Logos

I combined our logo with the RODP logo that’s featured on every page. But that only nets us 1 fewer HTTP request and, as you’ll see below, costs us significant bytes due to the differences between the JPG and PNG image formats.

Metric        | Baseline | With Sprites | Change
HTTP Requests | 2        | 1            | -1
Total Bytes   | 12,270   | 39,876       | +27,606

In a Nutshell

In the best case, a single page that references all of these images (which is not terribly realistic), we’ve traded 28 fewer HTTP requests for 31,399 more bytes. That’s about 22 additional HTTP packets. Based on our test cases we’re realistically saving 12 – 15 HTTP requests per page. I know we’ll see a big pay-off when we start effectively caching and gzipping this stuff, but I have to admit I’m disappointed with this first round of results.

Step 1.1: Learning from Mistakes

Ok, now I understand why the RODP logo was initially flagged as something not to include in a sprite: it’s a JPG. I need to roll back the Logos sprite. I also need to alter the way we’re doing the random headers. We’ve been using something based on Dan Benjamin’s method from A List Apart, but in terms of site performance it’s one of our slowest resources.

Results

We’re starting to see some minor pay-off on repeat views. Start render time on the Current Students page is now under 1 second. That’s pretty awesome, but we’ve still got work to do.

Step 2: GZip

I followed this guide as well as some official documentation from Microsoft. This online gzip test says we’re still not using compression, but my web developer toolbar begs to differ.

Results

I just alphabetized the properties in each CSS declaration, primarily as a way to clean up my CSS and make it easier to manage. But Google’s Page Speed documentation points out that alphabetized CSS may compress slightly more efficiently under gzip. Checking the alphabetized, minified, and gzipped version against the merely minified and gzipped version with the Web Developer Toolbar (Information > View Document Size) shows 8k either way, and the toolbar doesn’t report anything smaller than a kilobyte. So if alphabetizing changed our gzip efficiency at all, it’s by less than 1k out of 8k, or 12.5%. If we assume the toolbar rounds to the nearest kilobyte, then it’s less than 6.25%.
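
For the curious, alphabetizing just means sorting the properties inside each rule. A made-up before and after:

    /* before */
    #content h2 { margin: 0 0 10px; color: #333; font-size: 1.4em; border-bottom: 1px solid #ccc; }

    /* after: same rule, properties in alphabetical order */
    #content h2 { border-bottom: 1px solid #ccc; color: #333; font-size: 1.4em; margin: 0 0 10px; }

The theory is that consistent property ordering across rules creates more repeated strings for gzip’s dictionary to exploit.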

Step 3: Browser Cache

This one will be labor intensive. Before I put in the hours to set expiration dates and last-modified dates for our various static resources, I want to make sure they are properly optimized. We’ve covered CSS and JavaScript, but all the images that didn’t find their way into a sprite will need to be tested. I started this entry on Monday morning. It is now 1:30 on Wednesday. I think this step will probably keep me busy the rest of today and a good chunk of tomorrow.

Ok, the clean-up and optimization process has saved me about 10,000,000 bytes and 400 files. Gzip has been running for a full day and the server’s resources don’t seem to be at all taxed. So far so good. Time to dive into Leverage Browser Caching.

I have a problem in that IIS reports the wrong last-modified date for certain files. Images seem fine; PHP files and other plain text files are not. Maybe I’m missing something in Dreamweaver. Our server also sends ETag hashes, which are newer than, and in theory superior to, last-modified headers, but sending both is redundant. ETags are apparently frowned upon when using multiple servers, such as a Content Delivery Network. All our content comes from just the one server, so I don’t think it’s an issue for us. At worst we’re sending a single redundant HTTP response header.

So half the browser caching equation was essentially taken care of for us by IIS’s default settings. The other half of that equation is content expiration. I did not like that IIS just gave me a single check box to turn content expiration on; I wanted to be able to set it up by file extension. No such luck. I turned it on and did some testing with the Live HTTP Headers extension for Firefox. I was able to confirm that our PHP pages were not caching, which is good. The last thing I want is for someone to spend the next 6 months with an unchanging copy of our home page stuck in their browser cache. CSS and JavaScript files were caching, along with images. I already use version numbering on CSS, JavaScript, and image sprites, and for the most part other images will have specific names anyway, so that’s good. Then I tested PDF files and did not like what I found.

I downloaded the PDF of our campus map and saw that the response headers told the browser to cache it for the next 6 months as well. We may need to update the map in the next 6 months. Maybe an office location changes to a new building or something. Worse still, many of our PDFs are newsletters. Monthly newsletters. Some offices have an archive listing years’ worth of issues while others just link to the most recent issue. In the latter case, I just save the new issue over the old issue on the server. With browsers caching these files, that wouldn’t work anymore. I’d have to give each issue a distinct name and adjust the HTML links to match, greatly complicating that process.

I rolled back those changes and tried setting up expiration on a directory by directory basis. So far I’ve only done /images/, /css/, and /js/. That covers the vast majority of our static files. Occasionally a directory off of root will have its own images directory; for now I’m leaving those alone. The handful of child directories inside /images/ also apparently did not inherit the expiration settings of their parent folder.

The downside to all this experimenting is that I have to bounce the IIS service for changes to take effect. That only takes about 8 seconds by my count, but I hate to think of someone submitting their final form for online orientation during that tiny window.

Page Speed still says we fail the browser cache check, but that’s due to the FEMA widget currently on our site. All resources being served from volstate.edu pass. So let’s run another batch of tests.

Results

Conclusion

This has been less a blog entry and more 4 days of notes to myself. Tomorrow I will try to recap the high points and present my data analysis.