Welcome
Web Site Optimization Presentation By: Sunil Patil
Our sponsors:
Agenda for the session
  • What is web site optimization?
  • Why should you worry about web site optimization?
  • Suggestions for optimizing a web site
      • Make fewer requests
      • Use caching
      • Minimize request overhead
      • Minimize response size
      • Optimize browser rendering
  • Tools
What is web site optimization?
  • End users care about how long a page takes to render in their browser (perceived performance) and how quickly they can move from one page to another.
  • To render a page, the browser performs the following steps:
      • Make the request
      • Get the HTML response (the step we usually focus on most)
      • Parse the HTML response
      • Find the resources (JS, CSS, images) required by the page
      • Download the resources
      • Parse the resources
      • Execute the resources
  • Web site optimization tries to improve each of these steps and, with them, the perceived performance of the web site.
Example: Connected.atech.com
  • Time to generate HTML: 0.9 sec
  • Time to render page: 40 sec
Advantages of web site optimization: why you should think about optimizing your web site
  • Less than 15-20% of the time is spent generating and downloading HTML.
      • Improving this part is not easy. It might require:
          • Creating a new architecture
          • Refactoring code, introducing caching
          • Tuning the backend
      • If you improve this part by, say, 50%, the overall gain will be 8-10%.
  • More than 80% of the time is spent downloading, parsing and executing resources.
      • Improving this part is comparatively easy:
          • Configuration changes at infrastructure level
          • Additional tasks and guidelines during the development and build phases
          • Additional components at infrastructure level
      • If you improve this part by 50%, the overall gain will be 40%.
  • Web pages are getting richer and more complex (50+ resources, Ajax, ...).
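The gain figures on this slide follow from multiplying a phase's share of total time by the improvement achieved in that phase. A minimal sketch of that arithmetic, where the function name `overall_gain` is made up for illustration and the 20% / 80% split is taken from the slide:

```python
def overall_gain(share_of_total: float, improvement: float) -> float:
    """Fraction of total page time saved when one phase gets faster.

    share_of_total: fraction of total time spent in the phase (0..1)
    improvement:    fraction by which that phase's time is reduced (0..1)
    """
    return share_of_total * improvement


# Figures from the slide: HTML generation is at most about 20% of the time,
# resource download/parse/execute is about 80%.
backend_gain = overall_gain(0.20, 0.50)
frontend_gain = overall_gain(0.80, 0.50)
print(f"backend: {backend_gain:.0%}, front end: {frontend_gain:.0%}")
```

The same 50% improvement is worth four times as much when applied to the phase that dominates the page load.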
Lessons learned from web site optimization experience at a client
  • Load testing does not fully capture all performance-related problems.
      • Business users tend to use older browsers than technical users.
      • The location of users matters.
      • Network speed matters.
  • Changing HTTP server level configuration takes time.
      • HTTP servers are normally shared across different teams.
      • Application teams are on different release cycles, so they might not make changes in their code.
  • We underestimate the impact of web site optimization.
  • As sites get richer and more complex, the need for web site optimization grows, and a lot of research is happening in this area.
1. Make fewer requests
What is a parallel connection? How parallel connections in the browser work
  • The HTTP 1.1 specification says that a browser should allow at most two parallel connections per host name.
  • So if your web page has 50 resources, the browser starts 2 downloads and queues the rest. Once a download finishes, it starts the next download from the queue.
  • The number of download rounds is roughly N/X, where N is the number of resources to fetch from a host and X is the number of parallel connections to that host.
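The N/X estimate can be made concrete with a small calculation. This sketch assumes X means the number of parallel connections per host (the slide does not define it) and that every resource takes one round trip of equal length, which is a simplification; `download_rounds` is a name chosen for this example.

```python
import math


def download_rounds(resources: int, parallel_connections: int) -> int:
    """Rounds of downloads needed when only a fixed number run at once."""
    return math.ceil(resources / parallel_connections)


# 50 resources on one host name, as in the slide's example.
rounds_with_2 = download_rounds(50, 2)  # older browsers: 2 connections per host
rounds_with_6 = download_rounds(50, 6)  # newer browsers: 6 connections per host
print(rounds_with_2, rounds_with_6)
```

Under these assumptions, going from 2 to 6 connections cuts the number of rounds from 25 to 9.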
Number of parallel connections depends on the browser
  • Older browsers follow the rule of 2 parallel connections per host, but newer browsers use more.

      Browser        Parallel connections per host
      IE 6/7         2
      IE 8           6
      Firefox 2      2
      Firefox 3      6
      Safari 3/4     4
      Opera          4
      Chrome         4

  • A browser can reduce the number of parallel connections in special cases.
      • IE 8 on a dial-up connection uses 2 parallel connections.
Effect of number of parallel connections
[Figure: download timelines with 2 parallel connections and with 6 parallel connections]
Script blocks parallel download: a script download is a stop-all event in some browsers
  • When a browser encounters a <script> tag in HTML, it stops everything until it has downloaded, parsed and executed the script.
      • The script might call document.write(), which could change the page content, so the browser waits for the script to download and execute.
      • If your script performs a long-running operation on load, it can cause issues.
  • Scripts on a page must be executed in the proper order.
      • second.js might depend on first.js, so first.js must be executed before second.js.
  • Some newer browsers download scripts in parallel but still execute them in order.
Effect of browsers that block everything for a script
[Figure: download timeline in a browser that blocks everything while downloading a script]
How to improve parallelization: what can we do to achieve more parallelization
  • Browsers limit the number of parallel connections per host name, so the easiest way around the limit is to use multiple host names for downloading resources.
      • You can use one host name to download the HTML and up to 4 host names for downloading other resources.
      • You can use www.static-atech.com for downloading resources. www.static-atech.com actually points to the same server.
  • Combine files of similar type.
      • Use tools like Dojo ShrinkSafe or YUI Compressor to combine multiple JS files.
      • Create a custom Dojo build with additional classes, widgets, etc.
      • Use YUI Compressor to combine multiple .css files.
  • Use image maps and CSS sprites.
  • Inline smaller or non-cacheable resources.
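When resources are spread over several host names, each resource should always map to the same host, otherwise the browser would download and cache it once per host. One way to sketch this is a stable hash of the path. The host names below are hypothetical placeholders, and `shard_url` is a helper invented for this example.

```python
import zlib

# Hypothetical host names that all point to the same server.
SHARD_HOSTS = ["static1.example.com", "static2.example.com"]


def shard_url(path: str, hosts=SHARD_HOSTS) -> str:
    """Pick a host for a resource path, always the same one for the same path."""
    # crc32 is stable across processes and restarts, unlike Python's built-in
    # hash() for strings, so a given path keeps its host and stays cacheable.
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return f"http://{hosts[index]}{path}"


print(shard_url("/js/app.js"))
print(shard_url("/css/site.css"))
```

Keeping the list of hosts short matters, because every extra host name costs the browser one more DNS lookup.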
2. Use caching
What is expiry-based caching?
  • Setting an expiry caching header instructs the browser to load a resource from disk instead of the network. You tell the browser that it can cache the response for a certain period of time.
      • The HTTP 1.1 specification introduced the Cache-Control header. You can set Cache-Control: max-age=<noofseconds> and the browser will cache the resource for <noofseconds>. If the resource is requested again during that time, the browser uses the copy on disk.
      • The HTTP 1.0 specification had the Expires header. You can set Expires: Fri, 01 Oct 2010 12:00:00 GMT (a date in GMT format). The browser will cache the resource and use it until 1st October.
  • If you set both Cache-Control and Expires, Cache-Control takes precedence. Older HTTP clients don't understand Cache-Control, so they rely on Expires.
  • A resource might still be purged from the cache if the browser's cache size limit is reached.
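The two headers carry the same information in different forms: Cache-Control gives a lifetime in seconds, Expires gives an absolute GMT date. A minimal sketch that derives both from one lifetime, using only the Python standard library; the function name `expiry_headers` and the sample timestamp are assumptions made for this example.

```python
import calendar
import time
from email.utils import formatdate
from typing import Optional


def expiry_headers(max_age_seconds: int, now: Optional[float] = None) -> dict:
    """Build matching Cache-Control and Expires headers for one lifetime."""
    now = time.time() if now is None else now
    return {
        # HTTP 1.1 clients: relative lifetime in seconds.
        "Cache-Control": f"max-age={max_age_seconds}",
        # HTTP 1.0 clients: absolute date in GMT.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }


# Response generated at 11:00 GMT on 1 Oct 2010, cacheable for one hour.
served_at = calendar.timegm((2010, 10, 1, 11, 0, 0))
headers = expiry_headers(3600, now=served_at)
print(headers)
```

Sending both headers covers old and new clients, and deriving them from the same value keeps them from contradicting each other.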
What happens if you don't set caching headers: how browsers and caches deal with the absence of expiry-related headers
  • If you don't want the browser to reuse a cached copy of a resource, you must say so explicitly: Cache-Control: no-cache forces the browser to revalidate before reuse, and Cache-Control: no-store prevents the response from being stored at all.
  • If you set neither Expires nor Cache-Control, browsers and caching proxies can use heuristic expiration.
      • HTTP clients read the value of Last-Modified. If the resource has not changed for 10 months, it is cached for 1 month (Expiration Time = Now + 0.1 * (Time since Last-Modified)).
      • This is used by Firefox, IE 7 and caching proxies.
      • The basic idea is that a resource that has not changed for a long time is less likely to change in the near future.
  • Different clients might use different algorithms to compute the expiration time, so the result can be unpredictable.
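The heuristic formula from this slide can be written out directly. This is a sketch of that one formula only; as the slide notes, real clients may use other factors or caps, and the name `heuristic_expiry` is made up here.

```python
from datetime import datetime, timedelta, timezone


def heuristic_expiry(now: datetime, last_modified: datetime,
                     factor: float = 0.1) -> datetime:
    """Expiration Time = Now + factor * (time since Last-Modified)."""
    return now + factor * (now - last_modified)


now = datetime(2010, 12, 19, tzinfo=timezone.utc)
last_modified = now - timedelta(days=300)   # unchanged for roughly 10 months
expires = heuristic_expiry(now, last_modified)
print(expires - now)                        # lifetime the client would assume
```

A resource that has been stable for about 300 days ends up cacheable for about 30 days, which matches the "10 months gives 1 month" rule of thumb.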
What can you do to improve caching? Use aggressive caching of static resources
  • If you don't know when a resource will be updated, configure your site so that HTML never gets cached and other resources get cached for a long time (months or years).
  • The HTML document holds references to all the resources on the page, so when a resource changes, change its reference (URL) in the HTML:
      • Change the file name, e.g. from test.js to test_v1.js.
      • Change the folder, e.g. from test.js to v1/test.js.
      • Create a mod_rewrite rule, e.g. v1/test.js, v2/test.js and v3/test.js all get mapped to test.js.
  • If you know precisely when a resource will be updated, set Expires to that date.
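The folder-based versioning scheme has two halves: the page emits URLs with a version prefix, and the server maps any versioned URL back to the single real file. The sketch below mirrors both halves in Python to show the mapping; on an actual Apache server the second half would be a mod_rewrite rule, and both function names are invented for illustration.

```python
import re

# Matches a leading version folder such as /v1/ or /v27/.
_VERSION_PREFIX = re.compile(r"^/v\d+/")


def versioned_url(path: str, version: int) -> str:
    """URL to put in the HTML: /test.js with version 2 becomes /v2/test.js."""
    return f"/v{version}{path}"


def strip_version(url: str) -> str:
    """What the rewrite rule does: /v2/test.js is served from /test.js."""
    return _VERSION_PREFIX.sub("/", url, count=1)


print(versioned_url("/test.js", 2))
print(strip_version("/v3/test.js"))
```

Because the browser treats /v2/test.js and /v3/test.js as different resources, bumping the version forces a fresh download even when the old copy was cached for years.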
Caching static resources: how to configure caching at HTTP server level
  • Apache HTTP Server has the mod_expires module, which can be used to generate expiry-based caching headers in the response.
      • It sets both the Cache-Control and Expires headers.
      • It can set headers for static content served by the HTTP server as well as static content returned by WebSphere's File Serving Servlet.
      • It gives granular control: headers can be set globally or at URL or directory level.
      • It can apply different expiry rules based on response content type, file extension, etc.
  • The following configuration says that GIF images should be cached for 3 months and other resources for 1 month:

      ExpiresActive On
      ExpiresDefault "access plus 1 month"
      ExpiresByType image/gif "access plus 3 months"
Caching dynamic resources: how to configure caching of resources served by WebSphere
  • The File Serving Servlet (used for serving static files) does not set Expires or Cache-Control headers. You can add a servlet filter to your web application to set them.
  • You can set Expires and Cache-Control headers in a servlet.
  • WebSphere Portal Server (WPS) has a navigatorservice.properties file that lets you configure overall portal-level caching and caching for ATOM feeds.
  • You can configure WPS to make anonymous pages cacheable, but the process is complicated.
  • The Portlet Specification 2.0 has the concept of an expiration cache, which you can use to set Cache-Control max-age and the public/private scope.
      • Set expiration-cache and cache-scope in portlet.xml.
      • Use ResourceResponse.getCacheControl() to get a javax.portlet.CacheControl object and call its setExpirationTime() and setPublicScope() methods.
      • Use ResourceURL.setCacheability() so that WPS generates cache-friendly URLs.
What is validation-based caching?
  • When a static file is served (by Apache HTTP Server or WebSphere's File Serving Servlet), the server sends a Last-Modified header with a value equal to the date the file was last modified (OS date).
  • Apache HTTP Server can generate an ETag for static files based on modification time, size, etc.
  • Unless you set Cache-Control: no-store, the browser stores the response in its cache.
  • Every time you request a resource whose cached copy has no expiry information or is stale, the browser sends a conditional GET request with If-Modified-Since and If-None-Match headers.
  • The server checks whether the resource has actually been modified. If not, it returns HTTP 304 with no body (a response of about 250 bytes on average) to indicate that the browser can use its cached copy.
  • Validation-based caching is better than getting a full HTTP 200 response with a body, but worse than an unexpired cached resource, which requires no HTTP request at all.
How cache validation works
  • The HTTP specification has the concept of a conditional GET, which helps a client avoid downloading the same resource repeatedly.
      • The server can send Last-Modified and ETag headers in the response.
      • The HTTP client (browser, caching proxy) copies the resource to its disk cache along with the headers.
      • The next time you request that resource, the client adds If-Modified-Since and If-None-Match headers to the request, with the values it has on disk.
      • The server compares these values to the version it has. If the resource has changed, it sends HTTP 200 OK with the full resource in the response body. If the resource has not changed, it sends HTTP 304 Not Modified with headers only.
  • The original resource could be, say, 100 KB, while the HTTP 304 response is 200-250 bytes, so you save on download size.
  • The client still has to make a request, using one of the connections from the parallel connection pool.
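The server-side decision in a conditional GET can be sketched as a small function. This is a simplified model, not how any particular server implements it: dates are compared as exact strings rather than parsed, and the function name `respond` and the sample values are assumptions for illustration.

```python
def respond(request_headers: dict, etag: str, last_modified: str, body: bytes):
    """Return (status, headers, body) for a GET of one resource."""
    validators = {"ETag": etag, "Last-Modified": last_modified}
    inm = request_headers.get("If-None-Match")
    ims = request_headers.get("If-Modified-Since")

    sent_any = inm is not None or ims is not None
    # Every validator the client sent must still match the current version.
    etag_ok = inm is None or inm == etag
    date_ok = ims is None or ims == last_modified

    if sent_any and etag_ok and date_ok:
        return 304, validators, b""   # headers only, client reuses its copy
    return 200, validators, body      # full response


ETAG = '"abc123"'
MODIFIED = "Fri, 01 Oct 2010 12:00:00 GMT"
BODY = b"x" * 100_000

first = respond({}, ETAG, MODIFIED, BODY)
revalidated = respond(
    {"If-None-Match": ETAG, "If-Modified-Since": MODIFIED}, ETAG, MODIFIED, BODY
)
print(first[0], len(first[2]), revalidated[0], len(revalidated[2]))
```

The first request pays for the full 100 KB body, while the revalidation returns only a status line and headers.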
How validation caching works
Configure ETag: why you should consider disabling ETag
  • ETags were introduced to help in environments with multiple HTTP servers.
  • The HTTP server can generate an ETag (similar to a version number) for static resources. It is enabled by default.
      • The default ETag format is INode MTime Size.
  • Apache HTTP Server sends both Last-Modified and ETag headers. You can't disable Last-Modified.
  • The browser sends both If-Modified-Since and If-None-Match headers to check whether the resource is still valid.
  • As per the HTTP specification, both the If-Modified-Since (IMS) and If-None-Match (INM) conditions must be met for the server to return HTTP 304 (the desired behavior, with a smaller response).
  • If your request goes to another HTTP server whose copy of the file has a different inode but the same date, the server returns HTTP 200 instead of HTTP 304.
  • You can disable ETag by adding FileETag None to httpd.conf, or at least restrict it with FileETag MTime.
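The multi-server problem comes from the inode component: two servers holding identical copies of a file will almost never store them at the same inode. The sketch below builds an ETag-like string from the same three attributes to show the effect. The exact string format is illustrative and not Apache's, and the attribute values are invented.

```python
def make_etag(inode: int, mtime: int, size: int,
              parts=("inode", "mtime", "size")) -> str:
    """Join the selected file attributes, in hex, into a quoted tag."""
    values = {"inode": inode, "mtime": mtime, "size": size}
    return '"' + "-".join(format(values[p], "x") for p in parts) + '"'


# The same file deployed on two servers: same date and size, different inode.
server_a = {"inode": 1048601, "mtime": 1285934400, "size": 20480}
server_b = {"inode": 7340125, "mtime": 1285934400, "size": 20480}

default_a = make_etag(**server_a)
default_b = make_etag(**server_b)
mtime_a = make_etag(**server_a, parts=("mtime",))
mtime_b = make_etag(**server_b, parts=("mtime",))

print(default_a == default_b)  # tags disagree, so revalidation fails
print(mtime_a == mtime_b)      # tags agree when only MTime is used
```

With the default three-part tag, a browser that cached the file from one server gets a full HTTP 200 from the other, which is the case that FileETag MTime or FileETag None avoids.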
Leverage proxy caching: how to cache resources across users
  • A big portion of internet traffic goes through caching proxies:
      • A proxy provided by the ISP
      • A proxy provided by the corporate network for outbound connections
      • A proxy in front of your web server for inbound connections
  • Enabling public caching in the HTTP headers of static resources allows the browser to download resources from a nearby proxy server rather than from a remote origin server.
  • The proxy shares cached resources across users.
  • Use the Cache-Control: public header to indicate that a resource can be cached by public web proxies in addition to the browser that issued the request.
  • Set an appropriate Vary header (Vary: Accept-Encoding, User-Agent).
3. Minimize request overhead
HTTP request: what happens when the browser requests a resource
  • When you access a resource in your browser, it performs the following steps:
      • DNS resolution
      • Establish the connection
      • Send the request
      • Receive the response
  • You should try to reduce the overhead of each of these steps.
Reduce DNS resolution time
  • Before a browser establishes a connection with a server, it must resolve the host name into an IP address. This value is cached by:
      • The operating system
      • The browser
  • DNS cache entries have a short lifetime, and a lookup might have to traverse the DNS hierarchy to get the record.
  • Reducing the number of unique host names from which resources are served cuts down the number of DNS resolutions the browser has to make.
  • Don't use more than 1 host for fewer than 5 resources, and balance resources across host names.
  • Serve early-loaded JavaScript from the same domain as the main document.
      • Browsers block parallel downloads while downloading JavaScript, so the script should arrive as fast as possible.
Use HTTP persistent connections: what a persistent connection is and why you should care
  • Web clients often open several connections to the same site to download the HTML and related resources.
  • HTTP 1.1 (and Keep-Alive in HTTP 1.0) allows HTTP devices to keep a TCP connection open after a transaction completes and to reuse it for future HTTP requests. Connections that are kept open after a transaction are called persistent connections.
      • You avoid slow connection setup.
      • You avoid the slow-start congestion adaptation phase.
  • Persistent connections are more efficient when used in conjunction with parallel connections.
  • Starting with HTTP 1.1, connections are persistent by default unless you set Connection: close.
  • You can set "KeepAlive On" in Apache to turn on persistent connections.
Persistent Connection
Size of HTTP request: why the size of the HTTP request matters
  • Most users have an asymmetric connection, with an upload-to-download speed ratio of 1:4 to 1:20. At 1:20, uploading 500 bytes takes as long as downloading 10 KB.
  • Data in the HTTP request can't be compressed.
  • Try to keep the request small enough to fit in one packet of 1500 bytes.
  • The initial HTTP request suffers from startup throttling.
  • An HTTP request is made up of the following:
      • Request headers set by the browser
      • URL and referrer URL
      • Cookies
  • You should try to reduce the size of each of these components.
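Both points on this slide, the upload/download asymmetry and the one-packet target, can be checked with simple arithmetic. The sketch below counts the bytes of a serialized request and converts an upload size into the download size that would take equally long. The header values, the cookie length and the function names are all invented for illustration.

```python
PACKET_BYTES = 1500  # one-packet target from the slide


def request_size(method: str, url: str, headers: dict) -> int:
    """Bytes on the wire for a request with no body."""
    request_line = f"{method} {url} HTTP/1.1\r\n"
    header_lines = "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    return len((request_line + header_lines + "\r\n").encode("utf-8"))


def equivalent_download_bytes(upload_bytes: int, ratio: int) -> int:
    """Download size that takes as long as the upload on a 1:ratio link."""
    return upload_bytes * ratio


base_headers = {"Host": "www.example.com", "Accept": "*/*"}
small = request_size("GET", "/css/site.css", base_headers)
with_cookie = request_size(
    "GET", "/css/site.css", {**base_headers, "Cookie": "Token=" + "x" * 1600}
)

print(small, with_cookie, equivalent_download_bytes(500, 20))
```

A single large cookie is enough to push an otherwise tiny request over the one-packet limit.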
Request for static resource
Minimize cookie size: how you can reduce cookie size
  • Enterprise applications need at least a few big cookies that can't be avoided:
      • LTPA token, JSESSIONID, SSO-related cookies
  • Every time a client sends an HTTP request, it has to send along all the cookies that have been set for that domain and path.
  • Keep most of the cookie payload in server-side storage and send only a key in the cookie.
  • Serving static resources from a cookie-less domain reduces the total size of the requests made for a page.
      • Static resources do not need cookies.
      • A typical static file is less than 10 KB, so more time is spent making the request than receiving the response.
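Keeping the payload on the server and sending only a key can be sketched with an in-memory dictionary. This is a toy model under stated assumptions: a real application would use its session framework or a shared store with expiry, and the names `save_session`, `load_session` and the sample payload are made up.

```python
import json
import secrets

# Stand-in for server-side storage (session table, cache, database, ...).
_session_store = {}


def save_session(payload: dict) -> str:
    """Store the payload on the server and return the key to put in the cookie."""
    key = secrets.token_urlsafe(16)   # random, hard-to-guess identifier
    _session_store[key] = payload
    return key


def load_session(key: str):
    """Look up the payload for a cookie value, or None if it is unknown."""
    return _session_store.get(key)


payload = {"user": "jdoe", "roles": ["editor"] * 50, "theme": "dark"}
cookie_value = save_session(payload)

print(len(json.dumps(payload)), "bytes of payload,", len(cookie_value), "bytes in the cookie")
```

Every request then carries a short fixed-size key instead of the full payload.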
4. Minimize response size
Compress response: use GZip for compressing the response
  • Compressing resources with GZip can reduce their size by about 70%.
  • Most modern browsers support compressed data. The browser sends the Accept-Encoding header to specify which encodings it supports.
  • You can configure the HTTP server to compress both the static files it serves and the dynamic content that passes through it.
  • Compress only text files such as HTML, JavaScript and CSS.
  • Do not compress binary files such as images and PDFs. They are already compressed, and their size might increase after GZip.
  • Do not compress resources smaller than 150 bytes.
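The advice to compress text but skip very small resources can be observed with Python's standard gzip module. The sample markup below is synthetic and highly repetitive, so it compresses better than a typical page would; `gzip_saving` is a helper name chosen for this sketch.

```python
import gzip


def gzip_saving(data: bytes) -> float:
    """Fraction of bytes saved by gzip; negative means the data grew."""
    return 1 - len(gzip.compress(data)) / len(data)


html = b"<li class='item'>Web site optimization</li>\n" * 200
tiny = b"ok"

print(f"markup saving: {gzip_saving(html):.0%}")
print(f"tiny payload grew: {gzip_saving(tiny) < 0}")
```

The tiny payload ends up larger than the original because the gzip header and trailer cost more bytes than compression can recover.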
Configure GZip on Apache HTTP Server
  • Apache HTTP Server has the mod_deflate module, which you can use to GZip the response.
      • It can GZip both static files served by Apache and dynamic responses that are tunneled through Apache HTTP Server.
      • It checks whether the browser supports GZip and compresses the response only if it does.
      • It lets you configure GZip by content type:

      LoadModule deflate_module modules/mod_deflate.so
      AddOutputFilterByType DEFLATE text/html text/plain text/xml

  • Make sure that you set Vary: Accept-Encoding so that proxies can correctly handle clients that do not support GZip.
Minification: minify text files
  • Minification is the practice of removing unnecessary characters from code to reduce its size:
      • Extra spaces
      • Line breaks
      • Indentation
      • Comments
  • You can use tools to minify:
      • JavaScript
      • CSS
      • HTML
Minify JavaScript: why minify JavaScript
  • Compacting JavaScript code can save many bytes of data and speed up downloading, parsing and execution.
  • Minification can reduce size by up to 30%.
  • There are several tools for minifying JavaScript:
      • Dojo ShrinkSafe
      • YUI Compressor
      • Google's Closure Compiler
  • The task that minifies JavaScript should be part of your build script.
  • You can also minify JavaScript on the fly using a servlet filter.
Minify CSS: why minify CSS
  • Compacting CSS code can save many bytes of data and speed up downloading and parsing.
  • Minifying CSS has the same advantages as minifying JavaScript.
  • There are several tools for minifying CSS:
      • YUI Compressor
      • Cssmin.js
  • You can add a task to minify CSS to the build script.
  • You can minify CSS on the fly using a servlet filter.
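To show what minification removes, here is a deliberately naive CSS minifier built from three regular expressions. It is a sketch only: it would corrupt CSS whose string values contain braces, colons or comment markers, and it can change selectors that rely on a space before a colon, so a real build should use one of the tools listed above. The function name `minify_css` is an assumption.

```python
import re


def minify_css(css: str) -> str:
    """Strip comments and unnecessary whitespace from simple CSS."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)   # drop /* comments */
    css = re.sub(r"\s+", " ", css)                    # collapse whitespace runs
    css = re.sub(r"\s*([{}:;,])\s*", r"\1", css)      # trim around punctuation
    return css.strip()


source = """
/* page header */
h1 {
    color: red;
    margin: 0 auto;
}
"""
minified = minify_css(source)
print(minified)
print(len(source), "->", len(minified), "characters")
```

The space inside `0 auto` survives because it separates two values, while every space that only served readability is gone.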
Minify HTML: why compact/minify HTML
  • Compacting HTML code, including any inline JavaScript and CSS it contains, can save many bytes of data and speed up downloading, parsing and execution.
  • There are YUI tag libraries that you can use to compress inline JavaScript and CSS.
  • WebSphere generates quite a few blank lines and white spaces in HTML.
      • Set the com.ibm.wsspi.jsp.usecdatatrim property to true in the Web Container custom settings to bring down the size of generated HTML by up to 15%.
Optimize images: why optimize images
  • Properly formatting and compressing images can save many bytes of data.
  • Images saved from programs like Fireworks can contain kilobytes of extra comments and use too many colors, even though a reduction in the color palette may not perceptibly reduce image quality.
  • Choose an appropriate image file format:
      • PNGs are almost always superior to GIFs and are usually the best choice.
      • Use GIFs for very small or simple graphics and for images that contain animation.
      • Use JPGs for all photographic-style images.
      • Do not use BMPs or TIFFs.
  • Use an image compressor.
5. Optimize browser rendering
What is optimizing browser rendering?
  • Once resources have been downloaded to the client, the browser still needs to load, interpret and render HTML, CSS and JavaScript code. By formatting your code and pages in ways that exploit the characteristics of current browsers, you can enhance performance on the client side.
  • Put CSS at the top of the document.
  • Always specify the content type encoding.
      • Specifying a character set early in your HTML documents allows the browser to begin executing scripts immediately.
  • Put JavaScript at the end of the document.
  • Avoid CSS expressions.
6. Tools
Testing tools: what tools should you use for testing
  • Traditional load testing tools like LoadRunner are not well suited for capturing browser performance data.
      • They take a simplistic view of an HTTP transaction.
      • The browser has a lot of logic and many variations.
  • Use testing tools that run inside the browser:
      • iOpus iMacros
      • Selenium
      • Gomez
Yahoo YSlow
Google Page Speed
Charles Web Debugging Proxy
Reference: more information
  • My blog ( http://wpcertification.blogspot.com/search/label/clientsideperformance )
  • High Performance Web Sites, O'Reilly
  • Even Faster Web Sites, O'Reilly
Thank you for watching
Contact info: Ascendant Technology, LLC, 8601 Ranch Road 2222, Building I, Suite 205, Austin, TX 78730. Phone (512) 346-9580
December 19, 2010

More Related Content

Web Site Optimization

  • 2. Web Site Optimization Presentation By: Sunil Patil
  • 4. AGENDA AGENDA FOR THE SESSION What is web site optimization ? Why you should worry about web site optimization ? Suggestions for optimizing web site Make fewer requests Use Caching Minimize request overhead Minimize response size Optimize browser rendering Tools
  • 5. What is web site optimization? WHAT IS WEB SITE OPTIMIZATION End user cares about how much time it takes to render a page in his browser (Perceived performance) and how fast he can move from one page to another When you access a page in browser, it performs following steps to render page Make request Get HTML response (We focus mostly on this) Parse HTML response Find out resources (JS, CSS, Images) required on the page Download resource Parse resources Execute resources During web site optimization, we try to optimize each of the above steps and try to improve the perceived performance of the web site
  • 6. Connected.atech.com Time to generate HTML 0.9 sec Time to render page 40 sec.
  • 7. Advantages of web site optimization WHY YOU SHOULD THINK ABOUT OPTIMIZING YOUR WEBSITE Less than 15- 20 % of time is spent on generating and downloading html Improving this performance is not easy. It might require, Creating new architecture Re-fractoring code, introducing caching Tune backend If you improve this part by say 50 % overall gain will be 8-10% More than 80% of time is spent in downloading, parsing and executing resources Improving performance is easy Configuration changes at infrastructure level Additional tasks, guidelines during development and build phase Additional components at infrastructure level If you improve this part by 50 %, overall gain will be 40% Web pages are getting richer and complex (50 + resources, Ajax,..)
  • 8. Lessons Learned LESSONS LEARNED FROM WEBSITE OPTIMIZATION EXPERIENCE AT CLIENT Load testing does not fully capture all the performance related problems Business users, use older browsers compared to technical users Location of users matters Network speed matters Changing HTTP Server level configuration takes time HTTP Servers are normally shared across different teams Application teams would be on different release cycle so they might not make changes in their code We under estimate the impact of web site optimization As sites are getting richer, complex, there is greater need for web site optimization and lot of research is happening in this area
  • 10. What is parallel connection ? HOW PARALLEL CONNECTIONS IN BROWSER WORKS The Http 1.1 specification says that a browser should allow at the most two parallel connections per host name. So if your web page has 50 resources then browser will start 2 downloads and queue the rest. Once a download is finished it will start next download from queue The total round-trip time is N/X, where N is the number of resources to fetch from a host
  • 11. Number of parallel connection NUMBER OF PARALLEL CONNECTIONS DEPEND ON BROWSER Older browsers follow 2 parallel connections per host rule, but newer browsers use more parallel connections. IE 6/7 -> 2 IE 8 -> 6 Firefox 2 -> 2 Firefox 3 -> 6 Safari 3/ 4 -> 4 Opera -> 4 Chrome -> 4 Browser can bring down number of parallel connection in special cases If you use IE 8 on dial up connection it will use 2 parallel connections
  • 12. Effect of number of parallel connections 2 parallel connections 6 parallel connections
  • 13. Script blocks parallel download SCRIPT DOWNLOAD IS STOP ALL EVENT IN SOME BROWSER When a browser encounters a <script> tag in html, it will stop everything until it downloads the script , parses and executes the script Script tag might have a document.write(), which could affect the page content so browser waits for the script to download and execute If your script performs long executing operation onload then it could cause issues Scripts on page must be executed in proper order second.js might depends on first.js, so first.js must be executed before second.js Some of the newer browser download scripts in parallel but execute them in order
  • 14. Effect of browsers that block everything for scitpt Browsers that blocks everything while downloading script
  • 15. How to improve parallelization WHAT CAN WE DO TO ACHIEVE MORE PARALLELIZATION Browsers limit number of parallel connections per hostname, so easiest way to get around this problem will be to use multiple host names for downloading resources. You can use one hostname to download HTML and up to 4 hostnames for downloading other resources You can use www.static-atech.com for downloading resources. The www.staticatech.com will actually point to same server Combine files of similar type Use tools like Dojo Shrink Safe, YUI Compressor to combine multiple JS files Create a custom Dojo Build with additional classes, widgets,.. etc Use YUI Compressor to combine multiple .css files Use Images Maps, CSS Sprites Inline smaller/non- cacheable resources
  • 17. Expiry based caching WHAT IS EXPIRY BASED CACHING ? Setting expiry caching header instructs browser to load resource from disk instead of network. You can let browser know that it can cache response for certain period of time The HTTP 1.1 Specification introduced Cache-Control header, you can set Cache-Control: max-age=<noofseconds> and browser will cache the resource for <noofseconds>. If it gets another request for resource during that time it will just use it from disk The HTTP 1.0 Specification had Expires header. You can set Expires: Fri, 1 Oct 2010 12:00:00 GMT(Date in GMT) format. The browser will cache the resource and use it till 1st October If you set both Cache-Control and Expires header then Cache-Control will take precedence, older HTTP clients don’t understand Cache-control Resource might get purged from cache if the browser’s cache size is reached
  • 18. What happens if you don’t set caching headers HOW BROWSERS AND CACHES DEAL WITH ABSENCE OF EXPIRY RELATED HEADER If you don’t want browser to cache a resource then you must set Cache-Control: no-cache If you don’t set either Expires or Cache-Control header, then browser or cache proxies can use heuristic expiration Http Clients will read value of Last-Modified and if the resource is not changed for 10 months it will cache it for 1 months (Expiration Time = Now + 0.1 * (Time since Last-Modified)) Firefox IE 7 Caching proxies Basic idea is if a resource is not changed for long time then it has less chance of changing in future Different clients might use different algorithms to come up with expiration time and result could be unpredictable
  • 19. What can you do to improve caching ? USE AGGRESSIVE CACHING OF STATIC RESOURCES If you don’t know when resource will be updated, you should configure your site so that HTML never gets cached and other resources get cached for long time (Months or years) HTML document has references to all the resources on the page, so if a resource is changed change its reference/URL in the HTML Change the file name Ex. From test.js to test_v1.js Change the folder Ex test.js to v1/test.js Create mod_rewrite rule. Ex v1/test.js, v2/test.js, v3/test.js gets mapped to test.js If you know precisely when resource will be updated set Expires to that date
  • 20. Caching static resources HOW TO CONFIGURE CACHING AT HTTP SERVER LEVEL Apache HTTP Server has mod_expires module that you can be used to generate expiry based caching header in response Sets both Cache-control and Expires header Can set headers for static content served by HTTP Server as well as static content returned by the WebSphere’s File Serving Servlet Granular control, Can set headers globally or at URL, directory level Can set different expiry rules based on response content type, file extension,.. This configuration says that images should be cached for 3 month and other resources should be cached for 1 month ExpiresActive On ExpiresDefault &quot;access 1 month&quot; ExpiresByType image/gif &quot;access plus 3 month&quot;
  • 21. Caching dynamic resources HOW TO CONFIGURE CACHING OF RESOURCES SERVRED BY WEBSPHERE The file serving servlet (Used for serving static files) does not set expires/cache-control header. You can add ServletFilter in your web application You can set Expires/Cache-Control headers in Servlet WebSphere Portal server has navigatorservice.properties file that lets you configure overall portal level caching, caching for ATOM feed You can configure WPS to make anonymous page cachable , process is complicated The Portlet Specification 2.0 has concept of expiration cache, which you can use for setting Cache-control max-age and public/private header Set expiration-cache and cache-scope in portlet.xml Use ResourceResponse.getCacheControl() to get object of javax.portlet.CacheControl and call its method setExpirationTime() and setPublicScope() methods Use ResourceURL.setCacheability() so that WPS generates cache friendly URLs
  • 22. Validation based caching WHAT IS VALIDATION BASED CACHING ? When a static HTML file is served (Apache HTTP Server, WebSphere’s File Serving Servlet), the server will send Last-Modified header will value equal to date when the file was modified (OS date) Apache HTTP Server can generate ETag for static files based on its modification time, size,.. If you don’t set Cache-Control: no-store, browser will store the response in cache But every time you request the resource(No cached, or stale) it will send Conditional GET request, with If-Modified-Since, If-None-Match header Server will check if the resource is actually modified, if not it will return HTTP 304 with no body(Average 250 byte response) to indicate that browser can use the response Validation based caching is better than getting full HTTP 200 response with full body but worst than cached resource which does not require HTTP request
  • 23. Validation based caching HOW CACHE VALIDATION WORKS The HTTP Specification has concept of Conditional GET, that helps client to prevent download of same resource repeatedly The Server can send Last-Modified, ETag header in response HTTP Client (Browser, caching proxies) will copy the resource in disk cache along with the headers Next time when you request that resource the client will add If-Modified-Since and If-None-Match headers to the request with the value that it had on disk Server compares this values to the version it has and sends a HTTP 200 OK, with full resource in the body of response if the resource is changed but if the resource is not changed the server will send HTTP 304 Not Modified with only headers Original resource could be say 100kb, but the HTTP 304 respose will be 200-250 bytes, you can save on download size Client has to make a request using one of the connections from parallel connection pool
  • 25. Configure ETag WHY YOU SHOULD CONSIDER DISABLING ETAG ETags are introduced to help with multiple HTTP server environment HTTP Server can generate ETag(Similar to a version number) for the static resources. Its enabled by default. The default format of ETag is INode MTime Size Apache HTTP Server sends both Last-Modified and ETag header. You cant disable Last-Modified. Browser will send both If-Modified-Since and If-None-Match header to check if resource is still valid As per HTTP Specification both IMS and INM conditions should be met for server to return HTTP 304 (Desired behavior with smaller response) If your request goes to HTTP server that has different file permission but same date, Server will return HTTP 200 instead of HTTP 304 You can configure, disable ETag by adding FileETag None to httpd.conf. Or at least configure it to FileETag MTime
  • 26. Leverage proxy caching HOW TO CACHE RESOURCE ACROSS USERS A big portion of internet traffic goes through a caching proxy: a proxy provided by the ISP, a proxy provided by the corporate network for outbound connections, or a proxy in front of your web server for inbound connections. Enabling public caching in the HTTP headers for static resources allows the browser to download resources from a nearby proxy server rather than from a more remote origin server. The proxy shares cached resources across users. You use the Cache-Control: public header to indicate that a resource can be cached by public web proxies in addition to the browser that issued the request. Set an appropriate Vary header (Vary: Accept-Encoding, User-Agent).
  • 28. HTTP Request WHAT HAPPENS WHEN BROWSER REQUESTS A RESOURCE When you try accessing a resource in your browser, it performs the following steps: DNS resolution, establish HTTP connection, send request, receive response. You should try to reduce the overhead of each of these steps.
  • 29. Reduce DNS resolution time REDUCE DNS RESOLUTION TIME Before a browser establishes a connection with a server, it must resolve the host name into an IP address. This value is cached by the operating system and by the browser. The DNS record cache has a short lifetime, and the resolver might have to traverse the DNS hierarchy to get the record. Reducing the number of unique hostnames from which resources are served cuts down on the number of DNS resolutions that the browser has to make. Don't use more than 1 host for fewer than 5 resources; balance resources across host names. Serve early loaded JavaScript from the same domain as the host page. Browsers block parallel downloads while downloading JavaScript, so it should be as fast as possible.
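One way to check a page against this guideline is to count how many resources each hostname serves. The sketch below is a rough illustration: the function names, the example.com URLs, and the threshold of 5 are assumptions, and it reads the slide's rule as "a hostname that serves fewer than 5 resources is probably not worth its extra DNS lookup".

```python
from collections import Counter
from urllib.parse import urlsplit


def resources_per_host(urls):
    """Count how many of the page's resources come from each hostname."""
    return Counter(urlsplit(u).hostname for u in urls)


def underused_hosts(urls, minimum=5):
    """Hostnames that cost a DNS lookup but serve fewer than `minimum` resources."""
    counts = resources_per_host(urls)
    return sorted(host for host, n in counts.items() if n < minimum)


# Hypothetical resource list for one page.
page_resources = (
    [f"http://www.example.com/js/module{i}.js" for i in range(6)]
    + ["http://static.example.com/css/site.css",
       "http://static.example.com/img/logo.png"]
)

print(resources_per_host(page_resources))
print(underused_hosts(page_resources))
```

Here static.example.com would be flagged, since it adds a DNS resolution for only two resources.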
  • 30. Use HTTP Persistent Connection WHAT IS HTTP PERSISTENT CONNECTION AND WHY YOU SHOULD CARE Web clients often open several connections to the same site to download HTML and related resources. HTTP 1.1 (and Keep-Alive in HTTP 1.0) allows HTTP devices to keep the TCP connection open after a transaction completes and to reuse the existing connection for future HTTP requests. Connections that are kept open after a transaction are called persistent connections. You can avoid slow connection setup. You can avoid the slow-start congestion adaptation phase. Persistent connections are more efficient when used in conjunction with parallel connections. Starting from HTTP 1.1, a connection is persistent by default unless you set Connection: close. You can set “KeepAlive on” in Apache to turn on persistent connections.
  • 32. Size of HTTP Request WHY SIZE OF HTTP REQUEST MATTERS? Most users have an asymmetric connection: upload to download speed is in a ratio of 1:4 to 1:20. That means uploading 500 bytes can take as long as downloading 10 KB (at 1:20). We can't compress the data in an HTTP request. You should try to keep your request small enough to fit in one packet of 1500 bytes. The initial HTTP request suffers from startup throttling. An HTTP request is made up of the following: request headers set by the browser, the URL and referrer URL, and cookies. You should try to reduce the size of each of these request components.
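The arithmetic behind the 500 bytes versus 10 KB comparison, and a rough check of whether a request fits in one packet, can be sketched as follows. The helper names and the sample headers are made up for illustration. The 1500-byte budget is the figure from the slide; in practice TCP/IP headers take part of it, so the real limit is somewhat lower.

```python
def download_equivalent(upload_bytes, download_to_upload_ratio):
    """Bytes that could be downloaded in the time it takes to upload upload_bytes."""
    return upload_bytes * download_to_upload_ratio


def request_size(request_line, headers):
    """Size in bytes of an HTTP/1.x request with no body, as sent on the wire."""
    raw = request_line + "\r\n"
    raw += "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    raw += "\r\n"
    return len(raw.encode("latin-1"))


PACKET_BUDGET = 1500  # figure used on the slide

small = request_size("GET /css/site.css HTTP/1.1", {"Host": "www.example.com"})
large = request_size(
    "GET /css/site.css HTTP/1.1",
    {"Host": "www.example.com", "Cookie": "token=" + "A" * 2000},
)

print(download_equivalent(500, 20))            # 1:20 connection
print(download_equivalent(500, 4))             # 1:4 connection
print(small <= PACKET_BUDGET, large <= PACKET_BUDGET)
```

At a 1:20 ratio the 500-byte upload costs as much time as 10,000 downloaded bytes, and a single large cookie is enough to push an otherwise tiny request past one packet.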
  • 33. Request for static resource
  • 34. Minimize cookie size HOW YOU CAN REDUCE COOKIE SIZE Enterprise applications need at least a few big cookies that we can't avoid: LTPA token, JSessionId, SSO related cookies. Every time a client sends an HTTP request, it has to send along all the cookies that have been set for that domain and path. Keep most of the cookie payload in server side storage and send only a key in the cookie. Serving static resources from a cookieless domain reduces the total size of the requests made for a page. Static resources do not need cookies. A typical static file will be less than 10 KB, so more time is spent making the request than getting the response.
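The "store the payload on the server, send only a key" advice can be sketched like this. It is a minimal illustration that assumes an in-memory dictionary as the server side store; ServerSideStore, the cookie names, and the sample profile are hypothetical, and a real application would use its session framework or a shared cache instead.

```python
import json
import secrets


class ServerSideStore:
    """Keeps cookie payloads on the server; the browser only sees a short key."""

    def __init__(self):
        self._data = {}

    def save(self, payload):
        key = secrets.token_urlsafe(16)  # random, hard-to-guess key
        self._data[key] = payload
        return key

    def load(self, key):
        return self._data.get(key)


profile = {
    "user": "jdoe",
    "roles": ["employee", "approver", "reports"],
    "preferences": {"theme": "blue", "language": "en", "page_size": 50},
}

# Option 1: the whole payload travels with every request.
fat_cookie = "profile=" + json.dumps(profile)

# Option 2: only the key travels; the payload stays on the server.
store = ServerSideStore()
slim_cookie = "sid=" + store.save(profile)

print(len(fat_cookie), len(slim_cookie))
```

The slim cookie stays the same size however much data the application attaches to the user, which keeps every request to that domain smaller.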
  • 36. Compress response USE GZIP FOR COMPRESSING RESPONSE Compressing resources with GZip can reduce the size of a resource by about 70%. Most modern browsers support compressed data. The browser sends an Accept-Encoding header to specify which encodings it supports. You can configure the HTTP server to compress both the static files that it serves and the dynamic content that passes through it. You should compress only text files such as HTML, JavaScript, and CSS. You should not compress binary files such as images and PDFs; they are already compressed, and their size might increase after GZip. You should not compress resources smaller than 150 bytes.
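The three cases on this slide (text, already compressed data, very small responses) can be tried out with Python's standard gzip module. This is a rough sketch: gzip_saving is a made-up helper, random bytes stand in for an already compressed image or PDF, and the repetitive sample HTML compresses better than typical pages, so the saving it reports overstates the roughly 70% quoted above.

```python
import gzip
import os


def gzip_saving(data):
    """Fraction of bytes saved by gzip; negative means the output grew."""
    return 1 - len(gzip.compress(data)) / len(data)


html = b"<li class='item'><a href='/product'>Product</a></li>\n" * 200
already_compressed = os.urandom(10_000)  # stand-in for a JPEG or PDF body
tiny = b"ok"

for label, data in [("html", html), ("binary", already_compressed), ("tiny", tiny)]:
    print(f"{label}: {len(data)} bytes, saving {gzip_saving(data):.0%}")
```

The text sample shrinks substantially, while the incompressible and tiny inputs come out larger than they went in because gzip adds its own header and trailer.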
  • 37. Configure GZip on Apache HTTP Server HOW TO CONFIGURE APACHE HTTP SERVER FOR GZIP Apache HTTP Server has a mod_deflate module that you can use to GZip the response. You can use it to GZip both static files served by Apache and dynamic responses that are tunneled through Apache HTTP Server. It checks whether the browser supports GZip and only then compresses the response. It allows you to configure GZip by content type, for example with the two directives LoadModule deflate_module modules/mod_deflate.so and AddOutputFilterByType DEFLATE text/html text/plain text/xml. Make sure that you set Vary: Accept-Encoding so that proxies can properly deal with clients that do not support GZip.
  • 38. Minification MINIFY TEXT FILES Minification is the practice of removing unnecessary characters from code to reduce its size: extra spaces, line breaks, indentation, and comments. You can use tools to minify JavaScript, CSS, and HTML.
  • 39. Minify JavaScript WHY MINIFY JAVASCRIPT Compacting JavaScript code can save many bytes of data and speed up downloading, parsing, and execution time. Minification will reduce size by up to 30%. There are several tools that you can use for minifying JavaScript: Dojo ShrinkSafe, YUI Compressor, Google’s Closure Compiler. The task to minify JavaScript should be part of your build script. You can also minify JavaScript on the fly using a Servlet Filter.
  • 40. Minify CSS WHY MINIFY CSS Compacting CSS code can save many bytes of data and speed up downloading, parsing, and execution time. Minifying CSS has the same advantages as minifying JavaScript. There are several tools for minifying CSS: YUI Compressor, Cssmin.js. You can add a task to minify CSS to the build script. You can also minify CSS on the fly using a Servlet Filter.
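To make the idea of minification concrete, here is a deliberately naive CSS minifier in Python. It only strips comments and unnecessary whitespace, which is the core of what slide 38 describes. It is an illustrative sketch and not a replacement for the tools named above: it does not protect quoted strings or handle edge cases such as a space before a pseudo-class, and minify_css is a hypothetical name.

```python
import re


def minify_css(css):
    """Naive CSS minifier: removes comments, line breaks, and extra spaces."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # drop /* ... */ comments
    css = re.sub(r"\s+", " ", css)                   # collapse runs of whitespace
    css = re.sub(r"\s*([{}:;,])\s*", r"\1", css)     # no spaces around punctuation
    css = css.replace(";}", "}")                     # last semicolon is optional
    return css.strip()


source = """/* page header */
h1 {
    color: red;
    margin: 0 auto;
}
"""

minified = minify_css(source)
print(minified)
print(len(source), "->", len(minified), "bytes")
```

A build script step or a Servlet Filter would apply the same kind of transformation, using a tool that understands the full CSS grammar.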
  • 41. Minify HTML WHY COMPACT/MINIFY HTML Compacting HTML code, including any inline JavaScript and CSS contained in it, can save many bytes of data and speed up downloading, parsing, and execution time. There are YUI tag libraries that you can use to compress inline JavaScript and CSS. WebSphere generates quite a few blank lines and white spaces in HTML. Set the com.ibm.wsspi.jsp.usecdatatrim property to true in the Web Container custom settings to bring down the size of the generated HTML by up to 15%.
  • 42. Optimize Images WHY OPTIMIZE IMAGES Properly formatting and compressing images can save many bytes of data. Images saved from programs like Fireworks can contain kilobytes of extra comments and use too many colors, even though a reduction in the color palette may not perceptibly reduce image quality. Choose an appropriate image file format: PNGs are almost always superior to GIFs and are usually the best choice. Use GIFs for very small or simple graphics and for images that contain animation. Use JPGs for all photographic-style images. Do not use BMPs or TIFFs. Use an image compressor.
  • 44. What is optimizing browser rendering OPTIMIZE BROWSER RENDERING Once resources have been downloaded to the client, the browser still needs to load, interpret, and render HTML, CSS, and JavaScript code. By simply formatting your code and pages in ways that exploit the characteristics of current browsers, you can enhance performance on the client side. Put CSS at the top of the document. Always specify the content type encoding: specifying a character set early for your HTML documents allows the browser to begin executing scripts immediately. Put JavaScript at the end of the document. Avoid CSS expressions.
  • 46. Testing tools WHAT TOOLS SHOULD YOU USE FOR TESTING Traditional load testing tools like LoadRunner are not well suited for capturing browser performance data. They take a simplistic view of the HTTP transaction, while the browser has a lot of logic and variations. Use load testing tools that run inside the browser: iOpus iMacros, Selenium, Gomez.
  • 50. Reference MORE INFORMATION My Blog ( http://wpcertification.blogspot.com/search/label/clientsideperformance ); High Performance Web Sites, O'Reilly; Even Faster Web Sites, O'Reilly
  • 51. THANK YOU FOR WATCHING CONTACT INFO: ASCENDANT TECHNOLOGY, LLC 8601 Ranch Road 2222 Building I, Suite 205 Austin, TX 78730 Phone (512) 346-9580 Thank You December 19, 2010

Editor's Notes

  1. Applicable to developers and administrators on the server side, J2EE applications in WebSphere. Not going to get into details on JavaScript, CSS, etc.
  2. What is perceived performance: end users care about perceived performance; administrators and developers care about the time it takes to generate HTML.