Introduction to FEO – edge case management

Edge Cases, new developments

This post outlines monitoring approaches, and modifications to standard practice, suggested by a number of specific cases driven by technologies increasingly adopted within target applications.

The following are briefly addressed:

  • Single Page Applications
  • Applications incorporating Server push and/or service worker
  • Internet of Things (IoT)
  • HTTP/2
  • Microservices based applications
  • Managing non-design performance adoptions
  • Bots
  • Performance APIs

 

  • Single Page Applications [SPAs]:

So-called single page applications [SPAs] are becoming increasingly common, either as a complete application (as the name suggests) or as an element of a larger ‘compound’ application. SPAs are characterised by the use of client-side JavaScript frameworks (such as AngularJS or React.js). They permit dynamic extension of HTML, and leverage the computing power of modern user devices.

The issue that SPAs present from a monitoring perspective is that they minimise the network interactions between the user device and origin infrastructure. The historic ‘page based’ download paradigm (and dependency) is broken. This presents a particular problem for traditional synthetic monitoring, given that it is based on capturing and analysing just that ‘over the wire’ interaction.

User:site interactions (termed ‘soft’ navigations) and data delivery are independent of the standard W3C navigation API flags (DOM ready, onload, etc). Many interactions occur entirely within the client device.

Although some nuances can exist depending upon the detailed design of particular applications, unless a particular user interaction (eg button click) is reliably associated with a network request, the primary (but important) value of synthetic monitoring in this use case becomes the monitoring of availability. This key metric is unavailable to ‘passive’ (site visitor) based tools, for obvious reasons.

Any interactions that are directly linked to a network call can (in most synthetic monitoring scripting utilities) be specifically ‘bracketed’ and their response patterns examined. Otherwise, monitoring best practice requires the use of RUM (Real User Monitoring) instrumentation.
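
By way of illustration, the following is a minimal, framework-agnostic sketch of bracketing a ‘soft’ navigation with the User Timing API and reporting the result via a beacon. The ‘/rum-collect’ endpoint and the mark names are illustrative assumptions rather than any product’s API.

// Sketch: timing a SPA 'soft' navigation with the User Timing API.
// '/rum-collect' and the mark/measure names are illustrative only.
function startSoftNav(routeName) {
  performance.mark('softnav-start-' + routeName);
}

function endSoftNav(routeName) {
  performance.mark('softnav-end-' + routeName);
  performance.measure('softnav-' + routeName,
    'softnav-start-' + routeName, 'softnav-end-' + routeName);

  var entries = performance.getEntriesByName('softnav-' + routeName, 'measure');
  var duration = entries[entries.length - 1].duration;

  // Report the timing; fall back to an image beacon where sendBeacon is unsupported.
  if (navigator.sendBeacon) {
    navigator.sendBeacon('/rum-collect',
      JSON.stringify({ route: routeName, duration: Math.round(duration) }));
  } else {
    new Image().src = '/rum-collect?route=' + encodeURIComponent(routeName) +
      '&duration=' + Math.round(duration);
  }
}

// Usage: call startSoftNav() on the user interaction (eg button click) and
// endSoftNav() once the framework signals that the new view has rendered.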

Unfortunately, not all RUM tools are created equal, so if it is likely that you will be squaring up to SPAs, it is important to check (and validate) that your RUM tool (whether an extension of your APM tooling or otherwise) offers the granularity of recording required. If it does not (and assuming that the APM vendor cannot provide realistic comfort regarding their roadmap), an alternative may be to integrate a ‘standalone’ RUM product such as SOASTA mPulse, which has been specifically modified to meet the SPA use case. Details are given in this blog post: http://www.soasta.com/blog/angularjs-real-user-monitoring-single-page-applications/. This is an evolving situation of direct business relevance, and others will undoubtedly follow.

  • HTTP/2 based applications

The evolutionary specification of HTTP/2, a formalisation of Google SPDY, has been available for some time. Adoption is now reported to be rapid, and this rate is expected to further increase with progressive server adoption.

HTTP/2 provides a number of transport efficiencies relative to HTTP/1.x. These include request multiplexing (ie effective handling of multiple requests over the same connection), header compression, and other design interventions to avoid repeated retransmission of header metadata.

These changes deliver considerable advantages, particularly for sites with large numbers of element requests and those delivering to users in high latency conditions.

They also make it necessary to revisit interventions formerly regarded as ‘best practice’ for optimised performance.

Domain sharding, formerly adopted to increase the effective number of parallel connections, becomes an anti-pattern. Sharding carries the risk of request failure and consequent retransmission, particularly in conditions of limited connectivity (mobile delivery in rural locations, countries with poor internet infrastructure). It also undermines the inherent HTTP/2 efficiencies of header compression, transmission optimisation and resource prioritisation that are possible over a connection to a single domain. Sharding does not present monitoring or analysis challenges per se, but it can form part of optimisation recommendations.

Content concatenation, the most prominent usage of which is image spriting, but which may also be applied to other content, has the objective of reducing the number of round-trip requests. It has, however, the disadvantage of forcing a refresh if any part of the grouped content changes. Revised best practice, driven by the transmission efficiencies inherent in HTTP/2, directs reduced individual object payloads and, essentially, more granular management of content at individual element level. This, for example, supports more appropriate cache settings, having regard to the specifics of particular objects.

Inlining, ie the incorporation of content (eg JavaScript) within highly prioritised download components (eg HTML), was formerly adopted to accelerate delivery of required content whilst minimising round-trip journeys and delays due to differential download prioritisation by content type. It had the disadvantage of preventing individual caching of the inlined content. Recommended best practice replaces inlining with server push based delivery, thus supporting both progressive content delivery and more granular cache management.
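
As a simple sketch of the revised approach, the following Node.js example serves an HTML page and advertises a formerly-inlined stylesheet via a preload Link header; whether this results in an actual HTTP/2 server push, or simply an early fetch hint, depends on the server or CDN sitting in front of the application. The paths and port are illustrative assumptions.

// Sketch: replacing inlined CSS with a preload hint / push candidate.
// '/static/critical.css' and port 8080 are illustrative only.
const http = require('http');

http.createServer((req, res) => {
  if (req.url === '/') {
    // Advertise the critical stylesheet early; an HTTP/2-capable front end
    // can translate this Link header into a server push.
    res.setHeader('Link', '</static/critical.css>; rel=preload; as=style');
    res.setHeader('Content-Type', 'text/html');
    res.end('<html><head><link rel="stylesheet" href="/static/critical.css">' +
      '</head><body>Hello</body></html>');
  } else {
    res.statusCode = 404;
    res.end();
  }
}).listen(8080);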

It should be noted that, with the exception of increased adoption of ‘server push’ interactions (see following section), these changes involve modification of FEO interpretation and recommendation, rather than impacting monitoring practice.

  • Server Push content, Service Worker interactions:

Persistent server:client interactions are a core facet of modern applications. In certain cases this is driven by the nature of the application itself (eg delivery of live-update betting odds). Other drivers are the leverage of HTTP/2 efficiencies (see section above) and the development of ‘network independent’ mobile WebApps.

WebApps effectively co-exist with native mobile applications. They incorporate local device caching and store-and-forward capabilities that enable usage in unstable or ‘network off’ conditions. WebApps utilise Service Workers, which replace the more limited AppCache-based approach. Service Workers are event driven, and permit access to server push interactions.

Service Worker capability offers many attractive advantages in the creation of more business-centric mobile device based interactions.
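
For orientation, a minimal sketch of the mechanics follows: the page registers a Service Worker, and the worker caches a small ‘shell’ at install time and falls back to it when the network is unavailable. The file names and cache name are illustrative assumptions.

// In the page: register the Service Worker (path is illustrative).
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/sw.js');
}

// In /sw.js: cache a small shell at install, fall back to it when offline.
self.addEventListener('install', function (event) {
  event.waitUntil(
    caches.open('shell-v1').then(function (cache) {
      return cache.addAll(['/', '/offline.html']);
    })
  );
});

self.addEventListener('fetch', function (event) {
  event.respondWith(
    fetch(event.request).catch(function () {
      return caches.match(event.request).then(function (hit) {
        return hit || caches.match('/offline.html');
      });
    })
  );
});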

The challenge to historic monitoring practice is that long duration connections distort the recorded page load endpoint in traditional synthetic monitoring tools. This must be identified and corrected for, otherwise incorrect performance inferences may be drawn, particularly in terms of recorded response variation.

Fortunately, identification of server push interactions is usually obvious from inspection of standard ‘waterfall’ charts. Correcting for it in an elegant manner is more difficult. Ignoring the validation approaches incorporated within certain synthetic monitoring product scripting (as they are not widely adopted), arguably the best approach to synthetic testing is simply to identify and then filter out the server push calls. Although somewhat of a blunt instrument, it does get around the problem.

A more elegant approach, based on RUM analysis, emerges with the availability of the new sendBeacon API, the syntax of which is as follows:

 

navigator.sendBeacon(url,data);

 

 

Use of this call enables granular instrumentation of application code to specifically record the response to particular events. It should be noted that this API is newly released (at the time of writing), so reliable cross-browser support is unlikely to be complete. However, I understand that the leading-edge performance team at the Financial Times in London report effective use of this API in production conditions (P Hamann, personal communication).


Example code instrumentation using the sendBeacon API
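
A minimal sketch of this style of instrumentation, assuming a live-update use case; the element id, event name and ‘/perf-beacon’ endpoint are hypothetical.

// Sketch: recording the time from a user action to receipt of pushed data,
// then reporting it without blocking navigation. All names are illustrative.
var oddsRequestedAt = 0;

document.getElementById('refresh-odds').addEventListener('click', function () {
  oddsRequestedAt = performance.now();
});

// Called by the application when the pushed update has been rendered.
function onOddsUpdateRendered() {
  var elapsed = performance.now() - oddsRequestedAt;
  if (navigator.sendBeacon) {
    navigator.sendBeacon('/perf-beacon',
      JSON.stringify({ event: 'odds-update', ms: Math.round(elapsed) }));
  }
}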

 

  • Internet of Things:

A brief note on the ‘Internet of Things’

Sensor-based applications, collectively known as the ‘Internet of Things’ (IoT), have been slowly evolving since Coca-Cola introduced the self-reordering dispensing machine many decades ago. The area is now in danger of becoming one of the most hyped in new technology. Certainly, actual companies are now trading (in the UK, Hive and Nest to name but two). Regardless of whether the app is controlling your heating thermostats, reordering the contents of your fridge, or (in the future) ordering up a driverless car for your commute to work, it is important to be able to understand and validate performance in objective terms.

Although companies offering the wet string (Cisco etc) are ready and waiting, full evolution will be accelerated by the mass adoption of intermediation platform technology such as Jasper, Apple HomeKit, etc.


IoT application control panel & code (HIVE Home)

It may be asked “why do I want to check the performance of my smoke alarm remotely anyway?”. Well, clearly, the value of performance monitoring lies in the relevance of whatever is being tested. As such, monitoring may be more appropriate to Vendors of such services rather than individual domestic customers, but, again, it depends – and IoT system performance at individual level may become relevant to the smooth running of all our lives in the future.

Monitoring will probably be based around assurance of the successful completion of core control transactions, based on a (predominantly) mobile application interface. The core use-case is therefore more akin to availability monitoring. Depending upon how such IoT systems are architected, the effect of high traffic load on performance may become relevant.

IoT networks are fairly closed systems, but core mobile app monitoring principles apply. As closed systems, they are not accessible to scheduled synthetic external testing.

Two approaches are possible, either:

  1. Instrument the mobile application used to control the system [standard SDK-based techniques], timing specific response end-points (eg ‘temperature set’ flag or whatever).
  2. If available, monitor via the API – APM tooling can often provide webservices based gateways. These can be custom developed today, and will undoubtedly become available off-the-shelf for the major providers as the market develops.

The former obviously only monitors the performance of the control application, not the IoT devices themselves (which are assumed to operate correctly if appropriate control application responses are received).
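
By way of illustration, a very simple sketch of the second (API-based) approach is shown below; the endpoint and success criteria are entirely hypothetical, and a real implementation would follow the provider’s documented API.

// Sketch: timing a control-plane API call for an IoT service (Node.js).
// The hostname, path and success criteria are hypothetical.
const https = require('https');

const started = Date.now();

https.get('https://api.example-iot-vendor.com/v1/thermostat/status', (res) => {
  let body = '';
  res.on('data', (chunk) => { body += chunk; });
  res.on('end', () => {
    const elapsed = Date.now() - started;
    // Treat a 200 response as 'available' and record the response time.
    console.log('available=' + (res.statusCode === 200) + ' responseMs=' + elapsed);
  });
}).on('error', (err) => {
  console.log('available=false error=' + err.message);
});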

 

 

  • Microservices based applications

Developing new applications (or extending existing ones) on the basis of microservices – ie discrete, containerised functional elements – is becoming very popular. Arguably this is being driven by the popularity and availability of open source platforms, particularly Docker, though alternatives exist.

The pros and cons of microservices adoption are outside both my core experience and the scope of this material. Suffice it to say that despite the ownership advantages of highly granular functional elements from an agile development perspective, microservices-based applications introduce an additional layer of integration and management complexity from an operations perspective.

Performance should be understood from both a back-end and an external perspective.

From the point of view of the containers themselves, the major APM vendors are increasingly including specific support for these technologies. Currently, given the market dynamics, specific support starts with Docker, although other platforms are, or will be, explicitly supported moving forward. The extent of visibility offered by the various APM tools does vary, although it is likely that your choice of tool will be driven by other considerations (and you will therefore ‘get what you get’ with respect to container performance visibility).

Microservices container monitoring (RUXIT APM)

In terms of external monitoring practice, the core change is not the high-level approach or tooling mix, but rather the importance of ensuring that poor performance of core services and/or module interactions is targeted, such that interventions can be made rapidly. This is particularly apposite given that the nature of test and pre-production environments means that some issues will only emerge post release-to-production, when the application comes under real-world load and interaction complexity.

The take-home message should therefore be to monitor with underlying services in mind. This implies a ‘subpage’ monitoring approach. Greater granularity of monitoring can be achieved by, for example, scripting transactions to bracket key in-page interactions with synthetic tooling (reported as step timings), and using additional timing markers/beacons with RUM to achieve the same effect.
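
As one example of a ‘service aware’ RUM marker, the Resource Timing API can be used to pull out timings for calls to a particular service path; the ‘/api/basket’ path and ‘/rum-services’ beacon endpoint are illustrative assumptions.

// Sketch: extracting per-service call timings from the Resource Timing buffer.
// '/api/basket' and '/rum-services' are illustrative names only.
window.addEventListener('load', function () {
  var calls = performance.getEntriesByType('resource').filter(function (entry) {
    return entry.name.indexOf('/api/basket') !== -1;
  });

  var timings = calls.map(function (entry) {
    return { url: entry.name, duration: Math.round(entry.duration) };
  });

  if (timings.length && navigator.sendBeacon) {
    navigator.sendBeacon('/rum-services', JSON.stringify(timings));
  }
});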

Issues not specifically detected by these techniques should reveal themselves by changes to traffic flows/user behaviours. These are best detected by cultivating an approach to Web Analytics reports that is both absolute and intuitive.

  • Bots

Although not strictly associated with FEO, a few words on bots are relevant to the consideration of third party related performance constraints. Bots (or web robots) are automated agents that interact with a site. Although the majority (ranging from SEO to synthetic testing and price aggregation) are not malicious in intent, they represent a huge proportion of total site traffic – over two thirds for typical retail sites, for example.


Global car rental site – UK traffic by unique IP per hour – total vs customer traffic

This represents a significant economic cost, both in maintaining otherwise unnecessary infrastructure and in reducing the effective capacity headroom of the site (and therefore its ability to benefit from peaks in ‘real customer’ traffic). The benefits of addressing this can be extremely significant. One of our retail clients was able to reduce its IBM licence requirement for WebSphere Commerce Suite from 5 to 3 cores, thus generating a substantial ongoing annual cost saving.

Unfortunately, bot effects are not simply confined to generating excess traffic. So-called “bad” bots have a range of negative effects, from inadvertent inefficiencies due to poorly written code, to spam, malicious hacks, and high-volume Distributed Denial of Service (DDoS) attacks. According to the Anti-Phishing Working Group (report, Q1-Q3 2015), over one third of all computers worldwide are infected with malware.

Various approaches to mitigation are possible. These include:

  • ID blocking
  • CAPTCHA (largely regarded as compromised)
  • Multi parameter traffic ‘fingerprinting’ and
  • Bot ‘honeytraps’

From the point of view of performance practice / FEO, bots are an indirect consideration, but one that should be borne in mind when making recommendations regarding overall performance enhancement. Seek to quantify the extent of the problem and identify potential interventions. These are likely to depend upon the economics of the threat and existing relationships. They can range from specialist targeted solutions (eg Distil Networks), through security extensions to firewalls (eg Barracuda) and added options from CDN or other performance vendors (eg Akamai, Radware), to focused integrated traffic management solutions (eg Intechnica Alipta).

  • Performance APIs

A few words on the use of performance-centric APIs. These include the ‘traditional’ navigation flags – DOM ready, page unload, etc – that have been around for a few years now, together with more leading-edge developments such as sendBeacon (already referenced with regard to monitoring service worker / push content), the Event.timeStamp property, and others.

The only negative to introducing timing APIs to this series of posts is that it moves us across the spectrum towards ‘dev’ and away from an introduction for day-to-day operations. Failure to exploit them will, however, prove a serious limitation to effective performance practice going forward, so awareness and, if possible, adoption are increasingly important.

Network timing attributes are collected for each page resource. Navigation and resource timers are delivered as standard in most modern browsers for components of ‘traditional’ page downloads. User interaction and more client centric design (eg SPAs), however, require event based timers.

Basic custom timers introduce a timing mark() at defined points within the page/code. Your RUM tooling should, ideally, be able to support these, as they enable read-across between different tooling – for example, using visual ‘user experience’ end points / browser fill times in synthetic measurements. Not all RUM products support them, however, so this is an important aspect to understand when making a product purchase decision.

Other APIs have been developed to support, for example, image rendering, and frame timing – important if seeking to ensure smooth jank-free user experiences.
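
A short sketch of consuming paint timing entries via a PerformanceObserver is shown below; as noted in the following paragraph, support for the newer entry types should be checked per browser before relying on the data.

// Sketch: observing paint timing entries (first-paint, first-contentful-paint)
// where the browser supports them.
if ('PerformanceObserver' in window) {
  var paintObserver = new PerformanceObserver(function (list) {
    list.getEntries().forEach(function (entry) {
      console.log(entry.name + ': ' + Math.round(entry.startTime) + 'ms');
    });
  });
  try {
    paintObserver.observe({ entryTypes: ['paint'] });
  } catch (e) {
    // Older browsers may throw for unsupported entry types.
  }
}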

Browser support cannot be taken for granted, particularly with the newer APIs. It is important to be aware of which browsers support a particular method, as you will be ‘blind’ with respect to the performance of users with non-supported technologies. In certain cases (eg Opera in Russia, or Safari for media-centric user bases), this can introduce serious distortions to results interpretation.

A useful primer for Web Performance Timing APIs, which also contains links to further specialist information in this evolving area can be found here> bit.ly/perf-timing-primer.


Browser support for resource timing API – May 2016 [caniuse.com]

 

 

Introduction to FEO – Summary & further reading

This post, part of an introductory series to Front End Optimisation practice, summarises some of the key aspects covered, and provides a list of further reading and sources of more advanced/ongoing knowledge of this important and continuously evolving field.

Other titles in this blog series are:

  • FEO – reports of its death have been much exaggerated [also published in APM Digest] – 22 February 2016
  • Introduction to FEO – Tooling Part 1 – 1 March 2016
  • Introduction to FEO – Tooling Part 2 – 8 March 2016
  • Introduction to FEO – Operations – Process – 15 March 2016
  • Introduction to FEO – Granular Analysis Part 1 – 22 March 2016
  • Introduction to FEO – Granular analysis Part 2 -29 March 2016
  • Introduction to FEO – Granular analysis Part 3 – 5 April 2016
  • Introduction to FEO – Edge Case management – 12 April 2016
  • Introduction to FEO – Summary, bibliography for further study

 

Recommendations

A few final thoughts about FEO recommendations

  • Link priorities to business drivers – competitive revenue exposure etc
  • Live in the real world – what can be changed at economic cost, in realistic timescales
  • Beware major effort for marginal improvement
  • Seek to deliver a combination of immediate prioritised interventions & ongoing governance/management of objectives – set iterative goals for improvement (unless in crisis mode).
  • Suggest triggers for ongoing intervention based on a combination of direct (synthetic monitoring) and indirect (web analytics, RUM) alert flags.

 

Summary – everything changes, everything stays the same

In conclusion, the rate of change in the variety of end user applications continues to increase. Some aspects may be black-boxed as far as monitoring and analysis are concerned – at least without access to core systems and code.

However, in the vast majority of cases, every meaningful aspect of an end user transaction will continue to involve visible changes to the GUI. As such, end user performance can continue to be monitored. Network interaction will still be required, even if it’s on a store and forward rather than real-time basis. Together, these provide the basis of Front End analysis and optimisation.

Provided that there is access to developers / the source code, outputting timing metrics via the range of APIs becoming available provides a robust, production-ready richness to analysis of visitor performance experience. These can prompt intervention and provide deep understanding of production performance. The caveats already expressed (monitoring availability, competitors, and performance in low-traffic situations, including pre-production) still apply.

When presented with new challenges to effective practice, the following cascade should prove its worth:

Cascade flow map – summary

 

Suggested reading list:

This blog series, although perhaps longer than ideal, has only skated across the surface of the subject. Hopefully, it will provide some practical pointers to effective management / best practice. For those wishing to acquire a deeper understanding (which is certainly recommended), a lot of material exists. The blogosphere, vendors, independent consultancies (such as Intechnica [www.intechnica.co.uk]), eBooks, and web performance meetup groups (such as the excellent London Web Performance MeetUp http://www.meetup.com/London-Web-Performance-Group/) are all good sources for keeping abreast of recent developments. For ‘core’ reading, the following are a good start:

  • High Performance Websites – S. Souders, O’Reilly, 2008 (NB good for core principles, but some of the detail is now superseded)
  • Even Faster Websites – S. Souders, O’Reilly, 2009
  • High Performance Browser Networking – I. Grigorik, O’Reilly, 2013
  • Using WebPagetest – Viscomi et al, O’Reilly, 2016
  • The Art of Application Performance Testing – I. Molyneaux, 2nd Edition, 2015 (currently being revised and updated)

Introduction to FEO – granular analysis Pt 3

This post, part of an introductory series to Front End Optimisation practice, considers detailed analysis of clientside components. It is a relatively high level treatment. A list of sources for more detailed study will be provided in the final (summary) post.

Other titles in this blog series are:

  • FEO – reports of its death have been much exaggerated [also published in APM Digest] – 22 February 2016
  • Introduction to FEO – Tooling Part 1 – 1 March 2016
  • Introduction to FEO – Tooling Part 2 – 8 March 2016
  • Introduction to FEO – Operations – Process – 15 March 2016
  • Introduction to FEO – Granular Analysis Part 1 – 22 March 2016
  • Introduction to FEO – Granular analysis Part 2 -29 March 2016
  • Introduction to FEO – Granular analysis Part 3
  • Introduction to FEO – Edge Case management
  • Introduction to FEO – Summary, bibliography for further study

This final post on granular analysis, as applied to Front End Optimisation, briefly considers the increasingly important area of performance to mobile devices.

  • Mobile device analysis

The high and increasing traffic from mobile device users makes careful consideration of the end user experience a key part of most current FEO efforts.

Investigation typically uses a combination of emulation-based analysis (including browser developer tools and rules-based screening, eg the PageSpeed Insights tool discussed in an earlier post) and real device testing.

The key advantage of testing from ‘real’ mobile devices, as opposed to spoofed user agent string / PC based testing, is that the interrelationship between device metrics and application performance can be examined. As discussed in the ‘tooling’ posts, ensuring good, known control conditions, both of connectivity (bandwidth, SIM public carrier or WiFi) and of device environment, is crucial to effective interpretation of results.

Most ‘cross device’ tools are designed for functional (or in some cases load) testing rather than performance testing per se. This limits their value. The choices are between:

  • Limiting investigation to browser dev tools
  • Building/running an in house device lab with access to presentation layer timings and device system metrics
  • Using a commercial tool – these are thin on the ground, but Perfecto Mobile is worth a look
  • Using the real device testing offered by Vendors such as TestPlant (eggOn), or Keynote.

Four approaches to understanding the performance of native applications are possible:

  • Consider Perfecto Mobile’s combination of visual endpoint and core metric testing (www.perfectomobile.com)
  • Instrument the application code using a Software Development Kit [SDK]. This is the approach adopted by the APM vendors; it is typically stronger on end user visibility than on control of test conditions or range of device metrics. Inclusion of crash analytics can be useful.
  • Use a PCAP approach – analysing the initial download size and ongoing network traffic between the user device and origin. This is the approach taken by the AT&T ARO tool (https://developer.att.com/application-resource-optimizer)
  • Build your own in-house device lab. This is potentially more problematical than it may appear, for many reasons. This presentation, by Destiny Montague and Lara Swanson of Etsy, given at the 2014 Velocity conference, provides a good overview from a corporate team that has successfully embraced this approach:
    • Part 1 here> https://www.youtube.com/watch?v=QOatJD_3bTM
    • Part 2 here> https://www.youtube.com/watch?v=YBn_bQrdVRI

Whichever approach is taken, having defined your control conditions within the constraints of the tool/approach selected, key aspects to examine include:

  • Timeline – understand the interrelationship between the various delivery components – JavaScript processing, image handling etc and CPU utilisation
  • System metrics – when delivering cached & uncached content. These include:
    • CPU – O/S (Kernel), User, Total
    • Memory – Free, Total
    • Battery state
    • Signal strength
  • Crash analytics
  • Impact of third party content
  • Association of issues with delivery infrastructure/core application performance. This coordination is effectively provided by many modern APM tools.

 


CPU utilisation trace during test transaction – Android device (Perfecto Mobile)

The next post in this series considers monitoring approaches to a number of edge case conditions.

Introduction to FEO – granular analysis Pt 2

This post, part of an introductory series to Front End Optimisation practice, considers detailed analysis of clientside components. It is a relatively high level treatment. A list of sources for more detailed study will be provided in the final (summary) post.

Other titles in this blog series are:

    • FEO – reports of its death have been much exaggerated – 22 February 2016  [also published in APM Digest]
    • Introduction to FEO – Tooling Part 1 – 1 March 2016
    • Introduction to FEO – Tooling Part 2 – 8 March 2016
    • Introduction to FEO – Operations – Process – 15 March 2016
    • Introduction to FEO – Granular Analysis Part 1 – 22 March 2016
    • Introduction to FEO – Granular analysis Part 2 -29 March 2016
    • Introduction to FEO – Granular analysis Part 3 – 5 April 2016
    • Introduction to FEO – Edge Case management – 12 April 2016
    • Introduction to FEO – Summary, bibliography for further study – 12 April 2016

Granular analysis Part 2 – Component-level characteristics

In the post on 22 March, we looked at the use of detailed, external-monitoring based scrutiny of site performance. Careful consideration of this data (for example, the effect of third party components on browser fill times between ISPs or at different connection speeds), should deliver value in two areas: i) the performance characteristics of the site, for example latency to international markets, or excessive variation during periods of high demand, and ii) some understanding of the root cause of the issues – client side/ server side, DNS lookup, image content delivery, or whatever.

ISP Peerage chart (dynaTrace synthetic)

Armed with this knowledge, we can now focus our optimisation efforts on causation in specific areas, using appropriate ‘deep dive’ tooling. This approach will be both more time and cost effective than seeking to apply ‘broad brush’ solutions such as increasing infrastructure capacity.

Detailed interventions will obviously depend upon the nature of the issue(s) encountered, but a number of sources exist which consider specifics in more detail than is possible here (see bibliography in summary post).


  • Component level analysis

Following investigation of response anomalies, it is useful to undertake some detailed analysis of the client-side components of the target page(s) – for example each individual page in a key revenue-bearing transaction – what is sometimes termed ‘static’ (as opposed to ‘dynamic’) analysis.

Raw  data from representative individual tests (in controlled conditions) should be downloaded and examined. It can be particularly useful to compare with a similar high-performance reference site (possibly a competitor). Such analysis should include consideration of both individual components and network packet capture [PCAP] traces.


‘Static analysis’ Individual component/competitor comparison

Notes regarding investigation of individual components:

  • Client-side logic (JavaScript). Consider:
    • The absolute number of scripts and their blocking behaviour (if any)
    • Download source, number of domains
    • Code size – compression, minification.
    • Code coverage.
    • .js total duration vs execution time. Long duration scripts should be examined to understand which elements consume the most time overall (see the sketch after this list for a simple in-page check)
    • CPU overhead – particularly important if delivered to limited capacity mobile devices
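
Where a particular script is suspected, a crude in-page check of execution cost can complement the tool-based views shown below; a minimal sketch follows (the wrapped function name is illustrative).

// Sketch: crude in-page timing of a suspect function's execution cost.
function timed(fn, label) {
  return function () {
    var t0 = performance.now();
    var result = fn.apply(this, arguments);
    var t1 = performance.now();
    console.log(label + ' executed in ' + Math.round(t1 - t0) + 'ms');
    return result;
  };
}

// Usage (function name is illustrative):
// renderProductGrid = timed(renderProductGrid, 'renderProductGrid');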


Example – Intensive JavaScript processing (Chrome developer tools timeline)


JavaScript comparison – note discrepancies between execution time & total duration, CPU overhead and size (dynaTrace AJAX Edition)

Individual script – time distribution breakdown  (Transfer/Wait/Server);  Long duration script – individual call stacktrace examination

 

  • Images/multimedia

Images and multimedia content often represent a huge proportion of the total payload of a page. This is a very fertile area for enhancement, and also for the application of effective governance of acceptable content specification. It is important to avoid bad practice such as HTML based scaling.

Content format should be considered, both in terms of relevance to the image type – icon, animation, transparency etc – and the recipient user device. The former is covered well in the relevant chapter of Steve Souders’ ‘Even Faster Websites’, though note that the SmushIt tool is unfortunately no longer available. Some visibility of the effect of optimal compression (at least as far as .png images are concerned) can be gained from other open source tools such as pngcrush (http://pmt.sourceforge.net/pngcrush) or PNGGauntlet (pmggauntlet.com).

Efficient image handling by particular devices is facilitated by using an appropriate format – WebP images for Android mobile devices will save some 30%, for example. Compression is also important, although some of the most dramatic savings are delivered by ensuring that quality is the ‘minimum acceptable’ rather than ‘best possible’. This is a subjective decision, but well worth exploring, especially for small format mobile devices.

Having determined and stored a set of ideal images, delivery can be managed automatically by reference to the visitor browser user agent string. These services are offered as options by established major vendors such as Akamai and Adobe Scene 7. The imgix site (https://sandbox.imgix.com/create) is worth exploring, both as a source of services and (using their sandbox) as a way to examine the effect on overall size of changing specific parameters.

With regard to monitoring multimedia streams, it is worth referencing that several major APM vendors are planning this capability as an extension to RUM functionality in their downstream roadmaps.

  • Network based investigation

Synthetic test tooling essentially operates by analysing network interactions between delivery infrastructure and user client (with varying degrees of sophistication).

Much of this information is derived from PCAP (packet capture) data. This is usually represented graphically using a waterfall chart. Such charts can provide much useful information in addition to the parsing of individual object delivery (ie partitioning between DNS lookup, connection, first byte and content delivery times).

Such aspects as blocking behaviour, asynchronous/synchronous delivery and the absence of persistent connections are clearly shown. Side by side comparison of waterfalls with different page response time outcomes is a useful technique.


3rd party JavaScript blocking activity (due to interaction with 3,000+ DOM nodes) [Chrome Developer Tools]

In certain cases, it may be useful to examine traces from ‘full fat’ network tracing tools (such as Gomez Transaction Trace Analyzer or WireShark/CloudShark). The image below illustrates the use of Cloudshark to investigate/confirm server response latency by identifying delay between network SYNchronise and ACKnowledge responses.


Pinpointing server latency (CloudShark example)

  • Other aspects

The above highlight some of the most fertile areas for potential optimisation. Many others exist, for example cache/compression handling and third party content interactions (including ‘daisy chaining’ behaviour by introduced affiliate tags – a potential security, as well as performance, issue).

Examples of poor performance arising from design or governance practice include the presence of multiple/outdated versions of affiliate tags or jQuery libraries, and sites with excessive numbers of DOM nodes. Although a not infrequent cause of performance inefficiency, the latter is a good example of a finding that is not amenable to immediate change, as it requires a fundamental rebuild of the site.

The blocking behaviour of stylesheets and fonts is worth considering – in the case of fonts particularly if your user base has a high proportion of Safari users due to this browser’s poor response to missing content.

Two examples that highlight potential cross domain content interrelationship issues:


Third party latency map (Ghostery)


3rd Party interrelationships by type (RequestMap [NCC Group]) – open source


Further detail – third party affiliate tag content by DNS response (WebPage Test)

Tools of this type can be extremely useful in visualising issues, particularly in sites with heavy affiliate content loads.

  • CDN Performance Assurance

Content Delivery Network (CDN) usage is extremely pervasive, both for ‘core’ content and by third parties. A useful first step is to screen your target site with a tool such as CDN Planet’s CDN finder tool (http://www.cdnplanet.com/tools/cdnfinder). This will, in most cases, display a list of CDNs detected by domain. CDN technology is very powerful, but needless to say, it does not have miraculous powers. Like any other tool, it is reliant on correct initial configuration, both of the CDN itself and the cache settings of the accelerated site. Individual CDNs do vary in efficiency, particularly  between global regions. For all these reasons, it is worth undertaking a CDN performance assurance test, providing that you have suitable tools at your disposal. Today, this means dynaTrace Synthetic (formerly Gomez) ‘Last Mile’ network, or the equivalent feature in Catchpoint.

Both Vendors offer the ability to test from a distributed network of consumer PCs. ISP based testing is of limited use for this purpose (for reasons that I don’t have space to go  into). Although in an ideal world ongoing assurance testing (linked to a Service Level Agreement) is beneficial, in practice a limited test of 24 or 48 hour duration will pick up any gross issues.

Two aspects are useful for FEO screening purposes:

  1. Provided that it is possible to set up testing directly against the origin content (bypassing the CDN) – this will depend on how the site is coded with regard to the CDN – set up parallel tests from end user locations in relevant geographies: one test navigating to the origin servers, and the other to the local (CDN) cache.

The discrepancy between the response values obtained is effectively what you are paying the CDN for. In the example below, an average acceleration of 77% was delivered during the period of test. For ongoing tests (eg for operations dashboards etc), it is easier to visualise using average line traces rather than scattergrams.

  2. Using reverse IP lookup, examine the origin location for the CDN content. Bear in mind that CDN delivery is dynamically optimised for performance, not origin location. However, such inspection can pull up examples of poor configuration where present (the example below is normal, although the range of origin locations delivering content to an exclusively UK user base is interesting). A minimal sketch of the lookup step follows.
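
The sketch below uses Node’s standard dns module; the address shown is a documentation-range placeholder, and in practice the IPs recorded against each object in the test results would be fed in.

// Sketch: reverse-resolving the IPs observed for CDN-served objects (Node.js).
const dns = require('dns');

const observedIps = ['203.0.113.10']; // placeholder address

observedIps.forEach((ip) => {
  dns.reverse(ip, (err, hostnames) => {
    if (err) {
      console.log(ip + ': reverse lookup failed (' + err.code + ')');
    } else {
      console.log(ip + ': ' + hostnames.join(', '));
    }
  });
});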


CDN performance assurance – target international market – origin vs local cache response


CDN host response by location

In my next post (Introduction to Granular analysis Part 3), I will briefly cover approaches to analysis of delivery to mobile devices.

Introduction to FEO – granular analysis Pt 1

This post, part of an introductory series to Front End Optimisation practice, considers detailed analysis of clientside components. It is a relatively high level treatment. A list of sources for more detailed study will be provided in the final (summary) post.

Other titles in this blog series are:

    • FEO – reports of its death have been much exaggerated – 22 February 2016  [also published in APM Digest]
    • Introduction to FEO – Tooling Part 1 – 1 March 2016
    • Introduction to FEO – Tooling Part 2 – 8 March 2016
    • Introduction to FEO – Operations – Process – 15 March 2016
    • Introduction to FEO – Granular Analysis Part 1 – 22 March 2016
    • Introduction to FEO – Granular analysis Part 2 -29 March 2016
    • Introduction to FEO – Granular analysis Part 3 – 5 April 2016
    • Introduction to FEO – Edge Case management – 12 April 2016
    • Introduction to FEO – Summary, bibliography for further study – 12 April 2016

Granular FEO analysis

In earlier posts, I gave an overview of the types of tooling available for use as part of a Front End Optimisation effort, and sketched out a suggested process for effective results in this area.

Having understood the external performance characteristics of the application, in both ‘clean room’ and, more particularly, in a variety of end user monitoring conditions, we now approach the core of Front End Optimisation. Monitoring will give a variety of ‘whats’, but only detailed granular analysis will provide the ‘whys’ necessary for effective intervention.

The initial monitoring activity should have provided a good understanding of how your site/application performs across a range of demand conditions. In addition, regardless of the absolute speed of response, comparison with the performance of competitor and other sites should indicate how well visitor expectations are being met, and the initial goals for improvement.

Before plunging into hand-to-hand combat with the various client side components of your site, it is worth taking time to ensure that whoever is charged with the analysis knows the site in detail. How is it put together? What are the key constraints – business model, regulation, 3rd party inclusions, legacy components – it’s a long list… Whilst being prepared to challenge assumptions, it is good to know what the ‘givens’ are and what is amenable to modification. This provides a good basis for detailed analysis. The team at Intechnica typically adopt a structured approach as outlined below, bearing in mind that the focus of investigation will differ depending on what is found during the early stages.

As these posts are aimed at the ‘intelligent but uninformed’ rather than leading edge experts, it is also worth ensuring that you are aware of the core principles. These are well covered in a number of published texts, although things move quickly, so the older the book, the more caution is required. A short suggested reading list is provided in the ‘Summary’ post at the end of this blog series .

In summary, a logical standard flow for the analysis phase could be as follows:

  • Rules based screening
  • Anomaly investigation
  • Component-level analysis
  • Network-based investigation
  • Recommendations & ongoing testing

The above applies to all investigations, although tooling will differ depending on the nature of the target application. We take a similar approach to all PC based applications. Analysis of delivery to mobile devices, whether web, webapp, or native mobile applications, benefits from some additional approaches, and these are also summarised below.

Taking the various stages in turn:

  • Rules based screening:

Rules-based ‘best practice’ RAG screening (dynaTrace AJAX Edition)

Flippantly, traditional rules based tools have the advantage of speed, and the disadvantage of everything else! Not quite true of course, but it is necessary to interpret results with caution for a number of reasons, including:

  • developments of technology and associated best practice (eg adoption of HTTP/2 makes image spriting – formerly a standard recommendation – an antipattern)
  • limitations of practical interpretation/priority (eg rules based on the percentage gains from compression can flag changes that are small in absolute terms)
  • Just plain wrong (eg rules which interpret CDN usage as ‘not using our CDN’)

Perhaps for a combination of these reasons, the number of free screening tools is rapidly diminishing – YSlow, the (excellent) SmushIt image optimisation tool, and dynaTrace AJAX Edition have all been deprecated over the last year or so. PageSpeed Insights from Google is a ‘best in class’ survivor. It is incorporated within a number of other tools, and provides speed and usability recommendations for both mobile and PC.

So the message is – rules based screening is a good method for rapidly getting an overall picture of areas for site optimisation, but a) use recent tools and b) interpret judiciously.

In general, the developer tools provided by the browser vendors are an excellent resource for detailed analysis. Access (highlighted) to PageSpeed Insights via the Chrome developer tools is illustrated below.


Automated (rules-based) analysis – Google PageSpeed Insights

Rules based screening should provide an insight into the key areas for attention. This is particularly valuable in time-intensive screening of multiple components (eg cache settings).

  • Anomaly investigation – slow vs fast vs median

A next logical step is to investigate the underlying root cause of anomalies highlighted in the preliminary monitoring phase. Average traces are useless (for all except possibly long term trend identification), so it will be necessary to identify outliers and other anomalies on the basis of scattergrams of raw data. Seek to associate underlying causes. Prior to detailed ‘drilldown’, consider possible high level effects.

Common amongst these are traffic (compare with data from RUM or web analytics), poor resilience to mobile bandwidth limitations, and delivery infrastructure resource impact – from background batch jobs or cross over effects in multitenant providers.


Multitenant platform effects – base HTML object response during Black Friday peak trade weekend 2015 (reference site in blue)

The amount of detail available will obviously depend upon the tooling used for the initial monitoring, although recurrent effects, if identified, should enable focused repeat testing with other, more analytics focused products such as WebPageTest.

A few notes:

  • Statistical analysis of individual components is powerful – compare maximum, minimum and dispersion of individual components (DNS time, connect time etc) from median and outlier responses. Progressively remove specific content (eg 3rd party tags) and compare the effect.


Visual progress charts with and without 3rd party affiliates (WebPage Test)


Daily traffic patterns to major UK eCommerce site (Google Analytics)


Intraday analysis – peak vs low traffic

  • Beware distortion – particularly if page load endpoints have been insufficiently well defined (see earlier posts). Waterfall charts should always be inspected to detect ‘gotchas’ such as below-the-fold asynchronous content or server push connections. Caution needs to be exercised in interpretation of short responses as well as long. Compare payloads – these are often impacted by variable implementation of server side compression, or content failure.

APM Baseline data may be useful here – although baseline management deserves a post to itself!

Further consideration will be given to detailed  analysis in the next post [Granular analysis Part 2].

Introduction to FEO – Process

Our introductory survey of FEO best practice continues by outlining a standardised approach to monitoring and analysis.

Other titles in this blog series are:

    • FEO – reports of its death have been much exaggerated – 22 February 2016  [also published in APM Digest]
    • Introduction to FEO – Tooling Part 1 – 1 March 2016
    • Introduction to FEO – Tooling Part 2 – 8 March 2016
    • Introduction to FEO – Operations – Process – 15 March 2016
    • Introduction to FEO – Granular Analysis Part 1 – 22 March 2016
    • Introduction to FEO – Granular analysis Part 2 -29 March 2016
    • Introduction to FEO – Granular analysis Part 3 – 5 April 2016
    • Introduction to FEO – Edge Case management – 12 April 2016
    • Introduction to FEO – Summary, bibliography for further study – 12 April 2016

Process – evidence from Monitoring:

 Having considered the types of tools available to support external monitoring, this post continues the ‘FEO toe dip’ by outlining a structured process to support understanding and intervention in this important area.

At high level, a logical Front End Optimisation [FEO] process seeks to progressively move from general to specific understanding. The following are the key stages:

  • Target definition, test specification
  • External performance (response) and patterns
  • ‘Performance monetisation goal’ – what is the optimal performance/investment that will just meet business goals
  • Distribution of time between front end processing, back end, and third party components – how much FEO is going to be required?
  • Detailed, granular client side analysis [covered in a future post]
  • KPI definition; ongoing monitoring (best case external + end user)

A detailed report should consider many aspects of performance: both the outturn (that is, the recorded response of the site or application in known test conditions) and the underlying performance of relevant contributory factors. The table below is an extract from an analysis report on a major corporate site. It illustrates some of the factors considered.


RAG summary – FEO report

Before embarking on any actual analysis, it is worth pausing to define the targets for the testing. Such target definition is likely to collate information from multiple sources, including:

  • Knowledge of the key user touchpoints, for example landing pages, product category pages, shopping basket. In thinking about this, a useful guide is to “follow the money”, in other words to track key revenue generating paths/activities.
  • Information derived from web analytics. This useful source will identify key transaction flows (any unusual deviations from ‘theoretical’ expectations may reflect design or performance issues). Areas of the site associated with unexplained negative behaviours should be included – for example, transaction steps associated with high abandonment, or pages with high inbound (eg search engine derived) traffic and high bounce rates.

If available, user click pattern ‘heat maps’ can also be a useful supplement.


Visitor interaction – click pattern ‘heatmap’ – standalone example [www.crazyegg.com]

  • ‘Folk knowledge’ – internal users, ‘friends and family’, customer services, CEO’s golfing partners…..
  • Other visitor-based analysis (eg Real User Monitoring) – in particular key markets, devices, operating systems, screen resolutions, and connectivity distributions. The latter is particularly useful if supported by your APM tool.

One note of caution. ‘Raw’ visitor-derived data (ie derived from the field, not a usability lab) is (obviously) the outcome of actual experience rather than objective, controlled test conditions. For example, a low proportion of low specification mobile devices in the user stats may just be a reflection of user demographics, or it may reflect user satisfaction issues. This is where validation of RUM inferences using synthetic testing is particularly useful. ‘Why might this not be true?’ is a useful mindset for interpretation.

  • Marketing/Line of Business input – who are the key competitors (by market), can anything be learnt from digital revenue data?

This will lead to a definition of the test parameters. Although the more core and edge-case conditions are tested, the better the overall understanding, in practice these will be limited by time and money.

Example test matrix:

Parameter – Value
PC browser(s), versions – Edge; Chrome 48; IE 10, 11; FF 34, 44
Screen resolution(s) – 1366×768
Mobile device (web) – Samsung Galaxy 5 and 6, iPhone 6…
Mobile device, O/S (mobile apps) – Samsung Galaxy 6, Android
Mobile app details, source – Xyz.apk, Google Play
Connection bandwidth range (hardwired & wireless) – 0.25 – 2.0 Mbps
Target ISPs and wireless carriers (by market) – [UK, hardwired] BT, Telstra; [UK, wireless] EE, Vodafone
Key geography(ies) – UK; S Spain (Malaga); Hong Kong
Competitor sites (/target details), by market –
Other specific factors (eg user details associated with complaints) –

Armed with the target specifications for testing, it is useful to begin by monitoring the outturn performance of the site/application. Such monitoring should be representative of a broad range of user conditions, and should identify patterns of response behaviour across a relatively extended period of use (perhaps 2 weeks for ‘business as usual’ data, together with peak events as a comparison). Bear in mind that the details of the clientside test conditions are likely to markedly affect the data. Some examples:


3rd Party-associated  cross browser variation – PC example


Mobile response testing – limiting bandwidth connection speed vs page response


Payload variation -Samsung smartphone, wireless carrier, 1-1.5 Mbps

Variation in page size (example above) should always be investigated. Causes may be internal (eg differential compression settings across origin servers) or external (eg differences in handling by CDN provider or wireless carriers). Appropriately designed follow-up testing will isolate the primary cause, providing that the tooling used offers appropriate flexibility.

Such preliminary monitoring enables us to understand what we are ‘up against’ from a Front End Optimisation perspective, and ultimately whether we are looking at fine tuning or wholesale interventions. Use of APM tooling can be particularly useful at this initial stage, both in understanding the relative proportion of delivery time associated with client side vs back end processing (example below), and in isolating/excluding any issues associated with delivery infrastructure or third party web services calls. However, as the external monitoring extensions to APM tools are still evolving in functionality (particularly in relation to synthetic testing), additional tools will probably be preferred for FEO monitoring, due to the better control of test conditions and/or granular analysis offered by more mature products.


Client Side vs Server Side processing time [dynaTrace (AJAX Edition)]

In capturing this baseline data, it is important to compare both consistent and end user (inherently variable) conditions. Ideally, both visitor based [RUM] and synthetic-based data should be used. This will give useful information regarding the performance of all components in all traffic conditions. As mentioned in Blog 3, if it is possible to introduce common ‘above the fold’ (perceived render time) endpoints as custom markers in both types of test, that will assist in reading across between the test types. Such modifications provide a more realistic understanding of actual end user response, although they would be somewhat cumbersome to implement across a wide range of screen resolutions.

Sub page level performance. Depending upon the detailed characteristics of the target sites, it is often useful to run several comparative monitor tests. Some specific cases (eg Single Page Applications, server push content) will be covered in a later blog. It is often useful to understand the impact of particular components on overall response. This can be achieved in a number of ways, but two of the most straightforward are to test for a SPOF or single point of failure – ie the effect of failure of particular (often third party) site content, or to remove content altogether. Techniques for achieving this will depend on the particular tool being used. See Viscomi et al’s Using WebPageTest (O’Reilly, 2016) for details in relation to that particular tool. The same intervention can be made in most synthetic tools (with more or less elegance) using the relevant scripting language/utility.


‘Above the fold’ end point – custom insertion of flag image – synthetic testing

Use of different test end points can have significant effects on reported results / interpretation. The table below illustrates the variation to a single target page within a major UK eCommerce site:

Response time by test end point – PC example

Selective filtering of content can also be used to examine the effect of particular calls on aggregate delivery metrics such as DNS resolution or content delivery times.

The following is a standard monitoring matrix that we typically use for preliminary screening of external performance. The results are used to inform and direct detailed granular analysis, an overview of which will be covered in the next blog in this series.

FEO preliminary screening process – example:

  • ‘Dynamic’ performance – page onload and perceived render [‘Above the Fold’]
    • 24×7 Availability and response patterns – synthetic ISP (by market)
    • 24×7 Availability & response – end user by market (synthetic & RUM)
      • Target browser/screen resolution & device
        • Any cross browser/device discrepancies?
      • Defined connectivity – hardwired & public wireless carrier
    • Page response distribution
      • Histogram of response ranges
    • Median Response & distribution (Median Absolute Dispersion)
      • Weekly business hours
      • Day vs Night (variation with traffic)
      • Cached vs Uncached
      • By key market / user category
    • Performance monetisation (tool dependent, examples):
      • Page/Transaction response vs shopping cart conversion
      • Page/transaction response vs abandonment
      • Page/transaction response vs mean basket size
      • Page/transaction response vs digital revenue (defined time period)
      • Page response vs bounce or exit rate
    • Competitive comparison – direct and mass market sites
      • Page and ‘revenue bearing transaction’ (eg search & Add to Basket)
    • Limiting bandwidth tests
      • Response to progressively reducing connectivity conditions
        • Wi-Fi & public carrier
    • Transaction step comparison
      • Where are the slowest steps (& why – eg database lookup)?
    • ‘Payload’ analysis
      • Page download size patterns
    • ‘Affiliate load’ – 3rd party effects
      • Filter & SPOF testing
    • ‘Real device’ mobile testing
    • Component splits/patterns
      • DNS/SSL resolution
      • Connectivity
      • First byte delivery time (infrastructure latency)
      • Content delivery
    • CDN performance assurance
      • Origin vs local cache comparison
    • Detailed ‘static’ component analysis

The above checklist will provide a picture of the revenue-relevant behaviour of the site/application. It is not exhaustive – it is necessary to be led by the findings from case to case – but it supports targeting of the more granular ‘static’ component-level analysis which provides the root cause / business justification basis for specific remediation interventions. Some approaches to detailed, granular analysis are covered in the next post of this series.
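As a small illustration of the ‘response distribution’ and Median Absolute Deviation items in the checklist above, the following sketch summarises a set of exported response samples. The sample values and the one-second bucket width are assumptions.

```typescript
// Sketch: summarise exported page response samples as median, Median Absolute
// Deviation (MAD) and a simple histogram. Sample values and the one-second
// bucket width are assumptions.
const samplesMs = [1420, 1510, 1630, 1720, 1890, 2050, 2300, 2850, 3400, 6200];

function median(values: number[]): number {
  const s = [...values].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

const med = median(samplesMs);
// MAD is robust to the long tail typical of web response distributions.
const mad = median(samplesMs.map((v) => Math.abs(v - med)));
console.log(`median = ${med} ms, MAD = ${mad} ms`);

// Histogram of response ranges (one-second buckets).
const histogram = new Map<number, number>();
for (const v of samplesMs) {
  const bucket = Math.floor(v / 1000);
  histogram.set(bucket, (histogram.get(bucket) ?? 0) + 1);
}
for (const [bucket, count] of [...histogram].sort((a, b) => a[0] - b[0])) {
  console.log(`${bucket}-${bucket + 1} s: ${'#'.repeat(count)}`);
}
```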

 

Introduction to FEO – Tooling Pt 2

This post continues the introductory survey by concluding an examination of tooling approaches.

Other titles are:

    • FEO – reports of its death have been much exaggerated – 22 February 2016  [also published in APM Digest]
    • Introduction to FEO – Tooling Part 1 – 1 March 2016
    • Introduction to FEO – Tooling Part 2 – 8 March 2016
    • Introduction to FEO – Operations – Process – 15 March 2016
    • Introduction to FEO – Granular Analysis Part 1 – 22 March 2016
    • Introduction to FEO – Granular analysis Part 2 – 29 March 2016
    • Introduction to FEO – Granular analysis Part 3 – 5 April 2016
    • Introduction to FEO – Edge Case management – 12 April 2016
    • Introduction to FEO – Summary, bibliography for further study – 12 April 2016

Tooling Part 1 considered synthetic (otherwise known as active) monitoring of PC based sites – examining data from replicate ‘heartbeat’ external tests in known conditions. Now let’s consider complementary monitoring of actual visitor traffic, and aspects of mobile device monitoring.

Passive monitoring, variously known as Real User Monitoring [RUM], End User Monitoring [EUM], or User Experience Monitoring [UEM], is based on the performance analysis of actual visitors to a website. This is achieved by the (manual, or more usually automatic) introduction of small JavaScript components to the webpage. These typically record and return (by means of a beacon) the response values for the page, based on standard W3C navigation metrics – DOM ready time, page onload time, etc. It is worth noting in passing that these metrics are not supported by all browsers – notably older versions of Safari, among others. However, the proportion of user traffic using unsupported versions of browsers other than Safari is likely to be fairly negligible today, at least for core international markets.
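The following sketch illustrates the principle only – it is not any vendor's actual snippet. It reads the standard navigation metrics on page load and returns them via a beacon to a placeholder collection endpoint.

```typescript
// Sketch of the RUM principle: read the standard W3C navigation metrics once
// the page has loaded and beacon them back. '/rum-collector' is a placeholder;
// commercial snippets add sampling, session context, SPA handling and fallbacks.
window.addEventListener('load', () => {
  const nav = performance.getEntriesByType('navigation')[0] as PerformanceNavigationTiming;
  if (!nav) return; // no Navigation Timing support (very old browsers)

  const payload = JSON.stringify({
    url: location.href,
    ttfbMs: Math.round(nav.responseStart),
    domReadyMs: Math.round(nav.domContentLoadedEventEnd),
    onloadMs: Math.round(nav.loadEventEnd || performance.now()),
  });

  // sendBeacon survives page unload; fall back to fetch if it is unavailable.
  if (!navigator.sendBeacon?.('/rum-collector', payload)) {
    fetch('/rum-collector', { method: 'POST', body: payload, keepalive: true }).catch(() => {});
  }
});
```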


A RUM dashboard, showing near real-time performance by geography, device, etc.  [AppDynamics]

Modern RUM tooling increasingly captures some information at object level as well (or can be modified to do so). A useful capability, available in some tools, is the ability to introduce custom end points. If supported, these can be coordinated with appropriately modified synthetic tests (as discussed in blog 2), providing the ability to read across between active and passive test results.

A further useful capability in some RUM tools is Event Timing. Event timing involves the placing of flags to bracket and record specific user invoked events (for example the invocation of a call to a Payment Service Provider as part of an eCommerce purchase).
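Conceptually, such event timing amounts to bracketing the user action with custom marks, as in the sketch below. The element id, endpoint and metric names are hypothetical; commercial RUM tools expose this capability through their own configuration rather than hand-written code.

```typescript
// Sketch of event timing: bracket a specific user-invoked event (here a call
// to a payment provider during checkout) with custom marks and beacon the
// elapsed time. Element id, URL and metric names are hypothetical.
document.querySelector('#pay-now')?.addEventListener('click', async () => {
  performance.mark('psp:start');
  try {
    await fetch('/api/psp/authorise', { method: 'POST' });
  } finally {
    performance.mark('psp:end');
    performance.measure('psp-call', 'psp:start', 'psp:end');
    const [m] = performance.getEntriesByName('psp-call', 'measure');
    navigator.sendBeacon(
      '/rum-collector',
      JSON.stringify({ event: 'psp-call', durationMs: Math.round(m.duration) })
    );
  }
});
```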

The ability to report on transaction timings (as opposed to single page or page group performance) is particularly useful, although relatively rarely supported. When present, this extends the ability to monetise performance, that is, to understand the association between page response to end users and business-relevant metrics such as order size or transaction abandonment.

Creation of such performance:revenue decay curves (for different categories of user) – together with an understanding of performance relative to key competitors – enables decision support regarding optimal site performance, ie avoiding under- or over-investment in performance.

Another approach to monetisation is to use the events database analytics extensions offered by some APM Vendors. Examples include New Relic Insights and AppDynamics Analytics. These types of offering certainly provide powerful visibility through the ability to perform multiparameter SQL-like interrogations of rich business and application big-data sets. To obtain maximal value, such products should ideally support relational joins – to, for example, compare conversion rates between transaction speed ‘buckets’. It is worth delving into the support (immediate or planned) from a given Vendor for the detailed outputs that will underpin business decision support in this important area.
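The bucket/join idea can be illustrated in a few lines. The sketch below is a standalone illustration (not any vendor's query language): it groups transactions into response time buckets and computes the conversion rate per bucket.

```typescript
// Sketch of the bucket/join idea: group transactions by response time and
// compare conversion rate per bucket. Record shape and values are assumptions;
// in practice this would run inside the vendor's analytics product.
interface TxnRecord { responseMs: number; converted: boolean; }

const records: TxnRecord[] = [
  { responseMs: 900, converted: true },
  { responseMs: 2100, converted: true },
  { responseMs: 3600, converted: false },
  { responseMs: 5200, converted: false },
];

const byBucket = new Map<number, { total: number; converted: number }>();
for (const r of records) {
  const bucket = Math.floor(r.responseMs / 1000); // one-second buckets
  const agg = byBucket.get(bucket) ?? { total: 0, converted: 0 };
  agg.total += 1;
  if (r.converted) agg.converted += 1;
  byBucket.set(bucket, agg);
}

for (const [bucket, agg] of [...byBucket].sort((a, b) => a[0] - b[0])) {
  const rate = ((100 * agg.converted) / agg.total).toFixed(1);
  console.log(`${bucket}-${bucket + 1} s: conversion ${rate}% (${agg.total} txns)`);
}
```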

A key question:

  • What is the optimum performance goal balancing investment vs revenue return?


Monetisation: Revenue bearing transaction performance vs revenue [AppDynamics]


Monetisation: Key page response vs transaction abandonment rate [dynaTrace]

Mobile device monitoring:

Effective monitoring and analysis of application delivery to mobile devices is crucial, given the predominance of mobile users on many sites. Tooling categories are outlined below, together with their use cases. It is likely that a combination of tools will be required.

A core distinction is between emulated and real device testing. Emulation testing has the advantage of convenience and the ability to rapidly test delivery across a wide variety of device types. It also uses a consistent, powerful PC based platform. This can be useful depending on the precise nature of the testing undertaken. Emulation consists of ‘spoofing’ the browser user agent string such that the request is presented to the target site as if from a mobile device. Given that it is important to replicate (a range of) realistic user conditions to gain an understanding of actual performance in the field, the most useful tools will permit comparison across a variety of connection types and bandwidths – ‘hardwired’; Wi-Fi; and public carrier network.

Many tools (eg WebPageTest, browser dev tools) only offer hardwired connectivity, throttled to provide a range of connection speeds. This can be appropriate during ‘deep dive’ analysis. It is, however, insufficient for monitoring comparisons.
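For completeness, a minimal emulation sketch using Puppeteer is shown below. The export names reflect recent Puppeteer releases and have changed between versions; the URL is a placeholder. Note that this remains emulation on a powerful host – it approximates viewport, user agent string, bandwidth and (crudely) CPU, not real device constraints.

```typescript
// Sketch: emulated mobile testing with throttled connectivity via Puppeteer.
// Export names (KnownDevices, PredefinedNetworkConditions) are those of recent
// Puppeteer releases and have changed between versions; the URL is a placeholder.
import puppeteer, { KnownDevices, PredefinedNetworkConditions } from 'puppeteer';

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.emulate(KnownDevices['iPhone X']); // user agent string + viewport
  await page.emulateNetworkConditions(PredefinedNetworkConditions['Slow 3G']);
  await page.emulateCPUThrottling(4); // crude stand-in for a slower mobile CPU

  await page.goto('https://www.example.com/', { waitUntil: 'load' });
  const onload = await page.evaluate(
    () => (performance.getEntriesByType('navigation')[0] as PerformanceNavigationTiming).duration
  );
  console.log(`Emulated iPhone X on 'Slow 3G': onload ≈ ${Math.round(onload)} ms`);

  await browser.close();
})();
```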


Emulation testing – ‘device’ selection [Chrome Developer Tools]

Real device monitoring

Testing from real mobile devices has a number of advantages. Access to the GUI for script recording (as, for example, in Perfecto Mobile) enables visual end point recording. Transactions may be recorded, not only for web sites but also for native mobile applications. A further advantage of testing from real devices is the enhanced control of, and understanding of, the performance influence of device characteristics. The performance delivered to a given device is likely to be influenced by system constraints. These may be inherent (eg processor and memory capacity, Operating System version) or dynamic (battery state, memory and CPU utilisation, etc). In addition, user behaviour and environmental factors can have a significant influence – everything from applications running in the background and the number of browser tabs open, to the ambient temperature.


Testing from real device – device selection [Perfecto mobile]

It’s that control word again – the more accurate your modelling of particular test conditions (particularly edge states), the more accurate and relevant your interpretation will become.

Native mobile Application analysis & monitoring.

Two approaches are possible here. For monitoring/visitor analysis, the most widely used approach (and that adopted by APM tooling) is to provide Software Development Kit (SDK)-based measurement. The application is instrumented by introducing libraries to the code via the SDK. The degree of visibility can usually be extended by introducing multiple timing points, eg for the various user interactions across a logical transaction. Errors are reported, usually together with some crash data.

All the major Vendors support both Android and iOS. Choices for other OSs (RIM, Windows Mobile) are much more limited due to their relatively small market share. Among the Gartner APM ‘Magic Quadrant’ vendors, I believe that only New Relic has any support in this area, via Titanium (cross platform) – at the time of writing, anyway.

SDK instrumentation options [New Relic]

Other tools exist for point analysis of native apps. AT&T’s Application Resource Optimiser (ARO) utility [https://developer.att.com/application-resource-optimizer] is a useful (open source) example. This screens applications against best practice in 25 areas, based on initial download size and network interactions (pcap analysis) via a VPN probe.


AT&T ARO – rules based best practice analysis (25 parameters) for native mobile apps

APM based external monitoring

Most modern APM tools will offer both synthetic and passive external monitoring to support ‘end to end’ visibility of user transactions. Although it is possible to integrate ‘foreign’ external monitoring into an APM backend, this is unlikely to repay the effort and maintenance overhead. The key advantage of using the APM vendor’s own end user monitoring is that the data collected is automatically integrated with the core APM tool. The great strength of APM is the ability to provide a holistic view of performance. The various metrics are correlated, thus supporting a logical drilldown from a particular end user transaction to root cause, whether application code or infrastructure based.

It is important to understand any limitations of the RUM and active test capabilities offered, both to assist in accurate interpretation and to make provision for supplementary tooling to support deep dive FEO analytics.

Ultimately, the strength of an APM lies in its ability to monitor over time against defined KPIs and Health Rules, to understand performance trends and issues as they occur, and to rapidly isolate the root cause of such issues.

These are powerful benefits. However, APM tools do not support detailed client side analysis against best practice ‘performance by design’ principles particularly well. That analysis is best undertaken as a standalone exercise, using independent tools designed specifically with it in mind. The key use case for APM is to support an understanding of ‘now’ in both absolute and relative terms, and to support rapid issue isolation/resolution when problems occur.


Front-End : Back-End correlation – RUM timings, link (highlighted) to relevant backend data

I will cover some aspects of standalone optimisation in blog 5 of this series [Granular analysis].

Introduction to FEO – Tooling Pt 1

This is Blog 2 of my ‘Introduction to Front End Optimisation’ [FEO] series. The posts are designed to provide an overview of effective best practice in external monitoring and Front End Optimisation for newcomers to the field, particularly non-technical business Managers. Many advanced reference works exist for those wishing to develop a specialism in this field. Some are listed in the final post of the series – summary & bibliography.

Other titles are:

    • FEO – reports of its death have been much exaggerated [also published in APM Digest] – 22 February 2016
    • Introduction to FEO – Tooling Part 1 – 1 March 2016
    • Introduction to FEO – Tooling Part 2 – 8 March 2016
    • Introduction to FEO – Operations – Process – 15 March 2016
    • Introduction to FEO – Granular Analysis Part 1 – 22 March 2016
    • Introduction to FEO – Granular analysis Part 2 – 29 March 2016
    • Introduction to FEO – Granular analysis Part 3 – 5 April 2016
    • Introduction to FEO – Edge Case management – 12 April 2016
    • Introduction to FEO – Summary, bibliography for further study – 12 April 2016

This is the second post in my nine-post blog series for newcomers to Front End Optimisation and analysis. APM tooling certainly has its place here, particularly for integrated, ongoing monitoring. However, it is probably useful to think of FEO as an extension activity, undertaken separately from the core KPI tracking and issue resolution supported by APM. I will reference APM tooling in the context of the various categories considered. In order to keep the size manageable, I will split the tooling consideration into two posts: introduction & synthetic testing (this one); and RUM (including mobile) [Tooling Part 2].

Let’s start with a summary of available tool types (split into two parts), and then a structured FEO process. I am assuming an operations- rather than developer-centric approach. Certainly, the most robust approach to ensuring client side performance efficiency is to bake it in from inception, using established ‘Performance by Design’ principles and cutting edge techniques. However, as “I wouldn’t have started from here” is not exactly a productive recommendation in most cases, let’s set the scene for approaches to understanding and optimising the performance of existing web applications.

So, tooling. Any insights gained will start with the tools used. The choice will depend upon the technical characteristics of the target (e.g. ‘traditional’ HTTP Website, Single Page Application, WebApp, Native Mobile App), and the primary objective of the test phase [the spectrum of (ongoing) Monitoring through to (point) Analysis].

Note: I will use examples drawn from many tools to illustrate particular points. These do not necessarily represent overall endorsement of the specific tools. Any decision should be made given a broad consideration of your individual needs and circumstances.

The first hurdle is gaining appropriate visibility. However, it must be noted that any tool will produce data; the key is effective interpretation of the results. This is largely a function of knowledge and control of the test conditions.

A good place to start in tool selection is to stand back from the data and understand the primary design goal of the particular class of tool. As examples, consider two tools, both widely used and superficially relevant, neither of which is appropriate for FEO work.

The first is Google Analytics. This powerful and mass market product will certainly generate some performance (page response) data. However, the tool is primarily designed for behavioural web analytics. The information that it provides can be extremely useful for defining analysis targets, both in terms of key transaction flows and specific cases, eg top ranked SEO destination pages with high bounce rates. It is of limited use for FEO analysis for a number of detailed reasons, but mainly because the reported performance figures are averaged from a tiny sample of the total traffic, and granular component response data is absent.

The second is Sauce Labs. This is more of a niche Vendor than GA, but it is certainly a fine product of its type. Sauce Labs offers comparative cross browser and device testing, both emulated and real-device. However, all testing originates in the US, introducing high and unpredictable latency into the testing. This tooling is excellent for functional testing, which is what it is designed to do. Different choices are required for effective FEO support.

So, what are the relevant categories of front end test tooling? The following does not seek to provide a blow-by-blow comparison of the multiplicity of competitors in each category – and in any case, the best choice for you will be determined by your own specific circumstances. Rather, it is a high level category guide. As a general rule of thumb, examples of each category will ideally be used to provide a broad insight into end user performance status and Front End Optimisation. Modern APM tools increasingly tick many of these boxes, although some of the more arcane (but useful) details are yet to appear.

As we will see when considering process, FEO practice in Operations essentially consists of two aspects. One is understanding the outturn performance to external end points (usually end users). This is achieved through monitoring, that is, obtaining an objective understanding of transaction, page, or page component response from replicate tests in known conditions, or of site visitors over time.

Monitoring provides information relative to patterns of response of the target site or application, both absolute and relative to key competitors or other comparators.

The other aspect is Analysis of the various components delivered to the end user device.  These components fall into three categories: static, dynamic, or logic (JavaScript code). Data for detailed analysis may be obtained as a by-product of monitoring, or from single or multiple point ‘snapshot’ tests. Component analysis will be covered in a subsequent post.

Tools for monitoring of external performance fall into two distinct types: active or passive.

Active (also called Synthetic) monitoring involves replicate testing from known external locations. The data captured is essentially based on the network interactions observed between the test node and the target site. Key objectives include:

  1. Understanding the availability of the target site
  2. Understanding site response/patterns in consistent test conditions – for example to determine long term trends, the effect of visitor traffic load, performance in low traffic periods, or objective comparison with competitor (or other comparator) sites
  3. Understanding response/patterns of individual page components. These can be variations in the response of the various elements of the object delivery chain – DNS resolution, initial connection, first byte (ie the dwell time between the connection handshake and the commencement of data transfer over the connection – a measure of infrastructure latency), and content delivery time. Alternatively, the objective may be to understand the variation in total response time of a specific element, for example 3rd party content (useful for Service Level Agreement management). A sketch of extracting these component timings in the browser follows below.
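As referenced in point 3, the same component splits can be read in the browser from the W3C Navigation Timing and Resource Timing APIs (synthetic tools derive the equivalent breakdown from their own network capture). A minimal sketch:

```typescript
// Sketch: read the component splits above from the W3C Navigation Timing API
// in the browser (synthetic tools report the equivalent breakdown from their
// own network capture). Field names follow Navigation Timing Level 2.
const nav = performance.getEntriesByType('navigation')[0] as PerformanceNavigationTiming;

console.table({
  dnsMs: nav.domainLookupEnd - nav.domainLookupStart,
  connectMs: nav.connectEnd - nav.connectStart, // includes TLS where applicable
  tlsMs: nav.secureConnectionStart > 0 ? nav.connectEnd - nav.secureConnectionStart : 0,
  firstByteMs: nav.responseStart - nav.requestStart, // 'dwell time' / infrastructure latency
  contentMs: nav.responseEnd - nav.responseStart,    // HTML delivery
});

// The same fields exist per object via the Resource Timing API (subject to
// Timing-Allow-Origin headers on third party content) - eg the five slowest objects:
const slowest = performance
  .getEntriesByType('resource')
  .sort((a, b) => b.duration - a.duration)
  .slice(0, 5)
  .map((r) => ({ name: r.name, durationMs: Math.round(r.duration) }));
console.table(slowest);
```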

Increasingly, modern APM tools offer a synthetic monitoring option. These tend to be useful in the context of the APM – ie holistic, ongoing performance understanding, but more limited in terms of control of test conditions and specific granular aspects of FEO point analysis such as Single Point Of Failure (SPOF) testing of third party content.

In brief, key aspects of such tooling for FEO analysis are:

  • Range of external locations – geography and type
    • eg Tier 1 ISP/LINX test locations; end user locations; private peer (ie specific known test source)
    • PC and mobile (the latter increasingly important)
  • Control of connection conditions – hardwired vs wireless; connection bandwidth
  • Ease & sophistication of transaction scripting – introducing cookies, filtering content, coping with dynamic content (popups etc)
  • Control of recorded page load end point

As a rule of thumb, the more control the better. However, a good compromise position is to take whatever is on offer from the APM Vendor – provided you are clear as to exactly what is being captured – and supplement this with a ‘full fat’ tool that is more analysis-centric, WebPageTest being a popular, open source choice – though beware variable test node environments if using the public network.
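By way of illustration, WebPageTest exposes an HTTP API for scripted test submission. The sketch below reflects the public endpoints at the time of writing, with placeholder API key, location and URL; check the current WebPageTest documentation (or use the official npm wrapper) before relying on the parameter names.

```typescript
// Sketch: submit a test to a WebPageTest instance over its HTTP API and poll
// for the JSON result. Endpoint/parameter names reflect the public API at the
// time of writing; API key, location string and URL are placeholders.
const WPT_SERVER = 'https://www.webpagetest.org';
const API_KEY = 'YOUR_API_KEY';

async function runWptTest(url: string): Promise<void> {
  const submitRes = await fetch(
    `${WPT_SERVER}/runtest.php?url=${encodeURIComponent(url)}&k=${API_KEY}&f=json&runs=3&location=ec2-eu-west-1:Chrome.Cable`
  );
  const { data } = await submitRes.json();
  console.log(`Submitted test ${data.testId}`);

  for (;;) {
    const result = await (await fetch(`${WPT_SERVER}/jsonResult.php?test=${data.testId}`)).json();
    if (result.statusCode === 200) {
      console.log('Median first-view load time (ms):', result.data.median.firstView.loadTime);
      return;
    }
    await new Promise((resolve) => setTimeout(resolve, 30_000)); // still running; wait and re-poll
  }
}

runWptTest('https://www.example.com/').catch(console.error);
```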


Synthetic testing – custom end user peer clusters – note the flexibility in terms of geography and connection speed (dynaTrace Synthetic ‘Last Mile’ PC testing)

A final word on page load end points. ‘Traditional’ synthetic tools (such as Gomez/dynaTrace synthetic in the above example) relied on the page onload navigation marker. It really is essential to define an end point more closely based on end user experience – ie browser fill time. With older tools this needs to be done by introducing a flag to the page. This can either be existing content such as an image appearing at the base of the page (at a given screen resolution), or by introducing such content at the appropriate point. This marker can then be recorded by modification of the test script.

Note that, given the dynamic nature of many sites, attempting to time to a particular visual component can be a short lived gambit. Introducing your own marker, assuming that you have access to the code, is a more robust intervention.
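A minimal sketch of the ‘introduce your own marker’ approach follows; the selector and image path are hypothetical. The flag image should be tiny and uniquely named, so that it is easy to identify (and time to) in the synthetic tool’s waterfall.

```typescript
// Sketch: append a tiny, uniquely named flag image immediately after the last
// above-the-fold element, so a synthetic script can time to the request for
// that object. The selector and image path are hypothetical.
const atfAnchor = document.querySelector('#hero'); // last element visible at the target resolution
if (atfAnchor) {
  const flag = new Image(1, 1);
  flag.alt = '';
  flag.src = '/static/atf-flag.gif?cb=' + Date.now(); // cache-buster so the request always appears in the waterfall
  atfAnchor.insertAdjacentElement('afterend', flag);
}
```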

Some modern tooling (eg AppDynamics APM) has introduced this as a standard feature. It is likely that competitors will follow suit. Use of the onload marker will produce results that bear no meaningful relationship to end user experience, particularly on sites with high affiliate content loads.

Modifications of standard testing to meet the requirements/manage misleading results in specific cases eg server push, Single Page Applications, will be covered in a subsequent post.

APM for Enterprise: How Does It Scale?

This is an extract from my recent post on APMDigest. Please click here to read it in its entirety.


 

It is easy to feel that so called “second generation” Application Performance Management (APM) tooling rules the world.

And for good reason, many would argue – certainly the positive disruptive effects of support for highly distributed / Service Orientated architectures, and the requirements of many fast moving businesses to support a plethora of different technologies are a powerful dynamic. That leaves aside the undoubted advantages of comprehensive traffic screening (as opposed to “hard” sampling), ease of installation and commissioning (relative in some cases), user accessibility, flexible reporting and tighter productive association between IT and business – in short, empowering the DevOps and PerfOps revolution.

So, modern APM is certainly well attuned to the requirements of current business. What’s not to like?

Could these technologies have an Achilles heel? Certainly, they are generally strong on lists of customer logos, but tight lipped when it comes to detailed high volume case studies.

Hundreds or thousands of JVMs and moderately high transaction volumes are all very well (and well attested), but how do these technologies stack up for the high end enterprise? What other options might exist?

It could be argued that an organization with tens of thousands of JVMs and millions of metrics has a fundamentally different issue than those closer to the base of the pyramid. Certainly these organizations are fewer in number, but that is scant comfort for those with the responsibility of managing their application delivery. Whether in banking/financial trading, FMCG or elsewhere, the issue of effectively analyzing daily transaction flows at high scale is real. The situation is exacerbated at peak – one large UK gaming company generates 20-30,000 events per second during a normal daily peak. During the popular Grand National race meeting, traffic increases 5-10 times – creating the need to transfer several terabytes a day into an APM data store.

The question is: which if any of the APM tools can even come close to these sorts of volumes?

It is certainly possible to instrument these organizations with second generation APM – but what snares lie in wait for the unwary, and what compromises will have to be made?


Read the rest of the post by clicking here.

 

Alerting Survival Strategies

This is an excerpt from my recent post on APMDigest. To read it in its entirety, click here.

In considering alerting, the core issue is not whether a given tool will generate alerts, as anything sensible certainly will. Rather, the central problem is what could be termed the actionability of the alerts generated. Failure to flag issues related to poor performance is a clear no-no, but unfortunately over-alerting has the same effect, as such alerts will rapidly be ignored.

Effective alert definition hinges on the determination of “normal” performance. Simplistically, this can be understood by testing across a business cycle (ideally, a minimum of 3-4 weeks). That is fine providing performance is reasonably stable. However, that is often not the case, particularly for applications experiencing large fluctuations in demand at different times of the day, week or year.

In such cases (which are extremely common), the difficulty becomes “at which point of the demand cycle should I base my alert threshold?” Too low, and your system is simply telling you that it’s lunchtime (or the weekend, or whenever greatest demand occurs). Too high, and you will miss issues occurring during periods of lower demand.

There are several approaches to this difficulty, of varying degrees of elegance…
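One simple illustrative approach (a sketch of my own, not drawn from the full article) is to baseline by hour-of-week, so that the threshold follows the demand cycle. The median + k * MAD rule and the default multiplier are assumptions.

```typescript
// Sketch (not from the original article): derive an hour-of-week baseline so
// the alert threshold follows the demand cycle. The median + k * MAD rule and
// the default multiplier are assumptions; Sample is a hypothetical record shape.
interface Sample { timestamp: Date; responseMs: number; }

function median(values: number[]): number {
  const s = [...values].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Build one threshold per hour-of-week (0..167) from 3-4 weeks of history.
function buildThresholds(history: Sample[], k = 3): Map<number, number> {
  const groups = new Map<number, number[]>();
  for (const s of history) {
    const hourOfWeek = s.timestamp.getUTCDay() * 24 + s.timestamp.getUTCHours();
    const bucket = groups.get(hourOfWeek) ?? [];
    bucket.push(s.responseMs);
    groups.set(hourOfWeek, bucket);
  }

  const thresholds = new Map<number, number>();
  for (const [hour, values] of groups) {
    const med = median(values);
    const mad = median(values.map((v) => Math.abs(v - med)));
    thresholds.set(hour, med + k * mad); // robust to outliers in the baseline window
  }
  return thresholds;
}
```

An incoming measurement is then compared against the threshold for its own hour-of-week, rather than against a single static value.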

Read the full article on APMDigest