• Philip Tellis

•                           .com
• @bluesmoon
• geek paranoid speedfreak

    Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   1
I’m a Web Speedfreak

We measure real user website performance

This talk is about the Statistics we learned while building it

The Statistics of Web Performance Analysis

            Philip Tellis /

             Boston #WebPerf Meetup / 2012-08-14

Accurately measure page performance∗

Be unintrusive

     If you try to measure something accurately, you will change something related
                                   – Heisenberg’s uncertainty principle

And one number to rule them all

What do we measure?

    • Network Throughput
    • Network Latency
    • User perceived page load time

We measure real user data

Which is noisy

                        Statistics - 1


   I am not a statistician

1-1  Random Sampling


                        All possible users of your system


                    Representative subset of the population

Bad sample

                                   Sometimes it’s not

How to randomize?

      • Pick 10% of users at random and always test them


      • For each user, decide at random if they should be tested

Select 10% of users - I

       if($sessionid % 10 === 0) {
          // instrument code for measurement
       }

     • Once a user enters the measurement bucket, they stay
       there until they log out
     • Fixed set of users, so tests may be more consistent
     • Error in the sample results in positive feedback
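
The sticky bucket described above can be sketched in Python. This is a minimal sketch under assumptions: session ids are treated as opaque strings and hashed with SHA-256 before taking the modulus (the slide's `$sessionid % 10` presumes a numeric id), and the helper name `in_measurement_bucket` is hypothetical.

```python
import hashlib


def in_measurement_bucket(session_id: str, percent: int = 10) -> bool:
    """Return True if this session falls in the fixed measurement bucket.

    Hashing first spreads arbitrary ids evenly over 100 slots, and the
    same id always maps to the same slot, so the decision is sticky.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < percent


# Hypothetical session ids, for illustration only.
sessions = [f"session-{i}" for i in range(10_000)]
selected = [s for s in sessions if in_measurement_bucket(s)]
print(f"{len(selected) / len(sessions):.1%} of sessions selected")
```

Because the choice depends only on the id, any bias in who lands in the bucket persists for the whole session, which is the positive feedback the slide warns about.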

Select 10% of users - II

       if(rand() < 0.1 * getrandmax()) {
          // instrument code for measurement
       }

     • For every request, a user has a 10% chance of being tested
     • Gets rid of positive feedback errors, but sample size !=
       10% of population
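
A per-request decision can be sketched in Python as follows. The function name `should_measure` and the fixed seed are assumptions for illustration; the seed only makes the sketch repeatable and would not be used in production.

```python
import random

rng = random.Random(42)  # seeded only so the sketch is repeatable


def should_measure(rate: float = 0.1) -> bool:
    """Decide independently for each request whether to instrument it."""
    return rng.random() < rate


requests = 10_000
measured = sum(should_measure() for _ in range(requests))
print(f"measured {measured} of {requests} requests ({measured / requests:.2%})")
```

The measured share hovers around 10% but is rarely exactly 10%, which is the trade-off named in the second bullet.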

How big a sample is representative?

                                     Select n such that
                        $1.96 \frac{\sigma}{\sqrt{n}} \le 5\% \cdot \mu$
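
Solving that inequality for n, with σ the standard deviation and µ the mean, gives n ≥ (1.96 σ / (0.05 µ))². A minimal Python sketch, assuming you already have rough estimates of the mean and standard deviation from earlier data; the function name and the example numbers are hypothetical.

```python
import math


def required_sample_size(mean: float, stdev: float,
                         rel_error: float = 0.05, z: float = 1.96) -> int:
    """Smallest n for which z * stdev / sqrt(n) <= rel_error * mean."""
    return math.ceil((z * stdev / (rel_error * mean)) ** 2)


# Hypothetical estimates: mean load time 2.0 s, standard deviation 1.5 s.
n = required_sample_size(mean=2.0, stdev=1.5)
print(n)
```

Noisier data (a larger standard deviation relative to the mean) pushes the required sample size up quadratically.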

1-2     Margin of Error

Standard Deviation

     • Standard deviation tells you the spread of the curve
     • The narrower the curve, the more confident you can be

MoE at 95% confidence

                        $\pm 1.96 \frac{\sigma}{\sqrt{n}}$
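
As a worked illustration, the margin of error is 1.96 times the standard deviation divided by the square root of the sample size. The Python sketch below assumes the sample standard deviation (`statistics.stdev`) is an acceptable estimate of σ; the load times are made-up values.

```python
import math
import statistics


def margin_of_error(samples, z=1.96):
    """Half-width of the 95% confidence interval around the sample mean."""
    return z * statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical page load times in seconds.
load_times = [1.8, 2.1, 2.4, 1.9, 2.2, 2.0, 2.6, 1.7]
mean = statistics.mean(load_times)
moe = margin_of_error(load_times)
print(f"{mean:.2f} s ± {moe:.2f} s")
```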

MoE & Sample size

   Margin of error is inversely proportional to the square root of the
   sample size: to halve the MoE, quadruple the sample

1-3   Central Tendency

One number

    • Mean (Arithmetic)
       • Good for symmetric curves
       • Affected by outliers

                Mean(10, 11, 12, 11, 109) = 30.6

One number

    • Median
       • Middle value measures central tendency well
       • Not trivial to pull out of a DB

              Median(10, 11, 12, 11, 109) = 11

One number

    • Mode
       • Not often used
       • Multi-modal distributions suggest problems

                Mode(10, 11, 12, 11, 109) = 11
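
The three measures can be compared on the slides' sample with Python's standard `statistics` module. This is only a quick sketch; the variable name `timings` is an assumption.

```python
import statistics

timings = [10, 11, 12, 11, 109]  # the sample used on the slides

mean = statistics.mean(timings)      # arithmetic mean, pulled up by the 109 outlier
median = statistics.median(timings)  # middle value of the sorted data
mode = statistics.mode(timings)      # most frequent value

print(mean, median, mode)
```

The single outlier drags the mean well above every typical value, while the median and mode stay at 11.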

Other numbers

    • A percentile point in the distribution: 95th, 98.5th or 99th
        • Used to find out the worst user experience
        • Makes more sense if you filter data first

                P95th (10, 11, 12, 11, 109) = 12
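
Percentile definitions differ, and on a five-value sample they can disagree sharply. The Python sketch below shows two common variants. Both function names are hypothetical. Truncating the index reproduces the slide's value, but that is an inference; the talk does not say which method was used.

```python
import math


def percentile_nearest_rank(values, p):
    """Nearest-rank percentile: round the rank p/100 * n up to a whole position."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def percentile_truncated_index(values, p):
    """Variant that truncates the fractional index p/100 * (n - 1)."""
    ordered = sorted(values)
    return ordered[int(p / 100 * (len(ordered) - 1))]


timings = [10, 11, 12, 11, 109]
print(percentile_nearest_rank(timings, 95))     # 109
print(percentile_truncated_index(timings, 95))  # 12
```

With real traffic the sample is large enough that the two variants land on nearly the same value.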

Other means

    • Geometric mean
        • Good if your data is exponential in nature
          (with the tail on the right)

           GMean(10, 11, 12, 11, 109) = 17.37

Wait... how did I get that?

     $\sqrt[N]{\prod_{i=1}^{N} x_i}$ (could lead to overflow)

     $e^{\frac{1}{N} \sum_{i=1}^{N} \log_e(x_i)}$ (computationally simpler)
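
Both forms can be sketched in Python to show that they agree. The function name `geometric_mean` is an assumption; `math.prod` requires Python 3.8 or later.

```python
import math


def geometric_mean(values):
    """Geometric mean as exp(mean(log(x))), which avoids building a huge product."""
    return math.exp(sum(math.log(x) for x in values) / len(values))


timings = [10, 11, 12, 11, 109]  # the sample used on the slides

# Direct N-th root of the product: fine for five values, but the product of
# many large timings can exceed what a float can represent.
naive = math.prod(timings) ** (1 / len(timings))

print(round(geometric_mean(timings), 2), round(naive, 2))
```

The log form also maps naturally onto a database aggregate, since it only needs a sum and a count.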

Other means

    And there is also the Harmonic mean, but forget about that

...though consequently

   We have other margins of error
     • Geometric margin of error
        • Uses geometric standard deviation
     • Median margin of error
        • Uses ranges of actual values from data set
     • Stick to the arithmetic MoE
        – simpler to calculate, simpler to read and not incorrect

                        Statistics - 2

2-1         Distributions

Let’s look at some real charts

Sparse Distribution

       Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   41
Log-normal distribution

       Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   42
Bimodal distribution

       Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   43
What does all of this mean?

Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   44


     • A sparse distribution suggests that you don’t have enough
       data points
     • A log-normal distribution is typical of web performance data
     • A bimodal distribution suggests two (or more) underlying
       distributions mixed together
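
One way to act on the log-normal case is to summarise timings on a log scale. The sketch below is illustrative only: the function name `log_summary` and the sample values are made up, and it relies on the general property that the geometric mean of log-normal data sits close to the median.

```python
import math
import statistics

def log_summary(load_times_ms):
    """Summarise timings on a log scale (hypothetical helper, not from the talk)."""
    positive = [t for t in load_times_ms if t > 0]   # log() needs positive values
    logs = [math.log(t) for t in positive]
    mu = statistics.mean(logs)        # mean of the logged values
    sigma = statistics.stdev(logs)    # spread of the logged values
    return {
        "geometric_mean": math.exp(mu),    # back-transformed centre
        "geometric_sd": math.exp(sigma),   # multiplicative spread factor
        "median": statistics.median(positive),
    }

# Made-up timings in milliseconds, evenly spaced on a log scale.
sample = [100, 200, 400, 800, 1600]
summary = log_summary(sample)
print(summary)
```

If the geometric mean and the median are far apart, that may be a hint that the data is not log-normal after all.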

       Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   45
In practice, a bimodal distribution is not uncommon

Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   46
Hint: Does your site do a lot of back-end caching?
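
If caching is the cause, splitting the measurements by cache status should give two groups that each have a single peak. A minimal sketch, assuming each beacon carries a cache-status label (the `"hit"`/`"miss"` field and all numbers here are hypothetical):

```python
import statistics
from collections import defaultdict

def medians_by_group(beacons):
    """Group (label, load_time_ms) pairs by label and return each group's median."""
    groups = defaultdict(list)
    for cache_status, load_time_ms in beacons:
        groups[cache_status].append(load_time_ms)
    return {status: statistics.median(times) for status, times in groups.items()}

# Made-up beacons: cached responses are fast, uncached ones are slow.
beacons = [
    ("hit", 180), ("hit", 210), ("hit", 240),
    ("miss", 1400), ("miss", 1650), ("miss", 1900),
]
result = medians_by_group(beacons)
print(result)
```

Reporting one summary value per group is usually more meaningful than a single value that falls in the gap between the two peaks.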

Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   47
2-2 Filtering

Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   48


     • Out of range data points
     • Nothing you can fix here
     • There’s even a book about

       Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   49



DNS problems can cause outliers

     • An ISP typically has 2 or 3 DNS servers
     • 30-second timeout if the first one fails
     • ... 30-second increase in page load time
     • Maybe measure both and fix what you can

       Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   50
Band-pass filtering

     • Strip everything outside a reasonable range
         • Bandwidth range: 4 kbps to 4 Gbps
         • Page load time: 50 ms to 120 s
     • You may need to revisit the ranges regularly
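
A band-pass filter of this kind can be sketched in a few lines. The ranges come from the slide; the function name, the inclusive bounds and the sample values are assumptions made for illustration.

```python
def band_pass(values, low, high):
    """Keep only values inside [low, high]; inclusive bounds are an assumption."""
    return [v for v in values if low <= v <= high]

# Ranges from the slide, expressed in base units.
PAGE_LOAD_MS = (50, 120_000)             # 50 ms to 120 s
BANDWIDTH_BPS = (4_000, 4_000_000_000)   # 4 kbps to 4 Gbps

# Made-up page load times in milliseconds.
load_times_ms = [12, 50, 900, 2400, 120_000, 450_000]
kept = band_pass(load_times_ms, *PAGE_LOAD_MS)
print(kept)
```

Keeping the ranges as named constants makes them easy to adjust when they need revisiting.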

       Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   51

IQR filtering

                  Here, we derive the range from the data
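
As a rough sketch of deriving the range from the data, the example below uses the conventional fences at 1.5 times the interquartile range beyond the first and third quartiles. The multiplier of 1.5 is the usual convention and an assumption here, since the slide does not state one; the function name and sample values are also made up.

```python
import statistics

def iqr_filter(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is the usual convention."""
    q1, _median, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Made-up page load times in milliseconds, with one extreme value.
load_times_ms = [100, 200, 300, 400, 500, 600, 700, 30000]
kept = iqr_filter(load_times_ms)
print(kept)
```

Unlike fixed band-pass limits, these fences move with the data, so they do not need to be revisited by hand. For log-normal timings it may be more appropriate to apply the same filter to the logarithm of each value.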

       Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   52
Further Reading

      Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   53

    • Choose a reasonable sample size and sampling factor
    • Tune sample size for minimal margin of error
    • Decide based on your data whether to use mode, median
      or one of the means
    • Figure out whether your data is Normal, Log-Normal or
      something else
    • Filter out anomalous outliers
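
To make the margin-of-error point concrete, here is a minimal sketch using the standard normal-approximation formula for the margin of error of a mean, z × s / √n, with z = 1.96 for a 95% confidence level. The function names and numbers are illustrative assumptions, not taken from the talk.

```python
import math
import statistics

Z_95 = 1.96  # z-score for a 95% confidence level

def margin_of_error(sample, z=Z_95):
    """Margin of error of the sample mean under a normal approximation."""
    return z * statistics.stdev(sample) / math.sqrt(len(sample))

def required_sample_size(stdev, target_moe, z=Z_95):
    """Smallest n for which z * stdev / sqrt(n) is at most target_moe."""
    return math.ceil((z * stdev / target_moe) ** 2)

# Made-up example: standard deviation of 1000 ms, target margin of 50 ms.
n_needed = required_sample_size(stdev=1000, target_moe=50)
print(n_needed)
```

Because the margin shrinks with the square root of the sample size, halving it takes roughly four times as many samples.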

      Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   54
• Philip Tellis

•                           .com
• @bluesmoon
• geek paranoid speedfreak

    Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   55


Thank you

Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   56
Photo credits

     • by leoffreitas
     • by cobalt123
     • by Lisa

       Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   57
List of figures


       Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   58

Frontend Performance: Beginner to Expert to Crazy Person (San Diego Web Perf ...
Frontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: Beginner to Expert to Crazy PersonFrontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: Beginner to Expert to Crazy PersonFrontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: Beginner to Expert to Crazy PersonFrontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: Beginner to Expert to Crazy Person
mmm... beacons
mmm... beaconsmmm... beacons
mmm... beacons
RUM Distillation 101 -- Part I
RUM Distillation 101 -- Part IRUM Distillation 101 -- Part I
RUM Distillation 101 -- Part I
Improving 3rd Party Script Performance With IFrames
Improving 3rd Party Script Performance With IFramesImproving 3rd Party Script Performance With IFrames
Improving 3rd Party Script Performance With IFrames
Extending Boomerang
Extending BoomerangExtending Boomerang
Extending Boomerang
Abusing JavaScript to measure Web Performance, or, "how does boomerang work?"
Abusing JavaScript to measure Web Performance, or, "how does boomerang work?"Abusing JavaScript to measure Web Performance, or, "how does boomerang work?"
Abusing JavaScript to measure Web Performance, or, "how does boomerang work?"
Abusing JavaScript to Measure Web Performance
Abusing JavaScript to Measure Web PerformanceAbusing JavaScript to Measure Web Performance
Abusing JavaScript to Measure Web Performance
Rum for Breakfast
Rum for BreakfastRum for Breakfast
Rum for Breakfast
Analysing network characteristics with JavaScript
Analysing network characteristics with JavaScriptAnalysing network characteristics with JavaScript
Analysing network characteristics with JavaScript
A Node.JS bag of goodies for analyzing Web Traffic
A Node.JS bag of goodies for analyzing Web TrafficA Node.JS bag of goodies for analyzing Web Traffic
A Node.JS bag of goodies for analyzing Web Traffic
Input sanitization
Input sanitizationInput sanitization
Input sanitization
Messing with JavaScript and the DOM to measure network characteristics
Messing with JavaScript and the DOM to measure network characteristicsMessing with JavaScript and the DOM to measure network characteristics
Messing with JavaScript and the DOM to measure network characteristics

The Statistics of Web Performance Analysis

  • 1. • Philip Tellis • .com • • @bluesmoon • geek paranoid speedfreak • Boston #WebPerf Meetup / 2012-08-14 The Statistics of Web Performance Analysis 1
  • 2. I’m a Web Speedfreak
  • 3. We measure real user website performance
  • 4. This talk is about the Statistics we learned while building it
  • 5. The Statistics of Web Performance Analysis Philip Tellis / Boston #WebPerf Meetup / 2012-08-14
  • 6. 0 Numbers
  • 7. Accurately measure page performance∗
  • 8. Be unintrusive: If you try to measure something accurately, you will change something related – Heisenberg’s uncertainty principle
  • 9. And one number to rule them all
  • 10. What do we measure? • Network Throughput • Network Latency • User perceived page load time
  • 11. We measure real user data
  • 12. Which is noisy
  • 13. 1 Statistics - 1
  • 14. Disclaimer: I am not a statistician
  • 15. 1-1 Random Sampling
  • 16. Population: All possible users of your system
  • 17. Sample: Representative subset of the population
  • 18. Bad sample: Sometimes it’s not
  • 19. How to randomize?
  • 20. How to randomize? • Pick 10% of users at random and always test them OR • For each user, decide at random if they should be tested
  • 21. Select 10% of users - I: if($sessionid % 10 === 0) { // instrument code for measurement } • Once a user enters the measurement bucket, they stay there until they log out • Fixed set of users, so tests may be more consistent • Error in the sample results in positive feedback
  • 22. Select 10% of users - II: if(rand() < 0.1 * getrandmax()) { // instrument code for measurement } • For every request, a user has a 10% chance of being tested • Gets rid of positive feedback errors, but sample size != 10% of population
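The two strategies on slides 21 and 22 can be sketched side by side. This is a minimal Python illustration, not the speaker's code: the function names are hypothetical, and hashing the session id with `zlib.crc32` is an assumption made so that non-numeric session ids also work (the slide takes a numeric `$sessionid` modulo 10 directly).

```python
import random
import zlib

SAMPLE_RATE = 0.10  # fraction of traffic to instrument (the slides use 10%)

def in_fixed_bucket(session_id: str, buckets: int = 10) -> bool:
    """Strategy I: a session is either always measured or never measured."""
    # crc32 is a stable hash, so the same session always lands in the same bucket.
    return zlib.crc32(session_id.encode("utf-8")) % buckets == 0

def sampled_this_request(rng: random.Random, rate: float = SAMPLE_RATE) -> bool:
    """Strategy II: every request independently has a `rate` chance of being measured."""
    return rng.random() < rate
```

With strategy I the decision is a pure function of the session, which is what keeps the measured set fixed. With strategy II the number of measured requests only averages out to 10%, which matches the slide's caveat about the sample size.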
  • 23. How big a sample is representative? Select n such that $1.96\,\frac{\sigma}{\sqrt{n}} \le 5\%\,\mu$
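Rearranging the inequality on slide 23 gives $n \ge \left(\frac{1.96\,\sigma}{0.05\,\mu}\right)^2$. A small Python sketch of that rearrangement follows; the function name and the example figures in the usage note are illustrative assumptions, and in practice the mean and standard deviation would have to be estimated from an initial batch of measurements.

```python
import math

def required_sample_size(mean: float, stddev: float,
                         relative_error: float = 0.05, z: float = 1.96) -> int:
    """Smallest n with z * stddev / sqrt(n) <= relative_error * mean."""
    if mean <= 0 or stddev < 0:
        raise ValueError("mean must be positive and stddev non-negative")
    # Solve the inequality for n and round up to a whole number of samples.
    return max(1, math.ceil((z * stddev / (relative_error * mean)) ** 2))
```

For example, a hypothetical page with a 2000 ms mean load time and a 1500 ms standard deviation would need `required_sample_size(2000, 1500)` samples, which works out to 865.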
  • 24. 1-2 Margin of Error
  • 25. Standard Deviation • Standard deviation tells you the spread of the curve • The narrower the curve, the more confident you can be
  • 26. MoE at 95% confidence: $\pm 1.96\,\frac{\sigma}{\sqrt{n}}$
  • 27. MoE & Sample size: The margin of error is inversely proportional to the square root of the sample size, so quadrupling the sample size halves the margin of error
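The margin of error at 95% confidence, and how it shrinks with sample size, can be checked with a few lines of Python. This is a sketch under the assumption that the standard deviation is already known or estimated; the function name is hypothetical.

```python
import math

def margin_of_error(stddev: float, n: int, z: float = 1.96) -> float:
    """Half-width of the confidence interval around the mean: z * stddev / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return z * stddev / math.sqrt(n)

# Quadrupling the sample size halves the margin of error.
moe_400 = margin_of_error(1500, 400)    # 147.0
moe_1600 = margin_of_error(1500, 1600)  # 73.5
```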
  • 28. 1-3 Central Tendency
  • 30. One number • Mean (Arithmetic) • Good for symmetric curves • Affected by outliers. Example: Mean(10, 11, 12, 11, 109) = 30.6
  • 31. One number • Median • Middle value measures central tendency well • Not trivial to pull out of a DB. Example: Median(10, 11, 12, 11, 109) = 11
  • 32. One number • Mode • Not often used • Multi-modal distributions suggest problems. Example: Mode(10, 11, 12, 11, 109) = 11
  • 33. Other numbers • A percentile point in the distribution: 95th, 98.5th or 99th • Used to find out the worst user experience • Makes more sense if you filter data first. Example: P95(10, 11, 12, 11, 109) = 12
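The example values on slides 30 to 33 can be reproduced with Python's standard `statistics` module. The percentile helper is a hypothetical one: the slides do not say which percentile definition was used, so the rank here rounds down, an assumption chosen because it yields 12 for this data set. A nearest-rank definition that rounds up would pick the outlier, 109, instead.

```python
import statistics

load_times = [10, 11, 12, 11, 109]  # the example data set from the slides

def percentile_round_down(samples, p):
    """p-th percentile using rank floor(p/100 * n), clamped to [1, n]."""
    ordered = sorted(samples)
    rank = max(1, min(len(ordered), int(p / 100 * len(ordered))))
    return ordered[rank - 1]

mean = statistics.mean(load_times)           # 30.6, dragged up by the single outlier
median = statistics.median(load_times)       # 11
mode = statistics.mode(load_times)           # 11
p95 = percentile_round_down(load_times, 95)  # 12 with this rank definition
```

With only five data points the percentile depends heavily on the definition chosen, which is one reason percentiles are more useful on larger, filtered data sets.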
  • 34. Other means • Geometric mean • Good if your data is exponential in nature (with the tail on the right). Example: GMean(10, 11, 12, 11, 109) ≈ 17.37
  • 35. Wait... how did I get that? • $\sqrt[N]{\prod_{i=1}^{N} x_i}$: could lead to overflow • $e^{\frac{1}{N}\sum_{i=1}^{N} \log_e(x_i)}$: computationally simpler
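The log-based form of the geometric mean translates directly into code. A minimal Python sketch follows; the function name is an assumption, and Python 3.8+ also ships `statistics.geometric_mean`, which could be used instead.

```python
import math

def geometric_mean(samples):
    """exp of the mean of natural logs; avoids the huge intermediate product."""
    if not samples or any(x <= 0 for x in samples):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(x) for x in samples) / len(samples))
```

Summing logarithms keeps every intermediate value small, whereas multiplying thousands of millisecond timings together would quickly exceed the range of a fixed-width number type.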
  • 39. Other means: And there is also the Harmonic mean, but forget about that
  • 40. ...though consequently, we have other margins of error • Geometric margin of error: uses geometric standard deviation • Median margin of error: uses ranges of actual values from data set • Stick to the arithmetic MoE: simpler to calculate, simpler to read and not incorrect
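For comparison with the arithmetic MoE, one common way to get a margin of error around the geometric mean is to compute the usual interval on the logarithms and exponentiate the ends. The sketch below assumes that construction (the slides do not spell out a formula), and the function name is hypothetical. Note that `exp(statistics.stdev(logs))` is the geometric standard deviation.

```python
import math
import statistics

def geometric_interval(samples, z: float = 1.96):
    """(low, geometric mean, high) from a confidence interval computed on log values."""
    if len(samples) < 2 or any(x <= 0 for x in samples):
        raise ValueError("need at least two positive samples")
    logs = [math.log(x) for x in samples]
    log_mean = statistics.mean(logs)
    # Arithmetic margin of error, but measured in log space.
    log_moe = z * statistics.stdev(logs) / math.sqrt(len(logs))
    return math.exp(log_mean - log_moe), math.exp(log_mean), math.exp(log_mean + log_moe)
```

The resulting interval is multiplicative (the mean times or divided by a factor) rather than plus or minus a fixed amount, which is part of why it is harder to read than the arithmetic version.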
  • 42. 2 Statistics - 2
  • 43. 2-1 Distributions
  • 44. Let’s look at some real charts
  • 45. Sparse Distribution
  • 46. Log-normal distribution
  • 47. Bimodal distribution
  • 48. What does all of this mean?
  • 49. Distributions • Sparse distribution suggests that you don’t have enough data points • Log-normal distribution is typical • Bi-modal distribution suggests two (or more) distributions combined
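A rough way to look for a second mode without plotting is to bucket the samples on a logarithmic scale (load times tend to be log-normal) and count the local peaks. The Python sketch below is a crude heuristic under stated assumptions: the function names and the bucket width of four buckets per decade are arbitrary choices, and noisy real data would usually need smoothing before peak counting is meaningful.

```python
import math
from collections import Counter

def log_histogram(samples_ms, bins_per_decade: int = 4):
    """Count positive samples in log-spaced buckets; returns {bucket_index: count}."""
    counts = Counter()
    for x in samples_ms:
        if x > 0:
            counts[math.floor(math.log10(x) * bins_per_decade)] += 1
    return counts

def count_peaks(counts) -> int:
    """Number of buckets strictly taller than both neighbours (missing buckets count as 0)."""
    return sum(
        1
        for bucket, count in counts.items()
        if count > counts.get(bucket - 1, 0) and count > counts.get(bucket + 1, 0)
    )
```

Two peaks in a histogram of page load times would be consistent with two populations being mixed together, for example requests served from a cache and requests that miss it.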
  • 50. In practice, a bi-modal distribution is not uncommon
  • 51. Hint: Does your site do a lot of back-end caching?
  • 52. 2-2 Filtering
  • 53. Outliers • Out of range data points • Nothing you can fix here • There’s even a book about them
  • 57. DNS problems can cause outliers • 2 or 3 DNS servers for an ISP • 30 second timeout if first fails • ... 30 second increase in page load time • Maybe measure both and fix what you can
  • 58. Band-pass filtering
  • 59. Band-pass filtering • Strip everything outside a reasonable range • Bandwidth range: 4kbps - 4Gbps • Page load time: 50ms - 120s • You may need to revisit the ranges regularly
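A band-pass filter with the ranges from slide 59 is a one-line list comprehension. In this Python sketch the constant and function names are hypothetical, and treating the bounds as inclusive and "k" and "G" as decimal prefixes are assumptions.

```python
MIN_LOAD_MS, MAX_LOAD_MS = 50, 120_000         # page load time: 50ms - 120s
MIN_BW_BPS, MAX_BW_BPS = 4_000, 4_000_000_000  # bandwidth: 4kbps - 4Gbps

def band_pass(values, low, high):
    """Keep only the values inside the inclusive range [low, high]."""
    return [v for v in values if low <= v <= high]

# Load times in milliseconds; the 5 ms and 300 s readings are stripped.
clean = band_pass([5, 50, 800, 120_000, 300_000], MIN_LOAD_MS, MAX_LOAD_MS)
```

Keeping the limits in named constants makes them easy to adjust when the ranges need to be revisited.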
  • 60. IQR filtering
  • 61. IQR (interquartile range) filtering: Here, we derive the range from the data
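Deriving the range from the data can be sketched with `statistics.quantiles` (Python 3.8+). The slides do not give the fence width, so the multiplier of 1.5 times the interquartile range used here is an assumption borrowed from the conventional Tukey fences, and the function name is hypothetical.

```python
import statistics

def iqr_filter(samples, k: float = 1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR], with the range derived from the data."""
    if len(samples) < 2:
        return list(samples)
    q1, _, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in samples if low <= x <= high]
```

Unlike a band-pass filter, the limits adapt to each data set. The filter needs enough data points to work, though: on a very small sample a single outlier shifts the quartiles far enough to survive.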
  • 62. Further Reading
  • 63. Summary • Choose a reasonable sample size and sampling factor • Tune sample size for minimal margin of error • Decide based on your data whether to use mode, median or one of the means • Figure out whether your data is Normal, Log-Normal or something else • Filter out anomalous outliers
  • 64. Philip Tellis • .com • @bluesmoon • geek paranoid speedfreak
  • 65. Thank you
  • 66. Photo credits • by leoffreitas • by cobalt123 • by Lisa Brewster