If you're interested in measuring real user web performance, you'll find tools like boomerang or episodes quite handy. Some popular web frameworks even have modules that make it easy to add them to your site. However, what does one do once one has collected the data? How do you filter out the noise and get meaningful insights from the data?
In this talk, I'll go over the techniques we've picked up by analyzing millions of datapoints daily. I'll cover some simple rules to filter out invalid data, and the statistics to analyze and make sense of what's left. Do you use the mean, median or mode? What about the geometric mean and standard deviation? How confident are we in the results? And finally, why should we care?
This talk should help you gain useful insights from a histogram, or at the very least point you in the right direction for further analysis.
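The filter-then-summarize workflow the talk describes can be sketched in a few lines. This is an illustrative sketch, not the talk's actual rules: the cutoffs, sample values, and function names below are invented for the example (load times in milliseconds).

```python
import math
import statistics

def clean(samples, lo=0, hi=60000):
    # Drop obviously invalid load times: negative, zero, or absurdly
    # large values that usually indicate clock skew or a tab left in
    # the background. The 60s cutoff is an assumption for this sketch.
    return [t for t in samples if lo < t <= hi]

def summarize(samples):
    # Page load times are roughly log-normal, so the geometric mean is
    # usually a better "typical user" figure than the arithmetic mean,
    # which a handful of outliers drags upward.
    logs = [math.log(t) for t in samples]
    return {
        "median": statistics.median(samples),
        "arith_mean": statistics.fmean(samples),
        "geo_mean": math.exp(statistics.fmean(logs)),
    }

times = [850, 920, 1100, 1300, 2400, 59000, -5, 120000]
stats = summarize(clean(times))
```

Even on this toy sample, the geometric mean lands well below the arithmetic mean, which is the kind of gap the talk's histogram analysis is meant to surface.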
This document analyzes the annual report of Apple Inc. from 2004 to 2008. It summarizes the company's income statement, showing increasing sales, earnings, and net income each year. It also analyzes Apple's balance sheet, noting increasing current assets, total assets, and total equity. Overall, the trend percentages show strong growth across all financial metrics each year, indicating Apple has a solid financial position and is growing rapidly over this period.
This document discusses common size analysis, which allows companies to be compared across time and against competitors by expressing financial statement items as percentages of a base figure. There are three types of common size analysis: vertical common size income statements, horizontal common size income statements, and common size balance sheets. An example is provided where net income figures for two companies of different sizes are expressed as a percentage of sales revenue for better comparison. The document outlines some limitations of common size analysis and provides an example analysis of common size income statements and balance sheets.
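The net-income example above is simple arithmetic; a sketch of a vertical common-size calculation with hypothetical figures (these numbers are invented, not from the document):

```python
def common_size(income, base):
    # Express each income-statement line as a percentage of the base
    # figure (sales revenue, for a vertical common-size view).
    return {line: round(100.0 * v / base, 1) for line, v in income.items()}

# Hypothetical figures for two companies of very different sizes.
big   = common_size({"net_income": 50_000}, base=1_000_000)
small = common_size({"net_income": 8_000},  base=100_000)
```

The smaller company earns less in absolute terms but shows the higher net margin, which is exactly the comparison common-size analysis enables.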
Financial Analysis of ICICI Prudential Life Insurance
This document provides a financial analysis of ICICI Prudential Life Insurance. It includes an overview of the company and its products, sources of finance such as equity shares and debt, and comparative financial statements from 2007-2008 showing increases in reserves and current assets. Financial ratios are calculated, including the current ratio, profitability ratio, debt ratio and return ratio. Findings note high charges levied on customers and a lack of profit until 2008. Recommendations include paying closer attention to fund management and lowering charges for customers.
The document compares several major financial crises from 1929 to present day. It summarizes the key causes and impacts of each crisis, including the Great Depression, Black Monday, the European Exchange Rate Mechanism crisis, the Global Financial Crisis, the Greece crisis, Japan's debt crisis, and Brexit. It analyzes factors like leverage, liquidity issues, and policy failures that led to the crises. Economic indicators like GDP, unemployment, inflation, and stock market losses are compared across the different crises. Overall lessons on regulation, coordination, and stability of exchange rate mechanisms are discussed.
Bajaj Auto was founded in 1926 and initially manufactured sugar before diversifying into vehicle manufacturing in 1945. It is now India's largest two and three-wheeler manufacturer and the world's fourth largest. Bajaj Auto has experienced steady growth and released many new vehicle models over time. While its financial position is not as strong as competitor Hero Honda, with lower profit margins and negative working capital, Bajaj Auto remains an important player in India's large automobile industry and continues community service initiatives.
This document analyzes GM's balance sheet to highlight three important traits: size, liquidity, and solvency. It summarizes that GM is one of the largest companies in America based on total assets of $189 billion. It has adequate liquidity with a current ratio of 1.13, on par with other large companies. While GM's debt-to-equity ratio of 1.66 exceeds 1.0, it is comparable to many other S&P 500 firms.
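The two ratios the analysis leans on are simple quotients. A sketch with illustrative figures (chosen to reproduce the ratios quoted above; they are not GM's actual statement values):

```python
def liquidity_and_solvency(current_assets, current_liabilities,
                           total_debt, total_equity):
    # Current ratio gauges short-term liquidity; debt-to-equity
    # gauges solvency (leverage).
    return {
        "current_ratio": round(current_assets / current_liabilities, 2),
        "debt_to_equity": round(total_debt / total_equity, 2),
    }

# Illustrative figures only, scaled so the ratios match those cited.
r = liquidity_and_solvency(113.0, 100.0, 83.0, 50.0)
```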
The document discusses Kingfisher Airlines, an Indian airline established in 2003 that began operations in 2005. It provides key details about Kingfisher such as its headquarters, destinations served, and five-star rating. It then outlines Kingfisher's strengths as well as weaknesses, opportunities, and threats using a SWOT analysis framework. The presentation concludes by discussing problems Kingfisher faced such as heavy losses, strikes, and lack of management, and provides suggestions for how Kingfisher can continue to meet expectations of customers, suppliers, employees, and society.
This document analyzes Coca-Cola's financial statements and business strategies. It begins with an analysis of Coca-Cola's governance, including details about the CEO, board of directors, and executive compensation. It then discusses Porter's Five Forces analysis of the soda industry, finding rivalry to be high but threats of new entrants and substitutes to be medium. The document also analyzes Coca-Cola's income statements, balance sheets, profitability, and forecasts growth.
The document discusses various types of financial statement analysis including comparative statements, common size statements, trend analysis, and ratio analysis. It provides an example of a comparative balance sheet analysis for a company from 2006 to 2007. The analysis shows increases in fixed assets, long term liabilities, equity, and current assets. It also shows an increase in sales, gross profit, and net profit, indicating the overall financial position and profitability of the company improved from 2006 to 2007.
A financial analysis for Coca-Cola:
company profile, financial statement, liquidity ratio, current ratio, cash ratio, quick ratio, profitability, efficiency, short term activity, long term activity, solvency, DuPont analysis and historical enterprise value (HEV).
Done By Elie Obeid and Isabelle Khalil
This document provides an overview of project planning and estimation using function point analysis. It discusses defining a project's domain and scope, comparing projects and products, and the learning from projects. It also covers the software development lifecycle (SDLC) including planning, estimation models, and a case study on student registration using object-oriented analysis and design. Throughout are examples of applying function point analysis to estimate effort for a simple customer project.
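The function-point arithmetic behind such estimates can be sketched briefly. The complexity weights below are the standard average-complexity weights from function point analysis; the counts and the productivity figure are assumptions for illustration, not values from the document:

```python
def function_points(inputs, outputs, inquiries, files, interfaces,
                    complexity_adjustment=1.0):
    # Unadjusted function points using standard average weights,
    # optionally scaled by a value adjustment factor (0.65-1.35).
    weights = {"inputs": 4, "outputs": 5, "inquiries": 4,
               "files": 10, "interfaces": 7}
    ufp = (inputs * weights["inputs"] + outputs * weights["outputs"] +
           inquiries * weights["inquiries"] + files * weights["files"] +
           interfaces * weights["interfaces"])
    return ufp * complexity_adjustment

def estimate_effort(fp, productivity_fp_per_month=10):
    # Productivity (FP per person-month) is organization-specific;
    # 10 is an assumed figure for this sketch.
    return fp / productivity_fp_per_month

fp = function_points(inputs=5, outputs=4, inquiries=3, files=2, interfaces=1)
```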
This document outlines a project to map and optimize the end-to-end total account process for a client globally. The project aims to identify opportunities to align with global best practices, standardize processes, and reduce turnaround times by 30-50%. Key deliverables include developing level 1 and 2 process maps, identifying optimization opportunities, and helping to validate process flows for a lift and shift project. The document describes the current state process, which experiences high data mismatches and variations. Analysis identifies constraints like specialized roles and data issues. Recommendations include process documentation, error proofing, reducing handoffs, and using lean tools like 5S, visual management and standard work. The target state aims to improve cycle time and reduce defects.
IRJET- Web Scraping Techniques to Collect Bank Offer Data from Bank Website
This document discusses using web scraping techniques to collect bank offer data from websites. It describes how an offer scavenger software can automate the extraction of relevant data from websites and organize it in a predefined format like a database. The document then provides details on how the researchers collected bank offer data from websites like centralbank.net.in using web scraping and Python libraries. It explains the data extraction, transformation and loading process to clean the scraped data and load it into a database. Some preliminary statistics are also generated from the collected data. Finally, it discusses some legal aspects of using web scraping techniques.
IRJET- Web Scraping Techniques to Collect Bank Offer Data from Bank Website
The document discusses using web scraping techniques to collect bank offer data from websites. It describes how web scraping works by analyzing website content, extracting relevant data, and formatting it into a structured database or spreadsheet. The paper then presents the process used to scrape bank offer data from Indian websites, including developing a Python script to automate scraping, scheduling regular scraping, cleaning the extracted data, and transforming it into a standardized format for analysis. The results section demonstrates the web scraping process and shows how the extracted data is further transformed using an ETL process into a clean dataset for analytics purposes.
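The extract-and-transform steps can be sketched with the standard library alone. This is a minimal sketch: the `class="offer"` markup below is invented for illustration and does not reflect the actual structure of centralbank.net.in or any bank's pages, and the paper's own script may use different Python libraries:

```python
from html.parser import HTMLParser

class OfferParser(HTMLParser):
    """Collect the text of elements marked class="offer" (hypothetical markup)."""
    def __init__(self):
        super().__init__()
        self.in_offer = False
        self.offers = []

    def handle_starttag(self, tag, attrs):
        if ("class", "offer") in attrs:
            self.in_offer = True

    def handle_data(self, data):
        if self.in_offer and data.strip():
            self.offers.append(data.strip())
            self.in_offer = False

def extract_offers(html):
    # Extract step: pull offer text out of the page.
    p = OfferParser()
    p.feed(html)
    # Transform step: normalize case before loading into a database.
    return [o.lower() for o in p.offers]

page = '<ul><li class="offer">5% Cashback on Cards</li><li class="offer">Zero-fee FD</li></ul>'
offers = extract_offers(page)
```

Real bank pages need site-specific selectors and a scheduler around this, which is what the paper's automated scraping script provides.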
The document proposes a web analytics software to analyze website visitor behavior using data from website access logs. The software would clean and extract log file data, store it in a database, and generate statistics, graphs, and forecasts. This would provide insights into visitor patterns like which products or services are most popular, how promotions impact sales, and classify profitable customer groups. The software aims to help websites better understand visitor behavior and customize services to increase sales and profits.
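The clean-and-extract step for access logs might look like this for Common Log Format lines. A sketch under that assumption; real log formats vary, and the helper names here are invented:

```python
import re
from collections import Counter

LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def top_pages(log_lines, n=3):
    # Extract the requested path from each Common Log Format line and
    # count successful hits: a first step toward "most popular
    # product or service" statistics.
    hits = Counter()
    for line in log_lines:
        m = LOG_LINE.match(line)
        if m and m.group("status").startswith("2"):
            hits[m.group("path")] += 1
    return hits.most_common(n)

logs = [
    '1.2.3.4 - - [01/Jan/2024:10:00:00 +0000] "GET /shop HTTP/1.1" 200 512',
    '1.2.3.5 - - [01/Jan/2024:10:00:01 +0000] "GET /shop HTTP/1.1" 200 512',
    '1.2.3.6 - - [01/Jan/2024:10:00:02 +0000] "GET /about HTTP/1.1" 404 0',
]
```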
If your business is heavily dependent on the Internet, you may be facing an unprecedented level of network traffic analytics data. How to make the most of that data is the challenge. This presentation from Kentik VP Product and former EMA analyst Jim Frey explores the evolving need, the architecture and key use cases for BGP and NetFlow analysis based on scale-out cloud computing and Big Data technologies.
Performing at your best turning words into numbers and numbers into data driv...
Learn how to perform text analytics with Minitab and Python. Text mining is useful when analyzing customer comments, call center transcriptions, doctor reports or any other free-form text that contains information you want to extract. We will show the basic steps from collecting summary statistics like Inverse Document Frequency (IDF) and cleaning text to more advanced concepts such as bag of words and sentiment values.
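The IDF statistic mentioned above is a short computation. A plain-Python sketch with toy customer comments (this illustrates the formula, not Minitab's implementation):

```python
import math

def idf(term, documents):
    # Inverse Document Frequency: terms appearing in few documents
    # score high; terms appearing in every document score zero.
    df = sum(1 for doc in documents if term in doc.lower().split())
    return math.log(len(documents) / df) if df else 0.0

docs = [
    "the battery died after two days",
    "great battery life",
    "screen cracked on arrival",
]
```

Here "screen" (one document) scores higher than "battery" (two documents), which is why IDF helps surface distinctive complaints in free-form text.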
When Data Visualizations and Data Imports Just Don’t Work
When Data Visualizations and Data Imports Just Don’t Work – Importing data is a dirty job, and so is painting the final picture for users with that data. This webinar will explore the dirty little secrets that ensure data is imported completely and accurately, as well as scenarios where a visualization may not be the best approach to meeting an audit objective. Specific learning objectives include:
o Walk through case studies of “dirty” data and how to improve them using improved data requests and cleansing tools.
o Watch case study examples of top tests to validate data tables to ensure data quality.
o Discover a host of baseline tests and other baseline statistics to validate, understand and possibly extract key trends for review.
o Understand visualization and dashboard types along with their associated analytical strengths from an audit perspective.
o Identify situations where statistics may be more effective audit extractors than relying on the human eye to spot notable events.
TripChain: A Peer-to-Peer Trip Generation Database
This is the presentation given by Jon Kostyniuk at the #ITEToronto2017 conference on August 1, 2017.
The TripChain framework represents an innovation in how trip generation data points can be stored and propagated for use within the transportation industry. This is accomplished through the implementation of a distributed, open peer-to-peer database for transportation professionals.
For more information, please visit http://tripchain.org/.
Unveiling Citywide Data to Generate Artificial Intelligent Solutions
This document summarizes Ian Machen's presentation on using citywide data and artificial intelligence to generate transportation solutions. It outlines the purpose, applications, challenges, process, lessons learned, and recommendations for developing data visualizations and fusing data from multiple city systems. The presentation covered use cases, sample visualization methods, challenges around complete insights and department silos, and the recommended step-process including identifying goals, challenges, performing audits and needs assessments, and setting performance indicators.
Data is growing exponentially. What should business managers do to make better business decisions? I explain three key things step by step. Just start today!
This document provides an introduction to data science from Amity Institute of Information Technology. It discusses data science tools and applications, the data science life cycle, and data science job roles. The data science life cycle includes 6 steps: defining the problem statement, data collection, data preparation, exploratory data analysis, data modeling, and data communication. Some applications of data science mentioned are internet search, recommendation systems, image and speech recognition, gaming, and online price comparison. Common data science jobs include data scientist, data engineer, data analyst, statistician, data architect, data admin, business analyst, and data/analytics manager.
Keys To World-Class Retail Web Performance - Expert tips for holiday web read...
As Walmart.com’s former head of Performance and Reliability, Cliff Crocker knows large scale web performance. Now SOASTA’s VP of products, Cliff is pouring his passion and expertise into cloud testing to solve the biggest challenges in mobile and web performance.
The holiday rush of mobile and web traffic to your web site has the potential for unprecedented success or spectacular public failure. The world’s leading retailers have turned to the cloud to assure that no matter what load, mobile and web apps will delight customers and protect revenue.
Join us as Cliff explores the key criteria for holiday web performance readiness:
Closing the gap in front- and back-end web performance and reliability
Collecting real user data to define the most realistic test scenarios
Preparing properly for the virtual walls of traffic during peak events
Leveraging CloudTest technology, as have 6 of 10 leading retailers
Presentation by John Repko to the Colorado Society for Information Management (http://www.sim-colorado.org/), March 19, 2013. It talks about big data "killer apps," and the two kinds of innovation ("Hindsight" and "Foresight") that big data can bring to any business.
How we built analytics from scratch (in seven easy steps)
Plumbee built analytics capabilities from scratch over time by starting with third-party analytics, collecting extensive unstructured event data, analyzing that data using spreadsheets and Hive, developing in-house experimentation and automation tools, and eventually transitioning to a relational data mart. This process allowed Plumbee to gain insights, improve features and the virtual economy, and scale their data use to support growth from 3 founders to 1.2 million monthly active users.
The document describes EMC's experiences with environmental data analytics projects. It discusses EMC setting up India's first environmental data management system for CPCB in 1986. This included air and water data management and analysis. The document also outlines other projects EMC has worked on, including an online environmental monitoring system for Egypt, analysis of Ganga river water quality data from sensors, and a corporate sustainability report for an Indian company. The presentation emphasizes that environmental data is large, irregular, fuzzy and from diverse sources, requiring advanced analytics to generate meaningful insights and reports.
I have been working on a new breed of estimation methodologies called "Open estimation methodologies". They can also be called "Deliverable based estimation methodologies". This presentation is about this family of methodologies.
Improving D3 Performance with CANVAS and other Hacks
This document discusses techniques for improving the performance of D3 visualizations. It begins with an overview of D3 and some basic tutorials. It then describes issues with performance for force-directed layouts and edge-bundled layouts as the number of nodes and links increases. Solutions proposed include using canvas instead of SVG for rendering, reducing unnecessary calculations, and caching repeated drawing states. The document concludes that the number of DOM nodes has major performance implications and techniques like canvas can help when exact mouse interactions are not required.
Frontend Performance: Beginner to Expert to Crazy Person
There’s no such thing as fast enough. You can always make your website faster. This talk will show you how. The very first requirement of a great user experience is actually getting the bytes of that experience to the user before they get tired and leave. In this talk we’ll start with the basics and get progressively insane. We’ll go over several frontend performance best practices, a few anti-patterns, the reasoning behind the rules, and how they’ve changed over the years. We’ll also look at some great tools to help you.
Frontend Performance: Beginner to Expert to Crazy Person (French edition)
Frontend Performance Beginner to Expert to Crazy Person
The very first requirement of a great user experience is actually getting the bytes of that experience to the user before they get tired and leave.
In this talk we'll start with the basics and get progressively insane. We'll go over several frontend performance best practices, a few anti-patterns, the reasoning behind the rules, and how they've changed over the years. We'll also look at some great tools to help you.
Frontend performance from beginner, to expert, to crazy person!
The very first prerequisite for a good user experience is getting the bytes of that experience to the user before they get tired and leave.
We'll start this talk with the basics and get progressively insane. We'll cover several frontend performance best practices, a few anti-patterns to avoid, the reasoning behind the rules, and how those rules have changed over the years. We'll also take a closer look at some great tools that can help you.
The document outlines steps for front-end performance optimization, beginning with basic techniques like caching, compression and domain sharing and progressing to more advanced strategies involving preloading, parallel downloads, and predicting response times. It was presented by Philip Tellis at WebPerfDays New York and includes references for further reading on topics like CDNs, TCP tuning, and the page visibility API.
RUM isn’t just for page level metrics anymore. Thanks to modern browser updates and new techniques we can collect real user data at the object level, finding slow page components and keeping third parties honest.
In this talk we will show you how to use Resource Timing, User Timing, and other browser tricks to time the most important components in your page. We’ll also share recipes for several of the web’s most popular third parties. This will give you a head start on measuring object level performance on your own site.
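Once Resource Timing entries are beaconed back, spotting slow third parties is an aggregation problem. A sketch over hypothetical beacon records shaped like the spec's `name` and `duration` fields (the hosts and numbers below are invented):

```python
from collections import defaultdict
from urllib.parse import urlparse

def slowest_third_parties(entries, first_party="example.com"):
    # Aggregate Resource Timing durations (ms) by host to see which
    # third parties cost the most on the page.
    totals = defaultdict(float)
    for e in entries:
        host = urlparse(e["name"]).hostname
        if host and not host.endswith(first_party):
            totals[host] += e["duration"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

entries = [
    {"name": "https://example.com/app.js", "duration": 120.0},
    {"name": "https://cdn.adnetwork.test/tag.js", "duration": 480.0},
    {"name": "https://fonts.provider.test/face.woff2", "duration": 90.0},
]
```

This is the "keeping third parties honest" step: first-party resources are excluded, and the remaining hosts are ranked by total time.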
Frontend Performance: Beginner to Expert to Crazy Person (San Diego Web Perf ...
The document outlines steps web performance experts take to optimize frontend performance, moving from beginner to advanced techniques. It starts with basic optimizations like enabling gzip, caching, and image optimization. It then discusses more advanced strategies like using a CDN, splitting JavaScript, auditing CSS, and parallelizing downloads. Finally it discusses very advanced techniques like pre-loading assets, detecting broken Accept-Encoding headers, and understanding how to optimize for HTTP/2. The document provides references for further information on each topic.
Frontend Performance: Beginner to Expert to Crazy Person
The document discusses front-end web performance optimization from beginner to expert levels. At the beginner level, it recommends starting with basic optimizations like measuring performance, enabling gzip compression, optimizing images, and caching. At the expert level, it discusses more advanced techniques like using a CDN, splitting JavaScript files, auditing CSS, and flushing content early. Finally, it outlines "crazy" optimizations like pre-loading assets, post-load fetching, and understanding round-trip network latency.
Frontend Performance: Beginner to Expert to Crazy Person
Boston Web Performance Meetup, April 22, 2014
The very first requirement of a great user experience is actually getting the bytes of that experience to the user before they get fed up and leave. In this talk we'll start with the basics and get progressively insane. We'll go over several front-end performance best practices, a few anti-patterns, the reasoning behind the rules, and how they've changed over the years. We'll also look at some great tools to help you.
Schedule: 6:30, pizza
7:15: talk
Frontend Performance: Beginner to Expert to Crazy Person
The very first requirement of a great user experience is actually getting the bytes of that experience to the user before they get fed up and leave.
In this talk we'll start with the basics and get progressively insane. We'll go over several frontend performance best practices, a few anti-patterns, the reasoning behind the rules, and how they've changed over the years. We'll also look at some great tools to help you.
The document appears to be a presentation on measuring real user experiences using Real User Monitoring (RUM) and analyzing the data. It discusses using RUM tools like Boomerang to collect data on user behavior and performance in real-time. The presentation then examines specific metrics collected like user patience, cache behavior, and how quickly new software versions are distributed based on the RUM data.
Improving 3rd Party Script Performance With IFrames
This document discusses using <IFRAME> tags to improve the performance of third party scripts. It describes how third party scripts normally block page loading and proposes using an iframe to load scripts asynchronously in parallel without blocking. It provides code for creating an iframe targeted to load scripts, handling cross-domain issues, and modifying the Method Queue Pattern to support iframes. The approach allows third party scripts to load without blocking the main page load.
The document discusses Boomerang, an open source tool for measuring real user performance on websites. It measures load times, bandwidth usage, latency and other metrics. Additional functionality can be added through plugins. The presentation encourages developers to use Boomerang to analyze user behavior, identify performance issues, and continuously improve sites based on real user data. It provides several examples of insights that can be gained, such as how performance varies by country, browser, and internet connection speed.
Abusing JavaScript to measure Web Performance, or, "how does boomerang work?"
The document is a presentation about abusing JavaScript to measure web performance. It discusses using JavaScript to measure network latency, TCP handshake time, network throughput, DNS lookup time, IPv6 support and latency, and other performance metrics. It provides code examples for measuring each metric in JavaScript and notes challenges to consider. The presentation encourages the use of the open source Boomerang library for accurate performance measurement.
While building boomerang, we developed many interesting methods to measure network performance characteristics using JavaScript running in the browser. While the W3C's NavigationTiming API provides access to many performance metrics, there's far more you can get at with some creative tweaking and analysis of how the browser reacts to certain requests.
In this talk, I'll go into the details of how boomerang works to measure network throughput, latency, TCP connect time, DNS time and IPv6 connectivity. I'll also touch upon some of the other performance related browser APIs we use to gather useful information. I will NOT be covering the W3C Navigation Timing API since that's been covered by Alois Reitbauer in a previous Boston Web Perf talk.
The document discusses analyzing real user monitoring (RUM) data to gain insights into website performance and user behavior. It describes building plugins to collect navigation and timing data from browsers. Various statistical techniques for analyzing the data are covered, including log-normal distributions, filtering outliers, sampling, and correlating metrics like page load time and bounce rates. The analysis of an example 8 million page dataset suggests very fast or slow page loads are associated with higher bounce rates, and thresholds for user-unfriendly performance are proposed based on bounce rates exceeding 50%.
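The load-time-versus-bounce-rate correlation comes from bucketing pageviews by load time. A sketch with toy `(load_ms, bounced)` tuples; the talk's dataset, bucket widths, and thresholds are its own:

```python
def bounce_rate_by_bucket(pageviews, bucket_ms=1000):
    # Bucket page loads by load time and compute the bounce rate per
    # bucket. In the analysis described above, both the very fast and
    # the very slow buckets showed elevated bounce rates.
    buckets = {}
    for load_ms, bounced in pageviews:
        b = int(load_ms // bucket_ms)
        total, bounces = buckets.get(b, (0, 0))
        buckets[b] = (total + 1, bounces + (1 if bounced else 0))
    return {b: bounces / total for b, (total, bounces) in buckets.items()}

views = [(300, True), (400, True), (1500, False), (1800, False), (9500, True)]
rates = bounce_rate_by_bucket(views)
```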
This document contains slides from a presentation about using JavaScript to analyze network performance. It discusses how to measure latency, TCP handshake time, network throughput, DNS lookup time, IPv6 support and latency, and private network scanning using JavaScript. Code examples are provided for measuring each of these network metrics by making image requests and timing the responses. The presentation emphasizes that accurately measuring network throughput requires requesting resources of different sizes and accounting for TCP slow start. It also notes some challenges around caching and geo-located DNS results.
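The throughput measurement rests on simple algebra: time two downloads of different sizes and solve the pair of equations t = latency + size / bandwidth. A sketch of that arithmetic (idealized: it ignores TCP slow start, which is why the approach described above actually requests several progressively larger resources; the sizes and timings below are made up):

```python
def bandwidth_and_latency(size1_bytes, t1_s, size2_bytes, t2_s):
    # One timed download cannot separate latency from bandwidth; two
    # downloads of different sizes can, because the fixed latency term
    # cancels when the equations are subtracted.
    bandwidth = (size2_bytes - size1_bytes) / (t2_s - t1_s)  # bytes/s
    latency = t1_s - size1_bytes / bandwidth                 # seconds
    return bandwidth, latency

# e.g. a 10 KB image fetched in 0.18 s and a 100 KB image in 0.90 s
bw, lat = bandwidth_and_latency(10_000, 0.18, 100_000, 0.90)
```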
Apple (AAPL) valuation using Discounted Cash Flow (DCF) model, by Naren Chawla
This document provides an analysis of Apple's (AAPL) valuation. It includes sections on Apple's corporate overview, financials, competitive landscape, trends and forecasts, key assumptions for valuation, a discounted cash flow valuation, and a conclusion. The summary values Apple at around $240 currently, believing the iPhone will maintain innovation leadership, Google will not dominate with Android, Apple can succeed without Steve Jobs, and the anticipated iTablet will compensate for slower desktop and laptop growth.
MeetBSDCA 2014 Performance Analysis for BSD, by Brendan Gregg. A tour of five relevant topics: observability tools, methodologies, benchmarking, profiling, and tracing. Tools summarized include pmcstat and DTrace.
Kingfisher PLC is a multinational home improvement company operating stores across Europe and Asia. The document analyzes Kingfisher's strategy using tools like PESTLE, Porter's Five Forces, SWOT and BCG matrix. It finds Kingfisher has a strong brand portfolio and loyal customers but could invest more in R&D. The strategy aims to grow profits in key markets like the UK and expand operations in Asia and Eastern Europe. The analysis concludes Kingfisher should consider expanding its strong brands into new international markets to remain competitive.
Trend Analysis Of Balance Sheet Of Apple Companysunnychhutani28
This document analyzes the annual report of Apple Inc. from 2004 to 2008. It summarizes the company's income statement, showing increasing sales, earnings, and net income each year. It also analyzes Apple's balance sheet, noting increasing current assets, total assets, and total equity. Overall, the trend percentages show strong growth across all financial metrics each year, indicating Apple has a solid financial position and is growing rapidly over this period.
This document discusses common size analysis, which allows companies to be compared across time and against competitors by expressing financial statement items as percentages of a base figure. There are three types of common size analysis: vertical common size income statements, horizontal common size income statements, and common size balance sheets. An example is provided where net income figures for two companies of different sizes are expressed as a percentage of sales revenue for better comparison. The document outlines some limitations of common size analysis and provides an example analysis of common size income statements and balance sheets.
financial analysis of icici prudential life insuranceamit soni
This document provides a financial analysis of ICICI Prudential Life Insurance company. It includes an overview of the company and its products, sources of finance such as equity shares and debt, comparative financial statements from 2007-2008 showing increases in reserves and current assets. Financial ratios are calculated including current ratio, profitability ratio, debt ratio and return ratio. Findings note more charges taken from customers and lack of profit until 2008. Recommendations include caring more about fund management and providing lower charges to customers.
The document compares several major financial crises from 1929 to present day. It summarizes the key causes and impacts of each crisis, including the Great Depression, Black Monday, the European Exchange Rate Mechanism crisis, the Global Financial Crisis, the Greece crisis, Japan's debt crisis, and Brexit. It analyzes factors like leverage, liquidity issues, and policy failures that led to the crises. Economic indicators like GDP, unemployment, inflation, and stock market losses are compared across the different crises. Overall lessons on regulation, coordination, and stability of exchange rate mechanisms are discussed.
Bajaj Auto was founded in 1926 and initially manufactured sugar before diversifying into vehicle manufacturing in 1945. It is now India's largest two and three-wheeler manufacturer and the world's fourth largest. Bajaj Auto has experienced steady growth and released many new vehicle models over time. While its financial position is not as strong as competitor Hero Honda, with lower profit margins and negative working capital, Bajaj Auto remains an important player in India's large automobile industry and continues community service initiatives.
This document analyzes GM's balance sheet to highlight three important traits: size, liquidity, and solvency. It summarizes that GM is one of the largest companies in America based on total assets of $189 billion. It has adequate liquidity with a current ratio of 1.13, on par with other large companies. While GM's debt-to-equity ratio of 1.66 exceeds 1.0, it is comparable to many other S&P 500 firms.
The document discusses Kingfisher Airlines, an Indian airline established in 2003 that began operations in 2005. It provides key details about Kingfisher such as its headquarters, destinations served, and five-star rating. It then outlines Kingfisher's strengths as well as weaknesses, opportunities, and threats using a SWOT analysis framework. The presentation concludes by discussing problems Kingfisher faced such as heavy losses, strikes, and lack of management, and provides suggestions for how Kingfisher can continue to meet expectations of customers, suppliers, employees, and society.
This document analyzes Coca-Cola's financial statements and business strategies. It begins with an analysis of Coca-Cola's governance, including details about the CEO, board of directors, and executive compensation. It then discusses Porter's Five Forces analysis of the soda industry, finding rivalry to be high but threats of new entrants and substitutes to be medium. The document also analyzes Coca-Cola's income statements, balance sheets, profitability, and forecasts growth.
The document discusses various types of financial statement analysis including comparative statements, common size statements, trend analysis, and ratio analysis. It provides an example of a comparative balance sheet analysis for a company from 2006 to 2007. The analysis shows increases in fixed assets, long term liabilities, equity, and current assets. It also shows an increase in sales, gross profit, and net profit, indicating the overall financial position and profitability of the company improved from 2006 to 2007.
A financial analysis for Coca-Cola:
company profile, financial statement, liquidity ratio, current ratio, cash ratio, quick ratio, profitability, efficiency, short term activity, long term activity, solvency, DuPont analysis and historical enterprise value (HEV).
Done By Elie Obeid and Isabelle Khalil
This document provides an overview of project planning and estimation using function point analysis. It discusses defining a project's domain and scope, comparing projects and products, and the learning from projects. It also covers the software development lifecycle (SDLC) including planning, estimation models, and a case study on student registration using object-oriented analysis and design. Throughout are examples of applying function point analysis to estimate effort for a simple customer project.
Lean Six Sigma Green Belt Certification 1 - Fred Zuercher
This document outlines a project to map and optimize the end-to-end total account process for a client globally. The project aims to identify opportunities to align with global best practices, standardize processes, and reduce turnaround times by 30-50%. Key deliverables include developing level 1 and 2 process maps, identifying optimization opportunities, and helping to validate process flows for a lift and shift project. The document describes the current state process which experiences high data mismatches and variations. Analysis identifies constraints like specialized roles and data issues. Recommendations include process documentation, error proofing, reducing handoffs, and using lean tools like 5S, visual management and standard work. The target state aims to improve cycle time, reduce defects, and
IRJET- Web Scraping Techniques to Collect Bank Offer Data from Bank Website - IRJET Journal
This document discusses using web scraping techniques to collect bank offer data from websites. It describes how an offer scavenger software can automate the extraction of relevant data from websites and organize it in a predefined format like a database. The document then provides details on how the researchers collected bank offer data from websites like centralbank.net.in using web scraping and Python libraries. It explains the data extraction, transformation and loading process to clean the scraped data and load it into a database. Some preliminary statistics are also generated from the collected data. Finally, it discusses some legal aspects of using web scraping techniques.
IRJET- Web Scraping Techniques to Collect Bank Offer Data from Bank Website - IRJET Journal
The document discusses using web scraping techniques to collect bank offer data from websites. It describes how web scraping works by analyzing website content, extracting relevant data, and formatting it into a structured database or spreadsheet. The paper then presents the process used to scrape bank offer data from Indian websites, including developing a Python script to automate scraping, scheduling regular scraping, cleaning the extracted data, and transforming it into a standardized format for analysis. The results section demonstrates the web scraping process and shows how the extracted data is further transformed using an ETL process into a clean dataset for analytics purposes.
The document proposes a web analytics software to analyze website visitor behavior using data from website access logs. The software would clean and extract log file data, store it in a database, and generate statistics, graphs, and forecasts. This would provide insights into visitor patterns like which products or services are most popular, how promotions impact sales, and classify profitable customer groups. The software aims to help websites better understand visitor behavior and customize services to increase sales and profits.
If your business is heavily dependent on the Internet, you may be facing an unprecedented level of network traffic analytics data. How to make the most of that data is the challenge. This presentation from Kentik VP Product and former EMA analyst Jim Frey explores the evolving need, the architecture and key use cases for BGP and NetFlow analysis based on scale-out cloud computing and Big Data technologies.
Performing at your best turning words into numbers and numbers into data driv... - Minitab, LLC
Learn how to perform text analytics with Minitab and Python. Text mining is useful when analyzing customer comments, call center transcriptions, doctor reports or any other free-form text that contains information you want to extract. We will show the basic steps from collecting summary statistics like Inverse Document Frequency (IDF) and cleaning text to more advanced concepts such as bag of words and sentiment values.
When Data Visualizations and Data Imports Just Don’t Work - Jim Kaplan CIA CFE
When Data Visualizations and Data Imports Just Don’t Work – Importing data is a dirty job, and so is painting a final picture for users with that data. This webinar will explore the dirty little secrets that ensure data is imported completely and accurately, as well as scenarios where a visualization may not be the best approach to meeting an audit objective. Specific learning objectives include:
o Walk through case studies of “dirty” data and how to improve them using improved data requests and cleansing tools.
o Watch case study examples of top tests to validate data tables to ensure data quality.
o Discover a host of baseline tests and other baseline statistics to validate, understand and possibly extract key trends for review.
o Understand visualization and dashboard types along with their associated analytical strengths from an audit perspective.
o Identify situations where statistics may be more effective audit extractors than relying on the human eye to spot notable events.
TripChain: A Peer-to-Peer Trip Generation Database - Jon Kostyniuk
This is the presentation given by Jon Kostyniuk at the #ITEToronto2017 conference on August 1, 2017.
The TripChain framework represents an innovation in how trip generation data points can be stored and propagated for use within the transportation industry. This is accomplished through the implementation of a distributed, open peer-to-peer database for transportation professionals.
For more information, please visit http://tripchain.org/.
Unveiling Citywide Data to Generate Artificial Intelligent Solutions - RPO America
This document summarizes Ian Machen's presentation on using citywide data and artificial intelligence to generate transportation solutions. It outlines the purpose, applications, challenges, process, lessons learned, and recommendations for developing data visualizations and fusing data from multiple city systems. The presentation covered use cases, sample visualization methods, challenges around complete insights and department silos, and the recommended step-process including identifying goals, challenges, performing audits and needs assessments, and setting performance indicators.
Data is growing exponentially. What should business managers do to make better business decisions? I explain three key things step by step. Just start today!
This document provides an introduction to data science from Amity Institute of Information Technology. It discusses data science tools and applications, the data science life cycle, and data science job roles. The data science life cycle includes 6 steps: defining the problem statement, data collection, data preparation, exploratory data analysis, data modeling, and data communication. Some applications of data science mentioned are internet search, recommendation systems, image and speech recognition, gaming, and online price comparison. Common data science jobs include data scientist, data engineer, data analyst, statistician, data architect, data admin, business analyst, and data/analytics manager.
Keys To World-Class Retail Web Performance - Expert tips for holiday web read... - SOASTA
As Walmart.com’s former head of Performance and Reliability, Cliff Crocker knows large scale web performance. Now SOASTA’s VP of products, Cliff is pouring his passion and expertise into cloud testing to solve the biggest challenges in mobile and web performance.
The holiday rush of mobile and web traffic to your web site has the potential for unprecedented success or spectacular public failure. The world’s leading retailers have turned to the cloud to assure that no matter what load, mobile and web apps will delight customers and protect revenue.
Join us as Cliff explores the key criteria for holiday web performance readiness:
Closing the gap in front- and back-end web performance and reliability
Collecting real user data to define the most realistic test scenarios
Preparing properly for the virtual walls of traffic during peak events
Leveraging CloudTest technology, as have 6 of 10 leading retailers
Presentation by John Repko to the Colorado Society for Information Management (http://www.sim-colorado.org/), March 19, 2013. It talks about big data "killer apps," and the two kinds of innovation ("Hindsight" and "Foresight") that big data can bring to any business.
How we built analytics from scratch (in seven easy steps) - plumbee
Plumbee built analytics capabilities from scratch over time by starting with third-party analytics, collecting extensive unstructured event data, analyzing that data using spreadsheets and Hive, developing in-house experimentation and automation tools, and eventually transitioning to a relational data mart. This process allowed Plumbee to gain insights, improve features and the virtual economy, and scale their data use to support growth from 3 founders to 1.2 million monthly active users.
The document describes EMC's experiences with environmental data analytics projects. It discusses EMC setting up India's first environmental data management system for CPCB in 1986. This included air and water data management and analysis. The document also outlines other projects EMC has worked on, including an online environmental monitoring system for Egypt, analysis of Ganga river water quality data from sensors, and a corporate sustainability report for an Indian company. The presentation emphasizes that environmental data is large, irregular, fuzzy and from diverse sources, requiring advanced analytics to generate meaningful insights and reports.
I have been working on a new breed of estimation methodologies called "Open estimation methodologies". They can also be called "Deliverable-based estimation methodologies". This presentation is about this family of methodologies.
Similar to The Statistics of Web Performance Analysis (20)
Improving D3 Performance with CANVAS and other Hacks - Philip Tellis
This document discusses techniques for improving the performance of D3 visualizations. It begins with an overview of D3 and some basic tutorials. It then describes issues with performance for force-directed layouts and edge-bundled layouts as the number of nodes and links increases. Solutions proposed include using canvas instead of SVG for rendering, reducing unnecessary calculations, and caching repeated drawing states. The document concludes that the number of DOM nodes has major performance implications and techniques like canvas can help when exact mouse interactions are not required.
Frontend Performance: Beginner to Expert to Crazy Person - Philip Tellis
There’s no such thing as fast enough. You can always make your website faster. This talk will show you how. The very first requirement of a great user experience is actually getting the bytes of that experience to the user before they get tired and leave. In this talk we’ll start with the basics and get progressively insane. We’ll go over several frontend performance best practices, a few anti-patterns, the reasoning behind the rules, and how they’ve changed over the years. We’ll also look at some great tools to help you.
Frontend Performance: De débutant à Expert à Fou Furieux - Philip Tellis
Frontend Performance Beginner to Expert to Crazy Person
The very first requirement of a great user experience is actually getting the bytes of that experience to the user before they get tired and leave.
In this talk we'll start with the basics and get progressively insane. We'll go over several frontend performance best practices, a few anti-patterns, the reasoning behind the rules, and how they've changed over the years. We'll also look at some great tools to help you.
Frontend Performance: Expert to Crazy Person - Philip Tellis
The document outlines steps for front-end performance optimization, beginning with basic techniques like caching, compression and domain sharding and progressing to more advanced strategies involving preloading, parallel downloads, and predicting response times. It was presented by Philip Tellis at WebPerfDays New York and includes references for further reading on topics like CDNs, TCP tuning, and the page visibility API.
RUM isn’t just for page level metrics anymore. Thanks to modern browser updates and new techniques we can collect real user data at the object level, finding slow page components and keeping third parties honest.
In this talk we will show you how to use Resource Timing, User Timing, and other browser tricks to time the most important components in your page. We’ll also share recipes for several of the web’s most popular third parties. This will give you a head start on measuring object level performance on your own site.
Frontend Performance: Beginner to Expert to Crazy Person (San Diego Web Perf ...) - Philip Tellis
The document outlines steps web performance experts take to optimize frontend performance, moving from beginner to advanced techniques. It starts with basic optimizations like enabling gzip, caching, and image optimization. It then discusses more advanced strategies like using a CDN, splitting JavaScript, auditing CSS, and parallelizing downloads. Finally it discusses very advanced techniques like pre-loading assets, detecting broken Accept-Encoding headers, and understanding how to optimize for HTTP/2. The document provides references for further information on each topic.
Frontend Performance: Beginner to Expert to Crazy Person - Philip Tellis
The document discusses front-end web performance optimization from beginner to expert levels. At the beginner level, it recommends starting with basic optimizations like measuring performance, enabling gzip compression, optimizing images, and caching. At the expert level, it discusses more advanced techniques like using a CDN, splitting JavaScript files, auditing CSS, and flushing content early. Finally, it outlines "crazy" optimizations like pre-loading assets, post-load fetching, and understanding round-trip network latency.
Frontend Performance: Beginner to Expert to Crazy Person - Philip Tellis
Boston Web Performance Meetup, April 22, 2014
The very first requirement of a great user experience is actually getting the bytes of that experience to the user before they get fed up and leave. In this talk we'll start with the basics and get progressively insane. We'll go over several front-end performance best practices, a few anti-patterns, the reasoning behind the rules, and how they've changed over the years. We'll also look at some great tools to help you.
Schedule: 6:30 pizza, 7:15 talk
Frontend Performance: Beginner to Expert to Crazy Person - Philip Tellis
The very first requirement of a great user experience is actually getting the bytes of that experience to the user before they get fed up and leave.
In this talk we'll start with the basics and get progressively insane. We'll go over several frontend performance best practices, a few anti-patterns, the reasoning behind the rules, and how they've changed over the years. We'll also look at some great tools to help you.
The document appears to be a presentation on measuring real user experiences using Real User Monitoring (RUM) and analyzing the data. It discusses using RUM tools like Boomerang to collect data on user behavior and performance in real-time. The presentation then examines specific metrics collected like user patience, cache behavior, and how quickly new software versions are distributed based on the RUM data.
Improving 3rd Party Script Performance With IFrames - Philip Tellis
This document discusses using <IFRAME> tags to improve the performance of third party scripts. It describes how third party scripts normally block page loading and proposes using an iframe to load scripts asynchronously in parallel without blocking. It provides code for creating an iframe targeted to load scripts, handling cross-domain issues, and modifying the Method Queue Pattern to support iframes. The approach allows third party scripts to load without blocking the main page load.
The document discusses Boomerang, an open source tool for measuring real user performance on websites. It measures load times, bandwidth usage, latency and other metrics. Additional functionality can be added through plugins. The presentation encourages developers to use Boomerang to analyze user behavior, identify performance issues, and continuously improve sites based on real user data. It provides several examples of insights that can be gained, such as how performance varies by country, browser, and internet connection speed.
Abusing JavaScript to measure Web Performance, or, "how does boomerang work?" - Philip Tellis
The document is a presentation about abusing JavaScript to measure web performance. It discusses using JavaScript to measure network latency, TCP handshake time, network throughput, DNS lookup time, IPv6 support and latency, and other performance metrics. It provides code examples for measuring each metric in JavaScript and notes challenges to consider. The presentation encourages the use of the open source Boomerang library for accurate performance measurement.
Abusing JavaScript to Measure Web Performance - Philip Tellis
While building boomerang, we developed many interesting methods to measure network performance characteristics using JavaScript running in the browser. While the W3C's NavigationTiming API provides access to many performance metrics, there's far more you can get at with some creative tweaking and analysis of how the browser reacts to certain requests.
In this talk, I'll go into the details of how boomerang works to measure network throughput, latency, TCP connect time, DNS time and IPv6 connectivity. I'll also touch upon some of the other performance related browser APIs we use to gather useful information. I will NOT be covering the W3C Navigation Timing API since that's been covered by Alois Reitbauer in a previous Boston Web Perf talk.
The document discusses analyzing real user monitoring (RUM) data to gain insights into website performance and user behavior. It describes building plugins to collect navigation and timing data from browsers. Various statistical techniques for analyzing the data are covered, including log-normal distributions, filtering outliers, sampling, and correlating metrics like page load time and bounce rates. The analysis of an example 8 million page dataset suggests very fast or slow page loads are associated with higher bounce rates, and thresholds for user-unfriendly performance are proposed based on bounce rates exceeding 50%.
Analysing network characteristics with JavaScript - Philip Tellis
This document contains slides from a presentation about using JavaScript to analyze network performance. It discusses how to measure latency, TCP handshake time, network throughput, DNS lookup time, IPv6 support and latency, and private network scanning using JavaScript. Code examples are provided for measuring each of these network metrics by making image requests and timing the responses. The presentation emphasizes that accurately measuring network throughput requires requesting resources of different sizes and accounting for TCP slow start. It also notes some challenges around caching and geo-located DNS results.
A Node.JS bag of goodies for analyzing Web Traffic - Philip Tellis
This document is a presentation about analyzing web traffic using Node.js modules. It introduces Node.js and the npm package manager. It then discusses modules for parsing HTTP logs, including parsing user agents, handling IP addresses, geolocation, and date formatting. It also covers modules for statistical analysis like fast-stats, gauss, and statsd. The presentation provides code examples for using these modules and takes questions at the end.
The document discusses input validation and output encoding to prevent vulnerabilities like XSS and SQL injection. It provides examples of how unexpected input can enable attacks, like special characters or invalid data types being passed to endpoints and rendered unencoded. The key lessons are that input validation is needed to receive clean, expected data, while output encoding is crucial to prevent exploits when displaying data to users. Both techniques are important defenses that address different but related issues.
Messing with JavaScript and the DOM to measure network characteristics - Philip Tellis
This document discusses using JavaScript to analyze network performance. It covers measuring latency, TCP handshake time, DNS lookup time, network throughput, and IPv6 support. The document provides code examples for measuring each of these metrics using JavaScript and analyzing image load times. It notes that network conditions vary and accurate measurements require statistical analysis over many samples.
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024 - SynapseIndia
Your comprehensive guide to RPA in healthcare for 2024. Explore the benefits, use cases, and emerging trends of robotic process automation. Understand the challenges and prepare for the future of healthcare automation.
7 Most Powerful Solar Storms in the History of Earth - Enterprise Wired
Solar storms (geomagnetic storms) are driven by charged particles accelerated to high velocities in the solar environment by coronal mass ejections (CMEs).
Slides in English presented at the 100% IA event held at Iguane Solutions' Paris offices on Tuesday, July 2, 2024:
- Presentation of our plug-and-play AI platform: its advanced features, such as its intuitive user interface, its powerful copilot, and high-performance monitoring tools.
- Customer case study: Cyril Janssens, CTO of easybourse, shares his experience using our plug & play AI platform.
YOUR RELIABLE WEB DESIGN & DEVELOPMENT TEAM — FOR LASTING SUCCESS
WPRiders is a web development company specialized in WordPress and WooCommerce websites and plugins for customers around the world. The company is headquartered in Bucharest, Romania, but our team members are located all over the world. Our customers are primarily from the US and Western Europe, but we have clients from Australia, Canada and other areas as well.
Some facts about WPRiders and why we are one of the best firms around:
More than 700 five-star reviews! You can check them here.
1500 WordPress projects delivered.
We respond 80% faster than other firms! Data provided by Freshdesk.
We’ve been in business since 2015.
We are located in 7 countries and have 22 team members.
With so many projects delivered, our team knows what works and what doesn’t when it comes to WordPress and WooCommerce.
Our team members are:
- highly experienced developers (employees & contractors with 5-10+ years of experience),
- great designers with an eye for UX/UI with 10+ years of experience
- project managers with development background who speak both tech and non-tech
- QA specialists
- Conversion Rate Optimisation - CRO experts
They are all working together to provide you with the best possible service. We are passionate about WordPress, and we love creating custom solutions that help our clients achieve their goals.
At WPRiders, we are committed to building long-term relationships with our clients. We believe in accountability, in doing the right thing, as well as in transparency and open communication. You can read more about WPRiders on the About us page.
Details of description part II: Describing images in practice - Tech Forum 2024 - BookNet Canada
This presentation explores the practical application of image description techniques. Familiar guidelines will be demonstrated in practice, and descriptions will be developed “live”! If you have learned a lot about the theory of image description techniques but want to feel more confident putting them into practice, this is the presentation for you. There will be useful, actionable information for everyone, whether you are working with authors, colleagues, alone, or leveraging AI as a collaborator.
Link to presentation recording and transcript: https://bnctechforum.ca/sessions/details-of-description-part-ii-describing-images-in-practice/
Presented by BookNet Canada on June 25, 2024, with support from the Department of Canadian Heritage.
How RPA Helps in the Transportation and Logistics Industry - SynapseIndia
Revolutionize your transportation processes with our cutting-edge RPA software. Automate repetitive tasks, reduce costs, and enhance efficiency in the logistics sector with our advanced solutions.
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo... - Chris Swan
Have you noticed the OpenSSF Scorecard badges on the official Dart and Flutter repos? It's Google's way of showing that they care about security. Practices such as pinning dependencies, branch protection, required reviews, continuous integration tests etc. are measured to provide a score and accompanying badge.
You can do the same for your projects, and this presentation will show you how, with an emphasis on the unique challenges that come up when working with Dart and Flutter.
The session will provide a walkthrough of the steps involved in securing a first repository, and then what it takes to repeat that process across an organization with multiple repos. It will also look at the ongoing maintenance involved once scorecards have been implemented, and how aspects of that maintenance can be better automated to minimize toil.
Kief Morris rethinks the infrastructure code delivery lifecycle, advocating for a shift towards composable infrastructure systems. We should shift to designing around deployable components rather than code modules, use more useful levels of abstraction, and drive design and deployment from applications rather than bottom-up, monolithic architecture and delivery.
Implementations of Fused Deposition Modeling in real world - Emerging Tech
The presentation showcases the diverse real-world applications of Fused Deposition Modeling (FDM) across multiple industries:
1. **Manufacturing**: FDM is utilized in manufacturing for rapid prototyping, creating custom tools and fixtures, and producing functional end-use parts. Companies leverage its cost-effectiveness and flexibility to streamline production processes.
2. **Medical**: In the medical field, FDM is used to create patient-specific anatomical models, surgical guides, and prosthetics. Its ability to produce precise and biocompatible parts supports advancements in personalized healthcare solutions.
3. **Education**: FDM plays a crucial role in education by enabling students to learn about design and engineering through hands-on 3D printing projects. It promotes innovation and practical skill development in STEM disciplines.
4. **Science**: Researchers use FDM to prototype equipment for scientific experiments, build custom laboratory tools, and create models for visualization and testing purposes. It facilitates rapid iteration and customization in scientific endeavors.
5. **Automotive**: Automotive manufacturers employ FDM for prototyping vehicle components, tooling for assembly lines, and customized parts. It speeds up the design validation process and enhances efficiency in automotive engineering.
6. **Consumer Electronics**: FDM is utilized in consumer electronics for designing and prototyping product enclosures, casings, and internal components. It enables rapid iteration and customization to meet evolving consumer demands.
7. **Robotics**: Robotics engineers leverage FDM to prototype robot parts, create lightweight and durable components, and customize robot designs for specific applications. It supports innovation and optimization in robotic systems.
8. **Aerospace**: In aerospace, FDM is used to manufacture lightweight parts, complex geometries, and prototypes of aircraft components. It contributes to cost reduction, faster production cycles, and weight savings in aerospace engineering.
9. **Architecture**: Architects utilize FDM for creating detailed architectural models, prototypes of building components, and intricate designs. It aids in visualizing concepts, testing structural integrity, and communicating design ideas effectively.
Each industry example demonstrates how FDM enhances innovation, accelerates product development, and addresses specific challenges through advanced manufacturing capabilities.
Measuring the Impact of Network Latency at Twitter - ScyllaDB
Widya Salim and Victor Ma will outline the causal impact analysis, framework, and key learnings used to quantify the impact of reducing Twitter's network latency.
Indian Air Force Fighter Planes List - jackson110191
These fighter aircraft have uses outside of traditional combat situations. They are essential in defending India's territorial integrity, averting dangers, and delivering aid to those in need during natural calamities. Additionally, the IAF improves its interoperability and fortifies international military alliances by working together and conducting joint exercises with other air forces.
1. • Philip Tellis
• philip@lognormal.com
• @bluesmoon
• geek paranoid speedfreak
• http://bluesmoon.info/
Boston #WebPerf Meetup / 2012-08-14 The Statistics of Web Performance Analysis 1
2. I’m a Web Speedfreak
3. We measure real user website performance
4. This talk is about the Statistics we learned while building it
5. The Statistics of Web Performance Analysis
Philip Tellis / philip@lognormal.com
Boston #WebPerf Meetup / 2012-08-14
6. 0: Numbers
7. Accurately measure page performance∗
8. Be unintrusive
If you try to measure something accurately, you will change something related.
– Heisenberg’s uncertainty principle
9. And one number to rule them all
10. What do we measure?
• Network Throughput
• Network Latency
• User perceived page load time
11. We measure real user data
12. Which is noisy
13. 1
Statistics - 1
14. Disclaimer
I am not a statistician
15. 1-1 Random Sampling
16. Population
All possible users of your system
17. Sample
Representative subset of the population
18. Bad sample
Sometimes it’s not
19. How to randomize?
http://xkcd.com/221/
20. How to randomize?
• Pick 10% of users at random and always test them
OR
• For each user, decide at random if they should be tested
http://tech.bluesmoon.info/2010/01/statistics-of-performance-measurement.html
21. Select 10% of users - I
if($sessionid % 10 === 0) {
// instrument code for measurement
}
• Once a user enters the measurement bucket, they stay
there until they log out
• Fixed set of users, so tests may be more consistent
• Any error in the sample is reinforced as positive feedback, since the same users are measured every time
22. Select 10% of users - II
if(rand() < 0.1 * getrandmax()) {
// instrument code for measurement
}
• For every request, a user has a 10% chance of being
tested
• Gets rid of positive feedback errors, but sample size !=
10% of population
23. How big a sample is representative?
Select n such that
1.96 σ/√n ≤ 0.05 µ
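Rearranging the condition above gives the smallest acceptable n. A minimal sketch in JavaScript; the function name and the example σ and µ are illustrative, not from the talk:

```javascript
// Smallest n such that 1.96 * sigma / sqrt(n) <= rel * mu,
// i.e. at 95% confidence the margin of error is within 5% of the mean.
function minSampleSize(sigma, mu, z = 1.96, rel = 0.05) {
  return Math.ceil(Math.pow((z * sigma) / (rel * mu), 2));
}

// e.g. page load times with mean 3000ms and std dev 1500ms:
minSampleSize(1500, 3000); // 385 data points
```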
24. 1-2 Margin of Error
25. Standard Deviation
• Standard deviation tells you the spread of the curve
• The narrower the curve, the more confident you can be
26. MoE at 95% confidence
±1.96 σ/√n
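In code, the margin of error at 95% confidence comes straight from the sample standard deviation. A sketch (the function name is mine):

```javascript
// Margin of error at 95% confidence: 1.96 * sigma / sqrt(n),
// using the sample standard deviation (n - 1 denominator).
function marginOfError(values, z = 1.96) {
  const n = values.length;
  const mean = values.reduce((a, b) => a + b, 0) / n;
  const variance =
    values.reduce((s, x) => s + (x - mean) ** 2, 0) / (n - 1);
  return (z * Math.sqrt(variance)) / Math.sqrt(n);
}

marginOfError([10, 12, 14, 16, 18]); // ≈ 2.77
```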
27. MoE & Sample size
The margin of error is inversely proportional to the square root of
the sample size: to halve the MoE you need four times as much data
28. 1-3 Central Tendency
30. One number
• Mean (Arithmetic)
• Good for symmetric curves
• Affected by outliers
Mean(10, 11, 12, 11, 109) = 30.6
31. One number
• Median
• Middle value measures central tendency well
• Not trivial to pull out of a DB
Median(10, 11, 12, 11, 109) = 11
32. One number
• Mode
• Not often used
• Multi-modal distributions suggest problems
Mode(10, 11, 12, 11, 109) = 11
33. Other numbers
• A percentile point in the distribution: 95th, 98.5th or 99th
• Used to find out the worst user experience
• Makes more sense if you filter data first
P95th (10, 11, 12, 11, 109) = 12
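The statistics above, computed on the slide's sample. Percentile definitions vary; the floor-based rank used here matches the slide's P95 = 12:

```javascript
const data = [10, 11, 12, 11, 109];

// Arithmetic mean: dragged up to 30.6 by the 109 outlier.
const mean = data.reduce((a, b) => a + b, 0) / data.length;

// Median: middle value of the sorted data (average of the two
// middle values for an even count).
function median(xs) {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Mode: most frequently occurring value.
function mode(xs) {
  const counts = new Map();
  for (const x of xs) counts.set(x, (counts.get(x) || 0) + 1);
  return [...counts.entries()].reduce((a, b) => (b[1] > a[1] ? b : a))[0];
}

// Percentile: value at rank floor(p/100 * n) of the sorted data,
// so the worst (100 - p)% of points fall above it.
function percentile(xs, p) {
  const s = [...xs].sort((a, b) => a - b);
  return s[Math.floor((p / 100) * s.length) - 1];
}

median(data);         // 11
mode(data);           // 11
percentile(data, 95); // 12
```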
34. Other means
• Geometric mean
• Good if your data is exponential in nature
(with the tail on the right)
GMean(10, 11, 12, 11, 109) ≈ 17.37
35. Wait... how did I get that?
(Π_{i=1}^{N} x_i)^(1/N) — could lead to overflow
e^((Σ_{i=1}^{N} ln x_i)/N) — computationally simpler
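The identity on this slide, sketched in JavaScript. For the sample above, the product is 10·11·12·11·109 = 1,582,680, whose fifth root is about 17.37:

```javascript
// Naive form: the running product can overflow to Infinity
// on large datasets.
function gmeanNaive(xs) {
  return Math.pow(xs.reduce((p, x) => p * x, 1), 1 / xs.length);
}

// Log form: sum the logs, divide by N, exponentiate.
// Same result, but the intermediate sum stays small.
function gmean(xs) {
  const sumLogs = xs.reduce((s, x) => s + Math.log(x), 0);
  return Math.exp(sumLogs / xs.length);
}

gmean([10, 11, 12, 11, 109]); // ≈ 17.37
```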
39. Other means
And there is also the Harmonic mean, but forget about that
40. ...though consequently
We have other margins of error
• Geometric margin of error
• Uses geometric standard deviation
• Median margin of error
• Uses ranges of actual values from data set
• Stick to the arithmetic MoE
– simpler to calculate, simpler to read and not incorrect
42. 2
Statistics - 2
43. 2-1 Distributions
44. Let’s look at some real charts
45. Sparse Distribution
46. Log-normal distribution
47. Bimodal distribution
48. What does all of this mean?
49. Distributions
• Sparse distribution suggests that you don’t have enough
data points
• Log-normal distribution is typical
• Bi-modal distribution suggests two (or more) distributions
combined
50. In practice, a bi-modal distribution is not uncommon
51. Hint: Does your site do a lot of back-end caching?
52. 2-2 Filtering
53. Outliers
• Out of range data points
• Nothing you can fix here
• There’s even a book about
them
57. DNS problems can cause outliers
• 2 or 3 DNS servers for an ISP
• 30 second timeout if first fails
• ... 30 second increase in page load time
• Maybe measure both and fix what you can
• http://nms.lcs.mit.edu/papers/dns-ton2002.pdf
58. Band-pass filtering
59. Band-pass filtering
• Strip everything outside a reasonable range
• Bandwidth range: 4kbps - 4Gbps
• Page load time: 50ms - 120s
• You may need to revisit these ranges from time to time
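A band-pass filter is just a range check per metric. A sketch using the slide's page-load bounds (50ms to 120s, in milliseconds); treat the bounds as parameters you expect to tune:

```javascript
// Drop anything outside [lo, hi]; the defaults are the
// page-load-time bounds from the slide, in milliseconds.
function bandPass(xs, lo = 50, hi = 120000) {
  return xs.filter(x => x >= lo && x <= hi);
}

bandPass([3, 800, 2500, 400000]); // [800, 2500]
```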
60. IQR filtering
61. IQR filtering
Here, we derive the range from the data
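A sketch of IQR filtering: compute the quartiles, then keep only points within 1.5 × IQR of them. Quartile interpolation methods vary; this one interpolates linearly between ranks:

```javascript
// q-th quantile (0..1) of a sorted array, interpolating linearly.
function quantile(sorted, q) {
  const pos = (sorted.length - 1) * q;
  const base = Math.floor(pos);
  const frac = pos - base;
  const next = sorted[base + 1] !== undefined ? sorted[base + 1] : sorted[base];
  return sorted[base] + frac * (next - sorted[base]);
}

// Keep points within k * IQR of the quartiles (k = 1.5 is customary).
function iqrFilter(xs, k = 1.5) {
  const s = [...xs].sort((a, b) => a - b);
  const q1 = quantile(s, 0.25);
  const q3 = quantile(s, 0.75);
  const iqr = q3 - q1;
  return xs.filter(x => x >= q1 - k * iqr && x <= q3 + k * iqr);
}

iqrFilter([10, 11, 12, 11, 109]); // [10, 11, 12, 11] (109 dropped as an outlier)
```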
62. Further Reading
lognormal.com/blog/2012/08/13/analysing-performance-data/
63. Summary
• Choose a reasonable sample size and sampling factor
• Tune sample size until the margin of error is acceptably small
• Decide based on your data whether to use mode, median
or one of the means
• Figure out whether your data is Normal, Log-Normal or
something else
• Filter out anomalous outliers
64. • Philip Tellis
• .com
• philip@lognormal.com
• @bluesmoon
• geek paranoid speedfreak
• http://bluesmoon.info/
66. Photo credits
• http://www.flickr.com/photos/leoffreitas/332360959/ by leoffreitas
• http://www.flickr.com/photos/cobalt/56500295/ by cobalt123
• http://www.flickr.com/photos/sophistechate/4264466015/ by Lisa
Brewster
67. List of figures
• http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg
• http://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg
• http://en.wikipedia.org/wiki/File:KilroySchematic.svg
• http://en.wikipedia.org/wiki/File:Boxplot_vs_PDF.png