The document describes Yahoo's failsafe mechanism for its homepage using Apache Storm and Apache Traffic Server. The key points are:
1. The failsafe architecture uses AWS components like EC2, ELB, S3 and autoscaling to serve traffic from failsafe servers if the primary servers fail.
2. Apache Traffic Server is used as a caching proxy between the user and origin servers. The "Escalate" plugin in ATS fetches content from failsafe servers if the origin server response is not good.
3. Apache Storm Crawler crawls content for different devices and maps URLs to the failsafe domain for storage in S3 with query parameters in the path. This provides more relevant fail
1. Failsafe Mechanism for Yahoo Homepage
Using Apache Storm & Apache Traffic Server
Pushkar Sachdeva (psachdev@yahoo-inc.com)
Kit Chan (kichan@yahoo-inc.com)
05/2016
3. Failsafe
“A fail-safe or fail-secure device is one that, in the event of a specific type of failure, responds in a way that will cause no harm, or at least a minimum of harm, to other devices or to personnel”
4. Overall Architecture
Yahoo! Presentation, Confidential
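[Diagram: overall architecture]
Components: Browser; on the Yahoo side, Property ATS, Property Serving Stack and Crawler on Storm; on the AWS side, ELB, EC2 ATS and S3.
Flows shown: Offstage Data Flow; Online Request Flow (Normal Operation); Online Request Flow (Failsafe Mode).
Labels: Auto activate Failsafe; Switch traffic to AWS.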
5. AWS Failsafe Stack Architecture
[Diagram: AWS failsafe stack]
Components: Elastic Load Balancer, S3 Bucket, Security Group, VPC, ATS EC2 Instances running ATS Server in Availability Zone #1 and Availability Zone #2, CloudWatch.
Regions: US W Oregon, US E North Virginia, Ireland, Singapore.
Labels: S3 Replication across regions; Crawled data from Yahoo; https; http.
6. EC2 Instance - ATS
● Instance (amazon linux)
○ t2.large - burstable
○ 2 vCPUs/8GB RAM/1 Gbps network
● Apache Traffic Server
○ For caching
■ Negative caching enabled
■ Ramdisk used
○ Health Check/S3 Authentication plugin
○ Lua plugin
■ Query Parameters Sorting
■ Simple Device Detection
■ Error handling
● CloudWatch Log Agent/Monitoring Scripts
● Autoscaling based on # of incoming requests
● Deployment Mechanism using Terraform / Packer
[Diagram: Amazon Linux instance running ATS with a 4Gb ramdisk cache, the CloudWatch Agent and CloudWatch Monitoring Scripts]
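The "Simple Device Detection" item above is not spelled out in the slides. The following is only a rough sketch of what such a check could look like, written in Python rather than as an ATS Lua plugin. The function name `detect_device` and the User-Agent heuristics are assumptions for illustration; the three device classes (desktop, smartphone, tablet) are the ones the crawler slide mentions.

```python
def detect_device(user_agent):
    """Classify a User-Agent string as 'tablet', 'smartphone' or 'desktop'.

    Hypothetical heuristic, not the rules used by the Yahoo plugin.
    """
    ua = user_agent or ""
    # Check tablets first: iPad User-Agents also contain "Mobile".
    if "iPad" in ua or ("Android" in ua and "Mobi" not in ua):
        return "tablet"
    # Most phone browsers include "Mobi" (as in "Mobile") in the User-Agent.
    if "Mobi" in ua:
        return "smartphone"
    # Anything else, including an empty header, is treated as desktop.
    return "desktop"
```

A coarse classification like this is enough to pick which of the crawled variants to serve; finer-grained detection would add little when only three variants are stored.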
7. Lua script example - sorting query parameters
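function do_remap()
  -- Raw query string of the client request, e.g. "q=1&a=2"
  local query = ts.client_request.get_uri_args()
  if (query ~= nil and query ~= '') then
    local result = {}
    local i = 1
    -- Split on '&' and skip the empty matches the pattern can produce
    for value in query:gmatch '([^&]*)' do
      if (value ~= '') then
        result[i] = value
        i = i + 1
      end
    end
    -- Sort so that equivalent queries map to the same URL
    table.sort(result)
    local sorted_query = table.concat(result, '&')
    ts.client_request.set_uri_args(sorted_query)
  end
end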
14. Escalate Plugin in Apache Traffic Server (ATS)
● ATS is a proxy server that sits between the user and the origin server
● ‘Escalate’ is an ATS plugin that fetches content from failsafe servers when the origin server fails to provide a ‘good’ response.
[Diagram: User, ATS, Origin Server]
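The escalation behaviour described above can be sketched in a few lines. This is a Python illustration of the idea, not the ATS plugin itself. The function names and the set of status codes treated as not 'good' are assumptions; the slides do not list them.

```python
# Assumed set of origin statuses that trigger escalation (hypothetical).
ESCALATE_ON = frozenset({500, 502, 503, 504})

def serve(origin_fetch, failsafe_fetch, url, escalate_on=ESCALATE_ON):
    """Return (status, body) from the origin, or from failsafe on a bad response.

    origin_fetch and failsafe_fetch are caller-supplied callables that take a
    URL and return a (status, body) tuple.
    """
    status, body = origin_fetch(url)
    if status in escalate_on:
        fs_status, fs_body = failsafe_fetch(url)
        # Only substitute the failsafe copy if it is actually usable.
        if fs_status == 200:
            return fs_status, fs_body
    # Good origin response, or no usable failsafe copy: pass the origin through.
    return status, body
```

Because the decision is made per request in the proxy, no operator has to flip a switch for failsafe content to be served.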
19. Apache Storm Crawler (Continued)
● Crawls content for desktop, smartphone and tablet
● Supports domain level configuration for request headers, query params and output storage.
● Failsafe url path mapping example -
Mapping: http://{failsafe_host}/{original_domain}/{device}/{path};{sorted_query_params_as_matrix_params}
URL: https://www.yahoo.com/news/trump-unveils-foreign-policy-plan-201628138.html?q=1&a=2
S3 file path: http://brb.yahoo.net/www.yahoo.com/smartphone/news/trump-unveils-foreign-policy-plan-201628138.html;a=2;q=1
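The mapping rule above can be written as a small function. This is a minimal Python sketch that follows the template on the slide. The function name `to_failsafe_url` and its parameters are illustrative, and the crawler itself runs on Storm and is not written this way.

```python
from urllib.parse import urlsplit

def to_failsafe_url(url, device, failsafe_host):
    """Map an original URL to its failsafe storage path.

    Query parameters are sorted and appended as ';'-separated matrix
    parameters, so equivalent queries resolve to the same stored object.
    """
    parts = urlsplit(url)
    # Drop empty fragments such as those left by a trailing '&'.
    params = sorted(p for p in parts.query.split("&") if p)
    path = parts.path.lstrip("/")
    mapped = f"http://{failsafe_host}/{parts.netloc}/{device}/{path}"
    if params:
        mapped += ";" + ";".join(params)
    return mapped
```

Calling `to_failsafe_url` with the example URL from the slide, device `smartphone` and host `brb.yahoo.net` reproduces the S3 file path shown there. Moving the query into the path keeps each variant addressable as a plain object key.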
20. High Level Architecture
[Diagram: high level architecture]
Components: User, Proxy Router, Proxy Cache, Origin Server, Failsafe Crawler, AWS storage.
Flows shown as numbered steps: Offline Crawler Request Flow (includes a PUT); User Request Flow; Optional Request Flow to fetch failsafe content.
21. Benefits
● No manual intervention needed to serve failsafe content
● Granular control
● More relevant content is shown to user
● Failsafe content is cached in proxy layer
23. Future on Resiliency - multi-cloud for failsafe
● Additional Cloud Vendor
○ E.g. Google Cloud Platform
○ S3 vs Google Cloud Storage
○ EC2/ELB vs Google Compute Engine
○ CloudWatch vs Stackdriver
● Changes in Apache Storm Crawler
○ Can use Apache jclouds to create objects in either S3 or Google Cloud Storage
● Changes in deployment using Terraform / configuration using Chef
○ GCP & AWS are supported
● Route 53 can be used to do failover to GCP
24. Future on Resiliency
● Speculative Retry
void SpeculativeRetryPlugin::handleInputComplete()
{
  orig_url_ = transaction_.getClientRequest().getUrl().getUrlString();
  // Fetch the original request
  sendFetchRequest(orig_url_, false);
  // Start a one-off timer that calls back after 'time_' msecs
  Async::execute<AsyncTimer>(this, new AsyncTimer(AsyncTimer::TYPE_ONE_OFF, time_), getMutex());
}

void SpeculativeRetryPlugin::handleAsyncComplete(AsyncTimer &async_timer)
{
  async_timer.cancel();
  // active_fetch_ tracks whether the response to the original request has been received yet;
  // if not, initiate a retry request
  if (!active_fetch_) {
    sendFetchRequest(orig_url_, true);
  }
}
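The same speculative-retry idea can be shown outside ATS. The following is a Python sketch under stated assumptions, not a port of the plugin. The names `speculative_fetch` and `slow_then_fast` are hypothetical, and threads stand in for the plugin's asynchronous fetches and timer.

```python
import concurrent.futures as cf
import time

def speculative_fetch(fetch, url, delay_s):
    """Run fetch(url, False); if it is still pending after delay_s seconds,
    also run fetch(url, True) and return whichever attempt finishes first."""
    pool = cf.ThreadPoolExecutor(max_workers=2)
    try:
        first = pool.submit(fetch, url, False)
        done, _ = cf.wait([first], timeout=delay_s)
        if done:
            # The original request answered in time: no retry needed.
            return first.result()
        retry = pool.submit(fetch, url, True)
        done, _ = cf.wait([first, retry], return_when=cf.FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        # Do not block on the slower attempt; its result is simply discarded.
        pool.shutdown(wait=False)

def slow_then_fast(url, is_retry):
    """Demo fetch: the first attempt stalls, the retry answers at once."""
    if not is_retry:
        time.sleep(0.5)
        return "original"
    return "retry"
```

The delay trades extra origin load for lower tail latency: a short delay fires more duplicate requests, a long one rarely helps.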