Ken Owens
CTO Cisco Intercloud Services
07/15/15
How Cisco Migrated from MapReduce Jobs to Spark Jobs
Cisco and/or its affiliates. All rights reserved.Presentation_ID Cisco Public
Introduction
The Uber Trend: Exponential Rise in Connectivity
• 30M new devices connected every week
• 78% of workloads processed in Cloud DCs by 2018
• 5TB+ of data per person by 2020
• 180B mobile apps downloaded in 2015
• 277X data created by IoE devices v. end-user
Source: IDC
Exponential Growth Drives Opportunities
[Figure: exponential trend vs. linear trend curves, with labels "Knee of Curve" and "Disruptive Stress/Opportunity". Source: Peter Diamandis, BOLD]

When Products Become Cloud-enabled, They Become 10X More Valuable
[Figure: three product price pairs: $23.19 and $249.00; $18.01 and $199.00; $5.99 and $59.99]
A Broader Perspective than Hybrid Cloud Is Required…
[Figure labels: SaaS, PaaS, IaaS; Data Center, Cloud, Edge / IoT]
CIOs Need to Embrace Both Traditional and Hyperscale Application Deployment
Traditional enterprise applications
• ERP, CRM, applications that leverage traditional databases
• Majority of applications being run for/by enterprises today
Hyperscale applications serving several thousands of users very quickly
• Hadoop, mobile back-ends, gaming, social
• IoE and increasing connectivity driving the need for such workloads
• Small (~10%), yet rapidly growing percentage of applications in the cloud
Application Portability and Interoperability Is the Key
Traditional Applications: ERP, Financial, Client/Server, CRM, email, …
Cloud Native Applications: IoT, Big Data, Analytics, Gaming, ...
[Figure labels: SaaS, PaaS, IaaS; Data Center, Cloud, Edge / IoT]

Bimodal IT Is the New Normal
45% of CIOs currently have a second fast/agile mode of operation
Traditional Mode: Requires Reliability (ITIL, CMMI, COBIT)
Nonlinear Mode: Accept Instability (DevOps, automation, reusable)
[Figure labels: Systems of Record, Systems of Differentiation, Systems of Innovation; Change; Governance]
Source: Gartner, Lydia Leong
We Must Connect the Clouds
Islands of Isolated PC LAN Networks (1990s)
• Multiple LANs using a multitude of protocols
The Internet
• Connected using industry-standard IP protocol
• IP Based
• Open Standards
World of Isolated Clouds (2000s)
• Individual custom-built clouds without consistent APIs
The Intercloud
• Connected for application acceleration with Open APIs
• Web-scale Architecture
• API-Driven Automation
• Open, Secure, Compliant, Hybrid IT
Use Case: Customer Interaction Analytics
Omni-Channel Customer Journeys
The customers’ interaction with Cisco across multiple touch points to get the desired business outcome.
Channels: Server Logs, Social & Chat, Mobile, Event Streams, Call Center
Interaction touch points: S/W Download, Open Trouble Ticket, Assign Engineer, Update Trouble Ticket, Close Trouble Ticket, Resolve Trouble Ticket, Read Support Documents, View Design Documents, View Tech Documents, New Registration, Bug Search, FAQs, Contract Details, Product Details, Device Coverage
Journeys: Case Resolution, Software Upgrade

Customer Interaction Analytics
From Journey to Outcome…
Customer Journeys
• Software Upgrades
• Bug Inquiry
• Software Inquiry
• Trouble Ticket Lifecycle
• Device Troubleshooting
• New Registration
• Contract Renewal
Behavioral Insights
• Customer Interest Analytics
• Customer Experience Analytics
• Resource Forecasting
• Security and Compliance
Impact
• Boost Self Service
• Real-time Content Optimization & Recommendation
• Context Based Predictive Alerts
• Implicit Personalization
Customer Interaction Analytics Big Data Platform
Synthesize customer journey maps into behavioral insights.
[Figure: platform diagram. Labels as extracted:]
Data Sources: Server Logs, Call Center, Mobility, Social, Event Streams
Data Ingestion: CiscoDV, Kafka, Redis, ETL
Analytics Model: Build Model, Activity Refinement, Activity Synthesis, Synthesized Insights
Layers: Real-time Processing, Batch Analytics, Insight Services
Tools: CiscoDV, Interact, Impala, Hive, Pig, ES, Zoomdata, Platfora
AWS and CIS Intercloud Solution
AWS Platform
Component | Cloud::Hadoop (Batch Analytics) | Cloud::Queries (Interactive Queries) | Cloud::Streams (Near Real-time Analytics)
Virtual Machines | 30 | 6 | 5
AWS Instance Sizing | m3.2xlarge | c3.xlarge | m3.xlarge
Virtual Cores | 8/VM | 4/VM | 4/VM
RAM | 30GB/VM | 7.5GB/VM | 15GB/VM
Disk | 1.5 TB/VM | 1.5 TB/VM | 1.5 TB/VM

Case for Cisco Intercloud Services for Analytics…
 Cisco Security and Compliance requirements
• Workloads that deal with personally identifiable data and Cisco confidential content cannot be uploaded to AWS. Cisco's internal cloud solution is a better fit.
 Customer journey beyond the enterprise
• Applications are hosted on AWS
• Partner systems are hosted on AWS and other cloud providers
Presence in AWS and other cloud services is required to support these scenarios for end-to-end customer journey insights.
 Data virtualization integrated in the CIS Analytics Stack
• Connect data from multiple clouds and multiple big data platforms
 Integrated visualization toolset
CIS Analytics Platform
CIS Analytics Platform Requirements
Infra Provisioning
Deploy a virtual private cloud (VPC) on CIS with compute, storage and memory requirements comparable to the current production system.
OpenStack
OpenStack Icehouse with Neutron, Nova, and Swift installed.
Big Data Ecosystem
Cloudera’s Hadoop distribution version CDH 5.1.3, ELK Stack, Apache Kafka and Apache Storm.
Data virtualization & Cloud Integration
Access to data services and data stores via Cisco Data Virtualization.
Runtime Services
Foundational PaaS capabilities including SLAs for uptime, performance, latency, data retention, issue escalation and support priorities, issue resolution, problem management, deployment process, patch management.
API Services
Provide both fine-grained and coarse-grained access to all service layers of the CIS Analytics Platform. In the hybrid cloud model it must support interoperability across platform service providers and promote the cloud concepts of extensibility and flexibility.
AWS to CIS Migration – Success Criteria
 Successful synthesis of customer interaction data
 Successful automation of the end-to-end data process pipeline
 Build behavioral insight services
 Access to data and services via data discovery and visualization tools
 Meet the performance, scale and platform stability requirements
 Successful deployment of CiscoDV on CIS
 Connect HDFS and Hive DS with CiscoDV via Hive and Impala
 Build and expose insight services for consumption by limited users

AWS and CIS Data Node Sizing Comparison
Hadoop Cluster for Batch and Query Analytics
AWS Sizing
Node Service | AWS Instance Type | vCPU | Mem | Storage | Number of Data Nodes | Comments
Data Nodes/Node Master | m3.2xlarge | 8 | 30 | 2x80 GB | 30 | Each hadoop data node has 1500GB of EBS available for HDFS storage
CCS Sizing
Node Service | CCS Instance Type | vCPU | Mem | Storage | Number of Data Nodes | Comments
Data Nodes/Node Master | GP-2XLarge | 8 | 32 | 50 | 35 | Each hadoop data node has 1500GB of EBS available for HDFS storage. Less than AWS sizing (Storage)
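To make the sizing comparison easier to read, the aggregate capacity of each cluster can be worked out from the per-node figures above. This is only a back-of-the-envelope sketch: the slide gives no unit for the Mem column, so treating it as GB is an assumption, and storage is left out because the two rows state it in different forms.

```python
# Per-node figures copied from the sizing comparison.
# Assumption: the "Mem" column is in GB (the slide does not state a unit).
aws = {"nodes": 30, "vcpu": 8, "mem_gb": 30}
ccs = {"nodes": 35, "vcpu": 8, "mem_gb": 32}

def totals(cluster):
    """Return (total vCPUs, total memory) for a cluster of identical nodes."""
    return cluster["nodes"] * cluster["vcpu"], cluster["nodes"] * cluster["mem_gb"]

for name, cluster in (("AWS", aws), ("CCS", ccs)):
    vcpus, mem = totals(cluster)
    print(f"{name}: {vcpus} vCPUs, {mem} GB memory across {cluster['nodes']} nodes")
```

Under these assumptions the CCS cluster has somewhat more aggregate vCPU and memory than the AWS cluster, which is worth keeping in mind when comparing runtimes.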
Pilot Test Data
• Test performed on one day’s production data
• Total no. of records processed – 110,852,667
• Total data size – 32GB
• Total no. of M/R jobs in the data pipeline – 17
• Two test cycles
• Cycle 1: Heterogeneous CCS nodes (vCPUs, storage, memory)
• Cycle 2: Homogeneous CCS nodes
CIS Performance of Batch Analytics – Limited Test
Test Details by M/R job
Job Name | CCS 12 nodes: cycle1 | CCS 18 nodes: cycle1 | CCS 24 nodes: cycle1 | CCS 30 nodes: cycle1 | CCS 18 nodes: cycle2 | CCS 24 nodes: cycle2 | CCS 30 nodes: cycle2 | CCS 35 nodes: cycle2
New_cleanse | 249 | 176 | 143 | 117 | 82 | 67 | 55 | 51
Process_private_ip | 27 | 14 | 11 | 10 | 7 | 5 | 6 | 6
join_web_and_ip_data | 142 | 95 | 76 | 61 | 49 | 40 | 34 | 29
combine_ip_decorated_files | 26 | 14 | 11 | 10 | 9 | 7 | 8 | 7
filterBotEntries | 34 | 19 | 15 | 13 | 10 | 8 | 7 | 7
sessionize | 71 | 64 | 69 | 62 | 60 | 63 | 15 | 13
firstActivitiesFilter | 26 | 15 | 13 | 10 | 9 | 8 | 6 | 6
allOtherActivitiesFilter | 29 | 18 | 13 | 13 | 11 | 9 | 7 | 6
matchFirstActivities | 21 | 13 | 11 | 13 | 13 | 11 | 8 | 8
buildActivities | 27 | 15 | 12 | 10 | 7 | 6 | 9 | 9
filterBUG | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4
filterSEA | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4
filterTCO | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4
filterTDV | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4
filterWDV | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4
filterMOD | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4
filterTOOL | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4
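One way to read the per-job numbers is to compare the measured speedup with the ideal linear speedup for the added nodes. The sketch below does this for three cycle 1 rows copied from the table; the time unit is not stated on the slide, and the helper names `speedup` and `scaling_efficiency` are illustrative, not part of the original pipeline.

```python
# Cycle 1 timings copied from the table, keyed by number of CCS nodes.
# The time unit is not stated on the slide; only ratios are used here.
cycle1 = {
    "New_cleanse": {12: 249, 18: 176, 24: 143, 30: 117},
    "join_web_and_ip_data": {12: 142, 18: 95, 24: 76, 30: 61},
    "sessionize": {12: 71, 18: 64, 24: 69, 30: 62},
}

def speedup(timings, base_nodes, nodes):
    """Ratio of the baseline runtime to the runtime on the larger cluster."""
    return timings[base_nodes] / timings[nodes]

def scaling_efficiency(timings, base_nodes, nodes):
    """Measured speedup divided by the ideal (linear) speedup."""
    return speedup(timings, base_nodes, nodes) / (nodes / base_nodes)

for job, timings in cycle1.items():
    s = speedup(timings, 12, 30)
    e = scaling_efficiency(timings, 12, 30)
    print(f"{job}: speedup {s:.2f}x, efficiency {e:.0%}")
```

In cycle 1 the cleanse and join jobs scale reasonably well from 12 to 30 nodes, while sessionize barely improves (71 to 62). That matches the later observation that sessionizing was dominated by cluster-wide sorting and grouping.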

PoC: Analytics with Spark on CIS
Existing code
 Written in Ruby with Wukong to run on Hadoop
 A history of changes and modifications
 Script-based, steps communicate via intermediary files
Goal
 Revise, rethink and reimplement with Spark on CIS
 Open for advanced cloud analytics
 Improve maintainability by moving away from aging Ruby on Hadoop
Ruby Flow
[Figure: pipeline diagram with stage groups "Cleanse" and "Sessionize". Box and data labels: logs, cleanse, cleansed, private, web, decorate, sessionize (cookie, time), sessioned, match 1st (IP, UA, time), first, others, bots, onlyBots, build actions, 1..7, bug, tool, merge, session PSV, add to hive. Annotations: "Main computation happens here", "global sort in time", "global group by IP".]
 Pre-process log records (‘cleanse’)
 Extract HTTP sessions (‘sessionize’)
 Extract user actions, such as ‘search’, ‘download patch’, ‘open manual’, ‘open a bug’
Ruby: Scripts with temp files
 Each box on the figure is a script in a separate file
 They pipe gigabytes of data as input and output
 Random matching of nodes to data for sessionizing
 Lots of redundant shuffling
Spark Flow
[Figure: the same pipeline diagram as in the Ruby flow (stage groups "Cleanse" and "Sessionize", "Main computation happens here"), annotated "global partition by IP" and "local sort in time".]
 Same flow, but each box is a Java or Scala function
No intermediate temp files
 Steps are chained by Spark, often without any need for intermediate data
 If still needed, the data is stored in memory and local disk as much as possible
Local computation
 Cleansing is computed on nodes local to data blocks (same as Ruby)
 Sessions are built per IP
 On separate nodes, each handling a single IP range
 Once copied to the node on partition, the data remains local
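The idea of chaining steps as functions rather than as scripts joined by temp files can be sketched in a few lines. The slides describe Java or Scala functions chained by Spark; the version below is only a plain-Python stand-in using generators, and the record layout, the bot rule, and the action labels are all hypothetical.

```python
from collections import Counter

# Hypothetical record layout for illustration: (ip, timestamp_seconds, url).
RAW_LOGS = [
    ("10.0.0.1", 100, "/search"),
    ("10.0.0.2", 105, "/robots.txt"),
    ("10.0.0.1", 160, "/download/patch"),
    ("10.0.0.3", 170, "/manual"),
]

def cleanse(records):
    """Drop unwanted records (here: a stand-in rule for bot traffic)."""
    return (r for r in records if r[2] != "/robots.txt")

def decorate(records):
    """Attach a derived field (here: a hypothetical action label from the URL)."""
    return ((ip, ts, url, url.strip("/").split("/")[0]) for ip, ts, url in records)

def count_actions(records):
    """Final step: count how often each action label occurs."""
    return Counter(action for *_, action in records)

# Each step consumes the previous step's output lazily; nothing is written
# to an intermediate file between steps.
action_counts = count_actions(decorate(cleanse(RAW_LOGS)))
print(action_counts)
```

Because `cleanse` and `decorate` return generators, records flow through the chain one at a time, which loosely mirrors how chained stages can avoid materializing intermediate data.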
Runtime comparison
 Volumes
• Logs of a single day: 52 GB
• Total of 110 million records
• Of which 53 million records are kept after pre-filtering
• Producing over 1 million user actions
 Cluster of 30 nodes
 Ruby
• Runtime: 140 min
 Spark
• Runtime: 7 min (20 times faster)
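The headline figures can be checked with simple arithmetic. The sketch below uses the rounded numbers from the slide, so the derived rates are approximate.

```python
# Rounded figures taken from the runtime comparison slide.
ruby_minutes = 140
spark_minutes = 7
total_records = 110_000_000
kept_records = 53_000_000

speedup = ruby_minutes / spark_minutes        # how many times faster Spark ran
kept_share = kept_records / total_records     # share of records surviving pre-filtering
ruby_rate = total_records / ruby_minutes      # input records per minute, Ruby
spark_rate = total_records / spark_minutes    # input records per minute, Spark

print(f"Speedup: {speedup:.0f}x")
print(f"Records kept after pre-filtering: {kept_share:.0%}")
print(f"Ruby: {ruby_rate:,.0f} records/min, Spark: {spark_rate:,.0f} records/min")
```

So roughly half of the input records survive pre-filtering, and the same 30-node cluster goes from under a million to over fifteen million input records per minute.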

Recommended for you

USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal HabitatsUSGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats

A report issued March 25, 2013 by the U.S. Geological Survey titled "Landscape Consequences of Natural Gas Extraction in Allegheny and Susquehanna Counties, Pennsylvania, 2004–2010." The report, using a series of maps and data, purports to show that drilling has lead to "carving up" wildlife habitats in some forests.

allegheny countydrilling impacthydraulic fracturing
Bsides Delhi Security Automation for Red and Blue Teams
Bsides Delhi Security Automation for Red and Blue TeamsBsides Delhi Security Automation for Red and Blue Teams
Bsides Delhi Security Automation for Red and Blue Teams

Suraj Pratap discusses security automation for red and blue teams. He outlines how he automates the server and application lifecycles using open source tools to address challenges around human capacity, tool selection, time, and cost when managing 600+ servers and 10+ applications across cloud infrastructures. Some areas he has automated include infrastructure security using Ansible and CloudFormation, security auditing using Scout2 and Prowler, offensive security tests using OpenVAS and Jenkins, vulnerability management with Dradis and Vulnreport.io, and security information and event monitoring with Alienvault and ELK.

securityinformation securityhacking
Demystifying Security Analytics: Data, Methods, Use Cases
Demystifying Security Analytics: Data, Methods, Use CasesDemystifying Security Analytics: Data, Methods, Use Cases
Demystifying Security Analytics: Data, Methods, Use Cases

Many vendors sell “security analytics” tools. Also, some organizations built their own security analytics toolsets and capabilities using Big Data technologies and approaches. How do you find the right approach for your organization and benefit from this analytics boom? How to start your security analytics project and how to mature the capabilities? (Source: RSA USA 2016-San Francisco)

incident response + soc
Cisco and/or its affiliates. All rights reserved.Presentation_ID Cisco Public
Why?
• Extracting sessions means sorting in time and grouping by IP
• Ruby:
  - Sorting in time and per-IP grouping are performed across the whole cluster (very bad, lots of I/O)
• Spark is good at dealing with partitions:
  - Per-IP groups are placed on different machines (partitions)
  - The global sort in time is replaced by many local per-IP sorts, done on the machines responsible for extracting sessions for specific groups of IP addresses
• Other improvements
  - Avoid redundant temp files and redundant (de)serialization of objects (comes with Java/Scala); stages keep data in memory when possible (comes with Spark)
  - Cache the results of user agent resolution, which is heavy on regular expressions
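The user agent caching point can be illustrated with a memoized resolver. This is a sketch under stated assumptions, not the deck's implementation: the function name `resolve_user_agent`, the regular expressions, and the cache size are hypothetical, and real user agent rule sets are far larger. The idea is only that the same user agent string recurs across many log lines, so the regex work is done once per distinct string.

```python
import re
from functools import lru_cache

# Hypothetical patterns, for illustration only.
_BOT_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)
_BROWSER_PATTERNS = [
    ("Firefox", re.compile(r"Firefox/(\d+)")),
    ("Chrome", re.compile(r"Chrome/(\d+)")),
]


@lru_cache(maxsize=100_000)
def resolve_user_agent(user_agent):
    """Return (family, major_version); repeated strings are served from the cache."""
    if _BOT_PATTERN.search(user_agent):
        return ("bot", None)
    for name, pattern in _BROWSER_PATTERNS:
        match = pattern.search(user_agent)
        if match:
            return (name, int(match.group(1)))
    return ("unknown", None)
```

`resolve_user_agent.cache_info()` reports hits and misses, which gives a rough view of how much regex work the cache saves on a given log.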
CiscoDV on CIS
Data Virtualization for Intercloud Analytics
Customer Benefits
• Discover data beyond the enterprise: virtual integration that combines traditional enterprise data, Big Data stores on CIS and AWS, and cloud data from SaaS providers, Cisco customers, and partners
• Seamless interoperability offers easy access to data across distributed data sources in the intercloud analytics platform
• Universal data governance maximizes enforcement of data security rules
• Analytics Data Hubs: deployment flexibility to build hybrid/virtual sandboxes that enable nimble data discovery and rapid data analytics to support multiple LOBs
• Deliver data to any number of analytics tools
Use Case 1: Get Case Interactions
Use Case Description: Number of cases opened by company X that are currently open. (Other variations would include cases by company, trends, etc.)
CiscoDV Value: CiscoDV enforces data security rules to restrict access on the intercloud platform to customer-sensitive data.
Data Sources: SalesForce
Intercloud Solution: The CIS CiscoDV service can access the "sanitized" version of CSOne data through JDBC from the RIDES (SWTG CiscoDV) API.
Connection Type: DV on hybrid cloud → Enterprise data store

Use Case 2: Get Customer Journey
Use Case Description: Customer interactions on the web pertaining to the bug search and case submission process. This foundational data can be used to explore trends and feed into content recommendation models.
CiscoDV Value: Direct access to data on the CIS Intercloud Analytics Platform.
Data Sources: SAS Analytics
Intercloud Solution: Through direct network access to the Impala server, the CIS CiscoDV server connects to the Impala service in Hadoop, also on CIS, as a data source. SQL queries configured in CiscoDV execute Impala queries.
Connection Type: DV on hybrid cloud → VPC Big Data platform
Use Case 3: Get Bug Interactions
Use Case Description: Another foundational data service that provides a breakdown of customer exposure to or interest in bugs. The service can be refined further to look at trends specific to a company or a product for further analytics.
CiscoDV Value: Real-time data federation that accesses extremely large data sets on the CIS Intercloud Analytics platform and joins them with bug data accessed via a departmental CiscoDV instance (RIDES).
Data Sources: SASA Analytics and QDDTS via RIDES
Intercloud Solution: By building on the access to the Impala server, the DV server can join the bug data from the enterprise data stores with the HDFS data to provide a federated view.
Connection Type: DV on hybrid cloud → VPC Big Data platform and Enterprise data store
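The federated view in this use case, one query joining data that lives in two separate stores, can be mimicked on a small scale. The sketch below uses two SQLite in-memory databases as stand-ins for the big data platform and the enterprise data store. It is not CiscoDV or Impala code: the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3


def build_demo_connection():
    """Create two separate in-memory databases reachable through one connection."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for the enterprise data store (hypothetical schema).
    conn.execute("ATTACH DATABASE ':memory:' AS enterprise")
    conn.execute("CREATE TABLE enterprise.bugs (bug_id TEXT, product TEXT)")
    conn.executemany(
        "INSERT INTO enterprise.bugs VALUES (?, ?)",
        [("BUG-1", "router"), ("BUG-2", "switch")],
    )
    # Stand-in for web interaction data on the big data platform (hypothetical schema).
    conn.execute("CREATE TABLE main.bug_views (bug_id TEXT, company TEXT, views INTEGER)")
    conn.executemany(
        "INSERT INTO main.bug_views VALUES (?, ?, ?)",
        [("BUG-1", "X", 5), ("BUG-2", "X", 3), ("BUG-1", "Y", 2)],
    )
    return conn


def bug_interest_by_product(conn):
    """Join the two sources in a single query and total the views per product."""
    query = """
        SELECT b.product, SUM(v.views) AS total_views
        FROM main.bug_views AS v
        JOIN enterprise.bugs AS b ON b.bug_id = v.bug_id
        GROUP BY b.product
        ORDER BY b.product
    """
    return conn.execute(query).fetchall()
```

The caller sees a single result set and does not need to know that the two tables live in different stores, which is the point of the data virtualization layer described above.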
CiscoDV on Intercloud Analytics Platform (CIS)
Scenario 1: CIS CiscoDV to Cisco Enterprise Data Store
Scenario 2: CIS CiscoDV to Impala and Hive on CIS Intercloud Analytics Platform
Scenario 3: CIS CiscoDV to Hive on AWS Big Data Cluster
How Cisco Migrated from MapReduce Jobs to Spark Jobs - StampedeCon 2015

Recommended for you

Accelerated Leadership
Accelerated LeadershipAccelerated Leadership
Accelerated Leadership

The document outlines a pilot leadership program called the Accelerated Leadership Class (ALC) at Peterson Air Force Base. The 7-session program will provide interactive leadership training to 12 junior airmen using experiential learning activities and models. Sessions will focus on developing leadership skills, emotional intelligence, giving and receiving feedback, and completing a group leadership project to benefit the base. The tentative schedule explores different timing options over 6 days, 3 sessions per week for 2 weeks, or twice monthly for 3 months.

 
by kktv
leadershipmanagementleadershipmanagement
Opensource approach to design and deployment of Microservices based VNF
Opensource approach to design and deployment of Microservices based VNFOpensource approach to design and deployment of Microservices based VNF
Opensource approach to design and deployment of Microservices based VNF

Microservice is gaining increased adoption in the Telco NFV world. It is key to understand the design and deployment methodologies involved in developing Microservice based VNF. This talk provides an opensource practitioner approach to building and deploying a Microservice based VNF and includes the following: - Design patterns, workflow models - Design models for VNF placement, capacity management, scale-in/out and resiliency - Deployment considerations that includes handing of scale and fault tolerant VNF using well known Opensource tools. About the presenter: Prem Sankar works for Ericsson Opensource Ecosystem team and part of the Opendaylight and OPNFV team in Ericsson. Prem evangelizes SDN and Cloud and has given many sessions and conducted workshops around SDN and ODL. Prem is PTL of ODL COE project and currently driving the Kuberenetes and ODL Integration in Opendaylight community. Prem is a frequent speaker at opensource summits and has presented in Opendaylight, OPNFV and Open networking summits.

network technologyvnfprem sankar
Performance testing for web-scale
Performance testing for web-scalePerformance testing for web-scale
Performance testing for web-scale

If you heard about web-scale or have a requirement to survive under web-scale or you just would like to prepare your application to handle an X effect this topic is for you. During a presentation you will understand aspects and caveats of performance testing, nuances of performance testing of Java based web applications. As a practical part you will get a brief overview of existing tools and will get a guide of using Gatling as a tool to make a load for your application. Gatling is an open source tool for performance loading written in Scala and provides comprehensive DSL for load scenario specification.

high-loadscalascale

More Related Content

What's hot

Format Wars: from VHS and Beta to Avro and Parquet
Format Wars: from VHS and Beta to Avro and ParquetFormat Wars: from VHS and Beta to Avro and Parquet
Format Wars: from VHS and Beta to Avro and Parquet
DataWorks Summit
 
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidOpen Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
DataWorks Summit
 
What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?
DataWorks Summit
 
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
DataWorks Summit
 
Apache Deep Learning 201
Apache Deep Learning 201Apache Deep Learning 201
Apache Deep Learning 201
DataWorks Summit
 
20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 20150716 introduction to apache spark v3
20150716 introduction to apache spark v3
Andrey Vykhodtsev
 
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudYARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
DataWorks Summit
 
2015 nov 27_thug_paytm_rt_ingest_brief_final
2015 nov 27_thug_paytm_rt_ingest_brief_final2015 nov 27_thug_paytm_rt_ingest_brief_final
2015 nov 27_thug_paytm_rt_ingest_brief_final
Adam Muise
 
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Codemotion
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
Cécile Poyet
 
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
DataWorks Summit
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Hortonworks
 
Sahara presentation latest - Codemotion Rome 2015
Sahara presentation latest - Codemotion Rome 2015Sahara presentation latest - Codemotion Rome 2015
Sahara presentation latest - Codemotion Rome 2015
Codemotion
 
NoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DBNoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DB
MapR Technologies
 
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
DataWorks Summit
 
Novinky v Oracle Database 18c
Novinky v Oracle Database 18cNovinky v Oracle Database 18c
Novinky v Oracle Database 18c
MarketingArrowECS_CZ
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
Evans Ye
 
20150314 sahara intro and the future plan for open stack meetup
20150314 sahara intro and the future plan for open stack meetup20150314 sahara intro and the future plan for open stack meetup
20150314 sahara intro and the future plan for open stack meetup
Wei Ting Chen
 
DEVNET-1166 Open SDN Controller APIs
DEVNET-1166	Open SDN Controller APIsDEVNET-1166	Open SDN Controller APIs
DEVNET-1166 Open SDN Controller APIs
Cisco DevNet
 
Apache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real TimeApache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real Time
DataWorks Summit/Hadoop Summit
 

What's hot (20)

Format Wars: from VHS and Beta to Avro and Parquet
Format Wars: from VHS and Beta to Avro and ParquetFormat Wars: from VHS and Beta to Avro and Parquet
Format Wars: from VHS and Beta to Avro and Parquet
 
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidOpen Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
 
What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?
 
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
 
Apache Deep Learning 201
Apache Deep Learning 201Apache Deep Learning 201
Apache Deep Learning 201
 
20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 20150716 introduction to apache spark v3
20150716 introduction to apache spark v3
 
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudYARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
 
2015 nov 27_thug_paytm_rt_ingest_brief_final
2015 nov 27_thug_paytm_rt_ingest_brief_final2015 nov 27_thug_paytm_rt_ingest_brief_final
2015 nov 27_thug_paytm_rt_ingest_brief_final
 
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
 
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
Disaster Recovery Experience at CACIB: Hardening Hadoop for Critical Financia...
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
 
Sahara presentation latest - Codemotion Rome 2015
Sahara presentation latest - Codemotion Rome 2015Sahara presentation latest - Codemotion Rome 2015
Sahara presentation latest - Codemotion Rome 2015
 
NoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DBNoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DB
 
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...
 
Novinky v Oracle Database 18c
Novinky v Oracle Database 18cNovinky v Oracle Database 18c
Novinky v Oracle Database 18c
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
20150314 sahara intro and the future plan for open stack meetup
20150314 sahara intro and the future plan for open stack meetup20150314 sahara intro and the future plan for open stack meetup
20150314 sahara intro and the future plan for open stack meetup
 
DEVNET-1166 Open SDN Controller APIs
DEVNET-1166	Open SDN Controller APIsDEVNET-1166	Open SDN Controller APIs
DEVNET-1166 Open SDN Controller APIs
 
Apache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real TimeApache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real Time
 

Viewers also liked

IOT Exploitation
IOT Exploitation	IOT Exploitation
114 Numalliance
114 Numalliance114 Numalliance
114 Numalliance
Ludovic Vallet
 
Data Visualization on the Tech Side
Data Visualization on the Tech SideData Visualization on the Tech Side
Data Visualization on the Tech Side
Mathieu Elie
 
Heterogenous Persistence
Heterogenous PersistenceHeterogenous Persistence
Heterogenous Persistence
Jervin Real
 
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal HabitatsUSGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
Marcellus Drilling News
 
Bsides Delhi Security Automation for Red and Blue Teams
Bsides Delhi Security Automation for Red and Blue TeamsBsides Delhi Security Automation for Red and Blue Teams
Bsides Delhi Security Automation for Red and Blue Teams
Suraj Pratap
 
Demystifying Security Analytics: Data, Methods, Use Cases
Demystifying Security Analytics: Data, Methods, Use CasesDemystifying Security Analytics: Data, Methods, Use Cases
Demystifying Security Analytics: Data, Methods, Use Cases
Priyanka Aash
 
Java management extensions (jmx)
Java management extensions (jmx)Java management extensions (jmx)
Java management extensions (jmx)
Tarun Telang
 
Mindmappen
MindmappenMindmappen
Mindmappen
yperlaan
 
EVOLVE'16 | Enhance | Anil Kalbag & Anshul Chhabra | Comparative Architecture...
EVOLVE'16 | Enhance | Anil Kalbag & Anshul Chhabra | Comparative Architecture...EVOLVE'16 | Enhance | Anil Kalbag & Anshul Chhabra | Comparative Architecture...
EVOLVE'16 | Enhance | Anil Kalbag & Anshul Chhabra | Comparative Architecture...
Evolve The Adobe Digital Marketing Community
 
Accelerated Leadership
Accelerated LeadershipAccelerated Leadership
Accelerated Leadership
kktv
 
Opensource approach to design and deployment of Microservices based VNF
Opensource approach to design and deployment of Microservices based VNFOpensource approach to design and deployment of Microservices based VNF
Opensource approach to design and deployment of Microservices based VNF
Michelle Holley
 
Performance testing for web-scale
Performance testing for web-scalePerformance testing for web-scale
Performance testing for web-scale
Izzet Mustafaiev
 
Big Data Europe: Simplifying Development and Deployment of Big Data Applications
Big Data Europe: Simplifying Development and Deployment of Big Data ApplicationsBig Data Europe: Simplifying Development and Deployment of Big Data Applications
Big Data Europe: Simplifying Development and Deployment of Big Data Applications
BigData_Europe
 
Docker experience @inbotapp
Docker experience @inbotappDocker experience @inbotapp
Docker experience @inbotapp
Jilles van Gurp
 
DevOps Offerings at WhiteHedge
DevOps Offerings at WhiteHedgeDevOps Offerings at WhiteHedge
DevOps Offerings at WhiteHedge
WhiteHedge Technologies Inc.
 
AWS re:Invent 2014 | (ARC202) Real-World Real-Time Analytics
AWS re:Invent 2014 | (ARC202) Real-World Real-Time AnalyticsAWS re:Invent 2014 | (ARC202) Real-World Real-Time Analytics
AWS re:Invent 2014 | (ARC202) Real-World Real-Time Analytics
Socialmetrix
 
Incident Response in the wake of Dear CEO
Incident Response in the wake of Dear CEOIncident Response in the wake of Dear CEO
Incident Response in the wake of Dear CEO
Paul Dutot IEng MIET MBCS CITP OSCP CSTM
 
How Docker EE is Finnish Railway’s Ticket to App Modernization
How Docker EE is Finnish Railway’s Ticket to App ModernizationHow Docker EE is Finnish Railway’s Ticket to App Modernization
How Docker EE is Finnish Railway’s Ticket to App Modernization
Docker, Inc.
 
SocCnx11 - All you need to know about orient me
SocCnx11 - All you need to know about orient meSocCnx11 - All you need to know about orient me
SocCnx11 - All you need to know about orient me
panagenda
 

Viewers also liked (20)

IOT Exploitation
IOT Exploitation	IOT Exploitation
IOT Exploitation
 
114 Numalliance
114 Numalliance114 Numalliance
114 Numalliance
 
Data Visualization on the Tech Side
Data Visualization on the Tech SideData Visualization on the Tech Side
Data Visualization on the Tech Side
 
Heterogenous Persistence
Heterogenous PersistenceHeterogenous Persistence
Heterogenous Persistence
 
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal HabitatsUSGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
 
Bsides Delhi Security Automation for Red and Blue Teams
Bsides Delhi Security Automation for Red and Blue TeamsBsides Delhi Security Automation for Red and Blue Teams
Bsides Delhi Security Automation for Red and Blue Teams
 
Demystifying Security Analytics: Data, Methods, Use Cases
Demystifying Security Analytics: Data, Methods, Use CasesDemystifying Security Analytics: Data, Methods, Use Cases
Demystifying Security Analytics: Data, Methods, Use Cases
 
Java management extensions (jmx)
Java management extensions (jmx)Java management extensions (jmx)
Java management extensions (jmx)
 
Mindmappen
MindmappenMindmappen
Mindmappen
 
EVOLVE'16 | Enhance | Anil Kalbag & Anshul Chhabra | Comparative Architecture...
EVOLVE'16 | Enhance | Anil Kalbag & Anshul Chhabra | Comparative Architecture...EVOLVE'16 | Enhance | Anil Kalbag & Anshul Chhabra | Comparative Architecture...
EVOLVE'16 | Enhance | Anil Kalbag & Anshul Chhabra | Comparative Architecture...
 
Accelerated Leadership
Accelerated LeadershipAccelerated Leadership
Accelerated Leadership
 
Opensource approach to design and deployment of Microservices based VNF
Opensource approach to design and deployment of Microservices based VNFOpensource approach to design and deployment of Microservices based VNF
Opensource approach to design and deployment of Microservices based VNF
 
Performance testing for web-scale
Performance testing for web-scalePerformance testing for web-scale
Performance testing for web-scale
 
Big Data Europe: Simplifying Development and Deployment of Big Data Applications
Big Data Europe: Simplifying Development and Deployment of Big Data ApplicationsBig Data Europe: Simplifying Development and Deployment of Big Data Applications
Big Data Europe: Simplifying Development and Deployment of Big Data Applications
 
Docker experience @inbotapp
Docker experience @inbotappDocker experience @inbotapp
Docker experience @inbotapp
 
DevOps Offerings at WhiteHedge
DevOps Offerings at WhiteHedgeDevOps Offerings at WhiteHedge
DevOps Offerings at WhiteHedge
 
AWS re:Invent 2014 | (ARC202) Real-World Real-Time Analytics
AWS re:Invent 2014 | (ARC202) Real-World Real-Time AnalyticsAWS re:Invent 2014 | (ARC202) Real-World Real-Time Analytics
AWS re:Invent 2014 | (ARC202) Real-World Real-Time Analytics
 
Incident Response in the wake of Dear CEO
Incident Response in the wake of Dear CEOIncident Response in the wake of Dear CEO
Incident Response in the wake of Dear CEO
 
How Docker EE is Finnish Railway’s Ticket to App Modernization
How Docker EE is Finnish Railway’s Ticket to App ModernizationHow Docker EE is Finnish Railway’s Ticket to App Modernization
How Docker EE is Finnish Railway’s Ticket to App Modernization
 
SocCnx11 - All you need to know about orient me
SocCnx11 - All you need to know about orient meSocCnx11 - All you need to know about orient me
SocCnx11 - All you need to know about orient me
 

Similar to How Cisco Migrated from MapReduce Jobs to Spark Jobs - StampedeCon 2015

Cisco at VMworld 2015 - Cisco UCS as the Foundation for Software-Defined Data...
Cisco at VMworld 2015 - Cisco UCS as the Foundation for Software-Defined Data...Cisco at VMworld 2015 - Cisco UCS as the Foundation for Software-Defined Data...
Cisco at VMworld 2015 - Cisco UCS as the Foundation for Software-Defined Data...
ldangelo0772
 
L'azienda è più agile? Tutto merito del Data Center
L'azienda è più agile? Tutto merito del Data Center L'azienda è più agile? Tutto merito del Data Center
L'azienda è più agile? Tutto merito del Data Center
SMAU
 
Building The Right Network
Building The Right NetworkBuilding The Right Network
Building The Right Network
Cisco Canada
 
Cisco Connect Halifax 2018 Cisco dna - deeper dive
Cisco Connect Halifax 2018   Cisco dna - deeper diveCisco Connect Halifax 2018   Cisco dna - deeper dive
Cisco Connect Halifax 2018 Cisco dna - deeper dive
Cisco Canada
 
Application Centric Infrastructure (ACI), the policy driven data centre
Application Centric Infrastructure (ACI), the policy driven data centreApplication Centric Infrastructure (ACI), the policy driven data centre
Application Centric Infrastructure (ACI), the policy driven data centre
Cisco Canada
 
Presentation data center transformation cisco’s virtualization and cloud jo...
Presentation   data center transformation cisco’s virtualization and cloud jo...Presentation   data center transformation cisco’s virtualization and cloud jo...
Presentation data center transformation cisco’s virtualization and cloud jo...
xKinAnx
 
Cisco’s Cloud Ready Infrastructure
Cisco’s Cloud Ready InfrastructureCisco’s Cloud Ready Infrastructure
Cisco’s Cloud Ready Infrastructure
Cisco Canada
 
Migrating from VMs to Kubernetes using HashiCorp Consul Service on Azure
Migrating from VMs to Kubernetes using HashiCorp Consul Service on AzureMigrating from VMs to Kubernetes using HashiCorp Consul Service on Azure
Migrating from VMs to Kubernetes using HashiCorp Consul Service on Azure
Mitchell Pronschinske
 
Cisco Connect 2018 Indonesia - software-defined access-a transformational ap...
Cisco Connect 2018 Indonesia -  software-defined access-a transformational ap...Cisco Connect 2018 Indonesia -  software-defined access-a transformational ap...
Cisco Connect 2018 Indonesia - software-defined access-a transformational ap...
NetworkCollaborators
 
Cisco data center training for ibm
Cisco data center training for ibmCisco data center training for ibm
Cisco data center training for ibm
Christian Silva Espinoza
 
Presentation capturing the cloud opportunity
Presentation   capturing the cloud opportunityPresentation   capturing the cloud opportunity
Presentation capturing the cloud opportunity
xKinAnx
 
Cisco Connect Toronto 2018 sd-wan - delivering intent-based networking to t...
Cisco Connect Toronto 2018   sd-wan - delivering intent-based networking to t...Cisco Connect Toronto 2018   sd-wan - delivering intent-based networking to t...
Cisco Connect Toronto 2018 sd-wan - delivering intent-based networking to t...
Cisco Canada
 
Cisco Digital Network Architecture – Deeper Dive, “From the Gates to the GUI
Cisco Digital Network Architecture – Deeper Dive, “From the Gates to the GUICisco Digital Network Architecture – Deeper Dive, “From the Gates to the GUI
Cisco Digital Network Architecture – Deeper Dive, “From the Gates to the GUI
Cisco Canada
 
Cisco Digital Network Architecture Deeper Dive From The Gates To The Gui
Cisco Digital Network Architecture Deeper Dive From The Gates To The GuiCisco Digital Network Architecture Deeper Dive From The Gates To The Gui
Cisco Digital Network Architecture Deeper Dive From The Gates To The Gui
Cisco Canada
 
Cisco ucs overview ibm team 2014 v.2 - handout
Cisco ucs overview   ibm team 2014 v.2 - handoutCisco ucs overview   ibm team 2014 v.2 - handout
Cisco ucs overview ibm team 2014 v.2 - handout
Sarmad Ibrahim
 
Cisco Connect 2018 Singapore - Cisco Software Defined Access
Cisco Connect 2018 Singapore - Cisco Software Defined AccessCisco Connect 2018 Singapore - Cisco Software Defined Access
Cisco Connect 2018 Singapore - Cisco Software Defined Access
NetworkCollaborators
 
Cisco’s Cloud Strategy, including our acquisition of CliQr
Cisco’s Cloud Strategy, including our acquisition of CliQr Cisco’s Cloud Strategy, including our acquisition of CliQr
Cisco’s Cloud Strategy, including our acquisition of CliQr
Cisco Canada
 
Cisco Powered Presentation - For Customers
Cisco Powered Presentation - For CustomersCisco Powered Presentation - For Customers
Cisco Powered Presentation - For Customers
Cisco Powered
 
Powering the Enterprise Cloud with CSC and Hitachi Data Systems
Powering the Enterprise Cloud with CSC and Hitachi Data SystemsPowering the Enterprise Cloud with CSC and Hitachi Data Systems
Powering the Enterprise Cloud with CSC and Hitachi Data Systems
Hitachi Vantara
 
Cisco Connect Toronto 2017 - Introducing the Network Intuitive
Cisco Connect Toronto 2017 - Introducing the Network IntuitiveCisco Connect Toronto 2017 - Introducing the Network Intuitive
Cisco Connect Toronto 2017 - Introducing the Network Intuitive
Cisco Canada
 






How Cisco Migrated from MapReduce Jobs to Spark Jobs - StampedeCon 2015

  • 1. Ken Owens, CTO, Cisco Intercloud Services. 07/15/15. How Cisco Migrated from MapReduce Jobs to Spark Jobs
  • 2. Cisco and/or its affiliates. All rights reserved.Presentation_ID Cisco Public Introduction
  • 3. Introduction
  • 4. Introduction
  • 5. Introduction
  • 6. Introduction
  • 7. The Uber Trend: Exponential Rise in Connectivity
      - 30M new devices connected every week
      - 78% of workloads processed in cloud DCs by 2018
      - 5TB+ of data per person by 2020
      - 180B mobile apps downloaded in 2015
      - 277X data created by IoE devices v. end-user
      Source: IDC
  • 8. Exponential Growth Drives Opportunities. [Chart: exponential trend v. linear trend; labels: Knee of Curve, Disruptive Stress/Opportunity.] Source: Peter Diamandis, BOLD
  • 9. When Products Become Cloud-enabled, They Become 10X More Valuable. Prices shown (in slide order): $23.19, $249.00; $18.01, $199.00; $5.99, $59.99
  • 10. A Broader Perspective than Hybrid Cloud Is Required… [Diagram: SaaS, PaaS, IaaS across Data Center, Cloud, Edge / IoT.]
  • 11. CIOs Need to Embrace Both Traditional and Hyperscale Application Deployment
      - Hyperscale applications serving several thousands of users very quickly: Hadoop, mobile back-ends, gaming, social. IoE and increasing connectivity driving the need for such workloads. Small (~10%), yet rapidly growing percentage of applications in the cloud.
      - Traditional enterprise applications: ERP, CRM, applications that leverage traditional databases. Majority of applications being run for/by enterprises today.
  • 12. Application Portability and Interoperability Is the Key
      - Traditional Applications: ERP, Financial, Client/Server, CRM, email, …
      - Cloud Native Applications: IoT, Big Data, Analytics, Gaming, ...
      [Diagram: SaaS, PaaS, IaaS across Data Center, Cloud, Edge / IoT.]
  • 13. Bimodal IT Is the New Normal
      - 45% of CIOs currently have a second fast/agile mode of operation
      - Traditional Mode: Requires Reliability (ITIL, CMMI, COBIT)
      - Nonlinear Mode: Accept Instability (DevOps, automation, reusable)
      - Diagram labels: Systems of Record, Systems of Differentiation, Systems of Innovation; Change; Governance
      Source: Gartner, Lydia Leong
  • 14. We Must Connect the Clouds
      - Islands of Isolated PC LAN Networks (1990s): multiple LANs using a multitude of protocols. The Internet: connected using industry-standard IP protocol; IP based, open standards.
      - World of Isolated Clouds (2000s): individual custom-built clouds without consistent APIs. The Intercloud: connected for application acceleration with open APIs; web-scale architecture; API-driven automation; open, secure, compliant, hybrid IT.
  • 16. Omni-Channel Customer Journeys: the customers’ interaction with Cisco across multiple touch points to get the desired business outcome.
      - Channels: Server Logs, Social & Chat, Mobile, Event Streams, Call Center
      - Interaction touch points: S/W Download, Open Trouble Ticket, Assign Engineer, Update Trouble Ticket, Close Trouble Ticket, Resolve Trouble Ticket, Read Support Documents, View Design Documents, View Tech Documents, New Registration, Bug Search, FAQs, Contract Details, Product Details, Device Coverage
      - Journeys: Case Resolution, Software Upgrade
  • 17. Customer Interaction Analytics: From Journey to Outcome…
      - Customer Journeys: Software Upgrades, Bug Inquiry, Software Inquiry, Trouble Ticket Lifecycle, Device Troubleshooting, New Registration, Contract Renewal
      - Behavioral Insights: Customer Interest Analytics, Customer Experience Analytics, Resource Forecasting, Security and Compliance
      - Impact: Boost Self Service, Real-time Content Optimization & Recommendation, Context Based Predictive Alerts, Implicit Personalization
  • 18. Customer Interaction Analytics Big Data Platform: synthesize customer journey maps into behavioral insights.
      - Data Sources: Server Logs, Call Center, Mobility, Social, Event Streams
      - Data Ingestion: CiscoDV, Kafka, Redis, ETL
      - Analytics (diagram labels in slide order): Model, Build Model, Activity Refinement, Activity Synthesis, Synthesized Insights, Real-time Processing, Batch Analytics
      - Insight Services: CiscoDV
      - Interact: Impala, Hive, Pig, ES, Zoomdata, Platfora
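The slide names the idea of synthesizing journeys from multi-channel activity but shows no implementation. The following is a minimal Python sketch of that idea only, not the platform's actual code: it merges events from several channels, orders them by time per customer, and counts touch-point transitions. The `Event` fields, the function names, and the sample data are all hypothetical and chosen for illustration; touch-point and channel names are borrowed from slide 16.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One customer interaction; every field name here is hypothetical."""
    customer_id: str
    timestamp: int       # ordering key; the unit is an assumption
    channel: str         # e.g. "Call Center", "Mobile"
    touch_point: str     # e.g. "Open Trouble Ticket"


def synthesize_journeys(events):
    """Group events by customer, ordered by time, keeping touch points only."""
    journeys = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        journeys[event.customer_id].append(event.touch_point)
    return dict(journeys)


def count_transitions(journeys):
    """Count how often one touch point is directly followed by another."""
    counts = defaultdict(int)
    for steps in journeys.values():
        for current, following in zip(steps, steps[1:]):
            counts[(current, following)] += 1
    return dict(counts)


# Made-up sample events arriving out of order from different channels.
SAMPLE_EVENTS = [
    Event("c1", 3, "Call Center", "Assign Engineer"),
    Event("c1", 1, "Mobile", "Bug Search"),
    Event("c1", 2, "Call Center", "Open Trouble Ticket"),
    Event("c2", 5, "Server Logs", "S/W Download"),
    Event("c2", 4, "Social & Chat", "Read Support Documents"),
]

if __name__ == "__main__":
    journeys = synthesize_journeys(SAMPLE_EVENTS)
    for customer, steps in journeys.items():
        print(customer, " -> ".join(steps))
    print(count_transitions(journeys))
```

Transition counts of this kind are one simple input a behavioral-insight model could consume; a production pipeline would likely do the grouping in a distributed engine instead of in memory.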
  • 19. AWS and CIS Intercloud Solution
  • 20. AWS Platform
      Component | Cloud::Hadoop (Batch Analytics) | Cloud::Queries (Interactive Queries) | Cloud::Streams (Near Real-time Analytics)
      Virtual Machines | 30 | 6 | 5
      AWS Instance Sizing | m3.2xlarge | c3.xlarge | m3.xlarge
      Virtual Cores | 8/VM | 4/VM | 4/VM
      RAM | 30GB/VM | 7.5GB/VM | 15GB/VM
      Disk | 1.5 TB/VM | 1.5 TB/VM | 1.5 TB/VM
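The per-VM figures on this slide can be rolled up into platform-wide totals, which is the kind of number a capacity comparison between environments would need. The sketch below does only that arithmetic in Python. The totals are derived here and are not stated in the slides; the dictionary keys and function names are hypothetical.

```python
# Per-tier sizing copied from the AWS Platform slide.
# Key names ("vms", "cores_per_vm", ...) are hypothetical labels.
TIERS = {
    "Cloud::Hadoop": {"vms": 30, "cores_per_vm": 8, "ram_gb_per_vm": 30.0, "disk_tb_per_vm": 1.5},
    "Cloud::Queries": {"vms": 6, "cores_per_vm": 4, "ram_gb_per_vm": 7.5, "disk_tb_per_vm": 1.5},
    "Cloud::Streams": {"vms": 5, "cores_per_vm": 4, "ram_gb_per_vm": 15.0, "disk_tb_per_vm": 1.5},
}


def tier_totals(tier):
    """Multiply each per-VM figure by the number of VMs in the tier."""
    vms = tier["vms"]
    return {
        "vms": vms,
        "cores": vms * tier["cores_per_vm"],
        "ram_gb": vms * tier["ram_gb_per_vm"],
        "disk_tb": vms * tier["disk_tb_per_vm"],
    }


def platform_totals(tiers):
    """Sum the per-tier totals across all tiers."""
    totals = {"vms": 0, "cores": 0, "ram_gb": 0.0, "disk_tb": 0.0}
    for tier in tiers.values():
        for key, value in tier_totals(tier).items():
            totals[key] += value
    return totals


if __name__ == "__main__":
    for name, tier in TIERS.items():
        print(name, tier_totals(tier))
    print("Platform", platform_totals(TIERS))
```

Under these figures the three tiers together come to 41 VMs, 284 virtual cores, 1,020 GB of RAM and 61.5 TB of disk.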
  • 21. Case for Cisco Intercloud Services for Analytics…
    • Cisco Security and Compliance requirements: workloads that deal with personally identifiable data and Cisco confidential content cannot be uploaded to AWS. The Cisco internal cloud solution is a better fit.
    • Customer journey beyond the enterprise: applications are hosted on AWS, and partner systems are hosted on AWS and other cloud providers. Presence in AWS and other cloud services is required to support these scenarios for end-to-end customer journey insights.
    • Data virtualization integrated in the CIS Analytics Stack: connect data from multiple clouds and multiple big data platforms.
    • Integrated visualization toolset
  • 22. CIS Analytics Platform
  • 23. CIS Analytics Platform Requirements
    • Infra Provisioning: deploy a virtual private cloud (VPC) on CIS with compute, storage and memory requirements comparable to the current production system.
    • OpenStack: Icehouse OpenStack with Neutron, Nova, and Swift installed.
    • Big Data Ecosystem: Cloudera’s Hadoop distribution version CDH 5.1.3, ELK Stack, Apache Kafka and Apache Storm.
    • Data virtualization & Cloud Integration: access to data services and data stores via Cisco Data Virtualization.
    • Runtime Services: foundational PaaS capabilities including SLAs for uptime, performance, latency, data retention, issue escalation and support priorities, issue resolution, problem management, deployment process, patch management.
    • API Services: provide both fine-grained and coarse-grained access to all service layers of the CIS Analytics Platform. In the hybrid cloud model it must support interoperability across platform service providers and promote the cloud concepts of extensibility and flexibility.
  • 24. AWS to CIS Migration – Success Criteria
    • Successful synthesis of customer interaction data
    • Successful automation of the end-to-end data process pipeline
    • Build behavioral insight services
    • Access to data and services via data discovery and visualization tools
    • Meet the performance, scale and platform stability requirements
    • Successful deployment of CiscoDV on CIS
    • Connect HDFS and Hive DS with CiscoDV via Hive and Impala
    • Build and expose insight services for consumption by limited users
  • 25. AWS and CIS Data Node Sizing Comparison: Hadoop Cluster for Batch and Query Analytics
    | Sizing | Node Service | Instance Type | vCPU | Mem | Storage | Number of Data Nodes | Comments |
    |---|---|---|---|---|---|---|---|
    | AWS | Data Nodes/Node Master | m3.2xlarge | 8 | 30 | 2x80 GB | 30 | Each Hadoop data node has 1500GB of EBS available for HDFS storage |
    | CCS | Data Nodes/Node Master | GP-2XLarge | 8 | 32 | 50 | 35 | Each Hadoop data node has 1500GB of EBS available for HDFS storage. Less than AWS sizing (Storage) |
  • 26. Pilot Test Data
    • Test performed on one day’s production data
    • Total no. of records processed: 110,852,667
    • Total data size: 32GB
    • Total no. of M/R jobs in the data pipeline: 17
    • Two test cycles
        • Cycle 1: Heterogeneous CCS nodes (vCPUs, storage, memory)
        • Cycle 2: Homogeneous CCS nodes
  • 27. CIS Performance of Batch Analytics – Limited Test
  • 28. Test Details by M/R job (columns give the number of CCS nodes in each test cycle)
    | Job Name | Cycle 1: 12 nodes | Cycle 1: 18 nodes | Cycle 1: 24 nodes | Cycle 1: 30 nodes | Cycle 2: 18 nodes | Cycle 2: 24 nodes | Cycle 2: 30 nodes | Cycle 2: 35 nodes |
    |---|---|---|---|---|---|---|---|---|
    | New_cleanse | 249 | 176 | 143 | 117 | 82 | 67 | 55 | 51 |
    | Process_private_ip | 27 | 14 | 11 | 10 | 7 | 5 | 6 | 6 |
    | join_web_and_ip_data | 142 | 95 | 76 | 61 | 49 | 40 | 34 | 29 |
    | combine_ip_decorated_files | 26 | 14 | 11 | 10 | 9 | 7 | 8 | 7 |
    | filterBotEntries | 34 | 19 | 15 | 13 | 10 | 8 | 7 | 7 |
    | sessionize | 71 | 64 | 69 | 62 | 60 | 63 | 15 | 13 |
    | firstActivitiesFilter | 26 | 15 | 13 | 10 | 9 | 8 | 6 | 6 |
    | allOtherActivitiesFilter | 29 | 18 | 13 | 13 | 11 | 9 | 7 | 6 |
    | matchFirstActivities | 21 | 13 | 11 | 13 | 13 | 11 | 8 | 8 |
    | buildActivities | 27 | 15 | 12 | 10 | 7 | 6 | 9 | 9 |
    | filterBUG | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4 |
    | filterSEA | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4 |
    | filterTCO | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4 |
    | filterTDV | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4 |
    | filterWDV | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4 |
    | filterMOD | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4 |
    | filterTOOL | 8 | 5 | 3 | 2 | 3 | 3 | 4 | 4 |
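One way to read the per-job timings on this slide is to compare the speedup a job gets against the growth in node count. The sketch below is only an illustration: it copies the cycle 1 figures for two jobs from the slide, and the helper names `speedup` and `scaling_efficiency` are hypothetical, not part of the original analysis. The ratios are unit-free, so the (unstated) time unit of the slide does not matter.

```python
# Cycle 1 timings for two jobs, copied from the slide (node count -> reported time).
cycle1 = {
    "New_cleanse": {12: 249, 18: 176, 24: 143, 30: 117},
    "sessionize": {12: 71, 18: 64, 24: 69, 30: 62},
}


def speedup(timings, base_nodes, nodes):
    """How many times faster the job ran on `nodes` than on `base_nodes`."""
    return timings[base_nodes] / timings[nodes]


def scaling_efficiency(timings, base_nodes, nodes):
    """Speedup divided by the growth in node count; 1.0 would be linear scaling."""
    return speedup(timings, base_nodes, nodes) / (nodes / base_nodes)


# Compare 12 nodes against 30 nodes for each job.
report = {
    job: (speedup(t, 12, 30), scaling_efficiency(t, 12, 30))
    for job, t in cycle1.items()
}

for job, (s, e) in report.items():
    print(f"{job}: speedup {s:.2f}x on 2.5x the nodes, efficiency {e:.2f}")
```

On these figures New_cleanse gets roughly a 2.1x speedup from 2.5x the nodes, while sessionize gets only about 1.15x, which suggests that sessionize gained little from added nodes in cycle 1.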
  • 29. PoC: Analytics with Spark on CIS
    • Existing code
        • Made in Ruby with Wukong to run on Hadoop
        • A history of changes and modifications
        • Script-based, steps communicate via intermediary files
    • Goal
        • Revise, rethink and reimplement with Spark on CIS
        • Open for advanced cloud analytics
        • Improve maintainability by moving away from aging Ruby on Hadoop
  • 30. Sessionize: Ruby Flow
    [Flow diagram. Stage labels: logs, cleanse, cleansed, private, web, decorate, sessionize (cookie, time), sessioned, match 1st (IP, UA, time), build actions, merge, session PSV, add to Hive; filter outputs: bug, tool, first, others, bots, 1..7, onlyBots. Annotations: "Main computation happens here", "global sort in time", "global group by IP"]
    • Pre-process log records (‘cleanse’)
    • Extract HTTP sessions (‘sessionize’)
    • Extract user actions, such as ‘search’, ‘download patch’, ‘open manual’, ‘open a bug’
    • Ruby: scripts with temp files
        • Each box on the figure is a script in a separate file
        • They pipe gigabytes of data as input and output
        • Random matching of nodes to data for sessionizing
        • Lots of redundant shuffling
  • 31. Sessionize: Spark Flow
    [Same flow diagram as the Ruby flow. Annotations: "Main computation happens here", "global partition by IP", "local sort in time"]
    • Same flow, but each box is a Java or Scala function
    • No intermediate temp files
        • Steps are chained by Spark, often without any need for intermediate data
        • If still needed, the data is stored in memory and on local disk as much as possible
    • Local computation
        • Cleansing is computed on nodes local to data blocks (same as Ruby)
        • Sessions are built per IP, on separate nodes each handling a single IP range
        • Once copied to the node that owns its partition, the data remains local
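The "global partition by IP, local sort in time" idea on this slide can be illustrated without Spark. The following is a minimal plain-Python sketch, not the Java/Scala code from the PoC: the record layout `(ip, timestamp, url)`, the CRC32-based partitioner, and the 30-minute inactivity gap are all assumptions made for illustration.

```python
from collections import defaultdict
from zlib import crc32

# Hypothetical inactivity gap that closes a session; the slides do not state one.
SESSION_GAP_SECONDS = 30 * 60


def partition_by_ip(records, num_partitions):
    """Route each record to a partition chosen by its IP (the global partition step)."""
    partitions = defaultdict(list)
    for ip, timestamp, url in records:
        partitions[crc32(ip.encode()) % num_partitions].append((ip, timestamp, url))
    return partitions


def sessionize_partition(partition):
    """Sort one partition locally by (IP, time) and cut a new session at long gaps."""
    sessions = []
    last_ip, last_ts = None, None
    for ip, timestamp, url in sorted(partition):
        if ip != last_ip or timestamp - last_ts > SESSION_GAP_SECONDS:
            sessions.append((ip, []))
        sessions[-1][1].append((timestamp, url))
        last_ip, last_ts = ip, timestamp
    return sessions


# Toy log records: (ip, timestamp in seconds, url).
records = [
    ("10.0.0.2", 5000, "/bug/search"),
    ("10.0.0.1", 100, "/home"),
    ("10.0.0.1", 4000, "/download"),
    ("10.0.0.1", 160, "/docs"),
]

# Each partition is processed independently; no global sort is ever needed.
all_sessions = []
for partition in partition_by_ip(records, num_partitions=4).values():
    all_sessions.extend(sessionize_partition(partition))
```

Because every record of a given IP lands in the same partition, each partition can be sorted and sessionized on its own, which is the property that lets the global sort be replaced by many small local sorts.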
  • 32. Runtime comparison
    • Volumes
        • Logs of a single day: 52 GB
        • Total of 110 million records
        • Of which 53 million records are kept after pre-filtering
        • Producing over 1 million user actions
    • Cluster of 30 nodes
    • Ruby: runtime 140 min
    • Spark: runtime 7 min (20 times faster)
  • 33. Why?
    • Extracting sessions means sorting in time and grouping by IP
    • Ruby: sorting in time and per-IP grouping are performed across the whole cluster (very bad, lots of IO)
    • Spark is good at dealing with partitions:
        • per-IP groups are placed on different machines (partitions)
        • the global sort in time is replaced by many local per-IP sorts, done on the machines responsible for extracting sessions for specific groups of IP addresses
    • Other improvements
        • Avoid redundant temp files and redundant (de)serialization of objects (comes with Java/Scala); stages keep data in memory when possible (comes with Spark)
        • Cache the results of user agent resolution, which is heavy on regular expressions
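The last point, caching user agent resolution, can be sketched with Python's standard `functools.lru_cache`. This is an illustration under assumptions rather than the PoC's implementation: the patterns, the labels, and the function name `resolve_user_agent` are hypothetical, and the original code was written in Java or Scala.

```python
import re
from functools import lru_cache

# Hypothetical classification rules; the real resolver's patterns are not in the slides.
_UA_PATTERNS = [
    (re.compile(r"bot|crawler|spider", re.IGNORECASE), "bot"),
    (re.compile(r"Firefox/\d+"), "firefox"),
    (re.compile(r"Chrome/\d+"), "chrome"),
]


@lru_cache(maxsize=100_000)
def resolve_user_agent(user_agent):
    """Run the regex cascade once per distinct user agent string; repeats hit the cache."""
    for pattern, label in _UA_PATTERNS:
        if pattern.search(user_agent):
            return label
    return "other"


# Log records repeat the same user agent strings many times.
log_user_agents = [
    "Mozilla/5.0 (X11) Gecko/20100101 Firefox/38.0",
    "Googlebot/2.1 (+http://www.google.com/bot.html)",
    "Mozilla/5.0 (X11) Gecko/20100101 Firefox/38.0",
]

labels = [resolve_user_agent(ua) for ua in log_user_agents]
cache_stats = resolve_user_agent.cache_info()
```

Since a day of logs contains far fewer distinct user agent strings than records, most lookups are served from the cache and skip the regular expressions entirely.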
  • 35. Data Virtualization for Intercloud Analytics: Customer Benefits
    • Discover data beyond the enterprise: virtual integration that combines traditional enterprise data, Big Data stores on CIS and AWS, cloud data from SaaS providers, and data from Cisco customers and partners
    • Seamless interoperability offers easy access to data across distributed data sources in the intercloud analytics platform
    • Universal data governance maximizes enforcement of data security rules
    • Analytics Data Hubs: deployment flexibility to build hybrid/virtual sandboxes that enable nimble data discovery and rapid data analytics to support multiple LOBs
    • Deliver data to any number of analytics tools
  • 36. Use Case 1: Get Case Interactions
    • Use Case Description: number of cases opened by company X that are currently open (other variations would include cases by company, trends, etc.)
    • CiscoDV Value: CiscoDV enforces data security rules to restrict access on the intercloud platform to customer-sensitive data.
    • Data Sources: SalesForce
    • Intercloud Solution: the CIS CiscoDV service can access the “sanitized” version of CSOne data through JDBC from the RIDES (SWTG CiscoDV) API.
    • Connection Type: DV on hybrid cloud → Enterprise data store
  • 37. Use Case 2: Get Customer Journey
    • Use Case Description: customer interactions on the web pertaining to the bug search and case submission process. Foundational data can be used to explore trends and feed into content recommendation models.
    • CiscoDV Value: direct access to data on the CIS Intercloud Analytics Platform
    • Data Sources: SAS Analytics
    • Intercloud Solution: by direct network access to the Impala Server, the CIS CiscoDV server connects to the Impala Service in Hadoop, also on CIS, as a data source. SQL queries configured in CiscoDV execute Impala queries.
    • Connection Type: DV on hybrid cloud → VPC Big Data platform
  • 38. Use Case 3: Get Bug Interactions
    • Use Case Description: another foundational data service that provides a breakdown of customer exposure to or interest in bugs. The service can be refined further to look at trends specific to a company or a product for further analytics.
    • CiscoDV Value: real-time data federation that accesses extremely large data in the CIS Intercloud Analytics platform and joins it with Bug Data accessed via the departmental CiscoDV instance (RIDES)
    • Data Sources: SASA Analytics and QDDTS via RIDES
    • Intercloud Solution: by building on the access to the Impala Server, the DV server can join the Bug Data from the Enterprise Data Stores with the HDFS data to provide a federated view.
    • Connection Type: DV on hybrid cloud → VPC Big Data platform and Enterprise data store
  • 39. CiscoDV on Intercloud Analytics Platform (CIS)
    • Scenario 1: CIS CiscoDV to Cisco Enterprise Data Store
    • Scenario 2: CIS CiscoDV to Impala and Hive on CIS Intercloud Analytics Platform
    • Scenario 3: CIS CiscoDV to Hive on AWS Big Data Cluster

Editor's Notes

  1. FABIO – a few items from Pankaj and Liz Monday: Per the John Chambers slides I sent you Monday night, please be sure to fully address digitization in the opener, so Pankaj can connect to John’s opening remarks. Set the stage here for what the digital transformation is and why it drives IoE and cloud. Explain where we came from, where we are today – exponential growth and a magnitude of changes still to come. Please see new VNI, to see if there are any newer/better stats re the Data Center. Pankaj feels the top 3 data points are ok in this slide, but perhaps we could find better ones for the bottom 2 data points? Maybe uplevel them a bit? ------------------------------------------------------- The world is changing. The digital transformation is turning traditional business models on their heads. We are seeing unprecedented growth in the explosion of devices and mobile apps and in data utilization. IoE: IoE devices create 277 times the data that the end user is creating, but only a fraction of it ever reaches the data center. A Boeing 787, for example, generates 40 TB of data for every hour of flight time, but only 0.5 TB is ultimately transmitted to the data center. Mobility: In 2014, global mobile data traffic grew 1.7x or 69%… In 2014 alone, 77B+ mobile apps were downloaded… by 2015 180B apps (233% increase). Internet: IDC predicts that by 2017 there will be 3.6 billion global Internet users… more than 1/2 the world population. Big Data: By 2020 there will be more than 5,000 GB of data for every person on Earth. These massive changes are putting tremendous stress on the data center. The traditional data center model has to evolve in order to meet demand today and into the future.
  2. We know how to fix this. We’re going to do for cloud what we did for data. You couldn’t move data between the networks: they weren’t connected. Cisco unified those worlds. The world of cloud today is a world of isolated clouds. There’s no workload or data portability. “Amazon is Hotel California – you can never leave, and that data is staying there.” Our vision is to connect all these clouds together into the Intercloud, whether private, public, or hybrid, through technology and innovation. Intercloud is going to connect these clouds together in the same way we connected data together. No one cloud model or single cloud approach, such as the massively scalable clouds from Amazon, Google or Microsoft, will win alone in this space.