Taming the Beast:
Extracting Value from Hadoop
John L Myers
Enterprise Management Associates
Managing Research Director
JMyers@EnterpriseManagement.com
@johnlmyers44
Ingo Mierswa
RapidMiner
Founder & CTO
imierswa@rapidminer.com
Panel Moderator
Lyndsay Wise, Research Director, EMA
Lyndsay has over 10 years of experience in software
research, BI consulting, and strategy development,
specializing in software evaluation and best-fit solution
selection. Her focus at EMA is on data integration, data
governance, cloud technologies, data visualization,
analytics, and collaboration.
Slide 2 © 2015 Enterprise Management Associates, Inc.
Featured Speakers
John Myers, Managing Research Director, EMA
John has over 10 years of experience working in areas related to business
analytics in professional services consulting and product development
roles. Additionally, John helps organizations solve their business analytics
problems, whether they relate to operational platforms – such as customer
care or billing – or applied analytical applications – such as revenue
assurance or fraud management.
Ingo Mierswa, Founder & CTO, RapidMiner
Ingo, an industry-veteran data scientist, is the founder and CTO of
RapidMiner, the industry’s #1 open source platform for predictive
analytics. Ingo is passionate about the technological innovation enabled
by the open source community and envisions a world where easy-to-use
predictive analytics software empowers all business analysts and data
scientists. Ingo is the author of numerous award-winning publications
about predictive analytics and big data, and has spoken at countless
industry events.
Logistics for Today’s Webinar
Event Presentation
• A PDF of the PowerPoint presentation will be available
Event Recording
• An archived version of the event recording will be
available at www.enterprisemanagement.com
Questions
• Log questions in the Q&A panel located on the
lower right corner of your screen
• Questions will be addressed during the Q&A
session of the event
Join the Conversation…
Submit your questions or comments to the panel
using: @wiseanalytics @johnlmyers44 @rapidminer
#predictiveanalytics
Topic #1:
Issues With Data Lakes
Adoption of Hadoop-based Data Lake Architectures
Topic #2:
Obstacles Implementing
Analytics On Hadoop
Obstacles Implementing Analytics
Topic #3:
Processing Requirements for
Predictive Analytics
Required Processing and Compute Latency
for Big Data Projects
©2015 RapidMiner, Inc. All rights reserved. - 12 -
Architecture of Hadoop
Orchestration node
Worker nodes
Leverage Hadoop’s Compute Capacity
• Design advanced analytics workflows in your
predictive analytics platform
• Ensure your solution automatically translates
predictive analytics needs into native Hadoop
code, e.g., MapReduce, Hive, Pig, Spark, etc.
• Push predictive analytic instructions into your
Hadoop cluster
• Hadoop performs calculations across the entire
Hadoop cluster for a holistic view of your data
• Data remains in Hadoop → Results are delivered
to the business
• Recommendations
– GUI workflow language (code-free)
– Don’t forget about security
[Diagram labels: Predictive Analytics Platform; Analytic instructions translated to native Hadoop; Calculations; Results; Results operationalized in business processes]
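The push-down idea on this slide (instructions travel to the data, and only results travel back) can be illustrated with a minimal, single-process sketch. It is not RapidMiner's translation layer or Hadoop's API; the names `PARTITIONS`, `map_partition`, `merge_counts`, and `run_job` and the sample data are all hypothetical, and each inner list merely stands in for the records held by one worker node.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample data: each inner list stands for the records
# stored on one worker node. In a real cluster these never leave the node.
PARTITIONS = [
    ["sports", "news", "sports"],
    ["news", "drama"],
    ["sports", "drama", "drama"],
]


def map_partition(records):
    """Worker-side step: summarize the local records in place."""
    return Counter(records)


def merge_counts(left, right):
    """Orchestration-side step: combine two partial summaries."""
    return left + right


def run_job(partitions):
    """Apply the map step to every partition, then reduce the partial results."""
    partials = [map_partition(p) for p in partitions]  # one small summary per worker
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    totals = run_job(PARTITIONS)
    for genre, count in sorted(totals.items()):
        print(genre, count)
```

Only the small per-partition summaries reach the merge step, which is the property the slide describes as "Data remains in Hadoop". An engine such as MapReduce or Spark would run the map step in parallel on the worker nodes instead of in a local loop.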
Topic #4:
Successful Big Data Analytics
Projects
Project Success
OPERATIONALIZE
Predictive Decisions
Close the Loop Between
Insight and Action
Embed predictive models into
critical business processes
Recommend best options for
human or automated actions
Topic #5:
Best Practices For
Implementing
Advanced/Modern Analytics
EFFORTLESS
Predictive Analytics
Immediately Empower
Analysts to Anticipate
Opportunity & Risk
Easily Combine Any Data at
Unlimited Scale with Any Model
Code-Free, Lightning-Fast
and Intuitive
Topic #6:
Use Of Mixed Environments
For Implementation Of Big
Data Analytics
Growing Importance of Cloud Resources
Design Once, Deploy
ANYWHERE
Leverage Investments in
Existing and Future Systems
Design predictive analytics
independent of platforms
Seamlessly execute predictive
analytics in-memory or
in any source, including
data-at-rest or data-in-motion
Topic #7:
Evolving Role of
the Data Consumer
What We Used to Think
of Analytical Users
Empowering the Line of Business
Topic #8:
Use Cases – Monetizing
Insights Buried In Your
Multi-Structured Data
Challenge
Better understand TV
viewing habits to
prevent churn and
optimize advertising
“RapidMiner allows us to leverage Big Data, in real-time.”
-- Avi Bernstein
Professor at the University of Zurich, Department of Informatics
Drive Broadcast Revenue and Customer
Retention
<5s
time to generate high
value activities based
on predictive
analytics
Solution
Process Big Data from
three million TV
viewers, in real-time,
to make program
recommendations and
personalized
advertising
Challenge
Monitor corporate
performance data in
real time to identify
correlations, outliers,
and economic drivers
“We benefit from the availability of community extensions via the RapidMiner
Marketplace. We can easily search for what others have designed in RapidMiner, and use
the extensions that are a fit for us.”
-- Tom Gatten
CEO
Track Data from Millions of Companies to
Identify Critical Economic Drivers
4.5 M
subject matter experts’
content analyzed in
the United Kingdom
every single day
Solution
Use RapidMiner to
mash up data of UK
businesses, rapidly
prototype predictive
models & identify
outlying, unusual
data
Where To Go From Here?
• Data lakes are an emerging data management architecture
• Fully realizing value from data lakes remains a challenge
• Following best practices and patterns helps
Q&A – Please Log Questions in the Q&A Panel
• Visit RapidMiner.com to learn more about
Effortless Predictive Analytics
• Learn more about leading IT analyst firm Enterprise
Management Associates (EMA) at
enterprisemanagement.com
