Rick Mutsaers Informatica
- 2. 2 © Informatica. Proprietary and Confidential.
New Questions Are Forcing
Data-Driven Digital Transformation
Engage Your Customers
Optimize Your Operations
Transform Your Products
Empower Your People
Digital Transformation Goals
What specific interactions and events led to conversion?
How can we identify new sources of cross-sell and upsell revenue?
What specific patterns affect employee retention or patient outcomes?
How can we improve the productivity of employees?
What specific events and patterns are fraudulent and anomalous?
How can we minimize cost and increase profitability?
What specific signals are there to better predict downtime and failure?
How can we improve the uptime of key processes to improve products and services?
- 3.
There is a Generational Market
Disruption Underway
Databases
Azure
HDInsight
Compute
Analytics
Storage Tables
Applications
Proprietary
Engines
Streams
Pig
Azure
Storage
- 4.
Data 3.0 brings in new challenges
Data Volume & Velocity: 600 terabytes of Facebook data per day
New Users: 325 million business data users and growing
New Data Types (mobile, social, IoT): 20 billion connected devices
Data in the Cloud: Over 92% of data center traffic will come from the Cloud
AI driven Apps & Analytics: 1 billion workers will be assisted by machine learning or AI
- 5.
Key capabilities of modern data lake
architectures for Data 3.0
Integrate all types of structured & unstructured data
Process dynamically changing data
Connect to on-premises & Cloud data
Deploy on multiple cloud platforms
Automate data flows into data hungry apps
Guide behavior with intelligent recommendations
Catalog all data
Collect and process batch & streaming data with big data scale
Provide self-service data preparation
Subscribe to certified data
Data Volume & Velocity
New Users
New Data Types (mobile, social, IoT)
Data in the Cloud
AI driven Apps & Analytics
- 6.
Typical, fragmented approach does not scale
[Diagram: pipeline stages Acquire, Ingest, Prepare, Secure, Master, Govern, Serve, Consume, spanning TRADITIONAL INFRASTRUCTURE and BIG DATA INFRASTRUCTURE; most stages are labeled "Hand-coding"]
- 7.
Data is gold
Metadata is the diamond in the rough
- 9.
Recommendation
Build a data catalog
- 12.
more than 700 customers, incl. over 400 mobile data customers
top 3 voice carrier with over 28 billion minutes
world leader in mobile data services
1.65 billion euro revenues
HQ in Brussels with offices in Bern, Dubai, Singapore and New York
400+ employees
22.4%
20%
57.6%
Customer example: BICS
- 13.
THE PROJECT
Provide a 360 view on Roaming Activity
- 14.
The Roadmap
• Phase 1: Migrate the data storage and processing from the Teradata DB to the Hybrid Platform
Set up the loading of the data into Hadoop
Move all the Tracking Applications (using the detailed data) into Hadoop and keep the SLA (<1 min) for the Subscriber Tracking
Move the processing of all the high-latency (15-minute, hourly, daily) applications to Hadoop
• Phase 2: Compute the new analytics in real time on Hadoop and provide longer historical reporting to customers
- 15.
Learn more:
Visit us at stand 102
Thank you!
Rick Mutsaers
+31 622 414240
rmutsaers@informatica.com
Editor's Notes
- So what is changing in the world of analytics?
First of all, we see new questions being asked in order to control and grow the business.
For example: which interaction with our customer persuaded him or her to buy our product?
Or in healthcare: what did we do to make the patient better? Or in general: do we understand the reasons why employees leave our company, so we can prevent this in the future?
But also: what can we do to lower cost and optimize our business performance?
- The infrastructure choices to help with these questions have dramatically changed over the past decade.
- And this new era, which we call Data 3.0, where data is used as a strategic asset to fuel this disruption, comes with new challenges: growing data volumes and new types of data, and the fact that data consumers need smarter applications, meaning AI and machine learning. We also see a whole new group of data consumers.
Finally, we see a big shift toward the cloud. So how do you integrate data that’s not in your datacenter and, more specifically, that’s spread across many cloud applications?
- Looking at these challenges, what do we then need in terms of capabilities...
- But don’t make the mistake of starting to use a plethora of non-integrated tools to tackle these problems; rather, think of an integrated platform that provides all the capabilities you need.
- Now the new adage is ‘Data is the new gold’. I would then pose the statement ‘Metadata is the diamond in the rough’. Let me explain.
- There are different types of metadata we need to collect to face these challenges
- Using all this metadata, create a data catalog that business users can query to understand what data there is, who owns it, what it means, what quality it has, etc.
- This can also be used to create data integration patterns based on an abstracted view
- The benefits of using a tightly integrated platform are numerous...
- Now let’s look at a customer example that has gone through this digital revolution.
BICS is a network provider for roaming services.
- They started a project to get a better view of roaming activity in order to better predict network usage, disruptions, and customer behavior.
- To do this, they first implemented a Hadoop platform to offload data from expensive Teradata into cheaper HDFS, and then moved some of their existing data integration logic (built using Informatica’s PowerCenter data integration suite) to Hadoop for improved performance and scalability. They have moved most of the batch processing to Hadoop, leveraging the technology benefits of that platform. They started with MapReduce and have now switched to Spark, without recoding.
The next phase will be to also process data in real-time/streaming mode to get even lower latencies for predictive maintenance and quicker response to disruptions.
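
The data catalog recommended in the notes (a place where business users can look up what data exists, who owns it, what it means, and what quality it has) can be illustrated with a minimal Python sketch. This is not Informatica's catalog product or API; the class names, fields, dataset names, owners, and quality scores below are all hypothetical and chosen only to show the idea.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata for one dataset. All field names are illustrative."""
    name: str
    owner: str
    meaning: str
    quality: float  # hypothetical score between 0.0 and 1.0
    tags: set[str] = field(default_factory=set)


class DataCatalog:
    """In-memory catalog that can be queried by keyword or by owner."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Re-registering a name overwrites the earlier entry.
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> list[CatalogEntry]:
        """Return entries whose name, meaning, or tags contain the keyword."""
        kw = keyword.lower()
        hits = [
            e for e in self._entries.values()
            if kw in e.name.lower()
            or kw in e.meaning.lower()
            or any(kw in t.lower() for t in e.tags)
        ]
        return sorted(hits, key=lambda e: e.name)

    def owned_by(self, owner: str) -> list[str]:
        """Return the names of all datasets registered to one owner."""
        return sorted(e.name for e in self._entries.values() if e.owner == owner)


# Example entries; dataset names, owners, and scores are made up.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    "roaming_events", "network-ops",
    "Detailed roaming activity per subscriber", 0.9, {"roaming", "detail"}))
catalog.register(CatalogEntry(
    "daily_usage", "finance",
    "Daily aggregated network usage", 0.7, {"aggregate"}))

for entry in catalog.search("roaming"):
    print(entry.name, entry.owner, entry.quality)
```

A real catalog would harvest this metadata automatically from source systems rather than rely on manual registration; the sketch only shows the kind of questions a business user could ask of it.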