Modern Data Integration Expert Session Webinar
- 2. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 2
Unlock Potential
William McKnight
www.mcknightcg.com
214-514-1444
Modern Data Integration - Expert Sessions
@williammcknight
- 3. Data is the Most Important Asset in the World
• We trade it for services instead of money
• Our information is exploding
• Business is moving to real-time, all the time
• Our information differentiates us from our
competitors
• Information is a key business asset
- 4. Corporate Initiatives
80% of Initiatives That Matter are about DATA
• Budget
• Energy
80% of Initiatives should be Business-Focused
• ROI
• Resource-Leveled
- 5. Data Maturity is Highly Correlated to Business Success
[Chart: Data Maturity plotted against Business Success]
- 6. The Money Tree Doesn’t Exist
Hitch your Architecture and Maturity Efforts to an Application Budget
- 7. AI is disruptive
Data is the Foundation
- 8. Choosing a Platform: 3 Major Decisions
Decision #1: The Data Store Type
• The data profile is the largest factor in choosing between a database and a file-based scale-out system. The latter is
best for data that fits the loose label of 'unstructured' (or semi-structured), while more traditional data, and smaller
volumes of all data, still belong in a relational database.
Decision #2: Data Store Placement
• You must also decide where to place your data store -- on-premises or in the cloud (and which cloud). In the past, the only clear
choice for most organizations was on-premises data. However, the costs of scale are gnawing away at the notion that this
remains the best approach for a data platform. For more on why databases are moving to the cloud, please read this article.
Decision #3: The Workload Architecture
• Finally, you must keep in mind the distinction between operational and analytical workloads. Short transactional requests and
more complex (often longer) analytics requests demand different architectures. Analytics databases, though quite diverse, are
the preferred platforms for the analytics workload.
- 9. Data Everywhere
And in Numerous Technical Forms
And in Numerous Clouds
- 10. Low Maturity Data Integration
- 11. Leverageable Vehicles
Data Warehouse
Master Data Management
Data Lake
- 12. Points of Data Integration
• Into the Data Warehouse(s)
• Into the Data Marts/cubes that do not integrate with the data warehouse
• Into the Data Marts/cubes that do integrate with the data warehouse
• Into Big Data platforms from sensor, clickstream, other systems
• Into Big Data platforms from Data Stream Processing
• Into the Master Data Management Hub from publishing/master systems
• From the Master Data Management Hub to every subscribing system (ERPs, NoSQL, Hadoop, data
warehouse, analytical databases, etc.)
• Between analytical stores
• Between operational stores
• Summaries of Big Data for the data warehouse and other analytical stores
• Data migrations for setting up new environments
• Etc.!
- 13. Modern Realities of Data Integration
Desire for consolidated methods for data integration
New types of data sources
• Logs, sensors, etc.
We have more than OLTP and OLAP
• Distributed data platforms
Desire for real-time data
High-velocity data increasingly needs integration
Traditional approaches, without Stream Processing, turn
into ETL + custom scripts + middleware + MQ
- 14. Real-Time Data
A.k.a. messaging, live feeds, real-time, event-driven
Comes in continuously and often quickly, so we call
it streaming data.
Needs special attention and can be of immense
value, but only if we are alerted in time.
Foundation for Artificial Intelligence
• Stream data forms the core of data for artificial
intelligence
- 15. Message-Oriented Middleware / Message Queueing Technology
An architectural component that deals with messages
Manages and distributes streaming data
• A message is any kind of data wrapped in a neat package with a very simple header
• “Producers” (systems, sensors, or devices that generate the messages) send them toward a “broker”
• The broker routes them into queues according to the information enclosed in the message header or its own routing process
• “Consumers” retrieve the messages from the queues to which they subscribe
• Consumers open the messages and perform some kind of action on them
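The producer, broker, queue, and consumer roles above can be sketched in a few lines of Python. This is a toy in-memory illustration, not any particular product's API: the `Message` and `Broker` classes, the `topic` header key, and the sample payloads are all assumptions made for the example.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Message:
    header: dict   # simple routing metadata, e.g. {"topic": "sensor"} (assumed shape)
    body: object   # any kind of payload


class Broker:
    """Toy in-memory broker: routes messages to queues by header topic."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def publish(self, message: Message) -> None:
        # Route on the header only; the body is never inspected.
        self.queues[message.header["topic"]].append(message)

    def consume(self, topic: str):
        # Hand the oldest message on the subscribed queue to the consumer.
        queue = self.queues[topic]
        return queue.popleft() if queue else None


# Producers send messages toward the broker.
broker = Broker()
broker.publish(Message({"topic": "sensor"}, {"temp_c": 21.5}))
broker.publish(Message({"topic": "clickstream"}, {"page": "/home"}))
broker.publish(Message({"topic": "sensor"}, {"temp_c": 22.0}))

# A consumer subscribed to "sensor" opens each message and acts on it.
readings = []
while (msg := broker.consume("sensor")) is not None:
    readings.append(msg.body["temp_c"])

print(readings)  # [21.5, 22.0]
```

The clickstream message stays on its own queue until a consumer subscribed to that topic asks for it, which is the decoupling the slide describes.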
- 16. Streaming Architecture
[Diagram labels: Apps; Streaming Platform; Change logs; Streaming data pipelines; Messaging or Stream processing; Request - Response; DW; Hadoop]
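One way to read the "change logs" feeding a downstream store such as the DW is as an ordered list of change events that is replayed to rebuild current state. The sketch below assumes a hypothetical event shape (`op`, `key`, `value`) and a plain dictionary standing in for the target store; it does not reflect any specific streaming platform.

```python
# Hypothetical change log emitted by an operational application, oldest first.
change_log = [
    {"op": "insert", "key": 1, "value": {"name": "Bach"}},
    {"op": "insert", "key": 2, "value": {"name": "Beethoven"}},
    {"op": "update", "key": 1, "value": {"name": "J. S. Bach"}},
    {"op": "delete", "key": 2, "value": None},
]


def apply_changes(log, store=None):
    """Replay change events in order to produce the current state."""
    store = {} if store is None else store
    for event in log:
        if event["op"] == "delete":
            store.pop(event["key"], None)
        else:
            # Inserts and updates both set the latest value for the key.
            store[event["key"]] = event["value"]
    return store


warehouse_copy = apply_changes(change_log)
print(warehouse_copy)  # {1: {'name': 'J. S. Bach'}}
```

Because the log is ordered, any number of downstream stores can replay it independently and arrive at the same state.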
- 17. Every Project is Burdened (with Grander Opportunity)
- 18. Data Success Measurement
• User Satisfaction
• Business ROI and growth instigated
• Data Maturity (Long-term User Sat and Bus ROI)
• Misc.
- 19. “Beyond the Mountain is another mountain.”
- 20. Champion Initiatives That Matter
Every single item on a company mission statement relates to data at some level
It is from the position of data expertise that the mission will be executed and company leadership will emerge
In this information economy, the performance of the company rests on the data professional, who has an obligation to demonstrate the possibilities and originate the architecture, data, and projects that will deliver
Being responsive to urgent requests is not enough to be the data leader that companies need
- 21. Unlock Potential
William McKnight
www.mcknightcg.com
214-514-1444
Modern Data Integration - Expert Sessions
@williammcknight
- 23. Problems with Normal Data Integration Processes
Data modeling. Too much time spent coping with slight changes
in our business data
Business/IT alignment. Data architects, DBAs, and others can’t
communicate with businesspeople
Processes. Too much detail lost by handing off responsibility for
business data to different people
- 24. Problem: Data Modeling
Too much time spent coping with slight changes in our business data
Johann Sebastian Bach
Given Middle Family
- 25. Problem: Data Modeling
Too much time spent coping with slight changes in our business data
Given | Middle | Patronymic | Art. | Family | Hon.
Johann | Sebastian | - | - | Bach | -
Dmitri | - | Dmitriyevich | - | Shostakovich | -
Mohamed | - | - | el | Mougi | -
Muhammad | - | - | al | Qasabgi | -
Ludwig | - | - | van | Beethoven | -
Yi | - | - | - | Chen | -
Repeated changes in operational systems’ row-and-column structures
- 26. Problem: Data Modeling
Ripple effects of changes in one system lead to changes in others
[Figure: the same name data (Given, Middle, Patronymic, Art., Hon., Family for Bach, Shostakovich, el Mougi, al Qasabgi, van Beethoven, and Chen Yi) repeated across three structures]
Operational, designed for transactions
Data warehouse, designed for abstractions
Data mart, designed for analysis
- 27. Problem: Business/IT Alignment
Data people often can’t communicate with businesspeople
Data architect thinks
• Model the data
• Govern the data
• Watch out for “quick fixes”
Business thinks
• Modeling, metadata are hindrances
• Analytical tools best without governance
• IT slows them down
That modeling stuff we just talked about
• IT: Gets it
• Business: Hates it
- 28. Problem: Processes
Too much information lost by distributing responsibility for business data
Cleansing occurs in transformation step: Different rules being fired
Different tools and metadata being used by platform
Loss of timestamps, context, before-and-after: No cross-platform auditability
No comprehensive rollback, alternate history, what-if
[Diagram labels: Operational application; Data warehouse; Cloud application; a "Family" field; two separate Transformation / Cleansing / Standardization steps]
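The loss described on this slide can be made concrete with a small sketch: the same source value passes through two transformation steps that fire different cleansing rules, and the results diverge. The two rule functions and the `audited` wrapper are hypothetical, written only to illustrate the point; the wrapper shows one possible way to keep the timestamp and before-and-after values that the slide says are usually lost.

```python
from datetime import datetime, timezone


def cleanse_for_warehouse(family: str) -> str:
    # Hypothetical rule set A: trim and title-case.
    return family.strip().title()


def cleanse_for_cloud_app(family: str) -> str:
    # Hypothetical rule set B: trim, upper-case, drop name particles.
    words = family.strip().upper().split()
    return " ".join(w for w in words if w not in {"VAN", "EL", "AL"})


def audited(step_name, rule, value, audit_log):
    """Apply a rule and record timestamp plus before-and-after values."""
    result = rule(value)
    audit_log.append({
        "step": step_name,
        "before": value,
        "after": result,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return result


audit_log = []
source_value = " van beethoven "
warehouse_value = audited("warehouse", cleanse_for_warehouse, source_value, audit_log)
cloud_value = audited("cloud_app", cleanse_for_cloud_app, source_value, audit_log)

print(warehouse_value)  # Van Beethoven
print(cloud_value)      # BEETHOVEN
```

With a shared audit log, both targets can be traced back to the same source value; without it, the two platforms simply disagree.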
- 29. Modern Data Integration
How much time do we spend mapping one set of rows and columns to another?
A modern solution: post-relational for data capture, transformation, subject-oriented storage (perhaps), and exchange; rich documents instead of relational models
[Diagram labels: Operational application; Data warehouse; Analytics]
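As a rough illustration of "rich documents instead of relational models", the sketch below stores each composer's name as a document that carries only the parts it actually has. The field names (`patronymic`, `article`, `family_first`) are assumptions chosen for the example, not a prescribed schema.

```python
# Fixed row-and-column shape: every new name part means a schema change
# and a remapping in every downstream copy.
COLUMNS = ("given", "middle", "family")

# Document shape: each record carries only the parts it actually has.
people = [
    {"given": "Johann", "middle": "Sebastian", "family": "Bach"},
    {"given": "Dmitri", "patronymic": "Dmitriyevich", "family": "Shostakovich"},
    {"given": "Ludwig", "article": "van", "family": "Beethoven"},
    {"family": "Chen", "given": "Yi", "family_first": True},
]


def display_name(person: dict) -> str:
    """Build a display name from whichever parts the document contains."""
    if person.get("family_first"):
        parts = [person.get("family"), person.get("given")]
    else:
        order = ("given", "middle", "patronymic", "article", "family")
        parts = [person.get(key) for key in order]
    return " ".join(p for p in parts if p)


def fits_fixed_columns(person: dict) -> bool:
    """True only if the record needs nothing beyond the original columns."""
    return set(person) <= set(COLUMNS)


for person in people:
    print(display_name(person), fits_fixed_columns(person))
```

Only the first record fits the original three columns; the others would each force a change to a row-and-column model, while the document form absorbs them without remapping.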
- 33. Modern Data Integration: The Omni-Gen Approach
We built software to make ourselves successful
Immediate capture in automatically generated data hub
Master data: business-user-oriented, subject-oriented
Rapid, integrated data quality rules
Mastered and transactional subjects
Rapid cycle times to keep the business engaged
Support and automatically apply best practices
- 34. Modern Data Integration: The Omni-Gen Approach
Extending Value
We built “persona models” for customer and supplier
Everything you get in Omni-Gen, plus
Pre-built models
Pre-built data quality rules
Pre-built match/merge rules
Pre-built data governance
Immediate 360° core view, unlimited extensions
Supports different consumers with different, but trusted, data
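For readers unfamiliar with the terms, a data quality rule, a match rule, and a merge rule might look something like the following. This is a generic sketch and does not represent Omni-Gen's rules or interfaces; the rule logic, field names, and sample records are all assumptions.

```python
def normalize(record: dict) -> dict:
    # Data quality rule (assumed): trim and lower-case text used for matching.
    return {k: v.strip().lower() if isinstance(v, str) else v
            for k, v in record.items()}


def is_match(a: dict, b: dict) -> bool:
    # Match rule (assumed): same email, or same name and postal code.
    a, b = normalize(a), normalize(b)
    if a.get("email") and a.get("email") == b.get("email"):
        return True
    if not a.get("name") or not a.get("postal_code"):
        return False
    return (a["name"], a["postal_code"]) == (b.get("name"), b.get("postal_code"))


def merge(a: dict, b: dict) -> dict:
    # Merge rule (assumed): keep the first non-empty value for each field.
    golden = dict(a)
    for key, value in b.items():
        if not golden.get(key):
            golden[key] = value
    return golden


# Two hypothetical source records for the same supplier.
crm = {"name": "Acme Corp", "email": "ap@acme.example", "postal_code": "", "phone": ""}
erp = {"name": "ACME CORP ", "email": "AP@acme.example", "postal_code": "75001",
       "phone": "214-555-0100"}

golden = merge(crm, erp) if is_match(crm, erp) else None
print(golden)
```

The merged record keeps the CRM's name and email and fills the empty postal code and phone from the ERP record.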
- 35. Omni-Gen: More Value in Far Less Time
Project timeline, in months: 1-3, 4-6, 12-18
Traditional
• Data management tools
• Build-it-yourself development environment
Omni-Gen
• Software solution with built-in best practices
• MDM, DQ, integration software with rules, automatically generated data vault, remediation portal, 360° viewer, history, data interfaces, APIs, and feeds
Omni for Persona
• Software solution with built-in best practices and complete master data models
• Data vault model, data onramps; MDM, data quality, and integration software; MDM and data quality rules, remediation portal, 360° viewer; data interfaces, APIs, history, and feeds; analytical foundation for dashboarding, advanced analytics, and more