Driving Business Value Through
Agile Data Assets
Carl Olofson
Research Vice President, IDC
Agenda
 The Third Platform
 The Data Imperative
 Data in the Enterprise Today
 The Data Tsunami
 Getting the Data Under Control
 Benefits of Having Well-Defined and
Managed Data
 Conclusions/Recommendations
© IDC Visit us at IDC.com and follow us on Twitter: @IDC 2
Toward the Third Platform
The First Platform
 Fixed systems, statically defined data
 Running on terminal systems, performing
back-office tasks, only accessible internally
The Second Platform
 Distributed systems, accessible to non-technical
users
 Data shared across systems, visual GUI access
 Systems extended to the Web via static pages,
limited customer access to data and functions
The Third Platform
 Bridging internal and external data
 Large collections of data ingested
first, defined later.
 Social data inclusion, mobile
device interaction.
 Cloud services for elasticity.
 Value delivered for new classes of
applications and data use (digital
transformation).
Source: IDC
From Static to Dynamic Data
Management
In a dynamic world…
 Data must change dynamically, or may originate externally,
but still requires definition.
 Applications are coded in an event-driven manner,
responding to stimuli and “learning” as they go.
 Agility, adaptability, and elasticity are required.
In a static world…
 Data is defined to suit application needs.
 Applications are coded with fixed, serial processes.
 No agility, no adaptability, and change is hard.
Agile, But Managed Data
 New applications are emerging.
• Web-based customer-facing applications accessing
databases.
• Applications that interact with and coordinate app data on
mobile devices.
• Applications that respond to sensor and other
machine-generated data.
 Existing applications need adapting.
• Taking advantage of machine-generated data, social
media data, data from customers and partners.
• Blending analytic and transactional processing on a single
database.
 Both new and existing applications must be agile,
so their data must be agile.
Databases Are Changing
 New data technologies for new workloads.
• Hadoop – scalable but unmanaged.
• NoSQL – agile but without definitional formalism.
 Existing data technologies are evolving.
• Memory-optimized columnar data stores with SIMD
support for high-speed analytics.
• Memory-optimized row or matrix data stores for
high-speed transaction support.
• Late-binding schemas and agile schema support for
definition change without database restructuring.
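As a rough illustration of the late-binding idea, the Python sketch below keeps raw JSON records untouched and applies a definition only when the data is read. The record layout, the `SCHEMA_V1` and `SCHEMA_V2` names, and the `read_with_schema` helper are all assumptions made for this example, not features of any particular database.

```python
import json

# Raw records are stored as-is; no schema is enforced at write time.
RAW_EVENTS = [
    '{"id": "1", "amount": "19.99", "channel": "web"}',
    '{"id": "2", "amount": "5", "device": "mobile"}',
]

# Hypothetical read-time schema: field name -> (converter, default value).
SCHEMA_V1 = {"id": (int, None), "amount": (float, 0.0)}
# A later definition adds a field without rewriting the stored data.
SCHEMA_V2 = {**SCHEMA_V1, "channel": (str, "unknown")}


def read_with_schema(raw_lines, schema):
    """Parse each raw JSON line and bind the schema only at read time."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, (convert, default) in schema.items():
            value = record.get(field)
            row[field] = convert(value) if value is not None else default
        rows.append(row)
    return rows


rows_v1 = read_with_schema(RAW_EVENTS, SCHEMA_V1)
rows_v2 = read_with_schema(RAW_EVENTS, SCHEMA_V2)
```

Moving from `SCHEMA_V1` to `SCHEMA_V2` changes what readers see while the stored records stay exactly as they were, which is the sense in which the definition changes without restructuring.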
The Data
Imperative
 Dangers of unmanaged data definitions:
• Poor data quality, leading to exponential
damage to business processes due to high
speed integration.
• Lack of knowledge about sensitive data,
leading to risk of contractual or regulatory
noncompliance.
• Duplicate, errant, or missing data-driven
processes due to poor understanding of the
data.
 The process of digital transformation is
data-driven. The data must be well
understood.
Data in the Enterprise Today
Ungoverned
 Most enterprises do not have a data governance
initiative.
 Security definitions are fragmentary.
 A lack of MDM leads to inconsistent and incomplete
views of key enterprise data about customers,
partners, products, etc.
Fragmented
 Data is defined on an application-by-application
basis.
 Select data is defined in ETL for purposes of data
movement.
 Data warehouses hold a select subset; the rest is
not managed at an enterprise level.
The Data
Tsunami
 A huge wave of new data is coming fast.
• It’s not well defined.
• It’s high volume.
• It is critical to managing an agile business.
 The formats vary.
• Some is XML. Some is CSV. Some is… who knows?
• Some is managed by web applications in JSON.
 It needs to be ordered and interpreted, or
“curated”.
• All too often today, this is done by expensive data
scientists (not their job).
• It needs to be done by someone with an eye toward the
rest of the data in the enterprise.
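To make the curation step concrete, here is a minimal Python sketch that reads the same kind of customer record from CSV, JSON, and XML and coerces all three to one agreed definition. The sample data, the field names (`customer_id`, `city`), and the `curate` helper are invented for illustration; real feeds would need far more validation.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample feeds carrying the same kind of record in three formats.
CSV_TEXT = "customer_id,city\n101,Oslo\n"
JSON_TEXT = '[{"customer_id": 102, "city": "Lima"}]'
XML_TEXT = (
    "<customers><customer>"
    "<customer_id>103</customer_id><city>Pune</city>"
    "</customer></customers>"
)


def from_csv(text):
    """Each CSV row becomes a dict keyed by the header line."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def from_json(text):
    """The JSON feed is assumed to be a list of flat objects."""
    return [dict(item) for item in json.loads(text)]


def from_xml(text):
    """Each child of the root becomes a dict of tag -> text."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in element} for element in root]


def curate(records):
    """Coerce every record to one agreed definition: int id, str city."""
    return [
        {"customer_id": int(r["customer_id"]), "city": str(r["city"])}
        for r in records
    ]


curated = curate(from_csv(CSV_TEXT) + from_json(JSON_TEXT) + from_xml(XML_TEXT))
```

The single `curate` function is where the enterprise-wide definition lives, so whoever owns it needs to know how the rest of the enterprise defines a customer.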
Getting the Data Under Control
 The Old Data Modeling Process
• Waterfall: driven by a well-defined sequential project plan.
• Driven by application specification.
• Slow, formal approach to model recursion.
• Models all too often left on the shelf after initial implementation.
 The New Data Modeling Process
• Agile: data is constantly examined and redefined.
• Data comes in, and then is interpreted.
• Data models must be designed to anticipate change.
• Models must also anticipate and support alternative forms of
organization such as document (JSON, XML), wide column, etc.
• The target could be an RDBMS, but also Hadoop, a NoSQL or NewSQL
database, and others.
• Models should anticipate integration, and cross-system
collaboration.
• Governance and security must be considerations from the start.
[Process diagrams for the old and new data modeling processes. Labels: Specify, Model, Implement, Deliver, Feedback, Code, Need, Model, Implement, Revise, Review]
Benefits of Having Well-Defined and
Managed Data
Agility
 Both analytical and transactional systems adapt to changing business conditions and
new data.
 Data sharing can be more informal, leading to greater insights through collaboration.
Lower Risk
 Well-defined data is easier to secure.
 Knowing where the sensitive data is located is a key to proper protection from possible
compliance liability.
More Business Opportunity
 When data is well understood and leveraged across systems, it can be better
exploited. This is a key to success on the Third Platform.
 Adaptability means being able to take advantage of opportunities in the moment. Data
that is both transactional and analytical can enable smart applications.
Conclusions/Recommendations
Conclusions
 As businesses evolve toward the Third
Platform, they must be prepared to embrace
Digital Transformation.
 This means being able to blend existing data in
new and unpredictable ways, and to leverage
new data on new data management
technologies.
 It also means modeling data in ways that
support the above, while ensuring data
security, lowering risk, and enabling
exploitation of opportunities that this new class
of data will deliver.
Recommendations
 Take an audit of your existing data assets, and ask the
question, “How well do I know where my data is, and
what it means?”
 Seek to define existing data through models, to ensure
its easy integration with other existing data sources,
and in preparation for new data sources.
 Look at tools and utilities that will support both the
definition and modeling of existing data sources, and
data in places like Hadoop, NoSQL, NewSQL
databases, and so on.
 Consider this an opportunity to leverage data
modeling to drive the enterprise to new levels of agility
and collaboration that will in turn ensure
competitiveness in the world of Digital Transformation.
EMBARCADERO TECHNOLOGIES
Driving Business Value Through
Agile Data Assets
Ron Huizenga
Senior Product Manager – ER/Studio
Agenda
• What’s happening with data?
• The new lifecycle
• Data landscape complexity
• Discovery & identification through models
– Specific capabilities
• What’s happening in reality?
• Concluding remarks
REFERENCES:
http://blog.qmee.com/wp-content/uploads/2013/07/Qmee-Online-In-60-Seconds2.png
http://techcrunch.com/2010/08/04/schmidt-data/
What’s Happening with Data?
What’s in your data lake (swamp)?
Information Refinery
Value and the New Lifecycle
Lifecycle: Discover, Document (Model), Integrate
Key Skill Sets
• Data Design & Management
• ETL and Software Development
• Data Analysis / Stats
• Business Analysis & Discovery
Value Delivered
• Validation
• Integration
• Enrichment
• Usability
Data Landscape Complexity
• Comprised of:
– Proliferation of disparate systems
– Mismatched departmental solutions
– Many database platforms
– Big data platforms
– ERP, SaaS
– Obsolete legacy systems
• Compounded by:
– Poor decommissioning strategy
– Point-to-point interfaces
– Data warehouse, data marts, ETL …
Data Archaeologist?
Discovery and Identification Through Models
• Identify candidate data sources
• Reverse engineer data sources into models
• Identify, name and define
• Classify through metadata
• Map “like” items across models
• Data lineage / chain of custody
• Repository
• Collaboration & publishing
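The reverse engineering and classification steps above can be sketched in a few lines of Python. This example reads the catalog of an in-memory SQLite database into a plain dictionary and tags columns whose names hint at sensitive content. It is only an illustration of the idea, not how ER/Studio works; the `SENSITIVE_HINTS` rule, the `reverse_engineer` helper, and the `customer` table are assumptions made for the sketch.

```python
import sqlite3

# Hypothetical classification rule: column names that suggest personal data.
SENSITIVE_HINTS = ("email", "ssn", "phone")


def reverse_engineer(conn):
    """Read table and column definitions from a SQLite catalog into a dict model."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        columns = []
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f"PRAGMA table_info('{table}')"
        ):
            columns.append({
                "name": name,
                "type": col_type,
                "required": bool(notnull),
                "primary_key": bool(pk),
                # Metadata tag attached during discovery.
                "sensitive": any(h in name.lower() for h in SENSITIVE_HINTS),
            })
        model[table] = columns
    return model


# Example source: a small table created only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT NOT NULL, city TEXT)"
)
model = reverse_engineer(conn)
```

Once the source is captured as a model with metadata attached, naming, mapping, and lineage work can proceed against the model instead of the live system.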
ER/Studio: Native Big Data Support
• MongoDB
– Diagramming
– Reverse & Forward Engineering (JSON, BSON)
– MongoDB certification for 2.x and 3.0
• Certified for HDP 2.1
– Forward and reverse engineering
– Hive DDL
• Additional MetaWizard capabilities for other
platforms
ER/Studio: Extended Notation for MongoDB
ER/Studio: Apply Naming Standards
• Can invoke with other wizards
– Generate Physical Model
– Compare & Merge
– XML Schema Generation
– Model Validation
• Can apply to model or sub-model at any
time
• Either Direction
• Selective review/apply
• Enabled by loose model coupling
• Name lockdown (freeze names)
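The "either direction" and "name lockdown" points can be illustrated with a small Python sketch that translates logical names to physical names and back through a shared abbreviation list. This is not ER/Studio's implementation; the `ABBREVIATIONS` glossary and the `to_physical` and `to_logical` helpers are hypothetical names chosen for the example.

```python
# Hypothetical glossary of logical words and their physical abbreviations.
ABBREVIATIONS = {"customer": "CUST", "account": "ACCT", "number": "NBR", "date": "DT"}
EXPANSIONS = {abbr: word for word, abbr in ABBREVIATIONS.items()}


def to_physical(logical_name, frozen=frozenset()):
    """Logical -> physical; names in `frozen` are locked and pass through unchanged."""
    if logical_name in frozen:
        return logical_name
    words = logical_name.lower().split()
    # Words without an abbreviation are kept whole, in upper case.
    return "_".join(ABBREVIATIONS.get(w, w.upper()) for w in words)


def to_logical(physical_name):
    """Physical -> logical, expanding known abbreviations."""
    parts = physical_name.split("_")
    return " ".join(EXPANSIONS.get(p, p.lower()).capitalize() for p in parts)


physical = to_physical("Customer Account Number")
logical = to_logical("CUST_ACCT_NBR")
locked = to_physical("Legacy Name", frozen={"Legacy Name"})
```

Keeping one glossary for both directions means a logical and a physical model can be renamed independently and still be reconciled later.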
ER/Studio: Universal Mappings
• Ability to link “like” or related objects
– Within same model file
– Across separate model files
• Entity/Table level
• Attribute/Column level
ER/Studio: Attachment of Metadata Extensions
ER/Studio: Data Dictionary
Business Meaning: Glossary/Terms
ER/Studio: Glossary Integration
ER/Studio: Data Lineage
Enterprise Data Trends
• Increasing volumes, velocity, and variety of enterprise data: 30% - 50% year/year growth
• Decreasing % of enterprise data which is effectively utilized: 5% of all enterprise data fully utilized
• Increased risk from data misunderstanding and non-compliance: $600bn annual cost for data clean-up in U.S.
Business Stakeholders’ Data Usage
Suspect that business stakeholders INTERPRET DATA INCORRECTLY?
• Yes, frequently: 14%
• Yes, occasionally: 67%
• No, never: 9%
• I don’t know: 10%
Suspect that business stakeholders make decisions USING THE WRONG DATA?
• Yes, frequently: 11%
• Yes, occasionally: 64%
• No, never: 13%
• I don’t know: 12%
Data Model Usage & Understanding
What is your organization’s approach to data modeling?
• We don’t use data models: 13%
• Other: 3%
• Our data team does most data models but developers also build them as needed: 16%
• Our database administrators own data modeling: 19%
• Developers develop their own data models: 31%
• We have a data modeling team that is responsible for data models: 18%
How well does your organization’s technology leadership team understand the value of using data models?
• Completely understand: 20%
• Understand somewhat: 60%
• Don’t understand: 17%
• I don’t know: 3%
Slide callout: 87%
Call to Action
• Audit, map and define existing data assets using
models, with the capabilities discussed
• Share, collaborate, govern
• Leverage data modeling to enable business agility
• Adapt to the “new” lifecycle
• Instill a data culture based on a philosophy of
continuous improvement
Thank you!
• Learn more about the ER/Studio product family:
http://www.embarcadero.com/data-modeling
• Trial Downloads:
http://www.embarcadero.com/downloads
• To arrange a demo, please contact Embarcadero
Sales: sales@embarcadero.com, (888) 233-2224
