The Database for Big Data Solutions

NoSQL Simplified:
Schema vs Schema-less
Leon Guzenda & Nick Quinn
Meetup - February 20, 2014
© Objectivity, Inc. 2014

Overview
• Objectivity, Inc.
• Pros & Cons:
  • Schema
  • Schema-less
• What We Provide
• A Compromise

Objectivity, Inc.
• Headquartered in San Jose, CA
• Over two decades of NoSQL and Big Data experience
• Enables complex data virtualization and Big Data solutions for the enterprise
• Software products:
  • Objectivity/DB
  • InfiniteGraph
  • InfiniteGraph Social App
• Embedded in hundreds of enterprises, government organizations and products, with millions of deployments.

Objectivity/DB
• Fully distributed object database
• Handles complex, highly inter-related data
• Extremely fast navigational access
• Scalable collections and B-Tree indices
• ACID transactions plus Multi-Reader, One Writer mode
• Highly scalable - Single Logical View plus simple servers
• Parallel Query Engine and Relationship Analytics
• Fully interoperable C++, C#, Java, Python and SQL++ on Windows, Unix, Linux and Mac OS X

ODBMS Deployments
[Figure: deployment examples in Data Fusion, Big Science, Monitoring & Response, Telecom Infrastructure and Complex Financial Systems]

InfiniteGraph
• Fully distributed graph database
• High throughput and scalability
• Extremely fast navigational access
• ACID transactions for online operation
• Relaxed consistency during batch-mode parallel ingest
• Parallel queries
• Flexible indexing, including Lucene for text
• Java API and Gremlin support

Graph DBMS - Finding The Links
[Figure: two panels labeled OTHER DATABASE(S) and GRAPH DATABASE]

Objectivity’s Disruptive Big Data Architecture
Uses Data Virtualization to hide the nodes and focus on the connections

Schema: Pros & Cons

Who's Who?
• Schema:
  • Network [CODASYL] databases - DDL [1972]
  • Relational Databases - Data Dictionary
  • Object Databases - ODMG'93
  • Most Graph Databases
• Schema-less:
  • KSAM/ISAM/DSAM/ESAM
  • IMS (hierarchical)
  • Pick OS Database (hash-tables)
  • MUMPS (hierarchical array-storage)
  • MongoDB - a specialized JSON (and JSON-like) document store.
  • CouchDB - a JSON document store.

Schema: Pros...
• Global data definitions
• Optimal access
• Enables Query By Example
• Interoperability
• Schema change control
• Schema contents can be manipulated via standard APIs and tools

...Schema: Pros
• Global data definitions:
  • Data types and the relationships between them
  • Makes queries more efficient
  • Actions can be restricted by data type, field values, relationship types
• Optimal access:
  • Used to determine how best to store, manage and access particular data types
• Enables Query By Example by showing:
  • Types of information available
  • Relationships between them
• Interoperability:
  • DBMS can change the shape of data items to suit the language/environment
• Schema change control:
  • Can be used to enforce workflows that will keep applications and data in sync.
• Schema contents can be manipulated via standard APIs and tools:
  • Easier learning curve
  • Uniform security controls:
    • The schema can use the same security controls as the data
  • Query and visualization tools can be used for both data and schema
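As a rough illustration of how a global data definition lets a system restrict actions by data type and field value, here is a minimal Python sketch. The `Field`, `PERSON_SCHEMA` and `validate` names are invented for this example and are not part of any Objectivity product.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass(frozen=True)
class Field:
    """One typed field in a (hypothetical) schema definition."""
    type_: type
    check: Optional[Callable[[Any], bool]] = None  # optional value constraint


# Hypothetical schema: one global definition shared by every application.
PERSON_SCHEMA: Dict[str, Field] = {
    "name": Field(str),
    "age": Field(int, check=lambda v: 0 <= v < 150),
    "zip_code": Field(str, check=lambda v: len(v) == 5 and v.isdigit()),
}


def validate(record: Dict[str, Any], schema: Dict[str, Field]) -> None:
    """Raise if the record does not match the schema; return None otherwise."""
    unknown = set(record) - set(schema)
    if unknown:
        raise KeyError(f"fields not in schema: {sorted(unknown)}")
    for name, field in schema.items():
        if name not in record:
            raise KeyError(f"missing field: {name}")
        value = record[name]
        if not isinstance(value, field.type_):
            raise TypeError(f"{name}: expected {field.type_.__name__}")
        if field.check is not None and not field.check(value):
            raise ValueError(f"{name}: value {value!r} rejected")


# A well-formed record passes silently.
validate({"name": "Ada", "age": 36, "zip_code": "01234"}, PERSON_SCHEMA)
```

Passing `zip_code` as the integer `1234` raises `TypeError`, and an `age` of `200` raises `ValueError`, so bad data is stopped at write time instead of surfacing later in a query.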

Schema: Cons
• The database designer and application developers have to create and maintain the schema.
• Applications have to be kept in sync with schema changes.
• Applications and programmers have to be aware of data types
  • Though this is one of the major claimed advantages of object-oriented programming.
• There is a perceived loss of flexibility
  • Though this is more a function of the user interface to the database than the underlying mechanisms.

Schema-less: Pros…
• Flexibility
• Can be more tolerant of variable ACID and Consistency models
• Ease of use and maintenance

…Schema-less: Pros
• Flexibility - Users can, in theory:
  • Put any kind of data into the system
  • Create new kinds of relationships between things (in a few products)
  • Find data without worrying about the types of data involved.
• Can be more tolerant of variable ACID and Consistency models
• Ease of use and maintenance:
  • No need to worry about data types
  • No need for a DBA
  • Applications will [probably] work when new data arrives
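The last point can be pictured with plain Python dictionaries standing in for schema-less documents. This is only a sketch; the field names are made up for the example.

```python
from typing import Any, Dict, List

# Schema-less "documents": the second one carries a field the
# application has never seen before.
documents: List[Dict[str, Any]] = [
    {"name": "sensor-1", "reading": 20.5},
    {"name": "sensor-2", "reading": 21.0, "firmware": "2.3"},
]


def average_reading(docs: List[Dict[str, Any]]) -> float:
    """Use only the fields this application knows about; ignore the rest."""
    values = [d["reading"] for d in docs if "reading" in d]
    return sum(values) / len(values)


print(average_reading(documents))  # 20.75
```

The unknown `firmware` field is simply ignored, so the application keeps working. The "[probably]" matters, though: if a later document stored `reading` as the string `"21.0"`, the `sum` call would raise `TypeError`.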

Schema-less: Cons…
• Confusion
• Performance suffers
• Poor integrity
• Ambiguity

…Schema-less: Cons
• Apparent tolerance of variable CAP models is actually orthogonal to the schema vs schema-less debate [as is support for sharding].
• Performance suffers
• Integrity is practically non-existent
  • Maintaining referential integrity is hard
  • Queries may misinterpret values within an object
    • 54686973206973206120737472696e6720706c7573206120666c6f6174696e6720706f696e74206e756d62657258585858706c757320616e6f7468657220737472696e67
      [Callout on the hex string: Floating Point]
  • A ZIP code may be stored as an integer (01234) or a string (“01234”) in JSON, causing query and display problems.
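The hex string on this slide decodes, as ASCII, to a sentence describing its own layout, with four 0x58 bytes where the floating point number sits. The sketch below assumes the "Floating Point" callout refers to those four bytes. It reads them both as text and as a 32-bit IEEE 754 float (byte order does not matter here because all four bytes are equal), then shows the ZIP code mismatch using Python's standard `json` module. Strict JSON does not allow leading zeros in numbers, so the integer form is written as 1234.

```python
import json
import struct

HEX = (
    "54686973206973206120737472696e6720706c7573206120666c6f"
    "6174696e6720706f696e74206e756d62657258585858706c757320"
    "616e6f7468657220737472696e67"
)
raw = bytes.fromhex(HEX)

# Reading 1: treat every byte as ASCII text.
as_text = raw.decode("ascii")
print(as_text)

# Reading 2: treat the four 0x58 bytes as one 32-bit float.
start = raw.index(b"XXXX")
(as_float,) = struct.unpack(">f", raw[start:start + 4])
print(as_float)

# ZIP code: the same value stored as a number and as a string.
docs = [json.loads('{"zip": 1234}'), json.loads('{"zip": "01234"}')]
matches = [d for d in docs if d["zip"] == "01234"]
print(len(matches))  # 1: the document holding the number is missed
```

Without a schema, nothing in the stored bytes says which reading is right: the same four bytes are the text `XXXX` or a very large float, and a query for the string `"01234"` silently skips the document that holds the number.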

The NoSQL Players
[Figure: NoSQL products grouped by category]
• Operational *: Intersystems, MarkLogic, McObject
• Object/Graph: Objectivity/DB, Progress, Versant; * AllegroGraph, InfiniteGraph, Neo4j, Titan
• Key-Value *: Berkeley DB, Cassandra, Redis, Riak, Voldemort
• Document: AppEngine, Cloudant, CouchDB, MongoDB, RavenDB, Couchbase
• Column Family: HBase, HyperTable, SimpleDB
* Fully or partially schema-less

A Compromise
Provide Flexibility With The Advantages Of Having A Schema

Objectivity/DB Schema Usage
• Has an internal schema in its system database (the Federated DB).
• User schemas are created and updated by:
  • Creating .ddl files and pre-processing them with the DDL processor.
  • Creating and compiling Java, C# or Python header files.
  • Declaring or dynamically creating/modifying Smalltalk classes (defunct).
  • Declaring and changing table definitions with Objectivity/SQL++.
• SQL++ table/column definitions are updated automatically when classes are declared or modified using other languages.
  • This allows SQL++ to access C#, C++, Java and Python objects and vice-versa.
• A Federated Database can contain multiple named Schemas:
  • Reduces re-compilation and re-building after a localized schema change.
  • May facilitate security mechanisms in the future.

Objectivity Active Schema
• API and tools for creating, modifying, reading and deleting class definitions, which include association (relationship) definitions.
  • If used with a dynamic language, such as Smalltalk, creating or modifying a class doesn't need to affect existing programs.
  • In general, only generic access (via the ooObj base class) can be used without creating the files needed to recompile programs and methods for accessing the new object types.
• Helps application developers build tools that need to access the schema, e.g.:
  • Graphical query tools
  • Highly flexible object modeling capabilities for end users.
• An end-user, such as a field technician or an analyst:
  • Can add local object classes, populate, maintain and query them, but...
  • Cannot interfere with the correct operation of the pre-built applications.
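The end-user scenario above can be sketched in a few lines of Python. This is a hypothetical in-memory model of the idea only; `SchemaRegistry`, `add_local_class` and `describe` are names invented for illustration and do not reflect the actual Active Schema API.

```python
from typing import Dict


class SchemaRegistry:
    """Hypothetical in-memory registry; not the Active Schema API."""

    def __init__(self, core_classes: Dict[str, Dict[str, type]]) -> None:
        # Class definitions used by the pre-built applications.
        self._core = {name: dict(fields) for name, fields in core_classes.items()}
        # Class definitions added later by end users.
        self._local: Dict[str, Dict[str, type]] = {}

    def add_local_class(self, name: str, fields: Dict[str, type]) -> None:
        """Add an end-user class; refuse to shadow a core class."""
        if name in self._core:
            raise PermissionError(f"{name} is a core class and cannot be redefined")
        self._local[name] = dict(fields)

    def describe(self, name: str) -> Dict[str, type]:
        """Return a copy of a class definition (core or local)."""
        source = self._core if name in self._core else self._local
        return dict(source[name])


registry = SchemaRegistry({"Site": {"name": str, "latitude": float, "longitude": float}})
registry.add_local_class("FieldNote", {"site": str, "text": str})
print(sorted(registry.describe("FieldNote")))  # ['site', 'text']
```

Keeping core and local definitions in separate maps is one simple way to let a technician or analyst extend the model while the classes the pre-built applications depend on stay untouched.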

Use Cases

Use Case 1 - Intelligence Gathering Framework… (1 of 2)
• An integrated application development framework that focuses on adaptability.
• Dynamic modeling of entities, services and workflows.
• Versioning and temporality features support system evolution.
[Figure: screenshots showing a location that is under surveillance and everything known about it in the database.]

…Use Case 1 - Intelligence Gathering Framework (2 of 2)
• Eliminates the mapping layer between the user-defined objects and the database.
• Performance and scalability.
• Active Schema facilitates object migration.
[Figure: diagram labeled Design and Information Feeds, Users, Database]

Use Case 2 - GDMO Framework
• Operations, Administration, and Maintenance interface for the CDMA system RF infrastructure
• Controls the Base Station Controller and Base Station Transceiver Subsystem
• GDMO* Schema and CMIP agent-manager messaging
• A SPARC-based BSC rack supports a peak load of 150,000 simultaneous callers
• Deployed in CDMA networks worldwide, including SprintPCS
* GDMO is the Guidelines for the Definition of Managed Objects

Use Case 3 - Ontology Framework
• Uses standard objects to define a metaschema
• It is used to define concept templates
• They can be inherited from, combined or extended to support a “class specification”
• The data is combined with Horn Logic to build complex ontologies.
[Figure: diagram with elements labeled SCHEMA, CONCEPT, LOGIC, CLASS, COMPONENTS, RELATIONSHIP, STRUCT, ARRAY, FIELD]

Summary
• Don’t confuse CAP issues with Schema considerations
• Schemas make the DBMS more powerful
• Schema-less architectures are more flexible
• It’s possible to build flexible systems with Schema-based infrastructure

THANK YOU
• Please visit objectivity.com for:
  • Features
  • Use Cases
  • White Papers
  • Free downloads (60 day evaluation)
  • Sample Applications
  • Application Developer’s Wiki
• For further information:
  • Email: info@objectivity.com

 • • • • • • Features Use Cases White Papers Free downloads (60 day evaluation) Sample Applications Application Developer’s Wiki " • For further information: " • Email: info@objectivity.com © Objectivity, Inc. 2014 !30