NoSQL Simplified: Schema vs. Schema-less
- 1. The Database for Big Data Solutions
NoSQL Simplified: Schema vs Schema-less
Leon Guzenda & Nick Quinn
Meetup - February 20, 2014
© Objectivity, Inc. 2014
- 3. Objectivity, Inc.
• Headquartered in San Jose, CA
• Over two decades of NoSQL and Big Data experience
• Enables complex data virtualization and Big Data
solutions for the enterprise
• Software products:
  • Objectivity/DB
  • InfiniteGraph
  • InfiniteGraph Social App
• Embedded in hundreds of enterprises, government
organizations and products, with millions of
deployments.
- 4. Objectivity/DB
• Fully distributed object database.
• Handles complex, highly inter-related data.
• Extremely fast navigational access.
• Scalable collections and B-Tree indices
• ACID transactions plus Multi-Reader, One Writer mode.
• Highly scalable - Single Logical View plus simple servers
• Parallel Query Engine and Relationship Analytics
• Fully interoperable C++, C#, Java, Python and SQL++ on
Windows, Unix, Linux and Mac OS X.
- 6. InfiniteGraph
• Fully distributed graph database
• High throughput and scalability
• Extremely fast navigational access
• ACID transactions for online operation
• Relaxed consistency during batch-mode parallel ingest
• Parallel queries
• Flexible indexing, including Lucene for text
• Java API and Gremlin support
- 7. Graph DBMS - Finding The Links
[Figure: panels labeled "Other Database(s)" and "Graph Database"]
- 8. Objectivity’s Disruptive Big Data Architecture
Uses Data Virtualization to hide the nodes and focus on the connections
- 10. Who's Who?
• Schema:
  • Network [CODASYL] databases - DDL [1972]
  • Relational Databases - Data Dictionary
  • Object Databases - ODMG'93
  • Most Graph Databases
• Schema-less:
  • KSAM/ISAM/DSAM/ESAM
  • IMS (hierarchical)
  • Pick OS Database (hash-tables)
  • MUMPS (hierarchical array-storage)
  • MongoDB - a specialized JSON (and JSON-like) document store.
  • CouchDB - a JSON document store.
- 11. Schema: Pros...
• Global data definitions
• Optimal access
• Enables Query By Example
• Interoperability
• Schema change control
• Schema contents can be manipulated via standard APIs and tools
- 12. ...Schema: Pros
• Global data definitions:
  • Data types and the relationships between them
  • Makes queries more efficient
  • Actions can be restricted by data type, field values, relationship types
• Optimal access:
  • Used to determine how to best store, manage and access particular data types
• Enables Query By Example by showing:
  • Types of information available
  • Relationships between them
• Interoperability:
  • DBMS can change the shape of data items to suit the language/environment
• Schema change control:
  • Can be used to enforce workflows that will keep applications and data in sync.
• Schema contents can be manipulated via standard APIs and tools:
  • Easier learning curve
  • Uniform security controls:
    • The schema can use the same security controls as the data
  • Query and visualization tools can be used for both data and schema
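
To make the points about global data definitions and Query By Example concrete, here is a minimal Python sketch. It is not Objectivity code: the `Person` class and the `validate` and `describe` helpers are hypothetical names chosen for illustration, and a dataclass merely stands in for a declared schema. It shows how declared types let a system reject mismatched values and let a tool list the information available.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Hypothetical schema: the declared field types are the data definition."""
    name: str
    zip_code: str


def validate(record):
    """Reject a record whose field values do not match the declared types."""
    for f in fields(record):
        value = getattr(record, f.name)
        if not isinstance(value, f.type):
            raise TypeError(
                f"{f.name}: expected {f.type.__name__}, got {type(value).__name__}"
            )
    return record


def describe(cls):
    """List field names and types, the metadata a Query By Example tool would show."""
    return {f.name: f.type.__name__ for f in fields(cls)}


ok = validate(Person(name="Ann", zip_code="01234"))  # passes the type check
available = describe(Person)  # field names mapped to type names
```

Because the definition lives in one place, the same metadata serves validation, tooling and documentation, which is the interoperability argument made on this slide.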
- 13. Schema: Cons
• The database designer and application developers have to create and maintain the schema.
• Applications have to be kept in sync with schema changes.
• Applications and programmers have to be aware of data types
  • Though this is one of the major claimed advantages of object-oriented programming.
• There is a perceived loss of flexibility
  • Though this is more a function of the user interface to the database than of the underlying mechanisms.
- 15. …Schema-less: Pros
• Flexibility - Users can, in theory:
  • Put any kind of data into the system
  • Create new kinds of relationships between things (in a few products)
  • Find data without worrying about the types of data involved.
• Can be more tolerant of variable ACID and consistency models
• Ease of use and maintenance:
  • No need to worry about data types
  • No need for a DBA
  • Applications will [probably] work when new data arrives
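
The flexibility described above can be sketched in a few lines of Python. This is an illustration only: `DocumentStore` is a hypothetical in-memory stand-in, not the API of MongoDB, CouchDB or any other product. Documents of any shape go in, and an existing query keeps working when a new field shows up.

```python
class DocumentStore:
    """Hypothetical in-memory schema-less collection (illustration only)."""

    def __init__(self):
        self._docs = []

    def put(self, doc: dict) -> None:
        # No type or shape check: any dictionary is accepted.
        self._docs.append(doc)

    def find(self, **criteria):
        # Match on whatever keys the caller names; missing keys simply do not match.
        return [
            d for d in self._docs
            if all(d.get(k) == v for k, v in criteria.items())
        ]


store = DocumentStore()
store.put({"kind": "person", "name": "Ann"})
store.put({"kind": "sensor", "id": 7, "readings": [1.5, 2.0]})
# A later document adds an "email" field; the existing query is unaffected.
store.put({"kind": "person", "name": "Bob", "email": "bob@example.com"})

people = store.find(kind="person")
```

The trade-off is that nothing stops a caller from storing a misspelled key or a value of the wrong type, which is where the integrity concerns on the following slides come from.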
- 19. Schema-less: Cons
• Apparent tolerance of variable CAP models is actually orthogonal to the schema vs schema-less debate [as is support for sharding].
• Performance suffers
• Integrity is practically non-existent
  • Maintaining referential integrity is hard
  • Queries may misinterpret values within an object
    • 54686973206973206120737472696e6720706c7573206120666c6f6174696e6720706f696e74206e756d62657258585858706c757320616e6f7468657220737472696e67
    • [Callout: Floating Point]
• A ZIP code may be stored as an integer (1234, because a JSON number cannot keep the leading zero of 01234) or as a string (“01234”) in JSON, causing query and display problems.
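
Both misinterpretation problems can be reproduced with Python's standard library. The sketch below decodes the byte string from the slide and then runs a ZIP code query over two JSON documents. The 32-bit little-endian layout of the floating point field is an assumption made for illustration, since the slide only labels part of the data as "Floating Point"; the document contents are hypothetical.

```python
import json
import struct

# The byte string shown on the slide, written as hexadecimal.
HEX = (
    "54686973206973206120737472696e6720706c7573206120666c6f"
    "6174696e6720706f696e74206e756d62657258585858706c757320"
    "616e6f7468657220737472696e67"
)
raw = bytes.fromhex(HEX)

# Reader 1 treats the whole object as ASCII text.
as_text = raw.decode("ascii")

# Reader 2 assumes (hypothetically) that the four "X" bytes are a
# 32-bit little-endian float. Without a schema, nothing says who is right.
start = as_text.index("XXXX")
(as_float,) = struct.unpack("<f", raw[start:start + 4])

# ZIP codes: a JSON number cannot carry a leading zero, so 01234 is rejected
# by the parser and the integer form has to be stored as 1234.
try:
    json.loads('{"zip": 01234}')
    leading_zero_ok = True
except json.JSONDecodeError:
    leading_zero_ok = False

docs = [json.loads('{"zip": 1234}'), json.loads('{"zip": "01234"}')]
# A query for the string form silently misses the integer form.
matches = [d for d in docs if d["zip"] == "01234"]
```

With a schema, the field would be declared once as a string or a float, and both the query and the display code could rely on that declaration.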
- 22. Objectivity/DB Schema Usage
• Has an internal schema in its system database (the Federated DB).
• User schemas are created and updated by:
  • Creating .ddl files and pre-processing them with the DDL processor.
  • Creating and compiling Java, C# or Python header files.
  • Declaring or dynamically creating/modifying Smalltalk classes (defunct).
  • Declaring and changing table definitions with Objectivity/SQL++.
• SQL++ table/column definitions are updated automatically when classes are declared or modified using other languages.
  • This allows SQL++ to access C#, C++, Java and Python objects and vice-versa.
• A Federated Database can contain multiple named Schemas:
  • Reduces re-compilation and re-building after a localized schema change.
  • May facilitate security mechanisms in the future.
- 23. Objectivity Active Schema
• API and tools for creating, modifying, reading and deleting class definitions, which include association (relationship) definitions.
• If used with a dynamic language, such as Smalltalk, creating or modifying a class doesn't need to affect existing programs.
• In general, only generic access (via the ooObj base class) can be used without creating the files needed to recompile programs and methods for accessing the new object types.
• Helps application developers build tools that need to access the schema, e.g.:
  • Graphical query tools
  • Highly flexible object modeling capabilities for end users.
• An end-user, such as a field technician or an analyst:
  • Can add local object classes, populate, maintain and query them, but...
  • Cannot interfere with the correct operation of the pre-built applications.
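
The idea that a class can be added at run time without disturbing pre-built applications can be illustrated in plain Python. This is not the Active Schema API: `BaseObject`, `generic_report` and `FieldNote` are hypothetical names, and `BaseObject` only loosely plays the role of a generic base class such as ooObj.

```python
class BaseObject:
    """Hypothetical generic base class; existing code depends only on this."""

    def __init__(self, **values):
        self.__dict__.update(values)

    def describe(self):
        return {"class": type(self).__name__, **vars(self)}


def generic_report(objects):
    """A 'pre-built application' that uses only the generic interface."""
    return [obj.describe() for obj in objects]


# An end user defines a new local class at run time with type();
# generic_report is neither edited nor recompiled.
FieldNote = type(
    "FieldNote",
    (BaseObject,),
    {"summary": lambda self: f"{self.site}: {self.text}"},
)

note = FieldNote(site="Tower 12", text="antenna realigned")
report = generic_report([note])
```

Code written before `FieldNote` existed can still list and display its instances through the base class, while only new code needs to know about `summary`.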
- 25. Use Case 1 - Intelligence Gathering Framework…
• An integrated application development framework that focuses on adaptability.
• Dynamic modeling of entities, services and workflows.
• Versioning and temporality features support system evolution.
The screenshots show a location that is under surveillance and everything known about it in the database.
- 26. …Use Case 1 - Intelligence Gathering Framework
• Eliminates the mapping layer between the user-defined objects and the database.
• Performance and scalability.
• Active Schema facilitates object migration.
[Figure labels: Design and Information Feeds, Users, Database]
- 27. Use Case 2 - GDMO Framework
• Operations, Administration, and Maintenance interface for the CDMA system RF infrastructure
• Controls the Base Station Controller and Base Station Transceiver Subsystem
• GDMO* Schema and CMIP agent-manager messaging
• A SPARC-based BSC rack supports a peak load of 150,000 simultaneous callers
• Deployed in CDMA networks worldwide, including Sprint PCS
* GDMO is the Guidelines for the Definition of Managed Objects
- 28. Use Case 3 - Ontology Framework
• Uses standard objects to define a metaschema
• It is used to define concept templates
• They can be inherited from, combined or extended to support a “class specification”
• The data is combined with Horn Logic to build complex ontologies.
[Diagram labels: Schema, Concept, Logic, Class, Components, Relationship, Struct, Array, Field]
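
The slide does not say how the Horn Logic is evaluated, so the following is only a generic sketch of one common approach, forward chaining over propositional Horn clauses. The `forward_chain` function and the facts and rules are hypothetical examples, not part of the framework described.

```python
def forward_chain(facts, rules):
    """Derive every atom implied by the facts under the given Horn rules.

    Each rule is (body, head): if all atoms in body are known, head is added.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in known and all(atom in known for atom in body):
                known.add(head)
                changed = True
    return known


# Hypothetical ontology fragment: stored facts plus two Horn rules.
facts = {"sensor(s1)", "located_at(s1, site_a)"}
rules = [
    (("sensor(s1)",), "device(s1)"),
    (("device(s1)", "located_at(s1, site_a)"), "asset_of(s1, site_a)"),
]

derived = forward_chain(facts, rules)
```

The loop stops when a full pass adds nothing new, so the result is the smallest set of atoms closed under the rules.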
- 29. Summary
• Don’t confuse CAP issues with Schema
considerations
• Schemas make the DBMS more powerful
• Schema-less architectures are more flexible
• It’s possible to build flexible systems with
Schema-based infrastructure
- 30. THANK YOU
• Please visit objectivity.com for:
  • Features
  • Use Cases
  • White Papers
  • Free downloads (60 day evaluation)
  • Sample Applications
  • Application Developer’s Wiki
• For further information:
  • Email: info@objectivity.com