Elevate MongoDB with ODBC/JDBC
Sumit Sarkar, Principal Systems Engineer
DataDirect, A Progress Software Company
www.linkedin.com/in/meetsumit
@SAsInSumit
- 3. © 2013 Progress Software Corporation. All rights reserved.
Welcome MongoDB connectivity to the DataDirect family
Big Data/NoSQL
Apache Hadoop Hive
Cloudera
Hortonworks
Pivotal HD
MapR
EMR
Pivotal HAWQ
Cloudera Impala
MongoDB
Cassandra (preview)
Spark SQL (preview)
SAP HANA (ODBC preview)
Data Warehouses
Amazon Redshift
SAP Sybase IQ
Teradata
Pivotal Greenplum
Relational
Oracle DB
Microsoft SQL Server
IBM DB2
MySQL
PostgreSQL
IBM Informix
SAP Sybase
Pervasive SQL
Progress OpenEdge
Progress Rollbase
SaaS/Cloud
Salesforce.com
Database.com
FinancialForce
Veeva CRM
ServiceMAX
Any Force.com App
Hubspot
Marketo
Microsoft Dynamics CRM
Microsoft SQL Azure
Oracle Eloqua
Oracle Service Cloud
Google Analytics
EDI/XML/Text
EDIFACT
EDIG@S
EANCOM
X12
IATA
Healthcare EDI: X12, HIPAA, ICD-10, HL7
Custom EDI
Flat files: CSV, TSV, dBase, Clipper, Foxpro, Paradox
Text Files
Any
SDK
SequeLink Socket Server
Customer Engineering
- 4.
ODBC/JDBC applications with MongoDB
- 5.
Why connect to MongoDB with SQL access?
Analytics and Data Visualization
Reporting
Data Virtualization / Federation
Data Warehousing
- 6.
Elevate your MongoDB data value with open database standards
Business Intelligence Data Integration
ODBC/JDBC
- 7.
Why should MongoDB developers care about SQL?
Your applications are awesome and the data deserves open access
Expand the footprint of MongoDB where enterprise reporting requirements prevent adoption
Share analytics, reporting, and integration work with your colleagues
- 8.
Select MongoDB ODBC/JDBC use cases from our user base
Network Security: Build reports using the Microsoft BI Stack for all incoming network traffic stored in MongoDB
Enhance Operational Systems: Store order details that require SAP reporting, sourced from an IBM order management system, in a MongoDB repository
Visual Analytics: Use Tableau to determine marketing campaign success from data in MongoLabs, including clicks, videos, social shares, etc.
Complex Analysis: Build cubes for intelligence using complex MongoDB documents that store clinical trial data
- 9.
How SQL access to NoSQL works
- 10.
MongoDB: an open source, NoSQL document (JSON) database
MongoDB Data Model
Sample JSON Document:
{ name: "sue", age: 26, status: "A", groups: ["news", "sports"] }
Relational database design focuses on data storage
NoSQL document database design focuses on data use
- 11.
Introduced tools to perfect our intelligent schema
[Diagram labels: DataDirect ODBC/JDBC Driver; Metadata; Schema Tool; Business Intelligence; Data Integration]
- 12.
MongoDB | Schema Design Comparison
NoSQL Document Design
Collection: users
{ user: {
    first: "Brody",
    last: "Messmer", ...
  },
  purchases: [
    { symbol: "PRGS", date: "2013-02-13", price: 23.50, qty: 100, ...},
    { symbol: "PRGS", date: "2012-06-12", price: 20.57, qty: 100, ...},
    ...
  ]
}
...
VS
Relational Design
Table: users
user_id  first  last     …
123456   Brody  Messmer  …
…
Table: purchases
user_id  symbol  date        price  qty  …
123456   PRGS    2013-02-13  23.50  100  …
123456   PRGS    2012-06-12  20.57  100  …
…
- 13.
MongoDB | Progress DataDirect Approach – Normalize
Collection Name: stock
{ symbol: "PRGS",
  purchases: [
    {date: ISODate("2013-02-13T16:58:36Z"), price: 23.50, qty: 100},
    {date: ISODate("2012-06-12T08:00:01Z"), price: 20.57, qty: 100,
     sellDate: ISODate("2013-08-16T12:34:58Z"), sellPrice: 24.60}
  ]
}
Table Name: stock
_id  symbol
1    PRGS
Table Name: stock_purchases
stock_id  Date                 Price  qty  sellDate             sellPrice
1         2013-02-13 16:58:36  23.50  100  NULL                 NULL
1         2012-06-12 08:00:01  20.57  100  2013-08-16 12:34:58  24.60
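The normalization on this slide can be illustrated with a small Python sketch. This is not the DataDirect driver's implementation; the `normalize` function, its parameters, and the `_id` value of 1 (taken from the table on the slide) are assumptions made for illustration only.

```python
from datetime import datetime


def normalize(collection_name, documents, array_field):
    """Split documents into parent rows and child rows linked by a foreign key.

    Hypothetical helper: illustrates the idea on the slide, not the driver's code.
    """
    parent_rows, child_rows = [], []
    fk_name = f"{collection_name}_id"
    for doc in documents:
        # Top-level fields (everything except the nested array) form the parent row.
        parent_rows.append({k: v for k, v in doc.items() if k != array_field})
        # Each array element becomes one child row that points back to its parent.
        for item in doc.get(array_field, []):
            child_rows.append({fk_name: doc["_id"], **item})
    # Child columns are the union of all keys seen, in first-seen order.
    columns = []
    for row in child_rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    # Fields missing from a given element become None, i.e. SQL NULL.
    child_rows = [{c: row.get(c) for c in columns} for row in child_rows]
    return parent_rows, child_rows


stock_doc = {
    "_id": 1,  # assumed value, matching the _id column shown on the slide
    "symbol": "PRGS",
    "purchases": [
        {"date": datetime(2013, 2, 13, 16, 58, 36), "price": 23.50, "qty": 100},
        {"date": datetime(2012, 6, 12, 8, 0, 1), "price": 20.57, "qty": 100,
         "sellDate": datetime(2013, 8, 16, 12, 34, 58), "sellPrice": 24.60},
    ],
}

stock, stock_purchases = normalize("stock", [stock_doc], "purchases")
```

Here `stock` holds the parent row and `stock_purchases` holds two child rows, the first with `sellDate` and `sellPrice` set to `None` because that purchase has not been sold.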
- 14.
Demo | MongoDB
- 15.
Tableau and MongoDB ODBC
- 17.
Lessons Learned
- 18.
Challenges we faced in building a SQL interface to NoSQL
Mapping SQL to MongoDB query language
• Queries are written as BSON, but exposed in Mongo “drivers” as an API.
• No support for joins
• Lack of implicit conversions for query filters (where clause)
• Sorting is not ANSI SQL compliant
Non-relational schema
• Document oriented data model with complex data types (denormalized)
• Self-describing schema -- Can only discover columns by selecting data
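To make the query-mapping challenge concrete, here is a minimal Python sketch that turns a simple SQL where clause into a MongoDB filter document. The `where_to_filter` function and the triple-based input are hypothetical; a real driver has to parse full SQL. The `$eq`, `$ne`, `$gt`, `$gte`, `$lt`, `$lte`, and `$and` operators are standard MongoDB query operators.

```python
# Map SQL comparison operators to MongoDB query operators.
SQL_TO_MONGO = {
    "=": "$eq", "<>": "$ne",
    ">": "$gt", ">=": "$gte",
    "<": "$lt", "<=": "$lte",
}


def where_to_filter(conditions):
    """Translate (column, operator, value) triples, implicitly ANDed,
    into a MongoDB filter document. Hypothetical helper for illustration."""
    clauses = [{column: {SQL_TO_MONGO[op]: value}} for column, op, value in conditions]
    if not clauses:
        return {}  # an empty filter matches every document
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


# SELECT * FROM users WHERE age >= 21 AND status = 'A'
mongo_filter = where_to_filter([("age", ">=", 21), ("status", "=", "A")])
```

Note that the value types pass through unchanged. Because MongoDB does not convert between strings and numbers when filtering, a filter built with the string "21" would not match documents where `age` is stored as a number.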
- 19.
SQL to MongoDB Lessons Learned with our users
Managing logical schema across BI metadata modeling clients and servers/clusters
Important to consider cross collection joins in stress tests on BI servers
Different BI tools require specific properties such as max varchar sizes on string data types, or field names that start with an underscore.
Hard to anticipate document structures and levels of nesting, so we support unlimited nesting depth
Lack of enforced data types from document to document (row to row)
• Driver attempts to convert to the SQL Type defined. If conversion fails, NULL is returned.
• MongoDB does not implicitly convert to/from strings when applying filters (where clause)
– Driver is capable of filtering client side via a hidden option
Field names of different case can result in collisions from document to document
• Unless “uppercaseIdentifiers”=true is set, the driver exposes these fields as separate columns
Must sample data to generate schema on read; the sample size is configured with “ColumnDiscoverySampleSize”
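Several of these lessons can be sketched together in Python: sampling documents to discover columns, folding field-name case, and returning NULL when a value does not convert to the column's type. The functions below are hypothetical and simplified; their `sample_size` and `uppercase_identifiers` parameters only mimic the intent of the driver options named on the slide and are not the driver's actual behavior.

```python
def discover_columns(documents, sample_size, uppercase_identifiers=False):
    """Infer column names and types from the first `sample_size` documents."""
    columns = {}
    for doc in documents[:sample_size]:
        for field, value in doc.items():
            name = field.upper() if uppercase_identifiers else field
            # Simplification: the first type seen for a column wins.
            columns.setdefault(name, type(value))
    return columns


def to_row(doc, columns, uppercase_identifiers=False):
    """Convert one document to a row; failed conversions become None (SQL NULL)."""
    values = {(k.upper() if uppercase_identifiers else k): v for k, v in doc.items()}
    row = {}
    for name, column_type in columns.items():
        try:
            row[name] = column_type(values[name]) if name in values else None
        except (TypeError, ValueError):
            row[name] = None
    return row


docs = [
    {"name": "sue", "age": 26},
    {"Name": "bob", "age": "unknown"},               # different case, unconvertible age
    {"name": "amy", "age": "31", "city": "Boston"},  # "city" lies beyond a sample of 2
]

columns = discover_columns(docs, sample_size=2)
folded = discover_columns(docs, sample_size=2, uppercase_identifiers=True)
```

With a sample size of 2, `columns` contains both `name` and `Name` as separate columns and misses `city` entirely, while `folded` merges the two name fields into a single `NAME` column. Raising the sample size to 3 would pick up `city`.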
Editor's Notes
- “Elevate MongoDB with ODBC/JDBC
Adoption for MongoDB is growing across the enterprise and disrupting existing business intelligence, analytics and data integration infrastructure. Join us to disrupt that disruption using ODBC and JDBC access to MongoDB for instant out-of-box integration with existing infrastructure to elevate and expand your organization’s MongoDB footprint. We'll talk about common challenges and gotchas that shops face when exposing unstructured and semi-structured data using these established data connectivity standards. Existing infrastructure requirements should not dictate developers’ freedom of choice in a database.”
- 350+ ISVs
10,000 DEUs
- Microstrategy
IBM Cognos
Crystal Reports
Tableau
Qlikview
SQLServer IntegrationServices
IBM DataStage Infosphere
SQLServer LinkedServer
SAS
Sqoop
Pentaho
SAP BusinessObjects
Informatica PowerCenter
SAP DataServices
Tibco Spotfire
OBIEE
Sybase ECDA
IBM Federation Server
SPSS
Syncsort DMExpress
Talend
- Not here to get into a SQL is better than NoSQL argument
- Our approach provides the most natural and flexible representation of nested, multi-value data to relational applications
We examine the data stored and normalize nested data/tuples into second-normal relational form based on a fixed schema
‘Top-level’ data becomes the parent table and nested data become ‘virtual’ tables with FK relation to the parent table
Defined using schema definition tool
Transparent to the application!
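From the application's side, the parent and 'virtual' child tables behave like ordinary relational tables, so a standard join works. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for an ODBC/JDBC connection; the table and column names follow the stock example in the slides, and nothing here calls the actual driver.

```python
import sqlite3

# In-memory SQLite database standing in for the relational view the driver exposes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (_id INTEGER PRIMARY KEY, symbol TEXT);
    CREATE TABLE stock_purchases (
        stock_id INTEGER REFERENCES stock(_id),
        date TEXT, price REAL, qty INTEGER, sellDate TEXT, sellPrice REAL
    );
    INSERT INTO stock VALUES (1, 'PRGS');
    INSERT INTO stock_purchases VALUES
        (1, '2013-02-13 16:58:36', 23.50, 100, NULL, NULL),
        (1, '2012-06-12 08:00:01', 20.57, 100, '2013-08-16 12:34:58', 24.60);
""")

# The application joins parent and child through the foreign key, as with any SQL source.
rows = conn.execute("""
    SELECT s.symbol, p.date, p.price, p.qty, p.sellPrice
    FROM stock AS s
    JOIN stock_purchases AS p ON p.stock_id = s._id
    ORDER BY p.date
""").fetchall()
```

Each result row pairs the parent's `symbol` with one purchase, and the unsold purchase comes back with a NULL `sellPrice`.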
- http://www.mongodb.com/presentations/webinar-user-data-management-mongodb
Flexible data model
Indexes and Query API
Easy for developers
High Performance and Scalable
Account Info, Activity Streams, Social Networks