The document discusses the rise of graph databases and their benefits over traditional SQL databases. It notes four trends driving growth in data size, connectivity, semi-structured data, and decoupled architectures that have led to the rise of NoSQL databases including key-value, column-oriented, document, and graph databases. It provides an overview of the graph database model, which represents data as nodes and relationships, and an example using the graph database Neo4j.
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009), by Emil Eifrem
Presentation given at nosql east 2009 in Atlanta. Introduces the NOSQL space by offering a framework for categorization and discusses the benefits of graph databases. Oh, and also includes some tongue-in-cheek party poopers about sucky things in the NOSQL space.
5. Trend 2: connectedness
[Figure: information connectivity rising over time (1990, 2000, 2010, 2020) through web 1.0, web 2.0 and “web 3.0”: text documents, hypertext, RSS, blogs, wikis, user-generated content, tagging, folksonomies, RDF, ontologies, and the Giant Global Graph (GGG).]
6. Trend 3: semi-structure
Individualization of content!
In the salary lists of the 1970s, all elements had exactly one job
In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15?
Trend accelerated by the decentralization of content generation that is the hallmark of the age of participation (“web 2.0”)
7. Aside: RDBMS performance
[Figure: relational database performance plotted against data complexity. Along the complexity axis: salary list, majority of webapps, social network, semantic trading, with a bracket labeled “custom”.]
14. Four (emerging) NoSQL categories
Key-value stores
Based on Amazon's Dynamo paper
Data model: (global) collection of K-V pairs
Example: Dynomite, Voldemort, Tokyo
BigTable clones
Based on Google's BigTable paper
Data model: big table, column families
Example: HBase, Hypertable
15. Four (emerging) NoSQL categories
Document databases
Inspired by Lotus Notes
Data model: collections of K-V collections
Example: CouchDB, MongoDB
Graph databases
Inspired by Euler & graph theory
Data model: nodes, rels, K-V on both
Example: AllegroGraph, VertexDB, Neo4j
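To make the four data models on slides 14 and 15 a little more concrete, here is a minimal sketch that mimics each shape with plain Java collections. The class name, method names and sample keys are all hypothetical, and none of this is the API of any product listed above; it only contrasts how the data is laid out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: plain collections standing in for the four data models.
class DataModelShapes {

    // Key-value store: one (global) collection of key -> value pairs.
    static Map<String, String> keyValueStore() {
        Map<String, String> store = new LinkedHashMap<>();
        store.put("person:1:name", "Thomas Anderson");
        return store;
    }

    // BigTable clone: row key -> column family -> column -> value.
    static Map<String, Map<String, Map<String, String>>> bigTable() {
        Map<String, Map<String, Map<String, String>>> table = new LinkedHashMap<>();
        table.computeIfAbsent("person1", k -> new LinkedHashMap<>())
             .computeIfAbsent("profile", k -> new LinkedHashMap<>())
             .put("name", "Thomas Anderson");
        return table;
    }

    // Document database: a collection of documents, each itself a key-value collection.
    static List<Map<String, Object>> documentCollection() {
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("name", "Thomas Anderson");
        person.put("age", 29);
        List<Map<String, Object>> collection = new ArrayList<>();
        collection.add(person);
        return collection;
    }

    // Graph database: nodes plus relationships, with key-value properties on both.
    static List<Map<String, Object>> graphRelationships() {
        Map<String, Object> anderson = new LinkedHashMap<>();
        anderson.put("name", "Thomas Anderson");
        Map<String, Object> morpheus = new LinkedHashMap<>();
        morpheus.put("name", "Morpheus");

        Map<String, Object> knows = new LinkedHashMap<>();
        knows.put("type", "KNOWS");
        knows.put("start", anderson); // a relationship points at its nodes directly
        knows.put("end", morpheus);

        List<Map<String, Object>> relationships = new ArrayList<>();
        relationships.add(knows);
        return relationships;
    }

    public static void main(String[] args) {
        System.out.println(keyValueStore());
        System.out.println(bigTable());
        System.out.println(documentCollection());
        System.out.println(graphRelationships());
    }
}
```

The point of the contrast: in the first three shapes a link between two records is just another value that the application has to interpret, while in the graph shape the relationship is a first-class element with its own properties.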
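17. NoSQL data models
[Figure: data models plotted by size against complexity: key-value stores, Bigtable clones, document databases, graph databases. A bracket marks “90% of use cases”; a partly legible note reads “(This is still [...] of nodes & relationships)”.]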
20. The Graph DB model: representation
Core abstractions:
Nodes
Relationships between nodes
Properties on both
[Figure: a node with name = “Emil”, age = 29, sex = “yes”; a relationship with type = KNOWS, time = 4 years; a node with type = car, vendor = “SAAB”, model = “95 Aero”.]
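The three core abstractions (nodes, relationships, properties on both) can be sketched as a tiny in-memory model. This is a hypothetical illustration, not the Neo4j API: the class names, the `set` and `relateTo` helpers, and the second person node are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory property graph; not the Neo4j API.
class PropertyGraphSketch {

    // A node carries key-value properties and knows its outgoing relationships.
    static class Node {
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<Relationship> outgoing = new ArrayList<>();

        Node set(String key, Object value) {
            properties.put(key, value);
            return this;
        }

        Relationship relateTo(Node other, String type) {
            Relationship rel = new Relationship(this, other, type);
            outgoing.add(rel);
            return rel;
        }
    }

    // A relationship has a type, a direction (start -> end) and its own properties.
    static class Relationship {
        final Node start;
        final Node end;
        final String type;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Relationship(Node start, Node end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        Relationship set(String key, Object value) {
            properties.put(key, value);
            return this;
        }
    }

    // Builds a fragment of the slide's example: Emil KNOWS someone for 4 years.
    static Node buildExample() {
        Node emil = new Node().set("name", "Emil").set("age", 29);
        Node friend = new Node().set("name", "A friend"); // hypothetical second node
        emil.relateTo(friend, "KNOWS").set("time", "4 years");
        return emil;
    }

    public static void main(String[] args) {
        Node emil = buildExample();
        Relationship knows = emil.outgoing.get(0);
        System.out.println(emil.properties.get("name") + " -[" + knows.type + " "
                + knows.properties + "]-> " + knows.end.properties.get("name"));
    }
}
```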
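21. Example: The Matrix
[Figure: a graph of characters from The Matrix.
Nodes: name = “Thomas Anderson”, age = 29; name = “Morpheus”, rank = “Captain”, occupation = “Total badass”; name = “Trinity”; name = “Cypher”, last name = “Reagan”; name = “Agent Smith”, version = 1.0b, language = C++; name = “The Architect”.
Relationships: KNOWS between the characters, and a CODED_BY relationship linking Agent Smith and The Architect. Relationships carry properties such as disclosure = public, disclosure = secret, age = 3 days and age = 6 months.]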
22. Code (1): Building a node space
NeoService neo = ... // Get factory
// Create Thomas 'Neo' Anderson
Node mrAnderson = neo.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );
// Create Morpheus
Node morpheus = neo.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );
// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly
23. Code (1): Building a node space
NeoService neo = ... // Get factory
Transaction tx = neo.beginTx();
// Create Thomas 'Neo' Anderson
Node mrAnderson = neo.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );
// Create Morpheus
Node morpheus = neo.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );
// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly
tx.commit();
24. Code (1b): Defining RelationshipTypes
// In package org.neo4j.api.core
public interface RelationshipType
{
    String name();
}
// In package org.yourdomain.yourapp
// Example on how to roll dynamic RelationshipTypes
class MyDynamicRelType implements RelationshipType
{
    private final String name;
    MyDynamicRelType( String name ){ this.name = name; }
    public String name() { return this.name; }
}
// Example on how to kick it, static-RelationshipType-like
enum MyStaticRelTypes implements RelationshipType
{
    KNOWS,
    WORKS_FOR,
}
26. The Graph DB model: traversal
Traverser framework for high-performance traversing across the node space
[Figure: the example graph from slide 20 (the Emil node, the KNOWS relationship, the SAAB car node).]
27. Example: Mr Andersonʼs friends
[Figure: the Matrix graph from slide 21.]
28. Code (2): Traversing a node space
// Instantiate a traverser that returns Mr Anderson's friends
Traverser friendsTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    ReturnableEvaluator.ALL_BUT_START_NODE,
    RelTypes.KNOWS,
    Direction.OUTGOING );
// Traverse the node space and print out the result
System.out.println( "Mr Anderson's friends:" );
for ( Node friend : friendsTraverser )
{
    System.out.printf( "At depth %d => %s%n",
        friendsTraverser.currentPosition().getDepth(),
        friend.getProperty( "name" ) );
}
29. [Figure: the Matrix graph from slide 21, annotated with the traverser arguments from slide 28 (BREADTH_FIRST, END_OF_GRAPH, ALL_BUT_START_NODE, KNOWS, OUTGOING) next to the console output.]
$ bin/start-neo-example
Mr Anderson's friends:
At depth 1 => Morpheus
At depth 1 => Trinity
At depth 2 => Cypher
At depth 3 => Agent Smith
$
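What the traverser on slide 28 does can be sketched without Neo4j as an ordinary breadth-first search that follows outgoing KNOWS edges and skips the start node. This is a hypothetical stand-in, not the Traverser implementation: the adjacency map and the method names are assumptions, and the edges are chosen to be consistent with the depths printed on slide 29.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a breadth-first "friends" traversal; not the Neo4j Traverser.
class BreadthFirstSketch {

    // Outgoing KNOWS edges (assumed, consistent with the output on slide 29).
    static Map<String, List<String>> knows() {
        Map<String, List<String>> graph = new LinkedHashMap<>();
        graph.put("Thomas Anderson", Arrays.asList("Morpheus", "Trinity"));
        graph.put("Morpheus", Arrays.asList("Trinity", "Cypher"));
        graph.put("Cypher", Arrays.asList("Agent Smith"));
        return graph;
    }

    // Visits every node reachable from start, nearest first, and reports all but the start node.
    static List<String> traverse(Map<String, List<String>> graph, String start) {
        List<String> lines = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Map<String, Integer> depthOf = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();

        visited.add(start);
        depthOf.put(start, 0);
        queue.add(start);

        while (!queue.isEmpty()) {
            String current = queue.remove();
            int depth = depthOf.get(current);
            if (depth > 0) { // corresponds to ALL_BUT_START_NODE
                lines.add("At depth " + depth + " => " + current);
            }
            for (String next : graph.getOrDefault(current, Collections.emptyList())) {
                if (visited.add(next)) { // add() is false if already visited
                    depthOf.put(next, depth + 1);
                    queue.add(next);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("Mr Anderson's friends:");
        for (String line : traverse(knows(), "Thomas Anderson")) {
            System.out.println(line);
        }
    }
}
```

With the assumed edges, the sketch prints the same four lines as the console output on slide 29. The queue is what makes the order breadth-first; swapping it for a stack would give a depth-first order instead.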
30. Example: Friends in love?
[Figure: the Matrix graph from slide 21, with an added LOVES relationship.]
31. Code (3a): Custom traverser
// Create a traverser that returns all “friends in love”
Traverser loveTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    new ReturnableEvaluator()
    {
        public boolean isReturnableNode( TraversalPosition pos )
        {
            return pos.currentNode().hasRelationship(
                RelTypes.LOVES, Direction.OUTGOING );
        }
    },
    RelTypes.KNOWS,
    Direction.OUTGOING );
32. Code (3a): Custom traverser
// Traverse the node space and print out the result
System.out.println( "Who’s a lover?" );
for ( Node person : loveTraverser )
{
    System.out.printf( "At depth %d => %s%n",
        loveTraverser.currentPosition().getDepth(),
        person.getProperty( "name" ) );
}
33. [Figure: the Matrix graph from slide 30, annotated with the custom ReturnableEvaluator from slide 31 next to the console output.]
$ bin/start-neo-example
Who’s a lover?
At depth 1 => Trinity
$
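The idea behind the custom ReturnableEvaluator is that the traversal still follows KNOWS edges, but a separate test decides which visited nodes are returned. A minimal sketch of that split, using a plain `Predicate` in place of the evaluator, might look like this. It is not the Neo4j API: the edge table and method names are hypothetical, and the target of the LOVES relationship is an assumption, since only its source can be inferred from the output.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch: follow one relationship type, filter returned nodes with a predicate.
class ReturnableFilterSketch {

    // {start, type, end}; the LOVES target is an assumption for illustration.
    static final String[][] EDGES = {
        {"Thomas Anderson", "KNOWS", "Morpheus"},
        {"Thomas Anderson", "KNOWS", "Trinity"},
        {"Morpheus", "KNOWS", "Trinity"},
        {"Morpheus", "KNOWS", "Cypher"},
        {"Cypher", "KNOWS", "Agent Smith"},
        {"Trinity", "LOVES", "Thomas Anderson"},
    };

    // All end nodes of outgoing edges of the given type.
    static List<String> outgoing(String node, String type) {
        List<String> targets = new ArrayList<>();
        for (String[] edge : EDGES) {
            if (edge[0].equals(node) && edge[1].equals(type)) {
                targets.add(edge[2]);
            }
        }
        return targets;
    }

    // Breadth-first over followType edges; every visited node is tested by the predicate.
    static List<String> traverse(String start, String followType, Predicate<String> returnable) {
        List<String> returned = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (returnable.test(current)) {
                returned.add(current);
            }
            for (String next : outgoing(current, followType)) {
                if (visited.add(next)) {
                    queue.add(next);
                }
            }
        }
        return returned;
    }

    // "Friends in love": reachable via KNOWS, returned only if they have an outgoing LOVES.
    static List<String> friendsInLove() {
        return traverse("Thomas Anderson", "KNOWS",
                node -> !outgoing(node, "LOVES").isEmpty());
    }

    public static void main(String[] args) {
        System.out.println("Who's a lover? " + friendsInLove());
    }
}
```

Keeping "which edges to follow" apart from "which nodes to return" is the design choice worth noticing: the same traversal can answer a different question by swapping only the predicate.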
34. Bonus code: domain model
How do you implement your domain model?
Use the delegator pattern, i.e. every domain entity wraps a Neo4j primitive:
// In package org.yourdomain.yourapp
class PersonImpl implements Person
{
    private final Node underlyingNode;
    PersonImpl( Node node ){ this.underlyingNode = node; }
    public String getName()
    {
        // getProperty returns Object, so cast to the expected type
        return (String) this.underlyingNode.getProperty( "name" );
    }
    public void setName( String name )
    {
        this.underlyingNode.setProperty( "name", name );
    }
}
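For readers who want to try the delegator pattern without a database, the following self-contained sketch replaces the Neo4j node with a hypothetical `StandInNode` that is nothing more than a property map. `StandInNode`, the `Person` interface shown here and the wrapper class name are assumptions made for the example; only the shape of `PersonImpl` follows the slide.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, database-free version of the delegator pattern from the slide.
class DelegatorSketch {

    // Stand-in for a graph node: just a bag of properties (not the Neo4j Node type).
    static class StandInNode {
        private final Map<String, Object> properties = new HashMap<>();

        Object getProperty(String key) {
            return properties.get(key);
        }

        void setProperty(String key, Object value) {
            properties.put(key, value);
        }
    }

    // The domain interface the rest of the application programs against.
    interface Person {
        String getName();
        void setName(String name);
    }

    // The domain entity holds no state of its own; it delegates to the wrapped node.
    static class PersonImpl implements Person {
        private final StandInNode underlyingNode;

        PersonImpl(StandInNode node) {
            this.underlyingNode = node;
        }

        public String getName() {
            return (String) underlyingNode.getProperty("name");
        }

        public void setName(String name) {
            underlyingNode.setProperty("name", name);
        }
    }

    public static void main(String[] args) {
        StandInNode node = new StandInNode();
        Person person = new PersonImpl(node);
        person.setName("Trinity");
        // The value lives in the node, not in the Person object.
        System.out.println(node.getProperty("name"));
    }
}
```

Because the entity keeps no fields besides the node reference, two wrappers around the same node always see the same data, which is what makes the pattern a thin domain layer rather than a mapping layer.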
35. Domain layer frameworks
Qi4j (www.qi4j.org)
Framework for doing DDD in pure Java5
Defines Entities / Associations / Properties
Sound familiar? Nodes / Relʼs / Properties!
Neo4j is an “EntityStore” backend
NeoWeaver (http://components.neo4j.org/neo-weaver)
Weaves Neo4j-backed persistence into domain objects at runtime (dynamic proxy / cglib based)
Veeeery alpha
36. Neo4j system characteristics
Disk-based
Native graph storage engine with custom binary on-disk format
Transactional
JTA/JTS, XA, 2PC, Tx recovery, deadlock detection, MVCC, etc
Scales up (what's the x and the y?)
Several billions of nodes/rels/props on single JVM
Robust
6+ years in 24/7 production
37. Social network pathExists()
~1k persons
Avg 50 friends per person
pathExists(a, b) limit depth 4
Two backends
Eliminate disk IO so warm up caches
38. Social network pathExists()
[Figure: a small social network of persons: Emil, Mike, Kevin, John, Marcus, Bruce, Leigh.]
Backend                   # persons    Query time
Relational database       1 000        2 000 ms
Graph database (Neo4j)    1 000        2 ms
Graph database (Neo4j)    1 000 000    2 ms
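The slides do not show how pathExists(a, b) was implemented on either backend. As a rough sketch of what such a depth-limited check looks like over an in-memory friend graph, consider the following; it is a hypothetical illustration that does not reproduce the benchmark or its timings. The friendships between the names from the figure are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical depth-limited reachability check; not the benchmark code from the talk.
class PathExistsSketch {

    // True if b can be reached from a in at most maxDepth hops.
    static boolean pathExists(Map<String, Set<String>> friends, String a, String b, int maxDepth) {
        if (a.equals(b)) {
            return true;
        }
        Set<String> visited = new HashSet<>();
        visited.add(a);
        List<String> frontier = new ArrayList<>();
        frontier.add(a);
        // Expand one hop per iteration, never revisiting a person.
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String person : frontier) {
                for (String friend : friends.getOrDefault(person, Collections.emptySet())) {
                    if (friend.equals(b)) {
                        return true;
                    }
                    if (visited.add(friend)) {
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return false;
    }

    // Builds an undirected chain of friendships: names[0] - names[1] - names[2] - ...
    static Map<String, Set<String>> chain(String... names) {
        Map<String, Set<String>> graph = new HashMap<>();
        for (int i = 0; i + 1 < names.length; i++) {
            graph.computeIfAbsent(names[i], k -> new HashSet<>()).add(names[i + 1]);
            graph.computeIfAbsent(names[i + 1], k -> new HashSet<>()).add(names[i]);
        }
        return graph;
    }

    public static void main(String[] args) {
        // Invented friendships between the names shown in the figure.
        Map<String, Set<String>> friends = chain("Emil", "Mike", "Kevin", "John", "Marcus", "Bruce");
        System.out.println(pathExists(friends, "Emil", "Marcus", 4)); // 4 hops away
        System.out.println(pathExists(friends, "Emil", "Bruce", 4));  // 5 hops away
    }
}
```

The work done depends on how many people sit within four hops of the start person, not on how many persons exist in total, which is one plausible reading of why the graph database row in the table stays flat as the data set grows.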
39. Pros & Cons compared to RDBMS
+ No O/R impedance mismatch (whiteboard friendly)
+ Can easily evolve schemas
+ Can represent semi-structured info
+ Can represent graphs/networks (with performance)
- Lacks in tool and framework support
- Few other implementations => potential lock in
- No support for ad-hoc queries
40. Language bindings
Neo4j.py – bindings for Jython and CPython
http://components.neo4j.org/neo4j.py
Neo4jrb – bindings for JRuby (incl RESTful API)
http://wiki.neo4j.org/content/Ruby
Clojure
http://wiki.neo4j.org/content/Clojure
Scala (incl RESTful API)
http://wiki.neo4j.org/content/Scala
… .NET? Erlang?
44. Conclusion
Graphs && Neo4j => teh awesome!
Available NOW under AGPLv3 / commercial license
AGPLv3: “if youʼre open source, weʼre open source”
If you have proprietary software? Must buy a commercial license
But up to 1M primitives itʼs free for all uses!
Download
http://neo4j.org
Feedback
http://lists.neo4j.org
45. Poop 1
Key-value stores?
=> the awesome
… if you have 1000s of BILLIONS of records OR you don't care about programmer productivity
What if you had no variables at all in your programs except a single globally accessible hashtable?
Would your software be maintainable?
46. Poop 2
In a not-suck architecture...
… the only thing that makes sense is to have an embedded database.