Eifrem neo4j
 the benefits of
graph databases
What's the plan?
  Why now? – Four trends

  NoSQL overview

  Graph databases && Neo4j



Trend 1: data set size
  [Figure: data set size over time; surviving axis labels: 2007, 2010]
Trend 2: connectedness
  [Figure: information connectivity over time, 1990 to 2020.
   Surviving labels: documents, web 1.0, wikis, web 2.0, “User-” (truncated), “web 3.0”]
Trend 3: semi-structure
  Individualization of content!
     In the salary lists of the 1970s, all elements had exactly
     one job
     In the salary lists of the 2000s, we need 5 job columns!
     Or 8? Or 15?

  Trend accelerated by the decentralization of content
  generation that is the hallmark of the age of participation
  (“web 2.0”)
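
The "how many job columns?" problem above can be sketched in plain Java. This is only an illustration under assumed names (`SalaryListSketch`, `FixedRow`, `person`, `jobCount` are all hypothetical and not part of any database API): a fixed-column row has to choose the number of jobs up front, while a property-style record carries however many values each person actually has.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of semi-structured records; not a Neo4j API.
class SalaryListSketch {

    // 1970s style: the schema fixes exactly one job column per person.
    static class FixedRow {
        final String name;
        final String job;

        FixedRow(String name, String job) {
            this.name = name;
            this.job = job;
        }
    }

    // 2000s style: each person carries only the values they actually have.
    static Map<String, Object> person(String name, String... jobs) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("name", name);
        props.put("jobs", new ArrayList<>(Arrays.asList(jobs)));
        return props;
    }

    @SuppressWarnings("unchecked")
    static int jobCount(Map<String, Object> person) {
        return ((List<String>) person.get("jobs")).size();
    }

    public static void main(String[] args) {
        FixedRow clerk = new FixedRow("Alice", "clerk");
        Map<String, Object> freelancer = person("Bob", "developer", "teacher", "musician");
        System.out.println(clerk.name + " has one job column: " + clerk.job);
        System.out.println(freelancer.get("name") + " has " + jobCount(freelancer) + " jobs");
    }
}
```

Adding a fourth or fifteenth job to `person(...)` needs no schema change, which is the point the slide makes about individualized content.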
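Aside: RDBMS performance
  [Figure: relational database performance plotted against data complexity.
   Surviving labels along the complexity axis: Salary List, “Majority of” (truncated),
   Social network, Semantic Trading]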
Trend 4: architecture

          1990s: Database as integration hub
Trend 4: architecture

                 2000s: (Slowly towards...)
         Decoupled services with own backend
Why NoSQL 2009?
 Trend 1: Size.

 Trend 2: Connectivity.

 Trend 3: Semi-structure.

 Trend 4: Architecture.
First off: the damn name
  NoSQL is NOT “Never SQL”

  NoSQL is NOT “No To SQL”

  NoSQL is simply

  Not Only SQL!
Four (emerging) NoSQL categories
 Key-value stores
   Based on Amazon's Dynamo paper
   Data model: (global) collection of K-V pairs
   Example: Dynomite, Voldemort, Tokyo

 BigTable clones
   Based on Google's BigTable paper
   Data model: big table, column families
   Example: HBase, Hypertable
Four (emerging) NoSQL categories
 Document databases
   Inspired by Lotus Notes
   Data model: collections of K-V collections
   Example: CouchDB, MongoDB

 Graph databases
   Inspired by Euler & graph theory
   Data model: nodes, rels, K-V on both
   Example: AllegroGraph, VertexDB, Neo4j
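NoSQL data models
  [Figure: the NoSQL categories positioned relative to each other.
   Surviving labels: Key-value stores, Bigtable clones, Graph databases.
   Annotations added on the second build of the slide: “90%” and
   “(This is still … of nodes & relationships)”, with a word missing in the source]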

Graph DBs
& Neo4j intro
The Graph DB model: representation
 Core abstractions:
   Nodes
   Relationships between nodes
   Properties on both

 [Figure: example graph]
   Node:         name = “Emil”, age = 29, sex = “yes”
   Relationship: type = KNOWS, time = 4 years
   Node:         type = car, vendor = “SAAB”, model = “95 Aero”
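
The three abstractions can be mimicked with a few plain Java classes. This is a minimal in-memory sketch, not the Neo4j API: `PropertyGraphSketch` and its members are hypothetical names, the second person is an invented placeholder, and the `OWNS` type for the car relationship is an assumption, since the figure does not say how Emil and the car are connected.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for nodes, relationships and properties.
class PropertyGraphSketch {

    static class Node {
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<Relationship> outgoing = new ArrayList<>();

        Node set(String key, Object value) {
            properties.put(key, value);
            return this;
        }

        // Creates a typed, directed relationship from this node to 'end'.
        Relationship relateTo(Node end, String type) {
            Relationship rel = new Relationship(this, end, type);
            outgoing.add(rel);
            return rel;
        }
    }

    static class Relationship {
        final Node start;
        final Node end;
        final String type;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Relationship(Node start, Node end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        Relationship set(String key, Object value) {
            properties.put(key, value);
            return this;
        }
    }

    static Node buildEmil() {
        Node emil = new Node().set("name", "Emil").set("age", 29);
        Node friend = new Node().set("name", "A friend"); // placeholder node, not in the figure
        Node car = new Node().set("type", "car").set("vendor", "SAAB").set("model", "95 Aero");
        emil.relateTo(friend, "KNOWS").set("time", "4 years");
        emil.relateTo(car, "OWNS"); // relationship type assumed for illustration
        return emil;
    }

    public static void main(String[] args) {
        Node emil = buildEmil();
        for (Relationship rel : emil.outgoing) {
            System.out.println(emil.properties.get("name") + " -[" + rel.type + " "
                    + rel.properties + "]-> " + rel.end.properties);
        }
    }
}
```

Note that both nodes and relationships carry their own property maps, which is what "K-V on both" refers to.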
Example: The Matrix
 [Figure: graph of Matrix characters]
   Nodes:
     name = “Thomas Anderson”, age = 29
     name = “Morpheus”, rank = “Captain”, occupation = “Total badass”
     name = “Trinity”
     name = “Cypher”, last name = “Reagan”
     name = “Agent Smith”, version = 1.0b, language = C++
     name = “The Architect”
   Relationships:
     KNOWS (several), CODED_BY
   Relationship properties visible in the figure:
     age = 3 days, age = 6 months, disclosure = public, disclosure = secret
Code (1): Building a node space
NeoService neo = ... // Get factory

// Create Thomas 'Neo' Anderson
Node mrAnderson = neo.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = neo.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly
Code (1): Building a node space
NeoService neo = ... // Get factory
Transaction tx = neo.beginTx();
try
{
    // Create Thomas 'Neo' Anderson
    Node mrAnderson = neo.createNode();
    mrAnderson.setProperty( "name", "Thomas Anderson" );
    mrAnderson.setProperty( "age", 29 );

    // Create Morpheus
    Node morpheus = neo.createNode();
    morpheus.setProperty( "name", "Morpheus" );
    morpheus.setProperty( "rank", "Captain" );
    morpheus.setProperty( "occupation", "Total bad ass" );

    // Create a relationship representing that they know each other
    mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
    // ...create Trinity, Cypher, Agent Smith, Architect similarly

    // Mark the transaction as successful so that finish() commits it
    tx.success();
}
finally
{
    tx.finish();
}

Code (1b): Defining RelationshipTypes
// In package org.neo4j.api.core
public interface RelationshipType
{
   String name();
}

// In package org.yourdomain.yourapp
// Example on how to roll dynamic RelationshipTypes
class MyDynamicRelType implements RelationshipType
{
   private final String name;
   MyDynamicRelType( String name ){ this.name = name; }
   public String name() { return this.name; }
}

// Example on how to kick it, static-RelationshipType-like
// (Enum.name() already satisfies the interface)
enum MyStaticRelTypes implements RelationshipType
{
   // Constants are not shown in the source; these mirror the Matrix example
   KNOWS, LOVES, CODED_BY
}
Whiteboard friendly
  [Figure: whiteboard-style sketch. Surviving labels: Björn, Big Car, build, drives]

The Graph DB model: traversal
 Traverser framework for high-performance traversing
 across the node space

 [Figure: the same example graph as on the representation slide:
  Emil (age = 29), a KNOWS relationship (time = 4 years), and a SAAB “95 Aero” car]
Example: Mr Andersonʼs friends
 [Figure: the Matrix character graph from “Example: The Matrix”, repeated]
Code (2): Traversing a node space
// Instantiate a traverser that returns Mr Anderson's friends
Traverser friendsTraverser = mrAnderson.traverse(
      Traverser.Order.BREADTH_FIRST,
      StopEvaluator.END_OF_GRAPH,
      ReturnableEvaluator.ALL_BUT_START_NODE,
      RelTypes.KNOWS,
      Direction.OUTGOING );

// Traverse the node space and print out the result
System.out.println( "Mr Anderson's friends:" );
for ( Node friend : friendsTraverser )
{
      System.out.printf( "At depth %d => %s%n",
          friendsTraverser.currentPosition().depth(),
          friend.getProperty( "name" ) );
}
 [Figure: the Matrix character graph again, annotated with the traverser
  arguments and the program output]

friendsTraverser = mrAnderson.traverse(
  Traverser.Order.BREADTH_FIRST,
  StopEvaluator.END_OF_GRAPH,
  ReturnableEvaluator.ALL_BUT_START_NODE,
  RelTypes.KNOWS,
  Direction.OUTGOING );

$ bin/start-neo-example
Mr Anderson's friends:

At depth 1 => Morpheus
At depth 1 => Trinity
At depth 2 => Cypher
At depth 3 => Agent Smith
$
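
What the breadth-first traverser does can be approximated without Neo4j at all. The sketch below is a plain-Java stand-in, not the Neo4j traverser: `BreadthFirstSketch`, `knowsGraph` and `friendsOf` are hypothetical names, and the list of outgoing KNOWS edges is an assumption chosen to be consistent with the output shown on the slide (the edges themselves are not fully legible in the figure).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java approximation of a breadth-first "friends" traversal.
class BreadthFirstSketch {

    // Outgoing KNOWS edges; assumed, chosen to match the slide's output.
    static Map<String, List<String>> knowsGraph() {
        Map<String, List<String>> knows = new LinkedHashMap<>();
        knows.put("Thomas Anderson", Arrays.asList("Morpheus", "Trinity"));
        knows.put("Morpheus", Arrays.asList("Trinity", "Cypher"));
        knows.put("Cypher", Arrays.asList("Agent Smith"));
        return knows;
    }

    // Visits every node reachable from 'start' in breadth-first order and
    // reports each one, except the start node, with the depth it was found at.
    static List<String> friendsOf(Map<String, List<String>> knows, String start) {
        List<String> lines = new ArrayList<>();
        Map<String, Integer> depthOf = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depthOf.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String next : knows.getOrDefault(current, Collections.<String>emptyList())) {
                if (!depthOf.containsKey(next)) { // skip nodes already reached
                    int depth = depthOf.get(current) + 1;
                    depthOf.put(next, depth);
                    lines.add("At depth " + depth + " => " + next);
                    queue.add(next);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("Mr Anderson's friends:");
        for (String line : friendsOf(knowsGraph(), "Thomas Anderson")) {
            System.out.println(line);
        }
    }
}
```

The queue plays the role of `Traverser.Order.BREADTH_FIRST`, running until the queue is empty corresponds to `END_OF_GRAPH`, and never reporting the start node corresponds to `ALL_BUT_START_NODE`.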
Example: Friends in love?
 [Figure: the Matrix character graph from “Example: The Matrix”,
  with an added LOVES relationship]
Code (3a): Custom traverser
// Create a traverser that returns all “friends in love”
Traverser loveTraverser = mrAnderson.traverse(
     Traverser.Order.BREADTH_FIRST,
     StopEvaluator.END_OF_GRAPH,
     new ReturnableEvaluator()
     {
          public boolean isReturnableNode( TraversalPosition pos )
          {
               return pos.currentNode().hasRelationship(
                    RelTypes.LOVES, Direction.OUTGOING );
          }
     },
     RelTypes.KNOWS,
     Direction.OUTGOING );
Code (3b): Custom traverser
// Traverse the node space and print out the result
System.out.println( "Who’s a lover?" );
for ( Node person : loveTraverser )
{
      System.out.printf( "At depth %d => %s%n",
          loveTraverser.currentPosition().depth(),
          person.getProperty( "name" ) );
}
 [Figure: the Matrix character graph with the LOVES relationship, annotated
  with the evaluator code and the program output]

new ReturnableEvaluator()
{
  public boolean isReturnableNode(
    TraversalPosition pos)
  {
    return pos.currentNode().
     hasRelationship( RelTypes.LOVES,
              Direction.OUTGOING );
  }
}

$ bin/start-neo-example
Who’s a lover?

At depth 1 => Trinity
$
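
The idea behind a custom returnable evaluator is simply "walk the same relationships, but only report nodes that pass a test". A plain-Java sketch of that idea follows. It is not the Neo4j API: `LoveTraverserSketch` and its members are hypothetical, and both edge lists are assumptions (in particular, that the LOVES relationship points from Trinity to Thomas Anderson), chosen to be consistent with the output on the slide.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java approximation of a traversal with a custom "returnable" test.
class LoveTraverserSketch {

    static final Map<String, List<String>> KNOWS = new HashMap<>();
    static final Map<String, List<String>> LOVES = new HashMap<>();

    static {
        // Assumed outgoing edges, consistent with the output on the slides.
        KNOWS.put("Thomas Anderson", Arrays.asList("Morpheus", "Trinity"));
        KNOWS.put("Morpheus", Arrays.asList("Trinity", "Cypher"));
        KNOWS.put("Cypher", Arrays.asList("Agent Smith"));
        LOVES.put("Trinity", Arrays.asList("Thomas Anderson"));
    }

    // Counterpart of the anonymous evaluator: has an outgoing LOVES edge?
    static boolean hasOutgoingLove(String name) {
        return LOVES.containsKey(name);
    }

    // Breadth-first walk along KNOWS; a visited node is reported only if it
    // passes the predicate. In this sketch the start node is tested as well.
    static List<String> traverse(String start, Predicate<String> isReturnable) {
        List<String> lines = new ArrayList<>();
        Map<String, Integer> depthOf = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depthOf.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (isReturnable.test(current)) {
                lines.add("At depth " + depthOf.get(current) + " => " + current);
            }
            for (String next : KNOWS.getOrDefault(current, Collections.<String>emptyList())) {
                if (!depthOf.containsKey(next)) {
                    depthOf.put(next, depthOf.get(current) + 1);
                    queue.add(next);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("Who's a lover?");
        for (String line : traverse("Thomas Anderson", LoveTraverserSketch::hasOutgoingLove)) {
            System.out.println(line);
        }
    }
}
```

Separating "which edges to follow" from "which nodes to return" is the design choice worth noticing: the walk still follows KNOWS only, while the filter inspects a different relationship type.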
Bonus code: domain model
     How do you implement your domain model?
     Use the delegator pattern, i.e. every domain entity wraps a
     Neo4j primitive:
// In package org.yourdomain.yourapp
class PersonImpl implements Person
{
    private final Node underlyingNode;
    PersonImpl( Node node ){ this.underlyingNode = node; }

    public String getName()
    {
      // getProperty() returns Object, so cast to the expected type
      return (String) this.underlyingNode.getProperty( "name" );
    }
    public void setName( String name )
    {
      this.underlyingNode.setProperty( "name", name );
    }
}
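
To try the delegator pattern without a database, the node can be replaced by a small stand-in. In the sketch below, `PropertyNode` and `InMemoryNode` are hypothetical types that imitate only the two property methods used on the slide; they are not Neo4j classes, and the stand-in simply returns `null` for a missing key.

```java
import java.util.HashMap;
import java.util.Map;

// Self-contained illustration of a domain object delegating to a node.
class DelegatorSketch {

    // Hypothetical stand-in for the two node methods the slide relies on.
    interface PropertyNode {
        Object getProperty(String key);
        void setProperty(String key, Object value);
    }

    // Map-backed implementation, used here instead of a real database node.
    static class InMemoryNode implements PropertyNode {
        private final Map<String, Object> props = new HashMap<>();

        public Object getProperty(String key) {
            return props.get(key);
        }

        public void setProperty(String key, Object value) {
            props.put(key, value);
        }
    }

    interface Person {
        String getName();
        void setName(String name);
    }

    // The domain entity holds no state of its own; every call is forwarded.
    static class PersonImpl implements Person {
        private final PropertyNode underlyingNode;

        PersonImpl(PropertyNode node) {
            this.underlyingNode = node;
        }

        public String getName() {
            return (String) underlyingNode.getProperty("name");
        }

        public void setName(String name) {
            underlyingNode.setProperty("name", name);
        }
    }

    public static void main(String[] args) {
        InMemoryNode node = new InMemoryNode();
        Person person = new PersonImpl(node);
        person.setName("Thomas Anderson");
        // The value lives in the node, not in the wrapper.
        System.out.println(node.getProperty("name"));
        System.out.println(person.getName());
    }
}
```

Because the wrapper keeps no fields besides the node, two `PersonImpl` instances over the same node always agree with each other.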
Domain layer frameworks
 Qi4j
   Framework for doing DDD in pure Java5
   Defines Entities / Associations / Properties
      Sound familiar? Nodes / Relʼs / Properties!
   Neo4j is an “EntityStore” backend

 NeoWeaver
   Weaves Neo4j-backed persistence into domain objects
   at runtime (dynamic proxy / cglib based)
   Veeeery alpha
Neo4j system characteristics
    Native graph storage engine with custom binary on-disk format
    JTA/JTS, XA, 2PC, Tx recovery, deadlock detection,
    MVCC, etc
  Scales up (what's the x and the y?)
    Several billions of nodes/rels/props on single JVM
    6+ years in 24/7 production
Social network pathExists()
  ~1k persons
  Avg 50 friends per person
  pathExists(a, b), limited to depth 4
  Two backends
  Eliminate disk IO, so warm up caches
Social network pathExists()
  [Figure: example social graph with persons Mike, Kevin, Bruce and Leigh]

                          # persons   query time
Relational database           1 000     2 000 ms
Graph database (Neo4j)        1 000         2 ms
Graph database (Neo4j)    1 000 000         2 ms
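
For readers who want to see what a depth-limited `pathExists(a, b)` amounts to, here is a plain-Java sketch. It is not the benchmark code and not a Neo4j API: `PathExistsSketch`, `befriend` and `chain` are hypothetical names, the friendships between Mike, Bruce, Leigh and Kevin are assumed for illustration, and nothing here reproduces the timings in the table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Depth-limited breadth-first search over an in-memory friendship graph.
class PathExistsSketch {

    // Adds an undirected friendship between a and b.
    static void befriend(Map<String, Set<String>> friends, String a, String b) {
        friends.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        friends.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Assumed example data: Mike - Bruce - Leigh - Kevin.
    static Map<String, Set<String>> chain() {
        Map<String, Set<String>> friends = new HashMap<>();
        befriend(friends, "Mike", "Bruce");
        befriend(friends, "Bruce", "Leigh");
        befriend(friends, "Leigh", "Kevin");
        return friends;
    }

    // True if b can be reached from a in at most maxDepth friendship hops.
    static boolean pathExists(Map<String, Set<String>> friends, String a, String b, int maxDepth) {
        if (a.equals(b)) {
            return true;
        }
        Set<String> visited = new HashSet<>();
        visited.add(a);
        List<String> frontier = new ArrayList<>();
        frontier.add(a);
        // Expand one hop per iteration, never going deeper than maxDepth.
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String person : frontier) {
                for (String friend : friends.getOrDefault(person, Collections.<String>emptySet())) {
                    if (friend.equals(b)) {
                        return true;
                    }
                    if (visited.add(friend)) {
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> friends = chain();
        System.out.println(pathExists(friends, "Mike", "Kevin", 4));
        System.out.println(pathExists(friends, "Mike", "Kevin", 2));
    }
}
```

The work done depends on how many people sit within the depth limit of `a`, not on the total number of persons stored, which is one way to read the unchanged 2 ms figure between the 1 000 and 1 000 000 person rows.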
Pros & Cons compared to RDBMS
+ No O/R impedance mismatch (whiteboard friendly)
+ Can easily evolve schemas
+ Can represent semi-structured info
+ Can represent graphs/networks (with performance)

-   Lacks in tool and framework support
-   Few other implementations => potential lock in
-   No support for ad-hoc queries
Language bindings
 Neo4j.py – bindings for Jython and CPython
 Neo4jrb – bindings for JRuby (incl RESTful API)
 Scala (incl RESTful API)
 … .NET? Erlang?
Grails Neoclipse screendump
 Graphs && Neo4j => teh awesome!
 Available NOW under AGPLv3 / commercial license
   AGPLv3: “if youʼre open source, weʼre open source”
   If you have proprietary software? Must buy a commercial license
   But up to 1M primitives itʼs free for all uses!
Poop 1
 Key-value stores?
   => the awesome
   … if you have 1000s of BILLIONS of records OR you don't
   care about programmer productivity

 What if you had no variables at all in your programs except
 a single globally accessible hashtable?
 Would your software be maintainable?
Poop 2
 In a not-suck architecture...

 … the only thing that makes sense is to have an
 embedded database.
Looking ahead: polyglot persistence

             Image credit: lost again! Sorry :(

