A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chicago 2013

A Little Graph Theory for the Busy Developer
Jim Webber
Chief Scientist, Neo Technology
@jimwebber

Roadmap
• Imprisoned data
• Graph models
• Graph theory
– Local properties, global behaviors
– Predictive techniques
• Graph matching
– Real-time analytics for fun and profit
• Fin

http://www.flickr.com/photos/crazyneighborlady/355232758/

http://gallery.nen.gov.uk/image82582-.html

Aggregate-Oriented Data
http://martinfowler.com/bliki/AggregateOrientedDatabase.html
“There is a significant downside - the whole approach works really well
when data access is aligned with the aggregates, but what if you want to
look at the data in a different way? Order entry naturally stores orders as
aggregates, but analyzing product sales cuts across the aggregate structure.
The advantage of not using an aggregate structure in the database is that it
allows you to slice and dice your data different ways for different
audiences.
This is why aggregate-oriented stores talk so much about map-reduce.”

complexity = f(size, connectedness, uniformity)

http://www.bbc.co.uk/london/travel/downloads/tube_map.html

Property graphs
• Property graph model:
– Nodes with properties
– Named, directed relationships with properties
– Relationships have exactly one start and end node
• Which may be the same node
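
To make the model above concrete, here is a minimal in-memory sketch in Python. It is not how Neo4j stores data; the class names, the `outgoing` helper, the `since` property and the `TALKS_TO` relationship are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    # Nodes carry arbitrary key/value properties.
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    rel_name: str   # relationships are named, e.g. "FRIEND"
    start: Node     # exactly one start node (direction: start -> end)
    end: Node       # exactly one end node, possibly the same as start
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    def __init__(self) -> None:
        self.nodes: List[Node] = []
        self.relationships: List[Relationship] = []

    def add_node(self, **properties: Any) -> Node:
        node = Node(dict(properties))
        self.nodes.append(node)
        return node

    def relate(self, start: Node, rel_name: str, end: Node, **properties: Any) -> Relationship:
        rel = Relationship(rel_name, start, end, dict(properties))
        self.relationships.append(rel)
        return rel

    def outgoing(self, node: Node, rel_name: str) -> List[Node]:
        # Follow relationships of the given name in their stored direction.
        return [r.end for r in self.relationships
                if r.start is node and r.rel_name == rel_name]


graph = PropertyGraph()
stan = graph.add_node(name="Stan")
kyle = graph.add_node(name="Kyle")
graph.relate(stan, "FRIEND", kyle, since=1997)  # hypothetical property
graph.relate(stan, "TALKS_TO", stan)            # start and end are the same node
```

A linear scan over all relationships is fine for a whiteboard-sized example; a real store would index relationships per node.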

[Figure: example property graph. Relationship labels: stole from, loves, enemy, appeared in, companion. Episode nodes: "A Good Man Goes to War", "Victory of the Daleks".]

Property graphs are very whiteboard-friendly

http://blogs.adobe.com/digitalmarketing/analytics/predictive-analytics/predictive-analytics-and-the-digital-marketer/

http://en.wikipedia.org/wiki/File:Leonhard_Euler_2.jpg
Meet Leonhard Euler
• Swiss mathematician
• Inventor of Graph Theory (1736)

http://en.wikipedia.org/wiki/Seven_Bridges_of_Königsberg

Triadic Closure
[Figure: three nodes, name: Kyle, name: Stan, name: Kenny]

Triadic Closure
[Figure: the same nodes with a FRIEND relationship added; only the label "name: Kyle" survives legibly]
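
As a rough illustration of triadic closure used predictively, the sketch below looks for "open" triads: two people who share a friend but are not yet friends themselves. It assumes Kyle is the shared friend of Stan and Kenny, which the slide layout suggests but does not state; the function name `open_triads` is made up for this example.

```python
from itertools import combinations


def open_triads(friends):
    """Return (a, b, common) for every pair a, b that shares the friend
    `common` but has no friendship of its own yet."""
    found = []
    for common in sorted(friends):
        for a, b in combinations(sorted(friends[common]), 2):
            if b not in friends.get(a, set()):
                found.append((a, b, common))
    return found


# Assumed starting point: Kyle is friends with both Stan and Kenny.
friends = {
    "Kyle": {"Stan", "Kenny"},
    "Stan": {"Kyle"},
    "Kenny": {"Kyle"},
}

# Each open triad is a candidate FRIEND relationship that may form later.
predicted = open_triads(friends)
```

Here `predicted` contains the single candidate pair Kenny and Stan, closed through Kyle.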

Structural Balance
[Figure: three nodes, name: Cartman, name: Craig, name: Tweek]

Structural Balance
[Figure: the same nodes with a FRIEND relationship; only the label "name: Cartman" survives legibly]

Structural Balance
[Figure: the same nodes with an ENEMY relationship; only the label "name: Cartman" survives legibly]

Structural Balance
[Figure: a triangle with a FRIEND relationship; only the label "name: Kyle" survives legibly]

Structural Balance is a key predictive technique
And it’s domain-agnostic
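
A small sketch of the usual structural balance rule for triangles (as presented by Easley and Kleinberg): a triangle is balanced when it has three FRIEND edges, or one FRIEND and two ENEMY edges, which is the same as saying the product of its edge signs is positive. The specific signs between Cartman, Craig and Tweek below are hypothetical, since the slide labels did not survive.

```python
from itertools import combinations

FRIEND, ENEMY = +1, -1


def triangle_is_balanced(sign, a, b, c):
    # Balanced means the product of the three edge signs is positive.
    product = (sign[frozenset((a, b))]
               * sign[frozenset((b, c))]
               * sign[frozenset((a, c))])
    return product > 0


def unbalanced_triangles(sign):
    """Return every fully connected triple whose signs are unbalanced."""
    nodes = sorted({n for pair in sign for n in pair})
    return [t for t in combinations(nodes, 3)
            if all(frozenset(p) in sign for p in combinations(t, 2))
            and not triangle_is_balanced(sign, *t)]


# Hypothetical signs: Cartman is a friend of Craig but an enemy of Tweek,
# while Craig and Tweek are friends. Two friends and one enemy is unstable.
sign = {
    frozenset(("Craig", "Tweek")): FRIEND,
    frozenset(("Cartman", "Craig")): FRIEND,
    frozenset(("Cartman", "Tweek")): ENEMY,
}
unstable = unbalanced_triangles(sign)

# Predicted resolution: one relationship flips. Here Craig turns on Cartman.
sign[frozenset(("Cartman", "Craig"))] = ENEMY
stable_after_flip = unbalanced_triangles(sign) == []
```

Nothing in the rule mentions people, which is why the same check can be applied to countries and alliances.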

Allies and Enemies
[Figure: graph of allies and enemies among UK, Germany, France, Russia, Italy and Austria]

Predicting WWI
[Easley and Kleinberg]

Strong Triadic Closure
If a node has strong relationships to two
neighbours, then these neighbours must have at
least a weak relationship between them.
[Wikipedia]
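
The property can be checked mechanically. The sketch below reports every node that has strong ties to two neighbours with no tie at all between them. It assumes ties are labelled either "strong" or "weak", and that Kenny has strong ties to Stan and Cartman; both are assumptions made for illustration.

```python
from itertools import combinations


def stc_violations(ties):
    """ties maps frozenset({a, b}) to "strong" or "weak".
    Return (node, b, c) wherever node is strongly tied to b and c
    but b and c have no tie of any kind."""
    strong = {}
    for pair, kind in ties.items():
        if kind == "strong":
            a, b = sorted(pair)
            strong.setdefault(a, set()).add(b)
            strong.setdefault(b, set()).add(a)
    violations = []
    for node in sorted(strong):
        for b, c in combinations(sorted(strong[node]), 2):
            if frozenset((b, c)) not in ties:
                violations.append((node, b, c))
    return violations


# Assumed ties: Kenny is strongly tied to both Stan and Cartman.
ties = {
    frozenset(("Kenny", "Stan")): "strong",
    frozenset(("Kenny", "Cartman")): "strong",
}
before = stc_violations(ties)

# Strong triadic closure predicts at least a weak Stan-Cartman tie.
ties[frozenset(("Stan", "Cartman"))] = "weak"
after = stc_violations(ties)
```

Before the weak tie is added the check flags Kenny's two neighbours; afterwards it reports nothing.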

Triadic Closure
(weak relationship)
[Figure: three nodes, name: Kenny, name: Stan, name: Cartman]

Triadic Closure
(weak relationship)
[Figure: the same nodes with a weak "FRIEND 50%" relationship added; only the label "name: Kenny" survives legibly]

Weak relationships
• Relationships can have “strength” as well as intent
– Think: weighting on a relationship in a property graph
• Weak links play another super-important structural role in graph theory
– They bridge neighbourhoods

Local Bridges
[Figure: nodes Kenny, Stan, Kyle, Sally, Bebe, Wendy and Cartman, connected by FRIEND relationships, one weak "FRIEND 50%" relationship and one ENEMY relationship]

Local Bridge Property
“If a node A in a network satisfies the Strong
Triadic Closure Property and is involved in at
least two strong relationships, then any local
bridge it is involved in must be a weak
relationship.”
[Easley and Kleinberg]
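
A short sketch of how local bridges can be found, using the common definition that an edge is a local bridge when its two endpoints have no neighbour in common. The friendships below (two tight groups joined by a single Stan-Wendy link) are an assumed reading of the figure, not a transcription of it.

```python
def build_adjacency(edges):
    # Undirected adjacency sets from a list of (a, b) pairs.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def local_bridges(adjacency):
    """Return edges (a, b) whose endpoints share no common neighbour."""
    bridges = []
    for a in sorted(adjacency):
        for b in sorted(adjacency[a]):
            if a < b and not (adjacency[a] & adjacency[b]):
                bridges.append((a, b))
    return bridges


# Assumed friendships: two triangles joined by one link.
edges = [
    ("Kenny", "Stan"), ("Stan", "Kyle"), ("Kyle", "Kenny"),
    ("Sally", "Bebe"), ("Bebe", "Wendy"), ("Wendy", "Sally"),
    ("Stan", "Wendy"),  # the only link between the two groups
]
bridges = local_bridges(build_adjacency(edges))
```

Under the Local Bridge Property, the Stan-Wendy link found here would be expected to be a weak relationship.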

Graph Partitioning
• (NP) Hard problem
– Recursively remove the spanning links between dense regions
– Or recursively merge nodes into ever larger “subgraph” nodes
– Choose your algorithm carefully – some are better than others for a given domain
• Can be used to (almost exactly) predict the break-up of the karate club!
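
The first approach (removing the links that span dense regions) can be sketched in the spirit of the Girvan-Newman method: repeatedly delete the edge with the highest betweenness until the graph falls apart. The talk does not name a specific algorithm, so treat this as one possible reading. It assumes a connected, undirected, unweighted graph with at least two nodes, and the example friendships are hypothetical.

```python
from collections import deque


def edge_betweenness(adj):
    # Brandes-style accumulation of shortest-path counts per edge.
    score = {frozenset((a, b)): 0.0 for a in adj for b in adj[a]}
    for source in adj:
        order, dist = [], {source: 0}
        parents = {n: [] for n in adj}
        paths = dict.fromkeys(adj, 0)
        paths[source] = 1
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    parents[w].append(v)
        credit = dict.fromkeys(adj, 0.0)
        for w in reversed(order):  # push credit back towards the source
            for v in parents[w]:
                share = paths[v] / paths[w] * (1 + credit[w])
                score[frozenset((v, w))] += share
                credit[v] += share
    return score


def components(adj):
    seen, parts = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        part, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n not in part:
                part.add(n)
                stack.extend(adj[n] - part)
        seen |= part
        parts.append(part)
    return parts


def split_in_two(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    while len(components(adj)) < 2:
        score = edge_betweenness(adj)
        a, b = max(score, key=score.get)  # the most "spanning" link
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Hypothetical network: two tight groups joined by one link.
parts = split_in_two([
    ("Kenny", "Stan"), ("Stan", "Kyle"), ("Kyle", "Kenny"),
    ("Sally", "Bebe"), ("Bebe", "Wendy"), ("Wendy", "Sally"),
    ("Stan", "Wendy"),
])
```

Recomputing betweenness after every removal is expensive on large graphs, which is one reason the choice of algorithm matters for a given domain.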

University Karate Clubs
(predicted by Graph Theory)

University Karate Clubs
(what actually happened!)

Cypher
• Declarative graph pattern matching language
– “SQL for graphs”
– Columnar results
• Supports graph matching commands and queries
– Find me stuff like this…
– Aggregation, ordering and limit, etc.

[Figure: example retail graph.
Customer node: Firstname: Mickey, Surname: Smith, DoB: 19781006
Product nodes: SKU: 5e175641, Product: Badgers Nadgers Ale; SKU: 2555f258, Product: Peewee Pilsner; SKU: 49d102bc, Product: Baby Dry Nights; SKU: 49d102bc, Product: XBox 360
Category nodes: beer, alcoholic drinks, nappies, baby, console, consumer electronics
Relationships: BOUGHT (customer to product), MEMBER_OF (product to category, category to category)]

[Figure: query pattern. A customer (Firstname: *, Surname: *, 1972 < DoB < 1996) connected by BOUGHT to Category: beer, Category: nappies and Category: game console]

[Figure: the same query pattern (Firstname: *, Surname: *, 1972 < DoB < 1996; Category: beer, Category: nappies), now with !BOUGHT on Category: game console]

[Figure: the pattern drawn with named nodes (daddy), (beer), (nappies), (console) and three anonymous nodes ()]

Flatten the graph
(daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies)
(daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer)
(daddy)-[b:BOUGHT]->()-[:MEMBER_OF]->(console)

Wrap in a Cypher MATCH clause
MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
(daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
(daddy)-[b?:BOUGHT]->()-[:MEMBER_OF]->(console)

Cypher WHERE clause
MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
(daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
(daddy)-[b?:BOUGHT]->()-[:MEMBER_OF]->(console)
WHERE b is null

Full Cypher query
START beer=node:categories(category='beer'),
nappies=node:categories(category='nappies'),
xbox=node:products(product='xbox 360')
MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
(daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
(daddy)-[b?:BOUGHT]->(xbox)
WHERE b is null
RETURN distinct daddy
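
For readers who do not have a Neo4j instance to hand, the logic of this query can be mimicked over plain Python dictionaries: find everyone who bought something in the beer category and something in the nappies category, but never bought the XBox 360. The purchase data below is invented for illustration (only the names and products come from the slides), and a dictionary scan is of course not how Cypher evaluates a pattern.

```python
# Product -> categories, standing in for MEMBER_OF relationships.
member_of = {
    "Peewee Pilsner": {"beer"},
    "Baby Dry Nights": {"nappies"},
    "XBox 360": {"console"},
}

# Person -> products, standing in for BOUGHT relationships (hypothetical data).
bought = {
    "Rory Williams": {"Peewee Pilsner", "Baby Dry Nights"},
    "Mickey Smith": {"Peewee Pilsner", "Baby Dry Nights", "XBox 360"},
}


def beer_and_nappies_without_xbox(bought, member_of):
    """Mirror the query: required BOUGHT paths to beer and nappies,
    and no BOUGHT relationship to the XBox 360 (the 'b is null' check)."""
    result = []
    for person in sorted(bought):
        categories = set()
        for product in bought[person]:
            categories |= member_of.get(product, set())
        if {"beer", "nappies"} <= categories and "XBox 360" not in bought[person]:
            result.append(person)
    return result


daddies = beer_and_nappies_without_xbox(bought, member_of)
```

With this invented data the only match is Rory Williams, since Mickey Smith already owns the console.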

Results
==> +---------------------------------------------+
==> | daddy |
==> +---------------------------------------------+
==> | Node[15]{name:"Rory Williams",dob:19880121} |
==> +---------------------------------------------+
==> 1 row
==> 0 ms
==>
neo4j-sh (0)$

Facebook Graph Search
Which sushi restaurants in
NYC do my friends like?

Cypher query
START me=node:person(name = 'Jim'),
location=node:location(location='New York'),
cuisine=node:cuisine(cuisine='Sushi')
MATCH (me)-[:IS_FRIEND_OF]->(friend)-[:LIKES]->(restaurant)
-[:LOCATED_IN]->(location),(restaurant)-[:SERVES]->(cuisine)
RETURN restaurant

What are graphs good for?
• Recommendations
• Pharmacology
• Business intelligence
• Social computing
• Geospatial
• MDM
• Data center management
• Web of things
• Genealogy
• Time series data
• Product catalogue
• Web analytics
• Scientific computing
• Indexing your slow RDBMS
• And much more!

Free O’Reilly book for everyone!
http://graphdatabases.com

Thanks for listening
Neo4j: http://neo4j.org
Me: @jimwebber
