This document discusses persistent graphs in Python with Neo4j. It begins by explaining the limitations of relational databases and how graph databases like Neo4j focus on modeling complex relationships through nodes and edges. It then provides an overview of Neo4j, describing it as an open source graph database that is stable, actively developed, and can handle billions of nodes and relationships to model complex data.
1. Persistent Graphs in Python with Neo4j
Tobias Ivarsson, Hacker @ Neo Technology
twitter: @thobe / #neo4j
email: tobias@neotechnology.com
web: http://www.neo4j.org/
web: http://www.thobe.org/
Sunday, February 21, 2010
2. We all know the relational model. It has been predominant for a long time.
Attendees
username  fullname          registration  tutorials  payment
guido     Guido van Rossum  null          yes        0
thobe     Tobias Ivarsson   2009-12-12    no         300
joe       John Doe          2010-02-05    yes        700
...       ...               ...           ...        ...
3. The relational model has a few problems, such as:
• poor support for sparse data
• modifying the data model is almost exclusively done through adding tables
Attendees
username  fullname          registration  tutorials  payment
guido     Guido van Rossum  null          yes        0
thobe     Tobias Ivarsson   2009-12-12    no         300
joe       John Doe          2010-02-05    yes        700
...       ...               ...           ...        ...
Location
username  latitude       longitude       title          publish
thobe     55°36'47.70"N  12°58'34.50"E   Malmö          yes
joe       37°49'36.00"N  122°25'22.00"W  San Francisco  no
...       ...            ...             ...            ...
4. More complication...
After a while, modeling complex relationships leads to complicated schemas.
Attendees
username  fullname          registration  tutorials  payment
guido     Guido van Rossum  null          yes        0
thobe     Tobias Ivarsson   2009-12-12    no         300
joe       John Doe          2010-02-05    yes        700
...       ...               ...           ...        ...
Location
username  latitude       longitude       title          publish
thobe     55°36'47.70"N  12°58'34.50"E   Malmö          yes
joe       37°49'36.00"N  122°25'22.00"W  San Francisco  no
...       ...            ...             ...            ...
Sessions
id   title  time  room  ...
...  ...    ...   ...   ...
Session attendance
session  user
...      ...
5. A number of companies have realized that the relational model is insufficient and are working on alternative database solutions.
6. Most focus on scaling to large numbers
[Figure: hosts labeled 192.168.0.14, 192.168.0.15, 192.168.0.16 and 192.168.0.21]
9. Positioning w.r.t. other NOSQL DBs
[Figure: chart with the axes Size and Complexity, placing Key/Value stores, Bigtable clones, Document databases and Graph databases. Annotations: "Billions of nodes and relationships" and "> 90% of use cases".]
10. What is Neo4j?
๏ Neo4j is a Graph Database
• Non-relational (“#nosql”), transactional (ACID), embedded
• Data is stored as a Graph / Network
‣Nodes and relationships with properties
‣“Property Graph” or “edge-labeled multidigraph”
๏ Neo4j is Open Source / Free (as in speech) Software
• AGPLv3
• Commercial (“dual license”) license available
‣Free (as in beer) for “small” installations
‣Inexpensive (as in startup-friendly) when you grow
Prices are available at http://neotechnology.com/
Contact us if you have questions and/or special license needs (e.g. if you want an evaluation license)
11. More about Neo4j
๏ Neo4j is stable
• In 24/7 operation since 2003
๏ Neo4j is in active development
• Neo Technology got VC funding October 2009
๏ Neo4j delivers high performance graph operations
• traverses 1’000’000+ relationships / second on commodity hardware
12. The Neo4j Graph data model
๏ Nodes are connected to one another through relationships
๏ A Relationship is a connection between two nodes
• Relationships have types
• Relationships have a direction
• Relationships are traversed equally fast in either direction
๏ Properties are mappings from a string key to a primitive value
• Both Nodes and Relationships have properties
• Primitive values are any of these (or an array of these):
‣String
‣Numbers: float, double, integers (1-8 byte)
13. The Neo4j Graph data model
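[Figure: example property graph]
Nodes:
• name: “James”, age: 32, twitter: “@spam”
• name: “Mary”, age: 35
• brand: “Volvo”, model: “V70”
Relationships:
• James LOVES Mary; Mary LOVES James
• James LIVES WITH Mary
• James OWNS the car (property type: “car”)
• Mary DRIVES the car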
14. Graphs are all around us
      A     B      C     D                ...
1     17    3.14   3     17.79333333333
2     42    10.11  14    30.33
3     316   6.66   1     2104.56
4     32    9.11   592   0.492432432432
5                        2153.175765766
...
Even if this spreadsheet looks like it could be a fit for an RDBMS, it isn’t:
• RDBMSes have problems with extending indefinitely on both rows and columns
• Formulas and data dependencies would quickly lead to heavy join operations
15. Graphs are all around us
      A     B      C     D                ...
1     17    3.14   3     = A1 * B1 / C1
2     42    10.11  14    = A2 * B2 / C2
3     316   6.66   1     = A3 * B3 / C3
4     32    9.11   592   = A4 * B4 / C4
5                        = SUM(D1:D4)
...
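The spreadsheet can be read as a dependency graph: every cell is a node, and a formula cell points at the cells it reads. The sketch below is an illustration in plain Python, not Neo4j code. The `cells`, `formulas` and `evaluate` names are hypothetical, the total in D5 is taken as the sum of D1 to D4 (which matches the value 2153.175765766 shown for that cell), and the sheet is assumed to contain no circular references.

```python
# Plain values: nodes without outgoing dependency edges.
cells = {
    "A1": 17,  "B1": 3.14,  "C1": 3,
    "A2": 42,  "B2": 10.11, "C2": 14,
    "A3": 316, "B3": 6.66,  "C3": 1,
    "A4": 32,  "B4": 9.11,  "C4": 592,
}

# Formula cells: (cells they depend on, function of those cells' values).
formulas = {
    "D1": (("A1", "B1", "C1"), lambda a, b, c: a * b / c),
    "D2": (("A2", "B2", "C2"), lambda a, b, c: a * b / c),
    "D3": (("A3", "B3", "C3"), lambda a, b, c: a * b / c),
    "D4": (("A4", "B4", "C4"), lambda a, b, c: a * b / c),
    "D5": (("D1", "D2", "D3", "D4"), lambda *values: sum(values)),
}


def evaluate(cell, cache=None):
    """Compute a cell by walking its dependency edges depth-first."""
    cache = {} if cache is None else cache
    if cell in cells:
        return cells[cell]
    if cell not in cache:
        dependencies, function = formulas[cell]
        cache[cell] = function(*(evaluate(d, cache) for d in dependencies))
    return cache[cell]


total = evaluate("D5")
```

Evaluating D5 follows edges to D1 through D4 and from there to columns A to C. This is a graph traversal, whereas a relational layout would need a join per dependency step.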
17. Graphs are all around us
If we add external data sources the problem becomes even more interesting...
17    3.14   3     = A1 * B1 / C1
42    10.11  14    = A2 * B2 / C2
316   6.66   1     = A3 * B3 / C3
32    9.11   592   = A4 * B4 / C4
                   = SUM(D1:D4)
21. Graphs are whiteboard friendly
[Figure: whiteboard-style graph with the labels “thobe”, “Joe project blog”, “Wardrobe Strength”, “Hello Joe”, “Modularizing Jython” and “Neo4j performance analysis”]
22. Query Languages
๏ Traversal API
๏ SPARQL - “SQL for linked data”
    SELECT ?person WHERE {
        ?person neo4j:KNOWS ?friend .
        ?friend neo4j:KNOWS ?foe .
        ?foe neo4j:name "Larry Ellison" .
    }
๏ Gremlin - “Perl for graphs”
    ./outE[@label='KNOWS']/inV[@age > 30]/@name
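To make the SPARQL pattern concrete, here is roughly the same question asked of a tiny in-memory graph in plain Python: who knows someone who knows the person named "Larry Ellison"? The data and the `knows_someone_who_knows` helper are made up for illustration, and nodes are identified by their name for brevity.

```python
# Outgoing KNOWS relationships, keyed by person name (hypothetical data).
knows = {
    "Alice": ["Bob"],
    "Bob": ["Larry Ellison"],
    "Carol": ["Alice"],
}


def knows_someone_who_knows(target):
    """Return every person with a two-step KNOWS path to `target`."""
    return sorted(
        person
        for person, friends in knows.items()
        if any(target in knows.get(friend, []) for friend in friends)
    )


result = knows_someone_who_knows("Larry Ellison")
```

With this data only Alice qualifies: she knows Bob, and Bob knows Larry Ellison. Bob is one step away rather than two, and Carol is three steps away.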
23. Python integration for Neo4j
๏ Mapping of the core Neo4j API for Python
• Making it feel “Pythonic”
๏ Available from the Neo4j repository (and soon from PyPI)
• http://components.neo4j.org/neo4j.py
‣svn co http://svn.neo4j.org/components/neo4j.py/trunk neo4j-python
๏ Works with both Jython and CPython
• The threading of Jython is a plus with an embedded db...
๏ Comes with batteries included for Django
• Could have support for other frameworks in the future
24. Simple interaction
import neo4j

graphdb = neo4j.GraphDatabase("var/neo")

# Creates the graph we saw in the first example.
with graphdb.transaction:
    james = graphdb.node(name="James", age=32, twitter="@spam")
    mary = graphdb.node(name="Mary", age=35)
    the_car = graphdb.node(brand="Volvo", model="V70")
    james.LOVES(mary)
    mary.LOVES(james)
    james.LIVES_WITH(mary)
    james.OWNS(the_car, property_type="car")
    mary.DRIVES(the_car)
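Calls such as `james.OWNS(the_car, property_type="car")` use the relationship type as if it were a method name. One way such an interface could be built in Python is with `__getattr__`. The sketch below is a guess at the general technique, not the actual neo4j.py implementation; the `Node` class here is a hypothetical in-memory stand-in, and treating every upper-case attribute name as a relationship type is an assumption of this example.

```python
class Node:
    """In-memory stand-in for a graph node (hypothetical, for illustration)."""

    def __init__(self, **properties):
        self.properties = properties
        self.relationships = []  # (type, other_node, properties) triples

    def __getattr__(self, rel_type):
        # Python calls __getattr__ only when normal lookup fails, so
        # `properties` and `relationships` never end up here.
        if not rel_type.isupper():
            raise AttributeError(rel_type)

        def create(other, **properties):
            relationship = (rel_type, other, properties)
            self.relationships.append(relationship)
            return relationship

        return create


james = Node(name="James", age=32)
mary = Node(name="Mary", age=35)
the_car = Node(brand="Volvo", model="V70")

james.LIVES_WITH(mary)
james.OWNS(the_car, property_type="car")
```

Afterwards `james.relationships` holds two triples, one per call. Lower-case names still raise `AttributeError`, so ordinary typos are not silently turned into relationships.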
31. Graph traversals
[Figure: graph of people connected by KNOWS, LOVES and CODED BY relationships]
Nodes:
• name: “Thomas Anderson”, age: 29
• name: “Morpheus”, rank: “Captain”, occupation: “Total badass”
• name: “Trinity”
• name: “Cypher”, last name: “Reagan”
• name: “Agent Smith”, version: “1.0b”, language: “C++”
• name: “The Architect”
Relationship properties shown in the figure: disclosure: “public”, disclosure: “secret”, since: “meeting the oracle”, since: “a year before the movie”, cooperates on: “The Nebuchadnezzar”

import neo4j

class Friends(neo4j.Traversal):  # Traversals play the role of queries in Neo4j
    types = [neo4j.Outgoing.KNOWS]
    order = neo4j.BREADTH_FIRST
    stop = neo4j.STOP_AT_END_OF_GRAPH
    returnable = neo4j.RETURN_ALL_BUT_START_NODE

# mr_anderson is the start node (name: "Thomas Anderson"); its lookup is not shown
for friend_node in Friends(mr_anderson):
    print("%s (@ depth=%s)" % (friend_node["name"], friend_node.depth))

Output:
Morpheus (@ depth=1)
Trinity (@ depth=1)
Cypher (@ depth=2)
Agent Smith (@ depth=3)
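What the `Friends` traversal does can be mimicked without a database. The following sketch is a plain-Python breadth-first search over outgoing KNOWS relationships that skips the start node. The `knows` dictionary is an assumption: the edges are read off the figure as well as the extraction allows and are chosen to be consistent with the output shown on the slide.

```python
from collections import deque

# Outgoing KNOWS relationships (assumed; consistent with the slide's output).
knows = {
    "Thomas Anderson": ["Morpheus", "Trinity"],
    "Morpheus": ["Trinity", "Cypher"],
    "Trinity": [],
    "Cypher": ["Agent Smith"],
    "Agent Smith": [],
}


def friends(start):
    """Yield (name, depth) breadth-first for every node except the start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth > 0:  # corresponds to RETURN_ALL_BUT_START_NODE
            yield name, depth
        for other in knows.get(name, []):
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))


for name, depth in friends("Thomas Anderson"):
    print("%s (@ depth=%s)" % (name, depth))
```

The queue gives the breadth-first order, and the `seen` set keeps Trinity from being reported twice even though both Thomas Anderson and Morpheus know her.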
32. Batteries for Django
from neo4j.model import django_model as models

class Movie(models.NodeModel):
    title = models.Property(indexed=True)
    year = models.Property()
    # URL of the movie page, built from the id of the underlying node
    href = property(lambda self: ('/movie/%s/' % (self.node.id,)))

    def __unicode__(self):
        return self.title

class Actor(models.NodeModel):
    name = models.Property(indexed=True)
    # URL of the actor page, built from the id of the underlying node
    href = property(lambda self: ('/actor/%s/' % (self.node.id,)))

    def __unicode__(self):
        return self.name

# etc. ...
33. “My ORM already does this”
๏ Model evolution with ORMs is a hard problem
• virtually unsupported in Django
๏ SQL is “compatible” across many RDBMSs
• data is still locked in
๏ Each ORM maps object models differently
• Moving to another ORM == legacy schema support
‣except your legacy schema is a strange auto-generated one
๏ Object/Graph Mapping is always done the same way
• allows you to keep your data through application changes
• or share data between multiple implementations
34. What your ORM doesn’t do
๏ Drop down to the underlying graph model
• Traversals
• Graph algorithms
• Shortest path(s)
• etc.
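As an example of the kind of graph algorithm meant here, the sketch below finds one shortest path in an unweighted directed graph with a breadth-first search. It is plain Python over a dictionary, not the Neo4j API; the `shortest_path` function and the sample `graph` are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal as a list, or None."""
    previous = {start: None}  # node -> node it was first reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
path = shortest_path(graph, "a", "e")
```

Because breadth-first search visits nodes in order of distance from the start, the first time the goal is dequeued the recorded chain of predecessors is a shortest path.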
35. Buzzword summary
http://neo4j.org/
SPARQL
AGPLv3
Open Source
ACID
Object mapping
Shortest path
NOSQL
startup friendly
whiteboard friendly
Traversal
Query language
Embedded
Beer
Software Transactional Memory
polyglot persistence
Free Software
Scaling to complexity