A presentation of the Neo4j graph database given at QCon SF 2008. It describes why relational databases are increasingly unfit for many applications today and why graphs may be a good fit. It also covers the fundamentals of how to program with Neo4j.
Neo4j -- or why graph dbs kick ass
4. Trend 1: data is getting more connected
[Chart: information connectivity (y-axis) over time (x-axis: 1990, 2000, 2010, 2020; eras: web 1.0, web 2.0, “web 3.0”). Labels in rising order of connectivity: text documents, hypertext, RSS, blogs, wikis, user-generated content, tagging, folksonomies, RDF, ontologies, Giant Global Graph (GGG).]
5. Trend 2: ... and more semi-structured
Individualization of content!
In the salary lists of the 1970s, all elements had
exactly one job
In the salary lists of the 2000s, we need 5 job
columns! Or 8? Or 15?
Trend accelerated by the decentralization of
content generation that is the hallmark of the age
of participation (“web 2.0”)
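6. Relational database
[Chart: performance (y-axis) against information complexity (x-axis) for a relational database. Labels along the curve: Salary List, Majority of Webapps, Social network, Semantic Trading; a brace marked “custom” groups the labels at the complex end.]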
10. A graph
A simple food web
(image from vtaide.com)
11. A big graph
Part of the food
web of the North
Atlantic (image from jeffkenedyassociates.com)
12. A social graph
LinkedIn
Facebook
Orkut
Hi5
Friendster
Dopplr
...
13. Your file system
Files & folders
Alias, links
Read & write
Roles, groups
...
14. Shut up and
show us the
code!
Image credit: lost! (please don’t shoot me)
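15. The Graph DB model: representation
Core abstractions:
Nodes
Relationships between nodes
Properties on both
[Diagram: three nodes numbered 1, 2 and 3. Properties shown on a person node: name = “Emil”, age = 29, sex = “yes”. Properties shown on a relationship: type = KNOWS, time = 4 years. Properties shown on another node: type = car, vendor = “SAAB”, model = “95 Aero”.]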
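The three abstractions on this slide (nodes, relationships, properties on both) can be sketched in plain Java without any Neo4j dependency. This is only an illustration: `PropertyContainer`, `SketchNode`, `SketchRelationship` and `PropertyGraphSketch` are hypothetical names, not Neo4j classes, and the assumption that the KNOWS relationship runs from node 1 to node 2 is read off the diagram rather than stated in the slide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not the Neo4j classes.
class PropertyContainer {
    private final Map<String, Object> properties = new HashMap<>();

    void setProperty(String key, Object value) { properties.put(key, value); }

    Object getProperty(String key) { return properties.get(key); }
}

class SketchNode extends PropertyContainer {
    final long id;
    final List<SketchRelationship> relationships = new ArrayList<>();

    SketchNode(long id) { this.id = id; }

    // Creates a typed, directed relationship from this node to 'other'
    // and registers it on both endpoints.
    SketchRelationship createRelationshipTo(SketchNode other, String type) {
        SketchRelationship rel = new SketchRelationship(this, other, type);
        relationships.add(rel);
        other.relationships.add(rel);
        return rel;
    }
}

class SketchRelationship extends PropertyContainer {
    final SketchNode start;
    final SketchNode end;
    final String type;

    SketchRelationship(SketchNode start, SketchNode end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }
}

class PropertyGraphSketch {
    // Builds node 1 ("Emil") and node 2, joined by a KNOWS relationship
    // (assumed direction: 1 -> 2) that carries its own property.
    static SketchRelationship buildExample() {
        SketchNode emil = new SketchNode(1);
        emil.setProperty("name", "Emil");
        emil.setProperty("age", 29);

        SketchNode friend = new SketchNode(2);

        SketchRelationship knows = emil.createRelationshipTo(friend, "KNOWS");
        knows.setProperty("time", "4 years");
        return knows;
    }

    public static void main(String[] args) {
        SketchRelationship knows = buildExample();
        System.out.println(knows.start.getProperty("name") + " -[" + knows.type
                + ", time = " + knows.getProperty("time") + "]-> node " + knows.end.id);
    }
}
```

Because nodes and relationships share the same property mechanism, adding a new attribute to one person or one friendship needs no schema change, which is the point the earlier "semi-structured" slide makes.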
16. Example: The Matrix
[Diagram: the Matrix graph, with nodes numbered 1, 2, 3, 7, 13 and 42.
Node properties shown: name = “Thomas Anderson”, age = 29; name = “Morpheus”, rank = “Captain”, occupation = “Total badass”; name = “Trinity”; name = “Cypher”, last name = “Reagan”; name = “Agent Smith”, version = 1.0b, language = C++; name = “The Architect”.
Relationships shown: several KNOWS relationships, some annotated with disclosure = public or disclosure = secret and age = 3 days or age = 6 months, and one CODED_BY relationship.]
17. Code (1): Building a node space
NeoService neo = ... // Get factory
// Create Thomas 'Neo' Anderson
Node mrAnderson = neo.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );
// Create Morpheus
Node morpheus = neo.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );
// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly
18. Code (1): Building a node space
NeoService neo = ... // Get factory
Transaction tx = neo.begin();
// Create Thomas 'Neo' Anderson
Node mrAnderson = neo.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );
// Create Morpheus
Node morpheus = neo.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );
// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly
tx.commit();
19. Code (1b): Defining RelationshipTypes
// In package org.neo4j.api.core
public interface RelationshipType
{
    String name();
}
// In package org.yourdomain.yourapp
// Example on how to roll dynamic RelationshipTypes
class MyDynamicRelType implements RelationshipType
{
    private final String name;
    MyDynamicRelType( String name ){ this.name = name; }
    public String name() { return this.name; }
}
// Example on how to kick it, static-RelationshipType-like
enum MyStaticRelTypes implements RelationshipType
{
    KNOWS,
    WORKS_FOR,
}
20. The Graph DB model: traversal
Traverser framework for high-performance traversing across the node space
[Diagram: the three-node example from slide 15 (the “Emil” node, the KNOWS relationship and the SAAB car node).]
21. Example: Mr Anderson’s friends
[Diagram: the Matrix graph from slide 16.]
22. Code (2): Traversing a node space
// Instantiate a traverser that returns Mr Anderson's friends
Traverser friendsTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    ReturnableEvaluator.ALL_BUT_START_NODE,
    RelTypes.KNOWS,
    Direction.OUTGOING );
// Traverse the node space and print out the result
System.out.println( "Mr Anderson's friends:" );
for ( Node friend : friendsTraverser )
{
    System.out.printf( "At depth %d => %s%n",
        friendsTraverser.currentPosition().getDepth(),
        friend.getProperty( "name" ) );
}
23. [Diagram: the Matrix graph from slide 16, shown next to the traverser from slide 22 and its output.]
friendsTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    ReturnableEvaluator.ALL_BUT_START_NODE,
    RelTypes.KNOWS,
    Direction.OUTGOING );
$ bin/start-neo-example
Mr Anderson's friends:
At depth 1 => Morpheus
At depth 1 => Trinity
At depth 2 => Cypher
At depth 3 => Agent Smith
$
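What the traverser does here can be approximated without Neo4j as a plain breadth-first search that follows only outgoing KNOWS edges and records the depth of each node. The sketch below is not the Neo4j traverser implementation; the class `FriendsBfsSketch` is hypothetical, and the edge list is an assumption reconstructed from the diagram and the printed output above.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class FriendsBfsSketch {
    // Outgoing KNOWS edges only. Assumed from the slide's diagram and output,
    // not taken from the example's source code.
    static final Map<String, List<String>> KNOWS = Map.of(
            "Thomas Anderson", List.of("Morpheus", "Trinity"),
            "Morpheus", List.of("Trinity", "Cypher"),
            "Cypher", List.of("Agent Smith"));

    // Returns every node reachable from 'start' (excluding 'start' itself),
    // mapped to its depth, in breadth-first visiting order.
    static Map<String, Integer> friendsOf(String start) {
        Map<String, Integer> depth = new LinkedHashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        depth.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String next : KNOWS.getOrDefault(current, List.of())) {
                if (!depth.containsKey(next)) {
                    depth.put(next, depth.get(current) + 1);
                    queue.add(next);
                }
            }
        }
        depth.remove(start); // plays the role of ALL_BUT_START_NODE
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Mr Anderson's friends:");
        friendsOf("Thomas Anderson").forEach(
                (name, d) -> System.out.printf("At depth %d => %s%n", d, name));
    }
}
```

The Architect never appears in the result because it is only reachable over CODED_BY, which this search, like the traverser configured with `RelTypes.KNOWS`, does not follow.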
24. Example: Friends in love?
[Diagram: the Matrix graph from slide 16, with a LOVES relationship added.]
25. Code (3a): Custom traverser
// Create a traverser that returns all “friends in love”
Traverser loveTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    new ReturnableEvaluator()
    {
        public boolean isReturnableNode( TraversalPosition pos )
        {
            return pos.currentNode().hasRelationship(
                RelTypes.LOVES, Direction.OUTGOING );
        }
    },
    RelTypes.KNOWS,
    Direction.OUTGOING );
26. Code (3a): Custom traverser
// Traverse the node space and print out the result
System.out.println( "Who’s a lover?" );
for ( Node person : loveTraverser )
{
    System.out.printf( "At depth %d => %s%n",
        loveTraverser.currentPosition().getDepth(),
        person.getProperty( "name" ) );
}
27. [Diagram: the Matrix graph from slide 24 (including the LOVES relationship), shown next to the custom evaluator from slide 25 and its output.]
new ReturnableEvaluator()
{
    public boolean isReturnableNode( TraversalPosition pos )
    {
        return pos.currentNode().hasRelationship(
            RelTypes.LOVES, Direction.OUTGOING );
    }
},
$ bin/start-neo-example
Who’s a lover?
At depth 1 => Trinity
$
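The custom evaluator separates two concerns: which edges the traversal walks (KNOWS) and which visited nodes it returns (those with an outgoing LOVES relationship). A minimal plain-Java sketch of that split follows, with a `Predicate` standing in for `isReturnableNode()`. The class `LoveFilterSketch` is hypothetical, and both edge maps are assumptions; in particular, the output only tells us that Trinity has an outgoing LOVES relationship, so its target here is a guess.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.function.Predicate;

class LoveFilterSketch {
    // Assumed edges, reconstructed from the diagram and the printed output.
    static final Map<String, List<String>> KNOWS = Map.of(
            "Thomas Anderson", List.of("Morpheus", "Trinity"),
            "Morpheus", List.of("Trinity", "Cypher"),
            "Cypher", List.of("Agent Smith"));
    // The target of Trinity's LOVES edge is an assumption.
    static final Map<String, List<String>> LOVES = Map.of(
            "Trinity", List.of("Thomas Anderson"));

    // Walks KNOWS edges breadth-first, but only returns the visited nodes
    // that the 'returnable' predicate accepts.
    static List<String> traverse(String start, Predicate<String> returnable) {
        List<String> result = new ArrayList<>();
        Set<String> seen = new HashSet<>(Set.of(start));
        Queue<String> queue = new ArrayDeque<>(List.of(start));
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (returnable.test(current)) {
                result.add(current);
            }
            for (String next : KNOWS.getOrDefault(current, List.of())) {
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The predicate asks: does this node have an outgoing LOVES edge?
        System.out.println(traverse("Thomas Anderson", LOVES::containsKey));
    }
}
```

Swapping the predicate changes what is returned without touching the walk itself, which mirrors how the slide reuses the same traversal order and relationship type as the earlier friends example.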
28. Bonus code: domain model
How do you implement your domain model?
Use the delegator pattern, i.e. every domain entity
wraps a Neo4j primitive:
// In package org.yourdomain.yourapp
class PersonImpl implements Person
{
    private final Node underlyingNode;
    PersonImpl( Node node ){ this.underlyingNode = node; }
    public String getName()
    {
        // getProperty() returns Object, so cast to String
        return (String) this.underlyingNode.getProperty( "name" );
    }
    public void setName( String name )
    {
        this.underlyingNode.setProperty( "name", name );
    }
}
29. Domain layer frameworks
Qi4j (www.qi4j.org)
Framework for doing DDD in pure Java5
Defines Entities / Associations / Properties
Sound familiar? Nodes / Rel’s / Properties!
Neo4j is an “EntityStore” backend
NeoWeaver (http://components.neo4j.org/neo-weaver)
Weaves Neo4j-backed persistence into domain
objects in runtime (dynamic proxy / cglib based)
Veeeery alpha, but veeery cool
30. Neo4j system characteristics
Disk-based
Native graph storage engine with custom (“SSD-ready”) binary on-disk format
Transactional
JTA/JTS, XA, 2PC, Tx recovery, deadlock
detection, etc
Scalable
Several billions of nodes/rels/props on single JVM
Robust
5+ years in 24/7 production
31. Social network pathExists()
~1k persons
Avg 50 friends per person
pathExists(a, b) limit depth 4
Two backends
Eliminate disk IO so warm up caches
[Diagram: a small example graph with numbered nodes (1, 3, 5, 7, 12, 36, 41, 77).]
32. Social network pathExists()
[Diagram: an example social graph with numbered nodes labeled Emil, Mike, Kevin, John, Marcus, Bruce and Leigh.]
Backend                        # persons    query time
Relational database (MySQL)    1 000        2 000 ms
Graph database (Neo4j)         1 000        2 ms
Graph database (Neo4j)         1 000 000    2 ms
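For reference, the query being timed, pathExists(a, b) with a depth limit of 4, amounts to a depth-limited breadth-first search. The sketch below shows that logic in plain Java over an in-memory adjacency map. It is not the benchmark code and says nothing about either backend's performance; the class `PathExistsSketch`, the use of directed adjacency lists, and the sample chain are all assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class PathExistsSketch {
    // Returns true if 'to' can be reached from 'from' in at most maxDepth hops.
    // 'friends' maps a person id to the ids that person links to (directed here;
    // for an undirected friendship, list the edge in both directions).
    static boolean pathExists(Map<Integer, List<Integer>> friends,
                              int from, int to, int maxDepth) {
        if (from == to) {
            return true;
        }
        Set<Integer> seen = new HashSet<>(Set.of(from));
        Queue<Integer> frontier = new ArrayDeque<>(List.of(from));
        // Expand one level per iteration so the depth limit is exact.
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            Queue<Integer> next = new ArrayDeque<>();
            for (int person : frontier) {
                for (int friend : friends.getOrDefault(person, List.of())) {
                    if (friend == to) {
                        return true;
                    }
                    if (seen.add(friend)) {
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return false;
    }

    public static void main(String[] args) {
        // A hypothetical chain 1 -> 2 -> 3 -> 4 -> 5 -> 6, not data from the benchmark.
        Map<Integer, List<Integer>> chain = Map.of(
                1, List.of(2), 2, List.of(3), 3, List.of(4), 4, List.of(5), 5, List.of(6));
        System.out.println(pathExists(chain, 1, 5, 4)); // 5 is 4 hops away
        System.out.println(pathExists(chain, 1, 6, 4)); // 6 is 5 hops away
    }
}
```

With an average of 50 friends per person, each extra level multiplies the number of candidates by up to 50, which is why the depth limit matters for this kind of query.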
33. Pros & Cons compared to RDBMS
+ No O/R impedance mismatch (whiteboard friendly)
+ Can easily evolve schemas
+ Can represent semi-structured info
+ Can represent graphs/networks (with performance)
- Lacks in tool and framework support
- No other implementations => potential lock in
- No support for ad-hoc queries
34. More consequences
Ability to capture semi-structured information
=> allowing individualization of content
No predefined schema
=> easier to evolve model
=> can capture ad-hoc relationships
Can capture non-normative relations
=> easy to model specific links to specific sets
All state is kept in transactional memory
=> improves application concurrency
35. The Neo4j ecosystem
Neo4j is an embedded database
Tiny teeny lil jar file
Component ecosystem
index-util
neo-meta
neo-utils
owl2neo
sparql-engine
...
See http://components.neo4j.org
36. Example: NeoRDF
[Diagram: component stack with the labels NeoRDF triple/quad store, OWL, SPARQL, RDF, Metamodel, Graph match and Neo4j.]
37. Future development
Productify RDF support (Neo4j 1.1)
Tool support (Neo4j 1.1 and onwards)
Language bindings
Currently Jython, Python, Ruby
Probably works well with Groovy, Beanshell, etc
Tomorrow Scala? .NET? Erlang?
Standalone server
Experimental RemoteNeo in laboratory right now
How standardize REST API? SPARQL protocol?
38. Future development
Distribution (Neo4j 2.0), current thoughts:
Best bet today: sharding on top of
(Infiniflow) from Paremus: www.codecauldron.org
Fundamentals:
CAP theorem
BASE (“ACID 2.0”)
Eventual consistency
Separate HA and data partitioning
Generic clustering algorithm as base case, but
give lots of knobs for developers
39. How ego are you? (aka other impls?)
Franz’ AllegroGraph (http://agraph.franz.com)
Proprietary, Lisp, RDF-oriented but real graphdb
FreeBase graphd (http://blog.freebase.com/2008/04/09/a-brief-tour-of-graphd/)
In-house at Metaweb
Kloudshare (http://whydoeseverythingsuck.com)
Graph database in the cloud, still stealth mode
Some academic papers from ~10 years ago
G = {V, E}
40. Conclusion
Graphs && Neo4j => teh awesome!
Available NOW under AGPLv3 / commercial license
AGPLv3: “if you’re open source, we’re open source”
If you have proprietary software? Must buy a
commercial license
But up to 1M primitives it’s free for all uses!
Download
http://neo4j.org
Feedback
http://lists.neo4j.org
43. Neo4j architecture gotchas
Focus on the domain (whiteboard friendly –
domain-first development)
Purpose of the domain layer:
“an adaptation of the generic node space to a
type-safe, object-oriented abstraction expressed
in the vocabulary of our domain” (!)
Implementation mindset
Assume the node space is always in memory
Assume everything is automatically persistent
Focus on logical transactions instead of artificial
load/stores
44. Implementation pointers
Use the delegator pattern, i.e. every domain entity
wraps a Neo4j primitive as follows:
Actor actor = new ActorImpl( underlyingNode );
Remember that the wrappers are extremely
lightweight – they contain no state except for a
reference – so create them freely!
It is good practice to override equals/hashCode
Delegate to underlying{Node,Relationship}’s
equals() or hashCode() implementation
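The equals/hashCode advice can be sketched as follows. Because wrappers are created freely, two wrapper objects around the same node should compare equal, otherwise they misbehave in sets and maps. This is a self-contained illustration only: `StubNode` is a hypothetical stand-in for a Neo4j node, and `ActorImpl` here omits the `Actor` interface from the slide to keep the example short.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a Neo4j node; its identity is its id.
final class StubNode {
    private final long id;

    StubNode(long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof StubNode && ((StubNode) o).id == id;
    }

    @Override
    public int hashCode() { return Long.hashCode(id); }
}

// Lightweight wrapper: no state except the reference to the underlying node.
final class ActorImpl {
    private final StubNode underlyingNode;

    ActorImpl(StubNode underlyingNode) { this.underlyingNode = underlyingNode; }

    // Two wrappers are equal exactly when they wrap equal nodes.
    @Override
    public boolean equals(Object o) {
        return o instanceof ActorImpl
                && underlyingNode.equals(((ActorImpl) o).underlyingNode);
    }

    @Override
    public int hashCode() { return underlyingNode.hashCode(); }
}

class WrapperEqualitySketch {
    public static void main(String[] args) {
        StubNode node = new StubNode(42);
        Set<ActorImpl> actors = new HashSet<>();
        actors.add(new ActorImpl(node));
        actors.add(new ActorImpl(node)); // a second wrapper around the same node
        System.out.println(actors.size()); // the set treats both wrappers as one actor
    }
}
```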