SlideShare a Scribd company logo
Avoiding Deadlocks in Neo4j on
Z-Platform
- Mahesh Chaudhari, Cesar Arevalo &
Brian Roy
Outline
•
•
•
•
•
•

Introduction to the Z-Platform
Problems caused by Deadlocks
Locks and Deadlocks in Neo4j
Avoidance using Bipartite graphs
Performance
Conclusion
2
Full Entity
Profiles

Z-Platform
Sparse
Representatio
n of Profiles

MongoDB
Database

Json
Documents

Nodes & Edges
Neo4j Graph
Database

Z-Platform

Source Datasets
3
Deadlocks in Z-Platform
• Creating relationships is one of the most time
consuming processes
• Log analysis reveals deadlocks among batch
transactions and retry-mechanism takes time
• Dependent on how nodes and relationships are
grouped together
• Batch size is dependent on the size of the JSON
block sent to the server
• Time required to build relationships and resolve
deadlocks is in the order of seconds
4

Recommended for you

Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redis

This document discusses using Redis to solve real world problems. It provides 4 patterns for using Redis: [1] as a simple, fast object store using hashes; [2] indexing objects with sorted sets; [3] using bitmaps for unique value counting; and [4] resolving locations with geohashing. Each pattern is explained and code examples are provided to illustrate how to model problems in Redis and leverage its data structures and performance. A variety of use cases are presented that can benefit from each pattern.

databasesnosqlmemcache
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture

Introduction to Apache Spark, understanding of the architecture, resilient distributed datasets and working.

sparkbig datanosql
MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals

The document provides an overview of MongoDB sharding, including: - Sharding allows horizontal scaling of data by partitioning a database across multiple servers or shards. - The MongoDB sharding architecture consists of shards to hold data, config servers to store metadata, and mongos processes to route requests. - Data is partitioned into chunks based on a shard key and chunks can move between shards as the data distribution changes.

shardingshard keymongodb
Locks in Neo4j
• Create a Node n1  Write Lock on Node n1
• Update a Node n1  Write Lock on Node n1
but read available on Node n1
• Create a Relationship r1 between nodes n1
and n2  Write Locks on relationship r1, n1
and n2

5
Deadlocks across processes
A

B

P1

C

D

• Processes: P1 and P2
• Nodes: A, B, C, D
• Relationships: R1, R2, R3, R4

P2

R1
A

No Deadlocks

B
R4

A

C

B
R2

No Deadlocks

C

R1

R1
A

P1

P1

A

B

P1

R3
A

R3
D

P2

A

D

P2

Possibility of Deadlock

D
R2

C
Deadlock

6

P2
D
Deadlocks across Transactions
• Transactions are also like separate processes
but in a single thread or multiple threads
• Deadlocks occur across transactions
– Two concurrent transactions need write locks on
the same node n1
– In two concurrent transactions, T1 has write lock
on node n1 and waiting on write lock on node n2
whereas T2 has write lock on n2 and is waiting for
write lock on node n1
– Transactions of varying sizes
7
Concurrent Transactions Deadlocks
A

B

T1

A

B

T1

C

D

T2

A

D

T2

No Deadlocks

Possibility of Deadlock

8

Recommended for you

What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...

This presentation about Apache Spark covers all the basics that a beginner needs to know to get started with Spark. It covers the history of Apache Spark, what is Spark, the difference between Hadoop and Spark. You will learn the different components in Spark, and how Spark works with the help of architecture. You will understand the different cluster managers on which Spark can run. Finally, you will see the various applications of Spark and a use case on Conviva. Now, let's get started with what is Apache Spark. Below topics are explained in this Spark presentation: 1. History of Spark 2. What is Spark 3. Hadoop vs Spark 4. Components of Apache Spark 5. Spark architecture 6. Applications of Spark 7. Spark usecase What is this Big Data Hadoop training course about? The Big Data Hadoop and Spark developer course have been designed to impart an in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab. What are the course objectives? Simplilearn’s Apache Spark and Scala certification training are designed to: 1. Advance your expertise in the Big Data Hadoop Ecosystem 2. Help you master essential Apache and Spark skills, such as Spark Streaming, Spark SQL, machine learning programming, GraphX programming and Shell Scripting Spark 3. Help you land a Hadoop developer job requiring Apache Spark expertise by giving you a real-life industry project coupled with 30 demos What skills will you learn? By completing this Apache Spark and Scala course you will be able to: 1. Understand the limitations of MapReduce and the role of Spark in overcoming these limitations 2. Understand the fundamentals of the Scala programming language and its features 3. Explain and master the process of installing Spark as a standalone cluster 4. Develop expertise in using Resilient Distributed Datasets (RDD) for creating applications in Spark 5. Master Structured Query Language (SQL) using SparkSQL 6. Gain a thorough understanding of Spark streaming features 7. Master and describe the features of Spark ML programming and GraphX programming Who should take this Scala course? 1. Professionals aspiring for a career in the field of real-time big data analytics 2. Analytics professionals 3. Research professionals 4. IT developers and testers 5. Data scientists 6. BI and reporting professionals 7. Students who wish to gain a thorough understanding of Apache Spark Learn more at https://www.simplilearn.com/big-data-and-analytics/apache-spark-scala-certification-training

what is apache sparkintroduction to apache sparkapache spark tutorial
Spark and S3 with Ryan Blue
Spark and S3 with Ryan BlueSpark and S3 with Ryan Blue
Spark and S3 with Ryan Blue

Netflix’s Big Data Platform team manages data warehouse in Amazon S3 with over 60 petabytes of data and writes hundreds of terabytes of data every day. At this scale, output committers that create extra copies or can’t handle task failures are no longer practical. This talk will explain the problems that are caused by the available committers when writing to S3, and show how Netflix solved the committer problem. In this session, you’ll learn: – Some background about Spark at Netflix – About output committers, and how both Spark and Hadoop handle failures – How HDFS and S3 differ, and why HDFS committers don’t work well – A new output committer that uses the S3 multi-part upload API – How you can use this new committer in your Spark applications to avoid duplicating data

spark summitapache spark
Spark Summit EU 2015: Lessons from 300+ production users
Spark Summit EU 2015: Lessons from 300+ production usersSpark Summit EU 2015: Lessons from 300+ production users
Spark Summit EU 2015: Lessons from 300+ production users

At Databricks, we have a unique view into over a hundred different companies trying out Spark for development and production use-cases, from their support tickets and forum posts. Having seen so many different workflows and applications, some discernible patterns emerge when looking at common performance and scalability issues that our users run into. This talk will discuss some of these common common issues from an engineering and operations perspective, describing solutions and clarifying misconceptions.

databricksapache sparkspark summit
Sequential Asynchronous Transactions
Deadlocks
A

E

B
.
. n edges
.
F

A
T1
E

B

A

.
. n edges
.
F

D

T1

E

C

No Deadlocks

D

T2

A

T1

D

T2

A

F
.
. n edges
.
B

No Deadlock

Deadlock
9

T2
Deadlocks Detection and Avoidance
• Deadlocks Detection
– Only possible at run-time
– Recovery from deadlock is either to abort or retry

• Deadlocks Avoidance
– Reorder the operations to lower or eliminate the
likelihood of deadlocks
• Graph Clustering Algorithms: Most of them require
knowledge of entire graph

Clustering Relationships  Bipartite Graphs
10
Bipartite Graphs
• Given a Graph G with Vertices V and Edges
E, then graph G is a bipartite graph such that
vertices V can be partitioned into two
independent sets V1 and V2.
V1

A

V2

A

E

D
C
C

D

E
B

B
11
Creating Bipartite Graphs
• Use two colors to color each node such that
no two adjacent nodes have the same color.
1

2

A

V1

V2

A

E

D

C
C

D

E

B

B

12

Recommended for you

Presto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesPresto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation Engines

The architectural tradeoffs between the map/reduce paradigm and parallel databases has been a long and open discussion since the dawn of MapReduce over more than a decade ago. At Facebook, we have spent the past several years in independently building and scaling both Presto and Spark to Facebook scale batch workloads, and it is now increasingly evident that there is significant value in coupling Presto’s state-of-art low-latency evaluation with Spark’s robust and fault tolerant execution engine.

spark + ai summit

 *
Spark architecture
Spark architectureSpark architecture
Spark architecture

The document discusses Apache Spark, an open source cluster computing framework for real-time data processing. It notes that Spark is up to 100 times faster than Hadoop for in-memory processing and 10 times faster on disk. The main feature of Spark is its in-memory cluster computing capability, which increases processing speeds. Spark runs on a driver-executor model and uses resilient distributed datasets and directed acyclic graphs to process data in parallel across a cluster.

Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4

This document provides an introduction and overview of Apache NiFi 1.11.4. It discusses new features such as improved support for partitions in Azure Event Hubs, encrypted repositories, class loader isolation, and support for IBM MQ and the Hortonworks Schema Registry. It also summarizes new reporting tasks, controller services, and processors. Additional features include JDK 11 support, encrypted repositories, and parameter improvements to support CI/CD. The document provides examples of using NiFi with Docker, Kubernetes, and in the cloud. It concludes with useful links for additional NiFi resources.

apachenifiapache-nifiapache nifi
Non-Bipartite Graphs
1

2
V1

A

E

V2

A
D
C

C

D

E
B

B

13
Algorithm to generate Graph
V1

V2

A

D
C
E
B

• Create all the nodes
• Create batches of
relationships among
the same colored
nodes
• Create batches of
relationships across
the two colors
14
Algorithm in Z-Platform
• Batch of relationships R = {r1, r2, r3….. rn} :
– each r is a triplet {src, dest, props} where src and dest
are nodes and props is a set of key-value pairs

• Color the nodes based on each relationship with
two colors
• Mark the conflicting edges where both the src
and dest nodes are of the same color
• Batch these relationships together in a single
batch
• Start grouping the remaining edges such that no
two batches have any node in common
15
Performance – Test Setup
•
•
•
•
•

JDK 1.7
Neo4j Java Binding Rest API
Neo4j Enterprise Server 1.9
Batch size (configurable) : 2000
Test Program that generates random nodes
(max 1000) and relationships (max 10,000)
• Huge file that contains 10,226 nodes and
39,564,960 relationships (5 GB)
16

Recommended for you

ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data

ORC files were originally introduced in Hive, but have now migrated to an independent Apache project. This has sped up the development of ORC and simplified integrating ORC into other projects, such as Hadoop, Spark, Presto, and Nifi. There are also many new tools that are built on top of ORC, such as Hive’s ACID transactions and LLAP, which provides incredibly fast reads for your hot data. LLAP also provides strong security guarantees that allow each user to only see the rows and columns that they have permission for. This talk will discuss the details of the ORC and Parquet formats and what the relevant tradeoffs are. In particular, it will discuss how to format your data and the options to use to maximize your read performance. In particular, we’ll discuss when and how to use ORC’s schema evolution, bloom filters, and predicate push down. It will also show you how to use the tools to translate ORC files into human-readable formats, such as JSON, and display the rich metadata from the file including the type in the file and min, max, and count for each column.

dataworks summitdataworks summit 2017hadoop summit
Achieve Blazing-Fast Ingest Speeds with Apache Arrow
Achieve Blazing-Fast Ingest Speeds with Apache ArrowAchieve Blazing-Fast Ingest Speeds with Apache Arrow
Achieve Blazing-Fast Ingest Speeds with Apache Arrow

The document is a presentation about using Apache Arrow to improve the speed of graph queries and analytics. It discusses how Arrow uses columnar data formats and vectorization to enable faster data processing. It also provides an example of how Arrow could be incorporated into Spark jobs and used with Beam to perform embarrassingly parallel graph processing. The presentation envisions future Neo4j integrations that would make Arrow processing transparent to data scientists.

Oracle RAC 19c: Best Practices and Secret Internals
Oracle RAC 19c: Best Practices and Secret InternalsOracle RAC 19c: Best Practices and Secret Internals
Oracle RAC 19c: Best Practices and Secret Internals

Oracle Real Application Clusters 19c provides best practices and new features for upgrading to Oracle 19c. It discusses upgrading Oracle RAC to Linux 7 with minimal downtime using node draining and relocation techniques. Oracle 19c allows for upgrading the Grid Infrastructure management repository and patching faster using a new Oracle home. The presentation also covers new resource modeling for PDBs in Oracle 19c and improved Clusterware diagnostics.

oracle racracperformance
Performance – Creating Nodes
Time in seconds for Nodes
1.8
1.6
1.4
1.2
1
Time in Secs

0.8
0.6
0.4
0.2
0
1

2

3

4

5

6

• 10,226 Nodes: 5.07 seconds
• Average Time for 2000 Nodes: 0.99 seconds ~ 1 second
• Each Node has 11 properties
17
Performance – Creating Relationships
Time in Seconds for relationships
1.4
1.2
1

0.8
Time in Secs

0.6
0.4
0.2
0
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87

• 1,74,000 Relationships created in 47.16 seconds
• Average Time for 2000 relationships: 0.54 seconds
• Number of relationships per second: 3,689
18
Performance – Creating 39 Million Relationships

• 39,564,960 Relationships in : 10,573.56 seconds (2 hrs 56 mins 13 seconds)
• Average Time for 2000 relationships: 0.53 seconds
• Number of relationships per second: 3,741
19
Graph Visualized in the Neo4j 2.0

20

Recommended for you

The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...

So you know you want to write a streaming app, but any non-trivial streaming app developer would have to think about these questions: – How do I manage offsets? – How do I manage state? – How do I make my Spark Streaming job resilient to failures? Can I avoid some failures? – How do I gracefully shutdown my streaming job? – How do I monitor and manage my streaming job (i.e. re-try logic)? – How can I better manage the DAG in my streaming job? – When do I use checkpointing, and for what? When should I not use checkpointing? – Do I need a WAL when using a streaming data source? Why? When don’t I need one? This session will share practices that no one talks about when you start writing your streaming app, but you’ll inevitably need to learn along the way.

spark summitapache spark
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals

Slides cover Spark core concepts of Apache Spark such as RDD, DAG, execution workflow, forming stages of tasks and shuffle implementation and also describes architecture and main components of Spark Driver. The workshop part covers Spark execution modes , provides link to github repo which contains Spark Applications examples and dockerized Hadoop environment to experiment with

shufflesparksqlspark
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course

Introduction: This workshop will provide a hands-on introduction to Apache Spark using the HDP Sandbox on students’ personal machines. Format: A short introductory lecture about Apache Spark components used in the lab followed by a demo, lab exercises and a Q&A session. The lecture will be followed by lab time to work through the lab exercises and ask questions. Objective: To provide a quick and short hands-on introduction to Apache Spark. This lab will use the following Spark and Apache Hadoop components: Spark, Spark SQL, Apache Hadoop HDFS, Apache Hadoop YARN, Apache ORC, and Apache Ambari User Views. You will learn how to move data into HDFS using Spark APIs, create Apache Hive tables, explore the data with Spark and Spark SQL, transform the data and then issue some SQL queries. Pre-requisites: Registrants must bring a laptop that can run the Hortonworks Data Cloud. Speaker: Robert Hryniewicz, Developer Advocate, Hortonworks

apache sparkcrash coursetutorial
Future Work
• Test performance over the network using Amazon EC2
servers to mimic real world setup
• Single threaded application  multi-threaded to see if
better performance
– More complex algorithm to batch relationships together
– Analyze if the complexity is worth the performance
improvement

• Vary multiple factors:
– Batch size : 1000 to 4000
– Properties (relationship descriptors) : 2 – 20

• Dispatcher Pattern to facilitate the single point distribution
of nodes and relationships to threads/Transactions

21
Conclusion
• Deadlocks in general are time consuming and
difficult to detect and prevent
• Use of graph coloring to partition graph into
conflicting and non-conflicting edges
• Successful prototype tests shows significant
improvement in building relationships varying
from small number to a very large number

22
Dr. Mahesh Chaudhari
Sr. Software Engineer
+1 602 524 0610
mahesh@zephyrhealthinc.com

jobs@zephyrhealthinc.com
23
Contact Information
Sven Junkergård

Brian Roy

Director of Technology
+1 415 503 7412
sven@zephyrhealthinc.com

Director of Platform Engineering & Architect
+1 415 663 6919
brian@zephyrhealthinc.com

Zephyr Health Inc.
589 Howard St. 3rd Flr.
San Francisco, California 94105
+1.415.529.7649
zephyrhealthinc.com

24

Recommended for you

Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture

This is the presentation I made on JavaDay Kiev 2015 regarding the architecture of Apache Spark. It covers the memory model, the shuffle implementations, data frames and some other high-level staff and can be used as an introduction to Apache Spark

apache sparkdistributed systemtungsten
How Dashtable Helps Dragonfly Maintain Low Latency
How Dashtable Helps Dragonfly Maintain Low LatencyHow Dashtable Helps Dragonfly Maintain Low Latency
How Dashtable Helps Dragonfly Maintain Low Latency

Dashtable is a hashtable implementation inside Dragonfly. It supports incremental resizes and fast, cache-friendly operations. In this talk, we will learn how Dashtable helps Dragonfly to keep its tail latency in check. In Dashtable, long-tail latencies have been reduced by a factor of 1000x, but P999 are 7x longer. Find out why we still think this is a good tradeoff.

high throughput and low latencyp99p99 conf
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...

Keynote presentation at GraML 2018, The Second Workshop on the Intersection of Graph Algorithms and Machine Learning, Co-located with IPDPS 2018.

More Related Content

What's hot

Apache Spark overview
Apache Spark overviewApache Spark overview
Apache Spark overview
DataArt
 
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
HostedbyConfluent
 
Best practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultBest practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at Renault
DataWorks Summit
 
Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redis
Dvir Volk
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
Sohil Jain
 
MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals
Antonios Giannopoulos
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
Simplilearn
 
Spark and S3 with Ryan Blue
Spark and S3 with Ryan BlueSpark and S3 with Ryan Blue
Spark and S3 with Ryan Blue
Databricks
 
Spark Summit EU 2015: Lessons from 300+ production users
Spark Summit EU 2015: Lessons from 300+ production usersSpark Summit EU 2015: Lessons from 300+ production users
Spark Summit EU 2015: Lessons from 300+ production users
Databricks
 
Presto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesPresto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation Engines
Databricks
 
Spark architecture
Spark architectureSpark architecture
Spark architecture
GauravBiswas9
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4
Timothy Spann
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
DataWorks Summit
 
Achieve Blazing-Fast Ingest Speeds with Apache Arrow
Achieve Blazing-Fast Ingest Speeds with Apache ArrowAchieve Blazing-Fast Ingest Speeds with Apache Arrow
Achieve Blazing-Fast Ingest Speeds with Apache Arrow
Neo4j
 
Oracle RAC 19c: Best Practices and Secret Internals
Oracle RAC 19c: Best Practices and Secret InternalsOracle RAC 19c: Best Practices and Secret Internals
Oracle RAC 19c: Best Practices and Secret Internals
Anil Nair
 
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
Databricks
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals
Anton Kirillov
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
DataWorks Summit
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
Alexey Grishchenko
 
How Dashtable Helps Dragonfly Maintain Low Latency
How Dashtable Helps Dragonfly Maintain Low LatencyHow Dashtable Helps Dragonfly Maintain Low Latency
How Dashtable Helps Dragonfly Maintain Low Latency
ScyllaDB
 

What's hot (20)

Apache Spark overview
Apache Spark overviewApache Spark overview
Apache Spark overview
 
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
 
Best practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultBest practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at Renault
 
Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redis
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
 
MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Spark and S3 with Ryan Blue
Spark and S3 with Ryan BlueSpark and S3 with Ryan Blue
Spark and S3 with Ryan Blue
 
Spark Summit EU 2015: Lessons from 300+ production users
Spark Summit EU 2015: Lessons from 300+ production usersSpark Summit EU 2015: Lessons from 300+ production users
Spark Summit EU 2015: Lessons from 300+ production users
 
Presto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesPresto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation Engines
 
Spark architecture
Spark architectureSpark architecture
Spark architecture
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
 
Achieve Blazing-Fast Ingest Speeds with Apache Arrow
Achieve Blazing-Fast Ingest Speeds with Apache ArrowAchieve Blazing-Fast Ingest Speeds with Apache Arrow
Achieve Blazing-Fast Ingest Speeds with Apache Arrow
 
Oracle RAC 19c: Best Practices and Secret Internals
Oracle RAC 19c: Best Practices and Secret InternalsOracle RAC 19c: Best Practices and Secret Internals
Oracle RAC 19c: Best Practices and Secret Internals
 
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
How Dashtable Helps Dragonfly Maintain Low Latency
How Dashtable Helps Dragonfly Maintain Low LatencyHow Dashtable Helps Dragonfly Maintain Low Latency
How Dashtable Helps Dragonfly Maintain Low Latency
 

Similar to Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoDB - Mahesh Chaudhari @ GraphConnect SF 2013

The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
Nesreen K. Ahmed
 
High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
Nesreen K. Ahmed
 
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui MengChallenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Databricks
 
Challenging Web-Scale Graph Analytics with Apache Spark
Challenging Web-Scale Graph Analytics with Apache SparkChallenging Web-Scale Graph Analytics with Apache Spark
Challenging Web-Scale Graph Analytics with Apache Spark
Databricks
 
FP Days: Down the Clojure Rabbit Hole
FP Days: Down the Clojure Rabbit HoleFP Days: Down the Clojure Rabbit Hole
FP Days: Down the Clojure Rabbit Hole
Christophe Grand
 
2009 - Node XL v.84+ - Social Media Network Visualization Tools For Excel 2007
2009 - Node XL v.84+ - Social Media Network Visualization Tools For Excel 20072009 - Node XL v.84+ - Social Media Network Visualization Tools For Excel 2007
2009 - Node XL v.84+ - Social Media Network Visualization Tools For Excel 2007
Marc Smith
 
MSF: Sync your Data On-Premises And To The Cloud - dotNetwork Gathering, Oct ...
MSF: Sync your Data On-Premises And To The Cloud - dotNetwork Gathering, Oct ...MSF: Sync your Data On-Premises And To The Cloud - dotNetwork Gathering, Oct ...
MSF: Sync your Data On-Premises And To The Cloud - dotNetwork Gathering, Oct ...
sameh samir
 
[KGC 2012] Online Game Server Architecture Case Study Performance and Security
[KGC 2012] Online Game Server Architecture Case Study Performance and Security[KGC 2012] Online Game Server Architecture Case Study Performance and Security
[KGC 2012] Online Game Server Architecture Case Study Performance and Security
Seungmin Shin
 
Building Identity Graphs over Heterogeneous Data
Building Identity Graphs over Heterogeneous DataBuilding Identity Graphs over Heterogeneous Data
Building Identity Graphs over Heterogeneous Data
Databricks
 
Leveraging Intra-Node Parallelization in HPCC Systems
Leveraging Intra-Node Parallelization in HPCC SystemsLeveraging Intra-Node Parallelization in HPCC Systems
Leveraging Intra-Node Parallelization in HPCC Systems
HPCC Systems
 
Building a Modern Website for Scale (QCon NY 2013)
Building a Modern Website for Scale (QCon NY 2013)Building a Modern Website for Scale (QCon NY 2013)
Building a Modern Website for Scale (QCon NY 2013)
Sid Anand
 
15_NEW-2020-ATTENTION-ENC-DEC-TRANSFORMERS-Lect15.pptx
15_NEW-2020-ATTENTION-ENC-DEC-TRANSFORMERS-Lect15.pptx15_NEW-2020-ATTENTION-ENC-DEC-TRANSFORMERS-Lect15.pptx
15_NEW-2020-ATTENTION-ENC-DEC-TRANSFORMERS-Lect15.pptx
NibrasulIslam
 
Web-Scale Graph Analytics with Apache Spark with Tim Hunter
Web-Scale Graph Analytics with Apache Spark with Tim HunterWeb-Scale Graph Analytics with Apache Spark with Tim Hunter
Web-Scale Graph Analytics with Apache Spark with Tim Hunter
Databricks
 
DSD-INT 2014 - NGHS Workshop Scripting in SOBEK 3 & Delft3D Flexible Mesh - P...
DSD-INT 2014 - NGHS Workshop Scripting in SOBEK 3 & Delft3D Flexible Mesh - P...DSD-INT 2014 - NGHS Workshop Scripting in SOBEK 3 & Delft3D Flexible Mesh - P...
DSD-INT 2014 - NGHS Workshop Scripting in SOBEK 3 & Delft3D Flexible Mesh - P...
Deltares
 
Dominoes: Exploratory Data Analysis of Software Repositories Through GPU Proc...
Dominoes: Exploratory Data Analysis of Software Repositories Through GPU Proc...Dominoes: Exploratory Data Analysis of Software Repositories Through GPU Proc...
Dominoes: Exploratory Data Analysis of Software Repositories Through GPU Proc...
Jose Ricardo da Silva Junior
 
Building Conclave: a decentralized, real-time collaborative text editor
Building Conclave: a decentralized, real-time collaborative text editorBuilding Conclave: a decentralized, real-time collaborative text editor
Building Conclave: a decentralized, real-time collaborative text editor
Sun-Li Beatteay
 
Summit_Tutorial
Summit_TutorialSummit_Tutorial
Summit_Tutorial
Yrineu Felipe Rodrigues
 
mloc.js 2014 - JavaScript and the browser as a platform for game development
mloc.js 2014 - JavaScript and the browser as a platform for game developmentmloc.js 2014 - JavaScript and the browser as a platform for game development
mloc.js 2014 - JavaScript and the browser as a platform for game development
David Galeano
 
18CSL51 - Network Lab Manual.pdf
18CSL51 - Network Lab Manual.pdf18CSL51 - Network Lab Manual.pdf
18CSL51 - Network Lab Manual.pdf
Selvaraj Seerangan
 
Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015
József Makai
 

Similar to Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoDB - Mahesh Chaudhari @ GraphConnect SF 2013 (20)

The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
 
High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
 
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui MengChallenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
 
Challenging Web-Scale Graph Analytics with Apache Spark
Challenging Web-Scale Graph Analytics with Apache SparkChallenging Web-Scale Graph Analytics with Apache Spark
Challenging Web-Scale Graph Analytics with Apache Spark
 
FP Days: Down the Clojure Rabbit Hole
FP Days: Down the Clojure Rabbit HoleFP Days: Down the Clojure Rabbit Hole
FP Days: Down the Clojure Rabbit Hole
 
2009 - Node XL v.84+ - Social Media Network Visualization Tools For Excel 2007
2009 - Node XL v.84+ - Social Media Network Visualization Tools For Excel 20072009 - Node XL v.84+ - Social Media Network Visualization Tools For Excel 2007
2009 - Node XL v.84+ - Social Media Network Visualization Tools For Excel 2007
 
MSF: Sync your Data On-Premises And To The Cloud - dotNetwork Gathering, Oct ...
MSF: Sync your Data On-Premises And To The Cloud - dotNetwork Gathering, Oct ...MSF: Sync your Data On-Premises And To The Cloud - dotNetwork Gathering, Oct ...
MSF: Sync your Data On-Premises And To The Cloud - dotNetwork Gathering, Oct ...
 
[KGC 2012] Online Game Server Architecture Case Study Performance and Security
[KGC 2012] Online Game Server Architecture Case Study Performance and Security[KGC 2012] Online Game Server Architecture Case Study Performance and Security
[KGC 2012] Online Game Server Architecture Case Study Performance and Security
 
Building Identity Graphs over Heterogeneous Data
Building Identity Graphs over Heterogeneous DataBuilding Identity Graphs over Heterogeneous Data
Building Identity Graphs over Heterogeneous Data
 
Leveraging Intra-Node Parallelization in HPCC Systems
Leveraging Intra-Node Parallelization in HPCC SystemsLeveraging Intra-Node Parallelization in HPCC Systems
Leveraging Intra-Node Parallelization in HPCC Systems
 
Building a Modern Website for Scale (QCon NY 2013)
Building a Modern Website for Scale (QCon NY 2013)Building a Modern Website for Scale (QCon NY 2013)
Building a Modern Website for Scale (QCon NY 2013)
 
15_NEW-2020-ATTENTION-ENC-DEC-TRANSFORMERS-Lect15.pptx
15_NEW-2020-ATTENTION-ENC-DEC-TRANSFORMERS-Lect15.pptx15_NEW-2020-ATTENTION-ENC-DEC-TRANSFORMERS-Lect15.pptx
15_NEW-2020-ATTENTION-ENC-DEC-TRANSFORMERS-Lect15.pptx
 
Web-Scale Graph Analytics with Apache Spark with Tim Hunter
Web-Scale Graph Analytics with Apache Spark with Tim HunterWeb-Scale Graph Analytics with Apache Spark with Tim Hunter
Web-Scale Graph Analytics with Apache Spark with Tim Hunter
 
DSD-INT 2014 - NGHS Workshop Scripting in SOBEK 3 & Delft3D Flexible Mesh - P...
DSD-INT 2014 - NGHS Workshop Scripting in SOBEK 3 & Delft3D Flexible Mesh - P...DSD-INT 2014 - NGHS Workshop Scripting in SOBEK 3 & Delft3D Flexible Mesh - P...
DSD-INT 2014 - NGHS Workshop Scripting in SOBEK 3 & Delft3D Flexible Mesh - P...
 
Dominoes: Exploratory Data Analysis of Software Repositories Through GPU Proc...
Dominoes: Exploratory Data Analysis of Software Repositories Through GPU Proc...Dominoes: Exploratory Data Analysis of Software Repositories Through GPU Proc...
Dominoes: Exploratory Data Analysis of Software Repositories Through GPU Proc...
 
Building Conclave: a decentralized, real-time collaborative text editor
Building Conclave: a decentralized, real-time collaborative text editorBuilding Conclave: a decentralized, real-time collaborative text editor
Building Conclave: a decentralized, real-time collaborative text editor
 
Summit_Tutorial
Summit_TutorialSummit_Tutorial
Summit_Tutorial
 
mloc.js 2014 - JavaScript and the browser as a platform for game development
mloc.js 2014 - JavaScript and the browser as a platform for game developmentmloc.js 2014 - JavaScript and the browser as a platform for game development
mloc.js 2014 - JavaScript and the browser as a platform for game development
 
18CSL51 - Network Lab Manual.pdf
18CSL51 - Network Lab Manual.pdf18CSL51 - Network Lab Manual.pdf
18CSL51 - Network Lab Manual.pdf
 
Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015
 

More from Neo4j

BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdfBT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
Neo4j
 
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid ResearchHarnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Neo4j
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
Neo4j
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
Neo4j
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Neo4j
 
Atelier - Architecture d’applications de Graphes - GraphSummit Paris
Atelier - Architecture d’applications de Graphes - GraphSummit ParisAtelier - Architecture d’applications de Graphes - GraphSummit Paris
Atelier - Architecture d’applications de Graphes - GraphSummit Paris
Neo4j
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Neo4j
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
 
FLOA - Détection de Fraude - GraphSummit Paris
FLOA -  Détection de Fraude - GraphSummit ParisFLOA -  Détection de Fraude - GraphSummit Paris
FLOA - Détection de Fraude - GraphSummit Paris
Neo4j
 
SOPRA STERIA - GraphRAG : repousser les limitations du RAG via l’utilisation ...
SOPRA STERIA - GraphRAG : repousser les limitations du RAG via l’utilisation ...SOPRA STERIA - GraphRAG : repousser les limitations du RAG via l’utilisation ...
SOPRA STERIA - GraphRAG : repousser les limitations du RAG via l’utilisation ...
Neo4j
 
ADEO - Knowledge Graph pour le e-commerce, entre challenges et opportunités ...
ADEO -  Knowledge Graph pour le e-commerce, entre challenges et opportunités ...ADEO -  Knowledge Graph pour le e-commerce, entre challenges et opportunités ...
ADEO - Knowledge Graph pour le e-commerce, entre challenges et opportunités ...
Neo4j
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
Neo4j
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
GraphAware - Transforming policing with graph-based intelligence analysis
GraphAware - Transforming policing with graph-based intelligence analysisGraphAware - Transforming policing with graph-based intelligence analysis
GraphAware - Transforming policing with graph-based intelligence analysis
Neo4j
 
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product UpdatesGraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
Neo4j
 

More from Neo4j (20)

BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdfBT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
 
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid ResearchHarnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
Atelier - Architecture d’applications de Graphes - GraphSummit Paris
Atelier - Architecture d’applications de Graphes - GraphSummit ParisAtelier - Architecture d’applications de Graphes - GraphSummit Paris
Atelier - Architecture d’applications de Graphes - GraphSummit Paris
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
FLOA - Détection de Fraude - GraphSummit Paris
FLOA -  Détection de Fraude - GraphSummit ParisFLOA -  Détection de Fraude - GraphSummit Paris
FLOA - Détection de Fraude - GraphSummit Paris
 
SOPRA STERIA - GraphRAG : repousser les limitations du RAG via l’utilisation ...
SOPRA STERIA - GraphRAG : repousser les limitations du RAG via l’utilisation ...SOPRA STERIA - GraphRAG : repousser les limitations du RAG via l’utilisation ...
SOPRA STERIA - GraphRAG : repousser les limitations du RAG via l’utilisation ...
 
ADEO - Knowledge Graph pour le e-commerce, entre challenges et opportunités ...
ADEO -  Knowledge Graph pour le e-commerce, entre challenges et opportunités ...ADEO -  Knowledge Graph pour le e-commerce, entre challenges et opportunités ...
ADEO - Knowledge Graph pour le e-commerce, entre challenges et opportunités ...
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
GraphAware - Transforming policing with graph-based intelligence analysis
GraphAware - Transforming policing with graph-based intelligence analysisGraphAware - Transforming policing with graph-based intelligence analysis
GraphAware - Transforming policing with graph-based intelligence analysis
 
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product UpdatesGraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
 

Recently uploaded

WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
Lidia A.
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc
 
20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf
Sally Laouacheria
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
Adam Dunkels
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
RaminGhanbari2
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
Tatiana Al-Chueyr
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
Matthew Sinclair
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
ishalveerrandhawa1
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
BookNet Canada
 
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdfINDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
jackson110191
 
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Erasmo Purificato
 
Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
Vijayananda Mohire
 
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
Stephanie Beckett
 
Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
BookNet Canada
 
The Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU CampusesThe Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU Campuses
Larry Smarr
 
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing SystemsMitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing Systems
ScyllaDB
 
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Bert Blevins
 
Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
Mark Billinghurst
 
Comparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdfComparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdf
Andrey Yasko
 
Implementations of Fused Deposition Modeling in real world
Implementations of Fused Deposition Modeling  in real worldImplementations of Fused Deposition Modeling  in real world
Implementations of Fused Deposition Modeling in real world
Emerging Tech
 

Recently uploaded (20)

WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
 
20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
 
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdfINDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
 
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
 
Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
 
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
 
Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
 
The Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU CampusesThe Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU Campuses
 
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing SystemsMitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing Systems
 
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
 
Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
 
Comparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdfComparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdf
 
Implementations of Fused Deposition Modeling in real world
Implementations of Fused Deposition Modeling  in real worldImplementations of Fused Deposition Modeling  in real world
Implementations of Fused Deposition Modeling in real world
 

Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoDB - Mahesh Chaudhari @ GraphConnect SF 2013

  • 1. Avoiding Deadlocks in Neo4j on Z-Platform - Mahesh Chaudhari, Cesar Arevalo & Brian Roy
  • 2. Outline • • • • • • Introduction to the Z-Platform Problems caused by Deadlocks Locks and Deadlocks in Neo4j Avoidance using Bipartite graphs Performance Conclusion 2
  • 3. Full Entity Profiles Z-Platform Sparse Representatio n of Profiles MongoDB Database Json Documents Nodes & Edges Neo4j Graph Database Z-Platform Source Datasets 3
  • 4. Deadlocks in Z-Platform • Creating relationships is one of the most time consuming processes • Log analysis reveals deadlocks among batch transactions and retry-mechanism takes time • Dependent on how nodes and relationships are grouped together • Batch size is dependent on the size of the JSON block sent to the server • Time required to build relationships and resolve deadlocks is in the order of seconds 4
  • 5. Locks in Neo4j • Create a Node n1  Write Lock on Node n1 • Update a Node n1  Write Lock on Node n1 but read available on Node n1 • Create a Relationship r1 between nodes n1 and n2  Write Locks on relationship r1, n1 and n2 5
  • 6. Deadlocks across processes A B P1 C D • Processes: P1 and P2 • Nodes: A, B, C, D • Relationships: R1, R2, R3, R4 P2 R1 A No Deadlocks B R4 A C B R2 No Deadlocks C R1 R1 A P1 P1 A B P1 R3 A R3 D P2 A D P2 Possibility of Deadlock D R2 C Deadlock 6 P2 D
  • 7. Deadlocks across Transactions • Transactions are also like separate processes but in a single thread or multiple threads • Deadlocks occur across transactions – Two concurrent transactions need write locks on the same node n1 – In two concurrent transactions, T1 has write lock on node n1 and waiting on write lock on node n2 whereas T2 has write lock on n2 and is waiting for write lock on node n1 – Transactions of varying sizes 7
  • 9. Sequential Asynchronous Transactions Deadlocks A E B . . n edges . F A T1 E B A . . n edges . F D T1 E C No Deadlocks D T2 A T1 D T2 A F . . n edges . B No Deadlock Deadlock 9 T2
  • 10. Deadlocks Detection and Avoidance • Deadlocks Detection – Only possible at run-time – Recovery from deadlock is either to abort or retry • Deadlocks Avoidance – Reorder the operations to lower or eliminate the likelihood of deadlocks • Graph Clustering Algorithms: Most of them require knowledge of entire graph Clustering Relationships  Bipartite Graphs 10
  • 11. Bipartite Graphs • Given a Graph G with Vertices V and Edges E, then graph G is a bipartite graph such that vertices V can be partitioned into two independent sets V1 and V2. V1 A V2 A E D C C D E B B 11
  • 12. Creating Bipartite Graphs • Use two colors to color each node such that no two adjacent nodes have the same color. 1 2 A V1 V2 A E D C C D E B B 12
  • 14. Algorithm to generate Graph V1 V2 A D C E B • Create all the nodes • Create batches of relationships among the same colored nodes • Create batches of relationships across the two colors 14
  • 15. Algorithm in Z-Platform • Batch of relationships R = {r1, r2, r3….. rn} : – each r is a triplet {src, dest, props} where src and dest are nodes and props is a set of key-value pairs • Color the nodes based on each relationship with two colors • Mark the conflicting edges where both the src and dest nodes are of the same color • Batch these relationships together in a single batch • Start grouping the remaining edges such that no two batches have any node in common 15
  • 16. Performance – Test Setup • • • • • JDK 1.7 Neo4j Java Binding Rest API Neo4j Enterprise Server 1.9 Batch size (configurable) : 2000 Test Program that generates random nodes (max 1000) and relationships (max 10,000) • Huge file that contains 10,226 nodes and 39,564,960 relationships (5 GB) 16
  • 17. Performance – Creating Nodes Time in seconds for Nodes 1.8 1.6 1.4 1.2 1 Time in Secs 0.8 0.6 0.4 0.2 0 1 2 3 4 5 6 • 10,226 Nodes: 5.07 seconds • Average Time for 2000 Nodes: 0.99 seconds ~ 1 second • Each Node has 11 properties 17
  • 18. Performance – Creating Relationships Time in Seconds for relationships 1.4 1.2 1 0.8 Time in Secs 0.6 0.4 0.2 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 • 1,74,000 Relationships created in 47.16 seconds • Average Time for 2000 relationships: 0.54 seconds • Number of relationships per second: 3,689 18
  • 19. Performance – Creating 39 Million Relationships • 39,564,960 Relationships in : 10,573.56 seconds (2 hrs 56 mins 13 seconds) • Average Time for 2000 relationships: 0.53 seconds • Number of relationships per second: 3,741 19
  • 20. Graph Visualized in the Neo4j 2.0 20
  • 21. Future Work • Test performance over the network using Amazon EC2 servers to mimic real world setup • Single threaded application  multi-threaded to see if better performance – More complex algorithm to batch relationships together – Analyze if the complexity is worth the performance improvement • Vary multiple factors: – Batch size : 1000 to 4000 – Properties (relationship descriptors) : 2 – 20 • Dispatcher Pattern to facilitate the single point distribution of nodes and relationships to threads/Transactions 21
  • 22. Conclusion • Deadlocks in general are time consuming and difficult to detect and prevent • Use of graph coloring to partition graph into conflicting and non-conflicting edges • Successful prototype tests shows significant improvement in building relationships varying from small number to a very large number 22
  • 23. Dr. Mahesh Chaudhari Sr. Software Engineer +1 602 524 0610 mahesh@zephyrhealthinc.com jobs@zephyrhealthinc.com 23
  • 24. Contact Information Sven Junkergård Brian Roy Director of Technology +1 415 503 7412 sven@zephyrhealthinc.com Director of Platform Engineering & Architect +1 415 663 6919 brian@zephyrhealthinc.com Zephyr Health Inc. 589 Howard St. 3rd Flr. San Francisco, California 94105 +1.415.529.7649 zephyrhealthinc.com 24

Editor's Notes

  1. 11 Properties on each node and 11 properties on each edge.