
I am developing a diagramming application and want to optimize operations with the Nodes and Relations of the diagram. Currently, I am using a relational database with tables for diagrams and nodes. Each node entity has a parentId field, which forms the relationships. This architecture seems far from ideal to me, so I am curious whether I would gain a performance boost in operations with the diagram elements by using a graph database. I am referring to operations such as retrieving all nodes for a specific diagram, modifying nodes, relationships, etc.

At first glance, it seems that the inherent nature of graph databases should naturally fit this task. But most examples in the documentation cover more typical scenarios such as building recommendation systems and other similar applications. And maybe I'm missing something.

  • 1
    Regardless of whether a graph db would be a better fit for your use case, you should never* have a multi-value column like the one you described. Relations should have their own table with (at least) a "parent" and a "child" column (or whatever fits your relation) referring to the primary key of the respective table. (* As usual, exceptions to this rule exist, albeit very rarely.)
    – germi
    Commented Jun 5 at 10:00
  • 2
    What kinds of queries are you expecting? Just because your diagrams visually look like graphs doesn't mean your actual queries will need strong support for graph operations. If edges in this graph have strong types and simple cardinality rules, then modelling them as relations and traversing them via joins might end up being simpler. I'd also question whether you need node-level DB operations. Most diagramming tools treat diagrams as a document which would be stored in one blob/file, and wouldn't need node- or edge-level queries.
    – amon
    Commented Jun 5 at 12:13
  • How large are the graphs we are talking about? For small graphs it seems simpler to just keep everything in memory and save/load to a simple JSON or XML file. Databases make more sense when you need to efficiently query large datasets.
    – JonasH
    Commented Jun 5 at 12:38
  • @germi You're absolutely right, thank you. It's my fault: I forgot that I build the children field after retrieving all the nodes belonging to one diagram. I do this for subsequent rendering on the frontend side. The nodes are stored in a table with a parentId field. I will update the question accordingly. Commented Jun 5 at 14:38
  • To follow up on @amon's comment: a diagramming app also makes me think of data at the opposite granularity from a graph db. My guess is you'll often want to query more or less the entire diagram, as well as make changes to multiple elements at once. In that case the whole diagram is basically one large aggregate that might be more efficient to store as a JSON document, with any finer-grained querying done on the client side instead of in the DB. It all depends on what your requirements are.
    – 5ar
    Commented Jun 5 at 14:49
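
The document-style storage suggested in the comments above could look roughly like the following. This is only a sketch under assumptions: the nested `children` layout and the names `save_diagram_doc`, `load_diagram_doc`, and `find_node` are hypothetical and not taken from the question.

```python
import json

def save_diagram_doc(diagram: dict) -> str:
    """Serialize the whole diagram as one JSON document (one blob per diagram)."""
    return json.dumps(diagram)

def load_diagram_doc(text: str) -> dict:
    """Load the whole diagram back into memory in a single step."""
    return json.loads(text)

def find_node(node: dict, node_id: int):
    """Depth-first search for a node by id inside the in-memory tree."""
    if node["id"] == node_id:
        return node
    for child in node.get("children", []):
        found = find_node(child, node_id)
        if found is not None:
            return found
    return None

# Hypothetical diagram: a root node with nested children.
diagram = {
    "id": 1,
    "label": "root",
    "children": [
        {"id": 2, "label": "a", "children": [{"id": 4, "label": "a1", "children": []}]},
        {"id": 3, "label": "b", "children": []},
    ],
}

# Node-level edits happen in memory; the document is then saved as a whole.
restored = load_diagram_doc(save_diagram_doc(diagram))
find_node(restored, 4)["label"] = "renamed"
blob = save_diagram_doc(restored)
```

With this layout the storage layer only ever reads or writes a complete diagram, which matches the "query more or less the entire diagram" access pattern described in the comments.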

1 Answer


If your diagrams somehow represent graphs (nodes and edges), a graph database is indeed a natural match for your data. Graph databases also offer flexibility when you need to associate more properties with your nodes.

However, nodes and edges are also well modeled in relational databases, and RDBMSes are extremely well optimized for using relationships, even if the schema appears less natural.
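
As a rough illustration of the relational modeling, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`diagrams`, `nodes`, `diagram_id`, `parent_id`, `label`) and the helper functions are assumptions made for the example; the question only states that nodes are stored in a table with a parentId field.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE diagrams (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE nodes (
        id         INTEGER PRIMARY KEY,
        diagram_id INTEGER NOT NULL REFERENCES diagrams(id),
        parent_id  INTEGER REFERENCES nodes(id),
        label      TEXT NOT NULL
    );
    CREATE INDEX idx_nodes_diagram ON nodes(diagram_id);
    CREATE INDEX idx_nodes_parent ON nodes(parent_id);
""")
conn.execute("INSERT INTO diagrams (id, title) VALUES (1, 'Example')")
conn.executemany(
    "INSERT INTO nodes (id, diagram_id, parent_id, label) VALUES (?, ?, ?, ?)",
    [(1, 1, None, "root"), (2, 1, 1, "a"), (3, 1, 1, "b"), (4, 1, 2, "a1")],
)

def load_diagram(conn, diagram_id):
    """Fetch all nodes of one diagram in a single query, then group children in memory."""
    rows = conn.execute(
        "SELECT id, parent_id, label FROM nodes WHERE diagram_id = ? ORDER BY id",
        (diagram_id,),
    ).fetchall()
    labels = {node_id: label for node_id, _, label in rows}
    children = {node_id: [] for node_id in labels}
    roots = []
    for node_id, parent_id, _ in rows:
        if parent_id is None:
            roots.append(node_id)
        else:
            children[parent_id].append(node_id)
    return roots, children, labels

def subtree_ids(conn, root_id):
    """Collect a node and all of its descendants with a recursive CTE."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(id) AS (
            SELECT id FROM nodes WHERE id = ?
            UNION ALL
            SELECT n.id FROM nodes n JOIN sub s ON n.parent_id = s.id
        )
        SELECT id FROM sub ORDER BY id
        """,
        (root_id,),
    ).fetchall()
    return [row[0] for row in rows]

roots, children, labels = load_diagram(conn, 1)
descendants_of_a = subtree_ids(conn, 2)
```

Retrieving a whole diagram is a single indexed lookup on `diagram_id`, and a subtree traversal stays inside the database through the recursive query, so neither operation needs graph-specific support.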

For big-data graphs with millions of nodes, a performance comparison would be interesting. You would have to figure out whether the overhead of accessing attributes in a graph model (example here) is outweighed by faster navigation between the nodes. Only a thorough benchmark will tell; any other statement would be mere opinion.

However, for diagrams meant to be read by humans, i.e. at most a few hundred nodes, there will be no significant difference between the two technologies. Any difference will depend far more on the individual vendors' default optimisations and API overhead, and even that is expected to be marginal. Avoid premature optimization, especially when it is not needed.
