
Consider a context where we have Users saving Questions and adding personal Tags to them.

In a graph-based paradigm, a first approach could be something like:

(User)-[SAVES {tags}]->(Question)

However, with this solution tags are treated as properties of the relationship, and I want to have Tags as nodes (just like Users and Questions), while keeping user-tag-question information.

The goal is to have something like a "question bookmarking service", where users can find questions by intersecting tags, find related tags based on common questions, re-use the same tag on multiple questions, find similar users based on tag use, etc.

How can this problem be approached in a graph-based database context?

  • Can you elaborate a bit more on what the problem is? Can you have something like (User)--[SAVES]-->(Question)<--[CLASSIFIES]--(Tag) or some variant of that, maybe with a relationship between user and tag as well? Or something like (User)--[SAVES]-->(TaggedQuestion), which then in turn connects to the tags and the question itself? Commented May 23, 2022 at 15:52
  • @FilipMilovanović Each user might use a different set of tags to 'save' a question. So there is no global set of tags associated with a question.
    – ssn
    Commented May 23, 2022 at 16:56
  • "graph-based paradigm" & "graph-based database context" don't mean anything in particular.
    – philipxy
    Commented Jun 11, 2022 at 13:04
  • @philipxy What would you suggest as an expression for this paradigm?
    – ssn
    Commented Jun 12, 2022 at 14:29
  • The point is "paradigm" or "context" or "-based" (and "graph DB") are too vague to answer about. Pick one specific complete product/model semantics & syntax. (What one are you working under to have this question arise?) Then ask 1 specific researched non-duplicate question re how/why you are 1st stuck following what published presentation of a design method for it. PS Suggest you also research re encoding/representing n-ary relation(ship)s/associations using binary relations. PS Please clarify via edits, not comments.
    – philipxy
    Commented Jun 12, 2022 at 18:40

1 Answer


Since you state that you are looking for a solution from the graph DB point of view, I will assume that you are talking about a Labelled Property Graph (LPG). (Investigating the Differences Between LPG, RDF and SPARQL)

That being said, I would not store the Tag information within the edge between the User and the Question. For reference, that is the schema you described in the question:

(User)-[SAVES {tags}]->(Question)

Solution

I would opt for a slightly more complicated schema, one that simplifies consuming the data on your client side by quite a large degree.

(User)-[SAVES]->(Question)
(Question)-[TAGGED{users}]->(Tags{tags})

With this schema, you are essentially moving the tag information off the specific edge connecting a given User to a Question, and instead creating a new relationship from the Question to a Tags object, storing the given User as an edge property. This new object would just be a simple list of tags. Since the relationship is now defined with the Question as its source, you can walk the graph whenever someone adds tags to a specific question and merge any Tags with a common User into a single edge.
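As a rough illustration of the schema above, here is a minimal in-memory sketch in Python. It is not tied to any graph database product: the dictionaries `saves` and `tagged`, the helpers `save_question` and `questions_for`, and the sample users and tags are all hypothetical names chosen for this example. It also assumes one TAGGED edge per (question, tag) pair, with the {users} property modelled as a set.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the graph; not a real graph database API.
saves = defaultdict(set)   # user -> questions, i.e. (User)-[SAVES]->(Question)
tagged = defaultdict(set)  # (question, tag) -> users, i.e. the {users} edge property


def save_question(user, question, tags):
    """Record that `user` saved `question` under the given personal tags."""
    saves[user].add(question)
    for tag in tags:
        # Assumption: one TAGGED edge per (question, tag); users are merged onto it.
        tagged[(question, tag)].add(user)


def questions_for(user, wanted_tags):
    """Return the questions that `user` tagged with every tag in `wanted_tags`."""
    return {
        question
        for question in saves[user]
        if all(user in tagged.get((question, tag), set()) for tag in wanted_tags)
    }


save_question("alice", "q1", ["graphs", "modeling"])
save_question("bob", "q1", ["graphs"])
save_question("alice", "q2", ["graphs"])

print(questions_for("alice", ["graphs", "modeling"]))
```

Because the users live on the edge, the same tag can be reused across questions and users while the user-tag-question triple stays recoverable.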

Why have a {tags} Property on the Tags Node?

Adding the {tags} list property is simply a way to optimize the design for what I imagine would be a hot path in this scenario, and is not a required part of the solution. The property would allow you to retrieve all of the Tag objects associated with a given collection of Users on the edge. If the {tags} property were omitted, you would have to walk all of the edges from the Question to Tag nodes, keeping track of which Users are on each edge, in order to find that information.
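The trade-off can be sketched in Python as follows. This is only one reading of the optimization: the precomputed `tag_index` stands in for the stored {tags} list, and every name here (`tagged`, `tags_by_walking`, `build_tag_index`, the sample data) is hypothetical.

```python
# Hypothetical data: each TAGGED edge is (question, tag) -> set of users.
tagged = {
    ("q1", "graphs"): {"alice", "bob"},
    ("q1", "modeling"): {"alice"},
    ("q1", "neo4j"): {"bob"},
}


def tags_by_walking(question, user):
    """Visit every TAGGED edge of `question` and keep the ones carrying `user`."""
    return sorted(
        tag
        for (q, tag), users in tagged.items()
        if q == question and user in users
    )


def build_tag_index(edges):
    """Precompute (question, user) -> tags, playing the role of a stored list."""
    index = {}
    for (question, tag), users in edges.items():
        for user in users:
            index.setdefault((question, user), []).append(tag)
    return {key: sorted(tags) for key, tags in index.items()}


tag_index = build_tag_index(tagged)

print(tags_by_walking("q1", "alice"))
print(tag_index[("q1", "alice")])
```

Both lookups return the same tags; the index trades extra storage, and the need to keep it in sync on every write, for a cheaper read.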

  • Thanks. I understand that the {users} property in the TAGGED relationship corresponds to the list of users that used a given tag in a given question. But why the property {tags} in Tags nodes?
    – ssn
    Commented May 23, 2022 at 17:34
  • 1
    That part is honestly just a personal preference, as having the property there would allow you to retrieve all of the tags associated with a given collection of Users on the edge. Without the {tags} collection you would have to walk all of the connections from the Question to Tag nodes and keep track of which Users are on the edge in order to find that information. But I guess what I'm trying to say is that it's more of an implementation optimization depending on what the intended use-case is for the database. Commented May 25, 2022 at 1:00
