
What is the canonical way to model many-to-many relations with CQL3? Let's say I have two tables:

CREATE TABLE actor (
    id text PRIMARY KEY,
    given text,
    surname text
);

CREATE TABLE fan (
    id text PRIMARY KEY,
    given text,
    surname text
);

and I'd like to model the fact that an actor can have many fans and each fan can like many actors.

The first idea that came to my mind was to use sets, as in the following (and the other way around for fans):

CREATE TABLE actor (
    id text PRIMARY KEY,
    given text,
    surname text,
    fans set<text>
);

<similarly for fan>

but it seems they are meant for small sets, and I don't see a way to check if a fan is related to an actor without loading either set completely.

The second choice I found would be to make two mapping tables, one for each relation direction:

CREATE TABLE actor_fan (
    actor text,
    fan text,
    PRIMARY KEY (actor, fan)
);

<similarly for fan_actor>

Would this give me the ability to both get the fan list of an actor and check whether a specific person is a fan of a given actor? There is a lot of documentation about Cassandra, but it often relates to older versions, and there seem to be a lot of differences between releases.


2 Answers


The proper way to do this in Cassandra is to denormalize the data into two tables. You shouldn't worry about having to write twice, once to each table, as Cassandra is designed to handle writes very fast precisely to support this kind of model.
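As a rough sketch of what the two denormalized tables and the double write could look like (the table names `fans_by_actor` and `actors_by_fan` and the ids are illustrative assumptions, not part of the question's schema):

```cql
-- Hypothetical table: one partition per actor, one row per fan of that actor.
CREATE TABLE fans_by_actor (
    actor text,
    fan text,
    PRIMARY KEY (actor, fan)
);

-- Hypothetical mirror table: one partition per fan, one row per actor they like.
CREATE TABLE actors_by_fan (
    fan text,
    actor text,
    PRIMARY KEY (fan, actor)
);

-- Record "fan-42 likes actor-1" in both directions.
-- A logged batch is one option for keeping the two tables in step;
-- it is not isolated, so readers may briefly see only one of the rows.
BEGIN BATCH
    INSERT INTO fans_by_actor (actor, fan) VALUES ('actor-1', 'fan-42');
    INSERT INTO actors_by_fan (fan, actor) VALUES ('fan-42', 'actor-1');
APPLY BATCH;
```

Whether to use a logged batch or two independent writes is a trade-off between consistency of the two tables and write latency, so treat the batch here as one possible choice rather than a requirement.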

Take a look at these data modelling tutorials, which will help in understanding these things:

Data modelling tutorials

Also, I see you mentioned sets as well. As a side note, and although it is not an answer to your question, you might want to be aware of some new features, such as those described here: http://www.datastax.com/dev/blog/cql-in-2-1

  • Thanks for the links. Different versions of Cassandra vary a lot in terms of features (Cassandra 1.2 and above look like a completely different product from earlier versions), so it is quite easy to find incomplete or obsolete documentation. Commented Oct 29, 2014 at 23:42

The way to achieve this is to denormalize the data by creating an actors_by_fans table and a fans_by_actors table. You can also use sets, but this has the limitations you already mentioned.
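To illustrate how such a pair of tables would answer the two lookups from the question, here is a minimal sketch. The key layout is an assumption (fans_by_actors partitioned by actor, actors_by_fans partitioned by fan), and the ids are made up:

```cql
-- Assumed layout: the "by" column is the partition key,
-- the other side of the relation is the clustering column.
CREATE TABLE fans_by_actors (
    actor text,
    fan text,
    PRIMARY KEY (actor, fan)
);

CREATE TABLE actors_by_fans (
    fan text,
    actor text,
    PRIMARY KEY (fan, actor)
);

-- All fans of one actor: reads a single partition.
SELECT fan FROM fans_by_actors WHERE actor = 'actor-1';

-- Is fan-42 a fan of actor-1? Both primary key columns are given,
-- so at most one row comes back and the full fan list is never loaded.
SELECT fan FROM fans_by_actors WHERE actor = 'actor-1' AND fan = 'fan-42';

-- All actors liked by one fan: served from the mirror table.
SELECT actor FROM actors_by_fans WHERE fan = 'fan-42';
```

The membership check returning a row or no row is what avoids the "load the whole set" problem that the set-based design runs into.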

HTH, Carlo
