Say I have a feature in a web app where users can create a post and like it. In the frontend the user should see the number of likes a post has.

I could store the data two ways:

1. Option 1: a small table where the number of likes is derived by a query

(content: text, liked_by: string[])

2. Option 2: a larger table where the number of likes is explicitly tracked

(content: text, likes: int, liked_by: string[])

Using option 1, for each post that is shown to the user, we must retrieve an arbitrarily large array (liked_by) from the database and then count the items in it. Across a page of posts that is essentially an O(N * number_of_posts) operation, where N is the number of likes per post.

In option 2, we can read the number of likes a post has directly, so this is much faster. However, the stored count duplicates information already present in liked_by, which adds redundancy and potentially unnecessary complexity to the database.

Is there a preferred pattern or method for structuring a database for this use case?

1 Answer

It's not very clear, since your options aren't written as SQL, but it sounds like option #2 is the better path.

The way I've done this in the past is to maintain both a list of likes in their own table AND the total likes stored with the post. Basically, when a post is liked or un-liked, you adjust the table that stores the likes and then recalculate the current total for that post. It is some duplication and denormalization, sure, but the benefit of easier and faster queries is more than worth having a little duplication. If you'd prefer, you can update the like total with triggers on the likes table (one on insert and one on delete, so un-likes are covered too), so you don't have to worry about keeping the two in sync.
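As a rough illustration of the approach described above, here is a minimal sketch using Python's built-in sqlite3 module. The table names (posts, post_likes), the column names, the trigger names, and the helper functions are all hypothetical, chosen just for this example, and SQLite stands in for whatever database you actually use. Trigger syntax differs between databases, so treat this as a sketch of the idea rather than something to copy as-is.

```python
import sqlite3

SCHEMA = """
CREATE TABLE posts (
    id      INTEGER PRIMARY KEY,
    content TEXT    NOT NULL,
    likes   INTEGER NOT NULL DEFAULT 0    -- denormalized total, kept by triggers
);

CREATE TABLE post_likes (
    post_id INTEGER NOT NULL REFERENCES posts(id),
    user_id TEXT    NOT NULL,
    PRIMARY KEY (post_id, user_id)        -- at most one like per user per post
);

-- Recalculate the stored total whenever a like is added...
CREATE TRIGGER post_likes_after_insert AFTER INSERT ON post_likes
BEGIN
    UPDATE posts
       SET likes = (SELECT COUNT(*) FROM post_likes WHERE post_id = NEW.post_id)
     WHERE id = NEW.post_id;
END;

-- ...and whenever a like is removed.
CREATE TRIGGER post_likes_after_delete AFTER DELETE ON post_likes
BEGIN
    UPDATE posts
       SET likes = (SELECT COUNT(*) FROM post_likes WHERE post_id = OLD.post_id)
     WHERE id = OLD.post_id;
END;
"""


def make_db():
    """Create an in-memory database with the hypothetical schema above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def like(conn, post_id, user_id):
    # OR IGNORE makes a repeated like from the same user a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO post_likes (post_id, user_id) VALUES (?, ?)",
        (post_id, user_id),
    )


def unlike(conn, post_id, user_id):
    conn.execute(
        "DELETE FROM post_likes WHERE post_id = ? AND user_id = ?",
        (post_id, user_id),
    )


def like_count(conn, post_id):
    # Reads the stored total; no counting happens at read time.
    row = conn.execute("SELECT likes FROM posts WHERE id = ?", (post_id,)).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = make_db()
    conn.execute("INSERT INTO posts (id, content) VALUES (1, 'hello')")
    like(conn, 1, "alice")
    like(conn, 1, "bob")
    like(conn, 1, "alice")        # duplicate like, ignored
    print(like_count(conn, 1))    # prints 2
    unlike(conn, 1, "bob")
    print(like_count(conn, 1))    # prints 1
```

The triggers here recount the likes for the affected post rather than adding or subtracting 1. That costs a little more per write, but it means the stored total corrects itself if it ever drifts, and reads only ever touch the single likes column.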