I am working on a project whose codebase I designed from scratch. The project is divided into individual modules, each representing one complete part of a business process. For example, AnalyticsModule deals with the analytics side of the project, and PlatformModule deals with the other processing done on the platforms.

There are repositories in the project, each corresponding to an entity. You write the fetching/creating/updating/deleting logic in a repository, and the repository returns entity instances. You don't write any logic in entities, just the table column definitions and relationships.

Now there is a PlatformEntity and a corresponding PlatformRepository in PlatformModule. While working in the AnalyticsModule, I need some logic to count the number of platforms and report how many times each platform has been visited. This could be done in the PlatformRepository, since it would just fetch the platforms and count their visits, but I want to write this logic in AnalyticsModule because the method belongs to the analytics part of the project.

I added support for multiple repositories per entity from the very beginning, but I am wondering whether having multiple repositories is a good or a bad approach. Having two repositories for PlatformEntity (one in PlatformModule and the other in AnalyticsModule) does solve my problem, though.
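
To make the setup concrete, here is a minimal sketch in Python (chosen only for illustration; the project's actual language and ORM are not shown here). The `visit_count` column, the in-memory rows, and the method names are assumptions made for the example, not the real code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlatformEntity:
    """Column definitions only, no logic (visit_count is an assumed column)."""
    id: int
    name: str
    visit_count: int


# Stand-in for the platform table; a real project would query the database.
_PLATFORM_ROWS = [
    PlatformEntity(1, "web", 120),
    PlatformEntity(2, "mobile", 45),
]


class PlatformRepository:
    """Would live in PlatformModule: plain fetch/create/update/delete."""

    def find_by_id(self, platform_id: int) -> Optional[PlatformEntity]:
        return next((p for p in _PLATFORM_ROWS if p.id == platform_id), None)


class PlatformAnalyticsRepository:
    """Would live in AnalyticsModule: analytics reads over the same entity."""

    def count_platforms(self) -> int:
        return len(_PLATFORM_ROWS)

    def find_most_visited(self) -> PlatformEntity:
        return max(_PLATFORM_ROWS, key=lambda p: p.visit_count)
```

Both classes read the same table and hand back the same PlatformEntity type; only the module they sit in differs.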

  • Just to be clear, you are talking about having multiple instances of the same PlatformRepository class? Commented Apr 16 at 13:28
  • No! Platform has two repositories: PlatformRepository in PlatformModule and PlatformAnalyticsRepository in AnalyticsModule. Both repositories will return instances of the same PlatformEntity. Commented Apr 16 at 13:44
  • This depends on how strict the separation between modules is. Are "modules" just separate namespaces in the same application? Are they separate binaries? Commented Apr 16 at 19:27

1 Answer

Your design doesn't sound great to me.

The idea with a repository is to abstract the underlying database. With your "generic repository" design you have a repository per entity already, so more than one abstraction for the same database, segregated by table.

Now you want to add additional repositories to return reporting data, essentially a view, which would fit with the generic repo idea, but you want to put them in different modules.

I think the danger here is that if you change the database, multiple places in your code may be affected. It would make more sense, to me, to keep all the repositories for a database in a single module.

That way, when you change the db, you have one repository or collection of repositories to update and test.
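
As a rough sketch of that idea (Python with the standard-library `sqlite3` module, purely for illustration): `SqlitePlatformStore`, `PlatformAnalyticsQueries`, `platform_report`, and the `platform` table schema are all made-up names and assumptions, not anything from the question's codebase. The analytics code depends only on a small interface, while every SQL statement stays in one data-access class.

```python
import sqlite3
from typing import Protocol


class PlatformAnalyticsQueries(Protocol):
    """What the analytics code needs; it can be declared beside that code."""

    def count_platforms(self) -> int: ...


class SqlitePlatformStore:
    """Hypothetical single data-access class for the platform database.

    Every SQL statement touching the platform table lives here, so a
    schema or engine change is made and tested in one place.
    """

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def add_platform(self, name: str) -> int:
        cur = self._conn.execute(
            "INSERT INTO platform (name) VALUES (?)", (name,)
        )
        return cur.lastrowid

    def count_platforms(self) -> int:
        (count,) = self._conn.execute(
            "SELECT COUNT(*) FROM platform"
        ).fetchone()
        return count


def platform_report(queries: PlatformAnalyticsQueries) -> str:
    """Analytics-side code: depends on the interface, not on SQL."""
    return f"{queries.count_platforms()} platforms"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE platform (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
store = SqlitePlatformStore(conn)
store.add_platform("web")
store.add_platform("mobile")
print(platform_report(store))
```

The analytics logic still reads as analytics code, but swapping the database only means rewriting `SqlitePlatformStore`.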

This also shows a flaw in the Generic Repo pattern: tables aren't a great unit of segregation, because you will have queries that cross table boundaries and aggregate functions that don't return entities.
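
A small sketch of that mismatch, again in Python with `sqlite3` and under assumed names: the `visit` table, the `PlatformVisitCount` type, and `visits_per_platform` are hypothetical. The query joins two tables and aggregates, so its result has to be a separate read model because no single entity describes it.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlatformVisitCount:
    """A read model: the result of an aggregate, not a table row."""
    platform_name: str
    visits: int


def visits_per_platform(conn: sqlite3.Connection) -> List[PlatformVisitCount]:
    # Joins two tables and aggregates, so no single entity fits the result.
    rows = conn.execute(
        """
        SELECT p.name, COUNT(v.id)
        FROM platform AS p
        LEFT JOIN visit AS v ON v.platform_id = p.id
        GROUP BY p.id, p.name
        ORDER BY p.name
        """
    ).fetchall()
    return [PlatformVisitCount(name, visits) for name, visits in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE platform (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE visit (
        id INTEGER PRIMARY KEY,
        platform_id INTEGER NOT NULL REFERENCES platform(id)
    );
    INSERT INTO platform (id, name) VALUES (1, 'web'), (2, 'mobile');
    INSERT INTO visit (platform_id) VALUES (1), (1), (2);
    """
)
stats = visits_per_platform(conn)
```

Neither a "platform repository" nor a "visit repository" is an obvious home for this function, which is the segregation problem in miniature.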

Additionally, it can be bad to mix transactional and reporting functionalities on a database. Your count of Platforms could slow down CRUD operations if the table is big enough. Often you will have a database for transactional use, and a separate reporting database or "data lake" for reporting and analytics.

  • I would like to see performance metrics around counting platforms and CRUD operations before adding a reporting database or data lake. SQL databases are really, really, really good at that kind of stuff, so I would probably start out with SQL. But this is just a nitpick about an otherwise good answer. +1 Commented Apr 16 at 20:51
  • Yeah, say you have a million SocialMediaPosts and you are inserting and pulling small sets back, and then you have some report which is average posts per day per user. It's going to be slow, and it will slow the rest of your site down.
    – Ewan
    Commented Apr 16 at 22:14
  • @Ewan It depends. Smart usage of things like READ UNCOMMITTED/NOLOCK or READPAST can help a lot with reports on nasty tables. More than that, clever "table-ing" can solve a bunch of common problems like this. Something like "SocialMediaPosts" on my DB would be read-only with smart versioning, so I can have features like edit history or auditing for police/judicial matters. It also has the side effect of keeping the table mostly free for reading. With this design, a good reporting query can exclude the hot end of a table (which with proper fragmentation would be the only active area for writes)..
    – T. Sar
    Commented Apr 16 at 23:40
  • ..and that's all without going into other designs that are really helpful, like aggregate tables, materialized views, etc
    – T. Sar
    Commented Apr 16 at 23:50
  • I mean, I'm not knocking SQL. Obviously for a small DB or limited aggregates it's not an issue. But OP uses the term "analytics", and I think it's worth pointing out that these are a thing: en.wikipedia.org/wiki/Data_warehouse
    – Ewan
    Commented Apr 17 at 8:47
