
I am working on converting a monolithic project into microservices. All of these services run in separate Docker containers. I am using an event-driven pattern, with RabbitMQ as the message broker and Celery as the async task queue.

Celery stores task information in Redis, which again runs in a separate container.
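For reference, the wiring described above can be sketched as a minimal Celery app. This is only a sketch: the app name `gen`, the hostnames `rabbitmq` and `redis`, the task name, and the default credentials are assumptions based on the container names listed below, not anything from my actual setup.

```python
# Minimal sketch of the broker/backend wiring described above.
# Hostnames match the (assumed) docker-compose service names.
from celery import Celery

app = Celery(
    "gen",
    broker="amqp://guest:guest@rabbitmq:5672//",  # RabbitMQ container
    backend="redis://redis:6379/0",               # Redis container as result backend
)

@app.task
def notify_user(user_id: int) -> str:
    # Hypothetical task; stands in for whatever the services actually do.
    return f"notified {user_id}"
```

With this layout, RabbitMQ carries the task messages while Redis only holds task state and results.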

So basically we have these services in different containers:

  1. access-management
  2. RabbitMQ
  3. Redis
  4. gen
  5. apply
  6. cm
  7. notification

I want to use the same Redis container as a data source for all my other services.

Here's my doubt: microservice principles say that each service manages its own data, which suggests a separate database per service. If I use the same database (the Redis container) for all the services, is that an acceptable design, or does it deviate from microservice principles?


1 Answer


The idea behind "separate databases" seems to be one that's often misunderstood, and I've written about it before here. Having separate databases does not mean separate database server hosts, or even separate processes. The idea is that the data stored by different applications is isolated. Coming from a background in relational databases (particularly MySQL), this would mean that each application has its own schema or database (see this particular Stack Overflow question and its answer for the MySQL terms).

Although my experience with Redis is more limited, my understanding is that this kind of data isolation doesn't really exist there. Redis does have numbered logical databases (switched with the SELECT command), but they share one keyspace configuration, offer no per-database authentication, and Redis Cluster only supports database 0. In practice, if you can connect to the instance, you can read any of the key/value pairs stored in Redis. This leads to two options: each service having its own Redis instance or Redis cluster, or giving up firm data isolation between services.

Having a single Redis instance and using prefixes on the keys seems to be widely supported, and it's the approach I've taken in the past. Although this approach allows one microservice to read and write data that belongs to another service, it leads to a less complex infrastructure, and you can handle the key prefixes at the application level, which helps developers avoid collisions between key names from different services and makes it harder to accidentally cross service boundaries.
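One way to enforce the prefix convention at the application level is a thin wrapper around the client. A minimal sketch (the `PrefixedStore` name is mine, and a plain dict stands in for the Redis client so the idea is runnable without a server; in production you'd pass a `redis.Redis` instance and use its `get`/`set` methods):

```python
class PrefixedStore:
    """Namespace every key with a service prefix, e.g. 'notification:'."""

    def __init__(self, client, service: str):
        self._client = client          # dict here; redis.Redis in production
        self._prefix = f"{service}:"

    def set(self, key: str, value: str) -> None:
        self._client[self._prefix + key] = value

    def get(self, key: str):
        return self._client.get(self._prefix + key)


# Two services sharing one "instance" without colliding on key names:
shared = {}
notifications = PrefixedStore(shared, "notification")
access = PrefixedStore(shared, "access-management")

notifications.set("user:42", "sent")
access.set("user:42", "admin")
# Both keys now coexist in the shared store:
# "notification:user:42" and "access-management:user:42"
```

Nothing stops a service from reading another's prefix, which is why this is a convention rather than real isolation, but it keeps key names from clashing.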


The other aspect to microservices is scalability. Although there may be a few reasons why one would choose microservices, the ability to develop, deploy, and scale pieces of a system independently is a common rationale. As such, you would want to scale the databases as well.

Although having a data store for every service may allow you to scale, I haven't been convinced that the infrastructure overhead this incurs is necessary for scalability or high availability of the data store. The fact that cloud providers (like AWS or Google Cloud Platform) offer managed services for various databases and data stores, with provisions for both scalability and high availability, makes it even more straightforward.

When you are working at the data layer, options such as read replicas, sharding the data across a cluster, cross-region replication, and appropriate use of caching are all good solutions for improving the performance of your data stores. I would rather implement these solutions for a single data store and scale up its hardware than need to implement them for multiple data stores.
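As a concrete illustration of the sharding option, keys can be routed to nodes deterministically by hashing. This is a toy sketch with hypothetical node addresses; real Redis Cluster does this for you, mapping CRC16(key) onto 16384 hash slots that are distributed across the nodes.

```python
import zlib

# Hypothetical shard hosts; Redis Cluster would manage this list itself.
NODES = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]

def node_for(key: str) -> str:
    """Route a key to a shard deterministically.

    Toy version: hash the key and map it straight onto the node list.
    Redis Cluster instead uses CRC16(key) mod 16384 hash slots.
    """
    return NODES[zlib.crc32(key.encode()) % len(NODES)]

# The same key always lands on the same node, so reads find
# what writes stored:
assert node_for("notification:user:42") == node_for("notification:user:42")
```

The point is that all of this routing logic lives at the data layer for one store; duplicating it per service multiplies the operational work.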

  • My understanding of the purpose of having separate databases on microservices is that it facilitates horizontal scaling, which would seem to preclude having two databases from two different microservices on the same database server. Commented Jan 11, 2020 at 20:10
  • Of course, that doesn't mean that you have to isolate databases on microservices, or even that you should. But that is the prevailing wisdom. Commented Jan 11, 2020 at 20:12
  • Thanks for the write-up. What's bothering me about not having isolated databases: if the database server goes down, then essentially all the services go down.
    – Pro
    Commented Jan 11, 2020 at 20:17
  • 1
    @RobertHarvey It's far less of a concern with services such as AWS RDS and Google Cloud databases with auto-scaling resources. Also, a well-designed database and queries, appropriate caching layers, the use of a reporting database, and read-only replicas mitigate a lot of the performance issues, to the point where scaling out to multiple databases is prohibitively expensive and infrastructure-intensive for most use cases.
    – Thomas Owens
    Commented Jan 11, 2020 at 20:19
  • @Pro The same problem exists in a monolithic application, and the same solutions apply. You can have read replicas (including geographically distributed replicas) that mirror the primary, so your database can fail over. Redis supports similar architectures for high availability. If you had multiple instances of Redis, you'd need to establish high availability for each of them instead of just one instance or one cluster, which also leads to increased infrastructure complexity and cost.
    – Thomas Owens
    Commented Jan 11, 2020 at 20:21
