13

Current situation

We are implementing (and now maintaining) an online shopping web application in a microservice architecture.

One of the requirements is that the business must be able to apply rules to what our customers add to their cart, in order to customize their experience and the eventual order. Quite obviously, a business rules engine had to be put in place, and we implemented a dedicated "microservice" for it (if we can still call it that).

Over the course of a year, this rules engine has become more and more complex, requiring more and more data (e.g. the content of the cart, but also user information, their role, their existing services, some billing information, etc.) to be able to compute those rules.

For the moment, our shopping-cart microservice gathers all this data from the other microservices. Even though part of this data is used by shopping-cart itself, most of it exists mainly to feed the rules engine.

New requirements

Other applications/microservices now need to reuse the rules engine for similar requirements. In the current situation, they would thus have to transmit the same kind of data, call the same microservices, and build (almost) the same resources to be able to call the rules engine.

Continuing as is, we will face several issues:

  • everyone (calling the rules engine) has to reimplement the fetching of the data, even if they don't need it for themselves;
  • the requests to the rules engine are complex;
  • continuing in this direction, we will have to transport this data all around the network for many requests (think of μs A calling μs B calling the rules engine, but A already has some of the data the rules engine needs);
  • shopping-cart has become huge due to all the data fetching;
  • I probably forget many…

What can we do to avoid these troubles?

Ideally, we would avoid adding more complexity to the rules engine. We must also make sure that it does not become a bottleneck – for example, some data is rather slow to fetch (10 s or even more), so we implemented pre-fetching in shopping-cart such that the data is more likely to be there before we call the rules engine, keeping an acceptable user experience.

Some ideas

  1. Let the rules engine fetch the data it needs. This would add even more complexity to it, violating the single responsibility principle (even more…);
  2. Implement a proxy μs before the rules engine to fetch the data;
  3. Implement a "data fetcher" μs that the rules engine calls to fetch all the data it needs at once (composite inquiry).
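To make idea 3 a bit more concrete, here is a minimal Python sketch of such a "data fetcher" doing a composite inquiry. Everything here is hypothetical (the fetcher names, the `FETCHERS` registry, `build_rules_context`); in a real system each fetcher would be an HTTP/RPC call to another microservice.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the ~10 downstream microservice calls.
def fetch_cart(user_id):
    return {"items": ["sku-1", "sku-2"]}

def fetch_user(user_id):
    return {"role": "customer"}

def fetch_billing(user_id):
    return {"balance": 42.0}

FETCHERS = {"cart": fetch_cart, "user": fetch_user, "billing": fetch_billing}

def build_rules_context(user_id, needed):
    # Composite inquiry: fetch only the sections the caller declares it
    # needs, in parallel, and assemble the structure the rules engine expects.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(FETCHERS[name], user_id) for name in needed}
        return {name: f.result() for name, f in futures.items()}
```

The point of the sketch is that callers only name the data sections they need (`["cart", "user"]`), and the fetching and assembly live in one place instead of being reimplemented by every consumer.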
  • Let me sum this up (with questions): you have several microservices implemented for a shop. One of them is a shopping-cart. Incorporated into the cart is a rules engine (either homebrew or some product), right? When a user adds an item to the cart, the rules-engine kicks in as part of the business logic and modifies the cart somehow (e.g. discount or bundle-products), right? Now another microservice also wants to use rules which may be based on similar input-data, right? And the input-data is provided by other microservices, right? Why is the fetching of the data so complex?
    – Andy
    Commented Sep 22, 2016 at 14:20
  • Your best bet to avoid those troubles is to get rid of the rules engine.
    Commented Sep 22, 2016 at 17:25
  • @Andy The rules engine is a separate microservice. Its API is a bit tailored for shopping-cart, but we could quite easily adapt it for the needs of the other microservices (they are still related to users, products & ordering). As we see it, they will need the same input data, especially as the business is able to choose the predicates to apply. All data is provided by other microservices except the cart content itself. Fetching the data is not complex per se but it becomes complex when you have to call ~10 other microservices and maintain the structure expected by the rules engine.
    – Didier L
    Commented Sep 22, 2016 at 20:31
  • @whatsisname I am not a big fan of having a rules engine in general either, but at the moment we have to deal with it, and the business changes its configuration on a day-to-day basis. Even if we got rid of it, we would still need some configurable component doing the same job, requiring the same input data. It would still be a rules engine, just under another name, and we would still face the same issues.
    – Didier L
    Commented Sep 22, 2016 at 20:38

5 Answers

9

Let's take a step back for a second and assess our starting place before writing out this likely-to-be-novel-length answer. You have:

  • a large monolith (the rules engine);
  • a large quantity of non-modularized data that gets sent around in bulk;
  • difficulty getting data to and from the rules engine;
  • a rules engine you cannot remove.

Ok, this is not that great for microservices. An immediately glaring problem is you guys seem to be misunderstanding what microservices are.

everyone (calling the rules engine) has to reimplement the fetching of the data, even if they don't need it for themselves;

You need to define some sort of API or communication method that your microservices use and have it be common. This might be a library all of them can import. It might be defining a message protocol. It might be using an existing tool (look for microservice message buses as a good starting place).
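As a sketch of what such a shared contract might look like (all names here are invented, not an existing library), a small versioned request type can be published as a library that every caller imports instead of hand-building ad-hoc JSON:

```python
from dataclasses import dataclass, field, asdict

# Hypothetical shared contract for calling the rules engine. Every consumer
# imports this one definition, so the wire format is defined in one place.
@dataclass
class RulesRequest:
    rule_set: str                       # which group of rules to evaluate
    context: dict = field(default_factory=dict)
    schema_version: str = "1.0"         # lets the engine reject unknown shapes

    def to_payload(self) -> dict:
        # Serialize to the dict/JSON structure sent over the wire.
        return asdict(self)
```

With a contract like this, "maintaining the structure expected by the rules engine" becomes a library upgrade rather than ten hand-edited call sites.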

The question of interservice communication is not a "solved" problem per se, but it's also not a "roll your own" problem at this point. A lot of existing tooling and strategies can make your life a ton easier.

Regardless of what you do, pick a single system and adapt your communication APIs to use it. Without a defined way for your services to interact, you will have all of the disadvantages of both microservices and monolithic services, and the advantages of neither.

Most of your issues stem from this.

the requests to the rules engine are complex;

Make them less complex.

Find ways to make them less complex. Seriously: common data models, splitting your single rules engine into smaller ones, or something else. Make your rules engine work better. Don't take the "jam everything into the query and just keep making it more complicated" approach -- seriously look at what you are doing and why.

Define some sort of protocol for your data. My guess is you guys have no defined API plan (as per the above) and have started writing REST calls ad hoc whenever needed. This gets increasingly complex as you now have to maintain every microservice every time something gets updated.

Better yet, you aren't exactly the first company to ever implement an online shopping tool. Go research other companies.

Now what...

After this, you at least triaged some of the biggest issues.

The next issue is this question of your rules engine. I hope that this is reasonably stateless, such that you can scale it. If it is, while suboptimal you at least aren't going to die in a blaze of glory or build insane workarounds.

You want your rules engine to be stateless. Make it such that it processes data only. If you find it as a bottleneck, make it so you can run several behind a proxy/load balancer. Not ideal, but still workable.
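A minimal sketch of what "stateless" means here (the rules themselves are invented examples): the engine is a pure function of the request, holding no sessions and no stored state, so any replica behind a load balancer can serve any call.

```python
# Hypothetical (predicate, action) rules; in reality these would come from
# the business-configurable rule definitions.
RULES = [
    (lambda ctx: ctx.get("total", 0) > 100,
     lambda ctx: {**ctx, "discount": 0.10}),
    (lambda ctx: ctx.get("role") == "vip",
     lambda ctx: {**ctx, "free_shipping": True}),
]

def apply_rules(context):
    # Pure function: the output depends only on the input context,
    # which is what makes horizontal scaling behind a proxy safe.
    for predicate, action in RULES:
        if predicate(context):
            context = action(context)
    return context
```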

Spend some time considering whether any of your microservices really should be put into your rules engine. If you are increasing your system overhead so significantly just to achieve a "microservices architecture" you need to spend more time planning this out.

Alternatively, can your rules engine be split into pieces? You may get gains just by making pieces of your rules engine into specific services.

We must also make sure that it does not become a bottleneck – for example some data is rather slow to fetch (10s or even more)

Assuming this problem exists after solving the above issues you need to seriously investigate why this is happening. You have a nightmare unfolding but instead of figuring out why (10 seconds? for sending shopping portal data around? Call me cynical, but this seems a bit absurd) you seem to be patching the symptoms rather than looking at the problem causing the symptoms in the first place.

You've used the phrase "data fetching" over and over. Is this data in a database? If not, consider doing this - if you are spending so much time "manually" fetching data it seems like using a real database would be a good idea.
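If a full database is overkill for some of that data, even a tiny read-through cache illustrates the idea. This `TTLCache` is a hypothetical sketch, not a substitute for Redis or a real database:

```python
import time

class TTLCache:
    # Minimal read-through cache for slow upstream data: callers always go
    # through get_or_fetch, and the slow call only happens on a miss.
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        entry = self._store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]          # still fresh: skip the slow call entirely
        value = fetch(key)           # the slow call (10 s+ in the question)
        self._store[key] = (time.monotonic() + self.ttl, value)
        return value
```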

You may be able to have a design with a database for the data you fetch (depending on what this is, you've mentioned it many times), a few rules engines, and your client(s).

One last note is you want to make sure you use proper versioning of your APIs and services. A minor release should not break backwards compatibility. If you find yourself releasing all your services at the same time for them to work, you don't have a microservice architecture, you have a distributed monolithic architecture.

And ultimately, microservices aren't a one-size-fits-all solution. Please, for the sake of all that is holy, don't just do it because it's the new hip thing.

3
  • Thanks for your answer, @enderland. Indeed, the microservice architecture is still relatively new to us, hence this question. This rules engine has evolved a bit organically to lead us here, so we now need driving directions to fix it. It is (fortunately) completely stateless, hence the quantity of data it takes as input. And this is what we would like to tackle first to make it a reusable component. But how do we reduce the quantity of input data without reducing the number of available predicates? I guess we need an API that is able to fetch the data on its own, but how do we architect it properly?
    – Didier L
    Commented Sep 23, 2016 at 14:39
  • Concerning performance issues, those come from microservices that are actually calling slow JMS and SOAP services implemented by back-ends. They have their own databases but performance is not really their first goal (as long as it handles the load). And there are too many of them to consider replicating their data and maintaining it (for some we do it, though). The best we can do is thus caching and pre-fetching.
    – Didier L
    Commented Sep 23, 2016 at 14:49
  • So when you mention "a few rules engines", I understand you mean specialized rules engines that only evaluate predicates with one type of input, right? Would you suggest they fetch the data they need, or should it be fetched upfront? We would also need some component to orchestrate the combination of predicates then, right? And pay attention to not adding too much network overhead due to this orchestration.
    – Didier L
    Commented Sep 23, 2016 at 14:58
1

With the amount of information presented about the rules engine and its inputs and outputs, I think your suggestion no. 2 is on the right track.

The current consumers of the rules engine could outsource the process of collecting the required information to a more special purpose component.

Example: You're currently using the rules engine to calculate discounts that need to be applied towards the contents of the shopping cart. Previous purchases, geography and current offers factor into it.

The new requirement is to use much of this same information to e-mail offers to previous customers based on upcoming specials and previous purchases. Previous purchases, current and upcoming offers factor into it.

I'd have two separate services for this. They would each rely on the rules engine service for some of their heavy lifting, and each would collect the data required for its own request to the rules engine.

The rules engine just applies the rules; the consumers don't need to worry about exactly what data the rules engine needs for a particular context; and the new intermediary services do just one thing: assemble the context, pass the request on to the rules engine, and return the response unmodified.
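A sketch of that shape, with stand-in functions (none of these names reflect the real services): both intermediaries assemble their own context and delegate to the same engine, so the engine never knows where its inputs came from.

```python
def rules_engine(context):
    # Stand-in for the real rules-engine call: it only sees a ready-made
    # context and never does any fetching itself.
    return {"inputs": sorted(context)}

def cart_discount_service(user_id):
    # Assembles exactly the context the discount rules need, nothing more.
    context = {"purchases": ["p1"], "geography": "BE", "offers": ["o1"]}
    return rules_engine(context)

def email_offer_service(user_id):
    # Same engine, different context: purchase history plus upcoming specials.
    context = {"purchases": ["p1"], "upcoming_offers": ["o2"]}
    return rules_engine(context)
```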

0

Aggregating the data required for the decision should be done outside the rules engine, because such engines are best designed as stateless services whenever possible. Data fetching necessarily involves asynchronous processing and holding state. It doesn't much matter whether the fetching is done by a proxy fronting the decision service, by the callers, or by a business process.

As a practical matter for implementation, I will mention that IBM Operational Decision Manager already supports running the product in Docker containers and is starting to document that use. I am sure that other products provide this support as well and that it will become mainstream.

0

In my simple thinking, I guess it would help to prefetch all the requisite data by making a set of async calls to the data-retrieval services as soon as the customer starts shopping, and to cache that data. Then, when you have to call the rules service, the data is already there, and it remains available to the other services for the rest of the session.
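A minimal sketch of that prefetching idea using Python's `asyncio` (the `fetch` coroutine and session cache are invented stand-ins for the real slow data-retrieval calls):

```python
import asyncio

CACHE = {}  # session-scoped cache: (service, user_id) -> data

async def fetch(service, user_id):
    # Placeholder for a slow network call (JMS/SOAP-backed in the question).
    await asyncio.sleep(0)
    return {"service": service, "user": user_id}

async def prefetch(user_id, services):
    # Fire all data-retrieval calls concurrently as soon as the session
    # starts, so the data is (probably) cached before the rules engine runs.
    results = await asyncio.gather(*(fetch(s, user_id) for s in services))
    for service, data in zip(services, results):
        CACHE[(service, user_id)] = data

asyncio.run(prefetch("u1", ["user", "billing"]))
```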

0

Quite an interesting question! I'm adding this answer just for the sake of completeness and readers' benefit as this problem has seemingly already been solved by the industry six years on.

I think you have headed for the right solution by extracting the business rules from the shopping-cart and implementing them as a microservice so they could be reused. The only issue was to keep thinking about the implementation as a legacy rule engine that takes all its data as parameters. The remainder of this answer is pretty standard microservice design patterns these days.

As a microservice, it needs to hold all the data it requires to work; this doesn't violate the single responsibility principle. But how do you fetch all that data while staying responsive under load and resilient to outages of the other services?

It might not have been clear back in 2016, but in 2022 the answer is clear: design it as a Reactive Microservice. Integrate the services via messaging and publish all data changes as events; this way your "rules engine" service:

  • doesn't need to fetch all the data in real-time
  • is able to keep working even when other services aren't available.
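A toy sketch of that event-driven shape (the event format and the rule are invented for illustration): the service applies change events from the bus to a local view and evaluates rules against that view, never fetching in real time.

```python
# Local materialized view of upstream data, kept up to date by events.
local_view = {}

def on_event(event):
    # Hypothetical event shape: {"type": "user.updated", "key": "u1",
    # "data": {...}}. The entity name is the part before the dot.
    entity = event["type"].split(".")[0]
    local_view[(entity, event["key"])] = event["data"]

def evaluate(user_id):
    # Rules read only from the local view, so evaluation keeps working
    # even while the producing services are down.
    user = local_view.get(("user", user_id), {})
    return {"discount": 0.10 if user.get("role") == "vip" else 0.0}
```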
