I'm researching a subject of balancing the load between two black-box systems (with some twists). I thought that I could record latest response time log from each of those systems and process such a log to dynamically make decisions on how to balance the load.
E.g. if we start with system A handling 100% of the current load with 100ms 90p latency from some recent time window and the load grows so the latency grows to 150ms we should make a decision to duplicate some of those requests to system B so it warms up and eventually spread the load.
Unfortunately I don't really know where to start in looking for suitable algorithms that I could modify to suit this problem. I'd rather avoid ML, but I'm sure there are some other solutions to this problem.
If you could point me to some algorithms or even keywords to further research it'd be great. Thanks!
Update: I've looked at Markov Decision Process but it seems that it's not suited to problems where effects of an action depend not only on the current state, but also on history. Unless there would be a way to model some recent history window as a single state...