5
$\begingroup$

I have the following problem: I have products with different attributes (price, weight, category) and I have a list of clients. Every client has an "affinity value" with every product, the more affinity a person has, the more likely they will like it. I have a list of constraints to follow like: I have to deliver between 5-7 products to a client, the price of the total products has to be less than some value, etc. And I want to maximize the "total affection value".

I guess this is a Mixed Integer Problem, I'm using the CBC solver from Google OR-Tools library, I set my variables, my constraints, and my objective function. The problem is, this works with small numbers, but if I have a large number of clients and products, it generates too many variables. If I have 20,000 clients and 1,000 different products, I have 20 million variables to solve. When I try to solve it, it takes too long and my computer kills the program for lack of memory.

I never dealt with optimization problems before, someone can give me some advice on how the best way to tackle it? I just need to know the path that I should follow, if it's possible to accomplish what I want as well.

EDIT: I'll try to be more detailed. I have a binary matrix, X, which each line represents a client and a column represents a product. If I'm going to deliver the product to a person, the entry in the matrix is 1, 0 otherwise. So for example, if I have 5 clients and 4 different products and want to deliver 2 products to each client, I have something like:

$X = \begin{bmatrix} 0 & 1 & 1 & 0\\ 1 & 0 & 0 & 1\\ 0 & 0 & 1 & 1\\ 0 & 1 & 1 & 0\\ 1 & 0 & 0 & 1 \end{bmatrix}$

I have another matrix that I called N, with the same size of X and each position of this matrix corresponds to the same entry in matrix X. N has the "affinity values" that each client has to each product. So, I could have something like:

$N = \begin{bmatrix} 5 & 4 & 1 & 10\\ 1 & 0 & 4 & 1\\ 9 & 6 & 2 & 9\\ 7 & 1 & 6 & 5\\ 1 & 8 & 0 & 1 \end{bmatrix}$

I want to multiply each entry of X with the correspondent entry in N and sum all this multiplications, for example:

What I want to do is maximize the value of this summation, N has fixed values so I have to find a matrix X that give me the largest value possible of this sum. This is my objective function, so:

But I have to follow some constraints:
I. Each client has to receive between 5 and 7 products
II. The products can't cost more than 22.00
III. We can't allocate more items of one product than we have in stock
IV.The weight of the box has to be between 600 and 1000
V. Every client has to receive equal or less than two product categories (we have 5)

And a lot of other constraints.. Basically what I want to do is allocate our products to our clients in a way that the global "affinity value" is the higher as possible. As I see, every relation between a client and a product is a variable, so I have millions of variables that have to obey a lot of constraints.. The CBC solver works well when I have few variables (like 12,500 variables, for example) but can't handle the millions (maybe almost a billion variables in the future) that I have.

$\endgroup$
9
  • 2
    $\begingroup$ Welcome to OR.SE! I recommend that you post your formulation here too because one may find an alternative way of formulating your problem and reduce the number of variables. $\endgroup$
    – EhsanK
    Commented Aug 10, 2020 at 15:16
  • 1
    $\begingroup$ How are the decisions for different clients linked? 20,000 clients is pretty massive. What prevents you from just solving for each client individually rather than all-at-once? $\endgroup$
    – AirSquid
    Commented Aug 10, 2020 at 18:49
  • $\begingroup$ Are the constraint limits (2-4 products per client, total price limit, ...) the same for all clients, or does each client have separate limits? $\endgroup$
    – prubin
    Commented Aug 10, 2020 at 19:37
  • $\begingroup$ @EhsanK Thanks! I really appreciate it. I tried to be more detailed in the edit I made $\endgroup$ Commented Aug 11, 2020 at 2:00
  • 2
    $\begingroup$ With this large dimensions of input you may want to switch to a sparse approach. With your approach you currently need one variable for each client/product combination. However, you can probably that certain products will never be shipped to certain clients. In other words, the variables for these combinations will always be 0. For any client/product combination you can rule out a priori, you don't have to create a variable. This will help to reduce the model size. One way could for example to ignore any client/product pair with affinity below a certain threshold. $\endgroup$ Commented Aug 13, 2020 at 9:57

1 Answer 1

4
$\begingroup$

I can think of a couple of ways to reduce the size of the problem, at the risk of producing suboptimal solutions. One, as Daniel Junglas suggested, it to set a nonzero threshold for affinity level, and not include variables $x_{ij}$ where the affinity of customer $i$ for product $j$ is below the threshold.

Another is to do a cluster analysis of the customers (based on their product affinities), then build product bundles based on the mean / central "customer" of each cluster. That would reduce the 20,000 rows of $X$ to one row per cluster.

You could even combine these, if the number of clusters were too high, by first clustering and then eliminating assignments of product to cluster when the (aggregate) affinity of that cluster for that product was too small.

Finally, there was AirSquid's question about handling each customer individually. I did not understand your reply, but I suspect there are some joint constraints (such as supply / inventory of different products) that span customers. If so, you could arbitrarily partition the customers into subsets, arbitrarily partition supplies into the same number of subsets, pair a customer subset with a supply subset and solve those problems separately. In this approach, rather than clustering I would do the opposite: try to partition the customer base into $K$ subsets that are each as similar as possible to the overall population of customers. Then divide the resources into $K$ more or less identical portions and solve $K$ smaller MIPs.

$\endgroup$

Not the answer you're looking for? Browse other questions tagged or ask your own question.