Note: this is an abstract rewording of a real-life problem regarding ordering records in a SWF file. A solution will help me improve an open-source application.

Bob has a store and wants to hold a sale. His store carries a number of products, and he has a certain integer quantity of units of each product in stock. He also has a number of shelf-mounted price labels (as many as there are products), with the prices already printed on them. He can place any price label on any product (the label gives the unit price for his entire stock of that product); however, some products have an additional restriction: any such product may not be cheaper than a certain other product.

You must find how to arrange the price labels, such that the total cost of all of Bob's wares is as low as possible. The total cost is the sum of each product's assigned price label multiplied by the quantity of that product in stock.


Given:

  • $N$ – the number of products and price labels
  • $S_i$, $0 \le i < N$ – the quantity in stock of the product with index $i$ (integer)
  • $P_j$, $0 \le j < N$ – the price on the price label with index $j$ (integer)
  • $K$ – the number of additional constraint pairs
  • $A_k$, $B_k$, $0 \le k < K$ – product indices for the additional constraint
    • Any product index may appear at most once in $B$. Thus, the graph formed by this adjacency list is actually a set of directed trees.

The program must find:

  • $M_i$, $0 \le i < N$ – mapping from product index to price label index ($P_{M_i}$ is the price of product $i$)

To satisfy the conditions:

  1. $P_{M_{A_k}} \le P_{M_{B_k}}$, for $0 \le k < K$
  2. $\sum_{0 \le i < N} S_i \times P_{M_i}$ is minimal

Note that if not for the first condition, the solution would be simply sorting labels by price and products by quantity, and matching both directly.

Typical values for input will be N,K<10000. In the real-life problem, there are only several distinct price tags (1,2,3,4).


Here's one example of why most simple solutions (including topological sort) won't work:

You have 10 items with the quantities 1 through 10, and 10 price labels with the prices $1 through $10. There is one condition: the item with the quantity 10 must not be cheaper than the item with the quantity 1.

The optimal solution is:

Price, $   1  2  3  4  5  6  7  8  9 10
Qty        9  8  7  6  1 10  5  4  3  2

with a total cost of $249. If you place the 1,10 pair near either extreme, the total cost will be higher.
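For checking candidate algorithms on tiny inputs, a brute-force reference can be handy. The following is only a sketch under the definitions above: the function names (`total_cost`, `satisfies`, `brute_force`) are made up for illustration, constraints are given as `(A_k, B_k)` index pairs, and the search tries every permutation, so it is usable only for very small N.

```python
from itertools import permutations


def total_cost(S, P, M):
    """Sum of quantity times assigned price; M[i] is the label index of product i."""
    return sum(S[i] * P[M[i]] for i in range(len(S)))


def satisfies(P, M, constraints):
    """True if product a is never more expensive than product b for every pair (a, b)."""
    return all(P[M[a]] <= P[M[b]] for a, b in constraints)


def brute_force(S, P, constraints):
    """Try every assignment; return (cost, mapping) of a cheapest valid one, or None."""
    best = None
    for M in permutations(range(len(S))):
        if satisfies(P, M, constraints):
            cost = total_cost(S, P, M)
            if best is None or cost < best[0]:
                best = (cost, list(M))
    return best


# The arrangement from the example above: product i has quantity i + 1,
# and product 9 (quantity 10) must not be cheaper than product 0 (quantity 1).
S = list(range(1, 11))
P = list(range(1, 11))
M = [4, 9, 8, 7, 6, 3, 2, 1, 0, 5]
print(total_cost(S, P, M), satisfies(P, M, [(0, 9)]))  # prints: 249 True
```

Running `brute_force` on the full 10-item example means walking through 10! permutations, which is slow in plain Python; a scaled-down instance is a more practical sanity check.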

32
  • 5
    What happens if you topologically sort the items, with quantity as a "tie-breaker", and then match them with prices? Does that give an optimal/good-enough solution? Commented Feb 4, 2011 at 13:40
  • 2
    @CyberShadow: maybe "shelf-mounted price label" rather than "price tag". I'm sure retailers have a term for those tags that sit under a clear plastic cover on the shelf, but I don't know what it is. Commented Feb 4, 2011 at 14:47
  • 31
    Bob should start an "everything is $1" shop.
    – Argote
    Commented Feb 4, 2011 at 17:16
  • 5
    Ah, ok, I've found a case where the greedy approach based on topological sort breaks. Stock: Ax1, Bx10, Cx2. Constraints: A<B. Prices: $1, $2, $3. Optimal solution: $27. But the initial open set is {A,C}, so C:=$1, A:=$2, B:=$3 for $34. Commented Feb 4, 2011 at 19:12
  • 4
    @BlueRaja: To escalate the nitpicking, the problem as stated isn't in NP. Asking if there is a configuration with total price less than X is in NP, and that leads to a polynomially equivalent solution to the original problem. It might be NP-hard. A few minutes of casual thought has failed to produce a known NP-complete problem that reduces to this. Commented Feb 4, 2011 at 21:26

8 Answers

16
+100

The problem is NP-complete in the general case. This can be shown via a reduction from 3-partition (a variant of bin packing that remains strongly NP-complete).

Let $w_1, \dots, w_n$ be the weights of the objects of the 3-partition instance, let $b$ be the bin size, and let $k = n/3$ be the number of bins that are allowed to be filled. Hence, there is a 3-partition if the objects can be partitioned such that there are exactly 3 objects per bin.

For the reduction, we set $N = kb$, and each bin is represented by $b$ price labels of the same price (think of $P_i$ increasing every $b$-th label). Let $t_i$, $1 \le i \le k$, be the price of the labels corresponding to the $i$-th bin. For each $w_i$ we have one product $S_j$ of quantity $w_i + 1$ (let's call this the root product of $w_i$) and another $w_i - 1$ products of quantity 1 which are required to be no more expensive than $S_j$ (call these the leaf products).

For $t_i = (2b+1)^i$, $1 \le i \le k$, there is a 3-partition if and only if Bob can sell for $2b \sum_{1 \le i \le k} t_i$:

  • If there is a solution for 3-partition, then all the $b$ products corresponding to objects $w_i$, $w_j$, $w_l$ that are assigned to the same bin can be labeled with the same price without violating the restrictions. Thus, the solution has cost $2b \sum_{1 \le i \le k} t_i$ (since the total quantity of products with price $t_i$ is $2b$).
  • Consider an optimal solution of Bob's Sale. First observe that in any solution where more than 3 root products share the same price label, for each such root product that is "too much" there is a cheaper price tag which sticks on fewer than 3 root products. This is worse than any solution where there are exactly 3 root products per price label (if one exists).
    Now there can still be a solution of Bob's Sale with 3 root products per price, but whose leaf products do not wear the same price labels (the bins sort of flow over). Say the most expensive price label tags a root product of $w_i$ which has a cheaper-tagged leaf product. This implies that the 3 root products $w_i$, $w_j$, $w_l$ tagged with the most expensive price do not add up to $b$. Hence, the total quantity of products tagged with this price is at least $2b+1$.
    Hence, such a solution has cost $t_k(2b+1)$ plus some other assignment cost. Since the optimal cost for an existing 3-partition is $2b \sum_{1 \le i \le k} t_i$, we have to show that the case just considered is worse. This is the case if $t_k > 2b \sum_{1 \le i \le k-1} t_i$ (note that it's $k-1$ in the sum now). Setting $t_i = (2b+1)^i$, $1 \le i \le k$, this is the case. This also holds if the "bad" price tag is not the most expensive one, but any other.
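The inequality used in the last step can be checked with a geometric sum. The short derivation below is a worked check added for clarity; it uses only the choice of prices from the reduction, with the exponent read as a power:

```latex
2b \sum_{i=1}^{k-1} t_i
  = 2b \sum_{i=1}^{k-1} (2b+1)^i
  = 2b \cdot \frac{(2b+1)^k - (2b+1)}{(2b+1) - 1}
  = (2b+1)^k - (2b+1)
  < (2b+1)^k = t_k .
```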

So, this is the destructive part ;-) However, if the number of different price tags is a constant, you can use dynamic programming to solve it in polynomial time.

0
9

This problem resembles many scheduling problems considered in the CS literature. Allow me to restate it as one.

Problem ("nonpreemptive single-machine scheduling with precedence, weights, and general lateness penalties")

Input:

  • jobs 1, …, n

  • a "treelike" precedence relation prec on the jobs (Hasse diagram is a forest)

  • weights $w_1, \dots, w_n$

  • a nondecreasing lateness penalty function $L(t)$ from $\{1, \dots, n\}$ to $\mathbb{Z}^+$

Output:

  • a permutation $\pi$ of $\{1, \dots, n\}$ minimizing $\sum_j w_j L(\pi(j))$ subject to the constraints that for all $i$ prec $j$ we have $\pi(i) < \pi(j)$.

Correspondence: job <=> product; i prec j <=> i must not be priced higher than j; weight <=> quantity; L(t) <=> t-th lowest price

When L is linear, there is an efficient polynomial-time algorithm due to Horn [1]. The article is behind a paywall, but the main idea is:

  1. For all j, find the connected set of jobs containing only j and its successors whose mean weight is maximum. For example, if n = 6 and the precedence constraints are 1 prec 2 and 2 prec 3 and 2 prec 4 and 4 prec 5, then the sets under consideration for 2 are {2}, {2, 3}, {2, 4}, {2, 3, 4}, {2, 4, 5}, {2, 3, 4, 5}. We actually only need the maximum mean weight, which can be computed bottom up by dynamic programming.

  2. Schedule jobs greedily in decreasing order of the mean weight of their associated sets, never placing a job before its predecessor.

In CyberShadow's example, we have $n = 10$, 1 prec 10, $w_j = j$, and $L(t) = t$. The values computed in Step 1 are

  • job 1: 5.5 (mean of 1 and 10)

  • job 2: 2

  • job 3: 3

  • job 4: 4

  • job 5: 5

  • job 6: 6

  • job 7: 7

  • job 8: 8

  • job 9: 9

  • job 10: 10

The optimal order is 9, 8, 7, 6, 1, 10, 5, 4, 3, 2.
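The two steps can be sketched in Python as follows. This is a rough illustration of the description above, not a transcription of Horn's paper. The function names and the `parent` array representation are assumptions, and the maximum mean weight is found with a simple iterative scheme (keep a subtree only while it raises the mean) chosen here for brevity. Jobs are 0-based, so job 1 of the example is index 0.

```python
from fractions import Fraction
import heapq


def max_mean_weight(j, weights, children):
    """Largest mean weight of a connected set made of job j and some of its successors."""
    def best_set(node, lam):
        # Total weight and size of the set rooted at `node` maximizing weight - lam * size.
        total, size = weights[node], 1
        for child in children[node]:
            w, s = best_set(child, lam)
            if w - lam * s > 0:  # keep a child's set only if it lifts the mean above lam
                total, size = total + w, size + s
        return total, size

    lam = Fraction(weights[j])  # mean of the trivial set {j}
    while True:
        total, size = best_set(j, lam)
        better = Fraction(total, size)
        if better == lam:  # no set improves on lam any more
            return lam
        lam = better


def horn_style_order(weights, parent):
    """Repeatedly schedule the available job with the largest Step 1 value."""
    n = len(weights)
    children = [[] for _ in range(n)]
    for job, p in enumerate(parent):
        if p is not None:
            children[p].append(job)
    value = [max_mean_weight(j, weights, children) for j in range(n)]
    # A job is available once its predecessor (if any) has been scheduled.
    heap = [(-value[j], j) for j in range(n) if parent[j] is None]
    heapq.heapify(heap)
    order = []
    while heap:
        _, job = heapq.heappop(heap)
        order.append(job)
        for child in children[job]:
            heapq.heappush(heap, (-value[child], child))
    return order


weights = list(range(1, 11))      # w_j = j
parent = [None] * 10
parent[9] = 0                     # 1 prec 10, in 0-based indices
order = horn_style_order(weights, parent)
print([job + 1 for job in order])  # prints: [9, 8, 7, 6, 1, 10, 5, 4, 3, 2]
```

The recursion recomputes subtree sums for every job, which is fine for small examples; very deep trees would also need an iterative traversal to stay within Python's recursion limit.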


This algorithm might work well in practice even for a different choice of L, as the proof of optimality uses local improvement. Alternatively, perhaps someone on the CS Theory Stack Exchange will have an idea.

[1] W. A. Horn. Single-Machine Job Sequencing with Treelike Precedence Ordering and Linear Delay Penalties. SIAM Journal on Applied Mathematics, Vol. 23, No. 2 (Sep., 1972), pp. 189–202.

6
  • 1
    I'll admit that this is somewhat over my head, but nevertheless I'm not quite sure how to translate quantities and prices to weights and the lateness penalty function. Commented Feb 12, 2011 at 16:27
  • I still can't say I understand this completely (I expect to have to dive into the respective literature first), but are you stating that this algorithm should stand up to the issues brought up in the comments and posts below? For one, do the terms "greedy" and "local optimality" imply that it would stop trying in a direction if the next position is not an immediate improvement? I've posted (and just updated) a simple test case where this won't work (just for that part, but I imagine it can be expanded in either direction). Commented Feb 12, 2011 at 23:54
  • Another thing is that I don't see any explicit checks with regard to the sum of products between price and quantity. I have to assume that lateness and weight are multiplied to get the overall fitness in step 2? Commented Feb 13, 2011 at 0:29
  • There's no theoretical guarantee when L is not linear. The algorithm is greedy, but it's not a local search.
    – user614296
    Commented Feb 13, 2011 at 0:42
  • There's no multiply because L was assumed linear.
    – user614296
    Commented Feb 13, 2011 at 0:46
4

Since I thought the problem was fun, I did a model for finding solutions using constraint programming. The model is written in a modelling language called MiniZinc.

include "globals.mzn";

%%% Data declaration
% Number of products
int: n;
% Quantity of stock
array[1..n] of int: stock;
% Number of distinct price labels
int: m;
% Labels
array[1..m] of int: labels;
constraint assert(forall(i,j in 1..m where i < j) (labels[i] < labels[j]),
              "All labels must be distinct and ordered");
% Quantity of each label
array[1..m] of int: num_labels;
% Number of precedence constraints
int: k;
% Precedence constraints
array[1..k, 1..2] of 1..n: precedences;

%%% Variables
% Price given to product i
array[1..n] of var min(labels)..max(labels): prices :: is_output;
% Objective to minimize
var int: objective :: is_output;

%%% Constraints
% Each label is used once
constraint global_cardinality_low_up_closed(prices, labels, num_labels, num_labels);

% Prices respect precedences
constraint forall(i in 1..k) (
            prices[precedences[i, 1]] <= prices[precedences[i, 2]]
       );

% Calculate the objective
constraint objective = sum(i in 1..n) (prices[i]*stock[i]);

%%% Find the minimal solution
solve minimize objective;

Data for a problem is given in a separate file.

%%% Data definitions
n = 10;
stock = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
m = 10;
labels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
num_labels = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
k = 1;
precedences = [| 1, 10 |];

The model is fairly naive and straightforward, with no fancy stuff. Using the Gecode back-end to solve the example problem, the following output is generated (assuming the model is in model.mzn and the data in data.dzn):

$ mzn2fzn -I/usr/local/share/gecode/mznlib/ model.mzn data.dzn
$ fz -mode stat -threads 0 model.fzn 
objective = 265;
prices = array1d(1..10, [1, 10, 9, 8, 7, 6, 5, 4, 3, 2]);
----------
objective = 258;
prices = array1d(1..10, [2, 10, 9, 8, 7, 6, 5, 4, 1, 3]);
----------
objective = 253;
prices = array1d(1..10, [3, 10, 9, 8, 7, 6, 5, 2, 1, 4]);
----------
objective = 250;
prices = array1d(1..10, [4, 10, 9, 8, 7, 6, 3, 2, 1, 5]);
----------
objective = 249;
prices = array1d(1..10, [5, 10, 9, 8, 7, 4, 3, 2, 1, 6]);
----------
==========

%%  runtime:       0.027 (27.471000 ms)
%%  solvetime:     0.027 (27.166000 ms)
%%  solutions:     5
%%  variables:     11
%%  propagators:   3
%%  propagations:  136068
%%  nodes:         47341
%%  failures:      23666
%%  peak depth:    33
%%  peak memory:   237 KB

For larger problems it is of course much slower, but the model will typically generate successively better solutions over time.

3

Posting some thoughts as a community wiki, feel free to edit.

This problem is easier to visualise if you think about the additional constraints as having to lay out or rearrange a set of top-to-bottom trees in such a way that every node must be to the right of its parent (products on the left are cheaper and those on the right are more expensive).

Let's say that two products are conflicting if the first has more stock than the second, and yet the first must not be cheaper than the other (so they are being "pulled" in different directions price-wise). Similarly, a conflicting group of products is one where at least two products are conflicting, and none of its products conflicts with any product outside the group.

We can make a few observations:

  1. When "placing" (assigning a price tag to) two conflicting products, they will always be next to each other.
  2. If you sort all products by quantity disregarding constraints, and then arrange them optimally so they satisfy the constraints, then the final positions of all products in a conflicting group will always be between (inclusively) the leftmost and rightmost initial positions of the products.
  3. Therefore, if you can split a constraint tree in two by removing a single right-pointing edge from the tree such that the range of products' initial positions from the bottom subtree and the path to the tree root doesn't overlap, you can safely treat them as two distinct constraint trees (or single nodes) and forget that there was a dependency between them. (simple example)

An algorithm idea:

  1. First, place all products not bound by restrictions.
  2. For each constraint tree:
    1. Split it up into subtrees on all right-pointing edges (edges between non-conflicting products). We now have a set of subtrees with all edges pointing to the left.
    2. For each subtree:
      1. Get a topologically sorted list of its products.
      2. Try to insert this list at every position from the lowest to the highest initial position of the products in this subtree, and settle on the one which yields the lowest total price.
    3. For each edge removed in step 2.1:
      1. If the new positions for two subtrees are "conflicting":
        1. Concatenate the higher with the lower list (special case of topological sort)
        2. Similarly try to find the optimal position for the concatenated list
        3. For future merging, consider the two examined subtrees as one subtree

The main problem with this algorithm is how to deal with displacement of already-placed constrained pairs. I imagine that simply trying to re-place displaced chains by iterative search might work, but the algorithm already looks too complicated to work right.

In the case that the number of distinct prices is low, you can use a deque (or doubly-linked list) for each distinct price, holding all the items with that price assigned to them. The deques are ordered from lowest to highest price. Inserting an item into a deque shifts the last item into the start of next deque (for the next higher distinct price), and so on for all deques after that.
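A minimal sketch of that deque idea, using Python's `collections.deque`. The function name, the `capacities` list (how many labels exist for each distinct price), and the error handling are assumptions made for illustration; the sketch assumes a free slot exists somewhere to the right, for example because the item was just removed from a more expensive deque.

```python
from collections import deque


def insert_with_cascade(buckets, capacities, bucket, pos, item):
    """Insert item into buckets[bucket] at pos, pushing overflow into pricier buckets."""
    buckets[bucket].insert(pos, item)
    b = bucket
    # While a bucket holds more items than it has labels, move its last item
    # to the front of the next (more expensive) bucket.
    while len(buckets[b]) > capacities[b]:
        if b + 1 == len(buckets):
            raise ValueError("no free label left at or above this price")
        buckets[b + 1].appendleft(buckets[b].pop())
        b += 1


# Three distinct prices with two labels each; the most expensive one has a free slot.
capacities = [2, 2, 2]
buckets = [deque(["a", "b"]), deque(["c", "d"]), deque(["e"])]
insert_with_cascade(buckets, capacities, 0, 0, "x")
print([list(d) for d in buckets])  # prints: [['x', 'a'], ['b', 'c'], ['d', 'e']]
```

Each insertion touches at most one item per distinct price, so with only a handful of prices the shift is cheap regardless of how many products there are.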

One thing to note about iterative / bubble-sort-ish algorithms: when you have a conflicting pair of products, it is not enough to greedily walk in either direction by one position until the next one does not yield an improvement. Here is a test case I got by playing around a bit with Mathematica writing a test case generator:

Price, $   1 2 7 9
Qty        3 2 1 4

The constraint is to have the 4-qty item to the right of the 1-qty item. As shown above, the total price is $50. If you move the pair one position to the left (so it's 3 1 4 2), the total goes up to $51, but if you go once further (1 4 3 2) it goes down to $48.
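The three totals in this test case can be reproduced with a few lines of Python (a small check added for illustration; the helper name `cost` is arbitrary):

```python
prices = [1, 2, 7, 9]


def cost(quantities):
    """Total price when quantities[i] units are sold at prices[i]."""
    return sum(p * q for p, q in zip(prices, quantities))


# Initial layout, pair moved one step left, pair moved two steps left.
for layout in ([3, 2, 1, 4], [3, 1, 4, 2], [1, 4, 3, 2]):
    print(layout, cost(layout))
# prints:
# [3, 2, 1, 4] 50
# [3, 1, 4, 2] 51
# [1, 4, 3, 2] 48
```

The middle layout is worse than both of its neighbours, which is exactly what makes a one-step-at-a-time search stop too early.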

2
  • I'm confused when you say "every node must be to the right of its parent" -- that would seem to remove the possibility of any left-pointing edges. What does it mean for A to be the parent of B? (I was assuming it meant "There is a constraint that product A must be assigned a lower price than product B".) Commented Feb 12, 2011 at 18:48
  • Yes, that's right. The part of my writeup that mentions trees with left-pointing edges refers to the trees before they are arranged to satisfy the constraints (product nodes are initially sorted by quantity). Commented Feb 12, 2011 at 19:02
3

This is a follow-up on Gero's answer. The idea is to modify his construction to show strong NP-hardness.

Instead of choosing $t_i=(2b+1)^i$, choose $t_i=i$. Now, you have to modify the argument that a solution with price $P=2b\sum_{1\leq i \leq k} t_i$ implies that there exists a 3-partition.

Take an arbitrary shelf order. Do the accounting in the following way: distribute $w_i-1$ units of quantity of the root product to its leaf products. Then every product has quantity 2. By definition of the constraints, this never shifts quantity to a higher price. After this shifting, the price will be exactly $P$. If the shifting moved some quantity to a lower price, the original price was strictly larger than $P$.

Hence, it is only possible to achieve the claimed price if all leaf products have the same price as their root product, which means that there exists a 3-partition.

Citing the result from a SWAT 2010 paper, this argument shows that even with unary encoding of the numbers and $k$ different price tags, a running time of $f(k)\cdot n^{O(1)}$ would violate "standard complexity assumptions". This makes the hinted-at dynamic programming with a running time of $n^{O(k)}$ look not so bad.


This is cross-posted from the same answer at cstheory.

1

You could first solve the simpler case, where you just sort labels by price and products by quantity and match both directly, and then apply an evolutionary process to this first approximation. Generate random variations of the ordered list of products by shifting a small number of randomly selected items up or down the list by just a few places, calculate the total cost of each variation, keep the best few, and make those the basis of your next generation. Iterating this process over a number of generations should eventually, I expect, give you the right answer to your problem in a fraction of the time it would take to brute-force the solution.

1
  • Generational algorithms will suffer from local minimum problems (see my community wiki post). Commented Feb 11, 2011 at 19:22
1

One way to attack the problem is to express it using 0-1 linear programming and solve it using Balas' Additive Algorithm. Here's how the problem could be encoded:

  • Variables: $N^2$ binary variables. For the sake of clarity I will index them by two integers: $x_{ij}$ is 1 if and only if product $i$ is assigned label $j$.
  • Objective function: minimize $\sum_{i,j} S_i P_j x_{ij}$ (represents the original objective function).
  • Constraint: for each $k$, $\sum_j (P_j x_{A_k j} - P_j x_{B_k j}) \le 0$ (represents the original price constraints).
  • Constraints: for each $i$, $\sum_j x_{ij} = 1$; for each $j$, $\sum_i x_{ij} = 1$ (says that $x$ encodes a permutation).

I'm not an expert in linear programming, so there is probably a more efficient encoding.
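To make the indexing concrete, here is a small Python sketch that evaluates this encoding for a given 0-1 matrix. It does not solve anything and does not implement Balas' algorithm; the function names are invented for illustration, and the instance is the three-product example from the comments on the question (A×1, B×10, C×2, with A not more expensive than B).

```python
def objective_value(x, S, P):
    """Sum over i and j of S[i] * P[j] * x[i][j]."""
    n = len(S)
    return sum(S[i] * P[j] * x[i][j] for i in range(n) for j in range(n))


def is_feasible(x, P, constraints):
    """Check the permutation constraints and the price constraints of the encoding."""
    n = len(P)
    rows_ok = all(sum(x[i]) == 1 for i in range(n))                        # one label per product
    cols_ok = all(sum(x[i][j] for i in range(n)) == 1 for j in range(n))   # one product per label
    prices_ok = all(
        sum(P[j] * x[a][j] - P[j] * x[b][j] for j in range(n)) <= 0
        for a, b in constraints
    )
    return rows_ok and cols_ok and prices_ok


S = [1, 10, 2]          # quantities of A, B, C
P = [1, 2, 3]           # label prices
constraints = [(0, 1)]  # price(A) <= price(B)

x_good = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # A:$1, B:$2, C:$3
x_bad = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]   # A:$3, B:$1, C:$2 violates the constraint
print(is_feasible(x_good, P, constraints), objective_value(x_good, S, P))  # prints: True 27
print(is_feasible(x_bad, P, constraints))                                  # prints: False
```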

3
  • @Bolo: 0-1 linear programming is NP-hard, right? So this would just be a good, optimized way to tackle the problem using a solver that, in the worst case, won't run in polynomial time? Commented Feb 10, 2011 at 9:35
  • @templatetypedef: the complexity of linear programming problems is often pessimistic. Commented Feb 10, 2011 at 9:44
  • 1
    @templatetypedef Right. If this problem is in P, then a relevant polynomial algorithm will surely outperform the 0-1 linear programming solution. However, if the problem is not in P (which I suspect is the case), then linear programming should be a reasonable approach, since it has been studied for over half a century and the solvers are often quite fast in practice.
    – Bolo
    Commented Feb 10, 2011 at 9:45
0

Generate the permutations of the prices in lexicographic order, and return the first one that fits the constraints.

Assuming products and prices are already sorted (fewest to most, and highest to lowest, respectively),

  1. Set l[k] = k+1 for 0 <= k < n and l[n] = 0. Then set k = 1.
  2. Set p = 0, q = l[0].
  3. Set M[k] = q. If any price constraint specifically involving P[k] fails, go to 5. Otherwise, if k = n, return M[1]...M[n].
  4. Set u[k] = p, l[p] = l[q], k = k + 1 and go to 2.
  5. Set p = q, q = l[p]. If q != 0 go to 3.
  6. Set k = k - 1, and terminate if k = 0. Otherwise, set p = u[k], q = M[k], l[p] = q and go to 5.

This is (a slight modification of) Algorithm X from Knuth's Art of Computer Programming, Volume 4, Fascicle 2, Section 7.2.1.2. As with most of Knuth's algorithms, it uses 1-based indexing. Hacking it to fit the 0-based indexing of your typical programming language I leave as an exercise for the reader.

Edit:

Unfortunately, it turns out that this doesn't guarantee a non-decreasing sequence. I'll have to give it more thought to see if this can be salvaged.

3
  • Will this work correctly given that the quantities are not all the same? Commented Feb 10, 2011 at 4:01
  • It should. If you think about it, the best possible solution is one where the product with the lowest numbers in inventory gets the highest price, and then so on down the list with each product with increasing numbers getting a lower price. Permuting the price mapping in lexicographic order, then, will find progressively higher cost solutions with no reversals. Commented Feb 10, 2011 at 4:11
  • How will that scale to N items with potentially many price constraints? Commented Feb 10, 2011 at 4:52
