5
$\begingroup$

Given N arrays of variable length.

Find a way to concatenate the arrays in such a way that the number of inversions is minimum

An inversion for an array Arr can be defined as,

a pair of indices (i,j) such that,

i != j

i < j

Arr[i]>Arr[j]

I tried to concatenate the arrays on the basis of sum, such that the one with minimal sum goes first and one with maximal sum goes last.

It didn't work out though.

Example:-

[14, 18, 18, 20, 16, 6, 11] SUM:- 103

[2, 4, 11, 40, 20, 14, 19] SUM:- 110

[14, 18, 18, 20, 16, 6, 11, 2, 4, 11, 40, 20, 14, 19] Inversions:- 42

[2, 4, 11, 40, 20, 14, 19, 14, 18, 18, 20, 16, 6, 11] Inversions:- 40
$\endgroup$
6
  • $\begingroup$ What running time are you looking for? $\endgroup$ Commented Sep 23, 2019 at 12:04
  • 1
    $\begingroup$ something less than n!. O(n) if possible. O(nlog(n)) will do too. $\endgroup$
    – asds_asds
    Commented Sep 23, 2019 at 12:20
  • 1
    $\begingroup$ you are either ignoring the number of the arrays or the lengths of them. If n is the number of array and m is the maximum length of an array, then O(nm) is needed to read the input and hence is a lower-bound (or at least the sum of the lengths) $\endgroup$ Commented Sep 23, 2019 at 12:23
  • 4
    $\begingroup$ @narekBojikian and others who are interested, I discovered something very interesting about inversions after trying long and hard to prove something impossible. If $A, B, C$ are arrays such that $A < B$ and $B < C$ in terms of minimizing inversions for pairs $AB, BA$ and $BC, CB$, then it does not mean that $A < C$. A counterexample is $A = [3], B = [4, 5, 0], C = [1, 2, 6]$. Here we have that $AB$ has less inversions (3) than $BA$ (4), $BC$ has less inversions (6) than $CB$ (7), yet $AC$ has more inversions (2) than $CA$ (1). Very surprising (to me), and why I deleted my answer. $\endgroup$
    – orlp
    Commented Sep 23, 2019 at 18:07
  • 2
    $\begingroup$ I was able to prove that the optimal combination of $A, B, C$ with $A < B$ and $B < C$ in terms of inversions must be either $ABC, BCA$ or $CAB$, but all of those are possible. For example $A = [8, 7, 1]$, $B = [3, 2, 9]$, $C = [6, 5, 4]$ has an optimal amount of inversions with $BCA$ despite $AB$ beating $BA$ and $BC$ beating $CB$. $\endgroup$
    – orlp
    Commented Sep 23, 2019 at 18:17

1 Answer 1

4
$\begingroup$

As an exercise to the reader, two lemmas:

Lemma 1. If $a, b$ are strings of symbols with a partial order, and $\text{numinv}$ counts the number of inversions in a string of symbols then $$\text{numinv}(ab) = \text{numinv}(a) + \text{numinv}(b) + \text{crossinv}(a, b)$$ where $$\text{crossinv}(a, b) = |\{(i,j) \in \{1,\dots,|a|\}\times\{1,\dots,|b|\} \mid a_i > b_j\}|.$$

Lemma 2. $\text{crossinv}$ has the distributive property over concatenation, that is for strings $a, b, c$ we have

\begin{align} \text{crossinv}(a, bc) &= \text{crossinv}(a, b) + \text{crossinv}(a, c)\\ \text{crossinv}(ab, c) &= \text{crossinv}(a, c) + \text{crossinv}(b, c). \end{align}

You can use the above two lemmas repeatedly to generalize, and see that for a tuple of strings $(x_1, \dots, x_n)$

$$\text{numinv}(x_1x_2\dots x_n) = \sum_{i=1}^n \text{numinv}(x_i) + \sum_{i=1}^n\sum_{j=i+1}^n\text{crossinv}(x_i, x_j).$$

Since the first sum is constant for any re-ordering of $x$, we can ignore it for our problem. In fact, if we sort the symbols inside each $x_i$ you can compute $\text{crossinv}(x_i, x_j)$ in $O(|x_i| + |x_j|)$ time. So we only need to minimize the second sum by re-ordering $x$.

However, I do believe that this is a difficult problem to solve optimally. A greedy swapping approach won't work. As a counter-example (I apologize for the somewhat length counterexample, it's just one I found during experiments with randomly generated arrays) consider the following array of arrays:

$$(39, 23, 9, 31, 1, 44), (35, 41, 1, 49, 23, 38, 4, 11), (13, 10, 17, 5, 38, 50, 46), (28, 21), (45, 46, 33)$$

This has $102$ inversions when concatenated. The unique optimal concatenation order is

$$(28, 21), (39, 23, 9, 31, 1, 44), (35, 41, 1, 49, 23, 38, 4, 11), (13, 10, 17, 5, 38, 50, 46), (45, 46, 33)$$

with $100$ inversions. However this can't be generated with a simple swap. I've found similar counterexamples for single-symbol substring rotations.

A simple greedy growing approach won't work either. $(41, 3, 31), (16, 11, 47)$ is uniquely optimally ordered, but if we extend this with $(30, 20)$ the unique optimal ordering becomes

$$(16, 11, 47), (30, 20), (41, 3, 31)$$

Which is in no way a simple extension of the previous.

To go even further, Tim shows in this answer that if you view $\text{crossinv}$ as a black box function, and you don't exploit any of its special properties, then this problem is NP-complete and you're doomed to fail. So any efficient optimal algorithm must use some special property of $\text{crossinv}$.

For very large amount of arrays you will probably see the best real-world results using a non-deterministic technique such as genetic evolution to optimize the number of inversions, although this wouldn't guarantee optimality. Or perhaps I'm overestimating this problem's difficulty and someone comes along with a better answer.

$\endgroup$

Not the answer you're looking for? Browse other questions tagged or ask your own question.