13
$\begingroup$

I've come across the following linear algebra problem while trying to derive something in information theory. I'm looking both for numerical ways to solve this type of problem and for anything analytical that can be said about it. If there's a closed-form solution (e.g. in terms of matrix algebra) that would be great.

I have a matrix $C$ (which might be singular) and vectors $a$ and $b$ such that $\sum_i a_i = \sum_j b_j$. I'm looking for vectors $\alpha$ and $\beta$ such the following conditions hold simultaneously: $$ \sum_i \alpha_i C_{ij} \beta_j = b_j\qquad\text{(for every $j$)} $$ and $$ \sum_j \alpha_i C_{ij} \beta_j = a_i\qquad\text{(for every $i$).} $$

Clearly there are cases where no solution exists, such as when $C$ is diagonal and $a\ne b$, but I suspect that in my case there always will be a solution. It looks like this should be easy to solve, but I can't quite see how to go about it.

The following may or may not be relevant for answering the question, but in my case the numbers all relate to a joint probability distribution over variables $A$, $B$ and $X$:

  • The elements of $C$ are the marginal distribution for $A$ and $B$. That is, $C_{ij} = p(A=i,B=i)$;
  • $a$ and $b$ are the following marginal conditional probability distributions: $a_i = p(A=i \mathop{|} X=k)$ and $b_j = p(B=j \mathop{|} X=k)$, for some particular $k$;
  • The numbers $\alpha_i C_{ij} \beta_j$ are the elements of an (unknown) conditional distribution $p(A=i,B=j\mathop{|}X=k)$, which is what I'm attempting to find. (Actually it's not the true conditional distribution but a minimum-information estimate of it, which is why it has this particular form.)

In terms of the linear algebra problem, this means that $a$, $b$ and $C$ satisfy the following constraints:

  1. all the elements of $C$, $a$ and $b$ are real and between 0 and 1.
  2. $\sum_{ij}C_{ij} = 1$
  3. $\sum_{i}a_{i} = \sum_j b_j = 1$.
  4. $a$ and $b$ can be expressed as the row and column sums of a matrix $D$ with real entries in $[0,1]$ such that $D_{ij}=0$ whenever $C_{ij}=0$, and $\sum_{ij}D_{ij}=1$. (The elements of $D$ are the "true" conditional distribution $p(A=i,B=j\mathop{|}X=k)$.)
$\endgroup$
6
  • $\begingroup$ If anyone can think of a better title for the question, that would be really helpful. $\endgroup$
    – N. Virgo
    Commented Apr 28, 2013 at 8:10
  • $\begingroup$ I don't understand the sentence beginning with "Clearly there are cases where..." If $C$ is diagonal with non-zero diagonal entries $c_1, c_2, \ldots, c_n$, could we not just set $\alpha_j = b_j/C_ii$? $\endgroup$
    – Adam Saltz
    Commented Apr 28, 2013 at 22:13
  • 1
    $\begingroup$ @AdamSaltz I've corrected a very silly mistake in the question. Your comment was correct of course, in regards to the question as I asked it, but the way it is now is what I intended to say. $\endgroup$
    – N. Virgo
    Commented Apr 29, 2013 at 2:39
  • 2
    $\begingroup$ The special case where $C$ has non-negative entries and every $a_i$ and $b_j$ equals one corresponds to a problem sometimes studied under the name "Matrix Scaling". There the necessary and sufficient condition is that $C$ has non-zero permanent (i.e. every set of $k$ rows has at least $k$ columns with nonzero entries for all $k$), and one algorithm (due to Sinkhorn and Knopp) is fairly straightforward (essentially scale each row so the row-sums are $1$, then each column so the column-sums are $1$, and keep repeating). (Continued next comment) $\endgroup$ Commented May 1, 2013 at 0:01
  • $\begingroup$ This condition isn't true in your case (as your diagonal example shows), but maybe there's some way of using it as a starting point. $\endgroup$ Commented May 1, 2013 at 0:01

1 Answer 1

2
+50
$\begingroup$

This is expanding on my comment above, though it's still not a full answer.

We'll assume that $C$, $a$ and $b$ are all non-negative. For any subset $S$ of rows, let $N(S)$ denote the set of columns having at least one nonzero entry in some row of $S$. Similarly, for a subset $T$ of columns, let $M(T)$ denote the set of rows having at least one nonzero entry in some column of $T$.

One necessary condition (the corresponding one to the "non-zero permanent" one in the all-$1$ case) is that for all $S$ and $T$ we have $$\sum_{i \in S} a_i \leq \sum_{j \in N(S)} b_j$$ $$\sum_{j \in T} b_j \leq \sum_{i \in M(T)} a_i.$$ (In the diagonal case, this reduces down to $a_i=b_i$. If all the entries of $C$ are positive, this reduces down to $\sum a_i = \sum b_j$)

I suspect that actually this condition is sufficient as well as necessary. This is motivated in part by the all-$1$'s case (where it is sufficient), and partly on some numerical examples I ran using the analogue of the Sinkhorn-Knopp algorithm: Repeatedly

  1. Scale each row so that the row sums equal the required value.
  2. Scale each column so that the column sums equal the required value.

The examples I tried had $a_i=b_i=i$ and $C$ taken to be a $50 \times 50$ matrix where the lower right $k \times k$ minor is set to $0$ and the remaining entries are uniform on $[0,1]$.

If $k=15$, then the above condition fails for $S=\{36,37,\dots,50\}$, and as expected the algorithm failed to converge. If $k=14$, the algorithm converged fairly rapidly on the examples I tried (all row sums within $10^{-8}$ of the correct value within $100$ iterations).

$\endgroup$
2
  • 1
    $\begingroup$ Thanks very much for this. I've realised that the interpretation in terms of probabilities implies an additional constraint on the relationship between $a$, $b$ and $C$, and I'm wondering whether this constraint is enough to guarantee the existence of a solution and/or convergence to it. I've edited the question to include the new constraint. (Item 4 in the numbered list at the end.) $\endgroup$
    – N. Virgo
    Commented May 2, 2013 at 2:56
  • 1
    $\begingroup$ The additional constraint you added guarantees that both of the inequalities in my answer are satisfied, so I suspect it's enough. I don't have any proof of this though. $\endgroup$ Commented May 6, 2013 at 23:25

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .