3
$\begingroup$

For all $n \geqslant 1,$ let $I_n = \{1, 2, \ldots, n\}.$

Let $X$ be a commutative semigroup.

Let $(x_i)_{i \in I_n}$ be a finite sequence in $X.$ That is to say, let $x \colon I_n \to X$ be any function. The generalised associative law enables us to define, unambiguously, \begin{equation} \label{eq:DSCS1}\tag{DSCS$1$} \sum x = \sum_{i=1}^n x_i = x_1 + x_2 + \cdots + x_n. \end{equation} (The notation $\sum x$ is not standard.) This generalised sum operation has the following property. For any $m \geqslant 1,$ and any strictly increasing sequence $(r_j)_{0 \leqslant j \leqslant m}$ in $\mathbb{N}$ such that $r_0 = 1$ and $r_m = n + 1$ (of course, this implies $m \leqslant n$), \begin{equation} \label{eq:TSCS1}\tag{TSCS$1$} \sum_{j=1}^m\sum_{k=r_{j-1}}^{r_j-1}x_k = \sum_{i=1}^n x_i. \end{equation} (I hope it is clear without a formal definition what is meant by the inner sum. If not: define $y_s = x_{r_{j-1}+s-1}$ for $s = 1, \ldots, r_j - r_{j-1}.$) The sum of a sequence $(x_1)$ of length $1$ is $x_1.$ A consequence of commutativity is that for any permutation $\sigma \colon I_n \to I_n,$ \begin{equation} \label{eq:TSCS2}\tag{TSCS$2$} \sum_{i=1}^n x_{\sigma(i)} = \sum_{i=1}^n x_i. \end{equation}

Let $A$ be a finite non-empty set, i.e. let there be a bijection $\alpha \colon I_n \to A$ for some (unique) $n \geqslant 1.$ Let $f \colon A \to X$ be a function. In view of \eqref{eq:TSCS2}, we can define, unambiguously, for any choice of $\alpha,$ \begin{equation} \label{eq:DSCS2}\tag{DSCS$2$} \sum f = \sum_A f = \sum_{a \in A} f(a) = \sum f \circ \alpha = \sum_{i=1}^n f(\alpha_i). \end{equation} (Of these notations, only the third and fifth are standard.) In the case where $A = I_n$ and $\alpha$ is the identity permutation, \eqref{eq:DSCS2} agrees with \eqref{eq:DSCS1}.

In Algebra (revised third edition, 2002, page 5), Serge Lang writes:

There are number of formal rules for dealing with [sums] which it would be tedious to list completely.

Tedious they might be, but then so is much foundational work of the kind undertaken by Bourbaki and Landau, for example, and nobody disputes its usefulness, especially not for such a subjective reason! A statement of rules for computing with finite sums would be useful in dealing with more complex cases, in which intuition has little to offer, and mistakes are likely.

(This is to say nothing of the more sophisticated and not purely algebraic concept of infinite unordered sums in complete Hausdorff commutative topological groups, whose definition takes for granted the properties of finite sums in semigroups, monoids, and groups, and to which my comment applies with even more force.)

Because questions requiring non-trivial manipulations of this kind keep coming up in MSE (I could list a few examples, but perhaps that isn't necessary), I have started work on an answer to this question, aiming to state and prove some of the more commonly useful results.

Can someone save me the bother (not that I wouldn't enjoy the work!) by giving a reference to a textbook or on-line article that already does the job?

$\endgroup$
6

1 Answer 1

2
$\begingroup$

If $S$ is a commutative semigroup, $A, B$ finite non-empty sets, $\theta \colon B \to A$ a bijection, and $f \colon A \to S$ any function, then \begin{equation} \label{eq:TSCS3}\tag{TSCS$3$} \sum_{b \in B} f(\theta(b)) = \sum_{a \in A} f(a) \end{equation}

Proof. This is almost trivial. If $\alpha \colon I_n \to B$ is any bijection, then by two applications of (DSCS$2$), $$ \sum_B f \circ \theta = \sum (f \circ \theta) \circ \alpha = \sum f \circ (\theta \circ \alpha) = \sum_A f, $$ because $\theta \circ \alpha \colon I_n \to A$ is a bijection. $\square$

Let $S$ be a commutative semigroup, $K$ a finite non-empty set, $(A_k)_{k \in K}$ a pairwise disjoint family of finite non-empty sets, and $(f_k \colon A_k \to S)_{k \in K}$ a family of functions. Write $B = \bigcup_{k \in K} A_k,$ and let $g \colon B \to S$ be the unique function whose restriction to $A_k$ is $f_k$ for all $k \in K.$ (More simply, let $g \colon B \to S$ be any function, and for all $k \in K,$ define $f_k$ to be the restriction of $g$ to $A_k.$) Then: \begin{equation} \label{eq:TSCS4}\tag{TSCS$4$} \sum_{b \in B} g(b) = \sum_{k \in K} \sum_{a \in A_k} f_k(a) = \sum_{k \in K} \sum_{a \in A_k} g(a). \end{equation}

Proof. Let the distinct elements of $K$ be $k_1, k_2, \ldots, k_m.$ For $j = 1, 2, \ldots, m,$ let the distinct elements of $A_{k_j}$ be $a_{j1}, a_{j2}, \ldots, a_{j,n_j}.$ Then the distinct elements of $B$ are $$ (b_1, b_2, \ldots, b_p) = (a_{11}, a_{12}, \ldots, a_{1n_1}, a_{21}, a_{22}, \ldots, a_{2,n_2}, \ldots, a_{m1}, a_{m2}, \ldots, a_{m,n_m}) $$ where $p = n_1 + n_2 + \cdots + n_m.$

By the generalised associative law (TSCS$1$) and three applications of (DSCS$2$), \begin{multline*} \sum_B g = \sum_{i=1}^p g(b_i) = g(b_1) + g(b_2) + \cdots + g(b_p) = \\ g(a_{11}) + \cdots + g(a_{1n_1}) + g(a_{21}) + \cdots + g(a_{2,n_2}) + \cdots + g(a_{m1}) + \cdots + g(a_{m,n_m}) \\ = [g(a_{11}) + \cdots + g(a_{1n_1})] + \cdots + [g(a_{m1}) + \cdots + g(a_{m,n_m})] \\ = \sum_{j=1}^m [g(a_{j1}) + \cdots + g(a_{j,n_j})] = \sum_{j=1}^m \sum_{i=1}^{n_j} g(a_{ji}) = \sum_{j=1}^m \sum_{A_{k_j}} g = \sum_{k \in K} \sum_{A_k} g, \end{multline*} where the final step is an application of (DSCS$2$) to the function $$ K \to S, \ k \mapsto \sum_{A_k} g, $$ using the bijection $I_m \to K,$ $j \mapsto k_j.$ $\square$


Alternative proof. To skip this, go to the next horizontal line.

(I wish I still had a copy of Carl E. Linderholm's satirical classic Mathematics Made Difficult, because this is probably worthy of inclusion! But it has the merit of avoiding the ugly notation and heavy use of ellipsis that mar the first proof.)

A coproduct of a family of sets $(A_k)_{k \in K}$ is a set $C$ together with a family of functions $(\gamma_k \colon A_k \to C)_{k \in K}$ possessing the universal property that for every set $S$ and every family of functions $(f_k \colon A_k \to S)_{k \in K}$ there exists a unique function $g \colon C \to S$ such that $g \circ \gamma_k = f_k$ for all $k \in K$:

In particular, the only function $g \colon C \to C$ such that $g \circ \gamma_k = \gamma_k$ for all $k \in K$ is the identity map. It follows that if $(C, (\gamma_k \colon A_k \to C)_{k \in K})$ and $(D, (\delta_k \colon A_k \to D)_{k \in K})$ are coproducts of the same family $(A_k)_{k \in K},$ there is a unique bijection $\theta \colon D \to C$ such that $\theta \circ \delta_k = \gamma_k$ for all $k \in K$:

More generally, suppose that $(B_j)_{j \in J}$ is another family of sets such that there is a bijection $\beta \colon J \to K,$ and for each $j \in J,$ a bijection $\alpha_j \colon B_j \to A'_j,$ where $A'_j = A_{\beta(j)}.$ Put $\gamma'_j = \gamma_{\beta(j)}$ for all $j \in J.$ Then $(C, (\gamma'_j)_{j \in J})$ is a coproduct of $(A'_j)_{j \in J},$ therefore $(C, (\gamma'_j \circ \alpha_j)_{j \in J})$ is a coproduct of $(B_j)_{j \in J},$ therefore there exists a unique bijection $\theta \colon D \to C$ such that this square commutes for all $j \in J$:

Therefore, for every family of functions $(f_k \colon A_k \to S)_{k \in K},$ there exists a unique function $g \colon C \to S$ making this diagram $(*)$ commute for all $j \in J$:

where $f'_j = f_{\beta(j)}$.

Every set-indexed family of sets $(A_k)_{k \in K}$ has a coproduct. The usual construction, which confusingly is called a disjoint union of the $A_k$ (even though the point is that the $A_k$ need not be disjoint), is $E = \bigcup_{k \in K} A^*_k,$ where $A^*_k = \{k\} \times A_k$ for all $k \in K$ (or $A^*_k = A_k \times \{k\},$ it makes no difference).

As the Wikipedia article rightly observes, the crucial property of such a set $E$ is that there is a family of injective functions $(\epsilon_k \colon A_k \to E)_{k \in K}$ whose images $A^*_k$ form a partition of $E$. It is clear that any such set $E$, with such functions $\epsilon_k,$ has the universal property needed for it to be a coproduct of the $A_k.$

The converse is also true. Let $(C, (\gamma_k)_{k \in K})$ be any coproduct of the $A_k,$ and let $\varphi \colon C \to E$ be the unique bijection such that $\varphi \circ \gamma_k = \epsilon_k$ for all $k \in K.$ Then each of the functions $\gamma_k = \varphi^{-1} \circ \epsilon_k$ is an injection, and their images $\gamma_k(A_k) = \varphi^{-1}(A^*_k)$ form a partition of $C,$ so $(C, (\gamma_k)_{k \in K})$ is a disjoint union of the $A_k,$ in the sense defined by the Wikipedia article.

We are interested in the case where the set $K$ and all the sets $A_k$ are finite and non-empty. That is, there exists a positive integer $m,$ a bijection $\beta \colon I_m \to K,$ and positive integers $(n_j)_{1 \leqslant j \leqslant m},$ such that there are bijections $$ \alpha_j \colon I_{n_j} \to A_{\beta(j)} \quad (1 \leqslant j \leqslant m). $$ Given a positive integer $m$ and a sequence of positive integers $(n_j)_{1 \leqslant j \leqslant m},$ define: \begin{gather*} r_j = 1 + \sum_{l=1}^j n_l \quad (j = 0, 1, \ldots, m), \\ p = \sum_{l=1}^m n_l = r_m - 1, \\ \delta_j \colon I_{n_j} \to I_p, \ s \mapsto r_{j-1} + s - 1 \quad (j = 0, 1, \ldots, m). \end{gather*} Then $I_p$ is the union of the pairwise disjoint sets $\delta_j(I_{n_j}),$ and the functions $\delta_j$ are injective, therefore $(I_p, (\delta_j)_{1 \leqslant j \leqslant m})$ is a disjoint union of the finite non-empty sequence of sets $(I_{n_j})_{1 \leqslant j \leqslant m}$; thus it has the universal property of a coproduct.

We now write the generalised associative law (TSCS$1$) in a more convenient form when the semigroup $S$ is commutative. Let $(y^{(j)} \colon I_{n_j} \to S)_{j \in I_m}$ be a finite non-empty sequence of finite non-empty sequences in $S,$ and, using the universal property, let $x \colon I_p \to S$ be the unique finite non-empty sequence in $S$ such that: $$ x \circ \delta_j = y^{(j)} \quad (j \in I_m). $$ Then: \begin{equation} \label{eq:TSCS1p}\tag{TSCS$1'$} \sum x = \sum_{i=1}^p x_i = \sum_{j=1}^m\sum_{k=1}^{n_j}y^{(j)}_k = \sum_{j \in I_m} \sum y^{(j)}. \end{equation} Here (DSCS$2$) has been applied to the function $I_m \to S,$ $j \mapsto \sum y^{(j)},$ using the identity permutation of $I_m.$ The expressions $\sum x$ and $\sum y^{(j)}$ can also be interpreted as finite unordered sums, by using the identity permutations of $I_p$ and $I_{n_j}$ respectively. The resulting interpretation of \eqref{eq:TSCS1p}, as an identity between finite unordered sums indexed by sets forming a coproduct configuration, is the prototype of a general result, which is now derived as an easy corollary.

In the diagram $(*),$ take $J = I_m,$ $B_j = I_{n_j}$ ($1 \leqslant j \leqslant m$), and $D = I_p,$ with the functions $(\delta_j \colon I_{n_j} \to I_p)_{j \in J}$ as just defined. Given a family of functions $(f_k \colon A_k \to S),$ and the associated function $g \colon C \to S,$ define the sequences $$ y^{(j)} = f'_j \circ \alpha_j \colon I_{n_j} \to S \quad (j \in J). $$ Let the sequence $x \colon I_p \to S$ be determined by the $y^{(j)}$ as above. Then for all $j \in I_m,$ the square and left, right, and lower triangles in this diagram commute:

Moving round the diagram, we find, for all $j \in I_m$: $$ (g \circ \theta) \circ \delta_j = g \circ (\theta \circ \delta_j) = g \circ (\gamma'_j \circ \alpha_j) = (g \circ \gamma'_j) \circ \alpha_j = f'_j \circ \alpha_j = y^{(j)}. $$ Therefore, by the universal property of the coproduct $(I_p, (\delta_j)_{j \in I_m}),$ the upper triangle also commutes: $$ x = g \circ \theta. $$

With so much preparatory work done, it follows easily that for any coproduct $(C, (\gamma_k \colon A_k \to C)_{k \in K})$ where the sets $K$ and $(A_k)_{k \in K}$ are finite and non-empty, and functions $g \colon C \to S,$ $(f_k \colon A_k \to S)_{k \in K}$ such that $f_k = g \circ \gamma_k$ for all $k \in K$: \begin{equation} \label{eq:TSCS5}\tag{TSCS$5$} \sum_{c \in C} g(c) = \sum_{k \in K} \sum_{a \in A_k} f_k(a). \end{equation}

Proof. \begin{align*} \sum_C g & = \sum_{I_p} x && \text{by } \eqref{eq:TSCS3} \text{ using } \theta \\ & = \sum_{j \in I_m} \sum y^{(j)} && \text{by } \eqref{eq:TSCS1p} \\ & = \sum_{j \in I_m} \sum_{A'_j} f'_j && \text{by } \eqref{eq:TSCS3} \text{ using } \alpha_j \\ & = \sum_{j \in I_m} \sum_{A_{\beta(j)}} f_{\beta(j)} && \text{by the definitions of } A'_j \text{ and } f'_j \\ & = \sum_{k \in K} \sum_{A_k} f_k && \text{by } \eqref{eq:TSCS3} \text{ using } \beta. \quad \square \end{align*}

\eqref{eq:TSCS4} is, of course, an immediate corollary of \eqref{eq:TSCS5}.


If $A, B$ are finite non-empty sets, $S$ a commutative semigroup, and $f \colon A \times B \to S$ a function, then \begin{equation} \label{eq:TSCS6}\tag{TSCS$6$} \sum_{a \in A} \sum_{b \in B} f(a, b) = \sum_{b \in B} \sum_{a \in A} f(a, b). \end{equation}

Proof. Because $(\{a\} \times B)_{a \in A}$ and $(A \times \{b\})_{b \in B}$ are partitions of $A \times B,$ \begin{align*} \sum_{a \in A} \sum_{b \in B} f(a, b) & = \sum_{a \in A} \ \sum_{(a, b) \in \{a\} \times B} f(a, b) && \text{by } \eqref{eq:TSCS3} \\ & = \! \sum_{(a, b) \in A \times B} \! f(a, b) && \text{by } \eqref{eq:TSCS4} \\ & = \sum_{b \in B} \ \sum_{(a, b) \in A \times \{b\}} f(a, b) && \text{by } \eqref{eq:TSCS4} \\ & = \sum_{b \in B} \sum_{a \in A} f(a, b) && \text{by } \eqref{eq:TSCS3}. \quad \square \end{align*}

If $S, S'$ are commutative semigroups, $\sigma \colon S \to S'$ a homomorphism, $A$ a finite non-empty set, and $f \colon A \to S$ a function, then \begin{equation} \label{eq:TSCS7}\tag{TSCS$7$} \sigma\left(\sum_{a \in A} f(a)\right) = \sigma\left(\sum_A f\right) = \sum_A \sigma \circ f = \sum_{a \in A} \sigma(f(a)). \end{equation}

Proof. By induction on $n,$ for any finite non-empty sequence $x \colon I_n \to S,$ $$ \sigma\left(\sum x \right) = \sigma\left(\sum_{i=1}^n x_i \right) = \sum_{i=1}^n \sigma(x_i) = \sum \sigma \circ x. $$ Therefore, if $\alpha \colon I_n \to A$ is any bijection, then $$ \sigma\left(\sum_A f\right) = \sigma\left(\sum f \circ \alpha\right) = \sum \sigma \circ (f \circ \alpha) = \sum (\sigma \circ f) \circ \alpha = \sum_A \sigma \circ f, $$ as stated. $\square$

If $S$ is a semiring (not assumed to have a $0$ or $1$), $s$ an element of $S,$ $A$ a finite non-empty set, and $f \colon A \to S$ a function, then \begin{gather} \label{eq:TSCS8a}\tag{TSCS$8_\text{a}$} s \! \left(\sum_{a \in A} f(a)\right) = \sum_{a \in A} sf(a), \\ \label{eq:TSCS8b}\tag{TSCS$8_\text{b}$} \left(\sum_{a \in A} f(a)\right) \! s = \sum_{a \in A} f(a)s. \end{gather}

Proof. These follow from \eqref{eq:TSCS7} when we take $\sigma \colon S \to S,$ $t \mapsto st,$ or $\sigma \colon S \to S,$ $t \mapsto ts,$ respectively. $\square$

If $S$ is a semiring, $A, B$ finite non-empty sets, and $f \colon A \to S$ and $g \colon B \to S$ functions, then \begin{equation} \label{eq:TSCS9}\tag{TSCS$9$} \sum_{a \in A} \sum_{b \in B} f(a)g(b) = \left(\sum_{a \in A} f(a)\right)\left(\sum_{b \in B} g(b)\right). \end{equation}

Proof. Applying \eqref{eq:TSCS8a}, followed by \eqref{eq:TSCS8b}, $$ \sum_{a \in A} \sum_{b \in B} f(a)g(b) = \sum_{a \in A} \left(f(a)\sum_{b \in B} g(b)\right) = \left(\sum_{a\in A} f(a)\right)\left(\sum_{b\in B} g(b)\right). \quad \square $$

Other results (tedious or not) could be proved, but I've chosen to end here, because \eqref{eq:TSCS3}, \eqref{eq:TSCS4}, \eqref{eq:TSCS6} and \eqref{eq:TSCS8a}/\eqref{eq:TSCS8b} are the only unwritten results about finite unordered sums that I've found myself relying on in the eight threads listed in the comments.

$\endgroup$

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .