Creating larger structures from smaller ones without an explicit construction

Question

I'm asking this question as a replacement for my previous one, which I admit isn't clear, and which I am voting to close. Hopefully I'll be clearer now.

Admittedly, I'm not sure if this question pertains to abstract algebra, or if it is more of a soft question/philosophy question.

I'll use vector spaces to illustrate what I'm asking. Consider the $2$-dimensional vector space of ordered pairs of real numbers over the real numbers, where addition and scalar multiplication are the usual ones. When talking of something that "generates" this vector space, it is easy to point at two linearly independent vectors, such as $(1,0)$ and $(0,1)$, and correctly assert that they generate the vector space. In this case, we began with a vector space, and then looked for generators within its already pre-defined structure.

Now instead of starting with the vector space, suppose we go the opposite way, and ask straight away "what vector space do the ordered pairs $(1,0)$ and $(0,1)$ generate over the reals?" I suppose the answer would be "the space of all linear combinations of $(1,0)$ and $(0,1)$ with real coefficients". One such linear combination would be, for example, $3(1,0) +(-5)(0,1)$. But unless I explicitly state that scalar multiplication in this case corresponds to real multiplication of each element of the pair by that scalar, and that vector addition corresponds to real addition and that the pairs must be added component-wise, I can't state that $3(1,0) +(-5)(0,1) = (3,-5)$. If we did say that $3(1,0) +(-5)(0,1) = (3,-5)$, then we'd implicitly be leaning on previous concepts, namely the ones I just mentioned. But without those, what are we supposed to interpret an element like $3(1,0) +(-5)(0,1)$ to be? Did we conjure its and the vector space's existence out of nothing? Or is it merely a formal sum of two formal products, and all that matters is the symbolic construction?

The reason I'm asking this is because of the way some explanations use the word "generate" to signify that a smaller structure generates a larger one, without having said what the larger one is. For example, it is said that vector spaces generate tensor products or Clifford algebras. Continuing with the example of the vector space of ordered pairs of real numbers over the reals, by equipping it with the standard inner product, I'm allowed to say that this space generates a Clifford algebra with a subspace of bivectors in it. Well, what are these bivectors? Yes, I can symbolically represent them as $\alpha (1,0) \wedge (0,1)$ and say that $(0,1) \wedge (1,0) = -(1,0) \wedge (0,1)$, but can I go any further than this without introducing additional characterizations, such as saying that $(a,b) \wedge (c,d) = ad - bc$? Or is it that the symbolism is what matters? If instead we began with a specific instance of a Clifford algebra that had the vector space in question as a subspace, then it'd make more sense to me to say something like "the vector space generates the Clifford algebra".

Maybe I'm stuck on what mathematicians mean by "generate" when they use the word without explicitly constructing an example of the larger structure when starting from a smaller one. Is the intention to build a "skeleton" of the larger structure, without explicitly saying what its elements should be? If this is the case, isn't this potentially problematic, as not giving an explicit construction may raise the question of whether anything that fits this "skeleton" even exists at all, aside from formal symbolic expressions?

You can delete your own question, rather than "voting to close"... — Arturo Magidin, Commented Jun 1 at 20:22

Qiaochu Yuan · Accepted Answer · 2024-06-02 05:21:11Z

Now instead of starting with the vector space, suppose we go the opposite way, and ask straight away "what vector space do the ordered pairs $(1,0)$ and $(0,1)$ generate over the reals?"

The problem with this is that it's not clear what the notation $(1, 0)$ and $(0, 1)$ means here. Are these formal symbols or are they ordered pairs of real numbers? If the latter, you are already talking about $\mathbb{R}^2$ (this is clear later in your discussion when you manipulate sums of ordered pairs in the usual way), so you haven't avoided doing that.

The reason I'm asking this is because of the way some explanations use the word "generate" to signify that a smaller structure generates a larger one, without having said what the larger one is. For example, it is said that vector spaces generate tensor products or Clifford algebras.

Yes, this is a subtle and important point. When we start with a vector space $V$ and produce from it, say, the tensor algebra $T(V)$ (I will stick to this example for now), this is not an object that lives inside an existing vector space we have already defined; it is a new vector space we cook up, using tensor products and direct sums, which are constructions that produce new vector spaces out of old ones. It's a little unclear what it means to say that $V$ "generates" the tensor algebra $T(V)$; more precisely we should say it freely generates $T(V)$, which is a statement with rigorous meaning (that I won't go into here).

The direct sum is not so hard to understand; it basically consists of finite sequences. The tensor product, however, is a genuinely tricky construction the first time you see it. You can say that the tensor product $V \otimes V$ consists of sums of formal symbols $v \otimes w$, but as you say:

Well, what are these bivectors?

To be maximally explicit here requires a precise construction of the tensor product. Here is the one I believe is standard: starting from two vector spaces $V, W$, we first construct the extremely large free vector space on the cartesian product $V \times W$; that is, we consider the space of formal linear combinations of pairs of a vector in $V$ and a vector in $W$, which we write as $v \otimes w$.

This "formal linear combination" business seems like what you're stuck on, so to be maximally explicit, the free vector space on a set $X$ over the field $\mathbb{R}$ can be constructed (although I personally prefer not to do this) as the vector space of functions $f : X \to \mathbb{R}$ which vanish except at finitely many points. We think of the values of such a function as the coefficients of a formal linear combination $\sum_{x \in X} f(x) x$, and the construction makes precise exactly what such a combination is, and in particular exactly what it means for two of them to be equal: it means that $f(x) = g(x)$ for all $x \in X$.
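To make the "finitely supported functions" description concrete, here is a minimal Python sketch (my own illustration, not part of any standard library). A formal linear combination is stored as a dictionary sending each symbol $x$ to its nonzero coefficient $f(x)$; the helper names `formal_sum`, `add`, and `scale` are hypothetical.

```python
from collections import defaultdict

def formal_sum(terms):
    """Build a formal linear combination from (coefficient, symbol) pairs.

    The result maps each symbol x to its coefficient f(x). Zero coefficients
    are dropped, so the dict stores exactly the finite support of f.
    """
    f = defaultdict(float)
    for coeff, symbol in terms:
        f[symbol] += coeff
    return {x: c for x, c in f.items() if c != 0}

def add(f, g):
    """Pointwise sum of two finitely supported functions."""
    return formal_sum([(c, x) for x, c in f.items()] + [(c, x) for x, c in g.items()])

def scale(a, f):
    """Pointwise scalar multiple of a finitely supported function."""
    return formal_sum([(a * c, x) for x, c in f.items()])

# The symbols here happen to be pairs of numbers, but the free vector space
# treats them as opaque labels: nothing is added componentwise.
u = formal_sum([(3, (1, 0)), (-5, (0, 1))])   # 3*(1,0) + (-5)*(0,1)
v = formal_sum([(1, (3, -5))])                # 1*(3,-5)
```

In this representation `u` and `v` are different elements, because equality is equality of coefficient functions. That matches the question's observation that $3(1,0) + (-5)(0,1) = (3,-5)$ only holds once the usual structure of $\mathbb{R}^2$ is brought in.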

So, the beginning of the construction of the tensor product is the free vector space on $V \times W$, meaning the vector space of functions (not linear functions, arbitrary functions) $f : V \times W \to \mathbb{R}$ which vanish except at finitely many points, thought of as formal linear combinations $\sum_{v, w} f(v, w) v \otimes w$. This is, as I've said, an enormous vector space; even if $V$ and $W$ are finite-dimensional it's (usually) uncountable-dimensional. Next we cut it down to make it much smaller by taking a quotient space. The point here is that we want this symbol $v \otimes w$ to be bilinear in $v$ and $w$ and right now it doesn't know anything about the vector space structures of $V$ and $W$ at all. So, formally, consider the subspace of the free vector space above spanned by vectors of the following forms:

$$(av_1 + bv_2) \otimes w - a (v_1 \otimes w) - b (v_2 \otimes w), \quad a, b \in \mathbb{R},\ v_1, v_2 \in V,\ w \in W$$
$$v \otimes (aw_1 + bw_2) - a (v \otimes w_1) - b (v \otimes w_2), \quad a, b \in \mathbb{R},\ v \in V,\ w_1, w_2 \in W$$

Now take the quotient by this subspace. If you're unfamiliar with quotients now is really the time to learn about them; typically the first place you'd see them is in an introductory course on abstract algebra, where you'd learn about quotient groups. The point of this quotient construction is precisely to force the symbol $v \otimes w$ to be bilinear; after quotienting, in the resulting quotient space it is now true that $(av_1 + bv_2) \otimes w = a v_1 \otimes w + b v_2 \otimes w$ and similarly for $w$, because we've forced it to be true by fiat.
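As a small worked instance (my own illustration, using only the two families of relations above), take $a = 2$, $b = 0$ and $v_1 = v$ in the first family, and $a = 2$, $b = 0$ and $w_1 = w$ in the second. In the free vector space the symbols $(2v) \otimes w$, $v \otimes w$ and $v \otimes (2w)$ are three unrelated basis elements (for nonzero $v, w$), but in the quotient we get

$$(2v) \otimes w = 2 (v \otimes w) = v \otimes (2w).$$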

This may seem extremely unwieldy and hard to reason about. When $V, W$ are finite-dimensional the tensor product $V \otimes W$ is also finite-dimensional, and yet for some reason we had to pass to this much bigger infinite-dimensional object to define it. This is just one of those things you get used to in mathematics; everything turns out to work out just fine, it's really not a problem, and the tensor product is pretty straightforward to understand in practice. Practically speaking you can think of it in the following easy way: if $\{ b_i \}, \{ c_j \}$ are bases of $V, W$, then the set of tensor products $\{ b_i \otimes c_j \}$ is a basis of $V \otimes W$. In particular, $\dim V \otimes W = \dim V \dim W$. However, the above definition has the significant benefit of not depending on a choice of basis.
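For example (a worked illustration, assuming $V = W = \mathbb{R}^2$ with standard basis $e_1, e_2$), bilinearity expands any pure tensor in the basis $\{ e_i \otimes e_j \}$:

$$(3e_1 - 5e_2) \otimes (e_1 + 2e_2) = 3\, e_1 \otimes e_1 + 6\, e_1 \otimes e_2 - 5\, e_2 \otimes e_1 - 10\, e_2 \otimes e_2.$$

Here the four tensors $e_i \otimes e_j$ form a basis, consistent with $\dim V \otimes W = 2 \cdot 2 = 4$.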

Having defined tensor products, and assuming that you're okay with direct sums, the tensor algebra $T(V)$ is the infinite direct sum $\bigoplus_{n \ge 0} V^{\otimes n}$ of the iterated tensor products of $V$, where $V^{\otimes 0}$ is $1$-dimensional. If $V$ has a quadratic form $Q$ attached to it, the Clifford algebra $\text{Cl}(V, Q)$ can be defined as a further quotient algebra of the tensor algebra by some relations involving $Q$.
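For concreteness, here is one common way to write those relations. The text above does not spell them out, so treat the sign convention as an assumption (some authors use $-Q(v)$ instead). One takes the quotient of $T(V)$ by the two-sided ideal generated by the elements

$$v \otimes v - Q(v) \cdot 1, \quad v \in V.$$

For $V = \mathbb{R}^2$ with $Q(v) = \|v\|^2$ and standard basis $e_1, e_2$, writing the product in the quotient by juxtaposition, this gives $e_1^2 = e_2^2 = 1$, and expanding $(e_1 + e_2)^2 = Q(e_1 + e_2) = 2$ yields

$$e_1 e_2 + e_2 e_1 = 0, \quad \text{i.e.} \quad e_1 e_2 = -e_2 e_1.$$

Under this convention the algebra has basis $1, e_1, e_2, e_1 e_2$, and $e_1 e_2$ plays the role of the bivector asked about in the question.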

I don't know the context in which you're looking at this stuff; maybe you're a physicist. Probably physicists are mostly going to elide the details here and work exclusively with formal sums. That's probably for the best because getting into the details here can be pretty tedious. But it's worth knowing that it can be done and explicit constructions are possible, although mostly irrelevant in practice.

The answer we get to the question "well, what is a tensor product $v \otimes w$?" based on the above construction is: $v \otimes w$ is the equivalence class of functions $f : V \times W \to \mathbb{R}$ containing the function which is equal to $1$ at $(v, w)$ and $0$ otherwise, up to the equivalence relation where $f \sim g$ if $f - g$ is a finite sum of functions corresponding to the bilinearity relations above. (It goes like this for the Clifford algebra too but we have to add more relations.) But this is wildly beside the point. In practice we don't think about tensor products this way any more than we think about real numbers as being Dedekind cuts. The above construction is just a construction, and the real point of the tensor product is that it's universally bilinear. In practice we really do just think of it as a formal symbol we can manipulate according to specific rules and that works fine.

One practical point about working with quotient spaces is that it's relatively straightforward to check that two elements of the tensor product are equal: it suffices to show that you can use algebraic manipulations (involving bilinearity) to turn one into the other. What is not at all clear is how to check that two elements of the tensor product are not equal (and similarly for the Clifford algebra; what if every element were equal to every other element?). This requires proving some actual stuff but it can be done. — Qiaochu Yuan, Commented Jun 1 at 20:44
Thank you for the very thorough answer. Can I shoot you a message in a chat room to follow up on some of these points? — jvf, Commented Jun 2 at 2:07
@jvf: I'm happy to discuss further here in the comments or in another question. — Qiaochu Yuan, Commented Jun 2 at 5:23

Sambo · Accepted Answer · 2024-06-04 14:44:43Z

As you mentioned towards the end of your question, and especially in the (now closed) previous version of your question: the word "generating" is also used in the context of a larger structure. For example, given a group $G$, we can talk about the subgroup $\langle x \rangle$ of $G$ generated by an element $x$; or, we can say that the Borel $\sigma$-algebra on $\Bbb{R}$ is the $\sigma$-algebra generated by open sets (this is happening within the bigger $\sigma$-algebra $\mathcal{P}(\Bbb{R})$). All this is different from what you're talking about, which is generating "externally", i.e. without reference to a larger structure. (The two concepts are often in some way equivalent, however. This is similar to how we can talk about the internal direct sum of vector subspaces, or the external direct sum of vector spaces.) I just wanted to make that clear before proceeding.

What I'd like to draw your attention to is this: when we talk about "generating", or perhaps "freely generating", we're often implicitly referring to a universal property, which can be understood with category theory.

A motivating example

To justify this claim, let me start by giving an example. A somewhat common mathematical object is the "free group on two elements", $F_2$. Intuitively, we form this group by taking two "elements", say $x$ and $y$ (never mind what they "really are"), and trying to make a group with them. What can you do in a group? You have an identity, and you can multiply, and you can take inverses, and the inverses cancel out. That means that $F_2$ should have elements like $xyxy$ and $yyy$ and $yx^{-1}y^{-1}x$, and you should have $x^{-1}x=1$, where $1$ is the identity element of the group. It should also not be true that, for instance, $xy=yx$, because this is an "additional restriction" that would go against our idea that $F_2$ is freely generated by $x$ and $y$.

Now, there are two ways of making sense of $F_2$. The first is to construct it explicitly: this group is the set of strings made up of the "letters" $x$, $y$, $x^{-1}$, $y^{-1}$, where $1$ denotes the empty string, modulo an equivalence relation where "inverses cancel". That is, a string of the form $sxx^{-1}t$ or $sx^{-1}xt$ is equivalent to $st$, and the same for $y$. The group operation is concatenating strings (which we can check is well-defined on the equivalence classes), and we've constructed a bona fide group, which we can call $F_2$.
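The string construction can be sketched in a few lines of Python. This is only an illustration under some assumptions of mine: the inverses $x^{-1}, y^{-1}$ are written as the capital letters `X`, `Y`, and each equivalence class is represented by its fully cancelled ("reduced") string. That every class contains exactly one reduced string is a fact that needs proof and is taken for granted here.

```python
# Letters "x", "y" and their formal inverses, written "X" and "Y"
# (a naming convention chosen for this sketch).
INVERSE = {"x": "X", "X": "x", "y": "Y", "Y": "y"}

def reduce_word(word):
    """Cancel adjacent inverse pairs until none remain, using a stack."""
    stack = []
    for letter in word:
        if stack and stack[-1] == INVERSE[letter]:
            stack.pop()          # the new letter cancels the previous one
        else:
            stack.append(letter)
    return "".join(stack)

def multiply(u, v):
    """Group operation: concatenate, then cancel."""
    return reduce_word(u + v)

def inverse(u):
    """Inverse of a word: reverse it and invert every letter."""
    return "".join(INVERSE[letter] for letter in reversed(u))
```

With these definitions, `multiply("x", "y")` and `multiply("y", "x")` are the distinct reduced strings `"xy"` and `"yx"`, reflecting that no relation such as $xy = yx$ has been imposed.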

The second way is to describe how "a group freely generated by two elements" should behave; this is called its universal property. It says that "a group freely generated by two elements" consists of a group $F$, along with two elements $x,y \in F$, such that the following property holds.

For any group $G$, and for any two elements $g_x, g_y \in G$, there exists a unique group homomorphism $f : F \rightarrow G$ such that $f(x) = g_x$ and $f(y) = g_y$.

We can show that any two groups which satisfy this property must be isomorphic, so in that sense, this property "completely determines" a "group freely generated by two elements". From this perspective, this justifies us saying "the" group freely generated by two elements, because even if it's not a particular group (note that there is no reference to the group $F_2$ we constructed above), it's unique up to isomorphism.
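To see the universal property in action, here is a hedged Python sketch for one particular target group: permutations of $\{0, 1, 2\}$, stored as tuples. Words in $F_2$ are written as strings over `x`, `y`, `X`, `Y`, with capitals standing for inverses; this encoding and the helper names (`compose`, `invert`, `induced_hom`) are assumptions of the sketch. The map $f$ is forced by the choice of $g_x$ and $g_y$: it just evaluates a word letter by letter.

```python
def compose(p, q):
    """Composition of permutations stored as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def invert(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def induced_hom(g_x, g_y):
    """Return the map f on words determined by f(x) = g_x and f(y) = g_y."""
    images = {"x": g_x, "y": g_y, "X": invert(g_x), "Y": invert(g_y)}
    identity = tuple(range(len(g_x)))

    def f(word):
        result = identity
        for letter in word:
            result = compose(result, images[letter])
        return result

    return f

g_x = (1, 0, 2)  # swaps 0 and 1
g_y = (0, 2, 1)  # swaps 1 and 2
f = induced_hom(g_x, g_y)
```

Because a cancelling pair such as `xX` is sent to the identity permutation, `f` gives the same value on equivalent strings, and `f(u + v)` equals `compose(f(u), f(v))` by construction. Note that `f("xx")` is the identity here even though $xx \neq 1$ in $F_2$: relations may hold in the target group $G$ that do not hold in the free group.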

Now, the free group $F_2$ we constructed above does satisfy this universal property, and we can use this universal property to prove other things about $F_2$. However, the fact that the universal property determines the group up to isomorphism means that this property is all we need to prove anything about $F_2$. You can think about the construction if you want, but when sitting down to do any proofs, all you will ever need is the universal property. You can forget about the explicit construction completely.

Now, to answer one of your questions. You ask

[...] isn't this potentially problematic, as not giving an explicit construction may raise the question of whether anything that fits this "skeleton" even exists at all [...]?

Yes, you are correct: just because we can formulate a universal property doesn't mean that something exists that satisfies it. However, once we have constructed an example of it (like we did with $F_2$), then we can forget about the construction and just use the universal property.

More broadly

I don't know anything about Clifford algebras, but I see on the Wikipedia page a subsection titled universal property and construction. The same rules apply here: there's a universal property, and once we've been able to construct a particular example that satisfies it, we can forget about the construction and just focus on using the universal property. The same is true for the tensor product of vector spaces.

This concept comes up a lot in other places too. We can think of the completion of a metric space $X$ as the complete metric space generated by $X$. We can talk about the category freely generated by a graph. My current research involves considering the free arithmetic universe generated from the empty set. I really believe that using universal properties is the best way to think about it all.

If you're interested in learning more about universal properties, and how they're related to other category theory concepts like limits and adjoint functors, I highly recommend the book Basic Category Theory by Tom Leinster. The book is very accessible, even if you haven't done any category theory before, and he has made it freely accessible on arXiv.
