$\begingroup$

[I did notice similar questions were asked here before, but I couldn't find a satisfactory answer for me to grasp as a beginner, so I chose to post this question]

I'm just starting to teach myself linear algebra with Linear Algebra and Group Theory. The book starts with the concept of the determinant, using the definition of even & odd permutations and the example of a {3 by 3 array}, and gives the following equation:

$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1k} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2k} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nk} & \cdots & a_{nn}\end{vmatrix}=\sum_{(p_1, p_2, \ldots, p_n)}(-1)^{[p_1, p_2, \ldots, p_n]}a_{1p_1}a_{2p_2}\cdots a_{np_n}$$

where $[p_1, p_2, ..., p_n]$ is the number of inversions of the permutation $p_1, p_2, ..., p_n$, i.e. the number of pairs $i<j$ with $p_i>p_j$.
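To make the formula concrete, here is a minimal Python sketch that evaluates it literally: it loops over every permutation, counts inversions, and sums the signed products. The function names `inversions` and `det_by_permutations` and the sample matrices are illustrative choices, not taken from the book, and indices start at 0 rather than 1.

```python
from itertools import permutations

def inversions(p):
    """Count pairs (i, j) with i < j but p[i] > p[j]."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])

def det_by_permutations(a):
    """Determinant of a square matrix (list of rows) via the permutation sum."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = (-1) ** inversions(p)      # sign from the inversion count
        for row in range(n):
            term *= a[row][p[row]]        # one entry from each row and column
        total += term
    return total

# Hypothetical sample matrices
a2 = [[1, 2], [3, 4]]
a3 = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print(det_by_permutations(a2), det_by_permutations(a3))
```

The sum has $n!$ terms, so this is only useful for checking small cases by hand, not for real computation.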

So because there is no justification given in the book for the above equation for an {$n$ by $n$ array}, and the concept of the determinant feels a bit odd in the first place, I tried to investigate it myself:

[Step 1]: I started with a {2 by 2 array} first, by taking a system of two equations in two unknowns:

$$(eq1):\ a_{11}x_1+a_{12}x_2=b_1\\(eq2):\ a_{21}x_1+a_{22}x_2=b_2$$ $$A=\begin{Vmatrix} a_{11} & a_{12}\\a_{21} & a_{22}\end{Vmatrix}$$ with the determinant $\det A$

[Step 2]: I then tried to eliminate {$x_2$ in $eq1$} and {$x_1$ in $eq2$} by doing the following:

$$(eq1 \cdot a_{22})-(eq2 \cdot a_{12}):\ (a_{11}a_{22}-a_{12}a_{21})x_1=a_{22}b_1-a_{12}b_2\\(eq2 \cdot a_{11})-(eq1 \cdot a_{21}):\ (a_{11}a_{22}-a_{12}a_{21})x_2=a_{11}b_2-a_{21}b_1$$
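The two eliminations can be checked numerically. The sketch below carries the right-hand sides $b_1, b_2$ through the same two combinations and confirms that the resulting $x_1, x_2$ satisfy both original equations. The function name `solve_2x2` and the sample numbers are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve the 2x2 system using the two eliminations from Step 2."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("determinant is zero; no unique solution")
    # (eq1 * a22) - (eq2 * a12): the x2 terms cancel, leaving det * x1
    x1 = Fraction(a22 * b1 - a12 * b2, det)
    # (eq2 * a11) - (eq1 * a21): the x1 terms cancel, leaving det * x2
    x2 = Fraction(a11 * b2 - a21 * b1, det)
    return det, x1, x2

# Hypothetical sample system: 2*x1 + 1*x2 = 4, 5*x1 + 3*x2 = 11
det, x1, x2 = solve_2x2(2, 1, 5, 3, 4, 11)
assert 2 * x1 + 1 * x2 == 4
assert 5 * x1 + 3 * x2 == 11
print(det, x1, x2)
```

In both combinations the same quantity `det` ends up as the coefficient of the surviving unknown, which is the observation made in Step 3.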

[Step 3]: I noticed that the two coefficients above both give me the determinant of the array, so I then postulated the following statement: 

the determinant is {the coefficient of unknown $x_k$ in $k$th row} after eliminating other unknowns from the $k$th row by {multiplication} and {subtracting other rows} in the array.

that is to say, if I have an $n$th-order array $$N=\begin{Vmatrix} a_{11} & a_{12} & \cdots & a_{1k} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2k} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nk} & \cdots & a_{nn}\end{Vmatrix}$$ I can eventually transform $N$ into $$\begin{Vmatrix} \det N & 0 & \cdots & 0 & \cdots & 0\\0 & \det N & \cdots & 0 & \cdots & 0\\ \vdots & \vdots & & \vdots & & \vdots\\0 & 0 & \cdots & 0 & \cdots & \det N\end{Vmatrix}$$

[Step 4]: I tested my statement with a {3 by 3 array} and it seems to work. The idea of odd & even permutations also seems more intuitive, as it has to do with the order of subtraction, which depends on the row of the unknown.

So here are my questions:

  1. if my guess is right, how do I construct the permutation formula at the beginning of the question for an {$n$ by $n$ array} without first defining the determinant by a set of formal operations?
  2. I've seen multiple answers talking about the geometric intuition of the determinant (and I roughly get the idea). How does the permutation intuition connect to, or translate into, the geometric intuition?

[Note: I have never studied abstract algebra, so answers without using notations in abstract algebra will be much appreciated :)]

-----------------------------------------------------------------------
EDIT: I think I figured out my question 2 (the geometric intuition).... correct me if I am wrong


So again using a {2 by 2 array} as an example:

[Step 1]: Assume again I have the following equations and array

$$(eq1):\ a_{11}x_1+a_{12}x_2=b_1\\(eq2):\ a_{21}x_1+a_{22}x_2=b_2$$ $$A=\begin{Vmatrix} a_{11} & a_{12}\\a_{21} & a_{22}\end{Vmatrix}$$ with the determinant $\det A$

[Step 2]: I can immediately transform the array into $$\begin{Vmatrix} \det A & 0\\0 & \det A\end{Vmatrix}$$

[Step 3]: Because the above array holds the coefficients of $x_1$ and $x_2$, I can write the unknowns down as a vector $$\begin{bmatrix} x_1\\x_2\end{bmatrix}$$ which makes the {$\det A$ array} a linear transformation when multiplied with this vector

$\endgroup$
  • 3
    $\begingroup$ The determinant of a matrix is the unique number satisfying the properties: (1) the determinant of the identity matrix is $1$, (2) adding a multiple of a row to another row doesn't change the determinant, (3) scaling a row scales the determinant. Geometrically, if the determinant is the volume of a parallelepiped spanned by the rows of the matrix, (1) is that the standard unit (hyper)cube has volume $1$, (2) is that shear transformations don't change volume, and (3) is that stretching scales the volume. It turns out the determinant is multilinear, which gives the permutation-based formula. $\endgroup$ Commented May 20, 2021 at 19:31
  • 2
    $\begingroup$ The key property that follows from (2)&(3) is that the determinant of a matrix where one row is a sum of two vectors is the sum of the determinants of the two matrices where that row is replaced by one of the two vectors in the sum. This property lets you write a determinant as a sum of scalar multiples of determinants of permutation matrices. Swapping rows of a determinant multiplies the determinant by $-1$, so you can work out what these are (and it coincides with the sign of the permutation). In case it's useful, some notes I wrote: math.berkeley.edu/~kmill/math54fa16/det.pdf $\endgroup$ Commented May 20, 2021 at 19:35
  • 1
    $\begingroup$ Thanks for the comments, but what you said seems to predefine what a determinant is and then derive all the properties from it. Is there any way to derive the formula without giving its definition in the first place? @Kyle Miller $\endgroup$
    – P'bD_KU7B2
    Commented May 20, 2021 at 19:46
  • 3
    $\begingroup$ You need to start with some kind of definition, otherwise there's nothing to refer to. Some options include (a) define it by the permutation formula (which lacks any intuition whatsoever but which is obviously well-defined), (b) define it geometrically as the volume of a parallelepiped (which is essentially the row operation approach), (c) define it in terms of alternating multilinear forms (closely related to (b)). The fancy version of (c) is to choose a basis vector for the n-fold exterior power of an n-dimensional vector space, which is 1-dimensional, which requires some abstract algebra! $\endgroup$ Commented May 20, 2021 at 19:56
  • 2
    $\begingroup$ I know there are some (mostly older) linear algebra textbooks that start with determinants, and yours appears to be one. But that's a torturous way to learn linear algebra, in my opinion. $\endgroup$
    – user169852
    Commented May 20, 2021 at 19:58

1 Answer

$\begingroup$

You seem to have rediscovered the adjugate matrix. I suppose you could try to use it to define the determinant, but I'd be hesitant that it would be well-defined, i.e. independent of whatever choices you've made while row-reducing. It is basically another way to think of Cramer's rule.
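For readers who want to see the connection concretely, the sketch below checks on one 3x3 example that multiplying a matrix by its adjugate gives $\det A$ times the identity, which is the diagonal array the question arrives at. The helper names (`minor`, `det`, `adjugate`, `matmul`) and the sample matrix are illustrative assumptions. The determinant here is computed by cofactor expansion along the first row, which is not necessarily how the book defines it.

```python
def minor(a, i, j):
    """Matrix a with row i and column j removed."""
    n = len(a)
    return [[a[r][c] for c in range(n) if c != j] for r in range(n) if r != i]

def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 0:
        return 1  # convention: the empty determinant is 1
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def adjugate(a):
    """Transpose of the cofactor matrix: entry (i, j) is the cofactor of (j, i)."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical sample matrix
a = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
d = det(a)
product = matmul(adjugate(a), a)
expected = [[d if i == j else 0 for j in range(3)] for i in range(3)]
assert product == expected
print(d, product)
```

Each row of the adjugate records one particular recipe of multiplying rows and subtracting them that isolates a single unknown, which is why the same number appears in every diagonal slot.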

The indisputably* conceptually correct way to introduce determinants is through exterior algebra using the induced map on the highest exterior power. This is much too technical for virtually anyone who's just learning it, unfortunately. So different authors will pick random bits and pieces of the true picture that they think are sufficiently palatable to their audience.

But I can easily give you a flavor for what's going on and why inversions show up naturally, if you're willing to take a bit on faith.

Suppose $\vec{u}, \vec{v}$ are 2D vectors. Let $f(\vec{u}, \vec{v})$ be the area of the parallelogram they determine. Imagine replacing $\vec{u}$ with $t\vec{u}$ for a scalar $t$ which varies from $1$ to $-1$. We have $f(t\vec{u}, \vec{v}) = |t|f(\vec{u}, \vec{v})$. That absolute value sign is a bit strange, though--it prevents the function from being smooth! It feels like maybe when $t$ passes through zero, we should just use a "negative area". This ends up being the correct choice. In 3D, the notion of "orientation" ends up being extremely natural if you do anything with, say, computer graphics. So we effectively just get rid of the absolute value sign and introduce a signed area. More generally, we'd be interested in the signed hypervolume of the $n$-dimensional parallelogram determined by $n$ vectors in $n$-dimensional space.

In 2D, if you play around with it, you'll find any reasonable signed area function $A(\vec{u}, \vec{v})$ must satisfy at least three properties:

  1. Scaling: $A(c\vec{u}, \vec{v}) = cA(\vec{u}, \vec{v})$
  2. Linearity: $A(\vec{u}_1 + \vec{u}_2, \vec{v}) = A(\vec{u}_1, \vec{v}) + A(\vec{u}_2, \vec{v})$
  3. Alternating: $A(\vec{u}, \vec{v}) = -A(\vec{v}, \vec{u})$

Note that (3) implies $A(\vec{u}, \vec{u}) = 0$ (take $\vec{v} = \vec{u}$ to get $A(\vec{u}, \vec{u}) = -A(\vec{u}, \vec{u})$), which is obvious from the area interpretation (whew!).

Ok, what if we had the coordinates of $\vec{u}$ and $\vec{v}$ in terms of the standard basis vectors--what would $A$ be in those coordinates? That is, suppose $\vec{u} = a_{11} \vec{e}_1 + a_{21} \vec{e}_2$, $\vec{v} = a_{12} \vec{e}_1 + a_{22} \vec{e}_2$. Liberally using properties (1)-(3), together with the normalization $A(\vec{e}_1, \vec{e}_2) = 1$ (the unit square has area $1$), we compute:

\begin{align*} A(\vec{u}, \vec{v}) &= A(a_{11} \vec{e}_1 + a_{21} \vec{e}_2, a_{12} \vec{e}_1 + a_{22} \vec{e}_2) \\ &= a_{11} a_{12} A(\vec{e}_1, \vec{e}_1) + a_{11} a_{22} A(\vec{e}_1, \vec{e}_2) + a_{21} a_{12} A(\vec{e}_2, \vec{e}_1) + a_{21} a_{22} A(\vec{e}_2, \vec{e}_2) \\ &= (a_{11} a_{22} - a_{12} a_{21}) A(\vec{e}_1, \vec{e}_2) \\ &= (a_{11} a_{22} - a_{12} a_{21}). \end{align*}

This is exactly the determinant of the 2x2 matrix listing the coordinates of $\vec{u}$ and $\vec{v}$ in its columns.
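As a sanity check, the short Python sketch below confirms on a few sample vectors that the formula $a_{11} a_{22} - a_{12} a_{21}$ really does satisfy properties (1)-(3) and assigns area $1$ to the unit square. The function name `signed_area` and the sample values are hypothetical; a handful of integer examples is of course a spot check, not a proof.

```python
def signed_area(u, v):
    """Signed area of the parallelogram spanned by 2D vectors u and v."""
    return u[0] * v[1] - u[1] * v[0]

# Hypothetical sample vectors and scalar
u, u2, v, c = (3, 1), (-2, 5), (1, 4), 7

scaled = tuple(c * x for x in u)             # c * u
summed = tuple(x + y for x, y in zip(u, u2))  # u + u2

# (1) Scaling
assert signed_area(scaled, v) == c * signed_area(u, v)
# (2) Linearity in the first argument
assert signed_area(summed, v) == signed_area(u, v) + signed_area(u2, v)
# (3) Alternating, and its consequence A(u, u) = 0
assert signed_area(u, v) == -signed_area(v, u)
assert signed_area(u, u) == 0
# Normalization: the unit square has area 1
assert signed_area((1, 0), (0, 1)) == 1

print(signed_area(u, v))
```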

You can play the same game with $n \times n$ matrices. You'll see quickly that the resulting expression will be a sum over permutations, and the only question will be what sign to use. Each swap of two arguments flips the sign, and the inversion number is simply the number of adjacent swaps needed to straighten out the relevant term, so it's got the right parity!
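The claim about inversions and swaps can be checked by brute force. The sketch below sorts each permutation of four elements by adjacent swaps (a plain bubble sort) and confirms that the number of swaps equals the inversion count. The function names are illustrative, and the check covers only $n = 4$.

```python
from itertools import permutations

def inversions(p):
    """Count pairs (i, j) with i < j but p[i] > p[j]."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])

def adjacent_swaps_to_sort(p):
    """Sort a copy of p by bubble sort and return the number of adjacent swaps."""
    p = list(p)
    swaps = 0
    for end in range(len(p) - 1, 0, -1):
        for i in range(end):
            if p[i] > p[i + 1]:
                p[i], p[i + 1] = p[i + 1], p[i]  # removes exactly one inversion
                swaps += 1
    return swaps

for p in permutations(range(4)):
    assert inversions(p) == adjacent_swaps_to_sort(p)

print("checked", sum(1 for _ in permutations(range(4))), "permutations")
```

Since each adjacent swap of an out-of-order pair removes exactly one inversion, the sign $(-1)^{\text{inversions}}$ is the sign picked up by straightening the term with property (3).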

Ok, but existence of a function $A$ satisfying (1)-(3) isn't necessarily clear. To prove it rigorously, you reverse the whole thing, first defining inversion numbers and studying their basic properties, then using the Leibniz (permutation-sum) formula to define the determinant, then you show it actually satisfies properties (1)-(3). Or you could do a higher-tech version of the same thing by introducing the exterior algebra. But at some point you're going to have to show that the $n$th exterior power of an $n$-dimensional vector space is $1$-dimensional (and not $0$-dimensional), which will require some sort of construction like this no matter what.

*(Hah!)

$\endgroup$
  • $\begingroup$ Thank you for the detailed answer. This is very helpful for a beginner like me to get a little bit of an idea of what I am actually doing when reading the book :) $\endgroup$
    – P'bD_KU7B2
    Commented May 20, 2021 at 21:26
  • 1
    $\begingroup$ I agree with this. The determinant is best introduced as a way to measure how much a linear map “distorts” the area of a square or volume of a cube. After that it’s easy to understand its role in solving a linear system of equations. $\endgroup$
    – Deane
    Commented May 20, 2021 at 22:36
