26
$\begingroup$

I was looking at the definition of the imaginary unit $i$. We define it by $i^2=-1$. But what does "square" even mean in such a context?

It is easily proven that no real solutions exist to this equation. Then we say we define $i$ as a new type of number such that the equation is true. Then $i$ is not a real number.

But we use multiplication in that equation, and as far as I know multiplication is only defined between real numbers at this point, and $i$ is clearly not a real number. So how can we apply an operator exclusive to real numbers on a number that is not real?

Edits - For clarity

$\endgroup$
15
  • 5
    $\begingroup$ It's not the operations of real numbers that are applied to complex numbers. It's the other way around: the operations of complex numbers are applied to real numbers. The definition you refer to is just an informal definition. It works because complex operations have the same basic properties as real operations (field properties) $\endgroup$
    – Pedro
    Commented Jun 17 at 0:43
  • 6
    $\begingroup$ Even though we want to "define" $i$ by the equation $i^2=-1$, this does not make sense from a formal perspective – your question highlights just one way in which the common introduction to the complex numbers is unrigorous. If you want to be precise, then there are a number of ways to do it: the most elementary is to define a complex number to be a pair of real numbers. The idea behind this definition is that the pair $(a,b)$ represents the complex number $a+bi$ (but remember that complex numbers haven't been precisely defined yet). $\endgroup$
    – Joe
    Commented Jun 17 at 0:46
  • 12
    $\begingroup$ We define addition of complex numbers as the function $+_{\mathbb C}:\mathbb C^2\to\mathbb C$ given by $(a,b)+_{\mathbb C}(c,d)=(a+c,b+d)$, and multiplication as the function $\cdot_{\mathbb C}:\mathbb C^2\to\mathbb C$ given by $(a,b)\cdot_{\mathbb C}(c,d)=(ac-bd,ad+bc)$. Finally, we define the complex number $i$ to be the pair $(0,1)$. You can easily verify from the definition that $i\cdot_{\mathbb C}i=(-1,0)$. This is the precise sense in which $i^2=-1$. $\endgroup$
    – Joe
    Commented Jun 17 at 0:46
  • 9
    $\begingroup$ You might notice that according to the above definition, real numbers and complex numbers are actually completely different things: in particular, it is not true that a real number is a complex number. While this is technically true, the pair $(a,0)$ behaves in exactly the same way as the real number $a$, so it is common to "identify" them. In this loose sense, we can think of every real number as a complex number. More formally, there is a canonical embedding of $\mathbb R$ into $\mathbb C$. $\endgroup$
    – Joe
    Commented Jun 17 at 0:49
  • 4
    $\begingroup$ Related: What is the precise definition of $i$?, and A proper definition of $i$, the imaginary unit $\endgroup$
    – Pedro
    Commented Jun 17 at 0:54

8 Answers

42
$\begingroup$

One way to think of the complex numbers is as an extension of the real numbers, in the sense that we begin with all the typical real numbers (together with their typical operations), and add in an extra symbol "$i$". In this context, an expression like $3 + 2i - i^2$ would be just as meaningful as an expression like $3 + 2x - x^2$, where $x$ is some variable whose value we don't assume (an "indeterminate"). So we are effectively studying a collection of real polynomials with variable $i$.

However, what distinguishes the symbol $i$ from the symbol $x$ is that we decide (arbitrarily) that $i^2 = -1$ (whereas we assume truly nothing about $x$). There is no real number which satisfies this, thus $i$ cannot be a real number, but if we agree to treat the symbol $i$ as a "formal symbol" - a term whose value is not specified or even defined; rather, it is defined in terms of what relations it satisfies - then consequently an expression like $3 + 2i - i^2$ would be identified as the same as $3 + 2i - (-1) = 4 + 2i$. And all the typical rules and behaviors of complex numbers follow.

The "proper"/formal way to do this is to begin with the space of all polynomials, which you might denote $\mathbb{R}[x]$ (it contains both $3 + 2x - x^2$ and $4 + 2x$, which are considered different polynomials), and then declare that the polynomial $x^2 + 1$ is "algebraically the same as" the polynomial $0$. This is accomplished by grouping polynomials into classes, where two polynomials (like $3+2x-x^2$ and $4+2x$) are considered related if their difference is a polynomial multiple of $x^2 + 1$ (in this case, $(4+2x) - (3+2x-x^2) = x^2+1$, which is declared to be "$=0$"). Under this grouping scheme, each individual group has a unique representative of the form $a+bx$ (to find it, start with any polynomial in a given group, divide by $x^2+1$, and take the polynomial remainder).

Finally, we find that there is a well-defined algebraic structure upon the set of all groupings; it turns out that if we take any two polynomial classes with respective representative elements $a+bx$ and $c+dx$ and any two polynomials from these classes, their sum will always belong to the class with representative element $(a+c)+(b+d)x$ and their product will always belong to the class with representative element $(ac - bd) + (ad+bc)x$. So, we can forget all about the polynomials and their groupings and all of those formal details and just work with these representative elements, with their algebraic operations defined in this way, and all of the typical properties that we expect of real numbers should continue to hold in this extended system (except, of course, that $x^2 = -1$ has no solution). And for good measure, we can decide to write $i$ instead of $x$, because the symbol is meaningless anyway.
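As a rough illustration of the grouping described above, here is a minimal Python sketch. It assumes polynomials are stored as coefficient lists `[c0, c1, c2, ...]`, and the helper names `poly_mul` and `reduce_mod_x2_plus_1` are made up for this example. It only checks a few cases and is not a proof.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, c2, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def reduce_mod_x2_plus_1(p):
    """Return (a, b) such that a + b*x is the remainder of p after division by x^2 + 1.

    Works from the top degree down, replacing c*x^k by -c*x^(k-2).
    """
    p = list(p) + [0, 0]              # pad so that indices 0 and 1 always exist
    for k in range(len(p) - 1, 1, -1):
        p[k - 2] -= p[k]
        p[k] = 0
    return (p[0], p[1])

# 3 + 2x - x^2 and 4 + 2x land in the same class:
print(reduce_mod_x2_plus_1([3, 2, -1]))   # (4, 2)
print(reduce_mod_x2_plus_1([4, 2]))       # (4, 2)

# The product of representatives a + bx and c + dx reduces to (ac - bd) + (ad + bc)x:
a, b, c, d = 2, 3, 5, 7
print(reduce_mod_x2_plus_1(poly_mul([a, b], [c, d])))   # (-11, 29)
print((a * c - b * d, a * d + b * c))                   # (-11, 29)
```

Picking a different polynomial from the same class (say, adding a multiple of $x^2+1$ to $2+3x$) gives the same reduced product, which is the "well-defined" part of the claim.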

This is a famous algebraic construction of $\mathbb{C}$ which could equivalently be denoted as $\mathbb{R}[i]/\langle i^2+1\rangle$, meaning: we are considering a space of real polynomials (polynomials with real coefficients) with variable $i$, but we "mod out" or take modulos with respect to the polynomial $i^2+1$, which is algebraically identified with $0$.


Edit: to more directly answer your question, the reason why we are justified to consider an expression like $i^2$ within the above framework is because $i$ here doesn't represent a "new number" which we have invented out of thin air, not existing within $\mathbb{R}$. Rather, $i$ represents the variable of a polynomial with real coefficients (thus $i$ can be imagined as ostensibly real valued, though we don't pick a value for it and leave it as an indeterminate). Moreover, the statement that $i^2 = -1$ is not a strict equality, but it is actually an equivalence modulo the polynomial $i^2+1$ (in the same way that you might say $1$ and $6$ are equivalent modulo $5$, as they each leave a remainder of $1$ after dividing by $5$).

Therefore, at no point are we considering any operations on anything not real valued. The discrepancy, then, between $\mathbb{C}$ and $\mathbb{R}$ is this: every element of $\mathbb{R}$ can be identified with a unique element of $\mathbb{C} = \mathbb{R}[i]/ \langle i^2 + 1\rangle$ - namely, the real number $a$ is identified with the constant polynomial $a$. On the other hand, there is no element of $\mathbb{R}$ which may be algebraically identified with the polynomial $i$, as $i$ satisfies $i^2 + 1 = 0$ which no real number can do. In other words, $\mathbb{R}$ and $\mathbb{C} = \mathbb{R}[i]/\langle i^2+1\rangle$ are not isomorphic as algebraic structures; there are elements of $\mathbb{C}$ which exhibit algebraic properties that no element of $\mathbb{R}$ can satisfy.

$\endgroup$
11
  • 6
    $\begingroup$ While we cannot simply declare "There is a new number $i$ and it works like this", and expect everything to work out, the "famous algebraic construction of $\Bbb{C}$" is about as close as one can get to doing exactly that. +1 $\endgroup$ Commented Jun 17 at 4:16
  • 5
    $\begingroup$ This is how I have tried (once or twice) to teach complex numbers: leverage what they already know about polynomials, and then just add one tiny extra rule on top of that. It seems miles better to me than anything that results from the abomination that is $i = \sqrt{-1}$, and I think it makes it less mysterious. I haven't done it enough to truly gauge whether it's better for the students, but I feel less icky. $\endgroup$
    – Arthur
    Commented Jun 17 at 9:48
  • 5
    $\begingroup$ @TheoBendit Essentially we do define a new number $i$ and declare how it works and then check that everything works out, and indeed it does. $\endgroup$
    – quarague
    Commented Jun 17 at 9:49
  • 3
    $\begingroup$ @quarague Whether we "define" or "build" the new number $i$ comes down to whether we're looking at this in terms of syntax or semantics; I will note that without applying a lot of external reasoning it's a lot harder (for me at least) to convince myself that the syntactic approach is worthwhile/valid c.f. the semantic approach. Why should mere lack of contradiction convince us that "everything works out" in practice? I think explicitly constructive framings are the best, pedagogically. $\endgroup$ Commented Jun 17 at 15:42
  • 4
    $\begingroup$ @Joe But we're not "declaring a new number", we're stating the axioms of $\mathbb{C}$, which are the axioms of $\mathbb{R}$ plus a new constant symbol $i$ and extra axioms governing addition/multiplication with $i$. I don't see how that's different from stating the axioms of $\mathbb{N}$, which can be, but never is, introduced as something you construct from ZFC; we just take the axioms and work with them. $\endgroup$
    – Passer By
    Commented Jun 17 at 17:26
27
$\begingroup$

We do so by defining a more general concept of multiplication, which extends to complex numbers; we also want to define what a complex number even is, using concepts that we already know.

And then we embed the reals into our new construct.

Given that we have a construction of Real Numbers, we can construct the Complex Numbers in the following way:

Complex Numbers are defined as pairs of reals, $(x,y)$,

where $(a,b)\times (c,d)$ is defined to be the pair $(ac - bd,ad + cb)$

and $(a,b) + (c,d) = (a+c, b + d). $

So, $(m,0) \times (n,0) = (mn,0)$ and $(m,0) + (n,0) = (m+n,0)$ - with this observation we embed the reals into this construction by associating the complex number $(m,0)$ with the real number $m.$

Furthermore, notice that $(0,1) \times (0,1) = (-1,0) = -1 .$

We denote $(0,1)$ with the letter $i$;

thus, $i \times i = -1$.

Remark: $(0,1)\times(m,n) = (-n,m) $.

Also, notice that $(a,b) = (a,0) + (0,b) $

$= (a,0) + (0,1)\times(b,0) $

$= a + ib $

$= a + bi .$

So we can write any complex number as $a + bi .$
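The construction above can be tried out directly. This is a small Python sketch using plain tuples as the pairs; the names `add`, `mul`, and `embed` are just illustrative choices, not standard functions.

```python
def add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)"""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    """(a, b) x (c, d) = (ac - bd, ad + cb)"""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + c * b)

def embed(m):
    """Associate the real number m with the pair (m, 0)."""
    return (m, 0)

i = (0, 1)

print(mul(i, i))                         # (-1, 0), the pair identified with -1
print(mul(embed(3), embed(4)))           # (12, 0), matches real multiplication
print(add(embed(3), mul(i, embed(4))))   # (3, 4), i.e. 3 + 4i
```

Nothing here multiplies anything other than real numbers: `mul` only ever uses ordinary real multiplication, addition, and subtraction on the coordinates.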

This gives a very tangible geometric interpretation to complex numbers. They are just coordinates on the plane of Real Numbers.

The X-Axis Constitutes the Real Number Line.

Multiplying by $i$ is seen as rotating by $90$ degrees.

So, you can ask, "What is this strange object that, multiplied by itself, produces a negative number?"

The answer: an ordered pair of reals, with multiplication defined as above, where multiplying by $i$ is just a rotation along a circle.

I believe this is a useful conceptualization, because

  1. It gives insight into why complex numbers could be useful to model the real world.

  2. It constructs objects out of ones we are familiar with, with properties that we want, which let us solve equations like $x^2 + 1 = 0$.

  3. It gives a construction of an object which otherwise may seem mysterious: how could anything have the property $i \times i = -1$?

So, we build an object that satisfies the property, rather than just saying that an object has that property.

$\endgroup$
3
  • 6
    $\begingroup$ This is a nice answer. It will be even nicer when you edit it to fix the spelling errors and write the mathematics with mathjax: math.meta.stackexchange.com/questions/5020/… $\endgroup$ Commented Jun 17 at 1:04
  • 3
    $\begingroup$ Thank you. It was typed on my phone. I will work to edit it, when I have access to my laptop. $\endgroup$ Commented Jun 17 at 1:07
  • 4
    $\begingroup$ @MichaelCarey I have edited the formatting for you. Please check it. Great answer as well. $+1$ $\endgroup$
    – GSmith
    Commented Jun 17 at 10:51
13
$\begingroup$

Multiplication of complex numbers is defined by

$$(a+bi)(c+di):=(ac-bd)+(ad+bc)i$$

With this definition you can see that if $a=c=0$ and $b=d=1$ then the above formula gives

$$i^2=(0+1i)(0+1i)=-1$$

The key here is to identify the complex number $a+bi$ with the point $(a,b)$ in the $xy$-plane; then the definition of multiplication given above becomes

$$(a,b)(c,d):=(ac-bd, ad+bc)$$

and addition is defined pointwise, i.e.

$$(a,b)+(c,d)=(a+c, b+d)$$

Later on you may learn that multiplication by $i$ corresponds to rotation of points in the plane by 90 degrees in the counterclockwise direction, and so $i^2=-1$ is simply the statement that the point $(0,1)$ rotated 90 degrees counterclockwise maps to the point $(-1,0)$.
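To see the rotation claim concretely, here is a short Python sketch. The function `rotate90` is a hypothetical helper that rotates a point by 90 degrees counterclockwise about the origin, i.e. $(x,y)\mapsto(-y,x)$, and `mul` is the multiplication rule from the formula above.

```python
def mul(z, w):
    """(a, b)(c, d) := (ac - bd, ad + bc)"""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def rotate90(p):
    """Rotate the point p = (x, y) by 90 degrees counterclockwise about the origin."""
    x, y = p
    return (-y, x)

i = (0, 1)

p = (3, 4)
print(mul(i, p), rotate90(p))   # (-4, 3) (-4, 3)
print(mul(i, i))                # (-1, 0): the point (0, 1) rotated by 90 degrees

# Four quarter turns bring every point back to where it started.
q = p
for _ in range(4):
    q = mul(i, q)
print(q)                        # (3, 4)
```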

$\endgroup$
7
  • 2
    $\begingroup$ Sorry, but this is not what I intended to ask. Before even defining imaginary numbers as $a+bi$, we define the imaginary unit by $i^2=-1$. But at this point we only have multiplication for real numbers, so how can we apply it to an $i$ which is not a real number? $\endgroup$ Commented Jun 17 at 0:48
  • 21
    $\begingroup$ @KrypticCoconut Thing is, we don't define it like that. This is indeed how the topic is usually explained in high school, but that's not a rigorous definition. Formally, $\mathbb{C}$ is usually defined as the set $\mathbb{R}^2$, and we define two operations $(a,b)+(c,d)=(a+c,b+d)$ and $(a,b)(c,d)=(ac-bd, ad+bc)$. Then we denote $(a,b)=a+ib$; this is just a notation. In particular, $i=(0,1)$ and we identify a real number $a\in\mathbb{R}$ with the pair $(a,0)$. So now $i^2=-1$ just follows by computation: you can check that $(0,1)^2=(-1,0)$. This is not a definition. $\endgroup$
    – Mark
    Commented Jun 17 at 0:55
  • 1
    $\begingroup$ @KrypticCoconut The way to apply multiplication to $i$ is exactly as described in my and others' answers. Try not to rely so much on the "definition" of $i$ as the square root of negative one. Instead, think of complex numbers as ordered pairs of real numbers, that is, points in the $xy$-plane. The $y$-axis is identified as the imaginary axis. Addition is defined pointwise and multiplication is defined as in my answer. $\endgroup$
    – bill
    Commented Jun 17 at 1:06
  • 2
    $\begingroup$ Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. $\endgroup$
    – Community Bot
    Commented Jun 17 at 1:07
  • 4
    $\begingroup$ @KrypticCoconut Most students aren't as astute as you are. I'd say: we define $\mathbb C$ as extension of $\mathbb R$ by adding a new not real element $i$ which has the property that we extend the definition of multiplication so that $i^2=-1$ (we aren't just defining $i$ we are redefining multiplication by including an extended instance). We have to extend it further (and we have to extend addition as well) so that $(a+bi)(c+di)=(ac-bd)+(bc+ad)i$ is a new introduced property of multiplication. $\endgroup$
    – fleablood
    Commented Jun 17 at 1:10
9
$\begingroup$

Probably too late to contribute anything, but I think the heart of the issue is this:

The complex numbers are not $\Bbb{R}$ with a new number $i$ attached. They are $\Bbb{R}$ with infinitely many new numbers attached. We attach $2i$, and $4 - 7i$ and $-\frac{\pi}{e^2} + \gamma i$, and every combination we can make in the form $a + ib$. We don't just attach a square root of $-1$, but we attach new $n$th roots for every number (for $n \ge 3$), as well as roots to all polynomials with real coefficients.

When introducing the complex numbers to students for the first time, the imaginary unit $i$ is often introduced almost like a thought experiment: "What if we just chucked a square root of $-1$ into the real numbers?". From there, the form $a + ib$ just magically appears, and addition/multiplication are derived from presupposing properties of addition and multiplication, such as distributivity and commutativity. Lo and behold, things just tend to "work out", and we get a functioning field with helpful properties.

But, you're right to be sceptical of this process. This is not how complex numbers are defined. In fact, there is no definition here to speak of. You can't just nominate a new number $i$ defined by a specific property like $i^2 = -1$, make up a form like $a + ib$, and expect basic properties like distributivity and commutativity to just fall in line! There's no obvious rigorous foundation to support that this will work. It only works out, because the teacher knows, in advance, that there is a firm basis to be found.

Instead, the complex numbers should be defined as a whole, as Rob, Michael Carey, and bill have done. Define all of $\Bbb{C}$, at the same time. You can define it as $\Bbb{R}^2$, or the more complicated $\Bbb{R}[i]/\langle i^2 + 1 \rangle$ construction from Rob's answer, or even as a set of real matrices: $$\Bbb{C} := \left\{\pmatrix{a & -b \\ b & a} : a, b \in \Bbb{R}\right\}.$$ No matter how you do it, you should define all of it, all at once. You should also define operations like addition and multiplication. Then, once you've defined it, identify the real numbers within it (in the above case, the real number $x$ is identified with $xI$, where $I$ is the identity matrix), and show that there is a square root of $-1$ (above: $i:=\pmatrix{0 & -1 \\ 1 & 0}$ squares to give $-I$).
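For anyone who wants to poke at the matrix version, here is a minimal Python sketch using nested tuples for $2\times 2$ matrices. The helpers `as_matrix` and `mat_mul` are names invented for this example, and ordinary matrix multiplication is the only operation assumed.

```python
def as_matrix(a, b):
    """The matrix [[a, -b], [b, a]] standing for a + ib."""
    return ((a, -b), (b, a))

def mat_mul(M, N):
    """Ordinary 2x2 matrix multiplication."""
    return tuple(
        tuple(sum(M[r][k] * N[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

I = as_matrix(1, 0)   # the identity matrix, standing for the real number 1
i = as_matrix(0, 1)   # the matrix [[0, -1], [1, 0]]

print(mat_mul(i, i))                              # ((-1, 0), (0, -1)), which is -I
print(mat_mul(as_matrix(2, 3), as_matrix(5, 7)))  # ((-11, -29), (29, -11))
print(as_matrix(2 * 5 - 3 * 7, 2 * 7 + 3 * 5))    # the same matrix, from (ac - bd, ad + bc)
```

The product of two matrices of this shape is again of this shape, and its entries follow the familiar rule $(ac-bd) + (ad+bc)i$.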

$\endgroup$
3
  • $\begingroup$ FWIW, this is exactly how I present the complex numbers. And, indeed, I later give a model of them based on real matrices, whose properties I have proved. $\endgroup$
    – egreg
    Commented Jun 19 at 8:43
  • $\begingroup$ @egreg To clarify, I don't think there is anything wrong necessarily with the thought experiment approach, so long as it's clear that it's more motivation than mathematics. I think confusing motivation with mathematics is the source of the confusion for the OP. $\endgroup$ Commented Jun 19 at 10:22
  • $\begingroup$ For students of mathematics, the obvious motivation is Cardano's formula in the casus irreducibilis. Actually some of the students in computer science can already be acquainted with complex numbers from high school (technical schools deal with linear differential equations). $\endgroup$
    – egreg
    Commented Jun 19 at 10:34
3
$\begingroup$

There is another way to define complex numbers which is more attractive aesthetically, and provides a more natural answer to the question, but it needs a bit more sophistication.

Consider the set of polynomials of one variable, with real coefficients, e.g. $3x^3+4x^2+5x+1$. Here $x$ is not an unknown, it is just a mark. Then there are natural ways to add and multiply these polynomials, so they form a ring. Within that ring, there is an ideal generated by $x^2+1$. Consider the quotient of the ring by this ideal. It is a field (i.e. every element except the additive identity 0 has a multiplicative inverse). And within it, $x^2+1=0$. So it has all the properties we are looking for to count as the space of complex numbers.
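The field claim can be spot-checked numerically. The sketch below represents the class of $a+bx$ by the pair `(a, b)` and uses the standard formula $(a+bx)^{-1} = (a-bx)/(a^2+b^2)$. As a simplifying assumption it works with exact rational coefficients via `fractions.Fraction` rather than arbitrary reals, and the names `mul` and `inverse` are illustrative.

```python
from fractions import Fraction

def mul(z, w):
    """Product of the classes of a + bx and c + dx, reduced using x^2 = -1."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def inverse(z):
    """Multiplicative inverse of the class of a + bx, for (a, b) != (0, 0)."""
    a, b = z
    n = a * a + b * b          # nonzero whenever (a, b) != (0, 0)
    if n == 0:
        raise ZeroDivisionError("the additive identity has no inverse")
    return (Fraction(a, n), Fraction(-b, n))

z = (3, 4)
print(inverse(z))              # (Fraction(3, 25), Fraction(-4, 25))
print(mul(z, inverse(z)))      # (Fraction(1, 1), Fraction(0, 1)), the class of 1
```

The key point is that $a^2+b^2$ is a nonzero real number whenever $a+bx$ is not the zero class, so the division is always possible.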

$\endgroup$
3
$\begingroup$

What we're doing is creating a bigger field by introducing a new independent number and saying how it combines with itself and with the existing numbers via the field operations.

There are 4 new combinations: $r + i$, $i + i$, $r \cdot i$, and $i\cdot i$, where $r$ is real. The first and the third are taken to be new elements (when $r$ is not $0$ in the first and neither $0$ nor $1$ in the third), that's what we mean by independent; the second is determined by the field axioms to be $(1+1)\cdot i$. Thus, we define what $i\cdot i$ means (in this case, it's $-1$) and then let the field axioms handle the rest.

This implies the existence of all (and only) elements of the form $a + ib$ for real numbers $a$ and $b$, and defines addition independently and multiplication distributively on these numbers.

$\endgroup$
2
$\begingroup$

That is a very astute observation. In the context of your question, the answer is that we define multiplication in the same breath as we define $i$. We should actually spell out that this new entity can be multiplied with itself and the result is $-1$.

And we should not stop here. We should state that $i$ can be multiplied with reals too, and $3i$ is a new number again: it is not real and not equal to $i$ either. We should require that this new multiplication is commutative and associative.

Same with addition. We should explicitly define that the new numbers can be added with certain rules. $i+5$ is not real and not equal to $i$ either, and so on...

$\endgroup$
2
$\begingroup$

I have found math is typically taught in a timeless sense. We teach concepts of "multiplication" like they have been set in stone forever. The reality of the history of mathematics lends some credence to your incredulity and shows how long it took us to accept complex numbers.

The concept of imaginary numbers first really came into use in the 1500's studying cubic equations. Mathematicians like Scipione del Ferro noticed that there were methods of solving cubic equations where, if you just accepted that the square root of negative numbers was a valid thing, a real root could pop out (we would now say that a complex number got multiplied by its conjugate, to get a real number). The validity of such approaches was of course questioned, but it was found to work in all of the cases they explored. Roots of cubic polynomials are easy to verify, even if the method you took to find them bordered on heresy. In mathematics, this is typically a sign you're on the right track, and they explored it further.

Other mathematicians like Rafael Bombelli figured out how to attach concepts like addition and subtraction to these numbers. He recognized that the rules for arithmetic on complex numbers were not the same as for real numbers, but that one could define such operators meaningfully. His addition and subtraction had the usual properties.

Incidentally, this can be seen as a precursor of abstract algebra, where we might ask questions like "what does it mean for multiplication to be defined over a set?" We look at what properties are implied by multiplication. Complex numbers and real numbers are both fields, in abstract algebra terms: a field is an algebraic structure which defines addition, subtraction, multiplication, and division in more-or-less the usual way you think of them today. Thus the modern answer is to say that you can multiply two complex numbers because they form a field, and fields admit multiplication.

Fancy modern terminology aside, it took a long time for complex numbers to become a "thing" in their own right. Once we started noticing things like Euler's formula ($e^{i\theta} = \cos(\theta) + i\sin(\theta)$) we started being able to really leverage complex numbers to solve more and more interesting problems.

That took time. Some mathematicians credit Gauss with being the first mathematician to really embrace them:

The English mathematician G.H. Hardy remarked that Gauss was the first mathematician to use complex numbers in "a really confident and scientific way" although mathematicians such as Norwegian Niels Henrik Abel and Carl Gustav Jacob Jacobi were necessarily using them routinely before Gauss published his 1831 treatise.

Note that that's 300 years later. It took us 300 years to become confident that complex numbers were really a "thing" and really start leveraging them to solve advanced problems that weren't solved before.

So I'd say that excuses you for thinking multiplication was only defined on real numbers. You were probably taught as such less than a decade ago. It took all of mathematics three centuries to believe in them!

Now all that being said, you will find multiplication getting redefined again and again. Cryptographers often make proofs in Galois Fields, which have a finite size (not infinite, like real numbers or integers). As a computer programmer, I also often work with "multiplication" of 32-bit numbers. 32-bit numbers don't even form a field under the usual arithmetic methods, but we call it multiplication nonetheless. It's useful to call it as such. So don't be surprised when you hear yet another meaning of multiplication. They abound!
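To make the 32-bit remark concrete, here is a small Python sketch. It assumes "the usual arithmetic" means unsigned wraparound multiplication, i.e. multiplication modulo $2^{32}$, which is emulated here by masking; `mul_u32` is a made-up helper name.

```python
MASK32 = 0xFFFFFFFF   # keep only the low 32 bits, emulating unsigned wraparound

def mul_u32(a, b):
    """Multiply two unsigned 32-bit values, discarding overflow (arithmetic mod 2**32)."""
    return (a * b) & MASK32

print(mul_u32(3, 5))            # 15, nothing surprising for small values
print(mul_u32(2, 2**31))        # 0: two nonzero numbers whose product is zero
print(mul_u32(65536, 65536))    # 0 again

# 2 times anything is even, so no x gives mul_u32(2, x) == 1: the element 2 has no inverse.
print(all(mul_u32(2, x) % 2 == 0 for x in range(1000)))   # True
```

In a field, a product of nonzero elements is never zero and every nonzero element has a multiplicative inverse, so these examples show why this system is not a field even though the operation is still called multiplication.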

$\endgroup$
1
  • $\begingroup$ Thanks for writing this out, it's an insightful answer, just as a comment - I should've been more clear in my question. My doubt wasn't regarding that multiplication is exclusive to reals (in that it cannot be defined for any other system within reason) but that multiplication was only yet defined for the reals and was being used for something not real, which has been cleared by the amazing answers given here. $\endgroup$ Commented Jun 19 at 1:17
