This is a healthy minimum dose of number theory to make you feel comfortable with the concepts we will be using in this course.

We do not start from the absolute beginning. We take for granted integers and arithmetic operations. We start by some terminology. The set of integers, denoted by \(\mathbb{Z}\), is the set of whole numbers including negative numbers, zero, and positive numbers. When we say positive integers – aka natural numbers – we exclude zero and, obviously, negative integers. The set of positive integers is denoted by \(\mathbb{Z}^+\). If you want to include zero but leave out negative integers, the term is non-negative integers. The set of non-negative integers is denoted by \(\mathbb{Z}_{\geq 0}\), and so on with other subscripts. When we say number we will mean integer unless stated otherwise.

The two foundational concepts of number theory are divisibility and parity (i.e. even v. odd). The former is more primitive, in the sense that once we define divisibility, we can use this definition to define parity. Here we go:

Definition 1 (divisibility)

An integer \(a\neq 0\) divides another integer \(b\), denoted by \(a \mid b\), if there exists an integer \(k\) such that \(b = ka\). We say \(a\) is a divisor (or factor) of \(b\), and \(b\) is a multiple of \(a\).

If \(a\) does not divide \(b\), we write \(a \nmid b\).

Check Express divisibility in first-order logic.

Here is the Division Algorithm, which is actually not an algorithm but a theorem:

Theorem (Division Algorithm)

Given any integers \(a\) and \(b\) with \(b>0\), there are unique integers \(q\) and \(r\) such that \(a = bq + r\) and \(0 \leq r < b\).

Some more terminology:

Definition (quotient and remainder)

In the equality of the Division Algorithm, \(a\) is the dividend, \(b\) is the divisor, \(q\) is the quotient and \(r\) is the remainder. Other notations for quotient and remainder are \(q = a\; \mathrm{div}\; b\) and \(r = a \bmod b\).

The quotient of dividing \(a\) by \(b\) can be expressed by the floor function which given a number (not necessarily integer) returns the largest integer not greater than that number. For example, \(\lfloor 3.7 \rfloor = 3\), \(\lfloor -2.3 \rfloor = -3\), and so on. Using the floor function, we can express the quotient as \(q = \lfloor \frac{a}{b} \rfloor\).

Check: What is the quotient and remainder when -11 is divided by 3?

If you answered the above incorrectly as ‘quotient is -3 and remainder is -2’, on the grounds that \(-11=-3\times3 -2\), you have to re-read Theorem (Division Algorithm).

Programming note: In Python, the floor division operator // computes the quotient, while the modulo operator % computes the remainder.

The infinite set \(\mathbb{Z}\) can be partitioned into two infinite subsets: even and odd integers. Partitioning means when you take the union of even integers and odd integers you get back to \(\mathbb{Z}\), while their intersection is the empty set. In more simple terms, if you are an integer, then you are either odd or even, you cannot be both, and you cannot be anything else. We define these sets:

Definition 2 (parity)

\(\begin{align} \forall n \in \mathbb{Z}, n \text{ is even iff } 2\mid n \\ \forall n \in \mathbb{Z}, n \text{ is odd iff } 2\nmid n \end{align}\)

In other words “\(n\) is even” and “\(2 \mid n\)” are the two ways, the latter being more symbolic, to say the same thing. (Isn’t what I just wrote is the definition of definition?)

Check: According to our definition of parity, do we treat 0 as even, odd, or neither?

Given \(n\in \mathbb{Z}\), the set \(\lbrace x : x\mid n\rbrace\) is named the divisors of \(n\).1

Definition 2 (proper divisors)

Given an integer \(n\), The proper divisors of \(n\) is the set \(\lbrace x: x\neq n, x>0, x\mid n\rbrace\).

Check: What are the proper divisors of 12? What about -12?

Here is the standard definition of prime numbers:

Definition 3 (prime)

A positive integer \(p > 1\) is called a prime number (or simply is prime) if its only positive divisors are 1 and \(p\) itself.

Can you think of an alternative definition of prime numbers using the concept of proper divisors we just defined?

Here it is:

Definition (prime – alternative)

A positive integer \(p > 1\) with no proper divisor other than 1 is called a prime number (or simply is prime).

Here is another one that does not need \(>1\) condition:

Definition (prime – alternative 3)

A positive integer \(p\) that has exactly 2 divisors is prime..

Definition (composite numbers)

A positive integer \(n > 1\) that is not prime is composite. Or, equivalently,
\(\forall n >1, n \text{ is composite iff } \exists z \text{ s.t. } z\mid n \text{ and } 1 < z < n\).

So much for definitions. Let’s now look at some truths we can prove using these definitions. The technical name for such truths is theorems. Here is a theorem about parity:

Theorem 1

For any integer \(n\), if \(n\) is odd, then \(n^2\) is odd.

Here is a proof:

Proof 1 (Theorem 1)

Given that \(n\) is odd, \(n = 2k + 1\) for some integer \(k\).
Then, \(n^2 = (2k +1) (2k +1) = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1\).
Since \(2k^2 + 2k\) is an integer, we can say that \(n = 2p + 1\) for some integer \(p=2k^2 + 2k\), and therefore we can conclude that \(n^2\) is odd.

Looks reasonable. But it is not reasonable. We made use of two “truths” that we have not defined or proved yet: (1) any odd number \(n\) equals \(2k + 1\) for some integer \(k\), and (2) if \(a\) and \(b\) are integers, so are \(a+b\) and \(ab\). To make our proof rigorous, we need to avoid using any unproven statements and undefined terms.

We will first take the following theorem for granted without proving it:

Theorem 2 (closure)

The set of integers \(\mathbb{Z}\) is closed under addition, multiplication and subtraction. In other words, for any \(a,b \in \mathbb{Z}\), \(a + b, ab, a-b \in \mathbb{Z}\).

Check: Why is \(\mathbb{Z}\) not closed under division?

Now let us prove this very useful theorem about odd numbers:

Theorem 3 (odd numbers)

For any integer \(n\), \(n\) is odd iff there exists an integer \(k\) such that \(n = 2k + 1\).

Proof: (Theorem 3)

Let \(n\) be odd.
By the division algorithm, there exist integers \(q\) and \(r\) such that \(n = 2q + r\) and \(0 \leq r < 2\).
\(r\) can either be 0 or 1. If \(r\) is 0, then \(n = 2q\), which means \(2 \mid n\), contradicting the assumption that \(n\) is odd.
Therefore, \(r=1\), and \(n = 2q + 1\).

With Theorem 3 proved, we can now complete the proof of Theorem 1:

Proof (Theorem 1)

Let \(n\) be odd.
Then \(n = 2k +1\) for some \(k\) by Theorem 3
Then, \(n^2 = (2k +1) (2k +1) = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1\).
Since \(2k^2 + 2k\) is an integer by Theorem 2, we can say that \(n = 2p + 1\) for some integer \(p=2k^2 + 2k\)
Therefore we can conclude that \(n^2\) is odd.

Let’s exercise a little more.

I think all of us know the basic facts about divisibility: a number is divisible by 2 if its last digit is divisible by 2 (i.e. even), likewise for 5, and more interestingly, a number is divisible by 3 if the sum of its digits is divisible by 3. It is quite straightforward to convince ourselves of the correctness of the first two facts, but why should the third hold? All of us know that it holds, but only some of us might know why it holds. Knowing that some fact holds can be very useful, in engineering, finance, medical practice, sports, and so on. But in science and mathematics, knowing that is secondary to knowing why.

Give yourself some time to think about why the divisibility rule for 3 holds.

Some of us will see it, perhaps, right away, some will need more time, some will progress only up to a certain point, and some will not even find that initial handle to grasp so that they can start reasoning. The biggest mistake for those who are not happy with their performance (which means all levels except the first) is to put the blame on their mental powers, rather than on the inadequacy of their knowledge of the subject matter, or how math works, or both. Try not to fall into this trap, mathematics is not reserved for mentally privileged people.

Let’s try to discover what we DON’T know that is preventing us from seeing the logic behind the divisibility rule for 3. First, do we really know the rule? The rule is actually not that “a number is divisible by 3 if the sum of its digits is divisible by 3”, but rather “a number is divisible by 3 if the sum of its digits in the decimal representation of it is divisible by 3”. There is no possibility of progress without, explicitly or implicitly, being aware of this nuance. Rule is stated to hold for decimal representation, so the reason why it holds must have something to do with the decimal representation. Do we know, then, what does decimal representation, or even representation, mean for numbers?

Theorem (base representation)

Given any integer \(n > 0\) and any base \(b \geq 2\), there exist unique integers \(d_0, d_1, \ldots, d_k\) all non-negative and less than \(b\) with \(d_k>0\) such that

\(n = d_k b^k + d_{k-1} b^{k-1} + \ldots + d_1 b^1 + d_0 b^0\).

For example, \(5097= 5\cdot10^3 + 0\cdot 10^2 + 9\cdot 10^1 + 7\cdot 10^0\), where the base is 10, and the digits are 5,0,9, and 7.

Let’s go back to divisibility by 3. When we write a number in this form, we immediately see that the base terms \(10^i\) except when \(i=0\) are irrelevant for divisibility by 3, you can just ignore them. Why?

Let’s go on trying to discover some other potential gaps in our knowledge and understanding. What is divisibility by 3? If \(3 \mid n\), then there exists an integer \(k\) such that \(n = 3k\). Is there an alternative expression of this state of affairs that would be more useful to us? What about \(n\bmod 3 = 0\). When we are invited to check whether \(3\mid 5097\), we are actually checking whether \(5097 \bmod 3 = 0\). Can we express ‘\(5097 \bmod 3\)’ in terms of 5097’s digits?

For a moment forget about digits. Assume we are asked to find whether 7 is divisible by 3. What do we do? We compute \(7 \bmod 3\) and discover that it is not zero, and answer ‘No’ to the question. What about 3 dividing 17? You would start computing \(17 \bmod 3\) right away, but can’t you also use the result of an earlier computation? (Imagine yourself in a situation where you are asked ‘\(7873298475 \bmod 42\)’ and you know the result for ‘\(7873298400 \bmod 42\)’, wouldn’t it be nice to get away by computing only ‘\(75 \bmod 42\)’ and recycle your earlier result?) Why not, for example, say that \(7\bmod 3=1\) and \(10 \bmod 3= 1\), so \(17\bmod 3\) must be their sum 2, which indeed is \(17\bmod 3\). Can I generalize this? Can I say that given two positive integers \(a\) and \(b\), \((a + b) \bmod 3 = a \bmod 3 + b \bmod 3\)? This doesn’t work, because if both \(a \bmod 3\) and \(b \bmod 3\) equal 2, their sum is 4, which is not a valid value for \(\bmod 3\) – for instance \((8 + 11)\bmod 3\neq (8\bmod 3) + (11\bmod 3)\). But observe that applying \(\bmod 3\) to the result, 4 in this case, will give you the right result. In other words, our specific example obeys the following template \((a + b) \bmod 3 = ((a \bmod 3) + (b \bmod 3)) \bmod 3\)? Does this hold generally? For all \(m>0\)? Yes, it does. Let us state this as a theorem2 and prove it.

Theorem (modular addition)

For any integers \(a,b\) and any positive integer \(m\), \((a + b) \bmod m = ( (a \bmod m) + (b \bmod m) ) \bmod m\).

Proof (modular addition)

Let \(a,b \in \mathbb{Z}\) and \(m \in \mathbb{Z}^+\).
By the Division Algorithm \((a + b) \bmod m = r_1 = a + b - pm\), for some \(p\).
Likewise \(((a \bmod m) + (b \bmod m) ) \bmod m = r_2 = (a - qm) + (b - sm) - tm\), for some \(q,s,t\).3
We need to show that \(r_1 = r_2\).
By rearrangment, \(r_2 = a + b - m(q + s + t)\) Then \(r_1 - r_2 = m(q+s+t-p)\)
We know that \(r_1,r_2< m\), by the definition of the remainder in the Division Algorithm.
Therefore, \(r_1-r_2\) cannot be equal to or greater than \(m\) and cannot be less than or equal to \(-m\), i.e. \(-m < r_1 - r_2 < m\).
The only way this condition holds is when \(q+s+t-p\), an integer, is equal to 0, which means \(r_1 = r_2\).

The same logic holds for multiplication as well:

Theorem (modular multiplication)

For any integers \(a,b\) and any positive integer \(m\), \((ab) \bmod m = ( (a \bmod m)(b \bmod m) ) \bmod m\).

The proof, which is similar to the above, is left as an exercise.

Where are we now on divisibility by 3? We now know that whenever we are asked about ‘\(10n\bmod 3\)’ we can answer that it is ‘\(n\bmod 3\)’, because we know that:

\[\begin{align*} 10n\bmod 3 &=& ((10 \bmod 3)(n\bmod 3)) \bmod 3 \\ &=&(n\bmod 3) \bmod 3\\ &=&n\bmod 3 \end{align*}\]

By the iterated application of this logic, you can have in general:

Lemma (powers of 10 mod 3)

For any non-negative integer \(k\) and any integer \(n\), \(10^k n \bmod 3 = n \bmod 3\), by Theorem (modular multiplication).

We are all set in terms of building blocks. Let’s combine them all:

Lemma (sum of digits mod 3)

Given any positive integer \(n\) with the decimal representation
\(n = d_k10^k + d_{k-1}10^{k-1} + \ldots + d_110^1 + d_010^0\),
\(n \bmod 3 = (d_k10^k + d_{k-1}10^{k-1} \ldots d_110^1 + d_010^0) \bmod 3\)
which, by Theorem (modular addition), is,
\(((d_k10^k \bmod 3) + (d_{k-1}10^{k-1}\bmod 3) + \ldots +\)
\(\quad\quad\quad\quad(d_110^1\bmod 3) + (d_010^0 \bmod 3)) \bmod 3\)
which, by Lemma (powers of 10 mod 3), is,
\(((d_k \bmod 3) + (d_{k-1}\bmod 3) + \ldots +\)
\(\quad\quad\quad\quad(d_1\bmod 3) + (d_0 \bmod 3)) \bmod 3\)
which, once again by Theorem (modular addition), is,
\((d_k + d_{k-1} + \ldots + d_1 + d_0) \bmod 3\),
establishing,
\(n \bmod 3 =(d_k + d_{k-1} + \ldots + d_1 + d_0) \bmod 3\)
Which is to say that a positive integer \(n\) is divisible by 3 iff the sum of the digits in its decimal representation is divisible by 3.

It’s been a long day, I know. But one final example. One that, like the above, trades on some property of number 3. Here is a claim about prime numbers that we need to check:

Claim (3 consecutive primes)

3, 5, and 7 are three consecutive prime numbers, in the sense that there is no odd composite number intervening. There is no other triple of consecutive prime numbers.

There are the following strategies to approach these types of statements.

  1. You find an instance that contradicts the claim. In this case a triple of consecutive primes other than 3,5,7.
  2. You assume that there exists such a triple other than 3,5,7, and then show that this assumption leads to a contradiction.
  3. You pick an arbitrary triple of consecutive primes, where all you are entitled to know that (i) they are primes, (ii) they are consecutive in the Claim’s sense, and (ii) the first of them is not 3. You convince the reader that the triple you picked cannot be all primes, which amounts to showing that any triple other than 3,5,7 is guaranteed NOT to be all prime, proving the claim.

This particular claim gives itself to the third strategy.

Think and try before I give you a clue.

Clue 1: The claim states that there cannot be any integer \(p\neq 3\) such that \(p, p+2, p+4\) are all prime. Make sure you understand the conditions of the problem.

Think and try before I give you another clue.

Clue 2: You can show any such triple other than 3,5,7 must include a number divisible by 3.

Think and try before I give you the final clue.

Clue 3: Concentrate on \(p \bmod 3\).

Think and try before you look at the proof.

Here is the proof, which concludes this post:

Proof (Claim 3 consecutive primes)

Let \(p, p+2, p+4\) be any three consecutive odd numbers with \(p \geq 5\).
Either \(3\mid p\) or \(3\nmid p\).
If \(3\mid p\), then \(p\) is composite, the triple cannot be all primes.
else if \(3\nmid p\) then either \(p \bmod 3=1\) or \(p\bmod 3 = 2\).
If \(p\bmod 3=1\), then \((p+2) \bmod 3 = 0\), which means \(3\mid (p+2)\), and therefore \(p+2\) is composite.
If \(p\bmod 3=2\), then \((p+4) \bmod 3 = 0\), which means \(3\mid (p+4)\), and therefore \(p+4\) is composite.
Simply there is no way that a triple other than 3,5,7 can be all primes.

  1. There is subtle difference between the sense we use “divisor” in this context and the sense we used it in the Division Algorithm. When talking about the divisors of a number \(n\) in the current context, when we say \(k\) is a divisor of \(n\), we mean \(k\) divides \(n\) without leaving a remainder. In the division algorithm, the divisor is a number that divides another number, but may leave a remainder. 

  2. This actually is corollary of a more major theorem about congruences, which we leave out of scope for now. 

  3. We used the Division Algorithm three times, once for each term connected by ‘+’, and once for the overall \(\bmod\).