Prime numbers and unique factorization
An Introduction to Real Analysis
2025-10-18
Prime numbers and their basic properties are topics of algebra more so than analysis. Nevertheless, these topics are important enough to be included in a development of the number systems as in Chapter 1 of Katznelson and Katznelson (2024).
The two most basic properties of prime numbers are their infinitude and that every natural number other than \(1\) factors into a unique product of primes. The latter is known as the fundamental theorem of arithmetic. Katznelson and Katznelson (2024, sec. 1.4) prove the infinitude of primes but not the fundamental theorem of arithmetic, so these notes prove the latter.
Proof of the fundamental theorem of arithmetic has two parts: proving that any \(a \in \symbb{N}\setminus \{ 1 \}\) has a prime factorization, and then proving that such a prime factorization is unique, ignoring permutations of the factors. Because these two parts rely on different techniques, and the second part is more complicated than the first, they have been split into Theorem 1 and Theorem 2, respectively, for clarity.
Fundamental theorem of arithmetic
Lemma 1 For \(a, b, c, d \in \symbb{N}\), if \(a < b\) and \(c \leq d\), then \(ac < bd\).
Proof. By Proposition 3 from the Section 1.1.2 notes for Katznelson and Katznelson (2024), \(ac < bc\). If \(c = d\), then \(bc = bd\), so \(ac < bd\). If \(c < d\), then by the same proposition, \(bc < bd\), so by transitivity of the linear order, \(ac < bd\). \(\blacksquare\)
Theorem 1 (Fundamental theorem of arithmetic; existence) If \(a \in \symbb{N}\setminus \{ 1 \}\), then \(a = \prod_{i = 1}^m p_i\) where \(p_1, \dots, p_m\) are prime.
Proof. By strong induction. Since \(2\) is its own prime factorization, the statement \(P(2)\) is true.
Next assume that for chosen \(a \geq 2\), the statements \(P(2), \dots, P(a)\) are true, so each of \(2, \dots, a\) have prime factorizations. If \(a + 1\) is prime, then \(a + 1\) is its own prime factorization, so the statement \(P(a + 1)\) is true.
If \(a + 1\) is not prime, then by Lemma 1.1.4 from Katznelson and Katznelson (2024, 4) there is a prime number \(p\) such that \(p\) divides \(a + 1\). Therefore there exists \(k \in \symbb{N}\) such that \(a + 1 = pk\). Because \(a + 1\) is assumed not to be prime, \(k \neq 1\), so \(k > 1\). Further, \(p > 1\) because \(1\) is not prime. To restate the preceding, \(p \geq 2\) and \(k \geq 2\). If \(p > a\) then Lemma 1 implies that \(pk \geq 2a > a + 1\), a contradiction.1 Similarly, if \(k > a\) then Lemma 1 implies that \(pk \geq 2a > a + 1\), again a contradiction.
Therefore \(p, k \in \{ 2, \dots, a \}\), whence by assumption \(p\) and \(k\) have prime factorizations \(p = \prod_{i = 1}^m p_i\) and \(k = \prod_{j = 1}^n q_j\). Therefore \(a + 1 = \prod_{i = 1}^m p_i \cdot \prod_{j = 1}^n q_j\), so \(a + 1\) has a prime factorization and the statement \(P(a + 1)\) is true. \(\blacksquare\)
It was established in the Section 1.1.2 notes for Katznelson and Katznelson (2024) that if \(a, b \in \symbb{N}\) and \(a < b\), then the \(k \in \symbb{N}\) such that \(b = a + k\) is unique. Because of this, the notation \(b - a \coloneq k\) can be adopted for the difference between \(b\) and \(a\).
The Section 1.1.2 notes for Katznelson and Katznelson (2024) also established the cancellation law for addition. Lemma 2 establishes a similar cancellation law for inequalities and derives a particular relationship between the two differences.
Lemma 2 If \(a, b, c \in \symbb{N}\) and \(ab > ac\), then \(b > c\) and \(ab - ac = a(b - c)\).
Proof. If \(ab > ac\) and \(b \leq c\), then \(ab \leq ac\), which is a contradiction.
Since \(ab > ac\) implies that \(b > c\), \(b = c + k\) for some \(k \in \symbb{N}\). Then \[\begin{align*} ab &= a(c + k)\\[0.75em] &= ac + ak\\[0.75em] &= ac + a(b - c), \end{align*}\] so by uniqueness of the difference, \(ab - ac = a(b - c)\). \(\blacksquare\)
Theorem 2 (Fundamental theorem of arithmetic; uniqueness) If \(a \in \symbb{N}\setminus \{ 1 \}\), then the prime factorization \(a = \prod_{i = 1}^m p_i\) of \(a\) is unique up to the order of the factors.
Proof. Suppose by contradiction that the set of natural numbers having distinct prime factorizations is not empty. Using well-ordering let \(a\) be the least such number. Then \(a = \prod_{i = 1}^m p_i = \prod_{j = 1}^n q_j\) where \(p_1, \dots, p_m, q_1, \dots, q_n\) are prime, and the two prime factorizations are different.
If there were a common factor in the two factorizations, say \(p_i = q_j\) for some \(i \in \{ 1, \dots, m \}\) and \(j \in \{ 1, \dots, n \}\), then using the cancellation law of multiplication, \[\begin{align} \MoveEqLeft[0] p_i(p_1 \cdot \dots \cdot p_{i - 1} p_{i + 1} \cdot \dots \cdot p_m) \notag \\[0.75em] &= q_j(q_1 \cdot \dots \cdot q_{j - 1} q_{j + 1} \cdot \dots \cdot q_n) \notag \\[0.75em] \implies &p_1 \cdot \dots \cdot p_{i - 1} p_{i + 1} \cdot \dots \cdot p_m \label{eq:fundarithuniq-1} \\[0.75em] &= q_1 \cdot \dots \cdot q_{j - 1} q_{j + 1} \cdot \dots \cdot q_n \label{eq:fundarithuniq-2}. \end{align}\] Since \(p_i = q_j > 1\), it follows that \[\begin{align*} a &= p_i(p_1 \cdot \dots \cdot p_{i - 1} p_{i + 1} \cdot \dots \cdot p_m)\\[0.75em] &> p_1 \cdot \dots \cdot p_{i - 1} p_{i + 1} \cdot \dots \cdot p_m \end{align*}\] and similarly \[\begin{align*} a &= q_j(q_1 \cdot \dots \cdot q_{j - 1} q_{j + 1} \cdot \dots \cdot q_n)\\[0.75em] &> q_1 \cdot \dots \cdot q_{j - 1} q_{j + 1} \cdot \dots \cdot q_n. \end{align*}\] Since the two prime factorizations of \(a\) were assumed different even though they shared the prime factor \(p_i = q_j\), there must be a difference in the remaining prime factors. The equality of \(\eqref{eq:fundarithuniq-1}\) and \(\eqref{eq:fundarithuniq-2}\) shows that there is a number less than \(a\) having distinct prime factorizations, which is a contradiction because it was assumed that \(a\) is the least such number. Therefore in the remainder of the proof it may be assumed that the two prime factorizations of \(a\) share no factors in common.
Without loss of generality, suppose that \(p_1 < q_1\). If \(q_1 - p_1 = 1\) then \(q_1\) is the successor of \(p_1\); the only successive prime numbers are \(2\) and \(3\). If \(p_1 = 2\) and \(q_1 = 3\) then \(a = 2 \prod_{i = 2}^m p_i = 3 \prod_{j = 2}^n q_j\). If one of \(q_2, \dots, q_n\) were even, then \(\prod_{j = 2}^n q_j\) would include a factor of \(2\), which is a contradiction. Therefore \(3 \prod_{j = 2}^n q_j\) is the product of odd numbers and hence itself odd. This implies that \(a\) is both even and odd, which is also a contradiction, so \(q_1 - p_1 > 1\).
Since \(q_1 - p_1 > 1\), by Theorem 1 \(q_1 - p_1\) has a prime factorization. If \(p_1\) is in the factorization of \(q_1 - p_1\), then \(q_1 - p_1 = p_1 \prod_{k = 2}^o r_k\), so \(q_1 = p_1 + p_1 \prod_{k = 2}^o r_k = p_1(1 + \prod_{k = 2}^o r_k)\), which contradicts the assumption that \(q_1\) is prime. Therefore \(p_1\) cannot be in the prime factorization of \(q_1 - p_1\).
Furthermore, since \[\begin{align} a &= p_1 \prod_{i = 2}^m p_i \label{eq:fundarithuniq-3} \\[0.75em] &= q_1 \prod_{j = 2}^n q_j \notag \\[0.75em] &= (p_1 + (q_1 - p_1)) \prod_{j = 2}^n q_j \notag \\[0.75em] &= p_1 \prod_{j = 2}^n q_j + (q_1 - p_1) \prod_{j = 2}^n q_j \label{eq:fundarithuniq-4} , \end{align}\] \(\eqref{eq:fundarithuniq-3}\) and \(\eqref{eq:fundarithuniq-4}\) show that \(p_1 \prod_{i = 2}^m p_i > p_1 \prod_{j = 2}^n q_j\), so by Lemma 2 \(\prod_{i = 2}^m p_i > \prod_{j = 2}^n q_j\). Using \(\eqref{eq:fundarithuniq-3}\), \(\eqref{eq:fundarithuniq-4}\) and Lemma 2 \[\begin{align} p_1 \prod_{i = 2}^m p_i - p_1 \prod_{j = 2}^n q_j &= (q_1 - p_1) \prod_{j = 2}^n q_j \label{eq:fundarithuniq-5} \\[0.75em] &= p_1 \left( \prod_{i = 2}^m p_i - \prod_{j = 2}^n q_j \right) \label{eq:fundarithuniq-6} . \end{align}\]
Using \(\eqref{eq:fundarithuniq-3}\) and \(\eqref{eq:fundarithuniq-4}\) again, \(p_1 \prod_{i = 2}^m p_i - p_1 \prod_{j = 2}^n q_j < a\), so the former has a unique prime factorization; \(\eqref{eq:fundarithuniq-5}\) and \(\eqref{eq:fundarithuniq-6}\) show two different ways of writing this difference so they must have the same prime factorization.
Comparing \(\eqref{eq:fundarithuniq-6}\) with \(\eqref{eq:fundarithuniq-5}\), it must be that either \((q_1 - p_1)\) or \(\prod_{j = 2}^n q_j\) contain \(p_1\) in their prime factorizations, which is a contradiction because both were determined to be impossible. \(\blacksquare\)
Computing prime factorizations
The fundamental theorem of arithmetic shows that any \(a \in \symbb{N}\setminus \{ 1 \}\) has a unique prime factorization. It does not provide any guidance in computing the prime factorization of \(a\). This is illustrative of a difference in perspective between mathematics and computer science: in mathematics it’s often satisfactory to show that an object exists, but computer science emphasizes finding methods for explicitly computing the object, proving the methods correct, implementing them and analyzing their efficiency.
The following Python code applies a naïve method to compute the prime factorization of \(a\). The function get_primes()
returns the prime numbers in \(\{ 2, \dots, a \}\) when called by factorize()
. If \(a\) is not prime, factorize()
considers the prime numbers \(p\) starting from \(2\) and checks if \(p_1 k_1 = a\) for potential factors \(k_1 \in \{ 2, \dots, a \}\). When a factor \(k_1\) is found, \(p_1\) is added to the list of prime factors of \(a\) and the program then attempts to factor \(k_1\).
Before beginning every attempt to factor a new number, the program checks if the product of the prime factors of \(a\) computed so far is equal to \(a\). When the product is equal to \(a\)—which eventually occurs by the fundamental theorem of arithmetic—the program returns the prime factors of \(a\) and terminates. As a demonstration, the program computes the prime factorization of every \(a \in \{ 2, \dots, 100 \}\).
This program is naïve in the sense that more efficient alternatives exist, but those alternatives use more sophisticated concepts than have been developed to this point in the Section 1.1 notes for Katznelson and Katznelson (2024).
References
Footnotes
To prove that for all \(a \geq 2\), \(2a > a + 1\), proceed by induction starting from \(a = 2\). Since \(2 \cdot 2 = 4 > 3 = 2 + 1\), the statement \(P(1)\) is true. Next assume that \(P(a)\) is true, so \(2a > a + 1\). Then \(2(a + 1) = 2a + 2 > 2a + 1 > (a + 1) + 1\), so the statement \(P(a + 1)\) is true.↩︎