Induction

An Introduction to Real Analysis

2025-10-16

The main topic in this section is the principle of induction and variations thereof.1

Variations on induction

From the Section 1.1.1 notes for Katznelson and Katznelson (2024), the principle of induction states that if \(M \subseteq \symbb{N}\) is such that \(1 \in M\), and \(a \in M\) implies that \(\sigma(a) \in M\), then \(M = \symbb{N}\). Two closely related principles are the principle of strong induction and the well-ordering principle. The principle of strong induction states that if \(M \subseteq \symbb{N}\) is such that \(1 \in M\), and \(\{ 1, \dots, a \} \coloneq \{ b \in \symbb{N}\colon 1 \leq b \leq a \} \subseteq M\) implies that \(\sigma(a) \in M\), then \(M = \symbb{N}\). The well-ordering principle states that if \(M \subseteq \symbb{N}\) is not empty, then \(M\) has a least element.

According to Katznelson and Katznelson (2024, sec 1.1.3), all three principles are equivalent. However, when developing the natural numbers from the Peano axioms, this is not true. The principle of induction cannot be replaced with either the principle of strong induction or the well-ordering principle as the third Peano axiom.

The latter two principles require by their definitions the notion of an order \(<\) on \(\symbb{N}\). The total order \(<\) on \(\symbb{N}\) developed in the Section 1.1.2 notes for Katznelson and Katznelson (2024) cannot be used for this purpose because it is defined in terms of addition, which itself was developed using the principle of induction.

Even presuming an alternative order has been defined, Magidin (2012) gives an example of a system that satisfies the first two Peano axioms and the principle of strong induction which does not satisfy the principle of induction. Moreover, Öhman (2019) gives an example of a system satisfying the first two Peano axioms and the well-ordering principle which does not satisfy the principle of induction.

Nevertheless, when starting from the Peano axioms as in the Section 1.1.1 notes for Katznelson and Katznelson (2024), both the principle of strong induction and the well-ordering principle can be proven as consequences thereof. This is done in Theorem 1 and Theorem 2, respectively.

Before proving either theorem, it will be helpful to prove that between a natural number \(a\) and its successor \(\sigma(a)\) there are no other natural numbers.

Lemma 1 For any \(a \in \symbb{N}\), there is no \(b \in \symbb{N}\) such that \(a < b < \sigma(a)\).

Proof. By contradiction suppose there were such \(b\). Then there exists \(k \in \symbb{N}\) such that \(b = a + k\), and also \(\sigma(a) = a + 1\).

If \(k = 1\) then \(b = \sigma(a)\). Since it was assumed that \(b < \sigma(a)\), this contradicts the trichotomy law. Therefore \(k < 1\) or \(k > 1\).

If \(k < 1\) then there exists \(l \in \symbb{N}\) such that \(k + l = 1\). If \(l = 1\) then \(\sigma(k) = 1\), which is a contradiction because \(1 \notin \symup{img}(\sigma)\). If \(l \neq 1\) then there exists \(l^- \in \symbb{N}\) such that \(\sigma(l^-) = l\), so \(k + l = k + \sigma(l^-) = \sigma(k + l^-) = 1\), which is again a contradiction for the same reason.

Therefore \(k > 1\). Adding \(a\) to both sides of this inequality shows that \[ \underbrace{a + k}_{= b} > \underbrace{a + 1}_{= \sigma(a)}, \] but this is another contradiction. \(\blacksquare\)

Theorem 1 (Principle of strong induction) If \(M \subseteq \symbb{N}\) is such that \(1 \in M\), and \(\{ 1, \dots, a \} \coloneq \{ b \in \symbb{N}\colon 1 \leq b \leq a \} \subseteq M\) implies that \(\sigma(a) \in M\), then \(M = \symbb{N}\).

Proof. Suppose \(M \subseteq \symbb{N}\) is such that \(1 \in M\) and \(\{ 1, \dots, a \} \subseteq M\) implies that \(\sigma(a) \in M\). Consider the set \(L \coloneq \{ a \in \symbb{N}\colon \{ 1, \dots, a \} \subseteq M \}\). If \(L = \symbb{N}\) then \(M = \symbb{N}\).2 Since \(1 \in M\), \(\{ 1 \} \subseteq M\), so \(1 \in L\). Next assume that \(a \in L\), so that \(\{ 1, \dots, a \} \subseteq M\). Then \(\sigma(a) \in M\), so by Lemma 1 \(\{ 1, \dots, \sigma(a) \} \subseteq M\), whence \(\sigma(a) \in L\). Therefore \(L = \symbb{N}\), so \(M = \symbb{N}\) and the principle of strong induction is true. \(\blacksquare\)

Theorem 2 (Well-ordering principle) If \(M \subseteq \symbb{N}\) is not empty, then \(M\) has a least element.

Proof. By contradiction, suppose \(M \subseteq \symbb{N}\) is not empty and for all \(a \in M\), there exists \(b \in M\) such that \(b < a\). If \(1 \in M\) then the proof of Lemma 1 shows that for all \(a \in M\), \(1 \leq a\), which is a contradiction, so assume \(1 \notin M\). Therefore \(M \neq \symbb{N}\), so \(\symbb{N}\setminus M \neq \emptyset\).

Apply Theorem 1 to \(\symbb{N}\setminus M\). By assumption \(1 \in \symbb{N}\setminus M\). Next assume that \(\{ 1, \dots, a \} \subseteq \symbb{N}\setminus M\). Either \(\sigma(a) \in M\) or \(\sigma(a) \in \symbb{N}\setminus M\).

If \(\sigma(a) \in M\), then there exists \(b \in M\) such that \(b < \sigma(a)\). By Lemma 1, it is not possible that \(a < b\), so \(b \leq a\). Because \(\{ 1, \dots, a \} \coloneq \{ b \in \symbb{N}\colon 1 \leq b \leq a \}\), it must be that \(b \in \{ 1, \dots, a \}\), so \(b \in \symbb{N}\setminus M\), which is a contradiction.

Therefore \(\sigma(a) \in \symbb{N}\setminus M\), so by the principle of strong induction, \(\symbb{N}\setminus M = \symbb{N}\). But then \(M = \emptyset\), which is another contradiction. \(\blacksquare\)

Proofs by induction

When applying the principle of induction in a “proof by induction”, for example in Exercise 1.1.1 from Katznelson and Katznelson (2024), the set \(M \subseteq \symbb{N}\) is rarely mentioned. Instead, statements \(P(a)\) where \(a \in \symbb{N}\) are considered, and a two-step process is followed. First the statement \(P(1)\) is shown to be true. Then if \(P(a)\) is true, it is shown that \(P(\sigma(a))\) is true. After completing both steps the proof by induction concludes, implying that \(P(a)\) is true for all \(a \in \symbb{N}\). Proposition 1 shows that this technique is valid.

Proposition 1 (Proof by induction) Let \(P(a)\) be a statement for each \(a \in \symbb{N}\). If \(P(1)\) is true and for all \(a \in \symbb{N}\), the truth of \(P(a)\) implies the truth of \(P(\sigma(a))\), then \(P(a)\) is true for all \(a \in \symbb{N}\).

Proof. Define the set \(M \coloneq \{ a \in \symbb{N}\colon P(a) \text{ is true} \}\). Since \(P(1)\) is true, \(1 \in M\). Next assume that \(a \in M\), so \(P(a)\) is true. Then \(P(\sigma(a))\) is true, whence \(\sigma(a) \in M\), so \(M = \symbb{N}\). \(\blacksquare\)

A similar justification exists for the technique of “proof by strong induction”.

Proposition 2 (Proof by strong induction) Let \(P(a)\) be a statement for each \(a \in \symbb{N}\). If \(P(1)\) is true and for all \(a \in \symbb{N}\), the truth of \(P(1), \dots, P(a)\) implies the truth of \(P(\sigma(a))\), then \(P(a)\) is true for all \(a \in \symbb{N}\).

Proof. Define the set \(M \coloneq \{ a \in \symbb{N}\colon P(a) \text{ is true} \}\). Since \(P(1)\) is true, \(1 \in M\). Next assume that \(\{ 1, \dots, a \} \subseteq M\), so \(P(1), \dots, P(a)\) are true. Then \(P(\sigma(a))\) is true, whence \(\sigma(a) \in M\), so \(M = \symbb{N}\). \(\blacksquare\)

It it sometimes desirable to prove a statement \(P(b)\) true for all \(b \geq a\), where \(a \in \symbb{N}\). This is facilitated by the following observation.

Proposition 3 For any \(a \in \symbb{N}\), let \(\symbb{N}_{\geq a} \coloneq \{ b \in \symbb{N}\colon b \geq a \} \subseteq \symbb{N}\) and \(\tau \colon \symbb{N}_{\geq a} \to \symbb{N}\) be such that \(\tau = \sigma\vert_{\symbb{N}_{\geq a}}\). Then the shifted triplet \((\symbb{N}_{\geq a}, \tau, a)\) is a Peano system.

Proof. If \(a = 1\) then \((\symbb{N}_{\geq a}, \tau, a) = (\symbb{N}, \sigma, 1)\) and there is nothing to prove. Further, the proof of Lemma 1 established that \(a < 1\) is not possible for any \(a \in \symbb{N}\), so assume that \(a > 1\).

The Peano axioms, if they applied to \((\symbb{N}_{\geq a}, \tau, a)\), would be as follows:

  1. For any \(b \in \symbb{N}_{\geq a}\), \(a \neq \tau(b)\).
  2. For \(b, c \in \symbb{N}_{\geq a}\), \(\tau(b) = \tau(c)\) implies that \(b = c\).
  3. If \(M \subseteq \symbb{N}_{\geq a}\) is such that \(a \in M\), and \(b \in M\) implies that \(\tau(b) \in M\), then \(M = \symbb{N}_{\geq a}\).

Since \(b \in \symbb{N}_{\geq a}\) implies that \(b \geq a > 1\), it follows that \(\tau(b) = \sigma(b) > b \geq a\) so \(\tau(b) \neq a\). Therefore \((\symbb{N}_{\geq a}, \tau, a)\) satisfies the first Peano axiom.

Because \(\tau\) is the restriction of \(\sigma\) to \(\symbb{N}_{\geq a}\) it follows that \(\tau\) is also injective. Therefore \((\symbb{N}_{\geq a}, \tau, a)\) satisfies the second Peano axiom.

Lastly, suppose \(M \subseteq \symbb{N}_{\geq a}\) is such that \(a \in M\), and \(b \in M\) implies that \(\tau(b) \in M\). If \(M \neq \symbb{N}_{\geq a}\) then \(\symbb{N}_{\geq a} \setminus M\) is not empty. Using Theorem 2, consider the least element of \(\symbb{N}_{\geq a} \setminus M\); call it \(l \in \symbb{N}_{\geq a} \setminus M\). Because \(l \in \symbb{N}\setminus \{ 1 \}\), there exists \(l^- \in \symbb{N}\) such that \(\sigma(l^-) = l\).

Since \(l \geq a\) and \(a \in M\), \(l \neq a\), therefore \(l > a\). If \(l^- < a\) then \(l^- < a < l = \sigma(l^-)\), which contradicts Lemma 1. Therefore \(l^- \geq a\), so \(l^- \in \symbb{N}_{\geq a}\). Since \(l^- < l\), it follows that \(l^- \notin \symbb{N}_{\geq a} \setminus M\), whence \(l^- \in M\). But then by assumption \(\tau(l^-) = \sigma(l^-) = l \in M\), a contradiction. Therefore \((\symbb{N}_{\geq a}, \tau, a)\) satisfies the third Peano axiom. \(\blacksquare\)

Since \((\symbb{N}_{\geq a}, \tau, a)\) is a Peano system, the techniques of proof by induction and proof by strong induction can be applied to \(\symbb{N}_{\geq a}\), similar to Proposition 1 and Proposition 2.

Corollary 1 (Proof by induction, arbitrary starting point) Let \(P(b)\) be a statement for each \(b \in \symbb{N}\). For any \(a \in \symbb{N}\), if \(P(a)\) is true, and for \(b \geq a\), \(P(b)\) being true implies that \(P(\sigma(b))\) is true, then \(P(b)\) is true for all \(b \geq a\).

Proof. Apply Proposition 1 to the Peano system \((\symbb{N}_{\geq a}, \tau, a)\). \(\blacksquare\)

Corollary 2 (Proof by strong induction, arbitrary starting point) Let \(P(b)\) be a statement for each \(b \in \symbb{N}\). For any \(a \in \symbb{N}\), if \(P(a)\) is true, and for \(b \geq a\), the truth of \(P(a), \dots, P(b)\) implies the truth of \(P(\sigma(b))\), then \(P(b)\) is true for all \(b \geq a\).

Proof. Apply Proposition 2 to the Peano system \((\symbb{N}_{\geq a}, \tau, a)\). \(\blacksquare\)

Difference of \(n\)th powers

Proposition 1.1.3 from Katznelson and Katznelson (2024, 3) is the equation \(x^n - y^n = (x - y)\sum_{k = 1}^n x^{n - k} y^{k - 1}\) for any \(x, y \in \symbb{R}\) and \(n \in \symbb{N}\). This formula is used repeatedly by Katznelson and Katznelson (2024, chap. 1) when establishing properties of real numbers. Katznelson and Katznelson (2024) prove this proposition by induction, but the proof does not shed light on the intuitive meaning of the equation. Below a geometrically motivated proof is presented. For this proof, \(x\) and \(y\) will be replaced with \(b\) and \(a\), respectively.3

For the purposes of this proof, \(b^n - a^n\) is interpreted as the difference in \(n\)-dimensional volumes between hypercubes in \(\symbb{R}^n\) with side lengths \(b > a > 0\). When \(n = 2\) the additional area of \(b^2\) compared to \(a^2\) is comprised of a long rectangle with area \((b - a) \cdot b\) and a short rectangle with area \((b - a) \cdot a\). That is, \(b^2 - a^2 = (b - a) \cdot b + (b - a) \cdot a = (b - a)(b + a)\).

When \(n = 3\) the additional area of \(b^3\) compared to \(a^3\) is comprised of:

  1. a large slab with volume \((b - a)\cdot b^2\), having length \(b\), width \(b\) and thickness \(b - a\);
  2. a medium slab with volume \((b - a) \cdot ba\), having length \(b\), width \(a\) and thickness \(b - a\); and
  3. a small slab with volume \((b - a) \cdot a^2\), having length \(a\), width \(a\) and thickness \(b - a\).

That is, \(b^3 - a^3 = (b - a)\cdot b^2 + (b - a) \cdot ba + (b - a) \cdot a^2 = (b - a)(b^2 + ba + a^2)\).

Based on these results one might conjecture and prove Proposition 1.1.3, perhaps by induction. It is not clear how to continue reasoning geometrically in arbitrary dimension; one would suspect that in dimension \(n\) the additional volume of \(b^n\) compared to \(a^n\) is comprised of \(n\) pieces, but it’s difficult to reason about how the dimensions of the two cubes would contribute to each of these.

The trick that makes the geometric approach tractable in higher dimensions is to scale the hypercubes so that the smaller hypercube has side length \(1\). Supposing for the moment that \(a = 1\), then \(b^2 - a^2 = b^2 - 1 = (b - 1)(b + 1)\). Similarly, \(b^3 - a^3 = b^3 - 1 = (b - 1)(b^2 + b + 1)\). These results suggest that in \(\symbb{R}^n\), the complicated expressions for the volumes of the \(n\) additional pieces can be replaced with the simpler summation \(\sum_{k = 0}^{n - 1} b^k\). Noticing that Exercise 1.1.3 from Katznelson and Katznelson (2024, 5) gives a closed-form expression for \(\sum_{k = 0}^{n - 1} b^k\) is the final piece of the puzzle.

Proof (Proposition 1.1.3). Let \(\beta \coloneq b / a\). By Exercise 1.1.3 from Katznelson and Katznelson (2024, 5), \(\beta^n - 1 = (\beta - 1)\sum_{k = 1}^n \beta^{k - 1}\). Therefore \[\begin{align*} b^n - a^n &= a^n(\beta^n - 1)\\[0.75em] &= a^n (\beta - 1)\sum_{k = 1}^n \beta^{k - 1}\\[0.75em] &= (b - a)\sum_{k = 1}^n a^{n - 1} \beta^{k - 1}\\[0.75em] &= (b - a)\sum_{k = 1}^n a^{n - k} b^{k - 1}\\[0.75em] &= (b - a)\sum_{k = 1}^n b^{n - k} a^{k - 1}, \end{align*}\] after reversing the order of the summation. Replacing \(b\) and \(a\) with \(x\) and \(y\), respectively, gives \(x^n - y^n = (x - y)\sum_{k = 1}^n x^{n - k} y^{k - 1}\). \(\blacksquare\)

Even though the geometric motivation for the proof assumed that \(b > a > 0\), the proof only uses that \(b / a = \beta \neq 1\), or equivalently \(b \neq a\). Even so, the result is still valid for \(b = a\) because in that case both sides are equal to zero. This is consistent with geometric intuition: when the two hypercubes have the same side length, the extra volume is zero.

References

Katznelson, Yitzhak, and Yonatan Katznelson. 2024. An Introduction to Real Analysis. Pure and Applied Undergraduate Texts 65. American Mathematical Society.
Magidin, Arturo. 2012. “Strong Induction Proofs Done with Weak Induction.” Mathematics Stack Exchange. https://math.stackexchange.com/q/108377.
Öhman, Lars-Daniel. 2019. “Are Induction and Well-Ordering Equivalent?” The Mathematical Intelligencer 41 (3): 33–40. https://doi.org/10.1007/s00283-019-09898-4.

Footnotes

  1. This section also states the binomial theorem, but this is proven in Exercise 1.1.4 of Katznelson and Katznelson (2024).↩︎

  2. Suppose \(L = \symbb{N}\) and \(M \neq \symbb{N}\), so \(\symbb{N}\setminus M\) is not empty. Then there exists some \(a \in L \cap (\symbb{N}\setminus M)\), so \(\{ 1, \dots, a \} \subseteq M\). But this implies that \(a \in M\), which is a contradiction.↩︎

  3. Using \(b\) and \(a\) avoids confusion between side lengths \(x\) and \(y\) and the \(x\)- and \(y\)-axes in dimensions \(2\) and \(3\).↩︎