Binomial theorem

An Introduction to Real Analysis

2025-10-05

Show that \[ \binom{n}{k - 1} + \binom{n}{k} = \binom{n + 1}{k}. \] for \(n \geq k \geq 1\).

Use induction to prove the binomial theorem.

Hints:

Express \((x + y)\sum_{k = 0}^n \binom{n}{k}x^{n - k}y^k\) as a sum of two sums.

Show that \(\sum_{k = 0}^n \binom{n}{k}x^{n - k}y^{k + 1} = \sum_{k = 1}^{n + 1} \binom{n}{k - 1}x^{n + 1 - k}y^k\).

Use (a).

Below are two proofs of parts (a) and (b), and thus of the binomial theorem: first by the steps laid out above, and second guided by combinatorial reasoning.

Proof (Part (a); first). Starting with the definition of binomial coefficients, put the two summands on a common denominator of \(k!(n - (k - 1))!\), and then simplify: \[\begin{align*} \MoveEqLeft[0] \binom{n}{k - 1} + \binom{n}{k}\\[0.75em] &= \frac{n!}{(k - 1)! (n - (k - 1))!} + \frac{n!}{k!(n - k)!}\\[0.75em] &= \frac{n!}{(k - 1)! (n - (k - 1))!} \cdot \frac{k}{k} + \frac{n!}{k!(n - k)!} \cdot \frac{n - (k - 1)}{n - (k - 1)}\\[0.75em] &= \frac{k \cdot n! + (n - (k - 1)) \cdot n!}{k! (n - (k - 1))!}\\[0.75em] &= \frac{(k + n - (k - 1)) \cdot n!}{k! (n - (k - 1))!}\\[0.75em] &= \frac{(n + 1) \cdot n!}{k! (n - (k - 1))!}\\[0.75em] &= \frac{(n + 1)!}{k!((n + 1) - k)!}\\[0.75em] &= \binom{n + 1}{k}, \end{align*}\] which completes the proof. \(\blacksquare\)

Proof (Part (b); first). Since \[\begin{align*} (x + y)^1 &= x + y\\[0.75em] &= \underbrace{\binom{1}{0}}_{= 1} x + \underbrace{\binom{1}{1}}_{= 1} y\\[0.75em] &= \binom{1}{0} \underbrace{x^{1 - 0} y^0}_{= x} + \binom{1}{1} \underbrace{x^{1 - 1} y^1}_{= y}\\[0.75em] &= \sum_{k = 0}^1 \binom{1}{k} x^{1 - k} y^k, \end{align*}\] the statement \(P(1)\) is true.

Next assume that \(P(n)\) is true. Start by separating \((x + y)^n\) into two summations: \[\begin{align} (x + y)^{n + 1} &= (x + y) (x + y)^n \notag \\[0.75em] &= (x + y) \underbrace{\sum_{k = 0}^n \binom{n}{k} x^{n - k} y^k}_{= (x + y)^n} \notag \\[0.75em] &= x \sum_{k = 0}^n \binom{n}{k} x^{n - k} y^k + y \sum_{k = 0}^n \binom{n}{k} x^{n - k} y^k \notag \\[0.75em] &= \sum_{k = 0}^n \binom{n}{k} x^{n - (k - 1)} y^k + \sum_{k = 0}^n \binom{n}{k} x^{n - k} y^{k + 1} \label{eq:1} . \end{align}\] The two summations in \(\eqref{eq:1}\) need to be rewritten to bring them together again. For the left summation in \(\eqref{eq:1}\), \[\begin{align} \sum_{k = 0}^n \binom{n}{k} x^{n - (k - 1)} y^k &= \sum_{k = 1}^n \binom{n}{k} x^{n - (k - 1)} y^k + \underbrace{\binom{n}{0} x^{n - (0 - 1)} y^0}_{k = 0 \text{ term}} \notag \\[0.75em] &= \sum_{k = 1}^n \binom{n}{k} x^{n - (k - 1)} y^k + x^{n + 1} \label{eq:2} . \end{align}\] For the right summation in \(\eqref{eq:1}\), \[\begin{align} \MoveEqLeft[0] \sum_{k = 0}^n \binom{n}{k} x^{n - k} y^{k + 1}\notag \\[0.75em] &= \sum_{k = 1}^{n + 1} \binom{n}{k - 1} x^{n - (k - 1)} y^k \notag \\[0.75em] &= \underbrace{\binom{n}{(n + 1) - 1} x^{n - ((n + 1) - 1)} y^{n + 1}}_{k = n + 1 \text{ term}} + \sum_{k = 1}^n \binom{n}{k - 1} x^{n - (k - 1)} y^k \notag \\[0.75em] &= y^{n + 1} + \sum_{k = 1}^n \binom{n}{k - 1} x^{n - (k - 1)} y^k \label{eq:3} . \end{align}\] Now equate \(\eqref{eq:1}\) with the sum of \(\eqref{eq:2}\) and \(\eqref{eq:3}\) and rewrite as one summation: \[\begin{align*} \MoveEqLeft[0] \underbrace{\sum_{k = 0}^n \binom{n}{k} x^{n - (k - 1)} y^k + \sum_{k = 0}^n \binom{n}{k} x^{n - k} y^{k + 1}}_{\eqref{eq:1}}\\[0.75em] &= \underbrace{\sum_{k = 1}^n \binom{n}{k} x^{n - (k - 1)} y^k + x^{n + 1}}_{\eqref{eq:2}} + \underbrace{y^{n + 1} + \sum_{k = 1}^n \binom{n}{k - 1} x^{n - (k - 1)} y^k}_{\eqref{eq:3}}\\[0.75em] &= x^{n + 1} + \sum_{k = 1}^n \underbrace{\left( \binom{n}{k} + \binom{n}{k - 1} \right)}_{= \binom{n + 1}{k} \text{ by (a)}} x^{n - (k - 1)} y^k + y^{n + 1}\\[0.75em] &= \underbrace{\binom{n + 1}{0} x^{n - (0 - 1)} y^0}_{\text{write } x^{n + 1} \text{ as } k = 0 \text{ term}} + \sum_{k = 1}^n \binom{n + 1}{k} x^{n - (k - 1)} y^k + \underbrace{\binom{n + 1}{n + 1} x^{n - ((n + 1) - 1)} y^{n + 1}}_{\text{write } y^{n + 1} \text{ as } k = n + 1 \text{ term}}\\[0.75em] &= \sum_{k = 0}^{n + 1} \binom{n + 1}{k} x^{n - (k - 1)} y^k\\[0.75em] &= \underbrace{\sum_{k = 0}^{n + 1} \binom{n + 1}{k} x^{(n + 1) - k} y^k}_{n - (k - 1) = (n + 1) - k}, \end{align*}\] so \(P(n + 1)\) is true. \(\blacksquare\)

The second set of proofs begins by giving the binomial coefficients a combinatorial definition. Let \(m \in \symbb{N}\) and \(A \coloneq \{ 1, \dots, m \}\). For \(k \in \{ 0, \dots, m \}\), define \[ \binom{m}{k} \coloneq \mathop{}\!\#{\left\{ B \subseteq A \colon \mathop{}\!\#B = k \right\}}; \] in words, \(\binom{m}{k}\) is the number of subsets of \(A\) containing \(k\) elements.¹

This definition can be used to prove part (a) without even an explicit formula for \(\binom{m}{k}\).

Proof (Part (a); second). Choose \(x \in A\) and consider \(B \subseteq A\) where \(\mathop{}\!\#B = k\). Either \(x \in B\) or \(x \notin B\).

If \(x \in B\), then \(B\) is the disjoint union of \(\{ x \}\) and whatever else is in \(B\), if anything. That is, \(B = \{ x \} \uplus C\), where \(C \coloneq B \setminus \{ x \}\), so \(C \subseteq A \setminus \{ x \}\), \(\mathop{}\!\#(A \setminus \{ x \}) = m - 1\), and \(\mathop{}\!\#C = k - 1\). By definition, the number of such subsets \(C\) is \(\binom{m - 1}{k - 1}\), hence there are the same number of subsets \(B\) when \(x \in B\).

If \(x \notin B\), then \(B \subseteq A \setminus \{ x \}\), and the number of such subsets is \(\binom{m - 1}{k}\) by definition. Therefore there is a disjoint union \[\begin{align*} \MoveEqLeft[0] \left\{ B \subseteq A \colon \mathop{}\!\#B = k \right\}\\[0.75em] &= \left\{ B \subseteq A \colon \mathop{}\!\#B = k \text{ and } x \in B \right\} \uplus \left\{ B \subseteq A \colon \mathop{}\!\#B = k \text{ and } x \notin B \right\}, \end{align*}\] and the cardinality of the disjoint union is the sum of the cardinalities: \[\begin{align*} \MoveEqLeft[0] \mathop{}\!\#{\left\{ B \subseteq A \colon \mathop{}\!\#B = k \right\}}\\[0.75em] &= \mathop{}\!\#{\left\{ B \subseteq A \colon \mathop{}\!\#B = k \text{ and } x \in B \right\}} + \mathop{}\!\#{\left\{ B \subseteq A \colon \mathop{}\!\#B = k \text{ and } x \notin B \right\}}, \end{align*}\] which implies that \(\binom{m}{k} = \binom{m - 1}{k - 1} + \binom{m - 1}{k}\). Letting \(m = n + 1\) for some \(n \in \symbb{N}\cup \{ 0 \}\) completes the proof. \(\blacksquare\)

The combinatorial definition of \(\binom{m}{k}\) can be used to recover its explicit formula. Let \(A \coloneq \{ 1, \dots, m \}\) and suppose \(k \in \{ 1, \dots, m - 1 \}\). The explicit formula for \(\binom{m}{k}\) can be determined by over-counting the number of subsets of size \(k\) and then correcting. To form a \(k\)-tuple of elements of \(A\), there are \(m\) choices of elements of \(A\) for the first position; call the chosen element \(a_1\). Then there are \(m - 1\) choices of elements of \(A \setminus \{ a_1 \}\) for the second position; call the chosen element \(a_2\). Continuing in this way, there are \(m - (k - 1)\) choices of elements of \(A \setminus \{ a_1, \dots, a_{k - 1} \}\) for the \(k\)^th position. This implies that the number of \(k\)-tuples of elements of \(A\) is \(\prod_{j = 0}^{k - 1} (m - j)\).

Any such \(k\)-tuple \((a_1, a_2, \dots, a_k)\) corresponds to a subset \(B = \{ a_1, a_2, \dots, a_k \} \subseteq A\) such that \(\mathop{}\!\#B = k\), but multiple \(k\)-tuples correspond to the same subset: there are \(k!\) orderings of the same \(k\)-tuple, and they all correspond to the same subset \(B\). This means \(k!\) of the \(k\)-tuples of \(A\) will correspond to one \(k\)-subset of \(A\), and another \(k!\) \(k\)-tuples of \(A\) will correspond to another \(k\)-subset of \(A\), and so on, for each of the \(\binom{m}{k}\) \(k\)-subsets of \(A\). Therefore \[\begin{align*} \prod_{j = 0}^{k - 1} (m - j) &= k!\binom{m}{k}\\[0.75em] \implies \binom{m}{k} &= \frac{1}{k!} \prod_{j = 0}^{k - 1} (m - j)\\[0.75em] &= \frac{m(m - 1) \cdots (m - (k - 1))}{k!}\\[0.75em] &= \frac{m}{k! (m - k)!}, \end{align*}\] which recovers the definition of the binomial coefficients given by Katznelson and Katznelson (2024, 3).

The following two lemmata provide a clearer understanding of the expansion of \((x + y)^n\) and hence set the stage for the second proof of part (b). Lemma 1 establishes that all possible products of \(x\) and/or \(y\) with \(n\) factors are summands in the expansion of \((x + y)^n\). These products are called strings of length \(n\) instead of products to shift focus toward the arrangement of the symbols \(x\) and \(y\) in the product and away from the algebraic operation of multiplication.

Lemma 1 For \(x, y \in \symbb{R}\) and \(n \in \symbb{N}\), \((x + y)^n\) is the sum of all possible strings of \(x\) and \(y\) of total length \(n\).

Proof. By induction. Because \((x + y)^1 = x + y\), and the only possible strings of \(x\) and \(y\) of total length \(1\) are \(x\) and \(y\), the statement \(P(1)\) is true.

Next assume that \(P(n)\) is true, so \((x + y)^n\) is the sum of all possible strings of \(x\) and \(y\) of total length \(n\). Using distributivity, \[\begin{align*} (x + y)^{n + 1} &= (x + y)(x + y)^n\\[0.75em] &= x(x + y)^n + y(x + y)^n, \end{align*}\] so \((x + y)^{n + 1}\) is the sum of all possible strings of \(x\) and \(y\) of length \(n\) appended on the left with \(x\), and then separately appended on the left with \(y\). All these strings are therefore of length \(n + 1\).

Any string \(s\) of \(x\) and \(y\) of length \(n + 1\) contains some string of \(x\) and \(y\) of length \(n\) starting from the right, then appended with either \(x\) or \(y\) in the leftmost position. If the string \(s\) has an \(x\) in the leftmost position, then it must be one of the summands in \(x(x + y)^n\); if \(s\) has a \(y\) in the rightmost position, then it must be one of the summands in \(y(x + y)^n\). Therefore \(P(n + 1)\) is true. \(\blacksquare\)

Because multiplication is associative and commutative, a product of length \(n\) is invariant under permutations of the factors. For proof, see the Section 1.5 notes for Schröder (2008). For example, if \(n = 4\), then Lemma 1 shows that \((x + y)^4\) will contain the strings \(xxxy\), \(xxyx\), \(xyxx\), and \(yxxx\), and these are all equal to \(x^3 y\).

Lemma 2 answers the question: how many strings of length \(n\) contain \(n - k\) “\(x\)”s and \(k\) “\(y\)”s?

Lemma 2 If \(n \in \symbb{N}\) and \(k \in \{ 1, \dots, n \}\), there are \(\binom{n}{k}\) strings of length \(n\) of the symbols \(x\) and \(y\) containing \(n - k\) “\(x\)”s and \(k\) “\(y\)”s.

Proof. Consider a string \(s\) of length \(n\) of the symbols \(x\) and \(y\) containing \(n - k\) “\(x\)”s and \(k\) “\(y\)”s. The string contains \(n - k\) “\(x\)”s if and only if it contains \(k\) “\(y\)”s, so the two conditions are equivalent and it suffices to count the number strings of length \(n\) containing \(k\) “\(y\)”s.

Starting from the right, number each position in the string from \(1\) to \(n\), so that \(s_1\) is the rightmost element of the string and \(s_n\) is the leftmost element of the string. Define \(S_k\) to be the set of strings of length \(n\) containing \(k\) “\(y\)”s, and define \(B_k\) to be the set of subsets of \(\{ 1, \dots, n \}\) of size \(k\), i.e., \(B_k \coloneq \{ B \subseteq \{ 1, \dots, n \} \colon \mathop{}\!\#B = k \}\). By definition, \(\mathop{}\!\#B_k = \binom{n}{k}\), so if a bijection \(f \colon B_k \to S_k\) can be exhibited the proof of Lemma 2 will be complete.

Define \(f\) so that \(f(B)\) is the string \(s\) such that \(s_j = y\) for all \(j \in B\), and \(s_j = x\) for \(j \notin B\). Informally, associate to \(B\) the string with “\(y\)”s in the positions indicated by the indices in \(B\), and \(x\) otherwise. The map \(f\) is surjective because any string \(s \in S_k\) contains \(y\) in \(k\) positions, and if \(B\) is defined to be the set of exactly those indices, then \(B \in B_k\) and \(f(B) = s\).

Further, \(f\) is injective because if \(B_1, B_2 \in B_k\) and \(B_1 \neq B_2\), then either there is some \(j \in B_1\) such that \(j \notin B_2\), or there is some \(j \in B_2\) such that \(j \notin B_1\).² Without loss of generality, suppose there is some \(j \in B_1\) such that \(j \notin B_2\). Then using the subscript notation to refer to elements of strings, \(f(B_1)_j = y\), whereas \(f(B_2)_j = x\), whence the two strings \(f(B_1)\) and \(f(B_2)\) are different.

Since \(f\) is surjective and injective, it is also bijective, which completes the proof. \(\blacksquare\)

With Lemma 1 and Lemma 2 in hand, the second proof of part (b) only requires rearranging the expansion of \((x + y)^n\).

Proof (Part (b); second). By Lemma 1 \((x + y)^n\) is the sum of all possible strings of \(x\) and \(y\) of total length \(n\). By Lemma 2 for \(k \in \{ 1, \dots, n \}\) the number of such strings containing \(k\) “\(y\)”s, and equivalently \(n - k\) “\(x\)”s, is \(\binom{n}{k}\).³

Because multiplication is associative and commutative, each string containing \(k\) “\(y\)”s can be re-ordered to the form \(x^{n - k} y^k\) for \(k \in \{ 0, \dots, n \}\). Because addition is associative and commutative, the \(2^n\) summands can be re-ordered so that the summand containing \(0\) “\(y\)”s comes first, then the summands containing \(1\) \(y\), and so on, until the summand containing \(n\) “\(y\)”s comes last. Again by associativity and commutativity of addition, \[ \underbrace{x^{n - k} y^k + \dots + x^{n - k} y^k}_{\binom{n}{k} \text{ summands}} = \binom{n}{k} x^{n - k} y^k, \] and therefore \((x + y)^n = \sum_{k = 0}^n \binom{n}{k} x^{n - k} y^k\), which completes the proof. \(\blacksquare\)

The second set of proofs makes clear that the binomial theorem relies only on the following properties of real numbers:

addition is associative and commutative
multiplication is associative and commutative
multiplication distributes over addition
the existence of a multiplicative identity element, in order to make sense of \(x^0\) and \(y^0\).

Because of this, the binomial theorem is valid more generally in any commutative ring \(R\) with unity. Even more generally, it’s not necessary for the entire ring to be commutative, only that the two ring elements under consideration commute. As an example, suppose \(M\) is the ring of \(n \times n\) matrices with entries in a ring \(R\). Then if \(A, B \in M\) and \(AB = BA\), the binomial theorem applies so that \((A + B)^n = \sum_{k = 0}^n \binom{n}{k} A^{n - k} B^k\).

References

Katznelson, Yitzhak, and Yonatan Katznelson. 2024. An Introduction to Real Analysis. Pure and Applied Undergraduate Texts 65. American Mathematical Society.

Schröder, Bernd S. W. 2008. Mathematical Analysis: A Concise Introduction. John Wiley & Sons.

Footnotes

When \(k = 0\) there is only one subset of \(A\) containing no elements: the empty set. Therefore \(\binom{m}{0} = 1\). When \(k = m\) there is only one subset of \(A\) containing all of \(A\): \(A\) itself. Therefore \(\binom{m}{m} = 1\). This agrees with the definition given by Katznelson and Katznelson (2024, 3).↩︎
For sets \(X\) and \(Y\), if the definition of \(X = Y\) is that \((X \subseteq Y) \land (Y \subseteq X)\) then by logic \(X \neq Y\) must mean \((X \nsubseteq Y) \lor (Y \nsubseteq X)\). Thus, either there exists \(x \in X\) such that \(x \notin Y\), or there exists \(y \in Y\) such that \(y \notin X\).↩︎
The number of strings containing \(0\) “\(y\)”s, equivalently \(n\) “\(x\)”s, is clearly \(1\).↩︎