Associative and commutative binary operations

An Introduction to Real Analysis
Section 1.1.5 notes

2025-12-20

The primary purpose of Chapter 1 of Katznelson and Katznelson (2024) is to develop all the standard number systems from the natural numbers to the complex numbers. Each number system has two arithmetic operations, addition and multiplication, that are formally defined as binary operations on the set of numbers in question. For example, addition of natural numbers is a binary operation \(+ \colon \symbb{N}\times \symbb{N}\to \symbb{N}\).

In particular, for each number system addition and multiplication are associative and commutative binary operations. Associativity and commutativity imply two properties that are repeatedly used when evaluating arithmetic expressions, although these properties are almost never explicitly mentioned. The purpose of these notes is to state and prove these properties in Theorem 1 and Theorem 2.

Because these properties hold for any binary operation satisfying the assumptions of Theorem 1 or Theorem 2, and arithmetic operations are not the only binary operations encountered in mathematics—other examples include set unions and intersections, group operations, matrix multiplication, and convolution of functions—these notes will proceed with an arbitrary set \(X\) and binary operation \(\star\colon X \times X \to X\).

Parenthesizing and permuting expressions

Theorem 1 proves that for an associative binary operation \(\star\colon X \times X \to X\), the placement of parentheses in an expression has no effect on the expression’s value. For example, \((x_1 \star x_2) \star(x_3 \star x_4) \overset{!}{=}(x_1 \star(x_2 \star x_3)) \star x_4 \overset{!}{=}x_1 \star(x_2 \star(x_3 \star x_4))\), and so on. Because of this it is permissible to simply write \(x_1 \star x_2 \star x_3 \star x_4\) and place the parentheses in any way that is useful for the calculation at hand.

The idea of the proof of Theorem 1 begins by recursively defining a standard placement of parentheses for an expression of \(a\) operands by \[\begin{align*} \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^1 x_i &\coloneq x_1, \text{ and}\\[0.75em] \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^{a + 1} x_i &\coloneq \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^a x_i \right) \star x_{a + 1}. \end{align*}\] For example, if \(a \overset{!}{=}4\) then \[ \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^4 x_i \overset{!}{=}(((x_1) \star x_2) \star x_3) \star x_4. \] (The parentheses around \(x_1\) can be removed without ambiguity.) The proof shows that any expression of \(a\) operands \(x_1, x_{\sigma(1)}, \dots, x_a\), in that order and with any placement of parentheses, must be equal to the expression with the same elements in the same order but with the parentheses in standard placement, i.e., \[ \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^a x_i. \] Since any two ways of placing the parentheses lead to expressions with the same value as when the parentheses are in standard position, any two ways of placing the parentheses must lead to expressions with the same value by transitivity of equality. The proof proceeds by strong induction because for any placement of parentheses for an expression of \(a\) operands, there is a final binary operation that must be evaluated for which the two operands are themselves composed of fewer than \(a\) operands each.

Theorem 1 Let \(\star\colon X \times X \to X\) be an associative binary operation on a set \(X\). Then for all \(a \in \symbb{N}\) and all \(x_1, x_{\sigma(1)}, \dots, x_a \in X\), the value of the expression \(x_1 \star x_{\sigma(1)} \star\dots \star x_a\) is equal for all possible placements of parentheses in the expression.

Proof. For expressions having one or two operands, there are zero or one binary operations to evaluate so the placement of parentheses has no effect on the values of such expressions. For expressions having three or more terms the proof will proceed by strong induction on the number of operands in an expression, starting from \(\sigma(\sigma(1))\).1 Let \(x_1, x_{\sigma(1)}, x_{\sigma(\sigma(1))} \in X\). Since \(\star\) is associative, \((x_1 \star x_{\sigma(1)}) \star x_{\sigma(\sigma(1))} = x_1 \star(x_{\sigma(1)} \star x_{\sigma(\sigma(1))})\) so the statement \(P(\sigma(\sigma(1)))\) is true.

Next assume that for chosen \(a \geq \sigma(\sigma(1))\), the statements \(P(\sigma(\sigma(1))), \dots, P(a)\) are true, so the values of all expressions of between \(\sigma(\sigma(1))\) and \(a\) operands, inclusive, are equal for all possible placements of parentheses in the expressions. Consider an expression of \(a + 1\) operands \(x_1, x_{\sigma(1)}, \dots, x_a, x_{a + 1}\), in that order, with some placement of parentheses. For simplicity write this expression without parentheses as \(x_1 \star x_{\sigma(1)} \star\dots \star x_a \star x_{a + 1}\). When evaluating the expression there is a final binary operation that must be evaluated. For this final evaluation the left operand is itself composed of \(1 \leq k < a + 1\) operands \(x_1, \dots, x_k\), and the right operand is itself composed of the remaining \((a + 1) - k\) operands \(x_{k + 1}, \dots, x_{a + 1}\).2 By assumption these left and right operands are equal to expressions with the same operands in the same order but with the parentheses in standard placement. Again for simplicity write these final operands as \(x_1 \star\dots \star x_k\) and \(x_{k + 1} \star\dots \star x_{a + 1}\), respectively. Using this and the definition of standard placement of parentheses, \[\begin{align*} \MoveEqLeft[0] x_1 \star x_{\sigma(1)} \star\dots \star x_a \star x_{a + 1}\\[0.75em] &= \underbrace{(x_1 \star\dots \star x_k)}_{= \left( \star_{i = 1}^k x_i \right)} \star\underbrace{(x_{k + 1} \star\dots \star x_{a + 1})}_{= \left( \star_{j = k + 1}^{a + 1} x_j \right)}\\[0.75em] &= \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^k x_i \right) \star\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = k + 1}^{a + 1} x_j \right). \end{align*}\] If \(k + 1 = a + 1\) then by the cancellation law of addition \(k = a\), so \[\begin{align*} \MoveEqLeft[0] \underbrace{\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^k x_i \right)}_{k = a} \star\underbrace{\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = k + 1}^{a + 1} x_j \right)}_{= x_{a + 1}}\\[0.75em] &= \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^a x_i \right) \star x_{a + 1}\\[0.75em] &= \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^{a + 1} x_i. \end{align*}\] If \(k + 1 \neq a + 1\) then because \(\star\) is associative, \[\begin{align} \MoveEqLeft[0] \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^k x_i \right) \star\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = k + 1}^{a + 1} x_j \right) \notag \\[0.75em] &= \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^k x_i \right) \star\left( \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = k + 1}^a x_j \right) \star x_{a + 1} \right) \notag \\[0.75em] &= \underbrace{\left( \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^k x_i \right) \star\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = k + 1}^a x_j \right) \right)}_{a \text{ operands}} \star x_{a + 1} \label{eq:paren-1}. \end{align}\] The left operand in \(\eqref{eq:paren-1}\) is itself composed of \(a\) operands, so by assumption its parentheses can be put in standard placement. Therefore \[\begin{align*} \MoveEqLeft[0] \underbrace{\left( \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^k x_i \right) \star\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = k + 1}^a x_j \right) \right)}_{= \star_{i = 1}^a x_i} \star x_{a + 1}\\[0.75em] &= \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1} ^a x_i \right) \star x_{a + 1}\\[0.75em] &= \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^{a + 1} x_i. \end{align*}\] The preceding demonstrates that whether \(k + 1 = a + 1\) or not, the parentheses in an expression of \(a + 1\) operands can be put in standard placement. By transitivity of equality, therefore, the values of any two expressions of the operands \(x_1, x_{\sigma(1)}, \dots, x_a, x_{a + 1}\), in that order and with any placement of parentheses, are equal. Since the elements \(x_1, x_{\sigma(1)}, \dots, x_a, x_{a + 1}\) were arbitrary, the values of all expressions of \(a + 1\) operands are equal for all possible placements of parentheses. Hence the statement \(P(a + 1)\) is true. \(\blacksquare\)

When the binary operation is commutative as well as associative, an expression can not only be parenthesized arbitrarily but also permuted arbitrarily. In particular, Theorem 2 proves that for an associative and commutative binary operation the value of an expression is invariant under permutations of the operands. For example, \(x_1 \star x_2 \star x_3 \star x_4 \overset{!}{=}x_2 \star x_1 \star x_4 \star x_3 \overset{!}{=}x_3 \star x_1 \star x_4 \star x_2\).

Theorem 2 Let \(\star\colon X \times X \to X\) be an associative and commutative binary operation on a set \(X\), let \(x_1, \dots, x_a \in X\) and let \(\tau \colon \{ 1, \dots, a \} \to \{ 1, \dots, a \}\) be a bijection. Then \(x_1 \star\dots \star x_a = x_{\tau(1)} \star\dots \star x_{\tau(a)}\).

Proof. By induction on the number of operands in an expression. Since there is only the identity permutation on \(\{ 1 \}\), the statement \(P(1)\) is true. Next assume that \(P(a)\) is true and consider \(x_1, \dots, x_a, x_{a + 1} \in X\) and a bijection \(\tau \colon \{ 1, \dots, a + 1 \} \to \{ 1, \dots, a + 1 \}\). By Theorem 1 parentheses may be placed in the expression \(x_{\tau(1)} \star\dots \star x_{\tau(a)} \star x_{\tau(a + 1)}\) in any valid fashion, so using standard placement in particular \[\begin{equation}\label{eq:permut-1} x_{\tau(1)} \star\dots \star x_{\tau(a)} \star x_{\tau(a + 1)} = \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^{a + 1} x_{\tau(i)}. \end{equation}\] Because \(\tau\) is a bijection there exists a unique \(k \in \{ 1, \dots, a + 1 \}\) such that \(\tau(k) = a + 1\). Informally the \(k\) place is wherever in the expression the last term \(x_{a + 1}\) ends up after \(\tau\) permutes the operands. There are three cases depending on whether \(\tau\) fixes \(x_{a + 1}\) at the end of the expression, sends it to the beginning, or sends it somewhere in the middle of the expression.

If \(k = a + 1\) then \(x_{a + 1}\) is fixed at the end of the expression. Starting with \(\eqref{eq:permut-1}\) and using Theorem 1, \[\begin{align*} \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^{a + 1} x_{\tau(i)} &= \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^a x_{\tau(i)} \right) \star\underbrace{x_{\tau(a + 1)}}_{= x_{a + 1}}\\[0.75em] &= \underbrace{(x_{\tau(1)} \star\dots \star x_{\tau(a)})}_{\mathclap{= x_1 \star\dots \star x_a \text{ by } P(a)}} \star x_{a + 1}\\[0.75em] &= (x_1 \star\dots \star x_a) \star x_{a + 1}\\[0.75em] &= x_1 \star\dots \star x_a \star x_{a + 1}. \end{align*}\]

If \(k = 1\) then \(x_{a + 1}\) is sent to the beginning of the expression. Starting with \(\eqref{eq:permut-1}\) and using Theorem 1 and the fact that \(\star\) is commutative, \[\begin{align*} \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^{a + 1} x_{\tau(i)} &= \underbrace{\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^k x_{\tau(i)} \right)}_{= x_{\tau(1)}} \star\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = \sigma(1)}^{a + 1} x_{\tau(j)} \right)\\[0.75em] &= \underbrace{x_{\tau(1)}}_{= x_{a + 1}} \star\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = \sigma(1)}^{a + 1} x_{\tau(j)} \right)\\[0.75em] &= \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = \sigma(1)}^{a + 1} x_{\tau(j)} \right) \star x_{a + 1}. \end{align*}\] Continuing, \[\begin{align*} \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = \sigma(1)}^{a + 1} x_{\tau(j)} \right) \star x_{a + 1} &= \underbrace{(x_{\tau(\sigma(1))} \star\dots \star x_{\tau(a + 1)})}_{= x_1 \star\dots \star x_a \text{ by } P(a)} \star x_{a + 1}\\[0.75em] &= (x_1 \star\dots \star x_a) \star x_{a + 1}\\[0.75em] &= x_1 \star\dots \star x_a \star x_{a + 1}. \end{align*}\]

If \(k \notin \{ 1, a + 1 \}\) then \(k \in \{ \sigma(1), \dots, a \}\) so \(x_{a + 1}\) is sent somewhere in the middle of the expression. Then there exists \(k^- \in \symbb{N}\) such that \(k = \sigma(k^-)\), so starting with \(\eqref{eq:permut-1}\), using Theorem 1 and the fact that \(\star\) is associative and commutative, \[\begin{align*} \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^{a + 1} x_{\tau(i)} &= \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{x = 1}^k x_{\tau(i)} \right) \star\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = k + 1}^{a + 1} x_{\tau(j)} \right)\\[0.75em] &= \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^{k^-} x_{\tau(i)} \star x_{\tau(k)}\right) \star\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = k + 1}^{a + 1} x_{\tau(j)} \right)\\[0.75em] &= \left( \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^{k^-} x_{\tau(i)} \right) \star\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = k + 1}^{a + 1} x_{\tau(j)} \right) \right) \star x_{\tau(k)}. \end{align*}\] Continuing, \[\begin{align*} \MoveEqLeft[0] \underbrace{\left( \left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{i = 1}^{k^-} x_{\tau(i)} \right) \star\left( \mathop{\vcenter{\hbox{\Huge $\star$}}}_{j = k + 1}^{a + 1} x_{\tau(j)} \right) \right)}_{= x_{\tau(1)} \star\dots \star x_{\tau(k^-)} \star x_{\tau(k + 1)} \star\dots \star x_{\tau(a + 1)}} \star\underbrace{x_{\tau(k)}}_{= x_{a + 1}}\\[0.75em] &= \underbrace{(x_{\tau(1)} \star\dots \star x_{\tau(k^-)} \star x_{\tau(k + 1)} \star\dots \star x_{\tau(a + 1)})}_{= x_1 \star\dots \star x_{a} \text{ by } P(a)} \star x_{a + 1}\\[0.75em] &= (x_1 \star\dots \star x_{a}) \star x_{a + 1}\\[0.75em] &= x_1 \star\dots \star x_a \star x_{a + 1}. \end{align*}\] Therefore, in any of the three possible cases, starting from \(\eqref{eq:permut-1}\) it follows that \(x_{\tau(1)} \star\dots \star x_{\tau(a)} \star x_{\tau(a + 1)} = x_1 \star\dots \star x_a \star x_{a + 1}\), so \(P(a + 1)\) is true. \(\blacksquare\)

Catalan numbers

Theorem 1 shows that parentheses can be placed in any way in an expression involving an associative binary operation. Assuming that \(n \geq 1\), let \(C_n\) be the number of ways of placing parentheses in an expression of \(n + 1\) terms \(x_0 \star x_1 \star\dots \star x_n\).3 (For completeness define the base case \(C_0 \coloneq 1\).) Then \(C_0 = 1\), \(C_1 = 1\), \(C_2 = 2\) and so on. The numbers \(C_n\) for \(n \in \symbb{N}\uplus \{ 0 \}\) are named Catalan numbers after Eugène Catalan. In Exercise 1.1.4 for Katznelson and Katznelson (2024) binomial coefficients \(\binom{n}{k}\) were introduced and motivated as counting the number of subsets of a set \(\{ 1, \dots, n \}\) containing \(0 \leq k \leq n\) elements, and the recurrence equation \(\binom{n + 1}{k} = \binom{n}{k - 1} + \binom{n}{k}\) was derived. Similarly, the Catalan numbers count parenthesizations of expressions and they have their own recurrence relation, which will be derived now.

As suggested in the discussion motivating the proof of Theorem 1, the recursion arises because any parenthesization of the expression \(x_0 \star x_1 \star\dots \star x_n\) necessarily leads to some final evaluation of the binary operation. The left and right operands of this final evaluation must be sub-expressions of the original expression. Suppose the left operand is composed of \(1 \leq k < n + 1\) operands \(x_0, \dots, x_{k - 1}\), and the right operand is itself composed of the remaining \((n + 1) - k\) operands \(x_k, \dots, x_n\). For the left operand parentheses can be placed in \(C_{k - 1}\) ways and for the right operand parentheses can be placed in \(C_{((n + 1) - k) - 1} = C_{n - k}\) ways. Therefore, assuming that the original expression \(x_0 \star x_1 \star\dots \star x_n\) has been parenthesized in such a way that the final evaluation involves expressions of length \(k\) and \((n + 1) - k\) operands, respectively, the number of ways the original expression can be parenthesized is \(C_{k - 1}C_{n - k}\). Because the operands in the final evaluation could have lengths anywhere in the range \((1, n), \dots, (n, 1)\) it’s necessary to sum over all these possibilities. Therefore the recurrence equation for the Catalan numbers is, for \(n \geq 1\), \[ C_n = \sum_{k = 1}^n C_{k - 1} C_{n - k}. \] For more about Catalan numbers see Chapter 9 of Aigner (2007), and various parts of Graham, Knuth, and Patashnik (1994).

References

Aigner, Martin. 2007. Discrete Mathematics. American Mathematical Society.
Graham, Ronald L., Donald E. Knuth, and Oren Patashnik. 1994. Concrete Mathematics: A Foundation for Computer Science. 2nd ed. Addison-Wesley.
Katznelson, Yitzhak, and Yonatan Katznelson. 2024. An Introduction to Real Analysis. Pure and Applied Undergraduate Texts 65. American Mathematical Society.

  1. Radix-\(b\) representations of natural numbers are not introduced until the Section 1.2.4 notes for Katznelson and Katznelson (2024), but for the purpose of this proof one can substitute \(3\) for \(\sigma(\sigma(1))\) if desired.↩︎

  2. Note that \((a + 1) - k\) does not involve any subtraction. It is simply notation for the \(l \in \symbb{N}\) such that \(a + 1 = k + l\), as introduced in the Section 1.1.4 notes for Katznelson and Katznelson (2024).↩︎

  3. Zero is not introduced until the Section 1.2 notes for Katznelson and Katznelson (2024), but that poses no issue here.↩︎