Well-ordering theorem: Every set is well-orderable.

Let $S$ be a set.

Let $\mathcal P(S)$ be the power set of $S$.

By the Axiom of Choice, there is a choice function $c$ defined on $\mathcal P(S) \setminus \set \emptyset$, the set of non-empty subsets of $S$.

We will use $c$ and the Principle of Transfinite Induction to define a bijection between $S$ and some ordinal.

Intuitively, we start by pairing $c(S)$ with $0$, and then keep extending the bijection by pairing $c(S\setminus X)$ with $\alpha$, where $X$ is the set of elements already dealt with.

Basis for the Induction

$\alpha = 0$

Let $s_0 =c(S)$.

Inductive Step

Suppose $s_\beta$ has been defined for all $\beta < \alpha$.

If $S \setminus \{s_\beta: \beta < \alpha\}$ is empty, we stop.

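Otherwise, define:

$$s_\alpha := c(S \setminus \{s_\beta: \beta < \alpha\})$$

The process eventually stops: otherwise $\alpha \mapsto s_\alpha$ would be injective on all the ordinals, and we would have defined bijections between subsets of $S$ and arbitrarily large ordinals, which is impossible because $S$ is a set.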

Now, we can impose a well-ordering on $S$ by embedding it via $s_\alpha \mapsto \alpha$ into the ordinal $\beta = \displaystyle {\bigcup_{s_\alpha \mathop \in S} (\alpha + 1)}$, the least ordinal at which the process stops, and using the well-ordering of $\beta$.
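
For a finite set the construction above can be carried out literally, which may help to make the recursion concrete. The following is only a sketch under that finiteness assumption: the names `well_order` and `example_choice` are hypothetical, and `example_choice` is just one arbitrary choice function on non-empty sets of strings, standing in for the $c$ supplied by the Axiom of Choice.

```python
def well_order(S, c):
    """Enumerate the finite set S as s_0, s_1, ... using the choice function c.

    At each stage c picks an element of S minus the elements already chosen,
    mirroring s_alpha := c(S \\ {s_beta : beta < alpha}).
    """
    remaining = set(S)
    sequence = []
    while remaining:  # stop once every element of S has been dealt with
        s = c(frozenset(remaining))
        sequence.append(s)
        remaining.remove(s)
    return sequence


def example_choice(X):
    """A hypothetical choice function: the longest string in X,
    with ties broken alphabetically."""
    return max(sorted(X), key=len)


order = well_order({"pear", "fig", "banana", "kiwi"}, example_choice)
# The position of each element in the list plays the role of its ordinal.
rank = {s: alpha for alpha, s in enumerate(order)}
print(order)
```

Comparing elements by their position in `order` is the finite analogue of pulling back the well-ordering of the ordinal along $s_\alpha \mapsto \alpha$. In the infinite case no such loop exists, and transfinite recursion takes its place.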

Well-Ordering Theorem implies Axiom of Choice

Assume the Well-Ordering Theorem holds.

Let $\mathcal F$ be an arbitrary collection of non-empty sets.

By assumption all sets can be well-ordered.

Hence the set $\bigcup \mathcal F$ of all elements of the sets in $\mathcal F$ can be well-ordered by some ordering $<$.

By definition, in a well-ordered set, every non-empty subset has a unique least element.

Also, note that each set in $\mathcal F$ is a subset of $\bigcup \mathcal F$.

Thus, we may define the choice function $c: \mathcal F \to \bigcup \mathcal F$ by:

$$\forall X \in \mathcal F: c(X) := \min_{<} X$$

that is, by letting $c(X)$ be the least element of $X$ under $<$, so that $c(X) \in X$.
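
The "least element" choice function can be illustrated in the finite case as follows. This is a minimal sketch, assuming the well-ordering is given as a Python list whose positions encode the order; the name `choice_from_well_order` is hypothetical.

```python
def choice_from_well_order(order):
    """Build a choice function from a well-ordering given as a list.

    `order` lists the elements from least to greatest, so the index of an
    element is its rank under the ordering <.
    """
    rank = {x: i for i, x in enumerate(order)}

    def c(X):
        if not X:
            raise ValueError("a choice function is only defined on non-empty sets")
        # The least element of X under < is the one with the smallest rank.
        return min(X, key=rank.__getitem__)

    return c


c = choice_from_well_order(["banana", "kiwi", "pear", "fig"])
print(c({"fig", "pear"}))
print(c({"kiwi", "fig", "banana"}))
```

The same function `c` works for every non-empty subset at once, which is the point of the argument: a single well-ordering of $\bigcup \mathcal F$ yields a choice for every member of $\mathcal F$ simultaneously.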