Some elements of set theory
We start with elementary facts about sets, but we do not try to put set theory on an axiomatic basis. Axiomatic set theory puts the construction of sets on an axiomatic base, to avoid paradoxes which result from defining large, inclusive sets. One famous example which caused a shockwave through the mathematical society was the discovery of Bertrand Russell in 1901 that the set theory created by George Cantor leads to a contradiction:
\[
\text{Let } R=\{x  x \notin x \} \text{ Then } R \in R \Leftrightarrow R \notin R
\]
Modifications to set theory proposed in the 1920s by Abraham Fraenkel, Thoralf Skolem, and by Ernst Zermelo resulted in the axiomatic set theory called
ZFC. This theory became widely accepted once Zermelo's axiom of choice (the C in ZFC) ceased to be controversial. ZFC has remained the canonical axiomatic set theory down to the present day.
For the developing of calculus this more formal form of set theory is not needed. Therefore we do not give a precise definition of a set, a set is an undefinable object like points and lines in the theory of Euclidean geometry. The objects we deal with are either sets or primitive. This is in contrast with ZFC where only sets exists. We assume all primitive objects belong to a universe $U$. Objects have properties and relations with other objects. We use symbols (mainly letters) to denote objects, properties and relations. For sets we use upper case, for primitives we use lower case. Realize that at this stage of our discussion we have not yet defined the concept of a number and counting.
Sets

$x=y$

the object denoted by the symbols $x$ and $y$ are the same

$x \neq y$

the object denoted by the symbols $x$ and $y$ are not the same

$x \in X$

$x$ is an element of the set $X$

$x \notin X$

$x$ is not an element of the set $X$

$X \subset Y$

$X$ is a subset of $Y$ which means $\forall x (x \in X \Rightarrow Y)$

$X \not \subset Y$

$X$ is a not subset of $Y$ which means $\lnot (\forall x (x \in X \Rightarrow Y))$

$X = Y$

two sets $X$ and $Y$ are equal if and only if they have the same elements, $X \subset Y \land Y \subset X$.

$\{x \in X  P(x)\}$

a unique subset of $X$ such that for all elements $x \in X$ $P(x)=true$

$\{x \in X  P(x)\} \subset \{x \in X  Q(x) \} $

$(\forall x \in X)\left(P(x) \Rightarrow Q(x)\right)$

$\{x \in X  P(x)\} = \{x \in X  Q(x) \} $

$(\forall x \in X)\left(P(x) \Leftrightarrow Q(x)\right)$

$\emptyset_X$

the empty set of $X$, $\{x \in X  x\neq x\}$

$\emptyset$

Let $X$ and $Y$ be any two sets then $(x \in \emptyset_X \Rightarrow P(x))$ for any $P(x)$, because a false statement implies every statement (this follows from: $p \Rightarrow q \Leftrightarrow \lnot p \lor q)$. We have now $(x \in \emptyset_X) \Rightarrow (x \in \emptyset_Y)$ and also $(x\in \emptyset_Y) \Rightarrow (x \in \emptyset_X)$ from which follows $\emptyset_X = \emptyset_Y$. All empty sets are equal and denoted by $\emptyset$.

$\{a_1,a_2,...,a_n \}$

the set $A$ consists of elements $x \in A$ if and only if $x=a_i$ for some $i=1,2,...,n$.

$\{ a \}$

the set having object $a$ as unique element

$\mathcal{A}$

an nonindexed family of sets, whose elements are sets

$\{A_\alpha\}_{\alpha \in \Lambda}$

an indexed family of sets, whose elements are sets indexed by an indexing set $\Lambda$, $\Lambda \neq \emptyset$

$\mathcal{P}(X)$

the powerset of $X$ consists of all subsets of $X$, $x \in X \Leftrightarrow \{x\} \in \mathcal{P}(x)$
Set operations

$X \setminus Y$ or $\complement_X Y$

the difference between $X$ and $Y$ or the complement of $Y$ with respect to $X$ where ($Y \subset X$) is defined as $\{x \in X  x \notin Y\}

$X^c$

$X^c=U \setminus X=\complement_U X$, where $U$ is the universe set.

$X \cap Y$

intersection of $X$ and $Y$, $\{x:x \in X \land x \in Y\}$

$X \cup Y$

union of $X$ and $Y$, $\{x:x \in X \lor x \in Y\}$

$\mathop{\bigcup}\limits_{A \in \mathcal{A}} A$

union of nonindexed family of sets,$\{x \in U  \exists A (A \in \mathcal{A} \land x \in A) \}$

$\mathop{\bigcap}\limits_{A \in \mathcal{A}} A, \mathcal{A} \neq \emptyset$

intersection of nonindexed family of sets,$\{x \in U  \forall A (A \in \mathcal{A} \Rightarrow x \in A ) \}$, notice that we have exclude $\mathcal{A} \neq \emptyset$ otherwise $\mathop{\bigcap}\limits_\emptyset A=U$ and this would imply that $\mathop{\bigcap}\limits_\emptyset A \not \subset \mathop{\bigcup}\limits_\emptyset A$

$\mathop{\bigcup}\limits_{\alpha \in \Lambda} A_\alpha$

union of indexed family of sets,$\{x \in U  \exists \alpha (\alpha \in \Lambda \land x\in A_\alpha) \}$

$\mathop{\bigcap}\limits_{\alpha \in \Lambda} A_\alpha, \Lambda \neq \emptyset$

intersection of indexed family of sets,$\{x \in U  \forall \alpha (\alpha \in \Lambda \Rightarrow x \in A_\alpha ) \}$
Corollaries of set operations

$ X \cup X = X \text{ and } X \cap X = X$


$ X \cup Y = Y \cup X \text { and } X \cap Y = Y \cap X$

commutative property

$X \subset Y \Leftrightarrow X \cup Y=Y \Leftrightarrow X \cap Y = X$


$X \subset X \cup Y$ and $X \cap Y \subset X$


$X \subset Z \land Y \subset Z \Leftrightarrow X \cup Y \subset Z$


$Z \subset Y \land Z \subset X \Leftrightarrow Z \subset X \cap Y$


$X \cup (Y \cup Z) = (X \cup Y) \cup Z$

associative property

$X \cap (Y \cap Z) = (X \cap Y) \cap Z$

associative property

$X \cup (Y \cap Z)=(X \cup Y) \cap (X \cup Z$

distributive property

$X \cap (Y \cup Z)=(X \cap Y) \cup (X \cap Z$

distributive property

$X \subset E, \complement = \complement_E \text{ then } \complement(\complement X)=X$


$X \subset E, Y \subset E, \complement = \complement_E \text{ then } \complement(X \cup Y)=(\complement X)\cap(\complement Y)$


$X \subset E, Y \subset E, \complement = \complement_E \text{ then } \complement(X \cap Y)=(\complement X)\cup(\complement Y)$


$\emptyset \subset X$, for all $X$

$\emptyset \subset X$ is equivalent to $\ (\forall x \in U)(x \in \emptyset \Rightarrow x \in X)$. As $x \in \emptyset$ is false for every $x$ we have $x \in \emptyset \Rightarrow x \in X$ is true for every $x$ and thus $\emptyset \subset X$ is also true.
Products of sets

$(a,b)$

an ordered pair

$(a,b)=(a',b')$

$a=a' \land b=b'$

$pr_i c$

the $i^{th}$ projection of the ordered pair $c$, e.g. let $c=(a,b)$ then $pr_2 c=b$

$X \times Y$

Cartesian product of $X$ and $Y$, $\{(x,y)  x \in X \land y \in Y\}$

$R(x,y)$

Relation between $x \in X$ and $y \in Y$ associated with $R(pr_1 z, pr_2 z)$ of $z \in X \times Y$.

$G(x,y)$

The graph of a relation defined by $\{(x,y) \in X \times Y R(x,y)\}$

$G(x), G^{1}(y)$

Defined as $\forall x\in X: G(x)=\{y \in Y  (x,y) \in G(x,y)\}$ and $\forall y \in Y: G^{1}(y)=\{x \in X  (x,y) \in G(x,y)\}$ are called cross section of $G$ at $x$ and $y$.

$X_1 \times X_2 \times \dots \times X_n$

defined by induction: $(X_1 \times X_2 \times \dots \times X_{n1}) \times X_n$ = $\{(x_1,x_2,\dots,x_n)  x_i \in X_i \text{ for } i=1,2,\dots,n\}$

$X^n$

$X \times X \times X \dots \times X \text {( n times)} $

$pr_{i_1i_2...i_k}(z)$

$(x_{i_1},x_{i_2}, \dots,x_{i_k}) \in X_{i_1}\times X_{i_2} \times \dots \times X_{i_k}$
Natural number arises intuitively from the concept of counting. The most 'natural' way to count is by making a mark for every item
counted. If you have seven sheep, you write . If you have two sheep, you write . If you have no sheep at all, what do you write? Nothing at all! This last observation makes clear that the number $0$ we are all familar with is not so naturally. However we will include $0$ as part of the natural numbers later on.
Are natural numbers primitive objects or do they need to be constructed from more basic objects ? There are different approaches possible to introducing natural numbers. One approach is based on axiomatic set theory (ZFC). Another approach is based on defining the natural numbers as primitive objects. This approach is followed by DedekindPeano and is called Peano Arithmetic. There are two variants of Peano arithmetic, one is based on second order logic and the other on first order logic. We discuss here the second order variant.
For this introduction we only assume basic concepts from logic and the following symbols: $\mathbb{N}, 0, \sigma$. All concepts will be defined.
AXIOM 1

1) $\mathbb{N}$ is a nonempty set, 2) $0 \in \mathbb{N}$, 3) $\sigma: \mathbb{N} \to \mathbb{N}$ is a function with domain and codomain $\mathbb{N}$.
We call $\mathbb{N}$ the set of natural numbers and its elements natural numbers. We call $0$ onw. We call $\sigma$ the successor function and $\sigma n$ the successor of $n$.
AXIOM 2

The image of $\sigma$ does not contain $0$:
\[
\lnot \left(\left(\exists n \in \mathbb{N}\right) \left( \sigma n = 0\right)\right)
\]
In other words $0$ is not the successor of a natural number.
AXIOM 3

$\sigma(m)=\sigma(n) \Rightarrow m=n$. The function $\sigma$ is injective. In other words distinct natural numbers have distinct successors.
AXIOM 4 (Induction axiom)

Suppose $S$ is a subset of $\mathbb{N}$ such that 1) $0 \in S$ (2 $n \in S \Rightarrow \sigma n \in S$. Then $S=\mathbb{N}$. $S$ is called an inductive set:
\[
S \subset \mathbb{N} \land 0 \in S \land \left(\forall n\left(n\in S \Rightarrow \sigma n \in S\right)\right) \Rightarrow S=\mathbb{N}
\]
The first 3 axioma's construct the natural numbers as what is called an Dedekindinfinite set. A Dedekindinfinite set is a set for which there is a function $f: A \rightarrow A$ which is injective but not surjective (the element 0 is missing in de codomain of $\sigma$). The last axiom introduces the concept of induction. Induction is intuitively understood as a 'and so on' reasoning style and is related to the concept of infinity. It is needed in the list of axioms for natural numbers to state that $\mathbb{N}$ is the smallest inductive subset. So $S=\{0,1/4,1,5/4,...\}$ could not be $\mathbb{N}$ as it would lead to a contradiction that $A=\{0,1,2,...\}=S$. Only the sequence generated by $0$ can be included to achieve this.
Induction plays also an important rol in mathematical proofs. We will define two variant of the so called
principle of mathematical induction.
We can define symbols, also called numerals, for numbers as follows:
\[
1=\sigma 0, 2=\sigma 1, 3=\sigma 2, 4=\sigma 3, 5=\sigma 4, 6=\sigma 5, 7=\sigma 6, 8=\sigma 7, 9=\sigma 8
\]
Note that a natural number can be represented by many different numerals, e.g. IV and 4 are two different numerals for the same number defined by $\sigma(\sigma(\sigma(\sigma 0)))$. For naming all others natural numbers we will explain later the base10 positional notation. As numerals are not the same thing as natural numbers, one could ask what are natural numbers then. The answer is simply everything that satisfies the above Peano axioms.
A positive natural number, $\mathbb{N}^+$, is a nonzero element of $\mathbb{N}$.
Principle of mathematical induction
In a set theory like ZFC we can prove the induction axiom for the set of natural numbers using the fact that the set of natural numbers is defined as the smallest inductive set that contains zero, and the proof is almost trivial. (An inductive set means a set that contains the successor of x whenever it contains x). In high school or undergraduate courses, when one is asked to prove induction axiom, they are usually asked to derive the induction axiom from some other axioms like the least number principle for natural numbers. Here the principle of induction is a proposition derived from the axiom of Peano.
First Principle of mathematical induction (FPMI)
Proposition

Let $P(n)$ be a propositional function for each $n \in \mathbb{N}$.
If
1)$P(n_0)$ is true for some $n_0 \in \mathbb{N}$
2)$P(k) \Rightarrow P(k+1)$ for all $k \in \mathbb{N} \land k>n_0$
Then
$P(n)$ is true for all $n > n_0$

Let $T=\{x  x > n_0\} \subset \mathbb{N}$. Any $P(n)$ defines a subset $S=\{n \in T  P(x) \}$. We have $n_0 \in S$ and also if $ k>n_0 k \in S \Rightarrow k+1 \in S$. If we apply the principle of finite induction we have $S=T$.
Q.E.D.
This principle is used to proof quantified statements like $\forall k>n_0: P(k)$ over a set $\{k,k+1,k+2,...\}$ with $k \in \mathbb{Z}$. The induction proof consists of different steps:
1) Base step: proof $P(n_0)$
2) Induction hypothesis: assume for some $k > n_0 \land k \in \mathbb{Z}$ that $P(k)$ is true
3) Induction step: given the induction hypothesis proof: $P(k) \Leftarrow P(k+1)$ is true
This proof method can be easily understood by the application of the modus pones interference rule ad infinitum. Let $\forall n P(n), n \in \mathbb{N}$ be a propositional function, which has been proofed by the FPMI, then we have:
\begin{align}
P(0) &\land (P(0)\Rightarrow P(1)) \Rightarrow \\
P(1) &\land (P(1)\Rightarrow P(2)) \Rightarrow \\
P(2) &\land (P(2)\Rightarrow P(3)) \Rightarrow \\
... &
\end{align}
The dots suggest 'and so on', But of course this is not a very precise statement. One could argue that the principle is just a disguised form of applying modus pones ad infinitum and therefore not a sound mathematical proof. However this is not a correct view because we in fact proof: $\{ x  P(x)\} = \mathbb{N}$. For the definition of the natural numbers this 'and so on' had to be made precise. In the axiom system of the natural numbers as developed by Peano the axiom of Infite Induction plays this role. So FPMI proofs that a certain set has a 11 mapping with the natural numbers, the axiom of Finite Induction guarentees the 'and so on' step.
Second Principle of mathematical induction (FPMI)
The second principle of mathematical induction allows us to assume the truth of $P(n)$ for all values $\{n_0,n_0+1,...,k\}$ to derive $P(k+1)$.
Proposition

Let $P(n)$ be a propositional function for each $n \in \mathbb{N}$.
If
1)$P(n_0)$ is true for some $n_0 \in \mathbb{N}$
2)$P(n_0) \land P(n_0+1) \land \dots \land P(k1) \land P(k) \Rightarrow P(k+1)$ for all $k \in \mathbb{N} \land k>n_0$
Then
$P(n)$ is true for all $n > n_0$
Principle of Recursion
Recursion is closely related to induction. Induction is a method to proof a statement for an infinite number of objects with a finite number of steps. These objects in turn have been recursively defined. A recursive definition is based on the idea of an iterative repetition of a function on a starting element. Informally a recursive definition defines the following sequence $a_0=c,a_1=h(a_0), a_2=h(h(a_0), ...., a_n=h(...h(a_0))$ with $c$ the starting element and $h$ the function interatively applied $n$ times on this element to produce a sequence $a_n$ of $n$ terms.

Let $A$ be a set, $c \in A$ and $h: A \rightarrow A$ a function. Then there exists a function $f:\mathbb{N} \rightarrow A$ such that $f(0)=c$ and $f \circ \sigma =h \circ f \;\; \forall n \in \mathbb{N}$ in other notation: $f(n+1)=h(f(n))$.

We must proof that $f$ exists and is unique. Let $\sigma^\prime:\mathbb{N} \times A \rightarrow \mathbb{N} \times A$ be defined by $(n,a) \mapsto (\sigma n, ha)$. We say that $R \subset\mathbb{N} \times A$ is closed under $\sigma^\prime$ if (i) $(0,c) \in R$ and (ii) $\sigma^\prime(R) \subset R$. The intersection of all sets closed under $\sigma^\prime$ is itself closed and is also the smallest subset of $R \subset\mathbb{N} \times A$ that is closed under $\sigma^\prime$. We name this set $N^\prime$. We must now proof that for each $n \in \mathbb{N}$ the ordered pair $(n,a) \in N^\prime$ is unique, meaning if $(n,a) \in N^\prime$ and $(n,b) \in N^\prime$ then $a=b$. We prove this by induction. For the base step we need that show that $(0,c) \in N^\prime$ is unique. Suppose $(0,d) \in N^\prime$ with $c \neq d$, then we could remove this ordered pair from $N^\prime$ and $N^\prime$ still being a closed set, as $(0,d) \notin \sigma^\prime(N^\prime)$ because by Peano axiom 3 $0$ is not the successor of a natural number. Now for the induction step we assume that for some $n$ that $(n,a) \in N^\prime$ and is unique. Then $(\sigma n, ha) \in N^\prime$ because $N^\prime$ closed under $\sigma^\prime$. Suppose that $(\sigma n, b) \in N^\prime$. We need to show that $b=sa$. As $\sigma n \neq 0$ it follows that $(\sigma n,b) \in \sigma^\prime(N^\prime)$. So $(\sigma n,b)=(\sigma n, hd)$ for some pair $(m,d) \in N^\prime$. But as $\sigma$ is injective $m=n$ and we must have $(n,d) \in N^\prime$. But now we have both $(n,a) \in N^\prime$ and $(n,d) \in N^\prime$ from the uniqueness of $a$ follows that $a=d$. We have proved the induction step $(\sigma n,b)=(\sigma n, sa) \Leftrightarrow b=sa$ and can conclude that the ordered pair $(n,x)$ is unique for each $n \in \mathbb{N}$. We now define $f: \mathbb{N} \to A$ as $n \mapsto a$ where $(n,a) \in N^\prime$. So we have for each $n \in \mathbb{N}$ that $(n,fn) \in N^\prime$. But because of the closure of $N^\prime$ it follows that $(\sigma n, f(\sigma n)) \in N^\prime$ and $(\sigma n, h(fn)) \in N^\prime$ which shows that $f(\sigma n)=h(fn)$ or equivalent $f \circ \sigma =h \circ f$. We have now shown the existence of $f$.
Now we need to show that $f$ is unique. Suppose $f^\prime: \mathbb{N} \to A$ such that $f^\prime(0)=c$ and $f^\prime \circ \sigma=h \circ f^\prime$. We then need to show that $f=f^\prime$.Let $W \subset \mathbb{N}$ such that $f(n)=f^\prime(n)$ for all $n \in W$. We need to show $W=\mathbb{N}$ by induction. By definition $f(0)=f^\prime(0)$ so $0 \in W$. Assume $n \in W$ then it follows $f(n)=f^\prime(n)$ and $h(f(n))=h(f^\prime(n))$. But $h(f(n))=f(\sigma n)$ and $h(f^\prime(n))=f^\prime(\sigma n)$ and thus $f(\sigma n)=f^\prime(\sigma n)$ and therefore $\sigma n \in W$. By the induction axiom $W=\mathbb{N}$ and thus $f=f^\prime$.
Q.E.D.
If $a \in A$ and $h: A \to A$ then we want to consider the iteration process:
\[
a, h(a), h(h(a), h(h(h(a))), \dots
\]
Axiom of iteration

Let $h: A \rightarrow A$, then one can assign to every $n \in \mathbb{N}$ a function $h^n: A \to A$ such that $h^0$ is the identity function on $A$ and $h^{\sigma n} = h \circ h^n$.

Define $h^n$ by the rule $a \mapsto f_c(n)$ where $f_c$ is the function $f$ defined by the axiom of recursion with $h,A,c$ similarly defined. Then we have that $h^0(c)=f_c(0)=c$, so $h^0$ is the identity function. Also $h^{\sigma n} = f_c^{\sigma n}=h(f_c(n))=h(h^n(c))=h \circ h^n(c)$. This holds for any $c \in A$ and thus $h^{\sigma n}=h \circ h^n$
Q.E.D.