The laws of physics are invariant (do not change) under a rotational or translational coordinate transformation. This means that the origin, the orientation of and angles between the coordinate axes, and the length of a unit do not influence the physical laws. An observer in a reference frame $S$ will therefore observe the same laws as an observer in $S^\prime$. Another way of saying the same is that the equations of physical laws are covariant (change in the same way) under a coordinate transformation. This idea of invariance under a transformation of coordinates is what is called relativity in physics. The laws are symmetric under a coordinate transformation. For this reason the quantities studied in physics should obey the same principle: a quantity, measured by its magnitude and direction, should remain the same after a coordinate transformation.

The quantities studied in physics, tensors, are of different complexity:

- *scalars*: quantities with a magnitude only, tensors of rank 0
- *vectors*: quantities with a magnitude and one direction, tensors of rank 1
- *dyads*: quantities with a magnitude and two directions, tensors of rank 2
- *triads*: quantities with a magnitude and three directions, tensors of rank 3
- *n-ads*: quantities with a magnitude and $n$ directions, tensors of rank $n$

To study these quantities we must assume a geometry with a measure for distance (metric). In this lecture we describe the theory of coordinate transformations in a Euclidean space (a geometry based on the 5 postulates of Euclid, where the theorem of Pythagoras holds for measuring distance) with a Cartesian coordinate system (a coordinate system with straight lines). The theory could be further generalized to take into account curvilinear coordinate systems. We study here only vectors, tensors of rank 1. A vector is a mathematical object whose direction and length are invariant under a rotational or translational coordinate transformation. So a vector, or any tensor, keeps its identity after coordinate transformations.

A coordinate system with straight-line axes, where the coordinate axes make right angles with each other, is called a rectangular Cartesian coordinate system. The point of intersection of the coordinate axes is called the origin. This system is the most convenient to work with, as opposed to an oblique Cartesian coordinate system.

It turns out that with three axes we can have two different types of orientation: right-handed or left-handed. In a right-handed system the direction of the $z$ axis is aligned with the direction of a right-handed screw in the $xy$ plane when it is turned in the direction from the positive $x$ axis to the positive $y$ axis. If this turn is clockwise the screw moves into the plane, and if the turn is counterclockwise the screw moves out of the plane.

A rectangular Cartesian coordinate system in essence is a correspondence between locations in space and tuples of $n$ numbers, where $n$ is the number of coordinate axes. The origin, $O$, is associated with the tuple $(0,\cdots,0)$; any other point is associated with the tuple defined by the orthogonal projections of the point in space onto the coordinate axes. Let us represent an n-tuple by $\vec{X}=(X^1,X^2,\cdots,X^n)$, or simply $X^j$ where $j$ is understood to range from $1$ to $n$. Geometrically this tuple represents an arrow from the origin to a point $P$ with coordinates $(X^1,X^2,\cdots,X^n)$ in a specific coordinate system. This arrow has a length, direction and sense which are independent of the specific coordinate system. In each coordinate system this same arrow has a different tuple representation. The set of all these different representations is what we call a *Cartesian vector*. We define unit vectors $\vec{e}_1,\vec{e}_2,\cdots,\vec{e}_n$ of unit length on each coordinate axis. Then we can associate any point in space with a vector $\vec{r}$ that is a linear combination of these unit vectors as follows:
\[
\vec{r}=X^1\vec{e}_1+X^2\vec{e}_2+\cdots+X^n\vec{e}_n=\sum_{i=1}^nX^i\vec{e}_i=X^i\vec{e}_i
\]
Note that in the last step we used the Einstein summation convention and simply omitted the summation symbol.
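As a numerical illustration of the expansion $\vec{r}=X^i\vec{e}_i$ and of the summation convention, here is a minimal NumPy sketch (the component values are illustrative):

```python
import numpy as np

# A point P given by its tuple of components in a 3D rectangular system.
X = np.array([2.0, -1.0, 3.0])

# Standard unit basis vectors e_1, e_2, e_3 (the rows of the identity matrix).
e = np.eye(3)

# r = X^i e_i : the Einstein convention is an ordinary sum over the index i.
r = sum(X[i] * e[i] for i in range(3))

# np.einsum expresses the same index contraction directly.
r_einsum = np.einsum('i,ij->j', X, e)

assert np.allclose(r, X)
assert np.allclose(r_einsum, r)
```

With an orthonormal basis the components of $\vec{r}$ are of course just the tuple $X^i$ itself, which is what the assertions check.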

The set of unit vectors defines a rectangular *basis* $\mathcal{R}$ for the *vector space* of vectors. Any basis $\mathcal{G}=\{\vec{g}_1,\cdots,\vec{g}_n\}$, rectangular or not, defines a one-to-one mapping $\phi$ between the vector space $V^n$ and $R^n$ such that:
\[
\phi_\mathcal{G}(\vec{x})=[\vec{x}]_\mathcal{G}=X^j
\]
and
\[
\phi^{-1}_\mathcal{G}(X^j)=(X^j)_\mathcal{G}=X^j\vec{g}_j=\vec{x}
\]
For the basis we have:
\[
[\vec{g}_i]_j=\delta_{ij}
\]
where $[\vec{g}_i]_j$ denotes the $j$th component of the $i$th basis vector and $\delta_{ij}$ the Kronecker delta defined as:
\begin{equation*}
\delta_{ij}:=
\begin{cases}
1 & :i = j \\
0 & :i \ne j
\end{cases}
\end{equation*}
$\delta^i_j$ and $\delta^{ij}$ have the same meaning.

We now turn to the question of how the coordinates of a point $P$ in a system $X$ relate to the coordinates of the same point $P$ in another system $\bar{X}$. Here $X$ and $\bar{X}$ are both rectangular Cartesian coordinate systems with the same unit of length.

We analyze two possible types of transformations. First, the axes of $X$ and $\bar{X}$ are parallel but the origin of $\bar{X}$ has been translated. Second, $X$ and $\bar{X}$ have a common origin, but the axes of $\bar{X}$ have been rotated.

For the first case let us take any point $P$. In $X$ the point $P$ is represented by a position arrow $r$, and by $\bar{r}$ in $\bar{X}$. The origin of $\bar{X}$ is specified by $r_o$ in coordinates of $X$. By the law of addition of vectors we have: \[ r=\bar{r}+r_o \] So the transformation law of the coordinates from $\bar{X}$ to $X$ under a translation of the origin is: \begin{equation} X^j=\bar{X}^j+X_o^j \end{equation}
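The translation law is easy to sketch numerically; below a minimal NumPy example with illustrative components:

```python
import numpy as np

# Components of P in the barred system, and the origin of the barred
# system expressed in the unbarred system X (illustrative values).
X_bar = np.array([1.0, 2.0, 3.0])
X_o   = np.array([5.0, -1.0, 0.5])

# Translation law: X^j = Xbar^j + X_o^j
X = X_bar + X_o

# The inverse transformation simply subtracts the origin again.
assert np.allclose(X - X_o, X_bar)
```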

Next we look at the case of a common origin, but a rotation of the coordinate axes. Any point $P$ is represented by $r$ in $X$ and by $\bar{r}$ in $\bar{X}$. Because of the common origin these two vectors are equal, $r=\bar{r}$. Let $\vec{e}_j$ be the unit basis vectors of $X$ and $\bar{\vec{e}}_j$ those of $\bar{X}$; then it follows that: \begin{equation} \vec{e}_jX^j=\bar{\vec{e}}_k\bar{X}^k \label{eq02} \end{equation} We can express $\bar{\vec{e}}_k$ in terms of the basis vectors of $X$. As $\bar{\vec{e}}_k$ is of unit length, its projections in the directions of the basis vectors of $X$ are the cosines of the angles made by $\bar{\vec{e}}_k$ with these basis vectors. Let $\alpha_k^j$ be the angle between $\bar{\vec{e}}_k$ and $\vec{e}_j$ and define: \[ a_k^j=\cos \alpha_k^j \] Then we have the basis vectors of the barred system expressed in the basis vectors of the unbarred system: \[ \bar{\vec{e}}_k=a_k^j\vec{e}_j \] Substituting this result into equation \ref{eq02} we obtain \begin{align*} \vec{e}_jX^j=a_k^j\vec{e}_j\bar{X}^k \\ (X^j-a_k^j\bar{X}^k)\vec{e}_j=\vec{0} \end{align*} Since the basis vectors are independent, we have found the following transformation equations for a rotation of the coordinate axes: \begin{equation} X^j=a_k^j\bar{X}^k \end{equation} The coordinates of a pair of rectangular Cartesian coordinate systems are thus related by linear transformations.
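The rotation law can likewise be sketched numerically. In the minimal NumPy example below, for $n=2$, the rows of $(a_k^j)$ hold the barred basis vectors expressed in the unbarred system, so $a_k^j=\cos\alpha_k^j$; the angle $\theta$ is illustrative. The length of the position arrow is unchanged by the transformation:

```python
import numpy as np

theta = np.pi / 6  # rotation of the barred axes (illustrative value)

# Rows are the barred basis vectors expressed in the unbarred system,
# so a[k, j] = cos(angle between ebar_k and e_j).
a = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

X_bar = np.array([3.0, 4.0])

# X^j = a_k^j Xbar^k : contraction over the row index k.
X = np.einsum('kj,k->j', a, X_bar)

# The length of the position arrow is invariant under the rotation.
assert np.isclose(np.linalg.norm(X), np.linalg.norm(X_bar))
```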

Notice that the transformations for basis vectors and for components run in opposite directions: $a_k^j$ transforms $\vec{e}$ to $\bar{\vec{e}}$, but $\bar{X}$ to $X$. Notice further that the transformation of the basis vectors is by multiplication along the rows of $a_k^j$, and that of the components along the columns of $a_k^j$. The transformation equation for the $j$th component: \[ X^j\vec{e}_j=(a^j_1\bar{X}^1+a^j_2\bar{X}^2+\cdots+a^j_n\bar{X}^n)\vec{e}_j \] gives the contribution of each $\bar{X}^k$ component in the direction of $\vec{e}_j$ (no sum over $j$ here).

The coefficients of transformation can be written out in the form of a square matrix: \[ (a_k^j)=\left( \begin{array}{cccc} a_1^1 & a_1^2 & .. & a_1^n \\ a_2^1 & a_2^2 & .. & a_2^n \\ .. & .. &.. & ..\\ a_n^1 & a_n^2 & .. & a_n^n \end{array} \right) \] The lower index labels the rows and the upper index the columns.

The matrix of transformation coefficients, $(a_k^j)$, contains in its rows the coordinates, with respect to the $X$ system, of the basis vectors of the $\bar{X}$ system. Multiplying each entry of one row, $k$, with the entries in the same columns of any other row, $p$, and summing these products gives the cosine of the angle between $\bar{\vec{e}}_k$ and $\bar{\vec{e}}_p$: \[ \cos \alpha_k^p =\sum_{j=1}^n a_k^ja_p^j \] This follows from the definition of the scalar product (in a rectangular Cartesian coordinate system) and the fact that the basis vectors have unit length. If we apply this for all possible combinations of rows we obtain an $n$-square matrix. If $k=p$ the two basis vectors coincide and $\cos \alpha_k^p=1$; otherwise $\cos \alpha_k^p=0$. We therefore have the following property for the coefficients of the transformation matrix.

*Property 1*: \begin{equation} \sum_{j=1}^n a_k^ja_p^j=\delta_k^p \end{equation} where $\delta_k^p$ is again the Kronecker delta. Written out as a matrix we have: \[ (\delta_k^j)=\left( \begin{array}{cccc} \delta_1^1 & \delta_1^2 & .. & \delta_1^n \\ \delta_2^1 & \delta_2^2 & .. & \delta_2^n \\ .. & .. &.. & ..\\ \delta_n^1 & \delta_n^2 & .. & \delta_n^n \end{array} \right) = \left( \begin{array}{cccc} 1 & 0 & .. & 0 \\ 0 & 1 & .. & 0 \\ .. & .. &.. & ..\\ 0 & 0 & .. & 1 \end{array} \right) \]

Two further properties are derived below:

*Property 2*: \begin{equation} a^2=\left(\det(a_k^j)\right)^2=1 \end{equation}

*Property 3*: \begin{equation} \bar{X}^p=A_j^pX^j \end{equation}

In the remainder we adopt the following notational conventions:

- We use $X$ for a rectangular Cartesian coordinate system with orthonormal basis vectors, and $Y$ for any Cartesian coordinate system.
- We use a capital letter for a matrix that transforms coordinates from the system without a bar to the system with a bar, and a small letter for the reverse transformation.
- We use $A$ or $a$ for transformations between $X$ and $\bar{X}$, $B$ or $b$ for transformations between $Y$ and $X$, $D$ or $d$ for transformations between $\bar{Y}$ and $X$, and $C$ or $c$ for a transformation between any two Cartesian systems.
- For a matrix the rows are indexed by subscripts and the columns by superscripts.
- We use $\vec{e}$ for orthonormal basis vectors, and $\vec{\iota}$ for arbitrary basis vectors.

We will also see that the covariant and contravariant bases, introduced below, are reciprocal: \[ \vec{p}_j\cdot\vec{p}^k=\delta_j^k \]

Definition (contravariant vector): a contravariant vector is a collection $\vec{G}=\{G,\bar{G},\cdots\}$ of ordered n-tuples, each associated with a Cartesian coordinate system, such that any two satisfy the following transformation law: \begin{equation} G^j=\frac{\partial Y^j}{\partial \bar{Y}^k}\bar{G}^k \end{equation} with $G^j$ and $\bar{G}^k$ the contravariant components of the Cartesian vector in the respective coordinate systems.

Definition (covariant vector): a covariant vector is a collection $\vec{G}=\{G,\bar{G},\cdots\}$ of ordered n-tuples, each associated with a Cartesian coordinate system, such that any two satisfy the following transformation law: \begin{equation} G_j=\frac{\partial \bar{Y}^k}{\partial Y^j}\bar{G}_k \end{equation} with $G_j$ and $\bar{G}_k$ the covariant components of the Cartesian vector in the respective coordinate systems.

**[1]** Robert C. Wrede, *Introduction to Vector and Tensor Analysis*, Dover Publications, 1972.
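Property 1 (and Property 2 below) can be checked numerically; a minimal NumPy sketch with an illustrative rotation matrix:

```python
import numpy as np

theta = np.pi / 4  # illustrative rotation angle
a = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

# Property 1: the rows of (a_k^j) are orthonormal,
# sum_j a_k^j a_p^j = delta_k^p, i.e. a a^T = I.
assert np.allclose(a @ a.T, np.eye(2))

# Property 2: the determinant squared is 1, so det = +1 or -1.
assert np.isclose(np.linalg.det(a) ** 2, 1.0)
```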

Notice that in matrix notation Property 1 takes the form \[ (a_k^j)(a_j^k)=(\delta_k^j) \] where $(a_j^k)$ is the transpose of $(a_k^j)$, formed by putting the columns (rows) of $(a_k^j)$ into the rows (columns) of $(a_j^k)$. We derive: \[ \det(a_k^j)\det(a_j^k)=\det(\delta_k^j)=1 \] We have further that: \[ \det(a_k^j)=\det(a_j^k) \] This leads to Property 2.

So, writing $a=\det(a_k^j)$, either $a=1$ or $a=-1$. The sign depends on the relative handedness of the coordinate systems: if $X$ and $\bar{X}$ have the same handedness (both right-handed or both left-handed) then $a=1$, otherwise $a=-1$.

Now that we know the transformation matrix has a nonvanishing determinant, we derive the inverse transformation matrix $A_j^k$ of $a_k^j$ with Cramer's rule: \begin{equation} A_j^k = \frac{\text{cofactor of }a_k^j \text{ in }\det(a_k^j)}{a} \end{equation} The cofactor of $a_j^k$ is the $(n-1)\times(n-1)$ determinant obtained by striking out the $j$th row and $k$th column, multiplied by $(-1)^{j+k}$. We have \begin{align*} A_j^ka_k^p=\delta_j^p \\ A_j^ka_q^j=\delta_q^k \end{align*}
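The cofactor construction of the inverse can be carried out directly; below a minimal NumPy sketch (the matrix is an illustrative rotation), checked against `numpy.linalg.inv`:

```python
import numpy as np

def cofactor_inverse(a):
    """Inverse via Cramer's rule: entries are cofactors divided by the determinant."""
    n = a.shape[0]
    det = np.linalg.det(a)
    inv = np.empty_like(a)
    for j in range(n):
        for k in range(n):
            # Minor: strike out the j-th row and k-th column.
            minor = np.delete(np.delete(a, j, axis=0), k, axis=1)
            # Cofactor of entry (j, k), placed transposed into the inverse.
            inv[k, j] = (-1) ** (j + k) * np.linalg.det(minor) / det
    return inv

theta = np.pi / 3  # illustrative
a = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

A = cofactor_inverse(a)
assert np.allclose(A @ a, np.eye(2))
assert np.allclose(A, np.linalg.inv(a))
```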

Given $X^j=a_k^j\bar{X}^k$, we multiply both sides by $A_j^p$ and derive Property 3: \begin{align*} A_j^pX^j=A_j^pa_k^j\bar{X}^k=\delta_k^p\bar{X}^k=\bar{X}^p \end{align*}

We now analyze a coordinate transformation after relaxing the requirement that the coordinate axes are orthogonal and the basis vectors of unit length. So we include oblique coordinate axes with basis vectors of arbitrary length. Let $Y$ and $\bar{Y}$ be two such systems, and take any rectangular Cartesian system $X$. We write the unit vectors of $X$ as a linear combination of the basis vectors of $Y$: \[ \vec{e}_j=B_j^q\vec{\iota}_q \] and get: \[ Y^q=B_j^qX^j \] We write the basis vectors of $\bar{Y}$ as a linear combination of the unit vectors of $X$: \[ \bar{\vec{\iota}}_s=d_s^j\vec{e}_j \] and get: \[ X^j=d_s^j\bar{Y}^s \] We substitute this result into the equation for $Y^q$: \[ Y^q=B_j^q d_s^j\bar{Y}^s \] The product of the two matrices defines a new matrix: \[ c_s^q=B_j^q d_s^j \] As basis and unit vectors are independent, the determinants of the matrices $B_j^q$ and $d_s^j$ are nonzero, so $c_s^q$ has an inverse: \[ C_q^s=D_j^sb_q^j \] Finally we have the following generalized linear transformation equations: \[ Y^q=c_s^q\bar{Y}^s \] and \[ \bar{Y}^s=C_q^sY^q \]
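The composition $c_s^q=B_j^q d_s^j$ can be verified numerically. In the NumPy sketch below the rows of $b$ and $d$ hold illustrative basis vectors of $Y$ and $\bar{Y}$ expressed in $X$-coordinates, so with subscripts as rows a coordinate tuple transforms by right-multiplication:

```python
import numpy as np

# Rows of b and d: the basis vectors of the oblique systems Y and Ybar,
# written in the rectangular system X (illustrative values).
b = np.array([[1.0, 0.0],
              [1.0, 2.0]])   # basis of Y
d = np.array([[2.0, 1.0],
              [0.0, 3.0]])   # basis of Ybar

B = np.linalg.inv(b)         # transforms X-coordinates to Y-coordinates

# c_s^q = B_j^q d_s^j : the composed transformation from Ybar to Y.
c = d @ B

# A point's X-coordinates computed directly from Ybar, or via Y, agree.
Y_bar = np.array([1.0, 1.0])
X1 = Y_bar @ d               # X^j = d_s^j Ybar^s
Y  = Y_bar @ c               # Y^q = c_s^q Ybar^s
X2 = Y @ b                   # X^j = b_q^j Y^q
assert np.allclose(X1, X2)
```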

Let us now summarize all the coordinate and vector transformations we have seen so far, using the conventions introduced above:

| $X-\bar{X}$ | $Y-X$ | $\bar{Y}-X$ | $Y-\bar{Y}$ |
| --- | --- | --- | --- |
| $X^j=a^j_i\bar{X}^i$ and $\bar{\vec{e}}_j=a^i_j\vec{e}_i$ | $X^j=b^j_iY^i$ and $\vec{\iota}_j=b^i_j\vec{e}_i$ | $X^j=d^j_i\bar{Y}^i$ and $\bar{\vec{\iota}}_j=d^i_j\vec{e}_i$ | $Y^j=c^j_i\bar{Y}^i$ and $\bar{\vec{\iota}}_j=c^i_j\vec{\iota}_i$ |
| $\bar{X}^j=A^j_iX^i$ and $\vec{e}_j=A^i_j\bar{\vec{e}}_i$ | $Y^j=B^j_iX^i$ and $\vec{e}_j=B^i_j\vec{\iota}_i$ | $\bar{Y}^j=D^j_iX^i$ and $\vec{e}_j=D^i_j\bar{\vec{\iota}}_i$ | $\bar{Y}^j=C^j_iY^i$ and $\vec{\iota}_j=C^i_j\bar{\vec{\iota}}_i$ |

We have learned that the coordinates of a vector with respect to any two different Cartesian coordinate systems, $Y$ and $\bar{Y}$, have a linear relationship to each other. This linear relationship follows from expressing the basis vectors of one system as a linear combination of the basis vectors of the other system. The rows of the transformation matrix from barred to unbarred, $c_j^k$, give the basis vectors of the barred system as a linear combination of those of the unbarred system: \[ \bar{\vec{\iota}}_j=c_j^k\vec{\iota}_k \] and the rows of the transformation matrix from unbarred to barred, $C_j^k$, give the basis vectors of the unbarred system as a linear combination of those of the barred system: \[ \vec{\iota}_j=C_j^k\bar{\vec{\iota}}_k \]

We call the rows of these matrices the *covariant basis*: $\bar{\vec{p}}_j=\bar{\vec{\iota}}_j$ for the barred system and $\vec{p}_j=\vec{\iota}_j$ for the unbarred system. The components of n-tuples associated with a covariant basis are called contravariant components and are denoted by a capital letter with a superscript index.

Notice: \begin{align*} \bar{\vec{p}}_k=c_k^j\vec{p}_j \text{ and } \vec{p}_j=C_j^k\bar{\vec{p}}_k \end{align*}

Notice also that the coordinates of $\vec{p}$ and $\vec{\iota}$ are from different coordinate systems, but express the same vector. The covariant basis $\bar{\vec{p}}_j$ transforms the unit vectors of $Y$ to vectors parallel to the coordinate axes of $\bar{Y}$. The covariant basis $\vec{p}_j$ transforms the unit vectors of $\bar{Y}$ to vectors parallel to the coordinate axes of $Y$.

Let us now see how we can transform the coordinates $\bar{P}^j$ back to coordinates of $\bar{Y}$, denoted $\bar{I}^k$. Notice that $\bar{\vec{\iota}}_j$ and $\vec{\iota}_j$ by definition have components $\delta_j^k$ in their own coordinate systems, so $\bar{I}^k=\delta_j^k$. The coordinates $\bar{P}^j$ form the $j$th row of $c$, and to transform these coordinates we must multiply this $j$th row with $C$: \begin{equation*} \bar{I}^k=C^k_sc^s_j=\delta^k_j \end{equation*} So it follows that multiplying the rows of $c$ with the columns of $C$ gives the identity matrix: \[ C_s^kc_j^s=\delta_j^k \]

We can of course also express this covariant basis in coordinates of a rectangular coordinate system. We derived before: \[ c_j^k=B_e^kd_j^e \] and from the transformation from $Y$ to $X$ we have: \[ \vec{\iota}_k=b_k^d\vec{e}_d \] Let us now substitute this into our equation for the covariant basis: \begin{align*} \bar{\vec{p}}_j &=c_j^k\vec{\iota}_k \\ &=B_e^kd_j^eb_k^d\vec{e}_d \\ &=B_e^kb_k^dd_j^e\vec{e}_d \text{ (we may change the order of the factors as long as free and dummy indices are unchanged) } \\ &=\delta_e^dd_j^e\vec{e}_d \\ &=d_j^d\vec{e}_d \end{align*} Following the same line of reasoning we find $\vec{p}_j=b_j^d\vec{e}_d$.

We could also use the columns of the transformation matrix $D$ to define a linear combination of the basis vectors of $X$ and use this as basis vectors for $\bar{Y}$:
\[
\bar{\vec{p}}^s=\sum_{q=1}^n D_q^s\vec{e}_q
\]
or use the columns of the transformation matrix $B$ to define a linear combination of the basis vectors of $X$ and use this as basis vectors for $Y$:
\[
\vec{p}^s=\sum_{q=1}^n B_q^s\vec{e}_q
\]
This set of basis vectors is called a *contravariant basis* and the components of associated n-tuples are called covariant components and denoted with capital letters and subscript index notation.

For a rectangular Cartesian coordinate system its basis is reciprocal to itself: \[ \vec{e}_j\cdot\vec{e}^k=\delta_j^k \]

We have the following relations: \begin{align*} \vec{p}^j=c_q^j\bar{\vec{p}}^q \\ \bar{\vec{p}}^q=C_j^q\vec{p}^j \end{align*}
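Reciprocity is easy to check numerically: taking the rows of $b$ as the covariant basis $\vec{p}_j$ of an oblique system $Y$ and the columns of $B=b^{-1}$ as the contravariant basis $\vec{p}^s$ (the values of $b$ are illustrative), the mutual dot products give the Kronecker delta:

```python
import numpy as np

# Rows of b: the covariant basis vectors p_j of an oblique system Y,
# written in a rectangular system X (illustrative values).
b = np.array([[1.0, 0.0],
              [1.0, 2.0]])
B = np.linalg.inv(b)

p_cov = b            # p_j : the j-th row of b
p_con = B.T          # p^s : the s-th column of B, stored as rows

# Reciprocity: p_j . p^k = delta_j^k
assert np.allclose(p_cov @ p_con.T, np.eye(2))
```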

We can now extend the table of transformations given before:

| $X-\bar{X}$ | $Y-X$ | $\bar{Y}-X$ | $Y-\bar{Y}$ |
| --- | --- | --- | --- |
| $X^j=a^j_i\bar{X}^i$ and $\bar{\vec{e}}_j=a^i_j\vec{e}_i$ | $X^j=b^j_iY^i$ and $\vec{\iota}_j=b^i_j\vec{e}_i$ | $X^j=d^j_i\bar{Y}^i$ and $\bar{\vec{\iota}}_j=d^i_j\vec{e}_i$ | $Y^j=c^j_i\bar{Y}^i$ and $\bar{\vec{\iota}}_j=c^i_j\vec{\iota}_i$ |
| $\bar{X}^j=A^j_iX^i$ and $\vec{e}_j=A^i_j\bar{\vec{e}}_i$ | $Y^j=B^j_iX^i$ and $\vec{e}_j=B^i_j\vec{\iota}_i$ | $\bar{Y}^j=D^j_iX^i$ and $\vec{e}_j=D^i_j\bar{\vec{\iota}}_i$ | $\bar{Y}^j=C^j_iY^i$ and $\vec{\iota}_j=C^i_j\bar{\vec{\iota}}_i$ |
| $X_j=A^i_j\bar{X}_i$ and $\bar{\vec{e}}^j=A^j_i\vec{e}^i$ | $X_j=B^i_jY_i$ and $\vec{\iota}^j=B^j_i\vec{e}^i$ | $X_j=D^i_j\bar{Y}_i$ and $\bar{\vec{\iota}}^j=D^j_i\vec{e}^i$ | $Y_j=C^i_j\bar{Y}_i$ and $\bar{\vec{\iota}}^j=C^j_i\vec{\iota}^i$ |
| $\bar{X}_j=a^i_jX_i$ and $\vec{e}^j=a^j_i\bar{\vec{e}}^i$ | $Y_j=b^i_jX_i$ and $\vec{e}^j=b^j_i\vec{\iota}^i$ | $\bar{Y}_j=d^i_jX_i$ and $\vec{e}^j=d^j_i\bar{\vec{\iota}}^i$ | $\bar{Y}_j=c^i_jY_i$ and $\vec{\iota}^j=c^j_i\bar{\vec{\iota}}^i$ |

The relation between the coordinates of a point $P$ in one system $Y$ and in another system $\bar{Y}$ can be described by a vector function such that: \[ \vec{Y}=Y(\bar{\vec{Y}})=(Y^1(\bar{\vec{Y}}),Y^2(\bar{\vec{Y}}),...,Y^n(\bar{\vec{Y}})) \]

The transformation gives rise to $n$ equations, $Y^1 \cdots Y^n$, and each of these equations has $n$ variables which contribute to the transformation. We denote the change contributed by the $k$th variable to the $j$th equation by the partial derivative: \[ \frac{\partial Y^j}{\partial \bar{Y}^k} \] We are now able to formally define a Cartesian vector. In fact we have two types, contravariant and covariant, depending on their coordinate basis vectors.

Let $G^j$ be the components of a contravariant vector in a Cartesian system $Y$; then $G^j\vec{p}_j$ is an invariant form: \[ G^j\vec{p}_j=c^j_k\bar{G}^kC_j^q\bar{\vec{p}}_q=\delta_k^q\bar{G}^k\bar{\vec{p}}_q=\bar{G}^k\bar{\vec{p}}_k \] In particular, with respect to an orthonormal system $\bar{X}$ we have: \[ G^j\vec{p}_j=\bar{G}^j\bar{\vec{e}}_j=\bar{\vec{G}} \]
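The invariance of the form $G^j\vec{p}_j$ can be illustrated numerically; in the NumPy sketch below all matrices and components are illustrative, with the rows of $p$ holding a covariant basis in rectangular coordinates:

```python
import numpy as np

# Covariant basis of Y (rows, in rectangular coordinates) and an
# invertible transformation c from Ybar to Y (illustrative values).
p = np.array([[1.0, 0.0],
              [1.0, 2.0]])
c = np.array([[2.0, 1.0],
              [1.0, 1.0]])

p_bar = c @ p                      # pbar_k = c_k^j p_j
G_bar = np.array([3.0, -2.0])      # contravariant components in Ybar
G = G_bar @ c                      # G^j = c_k^j Gbar^k

# The invariant form G^j p_j is the same arrow in both systems.
assert np.allclose(G @ p, G_bar @ p_bar)
```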

An $n$-dimensional vector can be represented in contravariant components as follows: \[ \vec{G}=G^1\abs{\vec{p}_1}\hat{\vec{p}}_1+G^2\abs{\vec{p}_2}\hat{\vec{p}}_2+...+G^n\abs{\vec{p}_n}\hat{\vec{p}}_n \] with $\hat{\vec{p}}_i$ a vector of unit length and $g_i=G^i\abs{\vec{p}_i}$ the physical components, represented by parallel projections onto the coordinate axes.

A position arrow is clearly not a Cartesian vector, as its transformation rule $r=\bar{r}+r_o$ does not satisfy the definition of a vector. Fortunately, in physical laws we deal only with positional displacements, which are vectors: \[ \vec{r}_{21}=r_{P_2}-r_{P_1} \]

If we use the inverse transformation law we define the covariant vector.

Since the transformation laws are linear the sets of partial derivatives are nothing more than the constant transformation matrices.

Let $G_j$ be the components of a covariant vector in a Cartesian system $Y$; then $G_j\vec{p}^j$ is an invariant form, by the same line of reasoning as for the contravariant vector. In particular, with respect to an orthonormal system $\bar{X}$ we have: \[ G_j\vec{p}^j=\bar{G}_j\bar{\vec{e}}^j=\sum_{j=1}^n\bar{G}_j\bar{\vec{e}}_j=\bar{\vec{G}} \] Let $\vec{G}$ be a covariant vector and take the dot product with $\vec{p}_k$: \[ \vec{p}_k\cdot\vec{G}=G_j\,\vec{p}_k\cdot\vec{p}^j=G_j\delta_k^j=G_k \] From this it follows that $G_k/\abs{\vec{p}_k}$ is the orthogonal projection of $\vec{G}$ onto the coordinate axis $Y^k$.
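As a closing numerical illustration (all values illustrative): the covariant components of an arrow are its dot products with the covariant basis, while the contravariant components, obtained with the reciprocal basis, reassemble the arrow:

```python
import numpy as np

p = np.array([[1.0, 0.0],          # covariant basis p_j (rows, in X coordinates)
              [1.0, 2.0]])
p_recip = np.linalg.inv(p).T       # reciprocal (contravariant) basis p^k as rows

G = np.array([2.0, 3.0])           # an arrow in rectangular coordinates

G_cov = p @ G                      # covariant components:     G_k = p_k . G
G_con = p_recip @ G                # contravariant components: G^k = p^k . G

# The contravariant components reassemble the arrow: G = G^k p_k
assert np.allclose(G_con @ p, G)
```

For an orthonormal basis ($p$ the identity) the covariant and contravariant components coincide, which is why the distinction never arises in a rectangular Cartesian system.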

Copyright ©2013 Jacq Krol. All rights reserved.