Points and coordinates
We’ll be considering the generalisation to \(n\) dimensions (particularly \(n=3\)) of the familiar notions of single-variable differential and integral calculus. This will all be further generalised when we come to the discussion of calculus on manifolds and, with that goal in mind, we’ll try to take a little more care than is perhaps strictly necessary in setting the stage.
Our space will be \(\RR^n\), \(n\)-dimensional Euclidean space. Forget for the moment that this is a vector space and consider it simply as a space of points, \(n\)-tuples such as \(a=(a^1,\dots,a^n)\). The (Cartesian) coordinates on \(\RR^n\) will be denoted by \(x^1,\dots,x^n\) so that the \(i\)th coordinate of the point \(a\) is \(a^i\). When talking about a general (variable) point in \(\RR^n\) we’ll denote it by \(x\) with \(x=(x^1,\dots,x^n)\). Beware though that in two and three dimensions we’ll sometimes also denote the Cartesian coordinates as \(x,y\) and \(x,y,z\) respectively. The \(x^i\)s in \(\RR^n\) and the \(x\), \(y\), and \(z\) in \(\RR^2\) and \(\RR^3\) are best thought of as coordinate functions. So, for example, \(x^i:\RR^n\mapto\RR\) is such that \(x^i(a)=a^i\). When discussing general (variable) points we’re therefore abusing notation, using the same symbol to denote coordinate functions and coordinates. In other words we might come across a notationally undesirable equation such as \(x^i(x)=x^i\). Context should make it clear what is intended. It’s worth noting here that every point of \(\RR^n\) can be specified using a single Cartesian coordinate system.
When we come to consider more general spaces, for example the surface of a sphere, this will not be the case. In such cases we’ll still assign (Cartesian) coordinates to points in our space through coordinate maps which effectively identify coordinate patches of the general space of points with pieces of \(\RR^n\). Where these patches overlap, the coordinate maps must, in a precise mathematical sense, be compatible — we must be able to consistently “sew” the patches together. Spaces which can, in this way, be treated as “locally Euclidean” are important because we can do calculus on functions on these spaces just as we can for functions on \(\RR^n\). We simply exploit the vector space properties of \(\RR^n\) via the coordinate maps. Crucial in this regard is the fact that \(\RR^n\) is a normed vector space, the norm, \(|a|\), of a point \(a=(a^1,\dots,a^n)\) being given by \(|a|=\sqrt{(a^1)^2+\cdots+(a^n)^2}\) so that the distance between two points is given by \(d(a,b)\) where
\begin{equation}
d(a,b)=\sqrt{(b^1-a^1)^2+\cdots+(b^n-a^n)^2}.
\end{equation}
As a vector space, \(\RR^n\) has a standard set of basis vectors, \(e_1,\dots,e_n\), with \(e_i\) typically regarded as a column vector with 1 in the \(i\)th row and zeros everywhere else.
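The coordinate functions, norm and distance introduced above can be made concrete with a minimal Python sketch. The function names (`coordinate`, `norm`, `distance`) and the sample points are chosen here purely for illustration and are not part of the original notes.

```python
import math

def coordinate(i):
    """Return the coordinate function x^i (1-indexed, as in the text)."""
    return lambda a: a[i - 1]

def norm(a):
    """Euclidean norm |a| of a point a = (a^1, ..., a^n)."""
    return math.sqrt(sum(c * c for c in a))

def distance(a, b):
    """Euclidean distance d(a, b) = |b - a|."""
    return norm([bi - ai for ai, bi in zip(a, b)])

# Illustrative points in R^3 (an assumption, not from the notes).
a = (1.0, 2.0, 2.0)
b = (4.0, 6.0, 2.0)

x2 = coordinate(2)      # the coordinate function x^2
print(x2(a))            # x^2(a) = a^2
print(norm(a))
print(distance(a, b))
```

Here `coordinate(2)` plays the role of the function \(x^2\), so `x2(a)` picks out \(a^2\), which is exactly the sense in which \(x^i(a)=a^i\).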
Vectors and the choice of scalar product
Standard treatments of vector calculus exploit the fact that, \(\RR^3\) say, can be simultaneously thought of as a space of points and of vectors. There’s no need to distinguish since we can always “parallel transport” a vector at some point in space back to the origin or, for that matter, to any other point. In such treatments the usual scalar product is typically taken for granted. Personally, I’ve found that this leads to a certain amount of confusion as to the real role of the scalar product, particularly when it comes to, say, discussions of the geometry of spacetime in special relativity. In that case the space is \(\RR^4\) as per the previous section but the scalar product of tangent vectors is crucially not the Euclidean scalar product.
For this reason, we’ll take some care to distinguish the intuitive notion of vectors as “arrows in space” from the underlying space of points. To each point, \(x\), in \(\RR^n\) will be associated a vector space, the tangent space at \(x\), \(T_x(\RR^n)\). This is the space containing all the arrows at the point \(x\) and is, of course, a copy of the vector space \(\RR^n\). In other words our intuitive notion of an arrow between two points \(a\) and \(b\) is treated as an object within the tangent space at \(a\). When dealing with tangent vectors we’ll use boldface. Thus, the standard set of basis vectors in a tangent space, \(T_x(\RR^n)\), will be denoted \(\mathbf{e}_1,\dots,\mathbf{e}_n\), with \(\mathbf{e}_i\) a column vector with 1 in the \(i\)th row and zeros everywhere else. The basis vector \(\mathbf{e}_i\) can be regarded as pointing from \(x\) in the direction of increasing coordinate \(x^i\).
The usual scalar product, also called the dot product, of two vectors \(\mathbf{u}=(u^1,\dots,u^n)\) and \(\mathbf{v}=(v^1,\dots,v^n)\) of \(\RR^n\), is given by
\begin{equation}
\mathbf{u}\cdot\mathbf{v}=\sum_{i=1}^nu^iv^i.
\end{equation}
The dot product is a non-degenerate, symmetric, positive-definite inner product on \(\RR^n\) and allows us to define the length of any vector \(\mathbf{v}\) as
\begin{equation}
|\mathbf{v}|=\sqrt{\mathbf{v}\cdot\mathbf{v}}.
\end{equation}
Thanks to the Cauchy-Schwarz inequality, \(|\mathbf{u}\cdot\mathbf{v}|\leq|\mathbf{u}||\mathbf{v}|\), the angle, \(\theta\in[0,\pi]\), between two non-zero vectors \(\mathbf{u}\) and \(\mathbf{v}\) may be defined by
\begin{equation}
\cos\theta=\frac{\mathbf{u}\cdot\mathbf{v}}{|\mathbf{u}||\mathbf{v}|}.
\end{equation}
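As a small numerical illustration of the dot product, length and angle formulas, here is a Python sketch using only the standard library. The helper names and the two sample vectors are assumptions made for the example.

```python
import math

def dot(u, v):
    """Euclidean scalar product: sum over i of u^i v^i."""
    return sum(ui * vi for ui, vi in zip(u, v))

def length(v):
    """Length |v| = sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def angle(u, v):
    """Angle in [0, pi] between two non-zero vectors."""
    c = dot(u, v) / (length(u) * length(v))
    # Cauchy-Schwarz guarantees |c| <= 1 exactly; the clamp only
    # guards against floating-point rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, c)))

u = (1.0, 0.0, 0.0)
v = (1.0, 1.0, 0.0)
theta = angle(u, v)
print(theta, math.pi / 4)
```

The two printed values should agree to rounding error, since \(\mathbf{v}\) lies along the diagonal of the \(x^1\)-\(x^2\) plane.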
As we’ve mentioned, the Minkowski space-time of special relativity is, as a space of points, \(\RR^4\). However a different choice of scalar product, in this case called a metric, is made, namely, \(\mathbf{u}\cdot\mathbf{v}=-u^0v^0+u^1v^1+u^2v^2+u^3v^3\).
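To see how this choice differs from the Euclidean one, consider a simple worked example (the particular vectors are chosen here only for illustration). With components ordered as \((u^0,u^1,u^2,u^3)\), take \(\mathbf{u}=(1,1,0,0)\) and \(\mathbf{v}=(1,0,0,0)\). Then
\begin{equation*}
\mathbf{u}\cdot\mathbf{u}=-1+1+0+0=0,\qquad\mathbf{v}\cdot\mathbf{v}=-1+0+0+0=-1,
\end{equation*}
so a non-zero vector can have vanishing or even negative “length squared”. This scalar product is still symmetric and non-degenerate, but it is not positive-definite.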
In \(\RR^3\), recall that given two vectors \(\mathbf{u}\) and \(\mathbf{v}\), their vector product, \(\mathbf{u}\times\mathbf{v}\), with respect to Cartesian basis vectors is defined as,
\begin{equation}
\mathbf{u}\times\mathbf{v}=(u^2v^3-u^3v^2)\mathbf{e}_1-(u^1v^3-u^3v^1)\mathbf{e}_2+(u^1v^2-u^2v^1)\mathbf{e}_3,
\end{equation}
which can be conveniently remembered as a determinant,
\begin{equation}
\mathbf{u}\times\mathbf{v}=\det\begin{pmatrix}
\mathbf{e}_1&\mathbf{e}_2&\mathbf{e}_3\\
u^1&u^2&u^3\\
v^1&v^2&v^3
\end{pmatrix}.
\end{equation}
Alternatively, using the summation convention,
\begin{equation}
(\mathbf{u}\times\mathbf{v})^i=\epsilon^i_{jk}u^jv^k,
\end{equation}
where \(\epsilon^i_{jk}=\delta^{il}\epsilon_{ljk}\) is the Levi-Civita symbol. Note that the distinction between upper and lower indices is not important here but in more general contexts it will become so and therefore here we choose to take more care than is really necessary. The Levi-Civita symbol is given by
\begin{align*}
\epsilon_{123}=\epsilon_{231}=\epsilon_{312}&=1\\
\epsilon_{213}=\epsilon_{132}=\epsilon_{321}&=-1
\end{align*}
and zero in all other cases. Recall that the Levi-Civita symbol satisfies the useful relations,
\begin{equation}
\epsilon_{ijk}\epsilon_{ipq}=\delta_{jp}\delta_{kq}-\delta_{jq}\delta_{kp}
\end{equation}
and
\begin{equation}
\epsilon_{ijk}\epsilon_{ijq}=2\delta_{kq},
\end{equation}
where summation over repeated indices is understood.
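The index expressions above can be checked by brute force. The following Python sketch builds the Levi-Civita symbol, computes the vector product via \(\epsilon_{ijk}u^jv^k\), and tests both contraction identities over all index values. The closed-form expression used for \(\epsilon_{ijk}\) and all function names are choices made for this example, not taken from the notes.

```python
from itertools import product

IDX = (1, 2, 3)  # index values, 1-based as in the text

def levi_civita(i, j, k):
    """epsilon_{ijk} for i, j, k in {1, 2, 3}.

    The product (i-j)(j-k)(k-i) is +2 for even permutations of (1,2,3),
    -2 for odd ones and 0 when an index repeats.
    """
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def cross(u, v):
    """(u x v)^i = epsilon_{ijk} u^j v^k, summed over j and k."""
    return tuple(
        sum(levi_civita(i, j, k) * u[j - 1] * v[k - 1]
            for j in IDX for k in IDX)
        for i in IDX
    )

def check_identities():
    """Verify both epsilon contraction identities for every free index."""
    for j, k, p, q in product(IDX, repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(i, p, q) for i in IDX)
        if lhs != delta(j, p) * delta(k, q) - delta(j, q) * delta(k, p):
            return False
    for k, q in product(IDX, repeat=2):
        lhs = sum(levi_civita(i, j, k) * levi_civita(i, j, q)
                  for i in IDX for j in IDX)
        if lhs != 2 * delta(k, q):
            return False
    return True

print(cross((1, 0, 0), (0, 1, 0)))
print(check_identities())
```

Working with integer components keeps every comparison exact, so no floating-point tolerance is needed in the checks.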
Geometrically, the length \(|\mathbf{u}\times\mathbf{v}|\) is the area of the parallelogram with adjacent sides \(\mathbf{u}\) and \(\mathbf{v}\), namely \(|\mathbf{u}||\mathbf{v}|\sin\theta\), and the direction of \(\mathbf{u}\times\mathbf{v}\) is normal to the plane of those vectors. Of the two possible normal directions, the right hand rule gives the correct one.
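The combination \((\mathbf{u}\times\mathbf{v})\cdot\mathbf{w}\) is called the triple product. It is the signed volume of the parallelepiped with edges \(\mathbf{u}\), \(\mathbf{v}\) and \(\mathbf{w}\), that is, the product of the base area \(|\mathbf{u}\times\mathbf{v}|\) and the signed height \(\mathbf{w}\cdot\hat{\mathbf{n}}\), where \(\hat{\mathbf{n}}=\mathbf{u}\times\mathbf{v}/|\mathbf{u}\times\mathbf{v}|\); the sign is positive when \(\mathbf{w}\) lies on the same side of the base as \(\hat{\mathbf{n}}\). It has the property that permuting the three vectors cyclically doesn’t affect its value,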
\begin{equation*}
(\mathbf{u}\times\mathbf{v})\cdot\mathbf{w}=(\mathbf{v}\times\mathbf{w})\cdot\mathbf{u}=(\mathbf{w}\times\mathbf{u})\cdot\mathbf{v},
\end{equation*}
and also that
\begin{equation*}
(\mathbf{u}\times\mathbf{v})\cdot\mathbf{w}=\mathbf{u}\cdot(\mathbf{v}\times\mathbf{w}).
\end{equation*}
Both of these follow immediately from the observation that
\begin{equation}
(\mathbf{u}\times\mathbf{v})\cdot\mathbf{w}=\delta^{il}\epsilon_{ljk}u^jv^kw^i.
\end{equation}
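The symmetry properties of the triple product are easy to confirm numerically. The sketch below uses integer vectors chosen arbitrarily for illustration; `cross`, `dot` and `triple` are hypothetical helper names.

```python
def cross(u, v):
    """Vector product in Cartesian components (0-based tuples)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Euclidean scalar product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def triple(u, v, w):
    """Triple product (u x v) . w."""
    return dot(cross(u, v), w)

# Arbitrary sample vectors (an assumption for the example).
u, v, w = (1, 2, 3), (0, 1, 4), (5, 6, 0)

print(triple(u, v, w), triple(v, w, u), triple(w, u, v))  # cyclic permutations
print(dot(u, cross(v, w)))                                # dot and cross exchanged
print(triple(v, u, w))                                    # swapping two vectors
```

The first four numbers printed coincide, while the last has the opposite sign, reflecting the antisymmetry of \(\epsilon_{ljk}\) under exchange of any two indices.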
A useful formula relating the cross and scalar products is
\begin{equation}
\mathbf{u}\times(\mathbf{v}\times\mathbf{w})=(\mathbf{u}\cdot\mathbf{w})\mathbf{v}-(\mathbf{u}\cdot\mathbf{v})\mathbf{w}.
\end{equation}
This relationship is established as follows.
\begin{align*}
(\mathbf{u}\times(\mathbf{v}\times\mathbf{w}))^i&=\epsilon^i_{jk}u^j(\mathbf{v}\times\mathbf{w})^k\\
&=\epsilon^i_{jk}\epsilon^k_{lm}u^jv^lw^m\\
&=\epsilon^k_{ij}\epsilon^k_{lm}u^jv^lw^m\\
&=(\delta_{il}\delta_{jm}-\delta_{im}\delta_{jl})u^jv^lw^m\\
&=(\mathbf{u}\cdot\mathbf{w})v^i-(\mathbf{u}\cdot\mathbf{v})w^i
\end{align*}
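As a sanity check on the index manipulation, both sides of the identity can be compared for concrete vectors. This is only a spot check under the assumption of the sample vectors below, not a substitute for the derivation; the function names are illustrative.

```python
def cross(u, v):
    """Vector product in Cartesian components (0-based tuples)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Euclidean scalar product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def double_cross_lhs(u, v, w):
    """Left-hand side: u x (v x w)."""
    return cross(u, cross(v, w))

def double_cross_rhs(u, v, w):
    """Right-hand side: (u . w) v - (u . v) w."""
    uw, uv = dot(u, w), dot(u, v)
    return tuple(uw * vi - uv * wi for vi, wi in zip(v, w))

# Arbitrary integer vectors, so the comparison is exact.
u, v, w = (1, 2, 3), (0, 1, 4), (5, 6, 0)

print(double_cross_lhs(u, v, w))
print(double_cross_rhs(u, v, w))
```

Both lines print the same triple of integers for these vectors.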
A First Look at Curvilinear Coordinate Systems
In \(\RR^2\), a point whose Cartesian coordinates are \((x,y)\) could also be identified by its polar coordinates, \((r,\theta)\), where \(r\) is the length of the point’s position vector and \(\theta\) the angle, measured anticlockwise, from the positive \(x\)-axis (as given by the vector \((1,0)\)) to the position vector. In fact what we are doing here is putting a subset of points of \(\RR^2\), \(\RR^2\) minus the origin (since the polar coordinates of the origin are not well defined), into 1-1 correspondence with a subset of points, \((0,\infty)\times[0,2\pi)\), of another copy of \(\RR^2\). We have a pair of coordinate functions, \(r:\RR^2\mapto\RR\) and \(\theta:\RR^2\mapto\RR\), such that \(r(x,y)=r=\sqrt{x^2+y^2}\) and \(\theta(x,y)=\theta\) with \(\tan\theta=y/x\), the branch being chosen according to the quadrant of \((x,y)\) so that \(\theta\in[0,2\pi)\) (the bare formula \(\tan^{-1}(y/x)\) is only correct in the first quadrant). Note again the unfortunate notation here: \(r\) and \(\theta\) are being used to denote coordinate functions as well as the coordinates (real numbers) themselves.
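A minimal sketch of the polar coordinate functions in Python follows. It uses the standard library’s two-argument arctangent, `math.atan2`, which resolves the quadrant automatically, and then shifts the result into \([0,2\pi)\). The function names `to_polar` and `to_cartesian` are assumptions made for the example.

```python
import math

def to_polar(x, y):
    """Map (x, y) != (0, 0) to (r, theta) in (0, inf) x [0, 2*pi)."""
    if x == 0 and y == 0:
        raise ValueError("polar coordinates are not defined at the origin")
    r = math.hypot(x, y)
    # atan2 returns an angle in [-pi, pi]; the modulo moves
    # negative angles into [0, 2*pi).
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta

def to_cartesian(r, theta):
    """Inverse map: x = r cos(theta), y = r sin(theta)."""
    return r * math.cos(theta), r * math.sin(theta)

r, theta = to_polar(-1.0, -1.0)   # a third-quadrant point
print(r, theta)
print(to_cartesian(r, theta))     # round trip back to (x, y)
```

For the third-quadrant point used here a naive \(\tan^{-1}(y/x)\) would return \(\pi/4\), whereas the intended angle is \(5\pi/4\).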
Coordinates at a point give rise to basis vectors for the tangent space at that point. We’ll discuss this more rigorously later, but the basic idea is simple. Take polar coordinates as an example. If we invert the 1-1 coordinate maps, \(r=r(x,y)\) and \(\theta=\theta(x,y)\) to obtain functions \(x=x(r,\theta)\) and \(y=y(r,\theta)\) then we may consider the two coordinate curves through any point \(P(x,y)\) obtained by holding in turn \(r\) and \(\theta\) fixed whilst allowing the other to vary. The tangent vectors at \(P\) to these curves are then the basis vectors corresponding to the coordinates being varied. Let’s consider some particular examples, for which the construction is geometrically straightforward.
In the case of \(\RR^2\), consider polar coordinates at a point \(P(x,y)\).
Then we have \(x=r\cos\theta\) and \(y=r\sin\theta\). Corresponding to the \(r\) and \(\theta\) coordinates are basis vectors \(\mathbf{e}_r\) and \(\mathbf{e}_\theta\) at \(P\), pointing respectively in the directions obtained by increasing the \(r\)-coordinate holding the \(\theta\)-coordinate fixed and increasing the \(\theta\)-coordinate holding the \(r\)-coordinate fixed. We can use the scalar product to compute the relationship between the Cartesian and polar basis vectors according to,
\begin{equation}
\mathbf{e}_r=(\mathbf{e}_r\cdot\mathbf{e}_x)\mathbf{e}_x+(\mathbf{e}_r\cdot\mathbf{e}_y)\mathbf{e}_y,
\end{equation}
and
\begin{equation}
\mathbf{e}_\theta=(\mathbf{e}_\theta\cdot\mathbf{e}_x)\mathbf{e}_x+(\mathbf{e}_\theta\cdot\mathbf{e}_y)\mathbf{e}_y,
\end{equation}
which, assuming \(\mathbf{e}_r\) and \(\mathbf{e}_\theta\) to be of unit length, result in the relations,
\begin{align}
\mathbf{e}_r&=\cos\theta\mathbf{e}_x+\sin\theta\mathbf{e}_y\\
\mathbf{e}_\theta&=-\sin\theta\mathbf{e}_x+\cos\theta\mathbf{e}_y.
\end{align}
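The same relations can be recovered from the coordinate-curve construction described earlier, which may serve as a useful cross-check. Differentiating the position \((x,y)=(r\cos\theta,r\sin\theta)\) with respect to each polar coordinate, holding the other fixed, gives the tangent vectors to the two coordinate curves,
\begin{align*}
\frac{\partial}{\partial r}(r\cos\theta,\,r\sin\theta)&=(\cos\theta,\,\sin\theta),\\
\frac{\partial}{\partial\theta}(r\cos\theta,\,r\sin\theta)&=(-r\sin\theta,\,r\cos\theta),
\end{align*}
which have lengths \(1\) and \(r\) respectively. Dividing the second by \(r\) yields the unit vectors \(\mathbf{e}_r\) and \(\mathbf{e}_\theta\) given above, and their scalar product, \(-\cos\theta\sin\theta+\sin\theta\cos\theta=0\), confirms that they are orthogonal.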
In three dimensional space, the cylindrical coordinates of a point, \((\rho,\varphi,z)\), are related to its Cartesian coordinates by,
\begin{equation}
x=\rho\cos\varphi,\quad y=\rho\sin\varphi,\quad z=z,
\end{equation}
and it’s not difficult to check that the unit basis vectors defined at any point by the cylindrical coordinate system are related to the Cartesian basis vectors as,
\begin{align}
\mathbf{e}_\rho&=\cos\varphi\mathbf{e}_x+\sin\varphi\mathbf{e}_y\\
\mathbf{e}_\varphi&=-\sin\varphi\mathbf{e}_x+\cos\varphi\mathbf{e}_y\\
\mathbf{e}_z&=\mathbf{e}_z.
\end{align}
The spherical polar coordinates of a point, \((r,\theta,\varphi)\), are related to its Cartesian coordinates by,
\begin{equation}
x=r\cos\varphi\sin\theta,\quad y=r\sin\varphi\sin\theta,\quad z=r\cos\theta.
\end{equation}
To relate the unit basis vectors of the spherical polar coordinate system to the Cartesian basis vectors it is easiest to first express them in terms of the cylindrical basis vectors as,
\begin{align*}
\mathbf{e}_r&=\sin\theta\mathbf{e}_\rho+\cos\theta\mathbf{e}_z\\
\mathbf{e}_\theta&=\cos\theta\mathbf{e}_\rho-\sin\theta\mathbf{e}_z\\
\mathbf{e}_\varphi&=\mathbf{e}_\varphi,
\end{align*}
so that,
\begin{align}
\mathbf{e}_r&=\sin\theta\cos\varphi\mathbf{e}_x+\sin\theta\sin\varphi\mathbf{e}_y+\cos\theta\mathbf{e}_z\\
\mathbf{e}_\theta&=\cos\theta\cos\varphi\mathbf{e}_x+\cos\theta\sin\varphi\mathbf{e}_y-\sin\theta\mathbf{e}_z\\
\mathbf{e}_\varphi&=-\sin\varphi\mathbf{e}_x+\cos\varphi\mathbf{e}_y.
\end{align}
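A quick numerical check that these three vectors form a right-handed orthonormal set at a given point can be sketched as follows. The function names, the tolerance and the sample angles are assumptions made for the example.

```python
import math

def spherical_basis(theta, phi):
    """Cartesian components of (e_r, e_theta, e_phi) at angles theta, phi."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_r = (st * cp, st * sp, ct)
    e_theta = (ct * cp, ct * sp, -st)
    e_phi = (-sp, cp, 0.0)
    return e_r, e_theta, e_phi

def dot(u, v):
    """Euclidean scalar product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Vector product in Cartesian components."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def is_right_handed_orthonormal(theta, phi, tol=1e-12):
    """Check e_i . e_j = delta_ij and e_r x e_theta = e_phi."""
    e = spherical_basis(theta, phi)
    for i in range(3):
        for j in range(3):
            expected = 1.0 if i == j else 0.0
            if abs(dot(e[i], e[j]) - expected) > tol:
                return False
    c = cross(e[0], e[1])
    return all(abs(ci - ei) <= tol for ci, ei in zip(c, e[2]))

print(is_right_handed_orthonormal(0.7, 2.1))
```

The ordering \((r,\theta,\varphi)\) matters here: with it, \(\mathbf{e}_r\times\mathbf{e}_\theta=\mathbf{e}_\varphi\), whereas exchanging \(\theta\) and \(\varphi\) would reverse the handedness.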