
The Hodge Dual

In this section we will assume \(V\) is a real \(n\)-dimensional vector space with a symmetric non-degenerate inner product (metric), \(g(\cdot,\cdot):V\times V\mapto\RR\). In such a vector space we can always choose an orthonormal basis, \(\{e_i\}\), and know from the classification result, Theorem~\ref{in prod class}, that such spaces are characterised up to isometry by a pair of integers, \((n,s)\), where \(s\) is the number of \(e_i\) such that \(g(e_i,e_i)=-1\).

We have seen that the dimensions of the spaces \(\Lambda^r(V)\) are given by the binomial coefficients, \({n \choose r}\). In particular, simply by virtue of having the same dimension, this means that the spaces \(\Lambda^r(V)\) and \(\Lambda^{n-r}(V)\) are isomorphic. In fact, as we shall see, the metric allows us to establish an essentially natural isomorphism between these spaces called Hodge duality.

Take any pair of pure \(r\)-vectors in \(\Lambda^r(V)\), \(\alpha=v_1\wedge\dots\wedge v_r\) and \(\beta=w_1\wedge\dots\wedge w_r\), with \(v_i,w_i\in V\). Then we can define an inner product on \(\Lambda^r(V)\) as
\begin{equation}
(\alpha,\beta)=\det(g(v_i,w_j)),
\end{equation}
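where \(g(v_i,w_j)\) is regarded as the \(ij\)th entry of an \(r\times r\) matrix, and extended bilinearly to the whole of \(\Lambda^r(V)\). Since the determinant of a matrix and its transpose are identical, the inner product is symmetric. Given our orthonormal basis, \(\{e_i\}\), of \(V\), consider the inner product of the corresponding basis elements, \(e_{i_1}\wedge\dots\wedge e_{i_r}\), where \(1\leq i_1<\dots<i_r\leq n\). For two such elements the matrix \(g(e_{i_k},e_{j_l})\) has a zero row unless the two index sets coincide, in which case it is diagonal with entries \(\pm1\). These basis elements are therefore orthonormal, and the inner product on \(\Lambda^r(V)\) is non-degenerate.

Example Take the single basis vector of \(\Lambda^n(V)\) to be \(\sigma=e_1\wedge\dots\wedge e_n\), then \((\sigma,\sigma)=(-1)^s\).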
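
As a quick illustration of the determinant formula, take \(V=\RR^3\) with the usual inner product and standard orthonormal basis \(\{e_i\}\) (a choice made here purely for illustration). Then
\begin{align*}
(e_1\wedge e_2,e_1\wedge e_2)&=\det\begin{pmatrix}g(e_1,e_1)&g(e_1,e_2)\\g(e_2,e_1)&g(e_2,e_2)\end{pmatrix}=\det\begin{pmatrix}1&0\\0&1\end{pmatrix}=1,\\
(e_1\wedge e_2,e_1\wedge e_3)&=\det\begin{pmatrix}g(e_1,e_1)&g(e_1,e_3)\\g(e_2,e_1)&g(e_2,e_3)\end{pmatrix}=\det\begin{pmatrix}1&0\\0&0\end{pmatrix}=0,
\end{align*}
and for arbitrary \(u,v\in V\) the same formula gives the Gram determinant, \((u\wedge v,u\wedge v)=g(u,u)g(v,v)-g(u,v)^2\).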

Now whenever we have a symmetric non-degenerate inner product on some space \(U\), there is a natural isomorphism, \(U\cong U^*\), which associates to every linear functional, \(f\), on \(U\) a unique vector, \(v_f\in U\), such that \(f(u)=(v_f,u)\) for all \(u\in U\). Choose a normalised basis vector, \(\sigma\), for \(\Lambda^n(V)\) and notice that to any \(\lambda\in\Lambda^r(V)\) is associated a linear functional on \(\Lambda^{n-r}(V)\), \(f_\lambda\), according to \(\lambda\wedge\mu=f_\lambda(\mu)\sigma\). But to \(f_\lambda\) we can uniquely associate an element of \(\Lambda^{n-r}(V)\), call it \(\star\lambda\), according to \(f_\lambda(\mu)=(\star\lambda,\mu)\). \(\star\lambda\) is called the Hodge dual of \(\lambda\) and we may write,
\begin{equation}
\lambda\wedge\mu=(\star\lambda,\mu)\sigma.
\end{equation}
As a map, \(\star:\Lambda^r(V)\mapto\Lambda^{n-r}(V)\) is clearly linear.

Example Consider the 2-dimensional vector space \(\RR^2\) with the usual inner (scalar) product which we’ll here denote \(g(\cdot,\cdot)\). Denoting its standard basis vectors by \(\mathbf{e}_1\) and \(\mathbf{e}_2\), we have \(g(\mathbf{e}_i,\mathbf{e}_j)=\delta_{ij}\) and a basis for \(\Lambda^2(\RR^2)\) is \(\mathbf{e}_1\wedge\mathbf{e}_2\) with \((\mathbf{e}_1\wedge\mathbf{e}_2,\mathbf{e}_1\wedge\mathbf{e}_2)=1\). Taking this basis vector as \(\sigma\), we must then have
\begin{equation}
\star1=\mathbf{e}_1\wedge\mathbf{e}_2,
\end{equation}
and
\begin{equation}
\star(\mathbf{e}_1\wedge\mathbf{e}_2)=1.
\end{equation}
\(\star\mathbf{e}_1\) must be such that \((\star\mathbf{e}_1,\mathbf{e}_1)=0\) and \((\star\mathbf{e}_1,\mathbf{e}_2)=1\), that is,
\begin{equation}
\star\mathbf{e}_1=\mathbf{e}_2,
\end{equation}
and \(\star\mathbf{e}_2\) must be such that \((\star\mathbf{e}_2,\mathbf{e}_1)=-1\) and \((\star\mathbf{e}_2,\mathbf{e}_2)=0\), so
\begin{equation}
\star\mathbf{e}_2=-\mathbf{e}_1.
\end{equation}
Notice that if we had chosen \(\mathbf{e}_2\wedge\mathbf{e}_1=-\mathbf{e}_1\wedge\mathbf{e}_2\) as the basis for \(\Lambda^2(\RR^2)\) then \(\star1=-\mathbf{e}_1\wedge\mathbf{e}_2\), \(\star(-\mathbf{e}_1\wedge\mathbf{e}_2)=1\), \(\star\mathbf{e}_1=-\mathbf{e}_2\) and \(\star\mathbf{e}_2=\mathbf{e}_1\).

Given two bases of a vector space \(V\), \(\{e_i\}\) and \(\{f_i\}\), we say that they share the same orientation if the determinant of the change of basis matrix relating them is positive. Bases of \(V\) thus belong to one of two equivalence classes. From a slightly different perspective, given the bases \(\{e_i\}\) and \(\{f_i\}\) we can form the vectors \(e_1\wedge\dots\wedge e_n\) and \(f_1\wedge\dots\wedge f_n\), both of which belong to the 1-dimensional space \(\Lambda^n(V)\), and so we must have
\begin{equation}
f_1\wedge\dots\wedge f_n=ce_1\wedge\dots\wedge e_n.
\end{equation}
We know that we must be able to express the \(f_i\) in terms of the \(e_i\) as \(f_i=T_i^je_j\) where \(T_i^j\) are the elements of the change of basis linear operator defined by \(Te_i=f_i\). But we know that,
\begin{equation}
f_1\wedge\dots\wedge f_n=T^{\wedge n}(e_1\wedge\dots\wedge e_n)=(\det T)\,e_1\wedge\dots\wedge e_n,
\end{equation}
so \(c=\det T\). In other words, given a basis \(\{e_i\}\) of \(V\), another basis \(\{f_i\}\) shares the same orientation if the corresponding top exterior products are related by a positive constant. The Hodge dual thus depends on both the metric and the orientation of a given vector space.
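
For instance, in \(\RR^2\) with standard basis \(\{\mathbf{e}_1,\mathbf{e}_2\}\), the (purely illustrative) bases \(f_1=\mathbf{e}_2\), \(f_2=\mathbf{e}_1\) and \(f'_1=\mathbf{e}_2\), \(f'_2=-\mathbf{e}_1\) give
\begin{align*}
f_1\wedge f_2&=\mathbf{e}_2\wedge\mathbf{e}_1=-\mathbf{e}_1\wedge\mathbf{e}_2,\\
f'_1\wedge f'_2&=-\mathbf{e}_2\wedge\mathbf{e}_1=\mathbf{e}_1\wedge\mathbf{e}_2,
\end{align*}
so that the swap (\(\det T=-1\)) reverses the orientation while the quarter-turn rotation (\(\det T=1\)) preserves it. The first case is the change of sign already seen at the end of the \(\RR^2\) example.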

Example Consider the 3-dimensional space \(\RR^3\) equipped with the usual inner product, with standard basis vectors \(\mathbf{e}_1\), \(\mathbf{e}_2\) and \(\mathbf{e}_3\) and \(\mathbf{e}_1\wedge\mathbf{e}_2\wedge\mathbf{e}_3\) as our preferred top exterior product. Then,
\begin{align}
\star1&=\mathbf{e}_1\wedge\mathbf{e}_2\wedge\mathbf{e}_3\\
\star\mathbf{e}_1&=\mathbf{e}_2\wedge\mathbf{e}_3\\
\star\mathbf{e}_2&=\mathbf{e}_3\wedge\mathbf{e}_1\\
\star\mathbf{e}_3&=\mathbf{e}_1\wedge\mathbf{e}_2\\
\star(\mathbf{e}_1\wedge\mathbf{e}_2)&=\mathbf{e}_3\\
\star(\mathbf{e}_2\wedge\mathbf{e}_3)&=\mathbf{e}_1\\
\star(\mathbf{e}_3\wedge\mathbf{e}_1)&=\mathbf{e}_2\\
\star(\mathbf{e}_1\wedge\mathbf{e}_2\wedge\mathbf{e}_3)&=1.
\end{align}
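
As a sketch of how these relations are used, take \(u=u^i\mathbf{e}_i\) and \(v=v^i\mathbf{e}_i\) in \(\RR^3\) (the component names \(u^i,v^i\) and the notation \(u\times v\) for the usual cross product are introduced here only for illustration). Expanding the wedge product and applying \(\star\) term by term using the table above gives
\begin{align*}
u\wedge v&=(u^2v^3-u^3v^2)\,\mathbf{e}_2\wedge\mathbf{e}_3+(u^3v^1-u^1v^3)\,\mathbf{e}_3\wedge\mathbf{e}_1+(u^1v^2-u^2v^1)\,\mathbf{e}_1\wedge\mathbf{e}_2,\\
\star(u\wedge v)&=(u^2v^3-u^3v^2)\,\mathbf{e}_1+(u^3v^1-u^1v^3)\,\mathbf{e}_2+(u^1v^2-u^2v^1)\,\mathbf{e}_3=u\times v.
\end{align*}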

Let us now establish some general properties of the Hodge dual. We take an orthonormal basis of the \(n\)-dimensional space \(V\) to be \(\{e_i\}\) with top exterior form \(\sigma=ae_1\wedge\dots\wedge e_n\) with \(a=\pm1\). Then consider the pure \(r\)-vector \(e_I=e_1\wedge\dots\wedge e_r\) (no loss of generality will be incurred choosing \(I=(1,\dots,r)\)), we must then have that
\begin{equation}
\star e_I=ce_{r+1}\wedge\dots\wedge e_n=ce_J,
\end{equation}
for \(c=\pm1\) and \(J=(r+1,\dots,n)\). Of course \(c\) depends on our original choice \(a\) according to,
\begin{equation}
c=a(e_J,e_J).
\end{equation}
Consider now, \(\star e_J\), clearly
\begin{equation}
\star e_J=de_I,
\end{equation}
for some \(d=\pm1\) but since \(e_J\wedge e_I=(-1)^{r(n-r)}e_I\wedge e_J\), we have,
\begin{equation}
d=a(-1)^{r(n-r)}(e_I,e_I).
\end{equation}
We may therefore conclude that,
\begin{equation}
\star\star e_I=(-1)^{r(n-r)}(e_I,e_I)(e_J,e_J)e_I,
\end{equation}
but since \((e_I,e_I)(e_J,e_J)=(\sigma,\sigma)=(-1)^s\) this is then,
\begin{equation}
\star\star e_I=(-1)^{r(n-r)+s}e_I,
\end{equation}
and by linearity we may conclude that for any \(\lambda\in\Lambda^r(V)\),
\begin{equation}
\star\star\lambda=(-1)^{r(n-r)+s}\lambda.
\end{equation}
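
As a quick check of the sign, in \(\RR^2\) with \(\sigma=\mathbf{e}_1\wedge\mathbf{e}_2\) (\(n=2\), \(r=1\), \(s=0\)) the earlier example gives \(\star\star\mathbf{e}_1=\star\mathbf{e}_2=-\mathbf{e}_1\), in agreement with \((-1)^{1\cdot1+0}=-1\), while in \(\RR^3\) (\(n=3\), \(r=1\), \(s=0\)) we have \(\star\star\mathbf{e}_1=\star(\mathbf{e}_2\wedge\mathbf{e}_3)=\mathbf{e}_1\), in agreement with \((-1)^{1\cdot2+0}=1\). For an indefinite case, assume (purely for illustration) a 2-dimensional space with orthonormal basis \(\{e_0,e_1\}\), \(g(e_0,e_0)=-1\), \(g(e_1,e_1)=1\) and \(\sigma=e_0\wedge e_1\), so that \(s=1\). The defining relation \(\lambda\wedge\mu=(\star\lambda,\mu)\sigma\) requires \((\star e_0,e_1)=1\) and \((\star e_1,e_0)=-1=(e_0,e_0)\), so that
\begin{equation*}
\star e_0=e_1,\qquad\star e_1=e_0,\qquad\star\star e_0=e_0,
\end{equation*}
consistent with \((-1)^{1\cdot1+1}=1\).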

Notice that for \(\lambda,\mu\in\Lambda^r(V)\), \(\lambda\wedge\star\mu=(\star\lambda,\star\mu)\sigma=(\star\mu,\star\lambda)\sigma=\mu\wedge\star\lambda\), that is,
\begin{equation}
\lambda\wedge\star\mu=\mu\wedge\star\lambda.
\end{equation}
But \(\mu\wedge\star\lambda=(-1)^{r(n-r)}\star\lambda\wedge\mu=(-1)^{r(n-r)}(\star\star\lambda,\mu)\sigma=(-1)^s(\lambda,\mu)\sigma\), that is,
\begin{equation}
\lambda\wedge\star\mu=\mu\wedge\star\lambda=(-1)^s(\lambda,\mu)\sigma.
\end{equation}
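
As a check in \(\RR^3\) with \(\sigma=\mathbf{e}_1\wedge\mathbf{e}_2\wedge\mathbf{e}_3\) and \(s=0\), using the duals listed in the \(\RR^3\) example,
\begin{align*}
\mathbf{e}_1\wedge\star\mathbf{e}_1&=\mathbf{e}_1\wedge\mathbf{e}_2\wedge\mathbf{e}_3=(\mathbf{e}_1,\mathbf{e}_1)\sigma,\\
\mathbf{e}_1\wedge\star\mathbf{e}_2&=\mathbf{e}_1\wedge\mathbf{e}_3\wedge\mathbf{e}_1=0=(\mathbf{e}_1,\mathbf{e}_2)\sigma,\\
(\mathbf{e}_1\wedge\mathbf{e}_2)\wedge\star(\mathbf{e}_1\wedge\mathbf{e}_2)&=\mathbf{e}_1\wedge\mathbf{e}_2\wedge\mathbf{e}_3=(\mathbf{e}_1\wedge\mathbf{e}_2,\mathbf{e}_1\wedge\mathbf{e}_2)\sigma.
\end{align*}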

The Determinant Revisited

Suppose \(L:V\mapto V\) is a linear operator and consider the tensor product map \(L^{\otimes r}=L\otimes\dots\otimes L:T^r(V)\mapto T^r(V)\). Then clearly \(L^{\otimes r}\circ A=A\circ L^{\otimes r}\) so that \(L^{\otimes r}|_{\Lambda^r(V)}:\Lambda^r(V)\mapto\Lambda^r(V)\). This restriction is typically denoted \(L^{\wedge r}\). Now, as we’ve already observed, if \(V\) is an \(n\)-dimensional vector space, then \(\dim\Lambda^n(V)=1\). So any \(L^{\wedge n}\) is multiplication by a scalar. Choosing a basis, \(\{e_i\}\), of \(V\), then \(e_1\wedge\dots\wedge e_n\) is the single basis element of \(\Lambda^n(V)\), and if we write, \(Le_i=L_i^je_j\), then
\begin{equation}
L^{\wedge n}(e_1\wedge\dots\wedge e_n)=d_Le_1\wedge\dots\wedge e_n,
\end{equation}
where \(d_L\) is some scalar. But we also have,
\begin{equation}
L^{\wedge n}(e_1\wedge\dots\wedge e_n)=L_1^{i_1}\cdots L_n^{i_n}e_{i_1}\wedge\dots\wedge e_{i_n}.
\end{equation}
Now, the right hand side here is only non-zero when the set of indices \(\{i_1,\dots,i_n\}\) is precisely \(\{1,2,\dots,n\}\) and in this case
\begin{equation}
L_1^{i_1}\cdots L_n^{i_n}e_{i_1}\wedge\dots\wedge e_{i_n}=\sum_{\sigma\in S_n}\sgn(\sigma)L_1^{\sigma_1}\cdots L_n^{\sigma_n}e_1\wedge\dots\wedge e_n,
\end{equation}
in which we see precisely our original definition of the determinant,
so that \(d_L=\det L\).
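
As a worked instance for \(n=2\), writing \(Le_1=L_1^1e_1+L_1^2e_2\) and \(Le_2=L_2^1e_1+L_2^2e_2\), we find
\begin{align*}
L^{\wedge 2}(e_1\wedge e_2)&=(L_1^1e_1+L_1^2e_2)\wedge(L_2^1e_1+L_2^2e_2)\\
&=L_1^1L_2^2\,e_1\wedge e_2+L_1^2L_2^1\,e_2\wedge e_1\\
&=(L_1^1L_2^2-L_1^2L_2^1)\,e_1\wedge e_2,
\end{align*}
which is the familiar \(2\times2\) determinant.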

Tensor Symmetries in Coordinate Representation

If \(T^{i_1\dots i_r}\) are the components of an \((r,0)\) tensor, \(T\), with respect to some basis then the symmetrization of \(T\), \(S(T)\), has components which are conventionally denoted, \(T^{(i_1\dots i_r)}\). That is, by definition,
\begin{equation}
T^{(i_1\dots i_r)}=\frac{1}{r!}\sum_{\sigma\in S_r}T^{i_{\sigma(1)}\dots i_{\sigma(r)}}.
\end{equation}
Similarly, the antisymmetrization of \(T\), \(A(T)\), has components which are conventionally denoted, \(T^{[i_1\dots i_r]}\). That is, by definition,
\begin{equation}
T^{[i_1\dots i_r]}=\frac{1}{r!}\sum_{\sigma\in S_r}\sgn(\sigma)T^{i_{\sigma(1)}\dots i_{\sigma(r)}}.
\end{equation}
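
Written out for the lowest ranks, as a direct expansion of the two definitions above, these read
\begin{align*}
T^{(ij)}&=\frac{1}{2}\left(T^{ij}+T^{ji}\right),\\
T^{[ij]}&=\frac{1}{2}\left(T^{ij}-T^{ji}\right),\\
T^{[ijk]}&=\frac{1}{6}\left(T^{ijk}+T^{jki}+T^{kij}-T^{jik}-T^{kji}-T^{ikj}\right),
\end{align*}
the three cyclic permutations being even and the three transpositions odd.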

Skew-Symmetric Tensors and the Exterior Algebra

A tensor, \(T\in T^r(V)\), is called skew-symmetric if \(P_\sigma(T)=\sgn(\sigma)T\) for all \(\sigma\in S_r\). The subspace in \(T^r(V)\) of all skew-symmetric tensors will be denoted \(\Lambda^r(V)\).

Define on \(T^r(V)\) the linear operator,
\begin{equation}
A=\frac{1}{r!}\sum_{\sigma\in S_r}\sgn(\sigma)P_\sigma.
\end{equation}
This is called the antisymmetrization on \(T^r(V)\). For any \(T\in T^r(V)\), \(A(T)\) is skew-symmetric, since for any \(\tau\in S_r\),
\begin{align*}
P_\tau\left(\frac{1}{r!}\sum_{\sigma\in S_r}\sgn(\sigma)P_\sigma(T)\right)&=\frac{1}{r!}\sum_{\sigma\in S_r}\sgn(\sigma)P_{\tau\sigma}(T)\\
&=\sgn(\tau)\frac{1}{r!}\sum_{\sigma\in S_r}\sgn(\tau\sigma)P_{\tau\sigma}(T)\\
&=\sgn(\tau)\frac{1}{r!}\sum_{\sigma\in S_r}\sgn(\sigma)P_{\sigma}(T)\\
&=\sgn(\tau)A(T).
\end{align*}
Conversely, suppose \(T\) is skew-symmetric, then
\begin{equation}
A(T)=\frac{1}{r!}\sum_{\sigma\in S_r}\sgn(\sigma)P_\sigma(T)=\frac{1}{r!}\sum_{\sigma\in S_r}\sgn(\sigma)^2T=T,
\end{equation}
so that \(\img A=\Lambda^r(V)\) and \(A^2=A\), so that \(A\) is a projector onto \(\Lambda^r(V)\).
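
For \(r=2\) these statements can be checked by hand. On a pure tensor \(u\otimes v\), with \(u,v\in V\),
\begin{align*}
A(u\otimes v)&=\frac{1}{2}(u\otimes v-v\otimes u),\\
A(v\otimes u)&=-A(u\otimes v),\\
A(A(u\otimes v))&=\frac{1}{2}\left(A(u\otimes v)-A(v\otimes u)\right)=A(u\otimes v),
\end{align*}
and in particular \(A(u\otimes u)=0\).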

If \(\{e_i\}\) is a basis for the \(n\)-dimensional vector space, \(V\), then all pure tensors of the form, \(e_{i_1}\otimes\dots\otimes e_{i_r}\), form a basis of \(T^r(V)\). A standard notation is to write,
\begin{equation}
A(e_{i_1}\otimes\dots\otimes e_{i_r})=e_{i_1}\wedge\dots\wedge e_{i_r}.
\end{equation}
The symbol \(\wedge\) is called the exterior or wedge product. Since by definition, two pure tensors,
\begin{equation*}
e_{i_1}\otimes\dots\otimes e_{i_j}\otimes\dots\otimes e_{i_k}\otimes\dots\otimes e_{i_r},
\end{equation*}
and
\begin{equation*}
e_{i_1}\otimes\dots\otimes e_{i_k}\otimes\dots\otimes e_{i_j}\otimes\dots\otimes e_{i_r},
\end{equation*}
differing only by the interchange of the pair \(e_{i_j}\) and \(e_{i_k}\), are related by a permutation \(\sigma\) with \(\sgn(\sigma)=-1\), we have,
\begin{equation*}
e_{i_1}\wedge\dots\wedge e_{i_j}\wedge\dots\wedge e_{i_k}\wedge\dots\wedge e_{i_r}=-e_{i_1}\wedge\dots\wedge e_{i_k}\wedge\dots\wedge e_{i_j}\wedge\dots\wedge e_{i_r}.
\end{equation*}
In particular, if \(i_j=i_k\) for some \(j\neq k\), then \(e_{i_1}\wedge\dots\wedge e_{i_r}=0\). It also follows that \(\Lambda^r(V)\) is spanned by tensors of the form, \(e_{i_1}\wedge\dots\wedge e_{i_r}\), such that, \(1\leq i_1<\dots<i_r\leq n\), and in particular that \(\Lambda^r(V)=0\) for \(r>n\). But these are also clearly linearly independent, since distinct \(e_{i_1}\wedge\dots\wedge e_{i_r}\) are linear combinations of non-intersecting subsets of basis elements of \(T^r(V)\). It follows then that,
\begin{equation}
\dim\Lambda^r(V)={n\choose r},
\end{equation}
with \(\dim\Lambda^n(V)=1\). We define,
\begin{equation}
\Lambda(V)=\bigoplus_{r=0}^n\Lambda^r(V),
\end{equation}
(\(\dim\Lambda(V)=2^n\)) and introduce a multiplication according to \(T_1\wedge T_2=A(T_1\otimes T_2)\) for any \(T_1\in\Lambda^r(V)\) and \(T_2\in\Lambda^s(V)\). Then for any \(T_1\in\Lambda^r(V)\), \(T_2\in\Lambda^s(V)\) and \(T_3\in\Lambda^t(V)\),
\begin{align*}
(T_1\wedge T_2)\wedge T_3&=A(A(T_1\otimes T_2)\otimes T_3)\\
&=A\left(\frac{1}{(r+s)!}\sum_{\sigma\in S_{r+s}}\sgn(\sigma)P_\sigma(T_1\otimes T_2)\otimes T_3\right)\\
&=\frac{1}{(r+s)!}\sum_{\sigma\in S_{r+s}}\sgn(\sigma)A(P_\sigma(T_1\otimes T_2)\otimes T_3)\\
&=\frac{1}{(r+s)!}\sum_{\sigma\in S_{r+s}}\sgn(\sigma)^2A(T_1\otimes T_2\otimes T_3)\\
&=A(T_1\otimes T_2\otimes T_3),
\end{align*}
and similarly for \(T_1\wedge(T_2\wedge T_3)=A(T_1\otimes T_2\otimes T_3)\). We conclude, therefore, that the wedge product is associative. Also, for any \(T_1\in T^r(V)\) and \(T_2\in T^s(V)\), we have, \(A(T_1\otimes T_2)=(-1)^{rs}A(T_2\otimes T_1)\) so, in particular, \(T_1\wedge T_2=(-1)^{rs}T_2\wedge T_1\), for any \(T_1\in\Lambda^r(V)\) and \(T_2\in\Lambda^s(V)\).
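
To make this concrete, take \(n=3\) (chosen only for illustration). Bases of the non-zero \(\Lambda^r(V)\) are then
\begin{align*}
\Lambda^0(V)&:\ 1,\\
\Lambda^1(V)&:\ e_1,\ e_2,\ e_3,\\
\Lambda^2(V)&:\ e_1\wedge e_2,\ e_1\wedge e_3,\ e_2\wedge e_3,\\
\Lambda^3(V)&:\ e_1\wedge e_2\wedge e_3,
\end{align*}
so that \(\dim\Lambda(V)=1+3+3+1=2^3\). Here, for example, \(e_1\wedge e_2=\frac{1}{2}(e_1\otimes e_2-e_2\otimes e_1)\), and the sign rule with \(r=1\), \(s=2\) gives \(e_1\wedge(e_2\wedge e_3)=(-1)^{1\cdot2}(e_2\wedge e_3)\wedge e_1=e_2\wedge e_3\wedge e_1\), as expected for a cyclic (even) permutation of the factors.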

As with the symmetric algebra, let us now realise \(\Lambda(V)\) as a quotient of the tensor algebra \(T(V)\).

Definition The exterior algebra on the vector space \(V\) over the field \(K\) is the quotient \(T(V)/J\) of the tensor algebra \(T(V)\) by the ideal \(J\) generated by the elements \(v\otimes v\) for all \(v\in V\).

As in the symmetric algebra case, we define \(J^r=T^r(V)\cap J\) so that \(J=\bigoplus_{r=0}^\infty J^r\). Then defining, \(\tilde{\Lambda}(V)=T(V)/J\), it follows as before that, \(\tilde{\Lambda}(V)=\bigoplus_{r=0}^\infty T^r(V)/J^r\). Thus we define, \(\tilde{\Lambda}^r(V)=T^r(V)/J^r\), and seek to relate this to \(\Lambda^r(V)\) defined above.

An alternative definition of the ideal \(J\), is as the ideal generated by the elements, \(u\otimes v+v\otimes u\), for any \(u,v\in V\). The equivalence of these definitions amounts to observing that for any \(u,v\in V\),
\begin{equation*}
(u+v)\otimes(u+v)-u\otimes u-v\otimes v=u\otimes v+v\otimes u.
\end{equation*}
Then, by an argument similar to the one we used in the symmetric case, this ideal is equivalent to the ideal generated by \(T-\sgn(\sigma)P_\sigma(T)\) for all \(T\in T^r(V)\) and any \(\sigma\in S_r\). Once again abusing notation, we’ll denote the product of two elements of \(\tilde{\Lambda}(V)\), \((T_1+J)(T_2+J)\), as \(T_1\wedge T_2\) rather than \(T_1\otimes T_2+J\). Thus the image in, \(\tilde{\Lambda}(V)\), of some pure tensor, \(v_1\otimes\dots\otimes v_r\in T^r(V)\), is denoted \(v_1\wedge\dots\wedge v_r\). Then since, \(T-\sgn(\sigma)P_\sigma(T)\in J\), it follows that, \(T_1\wedge T_2=(-1)^{rs}T_2\wedge T_1\), for any \(T_1\in\tilde{\Lambda}^r(V)\) and \(T_2\in\tilde{\Lambda}^s(V)\).

Just as in the symmetric case, skew-symmetric tensors and the exterior algebra inherit universal properties from the tensor product and tensor algebra respectively. The proofs follow those of the symmetric case.

Proposition If \(\iota\) is the \(r\)-linear function, \(\iota:V\times\dots\times V\mapto\tilde{\Lambda}^r(V)\), defined as, \(\iota(v_1,\dots,v_r)=v_1\wedge\dots\wedge v_r\), then \((\tilde{\Lambda}^r(V),\iota)\) has the following universal mapping property: whenever \(f:V\times\dots\times V\mapto W\) is an alternating\footnote{We already met alternating forms in our earlier discussion of determinants, the only generalisation here is that the target space is another vector space.} \(r\)-linear function with values in a vector space \(W\) there exists a unique linear map \(L:\tilde{\Lambda}^r(V)\mapto W\) such that \(f=L\iota\).

Consequently, the space of linear maps \(\mathcal{L}(\tilde{\Lambda}^r(V),W)\) is isomorphic to the vector space of alternating \(r\)-linear functions from \(V\times\dots\times V\) to \(W\) and in particular \(\tilde{\Lambda}^r(V)^*\), the dual space of \(\tilde{\Lambda}^r(V)\), is isomorphic to the space of all alternating \(r\)-linear forms on \(V\times\dots\times V\).

Given a basis, \(\{e_i\}\), of \(V\), the pure tensors, \(e_{i_1}\otimes\dots\otimes e_{i_r}\), form a basis of \(T^r(V)\), and so the \(e_{i_1}\wedge\dots\wedge e_{i_r}\) span \(\tilde{\Lambda}^r(V)\). In fact it is clear that this space is already spanned by the set \(e_{i_1}\wedge\dots\wedge e_{i_r}\) with \(1\leq i_1<\dots<i_r\leq n\). Arguing as in the symmetric case, this leads to the identification \(\tilde{\Lambda}^r(V)\cong\Lambda^r(V)\).

Remark In the case of \(r=2\), we clearly have \(A+S=\id_{T^2(V)}\) and \(AS=0\), so that
\begin{equation}
T^2(V)=S^2(V)\oplus\Lambda^2(V).
\end{equation}
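
Explicitly, on a pure tensor \(u\otimes v\) the decomposition reads
\begin{equation*}
u\otimes v=\frac{1}{2}(u\otimes v+v\otimes u)+\frac{1}{2}(u\otimes v-v\otimes u)=S(u\otimes v)+A(u\otimes v),
\end{equation*}
and the dimensions match, since for \(\dim V=n\),
\begin{equation*}
n^2=\frac{n(n+1)}{2}+\frac{n(n-1)}{2}={n+1\choose 2}+{n\choose 2}.
\end{equation*}
No such two-term splitting holds for \(r\geq3\): for example, with \(n=2\) and \(r=3\) we have \(\dim T^3(V)=8\) but \(\dim S^3(V)+\dim\Lambda^3(V)=4+0\).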

Remark The elements of \(\Lambda^r(V)\) are called \(r\)-vectors. An \(r\)-vector which can be written \(v_1\wedge\dots\wedge v_r\) for some \(v_i\in V\) will be called a pure \(r\)-vector.

Symmetric Tensors and the Symmetric Algebra

In this and the next section we will identify symmetric and skew-symmetric tensors within \(T(V)\), and demonstrate that, with a suitably defined multiplication, they form subalgebras of \(T(V)\). In both cases we’ll then realise these algebras as quotients of \(T(V)\).

To any permutation \(\sigma\in S_r\) denote by \(P_\sigma:T^r(V)\mapto T^r(V)\) the linear operator defined on pure tensors by \(P_\sigma(v_1\otimes\dots\otimes v_r)=v_{\sigma(1)}\otimes\dots\otimes v_{\sigma(r)}\). A tensor, \(T\in T^r(V)\), is called symmetric if \(P_\sigma(T)=T\) for all \(\sigma\in S_r\). The subspace in \(T^r(V)\) of all symmetric tensors will be denoted \(S^r(V)\).

Consider the linear operator, \(S\), on \(T^r(V)\), defined as,
\begin{equation}
S=\frac{1}{r!}\sum_{\sigma\in S_r}P_\sigma.
\end{equation}
This is called the symmetrization on \(T^r(V)\). For any permutation, \(\sigma\in S_r\), \(P_\sigma S=S\), so for any \(T\in T^r(V)\), \(S(T)\) is symmetric. Conversely, it is clear that if \(T\) is symmetric, then \(S(T)=T\). Thus \(\img S=S^r(V)\) and \(S^2=S\), so \(S\) is a projector onto \(S^r(V)\).

Example Consider a 2-dimensional vector space over \(\CC\) with basis \(\{e_1,e_2\}\). Then on the natural basis of \(T^3(V)\), we have
\begin{align*}
S(e_1\otimes e_1\otimes e_1)&=e_1\otimes e_1\otimes e_1\\
S(e_1\otimes e_1\otimes e_2)&=\frac{1}{3}(e_1\otimes e_1\otimes e_2+e_1\otimes e_2\otimes e_1+e_2\otimes e_1\otimes e_1)\\
S(e_1\otimes e_2\otimes e_1)&=\frac{1}{3}(e_1\otimes e_1\otimes e_2+e_1\otimes e_2\otimes e_1+e_2\otimes e_1\otimes e_1)\\
S(e_2\otimes e_1\otimes e_1)&=\frac{1}{3}(e_1\otimes e_1\otimes e_2+e_1\otimes e_2\otimes e_1+e_2\otimes e_1\otimes e_1)\\
S(e_1\otimes e_2\otimes e_2)&=\frac{1}{3}(e_1\otimes e_2\otimes e_2+e_2\otimes e_1\otimes e_2+e_2\otimes e_2\otimes e_1)\\
S(e_2\otimes e_1\otimes e_2)&=\frac{1}{3}(e_1\otimes e_2\otimes e_2+e_2\otimes e_1\otimes e_2+e_2\otimes e_2\otimes e_1)\\
S(e_2\otimes e_2\otimes e_1)&=\frac{1}{3}(e_1\otimes e_2\otimes e_2+e_2\otimes e_1\otimes e_2+e_2\otimes e_2\otimes e_1)\\
S(e_2\otimes e_2\otimes e_2)&=e_2\otimes e_2\otimes e_2
\end{align*}

Let us consider the dimension of \(S^r(V)\). If \(\{e_i\}\) is a basis of the \(n\)-dimensional vector space, \(V\), then all pure tensors of the form, \(e_{i_1}\otimes\dots\otimes e_{i_r}\), form a basis of \(T^r(V)\). A standard notation is to write,
\begin{equation}
S(e_{i_1}\otimes\dots\otimes e_{i_r})=e_{i_1}\dots e_{i_r}.
\end{equation}
The tensors, \(e_{i_1}\dots e_{i_r}\), clearly span \(S^r(V)\), but, as the Example makes clear, whenever \(\{i_1,\dots,i_r\}\) and \(\{j_1,\dots,j_r\}\) are identical as multisets, that is, whenever \((j_1,\dots,j_r)\) is a permutation of \((i_1,\dots,i_r)\), \(e_{i_1}\dots e_{i_r}=e_{j_1}\dots e_{j_r}\). In other words, \(e_{i_1}\dots e_{i_r}\) only depends on the number of times each \(e_i\) appears in the product, so we can write \(e_{i_1}\dots e_{i_r}=e_1^{a_1}\dots e_n^{a_n}\) where \(a_i\) is the multiplicity of \(e_i\) in \(e_{i_1}\dots e_{i_r}\), and \(a_1+\dots+a_n=r\). It is then clear that the tensors, \(e_1^{a_1}\dots e_n^{a_n}\), are linearly independent, since for distinct \(n\)-tuples, \((a_1,\dots,a_n)\) and \((b_1,\dots, b_n)\), \(e_1^{a_1}\dots e_n^{a_n}\) and \(e_1^{b_1}\dots e_n^{b_n}\) are linear combinations of non-intersecting subsets of basis elements of \(T^r(V)\). Thus the \(e_1^{a_1}\dots e_n^{a_n}\) are a basis for \(S^r(V)\) and so to determine the dimension of \(S^r(V)\), we must count the number of distinct \(n\)-tuples \((a_1,\dots,a_n)\), \(a_i\in\ZZ_{\geq0}\), such that \(a_1+\dots+a_n=r\). A nice way of understanding this counting problem is through Feller’s `stars and bars'. Suppose \(r=8\) and \(n=5\) so that we wish to determine the dimension of the space \(S^8(V)\) where \(V\) is a \(5\)-dimensional vector space. Then each valid \(5\)-tuple corresponds to a diagram such as,
\begin{equation*}
|\star\star||\star\star|\star\star\star|\star|,
\end{equation*}
in which, reading left to right, the number of stars between the \(i\)th pair of bars corresponds to \(a_i\), so in particular, this example corresponds to \((2,0,2,3,1)\). Since the two outer bars are fixed, we need to count the number of possible arrangements of the \(n-1\) interior bars and \(r\) stars, or in other words, the number of ways of choosing \(r\) star locations from the \(n+r-1\) possible locations. Thus we have that for an \(n\)-dimensional vector space \(V\),
\begin{equation}
\dim S^r(V)={n+r-1\choose r}.
\end{equation}
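
As a check against the two cases already considered, the formula gives
\begin{equation*}
\dim S^3(V)={2+3-1\choose 3}={4\choose 3}=4\quad(n=2),\qquad\dim S^8(V)={5+8-1\choose 8}={12\choose 8}=495\quad(n=5).
\end{equation*}
The first agrees with the four distinct symmetrized tensors, \(e_1^3\), \(e_1^2e_2\), \(e_1e_2^2\) and \(e_2^3\), found in the Example above.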

On the space, \(S(V)=\bigoplus_{r=0}^\infty S^r(V)\), we can define a multiplication according to, \(T_1\cdot T_2=S(T_1\otimes T_2)\), for any \(T_1\in S^r(V)\) and \(T_2\in S^s(V)\). Equipped with this multiplication \(S(V)\) becomes a commutative associative algebra. That the product is commutative is clear. That it is associative follows since, for any \(T_1\in S^r(V)\), \(T_2\in S^s(V)\) and \(T_3\in S^t(V)\),
\begin{align*}
(T_1\cdot T_2)\cdot T_3&=S(S(T_1\otimes T_2)\otimes T_3)\\
&=S\left(\frac{1}{(r+s)!}\sum_{\sigma\in S_{r+s}}P_\sigma(T_1\otimes T_2)\otimes T_3\right)\\
&=\frac{1}{(r+s)!}\sum_{\sigma\in S_{r+s}}S(P_\sigma(T_1\otimes T_2)\otimes T_3)\\
&=S(T_1\otimes T_2\otimes T_3),
\end{align*}
and similarly for \(T_1\cdot(T_2\cdot T_3)=S(T_1\otimes T_2\otimes T_3)\).

As already discussed, the tensor product provides a multiplication on the space \(T(V)=\bigoplus_{r=0}^\infty T^r(V)\) such that it becomes an associative algebra with identity. Moreover, by virtue of its universal mapping property we should expect to be able to realise the algebra \(S(V)\) as a quotient of \(T(V)\) by a certain ideal\footnote{Recall that a subspace \(I\) of an algebra \(A\) is an ideal of \(A\) if for all \(a\in A\) and all \(i\in I\), \(ai\in I\) and \(ia\in I\). In this case the space \(A/I\) is an algebra with the multiplication inherited from \(A\) according to \((a+I)\cdot(b+I)=ab+I\).}, \(I\). Indeed, if \(\pi:T(V)\mapto T(V)/I\) is the quotient map, then for \(u,v\in V\) we’ll want that \(\pi(u\otimes v)=\pi(v\otimes u)\) in \(T(V)/I\), that is, we’ll want \(u\otimes v-v\otimes u\in I\). We are led, therefore, to the following definition.

Definition The symmetric algebra on the vector space \(V\) over the field \(K\) is the quotient \(T(V)/I\) of the tensor algebra \(T(V)\) by the ideal \(I\) generated by the elements \(u\otimes v-v\otimes u\) for all \(u,v\in V\).

Let us denote by \(\tilde{S}(V)\), the symmetric algebra as defined here. Then defining \(I^r=T^r(V)\cap I\), it’s not difficult to see that \(I=\bigoplus_{r=0}^\infty I^r\). In fact, \(\tilde{S}(V)=\bigoplus_{r=0}^\infty T^r(V)/I^r\), which we can see by observing that the linear map defined as \(\sum_r(T_r+I^r)\mapsto \sum_rT_r+I\), where \(T_r\in T^r(V)\), is clearly surjective and is injective since if \(\sum_rT_r\in I\) then \(T_r\in I^r\). Thus, setting, \(\tilde{S}^r(V)=T^r(V)/I^r\), so that \(\tilde{S}(V)=\bigoplus_{r=0}^\infty\tilde{S}^r(V)\), we will want to establish that \(\tilde{S}^r(V)\) and \(S^r(V)\) are isomorphic, from which, it will immediately follow that \(\tilde{S}(V)\) and \(S(V)\) are isomorphic.

There is an alternative description of the ideal, \(I\). Let us denote by, \(I'\), the ideal generated by all elements, \(T-P_\sigma(T)\), for any tensor, \(T\in T^r(V)\), and any permutation, \(\sigma\in S_r\). Now for any pure tensor, \(v_1\otimes\dots\otimes v_r\),
\begin{equation*}
v_1\otimes\dots\otimes v_r-v_{\sigma(1)}\otimes\dots\otimes v_{\sigma(r)},
\end{equation*}
can be written as a sum of terms of the form
\begin{equation*}
v_1\otimes\dots\otimes v_{i_1}\otimes v_{i_1+1}\otimes\dots\otimes v_r-v_1\otimes\dots\otimes v_{i_1+1}\otimes v_{i_1}\otimes\dots\otimes v_r,
\end{equation*}
in which only neighbouring factors are transposed. Since each of these clearly belongs to the ideal, \(I\), as originally defined, it follows that \(I'\subseteq I\). The reverse inclusion is obvious so we have \(I'=I\).
In particular, \(T(V)/I\) is commutative since for any pure tensors, \(T_1,T_2\in T(V)\), \(T_2\otimes T_1=P_\sigma(T_1\otimes T_2)\), for some permutation, \(\sigma\), so that, \(T_1\otimes T_2-T_2\otimes T_1=T_1\otimes T_2-P_\sigma(T_1\otimes T_2)\in I\). That is, \((T_1+I)(T_2+I)=T_1\otimes T_2+I=T_2\otimes T_1+I=(T_2+I)(T_1+I)\).
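
The reduction to neighbouring transpositions used above can be seen explicitly for \(r=3\) and the permutation reversing the three factors (a case chosen only for illustration),
\begin{align*}
v_1\otimes v_2\otimes v_3-v_3\otimes v_2\otimes v_1&=(v_1\otimes v_2\otimes v_3-v_2\otimes v_1\otimes v_3)\\
&\quad+(v_2\otimes v_1\otimes v_3-v_2\otimes v_3\otimes v_1)\\
&\quad+(v_2\otimes v_3\otimes v_1-v_3\otimes v_2\otimes v_1),
\end{align*}
where each bracket differs by the interchange of a single neighbouring pair and so lies in the ideal generated by the \(u\otimes v-v\otimes u\); for example the first is \((v_1\otimes v_2-v_2\otimes v_1)\otimes v_3\).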

Abusing notation, for any \(v_1\otimes\dots\otimes v_r\in T^r(V)\), \(v_i\in V\), let us write its image in \(\tilde{S}^r(V)\), via the quotient map, \(\pi\), as \(v_1\cdots v_r\). Recall that the tensor product was defined via a universal mapping property. In particular, whenever we have an \(r\)-linear function \(f:V\times\dots\times V\mapto W\), where \(W\) is some vector space, then there is a unique linear mapping \(L:T^r(V)\mapto W\) such that \(f=L\iota\) where \(\iota \) was the \(r\)-linear function \(\iota(v_1,\dots,v_r)=v_1\otimes\dots\otimes v_r\). This leads to the following result for \(\tilde{S}^r(V)\).

Proposition If \(\iota\) is the \(r\)-linear function, \(\iota:V\times\dots\times V\mapto\tilde{S}^r(V)\), defined as, \(\iota(v_1,\dots,v_r)=v_1\cdots v_r\), then \((\tilde{S}^r(V),\iota)\) has the following universal mapping property: whenever \(f:V\times\dots\times V\mapto W\) is a symmetric \(r\)-linear function with values in a vector space \(W\) there exists a unique linear map \(L:\tilde{S}^r(V)\mapto W\) such that \(f=L\iota\).

Proof From the universal mapping property of the tensor product we have a map \(L':T^r(V)\mapto W\) such that on pure tensors \(L'(v_1\otimes\dots\otimes v_r)=f(v_1,\dots,v_r)\). But since \(f(v_1,\dots,v_i,v_{i+1},\dots,v_r)=f(v_1,\dots,v_{i+1},v_i,\dots,v_r)\) it is clear that for any \(T\in I^r\), \(L'(T)=0\), and so \(L'\) factorises as, \(L'=L\pi\), where \(\pi\) is the quotient map \(\pi:T^r(V)\mapto T^r(V)/I^r\) and \(L:T^r(V)/I^r\mapto W\) is the desired linear map.\(\blacksquare\)

As a consequence, we have that the space of linear maps \(\mathcal{L}(\tilde{S}^r(V),W)\) is isomorphic to the vector space of symmetric \(r\)-linear functions from \(V\times\dots\times V\) to \(W\) and in particular that \(\tilde{S}^r(V)^*\), the dual space of \(\tilde{S}^r(V)\), is isomorphic to the space of all symmetric \(r\)-linear forms on \(V\times\dots\times V\).

Recall that the tensor algebra, \(T(V)\), has a universal mapping property whereby whenever \(f:V\mapto A\) is a linear map from \(V\) into an associative algebra \(A\) with identity there exists a unique algebra homomorphism, \(F:T(V)\mapto A\), with \(F(1)=1\) and such that \(F(v)=f(v)\) with \(F(v_1\otimes\dots\otimes v_r)=f(v_1)\cdots f(v_r)\). This leads to the following result for \(\tilde{S}(V)\).

Proposition If \(\iota\) is the linear map embedding \(V\) in \(T(V)\) then \((\tilde{S}(V),\iota)\) has the following universal mapping property: whenever \(f:V\mapto A\) is a linear map from \(V\) into a commutative associative algebra \(A\) with identity, there exists a unique algebra homomorphism, \(F:\tilde{S}(V)\mapto A\), with \(F(1)=1\) such that \(F(v)=f(v)\) with \(F(v_1\cdots v_r)=f(v_1)\cdots f(v_r)\).

Proof From the universal mapping property of the tensor algebra we have an algebra homomorphism \(F':T(V)\mapto A\) such that \(F'(v)=f(v)\) and since \(A\) is commutative we have \(F'(u\otimes v-v\otimes u)=0\), so \(I\subseteq\ker F'\) and \(F'\) factorises as \(F'=F\pi\) where \(\pi\) is the quotient map \(\pi:T(V)\mapto T(V)/I\) and \(F:T(V)/I\mapto A\) is the desired algebra homomorphism.\(\blacksquare\)

Now if \(\{e_i\}\) is a basis of \(V\), then \(r\)-fold (tensor) products of the \(e_i\), \(e_{i_1}\otimes\cdots\otimes e_{i_r}\), span \(T^r(V)\). But since \(T(V)/I\) is commutative, this means that the elements, \(e_1^{a_1}\cdots e_n^{a_n}\), such that \(a_1+\dots+a_n=r\), must span \(\tilde{S}^r(V)\). Now recall that \(S^r(V)\) was defined as the image of the symmetrization operator, \(S\), on \(T^r(V)\), and that \(S\) is a projector. This means that \(T^r(V)=\ker S\oplus\img S=\ker S\oplus S^r(V)\). Clearly any element of \(I^r\) belongs to \(\ker S\), so \(I^r\subseteq\ker S\). But if there were some \(T\in\ker S\) such that \(T\notin I^r\) then \(\pi(T)\neq0\) and we must be able to express \(\pi(T)\) as a linear combination of the \(e_1^{a_1}\cdots e_n^{a_n}\). Thus, using these same linear coefficients, we can choose a tensor, \(T'\in T^r(V)\), as a linear combination of pure tensors of the form,
\begin{equation*}
\underbrace{e_1\otimes\dots\otimes e_1}_{a_1}\otimes\dots\otimes\underbrace{e_n\otimes\dots\otimes e_n}_{a_n},
\end{equation*}
each tensor in this linear combination corresponding to a distinct \(n\)-tuple, \((a_1,\dots,a_n)\), such that \(\pi(T)=\pi(T')\). Then, \(T-T'\in I^r\), so \(S(T)=S(T')\).
But \(S(T')\) cannot be zero, since the symmetrizations of distinct pure tensors in the linear combination, \(T'\), are non-zero linear combinations of distinct sets of basis elements of \(T^r(V)\). Thus, \(S(T)\neq0\), contradicting our initial assumption. It follows that \(\ker S=I^r\) and we have established that
\begin{equation}
T^r(V)=S^r(V)\oplus I^r.
\end{equation}
In particular, this means that \(\dim T^r(V)/I^r=\dim S^r(V)\), so that the elements, \(e_1^{a_1}\cdots e_n^{a_n}\), such that \(a_1+\dots+a_n=r\), are a basis for \(\tilde{S}^r(V)\) and of course that \(\tilde{S}^r(V)\cong S^r(V)\) with this isomorphism such that, \(T+\ker S\mapsto S(T)\), mapping basis elements whose identification has already been anticipated by our abuse of notation. This clearly extends to a (grade preserving) algebra isomorphism \(\tilde{S}(V)\cong S(V)\).

Component Representation of Tensors

If \(\{e_i\}\) is a basis for \(V\) with \(\{e^i\}\) the dual basis of \(V^*\), then any tensor, \(T\), of type \((r,s)\) can be expressed as the linear combination,
\begin{equation}
T=\sum_{\substack{i_1,\dots,i_r\\j_1,\dots,j_s}}T^{i_1\dots i_r}_{j_1\dots j_s}e_{i_1}\otimes\dots\otimes e_{i_r}\otimes e^{j_1}\otimes\dots\otimes e^{j_s},
\end{equation}
or, employing the summation convention,
\begin{equation}
T=T^{i_1\dots i_r}_{j_1\dots j_s}e_{i_1}\otimes\dots\otimes e_{i_r}\otimes e^{j_1}\otimes\dots\otimes e^{j_s},
\end{equation}
with the \(T^{i_1\dots i_r}_{j_1\dots j_s}\) the components of \(T\) with respect to the chosen basis of \(V\). In physics literature it is common for the collection of components, \(T^{i_1\dots i_r}_{j_1\dots j_s}\), to be actually referred to as “the” tensor. If we were to choose another basis for \(V\), say \(\{e'_i\}\), related to the first according to \(e'_i=A_i^je_j\), then the new dual basis, \(\{e'^i\}\), is related to the old one by \(e'^i=(A^{-1})^i_je^j\), since \(e'^i(e'_j)=(A^{-1})^i_ke^k(A_j^le_l)=(A^{-1})^i_kA_j^l\delta_l^k=\delta_j^i\). Inverting these relations gives \(e_i=(A^{-1})^k_ie'_k\) and \(e^j=A_l^je'^l\), so that with respect to the new pair of dual bases the tensor \(T\) is given by
\begin{equation}
T=T^{i_1\dots i_r}_{j_1\dots j_s}(A^{-1})^{k_1}_{i_1}\cdots(A^{-1})^{k_r}_{i_r}A_{l_1}^{j_1}\cdots A_{l_s}^{j_s}e'_{k_1}\otimes\dots\otimes e'_{k_r}\otimes e'^{l_1}\otimes\dots\otimes e'^{l_s},
\end{equation}
and the components of \(T\) with respect to the new basis, \({T'}^{k_1\dots k_r}_{l_1\dots l_s}\) say, are given by
\begin{equation}
{T'}^{k_1\dots k_r}_{l_1\dots l_s}=T^{i_1\dots i_r}_{j_1\dots j_s}(A^{-1})^{k_1}_{i_1}\cdots(A^{-1})^{k_r}_{i_r}A_{l_1}^{j_1}\cdots A_{l_s}^{j_s}.
\end{equation}
When treating tensors “as” their components the question naturally arises of how to distinguish between components of a single tensor with respect to different bases. The usual approach, sometimes called kernel-index notation, maintains a “kernel” letter indicating the tensor, with primes on the indices indicating that the components are with respect to another basis. For example in the case of a vector \(v\), \(v^i\) and \(v^{i'}\) are the components of the same vector with respect to two different bases. \(v^{i'}\) and \(v^{i}\) are thus related according to \(v^{i'}=(A^{-1})^{i'}_iv^i\).

A vector is sometimes defined as an object whose components transform in this way, that is contravariantly (with the inverse of the matrix relating the basis vectors)\(^1\). Likewise, a covariant vector, \(v_i\), in other words, the components of an element of \(V^*\) with respect to the dual basis \(e^i\), transforms according to \(v_{i'}=A_{i'}^{i}v_i\). More generally, tensors of type \((r,s)\) are then defined to be objects whose components, \(T^{i_1\dots i_r}_{j_1\dots j_s}\), carrying \(r+s\) indices, transform as you'd expect based on the `upstairs' or `downstairs' position of their indices, as
\begin{equation}
T^{i'_1\dots i'_r}_{j'_1\dots j'_s}=T^{i_1\dots i_r}_{j_1\dots j_s}(A^{-1})^{i'_1}_{i_1}\cdots(A^{-1})^{i'_r}_{i_r}A_{j'_1}^{j_1}\cdots A_{j'_s}^{j_s}.
\end{equation}
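
As a small illustration of these transformation rules (the particular change of basis is chosen here purely as an example), take \(n=2\) with \(e'_1=e_1+e_2\) and \(e'_2=e_2\), that is, \(A_1^1=A_1^2=A_2^2=1\) and \(A_2^1=0\). Writing \(v=v^1e_1+v^2e_2=v^{1'}e'_1+v^{2'}e'_2\) and comparing coefficients of \(e_1\) and \(e_2\) gives
\begin{equation*}
v^{1'}=v^1,\qquad v^{2'}=v^2-v^1,
\end{equation*}
while the components of a covariant vector \(\alpha\), obtained by evaluating \(\alpha\) on the new basis vectors, are
\begin{equation*}
\alpha_{1'}=\alpha_1+\alpha_2,\qquad \alpha_{2'}=\alpha_2.
\end{equation*}
The pairing is then basis independent, as it should be,
\begin{equation*}
\alpha_{i'}v^{i'}=(\alpha_1+\alpha_2)v^1+\alpha_2(v^2-v^1)=\alpha_1v^1+\alpha_2v^2=\alpha_iv^i.
\end{equation*}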

Recall the notion of contraction. If we have a tensor of type \((r,s)\), \(T^{i_1\dots i_r}_{j_1\dots j_s}\), then contraction corresponds to forming a new \((r-1,s-1)\) tensor, say \(S^{i_1\dots i_{r-1}}_{j_1\dots j_{s-1}}\) as
\begin{equation}
S^{i_1\dots i_{r-1}}_{j_1\dots j_{s-1}}=T^{i_1\dots i_{a-1}ki_a\dots i_{r-1}}_{j_1\dots j_{b-1}kj_b\dots j_{s-1}}.
\end{equation}
In this case we have contracted the \(a\)th contravariant index with the \(b\)th covariant index, the repeated index \(k\) being summed over.
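
For instance (an illustrative case, not taken from the discussion above), for a \((2,1)\) tensor on a two dimensional space, contracting the first contravariant index with the covariant index produces the vector with components
\begin{equation*}
S^j=T^{kj}_k=T^{1j}_1+T^{2j}_2,
\end{equation*}
while for a \((1,1)\) tensor the only possible contraction is the scalar \(T^k_k=T^1_1+T^2_2\), the trace of the matrix of components.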

If the underlying vector space is equipped with a symmetric, non-degenerate inner product then this inner product can be regarded as a \((0,2)\) tensor. This is called the metric tensor, conventionally denoted \(g\). With respect to a given basis it has components, \(g_{ij}\), which are of course the elements of what we previously called the Gram matrix. The inner product provides us with a natural isomorphism \(V\mapto V^*\) such that \(v\mapsto \alpha_v\) with \(\alpha_v(w)=(v,w)\) for any \(w\in V\), that is, to uniquely associate a covariant vector with each contravariant vector and vice versa. In terms of a basis \(e_i\) of \(V\) with dual basis \(e^i\) of \(V^*\), we have \(e_i\mapsto\alpha_{e_i}\) which we could write as \(\alpha_{e_i}=\alpha_{ij}e^j\) with the \(\alpha_{ij}\) determined by \(\alpha_{e_i}(e_j)=g_{ij}=\alpha_{ik}e^k(e_j)=\alpha_{ij}\). So an arbitrary vector \(v^ie_i\) is mapped to \(v^ig_{ij}e^j\), or in other words, by applying the metric tensor to the contravariant vector \(v^i\) we obtain the covariant vector \(v_i\) given by
\begin{equation}
v_i=g_{ij}v^j.
\end{equation}
In the other direction, we have the inverse map, \(V^*\mapto V\), which we’ll write as \(e^i\mapsto g^{ij}e_j\) with the \(g^{ij}\) determined by \(v^ig_{ij}e^j\mapsto v^ig_{ij}g^{jk}e_k=v^ie_i\). That is,
\begin{equation}
g_{ij}g^{jk}=\delta_i^k,
\end{equation}
so that \(g^{ij}\) is the inverse of the matrix \(g_{ij}\), and given a covariant vector \(v_i\) we obtain a contravariant vector \(v^i\) as
\begin{equation}
v^i=g^{ij}v_j.
\end{equation}
What we have here then is a way of raising and lowering indices which of course generalises to arbitrary tensors.
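
As a quick check of these formulas, consider the following example, in which the metric components are an assumption made purely for illustration. Suppose that \(n=2\) and
\begin{equation*}
(g_{ij})=\begin{pmatrix}2&1\\1&1\end{pmatrix},\qquad (g^{ij})=\begin{pmatrix}1&-1\\-1&2\end{pmatrix},
\end{equation*}
so that \(g_{ij}g^{jk}=\delta_i^k\). The contravariant vector with components \((v^1,v^2)=(1,0)\) is lowered to \(v_i=g_{ij}v^j\), that is \((v_1,v_2)=(2,1)\), and raising again gives
\begin{equation*}
v^1=g^{11}v_1+g^{12}v_2=2-1=1,\qquad v^2=g^{21}v_1+g^{22}v_2=-2+2=0,
\end{equation*}
recovering the original components.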

Let us note here, that in physical applications vectors and tensors very often arise as vector or tensor fields at a particular point in some space. Thus we might have, for example, a vector \(V(x)\) or tensor \(T(x)\) at some point \(x\). The components of \(V(x)\) or \(T(x)\) obviously depend on a choice of basis vectors. This in turn corresponds to a choice of coordinate system and as explained in the appendix on vector calculus, for non-cartesian coordinate systems, the corresponding basis vectors depend on the point in space, \(x\), so that the change of basis matrices relating components of \(V(x)\) or \(T(x)\) for different coordinate systems will also depend on \(x\).

Notes:

  1. In physics contexts there is often a restriction placed on the kinds of basis change considered. For example it is typical to see vectors defined as objects whose components transform contravariantly with respect to spatial rotations.

The Tensor Algebra

Recall that an algebra, \(A\), over \(K\) is a vector space over \(K\) together with a multiplication operation \(A\times A\mapto A\) which is bilinear. In this section we will use the tensor product to construct the `universal’ associative algebra having an identity.

Definition A tensor of type \((r,s)\) is an element of the tensor product space \(T^r_s(V)\) defined as
\begin{equation}
T^r_s(V)=\underbrace{V\otimes\dots\otimes V}_r\otimes\underbrace{V^*\otimes\dots\otimes V^*}_s.
\end{equation}
Here \(r\) is called the contravariant rank and \(s\) the covariant rank. In this context a \((0,0)\) tensor is an element of the base field \(K\), called simply a \(0\) rank tensor.

Recall that we have the following isomorphisms,
\begin{equation}
V_1^*\otimes\dots\otimes V_s^*\cong(V_1\otimes\dots\otimes V_s)^*\cong\mathcal{L}(V_1,\dots,V_s;K),
\end{equation}
so that tensors of type \((r,s)\) may be identified with multilinear functions,
\begin{equation}
f:\underbrace{V^*\times\dots\times V^*}_r\times\underbrace{V\times\dots\times V}_s\mapto K.
\end{equation}
A \(0\) rank tensor is just a scalar, the corresponding map just being scalar multiplication.

If we have another multilinear function,
\begin{equation*}
g:\underbrace{V^*\times\dots\times V^*}_p\times\underbrace{V\times\dots\times V}_q\mapto K,
\end{equation*}
which may, of course, be identified with a tensor of type \((p,q)\), then we can define a new multilinear function, such that,
\begin{equation*}
(\alpha_1,\dots,\alpha_{r+p},v_1,\dots,v_{s+q})\mapsto f(\alpha_1,\dots,\alpha_r,v_1,\dots,v_s)g(\alpha_{r+1},\dots,\alpha_{r+p},v_{s+1},\dots,v_{s+q}),
\end{equation*}
which could be identified with a tensor of type \((r+p,s+q)\). We have thus multiplied, via their respective identifications with multilinear maps, a tensor of type \((r,s)\) with a tensor of type \((p,q)\), to obtain a tensor of type \((r+p,s+q)\). The result, viewed as a multilinear map, is therefore called the tensor product, \(f\otimes g\), of the multilinear maps \(f\) and \(g\).
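
To illustrate (the choice of \(f\) and \(g\) here is just an example), let \(V\) be two dimensional with basis \(\{e_1,e_2\}\) and dual basis \(\{e^1,e^2\}\), and take \(f=e^1\) and \(g=e^2\), both tensors of type \((0,1)\). Then for \(v=v^ie_i\) and \(w=w^ie_i\),
\begin{equation*}
(f\otimes g)(v,w)=e^1(v)e^2(w)=v^1w^2,\qquad(g\otimes f)(v,w)=e^2(v)e^1(w)=v^2w^1.
\end{equation*}
In particular \((f\otimes g)(e_1,e_2)=1\) while \((g\otimes f)(e_1,e_2)=0\), so the order of the factors matters.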

So defined, it is clear that this multiplication is bilinear, in the sense that,
\begin{equation}
(af_1+bf_2)\otimes g=af_1\otimes g +bf_2\otimes g,
\end{equation}
and,
\begin{equation}
f\otimes(ag_1+bg_2) =af\otimes g_1 +bf\otimes g_2,
\end{equation}
and that it is associative but not, in general, commutative. It provides a multiplication on the space,
\begin{equation}
\mathcal{T}(V;V^*)=\bigoplus_{r,s=0}^\infty T^r_s(V),
\end{equation}
such that it becomes an algebra (here we understand \(T_0^0=K\), \(T_0^1=V\) and \(T_1^0=V^*\)). This is called the tensor algebra, the name also given to the particular case of \(\mathcal{T}(V)\), defined as
\begin{equation}
\mathcal{T}(V)=\bigoplus_{r=0}^\infty T^r(V)=K\oplus V\oplus (V\otimes V)\oplus\cdots ,
\end{equation}
equipped with the same multiplication (here \(T^r(V)=T_0^r(V)\)). In fact, in this slightly simpler setting, we’ll introduce the multiplication directly without going via the identification of tensors with multilinear maps. Thus, we define the multiplication of an element of \(T^r\) with an element of \(T^s\) using the isomorphism
\begin{equation}
(\underbrace{V\otimes\dots\otimes V}_r)\otimes(\underbrace{V\otimes\dots\otimes V}_s)\cong\underbrace{V\otimes\dots\otimes V}_{r+s}.
\end{equation}
Indeed the restriction of this to \(T^r\times T^s\) provides a bilinear multiplication, \(\otimes:T^r\times T^s\mapto T^{r+s}\) (in the more general setting the only complication is that we’d need to use isomorphisms involving permutations). Equipped with this multiplication \(\mathcal{T}(V)\) is called the tensor algebra of \(V\). The tensor algebra, or rather the pair \((\mathcal{T}(V),\iota)\) where \(\iota:V\mapto T^1(V)\) is the obvious inclusion, has a universal mapping property. That is, whenever \(f:V\mapto A\) is a linear map from \(V\) into an associative algebra \(A\), with an identity, there exists a unique associative algebra homomorphism, \(F:\mathcal{T}(V)\mapto A\), with \(F(1)=1\) such that the following diagram commutes.

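[Commutative diagram: \(\iota:V\mapto\mathcal{T}(V)\) and \(F:\mathcal{T}(V)\mapto A\) composing to \(f:V\mapto A\), that is, \(F\iota=f\).]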

Here the uniqueness of \(F\) follows since \(\mathcal{T}(V)\) is generated by \(1\) and \(V\). Given then that on elements, \(v\in V\), \(F(v)=f(v)\), \(F\) is defined on the whole of \(\mathcal{T}(V)\) such that \(F(v_1\otimes\dots\otimes v_r)=f(v_1)\cdots f(v_r)\).
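
A simple example, sketched here under the assumption that \(V\) is one dimensional with basis \(\{e\}\): each \(T^r(V)\) is then one dimensional, spanned by \(e^{\otimes r}=e\otimes\dots\otimes e\) (\(r\) factors). Taking \(A=K[x]\), the algebra of polynomials in one variable, and \(f:V\mapto K[x]\) the linear map with \(f(e)=x\), the induced homomorphism satisfies
\begin{equation*}
F\left(a_0+a_1e+a_2e\otimes e+\dots+a_re^{\otimes r}\right)=a_0+a_1x+a_2x^2+\dots+a_rx^r,
\end{equation*}
and is in this case an isomorphism, \(\mathcal{T}(V)\cong K[x]\).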

Isomorphisms

In this section the universal mapping property is used to establish a number of basic isomorphisms involving tensor products.

Theorem Given vector spaces \(V_1\) and \(V_2\) over \(K\), there is a unique isomorphism, \(V_1\otimes V_2\cong V_2\otimes V_1\), such that for any \(v_1\in V_1\) and \(v_2\in V_2\), \(v_1\otimes v_2\mapsto v_2\otimes v_1\).

Proof Consider the bilinear function \(f:V_1\times V_2\mapto V_2\otimes V_1\) defined by \(f(v_1,v_2)=v_2\otimes v_1\) on elements \(v_1\in V_1\) and \(v_2\in V_2\). That \(f\) is indeed bilinear is a consequence of the bilinearity of the tensor product \(v_2\otimes v_1\). From the universal mapping property it follows that there is a unique linear map \(L:V_1\otimes V_2\mapto V_2\otimes V_1\) such that \(L(v_1\otimes v_2)=v_2\otimes v_1\). Likewise, starting with the bilinear map \(V_2\times V_1\mapto V_1\otimes V_2\) given by \((v_2,v_1)\mapsto v_1\otimes v_2\), we obtain a linear map, \(L':V_2\otimes V_1\mapto V_1\otimes V_2\), which is inverse to \(L\), at least on pure tensors. That \(L'L=\id_{V_1\otimes V_2}\) on the whole of \(V_1\otimes V_2\), and indeed that \(LL'=\id_{V_2\otimes V_1}\) on the whole of \(V_2\otimes V_1\), follows since both \(L\) and \(L'\) are linear and each tensor product space is spanned by its pure tensors.\(\blacksquare\)

Note that while \(V_1\otimes V_2\cong V_2\otimes V_1\) it is certainly not the case that \(v_1\otimes v_2=v_2\otimes v_1\) for arbitrary \(v_1\in V_1\) and \(v_2\in V_2\). The generalisation of this to a tensor product of \(r\) vector spaces says that for any permutation \(\sigma\) of the numbers \(1,\dots,r\) there is a unique isomorphism,
\begin{equation}
V_1\otimes\dots\otimes V_r\cong V_{\sigma(1)}\otimes\dots\otimes V_{\sigma(r)},
\end{equation}
such that, \(v_1\otimes\dots\otimes v_r\mapsto v_{\sigma(1)}\otimes\dots\otimes v_{\sigma(r)}\), for any \(v_i\in V_i\).

Now let us consider associativity of the tensor product.

Theorem For vector spaces \(V_1\), \(V_2\) and \(V_3\) over \(K\) there is a unique isomorphism,
\begin{equation}
(V_1\otimes V_2)\otimes V_3\cong V_1\otimes(V_2\otimes V_3),
\end{equation}
such that for any \(v_1\in V_1\), \(v_2\in V_2\) and \(v_3\in V_3\), \((v_1\otimes v_2)\otimes v_3\mapsto v_1\otimes(v_2\otimes v_3)\).

Proof The function \(f:V_1\times V_2\times V_3\mapto(V_1\otimes V_2)\otimes V_3\), given by \(f(v_1,v_2,v_3)=(v_1\otimes v_2)\otimes v_3\), is clearly trilinear so by the universal mapping property we have a linear map \(V_1\otimes V_2\otimes V_3\mapto(V_1\otimes V_2)\otimes V_3\) such that \(v_1\otimes v_2\otimes v_3\mapsto(v_1\otimes v_2)\otimes v_3\). Choosing bases for \(V_1\), \(V_2\) and \(V_3\) it’s clear that this maps one basis to another and so is an isomorphism. Similarly, we find \(V_1\otimes V_2\otimes V_3\cong V_1\otimes(V_2\otimes V_3)\) and the result follows.\(\blacksquare\)

Theorem For vector spaces \(V_1\), \(V_2\) and \(V_3\) over \(K\) there is a unique isomorphism,
\begin{equation}
V_1\otimes(V_2\oplus V_3)\cong(V_1\otimes V_2)\oplus(V_1\otimes V_3),
\end{equation}
such that for any \(v_1\in V_1\), \(v_2\in V_2\) and \(v_3\in V_3\), \(v_1\otimes (v_2,v_3)\mapsto (v_1\otimes v_2,v_1\otimes v_3)\).

Proof Here we need a bilinear function \(V_1\times(V_2\oplus V_3)\mapto(V_1\otimes V_2)\oplus(V_1\otimes V_3)\) so let us define a function \(f\) according to \(f(v_1,(v_2,v_3))=(v_1\otimes v_2,v_1\otimes v_3)\). That this is bilinear is demonstrated as follows,
\begin{align*}
f(av_1+bv_1',(v_2,v_3))&=((av_1+bv_1')\otimes v_2,(av_1+bv_1')\otimes v_3)\\
&=(av_1\otimes v_2+bv_1'\otimes v_2,av_1\otimes v_3+bv_1'\otimes v_3)\\
&=(av_1\otimes v_2,av_1\otimes v_3)+(bv_1'\otimes v_2,bv_1'\otimes v_3)\\
&=a(v_1\otimes v_2,v_1\otimes v_3)+b(v_1'\otimes v_2,v_1'\otimes v_3)\\
&=af(v_1,(v_2,v_3))+bf(v_1',(v_2,v_3)),
\end{align*}
and
\begin{align*}
f(v_1,a(v_2,v_3)+b(v_2',v_3'))&=f(v_1,(av_2+bv_2',av_3+bv_3'))\\
&=(v_1\otimes(av_2+bv_2'),v_1\otimes(av_3+bv_3'))\\
&=(v_1\otimes av_2+v_1\otimes bv_2',v_1\otimes av_3+v_1\otimes bv_3')\\
&=(av_1\otimes v_2+bv_1\otimes v_2',av_1\otimes v_3+bv_1\otimes v_3')\\
&=(av_1\otimes v_2,av_1\otimes v_3)+(bv_1\otimes v_2',bv_1\otimes v_3')\\
&=a(v_1\otimes v_2,v_1\otimes v_3)+b(v_1\otimes v_2',v_1\otimes v_3')\\
&=af(v_1,(v_2,v_3))+bf(v_1,(v_2',v_3')).
\end{align*}
Then by the universal mapping property there is a linear map \(V_1\otimes(V_2\oplus V_3)\mapto(V_1\otimes V_2)\oplus(V_1\otimes V_3)\) such that \(v_1\otimes (v_2,v_3)\mapsto (v_1\otimes v_2,v_1\otimes v_3)\). Choosing bases for \(V_1\), \(V_2\) and \(V_3\) we see that this maps one basis to another so is an isomorphism.\(\blacksquare\)

More generally, we have that for a vector space \(U\), together with a (possibly infinite) set of spaces \(V_i\),
\begin{equation}
U\otimes(\bigoplus_iV_i) \cong\bigoplus_i(U\otimes V_i),
\end{equation}
with the isomorphism being the obvious extension of the one from the previous theorem.

Theorem For vector spaces \(V_1\) and \(V_2\) over \(K\) with respective dual spaces \(V_1^*\) and \(V_2^*\) there is a unique isomorphism,
\begin{equation}
V_1^*\otimes V_2^*\cong(V_1\otimes V_2)^*,
\end{equation}
such that, \(f_1\otimes f_2\mapsto(v_1\otimes v_2\mapsto f_1(v_1)f_2(v_2))\), for any \(v_1\in V_1\), \(v_2\in V_2\), \(f_1\in V_1^*\) and \(f_2\in V_2^*\).

Proof Define a function \(V_1^*\times V_2^*\mapto\mathcal{L}(V_1,V_2;K)\) by \((f_1,f_2)\mapsto((v_1,v_2)\mapsto f_1(v_1)f_2(v_2))\). That this is bilinear is clear so by the universal mapping property there is a unique linear map \(V_1^*\otimes V_2^*\mapto\mathcal{L}(V_1,V_2;K)\) such that \(f_1\otimes f_2\mapsto((v_1,v_2)\mapsto f_1(v_1)f_2(v_2))\). Now this is a linear mapping between vector spaces of the same dimension, \(\dim V_1\dim V_2\), and moreover its image contains the bilinear functions we’ve already observed form a basis for \(\mathcal{L}(V_1,V_2;K)\), namely, \((v_1,v_2)\mapsto \alpha_{i_1}^{(1)}(v_1)\alpha_{i_2}^{(2)}(v_2)\), where the \(\alpha_{i_1}^{(1)}\) and \(\alpha_{i_2}^{(2)}\) are the bases of \(V_1^*\) and \(V_2^*\) dual to chosen bases of \(V_1\) and \(V_2\) respectively. Thus this linear map is an isomorphism, which when combined with the isomorphism, \(\mathcal{L}(V_1,V_2;K)\cong(V_1\otimes V_2)^*\), gives us the isomorphism we sought.\(\blacksquare\)

We have of course also the obvious generalisation of this isomorphism,
\begin{equation}
V_1^*\otimes\dots\otimes V_r^*\cong(V_1\otimes\dots\otimes V_r)^*.
\end{equation}

Theorem For vector spaces \(V_1\) and \(V_2\) over \(K\), with \(V_1^*\) the dual space of \(V_1\), there is a unique isomorphism,
\begin{equation}
V_1^*\otimes V_2\cong\mathcal{L}(V_1,V_2),
\end{equation}
such that, \(f_1\otimes v_2\mapsto(v_1\mapsto f_1(v_1)v_2)\), for any \(v_1\in V_1\), \(v_2\in V_2\) and \(f_1\in V_1^*\).

Proof The function defined by, \((f_1,v_2)\mapsto(v_1\mapsto f_1(v_1)v_2)\), is clearly a bilinear function from \(V_1^*\times V_2\) to \(\mathcal{L}(V_1,V_2)\). Therefore, by the universal mapping property, there is a unique linear map from \(V_1^*\otimes V_2\) to \(\mathcal{L}(V_1,V_2)\) given by, \(f_1\otimes v_2\mapsto(v_1\mapsto f_1(v_1)v_2)\). Now both \(V_1^*\otimes V_2\) and \(\mathcal{L}(V_1,V_2)\) have dimension \(\dim V_1\dim V_2\) and choosing bases \(e_i^{(1)}\) and \(e_i^{(2)}\) for \(V_1\) and \(V_2\) respectively, with \(\alpha_i^{(1)}\) the dual basis of \(V_1^*\), then this map takes the basis elements, \(\alpha_i^{(1)}\otimes e_j^{(2)}\), of \(V_1^*\otimes V_2\), to the linear maps \(v_1\mapsto\alpha_i^{(1)}(v_1)e_j^{(2)}\). Considering these maps applied to the basis elements of \(V_1\) we see that their matrix representations are the matrices with \(1\) in the \(j\)th row and \(i\)th column with zeros everywhere else. These matrices form a basis of \(\mathcal{L}(V_1,V_2)\) so we see that our linear map takes a basis to a basis and is therefore an isomorphism.\(\blacksquare\)

In the case of \(V_1=V_2=V\), it is of interest to establish the element of \(V^*\otimes V\) which corresponds to \(\id_V\). Denoting by \(e^i\) the dual basis of the basis \(e_i\) of \(V\) then this is the element \(\sum_ie^i\otimes e_i\) of \(V^*\otimes V\).

Consider the function \(V^*\times V\mapto K\) given by \((f,v)\mapsto f(v)\). This is clearly bilinear so induces a unique linear map \(V^*\otimes V\mapto K\), given by \(f\otimes v\mapsto f(v)\). This, understood as a linear map \(\mathcal{L}(V)\mapto K\), is just the trace, now given a basis-free (`canonical’) definition. To see that this really does coincide with the trace as previously encountered, consider an arbitrary element of \(V^*\otimes V\). It has the form, \(\sum_{ij}A_i^je^i\otimes e_j\), for some scalars, \(A_i^j\), and corresponds to the linear operator on \(V\) such that, \(e_k\mapsto\sum_{ij}A_i^je^i(e_k)e_j=\sum_jA_k^je_j\), that is, the linear operator represented by the matrix \(\mathbf{A}\) with elements \(A_i^j\). The trace of this linear operator is then \(\sum_{ij}A_i^je^i(e_j)=\sum_iA_i^i\) in accordance with our previous definition.
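
As a concrete check (the coefficients are chosen here only for illustration), let \(V\) be two dimensional and consider the element \(2e^1\otimes e_1+3e^1\otimes e_2+5e^2\otimes e_2\) of \(V^*\otimes V\). The corresponding linear operator acts as \(e_1\mapsto2e_1+3e_2\) and \(e_2\mapsto5e_2\), so, taking the columns of the matrix to be the images of the basis vectors, it is represented by
\begin{equation*}
\begin{pmatrix}2&0\\3&5\end{pmatrix},
\end{equation*}
with trace \(7\). Applying the map \(f\otimes v\mapsto f(v)\) directly gives
\begin{equation*}
2e^1(e_1)+3e^1(e_2)+5e^2(e_2)=2+0+5=7,
\end{equation*}
in agreement.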

More generally, we have the notion of contraction. If in some tensor product space, \(V_1\otimes\dots\otimes V_r\), we have, \(V_j=V_i^*\), for some \(i\) and \(j\), then the contraction with respect to \(i\) and \(j\) is a linear mapping,
\begin{equation}
V_1\otimes\dots\otimes V_r\mapto\bigotimes_{k\neq i,j}V_k,\label{eq:abstract contraction}
\end{equation}
formed as a composition of three maps: first, a permutation of the tensor factors which moves \(V_j=V_i^*\) and \(V_i\) into the first two positions, leaving the order of the remaining factors unchanged; second, the tensor product of the map \(V^*\otimes V\mapto K\) discussed above with the identity on the remaining factors; and third, the trivial isomorphism corresponding to \(K\otimes V\cong V\).

Dimension and Bases

Consider first the trivial case when one of the spaces, \(V_i\) say, in a tensor product space, \(V_1\otimes\dots\otimes V_r\), is zero. Then any \(r\)-linear function out of \(V_1\times\dots\times V_r\) must be zero and so, as the image of the \(r\)-linear function \(\iota:V_1\times\dots\times V_r\mapto V_1\otimes\dots\otimes V_r\) generates the whole tensor product space, \(\dim(V_1\otimes\dots\otimes V_r)=0\) in this case. More generally, in the case that none of the spaces are zero, observe first that the dimension of \(V_1\otimes\dots\otimes V_r\) is the dimension of the dual space, \((V_1\otimes\dots\otimes V_r)^*\), and as already discussed, \(\mathcal{L}(V_1,\dots,V_r;K)\cong(V_1\otimes\dots\otimes V_r)^*\). Now suppose that a basis for the \(i\)th space, \(V_i\), is \(\{e_1^{(i)},\dots,e_{n_i}^{(i)}\}\), that is, \(\dim V_i=n_i\), for \(i=1,\dots,r\). Then for any \(r\)-linear form, \(f\), we have
\begin{equation}
f\left(\sum_{i_1}c_{(1)}^{i_1}e_{i_1}^{(1)},\dots,\sum_{i_r}c_{(r)}^{i_r}e_{i_r}^{(r)}\right)=\sum_{i_1,\dots,i_r}c_{(1)}^{i_1}\cdots c_{(r)}^{i_r}f(e_{i_1}^{(1)},\dots,e_{i_r}^{(r)}),
\end{equation}
where the \(c_{(j)}^{i_j}\) are arbitrary scalars. That is, \(f\) is uniquely specified by the \(n_1\cdots n_r\) scalars \(f(e_{i_1}^{(1)},\dots,e_{i_r}^{(r)})\). So defining the \(n_1\cdots n_r\), clearly linearly independent, \(r\)-linear forms, \(\phi_{i_1\dots i_r}\), such that, \(\phi_{i_1\dots i_r}(e_{j_1}^{(1)},\dots,e_{j_r}^{(r)})=\delta_{i_1j_1}\dots\delta_{i_rj_r}\), that is, \(\phi_{i_1\dots i_r}=\alpha_{i_1}^{(1)}(\cdot)\cdots\alpha_{i_r}^{(r)}(\cdot)\), where \(\{\alpha_1^{(i)},\dots,\alpha_{n_i}^{(i)}\}\) is the dual basis of \(V_i^*\), we see that any \(r\)-linear form can be expressed as a linear combination of the \(\phi_{i_1\dots i_r}\) and so they form a basis for \(\mathcal{L}(V_1,\dots,V_r;K)\) which therefore has dimension \(n_1\cdots n_r\). That is,
\begin{equation}
\dim(V_1\otimes\dots\otimes V_r)=\dim V_1\cdots\dim V_r.
\end{equation}
It is also then clear that the \(n_1\cdots n_r\) pure tensors \(\{e_{i_1}^{(1)}\otimes\dots\otimes e_{i_r}^{(r)}\}\) form a basis for \(V_1\otimes\dots\otimes V_r\).
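
For example, when \(V_1=V_2\) is two dimensional with basis \(\{e_1,e_2\}\), the four pure tensors \(e_i\otimes e_j\) form a basis of \(V_1\otimes V_2\). The following short calculation, added here as an illustration, shows that not every element is pure. A general pure tensor expands as
\begin{equation*}
(ae_1+be_2)\otimes(ce_1+de_2)=ac\,e_1\otimes e_1+ad\,e_1\otimes e_2+bc\,e_2\otimes e_1+bd\,e_2\otimes e_2,
\end{equation*}
so its coefficients \(c^{ij}\) with respect to this basis satisfy \(c^{11}c^{22}=c^{12}c^{21}\). The element \(e_1\otimes e_1+e_2\otimes e_2\) has \(c^{11}c^{22}=1\) but \(c^{12}c^{21}=0\), so cannot be written as a single pure tensor.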

A nice application of the tensor product machinery is to the construction of the complexification of a vector space \(V\) over \(\RR\). As a vector space over \(\RR\), \(\CC\) is two dimensional with basis \(\{1,i\}\). Suppose \(\{e_1,\dots,e_n\}\) is a basis for the \(n\)-dimensional space \(V\). Then we can form the tensor product space \(\CC\otimes V\) over \(\RR\). As a real vector space this has basis \(\{1\otimes e_1,\dots,1\otimes e_n,i\otimes e_1,\dots,i\otimes e_n\}\) but we can define scalar multiplication by complex numbers simply as \(z(z'\otimes v)=(zz')\otimes v\). We should now demonstrate that \(\CC\otimes V\cong V_\CC\) where \(V_\CC\) is as defined in Realification and Complexification. Consider the map \(\phi:V_\CC\mapto\CC\otimes V\) defined by \(\phi(v,v')=1\otimes v+i\otimes v'\). This is clearly linear over \(\RR\). That it’s also linear over \(\CC\) follows since \(\phi(i(v,v'))=\phi(-v',v)=-1\otimes v'+i\otimes v=i(1\otimes v+i\otimes v')=i\phi(v,v')\). To verify that this is an isomorphism, we’ll construct the inverse map. The bilinear map \(\CC\times V\mapto V_\CC\) defined by \((z,v)\mapsto z(v,0)\) induces, by virtue of the universal mapping property, a linear map we’ll call \(\phi':\CC\otimes V\mapto V_\CC\) given by \(\phi'(z\otimes v)=z(v,0)\) on pure tensors. This map is also \(\CC\)-linear and since \(\phi\circ\phi'(z\otimes v)=\phi(z(v,0))=z\phi(v,0)=z(1\otimes v)=z\otimes v\), while \(\phi'\circ\phi(v,v')=(v,0)+i(v',0)=(v,v')\), we see that it is the inverse of \(\phi\).
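
To see the correspondence in a simple case (an illustrative choice of element), take \(z=1+2i\) and \(v=e_1\). Then
\begin{equation*}
\phi'((1+2i)\otimes e_1)=(1+2i)(e_1,0)=(e_1,2e_1),\qquad\phi(e_1,2e_1)=1\otimes e_1+i\otimes2e_1=(1+2i)\otimes e_1,
\end{equation*}
where we have used \(i(v,v')=(-v',v)\), so that \((a+bi)(v,0)=(av,bv)\) for real \(a\) and \(b\).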

Let us also note at this point that we have the obvious isomorphisms,
\begin{equation}
K\otimes V\cong V\cong V\otimes K,
\end{equation}
when \(V\) is a vector space over \(K\).

Definition and Construction of the Tensor Product

The central object of the following notes is the tensor product space. It is a vector space constructed from multiple vector spaces. The key attribute of such a space is that though it is a linear space it has an intrinsic multilinearity. Indeed, as we will see, tensor product spaces can be regarded equivalently as certain multilinear functions.

Given vector spaces \(V_1,\dots,V_n,W\) over a field \(K\), recall that a function \(f:V_1\times\dots\times V_n\mapto W\) is multilinear if it is linear in each entry in turn, that is, for each \(i=1,\dots, n\) and all \(a,b\in K\),
\begin{equation}
f(v_1,\dots,au_i+bv_i,\dots,v_n)=af(v_1,\dots,u_i,\dots,v_n)+bf(v_1,\dots,v_i,\dots,v_n).
\end{equation}
We encountered such functions already in a definition of the determinant as an alternating multilinear form (when the target space is the underlying field the word form is often encountered), also known as a volume form, as well as in the definition of inner products, excepting the complex Hermitian case, which were seen to be bilinear forms. They are clearly objects of considerable importance in their own right. However note the terminology. We have referred to these relationships as ‘functions’ or ‘forms’. They are (obviously) not linear maps. As such, they are not amenable to the main body of linear algebra technology we’ve developed. In particular we have no \(\ker\) or \(\img\) spaces. It turns out though that this apparent ‘otherness’ is somewhat illusory, for we can introduce a new vector space, called the tensor product space, such that, essentially, multilinear functions on cartesian products become linear maps defined on the new tensor product spaces, with the multilinearity of the function encapsulated in the structure of the new product space.

Let us begin by considering the bilinear case. Thus, we suppose we have three vector spaces \(V_1,V_2\) and \(W\) with a function \(f:V_1\times V_2\mapto W\) such that
\begin{align}
f(av_1+bv_1',v_2)&=af(v_1,v_2)+bf(v_1',v_2)\\
f(v_1,av_2+bv_2')&=af(v_1,v_2)+bf(v_1,v_2'),
\end{align}
that is, for any \(v_1\in V_1\) and \(v_2\in V_2\), \(f(v_1,-):V_2\mapto W\) and \(f(-,v_2):V_1\mapto W\) are both linear maps.

Definition Given vector spaces \(V_1\) and \(V_2\) over a field \(K\), by a tensor product of \(V_1\) and \(V_2\) will be meant a vector space, \(U\), over \(K\), together with a bilinear function, \(\iota:V_1\times V_2\mapto U\), that is, a pair, \((U,\iota)\), with the following universal mapping property: whenever \(f:V_1\times V_2\mapto W\) is a bilinear function with values in a vector space \(W\) over \(K\), then there exists a unique linear mapping \(L:U\mapto W\) such that, \(L\iota=f\). That is, such that the following diagram commutes.

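[Commutative diagram: \(\iota:V_1\times V_2\mapto U\) and \(L:U\mapto W\) composing to \(f:V_1\times V_2\mapto W\), that is, \(L\iota=f\).]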

Theorem If \(V_1\) and \(V_2\) are vector spaces over \(K\) then a tensor product, \((U,\iota)\) exists and is unique in the sense that if \((U_1,\iota_1)\) and \((U_2,\iota_2)\) are two tensor products then there exists a unique isomorphism \(\phi:U_1\mapto U_2\) with \(\iota_2=\phi\circ\iota_1\).

We’ll establish this result in two stages.

Proof (Uniqueness) By definition, the following two diagrams commute.

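[Commutative diagrams: \(L_1:U_1\mapto U_2\) with \(L_1\iota_1=\iota_2\), and \(L_2:U_2\mapto U_1\) with \(L_2\iota_2=\iota_1\).]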
So we have \(L_2L_1:U_1\mapto U_1\) such that \(L_2L_1\iota_1=L_2\iota_2=\iota_1\). But we then have the following two commutative diagrams,

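[Commutative diagrams: \(L_2L_1\iota_1=\iota_1\) and \(\id_{U_1}\iota_1=\iota_1\), with \(L_2L_1\) and \(\id_{U_1}\) both linear maps \(U_1\mapto U_1\).]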

and the uniqueness stipulation of the universal mapping property therefore implies \(L_2L_1=\id_{U_1}\). Clearly a similar argument leads to \(L_1L_2=\id_{U_2}\), thus \(U_1\) and \(U_2\) are indeed isomorphic, with \(L_1\) the desired isomorphism.
With uniqueness established, we can talk of the tensor product of vector spaces, \(V_1\) and \(V_2\), denoted simply \(V_1\otimes V_2\), with the bilinear function, \(\iota:V_1\times V_2\mapto V_1\otimes V_2\), understood to be part of the definition, but often not explicitly mentioned.

To establish existence, we need to recall the notion of the free vector space, \(F(S)\), on a set \(S\) over \(K\). This is the `formal enhancement’ of the set \(S\) with a formal scalar multiplication and a formal vector addition such that \(F(S)\) becomes a vector space consisting of finite sums \(a^1s_1+\dots+a^rs_r\) of the formal products \(a^is_i\) of elements \(a^i\in K\) and \(s_i\in S\). This is made precise in the following definition.

Definition The free vector space, \(F(S)\), on a set \(S\) over a field \(K\) is the set of all set-theoretic maps \(S\mapto K\) which vanish at all but a finite number of points of \(S\).

According to this definition, \(F(S)\) is a vector space over \(K\) with the usual pointwise addition and scalar multiplication, \((f+g)(s)=f(s)+g(s)\) and \((af)(s)=af(s)\). It has a basis consisting of the ‘delta functions’ \(\delta_s\) such that \(\delta_s(t)=1\) if \(s=t\) and 0 otherwise. Equivalence with the ‘formal enhancement of \(S\)’ definition is seen by observing that any \(f\in F(S)\) can be written uniquely as, \(f=f(s_1)\delta_{s_1}+\dots+f(s_r)\delta_{s_r}\) and so the formal finite sum \(a^1s_1+\dots+a^rs_r\) corresponds, upon identifying the elements \(s_i\) with their delta functions \(\delta_{s_i}\), to the map \(f\) such that \(f(s_i)=a^i\) for \(i=1,\dots,r\).

Note that if our set is a finite dimensional vector space \(V\) over \(K\) then \(F(V)\) is infinite dimensional as long as the field \(K\) is infinite (assuming \(V\) is not zero dimensional).

We now conclude the proof by constructing the tensor product space \(V_1\otimes V_2\) as a certain quotient of the free vector space, \(F(V_1\times V_2)\).

Proof (Existence) Consider \(F(V_1\times V_2)\), the free vector space over the product space, \(V_1\times V_2\), of vector spaces \(V_1\) and \(V_2\) over \(K\). In this space we identify a subspace \(D\) defined to be the span of all elements of the form
\begin{equation*}
(av_1+bv_1',v_2)-a(v_1,v_2)-b(v_1',v_2),
\end{equation*}
and
\begin{equation*}
(v_1,av_2+bv_2')-a(v_1,v_2)-b(v_1,v_2'),
\end{equation*}
where \(a,b\in K\), \(v_1,v_1’\in V_1\) and \(v_2,v_2’\in V_2\) and we’ve suppressed the \(\delta\) symbol, identifying \((v,w)\) with \(\delta_{(v,w)}\). Define \(V=F(V_1\times V_2)/D\) and \(\iota:V_1\times V_2\mapto V\) by \(\iota(v_1,v_2)=(v_1,v_2)+D\). Then \(\iota\) is bilinear by definition of the subspace \(D\). Also, note that any element of \(V\) can be expressed as some (finite) linear sum of elements of the image in \(V\) of \(\iota\). Now suppose we have a bilinear function \(f:V_1\times V_2\mapto W\) taking values in a vector space \(W\) over \(K\). We need to establish the existence and uniqueness of the linear map \(L\) such that the following diagram commutes.

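[Commutative diagram: \(\iota:V_1\times V_2\mapto V\) and \(L:V\mapto W\) composing to \(f:V_1\times V_2\mapto W\), that is, \(L\iota=f\).]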

Define \(L':F(V_1\times V_2)\mapto W\) by \(L'(v_1,v_2)=f(v_1,v_2)\) extended linearly to the whole of \(F(V_1\times V_2)\) according to \(L'(a(v_1,v_2)+b(v_1',v_2'))=af(v_1,v_2)+bf(v_1',v_2')\). Then since
\begin{equation*}
L'((av_1+bv_1',v_2)-a(v_1,v_2)-b(v_1',v_2))=f(av_1+bv_1',v_2)-af(v_1,v_2)-bf(v_1',v_2)=0,
\end{equation*}
and
\begin{equation*}
L'((v_1,av_2+bv_2')-a(v_1,v_2)-b(v_1,v_2'))=f(v_1,av_2+bv_2')-af(v_1,v_2)-bf(v_1,v_2')=0,
\end{equation*}
by the bilinearity of \(f\), we see that \(D\subseteq\ker L'\). \(L'\) therefore factors as \(L'=L\pi\) where \(\pi:F(V_1\times V_2)\mapto F(V_1\times V_2)/D\) is the quotient map. Clearly \(L\iota=f\) so we have demonstrated existence of the desired linear map. Uniqueness is immediate, given commutativity of the diagram above and the fact, already noted, that \(F(V_1\times V_2)/D\) is the linear span of the image of \(\iota\).\(\blacksquare\)

Though the definition and associated existence and uniqueness result are rather abstract, the basic idea is simple. A bilinear function on a product of spaces taking values in a third space `transfers’ its bilinearity to the tensor product space on which there is a corresponding linear transformation to the same target space. Indeed, if we denote by, \(\mathcal{L}(V_1,V_2;W)\), the vector space of bilinear functions on \(V_1\times V_2\) taking values in a vector space \(W\) over \(K\) and by, \(\mathcal{L}(V_1\otimes V_2,W)\), the vector space of linear maps from \(V_1\otimes V_2\) to \(W\), then we have the vector space isomorphism,
\begin{equation}
\mathcal{L}(V_1,V_2;W)\cong\mathcal{L}(V_1\otimes V_2,W).
\end{equation}
By the universal mapping property, to any \(f\in\mathcal{L}(V_1,V_2;W)\) there corresponds a unique linear map \(L_f:V_1\otimes V_2\mapto W\) so the mapping of sets here is such that \(f\mapsto L_f\). This is clearly linear. It is surjective since for any linear \(T:V_1\otimes V_2\mapto W\), \(T\iota\) is bilinear and, by uniqueness, \(L_{T\iota}=T\). It is injective since if \(f\in\mathcal{L}(V_1,V_2;W)\) is non-zero then \(L_f\iota=f\neq0\), so that \(L_f\neq0\). In particular, when \(W=K\), we have that the vector space of bilinear forms on \(V_1\times V_2\) is isomorphic to the dual of \(V_1\otimes V_2\), \((V_1\otimes V_2)^*\).

Elements of \(V_1\otimes V_2\) are written as linear sums of terms of the form, \(v_1\otimes v_2\), which are themselves images, via the bilinear function, \(\iota\), of elements, \((v_1,v_2)\in V_1\times V_2\), \(\iota(v_1,v_2)=v_1\otimes v_2\). Elements of \(V_1\otimes V_2\) of the form \(v_1\otimes v_2\) are called pure tensors. Although a small subset of all elements of \(V_1\otimes V_2\), as noted in the proof of existence, the pure tensors span the tensor product space.

Note that \(v_1\otimes v_2=0\) for some \(v_1\in V_1\) and \(v_2\in V_2\) if and only if every bilinear function \(f:V_1\times V_2\mapto W\) is zero on \((v_1,v_2)\). Consequently \(v_1\otimes v_2\neq 0\) if there exists some bilinear function \(f\) such that \(f(v_1,v_2)\neq 0\). Also note that \(v_1\otimes0=0\otimes v_2=0\) for any \(v_1\in V_1\) and \(v_2\in V_2\) since for any bilinear map, \(f\), \(f(v_1,0)=0=f(0,v_2)\).
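
Example As a sketch of how this criterion can be used, suppose \(v_1\neq0\) and \(v_2\neq0\). Assume we can choose linear functionals \(\phi\in V_1^*\) and \(\psi\in V_2^*\) with \(\phi(v_1)=1=\psi(v_2)\); this is possible, for instance, by extending \(v_1\) and \(v_2\) to bases of \(V_1\) and \(V_2\) respectively, which is always available in finite dimensions. The function
\begin{equation}
f(u_1,u_2)=\phi(u_1)\psi(u_2)
\end{equation}
is then bilinear with values in \(K\), and \(f(v_1,v_2)=1\neq0\). Hence \(v_1\otimes v_2\neq0\), so under this assumption a pure tensor vanishes only when one of its factors does.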

All of the above can be immediately generalised to more than two vector spaces. If we have \(r\) vector spaces, \(V_1,\dots, V_r\), over a field \(K\) then the tensor product \(V_1\otimes\dots\otimes V_r\) is the vector space over \(K\), unique up to isomorphism, together with the \(r\)-linear function \(\iota:V_1\times\dots\times V_r\mapto V_1\otimes\dots\otimes V_r\) such that to any \(r\)-linear function \(f:V_1\times\dots\times V_r\mapto W\), with \(W\) a vector space over \(K\), there is a unique linear mapping \(L:V_1\otimes\dots\otimes V_r\mapto W\) such that \(L\iota=f\). In this more general context we have, of course,
\begin{equation}
\mathcal{L}(V_1,\dots,V_r;W)\cong\mathcal{L}(V_1\otimes\dots\otimes V_r,W),
\end{equation}
and in particular,
\begin{equation}
\mathcal{L}(V_1,\dots,V_r;K)\cong(V_1\otimes\dots\otimes V_r)^*.
\end{equation}

Given vector spaces \(V_1,\dots, V_r\) and \(W_1,\dots, W_r\) over \(K\) with linear maps \(A_i:V_i\mapto W_i\) we can form the \(r\)-linear function from \(V_1\times\dots\times V_r\) to \(W_1\otimes\dots\otimes W_r\) such that \((v_1,\dots,v_r)\mapsto A_1(v_1)\otimes\dots\otimes A_r(v_r)\). That this is indeed an \(r\)-linear map follows immediately from the linearity of the \(A_i\) and the \(r\)-linearity of the map \(\iota\) into \(W_1\otimes\dots\otimes W_r\). The universal mapping property then gives us a linear map from \(V_1\otimes\dots\otimes V_r\) to \(W_1\otimes\dots\otimes W_r\) such that \(v_1\otimes\dots\otimes v_r\mapsto A_1(v_1)\otimes\dots\otimes A_r(v_r)\). We’ll write this map as \(A_1\otimes\dots\otimes A_r\) and call it the tensor product of the linear maps \(A_i\).
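
Example As a worked instance, assume \(r=2\) and \(V_1=V_2=W_1=W_2=K^2\) with standard basis \(\{e_1,e_2\}\), and take the linear maps \(A\) and \(B\), chosen here only for illustration, defined by \(Ae_1=e_1+e_2\), \(Ae_2=e_2\), \(Be_1=e_2\) and \(Be_2=e_1-e_2\). Then on pure tensors we have, for example,
\begin{equation}
(A\otimes B)(e_1\otimes e_2)=(e_1+e_2)\otimes(e_1-e_2)=e_1\otimes e_1-e_1\otimes e_2+e_2\otimes e_1-e_2\otimes e_2,
\end{equation}
and
\begin{equation}
(A\otimes B)(e_2\otimes e_1)=Ae_2\otimes Be_1=e_2\otimes e_2.
\end{equation}
Since \(A\otimes B\) is linear, it acts term by term on sums of pure tensors, so that \((A\otimes B)(e_1\otimes e_2+e_2\otimes e_1)=e_1\otimes e_1-e_1\otimes e_2+e_2\otimes e_1\).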