Definition and Construction of the Tensor Product

The central object of the following notes is the tensor product space. It is a vector space constructed from multiple vector spaces. The key attribute of such a space is that though it is a linear space it has an intrinsic multilinearity. Indeed, as we will see, tensor product spaces can be regarded equivalently as certain multilinear functions.

Given vector spaces \(V_1,\dots,V_n,W\) over a field \(K\), recall that a function \(f:V_1\times\dots\times V_n\mapto W\) is multilinear if it is linear in each entry in turn, that is, for each \(i=1,\dots, n\), all \(a,b\in K\) and all \(u_i,v_i\in V_i\),
\begin{equation}
f(v_1,\dots,au_i+bv_i,\dots,v_n)=af(v_1,\dots,u_i,\dots,v_n)+bf(v_1,\dots,v_i,\dots,v_n).
\end{equation}
We have already encountered such functions: in the definition of the determinant as an alternating multilinear form, also known as a volume form (the word ‘form’ is often used when the target space is the underlying field), and in the definition of inner products which, excepting the complex Hermitian case, were seen to be bilinear forms. They are clearly objects of considerable importance in their own right. However, note the terminology. We have referred to them as ‘functions’ or ‘forms’. Regarded as functions on the vector space \(V_1\times\dots\times V_n\) they are not, in general, linear maps. As such, they are not amenable to the main body of linear algebra technology we’ve developed. In particular we have no \(\ker\) or \(\img\) spaces. It turns out, though, that this apparent ‘otherness’ is somewhat illusory, for we can introduce a new vector space, called the tensor product space, such that, essentially, multilinear functions on cartesian products become linear maps defined on the new tensor product space, with the multilinearity of the function encapsulated in the structure of that space.
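
To illustrate the point that multilinear functions are not linear maps, here is a minimal example, in which the choice \(K=\mathbb{R}\) and the function \(f\) are assumptions made purely for illustration. Take \(f:\mathbb{R}\times\mathbb{R}\mapto\mathbb{R}\) with \(f(x,y)=xy\). It is linear in each entry separately, but on the vector space \(\mathbb{R}\times\mathbb{R}\) scaling both entries together gives
\begin{equation*}
f(a(x,y))=f(ax,ay)=a^2xy=a^2f(x,y),
\end{equation*}
so that, for example, \(f(2(1,1))=4\) whereas \(2f(1,1)=2\).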

Let us begin by considering the bilinear case. Thus, we suppose we have three vector spaces \(V_1,V_2\) and \(W\) with a function \(f:V_1\times V_2\mapto W\) such that
\begin{align}
f(av_1+bv_1',v_2)&=af(v_1,v_2)+bf(v_1',v_2)\\
f(v_1,av_2+bv_2')&=af(v_1,v_2)+bf(v_1,v_2'),
\end{align}
for all \(a,b\in K\), that is, for any \(v_1\in V_1\) and \(v_2\in V_2\), \(f(v_1,-):V_2\mapto W\) and \(f(-,v_2):V_1\mapto W\) are both linear maps.
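
As a sketch of how these conditions are checked in practice, consider the \(2\times2\) determinant regarded as a function of its columns. The choice \(V_1=V_2=K^2\), \(W=K\) and the coordinate notation \(x=(x_1,x_2)\), \(y=(y_1,y_2)\) are assumptions of this example. With \(f(x,y)=x_1y_2-x_2y_1\) we have, in the first entry,
\begin{align*}
f(ax+bx',y)&=(ax_1+bx_1')y_2-(ax_2+bx_2')y_1\\
&=a(x_1y_2-x_2y_1)+b(x_1'y_2-x_2'y_1)\\
&=af(x,y)+bf(x',y),
\end{align*}
and the computation for the second entry is entirely similar.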

Definition Given vector spaces \(V_1\) and \(V_2\) over a field \(K\), by a tensor product of \(V_1\) and \(V_2\) will be meant a vector space, \(U\), over \(K\), together with a bilinear function, \(\iota:V_1\times V_2\mapto U\), that is, a pair, \((U,\iota)\), with the following universal mapping property: whenever \(f:V_1\times V_2\mapto W\) is a bilinear function with values in a vector space \(W\) over \(K\), then there exists a unique linear mapping \(L:U\mapto W\) such that, \(L\iota=f\). That is, such that the following diagram commutes.

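[Commutative diagram: \(\iota:V_1\times V_2\mapto U\), \(f:V_1\times V_2\mapto W\) and \(L:U\mapto W\), with \(L\iota=f\).]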

Theorem If \(V_1\) and \(V_2\) are vector spaces over \(K\) then a tensor product \((U,\iota)\) exists and is unique in the sense that if \((U_1,\iota_1)\) and \((U_2,\iota_2)\) are two tensor products then there exists a unique isomorphism \(\phi:U_1\mapto U_2\) with \(\iota_2=\phi\circ\iota_1\).

We’ll establish this result in two stages.

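Proof (Uniqueness) Applying the universal mapping property of \((U_1,\iota_1)\) to the bilinear function \(\iota_2\), and that of \((U_2,\iota_2)\) to the bilinear function \(\iota_1\), we obtain unique linear maps \(L_1:U_1\mapto U_2\) and \(L_2:U_2\mapto U_1\) such that the following two diagrams commute.

[Commutative diagrams: \(L_1\iota_1=\iota_2\) and \(L_2\iota_2=\iota_1\).]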
So we have \(L_2L_1:U_1\mapto U_1\) such that \(L_2L_1\iota_1=L_2\iota_2=\iota_1\). But we then have the following two commutative diagrams,

[Commutative diagrams: \((L_2L_1)\iota_1=\iota_1\) and \(\id_{U_1}\iota_1=\iota_1\).]

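and the uniqueness stipulation of the universal mapping property therefore implies \(L_2L_1=\id_{U_1}\). A similar argument leads to \(L_1L_2=\id_{U_2}\), thus \(U_1\) and \(U_2\) are indeed isomorphic, with \(\phi=L_1\) the desired isomorphism. It is the only one, since any linear \(\phi\) with \(\iota_2=\phi\circ\iota_1\) must coincide with \(L_1\) by the same uniqueness stipulation.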
With uniqueness established, we can talk of the tensor product of vector spaces, \(V_1\) and \(V_2\), denoted simply \(V_1\otimes V_2\), with the bilinear function, \(\iota:V_1\times V_2\mapto V_1\otimes V_2\), understood to be part of the definition, but often not explicitly mentioned.

To establish existence, we need to recall the notion of the free vector space, \(F(S)\), on a set \(S\) over \(K\). This is the ‘formal enhancement’ of the set \(S\) with a formal scalar multiplication and a formal vector addition such that \(F(S)\) becomes a vector space consisting of finite sums \(a^1s_1+\dots+a^rs_r\) of the formal products \(a^is_i\) of elements \(a^i\in K\) and \(s_i\in S\). This is made precise in the following definition.

Definition The free vector space, \(F(S)\), on a set \(S\) over a field \(K\) is the set of all set-theoretic maps \(S\mapto K\) which vanish at all but a finite number of points of \(S\).

According to this definition, \(F(S)\) is a vector space over \(K\) with the usual pointwise addition and scalar multiplication, \((f+g)(s)=f(s)+g(s)\) and \((af)(s)=af(s)\). It has a basis consisting of the ‘delta functions’ \(\delta_s\) such that \(\delta_s(t)=1\) if \(s=t\) and 0 otherwise. Equivalence with the ‘formal enhancement of \(S\)’ definition is seen by observing that any \(f\in F(S)\) can be written uniquely as \(f=f(s_1)\delta_{s_1}+\dots+f(s_r)\delta_{s_r}\), where \(s_1,\dots,s_r\) are the points of \(S\) at which \(f\) is non-zero, and so the formal finite sum \(a^1s_1+\dots+a^rs_r\) corresponds, upon identifying the elements \(s_i\) with their delta functions \(\delta_{s_i}\), to the map \(f\) such that \(f(s_i)=a^i\) for \(i=1,\dots,r\) and \(f(s)=0\) for all other \(s\in S\).

Note that if our set is a finite dimensional vector space \(V\) over \(K\) then \(F(V)\) is infinite dimensional as long as the field \(K\) is infinite (assuming \(V\) is not zero dimensional).
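
A small example may help fix ideas; the particular sets chosen here are assumptions for illustration only. For a two element set \(S=\{p,q\}\), \(F(S)\) is two dimensional with basis \(\delta_p,\delta_q\). By contrast, taking \(S=\mathbb{R}\), regarded merely as a set, the formal sum \(\delta_1+\delta_1=2\delta_1\) is not the basis element \(\delta_2\), since
\begin{equation*}
(2\delta_1)(1)=2\quad\text{whereas}\quad\delta_2(1)=0.
\end{equation*}
The vector space structure of \(F(V)\) thus takes no account of that of \(V\).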

We now conclude the proof by constructing the tensor product space \(V_1\otimes V_2\) as a certain quotient of the free vector space, \(F(V_1\times V_2)\).

Proof (Existence) Consider \(F(V_1\times V_2)\), the free vector space on the product \(V_1\times V_2\) of vector spaces \(V_1\) and \(V_2\) over \(K\). In this space we identify a subspace \(D\) defined to be the span of all elements of the form
\begin{equation*}
(av_1+bv_1',v_2)-a(v_1,v_2)-b(v_1',v_2),
\end{equation*}
and
\begin{equation*}
(v_1,av_2+bv_2')-a(v_1,v_2)-b(v_1,v_2'),
\end{equation*}
where \(a,b\in K\), \(v_1,v_1'\in V_1\) and \(v_2,v_2'\in V_2\) and we’ve suppressed the \(\delta\) symbol, identifying \((v,w)\) with \(\delta_{(v,w)}\). Define \(V=F(V_1\times V_2)/D\) and \(\iota:V_1\times V_2\mapto V\) by \(\iota(v_1,v_2)=(v_1,v_2)+D\). Then \(\iota\) is bilinear by definition of the subspace \(D\). Also, note that any element of \(V\) can be expressed as a (finite) linear combination of elements of the image of \(\iota\) in \(V\), since the elements \((v_1,v_2)\) form a basis of \(F(V_1\times V_2)\). Now suppose we have a bilinear function \(f:V_1\times V_2\mapto W\) taking values in a vector space \(W\) over \(K\). We need to establish the existence and uniqueness of the linear map \(L\) such that the following diagram commutes.

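[Commutative diagram: \(\iota:V_1\times V_2\mapto V\), \(f:V_1\times V_2\mapto W\) and \(L:V\mapto W\), with \(L\iota=f\).]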

Define \(L':F(V_1\times V_2)\mapto W\) by \(L'(v_1,v_2)=f(v_1,v_2)\) extended linearly to the whole of \(F(V_1\times V_2)\) according to \(L'(a(v_1,v_2)+b(v_1',v_2'))=af(v_1,v_2)+bf(v_1',v_2')\). Then since
\begin{equation*}
L'((av_1+bv_1',v_2)-a(v_1,v_2)-b(v_1',v_2))=f(av_1+bv_1',v_2)-af(v_1,v_2)-bf(v_1',v_2)=0,
\end{equation*}
and
\begin{equation*}
L'((v_1,av_2+bv_2')-a(v_1,v_2)-b(v_1,v_2'))=f(v_1,av_2+bv_2')-af(v_1,v_2)-bf(v_1,v_2')=0,
\end{equation*}
by the bilinearity of \(f\), we see that \(D\subseteq\ker L'\). \(L'\) therefore factors as \(L'=L\pi\) where \(\pi:F(V_1\times V_2)\mapto F(V_1\times V_2)/D\) is the quotient map and \(L:F(V_1\times V_2)/D\mapto W\) is linear. Since \(\iota(v_1,v_2)=\pi(v_1,v_2)\), we have \(L\iota(v_1,v_2)=L'(v_1,v_2)=f(v_1,v_2)\), that is, \(L\iota=f\), so we have demonstrated existence of the desired linear map. Uniqueness is immediate, given commutativity of the diagram above and the fact, already noted, that \(F(V_1\times V_2)/D\) is the linear span of the image of \(\iota\).\(\blacksquare\)
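
To see the effect of the quotient in the simplest case, here is a short check, in which the specific scalars are chosen only for illustration. In \(F(V_1\times V_2)\) the elements \((2v_1,v_2)\) and \(2(v_1,v_2)\) are distinct whenever \(v_1\neq0\), but taking \(a=2\) and \(b=0\) in the first family of elements spanning \(D\) gives
\begin{equation*}
(2v_1,v_2)-2(v_1,v_2)\in D,\quad\text{so that}\quad\iota(2v_1,v_2)=2\,\iota(v_1,v_2).
\end{equation*}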

Though the definition and associated existence and uniqueness result are rather abstract, the basic idea is simple. A bilinear function on a product of spaces taking values in a third space ‘transfers’ its bilinearity to the tensor product space, on which there is a corresponding linear transformation to the same target space. Indeed, if we denote by \(\mathcal{L}(V_1,V_2;W)\) the vector space of bilinear functions on \(V_1\times V_2\) taking values in a vector space \(W\) over \(K\), and by \(\mathcal{L}(V_1\otimes V_2,W)\) the vector space of linear maps from \(V_1\otimes V_2\) to \(W\), then we have the vector space isomorphism,
\begin{equation}
\mathcal{L}(V_1,V_2;W)\cong\mathcal{L}(V_1\otimes V_2,W).
\end{equation}
By the universal mapping property, to any \(f\in\mathcal{L}(V_1,V_2;W)\) there corresponds a unique linear map \(L_f:V_1\otimes V_2\mapto W\) with \(L_f\iota=f\), so the mapping here is \(f\mapsto L_f\). This is linear, since \((aL_f+bL_g)\iota=af+bg\) and so, by uniqueness, \(L_{af+bg}=aL_f+bL_g\). It is surjective since for any linear \(T:V_1\otimes V_2\mapto W\), \(T\iota\) is bilinear and, again by uniqueness, \(L_{T\iota}=T\). It is injective since if \(f\in\mathcal{L}(V_1,V_2;W)\) is non-zero then \(L_f\iota=f\neq0\), so that \(L_f\neq0\). In particular, when \(W=K\), we have that the vector space of bilinear forms on \(V_1\times V_2\) is isomorphic to the dual of \(V_1\otimes V_2\), \((V_1\otimes V_2)^*\).
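
As an illustration of this correspondence, under the assumptions \(V_1=V_2=K^2\) and \(W=K\) (the matrix of scalars \(a_{ij}\in K\) is likewise introduced only for this example), the bilinear form
\begin{equation*}
f_A(x,y)=\sum_{i,j=1}^2a_{ij}x_iy_j
\end{equation*}
corresponds to the linear functional \(L_{f_A}\) on \(K^2\otimes K^2\) determined by \(L_{f_A}\iota(x,y)=f_A(x,y)\).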

Elements of \(V_1\otimes V_2\) are written as linear sums of terms of the form, \(v_1\otimes v_2\), which are themselves images, via the bilinear function, \(\iota\), of elements, \((v_1,v_2)\in V_1\times V_2\), \(\iota(v_1,v_2)=v_1\otimes v_2\). Elements of \(V_1\otimes V_2\) of the form \(v_1\otimes v_2\) are called pure tensors. Although a small subset of all elements of \(V_1\otimes V_2\), as noted in the proof of existence, the pure tensors span the tensor product space.
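
To see that the pure tensors really are a proper subset, here is a sketch of an argument in a simple case. The choice \(V_1=V_2=K^2\) with standard basis \(e_1,e_2\), and the bilinear forms \(f_{ij}(x,y)=x_iy_j\), are assumptions of this example. Let \(L_{ij}\) be the linear map on \(K^2\otimes K^2\) corresponding to \(f_{ij}\), so that \(L_{ij}(x\otimes y)=x_iy_j\), and suppose that \(t=e_1\otimes e_1+e_2\otimes e_2\) were a pure tensor, \(t=x\otimes y\). Applying the \(L_{ij}\) to both expressions for \(t\) gives
\begin{equation*}
x_1y_1=1,\quad x_2y_2=1,\quad x_1y_2=0,\quad x_2y_1=0,
\end{equation*}
which is impossible, since \((x_1y_1)(x_2y_2)=(x_1y_2)(x_2y_1)\). So \(t\) is not a pure tensor.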

Note that \(v_1\otimes v_2=0\) for some \(v_1\in V_1\) and \(v_2\in V_2\) if and only if every bilinear function \(f:V_1\times V_2\mapto W\) is zero on \((v_1,v_2)\). Consequently \(v_1\otimes v_2\neq 0\) if there exists some bilinear function \(f\) such that \(f(v_1,v_2)\neq 0\). Also note that \(v_1\otimes0=0\otimes v_2=0\) for any \(v_1\in V_1\) and \(v_2\in V_2\) since for any bilinear map, \(f\), \(f(v_1,0)=0=f(0,v_2)\).
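
For instance, under the assumption \(V_1=V_2=K^n\) (made only for this illustration), if \(x,y\in K^n\) are non-zero, with \(x_i\neq0\) and \(y_j\neq0\) say, then the bilinear form \(f(u,w)=u_iw_j\) satisfies
\begin{equation*}
f(x,y)=x_iy_j\neq0,
\end{equation*}
so that \(x\otimes y\neq0\).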

All of the above can be immediately generalised to more than 2 vector spaces. If we have \(r\) vector spaces, \(V_1,\dots, V_r\) over a field \(K\) then the tensor product \(V_1\otimes\dots\otimes V_r\) is the unique vector space over \(K\) together with the \(r\)-linear function \(\iota:V_1\times\dots\times V_r\mapto V_1\otimes\dots\otimes V_r\) such that to any \(r\)-linear function \(f:V_1\times\dots\times V_r\mapto W\), \(W\) a vector space over \(K\), there is a unique linear mapping \(L:V_1\otimes\dots\otimes V_r\mapto W\) such that \(L\iota=f\). In this more general context we have, of course,
\begin{equation}
\mathcal{L}(V_1,\dots,V_r;W)\cong\mathcal{L}(V_1\otimes\dots\otimes V_r,W),
\end{equation}
and in particular,
\begin{equation}
\mathcal{L}(V_1,\dots,V_r;K)\cong(V_1\otimes\dots\otimes V_r)^*.
\end{equation}

Given vector spaces \(V_1,\dots, V_r\) and \(W_1,\dots, W_r\) over \(K\) with linear maps \(A_i:V_i\mapto W_i\) we can form the \(r\)-linear function from \(V_1\times\dots\times V_r\) to \(W_1\otimes\dots\otimes W_r\) such that \((v_1,\dots,v_r)\mapsto A_1(v_1)\otimes\dots\otimes A_r(v_r)\). That this is indeed an \(r\)-linear function follows immediately from the linearity of the \(A_i\) and the \(r\)-linearity of the map \(\iota\) into \(W_1\otimes\dots\otimes W_r\). The universal mapping property then gives us a linear map from \(V_1\otimes\dots\otimes V_r\) to \(W_1\otimes\dots\otimes W_r\) such that \(v_1\otimes\dots\otimes v_r\mapsto A_1(v_1)\otimes\dots\otimes A_r(v_r)\), which we’ll write as \(A_1\otimes\dots\otimes A_r\) and call the tensor product of the linear maps \(A_i\).
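
As a simple check of the definition, consider the case \(r=2\) with \(A_1=a\,\id_{V_1}\) and \(A_2=b\,\id_{V_2}\) for scalars \(a,b\in K\); these particular maps are chosen here only for illustration. On pure tensors,
\begin{equation*}
(A_1\otimes A_2)(v_1\otimes v_2)=(av_1)\otimes(bv_2)=ab\,(v_1\otimes v_2),
\end{equation*}
and since the pure tensors span \(V_1\otimes V_2\) it follows that \(A_1\otimes A_2=ab\,\id_{V_1\otimes V_2}\).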