Linear Transformations and Matrices

Considering structure-preserving maps between vector spaces leads to the following definition.

Definition A linear transformation is any map \(T:V\mapto W\) between vector spaces \(V\) and \(W\) over the same field \(K\) which preserves vector addition and scalar multiplication, that is, \(T(au+bv)=aTu+bTv\) for all \(a,b\in K\) and \(u,v\in V\). We’ll call a linear transformation from a vector space to itself, \(T:V\mapto V\), a linear operator on \(V\).

The kernel of such a linear transformation is \(\ker T=\{v\in V\mid Tv=0\}\) and the image is \(\img T=\{w\in W\mid w=Tv\text{ for some }v\in V\}\). The linearity of \(T\) means that they are vector subspaces of \(V\) and \(W\) respectively. The dimension of \(\ker T\) is called the nullity of \(T\), while the dimension of \(\img T\) is called the rank of \(T\), denoted \(\rank(T)\). They are related to the dimension of the vector space \(V\) as follows.
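
For a simple illustration of these notions, take \(T:K^3\mapto K^2\) to be the projection defined by \(T(x,y,z)=(x,y)\). Then
\begin{equation*}
\ker T=\{(0,0,z)\mid z\in K\},\qquad \img T=K^2,
\end{equation*}
so the nullity of \(T\) is \(1\) and its rank is \(2\).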

Theorem (Rank-nullity theorem) If \(T:V\mapto W\) is a linear transformation and \(V\) is finite dimensional then,\begin{equation}
\dim \ker T +\dim \img T =\dim V .\label{equ:dimension equation}
\end{equation}

Proof  Take \(\{k_i\}\), \(1\leq i\leq r\), to be a basis of \(\ker T\). We know that we can extend this to a basis of \(V\), \(\{k_1,\dots,k_r,h_1,\dots,h_s\}\). Consider then the set \(\{h'_i\}\), \(1\leq i\leq s\), whose elements are defined by \(h'_i=Th_i\). Any element of \(\img T\) is of the form \(T(c^1k_1+\dots+c^rk_r+d^1h_1+\dots+d^sh_s)= d^1h'_1+\dots+d^sh'_s\), since the \(k_i\) are annihilated by \(T\), so \(\Span(h'_1,\dots,h'_s)=\img T\). Furthermore, suppose we could find \(c^i\), not all zero, such that \(c^1h'_1+\dots+c^sh'_s=0\). Then we would have \(T(c^1h_1+\dots+c^sh_s)=0\), that is, \(c^1h_1+\dots+c^sh_s\in\ker T\), so that \(c^1h_1+\dots+c^sh_s=a^1k_1+\dots+a^rk_r\) for some \(a^i\), contradicting the linear independence of the basis of \(V\). Thus \(\{h'_i\}\), \(1\leq i\leq s\), is a basis for \(\img T\) and the result follows.\(\blacksquare\)
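
As a quick check of the theorem, consider for example the differentiation operator \(D:P_n\mapto P_n\), \(Dp=p'\), on the space, here denoted \(P_n\), of real polynomials of degree at most \(n\), a vector space of dimension \(n+1\). Its kernel consists of the constant polynomials and its image of the polynomials of degree at most \(n-1\), so that
\begin{equation*}
\dim\ker D+\dim\img D=1+n=\dim P_n .
\end{equation*}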

It is of course the case that \(T\) is one-to-one (injective) if and only if \(\ker T=\{0\}\) and is onto (surjective) if and only if \(\img T =W\). This theorem thus tells us that if \(V\) and \(W\) are finite dimensional of the same dimension then \(T\) is one-to-one if and only if it is onto.

\(T\) is said to be invertible if there exists a linear transformation \(S:W\mapto V\) such that \(TS=\id_W\), the identity operator on \(W\), and \(ST=\id_V\), the identity operator on \(V\). In this case \(S\) is called the inverse of \(T\) (there can be only one) and is denoted \(T^{-1}\). \(T\) is invertible if and only if it is both one-to-one and onto, in which case we call it an isomorphism. Notice that being one-to-one and being onto are equivalent, respectively, to \(T\) having a left inverse and a right inverse. Indeed, if \(T\) has a left inverse, \(S:W\mapto V\) such that \(ST=\id_V\), then for any \(v,v'\in V\) we have \(Tv=Tv'\Rightarrow STv=STv'\Rightarrow v=v'\). Conversely, if \(T\) is one-to-one then we can define a left inverse \(S:W\mapto V\) by setting \(Sw=v\) whenever \(w=Tv\in\img T\) (and extending it to the rest of \(W\), for example by sending a complement of \(\img T\) to zero); \(T\) being one-to-one, this prescription is well-defined (single-valued). If \(T\) has a right inverse \(S:W\mapto V\) such that \(TS=\id_W\) then for any \(w\in W\) we can write \(w=\id_W w=TSw\), so \(T\) is certainly onto. Conversely, if \(T\) is onto then a right inverse can be constructed by choosing a preimage for each element of a basis of \(W\) and extending linearly; that such choices can always be made relies, in general, on the axiom of choice.
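
To illustrate the asymmetry, the map \(T:K^2\mapto K^3\) defined by \(T(x,y)=(x,y,0)\) is one-to-one but not onto, and \(S:K^3\mapto K^2\) defined by \(S(x,y,z)=(x,y)\) is a left inverse for it, \(ST=\id_{K^2}\). It is not a right inverse, since \(TS(x,y,z)=(x,y,0)\), and indeed \(T\) can have no right inverse because it is not onto.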

It is not difficult to see that two finite dimensional vector spaces \(V\) and \(W\) are isomorphic if and only if \(\dim(V)=\dim(W)\) and a linear transformation \(T:V\mapto W\) between two such spaces is an isomorphism if and only if \(\rank(T)=n\) where \(n\) is the common dimension. In other words, we have the following characterisation.

Finite dimensional vector spaces are completely classified, up to isomorphism, by their dimension.

Indeed, any \(n\)-dimensional vector space over \(K\) is isomorphic to \(K^n\), the space of all \(n\)-tuples of elements of the field \(K\). Explicitly, given a basis \(\{e_i\}\) of a vector space \(V\), this isomorphism identifies a vector \(v\) with the column vector \(\mathbf{v}\) of its components with respect to that basis,\begin{equation*}
v=v^ie_i\longleftrightarrow
\begin{pmatrix}
v^1\\
\vdots\\
v^n
\end{pmatrix}.
\end{equation*}
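
For instance, taking \(V=\mathbb{R}^2\) with the basis \(e_1=(1,1)\), \(e_2=(1,-1)\), the vector \(v=(3,1)\) satisfies \(v=2e_1+e_2\) and is therefore identified with the column vector
\begin{equation*}
\begin{pmatrix}
2\\
1
\end{pmatrix},
\end{equation*}
whose entries depend on the chosen basis and not just on \(v\).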

Clearly, a linear transformation \(T:V\mapto W\) is uniquely specified by its action on basis elements. If bases \(\{e_i\}\) and \(\{f_i\}\) are chosen for the \(n\) and \(m\) dimensional vector spaces \(V\) and \(W\) respectively, then we can write any vector \(v\in V\) as \(v=v^ie_i\). For any such \(v\in V\) there is an element \(w\in W\) such that \(Tv=w\) and of course we can write it as \(w=w^jf_j\). But there must also exist numbers \(T_i^j\) such that \(Te_i=T_i^jf_j\), so we have \(Tv=v^iT_i^jf_j=w^jf_j=w\), that is, \(w^j=T_i^jv^i\), which can be summarised in terms of matrices as
\begin{equation}
\begin{pmatrix}
w^1\\
\vdots\\
w^m
\end{pmatrix}=\begin{pmatrix}
T_1^1&\dots&T_n^1\\
\vdots&\ddots&\vdots\\
T_1^m&\dots&T_n^m
\end{pmatrix}\begin{pmatrix}
v^1\\
\vdots\\
v^n
\end{pmatrix}.
\end{equation}
That is, \(\mathbf{w}=\mathbf{T}\mathbf{v}\) is the matrix version of \(w=Tv\). The matrix \(\mathbf{T}\) is called the matrix representation of the linear transformation \(T\), and addition, scalar multiplication and composition of linear transformations correspond respectively to matrix addition, multiplication of a matrix by a scalar and matrix multiplication.
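
For a concrete instance, take the differentiation operator \(D\) of the earlier example on \(P_2\), with the basis \(\{1,x,x^2\}\) chosen for both the source and the target. Since \(D1=0\), \(Dx=1\) and \(Dx^2=2x\), its matrix representation is
\begin{equation*}
\mathbf{D}=\begin{pmatrix}
0&1&0\\
0&0&2\\
0&0&0
\end{pmatrix},
\end{equation*}
and one checks that the coordinate vector \((a,b,c)^T\) of \(p=a+bx+cx^2\) is indeed sent to the coordinate vector \((b,2c,0)^T\) of \(Dp=b+2cx\).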

Conversely, given a choice of bases \(\{e_i\}\) and \(\{f_i\}\) for vector spaces \(V\) and \(W\) of dimensions \(n\) and \(m\) respectively, any \(m\times n\) matrix \(\mathbf{A}\) gives rise to a linear transformation \(L_\mathbf{A}:V\mapto W\) defined by \(L_\mathbf{A}v=L_\mathbf{A}(v^ie_i)=A_i^jv^if_j\) for all \(v\in V\). Of course, having chosen bases for \(V\) and \(W\) we also have isomorphisms \(V\cong K^n\) and \(W\cong K^m\) so the following diagram commutes:
\begin{equation}
\begin{CD}
K^n @>\mathbf{A}>> K^m\\
@VV\cong V @VV\cong V\\
V @>L_{\mathbf{A}}>> W
\end{CD}
\end{equation}

Denoting by \(\mathcal{L}(V,W)\) the set of linear transformations between vector spaces \(V\) and \(W\), it is clear that \(\mathcal{L}(V,W)\) is a vector space, and we may summarise the preceding discussion in the following theorem.

Theorem A choice of bases for vector spaces \(V\) and \(W\), of dimensions \(n\) and \(m\) respectively, defines a vector space isomorphism \(\mathcal{L}(V,W)\cong\text{Mat}_{m,n}(K)\).

A consequence of this is that
\begin{equation}
\dim\mathcal{L}(V,W)=\dim\text{Mat}_{m,n}(K)=nm=\dim V\dim W.
\end{equation}
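
For instance, when \(n=3\) and \(m=2\) a basis of \(\text{Mat}_{2,3}(K)\) is given by the six matrices each having a single entry equal to \(1\) and all other entries \(0\), since
\begin{equation*}
\begin{pmatrix}
A_1^1&A_2^1&A_3^1\\
A_1^2&A_2^2&A_3^2
\end{pmatrix}
=A_1^1\begin{pmatrix}
1&0&0\\
0&0&0
\end{pmatrix}+A_2^1\begin{pmatrix}
0&1&0\\
0&0&0
\end{pmatrix}+\dots+A_3^2\begin{pmatrix}
0&0&0\\
0&0&1
\end{pmatrix}
\end{equation*}
uniquely, so \(\dim\mathcal{L}(V,W)=6\) whenever \(\dim V=3\) and \(\dim W=2\).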

A linear operator \(T:V\mapto V\) is called an automorphism if it is an isomorphism. The set of all linear operators on a vector space \(V\) is denoted \(\mathcal{L}(V)\) and is of course a vector space in its own right. The automorphisms of a vector space \(V\), denoted \(\text{GL}(V)\), form a group called the general linear group of \(V\). If \(T\in\text{GL}(V)\) and \(\{e_i\}\) is some basis of \(V\) then clearly \(\{Te_i\}\) is also a basis, identical to the original if and only if \(T=\id_V\). Conversely, if \(\{e'_i\}\) is some other basis of \(V\) then the linear operator \(T\) defined by \(Te_i=e'_i\) is an automorphism.

The invertibility of a linear transformation \(T\in\mathcal{L}(V,W)\) is equivalent to the invertibility, once bases are chosen, of the matrix representation of the transformation. Indeed, the invertibility of any matrix \(\mathbf{A}\) is equivalent to the invertibility of the corresponding linear transformation \(L_\mathbf{A}\) (which in turn means an invertible matrix must be square and of rank \(n=\dim V\)). We denote by \(\text{GL}_n(K)\) the group of automorphisms of \(K^n\), that is, the group of invertible \(n\times n\) matrices over \(K\). It is not difficult to see that the isomorphism \(V\cong K^n\) induces an isomorphism \(\text{GL}(V)\cong\text{GL}_n(K)\).
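
As a simple example in \(\text{GL}_2(K)\),
\begin{equation*}
\begin{pmatrix}
1&1\\
0&1
\end{pmatrix}
\begin{pmatrix}
1&-1\\
0&1
\end{pmatrix}=\begin{pmatrix}
1&0\\
0&1
\end{pmatrix},
\end{equation*}
and the product in the opposite order is also the identity, so each of these matrices is the inverse of the other; the linear operators \(L_\mathbf{A}\) they define on any two dimensional vector space, with respect to a chosen basis, are then automorphisms.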