Change of Basis

Suppose \(V\) is an \(n\)-dimensional vector space with basis \(\{e_i\}\). Then we may identify \(v\in V\) with the column vector, \(\mathbf{v}\), of its components with respect to this basis,
\begin{equation*}
\mathbf{v}=
\begin{pmatrix}
v^1\\
\vdots\\
v^n
\end{pmatrix}.
\end{equation*}
If we have an alternative basis for \(V\), \(\{e'_i\}\), then with respect to this basis \(v\) will be represented by a different column vector, \(\mathbf{v}'\) say,
\begin{equation*}
\mathbf{v}'=
\begin{pmatrix}
v'^1\\
\vdots\\
v'^n
\end{pmatrix},
\end{equation*}
where \(v'^i\) are the components of \(v\) with respect to this alternative basis. But \(\{e_i\}\) and \(\{e'_i\}\) must be related via an invertible matrix \(\mathbf{P}\), which we'll call the change of basis matrix, according to \(e'_i=P_i^je_j\). Then, since \(v=v'^ie'_i=v'^iP_i^je_j=v^je_j\), we have that
\(\mathbf{v}'=\mathbf{P}^{-1}\mathbf{v}\). We say that the components of a vector transform with the inverse of the change of basis matrix, that is, they transform contravariantly\(^1\).
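The contravariant rule \(\mathbf{v}'=\mathbf{P}^{-1}\mathbf{v}\) can be checked concretely. Below is a minimal Python sketch with a hypothetical two-dimensional example, in which the columns of \(\mathbf{P}\) hold the components of the new basis vectors \(e'_1=e_1+e_2\) and \(e'_2=e_2\):

```python
from fractions import Fraction

# Change of basis matrix: column i holds the components of e'_i
# with respect to the old basis {e_1, e_2} (a hypothetical example).
P = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)]]   # e'_1 = e_1 + e_2,  e'_2 = e_2

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

v = [Fraction(3), Fraction(5)]   # components of v with respect to {e_i}
v_new = matvec(inv2(P), v)       # components with respect to {e'_i}
# v = 3e_1 + 5e_2 = 3(e_1 + e_2) + 2e_2, so v_new == [3, 2]
```

Indeed \(3e_1+5e_2=3e'_1+2e'_2\): the components transform with \(\mathbf{P}^{-1}\) even though the basis vectors themselves transform with \(\mathbf{P}\).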

Now suppose that in addition we have an \(m\)-dimensional space \(W\) with bases \(\{f_i\}\) and \(\{f'_i\}\) related by a change of basis matrix \(\mathbf{Q}\) according to \(f'_i=Q_i^jf_j\). Let us consider how linear transformations and their matrix representations are affected by such changes of basis. If a linear transformation \(T\in\mathcal{L}(V,W)\) is represented with respect to the bases \(\{e_i\}\) and \(\{f_i\}\) by the matrix \(\mathbf{T}\) with components \(T_i^j\), then we consider its representation, \(\mathbf{T}'\) say, with respect to the alternative bases \(\{e'_i\}\) and \(\{f'_i\}\). Since any \(v\in V\) can be written either as \(v=v^ie_i=v^i{P^{-1}}_i^je'_j\) or \(v=v'^ie'_i\), and any \(w\in W\) either as \(w=w^if_i=w^i{Q^{-1}}_i^jf'_j\) or \(w=w'^if'_i\), we have
\begin{equation*}
w'^j={Q^{-1}}_i^jw^i={Q^{-1}}_i^jT_k^iv^k={Q^{-1}}_i^jT_k^iP_l^kv'^l
\end{equation*}
as well as \(w'^j={T'}_i^j v'^i\). So we must have,
\begin{equation*}
{T'}_i^j={Q^{-1}}_k^jT_l^kP_i^l.
\end{equation*}
That is, \(\mathbf{T}'=\mathbf{Q}^{-1}\mathbf{T}\mathbf{P}\). The matrices \(\mathbf{T}'\) and \(\mathbf{T}\) are said to be equivalent. Correspondingly, two linear transformations, \(T:V\mapto W\) and \(T':V\mapto W\), are said to be equivalent if there exist automorphisms \(P\in\text{GL}(V)\) and \(Q\in\text{GL}(W)\) such that \(T'=Q^{-1}TP\).
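The rule \(\mathbf{T}'=\mathbf{Q}^{-1}\mathbf{T}\mathbf{P}\) can be verified numerically: the new components of \(w=Tv\) computed directly, \(\mathbf{Q}^{-1}\mathbf{T}\mathbf{v}\), must agree with \(\mathbf{T}'\mathbf{P}^{-1}\mathbf{v}\). A short Python sketch with hypothetical \(2\times2\) matrices:

```python
from fractions import Fraction as F

def matmul(A, B):
    """Matrix product of two lists-of-rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical example: T represents the map, P and Q the basis changes.
T = [[F(1), F(2)], [F(0), F(1)]]
P = [[F(1), F(1)], [F(0), F(1)]]
Q = [[F(2), F(0)], [F(1), F(1)]]

T_new = matmul(matmul(inv2(Q), T), P)           # T' = Q^{-1} T P

v = [[F(4)], [F(7)]]                            # a sample vector, as a column
w_new_direct = matmul(inv2(Q), matmul(T, v))    # Q^{-1}(T v)
w_new_via = matmul(T_new, matmul(inv2(P), v))   # T'(P^{-1} v)
assert w_new_direct == w_new_via
```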

Lemma If \(T:V\mapto W\) is a linear transformation and \(P\in\text{GL}(V)\) and \(Q\in\text{GL}(W)\) then
\begin{equation}
\dim\ker T=\dim\ker QTP \qquad\text{and}\qquad \dim\img T=\dim\img QTP. \label{equ:rank conservation}
\end{equation}

Proof \(P\) induces an isomorphism \(\ker QTP\cong\ker T\): if \(u\in\ker QTP\) then, since \(Q\) is invertible, \(QTPu=0\iff Q(TPu)=0\iff TPu=0\). So the restriction of \(P\) to \(\ker QTP\) maps \(\ker QTP\) into \(\ker T\), and since \(P\) is invertible this restriction is an isomorphism. Similarly, \(Q\) can be seen to induce an isomorphism \(\img T\cong\img QTP\); this also follows from the isomorphism of kernels by the rank-nullity theorem. \(\blacksquare\)
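Over a finite field the kernel isomorphism in this proof can even be checked by brute force. The sketch below, using hypothetical matrices over \(\mathbb{F}_2\), enumerates \(\ker QTP\) and confirms that \(u\mapsto Pu\) carries it bijectively onto \(\ker T\):

```python
from itertools import product

def matvec(M, v):
    """Matrix-vector product over the field F_2."""
    return tuple(sum(M[i][j] * v[j] for j in range(len(v))) % 2
                 for i in range(len(M)))

def matmul(A, B):
    """Matrix product over the field F_2."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) % 2
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical example: T is 2x3, P (3x3) and Q (2x2) are invertible over F_2.
T = [[1, 0, 1],
     [0, 1, 1]]
P = [[1, 1, 0],
     [0, 1, 0],
     [1, 0, 1]]
Q = [[1, 1],
     [0, 1]]
QTP = matmul(Q, matmul(T, P))

vecs = list(product((0, 1), repeat=3))
ker_T = {v for v in vecs if all(x == 0 for x in matvec(T, v))}
ker_QTP = {v for v in vecs if all(x == 0 for x in matvec(QTP, v))}

# The restriction of P is a bijection ker(QTP) -> ker(T).
assert {matvec(P, u) for u in ker_QTP} == ker_T
```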

This result tells us, in particular, that equivalent linear transformations share the same rank. For an \(m\times n\) matrix \(\mathbf{A}\), the rank, \(\rank(\mathbf{A})\), is defined to be that of the corresponding linear transformation, \(L_\mathbf{A}\), that is, \(\rank(\mathbf{A})=\rank(L_\mathbf{A})\). Now we may regard \(L_\mathbf{A}\) as a map \(L_\mathbf{A}:K^n\mapto K^m\); taking \(\{\mathbf{e}_i\}\) to be the standard basis of \(K^n\), the images \(L_\mathbf{A}\mathbf{e}_1,L_\mathbf{A}\mathbf{e}_2,\dots,L_\mathbf{A}\mathbf{e}_n\) span \(\img L_\mathbf{A}\). But \(L_\mathbf{A}\mathbf{e}_i\) is simply the \(i\)th column of \(\mathbf{A}\), so we see that the rank of the matrix \(\mathbf{A}\) is just the dimension of the space spanned by its columns. What is the dimension of the row space?
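The identification of the rank with the dimension of the column space is easy to test computationally. Below is a minimal Gaussian-elimination rank function over the rationals, applied to a hypothetical matrix whose third column is the sum of the first two:

```python
from fractions import Fraction as F

def rank(M):
    """Rank of a matrix: row-reduce a copy and count the pivots."""
    A = [row[:] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0  # index of the next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        A[r] = [x / A[r][c] for x in A[r]]
        for i in range(rows):
            if i != r and A[i][c] != 0:
                A[i] = [a - A[i][c] * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

# Hypothetical example: the third column equals the first plus the second,
# so the columns span only a 2-dimensional subspace.
A = [[F(1), F(0), F(1)],
     [F(2), F(1), F(3)],
     [F(0), F(1), F(1)]]
```

Here `rank(A)` returns 2, matching the dimension of the column span.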

Let us denote by \(\{k_i\}\), \(1\leq i\leq d\), a basis of \(\ker L_\mathbf{A}\), and extend this to a basis, \(e_1,\dots,e_{n-d},k_1,\dots,k_d\), of \(K^n\). Then, as we already saw in the proof of the rank-nullity theorem, the elements \(f_i=L_\mathbf{A}e_i\in K^m\), \(1\leq i\leq n-d\), are linearly independent and so can be extended to a basis \(f_1,\dots,f_m\) of \(K^m\). So we have constructed new bases for \(K^n\) and \(K^m\) respectively with respect to which the matrix representation of \(L_\mathbf{A}\) has the particularly simple form
\begin{equation}\tilde{\mathbf{A}}=
\begin{pmatrix}
\mathbf{I}_{n-d}&\mathbf{0}_{n-d,d}\\
\mathbf{0}_{m-n+d,n-d}&\mathbf{0}_{m-n+d,d}
\end{pmatrix}\label{equ:fully reduced matrix}
\end{equation}
where \(\mathbf{I}_d\) is the \(d\times d\) identity matrix and \(\mathbf{0}_{m,n}\) the \(m\times n\) zero matrix. The rank of this matrix is of course \(n-d\), simply the number of \(1\)s. From \eqref{equ:rank conservation}, we know that equivalent linear transformations have isomorphic kernels and images, so equivalent matrices have the same rank. Translating this construction back into matrix language, any matrix \(\mathbf{A}\) may be factorised as
\begin{equation}
\mathbf{A}=\mathbf{Q}\begin{pmatrix}
\mathbf{I}_{n-d}&\mathbf{0}_{n-d,d}\\
\mathbf{0}_{m-n+d,n-d}&\mathbf{0}_{m-n+d,d}
\end{pmatrix}
\mathbf{P}^{-1}=\mathbf{Q}\tilde{\mathbf{A}}\mathbf{P}^{-1},\label{matrix factorization}
\end{equation}
and if \(\mathbf{A}\) is equivalent to another matrix of the same form as \(\tilde{\mathbf{A}}\), say \(\mathbf{B}\), then \(\mathbf{B}=\tilde{\mathbf{A}}\). Given this factorisation, it is clear that \(\rank\mathbf{A}^\mathsf{T}=\rank\mathbf{A}\), from which it follows that the dimensions of the row and column spaces of any matrix are equal.
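The factorisation \eqref{matrix factorization} can be exhibited concretely. For the hypothetical \(2\times3\) matrix below, the kernel is spanned by \(k=(-1,-1,1)\), so \(d=1\); the columns of \(\mathbf{P}\) are the basis \(e_1,e_2,k\) of \(K^3\), and the columns of \(\mathbf{Q}\) are \(f_1=\mathbf{A}e_1\) and \(f_2=\mathbf{A}e_2\). Rather than inverting \(\mathbf{P}\), the sketch checks the equivalent identity \(\mathbf{A}\mathbf{P}=\mathbf{Q}\tilde{\mathbf{A}}\):

```python
from fractions import Fraction as F

def matmul(A, B):
    """Matrix product of two lists-of-rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical example with n = 3, m = 2 and a 1-dimensional kernel (d = 1).
A = [[F(1), F(0), F(1)],
     [F(2), F(1), F(3)]]

# Columns of P: the standard vectors e_1, e_2, then the kernel vector k.
P = [[F(1), F(0), F(-1)],
     [F(0), F(1), F(-1)],
     [F(0), F(0), F(1)]]

# Columns of Q: f_1 = A e_1 and f_2 = A e_2, already a basis of K^2.
Q = [[F(1), F(0)],
     [F(2), F(1)]]

# The fully reduced form (I_{n-d} | 0) with n - d = 2.
Atilde = [[F(1), F(0), F(0)],
          [F(0), F(1), F(0)]]

# A = Q Atilde P^{-1} is equivalent to A P = Q Atilde.
assert matmul(A, P) == matmul(Q, Atilde)
```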

To summarise, for any pair of vector spaces \(V\) and \(W\), of dimensions \(n\) and \(m\) respectively, the linear transformations in \(\mathcal{L}(V,W)\) are determined, up to change of the respective bases, by their rank, which is bounded above by \(\min(n,m)\).

A nice way to restate this conclusion is in the language of group actions and their orbits. Recall that if \(G\) is a group and \(X\) some set then an action of \(G\) on \(X\) is a (group) homomorphism between \(G\) and \(S_X\), the group of all permutations of the elements of \(X\). Now \(\text{GL}(V)\times\text{GL}(W)\), with the obvious group structure, has an action on the space \(\mathcal{L}(V, W)\) defined by \((Q,P)T=QTP^{-1}\) for \(P\in\text{GL}(V)\) and \(Q\in\text{GL}(W)\). Recall also that the action of a group on a set \(X\) partitions \(X\) into orbits, where the orbit of some \(x\in X\) is defined to be the subset, \(\{gx\mid g\in G\}\), of \(X\). In our case we see that the orbits of the action of \(\text{GL}(V)\times\text{GL}(W)\) on \(\mathcal{L}(V, W)\) consist precisely of the linear transformations of a given rank. The headline, as it were, is therefore the following.

The orbits of the action of \(\text{GL}(V)\times\text{GL}(W)\) on \(\mathcal{L}(V, W)\) are the sets of elements of \(\mathcal{L}(V,W)\) of a given rank and thus are in bijection with the set \(\{d\mid 0\leq d\leq \min(\dim V,\dim W)\}\).
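Over a finite field this orbit description can be verified exhaustively. The sketch below takes \(V=W=\mathbb{F}_2^2\), enumerates all 16 matrices in \(\mathcal{L}(V,W)\) together with the 6 elements of \(\text{GL}_2(\mathbb{F}_2)\), and computes the orbits of the action \((Q,P)T=QTP^{-1}\) (since \(P^{-1}\) ranges over \(\text{GL}_2(\mathbb{F}_2)\) as \(P\) does, it suffices to form \(QTP\)):

```python
from itertools import product

def mul2(A, B):
    """2x2 matrix product over the field F_2."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % 2
                       for j in range(2)) for i in range(2))

def det2(M):
    return (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % 2

# All 16 linear maps F_2^2 -> F_2^2, and the invertible ones among them.
mats = [((a, b), (c, d)) for a, b, c, d in product((0, 1), repeat=4)]
GL = [M for M in mats if det2(M) == 1]   # |GL_2(F_2)| = 6

# Orbit of T under (Q, P) . T = Q T P^{-1}; as P runs over GL so does P^{-1}.
orbits = {frozenset(mul2(mul2(Q, T), P) for Q in GL for P in GL)
          for T in mats}

# Exactly one orbit per rank 0, 1, 2 = min(2, 2), with sizes
# 1 (the zero map), 9 (rank-1 maps), and 6 (the invertible maps).
assert sorted(len(orbit) for orbit in orbits) == [1, 6, 9]
```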

Notes:

  1. If we assemble the old and new basis vectors into row vectors \(\mathbf{e}\) and \(\mathbf{e}'\) respectively then \(\mathbf{e}'=\mathbf{e}\mathbf{P}\) but \(\mathbf{v}'=\mathbf{P}^{-1}\mathbf{v}\).