Sums, Intersections and Projections

While the intersection, \(U_1\cap U_2\), of two subspaces of a vector space \(V\) is again a subspace, the union, \(U_1\cup U_2\), is not unless \(U_1\subseteq U_2\) or \(U_2\subseteq U_1\). The sum \(U_1+U_2\), defined as \(\{u_1+u_2\mid u_1\in U_1,u_2\in U_2\}\), is the smallest subspace of \(V\) containing \(U_1\cup U_2\). Just how large is it? It is not difficult to see that if we take \(u_1,\dots,u_d\) to be a basis of \(U_1\cap U_2\) and extend it to a basis \(u_1,\dots,u_d,v_1,\dots,v_r\) of \(U_1\) and a basis \(u_1,\dots,u_d,w_1,\dots,w_s\) of \(U_2\), then \(u_1,\dots,u_d,v_1,\dots,v_r,w_1,\dots,w_s\) is a basis of \(U_1+U_2\) and we have
\begin{equation}
\dim(U_1\cap U_2)+\dim(U_1+U_2)=\dim(U_1)+\dim(U_2).
\end{equation}

Example In 3-dimensional space, consider two distinct planes which pass through the origin, that is, a pair of 2-dimensional subspaces. Their sum is of course the whole 3-dimensional space, whilst their intersection is a 1-dimensional line, and indeed \(1+3=2+2\).
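
The dimension formula is easy to check numerically. The following is a minimal NumPy sketch, not part of the text: the helper `null_space` and the particular random subspaces of \(\RR^4\) are my own illustrative choices. It computes \(\dim(U_1+U_2)\) as a matrix rank and recovers \(U_1\cap U_2\) from the kernel of the block matrix \((A\ \ {-B})\).

```python
import numpy as np
from numpy.linalg import matrix_rank, svd

def null_space(M, tol=1e-10):
    """Columns form a basis of ker M, computed from the SVD."""
    _, s, vh = svd(M)
    return vh[int((s > tol).sum()):].T

rng = np.random.default_rng(0)
shared = rng.standard_normal((4, 1))                  # a common direction
A = np.hstack([shared, rng.standard_normal((4, 2))])  # columns span U1 in R^4
B = np.hstack([shared, rng.standard_normal((4, 1))])  # columns span U2 in R^4

dim_sum = matrix_rank(np.hstack([A, B]))              # dim(U1 + U2)

# U1 n U2 = { A x : A x = B y for some y }, i.e. A applied to ker of (A  -B)
N = null_space(np.hstack([A, -B]))
dim_int = matrix_rank(A @ N[:A.shape[1]]) if N.size else 0

# dim(U1 n U2) + dim(U1 + U2) = dim(U1) + dim(U2); here 1 + 4 = 3 + 2
assert dim_int + dim_sum == matrix_rank(A) + matrix_rank(B)
```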

If \(U_1\), \(U_2\) and \(U_3\) are subspaces of a vector space \(V\) then notice that in general \(U_1\cap(U_2+U_3)\) and \((U_1\cap U_2)+(U_1\cap U_3)\) are not equal; intersection does not distribute over sums. Indeed, consider the subspaces \(U_1=\Span(\mathbf{e}_1+\mathbf{e}_2)\), \(U_2=\Span(\mathbf{e}_1)\) and \(U_3=\Span(\mathbf{e}_2)\) of \(\RR^2\). Then \(U_1\cap(U_2+U_3)=U_1\) but \(U_1\cap U_2=0=U_1\cap U_3\). Rather, we have the following equality,
\begin{equation}
U_1\cap(U_2+(U_1\cap U_3))=(U_1\cap U_2)+(U_1\cap U_3).
\end{equation}
That \(U_1\cap(U_2+(U_1\cap U_3))\subseteq(U_1\cap U_2)+(U_1\cap U_3)\) follows since if \(v\in U_1\cap(U_2+(U_1\cap U_3))\) then \(v=u_2+u_{13}\) where \(v\in U_1\), \(u_2\in U_2\) and \(u_{13}\in U_1\cap U_3\). But then \(u_2=v-u_{13}\in U_1\), so indeed \(v\in(U_1\cap U_2)+(U_1\cap U_3)\). The reverse inclusion is immediate.

Definition A sum \(U=\sum_{i=1}^r U_i\) of vector subspaces \(U_i\subseteq V\) is direct if every \(u\in U\) can be written uniquely as \(u=u_1+\dots+u_r\) for some \(u_i\in U_i\).

Lemma The sum \(U_1+U_2\) of any pair of subspaces is direct if and only if \(U_1\cap U_2=\{0\}\).

Proof If the sum is direct but there is some non-zero \(v\in U_1\cap U_2\), we could write the zero vector in two ways, as \(0=0+0\) and \(0=v+(-v)\), contradicting directness. Conversely, if \(U_1\cap U_2=\{0\}\) and \(v=u_1+u_2\) as well as \(v=u_1'+u_2'\), then \(u_1-u_1'=u_2'-u_2\) lies in both \(U_1\) and \(U_2\), hence is zero, so the decomposition is unique. \(\blacksquare\)

More generally, the following theorem gives three equivalent criteria for a sum of arbitrary length to be direct.

Theorem A sum \(U=\sum_{i=1}^rU_i\) of vector subspaces \(U_i\subseteq V\) is direct if and only if one of the following three equivalent criteria holds:

  1. For each \(i\), \(U_i\cap\left(\sum_{j\neq i}U_j\right)=0\).
  2. If \(u_1+\dots+u_r=0\) with \(u_i\in U_i\), then each \(u_i=0\).
  3. Every \(u\in U\) can be written uniquely as \(u=u_1+\dots+u_r\) for some \(u_i\in U_i\).

Proof Suppose \((2)\) is false, so that \(u_1+\dots+u_r=0\) with some \(u_i\neq0\). Then \(-u_i=\sum_{j\neq i}u_j\) is a non-zero element of \(U_i\cap\left(\sum_{j\neq i}U_j\right)\), contradicting \((1)\); so \((1)\) implies \((2)\). Suppose \((3)\) were false, so that we had \(u=u_1+\dots+u_r\) and \(u=u_1'+\dots+u_r'\) with not all \(u_i=u_i'\). Subtracting, \((u_1-u_1')+\dots+(u_r-u_r')=0\) with not all terms zero, so \((2)\) is false; thus \((2)\) implies \((3)\). Finally suppose \((1)\) is false, so that some non-zero \(u\) lies in both \(U_i\) and \(\sum_{j\neq i}U_j\), say \(u=u_1+\dots+u_{i-1}+u_{i+1}+\dots+u_r\). Then \(u\) has two distinct decompositions, so \((3)\) is false; thus \((3)\) implies \((1)\).\(\blacksquare\)

Remark If \(\{e_1,\dots,e_n\}\) is a basis of \(V\) then clearly \(V=\sum_{i=1}^n\Span(e_i)\) is a direct sum.

Remark A situation which sometimes arises is that we know \(V=\sum_{i=1}^rU_i\) and also that \(\sum_{i=1}^r\dim U_i=\dim V\). Then choosing a basis for each \(U_i\) we obtain a collection of vectors which certainly spans \(V\). But since \(\sum_{i=1}^r\dim U_i=\dim V\) there are only \(\dim V\) of these vectors, so they form a basis of \(V\), and the uniqueness of the expansion of any vector in this basis means the sum \(V=\sum_{i=1}^rU_i\) is direct.
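
In coordinates this remark gives a cheap computational test: concatenate bases of the \(U_i\) and compare ranks. Below is a minimal NumPy sketch; the function name and the particular subspaces are my own illustrative choices, and the second call previews the cautionary example in Note 1 below.

```python
import numpy as np
from numpy.linalg import matrix_rank

def spans_directly(blocks, dim_V):
    """True iff the column spaces U_i span a dim_V-dimensional space and their
    dimensions add up to dim_V, i.e. V = U_1 (+) ... (+) U_r is direct."""
    return matrix_rank(np.hstack(blocks)) == dim_V == sum(matrix_rank(B) for B in blocks)

e1, e2 = np.eye(2)[:, [0]], np.eye(2)[:, [1]]
print(spans_directly([e1, e2], 2))           # True:  R^2 = <e1> (+) <e2>
print(spans_directly([e1, e2, e1 + e2], 2))  # False: pairwise intersections are
                                             # zero, yet the sum is not direct
```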

If \(V=U+W\) is a direct sum of subspaces \(U\) and \(W\), these subspaces are said to be complementary. Given a subspace \(U\) of \(V\), a complementary subspace \(W\) always exists: just take a basis \(u_1,\dots,u_r\) of \(U\) and extend it to a basis \(u_1,\dots,u_r,w_1,\dots,w_{n-r}\) of \(V\); then \(W=\Span(w_1,\dots,w_{n-r})\) is a complementary subspace. Note that defining, for example, \(W'=\Span(w_1+u_1,w_2,\dots,w_{n-r})\) we obtain another subspace, also complementary to \(U\) but not equal to \(W\). Aside from the trivial cases of \(0\) and \(V\), complements of subspaces are not unique.

Example In \(\RR^2\), consider the subspace \(\Span(\mathbf{e}_1)\). Then \(\Span(\mathbf{e}_2)\) and \(\Span(\mathbf{e}_1+\mathbf{e}_2)\) are both examples of complementary subspaces.
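
The basis-extension argument is constructive. Below is a small NumPy sketch of one way to carry it out, assuming the columns of `A` are linearly independent; the function name `complement` and the greedy choice of standard basis vectors are my own.

```python
import numpy as np
from numpy.linalg import matrix_rank

def complement(A):
    """Columns spanning one complement of col(A): extend the (assumed
    independent) columns of A by standard basis vectors that raise the rank."""
    n = A.shape[0]
    cols, rank, extra = [A], matrix_rank(A), []
    for i in range(n):
        e = np.eye(n)[:, [i]]
        if matrix_rank(np.hstack(cols + [e])) > rank:
            cols.append(e)
            extra.append(e)
            rank += 1
    return np.hstack(extra) if extra else np.zeros((n, 0))

U = np.array([[1.], [0.]])       # Span(e1) in R^2
print(complement(U).ravel())     # [0. 1.], i.e. Span(e2); Span(e1 + e2) would
                                 # be a different, equally valid complement
```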

Given two arbitrary vector spaces \(U\) and \(W\), their external direct sum, \(U\oplus W\), is defined to be the product set \(U\times W\) with the vector space structure given by
\begin{align}
(u_1,w_1)+(u_2,w_2)&=(u_1+u_2,w_1+w_2), & c(u,w)&=(cu,cw).
\end{align}
Now suppose \(U\) and \(W\) are in fact subspaces of some vector space \(V\) and consider the map \(\pi:U\oplus W\mapto V\) defined as \(\pi(u,w)=u+w\). This is clearly a linear transformation, with \(\ker\pi=\{(u,-u)\mid u\in U\cap W\}\) and \(\img\pi=U+W\). Thus in the case that \(U+W\) is a direct sum we have \(U+W\cong U\oplus W\) and in fact, abusing notation, we write in this case \(U+W=U\oplus W\). Furthermore, applying the rank-nullity theorem to the map \(\pi\), we have \(\dim(U\oplus W)=\dim\ker\pi+\dim\img\pi=\dim(U\cap W)+\dim(U+W)\), and so, by the dimension formula above,
\begin{equation}
\dim(U\oplus W)=\dim(U)+\dim(W).\label{equ:dim of sum}
\end{equation}

Example If \(U_1\) and \(U_2\) are subspaces of a vector space \(V\) with a non-zero intersection \(U_0=U_1\cap U_2\), then we can write \(U_1=U_0\oplus U_1'\) and \(U_2=U_0\oplus U_2'\) for some subspaces \(U_1'\) and \(U_2'\) of \(U_1\) and \(U_2\) respectively. Consider the sum \(U_0+U_1'+U_2'\). It is not difficult to see that \(U_1+U_2=U_0+U_1'+U_2'\). In fact \(U_1\cap U_2'=0\), since if \(u\in U_1\cap U_2'\) then \(u\in U_2'\subseteq U_2\) and so \(u\in U_1\cap U_2=U_0\), hence \(u\in U_0\cap U_2'=0\). We can therefore write \(U_1+U_2=(U_0\oplus U_1')+U_2'=(U_0\oplus U_1')\oplus U_2'\), that is,
\begin{equation}
U_1+U_2=U_0\oplus U_1'\oplus U_2'.
\end{equation}

Example If \(T\in\mathcal{L}(V)\), then note that whilst \(\dim V=\dim\ker T+\dim\img T\), it is not generally the case that \(\ker T\cap\img T=0\), so we cannot in general describe \(V\) as the direct sum of the kernel and image of a linear operator. Consider, for example, \(V=\RR^2\); then the linear operator defined as
\begin{equation*}
T=\begin{pmatrix}0&1\\0&0\end{pmatrix}
\end{equation*}
is such that
\begin{equation*}
\ker T=\img T=\Span\left(\begin{pmatrix}1\\0\end{pmatrix}\right).
\end{equation*}
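
A quick numerical check of this example, as a minimal NumPy sketch (the `null_space` helper is my own):

```python
import numpy as np
from numpy.linalg import matrix_rank, svd

def null_space(M, tol=1e-10):
    """Columns form a basis of ker M, computed from the SVD."""
    _, s, vh = svd(M)
    return vh[int((s > tol).sum()):].T

T = np.array([[0., 1.],
              [0., 0.]])
print(null_space(T).ravel())    # proportional to e1, so ker T = Span(e1)
print(T[:, [1]].ravel())        # T e2 = e1, so img T = Span(e1) too
print(matrix_rank(np.hstack([null_space(T), T[:, [1]]])))  # 1: ker T + img T != R^2
```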

If \(W\) is a subspace of a vector space \(V\) then we can form the quotient space \(V/W\) which is just the quotient group equipped with the obvious scalar multiplication. If \(w_1,\dots,w_r\) is a basis for \(W\) then we can extend it to a basis \(w_1,\dots,w_r,v_{1},\dots,v_{n-r}\) for \(V\) and \(v_1+W,\dots,v_{n-r}+W\) is then a basis for \(V/W\). In particular, we have
\begin{equation}
\dim V/W=\dim V-\dim W.
\end{equation}
Now if \(T:V\mapto V\), we could define a linear operator \(T':V/W\mapto V/W\) as
\begin{equation}
T'(v+W)=Tv+W.
\end{equation}
But for this to be well-defined, \(W\) must be \(T\)-invariant, that is, \(Tw\in W\) for all \(w\in W\). For if \(v+W=v'+W\), then \(v-v'=w\) for some \(w\in W\), and we require \(T'(v+W)=T'(v'+W)\). But
\begin{equation*}
T'(v+W)=Tv+W=T(v'+w)+W=Tv'+Tw+W,
\end{equation*}
and for this to equal \(T'(v'+W)=Tv'+W\) we must have \(Tw\in W\).
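
Concretely, if we use a basis of \(W\) extended to a basis of \(V\), then \(T\)-invariance of \(W\) says the matrix of \(T\) is block upper triangular, and the matrix of \(T'\) in the basis \(v_1+W,\dots,v_{n-r}+W\) is the lower-right block. A small numerical sketch (the particular matrix is an illustrative choice of my own):

```python
import numpy as np

# Take W = Span(e1) in V = R^3, extended to the basis (e1, e2, e3).  W is
# T-invariant exactly when the first column of T is a multiple of e1, i.e.
# when T is block upper triangular in this basis.
T = np.array([[2., 5., 7.],
              [0., 1., 4.],
              [0., 3., 2.]])

# The induced operator T' acts on the basis (e2 + W, e3 + W) of V/W via the
# lower-right block: T'(e2 + W) = T e2 + W = 1*(e2 + W) + 3*(e3 + W), etc.
T_quotient = T[1:, 1:]
print(T_quotient)   # [[1. 4.]
                    #  [3. 2.]]
```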

Quotients of vector spaces also allow us to factorise certain linear maps in the following sense. Suppose \(T:V\mapto W\) is a linear transformation and \(U\) is a subspace of \(V\) such that \(U\subseteq\ker T\). Define \(\pi:V\mapto V/U\) as \(\pi v=v+U\) (\(\pi\) is clearly linear with kernel \(\ker\pi=U\)). Then there exists a linear map \(T':V/U\mapto W\) such that \(T=T'\pi\). That is, we have ‘factorised’ \(T\) as \(T'\pi\). Indeed, \(T'\) must clearly be defined as \(T'(v+U)=Tv\). This is well defined since if \(v+U=v'+U\), then \(v-v'\in U\subseteq\ker T\), so that \(T(v-v')=0\), or \(Tv=Tv'\), and hence \(T'(v+U)=T'(v'+U)\). In such a situation, \(T\) is said to factorise through \(V/U\).
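
For instance, if \(U=\Span(\mathbf{e}_3)\subseteq\RR^3\) lies inside \(\ker T\) for some \(T:\RR^3\mapto\RR^2\), then the third column of the matrix of \(T\) vanishes, and identifying \(\RR^3/U\) with \(\RR^2\) via the first two coordinates exhibits the factorisation explicitly. A minimal sketch (the matrices are illustrative choices of my own):

```python
import numpy as np

# T : R^3 -> R^2 whose third column vanishes, so U = Span(e3) lies in ker T
# and T factors through R^3 / U.
T = np.array([[1., 2., 0.],
              [3., 4., 0.]])

# Identify R^3 / U with R^2 by keeping the first two coordinates: pi drops
# the third coordinate and T' is the remaining 2x2 block of T.
pi = np.array([[1., 0., 0.],
               [0., 1., 0.]])
T_prime = T[:, :2]

v = np.array([5., 6., 7.])
assert np.allclose(T @ v, T_prime @ (pi @ v))   # T = T' pi
```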

Definition A projection on \(V\) is a linear operator \(P\in\mathcal{L}(V)\) such that
\begin{equation}
P^2=P.
\end{equation}

Proposition There is a one-to-one correspondence between projections \(P\) on \(V\), pairs of linear operators \(P,Q\in\mathcal{L}(V)\) such that
\begin{equation}
P+Q=\id_V\qquad\text{and}\qquad PQ=0, \label{equ:proj op alter}
\end{equation}
and direct sum decompositions
\begin{equation}
V=U\oplus W. \label{equ:proj op decomp}
\end{equation}

Proof If \(P\) is a projection then we can define \(Q=\id_V-P\), and \eqref{equ:proj op alter} is obvious. Given operators \(P\) and \(Q\) satisfying \eqref{equ:proj op alter}, note first that \(P^2=P(\id_V-Q)=P-PQ=P\), so \(P\) is a projection, and similarly \(Q^2=Q\) and \(QP=0\). Define subspaces \(U=PV\) and \(W=QV\) of \(V\). Then \(P+Q=\id_V\) implies that \(V=U+W\). That this sum is direct follows since if an element \(v\) belonged to both \(U\) and \(W\) then \(v=Pv_1=Qv_2\) for some \(v_1,v_2\in V\), which means that \(v=Pv_1=P^2v_1=Pv=PQv_2=0\), so \(V=U\oplus W\). Clearly \(\img Q\subseteq\ker P\), and conversely if \(v\in\ker P\) then \(v=Pv+Qv=Qv\), so \(\ker P\subseteq\img Q\) and \(\ker P=\img Q\). Likewise \(\ker Q=\img P\), so we have
\begin{equation}
V=\img P\oplus\ker P=\ker Q\oplus\img Q.
\end{equation}
Given a direct sum decomposition \(V=U\oplus W\) any \(v\in V\) can be expressed uniquely as \(v=u+w\) with \(u\in U\), \(w\in W\) and we can therefore define a linear operator \(P\) by \(Pv=u\). So defined, \(P\) is clearly a projection. \(\blacksquare\)

Thus, we cannot speak of the projection onto some subspace \(U\), only of a projection: there are as many projections onto a subspace \(U\) as there are complements of \(U\). However, note that if \(V=U\oplus W\) with \(P\) the projection onto \(W\), then \(U=\ker P\) and \(V/\ker P\cong W\) via \(v+\ker P\mapsto Pv\). So every complement of \(U\) is isomorphic to \(V/U\), and in particular all complements of \(U\) are isomorphic to one another.

Example In the case of \(\RR^2\), for the direct sum decomposition \(\RR^2=\Span(\mathbf{e}_1)\oplus\Span(\mathbf{e}_2)\), the corresponding projections are
\begin{equation}
\mathbf{P}=\begin{pmatrix} 1&0\\0&0\end{pmatrix}\quad\text{and}\quad\mathbf{Q}=\begin{pmatrix} 0&0\\0&1\end{pmatrix},
\end{equation}
with \(\img\mathbf{P}=\Span(\mathbf{e}_1)\) and \(\img\mathbf{Q}=\Span(\mathbf{e}_2)\). For the direct sum decomposition, \(\RR^2=\Span(\mathbf{e}_1)\oplus\Span(\mathbf{e}_1+\mathbf{e}_2)\), the corresponding projections are
\begin{equation}
\mathbf{P}=\begin{pmatrix} 1&-1\\0&0\end{pmatrix}\quad\text{and}\quad\mathbf{Q}=\begin{pmatrix} 0&1\\0&1\end{pmatrix},
\end{equation}
with \(\img\mathbf{P}=\Span(\mathbf{e}_1)\) and \(\img\mathbf{Q}=\Span(\mathbf{e}_1+\mathbf{e}_2)\).
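
The correspondence of the Proposition is easy to realise in coordinates: if the columns of \(\mathbf{M}=(\mathbf{A}\;\mathbf{B})\) are a basis of \(U\) followed by a basis of \(W\), then the projection onto \(U\) along \(W\) is \(\mathbf{M}\,\mathrm{diag}(1,\dots,1,0,\dots,0)\,\mathbf{M}^{-1}\). A small NumPy sketch (the function name is my own) reproducing the second pair of matrices above:

```python
import numpy as np

def projection_onto_along(A, B):
    """Projection onto col(A) along col(B), assuming col(A) (+) col(B) = R^n."""
    M = np.hstack([A, B])
    D = np.diag([1.0] * A.shape[1] + [0.0] * B.shape[1])
    return M @ D @ np.linalg.inv(M)

e1 = np.array([[1.], [0.]])
e1_plus_e2 = np.array([[1.], [1.]])

P = projection_onto_along(e1, e1_plus_e2)
Q = projection_onto_along(e1_plus_e2, e1)
assert np.allclose(P, [[1., -1.], [0., 0.]])    # matches P above
assert np.allclose(Q, [[0., 1.], [0., 1.]])     # matches Q above
assert np.allclose(P @ P, P) and np.allclose(P + Q, np.eye(2)) and np.allclose(P @ Q, 0)
```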

Remark Recalling the earlier Example, we note that a projection is an example of a linear operator for which the vector space does decompose as the direct sum of its kernel and image.

More generally, if we have linear operators \(P_1,\dots,P_r\) such that \(P_iP_j=0\) whenever \(i\neq j\) and \(P_1+\dots+P_r=\id_V\), then each \(P_i\) is a projection (since \(P_i=P_i(P_1+\dots+P_r)=P_i^2\)) and, defining \(U_i=P_iV\),
\begin{equation}
V=U_1\oplus\dots\oplus U_r.
\end{equation}
Note that to check that this sum is really direct it is not enough to check that \(U_i\cap U_j=\{0\}\) whenever \(i\neq j\) (see Note 1). We confirm uniqueness directly. We have \(v=(P_1+\dots+P_r)v=w_1+\dots+w_r\), say, with \(w_i=P_iv\), and suppose we also have \(v=u_1+\dots+u_r\) with \(u_i\in U_i\). Applying \(P_i\) to both expressions, and using \(P_iP_j=0\) for \(j\neq i\) together with \(P_i^2=P_i\), we obtain \(u_i=w_i\), so the decomposition \(v=w_1+\dots+w_r\) is unique and the sum is direct. If we define \(U_{(i)}=\oplus_{j\neq i}U_j\) and \(P_{(i)}=\sum_{j\neq i}P_j\), then \(V=U_i\oplus U_{(i)}\), \(P_i+P_{(i)}=\id_V\) and \(P_iP_{(i)}=0\). So \(\ker P_i=\img P_{(i)}\), \(\img P_i=\ker P_{(i)}\) and \(V=\img P_i\oplus\ker P_i=\ker P_{(i)}\oplus\img P_{(i)}\).
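
In coordinates such a family arises from any partition of a basis: if \(M\) is an invertible matrix and \(E_i\) selects the \(i\)-th block of its columns, then the \(P_i=ME_iM^{-1}\) satisfy \(P_iP_j=0\) for \(i\neq j\) and \(\sum_iP_i=\id_V\). A small NumPy sketch (the particular matrix and partition are illustrative choices of my own):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))          # columns: bases of U1, U2, U3 (almost
Minv = np.linalg.inv(M)                  # surely invertible for a random M)
blocks = [[0], [1, 2], [3]]              # partition of the columns

Ps = []
for block in blocks:
    E = np.zeros((4, 4))
    E[block, block] = 1.0                # selects this block of coordinates
    Ps.append(M @ E @ Minv)

assert np.allclose(sum(Ps), np.eye(4))   # P1 + P2 + P3 = id
assert all(np.allclose(Ps[i] @ Ps[j], 0)
           for i in range(3) for j in range(3) if i != j)   # Pi Pj = 0, i != j
```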

If \(P_1\) and \(P_2\) are projections, which do not necessarily sum to the identity \(\id_V\), then it is natural to ask under what circumstances their sum, difference or product is also a projection.

Theorem Suppose \(P_1\) and \(P_2\) are projections onto subspaces \(U_1\) and \(U_2\) of a vector space \(V\), with \(W_1=\ker P_1\) and \(W_2=\ker P_2\) the respective complementary subspaces. Then,

  1. \(P=P_1+P_2\) is a projection if and only if \(P_1P_2=P_2P_1=0\) in which case \(\img P=U_1\oplus U_2\) and \(\img(\id_V-P)=W_1\cap W_2\).
  2. \(P=P_1-P_2\) is a projection if and only if \(P_1P_2=P_2P_1=P_2\) in which case \(\img P=U_1\cap U_2\) and \(\img(\id_V-P)=W_1\oplus W_2\).
  3. If \(P_1P_2=P_2P_1=P\), then \(P\) is a projection such that \(\img P=U_1\cap U_2\) and \(\img(\id_V-P)=W_1+W_2\).

Proof If \(P=P_1+P_2\) is a projection, then \(P^2=P\), so that \(P_1P_2+P_2P_1=0\). Multiplying by \(P_1\) from the left we obtain \(P_1P_2+P_1P_2P_1=0\), while multiplying from the right we obtain \(P_1P_2P_1+P_2P_1=0\); subtracting, \(P_1P_2=P_2P_1\), and hence \(P_1P_2=P_2P_1=0\). That \(P_1P_2=P_2P_1=0\) implies \(P^2=P\) is clear. Assuming \(P\) is indeed a projection, consider \(\img P\). If \(v\in\img P\), then \(v=Pv=P_1v+P_2v\in U_1+U_2\). Conversely, if \(v\in U_1+U_2\), then \(v=u_1+u_2\) for some \(u_1\in U_1\) and \(u_2\in U_2\), and since \(P_1P_2=P_2P_1=0\) we have \(Pv=P_1u_1+P_2u_2=u_1+u_2=v\), hence \(v\in\img P\). If \(v\in U_1\cap U_2\), then \(v=P_1v=P_1P_2v=0\), so \(U_1\cap U_2=0\) and \(\img P=U_1\oplus U_2\). Now if \(v\in\img(\id_V-P)\), then \(v=v-Pv\), that is \(P_1v+P_2v=0\); applying \(P_1\) and \(P_2\) and using \(P_1P_2=P_2P_1=0\) gives \(P_1v=0=P_2v\), so \(v\in W_1\cap W_2\). Conversely, if \(v\in W_1\cap W_2\), then
\begin{equation*}
(\id_V-P)v=v-(P_1+P_2)v=(\id_V-P_1)v-P_2v=(\id_V-P_2)v=v,
\end{equation*}
so \(v\in\img(\id_V-P)\). The other statements are proved similarly.\(\blacksquare\)
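
A quick numerical illustration of part (1), with projections chosen purely for illustration: in \(\RR^3\) take \(P_1\) projecting onto \(\Span(\mathbf{e}_1)\) along \(\Span(\mathbf{e}_2,\mathbf{e}_3)\) and \(P_2\) projecting onto \(\Span(\mathbf{e}_2)\) along \(\Span(\mathbf{e}_1,\mathbf{e}_3)\).

```python
import numpy as np

P1 = np.diag([1., 0., 0.])   # onto Span(e1) along Span(e2, e3)
P2 = np.diag([0., 1., 0.])   # onto Span(e2) along Span(e1, e3)
P = P1 + P2

assert np.allclose(P1 @ P2, 0) and np.allclose(P2 @ P1, 0)
assert np.allclose(P @ P, P)                               # P is a projection
assert np.allclose(P, np.diag([1., 1., 0.]))               # img P = Span(e1, e2)
assert np.allclose(np.eye(3) - P, np.diag([0., 0., 1.]))   # img(id - P) = W1 n W2 = Span(e3)
```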

Suppose now that a subspace \(U\) of \(V\) is invariant with respect to some \(T\in\mathcal{L}(V)\), and consider a direct sum \(V=U\oplus W\). Corresponding to any such decomposition we have a projection \(P\) with \(\img P=U\) and \(\ker P=W\). It is not difficult to see that for any such projection we have \(PTP=TP\): for any \(v\in V\), \(Pv\in U\), so \(TPv\in U\) and hence \(PTPv=TPv\). Conversely, if \(PTP=TP\) for some projection \(P\) onto a subspace \(U\), then any \(u\in U\) satisfies \(u=Pu\), so that \(Tu=TPu=PTPu=PTu\in\img P=U\), and \(U\) is \(T\)-invariant.

Theorem If \(V=U\oplus W\) for some subspaces \(U,W\subseteq V\), and \(P\) is the corresponding projection onto \(U\) along \(W\), then a necessary and sufficient condition for both subspaces to be invariant with respect to a linear operator \(T\in\mathcal{L}(V)\) is that \(T\) commutes with \(P\), that is, \(TP=PT\).

Proof Assuming \(U\) and \(W\) are \(T\)-invariant, we know that \(PTP=TP\), but also \((\id_V-P)T(\id_V-P)=T(\id_V-P)\), since \(\id_V-P\) is the projection onto \(W\) along \(U\). Expanding the latter gives \(PTP=PT\), so \(TP=PT\). In the other direction, if \(TP=PT\), then for any \(u\in U\), \(u=Pu\), so that \(Tu=TPu=PTu\in\img P=U\). Likewise, for any \(w\in W\), \(Pw=0\), so \(PTw=TPw=0\) and \(Tw\in\ker P=W\).\(\blacksquare\)
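
In a basis adapted to \(V=U\oplus W\) this is the familiar statement that \(T\) commutes with \(P\) exactly when its matrix is block diagonal. A small NumPy sketch (the matrices are illustrative choices of my own): a block upper triangular \(T\) leaves \(U\) invariant but not \(W\), and satisfies \(PTP=TP\) without commuting with \(P\).

```python
import numpy as np

P = np.diag([1., 1., 0.])         # projection onto U = Span(e1, e2) along W = Span(e3)

T_tri = np.array([[1., 2., 3.],   # block upper triangular: U invariant, W not
                  [0., 4., 5.],
                  [0., 0., 6.]])
T_diag = np.array([[1., 2., 0.],  # block diagonal: both U and W invariant
                   [3., 4., 0.],
                   [0., 0., 6.]])

assert np.allclose(P @ T_tri @ P, T_tri @ P)      # PTP = TP, since U is invariant
assert not np.allclose(T_tri @ P, P @ T_tri)      # but T and P do not commute
assert np.allclose(T_diag @ P, P @ T_diag)        # both invariant: TP = PT
```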

Notes:

  1. For example, for the space \(\RR^2\) of all pairs \((x,y)\) we could define three subspaces \(U_1=\{(x,0)\mid x\in\RR\}\), \(U_2=\{(0,x)\mid x\in\RR\}\), and \(U_3=\{(x,x)\mid x\in\RR\}\). Clearly \(U_i\cap U_j=\{0\}\) whenever \(i\neq j\) but it is equally clear that we couldn’t express an arbitrary element \((x,y)\in\RR^2\) uniquely in terms of elements of \(U_1\), \(U_2\) and \(U_3\).