Category Archives: Inner Product Spaces

The Riesz Lemma

Recall that we saw that there was no natural isomorphism between a finite dimensional vector space, \(V\), and its dual, \(V^*\). When we have a non-degenerate inner product, the situation is a little different. Consider first the case of a real non-degenerate inner product space, and define a map \(V\mapto V^*\) according to \(v\mapsto f_v\), where \(f_v(w)=(v,w)\) for any \(w\in V\). That this is indeed a linear map follows since \(f_{av}(w)=(av,w)=a(v,w)\), that is, \(av\mapsto af_v\). It is injective since the inner product is non-degenerate, and, since \(\dim V=\dim V^*\), it is an isomorphism. In the complex case we again have a bijection but now, since \(f_{av}(w)=(av,w)=a^*(v,w)\), it is no longer linear but antilinear. Thus, in this case, we say that the map is an anti-isomorphism.
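A quick numerical illustration may help. The following is a minimal sketch of the map \(v\mapsto f_v\) on \(\CC^3\) with the standard Hermitian inner product, conjugate-linear in its first argument (numpy’s vdot); the vectors, the seed and the helper name f are arbitrary choices made for the example.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

def f(v):
    """The functional f_v(w) = (v, w); np.vdot conjugates its first argument."""
    return lambda w: np.vdot(v, w)

a = 2.0 + 1.0j
# Antilinearity: f_{av} = a^* f_v, so v -> f_v is an anti-isomorphism.
print(np.isclose(f(a * v)(w), np.conj(a) * f(v)(w)))  # True
\end{verbatim}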

Polar and Singular Value Decompositions

Might there be an analog for operators of the polar decomposition of complex numbers, \(z=re^{i\theta}\)? If so, we’d hope for something of the form \(T=PU\) with a unitary operator \(U\) corresponding to \(e^{i\theta}\) and some operator \(P\) corresponding to the ‘absolute value’ of \(T\). Guided by the analogy, we should hope for \(P\) to be ‘positive’ in some sense.

Definition A linear operator \(T\) on a real orthogonal or complex Hermitian inner product space, \(V\), is called positive if it is self-adjoint and if \((Tv,v)\geq0\) for all \(v\in V\).

This is a sensible definition since for a self-adjoint operator \(T\), it is not difficult to see that the condition \((Tv,v)\geq0\) is equivalent to all eigenvalues, \(\lambda\), of \(T\) being such that \(\lambda\geq0\). Now, for any operator \(T\in\mathcal{L}(V)\), consider \(T^\dagger T\). This is clearly self-adjoint, and since \((T^\dagger Tv,v)=(Tv,Tv)\geq0\), by the assumed positive definiteness of the inner product, it is also positive (note also that \(\ker T=\ker(T^\dagger T)\)). So to any operator \(T\) is associated a self-adjoint positive operator \(T^\dagger T\). However, for our immediate goal of achieving an analog of the polar decomposition, it is of the wrong ‘order’; we need something like its ‘square root’.

Recall that any self-adjoint operator, \(T\), has a spectral decomposition \(T=\sum_i\lambda_iP_i\). If \(T\) is in fact positive, then each \(\lambda_i\geq0\) so that we could define,
\begin{equation}
\sqrt{T}=\sum_i\sqrt{\lambda_i}P_i.
\end{equation}
Clearly, \(\sqrt{T}\) is positive and \((\sqrt{T})^2=T\) (also note that \(\ker\sqrt{T}=\ker T\)). Moreover, it is the unique positive operator whose square is \(T\). Indeed, suppose \(A\) were a positive operator such that \(A^2=T\), with spectral decomposition \(A=\sum_i\mu_iQ_i\). Then \(A^2=T=\sum_i\mu_i^2Q_i\), and by the uniqueness of the spectral decomposition of \(T\) we know that, appropriately ordered, we must have \(\lambda_i=\mu_i^2\) and \(Q_i=P_i\), so that \(A=\sqrt{T}\).
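Numerically, the spectral square root is easy to sketch. The following minimal example assumes a Hermitian matrix represents the positive operator in an orthonormal basis; the test matrix is an arbitrary positive example, and the clip merely guards against round-off.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
T = A.conj().T @ A                         # T = A^dagger A is positive

evals, V = np.linalg.eigh(T)               # spectral decomposition, orthonormal columns
sqrtT = V @ np.diag(np.sqrt(evals.clip(min=0))) @ V.conj().T

print(np.allclose(sqrtT @ sqrtT, T))       # (sqrt T)^2 = T
print(np.allclose(sqrtT, sqrtT.conj().T))  # sqrt T is self-adjoint
\end{verbatim}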

Theorem Any operator \(T\) on a real orthogonal or complex Hermitian inner product space with positive definite inner product can be expressed as a product of two operators, \(T=UP\), called its polar decomposition, in which \(P\) is a uniquely determined positive operator and \(U\) is an isometry, which is unique if and only if \(T\) is invertible.

Proof To begin, notice that if such a decomposition exists, then \(T^\dagger T=PU^\dagger UP=P^2\), and by the uniqueness of the square root \(P=\sqrt{T^\dagger T}\) is unique. Also, if \(T\) is invertible, then so is \(P\), so \(U=TP^{-1}\) is unique. Conversely, if \(T\) is not invertible then neither is \(P\), in which case \(\ker P\) is non-trivial and we can write \(V=\ker P\oplus\img P\). \(U\) can then be replaced by \(UU'\) where \(U'\) is any isometry of the form \(U'=f\oplus\id_{\img P}\) with \(f\) any isometry of \(\ker P\).
Now let us consider existence. Define \(P=\sqrt{T^\dagger T}\) and observe that in the case that \(T\) is invertible we could simply define \(U=TP^{-1}\) which, as is easily verified, is an isometry. In the case that \(T\) is not invertible, we start by considering the subspace \(\img P\) of \(V\) and define on this subspace the map \(U_1:\img P\mapto\img T\) as \(U_1=TP^{-1}|_{\img P}\) (\(P\) restricts to an isomorphism of \(\img P\) since \(\ker P\cap\img P=0\)). So defined, \(U_1\) is clearly linear. Since \(\ker P=\ker T\), it is well defined, in the sense that if \(v_1,v_2\in V\) are such that \(Pv_1=Pv_2\) then \(Tv_1=Tv_2\); it is also injective, and \(\dim\img P=\dim\img T\). Moreover, for any \(v_1,v_2\in\img P\), with \(v_1=Pu_1\) and \(v_2=Pu_2\), \((U_1v_1,U_1v_2)=(Tu_1,Tu_2)=(T^\dagger Tu_1,u_2)=(P^2u_1,u_2)=(Pu_1,Pu_2)=(v_1,v_2)\), so if \(\{v_1,\dots,v_k\}\) is an orthonormal basis of \(\img P\) and \(\{v_{k+1},\dots,v_n\}\) an orthonormal basis of \(\ker P=(\img P)^\perp\), then \(\{U_1v_{1},\dots,U_1v_k\}\) is an orthonormal basis for \(\img T\) which we can extend to an orthonormal basis of \(V=\img P\oplus(\img P)^\perp=\img T\oplus(\img T)^\perp\) as \(\{U_1v_{1},\dots,U_1v_k,u_{k+1},\dots,u_n\}\) where \(\{u_{k+1},\dots,u_n\}\) is an orthonormal basis of \((\img T)^\perp\). Defining \(U_2:(\img P)^\perp\mapto(\img T)^\perp\) by \(U_2v_i=u_i\) for \(i=k+1,\dots,n\), \(U=U_1\oplus U_2\) is then the desired isometry of \(V\).\(\blacksquare\)

Remark From the polar decomposition of an operator \(T\) as \(T=UP\) we can obtain a polar decomposition of \(T\) of the form \(T=P'U\), where \(U\) is the same isometry and now \(P'=UPU^{-1}\) is again positive.
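One standard way to sketch the polar decomposition numerically is via the singular value decomposition, discussed next: if \(\mathbf{T}=\mathbf{W}\mathbf{\Sigma}\mathbf{V}^\dagger\) then \(\mathbf{U}=\mathbf{W}\mathbf{V}^\dagger\) and \(\mathbf{P}=\mathbf{V}\mathbf{\Sigma}\mathbf{V}^\dagger\). The test matrix below is an arbitrary (almost surely invertible) example.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

W, s, Vh = np.linalg.svd(T)              # T = W diag(s) V^dagger
U = W @ Vh                               # the isometry
P = Vh.conj().T @ np.diag(s) @ Vh        # the positive factor sqrt(T^dagger T)

print(np.allclose(U @ P, T))                   # T = U P
print(np.allclose(U.conj().T @ U, np.eye(4)))  # U is unitary
P2 = U @ P @ U.conj().T                        # P' = U P U^{-1}
print(np.allclose(P2 @ U, T))                  # T = P' U
\end{verbatim}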

For any linear operator \(T\), the eigenvalues of \(\sqrt{T^\dagger T}\) are called the singular values of \(T\). Clearly the singular values of \(T\) are non-negative real numbers. We will use the polar decomposition to establish the following result, known as the singular value decomposition.

Theorem For any operator \(T\in\mathcal{L}(V)\) with singular values \(s_i\), \(i=1,\dots,n\), there exist orthonormal bases of \(V\), \(\{e_1,\dots,e_n\}\) and \(\{f_1,\dots,f_n\}\) such that \(Te_i=s_if_i\) for \(i=1,\dots,n\).

Proof Choose \(\{e_1,\dots,e_n\}\) to be an orthonormal basis of eigenvectors of \(\sqrt{T^\dagger T}\), so that \(\sqrt{T^\dagger T}e_i=s_ie_i\). By the polar decomposition there is an isometry \(U\) such that \(T=U\sqrt{T^\dagger T}\). It follows that \(Te_i=U\sqrt{T^\dagger T}e_i=s_iUe_i\). Thus, defining \(f_i=Ue_i\), which gives an orthonormal basis since \(U\) is an isometry, we have \(Te_i=s_if_i\).\(\blacksquare\)

Example Suppose \(X\) and \(Y\) are \(n\)-dimensional subspaces of a \(2n\)-dimensional vector space \(V\) such that \(V=X\oplus Y\), with \(X\) and \(Y\) not assumed to be orthogonal. As subspaces of \(V\), both \(X\) and \(Y\) have orthonormal bases which we’ll denote by \(\{e_1,\dots,e_n\}\) and \(\{f_1,\dots,f_n\}\) respectively. We define a matrix \(\mathbf{A}\) with coordinates \(A_{ij}=(e_i,f_j)\). This matrix then has a polar decomposition, \(\mathbf{A}=\mathbf{U}\mathbf{P}\), and we can use the matrix \(\mathbf{U}\) to define a new orthonormal basis for \(X\) with elements \(e_i'=U_{ki}e_k\), \(i=1,\dots,n\). Notice that \((e_i',f_j)=(U_{ki}e_k,f_j)=U_{ki}^*(e_k,f_j)={U^\dagger}_{ik}A_{kj}=P_{ij}\). In particular, since \(\mathbf{P}\) is self-adjoint, this means that \((e_i',f_j)=(e_j',f_i)^*\). Now introduce a linear operator, \(T\in\mathcal{L}(V)\), by defining it on basis elements as \(Te_i'=f_i\) and \(Tf_i=e_i'\). Clearly this map is such that \(T(X)=Y\) and \(T(Y)=X\). We demonstrate that it is in fact an isometry of \(V\). We have \((Te_i',Te_j')=(f_i,f_j)=\delta_{ij}=(e_i',e_j')=(Tf_i,Tf_j)\), but also \((Te_i',Tf_j)=(f_i,e_j')=(e_j',f_i)^*=(e_i',f_j)\). More generally, for \(X\) and \(Y\) subspaces of equal dimension of a vector space \(V\), there always exists an isometry of \(V\) which interchanges \(X\) and \(Y\). To see this we first define \(\tilde{X}\) and \(\tilde{Y}\) to be the orthogonal complements of \(X\cap Y\) in \(X\) and \(Y\) respectively, so that \(X=(X\cap Y)\oplus\tilde{X}\) and \(Y=(X\cap Y)\oplus\tilde{Y}\). Now we can write \(V\) as \(V=(X\cap Y)\oplus(\tilde{X}\oplus\tilde{Y})\oplus(X+Y)^\perp\). In this decomposition, note that \(\tilde{X}\oplus\tilde{Y}\) is not an orthogonal direct sum whilst the other two sums are. The result now follows since we’ve already seen how to construct the desired isometry of \(\tilde{X}\oplus\tilde{Y}\), and this can be extended to an isometry of \(V\) by acting as the identity on \(X\cap Y\) and \((X+Y)^\perp\).

The singular value decomposition appears most commonly as the statement that any (real/complex) \(m\times n\) matrix \(\mathbf{T}\) can be expressed as a product of matrices \(\mathbf{P}\mathbf{\Sigma}\mathbf{Q}^\dagger\), where \(\mathbf{P}\) and \(\mathbf{Q}\) are respectively \(m\times m\) and \(n\times n\) real orthogonal/complex unitary matrices and \(\mathbf{\Sigma}\) is an \(m\times n\) matrix whose only non-zero entries are non-negative real numbers on the main diagonal.

To see this, consider two finite dimensional vector spaces, \(U\) and \(V\), both real orthogonal or complex Hermitian inner product spaces of dimensions \(n\) and \(m\) respectively, and a linear map \(T\in\mathcal{L}(U,V)\). Then \(\ker(T^\dagger T)=\ker T\) since clearly \(\ker T\subseteq\ker(T^\dagger T)\), and if \(u\in\ker(T^\dagger T)\) then \(0=(T^\dagger Tu,u)=(Tu,Tu)\) so \(u\in\ker T\). Thus, since \(T^\dagger T\) is positive, if we set \(r=\rank T=\rank(T^\dagger T)\) then \(U\) has an orthonormal basis \(\{u_1,\dots,u_r,u_{r+1},\dots,u_n\}\) of eigenvectors of \(T^\dagger T\) such that the corresponding eigenvalues can be arranged so that \(\lambda_1\geq\cdots\geq\lambda_r>0=\lambda_{r+1}=\cdots=\lambda_n\). Having fixed notation in this way we have that \(\{u_{r+1},\dots,u_n\}\) is a basis for \(\ker T=\ker(T^\dagger T)\) and that \(\{u_1,\dots,u_r\}\) is a basis for \((\ker T)^\perp\). In fact, \(\img T^\dagger\subseteq(\ker T)^\perp\), since if \(\tilde{u}\in\img T^\dagger\) then \(\tilde{u}=T^\dagger v\) for some \(v\in V\) and so for any \(u\in\ker T\), \((u,\tilde{u})=(u,T^\dagger v)=(Tu,v)=0\). Conversely, \((\ker T)^\perp\subseteq\img T^\dagger\) since if \(u\notin\img T^\dagger\) then \(\exists\tilde{u}\in(\img T^\dagger)^\perp\) such that \((u,\tilde{u})\neq0\); but \(T^\dagger T\tilde{u}\in\img T^\dagger\) and \((T\tilde{u},T\tilde{u})=(\tilde{u},T^\dagger T\tilde{u})=0\), so that \(T\tilde{u}=0\) and \(\tilde{u}\in\ker T\), which means \(u\notin(\ker T)^\perp\). Thus \(\img T^\dagger=(\ker T)^\perp\) and \(\{u_1,\dots,u_r\}\) is thus a basis for \(\img T^\dagger\).

Let us now introduce the notation \(s_i=\sqrt{\lambda_i}\) so that \(T^\dagger Tu_i=s_i^2u_i\) and, for \(i=1,\dots,r\), define the elements \(v_i\in V\) by \(v_i=(1/s_i)Tu_i\). Then
\begin{equation*}
(v_i,v_j)=\frac{1}{s_is_j}(Tu_i,Tu_j)=\frac{1}{s_is_j}(u_i,T^\dagger Tu_j)=\frac{s_j}{s_i}\delta_{ij}=\delta_{ij}
\end{equation*}
so \(\{v_1,\dots,v_r\}\) is an orthonormal basis for \(\img T=(\ker T^\dagger)^\perp\) which can be extended to an orthonormal basis for \(V\), with \(\{v_{r+1},\dots,v_m\}\) then a basis for \(\ker T^\dagger\). With the basis vectors \(\{v_1,\dots,v_m\}\) of \(V\) so defined, \(TT^\dagger v_i=s_iTu_i=s_i^2v_i\), so that the \(v_i\) are eigenvectors for \(TT^\dagger\) with the same eigenvalues as the \(u_i\) have as eigenvectors for \(T^\dagger T\).

The result for matrices now follows since, given the bases \(\{u_i\}\) and \(\{v_i\}\) of \(U\) and \(V\) respectively and the corresponding standard bases \(\{e_i\}\), there are real orthogonal/complex unitary matrices \(\mathbf{P}\) and \(\mathbf{Q}\) whose elements are defined by \(e_i=P_i^ju_j\) and \(e_i=Q_i^jv_j\). The matrix elements of \(T\) with respect to the standard bases of \(U\) and \(V\) are defined by \(Te_i=T_i^je_j\), but we know that \(Tu_i=s_iv_i\) so we have
\begin{equation*}
Te_i=P_i^jTu_j=P_i^js_jv_j=P_i^js_j{Q^\dagger}_j^ke_k=T_i^ke_k.
\end{equation*}
That is, \(\mathbf{T}=\mathbf{P}\mathbf{\Sigma}\mathbf{Q}^\dagger\).
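As a concrete check, numpy’s np.linalg.svd returns exactly such a factorisation; the shape of the test matrix below is an arbitrary illustrative choice.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
T = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))

P, s, Qh = np.linalg.svd(T)              # P is 3x3 unitary, Qh = Q^dagger is 5x5 unitary
Sigma = np.zeros((3, 5))
np.fill_diagonal(Sigma, s)               # non-negative reals on the main diagonal

print(np.allclose(P @ Sigma @ Qh, T))    # T = P Sigma Q^dagger
\end{verbatim}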

Self-Adjoint, Unitary and Orthogonal Operators

We continue to focus on real orthogonal and complex Hermitian spaces with positive definite inner products, now considering maps between them. Suppose \(V\) and \(W\) are such spaces, of dimensions \(n\) and \(m\) respectively, and \(T\in\mathcal{L}(V,W)\). Then the inner product allows us to uniquely associate a linear map \(T^\dagger\in\mathcal{L}(W,V)\) with \(T\), by defining it to be such that \((v,T^\dagger w)=(Tv,w)\). This is the adjoint of \(T\) (in the case of Hermitian spaces, sometimes the Hermitian adjoint). To see that it is indeed unique, notice that if, for a given \(w\in W\), there were distinct \(v',v''\in V\) such that \((Tv,w)=(v,v')\) and \((Tv,w)=(v,v'')\), then \((v,v'-v'')=0\) for all \(v\in V\), so \(v'=v''\). Its existence follows since if \(\{e_i\}\) is an orthonormal basis of \(V\), then any \(v\in V\) can be expressed as \(v=\sum v^ie_i\) with \(v^i=(e_i,v)\). Then \((Tv,w)=\sum(v,e_i)(Te_i,w)=(v,\sum(Te_i,w)e_i)\), so \(T^\dagger w=\sum(Te_i,w)e_i\). Indeed, if the matrix representation of \(T\) with respect to orthonormal bases is \(\mathbf{T}\), then the matrix representation of \(T^\dagger\) is \(\mathbf{T}^\dagger\).

Of particular interest are operators on inner product spaces which coincide with their adjoint.

Definition A linear operator \(T\in\mathcal{L}(V)\) is self-adjoint or Hermitian if \(T=T^\dagger\).

Remark In terms of an orthonormal basis for \(V\), this means that the matrix representation of \(T\) is such that \(\mathbf{T}=\mathbf{T}^\dagger\). That is, if \(K=\RR\) it is symmetric whilst if \(K=\CC\) it is Hermitian.

Remark We’ll typically use the word Hermitian in the specific context of a complex Hermitian space and self-adjoint when the underlying vector space could be either real orthogonal or complex Hermitian.

Remark Recall Example. For any \(T\in\mathcal{L}(V)\), \(\ker T^\dagger=(\img T)^\perp\). Indeed if \(u\in\ker T^\dagger\) then for any element \(w\in\img T\), \((u,w)=(u,Tv)\), for some \(v\in V\), and \((u,Tv)=(T^\dagger u,v)=0\), so \(u\in(\img T)^\perp\). Conversely, if \(u\in(\img T)^\perp\), then for any \(v\in V\), \((T^\dagger u,v)=(u,Tv)=0\), so \(u\in\ker T^\dagger\). In particular, if \(T\) is self-adjoint, then \(\ker T=(\img T)^\perp\), and we have the orthogonal direct sum decomposition, \(V=\ker T\oplus\img T\).

The definition of an Hermitian operator looks somewhat like an operator/matrix version of the condition \(z=z^*\) by which we single out the reals among the complex numbers. Though seemingly a trivial observation, there turns out to be a rather remarkable analogy between linear operators and complex numbers, and in this analogy real numbers do indeed correspond to self-adjoint operators. In due course we’ll see analogs of modulus 1 and positive numbers as well as of the polar decomposition of complex numbers. First though we obtain some results which make the real number/self-adjoint operator correspondence particularly compelling.

It is not difficult to see that for any linear operator \(T\) on a positive definite inner product space, \(T=0\) if and only if \((Tv,w)=0\) for all \(v,w\in V\). In fact we can do somewhat better than this.

Theorem A linear operator \(T\) on a complex Hermitian space \(V\) is zero, \(T=0\), if and only if \((Tv,v)=0\) for all \(v\in V\).

Proof Observe that generally, we have that,
\begin{equation}
(T(av),bw)+(T(bw),av)=(T(av+bw),av+bw)-\abs{a}^2(Tv,v)-\abs{b}^2(Tw,w)
\end{equation}
for all \(v,w\in V\) and \(a,b\in\CC\). In particular, if \((Tv,v)=0\) for all \(v\in V\), then with \(a=b=1\),
\begin{equation}
(Tv,w)+(Tw,v)=0,
\end{equation}
and choosing \(a=1\) and \(b=i\) then dividing by \(i\),
\begin{equation}
(Tv,w)-(Tw,v)=0,
\end{equation}
so that \((Tv,w)=0\) for all \(v,w\in V\) and we conclude that \(T=0\).\(\blacksquare\)

Note that we made essential use of the fact that \(V\) is a complex vector space here. The result is not generally valid for real vector spaces: a rotation through \(\pi/2\) in \(\RR^2\) satisfies \((Tv,v)=0\) for all \(v\), yet is certainly non-zero.

Theorem A linear operator \(T\in\mathcal{L}(V)\) on a complex Hermitian space \(V\) is Hermitian if and only if \((Tv,v)\) is real for all \(v\in V\).

Proof If \(T\) is Hermitian, then \((Tv,v)=(v,Tv)=(Tv,v)^*\) so \((Tv,v)\) is real. Conversely, if \((Tv,v)\) is real then \((Tv,v)=(Tv,v)^*=(v,T^\dagger v)^*=(T^\dagger v,v)\) so that \(((T-T^\dagger)v,v)=0\). But we’ve already seen that in this case we must have \(T-T^\dagger=0\) so \(T=T^\dagger\).\(\blacksquare\)

The following result provides an even stronger expression of the ‘realness’ of self-adjoint operators.

Theorem \(T\in\mathcal{L}(V)\) is a self-adjoint operator on a real orthogonal or complex Hermitian inner product space with positive definite inner product if and only if

  1. All eigenvalues of \(T\) are real.
  2. Eigenvectors with distinct eigenvalues are orthogonal.
  3. There exists an orthonormal basis of eigenvectors of \(T\). In particular, \(T\) is diagonalisable.

Proof The if is straightforward so we concentrate on the only if.

  1. Assuming \(K=\CC\), if \(Tv=\lambda v\) for some \(\lambda\in\CC\) and a non-zero vector \(v\in V\), then \(\lambda^*(v,v)=(\lambda v,v)=(Tv,v)=(v,Tv)=\lambda(v,v)\), so \(\lambda\) must be real. Now suppose \(K=\RR\), but let us pass to the complexification, \(V_\CC\), defined in Realification and Complexification. To avoid confusion with the inner product, let us abuse notation and write an element of \(V_\CC\) as \(v+iv'\) rather than \((v,v')\), where \(v,v'\in V\). Then, given the symmetric inner product, \((\cdot,\cdot)\), on \(V\), we can define an Hermitian inner product on \(V_\CC\) according to \((u+iu',v+iv')=(u,v)+(u',v')-i(u',v)+i(u,v')\), where \(u,u',v,v'\in V\). Clearly, \(T_\CC\), which acts on an element, \(v+iv'\), of \(V_\CC\) as \(T_\CC(v+iv')=Tv+iTv'\), is self-adjoint with respect to this inner product, and since, as we already know, the matrices of \(T\) and \(T_\CC\) with respect to a basis of \(V\) (which is also a basis of \(V_\CC\)) are identical, it follows that the eigenvalues of \(T\) must be real.
  2. Suppose \(Tv_1=\lambda_1 v_1\) and \(Tv_2=\lambda_2 v_2\) with \(\lambda_1\neq\lambda_2\). Then \(\lambda_1^*(v_1,v_2)=(Tv_1,v_2)=(v_1,Tv_2)=(v_1,v_2)\lambda_2\), so that if, for contradiction, \((v_1,v_2)\neq 0\) then \(\lambda_1^*=\lambda_1=\lambda_2\) contradicting the initial assumption.
  3. In the case \(\dim V=1\) the result is trivial so we proceed by induction on the dimension of \(V\), assuming the result holds for dimensions less than \(n=\dim V>1\). We know there is a real eigenvalue \(\lambda\) and eigenvector \(v_1\in V\) such that \(Tv_1=\lambda v_1\), and since by assumption the inner product of \(V\) is non-degenerate we have the decomposition \(V=\Span(v_1)\oplus\Span(v_1)^\perp\). Now since for any \(w\in\Span(v_1)^\perp\) we have \((Tw,v_1)=(w,Tv_1)=(w,v_1)\lambda=0\), \(\Span(v_1)^\perp\) is \(T\)-invariant. Thus, by the induction hypothesis, we can assume the result for \(\Span(v_1)^\perp\) and take \(\hat{v}_2,\dots,\hat{v}_n\) as its orthonormal basis of eigenvectors. Then defining \(\hat{v}_1=v_1/\norm{v_1}\), \(\hat{v}_1,\dots,\hat{v}_n\) is an orthonormal basis of eigenvectors of \(T\).\(\blacksquare\)

Since the eigenspaces corresponding to the \(r\) distinct eigenvalues of a self-adjoint operator \(T\) decompose \(V\) into an orthogonal direct sum, \(V=\oplus_iV_i\), there correspond orthogonal projectors, \(P_i\), such that, \(\id_V=\sum_iP_i\). Thus for any \(v\in V\) we have \(Tv=T(\sum_iP_iv)=\sum_i\lambda_iP_iv\), that is,
\begin{equation}
T=\sum_{i=1}^r\lambda_iP_i,
\end{equation}
the spectral decomposition of \(T\), of which we’ll see a great deal more later. This decomposition is unique. Suppose we have \(r\) orthogonal projectors, \(Q_i\), complete in the sense that \(\id_V=\sum_iQ_i\), together with \(r\) real numbers, \(\mu_i\), such that \(T=\sum_i\mu_iQ_i\). If \(v\in\img Q_i\), that is, \(v=Q_iv\), we must have \(Tv=\mu_iv\). That is, \(\mu_i\) is an eigenvalue of \(T\) and any \(v\in\img Q_i\) belongs to the eigenspace of \(\mu_i\). Conversely, if \(Tv=\lambda v\) for some non-zero \(v\in V\), then since \(v=\sum_iQ_iv\), writing \(v_i=Q_iv\) we have \(\sum_i(\lambda-\mu_i)v_i=0\). Those \(v_i\) which are non-zero are orthogonal, and since \(v\neq0\) at least one must be non-zero, so there must be some \(i\) such that \(\lambda=\mu_i\). Let us suppose we have relabelled the \(\mu_i\) such that \(\lambda_i=\mu_i\). Clearly, for any polynomial \(p\), \(p(T)=\sum_ip(\lambda_i)P_i=\sum_ip(\lambda_i)Q_i\). In particular, if we define the polynomials \(p_j(x)=\prod_{i\neq j}(x-\lambda_i)/(\lambda_j-\lambda_i)\), then \(p_j(\lambda_j)=1\) but \(p_j(\lambda_i)=0\) for all \(i\neq j\), so that \(p_j(T)=P_j=Q_j\) for each \(j=1,\dots,r\).
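The decomposition is easily sketched numerically by assembling the projectors \(P_i\) from an orthonormal eigenbasis; the test matrix, and the rounding used to group (numerically) repeated eigenvalues, are illustrative choices.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
T = A + A.conj().T                       # a Hermitian test matrix

evals, V = np.linalg.eigh(T)
projectors = {}
for lam, v in zip(evals, V.T):
    key = round(lam, 10)                 # group repeated eigenvalues
    projectors[key] = projectors.get(key, 0) + np.outer(v, v.conj())

print(np.allclose(sum(projectors.values()), np.eye(5)))           # completeness
print(np.allclose(sum(l * P for l, P in projectors.items()), T))  # T = sum lambda_i P_i
\end{verbatim}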

Generally, a projector \(P\in\mathcal{L}(V)\) on a real orthogonal or complex Hermitian inner product space with positive definite inner product is an orthogonal projector if and only if \(P\) is self-adjoint. Indeed, the eigenvalues of a projection operator are either 0 or 1 and the corresponding eigenspaces are \(\ker P\) and \(\img P\) respectively. Thus, if \(P\) is self-adjoint, these eigenspaces are orthogonal. Conversely, if \(P\) is an orthogonal projection, we have \(V=\ker P\oplus\img P\), with \(\ker P\) and \(\img P\) orthogonal. So choosing an orthonormal basis for \(V\) as the union of orthonormal bases for \(\ker P\) and \(\img P\), we have a basis which is precisely an orthonormal basis of eigenvectors of \(P\), so \(P\) is self-adjoint.

Earlier, we classified inner product spaces up to isometry. Focusing on real orthogonal and Hermitian spaces with non-degenerate inner products, let us now consider automorphisms, \(f:V\mapto V\), of these spaces which are also isometries, that is, such that, \((v,w)=(f(v),f(w))\), \(\forall v,w\in V\). Given our definition of the adjoint, this means that, \(f^\dagger f=\id_V\). If \(f\) is an isometry of a real orthogonal geometry it is called an orthogonal operator whilst an isometry of an Hermitian geometry is called a unitary operator.

Isometries of course form a group and in the case of a real orthogonal space whose inner product has signature, \((p,n-p,0)\), that group is called the orthogonal group of the inner product, \(O(V,p,n-p)\). Choosing an orthonormal basis for \(V\), \(\{e_i\}\), such that \((e_i,e_j)=\epsilon_i\delta_{ij}\) (no summation) with \(\epsilon_i=1\), \(1\leq i\leq p\) and \(\epsilon_i=-1\), \(p+1\leq i\leq n\), and defining the matrix, \(\mathbf{I}_{p,q}\), to be
\begin{equation*}
\mathbf{I}_{p,q}=
\begin{pmatrix}
\mathbf{I}_p & \mathbf{0}\\
\mathbf{0} & -\mathbf{I}_q
\end{pmatrix},
\end{equation*}
then it’s not difficult to see that we have a group isomorphism, \(O(V,p,n-p)\cong O(p,n-p)\), where \(O(p,n-p)\) is the matrix group,
\begin{equation*}
O(p,n-p)=\{\mathbf{O}\in\text{GL}_n(\RR)\mid \mathbf{O}^\mathsf{T}\mathbf{I}_{p,n-p}\mathbf{O}=\mathbf{I}_{p,n-p}\}.
\end{equation*}
In particular, when the inner product is positive definite, then the group of isometries is denoted simply, \(O(V)\), and we have the isomorphism, \(O(V)\cong O(n)\), where,
\begin{equation*}
O(n)=\{\mathbf{O}\in\text{GL}_n(\RR)\mid \mathbf{O}^\mathsf{T}\mathbf{O}=\mathbf{I}_n\}.
\end{equation*}
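As a small numerical check of the indefinite case, a hyperbolic rotation (a Lorentz boost) satisfies the defining matrix condition of \(O(1,1)\); the rapidity value is an arbitrary choice.
\begin{verbatim}
import numpy as np

t = 0.7                                  # an arbitrary rapidity
O = np.array([[np.cosh(t), np.sinh(t)],
              [np.sinh(t), np.cosh(t)]])
I11 = np.diag([1.0, -1.0])               # the matrix I_{1,1}

print(np.allclose(O.T @ I11 @ O, I11))   # True: O is in O(1,1)
\end{verbatim}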

Similarly, in the case of Hermitian geometries the group of isometries is called the unitary group of the inner product. If the signature of the inner product is \((p,n-p,0)\) then it is denoted, \(U(V,p,n-p)\), and we have an isomorphism, \(U(V,p,n-p)\cong U(p,n-p)\), where \(U(p,n-p)\) is the matrix group defined by,
\begin{equation*}
U(p,n-p)=\{\mathbf{U}\in\text{GL}_n(\CC)\mid \mathbf{U}^\dagger\mathbf{I}_{p,n-p}\mathbf{U}=\mathbf{I}_{p,n-p}\}.
\end{equation*}
In particular, when the inner product is positive definite, a choice of an orthonormal basis provides an isomorphism \(U(V)\cong U(n)\) where,
\begin{equation*}
U(n)=\{\mathbf{U}\in\text{GL}_n(\CC)\mid \mathbf{U}^\dagger\mathbf{U}=\mathbf{I}_n\}.
\end{equation*}

In the spirit of the analogy, already discussed, between complex numbers and linear operators, unitary operators look like they should correspond to complex numbers of unit modulus. Indeed, as the following result, similar to Theorem, demonstrates, the spectra of such operators justify the analogy.

Theorem \(U\) is a unitary operator on an Hermitian inner product space over \(\CC\) with positive definite inner product if and only if,

  1. All eigenvalues \(\lambda\) of \(U\) are such that \(|\lambda|=1\).
  2. Eigenvectors with distinct eigenvalues are orthogonal.
  3. There exists an orthonormal basis of eigenvectors of \(U\). In particular \(U\) is diagonalisable.

Proof The if is straightforward so we concentrate on the only if.

  1. If \(Uv=\lambda v\) for some \(\lambda\in\CC\) and a non-zero vector \(v\in V\) then \((Uv,Uv)=\lambda^*\lambda(v,v)=(v,v)\) so \(|\lambda|=1\).
  2. Suppose \(Uv_1=\lambda_1 v_1\) and \(Uv_2=\lambda_2 v_2\) with \(\lambda_1\neq\lambda_2\). Then \(\lambda_1^*(v_1,v_2)=(Uv_1,v_2)=(v_1,U^{-1}v_2)=(v_1,v_2)\lambda_2^{-1}=(v_1,v_2)\lambda_2^*\), and since \(\lambda_1^*\neq\lambda_2^*\), \((v_1,v_2)=0\).
  3. In the case \(\dim V=1\) the result is trivial so we proceed by induction on the dimension of \(V\), assuming the result holds for dimensions less than \(n=\dim V>1\). We know there is some eigenvalue \(\lambda\) and eigenvector \(v_1\in V\) such that \(Uv_1=\lambda v_1\), and since by assumption the inner product of \(V\) is non-degenerate we have the decomposition \(V=\Span(v_1)\oplus\Span(v_1)^\perp\). Now since for every \(w\in\Span(v_1)^\perp\) we have \((Uw,v_1)=(w,U^{-1}v_1)=(w,v_1)\lambda^*=0\), \(\Span(v_1)^\perp\) is \(U\)-invariant. Thus, by the induction hypothesis, we can assume the result for \(\Span(v_1)^\perp\) and take \(\hat{v}_2,\dots,\hat{v}_n\) as its orthonormal basis of eigenvectors. Then defining \(\hat{v}_1=v_1/\norm{v_1}\), \(\hat{v}_1,\dots,\hat{v}_n\) is an orthonormal basis of eigenvectors of \(U\).\(\blacksquare\)

The corresponding result for orthogonal operators is a little different.

Theorem \(O\) is an orthogonal operator on a real orthogonal inner product space over \(\RR\) with positive definite inner product if and only if there exists an orthonormal basis in terms of which the matrix representation of \(O\) has the form,
\begin{equation}\label{orthog operator}
\begin{pmatrix}
\mathbf{R}(\theta_1)&\mathbf{0}& & & & & & & \\
\mathbf{0}&\ddots & & & & & & & \\
& &\mathbf{R}(\theta_r) && & & & & \\
& & &1 &\mathbf{0} & & & &\\
& & & \mathbf{0}&\ddots & & & & \\
& & & & &1& & & \\
& & & & & &-1& \mathbf{0}& \\
& & & & & &\mathbf{0}&\ddots & \\
& & & & & & & &-1
\end{pmatrix}.
\end{equation}

Proof As ever, the if is straightforward so we focus on the only if. Dimension 1 is trivial. Consider dimension \(2\). A choice of orthonormal basis tells us that any orthogonal operator, \(O\), has a matrix representation \(\mathbf{O}\) such that \(\mathbf{O}^\mathsf{T}\mathbf{O}=\mathbf{I}_2\). Taking the determinant of this condition, we see that \(\det\mathbf{O}=\pm1\). Writing \(\mathbf{O}\) in the form,
\begin{equation*}
\begin{pmatrix}
a&b\\
c&d
\end{pmatrix},
\end{equation*}
the condition \(\mathbf{O}^\mathsf{T}\mathbf{O}=\mathbf{I}_2\) says that \(a^2+c^2=1=b^2+d^2\) and \(ab+cd=0\), with \(ad-bc=\pm1\). So in the case of determinant 1 we have \(b=-c\) and \(a=d\), and any such matrix can be written as
\begin{equation}
\begin{pmatrix}
\cos\theta&-\sin\theta\\
\sin\theta&\cos\theta
\end{pmatrix},
\end{equation}
for \(0\leq\theta<2\pi\), that is, a rotation through an angle \(\theta\). Notice that for \(\theta\neq0,\pi\) this has no eigenvalues in \(\RR\). In the determinant \(-1\) case we have \(a=-d\) and \(b=c\), and any such matrix can be written
\begin{equation}
\begin{pmatrix}
\cos\theta&\sin\theta\\
\sin\theta&-\cos\theta
\end{pmatrix},
\end{equation}
for \(0\leq\theta<2\pi\), that is, a reflection in the line with unit vector \(\cos\frac{\theta}{2}\mathbf{e}_1+\sin\frac{\theta}{2}\mathbf{e}_2\). In contrast to the rotation matrix, this matrix has eigenvalues \(\pm1\) and so can be diagonalised. We conclude that the result holds for dimensions 1 and 2 and proceed, as in the unitary case, by induction on the dimension of \(V\), assuming the result holds for dimensions less than \(n=\dim V>2\).
If \(O\) has a real eigenvalue then by the same argument as in the unitary case the result follows. Otherwise, consider \(O_\CC\), the complexification of \(O\), which must have complex-conjugate pairs of eigenvalues. We recall from the discussion of the real Jordan normal form in The Real Jordan Normal Form that each such pair of complex eigenvalues corresponds to a \(2\)-dimensional \(O\)-invariant subspace of \(V\). Choose such a subspace, \(V_0\). Then in an orthonormal basis we know that the matrix representation of the restriction of \(O\) to \(V_0\) must have the form \(\mathbf{R}(\theta)\) where,
\begin{equation}
\mathbf{R}(\theta)=\begin{pmatrix}
\cos\theta&-\sin\theta\\
\sin\theta&\cos\theta
\end{pmatrix}.
\end{equation}
Similar reasoning to the unitary case makes it clear that \(V_0^\perp\) is also \(O\)-invariant and so the result follows.\(\blacksquare\)

So in summary, an operator \(O\) on a real orthogonal space \(V\) with a positive definite inner product is an isometry, that is, an orthogonal operator, if and only if its matrix representation with respect to some orthonormal basis of \(V\) has the form \eqref{orthog operator}. An operator \(U\) on a (complex) Hermitian space \(V\) with a positive definite inner product is an isometry, that is, a unitary operator, if and only if its matrix representation with respect to some orthonormal basis of \(V\) is diagonal with its diagonal elements all belonging to the unit circle in \(\CC\). We’ve also seen that operators \(T\) on real orthogonal or Hermitian spaces with positive definite inner products are self-adjoint if and only if they are diagonalisable, with respect to an orthonormal basis, with all diagonal entries real.

It’s also worth noting that in an inner product space, \(V\), the orbit of a self-adjoint linear operator \(A\in\mathcal{L}(V)\) under the usual action, \(P^{-1}AP\), of vector space automorphisms \(P\in\text{GL}(V)\) does not consist only of other self-adjoint operators. However, if we consider instead the action of isometries, that is, of elements \(U\in U(V)\) when \(V\) is over \(\CC\) or of elements \(O\in O(V)\) when working over \(\RR\), then the orbits consist exclusively of self-adjoint operators.

Proposition If \(\mathbf{A}\in\text{Mat}_n(\CC)\) is Hermitian, that is, \(\mathbf{A}^\dagger=\mathbf{A}\), then there exists a \(\mathbf{P}\in U(n)\) such that \(\mathbf{P}^{-1}\mathbf{A}\mathbf{P}\) is real and diagonal. Similarly, if \(\mathbf{A}\in\text{Mat}_n(\RR)\) is symmetric, that is, \(\mathbf{A}^\mathsf{T}=\mathbf{A}\), then there exists a \(\mathbf{P}\in O(n)\) such that \(\mathbf{P}^{-1}\mathbf{A}\mathbf{P}\) is real and diagonal.

Proof We use Theorem and treat both \(K=\CC\) and \(K=\RR\) simultaneously. So assume \(\mathbf{A}\in\text{Mat}_n(K)\), then \(L_\mathbf{A}\in\mathcal{L}(K^n)\) is self-adjoint with respect to the standard inner product on \(K^n\), and from Theorem there is an orthonormal basis of eigenvectors \(\mathbf{v}_1,\dots,\mathbf{v}_n\) for \(L_\mathbf{A}\). That is, there are some \(\lambda_1,\dots,\lambda_n\in\RR\) such that \(L_\mathbf{A}\mathbf{v}_i=\lambda_i\mathbf{v}_i\), or in terms of matrices,
\begin{equation}
\mathbf{A}(\mathbf{v}_1\dots\mathbf{v}_n)=(\lambda_1\mathbf{v}_1\dots\lambda_n\mathbf{v}_n).
\end{equation}
So, defining \(\mathbf{P}=(\mathbf{v}_1\dots\mathbf{v}_n)\), which is unitary/orthogonal precisely because its columns are orthonormal, we have \(\mathbf{A}\mathbf{P}=\mathbf{P}\,\text{diag}(\lambda_1,\dots,\lambda_n)\), that is, \(\mathbf{P}^{-1}\mathbf{A}\mathbf{P}=\text{diag}(\lambda_1,\dots,\lambda_n)\), which is real and diagonal.\(\blacksquare\)
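A minimal numerical illustration of the Proposition, assuming numpy’s eigh (which returns an orthonormal eigenbasis of a Hermitian matrix) stands in for Theorem; the test matrix is arbitrary.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T                       # A^dagger = A

lam, P = np.linalg.eigh(A)               # columns of P: orthonormal eigenvectors
D = np.linalg.inv(P) @ A @ P             # = P^dagger A P since P is unitary

print(np.allclose(P.conj().T @ P, np.eye(4)))  # P is in U(4)
print(np.allclose(D, np.diag(lam)))            # real and diagonal
\end{verbatim}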


Theorem and Theorem are clearly very similar. Indeed, in the context of an Hermitian inner product space they can be ‘unified’ through the notion of a normal linear operator \(T\), that is, one which commutes with its own adjoint, \(TT^\dagger=T^\dagger T\). Self-adjoint operators and unitary operators are clearly both examples of normal operators. Now, for a normal operator \(T\), we have \((Tv,Tv)=(v,T^\dagger Tv)=(v,TT^\dagger v)=(T^\dagger v,T^\dagger v)\). Also, if \(T\) is normal then so is \(T-\lambda\id_V\), whose adjoint is \((T-\lambda\id_V)^\dagger=T^\dagger-\lambda^*\id_V\), for any \(\lambda\in\CC\). So for any normal operator \(T\), if \(\lambda\) is an eigenvalue with eigenvector \(v\), \(Tv=\lambda v\), then \(\lambda^*\) is an eigenvalue of \(T^\dagger\) with eigenvector \(v\), \(T^\dagger v=\lambda^* v\).
Then, if \(Tv_1=\lambda_1v_1\) and \(Tv_2=\lambda_2v_2\) with \(\lambda_1\neq\lambda_2\), \(\lambda_1^*(v_1,v_2)=(Tv_1,v_2)=(v_1,T^\dagger v_2)=(v_1,v_2)\lambda_2^*\), so \((v_1,v_2)=0\). Finally, if \(v_1\) is an eigenvector of a normal operator \(T\) with eigenvalue \(\lambda_1\), then \(\Span(v_1)^\perp\) is \(T\)-invariant since for every \(w\in\Span(v_1)^\perp\) we have \((Tw,v_1)=(w,T^\dagger v_1)=(w,v_1)\lambda_1^*=0\). So we have,

Theorem \(T\) is a normal operator on an Hermitian inner product space with positive definite inner product if and only if,

  1. Eigenvectors with distinct eigenvalues are orthogonal.
  2. There exists an orthonormal basis of eigenvectors of \(T\). In particular \(T\) is diagonalisable.

Thus, in the case of Hermitian inner product spaces, we have a generalisation of the spectral decomposition result for self-adjoint operators. Any normal operator \(T\) has a spectral decomposition,
\begin{equation}
T=\sum_i\lambda_iP_i,
\end{equation}
where as before, the orthogonal projectors, \(P_i\), correspond to the eigenspaces, \(V_i\), of the (distinct) eigenvalues \(\lambda_i\) of \(T\) in the orthogonal decomposition \(V=\oplus_iV_i\).

Gram-Schmidt Orthogonalisation

In the cases of real orthogonal or complex Hermitian inner product spaces with a positive definite inner product there are no null vectors so we can start from any existing basis, \(\{f_i\}\) say, and systematically construct an orthogonal basis, \(\{e_i\}\), as follows. We begin by setting \(e_1=f_1\). We set \(e_2\) to be \(f_2\) with any component parallel to \(f_1\) removed, that is,
\begin{equation*}
e_2=f_2-\pi_{e_1}f_2,
\end{equation*}
where we have introduced the operator
\begin{equation*}
\pi_uv=\frac{(u,v)}{(u,u)}u.
\end{equation*}
Likewise \(e_3\) is just \(f_3\) with its components in the \(e_1\) and \(e_2\) directions removed and so on with the general vector \(e_j\) given by
\begin{equation}
e_j=f_j-\sum_{i=1}^{j-1}\pi_{e_i}f_j.
\end{equation}
Given the orthogonal basis \(\{e_i\}\) we can then normalise each vector to obtain an orthonormal basis. This procedure, for constructing an orthogonal basis from any given basis in an inner product space, is known as Gram-Schmidt orthogonalisation.
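The procedure transcribes directly into code. Here is a minimal sketch over \(\CC^4\), assuming the standard positive definite Hermitian inner product (np.vdot, conjugate-linear in its first argument); the test vectors are arbitrary.
\begin{verbatim}
import numpy as np

def gram_schmidt(F):
    """F: list of linearly independent vectors; returns an orthogonal list."""
    E = []
    for f in F:
        # subtract the projections pi_e f onto the vectors found so far
        e = f - sum((np.vdot(u, f) / np.vdot(u, u)) * u for u in E)
        E.append(e)
    return E

rng = np.random.default_rng(6)
F = [rng.standard_normal(4) + 1j * rng.standard_normal(4) for _ in range(4)]
E = gram_schmidt(F)
# all pairwise inner products of distinct vectors vanish
print(all(np.isclose(np.vdot(E[i], E[j]), 0)
          for i in range(4) for j in range(4) if i != j))
\end{verbatim}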

We can also view the construction ‘in reverse’ as follows. Given the assumption of a positive (negative) definite inner product, we know that not only is the \(n\times n\) matrix, \(\mathbf{G}\), of the inner product with respect to the given basis, \(\{f_i\}\), invertible, but every \(k\times k\) submatrix with elements \(G_{ij}\), \(1\leq i,j\leq k\), is also invertible. Indeed, if it weren’t, and there were numbers \(x^i\) not all zero such that \(\sum_{j=1}^kG_{ij}x^j=0\) for \(i=1,\dots,k\), then the vector \(\sum_{i=1}^kx^if_i\) would be a non-zero null vector, since \((\sum_{i=1}^kx^if_i,\sum_{j=1}^kx^jf_j)=\sum_{i,j=1}^k{x^i}^*x^j(f_i,f_j)=\sum_{i,j=1}^k{x^i}^*x^jG_{ij}=0\). Now define \(e_n=f_n-\sum_{i,j=1}^{n-1}G_{ij}^{-1}(f_j,f_n)f_i\). It is clearly orthogonal to all \(f_i\), \(1\leq i\leq n-1\). \(e_{n-1}\) is then defined as \(e_{n-1}=f_{n-1}-\sum_{i,j=1}^{n-2}G_{ij}^{-1}(f_j,f_{n-1})f_i\), and is clearly orthogonal to all \(f_i\), \(1\leq i\leq n-2\), and to \(e_n\). Continuing in this way we arrive at the desired orthogonal basis.

Now, we know that if \(U\) is a subspace of \(V\) on which the restriction of the inner product is non-degenerate then we can write, \(V=U\oplus U^\perp\), which specifies the orthogonal projection onto \(U\), \(P:V\mapto U\). In fact, \(Pv\) is the vector in \(U\) closest to \(v\) in the sense that \(\norm{v-Pv}\leq\norm{v-u}\) for all \(u\in U\) with equality if and only if \(u=Pv\). To see this, observe first that for any \(v\in V\) and \(u\in U\), we have \(v-Pv\in U^\perp\) and \(Pv-u\in U\), so that \((v-Pv,Pv-u)=0\), and therefore,
\begin{align*}
\norm{v-u}^2&=\norm{v-Pv+Pv-u}^2\\
&=(v-Pv+Pv-u,v-Pv+Pv-u)\\
&=(v-Pv,v-Pv)+(Pv-u,Pv-u)+2\Real(v-Pv,Pv-u)\\
&=\norm{v-Pv}^2+\norm{Pv-u}^2,
\end{align*}
from which it follows that \(\norm{v-Pv}\leq\norm{v-u}\) with equality if and only if \(u=Pv\).

In the context of the Gram-Schmidt procedure, if \(U=\Span(e_1,\dots,e_k)\), then notice that the orthogonal projector \(P\) is just \(P=\sum_{i=1}^k\pi_{e_i}\), so geometrically, the inductive step of the Gram-Schmidt procedure is to express the next of the original basis vectors, \(f_{k+1}\), as the sum of the vector in \(\Span(e_1,\dots,e_k)\) closest to \(f_{k+1}\) with an element \(e_{k+1}\) of \(\Span(e_1,\dots,e_k)^\perp\).
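The best-approximation property is easy to check numerically; in the sketch below the subspace \(U\), its orthonormal basis and the sampled vectors are all arbitrary choices.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(7)
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # orthonormal basis of U
P = Q @ Q.T                                        # orthogonal projector onto U

v = rng.standard_normal(5)
# ||v - Pv|| is no larger than the distance to 1000 random points of U
dists = [np.linalg.norm(v - Q @ rng.standard_normal(2)) for _ in range(1000)]
print(np.linalg.norm(v - P @ v) <= min(dists))     # True
\end{verbatim}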

It’s worth mentioning that any set of orthonormal vectors in a real orthogonal or complex Hermitian inner product space with positive definite inner product, may be extended to an orthonormal basis for \(V\) since they can be extended to a basis of \(V\) and Gram-Schmidt employed to orthogonalise the extension.

Norms

Unless otherwise stated, an inner product space will be here either real orthogonal or complex Hermitian. Thus, we can assume that the inner product of a vector, \(v\), with itself, \((v,v)\), is real. In the case that the space \(V\) has a positive definite inner product, then \((v,v)>0\) for any non-zero \(v\) and we can define the length or norm of a vector \(v\) to be
\begin{equation}
\norm{v}=\sqrt{(v,v)}.
\end{equation}
This is a genuine norm, since, as defined, \(\norm{av}=\abs{a}\norm{v}\), \(\norm{v}=0\) implies \(v=0\) and, as we’ll see shortly, \(\norm{v+w}\leq\norm{v}+\norm{w}\).
If, on the other hand, \(V\) has only a non-degenerate inner product, there can exist non-zero vectors, \(v\), with \((v,v)\leq0\). In this case, we could define
\begin{equation}
\norm{v}=\sqrt{|(v,v)|},
\end{equation}
but should note that this is, of course, no longer properly called a norm.

In the positive definite case, we have the important Cauchy-Schwarz inequality.

Theorem (Cauchy-Schwarz inequality) In a positive definite inner product space, \(V\), for any vectors \(v,w\in V\),\begin{equation}
\abs{(v,w)}\leq\norm{v}\norm{w},\label{equ:Cauchy-Schwarz}
\end{equation}
with equality if and only if \(v\) and \(w\) are linearly dependent.

Proof The statement is trivially true when \((v,w)=0\). Assuming \((v,w)\neq0\), so that \(v\neq0\) and \(w\neq0\), for any complex number \(a\) we consider the inner product of the vector \(v-aw\) with itself. Then,\begin{equation*}
(v-aw,v-aw)=\norm{v}^2-a(v,w)-a^*(w,v)+\abs{a}^2\norm{w}^2\geq0,
\end{equation*}
with equality if and only if \(v=aw\), so that choosing
\begin{equation*}
a=\frac{\norm{v}^2}{(v,w)}
\end{equation*}
we have
\begin{equation*}
\norm{v}^2-2\norm{v}^2+\frac{\norm{v}^4\norm{w}^2}{\abs{(v,w)}^2}\geq0
\end{equation*}
so that
\begin{equation*}
\abs{(v,w)}^2\leq\norm{v}^2\norm{w}^2
\end{equation*}
from which the result follows after taking square roots.\(\blacksquare\)

This means that we can define the angle, \(\theta\), \(0\leq\theta\leq\pi/2\), between two non-zero vectors, \(v\) and \(w\), in a positive definite real orthogonal or complex Hermitian space through,
\begin{equation}
\cos\theta=\frac{\abs{(v,w)}}{{\norm{v}\norm{w}}}.
\end{equation}
The triangle inequality,
\begin{equation}
\norm{v+w}\leq\norm{v}+\norm{w},\label{triangle-standard}
\end{equation}
follows by considering the square of the left hand side and using the Cauchy-Schwarz inequality,
\begin{align*}
\norm{v+w}^2&=\norm{v}^2+\norm{w}^2+2\Real(v,w)\\
&\leq\norm{v}^2+\norm{w}^2+2\abs{(v,w)}\\
&\leq\norm{v}^2+\norm{w}^2+2\norm{v}\norm{w}\\
&=\left(\norm{v}+\norm{w}\right)^2,
\end{align*}
then taking the square root.

Similarly, a couple of variants of the triangle inequality can be obtained by considering the square of the difference of norms,
\begin{align*}
\left(\norm{v}-\norm{w}\right)^2&=\norm{v}^2+\norm{w}^2-2\norm{v}\norm{w}\\
&\leq\norm{v}^2+\norm{w}^2-2\abs{(v,w)}\\
&\leq\norm{v}^2+\norm{w}^2-2\abs{\Real(v,w)}\\
&\leq\norm{v}^2+\norm{w}^2-2\Real(v,w)\\
&=\norm{v-w}^2
\end{align*}
so that taking square roots we obtain
\begin{equation}
\abs{\norm{v}-\norm{w}}\leq\norm{v-w},\label{triangle-variant1}
\end{equation}
or alternatively
\begin{align*}
\left(\norm{v}-\norm{w}\right)^2&=\norm{v}^2+\norm{w}^2-2\norm{v}\norm{w}\\
&\leq\norm{v}^2+\norm{w}^2-2\abs{(v,w)}\\
&\leq\norm{v}^2+\norm{w}^2-2\abs{\Real(v,w)}\\
&\leq\norm{v}^2+\norm{w}^2+2\Real(v,w)\\
&=\norm{v+w}^2
\end{align*}
so that taking square roots we obtain
\begin{equation}
\abs{\norm{v}-\norm{w}}\leq\norm{v+w}.\label{triangle-variant2}
\end{equation}
A further simple consequence of the definition of the norm in terms of the positive definite inner product is the parallelogram identity,
\begin{equation}
\norm{v+w}^2+\norm{v-w}^2=2\left(\norm{v}^2+\norm{w}^2\right),\label{equ:parallelogram}
\end{equation}
which in \(\RR^2\) expresses the fact that the sum of the squared lengths of the diagonals of a parallelogram is equal to twice the sum of the squared lengths of the sides.

We have seen that specifying a positive definite symmetric or Hermitian inner product, \((\cdot,\cdot)\), on a vector space, \(V\), over respectively \(\RR\) or \(\CC\) implies the existence of a norm \(\norm{\cdot}\) on \(V\). That is, real orthogonal and complex Hermitian spaces in which the inner product is positive definite are normed vector spaces. In the other direction, given a normed vector space, \(V\), over \(\RR\), in which the norm, \(\norm{\cdot}\), satisfies the parallelogram identity, \eqref{equ:parallelogram}, we can define a positive definite symmetric inner product as
\begin{equation}
(u,v)=\frac{1}{4}\left(\norm{u+v}^2-\norm{u-v}^2\right).\label{equ:norm inner product}
\end{equation}
Notice that this definition ensures that \(\norm{v}=\sqrt{(v,v)}\). Also, since by the parallelogram identity,
\begin{equation*}
\frac{1}{4}\left(\norm{u+v}^2-\norm{u-v}^2\right)=\frac{1}{2}\left(\norm{u+v}^2-\norm{u}^2-\norm{v}^2\right),
\end{equation*}
and
\begin{equation*}
\frac{1}{4}\left(\norm{u-v}^2-\norm{u+v}^2\right)=\frac{1}{2}\left(\norm{u-v}^2-\norm{u}^2-\norm{v}^2\right),
\end{equation*}
applying the triangle inequality, which the norm satisfies by definition, then leads to the Cauchy-Schwarz inequality. To confirm that \eqref{equ:norm inner product} defines a genuine inner product on \(V\) we must check \eqref{inprod-linear}. Using the parallelogram identity we have,
\begin{align*}
\norm{u+v+w}^2&=2(\norm{u+v}^2+\norm{w}^2)-\norm{u+v-w}^2\\
&=2(\norm{u+v}^2+\norm{w}^2)-(2(\norm{u-w}^2+\norm{v}^2)-\norm{u-v-w}^2)\\
&=2\norm{u+v}^2-2\norm{u-w}^2+2(\norm{w}^2-\norm{v}^2)+\norm{u-v-w}^2,
\end{align*}
and
\begin{align*}
\norm{u-v-w}^2&=2(\norm{u-v}^2+\norm{w}^2)-\norm{u-v+w}^2\\
&=2(\norm{u-v}^2+\norm{w}^2)-(2(\norm{u+w}^2+\norm{v}^2)-\norm{u+v+w}^2)\\
&=2\norm{u-v}^2-2\norm{u+w}^2+2(\norm{w}^2-\norm{v}^2)+\norm{u+v+w}^2.
\end{align*}
That is,
\begin{equation*}
\norm{u+v+w}^2-\norm{u-v-w}^2=\norm{u+v}^2-\norm{u-v}^2+\norm{u+w}^2-\norm{u-w}^2
\end{equation*}
so \((u,v+w)=(u,v)+(u,w)\). Next, observe that the definition makes it clear that, \((u,-v)=-(u,v)\), so that,
\((u,nv)=n(u,v)\), for any integer \(n\). It is then easy to see that we must have, \((u,qv)=q(u,v)\), for any \(q\in\QQ\), so that for any \(a\in\RR\) and \(q\in\QQ\),
\begin{align*}
\abs{(u,av)-a(u,v)}&=\abs{(u,(a-q)v)-(a-q)(u,v)}\\
&\leq\abs{(u,(a-q)v)}+\abs{(a-q)}\abs{(u,v)}\\
&\leq2\abs{a-q}\norm{u}\norm{v},
\end{align*}
by Cauchy-Schwarz, and we have that \((u,av)=a(u,v)\) for any \(a\in\RR\).

In the case of a normed vector space, \(V\), over \(\CC\), in which the norm once again satisfies the parallelogram identity, we can define a positive definite Hermitian inner product (conjugate-linear in its first argument, as throughout) as
\begin{equation}
(u,v)=\frac{1}{4}\left(\norm{u+v}^2-\norm{u-v}^2-i\norm{u+iv}^2+i\norm{u-iv}^2\right).
\end{equation}
The proof that this defines a genuine inner product on \(V\) proceeds as for the real case, the only difference being the real and complex parts are treated separately.
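With the convention fixed as above, the identity is easily checked numerically; the vectors are arbitrary and np.vdot, conjugate-linear in its first argument, supplies the reference inner product.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(8)
u = rng.standard_normal(4) + 1j * rng.standard_normal(4)
v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
n = np.linalg.norm

# polarisation: recover (u, v) from norms alone
ip = 0.25 * (n(u + v)**2 - n(u - v)**2
             - 1j * n(u + 1j * v)**2 + 1j * n(u - 1j * v)**2)
print(np.isclose(ip, np.vdot(u, v)))  # True
\end{verbatim}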

Classification of Inner Product Spaces

Definition Two vectors \(v,w\in V\) are said to be orthogonal if \((v,w)=0\). If \(U\) is a subspace of \(V\) then the orthogonal complement of \(U\), denoted \(U^\perp\), is defined as
\begin{equation}
U^\perp=\{v\in V\mid(u,v)=0\;\forall u\in U\}.
\end{equation}

It is clear that \(U^\perp\) is also a subspace of \(V\) and that if the restriction of the inner product to \(U\) is non-degenerate, then \(U\cap U^\perp=\{0\}\). In fact, in this case we have the following result.

Theorem If \(U\) is a subspace of an inner product space \(V\) such that the restriction of the inner product to \(U\) is non-degenerate, then \(V=U\oplus U^\perp\).

Proof Since we already know that \(U\cap U^\perp=\{0\}\), we need only demonstrate that any \(v\in V\) can be written as \(v=u+v'\) with \(u\in U\) and \(v'\in U^\perp\). To this end suppose \(e_1,\dots,e_r\) is a basis of \(U\). Then we must find numbers \(c^i\) and some \(v'\in U^\perp\) such that
\begin{equation}
v=c^1e_1+\dots+c^re_r+v'.\label{equ:orthog comp intermediate}
\end{equation}
Now define the matrix \(\mathbf{M}\) through the matrix elements, \(M_{ij}=(e_i,e_j)\). Then taking successive inner products of \eqref{equ:orthog comp intermediate} with the basis elements of \(U\), we get the system of equations,
\begin{eqnarray*}
(e_1,v)&=&M_{11}c^1+\dots+M_{1r}c^r\\
\vdots\quad&\vdots&\qquad\quad\vdots\\
(e_r,v)&=&M_{r1}c^1+\dots+M_{rr}c^r,
\end{eqnarray*}
and since the restriction of the inner product to \(U\) is non-degenerate, \(\mathbf{M}\) is non-singular. Thus there is a unique solution for the \(c^i\), so any \(v\in V\) can be expressed in the form, \eqref{equ:orthog comp intermediate}, and the result follows.\(\blacksquare\)
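The proof’s computation amounts to solving a linear system in the coefficients \(c^i\); here is a minimal real sketch in which the (non-orthogonal) basis of \(U\) is an arbitrary choice.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(9)
E = rng.standard_normal((5, 3))          # columns e_1, e_2, e_3: a basis of U
v = rng.standard_normal(5)

M = E.T @ E                              # M_ij = (e_i, e_j), non-singular
c = np.linalg.solve(M, E.T @ v)          # right-hand side: the (e_i, v)
v_perp = v - E @ c                       # v' = v - sum_i c^i e_i

print(np.allclose(E.T @ v_perp, 0))      # v' is orthogonal to every e_i
\end{verbatim}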

Remark Recall that a direct sum decomposition, \(V=U_1\oplus U_2\), determines projectors, \(P_1,P_2\in\mathcal{L}(V)\), \(P_i^2=P_i\), such that, \(P_1+P_2=\id_V\) and \(P_iP_j=0\) when \(i\neq j\), and, \(\img P_i=U_i\), \(\ker P_1=U_2\) and \(\ker P_2=U_1\). In the context of inner product spaces orthogonal projections are natural. These are the projections corresponding to an orthogonal direct sum decomposition, such as \(V=U\oplus U^\perp\), that is, projections whose image and kernel are orthogonal.

A non-zero vector \(v\) of an inner product space \(V\) is said to be a null vector if \((v,v)=0\). All vectors are null in symplectic geometries. In the case of orthogonal or Hermitian geometries, aside from the trivial case of a zero inner product, not all vectors can be null. Indeed, suppose on the contrary that every \(v\in V\) were such that \((v,v)=0\). Then for every pair of vectors \(u,v\in V\) we have, in the case of a symmetric inner product,
\begin{equation*}
0=(u+v,u+v)=(u,u)+(v,v)+2(u,v)=2(u,v),
\end{equation*}
so \((u,v)=0\) implying the inner product is zero. In the case of an Hermitian inner product,
\begin{equation*}
0=(u+v,u+v)=(u,u)+(v,v)+2\Real(u,v)=2\Real(u,v),
\end{equation*}
and
\begin{equation*}
0=(u+iv,u+iv)=(u,u)+(v,v)+2i\Imag(u,v)=2i\Imag(u,v),
\end{equation*}
so also in this case \((u,v)=0\), contradicting our assumption that the inner product is non-zero.

Theorem Any finite dimensional inner product space, \(V\), over \(\RR\) or \(\CC\), can be decomposed into a direct sum \(V=V_1\oplus\dots\oplus V_r\) where the subspaces \(V_i\) are pairwise orthogonal. In the case of symmetric or Hermitian inner products they are \(1\)-dimensional. In the case of an anti-symmetric inner product the \(V_i\) may be either \(1\)-dimensional, in which case the restriction of the inner product to \(V_i\) is degenerate, or \(2\)-dimensional, in which case the restriction is non-degenerate.

Proof The proof is by induction on the dimension of \(V\). The case of \(\dim V=1\) is trivial so consider \(\dim V\geq 2\). We assume the inner product is not zero, since in this trivial case there is nothing to prove. In the case of symmetric or Hermitian inner products, as already observed, we can choose a non-null vector \(u\) from \(V\). If \(U=\Span(u)\) then the restriction of the inner product of \(V\) to \(U\) is certainly non-degenerate so we have \(V=U\oplus U^\perp\) by the previous result. Thus, if \(V\) is \(n\)-dimensional then \(U^\perp\) is \((n-1)\)-dimensional, and by induction we can therefore assume that \(U^\perp\) has the desired decomposition and the result follows. In the case of an anti-symmetric inner product, there must exist two vectors, \(v_1\) and \(v_2\) say, such that \((v_1,v_2)\neq0\). Calling the subspace spanned by these vectors \(U\), the restriction of the inner product to \(U\) is non-degenerate and the result follows as before.\(\blacksquare\)

Given this orthogonal decomposition we can use what we already know about the classification of low dimensional inner product spaces to complete the classification in general. To that end, we consider two \(n\)-dimensional vector spaces, \(V\) and \(V'\), with inner products, \((\cdot,\cdot)\) and \((\cdot,\cdot)'\), and orthogonal decompositions \(\oplus_{i=1}^rV_i\) and \(\oplus_{i=1}^{r'}{V_i}'\) respectively.

Define the subspace \(V_0=\oplus_{i=1}^{r_0}V_i\), the sum of the degenerate subspaces of the orthogonal decomposition (ordering the \(V_i\) so that these come first), and \(V^\perp=\{v\in V\mid(v',v)=0\;\forall v'\in V\}\), sometimes called the radical of the inner product \((\cdot,\cdot)\). Clearly \(V_0\subseteq V^\perp\) and conversely, by virtue of the decomposition, we know that any \(v\in V^\perp\) can be written uniquely as a sum \(v=\sum_{i=1}^rv_i\) with each \(v_i\in V_i\). Suppose it were the case that \(v_k\neq0\) for some \(k>r_0\). Then in the case of symmetric or Hermitian inner products we’d have \((v_k,v_k)\neq0\), and in the anti-symmetric case there’d be some vector \({v'}_k\in V_k\) such that \(({v'}_k,v_k)\neq0\), so in either case we contradict \(v\in V^\perp\) and conclude that \(V_0=V^\perp\). But we also have that if \(\{e_i\}\) is a basis of \(V\) in which the Gram matrix is \(\mathbf{G}\), then \(V^\perp\) consists of those \(v=v^ie_i\in V\) such that \((e_j,v)=0\) for each \(j=1,\dots,n\), that is, such that \(\sum_{i=1}^nG_{ji}v^i=0\) for each \(j=1,\dots,n\). But this is just the condition for \(v\in\ker L_\mathbf{G}\), so \(V^\perp=\ker L_\mathbf{G}\) and \(\dim V^\perp=n-\rank\mathbf{G}\). Thus, we may conclude that \(r_0=\dim V^\perp=n-\rank\mathbf{G}\). Moreover, since the Gram matrices of isometric inner product spaces have the same rank, if our two inner product spaces \(V\) and \(V'\) are isometric we may further conclude that \(r_0={r_0}'\), where \({r_0}'\) is the number of degenerate subspaces in the orthogonal decomposition of \(V'\).

Now let us consider the case of \(V\) and \(V'\) having anti-symmetric inner products. We know that aside from \(1\)-dimensional degenerate subspaces all the remaining subspaces in their respective orthogonal decompositions are \(2\)-dimensional non-degenerate, and that any two such spaces are isometric. Thus, if \(V\) and \(V'\) have \(r_0={r_0}'\) then they must both have \(r_0\) degenerate \(1\)-dimensional subspaces and \((n-r_0)/2\) non-degenerate \(2\)-dimensional subspaces in their respective decompositions, which we may order such that in both cases non-degenerate precede degenerate subspaces. Therefore they must be isometric, since we can construct an isometry \(f:V\mapto V'\) as the direct sum \(f=\oplus_{i=1}^rf_i\) of isometries \(f_i:V_i\mapto {V_i}'\), which we know must exist from the discussion in Low Dimensional Inner Product Spaces. Conversely, if \(V\) and \(V'\) are isometric we know that \(r_0={r_0}'\). Thus, we conclude that two vector spaces equipped with antisymmetric inner products are isometric if and only if the vector spaces and their respective radicals have the same dimensions. It should be clear that precisely the same statement can be made for two complex vector spaces with symmetric inner products.

In the orthogonal decompositions of real vector spaces equipped with symmetric, or complex vector spaces equipped with Hermitian, inner products, aside from the degenerate subspaces, there are in each case two possibilities for the remaining \(1\)-dimensional subspaces. They may be either positive or negative definite. Denote by \(r_+\) and \(r_-\) respectively the number of positive and negative definite subspaces of \(V\). If \(V_+\) and \(V_-\) are the respective direct sums of these subspaces then it is clear that \(V_+\) and \(V_-\) are respectively positive and negative definite, that \(r_+=\dim V_+\) and \(r_-=\dim V_-\), and that we can write \(V=V_+\oplus V_-\oplus V_0\). The triple \((r_+,r_-,r_0)\) is called the signature of the inner product. Define the same primed quantities, \({r_+}'\), \({V'}_+\), \({r_-}'\) and \({V'}_-\) for \(V'\), whose decomposition can then be written as \(V'={V'}_+\oplus {V'}_-\oplus {V'}_0\). Now suppose \(V\) and \(V'\) are isometric. We know that they must have the same dimension and that \(r_0={r_0}'\). If \(f:V\mapto V'\) is an isometry, then for any \(v\in V\), by virtue of the decomposition of \(V'\), we can write \(f(v)=f(v)_++f(v)_-+f(v)_0\) where \(f(v)_+\in {V'}_+\), \(f(v)_-\in {V'}_-\) and \(f(v)_0\in {V'}_0\). We consider the restriction \(f|_{V_+}\) of \(f\) to \(V_+\) and note that if \({P'}_+:V'\mapto V'\) is the projection operator onto the subspace \({V'}_+\), then \({P'}_+\circ f|_{V_+}:V_+\mapto {V'}_+\) is linear. Now suppose \(r_+>{r_+}'\). Then there must exist some non-zero \(v\in V_+\) such that \({P'}_+\circ f|_{V_+}(v)=0\), so that for this \(v\), \(f(v)_+=0\) and we have \(f(v)=f(v)_-+f(v)_0\). But notice that then \((v,v)=(f(v),f(v))'=(f(v)_-,f(v)_-)'\leq0\), contradicting the fact that \(v\in V_+\) is non-zero. Similarly, if \(r_+<{r_+}'\), then we must have \(r_->{r_-}'\), and we would again arrive at a contradiction by considering the restriction \(f|_{V_-}\). So we conclude that isometry of \(V\) and \(V'\) implies \(r_+={r_+}'\), \(r_-={r_-}'\) and \(r_0={r_0}'\), that is, they have the same signature. Conversely, if two vector spaces \(V\) and \(V'\) have the same signature, then their orthogonal decompositions can be appropriately ordered such that an isometry, \(f:V\mapto V'\), can be constructed as the direct sum, \(f=\oplus f_i\), of isometries, \(f_i:V_i\mapto {V_i}'\), which we know must exist from the discussion in Low Dimensional Inner Product Spaces. Thus, we conclude that two real vector spaces equipped with symmetric inner products or two complex vector spaces equipped with Hermitian inner products are isometric if and only if they share the same signature.

Let us summarise the above discussion in the following

Theorem Symplectic spaces and complex orthogonal spaces are characterised up to isometry by the pair of integers \((n,r_0)\) where \(n\) is the dimension of the space and \(r_0\) is the dimension of the radical of the inner product. Real orthogonal spaces and complex Hermitian spaces are characterised up to isometry by their signature, \((r_+,r_-,r_0)\), where \(r_+\) and \(r_-\) are the dimensions of the subspaces upon which the restriction of the inner product is respectively positive and negative definite.

Theorem and Theorem tell us that for orthogonal and Hermitian spaces we can always find an orthogonal basis. Such a basis is particularly useful when the inner product is non-degenerate. In this case, any vector \(v\) may be expressed as \(v=v^je_j\), but then \((e_i,v)=(e_i,v^je_j)=v^i(e_i,e_i)\) (no summation), so
\begin{equation}
v=\sum_{i=1}^n\frac{(e_i,v)}{(e_i,e_i)}e_i.
\end{equation}
A vector \(v\) is said to be normalised if \((v,v)=\pm1\), and a set of normalised vectors \(\{e_i\}\) is said to be orthonormal if they are orthogonal to one another, that is, \((e_i,e_j)=0\) whenever \(i\neq j\) and \((e_i,e_i)=\pm1\). So given any orthogonal basis of a non-degenerate inner product space we can always rescale to obtain an orthonormal basis. If the inner product is positive definite then, with respect to an orthonormal basis, any vector has the convenient decomposition,
\begin{equation}
v=\sum_{i=1}^n(e_i,v)e_i.
\end{equation}

It should be clear that for real orthogonal and complex Hermitian spaces we can always find a basis in which the Gram matrix has the form,
\begin{equation}
\mathbf{G}=\begin{pmatrix}
\mathbf{I}_{r_+}&\mathbf{0}&\mathbf{0}\\
\mathbf{0}&-\mathbf{I}_{r_-}&\mathbf{0}\\
\mathbf{0}&\mathbf{0}&\mathbf{0}
\end{pmatrix},
\end{equation}
and for complex orthogonal spaces, a basis in which the Gram matrix has the form,
\begin{equation}
\mathbf{G}=\begin{pmatrix}
\mathbf{I}_{n-r_0}&\mathbf{0}\\
\mathbf{0}&\mathbf{0}
\end{pmatrix}.
\end{equation}

Clearly there’s no such thing as an orthogonal basis for a symplectic space. However, Theorem and Theorem do make it clear that we can always choose a basis, \(\{e_1,f_1,\dots,e_{(n-r_0)/2},f_{(n-r_0)/2},e_{(n-r_0)/2+1},\dots,e_{(n+r_0)/2}\}\), such that, \((e_i,f_i)=-(f_i,e_i)=1\) for \(i=1,\dots,(n-r_0)/2\), are the only non-zero inner products of basis elements. Reordering, we obtain the symplectic basis,
\begin{equation}
\{e_1,\dots,e_{(n-r_0)/2},f_1,\dots,f_{(n-r_0)/2},e_{(n-r_0)/2+1},\dots,e_{(n+r_0)/2}\},
\end{equation}
in terms of which, the Gram matrix has the form,
\begin{equation}
\mathbf{G}=\begin{pmatrix}
\mathbf{0}&\mathbf{I}_{(n-r_0)/2}&\mathbf{0}\\
-\mathbf{I}_{(n-r_0)/2}&\mathbf{0}&\mathbf{0}\\
\mathbf{0}&\mathbf{0}&\mathbf{0}
\end{pmatrix}.
\end{equation}
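
The following is a minimal sketch of this Gram matrix in the non-degenerate case (taking \(n=4\) and \(r_0=0\), so the blocks are \(2\times2\)), together with a numerical check that the resulting form is antisymmetric; the vectors are randomly generated.

import numpy as np

# The symplectic Gram matrix J with blocks [[0, I], [-I, 0]] and a check
# that (v, w) = v^T J w is antisymmetric.
m = 2                                     # (n - r_0)/2
J = np.block([[np.zeros((m, m)), np.eye(m)],
              [-np.eye(m), np.zeros((m, m))]])

rng = np.random.default_rng(0)
v = rng.standard_normal(2 * m)
w = rng.standard_normal(2 * m)
print(np.isclose(v @ J @ w, -(w @ J @ v)))   # True: (v, w) = -(w, v)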

Hermitian, real orthogonal and real symplectic geometries arise quite naturally together, in the following way. Suppose we have a complex vector space \(V\) on which is defined the Hermitian inner product \((\cdot,\cdot):V\times V\mapto\CC\). We consider its realification, \(V_\RR\), on which we define two inner products, \(g(v,w)=\Real(v,w)\) and \(\omega(v,w)=\Imag(v,w)\). It is clear that \(g\) is symmetric and \(\omega\) is antisymmetric, and, since \(\omega(v,v)=0\) so that \((v,v)=g(v,v)\), that \(g\) is positive definite if and only if \((\cdot,\cdot)\) is positive definite. We also have the following relations,
\begin{eqnarray}
g(v,w)=g(iv,iw)=\omega(v,iw)=-\omega(iv,w)\label{orthsympl1}\\
\omega(v,w)=\omega(iv,iw)=-g(v,iw)=g(iv,w).\label{orthsympl2}
\end{eqnarray}
Conversely, if on \(V_\RR\), there are defined inner products, \(g\) and \(\omega\), respectively symmetric and antisymmetric, which satisfy the relations \eqref{orthsympl1} and \eqref{orthsympl2}, then the inner product on \(V\) defined as \((v,w)=g(v,w)+i\omega(v,w)\) is Hermitian.

Consider, in particular, \(\CC^n\) with the standard (orthonormal) basis \(\{\mathbf{e}_i\}\) and Hermitian inner product,
\begin{equation*}
(\mathbf{v},\mathbf{w})=\sum_{i=1}^n{v^i}^*w^i.
\end{equation*}
Its realification is \(\RR^{2n}\) with basis \(\{\mathbf{e}_1,\dots,\mathbf{e}_n,i\mathbf{e}_1,\dots,i\mathbf{e}_n\}\), which is orthonormal with respect to \(g\) and symplectic with respect to \(\omega\).
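
The relations \eqref{orthsympl1} and \eqref{orthsympl2} are likewise easy to verify numerically; a minimal sketch, with randomly chosen vectors in \(\CC^3\):

import numpy as np

# g and omega as the real and imaginary parts of the Hermitian inner
# product on C^n; np.vdot conjugates its first argument.
rng = np.random.default_rng(1)
n = 3
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
w = rng.standard_normal(n) + 1j * rng.standard_normal(n)

g = lambda a, b: np.vdot(a, b).real       # g(v, w) = Re(v, w)
omega = lambda a, b: np.vdot(a, b).imag   # omega(v, w) = Im(v, w)

print(np.isclose(g(v, w), g(1j * v, 1j * w)))    # g(v, w) = g(iv, iw)
print(np.isclose(g(v, w), omega(v, 1j * w)))     # g(v, w) = omega(v, iw)
print(np.isclose(omega(v, w), -g(v, 1j * w)))    # omega(v, w) = -g(v, iw)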

The Hermitian inner product, or more precisely its absolute value, measures the extent to which two vectors are parallel, or linearly dependent, over \(\CC\), while \(g\) measures this over \(\RR\). Thus, \(\omega\) measures the extent to which the linear dependence of two vectors is due to extending the base field from \(\RR\) to \(\CC\).

The upshot of this strong relationship, particularly between complex Hermitian and real orthogonal inner products, is that we can develop their theory largely in parallel.

Low Dimensional Inner Product Spaces

We consider the different flavours of inner product in turn.

Antisymmetric Clearly, for a \(1\)-dimensional vector space the only possible antisymmetric inner product is the zero inner product. In the \(2\)-dimensional case, suppose first that the antisymmetric inner product is degenerate. Then there exists, in our \(2\)-dimensional space \(V\), a non-zero vector \(v\) such that \((u,v)=0\) \(\forall u\in V\), and we can extend \(v\) to a basis \(\{v,v’\}\) of \(V\). Now consider the inner product of two arbitrary elements, \(av+bv’\) and \(cv+dv’\), where \(a\), \(b\), \(c\) and \(d\) are elements of the base field (\(\RR\) or \(\CC\)). We have
\begin{equation*}
(av+bv’,cv+dv’)=ac(v,v)+ad(v,v’)+bc(v’,v)+bd(v’,v’)=0,
\end{equation*}
so the only degenerate antisymmetric inner product on a \(2\)-dimensional vector space is the zero inner product. So consider the non-degenerate case. There must exist two vectors, \(v_1\) and \(v_2\) say, such that \((v_1,v_2)\neq0\). In particular this means they are linearly independent, and so we can take them to be a basis of \(V\). If \((v_1,v_2)=a\), then the map \(f:V\mapto K^2\) defined by \(f(v_1)=a\mathbf{e}_1\), \(f(v_2)=\mathbf{e}_2\) is clearly an isometry between \(V\) and the symplectic space \(K^2\) of the Example in Definitions and Examples. That is, any symplectic geometry on a 2-dimensional vector space is isometric either to the trivial, zero case, or to that of the same Example.

Symmetric Consider first a \(1\)-dimensional vector space, \(V\), over \(\RR\). If \(v\in V\) is non-zero but \((v,v)=0\) then we have the trivial case. So suppose \((v,v)=a\) for some non-zero \(a\in\RR\). Either \(a>0\), in which case the inner product is positive definite and we have isometry with the inner product space \(\RR\) equipped with the inner product given by simple multiplication, \((x,y)=xy\) for \(x,y\in\RR\), or \(a<0\), in which case the inner product is negative definite and we have isometry with \(\RR\) equipped with the inner product \((x,y)=-xy\). As already observed, these two cases are not isometric. If, however, \(V\) is over \(\CC\), then any non-degenerate inner product space is simply isometric to \(\CC\) with the inner product \((x,y)=xy\).

Hermitian Any non-trivial \(1\)-dimensional Hermitian inner product space \(V\) must be such that \((v,v)\neq0\) for some \(v\in V\). In this case \((v,v)=a\) with \(a\) some non-zero real number. Similar to the real symmetric case we have two cases, positive or negative definite, each clearly isometric to \(\CC\) equipped respectively with the Hermitian inner product \((x,y)=x^*y\) or \((x,y)=-x^*y\).

Definitions and Examples

Definition An inner product space is a vector space \(V\) over \(K=\RR\) or \(\CC\) equipped with an inner product, \((\cdot,\cdot):V\times V\mapto K\), associating a number \((v,w)\in K\) to every pair of vectors \(v,w\in V\). For \(u,v,w\in V\) and \(a,b\in K\), this inner product must satisfy the linearity condition,
\begin{equation}
(u,av+bw)=a(u,v)+b(u,w),\label{inprod-linear}
\end{equation}
together with one of three possible symmetry properties,
\begin{equation}
(v,w)=(w,v),\label{inprod-symmetric}
\end{equation}
in which case the inner product is called symmetric and the space will be said to have an orthogonal geometry,
\begin{equation}
(v,w)=(w,v)^*,\label{inprod-hermitian}
\end{equation}
in which case the inner product is called Hermitian and the space will be said to have an Hermitian geometry, or
\begin{equation}
(v,w)=-(w,v),\label{inprod-symplectic}
\end{equation}
in which case the inner product is called antisymmetric and the space will be said to have a symplectic geometry.
If the inner product further satisfies the condition,
\begin{equation}
(v,v)\geq 0\text{ and }(v,v)=0 \iff v=0\label{inprod-posdef},
\end{equation}
then it is said to be positive definite, or, if it satisfies the weaker condition,
\begin{equation}
(v,w)=0\:\,\forall v\in V\implies w=0,\label{inprod-nonsing}
\end{equation}
then it is said to be non-singular or non-degenerate.

A series of remarks relating to this definition are in order.

Remark When \(K=\CC\) and the inner product is Hermitian, we have,
\begin{equation}
(au+bv,w)=a^*(u,w)+b^*(v,w).
\end{equation}
The inner product is then said to be antilinear, or conjugate-linear, in its first argument; a form which is linear in one argument and antilinear in the other is called sesquilinear. Notice also that \((v,v)\in\RR\), \(\forall v\in V\), since \((v,v)=(v,v)^*\).

Remark In all cases, aside from complex Hermitian, the inner product is bilinear.

Remark When \(K=\RR\), an Hermitian inner product is simply symmetric so when considering Hermitian geometries we only consider vector spaces over \(\CC\).

Remark A negative definite inner product is of course such that \((v,v)\leq0\) and \((v,v)=0\) if and only if \(v=0\).

Remark In a space with symplectic geometry we have \((v,v)=0\), \(\forall v\in V\). In particular, such a space can never be positive (or negative) definite, but may be non-degenerate.

Remark The three symmetry properties described in the definition ensure that \((v,w)=0\) if and only if \((w,v)=0\).

If \(\{e_i\}\) is a basis for \(V\), then we can define a matrix \(\mathbf{G}\) with components \(G_{ij}=(e_i,e_j)\). \(\mathbf{G}\) is called the Gram matrix or matrix of the inner product with respect to the basis \(\{e_i\}\). Symmetric, Hermitian and antisymmetric inner products then correspond respectively to Gram matrices of the same type (recall that \(\mathbf{G}\) is Hermitian if \(\mathbf{G}=\mathbf{G}^\dagger\), where \(\mathbf{G}^\dagger\) is the complex conjugate of the transpose).

If we change basis according to \(e’_i=P_i^je_j\), then \(G’_{ij}=(e’_i,e’_j)=(P_i^ke_k,P_j^le_l)=(P_i^k)^*G_{kl}P_j^l\). So when \(K=\RR\) we have \(\mathbf{G}’=\mathbf{P}^\mathsf{T}\mathbf{G}\mathbf{P}\), and for \(K=\CC\), \(\mathbf{G}’=\mathbf{P}^\dagger\mathbf{G}\mathbf{P}\). In either case the matrices \(\mathbf{G}’\) and \(\mathbf{G}\) are said to be congruent. Note that congruent matrices have the same rank (since \(\mathbf{P}\) is invertible), so it makes sense to define the rank of an inner product as the rank of its corresponding Gram matrix in some basis.
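
A minimal numerical sketch of this congruence rule for \(K=\CC\) (the Hermitian \(\mathbf{G}\) and the change-of-basis matrix \(\mathbf{P}\) are randomly generated; a random complex matrix is invertible with probability one):

import numpy as np

# G' = P^dagger G P preserves both Hermiticity and rank.
rng = np.random.default_rng(2)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
G = A + A.conj().T                        # a Hermitian Gram matrix
P = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

G_new = P.conj().T @ G @ P
print(np.allclose(G_new, G_new.conj().T))                         # Hermiticity preserved
print(np.linalg.matrix_rank(G_new) == np.linalg.matrix_rank(G))   # rank preserved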

Let us consider the property of non-degeneracy in terms of the Gram matrix. Given a basis \(\{e_i\}\) of \(V\), we can define a linear map \(L_\mathbf{G}\) in the usual way such that for any \(v=v^ie_i\in V\), \(L_\mathbf{G}v=\sum_{i,j=1}^nG_{ji}v^ie_j\). Then the rank of \(\mathbf{G}\) is just \(n-\dim\ker L_\mathbf{G}\). A vector \(v\in\ker L_\mathbf{G}\) if and only if \(\sum_iG_{ji}v^i=0\) for each \(j=1,\dots,n\). But notice that non-degeneracy is equivalent to the statement that \((e_j,v)=0\) for all \(j=1,\dots,n\) implies \(v=0\), that is, \(\sum_iG_{ji}v^i=0\) for each \(j=1,\dots,n\) implies \(v=0\). So we see that non-degeneracy is equivalent to \(\ker L_\mathbf{G}\) being trivial, that is, to the Gram matrix, \(\mathbf{G}\), being of full rank.
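
For instance (a minimal illustration, with an arbitrarily chosen Gram matrix):

import numpy as np

# A degenerate symmetric Gram matrix on R^3 whose radical is spanned by
# the third basis vector; its rank is 2 < 3.
G = np.diag([1.0, -1.0, 0.0])
print(np.linalg.matrix_rank(G))   # 2, so this inner product is degenerate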

A real symmetric inner product space, with a positive definite inner product, is also called a Euclidean vector space.

Example The standard example of a Euclidean vector space is \(\RR^n\) with the inner product of any pair of vectors \(\mathbf{v},\mathbf{w}\in\RR^n\) given by the usual dot or scalar product,
\begin{equation}
(\mathbf{v},\mathbf{w})=\mathbf{v}\cdot\mathbf{w}=\sum_{i=1}^nv^iw^i.
\end{equation}

A real symmetric inner product space with a non-singular inner product is called a pseudo-Euclidean space.

Example The Minkowski space, sometimes denoted \(\MM^4\), of special relativity is an important example of a pseudo-Euclidean space. It is \(\RR^4\) equipped with the inner product,
\begin{equation}
(v,w)=v^0w^0-\sum_{i=1}^3v^iw^i.
\end{equation}
Four-vectors of \(\MM^4\) are conventionally indexed from \(0\), with the \(0\)-component being the ‘time-like’ component and the others being the ‘space-like’ components.
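
A minimal sketch of the Minkowski inner product via its Gram matrix (the four-vectors below are arbitrary choices for illustration):

import numpy as np

# The Minkowski inner product (v, w) = v^0 w^0 - sum_i v^i w^i.
eta = np.diag([1.0, -1.0, -1.0, -1.0])
v = np.array([2.0, 1.0, 0.0, 0.0])
w = np.array([1.0, 1.0, 1.0, 1.0])
print(v @ eta @ w)                # 2*1 - (1 + 0 + 0) = 1.0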

Example \(\CC^n\) has a natural Hermitian geometry when equipped with the inner product defined on any pair of vectors \(v,w\in\CC^n\) by,
\begin{equation}
(v,w)=\sum_{i=1}^n{v^i}^*w^i.
\end{equation}

Example For a simple example of a symplectic geometry on \(K^2\), consider the inner product defined on the standard basis vectors by \((\mathbf{e}_1,\mathbf{e}_2)=1=-(\mathbf{e}_2,\mathbf{e}_1)\) so that for any \(\mathbf{v}\) and \(\mathbf{w}\) in \(K^2\) we have,
\begin{equation}
(\mathbf{v},\mathbf{w})=\det(\mathbf{v},\mathbf{w})=v^1w^2-v^2w^1,
\end{equation}
that is, in the case of \(K=\RR\), the signed area of the parallelogram spanned by \(\mathbf{v}\) and \(\mathbf{w}\).
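
A minimal sketch of this signed-area formula (the vectors are arbitrary choices for illustration):

import numpy as np

# (v, w) = det(v, w) = v^1 w^2 - v^2 w^1 for column vectors v, w in R^2.
v = np.array([1.0, 2.0])
w = np.array([3.0, 4.0])
print(np.linalg.det(np.column_stack([v, w])))   # 1*4 - 2*3 = -2.0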

We’ll see shortly that this symplectic geometry is in fact the only non-degenerate possibility for a 2-dimensional space, up to a notion of equivalence we now define.

Definition An isometry between inner product spaces \(V\) and \(W\) is a linear isomorphism, \(f:V\mapto W\), which preserves the values of the inner products. That is,
\begin{equation}
(u,v)_V=(f(u),f(v))_W,
\end{equation}
for all \(u,v\in V\), where \((\cdot,\cdot)_V\) and \((\cdot,\cdot)_W\) are the inner products on \(V\) and \(W\) respectively. If such an isometry exists, the inner product spaces are said to be isometric.

Remark Clearly, isometric inner product spaces have Gram matrices of the same rank.

Remark Isomorphic spaces equipped with the trivial, zero inner product, are trivially isometric.

Example On \(\RR\), multiplication defines a symmetric inner product, \((x,y)=xy\), which is clearly positive definite. We could also define a negative definite symmetric inner product as, \((x,y)=-xy\). These two inner product spaces cannot be isometric since any automorphism of \(\RR\) is of the form \(f(x)=ax\), \(a\in\RR\), \(a\neq0\), and for \(f\) to be an isometry we’d need, \(x^2=-a^2x^2\), for all \(x\in\RR\). Indeed, this shows us why, on \(\CC\), the inner product spaces with symmetric inner products \((x,y)=xy\) and \((x,y)=-xy\), \(x,y\in\CC\) are isometric – just consider the automorphism \(f(x)=ix\). Staying with \(\CC\), the inner product spaces with Hermitian inner products \((x,y)=x^*y\), the positive definite case, and \((x,y)=-x^*y\), the negative definite case, are not isometric since any automorphism of \(\CC\) is of the form \(f(x)=ax\), \(a\in\CC\), \(a\neq0\), and we’d need, \(\abs{x}^2=-\abs{a}^2\abs{x}^2\), for all \(x\in\CC\).

Finite dimensional vector spaces are classified up to isomorphism in terms of a single integer, \(n\), their dimension. In other words, two vector spaces are isomorphic if and only if they have the same dimension. Similarly, we would like to classify inner product spaces up to isometry. The dimension will clearly be one of the ‘labels’ of these equivalence classes since, in particular, isometric spaces are isomorphic. The question is, what other data, related to the inner product, is required to characterise isometric spaces? We’ll see that the key to answering this question lies in expressing a given inner product space as a direct sum of low dimensional ones – the type of inner product determines the structural data needed to characterise the decomposition. We begin, therefore, by considering, in detail, low dimensional inner product spaces up to isometry.