Monthly Archives: June 2016

Qubit mechanics I

Two-level systems, quantum mechanical systems whose state space is \(\CC^2\), are relatively simple yet still rich enough to exhibit most of the peculiarities of the quantum world. Moreover, they are physically important – we will consider nuclear magnetic resonance and the ammonia maser as examples.

Throughout we will assume, unless explicitly stated otherwise, that the column vectors and matrices representing state vectors and observables respectively are with respect to the standard basis of \(\CC^2\).

Properties of Pauli matrices

It’s straightforward to verify that the three Pauli matrices,
\begin{equation}
\boldsymbol{\sigma}_1=\begin{pmatrix}0&1\\1&0\end{pmatrix}\qquad\boldsymbol{\sigma}_2=\begin{pmatrix}0&-i\\i&0\end{pmatrix}\qquad\boldsymbol{\sigma}_3=\begin{pmatrix}1&0\\0&-1\end{pmatrix},
\end{equation}
each square to the identity,
\begin{equation}
\boldsymbol{\sigma}_i^2=\mathbf{I},\qquad i=1,2,3
\end{equation}
and that they are all traceless,
\begin{equation}
\tr\boldsymbol{\sigma}_i=0,\qquad i=1,2,3.
\end{equation}
From these two facts it follows that each Pauli matrix has two eigenvalues \(\pm1\). We can compute the commutators and find
\begin{equation}
[\boldsymbol{\sigma}_i,\boldsymbol{\sigma}_j]=2i\epsilon_{ijk}\boldsymbol{\sigma}_k.
\end{equation}
Likewise the anti-commutators are
\begin{equation}
\{\boldsymbol{\sigma}_i,\boldsymbol{\sigma}_j\}=2\delta_{ij}\mathbf{I},
\end{equation}
and since the product of any pair of operators is one-half the sum of the anti-commutator and the commutator we have
\begin{equation}
\boldsymbol{\sigma}_i\boldsymbol{\sigma}_j=\delta_{ij}\mathbf{I}+i\epsilon_{ijk}\boldsymbol{\sigma}_k.
\end{equation}
A simple consequence of this is the rather useful relation,
\begin{equation}
(\mathbf{u}\cdot\boldsymbol{\sigma})(\mathbf{v}\cdot\boldsymbol{\sigma})=(\mathbf{u}\cdot\mathbf{v})\mathbf{I}+i(\mathbf{u}\times\mathbf{v})\cdot\boldsymbol{\sigma}.
\end{equation}
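
These relations are easy to verify numerically. The following sketch (our own illustration in NumPy; the test vectors \(\mathbf{u}\) and \(\mathbf{v}\) are arbitrary choices) checks the square, trace and product identities in the standard basis:

```python
import numpy as np

# Pauli matrices in the standard basis
s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
I2 = np.eye(2)

# Levi-Civita symbol epsilon_{ijk}
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[j, i, k] = 1.0, -1.0

for i in range(3):
    assert np.allclose(s[i] @ s[i], I2)        # sigma_i^2 = I
    assert abs(np.trace(s[i])) < 1e-12         # tr sigma_i = 0
    for j in range(3):
        prod = (i == j) * I2 + 1j * sum(eps[i, j, k] * s[k] for k in range(3))
        assert np.allclose(s[i] @ s[j], prod)  # sigma_i sigma_j identity

# (u.sigma)(v.sigma) = (u.v) I + i (u x v).sigma for real u, v
u, v = np.array([0.3, -1.2, 0.7]), np.array([2.0, 0.5, -0.4])
dot = lambda w: sum(w[k] * s[k] for k in range(3))
assert np.allclose(dot(u) @ dot(v),
                   np.dot(u, v) * I2 + 1j * dot(np.cross(u, v)))
```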

Hermitian and Unitary operators on \(\CC^2\)

Any \(2\times2\) matrix \(\mathbf{M}\) representing a linear operator on \(\CC^2\) can be expressed as a linear combination of the identity matrix and the three Pauli matrices,
\begin{equation}
\mathbf{M}=m_0\mathbf{I}+\mathbf{m}\cdot\boldsymbol{\sigma}
\end{equation}
where \(m_0\) and the components \(m_1,m_2,m_3\) of the vector \(\mathbf{m}\) are complex numbers and \(\boldsymbol{\sigma}\) is the vector with components the Pauli matrices, \(\boldsymbol{\sigma}=(\boldsymbol{\sigma}_1,\boldsymbol{\sigma}_2,\boldsymbol{\sigma}_3)\). It follows that
\begin{equation}
m_0=\frac{1}{2}\tr\mathbf{M},\quad m_i=\frac{1}{2}\tr(\mathbf{M}\boldsymbol{\sigma}_i).
\end{equation}
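
The trace formulas can likewise be checked on a randomly generated matrix; here is a short NumPy sketch (our own illustration; the seed is arbitrary):

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]

rng = np.random.default_rng(0)
M = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))

# extract the coefficients from traces ...
m0 = 0.5 * np.trace(M)
m = [0.5 * np.trace(M @ si) for si in s]

# ... and rebuild M = m0 I + m . sigma
assert np.allclose(M, m0 * np.eye(2) + sum(mi * si for mi, si in zip(m, s)))
```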

The condition for a matrix \(\mathbf{Q}\) to be Hermitian is that \(\mathbf{Q}^\dagger=\mathbf{Q}\). Thus we must have
\begin{equation}
\mathbf{Q}^\dagger=q_0^*\mathbf{I}+\mathbf{q}^*\cdot\boldsymbol{\sigma}=q_0\mathbf{I}+\mathbf{q}\cdot\boldsymbol{\sigma}=\mathbf{Q},
\end{equation}
where \(\mathbf{q}^*\) indicates the vector whose components are the complex conjugate of the vector \(\mathbf{q}\). It follows immediately that any Hermitian operator, that is, any qubit observable \(Q\), can be represented by a matrix
\begin{equation}
\mathbf{Q}=q_0\mathbf{I}+\mathbf{q}\cdot\boldsymbol{\sigma}
\end{equation}
where \(q_0\) and the components of the vector \(\mathbf{q}\) are all real.

The condition for a matrix \(\mathbf{U}\) to be unitary is \(\mathbf{U}^\dagger\mathbf{U}=\mathbf{U}\mathbf{U}^\dagger=\mathbf{I}\).

Theorem Any qubit unitary transformation \(U\) can, up to a choice of phase, be represented by a matrix \(\mathbf{U}\) given by
\begin{equation}
\mathbf{U}=\exp{(-i\theta\mathbf{n}\cdot\boldsymbol{\sigma})}
\end{equation}
where \(\theta\) and \(\mathbf{n}\) can be interpreted respectively as an angle and a unit vector in 3-dimensional space.

Proof We begin with the general form
\begin{equation}
\mathbf{U}=u_0\mathbf{I}+\mathbf{u}\cdot\boldsymbol{\sigma}
\end{equation}
in which \(u_0\) and \(\mathbf{u}\) are an arbitrary complex number and complex valued vector respectively. We will impose the condition \(\mathbf{U}^\dagger\mathbf{U}=\mathbf{I}\) but observe that this condition leaves an overall choice of phase unconstrained. Using this flexibility we can take \(u_0\) to be real. Then
\begin{align*}
\mathbf{U}^\dagger\mathbf{U}&=(u_0\mathbf{I}+\mathbf{u}^*\cdot\boldsymbol{\sigma})(u_0\mathbf{I}+\mathbf{u}\cdot\boldsymbol{\sigma})\\
&=u_0^2\mathbf{I}+2u_0\Real\mathbf{u}\cdot\boldsymbol{\sigma}+(\mathbf{u}^*\cdot\boldsymbol{\sigma})(\mathbf{u}\cdot\boldsymbol{\sigma})\\
&=u_0^2\mathbf{I}+2u_0\Real\mathbf{u}\cdot\boldsymbol{\sigma}+\mathbf{u}\cdot\mathbf{u}^*\mathbf{I}-i(\mathbf{u}\times\mathbf{u}^*)\cdot\boldsymbol{\sigma}=\mathbf{I}.
\end{align*}
Similarly, we have,
\begin{equation*}
\mathbf{U}\mathbf{U}^\dagger=u_0^2\mathbf{I}+2u_0\Real\mathbf{u}\cdot\boldsymbol{\sigma}+\mathbf{u}\cdot\mathbf{u}^*\mathbf{I}+i(\mathbf{u}\times\mathbf{u}^*)\cdot\boldsymbol{\sigma}=\mathbf{I}.
\end{equation*}
This means that we must have
\begin{equation}
\mathbf{u}\times\mathbf{u}^*=0,\label{eq:condition one}
\end{equation}
\begin{equation}
u_0\Real\mathbf{u}=0\label{eq:condition two}
\end{equation}
and
\begin{equation}
u_0^2+\mathbf{u}^*\cdot\mathbf{u}=1.\label{eq:condition three}
\end{equation}
Equation \eqref{eq:condition one} implies that
\begin{equation}
(\Real\mathbf{u}+i\Imag\mathbf{u})\times(\Real\mathbf{u}-i\Imag\mathbf{u})=-2i\Real\mathbf{u}\times\Imag\mathbf{u}=0
\end{equation}
that is, \(\Real\mathbf{u}\parallel\Imag\mathbf{u}\) so that we must be able to write \(\mathbf{u}=\alpha\mathbf{v}\) for some complex number \(\alpha\) and real vector \(\mathbf{v}\). The second condition, \eqref{eq:condition two}, tells us that either \(u_0=0\) or \(\mathbf{u}\) is pure imaginary. So that together, \eqref{eq:condition one} and \eqref{eq:condition two} imply that either \(\mathbf{u}=i\mathbf{v}\) or \(u_0=0\) and \(\mathbf{u}=\alpha\mathbf{v}\). In the latter case \eqref{eq:condition three} then implies that \(|\alpha|^2|\mathbf{v}|^2=1\) so we can write
\begin{align*}
\mathbf{U}&=\alpha\mathbf{v}\cdot\boldsymbol{\sigma}\\
&=e^{i\phi}|\alpha||\mathbf{v}|\frac{\mathbf{v}}{|\mathbf{v}|}\cdot\boldsymbol{\sigma}
\end{align*}
which up to a choice of phase has the form
\begin{equation}
\mathbf{U}=i\mathbf{n}\cdot\boldsymbol{\sigma}
\end{equation}
for some real unit vector \(\mathbf{n}\).
In the case that \(u_0\neq0\), since
\begin{equation}
u_0^2+\mathbf{v}\cdot\mathbf{v}=1
\end{equation}
we can write \(u_0=\cos\theta\) and \(\mathbf{v}=-\sin\theta\mathbf{n}\) for some angle \(\theta\) and a (real) unit vector \(\mathbf{n}\), that is, in either case, we have that up to an overall phase,
\begin{equation}
\mathbf{U}=\cos\theta\mathbf{I}-i\sin\theta\mathbf{n}\cdot\boldsymbol{\sigma}.
\end{equation}
Finally, observing that \((\mathbf{n}\cdot\boldsymbol{\sigma})^2=\mathbf{I}\) the desired matrix exponential can be written as
\begin{align*}
\exp{(-i\theta\mathbf{n}\cdot\boldsymbol{\sigma})}&=\mathbf{I}-i\theta\mathbf{n}\cdot\boldsymbol{\sigma}+\frac{i^2\theta^2}{2!}(\mathbf{n}\cdot\boldsymbol{\sigma})^2-\frac{i^3\theta^3}{3!}(\mathbf{n}\cdot\boldsymbol{\sigma})^3+\dots\\
&=\left(1-\frac{\theta^2}{2!}+\frac{\theta^4}{4!}+\dots\right)\mathbf{I}-i\left(\theta-\frac{\theta^3}{3!}+\frac{\theta^5}{5!}+\dots\right)\mathbf{n}\cdot\boldsymbol{\sigma}\\
&=\cos\theta\mathbf{I}-i\sin\theta\mathbf{n}\cdot\boldsymbol{\sigma}
\end{align*}\(\blacksquare\)
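
As a numerical sanity check of the theorem (our own sketch; the axis and angle are arbitrary example values), we can exponentiate the Hermitian matrix \(\mathbf{n}\cdot\boldsymbol{\sigma}\) through its eigendecomposition and compare with the closed form:

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]

n = np.array([1.0, -2.0, 0.5])
n /= np.linalg.norm(n)                     # unit vector
theta = 0.9
nsig = sum(n[k] * s[k] for k in range(3))  # n . sigma (Hermitian)

# exponentiate via the eigendecomposition of n.sigma
w, V = np.linalg.eigh(nsig)
U_exp = (V * np.exp(-1j * theta * w)) @ V.conj().T

# closed form obtained from summing the series above
U_closed = np.cos(theta) * np.eye(2) - 1j * np.sin(theta) * nsig

assert np.allclose(U_exp, U_closed)
assert np.allclose(U_closed @ U_closed.conj().T, np.eye(2))  # unitary
```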

Unitary Operators on \(\CC^2\) and Rotations in \(\RR^3\)

The \(2\times2\) unitary matrices form a group under the usual matrix multiplication. This is the Lie group \(U(2)\). The \(2\times2\) unitary matrices with determinant 1 form the special unitary group \(SU(2)\). The previous theorem tells us that a general element of \(SU(2)\), which we’ll denote \(\mathbf{U}(\mathbf{n},\theta)\), can be written as
\begin{equation}
\mathbf{U}(\mathbf{n},\theta)=\exp{\left(-i\frac{\theta}{2}\mathbf{n}\cdot\boldsymbol{\sigma}\right)}.
\end{equation}
That is, the phase has been chosen such that \(\det\mathbf{U}(\mathbf{n},\theta)=1\).

The group of rotations in 3-dimensional space, denoted \(SO(3)\), consists of all \(3\times3\) orthogonal matrices with determinant 1. The reason for choosing the half-angle \(\theta\) is made clear in the following result.

Theorem There is a 2-to-1 group homomorphism from \(SU(2)\) to \(SO(3)\), \(\mathbf{U}(\mathbf{n},\theta)\mapsto\mathbf{R}(\mathbf{n},\theta)\) where \(\mathbf{R}(\mathbf{n},\theta)\) is the rotation through an angle \(\theta\) about the axis \(\mathbf{n}\).

Proof If we denote by \(H\) the set of traceless, Hermitian, \(2\times2\) matrices then it is not difficult to see that this is a real 3-dimensional vector space for which the Pauli matrices are a basis. The map \(f:\RR^3\mapto H\) given by \(f(\mathbf{v})=\mathbf{v}\cdot\boldsymbol{\sigma}\) is then an isomorphism of vector spaces. Defining an inner product on \(H\) according to
\begin{equation}
(\mathbf{M},\mathbf{N})_H=\frac{1}{2}\tr(\mathbf{M}\mathbf{N})
\end{equation}
this isomorphism becomes an isometry of vector spaces since,
\begin{equation*}
(f(\mathbf{v}),f(\mathbf{w}))_H=\frac{1}{2}\tr((\mathbf{v}\cdot\boldsymbol{\sigma})(\mathbf{w}\cdot\boldsymbol{\sigma}))=\mathbf{v}\cdot\mathbf{w}=(\mathbf{v},\mathbf{w})_{\RR^3},
\end{equation*}
that is, \(H\) and \(\RR^3\) are isometric. Now to any \(2\times2\) unitary matrix \(\mathbf{U}\) we can associate a linear operator \(T_\mathbf{U}\) on \(H\) such that \(T_\mathbf{U}\mathbf{M}=\mathbf{U}\mathbf{M}\mathbf{U}^\dagger\). This is clearly an isometry on \(H\) and so the corresponding linear operator on \(\RR^3\), \(f^{-1}\circ T_\mathbf{U}\circ f\), which we’ll denote \(\mathbf{R}_\mathbf{U}\), is such that
\begin{equation}
(\mathbf{R}_\mathbf{U}\mathbf{v})\cdot\boldsymbol{\sigma}=\mathbf{U}(\mathbf{v}\cdot\boldsymbol{\sigma})\mathbf{U}^\dagger\label{eq:rotation from unitary}
\end{equation}
for any \(\mathbf{v}\in\RR^3\) and must be an isometry, that is, an orthogonal operator. In fact, since
\begin{align*}
\tr\left((\mathbf{R}_\mathbf{U}\mathbf{e}_1\cdot\boldsymbol{\sigma})(\mathbf{R}_\mathbf{U}\mathbf{e}_2\cdot\boldsymbol{\sigma})
(\mathbf{R}_\mathbf{U}\mathbf{e}_3\cdot\boldsymbol{\sigma})\right)&=\tr\left((R_\mathbf{U})_1^i(R_\mathbf{U})_2^j(R_\mathbf{U})_3^k\boldsymbol{\sigma}_i\boldsymbol{\sigma}_j\boldsymbol{\sigma}_k\right)\\
&=2i\epsilon_{ijk}(R_\mathbf{U})_1^i(R_\mathbf{U})_2^j(R_\mathbf{U})_3^k\\
&=2i\det{\mathbf{R}_\mathbf{U}}
\end{align*}
and also
\begin{align*}
\tr\left((\mathbf{R}_\mathbf{U}\mathbf{e}_1\cdot\boldsymbol{\sigma})(\mathbf{R}_\mathbf{U}\mathbf{e}_2\cdot\boldsymbol{\sigma})
(\mathbf{R}_\mathbf{U}\mathbf{e}_3\cdot\boldsymbol{\sigma})\right)&=\tr\left(\mathbf{U}\boldsymbol{\sigma}_1\mathbf{U}^\dagger\mathbf{U}\boldsymbol{\sigma}_2\mathbf{U}^\dagger\mathbf{U}\boldsymbol{\sigma}_3\mathbf{U}^\dagger\right)\\
&=\tr(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2\boldsymbol{\sigma}_3)\\
&=2i
\end{align*}
we see that \(\mathbf{R}_\mathbf{U}\in SO(3)\). Also we observe that given two unitary matrices \(\mathbf{U}_1\) and \(\mathbf{U}_2\),
\begin{align*}
\mathbf{R}_{\mathbf{U}_1\mathbf{U}_2}\mathbf{v}\cdot\boldsymbol{\sigma}&=(\mathbf{U}_1\mathbf{U}_2)\mathbf{v}\cdot\boldsymbol{\sigma}(\mathbf{U}_1\mathbf{U}_2)^\dagger\\
&=\mathbf{U}_1\left(\mathbf{R}_{\mathbf{U}_2}\mathbf{v}\cdot\boldsymbol{\sigma}\right)\mathbf{U}_1^\dagger\\
&=(\mathbf{R}_{\mathbf{U}_1}\mathbf{R}_{\mathbf{U}_2}\mathbf{v})\cdot\boldsymbol{\sigma}
\end{align*}
So defining the map \(\Phi:SU(2)\mapto SO(3)\) such that
\begin{equation}
\Phi(\mathbf{U}(\mathbf{n},\theta))=\mathbf{R}_\mathbf{U},
\end{equation}
we have a group homomorphism. The kernel of this map consists of unitary matrices \(\mathbf{U}(\mathbf{n},\theta)\) such that
\begin{equation*}
\mathbf{U}(\mathbf{n},\theta)(\mathbf{v}\cdot\boldsymbol{\sigma})=(\mathbf{v}\cdot\boldsymbol{\sigma})\mathbf{U}(\mathbf{n},\theta)
\end{equation*}
for any vector \(\mathbf{v}\). It follows that \(\mathbf{U}(\mathbf{n},\theta)\) must be a multiple of the identity matrix and since \(\det\mathbf{U}(\mathbf{n},\theta)=1\) it can only be \(\pm\mathbf{I}\). Thus, \(\ker\Phi=\{\mathbf{I},-\mathbf{I}\}\) and so the homomorphism is 2-to-1. To confirm the nature of the spatial rotation corresponding to \(\mathbf{U}(\mathbf{n},\theta)\), choose \(\mathbf{v}=\mathbf{n}\) in \eqref{eq:rotation from unitary} to see that \(\mathbf{R}_\mathbf{U}\mathbf{n}=\mathbf{n}\) so that \(\mathbf{R}_\mathbf{U}\) is a rotation about the axis \(\mathbf{n}\). To determine the angle \(\gamma\) of rotation we note that if \(\mathbf{m}\) is a unit vector perpendicular to \(\mathbf{n}\) then \(\cos\gamma=(\mathbf{R}_\mathbf{U}\mathbf{m})\cdot\mathbf{m}\) and we have
\begin{align*}
\cos\gamma&=(\mathbf{R}_\mathbf{U}\mathbf{m})\cdot\mathbf{m}\\
&=\frac{1}{2}\tr\left(\left(\cos\frac{\theta}{2}\mathbf{I}-i\sin\frac{\theta}{2}\mathbf{n}\cdot\boldsymbol{\sigma}\right)(\mathbf{m}\cdot\boldsymbol{\sigma})\right.\\
&\qquad\times\left.\left(\cos\frac{\theta}{2}\mathbf{I}+i\sin\frac{\theta}{2}\mathbf{n}\cdot\boldsymbol{\sigma}\right)(\mathbf{m}\cdot\boldsymbol{\sigma})\right)\\
&=\frac{1}{2}\tr\left(\left(\cos\frac{\theta}{2}\mathbf{m}\cdot\boldsymbol{\sigma}-\sin\frac{\theta}{2}(\mathbf{m}\times\mathbf{n})\cdot\boldsymbol{\sigma}\right)\right.\\
&\qquad\times\left.\left(\cos\frac{\theta}{2}\mathbf{m}\cdot\boldsymbol{\sigma}+\sin\frac{\theta}{2}(\mathbf{m}\times\mathbf{n})\cdot\boldsymbol{\sigma}\right)\right)\\
&=\cos^2\frac{\theta}{2}-\sin^2\frac{\theta}{2}\\
&=\cos\theta
\end{align*}
so that the unitary operator \(\mathbf{U}(\mathbf{n},\theta)\) corresponds to a spatial rotation about the axis \(\mathbf{n}\) through an angle \(\theta\). We therefore denote the rotation \(\mathbf{R}(\mathbf{n},\theta)\).\(\blacksquare\)
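The whole construction can be checked numerically. In the following sketch (our own illustration; axis and angle are arbitrary example values) we build \(\mathbf{R}_\mathbf{U}\) entrywise from \((R_\mathbf{U})_{ij}=\frac{1}{2}\tr(\boldsymbol{\sigma}_i\mathbf{U}\boldsymbol{\sigma}_j\mathbf{U}^\dagger)\), which follows directly from \eqref{eq:rotation from unitary} and the inner product on \(H\):

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]

def U_su2(n, theta):
    nsig = sum(n[k] * s[k] for k in range(3))
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * nsig

def R_of(U):
    # (R_U)_{ij} = (1/2) tr(sigma_i U sigma_j U^dagger)
    return np.array([[0.5 * np.trace(s[i] @ U @ s[j] @ U.conj().T).real
                      for j in range(3)] for i in range(3)])

n = np.array([0.0, 0.6, 0.8])             # unit axis
theta = 1.3
U = U_su2(n, theta)
R = R_of(U)

assert np.allclose(R.T @ R, np.eye(3))    # orthogonal
assert np.isclose(np.linalg.det(R), 1.0)  # det 1, so R is in SO(3)
assert np.allclose(R @ n, n)              # the axis n is fixed
assert np.allclose(R_of(-U), R)           # U and -U give the same rotation: 2-to-1
```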

 

The Bloch sphere revisited

We have seen that any qubit observable, \(Q\), can be represented as a matrix

\begin{equation*}
\mathbf{Q}=q_0\mathbf{I}+\mathbf{q}\cdot\boldsymbol{\sigma}
\end{equation*}

where \(q_0\in\RR\) and \(\mathbf{q}\in\RR^3\). Recall the Bloch sphere,

in which a general qubit state, \(\ket{\psi}\), is given by,

\begin{equation*}
\ket{\psi}=\cos(\theta/2)\ket{0}+e^{i\varphi}\sin(\theta/2)\ket{1}.
\end{equation*}

It can be useful to denote this state vector by \(\ket{\mathbf{n};+}\), where \(\mathbf{n}\) is the unit vector with polar coordinates \((1,\theta,\phi)\), that is,
\begin{equation*}
\ket{\mathbf{n};+}=\cos(\theta/2)\ket{0}+e^{i\phi}\sin(\theta/2)\ket{1}
\end{equation*}
where
\begin{equation*}
\mathbf{n}=(\sin\theta\cos\phi,\sin\theta\sin\phi,\cos\theta)
\end{equation*}
and
\begin{equation*}
\ket{\mathbf{n};-}=\sin(\theta/2)\ket{0}-e^{i\phi}\cos(\theta/2)\ket{1}
\end{equation*}
corresponding to the antipodal point on the Bloch sphere (\(\theta\mapto\pi-\theta\) and \(\phi\mapto\phi+\pi\)). Indeed, \(\ket{\mathbf{n};\pm}\) are precisely the eigenvectors of the observable \(\mathbf{n}\cdot\boldsymbol{\sigma}\),
\begin{equation}
(\mathbf{n}\cdot\boldsymbol{\sigma})\ket{\mathbf{n};\pm}=\pm\ket{\mathbf{n};\pm}.
\end{equation}
For example,
\begin{align*}
(\mathbf{n}\cdot\boldsymbol{\sigma})\ket{\mathbf{n};+}&=\begin{pmatrix}\cos\theta&e^{-i\phi}\sin\theta\\ e^{i\phi}\sin\theta&-\cos\theta\end{pmatrix}\begin{pmatrix}\cos\frac{\theta}{2}\\e^{i\phi}\sin\frac{\theta}{2}\end{pmatrix}\\
&=\begin{pmatrix}\cos\theta\cos\frac{\theta}{2}+\sin\theta\sin\frac{\theta}{2}\\
e^{i\phi}(\sin\theta\cos\frac{\theta}{2}-\cos\theta\sin\frac{\theta}{2})\end{pmatrix}\\
&=\begin{pmatrix}\cos\frac{\theta}{2}\\e^{i\phi}\sin\frac{\theta}{2}\end{pmatrix}=\ket{\mathbf{n};+}
\end{align*}
Note of course that \(\braket{\mathbf{n};+|\mathbf{n};-}=0\), that is, \(\ket{\mathbf{n};+}\) and \(\ket{\mathbf{n};-}\) are orthogonal as state vectors in the Hilbert space \(\CC^2\) though of course \(\mathbf{n}\) and \(-\mathbf{n}\) are certainly not orthogonal vectors in \(\RR^3\)!
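
The eigenvector calculation above is easy to reproduce numerically (our own sketch; the angles are arbitrary example values):

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]

theta, phi = 0.9, 2.3
n = np.array([np.sin(theta) * np.cos(phi),
              np.sin(theta) * np.sin(phi),
              np.cos(theta)])
nsig = sum(n[k] * s[k] for k in range(3))

plus = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])
minus = np.array([np.sin(theta / 2), -np.exp(1j * phi) * np.cos(theta / 2)])

assert np.allclose(nsig @ plus, plus)      # eigenvalue +1
assert np.allclose(nsig @ minus, -minus)   # eigenvalue -1
assert abs(np.vdot(plus, minus)) < 1e-12   # orthogonal in C^2
```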

We’ve seen that there is a 2-to-1 homomorphism from \(SU(2)\) to \(SO(3)\) such that \(\mathbf{U}(\mathbf{n},\theta)\mapsto\mathbf{R}(\mathbf{n},\theta)\) where
\begin{equation*}
\mathbf{U}(\mathbf{n},\theta)=\exp\left(-i\frac{\theta}{2}\mathbf{n}\cdot\boldsymbol{\sigma}\right)=\cos\frac{\theta}{2}\mathbf{I}-i\sin\frac{\theta}{2}\mathbf{n}\cdot\boldsymbol{\sigma}
\end{equation*}
and the rotation \(\mathbf{R}(\mathbf{n},\theta)\) is such that
\begin{equation*}
(\mathbf{R}(\mathbf{n},\theta)\mathbf{v})\cdot\boldsymbol{\sigma}=\mathbf{U}(\mathbf{n},\theta)\mathbf{v}\cdot\boldsymbol{\sigma}\mathbf{U}(\mathbf{n},\theta)^\dagger,
\end{equation*}
which we confirmed was a rotation of \(\theta\) about the axis \(\mathbf{n}\). This means that for an arbitrary vector \(\mathbf{v}\in\RR^3\),
\begin{equation}
\mathbf{R}(\mathbf{n},\theta)\mathbf{v}=\cos\theta\mathbf{v}+(1-\cos\theta)(\mathbf{v}\cdot\mathbf{n})\mathbf{n}+\sin\theta\mathbf{n}\times\mathbf{v}.
\end{equation}
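
This correspondence between conjugation and the Rodrigues rotation formula can be verified directly (our own sketch; axis, angle and test vector are arbitrary choices):

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
dot = lambda w: sum(w[k] * s[k] for k in range(3))

n = np.array([0.0, 0.6, 0.8])              # unit rotation axis
theta = 0.7
v = np.array([1.5, -0.3, 2.0])             # arbitrary vector to rotate

U = np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * dot(n)
Rv = (np.cos(theta) * v
      + (1 - np.cos(theta)) * np.dot(v, n) * n
      + np.sin(theta) * np.cross(n, v))    # Rodrigues rotation formula

assert np.allclose(dot(Rv), U @ dot(v) @ U.conj().T)
```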

In terms of the state vector notation \(\ket{\mathbf{n};\pm}\) relating unit vectors in \(\RR^3\) to states on the Bloch sphere we have that, up to an overall phase,
\begin{equation}
\mathbf{U}(\mathbf{m},\alpha)\ket{\mathbf{n};+}=\ket{\mathbf{R}(\mathbf{m},\alpha)\mathbf{n};+}
\end{equation}
since
\begin{align*}
\left((\mathbf{R}(\mathbf{m},\alpha)\mathbf{n})\cdot\boldsymbol{\sigma}\right)\mathbf{U}(\mathbf{m},\alpha)\ket{\mathbf{n};+}&=\mathbf{U}(\mathbf{m},\alpha)\mathbf{n}\cdot\boldsymbol{\sigma}\ket{\mathbf{n};+}\\
&=\mathbf{U}(\mathbf{m},\alpha)\ket{\mathbf{n};+}
\end{align*}

Thus, as might have been anticipated, the unitary operator \(\mathbf{U}(\mathbf{m},\alpha)\) rotates the Bloch sphere state \(\ket{\mathbf{n};+}\) by an angle \(\alpha\) around the axis \(\mathbf{m}\).
 

Basic Postulates and Mathematical Framework

Quantum mechanics plays out in the mathematical context of Hilbert spaces. These may be finite or infinite dimensional. A finite dimensional Hilbert space is in fact nothing other than a complex vector space equipped with an Hermitian inner product. In infinite dimensions the space needs some extra technical attributes but for the time being our focus will be on finite dimensional spaces. We’ll state a series of postulates in terms of (general) Hilbert spaces safe in the knowledge that they will require little or no modification when we come to consider infinite dimensions.

State vectors

Postulate [State vectors in state space] Everything that can be said about the state of a physical system is encoded in a mathematical object called a state vector belonging to a Hilbert space also called a state space. Commonly used notation for such a vector is \(\ket{\psi}\). Conversely every non-zero vector \(\ket{\psi}\) in the Hilbert space corresponds to (everything that can be said about) a possible state of the system.

In fact, any non-zero multiple of a vector \(\ket{\psi}\) contains precisely the same information about a given state of the system and so most often we restrict attention to normalised states, that is, those of unit length, \(\braket{\psi|\psi}=1\). But normalisation only fixes state vectors up to a phase factor and the equivalence class in state space of all normalised vectors differing only by a phase is called a ray. Thus, to be precise, we say that physical states are in one-to-one correspondence with rays in state space.

Remarkable richness and physical relevance is already found in a 2-dimensional state space, the inhabitants of which are called qubits (their \(d\)-dimensional counterparts are called qudits). Mathematically, such a state space is just \(\CC^2\) and the standard basis would be provided by the pair of vectors
\begin{equation*}
\begin{pmatrix}1\\0\end{pmatrix}\qquad\qquad\begin{pmatrix}0\\1\end{pmatrix}
\end{equation*}
In quantum information contexts these are typically denoted by \(\ket{0}\) and \(\ket{1}\) respectively,
\begin{equation}
\ket{0}=\begin{pmatrix}1\\0\end{pmatrix}\qquad\qquad\ket{1}=\begin{pmatrix}0\\1\end{pmatrix}
\end{equation}
An arbitrary state vector in this state space would have the form,
\begin{equation}
\ket{\psi}=a\ket{0}+b\ket{1}
\end{equation}
with \(a,b\in\CC\) and normalisation requiring that \(|a|^2+|b|^2=1\). This means that we could write a general state vector as
\begin{equation*}
\ket{\psi}=e^{i\gamma}\left(\cos(\theta/2)\ket{0}+e^{i\varphi}\sin(\theta/2)\ket{1}\right)
\end{equation*}
but for different \(\gamma\) values these are just all the members of the same ray and so we can write the most general state as
\begin{equation}
\ket{\psi}=\cos(\theta/2)\ket{0}+e^{i\varphi}\sin(\theta/2)\ket{1}.
\end{equation}
Given the assumption of unit length, we can represent qubits as points on a unit sphere, called the Bloch sphere,

In this diagram we have illustrated an arbitrary qubit, \(\ket{\psi}\), as well as another possible pair of basis vectors,
\begin{equation}
\ket{+}\equiv\frac{1}{\sqrt{2}}\left(\ket{0}+\ket{1}\right)
\end{equation}
and
\begin{equation}
\ket{-}\equiv\frac{1}{\sqrt{2}}\left(\ket{0}-\ket{1}\right).
\end{equation}

Observables

Physical quantities of a quantum mechanical system which can be measured, such as position or momentum, are called observables.

Postulate [Observables] The observables of a quantum mechanical system corresponding to a state space, \(\mathcal{H}\), are represented by self-adjoint (Hermitian) operators on \(\mathcal{H}\).

In two dimensions, \(\mathcal{H}=\CC^2\), the Pauli matrices,

\begin{equation*}
\boldsymbol{\sigma}_1=\begin{pmatrix}0&1\\1&0\end{pmatrix}\qquad\boldsymbol{\sigma}_2=\begin{pmatrix}0&-i\\i&0\end{pmatrix}\qquad\boldsymbol{\sigma}_3=\begin{pmatrix}1&0\\0&-1\end{pmatrix},
\end{equation*}

are examples of (qubit) observables. In due course we’ll see their relationship to spin.

Recall that the eigenvalues of Hermitian operators are real, that eigenvectors with distinct eigenvalues are orthogonal and that there exists an orthonormal basis of eigenvectors of such operators. If the state space, \(\mathcal{H}\), has dimension \(d\) then a quantum mechanical observable, \(O\), may have \(r\leq d\) distinct eigenvalues \(\lambda_i\), each with geometric multiplicity \(d_i\) such that \(\sum_{i=1}^rd_i=d\). Denoting the corresponding orthonormal basis of eigenvectors, \(\ket{i,j}\), with \(i=1,\dots,r\) and \(j=1,\dots,d_i\), then,
\begin{equation*}
O\ket{i,j}=\lambda_i\ket{i,j},\quad i=1,\dots,r,\; j=1,\dots,d_i.
\end{equation*}
If we denote by \(P_{\lambda_i}\) the projector onto the eigenspace, \(V_{\lambda_i}\), so that
\begin{equation*}
P_{\lambda_i}=\sum_{j=1}^{d_i}\ket{i,j}\bra{i,j},
\end{equation*}
then \(O\) has the spectral decomposition,
\begin{equation*}
O=\sum_{i=1}^r \lambda_iP_{\lambda_i}.
\end{equation*}
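
A small numerical illustration of the spectral decomposition (our own sketch; the eigenvalues, including a deliberately degenerate one, are arbitrary choices):

```python
import numpy as np

# build a Hermitian observable with a degenerate eigenvalue on a 4-d state space
rng = np.random.default_rng(1)
Q = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))[0]
lambdas = np.array([2.0, 2.0, -1.0, 0.5])   # eigenvalue 2 has multiplicity 2
O = (Q * lambdas) @ Q.conj().T              # O = sum_i lambda_i |q_i><q_i|

# eigenspace projectors P_lambda and the decomposition O = sum_lambda lambda P_lambda
projectors = {lam: sum(np.outer(Q[:, j], Q[:, j].conj())
                       for j in range(4) if lambdas[j] == lam)
              for lam in set(lambdas)}

assert np.allclose(sum(lam * P for lam, P in projectors.items()), O)
assert np.allclose(sum(projectors.values()), np.eye(4))   # completeness
for P in projectors.values():
    assert np.allclose(P @ P, P)                          # each P is a projector
```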

Time Development of State Vectors

Leaving aside for the moment the question of how we extract some meaningful information from these state vectors, the next postulate deals with the question of how state vectors change in time. For this we restrict attention to closed systems, that is, (idealised) systems isolated from their environment.

Postulate [Unitary time evolution] If the state of a closed system at time \(t_1\) is represented by a state vector \(\ket{\psi(t_1)}\) then at a later time \(t_2\) the state is represented by a state vector \(\ket{\psi(t_2)}\) related to \(\ket{\psi(t_1)}\) by a unitary operator \(U(t_2,t_1)\) such that
\begin{equation}
\ket{\psi(t_2)}=U(t_2,t_1)\ket{\psi(t_1)}
\end{equation}
The unitary operator \(U(t_2,t_1)\) is a property of the given physical system and describes the time evolution of any possible state of the system from time \(t_1\) to time \(t_2\).

Since \(U(t_2,t_1)\) is unitary we have that
\begin{equation*}
U^\dagger(t_2,t_1)U(t_2,t_1)=\id
\end{equation*}
but of course \(U(t,t)=\id\) and if \(t_1{<}t{<}t_2\) then \(U(t_2,t_1)=U(t_2,t)U(t,t_1)\) so we must have
\begin{equation*}
U^\dagger(t_2,t_1)=U(t_1,t_2).
\end{equation*}

Starting from some fixed time \(t_0\) let us consider the time development of a state \(\ket{\psi(t_0)}\) to some later time \(t\),
\begin{equation*}
\ket{\psi(t)}=U(t,t_0)\ket{\psi(t_0)}.
\end{equation*}
Differentiating this with respect to \(t\) we obtain
\begin{equation*}
\frac{\partial}{\partial t}\ket{\psi(t)}=\frac{\partial U(t,t_0)}{\partial t}U^\dagger(t,t_0)\ket{\psi(t)},
\end{equation*}
or, defining
\begin{equation*}
\Lambda(t,t_0)\equiv\frac{\partial U(t,t_0)}{\partial t}U^\dagger(t,t_0),
\end{equation*}
we have
\begin{equation*}
\frac{\partial}{\partial t}\ket{\psi(t)}=\Lambda(t,t_0)\ket{\psi(t)}.
\end{equation*}
The operator \(\Lambda\) is actually independent of \(t_0\) since,
\begin{align*}
\Lambda(t,t_0)&=\frac{\partial U(t,t_0)}{\partial t}U^\dagger(t,t_0)\\
&=\frac{\partial U(t,t_0)}{\partial t}U(t_0,t_1)U^\dagger(t_0,t_1)U^\dagger(t,t_0)\\
&=\frac{\partial U(t,t_0)U(t_0,t_1)}{\partial t}(U(t,t_0)U(t_0,t_1))^\dagger\\
&=\frac{\partial U(t,t_1)}{\partial t}U^\dagger(t,t_1)\\
&=\Lambda(t,t_1)
\end{align*}
so we may as well write it simply as \(\Lambda(t)\). Moreover, \(\Lambda(t)\) is anti-Hermitian as can be seen by differentiating \(U(t,t_0)U^\dagger(t,t_0)=\id\) to obtain
\begin{equation*}
\Lambda(t)+\Lambda^\dagger(t)=0.
\end{equation*}
Thus if we define a new operator \(H(t)\) according to
\begin{equation}
H(t)=i\hbar\Lambda(t)
\end{equation}
where \(\hbar\) is the reduced Planck constant, then \(H(t)\) is an Hermitian operator with units of energy and the time development equation becomes
\begin{equation}
i\hbar\frac{\partial}{\partial t}\ket{\psi(t)}=H(t)\ket{\psi(t)}.
\end{equation}
The operator \(H(t)\) is interpreted as the Hamiltonian of the closed system, the energy observable, and the time development equation in this form is called the Schrödinger equation.

Because the Hamiltonian is an Hermitian operator it has a spectral decomposition (dropping the explicit reference to potential time dependence)
\begin{equation}
H=\sum_{i=1}^rE_iP_{E_i}
\end{equation}
where \(E_i\) are the (real) energy eigenvalues and \(P_{E_i}\) is a projector onto the energy eigenspace corresponding to the eigenvalue \(E_i\),
\begin{equation}
P_{E_i}=\sum_{j=1}^{d_i}\ket{E_i,j}\bra{E_i,j}
\end{equation}
where \(\ket{E_i,j}\) are energy eigenstates and \(d_i\) is the degeneracy of the energy eigenvalue \(E_i\).

The typical situation is that for a given closed system we know the Hamiltonian \(H(t)\), perhaps by analogy with a corresponding classical system. In this case, at least in principle, we can compute the corresponding unitary operator \(U(t)\) by solving the differential equation
\begin{equation}
\frac{dU(t)}{dt}=-\frac{i}{\hbar}H(t)U(t).
\end{equation}
There are 3 cases to consider.

The simplest situation is that the Hamiltonian is time independent since then it is straightforward to confirm that the solution is
\begin{equation}
U(t,t_0)=\exp\left[-\frac{i}{\hbar}H(t-t_0)\right].
\end{equation}
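
In the time-independent case the claim is easy to check numerically; the sketch below (our own illustration, with \(\hbar=1\) and an arbitrary example Hamiltonian) verifies that the exponential solves the differential equation and is unitary:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
H = 0.5 * s3 + 0.3 * s1   # an example time-independent qubit Hamiltonian, hbar = 1

w, V = np.linalg.eigh(H)
def U(t):                 # U(t, 0) = exp(-i H t) via the eigendecomposition of H
    return (V * np.exp(-1j * w * t)) @ V.conj().T

t, dt = 0.8, 1e-6
dUdt = (U(t + dt) - U(t - dt)) / (2 * dt)            # central-difference derivative
assert np.allclose(1j * dUdt, H @ U(t), atol=1e-6)   # i dU/dt = H U
assert np.allclose(U(t) @ U(t).conj().T, np.eye(2))  # unitarity
assert np.allclose(U(0), np.eye(2))
```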

The second case is that the Hamiltonian is time dependent but the Hamiltonians at two different times commute, that is, \([H(t_1),H(t_2)]=0\), then we claim that the solution is
\begin{equation}
U(t,t_0)=\exp\left[-\frac{i}{\hbar}\int_{t_0}^tds\,H(s)\right].
\end{equation}
To see this first define
\begin{equation*}
R(t)=-\frac{i}{\hbar}\int_{t_0}^tds\,H(s),
\end{equation*}
so that \(R'(t)=-(i/\hbar)H(t)\) and note that
\begin{equation*}
[R'(t),R(t)]=\left[-\frac{i}{\hbar}H(t),-\frac{i}{\hbar}\int_{t_0}^tds\,H(s)\right]=-\frac{1}{\hbar^2}\int_{t_0}^tds\,[H(t),H(s)]=0.
\end{equation*}
That \(R'(t)\) and \(R(t)\) commute then means that we can write the derivative,
\begin{align*}
\frac{d}{dt}\exp R(t)&=\frac{d}{dt}\left[\id+R(t)+\frac{1}{2!}R(t)R(t)+\frac{1}{3!}R(t)R(t)R(t)+\dots\right]\\
&=R'+\frac{1}{2!}(R'R+RR')+\frac{1}{3!}(R'RR+RR'R+RRR')+\dots
\end{align*}
as
\begin{equation*}
\frac{d}{dt}\exp R(t)=R'\left(\id+R+\frac{1}{2!}R^2+\dots\right)=R'(t)\exp R(t),
\end{equation*}
confirming the result.

The third case is the most general situation in which Hamiltonians at two different times do not commute. In this case the best we can do is write the differential equation for \(U(t)\) as an integral equation,
\begin{equation*}
U(t,t_0)=\id-\frac{i}{\hbar}\int_{t_0}^tdt_1\,H(t_1)U(t_1,t_0)
\end{equation*}
and then, expressing \(U(t_1,t_0)\) as
\begin{equation*}
U(t_1,t_0)=\id-\frac{i}{\hbar}\int_{t_0}^{t_1}dt_2\,H(t_2)U(t_2,t_0)
\end{equation*}
iterate once to obtain,
\begin{equation*}
U(t,t_0)=\id+\left(-\frac{i}{\hbar}\right)\int_{t_0}^tdt_1\,H(t_1)+\left(-\frac{i}{\hbar}\right)^2\int_{t_0}^tdt_1H(t_1)\int_{t_0}^{t_1}dt_2H(t_2)U(t_2,t_0).
\end{equation*}
Continuing in this way we obtain a formal series,
\begin{align*}
U(t,t_0)=\id+\left(-\frac{i}{\hbar}\right)\int_{t_0}^tdt_1\,H(t_1)&+\left(-\frac{i}{\hbar}\right)^2\int_{t_0}^tdt_1H(t_1)\int_{t_0}^{t_1}dt_2H(t_2)\\
&+\left(-\frac{i}{\hbar}\right)^3\int_{t_0}^tdt_1H(t_1)\int_{t_0}^{t_1}dt_2H(t_2)\int_{t_0}^{t_2}dt_3H(t_3)\\
&+\dots
\end{align*}
the right hand side of which is called a time-ordered exponential.
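
The time-ordered exponential can be approximated as a product of many short-time propagators, with later times multiplying on the left. The following sketch (our own illustration, with \(\hbar=1\) and an arbitrary example of a Hamiltonian whose values at different times do not commute) shows that this product is unitary but differs from the naive exponential of \(\int H\):

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def H(t):
    # example Hamiltonian with [H(t1), H(t2)] != 0 for t1 != t2
    return np.cos(t) * s1 + np.sin(t) * s3

def expmh(A, c):
    # exp(c A) for Hermitian A via its eigendecomposition
    w, V = np.linalg.eigh(A)
    return (V * np.exp(c * w)) @ V.conj().T

T, steps = 3.0, 20000
dt = T / steps
U = np.eye(2, dtype=complex)
for k in range(steps):
    t_mid = (k + 0.5) * dt
    U = expmh(H(t_mid), -1j * dt) @ U   # later times multiply on the LEFT

naive = expmh(sum(H((k + 0.5) * dt) for k in range(steps)) * dt, -1j)

assert np.allclose(U @ U.conj().T, np.eye(2))   # the time-ordered product is unitary
assert not np.allclose(U, naive, atol=1e-3)     # it differs from the naive exponential
```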

Measurement

Information is extracted from a quantum system through the process of measurement and, in contrast to classical physics, the process of measurement is incorporated into the theoretical framework.

Postulate [General measurement] To the \(i\)th possible outcome of a measurement of a quantum system in a state \(\ket{\psi}\) there corresponds a measurement operator \(M_i\) such that the probability that the \(i\)th outcome occurs is \(p(i)\) where
\begin{equation}
p(i)=\bra{\psi}M^\dagger_iM_i\ket{\psi}
\end{equation}
and if this occurs then the state of the system after the measurement is given by,
\begin{equation}
\frac{M_i\ket{\psi}}{\sqrt{\bra{\psi}M_i^\dagger M_i\ket{\psi}}}.
\end{equation}
The measurement operators satisfy
\begin{equation}
\sum_iM_i^\dagger M_i=\id
\end{equation}
expressing the fact that the probabilities sum to 1.
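
A minimal sketch of this postulate for a qubit (our own illustration; the projective measurement operators and the state are arbitrary example choices):

```python
import numpy as np

# projective measurement in the computational basis: M_0 = |0><0|, M_1 = |1><1|
M = [np.array([[1, 0], [0, 0]], dtype=complex),
     np.array([[0, 0], [0, 1]], dtype=complex)]
assert np.allclose(sum(Mi.conj().T @ Mi for Mi in M), np.eye(2))  # completeness

psi = np.array([np.cos(0.3), np.exp(0.5j) * np.sin(0.3)])         # a normalised qubit state
p = [np.vdot(psi, Mi.conj().T @ Mi @ psi).real for Mi in M]       # outcome probabilities
assert np.isclose(sum(p), 1.0)                                    # probabilities sum to 1

post = M[0] @ psi / np.sqrt(p[0])    # state after obtaining outcome 0
assert np.isclose(np.linalg.norm(post), 1.0)
```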

Distinguishing States by General Measurements

Suppose we have a two dimensional state space and we are given one of two states, \(\ket{0}\) or \(\ket{1}\), at random. There is a measurement which can distinguish between these two states with certainty. Defining the measurement operators \(M_0=\ket{0}\bra{0}\) and \(M_1=\ket{1}\bra{1}\), we have \(M_0+M_1=\id\), and assuming we receive \(\ket{0}\) then \(p(0)=1\), that is, \(p(0\,|\,\text{receive} \ket{0})=1\) and similarly \(p(1\,|\,\text{receive} \ket{1})=1\), so that the probability of successfully identifying the received state is
\begin{equation*}
P_S=p(\text{receive} \ket{0})p(0\,|\,\text{receive} \ket{0})+p(\text{receive} \ket{1})p(1\,|\,\text{receive} \ket{1})=1.
\end{equation*}
Of course this is a perfect situation. More realistic is that we must decide what kind of measurement to perform and how to infer from a given measurement outcome the identity of the original state. So for example if we (correctly) chose the \(\{M_0,M_1\}\) measurement but inferred from a 0 outcome the state \(\ket{1}\) and vice versa then the probability of successful identification would be 0. If instead we chose a measurement based on basis elements \(\ket{+}\) and \(\ket{-}\), and inferred from a \(+\) outcome the state \(\ket{0}\) and from a \(-\) outcome the state \(\ket{1}\) then since
\begin{equation*}
p(0\,|\,\text{receive} \ket{0})=p(+\,|\,\text{receive} \ket{0})=\braket{0|+}\braket{+|0}=\frac{1}{2}
\end{equation*}
and
\begin{equation*}
p(1\,|\,\text{receive} \ket{1})=p(-\,|\,\text{receive} \ket{1})=\braket{1|-}\braket{-|1}=\frac{1}{2}
\end{equation*}
the probability of successfully identifying the received state is \(1/2\).

We can generalise this discussion as follows. Suppose we receive, with equal probability, one of \(N\) states \(\{\ket{\phi_1},\dots,\ket{\phi_N}\}\) from a \(d\)-dimensional subspace \(U\subset\mathcal{H}\) of a state space \(\mathcal{H}\). We investigate the probability of successfully distinguishing these \(N\) states based on a measurement corresponding to \(n\) measurement operators \(\{M_1,\dots,M_n\}\). We need a rule which encodes how we infer one of the \(N\) given states from one of the \(n\) measurement outcomes. We can express this as a surjective map \(f\) from the set of outcomes, \(\{1,\dots,n\}\), to the given states, \(\{1,\dots,N\}\). Then the probability of success is given by
\begin{equation}
P_S=\sum_{i=1}^Np(\text{receive }\ket{\phi_i})\times\left(\sum_{j:f(j)=i}p(j\,|\,\text{receive }\ket{\phi_i})\right).
\end{equation}
Now if by \(P_U\) we denote the orthogonal projector onto the subspace \(U\) to which the \(N\) states \(\ket{\phi_i}\) belong then we can write
\begin{equation}
p(j\,|\,\text{receive }\ket{\phi_i})=\braket{\phi_i|M_j^\dagger M_j|\phi_i}=\braket{\phi_i|P_UM_j^\dagger M_jP_U|\phi_i}.
\end{equation}
But \(M_j^\dagger M_j\) is a positive operator and therefore so is \(P_UM_j^\dagger M_jP_U\) and since the \(\ket{\phi_i}\) are assumed to be normalised we can say that \(\braket{\phi_i|P_UM_j^\dagger M_jP_U|\phi_i}\leq\tr P_UM_j^\dagger M_jP_U\). Thus, noting that \(\tr P_U=d\), we obtain
\begin{align*}
P_S&=\sum_{i=1}^Np(\text{receive }\ket{\phi_i})\times\left(\sum_{j:f(j)=i}\braket{\phi_i|P_UM_j^\dagger M_jP_U|\phi_i}\right)\\
&=\frac{1}{N}\sum_{i=1}^N\sum_{j:f(j)=i}\braket{\phi_i|P_UM_j^\dagger M_jP_U|\phi_i}\\
&\leq\frac{1}{N}\sum_{i=1}^N\sum_{j:f(j)=i}\tr P_UM_j^\dagger M_jP_U\\
&=\frac{1}{N}\tr P_U\left(\sum_jM_j^\dagger M_j\right)P_U\\
&=\frac{d}{N}.
\end{align*}
That is, the probability of success is bounded from above according to \(P_S\leq d/N\). If \(N\leq d\) and the states \(\{\ket{\phi_1},\dots,\ket{\phi_N}\}\) are orthogonal then it is possible to distinguish the states with certainty. Indeed, defining operators \(M_i=\ket{\phi_i}\bra{\phi_i}\) for \(i=1,\dots,N\) and \(M_{N+1}=\sqrt{\id-\sum_{i=1}^NM_i}\), we have the appropriate measurement, to be combined with the trivial inference map \(f(i)=i\) for \(i=1,\dots,N\) and, for example, \(f(N+1)=1\).
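The bound \(P_S\leq d/N\) can be probed numerically. The following sketch, an illustrative choice not taken from the text, uses three symmetric "trine" states in \(\CC^2\) (so \(d=2\), \(N=3\)) with measurement operators \(M_j=\sqrt{2/3}\,\ket{\psi_j}\bra{\psi_j}\); these satisfy the completeness relation and in fact achieve \(P_S=2/3=d/N\):

```python
import numpy as np

# Three "trine" states in C^2 (d = 2, N = 3), received with equal probability
angles = [0, 2 * np.pi / 3, 4 * np.pi / 3]
states = [np.array([np.cos(t), np.sin(t)]) for t in angles]

# Measurement operators M_j = sqrt(2/3) |psi_j><psi_j|;
# the sum of M_j^dag M_j is the identity (completeness)
Ms = [np.sqrt(2 / 3) * np.outer(s, s) for s in states]
completeness = sum(M.conj().T @ M for M in Ms)
assert np.allclose(completeness, np.eye(2))

# Inference rule f(j) = j: outcome j is read as the state |phi_j>
P_S = sum((s @ M.conj().T @ M @ s).real for s, M in zip(states, Ms)) / 3
print(P_S)  # 2/3 = d/N, saturating the bound
```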

Let’s now focus on the case that we have two states \(\ket{\phi_1}\) and \(\ket{\phi_2}\) belonging to a two-dimensional subspace \(U\). We already know that if the states are orthogonal then in principle it is possible to distinguish them with certainty. Let us then consider the case that they are not orthogonal. We will show that in this case we cannot reliably distinguish the two states. To see this suppose on the contrary that it were indeed possible. Then we must have a measurement with operators \(M_i\) and an inference rule \(f\) such that
\begin{equation*}
\sum_{j:f(j)=1}p(j\,|\,\text{receive }\ket{\phi_1})=\sum_{j:f(j)=1}\braket{\phi_1|M_j^\dagger M_j|\phi_1}=1
\end{equation*}
and
\begin{equation*}
\sum_{j:f(j)=2}p(j\,|\,\text{receive }\ket{\phi_2})=\sum_{j:f(j)=2}\braket{\phi_2|M_j^\dagger M_j|\phi_2}=1.
\end{equation*}
So defining \(E_1\equiv\sum_{j:f(j)=1}M_j^\dagger M_j\) and \(E_2\equiv\sum_{j:f(j)=2}M_j^\dagger M_j\), and noting that \(E_1+E_2=\id\), we have \(\braket{\phi_1|E_2|\phi_1}=0\), so that \(\sqrt{E_2}\ket{\phi_1}=0\). Now we can form an orthonormal basis \(\{\ket{\phi_1},\ket{\tilde{\phi_1}}\}\) for \(U\) such that \(\ket{\phi_2}=\alpha\ket{\phi_1}+\beta\ket{\tilde{\phi_1}}\) with \(|\alpha|^2+|\beta|^2=1\), and since the states are not orthogonal, \(\alpha\neq0\) and hence \(|\beta|<1\). Then, since \(\sqrt{E_2}\ket{\phi_1}=0\) kills the cross terms and \(E_2=\id-E_1\leq\id\),
\begin{equation*}
\braket{\phi_2|E_2|\phi_2}=|\beta|^2\braket{\tilde{\phi_1}|E_2|\tilde{\phi_1}}\leq|\beta|^2<1,
\end{equation*}
contradicting our assumption that \(\braket{\phi_2|E_2|\phi_2}=1\). So non-orthogonal states cannot be reliably distinguished.
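To see concretely how non-orthogonal states resist perfect discrimination, consider (an illustrative choice, not from the text) \(\ket{\phi_1}=\ket{0}\) and \(\ket{\phi_2}=\ket{+}\) with a computational-basis measurement, inferring \(\ket{\phi_1}\) from outcome 0 and \(\ket{\phi_2}\) from outcome 1:

```python
import numpy as np

# Two non-orthogonal states: |phi_1> = |0>, |phi_2> = |+>
phi1 = np.array([1.0, 0.0])
phi2 = np.array([1.0, 1.0]) / np.sqrt(2)

# Measure in the computational basis with projectors P0, P1
P0 = np.outer([1.0, 0.0], [1.0, 0.0])
P1 = np.outer([0.0, 1.0], [0.0, 1.0])

# Infer phi_1 from outcome 0 and phi_2 from outcome 1
P_S = 0.5 * (phi1 @ P0 @ phi1) + 0.5 * (phi2 @ P1 @ phi2)
print(P_S)  # 0.75 < 1: this strategy cannot be perfect
```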

Projective measurement

The preceding discussion of measurement is rather abstract – we have conspicuously not mentioned what is being measured.  Let us now consider the more familiar projective measurement corresponding to the measurement of a particular observable.

Postulate [Projective measurement] The eigenvalues \(\lambda_i\) of the Hermitian operator, \(O\), representing a quantum mechanical observable are the possible outcomes of any experiment carried out on the system to establish the value of the observable. In this case the measurement operators are the orthogonal projectors, \(P_{\lambda_i}\), of the spectral decomposition of the Hermitian operator \(O\). That is, \(O=\sum_i\lambda_iP_{\lambda_i}\) where \(P_{\lambda_i}\) is the projector onto the eigenspace corresponding to the eigenvalue \(\lambda_i\). If a system is in a state \(\ket{\psi}\) then a measurement of an observable represented by the operator \(O\) will obtain a value \(\lambda_i\) with a probability
\begin{equation}
p(i)=\braket{\psi|P_{\lambda_i}|\psi}
\end{equation}
and subsequently the system will be in the state
\begin{equation}
\frac{P_{\lambda_i}\ket{\psi}}{\sqrt{p(i)}}.
\end{equation}

Note that in contrast to general measurements, if we repeat a projective measurement of the same observable then we are guaranteed to get the same outcome.

We sometimes speak of measuring in (or along) a basis. Suppose \(\{\ket{i}\}\) is an orthonormal basis for the Hilbert space describing our system. If the system is initially in a state \(\ket{\psi}\) and we make a measurement in the basis \(\{\ket{i}\}\), then with probability \(p(i)=|\braket{i|\psi}|^2\) the measurement leaves the system in the state \(\ket{i}\). The measurement operators in this case are the one-dimensional projectors \(\ket{i}\bra{i}\).
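A short sketch of measuring in a basis: here the computational (\(\boldsymbol{\sigma}_3\)) basis applied to the state \(\ket{+}\), computing each outcome probability and post-measurement state. The variable names are illustrative:

```python
import numpy as np

# Measure |psi> = |+> in the computational basis
psi = np.array([1.0, 1.0]) / np.sqrt(2)
basis = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

for i, ket in enumerate(basis):
    P = np.outer(ket, ket)            # projector |i><i|
    p = psi @ P @ psi                 # p(i) = <psi|P_i|psi>
    post = (P @ psi) / np.sqrt(p)     # post-measurement state P_i|psi>/sqrt(p(i))
    print(i, p, post)                 # each outcome occurs with probability 1/2
```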

Expectation values and uncertainty relations

The expectation value of the operator \(O\) when the system is in a state \(\ket{\psi}\), that is, the expected value of a (projective) measurement of the observable represented by \(O\) when the system is described by the state vector \(\ket{\psi}\), is given by
\begin{equation*}
\mathbf{E}_{\psi}[O]=\sum_ip(i)\lambda_i=\sum_i\braket{\psi|P_i|\psi}\lambda_i=\sum_i\braket{\psi|\lambda_iP_i|\psi}=\braket{\psi|O|\psi}.
\end{equation*}
We typically denote this expectation value \(\braket{O}_{\psi}\), thus
\begin{equation}
\braket{O}_{\psi}=\braket{\psi|O|\psi}.
\end{equation}

If a system is in an eigenstate of an observable \(O\) then when we measure this property we are sure to obtain the eigenvalue corresponding to that eigenstate. If though the system is in some arbitrary state \(\ket{\psi}\) then there will be some uncertainty in the value obtained. We denote the uncertainty of the Hermitian operator \(O\) in the state \(\ket{\psi}\) by \(\Delta_{\psi}O\), defined by
\begin{equation}
\Delta_{\psi}O\equiv\left|\left(O-\braket{O}_{\psi}\id\right)\ket{\psi}\right|
\end{equation}

It is not difficult to see that the uncertainty \(\Delta_{\psi}O\) vanishes if and only if \(\ket{\psi}\) is an eigenstate of \(O\).

We would expect there to be a relationship between the uncertainty \(\Delta_{\psi}O\) and the usual statistical standard deviation, \(\sqrt{\mathbf{E}_{\psi}[O^2]-\mathbf{E}_{\psi}[O]^2}\), and indeed we have
\begin{align*}
(\Delta_{\psi}O)^2&=\left|\left(O-\braket{O}_{\psi}\id\right)\ket{\psi}\right|^2\\
&=\braket{\psi|\left(O-\braket{O}_{\psi}\id\right)^2|\psi}\\
&=\braket{\psi|O^2-2\braket{O}_{\psi}O+\braket{O}_{\psi}^2\id|\psi}\\
&=\braket{O^2}_{\psi}-\braket{O}_{\psi}^2.
\end{align*}
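Both expressions for the uncertainty can be checked numerically. The sketch below (an illustrative choice of \(O=\boldsymbol{\sigma}_1\) and \(\ket{\psi}=\ket{0}\)) computes \(\Delta_{\psi}O\) both as the norm \(|(O-\braket{O}_{\psi}\id)\ket{\psi}|\) and as \(\sqrt{\braket{O^2}_{\psi}-\braket{O}_{\psi}^2}\):

```python
import numpy as np

# O = sigma_1 and |psi> = |0>: <O> = 0, <O^2> = 1, so Delta_psi O = 1
sigma1 = np.array([[0, 1], [1, 0]], dtype=complex)
psi = np.array([1.0, 0.0], dtype=complex)

expO = (psi.conj() @ sigma1 @ psi).real            # <O>_psi
expO2 = (psi.conj() @ sigma1 @ sigma1 @ psi).real  # <O^2>_psi

# Uncertainty as the norm |(O - <O> id)|psi>|
delta = np.linalg.norm((sigma1 - expO * np.eye(2)) @ psi)
print(expO, delta, np.sqrt(expO2 - expO ** 2))  # 0.0 1.0 1.0
```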

Geometrically, the orthogonal projection of \(O\ket{\psi}\) on the 1-dimensional subspace, \(U_{\psi}\), spanned by \(\ket{\psi}\) is \(P_{\psi}O\ket{\psi}\) where \(P_{\psi}\equiv\ket{\psi}\bra{\psi}\) and therefore \(P_{\psi}O\ket{\psi}=\braket{O}_{\psi}\ket{\psi}\). Furthermore, the component of \(O\ket{\psi}\) in the orthogonal complement of \(U_{\psi}\), \(U_{\psi}^\perp\), is \((\id-P_{\psi})O\ket{\psi}\), the length of which is just \(\Delta_{\psi}O\).

Theorem (The Uncertainty Principle) In a state \(\ket{\psi}\) the uncertainties in any pair of Hermitian operators, \(A\) and \(B\), satisfy the relation
\begin{equation}
\Delta_{\psi}A\Delta_{\psi}B\geq\left|\braket{\psi|\frac{1}{2i}[A,B]|\psi}\right|
\end{equation}

Proof This is simply an application of the Cauchy-Schwarz inequality. We define two new operators, \(\tilde{A}=A-\braket{A}_{\psi}\id\) and \(\tilde{B}=B-\braket{B}_{\psi}\id\) and states \(\ket{a}=\tilde{A}\ket{\psi}\) and \(\ket{b}=\tilde{B}\ket{\psi}\). Then Cauchy-Schwarz tells us that
\begin{equation*}
\braket{a|a}\braket{b|b}\geq\left|\braket{a|b}\right|^2,
\end{equation*}
from which, observing that \(\braket{a|a}=(\Delta_{\psi}A)^2\) and \(\braket{b|b}=(\Delta_{\psi}B)^2\),
\begin{equation*}
(\Delta_{\psi}A)^2(\Delta_{\psi}B)^2\geq\left|\braket{a|b}\right|^2.
\end{equation*}
Now, \(\braket{a|b}=\braket{\psi|\tilde{A}\tilde{B}|\psi}\), and observe that we can write
\begin{equation*}
\braket{\psi|\tilde{A}\tilde{B}|\psi}=\frac{1}{2}\braket{\psi|\{\tilde{A},\tilde{B}\}|\psi}+i\left(\frac{1}{2i}\braket{\psi|[\tilde{A},\tilde{B}]|\psi}\right)
\end{equation*}
where \(\{\tilde{A},\tilde{B}\}=\tilde{A}\tilde{B}+\tilde{B}\tilde{A}\) is the anti-commutator of \(\tilde{A}\) and \(\tilde{B}\). Since \(\{\tilde{A},\tilde{B}\}\) is Hermitian and \([\tilde{A},\tilde{B}]\) is anti-Hermitian, the two bracketed quantities are respectively the real and imaginary parts of \(\braket{a|b}\). Therefore we have
\begin{equation*}
\left|\braket{a|b}\right|^2=\left(\frac{1}{2}\braket{\psi|\{\tilde{A},\tilde{B}\}|\psi}\right)^2+\left(\frac{1}{2i}\braket{\psi|[\tilde{A},\tilde{B}]|\psi}\right)^2
\end{equation*}
and can write the uncertainty relation as
\begin{equation}
(\Delta_{\psi}A)^2(\Delta_{\psi}B)^2\geq\left(\frac{1}{2}\braket{\psi|\{\tilde{A},\tilde{B}\}|\psi}\right)^2+\left(\frac{1}{2i}\braket{\psi|[\tilde{A},\tilde{B}]|\psi}\right)^2
\end{equation}
from which it immediately follows that
\begin{equation}
(\Delta_{\psi}A)^2(\Delta_{\psi}B)^2\geq\left(\frac{1}{2i}\braket{\psi|[\tilde{A},\tilde{B}]|\psi}\right)^2
\end{equation}
and since \(\tilde{A}\) and \(\tilde{B}\) differ from \(A\) and \(B\) only by multiples of the identity, \([\tilde{A},\tilde{B}]=[A,B]\), so this is just the squared version of the desired result.\(\blacksquare\)
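The uncertainty relation can be spot-checked numerically. For \(A=\boldsymbol{\sigma}_1\) and \(B=\boldsymbol{\sigma}_2\) we have \([A,B]=2i\boldsymbol{\sigma}_3\), so the bound is \(|\braket{\boldsymbol{\sigma}_3}_{\psi}|\). The sketch below samples random normalised states (illustrative code, not from the text):

```python
import numpy as np

sigma1 = np.array([[0, 1], [1, 0]], dtype=complex)
sigma2 = np.array([[0, -1j], [1j, 0]], dtype=complex)

def uncertainty(O, psi):
    """Delta_psi O = |(O - <O> id)|psi>| for a normalised state psi."""
    exp = (psi.conj() @ O @ psi).real
    return np.linalg.norm((O - exp * np.eye(2)) @ psi)

# Check Delta A * Delta B >= |<(1/2i)[A,B]>| on random normalised states
rng = np.random.default_rng(0)
for _ in range(100):
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi = v / np.linalg.norm(v)
    comm = sigma1 @ sigma2 - sigma2 @ sigma1      # = 2i sigma_3
    bound = abs(psi.conj() @ (comm / (2j)) @ psi)
    assert uncertainty(sigma1, psi) * uncertainty(sigma2, psi) >= bound - 1e-12
print("uncertainty relation holds on all sampled states")
```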

It is of interest to establish under what conditions the uncertainty relation is saturated. As can be seen from the proof, this requires the Cauchy-Schwarz inequality to be saturated and the term involving the anti-commutator to vanish. We recall that saturation of Cauchy-Schwarz is equivalent to linear dependence of the two vectors, \(\ket{b}=\alpha\ket{a}\) for some \(\alpha\in\CC\). The anti-commutator term came from the real part of \(\braket{a|b}\), so we require \(\braket{a|b}+\braket{b|a}=0\). That is, using \(\ket{b}=\alpha\ket{a}\), \((\alpha+\alpha^*)\braket{a|a}=0\), so that (assuming \(\ket{a}\neq0\)) \(\alpha\) must be pure imaginary, \(\ket{b}=it\ket{a}\) for some \(t\in\RR\). In terms of the original operators and state this is the condition
\begin{equation}
(B-\braket{B}_{\psi}\id)\ket{\psi}=it(A-\braket{A}_{\psi}\id)\ket{\psi}
\end{equation}
from which we see that \(|t|=\Delta_{\psi}B/\Delta_{\psi}A\). This can be rewritten as an eigenvalue equation involving the non-Hermitian operator \(B-itA\),
\begin{equation}
(B-itA)\ket{\psi}=(\braket{B}_\psi-it\braket{A}_{\psi})\ket{\psi}.
\end{equation}