Basic Definitions and Examples

At school we learn that ‘space’ is 3-dimensional. We can specify its points in terms of coordinates \((x,y,z)\) and we think of ‘vectors’ as arrows from one point to another. For example, in the diagram below,
[Figure: the points \(P\), \(Q\) and \(S\) with the vectors \(\mathbf{OP}\), \(\mathbf{OQ}\), \(\mathbf{OS}\) and \(\mathbf{PQ}\).]
the points \(P\) and \(Q\) might be specified as \(P=(p_1,p_2,p_3)\) and \(Q=(q_1,q_2,q_3)\) respectively. The vectors \(\mathbf{OP}\) and \(\mathbf{OQ}\) are the arrows from the origin to the respective points and it’s typical to write their components as column vectors,
\begin{equation*}
\mathbf{OP}=\begin{pmatrix}p_1\\p_2\\p_3\end{pmatrix}\quad\text{and}\quad\mathbf{OQ}=\begin{pmatrix}q_1\\q_2\\q_3\end{pmatrix}.
\end{equation*}
We don’t distinguish between \(\mathbf{OP}\) and any other arrow of the same length and direction: we can think of the vectors emanating from the origin as representatives of equivalence classes of arrows of the same length and orientation positioned anywhere in space. In particular, the arrow from \(Q\) to \(S\), which we get by transporting \(\mathbf{OP}\), keeping its length and orientation the same, so that its ‘tail’ meets the ‘head’ of \(\mathbf{OQ}\), belongs to the same equivalence class as \(\mathbf{OP}\). In fact what we obtain in this way is the geometric construction of \(\mathbf{OS}\) as the sum of \(\mathbf{OP}\) and \(\mathbf{OQ}\). Similarly, \(\mathbf{PQ}\) is equivalently taken to be the arrow from \(P\) to \(Q\) or, as in the diagram, the vector from the origin to the point reached by joining the tail of \(-\mathbf{OP}\), the vector of the same length as \(\mathbf{OP}\) but opposite direction, to the head of \(\mathbf{OQ}\); that is, \(\mathbf{PQ}=\mathbf{OQ}-\mathbf{OP}\). More generally, given any vector \(\mathbf{v}\) and any real number \(a\), \(a\mathbf{v}\) is another vector, \(|a|\) times as long as \(\mathbf{v}\), pointing in the same direction as \(\mathbf{v}\) when \(a\) is positive and in the opposite direction when \(a\) is negative. The algebraic structure we have here is perhaps the most familiar example of the abstract notion of a vector space.
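For instance, with coordinates chosen purely for illustration, if \(P=(1,2,0)\) and \(Q=(3,1,2)\) then
\begin{equation*}
\mathbf{OS}=\mathbf{OP}+\mathbf{OQ}=\begin{pmatrix}1\\2\\0\end{pmatrix}+\begin{pmatrix}3\\1\\2\end{pmatrix}=\begin{pmatrix}4\\3\\2\end{pmatrix}
\quad\text{and}\quad
\mathbf{PQ}=\mathbf{OQ}-\mathbf{OP}=\begin{pmatrix}2\\-1\\2\end{pmatrix},
\end{equation*}
while \(2\,\mathbf{OP}=\begin{pmatrix}2\\4\\0\end{pmatrix}\) is twice as long as \(\mathbf{OP}\) and points in the same direction.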

Vectors in space arise of course in physics as the mathematical representation of physical quantities such as force or velocity. But this geometric setting also clarifies the discussion of the solution of simultaneous linear equations in three unknowns.

Recall that to specify a plane in space we need a point \(P=(p_1,p_2,p_3)\) and a normal vector
\begin{equation*}
\mathbf{n}=\begin{pmatrix}n_1\\n_2\\n_3\end{pmatrix}.
\end{equation*}
The plane is then the set of points \(X=(x,y,z)\) such that the scalar product of the vector
\begin{equation*}
\mathbf{PX}=\begin{pmatrix}x-p_1\\y-p_2\\z-p_3\end{pmatrix},
\end{equation*}
between our chosen point \(P\) and \(X\), with the normal vector, \(\mathbf{n}\), is zero, that is, \(\mathbf{PX}\cdot\mathbf{n}=0\). This is equivalent to the equation
\begin{equation*}
n_1x+n_2y+n_3z=c
\end{equation*}
where \(c=\mathbf{OP}\cdot\mathbf{n}\), a single linear equation in 3 unknowns for which there are an infinite number of solutions, namely, all the points of the plane. This solution ‘subspace’ is clearly a 2-dimensional space within the ambient 3-dimensional space. Now consider a pair of such equations. Assuming they are not simply constant multiples of one another, there are two possibilities. In the case that the equations correspond to a pair of parallel planes there are no solutions. For example, the pair
\begin{align*}
x-y+3z&=-1\\
-2x+2y-6z&=3
\end{align*}
corresponds geometrically to,
[Figure: two parallel planes.]
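Indeed, multiplying the first equation by \(-2\) gives \(-2x+2y-6z=2\), which is incompatible with the second equation; no point can satisfy both, reflecting the fact that the planes are parallel and distinct.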
The other possibility is that the equations correspond to a pair of intersecting planes, in which case there are an infinite number of solutions, corresponding to all points on the line of intersection. For example, the pair
\begin{align*}
x-y+3z&=-1\\
2x-y-z&=1
\end{align*}
corresponds geometrically to
[Figure: two intersecting planes.]
The line of intersection here, found by solving the pair of equations, may be expressed as
\begin{equation*}
\begin{pmatrix}x\\y\\z\end{pmatrix}=\lambda\begin{pmatrix}4\\7\\1\end{pmatrix}+\begin{pmatrix}2\\3\\0\end{pmatrix}.
\end{equation*}
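One way to arrive at this expression is by elimination: subtracting the first equation from the second gives \(x-4z=2\), so \(x=2+4z\), and the first equation then gives \(y=x+3z+1=3+7z\); setting \(z=\lambda\) recovers the parametric form above.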
Its direction vector could have been found as the cross product of the respective normal vectors — equivalent to solving the homogeneous system,
\begin{align*}
x-y+3z&=0\\
2x-y-z&=0,
\end{align*}
with the triple \((2,3,0)\) a particular solution of the inhomogeneous system. In dimensions higher than 3, that is, for systems of linear equations involving more than 3 variables, we can no longer think in terms of planes intersecting in space, but the abstract vector space setting continues to provide illumination.
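For concreteness, the cross product mentioned above is
\begin{equation*}
\begin{pmatrix}1\\-1\\3\end{pmatrix}\times\begin{pmatrix}2\\-1\\-1\end{pmatrix}
=\begin{pmatrix}(-1)(-1)-(3)(-1)\\(3)(2)-(1)(-1)\\(1)(-1)-(-1)(2)\end{pmatrix}
=\begin{pmatrix}4\\7\\1\end{pmatrix},
\end{equation*}
in agreement with the direction vector of the line of intersection found above.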

Definition A vector space \(V\) over a field\(^1\) \(K\) (from the point of view of physical applications, \(\mathbb{R}\) or \(\mathbb{C}\) will be most relevant), the elements of which will be referred to as scalars or numbers, is a set in which two operations, addition and multiplication by an element of \(K\), are defined. The elements of \(V\), called vectors, satisfy:

  • \(u+v=v+u\)
  • \((u+v)+w=u+(v+w)\)
  • There exists a zero vector \(0\) such that \(v+0=v\)
  • For any \(u\), there exists \(-u\), such that \(u+(-u)=0\)

Thus \((V,+)\) is an abelian group, and \(V\) is further equipped with a scalar multiplication satisfying:

  • \(c(u+v)=cu+cv\)
  • \((c+d)u=cu+du\)
  • \((cd)u=c(du)\)
  • \(1u=u\)

where \(u,v,w\in V\), \(c,d\in K\) and 1 is the unit element of \(K\).

Example The canonical example is the space of \(n\)-tuples, \(x=(x^1,\dots,x^n)\), \(x^i\in K\), denoted \(K^n\). Its vector space structure is given by \(x+y=(x^1+y^1,\dots,x^n+y^n)\) and \(ax=(ax^1,\dots,ax^n)\).

Example The polynomials of degree at most \(n\) over \(\mathbb{R}\), denoted \(P_n\), form a real vector space. In this case, typical vectors would be \(p=a_nx^n+a_{n-1}x^{n-1}+\dots+a_1x+a_0\) and \(q=b_nx^n+b_{n-1}x^{n-1}+\dots+b_1x+b_0\) with vector space structure given by \(p+q=(a_n+b_n)x^n+\dots+(a_0+b_0)\) and \(cp=ca_nx^n+\dots+ca_0\). More generally we have \(F[x]\), the space of all polynomials in \(x\) with coefficients from the field \(F\).

Example Continuous real-valued functions of a single variable, \(C(\RR)\), form a vector space with the natural vector addition and scalar multiplication.

Example The \(m\times n\) matrices over \(K\), \(\text{Mat}_{m,n}(K)\), form a vector space with the usual matrix addition and scalar multiplication. We denote by \(\text{Mat}_n(K)\) the vector space of \(n\times n\) matrices.

Definition A subspace, \(U\), of a vector space \(V\), is a subset of \(V\) which is closed under vector addition and scalar multiplication.

Example A plane through the origin is a subspace of \(\RR^3\). Note though that any plane which does not contain the origin cannot be a subspace since it does not contain the zero vector.
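To see the first claim, note that a plane through the origin consists of the vectors \(\mathbf{x}\) with \(\mathbf{n}\cdot\mathbf{x}=0\) for some fixed normal vector \(\mathbf{n}\), and if \(\mathbf{n}\cdot\mathbf{x}=0\) and \(\mathbf{n}\cdot\mathbf{y}=0\) then \(\mathbf{n}\cdot(\mathbf{x}+\mathbf{y})=0\) and \(\mathbf{n}\cdot(a\mathbf{x})=0\) for any scalar \(a\), so the plane is indeed closed under both operations.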

Example The solution set of any system of homogeneous linear equations in \(n\) variables over the field \(K\) is a subspace of \(K^n\). Incidentally, it will be useful to note here that any system of homogeneous linear equations has at least one solution, namely the zero vector, and also that an underdetermined system of homogeneous linear equations, that is, one with fewer equations than variables, has a non-zero solution; indeed, over an infinite field such as \(\mathbb{R}\) or \(\mathbb{C}\) it has infinitely many solutions. A simple induction argument on the number of variables establishes the latter claim. Suppose \(x_n\) is the \(n\)th variable. First we deal with the special case in which the coefficient of \(x_n\) is zero in every equation. Then \(x_n\) can take any value, and we may set all the other variables to zero, giving infinitely many solutions. If, however, one or more equations have a non-zero coefficient of \(x_n\), then choose one of them and use it to obtain an expression for \(x_n\) in terms of the other variables. We then use this expression twice. First, we eliminate \(x_n\) from all the other equations to arrive at an underdetermined homogeneous system in \(n-1\) variables, which by the induction hypothesis has infinitely many solutions. Then, for each such solution, the expression for \(x_n\) yields a solution of the original system.
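For instance, the homogeneous system displayed earlier, \(x-y+3z=0\) and \(2x-y-z=0\), is underdetermined, being two equations in three unknowns, and its solution set is the line consisting of all multiples of \((4,7,1)\), a subspace of \(\RR^3\).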

Definition A set of \(n\) vectors \(\{e_i\}\) in a vector space \(V\) is linearly dependent if there exist numbers \(c^i\), not all zero, such that \(c^1e_1+c^2e_2+\dots+c^ne_n=0\). They are linearly independent if they are not linearly dependent. The span of a set of vectors \(S\) in a vector space \(V\), \(\Span(S)\), is the set of all linear combinations of elements of \(S\).
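For example, in \(\RR^3\) the vectors \((1,0,0)\), \((0,1,0)\) and \((1,1,0)\) are linearly dependent, since \((1,0,0)+(0,1,0)-(1,1,0)=0\), whereas \((1,0,0)\), \((0,1,0)\) and \((0,0,1)\) are linearly independent; the span of \((1,0,0)\) and \((0,1,0)\) is the plane \(z=0\).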

Definition A set of vectors \(S\) is a basis of \(V\) if it spans \(V\), \(\Span(S)=V\), and is also linearly independent.

Throughout the Linear Algebra section of the Library, vector spaces will be assumed to be finite dimensional, that is, spaces \(V\) in which there exists a finite set \(S\) such that \(\Span(S)=V\). In this case it is not difficult to see that \(S\) must have a subset which is a basis of \(V\). In particular, any finite dimensional vector space has a basis.

The following fact will be used repeatedly in what follows.

Theorem Any linearly independent set of vectors, \(e_1,\dots,e_r\), in \(V\) can be extended to a basis of \(V\).

Proof For \(v\in V\), \(v\notin\Span(e_1,\dots,e_r)\) if and only if \(e_1,\dots,e_r,v\) are linearly independent. (The ‘if’ follows since \(v\in\Span(e_1,\dots,e_r)\) implies that \(e_1,\dots,e_r,v\) are linearly dependent. For the ‘only if’, suppose \(v\notin\Span(e_1,\dots,e_r)\) and that we had numbers \(c^i,b\), not all zero, with \(c^1e_1+\dots+c^re_r+bv=0\); then \(b=0\) would contradict the linear independence of the \(e_i\), while \(b\neq 0\) would contradict \(v\notin\Span(e_1,\dots,e_r)\).) Now, since \(V\) is finite dimensional, there is a spanning set \(S=\{f_1,\dots,f_d\}\). If each \(f_i\in\Span(e_1,\dots,e_r)\) then \(V=\Span(S)\subseteq\Span(e_1,\dots,e_r)\), so \(e_1,\dots,e_r\) is already a basis. Otherwise, some \(f_i\notin\Span(e_1,\dots,e_r)\) and we have seen that \(e_1,\dots,e_r,f_i\) is then linearly independent. Considering each \(f_i\) in turn and adjoining those not already in the span of the vectors collected so far, we obtain after at most \(d\) steps a linearly independent set whose span contains all the \(f_i\), hence all of \(V\); that is, a basis of \(V\).\(\blacksquare\)
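To illustrate the procedure, and purely as an example, take \(V=\RR^3\), start from the single linearly independent vector \((1,1,0)\), and use the spanning set \(f_1=(1,0,0)\), \(f_2=(0,1,0)\), \(f_3=(0,0,1)\). Since \(f_1\notin\Span((1,1,0))\) it is adjoined; \(f_2=(1,1,0)-(1,0,0)\) already lies in \(\Span((1,1,0),(1,0,0))\) and is skipped; \(f_3\) does not, and is adjoined, giving the basis \((1,1,0),(1,0,0),(0,0,1)\) of \(\RR^3\).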

Theorem If a vector space \(V\) contains a finite basis which consists of \(n\) elements then any basis of \(V\) must consist of exactly \(n\) elements.

Proof We first establish that if \(V=\Span(e_1,\dots,e_m)\), with the \(e_i\) linearly independent, then any linearly independent set of vectors \(\{f_1,\dots,f_n\}\) satisfies \(m\geq n\). One way to see this is by observing that by assumption we can express each \(f_i\) as a linear combination of the \(e_j\), \(f_i=\sum_{j=1}^mA_i^je_j\) for some \(A_i^j\in K\). Now the linear independence of the \(f_i\) means that if we have numbers \(c^i\) such that \(\sum_{i=1}^nc^if_i=0\) then \(c^i=0\) for all \(i\). But \(\sum_{i=1}^nc^if_i=\sum_{i=1}^n\sum_{j=1}^mc^iA_i^je_j\), and the coefficient of \(e_j\) here is the \(j\)th element of the \(m\times 1\) column vector \(\mathbf{A}\mathbf{c}\), where \(\mathbf{A}\) is the \(m\times n\) matrix whose entry in the \(j\)th row and \(i\)th column is \(A_i^j\) and \(\mathbf{c}\) is the \(n\times 1\) column vector with elements \(c^i\). If \(m<n\), then we know from the discussion in the Example above that there exists \(\mathbf{c}\neq 0\) such that \(\mathbf{A}\mathbf{c}=0\), and then \(\sum_{i=1}^nc^if_i=0\) with \(\mathbf{c}\neq 0\), contradicting the linear independence of the \(f_i\); so we must indeed have \(m\geq n\). Applying this result to any pair of bases, in both directions, shows that they have the same number of elements.\(\blacksquare\)

This allows us to define the dimension, \(n\), of a vector space \(V\) as the number of vectors in any basis of \(V\).

From now on we will, unless stated otherwise, employ the summation convention. That is, if in any term an index appears both as a subscript and a superscript then it is assumed to be summed over from 1 to \(n\) where \(n\) is the dimension of the space. Thus if \(\{e_i\}\) is a basis for \(V\) then any \(v\in V\) can be expressed uniquely as \(v=v^ie_i\). The numbers \(v^i\) are then called the components of \(v\) with respect to the basis \(\{e_i\}\).
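For example, in \(P_2\) with the basis \(e_1=1\), \(e_2=x\), \(e_3=x^2\), the polynomial \(p=2+3x-x^2\) is written \(p=v^ie_i\) with components \(v^1=2\), \(v^2=3\), \(v^3=-1\).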

Example The vector space \(K^n\) of \(n\)-tuples has a basis, which could reasonably be called ‘standard’, given by the vectors \(e_i=(0,\dots,1,\dots,0)\) with the \(1\) in the \(i\)th place. So in this special basis the components of a vector \(x=(x^1,\dots,x^n)\) are precisely the \(x^i\). It is common to take the elements of \(K^n\) to be \(n\)-dimensional column vectors and to denote vectors using bold face, as in,
\begin{equation*}
\mathbf{x}=\begin{pmatrix}
x^1\\
\vdots\\
x^n
\end{pmatrix},
\end{equation*}
with the standard basis vectors, \(\{\mathbf{e}_i\}\).
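In the summation convention introduced above, this is simply \(\mathbf{x}=x^i\mathbf{e}_i\).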

 

Notes:

  1. Recall that a ring is an (additive) abelian group with a multiplication operation which is associative and distributes over addition. A field is a ring in which the multiplication also satisfies all the group properties (after throwing out the additive identity); i.e. it has multiplicative inverses and a multiplicative identity, and is commutative.