Category Archives: Special Relativity

Time dilation and length contraction

In the previous post we learnt that if in one frame two clocks are synchronised a distance \(D\) apart, then in another frame in which these clocks are moving along the line joining them with speed \(v\), the clock in front lags the clock behind by a time of \(Dv/c^2\). Let’s now think more about the contrasting perspectives of Alice, riding a train, and Bob, track side, thinking in particular about their respective clock and length readings.

The following picture sums up Alice’s perspective:

Here and below, clocks in either Alice’s or Bob’s frame are denoted by the rounded rectangles with times displayed within. Lengths and times in Alice’s train frame will be denoted by primed symbols, those in Bob’s track frame, unprimed. Above we see that Alice records the two events to occur simultaneously at a time we’ve taken to be 0. We’re free to take the time on Bob’s clock at the rear of the carriage to read 0 at this time in Alice’s frame, but then, since from Alice’s perspective Bob’s clocks are approaching her with speed \(v\), we know that Bob’s clock at the front of the carriage, when Alice’s clock there is showing 0, must be already showing a later time which we denote \(T\). This is unprimed as it’s a time displayed by a clock in Bob’s frame. From our discussion of the relativity of simultaneity we know \(T\) must be given by
\begin{equation}
T=\frac{Dv}{c^2}\label{eq:tracktime}
\end{equation}
where \(D\) is the separation of those two clocks in Bob’s frame as measured in Bob’s frame. Alice measures the length of her carriage to be \(L’\). We call the length of an object measured in its rest frame its proper length, so both \(L’\) and \(D\) are proper lengths, whereas \(D’\), the distance between Bob’s clocks as measured by Alice, is not proper. Note that, of course, \(L’=D’\).

Now let’s consider Bob’s perspective. We now need two pictures corresponding to two different times in Bob’s frame.

From Bob’s perspective, at time 0 the rear of the carriage is located at his rear clock and the carriage clock there also shows zero. At the later time \(T\), the front of the carriage is located at his front clock and the carriage clock there shows 0. Bob sees Alice’s clocks travel with speed \(v\) towards him so we know that the front clock lags the clock to the rear by a time given by
\begin{equation*}
\frac{L’v}{c^2}
\end{equation*}
where \(L’\) is the distance between Alice’s clocks, the length of the carriage, as measured in Alice’s frame. Thus, in the first of the two Bob frame snapshots, the rear carriage clock shows 0 whilst the front carriage clock shows \(-T’\), and in the second snapshot, the rear carriage clock shows \(T’\) while the front carriage clock shows 0 where
\begin{equation}
T’=\frac{L’v}{c^2}.\label{eq:traintime}
\end{equation}

Now, we’re going to be interested in the ratio \(T’/T\), the fraction of track frame time recorded by train frame clocks – a ratio of a moving clock time to a stationary clock time. We see immediately from \eqref{eq:tracktime} and \eqref{eq:traintime} that this is the same as \(L’/D\). But recall that \(L’=D’\) so we have
\begin{equation*}
\frac{T’}{T}=\frac{D’}{D}.
\end{equation*}
\(D’/D\) is the ratio of a measurement of a length moving with speed \(v\), \(D’\), to a measurement of a length at rest, \(D\). This ratio must therefore also be equal to \(L/L’\), the length of the carriage as viewed from Bob’s perspective to the (rest-frame) length of the carriage as measured by Alice. So in fact we have
\begin{equation}
\frac{T’}{T}=\frac{D’}{D}=\frac{L}{L’}.
\end{equation}
Now \(D’/D=L/L’\) together with \(D’=L’\) gives \({L’}^2=DL\). Recall that \(D=\gamma^2L\), where \(\gamma=1/\sqrt{1-(v/c)^2}\), from which it follows that
\begin{equation}
{L’}^2=\gamma^2L^2
\end{equation}
or,
\begin{equation}
L=\frac{1}{\gamma}L’.
\end{equation}
This is length contraction! Recall that \(\gamma{>}1\) so that the length of the carriage as measured by Bob is smaller than the carriage’s proper length measured by Alice. It follows also that
\begin{equation}
T’=\frac{1}{\gamma}T.
\end{equation}
This is time dilation! Whilst stationary clocks record a time \(T\), clocks in motion record a shorter time \(T’\) — moving clocks run slow.
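As a quick numerical sanity check of the relations above, here is a minimal Python sketch. The function name `carriage_readings`, the dictionary keys and the sample values (a carriage of proper length 100 m moving at \(0.6c\)) are illustrative assumptions, not part of the original argument.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v: float) -> float:
    """Lorentz factor for a speed v given in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def carriage_readings(v: float, proper_length: float) -> dict:
    """Quantities from the Alice/Bob carriage example.

    proper_length is L', the carriage length measured by Alice.
    """
    g = gamma(v)
    L = proper_length / g                 # carriage length measured by Bob
    D = g**2 * L                          # separation of Bob's clocks, D = gamma^2 L
    T = D * v / C**2                      # reading on Bob's front clock
    T_prime = proper_length * v / C**2    # lag between Alice's clocks, per Bob
    return {"gamma": g, "L": L, "D": D, "T": T, "T_prime": T_prime}


r = carriage_readings(v=0.6 * C, proper_length=100.0)
print("T'/T =", r["T_prime"] / r["T"])
print("L/L' =", r["L"] / 100.0)
print("1/gamma =", 1.0 / r["gamma"])
```

All three printed ratios should agree, which is just the chain \(T’/T=D’/D=L/L’=1/\gamma\) evaluated for one choice of speed.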

The relativity of simultaneity

Following Mermin again, we’ll see how the invariance of the speed of light in all inertial frames leads directly to the relativity of simultaneity. Alice rides a train. In one of the carriages it is arranged to have two photons of light emitted from the center of the carriage, one traveling towards the front and the other towards the back.

image

The events \(E_f\) and \(E_r\) are respectively the photon reaching the front and rear of the carriage. In Alice’s frame of reference these events occur simultaneously; we don’t even have to refer to clocks in Alice’s frame since we know that light travels at the same speed in all directions.

Now consider the situation from the perspective of a track-side observer, Bob. From his perspective Alice’s train is traveling with a velocity \(v\). Three events take place. First, the photons are emitted from the center [1] of the carriage. Then, since they still travel at light speed \(c\) in all directions, he ‘sees’, that is, the clocks in his latticework record, the event \(E_r\) occurring before the event \(E_f\). Schematically we have,

image

This is of course as we’d expect: as the left traveling photon heads to the back of the carriage, the back of the carriage is traveling towards it with speed \(v\), while as the right traveling photon heads towards the front of the carriage, the front is traveling away from it with speed \(v\). Let’s say that in Bob’s frame the length of the train carriage is \(L\). If \(T_r\) is the elapsed time in Bob’s frame between the photons being emitted and the left traveling photon reaching the back of the carriage then we have
\begin{equation}
cT_r=\frac{1}{2}L-vT_r
\label{eq:reartime}
\end{equation}
and after a time \(T_f\) the right traveling photon covers \(cT_f\) given by
\begin{equation}
cT_f=\frac{1}{2}L+vT_f.
\label{eq:fronttime}
\end{equation}
These individual times are not what we’re interested in though. We’re interested in the time difference, let’s call it \(\Delta T\), between the events \(E_r\) and \(E_f\) as observed by Bob, for which we obtain,
\begin{equation*}
c\Delta T=v(T_r+T_f).
\end{equation*}
But the total distance traveled by the photons is \(D=cT_r+cT_f\), the spatial separation Bob observes between the two events. So finally we obtain
\begin{equation*}
\Delta T=\frac{Dv}{c^2}.
\end{equation*}

Two events, \(E_r\) and \(E_f\), which are simultaneous in Alice’s inertial frame of reference, are not simultaneous in Bob’s frame, moving with velocity \(v\) in the direction pointing from \(E_f\) to \(E_r\) relative to Alice’s. In Bob’s frame, the event \(E_r\) occurs a time \(Dv/c^2\) before the event \(E_f\), where \(D\) is the spatial separation of the events as seen by Bob.

Alice’s clocks, that is those synchronised in her frame, will record the events \(E_r\) and \(E_f\) occurring at the same time. Moreover they will also record the fact that Bob’s clocks show the event \(E_f\) occurring a time \(Dv/c^2\) after \(E_r\). Alice’s explanation for this fact will be that Bob’s clocks aren’t properly synchronised. Bob on the other hand says the events aren’t simultaneous and says that Alice’s clocks cannot, therefore, be properly synchronised.

The rule about simultaneous events in one frame not being simultaneous in another can be stated in terms of clocks thus:

If in one frame two clocks are synchronised a distance \(D\) apart, then in another frame, in which these clocks are moving along the line joining them with speed \(v\), the clock in front lags the clock behind by a time of \(Dv/c^2\).

It will be useful in the next post, where we consider some consequences of the relativity of simultaneity, to have a relation between the two lengths \(L\) and \(D\) in Bob’s frame. From \eqref{eq:reartime},
\begin{equation*}
T_r=\frac{L}{2}\frac{1}{c+v},
\end{equation*}
and from \eqref{eq:fronttime},
\begin{equation*}
T_f=\frac{L}{2}\frac{1}{c-v},
\end{equation*}
so that
\begin{align*}
\Delta T&=\frac{L}{2}\frac{2v}{c^2-v^2}\\
&=\frac{L\gamma^2v}{c^2}
\end{align*}
where we have introduced the Lorentz factor, \(\gamma\), defined as
\begin{equation*}
\gamma=\frac{1}{\sqrt{1-(v/c)^2}}.
\end{equation*}
Notice that for \(0{<}v{<}c\), \(\gamma{>}1\), and, combining with our previous expression for \(\Delta T\), namely \(\Delta T=Dv/c^2\), we conclude that \(L\) and \(D\) are related according to
\begin{equation}
D=\gamma^2L.
\end{equation}
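
The bookkeeping in Bob’s frame can be checked numerically. The following Python sketch is only an illustration: the function name `simultaneity_times`, the choice of units with \(c=1\) and the sample values \(v=0.6\), \(L=80\) are assumptions made here.

```python
import math


def simultaneity_times(v: float, L: float, c: float = 1.0) -> dict:
    """Bob's-frame times for the two photons (units with c = 1 by default).

    v is the train speed and L the carriage length, both as measured by Bob.
    """
    T_r = (L / 2) / (c + v)      # from c*T_r = L/2 - v*T_r
    T_f = (L / 2) / (c - v)      # from c*T_f = L/2 + v*T_f
    delta_T = T_f - T_r          # time by which E_f follows E_r
    D = c * (T_r + T_f)          # spatial separation of the two events
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return {"T_r": T_r, "T_f": T_f, "delta_T": delta_T, "D": D, "gamma": g}


s = simultaneity_times(v=0.6, L=80.0)
print(s["delta_T"], s["D"] * 0.6)        # Delta T against D v / c^2 (c = 1)
print(s["D"], s["gamma"] ** 2 * 80.0)    # D against gamma^2 L
```

Each printed pair should agree, confirming both \(\Delta T=Dv/c^2\) and \(D=\gamma^2L\) for this choice of values.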

Notes:

  1. We are assuming here, of course, that the center remains the center in Bob’s frame. Even though, as we’ll soon see, the length of something does change depending on the frame of reference, both the front half and back half would change by the same amount!

Velocity addition in special relativity — sometimes \(1+1\neq2\)

There’s a great little book on special relativity by the physicist N. David Mermin in which he gets to the heart of the astonishing consequences of Einstein’s special relativity in a particularly elegant fashion and with only very basic mathematics. In this and the following note we’ll closely follow Mermin’s treatment. The crucial fact of life which we have to come to terms with is that whether or not two events which are spatially separated happen at the same time is a matter of perspective. This flies in the face of our intuition [1]. We’re wired to think of time as a kind of universal clock and that we and the rest of the universe march forward with its tick-tock relentlessly and in unison.

Let us begin by reconsidering the relativity of velocities. Our intuition, and Galilean relativity, tells us that if you are riding a train and throw a ball in the direction of travel then to someone stationary with respect to the tracks the speed of the ball is simply the sum of the train’s speed and the speed with which the ball leaves your hand. But thanks to special relativity we know that, at least for light, this isn’t the case. A photon (particle of light) emitted from a moving train moves at light speed \(c\) with respect to the train and with respect to the tracks. This surely has consequences for the relativity of motion in general.

Following Mermin we employ the neat device of measuring the velocity of an object by racing it against a photon. With a corrected velocity addition rule as our goal we conduct this race on a train carriage.

image

The particle, black dot, whose velocity \(v\) we seek, sets off from the back of the carriage towards the front in a race with a photon which, as we know, travels at speed \(c\). We arrange that the front of the carriage is mirrored so that once the photon reaches the front it’s reflected back. The point at which the particle and photon meet is recorded (perhaps a mark is made on the floor of the carriage – this is a gedanken experiment!). Let \(f\) be the fraction of the carriage length between that meeting point and the front of the carriage. At that point the particle has travelled a fraction \(1-f\) of the length of the carriage whilst the photon has travelled \(1+f\) times the length of the carriage. Since both have been travelling for the same time, the ratio of those distances must be equal to the ratio of the velocities, that is,
\begin{equation}
\frac{1-f}{1+f}=\frac{v}{c},
\end{equation}
which we can rewrite as an equation for \(f\),
\begin{equation}
f=\frac{c-v}{c+v}\label{eq:f1}.
\end{equation}
The velocity is thus established in an entirely unambiguous manner. This may strike you as a somewhat indirect approach to measuring speed but notice that we’ve avoided measuring either time or distance. As we’ll soon see, in special relativity such measurements are rather more subtle than we might imagine.

Now let’s consider the same race but from the perspective of the track frame relative to which the train carriage is travelling (left to right) with velocity \(u\).

image

We’re after the correct rule for adding the velocity \(v\), of the particle relative to the train, to the velocity \(u\), of the train relative to the track, to give the velocity \(w\), of the particle relative to the track. To facilitate the calculations we’ll allow ourselves to use some lengths and times. However their values aren’t important; as we’ll see they drop out of the final equation. We’re really just using their ‘existence’. As indicated in the diagram, after the time \(T_0\) that the photon takes to reach the front of the carriage, the photon is a distance \(D\) in front of the particle, that is,
\begin{equation}
D=cT_0-wT_0,
\end{equation}
but this distance is then also the sum of the distances covered respectively by the photon and particle in the further time \(T_1\) between the reflection and their meeting,
\begin{equation}
D=cT_1+wT_1.
\end{equation}
So we can write the ratio of the times as
\begin{equation}
\frac{T_1}{T_0}=\frac{c-w}{c+w}\label{eq:time-ratio1}.
\end{equation}
If the length of the carriage in the track frame is \(L\) then we also have that the distance covered by the photon in time \(T_0\) is
\begin{equation}
cT_0=L+uT_0
\end{equation}
and in time \(T_1\) is
\begin{equation}
cT_1=fL-uT_1.
\end{equation}
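Here we have used the fact that the mark on the floor divides the carriage in the same proportion \(f\) whichever frame we use. Combining these we eliminate \(L\) to obtain another expression for the ratio of times,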
\begin{equation}
\frac{T_1}{T_0}=f\frac{(c-u)}{(c+u)}\label{eq:time-ratio2}.
\end{equation}
The two equations, \eqref{eq:time-ratio1} and \eqref{eq:time-ratio2} provide us with a second equation for \(f\),
\begin{equation}
f=\frac{(c+u)}{(c-u)}\frac{c-w}{c+w},
\end{equation}
which in combination with the first, \eqref{eq:f1}, leads to
\begin{equation}
\frac{c-w}{c+w}=\frac{c-u}{c+u}\frac{c-v}{c+v}\label{eq:velocity-addition1},
\end{equation}
which expresses the velocity \(w\) of the particle in the track frame in terms of the velocity \(u\) of the train in the track frame and the velocity \(v\) of the particle in the train frame. With a bit more work this can be rewritten as
\begin{equation}
w=\frac{u+v}{1+uv/c^2}\label{eq:velocity-addition2},
\end{equation}
which should be compared to the Galilean addition rule, \(w=u+v\).
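
The ‘bit more work’ is a short piece of algebra, sketched here. Write \(a=(c-u)/(c+u)\) and \(b=(c-v)/(c+v)\) (shorthand introduced only for this sketch). Then \((c-w)/(c+w)=ab\) rearranges to \(w=c(1-ab)/(1+ab)\), and
\begin{align*}
1-ab&=\frac{(c+u)(c+v)-(c-u)(c-v)}{(c+u)(c+v)}=\frac{2c(u+v)}{(c+u)(c+v)},\\
1+ab&=\frac{(c+u)(c+v)+(c-u)(c-v)}{(c+u)(c+v)}=\frac{2(c^2+uv)}{(c+u)(c+v)},\\
w&=c\,\frac{2c(u+v)}{2(c^2+uv)}=\frac{u+v}{1+uv/c^2}.
\end{align*}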

Here’s a plot, with velocities in units of \(c\), comparing the Galilean with the special relativity velocity addition for an object fired at a speed \(v\) from a train carriage moving at half the speed of light:
[image: velocity addition plot]
Equation \eqref{eq:velocity-addition2} ensures that no matter how fast the particle travels with respect to the train (assuming it’s less than light speed), its velocity with respect to the track is always less than light speed. In the extreme case of a particle traveling at light speed with respect to a train which is also travelling at light speed, \(1+1=1\)!
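
For readers who would rather see numbers than a plot, here is a small Python sketch of the same comparison for a train moving at half the speed of light. The function names and the sampled values of \(v\) are illustrative assumptions; velocities are in units of \(c\). The second function retraces the photon race step by step as a cross-check on the closed formula.

```python
def add_velocities(u: float, v: float, c: float = 1.0) -> float:
    """Relativistic velocity addition, w = (u + v) / (1 + u v / c^2)."""
    return (u + v) / (1.0 + u * v / c**2)


def add_velocities_via_race(u: float, v: float, c: float = 1.0) -> float:
    """Same result, obtained by following the photon race step by step."""
    f = (c - v) / (c + v)              # meeting fraction, fixed in the train frame
    ratio = f * (c - u) / (c + u)      # T_1 / T_0, which equals (c - w) / (c + w)
    return c * (1.0 - ratio) / (1.0 + ratio)


u = 0.5  # train speed in units of c
for v in (0.0, 0.25, 0.5, 0.75, 1.0):
    w = add_velocities(u, v)
    print(f"v = {v:4.2f}   Galilean u+v = {u + v:4.2f}   relativistic w = {w:.4f}")
```

The Galilean column exceeds 1 once \(v>0.5\), whereas the relativistic column never does.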

Events, observers and measurements

In special relativity we often read that such and such an inertial observer measures the time between two events or such and such an inertial observer measures the distance between two events. On the face of it such assertions seem reasonably clear and straightforward and indeed very often their perspicuity is simply taken for granted. But as we’ll see their meanings in relativity are not what we’d expect and therefore it’s important to establish early exactly what is meant by an ‘observer’, an ‘event’, and what constitutes a measurement.

The adjective ‘inertial’ in ‘inertial observer’ has been dealt with already — whatever or whoever constitutes an observer should be in free-fall. Let’s also be clear that by an ‘event’ we mean a happening, somewhere, sometime, corresponding to a point in spacetime — perhaps a photon of light leaving an emitter or being absorbed by a detector, perhaps a particle passing through a particular point in space, perhaps a time being recorded by a clock at a particular point in space. Events, points in spacetime, are real, they care nothing for coordinate systems, frames of reference etc.

When we introduced the idea of a frame of reference we vaguely mentioned a laboratory in which lengths and times could be measured. Let’s be more concrete now and imagine an inertial frame of reference as a freely floating 3-dimensional latticework of rods and clocks with one node designated as the origin.

All the rods have the same length but the clocks at each node are rather special. Like all good clocks they can of course keep time. In addition though they are programmed with their respective locations with respect to the origin, so in particular they ‘know’ their distance from the origin. Furthermore they are sophisticated recording devices ready to detect any event and record its location and time for future inspection. In particular, this allows them all to be synchronized with the clock at the origin in the following way. A flash of light is sent out from the origin just as the clock there is set to 0. The spherical light front spreads out at the same speed \(c\) in all directions. As each clock in the lattice detects this light it sets its time equal to its distance from the origin divided by \(c\) and is then ‘in sync’ with the clock at the origin. We should imagine this latticework to be ‘fine-grained’ enough to ensure that to any required accuracy a clock is located ‘at’ the spatial location of any event. This is a crucial point. The time assigned to an event, with respect to an inertial frame of reference, is always that of one of the inertial frame’s clocks at the event. The spacetime location of the event is then given by the spatial coordinates of the clock there together with the clock’s time at the moment the event happens and is recorded along with a description of what took place. This would then constitute a ‘measurement’ and the inertial ‘observer’ carrying out the measurement should be thought of as the whole latticework. An observer is better thought of as the all-seeing eye of the entire inertial frame than as somebody located at some specific point in space with a pair of binoculars and a notepad! If we do speak of an observer as a person, and it is convenient and usual to do so, then we really mean such an intelligent latticework of rods and detecting clocks with respect to which that person is at rest.
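
The synchronisation procedure can be mimicked in a few lines of Python. This is a toy model under loud assumptions: the class `LatticeClock`, its methods, and the idea of an idealised ‘frame time’ against which each clock’s initial error is expressed are all inventions for illustration, with units chosen so that \(c=1\).

```python
import math
import random
from dataclasses import dataclass, field

C = 1.0  # work in units where c = 1


@dataclass
class LatticeClock:
    """One clock of the lattice; a toy model, not a physical simulation."""
    position: tuple
    offset: float = 0.0                      # reading minus idealised frame time
    log: list = field(default_factory=list)  # events recorded at this clock

    def reading(self, frame_time: float) -> float:
        return frame_time + self.offset

    def sync_on_flash(self) -> None:
        """Set the reading to distance/c at the moment the origin's flash arrives."""
        d = math.dist(self.position, (0.0, 0.0, 0.0))
        arrival = d / C                      # frame time at which the flash arrives
        self.offset += d / C - self.reading(arrival)

    def record(self, description: str, frame_time: float) -> tuple:
        """Record an event at this clock as (position, clock time, description)."""
        entry = (self.position, self.reading(frame_time), description)
        self.log.append(entry)
        return entry


# Build a small lattice whose clocks start with arbitrary, unsynchronised readings.
random.seed(0)
lattice = [
    LatticeClock((float(x), float(y), float(z)), offset=random.uniform(-5.0, 5.0))
    for x in range(-1, 2) for y in range(-1, 2) for z in range(-1, 2)
]
for clock in lattice:
    clock.sync_on_flash()

# After the flash has passed, all clocks agree at any given instant.
print({round(clock.reading(10.0), 9) for clock in lattice})
```

The printed set should contain a single value, and a call such as `lattice[0].record("photon detected", 3.0)` returns the kind of (place, time, description) triple that the text calls a measurement.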

Shortly we’ll see that when two or more events at different points in space occur simultaneously with respect to one inertial observer, with respect to another they generally occur at different times. Let’s be clear though that if two or more things happen at the same place at the same time then that’s an event and as such its reality is independent of any frame of reference. All observers must agree that it took place even if they assign to it different spacetime coordinates. Sometimes this is obvious. Consider two particles colliding somewhere. Then obviously the collision either took place or it didn’t and the question is merely what spacetime coordinates should be assigned to the location in spacetime of the collision. But other times it might seem a little more confusing. We might say that an observer, let’s call ‘her’ Alice, records that two spatially separated events, for example photons arriving at two different places, occur at the same time. Recall that this really means that at each location a clock records a time corresponding to the event there and these times turn out to be the same, let’s say 2pm. Now the clock striking 2 at a location just as the event takes place there is itself an event and so will be confirmed by any other inertial observer. Let’s call Bob our other observer. He will assign his own times to the two events, and, as we’ll see, he’ll find that his clocks record different times. However, recall that clocks don’t just tell time, they also record the event, so Bob will certainly confirm that Alice’s clocks both struck 2pm as the photons arrived at those points in spacetime but Bob will conclude that Alice’s clocks aren’t synchronised since from his perspective these two events did NOT occur simultaneously!

Notes:

  1. It’s worth remarking that if we reverse the roles of space and time the corresponding conclusion is not at all surprising. We are entirely comfortable with the fact that whether or not two events which take place at different times occur at the same place is a matter of perspective.

Space and time become spacetime

Physicists at the beginning of the 20th century were thus faced with a conundrum. They had Newton’s theory of mechanics, working in perfect harmony with Galileo’s principle of relativity, tested over two centuries and never once found wanting. Maxwell’s theory of electrodynamics was, by comparison, the new kid on the block. But its experimental confirmation, particularly thanks to the work of Heinrich Hertz (1857-1894) proving the existence of electromagnetic waves travelling at the speed of light, was compelling. Maxwell’s theory pointed to the future of physics, it signalled a radical departure from the ‘action at a distance’ concept implicit in the classical interpretation of the interaction of physical bodies. But what was to be made of its inconsistency with Galilean relativity and the null result of Michelson-Morley?

Both Hendrik Lorentz (1853-1928) and Henri Poincaré (1854-1912) made significant contributions to the solution of this puzzle but it was Albert Einstein (1879-1955) in his 1905 paper “On the Electrodynamics of Moving Bodies” who had the clarity and audacity of vision to see that what was required was nothing less than a radically new understanding of the relationship between space and time. His solution was as simple as it was bold. He declared that the laws of physics, including Maxwell’s equations, are indeed valid in all inertial frames of reference. In particular, this means that no matter how fast a light source is travelling, the light always travels at the same speed \(c\). The Galilean transformations between inertial frames were no longer tenable but, thanks to Lorentz, their replacement, the Lorentz transformations, were already known. They were part of a theory, “Lorentz Aether Theory”, which Einstein’s bold insight swept aside. Einstein was able to show that the Lorentz transformations were a natural consequence of the fundamental principle of the constancy of the speed of light. The aether was now redundant.

The Galilean transformations assume that time is absolute, that observers in uniform motion relative to one another agree on the rate at which time passes and so always agree on the time interval between two given events. It was this assumption of an absolute time, such a deeply intuitive notion, which Einstein had the brilliance to dispense with. Subsequent notes in this series will discuss in more detail the derivation and remarkable consequences of the Lorentz transformations but it’s worth having a look at them now to get a qualitative sense of their departure from the Galilean paradigm.
\begin{align*}
x’&=\frac{x-vt}{\sqrt{1-(v/c)^2}}\\
y’&=y\\
z’&=z\\
t’&=\frac{t-vx/c^2}{\sqrt{1-(v/c)^2}}
\end{align*}
Notice how the spatial and time coordinates have become intertwined. Notice also that in the limit \(c\to\infty\) the Lorentz transformations become the Galilean transformations. Over the course of the next few notes we’ll come to appreciate the speed of light as Nature’s speed limit. We’ll also see how Newtonian mechanics had to be modified to become consistent with this new principle of relativity.
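
To get a feel for these statements, the following Python sketch implements the \(x\) and \(t\) parts of the transformation in its standard form, \(t’=\gamma(t-vx/c^2)\) and \(x’=\gamma(x-vt)\), and compares it with the Galilean version. The function names and sample numbers are illustrative assumptions. It checks two things: the combination \((ct)^2-x^2\) is unchanged by the transformation, and at everyday speeds the two transformations are practically indistinguishable.

```python
import math


def lorentz(t: float, x: float, v: float, c: float = 1.0) -> tuple:
    """Transform (t, x) to a frame moving with velocity v along x."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (t - v * x / c**2), g * (x - v * t)


def galilean(t: float, x: float, v: float) -> tuple:
    """Galilean counterpart: time is absolute."""
    return t, x - v * t


def interval(t: float, x: float, c: float = 1.0) -> float:
    """The combination (ct)^2 - x^2."""
    return (c * t) ** 2 - x**2


# A fast frame, in units with c = 1.
t, x, v = 3.0, 2.0, 0.6
tp, xp = lorentz(t, x, v)
print(interval(t, x), interval(tp, xp))

# An everyday speed (30 m/s) in SI units: Lorentz and Galilean nearly coincide.
c_si = 299_792_458.0
print(lorentz(10.0, 1000.0, 30.0, c=c_si), galilean(10.0, 1000.0, 30.0))
```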

To this day special relativity is regarded as the correct geometric setting for all of physics except gravity. Three of the four fundamental forces known, the electromagnetic and strong and weak nuclear forces, are understood in terms of quantum field theories, a framework in which quantum mechanics and special relativity are successfully reconciled. Gravity though is specifically excluded in special relativity. Being an action at a distance theory, Newtonian gravity had no place in the new framework and it took Einstein 11 years to complete his monumental general theory of relativity which established gravity as curvature of spacetime. En route, in 1907, Einstein made a crucial observation regarding its nature. As I’ve already mentioned, gravity is a somewhat peculiar force in that it accelerates all masses equally. This led Einstein to the realisation that in free fall gravity is no longer perceptible. Nowadays this effect is familiar to us from footage of astronauts floating weightless in the International Space Station (ISS). Note that the space station isn’t in some sort of zero-gravity environment. On the contrary, the earth’s pull up there is only about 10% less than we experience on the ground. The ISS is simply falling. It is in free-fall, but doesn’t come crashing down to earth since it has just the required velocity perpendicular to ‘down’ to ensure that as fast as it’s falling, the earth is curving away from it so it maintains its orbit. In fact, though pretty thin, the atmosphere in the space station’s orbit creates a drag which requires periodic re-boosts to maintain this crucial balance. During these, the ISS is not in free-fall, an effect vividly demonstrated by its crew members on video.


The boosts ‘turn on’ gravity momentarily. This is in fact the crucial point. If you are in a windowless spaceship, there is no way to tell the difference between the spaceship being at rest on earth or being in deep space, far from any massive gravitation-inducing bodies, with its boosters on to provide an acceleration equal to that induced by earth’s gravity. The weightfulness will feel identical in both cases, just as the weightlessness experienced in free fall is no different from that which would be experienced in deep empty space. These are both examples of Einstein’s principle of equivalence upon which he based the general theory of relativity.

Though that is the beginning of a story for another day, we should now recall our definition of an inertial frame of reference as one in which a free test particle would have a constant velocity. We previously brushed over the issue of gravity. Now we see that something like the ISS is an excellent approximation to an inertial frame of reference. In fact, to be precise we should restrict attention to local reference frames. That is, windowless spaceships small enough so that tidal effects due to the non-uniformity of the gravitational pull are not perceptible [1]. With respect to a local frame of reference in free-fall such free test particles really do exist! Thus, real world inertial frames of reference, those to which Einstein’s special relativity applies, are local free-fall frames, sometimes called free-float frames.

But if inertial frames of reference are really free-fall frames then where does that leave our earth-bound ‘inertial’ frames? In particular, are we entitled to use special relativity in analysing particle trajectories at the LHC? Fortunately the answer is yes. Since we are in any case interested in understanding the behaviour of objects moving at or near light speed, over the relevant time scales gravity isn’t an issue. To see this we note that in a laboratory on earth in a time \(t\) a particle falls a distance \((1/2)gt^2\), where \(g\approx10\text{ms}^{-2}\) is the acceleration due to gravity. So if the smallest displacement we can detect is of the order of a micrometer, \(10^{-6}\text{m}\), (the best spatial resolution of the tracking devices at the LHC), then that corresponds to a falling time of \(t=\sqrt{2\times10^{-6}/10}\,\text{s}\approx4.5\times10^{-4}\text{s}\). That doesn’t sound like long but near light speed particles can cover distances of the order of \(100\text{km}\) in that time so no deviation from inertial, straight line, motion could be detected in a realistically proportioned earth-bound laboratory.
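
The estimate is easy to reproduce. In this Python sketch the variable names are illustrative, and \(c\) is rounded to \(3\times10^8\text{ms}^{-1}\) as an assumption made for a rough estimate; the values of \(g\) and the resolution are those quoted in the text.

```python
import math

g = 10.0             # acceleration due to gravity in m/s^2, as in the text
c = 3.0e8            # speed of light in m/s, rounded for the estimate
resolution = 1.0e-6  # smallest detectable displacement in m

# Invert d = (1/2) g t^2 for the time needed to fall one resolution length.
fall_time = math.sqrt(2.0 * resolution / g)

# Distance covered in that time by a particle moving at (nearly) light speed.
light_distance = c * fall_time

print(f"fall time: {fall_time:.2e} s")
print(f"distance covered near light speed: {light_distance / 1000.0:.0f} km")
```

The distance comes out far larger than any laboratory, which is the point of the argument.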

To summarise then, we can say that the known laws of physics are invariant under Lorentz transformations and these transformations relate inertial frames which are best understood as local free-fall frames in uniform relative motion with respect to one another. An earth-bound laboratory is a reasonable approximation of a free fall frame when considering motion at or near light speed since over sensible distances the relevant time frames are so short that gravity may reasonably be ignored. Newtonian physics is invariant under Galilean transformations. That physics and those transformations are the low speed approximations of relativistic mechanics and Lorentz transformations respectively. In either context gravity doesn’t have to be excluded and earth-bound laboratories are reasonable approximations of inertial frames of reference when earth’s rotational motion is irrelevant.

Notes:

  1. If two balls are in free fall together towards the earth and are a certain horizontal distance apart they will tend to move closer together since they are both being pulled towards the centre of the earth. Likewise two balls in free fall with a certain initial vertical separation will tend to move further apart since the pull on the closer of the two is greater than on the other.

The Michelson-Morley Experiment

Towards the end of the 19th century, it had become generally accepted that Maxwell’s equations, as presented by James Clerk Maxwell (1831-1879) in his 1865 paper “A dynamical theory of the electromagnetic field”, were the correct and unifying description of the physics of electricity and magnetism. Light was by then understood to be electromagnetic waves, with Maxwell’s equations specifying their speed in vacuum to be a universal constant of nature, \(c=299,792,458\text{ms}^{-1}\approx3\times10^8\text{ms}^{-1}\). The equations make it clear that the speed of light does not depend on the speed of the source. It was therefore assumed that light waves must propagate through some kind of material medium, a ‘luminiferous aether’, just as sound waves propagate, independent of the speed of their source, through air. Consistent with this belief, Maxwell’s equations are not invariant under Galilean transformations. The presumption was that they held only in those frames which happen to be at rest with respect to the mysterious aether; only in such a preferred frame would light travel in all directions at speed \(c\). But this state of affairs should then present an opportunity to detect the relative motion between earth and the aether. The most famous such attempt was the Michelson-Morley experiment of 1887 [1]. Here is a schematic of the optical interferometer used in their experiment.

image

Sodium light was split into two beams travelling at right-angles to one another. After travelling (approximately equal) distances \(L=11\text{m}\) each beam is reflected back to the beam splitter where they are recombined and directed towards a detector ready to observe interference fringes. The apparatus was mounted on a bed of mercury allowing it to be smoothly rotated. If by some miracle (the earth’s velocity relative to the sun is \(30\text{kms}^{-1}\) and \(200\text{kms}^{-1}\) relative to the centre of the Milky Way) the apparatus were at rest in the aether, then no shift in the observed interference fringes would be expected as the apparatus is rotated. Considering the more likely scenario of the interferometer travelling with some velocity \(v\) relative to the aether’s rest frame and aligned at an angle \(\theta\) to this direction, we consider the following schematic.

image

We can write down the following pairs of equations for the round trip paths taken by each of the pair of beams.
\begin{align*}
c^2{t_1}^2&=(L-vt_1\sin\theta)^2+v^2{t_1}^2\cos^2\theta,\\
c^2{t_2}^2&=(L+vt_2\sin\theta)^2+v^2{t_2}^2\cos^2\theta,
\end{align*}
and
\begin{align*}
c^2{T_1}^2&=(L+vT_1\cos\theta)^2+v^2{T_1}^2\sin^2\theta,\\
c^2{T_2}^2&=(L-vT_2\cos\theta)^2+v^2{T_2}^2\sin^2\theta.
\end{align*}

From these we can solve for the individual times. For example, the first equation gives
\begin{equation*}
(c^2-v^2){t_1}^2+2Lv\sin\theta t_1-L^2=0
\end{equation*}
so that, taking the positive root,
\begin{equation*}
t_1=\frac{-2Lv\sin\theta+2L\sqrt{v^2\sin^2\theta+(c^2-v^2)}}{2(c^2-v^2)}
\end{equation*}
and then
\begin{equation*}
t_1=\frac{-2Lv\sin\theta+2L\sqrt{c^2-v^2\cos^2\theta}}{2(c^2-v^2)}
\end{equation*}
Treating the other three times in the same way, we find the respective total round trip paths to be
\begin{equation*}
c(t_1+t_2)=\frac{2L\sqrt{1-(v\cos\theta/c)^2}}{1-(v/c)^2},
\end{equation*}and
\begin{equation*}
c(T_1+T_2)=\frac{2L\sqrt{1-(v\sin\theta/c)^2}}{1-(v/c)^2},
\end{equation*}and the path difference, which we’ll call \(\Delta(\theta)\), to be
\begin{equation*}
\Delta(\theta)=\frac{2L}{1-(v/c)^2}\left(\sqrt{1-(v\sin\theta/c)^2}-\sqrt{1-(v\cos\theta/c)^2}\right).
\end{equation*}If the apparatus is rotated through \(90^\circ\) then the path difference is \(-\Delta(\theta)\) so the expected fringe shift between the two orientations will be a fraction \(2\Delta(\theta)/\lambda\) of a wavelength where \(\lambda=589\times10^{-9}\text{m}\) is the wavelength of sodium light. Assuming the apparatus starts off with an orientation of \(\theta=45^\circ\) to the direction of relative motion, in which case \(\Delta(\pi/4)=0\), and assuming the aether is at rest relative to the sun so the relative velocity is \(v=30\text{kms}^{-1}\) we can plot the expected shifts.

The greatest shift is expected to occur between two alignments in which one arm is parallel and the other perpendicular to the direction of motion. In this case we have a fringe shift of approximately \(2Lv^2/(c^2\lambda)\approx0.37\). In fact Michelson and Morley found nothing of the sort, observing fringe shifts no bigger than 0.01 of a wavelength, which translates to a relative velocity of less than \(5\text{kms}^{-1}\) [2]. The extraordinarily slim chance that the aether and earth frames happened to be comoving with the same velocity relative to the sun was eliminated by repeating the experiment at three month intervals. The result was the same.
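As a rough numerical check of these figures, here is a minimal Python sketch that evaluates \(\Delta(\theta)\) and the fringe shift \(2\Delta(\theta)/\lambda\) from the formula above. The function and constant names are our own, chosen for illustration, and the inputs are simply the values quoted in the text (\(L=11\text{m}\), \(\lambda=589\times10^{-9}\text{m}\), \(v=30\text{kms}^{-1}\)).

```python
import math

C = 299_792_458.0   # speed of light in m/s
ARM = 11.0          # arm length L in m, as quoted in the text
WAVELENGTH = 589e-9 # wavelength of sodium light in m
V = 30e3            # assumed speed relative to the aether in m/s


def path_difference(theta, v=V, arm=ARM, c=C):
    """Path difference Delta(theta) between the two round trips, in metres."""
    beta = v / c
    return (2 * arm / (1 - beta**2)) * (
        math.sqrt(1 - (beta * math.sin(theta)) ** 2)
        - math.sqrt(1 - (beta * math.cos(theta)) ** 2)
    )


def fringe_shift(theta, v=V, arm=ARM, c=C, wavelength=WAVELENGTH):
    """Fringe shift, in wavelengths, between orientations theta and theta + 90 degrees."""
    return 2 * path_difference(theta, v, arm, c) / wavelength


# One arm parallel and the other perpendicular to the motion (theta = 0)
max_shift = fringe_shift(0.0)

# The leading-order approximation 2 L v^2 / (c^2 lambda)
approx_shift = 2 * ARM * V**2 / (C**2 * WAVELENGTH)

print(f"exact formula: {max_shift:.3f}, approximation: {approx_shift:.3f}")
```

Both numbers should agree with the 0.37 quoted above to two significant figures, and `fringe_shift(math.pi / 4)` should be zero up to rounding error.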

A somewhat bizarre but theoretically possible explanation for the Michelson-Morley result, that the earth somehow drags the aether with it, is ruled out by the well established phenomenon of stellar aberration. The gist of the issue here is familiar to anyone who’s noticed that when cycling through falling snow, the snowflakes seem to fall towards us from somewhere in the sky in front of us rather than, as we observe when stationary, straight down. If somehow the clouds producing the snow were moving with us this apparent shift in the source of the snow wouldn’t occur. Analogously, due to earth’s motion in orbit about the sun, the apparent positions of stars in the sky are shifted. This is stellar aberration and, if the aether were dragged along with earth, it wouldn’t be observed. But it is.

Notes:

  1. Albert Abraham Michelson (1852-1931) was an esteemed experimenter who in 1907 became the first American to win a Nobel Prize in science.
  2. Subsequent, more accurate, measurements reduced this to less than \(1\text{kms}^{-1}\).

The principle of relativity

Special relativity was introduced by Einstein in 1905 to reconcile inconsistencies between Newtonian mechanics and Maxwell’s electromagnetism. It turned out that Newton’s theory had to be brought into line with Maxwell’s, with the reformulated mechanics then able to correctly describe motion approaching or at light speed. Just as significantly, the theory demanded a radical reappraisal of the relationship between space and time. The notion that our local geometry is purely spatial with time somehow distinct, absolute and universal, had to be abandoned. Space and time, though different in character, had to be regarded as combining to play equally important roles in a richer spacetime geometry.

The theory rests upon the simple yet profound principle of relativity:

The laws of physics are identical in all inertial frames of reference.

The giants of physics, Galileo, Newton, Maxwell and Einstein, have each had a hand in shaping our understanding of this principle. Here I’ll try to place our current understanding of its meaning in the context of its historical development.

Galileo’s ship

In his 1632 “Dialogue Concerning the Two Chief World Systems”, Galileo Galilei (1564-1642) described a picturesque scene below deck of a ship featuring “small winged creatures”, fish, dripping bottles as well as a game of catch and some jumping about to illustrate a phenomenon well known to us all. Stuff behaves the same whether we and our immediate environment are stationary or moving uniformly. Sitting in an aeroplane, the windows shuttered and headphones on, are you parked on the runway or cruising at 30,000 feet? You have no way of distinguishing between these two possibilities. It’s only if the plane hits turbulence, its velocity suddenly changing, that you appreciate the importance of keeping your seat belt fastened and realise you’re perhaps further from the ground than you’d like! Galileo had identified that the way things move, mechanics, is identical whether our frame of reference, a laboratory in which we can measure distances and times, is stationary or moving with constant velocity. The laws of physics, as far as they were then understood, do not and cannot distinguish between frames of reference in uniform relative motion. This was Galileo’s principle of relativity.

Galilean invariance of Newton’s laws

Isaac Newton (1642-1727) formalised the laws of mechanics in terms of his three laws of motion and law of universal gravitation. These were presented, along with a great deal else, in his monumental “Principia” published in 1687. The first law of motion states that every body continues in its state of rest or uniform motion in a straight line unless compelled by some external force to change that state. Explicitly, this says that a free particle (one not acted on by any force) has constant velocity. It also says, implicitly, that there exists a frame of reference in which this is the case. Such a reference frame is precisely what is meant by an inertial frame of reference. In other words, the validity of Newton’s first law tests whether or not we are in an inertial frame. Furthermore, given one inertial frame, Galileo’s principle of relativity tells us that any other frame of reference moving uniformly with respect to it is also an inertial frame. One obvious question is where (on earth!) are these free particles? Nothing escapes gravity! There are of course special situations, a puck on an ice rink, particles with no mass, in which gravity can clearly be ignored, but Newton posits that quite generally, were it not for gravity, the natural state of all things is rest or uniform motion. Some justification for setting aside gravity in this way is provided by the following observation. Recall that Newton’s law of gravity says that the gravitational force exerted on a point mass \(m\) by another point mass \(M\) is given by
\begin{equation}
F=G\frac{Mm}{r^2}
\end{equation}where \(G\) is the gravitational constant and \(r\) is the distance separating the point masses. Now normally, in accordance with Newton’s second law, \(F=ma\), the acceleration due to an applied force is inversely proportional to the mass. In the case of gravity though, since the force itself is proportional to the mass on which it acts, it accelerates all bodies equally regardless of their mass and so can be regarded as a kind of overlay upon the existing physics.
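To make the cancellation of \(m\) concrete, here is a small Python sketch that combines \(F=GMm/r^2\) with \(F=ma\) for two very different masses at the earth’s surface. The function names and the rounded values used for \(G\) and for the earth’s mass and radius are our own illustrative assumptions rather than anything taken from the discussion above.

```python
G = 6.674e-11       # gravitational constant in N m^2 kg^-2 (rounded)
M_EARTH = 5.972e24  # mass of the earth in kg (rounded, for illustration)
R_EARTH = 6.371e6   # mean radius of the earth in m (rounded, for illustration)


def gravitational_force(big_mass, small_mass, r):
    """Newton's law of gravity, F = G M m / r^2."""
    return G * big_mass * small_mass / r**2


def acceleration(big_mass, small_mass, r):
    """Newton's second law, a = F / m, applied to the gravitational force."""
    return gravitational_force(big_mass, small_mass, r) / small_mass


a_feather = acceleration(M_EARTH, 0.001, R_EARTH)     # a 1 gram mass
a_cannonball = acceleration(M_EARTH, 10.0, R_EARTH)   # a 10 kilogram mass

print(a_feather, a_cannonball)
```

The two accelerations agree up to rounding error, both close to the familiar \(9.8\text{ms}^{-2}\), even though the forces differ by a factor of ten thousand.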

Mathematically, a frame of reference may be regarded as a coordinate system. To specify the location of a particle we need four coordinates, \(x,y,z,t\). Three, \(x,y,z\), specify its where and one, \(t\), specifies its when. Let’s call this coordinate system \(S\) and assume it corresponds to an inertial frame of reference. Of course any other coordinate system which is simply spatially translated and/or rotated with respect to \(S\) also corresponds to an inertial reference frame. More interesting though would be one which was also in relative motion with respect to \(S\). Let’s call the corresponding coordinate system \(S’\). To keep things simple let’s focus on the relative motion and assume that we’ve arranged that at \(t=0\) the coordinate systems are aligned with \(S’\) moving with a velocity \(\mathbf{v}=(v,0,0)\), that is, with speed \(v\) in the positive \(x\)-direction relative to \(S\).

image

Then at some time \(t\) the spatial coordinates of a point with respect to \(S’\) are related to its coordinates in \(S\) according to the simple equations,
\begin{align}
x’&=x-vt\nonumber\\
y’&=y\label{eq:Galilean_space}\\
z’&=z.\nonumber
\end{align}Notice that we’ve implicitly assumed that there is a single, absolute time. That is, time in \(S’\) is assumed to be the same as time in \(S\),
\begin{equation}
t’=t.\label{eq:Galilean_time}
\end{equation}Together, the equations \eqref{eq:Galilean_space} and \eqref{eq:Galilean_time} are called the Galilean transformations relating the inertial coordinate systems \(S\) and \(S’\).

An immediate consequence of the Galilean transformations is that velocities add. That is, if \(u_x\) and \(u’_x\) are the \(x\)-components of the velocities \(\mathbf{u}\) and \(\mathbf{u}’\) of a particle as measured in \(S\) and \(S’\) respectively then,
\begin{equation*}
u_x=\frac{dx}{dt}=\frac{dx’}{dt’}+v=u’_x+v,
\end{equation*}so that, together with the obvious relations for the \(y\)- and \(z\)-components, \(\mathbf{u}=\mathbf{u}’+\mathbf{v}\). In particular there is no notion of absolute rest. As Galileo had observed, one person’s state of rest is another’s uniform motion, it is a matter of perspective.
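The Galilean transformations and the resulting addition of velocities can be sketched in a few lines of Python. This is only an illustration under the setup above (\(S’\) moving with speed \(v\) along the positive \(x\)-direction of \(S\)); the function names and the numbers are our own hypothetical choices.

```python
def galilean_transform(event, v):
    """Map an event (x, y, z, t) in S to its coordinates (x', y', z', t') in S'."""
    x, y, z, t = event
    return (x - v * t, y, z, t)


def add_velocities(u_prime, v_frame):
    """Velocity in S of a particle with velocity u_prime in S', componentwise u = u' + v."""
    return tuple(a + b for a, b in zip(u_prime, v_frame))


v = 30.0    # hypothetical speed of S' relative to S, in m/s
u_x = 35.0  # hypothetical x-velocity of a particle as measured in S, in m/s

# Two events on the particle's worldline in S, at t = 0 and t = 2
event_0 = (u_x * 0.0, 0.0, 0.0, 0.0)
event_1 = (u_x * 2.0, 0.0, 0.0, 2.0)

# The same two events as seen from S'
x0p, _, _, t0p = galilean_transform(event_0, v)
x1p, _, _, t1p = galilean_transform(event_1, v)

# Velocity measured in S' is the change in x' over the change in t'
u_x_prime = (x1p - x0p) / (t1p - t0p)

print(u_x_prime)                                          # velocity in S'
print(add_velocities((u_x_prime, 0.0, 0.0), (v, 0.0, 0.0)))  # back to S
```

The velocity measured in \(S’\) comes out as \(u_x-v\), and adding \(\mathbf{v}\) back recovers the velocity in \(S\).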

Coordinate systems related by Galilean transformations are the mathematical abstraction of inertial frames of reference and Galileo’s principle of relativity set in this context is the statement that the mathematical expression of the laws of physics should be invariant under Galilean transformations. Take Newton’s second law as an example, \(\mathbf{F}=m\mathbf{a}\), now expressed in terms of 3-dimensional vectors, \(\mathbf{F}=(F_x,F_y,F_z)\) and \(\mathbf{a}=(a_x,a_y,a_z)\). If \(\mathbf{a}’\) is the acceleration as measured in \(S’\) then we have, say for its \(x\)-component, \(a’_x\),
\begin{equation*}
a’_x=\frac{d^2x’}{dt’^2}=\frac{d^2x}{dt^2}=a_x,
\end{equation*}and similarly for the \(y\)- and \(z\)-components, so \(\mathbf{a}’=\mathbf{a}\), that is, acceleration is invariant under Galilean transformation. To confirm that Newton’s second law is true in all inertial frames of reference then becomes the mathematical problem of checking, on a case by case basis, that all forces of interest are also invariant under Galilean transformations. In fact, most forces encountered in Newtonian dynamics depend only on relative position, relative velocity and time. So, since each of these is invariant under Galilean transformations, so are the forces. In particular, this is the case for the force of gravity between two objects since it is inversely proportional to the square of their separation.
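Both invariance claims can be checked numerically. The following Python sketch, with function names, trajectory and masses all chosen by us purely for illustration, estimates the acceleration of a uniformly accelerating particle in \(S\) and in \(S’\) by a second difference, and evaluates the gravitational force between two point masses before and after a Galilean boost.

```python
import math

G = 6.674e-11  # gravitational constant in N m^2 kg^-2 (rounded)
v = 10.0       # hypothetical speed of S' relative to S


def boost(position, t, v_frame):
    """Spatial coordinates in S' of a point at (x, y, z) in S at time t."""
    x, y, z = position
    return (x - v_frame * t, y, z)


def gravitational_force(pos_a, pos_b, mass_a, mass_b):
    """Magnitude of the gravitational force, which depends only on the separation."""
    r = math.dist(pos_a, pos_b)
    return G * mass_a * mass_b / r**2


def second_difference(x_of_t, t, h=0.5):
    """Central second difference; exact for a quadratic trajectory up to rounding."""
    return (x_of_t(t + h) - 2 * x_of_t(t) + x_of_t(t - h)) / h**2


def x_in_S(t):
    """A hypothetical trajectory in S with constant acceleration 4."""
    return 3.0 + 2.0 * t + 0.5 * 4.0 * t**2


def x_in_S_prime(t):
    """The same trajectory in S', using x' = x - v t and t' = t."""
    return x_in_S(t) - v * t


a_S = second_difference(x_in_S, 1.0)
a_S_prime = second_difference(x_in_S_prime, 1.0)

t = 7.0
pos_a, pos_b = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
force_S = gravitational_force(pos_a, pos_b, 2.0, 3.0)
force_S_prime = gravitational_force(boost(pos_a, t, v), boost(pos_b, t, v), 2.0, 3.0)

print(a_S, a_S_prime)
print(force_S, force_S_prime)
```

The boost shifts both positions by the same amount \(vt\), so the separation, and with it the force, is unchanged, while the term linear in \(t\) drops out of the second derivative.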

In most cases a frame of reference fixed to earth, such as the room in which you’re sitting, is a good approximation to an inertial frame. Technically though it isn’t. Consider for example earth’s rotation about its axis, which constitutes a radially directed acceleration. When working with such noninertial frames the acceleration of the frame manifests itself in the form of ‘fictitious’ forces: in the case of rotating frames, the Coriolis and centrifugal forces. Incidentally, it is a feature of such fictitious forces that, like gravity, they are always proportional to the mass of the object whose motion is being studied. Could it be that gravity is also somehow a fictitious force? This idea turns out to have considerable legs!

Galilean relativity in action: What happens when you drop a soccer ball with a table-tennis ball sitting on top?

If you’ve never tried it you should — seeing the table-tennis ball ping high in the air is pretty dramatic. Understanding this behaviour provides a nice example of the power of translating between inertial frames of reference using the Galilean transformations. It will be intuitively obvious that a table-tennis ball hitting a soccer ball will simply bounce back with essentially the same speed but in the opposite direction leaving the football unmoved. Now, when we drop the football with the table-tennis ball on top, for a split second after the football hits the ground, we have the two balls colliding with each other with equal and opposite velocities. Schematically, and imagined horizontally, the situation we wish to understand is this:
image
Now let us consider the situation from the perspective of a frame of reference moving to the right with velocity \(v\). In this frame the football is at rest and, thanks to the way velocities add in Galilean transformations, we know that the table-tennis ball is on a collision course travelling at a speed of \(2v\). As already mentioned, we know what happens in this situation: the table-tennis ball simply bounces back travelling in the opposite direction with the same speed \(2v\) and the football remains at rest. To understand the original problem we simply translate this outcome back to the original frame of reference to find the football still travelling at \(v\) but the table-tennis ball travelling at \(3v\). In other words, when we drop a football with a table-tennis ball on top, the table-tennis ball bounces back up at three times the speed at which the pair hit the ground!
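The frame-hopping argument translates directly into code. The Python sketch below is a minimal illustration, assuming, as in the argument above, that the football is so much heavier that it is unaffected by the collision. For comparison it also applies the standard formula for a one-dimensional perfectly elastic collision with finite masses; the function names and the two masses (roughly those of a real football and table-tennis ball) are our own illustrative assumptions, and real balls are of course not perfectly elastic.

```python
def bounce_off_heavy_ball(u_light, u_heavy):
    """Velocity of the light ball after bouncing off a much heavier ball."""
    u_rel = u_light - u_heavy   # transform to the heavy ball's rest frame
    u_rel_after = -u_rel        # there the light ball simply reverses direction
    return u_rel_after + u_heavy  # transform back to the original frame


def elastic_collision(m1, u1, m2, u2):
    """Final velocities in a one-dimensional perfectly elastic collision."""
    w1 = ((m1 - m2) * u1 + 2 * m2 * u2) / (m1 + m2)
    w2 = ((m2 - m1) * u2 + 2 * m1 * u1) / (m1 + m2)
    return w1, w2


v = 1.0  # speed at which the pair hits the ground, in arbitrary units

# Football moving at +v, table-tennis ball at -v, football treated as immovable
idealised = bounce_off_heavy_ball(-v, v)

# Same collision with assumed finite masses in kg
M_FOOTBALL = 0.43
M_TABLE_TENNIS = 0.0027
tt_after, football_after = elastic_collision(M_TABLE_TENNIS, -v, M_FOOTBALL, v)

print(idealised)                  # 3.0, that is, 3v
print(tt_after, football_after)   # slightly below 3v and slightly below v
```

With finite masses the table-tennis ball leaves at a little under \(3v\) and the football is slowed slightly, approaching the idealised result as the mass ratio grows.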