Northern Illinois University

www.niu.edu/~hbrown

Some Common Mistakes about Modern Physics

Unpublished work © 2007 Harold I. Brown
hibrown@niu.edu

Popular conceptions, often derived from popular literature, embody many misunderstandings of science. In this discussion I will examine several widespread and, unfortunately, influential misconceptions about modern physics.

1. Relativity and Relativism. No form of relativism, as this is generally understood, is supported by the theory of relativity in physics. Rather, relativity physics takes quite the opposite perspective. The name "theory of relativity" was introduced by Planck, and Einstein never cared for it. His preferred label was Invariantentheorie; if this preference had prevailed, a great deal of popular misunderstanding might have been avoided. The fact that either label could be applied indicates that some term is being used differently from the way it is used in everyday language. In this case, it is the term "relativity" which has a special meaning for physicists. One of the two main postulates of the special theory of relativity (which does not deal with accelerated frames of reference - "special" here means "restricted") states that laws of nature are the same for all frames of reference that are moving at constant velocity with respect to each other (known, for short, as Galilean reference frames). This postulate is known as "the principle of relativity" although it is actually a principle of non-relativity. If this use of language seems perverse, it is at least not unique. For many centuries the logical principle which states that a proposition cannot be both true and false was known as the "principle of contradiction," although recent texts tend to call it the "principle of non-contradiction." Special relativity has a second main postulate which is also a claim about non-relativity: that the speed of light is the same for all Galilean reference frames; I will return to this postulate in item 2. For the moment I want to develop the meaning of the first principle.

To a large degree this principle is already familiar from classical physics and, indeed, from Euclidean geometry. To see its import consider two points, a and b, in two-dimensional space along with the usual set of x and y axes (see figure below). I will use x for the distance between the x-coordinates of the two points, y for the distance between their y-coordinates, and d for the distance between a and b. From the Pythagorean theorem, $d^2 = x^2 + y^2$. Now consider a different set of coordinate axes with the origin moved to a different point in space and with the axes rotated around this new origin so that they form an angle with the original axes. On these new axes the coordinates of a and b, and the distances between coordinates on the two axes, will typically be different from those on the original axes. But if we use these new values to calculate $d^2$ we will get the same value as we arrived at previously. This will hold for any pair of axes we choose. The value of $d^2$ is an invariant that expresses an objective property - the distance between a and b. The values of coordinates of a and b, and of the distances on the coordinate axes, are relative to the axes chosen and, as a result, do not capture an aspect of nature. We can, however, calculate the same invariant from any set of coordinate axes. This example illustrates the meaning of the claim that all frames of reference are equivalent: however they vary (in the limited respects we have been considering), they yield the same value for the distance between a and b. This situation extends directly to 3-dimensional space and to higher dimensions if one is inclined to go in that direction. However, for reasons we will come to shortly, $d^2$ is not invariant in special relativity - which means that we will have to look further to find an invariant.
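The invariance of the squared distance under a shift and rotation of the axes can be checked numerically. What follows is only a minimal sketch in Python; the function names, the sample points, and the choice of new origin and angle are illustrative assumptions, not part of the original discussion.

```python
import math

def transform(point, origin, angle):
    """Coordinates of `point` on axes moved to `origin` and rotated by `angle` (radians)."""
    dx = point[0] - origin[0]
    dy = point[1] - origin[1]
    c, s = math.cos(angle), math.sin(angle)
    # Standard change of coordinates for rotated axes.
    return (c * dx + s * dy, -s * dx + c * dy)

def squared_distance(a, b):
    """d^2 = x^2 + y^2, where x and y are the coordinate differences."""
    x = b[0] - a[0]
    y = b[1] - a[1]
    return x * x + y * y

# Two points on the original axes (arbitrary sample values).
a, b = (1.0, 2.0), (4.0, 6.0)

# The same two points described on shifted, rotated axes.
new_origin = (-3.0, 5.0)
angle = math.radians(30)
a_new = transform(a, new_origin, angle)
b_new = transform(b, new_origin, angle)

print("coordinates of a:", a, "->", a_new)
print("d^2 on original axes:", squared_distance(a, b))
print("d^2 on new axes:     ", squared_distance(a_new, b_new))
```

The coordinates of each point change, while the two values of d^2 agree up to floating-point rounding.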

There is another familiar invariant from classical physics and everyday experience. Suppose you and I are using our watches to determine the time between two events - say the beginning and end of a race. Both watches run at the same rate, but they do not give the same time for a particular event because they are set differently. As a result, the reading of the time of an event on one of the watches has no significance as a property of some physical process. Yet each of us will get the same result for the length of the race. In effect, we each have a 1-dimensional system; your watch and my watch are different coordinate axes giving different times for an event. But we get the same invariant value for the time between two events. This invariant also fails according to special relativity, which introduces a new invariant known as the spacetime interval: $I^2 = x^2 + y^2 + z^2 - (ct)^2$. In this formula c is the speed of light in a vacuum and serves as a conversion factor so that all terms are measured in the same units. Some writers make the ct term positive and the space terms negative. Either convention is satisfactory; the important feature is that the signs for the space and time terms differ. Many central features of special relativity are embodied in this difference. Again, the present point is that each of us, using our own coordinate system, measures spatial and temporal gaps, then we use our (typically different) results to calculate the interval, and we get the same result for this term.
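The same kind of check can be made for the spacetime interval. The sketch below assumes the standard Lorentz transformation for a second observer moving at constant velocity v along the x axis (a result the text does not state explicitly), and the sample separations are arbitrary illustrative values.

```python
import math

C = 299_792_458.0  # speed of light in a vacuum, in m/s

def boost_x(t, x, v):
    """Time and x separations as measured by an observer moving at velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    t_new = gamma * (t - v * x / C**2)
    x_new = gamma * (x - v * t)
    return t_new, x_new

def interval_squared(t, x, y, z):
    """I^2 = x^2 + y^2 + z^2 - (ct)^2, with space terms positive."""
    return x**2 + y**2 + z**2 - (C * t) ** 2

# Separations between two events in the first frame (sample values).
t, x, y, z = 1.0e-6, 500.0, 100.0, 50.0

# The same pair of events described from a frame moving at 0.6c along x.
t_new, x_new = boost_x(t, x, 0.6 * C)

print("time gap:   ", t, "->", t_new)
print("x gap:      ", x, "->", x_new)
print("I^2 before: ", interval_squared(t, x, y, z))
print("I^2 after:  ", interval_squared(t_new, x_new, y, z))
```

The time gap and the spatial gap each change from one frame to the other, but the two values of I^2 agree up to rounding.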

[Figure: A bit of Euclidean geometry]

2. The Speed of Light. The second main postulate of special relativity states that the speed of light is the same for all Galilean reference frames. Arguably, the first postulate had long been implicit in physics; the effect of relativity is to rethink where the invariants are located. The second postulate is truly radical and is the source of special relativity’s most counter-intuitive results. Suppose, for example, that I am stationary with respect to a light source; light from that source travels towards me at the speed c. Now suppose that I am moving towards the source at some speed v relative to that source. No matter how high v is, the velocity of the light with respect to me will still be c, not c + v as it would be in classical physics. If I eventually pass the light source and move away from it, the speed of light from that source will still be c with respect to me.

One source of misunderstandings of this postulate arises because popular discussions of relativity often omit the crucial result that velocities do not add: velocities do not combine in the way assumed by commonsense and classical physics. Because of this neglect, one encounters the claim that we can "beat" the speed-of-light-limitation with the following procedure: a ship starting from the earth accelerates in a straight line to, say, 3/4c and then launches a daughter ship that accelerates to 3/4c with respect to the mother ship, and so forth. Let us consider what is wrong with this proposal.

The first point to note is that the only velocities countenanced by relativity are relative velocities. Velocity is not a relativistic invariant (except for the special case of the velocity of light). Thus when the postulate says that no velocity of a material object can exceed (or reach) c, we are talking about relative velocities. In our rocket example, the velocity of the daughter ship with respect to the earth must be less than c. Consider a particularly extreme example: two photons are approaching each other, each moving at velocity c with respect to any reference frame. What is the velocity of one of these photons with respect to the other? The answer is c. How do we get this answer? It follows from a result that Einstein derives in his first paper on relativity ("On the Electrodynamics of Moving Bodies"): if two objects are moving with velocities u and v, the combined velocity is $(u + v)/(1 + uv/c^2)$. For the two photons, u = v = c and the formula gives $(c + c)/(1 + c^2/c^2) = c$. Applying this to our rocket example, the velocity of the daughter ship with respect to earth is: $(\tfrac{3}{4}c + \tfrac{3}{4}c)/(1 + \tfrac{9}{16}c^2/c^2) = 0.96c$.
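The combination rule is easy to try out. Here is a minimal sketch in Python, with velocities expressed as fractions of c; the function name and the number of launch stages are illustrative assumptions.

```python
def combine_velocities(u, v, c=1.0):
    """Relativistic combination of velocities u and v: (u + v) / (1 + uv/c^2)."""
    return (u + v) / (1.0 + u * v / c**2)

# Daughter ship launched at 3/4 c from a mother ship moving at 3/4 c.
print(combine_velocities(0.75, 0.75))

# Two photons approaching each other.
print(combine_velocities(1.0, 1.0))

# Repeating the launch procedure: each new ship reaches 3/4 c relative to the last.
speed = 0.0
for stage in range(1, 6):
    speed = combine_velocities(speed, 0.75)
    print("after stage", stage, "speed relative to earth =", speed)
```

Each stage brings the speed relative to the earth closer to c, but the formula never yields a value that reaches it.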

3. Riemannian Geometry. Many people will tell you that general relativity uses Riemannian Geometry, but many of those who say this do not know that there are two different mathematical structures that are known as "Riemannian Geometry." One of these is a member of the triad Euclidean, Lobachevskian, Riemannian Geometry. This is the Riemannian Geometry that many have heard of, but it has nothing to do with general relativity. All the geometries in this triad have constant curvature throughout space, but general relativity requires a geometry that allows for variable curvature from point to point. In general relativity there is a deep relation (some say identity) between the distribution of matter in spacetime and the geometry of spacetime. Because of this, the geometry must be as variable as the distribution of matter. The other Riemannian Geometry - also known as "differential geometry" - allows for the formulation of the appropriate geometries.

Picky note: As originally developed, Riemannian Geometry has three constraints that are relaxed in general relativity: the interval between two points must be positive or zero, the interval can be zero only if the points coincide, and all directions must be treated in the same way (isotropy). In general relativity $I^2$ can have negative values, the interval between any two points on a light ray is zero, and the time direction is treated somewhat differently from the space directions - as indicated by the difference in sign in the expression for the interval. Because of this difference many contemporary physicists talk of "3 + 1" dimensions rather than 4 dimensions. Thus, strictly speaking, the geometry of general relativity is not Riemannian Geometry. There are at least three ways of dealing with this in the literature. Some write of "pseudo-Riemannian Geometry," some of "quasi-Riemannian Geometry," and some just extend the use of the label "Riemannian Geometry." Note also that Minkowski geometry, the spacetime geometry of special relativity, violates these constraints, although it is a geometry of constant curvature.

4. The Uncertainty Principle. This principle says that certain pairs of parameters cannot be simultaneously determined with arbitrary degrees of precision. Put differently, measuring one of them changes the other. Location and momentum are a typical pair. The relation between them is expressed in the formula (using $x$ for location and $p_x$ for momentum in the x direction): $\Delta x \, \Delta p_x \geq h/4\pi$. In this formula $\Delta x$ is the error of measurement of location in the x direction, $\Delta p_x$ is the error in measurement of momentum in the x direction (i.e., the x component of the momentum), and h is Planck’s constant. This result puts no limitation on the accuracy of measurement of either parameter taken alone. Note that there is no uncertainty relation between, for example, distance in the x direction and the y component of momentum. There is also no uncertainty relation between the x and y components of location or the x and y components of momentum. It is extremely important that uncertainty relations do not hold between all parameters. For example, locating an electron in an atom requires four parameters that are simultaneously determinate, so there must not be an uncertainty relation between any pair of these, such as between radial distance from the nucleus and angular momentum. However, angular momentum is a vector quantity which can be resolved into components in the three directions of space, and there is an uncertainty between any two of these components.
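To get a feel for the sizes involved, the relation can be read as a lower bound on the product of the two errors, so that the smallest momentum error compatible with a given location error is h/4π divided by that location error. The sketch below is only an illustration; the function name and the sample location errors are assumptions chosen for the example.

```python
import math

H = 6.62607015e-34  # Planck's constant in joule-seconds (defined SI value)

def min_momentum_uncertainty(delta_x):
    """Smallest delta p_x (kg*m/s) allowed by delta x * delta p_x >= h / (4 pi)."""
    return H / (4.0 * math.pi * delta_x)

# Location errors in metres: roughly atomic size, then ten times finer.
for delta_x in (1.0e-10, 1.0e-11):
    print("delta x =", delta_x, "m  ->  delta p_x >=", min_momentum_uncertainty(delta_x), "kg*m/s")
```

Narrowing the location error by a factor of ten raises the minimum momentum error by the same factor, while neither error is limited when taken by itself.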

Not-so-picky note. There are various disputes about the correct interpretation of the uncertainty principle. An especially important dispute concerns whether we are dealing with relations between what we can know about parameters or relations between the parameters themselves. In other words, does the principle say that we cannot simultaneously measure the exact position and momentum of a particle, or does it say that the particle does not simultaneously have an exact position and momentum? I have written the above paragraph in terms of the measurement interpretation because this is the way it is most often presented in popular works, but it is not my intention to favor one of these in the present discussion.

5. Spin. Spin is one of the fundamental properties recognized by particle physics, but this is not an actual spinning of a particle on its axis - although that is how it was introduced. Initially, spin was attributed to electrons in order to account for an anomaly in certain spectra. According to Niels Bohr’s theory of the atom, electrons move in circular orbits around the nucleus. Only certain orbits are permitted, and each orbit is associated with a specific energy. Photons are emitted when an electron jumps from its orbit to one of lower energy. The spectral lines that we detect are the result of a large number of jumps among various orbits that occur in a sample consisting of millions of atoms. Bohr’s original account was soon modified by Arnold Sommerfeld, who replaced the circular orbits with ellipses. This involved two changes that led to an improvement in the account of the spectral lines. First, a circle is specified by a single parameter - its radius. An ellipse requires two parameters, allowing for a greater variety of spectral lines. Second, since the Sommerfeld model locates the nucleus at a focus of an ellipse, the distance between the nucleus and the electron varies. This requires a variation in the speed of the electron in accordance with Kepler’s law which tells us that a line from the focus to the orbiting particle traces out equal areas in equal times. The electron thus moves faster when it is closer to the nucleus, and its speed is great enough to require corrections from special relativity. Nevertheless, problems remained. In particular, there were cases in which theory predicted a single spectral line, but experiment resulted in two spectral lines, one of higher energy than the predicted line, and one of lower energy. Moreover, these two lines were equally spaced from the predicted line.

Around 1925, before the development of modern quantum theory, at least three physicists - Samuel Goudsmit and George Uhlenbeck working together, and Ralph Kronig working independently - had the thought that these lines could be explained if the electron spun on its axis. This would introduce a magnetic field which would lead to an additional energy term to be combined with the predicted energy. Since the electron could spin in either a clockwise or counterclockwise direction, its energy would be either added to or subtracted from the predicted energy, thereby accounting for the observed spectral lines. Kronig discussed his proposal with Wolfgang Pauli, Werner Heisenberg, and Hendrik Kramers who pointed out major problems; as a result, Kronig never published his proposal. Goudsmit and Uhlenbeck were, at the time, students of Paul Ehrenfest who encouraged them to write a note which he then sent to a journal. At the time, they were aware of one problem: viewing the electron as a rotating sphere, the speed of the surface would be many times the speed of light. Shortly after the paper was completed, Ehrenfest arranged for a discussion with Hendrik Lorentz - one of the leading theorists in the world. On the basis of energy considerations Lorentz noted (among other problems) that either the mass of the electron would be greater than that of the proton or, maintaining the known mass, the radius of the electron would be larger than that of the entire atom. Goudsmit and Uhlenbeck attempted to withdraw their paper, but it was too late and they are generally credited with “discovering” electron spin. The energy and magnetic field associated with spin, but without the problems mentioned, reappeared in 1928 when Paul Dirac published a theory of the electron that met all the constraints of both quantum theory and special relativity.

The property responsible for this magnetism is still referred to as “spin,” but it is very different from the classical idea of spin that we have been discussing. Consider the major features of spin according to quantum theory.

All fundamental particles have spin, not just the electron, and each instance of a given kind of elementary particle has exactly the same spin. The quantitative value of spin is expressed as a number, s, times Planck’s constant, h, divided by 2π. The value of s is always either an integer or a half integer. In the case of the electron s = 1/2 and the electron is described as a “spin-1/2” particle. In addition, spin has components in specific directions; it is these components that have the values of +1/2 or -1/2. Other elementary particles have s = 1 and three components with values 1, 0, and -1. The currently hypothetical graviton is a spin-2 particle, which means it has components with values 2, 1, 0, -1, -2. In general, for a given value of s there are 2s + 1 components running from -s to s in unit steps. Non-elementary particles have a spin value that is determined by the spins of their elementary constituents (not by the size and rotation speed of the particle). These are not just convenient labels. They are empirically testable results that play a role not only in endeavors such as the explanation of spectral lines, but also in existing and projected technologies that make use of the magnetic properties associated with spin. The read-heads used in computer hard drives provide one example of an existing technology; a new type of transistor is an example of a projected technology. The values of these magnetic properties are different from those that would be predicted by treating spin as a rotation. In addition, there are important differences between the behavior of particles with integer spin and those with half-integer spin. To take but one instance, the Pauli exclusion principle, which implies that no two electrons in an atom can be in the same state, and plays a key role in our understanding of chemical properties, applies to half-integer-spin particles, but not to particles with integer spin.
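The counting rule for components, 2s + 1 values running from -s to s in unit steps, can be written out directly. This is a small sketch in Python that uses exact fractions so the half-integer values are not blurred by rounding; the function name is an illustrative assumption.

```python
from fractions import Fraction

def spin_components(s):
    """List the 2s + 1 component values from -s to +s in unit steps."""
    s = Fraction(s)
    # s must be a non-negative integer or half-integer.
    if s < 0 or (2 * s).denominator != 1:
        raise ValueError("s must be a non-negative integer or half-integer")
    count = int(2 * s) + 1
    return [-s + k for k in range(count)]

for s in (Fraction(1, 2), 1, 2):
    values = spin_components(s)
    print("s =", s, "->", len(values), "components:", [str(v) for v in values])
```

For s = 1/2 this gives the two values -1/2 and +1/2, for s = 1 the three values -1, 0, 1, and for s = 2 the five values from -2 to 2.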

Next, note that in classical physics angular momentum can be represented by a vector. This is not an arbitrary convention. In a mathematical theory a property must behave in certain specific ways in order to be legitimately represented by a vector. Spin is not a vector quantity. For example, “rotating” an electron through 360° does not bring it back to its previous orientation; that requires a 720° “rotation.” The proper representation of spin required introduction of a new type of mathematical object, now known as a “spinor.”
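The 360° versus 720° behavior can be seen in a few lines of arithmetic. The sketch below assumes the standard quantum-mechanical rule for rotating a two-component spin-1/2 spinor about the z axis, in which the two components pick up the phase factors exp(-iθ/2) and exp(+iθ/2); that rule is not stated in the text, and the sample spinor is arbitrary.

```python
import cmath
import math

def rotate_spinor_z(spinor, angle):
    """Rotate a two-component spin-1/2 spinor about the z axis by `angle` radians."""
    up, down = spinor
    return (cmath.exp(-1j * angle / 2) * up, cmath.exp(1j * angle / 2) * down)

psi = (0.6 + 0j, 0.8j)  # arbitrary normalized sample spinor

after_360 = rotate_spinor_z(psi, 2 * math.pi)
after_720 = rotate_spinor_z(psi, 4 * math.pi)

print("original:  ", psi)
print("after 360: ", after_360)  # every component has changed sign
print("after 720: ", after_720)  # back to the original, up to rounding
```

Because the angle enters the phases halved, a full turn multiplies the spinor by -1 and only a double turn restores it, which is behavior an ordinary vector cannot show.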

Finally, elementary particles are considered to be point particles, that is, particles with no size. From this perspective, the notion of an elementary particle spinning on its axis makes no sense. Spin is an intrinsic property of the particle; it is not a consequence of other properties.

There is a general lesson to be learned from this example about the language in which theories are expressed. It is common to introduce a familiar term on the basis of an initial understanding of a subject and then to retain that term as it becomes clear that the initial account requires serious modification. This is true of the notion of quantum theory, which has developed to include continuous changes, as well as discontinuous jumps from one state to another. It is also true of contemporary chaos theory which deals with fully deterministic phenomena even though this was not recognized when its standard name was introduced. In such cases, the term being used takes on a new meaning. Assuming that the meaning of a term used in a particular discipline is the same as its meaning in other contexts provides one common route to misunderstanding.

6. Orbital Angular Momentum. In addition to spin, electrons in an atom have an angular momentum that was originally taken to be the result of the motion of electrons orbiting around the nucleus. In quantum theory electrons also have a property referred to as “orbital angular momentum,” although its properties differ significantly from classical angular momentum and it can no longer be attributed to such motion. One departure from the classical notion is incorporated into Bohr’s theory of the atom where angular momentum is quantized and thus limited to a small set of permissible values. There is a basic unit of angular momentum and the angular momentum of a particular electron is an integer, l, times this basic unit. In addition, the energy of the electron is quantized, with the allowed energy levels labeled by an integer, n, whose smallest allowable value is 1. In quantum theory, for a given value of n there are n permissible values of l running from 0 to n - 1. For example, at the third energy level n = 3 and l has the permissible values 0, 1, and 2. In other words, for each value of n one of the permissible values of l is zero; l = 0 is the only permissible value when n = 1. A circulating electron cannot have zero angular momentum: this would be a problem if electrons actually moved around the nucleus, but it is not a problem once we abandon this inappropriate image.
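The rule relating n and l can be tabulated in a couple of lines. This is a minimal sketch in Python; the function name is an illustrative assumption.

```python
def permissible_l(n):
    """Permissible values of l for energy level n: the integers 0 through n - 1."""
    if n < 1:
        raise ValueError("the smallest allowable value of n is 1")
    return list(range(n))

for n in (1, 2, 3):
    print("n =", n, "-> l in", permissible_l(n))
```

Every level includes l = 0, and for n = 1 it is the only permissible value.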

7. Young Einstein. This section does not deal with a mistake about a physical theory, but rather with some confusions about an iconic physicist that support other confusions about physics and about science in general.

In 1905 Einstein was not a clerk at the patent office (nor a clerk at the post office, a view I have also encountered); he was a patent examiner. His title was Technical Expert Third Class and the job required technical training. In contemporary terms, Einstein was ABD in physics. He had been through the most advanced program at ETH (Eidgenössische Technische Hochschule, The Swiss Federal Institute of Technology in Zurich), passing his exams in 1900. At this point he needed only a dissertation to receive a doctorate. After completing the program, Einstein had hoped for a position as an assistant at ETH; this would have been the normal next step towards an academic career. But he did not get such a position - apparently because his arrogance had alienated the faculty who made the decision. This led to an unsatisfactory search for a job and to the patent office. He completed his doctorate in 1905 (in addition to writing the four famous papers) with a dissertation entitled, "On a New Determination of Molecular Dimensions." In 1906 he was promoted to Technical Expert Second Class.

Einstein was not a mathematician; he was a theoretical physicist. Mathematics was his central research tool, but he did not develop new mathematics. (Newton, by contrast, was both a theoretical physicist and a mathematician.) In his general theory of relativity Einstein used mathematics (tensor analysis) that was not then part of the physicist’s standard repertoire, but he did not invent this mathematics. Moreover, he accepted the need to use such highly advanced mathematics only after considerable resistance. In his early days Einstein had disdain for the more subtle forms of mathematics. The mathematics used in his great 1905 papers would, today, be mastered by a student with an undergraduate degree in physics. In 1908 Hermann Minkowski published the 4-dimensional spacetime version of special relativity, reformulating the theory in terms of tensors. Minkowski was a mathematics professor at ETH and was one of the people that Einstein alienated. When Minkowski read Einstein’s initial papers on special relativity he commented that he did not think Einstein had it in him, but he immediately saw its power and developed it further. Einstein was not impressed, brushing off Minkowski’s work as "superfluous learnedness." Eventually Einstein came to recognize this work as an important step towards general relativity: "in 1912 he adopted tensor methods and in 1916 acknowledged his indebtedness to Minkowski for having greatly facilitated the transition from special to general relativity" (Pais, p. 152). In his later life Einstein regularly worked with a mathematical consultant.

The speed with which Minkowski began working on relativity suggests that stories of great resistance to the new ideas are, at best, exaggerated. To be sure, not everyone instantly applauded. That never happens with new ideas among humans. There was both misunderstanding and resistance to some of Einstein’s ideas, such as his announcement that he had eliminated any need for an ether - a claim that he later recognized to be premature. But by and large professional physicists quickly recognized the importance of the new theory. Einstein received his first honorary doctorate, from the University of Geneva, in 1909. He was first nominated for the Nobel Prize (by Wilhelm Wien, himself a Nobel laureate) in 1912. Summarizing the response, Pais writes (p. 153): "Then [i.e. in 1912] and later the special theory would have its occasional detractors. However, Wien’s excellent account shows that it had taken the real pros a reasonably short time to realize that the special theory of relativity constituted a major advance."

Einstein was not always an elderly man with wild white hair.

[Figure: Einstein at 35.]

It is not my aim in this note to tear down Einstein. He was one of the most outstanding thinkers in human history. Even among the great physicists - themselves an intellectually outstanding group of people - Einstein and Newton are in a class by themselves. In many respects Einstein was also an exceptionally admirable human being. His actual accomplishments do not need mythical embellishments. Further, the myths do harm when they lead people to believe that, since Einstein was an untutored outsider, they too can do great things without bothering to study the subject they would revolutionize.

Some References

Kostro, Ludwig (2000), Einstein and the Ether, Montreal: Apeiron.

Pais, Abraham (1982), ‘Subtle is the Lord ...’: The Science and the Life of Albert Einstein, New York: Oxford University Press.

Rigden, John (2005), Einstein 1905: The Standard of Greatness, Cambridge MA: Harvard University Press.

Stachel, John (1998), Einstein’s Miraculous Year: Five Papers that Changed the Face of Physics, Princeton NJ: Princeton University Press.

Tomonaga, Sin-itiro (1997), The Story of Spin, trans. Takeshi Oka, Chicago: University of Chicago Press.