The Unabashed Academic

16 January 2019

Problem solving by changing the box and the structure of parameters

Some of my readers may recall that as a physics teacher and education researcher, I'm interested in how mathematics is used in science. The diagram that helps me think about how it works is my "four-box model" below.
The key idea is that we look at a physical system and decide that some of the measurements we use to quantify our description of the physical world behave in a way that reminds us of how some mathematical system works. We then treat our measured numbers as if they were mathematical quantities (modeling) and use the calculational tools that come along with the mathematical system to solve problems that we couldn't otherwise solve (processing). We then see what those solutions tell us about our physical system (interpreting) and decide whether the results are any good (evaluating). If they are, it encourages us to blend our physical concept with the mathematical one, building a rich mathematical description of the physical world. [See Redish & Kuo]

I recently had an experience that demonstrated to me that this sometimes works in unexpected ways, and in doing so, gave me some insights both into the structure of parameters in equations and into how I think about math in science.

We spent last Thanksgiving in Estes Park, Colorado with friends and relatives. Since Estes Park is at 7500 feet, since a few of us were pretty old (in our 70s and 80s), and since we were all science geeks, one of us brought an oximeter and everyone wanted to try it. Some older folks (like me) have a tendency to respond poorly to the low levels of oxygen at high altitudes by experiencing dizziness, nausea, and headaches. The oximeter is a device that you stick on your finger and it measures how well oxygenated your blood is. If you fall below 90% you need to worry.

When I got home, I ordered one from Amazon. (It's a measuring device! Of course I have to have one! It's only $15!) When my (physicist) friend Royce came to visit a couple of weeks later, I showed it to him and he asked, "How does it work?" Hmmm. Although we had asked that question in Estes Park, the answer "It measures the transmission of light through your finger, and oxygenated and non-oxygenated blood absorb the light differently" had satisfied us. But, as usual, Royce likes to go deeper. We looked up some Wikipedia articles and found some interesting physics and math involved. Look it up if you like, but that's not the point of this blog.


The point of this blog is about playing with the simple algebraic equations that come up in the first phase of studying (a simplified version of) the absorption problem. Here's how it works.

The oximeter uses LEDs of two different colors. Oxygenated and non-oxygenated blood absorb these two colors at different rates. The red and blue curves in the figure show how oxygenated and non-oxygenated blood absorbs different frequencies of light (horizontal axis). 
(Figure: absorption curves for oxygenated and non-oxygenated blood. By Adrian Curtin, own work, CC BY-SA 3.0.)

Suppose we have color 1 (say 700 nm) and color 2 (say 900 nm) and that oxygenated blood absorbs them at rates a₁ and a₂, while non-oxygenated blood absorbs them at rates b₁ and b₂. (The values are the places where the curves cross the dotted lines.) Then suppose we have a mix of x amount of oxygenated blood and y amount of non-oxygenated blood. We want to calculate the ratio y/x. (Actually, x/(x+y), but the math there is not as interesting.)

First, let's set up the equations. Our results for colors 1 and 2 will be r₁ and r₂, where

r₁ = a₁x + b₁y
r₂ = a₂x + b₂y

Well, this is pretty trivial.* We have two equations in two unknowns. Solving them just requires 9th-grade algebra. One way to do it is by substitution. Solve the first equation for y in terms of x, then substitute that result into the second equation to get an equation for x alone. Solve that equation for x. Then put the result for x into the first equation and get the result for y. Then take the ratio y/x. Here's what it looks like. (You don't need to follow it through. I just want to show how messy it looks.)
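
A sketch of that substitution, starting from the two equations above:

$$y = \frac{r_1 - a_1 x}{b_1}, \qquad r_2 = a_2 x + b_2\,\frac{r_1 - a_1 x}{b_1} \;\Rightarrow\; b_1 r_2 = (a_2 b_1 - a_1 b_2)\,x + b_2 r_1$$

$$x = \frac{b_2 r_1 - b_1 r_2}{a_1 b_2 - a_2 b_1}, \qquad y = \frac{r_1 - a_1 x}{b_1} = \frac{a_1 r_2 - a_2 r_1}{a_1 b_2 - a_2 b_1}$$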

This is straightforward, but tedious, especially with all the parameters to keep track of. (Amazingly, I got it right the first time! Pat self on back.) But the solution has an interesting structure to it. Here's the result:

y/x = (a₁r₂ − a₂r₁)/(b₂r₁ − b₁r₂)

Messy, but it has a pleasing symmetry. Wait a second! That's familiar! It looks like a cross product. Remember that if I have two three-space vectors, say A = (A₁, A₂, A₃) and B = (B₁, B₂, B₃), then

A × B = (A₂B₃ − A₃B₂, A₃B₁ − A₁B₃, A₁B₂ − A₂B₁)

The third component, A₁B₂ − A₂B₁, looks like both the numerator and the denominator of our result.

Usually we treat a parameter in a physical problem as an "enriched number"; that is, it's not just a number, it's a number with units. But we don't typically give it more structure, at least in introductory physics.

But suppose we use a position vector as a metaphor for our set of parameters. Let's define a 3-dimensional parameter space, with dimensions 1, 2, 3. So (a₁, a₂, a₃), (b₁, b₂, b₃), and (r₁, r₂, r₃) would be vectors in this space. We have to use a 3-space since you can't do cross products in 2-space; we'll just make a₃ = b₃ = r₃ = 0.

Then we could write our two equations in two unknowns as a single vector equation with three components (the third being 0 = 0):

r = x a + y b

We want to solve for x and y. From our experience with cross products, we know that a vector crossed with itself is zero.** So we can easily solve for x and for y by taking cross products with the vectors a and b. Since the third component of each of our vectors is 0, we only have to look at the third component of the cross products; the 1 and 2 component terms will each contain a third component of some vector and therefore be equal to zero. Crossing our vector equation with b kills the y term:

r × b = x (a × b)

Similarly, crossing it with a kills the x term:

a × r = y (a × b)

and using the fact that the third component of a cross product is

(A × B)₃ = A₁B₂ − A₂B₁

we find that our ratio is

y/x = (a × r)₃ / (r × b)₃ = (a₁r₂ − a₂r₁)/(r₁b₂ − r₂b₁)

which is the same as I got before.
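
If you want to convince yourself that the two routes agree, here's a quick numerical check (the absorption numbers are invented; only the algebra matters):

```python
import numpy as np

# Invented absorption rates for colors 1 and 2, padded with a third component of 0
# so that we can use 3-space cross products
a = np.array([0.7, 0.3, 0.0])   # oxygenated blood
b = np.array([0.2, 0.9, 0.0])   # non-oxygenated blood

x_true, y_true = 0.85, 0.15     # assumed amounts of oxygenated / non-oxygenated blood
r = x_true * a + y_true * b     # the two "measured" readings r1, r2 (third component 0)

# Cross-product solution: only the third components are nonzero
x = np.cross(r, b)[2] / np.cross(a, b)[2]
y = np.cross(a, r)[2] / np.cross(a, b)[2]

print(y / x)             # 0.17647..., the reconstructed ratio
print(y_true / x_true)   # same number
```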

Somehow I find this a lot more satisfying than just churning. Why do I feel that way? Both calculations are straightforward, but the second approach feels more elegant.

I think the answer is that the second solution doesn't just look at the problem and say, "This is a turn-the-crank algebra problem." It makes use of the symmetry in the parameter structure of the equations to embed the simple algebra problem of numbers and symbols in a larger mathematical structure: vectors. This is a more complex mathematical structure that includes standard algebraic manipulations as a subset. Embedding the algebra problem in vectors allows me to use the more powerful machinery of vectors (and cross products).

This reminds me of the classic Gauss trick. The story goes that in 4th (?) grade the teacher wanted to keep the students quiet, so he set them the task of adding all the numbers from 1 to 100, expecting it to take them a long time. Gauss came up with the answer a minute later. Instead of just adding the numbers, he paired them: 1 + 100 = 101, 2 + 99 = 101, 3 + 98 = 101… and saw that there was a symmetry in the problem. Since there will be 50 such pairs, the answer is 50 × 101 = 5050. The teacher saw the task as one using a particular mathematical structure: addition. Gauss reframed the task to make use of a more powerful mathematical structure: multiplication.

In both cases, we see the value of not just taking the obvious mathematical structure for granted, but of looking at what additional mathematical structure we can embed our task in so as to have access to more powerful tools. I suspect that this kind of analysis also gives insight into the ontology of equations – what kind of things I think the elements of an equation are and what I can do with them. I hope to write more about this later.

I note that the advanced research literature is filled with examples of reframing an equation to treat it by more advanced mathematical techniques, but I can't think of too many more examples that would be obvious to a sophomore physics major. Anyone else have some examples to share?

* This is a simplified model of the system. Your finger is not just made of blood! Some processing has to be done to take out the absorption of the light by the flesh and this is done by looking at the oscillatory pattern of the signal. But since that's not our point here, we'll stick with the simpler model.

** One way of thinking about this is to consider the cross product geometrically. The cross product of two vectors has a magnitude equal to the area of the parallelogram created by the two vectors. If the two vectors are in the same direction, that area is zero.

07 September 2016

Work and energy in introductory physics

Some of my Physics Education Research Facebook friends have been questioning the value of some forms of the work-energy theorem in teaching introductory physics for life sciences (IPLS). Since over the past few years I have made this theorem an important component of my teaching of mechanics, I thought I'd take the opportunity to describe how I do it.
I've been teaching a – what? – reformed? reinvented? renewed? – IPLS course.[1] I struggle with the adjective since our reformation process went beyond the usual course reform. We spent a lot of time discussing (and arguing about) content and approach with biologists and chemists, and a lot of time researching student responses and what they brought to the table. This produced a deep philosophical change in the way we designed the class and in the way I teach it.
We learned that it wasn't just that many life science students didn't know the required math (perhaps because they hadn't used it in previous courses) or that they weren't familiar with physics concepts (perhaps because they hadn't taken high school physics, or if they had, hadn't taken it seriously). Rather, there were some serious barriers in the way many students were thinking about the nature of the scientific knowledge they were learning – epistemological barriers, if you will.
Here are some of the issues that we found:
·      Life science students often saw scientific knowledge as bits and pieces of memorized knowledge, failing to build a coherent picture. Although they had learned some heuristics (usually in chemistry), they had little or no experience with the use of deep and powerful principles such as those that drive even introductory physics.
·      Life science students often were deeply skeptical of highly simplified "toy" models. Since life depends in a critical way on complexity, simplification was seen as "playing irrelevant games." Few had any experience with the concept of modeling and few understood the insight that could be derived from studying simplified systems.
·      Even when life science students knew and were comfortable with the required math, almost all saw math as a way to calculate rather than as a way to think about physical relationships. They were missing not only the skill of estimation and intuitions of scale, but also the ability to read qualitative implications from equations.
Our course is designed to address these epistemological issues as well as the issues of choosing content relevant for life science students (like doing more fluids and including diffusion and random walk). We try to stress coherence, modeling, and the value of using equations to build understanding and insight. The work-energy theorem plays a pivotal role in this structure.
Newton's three laws form the framework for building understanding of mechanics and building models of physical motion. I treat the three laws [2] as the basic structure. Any analysis of a particular motion requires a model – a choice of what we are going to treat as "objects" and how we are going to model their interactions. We use the method of System Schema [3] as a tool for analyzing systems and building models; it is a prerequisite to drawing free-body diagrams. Interactions are two-way, connecting a pair of objects. When the focus is on one object, the interaction is realized as a force. By Newton's third law, the forces on either end of an interaction are equal and opposite. This is the tool that focuses student attention on the modeling character of each system considered.
Newton's second law tells how an object responds to the forces it feels. If the forces are not balanced (cancel out), the object accelerates – changes its velocity according to the rule:
Acceleration = (Sum of the forces)/(mass of the object)
This is a vector law, so forces are required to change either the object's speed or the direction of its motion.
This naturally leads to the question:
If I only care about the change in an object's speed and not its direction,
what does Newton's second law tell me?
It's pretty easy to figure out how to do this, at least in principle. Forces that are in the same direction as the motion tend to speed it up, forces that are in the opposite direction of the motion tend to slow it down, and forces that are perpendicular to the motion tend to change its direction. So to consider only the speed, we multiply Newton's second law by a small displacement along (or against) the direction of motion. After a little simple algebra (no calculus needed), we get the one-body work-kinetic energy theorem:[4]
The change in an object's kinetic energy = work done on it by the sum of all the forces it feels
Or as an equation, this is written
Δ(½ mv²) = Fnet · Δr
(Bold here indicates a vector.)
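For those who want to see the "little simple algebra" spelled out, here is a sketch for one small step of the motion, taking Δr = v Δt:
$$\mathbf{F}_{net}\cdot\Delta\mathbf{r} = m\,\mathbf{a}\cdot\mathbf{v}\,\Delta t = m\,\frac{\Delta\mathbf{v}}{\Delta t}\cdot\mathbf{v}\,\Delta t = m\,\mathbf{v}\cdot\Delta\mathbf{v} \approx \Delta\!\left(\tfrac{1}{2}mv^{2}\right)$$
The perpendicular part of the net force drops out automatically, since it is perpendicular to v and contributes nothing to the dot product.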
By itself, this law is not particularly useful; but used in connection with the System Schema, it helps to provide clear and simple answers to two rather subtle questions:
·      Why is there such a thing as "potential energy" but no such thing as "potential momentum"? The impulse-momentum theorem and the work-energy theorem look very similar.
·      Why do we sometimes treat potential energy as belonging to a single object (e.g., the gravitational PE, mgh) but sometimes treat it as belonging to a pair of objects (e.g., the PE between two electric charges, kqQ/r)?
To answer the first, let's consider the impulse-momentum theorem in contrast to the work-energy theorem written above:
Δ(mv) = Fnet Δt
If we consider a system with two objects interacting, since they interact for the same amount of time, and since the forces they exert on each other are equal and opposite (by Newton 3), they change each other's momenta in equal and opposite ways. This means that if we add together the impulse-momentum theorems for the two objects, those momentum changes will cancel. We can then easily see what the conditions are for momentum conservation to hold. (All the other forces acting on the two objects have to cancel.)
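In symbols (a sketch for two objects A and B, where F is the force they exert on each other and Fext is the sum of any other forces acting on the pair):
$$\Delta\mathbf{p}_A + \Delta\mathbf{p}_B = \left(\mathbf{F} - \mathbf{F} + \mathbf{F}_{ext}\right)\Delta t = \mathbf{F}_{ext}\,\Delta t$$
so the total momentum is conserved whenever the other forces cancel (or there are none).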
For the work-energy theorem, things are a bit different. If we consider a small time interval when the two objects are interacting, their time intervals are the same, but the distances that they move do not have to be. Therefore, if we add together the work-energy theorems for two interacting objects, even if there are no other forces acting on them, the work terms for the two objects do not have to cancel. And we can easily see that the extra term is the force dotted into the change in the relative separation of the two objects.
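Sketched out for the same two objects A and B (internal forces only, F on A and −F on B):
$$\Delta KE_A + \Delta KE_B = \mathbf{F}\cdot\Delta\mathbf{r}_A + (-\mathbf{F})\cdot\Delta\mathbf{r}_B = -\,\mathbf{F}\cdot\Delta\left(\mathbf{r}_B - \mathbf{r}_A\right)$$
Defining the potential energy change as ΔPE = F · Δ(rB − rA) then makes the total KE + PE of the pair constant.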
This extra term is why we introduce a potential energy (but not a potential momentum). And it makes clear that the potential energy belongs to the interaction between the two objects.
It also helps us understand when we can treat the PE as belonging to a single object rather than to a pair of objects. Since the momentum changes of the two objects are the same size, it's easy to find that the KE change of each object is Δ(p²/2m). If one object is much larger than the other, the KE change (and therefore the PE) can be assigned essentially entirely to the lighter object. This is why the gravitational PE of an object on the earth's surface can be assigned to the object, and why, in an atom or molecule, the electric PE of the interaction of an electron and a nucleus can be assigned to the electron and we can talk about "the PE of the electron."
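Here's the quick estimate behind that, sketched in the frame where the heavy object (mass M) starts essentially at rest: the magnitudes of the momentum changes are equal, so
$$\frac{\Delta KE_{heavy}}{\Delta KE_{light}} = \frac{\Delta\!\left(p^{2}/2M\right)}{\Delta\!\left(p^{2}/2m\right)} = \frac{m}{M} \ll 1$$
and essentially all of the kinetic energy change – and therefore the PE – can be assigned to the light object.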
These are nice results, if abstract. But I like the work-energy theorem for more reasons. Here are three:
·      When we have a situation where one of the interacting objects is much larger than the other, there are a lot of nice examples where one can write energy conservation and create equations relating position and velocity. This gives the students good practice with manipulating symbolic equations and interpreting the result.
·      It can be used to generate other relations and show the relation between other principles that are often treated as independent.
·      It provides the link between the fundamental concepts of force and energy, building another powerful coherence.
The first doesn't need much elaboration, but I was a bit surprised at the second. I knew in principle the power of the work-energy theorem, but it wasn't until I included a substantial discussion of fluids in my class that I realized how cool it was. The work-energy theorem, when applied to a bit of fluid in a pipe, easily reduces to:[5]
·      The dependence of pressure on depth and the related Archimedes' principle (by assuming no motion and only gravitational and pressure forces)
·      Bernoulli's principle (by assuming no resistive forces)
·      The Hagen-Poiseuille equation (by assuming resistive force but no gravitational change)
It can also be used to generate new equations, such as a modified H-P equation for fluids flowing vertically in a tree.
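As one example of how these fall out, here is a rough sketch of the Bernoulli case: apply the work-energy theorem, per unit volume, to a small parcel of ideal fluid pushed along a pipe by the pressures at its two ends, with no resistive forces:
$$\Delta\!\left(\tfrac{1}{2}\rho v^{2}\right) = (P_1 - P_2) - \rho g\,(h_2 - h_1) \quad\Longrightarrow\quad P + \tfrac{1}{2}\rho v^{2} + \rho g h = \text{constant}$$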
Of course, each of these can be derived from forces as well, but tying everything to work-energy and thereby back to forces and Newton two emphasizes the coherence of the whole structure and the reliance on powerful overarching principles.
I've seen this work with my students. They all come in knowing that "energy is the ability to do work," but for most, these are just words. Once we've gone through the work-energy theorem they begin to be able to translate forces into work.
My favorite specific example of this occurred in an interview with Carol, a student in the class's second term. We had completed a discussion of free energy and done a recitation analyzing the separation of oil and water and the formation of lipid cell membranes. The result is somewhat counterintuitive, since it is actually pretty easy for students who have taken chemistry to see that the interaction (electric attraction) between water and oil molecules is stronger than the interaction between two oil molecules. So why do oil and water separate? Why do lipid membranes form?
In the interview, Carol answered the question by referring to the equation for the Gibbs free energy:
ΔG = ΔH − TΔS
As all biology and chemistry students know, Gibbs free energy is what drives chemical reactions: spontaneous reactions go to a lower free energy. (Here, H is the enthalpy, which, for this discussion, is equivalent to the internal energy.)
She said (paraphrasing), "The force between the molecules goes into the work which creates potential energy. That goes into the H term since it's energy. Since it's attractive, that tends to make the H lower for the separated oil molecules. But the other term competes. It comes from the losing of the opportunities for the water molecules to interact. In this case, that term wins."
I've seen many students reason like this and it makes me happy. They are using equations to reason with qualitatively and bringing together the idea of forces and energy, building an overall coherence and reasoning from principle.
The single-particle work-energy theorem is easy to think about and reason qualitatively and quantitatively with. This is why I like it and why I make it a central element of my IPLS class.


[1] NEXUS/Physics: An interdisciplinary repurposing of physics for biologists, E. F. Redish, C. Bauer, K. L. Carleton, T. J. Cooke, M. Cooper, C. H. Crouch, B. W. Dreyfus, B. Geller, J. Giannini, J. Svoboda Gouvea, M. W. Klymkowsky, W. Losert, K. Moore, J. Presson, V. Sawtelle, K. V. Thompson, C. Turpen, and R. Zia, Am. J. Phys. 82:5 (2014) 368-377.

[2] I actually introduce a "zeroth law" – that every object responds only to forces it feels and only at the instant it feels them. While this might seem trivial to an experienced physicist, a significant fraction of the errors that introductory students make are a violation of this law.
[3] V. Sawtelle & E. Brewe, System Schema Introduction, NEXUS/Physics; L. Turner, System Schemas, Am. J. Phys. 41:9 (2003) 404.

Could dark matter be super cold neutrinos?

Probably the greatest physics problems of the current generation are the cosmological questions. Thanks to the development of powerful new telescopes (many of them in space) in the last years of the twentieth century, startling new and unexpected results have pointed the way to new physics. These currently go under the names of "dark matter" and "dark energy", but those aren't real descriptions; rather they are suggestions for what might provide theoretical solutions to experimental anomalies. And, as naming often does, they guide our thinking into explorations of how to come up with new physics.

The problem that "dark matter" is supposed to resolve began in the 1970s with the observations of Vera Rubin. By making a careful analysis of the motion of stars in galaxies, she found an unexpected anomaly. As any first year physics student can tell you, Newton's law of gravitation tells you how planets orbit around the sun. The mass of the sun draws the planets towards it, bending their velocities ever inward in (nearly) circular orbits. The mathematical form of the law produces a connection between the distance the planets are from the sun and the speed (and therefore the period) of the planets.

That connection was known empirically before Newton, from Kepler (Kepler's third law of planetary motion: the cube of a planet's distance from the sun is proportional to the square of its period). The fact that Newton's laws of motion together with his law of gravity explained that result was considered a convincing proof of Newton's theories.

A galaxy has a structure somewhat like that of a solar system: most of its visible mass is concentrated toward the center (including a massive black hole), and that central mass should govern most of the motion of the stars in the galaxy. Rubin found that the speed of the stars around the center didn't follow Kepler's law. The far-out stars were going too fast. This suggested that there was an unseen distributed mass that we didn't know about (or that Newton's law of gravity perhaps failed at long distances; in my opinion this option has not received enough attention, though that's for another post).
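
To see why "too fast" points to unseen mass, here is the standard circular-orbit estimate (a sketch, assuming Newtonian gravity and roughly circular orbits), where M(r) is the mass enclosed within radius r:

$$\frac{G M(r)\,m}{r^{2}} = \frac{m v^{2}}{r} \;\Rightarrow\; v = \sqrt{\frac{G M(r)}{r}}, \qquad T^{2} = \frac{4\pi^{2}}{G M}\,r^{3} \ \text{(Kepler III, for a central mass M)}$$

If essentially all the mass sat at the center, v should fall off like 1/√r far from it. Rubin's roughly flat rotation curves instead require the enclosed mass M(r) to keep growing with r – mass we don't see.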

Observations in the past thirty years have increasingly supported the idea that there is some extra matter that we can't see – and a lot of it. More than the matter that we do see. As a result, a growing number of physicists are exploring what might be causing this.

I saw a lovely colloquium yesterday about one such search. Carter Hall, one of my colleagues in the University of Maryland Physics Department, spoke about the LUX experiment. This explores the possibility that there is a weakly interacting massive particle (a "WIMP") that we don't know about – one that doesn't interact with other particles electromagnetically, so it doesn't give off or absorb light, and doesn't interact strongly (via the nuclear force), so it doesn't create pions or other particles that would be easily detectable in one of our accelerators. This would make it very difficult to detect. The experiment was a tour de force, looking for possible interactions of a WIMP with a heavy nucleus – xenon. (The interaction probability goes up like the square of the nuclear mass, so a heavy nucleus is much more likely to show a result.) The experiment was incredibly careful, ruling out all possible known signals. It found no signal but was able to rule out many possible theories and a broad swath of the parameter space – eliminating many possible masses and interaction strengths. An excellent experiment.

But as I listened to this beautiful lecture, I wondered whether the whole community exploring this problem hadn't made the mistake of looking under the lamppost for our lost car keys. It's sort of wishful thinking to assume that the solution to our problem might be exactly the kind of particle that would be detectable with the incredibly large, powerful, and expensive tools that we have built – particle accelerators. These are designed to allow us to find new physics – in the paradigm we have been exploring for nearly a century: finding new sub-nuclear particles and determining their interactions in the framework of quantum field theory.

This reflects a discussion my friend Royce Zia and I have been having for five decades. Royce and I met in undergraduate school (at Princeton) and then became fast friends in grad school (at MIT). We spent many hours there (and since) arguing about deep issues in physics. We both started out assuming we wanted to be elementary particle theorists. That, after all, was where the action was. Quarks had just been proposed and there was lots of interest in the nuclear force and how to make sense of all the particles that were being produced in accelerators. But we were both transformed by a class in Many Body Quantum Theory given by Petros Argyres, a condensed matter theorist. In this class we saw many (non-relativistic) examples of emergent phenomena – places where you knew the basic laws and particles, but couldn't easily see important results and structures from those basic laws. It took deep theoretical creativity and insight to find a new way of looking at and rearranging those laws so that the phenomena emerged in a natural way.

There are many such examples. The basic laws and particles of atomic and molecular physics were well known at the time. Atoms and molecules are made up of electrons and nuclei (the structure of the nuclei is irrelevant for this physics – only their charge and mass matters) and they are well described by the non-relativistic Schrödinger equation. But once you had many particles – like in a large atom, or a crystal of a metal – there were far too many equations to do anything useful with. Some insight was needed as to how to rearrange those equations so that there was a much simpler starting point.

Three examples of this are the shell model of the atom (the basis of all of chemistry), plasmon oscillations in a metal (coherent vibrations of all the valence electrons in a metal together), and superconductivity (the vanishing of electrical resistance in metals at very low temperatures). Each of these was well described by little pieces of the known theory arranged in clever and insightful ways – ways that the original equations gave no obvious hint of in their structure.

I was deeply impressed by this insight and decided that this extracting or explaining of phenomena from new treatments of known physics was just as important – and just as fundamental – as the discovery of new particles or new physical laws. Royce and I argued this for many hours and finally decided to grant both approaches the title of "fundamental physics" – but we decided they were different enough to separate them. So we called the particle physics approach "fundamental-sub-one" and the many-body physics approach "fundamental-sub-two". (Interestingly, both Royce and I went on to pursue physics careers in the f2 area, he in statistical physics, me in nuclear reaction theory.) In the decades since we had these arguments, physics has made huge progress in f2 physics – from phase transition theory to the understanding and creation of exotic (and commercially important) excitations of many-body systems.

So yesterday, I brought my f2 perspective to listening to Carter talk about dark matter, and I wondered: he was talking all about f1-type solutions. Interesting and important, but could there also be an f2-type solution? We already know about weakly interacting massive particles: neutrinos. They only interact via gravity and the weak nuclear force, not electromagnetically or strongly.

Could dark matter simply be a lot of cold neutrinos? They would have to be very cold – travelling at a slow speed – or else they would evaporate. When we make them in nuclear reactions in accelerators they are typically highly relativistic – travelling at essentially the speed of light. The gravity of the galaxy wouldn't be strong enough to hold them.

That leads to a potential problem for this model. Whatever dark matter is, it has to have been made fairly soon after the big bang – when the universe was very dense, very uniform, and very hot -- hot enough to generate lots of particles (mass) from energy. (Why we believe this is too long a story to go into here.) So you would expect that any neutrinos that were made then would be hot – going too fast to become cold dark matter.

But suppose there were some unknown emergent mechanism in that hot dense universe -- a phase transition -- that squeezed out a cold cloud of neutrinos. Neutrinos interact with matter very weakly – and their interaction strength is proportional to their energy so cold neutrinos interact even more weakly than fast neutrinos. If there were a mechanism that spewed out lots of cold neutrinos, I expect they would interact too weakly with the rest of the matter to come to thermal equilibrium. If the equilibration time were, say, a trillion years, they would stay cold and, if their density were right, could serve as our "dark matter".

Most of the experimental dark matter searches wouldn't find these cold neutrinos. Searching for them at this point would have to be a theoretical exploration: Can we find a mechanism in hot baryonic matter that will produce a phase transition that spews out lots of cold neutrinos? I don't know of any such mechanism or where to start, but wouldn't it be fun to consider?