3/3: Mathematical notation

13 June 2016

Modified: 12 October 2016

[Part 3 is about communicating mathematical ideas. Part 1, Part 2. I took care to contain the tedious math bits in single paragraphs, so the point is still clear if you choose to only read the fun parts.¹]

summary. There is no such thing as “wrong” notation. All that counts is that you get the math right and communicate your ideas clearly.

Last time I explained how it’s not accurate to say that an electron “is” a wave function, because an electron is a thing in the universe and a wave function is a mathematical object, and mathematical objects don’t live in the real universe. When people talk about wave functions, they often use the letter $\psi$ . Obviously, even though it looks all nice and wavy, the $\psi$ itself isn’t the wave function either – it’s just its name. The concept of names is one we know and love from the real world: When I point at a chair and say, “This is Bob,” it’ll be clear what I mean when I explain that Bob has three legs. While it’s a terrible idea to call a chair Bob, giving things and their relationships with each other funny names is basically what mathematics is all about.

Just like we grew up believing that dictionaries had authority over the reality of words, school taught us that $+$ means you add two numbers, $-$ means you subtract them, $\times$ means you times them, and so on. But these symbols weren’t handed down from the heavens to the first humans to walk the Earth. There was a time when they didn’t exist, and then someone made them up. Now, $+, -, /, \times$ are pretty basic and sometimes you may even have a use for them in everyday life, so these symbols are generally assumed to refer to their corresponding arithmetic operations. There are a handful of other symbols that are pretty unambiguous in their meaning, like $\sqrt{\cdot}$ or $=$ , but beyond this lies madness.

madness 1: when wrong is right and right is complicated

The slope of a function graph is called the function’s derivative. (If you’re familiar with, like, math, this may be known to you.) When your function is a straight line, you get the slope by dividing the difference between two function values by the difference of their arguments. When we write the differences as $\Delta f$ and $\Delta x$ , the derivative can be written as $f' = \Delta f / \Delta x$ . Here, both $\Delta f$ and $\Delta x$ are real numbers. When you have an arbitrary curve instead of a straight line, you can approximate the slope by choosing $\Delta f$ and $\Delta x$ very small. The smaller you make them, the more accurate the result will be. Want infinite accuracy? Make them infinitely small. To make it clear that you’re working with infinitely small numbers (“infinitesimals”), you call them $\mathrm d f$ and $\mathrm d x$ , which gives you $f'=\mathrm d f/\mathrm d x$ . Yay!
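The shrinking-$\Delta x$ idea is easy to try out numerically. Here is a minimal sketch in Python, assuming a made-up example curve $f(x)=x^2$ (my choice, not anything special), whose exact slope at $x$ is $2x$. The names `difference_quotient`, `x0` and `errors` are mine, too.

```python
def difference_quotient(f, x, dx):
    """Slope of the straight line through (x, f(x)) and (x + dx, f(x + dx))."""
    return (f(x + dx) - f(x)) / dx


def f(x):
    # Hypothetical example curve; its exact slope at x is 2 * x.
    return x ** 2


x0 = 1.0
errors = []
for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    slope = difference_quotient(f, x0, dx)
    # Distance between the approximate slope and the exact slope 2 * x0.
    errors.append(abs(slope - 2 * x0))
    print(f"dx = {dx:g}: slope = {slope:.6f}")
```

The error shrinks along with `dx`, but note that `dx` never actually gets to be zero here: the computer only ever works with small, finite numbers.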

But … what are $\mathrm d f$ and $\mathrm d x$ ? Both are infinitely small, right? So if you try to calculate $\mathrm d f$ , you get $0$ . And if you try to calculate $\mathrm d x$ , you get $0$ , too. If you took any other value for them, they’d no longer be infinitely small, and thus you’d get an inaccurate result. Thus, if $\mathrm d f/\mathrm d x$ were a normal fraction like $\Delta f/\Delta x$ , it would be equal to $0/0$ , and we all know never to divide by zero. Hence, since $\mathrm d f/\mathrm d x$ does have a value, it must be something else entirely. Remember part 2, where I wrote,

If Newtonian mechanics is wrong, why do we still use it so damn much?

In that post, I explained that Newtonian mechanics often gives us the best prediction we can make, and using a “more correct” model would not give us a better result. Maybe this situation is similar: what do we get if we pretend $\mathrm d f$ and $\mathrm d x$ are numbers, and that we just don’t know their values?

example 1. Say you’re told to solve the equation $f(x) \cdot f'(x) = x^2$ . This may look daunting at first, but when you write the derivative as $\mathrm d f/\mathrm d x$ instead of $f'$ , you get

$f(x) \cdot \frac{\mathrm d f}{\mathrm d x} = x^2\,,$

and multiplying each side by $\mathrm d x$ gives you $f(x)\,\mathrm d f=x^2\,\mathrm d x$ . This looks like integrals without the integral signs, so let’s put some on both sides:

$\int f(x)\,\mathrm d f = \int x^2\,\mathrm d x\,.$

Now we have $f^2 /2 = x^3 / 3$ (setting the constant of integration to zero and taking the positive root, to keep things simple), so $f(x)=\sqrt{2/3}\, x^{3/2}$ , and from this you can calculate the derivative $f'(x) = \sqrt{2/3}\, (3/2) \sqrt x$ . Popping this back into our initial equation, we get $\sqrt{2/3}\cdot \sqrt{2/3}\cdot (3/2) \cdot x^{3/2 + 1/2} = x^2$ . The roots of $(2/3)$ combine to a full $(2/3)$ , which then cancels with $(3/2)$ , and you’re left with $x^2=x^2$ , which tells you that your solution is correct. /example 1.
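If you don’t trust the algebra, you can also check the result numerically. This is only a sketch under a few assumptions of mine: the constant of integration is zero, the derivative is approximated by a central difference with a small but finite step `h`, and the sample points are arbitrary positive numbers.

```python
import math


def f(x):
    # Candidate solution from example 1 (constant of integration set to zero).
    return math.sqrt(2 / 3) * x ** 1.5


def f_prime(x, h=1e-6):
    # Central difference: a numerical stand-in for df/dx with a finite step h.
    return (f(x + h) - f(x - h)) / (2 * h)


def residual(x):
    # How far f(x) * f'(x) is from x^2; this should be close to zero.
    return f(x) * f_prime(x) - x ** 2


sample_points = [0.5, 1.0, 2.0, 3.0]  # arbitrary positive test values
residuals = [residual(x) for x in sample_points]
print(residuals)
```

The residuals won’t be exactly zero because of the finite step and floating-point rounding, but they should be tiny compared to $x^2$ .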

[Image: plush owls sitting in shoes.] As a reward for getting through the last paragraph, here's a picture of plush owls sitting in shoes. Inhale. Exhale.

In other words, we used a “mathematically” “wrong” approach to correctly solve a problem. In many situations, this is even a good idea. As long as you can prove that what you’re doing works, using symbols that look less mathematically rigorous but lead you to the solution more intuitively can save a lot of time and even help prevent mistakes.

The cool thing is that many people have already realized this, which is exactly why we have notations like $f'(x)=\mathrm d f/\mathrm d x$ and $\mathrm{curl}\,\vec v = \nabla\times\vec v$ . It means you can often be pretty wishy-washy about your notation and still end up making fewer mistakes.

madness 2: when math isn’t all clear and unambiguous

Mathematics is known for being clear and unambiguous. And yes, we can definitively² prove that a theorem is either true or false, in contrast to the sciences where we only have falsifiable hypotheses and probabilities. But the language of math is just as bad as the language of language. Languages take shortcuts, sacrificing semantic clarity for the sake of data transmission rates. This is okay because most of the time everyone knows what you’re talking about.³ They tell you that mathematics doesn’t work that way, but I’m going to make the case that it does.

You know how you do your particle physics homework, and you use the symbol $m_e$ , and the only thing that symbol has ever stood for was the mass of an electron, and your teacher tries to make this elaborate argument about the importance of declaring your variables but somehow they completely miss that you never told anyone what $\pi$ means or what $e$ means or what $\log$ means, and so on? But then the cutoff point between what you need to define and what’s “obvious” isn’t really clear, and it becomes this huge frustrating mess? That’s the kind of thing I’m talking about. Or you say, “Let $p$ be the momentum operator,” and your professor complains that $p$ can’t be an operator because operators always need to have a hat, like $\hat p$ , and you say, no, you defined $p$ to be the operator and shut up you’re being ridiculous, but the professor insists and you end up having to draw a little hat on every single instance of the letter $p$ in your equations even though leaving it out would give you 100% the correct result and cause zero confusion.

example 3. You have a function $f(x,t)$ you want to integrate over $x$ .⁴ You’ll write something like $\int_a^b f(x,t)\,\mathrm d x=[F(x,t)]_a^b$ , right? And here it’s totally not clear if the brackets are to be evaluated with $x$ as $a$ and $b$ or $t$ as $a$ and $b$ . You can tell from looking to the left of the equals sign, but it isn’t clear just by looking at the right half of the equation. Likewise, some authors write volume integrals as $\int_V f(\vec r_1, \vec r_2)\,\mathrm d\tau$ , where it’s unclear whether they’re integrating over $\vec r_1$ or $\vec r_2$ . They fix this problem by putting explanations in the text and following conventions throughout the book so it’s clear from context what they mean. /example 3.
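If you ever do want the right-hand side of example 3 to stand on its own, one option (a suggestion of mine, not something you have to do) is to name the variable in the limits:

$\left[F(x,t)\right]_{x=a}^{x=b} = F(b,t) - F(a,t)\,.$

It costs a few extra characters, so whether it’s worth it depends on how much context the reader already has.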

example 4. Or, instead of integrals, let’s talk about derivatives. When you have a bunch of equations with many partial derivatives, it can be frustrating to write $\partial V_x/\partial x, \partial V_y/\partial x, \partial V_x/\partial y$ , and so on, over and over. This is because you’re told that the components of a vector field $\vec V$ must always be written as $(V_x,V_y,V_z)$ . But since all these letters are only names, you can simply rename the components. For example, you could call the vector field $\vec V = (X,Y,Z)$ . This already saves you the work of writing a subscript every time you reference one of the components of $\vec V$ . But as an added bonus, you can now use the subscripts for other purposes, like partial derivatives. Thus, you can define $\partial V_x/\partial x$ as $X_x$ , $\partial V_y/\partial x$ as $Y_x$ , and so on. This is much shorter and way more fun! I tried that once and my TA was hopelessly confused because they didn’t understand that indices on vectors don’t have to mean selecting the corresponding component, even though I explicitly defined what everything means at the top of the page. /example 4.
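To see what the renaming from example 4 buys you, here is the curl written both ways. This is a sketch that assumes the definitions from that example: $\vec V = (X,Y,Z)$ , and a subscript stands for a partial derivative, so $X_y$ means $\partial V_x/\partial y$ .

$\nabla\times\vec V = \left(\frac{\partial V_z}{\partial y} - \frac{\partial V_y}{\partial z},\ \frac{\partial V_x}{\partial z} - \frac{\partial V_z}{\partial x},\ \frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right) = \left(Z_y - Y_z,\ X_z - Z_x,\ Y_x - X_y\right)\,.$

Both sides say the same thing, but the right one only works if you’ve told the reader what the subscripts mean.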

Context matters when writing down equations. Not everything has to be clear in isolation, as long as you explain what’s happening. Obviously this doesn’t mean that you can just write literally anything, because then it wouldn’t be clear anymore what you mean. But what you can do is invent new notation and use that if it makes sense. Note, however, that making up your own things isn’t always a good idea: there already exists a large set of shared expectations about what many symbols do and, often, it makes sense to go with established conventions. For example, if you’re using other people’s equations, you shouldn’t just exchange all the letters for no good reason, even if you feel like $\xi$ is a much nicer letter than $\lambda$ .

In conclusion: Be free, be spontaneous, be brave – give your equations meaning instead of useless hats and subscripts. Sometimes, you really don’t have to repeat yourself.

¹ In the future, when I have a list of my most notable essays, this one will be “The Long, Confusing, Meandering One.” This is my A Feast For Crows in terms of exciting action; it’s my American Gods in terms of quickly getting to the point; it’s my Getting Things Done in terms of elegant phrasing – you get the idea. Think of this more as a piece of performance art, rather than an informative article.
² If you ignore external uncertainties.
³ Except when you’re writing a 2000 word essay on how to use mathematical notation without an outline. What was this guy thinking?
⁴ I’m so sorry about all the integrals. And all the footnotes.
