# Mathematical foundation of music

Let's discuss how today's music is constructed, and what motivated the current system of musical tones.

**Sound** is a vibration (e.g. of a string, a drum, or a speaker) which propagates through air and can be detected, e.g., by a human ear or by a microphone. It cannot travel through a vacuum (as in space), or through materials which absorb vibrations (like thick walls). The vibration gets weaker with increasing distance from the source.

Vibration of the air means changes in air pressure. These changes can be detected by a microphone and plotted on a chart. A zero value means silence; positive and negative values denote changes in pressure.

## Tones and Noises

We distinguish **tones** (a spoon hitting a cup, a hammer hitting a metal rod) and **noises** (clapping hands, a hammer hitting a wall, the sound of trees in the wind). The difference is that when the sound is plotted on a chart, **tones have a small part, a "pattern", that repeats in time**. In other words, tones are **periodic**, while noises do not have any clear repeating pattern; the chart of the air pressure is more "random" for noises.

As we said, a tone is some sound pattern that is repeated in time. This pattern can be e.g. a single oscillation of a string. The number of repetitions within a second is called the **frequency** or the **pitch**, and is measured in Hertz. E.g. when a string makes 100 oscillations per second, we hear a tone of 100 Hz.

## Combining tones

When two or more tones sound at the same time, their pressure is "added" together, which corresponds to a chart of a sum of two mathematical functions. In music, we want two tones to sound "nice" when played together.

*Two tones sound nice together, if they create a periodic sound*

The goal of a musician is to find two or more tones, which, when combined, still create a tone (a periodic sound). The new sound may have a different repeating pattern, a different frequency, but it **must be periodic**. Let's look at the properties of periodic functions.

## Periodic functions

*If $f(x)$ and $g(x)$ have a period of T, then, $f(x)+g(x)$ has a period of T*

This is clear: if two instruments create a sound of the same pitch, it sounds nice, no matter what the instruments are, how loud they play, etc. From now on, we do not need to worry about the specific pattern of a tone, but only about its frequency.

*If $f(x)$ has a period of T, then, $f(x+c)$ has a period of T*

In other words, a "phase" is not important. We get that $f(x)+f(x+c)$ still has a period of T.

$\sin(x)+\sin(x+0.4)$

*If $f(x)$ has a period of T, then, $f(nx)$ has a period of T for any integer $n$*

We make $n$ repetitions of the function within a single repetition of the original function. We can combine any tone with a tone of double frequency, triple frequency, etc.

$\sin(x)+\sin(2x)$

Now we see that a frequency of 200 Hz sounds nice with a frequency of 100 Hz. The ratio of frequencies is 2:1. We let the new function repeat twice, while the original function repeats once.

This is a handy mechanism for finding new frequencies that "sound nice" together. Let one function repeat m-times within a period, while another function repeats n-times within a period.

*If $f(x)$ has a period of T, then, $f(mx)+f(nx)$ has a period of T for any integers $m, n$*

$\sin(3x)+\sin(2x)$

In the example above, the ratio of frequencies is 3:2 (1.5). When we have a tone at some frequency (e.g. 300 Hz), we can play a tone with a frequency 1.5 times higher (450 Hz) and be sure that they sound nice together. The first tone must repeat twice, while the second repeats three times, to create a whole period - a repeating pattern of their sum.
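The 3:2 example can be checked numerically. The following is a minimal Python sketch; the helper names `combined` and `max_shift_error` and the sampling grid are illustrative assumptions, not part of the original text. It samples $\sin(3x)+\sin(2x)$ and compares the values with the same function shifted by the base period $2\pi$, and by half of that period.

```python
import math

def combined(x, m=3, n=2):
    """Sum of two sine tones whose frequencies are in the ratio m:n."""
    return math.sin(m * x) + math.sin(n * x)

def max_shift_error(shift, m=3, n=2, samples=1000, step=0.01):
    """Largest |f(x + shift) - f(x)| over a grid of sample points."""
    return max(
        abs(combined(i * step + shift, m, n) - combined(i * step, m, n))
        for i in range(samples)
    )

T = 2 * math.pi  # period of sin(x), shared by sin(2x) and sin(3x)

full_period_error = max_shift_error(T)      # only floating-point noise
half_period_error = max_shift_error(T / 2)  # clearly non-zero: T/2 is not a period
print(full_period_error, half_period_error)
```

The first number should be practically zero, which is what "the sum has a period of T" means; the second one is large, because half of the period is not enough for both tones to complete a whole number of repetitions.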

Do 200 Hz and 307.7 Hz sound nice together? The ratio is 1.5385, or about 20:13. It requires 13 repetitions of the 200 Hz tone and 20 repetitions of 307.7 Hz to get to the whole period. When the period is so long, tones do not sound as nice together anymore. The goal of music is to find **ratios with a short period (small integers)**.
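One way to find the "repetition counts" for a pair of frequencies is to approximate their ratio by a fraction with small integers. The sketch below uses Python's standard `fractions` module; the function name `small_integer_ratio` and the denominator limit of 32 are arbitrary choices made for this illustration.

```python
from fractions import Fraction

def small_integer_ratio(f_low, f_high, max_denominator=32):
    """Closest fraction to f_high / f_low with a bounded denominator."""
    return Fraction(f_high / f_low).limit_denominator(max_denominator)

for low, high in [(100, 200), (300, 450), (200, 307.7)]:
    ratio = small_integer_ratio(low, high)
    # The lower tone repeats `denominator` times and the higher tone
    # `numerator` times before the combined pattern starts over.
    print(f"{low} Hz + {high} Hz -> {ratio.numerator}:{ratio.denominator}")
```

The smaller the two integers, the shorter the combined period, which is the criterion for "nice" used in this article.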

Let's find some "nice" ratios between 1 and 2:

- 1:1 - 1.0
- 9:8 - 1.125
- 5:4 - 1.25
- 4:3 - 1.3333
- 3:2 - 1.5
- 5:3 - 1.6666
- 15:8 - 1.875
- 2:1 - 2.0

## Which tones should be used in music

Let's write down some requirements for a musical instrument:

- **some tone** - it must play at least one tone
- **nice ratios** - if it plays a tone, it would be nice to play a higher tone with a nice ratio (2:1, 3:2, 4:3, ...)
- **flexibility** - if it plays a melody (a sequence of tones), it would be nice to be able to play the same melody a bit lower or a bit higher (e.g. each tone with a 10% lower or 10% higher frequency)

It is technically possible to make an instrument which plays any frequency (either when building it, or later, when configuring / tuning the instrument). We also want to be able to play several instruments at the same time. We must pick **a set of tones (frequencies), which all instruments must be able to play**.

After centuries of evolution and new ideas, people came up with an incredibly elegant set of tones for the music. Here is how it is created:

- A tone of **27.5 Hz**, which is called $A_0$
- From $A_0$, create a sequence $A_1, A_2, A_3, \dots$, where each next tone is 2x higher (55 Hz, 110 Hz, 220 Hz, ...)
- As we have a geometric sequence of frequencies (55, 110, 220, ...), split each step into **12 smaller, equal steps** (which still form a geometric sequence)

The last rule inserts 11 more tones between $A_n$ and $A_{n+1}$. This sequence of tones has a beautiful property: any two consecutive tones have the same ratio, about 1.06 (precisely $2^\frac{1}{12}$). If we start at any tone (frequency) $t$ and repeat this small step twelve times, we double the frequency: $t\cdot\left(2^\frac{1}{12}\right)^{12} = t\cdot2$.

We can redefine musical tones: A frequency $t$ is a musical tone if and only if there is an integer $n$, such that $t = 27.5 \cdot 2^\frac{n}{12}$ Hz. We can denote such a tone as $t_n$.

- $t_0$ = 27.5 Hz = $A_0$
- $t_{24}$ = 110 Hz = $A_2$
- $t_{26} = t_{24} \cdot 2^\frac{1}{12} \cdot 2^\frac{1}{12} \approx$ 123.47 Hz
- ...

A regular piano contains 88 keys, which play frequencies from $t_0$ = $A_0$ = 27.5 Hz (left-most key) up to $t_{87} \approx$ 4186 Hz (right-most key). We can also see a pattern of 12 keys (7 white, 5 black), which repeats across the piano.
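The definition $t_n = 27.5 \cdot 2^\frac{n}{12}$ translates directly into code. The following sketch (the names `tone_frequency` and `piano` are chosen for this illustration) computes the frequencies of all 88 piano keys and prints the ones mentioned above.

```python
BASE_HZ = 27.5  # frequency of t_0 (A0), as given in the text

def tone_frequency(n, base_hz=BASE_HZ):
    """Frequency of t_n: n equal steps (semitones) above the base tone."""
    return base_hz * 2 ** (n / 12)

piano = [tone_frequency(n) for n in range(88)]  # keys t_0 .. t_87

print(round(piano[0], 2))   # left-most key, t_0
print(round(piano[24], 2))  # t_24
print(round(piano[26], 2))  # t_26
print(round(piano[87], 2))  # right-most key, t_87
```

Every pair of neighbouring entries in `piano` has the same ratio $2^\frac{1}{12}$, and every entry twelve positions further is exactly one doubling higher.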

For historical reasons, musicians call this smallest step (a ratio of about 1.06) a **semitone**, while 12 semitones (a ratio of 2:1) are called an **octave**. The ratio between two tones is called an **interval**.

Example: what is the difference between 150 Hz and 1200 Hz? A non-musician says it is 1050 Hz. A mathematician (after reading this article) can say that it is a ratio of 8:1, or an interval of **three octaves**: $150\cdot2\cdot2\cdot2=1200$.
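Since each semitone multiplies the frequency by $2^\frac{1}{12}$, the size of an interval in semitones is a base-2 logarithm of the frequency ratio, multiplied by 12. A short sketch of this calculation (the function name `interval_in_semitones` is an illustrative choice):

```python
import math

def interval_in_semitones(f_low, f_high):
    """Number of semitone steps (ratio 2**(1/12) each) from f_low up to f_high."""
    return 12 * math.log2(f_high / f_low)

semitones = interval_in_semitones(150, 1200)
octaves = semitones / 12
print(semitones, octaves)
```

For the example above this gives 36 semitones, i.e. three octaves.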

What is so elegant about this mechanism of choosing tones to be used in music? For every tone, it contains tones with nice ratios to the original tone!

Ratio | Similar ratio in music |
---|---|
1:1 - 1.0 | exactly 0 semitones |
9:8 - 1.125 | about 2 semitones: $2^\frac{2}{12} \approx$ 1.1225 |
5:4 - 1.25 | about 4 semitones: $2^\frac{4}{12} \approx$ 1.2599 |
4:3 - 1.3333 | about 5 semitones: $2^\frac{5}{12} \approx$ 1.3348 |
3:2 - 1.5 | about 7 semitones: $2^\frac{7}{12} \approx$ 1.4983 |
5:3 - 1.6666 | about 9 semitones: $2^\frac{9}{12} \approx$ 1.6818 |
15:8 - 1.875 | about 11 semitones: $2^\frac{11}{12} \approx$ 1.8877 |
2:1 - 2.0 | exactly 12 semitones |
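The comparison in the table can be reproduced with a few lines of Python. This is only a sketch: the names `NICE_RATIOS`, `nearest_semitones` and `relative_error` are made up for the illustration, and "nearest" here means nearest after rounding on the logarithmic (semitone) scale.

```python
from fractions import Fraction
import math

NICE_RATIOS = [Fraction(1, 1), Fraction(9, 8), Fraction(5, 4), Fraction(4, 3),
               Fraction(3, 2), Fraction(5, 3), Fraction(15, 8), Fraction(2, 1)]

def nearest_semitones(ratio):
    """Whole number of semitones closest to `ratio` on the log scale."""
    return round(12 * math.log2(float(ratio)))

def relative_error(ratio):
    """How far the nearest musical interval is from the exact ratio."""
    k = nearest_semitones(ratio)
    return 2 ** (k / 12) / float(ratio) - 1

for r in NICE_RATIOS:
    k = nearest_semitones(r)
    print(f"{r.numerator}:{r.denominator}  {k:2d} semitones  "
          f"{2 ** (k / 12):.4f}  error {relative_error(r):+.2%}")
```

With these ratios, every error stays below one percent, and the error for 3:2 is the smallest of the approximated intervals.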

Besides the same tone (1:1) and an octave (2:1), we could say that the ratio of 3:2 is the **most beautiful**: the integers are small (2 repetitions and 3 repetitions within a period), and the musical tones approximate it almost perfectly with 7 semitones: $\frac{t_{n+7}}{t_{n}} \doteq \frac{3}{2}$.

### A (very simplified) history of note names

Thousands of years ago, there was an instrument that could play 14 tones across two octaves. Centuries later, these tones were named A, B, C, D, E, F, G, a, b, c, d, e, f, g.

Centuries later, there was another instrument with 7 tones within an octave, and the tones were named do, re, mi, fa, sol, la, si. The first tone **do** could have any frequency, but **mi to do** must be 5:4, **fa to do** must be 4:3, **sol to do** must be 3:2, and the others "somewhere in between". Since **do** was often tuned to the same frequency as **C** of the older instrument, these seven notes were also called C, D, E, F, G, A, B. A piano of that time contained seven keys within an octave.

Many years later, it became clear that an octave can be separated into 12 equal steps. It would still contain "nice intervals" (5:4, 4:3, 3:2, 2:1, ...), but it would offer **flexibility** - a music piece can be shifted: played several semitones higher or lower.
In this new model, the "old notes" C, D, E, F, G, A, B correspond to +0, +2, +4, +5, +7, +9 and +11 semitones. But we need to make five more notes (+1, +3, +6, +8, +10) available on the instrument - so the black keys were added to the piano. These new notes need some names. We could name them e.g. H, I, J, K and L. Since the symbol # is used to shift a tone one semitone higher, it makes sense to name them C#, D#, F#, G# and A#. On a modern piano keyboard, these twelve notes are the seven white and five black keys of one octave.
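The naming rule described above can be written down as a small procedure: place the seven old names at their semitone offsets, then fill each gap with the previous name plus `#`. A minimal sketch (the names `NATURAL_OFFSETS` and `twelve_note_names` are illustrative):

```python
# Semitone offsets of the seven "old notes" within an octave, as listed in the text.
NATURAL_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def twelve_note_names():
    """Names of all 12 semitone steps of an octave, starting at C."""
    names = [None] * 12
    for name, offset in NATURAL_OFFSETS.items():
        names[offset] = name
    for i in range(12):
        if names[i] is None:
            # A missing step is the previous note shifted one semitone up.
            names[i] = names[i - 1] + "#"
    return names

print(twelve_note_names())
```

The five positions that receive a `#` name are exactly the offsets +1, +3, +6, +8 and +10, i.e. the black keys.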

As there used to be 7 notes within an octave, it led to naming certain intervals (ratios) in a specific way.

Step | As semitones | English | Italian |
---|---|---|---|
C to C | 0 semitones | First (unison) | Prima |
C to D | 2 semitones | Second | Seconda |
C to E | 4 semitones | Third | Terza |
C to F | 5 semitones | Fourth | Quarta |
C to G | 7 semitones | Fifth | Quinta |
C to A | 9 semitones | Sixth | Sesta |
C to B | 11 semitones | Seventh | Settima |
C to C | 12 semitones | Eighth (octave) | Ottava |

Now we know the origin of the confusing word **octave**, which implies the number 8, although there is no 8 in a ratio of 2:1 or in 12 semitones. Similarly, the word **fifth (quinta)** implies the number 5, although there is no 5 in a ratio of 3:2 or in 7 semitones.