Quick post today, just to record hypotheses I’d like to test. I’m back onto the “math of musical pitches” thing, but I’m promising myself I won’t get too lost down this rabbit hole. Not this time.
Pianos and digital keyboard instruments are tuned to 12-Tone Equal Temperament (12ET), where each half-step is exactly the same interval as the others. It’s a tuning system that confers certain advantages, such as the flexibility to exploit enharmonic ambiguity and pivot to other tonal centers. Such advantages come at the cost of the “sweetness” or “purity” (terms to be defined) of harmonic intervals. It’s a halfway-decent trade-off.
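To put a number on that trade-off, here is a minimal sketch, assuming A = 440 Hz as the reference pitch. The function names are made up for illustration. It builds 12ET from a single half-step ratio and compares the 12ET major third with the just 5/4 third.

```python
import math

# In 12ET every half-step is the same frequency ratio: the twelfth root of 2.
HALF_STEP = 2 ** (1 / 12)

def et_frequency(semitones_from_a4, a4=440.0):
    """Frequency in Hz of the 12ET pitch this many half-steps from A4."""
    return a4 * HALF_STEP ** semitones_from_a4

def cents(ratio):
    """Size of a frequency ratio in cents (100 cents = one 12ET half-step)."""
    return 1200 * math.log2(ratio)

# The 12ET major third (4 half-steps) versus the just major third (5/4).
difference = cents(HALF_STEP ** 4) - cents(5 / 4)
print(f"12ET third is {difference:.1f} cents wider than 5/4")  # about 13.7
```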
Ensemble vocalists and similar groups (such as a string quartet) tune their performances “by ear,” using subjective aural experience to shift notes microtonally to achieve harmonies that are deemed optimal, somehow. Hypothesis: the subjective judgement of what’s optimal for an individual performer has a physiological/biological component and an acculturated component; probably the biological component is weighted somewhat more strongly. Another hypothesis: what’s judged as optimal is usually some kind of just intonation; probably 5-limit tuning (as opposed to Pythagorean (3-limit) tuning or 7-limit tuning).
That’s a lot of stuff to look up on Wikipedia! Have at it, if you wish.
This has practical application. I used to write music using the combination: in my head, at the keyboard, singing a solo vocal line. The real-time bio-feedback that I get from that experience probably colors what I produce; it affects the musical decisions I make. Now, for over 350 song ideas, I’ve improvised music using the combination: in my head, singing consecutive solo vocal lines, using either a layering/looping recording device or, occasionally, a volunteer chorus of singers. No absolute pitch reference; all tuned by subjective experience. With mixed results, certainly. Nonetheless, the real-time bio-feedback that I get from that experience probably colors what I produce differently.
What happens when you put them together? Is it possible to get the “sweetness” of choral harmony that sounds somehow in-tune with a piano that’s accompanying it?
Schemes do exist for this. One intriguing option is to have a computer program analyze the notes played on a digital piano in real time, and perform some calculation to adjust frequencies on-the-fly within the first 100 milliseconds after they’re played. Reference to 12ET is kept as a starting point, but then for every note combination (“chord”) one note is kept true to 12ET and the others are “fixed” according to what kind of sonority is deemed to have been sought by the performer. The enharmonic ambiguity referenced above makes this proposition not so cut and dried. But in certain musical pieces where said ambiguity is at a minimum, this can be done quite successfully. The results from this process are quite pleasing and don’t at all sound “weird.” (Another subjective judgement for you.)
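The core of that idea can be sketched roughly as follows. This is not how any particular product works. The interval table, the choice of the lowest note as the one kept true to 12ET, and the function names are all assumptions for illustration. A real system would also have to guess the intended sonority, which is exactly where the enharmonic ambiguity bites.

```python
# Hypothetical lookup: one 5-limit ratio per interval, in half-steps above
# the anchor note. Other choices are possible (e.g. 9/5 for 10 half-steps).
JUST_BY_HALF_STEPS = {
    0: 1 / 1, 1: 16 / 15, 2: 9 / 8, 3: 6 / 5, 4: 5 / 4, 5: 4 / 3,
    6: 45 / 32, 7: 3 / 2, 8: 8 / 5, 9: 5 / 3, 10: 16 / 9, 11: 15 / 8,
}

def et_frequency(midi_note, a4=440.0):
    """12ET frequency in Hz for a MIDI note number (69 = A4)."""
    return a4 * 2 ** ((midi_note - 69) / 12)

def retune_chord(midi_notes, anchor=None):
    """Keep the anchor at its 12ET pitch; tune the rest justly against it.

    Assumption: with no anchor given, the lowest note is the anchor.
    Returns a dict mapping MIDI note number to frequency in Hz.
    """
    anchor = min(midi_notes) if anchor is None else anchor
    base = et_frequency(anchor)
    tuned = {}
    for note in midi_notes:
        # divmod also handles notes below the anchor (negative octaves).
        octaves, half_steps = divmod(note - anchor, 12)
        tuned[note] = base * JUST_BY_HALF_STEPS[half_steps] * 2 ** octaves
    return tuned

# A major triad on A3 (MIDI 57, 61, 64): the A stays at its 12ET pitch,
# and the C# and E move to 5/4 and 3/2 above it.
for note, hz in retune_chord([57, 61, 64]).items():
    print(note, round(hz, 2), "Hz; 12ET would be", round(et_frequency(note), 2))
```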
But I’m not going there. For reasons. Too much to flesh out here; this was supposed to be short!
On the other hand, tools exist to “correct” the tuning of performed pitches. The temptation is to “fix” a cappella performances to 12ET. Don’t do it! It’s a trap. Or, to put it more formally: I hypothesize that much of the appeal of choral singing will be lost if the just—probably 5-limit—tuning is “corrected” to 12ET. Another hypothesis: correcting a vocal ensemble’s pitches to 5-limit tuning will retain its appeal, possibly even heighten it.
So, finally, to my point. I want to take the case where the tone center is A, and the mode is mixolydian, so the only notes expressed are A, B, C#, D, E, F#, and G. On the piano, the A is tuned to 440 Hz, and the other 6 notes are tuned accordingly in 12ET. Now take the case where the vocal performance is “perfectly” tuned (or later pitch-corrected) to 5-limit just intonation. If the tone center, A, is exactly in tune with the piano’s A, the largest tuning discrepancy is in the F#, with 15.6 cents difference between the piano (higher) and the voices (lower). The next biggest discrepancy is in the C#, with 13.7 cents difference (again, with the piano frequency the higher of the two).
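Those figures can be reproduced with a few lines of arithmetic. The post does not list the just ratios, so the table here is an assumption: the usual 5-limit choices, with G taken as 16/9. That set does give the 15.6 and 13.7 cent figures quoted above.

```python
import math

# Assumed 5-limit ratios above the tonic A (G as 16/9 is a guess).
JUST_RATIOS = {
    "A": 1 / 1, "B": 9 / 8, "C#": 5 / 4, "D": 4 / 3,
    "E": 3 / 2, "F#": 5 / 3, "G": 16 / 9,
}
# Half-steps above A for the same notes in 12ET.
HALF_STEPS = {"A": 0, "B": 2, "C#": 4, "D": 5, "E": 7, "F#": 9, "G": 10}

def just_minus_12et():
    """Cents by which each just note sits above (+) or below (-) 12ET."""
    return {
        note: 1200 * math.log2(ratio) - 100 * HALF_STEPS[note]
        for note, ratio in JUST_RATIOS.items()
    }

for note, deviation in just_minus_12et().items():
    print(f"{note:2s} {deviation:+6.1f} cents")
```

A negative number means the voices are below the piano, so F# and C# come out as the two largest gaps, both with the piano higher.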
Here’s a spreadsheet I’m using for some of my calculations: 12ET versus 5-limit just – mixolydian case
Hypothesis: this is close enough that it will subjectively be judged as acceptable, but (another hypothesis) it can be improved upon.
Supporting evidence, here. This interval, about 14 or 15 cents, is only about 1/7 of a half-step. By some measures, an average human threshold for the detection of tuning discrepancies is around 25 cents. Vibrato in solo melodic lines is often from 40 cents up to over 100, and the ear judges that the pitch being sung is in the middle of that vibrato range. In other words, our ears (and pitch-processing parts of our brains) have a fudge-factor.
But it stands to reason that “closer” is “better.”
So. Hypothesis: in this respect, “closer” means that the overall average of the vocal pitches expressed is closer to the 7 pitches played on the 12ET piano. Taking this particular case, if sung A were tuned 4.2 cents higher than the piano’s A, the maximum discrepancy in the F# reduces to about 11.5 (give or take rounding), the next biggest in the C# reduces to 9.5, and no other pitch in the set of 7 is more than 8.1 cents off, high or low. No note is more than about 1/9 of a half-step off, between the piano and the singers.
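Here is a small sketch of where the 4.2 comes from, under the same assumed ratio table (usual 5-limit choices, G as 16/9, none of which the post spells out). The shift is just the negative of the average deviation across the 7 notes.

```python
import math
from statistics import mean

# Assumed table: (5-limit ratio above A, half-steps above A in 12ET).
SCALE = {
    "A": (1 / 1, 0), "B": (9 / 8, 2), "C#": (5 / 4, 4), "D": (4 / 3, 5),
    "E": (3 / 2, 7), "F#": (5 / 3, 9), "G": (16 / 9, 10),
}

def deviations(tonic_offset=0.0):
    """Cents each sung note sits above (+) or below (-) the piano,
    with the sung tonic raised by tonic_offset cents."""
    return {
        note: 1200 * math.log2(ratio) - 100 * half_steps + tonic_offset
        for note, (ratio, half_steps) in SCALE.items()
    }

def centering_offset():
    """Tonic shift (cents) that makes the average deviation zero."""
    return -mean(deviations().values())

offset = centering_offset()
print(f"raise sung A by {offset:.1f} cents")
for note, deviation in deviations(offset).items():
    print(f"{note:2s} {deviation:+5.1f}")
```

One design note: centering on the average minimizes the sum of squared discrepancies, not the single worst one. Shrinking the worst case instead would mean centering on the midpoint between the most-sharp and most-flat notes (B and F# here), which works out to a shift of about 5.9 cents.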
Of course, to make this a little more convoluted, we’d probably have to take the weighted average of all the pitch-classes played on the piano and sung by the singers. Does F# occur as often as C#, in our test-case piece? Probably not. Hypothesis: if we were therefore to raise the singers’ A by just 3.9 cents, the root triad would be just a tiny bit “better,” at the trade-off that the less-frequently-occurring F# would be a tiny bit “worse.”
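A weighted version might look like the sketch below. The weights are entirely made up (F# sounding half as often as everything else) and are not meant to reproduce the 3.9 figure. The ratio table is the same assumed 5-limit set with G as 16/9. The point is only that down-weighting F# pulls the shift below 4.2.

```python
import math

# Assumed table: (5-limit ratio above A, half-steps above A in 12ET).
SCALE = {
    "A": (1 / 1, 0), "B": (9 / 8, 2), "C#": (5 / 4, 4), "D": (4 / 3, 5),
    "E": (3 / 2, 7), "F#": (5 / 3, 9), "G": (16 / 9, 10),
}

def weighted_offset(weights):
    """Tonic shift (cents) that zeroes the weighted average deviation.

    weights maps each note name to how much it counts, e.g. how often
    it occurs in the piece."""
    total = sum(weights[note] for note in SCALE)
    weighted_sum = sum(
        weights[note] * (1200 * math.log2(ratio) - 100 * half_steps)
        for note, (ratio, half_steps) in SCALE.items()
    )
    return -weighted_sum / total

equal = {note: 1.0 for note in SCALE}
# Hypothetical weighting: F# occurs half as often as every other note.
rarer_f_sharp = {**equal, "F#": 0.5}

print(f"equal weights:  {weighted_offset(equal):.2f} cents")
print(f"F# down-weighted: {weighted_offset(rarer_f_sharp):.2f} cents")
```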
That’s some mightily fine-tuned fine tuning. Probably the 0.3 cents difference is below the threshold of what’s even worth taking the time to investigate.
Please note that I’m not proposing that pitch correction ought to be used to correct vocal performance. In an ideal world, singers’ pitch accuracy and subjective judgement would be good enough to ensure a “sweet” performance, and to navigate the discrepancy between good vocal harmony and the piano’s tuning. What I am proposing is that if pitch correction is used on a recording of vocal harmony, 12ET ought not to be used as a reference, even if there are one or more 12ET instruments accompanying the vocalists. Some flavor of 5-limit just intonation, appropriate to the local tone-center and mode, should be used. And the root reference tone ought to have something to do with the average discrepancy between the just-tuned notes and the 12ET notes.
This, anyway, is the hypothesis I’d like to test.