VOCAL ACOUSTICS FOR ENGINEERS

By Joanna Cazden

Bobby McFerrin and Mariah Carey are icons of vocal virtuosity, but for most of us, cutting pristine vocals in the recording studio is an iffy proposition. Poor technique and/or preparation, detrimental health habits, emotional pressure, or common colds can turn a vocalist's dulcet tones into timbral trash.

Unfortunately, all the tricks and toys available in modern recording studios can't turn a savaged voice into a Caruso or Callas. The beauty and power of the voice come from within--literally. Basic knowledge of the body's sound-producing inner mechanism not only inspires better vocal tracking, but can help circumvent age-old problems such as sibilant distortion and mic popping.

Technology Meets Biology
The human voice is an exquisitely expressive instrument, and recording its wide dynamic range and variable frequency spectrum can frustrate even the best engineers. In addition, recordists must predict split-second shifts in amplitude and resonance produced by lyric articulation and compensate for the voice's betrayal of a singer's emotional state and general health. To make matters worse, the process of vocalization has been poorly understood because its mysteries are hidden inside our bodies.

However, advances in computer technology and digital signal processors now permit voice scientists to take electrical readings of throat muscle activity, measure airflow, and analyze the spectral content of speech sounds. A recent development, videostroboscopy, finally revealed the rippling vibration of the vocal folds (or cords) in slow motion, using a fiber-optic tube carrying a tiny video camera and strobe light inserted into the throat.

Fundamentally, the voice is a wind instrument, producing both an acoustic waveform and a variable air stream. The voice box, or larynx, is situated in the airway between the lungs and mouth. What we call the "Adam's apple" is a complex structure of cartilage approximately the size of a walnut that houses two small folds of muscle tissue. These vocal folds can be positioned to remain silent, or positioned to vibrate in response to exhaled air.

Pitch and some aspects of timbre are established by minute adjustments in the length, tension, and stiffness of the vocal folds. These adjustments are not entirely under conscious control, which is why most vocalists rely on metaphoric imagery or trial-and-error to achieve a desired sound.

Fundamental and More
The fundamental frequency F0 (pronounced "F-sub-zero") produced by the vocal folds can range from approximately 80 to 700 Hz in young adult males and from 140 Hz to 1.1 kHz in young adult females. Normal speech is produced in the low-mid portion of these ranges and generally uses a narrow band of frequencies rather than a precise pitch.
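
For readers who think in notes rather than hertz, a frequency can be mapped to its nearest musical pitch. The sketch below is a minimal illustration, assuming twelve-tone equal temperament and A4 tuned to 440 Hz; the function name nearest_note is hypothetical and not taken from any library.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Return the equal-tempered note name closest to freq_hz."""
    # MIDI note 69 is A4; each semitone is a factor of 2 ** (1 / 12).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1  # MIDI note 60 is C4 in this convention
    return f"{NOTE_NAMES[midi % 12]}{octave}"


# Range endpoints quoted in the text, in Hz.
for freq in (80, 700, 140):
    print(f"{freq} Hz is closest to {nearest_note(freq)}")
```

Under these assumptions, 700 Hz lands nearest F5 and 140 Hz nearest C#3, which gives a rough sense of how far the vocal folds can travel.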

In addition to the fundamental frequency, the vocal folds also generate harmonics. The entire complex waveform then travels up into the partially-enclosed chambers of the throat, mouth, and nose. These vocal tract enclosures resonate at certain frequencies that become amplified if present in the arriving waveform.

The resonances of these chambers are called formants. Formants appear as strong peaks in the radiated frequency spectrum, labeled F1, F2, and so on. When vocalists move their lips, tongues, and throat muscles to articulate vowels and consonants, they change the shape of the resonating chambers, altering the frequency, intensity, and proportions of these formants. The resulting changes are recognized by listeners as words.
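
One way to picture the interplay of harmonics and formants is to list the harmonics of a sung note and see which ones fall closest to each formant. The following sketch does just that. The formant values for ah are rough illustrative assumptions rather than measurements, and the function names are hypothetical.

```python
def harmonics(f0_hz, max_hz):
    """Return the harmonic series of f0_hz (fundamental included) up to max_hz."""
    count = int(max_hz // f0_hz)
    return [f0_hz * n for n in range(1, count + 1)]


def nearest_harmonic(f0_hz, formant_hz):
    """Return (harmonic number, frequency) of the harmonic closest to a formant."""
    n = max(1, round(formant_hz / f0_hz))
    return n, f0_hz * n


# Illustrative values only: a 110 Hz sung note and assumed formant centers
# for an "ah" vowel. Real formants vary from singer to singer.
F0 = 110.0
ASSUMED_AH_FORMANTS = {"F1": 700.0, "F2": 1100.0}

print("Harmonics below 1.2 kHz:", harmonics(F0, 1200.0))
for name, freq in ASSUMED_AH_FORMANTS.items():
    n, hz = nearest_harmonic(F0, freq)
    print(f"{name} near {freq:.0f} Hz reinforces harmonic {n} at {hz:.0f} Hz")
```

Change the vowel (and therefore the assumed formant centers) while holding the note, and a different set of harmonics gets the boost, even though the pitch has not moved.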

The existence of formants explains why a perfect EQ setting for a singer's ah may not sound as good on the vowel ee; the shifting formants produce a kind of body-cavity EQ. So it's particularly important to audition vocal miking and EQ settings on long sections of a lyric passage rather than isolated syllables.

Pop, Whistle, Boom
Some changes in vocal sound are not related to F0 or formants. For instance, certain consonants are produced by interrupting the air stream itself. These sounds occur in the range of 2 to 6 kHz, and consist of fluctuating, pressurized air turbulence.

Although this turbulence often causes significant miking hassles in the studio, it is a necessary evil. According to speech scientists, "intelligibility rides on the consonants," so without these airflow noises, lyric content would be lost. Unfortunately, a few consonants, such as p and s, are especially troublesome when recording vocals.

P belongs to a class of speech sounds called plosives, in which the outflowing air is stopped and then suddenly--and violently--released. (Put your hand in front of your mouth and say "pa-pa-pa"; you'll feel it!) While b, t, d, g, and k also are plosives, p often packs the biggest punch, distorting microphones with a virtual sonic boom of low frequencies.

Skilled singers diminish plosives by reducing their airflow, but windscreens can rescue less-savvy vocalists by breaking up the turbulence. Other solutions often bring compromises in tow. For example, using EQ to cut a few dB at 100 Hz can tame the boom, but the adjusted timbre may "thin out" the vocal sound. Placing the mic slightly to the side of a singer's mouth also reduces pops, but room tones can sneak in and produce an unnatural timbre. Audition all the options before deciding which method works best for the song and the singer.

Another class of consonants called fricatives includes s, z, sh, th, v, and f. These sounds are created by continuous pressurized turbulence in the air stream, rather than a single explosion. As the fricative highest in frequency and intensity, s often disrupts vocal recordings by producing sibilant distortion. If you don't have a de-esser in your signal-processing rack, cutting the EQ slightly around 6 kHz during troublesome phrases can help. However, the exact sibilant frequency varies greatly from person to person (it may be as high as 10 kHz for some people), so test your vocalist's s sound to discover the best frequency to cut.
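
To preview what a narrow cut does before reaching for the console, the EQ move can be modeled as a single peaking filter. This is a minimal sketch using the widely published Audio EQ Cookbook biquad formulas by Robert Bristow-Johnson. The 4 dB cut depth, the Q of 4, and the 48 kHz sample rate are assumptions chosen for illustration, and the helper names are hypothetical.

```python
import cmath
import math


def peaking_eq(center_hz, gain_db, q, sample_rate_hz):
    """Biquad peaking-EQ coefficients (b, a) per the RBJ Audio EQ Cookbook."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp)
    a = (1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp)
    return b, a


def gain_db_at(freq_hz, b, a, sample_rate_hz):
    """Filter gain in dB at freq_hz, evaluated on the unit circle."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20 * math.log10(abs(num / den))


# Assumed settings: a 4 dB cut centered on 6 kHz at a 48 kHz sample rate.
b, a = peaking_eq(center_hz=6000, gain_db=-4, q=4, sample_rate_hz=48000)
for freq in (1000, 3000, 6000, 10000):
    print(f"{freq:>5} Hz: {gain_db_at(freq, b, a, 48000):+.2f} dB")
```

Moving center_hz toward the singer's actual sibilant frequency, or raising q to narrow the notch, shows how much of the surrounding vocal range the cut leaves untouched.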

Grand New Opry
Using EQ to compensate for a singer's off days or poor timbre is still an inexact science. Up to now, most singing research has measured operatic voices, where the ideal mouth position and resonance spectrum are significantly different from typical pop, rock, folk, and country sounds.

One study located a "nasal twang" between 2 and 2.4 kHz. If a male tenor sounds too nasal, try cutting around these frequencies. Conversely, a country singer with an unconvincing sound may be helped by boosting the vocal EQ at 2 kHz and backing off the mellowness at 500 Hz.

In the world of classical music, a "singer's formant" around 3 kHz has been identified in baritones, tenors, and contraltos; it is believed to help singers project over a symphony orchestra.

A recent demonstration at the Pacific Voice Conference involving a trained singer and a spectrum analyzer showed that this formant can be added to any F0 and vowel spectrum. (Sopranos have difficulty producing it, but they apparently don't need the boost!) In a commercial parallel, consumer-audio expert Lawrence Ullman believes that the high-pitched sound favored by male pop singers has become desirable because of the response curve of most automobile sound systems.

Clearly, there is much to be learned about the wide variations in individual body structure, personality, training, pronunciation, and style that affect a vocalist's output. With ongoing advances in technology, musicians and voice scientists can face this challenge together.

Appendix
CARE AND FEEDING OF STUDIO SINGERS
1. Smoke, dust, and extremely cold or dry air take their toll on a singer's endurance. Try to air out the vocal booth before sessions, and limit smoking to outdoor or reception areas. In cold weather, get the room temperature up to 70 degrees or so before the singers arrive.

2. Most singers know to drink fluids throughout a session to keep throat tissues moist. Warm or moderately hot beverages are preferable to cold or iced drinks. Providing a teapot, microwave, or other hospitality--herbal tea is better than coffee--shows singers that you understand and value their work.

3. A significant (but often overlooked) strain on singers is too much talking. However tempting it is to socialize during breaks, allow vocalists some peace and privacy, and encourage colleagues to do the same.

4. Studies prove that inadequate training is a leading cause of vocal damage in young pop singers. If you are especially impressed with a singer's skill and stamina, find out who they study with. You can then offer tactful referrals to less-experienced singers.

References
Colton, R.H. and Casper, J.K. Understanding Voice Problems. Williams and Wilkins, 1990.

Ladefoged, P. Elements of Acoustic Phonetics. University of Chicago Press, 1974.

Sundberg, J. The Science of the Singing Voice. Northern Illinois University Press, 1987.

Reprinted from Electronic Musician, February 1993
