
How to Mix Using Multiple Reference Monitors

And not drive yourself crazy

When I first started mixing, it sometimes felt like I was redoing my work over and over until I hit my deadline and was forced to stop. My mix process back then was to mix through my main speakers (full-range), then switch to small speakers for a pass. Then I’d switch back to my main speakers and find a totally different set of problems. I’d do a pass through a third set of speakers, and it’d open up another can of worms.

It was very hard to trust my mix decisions. I didn’t trust the rooms I was working in. I didn’t trust my speakers. I sometimes questioned my ears or my ability. When there’s that much doubt, how are you ever able to make a decision? You can’t. Constantly questioning what is “right” slows down the mix process severely.

From a mixing perspective, nearly every room is flawed in some way. There are room resonances, bass management issues, less-than-ideal speaker placement, noise, reflections, or phase issues. Even a room that’s been tuned by a great acoustician and is considered flat can have a 6 dB variance or more! The only way to trust a room (or monitors) is to accept the room for what it is.

First and foremost, it helps to reduce as many changing variables as possible. Mix as much as you can in the same room using one set of reference monitors. Think of it as your “home base.” The goal is to have a setup that you trust – not because it sounds amazing, but because you know its quirks, flaws, and strengths.

As you mix, make a mental note of the things you notice: what frequencies are you always EQing? When you pan, is the imaging clear or muddy? Critical listening is about observation without judgment. Once you make judgments (especially that a mix sounds better or worse depending on the environment, plugin, etc.), it can turn into a psychological game. This is when you start questioning your speakers, your room, and yourself.

Some of the best advice I’ve ever received about mixing is “mix however makes you comfortable.” Auratone speakers (a standard found in many post-production mix rooms) make my ears ring, so I don’t use them. If I mix through a television set, I listen at the same level I listen to TV at home. I quit mixing full-range at 82 dB (which I sometimes find uncomfortably loud) and now work closer to 78 dB, or even lower on occasion. What I gain in confidence by listening at a comfortable level far outweighs what I lose sonically (by not mixing at the nominal calibrated level for a mix room).

Working in different rooms and monitoring situations can be used to your advantage. When I’m working on a film, I sometimes prefer to edit on headphones (especially to treat pops, clicks, and unwanted noises). I like to do my detailed EQ work and noise reduction in a room with near-field monitors (like a home studio). This allows you to hear detail that might be lost on a theatrical mix stage. If I can work on a theatrical stage, that’s the best place to deal with bass management (like mixing to the subwoofer) and mixing in 5.1.

In post-production, we don’t just change monitors; we sometimes change rooms completely. On top of that, the final mix might be going to a movie theater, television (Blu-ray, video on demand), and eventually online (to laptop or cell phone listeners). We’ve got 5.1 and stereo to consider (or even 3D immersive audio). Many projects don’t have the budget for separate mixes, so sometimes you have to make decisions that are good for one listening environment and bad for another. I find that as a mixer I’m happier if I do one mix that I’m really happy with versus trying to find a middle ground. I tend to cater to the audience that will have the most views.

It’s good to ask yourself, “What am I trying to achieve by changing monitors?” I don’t change monitors anymore unless there’s a specific reason.

There’s definitely value in changing how you listen. I change my listening level a lot when I’m mixing film scores to hear how the mix sits in context against dialog. If I’m mixing in 5.1, I might switch to stereo to check how something I’ve mixed translates. I might listen through a TV or my phone if there’s a specific question or need for it.

A big part of learning to mix well is learning how to mix poorly, too. How often do you go back to an old mix and think, “That really sucked!” when at the time you thought it was great? We do what sounds “right” until we find something new that sounds right. There are times you have to accept that your mix is the best you’re going to do that day. Tomorrow is a new day, a new mix, and a chance to do something different.

Ser bilingüe no siempre funciona

Por Andrea Arenas / Colaboración Vanessa Montilla

Es posible que hayas hecho varios cursos de idiomas. Sin embargo, nada te prepara para trabajar el día a día como ingeniero de sonido si estás de gira en un país donde se habla un idioma diferente a tu idioma materno. Es probable que, por más cursos que hagas, en ninguno te hayan enseñado cómo le dicen a “peinar los cables”, y así con muchas palabras del argot técnico e incluso del cotidiano.

Es por eso que he decidido hacer un pequeño glosario de objetos utilizados comúnmente en el audio pero que posiblemente no encontrarás en ningún libro de diseño de sistemas o de técnicas de grabación, y que por lo tanto no estás acostumbrado a utilizar en un idioma diferente al tuyo. Espero les sea útil y que además podamos completarlo entre todos en diferentes idiomas.


Cables


Conectores/Connectors


Audio

 


Electricidad / Electrics


Herramientas / Tools / Gadgets


Artículos de oficina / Office supplies


Acciones / Actions


Instrumentos musicales / Musical instruments


Medidas / Measurements

1.5m 5 feet
3m 10 feet
7.6m 25 feet
15m 50 feet
30m 100 feet
50m 165 feet
100m 330 feet

Being Bilingual Does Not Always Work

By Andrea Arenas / In collaboration with Vanessa Montilla

You may have taken several language courses. However, nothing prepares you for day-to-day work as a sound engineer if you are on tour in a country where a language other than your native one is spoken. No matter how many courses you take, it is likely that none of them taught you how to say “comb the cables” (Spanish slang for “untangle the cables”), or many other words of technical and even everyday jargon.

That is why I have decided to put together a small glossary of objects commonly used in audio that you may not find in any book on system design or recording techniques, and that you are therefore not accustomed to using in a language other than your own. I hope it is useful to you, and that together we can expand it into different languages.


Cables


Conectores/Connectors


Audio

 


Electricidad / Electrics


Herramientas / Tools / Gadgets


Artículos de oficina / Office supplies


Acciones / Actions


Instrumentos musicales / Musical instruments


Medidas / Measurements

1.5m 5 feet
3m 10 feet
7.6m 25 feet
15m 50 feet
30m 100 feet
50m 165 feet
100m 330 feet

Keeping it Real – Section 2

This is Section 2 of Becky Pell’s three-section article on using psychoacoustics in IEM mixing and the technology that takes it to the next level. Section 1

Acoustic Reflex Threshold

Have you ever noticed how you and the band can take a break from rehearsing, come back half an hour later, and when you put your ears back in everything feels louder? And then how, after a few moments, it settles down and feels normal again? It’s because of a reflex action of the stapedius muscle in the middle ear. When this little muscle contracts, it pulls the stapes or ‘stirrup bone’ slightly away from the oval window of the cochlea, against which it normally vibrates to transmit pressure waves to be converted into nerve impulses. This action, which is a response to sounds between 70 and 100 dB SPL, effectively creates a compression effect resulting in a 20 dB reduction in what you hear. However, the muscle can’t stay fully contracted for long periods, so after a few seconds the tension drops to around 50% of the maximum. Whilst the initial reaction, at 150 milliseconds, is not fast enough to fully protect the ear against very loud and sudden transient sounds, it helps in reducing hearing fatigue over longer periods.

Interestingly, this reflex also occurs when a person vocalises, which helps to explain why a singer’s in-ear mix of the band might sound loud enough in isolation, but when they start singing they find they need more instrumentation. This happens in conjunction with the fact that they are hearing themselves not only via the mix but also through the bone conductivity of their skull. It’s well worth trying to sing along to an IEM mix that you’ve prepared for a singer to experience what this feels like for them, because it’s a very different sensation from simply shouting down the mic to EQ it.

The acoustic reflex threshold also means that transients appear quieter than sustained sounds of the same level, and it’s the thinking behind a compression trick that is often used in studios and film production. When you compress the decay of a short sound such as a drum hit, it fools the brain into thinking the drum hit as a whole is significantly louder and punchier than it is, even though the peak level – the transient – has not changed. Personally, I’d advocate caution if you’re going to try this in a monitor mix – the drummer needs to hear what their drums ACTUALLY sound like, and getting things such as drum tuning and mic placement correct at source is vital – but it’s an interesting thing to be aware of.
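To hear the effect away from the console, here is a minimal sketch of the idea, assuming Python with NumPy (the synthetic “drum hit”, the ratio, and the envelope values are all illustrative, not a recipe): it lifts the decay of a short sound while leaving the transient peak essentially untouched, and the RMS level rises by several dB even though the peak does not.

    import numpy as np

    sr = 48000
    t = np.arange(0, 0.5, 1 / sr)

    # A crude stand-in for a drum hit: a 100 Hz tone with a fast exponential decay.
    env = np.exp(-18 * t)
    hit = np.sin(2 * np.pi * 100 * t) * env

    # "Compress the decay": upward compression against the known envelope, so the
    # quiet tail comes up while the loudest moment is left almost unchanged.
    ratio = 4.0
    gain = (env / env.max()) ** (1 / ratio - 1)   # gain = 1 at the peak, > 1 in the tail
    processed = hit * gain

    def peak_db(x):
        return 20 * np.log10(np.abs(x).max())

    def rms_db(x):
        return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

    print(f"peak before/after: {peak_db(hit):.1f} / {peak_db(processed):.1f} dBFS")
    print(f"rms  before/after: {rms_db(hit):.1f} / {rms_db(processed):.1f} dBFS")

Render both versions to audio and the processed hit reads as louder and punchier, on a meter and to the ear, even though the transient itself hasn’t moved.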

All in the timing

Our ability to perceive sounds as separate events depends not only on there being sufficient difference between them in frequency, but also on timing. This phenomenon is known as the ‘precedence effect’ or the ‘Haas effect.’

These effects describe how, when two identical sounds are presented in quick succession, they are heard as a single sound. This perception occurs when the delay between the two sounds is between 1 and 5 ms for single click sounds, but up to 40 ms for more complex sounds such as piano music. When the lag is longer, the second sound is heard as an echo. A single reflection arriving within 5 to 30 ms can be up to 10 dB louder than the direct sound without being perceived as a distinct event. In 1951, Helmut Haas examined how the perception of speech is affected by the presence of a single reflection. He discovered that a reflection arriving later than 1 ms after the direct sound increases the perceived level and spaciousness (more precisely, the perceived width of the sound source) without being heard as a separate sound. This holds true up to around 20 ms, at which point the sounds become distinguishable.

This can be an interesting experiment to try with a vocal mic and your IEMs. Split the vocal mic down two channels, delay one input by somewhere between 1 and 20 ms, and see what you notice. Then try panning one input hard left and the other hard right, and notice how the vocal sounds thicker and creates a sense of width and space. Play with the delay time, and you’ll find that if it’s too short the signal starts to phase; too long and you lose the illusion. This trick does make the signal susceptible to comb filtering if you sum the inputs back to mono, especially at shorter delay times, so be aware of that.
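If you’d like to prototype the trick away from a console, here is a minimal sketch assuming Python with NumPy and a mono signal as a float array (the 12 ms delay and the noise standing in for a vocal are just illustrative values):

    import numpy as np

    def haas_widener(mono, sr, delay_ms=12.0):
        """Duplicate a mono signal, delay one copy, and hard-pan the pair."""
        delay = int(sr * delay_ms / 1000)
        direct = np.concatenate([mono, np.zeros(delay)])
        delayed = np.concatenate([np.zeros(delay), mono])
        return np.stack([direct, delayed], axis=1)   # columns = left, right

    sr = 48000
    vocal = 0.1 * np.random.randn(sr)    # placeholder signal standing in for a vocal

    stereo = haas_widener(vocal, sr, delay_ms=12.0)

    # Mono-compatibility check: summing left + right recombines the signal with
    # its delayed copy, which is exactly the recipe for comb filtering.
    mono_sum = 0.5 * stereo.sum(axis=1)

Sweep delay_ms between roughly 1 and 20 ms to hear the same behaviour described above: very short delays phase, and longer ones fall apart into an audible echo.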

Once again, I would advocate extreme caution if you intend to use this in a monitor mix, as ‘tricking’ a singer in this way can backfire! However, it’s a useful principle to be aware of if you have the opportunity to get creative with other sounds, and I use it a lot when adding pre-delay to a reverb – try it for yourself. No pre-delay creates a feeling of immediacy to the effect, but just 5-10 ms creates a slight sense of space. If you’re after a little more breathiness and drama – ‘vampires swirling’ as I once heard it described – try increasing the pre-delay up to 20 ms and feel how it changes.

The Haas effect is also something to be very aware of in IEM mixing when it comes to digital latency. Every time we take a signal out of the console and send it somewhere else in the digital domain, a small time delay known as latency is introduced. Different processing devices introduce different amounts of latency, and obviously the less, the better. The more devices we add, the more the latency stacks up. Whilst a few milliseconds of latency may be totally imperceptible for, say, a guitarist, it’s a different matter when it comes to vocals. A singer will often be able to perceive something as being not quite right without being able to put their finger on it, because when we vocalise and have that signal returned to our ears, the discrepancy between what we hear at the moment of making the sound and the moment of it returning becomes heightened in our awareness. It’s something to be vigilant about when adding any digital outboard or plug-ins to a singer’s channels.
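Because latency accumulates, it can be worth totting it up for the whole vocal path. A minimal sketch, assuming Python; every device name and per-device figure below is a made-up placeholder, so check the published specs for your own console, plug-ins, and wireless gear:

    # Hypothetical latency budget for one vocal path, in samples at 48 kHz.
    sample_rate = 48000  # Hz

    path_latency_samples = {
        "console A/D and D/A": 90,
        "insert to an outboard processor": 64,
        "plug-in on the vocal channel": 32,
        "wireless IEM transmitter": 48,
    }

    total_samples = sum(path_latency_samples.values())
    total_ms = 1000 * total_samples / sample_rate
    print(f"total round trip: {total_samples} samples = {total_ms:.2f} ms")

Individually each figure looks harmless, but this example path already adds up to nearly 5 ms before the singer’s own voice gets back to their ears.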

Location Services

The Haas effect also affects where we perceive a sound to be coming from – the apparent location of the source is determined by the sound which arrives first, even though the two sounds may come from different physical locations. This holds true until the second sound is around 15 dB louder than the first, at which point the perception of direction changes.

Sound localisation is a very complex mechanism performed by the human brain. It’s not only dependent on the directional cues received by the ears; it is also intertwined with the other senses, especially vision and proprioception. Our ability to determine a sound’s location and distance is called binaural hearing, and in addition to all the psychoacoustic effects discussed so far, it is also heavily influenced by the physical shape of our heads, ears, and even torsos. The outer ear or ‘pinna’ functions as a directional sound collector which funnels sound waves into the ear canal. The head and the topography of our face and torso influence how sounds from any position other than a 0° angle are heard, as they create an acoustic ‘shadow.’ Our brains process the differences between the information that our two ears collect, and interpret the results to determine where a sound is coming from, how far away it is, and whether it’s still or moving. At lower frequencies, below about 2 kHz, this is mostly determined by the inter-aural time difference: the discrepancy in arrival time between the two ears. Above 2 kHz, the information comes mostly from the inter-aural level difference: the discrepancy in volume between what each ear hears. This clever evolutionary adaptation is due to the relative lengths of sound waves at different frequencies. For frequencies below 800 Hz, the dimensions of the head are smaller than half the wavelength of the sound waves, so the brain can determine phase delays between the ears without ambiguity.

However, for frequencies above 1600 Hz the dimensions of the head are greater than half the wavelength of the sound waves, so a determination of direction based on phase alone is not possible at higher frequencies; instead, we rely on the level difference between the two ears. Together, these two binaural cues are known as the duplex theory, and they play an important role in sound localisation in the horizontal plane.

(As the frequency drops below 80 Hz it becomes difficult or impossible to use either time difference or level difference to determine a sound’s lateral source because the phase difference between the ears becomes too small for a directional evaluation, hence the experience of sub-bass frequencies being omnidirectional.)
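As a rough sanity check on those crossover figures, here is a small sketch assuming Python, a speed of sound of about 343 m/s, and an ear-to-ear distance of roughly 0.18 m (both approximations); it compares half-wavelengths with the size of the head and estimates the largest possible inter-aural time difference:

    c = 343.0          # speed of sound in air, m/s (approximate)
    head_width = 0.18  # ear-to-ear distance, m (rough average)

    for f in [200, 800, 1600, 2000, 8000]:
        half_wavelength = c / f / 2
        cue = ("time/phase difference usable" if half_wavelength > head_width
               else "level difference dominates")
        print(f"{f:>5} Hz: half wavelength = {half_wavelength:.3f} m -> {cue}")

    # The largest inter-aural time difference: sound from directly to one side
    # has to travel roughly one head-width further to reach the far ear.
    print(f"max ITD is roughly {head_width / c * 1000:.2f} ms")

Around 800 Hz the half wavelength is about the size of the head, and the time cue gives way to the level cue, which lines up with the duplex theory figures above.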

Whilst these lateral cues make it easy to sense which side a sound is coming from, it’s harder to determine direction in the up/down and front/back planes, because our ears are placed at the same horizontal level as each other. Some types of owl have their ears placed at different heights to allow for greater efficiency in finding prey when hunting at night, but humans have no such facility. This can result in ‘cones of confusion’, where we are unsure of the elevation of a sound source because all sounds that lie in the mid-sagittal plane produce similar inter-aural differences; however, once again the shapes of our bodies help us out. Imagine a sound source right in front of you. The reflection off your torso takes a slightly longer path than the direct sound, so it arrives at both ears a little later. This yields a slight comb-filter pattern, which changes if the source is elevated. The same is true if the source is moved behind you: the torso reflection changes, and our brains process these discrepancies to help us locate the source.

Next time: In the third and final section of this series on using psychoacoustics to enhance your monitor mixing, we’ll discover a ground-breaking new technology that takes IEMs to a whole new dimension.

Missed this Week’s Top Stories? Read our Quick Round-up!

It’s easy to miss the SoundGirls news and blogs, so we have put together a round-up of the blogs, articles, and news from the past week. You can keep up to date and read more at SoundGirls.org

June Feature Profile

The Road from Montreal to Louisville – Anne Gauthier

The Blogs

Keeping It Real

Playing With Voices


SoundGirls News

Shadowing Opportunity w/ Guitar Tech Claire Murphy

Shadowing Opportunity w/ FOH Engineer Kevin Madigan

Shadowing Opportunity w/ ME Aaron Foye

Letter for Trades and Manufacturers

https://soundgirls.org/scholarships-18/

Accepting Applications for Ladybug Music Festival

SoundGirls London Chapter Social – June 17

https://soundgirls.org/event/glasgow-soundgirls-meet-greet/?instance_id=1272

Shadowing Opportunities

Telefunken Tour & Workshop

https://soundgirls.org/event/colorado-soundgirls-ice-cream-social/?instance_id=1313

SoundGirls Expo 2018 at Full Sail University

Round Up From the Internet

On tour with Brittany Kiefer

 

 


View from the Top: Maureen Droney, The Recording Academy

“I’m privileged to be an advocate for my favorite people: recording engineers and producers.”

 


SoundGirls Resources

Directory of Women in Professional Audio and Production

This directory provides a listing of women in disciplines industry-wide for networking and hiring. It’s free – add your name, upload your resume, and share with your colleagues across the industry.


Women-Owned Businesses

Member Benefits

Events

Sexual Harassment

https://soundgirls.org/about-us/soundgirls-chapters/

Jobs and Internships

Women in the Professional Audio

Keeping It Real

Using psychoacoustics in IEM mixing and the technology that takes it to the next level

SECTION 1

All monitor engineers know that there are many soft skills required in our job – building a trusting relationship with bands and artists is vital for them to feel supported so they can forget about monitoring and concentrate on their job of giving a great performance. But what do you know about how the brain and ears work together to create the auditory response, and how can you make use of it in your mixes?

Hearing is not simply a mechanical phenomenon of sound waves travelling into the ear canal and being converted into electrical impulses by the nerve cells of the inner ear; it’s also a perceptual experience. The ears and brain join forces to translate pressure waves into an informative event that tells us where a sound is coming from, how close it is, whether it’s stationary or moving, how much attention to give to it and whether to be alarmed or relaxed in response. Whilst additional elements of cognitive psychology are also at play – an individual’s personal expectations, prejudices and predispositions, which we cannot compensate for – monitor engineers can certainly make use of psychoacoustics to enhance our mixing chops. Over the space of my next three posts, we’ll look at the different phenomena which are relevant to what we do, and how to make use of them for better monitor mixes.

What A Feeling

Music is unusual in that it activates all areas of the brain. Our motor responses are stimulated when we hear a compelling rhythm and we feel the urge to tap our feet or dance; the emotional reactions of the limbic system are triggered by a melody and we feel our mood shift to one of joy or melancholy; and we’re instantly transported back in time upon hearing the opening bars of a familiar song as the memory centres are activated. Studies have shown that memories can be unlocked in severely brain-damaged people and dementia patients by playing them music they have loved throughout their lives.

The auditory cortex of the brain releases the reward chemical dopamine in response to music – the same potentially addictive chemical which is also released in response to sex, Facebook ‘likes’, chocolate and even cocaine…. making music one of the healthier ways of getting your high. DJs and producers use this release to great effect when creating a build-up to a chorus or the drop in a dance track; in a phenomenon called the anticipatory listening phase, our brains actually get hyped up waiting for that dopamine release when the music ‘resolves’, and it’s manipulating this pattern of tension and release which creates that Friday night feeling in your head.

Missing Fundamentals

Our brains are good at anticipating what’s coming next and filling in the gaps, and a phenomenon known as ‘missing fundamentals’ demonstrates a trick our brains play on our audio perception. Sounds that are not a pure tone (i.e. a single-frequency sine wave) have harmonics. These harmonics are evenly spaced: a sound with a root note of 100 Hz will have harmonics at 200, 300, 400, 500 Hz and so on. However, our ears don’t actually need to receive all of these frequencies in order to perceive the note correctly. If you play those harmonic frequencies and then remove the root frequency (in this case 100 Hz), your brain will fill in the gap and you’ll still perceive the note in its entirety – you’ll still hear 100 Hz even though it’s no longer there. You experience this every time you speak on the phone with a man – the root note of the average male voice is around 150 Hz, but most phones cannot reproduce below 300 Hz. No matter – your brain fills in the gaps and tells you that you’re hearing exactly what you’d expect to hear. So whilst the tiny drivers of an in-ear mould may not physically be able to reproduce the very low fundamental notes of some bass guitars or kick drums, you’ll still hear them as long as the harmonics are in place.
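Here is a minimal sketch of the effect, assuming Python with NumPy (write the array to a WAV file or play it with any audio library you like): it builds a tone from the harmonics of 100 Hz only, with nothing at 100 Hz itself, yet most listeners still report a 100 Hz pitch.

    import numpy as np

    sr = 48000
    t = np.arange(0, 2.0, 1 / sr)

    # Harmonics of a 100 Hz fundamental, with the 100 Hz component itself left out.
    harmonics = [200, 300, 400, 500, 600]
    tone = sum(np.sin(2 * np.pi * f * t) for f in harmonics)
    tone = 0.2 * tone / np.max(np.abs(tone))   # normalise to a modest listening level

    # The spectrum contains no energy at 100 Hz, but the ear infers the missing
    # fundamental from the 100 Hz spacing of the harmonics.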

A biased system

Human hearing is not linear – our ear canals and brains have evolved to give greater weight to the frequencies where speech intelligibility occurs. This is represented in the famous Fletcher-Munson equal-loudness curves, and it’s where the concept of A-weighting for measuring noise levels originated. The curves show, for example, that a 62.5 Hz tone is perceived as equal in loudness to a 1 kHz tone that is actually around 30 dB SPL quieter.
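A-weighting, which grew out of those curves, is simple enough to compute directly. A minimal sketch, assuming Python and the standard IEC 61672 analogue weighting formula (values are relative to 1 kHz, where the weighting is defined as 0 dB):

    import math

    def a_weighting_db(f):
        """A-weighting in dB relative to 1 kHz (IEC 61672 analogue formula)."""
        f2 = float(f) ** 2
        ra = (12194.0 ** 2 * f2 ** 2) / (
            (f2 + 20.6 ** 2)
            * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
            * (f2 + 12194.0 ** 2)
        )
        return 20 * math.log10(ra) + 2.00

    for f in [62.5, 125, 250, 500, 1000, 4000, 8000]:
        print(f"{f:>7.1f} Hz: {a_weighting_db(f):+6.1f} dB")

The output shows roughly -26 dB at 62.5 Hz and 0 dB at 1 kHz, in the same ballpark as the equal-loudness behaviour described above; bear in mind that A-weighting is only a single-curve approximation of the full family of contours, which change shape with level.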

Similarly, the volume threshold at which we first perceive a sound varies according to frequency. The area of the lowest absolute threshold of hearing is between 1 and 5 kHz; that is, we can detect a whisper of human speech at far lower levels than we detect a frequency outside that window. However, if another sound of a similar frequency is also audible at the same time, we may experience the phenomenon known as auditory masking.

This can be illustrated by the experience of talking with a friend on a train station platform, and then having a train speed by. Because the noise of the train encompasses the same frequencies occupied by speech, suddenly we can no longer clearly hear what our friend is saying, and they have to either shout to be heard or wait for the train to pass: the train noise is masking the signal of the speech. The degree to which the masking effect is experienced is dependent on the individual – some people would still be able to make out what their friend was saying if they only slightly raised their voice, whilst others would need them to shout loudly in order to carry on the conversation.

Masking also occurs in a subtler way. When two sounds of different frequencies are played at the same time, as long as they are sufficiently far apart in frequency, two separate sounds can be heard. However, if the two sounds are close in frequency they are said to occupy the same critical bandwidth, and the louder of the two will render the quieter one inaudible. For example, if we were to play a 1 kHz tone so that we could easily hear it, and then add a second tone of 1.1 kHz a few dB louder, the 1 kHz tone would seem to disappear. When we mute the second tone, we confirm that the original tone is still there and was there all along; it was simply masked. If we then re-add the 1.1 kHz tone so the original tone vanishes again, and slowly sweep the 1.1 kHz tone up the frequency spectrum, we will hear the 1 kHz tone gradually ‘re-appear’: the further away the second tone gets from the original one, the better we hear them as distinct sounds.

This ability to hear frequencies distinctly is known as frequency resolution, a type of filtering that takes place in the basilar membrane of the cochlea. When two sounds are very close in frequency, we cannot distinguish between them, and they are heard as a single signal. Someone with hearing loss due to cochlear damage will typically struggle to differentiate between consonants in speech.

This is an important phenomenon to be aware of when mixing. The frequency range to which our hearing is most attuned, roughly 500 Hz to 5 kHz, is where many of our musical inputs such as guitars, keyboards, strings, brass, and vocals reside; and when we over-populate this prime audio real estate, things can start to get messy. This is where judicious EQing becomes very useful in cleaning up a mix – for example, although a kick drum mic will pick up frequencies in that mid-range region, that’s not where the information for that instrument is. The ‘boom’ and ‘thwack’ which characterise a good kick sound sit lower and higher than that range, so by creating a deep EQ scoop in that mid-region, we can clear out some much-needed real estate and un-muddy the mix. Incidentally, because of the non-linear frequency response of our hearing, this also tricks the brain into thinking the sound is louder and more powerful than it is. The reverse is also true; rolling off the highs and lows of a signal creates a sense of front-to-back depth and distance.
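For anyone who wants to experiment offline, here is a minimal sketch of that kind of mid scoop, assuming Python with NumPy and SciPy; the biquad coefficients follow the widely used RBJ Audio EQ Cookbook peaking filter, and the 400 Hz centre, -8 dB depth, and Q of 0.7 are illustrative starting points rather than a recipe.

    import numpy as np
    from scipy.signal import lfilter

    def peaking_eq(sr, f0, gain_db, q):
        """Biquad peaking EQ coefficients (RBJ Audio EQ Cookbook)."""
        a_gain = 10 ** (gain_db / 40)
        w0 = 2 * np.pi * f0 / sr
        alpha = np.sin(w0) / (2 * q)
        b = np.array([1 + alpha * a_gain, -2 * np.cos(w0), 1 - alpha * a_gain])
        a = np.array([1 + alpha / a_gain, -2 * np.cos(w0), 1 - alpha / a_gain])
        return b / a[0], a / a[0]

    sr = 48000
    b, a = peaking_eq(sr, f0=400.0, gain_db=-8.0, q=0.7)

    kick = np.random.randn(sr)        # placeholder signal standing in for a kick mic
    scooped = lfilter(b, a, kick)     # kick channel with the mid-range scooped out

On a real console the same idea applies on the channel EQ: a broad cut somewhere in the low mids on the kick, leaving room for the vocals and guitars that live there.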

It’s also worth considering whether all external track inputs are necessary for a monitor mix – frequently pads and effects occupy this territory, and whilst they may add to the overall picture on a large PA, are they helping or hindering when it comes to creating a musical yet informative IEM mix?

Next time: In the second part of this psychoacoustics series we’ll examine the Acoustic Reflex Threshold, the Haas effect, and how our brains and ears work together to determine where a sound is coming from; and we’ll explore what it all means for IEM mixes.


 

Shadowing Opportunity w/ Guitar Tech Claire Murphy

SoundGirls members who are actively pursuing a career in guitar teching, backline, or concert production are invited to shadow guitar tech Claire Murphy. Claire is currently on tour with Vance Joy.

The experience will focus on guitar teching: setting up “guitar world,” setting up the stage, and experiencing line check and soundcheck with the artist. This is open to SoundGirls members ages 18 and over. There is one (1) spot available for each show. Most call times will be at 11.30 am (TBD), and members will most likely be invited to stay for the show (TBD). Ideally, applicants will be able to demonstrate some experience in touring, or knowledge thereof, to gain the most from this opportunity.

Please fill out this application and send a resume to soundgirls@soundgirls.org with Vance Joy in the subject line. If you are selected to attend, information will be emailed to you.

Playing With Voices

When I went to the Acoustical Society of America’s meeting a few years ago, I did not know what to expect. I was presenting an undergraduate research paper on signal processing and was expecting individuals with similar backgrounds. Instead, there were presentations on marine wildlife, tinnitus, acoustic invisibility, and the speech patterns of endangered languages. One individual I met there was Colette Feehan, a linguistics doctoral student at Indiana University. I gravitated to her upbeat personality and affinity for collecting awesome trivia. When she mentioned in passing her interest in voice acting, I thought I should follow up and pick her brain on the nuances of voice acting.

Colette Feehan

What is voice acting?

Voice acting is providing vocalizations for various kinds of animated characters and objects. This can be speech, grunts, screams, musical instruments, animal vocalizations, and a whole array of other sounds. When watching an animated TV show or movie, every sound you hear has to come from either someone’s mouth or some creative use of props. Often voice acting draws from generalizations about language that both the actor and the audience hold. In a way, some might think of voice acting as acting with a handicap. You’re not just acting with one arm tied behind your back; you’re acting without the help of any of your body language, facial expressions, etc. You need to convey all that information using just your voice. It’s honestly quite fascinating.

What got you interested in voice acting?

As a kid, I would always imitate sounds from baby elephants to musical instruments to voicing children younger than me. I can’t think of one specific moment that made me interested in voice acting, but I can certainly say it has always been a part of my life.

Who are your favorite voice actors?

I have too many to count. Some classic voice actors are Daws Butler (Yogi Bear, Elroy Jetson, Cap’n Crunch) and June Foray (Rocky the Flying Squirrel, Cindy Lou Who, Mulan’s Grandmother). There is also Charlie Adler (Cow, Chicken, and the Red Guy from Cow and Chicken, Mr. and Mrs. Bighead in Rocko’s Modern Life), Frank Welker (Fred Jones from Scooby-Doo, Nibbler from Futurama), Rob Paulsen (Yakko Warner, Carl Wheezer, Pinky), Grey DeLisle (Mandy from The Grim Adventures of Billy and Mandy and Azula in Avatar), Tara Strong (Timmy Turner, Bubbles from Powerpuff Girls, Dil Pickles), and Dee Bradley Baker (Momo and Appa from Avatar, Olmec in Legends of the Hidden Temple, Perry the Platypus).

What are your favorite voices to do?

First, I think it’s important to mention that I study the linguistics, phonetics, and acoustics of voice actors MUCH more than I actually do voices myself, though I have lent my voice to some improv, plays, friends’ animated projects, etc.

I’m a bit of a one-trick pony when it comes to voices, though. I can do teenagers and little kids, but not much else.

Any favorite tricks or sounds?

In contrast, I can do loads of weird sounds: kazoo, trumpet, electric guitar, mourning dove, cats (meow and purr), dogs.

Does voice acting have a specific lingo, and if so what terms should directors learn for more efficient directing?

It does! I’ve actually considered starting a bit of an informal dictionary of terms while working with voice actors on the linguistics of voice acting. Most of the lingo that I’ve really paid attention to involves linguistics concepts: what linguists call “dark L,” some voice actors call “lazy L.” What linguists call “breathy voice,” voice actors call “smokey voice.” The one that is really interesting is what Rebecca Starr (2015) calls “sweet voice,” an EXTREMELY specialized kind of breathy voice found in anime that indexes a very specific character archetype.

I have heard that you are doing some research on voice actors, could you tell me a little about that?

In the Speech Production Lab at Indiana University, I am using a special 3D/4D ultrasound setup to look at the articulatory phonetics of adult voice actors who produce child voices for TV and film. A lot of people either don’t know or don’t think about the fact that when we listen to child characters, particularly in animated TV, those voices are often being produced by an adult. The big question I am asking with my dissertation is: what are adults doing with their vocal tract anatomy in order to sound like a child?

So if anyone doesn’t know a lot about how ultrasound works, here is a quick and dirty description:

Ultrasound works by emitting high-frequency sound waves and timing how long it takes for them to bounce back. We place an ultrasound probe (like what you use to see a baby) under the participant’s chin and record ultrasound data of their speech in real time. What we can see using ultrasound is an outline of the surface of the tongue. The sound waves travel through the tissues of the face and tongue, which are a fairly dense medium. When the waves come into contact with the air along the surface of the tongue, which is a much lower-density medium, they show up on the ultrasound as a bright line, which we can trace to create static images and dynamic video of the tongue movement. So what does 3D/4D mean? We have a fancy ultrasound probe that records in three planes: sagittal, coronal, and transverse. We take all these static, 2D images, trace them, then compile them into one 3D representation of the tongue. Then we can sync this with a recording of the speech, creating our fourth dimension: time. So we can create videos of what a 3D representation of the tongue is doing while speaking, and we can hear what it was doing at that moment. It is really cool.

So back to voice actors. With my dissertation research, I am imaging a few voice actors in two conditions: 1) doing their regular, adult voices and 2) doing their child voices. Then I compare what changes across those two conditions and what doesn’t.

So things I am looking for are: What is the hyoid bone doing (the bone in your neck near where your neck meets your head)? Does the place where the tongue touches the roof of the mouth for different consonants change? Are general tongue shapes and movements different across the two conditions? How do the acoustics change (how does the sound change)? Are those changes in acoustics changes that we would predict based on what the anatomy is doing?

How balanced is diversity in the voice actor industry?

Voice acting has a bit of a double-edged sword in that you don’t have to *look* the part to get the role. It’s just your voice! So someone who might not be your size-6, blonde-haired, wide-eyed beauty can still get the opportunity to play that character. Where this becomes negative, however, is with actors of color. Because you don’t have to look the part, I think a lot of white actors get roles that otherwise would have HAD to go to an actor of color. I do know the field has recently been trying to address this issue, but we can certainly do better.

So what is your opinion on vocal fry?

I love creaky voice (I’m going to use this term instead). It can mean so many different things, socially. Is the speaker a man or a woman? Are they in their 20s? Are they using uptalk? Are they just running out of air at the end of their utterance?

Why is there the focus on women’s vocal fry?

I can’t say I’ve studied why specifically women’s creaky voice has blown up so much recently. Creak is really common in deeper voices, so men do it all the time, but we don’t seem to notice. Maybe when women started doing it more, people unconsciously associated it with being manly and reacted negatively to it. Or maybe it’s that creak is often paired with uptalk, so it became stigmatized really quickly.

How are men’s and women’s voices different?

Again, I’m not sure I’m the most qualified to talk about this, but I can say that men’s and women’s voices differ in many categories. First, there is simply anatomy: men have an Adam’s apple, which increases the area for resonance in the larynx. They also tend to be bigger, have bigger lungs, etc., making their voices different. Then there are a lot of social ways in which men’s and women’s voices differ. Taking creak as an example again, when women use creak it is associated with very different things than when men use it. So the same “thing” performed by a man and by a woman can be interpreted quite differently. Humans are fascinating.

 

Missed this Week’s Top Stories? Read our Quick Round-up!

It’s easy to miss the SoundGirls news and blogs, so we have put together a round-up of the blogs, articles, and news from the past week. You can keep up to date and read more at SoundGirls.org

June Feature Profile

The Road from Montreal to Louisville – Anne Gauthier

The Blogs

Multitasking – Why you should avoid it

Soldering for Beginners


SoundGirls News

Shadowing Opportunity w/ FOH Engineer Kevin Madigan

Shadowing Opportunity w/ ME Aaron Foye

Letter for Trades and Manufacturers

https://soundgirls.org/scholarships-18/

Accepting Applications for Ladybug Music Festival

https://soundgirls.org/event/vancouver-soundgirls-chapter-one-year-anniversary/?instance_id=1285

SoundGirls London Chapter Social – June 17

https://soundgirls.org/event/glasgow-soundgirls-meet-greet/?instance_id=1272

Shadowing Opportunities

Telefunken Tour & Workshop

https://soundgirls.org/event/colorado-soundgirls-ice-cream-social/?instance_id=1313

SoundGirls Expo 2018 at Full Sail University

Round Up From the Internet

The Theatrical Sound Designers and Composers Association Releases Statement on Women+ in Sound Design for Broadway and Theatres Across the Country


 

 

Engineer Liv Nagy on mixing sound for theatre

 

 


SoundGirls Resources

Directory of Women in Professional Audio and Production

This directory provides a listing of women in disciplines industry-wide for networking and hiring. It’s free – add your name, upload your resume, and share with your colleagues across the industry.


Women-Owned Businesses

Member Benefits

Events

Sexual Harassment

https://soundgirls.org/about-us/soundgirls-chapters/

Jobs and Internships

Women in the Professional Audio
