
The Psychoacoustics of Modulation

Modulation is still an impactful tool in Pop music, even though it has been around for centuries; there are well-known key changes in many successful Pop songs of recent decades. Modulation, like much of tonal harmony, involves tension and resolution: we take a few uneasy steps toward the new key and then settle into it. I find that 21st-century modulation serves more as a production technique than the compositional technique it was in early Western European art music (a conversation for another day…).

 Example of modulation where the same chord exists in both keys with different functions.

 

Nowadays, it often occurs at the start of the final chorus of a song to support a Fibonacci Sequence and to mark a dynamic transformation in the story of the song. Although more recent key changes can feel like a gimmick, they remain effective and seem to work just fine. However, instead of exploring modern modulation from the perspective of music theory, I want to look at two specific concepts in psychoacoustics, critical bands and auditory scene analysis, and how they work in two songs with memorable key changes: “Livin’ On A Prayer” by Bon Jovi and “Golden Lady” by Stevie Wonder.

Consonant and dissonant relationships in music are represented mathematically as integer ratios; however, we also experience consonance and dissonance as neurological sensations. To summarize: when a sound enters our inner ear, the basilar membrane responds by oscillating at different locations along its length. This mapping, called tonotopicity, is maintained in the auditory nerve bundle and essentially helps us identify frequency information. The frequency information derived by the inner ear is organized through auditory filtering that works as a series of band-pass filters, forming critical bands that distinguish the relationships between simultaneous frequencies. To review, two frequencies within the same critical band are experienced as “sensory dissonant,” while two frequencies in separate critical bands are experienced as “sensory consonant.” This is a very generalized version of the theory, but it essentially describes how the nearby frequencies in intervals like minor seconds and tritones interfere with each other inside the same critical band, causing frequency masking and roughness.

 

Depiction of two frequencies in the same critical bandwidth.

 

Let’s take a quick look at some important critical bands during the modulation in “Livin’ On A Prayer.” This song is in the key of G (392 Hz at G4) but changes at the final chorus to the key of Bb (466 Hz at Bb4). There are a few things to note in the lead sheet here. The key change is a difference of three semitones, and the tonic notes of both keys sit in different critical bands, with G4 in band 4 (300-400 Hz) and Bb4 in band 5 (400-510 Hz). Additionally, the chord leading into the key change is D major (293 Hz at D4), with D4 in band 3 (200-300 Hz). Musically, D major’s strongest relationship to the key of Bb is that it is the dominant chord of G, the minor sixth in the key of Bb. Its placement makes sense because, earlier in the song, the chorus starts on the minor sixth in the key of G, which is E minor. Even though D major has a weaker relationship to the Bb major chord that kicks off the last chorus, D4 and Bb4 are in different critical bands; played together, they would function as a major third and create sensory consonance. Other notes in those chords share a critical band: F4 is 349 Hz and F#4 is 370 Hz, placing both frequencies in band 4; played together, they would function as a minor second and cause sensory roughness. There are a lot of perceptual changes in this modulation, and while breaking down critical bands doesn’t necessarily reveal what makes this key change so memorable, it does provide an interesting perspective.
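To make the band bookkeeping concrete, here is a minimal Python sketch (my own illustration, not a tool from the article) that looks up which Zwicker/Bark critical band a frequency falls into and checks the note pairs discussed above. The band edges are commonly published approximations, and the note frequencies are rounded equal-temperament values.

import bisect

# Commonly cited Zwicker/Bark critical band edges in Hz (approximate values).
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
                 1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300,
                 6400, 7700, 9500, 12000, 15500]

def critical_band(freq_hz: float) -> int:
    """Return the 1-indexed critical band containing freq_hz (band 4 = 300-400 Hz)."""
    return bisect.bisect_right(BARK_EDGES_HZ, freq_hz)

notes = {"G4": 392.0, "Bb4": 466.2, "D4": 293.7, "F4": 349.2, "F#4": 370.0}
for a, b in [("G4", "Bb4"), ("D4", "Bb4"), ("F4", "F#4")]:
    band_a, band_b = critical_band(notes[a]), critical_band(notes[b])
    verdict = "same band (roughness)" if band_a == band_b else "different bands (consonant)"
    print(f"{a} (band {band_a}) vs {b} (band {band_b}): {verdict}")

Running this prints that G4 and Bb4 sit in bands 4 and 5, D4 sits in band 3 away from Bb4, and F4 and F#4 share band 4, matching the analysis above.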

A key change is more than just consonant and dissonant relationships, though, and the context around the modulation gives us a lot of information about what to expect. This relates to another psychoacoustics concept called auditory scene analysis, which describes how we perceive auditory changes in our environment. There are many elements of auditory scene analysis, including attention feedback, localization of sound sources, and grouping by frequency proximity, that all contribute to how we respond to and understand acoustical cues. I’m focusing on the grouping aspect because it offers information on how we follow harmonic changes over time. Many Gestalt principles, like proximity and good continuation, help us group frequencies that are similar in tone, near each other, or that fit our expectations of what’s to come based on what has already happened. For example, when a stream of high notes and low notes is played at a fast tempo, their proximity to each other in time is prioritized, and we hear one stream of tones. However, as the stream slows down, the grouping priority shifts from closeness in timing to closeness in pitch, and two separate streams of high pitches and low pitches are heard.

 Demonstration of “fission” of two streams of notes based on pitch and tempo.
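If you want to hear this “fission” effect for yourself, here is a rough sketch (my own demo with assumed tone lengths and pitches, not audio from the article) that renders the same alternating high/low pattern at a fast and a slow tempo; the fast file tends to fuse into one stream, while the slow file splits into two streams by pitch.

import numpy as np
from scipy.io import wavfile

SR = 44100  # sample rate in Hz

def tone(freq_hz, dur_s):
    """A short sine tone at a modest level."""
    t = np.arange(int(SR * dur_s)) / SR
    return 0.3 * np.sin(2 * np.pi * freq_hz * t)

def alternating_stream(high_hz=1000.0, low_hz=400.0, tone_dur_s=0.08, n_pairs=20):
    """Build a high/low/high/low... sequence; shorter tone_dur_s means a faster tempo."""
    seq = [tone(f, tone_dur_s) for _ in range(n_pairs) for f in (high_hz, low_hz)]
    return np.concatenate(seq)

# Fast version tends to be heard as one stream, slow version as two streams.
wavfile.write("stream_fast.wav", SR, alternating_stream(tone_dur_s=0.06).astype(np.float32))
wavfile.write("stream_slow.wav", SR, alternating_stream(tone_dur_s=0.25).astype(np.float32))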

 

Let’s look at these principles through the lens of “Golden Lady,” which modulates repeatedly at the end of the song. As the song refrains about every eight measures, the key changes upward by a half step (semitone) to the next adjacent key. This happens quite a few times, and each time the last chord in the old key before the modulation is the parallel major seventh chord of the upcoming minor key. While the modulation moves upward by half steps, however, the melody moves generally downward by half steps, opposing the direction of the key changes. Even though there are a lot of changes and competing movements happening at this point in the song, we’re able to follow along because we have eight measures to settle into each new key. The grouping priority falls on the frequency proximity within the melody rather than on the timing of the key changes, making it easier to follow. Furthermore, because there are multiple key changes, the principle of “good continuation” helps us anticipate the next modulation from the context of the song and the experience of the previous modulations. Again, auditory scene analysis doesn’t directly explain everything about how modulation works in this song, but it gives us additional insight into how we absorb the harmonic changes in the music.

One Size Does Not Fit All in Acoustics

Have you ever stood outside when it has been snowing and noticed that it feels “quieter” than normal? Have you ever heard a sibling or housemate play music or talk in the room next to you and noticed that only the lower frequency content makes it through the wall? People are better at perceptually understanding acoustics than we give ourselves credit for. In fact, our hearing and our ability to perceive where a sound is coming from are important to our survival, because we need to be able to tell if danger is approaching. Without necessarily thinking about it, we gather a lot of information about the world around us from localization cues: the time offsets between direct and reflected sounds arriving at our ears, which our brains quickly analyze and compare against our visual cues.

Enter the entire world of psychoacoustics

Whenever I walk into a music venue during a morning walk-through, I try to bring my attention to the space around me: What am I hearing? How am I hearing it? How does that compare to the visual data I’m gathering about my surroundings? This clandestine, subjective information gathering is an important reality check on the data collected during the formal, objective measurement process of a system tuning. People spend entire lifetimes researching the field of acoustics, so instead of trying to give a “crash course” in acoustics, we are going to talk about a few concepts to get you interested in behavior you have already spent your whole life learning experientially without realizing it. I hope that by the end of this you will see that the interactions of signals in the audible human hearing range are complex because the picture changes depending on the relationships of frequency, wavelength, and phase between the signals.

The Magnitudes of Wavelength

Before we head down this rabbit hole, I want to point out that one of the biggest “Eureka!” moments in my audio education came when I truly understood what Jean-Baptiste Fourier discovered in 1807 [1] regarding the nature of complex waveforms: a complex waveform can be “broken down” into many component waves that, when recombined, recreate the original complex waveform. For example, the complex waveform of a human singing can be broken down into the many component sine waves that add together to create the original. I like to conceptualize the behavior of sound under the philosophical framework of Fourier’s discovery. Instead of being overwhelmed by the complexities as you go further down the rabbit hole, I like to think that the more I learn, the more the complex waveform gets broken into its component sine waves.
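As a tiny illustration of this idea (my own example with arbitrary component frequencies, not anything from the article), the sketch below builds a “complex” waveform from three sine waves and then recovers those components with a Fourier transform:

import numpy as np

SR = 8000
t = np.arange(SR) / SR  # one second of time samples
components = {220.0: 1.0, 440.0: 0.5, 660.0: 0.25}  # frequency (Hz) -> amplitude

# Sum the sine components into one complex waveform.
signal = sum(a * np.sin(2 * np.pi * f * t) for f, a in components.items())

# The FFT "breaks down" the complex waveform back into its components.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / SR)
peaks = freqs[np.abs(spectrum) > 0.1 * len(signal)]
print("Recovered component frequencies:", peaks)  # approximately [220. 440. 660.]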

Conceptualizing sound field behavior is frequency-dependent

 

One of the most fundamental quandaries in analyzing the behavior of sound propagation is that the wavelengths we work with in the audible frequency range vary by orders of magnitude. We generally take the audible range of human hearing to be 20 cycles per second (20 Hertz) to 20,000 cycles per second (20 kilohertz), though it varies with age and other factors such as hearing damage. Now recall the basic formula for determining wavelength at a given frequency:

Wavelength (in feet or meters) = speed of sound (in feet per second or meters per second) / frequency (in Hertz). **Use consistent units for wavelength and the speed of sound, i.e., meters and meters per second.**

So let’s look at some numbers given specific parameters for the speed of sound, since we know it varies with factors such as altitude, temperature, and humidity. The speed of sound at “average sea level” (roughly 1 atmosphere, or 101.3 kilopascals [2]), at 68 degrees Fahrenheit (20 degrees Celsius) and 0% humidity, is approximately 343 meters per second, or approximately 1,125 feet per second [3]. There is a great calculator online at sengpielaudio.com if you don’t want to do this manually [3]. If we use the formula above to calculate the wavelengths of 20 Hz and 20 kHz with this value for the speed of sound, we get (in Imperial units, because I live in the United States):

Wavelength of 20 Hz = 1,125 ft/s / 20 Hz = 56.25 feet

Wavelength of 20 kHz (20,000 Hz) = 1,125 ft/s / 20,000 Hz = 0.0563 feet, or 0.675 inches
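For convenience, here is a tiny Python sketch of that same calculation, assuming the ~1,125 ft/s speed of sound quoted above:

SPEED_OF_SOUND_FT_S = 1125.0  # approximate value at sea level, 68 °F, 0% humidity

def wavelength_ft(freq_hz: float) -> float:
    """Wavelength (feet) = speed of sound (ft/s) / frequency (Hz)."""
    return SPEED_OF_SOUND_FT_S / freq_hz

for f in (20, 1000, 20000):
    print(f"{f} Hz -> {wavelength_ft(f):.3f} ft")
# 20 Hz -> 56.250 ft, 1000 Hz -> 1.125 ft, 20000 Hz -> 0.056 ft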

This means that we are dealing with wavelengths that range from roughly the size of a penny to the size of a building. We see this in a different way as we move up in octaves along the audible range from 20 Hz to 20 kHz: because each octave doubles in frequency, the bandwidth in Hertz of each successive octave band also doubles.

32-63 Hz

63-125 Hz

125-250 Hz

250-500 Hz

500-1000 Hz

1000-2000 Hz

2000-4000 Hz

4000-8000 Hz

8000-16000 Hz

Look familiar??
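A few lines of Python show where that list comes from (starting from the nominal 31.5 Hz octave band, an assumption on my part; the printed edges are unrounded doublings rather than the rounded nominal labels above):

# Standard octave bands simply double in width as frequency doubles.
low = 31.5  # nominal bottom of the 31.5 Hz octave band
while low < 16000:
    high = low * 2
    print(f"{low:g} Hz - {high:g} Hz  (bandwidth {high - low:g} Hz)")
    low = high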

Unfortunately, what this ends up meaning for us sound engineers is that there is no “catch-all” way of modeling the behavior of sound that applies to the entire audible spectrum. It means that objects and surfaces obstructing or interacting with sound may or may not create issues depending on their size relative to the wavelength of the frequency under scrutiny.

For example, take the practice of placing a measurement mic on top of a flat board to gather what is known as a “ground plane” measurement, such as putting the board on top of the seats in a theater. This is a tactic I use primarily in highly reflective rooms to measure a loudspeaker system so that I can observe the system behavior without the degradation from the room’s reflections, usually because I don’t have control over the acoustics of the room itself (think in-house, pre-installed PAs in a venue). The caveat to this method is that the board has to be at least as large as a wavelength of the lowest frequency of interest. So if you have a 4 ft x 4 ft board for your ground plane, the measurements are really only helpful from roughly 280 Hz and up (1,125 ft/s ÷ 4 ft ≈ 280 Hz, given the speed of sound discussed earlier). Below that frequency, the wavelengths of the signal under test are larger than the board, so the benefits of the ground plane no longer apply. The other option, to extend the usable range of the ground plane measurement, is to place the mic directly on the floor (as in an arena) so that the floor becomes an extension of the boundary itself.
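A quick sketch of that rule of thumb, using the same assumed 1,125 ft/s speed of sound:

SPEED_OF_SOUND_FT_S = 1125.0  # same atmospheric assumptions as earlier

def lowest_usable_freq_hz(board_dimension_ft: float) -> float:
    """Roughly the frequency whose wavelength equals the board's shortest dimension."""
    return SPEED_OF_SOUND_FT_S / board_dimension_ft

print(f"4 ft board: usable above ~{lowest_usable_freq_hz(4):.0f} Hz")  # ~281 Hz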

Free Field vs. Reverberant Field:

When we start talking about the behavior of sound, it’s very important to distinguish what type of sound field behavior we are observing, modeling, and/or analyzing. If that isn’t confusing enough, depending on the scenario, the sound field behavior will change depending on what frequency range is under scrutiny. Most loudspeaker prediction software works from calculations based on measurements of the loudspeaker in the free field. To conceptualize how sound operates in the free field, imagine a single point-source loudspeaker floating high above the ground, outside, with no obstructions in sight. Based on the directivity index of the loudspeaker, the sound intensity will propagate outward from the origin according to the inverse square law. We must remember that the directivity index is frequency-dependent, which means we must look at this behavior as frequency-dependent too. As a refresher, this spherical radiation of sound intensity from a point source results in a 6 dB loss per doubling of distance. As seen in Figure A, sound pressure radiates omnidirectionally as a sphere outward from the origin, so the area over which the sound spreads at radius r grows by a factor of r², and the intensity falls off accordingly.

Figure A. A point source in the free field exhibits spherical behavior according to the inverse square law, losing 6 dB of level per doubling of distance.
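As a worked example (my own sketch, not part of the figure), here is the relative SPL at a few distances from a free-field point source:

import math

def spl_change_db(ref_distance_m: float, distance_m: float) -> float:
    """Relative SPL (dB) at distance_m compared with ref_distance_m, inverse square law."""
    return -20 * math.log10(distance_m / ref_distance_m)

for d in (1, 2, 4, 8):
    print(f"{d} m: {spl_change_db(1, d):+.1f} dB")
# 1 m: +0.0 dB, 2 m: -6.0 dB, 4 m: -12.0 dB, 8 m: -18.1 dB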

 

The inverse square law applies to point-source behavior in the free field, yet things grow more complex when we start talking about line sources and Fresnel zones. The relationship between point-source and line-source behavior depends on whether we observe the source in the near field or the far field, since a directional source behaves like a point source when observed in the far field. Line source behavior could fill an entire blog or book on its own, so for the sake of brevity I will redirect you to the Audio Engineering Society papers on the subject, such as the 2003 paper on “Wavefront Sculpture Technology” by Christian Heil, Marcel Urban, and Paul Bauman [4].

Free field behavior, by definition, does not take into account the acoustical properties of the venue the speakers are in. Free field conditions exist pretty much only outdoors in an open area. The free field does, however, make speaker interactions easier to predict, especially when we have known on-axis and off-axis measurements comprising the loudspeaker’s polar data. Since loudspeaker manufacturers have this high-resolution polar data for their speakers, they can predict how elements will interact with one another in the free field. The only problem is that anyone who has ever been inside a venue with a PA system knows we aren’t just listening to the direct field of the loudspeakers, even when we have great audience coverage from a system. We also listen to the energy returned from the room in the reverberant field.

As mentioned in the introduction to this blog, our hearing allows us to gather information about the environment we are in. Sound radiates in all directions, but it has directivity that depends on the frequency range being considered and the dispersion pattern of the source. Now, if we take that imaginary point-source loudspeaker from our earlier example and listen to it in a small room, we will hear not only the direct sound traveling from the loudspeaker to our ears but also the reflections of the loudspeaker bouncing off the walls and arriving back at our ears delayed by some offset in time. Direct sound often correlates with something we see visually, like hearing the on-axis, direct signal from a loudspeaker. Reflections result from sound bouncing off other surfaces before arriving at our ears; what they don’t contribute to the direct field they add to the reverberant field, which helps us perceive spatial information about the room we are in.

 

Signals arriving at our ears on an unobstructed path are perceived as direct arrivals, whereas signals bouncing off a surface and arriving with some offset in time are reflections.

 

Our ears are like little microphones that send aural information to our brain. Ears vary from person to person in size, shape, and the distance between them, giving everyone unique time and level offsets based on the geometry of their own head; these create our individual head-related transfer functions (HRTFs). Our brain combines the data from the direct and reflected signals to discern where a sound is coming from. The time offset between a reflected signal and the direct arrival determines whether our brain perceives the two as coming from one source or from two distinct sources; this is known as the precedence effect, or Haas effect. Sound System Engineering by Don Davis, Eugene Patronis, Jr., & Pat Brown (2013) notes that our brain integrates early reflections arriving within “35-50 ms” of the direct arrival as a single source. Once again, we must remember that this is an approximate value, since the actual timing is frequency-dependent. Late reflections that arrive beyond roughly 50 ms are not integrated with the direct arrival and are instead perceived as a separate source [5]. When two signals have a large enough time offset between them, we begin to perceive the later one as an echo. Specular reflections can be particularly obnoxious because they arrive at our ears with enough level, or at such an angle of incidence, that they interfere with our perception of localized sources.
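To put rough numbers on this, the sketch below (hypothetical path lengths of my own choosing, using the same ~343 m/s speed of sound as earlier) converts the extra path length of a reflection into an arrival-time offset and compares it against the integration window quoted above:

SPEED_OF_SOUND_M_S = 343.0  # same atmospheric assumptions as earlier

def reflection_delay_ms(direct_path_m: float, reflected_path_m: float) -> float:
    """Arrival-time offset of a reflection relative to the direct sound, in ms."""
    return (reflected_path_m - direct_path_m) / SPEED_OF_SOUND_M_S * 1000

# The quoted integration window is roughly 35-50 ms and is frequency-dependent;
# 35 ms is used here as a conservative threshold for illustration only.
for direct, reflected in [(10.0, 14.0), (10.0, 30.0)]:
    delay = reflection_delay_ms(direct, reflected)
    verdict = "likely fused with the direct arrival" if delay < 35 else "likely heard as a separate arrival"
    print(f"extra path {reflected - direct:g} m -> {delay:.1f} ms ({verdict})")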

Specular reflections act like reflections off a mirror bouncing back at the listener

 

Diffuse reflections, on the other hand, tend to lack localization and add more to the perception of “spaciousness” in the room, yet depending on frequency and level they can still degrade intelligibility. Whether particular reflections degrade or add to the original source depends heavily on their relationship to the dimensions of the room.

 

Various acoustic diffusers and absorbers used to spread out reflections [6]

In the Master Handbook of Acoustics, F. Alton Everest and Ken C. Pohlmann (2015) illustrate how “the behavior of sound is greatly affected by the wavelength of the sound in comparison to the size of objects encountered” [7]. Everest & Pohlmann describe how the variation of wavelength with frequency means that how we model sound behavior varies in relation to the room dimensions. In smaller rooms there is a low-frequency range in which the room dimensions are shorter than the wavelength, so the room cannot contribute boosts from resonance effects [7]. Everest & Pohlmann note that when the wavelength becomes comparable to the room dimensions, we enter modal behavior, which is described using “wave acoustics”; the top of this range marks the “cutoff frequency,” above which, as we progress into the higher frequencies of the audible range, we can model the short-wavelength interactions using ray behavior. The equations for estimating these ranges from the room’s length, width, and height can be found in the Master Handbook of Acoustics. It’s important to note that while we haven’t explicitly discussed phase, its importance is implied, since it is a necessary component of understanding the relationship between signals; after all, the phase relationship between two copies of the same signal determines whether their interaction results in constructive or destructive interference. What Everest & Pohlmann are getting at is that how we model and predict sound field behavior changes with wavelength, frequency, and room dimensions. It’s not as easy as applying one set of rules to the entire audible spectrum.
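As one concrete example of wavelength versus room dimensions, the sketch below uses the standard rectangular-room axial-mode relation (a textbook formula, not necessarily the exact equations referenced from the Master Handbook; the room dimensions are made up) to list the first few axial modes along each dimension:

SPEED_OF_SOUND_M_S = 343.0

def axial_modes_hz(dimension_m: float, count: int = 4):
    """Axial room modes along one dimension: f_n = n * c / (2 * L)."""
    return [n * SPEED_OF_SOUND_M_S / (2 * dimension_m) for n in range(1, count + 1)]

room_m = {"length": 8.0, "width": 5.0, "height": 3.0}  # example dimensions only
for name, dim in room_m.items():
    modes = ", ".join(f"{f:.0f} Hz" for f in axial_modes_hz(dim))
    print(f"{name} ({dim} m): {modes}")
# e.g. length (8 m): 21, 43, 64, 86 Hz -- the region where modal behavior dominates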

Just the Beginning

We haven’t even begun to talk about the effects of surface properties such as absorption coefficients and RT60 times, and yet we already see the increasing complexity of the interactions between signals, simply because we are dealing with wavelengths that differ by orders of magnitude. To simplify predictions, most loudspeaker prediction software uses measurements gathered in the free field. Acoustic simulation software such as EASE does let the user factor in the properties of surfaces, but we often don’t know the information needed to account for things such as the absorption coefficients of a material unless someone gets paid to go take those measurements, or the acoustician involved with the design has documented the decisions made during the architecture of the venue. Yet despite the simplifications needed to make prediction easier, we still carry one of the best tools for acoustical analysis with us every day: our ears. Our ability to perceive information about the space around us from the interaural level and time differences of signals arriving at our ears allows us to analyze the effects of room acoustics from experience alone. When looking at the complexity involved in acoustic analysis, it’s important to remember the pros and cons of our subjective and objective tools. Do the computer’s predictions make sense based on what I hear happening in the room around me? Measurement analysis tools allow us to objectively identify problems, and their origins, that aren’t necessarily perceptible to our ears. Yet remembering to reality-check with our ears matters, because otherwise it’s easy to get lost in the rabbit hole of increasing complexity as we get deeper into our engineering of audio. At the end of the day, our goal is to make the show sound “good,” whatever that means to you.

Endnotes:

[1] https://www.aps.org/publications/apsnews/201003/physicshistory.cfm

[2] (pg. 345) Giancoli, D.C. (2009). Physics for Scientists & Engineers with Modern Physics. Pearson Prentice Hall.

[3] http://www.sengpielaudio.com/calculator-airpressure.htm

[4] https://www.aes.org/e-lib/browse.cfm?elib=12200

[5] (pg. 454) Davis, D., Patronis, Jr., E. & Brown, P. Sound System Engineering. (2013). 4th ed. Focal Press.

[6] “recording studio 2” by JDB Sound Photography is licensed with CC BY-NC-SA 2.0. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-sa/2.0/

[7] (pg. 235) Everest, F.A. & Pohlmann, K. (2015). Master Handbook of Acoustics. 6th ed. McGraw-Hill Education.

Resources:

American Physical Society. (2010, March). This Month in Physics History: March 21, 1768: Birth of Jean-Baptiste Joseph Fourier. APS News. https://www.aps.org/publications/apsnews/201003/physicshistory.cfm

Davis, D., Patronis, Jr., E. & Brown, P. Sound System Engineering. (2013). 4th ed. Focal Press.

Everest, F.A. & Pohlmann, K. (2015). Master Handbook of Acoustics. 6th ed. McGraw-Hill Education.

Giancoli, D.C. (2009). Physics for Scientists & Engineers with Modern Physics. Pearson Prentice Hall.

JDB Photography. (n.d.). [recording studio 2] [Photograph]. Creative Commons. https://live.staticflickr.com/7352/9725447152_8f79df5789_b.jpg

Sengpielaudio. (n.d.). Calculation: Speed of sound in humid air (relative humidity). Sengpielaudio. http://www.sengpielaudio.com/calculator-airpressure.htm

Urban, M., Heil, C., & Bauman, P. (2003). Wavefront Sculpture Technology [White paper]. Journal of the Audio Engineering Society, 51(10), 912-932. https://www.aes.org/e-lib/browse.cfm?elib=12200

Keeping It Real

Using psychoacoustics in IEM mixing and the technology that takes it to the next level

SECTION 1

All monitor engineers know that there are many soft skills required in our job – building a trusting relationship with bands and artists is vital for them to feel supported so they can forget about monitoring and concentrate on their job of giving a great performance. But what do you know about how the brain and ears work together to create the auditory response, and how can you make use of it in your mixes?

Hearing is not simply a mechanical phenomenon of sound waves travelling into the ear canal and being converted into electrical impulses by the nerve cells of the inner ear; it’s also a perceptual experience. The ears and brain join forces to translate pressure waves into an informative event that tells us where a sound is coming from, how close it is, whether it’s stationary or moving, how much attention to give to it and whether to be alarmed or relaxed in response. Whilst additional elements of cognitive psychology are also at play – an individual’s personal expectations, prejudices and predispositions, which we cannot compensate for – monitor engineers can certainly make use of psychoacoustics to enhance our mixing chops. Over the space of my next three posts, we’ll look at the different phenomena which are relevant to what we do, and how to make use of them for better monitor mixes.

What A Feeling

Music is unusual in that it activates all areas of the brain. Our motor responses are stimulated when we hear a compelling rhythm and we feel the urge to tap our feet or dance; the emotional reactions of the limbic system are triggered by a melody and we feel our mood shift to one of joy or melancholy; and we’re instantly transported back in time upon hearing the opening bars of a familiar song as the memory centres are activated. Studies have shown that memories can be unlocked in severely brain-damaged people and dementia patients by playing them music they have loved throughout their lives.

Listening to music triggers the brain’s release of the reward chemical dopamine – the same potentially addictive chemical which is also released in response to sex, Facebook ‘likes’, chocolate and even cocaine… making music one of the healthier ways of getting your high. DJs and producers use this release to great effect when creating a build-up to a chorus or the drop in a dance track; in a phenomenon called the anticipatory listening phase, our brains actually get hyped up waiting for that dopamine release when the music ‘resolves’, and it’s the manipulation of this pattern of tension and release which creates that Friday night feeling in your head.

Missing Fundamentals

Our brains are good at anticipating what’s coming next and filling in the gaps, and a phenomenon known as ‘missing fundamentals’ demonstrates a trick which our brains play on our audio perception. Sounds that are not a pure tone (ie a single-frequency sine wave) have harmonics. These harmonics are integer multiples of the fundamental: that is, a sound with a root note of 100 Hz will have harmonics at 200, 300, 400, 500 Hz and so on. However, our ears don’t actually need to receive all of these frequencies in order to correctly perceive the pitch. If you play those harmonic frequencies and then remove the root frequency (in this case 100 Hz), your brain will fill in the gap and you’ll still perceive the note in its entirety – you’ll still hear 100 Hz even though it’s no longer there. You experience this every time you speak on the phone with a man – the root note of the average male voice is around 150 Hz, but most phones cannot reproduce below 300 Hz. No matter – your brain fills in the gaps and tells you that you’re hearing exactly what you’d expect to hear. So whilst the tiny drivers of an in-ear mould may not physically be able to reproduce the very low fundamental notes of some bass guitars or kick drums, you’ll still hear them as long as the harmonics are in place.
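If you want to test this on your own ears, here is a small sketch (my own demo, not from the article) that synthesizes the 200-500 Hz harmonics of a 100 Hz fundamental while leaving the 100 Hz component out; most listeners still report hearing a 100 Hz pitch:

import numpy as np
from scipy.io import wavfile

SR = 44100
t = np.arange(SR * 2) / SR  # two seconds
harmonics_hz = [200, 300, 400, 500]  # the 100 Hz fundamental is deliberately omitted
signal = sum(np.sin(2 * np.pi * f * t) for f in harmonics_hz)
signal = 0.2 * signal / np.max(np.abs(signal))  # normalize to a safe playback level

wavfile.write("missing_fundamental.wav", SR, signal.astype(np.float32))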

A biased system

Human hearing is not linear – our ear canals and brains have evolved to favour the frequencies where speech intelligibility occurs. This is represented in the famous Fletcher-Munson equal-loudness curves, and it’s where the concept of A-weighting for measuring noise levels originated. As you can see from the diagram below, we perceive a 62.5 Hz tone to be equal in loudness to a 1 kHz tone even when the 1 kHz tone is actually 30 dB SPL quieter.

Similarly, the volume threshold at which we first perceive a sound varies according to frequency. The area of the lowest absolute threshold of hearing is between 1 and 5 kHz; that is, we can detect a whisper of human speech at far lower levels than we detect a frequency outside that window. However, if another sound of a similar frequency is also audible at the same time, we may experience the phenomenon known as auditory masking.

This can be illustrated by the experience of talking with a friend on a train station platform, and then having a train speed by. Because the noise of the train encompasses the same frequencies occupied by speech, suddenly we can no longer clearly hear what our friend is saying, and they have to either shout to be heard or wait for the train to pass: the train noise is masking the signal of the speech. The degree to which the masking effect is experienced is dependent on the individual – some people would still be able to make out what their friend was saying if they only slightly raised their voice, whilst others would need them to shout loudly in order to carry on the conversation.

Masking also occurs in a subtler way. When two sounds of different frequencies are played at the same time, as long as they are sufficiently far apart in frequency two separate sounds can be heard. However, if the two sounds are close in frequency they are said to occupy the same critical bandwidth, and the louder of the two sounds will render the quieter one inaudible. For example, if we were to play a 1kHz tone so that we could easily hear it, and then add a second tone of 1.1kHz at a few dB louder, the 1k tone would seem to disappear. When we mute the second tone, we confirm that the original tone is still there and was there all along; it was simply masked. If we then re-add the 1.1k tone so the original tone vanishes again, and slowly sweep the 1.1k tone up the frequency spectrum, we will hear the 1k tone gradually ‘re-appear’: the further away the second tone gets from the original one, the better we will hear them as distinct sounds.
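A rough sketch of that demo (my own rendering of it, with assumed levels and sweep range) can be generated like this: a steady 1 kHz tone plus a slightly louder masker that starts at 1.1 kHz and sweeps upward, so the 1 kHz tone seems to re-appear as the masker moves out of the critical band.

import numpy as np
from scipy.io import wavfile

SR = 44100
dur_s = 10.0
t = np.arange(int(SR * dur_s)) / SR

target = 0.2 * np.sin(2 * np.pi * 1000 * t)           # steady 1 kHz tone
sweep_hz = np.linspace(1100, 3000, t.size)            # masker sweeps from 1.1 kHz to 3 kHz
masker_phase = 2 * np.pi * np.cumsum(sweep_hz) / SR   # integrate frequency to get phase
masker = 0.4 * np.sin(masker_phase)                   # a few dB louder than the target

wavfile.write("masking_sweep.wav", SR, ((target + masker) * 0.5).astype(np.float32))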

This ability to hear frequencies distinctly is known as frequency resolution, a type of filtering that takes place in the basilar membrane of the cochlea. When two sounds are very close in frequency, we cannot distinguish between them and hear them as a single signal. Someone with hearing loss due to cochlear damage will typically struggle to differentiate between consonants in speech.

This is an important phenomenon to be aware of when mixing. The frequency range to which our hearing is most attuned, 500 Hz – 5 kHz, is where many of our musical inputs such as guitars, keyboards, strings, brass and vocals reside; and when we over-populate this prime audio real estate, things can start to get messy. This is where judicious EQ’ing becomes very useful in cleaning up a mix – for example, although a kick drum mic will pick up frequencies in that mid-range region, that’s not where the information for that instrument is. The ‘boom’ and ‘thwack’ which characterise a good kick sound are lower and higher than that envelope, so by creating a deep EQ scoop in that mid-region, we can clear out some much-needed real estate and un-muddy the mix. Incidentally, because of the non-linear frequency response of our hearing, this also tricks the brain into thinking the sound is louder and more powerful than it is. The reverse is also true; rolling off the highs and lows of a signal creates a sense of front-to-back depth and distance.

It’s also worth considering whether all external track inputs are necessary for a monitor mix – frequently pads and effects occupy this territory, and whilst they may add to the overall picture on a large PA, are they helping or hindering when it comes to creating a musical yet informative IEM mix?

Next time: In the second part of this psychoacoustics series we’ll examine the Acoustic Reflex Threshold, the Haas effect, and how our brains and ears work together to determine where a sound is coming from; and we’ll explore what it all means for IEM mixes.


 
