Findings

The VIA project focuses on vowel variation across American dialects. We explore how the place where speakers are raised influences how they pronounce vowels and how they hear vowels that others produce. A critical aspect of our work is measuring the degree to which speakers in different areas of the same region participate in the vowel shift patterns (NCS, SVS and CVS) that appear to be affecting vowel production in each region. In many cases, these shift patterns are making vowel sounds more acoustically distinct across the U.S.

Unlike most research on vowels in American dialects, we look not only at how vowel sounds are pronounced but also at how they are perceived. In other words, does living in a region where vowel production is different from another region actually alter how speakers hear the sounds as well as how they make them? By studying both how speakers say vowels and how they hear vowels, we hope to provide some insight into the nature of the relationship between speech production and perception, an understudied area in language research. In collecting both production- and perception-based data from the same individuals from different regional dialect backgrounds, we investigate how socially driven differences in speech production relate to variability in speech perception.

Here you can read a bit about our main findings so far. We also provide an overview of some of our most relevant published research.

What we found

Figure 1 shows where most of the subjects who participated in this project were from.

[Figure 1]

[Figure 1] Map of the United States showing the location of 578 vowel identification participants and 71 regional "other" participants. Participant location is indicated by color-coded bubbles for each region, including green for California, dark green for Nevada, red for Tennessee, navy blue for Illinois, blue for New York, dark red for Virginia, orange for North Carolina, forest green for Oregon, and white for Other.

So, what did we find? Well, our work is still ongoing but a number of discoveries have so far emerged.

Significant regional differences in production

Not surprisingly, our research finds that vowels vary depending on the region where individuals were raised. Speakers in the North, South and West do, in general, sound different when saying the vowels in words like beet /i/, bit /ɪ/, bait /e/, bet /ɛ/, bat /æ/, cot /ɑ/ and caught /ɔ/. However, the pronunciation of back vowels, that is, vowels in words like boot /u/, book /ʊ/ and boat /o/, is not as different across regions.

[Figure 2]

[Figure 2] The vowel plots in the figure display averages of the acoustic patterns found in our data for each region, using the International Phonetic Alphabet (IPA) symbols for each vowel class as noted above. Vowel plots such as these are made by taking measurements of several formants in a vowel signal. Formants are bands of acoustic energy around a particular frequency which represent different resonances of the vocal tract.

Differences in the position of the tongue, lips and jaw during a vowel’s articulation create different resonance patterns that include these more acoustically intense bands (formants), which help listeners decode which vowel quality they are hearing. We can measure these differences in mouth/tongue position by instrumentally measuring how the formant structure (bands of acoustic energy) varies across different vowels. So, when different speakers use a different positioning of articulators in the vocal tract to make a sound (a more raised or backed tongue position, for example), the resultant formant pattern is measurably different. What are known as the first and second formants capture the main acoustic properties of a vowel, and these appear to be most salient to listeners in classifying vowel sounds as particular categories.

So, for our project, we use the values of these first two formants to plot a schematic diagram that graphically represents where speakers’ articulators are positioned when making such sounds. By comparing these measurements, we can see how speakers relate to one another in their vowel productions. Speakers from the same region tend to cluster together in terms of the way they articulate vowel sounds. As a result, when the vowels of subjects from the same region are plotted together, we can often see clear variation across regions in how sounds are made. Figure 2 below displays such diagrams for our speakers in each region in our study, the South, the West and the North.

[Figure 2]

[Figure 2] Vowel plots for the three regions under study, including Southern mean vowels, Western mean vowels and Northern mean vowels. Vowels are plotted in each region's box on an x-axis of 2 to -2 and a y-axis of 2 to -2.
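The 2 to -2 axis range in these plots is consistent with formant values that have been standardized within each speaker, although this page does not state which normalization was used. As a minimal sketch under that assumption, the Python below z-scores first and second formant values per speaker (often called Lobanov normalization), averages them by vowel class, and measures how far apart two class means sit. The token values, the speaker labels and the function names are all invented for illustration and are not project data or project code.

```python
from collections import defaultdict
from math import dist
from statistics import mean, pstdev

# Hypothetical measurements in Hz: (speaker, vowel class, F1, F2).
# Invented for illustration only; these are not project data.
TOKENS = [
    ("s1", "i", 300, 2300), ("s1", "ɛ", 580, 1800), ("s1", "æ", 700, 1700),
    ("s1", "ɑ", 750, 1200), ("s1", "u", 340, 1100),
    ("s2", "i", 350, 2700), ("s2", "ɛ", 650, 2100), ("s2", "æ", 800, 2000),
    ("s2", "ɑ", 850, 1400), ("s2", "u", 400, 1300),
]


def lobanov(tokens):
    """Z-score F1 and F2 within each speaker (assumed normalization)."""
    by_speaker = defaultdict(list)
    for speaker, vowel, f1, f2 in tokens:
        by_speaker[speaker].append((vowel, f1, f2))
    normalized = []
    for speaker, rows in by_speaker.items():
        f1s = [row[1] for row in rows]
        f2s = [row[2] for row in rows]
        m1, s1 = mean(f1s), pstdev(f1s)  # population SD; some analyses use sample SD
        m2, s2 = mean(f2s), pstdev(f2s)
        for vowel, f1, f2 in rows:
            normalized.append((speaker, vowel, (f1 - m1) / s1, (f2 - m2) / s2))
    return normalized


def vowel_means(normalized):
    """Average the normalized (F1, F2) pairs for each vowel class."""
    grouped = defaultdict(list)
    for _speaker, vowel, z1, z2 in normalized:
        grouped[vowel].append((z1, z2))
    return {
        vowel: (mean(p[0] for p in points), mean(p[1] for p in points))
        for vowel, points in grouped.items()
    }


def class_distance(means, a, b):
    """Euclidean distance between two vowel class means in the F1 x F2 plane."""
    return dist(means[a], means[b])


if __name__ == "__main__":
    means = vowel_means(lobanov(TOKENS))
    for vowel, (z1, z2) in means.items():
        print(f"{vowel}: F1 z = {z1:.2f}, F2 z = {z2:.2f}")
    print("distance ɛ to æ:", round(class_distance(means, "ɛ", "æ"), 2))
```

Plotting the class means with both axes reversed would give the familiar vowel-chart orientation, with high front vowels at the upper left. A smaller class distance corresponds to two vowel classes sitting closer together in a region's plot.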

Comparing these to the depiction of what we should expect based on the vowel shifts reported in each region (see “Do you speak American?” for contemporary vowel shift schematics), we can see that our speakers from across the U.S. do in fact show evidence of these regional shift patterns. For example, note how close together the /e/ and /ɛ/ classes are in the Southern means plot compared to how far apart they are in the Northern means plot. Likewise, note how close together the /ɑ/ and /ɔ/ vowel classes are in the Western plot compared to the Southern and Northern plots. These vowel positions illustrate the divergent tendencies in vowel production we are finding across regions.

Overall, our Northern speakers show more raised (meaning the tongue positioned higher in the mouth during articulation) production of the short front vowels /ɪ/, /ɛ/ and /æ/ in words like bit, bet and bat, while they do not show any evidence of a merger in the vowels in words like cot and caught /ɑ/ and /ɔ/.

In the South, we see that the long and short front vowels are much closer together than in the Northern speaker sample. In other words, the vowels in words like beet /i/ versus bit /ɪ/ and bait /e/ versus bet /ɛ/ do not have as much distance acoustically as those vowel pairs in the North. This suggests they are not raising or lowering and tensing the tongue the same way as Northerners during these vowels’ articulation.

So, what does this mean? Well, it doesn’t mean Southerners do not make a difference in the pronunciation of these vowels, but it means that they are using different acoustic cues compared to Northerners to keep these vowels distinct (so that they continue to signal a meaningful vowel difference such as in bet versus bat). We suspect that Southerners use spectral dynamics (shifting acoustic ranges during a single vowel sound’s articulation) more than speakers in other regions to maintain vowel differences, something often referred to as ‘drawling.’

Measuring ‘drawling’ has proven difficult in language research: it has been hard to locate an acoustic measure that correlates with a ‘drawl,’ and only a few studies look closely at what kinds of vowel measures beyond vowel length might correspond to listener perceptions of drawled vowels. We are currently examining this hypothesis more empirically, and early results indeed suggest Southerners have more rapid changes across a single vowel’s acoustic waveform (measured as vector length, trajectory length and spectral rate of change) than speakers in other regions.
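To make the three dynamic measures concrete, here is a minimal sketch of one common way they are formulated: formants are sampled at several points across the vowel, vector length is the straight-line distance between the first and last sample in the F1 by F2 plane, trajectory length is the sum of the distances between consecutive samples, and spectral rate of change is trajectory length divided by the time the samples span. The five sampling points (20% to 80% of the vowel), the example track and the function names are assumptions for illustration; the exact procedure used in the project is not given here.

```python
from math import dist

# Hypothetical (F1, F2) values in Hz sampled at 20%, 35%, 50%, 65% and 80%
# of one vowel's duration. Invented for illustration; not project data.
TRACK = [(600, 1800), (630, 1840), (660, 1880), (630, 1920), (600, 1960)]
VOWEL_DURATION_S = 0.25    # assumed total vowel duration in seconds
SAMPLED_PROPORTION = 0.60  # 20% to 80% covers 60% of the vowel


def vector_length(track):
    """Straight-line distance between the first and last sampled points."""
    return dist(track[0], track[-1])


def trajectory_length(track):
    """Sum of the distances between consecutive sampled points."""
    return sum(dist(a, b) for a, b in zip(track, track[1:]))


def spectral_rate_of_change(track, sampled_duration_s):
    """Trajectory length per second over the sampled stretch of the vowel."""
    return trajectory_length(track) / sampled_duration_s


if __name__ == "__main__":
    span = VOWEL_DURATION_S * SAMPLED_PROPORTION
    print("vector length (Hz):", vector_length(TRACK))
    print("trajectory length (Hz):", trajectory_length(TRACK))
    print("spectral rate of change (Hz/s):", spectral_rate_of_change(TRACK, span))
```

Trajectory length can never be smaller than vector length, so a vowel whose formants move out and back scores high on trajectory length even when its endpoints are close together. That is why the measures are usually reported side by side.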

Moving beyond the North and South, we also find Western vowel sounds produced differently – in line with the previously studied changes occurring in that region (also found in Canada). In the West, we found that short front vowels, in particular those found in words like bed /ɛ/ and bad /æ/, are retracting (the tongue is positioned farther back in the mouth) so that a word like bad sounds more like ‘bod’ to listeners outside that region. Looking at this same vowel sound in the North, you can see that this is quite different from how it is articulated there. For a Northerner, bad is produced much higher and fronter in the mouth (more ‘bed’ like to non-Northerners) than it is for a Westerner, who produces it with a backed position.

We also find, in contrast to the North and South, that speakers in the West tend not to pronounce a difference in the low back vowel sounds /ɑ/ and /ɔ/. These are the sounds in word pairs like cot, caught and Don, Dawn that are pronounced as distinct vowel sounds elsewhere, but sound the same when produced by a Westerner. However, as discussed in a bit more detail below, our research has shown that Westerners are actually using durational cues to mark a subtle acoustic contrast in these two classes, despite the fact that most people think they ‘sound the same’ when pronouncing these vowels.

So, though people often think Northern and Western speech is ‘general American’ sounding, you can see that this is actually far from accurate in that speech in all three regions varies considerably and in increasingly different ways. What we can get from this is that the concept of ‘general American’ is actually a myth. Speakers from all three regions are changing the way they say vowel sounds, and changing in ways that lead to more distinctive vowels across regions. This is particularly true for vowels produced in the front of the mouth.

Emerging similarity as well as difference

One interesting finding in contemporary research, confirmed here by our study results, is that changes to back vowel sounds (sounds produced by the back of the tongue raising and lowering as found in words like boot, book and boat) are more consistent across regions. These three vowel classes (referred to as the /uw/, /ʊ/ and /ow/ classes, respectively) are fronting, or using a fronter tongue position, in all major U.S. English dialects. Note in Figure 2 that these back vowel classes are actually not that far back in the mouth. The /ul/ and /ol/ categories in the plots represent these vowels before “L”, in words like “pool” and “pole.” As can be seen in the figure, these contexts resist fronting and depict the actual back periphery of speakers’ articulatory spaces.

Southern speakers are more advanced in this change and Northerners are least advanced, but all three dialects are showing a shift toward this fronter articulation. Why might this be when front vowel sounds are clearly moving in different directions? Well, likely this has to do somewhat with what is motivating back vowel fronting compared to the front vowel changes. Back vowel fronting is not uncommon in languages more generally, particularly languages like English that have a relatively large vowel inventory. Physiologically, there is only a limited amount of space in the back of the mouth to produce back vowel sounds and English has quite a few vowels produced there.

Thus, it is thought that back vowel fronting is motivated by this crowding in the back of the mouth, which creates a tendency for fronting to relieve the overcrowding (Stockwell and Minkova 1997). An alternative explanation proposed to account for back vowel fronting is that it is an effect of tongue body frontness during the articulation of adjacent coronal consonants – consonants produced with fronter tongue articulation (Flemming 2003). Either way, similar pressures across dialects appear to have led to similar changes occurring.

Significant intra-regional differences in production and perception based on regionally diagnostic vowel shift patterns

While a bit more complex than the results discussed above, our research has also suggested that speakers raised within the same region do not all participate to the same degree in the regional vowel shifts. We find both within-speaker variability for participation within the same field site and substantial variability across field sites within the same region. So, for example, siblings in Memphis displayed a great deal of difference in how much they participated in the Southern Vowel Shift pattern, with some showing the full shift pattern and some showing much less (see Figure 3 below showing the systems for three siblings).

Even more striking, though, different field sites within the South showed a great deal of difference in whether speakers there shifted vowels according to regional patterns or not. Looking at Figure 4 below, you can see that Memphis speakers in general show much more of the Southern Shift pattern than Virginians, who show very little of the pattern by comparison. However, we still found that regional cohesion within the vowel system (meaning vowel system similarity) was greater within regions than across regions. So, in other words, a speaker from Virginia and a speaker from Tennessee are still more similar in terms of how they produce vowels than either of them is to a Northern or Western speaker.

[Figure 3]

[Figure 3] Variation across three siblings' vowel systems: Male Sibling 1, Male Sibling 2, and Female Sibling. Vowels are plotted in each sibling's box on an x-axis of 2 to -2 and a y-axis of 2 to -2.

[Figure 4]

[Figure 4] Aggregate vowel plots for three southern states, including Tennessee (25 speakers), North Carolina (10 speakers) and Virginia (9 speakers). Vowels are plotted in each region's box on an x-axis of 2 to -2 and a y-axis of 2 to -2.

Significant regional differences in perception

Given our findings that the North, South and West have different ways of saying the front vowels (as in beet /i/ and bit /ɪ/, and bait /e/ and bet /ɛ/), a reasonable hypothesis approaching this research would be that subjects from the different regions have different perceptual tendencies. In particular, we would expect that Southerners, for whom the Southern Vowel Shift involves different tongue position during production of /i/~/ɪ/ (beat~bit) and /e/~/ɛ/ (bait~bet) compared to the North and West, would exhibit different perceptual tendencies for these vowels compared to the other regions. Since the Northern Cities Shift (NCS) is characterized by more acoustically spread out bait and bet vowels than found in other regions, we might also (or alternatively) expect listeners from the North to show different perceptions than listeners from regions (like the South and West) with more proximate mid front vowels.

However, we also know from our research results that there is a lot of variability even in how much speakers raised in the same location participate in the vowel changes. So, even within a single city, listeners are exposed to a wide range of acoustic examples of each vowel, which may make their perception of vowel sounds more elastic than their production of these same sounds. In other words, though you may say vowels somewhat the same way, you are used to hearing vowel sounds produced by other speakers in a vastly larger acoustic range, particularly in urban areas. Thus, our perception may be more fluid in terms of how we categorize the sounds we hear. In such a case, we might expect fewer cross-regional differences in perception than we find in production.

So, what did we find in terms of perception? Well, we found that both of the above hypotheses were confirmed: speakers do appear to adjust their perception range when consistently exposed to sounds that are produced differently across regions but, for the most part, perception is more fluid across dialects than production.

The most striking difference in how vowels are produced across regions is in the way the mid front vowels (the vowels /e/ and /ɛ/ like in bait and bed) are said. In the North, the bait vowel remains articulated with a fronter tongue position while the bed vowel is made with a much more lowered and backed tongue position. In the South, however, the bait vowel is made with a very backed tongue position while the bed vowel has a much fronter tongue position. As you can imagine, this leads to a substantial difference in how these vowels sound across regions. When we look not only at how these vowel sounds differ in how they are produced but also perceived by speakers in the North and South, we find that this difference in pronunciation is matched with a difference in how speakers in the North and South hear these two vowels. The figure below presents what is called an identification function.

[Figure 5]

[Figure 5] Identification function for /e/ versus /ɛ/ perception across regions. The x-axis, Continuum Step, has a 1-7 scale; the y-axis, Percent heard as /ɛ/, rises from 0 to 100 percent.

This function shows how listeners in each region classified a word with the /e/ versus /ɛ/ vowel when played 28 varying acoustic versions of these sounds along a seven-point continuum (varying the formant structure) – in other words, when asked if they heard bet or bait when hearing the stimulus word. As you can see, Southerners differ (significantly based on statistical tests) from listeners from other regions. Listeners from the South hear more of the acoustic versions of the stimuli as bait, likely as a result of hearing a more backed variant of this vowel produced by speakers around them in the South. In contrast, Northerners and Westerners heard more bet vowels when they listened to the stimuli, again likely because of the retracted position for the bet vowel they hear produced in their regions compared to the South.
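An identification function of this kind can be tabulated directly from trial-level responses. As a rough sketch, the Python below takes a list of trials, computes the percentage of bet responses at each continuum step for each listener region, and reports the first step at which a region's listeners choose bet at least half the time. The trial records, the region labels and the function names are invented for illustration; they are not the project's data or analysis code, and the statistical tests mentioned above are not reproduced.

```python
from collections import defaultdict

# Hypothetical trials: (listener region, continuum step, response).
# Invented for illustration only; these are not project data.
TRIALS = [
    ("South", 3, "bait"), ("South", 3, "bait"),
    ("South", 4, "bait"), ("South", 4, "bet"),
    ("South", 5, "bet"), ("South", 5, "bet"),
    ("North", 3, "bait"), ("North", 3, "bet"),
    ("North", 4, "bet"), ("North", 4, "bet"),
    ("North", 5, "bet"), ("North", 5, "bet"),
]


def identification_function(trials, target="bet"):
    """Return {region: {step: percent of responses equal to target}}."""
    counts = defaultdict(lambda: [0, 0])  # (region, step) -> [target count, total]
    for region, step, response in trials:
        cell = counts[(region, step)]
        cell[0] += int(response == target)
        cell[1] += 1
    curves = defaultdict(dict)
    for (region, step), (hits, total) in counts.items():
        curves[region][step] = 100.0 * hits / total
    return dict(curves)


def crossover_step(curve):
    """First continuum step with at least 50% target responses, else None."""
    for step in sorted(curve):
        if curve[step] >= 50.0:
            return step
    return None


if __name__ == "__main__":
    curves = identification_function(TRIALS)
    for region, curve in curves.items():
        print(region, dict(sorted(curve.items())), "crossover:", crossover_step(curve))
```

In this toy data the Southern curve crosses 50% one step later than the Northern curve, which is the kind of pattern described above: more of the continuum is heard as bait. Real data would have many more listeners and trials per step.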

So, these results indicate that there is a link between production and perception. However, it appears that production must be very distinct for this perception effect to occur. Other vowels studied, those that don’t differ as much in how they are said across regions, do not show perceptual differences in the way that the bait versus bet vowel did.

Other research findings

We are still in the process of collecting and analyzing data. In addition to the results reported above, we have also looked at vowel length (duration) and spectral dynamics (‘drawling’) across dialects. In those papers, we also found significant differences across regions and across speakers in how durational (length) and dynamic aspects (‘drawling’) of vowels were used by speakers. For example, Western speakers, who show a merged formant structure in the cot/caught vowel classes, actually maintain a durational difference in those vowels that is lacking in speakers not merging them. Similarly, Southerners who overlap their formant structure in the bait and bet vowels on our measures above actually use more vowel dynamics (or drawling) in the bait vowel as a way of maintaining distinction and avoiding a merger in these classes.

So, it seems, speakers have a variety of acoustic cues to utilize in maintaining vowel differences. A speaker’s regional affiliation and the type of vowel patterns he or she participates in appear to influence how these cues are employed.

Finally, we have some other work in progress that examines geo-spatial mapping techniques for perception data and how vowel perception can be altered by changing social information provided about a speaker. Should you be interested in learning more about our research, you can look at some of our published papers.