Idea for a Doctoral Dissertation in Psycholinguistics

One day in my psycholinguistics class in 2009, I had two ideas that my prof, Dare Baldwin, announced would make good doctoral dissertations. I wrote them down, thinking maybe I would use them. The way my mind works is that I seriously consider doing a Ph.D. in every subject I study. A few months later I was pretty sure I wasn’t going to do more research in psycholinguistics, so I wrote myself a note: “Look through notes from Dare’s class and post dissertation ideas for aspiring psycholinguists.”

Well, I’m sorry to say that I’ve just looked through those notes and I can’t find the dissertation ideas, and one of them I have completely forgotten. I remember the basics of the other, and if you are an aspiring psycholinguist you are welcome to it. Remember, this was in 2009, so check around to see whether this research has already been done.

In Dare’s lecture, she explained that it is something of a mystery how exactly we hear voiced and non-voiced consonants as distinct from each other. If you pay close attention while you say the words “poor” and “bore,” for example, you might be able to notice that the only difference (at least with my accent) is how soon the vowel sound starts after the lips make the consonant. A slight gap between consonant and vowel creates a “p” and a smaller gap makes a “b.”

Using a computer to manipulate that gap, you can test what size of gap produces each consonant, and it turns out it’s a very specific and arbitrary-seeming size. We all hear the transition at the same point. And to make it even more mysterious, some other animals hear the distinction just like we do. How can this be an important distinction for animals to be able to make?

I believe that this is all due to a psychoacoustic phenomenon called temporal fusion. Any recording engineer knows that if you take two copies of a sound and space them at more than about 30 ms, you will hear both copies, distinct from one another. The second copy will sound like an echo of the first. If you space them at less than about 30 ms, what you instead hear is one, longer, thicker sound.
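
For anyone who wants to try this, here is a minimal sketch of how the two-copies stimulus could be built. The function name, the 44,100 Hz sample rate, and the one-sample "click" are my own assumptions for illustration; a real experiment would use recorded speech sounds and write the result to an audio file.

```python
def mix_with_delayed_copy(samples, delay_ms, sample_rate=44100):
    """Return the signal mixed with a copy of itself delayed by delay_ms."""
    delay = round(sample_rate * delay_ms / 1000)  # delay in whole samples
    out = [0.0] * (len(samples) + delay)
    for i, s in enumerate(samples):
        out[i] += s          # original copy
        out[i + delay] += s  # delayed copy
    return out


# A one-sample click makes the two copies easy to locate in the output.
click = [1.0]
below = mix_with_delayed_copy(click, delay_ms=10)  # under the ~30 ms mark
above = mix_with_delayed_copy(click, delay_ms=50)  # over the ~30 ms mark

for name, signal in (("10 ms", below), ("50 ms", above)):
    positions = [i for i, s in enumerate(signal) if s != 0.0]
    print(name, "copies at samples", positions)
```

Stepping `delay_ms` through a range of values and asking listeners whether they hear one sound or two would give the fusion threshold to compare against the consonant boundary.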

I bet you that 30 ms is also about the length of gap that starts to distinguish voiced from non-voiced consonants. That is, the length of gap is not arbitrary, but based on human hearing acuity. I will also bet you that other animals that can distinguish between Ps and Bs have temporal fusion that kicks in around 30 ms as well.

There you go. It should be easy and relatively cheap to test. If no one else has thought of it since 2009, it’s yours. If I remember the second idea, I’ll post it too.

(first published February 9, 2012 on Nathen’s Miraculous Escape.)

The Relationship between Clarity of Enunciation and Idea Density

[First published on Nathen’s Miraculous Escape, June 12, 2009.]

Abstract

This study was an attempt to determine whether there is a relationship between individuals’ clarity of enunciation, rated subjectively, and linguistic ability, measured as idea density, as in the Nun Study (Riley, Snowdon, Desrosiers, & Markesbery, 2005; Snowdon, Kemper, Mortimer, Greiner, Wekstein, & Markesbery, 1996). Idea density, the number of propositions per 10 words, had no significant correlation with clarity of enunciation in 33 digitally videotaped dyadic conversations between adult participants.

The Relationship between Clarity of Enunciation and Idea Density

Since 1992 I have wondered whether exceptionally clear enunciation was an indication of intelligence. I seemed to notice enhanced and separated consonants more often in those that I considered radically intelligent. My Organic Chemistry teacher, for example, said the word “little” exactly as written, with the crisp, unvoiced, alveolar stop [t], and not, as most other people said it, sounding like “lid’l.” Also, if he said a word that ended in a stop consonant followed by a word that started with a stop consonant, he stopped for each consonant; in “stand together,” the [d] and [t] would be separate, not, as in others’ speech, “stantogether.”

There has apparently been no research on a possible connection between cognitive or linguistic ability and clarity of enunciation. Most speech clarity research has focused on intelligibility, in relation to speech disorders, or dysarthrias, of various kinds on the speech-production side (e.g., Ansel & Kent, 1992), or to hearing impairment and hearing aids on the reception side (e.g., Amlani, Rakerd, & Punch, 2006). There is also some research on the effects of the increased speech clarity in infant-directed speech on early language acquisition. In infant-directed speech, for example, mothers’ vowel sounds were found to be more distinct from each other than in normal speech (Kuhl, Andruski, Chistovich, Chistovich, Kozhevnikova, Ryskin, Stolyarova, Sundberg, & Lacerda, 1997), and the more distinct the vowel sounds were, the better their infants’ speech perception (Liu, Kuhl, & Tsao, 2003). Whether distinctly produced speech is correlated with any other traits or tendencies of the speaker, though, appears to have gone uninvestigated.

Idea density

Because no direct measure of cognitive ability was available, this study used idea density as a proxy. Idea density is a measure of linguistic ability that is associated with knowledge, vocabulary, and education level, and is defined as the average number of ideas, or propositions, per ten words in a text (Snowdon, Kemper, Mortimer, Greiner, Wekstein, & Markesbery, 1996). A text with high idea density, then, is complex and provides the reader with a lot of information.

In linguistics, a proposition is an idea expressed in a narrative, and considered a basic unit of memory for texts (Kintsch & Keenan, 1973). Propositions include verb, adjective, and adverb phrases, noun and clause conjunctions, and indications of temporal and causal relations (Turner & Greene, 1977). Linguists can take a text and construct what they call a propositional text base, which is a list of propositions coded in such a way that all of the information from the original text can be reconstructed.

In a longitudinal study of 180 nuns who entered their convents between 1931 and 1943, the idea density of their autobiographies, written at an average age of 22, correlated with their cognitive functioning and neuropathology in their old age and at death: Lower idea density predicted decreased cognitive functioning and dementia (Snowdon et al., 1996), as well as low brain weight, cerebral atrophy, and the neural plaques and tangles associated with Alzheimer’s disease (Riley, Snowdon, Desrosiers, & Markesbery, 2005).

It would be useful to know if idea density is also correlated with something as easily recognizable as clarity of enunciation. To that end, the hypothesis of this study was that clarity of enunciation would be significantly and positively correlated with linguistic ability, as measured by the idea density in conversational speech.

Methods and results

Participants

Participants were 110 adults, recruited by undergraduate psychology majors at the University of Oregon for a required class project, and recorded in 55 dyadic conversations on digital video recorders of varying quality. Conversations were recorded in residences, not in a lab. Forty-four participants were excluded from analysis: 34 because the video file was either not provided, would not play, or did not match the accompanying conversation transcription, 4 because participants were eating during the conversation, 4 because they were non-native English speakers, and 2 because their longest utterances contained no more than two words. This left 66 participants (34 female) between the ages of 18 and 57 (M = 24.4, SD = 6.3), 3 Latino, 1 Native American, and 62 White. Education level ranged from less than high school to graduate degree. All participants signed informed consent forms and filled out simple demographics forms prior to being filmed. No formal debriefing was given.

Transcription

Ten minutes of each 15-minute conversation were transcribed by the student who recruited and recorded the participants, using ELAN transcription software. Transcriptions were to include all speech by each participant during the 10-minute interval. The only punctuation marks used were [/] to indicate falling intonation at the end of an utterance, and [?] to indicate rising intonation at the end of an utterance. Annotations were composed of groups of utterances by a single participant separated by pauses of less than 2 seconds. Thus, an annotation could be of nearly any length, and could contain any number of utterances.

Idea density coding

Idea density was coded by the researcher, using the coding scheme presented by Riley et al. (2005), in which idea density is the average number of propositions per 10 words. Propositions include verb, adjective, and adverb phrases, noun and clause conjunctions, and indications of circumstance such as time, place, and causality. (See Appendix A for all coding schemes, and see Turner & Greene, 1977, for a thorough presentation of the construction of a propositional text base.)

For a very short example of proposition counting, consider the utterance “We’re going to see Meaghan’s art show.” This sentence contains 6 propositions: (a) the predicate phrase “We see show,” (b) “we” means the speaker and at least one other person, (c) the show is an art show, (d) the show is Meaghan’s, (e) this event is to happen in the future, and (f) the people meant by “we” will have to move to another location in order to see the show. Notice that even in this simple example there is some ambiguity; proposition (f) may or may not have been intended by the speaker.
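
The arithmetic for this example can be sketched in a few lines. This is only an illustration: the function name is my own, and counting words by splitting on spaces (so that a contraction counts as one word) is an assumption, since the coding scheme as described here does not say how contractions were counted.

```python
def idea_density(n_propositions, n_words):
    """Propositions per 10 words, the definition used in this study."""
    return 10 * n_propositions / n_words


utterance = "We're going to see Meaghan's art show"
n_words = len(utterance.split())  # assumption: whitespace-separated words
n_propositions = 6                # propositions (a) through (f) above

print(n_words, round(idea_density(n_propositions, n_words), 2))
```

Dropping the ambiguous proposition (f) would lower the count to 5 and the density accordingly, which shows how much a single coding decision matters in a short utterance.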

For each participant, idea density was coded for the annotation of their speech closest to 10 seconds long. In cases where much of that annotation was taken up by laughing or unintelligible speech, the annotation closest to 10 seconds with the largest number of words was used. Annotations ranged from 17 to 51 words (M = 33.40, SD = 8.59), and formed a platykurtic distribution (kurtosis = -.69, SE = .58). Number of propositions ranged from 18 to 27 (M = 18.30, SD = 4.46), and formed a platykurtic distribution (kurtosis = -.51, SE = .58). Idea density ranged from 3.6 to 7.2 propositions per 10 words (M = 5.6, SD = 0.8), forming a slightly skewed (skew = -.14, SE = .30) and platykurtic (kurtosis = -.24, SE = .59) distribution. The levels of skew and kurtosis present were probably acceptable, all falling well within 2 standard errors.
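
The "within 2 standard errors" rule of thumb used here amounts to dividing each statistic by its standard error and checking that the absolute ratio is below 2. A minimal sketch, using the values reported above (the function name and dictionary labels are mine):

```python
def within_two_se(statistic, standard_error):
    """True if the statistic is less than 2 standard errors from zero."""
    return abs(statistic) / standard_error < 2


# (statistic, standard error) pairs as reported in the text
reported = {
    "words, kurtosis": (-0.69, 0.58),
    "propositions, kurtosis": (-0.51, 0.58),
    "idea density, skew": (-0.14, 0.30),
    "idea density, kurtosis": (-0.24, 0.59),
}

for label, (stat, se) in reported.items():
    ratio = abs(stat) / se
    print(f"{label}: {ratio:.2f} SE, acceptable = {within_two_se(stat, se)}")
```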

Enunciation coding

Clarity of speech is usually coded using digital editors and spectrographic analysis to measure vowel space expansion and consonant enhancement. The limitations of this study, however, made it necessary to code enunciation more subjectively: Two coders, one of whom was the researcher and neither of whom was blind to the hypothesis, listened to the first 2 minutes of each conversation and coded the clarity of speech of each participant on a 3-point scale: 0 = noticeably unclear speech, 1 = average clarity of speech, and 2 = noticeably clear speech. The raters exhibited poor reliability (Cronbach’s α = .35), indicating the need for more training on the coding scheme, but, because of the time constraints of this study, each participant received a speech clarity score equal to the average of the two coders’ ratings as they were.
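
For readers unfamiliar with the reliability statistic, here is a minimal sketch of how Cronbach’s α is computed from two coders’ ratings, along with the averaged clarity score. The ratings below are made up for illustration and are not the study’s data, and the function name is my own.

```python
from statistics import variance


def cronbach_alpha(ratings_by_coder):
    """Cronbach's alpha, treating each coder as one 'item'."""
    k = len(ratings_by_coder)
    sum_item_vars = sum(variance(r) for r in ratings_by_coder)
    totals = [sum(scores) for scores in zip(*ratings_by_coder)]
    return k / (k - 1) * (1 - sum_item_vars / variance(totals))


# Hypothetical ratings on the 0-2 scale for six participants
coder_1 = [2, 1, 1, 2, 0, 1]
coder_2 = [1, 1, 2, 2, 1, 1]

alpha = cronbach_alpha([coder_1, coder_2])
clarity_scores = [(a + b) / 2 for a, b in zip(coder_1, coder_2)]

print(round(alpha, 2), clarity_scores)
```

When the coders agree perfectly, α is 1; the more their ratings diverge, the closer α falls toward 0 (or below it).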

The averaged speech clarity scores had a mean of 1.40 (SD = .46) and formed a skewed (skew = -.46, SE = .30), platykurtic (kurtosis = -.52, SE = .60) distribution. The levels of skew and kurtosis were probably acceptable, falling within 2 standard errors.

Results

The question under investigation was whether clarity of speech was correlated with linguistic ability, as measured by idea density (the average number of propositions per 10 words), in the conversational speech of this sample. The answer is no. The correlation between clarity of speech and idea density was not significantly different from 0 (r = .03, p = .81), so this sample provides no evidence that clarity of speech and idea density are related.

Idea density was also not significantly correlated with gender, level of education of the parents of the participant, or the number of propositions per annotation. On the other hand, idea density was negatively correlated with number of words spoken per annotation (r = –.38, p < .01), positively correlated with age of participant (r = .25, p = .04), and had a marginal positive correlation with the education level of participant (r = .21, p = .09).

Clarity of speech was marginally correlated with only one other variable, level of participant education (r = .24, p = .06). See Table 1 for all correlation values.
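
As a rough check on the significance levels reported in this section, a two-tailed p-value can be approximated from r and the sample size using the Fisher z transformation. This is a sketch under two assumptions that are mine, not the original analysis: that all 66 participants contributed to every correlation, and that a normal approximation is close enough. Statistical software normally uses a t-test instead, so the values will differ slightly.

```python
import math


def approx_p_value(r, n):
    """Approximate two-tailed p for a Pearson r via the Fisher z transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


n = 66  # assumption: no missing data for any pair of variables
for label, r in [("clarity x idea density", 0.03),
                 ("words x idea density", -0.38),
                 ("age x idea density", 0.25),
                 ("education x idea density", 0.21)]:
    print(f"{label}: r = {r:+.2f}, p = {approx_p_value(r, n):.3f}")
```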

Discussion

This study showed no support for the hypothesis that clarity of enunciation was positively correlated with linguistic ability. The dimensions may, therefore, be orthogonal.

It is true that the methodology used was limited in several ways. The quality of the sound recordings was highly variable. The coding of enunciation was subjective and unreliable, and the coders were not blind to the hypothesis. Idea density was coded by one individual, so there was no way to check the reliability of that coding. Additionally, the difference between the correlation of enunciation with idea density for one coder (r = .173) and for the other (r = –.157) was marginally significant (p = .06), using Fisher’s r-to-z transformation to make the comparison.
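
The coder comparison above can be sketched as follows. This is the standard independent-samples form of the Fisher r-to-z test, with n = 66 assumed for both correlations; both are my assumptions, since the exact procedure is not spelled out here. Strictly speaking, the two coders rated the same participants, so a test for dependent correlations would be more appropriate.

```python
import math


def compare_correlations(r1, r2, n1, n2):
    """Fisher r-to-z test for the difference between two correlations.

    Treats the two correlations as coming from independent samples.
    Returns the z statistic and a two-tailed p-value.
    """
    z_diff = math.atanh(r1) - math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = z_diff / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z, p = compare_correlations(0.173, -0.157, 66, 66)
print(round(z, 2), round(p, 2))
```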

Clearly, these results may not be the best indication of the relationship between these two variables, and improved methodology might reveal different results. On the other hand, it may be prudent to look elsewhere for correlates of idea density. The only moderately strong correlation with idea density in this study, for example, was the negative correlation with number of words spoken. The idea that high rates of speech might indicate low verbal ability is somewhat counterintuitive and intriguing.

Another possibility is that the idea density coding used by Riley et al. (2005) and Snowdon et al. (1996) is not appropriate for conversational speech, or might need some modification; there may be ways that spoken and written language differ that need to be taken into account. One annotation in this sample, for example, contained only one word, “Yeah,” which would have given this participant an average of 10 propositions per 10 words. (It was not used, for this reason.) Surely, a tendency to speak in one-word utterances is not an indication of linguistic ability!

Other possible differences between written and spoken language are the prevalence of filler words and run-on sentences in spoken language. Here is an example from the sample: “No they were talking about Eddie/ And they were like OH YEAH EDDIE was getting a hold of me like he wants to say goodbye to all his homies before he goes/ And I was like they were like he hasn’t called you? And I was like no?” Chances are, had this passage been written instead of spoken, “said” would replace “was like” and “were like,” and the “and” at the beginning of 3 of the 4 utterances would not appear. Although in this case the changes would balance each other, it may be that on the whole, written and conversational language differ significantly in their average idea density, and perhaps even in the cognitive factors that produce high or low idea density in each. Unfortunately, Riley et al. (2005) and Snowdon et al. (1996) did not publish their idea density statistics, or that analysis could have begun in this study. (This is not surprising, since they published in neurobiology and medical journals, not linguistics or psycholinguistics journals.)

Flawed as it is, this study represents the only evidence to date that I am aware of about the relationship between clarity of speech and linguistic ability, and, so far, it seems unwise to judge someone intelligent based only on the crispness of their enunciation.

References

Amlani, A. M., Rakerd, B., & Punch, J. L. (2006). Speech-clarity judgments of hearing-aid-processed speech in noise: Differing polar patterns and acoustic environments. International Journal of Audiology, 46, 319-330.

Ansel, B. M., & Kent, R. D. (1992). Acoustic-phonetic contrasts and intelligibility in the dysarthria associated with mixed cerebral palsy. Journal of Speech and Hearing Research, 35, 296-308.

Kintsch, W., & Keenan, J. (1973). Reading rate and retention as a function of the number of propositions in the base structure of sentences. Cognitive Psychology, 5, 257-274.

Kuhl, P. K., Andruski, J. E., Chistovich, I. A., Chistovich, L. A., Kozhevnikova, E. V., Ryskin, V. L., Stolyarova, E. I., Sundberg, U., & Lacerda, F. (1997). Cross-language analysis of phonetic units in language addressed to infants. Science, 277, 684-686.

Liu, H. M., Kuhl, P. K., & Tsao, F. M. (2003). An association between mothers’ speech clarity and infants’ speech discrimination skills. Developmental Science, 6, F1-F10.

Riley, K. P., Snowdon, D. A., Desrosiers, M. F., & Markesbery, W. R. (2005). Early life linguistic ability, late life cognitive function, and neuropathology: Findings from the Nun Study. Neurobiology of Aging, 26, 341-347.

Snowdon, D. A., Kemper, S. J., Mortimer, J. A., Greiner, L. H., Wekstein, D. R., & Markesbery, W. R. (1996). Linguistic ability in early life and cognitive function and Alzheimer’s disease in late life: Findings from the Nun Study. Journal of the American Medical Association, 275, 528-532.

Turner, A., & Greene, E. (1977). The construction and use of a propositional text base. University of Colorado, Institute for the Study of Intellectual Behavior; Boulder, CO.

Table 1

Correlations of all variables

Variables             1.      2.      3.      4.      5.      6.      7.      8.      9.      10.
1. Propositions
2. Words              .84**
3. Idea Density       .17     -.38**
4. Speech Clarity     -.04    -.07    .03
5. Coder 1 SC         -.08    -.20    .17     .88**
6. Coder 2 SC         .06     .15     -.16    .71**   .29*
7. Gender             .05     .14     -.15    .08     .07     -.02
8. Age                .12     -.01    .25*    .18     .21     .03     -.02
9. Participant Ed.    .11     -.03    .21     .24     .24     .16     -.16    .52**
10. Mother Ed.        -.08    -.01    -.13    -.03    -.01    -.04    -.02    .06     .23
11. Father Ed.        .04     .02     .03     .00     -.05    .04     -.08    .12     .28*    .47**

Note. SC = speech clarity. Ed. = education. Gender was coded 0 = female, 1 = male.

* Correlation is significant, p < .05

** Correlation is significant, p < .01

Appendix A

Idea density coding

The following is from Turner and Greene (1977):

1. Modified arguments of predicate propositions

2. Connected arguments of predicate propositions

3. Predicate propositions

4. Modifiers of predicate propositions

5. Modified arguments of circumstantial propositions

6. Circumstantial propositions

7. Other connective propositions within clause

8. Repeat

Enunciation coding

0 = noticeably unclear speech

1 = average clarity of speech

2 = noticeably clear speech

Gender coding

0 = female

1 = male

Education level coding

1 = Less than high school diploma

2 = High school diploma

3 = Some college

4 = Undergraduate degree

5 = Some graduate school

6 = Graduate degree

An Important Lexical Retrieval Variable

I just read about two studies that found that humanities lecturers use more filled pauses (time spent saying ‘uh,’ ‘um,’ etc.) than science lecturers, and that it’s probably because the humanities have more synonyms to draw upon. In science, it is very useful in conversation to have very precise, technical definitions of each word that everyone agrees upon. Empathy, for example, cannot mean or connote compassion in psychological discourse, and if it does, you run into problems.

Maybe that’s why the people in my social cognition lab (can) talk so fast. They all understand each word precisely, so ideas can come and go very rapidly. Still too rapidly for me to understand, sometimes.

[First published on Nathen’s Miraculous Escape, May 7, 2009.]