• Introduction
    • Prosodic structure
    • Redundancy
    • Motivation and Hypotheses


    Introduction

    We often don't say the same word the same way in different situations. If we read a list of words out loud, we say them differently from when we produce them spontaneously in conversation. Even within spontaneous speech there are wide differences in the articulation of the same word by the same speaker. Some words become extremely reduced, while others get longer and louder and seem to stand out more strongly in a phrase or sentence. This thesis explores these variations in articulation from two different but arguably related perspectives: prosodic structure and redundancy.



    Prosodic structure

    Phoneticians and phonologists have studied 'suprasegmental' effects (variation that appears to occur at the phrase or word level) for many years, and have proposed various theories of prosodic structure to account for them. They have shown that these variations are not random but often extremely systematic. In general, theories of prosodic structure concentrate on three distinct though clearly related phenomena:

    1. Prominence: Some parts of the speech stream stand out more than other parts.
    2. Boundaries: Speech is split up into chunks which are marked by suprasegmental phenomena (for example, pauses and differences in tone, amplitude, segmental duration and prominence).
    3. Information Giving: Changes in prosodic structure can alter the meaning of the message (for example, changing a statement into a question).

    Looking closely at the way prominence is realised in spoken language, laboratory phonetics has found that prominent syllables are more clearly articulated. That is, their segments tend to be longer, their spectral characteristics are more distinct, and they are louder and often marked with pitch change. Words with such prominence also tend to be easier for human subjects to recognise when excerpted from context.

    In general:

    prominence = more care of articulation = more noticeable = easier to recognise



    Redundancy

    Prosodic structure clearly affects care of articulation. However, another factor, redundancy, also appears to have a major impact. More common words, and words that can easily be predicted from context (that is, more redundant words), tend to be articulated less clearly. For example, the 'nine' in the phrase 'a stitch in time saves nine' is less clearly articulated than the 'nine' in 'I would like nine please'.

    Lindblom (1990), in his H&H theory, suggests that we put only as much effort into articulation as the listener requires in order to understand. He argues that we tend to under-articulate easily predictable (redundant) sections of speech and over-articulate difficult-to-predict (less redundant) sections of speech.
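
    One simple way to put a number on "predictable from context" is the conditional probability of a word given the word before it. The sketch below is illustrative only: the toy corpus, the function names and the choice of a maximum-likelihood bigram estimate are all assumptions made here, loosely modelled on the 'nine' example, and are not the redundancy measures used in the thesis itself.

```python
import math
from collections import Counter


def train_bigram_counts(tokens):
    """Count contexts and bigrams in a list of word tokens (hypothetical helper)."""
    context_counts = Counter(tokens[:-1])            # words that have a successor
    bigram_counts = Counter(zip(tokens, tokens[1:])) # adjacent word pairs
    return context_counts, bigram_counts


def conditional_log_prob(prev, word, context_counts, bigram_counts):
    """log2 P(word | prev) by maximum likelihood; -inf if the bigram is unseen."""
    seen = bigram_counts[(prev, word)]
    if seen == 0:
        return float("-inf")
    return math.log2(seen / context_counts[prev])


# Toy corpus, invented for illustration; sentence boundaries are ignored.
corpus = (
    "a stitch in time saves nine a stitch in time saves nine "
    "i would like nine please i would like two please"
).split()
contexts, bigrams = train_bigram_counts(corpus)

# A higher (less negative) log probability means a more redundant word.
predictable = conditional_log_prob("saves", "nine", contexts, bigrams)
less_predictable = conditional_log_prob("like", "nine", contexts, bigrams)
```

    In this toy corpus 'nine' always follows 'saves' but follows 'like' only half the time, so the first score is higher than the second. A real study would need a far larger corpus and some form of smoothing for unseen word pairs.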



    Motivation and Hypotheses

    So we appear to have two quite different factors controlling the care with which we articulate speech. On the one hand, we have a complex prosodic structure which allows prominence and the chunking of speech; on the other, we have complex interactions within the structure of language which make some sections of speech predictable and others less so.

    This thesis will try to disentangle these factors. It explores the relationship between theories of prosodic structure, care of articulation and measurements of redundancy in a corpus of spontaneous spoken language. In doing so, it aims to unite traditional phonological views of language structure with a stochastic, data-driven approach to language analysis.

    Understanding these variations in articulation is of great importance both for engineers who wish to design effective speech recognition and synthesis software and for psycholinguists and phoneticians who wish to understand the human language system. Such an investigation could help refine theories of suprasegmental change, allowing us not only to predict articulation variation in the speech stream but also to use this variation to explore the internal state of a speaker's language system.

    The central questions this thesis will address are:

    1. Can we build an effective model of care of articulation that allows a quantitative analysis of large quantities of spontaneous speech? What are the problems and limitations of such a model?
    2. To what extent does a modern theory of prosodic structure account for such changes in the care of articulation, in contrast to some simple measures of redundancy?
    3. How much interdependency exists between redundancy measurements and prosodic structure? Can concepts of predictability and prosodic structure be integrated to offer a stronger predictive framework for changes in care of articulation?


    Matthew Aylett
    2/10/1999