Monique E. van Donzel and Florien J. Koopmans-van Beinum

Institute of Phonetic Sciences/IFOTT, University of Amsterdam, The Netherlands


This paper describes an experiment investigating the different pausing strategies used in discourse in Dutch. Spontaneous discourses were recorded from four male and four female native Dutch speakers. Silent and filled pauses were located in the speech signal, as well as lengthened words. These were subsequently related to different discourse structures, obtained independently of prosodic features. Results show that there are basically three different types of pausing: silent pauses, filled pauses, and lengthening of words. Speakers apply these means in different ways to achieve pausing, using one specific pause type or a combination of more than one. The way of applying pausing is rather uniform within one speaker, whereas the choice of a particular strategy is largely speaker dependent.


In spontaneous speech, as well as in more prepared types of speech, speakers use pausing strategies to structure the continuation of their discourse. At certain points in their discourse, they will have to wait to determine how they want to continue their telling, since the exact content of their message is not fixed, as it is for instance in texts read aloud. A speaker may also introduce a pause for more functional reasons, for instance to build up tension or to raise specific expectations in listeners about the rest of the telling. Furthermore, speakers may pause before important words to draw the listener's attention. Speakers may differ in the way they achieve pausing: some will just introduce a silent interval, while others prefer to pause using 'fillers'.

A difference in pausing may also depend on the kind of speech material. In a monologue, a speaker will use pauses to give the listener time to process the message. Since the speaker is usually not interrupted in a monologue, it will be sufficient to just introduce a silent pause. In dialogues, the different participants can use pausing to mark different things. They may want to signal the end of their contribution and give the floor to the other participants. In such a case, the speaker can just use a silent pause. They may also wish to continue, but need a pause to plan the follow-up of their contribution. In this case, one can imagine the speaker using all kinds of fillers to make sure that he/she holds the turn.

The relation between pausing and discourse structure is often investigated for fairly structured material, for instance instruction monologues [1, 2] or for more restricted texts [3, 4]. Pauses are said to coincide with boundaries on both the clause/sentence and the paragraph level (prosodic discourse boundaries), and to occur before highly important words. These studies, however, have been primarily concerned with different kinds and types of lengthening in relation to discourse boundaries, or the difference in the uses of pauses in spontaneous versus read speech. In our study, we want to investigate the actual strategy, more than the effect, for spontaneous discourse in Dutch.

The aim of our experiment was to see how different kinds of pausing strategies, such as silent and filled pauses, and lengthened words, are used by speakers in discourse in Dutch. We wanted to see whether speakers differ in the various means available to achieve pausing, whether they use these means differently, and whether there is any systematicity in the way they do it.

The approach was to measure the pauses in the speech signal, and to mark their locations in the discourse structure. The discourse structure was analysed independently of any prosodic features, using a method based on the written transcription alone [5]. The application of this method resulted in a detailed analysis which indicates the division of the text into clauses and paragraphs on a global level, and the information status of the words and word groups on a local level.


2.1. Speakers, text analysts, and stimuli

Four male and four female native speakers of Dutch were selected as speakers. They were all students or staff members of the Institute of Phonetic Sciences. A panel of five Dutch text analysts, all familiar with discourse analysis, participated in the analysis of the discourse structure.

The speakers were asked to read aloud a short story in Dutch ('A Triumph', by S. Carmiggelt). After a short break, they were asked to retell the same story in their own words, with as many details as possible. During the retelling of the story a listener was present in the recording studio, to create a more natural storytelling situation. This resulted in eight spontaneously retold versions of the same story (hereafter `retold version'). All recordings were made in an anechoic room on DAT tape. The retold versions were stored as digitized audio files (low-pass filtered at 24 kHz, sample rate 48 kHz, 16-bit precision).

2.2. Analysis of discourse structure

Verbatim transcriptions of the eight retold stories were made by the first author. These transcriptions were analysed for discourse structure by the first author and then checked by the panel of discourse analysts, according to the method described below.

First of all, a division was made into clauses and paragraphs. A clause is defined as a unit containing words or word groups which are grouped together on semantic or functional grounds. A paragraph consists of several clauses dealing with the same topic. Then, at the word level, a distinction was made between three types of information: new information, information which the speaker assumes is completely new in the listener's context; inferrable information, information which the speaker assumes the listener can infer from the preceding context or his/her knowledge of the world; evoked information, information that has previously been mentioned in the discourse, and that is known to the listener. Furthermore, discourse markers, indicating the main transition points between the different parts of the discourse, were indicated. For a detailed description of the analysis, see [5].

2.3. Three kinds of pausing strategies

To get a first impression of the types of pausing used by our speakers, we performed an informal listening experiment with the eight retold versions (one listener: the first author). We found that there were basically three different kinds of pausing used by the speakers. First of all, a silent pause, breathing pauses included, characterised by a total absence of speech. Secondly, a filled pause, which consists of a hesitation sound (`eeh'), preceded and/or followed by a silence. Thirdly, speakers appear to use the lengthening of certain words as a pause to plan the continuation of their telling. These findings led us to expect, on the one hand, differences between speakers in the strategy used to achieve pausing, and, on the other hand, consistent use of a given strategy within one speaker.

The three kinds of pausing were operationalized in the following way.

* Silent pauses. A pause was labeled as a silent pause if its duration was at least 150 msec. This minimal length was chosen to ensure that closure times of stop consonants were not included. In the occasional case where a closure time did exceed 150 msec, it was not marked as a silent pause.

* Filled pauses. A hesitation sound (`eeh') was labeled as a filled pause. These elements in the discourse do not have any lexical meaning, but they can indicate that the speaker needs time to plan the continuation of his/her telling, that he/she wants to avoid a silence, or that he/she wants to `hold the floor'. Silences preceding and/or following the hesitation sound were marked as 'silence to a filled pause' (thus not as a silent pause), even if they were shorter than 150 msec.

* Lengthening. A speaker can use lengthening as a planning tool by sustaining a particular vowel or consonant within certain words. As a first step the words containing lengthening were determined by ear by the first author, and in a later stage checked by the second author. Two kinds of lengthening were observed: a schwa added after the last consonant of the word, and the lengthening of a word-internal vowel or a consonant.
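The first two labeling rules above can be illustrated with a short Python sketch. This is a minimal sketch under stated assumptions, not necessarily the procedure used in the experiment: it assumes the signal has already been segmented into intervals, and the `Interval` class, its `kind` values, the label strings, and the function name are all hypothetical. Only the 150 msec threshold and the special status of silences adjacent to a hesitation sound come from the text. Stop closures longer than 150 msec would still have to be excluded by hand, and lengthening, which was determined by ear, is not covered.

```python
from dataclasses import dataclass

MIN_SILENT_PAUSE_MS = 150  # minimum duration of a silent pause, as stated in the text


@dataclass
class Interval:
    start_ms: float
    end_ms: float
    kind: str  # hypothetical segment types: "speech", "silence", "hesitation"


def label_pauses(intervals):
    """Return one label per interval, following the rules for silent and filled pauses."""
    labels = []
    for i, iv in enumerate(intervals):
        if iv.kind == "hesitation":
            labels.append("filled pause")
        elif iv.kind == "silence":
            prev_kind = intervals[i - 1].kind if i > 0 else None
            next_kind = intervals[i + 1].kind if i + 1 < len(intervals) else None
            if "hesitation" in (prev_kind, next_kind):
                # Silences next to a hesitation sound belong to the filled pause,
                # even if they are shorter than the 150 msec threshold.
                labels.append("silence to a filled pause")
            elif iv.end_ms - iv.start_ms >= MIN_SILENT_PAUSE_MS:
                labels.append("silent pause")
            else:
                # Too short to count as a pause (e.g. a stop closure).
                labels.append("unlabeled")
        else:
            labels.append("speech")
    return labels


# Invented example data, for illustration only.
demo = [
    Interval(0, 400, "speech"),
    Interval(400, 600, "silence"),       # 200 msec
    Interval(600, 900, "speech"),
    Interval(900, 1000, "silence"),      # 100 msec, before a hesitation sound
    Interval(1000, 1300, "hesitation"),
    Interval(1300, 1700, "speech"),
]
for iv, label in zip(demo, label_pauses(demo)):
    print(iv.start_ms, iv.end_ms, label)
```

Checking the neighbouring hesitation sound before applying the duration threshold mirrors the rule that silences around a filled pause are never counted as silent pauses.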


3.1. Pausing strategy per speaker

Our first question is whether speakers differ in the use of the various means available to achieve pausing. In other words, in what way are silences, filled pauses, and lengthening used by the speakers in their discourse? Therefore, we checked for each speaker the number of times he/she used a specific means. This is given in Table 1. The filled pauses here include the hesitation sound, whether or not accompanied by a silent interval.

Table 1: Total absolute number of occurrences of various pausing means (% between brackets), broken down per speaker.
Speaker  Silence   Filled pause  Lengthening  Total
1        57 (50)   23 (20)       35 (30)      115
2        60 (71)    3 (4)        21 (25)       84
3        79 (77)    1 (1)        23 (22)      103
4        50 (45)    4 (4)        57 (51)      111
5        43 (53)    2 (3)        36 (44)       81
6        36 (34)   25 (24)       45 (42)      106
7        42 (61)    3 (4)        24 (35)       69
8        28 (30)    3 (3)        63 (67)       94

The data from Table 1 clearly show differences in pausing strategy between the eight speakers. Speakers 4, 6, and 8 rely less on silences to achieve pausing than the other speakers do. Instead, speakers 4 and 8 use a lot more lengthening than the others, whereas speaker 6 introduces filled pauses. As for filled pauses, both speakers 1 and 6 use them extremely often compared to the other speakers. Lengthening is used mostly by speakers 4 and 8, and less by the other speakers. Furthermore, the data show that speakers 2 and 3, and 5 and 7, behave in a fairly similar way overall.
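The per-speaker percentages in Table 1 are simply each count divided by that speaker's total number of pausing events. As an illustration, the following Python sketch recomputes them from the counts in Table 1; the names `TABLE_1`, `MEANS`, and `shares` are hypothetical and introduced here only for the example. Because the values are recomputed without rounding, they may differ by a percentage point from the rounded figures printed in the table.

```python
# Counts copied from Table 1: (silent pauses, filled pauses, lengthened words).
TABLE_1 = {
    1: (57, 23, 35),
    2: (60, 3, 21),
    3: (79, 1, 23),
    4: (50, 4, 57),
    5: (43, 2, 36),
    6: (36, 25, 45),
    7: (42, 3, 24),
    8: (28, 3, 63),
}
MEANS = ("silence", "filled pause", "lengthening")


def shares(counts):
    """Return the percentage share of each pausing means for one speaker."""
    total = sum(counts)
    return {name: 100 * count / total for name, count in zip(MEANS, counts)}


for speaker, counts in TABLE_1.items():
    row = {name: round(value, 1) for name, value in shares(counts).items()}
    print(speaker, sum(counts), row)
```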

Several types of silences were also distinguished in labeling the pauses. First of all, the silent pause ('Sil'), as defined above. Furthermore, we distinguished silences before and after filled pauses ('Sil # Filled' resp. 'Filled # Sil'), and after lengthened words ('Leng. # Sil'). Occasionally, there was a silence between a lengthened word and a filled pause; these silences were marked separately as well ('Leng. # Sil # Filler'). Table 2 presents the internal distribution of the filled pauses and the pauses related to lengthening, for the eight speakers.

Table 2 shows that the filled pauses as used by speakers 1 and 6 consist of the hesitation sound preceded and followed by a silent interval. The other speakers use fewer filled pauses altogether, but when they do, it is practically only the hesitation sound, with occasionally a silence preceding and/or following it. Again, speakers 5 and 7 show the same pattern of behaviour. As for lengthening, speakers 2, 3, 4, and 8 use an additional silence after a lengthened word in almost half of the cases, while the other speakers do so much less. Recall from the previous table that speakers 4 and 8 use lengthening more than all other speakers to achieve pausing, and more than any other means at all.

Table 2: Total absolute number of occurrences of various types of silences in relation to filled pauses and lengthening, broken down per speaker. More details can be found in the text.
Spkr  Sil # Filled  Filled pause  Filled # Sil  Lengthening  Leng. # Sil  Leng. # Sil # Filler
1     18            23            13            35            6           4
2      0             3             3            21            9           0
3      1             1             0            23           12           0
4      2             4             3            57           35           2
5      1             2             1            36           10           0
6      9            25            12            45            7           3
7      1             3             1            24            6           1
8      3             3             0            63           30           0

3.2. Pausing and discourse structure

The next step is to investigate the relation between pausing and discourse structure. What is the relation between pauses and the global and local structure of the discourse? At what places in the spontaneous discourse do pauses occur? Before we can answer these questions, we first have to take a closer look at the internal structure of the discourse itself.

Global discourse structure. Table 3 gives the distribution of the discourse on a global level in terms of number of words, clauses, and paragraphs, as well as the percentage of the different discourse structures marked with a pause. At this point we are concerned with the occurrence of pauses on the whole, and we will therefore take a pause to be either a silent or a filled pause (hesitation sound plus surrounding silent intervals). Data on lengthening will be discussed in section 3.3.

Table 3: Total absolute number of discourse structures, and absolute number of structures marked with a pause (% between brackets), broken down per speaker.
Sp  Words  +Pause    Clauses  +Pause    Paragraphs  +Pause
1   537    83 (16)   71       40 (56)   12          10 (83)
2   459    71 (16)   70       46 (66)   10          10 (100)
3   582    90 (15)   75       42 (56)   10          10 (100)
4   504    89 (18)   69       41 (59)    9           9 (100)
5   491    55 (11)   63       33 (52)   10           9 (90)
6   361    67 (18)   45       37 (82)    9           9 (100)
7   417    52 (13)   60       28 (47)    7           7 (100)
8   511    61 (12)   65       22 (34)   10           8 (80)

Note that the categories mentioned in the table are not mutually exclusive. A pause occurring at a clause boundary can at the same time occur at a paragraph boundary; pauses at word level can also coincide with pauses at clause or paragraph level. This table shows the number and percentage of specific discourse structures marked with a pause.

The results at word level show that a similar percentage of word boundaries is marked by a pause (11-18%) by all speakers, independent of the length of the discourse. This could mean that about 15% of the word boundaries are commonly marked by a pause, at least for Dutch. As far as we know, nothing has been reported on this in the literature so far. At clause level, speakers mark around 56% of the clause boundaries with a pause. Two speakers, however, show a different pattern: speaker 6 marks 82% of the clauses with a pause, while speaker 8 marks only 34%. The general picture is, however, in accordance with other findings on pausing in spontaneous speech in Dutch [1]. Paragraph boundaries are marked by the majority of speakers in 100% of the cases.
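The overall figures at word and clause level can be checked against the counts in Table 3. The sketch below pools the counts over all eight speakers, which is an assumption: the text does not state whether the overall values were pooled or taken as a typical per-speaker value. The names `TABLE_3`, `LEVELS`, and `marking_rate` are hypothetical.

```python
# Counts copied from Table 3:
# (words, words +pause, clauses, clauses +pause, paragraphs, paragraphs +pause)
TABLE_3 = {
    1: (537, 83, 71, 40, 12, 10),
    2: (459, 71, 70, 46, 10, 10),
    3: (582, 90, 75, 42, 10, 10),
    4: (504, 89, 69, 41, 9, 9),
    5: (491, 55, 63, 33, 10, 9),
    6: (361, 67, 45, 37, 9, 9),
    7: (417, 52, 60, 28, 7, 7),
    8: (511, 61, 65, 22, 10, 8),
}
# Column positions of (total, marked with a pause) for each discourse level.
LEVELS = {"word": (0, 1), "clause": (2, 3), "paragraph": (4, 5)}


def marking_rate(rows, level):
    """Percentage of units at the given level that are marked with a pause."""
    total_idx, marked_idx = LEVELS[level]
    rows = list(rows)
    total = sum(row[total_idx] for row in rows)
    marked = sum(row[marked_idx] for row in rows)
    return 100 * marked / total


for level in LEVELS:
    print(level, round(marking_rate(TABLE_3.values(), level), 1))
```

Passing a single row, for example `marking_rate([TABLE_3[6]], "clause")`, gives the rate for one speaker.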

Local discourse structure. We furthermore checked the pausing strategies at a more local level of discourse. In spontaneous speech, pauses can also occur right after the first word of a clause. By pausing at this particular point, the speaker gives the listener time to process the relation between the preceding clause and the one he/she is about to utter. In spontaneous discourse, this first word in the clause is usually the connective 'and' or some other kind of discourse marker. Table 4 shows the number of pauses occurring after discourse markers and connectives. It also gives the number of pauses at clause boundaries, the total number of pauses, and a 'rest group' (total pauses minus pauses after discourse markers or connectives and at clause boundaries).

Table 4: Total absolute number of pauses (% between brackets) occurring after discourse markers (dm) and connectives, at clause boundaries, and in the rest group, broken down per speaker.
Speaker  After dm or connective  At clauses  Total pauses  Rest group
1        13 (16)                 40 (48)     83            30 (36)
2        12 (17)                 46 (65)     71            13 (18)
3        17 (19)                 42 (47)     90            31 (34)
4        22 (25)                 41 (46)     89            26 (29)
5         6 (11)                 33 (60)     55            16 (29)
6        11 (16)                 37 (55)     67            19 (29)
7         5 (10)                 28 (54)     52            19 (36)
8         6 (10)                 22 (36)     61            33 (54)

The results from Table 4 show that between 10% and 25% of the pauses are placed after discourse markers or connectives. Apparently speakers do use these points in the discourse to insert a pause. Speaker 4, however, makes use of this option more often than the rest of the speakers, while speakers 5, 7, and 8 make very little use of this strategy. Pauses occurring in the 'rest group' are pauses within clauses, for instance between word groups. The data from this column show that speaker 8 behaves differently in this respect: the pauses occurring within clauses constitute over 50% of all pauses used in his discourse. This means that this speaker uses a pausing strategy which is concentrated much more at the level of words or word groups than at the level of clause or paragraph.

If we take the percentages pausing at clause boundaries and after discourse markers and connectives together for all speakers, we are able to account for 67% of the pauses. If we leave out speaker 8, who shows a different pattern of pausing, this percentage rises to 70%.
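These two figures can be reproduced from the counts in Table 4. The sketch below assumes that the counts are pooled over speakers before the percentage is taken; the text does not say whether pooling or averaging of per-speaker percentages was used. The names `TABLE_4` and `accounted_share` are hypothetical.

```python
# Counts copied from Table 4:
# (pauses after dm or connective, pauses at clause boundaries, total pauses)
TABLE_4 = {
    1: (13, 40, 83),
    2: (12, 46, 71),
    3: (17, 42, 90),
    4: (22, 41, 89),
    5: (6, 33, 55),
    6: (11, 37, 67),
    7: (5, 28, 52),
    8: (6, 22, 61),
}


def accounted_share(table, exclude=()):
    """Pooled percentage of pauses at clause boundaries or after dm/connectives."""
    rows = [row for speaker, row in table.items() if speaker not in exclude]
    explained = sum(after_dm + at_clause for after_dm, at_clause, _ in rows)
    total = sum(total_pauses for _, _, total_pauses in rows)
    return 100 * explained / total


print(round(accounted_share(TABLE_4)))               # all eight speakers
print(round(accounted_share(TABLE_4, exclude={8})))  # leaving out speaker 8
```

With the pooled counts, 381 of 568 pauses fall into the two categories for all speakers, and 353 of 507 when speaker 8 is left out, which rounds to the 67% and 70% given above.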

3.3. Lengthening

As a first step in the investigation of lengthening in relation to discourse structure, we checked the total number of lengthened words (cf. Table 1). The question now is what the relation is between lengthening as a pausing strategy and discourse structure. Apparently speakers insert pauses after the first word in a clause (cf. Table 4). Since we consider lengthening one of the strategies for achieving pausing, we checked whether lengthening also occurs mainly at these points. This is shown in Table 5.

The data from this table show that lengthening at discourse markers and connectives accounts on average for 36% of the cases. Differences between speakers are rather large, however. Speakers 2 and 4 use lengthening at discourse markers or connectives about half of the time, while for the others it lies around 30%. The lengthened words in the 'rest group' did not show a consistent pattern at first sight, but it seems that determiners of highly informative words can be lengthened, just like auxiliaries to verbs expressing new information. So far, little is known about this lengthening in spontaneous discourse, and more research will be needed before the matter becomes clear.

Table 5: Total absolute number of lengthened words (% between brackets) occurring after discourse markers (dm) and connectives (conn), broken down per speaker.
Spkr  At dm or conn  Total lengthening  Rest group
1     11 (31)        35                 24 (69)
2     12 (57)        21                  9 (43)
3      9 (39)        23                 14 (61)
4     28 (49)        57                 29 (51)
5     11 (30)        36                 25 (70)
6     16 (35)        45                 29 (65)
7      5 (21)        24                 19 (79)
8     16 (25)        63                 47 (75)


In conclusion, we can state that the various means to achieve pausing are indeed used differently by the eight speakers. This means that there are several ways for a speaker to apply 'pausing'. The choice of a specific strategy is largely speaker-dependent. The data from our experiment furthermore show that it is possible to roughly distinguish groups of speakers having the same pausing strategy (speakers 1 & 6, 2 & 3, 4 & 8, 5 & 7). This could very well be a matter of 'good' or 'bad' speakers: the use of one specific strategy might give a more natural way of telling than another. A listening experiment will be carried out to test this. Table 6 presents an overview of the different pausing strategies used by the speakers.

The data on pausing after discourse markers or connectives could imply that these first words in the clause are not really part of the discourse as such, since they often do not express any lexical meaning. In spontaneous discourse, a lot of clauses beginning with 'and' express the transition from the preceding clause to the next rather than a coordination relation. So they are used by the speaker to signal the major transition points, and to plan the content of the continuation of the discourse.

In [6] the perception of discourse boundaries in the retold versions was investigated. In relation to pausing strategy, the results indicate that the presence of an acoustic pause probably had a large influence on the listeners' perception of discourse boundaries, at least at clause boundaries and after discourse markers or connectives, since most pauses occur at these points in the discourse. We expect to find, in addition to the pauses, low boundary tones at full prosodic boundaries, such as paragraphs. Low boundary tones are usually associated with finality. At the weaker boundaries, we expect to find a pause or a high boundary tone, associated with non-finality. Measurements and analyses of F0 are being carried out at the moment. Data on perceived prominence will also be included.

Table 6: Overview of the different pausing strategies, broken down per speaker, using symbols + (much used) and - (little used) to represent specific strategies and uses.
Speaker      1   2   3   4   5   6   7   8
Silence      +   +   +   -   +   -   +   -
Filled       +   -   -   -   -   +   -   -
Lengthening  -   -   -   +   -   -   -   +

This experiment was carried out within the larger project on the acoustic correlates of focussing in spontaneous discourse. Within that same project, we also performed an experiment on speaking rate in relation to discourse structure. This experiment is reported on in [7], and also includes data on the durations of the pauses measured in the present experiment.


The data were analysed during the first author's stay at Lund University, Department of Linguistics and Phonetics (Spring 1996). Many thanks are due to Gösta Bruce, to Merle Horne, and to Louis Pols, for careful reading of the draft and for useful comments in general.