Advances in Sociophonetics


Chapter 2. French liaison and the lexical repository



3. Results


3.1 Distributional analysis of liaison types


The 16,805 liaison occurrences produced in free and guided conversations were
distributed over 3,105 environments (or “types”) of liaison. Each environment
had its own token frequency, ranging from 1,318 down to 1. The data
were plotted on a log-log graph (Figure 2). Log-log graphs are two-dimensional
graphs of numerical data that use logarithmic scales on both the horizontal and
vertical axes, and can be used to examine the tail of a distribution.
In statistics and probability theory, plotting a distribution on a log-log graph
is common practice because it keeps even low-frequency data clearly
visible.
In our analysis, the y-axis shows the frequency of each liaison type and
the x-axis shows the rank of each type according to its frequency.
If the points in the plot tend to fall on a straight line for large values
on the x-axis, the distribution can be taken to have a power-law
tail (Jeong et al. 2000). Figure 2 displays each liaison type by its rank
(x-axis) against its number of occurrences in the corpus (y-axis).
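
The straight-line criterion follows from the form of a power law: if frequency f depends on rank r as f(r) = C * r^(-a), then log f = log C - a * log r, which is a line of slope -a on log-log axes. The sketch below illustrates this, assuming a plain list of liaison-type labels as input. The function names (`rank_frequency`, `loglog_slope`, `types_needed`) and the toy data are hypothetical and are not the procedure or the data used in this study.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return the token frequency of each type, most frequent first."""
    return sorted(Counter(tokens).values(), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log10(frequency) against log10(rank).

    Expects at least two frequencies, all greater than zero.
    """
    xs = [math.log10(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log10(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def types_needed(freqs, share):
    """Smallest number of top-ranked types that cover `share` of all tokens."""
    target = share * sum(freqs)
    running = 0
    for rank, f in enumerate(freqs, start=1):
        running += f
        if running >= target:
            return rank
    return len(freqs)


# Synthetic illustration only (NOT the PFC data): an exact power law f(r) = 1000 / r.
toy_freqs = [1000 / r for r in range(1, 101)]
toy_slope = loglog_slope(toy_freqs)  # close to -1 for this toy distribution
```

Applied to real counts, `types_needed(freqs, 0.5)` would give the kind of figure annotated in Figure 2 (the number of top-ranked types that account for half of all realizations). A fitted slope is only a rough diagnostic of a power-law tail and does not amount to a formal test.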


[Figure 2. Plot title: Realisations of liaison in the PFC corpus, 16,805 tokens and 3,105 types. x-axis: rank (10^0 to 10^3); y-axis: occurrences (10^0 to 10^4). Annotations: 13 types account for 30% of total realizations; 50 types account for 50% of total realizations; 3,055 types account for the remaining 50%.]

Figure 2. Log-log plot of liaison environments (or ‘types’) in the PFC corpus (rank order
by number of occurrences).
