Nature - USA (2020-06-25)

Article

Extended Data Fig. 1 | Canary song annotation and sequence statistics.
a, Architecture of the machine learning algorithm for syllable segmentation and
annotation. (i) A spectrogram is fed into the algorithm as a 2D matrix in
segments of 1 s. (ii) Convolutional and max-pooling layers learn local spectral
and temporal filters. (iii) A bidirectional recurrent LSTM layer learns temporal
sequencing features. (iv) Projection onto syllable classes assigns a probability
for each 2.7-ms time bin and syllable. b, After manual proofreading
(see Methods), a support vector machine classifier was used to assess the
pairwise confusion between all syllable classes of bird 1 (see Methods). The test
set confusion matrix (right) and its histogram (left) show that the error
exceeded 1% only in rare cases and reached at most 6%. As the higher values
occurred only in phrases with tens of syllables, this metric guarantees that most of the
syllables in every phrase cannot be confused as belonging to another syllable
class. Accordingly, the possibility of making a mistake in identifying a phrase
type is negligible. c, Number of phrases per song for the three birds used in this
study. d, Song durations for the three birds. e, Mean syllable durations for 85
syllable classes from three birds. Red arrow marks the duration below which all
trill types have more than ten repetitions on average. f, Relation between
phrase class mean duration (x axis) and standard deviation (y axis). Syllable
classes (dots) of three birds are coloured according to bird number. Dashed line
marks 450 ms (upper limit for the decay time constant of GCaMP6f). g, Range of
mean number of syllables per phrase (y axis) for all syllable types with mean
duration shorter than the x-axis value. Red line is the median, light grey marks
the 25% and 75% quantiles and dark grey marks the 5% and 95% quantiles (blue
line marks the number of syllable types contributing to these statistics). The
red arrow matches the arrow in e. h, Cumulative histogram of trill phrase
durations. i, All complex phrase transitions with second-order or higher
dependence on song history context (for birds 1 and 2). For each phrase type
that precedes a complex transition, the context dependence is visualized by a
PST (see Methods). Transition outcome probabilities are marked by pie charts at
the centre of each node. The song context (phrase sequence) that leads to the
transition is marked by concentric circles, the innermost being the phrase type
that preceded the transition. Nodes are connected to indicate the sequences in
which they are added in the search for longer Markov chains that describe
context dependence (for example, i–iii for first- to third-order Markov chains).
Grey arrows indicate additional incoming links that are omitted for simplicity.
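
The idea in panel i, that a transition outcome can depend on more than the immediately preceding phrase, can be illustrated with a minimal sketch. This is not the PST fitting procedure described in the Methods; it only compares first-order and second-order conditional transition probabilities. The function name `transition_probs` and the toy phrase labels are hypothetical and are not data from the study.

```python
from collections import Counter, defaultdict


def transition_probs(songs, order):
    """Estimate P(next phrase | previous `order` phrases) from phrase sequences."""
    counts = defaultdict(Counter)
    for song in songs:
        for i in range(order, len(song)):
            # The context is the `order` phrases immediately before position i.
            context = tuple(song[i - order:i])
            counts[context][song[i]] += 1
    return {
        context: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for context, c in counts.items()
    }


# Toy phrase sequences (hypothetical labels, one letter per phrase type).
songs = [list("ACD"), list("ACD"), list("BCE"), list("BCE")]

first_order = transition_probs(songs, 1)
second_order = transition_probs(songs, 2)

# Knowing only the preceding phrase "C", the outcome looks random.
print(first_order[("C",)])
# Adding one more phrase of history makes the outcome deterministic.
print(second_order[("A", "C")], second_order[("B", "C")])
```

In this toy case the transition after phrase C is ambiguous under a first-order model but fully determined once the phrase before C is known, which is the kind of dependence a longer Markov chain is meant to capture.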
