A new workflow for semi-automatized annotations: Tests with long-form naturalistic recordings of children's language environments

Publication, Conference
Casillas, M; Bergelson, E; Warlaumont, AS; Cristia, A; Soderstrom, M; VanDam, M; Sloetjes, H
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech
January 1, 2017

Interoperable annotation formats are fundamental to the utility, expansion, and sustainability of collective data repositories. In language development research, shared annotation schemes have been critical to facilitating the transition from raw acoustic data to searchable, structured corpora. Current schemes typically require comprehensive manual annotation of utterance boundaries and orthographic speech content, with an additional, optional range of tags of interest. These schemes have been enormously successful for datasets on the scale of dozens of recording hours but are untenable for long-format recording corpora, which routinely contain hundreds to thousands of audio hours. Long-format corpora would benefit greatly from (semi-)automated analyses, both on the earliest steps of annotation (voice activity detection, utterance segmentation, and speaker diarization) and on later steps (e.g., classification-based codes such as child- vs. adult-directed speech, and speech recognition to produce phonetic/orthographic representations). We present an annotation workflow specifically designed for long-format corpora which can be tailored by individual researchers and which interfaces with the current dominant scheme for short-format recordings. The workflow allows semi-automated annotation and analyses at higher linguistic levels. We give one example of how the workflow has been successfully implemented in a large cross-database project.

Keywords: daylong recordings, language acquisition, annotation, speech recognition, speaker diarization

Published In

Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech

DOI

10.21437/Interspeech.2017-1418

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2017

Volume

2017-August

Start / End Page

2098 / 2102
 

Citation

APA

Casillas, M., Bergelson, E., Warlaumont, A. S., Cristia, A., Soderstrom, M., VanDam, M., & Sloetjes, H. (2017). A new workflow for semi-automatized annotations: Tests with long-form naturalistic recordings of children's language environments. In Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech (Vol. 2017-August, pp. 2098–2102). https://doi.org/10.21437/Interspeech.2017-1418

Chicago

Casillas, M., E. Bergelson, A. S. Warlaumont, A. Cristia, M. Soderstrom, M. VanDam, and H. Sloetjes. “A new workflow for semi-automatized annotations: Tests with long-form naturalistic recordings of children's language environments.” In Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech, 2017-August:2098–2102, 2017. https://doi.org/10.21437/Interspeech.2017-1418.

ICMJE

Casillas M, Bergelson E, Warlaumont AS, Cristia A, Soderstrom M, VanDam M, et al. A new workflow for semi-automatized annotations: Tests with long-form naturalistic recordings of children's language environments. In: Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech. 2017. p. 2098–102.

MLA

Casillas, M., et al. “A new workflow for semi-automatized annotations: Tests with long-form naturalistic recordings of children's language environments.” Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech, vol. 2017-August, 2017, pp. 2098–102. Scopus, doi:10.21437/Interspeech.2017-1418.

NLM

Casillas M, Bergelson E, Warlaumont AS, Cristia A, Soderstrom M, VanDam M, Sloetjes H. A new workflow for semi-automatized annotations: Tests with long-form naturalistic recordings of children's language environments. Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech. 2017. p. 2098–2102.
