Scholar Data

Datasets

UCLA High-Speed Laryngeal Video and Audio

Introduction

UCLA High-Speed Laryngeal Video and Audio was developed by the UCLA Speech Processing and Auditory Perception Laboratory and comprises high-speed laryngeal video recordings of the vocal folds and synchronized audio recordings from nine subjects, collected between April 2012 and April 2013. Speakers were asked to sustain the vowel /i/ for approximately ten seconds while holding voice quality, fundamental frequency, and loudness as steady as possible.

In speech production theory, data such as those in this release may be used to study the relationship between vocal fold vibration and the resulting voice quality.

Data

None of the subjects had a history of a voice disorder. There was no native language requirement for recruiting subjects; participants were native speakers of various languages, including English, Mandarin Chinese, Taiwanese Mandarin, Cantonese and German.

Audio data is presented as 16 kHz, 16-bit FLAC, and video is in AVI format at 5 fps (frames per second).

Samples

Please view this video sample and this audio sample.

Updates

None at this time.

Authors

Chen, Gang
Neubauer, Juergen
Garellek, Marc
Samlan, Robin
Gerratt, Bruce R.
Kreiman, Jody
Alwan, Abeer

0 Citations, 0 Mentions, 35% FAIR, 0.9 Dataset Index

10.35111/4a2f-0j88 (2017)

The Subglottal Resonances Database

Introduction

The Subglottal Resonances Database was developed by Washington University and the University of California, Los Angeles, and consists of 45 hours of simultaneous microphone and subglottal accelerometer recordings of 25 adult male and 25 adult female speakers of American English between 22 and 25 years of age.

The subglottal system is composed of the airways of the tracheobronchial tree and the surrounding tissues. It powers airflow through the larynx and vocal tract, allowing for the generation of most of the sound sources used in languages around the world. The subglottal resonances (SGRs) are the natural frequencies of the subglottal system. During speech, the subglottal system is acoustically coupled to the vocal tract via the larynx. SGRs can be measured from recordings of the vibration of the skin of the neck during phonation by an accelerometer, much like speech formants are measured through microphone recordings.

SGRs have received attention in studies of speech production, perception and technology. They affect voice production, divide vowels and consonants into discrete categories, affect vowel perception and can be useful in automatic speech recognition.

Data

Speakers were recruited by Washington University's Psychology Department. The majority of the participants were Washington University students who represented a wide range of American English dialects, although most were speakers of the mid-American English dialect.

The corpus consists of 35 monosyllables in a phonetically neutral carrier phrase (“I said a ____ again”), with 10 repetitions of each word by each speaker, resulting in 17,500 individual microphone (and accelerometer) waveforms. The monosyllables comprised 14 hVd words and 21 CVb words, where C was b, d, or g and V included all American English (AE) monophthongs and diphthongs.
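The waveform count follows directly from the figures quoted in this description. As a minimal sketch (the variable names are illustrative and not part of the corpus documentation), the arithmetic can be checked like this:

```python
# Counts quoted in the corpus description; variable names are illustrative only.
male_speakers = 25
female_speakers = 25
hvd_words = 14            # hVd monosyllables
cvb_words = 21            # CVb monosyllables
repetitions_per_word = 10

speakers = male_speakers + female_speakers
words = hvd_words + cvb_words
utterances_per_speaker = words * repetitions_per_word
# Each utterance yields one microphone waveform and one accelerometer waveform,
# so this is the count per recording channel.
waveforms_per_channel = speakers * utterances_per_speaker

print(f"{words} words x {repetitions_per_word} repetitions x "
      f"{speakers} speakers = {waveforms_per_channel} waveforms per channel")
```

The result matches the 17,500 microphone waveforms stated in the description, with the same number of accelerometer waveforms recorded simultaneously.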

The target vowel in each utterance was hand-labeled to indicate the start, stop, and steady-state parts of the vowel. For diphthongs, the steady state refers to the diphthong nucleus, which occurs early in the vowel.

The height and age of each speaker are included in the corpus metadata.

Audio files are presented as single-channel, 16-bit, FLAC-compressed WAV files with sample rates of 48 kHz or 16 kHz. Image files are bitmap image files and plain text is UTF-8.

Samples

Please view the following samples:

Image Sample

Audio Sample

Text Sample

Acknowledgment

This work was supported in part by National Science Foundation Grant No. 0905250.

Updates

None at this time.

Authors

Alwan, Abeer
Lulich, Steven M.
Sommers, Mitchell S.

0 Citations, 0 Mentions, 35% FAIR, 0.9 Dataset Index

10.35111/5wf0-c349 (2015)

Automated Author Profile
Alwan, Abeer

