Speaker sample and metadata for BNClab-M subcorpus (a modified version of the BNClab subcorpus)

Smith, Nicholas; Broccias, Cristiano; Waters, Cathleen

Description

This file lists speakers and their characteristics in what we call the BNClab-M subcorpus. This subcorpus is a modified version of the BNClab subcorpus, created at Lancaster University (Brezina et al. 2018) as a sociolinguistically balanced and comparable subset from two demographically sampled conversation corpora: the British National Corpus of 1994 (Burnard 2007) and the British National Corpus of 2014 (Love et al. 2017). The speaker sample in BNClab-M largely follows that in BNClab, but has been modified in an attempt to improve cross-time comparability. Modifications include i) restriction to speakers from England only, and to speakers designated as either working class or middle class; ii) reassignment of the social class of some speakers; and iii) addition of a few speakers from the parent BNCs that were not included in BNClab.

Speaker classifications are by year (1994 or 2014), gender (female or male), age (18-44 or 45-and-over), region (five regions of England), and social class (working class or middle class).

We gratefully acknowledge permission to use content from the Demographic Spoken component of BNC1994, owned by the British National Corpus Consortium, and content from the Spoken BNC2014, owned by Cambridge University Press, in accordance with their respective licences (http://www.natcorp.ox.ac.uk/docs/licence.html and http://corpora.lancs.ac.uk/bnc2014/licence.php).

A research article (Smith et al. forthcoming) describing the design of the BNClab-M sample, and the rationale for modifications to the original BNClab sample, is to be published in the journal Research in Corpus Linguistics in December 2024.

References:
Brezina, Vaclav, Dana Gablasova and Susan Reichelt. 2018. BNClab. http://corpora.lancs.ac.uk/bnclab. Lancaster University. (10 April, 2024.)
Burnard, Lou, ed. 2007. Reference Guide for the British National Corpus (XML Edition). http://www.natcorp.ox.ac.uk/docs/URG/. Oxford University. (10 April, 2023.)
Love, Robbie, Claire Dembry, Andrew Hardie, Vaclav Brezina and Tony McEnery. 2017. The Spoken BNC2014: Designing and building a spoken corpus of everyday conversations. International Journal of Corpus Linguistics 22(3): 319–344.
Smith, Nicholas, Cristiano Broccias and Cathleen Waters. 2024. Addressing comparability and retrieval issues in conversation corpora: A case study on the spoken British National Corpora (1994 and 2014), using the past perfect. Research in Corpus Linguistics 12(2): 80–110.

Metrics

Dataset Index: 2.2
FAIR Score: 85%
Citations: 1
Mentions: 0

Publication Details

DOI

Publisher: University of Leicester

Assigned Domain

Subfield: Condensed Matter Physics
Field: Physics and Astronomy
Domain: Physical Sciences
Confidence Score: 98%
Source: OpenAlex

Keywords

Language in Culture and Society (Sociolinguistics); Applied Linguistics and Educational Linguistics; English Language; Language in Time and Space (incl. Historical Linguistics, Dialectology)

Normalization Factors

FT: 15.38
CTw: 1.00
MTw: 1.00