Electronic Multilingual Terminological Dictionary


Linguistics

Corpus Linguistics

A study of language that covers all processes involved in the processing, use, and analysis of written or spoken machine-readable corpora. Corpus linguistics is a relatively modern term for a methodology based on examples of ‘real-life’ language use. At present, the effectiveness and usefulness of corpus linguistics are closely tied to the development of computer science.

Corpus linguistics is a quantitative paradigm grounded in the empirical tradition of language analysis. It uses large quantities of observational data compiled into data sets, called corpora, to provide evidence about language use by both first language (L1) and second language (L2) speakers (for an overview, see Barlow, 2005).
Corpus linguistics investigates spontaneous spoken and written language use.
First, corpus linguistics provides access to large databases of language use that can reflect different forms of language, such as spoken and written L2. L2 data are typically compiled in what are called “learner corpora.” Second, these databases are easily shared, enriched with annotations, and used by a wide range of researchers for a wide range of purposes. By allowing data to be shared in this way, corpus linguistics enabled a form of open linguistics before the notion was widely popularized. Sharing reduces the effort of data collection, promotes the replication and reproduction of results, and provides an important resource for hypothesis testing.
Third, although we can observe linguistic phenomena such as the frequent co-occurrence of certain words in individual texts or speakers, only with the scale of the analysis afforded by corpus linguistics can we be confident about important recurrent patterns across many speakers and contexts. Corpus linguistics typically takes into consideration hundreds or thousands of different texts or speakers.
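The point about scale can be illustrated with a minimal sketch. It assumes a toy corpus of four invented sentences and two hypothetical helper functions (adjacent_pairs, pair_dispersion) that are not part of the source. It counts in how many texts each adjacent word pair occurs, a simple stand-in for the co-occurrence measures used in practice.

```python
from collections import Counter
import re


def adjacent_pairs(text):
    """Return the set of adjacent word pairs found in one text."""
    words = re.findall(r"[a-z']+", text.lower())
    return set(zip(words, words[1:]))


def pair_dispersion(corpus):
    """Count, for each word pair, how many texts in the corpus contain it."""
    counts = Counter()
    for text in corpus:
        # A set is used per text, so a pair repeated within one text counts once.
        counts.update(adjacent_pairs(text))
    return counts


# Invented example texts (an assumption for illustration, not real corpus data).
toy_corpus = [
    "I would like to make a decision soon.",
    "They had to make a decision quickly.",
    "She will make a cake and make a decision later.",
    "We took a walk.",
]

dispersion = pair_dispersion(toy_corpus)
for pair in [("a", "decision"), ("a", "cake"), ("a", "walk")]:
    print(pair, "occurs in", dispersion[pair], "of", len(toy_corpus), "texts")
```

A pair seen in a single text may be an accident of that text or speaker. A pair attested across many texts is better evidence of a recurrent pattern, which is why corpus studies usually work with hundreds or thousands of texts rather than four.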

Sources:

McEnery, T., Brezina, V., Gablasova, D., & Banerjee, J. (2019). Corpus Linguistics, Learner Corpora, and SLA: Employing Technology to Analyze Language Use. Annual Review of Applied Linguistics, 39, 74–92. doi:10.1017/S0267190519000096

Part of speech: noun
Countable/uncountable: countable
Type: abstract
Gender: neutral
Case: nominative