There has been a mini revolution in the way we speak and write British English and, according to new research on data from the last 20 years, it’s official: we have become much more informal.

A team of researchers, led by Lancaster University linguists, have used innovative computational methods to examine huge collections of words called corpora.

The study finds that words such as ‘whom’ and ‘upon’ are out, while ‘amazing’ and ‘stuff’ are well and truly in.

For example, formal research reports today include almost twice as many informal expressions, such as ‘it’s’ instead of ‘it is’, as research reports did 20 years ago.

There has also been a dramatic decrease in the use of a range of features such as the pronoun ‘whom’ (52% decrease) and the modal verbs ‘shall’ (60% decrease), ‘must’ (40% decrease) and ‘may’ (41% decrease).

And there has been a dramatic increase in the use of formerly frowned-upon linguistic features such as the split infinitive.

British English speakers are three times more likely to use a phrase such as ‘to gradually bring’ or ‘to effectively tell’ both in speech and writing than they were 20 years ago.

Social media and the Internet more generally have brought new words and expressions related to technology (vlog, fitbit, bitcoin), social interaction (omg, tbh, defo) and progressive spelling (gunna, couldnt, tmoz).

The research team, led by Dr Vaclav Brezina, have recently completed a major project, started in 2014.

They have compiled and analysed a new dataset, known as the written British National Corpus 2014, which covers the period from 2007 to 2020 with 2014 providing the mid-point.

The research compares this with a previous dataset, the BNC1994, to which Lancaster also contributed and which covers the early 1990s.

The study, funded by the Economic and Social Research Council, offers an insight into how British English has developed from the early 1990s to the present.

The new written corpus, or bank of words, adds an impressive 90 million words of data to a previously published spoken dataset (10 million words).

The data now offers a complete picture of present-day British English across different genres, including newspapers, magazines, TV shows, social media, blog posts, online reviews, fiction, political speeches, academic writing and informal speech.

The 100-million-word British National Corpus 2014 is a large collection of ‘real life’ language, which can be used by researchers to understand more about how language works and how it is evolving.

Educators, textbook writers, dictionary compilers and the interested public will also be able to access the corpus to find usage examples of modern British English across different genres.

The Lancaster team has also developed specialised software (#LancsBox X) which allows efficient exploration of the dataset.

“Over the last twenty years, we have experienced dramatic changes in technology, which completely transformed the way we communicate,” says Dr Brezina.

“Written language has become much more dynamic and shared by many more people than ever before.

“We text or message friends and colleagues and get an immediate response but we might be hard-pressed to remember when we last wrote a letter to someone.

“Many more people also produce content for the general audience via social media and various websites – one doesn’t need to be a journalist or a novelist like in the old days to reach thousands or millions of people.”

Computational analysis of the frequencies of words, phrases and grammatical structures provides an insight into the development of British English over time.

This requires a large, balanced sample of language, including data from the authors, publishers and members of the general public who contributed to this research.
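To give a rough idea of how a frequency comparison between two corpora might work, here is a minimal Python sketch. It is not the research team's method or software. It assumes the common approach of converting raw counts to occurrences per million words, so that corpora of different sizes can be compared, and then calculating the percentage change. The function names, counts and corpus sizes are all invented for illustration and are not taken from either British National Corpus.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Relative frequency: occurrences per million running words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1_000_000 / corpus_size


def percent_change(old_rate: float, new_rate: float) -> float:
    """Percentage change from old_rate to new_rate (negative means a decrease)."""
    if old_rate == 0:
        raise ValueError("old_rate must be non-zero")
    return (new_rate - old_rate) * 100 / old_rate


# Hypothetical figures for a single word in an older and a newer corpus.
# These numbers are invented; they are not real corpus data.
old_count, old_size = 1_000, 10_000_000
new_count, new_size = 1_500, 20_000_000

old_rate = per_million(old_count, old_size)
new_rate = per_million(new_count, new_size)
change = percent_change(old_rate, new_rate)

print(f"older corpus: {old_rate:.1f} per million words")
print(f"newer corpus: {new_rate:.1f} per million words")
print(f"change: {change:+.1f}%")
```

In this invented example the raw count goes up, but the newer sample is twice as large, so the word's relative frequency actually falls by a quarter. That is why comparisons of this kind are normally made on normalised rather than raw counts.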
