Corpus linguistics is a gift for commercial copywriters wanting to incorporate everyday language in advertising.
By John Patkin
How would you write a commercial targeting Asian consumers of rice?
The first thought might be based on what is called the availability heuristic, which leans on recent discussions with friends or what you have consumed through the media. You may consider doing some research by asking a few potential clients, but time is short when writing radio copy, and advertisers negotiating spot prices might struggle to find a research budget.
There is an alternative, thanks to emerging work in corpus linguistics, which has been documenting conversations among Asian speakers of English, including their opinions on rice types.
The 1-million-word Asian Corpus of English (ACE) project, partly funded by the Australian Research Grants Council, is a recent record of hundreds of English-language conversations in the ASEAN region.
The conversations in the corpus were recorded in Vietnam, China, Hong Kong, Taiwan, the Philippines, Malaysia, Singapore, Brunei, Thailand and Japan. The detailed transcriptions include a textual representation of every utterance including breathing, stuttering, coughing, pauses, and parallel conversations in a naturally occurring environment. Demographic data such as ethnicity, gender, first language, education level, and age are included.
The data gives writers a rich insight into real-life conversations and offers an alternative to scripted or stereotypical representations. For educators, possible uses include rewriting English language textbooks to represent local content instead of John, Mary and the Union flag.
Conversations from the Asian Corpus of English allow us to eavesdrop on how everyday people communicate. This cinéma vérité-style data features hundreds of conversations between people of many backgrounds discussing issues that are complex, simple and mundane. For example, a group of South East Asian teachers gathered in Singapore for a conference argue about the quality of rice while one of the women flirts with the only man in the group. In another recording, a group of ear, nose and throat surgeons discuss the best way to do skin grafts. Engineers from a Hong Kong electronics company hold a teleconference with colleagues in China and Taiwan about product development. Advertisers wanting to reference such conversations can simply search the corpus for topics and key words.
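For readers who want to try this kind of keyword search on their own transcripts, here is a minimal sketch of a "keyword in context" lookup, a common way of browsing a corpus. It assumes the transcripts are available as plain lines of text; the function name keyword_in_context and the sample lines are invented for illustration and are not taken from ACE or its search interface.

```python
import re

def keyword_in_context(lines, keyword, width=30):
    """Return (left, match, right) tuples for each whole-word hit of keyword."""
    # \b keeps "rice" from matching inside words such as "price".
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for line in lines:
        for m in pattern.finditer(line):
            left = line[max(0, m.start() - width):m.start()]
            right = line[m.end():m.end() + width]
            hits.append((left, m.group(0), right))
    return hits

# Invented sample utterances (an assumption for this sketch, not ACE data).
sample = [
    "S1: i think thai rice is softer than the rice we get at home",
    "S2: yeah but price is higher",
    "S3: Rice from vietnam is good also",
]

# Print each hit with its surrounding words lined up around the keyword.
for left, word, right in keyword_in_context(sample, "rice"):
    print(f"{left:>30} [{word}] {right}")
```

Lining up the hits this way lets a copywriter scan how speakers actually phrase things around a product word before drafting a script.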
Most of the data is limited to text for ethical reasons, but some participants agreed to let researchers publish audio snippets. The project’s principal investigator, Professor Andy Kirkpatrick from Griffith University, has used some sound bites to argue that variation in pronunciation is idiosyncratic. Another researcher, Dr Sophiaan Sobhan, highlighted the contrasting and evolving forms of English, concluding that ASEAN English is the “same and different”.
The elder statesman of World Englishes, Professor Henry Widdowson, has argued we need to change the way we teach and assess English for so-called native speakers. But what does this mean for radio in Australia? Simply put, findings from the Asian Corpus of English can help us connect with Australians of South-East Asian heritage.
Corpora and associated linguistic tools can provide speech-based media with a source of data that helps us to better understand our audiences and communicate with them. Speech-based content, such as talk shows, phone-ins and commercials, can be better informed about acceptable and common language. The data can also be used to support an over-arching policy for a station. These recordings of naturally occurring conversations, in English and other languages, provide us with the language and characters of everyday talk. They are the reality that is often imitated, sometimes well and sometimes badly. By accessing a corpus, we understand that conversations are not about perfect pronunciation: speakers make mistakes and messages are frequently repeated.
The corpus shows us that we are generally good-natured and pleasant, but we are not comedians, and despite requests to ‘turn up the volume’ to increase excitement, messages are delivered without shouting. We are in the boring middle ground, but if the topic is interesting, the participants will respond. Confirmation that speech is dotted with grammatical variation might be hard for some purists to accept, but the findings show us that so long as language is accurate for the intended purpose, it’ll do.
Shift in language was one of the themes of linguist Edward Sapir’s seminal work Language: An Introduction to the Study of Speech… and that was published around 100 years ago, when people were arguing over the use of who instead of whom.
Apart from technical words that gain admission to dictionaries, words with a specific connection to culture and identity also win approval. The word “Hongkonger”, long used in conversation but seldom in print, was recently added to the Oxford dictionary. It proves that while you might win a Scrabble argument with a dictionary, you might need more than a book to communicate with everyday folks.
List of publicly accessible corpora
Asian Corpus of English http://corpus.ied.edu.hk/ace/
Australian Corpus of English https://www.ausnc.org.au/
John Patkin is the Chief Transcriber of the Asian Corpus of English and has worked on the project since 2010.