CaSMa/CASS workshop: Corpus approaches to Social Media Analysis


On December 12th, the CaSMa/CASS workshop on Corpus approaches to Social Media Analysis took place at the University of Nottingham. The excellent workshop was delivered by Prof. Tony McEnery, Prof. Paul Baker, Dr. Claire Hardaker and Mark McGlashan from the ESRC Centre for Corpus Approaches to Social Science (CASS) at Lancaster University.

The workshop looked at the use of corpus linguistics to explore social media, specifically Twitter. It comprised three talks covering recent research projects at CASS, which looked at Twitter reactions to the Channel 4 programme Benefits Street (Paul Baker), the murder of Lee Rigby (Tony McEnery) and misogynistic abuse (Claire Hardaker and Mark McGlashan). These were followed by a hands-on session in basic corpus linguistics using the AntConc 3.4.3 software package and a small database of tweets.
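For readers unfamiliar with what "basic corpus linguistics" involves, the two most common starting points are a word frequency list and a keyword-in-context (KWIC) concordance. The sketch below is only an illustration in Python, not the AntConc workflow used in the session. The tweets are invented for the example, and the function names (`tokenize`, `word_frequencies`, `concordance`) are hypothetical helpers rather than part of any tool mentioned here.

```python
import re
from collections import Counter

# Hypothetical example tweets, invented for illustration (not workshop data).
TWEETS = [
    "Watching the debate tonight and the comments are awful",
    "The debate was fine but the comments section was not",
    "Great to see support for the campaign today",
]


def tokenize(text):
    """Lowercase a tweet and split it into word tokens, keeping # and @ prefixes."""
    return re.findall(r"[#@]?\w+", text.lower())


def word_frequencies(tweets):
    """Count how often each token occurs across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tokenize(tweet))
    return counts


def concordance(tweets, keyword, width=3):
    """Return (left context, keyword, right context) for every hit of keyword."""
    keyword = keyword.lower()
    lines = []
    for tweet in tweets:
        tokens = tokenize(tweet)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append((left, token, right))
    return lines


if __name__ == "__main__":
    # Show the most frequent tokens, then each use of "debate" in context.
    for word, count in word_frequencies(TWEETS).most_common(5):
        print(f"{count:3d}  {word}")
    for left, hit, right in concordance(TWEETS, "debate"):
        print(f"{left:>25} [{hit}] {right}")
```

A frequency list shows which words dominate a corpus, while the concordance lines show how a chosen word is actually used, which is typically where the qualitative reading begins.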

In brief, corpus linguistics is used by CASS to “provide an insight into the use and manipulation of language in society in a host of areas of pressing concern, including climate change, hate crime and education. By providing fresh perspectives in such problems, [CASS] are helping to develop new approaches to challenging practices such as hate speech both in terms of raising awareness and of informing policy makers and other stakeholders of how such language may be used to wound and offend.”

Throughout the day, there were interesting discussions about the ethical considerations around the use of Twitter data. For example, does the ‘broadcast’ nature of Twitter automatically mean that Twitter data can be analysed without asking for consent from the people whose posts are being studied?

One of the important points raised during the discussions was that, when dealing with social media data sets where the communication has a public broadcast nature, such as Twitter, the intrinsic ‘public’ nature of the data alone is not sufficient to justify using the data in research. Importantly, one must still look to the other principles of ethical conduct, scientific value, social responsibility and maximising benefit while minimising harm.

In the case of the study on misogynistic abuse on Twitter, for instance, the considerations of social responsibility and maximising benefit while minimising harm prompted the researchers to seek explicit consent from the target of the abuse but not from the perpetrators.

Workshop programme:


Coffee in Cloisters


Welcome and brief introduction to CaSMa


Introduction: Paul Baker and Tony McEnery: Corpus Linguistics and some issues when working with Twitter corpora

11:00 – 12:00

Claire Hardaker & Mark McGlashan: Misogynistic tweets and social networks


Tea and coffee


Paul Baker: Identifying discourse communities in tweets about Benefits Street




Tony McEnery: Comparing Press and Twitter reports on the Lee Rigby Attack

Coffee and cake to be available from 14:30


Paul Baker: Workshop – analysing the “heforshe” Twitter corpus
Please bring laptops for this part of the session (not Macs, unless AntConc 3.4.3 is installed)


Discussion chaired by Svenja Adolphs (and networking)

