
Supercomputing drives deeper insight into linguistics and social media

DeiC Interactive HPC has become an integral part of both research and teaching for Associate Professor in Linguistics Rebekah Baglini at Aarhus University’s Interacting Minds Centre.
11/12/2023 11:12
Image: Rebekah Baglini
Photo: interactivehpc.dk

Supercomputing has long been associated with areas such as physics, engineering, and data science. However, researchers in the humanities at Aarhus University are increasingly turning to supercomputing, which allows them to explore new territory and discover new insights. From analysing historical archives and simulating ancient civilizations to studying social media data, supercomputing offers unique opportunities to advance knowledge in the humanities.

While many studies are based on historical data, the research of Rebekah Baglini, Associate Professor in Linguistics at the Interacting Minds Centre, Aarhus University, is an excellent example of supercomputing applied to recent data in the humanities.

She employs supercomputing in her current projects involving the collection, processing, and annotation of large-scale media data from traditional and social media sources. By examining this diverse range of data, Rebekah Baglini investigates causal inference and causal reasoning from a linguistic perspective. Her research involves the application of semantic model theory and computational methods to uncover insights in linguistics.

“I aim to develop computationally assisted methods to identify trends in the discursive and informational landscape around topics concerning media dynamics, public health and science communication, crisis and risk messaging, as well as the emergence of mis- and dis-information”, says Rebekah Baglini, Associate Professor in Linguistics, Aarhus University.

In addition to her linguistic investigations, Rebekah Baglini also strives to enhance the existing computational language models for multilingual natural language processing (NLP), with a particular focus on under-resourced languages.   

Humanities researchers should know the affordances of High-Performance Computing  

Rebekah’s pursuits demonstrate the continuous progress of digital humanities and the ongoing efforts to enhance existing language models, ultimately leading to a deeper understanding in the field of humanities.   

“My earlier work involved smaller language corpora and didn’t require HPC resources. However, as my projects grew in scale, involving large corpus creation, the relevance of supercomputing increased. I recognise that not all projects require HPC. However, it is useful for researchers to gain training in the affordances of HPC, parallel compute, and large models so they know what’s possible, and can potentially take on projects of larger scale or make use of state-of-the-art resources for data processing, modelling, and simulation”, says Rebekah Baglini, Associate Professor in Linguistics, Aarhus University.

NLP and computational linguistics have become an integral part of Rebekah Baglini’s teaching

This explains why NLP and Computational Linguistics have become integral to Rebekah Baglini’s teaching, enabling her to offer students practical exposure to working with extensive datasets and large language models, fostering hands-on learning opportunities. She emphasises that there is a significant learning curve when delving into the realm of supercomputing. 

“There has definitely been a learning curve involved in the transition from locally maintained clusters to the cloud based Interactive HPC platform, particularly because it is also a somewhat new service without comprehensive documentation, and my affiliation with Center for Humanities Computing at Aarhus University has been a valuable resource as there is a great deal of collective experience and knowledge to draw on in the community”, says Rebekah Baglini, Associate Professor in Linguistics, Aarhus University.

Rebekah has used the DeiC Interactive HPC system to store and analyse news and social media data in the national research project HOPE, which monitored Scandinavian user behaviour during COVID-19.

Today she uses the system in her own AUFF Starting Grant project, CROSS: Causal Reasoning and Online Science Scepticism, to train language models to identify and analyse emerging narratives that undermine or counteract verified messaging on scientific findings and public health recommendations.

For more inspiring humanities use cases, see Katrine Frøkjær Baunvig on "Creating a Grundtvig-artificial intelligence using DeiC Interactive HPC", or Rolf Lyneborg Lund on "HPC enlightens researchers in social sciences and humanities about human behavior".

 

Starting with DeiC Interactive HPC?

Learn more about DeiC Interactive HPC here.

Read more about access to DeiC Interactive HPC here and explore activities and services here.

Find your university's Front Office, which can assist you in getting started with DeiC Interactive HPC.

You can also contact DeiC's HPC team through the HPC Manager at eske.christiansen@deic.dk.

This story was originally written and published by the Center for Humanities Computing, AU, for DeiC Interactive HPC. Read the story here.