Large Language Models workshop

Wednesday, 25th June
Carlos III University – Madrid, Spain

Livia Schubiger (D-GESS, ETH Zurich) & Raymond Duch (University of Oxford) organized an EPSA Madrid 2025 pre-conference that assembled researchers working with Large Language Models (LLMs). The focus was on the application of LLMs to research being conducted in the social sciences. The broad themes that we expected to cover included:

Data generation: this includes the role of AI agents as subjects in experiments, as enumerators, or as confederates; adaptive experimental design.
Data collection and curation: examples might include the compilation of historical and contemporary records, videos, and text.
Data analysis: text analysis and classification; the analysis of treatment effects associated with text and image vignettes; the role of LLMs in the analysis of conversations and open-ended survey responses.
Design: LLM guidance in the design of experimental protocols and treatment arms.
Technical advances: these include the incorporation of non-text data such as images, video and audio.
Safety, ethics and replication: the ethical challenges of incorporating LLMs into social science research; efforts to address safety in LLM development and deployment.
LLMs, the economy and governance: understanding how the rapid introduction and adoption of LLMs will affect the economy and governance.

We are at an early stage of incorporating LLM applications into our social science research. This European event was therefore a great opportunity to showcase the innovative research being conducted by scholars worldwide. The workshop format gave presenters a chance to present their work and receive constructive, thoughtful feedback.

	Timing	Room		Presentation Schedule
Registration	08:30	Foyer just outside the Auditorium
Welcome and Introduction	09:00 – 09:10	Auditorium	Livia & Ray
Panel Session 1	09:10 – 11:00	Auditorium	Discussant – Frederik Hjorth, University of Copenhagen	Arianna Muti, Bocconi University (Milan) “Where are the leftist feminists now?” An LLM approach to how political events influence social media discourse on feminism
				Christy Coulson, London School of Economics and Political Science (LSE) “Gender-based Activism and Violent Backlash: Evidence from Latin America”
				Matilde Ceron, University of Salzburg “Leveraging Crowdcoding and LLMs for the analysis of gender mainstreaming: gender-sensitive recovery policies in the EU”
				Nils-Christian Bormann and Edoardo Alberto Viganò, Witten/Herdecke University “Historical Conflict Event Data Collection via Large Language Models (LLMs)”
				Ashrakat Elshehawy, Stanford University “How Biased Police Reporting Shapes Misperceptions of Out-Group Crime”
Break	11:00 – 11:20	Foyer just outside the Auditorium
Panel Session 2	11:20 – 13:00	Auditorium	Discussant – Elizabeth Rhodes, OpenResearch	Gloria Gennaro, KCL “The evolution of political rhetoric on immigration in the UK”
				Sascha Riaz, European University Institute “Regime Loyalty during Wartime – Evidence from Nazi Germany”
				Tore Wig, University of Oslo “Prompting for Theoretical Progress: Using Generative AI to evaluate the evidence support for Grand Theories of Politics”
				Xiao Liu, Peking University “Automated Annotation of Evolving Corpora for Augmenting Longitudinal Network Data: A Framework Integrating Large Language Models and Expert Knowledge”
				Christopher Klamm, University of Cologne “Measuring Personal Attacks in Parliamentary Debates”
Workshop Lunch	13:00 – 13:30	Foyer just outside the Auditorium
Panel Session 3	13:30 – 15:10	Auditorium	Discussant – Xun PANG, Peking University	Javier Osorio, University of Arizona “ConfliBERT: A Language Model for Political Conflict”
				Jongwoo Jeong, Georgia State University “Spelling correction with large language models to reduce measurement error in open-ended survey responses”
				Winnie Xia, Aarhus University, Yen-Chieh Liao & Slava Jankin, University of Birmingham “Dynamic and Multilingual Embedding Regression for Political Text”
				David Muchlinski, Georgia Tech “SCOPE: Supercharging Conflict Prediction and Explanability using Advanced AI”
				Giovanni Pagano and Luigi Curini, University of Milan “More than words? Understanding Multimodal Political Communication: a Computational Analysis of Textual and Visual Elements in the EP 2024 Campaign”
Break	15:10 – 15:20	Foyer just outside the Auditorium
Panel Session 4	15:20 – 17:00	Auditorium	Discussant – Moritz Marbach, UCL	Giuliano Formisano, University of Oxford “The Network Dynamics of Online Polarisation in the US”
				Camilo Cristancho, Universitat de Barcelona “Attitudes towards protest: A large-scale comparative perspective of media representations of protest in Latin America 2000-2024”
				Rachel Bernhard, Nuffield College, University of Oxford “Kiss, Marry, Kill: Appearance-Based Discrimination in Politics”
				Bryce J Dietrich, Purdue University “Using AI to Summarize US Presidential Campaign TV Advertisement Videos, 1952–2012”
				Taehee Kim, University of Konstanz “Evaluating Generative Agents in Social Media: Replicating User Attitudes and Behaviors”
Panel Session 5	17:10 – 18:40	Auditorium	Discussant – Isaac Mehlhaff, Texas A&M University	Bastián González-Bustamante, Leiden University “Charting Reproducibility and Performance: LLMs in Multilingual Toxic Speech Detection”
				Fabrizio Gilardi, University of Zurich “Understanding and Mitigating Online Toxicity: A Chatroom-Based Experimental Framework with LLM Agents”
				Linette Lim, University College Dublin, Yen-Chieh Liao & Slava Jankin, University of Birmingham “Who Believes and Who Shares Fake News: A Multi-Agent System Application for Experimental Misinformation Research with LLMs”
				Nicolai Berk, ETH Zurich “What is a Good Conversation? Improving the Operationalization and Measurement of Deliberative Quality by Analyzing Interactions”
				Moritz Osnabrügge, Durham University “Polarized Speech: How Elite Rhetoric and Echo Chambers Fuel Negative Emotive Political Debate”

Papers

Attitudes towards protest: A large-scale comparative perspective of media representations of protest in 21st-century Latin America – Camilo Cristancho
Automated Annotation of Evolving Corpora for Augmenting Longitudinal Network Data: A Framework – Xiao Liu, Zirui Wu, Jiayi Li, Zhicheng Shao, Xun Pang, and Yansong Feng
Charting Reproducibility and Performance: LLMs in Multilingual Toxic Speech Detection – Bastián González-Bustamante
Debate Quality is an Emergent Property – Nicolai Berk, Francisco Tomás-Valiente Jordá, Lena Song, Dominik Stammbach, Laura Bronner, and Elliott Ash
Extractive versus Generative Language Models for Political Conflict Text Classification – Patrick T. Brandt, Sultan Alsarra, Vito J. D’Orazio, Dagmar Heintze, Latifur Khan, Shreyas Meher, Javier Osorio, and Marcus Sianan
Using AI to Summarize US Presidential Campaign TV Advertisement Videos, 1952–2012 – Adam Breuer, Bryce J. Dietrich, Michael H. Crespin, Matthew Butler, J.A. Pyrse, and Kosuke Imai
Who Believes and Who Shares Fake News: A Multi-Agent System Application for Experimental Misinformation Research with LLMs – Linette Lim, Yen-Chieh Liao, and Slava Jankin