28831

AI-powered content analysis: Using generative AI to measure media and communication content

Marko Bachl

Additional information / Prerequisites

Prior knowledge of R, applied data analysis, and interacting with application programming interfaces (APIs) will be helpful but is not required. However, a willingness to learn the necessary skills and an openness to explore the possibilities of code-based computational social science research during the seminar are mandatory.
Some prior exposure to (standardized, quantitative) content analysis will be helpful. However, qualitative methods also have their place in evaluating content analysis methods. If you have little experience with the former but can contribute with the latter, make sure to team up with a student whose skill set complements yours.
Students will use their own computers and the software R and RStudio to follow along with the practical part of the seminar. A browser-based solution will be provided for students who cannot install the software on their own devices.


Comment

Large language models (LLMs; starting with Google’s BERT) and particularly their implementations as generative or conversational AI tools (e.g., OpenAI’s ChatGPT) are increasingly used to measure or classify media and communication content. The idea is simple yet intriguing: Instead of training and employing humans for annotation tasks, researchers describe the concept of interest to a model such as ChatGPT, present the coding unit, and ask for a classification. The first tests of the utility of ChatGPT and similar tools for content analysis were positive to enthusiastic [1–3]. User-friendly tutorials have since made the method accessible to a broad audience of social scientists [4, 5]. Yet (closed-source, commercial) large language models are not entirely understood even by their developers, and their uncritical use has been criticized on ethical grounds [7–9].
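The workflow described above (describe the concept, present the coding unit, ask for a classification) can be sketched in a few lines. This is only an illustrative sketch in Python, not the seminar's material, which uses R. All names (`build_prompt`, `parse_label`, `classify`, `fake_model`) and the example concept and labels are hypothetical, and `fake_model` is a keyword-based stand-in for a real LLM call so that the example runs without any API access.

```python
from typing import Callable, List, Optional

# Hypothetical concept description and label set, chosen only for illustration.
CONCEPT = "Does the text deal with politics (parties, voting, legislation)?"
LABELS = ["political", "not political"]


def build_prompt(concept: str, labels: List[str], unit: str) -> str:
    """Combine the concept description, allowed labels, and coding unit."""
    return (
        f"Concept: {concept}\n"
        f"Allowed labels: {', '.join(labels)}\n"
        f"Text to code: {unit}\n"
        "Answer with exactly one of the allowed labels."
    )


def parse_label(reply: str, labels: List[str]) -> Optional[str]:
    """Map a free-text reply onto an allowed label; None if it matches none."""
    cleaned = reply.strip().lower().strip(".")
    for label in labels:
        if cleaned == label.lower():
            return label
    return None


def classify(unit: str, concept: str, labels: List[str],
             model: Callable[[str], str]) -> Optional[str]:
    """Send one coding unit to the model and return the parsed label."""
    return parse_label(model(build_prompt(concept, labels, unit)), labels)


def fake_model(prompt: str) -> str:
    """Assumption: placeholder for an LLM API call, using a keyword rule."""
    unit = prompt.split("Text to code: ")[1].split("\n")[0]
    return "Political." if "election" in unit.lower() else "not political"


if __name__ == "__main__":
    for text in ["The election is next week.", "I baked bread today."]:
        print(text, "->", classify(text, CONCEPT, LABELS, fake_model))
```

Returning `None` for replies that match no allowed label keeps unusable model output visible instead of silently forcing it into a category, which matters when the labels are later compared with human coding.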

In this seminar, we will engage practically with this cutting-edge research method. We start with a quick refresher on the basics of quantitative content analysis (both human and computational) and an overview of the rapidly developing literature on LLMs’ utility in this field. The main part of the seminar will be dedicated to learning step-by-step how to use and evaluate a generative AI model for applied content analytical research. In the end, students should be able to use the method in their own research.

Suggested reading

[1] Gilardi, F., Alizadeh, M., & Kubli, M. (2023). ChatGPT outperforms crowd workers for text-annotation tasks. Proceedings of the National Academy of Sciences, 120(30), e2305016120. https://doi.org/10/gsqx5m

[2] Heseltine, M., & Clemm von Hohenberg, B. (2024). Large language models as a substitute for human experts in annotating political text. Research & Politics, 11(1). https://doi.org/10/gtkhqr

[3] Rathje, S., Mirea, D.-M., Sucholutsky, I., Marjieh, R., Robertson, C. E., & Van Bavel, J. J. (2024). GPT is an effective tool for multilingual psychological text analysis. Proceedings of the National Academy of Sciences, 121(34), e2308950121. https://doi.org/10/gt7hrw

[4] Törnberg, P. (2024). Best practices for text annotation with large language models. Sociologica, 18(2), Article 2. https://doi.org/10/g9vgm7

[5] Stuhler, O., Ton, C. D., & Ollion, E. (2025). From codebooks to promptbooks: Extracting information from text with generative large language models. Sociological Methods & Research. https://doi.org/10/g9vgnq

[7] Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10/gh677h

[8] Spirling, A. (2023). Why open-source generative AI models are an ethical way forward for science. Nature, 616(7957), 413–413. https://doi.org/10/gsqx6v

[9] Widder, D. G., Whittaker, M., & West, S. M. (2024). Why ‘open’ AI systems are actually closed, and why this matters. Nature, 635(8040), 827–833. https://doi.org/10/g8xdb3