she / her
I'm a fourth-year PhD student currently interested in perspectivism and human label variation in Natural Language Processing, social bias in NLP, low-resource language varieties and domains, data resource creation, and domain-specific information extraction.
I love all areas of NLP, from foundations to application. Please reach out to me if you want to discuss!
Contact me by reordering @ tuwien.ac.at pia.pachinger
An AI Reading of the Library of Babel
Simón López Trujillo*, Pia Pachinger*, Baltazar Pérez*
The size of AustroTox
2024 ACL Findings
AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection
Pia Pachinger, Janis Goldzycher, Anna Maria Planitzer, Wojciech Kusa, Allan Hanbury, Julia Neidhardt
We create the first dataset focused on Austrian German toxic language online and the first German toxicity dataset with annotated spans, which facilitate explainability and error analysis. We find that LLMs not fine-tuned on AustroTox fail to recognize country-specific vulgar language and the targets of toxic statements (as of January 2024).
Here is the data, here is the poster.
2023 EACL C3NLP
Toward Disambiguating the Definitions of Abusive, Offensive, Toxic, and Uncivil Comments
Pia Pachinger, Anna Maria Planitzer, Julia Neidhardt, Allan Hanbury
We find that researchers studying harmful language online treat abusiveness, offensiveness, and toxicity interchangeably as sub-concepts of one another. While social science literature frequently employs incivility with a similar meaning, computer science research underutilizes this term, suggesting both fields would benefit from terminological unification. We compile and analyse the distinct definitions researchers use for these concepts to facilitate future unification efforts across and within disciplines.
2022 TU Vienna
A Recommender System for Scientific Referees Based on Bibliographic Databases and Knowledge Graphs
Pia Pachinger, Georg Gottlob, Joël Ouaknine, Glenn Starkman, Matt Rainey, Emanuel Sallinger
We implement multiple recommender systems for scientific expert search using two-step coauthorship and publication venue overlap. A qualitative evaluation by 15 established computer science researchers showed that the best system recommended a mean of 5 excellent and 2.8 suitable experts per 10 recommendations.
Workshop on Online Abuse and Harms (ACL) 2025
Alignment by Disagreement? Toward Investigating LLMs' Adaptation to Personal and Sociodemographic Variability in the Perception of Toxicity
Pia Pachinger, Anna Maria Planitzer, Allan Hanbury, Julia Neidhardt, Sophie Lecheler
International Communication Association Conference 2025
Like Walking a Tightrope: User-Centric Perspectives on Automated Content Moderation
Anna Maria Planitzer, Sophie Lecheler, Svenja Schäfer, Pia Pachinger (presented by Anna)
COMPTEXT 2025
Incorporating User Perceptions of Online Norm Violations in Toxicity Detection Models
Pia Pachinger, Anna Maria Planitzer, Allan Hanbury, Julia Neidhardt, Rebekah Wegener, Sophie Lecheler
2024 NAACL, Student Research Workshop
Best PhD proposal
User-Centric Offensive Text Detection in Culture-Specific Contexts: A PhD Proposal
Pia Pachinger
2022 Inria, Paris
Design + interaction + AI hackathon
Second Prize for That's life.
Artifact publicly exhibited at Le Bis in Paris
Anaïs Cambou*, Anthonin Gourichon*, Fengyu Li*, Xiaoning Meng*, Pia Pachinger* (* equal contribution)
PhD in Informatics, TU Vienna, 2022 – 2026
Natural Language Processing, User-Centric Offensive Text Detection in Culture-Specific Contexts
Master in Data Science, TU Vienna
Machine Learning and Statistics, Natural Language Processing and Visual Analytics
GPA 3.7 / 4.0
Languages
German (native)
English (C1)
Spanish (C1)
Italian (very bad still :) )
Python
Bachelor in Mathematics
University of Vienna
2019 - 2020 Centre for Cyber Security
Austrian Institute of Technology
Freelance researcher
Pre-training and evaluation of CNNs and LSTMs for anomaly detection in time series of system log data
2022 - 2026 Data Science Group, TU Vienna
Prae-Doc researcher
TACo: User-centric content moderation
BrAIn: Domain-specific information extraction
VHH: Evaluation of machine translation models
2021 - 2022 Databases and Artificial Intelligence Group, TU Vienna
Student researcher
Implementation of a recommender system for scientific referees
2018 - 2020 Faculty of Mathematics, University of Vienna
Teaching Assistant
2025 Supervision of Gerald Weber's Master's thesis on Information Extraction for the Engineering Domain
Reviewer for LREC-COLING 2024, WOAH (ACL) 2025
Teaching
2025 Faculty of Informatics, TU Vienna
Interdisciplinary Project in Data Science
2023 - 2024 Faculty of Informatics, TU Vienna
Natural Language Processing and Information Extraction
2023, Reimagining recommender systems together with Anna Merl and Ignacio Pérez Messina. Here is a demo.
2023 Faculty of Informatics, TU Vienna
Advanced Information Retrieval
2020 University of Vienna
Member of the curricular working group for the new data science master studies
2023 Faculty of Linguistics, Paris Lodron University Salzburg
Language Technology and Language Data
Visits
2019 Faculty of Mathematics, University of Vienna
Python for Mathematicians
2018 University of Bergen, Norway
Collaboration with Morten Brun on Topological Data Analysis
2018 - 2020 Faculty of Mathematics, University of Vienna
Introduction to Wolfram Mathematica
2018 National University of Colombia
Collaboration with Francisco Gómez on Topological Data Analysis
2016 - 2017 Autonomous University of Madrid
Erasmus
A long time ago: Guardería de Don Bosco, San José, Costa Rica
Volunteer at a kindergarten and boarding school for socially deprived children and adolescents