she / her
I'm a fourth-year PhD student at TU Wien (Vienna, Austria) supervised by Allan Hanbury and Julia Neidhardt. I am interested in safety alignment, perspectivism and human label variation in Natural Language Processing, social bias in NLP, low-resource settings, and data resource creation.
Please reach out to me if you want to discuss!
Contact me by reordering @ tuwien.ac.at pia.pachinger
An AI Reading of the Library of Babel
Baltazar Pérez, Pia Pachinger, Simón López Trujillo
The size of AustroTox
2024 ACL Findings
AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection
Pia Pachinger, Janis Goldzycher, Anna Maria Planitzer, Wojciech Kusa, Allan Hanbury, Julia Neidhardt
We create the first dataset with a focus on Austrian German toxic language online and the first German dataset on toxicity featuring annotated spans facilitating explainability and error analysis. We find that LLMs not fine-tuned on AustroTox fail to recognize country-specific vulgar language and the targets of toxic statements (as of January 2024).
2025 EMNLP NLPerspectives
A Disaggregated Dataset on English Offensiveness Containing Spans
Pia Pachinger, Janis Goldzycher, Anna Maria Planitzer, Julia Neidhardt, Allan Hanbury
We re-annotate a subset of posts of the Jigsaw Toxic Comment Classification Challenge and provide disaggregated toxicity labels and spans. We find that five annotations per instance make it possible to distinguish genuine disagreement from disagreement arising from annotation error or inconsistency. Disagreement is especially high in cases of toxic statements involving non-human targets.
2023 EACL C3NLP
Toward Disambiguating the Definitions of Abusive, Offensive, Toxic, and Uncivil Comments
Pia Pachinger, Anna Maria Planitzer, Julia Neidhardt, Allan Hanbury
We find that researchers studying harmful language online treat abusiveness, offensiveness, and toxicity interchangeably as sub-concepts of one another. While social science literature frequently employs incivility with similar meaning, computer science research underutilizes this term, suggesting both fields would benefit from terminological unification. We compile and analyse the distinct definitions researchers use for these concepts to facilitate future unification efforts across and within disciplines.
2022 TU Wien
A Recommender System for Scientific Referees Based on Bibliographic Databases and Knowledge Graphs
Pia Pachinger, Georg Gottlob, Joël Ouaknine, Glenn Starkman, Matt Rainey, Emanuel Sallinger
We implement multiple recommender systems for scientific expert search using two-step coauthorship and publication venue overlap. Qualitative evaluation by 15 established computer science researchers showed the best system recommended a mean of 5 excellent and 2.8 suitable experts per 10 recommendations.
Workshop on Online Abuse and Harms (ACL) 2025
Alignment by Disagreement? Investigating Individual and Sociodemographic Variability in the Perception of Harmful Language
Pia Pachinger, Anna Maria Planitzer, Allan Hanbury, Julia Neidhardt, Sophie Lecheler
International Communications Association Conference 2025
Like Walking a Tightrope: User-Centric Perspectives on Automated Content Moderation
Anna Maria Planitzer, Sophie Lecheler, Svenja Schäfer, Pia Pachinger (presented by Anna)
COMPTEXT 2025
Incorporating User Perceptions of Online Norm Violations in Toxicity Detection Models
Pia Pachinger, Anna Maria Planitzer, Allan Hanbury, Julia Neidhardt, Rebekah Wegener, Sophie Lecheler
2024 NAACL, Student Research Workshop
Best PhD proposal
User-Centric Offensive Text Detection in Culture Specific Contexts: A PhD Proposal
Pia Pachinger
2022 Inria, Paris
Design + interaction + AI hackathon
Second Prize for That's life.
Artifact publicly exhibited at Le Bis in Paris
Anaïs Cambou*, Anthonin Gourichon*, Fengyu Li*, Xiaoning Meng*, Pia Pachinger* (* equal contribution)
PhD in Computer Science, TU Wien, 2022 – 2026
Natural language processing, in particular user-centric offensive text detection
Master in Data Science, TU Wien
Machine learning and statistics, natural language processing and visual analytics
Thesis on recommender systems based on knowledge graphs
GPA 3.7 / 4.0
Languages
German (native)
English (C1)
Spanish (C1)
Python
Bachelor in Theoretical Mathematics, University of Vienna
Thesis on topological data analysis
TU Wien 2022 - 2026
Faculty of Informatics
Prae-Doc researcher
TACo: User-centric content moderation
BrAIn: Domain-specific information extraction
VHH: Evaluation of machine translation models
Austrian Institute of Technology 2019 - 2020
Center for Cyber Security
Researcher
Pre-training and evaluation of CNNs and LSTMs for anomaly detection in time series of system log data
TU Wien 2021 - 2022
Faculty of Informatics
Student researcher
Implementation of a recommender system for scientific referees
University of Vienna 2018 - 2020
Faculty of Mathematics
Teaching Assistant
2025 Supervision of Gerald Weber's Master's Thesis on Information Extraction for the Engineering Domain
Reviewer for LREC-COLING 2024, WOAH (ACL) 2025
2019 Faculty of Mathematics, University of Vienna
Python for Mathematicians
2023 Faculty of Informatics, TU Vienna
Natural Language Processing and Information Extraction
2020 University of Vienna
Member of the curricular working group for the new data science master studies
2023 Faculty of Informatics, TU Vienna
Advanced Information Retrieval
2023 Faculty of Linguistics, Paris Lodron University Salzburg
Language Technology and Language Data
Teaching
2018 - 2020 Faculty of Mathematics, University of Vienna
Introduction to Wolfram Mathematica
2025 Faculty of Informatics, TU Vienna
Interdisciplinary Project in Data Science
2018 University of Bergen, Norway
Collaboration with Morten Brun on Topological Data Analysis
2017 National University of Colombia
Collaboration with Francisco Gómez on Topological Data Analysis