Pia Pachinger

she / her

I'm a fourth-year PhD student currently interested in perspectivism and human label variation in Natural Language Processing (NLP), social bias in NLP, low-resource language varieties and domains, data resource creation, and domain-specific information extraction.

I love all areas of NLP, from foundations to application. Please reach out to me if you want to discuss!

Contact me by reordering    @    tuwien.ac.at    pia.pachinger

An AI Reading of the Library of Babel
Simón López Trujillo*, Pia Pachinger*, Baltazar Pérez* 

Publications

The size of AustroTox

2024 ACL Findings
AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection 
Pia Pachinger, Janis Goldzycher, Anna Maria Planitzer, Wojciech Kusa, Allan Hanbury, Julia Neidhardt

We create the first dataset with a focus on Austrian German toxic language online and the first German dataset on toxicity featuring annotated spans that facilitate explainability and error analysis. We find that LLMs not fine-tuned on AustroTox fail to recognize country-specific vulgar language and the targets of toxic statements (as of January 2024).

Here is the data, here is the poster.

2023 EACL C3NLP 
Toward Disambiguating the Definitions of Abusive, Offensive, Toxic, and Uncivil Comments
Pia Pachinger, Anna Maria Planitzer, Julia Neidhardt, Allan Hanbury

We find that researchers studying harmful language online treat abusiveness, offensiveness, and toxicity interchangeably or as sub-concepts of one another. While social science literature frequently employs incivility with a similar meaning, computer science research underutilizes this term, suggesting both fields would benefit from terminological unification. We compile and analyse the distinct definitions researchers use for these concepts to facilitate future unification efforts across and within disciplines.

2022 TU Vienna
A Recommender System for Scientific Referees Based on Bibliographic Databases and Knowledge Graphs
Pia Pachinger, Georg Gottlob, Joël Ouaknine, Glenn Starkman, Matt Rainey, Emanuel Sallinger

We implement multiple recommender systems for scientific expert search using two-step coauthorship and publication venue overlap. Qualitative evaluation by 15 established computer science researchers showed the best system recommended a mean of 5 excellent and 2.8 suitable experts per 10 recommendations.

Current Presentations

Prizes

Education

Employment

Further Activities

2025 Supervision of Gerald Weber's Master's thesis on Information Extraction for the Engineering Domain


Reviewer for LREC-COLING 2024, WOAH (ACL) 2025

2023, Reimagining recommender systems together with Anna Merl and Ignacio Pérez Messina. Here is a demo.

2020 University of Vienna
Member of the curricular working group for the new data science master studies

Teaching

2025 Faculty of Informatics, TU Vienna
Interdisciplinary Project in Data Science

2023 - 2024 Faculty of Informatics, TU Vienna
Natural Language Processing and Information Extraction

2023 Faculty of Informatics, TU Vienna
Advanced Information Retrieval

2023 Faculty of Linguistics, Paris Lodron University Salzburg
Language Technology and Language Data

2019 Faculty of Mathematics, University of Vienna
Python for Mathematicians

2018 - 2020 Faculty of Mathematics, University of Vienna
Introduction to Wolfram Mathematica

Visits

2018 University of Bergen, Norway
Collaboration with Morten Brun on Topological Data Analysis

2018 National University of Colombia
Collaboration with Francisco Gómez on Topological Data Analysis

2016 - 2017 Autonomous University of Madrid
Erasmus

A long time ago: Guardería de Don Bosco, San José, Costa Rica
Volunteer in a kindergarten and boarding school for socially deprived children and adolescents