Fairness Evaluation in Text Classification: Machine Learning Practitioner Perspectives of Individual and Group Fairness

Abstract

Mitigating algorithmic bias is a critical task in the development and deployment of machine learning models. While several toolkits exist to aid machine learning practitioners in addressing fairness issues, little is known about the strategies practitioners employ to evaluate model fairness and what factors influence their assessment, particularly in the context of text classification. Two common approaches to evaluating the fairness of a model are group fairness and individual fairness. We run a study with machine learning practitioners (n=24) to understand the strategies used to evaluate models. The metrics presented to practitioners (group vs. individual fairness) impact which models they consider fair. Participants focused on risks associated with underpredicting/overpredicting and on model sensitivity to identity-token manipulations. We discover fairness assessment strategies involving personal experiences or how users form groups of identity tokens to test model fairness. We provide recommendations for interactive tools for evaluating fairness in text classification.

Authors
Zahra Ashktorab
IBM Research, Yorktown Heights, New York, United States
Benjamin Hoover
IBM Research AI, Cambridge, Massachusetts, United States
Mayank Agarwal
IBM Research, Cambridge, Massachusetts, United States
Casey Dugan
IBM Research, Cambridge, Massachusetts, United States
Werner Geyer
IBM Research, Cambridge, Massachusetts, United States
Hao Bang Yang
Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Mikhail Yurochkin
IBM Research, Cambridge, Massachusetts, United States
Paper URL

https://doi.org/10.1145/3544548.3581227


Conference: CHI 2023

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2023.acm.org/)

Session: Platforms and Algorithms

Hall E
6 presentations
2023-04-25 18:00–19:30