Content Moderation

Conference Name
CHI 2025
Lost in Moderation: How Commercial Content Moderation APIs Over- and Under-Moderate Group-Targeted Hate Speech and Linguistic Variations
Abstract

Commercial content moderation APIs are marketed as scalable solutions to combat online hate speech. However, the reliance on these APIs risks both silencing legitimate speech, called over-moderation, and failing to protect online platforms from harmful speech, known as under-moderation. To assess such risks, this paper introduces a framework for auditing black-box NLP systems. Using the framework, we systematically evaluate five widely used commercial content moderation APIs. Analyzing five million queries based on four datasets, we find that APIs frequently rely on group identity terms, such as "black", to predict hate speech. While OpenAI's and Amazon's services perform slightly better, all providers under-moderate implicit hate speech, which uses codified messages, especially against LGBTQIA+ individuals. Simultaneously, they over-moderate counter-speech, reclaimed slurs, and content related to Black, LGBTQIA+, Jewish, and Muslim people. We recommend that API providers offer better guidance on API implementation and threshold setting, and more transparency on their APIs' limitations. Warning: This paper contains offensive and hateful terms and concepts. We have chosen to reproduce these terms for reasons of transparency.
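
To make the audit setup concrete, here is a minimal sketch of the kind of black-box probing the paper describes, using OpenAI's moderation endpoint (one of the five audited services) as an example provider. The probe pairs, the 0.5 flagging threshold, and the pairing logic are illustrative assumptions, not the authors' actual framework.

```python
# Minimal sketch of a paired black-box moderation audit.
# Assumptions: the probe sentences and the 0.5 threshold are illustrative;
# only the use of OpenAI's moderation endpoint is confirmed by the abstract.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Sentence pairs that differ only in a group identity term. If benign
# sentences mentioning an identity term score far higher than their neutral
# twins, the API is leaning on the term itself (over-moderation risk).
PROBE_PAIRS = [
    ("I am proud to be black.", "I am proud to be a baker."),
    ("My neighbors are Jewish.", "My neighbors are gardeners."),
]

THRESHOLD = 0.5  # hypothetical threshold an integrating platform might pick

def hate_score(text: str) -> float:
    """Return the provider's hate score in [0, 1] for a single input."""
    result = client.moderations.create(
        model="omni-moderation-latest", input=text
    ).results[0]
    return result.category_scores.hate

for identity_text, neutral_text in PROBE_PAIRS:
    s_id, s_neut = hate_score(identity_text), hate_score(neutral_text)
    print(f"{identity_text!r}: {s_id:.3f} (flagged={s_id > THRESHOLD}) | "
          f"{neutral_text!r}: {s_neut:.3f} (flagged={s_neut > THRESHOLD})")
```

Scaled up to millions of queries over curated datasets, paired probes of this kind are what allow an audit to separate scores driven by hateful intent from scores driven by the mere presence of an identity term.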

Authors
David Hartmann
Weizenbaum Institute Berlin, Berlin, Germany
Amin Oueslati
Hertie School Berlin, Berlin, Germany
Dimitri Staufer
Technical University Berlin, Berlin, Germany
Lena Pohlmann
Weizenbaum Institute Berlin, Berlin, Germany
Simon Munzert
Hertie School Berlin, Berlin, Germany
Hendrik Heuer
Center for Advanced Internet Studies (CAIS), Bochum, Germany
DOI

10.1145/3706598.3713998

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713998

The Virtual Jail: Content Moderation Challenges Faced by Chinese Queer Content Creators on Douyin
Abstract

Queer users of Douyin, the Chinese version of TikTok, suspect that the platform removes and suppresses queer content, thus reducing queer visibility. In this study, we examined how Chinese queer users recognize and react to Douyin's moderation of queer content by conducting interviews with 21 queer China-based Douyin content creators and viewers. Findings indicate that queer users actively explore and adapt to the platform's underlying moderation logic. They employ creative content and posting strategies to reduce the likelihood that their expressions of queer topics and identities will be removed or suppressed. As on Western platforms, Douyin's moderation approaches are often ambiguous; unlike on Western platforms, however, queer users sometimes receive clarity on moderation reasons via direct communication with moderators. Participants suggested that Douyin's repressive moderation practices are influenced by more than just platform policies and procedures: they also reflect state-led homophobia and societal discipline. This study underscores the challenges Chinese queer communities face in maintaining online visibility and suggests that meaningful change in their experiences is unlikely without broader societal shifts towards queer acceptance.

Authors
Caoyang Shen
University of Michigan, Ann Arbor, Michigan, United States
Oliver L. Haimson
University of Michigan, Ann Arbor, Michigan, United States
DOI

10.1145/3706598.3714013

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714013

"We're utterly ill-prepared to deal with something like this": Teachers' Perspectives on Student Generation of Synthetic Nonconsensual Explicit Imagery
Abstract

Synthetic nonconsensual explicit imagery, also referred to as "deepfake nudes", is becoming faster and easier to generate. In the last year, synthetic nonconsensual explicit imagery was reported in at least ten US middle and high schools, generated by students and depicting other students. Teachers are at the front lines of this new form of image abuse and have a valuable perspective on threat models in this context. We interviewed 17 US teachers to understand their opinions and concerns about synthetic nonconsensual explicit imagery in schools. No teachers knew of it happening at their schools, but most expected it to be a growing issue. Teachers proposed many interventions, such as improving reporting mechanisms, focusing on consent in sex education, and updating technology policies. However, teachers disagreed about appropriate consequences for students who create such images. We unpack our findings relative to differing models of justice, sexual violence, and sociopolitical challenges within schools.

Award
Honorable Mention
Authors
Miranda Wei
University of Washington, Seattle, Washington, United States
Christina Yeung
University of Washington, Seattle, Washington, United States
Franziska Roesner
University of Washington, Seattle, Washington, United States
Tadayoshi Kohno
University of Washington, Seattle, Washington, United States
DOI

10.1145/3706598.3713226

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713226

“Ignorance is not Bliss”: Designing Personalized Moderation to Address Ableist Hate on Social Media
Abstract

Disabled people on social media often experience ableist hate and microaggressions. Prior work has shown that platform moderation often fails to remove ableist hate, leaving disabled users exposed to harmful content. This paper examines how personalized moderation can safeguard users from viewing ableist comments. During interviews and focus groups with 23 disabled social media users, we presented design probes to elicit perceptions of configuring filters for ableist speech (e.g., by intensity and type of ableism) and of customizing how ableist speech is presented to mitigate harm (e.g., AI rephrasing of comments and content warnings). We found that participants preferred configuring their filters by type of ableist speech and favored content warnings. We surface participants' distrust of AI-based moderation, skepticism about AI's accuracy, and varied tolerance for viewing ableist hate. Finally, we share design recommendations to support users' agency, mitigate harm from hate, and promote safety.

Authors
Sharon Heung
Cornell Tech, New York, New York, United States
Lucy Jiang
Cornell University, Ithaca, New York, United States
Shiri Azenkot
Cornell Tech, New York, New York, United States
Aditya Vashistha
Cornell University, Ithaca, New York, United States
DOI

10.1145/3706598.3713997

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713997

"I have never seen that for Deaf people's content:" Deaf and Hard-of-Hearing User Experiences with Misinformation, Moderation, and Debunking on Social Media in the US
Abstract

Misinformation has been studied with various social media user groups, though not with Deaf and Hard-of-Hearing (DHH) individuals. To address this gap, we conducted interviews with 15 DHH participants to explore their lived experiences with misinformation and their perspectives on common moderation and debunking approaches on social media. We found that participants often encounter falsehoods; they highlighted examples specific to the DHH community, such as misinformation related to American Sign Language (ASL) and Deaf culture. However, moderation interventions and debunking strategies for misinformation specific to DHH topics were lacking. Written warnings may be beneficial as long as they use language appropriate for DHH people with diverse literacy skills. Participants found visual interventions (e.g., videos) more beneficial as long as they can be appropriately captioned, which is not always the case in practice. Our findings provide practical moderation insights for DHH social media users.

Award
Honorable Mention
Authors
Filipo Sharevski
DePaul University, Chicago, Illinois, United States
Oliver Alonzo
DePaul University, Chicago, Illinois, United States
Sarah M. G. Hau
DePaul University, Chicago, Illinois, United States
DOI

10.1145/3706598.3713114

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713114

Who should set the Standards? Analysing Censored Arabic Content on Facebook during the Palestine-Israel Conflict
Abstract

Nascent research in human-computer interaction concerns itself with the fairness of content moderation systems. Designing globally applicable content moderation systems requires considering historical, cultural, and socio-technical factors. Inspired by this line of work, we investigate Arab users' perceptions of Facebook's moderation practices. We collect a set of 448 deleted Arabic posts and ask Arab annotators to evaluate them based on (a) the Facebook Community Standards (FBCS) and (b) their personal opinion. Each post was judged by 10 annotators to account for subjectivity. Our analysis shows a clear gap between Arab users' understanding of the FBCS and how Facebook implements these standards. The study highlights the need for a discussion about who decides the moderation guidelines on social media platforms, how these guidelines are interpreted, and how well they represent the views of marginalised user communities.

Authors
Walid Magdy
University of Edinburgh, Edinburgh, United Kingdom
Hamdy Mubarak
Qatar Computing Research Institute, Doha, Qatar
Joni Salminen
University of Vaasa, Vaasa, Finland
DOI

10.1145/3706598.3713150

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713150

The End of “Trust and Safety”?: Examining the Future of Content Moderation and Upheavals in Professional Online Safety Efforts
Abstract

Trust & Safety (T&S) teams have become vital parts of tech platforms, ensuring safe platform use and combating abuse, harassment, and misinformation. However, between 2021 and 2023, T&S teams faced significant layoffs amid broader downsizing in the tech industry. The reduction in T&S teams has also been attributed to partisan pressure against content moderation efforts designed to mitigate the spread of election- and COVID-19-related misinformation. Accordingly, crucial questions remain about the future of content moderation and T&S in the digital information environment, questions central to the work of CHI researchers interested in intervening in online harm through design, policy, and user research. Through in-depth interviews with T&S professionals, this paper explores upheavals within the T&S industry, examining current perspectives on content moderation and broader strategies for maintaining safe digital environments.

Award
Honorable Mention
Authors
Rachel Elizabeth Moran
University of Washington, Seattle, Washington, United States
Joseph S. Schafer
University of Washington, Seattle, Washington, United States
Mert Bayar
University of Washington, Seattle, Washington, United States
Kate Starbird
University of Washington, Seattle, Washington, United States
DOI

10.1145/3706598.3713662

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713662
