Lost in Moderation: How Commercial Content Moderation APIs Over- and Under-Moderate Group-Targeted Hate Speech and Linguistic Variations
Description

Commercial content moderation APIs are marketed as scalable solutions to combat online hate speech. However, reliance on these APIs risks both silencing legitimate speech, called over-moderation, and failing to protect online platforms from harmful speech, known as under-moderation. To assess such risks, this paper introduces a framework for auditing black-box NLP systems. Using the framework, we systematically evaluate five widely used commercial content moderation APIs. Analyzing five million queries based on four datasets, we find that APIs frequently rely on group identity terms, such as "black", to predict hate speech. While OpenAI's and Amazon's services perform slightly better, all providers under-moderate implicit hate speech, which uses codified messages, especially against LGBTQIA+ individuals. Simultaneously, they over-moderate counter-speech, reclaimed slurs, and content related to Black, LGBTQIA+, Jewish, and Muslim people. We recommend that API providers offer better guidance on API implementation and threshold setting, as well as more transparency about their APIs' limitations.

Warning: This paper contains offensive and hateful terms and concepts. We have chosen to reproduce these terms for reasons of transparency.
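At its core, the audit described in this abstract amounts to sending labeled texts to a black-box moderation endpoint and measuring over- and under-moderation per target group at a chosen score threshold. The sketch below is a minimal illustration of that idea, not the paper's actual implementation: `query_moderation_api` is a hypothetical placeholder for whichever commercial endpoint is being audited, and the field names and labels are assumptions.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Example:
    text: str
    is_hateful: bool      # gold label from an annotated dataset
    target_group: str     # e.g. "Black", "LGBTQIA+", "Jewish", "Muslim"

def audit(examples: list[Example],
          query_moderation_api: Callable[[str], float],
          threshold: float = 0.5) -> dict[str, dict[str, float]]:
    """Estimate over- and under-moderation rates per target group.

    `query_moderation_api` is a placeholder returning a hate/toxicity score
    in [0, 1]; `threshold` mimics the cut-off a platform must pick when
    deploying a commercial moderation API.
    """
    stats: dict[str, dict[str, int]] = {}
    for ex in examples:
        flagged = query_moderation_api(ex.text) >= threshold
        s = stats.setdefault(ex.target_group,
                             {"fp": 0, "fn": 0, "pos": 0, "neg": 0})
        if ex.is_hateful:
            s["pos"] += 1
            s["fn"] += int(not flagged)   # under-moderation: hate not flagged
        else:
            s["neg"] += 1
            s["fp"] += int(flagged)       # over-moderation: benign content flagged
    return {
        group: {
            "under_moderation_rate": s["fn"] / s["pos"] if s["pos"] else 0.0,
            "over_moderation_rate": s["fp"] / s["neg"] if s["neg"] else 0.0,
        }
        for group, s in stats.items()
    }
```

Sweeping `threshold` over a range of values would show how sensitive each provider's over-/under-moderation trade-off is to the cut-off a platform chooses, which is why the authors recommend clearer provider guidance on threshold setting.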

The Virtual Jail: Content Moderation Challenges Faced by Chinese Queer Content Creators on Douyin
Description

Queer users of Douyin, the Chinese version of TikTok, suspect that the platform removes and suppresses queer content, thus reducing queer visibility. In this study, we examined how Chinese queer users recognize and react to Douyin's moderation of queer content by conducting interviews with 21 queer China-based Douyin content creators and viewers.

Findings indicate that queer users actively explore and adapt to the platform's underlying moderation logic. They employ creative content and posting strategies to reduce the likelihood of their expressions of queer topics and identities being removed or suppressed.

As on Western platforms, Douyin's moderation approaches are often ambiguous; unlike on Western platforms, however, queer users sometimes receive clarity on moderation reasons via direct communication with moderators. Participants suggested that Douyin's repressive moderation practices are influenced by more than just platform policies and procedures: they also reflect state-led homophobia and societal discipline. This study underscores the challenges Chinese queer communities face in maintaining online visibility and suggests that meaningful change in their experiences is unlikely without broader societal shifts towards queer acceptance.

"We're utterly ill-prepared to deal with something like this": Teachers' Perspectives on Student Generation of Synthetic Nonconsensual Explicit Imagery
Description

Synthetic nonconsensual explicit imagery, also referred to as "deepfake nudes", is becoming faster and easier to generate. In the last year, synthetic nonconsensual explicit imagery was reported in at least ten US middle and high schools, generated by students and depicting other students. Teachers are on the front lines of this new form of image abuse and have a valuable perspective on threat models in this context. We interviewed 17 US teachers to understand their opinions and concerns about synthetic nonconsensual explicit imagery in schools. No teachers knew of it happening at their schools, but most expected it to become a growing issue. Teachers proposed many interventions, such as improving reporting mechanisms, focusing on consent in sex education, and updating technology policies. However, teachers disagreed about appropriate consequences for students who create such images. We unpack our findings relative to differing models of justice, sexual violence, and sociopolitical challenges within schools.

“Ignorance is not Bliss”: Designing Personalized Moderation to Address Ableist Hate on Social Media
Description

Disabled people on social media often experience ableist hate and microaggressions. Prior work has shown that platform moderation often fails to remove ableist hate, leaving disabled users exposed to harmful content. This paper examines how personalized moderation can safeguard users from viewing ableist comments. In interviews and focus groups with 23 disabled social media users, we presented design probes to elicit perceptions of configuring filters for ableist speech (e.g., by intensity and type of ableism) and of customizing how filtered ableist speech is presented to mitigate harm (e.g., AI rephrasing of the comment, content warnings). We found that participants preferred configuring filters by type of ableist speech and favored content warnings. We also surface participants’ distrust of AI-based moderation, skepticism about AI’s accuracy, and varied tolerance for viewing ableist hate. Finally, we share design recommendations to support users’ agency, mitigate harm from hate, and promote safety.
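To make the design space concrete, the sketch below models the kind of per-user settings the design probes explore: which types of ableist speech to filter, an intensity threshold, and how filtered content is presented. The field names and category labels are hypothetical illustrations of the probe dimensions, not an implementation from the study.

```python
from dataclasses import dataclass, field
from enum import Enum

class Presentation(Enum):
    SHOW = "show"                # display the comment unchanged
    CONTENT_WARNING = "warning"  # hide behind a content warning (favored by participants)
    AI_REPHRASE = "rephrase"     # show an AI-softened rephrasing
    HIDE = "hide"                # remove from the user's feed entirely

@dataclass
class PersonalModerationSettings:
    # Categories of ableist speech the user wants filtered (hypothetical labels).
    filtered_types: set[str] = field(default_factory=lambda: {"slurs", "microaggressions"})
    # Minimum detector intensity score (0-1) before the filter applies.
    intensity_threshold: float = 0.4
    # How filtered comments should be presented.
    presentation: Presentation = Presentation.CONTENT_WARNING

def present(comment: str, detected_type: str, intensity: float,
            settings: PersonalModerationSettings) -> str:
    """Decide how to render a comment under one user's personalized settings."""
    if detected_type in settings.filtered_types and intensity >= settings.intensity_threshold:
        if settings.presentation is Presentation.CONTENT_WARNING:
            return "[Content warning: possible ableist speech] (tap to view)"
        if settings.presentation is Presentation.AI_REPHRASE:
            return "[AI-rephrased version shown here]"
        if settings.presentation is Presentation.HIDE:
            return ""
    return comment
```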

"I have never seen that for Deaf people's content:" Deaf and Hard-of-Hearing User Experiences with Misinformation, Moderation, and Debunking on Social Media in the US
Description

Misinformation has been studied with various social media user groups, though not with Deaf and Hard-of-Hearing (DHH) individuals. To address this gap, we conducted interviews with 15 DHH participants to explore their lived experiences with misinformation and their perspectives on common moderation and debunking approaches on social media. We found that participants often encounter falsehoods, and they highlighted examples specific to the DHH community, such as misinformation related to American Sign Language (ASL) and Deaf culture. However, moderation interventions and debunking strategies for misinformation on DHH-specific topics were lacking. Written warnings may be beneficial as long as they use language appropriate for DHH people with diverse literacy skills. Participants found visual interventions (e.g., videos) more beneficial as long as they are appropriately captioned, which is not always the case in practice. Our findings provide practical moderation insights for DHH social media users.

Who should set the Standards? Analysing Censored Arabic Content on Facebook during the Palestine-Israel Conflict
Description

Nascent research in human-computer interaction concerns itself with the fairness of content moderation systems. Designing globally applicable content moderation systems requires considering historical, cultural, and socio-technical factors. Inspired by this line of work, we investigate Arab users' perceptions of Facebook's moderation practices. We collect a set of 448 deleted Arabic posts and ask Arab annotators to evaluate these posts based on (a) the Facebook Community Standards (FBCS) and (b) their personal opinion. Each post was judged by 10 annotators to account for subjectivity. Our analysis shows a clear gap between the Arab annotators' understanding of the FBCS and how Facebook implements these standards. The study highlights a need for discussion about who decides the moderation guidelines on social media platforms, how these guidelines are interpreted, and how well they represent the views of marginalised user communities.
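As a rough illustration of the comparison the study performs, the sketch below aggregates the ten judgments collected per deleted post under both framings and reports how often the annotator majority agrees with the removal. The data structures and helper names are hypothetical, assumed only for illustration, and do not reflect the authors' analysis code.

```python
from dataclasses import dataclass

@dataclass
class PostJudgments:
    post_id: str
    violates_fbcs: list[bool]   # 10 annotators' verdicts under the Facebook Community Standards
    should_remove: list[bool]   # the same annotators' personal opinions

def majority(votes: list[bool]) -> bool:
    """True if more than half of the annotators voted yes."""
    return sum(votes) > len(votes) / 2

def agreement_with_removal(posts: list[PostJudgments]) -> dict[str, float]:
    """Share of deleted posts whose removal the annotator majority endorses,
    under the FBCS framing and under personal opinion (all posts were deleted
    by Facebook, so low values indicate a gap between users and the platform)."""
    n = len(posts)
    return {
        "fbcs_majority_agrees": sum(majority(p.violates_fbcs) for p in posts) / n,
        "opinion_majority_agrees": sum(majority(p.should_remove) for p in posts) / n,
    }
```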

The End of “Trust and Safety”?: Examining the Future of Content Moderation and Upheavals in Professional Online Safety Efforts
Description

Trust & Safety (T&S) teams have become vital parts of tech platforms, ensuring safe platform use and combating abuse, harassment, and misinformation. However, between 2021 and 2023, T&S teams faced significant layoffs amid broader downsizing in the tech industry. The reduction in T&S teams has also been attributed to partisan pressure against content moderation efforts designed to mitigate the spread of election- and COVID-19-related misinformation. Accordingly, crucial questions remain over the future of content moderation and T&S in the digital information environment, questions central to the work of CHI researchers interested in intervening in online harm through design, policy, and user research. Through in-depth interviews with T&S professionals, this paper explores upheavals within the T&S industry, examining current perspectives on content moderation and broader strategies for maintaining safe digital environments.
