Algorithmic timeline curation is now an integral part of Twitter's platform, affecting information exposure for more than 150 million daily active users. Despite its large-scale, high-stakes impact, especially during a public health emergency such as the COVID-19 pandemic, the effects of Twitter's curation algorithm remain largely unknown. In this work, we present a sock-puppet audit that characterizes the effects of algorithmic curation on source diversity and content diversity in Twitter timelines. We created eight sock-puppet accounts to emulate representative real-world users, selected through a large-scale network analysis. Then, for one month in early 2020, we collected the puppets' timelines twice per day. Broadly, our results show that algorithmic curation increases source diversity in terms of both Twitter accounts and external domains, even though it drastically decreases the number of external links in the timeline. In terms of content diversity, algorithmic curation had a mixed effect, slightly amplifying a cluster of politically focused tweets while squelching a cluster of tweets focused on COVID-19 fatalities. Finally, we present some evidence that the timeline algorithm may exacerbate partisan differences in exposure to different sources and content. The paper concludes by discussing broader implications in the context of algorithmic gatekeeping.
https://doi.org/10.1145/3449152
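The study above does not specify its data-collection tooling, but a minimal sketch of the twice-daily timeline collection it describes could look like the following. The tweepy library, Twitter's v1.1 home_timeline endpoint, and the per-puppet credentials are assumptions for illustration only, not the authors' actual pipeline.

```python
# Minimal sketch of twice-daily home-timeline collection for sock-puppet accounts.
# Assumptions (not stated in the paper): the tweepy library, Twitter's v1.1
# home_timeline endpoint, and one set of API credentials per puppet (placeholders).
import json
import time
from datetime import datetime, timezone

import tweepy

PUPPET_CREDENTIALS = {
    # "puppet_1": dict(consumer_key="...", consumer_secret="...",
    #                  access_token="...", access_token_secret="..."),
}


def collect_timeline(name, creds, count=200):
    """Save one snapshot of a puppet's algorithmically curated home timeline."""
    auth = tweepy.OAuth1UserHandler(**creds)
    api = tweepy.API(auth, wait_on_rate_limit=True)
    tweets = api.home_timeline(count=count)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    with open(f"{name}_{stamp}.json", "w") as f:
        json.dump([t._json for t in tweets], f)


if __name__ == "__main__":
    while True:
        for name, creds in PUPPET_CREDENTIALS.items():
            collect_timeline(name, creds)
        time.sleep(12 * 60 * 60)  # two snapshots per day
```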
The rise of geotargeted online advertising has disrupted the business model of local journalism, but it remains unclear whether online advertising platforms can effectively reach local audiences. To address this ambiguity, we present a focused study auditing the positional accuracy of geotargeted display advertisements on Google. We measure the frequency and severity of geotargeting errors by targeting display ads to random ZIP codes across the United States and collecting self-reported location information from users who click on the advertisements. We find that geotargeting errors are common but minor with respect to advertising goals: while 41% of respondents lived outside the target ZIP code, only 11% lived outside the target county, and only 2% lived outside the target state. We also detail a high volume of suspicious clicks in our data, which drove the cost per valid sample extremely high. The paper concludes by discussing implications for advertisers and the business of local journalism.
https://doi.org/10.1145/3449166
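As a toy illustration (not the authors' code) of how the ZIP, county, and state error rates above could be tabulated from click-through survey responses, assuming a CSV of self-reported locations with hypothetical column names:

```python
# Toy tabulation of geotargeting error rates at increasing levels of severity.
# The file name and the target_* / reported_* columns are hypothetical, not
# the study's actual schema.
import pandas as pd

responses = pd.read_csv("ad_click_survey.csv")  # one row per survey respondent

for level in ("zip", "county", "state"):
    error_rate = (responses[f"reported_{level}"] != responses[f"target_{level}"]).mean()
    print(f"outside target {level}: {error_rate:.0%}")
```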
Smart speakers are becoming increasingly ubiquitous in society and are now used to satisfy a variety of information needs, from asking about the weather or traffic to accessing the latest breaking news. Their growing use for news and information consumption raises new questions about the quality, source diversity, and comprehensiveness of the news-related information they convey. These questions have significant implications for voice assistant technologies acting as algorithmic information intermediaries, yet systematic information quality audits have not yet been undertaken. To address this gap, we develop a methodological approach for evaluating information quality in voice assistants for news-related queries. We demonstrate the approach on the Amazon Alexa voice assistant, first characterizing Alexa's performance in terms of response relevance, accuracy, and timeliness, and then elaborating analyses of information quality based on query phrasing, news category, and information provenance. We discuss the implications of our findings for the design of future smart speaker devices and, more broadly, for the consumption of news via such algorithmic intermediaries.
https://doi.org/10.1145/3449157
A growing body of literature has proposed formal approaches to auditing algorithmic systems for biased and harmful behaviors. While formal auditing approaches have been highly impactful, they often suffer from major blind spots, with critical issues surfacing only in the context of everyday use once systems are deployed. Recent years have seen many cases in which everyday users of algorithmic systems detect and raise awareness about harmful behaviors that they encounter in the course of their everyday interactions with these systems. However, to date little academic attention has been paid to these bottom-up, user-driven auditing processes. In this paper, we propose and explore the concept of everyday algorithm auditing, a process in which users detect, understand, and interrogate problematic machine behaviors via their day-to-day interactions with algorithmic systems. We argue that everyday users are powerful in surfacing problematic machine behaviors that may elude detection via more centrally organized forms of auditing, regardless of users' knowledge about the underlying algorithms. We analyze several real-world cases of everyday algorithm auditing, drawing lessons from these cases for the design of future platforms and tools that facilitate such auditing behaviors. Finally, we discuss the work that lies ahead toward bridging the gaps between formal auditing approaches and the organic auditing behaviors that emerge in everyday use of algorithmic systems.
https://doi.org/10.1145/3479577
Large, ever-evolving technology companies continue to invest time and resources in incorporating responsible Artificial Intelligence (AI) into production-ready systems to increase algorithmic accountability. This paper examines how organizational culture and structure affect the effectiveness of responsible AI initiatives in practice, and offers a framework for analyzing those effects. We present the results of semi-structured qualitative interviews with industry practitioners, investigating common challenges, ethical tensions, and effective enablers of responsible AI initiatives. Focusing on major companies that develop or use AI, we map which organizational structures currently support or hinder responsible AI initiatives, which aspirational future processes and structures would best enable effective initiatives, and which key elements make up the transition from current work practices to that aspirational future.
https://doi.org/10.1145/3449081
Algorithmically mediated content is both a product and a creator of dominant social narratives, and it has the potential to shape users' beliefs and behaviors. We present two studies on the content and impact of gender and racial representation in image search results for common occupations. In Study 1, we compare 2020 workforce gender and racial composition to that reflected in image search. We find evidence of underrepresentation on both dimensions: for an occupation that is 50% women, search results show about 42% women (comparable to 2015 levels of underrepresentation); for an occupation that is 22% people of color (in proportion to the U.S. workforce), search results show 16% people of color. We also compare our gender representation data with that collected in 2015 by Kay et al., finding little improvement over the past half-decade. In Study 2, we examine people's impressions of occupations and their sense of belonging in a given field when shown search results with different proportions of women and people of color. We find that both types of representation, as well as people's own racial and gender identities, affect their experience of image search results. We conclude by emphasizing the need for designers and auditors of algorithms to consider the disparate impacts of algorithmic content on users with marginalized identities.
https://doi.org/10.1145/3449100
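As a small, purely illustrative sketch of the Study 1 comparison above between search-result composition and workforce composition, one could compute percentage-point gaps as follows; the occupation labels and figures below are hypothetical placeholders that only echo the magnitudes reported in the abstract.

```python
# Compare the share of women or people of color in labeled image search results
# against workforce composition, expressed as percentage-point gaps.
# All figures below are illustrative placeholders, not the study's data.

def representation_gap(search_share: float, workforce_share: float) -> float:
    """Percentage-point gap; negative values indicate underrepresentation in search."""
    return search_share - workforce_share

occupations = {
    #                  (share in search results, share in workforce)
    "example_field_a": (0.42, 0.50),  # women
    "example_field_b": (0.16, 0.22),  # people of color
}

for occupation, (in_search, in_workforce) in occupations.items():
    gap = representation_gap(in_search, in_workforce)
    print(f"{occupation}: {gap:+.1%} relative to the workforce")
```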
While algorithm audits are growing rapidly in importance and commonality, relatively little scholarly work has synthesized prior studies or charted future research in the area. This systematic literature review aims to fill that gap, following PRISMA guidelines in a review of over 500 English-language articles that yielded 62 algorithm audit studies. The studies are synthesized and organized primarily by behavior (discrimination, distortion, exploitation, and misjudgement), with codes also provided for domain (e.g., search, vision, advertising), organization (e.g., Google, Facebook, Amazon), and audit method (e.g., sock puppet, direct scrape, crowdsourcing). The review shows that previous audit studies have exposed powerful algorithms exhibiting problematic behavior, such as search algorithms culpable of distortion and advertising algorithms culpable of discrimination. It also identifies behaviors, domains, methods, and organizations that call for future audit attention, such as problematic "echo chambers" and other distortion effects from advertising algorithms. The paper concludes by discussing algorithm auditing in the context of other research working toward algorithmic justice.
https://doi.org/10.1145/3449148
In the era of big data and artificial intelligence, online risk detection has become a popular research topic. From detecting online harassment to the sexual predation of youth, state-of-the-art computational risk detection has the potential to protect particularly vulnerable populations from online victimization. Yet this is a high-risk, high-reward endeavor that requires a systematic and human-centered approach to synthesizing disparate bodies of research across application domains, so that we can identify best practices and potential gaps and set a strategic research agenda for leveraging these approaches in ways that benefit society. We therefore conducted a comprehensive literature review of 73 peer-reviewed articles on computational approaches that use text or metadata/multimedia for online sexual risk detection. We identified sexual grooming (75%), sex trafficking (12%), and sexual harassment and/or abuse (12%) as the three types of sexual risk detection present in the extant literature. Furthermore, we found that the majority (93%) of this work has focused on identifying sexual predators after the fact, rather than taking more nuanced approaches to identify potential victims and problematic patterns that could be used to prevent victimization before it occurs. Many studies rely on public datasets (82%) and third-party annotators (33%) to establish ground truth and train their algorithms. Finally, most of this work (78%) focused on evaluating the algorithmic performance of the models, and very few studies (4%) evaluated these systems with real users. We therefore urge computational risk detection researchers to integrate more human-centered approaches into both developing and evaluating sexual risk detection algorithms, to ensure the broader societal impact of this important work.
https://doi.org/10.1145/3479609