A growing body of work has highlighted the important role that Wikipedia’s volunteer-created content plays in helping search engines achieve their core goal of addressing the information needs of millions of people. In this paper, we report the results of an investigation into the incidence of Wikipedia links in search engine results pages (SERPs). Our results extend prior work by considering three U.S. search engines, simulating both mobile and desktop devices, and using a spatial analysis approach designed to study modern SERPs that are no longer just “ten blue links”. We find that Wikipedia links are extremely common in important search contexts, appearing in 67-84% of all SERPs for common and trending queries, but less often for medical queries. Furthermore, we observe that Wikipedia links often appear in “Knowledge Panel” SERP elements and are in positions visible to users without scrolling, although Wikipedia appears less often in prominent positions on mobile devices. Our findings reinforce the complementary notions that (1) Wikipedia content and research have major impact outside of the Wikipedia domain and (2) powerful technologies like search engines are highly reliant on free content created by volunteers.
https://doi.org/10.1145/3449078
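As an illustration of the spatial analysis described in the abstract above, the following sketch scans parsed SERP elements for wikipedia.org links and checks whether any fall above an assumed fold line. The element schema, the 650-pixel fold cutoff, and the function names are illustrative assumptions rather than the study's actual instrumentation.

    # The element schema and the fold cutoff are assumptions, not the study's instrumentation.
    from urllib.parse import urlparse

    FOLD_PX = 650  # assumed above-the-fold cutoff for a simulated viewport

    def is_wikipedia(url):
        host = urlparse(url).netloc.lower()
        return host == "wikipedia.org" or host.endswith(".wikipedia.org")

    def analyze_serp(elements):
        # elements: parsed SERP components with a type, vertical position, and links.
        hits = [e for e in elements if any(is_wikipedia(u) for u in e["links"])]
        return {
            "has_wikipedia": bool(hits),
            "above_fold": any(e["top_px"] < FOLD_PX for e in hits),
            "in_knowledge_panel": any(e["type"] == "knowledge_panel" for e in hits),
        }

    def incidence(serps):
        # Fraction of SERPs containing at least one Wikipedia link.
        flags = [analyze_serp(s)["has_wikipedia"] for s in serps]
        return sum(flags) / len(flags) if flags else 0.0

    # Example: a knowledge panel near the top of a simulated SERP.
    example = [{"type": "knowledge_panel", "top_px": 120,
                "links": ["https://en.wikipedia.org/wiki/Example"]}]
    print(analyze_serp(example))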
Do models of collaboration among Wikipedia contributors generalize beyond the larger, western editions of the encyclopedia? In this study, we expanded upon the known collaborative mechanisms on the English Wikipedia and demonstrated that the collaboration model is best captured through the interplay of these mechanisms. We annotated talk page conversations for types of power plays, i.e., vies for control over the edits editors make to articles, to understand how policy and power play mechanisms in editors’ discussions account for behavior in the English (EN), Farsi (FA), and Chinese (ZH) language editions of Wikipedia. Our findings show that the same power plays used in EN exist in both FA and ZH, but the frequency of their usage differs across the editions. This suggests that editors in different language communities value contrasting types of policies when competing for power in discussions about edits to an article. This study contributes to a deeper understanding of how collaboration models developed from a western perspective translate to non-western language editions.
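As a hedged illustration of how annotated power-play frequencies might be compared across editions, the sketch below runs a chi-square test of independence on a contingency table of label counts. The label categories and counts are placeholders, not the study's annotation scheme or data, and the authors may have used a different analysis.

    # Label categories and counts are placeholders, not the study's annotation data.
    from scipy.stats import chi2_contingency

    # Rows: language editions (EN, FA, ZH); columns: hypothetical power-play types.
    counts = [
        [120, 45, 30],   # EN
        [ 60, 50, 10],   # FA
        [ 80, 20, 40],   # ZH
    ]

    chi2, p, dof, expected = chi2_contingency(counts)
    print(f"chi2={chi2:.2f}, dof={dof}, p={p:.4f}")
    # A small p-value indicates the distribution of power-play types differs by edition.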
Wikipedia is the product of thousands of editors working collaboratively to provide free and up-to-date encyclopedic information to the project's users. This article asks to what degree Wikipedia articles in three languages --- Hindi, Urdu, and English --- achieve Wikipedia’s mission of making neutrally-presented, reliable information on a polarizing, controversial topic available to people around the globe. We chose the topic of the recent revocation of Article 370 of the Constitution of India, which, along with other recent events in and concerning the region of Jammu and Kashmir, has drawn attention to related articles on Wikipedia. This work focuses on the English Wikipedia, the preeminent language edition of the project, as well as the Hindi and Urdu editions. Hindi and Urdu are the two standardized varieties of Hindustani, a lingua franca of Jammu and Kashmir. We analyzed page view and revision data for three Wikipedia articles to gauge the popularity of the pages in our corpus and the responsiveness of editors to breaking news events and problematic edits. Additionally, we interviewed editors from all three language editions to learn about differences in editing processes and motivations, and we compared the text of the articles across languages as they appeared shortly after the revocation of Article 370. While activity on South Asian language editions of Wikipedia is growing, at the time of writing the Hindi and Urdu editions are still nascent. In Hindi and Urdu, as well as English, editors predominantly try to adhere to the principle of neutral point of view (NPOV), and for the most part, the editors quash attempts by other editors to push political agendas.
https://doi.org/10.1145/3449108
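Daily page view counts of the kind analyzed in the study above are available from the public Wikimedia Pageviews REST API; the sketch below retrieves them for one article on one language edition. The article title, date range, and contact address are placeholders (titles differ across the Hindi and Urdu editions), and this is not necessarily the pipeline the authors used.

    # Article title, dates, and contact address are placeholders; this is not
    # necessarily the data pipeline used in the study.
    import requests

    def daily_pageviews(project, title, start, end):
        url = (
            "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
            f"{project}/all-access/all-agents/{title}/daily/{start}/{end}"
        )
        # Wikimedia asks API clients to identify themselves via the User-Agent header.
        headers = {"User-Agent": "example-research-script (contact@example.org)"}
        resp = requests.get(url, headers=headers, timeout=30)
        resp.raise_for_status()
        return {item["timestamp"]: item["views"] for item in resp.json()["items"]}

    # Placeholder example: views of the English article around the revocation.
    views = daily_pageviews(
        "en.wikipedia", "Article_370_of_the_Constitution_of_India",
        "2019080100", "2019083100",
    )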
An increasing number of safety departments in organizations across the U.S. are offering mobile apps that allow their local community members to report potential risks, such as hazards, suspicious events, ongoing incidents, and crimes. These “community-sourced risk” systems are designed to let safety departments take action to prevent or reduce the severity of situations that may harm the community. However, little is known about the actual use of such community-sourced risk systems from the perspective of both community users and safety departments. This study examines how community users report incidents through these mobile apps to seek help and how safety departments use the systems to serve their community members. More specifically, we conducted a comprehensive system log analysis of community-sourced risk systems used by safety departments and students at more than 200 American colleges and universities. Our findings revealed a mismatch between what the safety departments expected to receive and what their community members actually reported, and identified several factors (e.g., anonymity of the tip, organization, tip type) that were associated with the safety departments' effective responsiveness to community members' tips. Our findings provide both new design and practical implications for more effective use of community-sourced risk systems for community safety.
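As one hedged way to operationalize the association analysis mentioned above, the sketch below fits a logistic regression relating tip attributes (anonymity, tip type) to whether a tip received a response. The column names, categories, and toy data are placeholders rather than the study's log schema or findings.

    # Column names, categories, and toy data are placeholders, not the study's
    # log schema or findings.
    import pandas as pd
    import statsmodels.formula.api as smf

    tips = pd.DataFrame({
        "responded": [1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1],
        "anonymous": [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1],
        "tip_type":  ["hazard"] * 6 + ["crime"] * 5 + ["suspicious"] * 5,
    })

    model = smf.logit("responded ~ anonymous + C(tip_type)", data=tips).fit(disp=False)
    print(model.summary())  # coefficient signs indicate direction of association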
Crowdsourcing markets provide workers with a centralized place to find paid work. What may not be obvious at first glance is that, in addition to the work they do for pay, crowd workers also have to shoulder a variety of unpaid invisible labor in these markets, which ultimately reduces their hourly wages. Invisible labor includes finding good tasks, messaging requesters, and managing payments. However, we currently know little about how much time crowd workers actually spend on invisible labor or how much it costs them economically. To ensure a fair and equitable future for crowdsourcing, we need to be certain that workers are being paid fairly for all of the work they do. In this paper, we conduct a field study to quantify the invisible labor in crowd work. We build a plugin to record the amount of time that 100 workers on Amazon Mechanical Turk dedicate to invisible labor while completing 40,903 HITs. If we ignore the time workers spent on invisible labor, workers' median hourly wage was $3.76. However, we estimated that crowd workers in our study spent 33% of their time daily on invisible labor, dropping their median hourly wage to $2.83. We found that invisible labor differentially impacts workers depending on their skill level (master workers did 23% less invisible labor) and their demographics. The invisible labor category that took the most time, and that was also the most common, revolved around workers having to manage their payments. The second most time-consuming category involved hyper-vigilance, where workers closely monitored requesters' profiles for newly posted work and continually searched for new tasks. We hope that through our paper the invisible labor in crowdsourcing becomes more visible, and that our results help reveal the larger implications of the continuing invisibility of labor in crowdsourcing.
https://doi.org/10.1145/3476060
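The wage adjustment reported in the study above can be illustrated with a small calculation: adding unpaid (invisible) minutes to the denominator of an hourly rate lowers the effective wage. The numbers below are made up for illustration, and because the paper's medians are computed across workers, this sketch does not reproduce the published figures.

    # Numbers are made up; the paper reports medians computed across workers,
    # so these figures are not the study's results.
    earnings_usd = 15.04        # pay earned during one work session
    paid_minutes = 240          # time spent on the paid HITs themselves
    invisible_minutes = 80      # searching for tasks, messaging requesters, managing payments

    wage_ignoring_invisible = earnings_usd / (paid_minutes / 60)
    wage_with_invisible = earnings_usd / ((paid_minutes + invisible_minutes) / 60)

    print(f"${wage_ignoring_invisible:.2f}/hr ignoring invisible labor")  # prints $3.76/hr
    print(f"${wage_with_invisible:.2f}/hr counting invisible labor")      # prints $2.82/hr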
The life cycle of a peer-produced dataset follows the phases of growth, maturity, and decline. Paying crowdworkers is a proven method for collecting and organizing information into structured tables. However, these tabular representations may contain inaccuracies due to errors or to data changing over time. Thus, the maturation phase of a dataset can benefit from additional human examination. One method to improve accuracy is to recruit additional paid crowdworkers to verify and correct errors. An alternative method relies on unpaid contributors collectively editing the dataset during regular use. We describe two case studies that examine different strategies for human verification and maintenance of in-flux tabular datasets. The first case study examines traditional micro-task verification strategies with paid crowdworkers, while the second examines long-term maintenance strategies with unpaid contributions from non-crowdworkers. The two paid verification strategies that produced more accurate corrections at a lower cost per accurate correction were (1) redundant data collection followed by final verification by a trusted crowdworker and (2) allowing crowdworkers to review any data freely. In the unpaid maintenance strategies, contributors provided more accurate corrections when asked to review data matching their interests. This research identifies considerations and future approaches for collectively improving the accuracy and longevity of tabular information.
https://doi.org/10.1145/3479592
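One hedged way to realize the redundancy-plus-verification strategy described above is sketched below: collect each cell from several workers, accept answers with sufficient agreement, and route disagreements to a trusted reviewer. The majority threshold, function names, and escalation rule are assumptions for illustration, not the exact workflow evaluated in the case studies.

    # The majority rule and escalation path are assumptions, not the paper's workflow.
    from collections import Counter

    def aggregate_cell(answers, verify_with_trusted_worker, majority=2 / 3):
        # answers: values submitted by redundant crowdworkers for one table cell.
        value, count = Counter(answers).most_common(1)[0]
        if count / len(answers) >= majority:
            return value                              # enough agreement: accept as-is
        return verify_with_trusted_worker(answers)    # disagreement: escalate for review

    # Example: three workers all give different values, so the cell is escalated.
    cell = aggregate_cell(
        ["open 9-5", "open 9-6", "open 10-5"],
        verify_with_trusted_worker=lambda answers: answers[0],  # stand-in for a real review step
    )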
The future of crowd work has been identified as depending on worker satisfaction, but we lack a thorough understanding of how worker satisfaction can be increased in microtask crowdsourcing. Prior work has shown that one solution is to build tasks that are engaging. To facilitate engagement, two methods that have received attention in recent HCI literature are the use of video games and conversational interfaces. While these are largely different techniques, they aim for the same goal of reducing worker burden and increasing engagement in a task. On one hand, video games have considerable motivational potential, and translating game design elements for motivational purposes has shown positive effects. Recent work in games research has shown that the use of player avatars is effective in fostering interest, enjoyment, and other aspects pertaining to intrinsic motivation. On the other hand, conversational interfaces have been argued to have advantages over traditional GUIs due to facilitating a more human-like interaction. ‘Conversational’ microtasking has recently been proposed to improve worker engagement in microtask marketplaces. The contexts of games and crowd work are both underpinned by the need to motivate and engage participants, yet the potential of using worker avatars to promote self-identification and improve worker satisfaction in microtask crowdsourcing has remained unexplored. Addressing this knowledge gap, we carried out a between-subjects study involving 360 crowd workers. We investigated how worker avatars influence quality-related outcomes of workers and their perceived experience, in conventional web and novel conversational interfaces. We equipped workers with the functionality of customizing their avatars and selecting characterizations for them, to understand whether identifying with an avatar can increase the motivation of workers. We found evidence that using worker avatars can significantly reduce workers' perceived task difficulty in information finding tasks across both web and conversational interfaces. We also found that using worker avatars with conversational interfaces can effectively reduce cognitive workload, increase worker intrinsic motivation, and increase worker retention. Our findings have important implications for alleviating workers' perceived workload, building their self-confidence, and designing crowdsourcing microtasks.
https://doi.org/10.1145/3476063
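Between-subjects comparisons like the perceived-difficulty result above are commonly tested with a two-sample test on per-participant ratings; the sketch below shows one such comparison on placeholder ratings. The data and the choice of a Mann-Whitney U test are assumptions for illustration, not the study's data or its exact statistical procedure.

    # Ratings are placeholders, not the study's data, and the paper may use a
    # different statistical procedure.
    from scipy.stats import mannwhitneyu

    difficulty_avatar    = [2, 3, 2, 4, 3, 2, 3, 2, 1, 3]  # 1-7 ratings, avatar condition
    difficulty_no_avatar = [4, 5, 3, 4, 5, 4, 3, 5, 4, 4]  # 1-7 ratings, control condition

    stat, p = mannwhitneyu(difficulty_avatar, difficulty_no_avatar, alternative="two-sided")
    print(f"U={stat}, p={p:.4f}")  # a small p suggests a difference in perceived difficulty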
Crowdsourcing is popular for large-scale data collection and labeling, but a major challenge is detecting low-quality submissions. Recent studies have demonstrated that behavioral features of workers are highly correlated with data quality and can be useful for quality control. However, these studies primarily leveraged coarsely extracted behavioral features and did not further explore quality control at the fine-grained level, i.e., the annotation unit level. In this paper, we investigate the feasibility and benefits of using fine-grained behavioral features, i.e., behavioral features extracted from a worker's individual interactions with each single unit in a subtask, for quality control in crowdsourcing. We design and implement a framework named Fine-grained Behavior-based Quality Control (FBQC) that extracts fine-grained behavioral features to provide three quality control mechanisms: (1) quality prediction for objective tasks, (2) suspicious behavior detection for subjective tasks, and (3) unsupervised worker categorization. Using the FBQC framework, we conduct two real-world crowdsourcing experiments and demonstrate that using fine-grained behavioral features is feasible and beneficial for all three quality control mechanisms. Our work provides insights and implications for helping job requesters and crowdsourcing platforms achieve better quality control.
https://doi.org/10.1145/3479586
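As a hedged sketch of what fine-grained (per annotation unit) behavioral features and two of the three quality-control mechanisms could look like in code, the snippet below derives simple per-unit features from interaction logs, fits a supervised quality predictor, and clusters workers without labels. The feature set, log schema, toy data, and models are assumptions, not the FBQC implementation.

    # Feature set, log schema, toy data, and models are assumptions, not FBQC itself.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.cluster import KMeans

    def unit_features(events):
        # Fine-grained features for ONE worker on ONE annotation unit.
        times = [e["t"] for e in events]
        dwell = max(times) - min(times) if len(times) > 1 else 0.0
        clicks = sum(e["type"] == "click" for e in events)
        keypresses = sum(e["type"] == "key" for e in events)
        return [dwell, clicks, keypresses]

    # Toy interaction logs: one event list per (worker, unit) pair.
    unit_logs = [
        [{"t": 0.0, "type": "focus"}, {"t": 4.2, "type": "key"}, {"t": 5.0, "type": "click"}],
        [{"t": 0.0, "type": "focus"}, {"t": 0.3, "type": "click"}],
        [{"t": 0.0, "type": "focus"}, {"t": 6.1, "type": "key"}, {"t": 7.0, "type": "click"}],
        [{"t": 0.0, "type": "focus"}, {"t": 0.2, "type": "click"}],
    ]
    unit_correct = [1, 0, 1, 0]   # gold labels for the objective task
    unit_worker = [0, 0, 1, 1]    # which worker produced each unit

    X = np.array([unit_features(log) for log in unit_logs])

    # (1) Quality prediction for objective tasks: predict per-unit correctness.
    quality_model = RandomForestClassifier(random_state=0).fit(X, unit_correct)

    # (3) Unsupervised worker categorization: cluster workers by their mean features.
    profiles = np.array([X[np.array(unit_worker) == w].mean(axis=0) for w in (0, 1)])
    clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(profiles)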