Informed consent is a core cornerstone of ethics in human subject research. Through the informed consent process, participants learn about the study procedure, benefits, risks, and more to make an informed decision. However, recent studies showed that current practices might lead to uninformed decisions and expose participants to unknown risks, especially in online studies. Without the researcher's presence and guidance, online participants must read a lengthy form on their own with no answers to their questions. In this paper, we examined the role of an AI-powered chatbot in improving informed consent online. By comparing the chatbot with form-based interaction, we found the chatbot improved consent form reading, promoted participants' feelings of agency, and closed the power gap between the participant and the researcher. Our exploratory analysis further revealed the altered power dynamic might eventually benefit study response quality. We discussed design implications for creating AI-powered chatbots to offer effective informed consent in broader settings.
Artificially intelligent (AI) agents such as robots are increasingly delegated power in work settings, yet it remains unclear how power functions in interactions with both humans and robots, especially when they directly compete for influence. Here we present an experiment where every participant was matched with one human and one robot to perform decision-making tasks. By manipulating who has power, we created three conditions: human as leader, robot as leader, and a no-power-difference control. The results showed that the participants were significantly more influenced by the leader, regardless of whether the leader was a human or a robot. However, they generally held a more positive attitude toward the human than the robot, although they considered whichever was in power as more competent. This study illustrates the importance of power for future Human-Robot Interaction (HRI) and Human-AI Interaction (HAI) research, as it addresses pressing concerns of society about AI-powered intelligent agents.
A semi-automated manufacturing system that entails human intervention in the middle of the process is a representative collaborative system that requires active interaction between humans and machines. User behavior induced by the operator's decision-making process greatly impacts system operation and performance in such an environment that requires human-machine collaboration. There has been room for utilizing machine-generated data for a fine-grained understanding of the relationship between the behavior and performance of operators in the industrial domain, while multiple streams of data have been collected from manufacturing machines. In this study, we propose a large-scale data-analysis methodology that comprises data contextualization and performance modeling to understand the relationship between operator behavior and performance.
For a case study, we collected machine-generated data over 6-months periods from a highly automated machine in a large tire manufacturing facility. We devised a set of metrics consisting of six human-machine interaction factors and four work environment factors as independent variables, and three performance factors as dependent variables.
Our modeling results reveal that the performance variations can be explained by the interaction and work environment factors ($R^2$ = 0.502, 0.356, and 0.500 for the three performance factors, respectively). Finally, we discuss future research directions for the realization of context-aware computing in semi-automated systems by leveraging machine-generated data as a new modality in human-machine collaboration.
The dazzling promises of AI systems to augment humans in various tasks hinge on whether humans can appropriately rely on them. Recent research has shown that appropriate reliance is the key to achieving complementary team performance in AI-assisted decision making. This paper addresses an under-explored problem of whether the Dunning-Kruger Effect (DKE) among people can hinder their appropriate reliance on AI systems. DKE is a metacognitive bias due to which less-competent individuals overestimate their own skill and performance. Through an empirical study ($N=249$), we explored the impact of DKE on human reliance on an AI system, and whether such effects can be mitigated using a tutorial intervention that reveals the fallibility of AI advice, and exploiting logic units-based explanations to improve user understanding of AI advice. We found that participants who overestimate their performance tend to exhibit under-reliance on AI systems, which hinders optimal team performance. Logic units-based explanations did not help users in either improving the calibration of their competence or facilitating appropriate reliance. While the tutorial intervention was highly effective in helping users calibrate their self-assessment and facilitating appropriate reliance among participants with overestimated self-assessment, we found that it can potentially hurt the appropriate reliance of participants with underestimated self-assessment. Our work has broad implications on the design of methods to tackle user cognitive biases while facilitating appropriate reliance on AI systems. Our findings advance the current understanding of the role of self-assessment in shaping trust and reliance in human-AI decision making. This lays out promising future directions for relevant HCI research in this community.
If large language models like GPT-3 preferably produce a particular point of view, they may influence people's opinions on an unknown scale. This study investigates whether a language-model-powered writing assistant that generates some opinions more often than others impacts what users write -- and what they think. In an online experiment, we asked participants (N=1,506) to write a post discussing whether social media is good for society. Treatment group participants used a language-model-powered writing assistant configured to argue that social media is good or bad for society. Participants then completed a social media attitude survey, and independent judges (N=500) evaluated the opinions expressed in their writing. Using the opinionated language model affected the opinions expressed in participants' writing and shifted their opinions in the subsequent attitude survey. We discuss the wider implications of our results and argue that the opinions built into AI language technologies need to be monitored and engineered more carefully.
Language technologies have a racial bias, committing greater errors for Black users than for white users. However, little work has evaluated what effect these disparate error rates have on users themselves. The present study aims to understand if speech recognition errors in human-computer interactions may mirror the same effects as misunderstandings in interpersonal cross-race communication. In a controlled experiment (N=108), we randomly assigned Black and white participants to interact with a voice assistant pre-programmed to exhibit a high versus low error rate. Results revealed that Black participants in the high error rate condition, compared to Black participants in the low error rate condition, exhibited significantly higher levels of self-consciousness, lower levels of self-esteem and positive affect, and less favorable ratings of the technology. White participants did not exhibit this disparate pattern. We discuss design implications and the diverse research directions to which this initial study aims to contribute.