We present an experimental study that investigates how LLM-driven conversational AI tools might be weaponized to facilitate, exacerbate, or commoditize coercive control. Inspired by speculative design, we construct four scenarios that combine well-known coercive control tactics with the current capabilities of conversational AI tools. We then explore these scenarios via interactions with popular AI agents (ChatGPT, Gemini). We find that although AI tools refuse straightforward requests for harmful content, their guardrails can be circumvented via strategies such as gradual persuasion, splitting conversations, pre-prompting, and manipulating the AI agent's settings. Collectively, these strategies enable AI agents to be leveraged in ways that facilitate harassment, intimidation, gaslighting, monitoring, surveillance, and other coercive control tactics. To make these tools safer for everyone, we discuss opportunities for AI agents to resist being abused for coercive control by analyzing users' conversational patterns and by ensuring that pre-programmed settings are clearly visible to prevent covert manipulation.