199. Multi-Agent Reasoning Systems for Sensemaking and Planning

Perspectra: Choosing Your Experts Enhances Critical Thinking in Multi-Agent Research Ideation
Description

Recent advances in multi-agent systems (MAS) enable tools for information search and ideation by assigning personas to agents. However, how users can effectively control, steer, and critically evaluate collaboration among multiple domain-expert agents remains underexplored. We present Perspectra, an interactive MAS that visualizes and structures deliberation among LLM agents via a forum-style interface, supporting @-mentions to invite targeted agents, threading for parallel exploration, and a real-time mind map that visualizes arguments and rationales. In a within-subjects study with 18 participants, we compared Perspectra to a group-chat baseline as participants developed research proposals. Our findings show that Perspectra significantly increased the frequency and depth of critical-thinking behaviors, elicited more interdisciplinary replies, and led to more frequent proposal revisions than the group-chat condition. We discuss implications for designing multi-agent tools that scaffold critical thinking by supporting user control over multi-agent adversarial discourse.
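
The @-mention routing described above can be sketched in a few lines. This is a minimal illustration, not Perspectra's implementation: the agent names, the stub reply functions, and the broadcast-by-default rule are all assumptions.

```python
import re

# Hypothetical persona agents; in Perspectra these would be LLM calls
# with domain-expert personas. The stubs here only echo the message.
AGENTS = {
    "Economist": lambda msg: f"[Economist] On '{msg}': consider market incentives.",
    "Ethicist": lambda msg: f"[Ethicist] On '{msg}': consider fairness implications.",
}

def route_message(post: str) -> list[str]:
    """Send a post to every agent @-mentioned in it; if no agent is
    mentioned, broadcast to all agents (an assumed default)."""
    mentioned = re.findall(r"@(\w+)", post)
    targets = [name for name in mentioned if name in AGENTS] or list(AGENTS)
    body = re.sub(r"@\w+\s*", "", post).strip()
    return [AGENTS[name](body) for name in targets]
```

Targeted invitation then reduces to writing `@Ethicist is this fair?`, which reaches only that persona, while an unaddressed post fans out to the whole panel.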

From Conversation to Human-AI Common Ground: Extracting Cognitive Workflows for Reuse in Sense-making Tasks
Description

Knowledge workers increasingly rely on conversational AI for sense-making tasks (e.g., conducting market analysis), yet must repeatedly reconstruct context and intent to meet their goals. A formative study (N=10) showed that workflow reuse with AI often failed. Current tools either only remember preferences or enforce rigid, predefined workflows—neither adapts to evolving goals. We present ThinkFlow, a system that maintains a dynamic common ground through a cognitive workflow schema, enabling users to express intent and AI to adapt and reuse workflows across contexts. An expert-rating study shows that the schema can accurately capture the collocutor's reasoning process, and when reused for a similar task, improves the AI's responses compared to when the schema isn't present. A user study with eight knowledge workers demonstrates that ThinkFlow supports awareness of evolving workflows, intent expression, and flexible application across contexts.
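
The idea of a reusable "cognitive workflow schema" can be illustrated with a toy sketch. The field names and the market-analysis example below are assumptions for demonstration, not ThinkFlow's actual schema.

```python
# A hypothetical workflow schema: a recorded intent plus step
# templates that can be re-instantiated in a new context.
MARKET_ANALYSIS = {
    "intent": "compare competitors in a market",
    "steps": [
        "list major players in {market}",
        "compare their pricing in {market}",
        "summarize risks of entering {market}",
    ],
}

def instantiate(schema: dict, **bindings) -> list[str]:
    """Re-apply a recorded workflow to a new context by filling its
    step templates with new bindings, avoiding re-specifying intent."""
    return [step.format(**bindings) for step in schema["steps"]]
```

Reusing the schema for a similar task is then one call, e.g. `instantiate(MARKET_ANALYSIS, market="EV batteries")`, rather than reconstructing the whole conversation.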

NarrativeLoom: Enhancing Creative Storytelling through Multi-Persona Collaborative Improvisation
Description

Large Language Models show promise for AI-assisted storytelling, yet current tools often generate predictable, unoriginal narratives. To address this limitation, we present NarrativeLoom, a multi-persona co-creative system grounded in Campbell's Blind Variation and Selective Retention theory. NarrativeLoom deploys specialized Artificial Intelligence (AI) personas to generate diverse narrative options (blind variation), while users act as creative directors to select and refine them (selective retention). In a controlled study with 50 participants, we found that stories co-authored with NarrativeLoom were not only perceived by users as more novel and diverse but also objectively rated by experts as significantly better across all Torrance Test creativity dimensions: fluency, flexibility, originality, and elaboration. Stories were also significantly longer, with richer settings and more dialogue. Writing expertise emerged as a moderator: novices benefited more from structured scaffolding. These findings demonstrate the value of theory-informed co-creative systems and the importance of adapting them to varying user expertise. Project page: https://ppyyqq.github.io/narrativeloom.
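
One blind-variation / selective-retention step can be sketched as below. The variant generator is a stub standing in for the system's LLM persona calls, and the twist list and selection interface are invented for illustration.

```python
import random

def generate_variants(premise: str, n: int, rng: random.Random) -> list[str]:
    """Blind variation: propose n diverse continuations. A stub here;
    in NarrativeLoom, distinct AI personas would generate these."""
    twists = ["a storm hits", "a stranger arrives", "a secret surfaces"]
    return [f"{premise}, then {rng.choice(twists)}" for _ in range(n)]

def bvsr_step(story: str, pick, n: int = 3, seed: int = 0) -> str:
    """One iteration of the loop: generate options, then let the user
    (the `pick` callback) act as the selective-retention filter."""
    rng = random.Random(seed)
    options = generate_variants(story, n, rng)
    return options[pick(options)]
```

A creative director choosing the first option would call `bvsr_step("A town by the sea", pick=lambda opts: 0)`; repeated steps grow the story while keeping the human in the selection role.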

CritiqueCrew: Orchestrating Multi-Perspective Conversational Design Critique
Description

UI designers face growing cognitive load and cross-functional friction at the intersection of user needs, business goals, and engineering constraints. Existing automated tools often deliver static "problem lists," lacking actionable repair paths and disrupting creative flow. We introduce CritiqueCrew, a Figma tool that supports designers through conversational critique. CritiqueCrew generates multi-faceted insights by implementing a multi-perspective orchestration of distinct expert roles (UX, PM, Engineer). It translates abstract critiques into concrete actions via in-context feedback and interactive remediation. Across two independent controlled studies (Total N=48), CritiqueCrew significantly improved both design quality and subjective experience compared to a traditional static checker. Furthermore, our results confirm that the structured orchestration of expert roles—rather than a unified model—is key to fostering trust and creativity support. Our work demonstrates how AI can shift from a "problem auditor" to a "solution co-creator" by integrating multi-perspective dialogue with interactive repair, offering design implications for future creative tools.
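
The multi-perspective orchestration of distinct expert roles might look like the following sketch. The role prompts are stubs and the simple concatenation strategy is an assumption, not CritiqueCrew's design.

```python
# Hypothetical expert roles, each critiquing the same design artifact
# from its own perspective; in CritiqueCrew these would be LLM calls.
ROLES = {
    "UX": lambda d: f"UX: does '{d}' keep the primary action visible?",
    "PM": lambda d: f"PM: does '{d}' serve the conversion goal?",
    "Engineer": lambda d: f"Engineer: is '{d}' feasible with current components?",
}

def orchestrate(design: str) -> str:
    """Collect one critique per role and merge them into a single
    conversational reply, rather than querying one unified model."""
    return "\n".join(fn(design) for fn in ROLES.values())
```

Keeping the roles separate (rather than one merged prompt) mirrors the paper's finding that structured orchestration of distinct roles, not a unified model, fosters trust and creativity support.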

DuoDrama: Supporting Screenplay Refinement Through LLM-Assisted Human Reflection
Description

AI has been increasingly integrated into screenwriting practice. During refinement, screenwriters expect AI to provide feedback that supports reflection across the internal perspective of characters and the external perspective of the overall story. However, existing AI tools cannot sufficiently coordinate the two perspectives to meet screenwriters' needs. To address this gap, we present DuoDrama, an AI system that generates feedback to assist screenwriters' reflection during refinement. Based on performance theories and a formative study with nine professional screenwriters, we design the Experience-Grounded Feedback Generation Workflow for Human Reflection (ExReflect), which underlies DuoDrama. In ExReflect, an AI agent first adopts an experience role to generate an experience of the screenplay and then shifts to an evaluation role to generate feedback grounded in that experience. A study with fourteen professional screenwriters shows that DuoDrama improves feedback quality and alignment and enhances the effectiveness, depth, and richness of reflection. We conclude by discussing broader implications and future directions.
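
The two-role shift in ExReflect can be sketched as a pipeline. Both role functions are stubs standing in for prompted LLM calls; their outputs and the plain string handoff are assumptions for illustration.

```python
def experience_role(scene: str) -> str:
    """Internal perspective: the agent first 'lives' the scene as a
    character (a stub for a persona-prompted LLM call)."""
    return f"As the character, in '{scene}' I feel exposed and unheard."

def evaluation_role(scene: str, experience: str) -> str:
    """External perspective: feedback on the overall story, grounded
    in the experience produced by the first role."""
    return (f"Feedback on '{scene}': the character's experience "
            f"({experience}) suggests sharpening their motivation.")

def ex_reflect(scene: str) -> str:
    """Run the experience role, then hand its output to the
    evaluation role, coordinating the two perspectives."""
    experience = experience_role(scene)
    return evaluation_role(scene, experience)
```

The key structural point is the handoff: the evaluation role never sees the scene alone, only the scene plus the generated experience.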

All Futures at Once: Supporting Speculative Design for Placemaking with Multi-Agent Social Simulation
Description

Placemaking transforms physical spaces into socially meaningful places, with long-term impacts depending on how future communities inhabit and interact with them. Speculative design helps envision such futures, yet existing approaches often produce static representations that emphasize spatial form over evolving activity. We present ParaScape, a design support system that facilitates speculative design for placemaking by generating dynamic speculative objects through an underlying LLM-based multi-agent social simulation framework. The framework models heterogeneous agents with group-specific preferences and sensitivities, simulating context-sensitive behaviors and interactions that produce evolving scenarios. These scenarios are visualized as image sequences, where each scenario depicts multiple activities unfolding within a place at a given moment. ParaScape builds on this framework to allow designers to explore scenarios, analyze activity diversity and evolvability, and reflect on trade-offs among stakeholder needs. Evaluations through two experiments, a user study, and two case studies show that ParaScape supports critical reasoning and inclusive placemaking.
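
Heterogeneous agents with group-specific preferences can be sketched minimally as below. The groups, preference weights, and activities are invented for illustration; ParaScape's framework models far richer context-sensitive behavior.

```python
# Hypothetical group-specific preference weights over place activities.
PREFS = {
    "parents": {"playground": 0.9, "market": 0.5, "concert": 0.2},
    "students": {"playground": 0.2, "market": 0.4, "concert": 0.9},
}

def choose_activity(group: str, open_activities: list[str]) -> str:
    """Each agent picks whichever currently open activity its group
    weighs highest; unknown activities default to zero appeal."""
    return max(open_activities, key=lambda a: PREFS[group].get(a, 0.0))
```

Running such choices over many agents and time steps is what yields diverging, evolving scenarios for the same space, which a system like ParaScape can then render as image sequences for designers to compare.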

DiLLS: Interactive Diagnosis of LLM-based Multi-agent Systems via Layered Summary of Agent Behaviors
Description

Large language model (LLM)-based multi-agent systems have demonstrated impressive capabilities in handling complex tasks. However, the complexity of agentic behaviors makes these systems difficult to understand. When failures occur, developers often struggle to identify root causes and to determine actionable paths for improvement. Traditional methods that rely on inspecting raw log records are inefficient, given both the large volume and complexity of data. To address this challenge, we propose a framework and an interactive system, DiLLS, designed to reveal and structure the behaviors of multi-agent systems. The key idea is to organize information across three levels of query completion: activities, actions, and operations. By probing the multi-agent system through natural language, DiLLS derives and organizes information about planning and execution into a structured, multi-layered summary. Through a user study, we show that DiLLS significantly improves developers’ effectiveness and efficiency in identifying, diagnosing, and understanding failures in LLM-based multi-agent systems.
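
The three-level organization (activities, actions, operations) can be modeled with a small layered structure. The class names, fields, and the summary format below are assumptions for illustration, not DiLLS's data model.

```python
from dataclasses import dataclass, field

@dataclass
class Operation:
    """Lowest level: a single tool call or LLM step."""
    name: str

@dataclass
class Action:
    """Middle level: a coherent step taken by one agent."""
    agent: str
    operations: list[Operation] = field(default_factory=list)

@dataclass
class Activity:
    """Top level: a phase of query completion."""
    goal: str
    actions: list[Action] = field(default_factory=list)

def summarize(activity: Activity) -> str:
    """Collapse a layered trace into one line per action, so a
    developer reads a summary instead of raw log records."""
    lines = [f"Activity: {activity.goal}"]
    for act in activity.actions:
        ops = ", ".join(op.name for op in act.operations)
        lines.append(f"  {act.agent}: {ops}")
    return "\n".join(lines)
```

Drilling from the summary back down to individual operations is what lets a developer localize a failure without scanning the full log volume.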
