Teamwork Triumphs: Collaborative Experiences

Conference Name
UIST 2023
VRoxy: Wide-Area Collaboration From an Office Using a VR-Driven Robotic Proxy
Abstract

Recent research on robotic proxies has demonstrated that many non-verbal cues important in co-located collaboration can be reproduced automatically. However, these systems often require a symmetrical hardware setup in each location. We present the VRoxy system, designed to enable access to remote spaces through a robotic embodiment, using a VR headset in a much smaller space, such as a personal office. VRoxy maps small movements in VR space to larger movements in the physical space of the robot, allowing the user to navigate large physical spaces easily. Using VRoxy, the VR user can quickly explore and navigate a low-fidelity rendering of the remote space. Upon the robot's arrival, the system uses the feed of a 360° camera to support real-time interactions. The system also facilitates various interaction modalities by rendering micro-mobility around shared spaces, head and facial animations, and pointing gestures on the proxy. We demonstrate how our system can accommodate mapping multiple physical locations onto a unified virtual space. In a formative study, users completed a design-decision task in which they navigated and collaborated in a complex 7.5m x 5m layout using a 3m x 2m VR space.
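The core navigation idea, mapping a small VR workspace onto a larger remote space, can be sketched as a simple per-axis scaling. This is a hypothetical, minimal illustration (the function name and 2D simplification are assumptions; the actual system also handles robot path planning and real-time rendering, which this omits), using the study's 3m x 2m VR space and 7.5m x 5m remote layout as defaults:

```python
def vr_to_remote(pos_vr, vr_size=(3.0, 2.0), remote_size=(7.5, 5.0)):
    """Map a 2D position (meters) in the VR space onto the remote space.

    Each axis is scaled independently, so a small step in VR becomes a
    proportionally larger movement of the robotic proxy.
    """
    scale_x = remote_size[0] / vr_size[0]  # 2.5x along x with the defaults
    scale_y = remote_size[1] / vr_size[1]  # 2.5x along y with the defaults
    return (pos_vr[0] * scale_x, pos_vr[1] * scale_y)
```

With the default sizes, walking to the far corner of the 3m x 2m VR space maps to the far corner of the 7.5m x 5m remote layout.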

Authors
Mose Sakashita
Cornell University, Ithaca, New York, United States
Hyunju Kim
Cornell University, Ithaca, New York, United States
Brandon J. Woodard
Brown University, Providence, Rhode Island, United States
Ruidong Zhang
Cornell University, Ithaca, New York, United States
Francois Guimbretiere
Cornell University, Ithaca, New York, United States
Paper URL

https://doi.org/10.1145/3586183.3606743

Video
CrossTalk: Intelligent Substrates for Language-Oriented Interaction in Video-Based Communication and Collaboration
Abstract

Despite the advances and ubiquity of digital communication media such as videoconferencing and virtual reality, they remain oblivious to the rich intentions expressed by users. Beyond transmitting audio, video, and messages, we envision digital communication media as proactive facilitators that can provide unobtrusive assistance to enhance communication and collaboration. Informed by the results of a formative study, we propose three key design concepts for systematically integrating intelligence into communication and collaboration: the panel substrate, language-based intent recognition, and lightweight interaction techniques. We developed CrossTalk, a videoconferencing system that instantiates these concepts, which we found to enable a more fluid and flexible communication and collaboration experience.

Authors
Haijun Xia
University of California, San Diego, San Diego, California, United States
Tony Wang
University of California San Diego, La Jolla, California, United States
Aditya Gunturu
Manipal Institute of Technology, Manipal, Karnataka, India
Peiling Jiang
University of California San Diego, San Diego, California, United States
William Duan
University of California San Diego, San Diego, California, United States
Xiaoshuo Yao
University of California, San Diego, San Diego, California, United States
Paper URL

https://doi.org/10.1145/3586183.3606773

Video
Going Incognito in the Metaverse: Achieving Theoretically Optimal Privacy-Usability Tradeoffs in VR
Abstract

Virtual reality (VR) telepresence applications and the so-called "metaverse" promise to be the next major medium of human-computer interaction. However, with recent studies demonstrating the ease with which VR users can be profiled and deanonymized, metaverse platforms carry many of the privacy risks of the conventional internet (and more) while at present offering few of the defensive utilities that users are accustomed to having access to. To remedy this, we present the first known method of implementing an "incognito mode" for VR. Our technique leverages local ε-differential privacy to quantifiably obscure sensitive user data attributes, with a focus on intelligently adding noise when and where it is needed most to maximize privacy while minimizing usability impact. Our system is capable of flexibly adapting to the unique needs of each VR application to further optimize this trade-off. We implement our solution as a universal Unity (C#) plugin that we then evaluate using several popular VR applications. Upon faithfully replicating the most well-known VR privacy attack studies, we show a significant degradation of attacker capabilities when using our solution.
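The noise-adding step in local ε-differential privacy is typically the Laplace mechanism, which can be sketched as follows. This is a minimal illustration, not the paper's Unity plugin; the function names and the example attribute are hypothetical, and the paper's contribution is in adaptively choosing when and where to add such noise:

```python
import math
import random

def laplace_noise(scale):
    # Sample from Laplace(0, scale) via inverse-CDF sampling.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def privatize(value, epsilon, sensitivity):
    """Release `value` under local epsilon-DP via the Laplace mechanism.

    `sensitivity` bounds how much the attribute can differ between any
    two users (e.g. a plausible range of body heights, in meters).
    Smaller epsilon means more noise: more privacy, less usability.
    """
    return value + laplace_noise(sensitivity / epsilon)
```

For example, a tracked body height could be passed through `privatize` before it reaches an untrusted VR application; the trade-off is tuned per attribute by choosing ε.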

Award
Best Paper
Authors
Vivek C. Nair
University of California, Berkeley, Berkeley, California, United States
Gonzalo Munilla-Garrido
Technical University of Munich, Munich, Germany
Dawn Song
University of California, Berkeley, Berkeley, California, United States
Paper URL

https://doi.org/10.1145/3586183.3606754

Video
The View from MARS: Empowering Game Stream Viewers with Metadata Augmented Real-time Streaming
Abstract

We present MARS (Metadata Augmented Real-time Streaming), a system that enables game-aware streaming interfaces for Twitch. Current streaming interfaces provide a video stream of gameplay and a chat channel for conversation, but do not allow viewers to interact with game content independently of the streamer or other viewers. With MARS, a Unity game's metadata is rendered in real-time onto a Twitch viewer's interface. The metadata can then power viewer-side interfaces that are aware of the streamer's game activity and provide new capabilities for viewers. Use cases include providing contextual information (e.g., clicking on a unit to learn more), improving accessibility (e.g., slowing down text presentation speed), and supporting novel stream-based game designs (e.g., asymmetric designs where the viewers know more than the streamer). We share the details of MARS' architecture and capabilities in this paper, and showcase a working prototype for each of our three proposed use cases.
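The contextual-information use case, a viewer clicking a unit on the stream to learn more, can be sketched as a lookup from click coordinates into a metadata frame. This is a hypothetical simplification (the JSON schema, field names, and unit descriptions are invented for illustration; MARS itself streams Unity metadata to the Twitch viewer interface):

```python
import json

# Hypothetical metadata frame, as the game might emit it each tick:
# each unit carries a screen-space bounding box and contextual info.
frame_json = json.dumps({
    "units": [
        {"id": "knight-1", "x": 120, "y": 340, "w": 32, "h": 48,
         "info": "Knight: melee unit, strong vs. cavalry"},
        {"id": "archer-2", "x": 400, "y": 200, "w": 32, "h": 48,
         "info": "Archer: ranged unit, weak in close combat"},
    ]
})

def unit_at(frame_json, click_x, click_y):
    """Return contextual info for the unit under a viewer's click, if any."""
    frame = json.loads(frame_json)
    for unit in frame["units"]:
        if (unit["x"] <= click_x < unit["x"] + unit["w"]
                and unit["y"] <= click_y < unit["y"] + unit["h"]):
            return unit["info"]
    return None
```

Because the lookup runs entirely on the viewer's side, it works without interrupting the streamer or other viewers.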

Authors
Noor Hammad
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Erik Harpstead
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Jessica Hammer
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Paper URL

https://doi.org/10.1145/3586183.3606753

Video
WorldSmith: A Multi-Modal Image Synthesis Tool for Fictional World Building
Abstract

Crafting a rich and unique environment is crucial for fictional world-building, but can be difficult to achieve, since illustrating a world from scratch requires time and significant skill. We investigate the use of recent multi-modal image generation systems to enable users to iteratively visualize and modify elements of their fictional world using a combination of text input, sketching, and region-based filling. WorldSmith enables novice world builders to quickly visualize a fictional world with layered edits and hierarchical compositions. Through a formative study (4 participants) and a first-use study (13 participants), we demonstrate that WorldSmith offers more expressive interactions with prompt-based models. With this work, we explore how creatives can be empowered to leverage prompt-based generative AI as a tool in their creative process, beyond current "click-once" prompting UI paradigms.

Authors
Hai Dang
University of Bayreuth, Bayreuth, Germany
Frederik Brudy
Autodesk Research, Toronto, Ontario, Canada
George Fitzmaurice
Autodesk Research, Toronto, Ontario, Canada
Fraser Anderson
Autodesk Research, Toronto, Ontario, Canada
Paper URL

https://doi.org/10.1145/3586183.3606772

Video
WavoID: Robust and Secure Multi-modal User Identification via mmWave-voice Mechanism
Abstract

With the increasing deployment of voice-controlled devices in homes and enterprises, there is an urgent demand for voice identification to prevent unauthorized access to sensitive information and property loss. However, due to the broadcast nature of sound waves, a voice-only system is vulnerable to adverse conditions and malicious attacks. We observe that the cooperation of millimeter waves (mmWave) and voice signals can significantly improve the effectiveness and security of user identification. Based on these properties, we propose a multi-modal user identification system (named WavoID) that fuses the uniqueness of mmWave-sensed vocal vibration and the mic-recorded voice of users. To estimate fine-grained waveforms, WavoID splits signals and adaptively combines useful decomposed signals according to correlative content in both mmWave and voice. An elaborate anti-spoofing module in WavoID, built on the bimodal biometric information, defends against attacks. WavoID produces and fuses the response maps of mmWave and voice to improve the representational power of the fused features, enabling accurate identification even in adverse circumstances. We evaluate WavoID using commercial sensors in extensive experiments. WavoID achieves strong identification performance, with over 98% accuracy on a 100-user dataset.
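The idea of combining two biometric modalities into one decision can be sketched as score-level fusion of per-modality match scores. This is a hypothetical simplification for illustration (function names, the fixed weight, and the threshold are assumptions; WavoID actually fuses response maps of learned features rather than scalar scores):

```python
def fuse_scores(mmwave_score, voice_score, w_mmwave=0.5):
    """Combine per-modality match scores (each in [0, 1]) into one score."""
    return w_mmwave * mmwave_score + (1.0 - w_mmwave) * voice_score

def identify(scores_by_user, threshold=0.8):
    """Pick the best-matching enrolled user, or None if no fused score
    reaches the acceptance threshold.

    scores_by_user: {user_id: (mmwave_score, voice_score)}
    """
    best_user, best_score = None, 0.0
    for user, (mm, vo) in scores_by_user.items():
        fused = fuse_scores(mm, vo)
        if fused > best_score:
            best_user, best_score = user, fused
    return best_user if best_score >= threshold else None
```

The benefit of the fusion is that an attacker who spoofs one channel (e.g. a voice replay) still scores poorly on the other, pulling the fused score below the acceptance threshold.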

Authors
Tiantian Liu
Zhejiang University, Hangzhou, China
Feng Lin
Zhejiang University, Hangzhou, Zhejiang, China
Chao Wang
Zhejiang University, Hangzhou, Zhejiang, China
Chenhan Xu
University at Buffalo, Buffalo, New York, United States
Xiaoyu Zhang
Computer Science and Engineering, Buffalo, New York, United States
Zhengxiong Li
University of Colorado Denver, Denver, Colorado, United States
Wenyao Xu
University at Buffalo, Buffalo, New York, United States
Ming-Chun Huang
Duke Kunshan University, Kunshan, China
Kui Ren
Zhejiang University, Hangzhou, China
Paper URL

https://doi.org/10.1145/3586183.3606775

Video