Live and pre-recorded video tutorials are an effective means for teaching physical skills such as cooking or prototyping electronics. A dedicated cameraperson following an instructor’s activities can improve production quality. However, instructors who do not have access to a cameraperson’s help often have to work within the constraints of static cameras. We present Stargazer, a novel approach to assisting tutorial content creation, using a camera robot that autonomously tracks regions of interest based on instructor actions to capture dynamic shots. Instructors can adjust the camera behaviors of Stargazer with subtle cues, including gestures and speech, allowing them to fluidly integrate camera control commands into instructional activities. Our user study with six instructors, each teaching a distinct skill, showed that participants could create dynamic tutorial videos with a diverse range of subjects, camera framing, and camera angle combinations using Stargazer.
https://doi.org/10.1145/3544548.3580896
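To make the region-of-interest tracking concrete, the following sketch shows one simple way a camera robot could keep a tracked region centered in the frame using proportional control. The frame resolution, gains, sign convention, and tracker output format are illustrative assumptions, not Stargazer's actual control code.

```python
# Minimal sketch (not Stargazer's implementation): a proportional controller
# that nudges a camera robot's pan/tilt so a tracked region of interest stays
# centered. Resolution, gains, and sign convention are assumptions.
from dataclasses import dataclass

FRAME_W, FRAME_H = 1920, 1080        # assumed capture resolution
PAN_GAIN, TILT_GAIN = 0.002, 0.002   # assumed gains (degrees per pixel of error)

@dataclass
class RegionOfInterest:
    cx: float  # center x of the tracked region, in pixels
    cy: float  # center y of the tracked region, in pixels

def pan_tilt_update(roi: RegionOfInterest) -> tuple[float, float]:
    """Return (pan_delta_deg, tilt_delta_deg) that moves the camera toward the ROI."""
    error_x = roi.cx - FRAME_W / 2
    error_y = roi.cy - FRAME_H / 2
    return (PAN_GAIN * error_x, TILT_GAIN * error_y)

# Example: the instructor's hands drift toward the right of the frame,
# so the controller pans slightly to follow them.
print(pan_tilt_update(RegionOfInterest(cx=1400, cy=560)))
```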
How-to videos are rich in information---they not only give instructions but also provide justifications or descriptions. People seek different information to meet their needs, and identifying the different types of information present in a video can improve access to the desired knowledge. Thus, we present a taxonomy of information types in how-to videos. Through iterative open coding of 4k sentences in 48 videos, we identified 21 information types under 8 categories. The taxonomy represents the diverse information types that instructors provide beyond instructions. We first show how our taxonomy can serve as an analytical framework for video navigation systems. Then, we demonstrate through a user study (n=9) how type-based navigation helps participants locate the information they need. Finally, we discuss how the taxonomy enables a wide range of video-related tasks, such as video authoring, viewing, and analysis. To allow researchers to build upon our taxonomy, we release a dataset of 120 videos containing 9.9k sentences labeled using the taxonomy.
https://doi.org/10.1145/3544548.3581126
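As an illustration of how such type labels can support navigation, the sketch below filters a labeled transcript down to the timestamps of a single information type. The record schema, type names, and example sentences are hypothetical stand-ins rather than the released dataset's actual format.

```python
# Minimal sketch of type-based navigation over a labeled how-to transcript.
# The (start time, information type, text) schema is a hypothetical stand-in.
labeled_sentences = [
    {"start": 12.0, "type": "instruction",   "text": "Whisk the eggs until foamy."},
    {"start": 19.5, "type": "justification", "text": "This keeps the batter light."},
    {"start": 27.3, "type": "instruction",   "text": "Fold in the flour gently."},
]

def jump_points(sentences, info_type):
    """Return (timestamp, text) pairs for all sentences of one information type."""
    return [(s["start"], s["text"]) for s in sentences if s["type"] == info_type]

# A viewer who only wants the steps can jump directly between instructions.
for t, text in jump_points(labeled_sentences, "instruction"):
    print(f"{t:6.1f}s  {text}")
```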
Sighted creators and blind and low vision (BLV) creators alike use videos to communicate with broad audiences. Yet, video editing remains inaccessible to BLV creators. Our formative study revealed that current video editing tools make it difficult to access the visual content, assess the visual quality, and efficiently navigate the timeline. We present AVscript, an accessible text-based video editor. AVscript enables users to edit their video using a script that embeds the video's visual content, visual errors (e.g., dark or blurred footage), and speech. Users can also efficiently navigate between scenes and visual errors or locate objects in the frame or spoken words of interest. A comparison study (N=12) showed that AVscript significantly lowered BLV creators' mental demands while increasing confidence and independence in video editing. We further demonstrate the potential of AVscript through an exploratory study (N=3) in which BLV creators edited their own footage.
https://doi.org/10.1145/3544548.3581494
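To illustrate the kind of visual-error signals such a script could surface, the sketch below flags dark or blurred frames using mean brightness and variance of the Laplacian. These heuristics and thresholds are common stand-ins for illustration, not necessarily AVscript's actual detectors.

```python
# Minimal sketch of flagging dark or blurred footage, one heuristic per sampled
# frame. Thresholds are assumptions; the Laplacian-variance blur measure is a
# common stand-in technique, not necessarily what AVscript uses.
import cv2

DARK_THRESHOLD = 40.0    # assumed mean-brightness cutoff (0-255 grayscale)
BLUR_THRESHOLD = 100.0   # assumed variance-of-Laplacian cutoff

def flag_visual_errors(path, sample_every=30):
    cap = cv2.VideoCapture(path)
    flags, frame_idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_idx % sample_every == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            if gray.mean() < DARK_THRESHOLD:
                flags.append((frame_idx, "dark"))
            elif cv2.Laplacian(gray, cv2.CV_64F).var() < BLUR_THRESHOLD:
                flags.append((frame_idx, "blurred"))
        frame_idx += 1
    cap.release()
    return flags
```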
Video is an effective medium for learning procedural knowledge, such as surgical techniques. However, learning procedural knowledge through videos remains difficult due to limited access to the procedural structure of knowledge (e.g., the composition and ordering of steps) across large-scale video datasets. We present Surch, a system that enables structural search and comparison of surgical procedures. Surch supports video search based on procedural graphs generated by our clustering workflow, which captures latent patterns within surgical procedures. We used vectorization and weighting schemes that characterize the features of procedures, such as recursive structures and unique paths. Surch enhances cross-video comparison by providing video navigation synchronized by surgical steps. Evaluation of the workflow demonstrates the effectiveness and interpretability (Silhouette score = 0.82) of our clustering for surgical learning. A user study with 11 residents shows that our system significantly improves the learning experience and task efficiency of video search and comparison, especially benefiting junior residents.
https://doi.org/10.1145/3544548.3580772
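As a rough analogue of the clustering and evaluation step, the sketch below vectorizes toy step sequences with bigram counts, clusters them, and reports a silhouette score. The bigram vectorization, cluster count, and step names are illustrative assumptions standing in for Surch's actual vectorization and weighting schemes.

```python
# Minimal sketch: cluster procedures represented as step sequences, then score
# cluster quality with the silhouette coefficient. Data and vectorization are
# illustrative stand-ins for Surch's workflow.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score

# Each procedure is a sequence of step labels (hypothetical data).
procedures = [
    "incision dissection retraction suturing",
    "incision dissection suturing",
    "incision retraction dissection retraction suturing",
    "incision cauterization dissection suturing",
]

# Encode each procedure by its step bigrams to capture local ordering.
X = CountVectorizer(ngram_range=(2, 2)).fit_transform(procedures).toarray()

labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)
print("clusters:", labels)
print("silhouette:", silhouette_score(X, labels))
```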
Multi-stage programming tutorials are key learning resources for programmers, using progressive incremental steps to teach them how to build larger software systems. A good multi-stage tutorial describes the code clearly, explains the rationale and code changes for each step, and allows readers to experiment as they work through the tutorial. In practice, it is time-consuming for authors to create tutorials with these attributes. In this paper, we introduce Colaroid, an interactive authoring tool for creating high quality multi-stage tutorials. Colaroid tutorials are augmented computational notebooks, where snippets and outputs represent a snapshot of a project, with source code differences highlighted, complete source code context for each snippet, and the ability to load and tinker with any stage of the project in a linked IDE. In two laboratory studies, we found Colaroid makes it easy to create multi-stage tutorials, while offering advantages to readers compared to video and web-based tutorials.
https://doi.org/10.1145/3544548.3581525
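To show the flavor of stage-to-stage difference highlighting, the sketch below renders the change between two snapshot stages as a unified diff using only the Python standard library; Colaroid's in-notebook rendering is richer, and the example snippets are made up.

```python
# Minimal sketch: show the code change between two tutorial stages as a
# unified diff (a plain stand-in for Colaroid's highlighted differences).
import difflib

stage_1 = """def greet(name):
    print("Hello, " + name)
"""

stage_2 = """def greet(name, excited=False):
    suffix = "!" if excited else "."
    print("Hello, " + name + suffix)
"""

diff = difflib.unified_diff(
    stage_1.splitlines(keepends=True),
    stage_2.splitlines(keepends=True),
    fromfile="stage_1/greet.py",   # hypothetical snapshot paths
    tofile="stage_2/greet.py",
)
print("".join(diff))
```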
We present Escapement, a video prototyping tool that introduces a powerful new concept for prototyping screen-based interfaces by flexibly mapping sensor values to dynamic playback control of videos. This recasts the time dimension of video mock-ups as sensor-mediated interaction. This abstraction of time as interaction, which we dub video-escapement prototyping, empowers designers to rapidly explore and viscerally experience direct touch or sensor-mediated interactions across one or more device displays. Our system affords cross-device and bidirectional remote (tele-present) experiences via cloud-based state sharing across multiple devices. This makes Escapement especially potent for exploring multi-device, dual-screen, or remote-work interactions for screen-based applications. We introduce the core concept of sensor-mediated abstraction of time for quickly generating video-based interactive prototypes of screen-based applications, share the results of observations of long-term usage of video-escapement techniques with experienced interaction designers, and articulate design choices for supporting a reflective, iterative, and open-ended creative design process.
https://doi.org/10.1145/3544548.3581115
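To make the time-as-interaction abstraction concrete, the sketch below maps a sensor reading linearly onto a frame index and seeks a mock-up video to that frame. The sensor range, file name, and OpenCV-based seek are assumptions for illustration, not Escapement's implementation.

```python
# Minimal sketch of the video-escapement idea: a sensor reading, rather than a
# clock, drives playback position. Sensor range and file name are hypothetical.
import cv2

def sensor_to_frame(sensor_value, sensor_min, sensor_max, frame_count):
    """Linearly map a raw sensor value onto a frame index in the mock-up video."""
    t = (sensor_value - sensor_min) / (sensor_max - sensor_min)
    t = min(max(t, 0.0), 1.0)                    # clamp to the valid range
    return int(round(t * (frame_count - 1)))

cap = cv2.VideoCapture("mockup.mp4")             # hypothetical prototype video
frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

# e.g., a rotation sensor reporting 0-270 degrees scrubs the playback position:
idx = sensor_to_frame(135.0, 0.0, 270.0, frame_count)
cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
ok, frame = cap.read()                           # the frame shown for this sensor pose
```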