Photo & video manipulation

Paper session

Conference name
CHI 2020
Data-driven Multi-level Segmentation of Image Editing Logs
Abstract

Automatic segmentation of logs for creativity tools such as image editing systems could improve their usability and learnability by supporting such interaction use cases as smart history navigation or recommending alternative design choices. We propose a multi-level segmentation model that works for many image editing tasks including poster creation, portrait retouching, and special effect creation. The lowest-level chunks of logged events are computed using a support vector machine model and higher-level chunks are built on top of these, at a level of granularity that can be customized for specific use cases. Our model takes into account features derived from four event attributes collected in realistically complex Photoshop sessions with expert users: command, timestamp, image content, and artwork layer. We present a detailed analysis of the relevance of each feature and evaluate the model using both quantitative performance metrics and qualitative analysis of sample sessions.
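As a rough illustration of the lowest-level chunking step described above, the Python sketch below classifies whether a chunk boundary falls between two consecutive logged events using a generic SVM over features from the four event attributes. The feature definitions, field names, and model settings are assumptions for illustration, not the paper's actual design.

# Hypothetical sketch: decide whether a boundary falls between consecutive
# events using features derived from command, timestamp, image content,
# and artwork layer. Feature choices here are illustrative only.
from sklearn.svm import SVC

def event_pair_features(prev, curr):
    return [
        1.0 if prev["command"] == curr["command"] else 0.0,    # same command repeated?
        curr["timestamp"] - prev["timestamp"],                 # pause length in seconds
        curr["image_diff"],                                    # pixel change between snapshots
        1.0 if prev["layer_id"] == curr["layer_id"] else 0.0,  # same artwork layer?
    ]

def train_boundary_model(sessions, labels):
    # sessions: list of event lists; labels: 1 if a boundary follows each event pair
    X = [event_pair_features(a, b)
         for events in sessions for a, b in zip(events, events[1:])]
    y = [l for session_labels in labels for l in session_labels]
    model = SVC(kernel="rbf", class_weight="balanced")
    model.fit(X, y)
    return model

def segment(events, model):
    # Cut the log wherever the model predicts a boundary between adjacent events.
    chunks, current = [], [events[0]]
    for a, b in zip(events, events[1:]):
        if model.predict([event_pair_features(a, b)])[0] == 1:
            chunks.append(current)
            current = []
        current.append(b)
    chunks.append(current)
    return chunks

Higher-level chunks would then be built by grouping these low-level chunks at the desired granularity.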

Keywords
Log segmentation
Image editing logs
Interaction history
Multi-level hierarchy
Authors
Zipeng Liu
University of British Columbia, Vancouver, BC, Canada
Zhicheng Liu
Adobe Research, Seattle, WA, USA
Tamara Munzner
University of British Columbia, Vancouver, BC, Canada
DOI

10.1145/3313831.3376152

Paper URL

https://doi.org/10.1145/3313831.3376152

Video
Temporal Segmentation of Creative Live Streams
Abstract

Many artists broadcast their creative process through live streaming platforms like Twitch and YouTube, and people often watch archives of these broadcasts later for learning and inspiration. Unfortunately, because live stream videos are often multiple hours long and hard to skim and browse, few can leverage the wealth of knowledge hidden in these archives. We present an approach for automatic temporal segmentation of creative live stream videos. Using an audio transcript and a log of software usage, the system segments the video into sections that the artist can optionally label with meaningful titles. We evaluate this approach by gathering feedback from expert streamers and comparing automatic segmentations to those made by viewers. We find that, while there is no one "correct" way to segment a live stream, our automatic method performs similarly to viewers, and streamers find it useful for navigating their streams after making slight adjustments and adding section titles.
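To make the idea of combining an audio transcript with a software usage log concrete, here is a minimal Python sketch that proposes section boundaries where a long pause in speech coincides with a change in the tools being used. The gap threshold, window size, and disjoint-tool heuristic are assumptions for illustration, not the authors' algorithm.

# Illustrative heuristic: cut where speech pauses and the active tools change.
def propose_boundaries(transcript, tool_log, min_gap=20.0, window=60.0):
    # transcript: list of (start_sec, end_sec, text) utterances, in time order
    # tool_log: list of (timestamp_sec, tool_name) software-usage events
    boundaries = []
    for (s1, e1, _), (s2, _, _) in zip(transcript, transcript[1:]):
        if s2 - e1 < min_gap:            # require a long pause in speech
            continue
        before = {t for ts, t in tool_log if e1 - window <= ts <= e1}
        after = {t for ts, t in tool_log if s2 <= ts <= s2 + window}
        if before and after and before.isdisjoint(after):
            boundaries.append(e1)        # tools changed across the pause
    return boundaries

In the paper's workflow, the resulting sections are then offered to the streamer, who can adjust them and add meaningful titles.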

Keywords
live streaming
creativity
video segmentation
Authors
C. Ailie Fraser
Adobe Research & University of California, San Diego, Seattle, WA, USA
Joy O. Kim
Adobe Research, San Francisco, CA, USA
Hijung Valentina Shin
Adobe Research, Cambridge, MA, USA
Joel Brandt
Adobe Research, Santa Monica, CA, USA
Mira Dontcheva
Adobe Research, Seattle, WA, USA
DOI

10.1145/3313831.3376437

Paper URL

https://doi.org/10.1145/3313831.3376437

GAZED– Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings
Abstract

We present GAZED– eye GAZe-guided EDiting for videos captured by a solitary, static, wide-angle and high-resolution camera. Eye-gaze has been effectively employed in computational applications as a cue to capture interesting scene content; we employ gaze as a proxy to select shots for inclusion in the edited video. Given the original video, scene content and user eye-gaze tracks are combined to generate an edited video comprising cinematically valid actor shots and shot transitions, producing an aesthetic and vivid representation of the original narrative. We model cinematic video editing as an energy minimization problem over shot selection, whose constraints capture cinematographic editing conventions. Gazed scene locations primarily determine the shots constituting the edited video. The effectiveness of GAZED against multiple competing methods is demonstrated via a psychophysical study involving 12 users and 12 performance videos.
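The abstract frames editing as energy minimization over shot selection; a toy dynamic-programming sketch of that formulation is shown below. The per-frame gaze potentials, the single switch-cost term, and all weights are invented to illustrate the structure of such an optimization, not GAZED's actual energy.

# Toy sketch: pick one shot per frame to minimize
#   sum_t [ -gaze_potential[t][shot] + switch_cost * 1(shot changes) ]
# with dynamic programming over the shot sequence.
def select_shots(gaze_potential, switch_cost=1.0):
    # gaze_potential: T x S matrix; higher = shot better covers gazed content
    T, S = len(gaze_potential), len(gaze_potential[0])
    INF = float("inf")
    cost = [[INF] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    for s in range(S):
        cost[0][s] = -gaze_potential[0][s]
    for t in range(1, T):
        for s in range(S):
            for p in range(S):
                c = cost[t - 1][p] + (switch_cost if p != s else 0.0)
                if c < cost[t][s]:
                    cost[t][s], back[t][s] = c, p
            cost[t][s] -= gaze_potential[t][s]
    # Trace back the minimum-cost shot sequence.
    shots = [min(range(S), key=lambda s: cost[T - 1][s])]
    for t in range(T - 1, 0, -1):
        shots.append(back[t][shots[-1]])
    return list(reversed(shots))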

Keywords
Eye gaze
Cinematic video editing
Stage performance
Static wide-angle recording
Gaze potential
Shot selection
Dynamic programming
Authors
K. L. Bhanu Moorthy
International Institute of Information Technology, Hyderabad, Hyderabad, India
Moneish Kumar
Samsung R&D Institute Bangalore, Bangalore, India
Ramanathan Subramanian
Indian Institute of Technology Ropar, Ropar, India
Vineet Gandhi
International Institute of Information Technology, Hyderabad, Hyderabad, India
DOI

10.1145/3313831.3376544

Paper URL

https://doi.org/10.1145/3313831.3376544

Video
Adaptive Photographic Composition Guidance
Abstract

Photographic composition is often taught as alignment with composition grids—most commonly, the rule of thirds. Professional photographers use more complex grids, like the harmonic armature, to achieve more diverse dynamic compositions. We are interested in understanding whether these complex grids are helpful to amateurs.

In a formative study, we found that overlaying the harmonic armature in the camera can help less experienced photographers discover and achieve different compositions, but it can also be overwhelming due to the large number of lines. Photographers actually use subsets of lines from the armature to explain different aspects of composition. However, this occurs mainly offline to analyze existing images. We propose bringing this mental model into the camera by adaptively highlighting lines relevant to the current scene and point of view. We describe a saliency-based algorithm for selecting these lines and present an evaluation of the system showing that photographers found the proposed adaptive armatures helpful for capturing more well-composed images.
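One simple way to realize "saliency-based line selection" is to score each armature line by how much image saliency lies close to it and then highlight only the top few, as in the Python sketch below. The distance threshold, the use of perpendicular distance to the infinite line, and the top-k scoring are assumptions, not necessarily the paper's algorithm.

import numpy as np

# Simplified sketch: rank harmonic-armature lines by nearby saliency mass,
# then show only the top-k lines. Thresholds are illustrative assumptions.
def line_score(saliency, p0, p1, max_dist=10.0):
    # saliency: HxW array in [0, 1]; p0, p1: (x, y) points defining a line
    h, w = saliency.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Perpendicular distance from each pixel to the infinite line through p0, p1.
    d = np.abs((p1[0] - p0[0]) * (p0[1] - ys) - (p0[0] - xs) * (p1[1] - p0[1]))
    d = d / (np.hypot(p1[0] - p0[0], p1[1] - p0[1]) + 1e-6)
    return float(saliency[d < max_dist].sum())

def select_lines(saliency, armature_lines, k=3):
    scores = [(line_score(saliency, p0, p1), (p0, p1)) for p0, p1 in armature_lines]
    return [line for _, line in sorted(scores, key=lambda t: t[0], reverse=True)[:k]]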

Keywords
photography
camera interfaces
composition
dynamic symmetry
Authors
Jane L. E
Stanford University, Stanford, CA, USA
Ohad Fried
Stanford University, Stanford, CA, USA
Jingwan Lu
Adobe Research, San Jose, CA, USA
Jianming Zhang
Adobe Research, San Jose, CA, USA
Radomír Měch
Adobe Research, San Jose, CA, USA
Jose Echevarria
Adobe Systems, San Jose, CA, USA
Pat Hanrahan
Stanford University, Stanford, CA, USA
James A. Landay
Stanford University, Stanford, CA, USA
DOI

10.1145/3313831.3376635

Paper URL

https://doi.org/10.1145/3313831.3376635

Video
Camera Adversaria
Abstract

In this paper we introduce Camera Adversaria, a mobile app designed to disrupt the automatic surveillance of personal photographs by technology companies. The app leverages the brittleness of deep neural networks with respect to high-frequency signals, adding generative adversarial perturbations to users' photographs. These perturbations confound image classification systems but are virtually imperceptible to human viewers. Camera Adversaria builds on methods developed by machine learning researchers as well as a growing body of work, primarily from art and design, which transgresses contemporary surveillance systems. We map the design space of responses to surveillance and identify an under-explored region where our project is situated. Finally, we show that the language typically used in the adversarial perturbation literature serves to affirm corporate surveillance practices and malign resistance. This raises significant questions about the function of the research community in countenancing systems of surveillance.
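The core mechanism is adding a small, high-frequency perturbation to a photo before it is shared, keeping the change within a tight pixel budget so it remains imperceptible. The Python sketch below illustrates only that general idea using clipped random noise as a stand-in; the app itself applies generative adversarial perturbations, which this sketch does not reproduce.

import numpy as np
from PIL import Image

# Generic illustration: add a small noise perturbation to a photo before
# sharing, bounded by a per-pixel budget. Random noise here is a stand-in
# for the app's actual adversarial perturbation.
def perturb(path_in, path_out, epsilon=4):
    img = np.asarray(Image.open(path_in).convert("RGB")).astype(np.int16)
    noise = np.random.randint(-epsilon, epsilon + 1, size=img.shape, dtype=np.int16)
    out = np.clip(img + noise, 0, 255).astype(np.uint8)
    Image.fromarray(out).save(path_out)

Unlike random noise, an adversarial perturbation of the same magnitude is optimized to reliably change classifier predictions, which is what makes the app effective against automatic image analysis.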

Keywords
surveillance capitalism
adversarial examples
critical design
Authors
Kieran Browne
Australian National University, Canberra, ACT, Australia
Ben Swift
Australian National University, Canberra, ACT, Australia
Terhi Nurmikko-Fuller
Australian National University, Canberra, ACT, Australia
DOI

10.1145/3313831.3376434

Paper URL

https://doi.org/10.1145/3313831.3376434