To create aesthetically pleasing aerial footage, the correct framing of camera targets is crucial. However, current quadrotor camera tools do not consider the 3D extent of actual camera targets in their optimization schemes and simply interpolate between keyframes when generating a trajectory. This can yield videos with aesthetically unpleasing target framing. In this paper, we propose an optimization formulation that computes the quadrotor camera pose such that targets are positioned at desirable screen locations according to videographic compositional rules and remain entirely visible throughout a shot. Camera targets are identified using a semi-automatic pipeline that leverages a deep-learning-based visual saliency model. A large-scale perceptual study (N≈500) shows that our method enables users to produce shots whose target framing is closer to what they intended to create and that are as aesthetically pleasing as, or more pleasing than, those produced with the previous state of the art.
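To make the framing objective concrete, the following is a minimal sketch of one way such a cost could be written. It is not the formulation used in the paper. It assumes a pinhole camera with a square image, a z-up world frame, a target described by the corners of a 3D bounding box, and a quadratic penalty. All names (`camera_axes`, `project`, `framing_cost`, `w_vis`) and the weight value are hypothetical and chosen only for illustration.

```python
import math

def camera_axes(yaw, pitch):
    """Forward, right, and up unit vectors of the camera in a z-up world frame."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    forward = (cy * cp, sy * cp, sp)
    right = (sy, -cy, 0.0)
    up = (-cy * sp, -sy * sp, cp)
    return forward, right, up

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(point, cam_pos, yaw, pitch, hfov=math.pi / 2):
    """Pinhole projection to normalized screen coordinates.

    The visible screen spans [-1, 1] on both axes (square image assumed).
    Returns None if the point lies behind the camera.
    """
    forward, right, up = camera_axes(yaw, pitch)
    d = tuple(p - c for p, c in zip(point, cam_pos))
    depth = dot(d, forward)
    if depth <= 1e-9:
        return None
    focal = 1.0 / math.tan(hfov / 2)
    return (focal * dot(d, right) / depth, focal * dot(d, up) / depth)

def framing_cost(corners, cam_pos, yaw, pitch, desired_uv, w_vis=10.0):
    """Illustrative cost: screen-position error plus an off-screen penalty.

    Using all corners of the target's bounding box (its 3D extent) rather
    than a single center point lets the cost penalize partial visibility.
    """
    uv = [project(c, cam_pos, yaw, pitch) for c in corners]
    if any(p is None for p in uv):
        return float("inf")
    us = [p[0] for p in uv]
    vs = [p[1] for p in uv]
    # Center of the target's 2D bounding box on screen.
    cu = 0.5 * (min(us) + max(us))
    cv = 0.5 * (min(vs) + max(vs))
    position_term = (cu - desired_uv[0]) ** 2 + (cv - desired_uv[1]) ** 2
    # Quadratic penalty for every corner that leaves the visible screen.
    visibility_term = sum(
        max(0.0, abs(u) - 1.0) ** 2 + max(0.0, abs(v) - 1.0) ** 2
        for u, v in uv
    )
    return position_term + w_vis * visibility_term

# Hypothetical target: a 2 m cube centered 10 m in front of the origin.
cube = [(10 + dx, dy, dz) for dx in (-1, 1) for dy in (-1, 1) for dz in (-1, 1)]
centered = framing_cost(cube, (0, 0, 0), 0.0, 0.0, desired_uv=(0.0, 0.0))
rotated = framing_cost(cube, (0, 0, 0), 0.3, 0.0, desired_uv=(0.0, 0.0))
print(centered, rotated)
```

In this sketch, a camera pointed straight at the target incurs a lower cost for a screen-centered framing than one yawed away from it. A compositional rule such as the rule of thirds could be expressed by changing `desired_uv`, for example to `(1/3, 0.0)`.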
https://doi.org/10.1145/3411764.3445568
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2021.acm.org/)