OptiSub: Optimizing Video Subtitle Presentation for Varied Display and Font Sizes via Speech Pause-Driven Chunking

Abstract

Viewers want to watch video content with subtitles in various font sizes, depending on their viewing environment and personal preferences. However, because a subtitle chunk (the segment of text displayed on the screen at once) is typically constructed for one specific font size, text truncation or awkward line breaks can occur when a different font size is used. Existing methods address this problem by reconstructing subtitle chunks based on maximum character counts, but they overlook synchronization of the display time with the content, often causing misaligned text. We introduce OptiSub, a fully automated method that optimizes subtitle segmentation to fit any user-specified font size while ensuring synchronization with the content. Our method leverages the timing of speech pauses within the video for synchronization. Experimental results, including a user study comparing OptiSub with previous methods, demonstrate its effectiveness and practicality across diverse font sizes and input videos.
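
The abstract names two constraints: a chunk must fit the chosen font size, and chunk boundaries should follow speech pauses. The sketch below illustrates that idea only; it is not the authors' algorithm, which this page does not describe. It assumes word-level timestamps are already available, and the names `Word`, `max_chars_for`, `chunk_by_pauses`, and the `char_aspect` width ratio are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def max_chars_for(font_px, width_px, lines=2, char_aspect=0.5):
    """Rough character budget per chunk (assumed model, not from the paper):
    average glyph width = font_px * char_aspect."""
    per_line = int(width_px // (font_px * char_aspect))
    return max(1, per_line * lines)


def chunk_by_pauses(words, max_chars):
    """Split recursively at the longest inter-word pause until each chunk fits.

    Returns a list of (start_time, end_time, text) tuples. A single word that
    exceeds the budget is emitted as its own chunk.
    """
    if not words:
        return []
    text = " ".join(w.text for w in words)
    if len(words) == 1 or len(text) <= max_chars:
        return [(words[0].start, words[-1].end, text)]
    # The pause after word i is the gap before word i + 1 begins.
    split = max(
        range(len(words) - 1),
        key=lambda i: words[i + 1].start - words[i].end,
    )
    left = chunk_by_pauses(words[: split + 1], max_chars)
    right = chunk_by_pauses(words[split + 1:], max_chars)
    return left + right


words = [
    Word("We", 0.0, 0.2), Word("introduce", 0.25, 0.7),
    Word("OptiSub", 0.75, 1.2), Word("a", 1.8, 1.9),
    Word("fully", 1.95, 2.2), Word("automated", 2.25, 2.8),
    Word("method", 2.85, 3.3),
]
budget = max_chars_for(font_px=40, width_px=600, lines=1)
chunks = chunk_by_pauses(words, budget)
for start, end, text in chunks:
    print(f"{start:.2f} --> {end:.2f}  {text}")
```

In this toy input the only long pause falls after "OptiSub", so the 45-character sentence is split there instead of at an arbitrary character count. A larger font lowers the budget and forces further splits at the next-longest pauses.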

Authors
Dawon Lee
KAIST, Daejeon, Republic of Korea
Jongwoo Choi
KAIST, Daejeon, Republic of Korea
Junyong Noh
KAIST, Daejeon, Republic of Korea
DOI

10.1145/3706598.3714199

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714199

Conference: CHI 2025

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

Session: Optimization with/for AI

G318+G319
7 presentations
2025-04-30 23:10:00
2025-05-01 00:40:00