Viewers want to watch video content with subtitles in various font sizes, depending on their viewing environment and personal preferences. Unfortunately, because a subtitle chunk—a segment of the transcript displayed on the screen at once—is typically constructed for one specific font size, text truncation or awkward line breaks can occur when different font sizes are used. Existing methods address this problem by reconstructing subtitle chunks based on maximum character counts, but they overlook synchronization of the display time with the content, often causing misaligned text. We introduce OptiSub, a fully automated method that optimizes subtitle segmentation to fit any user-specified font size while keeping the text synchronized with the content. Our method leverages the timing of speech pauses within the video for synchronization. Experimental results, including a user study comparing OptiSub with previous methods, demonstrate its effectiveness and practicality across diverse font sizes and input videos.
https://dl.acm.org/doi/10.1145/3706598.3714199
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)
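To make the core idea concrete, here is a hypothetical sketch (not the paper's actual OptiSub algorithm, which solves an optimization problem) of the simpler greedy variant it improves on: grouping timed words into chunks that respect a character budget derived from the chosen font size, while preferring to cut chunks at speech pauses so display timing stays aligned with the audio. The function name, the `min_pause` threshold, and the word-tuple format are all illustrative assumptions.

```python
# Hypothetical sketch: greedy pause-aware subtitle segmentation.
# words: list of (text, start_sec, end_sec) tuples from a speech aligner.
# max_chars: character budget per chunk, derived from the chosen font size.
# min_pause: gap (seconds) between words treated as a speech pause.
def segment(words, max_chars, min_pause=0.3):
    chunks, current, length = [], [], 0
    for i, (text, start, end) in enumerate(words):
        added = len(text) + (1 if current else 0)  # +1 for the joining space
        if current and length + added > max_chars:
            # Budget exceeded: close the current chunk mid-utterance.
            chunks.append(current)
            current, length = [], 0
            added = len(text)
        current.append(text)
        length += added
        # Prefer to close a chunk at a sufficiently long speech pause,
        # so chunk boundaries coincide with natural breaks in the audio.
        if i + 1 < len(words) and words[i + 1][1] - end >= min_pause:
            chunks.append(current)
            current, length = [], 0
    if current:
        chunks.append(current)
    return [" ".join(c) for c in chunks]
```

A greedy pass like this can still produce unbalanced chunks (e.g., a long chunk followed by a single orphaned word), which is presumably why a global optimization over all candidate pause boundaries, as the abstract describes, is preferable.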