Due to advances in deep learning, gestures have become a more common tool for human-computer interaction. When implementing a large amount of training data, deep learning models show remarkable performance in gesture recognition. Since it is expensive and time consuming to collect gesture data from people, we are often confronted with a practicality issue when managing the quantity and quality of training data. It is a well-known fact that increasing training data variability can help to improve the generalization performance of machine learning models. Thus, we directly intervene in the collection of gesture data to increase human gesture variability by adding some words (called styling words) into the data collection instructions, e.g., giving the instruction "perform gesture #1 faster" as opposed to "perform gesture #1." Through an in-depth analysis of gesture features and video-based gesture recognition, we have confirmed the advantageous use of styling words in gesture training data collection.
https://doi.org/10.1145/3411764.3445457
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2021.acm.org/)