Ghana has a population of over 27 million people, of whom as many as 1 in 15 may have a communication disability. The number of speech and language therapists (SLTs) available to support these people remains remarkably small, presenting a major workforce challenge. Because the profession is still emerging, significant challenges remain in educating the first generation of SLTs. Ghana, however, has a healthy digital infrastructure that can be leveraged. We describe a comprehensive study that aimed to co-design a set of locally appropriate digital tools to enhance SLT training in Ghana. We contribute insights into how digital tools could support social learning and the transition from student to independent practitioner and future clinical supervisor. We offer a set of design recommendations for creating an online Community of Practice to enhance continuing professional development.
Learning to speak a foreign language is hard. Speech shadowing, in which a learner listens to a native speech template and repeats it as nearly simultaneously as possible, has emerged as a proven way to practice speaking. However, shadowing can be difficult to sustain: learners frequently fail to keep up with the speech and unintentionally interrupt a practice session. Worse, because no technique has been established to evaluate shadowing performance in real time, no automated solutions are available to help. In this paper, we propose a technical framework that uses context-dependent speech recognition to evaluate shadowing in real time. Building on this framework, we present a shadowing tutor system called WithYou, which automatically adjusts the playback and the difficulty of a speech template when learners fail, so that shadowing becomes smooth and tailored. Results from a user study show that WithYou yields greater improvement in speaking (14%) than the conventional method (2.7%), with a lower cognitive load.
https://doi.org/10.1145/3313831.3376322
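To make the adaptive-playback idea concrete, the sketch below estimates how far a learner trails the template and slows down or rewinds accordingly. It is not the authors' implementation: it assumes an external streaming speech recognizer that yields the learner's words, and all function names and thresholds are illustrative.

```python
# Minimal sketch of real-time shadowing evaluation and playback adaptation.
# Assumes a streaming ASR supplies the learner's recognized words; names and
# thresholds below are illustrative, not from the WithYou paper.
from dataclasses import dataclass


@dataclass
class TemplateWord:
    word: str
    onset: float  # when the word occurs in the template audio (seconds)


def shadowing_lag(template, recognized, playback_pos):
    """Estimate how many 'due' template words the learner has not yet produced."""
    due = [t.word.lower() for t in template if t.onset <= playback_pos]
    heard = [w.lower() for w in recognized]
    matched, i = 0, 0
    for w in due:                      # greedy in-order matching
        while i < len(heard) and heard[i] != w:
            i += 1
        if i < len(heard):
            matched += 1
            i += 1
    return len(due) - matched


def adapt_playback(lag, rate, max_lag=4, min_rate=0.6):
    """Slow down when the learner falls behind; rewind when they are lost."""
    if lag > max_lag:
        return "rewind", min_rate                 # replay current sentence slowly
    if lag > 1:
        return "continue", max(min_rate, rate - 0.1)
    return "continue", min(1.0, rate + 0.05)      # recover toward normal speed


if __name__ == "__main__":
    template = [TemplateWord("learning", 0.0), TemplateWord("to", 0.4),
                TemplateWord("speak", 0.6), TemplateWord("is", 1.0),
                TemplateWord("hard", 1.2)]
    lag = shadowing_lag(template, ["learning", "to"], playback_pos=1.3)
    print(adapt_playback(lag, rate=1.0))  # -> ('continue', 0.9): 3 words behind
```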
We present a system that automatically transforms text articles into audio-visual slideshows by leveraging the notion of word concreteness, which measures how strongly a word or phrase is related to some perceptible concept. In a formative study, we learn that people not only prefer such audio-visual slideshows but also find the content easier to understand than text articles, whether or not those articles are augmented with images. We use word concreteness to select search terms and find images relevant to the text. Then, based on the distribution of concrete words and the grammatical structure of an article, we time-align selected images with audio narration obtained through text-to-speech to produce audio-visual slideshows. In a user evaluation we find that our concreteness-based algorithm selects images that are highly relevant to the text. The quality of our slideshows is comparable to slideshows produced manually using standard video editing tools, and people strongly prefer our slideshows to those generated using a simple keyword-search based approach.
https://doi.org/10.1145/3313831.3376519
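The sketch below illustrates the core idea of concreteness-driven image selection and time alignment. It assumes a word-level concreteness lexicon such as published psycholinguistic norms; the tiny dictionary, thresholds, and function names are placeholders, and the paper's full pipeline (grammatical analysis, image search, TTS) is not reproduced here.

```python
# Minimal sketch: pick highly concrete words as image-search terms and align
# them with narrated sentences. The CONCRETENESS dict is a stand-in for a real
# concreteness lexicon; all names and thresholds are illustrative.
import re

CONCRETENESS = {"dog": 4.9, "beach": 4.8, "ocean": 4.7, "idea": 1.6,
                "freedom": 1.9, "sunset": 4.5, "walk": 3.5}


def concrete_terms(sentence, lexicon=CONCRETENESS, threshold=4.0, k=2):
    """Return up to k highly concrete words to use as image-search terms."""
    words = re.findall(r"[a-z]+", sentence.lower())
    scored = [(lexicon.get(w, 0.0), w) for w in words if lexicon.get(w, 0.0) >= threshold]
    return [w for _, w in sorted(scored, reverse=True)[:k]]


def align_images(sentences, durations):
    """Pair each narrated sentence (with its TTS duration, in seconds) with the
    search terms whose images should be shown over that time span."""
    t, timeline = 0.0, []
    for sent, dur in zip(sentences, durations):
        timeline.append({"start": t, "end": t + dur,
                         "search_terms": concrete_terms(sent)})
        t += dur
    return timeline


if __name__ == "__main__":
    sents = ["We took the dog for a walk on the beach",
             "Freedom is a powerful idea"]
    for slide in align_images(sents, durations=[3.2, 2.5]):
        print(slide)  # the abstract second sentence yields no image terms
```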
Analyzing queries from search engines and intelligent assistants is difficult. A key challenge is organizing queries into interpretable, context-preserving, representative, and flexible groups. We present structural templates, abstract queries that replace tokens with their linguistic feature forms, as a query grouping method. The templates allow analysts to create query groups with structural similarity at different granularities. We introduce Tempura, an interactive tool that lets analysts explore a query dataset with structural templates. Tempura summarizes a query dataset by selecting a representative subset of templates to show the query distribution. The tool also helps analysts navigate the template space by suggesting related templates that are likely to reward further exploration. Our user study shows that Tempura helps analysts examine the distribution of a query dataset, find labeling errors, and discover model error patterns and outliers.
https://doi.org/10.1145/3313831.3376451
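As an illustration of the structural-template idea, the sketch below abstracts queries by replacing tokens with linguistic feature forms (named-entity labels and part-of-speech tags), so structurally similar queries collapse into one group. It uses spaCy as one possible tagger purely for illustration; this is not Tempura's actual implementation, and the granularity choices are assumptions.

```python
# Minimal sketch of structural templates: entities -> entity label, other
# content words -> POS tag, function words kept verbatim (one granularity).
# Requires: pip install spacy && python -m spacy download en_core_web_sm
from collections import Counter

import spacy

nlp = spacy.load("en_core_web_sm")


def template(query, keep_stopwords=True):
    """Abstract a query into a structural template."""
    parts = []
    for tok in nlp(query):
        if tok.ent_type_:
            parts.append(f"[{tok.ent_type_}]")
        elif tok.is_stop and keep_stopwords:
            parts.append(tok.lower_)
        elif not tok.is_punct:
            parts.append(f"[{tok.pos_}]")
    return " ".join(parts)


if __name__ == "__main__":
    queries = ["weather in Seattle tomorrow",
               "weather in Boston today",
               "play songs by Adele"]
    counts = Counter(template(q) for q in queries)
    for tmpl, n in counts.most_common():
        print(n, tmpl)  # the two weather queries typically share one template
```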
Social signals are crucial when we decide whether we want to interact with someone online. However, social signals are typically limited to the few that platform designers provide, and most can be easily manipulated. In this paper, we propose a new idea called synthesized social signals (S3s): social signals computationally derived from an account's history, and then rendered into the profile. Unlike conventional social signals such as profile bios, S3s use computational summarization to reduce receiver costs and raise the cost of faking signals. To demonstrate and explore the concept, we built Sig, an extensible Chrome extension that computes and visualizes S3s. After a formative study, we conducted a field deployment of Sig on Twitter, targeting two well-known problems on social media: toxic accounts and misinformation. Results show that Sig reduced receiver costs and added important signals beyond those conventionally available, and that a few users felt safer using Twitter as a result. We conclude by reflecting on the opportunities and challenges S3s provide for augmenting interaction on social platforms.
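The sketch below shows what synthesizing a social signal from an account's history might look like. It is only a conceptual illustration: the keyword heuristic and domain list stand in for the real toxicity and misinformation classifiers a deployment like Sig would use, and every name and threshold here is a placeholder.

```python
# Minimal sketch of a synthesized social signal (S3): summarize an account's
# recent posts into a compact, hard-to-fake signal rendered next to the profile.
# TOXIC_MARKERS and LOW_CREDIBILITY_DOMAINS are illustrative stand-ins for real
# classifiers and curated lists.
from urllib.parse import urlparse

LOW_CREDIBILITY_DOMAINS = {"examplefakenews.com"}   # placeholder list
TOXIC_MARKERS = {"idiot", "moron"}                   # placeholder "classifier"


def is_toxic(post: str) -> bool:
    return any(m in post.lower() for m in TOXIC_MARKERS)


def low_credibility_links(post: str) -> int:
    return sum(urlparse(w).netloc in LOW_CREDIBILITY_DOMAINS
               for w in post.split() if w.startswith("http"))


def synthesize_signal(posts):
    """Reduce an account's post history to a small summary shown in the profile."""
    n = max(len(posts), 1)
    toxic_share = sum(is_toxic(p) for p in posts) / n
    misinfo_links = sum(low_credibility_links(p) for p in posts)
    return {
        "toxic_share": round(toxic_share, 2),
        "misinfo_links": misinfo_links,
        "badge": "caution" if toxic_share > 0.2 or misinfo_links > 0 else "ok",
    }


if __name__ == "__main__":
    history = ["you are an idiot",
               "check this out http://examplefakenews.com/story",
               "lovely weather today"]
    print(synthesize_signal(history))
    # -> {'toxic_share': 0.33, 'misinfo_links': 1, 'badge': 'caution'}
```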