Crystallizing Schemas with Teleoscope: Thematic Curation of Large Text Corpora on Reddit

Abstract

Large text corpora, such as Reddit posts, have become an increasingly prevalent site of qualitative inquiry. However, most large text corpora are intractable for qualitative researchers, so teams rely on statistical subsampling to reduce corpora to a manageable size for qualitative analysis. While previous work on navigating large corpora involves visualizing the dataset at the corpus level using high-level statistical summaries, few systems offer the ability to curate data using an interpretivist approach. To address this, we developed Teleoscope, a web-based interface designed to scaffold iterative, interactive, and reflexive refinement of a large corpus, in a process we call thematic curation. Across three deployments, we learned that Teleoscope supports serendipitous discovery of new keywords, results in greater feelings of confidence in search saturation, and aids collaborative discussion of alternative curation pathways. Teleoscope empowers researchers to stay "close to the data" in order to make qualitative workflows methodologically coherent with large text corpora.

Authors
Patrick Yung Kang Lee
University of Toronto, Toronto, Ontario, Canada
Paul Hendrik Bucci
University of British Columbia, Vancouver, British Columbia, Canada
Leo Itsuki Foord-Kelcey
University of British Columbia, Vancouver, British Columbia, Canada
Alamjeet Singh
University of British Columbia, Vancouver, British Columbia, Canada
Ivan Beschastnikh
University of British Columbia, Vancouver, British Columbia, Canada

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: AI Systems for Human Goals

P1 - Room 122
7 presentations
2026-04-14 18:00:00
2026-04-14 19:30:00