Data analysts often need to iterate between data transformations and chart designs to create rich visualizations for exploratory data analysis. Although many AI-powered systems have been introduced to reduce the effort of visualization authoring, existing systems are not well suited for iterative authoring. They typically require analysts to provide, in a single turn, a text-only prompt that fully describes a complex visualization.
We introduce Data Formulator 2 (DF2 for short), an AI-powered visualization system designed to overcome this limitation.
DF2 blends graphical user interfaces and natural language inputs to enable users to convey their intent more effectively, while delegating data transformation to AI.
Furthermore, to support efficient iteration, DF2 lets users navigate their iteration history and reuse previous designs, eliminating the need to start from scratch each time.
A user study with eight participants demonstrated that DF2 allowed participants to develop their own iteration styles to complete challenging data exploration sessions.
Data-rich documents are ubiquitous across applications, yet they often rely solely on textual descriptions to convey data insights. Prior research has primarily focused on providing visualization-centric augmentation for data-rich documents; few studies have explored using automatically generated word-scale visualizations to enhance the document-centric reading process. As an exploratory step, we propose GistVis, an automatic pipeline that extracts data insights from text descriptions and visualizes them. GistVis decomposes the generation process into four modules: Discoverer, Annotator, Extractor, and Visualizer, with the first three utilizing the capabilities of large language models and the fourth drawing on visualization design knowledge. Technical evaluation, including a comparative study on Discoverer and an ablation study on Annotator, shows solid performance of GistVis. Meanwhile, a user study (N=12) showed that GistVis generates satisfactory word-scale visualizations, indicating its effectiveness in facilitating users' understanding of data-rich documents (+5.6% accuracy) while significantly reducing their mental demand (p=0.016) and perceived effort (p=0.033).
Historical visualizations are a valuable resource for studying the history of visualization and inspecting the cultural context in which they were created. When investigating historical visualizations, it is essential to consider contributions from different cultural frameworks to gain a comprehensive understanding. While there is extensive research on historical visualizations within the European cultural framework, this work shifts the focus to ancient China, a cultural context that remains underexplored by visualization researchers. To this end, we propose a semi-automatic pipeline to collect, extract, and label historical Chinese visualizations. Using the pipeline, we curate ZuantuSet, a dataset of over 71K visualizations and 108K illustrations. We analyze distinctive design patterns of historical Chinese visualizations and their potential causes within the context of Chinese history and culture. We illustrate potential usage scenarios for this dataset, summarize the unique challenges and solutions associated with collecting historical Chinese visualizations, and outline future research directions.
Multi-class scatterplots are essential for visually comparing data, such as examining class distributions in dimensionality reduction and evaluating classification models. Visual class separation (VCS) measures quantify human perception of class separation, but they are largely derived from, and evaluated with, datasets reflecting a limited range of scatterplot features (e.g., data distribution, similar class densities). Quantitatively identifying which scatterplot features influence VCS tasks can enable more robust guidance for future measures. We analyze the alignment between VCS measures and people's perceptions of class separation through a crowdsourced study using 70 scatterplot features relevant to class separation. To cover a wide range of scatterplot features, we generated a set of multi-class scatterplots from 6,947 real-world datasets. Our results highlight that multiple combinations of features are needed to best explain VCS. From our analysis, we develop a composite feature model that identifies key scatterplot features for measuring VCS task performance.
Data visualizations are increasingly seen as socially constructed, with several recent studies positing that perceptions and interpretations of visualization artifacts are shaped through complex sets of interactions between members of a community. However, most of these works have focused on audiences and researchers, and little is known about whether and how practitioners account for the socially constructed framing of data visualization. In this paper, we study and analyze how visualization practitioners understand the influence of their beliefs, values, and biases on their design processes, and the challenges they experience. Through 17 semi-structured interviews with designers working with race and gender demographic data, we find that a complex mix of factors interact to inform how practitioners approach their design process---including their personal experiences, their values, and their understandings of power, neutrality, and politics. Based on our findings, we suggest a series of implications for research and practice in this space.
Transcripts are central to qualitative research in HCI, particularly for researchers using methods of Conversation Analysis (CA) and Interaction Analysis (IA) to study the socially situated nature of human-computer interaction. However, CA and IA researchers continue to highlight the significant need for more dynamic ways to visualize transcripts in support of interaction analysis. This need is particularly evident in HCI, where transcripts as a form of data have received little attention. In this article, we make three contributions to HCI research. First, we present Transcript Explorer, an open-source visualization system that integrates three visualization techniques we have developed to interactively visualize transcripts linked to videos: Distribution Diagrams, Turn Charts, and Contribution Clouds. Second, we present findings from a qualitative analysis of focus group interviews with three different qualitative research groups who used this system to analyze common transcript data. Finally, we expand upon transcripts as a unique form of data for HCI research and propose directions for future research.