Full-featured data analysis tools provide users with a wide variety of ways to transform and visualize their data; ironically, this abundance can be as much a hindrance as a help in the initial stage of data exploration. At this stage, the critical question is often not "what steps must I take to visualize this data?" but rather "what is this data and what can it tell me?" This mismatch leads to several intertwined challenges. It’s difficult to get a mental picture of the data without first visualizing it, but it’s hard to identify the appropriate way to visualize the data without first having a mental picture of it. Moreover, it’s all too easy for an intriguing data point to pique a researcher’s interest and distract them from their current task. This difficult-to-navigate and distraction-rich environment can easily hide faulty assumptions until they botch the analysis further down the line. Together, these problems can send the analyst tumbling down a rabbit hole of progressively deeper and sometimes misguided analysis, while the remainder of the data landscape lies uncharted. We investigate whether these problems can be addressed through a set of interface features that could easily be incorporated into current visual analytics tools. We built a prototype implementation of these features called DataFramer. A preliminary assessment via a study with 29 participants suggests that the approach of examining data and stating questions before exploring the data is promising. We present a taxonomy of exploratory analysis statements and errors, as well as qualitative observations about how participants posed questions for exploring data using different tools.



