Life can often be described as a series of events. These events contain rich information that, when put together, can reveal history, expose facts, or lead to discoveries. Many leading organizations are therefore increasingly collecting databases of event sequences: Electronic Medical Records (EMRs), transportation incident logs, student progress reports, web logs, sports logs, etc. Heavy investments have been made in data collection and storage, but difficulties still arise when it comes to making use of the collected data. Analyzing millions of event sequences is a non-trivial task that is gaining more attention and, because of its complex nature, requires better support. I therefore aimed to use information visualization techniques to support exploratory data analysis (an approach to analyzing data to formulate hypotheses worth testing) for event sequences. By working with domain experts who were analyzing event sequences, I identified two important scenarios that guided my dissertation:
First, I explored how to provide an overview of multiple event sequences. Lengthy reports often have an executive summary that provides an overview of the report, but no equivalent existed for event sequences. Therefore, I designed LifeFlow, a compact overview visualization that summarizes multiple event sequences, together with interaction techniques that support users' exploration.
Second, I examined how to support users in querying for event sequences when they are uncertain about what they are looking for. To support this task, I developed similarity measures (the M&M measure 1-2) and user interfaces (Similan 1-2) for querying event sequences based on similarity, allowing users to search for event sequences that are similar to the query. I then ran a controlled experiment comparing exact match and similarity search interfaces, and learned the advantages and disadvantages of each. These lessons inspired me to develop Flexible Temporal Search (FTS), which combines the benefits of both interfaces. FTS gives confident and countable results, and also ranks results by similarity.
I continued to work with domain experts as partners, getting them involved in the iterative design, and constantly using their feedback to guide my research directions. As the research progressed, several short-term user studies were conducted to evaluate particular features of the user interfaces. Both quantitative and qualitative results were reported. To address the limitations of short-term evaluations, I included several multi-dimensional in-depth long-term case studies with domain experts in various fields to evaluate deeper benefits, validate generalizability of the ideas, and demonstrate practicability of this research in non-laboratory environments. The experience from these long-term studies was combined into a process model and a set of design guidelines for temporal event sequence exploration.
My contributions from this research are LifeFlow, a visualization that compactly displays summaries of multiple event sequences, along with interaction techniques for users' exploration; similarity measures (the M&M measure 1-2) and similarity search interfaces (Similan 1-2) for querying event sequences; Flexible Temporal Search (FTS), a hybrid query approach that combines the benefits of exact match and similarity search; and case study evaluations that result in a process model and a set of design guidelines for temporal event sequence exploration. Finally, this research has revealed new directions for exploring event sequences.