Understanding Data by Having Computers and Humans Work Together

By Zhe Cui, Karthik Badam, Adil Yalcin, and Niklas Elmqvist

Understanding data using computational data science tools can be seen as a collaboration between the analyst and the computer. Most data tools put the analyst in the driver’s seat to guide the analysis. In Excel, for example, you must select rows and columns and choose a chart type yourself in order to calculate values, create charts, and develop insights. Such tools rely entirely on the user to understand the data. This one-sided arrangement is no longer sufficient for modern large datasets and complex analytical scenarios, where (1) the user is unaware of the best methods to transform, arrange, and analyze the data, or (2) the user is overwhelmed by the sheer scale and complexity of the data (or both).

In a recent research project, we explored an alternative approach that supports the analyst by automatically deriving insights from the data and presenting them through visualizations. This proactive approach leverages available computing power to run data analyses in the background and surface insights (or “a-ha moments”) that can aid the analyst’s exploration. Our tool, DataSite, embodies this idea: it is designed to improve the user’s coverage of the data, the quality of the insights collected, and the user’s engagement during data analysis.

The Role of Automation in Generating Insights

An insight is an observation of value made from data during the sensemaking process. In visual sensemaking, people create data visualizations through tools such as Microsoft Excel and the industry-standard Tableau to extract trends, patterns, and outliers, eventually leading them to form insights. However, this process can be overwhelming when performed manually on a large number of data items with many attributes; consider, for example, the records of every student enrolled at a university, or a multinational company’s sales data over time. The main challenge is that the user does not know what a visualization looks like until it is shown, so this trial-and-error process can be long and exhausting.

At the core of our work is the idea that computers can alleviate this burden by automatically generating observations and “useful” charts. After all, computing has always been about simplifying people’s lives. The same should be true of data analysis.

While this idea is thought-provoking, there is no perfect definition of an insight yet (John Stasko discussed this question in a recent blog post), which complicates our goal of proactive computation. “Insight” is often a subjective term that depends heavily on the goals of the analysis, the user, and the domain. For instance, a SIGMOD 2017 paper by Tang et al. developed algorithms to extract top insights from a dataset from a database perspective, in order to help enterprises make better and faster decisions.

Blending the Best of Human and Machine Capabilities

Our proactive approach is based on the core philosophy that “human thinking is expensive, whereas computational resources are cheap.” While the user analyzes and visualizes the data, DataSite has the computer or server (or even a cluster) simultaneously execute appropriate automatic analyses in the background, suggesting interesting leads for the user to investigate next. For instance, while the user observes differences between cars with different horsepower, DataSite suggests differences in miles per gallon based on a correlation analysis that was automatically executed in the background.
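
This background mechanism can be sketched roughly as follows. The code below is a minimal illustration in Python, not DataSite’s actual implementation; the function names and toy data are hypothetical. Analyses are submitted to a worker pool, and each finished result is pushed onto a queue that a feed view could consume while the user keeps exploring:

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

feed = Queue()  # finished analyses stream into the feed view

def correlation_analysis(x, y):
    """Pearson correlation of two equal-length numeric columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def run_in_background(columns):
    """Submit one analysis per attribute pair; each result is pushed to
    the feed as soon as it completes."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        names = list(columns)
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                a, b = names[i], names[j]
                fut = pool.submit(correlation_analysis, columns[a], columns[b])
                fut.add_done_callback(
                    lambda f, a=a, b=b: feed.put((a, b, f.result())))

# Toy data: horsepower and miles per gallon are strongly anti-correlated
cars = {
    "horsepower": [130, 165, 150, 95, 97, 85],
    "mpg":        [18, 15, 16, 25, 24, 27],
}
run_in_background(cars)
result = feed.get()  # ("horsepower", "mpg", r) with r close to -1
```

In a real system the user interface would poll this queue and render each result as a suggested chart rather than reading it synchronously.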


By continuously executing all conceivable analyses on all combinations of data dimensions, DataSite uses brute force to generate automatic insights. This enhances the analyst’s awareness of the data during exploration, both by choosing best-practice analysis methods and by eliminating the need for the human to perform costly calculations. The computer-generated insights are presented through a user interface element called the feed view, which streams continuously updating results from the computational analyses, akin to a social media feed such as Twitter or Facebook. Each result is accompanied by a visualization that highlights it in the context of the data items. This leads to an analytical workflow that mixes the best of human and machine intelligence.
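
As a rough illustration of the brute-force idea, the sketch below (hypothetical, not DataSite’s code) enumerates every attribute pair, runs one analysis per pair (here, Pearson correlation stands in for the full battery of analyses), and ranks the results so the strongest findings surface at the top of the feed:

```python
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient, one of many possible analyses."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def build_feed(columns):
    """Run one analysis per attribute pair and rank the results by an
    interestingness score (here simply the absolute correlation)."""
    results = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        results.append((f"{a} vs. {b}: r = {r:+.2f}", abs(r)))
    # Strongest relationships first, like the top of a social media feed
    return sorted(results, key=lambda item: -item[1])

data = {
    "horsepower": [130, 165, 150, 95, 97, 85],
    "mpg":        [18, 15, 16, 25, 24, 27],
    "cylinders":  [8, 8, 8, 4, 4, 4],
    "year":       [70, 71, 72, 73, 74, 75],  # less strongly related
}
feed = build_feed(data)  # six pairs, ranked by |r|
```

A production system would add more analysis types (clustering, regression, dimension reduction) and stream results incrementally instead of sorting a finished list.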

Better Insights and User Engagement through Proactive Insights

We evaluated our approach through two user studies of open-ended visual exploration, comparing DataSite to manual visualization (Polestar) and visualization recommendation (Voyager 2), respectively. In DataSite, we focused on standard automated analyses, such as computing statistical measures (mean, variance, and item frequency) and applying standard data science methods (clustering, regression, correlation, and dimension reduction). The task in both studies was the same: exploratory analysis of unknown data (an “open-ended task”). We used two tools with two datasets (one dataset per tool interface). Participants started with one tool and dataset and then moved to the second interface. They were asked to explore each dataset “as much as possible” within 20 minutes and were encouraged to think aloud about their process and insights. Three major benefits of DataSite emerged from these studies:

  • Broader coverage: DataSite showed a 30% increase in data attribute coverage compared with Polestar. Participants viewed and interacted with more multi-attribute charts (encoding two or more data attributes) in DataSite than in Polestar. Compared with Voyager 2, DataSite had comparable attribute coverage but provided more meaningful charts.
  • More time spent on charts: Most participants spent at least 25% of their time exploring the feed itself. All participants felt that the feed was useful for analysis and provided guidance on “where to look” in the data.
  • Better subjective ratings: Participants rated DataSite as more efficient and comprehensive than both Polestar and Voyager 2.
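
For concreteness, “attribute coverage” can be computed as the fraction of dataset attributes that appeared in at least one chart a participant viewed. The sketch below is a hypothetical illustration of such a measure, not the studies’ actual instrumentation; the attribute and session names are made up:

```python
def attribute_coverage(viewed_charts, all_attributes):
    """Fraction of dataset attributes that appeared in at least one
    chart the participant viewed during the session."""
    seen = set()
    for chart in viewed_charts:
        seen.update(chart)  # each chart is the set of attributes it encodes
    return len(seen & set(all_attributes)) / len(all_attributes)

attributes = ["mpg", "horsepower", "cylinders", "weight", "year"]
session = [{"mpg", "horsepower"}, {"cylinders", "mpg"}, {"weight"}]
coverage = attribute_coverage(session, attributes)  # 4 of 5 -> 0.8
```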

The Future of Proactive Analytics

DataSite can be seen as a canonical visual analytics system in that it blends automatic computation with manual visual exploration. We regard it as a first step towards a fully proactive visualization system that involves explicit human feedback in the loop: the tasks people are performing, the data attributes they care about, and the advanced analyses they want to dive into. Inferential statistics and recommendations based on user behavior could also be integrated to provide user-guided recommendations of insights. A truly intelligent visual analysis system would leverage both feedback from the user and the computational power of the machine to present easily understandable and interpretable insights.

More Resources

Paper: https://arxiv.org/abs/1802.08621

Advanced Transportation Lab is a Long-Term Partner of HCIL

At the University of Maryland’s Center for Advanced Transportation Technology Laboratory (CATT Lab), more than 100 engineers, developers, researchers, and students work to provide officials with actionable insight that can help keep transportation systems running smoothly. Housed in the A. James Clark School of Engineering, the CATT Lab is interested in providing planners, operators, and researchers with user interfaces that leverage big data to create robust visualizations (charts, graphics, and tables) that can assist in transportation decision-making.

Catherine Plaisant, Senior Research Scientist at the University of Maryland Institute for Advanced Computer Studies and Associate Director of Research of the Human-Computer Interaction Lab, is part of the CATT Lab’s User Experience (UX) Team. The team is tasked with creating novel user interfaces for the Regional Integrated Transportation Information System (RITIS). From a major snowstorm to an unexpected event requiring multiple road closures or detours, disruptions to our nation’s highways and connecting transportation systems can lead to big problems and major delays, frustrating not only travelers but also law enforcement and transportation officials. The CATT Lab can not only pinpoint the worst bottlenecks but also identify what caused them, how long the backups are, and how much they cost, and make recommendations to the state of Maryland. “Research findings often take years or decades to impact the products people use, while the work with the CATT Lab is rapidly implemented by a team of excellent developers and used within weeks by hundreds of operators and managers in transportation agencies across the U.S.,” Catherine says.

John Allen, a faculty assistant at CATT, calls Plaisant an “integral part” of the UX Team. “Catherine brings clear insight, intelligence and an innate talent to conceptualizing and developing interfaces and visualizations, critical to useful and usable next-gen analytics,” he says. In particular, Allen says, Plaisant’s contributions and guidance have made the RITIS tools easy to understand and simple to use, with actionable visualizations that help those in the industry quickly identify problems and develop smart, cost-effective mobility, safety, and security solutions. For instance, Plaisant has been involved in the creation of an Origin-Destination Data Suite, which derives trip insights from geospatial data. This enables unprecedented understanding of vehicular movement, such as origin and destination zones, diversionary routes during peak travel times (or around incidents), and more. He notes that as the CATT Lab evolves its tools to handle new datasets and requested features, Plaisant’s expertise will continue to help advance the usability and usefulness of the lab’s tools.

–Excerpted from a UMIACS article by Melissa J. Brachfeld and an NPR interview with Mike Pack from the CATT Lab

Science Everywhere

Engaging entire neighborhoods in science learning with technology

Science Everywhere is an NSF-funded research study aimed at understanding how technology can engage entire communities in science learning. We use a design-based research approach in which we co-design innovative science learning technology with families, teachers, and leaders in a community, implement that technology in the community, and then redesign it in an iterative design process. Broadly, this study will contribute to theory on connected learning by developing an understanding of how to connect science learning at home, school, and community spaces with technology. It also aims to contribute to our understanding of parent-child learning, interactive display design, and social media for learning.

Woman with children observing the results of a cooking experiment that involved brownies.


After School Programs
In Science Everywhere we run after-school programs that focus on science learning in life-relevant contexts. Programs include topics like kitchen chemistry, engineering and design through Minecraft, and investigating how things fly.

Cartoon image of people using iPads and interacting with a large electronic display.


New Technology
For this study we developed a suite of science learning technologies. These technologies include a social media app for science learning and an interactive display for schools and public spaces.

Parents and children designing together. They are placing sticky notes on a wall.


Co-Design with Families
In our Science Everywhere design sessions we work with parents and children from our communities to co-design technologies for science learning. These families help us understand the process of making connections from school to home and the dynamics of parent-child and peer-to-peer learning.

Student in a science class holding a compass.


School Integration
We also work with science teachers in our communities to help us with the design of our technologies and to integrate our technologies into science classrooms. These partnerships help us to further investigate connected learning practices.

Partners and People


There are several groups that partner together to make Science Everywhere a possibility. These organizations include the University of Maryland iSchool and College of Education, the University of Washington iSchool, Prince George’s County Public Schools, Solid Rock Church, Highline Public Schools, and KidsTeam UW. The principal investigators of this study include Dr. June Ahn and Dr. Tamara Clegg from the University of Maryland, and Dr. Jason Yip from the University of Washington.


Other members of the research team include:
Dr. Jochen Rick, designer and developer
Elizabeth Bonsignore, PhD candidate at UMD iSchool
Daniel Pauw, PhD candidate at UMD iSchool
Judith Uchidiuno, PhD student at CMU
Austin Beck, PhD student at UMD College of Education
Kelly Mills, PhD student at UMD College of Education
Caroline Pitt, PhD student at UW iSchool


Past research team members:
Meridian Witt, Media Arts student at Wellesley College

Papers, Posters, and Presentations


Yip, J., Clegg, T., Ahn, J., Uchidiuno, J., Bonsignore, E., Beck, A., Pauw, D., & Mills, K. (2016). The Evolution of Roles and Social Bonds During Child-Parent Co-design. Proceedings of the 2016 ACM Conference on Human Factors in Computing Systems (CHI 2016). [PDF]
 
Pauw, D. A., Clegg, T. L., Ahn, J., Bonsignore, E., Yip, J. C., & Uchidiuno, J. (2015). Navigating Connected Inquiry Learning with ScienceKit. Presented at the Computer Supported Collaborative Learning 2015. [PDF]
 
Ahn, J., Clegg, T., Yip, J., Bonsignore, E., Pauw, D., Gubbels, M., Lewittes, B., & Rhodes, E. (2014). Seeing the unseen learner: designing and using social media to recognize children’s science dispositions in action. Learning, Media and Technology, (ahead-of-print), 1-31. [LINK]
 
Yip, J.C., Clegg, T.L., Ahn, J., Bonsignore, E., Gubbels, M., Rhodes, E., & Lewittes, B. (2014). The role of identity development within tensions in ownership of science learning. Proceedings of the Eleventh International Conference of the Learning Sciences (ICLS 2014). [PDF]
 
Clegg, T.L., Bonsignore, E., Ahn, J., Yip, J.C., Pauw, D., & Gubbels, M. (2014). Capturing personal and social science: Technology for integrating the building blocks of disposition. Proceedings of the Eleventh International Conference of the Learning Sciences (ICLS 2014). [PDF]
 
Yip, J.C., Ahn, J., Clegg, T.L., Bonsignore, E., Pauw, D. & Gubbels., M. (2014). “It helped me do my science.” A case of designing social media technologies for children in science learning. Proceedings of the 13th International Conference of Interaction Design and Children (IDC 2014). [PDF]

 


Kidsteam: Children and Adults Working as Design Partners

Making technology for kids without working directly with them “is like making clothes for someone you don’t know the size of.” – Thomas, Kidsteam child design partner alumnus

Introduction

The child and adult members of Kidsteam meet twice a week at the University of Maryland’s Human-Computer Interaction Lab to co-design technologies that support children’s learning and play. Kidsteam research enhances our understanding of intergenerational design techniques and of how to build technologies that are more relevant to children’s interests and needs.

Kidsteam and The Cooperative Inquiry Method

In 1998, Dr. Allison Druin adapted the Cooperative Inquiry method of design for use with children and established Kidsteam. The Kidsteam Cooperative Inquiry design team asks for a long-term partnership between its adult and child team members and offers a set of techniques that foster intergenerational communication and provide actionable design feedback. This feedback can be used in all stages of development, from early brainstorming to late-stage testing, and across technology platforms. Over the past two decades, design teams in universities and corporations around the world have employed the Cooperative Inquiry method.

The University of Maryland’s Kidsteam is the first intergenerational Cooperative Inquiry design team. Kidsteam brings together roughly eight children, ages 7-11, with researchers and technologists from diverse backgrounds to design technologies for children twice a week throughout the academic year. Because of their long tenure, Kidsteam kids are experts in more than being kids: they are experts in collaborating with adults and other kids, in prototyping techniques, and in communicating their ideas and why those ideas matter.

Interested in Joining or Working with Kidsteam?

Contact Beth Bonsignore, Director of Kidsteam, at ebonsign@umd.edu for more information. Join Kidsteam as one of our adult design partners, an industry or non-profit partner, or as a visiting scholar to learn the Cooperative Inquiry method of design.

 

Program Highlights


Kidsteam at the Lincoln Memorial. Kidsteam was invited to help the National Park Service co-design the future of the Lincoln Memorial’s visitor experience. The team worked with 37 other adults from places such as the Pentagon, Sesame Workshop, Yahoo, Ford’s Theatre, AARP, and various parts of the National Park Service to envision the future of this iconic monument.


Kidsteam at the White House. Kidsteam visited the White House to prototype the online Every Kid in a Park experience with local 4th graders, members of the Department of the Interior, and of the National Park Service. The Every Kid in a Park initiative provides 4th graders and their families with a free annual pass to visit America’s natural wonders and historic sites.


Emmy-Winning Interactions. Nickelodeon won the 2013 Emmy for Outstanding Creative Achievement in Interactive Media – User Experience and Visual Design for its Nick App, which features the “Do Not Touch” button that was developed with Kidsteam. The button was recognized for its “array of disruptive comedy and surprises.”