Human-Centered AI: Trusted, Reliable & Safe
Workshop: Human-Centered AI: Reliable, Safe & Trustworthy
Thursday, May 28, 2020
A workshop of the
37th Human-Computer Interaction Lab Symposium
University of Maryland
Overview and Topics
Well-designed technologies that offer high levels of human control and high levels of computer automation will increase human performance rather than replace people. Human-Centered AI technologies designed this way are more likely to produce designs that are Trusted, Reliable & Safe (TRS). Achieving these goals will dramatically increase human performance while supporting human self-efficacy, mastery, and responsibility.
This workshop explores case studies, guidelines, principles, and theories to design advanced technologies that bring the benefits of AI methods, while ensuring appropriate human control.
To Participate:
This virtual workshop, held over Zoom, is organized into three sessions (see below). Each speaker will have 25 minutes for presentation and discussion, with a suggested split of 15 minutes for the talk and 10 minutes for discussion.
REGISTER HERE
For background on Human-Centered AI, see this HCIL project page:
//human-centered-ai-a-second-copernican-revolution/
Organizers
- John P Dickerson (john@cs.umd.edu) — Assistant Professor, Dept of Computer Science, University of Maryland
- Hernisa Kacorri (hernisa@umd.edu) — Assistant Professor, College of Information Studies (iSchool), University of Maryland
- Ben Shneiderman (ben@cs.umd.edu) — Professor, Dept of Computer Science, University of Maryland
List of talks
10:30 am – 10:40 am (Eastern Time)
- Welcome: John Dickerson, Hernisa Kacorri, Ben Shneiderman
10:40 am – 12:20 pm ET
- Ben Shneiderman, University of Maryland, Dept of Computer Science:
- Human-Centered AI: Reliable, Safe & Trustworthy (SLIDES)
Abstract. Well-designed technologies that offer high levels of human control and high levels of computer automation can increase human performance, leading to wider adoption. The Human-Centered Artificial Intelligence (HCAI) model clarifies how to (1) design for high levels of human control and high levels of computer automation so as to increase human performance, (2) understand the situations in which full human control or full computer control are necessary, and (3) avoid the dangers of excessive human control or excessive computer control. The new goal of HCAI is more likely to produce designs that are Reliable, Safe & Trustworthy (RST). Achieving these goals will dramatically increase human performance, while supporting human self-efficacy, mastery, creativity, and responsibility. Design guidelines and independent oversight mechanisms for prospective design reviews and retrospective analyses of failures will clarify the role of human responsibility, even as automation increases. Examples of failures, such as the Boeing 737 MAX, will be complemented by positive examples such as elevators, digital cameras, medical devices, and RST cars.
BEN SHNEIDERMAN is an Emeritus Distinguished University Professor in the Department of Computer Science, Founding Director (1983-2000) of the Human-Computer Interaction Laboratory, and a Member of the UM Institute for Advanced Computer Studies (UMIACS) at the University of Maryland. He is a Fellow of the AAAS, ACM, IEEE, and NAI, and a Member of the National Academy of Engineering, in recognition of his pioneering contributions to human-computer interaction and information visualization. His widely-used contributions include the clickable highlighted web-links, high-precision touchscreen keyboards for mobile devices, and tagging for photos. Shneiderman’s information visualization innovations include dynamic query sliders for Spotfire, development of treemaps for viewing hierarchical data, novel network visualizations for NodeXL, and event sequence analysis for electronic health records.
- Gary Klein, MacroCognition Corp:
- Artificial Intelligence Quotient (AIQ): Helping people get smarter about smart systems (SLIDES)
Abstract. The AIQ toolkit is designed to help people using a specific AI system understand the strengths and limitations of that system. In this way, it should enable users to judge when to trust the outputs of the system and when to be skeptical about them. These are cognitive support tools, primarily non-technological. To date, the AIQ suite consists of eight different tools, and it is likely that others will be added. Three of the tools are still being designed; the other five have been developed and applied. The AIQ toolkit is primarily aimed at users, but it should also have value for specialists designing AI systems and for evaluators. Some of the tools lend themselves to research, but our primary interest is in helping those who are trying to field AI systems to increase the impact of the systems, reduce their brittleness, and improve user acceptance.
GARY KLEIN, Ph.D., is a cognitive psychologist who helped to initiate the Naturalistic Decision Making movement in 1989. His Recognition-Primed Decision (RPD) model has been tested and replicated several times. He also developed a Data/Frame model of sensemaking and a Triple-Path model of insight. His work relies on Cognitive Task Analysis methods, primarily the Critical Decision method that he and his colleagues developed in 1985. In addition, he has formulated the Pre-Mortem method for identifying risks and the ShadowBox method for training cognitive skills. The five books he has authored and the three he has co-edited have sold over 100,000 copies. He also has a blog on Psychology Today that has received over 300,000 views. He founded Klein Associates, Inc. in 1977 and, when it grew to 37 employees, sold it to Applied Research Associates in 2005. He started his new company, ShadowBox LLC, in 2014.
- Robert R. Hoffman, Institute for Human and Machine Cognition:
- More Crucial Than Ever: The Concepts of Human-Centered Work System Design
Abstract. The Human-Centered Computing paradigm was intended to send a message to computer science, encouraging it to look beyond its disciplinary horizon defined by the Turing Test. HCC has been elaborated for over 25 years. Advocates have encouraged computer scientists to avoid the trap of designer-centered design and the trap of thinking only in terms of the one person-one machine work context. The broader notion of Human-Centered Work Systems goes beyond this message, to speak to a variety of disciplines and stakeholders. It embraces ideas from Cognitive Systems Engineering, Situated Cognition, Macrocognition, and Naturalistic Decision Making. Research in these areas has revealed a set of fundamental trade-offs that bound the activities of cognitive work systems and suggest means of measuring such capacities as resilience. This presentation will elaborate each of the fundamental bounds. The trade-offs they entail are brought into sharp relief by the handicapping caused by the procurement process. This presentation will discuss a way of overcoming barriers to the creation of human-centered work systems.
ROBERT HOFFMAN is Senior Research Scientist at the Institute for Human and Machine Cognition (IHMC). He is a recognized world leader in cognitive systems engineering and Human-Centered Computing. He is a Senior Member of the Association for the Advancement of Artificial Intelligence, Senior Member of the Institute of Electrical and Electronics Engineers, Fellow of the Association for Psychological Science, Fellow of the Human Factors and Ergonomics Society, and a Fulbright Scholar.
- Bonnie Dorr, Institute for Human and Machine Cognition:
- Human Language Technology for Active Defense against Social Engineering (SLIDES)
Abstract. At the intersection of cyber security and human language technology is a new, cross-disciplinary field, “Cyber-NLP,” wherein lies a problem of widespread importance: detection and thwarting of social engineering attacks. This talk motivates and describes an AI application that acts as an assistant to, but crucially not a replacement for, the human. This technology employs both natural language understanding and response generation as an active defense, through direct engagement with an attacker, during a social engineering attack. The assistant is under the control of the human, but is designed to help keep the human from unwittingly falling into traps that could harm the human or their organization. I will motivate the application’s design principles and will describe techniques for detection of asks and framings that pave the way for an AI-induced response that potentially wastes the attacker’s time and unveils their identity. Finally, I will generalize from attacks at the individual level (email, text, chat) to those at mass scale (social media) and will discuss human-centered AI as an alternative to automated censorship for the pervasive problem of social engineering.
Dr. BONNIE J. DORR is Associate Director and Senior Research Scientist of the Florida Institute for Human and Machine Cognition (IHMC), Professor of Computer Science at both University of West Florida (faculty associate) and University of Florida (courtesy professor), and Professor Emerita of Computer Science at the University of Maryland. She is a former DARPA Program Manager, former co-coordinator of the NIST Data Science Research Program, and former Associate Dean for the College of Computer, Mathematical, and Natural Sciences (CMNS). She co-founded and served as a director of the CLIP Laboratory in UMIACS. For 35 years, she has been conducting research in multilingual processing, summarization, and deep language understanding. She has led numerous DARPA and DoD/IC projects in representational semantics, language understanding, and data science and has participated in multiple NIST TAC and MT Evaluations. She is a former PI on DARPA SSIM and IARPA CAUSE and a current co-PI on DARPA ASED and DARPA ASIST. Her most recent work focuses on advances at the intersection of cyber, AI, and NLP, most notably cyber-event extraction and multilingual language processing for detecting attacks or discerning intentions of attackers. She holds a Ph.D. in computer science from the Massachusetts Institute of Technology. She is a Sloan Fellow, an NSF Presidential Faculty (PECASE) Fellow, former President of the Association for Computational Linguistics (2008), Fellow of the Association for the Advancement of Artificial Intelligence (2013), member of the Leadership Florida Class of XXXIII (2014), Fellow of the Association for Computational Linguistics (2016), and current member of DARPA’s Information Science and Technology (ISAT) study group (2020-2023).
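As a rough illustration of the “asks and framings” detection mentioned in the abstract above (an editorial sketch, not Dr. Dorr’s system), a minimal keyword-pattern detector could flag categories of asks in an incoming message. The category names and pattern lists below are invented for the example.

```python
import re

# Hypothetical categories of "asks" and "framings" a social-engineering
# attacker might use. Patterns are illustrative only.
ASK_PATTERNS = {
    "credential_ask":  re.compile(r"\b(password|login|verify your account|ssn)\b", re.I),
    "payment_ask":     re.compile(r"\b(wire|gift card|bitcoin|invoice|payment)\b", re.I),
    "urgency_framing": re.compile(r"\b(urgent|immediately|within 24 hours|act now)\b", re.I),
}

def detect_asks(message: str) -> dict:
    """Return which ask/framing categories fire for a message."""
    return {label: bool(pat.search(message)) for label, pat in ASK_PATTERNS.items()}

if __name__ == "__main__":
    email = "URGENT: verify your account password within 24 hours or lose access."
    print(detect_asks(email))
    # {'credential_ask': True, 'payment_ask': False, 'urgency_framing': True}
```

A deployed defense would rely on learned language understanding rather than fixed patterns; the sketch only shows where detected asks could feed a response-generation step that engages the attacker.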
12:50 pm – 2:40 pm ET
- Hernisa Kacorri, University of Maryland, College of Information Studies:
- Machine teaching: Enabling end-users to innovate and build AI-infused systems (SLIDES)
Abstract. As machine learning and artificial intelligence become more present in everyday applications, so do our efforts to better capture, understand, and imagine this coexistence. Machine teaching lies at the core of these efforts as it enables end-users and domain experts with no machine learning expertise to innovate and build AI-infused systems. Beyond helping to democratize machine learning, it offers an opportunity for a deeper understanding of how people perceive and interact with such systems to inform the design of future interfaces and algorithms. We examine how people conceptualize, experience and reflect on their engagement with machine teaching in the context of a supervised image classification task, a task where humans are extremely good compared to machines, especially when they possess prior knowledge of the image classes.
HERNISA KACORRI is an Assistant Professor in the College of Information Studies. She holds affiliate appointments in the Department of Computer Science and the Human-Computer Interaction Lab at the University of Maryland, College Park, and serves as core faculty at the Trace R&D Center. She received her Ph.D. in Computer Science in 2016 from The Graduate Center at the City University of New York, and has conducted research at the University of Athens, IBM Research-Tokyo, Lawrence Berkeley National Lab, and Carnegie Mellon University. Her research focuses on data-driven technologies that can benefit the disability community, with an emphasis on rigorous, user-based experimental methodologies to assess impact. Hernisa is a recipient of a Mina Rees Dissertation Fellowship in the Sciences, an ACM ASSETS best paper award and best paper finalist recognition, an ACM CHI honorable mention award, and an IEEE WACV best paper award. She has been recognized by the Rising Stars in EECS program of CMU/MIT.
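To make the machine-teaching setting in the abstract above concrete (an illustrative sketch only, not the speaker’s study apparatus): an end user “teaches” by giving a few examples per class, and the system classifies new items against what it was taught. The TeachableClassifier below is a hypothetical nearest-centroid learner over precomputed image feature vectors; the feature-extraction step is assumed to happen elsewhere.

```python
import numpy as np

class TeachableClassifier:
    """Minimal nearest-centroid classifier a non-expert could 'teach'
    by providing a few example feature vectors per class."""

    def __init__(self):
        self.centroids = {}  # class label -> mean feature vector

    def teach(self, label: str, examples: np.ndarray) -> None:
        """examples: (n_examples, n_features) feature vectors for one class."""
        self.centroids[label] = examples.mean(axis=0)

    def predict(self, x: np.ndarray) -> str:
        """Return the label whose centroid is closest to feature vector x."""
        return min(self.centroids, key=lambda c: np.linalg.norm(x - self.centroids[c]))

# Illustrative use with made-up 4-dimensional "image features".
clf = TeachableClassifier()
clf.teach("mug",    np.array([[0.9, 0.1, 0.0, 0.2], [0.8, 0.2, 0.1, 0.1]]))
clf.teach("laptop", np.array([[0.1, 0.9, 0.8, 0.0], [0.2, 0.8, 0.9, 0.1]]))
print(clf.predict(np.array([0.85, 0.15, 0.05, 0.15])))  # -> "mug"
```

The point of such interfaces is that the user never sees the math: they only add, remove, or relabel examples and observe how predictions change.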
- Brian M. Pierce, University of Maryland Applied Research Laboratory for Intelligence and Security (ARLIS):
- Engineering the Human-Machine Partnership
Abstract. Technology is evolving the machine in human-machine systems from a tool to a partner that has the potential to be much more than a tool. We need to establish what we want from this partnership before engaging in the engineering process of “design, build and test.” In other words, we need to first specify the system requirements. We are usually adept at identifying the functional requirements for a system, or what the system shall do, but we also need to specify the non-functional requirements, or what the system shall be. As AI moves machines from tools to partners, the question of what we want the AI-enhanced human-machine partnership to be assumes greater importance. For example, we want a tool to be safe, easy to use, and durable, but a partner should be more – such as explainable, trustworthy, resilient, and possess common sense. The talk focuses on technologies being developed by DARPA to help meet the emerging “shall be” requirements for the human-machine partnerships of the future.
Dr. BRIAN PIERCE joined ARLIS in 2020 as a Visiting Research Scientist, after completing his second tour at DARPA (2014-2019) as the Director and Deputy Director of the Information Innovation Office (I2O), driving advances in AI, data analytics, and cyber. Dr. Pierce also serves on the advisory board for sparks & honey, and is a mediaX Distinguished Visiting Scholar at Stanford University for 2019-2020. Dr. Pierce has over 35 years of experience developing advanced technologies in the aerospace/defense industry. Prior to joining DARPA in 2014, he was a Technical Director in Space and Airborne Systems at the Raytheon Company. During his first tour at DARPA, he served as the Deputy Director of the Strategic Technology Office from 2005 to 2010. From 2002 to 2005, he was Executive Director of the Electronics Division at Rockwell Scientific Company in Thousand Oaks, California. From 1983 to 2002, he held various engineering positions at Hughes Aircraft Company and Raytheon in southern California. Dr. Pierce earned a Doctor of Philosophy degree in chemistry, a Master of Science degree in chemistry, and a Bachelor of Science degree in chemistry and mathematics from the University of California at Riverside. He has published 40 technical articles and holds 28 U.S. patents.
- Frank Pasquale, University of Maryland School of Law, Baltimore:
- Machines Judging Humans: The Promise and Perils of Formalizing Evaluative Criteria (SLIDES)
Abstract. Over the past decade, algorithmic accountability has become an important concern for social scientists, computer scientists, journalists, and lawyers. Exposés have sparked vibrant debates about algorithmic sentencing. Researchers have exposed tech giants showing women ads for lower-paying jobs, discriminating against the aged, deploying deceptive dark patterns to trick consumers into buying things, and manipulating users toward rabbit holes of extremist content. Public-spirited regulators have begun to address algorithmic transparency and online fairness, building on the work of legal scholars who have called for technological due process, platform neutrality, and nondiscrimination principles.
This policy work is just beginning, as experts translate academic research and activist demands into statutes and regulations. Lawmakers are proposing bills requiring basic standards of algorithmic transparency and auditing. We are starting down a long road toward ensuring that AI-based hiring practices and financial underwriting are not used if they have a disparate impact on historically marginalized communities. And just as this “first wave” of algorithmic accountability research and activism has targeted existing systems, an emerging “second wave” of algorithmic accountability has begun to address more structural concerns. Both waves will be essential to ensure a fairer, and more genuinely emancipatory, political economy of technology. Second-wave work is particularly important when it comes to illuminating the promise and perils of formalizing evaluative criteria.
FRANK PASQUALE, JD, MPhil, Piper & Marbury Professor of Law at the University of Maryland, is an expert on the law of big data, predictive analytics, artificial intelligence, and algorithms. He is the author of The Black Box Society (Harvard University Press, 2015) and has served as a member of the Council on Big Data, Ethics & Society.
- Adarsh Subbaswamy, Johns Hopkins University:
- Proactively Addressing Failures to Engineer Reliable Machine Learning Systems (SLIDES)
Abstract. The increasing use of machine-learning-driven decision-making systems in high-impact applications, such as deciding bank loans and making clinical treatment decisions, has led to renewed emphasis on improving and ensuring the safety and reliability of these systems. To do so, system developers are forced to reason in advance about likely sources of failure and address them prior to deployment. In this talk, I will focus on a common source of failure in machine learning systems: shifts in population, behavior, or data collection between training and deployment. Leveraging tools from causality, I will discuss a framework that allows system developers to proactively reason about and express problematic shifts that can occur. Then, I will introduce new learning techniques that train models which are guaranteed to be robust to the shifts the developers specify.
ADARSH SUBBASWAMY is a PhD student in computer science at Johns Hopkins University, where he is advised by Suchi Saria. His research interests lie at the intersection of machine learning and healthcare, where he combines methods from causal inference, graphical models, and uncertainty quantification to improve and assess the reliability of machine learning models. He is associated with the Johns Hopkins Center of Excellence in Regulatory Science and Innovation and has interacted with the US FDA regarding the assessment of machine learning-driven medical devices.
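The abstract above concerns shifts between training and deployment that developers can name in advance. As a toy illustration (not the speaker’s causal framework), the sketch below simulates a feature that tracks the label only because of a site policy in force during training; a model that leans on it degrades when the policy changes, while a model restricted to the stable feature does not. All variable names and data are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, p_policy):
    """Toy data: 'signal' truly predicts y; 'policy' tracks y only while a
    site policy is active, so its usefulness can shift at deployment."""
    y = rng.integers(0, 2, n)
    signal = y + rng.normal(0, 1.0, n)
    policy_active = rng.random(n) < p_policy
    policy = np.where(policy_active, y, rng.integers(0, 2, n))  # spurious when active
    return np.column_stack([signal, policy]), y

X_train, y_train = make_data(5000, p_policy=0.9)   # training: policy mostly follows y
X_test,  y_test  = make_data(5000, p_policy=0.0)   # deployment: policy has shifted

full   = LogisticRegression().fit(X_train, y_train)            # uses both features
stable = LogisticRegression().fit(X_train[:, [0]], y_train)    # drops the unstable one

print("uses shifting feature:", full.score(X_test, y_test))
print("stable feature only:  ", stable.score(X_test[:, [0]], y_test))
```

The comparison is only meant to show why expressing a problematic shift up front matters; the talk’s techniques go further by guaranteeing robustness to declared shifts rather than simply discarding features.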
3:00 pm – 4:15 pm ET
- John Dickerson, University of Maryland, Dept of Computer Science:
- AI and Market Design: Defining the Division of Labor between Policymakers and Technicians
Abstract. Markets are systems that empower interested parties — humans, firms, governments, or autonomous agents — to exchange goods, services, and information. In some markets, such as stock and commodity exchanges, prices do all of the “work” of matching supply and demand. Due to logistical or societal constraints, many markets, e.g., school choice, rideshare, online dating, advertising, cadaveric organ allocation, online labor, public housing, refugee placement, and kidney exchange, cannot rely solely on prices to match supply and demand. Techniques from artificial intelligence (AI), computer science, and mathematics have a long history of both being created via, and also finding application in, the design and analysis of markets of both types. AI techniques determine how to discover structure in an uncertain matching problem, learn how to decide between matching now versus waiting, and balance competing objectives such as fairness, diversity, and economic efficiency. Yet, even defining what “best” means is important, often not obvious, and frequently involves a feedback loop between human stakeholders — each with their own value judgments — and automated systems. This talk covers optimization- and AI-based approaches to the design and analysis of markets, along with recent approaches to aggregating value judgments of human stakeholders and incorporating them into automated matching and resource allocation systems.
JOHN P DICKERSON is an Assistant Professor of Computer Science at the University of Maryland. His research centers on solving practical economic problems using techniques from computer science, stochastic optimization, and machine learning. He has worked extensively on theoretical and empirical approaches to designing markets for organ allocation, blood donation, school admissions, hiring, and computational advertising—and on analyzing the division of labor between policymakers and technical designers in those markets. He is an NSF CAREER Awardee, Facebook Fellow, Google Faculty Research Awardee, and Siebel Scholar.
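As a small, purely illustrative companion to the abstract above (not the speaker’s own models), the sketch below matches applicants to slots in a market without prices by maximizing total match quality, and shows how a stakeholder-chosen fairness weight for prioritized applicants can change who gets what. The quality matrix, priority vector, and fairness_weight parameter are all made up for the example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy matching market with no prices: 3 applicants, 3 slots.
# quality[i, j] = how well slot j suits applicant i (higher is better).
quality = np.array([
    [0.90, 0.20, 0.10],
    [0.80, 0.70, 0.10],
    [0.85, 0.30, 0.20],
])
priority = np.array([0.0, 0.0, 1.0])  # applicant 2 carries a fairness priority

def match(fairness_weight: float):
    """Max-weight assignment of applicants to slots, boosting prioritized
    applicants' scores; linear_sum_assignment minimizes, so negate."""
    score = quality * (1.0 + fairness_weight * priority[:, None])
    rows, cols = linear_sum_assignment(-score)
    return list(zip(rows.tolist(), cols.tolist()))

print("efficiency only:    ", match(0.0))  # prioritized applicant gets the leftover slot
print("with fairness boost:", match(1.0))  # prioritized applicant moves up; others shift
```

Choosing fairness_weight is exactly the kind of value judgment the talk assigns to policymakers and stakeholders rather than to the optimizer.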
- Alison Smith-Renner, Decisive Analytics Corp & University of Maryland:
- Why didn’t you listen to me? Exploring humans’ perceptions and experience when controlling transparent models (SLIDES)
Abstract. Interactive machine learning techniques let users inject domain expertise to improve or adapt models. However, these models must balance user input and the underlying data, meaning they sometimes update slowly, poorly, or unpredictably—either by not incorporating user input as expected (adherence) or by making other unexpected changes (instability). While prior work explores control in terms of whether or how users provide feedback, less attention is paid to users’ reactions when their feedback is not applied predictably—a case that is especially prominent in transparent systems where controls are easy to validate. To address this, we performed two studies to examine users’ experience when controlling topic models, which are inherently transparent and can vary in terms of adherence, stability, and update speeds. Users disliked slow updates most, followed by lack of adherence. Instability was polarizing: some users liked it when it surfaced interesting information, while others did not. These studies highlight a need to better understand how users expect and desire machine learning systems to adhere to their input and how to manage such expectations appropriately.
ALISON SMITH-RENNER is a Ph.D. candidate in the Department of Computer Science at the University of Maryland, College Park (graduating in May 2020). She is advised by Dr. Leah Findlater and Dr. Jordan Boyd-Graber. She also leads the Machine Learning Visualization Lab for Decisive Analytics Corporation, where she designs user interfaces and visualizations for non-ML experts to interact with intelligent systems and their results. Her research interests lie at the intersection of machine learning and human-computer interaction, with a particular focus on enhancing users’ understanding and interaction with machine learning without requiring prior expertise. She is active in the explainable machine learning and human-centered machine learning communities.
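To ground the terms “adherence” and “instability” used in the abstract above (an editorial illustration, not the measures from the studies), the sketch below compares a hypothetical topic model’s top words before and after a user asks that one word belong to a single topic: adherence checks whether the request was honored, and instability measures how much untouched topics drifted.

```python
# Illustrative only: quantify "adherence" and "instability" when a user edits
# a (hypothetical) topic model by asking that a word move to one topic.

before = {  # topic id -> top words before the user's edit (made-up data)
    0: {"game", "team", "season", "coach"},
    1: {"stock", "market", "shares", "coach"},
    2: {"film", "director", "cast", "scene"},
}
after = {   # model state after asking that "coach" belong only to topic 0
    0: {"game", "team", "season", "coach"},
    1: {"stock", "market", "shares", "fund"},
    2: {"film", "actor", "cast", "scene"},
}

def adherence(after, word, wanted_topic):
    """Did the update honor the user's request for this word?"""
    return word in after[wanted_topic] and all(
        word not in words for t, words in after.items() if t != wanted_topic)

def instability(before, after):
    """Average change (1 - Jaccard overlap) across all topics."""
    changes = [1 - len(before[t] & after[t]) / len(before[t] | after[t]) for t in before]
    return sum(changes) / len(changes)

print("adherence to 'coach' -> topic 0:", adherence(after, "coach", 0))  # True
print("instability across topics:", round(instability(before, after), 2))
```

In the studies, users tolerated some drift when it surfaced interesting information; a toy score like this only shows how such drift could be made visible to them.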
- Hodan Omaar, Center for Data Innovation:
- Policy-in-the-Loop: A Survey of Legislative and Regulatory Proposals to Increase Human Control Over AI (SLIDES)
Abstract. There are a number of policy proposals to increase human control over AI systems. Some of these focus broadly on legislative or regulatory oversight of algorithmic decision making, while others focus more narrowly on the use of AI for specific applications, such as in autonomous vehicles, autonomous weapons, or medical devices. This talk will provide an overview of the various types of policy proposals and discuss how they might intersect with technical measures to provide more trusted AI.
HODAN OMAAR is a policy analyst at the Center for Data Innovation focusing on AI policy. Previously, she worked as a senior consultant on technology and risk management in London and as a crypto-economist in Berlin. She has a master’s degree in economics and math from the University of Edinburgh.
We are grateful for sponsorship support from:
- Adobe Research (Shriram Revankar)
- UMD Dept of Computer Science (Ming Lin, Chair)
- Applied Research Laboratory for Intelligence and Security (ARLIS) (Bill Regli, Director)