How to assess “good usability”?

Smit Desai
Oct 27, 2020 · 7 min read

Arthur C. Clarke, the British science fiction writer, famously said that “any sufficiently advanced technology is indistinguishable from magic”. As HCI researchers, we reserve the right to call ourselves magicians in the correct context. Asking what makes an application good is tantamount to asking “what is magic?”. However, unlike magicians, we reveal our trade secrets quite openly, so let’s dive into the ingredients of magic (and the dark arts).

A brief history of “what’s good?”

Millions of artists and designers over the history of humankind have tried to establish a definition of good design. The first recorded instance of this endeavor comes in the form of three design principles, given by Vitruvius in the first century BC (Zhang & Galletta, 2015):

  • Firmitas: Firmness, strength, and durability of the design.
  • Utilitas: The usefulness and applicability of the design.
  • Venustas: The aesthetic appeal of the design.

Designs that complied with the above principles were judged to be good. The most famous application of these design principles is Leonardo da Vinci’s “Vitruvian Man”: da Vinci used Vitruvius’ principles to depict an ideal human body with perfect proportions. It would not be wrong to call da Vinci one of the first user experience designers.

Some of the modern definitions of usability are derived from the field of “Human Factors and Ergonomics”. Owing to the advances made in modern psychology by Sigmund Freud and his one-time collaborator Carl Jung, it became possible to adopt a cognitive view of “good applications”.

Usability Engineering

The definition of “good” and “bad” we are interested in comes from the field of “Usability Engineering”. Usability engineering was conceived in response to the growing need to design screen interfaces that were relevant to the needs of users and did not impose a steep learning curve. Computers designed before the 1980s assumed their users to be experts and did not factor the concerns of lay users into their design. However, as the prices of computers kept falling, more and more novice users started buying personal computers for work and recreational use. The prevailing school of thought at that time was “essentialism”, wherein usability was seen as an inherent feature of an interface (Cockton, 2004), with an emphasis on designing “user-friendly” interfaces. Consequently, user interfaces came to be judged on an inscrutable, single-dimensional metric of “user-friendliness”.

In the late 1980s and early 1990s, a new school of thought started emerging which envisioned usability as an emergent property of interactivity. This idea was prominently championed by Lucy Suchman (1987), who underscored the importance of mutual intelligibility between humans and machines. Her idea took the form of “situated” actions, wherein the human is aware of the context of their inquiries and the state of the machine, and the machine is expected to provide relevant responses (or the lack thereof) based on the human’s actions. Contextual usability gained prominence with the contributions of Don Norman and Jakob Nielsen, who, building on this contextual definition of usability, helped establish the field of “usability engineering”.

Modern definitions

Whether a piece of software is good or not depends on its usability. Jakob Nielsen (1993) identified five attributes that determine the usability of an interface: (1) learnability, (2) efficiency, (3) memorability, (4) errors (a low error rate and easy recovery), and (5) satisfaction. Don Norman (2002), in The Design of Everyday Things, tried to define good design from a psychological perspective. Using human-centered design (HCD), which puts the needs of the user at the center of the design process, he characterized good design as one which “requires good communication, especially from machine to person, indicating what actions are possible, what is happening, and what is about to happen”. Norman catapulted words like affordances, signifiers, constraints, mappings, and feedback into the vocabulary of every interface designer. However, the most important concept Norman introduced was the “conceptual model”. He argued that for a design to be good, it needs to match the conceptual model (or mental model) of the user. A conceptual model is an explanation of how something works in a user’s mind. It does not have to be accurate, but it needs to be a workable approximation of the system image the interface presents. An interface can be judged good if the designer’s conceptual model and the user’s conceptual model can be bridged.

A designer must also help the user cross two other bridges: the “gulf of execution” and the “gulf of evaluation”. The gulf of execution reflects how easily a user can figure out what they can do with the system, and the gulf of evaluation reflects how easily a user can assess what the system has done in response to their actions. If crossing both gulfs requires minimal cognitive effort, the piece of software is “good”.

That’s all good, but how do we measure “goodness”?

One of the goals of Human-Computer Interaction (HCI) research is to put people first in all endeavors. It’s the reason why the name HCI is preferred over Computer-Human Interaction (CHI): it puts the human first, at least symbolically. As a result of this human-first approach, the usability of the system becomes paramount in assessing whether a system is good or bad.

Unfortunately for HCI researchers, there is no craftily constructed magic box that accepts a system as input and produces a usability score as output. Since “usability” is an abstract concept, it cannot be measured as a whole; instead, its parameters are measured. By measuring a system’s learnability, efficiency, memorability, errors, and satisfaction, one can attempt to measure its usability. Various methods and approaches have been employed to capture these parameters. They can be classified into two types: analytical and empirical.
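
To make this concrete, here is a minimal sketch, in Python, of how one might summarize these five attributes from usability-test session records. The record fields, the 1–7 rating scale, and the use of return visits as a proxy for memorability are my own illustrative assumptions, not a standard instrument.

```python
# A minimal sketch, not a standard instrument: summarizing Nielsen's five
# usability attributes from hypothetical usability-test session records.
from dataclasses import dataclass
from statistics import mean

@dataclass
class Session:
    task_completed: bool       # did the participant finish the task?
    time_on_task_s: float      # seconds spent on the task (efficiency proxy)
    errors: int                # slips and mistakes observed by the moderator
    satisfaction_1_to_7: int   # post-task rating (assumed 1-7 scale)
    is_return_visit: bool      # second exposure, used to approximate memorability

def summarize(sessions: list[Session]) -> dict[str, float]:
    first_time = [s for s in sessions if not s.is_return_visit]
    # Fall back to all sessions if no returning participants were observed.
    returning = [s for s in sessions if s.is_return_visit] or sessions
    return {
        # Learnability: success rate of first-time users.
        "learnability": mean(float(s.task_completed) for s in first_time),
        # Efficiency: average time on task (lower is better).
        "efficiency_s": mean(s.time_on_task_s for s in sessions),
        # Memorability: success rate of returning users.
        "memorability": mean(float(s.task_completed) for s in returning),
        # Errors: average error count per session.
        "errors_per_session": mean(s.errors for s in sessions),
        # Satisfaction: average post-task rating.
        "satisfaction": mean(s.satisfaction_1_to_7 for s in sessions),
    }

if __name__ == "__main__":
    data = [
        Session(True, 95.0, 1, 6, False),
        Session(False, 180.0, 4, 3, False),
        Session(True, 60.0, 0, 6, True),
    ]
    print(summarize(data))
```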

Analytical research methods simulate people using a system and then use heuristics and expert judgment to predict usability problems. The most widely used analytical usability methods are the following (Krug, 2005; Nielsen, 1993); a small sketch of tallying heuristic-evaluation findings appears after the list:

  • Heuristic Evaluation: Experts check an interface against a collection of design principles (heuristics) to uncover potential usability problems without conducting user tests.
  • Cognitive Walkthrough: Instead of a user, an expert defines tasks and mimics the role of a user to check whether the user would know what to do, how to do it, and how to interpret the feedback for each task.
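
As a rough illustration of how heuristic-evaluation findings are often recorded, here is a minimal Python sketch that logs each finding against one of Nielsen’s ten heuristics with a 0–4 severity rating and ranks the heuristics by total severity. The findings themselves are invented for the example.

```python
# A minimal sketch, not a standard tool: tallying heuristic-evaluation
# findings by heuristic and ranking them by total severity (0-4 scale).
from collections import defaultdict

NIELSEN_HEURISTICS = [
    "Visibility of system status",
    "Match between system and the real world",
    "User control and freedom",
    "Consistency and standards",
    "Error prevention",
    "Recognition rather than recall",
    "Flexibility and efficiency of use",
    "Aesthetic and minimalist design",
    "Help users recognize, diagnose, and recover from errors",
    "Help and documentation",
]

# Each finding: (heuristic violated, severity 0-4, short description).
findings = [
    ("Visibility of system status", 3, "No progress indicator while saving"),
    ("Error prevention", 4, "Delete button has no confirmation step"),
    ("Consistency and standards", 2, "Two different icons mean 'share'"),
]

severity_by_heuristic = defaultdict(int)
for heuristic, severity, _description in findings:
    assert heuristic in NIELSEN_HEURISTICS, f"Unknown heuristic: {heuristic}"
    severity_by_heuristic[heuristic] += severity

# Report the most problematic heuristics first.
for heuristic, total in sorted(severity_by_heuristic.items(),
                               key=lambda kv: kv[1], reverse=True):
    print(f"{total:>2}  {heuristic}")
```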

Since end-users are the most important component of usability research, designing for users must also mean designing with the help of users. Empirical research methods assess how well a system works for the people who actually use it. Users reveal this through their attitudes toward the system and their behavior while using it. To capture these attitudes and behaviors, two types of research methods are used: attitudinal and behavioral. Attitudinal research methods reveal what users say or feel about the system, and behavioral research methods are used to understand how users actually use it. The main attitudinal research methods are:

  • Interviews: Structured or unstructured interviews are conducted in the lab or in the user’s own context to understand how users interact with the system.
  • Surveys: Intercept or emailed surveys are used to gain a broad but typically shallow understanding of how users perceive the system.
  • Card sorting: This method is used to organize information in a system by taking into account how users see and group the components of an interface (a small analysis sketch follows this list).
  • Participatory Design: Users are asked to design their own version of a system to understand their mental models.
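
As an illustration of how open card-sort data can be analyzed, here is a minimal Python sketch that counts how often each pair of cards is placed in the same group across participants; stronger co-occurrence suggests the items belong together in the information architecture. The cards and sorts are invented for the example.

```python
# A minimal sketch of analyzing an open card sort: counting how often each
# pair of cards ends up in the same group across participants.
from itertools import combinations
from collections import Counter

# Each participant's sort: a list of groups, each group a set of card names.
sorts = [
    [{"Invoices", "Receipts"}, {"Profile", "Password"}],
    [{"Invoices", "Receipts", "Password"}, {"Profile"}],
    [{"Invoices"}, {"Receipts"}, {"Profile", "Password"}],
]

co_occurrence = Counter()
for participant in sorts:
    for group in participant:
        for a, b in combinations(sorted(group), 2):
            co_occurrence[(a, b)] += 1

# Pairs grouped together by more participants suggest stronger association,
# which can inform how the interface's information is organized.
for (a, b), count in co_occurrence.most_common():
    print(f"{a} + {b}: grouped together by {count} of {len(sorts)} participants")
```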

While attitudinal research methods are largely qualitative in nature, behavioral research methods can be qualitative or quantitative. Qualitative behavioral research methods include:

  • Usability Testing: Users are asked to perform a set of tasks in a lab or online in the presence of a usability researcher.
  • Ethnographic field study: Researchers study the use of an interface in the users’ context.

Quantitative behavioral research methods include:

  • A/B testing or multivariate testing: Different versions of a design are shown to different groups of users, and the effect of each version on user behavior is measured (a minimal statistical sketch follows this list).
  • Clickstream analysis: In this method, time spent on an interface, number of clicks, engagement (depending on context), click-through rates, etc. are measured.
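
To illustrate the quantitative flavor of these methods, here is a minimal Python sketch of analyzing an A/B test with a two-proportion z-test on conversion counts. The numbers are invented, and a real study would also plan sample sizes in advance and account for multiple comparisons.

```python
# A minimal sketch of analyzing an A/B test: a two-proportion z-test on
# conversion counts for two design variants.
from math import sqrt, erf

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return p_a, p_b, z, p_value

# Variant A: 120 conversions out of 2,400 sessions; variant B: 160 of 2,380.
p_a, p_b, z, p = two_proportion_z_test(120, 2400, 160, 2380)
print(f"A: {p_a:.3f}  B: {p_b:.3f}  z = {z:.2f}  p = {p:.4f}")
```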

In practical use, no single research method works as a panacea. Since all these research methods capture different aspects of usability, it is important for researchers to prioritize the method which is most relevant to their users’ context. Consequently, researchers end up using a combination of these methods to measure the “goodness” of a piece of software.

But are these measures good?

The success of these measures is dependent on the success of the system. If a system developed using these methods results in a higher return on investment (ROI) or increased user engagement (measured using key performance indicators), then these measures could be considered good. Historically, the use of usability methods has provided returns of up to 500% in terms of ROI (Sauro & Lewis, 2012).

Furthermore, anecdotal experience with interfaces over the years also suggests that these measures have produced interfaces that are increasingly easy to use and visually appealing.

Dark Arts

Edward Tufte famously noted that “only two industries refer to their customers as ‘users’: illegal drugs and software.” In many ways, with the help of the internet, systems have become like a drug, as people spend more time looking at their screens every year (Livingston, 2019). This behavior places a tremendous responsibility on the shoulders of HCI researchers, since “good” software drives user behavior and shapes new metaphors. It is now essential to make ethics a crucial part of the human-centered design process, so that the software and systems we create are good in the moral sense of the word as well.

References

Cockton, G. (2004). Usability Evaluation. Retrieved from https://www.interaction-design.org/literature/book/the-encyclopedia-of-human-computer-interaction-2nd-ed/usability-evaluation

Krug, S. (2005). Don’t Make Me Think: A Common Sense Approach to Web Usability (2nd ed.). Thousand Oaks, CA, USA: New Riders Publishing.

Livingston, G. (2019). Americans 60 and older are spending more time in front of their screens than a decade ago. Retrieved November 4, 2019, from Pew Research Center website: https://www.pewresearch.org/fact-tank/2019/06/18/americans-60-and-older-are-spending-more-time-in-front-of-their-screens-than-a-decade-ago/

Nielsen, J. (1993). Usability Engineering. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.

Norman, D. A. (2002). The Design of Everyday Things. New York, NY, USA: Basic Books, Inc.

Sauro, J., & Lewis, J. R. (2012). Quantifying the User Experience: Practical Statistics for User Research (1st ed.). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.

Suchman, L. A. (1987). Plans and Situated Actions: The Problem of Human-machine Communication. New York, NY, USA: Cambridge University Press.

Zhang, P., & Galletta, D. F. (Eds.). (2015). Human-Computer Interaction and Management Information Systems: Foundations. London; New York: Routledge.
