Human Computer Interaction (CS408)
VU
Lecture 31. Evaluation Part VII
Learning Goals
The aim of this lecture is to understand how to perform evaluation through usability testing.
What is Usability Testing?
While there can be wide variations in where and how you conduct a usability test, every usability test shares these five characteristics:
1. The primary goal is to improve the usability of a product. For each test, you also have more specific goals and concerns that you articulate when planning the test.
2. The participants represent real users.
3. The participants do real tasks.
4. You observe and record what participants do and say.
5. You analyze the data, diagnose the real problems, and recommend changes to fix those problems.
The Goal is to Improve the Usability of a Product
The primary goal of a usability test is to improve the usability of the product that is being tested. Another goal, as we will discuss in detail later, is to improve the process by which products are designed and developed, so that you avoid having the same problems again in other products.
This characteristic distinguishes a usability test from a research study, in which the goal is to investigate the existence of some phenomenon. Although the same facility might be used for both, they have different purposes. This characteristic also distinguishes a usability test from a quality assurance or function test, which has a goal of assessing whether the product works according to its specifications.
Within the general goal of improving the product, you will have more specific goals and concerns that differ from one test to another.
You might be particularly concerned about how easy it is for users to navigate through the menus. You could test that concern before coding the product, by creating an interactive prototype of the menus, or by giving users paper versions of each screen.
You might be particularly concerned about whether the interface that you have developed for novice users will also be easy for and acceptable to experienced users.
For one test, you might be concerned about how easily the customer representatives who do installations will be able to install the product. For another test, you might be concerned about how easily the client's nontechnical staff will be able to operate and maintain the product.
These more specific goals and concerns help determine which users are appropriate participants for each test and which tasks are appropriate to have them do during the test.
The Participants Represent Real Users
The people who come to test the product must be members of the group of people who now use or who will use the product. A test that uses programmers when the product is intended for legal secretaries is not a usability test.
The quality assurance people who conduct function tests may also find usability problems, and the problems they find should not be ignored, but they are not conducting a usability test. They are not real users, unless it is a product about function testing. They are acting more like expert reviewers.
If the participants are more experienced than actual users, you may miss problems that will cause the product to fail in the marketplace. If the participants are less experienced than actual users, you may be led to make changes that aren't improvements for the real users.
The Participants Do Real Tasks
The tasks that you have users do in the test must be ones that they will do with the product on their jobs or in their homes. This means that you have to understand users' jobs and the tasks for which this product is relevant.
In many usability tests, particularly of functionally rich and complex software products, you can only test some of the many tasks that users will be able to do with the product. In addition to being realistic and relevant for users, the tasks that you include in a test should relate to your goals and concerns and have a high probability of uncovering a usability problem.
Observe and Record What the Participants Do and Say
In a usability test, you usually have several people come, one at a time, to work with the product. You observe the participant, recording both performance and comments.
You also ask the participant for opinions about the product. A usability test includes both times when participants are doing tasks with the product and times when they are filling out questionnaires about the product.
Observing and recording individual participants' behaviors distinguishes a usability test from focus groups, surveys, and beta testing.
A typical focus group is a discussion among 8 to 10 real users, led by a professional moderator. Focus groups provide information about users' opinions, attitudes, preferences, and their self-report about their performance, but focus groups do not usually let you see how users actually behave with the product.
Surveys, by telephone or mail, let you collect information about users' opinions, attitudes, preferences, and their self-report of behavior, but you cannot use a survey to observe and record what users actually do with a product.
A typical beta test (field test, clinical trial, user acceptance test) is an early release of a product to a few users. A beta test has ecological validity; that is, real people are using the product in real environments to do real tasks. However, beta testing seldom yields any useful information about usability. Most companies have found beta testing to be too little, too unsystematic, and much too late to be the primary test of usability.
Analyze the Data, Diagnose the Real Problems, and Recommend Changes to Fix Those Problems
Collecting the data is necessary, but not sufficient, for a usability test. After the test itself, you still need to analyze the data. You consider the quantitative and qualitative data from the participants together with your own observations and users' comments. You use all of that to diagnose and document the product's usability problems and to recommend solutions to those problems.
The Results Are Used to Change the Product - and the Process
We would also add another point. It may not be part of the definition of the usability test itself, as the previous five points were, but it is crucial, nonetheless.
A usability test is not successful if it is used only to mark off a milestone on the development schedule. A usability test is successful only if it helps to improve the product that was tested and the process by which it was developed.
What Is Not Required for a Usability Test?
Our definition leaves out some features you may have been expecting to see, such as:
· a laboratory with one-way mirror
· data-logging software
· videotape
· a formal test report
Each of these is useful, but not necessary, for a successful usability test. For example, a memorandum of findings and recommendations or a meeting about the test results, rather than a formal test report, may be appropriate in your situation.
Each of these features has advantages in usability testing that we discuss in detail later, but none is an absolute requirement. Throughout the book, we discuss methods that you can use when you have only a shoestring budget, limited staff, and limited testing equipment.
When is a Usability Test Appropriate?
Nothing in our definition of a usability test limits it to a single, summative test at the end of a project. The five points in our definition are relevant no matter where you are in the design and development process. They apply to both informal and formal testing. When testing a prototype, you may have fewer participants and fewer tasks, take fewer measures, and have a less formal reporting procedure than in a later test, but the critical factors we outline here and the general process we describe in this book still apply. Usability testing is appropriate iteratively from predesign (test a similar product or earlier version), through early design (test prototypes), and throughout development (test different aspects, retest changes).
Questions that Remain in Defining Usability Testing
We recognize that our definition of usability testing still has some fuzzy edges.
· Would a test with only one participant be called a usability test? Probably not. You probably need at least two or three people representing a subgroup of users to feel comfortable that you are not seeing idiosyncratic behavior.
· Would a test in which there were no quantitative measures qualify as a usability test? Probably not. To substantiate the problems that you report, we assume that you will take at least some basic measures, such as number of participants who had the problem, or number of wrong choices, or time to complete a task. The actual measures will depend on your specific concerns and the stage of design or development at which you are testing. The measures could come from observations, from recording with a data-logging program, or from a review of the videotape after the test. The issue is not which measures or how you collect them, but whether you need to have some quantitative data to have a usability test.
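The basic measures named above (number of participants who had the problem, number of wrong choices, time to complete a task) need very little tooling to tally. The following Python sketch is only an illustration: the record fields, the task name, and the sample values are hypothetical and do not come from the lecture or from any particular data-logging program.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskRecord:
    """One participant's attempt at one task (all fields are assumed names)."""
    participant: str    # anonymized participant ID
    task: str           # task label from the test plan
    seconds: float      # time to complete the task
    wrong_choices: int  # e.g. wrong menu selections observed
    had_problem: bool   # observer judged that the participant hit the problem


def summarize(records, task):
    """Return the basic quantitative measures for a single task."""
    rows = [r for r in records if r.task == task]
    if not rows:
        raise ValueError(f"no records for task {task!r}")
    return {
        "participants": len(rows),
        "participants_with_problem": sum(r.had_problem for r in rows),
        "total_wrong_choices": sum(r.wrong_choices for r in rows),
        "mean_seconds": mean(r.seconds for r in rows),
    }


# Made-up observations from three sessions, for illustration only.
records = [
    TaskRecord("P1", "set alarm", 95.0, 2, True),
    TaskRecord("P2", "set alarm", 40.0, 0, False),
    TaskRecord("P3", "set alarm", 120.0, 3, True),
]
summary = summarize(records, "set alarm")
print(summary)
```

Keeping one record per participant per task makes the counts comparable across participants, whether the values were noted by an observer or recovered later from a recording.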
Usability testing is still a relatively new development; its definition is still emerging. You may have other questions about what counts as a usability test. Our discussion of usability testing and of other usability engineering methods, in this chapter and the next three chapters, may help clarify your own thinking about how to define usability testing.
Testing Applies to All Types of Products
If you read the literature on usability testing, you might think that it is only about testing software for personal computers. Not so. Usability testing works for all types of products. In the last several years, we've been involved in usability testing of all these products:
Consumer products
· Regular TVs
· High-definition TVs
· VCRs
· Cordless telephones
· Telephone/answering machines
· Business telephones
Medical products
· Bedside terminal
· Anesthesiologist's workstation
· Patient monitor
· Blood gas analyzer
· Integrated communication system for wards
· Nurse's workstation for intensive care units
Engineering devices
· Digital oscilloscope
· Network protocol analyzer (for maintaining computer networks)
Application software for microcomputers, minicomputers, and mainframes
· Electronic mail
· Database management software
· Spreadsheets
· Time management software
· Compilers and debuggers for programming languages
Operating system software
Other
· Voice response systems (menus on the telephone)
· Automobile navigation systems (in-car information about how to get where you want to go)
The procedures for the test may vary somewhat depending on what you are testing and the questions you are asking. We give you hints and tips, where appropriate, on special concerns when you are focusing the testing on hardware or documentation; but, in general, we don't find that you need to change the approach much at all.
Most of the examples in this book are about testing some type of hardware or software and the documentation that goes with it. In some cases, the hardware used to be just a machine and is now a special-purpose computer. For usability testing, however, the product doesn't even have to involve any hardware or software. You can use the techniques in this book to develop usable
· application or reporting forms
· instructions for noncomputer products, like bicycles
· interviewing techniques
· nonautomated procedures
· questionnaires
Testing All Types of Interfaces
Any product that people have to use, whether it is computer-based or not, has a user interface. Norman, in his marvelous book The Design of Everyday Things (1988), points out problems with doors, showers, light switches, coffee pots, and many other objects that we come into contact with in our daily lives. With creativity, you can plan a test of any type of interface.
Consider an elevator. The buttons in the elevator are an interface: the way that you, the user, talk to the computer that now drives the machine. Have you ever been frustrated by the way the buttons in an elevator are arranged? Do you search for the one you want? Do you press the wrong one by mistake?
You might ask: How could you test the interface to an elevator in a usability laboratory? How could the developers find the problems with an elevator interface before building the elevator, at which point it would be too expensive to change?
In fact, an elevator interface could be tested before it is built. You could create a simulation of the proposed control panel on a touchscreen computer (a prototype). You could even program the computer to make the alarm sound and to make the doors seem to open and close, based on which buttons users touch. Then you could bring in users one at a time, give them realistic situations, and have them use the touchscreen as they would the panel in the elevator.
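The logic behind such a simulated panel can be very small. The sketch below is a hypothetical Python illustration of the idea only: the class name, the button labels, and the feedback strings are assumptions, and a real prototype would draw the buttons on the touchscreen and play an actual alarm sound rather than return text.

```python
class ElevatorPanelPrototype:
    """Simulated control panel that logs every touch for later analysis."""

    def __init__(self, floors):
        self.floors = list(floors)  # labels of the floor buttons
        self.doors_open = True
        self.alarm_on = False
        self.log = []               # (button, feedback) pairs in the order pressed

    def press(self, button):
        """Update the simulated state and return the feedback shown to the user."""
        if button == "open":
            self.doors_open = True
            feedback = "doors open"
        elif button == "close":
            self.doors_open = False
            feedback = "doors close"
        elif button == "alarm":
            self.alarm_on = True
            feedback = "alarm sounds"
        elif button in self.floors:
            self.doors_open = False
            feedback = f"doors close, going to floor {button}"
        else:
            feedback = "no response"
        self.log.append((button, feedback))
        return feedback


# One participant's touches during a task such as "go to the third floor".
panel = ElevatorPanelPrototype(floors=["1", "2", "3"])
panel.press("3")
panel.press("alarm")
print(panel.log)
```

Because every touch is logged in order, wrong presses (for example, hitting the alarm while looking for a floor button) can be counted after the session.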
Testing All Parts of the Product
Depending on where in the development process you are and what you are particularly concerned about, you may want to focus the usability test on a specific part of the product, such as
· installing hardware
· operating hardware
· cleaning and maintaining hardware
· understanding messages about the hardware
· installing software
· navigating through menus
· filling out fields
· recovering from errors
· learning from online or printed tutorials
· finding and following instructions in a user's guide
· finding and following instructions in the online help
Testing Different Aspects of the Documentation
When you include documentation in the test, you have to decide if you are more interested in whether users go to the documentation or in how well the documentation works for them when they do go to it. It is difficult to get answers to both of those concerns at the same time.
If you want to find out how much people learn from a tutorial when they use it, you can set up a test in which you ask people to go through the tutorial. Your test participants will do as you ask, and you will get useful information about the design, content, organization, and language of the tutorial.
You will, however, not have any indication of whether anyone will actually open the tutorial when they get the product. To test that, you have to set up your test differently. Instead of instructing people to use the tutorial, you have to give them tasks and let them know the tutorial is available. In this second type of test, you will find out which types of users are likely to try the tutorial, but if few participants use it, you won't get much useful information for revising the tutorial.
Giving people instructions that encourage them to use the manual or tutorial may be unrealistic in terms of what happens in the world outside the test laboratory, but it is necessary if your concern is the usability of the documentation. At some point in the process of developing the product, you should be testing the usability of the various types of documentation that users will get with the product.
At other points, however, you should be testing the usability of the product in the situation in which most people will receive it. Here's an example:
A major company was planning to put a new software product on its internal network. The product had online help and a printed manual, but, in reality, few users would get a copy of the manual.
The company planned to maintain a help desk, and a major concern for the usability test was that if people did not get the manual, they would have to use the online help, call the help desk, or ask a co-worker. The company wanted to keep calls to the help desk to a minimum, and the testers knew that when one worker asks another for help, two people are being unproductive for the company.
When they tested the product, therefore, this test team did not include the manual. Participants were told that the product includes online help, and they were given the phone number of the help desk to call if they were really stuck. The test team focused on where people got stuck, how helpful the online help was, and at what points people called the help desk.
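Observations of this kind (where people got stuck, when they opened the online help, when they called the help desk) can be recorded as a simple event log and tallied per task. The Python sketch below is a hypothetical illustration; the event names, task names, and sample entries are assumptions and are not data from the test described above.

```python
from collections import Counter

# Each entry is (participant, task, event). The three event kinds are
# assumed labels: "stuck", "online_help", and "help_desk_call".
events = [
    ("P1", "create report", "stuck"),
    ("P1", "create report", "online_help"),
    ("P2", "create report", "stuck"),
    ("P2", "create report", "help_desk_call"),
    ("P2", "print report", "online_help"),
]


def tally(events, kind):
    """Count how often one kind of event occurred in each task."""
    return Counter(task for _, task, event in events if event == kind)


help_desk_calls = tally(events, "help_desk_call")
stuck_points = tally(events, "stuck")
print(help_desk_calls)
print(stuck_points)
```

A tally like this points to the tasks that generate the most help desk calls, which was the concern driving the test in the example.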
This test gave the product team a lot of information to improve the interface and the online help to satisfy the concern that drove the test. However, this test yielded no information to improve the printed manual. That would require a different test.
Testing with Different Techniques
In most usability tests, you have one participant at a time working with the product. You usually leave that person alone and observe from a corner of the room or from behind a one-way mirror. You intervene only when the person "calls the help desk," which you record as a need for assistance.
You do it this way because you want to simulate what will happen when individual users get the products in their offices or homes. They'll be working on their own, and you won't be right there in their rooms to help them.
Sometimes, however, you may want to change these techniques. Two ideas that many teams have found useful are:
· co-discovery, having two participants work together
· active intervention, taking a more active role in the test
Co-discovery
Co-discovery is a technique in which you have two participants work together to perform the tasks (Kennedy, 1989). You encourage the participants to talk to each other as they work.
Talking to another person is more natural than thinking out loud alone. Thus, co-discovery tests often yield more information about what the users are thinking and what strategies they are using to solve their problems than you get by asking individual participants to think out loud.
Hackman and Biers (1992) have investigated this technique. They confirmed that co-discovery participants make useful comments that provide insight into the design. They also found that having two people work together does not distort other results. Participants who worked together did not differ in their performance or preferences from participants who worked alone.
Co-discovery is more expensive than single participant testing, because you have to pay two people for each session. In addition, it may be more difficult to watch two people working with each other and the product than to watch just one person at a time. Co-discovery may be used anytime you conduct a usability test, but it is especially useful early in design because of the insights that the participants provide as they talk with each other.
Active Intervention
Active intervention is a technique in which a member of the test team sits in the room with the participant and actively probes the participant's understanding of whatever is being tested. For example, you might ask participants to explain what they would do next and why as they work through a task. When they choose a particular menu option, you might ask them to describe their understanding of the menu structure at that moment. By asking probing questions throughout the test, rather than in one interview at the end, you can get insights into participants' evolving mental model of the product.
You can get a better understanding of problems that participants are having than by just watching them and hoping they'll think out loud.
Active intervention is particularly useful early in design. It is an excellent technique to use with prototypes, because it provides a wealth of diagnostic information. It is not the technique to use, however, if your primary concern is to measure time to complete tasks or to find out how often users will call the help desk.
To do a useful active intervention test, you have to define your goals and concerns, plan the questions you will use as probes, and be careful not to bias participants by asking leading questions.
Additional Benefits of Usability Testing
Usability testing contributes to all the benefits of focusing on usability that we gave in Chapter 1. In addition, the process of usability testing has two specific benefits that may not be as strong or obvious from other usability techniques. Usability testing helps
· change people's attitudes about users
· change the design and development process
Changing People's Attitudes About Users
Watching users is both inspiring and humbling. Even after watching hundreds of people participate in usability tests, we are still amazed at the insights they give us about the assumptions we make.
When designers, developers, writers, and managers attend a usability test or watch videotapes from a usability test for the first time, there is often a dramatic transformation in the way that they view users and usability issues. Watching just a few people struggle with a product has a much greater impact on attitudes than many hours of discussion about the importance of usability or of understanding users.
After an initial refusal to believe that the users in the test really do represent the people for whom the product is meant, many observers become instant converts to usability. They become interested not only in changing this product, but in improving all future products, and in bringing this and other products back for more testing.
Changing the Design and Development Process
In addition to helping to improve a specific product, usability testing can help improve the process that an organization uses to design and develop products (Dumas, 1989). The specific instances that you see in a usability test are most often symptoms of broader and deeper global problems with both the product and the process.
Comparing Usability Testing to Beta Testing
Despite the surge in interest in usability testing, many companies still do not think about usability until the product is almost ready to be released. Their usability approach is to give some customers an early-release (almost ready) version of the product and wait for feedback. Depending on the industry and situation, these early-release trials may be called beta testing, field testing, clinical trials, or user acceptance testing.
In beta testing, real users do real tasks in their real environments. However, many companies find that they get very little feedback from beta testers, and beta testing seldom yields useful information about usability problems for these reasons:
· The beta test site does not even have to use the product.
· The feedback is unsystematic. Users may report, after the fact, what they remember and choose to report. They may get so busy that they forget to report even when things go wrong.
· In most cases, no one observes the beta test users and records their behavior. Because users are focused on doing their work, not on testing the product, they may not be able to recall the actions they took that resulted in the problems. In a usability test, you get to see the actions, hear the users talk as they do the actions, and record the actions on videotape so that you can go back later and review them, if you aren't sure what the user did.
· In a beta test, you do not choose the tasks. The tasks that get tested are whatever users happen to do in the time they are working with the product. A situation that you are concerned about may not arise. Even if it does arise, you may not hear about it. In a usability test, you choose the tasks that participants do with the product. That way, you can be sure that you get information about aspects of the product that relate to your goals and concerns. That way, you also get comparable data across participants.
If beta testers do try the product and have major problems that keep them from completing their work, they may report those problems. The unwanted by-product of that situation, however, may be embarrassment at having released a product with major problems, even to beta testers.
Even though beta testers know that they are working with an unfinished and possibly buggy product, they may be using it to do real work where problems may have serious consequences. They want to do their work easily and effectively. Your company's reputation and sales may suffer if beta testers find the product frustrating to use. A bad experience when beta testing your product may make the beta testers less willing to buy the product and less willing to consider other products from your company.
You can improve the chances of getting useful information from beta test sites. Some companies include observations and interviews with beta testing, going out to visit beta test sites after people have been working with the product for a while. Another idea would be to give tape recorders to selected people at beta test sites and ask them to talk on tape while they use the product or to record observations and problems as they occur.
Even these techniques, however, won't overcome the most significant disadvantage of beta testing: that it comes too late in the process. Beta testing typically takes place only very close to the end of development, with a fully coded product. Critical functional bugs may get fixed after beta testing, but time and money generally mean that usability problems can't be addressed.
Usability testing, unlike beta testing, can be done throughout the design and development process. You can observe and record users as they work with prototypes and partially developed products. People are more tolerant of the fact that the product is still under development when they come to a usability test than when they beta test it. If you follow the usability engineering approach, you can do usability testing early enough to change the product and retest the changes.