Clinical
Psychology (PSY401)
VU
LESSON
21
THE
OBSERVATIONAL ASSESSMENT AND ITS
TYPES
Observation is a
visual method of gathering
information on activities:
of
what happens, what
your
object
of study does or how it
behaves.
In
the study of products you may be
interested in activities because some
products are
essentially
activity
with little or no tangible
essence, like computer programs, courses
of education, dramas and
other
presentations on stage or on TV. There
are also activities related
to "static" artifacts, notably
their
manufacture
and use that you
perhaps will want to
study.
OBSERVATION
METHODS
To
assess and understand
behavior, one must first
know what one is dealing
with. It comes as no
surprise,
then,
that behavioral assessment employs
observation
as
a primary technique. A clinician
can try to
understand
a phobic's fear of heights, a
student's avoidance of evaluation
settings, or anyone's tendency to
overeat.
These people
could
be interviewed or assessed with
self-report inventories. But
many
clinicians
would argue that unless
those people are directly
observed in their natural
environments, true understanding
will be incomplete. To determine
the frequency, strength, and
pervasiveness of the problem
behavior
or the factors that are
maintaining it, behavioral
clinicians advocate direct
observation.
Of
course, all this is easier
said than done. Practically speaking, it is
difficult and expensive to
maintain
trained
observers and have them available.
This is especially true in
the case of adults who
are
being
treated on an outpatient basis. It is
relatively easier to accomplish with
children or those
with
cognitive
limitations. It is likewise easier to
make observations in a sheltered or
institutional setting.
In
some cases, it is possible to use
observers who are
characteristically part of the
person's
environment
(such as spouse, parent, teacher, friend,
or nurse). In certain instances, it is
even possible to
have
the client do some self-observation. Of
course, there is the
ever-present question of ethics.
Clinical
psychologists
must take pains to make sure
that people are not observed
without their knowledge or
that
friends
and associates of the client
are not unwittingly
drawn
into the observational net in a
way that
compromises
their dignity and right to
privacy.
For
all these reasons, naturalistic
observation has never been used in
clinical practice as much as it
might
be.
Indeed, observation is still more
prominent in research than in
clinical practice. However,
one
need
not be a diehard proponent of
the behavioral approach to concede
the importance of
observational
data. It is not unlikely
that clinicians of many
different persuasions have arrived
at
incomplete
pictures of their clients. After
all, they may never
see them except during
the 50-minute
therapy
hour or through the prism of
objective or projective test data. But
because of the cumbersome
nature
of many observational procedures,
for years most clinicians
opted for the simpler and
seemingly
more
efficient methods of traditional
assessment.
NATURALISTIC
OBSERVATION
Naturalistic
observation is hardly a new idea. McReynolds (1975)
traced the roots of
naturalistic
observation
to the ancient civilizations of
Greece and China. About 50
years ago, Barker and
Wright
(1951)
described their systematic and detailed
recordings of the behavior of a 7-year-old boy
over one day (a
major
effort that took an entire
book). Beyond this, all of
us recognize instantly that our
own informal
assessments
of friends and associates
are heavily influenced by observations of
their naturally
occurring
behavior.
But observation, like testing, is useful
only when steps are taken to
ensure its reliability
and
validity.
Example
of Naturalistic Observation
Over
the years, many forms of
naturalistic observation have been used
for specific settings. These
settings
have
included classrooms, playgrounds,
general and psychiatric hospitals, home
environments,
institutions
for those with mental
retardation, and therapy sessions in
outpatient clinics. Again, it
is
important
to note that many of the systems
employed in these settings have
been most widely used
for
research purposes. But most
of them are adaptable for
clinical use.
Home
Observation
Because
experiences in the family or home have
such pervasive effects on
adjustment, it is not
surprising
that a number of assessment
procedures have been developed
for behaviors occurring
in
this
setting. One of the best
known systems for home
observation is the
Behavioral
Coding System
(BCS)
developed
by Patterson (1977) and his colleagues
(Jones, Reid, & Patterson, 1975).
This
observational
system was designed for
use in the homes of
pre-delinquent boys who
exhibit
problems
in the areas of aggressiveness
and noncompliance. Trained observers
spend one or two
hours
in the homes of such boys,
observing and recording family
interactions. Usually
the
observations
are made immediately before or
during dinner. Observers are
not allowed to interact
with
family
members (although occasionally
they may talk with them
before or after the observations to
gain
better
acceptance of the procedure).
Each family member is
observed for two 5-minute
periods during each
observational
occasion. Observations are made of
behaviors in 28 categories, and every 6
seconds
during
the period a given family
member is being observed,
the observer notes whether
these
behaviors
have or have not occurred.
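The recording load implied by this scheme can be worked out with simple arithmetic. The sketch below uses only the figures given above; the variable names are illustrative, and it assumes that every one of the 28 categories is judged present or absent in every 6-second interval.

```python
# Interval arithmetic for the home-observation scheme described above.
# All variable names are illustrative, not part of the coding system itself.
PERIOD_MINUTES = 5        # length of one observation period per family member
PERIODS_PER_OCCASION = 2  # each member is observed for two such periods
INTERVAL_SECONDS = 6      # the observer makes a note every 6 seconds
CATEGORIES = 28           # number of behavior categories

intervals_per_period = PERIOD_MINUTES * 60 // INTERVAL_SECONDS
intervals_per_occasion = intervals_per_period * PERIODS_PER_OCCASION
# Assumption: each category is judged present/absent in every interval.
max_judgments = intervals_per_occasion * CATEGORIES

print(intervals_per_period, intervals_per_occasion, max_judgments)
```

Under these assumptions each family member yields 50 intervals per period and 100 intervals per occasion, which helps explain why trained observers are needed.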
In
a recent study, Patterson and Forgatch
(1995) reported observational
data, in this case the sum
of
multiple
categories of aversive behavior
(such as yelling, humiliating, and
destructiveness), coded from
home
interactions between 67 children and
their respective families. All
these children had
been
referred
for treatment because of antisocial behavior
problems. Interestingly, Patterson and
Forgatch
(1995)
found that children's
aversive behavior scores at
treatment termination
significantly
predicted
future arrests over the
two-year follow-up period. In
contrast, no teacher, mother, or father
rating
of
the children at termination significantly
predicted arrests. Thus, in this
study, the predictive
value
of naturalistic
observation (over more
traditional ratings by parents or
teachers) was
demonstrated.
School
Observation
Clinical
child psychologists must often
deal with behavior problems
that take place in the
school
setting;
some children are disruptive in
class, overly aggressive on
the playground, generally
fearful, cling
to
the teacher, will not
concentrate, and so on.
Although the verbal reports
of parents and teachers
are
useful,
the most direct assessment
procedure is actually to observe
the problem behavior in its natural
habitat.
Several
coding systems have been
developed over the years
for use in school
observation.
An
example of a behavioral observation
system used in school settings is
Achenbach's (1994) Direct
Observation
Form (DOF) of
the Child Behavior
Checklist. The DOF is used to
assess problem behaviors
that
may be observed in school
classrooms or other settings
(Achenbach, 1994). It consists of 96
problem
items,
as well as an open-ended item that allows
assessors to indicate problem
behaviors not covered
by
these
items. Assessors are
instructed to rate each item according to
its frequency, duration, and
intensity
within
a 10-minute observation period. It is
recommended that three to six
10-minute observation periods be
completed
so that scores can be
averaged across occasions
(Achenbach, 1994). In this way, a more
reliable
and
stable estimate of the child's
level of behavior problems in the
classroom can be
obtained.
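The averaging step can be illustrated with a minimal sketch. The function name and the example totals below are hypothetical; only the rule of averaging over three to six 10-minute periods comes from the text, and the DOF's actual item scoring is not reproduced here.

```python
from statistics import mean

def average_across_periods(period_totals):
    """Average total problem scores over several 10-minute observation periods."""
    # The text recommends three to six periods; anything else is rejected here.
    if not 3 <= len(period_totals) <= 6:
        raise ValueError("expected three to six observation periods")
    return mean(period_totals)

# Hypothetical total scores from four 10-minute observations of one child.
period_totals = [12, 9, 15, 10]
average_score = average_across_periods(period_totals)
print(average_score)
```

A single unusually good or bad period (here, 15) has less influence on the averaged score than it would if it were the only observation.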
Hospital
Observation
Observation
techniques have long been
used in such settings as
psychiatric
hospitals
and institutions for
those
with mental retardation. The
sheltered characteristics of these
settings have made careful
observation
of
behavior much more feasible
than in more open,
uncontrolled environments.
An
example of a
hospital observation device is
the Time
Sample Behavioral Checklist
(TSBC)
developed
by Gordon Paul and his
associates (Mariotto & Paul, 1974).
It is a time-sample behavioral
checklist
that can be used with chronic
psychiatric patients. By time-sample is
meant that observations
are
made at regular intervals
for a given patient.
Observers make a single 2-second
observation of the
patient
once every waking hour.
Thus, a daily behavioral profile
can be constructed on each
patient.
Interobserver
reliability for this
checklist has typically been
quite high, and such
scales as the TSBC
are
helpful in
providing a comprehensive behavioral
picture of the patient. For example,
using the TSBC,
Menditto
et
al. (1996) documented how a
combination of a relatively new antipsychotic
medication (clozapine) and
a
structured social learning
program (Paul & Lentz,
1977) helped significantly
decrease the
frequency
of inappropriate behaviors and aggressive
acts over a 6-month period
in a sample of
chronically
mentally ill patients on an inpatient
unit.
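One way to picture how hourly time samples become a daily behavioral profile is sketched below. The behavior codes and the function are hypothetical and are not the TSBC's actual categories or scoring rules; the sketch simply reports, for each code, the share of hourly samples in which it was seen.

```python
from collections import Counter

def daily_profile(hourly_observations):
    """Return, for each behavior code, the share of hourly samples in which it was seen."""
    n = len(hourly_observations)
    if n == 0:
        return {}
    counts = Counter()
    for codes in hourly_observations:
        counts.update(set(codes))  # count each code at most once per sample
    return {code: counts[code] / n for code in sorted(counts)}

# Hypothetical codes noted in four hourly samples for one patient.
samples = [
    {"watching_tv"},
    {"talking_to_others", "pacing"},
    {"pacing"},
    {"talking_to_others"},
]
profile = daily_profile(samples)
print(profile)
```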
CONTROLLED
OBSERVATION
Naturalistic
observation has a great deal
of intuitive appeal. It provides a
picture of how
individuals
actually behave that is unfiltered by
self-reports, inferences, or other
potentially contaminating
variables.
However, this is easier said
than done. Sometimes the
specific kind of behavior in
which
clinicians
are interested does not
occur naturally very often.
Much time and resources can
be wasted
waiting
for the right behavior or
situation to happen. The
assessment of responsibility taking,
for
example,
may require day after day of expensive observation
before the right situation
arises. Then, just
as
the
clinician is about to start
recording, some unexpected
"other" figure in the environment
may step in
to
spoil the situation by
subtly changing its whole
character. Furthermore, in free-flowing,
spontaneous
situations,
the client may move away so
that conversations cannot be
overheard, or the entire
scene
may
move down the hall
too quickly to be followed. In short,
naturalistic settings often put
clinicians
at
the mercy of events that can
sometimes overwhelm opportunities
for careful, objective
assessment. As
a
way of handling these problems,
clinicians sometimes use controlled
observation.
For
many years, researchers have used
techniques to elicit controlled samples
of behavior (Lanyon &
Goodstein,
1982). These are really
situational
tests that
put individuals in situations
more or less
similar
to those of real life. Direct
observations are then made of how the
individuals react. In a sense,
this
is
a kind of work-sample approach in which the
behavioral test situation
and the criterion behavior
to
be
predicted are quite similar.
This should reduce errors in
prediction, as contrasted, for example,
to
psychological
tests whose stimuli are
far removed from the predictive
situations.
STUDIES
IN HONESTY AND DECEIT
Early
arrivals on this scene were the
studies of Hartshorne and
May and their associates
(1928, 1929,
1930).
Although Hartshorne and May were oriented
principally toward research,
the approaches
they
used have found direct
application in the assessment field.
Because Hartshorne and May
viewed
personality
or character in habit-response terms,
they attempted to measure it by directly
sampling
behavior.
For example, if one wants to
assess children's honesty,
why not do so by confronting
them
with
situations where cheating is possible and
then observe their responses?
This is exactly what
Hartshorne
and May did in assessing
such behaviors as cheating,
lying, and stealing. Using a
series of
ingenious
natural settings, they were able to
execute their research under
disguised yet highly
controlled
conditions. Of particular interest were
data that suggested that
children's deceitful
behavior
was
highly situation-specific and
should not be construed as reflecting a
generalized trait.
RESPONSE
TO STRESS
During
World War II, the urgent
demand for highly trained
and resourceful military
intelligence
personnel
led to the development of a series of
situational stress tests.
Instead of using personality
tests
to
assess the manner in which
the individual might handle
disruptive or emotionally
stressful
situations,
the U.S. Office of Strategic
Services used
assigned tasks (OSS Assessment Staff,
1948).
Through
both objective records and
qualitative observation by trained
staff, the assessment of reaction
to
stress was undertaken. Although the
demands of war did not
provide many good
opportunities for the
strict
validation of OSS assessment techniques,
they did provide an
excellent model of what is
possible
in
assessment. A sample OSS
task is the
following:
A
large cube had to be constructed
out of pegs, poles, and
blocks. Since the job could
not be done by one
person
alone, two helpers were
provided, but the task had to be completed in 10
minutes. The helpers
were
actually stooges who
interfered, were passive,
made impractical suggestions,
and the like. They
ridiculed
the candidate and generally frustrated
him terribly. In fact, no candidate
was ever successful
in
assembling
the cube.
Somewhat
related techniques were used
in selecting candidates for the British
Civil Service
(Vernon, 1950).
Although stress was not
incorporated into the British
procedures, the tasks on
which
candidates
worked prior to their selection
were based on careful job
analyses. L. V. Gordon (1967)
has
evaluated
several work-sample approaches to
assessment used in the prediction of
the performance of
Peace
Corps trainees.
PARENT-ADOLESCENT
CONFLICT
In
order to more accurately assess the
nature and degree of parent-adolescent
conflict, Prinz and
Kent
(1978)
developed the Interaction
Behavior Code (IBC) system.
Using the IBC, several
raters review and
rate
audiotaped discussions of
families
attempting to resolve a problem about
which they
disagree.
Items
are
rated separately for each
family member according to
the behavior's presence or absence
during the
discussion
(or for some items,
the degree to which they are
present). Summary scores
are
calculated
by averaging scores (across
raters) for negative behaviors
and positive behaviors.
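The summary-score step can be sketched as follows. The item names and the data are hypothetical, and the sketch assumes that each rater's item ratings are summed before averaging across raters; the text states only that scores are averaged across raters for negative and positive behaviors.

```python
from statistics import mean

def summary_scores(ratings_by_rater, negative_items, positive_items):
    """Average each rater's negative and positive item totals across raters."""
    # Assumption: item ratings are summed per rater, then averaged across raters.
    negative = mean([sum(r[i] for i in negative_items) for r in ratings_by_rater])
    positive = mean([sum(r[i] for i in positive_items) for r in ratings_by_rater])
    return negative, positive

# Hypothetical presence (1) / absence (0) ratings of one family member by two raters.
rater_a = {"yelling": 1, "interrupting": 1, "praise": 0, "compromise": 1}
rater_b = {"yelling": 1, "interrupting": 0, "praise": 0, "compromise": 1}

negative, positive = summary_scores(
    [rater_a, rater_b],
    negative_items=["yelling", "interrupting"],
    positive_items=["praise", "compromise"],
)
print(negative, positive)
```

Averaging across raters softens the effect of any single rater's idiosyncratic judgment, such as the disagreement about "interrupting" in this example.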
For
the strict behaviorist, of course,
the preceding techniques
represent a mixture of observation
and
inference.
When ratings of leadership, stress
level, or ingenuity are
made, what is really happening is
that
observers
are inferring something from
behavior. They are not
just compiling lists of
behaviors or
checking
off occurrences.
CONTROLLED
PERFORMANCE TECHNIQUES
As
seen in the OSS assessment
studies, controlled situations allow one
to observe behavior under
conditions
that
offer potential for control
and standardization. A more exotic
example is the case in which
A. A.
Lazarus
(1961) assessed claustrophobic behavior
by placing a patient in a closed room
that was made
progressively
smaller by moving a screen.
Similarly, Bandura (1969) has
used films to expose people
to a
graduated
series of anxiety-provoking
stimuli.
A
series of assessment procedures
using controlled
performance techniques to study chronic
snake
phobias
illustrates several approaches to this
kind of measurement (Bandura, Adams, & Beyer,
1977).
BEHAVIORAL
AVOIDANCE
The
test of avoidance behavior consisted of a
series of 29 performance tasks
requiring increasingly more
threatening
interactions with a red-tailed boa
constrictor. Subjects were
instructed to approach a glass
cage
containing
the snake, to look down at
it, to touch and hold the
snake with gloved and
then bare hands, to let
it
loose
in the room and then return it to
the cage, to hold it within
12 cm of their faces, and finally
to
tolerate
the snake crawling in their laps
while they held their hands
passively at their sides....
Those
who
could not enter the
room containing the snake
received a score of 0; subjects
who did enter were
asked
to perform the various tasks
in the graded series. To control
for any possible influence
of
expressive
cues from the tester, she
stood behind the subject and read aloud
the tasks to be
performed....
The avoidance score was
the number of snake-interaction
tasks the subject performed
successfully.
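The scoring rule just quoted can be expressed in a few lines. This is a sketch under stated assumptions rather than the original procedure: the function name and the example data are hypothetical, and it counts every successfully performed task without modeling whether testing stopped at the first refusal.

```python
def avoidance_score(entered_room, task_results):
    """Number of graded tasks performed successfully; 0 if the room was not entered."""
    if not entered_room:
        return 0
    return sum(1 for done in task_results if done)

TOTAL_TASKS = 29  # length of the graded series described above

# Hypothetical subject who completed the first 11 tasks and refused the rest.
results = [True] * 11 + [False] * (TOTAL_TASKS - 11)
score = avoidance_score(True, results)
score_no_entry = avoidance_score(False, [])
print(score, score_no_entry)
```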
SELF-MONITORING
In
the previous discussion of naturalistic observation,
the observational procedures were
designed for use by
trained
staff: clinicians, research assistants,
teachers, nurses, ward
attendants, and others. But
such
procedures
are often expensive in both
time and money. Furthermore, it is
necessary in most
cases
to
rely on time-sampling or otherwise
limit the extent of the
observations. When dealing
with
individual
clients, it is often impractical or too
expensive to observe them as they move
freely about in
their
daily activities. Therefore,
clinicians have been relying
increasingly on self-monitoring
in which
individuals
observe and record their
own behaviors, thoughts, and
emotions.
In
effect, clients are asked to maintain
behavioral logs or diaries
over some predetermined time
period.
Such
a log can provide a running
record of the frequency, intensity, and
duration of certain
target
behaviors, along with the stimulus
conditions that accompanied them and
the consequences that
followed.
Such data are especially
useful in telling both
clinician and client how
often the behavior in
question
occurs. In addition, it can
provide an index of change as a
result of therapy (for
example, by
comparing
baseline frequency with
frequency after six weeks of
therapy). Also, it can help
focus the
client's
attention on undesirable behavior and
thus aid in reducing it.
Finally, clients can come
to
realize
the connections between environmental stimuli, the
consequences of their behavior,
and the
behavior
itself.
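A behavioral log of this kind can be represented as a simple record per episode. The field names, the 1-to-5 intensity scale, and the entries below are hypothetical; the sketch only shows how baseline frequency might be compared with frequency after six weeks of therapy.

```python
from dataclasses import dataclass

@dataclass
class LogEntry:
    week: int                # week of monitoring (0 = baseline)
    behavior: str            # target behavior being monitored
    intensity: int           # client's own rating, assumed 1 (mild) to 5 (severe)
    duration_minutes: float  # how long the episode lasted
    antecedent: str          # stimulus conditions that accompanied the behavior
    consequence: str         # what followed the behavior

def weekly_frequency(log, behavior, week):
    """Count how often the target behavior was recorded in a given week."""
    return sum(1 for e in log if e.behavior == behavior and e.week == week)

# Hypothetical diary entries from one client.
log = [
    LogEntry(0, "smoking", 3, 5, "coffee break", "felt calmer"),
    LogEntry(0, "smoking", 4, 6, "argument at work", "felt calmer"),
    LogEntry(0, "smoking", 2, 4, "after dinner", "felt guilty"),
    LogEntry(6, "smoking", 2, 4, "coffee break", "felt guilty"),
]

baseline = weekly_frequency(log, "smoking", 0)
after = weekly_frequency(log, "smoking", 6)
change = after - baseline
print(baseline, after, change)
```

Keeping the antecedent and consequence with each entry is what lets the client and clinician look for connections between stimuli, behavior, and outcomes.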
Of
course, there are problems
with self-monitoring. Some
clients may be inaccurate or may
purposely
distort
their observations or recordings for
various reasons. Others may
simply resist the
whole
procedure.
Despite these obvious difficulties,
self-monitoring has become a
useful and efficient
technique.
It can provide a great deal of
information at very low
cost. However, self-monitoring
is
usually
effective as
a
change agent only in conjunction
with a larger program of
therapeutic
intervention.
A
variety of monitoring aids
has been developed. Some
clients are
provided with small
counters or
stopwatches,
depending upon what is to
be monitored. Small file-sized or
wallet-sized cards have
been
developed
upon which clients can
quickly and unobtrusively record
their data. At a more informal
level,
some
clients are simply
encouraged to make entries in a
diary. Such aids are
especially useful when
assessing
or treating such problems as obesity,
smoking, lack of assertiveness, and
alcoholism. These
aids
can
help reinforce the notion
that one's problems can be
reduced to specific behaviors. Thus, a
client
who
started with global
complaints of an ephemeral nature can
begin to see that "not
feeling good
about
myself" really involves
inability to stand up for one's
rights in specific circumstances,
speaking
without
thinking, or whatever.
The
dysfunctional thought record (DTR) is
completed by the client and
provides the client
and
therapist
with a record of the client's automatic
thoughts that are related to
dysphoria or depression
(J.
S.
Beck, 1995). This DTR can
help the therapist and
client target certain
thoughts and reactions for
change
in a cognitive-behavioral treatment for
depression. The client is
instructed to complete the
DTR
when she or he notices a change in
mood. The situation,
automatic thought(s), and
associated
emotions
are specified. The final two
columns of the DTR can be
filled out in the therapy session
and
serve
as a therapeutic intervention. In this
way, clients are taught to
recognize, evaluate, and modify
these
automatic dysfunctional
thoughts.
VARIABLES
AFFECTING RELIABILITY OF
OBSERVATIONS
Whether
their data come from
interviewing, testing, or observation,
clinicians must be assured
that
the
data are reliable. In the
case of observation, clinicians must
have confidence that
different
observers
will produce basically the
same ratings and scores. For
example, when an observer of
interactions
in the home returns ratings
of a spouse's behavior as "low in
empathy," what
assurance
does the clinician have that
someone else rating the same
behavior in
the
same
circumstances
would have made the same
report? Many factors can
affect the reliability of
observations.
The following is a good
sample of these
factors.
COMPLEXITY
OF TARGET BEHAVIOR
Obviously,
the more complex the behavior to be
observed, the greater the opportunity for
unreliability.
Behavioral
assessment typically focuses on
less complex, lower-level behaviors
(Haynes, 1998).
Observations
about what a person eats for
breakfast (lower-level behavior)
are likely to be more
reliable than
those
centering on interpersonal behavior
(higher-level, more complex behavior).
This applies to
self-monitoring
as well. Unless specific agreed-upon
behaviors are designated, the observer
has an
enormous
range of behavior upon which
to concentrate. Thus, to identify an instance of
interpersonal
aggression,
one observer might react to
sarcasm while another would
fail to include it and
focus
instead
on clear, physical acts.
TRAINING
OBSERVERS
There
is no substitute for the careful
and systematic training of
observers. For example,
observers
who
are sent into psychiatric
hospitals to study patient
behaviors and then make
diagnostic ratings
must
be carefully prepared in advance. It is
necessary to brief them
extensively on just what
the
definition
of, say, depression is, what
specific behaviors represent depression, and so on.
Their goal
should
not be to "please" their
supervisor by coming up (consciously or unconsciously)
with data
"helpful"
to the project. Nor should
they protect one another by talking
over their ratings and
then
"agreeing
to agree."
Occasionally
there are instances of observer
drift, in which
observers who work closely
together
subtly,
without awareness, begin to
drift away from other
observers in their ratings.
Although
reliability
among the drifting observers
may be acceptable, it is only so
because, over time,
they
have
begun to shift their
definitions of target behaviors. Occasionally, too,
observers are not as
careful
in
their observations when they
feel they are on their own
as when they expect to be
monitored
or
checked (Reid, 1970). To guard
against observer drift,
regularly scheduled reliability
checks (by an
independent
rater) should be conducted and
feedback provided to
raters.
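A reliability check of this kind usually comes down to comparing two raters' codes for the same intervals. The sketch below computes simple percent agreement and Cohen's kappa, a standard chance-corrected agreement statistic that is not named in the text above and is offered here only as one common choice. The function names and the data are hypothetical.

```python
from collections import Counter

def percent_agreement(a, b):
    """Share of intervals on which two raters gave the same code."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in set(a) | set(b)) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters used a single identical code throughout
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical codes (1 = behavior occurred, 0 = did not) for ten intervals.
regular_rater = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
independent_rater = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

agreement = percent_agreement(regular_rater, independent_rater)
kappa = cohens_kappa(regular_rater, independent_rater)
print(round(agreement, 2), round(kappa, 2))
```

Repeating such a check at scheduled intervals against an independent rater, rather than only among observers who work together, is what allows drift to show up as falling agreement.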
VARIABLES
AFFECTING VALIDITY OF OBSERVATIONS
At
this point, it seems
unnecessary to reiterate the importance of
validity. We have encountered
the
concept
before in our discussions of
both interviewing and
testing; it is no less critical in the
case of
observation.
But here, issues of validity
can be deceptive. It seems
obvious in interviewing that
what
patients
tell the interviewer may
not correspond to their
actual behavior in
noninterview settings.
Or
in the case of projective
tests, there may be validity
questions about inferring aggression
from
Rorschach
responses that involve vicious
animals, blood, or large teeth.
After all, percepts are
not the same
as
"real" behavior. But in the
case of observation, things seem much
clearer. When a child is observed
to
bully
his peers unmercifully and
these observations are corroborated by
reports from teachers,
there
would
seem to be little question of
the validity of the observers'
data. Aggression is aggression.
However,
things
are not always so simple, as the
following discussion will
illustrate.
CONTENT
VALIDITY
A
behavioral observation schema
should include the behaviors
that are deemed important
for the
research
or clinical purposes at hand. Usually
the investigator or clinician who
develops the system also
determines
whether or not the system
shows content validity. But
this process is almost circular, in
the
sense
that a system is valid if
the clinician decides that
it is valid. In developing the
Behavioral
Coding
System (BCS), Jones et al.
(1975) circumvented this problem by
organizing several categories
of
noxious
behaviors in children and then
submitting them for ratings.
By using mothers' ratings,
they
were
able to confirm their own a
priori clinical judgments as to whether
or not certain deviant behaviors
were
in fact noxious or
aversive.
CONCURRENT
VALIDITY
Another
way to approach the validity of
observations is to ask whether
one's obtained
observational
ratings
correspond to what others (such as
teachers, spouse, and
friends) are observing in the
same time
frame.
For example, do observational
ratings of children's aggression on the
playground made by
trained
observers agree with the ratings
made by the children's
peers? In short, do the
children
perceive
each other's aggression in
the same way that observers
do?
CONSTRUCT
VALIDITY
Observational
systems are usually derived
from some implicit or
explicit theoretical framework.
For
example,
the BCS of Jones et al.
(1975) was derived from a
social learning framework that
sees
aggression
as the result of learning in the family.
When the rewards for aggression
are substantial,
aggression
will occur. When such
rewards are no longer contingent on
the behavior, aggression
should
subside.
Therefore, the construct
validity of the BCS could be
demonstrated by showing
that
children's
aggressive behavior declines
from a baseline point after
clinical treatment, with
clinical
treatment
defined as rearranging the social
contingencies in the family in a
way that ought to reduce
the
incidence
of observed aggression.
MECHANICS
OF RATINGS
It
is important that a unit of
analysis be specified. A unit of analysis
is the length of time observations
will
be
made, along with the type
and number of responses to be considered.
For example, it might be
decided
that
every physical movement or gesture will be recorded
for 1 minute every 4 minutes. The total
observational
time might consist of a
20-minute recess period for
kindergarten children. This
means that
every
4 minutes the child would be observed
for 1 minute and all
physical movements
recorded.
These
movements would then be coded or
rated for the variable under
study (such as aggression,
problem solving,
or dependency).
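The recess example can be turned into an explicit observation schedule. The sketch below is illustrative only: the function name is hypothetical, and it assumes that each 1-minute observation window falls at the start of its 4-minute cycle, which the text does not specify.

```python
def observation_windows(total_minutes, cycle_minutes, observe_minutes):
    """Return (start, end) minute pairs for each observation window."""
    windows = []
    start = 0
    # Assumption: the observation window opens at the start of each cycle.
    while start + observe_minutes <= total_minutes:
        windows.append((start, start + observe_minutes))
        start += cycle_minutes
    return windows

# 20-minute recess, 1 minute of observation every 4 minutes.
windows = observation_windows(total_minutes=20, cycle_minutes=4, observe_minutes=1)
observed_minutes = sum(end - start for start, end in windows)
print(windows, observed_minutes)
```

Under this assumption the child is watched during five 1-minute windows, so only 5 of the 20 recess minutes are actually recorded and later coded.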
In
addition to the units of
analysis chosen, the
specific form that the
ratings will take must also
be
decided.
One could decide to record behaviors
along a dimension of intensity:
How strong was the
aggressive
behavior? One might also
include a duration
record:
How long did the behavior
last? Or one
might
use a simple frequency
count:
How many times in a designated
period did the behavior
under
study
occur?
Beyond
this, a scoring procedure must be
developed. Such procedures
can range from making
check
marks
on a sheet of paper attached to a
clipboard to the use of counters,
stopwatches, timers, and
even
laptop
computers. All raters, of course,
will employ the same
procedure.
REACTIVITY
Another
factor affecting the validity of
observations is called reactivity.
Patients
or study participants
sometimes
react to the fact that they
are being observed by changing the
way they behave. The
talkative
person
suddenly becomes quiet. The
complaining spouse suddenly
becomes the epitome of
self-sacrifice.
Sometimes an individual may
even feel the need to
apologize for the dog by
saying, "He never
does
that when he is alone with us." In
any case, reactivity can
severely hamper the validity of
observations
because it makes the observed
behavior unrepresentative of what
normally occurs. The
real
danger
of reactivity is that the observer may
not recognize its presence. If observed
behavior is not a true
sample,
this affects the extent to which
one can generalize from this
instance of behavior. Then,
too,
observers
may unwittingly interfere with or
influence the very behavior
they are sent to observe. In
the case
of
sexual dysfunction, for
example, Conte (1986) has
noted that behavioral ratings
are so intrusive that
clinicians
usually have to rely
on
self-report methods.
SUGGESTIONS
FOR IMPROVING RELIABILITY AND VALIDITY OF
OBSERVATIONS
1) Decide on target behaviors that are both relevant and comprehensive.
2) Work from an explicit theoretical framework that will help define the behaviors of interest.
3) Employ trained observers.
4) Make sure that the observational format is strictly specified.
5) Be aware of such potential sources of error as bias and fluctuations in concentration.
6) Consider the possibility of reactivity.
7) Give careful consideration to how representative the observations really are.