Chapter 5
Software Team Organization and Specialization

Introduction

More than almost any other technical or engineering field, software development depends upon the human mind, upon human effort, and upon human organizations. From the day a project starts until it is retired perhaps 30 years later, human involvement is critical to every step in development, enhancement, maintenance, and customer support.

Software requirements are derived from human discussions of application features. Software architecture depends upon the knowledge of human specialists. Software design is based on human understanding, augmented by tools that handle some of the mechanical aspects, but none of the intellectual aspects.

Software code is written line-by-line by craftspeople as custom artifacts and involves the highest quantity of human effort of any modern manufactured product. (Creating sculpture and building special products such as 12-meter racing yachts or custom furniture require similar amounts of manual effort by skilled artisans, but these are not mainstream products that are widely utilized by thousands of companies.)

Although automated static analysis tools and some forms of automated testing exist, the human mind is also a primary tool for finding bugs and security flaws. Both manual inspections and manual creation of test plans and test cases are used for over 95 percent of software applications, and for almost 100 percent of software applications larger than 1,000 function points in size. Unfortunately, both quality and security remain weak links for software.

As the economy sinks into global recession, the high costs and marginal quality and security of custom software development are going to attract increasingly critical executive attention. It may well be that the global recession will provide a strong incentive to begin to migrate from custom development to construction from standard reusable components. The global recession may also provide motivation for designing more secure software with higher quality, and for moving toward higher levels of automation in quality control and security control.

In spite of the fact that software has the highest labor content of any manufactured product, the topic of software team organization structure is not well covered in the software literature.

There are anecdotal reports on the value of such topics as pair programming, small self-organizing teams, Agile teams, colocated teams, matrix versus hierarchical organizations, project offices, and several others. But these reports lack quantification of results. It is hard to find empirical data that shows side-by-side results of different kinds of organizations for the same kinds of applications.

One of the larger collections of team-related information that is available to the general public is the set of reports and data published by the International Software Benchmarking Standards Group (ISBSG). For example, this organization has productivity and average application size for teams ranging between 1 and 20 personnel. They also have data on larger teams, with the exception that really large teams in excess of 500 people are seldom reported to any benchmark organization.

Quantifying Organizational Results

This chapter will deal with organizational issues in a somewhat unusual fashion. As various organization structures and sizes are discussed, information will be provided that attempts to show in quantified form a number of important topics:

1. Typical staffing complements in terms of managers, software engineers, and specialists.

2. The largest software projects that a specific organization size and type can handle.

3. The average size of software projects a specific organization size and type handles.

4. The average productivity rates observed with specific organization sizes and types.

5. The average development schedules observed with specific organization sizes and types.

6. The average quality rates observed with specific organization sizes and types.

7. Demographics, or the approximate usage of various organization structures.

8. Demographics in the sense of the kinds of specialists often deployed under various organizational structures.

Of course, there will be some overlap among various sizes and kinds of organization structures. The goal of the chapter is to narrow down the ranges of uncertainty and to show what forms of organization are best suited to software projects of various sizes and types.

Organizations in this chapter are discussed in terms of typical departmental sizes, starting with one-person projects and working upward to large, multinational, multidisciplinary teams that may have 1,000 personnel or more.

Observations of various kinds of organization structures are derived from on-site visits to a number of organizations over a multiyear period. Examples of some of the organizations visited by the author include Aetna Insurance, Apple, AT&T, Boeing, Computer Aid Incorporated (CAI), Electronic Data Systems (EDS), Exxon, Fidelity, Ford Motors, General Electric, Hartford Insurance, IBM, Microsoft, NASA, NSA, Sony, Texas Instruments, the U.S. Navy, and more than 100 other organizations.

Organization structures are important aspects of successful software projects, and a great deal more empirical study is needed on organizational topics.

The Separate Worlds of Information Technology and Systems Software

Many medium and large companies such as banks and insurance companies only have information technology (IT) organizations. While there are organizational problems and issues within such companies, there are larger problems and issues within companies such as Apple, Cisco, Google, IBM, Intel, Lockheed, Microsoft, Motorola, Oracle, Raytheon, SAP, and the like, which develop systems and embedded software as well as IT software.

Within most companies that build both IT and systems software, the two organizations are completely different. Normally, the IT organization reports to a chief information officer (CIO). The systems software groups usually report to a chief technology officer (CTO).

The CIO and the CTO are usually at the same level, so neither has authority over the other. Very seldom do these two disparate software organizations share much in the way of training, tools, methodologies, or even programming languages. Often they are located in different buildings, or even in different countries.

Because the systems software organization tends to operate as a profit center, while the IT organization tends to operate as a cost center, there is often friction and even some dislike between the two groups.

The systems software group brings in revenues, but the IT organization usually does not. The friction is made worse by the fact that compensation levels are often higher in the systems software domain than in the IT domain.

While there are significant differences between IT and systems software, there are also similarities. As the global recession intensifies and companies look for ways to save money, sharing information between IT and systems groups would seem to be advantageous.

Both sides need training in security, in quality assurance, in testing, and in software reusability. The two sides tend to be on different business cycles, so it is possible that the systems software side might be growing while the IT side is downsizing, or vice versa. Coordinating position openings between the two sides would be valuable in a recession.

Also valuable would be shared resources for certain skills that both sides use. For example, there is a chronic shortage of good technical writers, and there is no reason why technical communications could not serve the IT organization and the systems organization concurrently. Other groups such as testing, database administration, and quality assurance might also serve both the systems and IT organizations.

So long as the recession is lowering sales volumes and triggering layoffs, organizations that employ both systems software and IT groups would find it advantageous to consider cooperation.

Both sides usually have less than optimal quality, although systems software is usually superior to IT applications in that respect. It is possible that methods such as PSP, TSP, formal inspections, static analysis, automated testing, and other sophisticated quality control methods could be used by both the IT side and the systems side, which would simplify training and also allow easier transfers of personnel from one side to the other.

Colocation vs. Distributed Development

The software engineering literature supports a hypothesis that development teams that are colocated in the same complex are more productive than distributed teams of the same size located in different cities or countries.

Indeed, a study carried out by the author that dealt with large software applications such as operating systems and telecommunication systems noted that for each city added to the development of the same applications, productivity declined by about 5 percent compared with teams of identical sizes located in a single site.

The same study quantified the costs of travel from city to city. For one large telecommunications application that was developed jointly between six cities in Europe and one city in the United States, the actual costs of airfare and travel were higher than the costs of programming or coding.

The overall team size for this application was about 250, and no fewer than 30 of these software engineers or specialists would be traveling from country to country every week, and did so for more than three years.

Unfortunately, the fact that colocation is beneficial for software is an indication that "software engineering" is a craft or art form rather than an engineering field. For most engineered products such as aircraft, automobiles, and cruise ships, many components and subcomponents are built by scores of subcontractors who are widely dispersed geographically. While these manufactured parts have to be in one location for final assembly, they do not have to be constructed in the same building to be cost-effective.

Software engineering lacks sufficient precision in both design and development to permit construction from parts that can be developed remotely and then delivered for final construction. Of course, software does involve both outsourcers and remote development teams, but the current results indicate lower productivity than for colocated teams.

The author's study of remote development was done in the 1980s, before the Web and the Internet made communication easy across geographic boundaries.

Today in 2009, conference calls, webinars, wiki groups, Skype, and other high-bandwidth communication methods are readily available. In the future, even more sophisticated communication methods will become available.

It is possible to envision three separate development teams located eight hours apart, so that work on large applications could be transmitted from one time zone to another at the end of every shift. This would permit 24-hour development by switching the work to three different countries located eight hours apart. Given the sluggish multiyear development schedules of large software applications, this form of distributed development might cut schedules down by perhaps 60 percent compared with a single colocated team.

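To make the possible savings concrete, the sketch below applies the hypothesized 60 percent reduction to a nominal large-system schedule. This is a minimal sketch of the arithmetic only; the function name and the 36-month example are illustrative assumptions, not figures from a specific study.

```python
# Minimal sketch: apply the hypothesized follow-the-sun schedule
# compression to a nominal single-site schedule. The 36-month example
# and the function name are illustrative assumptions.

def follow_the_sun_schedule(single_site_months: float,
                            compression: float = 0.60) -> float:
    """Estimate a 24-hour, three-time-zone schedule from a
    single-site schedule, using the chapter's ~60 percent figure."""
    return single_site_months * (1.0 - compression)

nominal = 36.0  # calendar months for a large colocated project
print(f"Colocated:      {nominal:.0f} months")
print(f"Follow-the-sun: {follow_the_sun_schedule(nominal):.1f} months")
```
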
For this to happen, it is obvious that software would need to be an engineering discipline rather than a craft or art form, so that the separate teams could work in concert rather than damaging each other's results. In particular, the architecture, design, and coding practices would have to be understood and shared by the teams at all three locations.

What might occur in the future would be a virtual development environment that was available 24 hours a day. In this environment, avatars of the development teams could communicate "face to face" by using either their own images or generic images. Live conversations via Skype or the equivalent could also be used, as well as e-mail and various specialized tools for activities such as remote design and code inspections.

In addition, suites of design tools and project planning tools would also be available in the virtual environment so that both technical and business discussions could take place without the need for expensive travel.

In fact, a virtual war room with every team's status, bug reports, issues, schedules, and other project materials could be created that might even be more effective than today's colocated organizations.

The idea is to allow three separate teams located thousands of miles apart to operate with the same efficiency as colocated teams. It is also desirable for quality to be even better than today. Of course, with 24-hour development, schedules would be much shorter than they are today.

As of 2009, virtual environments are not yet at the level of sophistication needed to be effective for large system development. But as the recession lengthens, methods that lower costs (especially travel costs) need to be reevaluated at frequent intervals.

An even more sophisticated and effective form of software engineering involving distributed development would be that of just-in-time software engineering practices similar to those used in the construction of automobiles, aircraft, and large cruise ships.

In this case, there would need to be standard architectures that supported construction from reusable components. The components might either be already in stock, or developed by specialized vendors whose geographic locations might be anywhere on the planet.

The fundamental idea is that rather than custom design and custom coding, standard architectures and standard designs would allow construction from standard reusable components.

Of course, this idea involves many software engineering technical topics that don't fully exist in 2009, such as parts lists, standard interfaces, certification protocols for quality and security, and architectural methods that support reusable construction.

As of 2009, the cost of developing custom-built software applications ranges between $1,000 per function point and $3,000 per function point. Software maintenance and enhancements range between about $100 and $500 per function point per year, forever. These high costs make software among the most expensive business "machines" ever created.

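Assuming the figures above, a rough total cost of ownership can be sketched as development cost plus maintenance cost accumulated over the years in service. The helper name, the 10,000-function point example, and the 10-year horizon below are illustrative assumptions, not figures from the text.

```python
# Rough cost-of-ownership sketch using the quoted per-function-point
# ranges: $1,000-$3,000 per FP to develop, plus $100-$500 per FP per
# year to maintain and enhance. The 10-year service life is an
# illustrative assumption.

def lifetime_cost(size_fp: int, dev_per_fp: float,
                  maint_per_fp_year: float, years: int) -> float:
    """Development cost plus cumulative maintenance/enhancement cost."""
    return size_fp * (dev_per_fp + maint_per_fp_year * years)

size = 10_000  # function points
low = lifetime_cost(size, 1_000, 100, years=10)
high = lifetime_cost(size, 3_000, 500, years=10)
print(f"10-year cost of a {size:,} FP system: ${low:,.0f} to ${high:,.0f}")
```
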
As the recession lengthens, it is obvious that the high costs of custom software development need to be analyzed and more cost-effective methods developed. A combination of certified reusable components that could be assembled by teams that are geographically dispersed could, in theory, lead to significant cost reductions and schedule reductions as well.

A business goal for software engineers would be to bring software development costs down below $100 per function point, and annual maintenance and enhancement costs below $50 per function point.

A corollary business goal might be to reduce development schedules for 10,000-function point applications from today's averages of greater than 36 calendar months down to 12 calendar months or less.

Defect potentials should be reduced from today's averages of greater than 5.00 per function point down to less than 2.50 per function point. At the same time, average levels of defect removal efficiency should rise from today's average of less than 85 percent up to greater than 95 percent, and ideally greater than 97 percent.

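The combined effect of these two goals is easy to quantify, since delivered defects per function point are simply the defect potential multiplied by the share of defects that escape removal. A minimal sketch, using only the figures just quoted:

```python
# Delivered defects per FP = defect potential * (1 - removal efficiency).
# The inputs are the current averages and goals quoted above.

def delivered_per_fp(potential: float, removal_efficiency: float) -> float:
    return potential * (1.0 - removal_efficiency)

today = delivered_per_fp(5.00, 0.85)  # ~0.75 delivered defects per FP
goal = delivered_per_fp(2.50, 0.97)   # ~0.075 delivered defects per FP
print(f"Today: {today:.3f} delivered defects per function point")
print(f"Goal:  {goal:.3f} delivered defects per function point")
```

Meeting both goals together would thus cut delivered defects by roughly a factor of ten.
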
Colocation cannot achieve such major reductions in costs, schedules, and quality, but a combination of remote development, virtual development environments, and standard reusable components might well turn software engineering into a true engineering field, and also lower both development and maintenance costs by significant amounts.

The Challenge of Organizing Software Specialists

In a book that includes "software engineering" in the title, you might suppose that the majority of the audience at which the book is aimed are software engineers working on development of new applications. While such software engineers are a major part of the audience, they actually comprise less than one-third of the personnel who work on software in large corporations.

In today's world of 2009, many companies have more personnel working on enhancing and modifying legacy applications than on new development. Some companies have about as many test personnel as they do conventional software engineering personnel, and sometimes even more.

Some of the other software occupations are just as important as software engineers for leading software projects to a successful outcome. These other key staff members work side-by-side with software engineers, and major applications cannot be completed without their work.

A few examples of other important and specialized skills employed on software projects include architects, business analysts, database administrators, test specialists, technical writers, quality assurance specialists, and security specialists.

As discussed in Chapter 4 and elsewhere, the topic of software specialization is difficult to study because of inconsistencies in job titles, inconsistencies in job descriptions, and the use of abstract titles such as "member of the technical staff" that might encompass as many as 20 different jobs and occupations.

In this chapter, we deal with an important issue: in the presence of so many diverse skills and occupations, all of which are necessary for software projects, what is the best way to handle organization structures? Should these specialists be embedded in hierarchical structures? Should they be part of matrix software organization structures and report in to their own chain of command while reporting via "dotted lines" to project managers? Should they be part of small self-organizing teams?

This topic of organizing specialists is surprisingly ambiguous as of 2009 and has very little solid data based on empirical studies. A few solid facts are known, however:

1. Quality assurance personnel need to be protected from coercion in order to maintain a truly objective view of quality and to report honestly on problems. Therefore, the QA organization needs to be separate from the development organization all the way up to the level of a senior vice president of quality.

2. Because the work of maintenance and bug repairs is rather different from the work of new development, large corporations that have extensive portfolios of legacy software applications should consider using separate maintenance departments for bug repairs.

3. Some specialists such as technical writers would have little opportunity for promotion or job enrichment if embedded in departments staffed primarily by software engineers. Therefore, a separate technical publications organization would provide better career opportunities.

The fundamental question for specialists is whether they should be organized in skill-based units with others who share the same skills and job titles, or embedded in functional departments where they will actually exercise those skills.

The advantage of skill-based units is that they offer specialists wider career opportunities and better educational opportunities. Also, in case of injury or incapacity, the skill-based organizations can usually assign someone else to take over.

The advantage of the functional organization, where specialists are embedded in larger units with many other kinds of skills, is that the specialists are immediately available for the work of the unit.

In general, if there are a great many of a certain kind of specialist (technical writers, testers, quality assurance, etc.), the skill-based organizations seem advantageous. But for rare skills (security, architecture, etc.), there may not be enough people in the same occupation for a skill-based group to even be created.

In this chapter, we will consider various alternative methods for dealing with the organization of key specialists associated with software. There are more than 120 software-related specialties in all, and for some of these, there may only be one or two employed even in fairly large companies.

This chapter concentrates on key specialties whose work is critical to the success of large applications in large companies. Assume the software organization in a fairly large company employs a total of 1,000 personnel. In this total of 1,000 people, how many different kinds of specialists and how many specific individuals are likely to be employed? For that matter, what are the specialists that are most important to success? Table 5-1 identifies a number of these important specialists and the approximate distribution out of a total of 1,000 software personnel.

TABLE 5-1  Distribution of Software Specialists for 1,000 Total Software Staff

                                              Number    Percent
 1. Maintenance specialists                      315     31.50%
 2. Development software engineers               275     27.50%
 3. Testing specialists                          125     12.50%
 4. First-line managers                          120     12.00%
 5. Quality assurance specialists                 25      2.50%
 6. Technical writing specialists                 23      2.30%
 7. Customer support specialists                  20      2.00%
 8. Configuration control specialists             15      1.50%
 9. Second-line managers                           9      0.90%
10. Business analysts                              8      0.80%
11. Scope managers                                 7      0.70%
12. Administrative support                         7      0.70%
13. Project librarians                             5      0.50%
14. Project planning specialists                   5      0.50%
15. Architects                                     4      0.40%
16. User interface specialists                     4      0.40%
17. Cost estimating specialists                    3      0.30%
18. Measurement/metric specialists                 3      0.30%
19. Database administration specialists            3      0.30%
20. Nationalization specialists                    3      0.30%
21. Graphical artists                              3      0.30%
22. Performance specialists                        3      0.30%
23. Security specialists                           3      0.30%
24. Integration specialists                        3      0.30%
25. Encryption specialists                         2      0.20%
26. Reusability specialists                        2      0.20%
27. Test library control specialists               2      0.20%
28. Risk specialists                               1      0.10%
29. Standards specialists                          1      0.10%
30. Value analysis specialists                     1      0.10%
TOTAL SOFTWARE EMPLOYMENT                      1,000    100.00%

As can be seen from Table 5-1, software engineers do not operate all by themselves. A variety of other skills are needed in order to develop and maintain software applications in the modern world. Indeed, as of 2009, the number and kinds of software specialists are increasing, although the recession may reduce the absolute number of software personnel if it lengthens and stays severe.

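For readers who want to apply Table 5-1 to their own organizations, the percentages scale linearly with total staff size. The sketch below is illustrative only: it covers just the six largest categories from the table, and the helper name is an assumption.

```python
# Scale the Table 5-1 shares to an arbitrary total staff size.
# Only the six largest categories are included here; the shares come
# straight from the table.

SPECIALIST_SHARE = {
    "Maintenance specialists": 0.3150,
    "Development software engineers": 0.2750,
    "Testing specialists": 0.1250,
    "First-line managers": 0.1200,
    "Quality assurance specialists": 0.0250,
    "Technical writing specialists": 0.0230,
}

def staffing(total_staff: int) -> dict:
    """Approximate headcount per specialty for a given total staff."""
    return {role: round(total_staff * share)
            for role, share in SPECIALIST_SHARE.items()}

for role, count in staffing(1_000).items():
    print(f"{role:32s} {count:4d}")
```
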
Software Organization Structures from Small to Large

The observed sizes of software organization structures range from a low of one individual up to a high that consists of multidisciplinary teams of 30 personnel or more.

For historical reasons, the "average" size of software teams tends to be about eight personnel reporting to a manager or team leader. However, both smaller and larger teams are quite common.

This section of Chapter 5 examines the sizes and attributes of software organization structures from small to large, starting with one-person projects.

One-Person Software Projects

The most common corporate purpose for one-person projects is that of carrying out maintenance and small enhancements to legacy software applications. For new development, building web sites is a typical one-person activity in a corporate context.

However, a fairly large number of one-person software companies actually develop small commercial software packages such as iPhone applications, shareware, freeware, computer games, and other small applications. In fact, quite a lot of innovative new software and product ideas originate from one-person companies.

Demographics  Because small software maintenance projects are common, on any given day, probably close to 250,000 one-person projects are under way in the United States, with the majority being maintenance and enhancements.

In terms of one-person companies that produce small applications, the author estimates that as of 2009, there are probably more than 10,000 in the United States. This has been a surprisingly fruitful source of innovation, and is also a significant presence in the open-source, freeware, and shareware domains.

Project size  The average size of new applications done by one-person projects is about 50 function points, and the maximum size is below 1,000 function points. For maintenance or defect repair work, the average size is less than 1 function point and seldom tops 5 function points. For enhancements to legacy applications, the average size is about 5 to 10 function points for each new feature added, and seldom tops 15 function points.

Productivity rates  Productivity rates for one-person efforts are usually quite good, and top 30 function points per staff month. One caveat is that if the one-person development team also has to write user manuals and provide customer support, then productivity gets cut approximately in half.

Another caveat is that many one-person companies are home based. Therefore, unexpected events such as a bout of flu, a new baby, or some other normal family event such as weddings and funerals can have a significant impact on the work at hand.

A third caveat is that one-person software projects are very sensitive to the skill and work practices of specific individuals. Controlled experiments indicate about a 10-to-1 difference between the best and worst results for tasks such as coding and bug removal. That being said, quite a few of the people who migrate into one-person positions tend to be at the high end of the competence and performance scale.

Schedules  Development schedules for one-person maintenance and enhancement projects usually range between a day and a week. For new development by one person, schedules usually range between about two months and six months.

Quality  The quality levels for one-person applications are not too bad. Defect potentials run to about 2.5 bugs per function point, and defect removal efficiency is about 90 percent. Therefore, a small iPhone application of 25 function points might have a total of about 60 bugs, of which 6 will still be present at release.

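The arithmetic behind that example is worth making explicit, since the same pattern recurs throughout this chapter. A minimal sketch using the figures just given:

```python
# The worked example above: a 25 function point application with a
# defect potential of 2.5 bugs per FP and 90 percent defect removal.

size_fp = 25
defect_potential = 2.5       # bugs per function point
removal_efficiency = 0.90

total_bugs = size_fp * defect_potential            # ~62 bugs in total
delivered = total_bugs * (1 - removal_efficiency)  # ~6 bugs at release
print(f"Total defects: ~{total_bugs:.0f}; delivered: ~{delivered:.0f}")
```
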
Specialization  You might think that one-person projects would be the domain of generalists, since it is obvious that special skills such as testing and documentation all have to be found in the same individual. However, one of the more surprising results of examining one-person projects is that many of them are carried out by people who are not software engineers or programmers at all.

For embedded and systems software, many one-person software projects are carried out by electrical engineers, telecommunication engineers, automotive engineers, or some other type of engineer. Even for business software, some one-person projects may be carried out by accountants, attorneys, business analysts, and other domain experts who are also able to program. This is one of the reasons why such a significant number of inventions and new ideas flow from small companies and one-person projects.

Cautions and counter indications  The major caution about one-person projects for either development or maintenance is lack of backup in case of illness or incapacity. If something should happen to that one person, work will stop completely.

A second caution is that if the person developing software is a domain expert (i.e., accountant, business analyst, statistician, etc.) who is building an application for personal use in a corporation, there may be legal questions involving the ownership of the application should the employee leave the company.

A third caution is that there may be liability issues in case the software developed by a knowledge worker contains errors or does some kind of damage to the company or its clients.

Conclusions  One-person projects are the norm and are quite effective for small enhancement updates and for maintenance changes to legacy applications.

Although one-person development projects must necessarily be rather small, a surprising number of innovations and good ideas have originated from brilliant individual practitioners.

Pair Programming for Software Development and Maintenance

The idea of pair programming is for two software developers to share one computer and take turns doing the coding, while the other member of the team serves as an observer. The roles switch back and forth between the two at frequent intervals, such as perhaps every 30 minutes to an hour. The team member doing the coding is called the driver, and the other member is the navigator or observer.

As of 2009, the results of pair programming are ambiguous. Several studies indicate fewer defects from pair programming, while others assert that development schedules are improved as well.

However, all of the experiments were fairly small in scale and fairly narrow in focus. For example, no known study of pair-programming defects compared the results against an individual programmer who used static analysis and automatic testing. Neither have studies compared top-gun individuals against average to mediocre pairs, or vice versa.

There are also no known studies that compare the quality results of pair programming against proven quality approaches such as formal design and code inspections, which have almost 50 years of empirical data available, and which also utilize the services of other people for finding software defects.

While many of the pair-programming experiments indicate shorter development schedules, none indicate reduced development effort or costs from having two people perform work that is normally performed by one person.

For pair programming to lower development costs, schedules would have to be reduced by more than 50 percent. However, experiments and data collected to date indicate schedule reductions of only about 15 percent to 30 percent, which would have the effect of raising development costs by more than 50 percent compared with a single individual doing the same work.

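The underlying arithmetic is simple: total effort is roughly headcount multiplied by schedule, so doubling the headcount requires halving the schedule just to break even. A minimal sketch of that trade-off (the function name is an illustrative assumption):

```python
# Pair effort relative to solo effort, treating effort as
# headcount * schedule (solo baseline = 1 person * 1.0 schedule).

def pair_cost_ratio(schedule_reduction: float) -> float:
    return 2.0 * (1.0 - schedule_reduction)

for reduction in (0.15, 0.30, 0.50):
    print(f"Schedule cut {reduction:.0%}: pair costs "
          f"{pair_cost_ratio(reduction):.2f}x solo")
# Only a schedule cut of more than 50 percent brings the ratio below 1.0.
```
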
Pair-programming enthusiasts assert that better quality will compensate for higher development effort and costs, but that claim is not supported by studies that included static analysis, automatic testing, formal inspections, and other sophisticated defect removal methods. The fact that two developers who use manual defect removal methods might have lower defects than one developer using manual defect removal methods is interesting but unconvincing.

Pair programming might be an interesting and useful method for developing reusable components, which need to have very high quality and reliability, but where development effort and schedules are comparatively unimportant. However, Watts Humphrey's Team Software Process (TSP) is also an excellent choice for reusable components and has far more historical data available than pair programming does.

Subjectively, the pair-programming concept seems to be enjoyable to many who have experienced it. The social situation of having another colleague involved with complicated algorithms and code structures is perceived as being advantageous.

As the recession of 2009 continues to expand and layoffs become more numerous, it is very likely that pair programming will no longer be utilized, due to the fact that companies will be reducing software staffs down to minimal levels and can no longer afford the extra overhead.

Most of the literature on pair programming deals with colocation in a single office. However, remote pair programming, where the partners are in different cities or countries, is occasionally cited.

Pair programming is an interesting form of collaboration, and collaboration is always needed for applications larger than about 100 function points in size.

In the context of test-driven development, one interesting variation of pair programming would be for one of the pair to write test cases and the other to write code, and then to switch roles.

Another area where pair programming has been used successfully is that of maintenance and bug repairs. One maintenance outsource company has organized their maintenance teams along the lines of an urban police station. The reason for this is that bugs come in at random intervals, and there is always a need to have staff available when a new bug is reported, especially a new high-severity bug.

In the police model of maintenance, a dispatcher and several pairs of maintenance programmers work as partners, just as police detectives work as partners.

During defect analysis, having two team members working side by side speeds up finding the origins of reported bugs. Having two people work on the defect repairs as partners also speeds up the repair intervals and reduces bad-fix injections. (Historically, about 7 percent of attempts to repair a bug accidentally introduce a new bug in the fix itself. These are called bad fixes.)

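The bad-fix rate also lends itself to a quick back-of-the-envelope calculation. The sketch below is illustrative only; the 500-repair example is an assumption, not a figure from the text.

```python
# Expected new bugs injected by a batch of repairs, using the
# historical bad-fix rate of about 7 percent quoted above.

BAD_FIX_RATE = 0.07

def expected_bad_fixes(repairs: int, rate: float = BAD_FIX_RATE) -> float:
    return repairs * rate

print(f"500 repairs -> ~{expected_bad_fixes(500):.0f} new bugs injected")
```
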
In fact, pair programming for bug repairs and maintenance activities looks as if it may be the most effective use of pairs yet noted.

Demographics  Because pair programming is an experimental approach, the method is not widely deployed. As the recession lengthens, there may be even less pair programming. The author estimates that as of 2009, perhaps 500 to 1,000 pairs are currently active in the United States.

Project size  The average size of new applications done by pair-programming teams is about 75 function points, and the maximum size is fewer than 1,000 function points. For maintenance or defect repair work, the average size is less than 1 function point. For enhancements to legacy applications, the average size is about 5 to 10 function points for each new feature added.

Productivity rates  Productivity rates for pair-programming efforts are usually in the range of 16 to 20 function points per staff month, or 30 percent less than the same project done by one person.

Pair-programming software projects are very sensitive to the skill and work practices of specific individuals. As previously mentioned, controlled experiments indicate about a 10-to-1 range difference between the best and worst results for tasks such as coding and bug removal by individual participants in such studies.

Some psychological studies of software personnel indicate a tendency toward introversion, which may make the pair-programming concept uncomfortable to some software engineers. The literature on pair programming does indicate social satisfaction.

Schedules  Development schedules for pair-programming maintenance and enhancement projects usually range between a day and a week. For new development by pairs, schedules usually range between about two months and six months. Schedules tend to be about 10 percent to 30 percent shorter than one-person efforts for the same number of function points.

Quality  The quality levels for pair-programming applications are not bad. Defect potentials run to about 2.5 bugs per function point, and defect removal efficiency is about 93 percent. Therefore, a small iPhone application of 25 function points might have a total of about 60 bugs, of which 4 will still be present at release. This is perhaps 15 percent better than individual developers using manual defect removal and testing.

However, there is no current data that compares pair programming with individual programming efforts where automated static analysis and automated testing are part of the equation.

Specialization  There are few studies to date on the role of specialization in a pair-programming context. However, there are reports of interesting distributions of effort. For example, one of the pair might write test cases while the other is coding, or one might write user stories while the other codes.

To date there are no studies of pair programming that concern teams with notably different backgrounds working on the same application; that is, a software engineer teamed with an electrical engineer or an automotive engineer, a software engineer teamed with a medical doctor, and so forth. The pairing of unlike disciplines would seem to be a topic that might be worth experimenting with.

Cautions and counter indications  The topic of pair programming needs additional experimentation before it can become a mainstream approach, if indeed it ever does. The experiments need to include more sophisticated quality control, and also to compare against top-gun individual programmers. The higher costs of pair programming are not likely to gain adherents during a strong recession.

Conclusions  There is scarcely enough empirical data about pair programming to draw solid conclusions. Experiments and anecdotal results are generally favorable, but the experiments to date cover only a few variables and ignore important topics such as the role of static analysis, automatic testing, inspections, and other quality factors. As the global recession lengthens and deepens, pair programming may drop from view due to layoffs and downsizing of software organizations.

Self-Organizing Agile Teams

For several years, as the Agile movement gained adherents, the concept of small self-organizing teams also gained adherents. The concept of self-organized teams is that rather than have team members reporting to a manager or formal team leader, the members of the team would migrate to roles that they felt most comfortably matched their skills.

In a self-organizing team, every member will be a direct contributor to the final set of deliverables. In an ordinary department with a manager, the manager is usually not a direct contributor to the code or deliverables that reach end users. Therefore, self-organizing teams should be slightly more efficient than ordinary departments of the same size, because they would have one additional worker.

In U.S. businesses, ordinary departments average about eight employees per manager. The number of employees reporting to a manager is called the span of control. (The actual observed span of control within large companies such as IBM has ranged from a low of 2 to a high of 30 employees per manager.)

For self-organizing teams, the nominal range of size is about "7 plus or minus 2." However, to truly match any given size of software project, team sizes need to range from a low of two up to a maximum of about 12.

A significant historical problem with software has been that of decomposing applications to fit existing organization structures, rather than decomposing the applications into logical pieces based on the fundamental architecture.

The practical effect has been to divide large applications into multiple segments that can be developed by an eight-person department, whether or not that matches the architecture of the application.

In an Agile context, a user representative may be a member of the team and provides inputs as to the features that are needed, and also provides experiential reports based on running the pieces of the application as they are finished. The user representative has a special role and normally does not do any code development, although some test cases may be created by the embedded user representative. Obviously, the user will provide inputs in terms of user stories, use cases, and informal descriptions of the features that are needed.

In theory, self-organizing teams are cross-functional, and everyone contributes to every deliverable on an as-needed basis. However, it is not particularly effective for people to depart from their main areas of competence. Technical writers may not make good programmers. Very few people are good technical writers. Therefore, the best results tend to be achieved when team members follow their strengths.

However, in areas where everyone (or no one) is equally skilled, all can participate. Creating effective test cases may be an example where skills are somewhat sparse throughout. Dealing with security of code is an area where so few people are skilled that if it is a serious concern, outside expertise will probably have to be imported to support the team.

Another aspect of self-organizing teams is the usage of daily status meetings, which are called Scrum sessions, using a term derived from the game of rugby. Typically, Scrum sessions are short and deal with three key issues: (1) what has been accomplished since the last Scrum session, (2) what is planned between today and the next Scrum session, and (3) what problems or obstacles have been encountered.

(Scrum is not the only method of meeting and sharing information. Phone calls, e-mails, and informal face-to-face meetings occur every day. There may also be somewhat larger meetings among multiple teams, on an as-needed basis.)

One of the controversial roles with self-organizing teams is that of Scrum master. Nominally, the Scrum master is a form of coordinator for the entire project and is charged with setting expectations for work that spans multiple team members; that is, the Scrum master is a sort of coach. This role means that the personality and leadership qualities of the Scrum master exert a strong influence on the overall team.

Demographics  Because Agile has been on a rapid growth path for several years, the number of small Agile teams is still increasing. As of 2009, the author estimates that in the United States alone there are probably 35,000 small self-organizing teams that collectively employ about 250,000 software engineers and other occupations.

Project size  The average size of new applications done by self-organizing teams with seven members is about 1,500 function points, and the maximum size is perhaps 3,000 function points. (Beyond 3,000 function points, teams of teams would be utilized.) Self-organizing teams are seldom used for maintenance or defect repair work, since a bug's average size is less than 1 function point and needs only one person. For enhancements to legacy applications, self-organizing teams might be used for major enhancements in the 150 to 500 function point range. For smaller enhancements of 5 to 10 function points, individuals would probably be used for coding, with perhaps some assistance from testers, technical writers, and integration specialists.

Although there are methods for scaling up small teams to encompass teams of teams, scaling has been a problem for self-organizing teams. In fact, the entire Agile philosophy seems better suited to applications below about 2,500 function points. Very few examples of large systems greater than 10,000 function points have even been attempted using Agile or self-organizing teams.

Productivity rates  Productivity rates for self-organizing teams on projects of 1,500 function points are usually in the range of 15 function points per staff month. They sometimes top 20 function points per staff month for applications where the team has significant expertise and may drop below 10 function points per staff month for unusual or complex projects.

Productivity rates for individual sprints are higher, but that fact is somewhat irrelevant because the sprints do not include final integration of all components, system test of the entire application, and the final user documentation.

Self-organizing team projects tend to minimize the performance ranges of individuals and may help to bring novices up to speed fairly quickly. However, if the range of performance on a given team exceeds about 2-to-1, those at the high end of the performance range will become dissatisfied with the work of those at the low end of the range.

Schedules  Development schedules for new development by self-organizing teams for typical 1,500-function point projects usually range between about 9 months and 18 months, and would average perhaps 12 calendar months for the entire application.

However, the Agile approach is to divide the entire application into a set of segments that can be developed independently. These are called sprints and would typically be of a size that can be completed in perhaps one to three months. For an application of 1,500 function points, there might be five sprints of about 300 function points each. The schedule for each sprint might be around 2.5 calendar months.

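That decomposition can be expressed as a small calculation. A minimal sketch, in which the function name and the ceiling-based rounding are illustrative assumptions:

```python
# Decompose an application into sprints and estimate the serial
# calendar time, per the 1,500 function point example above.

import math

def plan_sprints(app_size_fp: int, sprint_size_fp: int,
                 months_per_sprint: float):
    """Return the sprint count and the serial calendar time."""
    sprints = math.ceil(app_size_fp / sprint_size_fp)
    return sprints, sprints * months_per_sprint

sprints, months = plan_sprints(1_500, 300, 2.5)
print(f"{sprints} sprints of ~300 FP; ~{months:.1f} months if run serially")
```
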
Quality  The quality levels for self-organizing teams are not bad, but usually don't achieve the levels of methods such as Team Software Process (TSP), where quality is a central issue. Typical defect potentials run to about 4.5 bugs per function point, and defect removal efficiency is about 92 percent.

Therefore, an application of 1,500 function points developed by a self-organizing Agile team might have a total of about 6,750 bugs, of which 540 would still be present at release. Of these, about 80 might be serious bugs.

However, if tools such as automated static analysis and automated testing are used, then defect removal efficiency can approach 97 percent. In this situation, only about 200 bugs might be present at release. Of these, perhaps 25 might be serious.

Specialization  There are few studies to date on the role of specialization in self-organizing teams. Indeed, some enthusiasts of self-organizing teams encourage generalists. They tend to view specialization as being similar to working on an assembly line. However, generalists often have gaps in their training and experience. The kinds of specialists who might be useful would be security specialists, test specialists, quality assurance specialists, database specialists, user-interface specialists, network specialists, performance specialists, and technical writers.

Cautions and counter indications  The main caution about self-organizing teams is that the lack of a standard and well-understood structure opens up the team to the chance of power struggles and disruptive social conflicts.

A second caution is that scaling Agile up from small applications to large systems with multiple teams in multiple locations has proven to be complicated and difficult.

A third caution is that the poor measurement practices associated with Agile and with many self-organizing teams give the method the aura of a cult rather than of an engineering discipline. The failure either to measure productivity or quality, or to report benchmarks using standard metrics, is a serious deficiency.

Conclusions  The literature and evidence for self-organizing Agile teams is somewhat mixed and ambiguous. For the first five years of the Agile expansion, self-organizing teams were garnering a majority of favorable if subjective articles.

Since about the beginning of 2007, on the other hand, an increasing number of articles and reports have appeared that raise questions about self-organizing teams and that even suggest that they be abolished due to confusion as to roles, disruptive power struggles within the teams, and outright failures of the projects.

This is a typical pattern within the software industry. New development methods are initially championed by charismatic individuals and start out by gaining a significant number of positive articles and positive books, usually without any empirical data or quantification of results.

After several years, problems begin to be noted, and increasing numbers of applications that use the method may fail or be unsuccessful. In part this may be due to poor training, but the primary reason is that almost no software development method is fully analyzed or used under controlled conditions prior to deployment. Poor measurement practices and a lack of benchmarks are also chronic problems that slow down evaluation of software methods.

Unfortunately, self-organizing teams originated in the context of Agile development. Agile has been rather poor in measuring either productivity or quality, and creates almost no effective benchmarks. When Agile projects are measured, they tend to use special metrics such as story points or use-case points, which are not standardized and lack empirical collections of data and benchmarks.

Team Software Process (TSP) Teams

The concept of Team Software Process (TSP) was developed by Watts Humphrey, based on his experiences at IBM and as the originator of the capability maturity model (CMM) for the Software Engineering Institute (SEI).

The TSP concept deals with the roles and responsibilities needed to achieve successful software development. But TSP is built on individual skills and responsibilities, so it needs to be considered in context with the Personal Software Process (PSP). Usually, software engineers and specialists learn PSP first, and then move to TSP afterwards.

Because of the background of Watts Humphrey with IBM and with the capability maturity model, the TSP approach is congruent with the modern capability maturity model integrated (CMMI) and appears to satisfy many of the criteria for CMMI level 5, which is the top or highest level of the CMMI structure.

Because TSP teams are self-organizing teams, they have a surface resemblance to Agile teams, which are also self-organizing. However, the Agile teams tend to adopt varying free-form structures based on the skills and preferences of whoever is assigned to the team.

The TSP teams, on the other hand, are built on a solid underpinning of specific roles and responsibilities that remain constant from project to project. Therefore, with TSP teams, members are selected based on specific skill criteria that have been shown to be necessary for successful software projects. Employees who lack needed skills would probably not become members of TSP teams, unless training were available.

Also, prior training in PSP is mandatory for TSP teams. Other kinds of training such as estimating, inspections, and testing may also be used as precursors.

Another interesting difference between Agile teams and TSP teams is the starting point of the two approaches. The Agile methods were originated by practitioners whose main concerns were comparatively small IT applications of 1,500 or fewer function points. The TSP approach was originated by practitioners whose main concerns were large systems software applications of 10,000 or more function points.

The difference in starting points leads to some differences in skill sets and specialization. Because small applications use few specialists, Agile teams are often populated by generalists who can handle design, coding, testing, and even documentation on an as-needed basis.

Because TSP teams are often involved with large applications, they tend to utilize specialists for topics such as configuration control, integration, testing, and the like.

While both Agile and TSP share a concern for quality, they tend to go after quality in very different fashions. Some of the Agile methods are based on test-driven development, or creating test cases prior to creating the code. This approach is fairly effective. However, Agile tends to avoid formal inspections and is somewhat lax on recording defects and measuring quality.

With TSP, formal inspections of key deliverables are an integral part, as is formal testing. Another major difference is that TSP is very rigorous in measuring every single defect encountered from the first day of requirements through delivery, while defect measures during Agile projects are somewhat sparse and usually don't occur before testing.

Both Agile and TSP may utilize automated defect tracking tools, and both may utilize approaches such as static analysis, automated testing, and automated test library controls.

Some other differences between Agile and TSP do not necessarily affect the outcomes of software projects, but they do affect what is known about those outcomes. Agile tends to be lax on measuring productivity and quality, while TSP is very rigorous in measuring task hours, earned value, defect counts, and many other quantified facts.

Therefore, when projects are finished, Agile projects have only vague and unconvincing data that demonstrates either productivity or quality results. TSP, on the other hand, has a significant amount of reliable quantified data available.

TSP can be utilized with both hierarchical and matrix organization structures, although hierarchical structures are perhaps more common.

Watts Humphrey reports that TSP is used for many different kinds of software, including defense applications, civilian government applications, IT applications, commercial software in companies such as Oracle and Adobe, and even by some of the computer game companies, where TSP has proven to be useful in eliminating annoying bugs.

Demographics  TSP is most widely used by large organizations that employ between perhaps 1,000 and 50,000 total software personnel. Because of the synergy between TSP and the CMMI, it is also widely used by military and defense software organizations. These large organizations tend to have scores of specialized skills and hundreds of projects going on at the same time.

The author estimates that there are about 500 companies in the United States now using TSP. While usage may be experimental in some of these companies, usage is growing fairly rapidly due to the success of the approach. The number of software personnel using TSP in 2009 is perhaps 125,000 in the United States.

Project size  The average size of new applications done by TSP teams with eight employees and a manager is about 2,000 function points. However, TSP organizations can be scaled up to any arbitrary size, so even large systems in excess of 100,000 function points can be handled by TSP teams working in concert. For large applications with multiple TSP teams, some specialist teams such as testing, configuration control, and integration also support the general development teams.

Another caveat with multiple teams attempting to cooperate is that when more than about a dozen teams are involved simultaneously, some kind of a project office may be needed for overall planning and coordination.

Productivity rates  Productivity rates for TSP departments on projects of 2,000 function points are usually in the range of 14 to 18 function points per staff month. They sometimes top 22 function points per staff month for applications where the team has significant expertise, and may drop below 10 function points per staff month for unusual or complex projects. Productivity tends to be inversely proportional to application size and declines as applications grow larger.

Schedules  Development schedules for new development by TSP groups with eight team members working on a 2,000-function point project usually range between about 12 months and 20 months, and would average perhaps 14 calendar months for the entire application.

Quality  The quality levels for TSP organizations are exceptionally good. Average defect potentials with TSP run to about 4.0 bugs per function point, and defect removal efficiency is about 97 percent. Delivered defects would average about 0.12 per function point.

Therefore, an application of 2,000 function points developed by a single TSP department might have a total of about 8,000 bugs, of which 240 would still be present at release. Of these, about 25 might be serious bugs.

However, if in addition to pretest inspections, tools such as automated static analysis and automated testing are used, then defect removal efficiency can approach 99 percent. In this situation, only about 80 bugs might be present at release. Of these, perhaps 8 might be serious bugs, which is a rate of only 0.004 per function point.

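Putting the chapter's quality figures for the different organization types side by side makes the TSP advantage visible at a glance. The sketch below simply recomputes delivered defects per function point from the defect potentials and removal efficiencies quoted in the preceding sections; the dictionary layout is an illustrative assumption.

```python
# Delivered defects per FP for each organization type, using the
# (defect potential, removal efficiency) pairs quoted in this chapter.

PROFILES = {
    "One-person project": (2.5, 0.90),
    "Pair programming": (2.5, 0.93),
    "Self-organizing Agile team": (4.5, 0.92),
    "TSP team": (4.0, 0.97),
    "TSP + static analysis/automated testing": (4.0, 0.99),
}

for name, (potential, efficiency) in PROFILES.items():
    delivered = potential * (1 - efficiency)
    print(f"{name:42s} {delivered:.3f} delivered defects per FP")
```
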
Generally, as application sizes increase, defect potentials also increase, while defect removal efficiency levels decline. Interestingly, with TSP, this rule may not apply. Some of the larger TSP applications achieve more or less the same quality as small applications.

Another surprising finding with TSP is that productivity does not seem to degrade significantly as application size goes up. Normally, productivity declines with application size, but Watts Humphrey reports no significant reductions across a wide range of application sizes. This assertion requires additional study, because that would make TSP unique among software development methods.

Specialization  TSP envisions a wide variety of specialists. Most TSP teams will have numerous specialists for topics such as architecture, testing, security, database design, and many others.

Interestingly, the TSP approach does not recommend software quality assurance (SQA) as being part of a standard TSP team. This is because of the view that the TSP team itself is so rigorous in quality control that SQA is not needed.

In
companies where SQA groups
are responsible for
collecting quality
data,
TSP teams will provide such
data as needed, but it will be
collected
by
the team's own personnel
rather than by an SQA person
or staff
assigned
to the project.
Cautions and counter indications
The main caution about TSP organizations and projects is that while they measure many important topics, they do not use standard metrics such as function points. The TSP use of task hours is more or less unique, and it is difficult to compare task hours against standard resource metrics.

Another caution is that few if any TSP projects have ever submitted benchmark data to any of the formal software benchmark groups such as the International Software Benchmarking Standards Group (ISBSG). As a result, it is almost impossible to compare TSP against other methods without doing complicated data conversion.

It is technically feasible to calculate function point totals using several of the new high-speed function point methods. In fact, quantifying function points for both new applications and legacy software now takes only a few minutes. Therefore, reporting on quality and productivity using function points would not be particularly difficult.

Converting task-hour data into normal workweek and work-month information would be somewhat more troublesome, but no doubt the data could be converted using algorithms or some sort of rule-based expert system.

It would probably be advantageous for both Agile and TSP projects to adopt high-speed function point methods and to submit benchmark results to one or more of the benchmark organizations such as ISBSG.
Conclusions
The TSP approach tends to achieve a high level of successful applications and few if any failures. As a result, it deserves to be studied in depth.

From observations made during litigation for projects that failed or never operated successfully, TSP has not yet had failures that ended up in court. This may change as the number of TSP applications grows larger.

TSP emphasizes the competence of the managers and technical staff, and it emphasizes effective quality control and change management control. Effective estimating and careful progress tracking also are standard attributes of TSP projects. The fact that TSP personnel are carefully trained before starting to use the method, and that experienced mentors are usually available, explains why TSP is seldom misused.

With Agile, for example, there may be a dozen or more variations of how development activities are performed, but they still use the name "Agile" as an umbrella term. TSP activities are more carefully defined and used, so when the author visited TSP teams in multiple companies, the same activities carried out the same way were noted.

Because of the emphasis on quality, TSP would be a good choice as the construction method for standard reusable components. It also seems to be a good choice for hazardous applications where poor quality might cause serious problems; that is, in medical systems, weapons systems, financial applications, and the like.
Conventional Departments with Hierarchical Organization Structures

The concept of hierarchical organizations is the oldest method for assigning social roles and responsibilities on the planet. The etymology of the word "hierarchy" is from the Greek, and the meaning is "rule by priests." But the concept itself is older than Greece and was also found in Egypt, Sumer, and most other ancient civilizations.

Many religions are organized in hierarchical fashion, as are military organizations. Some businesses are hierarchical if they are privately owned. Public companies with shareholders are usually semi-hierarchical, in that the operating units report upward level-by-level to the president or chief executive officer (CEO). The CEO, however, reports to a board of directors elected by the shareholders, so the very top level of a public company is not exactly a true hierarchy.
In a hierarchical organization, units of various sizes each have a formal leader or manager who is appointed to the position by higher authorities. While the appointing authority is often the leader of the next highest level of organization in the structure, the actual power to appoint is usually delegated from the top of the hierarchy. Once appointed, each leader reports to the next highest leader in the same chain of command.

While appointed leaders or managers at various levels have authority to issue orders and to direct their own units, they are also required to adhere to directives that descend from higher authorities. Progress reports flow back up to higher authorities.
In business hierarchies, lower-level managers are usually appointed by the manager of the next highest level. But for executive positions such as vice presidents, the appointments may be made by a committee of top executives. The purpose of this, at least in theory, is to ensure the competence of the top executives of the hierarchy. However, the recent turmoil in the financial sector and the expanding global recession indicates that top management tends to be a weak link in far too many companies.

It should be noted that the actual hierarchical structure of an organization and its power structure may not be identical. For example, in Japan during the Middle Ages, the emperor was at the top of the formal government hierarchy, but actual ruling power was vested in a military organization headed by a commander called the shogun. Only the emperor could appoint the shogun, but the specific appointment was dictated by the military leadership, and the emperor had almost no military or political power.
A longstanding issue with hierarchical organizations is that if the leader at the top of the pyramid is weak or incompetent, the entire structure may be at some risk of failing. For hierarchical governments, weak leadership may lead to revolutions or loss of territory to strong neighbors.

For hierarchical business organizations, weak leadership at the top tends to lead to loss of market share and perhaps to failure or bankruptcy. Indeed, analysis of the recent business failures from Enron through Lehman Brothers does indicate that the top of these hierarchies did not have the competence and insight necessary to deal with serious problems, or even to understand what the problems were.
It is an interesting business phenomenon that the life expectancy of a hierarchical corporation is approximately equal to the life expectancies of human beings. Very few companies live to be 100 years old. As the global recession lengthens and deepens, a great many companies are likely to expire, although some will expand and grow stronger.

A hierarchical organization has two broad classes of employees. One of these classes consists of the workers or specialists who actually do the work of the enterprise. The second class consists of the managers and executives to whom the workers report. Of course, managers also report to higher-level managers.

The distinction between technical work and managerial work is so deeply embedded in hierarchical organizations that it has created two very distinct career paths: management and technical work.
When starting out their careers, young employees almost always begin as technical workers. For software, this means starting out as software engineers, programmers, systems analysts, technical writers, and the like. After a few years of employment, workers need to make a career choice and either get promoted into management or stay with technical work.

The choice is usually determined by personality and personal interests. Many people like technical work and never want to get into management. Other people enjoy planning and coordination of group activities and opt for a management career.
There is an imbalance in the numbers of managers and technical workers. In most companies, the managerial community totals to about 15 percent of overall employment, while the technical workers total to about 85 percent. Since managers are not usually part of the production process of the company, it is important not to have an excessive number of managers and executives. Too many managers and executives tend to degrade operational performance. This has been noted in both business and military organizations.

It is interesting that up to a certain point, the compensation levels of technical workers and managers are approximately the same. For example, in most corporations, the top technical workers can have compensation that equals third-line managers. However, at the very top of corporations, there is a huge imbalance.
The CEOs of a number of corporations and some executive vice presidents have compensation packages that are worth millions of dollars. In fact, some executive compensation packages are more than 250 times the compensation of the average worker within the company. As the global recession deepens, these enormous executive compensation packages are being challenged by both shareholders and government regulators.

Another topic that is beginning to be questioned is the span of control, or the number of technical workers who report to one manager. For historical reasons that are somewhat ambiguous, the average department in the United States has about eight technical workers reporting to one manager. The ranges observed run from two employees per manager to about 30 employees per manager.
Assuming an average of eight technical workers per manager, then about 12.5 percent of total employment would be in the form of first-line managers. When higher-level managers are included, the overall total is about 15 percent.
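The arithmetic behind these percentages is easy to reproduce. In the illustrative sketch below (not from the text), one first-line manager per department of eight workers is 1/8 = 12.5 percent relative to the technical staff, and 1/9, or roughly 11 percent, of total headcount; raising the span to 12 shrinks the managerial share accordingly.

    # Minimal sketch of span-of-control arithmetic. Assumes one first-line
    # manager per department of `span` technical workers.
    def manager_share(span):
        return 1 / span, 1 / (span + 1)  # share of staff, share of headcount

    for span in (8, 12):
        of_staff, of_total = manager_share(span)
        print(f"span {span}: {of_staff:.1%} of technical staff, "
              f"{of_total:.1%} of total headcount")
    # span 8: 12.5% of technical staff, 11.1% of total headcount
    # span 12: 8.3% of technical staff, 7.7% of total headcount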
From analyzing appraisal scores and examining complaints against managers in large corporations, it appears that somewhat less than 15 percent of the human population is qualified to be effective in management; indeed, the true figure may be closer to 10 percent or less.

That being said, it might be of interest to study raising the average span of control from 8 workers per manager up to perhaps 12 workers per manager. Weeding out unqualified managers and restoring them to technical work might improve overall efficiency and reduce the social discomfort caused by poor management.
Practicing managers state that increasing the span of control would lower their ability to control projects and understand the actual work of their subordinates. However, time and motion studies carried out by the author in large corporations such as IBM found that software managers tended to spend more time in meetings with other managers than in discussions or meetings with their own employees. In fact, a possible law of business is "managerial meetings are inversely proportional to the span of control." The more managers on a given project, the more time they spend with other managers rather than with their own employees.
Another and more controversial aspect of this study had to do with project failure rates, delays, and other mishaps. For large projects with multiple managers, the failure rates seem to correlate more closely to the number of managers involved with the projects than with the number of software engineers and technical workers.

While the technical workers often managed to do their jobs and get along with their colleagues in other departments, managerial efforts tend to be diluted by power struggles and debates with other managers.

This study needs additional research and validation. However, it led to the conclusion that increasing the span of control and reducing managerial numbers tends to raise the odds of a successful software project outcome. This would especially be true if the displaced managers happened to be those of marginal competence for managerial work.
In many hierarchical departments with generalists, the same people do both development and maintenance. It should be noted that if the same software engineers are responsible for both development and maintenance concurrently, it will be very difficult to estimate their development work with accuracy. This is because maintenance work involved with fixing high-severity defects tends to preempt software development tasks and therefore disrupts development schedules.

Another topic of significance is that when exit interviews are reviewed for technical workers, two troubling facts are noted: (1) technical workers with the highest appraisal scores tend to leave in the largest numbers; and (2) the most common reason cited for leaving a company is "I don't like working for bad management."
Another interesting phenomenon about management in hierarchical organizations is termed "the Peter Principle" and needs to be mentioned briefly. The Peter Principle was created by Dr. Laurence J. Peter and Raymond Hull in the 1968 book of the same name. In essence, the Peter Principle holds that in hierarchical organizations, workers and managers are promoted based on their competence and continue to receive promotions until they reach a level where they are no longer competent. As a result, a significant percentage of older employees and managers occupy jobs for which they are not competent.

The Peter Principle may be amusing (it was first published in a humorous book), but given the very large number of cancelled software projects and the even larger number of schedule delays and cost overruns, it cannot be ignored or discounted in a software context.
Assuming that the atomic unit of a hierarchical software organization consists of eight workers who report to one manager, what are their titles, roles, and responsibilities?

Normally, the hierarchical mode of organization is found in companies that utilize more generalists than specialists. Because software specialization tends to increase with company size, the implication is that hierarchical organizations are most widely deployed for small to midsize companies with small technical staffs. Most often, hierarchical organizations are found in companies that employ between about 5 and 50 software personnel.
The primary job title in a hierarchical structure would be programmer or software engineer, and such personnel would handle both development and maintenance work.

However, the hierarchical organization is also found in larger companies and in companies that do have specialists. In this case, an eight-person department might have a staffing complement of five software engineers, two testers, and a technical writer all reporting to the same manager.

Large corporations have multiple business units such as marketing, sales, finance, human resources, manufacturing, and perhaps research. Using hierarchical principles, each of these might have its own software organization dedicated to building the software used by a specific business unit; that is, financial applications, manufacturing support applications, and so forth.
But what happens when some kind of a corporate or enterprise application is needed that cuts across all business units? Cross-functional applications turned out to be difficult in traditional hierarchical or "stovepipe" organizations.

Two alternative approaches were developed to deal with cross-functional applications. Matrix management was one, and it will be discussed in the next section of this chapter. The second was enterprise resource planning (ERP) packages, which were created by large software vendors such as SAP and Oracle to handle cross-functional business applications.

As discussed in the next topic, the matrix-management organization style is often utilized for software groups with extensive specialization and a need for cross-functional applications that support multiple business units.
Demographics
In the software world, hierarchical organizations are found most often in small companies that employ between perhaps 5 and 50 total software personnel. These companies tend to adopt a generalist philosophy and have few specialists other than those for technical skills such as network administration and technical writing. In a generalist context, hierarchical organizations of about five to eight software engineers reporting to a manager handle development, testing, and maintenance activities concurrently.

The author estimates that there are about 10,000 such small companies in the United States. The number of software personnel working under hierarchical organization structures is perhaps 250,000 in the United States as of 2009.
Hierarchical structures are also found in some large companies, so perhaps another 500,000 people work in hierarchical structures inside large companies and government agencies.

Project size
The average size of new applications done by hierarchical teams with eight employees and a manager is about 2,000 function points. However, one of the characteristics of hierarchical organizations is that they can cooperate on large projects, so even large systems in excess of 100,000 function points can be handled by multiple departments working in concert.

The caveat with multiple departments attempting to cooperate is that when more than about a dozen are involved simultaneously, some kind of project office may be utilized for overall planning and coordination. Some of the departments involved may handle integration, testing, configuration control, quality assurance, technical writing, and other specialized topics.
Productivity rates
Productivity rates for hierarchical departments on projects of 2,000 function points are usually in the range of 12 function points per staff month. They sometimes top 20 function points per staff month for applications where the team has significant expertise, and may drop below 10 function points per staff month for unusual or complex projects. Productivity tends to be inversely proportional to application size and declines as applications grow larger.
Schedules
Development schedules for new development by a single hierarchical group with eight team members working on a 2,000-function point project usually range between about 14 months and 24 months and would average perhaps 18 calendar months for the entire application.

Quality
The quality levels for hierarchical departments are fairly average. Defect potentials run to about 5.0 bugs per function point, and defect removal efficiency is about 85 percent. Delivered defects would average about 0.75 per function point.
Therefore, an application of 2,000 function points developed by a single hierarchical department would have a total of about 10,000 bugs, of which 1,500 would still be present at release. Of these, about 225 might be serious bugs.

However, if pretest inspections are used, and if tools such as automated static analysis and automated testing are used, then defect removal efficiency can approach 97 percent. In this situation, only about 300 bugs might be present at release. Of these, perhaps 40 might be serious.
Specialization
There are few studies to date on the role of specialization in hierarchical software organization structures. Because of common gaps in the training and experience of generalists, some kinds of specialization are needed for large applications. The kinds of specialists that might be useful would be security specialists, test specialists, quality assurance specialists, database specialists, user-interface specialists, network specialists, performance specialists, and technical writers.

Cautions and counter indications
The main caution about hierarchical organization structures is that software work tends to be artificially divided to match the abilities of eight-person departments, rather than segmented based on the architecture and design of the applications themselves. As a result, some large functions in large systems are arbitrarily divided between two or more departments when they should be handled by a single group.

While communication within a given department is easy and spontaneous, communication between departments tends to slow down due to managers guarding their own territories. Thus, for large projects with multiple hierarchical departments, there are high probabilities of power struggles and disruptive social conflicts, primarily among the management community.
Conclusions
The literature on hierarchical organizations is interesting but incomplete. Much of the literature is produced by enthusiasts for alternate forms of organization structures such as matrix management, Agile teams, pair programming, clean-room development, and the like.

Hierarchical organizations have been in continuous use for software applications since the industry began. While that fact might seem to indicate success, it is also true that the software industry has been characterized by having higher rates of project failures, cost overruns, and schedule overruns than any other industry. The actual impact of hierarchical organizations on software success or software failure is still somewhat ambiguous as of 2009.

Other factors such as methods, employee skills, and management skills tend to be intertwined with organization structures, and this makes it hard to identify the effect of the organization itself.
Conventional Departments with Matrix Organization Structures

The history of matrix management is younger than the history of software development itself. The early literature on matrix management seemed to start around the late 1960s, when it was used within NASA for dealing with cross-functional projects associated with complex space programs.

The idea of matrix management soon moved from NASA into the civilian sector and was eventually picked up by software organizations for dealing with specialization and cross-functional applications.
In a conventional hierarchical organization, software personnel of various kinds report to managers within a given business unit. The technical employees may be generalists, or the departments may include various specialists too, such as software engineers, testers, and technical writers. If a particular business unit has ten software departments, each of these departments might have a number of software engineers, testers, technical writers, and so forth.

By contrast, in a matrix organization, various occupation groups and specialists report to a skill or career manager. Thus all technical writers might report to a technical publications group; all software engineers might be in a software engineering group; all testers might be in a test services group; and so forth.

By consolidating various kinds of knowledge workers within skill-based organizations, greater job enrichment and more career opportunities tend to occur than when specialists are isolated and fragmented among multiple hierarchical departments.
Under a matrix organization, when specialists are needed for various projects, they are assigned to projects and report temporarily to the project managers for the duration of the projects. This of course introduces the tricky concept of employees working for two managers at the same time.

One of the managers (usually the skill manager) has appraisal and salary authority over specialist employees, while the other (usually the project manager) uses their services for completing the project. The project managers may provide inputs to the skill managers about job performance.

The manager with appraisal and salary authority over employees is said to have solid line reporting authority. The manager who merely borrows the specialists for specific tasks or a specific project is said to have dotted line authority. These two terms reflect the way organization charts are drawn.
It is an interesting phenomenon that matrix management is new enough so that early versions of SAP, Oracle, and some other enterprise resource planning (ERP) applications did not support dotted-line or matrix organization structures. As of 2009, all ERP packages now support matrix organization diagrams.

The literature on matrix management circa 2009 is very strongly polarized between enthusiasts and opponents. About half of the books and articles regard matrix management as a major business achievement. The other half of the books and articles regard matrix management as confusing, disruptive, and a significant business liability.
A Google search of the phrase "failures of matrix management" returned 315,000 citations, while a search of the phrase "successes of matrix management" returned 327,000 citations. As can be seen, this is a strong polarization of opinion that is almost evenly divided.

Over the years, three forms of matrix organization have surfaced, called weak matrix, strong matrix, and balanced matrix.
The original form of matrix organization has now been classified as a weak matrix. In this form of organization, the employees report primarily to a skill manager and are borrowed by project managers on an as-needed basis. The project managers have no appraisal authority or salary authority over the employees and therefore depend upon voluntary cooperation to get work accomplished. If there are conflicts between the project managers and the skill managers in terms of resource allocations, the project managers lack the authority to acquire the skills their projects may need.

Because weak matrix organizations proved to be troublesome, the strong matrix variation soon appeared. In a strong matrix, the specialists may still report to a skill manager, but once assigned to a project, the needs of the project take precedence. In fact, the specialists may even be formally assigned to the project manager for the duration of the project and receive appraisals and salary reviews.

In a balanced matrix, responsibility and authority are nominally equally shared between the skill manager and the project manager. While this sounds like a good idea, it has proven to be difficult to accomplish. As a result, the strong matrix form seems to be dominant circa 2009.
Demographics
In the software world, matrix organizations are found most often in large companies that employ between perhaps 1,000 and 50,000 total software personnel. These large companies tend to have scores of specialized skills and hundreds of projects going on at the same time.

The author estimates that there are about 250 such large companies in the United States with primarily matrix organization. The number of software personnel working under matrix organization structures is perhaps 1 million in the United States as of 2009.
Project size
The average size of new applications done by matrix teams with eight employees and a manager is about 2,000 function points. However, matrix organizations can be scaled up to any arbitrary size, so even large systems in excess of 100,000 function points can be handled by multiple matrix departments working in concert.

The caveat with multiple departments attempting to cooperate is that when more than about a dozen are involved simultaneously, some kind of a project office may be needed for overall planning and coordination.
With really large applications in excess of 25,000 function points, some of the departments may be fully staffed by specialists who handle topics such as integration, testing, configuration control, quality assurance, technical writing, and other specialized topics.

Productivity rates
Productivity rates for matrix departments on projects of 2,000 function points are usually in the range of 10 function points per staff month. They sometimes top 16 function points per staff month for applications where the team has significant expertise, and may drop below 6 function points per staff month for unusual or complex projects. Productivity tends to be inversely proportional to application size and declines as applications grow larger.
Schedules
Development schedules for new development by a single matrix group with eight team members working on a 2,000-function point project usually range between about 16 months and 28 months and would average perhaps 18 calendar months for the entire application.
Quality
The quality levels for matrix organizations often are average. Defect potentials run to about 5.0 bugs per function point, and defect removal efficiency is about 85 percent. Delivered defects would average about 0.75 per function point. Matrix and hierarchical organizations are identical in quality, unless special methods such as formal inspections, static analysis, automated testing, and other state-of-the-art approaches have been introduced.

Therefore, an application of 2,000 function points developed by a single matrix department might have a total of about 10,000 bugs, of which 1,500 would still be present at release. Of these, about 225 might be serious bugs.

However, if pretest inspections are used, and if tools such as automated static analysis and automated testing are used, then defect removal efficiency can approach 97 percent. In this situation, only about 300 bugs might be present at release. Of these, perhaps 40 might be serious.

As application sizes increase, defect potentials also increase, while defect removal efficiency levels decline.
Specialization
The main purpose of the matrix organization structure is to support specialization. That being said, there are few studies to date on the kinds of specialization in matrix software organization structures. As of 2009, topics such as the number of architects needed, the number of testers needed, and the number of quality assurance personnel needed for applications of various sizes remain ambiguous.

Typical kinds of specialization are usually needed for large applications. The kinds of specialists that might be useful would be security specialists, test specialists, quality assurance specialists, database specialists, user-interface specialists, network specialists, performance specialists, and technical writers.
Cautions and counter indications
The main caution about matrix organization structures is that of political disputes between the skill managers and the project managers.

Another caution, although hard to evaluate, is that roughly half of the studies and literature about matrix organization assert that the matrix approach is harmful rather than beneficial. The other half, however, says the opposite and claims significant value from matrix organizations. But any approach with 50 percent negative findings needs to be considered carefully and not adopted blindly.
A common caution for both matrix and hierarchical organizations is that software work tends to be artificially divided to match the abilities of eight-person departments, rather than segmented based on the architecture and design of the applications. As a result, some large functions in large systems are arbitrarily divided between two or more departments when they should be handled by a single group.

While technical communication within a given department is easy and spontaneous, communication between departments tends to slow down due to managers guarding their own territories. Thus, for large projects with multiple hierarchical or matrix departments, there are high probabilities of power struggles and disruptive social conflicts, primarily among the management community.
Conclusions
The literature on matrix organizations is so strongly polarized that it is hard to find a consensus. With half of the literature praising matrix organizations and the other half blaming them for failures and disasters, it is not easy to find solid empirical data that is convincing.

From observations made during litigation for projects that failed or never operated successfully, there seems to be little difference between hierarchical and matrix organizations. Both matrix and hierarchical organizations end up in court about the same number of times.

What does make a difference is the competence of the managers and technical staff, and the emphasis on effective quality control and change management control. Effective estimating and careful progress tracking also make a difference, but none of these factors are directly related to either the hierarchical or matrix organization styles.
Specialist Organizations in Large Companies

Because development software engineers are not the only or even the largest occupation group in big companies and government agencies, it is worthwhile to consider what kinds of organizations best serve the needs of the most common occupation groups.
In approximate numerical order by numbers of employees, the major specialist occupations would be

1. Maintenance software engineers
2. Test personnel
3. Business analysts and systems analysts
4. Customer support personnel
5. Quality assurance personnel
6. Technical writing personnel
7. Administrative personnel
8. Configuration control personnel
9. Project office staff
   ■ Estimating specialists
   ■ Planning specialists
   ■ Measurement and metrics specialists
   ■ Scope managers
   ■ Process improvement specialists
   ■ Standards specialists
Many other kinds of personnel perform technical work such as network administration, operating data centers, repair of workstations and personal computers, and other activities that center around operations rather than software. These occupations are important, but are outside the scope of this book.

Following are discussions of organization structures for selected specialist groups.
Software Maintenance Organizations

For small companies with fewer than perhaps 50 software personnel, maintenance and development are usually carried out by the same people, and there are no separate maintenance groups. For that matter, some forms of customer support may also be tasked to the software engineering community in small companies.

However, as companies grow larger, maintenance specialization tends to occur. For companies with more than about 500 software personnel, maintenance groups are the norm rather than the exception.
(Note: The International Software Benchmarking Standards Group (ISBSG) has maintenance benchmark data available for more than 400 projects and is adding new data monthly. Refer to www.ISBSG.org for additional information.)

The issue of separating maintenance from development has both detractors and adherents.
The detractors of separate maintenance groups state that separating maintenance from development may require extra staff to become familiar with the same applications, which might artificially increase overall staffing. They also assert that if enhancements and defect repairs are taking place at the same time for the same applications and are done by two different people, the two tasks might interfere with each other.

The adherents of separate maintenance groups assert that because bugs occur randomly and in fairly large numbers, they interfere with development schedules. If the same person is responsible for adding a new feature to an application and for fixing bugs, and suddenly a high-severity bug is reported, fixing the bug will take precedence over doing development. As a result, development schedules will slip, and probably slip so badly that the ROI of the application may turn negative.
Although both sets of arguments have some validity, the author's observations support the view that separate maintenance organizations are the most useful for larger companies that have significant volumes of software to maintain.

Separate maintenance teams have higher productivity rates in finding and fixing problems than do developers. Also, having separate maintenance change teams makes development more predictable and raises development productivity.

Some maintenance groups also handle small enhancements as well as defect repairs. There is no exact definition of a "small enhancement," but a working definition is an update that can be done by one person in less than one week. That would limit the size of small enhancements to about 5 or fewer function points.
Although defect repairs and enhancements are the two most common forms of maintenance, there are actually 23 different kinds of maintenance work performed by large organizations, as shown in Table 5-2.

Although the 23 maintenance topics are different in many respects, they all have one common feature that makes a group discussion possible: they all involve modifying an existing application rather than starting from scratch with a new application.

Each of the 23 forms of modifying existing applications has a different reason for being carried out. However, it often happens that several of them take place concurrently. For example, enhancements and defect repairs are very common in the same release of an evolving application.
TABLE 5-2  Twenty-Three Kinds of Maintenance Work

1. Major enhancements (new features of greater than 20 function points)
2. Minor enhancements (new features of less than 5 function points)
3. Maintenance (repairing defects for good will)
4. Warranty repairs (repairing defects under formal contract)
5. Customer support (responding to client phone calls or problem reports)
6. Error-prone module removal (eliminating very troublesome code segments)
7. Mandatory changes (required or statutory changes)
8. Complexity or structural analysis (charting control flow plus complexity metrics)
9. Code restructuring (reducing cyclomatic and essential complexity)
10. Optimization (increasing performance or throughput)
11. Migration (moving software from one platform to another)
12. Conversion (changing the interface or file structure)
13. Reverse engineering (extracting latent design information from code)
14. Reengineering (transforming legacy applications to modern forms)
15. Dead code removal (removing segments no longer utilized)
16. Dormant application elimination (archiving unused software)
17. Nationalization (modifying software for international use)
18. Mass updates such as Euro or Year 2000 repairs
19. Refactoring, or reprogramming applications to improve clarity
20. Retirement (withdrawing an application from active service)
21. Field service (sending maintenance members to client locations)
22. Reporting bugs or defects to software vendors
23. Installing updates received from software vendors

The maintenance literature has a number of classifications for maintenance tasks such as "adaptive," "corrective," or "perfective." These seem to be classifications that derive from academia. While there is nothing wrong with them, they manage to miss the essential point. Maintenance overall has only two really important economic distinctions:

1. Changes that are charged to and paid for by customers (enhancements)
2. Changes that are absorbed by the company that built the software (bug repairs)
Whether a company uses standard academic distinctions of maintenance activities or the more detailed set of 23 shown here, it is important to separate costs into the two buckets of customer-funded or self-funded expenses.

Some companies such as Symantec charge customers for service calls, even for reporting bugs. The author regards such charges as being unprofessional and a cynical attempt to make money out of incompetent quality control.
There are also common sequences or patterns to these modification activities. For example, reverse engineering often precedes reengineering, and the two occur so often together as to almost constitute a linked set. For releases of large applications and major systems, the author has observed from six to ten forms of maintenance all leading up to the same release.

In recent years, the Information Technology Infrastructure Library (ITIL) has had a significant impact on maintenance, customer support, and service management in general. The ITIL is a rather large collection of more than 30 books and manuals that deal with service management, incident reporting, change teams, reliability criteria, service agreements, and a host of other topics. As this book is being written in 2009, the third release of the ITIL is under way.
It is an interesting phenomenon of the software world that while ITIL has become a major driving force in service agreements within companies for IT service, it is almost never used by commercial vendors such as Microsoft and Symantec for agreements with their customers. In fact, it is quite instructive to read the small print in the end-user license agreements (EULAs) that are always required prior to using the software.

When these agreements are read, it is disturbing to see clauses that assert that the vendors have no liabilities whatsoever, and that the software is not guaranteed to operate or to have any kind of quality levels.

The reason for these one-sided EULA agreements is that software quality control is so bad that even major vendors would go bankrupt if sued for the damages that their products can cause.
For many IT organizations and also for commercial software groups, a number of functions are joined together under a larger umbrella: customer support, maintenance (defect repairs), small enhancements (less than 5 function points), and sometimes integration and configuration control.

In addition, several forms of maintenance work deal with software not developed by the company itself:

1. Maintenance of commercial applications such as those acquired from SAP, Oracle, Microsoft, and the like. The maintenance tasks here involve reporting bugs, installing new releases, and possibly making custom changes for local conditions.
2. Maintenance of open-source and freeware applications such as Firefox, Linux, Google, and the like. Here, too, the maintenance tasks involve reporting bugs and installing new releases, plus customization as needed.
3. Maintenance of software added to corporate portfolios via mergers or acquisitions with other companies. This is a very tricky situation that is fraught with problems and hazards. The tasks here can be quite complex and may involve renovation, major updates, and possibly migration from one database to another.
In addition to normal maintenance, which combines defect repairs and enhancements, legacy applications may undergo thorough and extensive modernization, called renovation.

Software renovation can include surgical removal of error-prone modules, automatic or manual restructuring to reduce complexity, revision or replacement of comments, removal of dead code segments, and possibly even automatic conversion of the legacy application from old or obsolete programming languages into newer programming languages.

Renovation may also include data mining to extract business rules and algorithms embedded in the code but missing from specifications and written descriptions of the code. Static analysis and automatic testing tools may also be included in renovation. Also, it is now possible to generate function point totals for legacy applications automatically, and this may also occur as part of renovation activities.
The observed effect of software renovation is to stretch out the useful life of legacy applications by an additional ten years. Renovation reduces the number of latent defects in legacy code, and therefore reduces future maintenance costs by about 50 percent per calendar year for the applications renovated. Customer support costs are also reduced.

As the recession deepens and lengthens, software renovation will become more and more valuable as a cost-effective alternative to retiring legacy applications and redeveloping them. The savings accrued from renovation could reduce maintenance costs so significantly that redevelopment could occur using the savings that accrue from renovation.
If a company does plan to renovate legacy applications, it is appropriate to fix some of the chronic problems that no doubt are present in the original legacy code. The most obvious of these would be to remove security vulnerabilities, which tend to be numerous in legacy applications.

The second would be to improve quality by using inspections, static analysis, automated testing, and other modern techniques such as TSP during renovations.

A combination of the Team Software Process (TSP), the Caja security architecture from Google, and perhaps the E programming language, which is more secure than most languages, might be considered for renovating applications that deal with financial or valuable proprietary data.
For predicting the staffing and effort associated with software maintenance, some useful rules of thumb have been developed based on observations of maintenance groups in companies such as IBM, EDS, Software Productivity Research, and a number of others.

Maintenance assignment scope = the amount of software that one maintenance programmer can successfully maintain in a single calendar year. The U.S. average as of 2009 is about 1,000 function points. The range is between a low of about 350 function points and a high of about 5,500 function points. Factors that affect maintenance assignment scope include the experience of the maintenance team, the complexity of the code, the number of latent bugs in the code, the presence or absence of "error-prone modules" in the code, and the available tool suites such as static analysis tools, data mining tools, and maintenance workbenches. This is an important metric for predicting the overall number of maintenance programmers needed.
(For large applications, knowledge of the internal structure is vital for effective maintenance and modification. Therefore, major systems usually have their own change teams. The number of maintenance programmers in such a change team can be calculated by dividing the size of the application in function points by the appropriate maintenance assignment scope, as shown in the previous paragraph.)
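The change-team rule of thumb is simple enough to show directly. Here is a minimal Python sketch of that division, assuming the cited U.S. average assignment scope of 1,000 function points; the function name and sample sizes are illustrative, not from the text.

    import math

    # Minimal sketch of the change-team rule of thumb described above.
    # Assumes team size = application size (FP) / maintenance assignment
    # scope, rounded up; 1,000 FP is the cited U.S. average scope.
    def change_team_size(app_size_fp, assignment_scope_fp=1000):
        return math.ceil(app_size_fp / assignment_scope_fp)

    print(change_team_size(10000))        # average conditions: 10 programmers
    print(change_team_size(10000, 350))   # complex, error-prone code: 29
    print(change_team_size(10000, 5500))  # well-structured, good tooling: 2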
Defect repair rates = the average number of bugs or defects that a maintenance programmer can fix in a calendar month of 22 working days. The U.S. average is about 10 bugs repaired per calendar month. The range is from fewer than 5 to about 17 bugs per staff month. Factors that affect this rate include the experience of the maintenance programmer, the complexity of the code, and "bad-fix injections," or new bugs accidentally injected into the code created to repair a previous bug. The U.S. average for bad-fix injections is about 7 percent.
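These two averages can be combined into a rough throughput model. The sketch below is an illustrative assumption, not a published formula: each month a programmer closes about 10 defects, but roughly 7 percent of those repairs inject a new defect, so net progress runs slightly below the raw repair rate.

    # Minimal sketch combining the repair rate and bad-fix injection averages
    # cited above; an illustrative model, not a published formula. Net monthly
    # progress = repairs completed minus new bugs injected by those repairs.
    def months_to_drain(backlog, repairs_per_month=10, bad_fix_rate=0.07):
        net_per_month = repairs_per_month * (1.0 - bad_fix_rate)
        return backlog / net_per_month

    # One maintenance programmer facing 240 latent defects:
    print(f"{months_to_drain(240):.1f} months")  # about 25.8 months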
Renovation productivity = the average number of function points per staff month for renovating software applications using a full suite of renovation support tools. The U.S. average is about 65 function points per staff month. The range is from a low of about 25 function points per staff month for highly complex applications in obscure languages to more than 125 function points per staff month for applications of moderate complexity in fairly modern languages. Other factors that affect this rate include the overall size of the applications, the presence or absence of "error-prone modules" in the application, and the experience of the renovation team.

(Manual renovation without automated support is much more difficult, and hence productivity rates are much lower, in the vicinity of 14 function points per staff month. This is somewhat higher than new development, but still close to being marginal in terms of return on investment.)
Software does not age gracefully. Once software is put into production, it continues to change in three important ways:

1. Latent defects still present at release must be found and fixed after deployment.
2. Applications continue to grow and add new features at a rate of between 5 percent and 10 percent per calendar year, due either to changes in business needs, or to new laws and regulations, or both.
3. The combination of defect repairs and enhancements tends to gradually degrade the structure and increase the complexity of the application. The term for this increase in complexity over time is called entropy. The average rate at which software entropy increases is about 1 percent to 3 percent per calendar year.
A special problem with software maintenance is caused by the fact that some applications use multiple programming languages. As many as 15 different languages have been found within a single large application.

Multiple languages are troublesome for maintenance because they add to the learning chores of the maintenance teams. Also, some (or all) of these languages may be "dead" in the sense that there are no longer working compilers or interpreters. This situation chokes productivity and raises the odds of bad-fix injections.
Because software defect removal and quality control are imperfect, there will always be bugs or defects to repair in delivered software applications. The current U.S. average for defect removal efficiency is only about 85 percent of the bugs or defects introduced during development. This has been the average for more than 20 years.

The actual values are about 5 bugs per function point created during development. If 85 percent of these are found before release, about 0.75 bug per function point will be released to customers.

For a typical application of 1,000 function points or 100,000 source code statements, that implies about 750 defects present at delivery. About one fourth, or 185 defects, will be serious enough to stop the application from running or will create erroneous outputs.
Since defect potentials tend to rise with the overall size of the application, and since defect removal efficiency levels tend to decline with the overall size of the application, the overall volume of latent defects delivered with the application rises with size. This explains why super-large applications in the range of 100,000 function points, such as Microsoft Windows and many enterprise resource planning (ERP) applications, may require years to reach a point of relative stability. These large systems are delivered with thousands of latent bugs or defects.
Of course, average values are far worse than best practices. A combination of formal inspections, static analysis, and automated testing can bring cumulative defect removal efficiency levels up to 99 percent. Methods such as the Team Software Process (TSP) can lower defect potentials down below 3.0 per function point.

Unless very sophisticated development practices are followed, the first year of the release of a new software application will include a heavy concentration of defect repair work and only minor enhancements.

However, after a few years, the application will probably stabilize as most of the original defects are found and eliminated. Also after a few years, new features will increase in number.

As a result of these trends, maintenance activities will gradually change from the initial heavy concentration on defect repairs to a longer-range concentration on new features and enhancements.
Not only is software deployed with a significant volume of latent defects, but the phenomenon of bad-fix injection has been observed for more than 50 years. Roughly 7 percent of all defect repairs will contain a new defect that was not there before. For very complex and poorly structured applications, these bad-fix injections have topped 20 percent.

Even more alarming, once a bad fix occurs, it is very difficult to correct the situation. Although the U.S. average for initial bad-fix injection rates is about 7 percent, the secondary injection rate against previous bad fixes is about 15 percent for the initial repair and 30 percent for the second. A string of up to five consecutive bad fixes has been observed, with each attempted repair adding new problems and failing to correct the initial problem. Finally, the sixth repair attempt was successful.
In the 1970s, the IBM Corporation did a distribution analysis of customer-reported defects against their main commercial software applications. The IBM personnel involved in the study, including the author, were surprised to find that defects were not randomly distributed through all of the modules of large applications.

In the case of IBM's main operating system, about 5 percent of the modules contained just over 50 percent of all reported defects. The most extreme example was a large database application, where 31 modules out of 425 contained more than 60 percent of all customer-reported bugs. These troublesome areas were known as error-prone modules.
Similar studies by other corporations such as AT&T and ITT found that error-prone modules were endemic in the software domain. More than 90 percent of applications larger than 5,000 function points were found to contain error-prone modules in the 1980s and early 1990s. Summaries of the error-prone module data from a number of companies were published in the author's book Software Quality: Analysis and Guidelines for Success.
Fortunately, it is possible to surgically remove error-prone modules once they are identified. It is also possible to prevent them from occurring. A combination of defect measurements, formal design inspections, formal code inspections, and formal testing and test-coverage analysis have proven to be effective in preventing error-prone modules from coming into existence.

Today, in 2009, error-prone modules are almost nonexistent in organizations that are higher than level 3 on the capability maturity model (CMM) of the Software Engineering Institute. Other development methods such as the Team Software Process (TSP) and Rational Unified Process (RUP) are also effective in preventing error-prone modules. Several forms of Agile development such as extreme programming (XP) also seem to be effective in preventing error-prone modules from occurring.
Removal of error-prone modules is a normal aspect of renovating legacy applications, so those software applications that have undergone renovation will have no error-prone modules left when the work is complete.

However, error-prone modules remain common and troublesome for CMMI level 1 organizations. They are also alarmingly common in legacy applications that have not been renovated and that are maintained without careful measurement of defects.

Once deployed, most software applications continue to grow at annual rates of between 5 percent and 10 percent of their original functionality. Some applications, such as Microsoft Windows, have increased in size by several hundred percent over a ten-year period.
The combination of continuous growth of new features coupled with continuous defect repairs tends to drive up the complexity levels of aging software applications. Structural complexity can be measured via metrics such as cyclomatic and essential complexity using a number of commercial tools. If complexity is measured on an annual basis and there is no deliberate attempt to keep complexity low, the rate of increase is between 1 percent and 3 percent per calendar year.

However, and this is important, the rate at which entropy or complexity increases is directly proportional to the initial complexity of the application. For example, if an application is released with an average cyclomatic complexity level of less than 10, it will tend to stay well structured for at least five years of normal maintenance and enhancement changes.

But if an application is released with an average cyclomatic complexity level of more than 20, its structure will degrade rapidly, and its complexity levels might increase by more than 2 percent per year. The rate of entropy and complexity will even accelerate after a few years.
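These growth and entropy rates compound. The following sketch projects how a 1,000-function point application with an average cyclomatic complexity of 12 might drift over a decade; the mid-range rates and starting values are illustrative assumptions, not measured data from the text.

    # Minimal sketch of the compounding described above. Assumes mid-range
    # rates: size grows 7 percent per year, average cyclomatic complexity
    # drifts up 2 percent per year with no deliberate complexity control.
    def project_aging(size_fp, complexity, years, growth=0.07, entropy=0.02):
        for _ in range(years):
            size_fp *= 1.0 + growth
            complexity *= 1.0 + entropy
        return size_fp, complexity

    size, cc = project_aging(1000, 12.0, years=10)
    print(f"after 10 years: {size:.0f} FP, average complexity {cc:.1f}")
    # roughly 1967 FP and cyclomatic complexity 14.6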
As it happens, both bad-fix injections and error-prone modules tend to correlate strongly (although not perfectly) with high levels of complexity. A majority of error-prone modules have cyclomatic complexity levels of 10 or higher. Bad-fix injection levels for modifying high-complexity applications are often higher than 20 percent.

Here, too, renovation can reverse software entropy and bring cyclomatic complexity levels down below 10, which is the maximum safe level of code complexity.
There are several difficulties in exploring software maintenance costs with accuracy. One of these difficulties is the fact that maintenance tasks are often assigned to development personnel who interleave both development and maintenance as the need arises. This practice makes it difficult to distinguish maintenance costs from development costs, because the programmers are often rather careless in recording how time is spent.

Another and very significant problem is that a great deal of software maintenance consists of making very small changes to software applications. Quite a few bug repairs may involve fixing only a single line of code. Adding minor new features, such as perhaps a new line-item on a screen, may require fewer than 50 source code statements.
These
small changes are below
the effective lower limit
for counting
function
point metrics. The function
point metric includes
weighting
factors
for complexity, and even if
the complexity adjustments
are set to
the
lowest possible point on the
scale, it is still difficult to
count function
points
below a level of perhaps 15
function points.
An
experimental method called
micro
function points has
been devel-
oped
for small maintenance
changes and bug repairs.
This method is
similar
to standard function points,
but drops down to three
decimal
places
of precision and so can deal
with fractions of a single
function
point.
Of
course, the work of making a
small change measured with
micro
function
points may be only an hour
or less. But in large
companies,
where
as many as 20,000 such
changes are made in a year,
the cumula-
tive
costs are not trivial.
Micro function points are
intended to eliminate
the
problem that small
maintenance updates have not
been subject to
formal
economic analysis.
Quite a few maintenance tasks involve changes that are either a fraction of a function point, or may at most be fewer than 5 function points, or about 250 Java source code statements. Although normal counting of function points is not feasible for small updates, and micro function points are still experimental, it is possible to use the backfiring method of converting counts of logical source code statements into equivalent function points. For example, suppose an update requires adding 100 Java statements to an existing application. Since it usually takes about 50 Java statements to encode 1 function point, it can be stated that this small maintenance project is about 2 function points in size.
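A minimal sketch of the backfiring arithmetic follows. The 50-statements-per-function-point ratio for Java is the approximation cited above; the ratio shown for C and the default value are illustrative assumptions only, since backfiring ratios vary widely by language and by counting rules.

    // Minimal sketch of backfiring: converting logical source statements
    // into approximate function points using a per-language ratio.
    public class Backfire {
        // Approximate logical statements per function point. The Java value
        // is the ~50 cited in the text; the others are assumed for illustration.
        static double statementsPerFunctionPoint(String language) {
            switch (language) {
                case "Java": return 50.0;
                case "C":    return 128.0; // assumed illustrative ratio
                default:     return 100.0; // assumed default
            }
        }

        public static void main(String[] args) {
            int addedStatements = 100; // the small update from the example
            double fp = addedStatements / statementsPerFunctionPoint("Java");
            System.out.printf("Update size: about %.1f function points%n", fp);
            // Prints: Update size: about 2.0 function points
        }
    }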
Because of the combination of 23 separate kinds of maintenance work mixed with both large and small updates, maintenance effort is harder to estimate and harder to measure than conventional software development. As a result, there are many fewer maintenance benchmarks than development benchmarks. In fact, there is much less reliable information about maintenance than about almost any other aspect of software.

Maintenance activities are frequently outsourced to either domestic or offshore outsource companies. For a variety of business reasons, maintenance outsource contracts seem to be more stable and less likely to end up in court than software development contracts.

The success of maintenance outsource contracts stems from two major factors:

1. Small maintenance changes do not have the huge cost and schedule slippage rates associated with major development projects.

2. Small maintenance changes to existing software almost never fail completely. A significant number of development projects do fail and are never completed at all.

There may be other reasons as well, but the fact remains that maintenance outsource contracts seem more stable and less likely to end up in court than development outsource contracts.

Maintenance is the dominant work of the software industry in 2009 and will probably remain the dominant activity for the indefinite future. For software, as with many other industries, once the industry passes 50 years of age, more workers are involved with repairing existing products than with building new products.
Demographics   In the software world, separate maintenance organizations are found most often in large companies that employ between perhaps 500 and 50,000 total software personnel.

The author estimates that there are about 2,500 such large companies in the United States with separate maintenance organizations. The number of software personnel working on maintenance in maintenance organizations is perhaps 800,000 in the United States as of 2009. (The number of software personnel who perform both development and maintenance is perhaps 400,000.)
Project size   The average size of software defects is less than 1 function point, which is why micro function points are needed. Enhancements or new features typically range from a low of perhaps 5 function points to a high of perhaps 500 function points. However, there are so many enhancements that software applications typically grow at a rate of around 8 percent per calendar year for as long as they are being used.
Productivity rates   Productivity rates for defect repairs are only about 10 function points per staff month, due to the difficulty of finding the exact problem, plus the need for regression testing and constructing new releases. Another way of expressing defect repair productivity is to use defects or bugs fixed per month, and a typical value would be about 10 bugs per staff month.

The productivity rates for enhancements average about 15 function points per staff month, but vary widely due to the nature and size of the enhancement, the experience of the team, the complexity of the code, and the rate at which requirements change during the enhancement. The range for enhancements can be as low as about 5 function points per staff month, or as high as 35 function points per staff month.
Schedules   Development schedules for defect repairs range from a few hours to a few days, with one major exception. Defects that are abeyant, or cannot be replicated by the change teams, may take weeks to repair because the internal version of the application used by the change team may not have the defect. It is necessary to get a great deal more information from users in order to isolate abeyant defects.

Fixing a bug is not the same as issuing a new release. Within some companies such as IBM, maintenance schedules in the sense of defect repairs vary with the severity level of the bugs reported; that is, severity 1 bugs (most serious), about 1 week; severity 2 bugs, about two weeks; severity 3 bugs, next release; severity 4 bugs, next release or whenever it is convenient.

Development schedules for enhancements usually run from about 1 month up to 9 months. However, many companies have fixed release intervals that aggregate a number of enhancements and defect repairs and release them at the same time. Microsoft "service packs" are one example, as are the intermittent releases of Firefox. Normally, fixed release intervals are either every six months or once a year, although some may be quarterly.
Quality   The main quality concerns for maintenance or defect repairs are threefold: (1) higher defect potentials for maintenance and enhancements than for new development, (2) the presence or absence of error-prone modules in the application, and (3) the bad-fix injection rates for defect repairs, which average about 7 percent.

Maintenance and enhancement defect potentials are higher than for new development and run to about 6.0 bugs per function point. Defect removal efficiency is usually lower than for new development and is only about 83 percent. As a result, delivered defects would average about 1.08 per function point.
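The relationship among these three numbers is the standard one: delivered defects per function point equal the defect potential times the share of defects not removed. A minimal sketch using the values just quoted (note that 6.0 bugs per function point at exactly 83 percent removal works out to roughly 1.0, in the same range as the 1.08 cited):

    // Minimal sketch: delivered defects = defect potential x (1 - removal efficiency).
    public class DeliveredDefects {
        public static void main(String[] args) {
            double defectPotential = 6.0;    // bugs per function point (maintenance work)
            double removalEfficiency = 0.83; // typical maintenance removal efficiency
            double delivered = defectPotential * (1.0 - removalEfficiency);
            // Roughly 1.0 delivered defect per function point with these inputs.
            System.out.printf("Delivered defects per function point: %.2f%n", delivered);
        }
    }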
An additional quality concern that grows slowly worse over a period of years is that application complexity (as measured by cyclomatic complexity) slowly increases because changes tend to degrade the original structure. As a result, each year, defect potentials may be slightly higher than the year before, while bad-fix injections may increase. Unless the application is renovated, these problems tend to become so bad that eventually the application can no longer be safely modified.

In addition to renovation, other approaches such as formal inspections for major enhancements and significant defect repairs, static analysis, and automatic testing can raise defect removal efficiency levels above 95 percent. However, bad-fix injections and error-prone modules are still troublesome.
Specialization   The main purpose of maintenance organization structures is to support maintenance specialization. While not everyone enjoys maintenance, it happens that quite a few programmers and software engineers do enjoy it.

Other specialist work in a maintenance organization includes integration and configuration control. Maintenance software engineers normally do most of the testing on small updates and small enhancements, although formal test organizations may do some specialized testing such as system testing prior to a major release.

Curiously, software quality assurance (SQA) is seldom involved with defect repairs and minor enhancements carried out by maintenance groups. However, SQA specialists usually do work on major enhancements.

Technical writers don't have a major role in software maintenance, but may occasionally be involved if enhancements trigger changes in user manuals or HELP text.

That being said, few studies to date deal with either personality or technical differences between successful maintenance programmers and successful development programmers.
Cautions and counter indications   The main caution about maintenance specialization and maintenance organizations is that they tend to lock personnel into narrow careers, sometimes limited to repairing a single application for a period of years. There is little chance of career growth or knowledge expansion if a software engineer spends years fixing bugs in a single software application. Switching back and forth between maintenance and development occasionally is a good practice for minimizing occupational boredom.
Conclusions   The literature on maintenance organizations is very sparse compared with the literature on development. Although there are some good books, there are few long-range studies that show application growth, entropy increase, and defect trends over multiple years.

Given that software maintenance is the dominant activity of the software industry in 2009, a great deal more research and study are indicated. Research is needed on data mining of legacy applications to extract business rules; on removing security vulnerabilities from legacy code; on the costs and value of software renovation; and on the application of quality control methods such as inspections, static analysis, and automated testing to legacy code.
Customer Support Organizations

In small companies with few software applications and few customers or application users, support may be carried out on an informal basis by the development team itself. However, as numbers of customers increase and numbers of applications needing support increase, a point will soon be reached where a formal customer support organization will be needed.

Informal rules of thumb for customer support indicate that customer support staffing depends on three variables:

1. Number of customers

2. Number of latent bugs or defects in released software

3. Application size measured in terms of function points or lines of code

One full-time customer support person would probably be needed for applications that meet these criteria: 150 customers, 500 latent bugs in the software (75 serious bugs), and 10,000 function points or 500,000 source code statements in a language such as Java.
The most effective known method for improving customer support is to achieve much better application quality levels than are typical today in 2009. Every reduction of about 220 latent defects at delivery can reduce customer support staffing needs by one person. This is based on the assumption that customer support personnel speak to about 30 customers per day, and each released defect is encountered by 30 customers. Therefore, each released defect occupies one day for one customer support staff member, and there are 220 working days per year.
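These rules of thumb reduce to simple arithmetic. The sketch below is an illustration of that arithmetic, assuming the figures given here: 30 calls per support person per day, each released defect encountered by about 30 customers, and 220 working days per year.

    // Minimal sketch of the customer-support staffing rule of thumb.
    public class SupportStaffing {
        public static void main(String[] args) {
            int latentDefects = 500;      // defects still present at delivery
            int customersPerDefect = 30;  // each defect reaches ~30 customers
            int callsPerPersonPerDay = 30;
            int workingDaysPerYear = 220;

            // Each released defect generates about one day of support work,
            // so ~220 latent defects keep one support person busy for a year.
            double callsPerYear = (double) latentDefects * customersPerDefect;
            double staff = callsPerYear / (callsPerPersonPerDay * workingDaysPerYear);
            System.out.printf("Approximate support staff needed: %.1f%n", staff);
            // 500 defects x 30 customers = 15,000 calls; at 6,600 calls per
            // person-year this is roughly 2.3 support staff.
        }
    }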
Some companies attempt to reduce customer support costs by charging for support calls, even to report bugs in the applications! This is an extremely bad business practice that primarily offends customers without benefiting the companies. Every customer faced with a charge for customer support is an unhappy customer who is actively in search of a more sensible competitive product.
Also, since software is routinely delivered with hundreds of serious bugs, and since customer reports of those bugs are valuable to software vendors, charging for customer support essentially cuts off a valuable resource that can be used to lower maintenance costs. Few companies that charge for support have many happy customers, and many are losing market share.
Unfortunately, customer support organizations are among the most difficult of any kind of software organization to staff and organize well. There are several reasons for this. The first is that unless a company charges for customer support (not a recommended practice), the costs can be high. The second is that customer-support work tends to have limited career opportunities, and this makes it difficult to attract and keep personnel.

As a result, customer support was one of the first business activities to be outsourced to low-cost offshore providers. Because customer support is labor intensive, it was also among the first business activities where companies attempted to automate at least some responses. To minimize the time required for discussions with live support personnel, there are a variety of frequently asked questions (FAQ) and other topics that users can access by phone or e-mail prior to speaking with a real person.

Unfortunately, these automated techniques are often frustrating to users because they require minutes of time dealing with sometimes arcane voice messages before reaching a real person. Even worse, these automated voice messages are almost useless for the hard of hearing.
That being said, companies in the customer support business have made some interesting technical innovations with voice response systems and have also developed some fairly sophisticated help-desk packages that keep track of callers or e-mails, identify bugs or defects that have been previously reported, and assist with other administrative functions.

Because calls and e-mail from customers contain a lot of potentially valuable information about deep bugs and security flaws, prudent companies want to capture this information for analysis and to use it as part of their quality and security improvement programs.
At a sociological level, an organization called the Service and Support Professionals Association (SSPA) not only provides useful information for support personnel, but also evaluates the customer support of various companies and issues awards and citations for excellence. The SSPA group also has conferences and events dealing with customer support. (The SSPA web site is www.thesspa.com.)

SSPA has an arrangement with the well-known J.D. Power and Associates to evaluate customer service in order to motivate companies by issuing various awards. As an example, the SSPA web site mentions the following recent awards as of 2009:

■  ProQuest Business Solutions--Most improved

■  IBM Rochester--Sustained excellence for three consecutive years

■  Oracle Corporation--Innovative support

■  Dell--Mission critical support

■  RSA Security--Best support for complex systems
For in-house support, as opposed to commercial companies that sell software, the massive compendium of information contained in the Information Technology Infrastructure Library (ITIL) spells out topics such as help-desk response time targets, service agreements, incident management, and hundreds of other items of information.

Software customer support is organized in a multitier arrangement that uses automated responses and FAQs as the initial level, and then brings in more expertise at other levels. An example of such a multitier arrangement might resemble the following:

■  Level 0--Automated voice messages, FAQ, and pointers to available downloads

■  Level 1--Personnel who know the basics of the application and common bugs

■  Level 2--Experts in selected topics

■  Level 3--Development personnel or top-gun experts

The idea behind the multilevel approach is to minimize the time requirements of developers and experts, while providing as much useful information as possible in what is hopefully an efficient manner.
As mentioned in a number of places in this book, the majority of customer service calls and e-mails are due to poor quality and excessive numbers of bugs. Therefore, more sophisticated development approaches such as the Team Software Process (TSP), formal inspections, static analysis, automated testing, and the like will not only reduce development costs and schedules, but will also reduce maintenance and customer support costs.

It is interesting to consider how one of the J.D. Power award recipients, IBM Rochester, goes about customer support:
"There
is a strong focus on support
responsiveness, in terms of both
time
to
response as well as the
ability to provide solutions.
When customers
call
in, there is a target that
within a certain amount of
time (a minute or
a
couple of minutes), the call
must be answered. IBM does
not want long
hold
times where customers spend
>10 minutes just waiting
for the phone
to
be answered.
When problems/defects are reported, the formal fix may take some time. Before the formal fix is available, the team will provide a temporary solution as soon as possible, and a key metric used is "time to first relief." The first-relief temporary repairs may take less than 24 hours for some new problems, and even less if the problem is already known.

When formal fixes are provided, a key metric used by IBM Rochester is the quality of the fixes: percent of defective fixes. Rochester's defective fix rate is the lowest among the major platforms in IBM. (Since the industry average for bad-fix injection is about 7%, it is commendable that IBM addresses this issue.)

The IBM Rochester support center also conducts a "trailer survey." This is a survey of customer satisfaction with the service or fix. These surveys are based on samples of problem records that are closed. IBM Rochester's trailer survey satisfaction is in the high 90s in terms of percentages of satisfied customers.

Another IBM Rochester factor could be called the "cultural factor." IBM as a corporation and Rochester as a lab both have a long tradition of focus on quality (e.g., winning the Malcolm Baldrige quality award). Because customer satisfaction correlates directly with quality, the IBM Rochester products have long had a reputation for excellence (IBM System/34, System/36, System/38, AS/400, System i, etc.). IBM and Rochester employees are proud of the quality that they deliver for both products and services."
For major customer problems, teams (support, development, test, etc.) work together to come up with solutions. Customer feedback has long been favorable for IBM Rochester, which explains their multiyear award for customer support excellence. When surveyed, customers often mention explicitly and favorably the amount of support and problem solving that they receive from the IBM Rochester site.
Demographics   In the software world, in-house customer support staffed by actual employees is in rapid decline due to the recession. Probably a few hundred large companies still provide such support, but as layoffs and downsizing continue to escalate, their numbers will be reduced.

However, for small companies that have never employed full-time customer support personnel, no doubt the software engineers will still continue to field customer calls and respond to e-mails. There are probably 10,000 or more U.S. organizations with between 1 and 50 employees where customer support tasks are performed informally by software engineers or programmers.
For commercial software organizations, outsourcing of customer support to specialized support companies is now the norm. While some of these support companies are domestic, there are also dozens of customer support organizations in other countries with lower labor costs than the United States or Europe. However, as the recession continues, labor costs will decline in the United States, which now has large pools of unemployed software technical personnel. Customer support, maintenance, and other labor-intensive tasks may well start to move back to the United States.
Project size   The average size of applications where formal customer support is close to being mandatory is about 10,000 function points. Of course, for any size application, customers will have questions and need to report bugs. But applications in the 10,000-function point range usually have many customers. In addition, these large systems are always released with thousands of latent bugs.
Productivity rates   Productivity rates for customer support are not measured using function points, but rather numbers of customers assisted. Typically, one tier-1 customer support person on a telephone support desk can talk to about 30 people per day, which translates into each call taking about 16 minutes.

For tier 2 and tier 3 customer support, where experts are used, the work of talking to customers is probably not full time. However, for problems serious enough to reach tier 2, expect each call to take about 70 minutes. For problems that reach tier 3, there will no doubt be multiple calls back and forth and probably some internal research. Expect tier 3 calls to take about 240 minutes.
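The per-call times quoted above translate directly into daily capacity, as the brief sketch below shows; the eight-hour workday spent entirely on calls is an assumption made for illustration.

    // Minimal sketch: converting per-call minutes into calls per day.
    // Assumes an 8-hour (480-minute) workday spent entirely on calls.
    public class TierCapacity {
        public static void main(String[] args) {
            int workdayMinutes = 8 * 60;
            int[] minutesPerCall = {16, 70, 240}; // tiers 1, 2, and 3 from the text
            for (int tier = 0; tier < minutesPerCall.length; tier++) {
                double callsPerDay = (double) workdayMinutes / minutesPerCall[tier];
                System.out.printf("Tier %d: about %.1f calls per day%n",
                        tier + 1, callsPerDay);
            }
            // Tier 1 works out to 30 calls per day, matching the text.
        }
    }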
If a customer is reporting a new bug that has not been identified or fixed, then days or even weeks may be required. (The author worked as an expert witness in a lawsuit where the time required to fix one bug in a financial application was more than nine calendar months. In the course of fixing this bug, the first four attempts each took about two months. They not only failed to fix the original bug, but added new bugs in each fix.)
Schedules   The primary schedule issue for customer support is the wait or hold time before speaking to a live support person. Today, in 2009, reaching a live person can take between 10 minutes and more than 60 minutes of hold time. Needless to say, this is very frustrating to clients.

Improving quality should also reduce wait times. Assuming constant support staffing, every reduction of ten severity 1 or 2 defects released in software should reduce wait times by about 30 seconds.
Quality   Customer support calls are directly proportional to the number of released defects or bugs in software. It is theoretically possible that releasing software with zero defects might reduce the number of customer support calls to zero, too. In today's world, where defect removal efficiency only averages 85 percent and hundreds or thousands of serious bugs are routinely still present when software is released, there will be hundreds of customer support calls and e-mails.

It is interesting that some open-source and freeware applications such as Linux, Firefox, and Avira seem to have better quality levels than equivalent applications released by established vendors such as Microsoft and Symantec. In part this may be due to the skills of the developers, and in part it may be due to routinely using tools such as static analysis prior to release.
Specialization   The role of tier-1 customer support is very specialized. Effective customer support requires a good personality when dealing with crabby customers plus fairly sophisticated technical skills. Of these two, the criterion for technical skill is easier to fill than the criterion for a good personality when dealing with angry or outraged customers. That being said, there are few studies to date that deal with either personality or technical skills in support organizations.

In addition to customer support provided by vendors of software, some user associations and nonprofit groups provide customer support on a volunteer basis. Many freeware and open-source applications have user groups that can answer technical questions. Even for commercial software, it is sometimes easier to get an informed response to a question from an expert user than it is from the company that built the software.
Cautions and counter indications   The main caution about customer support work is that it tends to lock personnel into narrow careers, sometimes limited to discussing a single application such as Oracle or SAP for a period of years. There is little chance of career growth or knowledge expansion.

Another caution is that improving customer support via automation and expert systems is technically feasible, but many existing patents cover such topics. As a result, attempts to develop improved customer support automation may require licensing of intellectual property.
Conclusions   The literature on customer support is dominated by two very different forms of information. The Information Technology Infrastructure Library (ITIL) contains more than 30 volumes and more than 5,000 pages of information on every aspect of customer support. However, the ITIL library is aimed primarily at in-house customer support and is not used very much by commercial software vendors.

For commercial software customer support, some trade books are available, but the literature tends to be dominated by white papers and monographs published by customer support outsource companies. Although these tend to be marketing texts, some of them do provide useful information about the mechanics of customer support. There are also interesting reports available from companies that provide customer-support automation, which is both plentiful and seems to cover a wide range of features.

Given the fact that customer support is a critical activity of the software industry in 2009, a great deal more research and study are indicated. Research is needed on the relationship between quality and customer support, on the role of user associations and volunteer groups, and on the potential automation that might improve customer support. In particular, research is needed on providing customer support for deaf and hard-of-hearing customers, blind customers, and those with other physical challenges.
Software Test Organizations

There are ten problems with discussing software test organizations that need to be highlighted:

1. There are more than 15 different kinds of software testing.

2. Many kinds of testing can be performed either by developers, by in-house test organizations, by outsource test organizations, or by quality assurance teams, based on company test strategies.

3. With Agile teams and with hierarchical organizations, testers will probably be embedded with developers and not have separate departments.

4. In matrix organizations, testers would probably be in a separate testing organization reporting to a skill manager, but assigned to specific projects as needed.

5. Some test organizations are part of quality assurance organizations and therefore have several kinds of specialists besides testers.

6. Some quality assurance organizations collect data on test results, but do no testing of their own.

7. Some testing organizations are called "quality assurance" and perform only testing. These may not perform other QA activities such as moderating inspections, measuring quality, predicting quality, teaching quality, and so on.

8. For any given software application, the number of separate testing steps ranges from a low of 1 form of testing to a high of 17 forms of testing, based on company test strategies.

9. For any given software application, the number of test and/or quality assurance organizations that are part of its test strategy can range from a low of one to a high of five, based on company quality strategies.

10. For any given defect removal activity, including testing, as many as 11 different kinds of specialists may take part.
As can perhaps be surmised from the ten points just highlighted, there is no standard way of testing software applications in 2009. Not only is there no standard way of testing, but there are no standard measures of test coverage or defect removal efficiency, although both are technically straightforward measurements.

The most widely used form of test measurement is that of test coverage, which shows the amount of code actually executed by test cases. Test coverage measures are fully automated and therefore easy to do. Coverage is a useful metric, but it would be much more useful to measure defect removal efficiency as well.
Defect removal efficiency is more complicated and not fully automated. To measure the defect removal efficiency of a specific test stage such as unit test, all defects found by that test are recorded. After unit test is finished, all other defects found by all other tests are recorded, as are defects found by customers in the first 90 days. When all defects have been totaled, then removal efficiency can be calculated.

Assume unit test found 100 defects, function test and later test stages found 200 defects, and customers reported 100 defects in the first 90 days of use. The total number of defects found was 400. Since unit test found 100 out of 400 defects, in this example its efficiency is 25 percent, which is actually not far from the 30 percent average value of defect removal efficiency for unit test.
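In code, the calculation is a one-liner once the defect counts have been gathered. A minimal sketch, using the counts from the example above (the class and method names are illustrative, not a standard API):

    // Minimal sketch: defect removal efficiency (DRE) for one test stage.
    // DRE = defects found by the stage / total defects found by the stage,
    // all later stages, and customers in the first 90 days.
    public class RemovalEfficiency {
        static double dre(int foundByStage, int foundLater, int foundByCustomers) {
            int total = foundByStage + foundLater + foundByCustomers;
            return (double) foundByStage / total;
        }

        public static void main(String[] args) {
            // The worked example from the text: 100 + 200 + 100 = 400 defects.
            double unitTestDre = dre(100, 200, 100);
            System.out.printf("Unit test removal efficiency: %.0f%%%n",
                    unitTestDre * 100); // prints 25%
        }
    }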
(A quicker but less reliable method for determining defect removal efficiency is that of defect seeding. For example, if 100 known bugs were seeded into the software discussed in the previous paragraph and 25 were found, then the defect removal efficiency level of 25 percent could be calculated immediately. However, there is no guarantee that the "tame" bugs that were seeded would be found at exactly the same rate as "wild" bugs that are made by accident.)

It is an unfortunate fact that most forms of testing are not very efficient and find only about 25 percent to 40 percent of the bugs that are actually present, although the range is from less than 20 percent to more than 70 percent.

It is interesting that there is much debate over black box testing, which lacks information on internals; white box testing, with full visibility of internal code; and gray box testing, with visibility of internals, but testing at the external level.
So far as can be determined, the debate is theoretical, and few experiments have been performed to measure the defect removal efficiency levels of black, white, or gray box testing. When measures of efficiency are taken, white box testing seems to have higher levels of defect removal efficiency than black box testing.

Because many individual test stages such as unit test are so low in efficiency, it can be seen why several different kinds of testing are needed. The term cumulative defect removal efficiency refers to the overall efficiency of an entire sequence of tests or defect removal operations.
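Cumulative efficiency across a series of stages is not the sum of the individual efficiencies: each stage removes a fraction of the defects that remain after the previous one. The sketch below makes this model explicit; the per-stage percentages are illustrative assumptions drawn from the ranges discussed in this section, and the independence of the stages is itself a simplifying assumption.

    // Minimal sketch: cumulative defect removal efficiency of a sequence.
    // Assumes each stage independently removes a fixed fraction of the
    // defects still present when it runs.
    public class CumulativeDre {
        static double cumulative(double[] stageEfficiencies) {
            double remaining = 1.0; // fraction of original defects still present
            for (double e : stageEfficiencies) {
                remaining *= (1.0 - e);
            }
            return 1.0 - remaining;
        }

        public static void main(String[] args) {
            // Illustrative per-stage efficiencies in the 25-40 percent range.
            double[] testingOnly = {0.30, 0.30, 0.25, 0.30};
            System.out.printf("Four test stages alone: %.0f%%%n",
                    cumulative(testingOnly) * 100); // about 74%

            // Adding inspections (~85%) and static analysis (~87%) up front.
            double[] withPretest = {0.85, 0.87, 0.30, 0.30, 0.25, 0.30};
            System.out.printf("With inspections and static analysis: %.0f%%%n",
                    cumulative(withPretest) * 100); // about 99%
        }
    }

Note how four mediocre test stages alone land in the 75 percent range mentioned below, while the same stages preceded by inspections and static analysis approach 99 percent.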
As a result of the lack of testing standards and the lack of widespread testing effectiveness measurements, testing by itself does not seem to be a particularly cost-effective approach for achieving high levels of quality. Companies that depend purely upon testing for defect removal almost never top 90 percent in cumulative defect removal, and often are below 75 percent.

The newer forms of testing such as test-driven development (TDD) use test cases as a form of specification and create the test cases first, before the code itself is created. As a result, the defect removal efficiency of TDD is higher than many forms of testing and can top 85 percent. However, even with TDD, bad-fix injection needs to be factored into the equation. About 7 percent of attempts to fix bugs accidentally include new bugs in the fixes.

If TDD is combined with other approaches such as formal inspection of the test cases and static analysis of the code, then defect removal efficiency can top 95 percent.
There is some ambiguity in the data that deals with automatic testing versus manual testing. In theory, automatic testing should have higher defect removal efficiency than manual testing in at least 70 percent of trials. For example, manual unit testing averages about 30 percent in terms of defect removal efficiency, while automatic testing may top 50 percent. However, testing skills vary widely among software engineers and programmers, and automatic testing also varies widely. More study of this topic is indicated.

The poor defect removal efficiency of normal testing brings up an important question: If testing is not very effective in finding and removing bugs, what is effective? This is an important question, and it is also a question that should be answered in a book entitled Software Engineering Best Practices.

The answer to the question of "What is effective in achieving high levels of quality?" is that a combination of defect prevention and multiple forms of defect removal is needed for optimum effectiveness.

Defect prevention refers to methods and techniques that can lower defect potentials from U.S. averages of about 5.0 per function point.
Examples of methods that have demonstrated effectiveness in terms of defect prevention include the higher levels of the capability maturity model integration (CMMI), joint application design (JAD), quality function deployment (QFD), root-cause analysis, Six Sigma for software, the Team Software Process (TSP), and also the Personal Software Process (PSP).

For small applications, the Agile method of having an embedded user as part of the team can also reduce defect potentials. (The caveat with embedded users is that for applications with more than about 50 users, one person cannot speak for the entire set of users. For applications with thousands of users, having a single embedded user is not adequate. In such cases, focus groups and surveys of many users are necessary.)

As it happens, formal inspections of requirements, design, and code serve double duty and are very effective in terms of defect prevention as well as being very effective in terms of defect removal. This is because participants in formal inspections spontaneously avoid making the same mistakes that are found during the inspections.

The combination of methods that have been demonstrated to raise defect removal efficiency levels includes formal inspections of requirements, design, code, and test materials; static analysis of code prior to testing; and then a test sequence that includes at least eight forms of testing: (1) unit test, (2) new function test, (3) regression test, (4) performance test, (5) security test, (6) usability test, (7) system test, and (8) some form of external test with customers or clients, such as beta test or acceptance test.
Such a combination of pretest inspections, static analysis, and at least eight discrete test stages will usually approach 99 percent in terms of cumulative defect removal efficiency levels. Not only does this combination raise defect removal efficiency levels, but it is also very cost-effective. Projects that top 95 percent in defect removal efficiency levels usually have shorter development schedules and lower costs than projects that skimp on quality. And, of course, they have much lower maintenance and customer support costs, too.

Testing is a teachable skill, and there are a number of for-profit and nonprofit organizations that offer seminars, classes, and several flavors of certification for test personnel. While there is some evidence that certified test personnel do end up with higher levels of defect removal efficiency than uncertified test personnel, the poor measurement and benchmark practices of the software industry make that claim somewhat anecdotal. It would be helpful if test certification included a learning segment on how to measure defect removal efficiency.

Following in Table 5-3 are examples of a number of different forms of software inspection, static analysis, and testing, with the probable organization that performs each activity indicated.
TABLE 5-3   Forms of Software Defect Removal Activities

Pretest Removal Inspections                          Performed by
 1. Requirements                                     Analysts
 2. Design                                           Designers
 3. Code                                             Programmers
 4. Test plans                                       Testers
 5. Test cases                                       Testers
 6. Static analysis                                  Programmers
General Testing
 7. Subroutine test                                  Programmers
 8. Unit test                                        Programmers
 9. New function test                                Testers or programmers
10. Regression test                                  Testers or programmers
11. System test                                      Testers or programmers
Special Testing
12. Performance testing                              Performance specialists
13. Security testing                                 Security specialists
14. Usability testing                                Human factors specialists
15. Component testing                                Testers
16. Integration testing                              Testers
17. Nationalization testing                          Foreign language experts
18. Platform testing                                 Platform specialists
19. SQA validation testing                           Software quality assurance
20. Lab testing                                      Hardware specialists
External Testing
21. Independent testing                              External test company
22. Beta testing                                     Customers
23. Acceptance testing                               Customers
Special Activities
24. Audits                                           Auditors, SQA
25. Independent verification and validation (IV&V)   IV&V contractors
26. Ethical hacking                                  Hacking consultants
Table 5-3 shows 26 different kinds of defect removal activity carried out by a total of 11 different kinds of internal specialists, 3 kinds of specialists from outside companies, and also by customers. However, only very large and sophisticated high-technology companies would have such a rich mixture of specialization and would utilize so many different kinds of defect removal.

Smaller companies would either have the testing carried out by software engineers or programmers (who often are not well trained), or they would have a testing group staffed primarily by testing specialists. Testing can also be outsourced, although as of 2009, this activity is not common.
At this point, it is useful to address three topics that are not well covered in the testing literature:

1. How many testers are needed for various kinds of testing?

2. How many test cases are needed for various kinds of testing?

3. What is the defect removal efficiency of various kinds of testing?
Table 5-4 shows the approximate staffing levels for the 17 forms of testing that were illustrated in Table 5-3. Note that this information is only approximate, and there are wide ranges for each form of testing. Because testing executes source code, the information in Table 5-4 is based on source code counts rather than on function points.
TABLE 5-4   Test Staffing for Selected Test Stages

Application language:  Java
Application code size: 50,000
Application KLOC:      50
Function points:       1,000

                              Assignment Scope    Test Staff
General Testing
 1. Subroutine test                 10,000           5.00
 2. Unit test                       10,000           5.00
 3. New function test               25,000           2.00
 4. Regression test                 25,000           2.00
 5. System test                     50,000           1.00
Special Testing
 6. Performance testing             50,000           1.00
 7. Security testing                50,000           1.00
 8. Usability testing               25,000           2.00
 9. Component testing               25,000           2.00
10. Integration testing             50,000           1.00
11. Nationalization testing        150,000           0.33
12. Platform testing                50,000           1.00
13. SQA validation testing          75,000           0.67
14. Lab testing                     50,000           1.00
External Testing
15. Independent testing              7,500           6.67
16. Beta testing                    25,000           2.00
17. Acceptance testing              25,000           2.00
With more than 700 programming languages ranging from assembly through modern languages such as Ruby and E, the same application illustrated in Table 5-4 might vary by more than 500 percent in terms of source code size. Java is the language used in Table 5-4 because it is one of the most common languages in 2009.
The column labeled "Assignment Scope" illustrates the amount of source code that one tester will probably be responsible for testing. Note that there are very wide ranges in assignment scopes based on the experience levels of test personnel, on the cyclomatic complexity of the code, and to a certain extent, on the specific language or combination of languages in the application being tested.
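The staffing column in Table 5-4 is simply application size divided by assignment scope. A minimal sketch of that division, using a few of the table's values:

    // Minimal sketch: test staff = application size / assignment scope,
    // using a few of the Table 5-4 values.
    public class TestStaffing {
        public static void main(String[] args) {
            int applicationSize = 50_000; // logical code statements (50 KLOC)
            int[] assignmentScopes = {10_000, 25_000, 50_000};
            String[] stages = {"Unit test", "New function test", "System test"};
            for (int i = 0; i < stages.length; i++) {
                double staff = (double) applicationSize / assignmentScopes[i];
                System.out.printf("%-18s %.2f testers%n", stages[i], staff);
            }
            // Unit test: 5.00, new function test: 2.00, system test: 1.00,
            // matching Table 5-4.
        }
    }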
Because the testing shown in Table 5-4 involves a number of different people with different skills who probably would be from different departments, the staffing breakdown for all 17 tests would include 5 developers through unit test; 2 test specialists for integration and system test; 3 specialists for security, nationalization, and usability test; 1 SQA specialist; 7 outside specialists from other companies; and 2 customers: 20 people in all.

Of course, it is unlikely that any small application of 1,000 function points or 50 KLOC (thousands of lines of code) would use (or need) all 17 of these forms of testing. The most probable sequence for a 50-KLOC Java application would be 6 kinds of testing performed by 5 developers, 2 test specialists, and 2 users, for a total of 9 test personnel in all.
In Table 5-5, data from the previous tables is used as the base for staffing, but the purpose of Table 5-5 is to show the approximate numbers of test cases produced for each test stage, and then the total number of test cases for the entire application. Here, too, there are major variations, so the data is only approximate.

The code defect potential for the 50-KLOC code sample of the Java application would be about 1,500 total bugs, which is equal to 1.5 code bugs per function point, or 30 bugs per KLOC. (Note that earlier bugs in requirements and design are excluded and assumed to have been removed before testing begins.)
If all 17 of the test stages were used, they would probably detect about 95 percent of the total bugs present, or 1,425 in all. That would leave 75 bugs latent when the application is delivered. Assuming both the numbers for potential defects and the numbers for test cases are reasonably accurate (a questionable assumption), it takes an average of 1.98 test cases to find 1 bug.
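A minimal sketch of this arithmetic, combining the defect potential just described with the test case total from Table 5-5:

    // Minimal sketch: latent defects and test cases per defect, using the
    // 50-KLOC Java example from the text.
    public class TestMath {
        public static void main(String[] args) {
            int defectPotential = 1500;      // code bugs (30 per KLOC x 50 KLOC)
            double removalEfficiency = 0.95; // all 17 test stages combined
            int totalTestCases = 2825;       // from Table 5-5

            int found = (int) Math.round(defectPotential * removalEfficiency);
            int latent = defectPotential - found;
            double casesPerDefect = (double) totalTestCases / found;

            System.out.printf("Defects found: %d, latent at delivery: %d%n",
                    found, latent); // 1,425 found, 75 latent
            System.out.printf("Test cases per defect found: %.2f%n",
                    casesPerDefect); // about 1.98
        }
    }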
Of course, since only about 6 out of the 17 test stages are usually performed, the removal efficiency would probably be closer to 75 percent, which is why additional nontest methods such as inspections and static analysis are needed to achieve really high levels of defect removal efficiency.
TABLE 5-5   Test Cases for Selected Test Stages

Application language:  Java
Application code size: 50,000
Application KLOC:      50
Function points:       1,000

                              Test     Test Cases   Total Test   Test Cases
                              Staff    per KLOC     Cases        per Person
General Testing
 1. Subroutine test            5.00      12.00         600         120.00
 2. Unit test                  5.00      10.00         500         100.00
 3. New function test          2.00       5.00         250         125.00
 4. Regression test            2.00       4.00         200         100.00
 5. System test                1.00       3.00         150         150.00
Special Testing
 6. Performance testing        1.00       1.00          50          50.00
 7. Security testing           1.00       3.00         150         150.00
 8. Usability testing          2.00       3.00         150          75.00
 9. Component testing          2.00       1.50          75          37.50
10. Integration testing        1.00       1.50          75          75.00
11. Nationalization testing    0.33       0.50          25          75.76
12. Platform testing           1.00       2.00         100         100.00
13. SQA validation testing     0.67       1.00          50          74.63
14. Lab testing                1.00       1.00          50          50.00
External Testing
15. Independent testing        6.67       4.00         200          29.99
16. Beta testing               2.00       2.00         100          50.00
17. Acceptance testing         2.00       2.00         100          50.00
TOTAL TEST CASES                                     2,825
TEST CASES PER KLOC                                     56.50
TEST CASES PER PERSON (20 TESTERS)                     141.25

If even this small 50-KLOC example uses more than 2,800 test cases, it is obvious that corporations with hundreds of software applications will eventually end up with millions of test cases. Once created, test cases have residual value for regression test purposes. Fortunately, a number of automated tools can be used to store and manage test case libraries.
The existence of such large test libraries is a necessary overhead of software development and maintenance. However, this topic needs additional study. Creating reusable test cases would seem to be of value. Also, there are often errors in test cases themselves, which is why inspections of test plans and test cases are useful.

With hundreds of different people creating test cases in large companies and government agencies, there is a good chance that duplicate tests will accidentally be created. In fact, this does occur, and a study at IBM noted about 30 percent redundancy or duplicates in one software lab's test library.
The final table in this section, Table 5-6, shows defect removal efficiency levels against six sources of error: requirements defects, design defects, coding defects, security defects, defects in test cases, and performance defects.

Table 5-6 is complicated by the fact that not every defect removal method is equally effective against each type of defect.
TABLE 5-6   Defect Removal Efficiency by Defect Type

                              Req.     Des.     Code     Sec.     Test     Perf.
                              defects  defects  defects  defects  defects  defects
Pretest Removal Inspections
 1. Requirements              85.00%
 2. Design                             85.00%   25.00%
 3. Code                                        85.00%   40.00%            15.00%
 4. Test plans                                                    85.00%
 5. Test cases                                                    85.00%
 6. Static analysis                    30.00%   87.00%   25.00%            20.00%
General Testing
 7. Subroutine test                             35.00%                     10.00%
 8. Unit test                                   30.00%                     10.00%
 9. New function test                  15.00%   35.00%                     10.00%
10. Regression test                             15.00%
11. System test               10.00%   20.00%   25.00%    7.00%           25.00%
Special Testing
12. Performance testing                 5.00%   10.00%                     70.00%
13. Security testing                                     65.00%
14. Usability testing         10.00%   10.00%
15. Component testing                  10.00%   25.00%
16. Integration testing                10.00%   30.00%
17. Nationalization testing                      3.00%
18. Platform testing                            10.00%
19. SQA validation testing     5.00%    5.00%   15.00%
20. Lab testing               10.00%   10.00%   10.00%                     20.00%
External Testing
21. Independent testing                 5.00%   30.00%    5.00%    5.00%   10.00%
22. Beta testing              30.00%   25.00%   10.00%                     15.00%
23. Acceptance testing        30.00%   20.00%    5.00%                     15.00%
Special Activities
24. Audits                    15.00%   10.00%
25. Independent verification
    and validation (IV&V)     10.00%   10.00%   10.00%
26. Ethical hacking                                      85.00%
In fact, many forms of defect removal have 0 percent efficiency against security flaws. Coding defects are the easiest type of defect to remove; requirements defects, security defects, and defects in test materials are the most difficult to eliminate.
Historically, formal inspections have the highest levels of defect removal efficiency against the broadest range of defects. The more recent method of static analysis has a commendably high level of defect removal efficiency against coding defects, but currently operates only on about 15 programming languages out of more than 700.

The data in Table 5-6 has a high margin of error, but the table itself shows the kind of data that needs to be collected in much greater volume to improve software quality and raise overall levels of defect removal efficiency across the software industry. In fact, every software application larger than 1,000 function points in size should collect this kind of data.
One important source of defects is not shown in Table 5-6, and that is bad-fix injection. About 7 percent of bug repairs contain a fresh bug in the repair itself. Assume that unit testing found and removed 100 bugs in an application. There is a high probability that 7 new bugs would be accidentally injected into the application due to errors in the fixes themselves. (Bad-fix injections greater than 25 percent may occur with error-prone modules.)
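Bad-fix injection is easy to fold into the same arithmetic. A minimal sketch, using the 7 percent average rate and the 25 percent rate associated with error-prone modules as inputs:

    // Minimal sketch: net defects removed once bad-fix injection is counted.
    // Each repair has some probability of introducing a fresh defect.
    public class BadFixInjection {
        static double netRemoved(int bugsFixed, double injectionRate) {
            return bugsFixed - bugsFixed * injectionRate;
        }

        public static void main(String[] args) {
            int fixed = 100; // bugs found and repaired in unit test
            System.out.printf("At 7%% injection: about %.0f net defects removed%n",
                    netRemoved(fixed, 0.07)); // 93: 7 new bugs ride in on the fixes
            System.out.printf("At 25%% injection: about %.0f net defects removed%n",
                    netRemoved(fixed, 0.25)); // 75, typical of error-prone modules
        }
    }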
Bad-fix injection is a very common source of defects in software, but it is not well covered either in the literature on testing or in the literature on software quality assurance.

Another quality issue that is not well covered is that of error-prone modules. As mentioned elsewhere in this book, bugs are not randomly distributed, but tend to clump in a small number of very buggy modules.

If an application contains one or more error-prone modules, then defect removal efficiency levels against those modules may be only half of the values shown in Table 5-6, and bad-fix injection rates may top 25 percent. This is why error-prone modules can seldom be repaired, but need to be surgically removed and replaced by a new module.
In spite of the long history of testing and the large number of test personnel employed by the software industry, a great deal more research is needed. Some of the topics that need research are automatic generation of test cases from specifications, developing reusable test cases, better predictions of test case numbers and removal efficiency, and much better measurement of test results in terms of defect removal efficiency levels.
Demographics   In the software world, testing has long been one of the major development activities, and test personnel are among the largest software occupation groups. But to date there is no accurate census of test personnel, due in part to the fact that so many different kinds of specialists get involved in testing.
Because testing is on the critical path for releasing software, there is a tendency for software project managers or even senior executives to put pressure on test personnel to truncate testing when schedules are slipping. Having test organizations report to separate skill managers, as opposed to project or application managers, adds a measure of independence.
However, testing is such an integral part of software development that test personnel need to be involved essentially from the first day that development begins. Whether testers report to skill managers or are embedded in project teams, they need early involvement during requirements and design. This is especially true with test-driven development (TDD), where test cases are an integral part of the requirements and design processes.
Project size   The minimum size of applications where formal testing is mandatory is about 100 function points. As a rule, the larger the application, the more kinds of pretest defect removal activities and the more kinds of testing are needed to be successful, or even to finish the application at all.

For large systems of less than 10,000 function points, inspections, static analysis, security analysis, and about ten forms of testing are needed to achieve high levels of defect removal efficiency. Unfortunately, many companies skimp on testing and nontest activities, so U.S. average results are embarrassingly bad: 85 percent cumulative defect removal efficiency. These results have been fairly flat or constant from 1996 through 2009.
Productivity rates   There are no effective productivity rates for testing, and there are no effective size metrics for test cases. At a macro level, testing productivity can be measured by using "work hours per function point" or the reciprocal "function points per staff month," but those measures are abstract and don't really capture the essence of testing.

Measures such as "test cases created per month" or "test cases executed per month" send the wrong message, because they might encourage extra testing simply to puff up the results without raising defect removal efficiency.

Measures such as "defects detected per month" are unreliable, because for really top-gun developers, there may not be very many defects to find. The "cost per defect" metric is also unreliable for the same reason. Testers will still run many test cases whether an application has any bugs or not. As a result, cost per defect rises as defect quantities go down; hence the cost per defect metric penalizes quality.
Schedules   The primary schedule issues for test personnel are those of test case creation and test case execution. But testing schedules depend more upon the number of bugs found and the time it takes to repair the bugs than on test cases.
One factor that is seldom measured but also delays test schedules is bugs or defects in test cases themselves. A study done some years ago by IBM found more bugs in test cases than in the applications being tested. This topic is not well covered by the testing literature. (This was the same study that had found about 30 percent redundant or duplicate test cases in test libraries.) Running duplicate test cases adds to testing costs and schedules, but not to defect removal efficiency levels.
When testing starts on applications with high volumes of defects, the entire schedule for the project is at risk, because testing will extend far beyond its planned termination. In fact, testing delays due to excessive defect volumes are the main reason for software schedule delays.

The most effective way to minimize test schedules is to have very few defects present because pretest inspections and static analysis found most of them before testing began. Defect prevention methods such as TSP or joint application design (JAD) can also speed up test schedules.
For the software industry as a whole, delays in testing due to excessive bugs are a major cause of application cost and schedule overruns and also of project cancellations. Because long delays and cancellations trigger a great deal of litigation, high defect potentials and low levels of defect removal efficiency are causative factors in breach of contract lawsuits.
Quality   Testing by itself has not been efficient enough in finding bugs to be the only form of defect removal used on major software applications. Testing alone almost never tops 85 percent defect removal efficiency, with the exception of the newer test-driven development (TDD), which can hit 90 percent.

Testing combined with formal inspections and static analysis achieves higher levels of defect removal efficiency, shorter schedules, and lower costs than testing alone. Moreover, these savings not only benefit development, but also lower the downstream costs of customer support and maintenance.
Readers who are executives and qualified to sign contracts are advised to consider 95 percent as the minimum acceptable level of defect removal efficiency. Every outsource contract, every internal quality plan, and every license with a software vendor should require proof that the development organization will top 95 percent in defect removal efficiency.
Specialization   Testing specialization covers a wide range of skills. However, for many small companies with a generalist philosophy, software developers may also serve as software testers even though they may not be properly trained for the role.

For large companies, a formal testing department staffed by testing specialists will give better results than development testing by itself. For very large multinational companies and for companies that build systems and embedded software, test and quality assurance specialists will be numerous and have many diverse skills.

There are several forms of test certification available. Testers who go to the trouble of achieving certification are to be commended for taking their work seriously. However, there is not a great deal of empirical data that compares the defect removal efficiency levels of tests carried out by certified testers versus the same kind of testing performed by uncertified testers.
Cautions and counter indications   The main caution about testing is that it does not find very many bugs or defects. For more than 50 years, the software industry has routinely delivered large software applications with hundreds of latent bugs, in spite of extensive testing.

A second caution about testing is that testing cannot find requirements errors such as the famous Y2K problem. Once an error becomes embedded in requirements and is not found via inspections, quality function deployment (QFD), or some other nontest approach, all that testing will accomplish is to confirm the error. This is why correct requirements and design documents are vital for successful testing. This also explains why formal inspections of requirements and design documents raise testing efficiency by about 5 percent per test stage.
Conclusions   The literature on testing is extensive but almost totally devoid of quantitative data that deals with defect removal efficiency, with testing costs, with test staffing, with test specialization, with return on investment (ROI), or with the productivity of test personnel. However, there are dozens of books and hundreds of web sites with information on testing.

Several nonprofit organizations are involved with testing, such as the Association for Software Testing (AST) and the American Society for Quality (ASQ). There is also a Global Association for Software Quality (GASQ). There are local and regional software quality organizations in many cities. There are also for-profit test associations that hold a number of conferences and workshops, and also offer certification exams.
Given the central role of testing over the past 50 years of software engineering, the gaps in the test literature are surprising and dismaying. A technical occupation that has no clue about the most efficient and cost-effective methods for preventing or removing serious errors is not qualified to be called "engineering."
Some of the newer forms of testing such as test-driven development (TDD) are moving in a positive direction by shifting test case development to earlier in the development cycle, and by joining test cases with requirements and design. These changes in test strategy result in higher levels of defect removal efficiency coupled with lower costs as well.
But to achieve really high levels of quality in a cost-effective manner, testing alone has always been insufficient and remains insufficient in 2009. A synergistic combination of defect prevention and a multiphase suite of defect removal activities that combines inspections, static analysis, automated testing, and manual testing provides the best overall results.
For the software industry as a whole, defect potentials have been far too high, and defect removal efficiency far too low, for far too many years. This unfortunate combination has raised development costs, stretched out development schedules, caused many failures and also litigation, and raised maintenance and customer support costs far higher than they should be.
Defect prevention methods such as Team Software Process (TSP), quality function deployment (QFD), Six Sigma for software, joint application design (JAD), participation in inspections, and certified reusable components have the theoretical potential of lowering defect potentials by 80 percent or more compared with 2009. In other words, defect potentials could drop from about 5.0 per function point down to about 1.0 per function point or lower.
Defect removal combinations that include formal inspections, static analysis, test-driven development, both automatic and manual testing, and certified reusable test cases could raise average defect removal efficiency levels from today's approximate average of about 85 percent in 2009 up to about 97 percent. Levels that approach 99.9 percent could even be achieved in many cases.
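
A back-of-the-envelope calculation shows what these two levers mean in delivered defects, using a hypothetical 1,000-function point application:

# Illustrative arithmetic: delivered defects =
#   size in FP x defect potential per FP x (1 - DRE).
# The 1,000-FP application is hypothetical.

size_fp = 1000

scenarios = [
    ("2009 average: 5.0 potential, 85% DRE", 5.0, 0.85),
    ("Improved: 1.0 potential, 97% DRE", 1.0, 0.97),
    ("Best case: 1.0 potential, 99.9% DRE", 1.0, 0.999),
]

for label, potential, dre in scenarios:
    delivered = size_fp * potential * (1.0 - dre)
    print(f"{label}: {delivered:.0f} latent defects at release")
# Prints 750, 30, and 1 latent defects, respectively.

The same application ships with 750 latent defects under the 2009 averages, but with about 30 under the improved assumptions, and with roughly a single latent defect in the best case.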
Effective combinations of defect prevention and defect removal activities are available in 2009 but seldom used except by a few very sophisticated organizations. What is lacking is not so much the technologies that improve quality, but awareness of how effective the best combinations really are. Also lacking is awareness of how ineffective testing alone can be. It is lack of widespread quality measurements and lack of quality benchmarks that are delaying improvements in software quality.
Also valuable are predictive estimating tools that can predict both defect potentials and the defect removal efficiency levels of any combination of review, inspection, static analysis, automatic test stage, and manual test stage. Such tools exist in 2009 and are marketed by companies such as Software Productivity Research (SPR), SEER, Galorath, and Price Systems. Even more sophisticated tools that can predict the damages that latent defects cause to customers exist in prototype form.
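
A toy version of such a prediction is sketched below. It assumes the widely published rule of thumb that defect potential approximates function points raised to the 1.25 power; the per-stage efficiencies are illustrative assumptions, and none of this reflects the proprietary models inside the vendors' actual tools.

# Toy defect prediction sketch (not any vendor's actual model).
# Assumption: defect potential ~ size_fp ** 1.25, a published rule
# of thumb; stage efficiencies are illustrative round numbers.

def predict(size_fp, stages):
    potential = size_fp ** 1.25            # total defects expected
    remaining = potential
    for name, efficiency in stages:
        removed = remaining * efficiency
        remaining -= removed
        print(f"{name:<22} removes {removed:6.0f}; {remaining:6.0f} remain")
    print(f"Cumulative DRE: {1.0 - remaining / potential:.1%}; "
          f"delivered: {remaining:.0f}")

predict(1000, [
    ("design inspection", 0.65),
    ("code inspection", 0.60),
    ("static analysis", 0.55),
    ("testing (all stages)", 0.75),
])

For a 1,000-function point application this sketch predicts a potential of about 5,600 defects (5.6 per function point, close to the averages quoted earlier) and roughly 90 delivered defects after the four removal stages.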
The final conclusion is that until the software industry can routinely top 95 percent in average defect removal efficiency levels, and hit 99 percent for critical software applications, it should not even pretend to be a true engineering discipline. The phrase "software engineering" without effective quality control is a hoax.
Software Quality Assurance (SQA) Organizations
The author of this book worked for five years in IBM's Software Quality Assurance organizations in Palo Alto and Santa Teresa, California. As a result, the author may have a residual bias in favor of SQA groups that function along the lines of IBM's SQA groups.
Within the software industry, there is some ambiguity about the role and functions of SQA groups. Among the author's clients (primarily Fortune 500 companies), the following is an approximate distribution of how SQA organizations operate:
■ In about 50 percent of companies, SQA is primarily a testing organization that performs regression tests, performance tests, system tests, and other kinds of testing that are used for large systems as they are integrated. The SQA organization reports to a vice president of software engineering, to a CIO, or to local development managers, and is not an independent organization. There may be some responsibility for measuring quality, but testing is the main focus. These SQA organizations tend to be quite large and may employ more than 25 percent of total software engineering personnel.
■ In about 35 percent of companies, SQA is a focal point for estimating and measuring quality and ensuring adherence to local and national quality standards. But the SQA group is separate from testing organizations, and performs only limited and special testing such as standards adherence. To have an independent view, the SQA organization reports to its own vice president of quality and is not part of the development or test organizations. (This is the form of SQA that IBM had when the author worked there.) These organizations tend to be fairly small and employ between 1 percent and 3 percent of total software engineering personnel.
■ About 10 percent of companies have a testing organization but no SQA organization at all. The testing group usually reports to the CIO or to a vice president or senior software executive. In such situations, testing is the main focus, although there may be some measurement of quality. While the testing organization may be large, the staffing for SQA is zero.
■ In about 5 percent of companies, there is a vice president of SQA and possibly one or two assistants, but nobody else. In this situation, SQA is clearly nothing more than an act that can be played when customers visit. Such organizations may have testing groups that report to various development managers. These so-called SQA organizations, where there are executives but no SQA personnel, employ less than one-tenth of one percent of total software engineering personnel.
Because software quality assurance (SQA) is concerned with more than testing, it is interesting to look at the activities and roles of "traditional" SQA groups that operate independently from test organizations:
1. Collecting and measuring software quality during development and after release, including analyzing test results and test coverage. In some organizations such as IBM, defect removal efficiency levels are also calculated (the calculation is sketched just after this list).
2. Predicting software quality levels for major new applications, including construction of special quality estimating tools.
3. Performing statistical studies of quality or carrying out root-cause analysis.
4. Examining and teaching quality methods such as quality function deployment (QFD) or Six Sigma for software.
5. Participating in software inspections as moderators or recorders, and also teaching inspections.
6. Ensuring that local, national, and international quality standards are followed. SQA groups are important for achieving ISO 9000 certification, for example.
7. Monitoring the activities associated with the various levels of the capability maturity model integration (CMMI). SQA groups play a major part in software process improvements and ascending to the higher levels of the CMMI.
8. Performing specialized testing such as standards adherence.
9. Teaching software quality topics to new employees.
10. Acquiring quality benchmark data from external organizations such as the International Software Benchmarking Standards Group (ISBSG).
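
Defect removal efficiency itself (item 1 in the list above) is a simple ratio: defects removed before release divided by the total of those plus the defects users report during an initial period of production use, conventionally the first 90 days. A minimal sketch with hypothetical counts:

# Minimal sketch of the standard DRE calculation. Counting field
# defects for the first 90 days after release follows common
# benchmarking practice; the counts below are hypothetical.

def defect_removal_efficiency(removed_before_release, found_first_90_days):
    return removed_before_release / (removed_before_release + found_first_90_days)

# Hypothetical project: 950 defects removed during development,
# 50 reported by customers in the first 90 days.
dre = defect_removal_efficiency(950, 50)
print(f"DRE = {dre:.1%}")   # 95.0%, exactly the contractual floor
                            # recommended earlier in this chapter.

This is the measurement on which the recommended 95 percent contractual threshold depends, which is why SQA groups need to collect both development defect data and early field defect data.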
A major responsibility of IBM's SQA organization was determining whether the quality level of new applications was likely to be good enough to ship the application to customers. The SQA organization could stop delivery of software that was felt to have insufficient quality levels.
Development managers could appeal an SQA decision to stop the release of questionable software, and the appeal would be decided by IBM's president or by a senior vice president. This did not happen often, but when it did, the event was taken very seriously by all concerned. The fact that the SQA group was vested with this power was a strong incentive for development managers to take quality seriously.
Obviously, for SQA to have the power to stop delivery of a new application, the SQA team had to have its own chain of command and its own senior vice president independent of the development organization. If SQA had reported to a development executive, then threats or coercion might have made the SQA role ineffective.
One unique feature of the IBM SQA organization was a formal "SQA research" function, which provided time and resources for carrying out research into topics that were beyond the state of the art currently available. For example, IBM's first quality estimation tool was developed under this research program. Researchers could submit proposals for topics of interest, and those selected and approved would be provided with time and with some funding if necessary.
Several companies encourage SQA and other software engineering personnel to write technical books and articles for outside journals such as CrossTalk (the U.S. Air Force software journal) or some of the IEEE journals. One company, ITT, as part of its software engineering research lab, allowed articles to be written during business hours and even provided assistance in creating camera-ready copy for books. It is a significant point that authors should be allowed to keep the royalties from the technical books that they publish.
It is an interesting phenomenon that almost every company with defect removal efficiency levels that average more than 90 percent has a formal and active SQA organization. Although formal and active SQA groups are associated with better-than-average quality, the data is not sufficient to assert that SQA is the primary cause of high quality. The reason is that most organizations that have low software quality don't have any measurements in place, and their poor quality levels only show up if they commission a special assessment, or if they are sued and end up in court.
It would be nice to say that organizations with formal SQA teams average greater than 90 percent in defect removal efficiency, and that similar companies doing similar software that lack formal SQA teams average less than 80 percent. But the unfortunate fact is that only the companies with formal SQA teams are likely to know what their defect removal efficiency levels are. In fact, quality measurement practices are so poor that even some companies that do have an SQA organization do not know their defect removal efficiency levels.
Demographics
In the software world, SQA is not large numerically, but it has been a significant source of quality innovation. There are perhaps 5,000 full-time SQA personnel employed in the United States as of 2009.
SQA organizations are very common in companies that build systems software, embedded software, or commercial software, such as SAP, Microsoft, Oracle, and the like. SQA organizations are less common in IT groups such as banks and finance companies, although they do occur within the larger companies.
Many cities have local SQA organizations, and there are also national and international quality associations as well.
There is one interesting anomaly with SQA support of software applications. Development teams that use the Team Software Process (TSP) have their own internal equivalent of SQA and also collect extensive data on bugs and quality. Therefore, TSP teams normally do not have any involvement from corporate SQA organizations. They of course provide data to the SQA organization for corporate reporting purposes, but they don't have embedded SQA personnel.
Project size
Normally, SQA involvement is mandatory for large applications above about 2,500 function points. While SQA involvement might be useful for smaller applications, they tend to have better quality than large applications. Since SQA resources are limited, concentrating on large applications is perhaps the best use of SQA personnel.
Productivity rates
There are no effective productivity rates for SQA groups. However, it is an interesting and important fact that productivity rates for software applications that do have SQA involvement, and which manage to top 95 percent in defect removal efficiency, are usually much better than those of applications of the same size that lack SQA.
Even if SQA productivity itself is ambiguous, measuring the quality and productivity of the applications that are supported by SQA teams indicates that SQA has significant business value.
Schedules
The primary schedule issues for SQA teams are the overall schedules for the applications that they support. As with productivity and quality, there is evidence that an SQA presence on an application tends to prevent schedule delays.
Indeed, if SQA is successful in introducing formal inspections, schedules can even be shortened. The most effective way to shorten software development schedules is to have very few defects due to defect prevention, and to remove most of them prior to testing by means of pretest inspections and static analysis. Since SQA groups push hard for both defect prevention and early defect removal, an effective SQA group will benefit development schedules, especially for large applications, which typically run late.
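
The schedule mechanism can be shown with a deliberately simple model: if test duration is driven mainly by the number of defects entering test and the rate at which they can be found and repaired, then removing most defects before test shrinks the test schedule almost proportionally. Every number below is an illustrative assumption.

# Toy schedule model, purely illustrative: test duration driven by
# defects entering test. The defect count and the repair rate are
# assumed numbers, not benchmarks.

def test_days(defects_entering_test, repairs_per_day=10):
    return defects_entering_test / repairs_per_day

defects = 2000
print(f"Testing only:              {test_days(defects):.0f} working days")        # 200
print(f"After 80% pretest removal: {test_days(defects * 0.20):.0f} working days") # 40

Real schedules are not this linear, but the direction of the effect is exactly what SQA groups exploit when they push inspections and static analysis ahead of testing.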
For the software industry as a whole, delays due to excessive bugs are a major cause of application cost and schedule overruns and also of project cancellations. Effective SQA groups can minimize these endemic problems.
It is a proven fact that an effective SQA organization can lead to significant cost reductions and significant schedule improvements for software projects. Yet because the top executives in many companies do not understand the economic value of high quality, and regard quality as a luxury rather than a business necessity, SQA personnel are among the first to be let go during a recession.
Quality
The roles of SQA groups center on quality, including quality measurement, quality predictions, and long-range quality improvement. SQA groups also have a role in ISO standards and the CMMI. SQA organizations also teach quality courses and assist in the deployment of methods such as quality function deployment (QFD) and Six Sigma for software. In fact, it is not uncommon for many SQA personnel to be Six Sigma black belts.
There is some uncertainty in 2009 about the role of SQA groups when test-driven development (TDD) is utilized. Because TDD is fairly new, the intersection of TDD and SQA is still evolving.
As already mentioned in the testing section of this chapter, readers who are executives and qualified to sign contracts are advised to consider 95 percent as the minimum acceptable level of defect removal efficiency. Every outsource contract, every internal quality plan, and every license with a software vendor should require proof that the development organization will top 95 percent in defect removal efficiency.
There is one troubling phenomenon that needs more study. Large systems above 10,000 function points are often released with hundreds of latent bugs in spite of extensive testing, and sometimes in spite of large SQA teams. Some of these large systems ended up in lawsuits where the author happened to be an expert witness. It usually turned out that the advice of the SQA teams was not taken, and that the project manager skimped on quality control in a misguided attempt to compress schedules.
Specialization
SQA specialization covers a wide range of skills that can include statistical analysis, function point analysis, and also testing. Other special skills include Six Sigma, complexity analysis, and root-cause analysis.
Cautions and counter indications
The main caution about SQA is that it is there to help, and not to hinder. Dogmatic attitudes are counterproductive for effective cooperation with development and testing groups.
Conclusions
An effective SQA organization can benefit not only quality, but also schedules and costs. Unfortunately, during recessions, SQA teams are among the first to be affected by layoffs and downsizing. As the recession of 2009 stretches out, it causes uncertainty about the future of SQA in U.S. business.
Because quality benefits costs and schedules, it is urgent for SQA teams to take positive steps to include measures of defect removal efficiency and measures of the economic value of quality as part of their standard functions. If SQA groups could expand the number of formal quality benchmarks brought in to companies, and collect data for submission to benchmark groups, the data would benefit both companies and the software industry.
Several nonprofit organizations are involved with SQA, such as the American Society for Quality (ASQ). There is also a Global Association for Software Quality (GASQ). Local and regional software quality organizations exist in many cities. Also, for-profit SQA associations such as the Quality Assurance Institute (QAI) hold a number of conferences and workshops, and also offer certification exams.
SQA needs to assist in introducing a synergistic combination of defect prevention and a multiphase suite of defect removal activities that combine inspections, static analysis, automated testing, and manual testing. There is no silver bullet for quality, but fusions of a variety of quality methods can be very effective. SQA groups are the logical place to provide information and training for these effective hybrid methods.
Effective combinations of defect prevention and defect removal activities are available in 2009, but seldom used except by a few very sophisticated organizations. As mentioned in the testing section of this chapter, what is lacking is not so much the technologies that improve quality, but awareness of how effective the best combinations really are. It is lack of widespread quality measurements and lack of quality benchmarks that are delaying improvements in software quality.
Also valuable are predictive estimating tools that can predict both defect potentials and the defect removal efficiency levels of any combination of review, inspection, static analysis, automatic test stage, and manual test stage. Normally, SQA groups will have such tools and use them frequently. In fact, the industry's first software quality prediction tool was developed by the IBM SQA organization in 1973 in San Jose, California.
The final conclusion is that SQA groups need to keep pushing until the software industry can routinely top 95 percent in average defect removal efficiency levels, and hit 99 percent for critical software applications. Any results less than these are insufficient and unprofessional.
Summary and Conclusions
Fred Brooks, one of the pioneers of software at IBM, observed in his classic book The Mythical Man-Month that software was strongly affected by organization structures. Not long after Fred published, the author of this book, who also worked at IBM, noted that large systems tended to be decomposed to fit existing organization structures. In particular, some major features were artificially divided to fit standard eight-person departments.
This book only touches the surface of organizational issues. Deeper study is needed on the relative merits of small teams versus large teams.
In addition, the "average" span of control of eight employees reporting to one manager may well be in need of revision. Studies of the effectiveness of various team sizes found that raising the span of control from 8 up to 12 would allow marginal managers to return to technical work and would minimize managerial disputes, which tend to be endemic. Further, since software application sizes are increasing, larger spans of control might be a better match for today's architecture.
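
The staffing arithmetic behind that suggestion is straightforward; for a hypothetical 96-person software group:

# Illustrative span-of-control arithmetic for a hypothetical
# 96-person software group.

staff = 96
for span in (8, 12):
    print(f"Span of control {span:2d}: {staff // span} first-line managers")
# Span 8 needs 12 managers; span 12 needs only 8. The difference
# frees four people, often the more marginal managers, to return
# to technical work.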
Another major topic that needs additional study is that of really large software teams that may include 500 or more personnel and dozens of specialists. There is very little empirical data on the most effective methods for dealing with such large groups with diverse skills. If such teams are geographically dispersed, that adds yet another topic that is in need of additional study.
More recently, Dr. Victor Basili, Nachiappan Nagappan, and Brendan Murphy studied organization structures at Microsoft and concluded that many of the problems with Microsoft Vista could be traced back to organizational structure issues.
However, in 2009, the literature on software organization structures and their impact is sparse compared with other topics that influence software engineering, such as methods, tools, programming languages, and testing.
Formal organization structures tend to be territorial because managers are somewhat protective of their spheres of influence. This tends to narrow the focus of teams. Newer forms of informal organizations that support cross-functional communication are gaining in popularity. Cross-functional contacts also increase the chances of innovation and problem solving.
Software organization structures should be dynamic and change with technology, but unfortunately, they are often a number of years behind where they should be.
As the recession of 2009 continues, it may spur additional research into organizational topics. For example, new subjects that need to be examined include wiki sites, virtual departments that communicate using virtual reality, and the effectiveness of home offices to minimize fuel consumption.
A very important topic with almost no literature is that of dealing with layoffs and downsizing in the least disruptive way. That topic is discussed in Chapters 1 and 2 of this book, but few additional citations exist. Because companies tend to get rid of the wrong people, layoffs often damage operational efficiency levels for years afterwards.
Another important topic that needs research, given the slow development schedules for software, would be a study of global organizations located in separate time zones eight hours apart, which would allow software applications and work products to be shifted around the globe from team to team, and thus permit 24-hour development instead of 8-hour development.
A final organizational topic that needs additional study is that of the optimum organizations for creating reusable modules and other reusable deliverables, and then constructing software applications from reusable components rather than coding them on a line-by-line basis.
Readings and References
Brooks, Fred. The Mythical Man-Month. Reading, MA: Addison Wesley, 1995.
Charette, Bob. Software Engineering Risk Analysis and Management. New York: McGraw-Hill, 1989.
Crosby, Philip B. Quality Is Free. New York: New American Library, Mentor Books, 1979.
DeMarco, Tom. Controlling Software Projects. New York: Yourdon Press, 1982.
DeMarco, Tom, and Timothy Lister. Peopleware: Productive Projects and Teams. New York: Dorset House, 1999.
Glass, Robert L. Software Creativity, Second Edition. Atlanta: developer.* Books, 2006.
Glass, Robert L. Software Runaways: Lessons Learned from Massive Software Project Failures. Englewood Cliffs, NJ: Prentice Hall, 1998.
Humphrey, Watts. Managing the Software Process. Reading, MA: Addison Wesley, 1989.
Humphrey, Watts. PSP: A Self-Improvement Process for Software Engineers. Upper Saddle River, NJ: Addison Wesley, 2005.
Humphrey, Watts. TSP: Leading a Development Team. Boston: Addison Wesley, 2006.
Humphrey, Watts. Winning with Software: An Executive Strategy. Boston: Addison Wesley, 2002.
Jones, Capers. Applied Software Measurement, Third Edition. New York: McGraw-Hill, 2008.
Jones, Capers. Estimating Software Costs. New York: McGraw-Hill, 2007.
Jones, Capers. Software Assessments, Benchmarks, and Best Practices. Boston: Addison Wesley Longman, 2000.
Kan, Stephen H. Metrics and Models in Software Quality Engineering, Second Edition. Boston: Addison Wesley Longman, 2003.
Kuhn, Thomas. The Structure of Scientific Revolutions. Chicago: University of Chicago Press, 1996.
Nagappan, Nachiappan, B. Murphy, and V. Basili. The Influence of Organizational Structure on Software Quality. Microsoft Technical Report MSR-TR-2008-11. Microsoft Research, 2008.
Pressman, Roger. Software Engineering: A Practitioner's Approach, Sixth Edition. New York: McGraw-Hill, 2005.
Strassmann, Paul. The Squandered Computer. Stamford, CT: Information Economics Press, 1997.
Weinberg, Gerald M. Becoming a Technical Leader. New York: Dorset House, 1986.
Weinberg, Gerald M. The Psychology of Computer Programming. New York: Van Nostrand Reinhold, 1971.
Yourdon, Ed. Outsource: Competing in the Global Productivity Race. Upper Saddle River, NJ: Prentice Hall PTR, 2005.
Yourdon, Ed. Death March: The Complete Software Developer's Guide to Surviving "Mission Impossible" Projects. Upper Saddle River, NJ: Prentice Hall PTR, 1997.