|
|||||
...Image
Processing Fundamentals
of
their implementation as a point
operation, a local operation, or a
global operation
as
described in Section 2.2.1.
9.1
HISTOGRAM-BASED OPERATIONS
An
important class of point
operations is based upon the
manipulation of an image
histogram
or a region
histogram.
The most important examples
are described
below.
9.1.1
Contrast stretching
Frequently,
an image is scanned in such a way that
the resulting brightness
values
do
not make full use of the
available dynamic range. This can be
easily observed in
the
histogram of the brightness
values shown in Figure 6. By
stretching the
histogram
over the available dynamic
range we attempt to correct this
situation. If
the
image is intended to go from brightness 0
to brightness 2B1 (see Section
2.1),
then
one generally maps the 0%
value (or minimum
as defined in
Section 3.5.2) to
the
value 0 and the 100%
value (or maximum)
to the value 2B1. The appropriate
transformation
is given by:
a[m, n]
- minimum
(
)
b[m, n]
= 2 B
- 1
·
(77)
maximum
- minimum
This
formula, however, can be
somewhat sensitive to outliers
and a less sensitive
and
more general version is given
by:
a[
m, n]
≤
plow%
0
(
)
2
B
- 1
· a[ m,
n
] -
plow %
plow % <
a[ m,
n
] <
phigh %
b[m, n]
=
(78)
phigh % -
plow %
(
)
2B
- 1
a[m, n]
≥
phigh %
In
this second version one
might choose the 1% and
99% values for plow%
and
phigh%,
respectively, instead of the 0% and 100%
values represented by eq.
(77). It
is
also possible to apply the
contrast-stretching operation on a
regional basis using
the
histogram from a region to
determine the appropriate
limits for the
algorithm.
Note
that in eqs. (77) and (78)
it is possible to suppress the
term 2B1 and simply
normalize
the brightness range to 0 ≤
b[m,n] ≤
1.
This means representing the
final
pixel
brightnesses as reals instead of
integers but modern computer
speeds and
RAM
capacities make this quite
feasible.
45
...Image
Processing Fundamentals
9.1.2
Equalization
When
one wishes to compare two or
more images on a specific
basis, such as
texture,
it is common to first normalize
their histograms to a "standard"
histogram.
This
can be especially useful when
the images have been
acquired under
different
circumstances.
The most common histogram
normalization technique is histogram
equalization
where
one attempts to change the
histogram through the use of
a
function
b
= ƒ(a) into a
histogram that is constant
for all brightness values.
This
would
correspond to a brightness distribution
where all values are
equally probable.
Unfortunately,
for an arbitrary image, one
can only approximate this
result.
For
a "suitable" function ƒ(·) the
relation between the input
probability density
function,
the output probability
density function, and the
function ƒ(·) is given
by:
pa ( a)da
pb (b)db = pa (a)da
⇒
dƒ
=
(79)
pb (b)
From
eq. (79) we see that
"suitable" means that ƒ(·)
is differentiable and that
dƒ/da
≥ 0.
For histogram equalization we
desire that pb(b)
= constant and this
means
that:
(
)
f
(a)
= 2 B
- 1
· P( a)
(80)
where
P
(a)
is the probability distribution
function
defined in Section 3.5.1
and
illustrated
in Figure 6a. In other
words, the quantized
probability
distribution
function
normalized from 0 to 2B1 is
the
look-up table required for
histogram
equalization.
Figures 21a-c illustrate the
effect of contrast stretching and
histogram
equalization
on a standard image. The histogram
equalization procedure can also be
applied
on a regional basis.
Figure
21a
Figure
21b
Figure
21c
Original
Contrast
Stretched
Histogram
Equalized
9.1.3
Other histogram-based operations
The
histogram derived from a
local region can also be used to
drive local filters
that
are
to be applied to that region. Examples
include minimum
filtering,
median
46
...Image
Processing Fundamentals
filtering,
and maximum
filtering
[23]. The concepts minimum,
median, and
maximum
were introduced in Figure 6.
The filters based on these
concepts will be
presented
formally in Sections 9.4.2 and
9.6.10.
9.2
M ATHEMATICS-BASED OPERATIONS
We
distinguish in this section between
binary arithmetic and ordinary
arithmetic. In
the
binary case there are
two brightness values "0"
and "1". In the ordinary
case
we
begin with 2B brightness values or levels
but the processing of the
image can
easily
generate many more levels.
For this reason many
software
systems
provide
16
or 32 bit representations for
pixel brightnesses in order to
avoid problems with
arithmetic
overflow.
9.2.1
Binary operations
Operations
based on binary (Boolean)
arithmetic form the basis
for a powerful set
of
tools that will be described here and
extended in Section 9.6,
mathematical
morphology.
The operations described
below are point operations
and thus admit a
variety
of efficient implementations including
simple look-up tables. The
standard
notation
for the basic set of
binary operations is:
c=a
NOT
c
=a + b
OR
c
= a·b
AND
(81)
c
=a ⊕b = a·b
+a · b
XOR
c
= a \ b
= a- b = a· b
SUB
The
implication is that each operation is
applied on a pixel-by-pixel basis.
For
example,
c[m,
n] =
a[m
, n]
· b
[m,
n] ∀m, n
. The
definition of each operation
is:
NOT
OR
b
AND
b
a
a
0
1
a
0
1
0
1
0
0
1
0
0
0
1
0
1
1
1
1
0
1
↑
↑
(82)
input
output
XOR
b
SUB
b
a
0
1
a
0
1
0
0
1
0
0
0
1
1
0
1
1
0
These
operations are illustrated in Figure 22
where the binary value
"1" is shown in
black
and the value "0" in
white.
47
...Image
Processing Fundamentals
a)
Image
a
b)
Image
b
c)
NOT(b) = b
d)
OR(a,b)
= a
+ b
e)
AND(a,b)
= a
· b
f)
XOR(
a,b)
= a
⊕ b
g)
SUB(a,b)
= a
\ b
Figure
22: Examples of
the various binary point
operations.
The
SUB(·) operation can be
particularly useful when the
image a
represents
a
region-of-interest
that we want to analyze
systematically and the image
b
represents
objects
that, having been analyzed,
can now be discarded, that is
subtracted, from
the
region.
9.2.2
Arithmetic-based operations
The
gray-value point operations
that form the basis
for image processing are
based
on
ordinary mathematics and
include:
48
...Image
Processing Fundamentals
Operation
Definition
preferred
data type
ADD
c=a+b
integer
SUB
c=ab
integer
MUL
c=a·b
integer
or floating point
DIV
c=a/b
floating
point
LOG
c
= log(a)
floating
point
(83)
EXP
c
= exp(a)
floating
point
SQRT
c
= sqrt(a)
floating
point
TRIG.
c
= sin/cos/tan(a)
floating
point
c
= (2B 1) a
INVERT
integer
9.3
CONVOLUTION-BASED OPERATIONS
Convolution,
the mathematical, local
operation
defined in Section 3.1 is
central to
modern
image processing. The basic
idea is that a window of
some finite size
and
shape--the
support--is
scanned across the image.
The output pixel value is
the
weighted
sum of the input pixels
within the window where
the weights are
the
values
of the filter assigned to
every pixel of the window
itself. The window
with
its
weights is called the
convolution
kernel. This
leads directly to the
following
variation
on eq. (3). If the filter
h[j,k] is zero
outside the (rectangular)
window
{j=0,1,...,J1; k=0,1,...,K1}, then,
using eq. (4), the
convolution can be written as
the
following finite sum:
J
-1K -1
∑
∑ h[
j
, k
]a[
m
- j, n
- k]
c[m, n]
=
a[m
, n]
⊗
h[m,
n] =
(84)
j=
0
k
=0
This
equation can be viewed as more than
just a pragmatic mechanism
for
smoothing
or sharpening an image. Further, while
eq. (84) illustrates the
local
character
of this operation, eqs. (10)
and (24) suggest that
the operation can be
implemented
through the use of the
Fourier domain which
requires a global
operation,
the Fourier transform. Both
of these aspects will be discussed
below.
9.3.1
Background
In
a variety of image-forming systems an
appropriate model for the
transformation
of
the physical signal
a(x,y) into an
electronic signal c(x,y)
is the convolution of
the
input
signal with the impulse
response of the sensor system.
This system might
consist
of both an optical as well as an
electrical sub-system. If each of
these
systems
can be treated as a linear, shift-invariant
(LSI)
system then the
convolution
model
is appropriate. The definitions of
these two, possible, system
properties are
given
below:
49
...Image
Processing Fundamentals
a1 →
c1 and
a2 →
c 2
If
(85)
Linearity
Then
w1 · a1 +
w2 · a2 →
w1 · c1 +
w2 · c2
a( x,
y ) → c( x,
y)
If
(86)
Shift-Invariance
Then
a(x
- xo, y
- yo ) → c( x
- xo , y
- yo )
where
w1 and
w2 are
arbitrary complex constants
and xo and
yo are
coordinates
corresponding
to arbitrary spatial translations.
Two
remarks are appropriate at this point.
First, linearity implies (by
choosing w1 =
w2 = 0) that
"zero in" gives "zero
out". The offset described in
eq. (70) means
that
such
camera signals are not
the output of a linear
system and thus
(strictly
speaking)
the convolution result is
not applicable. Fortunately, it is
straightforward
to
correct for this non-linear
effect. (See Section
10.1.)
Second,
optical lenses with a
magnification, M,
other than 1×
are
not shift invariant;
a
translation of 1 unit in the
input image a(x,y)
produces a translation of M
units
in
the
output image c(x,y).
Due to the Fourier property
described in eq. (25) this
case
can
still be handled by linear system
theory.
If
an impulse point of light δ(x,y)
is imaged through an LSI system
then the
impulse
response of that system is called
the point
spread function (PSF).
The
output
image then becomes the
convolution of the input image
with the PSF.
The
Fourier
transform of the PSF
is called
the optical
transfer function ( OTF).
For
optical
systems that are
circularly-symmetric, aberration-free, and
diffraction-
limited
the PSF
is given by
the Airy disk shown in Table
4T.5. The OTF
of
the
Airy
disk is also presented in
Table 4T.5.
If
the convolution window is
not the diffraction-limited
PSF of the lens but
rather
the
effect of defocusing a lens
then an appropriate model
for h(x,y)
is a pill box of
radius
a
as described in
Table 4T.3. The effect
on a test pattern is illustrated
in
Figure
23.
50
...Image
Processing Fundamentals
a)
Test
pattern
b)
Defocused
image
Figure
23: Convolution of
test pattern with a pill
box of radius a=4.5
pixels.
The
effect of the defocusing is
more than just simple
blurring or smoothing.
The
almost
periodic negative lobes in
the transfer function in
Table 4T.3 produce a
180° phase
shift in which black turns
to white and vice-versa. The
phase shift is
clearly
visible in Figure 23b.
9.3.2
Convolution in the spatial
domain
In
describing filters based on
convolution we will use the
following convention.
Given
a filter h[j,k]
of dimensions J
× K, we will consider
the coordinate [j=0,k=0]
to
be in the center of the
filter matrix, h.
This is illustrated in Figure
24. The
"center"
is well-defined when J
and K are
odd; for the case
where they are even,
we
will
use the approximations
(J/2,
K/2)
for the "center" of the
matrix.
[(
)]
[(
)]
[ (
)]
)(
)(
h - J - 1
, -
K
- 1
...
h J
-12
, -
K
- 12
h
0, -
K
- 12
...
...
...
2
2
M
O
M
M
M
N
M
...
h[ -1, -
1]
h[0,
-1]
h[1,
-1] ...
M
M
[(
)]
[(
)]
h
J -
12 ,
0
h
- J -12
, 0
h=
...
h[
-1, 0]
h[1,
0] ...
h[0,
0]
...
h[
-1,1]
h[1,
+1] ...
M
M
h[0,1]
M
N
M
M
M
O
M
[(
)]
[(
)]
[ (
)]
)(
)(
h
- J -1
, K
-1
h
0, K
-1
J
-1
, K
- 1
...
...
...
... h
2
2
2
2
2
Figure
24: Coordinate
system for describing
h[j,k]
When
we examine the convolution
sum (eq. (84)) closely,
several issues become
evident.
·
Evaluation of formula (84)
for m=n=0 while
rewriting the limits of
the
convolution
sum based on the "centering"
of h[j,k] shows
that values of a[j,k]
can
be
required that are outside
the image boundaries:
51
...Image
Processing Fundamentals
+
Jo
+
Ko
(
J
- 1)
(K
- 1)
∑
∑ h[
j
, k]a[-
j
, -k ]
c[0,
0] =
Jo =
,
Ko =
(87)
2
2
j
=- Jo k
=-
Ko
The
question arises what
values should we assign to the
image a[m,n]
for m<0,
m≥M, n<0,
and n≥N? There is no
"answer" to this question.
There are only
alternatives
among which we are free to
choose assuming we understand
the
possible
consequences of our choice. The
standard alternatives are a)
extend the
images
with a constant (possibly
zero) brightness value, b)
extend the image
periodically,
c) extend the image by mirroring it at
its boundaries, or d) extend
the
values
at the boundaries indefinitely. These
alternatives are illustrated in
Figure 25.
F
F
F
F
F
F
F
F
F
(a)
(b)
(c)
(d)
Figure
25: Examples of
various alternatives to extend an
image outside its
formal
boundaries. See text for
explanation.
·
When the convolution sum is
written in the standard form
(eq. (3)) for an
image
a[m,n]
of size M × N:
M-1 N
-1
∑
∑ a[
j, k
]h[m - j, n
- k]
c[m, n]
=
(88)
j=
0
k
= 0
we
see that the convolution
kernel h[j,k]
is mirrored around j=k=0 to
produce
h[j,k] before it is
translated by [m,n] as indicated in
eq. (88). While
some
convolution
kernels in common use are
symmetric in this respect, h[j,k]= h
[j,k],
many
are not. (See Section
9.5.) Care must therefore be
taken in the
implementation
of filters with respect to
the mirroring
requirements.
·
The computational complexity
for a K
× K convolution
kernel implemented in
the
spatial
domain on an image of N
× N is O(K2) where
the complexity is measured
per
pixel on the
basis of the number of
multiplies-and-adds (MADDs).
·
The value computed by a
convolution that begins with
integer brightnesses
for
a[m,n]
may produce a rational
number or a floating point
number in the result
c[m,n].
Working exclusively with
integer brightness values
will, therefore,
cause
roundoff
errors.
52
...Image
Processing Fundamentals
·
Inspection of eq. (84)
reveals another possibility
for efficient implementation
of
convolution.
If the convolution kernel
h[j,k] is separable,
that is, if the kernel
can be
written
as:
h[
j, k]
=
hrow[
k] · hcol [
j
]
(89)
then
the filtering can be
performed as follows:
J
-1K
-1
c[m, n]
=
∑ ∑hrow [k
]a[m - j , n
- k ]hcol [
j
]
(90)
j
= 0k
= 0
This
means that instead of applying
one, two-dimensional filter it is
possible to
apply
two, one-dimensional filters,
the first one in the
k
direction
and the second
one
in the j
direction.
For an N
× N image this, in
general, reduces the
computational
complexity per pixel from
O(J·
K) to O
(J+K).
An
alternative way of writing
separability is to note that
the convolution
kernel
(Figure
24) is a matrix h
and, if separable,
h
can be
written as:
=
[hcol ] ·
[hrow ]t
[h]
(91)
(
J
× K ) =
(
J
× 1)
·
(1 ×
K)
where
" t" denotes
the matrix transpose operation. In
other words, h
can be
expressed
as the outer
product of a column
vector [hcol ]
and a row vector [hrow].
·
For certain filters it is possible to
find an incremental
implementation for
a
convolution.
As the convolution window
moves over the image
(see eq. (88)),
the
leftmost
column of image data under
the window is shifted out as
a new column of
image
data is shifted in from the
right. Efficient algorithms
can take advantage of
this
[24] and, when combined
with separable filters as described
above, this can
lead
to algorithms where the
computational complexity per
pixel is O(constant).
9.3.3
Convolution in the frequency
domain
In
Section 3.4 we indicated
that there was an
alternative method to implement
the
filtering
of images through convolution. Based on
eq. (24) it appears possible
to
achieve
the same result as in eq.
(84) by the following
sequence of operations:
Compute
A(Ω,Ψ) = F{a[m,n]}
i)
Multiply
A(Ω,Ψ) by the
precomputed H(Ω,Ψ)
= F{h[m,n]}
ii)
(92)
Compute
the result c[m,n]
= F1{A(Ω,Ψ)·H(Ω,Ψ)}
iii)
53
...Image
Processing Fundamentals
·
While it might seem that
the "recipe" given above in
eq. (92) circumvents
the
problems
associated with direct
convolution in the spatial
domain--specifically,
determining
values for the image outside
the boundaries of the
image--the Fourier
domain
approach, in fact, simply "assumes"
that the image is repeated
periodically
outside
its boundaries as illustrated in
Figure 25b. This phenomenon
is referred to
as
circular
convolution.
If
circular convolution is not acceptable
then the other possibilities
illustrated in
Figure
25 can be realized by embedding the image
a[m,n] and the
filter H(Ω,Ψ)
in
larger
matrices with the desired
image extension mechanism for
a[m,n]
being
explicitly
implemented.
·
The computational complexity per
pixel of the Fourier
approach for an image
of
N
× N and for a
convolution kernel of K
× K is O(logN) complex
MADDs
independent
of K.
Here we assume that
N
> K and
that N
is a highly
composite
number
such as a power of two. (See
also 2.1.) This latter
assumption permits
use
of
the computationally-efficient Fast
Fourier Transform (FFT)
algorithm.
Surprisingly
then, the indirect route
described by eq. (92) can be
faster than the
direct
route given in eq. (84).
This requires, in general,
that K2 >>
logN.
The range
of
K
and N for
which this holds depends on
the specifics of the
implementation.
For
the machine on which this
manuscript is being written
and the specific
image
processing
package that is being used,
for an image of N
= 256 the
Fourier
approach
is faster than the
convolution approach when K
≥ 15.
(It should be noted
that
in this comparison the
direct convolution involves
only integer arithmetic
while
the
Fourier domain approach requires
complex floating point
arithmetic.)
9.4
SMOOTHING OPERATIONS
These
algorithms are applied in
order to reduce noise and/or to prepare
images for
further
processing such as segmentation. We
distinguish between linear
and non-
linear
algorithms where the former
are amenable to analysis in the
Fourier domain
and
the latter are not. We
also distinguish between
implementations based on a
rectangular
support for the filter
and implementations based on a
circular support
for
the filter.
9.4.1
Linear Filters
Several
filtering algorithms will be presented
together with the most
useful
supports.
·
Uniform
filter The
output image is based on a local
averaging of the input
filter
where
all of the values within
the filter support have
the same weight. In
the
continuous
spatial domain (x,y) the
PSF
and
transfer function are given
in Table
4T.1
for the rectangular case
and in Table 4T.3 for
the circular (pill box)
case.
54
...Image
Processing Fundamentals
For
the discrete spatial domain
[m,n]
the filter values are
the samples of the
continuous
domain case. Examples for
the rectangular case
(J=K=5)
and the
circular
case (R
=2.5)
are shown in Figure
26.
1
1
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
hrect [
j
, k]
=
1
1
hcirc[
j
, k]
=
1
1
1
1
1
1
1
1
25
1
21
1
1
1
1
1
1
1
1
1
1
1
0
0
1
1
1
1
1
1
(a)
Rectangular
filter (J=K=5)
(b)
Circular
filter (R
=2.5)
Figure
26: Uniform
filters for image
smoothing
Note
that in both cases the
filter is normalized so that Σ h[j,k] = 1. This is
done so
that
if the input a[m,n]
is a constant then the
output image c[m,n]
is the same
constant.
The justification can be found in
the Fourier transform
property described
in
eq. (26). As can be seen
from Table 4, both of these
filters have transfer
functions
that have negative lobes and
can, therefore, lead to phase
reversal as seen
in
Figure 23. The square
implementation of the filter is
separable and incremental;
the
circular implementation is incremental
[24, 25].
·
Triangular
filter The
output image is based on a
local averaging of the
input
filter
where the values within
the filter support have
differing weights. In
general,
the
filter can be seen as the
convolution of two (identical)
uniform filters either
rectangular
or circular and this has
direct consequences for the
computational
complexity
[24, 25]. (See Table
13.) In the continuous
spatial domain the PSF
and
transfer
function are given in Table
4T.2 for the rectangular
support case and in
Table
4T.4 for the circular
(pill box) support case. As
seen in Table 4 the
transfer
functions
of these filters do not have
negative lobes and thus do
not exhibit phase
reversal.
Examples
for the rectangular support
case (J=K=5)
and the circular support
case
(R =2.5)
are shown in Figure 27.
The filter is again
normalized so that Σ
h[j,k]=1.
1
1
0
0
2
3
2
0
1
0
2
2
0
0
4
6
4
2
2
2
1
1
hrect [
j
, k]
=
3
3
hcirc[
j
, k]
=
1
1
6
9
6
2
5
2
81
2
25
0
2
4
6
4
0
2
2
2
1
1
0
0
2
3
2
0
1
0
(a)
Pyramidal
filter (J=K=5)
(b)
Cone
filter (R
=2.5)
Figure
27: Triangular
filters for image
smoothing
55
...Image
Processing Fundamentals
·
Gaussian
filter The use of
the Gaussian kernel for
smoothing has become
extremely
popular. This has to do with
certain properties of the
Gaussian (e.g. the
central
limit theorem, minimum space-bandwidth
product) as well as
several
application
areas such as edge finding
and scale space analysis.
The PSF
and
transfer
function for the continuous
space Gaussian are given in
Table 4T6. The
Gaussian
filter is separable:
2
1
- y
x2
-
2σ ·
e
2σ
1
2
2
h(
x
, y)
=
g2D
(x,
y) =
e
2š σ
2šσ
(93)
=
g1D
( x)
· g1 D( y
)
There
are four distinct ways to
implement the Gaussian:
Convolution using a finite
number of samples (No) of the Gaussian
as the
convolution
kernel. It is common to choose
No = 3σ
or 5σ.
1
2
-
n
2σ 2
n
≤ No
g1 D[n]
=
2š σ e
(94)
n
> No
0
Repetitive convolution using a
uniform filter as the
convolution kernel.
g1 D[n]
≈
u[ n]
⊗
u[n]
⊗
u[n]
1
n
≤ No
u[n] =
(2
No + 1)
(95)
n
> No
0
The
actual implementation (in
each dimension) is usually of
the form:
c[n] =
((a[n] ⊗
u[n])
⊗
u[n])
⊗
u[n]
(96)
This
implementation makes use of
the approximation afforded by
the central limit
theorem.
For a desired σ
with
eq. (96), we use No = σ
although
this severely
restricts
our choice of σ's to integer
values.
Multiplication in the frequency
domain. As the Fourier
transform of a Gaussian
is
a Gaussian
(see Table T.6), this means
that it is straightforward to prepare
a
filter
H(Ω,Ψ) = G2D(Ω,Ψ) for
use with eq. (92). To
avoid truncation effects in
the
frequency
domain due to the infinite
extent of the Gaussian it is
important to choose
a
σ that is
sufficiently large. Choosing σ >
k/š
where
k
= 3 or 4 will
usually be
sufficient.
56
...Image
Processing Fundamentals
Use of a recursive filter implementation.
A recursive filter has an
infinite impulse
response
and thus an infinite support.
The separable Gaussian filter can
be
implemented
[26] by applying the
following recipe in each
dimension when σ
≥
0.5.
Choose
the σ
based
on the desired goal of the
filtering;
i)
ii)
Determine
the parameter q
based on
eq. (98);
iii)
Use
eq. (99) to determine the
filter coefficients {b0,b1,b2,b3,B
};
(97)
iv)
Apply
the forward difference
equation, eq. (100);
v)
Apply
the backward difference
equation, eq. (101);
The
relation between the desired σ and
q
is given
by:
.98711σ - 0.96330
σ
≥ 2.5
q=
(98)
3.97156 -
4.14554 1
- .26891σ
0.5
≤ σ ≤
2.5
The
filter
coefficients {b0,b1,b2,b3,B
} are
defined by:
b0 =
1.57825
+ (2.44413q) +
(1.4281q 2 ) +
(0.422205q3 )
b1 =
(2.44413q ) +
(2.85619q 2) +
(1.26661q 3)
b2 =
-(1.4281q 2 ) -
(1.26661q3 )
(99)
b3 =
0.422205q 3
B
= 1
-
(b1 +
b2 +
b3 )
/
b0
The
one-dimensional forward
difference equation takes an input
row (or column)
a[n] and
produces an intermediate output
result w[n] given
by:
w[n] =
Ba[n]
+
(b1w[n - 1]
+
b2w[n - 2]
+
b3 w[
n
- 3])
/
b0
(100)
The
one-dimensional backward
difference equation takes
the intermediate result
w[n] and
produces the output
c[n]
given by:
c[n] =
Bw[n]
+
(b1c[n + 1]
+
b2 c[n + 2]
+
b3 c[n + 3])
/
b0
(101)
The
forward equation is applied
from n
= 0 up to n
= N
1 while
the backward
equation
is applied from n
= N
1 down
to n
= 0.
The
relative performance of these
various implementation of the Gaussian
filter can
+∞
be
described as follows. Using
the root-square
error ∑ n=
- ∞ g[ n
| σ
]- h[n]
2
between
a true, infinite-extent Gaussian,
g[n|σ],
and an approximated Gaussian,
57
...Image
Processing Fundamentals
h[n], as a measure of
accuracy, the various
algorithms described above
give the
results
shown in Figure. 28a. The
relative speed of the
various algorithms in
shown
in
Figure 28b.
The
root-square error measure is extremely
conservative and thus all
filters, with
the
exception of "Uniform 3×" for
large σ, are
sufficiently accurate. The
recursive
implementation
is the fastest independent of σ; the
other implementations can
be
significantly
slower. The FFT
implementation, for example, is
3.1 times slower
for
N=256
. Further, the FFT requires
that N
be a highly
composite number.
1E+01
15
Root-Square
Error
Computation
Time Ratio
1E+00
Recursive
Gauss
1E-01
Uniform
3x
1E-02
10
1E-03
1E-04
Gauss
trunc@ 5σ
Gauss
trunc@ 3σ
1E-05
5
Optimized
FFT
1E-06
1E-07
1E-08
0
0
5
10
15
20
0
5
10
σ
σ
a)
Accuracy
comparison
b)
Speed
comparison
Figure
28: Comparison of
various Gaussian algorithms
with N=256.
The
legend is spread across both
graphs
·
Other
The
Fourier domain approach offers
the opportunity to implement
a
variety
of smoothing algorithms. The
smoothing filters will then be
lowpass
filters.
In
general it is desirable to use a lowpass
filter that has zero phase so as
not to
produce
phase distortion when
filtering the image. The
importance of phase
was
illustrated
in Figures 5 and 23. When
the frequency domain
characteristics can be
represented
in an analytic form, then
this can lead to relatively
straightforward
implementations
of H(Ω,Ψ). Possible
candidates include the
lowpass filters
"Airy"
and
"Exponential Decay" found in
Table 4T.5 and Table
4T.8, respectively.
9.4.2
Non-Linear Filters
A
variety of smoothing filters
have been developed that
are not linear. While
they
cannot,
in general, be submitted to Fourier
analysis, their properties and domains
of
application
have been studied
extensively.
·
Median
filter The
median statistic was
described in Section 3.5.2. A
median
filter
is based upon moving a
window over an image (as in a
convolution) and
58
...Image
Processing Fundamentals
computing
the output pixel as the
median value of the brightnesses
within the input
window.
If the window is J
× K in size we
can order the J·K
pixels in
brightness
value
from smallest to largest. If
J·K
is odd
then the median will be the
(J·K+1)/2
entry
in the list of ordered brightnesses. Note
that the value selected will
be exactly
equal
to one of the existing brightnesses so
that no roundoff error will be
involved
if
we want to work exclusively
with integer brightness
values. The algorithm as it
is
described
above has a generic complexity per pixel
of O(J·K·log(J·K)).
Fortunately,
a fast algorithm (due to
Huang et al. [23]) exists
that reduces the
complexity
to O(K)
assuming J
≥ K.
A
useful variation on the
theme of the median filter
is the percentile
filter. Here
the
center
pixel in the window is
replaced not by the 50%
(median) brightness
value
but
rather by the p%
brightness value where
p% ranges
from 0% (the minimum
filter)
to 100% (the maximum
filter). Values
other then (p=50)%
do not, in general,
correspond
to smoothing filters.
·
Kuwahara
filter Edges
play an important role in
our perception of images
(see
Figure
15) as well as in the
analysis of images. As such it is
important to be able to
smooth
images without disturbing the sharpness
and, if possible, the
position of
edges.
A filter that accomplishes
this goal is termed an
edge-preserving
filter and
one
particular example is the
Kuwahara filter [27].
Although this filter can
be
implemented
for a variety of different
window shapes, the algorithm
will be
described
for a square window of size
J
= K
= 4L
+ 1 where
L
is an integer.
The
window
is partitioned into four
regions as shown in Figure
29.
Region
1
}
Region
2
Region
3
Center
Pixel
Region
4
Figure
29: Four,
square regions defined for
the Kuwahara filter. In
this
example
L=1 and thus
J=K=5.
Each region is [(J+1)/2]
× [(K+1)/2].
59
...Image
Processing Fundamentals
In
each of the four regions
(i=1,2,3,4),
the mean brightness, mi in eq. (34),
and the
variancei, si2 in eq.
(36), are measured. The
output value of the center
pixel in the
window
is the mean value of that
region that has the
smallest variance.
9.4.3
Summary of Smoothing
Algorithms
The
following table summarizes
the various properties of the
smoothing algorithms
presented
above. The filter size is
assumed to be bounded by a rectangle of
J
× K
where,
without loss of generality,
J
≥ K. The
image size is N
× N.
Algorithm
Domain
Type
Support
Separable
/ Incremental
Complexity/pixel
O(
constant)
Uniform
Space
Linear
Square
Y/Y
O(
K)
Uniform
Space
Linear
Circular
N/Y
O(
constant)
ª
Triangle
Space
Linear
Square
Y/N
O(
K) ª
Triangle
Space
Linear
Circular
N/N
∞ª
O(
constant)
ª
Gaussian
Space
Linear
Y/N
O(
K) ª
Median
Space
Non-Linear
Square
N/Y
Square
ª
O(
J· K)
Kuwahara
Space
Non-Linear
N/N
O(log
N)
Other
Frequency
Linear
--
--/--
Table
13: Characteristics of
smoothing filters. ªSee text
for additional
explanation.
Examples
of the effect of various
smoothing algorithms are
shown in Figure 30.
b)
Uniform 5
× 5
c)
Gaussian (σ
=
2.5)
a)
Original
60
...Image
Processing Fundamentals
d)
Median 5
× 5
e)
Kuwahara 5
× 5
Figure
30: Illustration of
various linear and
non-linear smoothing
filters
9.5
DERIVATIVE-BASED OPERATIONS
Just
as smoothing is a fundamental operation
in image processing so is the
ability
to
take one or more spatial
derivatives of the image.
The fundamental problem
is
that,
according to the mathematical
definition of a derivative, this
cannot be done. A
digitized
image is not a continuous
function a(x,y)
of the spatial variables but
rather
a
discrete function a[m,n]
of the integer spatial coordinates. As a result
the
algorithms
we will present can only be
seen as approximations
to the
true spatial
derivatives
of the original spatially-continuous
image.
Further,
as we can see from the
Fourier property in eq. (27),
taking a derivative
multiplies
the signal spectrum by
either u
or v.
This means that high
frequency
noise
will be emphasized in the resulting image.
The general solution to
this
problem
is to combine the derivative operation
with one that suppresses
high
frequency
noise, in short, smoothing in
combination with the desired
derivative
operation.
9.5.1
First Derivatives
As
an image is a function of two
(or more) variables it is
necessary to define
the
direction
in which the derivative is
taken. For the
two-dimensional case we have
the
horizontal
direction, the vertical
direction, or an arbitrary direction
which can be
considered
as a combination of the two. If we
use hx to denote a
horizontal
derivative
filter (matrix), hy to denote a
vertical derivative filter
(matrix), and hθ to
denote
the arbitrary angle derivative
filter (matrix),
then:
[hθ ] =
cosθ ·
[
hx ] + sinθ ·[h
y ]
(102)
·
Gradient
filters It is also possible to
generate a vector derivative description
as
the
gradient,
∇a[m,n], of an
image:
61
...Image
Processing Fundamentals
ar
ar
r
r
(
)
∇a = ix + i y = (hx ⊗ a)ix + hy ⊗ a i y
(103)
x
y
r
r
where
ix and iy are
unit vectors in the
horizontal and vertical
direction,
respectively.
This leads to two
descriptions:
(
)
(hx ⊗ a)2
2
∇a =
+
hy ⊗ a
(104)
Gradient
magnitude
and
(
)
hy ⊗ a
ψ
(∇a) = arctan
(105)
(hx ⊗ a)
Gradient
direction
The
gradient magnitude is sometimes
approximated by:
∇a ≅ hx ⊗ a + hy ⊗ a
(106)
Approx.
Gradient magnitude
The
final results of these
calculations depend strongly on
the choices of hx and hy.
A
number of possible choices for (hx, hy) will now be
described.
·
Basic
derivative filters These
filters are specified
by:
[
] = [1
i
) [
hx ] = hy
t
-
1]
(107)
[
hx ] = [h
y ] = [1
t
0
- 1]
ii)
where
"t" denotes matrix
transpose. These two filters
differ significantly in
their
Fourier
magnitude and Fourier phase
characteristics. For the frequency
range 0 ≤
Ω
≤
š,
these are given
by:
F
i
) [
h] = [1
- 1]
H(Ω)
= 2
sin(Ω 2)
;
ϕ(Ω
) = (š - Ω
) 2
↔
(108)
F
ii)
[h] = [1
0
-1]
↔ H (Ω) =
2 sin
Ω ;
ϕ(Ω
) = š /
2
The
second form (ii)
gives suppression of high
frequency terms (Ω
≈ š)
while the
first
form (i)
does not. The first
form leads to a phase shift;
the second form does
not.
·
Prewitt
gradient filters These
filters are specified
by:
62
...Image
Processing Fundamentals
1 0 -1
1
1
1
[
hx ] = 3
1 0 -
1 = 3
1· [1
0 -
1]
1 0 -1
1
(109)
1 1 1
1
[]
1
1
hy = 0
0 0 = 0
· [1
1 1]
3
3
-1
-1 -
1 -1
Both
hx and hy are
separable. Beyond the
computational implications are
the
implications
for the analysis of the
filter. Each filter takes
the derivative in one
direction
using eq. (107)ii and
smoothes in the orthogonal
direction using a
one-
dimensional
version of a uniform
filter as
described in Section 9.4.1.
·
Sobel
gradient filters These
filters are specified
by:
1 0 -1
1
1
1
[
hx ] = 4
2 0 -
2 = 4
2
· [1
0 -1]
1 0 -1
1
(110)
1 2
1
1
[]
1
1
0
· [1
2 1]
hy = 0
0
0=
4
4
-1 -
2 -1
-1
Again,
hx and
hy are separable.
Each filter takes the
derivative in one direction
using
eq. (107)ii
and smoothes in
the orthogonal direction
using a one-dimensional
version
of a triangular
filter as
described in Section 9.4.1.
·
Alternative
gradient filters The
variety of techniques available from
one-
dimensional
signal processing for the
design of digital filters
offers us powerful
tools
for designing one-dimensional
versions of hx and hy. Using
the Parks-
McClellan
filter design algorithm, for
example, we can choose the
frequency bands
where
we want the derivative to be
taken and the frequency
bands where we want
the
noise to be suppressed. The algorithm
will then produce a real, odd
filter with a
minimum
length that meets the
specifications.
As
an example, if we want a filter
that has derivative
characteristics in a passband
(with
weight 1.0) in the frequency
range 0.0 ≤
Ω ≤ 0.3š and
a stopband (with
weight
3.0) in the range 0.32š
≤ Ω ≤ š,
then the algorithm produces
the following
optimized
seven sample filter:
63
...Image
Processing Fundamentals
[
hx ] = [hy ]
1
t
[- 3571
3571]
=
-15580
-8212
(111)
8212
0
15580
16348
The
gradient can then be calculated as in
eq. (103).
·
Gaussian gradient
filters In modern
digital image processing one
of the most
common
techniques is to use a Gaussian filter
(see Section 9.4.1) to
accomplish the
required
smoothing and one of the
derivatives listed in eq. (107).
Thus, we might
first
apply the recursive Gaussian in
eq. (97) followed by eq.
(107)ii
to achieve
the
desired,
smoothed derivative filters
hx and hy. Further,
for computational
efficiency,
we can combine these two
steps as:
B
w[n] =
( a[n + 1]
-
a[n
- 1])
+ (b1 w[
n
- 1]
+
b2w[
n
- 2]
+
b3w[n - 3])
/
b0
2
(112)
c[n] =
Bw[n]
+
(b1c[n + 1]
+
b2 c[n + 2]
+
b3 c[
n
+ 3])
/
b0
where
the various coefficients are
defined in eq. (99). The
first (forward)
equation
is
applied from n
= 0 up to n
= N
1 while
the second (backward) equation
is
applied
from n
= N
1 down
to n
= 0.
·
Summary Examples of
the effect of various
derivative
algorithms on a
noisy
version
of Figure 30a (SNR
= 29 dB) are
shown in Figure 31a-c. The
effect of
various
magnitude
gradient algorithms on Figure
30a are shown in Figure
32a-c.
After
processing, all images are
contrast stretched as in eq. (77)
for display
purposes.
(a)
(b)
(c)
Gaussian
(σ=1.5) &
eq. (107)ii
Simple
Derivative eq.
(107)ii
Sobel
eq. (110)
Figure
31: Application of
various algorithms for
hx the
horizontal derivative.
64
...Image
Processing Fundamentals
(a)
(b)
(c)
Gaussian
(σ=1.5) &
eq. (107)ii
Simple
Derivative eq.
(107)ii
Sobel
eq. (110)
Figure
32: Various
algorithms for the magnitude
gradient, |∇a|.
The
magnitude gradient takes on large
values where there are
strong edges in the
image.
Appropriate choice of σ
in
the Gaussian-based derivative
(Figure 31c) or
gradient
(Figure 32c) permits
computation of virtually any of
the other forms
simple,
Prewitt, Sobel, etc. In that
sense, the Gaussian derivative
represents a
superset
of derivative filters.
9.5.2
Second Derivatives
It
is, of course, possible to
compute higher-order derivatives of
functions of two
variables.
In image processing, as we shall
see in Sections 10.2.1 and
10.3.2, the
second
derivatives or Laplacian play an
important role. The
Laplacian is defined
as:
(
)
2
2
a
a
=
(h2 x
⊗ a ) + h2 y
⊗ a
2
∇
a=
+
(113)
2
2
x
y
where
h2x and h2y are
second derivative filters. In
the frequency domain we
have
for
the Laplacian filter (from
eq. (27)):
F
(
)
2
-
u2 +
v 2 A(u, v)
∇a
↔
(114)
The
transfer function of a Laplacian
corresponds to a parabola H(u,v)
= (u2 + v2).
·
Basic
second derivative filter
This filter is specified
by:
[
h2 x
] = [h
2y ] = [1
t
-2 1]
(115)
and
the frequency spectrum of
this filter, in each
direction, is given
by:
H(Ω )
=
F
{1
-2 1}
= -2(1
- cos Ω)
(116)
65
...Image
Processing Fundamentals
over
the frequency range š ≤ Ω ≤ š.
The two, one-dimensional
filters can be used
in
the manner suggested by eq.
(113) or combined into one,
two-dimensional filter
as:
0 1 0
[
h] = 1
-4
1
(117)
0 1 0
and
used as in eq. (84).
·
Frequency
domain Laplacian This
filter is the implementation of
the general
recipe
given in eq. (92) and
for the Laplacian filter
takes the form:
{(
}
)
c[m, n]
=
F
-1
-
Ω 2
+
Ψ2
A(Ω,
Ψ )
(118)
·
Gaussian
second derivative filter
This is the straightforward
extension of the
Gaussian
first derivative filter
described above and can be applied
independently in
each
dimension. We first apply
Gaussian smoothing with a σ chosen
on the basis
of
the problem specification. We
then apply the desired
second derivative filter
eq.
(115)
or eq. (118). Again there is
the choice among the various
Gaussian
smoothing
algorithms.
For
efficiency, we can use the
recursive implementation and combine
the two
steps--smoothing
and derivative operation--as
follows:
w[n] =
B(a[n] -
a[n
- 1])
+ (b1 w[n - 1]
+
b2w[
n
- 2]
+
b3w[n - 3])
/
b0
(119)
c[n] =
B(w[n + 1]
-
w[n])
+ (b1c[n + 1]
+
b2 c[n + 2]
+
b3 c[n + 3])
/
b0
where
the various coefficients are
defined in eq. (99). Again,
the first (forward)
equation
is applied from n
= 0 up to n
= N
1 while
the second (backward)
equation
is applied from n
= N
1 down
to n
= 0.
·
Alternative
Laplacian filters Again
one-dimensional digital filter
design
techniques
offer us powerful methods to
create filters that are
optimized for a
specific
problem. Using the
Parks-McClellan design algorithm, we can
choose the
frequency
bands where we want the
second derivative to be taken
and the frequency
bands
where we want the noise to be
suppressed. The algorithm will
then produce a
real,
even filter with a minimum
length that meets the
specifications.
As
an example, if we want a filter
that has second derivative
characteristics in a
passband
(with weight 1.0) in the
frequency range 0.0 ≤
Ω ≤ 0.3 š and
a stopband
(with
weight 3.0) in the range
0.32š
≤ Ω ≤ š,
then the algorithm produces
the
following
optimized seven sample
filter:
66
...Image
Processing Fundamentals
[
hx ] = [hy ]
1
t
[-3448
-
3448]
=
-16383
(120)
10145
1495
1495
10145
11043
The
Laplacian can then be
calculated as in eq.
(113).
·
SDGD
filter A filter
that is especially useful in edge
finding and object
measurement
is the Second-Derivative-in-the-Gradient-Direction
(SDGD)
filter.
This
filter uses five partial
derivatives:
2
2
a
a
a
Axx =
Axy =
Ax =
x2
xy
x
(121)
2
2
a
a
a
Ayx =
Ayy = 2
Ay =
xy
y
y
Note
that Axy = Ayx which accounts
for the five
derivatives.
This
SDGD
combines the
different partial derivatives as
follows:
Axx Ax
+
2
Axy Ax
Ay + Ayy A2
2
y
SDGD(a ) =
(122)
Ax + A2
2
y
As
one might expect, the large
number of derivatives involved in
this filter implies
that
noise suppression is important and that Gaussian
derivative filters--both
first
and
second order--are highly recommended if
not required [28]. It is
also
necessary
that the first and second
derivative filters have
essentially the same
passbands
and stopbands. This means that if
the first derivative filter
h1x is
given
by
[1 0 -1] (eq. (107)ii) then
the second derivative filter
should be given by h1x ⊗
h1x = h2x = [1 0 2 0
1].
·
Summary The
effects of the various second
derivative filters are
illustrated in
Figure
33a-e. All images were
contrast stretched for
display purposes using
eq.
(78)
and the parameters 1% and
99%.
67
...Image
Processing Fundamentals
(a)
(b)
(c)
Fourier
parabola eq. (118) Gaussian
(σ=1.0) &
eq. (117)
Laplacian
eq. (117)
(d)
(e)
SDGD
(σ=1.0)
eq. (122)
"Designer"
eq. (120)
Figure
33: Various
algorithms for the Laplacian
and Laplacian-related
filters.
9.5.3
Other Filters
An
infinite number of filters,
both linear and non-linear,
are possible for
image
processing.
It is therefore impossible to describe
more than the basic
types in this
section.
The description of others
can be found be in the
reference literature
(see
Section
11) as well as in the
applications literature. It is important
to use a small
consistent
set of test images that are
relevant to the application
area to understand
the
effect of a given filter or
class of filters. The effect
of filters on images can be
frequently
understood by the use of
images that have pronounced
regions of
varying
sizes to visualize the
effect on edges or by the
use of test patterns such
as
sinusoidal
sweeps to visualize the
effects in the frequency
domain. The former
have
been
used above (Figures 21, 23,
and 3033) and the latter
are demonstrated below
in
Figure 34.
68
...Image
Processing Fundamentals
(a)
Lowpass
filter
(b)
Bandpass
filter
(c)
Highpass
filter
Figure
34: Various
convolution algorithms applied to
sinusoidal test image.
9.6
M ORPHOLOGY -BASED OPERATIONS
In
Section 1 we defined an image as an
(amplitude) function of two,
real
(coordinate)
variables a(x,y)
or two, discrete variables a[m,n]. An
alternative
definition
of an image can be based on the notion
that an image consists of a set
(or
collection)
of either continuous or discrete
coordinates. In a sense the
set
corresponds
to the points or pixels that
belong to the objects in the
image. This is
illustrated
in Figure 35 which contains
two objects or sets
A
and
B. Note
that the
coordinate
system is required. For the
moment we will consider the
pixel values to
be
binary as discussed in Section
2.1 and 9.2.1. Further we
shall restrict our
discussion
to discrete space (Z2). More
general discussions can be
found in [6, 7,
29].
n
A
m
B
Figure
35: A binary
image containing two object
sets A
and
B.
The
object A
consists of those
pixels α
that
share some common
property:
A
=
{α property(α)
= =
TRUE}
(123)
Object
As
an example, object B in Figure 35
consists of {[0,0], [1,0],
[0,1]}.
69
...Image
Processing Fundamentals
The
background of A
is given by
Ac (the
complement
of
A) which is
defined as
those
elements that are not in
A:
Ac = {α α ∉
A}
(124)
Background
In
Figure 3 we introduced the concept of
neighborhood connectivity. We
now
observe
that if an object A is defined on
the basis of Cconnectivity
(C=4, 6, or
8)
then
the background Ac has a
connectivity given by 12 C. The
necessity for this is
illustrated
for the Cartesian grid in
Figure 36.
background
object
Figure
36: A binary
image requiring careful
definition of object
and
background
connectivity.
9.6.1
Fundamental definitions
The
fundamental operations associated
with an object are the
standard set
operations
union,
intersection,
and complement
{∪, ∩, c} plus
translation:
·
Translation Given a
vector x
and a set
A, the
translation,
A
+ x,
is defined as:
A
+ x = {α + x
α ∈
A}
(125)
Note
that, since we are dealing
with a digital image composed of
pixels at integer
coordinate
positions (Z2), this
implies restrictions on the
allowable translation
vectors
x.
The
basic Minkowski
set operations--addition and
subtraction--can now be
defined.
First we note that the
individual elements that
comprise B
are not
only
pixels
but also vectors
as they
have a clear coordinate
position with respect to
[0,0].
Given
two sets A
and
B:
U(A + β)
A⊕
B
=
(126)
Minkowski
addition
β
∈B
I
(
A
+β )
AO
B
=
(127)
Minkowski
subtraction
β
∈B
70
...Image
Processing Fundamentals
9.6.2
Dilation and Erosion
From
these two Minkowski operations we
define the fundamental
mathematical
morphology
operations dilation
and erosion:
U(
A
+ β )
D(
A,
B) =
A
⊕
B
=
(128)
Dilation
β
∈B
I
(
A
-β )
~
E(
A,
B) =
AO
B
=
(129)
Erosion
β
∈B
where
B
= {- β β ∈ B}
.
These two operations are
illustrated in Figure 37 for
the
~
objects
defined in Figure 35.
D(A,B)
E(A,B)
~
B
B
(a)
Dilation
D(A,B)
(b)
Erosion
E
(A,B)
Figure
37: A
binary image containing two
object sets A
and
B. The
three pixels in
B
are
"color-coded" as is their effect in
the result.
While
either set A
or
B
can be thought of
as an "image", A
is usually
considered as
the
image and B
is called a
structuring
element. The
structuring element is to
mathematical
morphology what the convolution kernel is
to linear filter
theory.
Dilation,
in general, causes objects to
dilate or grow in size;
erosion
causes
objects
to
shrink. The amount and the
way that they grow or
shrink depend upon the
choice
of
the structuring element.
Dilating or eroding without
specifying the
structural
element
makes no more sense than
trying to lowpass filter an image
without
specifying
the filter. The two
most common structuring elements
(given a Cartesian
grid)
are the 4-connected and
8-connected sets, N4 and
N8.
They are illustrated in
Figure
38.
71
...Image
Processing Fundamentals
n
n
m
m
(a)
N4
(b)
N8
Figure
38: The
standard structuring elements
N4 and
N8.
Dilation
and erosion
have
the following properties:
D(
A,
B) =
A
⊕
B
=
B
⊕
A
= D(
B,
A)
(130)
Commutative
E(
A,
B) ≠
E
(
B,
A)
(131)
Non-Commutative
A
⊕ ( B
⊕
C
) = (
A
⊕
B)
⊕
C
(132)
Associative
A
⊕ ( B
+ x ) = (
A
⊕
B)
+ x
(133)
Translation
Invariance
Dc (
A,
B) =
E(
Ac ,
B)
~
(134)
Duality
c
c
E
(
A,
B) =
D(
A
,
B)
~
With
A
as an object
and Ac as the
background, eq. (134) says
that the dilation
of an
object
is equivalent to the erosion
of the
background. Likewise, the
erosion
of
the
object
is equivalent to the dilation
of the
background.
Except
for special cases:
D(
E
(
A,
B),
B) ≠
A
≠ E ( D(
A,
B),
B)
(135)
Non-Inverses
Erosion
has
the following translation
property:
AO
(
B
+ x ) =
(
A
+ x )OB = (
AOB) +
x
(136)
Translation
Invariance
Dilation
and
erosion
have
the following important
properties. For any
arbitrary
structuring
element B
and
two image objects A1
and
A2
such
that A1
⊂
A2
(A1
is a
proper
subset of A2):
D(
A1,
B) ⊂
D
(
A2,
B)
(137)
Increasing
in A
E(
A1,
B) ⊂
E(
A2
,
B)
For
two structuring elements
B1
and
B2
such
that B1
⊂
B2:
72
...Image
Processing Fundamentals
E(
A,
B1) ⊃
E(
A,
B2
)
(138)
Decreasing
in B
The
decomposition
theorems below make it
possible to find efficient
implementations
for morphological
filters.
A
⊕ ( B
∪
C
) = (
A
⊕
B)
∪ ( A
⊕
C)
= ( B
∪
C)
⊕
A
(139)
Dilation
AO
(
B
∪
C) =
(
AOB) ∩
(
AOC )
(140)
Erosion
(
AO
B)OC =
AO
(
B
⊕
C)
(141)
Erosion
nB =
(B4442⊕ L
43)
1
⊕
B
⊕
B
44⊕
B
(142)
Multiple
Dilations
n
times
An
important decomposition theorem is due to
Vincent [30]. First, we
require
some
definitions. A convex
set
(in R
2) is one for
which the straight line
joining any
two
points in the set consists of
points that are also in the
set. Care must
obviously
be
taken when applying this
definition to discrete pixels as
the concept of a "straight
line"
must be interpreted appropriately in
Z2. A set is
bounded
if each of
its
elements
has a finite magnitude, in
this case distance to the
origin of the
coordinate
~
system.
A set is symmetric
if
B=
B
. The
sets N4 and
N8 in
Figure 38 are examples
of
convex, bounded, symmetric
sets.
Vincent's
theorem, when applied to an
image consisting of discrete
pixels, states
that
for a bounded, symmetric
structuring element B that
contains no holes and
contains
its own center, [ 0, 0] ∈
B
:
D(
A,
B) =
A
⊕
B
=
A
∪ ( A
⊕
B)
(143)
where
A
is the
contour of the object. That
is, A
is the set of
pixels that have a
background
pixel as a neighbor. The
implication of this theorem is
that it is not
necessary
to process all the pixels in an
object in order to compute a
dilation
o r
(using
eq. (134)) an erosion.
We only have to process the
boundary pixels. This
also
holds for all operations
that can be derived from
dilations
and
erosions.
The
processing
of boundary pixels instead of object
pixels means that, except
for
pathological
images, computational complexity
can be reduced from
O(N2) to O(N)
for
an N
× N image. A
number of "fast" algorithms
can be found in the
literature
that
are based on this result
[30-32]. The simplest
dilation and erosion
algorithms
are
frequently described as
follows.
·
Dilation Take
each binary object pixel
(with value "1") and set
all background
pixels
(with value "0") that
are C-connected
to that object pixel to the
value "1".
73
...Image
Processing Fundamentals
·
Erosion Take
each binary object pixel
(with value "1") that is
C-connected to
a
background
pixel and set the
object pixel value to
"0".
Comparison
of these two procedures to eq.
(143) where B
=
NC=4 or
NC=8 shows
that
they are equivalent to the
formal definitions for
dilation and erosion.
The
procedure
is illustrated for dilation
in Figure
39.
(a)
B
=
N4
(b)
B= N8
Figure
39: Illustration of
dilation.
Original object pixels are
in gray;
pixels
added through dilation
are in
black.
9.6.3
Boolean Convolution
An
arbitrary binary
image
object (or structuring
element) A
can be
represented as:
+∞
+∞
∑
∑ a[
j
, k
] · δ
[m
- j, n
- k]
A↔
(144)
k
=-∞
j
=-∞
where
Σ and ·
are the Boolean operations
OR
and AND
as defined in
eqs. (81) and
(82),
a[j,k] is a characteristic
function that
takes on the Boolean values
"1" and "0"
as
follows:
1 a
∈
A
a[
j, k]
=
(145)
0 a
∉
A
and
δ[m,n] is a Boolean
version of the Dirac delta
function that takes on
the
Boolean
values "1" and "0" as
follows:
1 j
= k = 0
δ[ j
, k]
=
(146)
0 otherwise
Dilation
for
binary images can therefore
be written as:
+∞
+∞
∑
∑ a[
j, k
] · b[m
- j, n
- k] =
a
⊗
b
D(
A,
B) =
(147)
k
= - ∞ j =-∞
74
...Image
Processing Fundamentals
which,
because Boolean OR
and AND
are
commutative, can also be
written as
+∞
+∞
∑
∑ a[m - j, n
- k] · b[ j,
k
] =
b
⊗
a
= D(
B,
A)
D(
A,
B) =
(148)
k
=-∞
j
=-∞
Using
De Morgan's theorem:
(a
+ b) = a
· b
(
a
· b) = a
+ b
and
(149)
on
eq. (148) together with
eq. (134), erosion
can be
written as:
+∞
+∞
∏
∏ (a[m - j , n
- k ] +
b
[-
j
, -
k])
E(
A,
B) =
(150)
k
=-∞
j
=-∞
Thus,
dilation
and erosion
on binary
images can be viewed as a form
of
convolution
over a Boolean
algebra.
In
Section 9.3.2 we saw that,
when convolution is employed, an
appropriate choice
of
the boundary conditions for
an image is essential. Dilation and erosion--being
a
Boolean
convolution--are no exception. The
two most common choices
are that
either
everything outside the
binary image is "0" or everything
outside the binary
image
is "1".
9.6.4
Opening and Closing
We
can combine dilation
and erosion
to build
two important higher
order
operations:
o
O(
A,
B) =
A
B = D( E(
A,
B),
B)
(151)
Opening
C(
A,
B) =
A
·
B
= E (D(
A, -
B), -
B)
(152)
Closing
The
opening
and closing
have
the following properties:
C
c (
A,
B) =
O
(
Ac ,
B)
(153)
Duality
c
c
O
(
A,
B) =
C
(
A
,
B)
O(
A
+ x,
B) =
O(
A,
B) +
x
(154)
Translation
C(
A
+ x,
B) =
C(
A,
B) +
x
For
the opening
with
structuring element B and images
A,
A1, and
A2, where
A1
is
a
subimage of A2
(A1
⊆
A2):
75
...Image
Processing Fundamentals
O(
A,
B) ⊆
A
(155)
Antiextensivity
O(
A1,
B) ⊆
O(
A2
,
B)
(156)
Increasing
monotonicity
O(O (
A,
B),
B) =
O(
A,
B)
(157)
Idempotence
For
the closing
with
structuring element B and images
A,
A1, and
A2, where
A1
is a
subimage
of A2
(A1
⊆
A2):
A
⊆ C(
A,
B)
(158)
Extensivity
C(
A1,
B) ⊆
C(
A2,
B)
(159)
Increasing
monotonicity
C(C (
A,
B),
B) =
C(
A,
B)
(160)
Idempotence
The
two properties given by eqs.
(155) and (84) are so
important to mathematical
~
morphology
that they can be considered as the
reason for defining
erosion
with
B
instead
of B
in eq.
(129).
9.6.5
HitandMiss operation
The
hit-or-miss
operator was
defined by Serra but we
shall refer to it as the
hit-
and-miss
operator
and define it as follows.
Given an image A
and
two structuring
elements
B1
and
B2, the set
definition and Boolean
definition are:
(
)
E( A,
B1
)
∩ E Ac ,
B2
HitMiss(
A,
B1
,
B2
) =
(161)
Hit-and-Miss
E(A,
B1
)
·
E( A
,
B2
)
where
B1
and
B2
are
bounded, disjoint structuring
elements. (Note the use of
the
notation
from eq. (81).) Two sets
are disjoint
if
B1
∩
B2
= ∅, the
empty set. In an
important
sense the hit-and-miss
operator is the
morphological equivalent of
template
matching, a well-known
technique for matching
patterns based upon
cross-correlation.
Here, we have a template
B1
for the
object and a template
B2
for
the
background.
9.6.6
Summary of the basic
operations
The
results of the application of
these basic operations on a test
image are illustrated
below.
In Figure 40 the various
structuring elements used in the
processing are
defined.
The value "" indicates a
"don't care". All three
structuring elements
are
symmetric.
76
...Image
Processing Fundamentals
1 1 1
-
1 -
B
=
N8
=
1 1 1
B1
=
B2
=
1 - 1
1
1 1 1
-
1
-
(a)
(b)
(c)
Figure
40: Structuring
elements B,
B1, and
B2
that are
3 ×
3 and
symmetric.
The
results of processing are
shown in Figure 41 where the
binary value "1" is
shown
in black and the value "0"
in white.
b)
Dilation
with
2B
c)
Erosion
with
2B
a)
Image
A
d)
Opening
with
2B
e)
Closing
with
2B
f)
8-c
contour:
AE (A,N8)
Figure
41: Examples of
various mathematical morphology
operations.
The
opening
operation
can separate objects that
are connected in a binary
image.
The
closing
operation
can fill in small holes.
Both operations generate a
certain
amount
of smoothing on an object contour
given a "smooth" structuring
element.
The
opening
smoothes from
the inside of the object
contour and the closing
smoothes
from the outside of the
object contour. The
hit-and-miss
example
has
found
the 4-connected contour
pixels. An alternative method to
find the contour is
simply
to use the relation:
A
= A - E( A,
N8)
(162)
4-connected
contour
or
77
...Image
Processing Fundamentals
A
= A - E( A,
N4 )
(163)
8-connected
contour
9.6.7
Skeleton
The
informal definition of a skeleton is a
line representation of an object that
is:
i)
one-pixel
thick,
ii)
through
the "middle" of the object,
and,
(164)
iii)
preserves
the topology of the
object.
These
are not always realizable.
Figure 42 shows why this is
the case.
(a)
(b)
Figure
42: Counterexamples
to the three
requirements.
In
the first example, Figure
42a, it is not possible to generate a
line that is one
pixel
thick
and in the center of an object while
generating a path that reflects
the
simplicity
of the object. In Figure 42b it is
not possible to remove a pixel
from the
8-connected
object and simultaneously preserve the
topology--the notion of
connectedness--of
the object. Nevertheless,
there are a variety of
techniques that
attempt
to achieve this goal and to produce a
skeleton.
A
basic formulation is based on the
work of Lantuéjoul [33]. The
skeleton
subset
Sk(A) is defined
as:
Sk ( A)
= E(
A, kB)
- [ E(
A, kB)
o
B]
k
= 0,1,... K
Skeleton
subsets
(165)
where
K
is the
largest value of k
before
the set Sk(A) becomes
empty. (From eq.
(156),
E(
A, kB)
o
B
⊆ E(
A, kB)
). The structuring element
B
is chosen
(in Z2) to
approximate
a circular disc, that is,
convex, bounded and symmetric.
The skeleton
is
then the union of the
skeleton subsets:
K
S(
A)
=
U
Sk ( A)
Skeleton
(166)
k=0
78
...Image
Processing Fundamentals
An
elegant side effect of this
formulation is that the
original object can be
reconstructed
given knowledge of the
skeleton subsets Sk(A), the
structuring
element
B, and K:
K
U
(Sk ( A)
⊕ k B)
A=
Reconstruction
(167)
k
=0
This
formulation for the
skeleton, however, does not
preserve the topology, a
requirement
described in eq.
(164).
An
alternative point-of-view is to implement
a thinning,
an erosion that reduces
the
thickness
of an object without permitting it to
vanish. A general thinning
algorithm
is
based on the hit-and-miss
operation:
Thin(
A,
B1,
B2
) =
A
- HitMiss( A,
B1,
B2
)
Thinning
(168)
Depending
on the choice of B1
and
B2, a large
variety of thinning
algorithms--and
through
repeated application skeletonizing
algorithms--can be implemented.
A
quite practical implementation
can be described in another
way. If we restrict
ourselves
to a 3 ×
3
neighborhood, similar to the
structuring element B = N
8 in
Figure
40a, then we can view the
thinning operation as a window
that repeatedly
scans
over the (binary) image and
sets the center pixel to "0"
under certain
conditions.
The center pixel is not
changed to "0" if
and only if:
i)
an isolated pixel is found (e.g.
Figure 43a),
ii)
removing a pixel would
change the connectivity
(e.g. Figure 43b),
(169)
iii)
removing a pixel would
shorten a line (e.g. Figure
43c).
As
pixels are (potentially)
removed in each iteration,
the process is called a
conditional
erosion. Three
test cases of eq. (169)
are illustrated in Figure
43. In
general
all possible rotations and
variations have to be checked. As
there are only
512
possible combinations for a 3 ×
3
window on a binary image,
this can be done
easily
with the use of a lookup
table.
(a)
Isolated
pixel
(b)
Connectivity
pixel
(c)
End
pixel
Figure
43: Test
conditions for conditional
erosion of the center
pixel.
79
...Image
Processing Fundamentals
If
only condition (i)
is used then each object
will be reduced to a single pixel.
This
is
useful if we wish to count
the number of objects in an
image. If only
condition
(ii) is used then
holes in the objects will be
found. If conditions (i
+ ii)
are used
each
object will be reduced to either a single
pixel if it does not contain
a hole or to
closed
rings if it does contain
holes. If conditions (i
+ ii
+ iii)
are used then the
"complete
skeleton" will be generated as an
approximation to eq. (56).
Illustrations
of
these various possibilities
are given in Figure
44a,b.
9.6.8
Propagation
It
is convenient to be able to reconstruct
an image that has "survived"
several
erosions
or to fill an object that is defined,
for example, by a boundary.
The formal
mechanism
for this has several
names including region-filling,
reconstruction,
and
propagation.
The formal definition is
given by the following
algorithm. We start
with
a seed
image S(0), a mask
image A, and a
structuring element B. We then
use
dilations
of S
with
structuring element B and masked by
A
in an iterative
procedure
as
follows:
[
]
S(
k
) =
S(k -1)
⊕
B
∩
A
until
S
(k ) =
S(k -1)
(170)
Iteration
k
With
each iteration the seed
image grows (through dilation)
but within
the
set
(object)
defined by A;
S
propagates
to fill
A. The
most common choices for
B
are
N4 or
N8.
Several remarks are central
to the use of propagation. First, in
a
straightforward
implementation, as suggested by eq.
(170), the computational
costs
are
extremely high. Each iteration
requires O(N2) operations for an
N × N image
and
with the required number of
iterations this can lead to a complexity
of O(N3).
Fortunately,
a recursive implementation of the
algorithm exists in which one or
two
passes
through the image are
usually sufficient, meaning a
complexity of O(N2).
Second,
although we have not paid
much attention to the issue
of
object/background
connectivity until now (see
Figure 36), it is essential
that the
connectivity
implied by B
be matched to
the connectivity associated
with the
boundary
definition of A
(see
eqs. (162) and (163)).
Finally, as mentioned earlier,
it
is
important to make the
correct choice ("0" or "1")
for the boundary condition
of
the
image. The choice depends upon
the application.
9.6.9
Summary of skeleton and
propagation
The
application of these two operations on a
test image is illustrated in
Figure 44. In
Figure
44a,b the skeleton operation
is shown with the endpixel
condition (eq. (169)
i+ii+iii)
and without the end pixel
condition (eq. (169)
i+ii).
The propagation
operation
is illustrated in Figure 44c.
The original image, shown in
light gray, was
eroded
by E
(A,6N8)
to produce the seed
image
shown in black. The original
was
then
used as the mask
image to
produce the final result.
The border value in
both
images
was "0".
80
...Image
Processing Fundamentals
Several
techniques based upon the
use of skeleton
and
propagation
operations
in
combination
with other mathematical
morphology operations will be given
in
Section
10.3.3.
Original
= light gray
Mask
= light gray
↓
↓
↓
Skeleton
= black
Seed
= black
a)
Skeleton
with
end pixels b)
Skeleton
without
end pixels
c)
Propagation
with
N8
Condition
eq. (169)
i+ii+iii
Condition
eq. (169)
i+ii
Figure
44: Examples of
skeleton
and propagation.
9.6.10
Gray-value morphological
processing
The
techniques of morphological filtering can
be extended to gray-level images.
To
simplify
matters we will restrict our
presentation to structuring elements,
B,
that
comprise
a finite number of pixels
and are convex and
bounded. Now,
however,
the
structuring element has gray
values associated with every
coordinate position as
does
the image A.
·
Gray-level
dilation, DG(·), is
given by:
DG (
A,
B) =
max
{a[m - j, n
- k ] +
b[ j,
k
]}
(171)
Dilation
[
j,k ]∈B
For
a given output coordinate
[m,n],
the structuring element is
summed with a
shifted
version of the image and
the maximum encountered over
all shifts within
81
...Image
Processing Fundamentals
the
J
× K domain of
B
is used as
the result. Should the
shifting require values of
the
image
A
that
are outside the M
x N
domain of
A, then a
decision must be made as
to
which model for image
extension, as described in Section 9.3.2,
should be used.
·
Gray-level
erosion,
E
G(·), is
given by:
EG(
A,
B) =
min
{a[m + j, n
+ k] -
b[ j,
k
]}
(172)
Erosion
[
j,
k]∈B
The
duality between gray-level
erosion and gray-level
dilation--the
gray-level
counterpart
of eq. (134) is:
~
EG(
A,
B) =
- DG ( -
A,
B)
(173)
Duality
~
D
(
A,
B) =
- E
( -
A,
B)
G
G
~
where
" A
" means
that a[j,k]
→
a[j,k] and " -
A
" means
that a[j,k]
→ a[j,k].
The
definitions of higher order operations
such as gray-level opening
and gray-level
closing
are:
OG (
A,
B) =
DG ( EG(
A,
B),
B)
(174)
Opening
( ( ) )
~ ~
CG (
A,
B) =
EG DG
A,
B
,
B
(175)
Closing
The
duality between gray-level
opening and gray-level
closing is:
OG (
A,
B) =
- CG( -
A,
B)
(176)
Duality
CG(
A,
B) =
-OG ( -
A,
B)
The
important properties that
were discussed earlier such as
idempotence,
translation
invariance, increasing in A, and so forth
are also applicable to gray
level
morphological
processing. The details can be found in
Giardina and Dougherty
[6].
In
many situations the seeming
complexity of gray level
morphological processing
is
significantly reduced through
the use of symmetric
structuring elements
where
b[j,k]
= b[j,k]. The
most common of these is
based on the use of
B
= constant
=
0.
For this important case
and using again the
domain [j,k] ∈
B, the
definitions
above
reduce to:
DG (
A,
B) =
max
{a[m - j, n
- k ]}
= max(
A)
(177)
Dilation
[
j,k ]∈B
B
82
...Image
Processing Fundamentals
EG(
A,
B) =
min
{a[m - j, n
- k]}
= min(
A)
(178)
Erosion
[
j,
k]∈B
B
OG (
A,
B) =
max
(min( A))
(179)
Opening
B
B
CG (
A,
B) =
min(max
( A))
(180)
Closing
B
B
The
remarkable conclusion is that
the maximum
filter and the
minimum
filter,
introduced
in Section 9.4.2, are
gray-level dilation and
gray-level erosion for
the
specific
structuring element given by
the shape of the filter
window with the
gray
value
"0" inside the
window. Examples of these
operations on a simple
one-
dimensional
signal are shown in Figure
45.
Closing
250
250
Dilation
"Original"
200
200
150
150
100
100
Erosion
50
50
Opening
0
0
0
50
100
150
200
0
50
100
150
200
Horizontal
Position
Horizontal
Position
a)
Effect of 15
× 1 dilation
and erosion
b)
Effect of 15
× 1 opening
and closing
Figure
45: Morphological
filtering of gray-level
data.
For
a rectangular window, J
× K, the
two-dimensional maximum or
minimum
filter
is separable into two,
one-dimensional windows. Further, a
one-dimensional
maximum
or minimum filter can be
written in incremental form.
(See Section
9.3.2.)
This means that gray-level
dilations and erosions have a
computational
complexity
per pixel that is O(constant),
that is, independent of
J
and K. (See
also
Table
13.)
The
operations defined above can be used to
produce morphological algorithms
for
smoothing,
gradient determination and a version of
the Laplacian. All are
constructed
from the primitives for
gray-level
dilation and gray-level
erosion and
in
all
cases the maximum
and minimum
filters
are taken over the domain [
j
, k]
∈
B
.
9.6.11
Morphological smoothing
This
algorithm is based on the
observation that a gray-level opening
smoothes
a
gray-value
image from above the
brightness surface given by
the function a[m,n]
83
...Image
Processing Fundamentals
and
the gray-level
closing smoothes
from below. We use a
structuring element B
based
on eqs. (84) and
(178).
MorphSmooth(
A,
B) =
CG (OG (
A,
B),
B)
(181)
=
min(max(max(min(A))))
Note
that we have suppressed the
notation for the structuring
element B
under
the
max
and min operations to
keep the notation simple.
Its use, however, is
understood.
9.6.12
Morphological gradient
For
linear filters the gradient
filter yields a vector
representation (eq. (103))
with a
magnitude
(eq. (104)) and direction
(eq. (105)). The version
presented here
generates
a morphological estimate of the
gradient
magnitude:
1
(D
(
A,
B) -
EG(
A,
B))
Gradient
(
A,
B) =
2 G
(182)
1
=
(max(
A) -
min(
A))
2
9.6.13
Morphological Laplacian
The
morphologically-based Laplacian filter is
defined by:
(( DG (
A,
B) -
A)
- ( A
- EG (
A,
B)))
1
Laplacian(
A,
B) =
2
1
=
( DG (
A,
B) +
EG (
A,
B) -
2
A)
(183)
2
1
=
(max(
A) +
min(A) -
2
A)
2
9.6.14
Summary of morphological
filters
The
effect of these filters is
illustrated in Figure 46. All
images were processed
with
a
3 ×
3
structuring element as described in
eqs. (177) through (183).
Figure 46e
was
contrast stretched for
display purposes using eq. (78)
and the parameters 1%
and
99%. Figures 46c,d,e should be compared
to Figures 30, 32, and
33.
84
Table of Contents:
|
|||||