of their implementation as a point operation, a local operation, or a global operation
as described in Section 2.2.1.
9.1 HISTOGRAM-BASED OPERATIONS
An important class of point operations is based upon the manipulation of an image
histogram or a region histogram. The most important examples are described
below.
9.1.1 Contrast stretching
Frequently, an image is scanned in such a way that the resulting brightness values
do not make full use of the available dynamic range. This can be easily observed in
the histogram of the brightness values shown in Figure 6. By stretching the
histogram over the available dynamic range we attempt to correct this situation. If
the image is intended to go from brightness 0 to brightness 2^B − 1 (see Section 2.1),
then one generally maps the 0% value (or minimum as defined in Section 3.5.2) to
the value 0 and the 100% value (or maximum) to the value 2^B − 1. The appropriate
transformation is given by:
$$b[m,n] = (2^B - 1)\cdot\frac{a[m,n] - \text{minimum}}{\text{maximum} - \text{minimum}}$$   (77)
This formula, however, can be somewhat sensitive to outliers and a less sensitive
and more general version is given by:
$$b[m,n] = \begin{cases} 0 & a[m,n] \le p_{\text{low}\%} \\[4pt] (2^B - 1)\cdot\dfrac{a[m,n] - p_{\text{low}\%}}{p_{\text{high}\%} - p_{\text{low}\%}} & p_{\text{low}\%} < a[m,n] < p_{\text{high}\%} \\[4pt] 2^B - 1 & a[m,n] \ge p_{\text{high}\%} \end{cases}$$   (78)
In this second version one might choose the 1% and 99% values for p_low% and
p_high%, respectively, instead of the 0% and 100% values represented by eq. (77). It
is also possible to apply the contrast-stretching operation on a regional basis using
the histogram from a region to determine the appropriate limits for the algorithm.
Note that in eqs. (77) and (78) it is possible to suppress the term 2^B − 1 and simply
normalize the brightness range to 0 ≤ b[m,n] ≤ 1. This means representing the final
pixel brightnesses as reals instead of integers, but modern computer speeds and
RAM capacities make this quite feasible.
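For illustration, eq. (78) maps directly onto a few lines of numpy; this is a minimal sketch, and the function name and the 1%/99% defaults are ours, not part of the original text:

```python
import numpy as np

def contrast_stretch(a, p_low=1.0, p_high=99.0, B=8):
    """Percentile-based contrast stretch, a sketch of eq. (78)."""
    lo, hi = np.percentile(a, [p_low, p_high])   # the p_low% and p_high% values
    b = (2**B - 1) * (a.astype(float) - lo) / (hi - lo)
    # Values below p_low% map to 0, values above p_high% to 2^B - 1:
    return np.clip(b, 0, 2**B - 1)
```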
9.1.2 Equalization
When one wishes to compare two or more images on a specific basis, such as
texture, it is common to first normalize their histograms to a "standard" histogram.
This can be especially useful when the images have been acquired under different
circumstances. The most common histogram normalization technique is histogram
equalization where one attempts to change the histogram through the use of a
function b = ƒ(a) into a histogram that is constant for all brightness values. This
would correspond to a brightness distribution where all values are equally probable.
Unfortunately, for an arbitrary image, one can only approximate this result.
For a "suitable" function ƒ(·) the relation between the input probability density
function, the output probability density function, and the function ƒ(·) is given by:
$$p_b(b)\,db = p_a(a)\,da \qquad\Rightarrow\qquad df = \frac{p_a(a)\,da}{p_b(b)}$$   (79)
From eq. (79) we see that "suitable" means that ƒ(·) is differentiable and that
dƒ/da ≥ 0. For histogram equalization we desire that p_b(b) = constant and this
means that:
$$f(a) = (2^B - 1)\cdot P(a)$$   (80)
where P (a) is the probability distribution function defined in Section 3.5.1 and
illustrated in Figure 6a. In other words, the quantized probability distribution
function normalized from 0 to 2^B − 1 is the look-up table required for histogram
equalization. Figures 21a-c illustrate the effect of contrast stretching and histogram
equalization on a standard image. The histogram equalization procedure can also be
applied on a regional basis.
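As an illustration of eq. (80), the following numpy sketch builds the lookup table from the cumulative histogram; the function name and the assumption of integer-valued input in [0, 2^B − 1] are ours:

```python
import numpy as np

def histogram_equalize(a, B=8):
    """Histogram equalization: the lookup table is the quantized
    probability distribution P(a) scaled to 2^B - 1, per eq. (80).
    Assumes `a` holds integers in [0, 2**B - 1]."""
    levels = 2**B
    hist, _ = np.histogram(a, bins=levels, range=(0, levels))
    P = np.cumsum(hist) / a.size                   # distribution function P(a)
    lut = np.round((levels - 1) * P).astype(np.uint16)
    return lut[a]                                  # apply the lookup table
```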
Figure 21: (a) Original; (b) contrast stretched; (c) histogram equalized.
9.1.3 Other histogram-based operations
The histogram derived from a local region can also be used to drive local filters that
are to be applied to that region. Examples include minimum filtering, median
filtering, and maximum filtering [23]. The concepts minimum, median, and
maximum were introduced in Figure 6. The filters based on these concepts will be
presented formally in Sections 9.4.2 and 9.6.10.
9.2 MATHEMATICS-BASED OPERATIONS
We distinguish in this section between binary arithmetic and ordinary arithmetic. In
the binary case there are two brightness values "0" and "1". In the ordinary case
we begin with 2^B brightness values or levels but the processing of the image can
easily generate many more levels. For this reason many software systems provide
16 or 32 bit representations for pixel brightnesses in order to avoid problems with
arithmetic overflow.
9.2.1 Binary operations
Operations based on binary (Boolean) arithmetic form the basis for a powerful set
of tools that will be described here and extended in Section 9.6, mathematical
morphology. The operations described below are point operations and thus admit a
variety of efficient implementations including simple look-up tables. The standard
notation for the basic set of binary operations is:
NOT –   c = ā
OR –    c = a + b
AND –   c = a · b        (81)
XOR –   c = a ⊕ b = a·b̄ + ā·b
SUB –   c = a \ b = a − b = a·b̄
The implication is that each operation is applied on a pixel-by-pixel basis; for
example, c[m,n] = a[m,n] · b[m,n] for all m, n. The definition of each operation is:
 NOT           OR                 AND
 a | c         a\b | 0   1        a\b | 0   1
 0 | 1          0  | 0   1         0  | 0   0
 1 | 0          1  | 1   1         1  | 0   1
                                                      (82)
 XOR                SUB
 a\b | 0   1        a\b | 0   1
  0  | 0   1         0  | 0   0
  1  | 1   0         1  | 1   0

(In each table the row index is the input a, the column index is the input b, and the entry is the output c.)
These operations are illustrated in Figure 22 where the binary value "1" is shown in
black and the value "0" in white.
Figure 22: Examples of the various binary point operations: (a) image a; (b) image b;
(c) NOT(b) = b̄; (d) OR(a,b) = a + b; (e) AND(a,b) = a · b; (f) XOR(a,b) = a ⊕ b;
(g) SUB(a,b) = a \ b.
The SUB(·) operation can be particularly useful when the image a represents a
region-of-interest that we want to analyze systematically and the image b represents
objects that, having been analyzed, can now be discarded, that is, subtracted, from
the region.
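As an aside, these point operations map directly onto elementwise Boolean operators on, for example, numpy arrays of dtype bool; the example arrays below are arbitrary:

```python
import numpy as np

a = np.array([[0, 1], [1, 1]], dtype=bool)
b = np.array([[0, 0], [1, 0]], dtype=bool)

c_not = ~a        # NOT
c_or  = a | b     # OR
c_and = a & b     # AND           (pixel-by-pixel versions of eq. (81))
c_xor = a ^ b     # XOR
c_sub = a & ~b    # SUB: a \ b = a AND (NOT b)
```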
9.2.2 Arithmetic-based operations
The gray-value point operations that form the basis for image processing are based
on ordinary mathematics and include:
Operation   Definition            Preferred data type
ADD         c = a + b             integer
SUB         c = a − b             integer
MUL         c = a · b             integer or floating point
DIV         c = a / b             floating point
LOG         c = log(a)            floating point                    (83)
EXP         c = exp(a)            floating point
SQRT        c = sqrt(a)           floating point
TRIG.       c = sin/cos/tan(a)    floating point
INVERT      c = (2^B − 1) − a     integer
9.3 CONVOLUTION-BASED OPERATIONS
Convolution, the mathematical, local operation defined in Section 3.1 is central to
modern image processing. The basic idea is that a window of some finite size and
shape--the support--is scanned across the image. The output pixel value is the
weighted sum of the input pixels within the window where the weights are the
values of the filter assigned to every pixel of the window itself. The window with
its weights is called the convolution kernel. This leads directly to the following
variation on eq. (3). If the filter h[j,k] is zero outside the (rectangular) window
{j = 0, 1, …, J−1; k = 0, 1, …, K−1}, then, using eq. (4), the convolution can be written as
the following finite sum:
$$c[m,n] = a[m,n] \otimes h[m,n] = \sum_{j=0}^{J-1}\sum_{k=0}^{K-1} h[j,k]\,a[m-j,n-k]$$   (84)
This equation can be viewed as more than just a pragmatic mechanism for
smoothing or sharpening an image. Further, while eq. (84) illustrates the local
character of this operation, eqs. (10) and (24) suggest that the operation can be
implemented through the use of the Fourier domain which requires a global
operation, the Fourier transform. Both of these aspects will be discussed below.
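A direct, unoptimized sketch of the finite sum in eq. (84) follows; pixels whose window would reach outside the image are simply omitted here, so the output is cropped (the boundary question is taken up in Section 9.3.2):

```python
import numpy as np

def convolve_direct(a, h):
    """Direct evaluation of eq. (84); output cropped to the valid region."""
    M, N = a.shape
    J, K = h.shape
    c = np.zeros((M - J + 1, N - K + 1))
    for m in range(J - 1, M):
        for n in range(K - 1, N):
            s = 0.0
            for j in range(J):
                for k in range(K):
                    s += h[j, k] * a[m - j, n - k]   # h[j,k] * a[m-j, n-k]
            c[m - (J - 1), n - (K - 1)] = s
    return c
```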
9.3.1 Background
In a variety of image-forming systems an appropriate model for the transformation
of the physical signal a(x,y) into an electronic signal c(x,y) is the convolution of the
input signal with the impulse response of the sensor system. This system might
consist of both an optical as well as an electrical sub-system. If each of these
systems can be treated as a linear, shift-invariant (LSI) system then the convolution
model is appropriate. The definitions of these two, possible, system properties are
given below:
Linearity –   If a₁ → c₁ and a₂ → c₂, then w₁·a₁ + w₂·a₂ → w₁·c₁ + w₂·c₂        (85)

Shift-Invariance –   If a(x,y) → c(x,y), then a(x − x_o, y − y_o) → c(x − x_o, y − y_o)        (86)
where w1 and w2 are arbitrary complex constants and xo and yo are coordinates
corresponding to arbitrary spatial translations.
Two remarks are appropriate at this point. First, linearity implies (by choosing w1 =
w2 = 0) that "zero in" gives "zero out". The offset described in eq. (70) means that
such camera signals are not the output of a linear system and thus (strictly
speaking) the convolution result is not applicable. Fortunately, it is straightforward
to correct for this non-linear effect. (See Section 10.1.)
Second, optical lenses with a magnification, M, other than 1× are not shift invariant;
a translation of 1 unit in the input image a(x,y) produces a translation of M units in
the output image c(x,y). Due to the Fourier property described in eq. (25) this case
can still be handled by linear system theory.
If an impulse point of light δ(x,y) is imaged through an LSI system then the
impulse response of that system is called the point spread function (PSF). The
output image then becomes the convolution of the input image with the PSF. The
Fourier transform of the PSF is called the optical transfer function ( OTF). For
optical systems that are circularly-symmetric, aberration-free, and diffraction-
limited, the PSF is given by the Airy disk shown in Table 4–T.5. The OTF of the
Airy disk is also presented in Table 4–T.5.
If the convolution window is not the diffraction-limited PSF of the lens but rather
the effect of defocusing a lens then an appropriate model for h(x,y) is a pill box of
radius a as described in Table 4–T.3. The effect on a test pattern is illustrated in
Figure 23.
Figure 23: Convolution of test pattern with a pill box of radius a = 4.5 pixels:
(a) test pattern; (b) defocused image.
The effect of the defocusing is more than just simple blurring or smoothing. The
almost periodic negative lobes in the transfer function in Table 4–T.3 produce a
180° phase shift in which black turns to white and vice-versa. The phase shift is
clearly visible in Figure 23b.
9.3.2 Convolution in the spatial domain
In describing filters based on convolution we will use the following convention.
Given a filter h[j,k] of dimensions J × K, we will consider the coordinate [j=0,k=0]
to be in the center of the filter matrix, h. This is illustrated in Figure 24. The
"center" is well-defined when J and K are odd; for the case where they are even, we
will use the approximations (J/2, K/2) for the "center" of the matrix.
$$\mathbf{h} = \begin{bmatrix}
h\!\left[-\tfrac{J-1}{2}, -\tfrac{K-1}{2}\right] & \cdots & h\!\left[0, -\tfrac{K-1}{2}\right] & \cdots & h\!\left[\tfrac{J-1}{2}, -\tfrac{K-1}{2}\right] \\
\vdots & \ddots & \vdots & & \vdots \\
\cdots & h[-1,-1] & h[0,-1] & h[1,-1] & \cdots \\
h\!\left[-\tfrac{J-1}{2}, 0\right] & \cdots\; h[-1,0] & h[0,0] & h[1,0]\;\cdots & h\!\left[\tfrac{J-1}{2}, 0\right] \\
\cdots & h[-1,1] & h[0,1] & h[1,1] & \cdots \\
\vdots & & \vdots & \ddots & \vdots \\
h\!\left[-\tfrac{J-1}{2}, \tfrac{K-1}{2}\right] & \cdots & h\!\left[0, \tfrac{K-1}{2}\right] & \cdots & h\!\left[\tfrac{J-1}{2}, \tfrac{K-1}{2}\right]
\end{bmatrix}$$

Figure 24: Coordinate system for describing h[j,k]
When we examine the convolution sum (eq. (84)) closely, several issues become
evident.
· Evaluation of formula (84) for m=n=0 while rewriting the limits of the
convolution sum based on the "centering" of h[j,k] shows that values of a[j,k] can
be required that are outside the image boundaries:
51
img
...Image Processing Fundamentals
$$c[0,0] = \sum_{j=-J_o}^{+J_o}\;\sum_{k=-K_o}^{+K_o} h[j,k]\,a[-j,-k], \qquad J_o = \frac{(J-1)}{2},\quad K_o = \frac{(K-1)}{2}$$   (87)
The question arises: what values should we assign to the image a[m,n] for m < 0,
m ≥ M, n < 0, and n ≥ N? There is no "answer" to this question. There are only
alternatives among which we are free to choose assuming we understand the
possible consequences of our choice. The standard alternatives are a) extend the
images with a constant (possibly zero) brightness value, b) extend the image
periodically, c) extend the image by mirroring it at its boundaries, or d) extend the
values at the boundaries indefinitely. These alternatives are illustrated in Figure 25.
Figure 25: Examples of various alternatives to extend an image outside its
formal boundaries: (a) constant, (b) periodic, (c) mirrored, (d) extended boundary
values. See text for explanation.
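As an aside, the four alternatives correspond closely to the padding modes of numpy's pad function; the mapping of modes to the panels of Figure 25 is our reading, not the text's:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
w = 2   # half-width of the convolution window, for example

ext_a = np.pad(a, w, mode='constant', constant_values=0)  # (a) constant value
ext_b = np.pad(a, w, mode='wrap')                         # (b) periodic
ext_c = np.pad(a, w, mode='reflect')                      # (c) mirrored
ext_d = np.pad(a, w, mode='edge')                         # (d) boundary values extended
```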
· When the convolution sum is written in the standard form (eq. (3)) for an image
a[m,n] of size M × N:
$$c[m,n] = \sum_{j=0}^{M-1}\sum_{k=0}^{N-1} a[j,k]\,h[m-j,n-k]$$   (88)
we see that the convolution kernel h[j,k] is mirrored around j = k = 0 to produce
h[−j,−k] before it is translated by [m,n] as indicated in eq. (88). While some
convolution kernels in common use are symmetric in this respect, h[j,k] = h[−j,−k],
many are not. (See Section 9.5.) Care must therefore be taken in the
implementation of filters with respect to the mirroring requirements.
· The computational complexity for a K × K convolution kernel implemented in the
spatial domain on an image of N × N is O(K²) where the complexity is measured
per pixel on the basis of the number of multiplies-and-adds (MADDs).
· The value computed by a convolution that begins with integer brightnesses for
a[m,n] may produce a rational number or a floating point number in the result
c[m,n]. Working exclusively with integer brightness values will, therefore, cause
roundoff errors.
· Inspection of eq. (84) reveals another possibility for efficient implementation of
convolution. If the convolution kernel h[j,k] is separable, that is, if the kernel can be
written as:
h[j,k] = h_row[k] · h_col[j]        (89)
then the filtering can be performed as follows:
$$c[m,n] = \sum_{j=0}^{J-1}\sum_{k=0}^{K-1} h_{row}[k]\,a[m-j,n-k]\,h_{col}[j]$$   (90)
This means that instead of applying one, two-dimensional filter it is possible to
apply two, one-dimensional filters, the first one in the k direction and the second
one in the j direction. For an N × N image this, in general, reduces the
computational complexity per pixel from O(J·K) to O(J+K).
An alternative way of writing separability is to note that the convolution kernel
(Figure 24) is a matrix h and, if separable, h can be written as:
$$[\mathbf{h}]_{(J\times K)} = [\mathbf{h}_{col}]_{(J\times 1)}\cdot[\mathbf{h}_{row}]^{t}_{(1\times K)}$$   (91)

where "t" denotes the matrix transpose operation. In other words, h can be
expressed as the outer product of a column vector [h_col] and a row vector [h_row].
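A sketch of the separable implementation, eq. (90), as two one-dimensional passes; the zero extension at the borders (numpy's default for mode='same') is an assumption:

```python
import numpy as np

def convolve_separable(a, h_col, h_row):
    """Apply h_row along every row (k direction), then h_col along
    every column (j direction), per eq. (90)."""
    tmp = np.zeros(a.shape)
    out = np.zeros(a.shape)
    for m in range(a.shape[0]):
        tmp[m, :] = np.convolve(a[m, :], h_row, mode='same')
    for n in range(a.shape[1]):
        out[:, n] = np.convolve(tmp[:, n], h_col, mode='same')
    return out
```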
· For certain filters it is possible to find an incremental implementation for a
convolution. As the convolution window moves over the image (see eq. (88)), the
leftmost column of image data under the window is shifted out as a new column of
image data is shifted in from the right. Efficient algorithms can take advantage of
this [24] and, when combined with separable filters as described above, this can
lead to algorithms where the computational complexity per pixel is O(constant).
9.3.3 Convolution in the frequency domain
In Section 3.4 we indicated that there was an alternative method to implement the
filtering of images through convolution. Based on eq. (24) it appears possible to
achieve the same result as in eq. (84) by the following sequence of operations:
i)   Compute A(Ω,Ψ) = F{a[m,n]}
ii)  Multiply A(Ω,Ψ) by the precomputed H(Ω,Ψ) = F{h[m,n]}        (92)
iii) Compute the result c[m,n] = F⁻¹{A(Ω,Ψ) · H(Ω,Ψ)}
· While it might seem that the "recipe" given above in eq. (92) circumvents the
problems associated with direct convolution in the spatial domain--specifically,
determining values for the image outside the boundaries of the image--the Fourier
domain approach, in fact, simply "assumes" that the image is repeated periodically
outside its boundaries as illustrated in Figure 25b. This phenomenon is referred to
as circular convolution.
If circular convolution is not acceptable then the other possibilities illustrated in
Figure 25 can be realized by embedding the image a[m,n] and the filter H(Ω,Ψ) in
larger matrices with the desired image extension mechanism for a[m,n] being
explicitly implemented.
· The computational complexity per pixel of the Fourier approach for an image of
N × N and for a convolution kernel of K × K is O(logN) complex MADDs
independent of K. Here we assume that N > K and that N is a highly composite
number such as a power of two. (See also 2.1.) This latter assumption permits use
of the computationally-efficient Fast Fourier Transform (FFT) algorithm.
Surprisingly then, the indirect route described by eq. (92) can be faster than the
direct route given in eq. (84). This requires, in general, that K² >> logN. The range
of K and N for which this holds depends on the specifics of the implementation.
For the machine on which this manuscript is being written and the specific image
processing package that is being used, for an image of N = 256 the Fourier
approach is faster than the convolution approach when K ≥ 15. (It should be noted
that in this comparison the direct convolution involves only integer arithmetic while
the Fourier domain approach requires complex floating point arithmetic.)
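A minimal sketch of the three-step recipe in eq. (92) with numpy's FFT; as discussed above, this computes the circular convolution, and no compensation is made here for the kernel's center offset (which appears as a shift in the result):

```python
import numpy as np

def convolve_fft(a, h):
    A = np.fft.fft2(a)                  # i)   A = F{a}
    H = np.fft.fft2(h, s=a.shape)       # ii)  H = F{h}, kernel zero-padded
    c = np.fft.ifft2(A * H)             # iii) c = F^-1{A . H}
    return np.real(c)                   # real input => real result up to roundoff
```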
9.4 SMOOTHING OPERATIONS
These algorithms are applied in order to reduce noise and/or to prepare images for
further processing such as segmentation. We distinguish between linear and non-
linear algorithms where the former are amenable to analysis in the Fourier domain
and the latter are not. We also distinguish between implementations based on a
rectangular support for the filter and implementations based on a circular support
for the filter.
9.4.1 Linear Filters
Several filtering algorithms will be presented together with the most useful
supports.
· Uniform filter – The output image is based on a local averaging of the input filter
where all of the values within the filter support have the same weight. In the
continuous spatial domain (x,y) the PSF and transfer function are given in Table
4–T.1 for the rectangular case and in Table 4–T.3 for the circular (pill box) case.
For the discrete spatial domain [m,n] the filter values are the samples of the
continuous domain case. Examples for the rectangular case (J=K=5) and the
circular case (R =2.5) are shown in Figure 26.
$$h_{rect}[j,k] = \frac{1}{25}\begin{bmatrix}1&1&1&1&1\\1&1&1&1&1\\1&1&1&1&1\\1&1&1&1&1\\1&1&1&1&1\end{bmatrix} \qquad h_{circ}[j,k] = \frac{1}{21}\begin{bmatrix}0&1&1&1&0\\1&1&1&1&1\\1&1&1&1&1\\1&1&1&1&1\\0&1&1&1&0\end{bmatrix}$$

(a) Rectangular filter (J=K=5)    (b) Circular filter (R=2.5)
Figure 26: Uniform filters for image smoothing
Note that in both cases the filter is normalized so that Σ h[j,k] = 1. This is done so
that if the input a[m,n] is a constant then the output image c[m,n] is the same
constant. The justification can be found in the Fourier transform property described
in eq. (26). As can be seen from Table 4, both of these filters have transfer
functions that have negative lobes and can, therefore, lead to phase reversal as seen
in Figure 23. The square implementation of the filter is separable and incremental;
the circular implementation is incremental [24, 25].
· Triangular filter – The output image is based on a local averaging of the input
filter where the values within the filter support have differing weights. In general,
the filter can be seen as the convolution of two (identical) uniform filters either
rectangular or circular and this has direct consequences for the computational
complexity [24, 25]. (See Table 13.) In the continuous spatial domain the PSF and
transfer function are given in Table 4–T.2 for the rectangular support case and in
Table 4–T.4 for the circular (pill box) support case. As seen in Table 4 the transfer
functions of these filters do not have negative lobes and thus do not exhibit phase
reversal.
Examples for the rectangular support case (J=K=5) and the circular support case
(R =2.5) are shown in Figure 27. The filter is again normalized so that Σ h[j,k]=1.
$$h_{rect}[j,k] = \frac{1}{81}\begin{bmatrix}1&2&3&2&1\\2&4&6&4&2\\3&6&9&6&3\\2&4&6&4&2\\1&2&3&2&1\end{bmatrix} \qquad h_{circ}[j,k] = \frac{1}{25}\begin{bmatrix}0&0&1&0&0\\0&2&2&2&0\\1&2&5&2&1\\0&2&2&2&0\\0&0&1&0&0\end{bmatrix}$$

(a) Pyramidal filter (J=K=5)    (b) Cone filter (R=2.5)
Figure 27: Triangular filters for image smoothing
· Gaussian filter – The use of the Gaussian kernel for smoothing has become
extremely popular. This has to do with certain properties of the Gaussian (e.g. the
central limit theorem, minimum space-bandwidth product) as well as several
application areas such as edge finding and scale space analysis. The PSF and
transfer function for the continuous space Gaussian are given in Table 4–T.6. The
Gaussian filter is separable:

$$h(x,y) = g_{2D}(x,y) = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/2\sigma^2}\right)\cdot\left(\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-y^2/2\sigma^2}\right) = g_{1D}(x)\cdot g_{1D}(y)$$   (93)
There are four distinct ways to implement the Gaussian:
– Convolution using a finite number of samples (N_o) of the Gaussian as the
convolution kernel. It is common to choose N_o = 3σ or 5σ:
$$g_{1D}[n] = \begin{cases}\dfrac{1}{\sqrt{2\pi}\,\sigma}\,e^{-n^2/2\sigma^2} & |n| \le N_o \\[4pt] 0 & |n| > N_o\end{cases}$$   (94)

– Repetitive convolution using a uniform filter as the convolution kernel:

$$g_{1D}[n] \cong u[n] \otimes u[n] \otimes u[n] \qquad\text{with}\qquad u[n] = \begin{cases}\dfrac{1}{2N_o + 1} & |n| \le N_o \\[4pt] 0 & |n| > N_o\end{cases}$$   (95)
The actual implementation (in each dimension) is usually of the form:
c[n] = ((a[n] ⊗ u[n]) ⊗ u[n]) ⊗ u[n]        (96)
This implementation makes use of the approximation afforded by the central limit
theorem. For a desired σ with eq. (96), we use N_o = σ although this severely
restricts our choice of σ's to integer values.
– Multiplication in the frequency domain. As the Fourier transform of a Gaussian
is a Gaussian (see Table 4–T.6), it is straightforward to prepare a
filter H(Ω,Ψ) = G_2D(Ω,Ψ) for use with eq. (92). To avoid truncation effects in the
frequency domain due to the infinite extent of the Gaussian it is important to choose
a σ that is sufficiently large. Choosing σ > k/π where k = 3 or 4 will usually be
sufficient.
– Use of a recursive filter implementation. A recursive filter has an infinite impulse
response and thus an infinite support. The separable Gaussian filter can be
implemented [26] by applying the following recipe in each dimension when σ ≥
0.5:

i)   Choose σ based on the desired goal of the filtering;
ii)  Determine the parameter q based on eq. (98);
iii) Use eq. (99) to determine the filter coefficients {b0, b1, b2, b3, B};        (97)
iv)  Apply the forward difference equation, eq. (100);
v)   Apply the backward difference equation, eq. (101).
The relation between the desired σ and q is given by:
$$q = \begin{cases}0.98711\,\sigma - 0.96330 & \sigma \ge 2.5 \\ 3.97156 - 4.14554\,\sqrt{1 - 0.26891\,\sigma} & 0.5 \le \sigma \le 2.5\end{cases}$$   (98)
The filter coefficients {b0,b1,b2,b3,B } are defined by:
b0 = 1.57825 + 2.44413·q + 1.4281·q² + 0.422205·q³
b1 = 2.44413·q + 2.85619·q² + 1.26661·q³
b2 = −(1.4281·q² + 1.26661·q³)        (99)
b3 = 0.422205·q³
B = 1 − (b1 + b2 + b3) / b0
The one-dimensional forward difference equation takes an input row (or column)
a[n] and produces an intermediate output result w[n] given by:
w[n] = B·a[n] + (b1·w[n−1] + b2·w[n−2] + b3·w[n−3]) / b0        (100)
The one-dimensional backward difference equation takes the intermediate result
w[n] and produces the output c[n] given by:
c[n] = B·w[n] + (b1·c[n+1] + b2·c[n+2] + b3·c[n+3]) / b0        (101)
The forward equation is applied from n = 0 up to n = N − 1 while the backward
equation is applied from n = N − 1 down to n = 0.
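Putting eqs. (97)-(101) together, a one-dimensional sketch might look as follows; taking samples outside the row as zero is our assumption, since the text does not prescribe a boundary rule:

```python
import numpy as np

def recursive_gaussian_1d(a, sigma):
    """Recursive Gaussian smoothing of one row/column, sigma >= 0.5."""
    # ii) eq. (98): parameter q from the desired sigma
    if sigma >= 2.5:
        q = 0.98711 * sigma - 0.96330
    else:
        q = 3.97156 - 4.14554 * np.sqrt(1.0 - 0.26891 * sigma)
    # iii) eq. (99): filter coefficients {b0, b1, b2, b3, B}
    b0 = 1.57825 + 2.44413 * q + 1.4281 * q**2 + 0.422205 * q**3
    b1 = 2.44413 * q + 2.85619 * q**2 + 1.26661 * q**3
    b2 = -(1.4281 * q**2) - (1.26661 * q**3)
    b3 = 0.422205 * q**3
    B = 1.0 - (b1 + b2 + b3) / b0

    N = len(a)
    w = np.zeros(N)
    c = np.zeros(N)
    g = lambda arr, i: arr[i] if 0 <= i < N else 0.0   # zero outside the row
    for n in range(N):                 # iv) eq. (100), forward pass
        w[n] = B * a[n] + (b1 * g(w, n - 1) + b2 * g(w, n - 2) + b3 * g(w, n - 3)) / b0
    for n in range(N - 1, -1, -1):     # v)  eq. (101), backward pass
        c[n] = B * w[n] + (b1 * g(c, n + 1) + b2 * g(c, n + 2) + b3 * g(c, n + 3)) / b0
    return c
```

For a two-dimensional image the routine is applied to every row and then to every column, exploiting the separability of eq. (93).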
The relative performance of these various implementations of the Gaussian filter can
be described as follows. Using the root-square error √( Σ_{n=−∞}^{+∞} |g[n|σ] − h[n]|² )
between a true, infinite-extent Gaussian, g[n|σ], and an approximated Gaussian,
h[n], as a measure of accuracy, the various algorithms described above give the
results shown in Figure 28a. The relative speed of the various algorithms is shown
in Figure 28b.
The root-square error measure is extremely conservative and thus all filters, with
the exception of "Uniform 3×" for large σ, are sufficiently accurate. The recursive
implementation is the fastest independent of σ; the other implementations can be
significantly slower. The FFT implementation, for example, is 3.1 times slower for
N = 256. Further, the FFT requires that N be a highly composite number.
Figure 28: Comparison of various Gaussian algorithms with N = 256:
(a) accuracy comparison, root-square error versus σ; (b) speed comparison,
computation time ratio versus σ. The legend (Recursive Gauss, Uniform 3×,
Gauss trunc@5σ, Gauss trunc@3σ, Optimized FFT) is spread across both graphs.
· Other – The Fourier domain approach offers the opportunity to implement a
variety of smoothing algorithms. The smoothing filters will then be lowpass filters.
In general it is desirable to use a lowpass filter that has zero phase so as not to
produce phase distortion when filtering the image. The importance of phase was
illustrated in Figures 5 and 23. When the frequency domain characteristics can be
represented in an analytic form, then this can lead to relatively straightforward
implementations of H(Ω,Ψ). Possible candidates include the lowpass filters "Airy"
and "Exponential Decay" found in Table 4–T.5 and Table 4–T.8, respectively.
9.4.2 Non-Linear Filters
A variety of smoothing filters have been developed that are not linear. While they
cannot, in general, be submitted to Fourier analysis, their properties and domains of
application have been studied extensively.
· Median filter – The median statistic was described in Section 3.5.2. A median
filter is based upon moving a window over an image (as in a convolution) and
computing the output pixel as the median value of the brightnesses within the input
window. If the window is J × K in size we can order the J·K pixels in brightness
value from smallest to largest. If J·K is odd then the median will be the (J·K+1)/2
entry in the list of ordered brightnesses. Note that the value selected will be exactly
equal to one of the existing brightnesses so that no roundoff error will be involved
if we want to work exclusively with integer brightness values. The algorithm as it is
described above has a generic complexity per pixel of O(J·K·log(J·K)).
Fortunately, a fast algorithm (due to Huang et al. [23]) exists that reduces the
complexity to O(K) assuming J ≤ K.
A useful variation on the theme of the median filter is the percentile filter. Here the
center pixel in the window is replaced not by the 50% (median) brightness value
but rather by the p% brightness value where p% ranges from 0% (the minimum
filter) to 100% (the maximum filter). Values other than (p=50)% do not, in general,
correspond to smoothing filters.
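A direct sketch of the percentile filter (and thus of the median, minimum, and maximum filters) is given below; this is the generic O(J·K·log(J·K)) sort-based algorithm, not Huang's fast version, and the output is cropped to pixels with a full window:

```python
import numpy as np

def percentile_filter(a, J, K, p=50.0):
    """p = 50 gives the median filter, p = 0 the minimum filter,
    p = 100 the maximum filter."""
    M, N = a.shape
    out = np.zeros((M - J + 1, N - K + 1), dtype=a.dtype)
    rank = int(round((p / 100.0) * (J * K - 1)))   # index into sorted window
    for m in range(out.shape[0]):
        for n in range(out.shape[1]):
            window = np.sort(a[m:m + J, n:n + K], axis=None)
            out[m, n] = window[rank]
    return out
```

Note that, as stated above, the output value is always one of the existing brightnesses, so integer images stay integer.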
· Kuwahara filter – Edges play an important role in our perception of images (see
Figure 15) as well as in the analysis of images. As such it is important to be able to
smooth images without disturbing the sharpness and, if possible, the position of
edges. A filter that accomplishes this goal is termed an edge-preserving filter and
one particular example is the Kuwahara filter [27]. Although this filter can be
implemented for a variety of different window shapes, the algorithm will be
described for a square window of size J = K = 4L + 1 where L is an integer. The
window is partitioned into four regions as shown in Figure 29.
Figure 29: Four, square regions defined for the Kuwahara filter. In this
example L = 1 and thus J = K = 5. Each region is [(J+1)/2] × [(K+1)/2].
In each of the four regions (i = 1, 2, 3, 4), the mean brightness, m_i in eq. (34), and the
variance, s_i² in eq. (36), are measured. The output value of the center pixel in the
window is the mean value of the region that has the smallest variance.
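A sketch of the Kuwahara filter for a square window follows; leaving the border pixels (those without a full window) unchanged is our assumption:

```python
import numpy as np

def kuwahara(a, L=1):
    """Kuwahara filter with J = K = 4L + 1; each output pixel becomes the
    mean of whichever of the four overlapping [(J+1)/2]-square regions
    (Figure 29) has the smallest variance."""
    J = 4 * L + 1
    r = (J + 1) // 2           # region size, e.g. 3 for J = 5
    half = J // 2
    out = a.astype(float).copy()
    M, N = a.shape
    for m in range(half, M - half):
        for n in range(half, N - half):
            win = a[m - half:m + half + 1, n - half:n + half + 1].astype(float)
            regions = [win[:r, :r], win[:r, -r:], win[-r:, :r], win[-r:, -r:]]
            variances = [np.var(reg) for reg in regions]
            out[m, n] = np.mean(regions[int(np.argmin(variances))])
    return out
```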
9.4.3 Summary of Smoothing Algorithms
The following table summarizes the various properties of the smoothing algorithms
presented above. The filter size is assumed to be bounded by a rectangle of J × K
where, without loss of generality, J ≤ K. The image size is N × N.
Algorithm   Domain      Type         Support    Separable / Incremental   Complexity/pixel
Uniform     Space       Linear       Square     Y / Y                     O(constant)
Uniform     Space       Linear       Circular   N / Y                     O(K)
Triangle    Space       Linear       Square     Y / N                     O(constant)ª
Triangle    Space       Linear       Circular   N / N                     O(K)ª
Gaussian    Space       Linear       ∞ª         Y / N                     O(constant)ª
Median      Space       Non-Linear   Square     N / Y                     O(K)ª
Kuwahara    Space       Non-Linear   Squareª    N / N                     O(J·K)
Other       Frequency   Linear       --         -- / --                   O(logN)

Table 13: Characteristics of smoothing filters. ªSee text for additional explanation.
Examples of the effect of various smoothing algorithms are shown in Figure 30.
Figure 30: Illustration of various linear and non-linear smoothing filters:
(a) original; (b) uniform 5 × 5; (c) Gaussian (σ = 2.5); (d) median 5 × 5;
(e) Kuwahara 5 × 5.
9.5 DERIVATIVE-BASED OPERATIONS
Just as smoothing is a fundamental operation in image processing so is the ability
to take one or more spatial derivatives of the image. The fundamental problem is
that, according to the mathematical definition of a derivative, this cannot be done. A
digitized image is not a continuous function a(x,y) of the spatial variables but rather
a discrete function a[m,n] of the integer spatial coordinates. As a result the
algorithms we will present can only be seen as approximations to the true spatial
derivatives of the original spatially-continuous image.
Further, as we can see from the Fourier property in eq. (27), taking a derivative
multiplies the signal spectrum by either u or v. This means that high frequency
noise will be emphasized in the resulting image. The general solution to this
problem is to combine the derivative operation with one that suppresses high
frequency noise, in short, smoothing in combination with the desired derivative
operation.
9.5.1 First Derivatives
As an image is a function of two (or more) variables it is necessary to define the
direction in which the derivative is taken. For the two-dimensional case we have the
horizontal direction, the vertical direction, or an arbitrary direction which can be
considered as a combination of the two. If we use hx to denote a horizontal
derivative filter (matrix), hy to denote a vertical derivative filter (matrix), and hθ to
denote the arbitrary angle derivative filter (matrix), then:
[h_θ] = cos θ · [h_x] + sin θ · [h_y]        (102)
· Gradient filters – It is also possible to generate a vector derivative description as
the gradient, ∇a[m,n], of an image:
$$\nabla a = \frac{\partial a}{\partial x}\,\vec{i}_x + \frac{\partial a}{\partial y}\,\vec{i}_y = (h_x \otimes a)\,\vec{i}_x + (h_y \otimes a)\,\vec{i}_y$$   (103)

where i_x and i_y are unit vectors in the horizontal and vertical direction,
respectively. This leads to two descriptions:

Gradient magnitude –   $$|\nabla a| = \sqrt{(h_x \otimes a)^2 + (h_y \otimes a)^2}$$   (104)

and

Gradient direction –   $$\psi(\nabla a) = \arctan\left\{\frac{h_y \otimes a}{h_x \otimes a}\right\}$$   (105)

The gradient magnitude is sometimes approximated by:

Approx. gradient magnitude –   $$|\nabla a| \cong |h_x \otimes a| + |h_y \otimes a|$$   (106)
The final results of these calculations depend strongly on the choices of hx and hy.
A number of possible choices for (hx, hy) will now be described.
· Basic derivative filters – These filters are specified by:

i)   [h_x] = [h_y]^t = [1  −1]
                                                        (107)
ii)  [h_x] = [h_y]^t = [1  0  −1]

where "t" denotes matrix transpose. These two filters differ significantly in their
Fourier magnitude and Fourier phase characteristics. For the frequency range
0 ≤ Ω ≤ π, these are given by:

i)   [h] = [1  −1]      ↔   |H(Ω)| = 2 sin(Ω/2);   φ(Ω) = (π − Ω)/2
                                                        (108)
ii)  [h] = [1  0  −1]   ↔   |H(Ω)| = 2 sin(Ω);     φ(Ω) = π/2
The second form (ii) gives suppression of high frequency terms (Ω ≈ π) while the
first form (i) does not. The first form leads to a phase shift; the second form does
not.
· Prewitt gradient filters – These filters are specified by:
$$[h_x] = \frac{1}{3}\begin{bmatrix}1 & 0 & -1\\ 1 & 0 & -1\\ 1 & 0 & -1\end{bmatrix} = \frac{1}{3}\begin{bmatrix}1\\1\\1\end{bmatrix}\cdot\begin{bmatrix}1 & 0 & -1\end{bmatrix}$$

$$[h_y] = \frac{1}{3}\begin{bmatrix}1 & 1 & 1\\ 0 & 0 & 0\\ -1 & -1 & -1\end{bmatrix} = \frac{1}{3}\begin{bmatrix}1\\0\\-1\end{bmatrix}\cdot\begin{bmatrix}1 & 1 & 1\end{bmatrix}$$   (109)
Both hx and hy are separable. Beyond the computational implications are the
implications for the analysis of the filter. Each filter takes the derivative in one
direction using eq. (107)ii and smoothes in the orthogonal direction using a one-
dimensional version of a uniform filter as described in Section 9.4.1.
· Sobel gradient filters – These filters are specified by:
$$[h_x] = \frac{1}{4}\begin{bmatrix}1 & 0 & -1\\ 2 & 0 & -2\\ 1 & 0 & -1\end{bmatrix} = \frac{1}{4}\begin{bmatrix}1\\2\\1\end{bmatrix}\cdot\begin{bmatrix}1 & 0 & -1\end{bmatrix}$$

$$[h_y] = \frac{1}{4}\begin{bmatrix}1 & 2 & 1\\ 0 & 0 & 0\\ -1 & -2 & -1\end{bmatrix} = \frac{1}{4}\begin{bmatrix}1\\0\\-1\end{bmatrix}\cdot\begin{bmatrix}1 & 2 & 1\end{bmatrix}$$   (110)
Again, hx and hy are separable. Each filter takes the derivative in one direction
using eq. (107)ii and smoothes in the orthogonal direction using a one-dimensional
version of a triangular filter as described in Section 9.4.1.
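For illustration, a numpy sketch of the gradient magnitude, eq. (104), and direction, eq. (105), with the Sobel kernels of eq. (110); clamping the border pixels is our assumption, and the sliding sums below compute a correlation, which for these antisymmetric kernels differs from true convolution only in sign:

```python
import numpy as np

hx = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]]) / 4.0   # eq. (110)
hy = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]]) / 4.0

def sobel_gradient(a):
    a = a.astype(float)
    M, N = a.shape
    p = np.pad(a, 1, mode='edge')                 # clamp the borders
    gx = sum(hx[j, k] * p[j:j + M, k:k + N] for j in range(3) for k in range(3))
    gy = sum(hy[j, k] * p[j:j + M, k:k + N] for j in range(3) for k in range(3))
    magnitude = np.sqrt(gx**2 + gy**2)            # eq. (104)
    direction = np.arctan2(gy, gx)                # eq. (105)
    return magnitude, direction
```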
· Alternative gradient filters – The variety of techniques available from one-
dimensional signal processing for the design of digital filters offers us powerful
tools for designing one-dimensional versions of hx and hy. Using the Parks-
McClellan filter design algorithm, for example, we can choose the frequency bands
where we want the derivative to be taken and the frequency bands where we want
the noise to be suppressed. The algorithm will then produce a real, odd filter with a
minimum length that meets the specifications.
As an example, if we want a filter that has derivative characteristics in a passband
(with weight 1.0) in the frequency range 0.0 ≤ Ω ≤ 0.3π and a stopband (with
weight 3.0) in the range 0.32π ≤ Ω ≤ π, then the algorithm produces the following
optimized seven-sample filter:
$$[h_x] = [h_y]^t = \frac{1}{16348}\begin{bmatrix}-3571 & -8212 & -15580 & 0 & 15580 & 8212 & 3571\end{bmatrix}$$   (111)
The gradient can then be calculated as in eq. (103).
· Gaussian gradient filters – In modern digital image processing one of the most
common techniques is to use a Gaussian filter (see Section 9.4.1) to accomplish the
required smoothing and one of the derivatives listed in eq. (107). Thus, we might
first apply the recursive Gaussian in eq. (97) followed by eq. (107)ii to achieve the
desired, smoothed derivative filters hx and hy. Further, for computational
efficiency, we can combine these two steps as:
w[n] = (B/2)·(a[n+1] − a[n−1]) + (b1·w[n−1] + b2·w[n−2] + b3·w[n−3]) / b0
                                                        (112)
c[n] = B·w[n] + (b1·c[n+1] + b2·c[n+2] + b3·c[n+3]) / b0
where the various coefficients are defined in eq. (99). The first (forward) equation
is applied from n = 0 up to n = N − 1 while the second (backward) equation is
applied from n = N − 1 down to n = 0.
· Summary – Examples of the effect of various derivative algorithms on a noisy
version of Figure 30a (SNR = 29 dB) are shown in Figure 31a-c. The effect of
various magnitude gradient algorithms on Figure 30a is shown in Figure 32a-c.
After processing, all images are contrast stretched as in eq. (77) for display
purposes.
Figure 31: Application of various algorithms for h_x, the horizontal derivative:
(a) simple derivative, eq. (107)ii; (b) Sobel, eq. (110); (c) Gaussian (σ = 1.5) & eq. (107)ii.
Figure 32: Various algorithms for the magnitude gradient, |∇a|:
(a) simple derivative, eq. (107)ii; (b) Sobel, eq. (110); (c) Gaussian (σ = 1.5) & eq. (107)ii.
The magnitude gradient takes on large values where there are strong edges in the
image. Appropriate choice of σ in the Gaussian-based derivative (Figure 31c) or
gradient (Figure 32c) permits computation of virtually any of the other forms:
simple, Prewitt, Sobel, etc. In that sense, the Gaussian derivative represents a
superset of derivative filters.
9.5.2 Second Derivatives
It is, of course, possible to compute higher-order derivatives of functions of two
variables. In image processing, as we shall see in Sections 10.2.1 and 10.3.2, the
second derivatives or Laplacian play an important role. The Laplacian is defined as:
$$\nabla^2 a = \frac{\partial^2 a}{\partial x^2} + \frac{\partial^2 a}{\partial y^2} = (h_{2x} \otimes a) + (h_{2y} \otimes a)$$   (113)
where h_2x and h_2y are second derivative filters. In the frequency domain we have
for the Laplacian filter (from eq. (27)):

$$\nabla^2 a \;\;\overset{F}{\longleftrightarrow}\;\; -\,(u^2 + v^2)\,A(u,v)$$   (114)
The transfer function of a Laplacian corresponds to a parabola H(u,v) = −(u² + v²).
· Basic second derivative filter – This filter is specified by:

[h_2x] = [h_2y]^t = [1  −2  1]        (115)

and the frequency spectrum of this filter, in each direction, is given by:

H(Ω) = F{1  −2  1} = −2·(1 − cos Ω)        (116)
over the frequency range −π ≤ Ω ≤ π. The two, one-dimensional filters can be used
in the manner suggested by eq. (113) or combined into one, two-dimensional filter
as:

$$[h] = \begin{bmatrix}0 & 1 & 0\\ 1 & -4 & 1\\ 0 & 1 & 0\end{bmatrix}$$   (117)
and used as in eq. (84).
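Written out as shifted sums, eq. (117) becomes the following numpy sketch (the clamped border is again our assumption):

```python
import numpy as np

def laplacian(a):
    """3 x 3 Laplacian of eq. (117)."""
    p = np.pad(a.astype(float), 1, mode='edge')
    return (p[:-2, 1:-1] + p[2:, 1:-1] +     # north + south neighbors
            p[1:-1, :-2] + p[1:-1, 2:] -     # west + east neighbors
            4.0 * p[1:-1, 1:-1])             # -4 times the center pixel
```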
· Frequency domain Laplacian – This filter is the implementation of the general
recipe given in eq. (92) and for the Laplacian filter takes the form:

$$c[m,n] = F^{-1}\left\{-\,(\Omega^2 + \Psi^2)\,A(\Omega,\Psi)\right\}$$   (118)
· Gaussian second derivative filter – This is the straightforward extension of the
Gaussian first derivative filter described above and can be applied independently in
each dimension. We first apply Gaussian smoothing with a σ chosen on the basis
of the problem specification. We then apply the desired second derivative filter eq.
(115) or eq. (118). Again there is the choice among the various Gaussian
smoothing algorithms.
For efficiency, we can use the recursive implementation and combine the two
steps--smoothing and derivative operation--as follows:
w[n] = B·(a[n] − a[n−1]) + (b1·w[n−1] + b2·w[n−2] + b3·w[n−3]) / b0
                                                        (119)
c[n] = B·(w[n+1] − w[n]) + (b1·c[n+1] + b2·c[n+2] + b3·c[n+3]) / b0
where the various coefficients are defined in eq. (99). Again, the first (forward)
equation is applied from n = 0 up to n = N ­ 1 while the second (backward)
equation is applied from n = N ­ 1 down to n = 0.
· Alternative Laplacian filters – Again one-dimensional digital filter design
techniques offer us powerful methods to create filters that are optimized for a
specific problem. Using the Parks-McClellan design algorithm, we can choose the
frequency bands where we want the second derivative to be taken and the frequency
bands where we want the noise to be suppressed. The algorithm will then produce a
real, even filter with a minimum length that meets the specifications.
As an example, if we want a filter that has second derivative characteristics in a
passband (with weight 1.0) in the frequency range 0.0 ≤ Ω ≤ 0.3π and a stopband
(with weight 3.0) in the range 0.32π ≤ Ω ≤ π, then the algorithm produces the
following optimized seven-sample filter:
$$[h_{2x}] = [h_{2y}]^t = \frac{1}{11043}\begin{bmatrix}-3448 & 10145 & 1495 & -16383 & 1495 & 10145 & -3448\end{bmatrix}$$   (120)
The Laplacian can then be calculated as in eq. (113).
· SDGD filter – A filter that is especially useful in edge finding and object
measurement is the Second-Derivative-in-the-Gradient-Direction (SDGD) filter.
This filter uses five partial derivatives:
$$A_{xx} = \frac{\partial^2 a}{\partial x^2} \qquad A_{xy} = \frac{\partial^2 a}{\partial x\,\partial y} \qquad A_x = \frac{\partial a}{\partial x}$$
$$A_{yx} = \frac{\partial^2 a}{\partial x\,\partial y} \qquad A_{yy} = \frac{\partial^2 a}{\partial y^2} \qquad A_y = \frac{\partial a}{\partial y}$$   (121)
Note that A_xy = A_yx which accounts for the five derivatives.
This SDGD combines the different partial derivatives as follows:
$$SDGD(a) = \frac{A_{xx}A_x^2 + 2A_{xy}A_xA_y + A_{yy}A_y^2}{A_x^2 + A_y^2}$$   (122)
As one might expect, the large number of derivatives involved in this filter implies
that noise suppression is important and that Gaussian derivative filters--both first
and second order--are highly recommended if not required [28]. It is also
necessary that the first and second derivative filters have essentially the same
passbands and stopbands. This means that if the first derivative filter h_1x is given
by [1 0 −1] (eq. (107)ii) then the second derivative filter should be given by
h_1x ⊗ h_1x = h_2x = [1 0 −2 0 1].
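A sketch of eq. (122) that simply combines the five partials; the derivative operators d1 and d2 are caller-supplied placeholders (for example, the Gaussian first- and second-derivative filters recommended above), and the small epsilon guarding the division in flat regions is our addition:

```python
def sdgd(a, d1, d2, eps=1e-12):
    """Second-Derivative-in-the-Gradient-Direction, eq. (122).
    d1(a, axis) and d2(a, axis) must return the first and second
    derivative of `a` along the given axis (0 = y, 1 = x)."""
    Ax  = d1(a, axis=1)     # da/dx
    Ay  = d1(a, axis=0)     # da/dy
    Axx = d2(a, axis=1)     # d2a/dx2
    Ayy = d2(a, axis=0)     # d2a/dy2
    Axy = d1(Ax, axis=0)    # d2a/dxdy, using Axy = Ayx
    return (Axx * Ax**2 + 2 * Axy * Ax * Ay + Ayy * Ay**2) / (Ax**2 + Ay**2 + eps)
```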
· Summary – The effects of the various second derivative filters are illustrated in
Figure 33a-e. All images were contrast stretched for display purposes using eq.
(78) and the parameters 1% and 99%.
Figure 33: Various algorithms for the Laplacian and Laplacian-related filters:
(a) Laplacian, eq. (117); (b) Fourier parabola, eq. (118); (c) Gaussian (σ = 1.0) & eq. (117);
(d) SDGD (σ = 1.0), eq. (122); (e) "Designer", eq. (120).
9.5.3 Other Filters
An infinite number of filters, both linear and non-linear, are possible for image
processing. It is therefore impossible to describe more than the basic types in this
section. The description of others can be found in the reference literature (see
Section 11) as well as in the applications literature. It is important to use a small
consistent set of test images that are relevant to the application area to understand
the effect of a given filter or class of filters. The effect of filters on images can be
frequently understood by the use of images that have pronounced regions of
varying sizes to visualize the effect on edges or by the use of test patterns such as
sinusoidal sweeps to visualize the effects in the frequency domain. The former have
been used above (Figures 21, 23, and 30­33) and the latter are demonstrated below
in Figure 34.
Figure 34: Various convolution algorithms applied to a sinusoidal test image:
(a) lowpass filter; (b) bandpass filter; (c) highpass filter.
9.6 MORPHOLOGY-BASED OPERATIONS
In Section 1 we defined an image as an (amplitude) function of two, real
(coordinate) variables a(x,y) or two, discrete variables a[m,n]. An alternative
definition of an image can be based on the notion that an image consists of a set (or
collection) of either continuous or discrete coordinates. In a sense the set
corresponds to the points or pixels that belong to the objects in the image. This is
illustrated in Figure 35 which contains two objects or sets A and B. Note that the
coordinate system is required. For the moment we will consider the pixel values to
be binary as discussed in Sections 2.1 and 9.2.1. Further we shall restrict our
discussion to discrete space (Z²). More general discussions can be found in [6, 7,
29].
Figure 35: A binary image containing two object sets A and B.
The object A consists of those pixels α that share some common property:
Object –   A = {α | property(α) == TRUE}        (123)
As an example, object B in Figure 35 consists of {[0,0], [1,0], [0,1]}.
The background of A is given by Ac (the complement of A) which is defined as
those elements that are not in A:
Background –   A^c = {α | α ∉ A}        (124)
In Figure 3 we introduced the concept of neighborhood connectivity. We now
observe that if an object A is defined on the basis of C-connectivity (C = 4, 6, or 8)
then the background A^c has a connectivity given by 12 − C. The necessity for this is
illustrated for the Cartesian grid in Figure 36.
Figure 36: A binary image requiring careful definition of object and
background connectivity.
9.6.1 Fundamental definitions
The fundamental operations associated with an object are the standard set
operations union, intersection, and complement {∪, ∩, c} plus translation:
· Translation – Given a vector x and a set A, the translation, A + x, is defined as:

A + x = {α + x | α ∈ A}        (125)
Note that, since we are dealing with a digital image composed of pixels at integer
coordinate positions (Z²), this implies restrictions on the allowable translation
vectors x.
The basic Minkowski set operations--addition and subtraction--can now be
defined. First we note that the individual elements that comprise B are not only
pixels but also vectors as they have a clear coordinate position with respect to [0,0].
Given two sets A and B:
Minkowski addition –   A ⊕ B = ⋃_{β∈B} (A + β)        (126)

Minkowski subtraction –   A ⊖ B = ⋂_{β∈B} (A + β)        (127)
9.6.2 Dilation and Erosion
From these two Minkowski operations we define the fundamental mathematical
morphology operations dilation and erosion:
Dilation –   D(A,B) = A ⊕ B = ⋃_{β∈B} (A + β)        (128)

Erosion –   E(A,B) = A ⊖ B̃ = ⋂_{β∈B} (A − β)        (129)

where B̃ = {−β | β ∈ B}. These two operations are illustrated in Figure 37 for the
objects defined in Figure 35.
Figure 37: (a) Dilation D(A,B) and (b) erosion E(A,B) of the objects defined in
Figure 35. The three pixels in B are "color-coded" as is their effect in the result.
While either set A or B can be thought of as an "image", A is usually considered as
the image and B is called a structuring element. The structuring element is to
mathematical morphology what the convolution kernel is to linear filter theory.
Dilation, in general, causes objects to dilate or grow in size; erosion causes objects
to shrink. The amount and the way that they grow or shrink depend upon the choice
of the structuring element. Dilating or eroding without specifying the structuring
element makes no more sense than trying to lowpass filter an image without
specifying the filter. The two most common structuring elements (given a Cartesian
grid) are the 4-connected and 8-connected sets, N4 and N8. They are illustrated in
Figure 38.
Figure 38: The standard structuring elements (a) N4 and (b) N8.
Dilation and erosion have the following properties:
Commutative –   D(A,B) = A ⊕ B = B ⊕ A = D(B,A)        (130)

Non-Commutative –   E(A,B) ≠ E(B,A)        (131)

Associative –   A ⊕ (B ⊕ C) = (A ⊕ B) ⊕ C        (132)

Translation Invariance –   A ⊕ (B + x) = (A ⊕ B) + x        (133)

Duality –   D^c(A,B) = E(A^c, B̃)  and  E^c(A,B) = D(A^c, B̃)        (134)
With A as an object and Ac as the background, eq. (134) says that the dilation of an
object is equivalent to the erosion of the background. Likewise, the erosion of the
object is equivalent to the dilation of the background.
Except for special cases:
Non-Inverses –   D(E(A,B), B) ≠ A ≠ E(D(A,B), B)        (135)

Erosion has the following translation property:

Translation Invariance –   A ⊖ (B + x) = (A + x) ⊖ B = (A ⊖ B) + x        (136)
Dilation and erosion have the following important properties. For any arbitrary
structuring element B and two image objects A1 and A2 such that A1 ⊂ A2 (A1 is a
proper subset of A2):

Increasing in A –   D(A1, B) ⊂ D(A2, B)  and  E(A1, B) ⊂ E(A2, B)        (137)
For two structuring elements B1 and B2 such that B1 ⊆ B2:
Decreasing in B –   E(A, B1) ⊇ E(A, B2)        (138)
The decomposition theorems below make it possible to find efficient
implementations for morphological filters.
Dilation –   A ⊕ (B ∪ C) = (A ⊕ B) ∪ (A ⊕ C) = (B ∪ C) ⊕ A        (139)

Erosion –   A ⊖ (B ∪ C) = (A ⊖ B) ∩ (A ⊖ C)        (140)

Erosion –   (A ⊖ B) ⊖ C = A ⊖ (B ⊕ C)        (141)

Multiple Dilations –   nB = (B ⊕ B ⊕ ⋯ ⊕ B),  n times        (142)
An important decomposition theorem is due to Vincent [30]. First, we require
some definitions. A convex set (in R²) is one for which the straight line joining any
two points in the set consists of points that are also in the set. Care must obviously
be taken when applying this definition to discrete pixels as the concept of a "straight
line" must be interpreted appropriately in Z2. A set is bounded if each of its
elements has a finite magnitude, in this case distance to the origin of the coordinate
system. A set is symmetric if B = B̃. The sets N4 and N8 in Figure 38 are examples
of convex, bounded, symmetric sets.
Vincent's theorem, when applied to an image consisting of discrete pixels, states
that for a bounded, symmetric structuring element B that contains no holes and
contains its own center, [0,0] ∈ B:

D(A,B) = A ⊕ B = A ∪ (∂A ⊕ B)        (143)

where ∂A is the contour of the object. That is, ∂A is the set of pixels that have a
background pixel as a neighbor. The implication of this theorem is that it is not
necessary to process all the pixels in an object in order to compute a dilation or
(using eq. (134)) an erosion. We only have to process the boundary pixels. This
also holds for all operations that can be derived from dilations and erosions. The
processing of boundary pixels instead of object pixels means that, except for
pathological images, computational complexity can be reduced from O(N²) to O(N)
for an N × N image. A number of "fast" algorithms can be found in the literature
that are based on this result [30-32]. The simplest dilation and erosion algorithms
are frequently described as follows.
· Dilation – Take each binary object pixel (with value "1") and set all background
pixels (with value "0") that are C-connected to that object pixel to the value "1".
· Erosion – Take each binary object pixel (with value "1") that is C-connected to a
background pixel and set the object pixel value to "0".
Comparison of these two procedures to eq. (143) where B = NC=4 or NC=8 shows
that they are equivalent to the formal definitions for dilation and erosion. The
procedure is illustrated for dilation in Figure 39.
Figure 39: Illustration of dilation with (a) B = N4 and (b) B = N8. Original object
pixels are in gray; pixels added through dilation are in black.
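A sketch of eqs. (128) and (129) for binary images, with the structuring element given as a list of coordinate offsets and with "everything outside the image is 0" as the boundary choice (see Section 9.6.3):

```python
import numpy as np

def shift(a, dj, dk):
    """Translate a binary image by (dj, dk), filling with 0."""
    out = np.zeros_like(a)
    M, N = a.shape
    src = a[max(-dj, 0):M - max(dj, 0), max(-dk, 0):N - max(dk, 0)]
    out[max(dj, 0):M - max(-dj, 0), max(dk, 0):N - max(-dk, 0)] = src
    return out

def dilate(a, b_offsets):
    out = np.zeros_like(a)
    for dj, dk in b_offsets:
        out |= shift(a, dj, dk)       # union of translates A + beta, eq. (128)
    return out

def erode(a, b_offsets):
    out = np.ones_like(a)
    for dj, dk in b_offsets:
        out &= shift(a, -dj, -dk)     # intersection of translates A - beta, eq. (129)
    return out

# The two standard structuring elements of Figure 38:
N4 = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]
N8 = [(dj, dk) for dj in (-1, 0, 1) for dk in (-1, 0, 1)]
```

Here `a` is a boolean array and each offset pair is one pixel of B relative to [0,0].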
9.6.3 Boolean Convolution
An arbitrary binary image object (or structuring element) A can be represented as:
$$A \;\leftrightarrow\; \sum_{k=-\infty}^{+\infty}\;\sum_{j=-\infty}^{+\infty} a[j,k]\cdot\delta[m-j,n-k]$$   (144)
where Σ and · are the Boolean operations OR and AND as defined in eqs. (81) and
(82), a[j,k] is a characteristic function that takes on the Boolean values "1" and "0"
as follows:
$$a[j,k] = \begin{cases}1 & a \in A\\ 0 & a \notin A\end{cases}$$   (145)
and δ[m,n] is a Boolean version of the Dirac delta function that takes on the
Boolean values "1" and "0" as follows:
$$\delta[j,k] = \begin{cases}1 & j = k = 0\\ 0 & \text{otherwise}\end{cases}$$   (146)
Dilation for binary images can therefore be written as:
$$D(A,B) = \sum_{k=-\infty}^{+\infty}\;\sum_{j=-\infty}^{+\infty} a[j,k]\cdot b[m-j,n-k] = a \otimes b$$   (147)
which, because Boolean OR and AND are commutative, can also be written as
$$D(A,B) = \sum_{k=-\infty}^{+\infty}\;\sum_{j=-\infty}^{+\infty} a[m-j,n-k]\cdot b[j,k] = b \otimes a = D(B,A)$$   (148)
Using De Morgan's theorem:
$$\overline{(a + b)} = \bar{a}\cdot\bar{b} \qquad\text{and}\qquad \overline{(a\cdot b)} = \bar{a} + \bar{b}$$   (149)
on eq. (148) together with eq. (134), erosion can be written as:
$$E(A,B) = \prod_{k=-\infty}^{+\infty}\;\prod_{j=-\infty}^{+\infty}\left(a[m-j,n-k] + \bar{b}[-j,-k]\right)$$   (150)
Thus, dilation and erosion on binary images can be viewed as a form of
convolution over a Boolean algebra.
In Section 9.3.2 we saw that, when convolution is employed, an appropriate choice
of the boundary conditions for an image is essential. Dilation and erosion--being a
Boolean convolution--are no exception. The two most common choices are that
either everything outside the binary image is "0" or everything outside the binary
image is "1".
9.6.4 Opening and Closing
We can combine dilation and erosion to build two important higher order
operations:
Opening –   O(A,B) = A ∘ B = D(E(A,B), B)        (151)

Closing –   C(A,B) = A • B = E(D(A,−B), −B)        (152)
The opening and closing have the following properties:
Duality –   C^c(A,B) = O(A^c, B)  and  O^c(A,B) = C(A^c, B)        (153)

Translation –   O(A + x, B) = O(A,B) + x  and  C(A + x, B) = C(A,B) + x        (154)
For the opening with structuring element B and images A, A1, and A2, where A1 is
a subimage of A2 (A1 ⊆ A2):
Antiextensivity –   O(A,B) ⊆ A        (155)

Increasing monotonicity –   O(A1,B) ⊆ O(A2,B)        (156)

Idempotence –   O(O(A,B), B) = O(A,B)        (157)
For the closing with structuring element B and images A, A1, and A2, where A1 is a
subimage of A2 (A1 ⊆ A2):
Extensivity –   A ⊆ C(A,B)        (158)

Increasing monotonicity –   C(A1,B) ⊆ C(A2,B)        (159)

Idempotence –   C(C(A,B), B) = C(A,B)        (160)
The two properties given by eqs. (155) and (158) are so important to mathematical
morphology that they can be considered as the reason for defining erosion with B̃
instead of B in eq. (129).
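Building on the dilate/erode sketch in Section 9.6.2, the opening and closing follow directly; for point-symmetric structuring elements such as N4 and N8 we have B = −B, so the reflection in eq. (152) can be ignored here:

```python
def opening(a, b_offsets):
    return dilate(erode(a, b_offsets), b_offsets)   # O(A,B) = D(E(A,B), B), eq. (151)

def closing(a, b_offsets):
    return erode(dilate(a, b_offsets), b_offsets)   # C(A,B) = E(D(A,B), B), eq. (152)
```

The idempotence properties, eqs. (157) and (160), provide a simple check: applying either function twice must give the same result as applying it once.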
9.6.5 Hit­and­Miss operation
The hit-or-miss operator was defined by Serra but we shall refer to it as the hit-
and-miss operator and define it as follows. Given an image A and two structuring
elements B1 and B2, the set definition and Boolean definition are:
Hit-and-Miss –   HitMiss(A, B1, B2) = E(A, B1) ∩ E(A^c, B2) = E(A, B1) · E(Ā, B2)        (161)
where B1 and B2 are bounded, disjoint structuring elements. (Note the use of the
notation from eq. (81).) Two sets are disjoint if B1 ∩ B2 = ∅, the empty set. In an
important sense the hit-and-miss operator is the morphological equivalent of
template matching, a well-known technique for matching patterns based upon
cross-correlation. Here, we have a template B1 for the object and a template B2 for
the background.
9.6.6 Summary of the basic operations
The results of the application of these basic operations on a test image are illustrated
below. In Figure 40 the various structuring elements used in the processing are
defined. The value "−" indicates a "don't care". All three structuring elements are
symmetric.
B = N8 =  1 1 1        B1 =  − 1 −        B2 =  1 − 1
          1 1 1              1 1 1              − − −
          1 1 1              − 1 −              1 − 1

(a)                    (b)                (c)
Figure 40: Structuring elements B, B1, and B2 that are 3 × 3 and symmetric.
The results of processing are shown in Figure 41 where the binary value "1" is
shown in black and the value "0" in white.
Figure 41: Examples of various mathematical morphology operations:
(a) image A; (b) dilation with 2B; (c) erosion with 2B; (d) opening with 2B;
(e) closing with 2B; (f) 8-c contour: A − E(A, N8).
The opening operation can separate objects that are connected in a binary image.
The closing operation can fill in small holes. Both operations generate a certain
amount of smoothing on an object contour given a "smooth" structuring element.
The opening smoothes from the inside of the object contour and the closing
smoothes from the outside of the object contour. The hit-and-miss example has
found the 4-connected contour pixels. An alternative method to find the contour is
simply to use the relation:
4-connected contour –   ∂A = A − E(A, N8)        (162)

or
8-connected contour –   ∂A = A − E(A, N4)        (163)
9.6.7 Skeleton
The informal definition of a skeleton is a line representation of an object that is:
i)   one pixel thick,
ii)  through the "middle" of the object, and        (164)
iii) preserves the topology of the object.
These are not always realizable. Figure 42 shows why this is the case.
Figure 42: Counterexamples to the three requirements.
In the first example, Figure 42a, it is not possible to generate a line that is one pixel
thick and in the center of an object while generating a path that reflects the
simplicity of the object. In Figure 42b it is not possible to remove a pixel from the
8-connected object and simultaneously preserve the topology--the notion of
connectedness--of the object. Nevertheless, there are a variety of techniques that
attempt to achieve this goal and to produce a skeleton.
A basic formulation is based on the work of Lantuéjoul [33]. The skeleton subset
Sk(A) is defined as:
Skeleton subsets –   S_k(A) = E(A, kB) − [E(A, kB) ∘ B],   k = 0, 1, …, K        (165)

where K is the largest value of k before the set S_k(A) becomes empty. (From eq.
(156), E(A, kB) ∘ B ⊆ E(A, kB).) The structuring element B is chosen (in Z²) to
approximate a circular disc, that is, convex, bounded and symmetric. The skeleton
is then the union of the skeleton subsets:
Skeleton –   S(A) = ⋃_{k=0}^{K} S_k(A)        (166)
An elegant side effect of this formulation is that the original object can be
reconstructed given knowledge of the skeleton subsets Sk(A), the structuring
element B, and K:
Reconstruction –   A = ⋃_{k=0}^{K} (S_k(A) ⊕ kB)        (167)
This formulation for the skeleton, however, does not preserve the topology, a
requirement described in eq. (164).
An alternative point-of-view is to implement a thinning, an erosion that reduces the
thickness of an object without permitting it to vanish. A general thinning algorithm
is based on the hit-and-miss operation:
Thin( A, B1, B2 ) = A - HitMiss(  A, B1, B2 )
Thinning ­
(168)
Depending on the choice of B1 and B2, a large variety of thinning algorithms--and
through repeated application skeletonizing algorithms--can be implemented.
A quite practical implementation can be described in another way. If we restrict
ourselves to a 3 × 3 neighborhood, similar to the structuring element B = N8 in
Figure 40a, then we can view the thinning operation as a window that repeatedly
scans over the (binary) image and sets the center pixel to "0" under certain
conditions. The center pixel is not changed to "0" if and only if:
i)   an isolated pixel is found (e.g. Figure 43a),
ii)  removing a pixel would change the connectivity (e.g. Figure 43b),        (169)
iii) removing a pixel would shorten a line (e.g. Figure 43c).
As pixels are (potentially) removed in each iteration, the process is called a
conditional erosion. Three test cases of eq. (169) are illustrated in Figure 43. In
general all possible rotations and variations have to be checked. As there are only
512 possible combinations for a 3 × 3 window on a binary image, this can be done
easily with the use of a lookup table.
Figure 43: Test conditions for conditional erosion of the center pixel: (a) isolated pixel, (b) connectivity pixel, (c) end pixel.
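A minimal sketch of the lookup-table mechanism, assuming NumPy, is given below. The removability predicate is deliberately left as a parameter: a complete implementation must encode every rotation and variation of conditions (i) through (iii) of eq. (169), which is exactly what the 512-entry table makes cheap to evaluate.

    import numpy as np

    def build_lut(is_removable):
        # one boolean entry for each of the 512 possible 3 x 3 patterns
        lut = np.zeros(512, dtype=bool)
        for code in range(512):
            nb = np.array([(code >> bit) & 1 for bit in range(9)]).reshape(3, 3)
            lut[code] = is_removable(nb)
        return lut

    def thin_pass(img, lut):
        # one scan of conditional erosion; repeat until nothing changes
        weights = (1 << np.arange(9)).reshape(3, 3)
        padded = np.pad(img.astype(np.uint16), 1)
        out = img.copy()
        for m in range(img.shape[0]):
            for n in range(img.shape[1]):
                if img[m, n] and lut[int((padded[m:m+3, n:n+3] * weights).sum())]:
                    out[m, n] = 0
        return out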
If only condition (i) is used then each object will be reduced to a single pixel. This
is useful if we wish to count the number of objects in an image. If only condition
(ii) is used then holes in the objects will be found. If conditions (i + ii) are used
each object will be reduced to either a single pixel if it does not contain a hole or to
closed rings if it does contain holes. If conditions (i + ii + iii) are used then the
"complete skeleton" will be generated as an approximation to eq. (56). Illustrations
of these various possibilities are given in Figure 44a,b.
9.6.8 Propagation
It is convenient to be able to reconstruct an image that has "survived" several
erosions or to fill an object that is defined, for example, by a boundary. The formal
mechanism for this has several names including region-filling, reconstruction, and
propagation. The formal definition is given by the following algorithm. We start
with a seed image S(0), a mask image A, and a structuring element B. We then use
dilations of S with structuring element B and masked by A in an iterative procedure
as follows:
S(k) = [S(k−1) ⊕ B] ∩ A    until S(k) = S(k−1)        Iteration k        (170)
With each iteration the seed image grows (through dilation) but within the set
(object) defined by A; S propagates to fill A. The most common choices for B are
N4 or N8. Several remarks are central to the use of propagation. First, in a
straightforward implementation, as suggested by eq. (170), the computational costs
are extremely high. Each iteration requires O(N²) operations for an N × N image and with the required number of iterations this can lead to a complexity of O(N³). Fortunately, a recursive implementation of the algorithm exists in which one or two passes through the image are usually sufficient, meaning a complexity of O(N²).
Second, although we have not paid much attention to the issue  of
object/background connectivity until now (see Figure 36), it is essential that the
connectivity implied by B be matched to the connectivity associated with the
boundary definition of A (see eqs. (162) and (163)). Finally, as mentioned earlier, it
is important to make the correct choice ("0" or "1") for the boundary condition of
the image. The choice depends upon the application.
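A direct, if slow, rendering of eq. (170) appears below, assuming NumPy and SciPy; fast recursive variants exist in libraries (e.g. skimage.morphology.reconstruction) and should be preferred in practice.

    import numpy as np
    from scipy import ndimage

    def propagate(seed, mask, structure):
        # eq. (170): S(k) = [S(k-1) dilated by B] AND A, until stable
        current = seed & mask
        while True:
            grown = ndimage.binary_dilation(current, structure=structure) & mask
            if np.array_equal(grown, current):
                return current
            current = grown

Choosing structure = N4 or N8 here is what matches the connectivity of B to the boundary definition of A, as required above.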
9.6.9 Summary of skeleton and propagation
The application of these two operations on a test image is illustrated in Figure 44. In Figure 44a,b the skeleton operation is shown with the end-pixel condition (eq. (169) i+ii+iii) and without the end-pixel condition (eq. (169) i+ii). The propagation operation is illustrated in Figure 44c. The original image, shown in light gray, was eroded by E(A, 6N8) to produce the seed image shown in black. The original was then used as the mask image to produce the final result. The border value in both images was "0".
Several techniques based upon the use of skeleton and propagation operations in
combination with other mathematical morphology operations will be given in
Section 10.3.3.
Figure 44: Examples of skeleton and propagation. a) Skeleton with end pixels (condition eq. (169) i+ii+iii); b) skeleton without end pixels (condition eq. (169) i+ii); in both, original = light gray, skeleton = black. c) Propagation with N8; mask = light gray, seed = black.
9.6.10 Gray-value morphological processing
The techniques of morphological filtering can be extended to gray-level images. To
simplify matters we will restrict our presentation to structuring elements, B, that
comprise a finite number of pixels and are convex and bounded. Now, however,
the structuring element has gray values associated with every coordinate position as
does the image A.
· Gray-level dilation, DG(·), is given by:
DG(A, B) = max_{[j,k]∈B} {a[m−j, n−k] + b[j,k]}        Dilation        (171)
For a given output coordinate [m,n], the structuring element is summed with a shifted version of the image and the maximum encountered over all shifts within the J × K domain of B is used as the result. Should the shifting require values of the image A that are outside the M × N domain of A, then a decision must be made as to which model for image extension, as described in Section 9.3.2, should be used.
· Gray-level erosion, EG(·), is given by:

EG(A, B) = min_{[j,k]∈B} {a[m+j, n+k] − b[j,k]}        Erosion        (172)
The duality between gray-level erosion and gray-level dilation--the gray-level counterpart of eq. (134)--is:

EG(A, B) = −DG(−A, B̃)
DG(A, B) = −EG(−A, B̃)        Duality        (173)

where "Ã" means that a[j,k] → a[−j,−k] and "−A" means that a[j,k] → −a[j,k].
The definitions of higher order operations such as gray-level opening and gray-level
closing are:
OG(A, B) = DG(EG(A, B), B)        Opening        (174)

CG(A, B) = EG(DG(A, B̃), B̃)        Closing        (175)
The duality between gray-level opening and gray-level closing is:
OG(A, B) = −CG(−A, B)
CG(A, B) = −OG(−A, B)        Duality        (176)
The important properties that were discussed earlier such as idempotence,
translation invariance, increasing in A, and so forth are also applicable to gray level
morphological processing. The details can be found in Giardina and Dougherty [6].
In many situations the seeming complexity of gray-level morphological processing is significantly reduced through the use of symmetric structuring elements where b[j,k] = b[−j,−k]. The most common of these is based on the use of B = constant = 0. For this important case, and using again the domain [j,k] ∈ B, the definitions above reduce to:
DG(A, B) = max_{[j,k]∈B} {a[m−j, n−k]} = max_B(A)        Dilation        (177)
EG(A, B) = min_{[j,k]∈B} {a[m−j, n−k]} = min_B(A)        Erosion        (178)
OG(A, B) = max_B(min_B(A))        Opening        (179)

CG(A, B) = min_B(max_B(A))        Closing        (180)
The remarkable conclusion is that the maximum filter and the minimum filter,
introduced in Section 9.4.2, are gray-level dilation and gray-level erosion for the
specific structuring element given by the shape of the filter window with the gray
value "0" inside the window. Examples of these operations on a simple one-
dimensional signal are shown in Figure 45.
Figure 45: Morphological filtering of gray-level data. a) Effect of a 15 × 1 dilation and erosion on a one-dimensional "original" signal; b) effect of a 15 × 1 opening and closing.
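For the flat (B = constant = 0) case, these operations coincide with SciPy's grey-scale morphology routines. A sketch reproducing the spirit of Figure 45 on a synthetic 1-D signal (the signal itself is invented for illustration):

    import numpy as np
    from scipy import ndimage

    x = np.linspace(0, 6 * np.pi, 200)
    signal = 120 + 60 * np.sin(x) + np.random.default_rng(0).normal(0, 10, 200)

    dilated = ndimage.grey_dilation(signal, size=15)   # eq. (177), max filter
    eroded  = ndimage.grey_erosion(signal, size=15)    # eq. (178), min filter
    opened  = ndimage.grey_opening(signal, size=15)    # eq. (179), max(min(A))
    closed  = ndimage.grey_closing(signal, size=15)    # eq. (180), min(max(A))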
For a rectangular window, J × K, the two-dimensional maximum or minimum
filter is separable into two, one-dimensional windows. Further, a one-dimensional
maximum or minimum filter can be written in incremental form. (See Section
9.3.2.) This means that gray-level dilations and erosions have a computational
complexity per pixel that is O(constant), that is, independent of J and K. (See also
Table 13.)
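One common way to obtain the incremental, O(constant)-per-pixel behavior cited above is a monotonic queue; the sketch below is a generic version of that idea, not necessarily the specific scheme of Section 9.3.2.

    from collections import deque

    def sliding_max(a, width):
        # 1-D maximum filter; each sample enters and leaves the deque
        # at most once, so the cost per pixel is independent of width
        out, dq = [], deque()
        for i, v in enumerate(a):
            while dq and a[dq[-1]] <= v:
                dq.pop()                   # drop dominated samples
            dq.append(i)
            if dq[0] <= i - width:
                dq.popleft()               # drop samples outside the window
            if i >= width - 1:
                out.append(a[dq[0]])       # front of deque = window maximum
        return out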
The operations defined above can be used to produce morphological algorithms for
smoothing, gradient determination and a version of the Laplacian. All are
constructed from the primitives for gray-level dilation and gray-level erosion and in
all cases the maximum and minimum filters are taken over the domain [j,k] ∈ B.
9.6.11 Morphological smoothing
This algorithm is based on the observation that a gray-level opening smoothes a
gray-value image from above the brightness surface given by the function a[m,n]
and the gray-level closing smoothes from below. We use a structuring element B
based on eqs. (84) and (178).
MorphSmooth(A, B) = CG(OG(A, B), B)
                  = min(max(max(min(A))))        (181)
Note that we have suppressed the notation for the structuring element B under the
max and min operations to keep the notation simple. Its use, however, is
understood.
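With the flat grey-scale operators above, eq. (181) reduces to two library calls; a minimal sketch, assuming a J × K window given as size:

    from scipy import ndimage

    def morph_smooth(a, size=(3, 3)):
        # eq. (181): grey-level closing of the grey-level opening
        return ndimage.grey_closing(ndimage.grey_opening(a, size=size), size=size)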
9.6.12 Morphological gradient
For linear filters the gradient filter yields a vector representation (eq. (103)) with a
magnitude (eq. (104)) and direction (eq. (105)). The version presented here
generates a morphological estimate of the gradient magnitude:
Gradient(A, B) = (1/2)·(DG(A, B) − EG(A, B))
               = (1/2)·(max(A) − min(A))        (182)
9.6.13 Morphological Laplacian
The morphologically-based Laplacian filter is defined by:
Laplacian(A, B) = (1/2)·((DG(A, B) − A) − (A − EG(A, B)))
                = (1/2)·(DG(A, B) + EG(A, B) − 2A)        (183)
                = (1/2)·(max(A) + min(A) − 2A)
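Eqs. (182) and (183) follow the same pattern; a minimal sketch using flat windows and floating-point arithmetic:

    from scipy import ndimage

    def morph_gradient(a, size=(3, 3)):
        # eq. (182): half the difference between local max and local min
        a = a.astype(float)
        return 0.5 * (ndimage.grey_dilation(a, size=size)
                      - ndimage.grey_erosion(a, size=size))

    def morph_laplacian(a, size=(3, 3)):
        # eq. (183): half of (local max + local min - 2a)
        a = a.astype(float)
        return 0.5 * (ndimage.grey_dilation(a, size=size)
                      + ndimage.grey_erosion(a, size=size) - 2 * a)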
9.6.14 Summary of morphological filters
The effect of these filters is illustrated in Figure 46. All images were processed with
a 3 × 3 structuring element as described in eqs. (177) through (183). Figure 46e
was contrast stretched for display purposes using eq. (78) and the parameters 1%
and 99%. Figures 46c,d,e should be compared to Figures 30, 32, and 33.