MTH001
Elementary Mathematics
LECTURE #
22
WHAT IS
STATISTICS?
·
That
science which enables us to
draw conclusions about
various phenomena on
the
basis of real data collected
on sample-basis
·
A
tool for data-based
research
·
Also
known as Quantitative
Analysis
·
Many applications in a wide variety of disciplines: Agriculture, Anthropology, Astronomy, Biology, Economics, Engineering, Environment, Geology, Genetics, Medicine, Physics, Psychology, Sociology, Zoology ... virtually every single subject from Anthropology to Zoology, A to Z!
·
In any scientific enquiry in which you would like to base your conclusions and decisions on real-life data, you need to employ statistical techniques!
·
Nowadays, in the developed countries of the world, there is an active movement for Statistical Literacy.
THE NATURE OF THIS DISCIPLINE:
Descriptive Statistics, Probability, Inferential Statistics
MEANINGS OF "STATISTICS":
The word "Statistics", which comes from the Latin word status, meaning a political state, originally meant information useful to the state, for example, information about the sizes of populations and armed forces. But this word has now acquired different meanings.
·
In
the first
place, the
word statistics
refers to
"numerical facts
systematically
arranged".
In this sense, the word
statistics is always used in
plural. We have, for
instance,
statistics of prices, statistics of
road accidents, statistics of
crimes, statistics
of
births, statistics of educational
institutions, etc. In all
these examples, the
word
statistics
denotes a set of numerical
data in the respective
fields. This is the
meaning
the
man in the street gives to
the word Statistics
and
most people usually use
the
word
data
instead.
·
In the second place, the word statistics is defined as a discipline that includes procedures and techniques used to collect, process and analyze numerical data to make inferences and to reach decisions in the face of uncertainty. It should of course be borne in mind that uncertainty does not imply ignorance but refers to the incompleteness and the instability of the data available. In this sense, the word statistics is used in the singular. As it embodies more or less all stages of the general process of learning, sometimes called the scientific method, statistics is characterized as a science. Thus the word statistics used in the plural refers to a set of numerical information and, in the singular, denotes the science of basing decisions on numerical data. It should be noted that statistics as a subject is mathematical in character.
·
Thirdly, the word statistics refers to numerical quantities calculated from sample observations; a single quantity that has been so calculated is called a statistic. The mean of a sample, for instance, is a statistic. The word statistics is plural when used in this sense.
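As a small illustration of this third meaning, the sketch below computes two statistics, the mean and the range, from a sample. The sample values and the function names are invented here for illustration only.

```python
def sample_mean(observations):
    """Return the arithmetic mean of a non-empty sample."""
    return sum(observations) / len(observations)


def sample_range(observations):
    """Return the difference between the largest and smallest observation."""
    return max(observations) - min(observations)


# Hypothetical sample: heights (in cm) of five persons.
heights_cm = [160, 172, 168, 181, 159]

mean_height = sample_mean(heights_cm)    # a statistic
range_height = sample_range(heights_cm)  # another statistic

print(mean_height, range_height)
```

Each value printed is a single statistic; taken together they are statistics in the plural sense described above.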
CHARACTERISTICS
OF THE
SCIENCE
OF STATISTICS:
Statistics
is a discipline in its own
right. It would therefore be
desirable to know the
characteristic
features of statistics in order to
appreciate and understand
its general nature.
Some
of its important characteristics
are given below:
i)
Statistics
deals with the behaviour of
aggregates or large groups of
data. It
has
nothing to do with what is
happening to a particular individual or
object of
the
aggregate.
ii)
Statistics
deals with aggregates of
observations of the same
kind rather than
isolated
figures.
iii)
Statistics
deals with variability that
obscures underlying patterns. No
two
objects
in this universe are exactly
alike. If they were, there
would have been
no
statistical problem.
iv)
Statistics
deals with uncertainties as
every process of getting
observations
whether
controlled or uncontrolled, involves
deficiencies or chance
variation.
That
is why we have to talk in
terms of probability.
v)
Statistics
deals with those
characteristics or aspects of things
which can be
described
numerically either by counts or by
measurements.
vi)
Statistics
deals with those aggregates
which are subject to a
number of
random
causes, e.g.
the
heights of persons are
subject to a number of
causes
such as race, ancestry, age,
diet, habits, climate and so
forth.
vii)
Statistical laws are valid on the average or in the long run. There is no guarantee that a certain law will hold in all cases. Statistical inference is therefore made in the face of uncertainty.
viii)
Statistical results might be misleading or incorrect if sufficient care in collecting, processing and interpreting the data is not exercised or if the statistical data are handled by a person who is not well versed in the subject matter of statistics.
THE
WAY IN WHICH STATISTICS
WORKS:
As
it is such an important area of
knowledge, it is definitely useful to
have a fairly good
idea
about
the way in which it works,
and this is exactly the
purpose of this introductory
course.
The
following points indicate
some of the main functions
of this science:
·
Statistics assists in summarizing large sets of data in a form that is easily understandable.
·
Statistics
assists in the efficient
design of laboratory and
field experiments as well
as
surveys.
·
Statistics assists in sound and effective planning in any field of inquiry.
·
Statistics
assists in drawing general
conclusions and in making
predictions of how
much
of
a thing will happen under
given conditions.
IMPORTANCE
OF STATISTICS
IN
VARIOUS FIELDS
As stated earlier, Statistics is a discipline that finds application in the most diverse fields of activity. It is perhaps a subject that should be used by everybody. Statistical techniques, being powerful tools for analyzing numerical data, are used in almost every branch of learning. In all areas, statistical techniques are being increasingly used, and are developing very rapidly.
·
A modern administrator, whether in the public or private sector, leans on statistical data to provide a factual basis for decisions.
·
A
politician uses statistics
advantageously to lend support
and credence to his
arguments
while elucidating the
problems he handles.
·
A businessman, an industrialist and a research worker all employ statistical methods in their work. Banks, insurance companies and governments all have their statistics departments.
·
A
social scientist uses
statistical methods in various
areas of socio-economic life of
a
nation.
It is sometimes said that "a
social scientist without an
adequate
understanding
of statistics, is often like
the blind man groping in a
dark room for a
black
cat that is not
there".
The
Meaning of Data:
The
word "data" appears in many
contexts and frequently is
used in ordinary
conversation.
Although the word carries
something of an aura of scientific
mystique, its
meaning
is quite simple and mundane.
It is Latin for "those that
are given" (the singular
form
is
"datum"). Data may therefore
be thought of as the results
of observation.
EXAMPLES
OF DATA
Data
are collected in many
aspects of everyday
life.
·
Statements
given to a police officer or
physician or psychologist during an
interview
are
data.
·
So
are the correct and
incorrect answers given by a
student on a final
examination.
·
Almost
any athletic event produces
data.
·
The
time required by a runner to
complete a marathon,
·
The
number of errors committed by a
baseball team in nine
innings of play.
And,
of course, data are obtained
in the course of scientific
inquiry:
·
the
positions of artifacts and
fossils in an archaeological
site,
·
The
number of interactions between
two members of an animal
colony during a
period
of observation,
·
The
spectral composition of light
emitted by a star.
OBSERVATIONS
AND VARIABLES:
Observation:
In
statistics, an observation
often
means any sort of numerical
recording of information,
whether
it is a physical measurement such as
height or weight; a classification
such as
heads
or tails, or an answer to a question
such as yes or no.
Variables:
A
characteristic that varies
with an individual or an object, is
called a variable.
For example,
age
is a variable as it varies from
person to person. A variable
can assume a number
of
values.
The given set of all
possible values from which
the variable takes on a
value is
called
its Domain. If for a given
problem, the domain of a
variable contains only one
value,
then
the variable is referred to as a
constant.
QUANTITATIVE
AND
QUALITATIVE
VARIABLES:
Variables
may be classified into
quantitative and qualitative
according to the form
of
the
characteristic of interest.
A
variable is called a quantitative
variable when a
characteristic can be
expressed
numerically
such as age, weight, income
or number of children. On the
other hand, if the
characteristic
is non-numerical such as education,
sex, eye-colour, quality,
intelligence,
poverty,
satisfaction, etc. the
variable is referred to as a qualitative
variable. A
qualitative
characteristic
is also called an attribute.
An individual or an object with
such a characteristic
can
be counted or enumerated after
having been assigned to one
of the several
mutually
exclusive
classes or categories.
Discrete
and Continuous
Variables:
A
quantitative variable may be
classified as discrete or continuous.
A discrete variable is one that can take only a discrete set of integers or whole numbers, that is, the values are taken by jumps or breaks. A discrete variable represents count data such as the
number
of persons in a family, the
number of rooms in a house,
the number of deaths in
an
accident,
the income of an individual,
etc.
A variable is called a continuous variable if it can take on any value, fractional or integral, within a given interval, i.e. its domain is an interval with all possible values without gaps.
A continuous variable represents
measurement data such as the
age of a person, the
height
of a plant, the weight of a
commodity, the temperature at a
place, etc.
A variable, whether countable or measurable, is generally denoted by some symbol such as X or Y, and X_i or X_j represents the i-th or j-th value of the variable. The subscript i or j is replaced by a number such as 1, 2, 3, ... when referring to a particular value.
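The subscript notation can be sketched in a few lines. The data and the helper name value are assumptions made for this illustration; note that Python lists count from 0, whereas the subscripts in the text count from 1.

```python
# Hypothetical data: X = number of rooms in each of five houses
# (a discrete variable, since it represents counts).
X = [3, 5, 4, 6, 3]


def value(variable, i):
    """Return the i-th value of the variable, using subscripts that start at 1."""
    return variable[i - 1]


print(value(X, 1))  # the first value of X
print(value(X, 4))  # the fourth value of X
```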
Measurement
Scales:
By measurement, we usually mean the assigning of numbers to observations or objects, and scaling is a process of measuring. The four scales of measurement are briefly mentioned below:
NOMINAL
SCALE:
The classification or grouping of the observations into mutually exclusive qualitative categories or classes is said to constitute a nominal scale. For example, students are classified as male and female. The numbers 1 and 2 may also be used to identify these two categories. Similarly, rainfall may be classified as heavy, moderate and light. We may use the numbers 1, 2 and 3 to denote the three classes of rainfall. The numbers, when they are used only to identify the categories of the given scale, carry no numerical significance and there is no particular order for the grouping.
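A minimal sketch of nominal coding follows, assuming a hypothetical list of student records coded 1 and 2. The codes serve only as labels, so the only sensible operation shown is counting how many observations fall in each category.

```python
from collections import Counter

# Codes are labels only: 1 = male, 2 = female (they carry no numerical meaning).
LABELS = {1: "male", 2: "female"}

# Hypothetical coded records for eight students.
student_codes = [1, 2, 2, 1, 2, 1, 2, 2]

# Count the observations in each mutually exclusive category.
counts = Counter(LABELS[code] for code in student_codes)

print(counts["male"], counts["female"])
```

Averaging the codes would produce a number, but that number would have no interpretation on a nominal scale.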
ORDINAL
OR RANKING SCALE:
It includes the characteristics of a nominal scale and in addition has the property of ordering or ranking of measurements. For example, the performance of students (or players) is rated as excellent, good, fair or poor, etc. The numbers 1, 2, 3, 4, etc. are also used to indicate ranks. The only relation that holds between any pair of categories is that of "greater than" (or more preferred).
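The ordering property can be sketched as follows. The rank numbers assigned here (1 for poor up to 4 for excellent) and the function name are assumptions chosen for the illustration; any numbering that preserves the order would serve equally well.

```python
# Hypothetical ranks: a larger number means a more preferred rating.
RANK = {"poor": 1, "fair": 2, "good": 3, "excellent": 4}


def is_rated_higher(a, b):
    """Return True if rating a is ranked above rating b."""
    return RANK[a] > RANK[b]


ratings = ["good", "poor", "excellent", "fair"]

# Sorting is meaningful on an ordinal scale because the categories are ordered.
ordered = sorted(ratings, key=RANK.get)

print(ordered)
print(is_rated_higher("excellent", "good"))
```

The comparison is meaningful, but the gap between two ranks is not: nothing says that the step from fair to good equals the step from good to excellent.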
INTERVAL
SCALE:
A measurement scale possessing a constant interval size (distance) but not a true zero point is called an interval scale. Temperature measured on either the Celsius or the Fahrenheit scale is an outstanding example of an interval scale because the same difference exists between 20°C (68°F) and 30°C (86°F) as between 5°C (41°F) and 15°C (59°F). It cannot be said that a temperature of 40 degrees is twice as hot as a temperature of 20 degrees, i.e. the ratio 40/20 has no meaning. The arithmetic operations of addition, subtraction, etc. are meaningful.
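The point about differences and ratios can be checked numerically. The sketch below is an illustration only: the function name c_to_f is introduced here, and it uses the standard Celsius-to-Fahrenheit conversion.

```python
def c_to_f(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Equal differences in Celsius remain equal differences in Fahrenheit.
diff_a = c_to_f(30) - c_to_f(20)
diff_b = c_to_f(15) - c_to_f(5)

# The ratio of two temperatures depends on the scale used.
ratio_celsius = 40 / 20
ratio_fahrenheit = c_to_f(40) / c_to_f(20)

print(diff_a, diff_b)
print(ratio_celsius, ratio_fahrenheit)
```

The two differences agree, while the ratio changes from one scale to the other, which is why "twice as hot" has no meaning on an interval scale.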
RATIO
SCALE:
It is a special kind of interval scale where the scale of measurement has a true zero point as its origin. The ratio scale is used to measure weight, volume, distance, money, etc. The key to differentiating interval and ratio scales is that the zero point is meaningful for the ratio scale.
ERRORS
OF MEASUREMENT:
Experience has shown that a continuous variable can never be measured with perfect fineness because of certain habits and practices, methods of measurement, instruments used, etc. The measurements are thus always recorded correct to the nearest units and hence are of limited accuracy. The actual or true values are, however, assumed to exist.
For example, if a student's
weight is recorded as 60 kg (correct to
the nearest
kilogram),
his true weight in fact
lies between 59.5 kg and
60.5 kg, whereas a
weight
recorded
as 60.00 kg means the true
weight is known to lie
between 59.995 and 60.005
kg.
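The bounds in this example can be reproduced with a short sketch. It assumes the recorded value is given as text, so that "60" and "60.00" keep their different precision, and it uses Python's decimal module to avoid rounding noise. The function name true_value_bounds is invented for this illustration.

```python
from decimal import Decimal


def true_value_bounds(recorded):
    """Return (lower, upper) bounds for the true value of a recorded measurement.

    `recorded` is a string such as "60" or "60.00"; the number of decimal
    places written determines the unit to which the value is correct.
    """
    value = Decimal(recorded)
    exponent = value.as_tuple().exponent        # 0 for "60", -2 for "60.00"
    half_unit = Decimal(1).scaleb(exponent) / 2  # half of the recording unit
    return value - half_unit, value + half_unit


print(true_value_bounds("60"))
print(true_value_bounds("60.00"))
```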
Thus
there is a difference, however
small it may be between the
measured value and
the
true
value. This sort of
departure from the true
value is technically known as
the error
of
measurement.
In other words, if the observed value and the true value of a variable are denoted by $x$ and $x + \varepsilon$ respectively, then the difference $(x + \varepsilon) - x$, i.e. $\varepsilon$, is the error. This error involves the unit of measurement of $x$ and is therefore called an absolute error. An absolute error divided by the true value is called the relative error. Thus the relative error $= \dfrac{\varepsilon}{x + \varepsilon}$, which when multiplied by 100, is the percentage error. These errors are independent of the units of measurement of $x$. It ought to be noted that an error has both magnitude and direction and that the word error in statistics does not mean a mistake, which is a chance inaccuracy.
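The three kinds of error defined above can be computed directly. The sketch below follows the text's convention that the error is the true value minus the observed value; the function names and the example measurement are assumptions made for illustration.

```python
def absolute_error(observed, true):
    """Error in the units of measurement: true value minus observed value."""
    return true - observed


def relative_error(observed, true):
    """Absolute error divided by the true value (unit-free)."""
    return absolute_error(observed, true) / true


def percentage_error(observed, true):
    """Relative error expressed as a percentage."""
    return 100 * relative_error(observed, true)


# Hypothetical measurement: a length observed as 48.0 cm whose true value is 50.0 cm.
observed, true = 48.0, 50.0

print(absolute_error(observed, true))    # in cm
print(relative_error(observed, true))    # no units
print(percentage_error(observed, true))  # percent
```

The sign of the absolute error gives its direction: a positive value means the observation fell below the true value, a negative value means it fell above.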
BIASED
AND RANDOM ERRORS:
An
error is said to be biased
when
the observed value is
consistently and constantly
higher
or
lower than the true
value. Biased errors arise
from the personal
limitations of the
observer,
the imperfection in the
instruments used or some
other conditions which
control
the
measurements. These errors
are not revealed by
repeating the measurements.
They are
cumulative
in nature, that is, the
greater the number of
measurements, the greater
would be
the
magnitude of error. They are
thus more troublesome. These
errors are also
called
cumulative
or systematic
errors.
An error, on the other hand, is said to be unbiased when the deviations, i.e. the excesses and defects, from the true value tend to occur equally often. Unbiased errors are revealed when measurements are repeated and they tend to cancel out in the long run. These errors are therefore compensating and are also known as random errors or accidental errors.
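The contrast between the two kinds of error can be sketched with a small simulation. Everything here is hypothetical: an instrument that always reads 0.5 units too low stands in for a biased error, and normally distributed noise with mean zero stands in for a random error. The error is taken as the true value minus the observed value, as in the text.

```python
import random


def simulate_errors(true_value, n, offset, noise_sd, seed=0):
    """Compare the total and mean error of biased and random measurements."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable

    # Biased instrument: every reading is too low by the same amount.
    biased_obs = [true_value - offset for _ in range(n)]
    # Unbiased instrument: readings scatter above and below the true value.
    random_obs = [true_value + rng.gauss(0, noise_sd) for _ in range(n)]

    biased_errors = [true_value - obs for obs in biased_obs]
    random_errors = [true_value - obs for obs in random_obs]

    return {
        "biased_total": sum(biased_errors),
        "biased_mean": sum(biased_errors) / n,
        "random_total": sum(random_errors),
        "random_mean": sum(random_errors) / n,
    }


result = simulate_errors(true_value=100.0, n=10_000, offset=0.5, noise_sd=1.0)
print(result)
```

In this sketch the total biased error grows in proportion to the number of measurements, while the mean of the random errors stays close to zero because the excesses and defects largely cancel.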