TYPES OF PROBABILITY SAMPLING: Systematic Random Sample

Research Methods STA630

Lesson 28

TYPES OF PROBABILITY SAMPLING

Probability samples that rely on random processes require more work than nonrandom ones. A researcher must identify specific sampling elements (e.g., persons) to include in the sample. For example, if conducting a telephone survey, the researcher needs to try to reach the specific sampled person, by calling back several times, to get an accurate sample.

Random samples are most likely to yield a sample that truly represents the population. In addition, random sampling lets a researcher statistically calculate the relationship between the sample and the population, that is, the size of the sampling error. A non-statistical definition of sampling error is the deviation between a sample result and a population parameter due to random processes.

Simple Random Sample

The simple random sample is both the easiest random sample to understand and the one on which other types are modeled. In simple random sampling, a researcher develops an accurate sampling frame, selects elements from the sampling frame according to a mathematically random procedure, then locates the exact element that was selected for inclusion in the sample.

After numbering all elements in a sampling frame, the researcher uses a list of random numbers to decide which elements to select. He or she needs as many random numbers as there are elements to be sampled: for example, for a sample of 100, 100 random numbers are needed. The researcher can get random numbers from a random-number table, a table of numbers chosen in a mathematically random way. Random-number tables are available in most statistics and research methods books. The numbers are generated by a pure random process so that any number has an equal probability of appearing in any position. Computer programs can also produce lists of random numbers.

A random starting point should be selected at the outset.
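
The selection step described above can be sketched in a few lines of Python, with the computer standing in for the random-number table. This is only an illustration: the frame of 900 numbered persons, the function name simple_random_sample, and the seed are assumptions made for the example, not part of the lesson.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the frame, each with equal probability."""
    rng = random.Random(seed)      # seeded only so the example is repeatable
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame: 900 numbered elements.
frame = [f"person_{i:03d}" for i in range(1, 901)]

# A sample of 100 needs 100 random selections.
sample = simple_random_sample(frame, 100, seed=42)
print(len(sample), "elements selected; first five:", sample[:5])
```

After the draw, the researcher still has to locate each selected element; the code only replaces the table lookup.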

Random sampling does not guarantee that every random sample perfectly represents the population. Instead, it means that most random samples will be close to the population most of the time, and that one can calculate the probability of a particular sample being inaccurate. A researcher estimates the chance that a particular sample is off or unrepresentative by using information from the sample to estimate the sampling distribution. The sampling distribution is the key idea that lets a researcher calculate sampling error and confidence intervals.
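
As a rough illustration of how information from one sample is used to estimate sampling error, the sketch below computes a standard error and an approximate 95 percent confidence interval for a sample mean. It assumes a large-sample normal approximation (z = 1.96); the simulated population and the function name mean_confidence_interval are invented for the example.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (mean, standard error, (low, high)) using a normal approximation."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # estimated standard error of the mean
    return mean, se, (mean - z * se, mean + z * se)

# Hypothetical population of 10,000 scores (assumption for illustration only).
rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(10_000)]

sample = rng.sample(population, 100)
mean, se, (low, high) = mean_confidence_interval(sample)
print(f"sample mean = {mean:.2f}, standard error = {se:.2f}")
print(f"approximate 95% confidence interval: {low:.2f} to {high:.2f}")
```

The interval is a statement about the procedure: over many repeated samples, roughly 95 percent of intervals built this way would contain the population mean.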

Systematic Random Sample

Systematic random sampling is simple random sampling with a shortcut for random selection. Again, the first step is to number each element in the sampling frame. Instead of using a list of random numbers, the researcher calculates a sampling interval, and the interval becomes his or her own quasi-random selection method. The sampling interval (i.e., 1 in K, where K is some number) tells the researcher how to select elements from a sampling frame by skipping elements in the frame before selecting one for the sample.

Sampling intervals are easy to compute. We need the sample size and the population size. You can think of the sampling interval as the inverse of the sampling ratio. The sampling ratio for 300 names out of 900 will be 300/900 = 0.333 = 33.3 percent. The sampling interval is 900/300 = 3.

Begin with a random start. The easiest way to do this is to point blindly at one of the elements at the beginning of the list that fall within the first sampling interval, and then select every Kth element after it.
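
The interval calculation and the random start can be sketched as follows. The function name systematic_sample and the frame of 900 numbered names are assumptions for illustration; the sketch also assumes the population size is at least the sample size and uses integer division for the interval.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n elements by taking every k-th element after a random start."""
    if n <= 0 or n > len(frame):
        raise ValueError("sample size must be between 1 and the frame size")
    k = len(frame) // n                 # sampling interval, e.g. 900 // 300 = 3
    rng = random.Random(seed)
    start = rng.randrange(k)            # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame of 900 numbered names, sample of 300.
frame = list(range(1, 901))
sample = systematic_sample(frame, 300, seed=1)
print("sampling ratio:", round(300 / 900, 3))
print("sampling interval:", len(frame) // 300)
print("first five selected:", sample[:5])
```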

When the elements are organized in some kind of cycle or pattern, systematic sampling may not give a representative sample, particularly when the sampling interval coincides with the length of the cycle.
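
A small, artificial example shows how a cycle in the frame can defeat systematic sampling. Here the frame is assumed to be 52 weeks of daily records in which every seventh record is a Sunday; the data and the helper every_kth are hypothetical.

```python
def every_kth(frame, k, start):
    """Take every k-th element beginning at position start (0-based)."""
    return frame[start::k]

# Hypothetical frame: 364 daily records, 1 marks a Sunday, 0 any other day.
frame = [1 if day % 7 == 6 else 0 for day in range(364)]
population_share = sum(frame) / len(frame)      # true share of Sundays (1 in 7)

# A sampling interval of 7 matches the weekly cycle exactly.
sundays_only = every_kth(frame, 7, start=6)
no_sundays = every_kth(frame, 7, start=0)

sample_share = sum(sundays_only) / len(sundays_only)
other_share = sum(no_sundays) / len(no_sundays)
print("share of Sundays in the frame:", round(population_share, 3))
print("share with start on a Sunday:", sample_share)
print("share with start on another day:", other_share)
```

Whatever the random start, the sample contains either only Sundays or none at all, so it cannot reflect the frame.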

Stratified Random Sample

When the population is heterogeneous, the use of a simple random sample may not produce a representative sample. Some of the bigger strata may be overrepresented while some of the small ones may be eliminated entirely. Look at the variables that are likely to affect the results, and stratify the population in such a way that each stratum becomes a homogeneous group within itself. Then draw the required sample by using the table of random numbers. Hence, in stratified random sampling a subsample is drawn utilizing simple random sampling within each stratum. (Randomization is not done for quota sampling.)

There are three reasons why a researcher chooses a stratified random sample: (1) to increase a sample's statistical efficiency, (2) to provide adequate data for analyzing the various subpopulations, and (3) to enable different research methods and procedures to be used in different strata.

1. Stratification is usually more efficient statistically than simple random sampling and at worst it is equal to it. With the ideal stratification, each stratum is homogeneous internally and heterogeneous with other strata. This might occur in a sample that includes members of several distinct ethnic groups. In this instance, stratification makes a pronounced improvement in statistical efficiency.

Stratified random sampling provides the assurance that the sample will accurately reflect the population on the basis of the criterion or criteria used for stratification. This is a concern because occasionally simple random sampling yields a disproportionate number of one group or another, and the sample ends up being less representative than it could be.

Random sampling error will be reduced with the use of stratified random sampling because each group is internally homogeneous but there are comparative differences between groups. More technically, a smaller standard error may result from stratified sampling because the groups are adequately represented when strata are combined.

2. Stratification is also appropriate when the researcher wants to study the characteristics of certain population subgroups. Thus, if one wishes to draw some conclusions about activities in different classes of a student body, stratified sampling would be used.

3. Stratified sampling is also called for when different methods of data collection are applied in different parts of the population. This might occur when we survey company employees at the home office with one method but must use a different approach with employees scattered over the country.
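
The efficiency claim in point 1 can be illustrated with a small simulation. It is a sketch under strong assumptions: two equally sized, internally homogeneous groups with very different means, all values simulated. The names strata, srs_mean, and stratified_mean are hypothetical. The spread of the sample means across repeated draws approximates the standard error of each design.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population: two equal strata, homogeneous within, different between.
strata = {
    "group_a": [rng.gauss(20, 2) for _ in range(500)],
    "group_b": [rng.gauss(80, 2) for _ in range(500)],
}
population = strata["group_a"] + strata["group_b"]

def srs_mean(n):
    """Mean of one simple random sample of size n from the whole population."""
    return statistics.fmean(rng.sample(population, n))

def stratified_mean(n_per_stratum):
    """Mean of one stratified sample; strata are equal in size, so pooling is proportionate."""
    draws = []
    for values in strata.values():
        draws.extend(rng.sample(values, n_per_stratum))
    return statistics.fmean(draws)

# Repeat each design 500 times with the same total sample size (20).
srs_means = [srs_mean(20) for _ in range(500)]
strat_means = [stratified_mean(10) for _ in range(500)]

srs_se = statistics.stdev(srs_means)
strat_se = statistics.stdev(strat_means)
print(f"empirical standard error, simple random sampling: {srs_se:.2f}")
print(f"empirical standard error, stratified sampling:    {strat_se:.2f}")
```

With groups this distinct, the stratified design removes the between-group variation from the sampling error, which is why its standard error comes out much smaller.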

Stratification Process

Ideally, stratification would be based on the primary variable (the dependent variable) under study. In practice, the researcher identifies a criterion that provides an efficient basis for stratification: a characteristic of the population elements known to be related to the dependent variable or other variables of interest. The variable chosen should increase homogeneity within each stratum and increase heterogeneity between strata.

Next, for each separate subgroup or stratum, a list of population elements must be obtained. Serially number the elements within each stratum. Using a table of random numbers or some other device, a separate simple random sample is taken within each stratum. Of course, the researcher must determine how large a sample must be drawn from each stratum.
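
The process just described (list the elements of each stratum, then draw a separate simple random sample within each) might be sketched like this. The student frame, the stratification variable "year", the stratum sample sizes, and the function name stratified_sample are all assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(elements, stratum_of, sizes, seed=None):
    """Draw a separate simple random sample of sizes[name] within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in elements:                 # build one list per stratum
        strata[stratum_of(element)].append(element)
    return {name: rng.sample(strata[name], size) for name, size in sizes.items()}

# Hypothetical frame: 1,000 students, 600 in first year and 400 in second year.
students = [{"id": i, "year": "first" if i <= 600 else "second"}
            for i in range(1, 1001)]

sample = stratified_sample(students, lambda s: s["year"],
                           {"first": 60, "second": 40}, seed=3)
for name, members in sample.items():
    print(name, len(members))
```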

Proportionate versus Disproportionate

If the number of sampling units drawn from each stratum is in proportion to the relative population size of the stratum, the sample is a proportionate stratified sample. Sometimes, however, a disproportionate stratified sample will be selected to ensure an adequate number of sampling units in every stratum.

In a disproportionate stratified sample, the sample size for each stratum is not allocated in proportion to the population size, but is dictated by analytical considerations.
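
The difference between the two allocations is easiest to see with numbers. In the sketch below the stratum sizes, the total sample of 100, and the disproportionate allocation are invented for illustration; note that simple rounding in proportionate_allocation may not always sum exactly to n for other inputs.

```python
def proportionate_allocation(stratum_sizes, n):
    """Allocate n sampling units to strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    # Simple rounding; for awkward sizes the result may differ from n by a unit or two.
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

def sampling_fractions(allocation, stratum_sizes):
    """Share of each stratum that ends up in the sample."""
    return {name: allocation[name] / stratum_sizes[name] for name in allocation}

# Hypothetical strata and a total sample of 100.
stratum_sizes = {"large": 600, "medium": 300, "small": 100}
proportionate = proportionate_allocation(stratum_sizes, 100)

# Disproportionate: the analyst wants at least 30 units in every stratum.
disproportionate = {"large": 40, "medium": 30, "small": 30}

print("proportionate:", proportionate,
      sampling_fractions(proportionate, stratum_sizes))
print("disproportionate:", disproportionate,
      sampling_fractions(disproportionate, stratum_sizes))
```

Under proportionate allocation every stratum has the same sampling fraction; under the disproportionate one the small stratum is sampled at a much higher rate than the large one.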

Cluster Sampling

The purpose of cluster sampling is to sample economically while retaining the characteristics of a probability sample. Groups or chunks of elements that, ideally, would have heterogeneity among the members within each group are chosen for study in cluster sampling. This is in contrast to choosing some elements from the population as in simple random sampling, or stratifying and then choosing members from the strata, or choosing every nth case in the population in systematic sampling. When several groups with intra-group heterogeneity and inter-group homogeneity are found, then a random sampling of the clusters or groups can ideally be done and information gathered from each of the members in the randomly chosen clusters.

Cluster samples offer more heterogeneity within groups and more homogeneity among groups. This is the reverse of stratified random sampling, where there is homogeneity within each group and heterogeneity across groups.

Cluster sampling addresses two problems: researchers lack a good sampling frame for a dispersed population, and the cost to reach a sampled element is very high. A cluster is a unit that contains final sampling elements but can be treated temporarily as a sampling element itself. The researcher first samples clusters, each of which contains elements, then draws a second sample from within the clusters selected in the first stage of sampling. In other words, the researcher randomly samples clusters, and then randomly samples elements from within the selected clusters. He or she can create a good sampling frame of clusters, even if it is impossible to create one for sampling elements. Once the researcher gets a sample of clusters, creating a sampling frame for elements within each cluster becomes more manageable. A second advantage for geographically dispersed populations is that elements within each cluster are physically closer to each other. This may produce savings in locating or reaching each element.
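
The two stages described above (first sample clusters, then sample elements within the selected clusters) can be sketched as follows. The villages, households, sample sizes, and the function name two_stage_cluster_sample are hypothetical.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample elements within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)      # frame of clusters only
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen                                  # element frames only where needed
    }

# Hypothetical dispersed population: 20 villages of 40 households each.
clusters = {
    f"village_{v}": [f"village_{v}_household_{h}" for h in range(1, 41)]
    for v in range(1, 21)
}

sample = two_stage_cluster_sample(clusters, n_clusters=5, n_per_cluster=8, seed=11)
for village, households in sample.items():
    print(village, len(households))
```

Only the five selected villages need a household list, which is the practical saving the text describes.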

A researcher draws several samples in stages in cluster sampling. In a three-stage sample, stage 1 is random sampling of big clusters; stage 2 is random sampling of small clusters within each selected big cluster; and the last stage is sampling of elements from within the sampled small clusters. For example, one first randomly samples city blocks, then households within blocks, then individuals within households. This can also be an example of multistage area sampling.
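
The blocks, households, and individuals example can be written as a nested selection. The city data, the stage sizes, and the function name three_stage_sample are invented for illustration only.

```python
import random

def three_stage_sample(city, n_blocks, n_households, n_persons, seed=None):
    """Sample blocks, then households within blocks, then persons within households."""
    rng = random.Random(seed)
    selected = []
    for block in rng.sample(sorted(city), n_blocks):                       # stage 1
        households = city[block]
        for household in rng.sample(sorted(households), n_households):     # stage 2
            for person in rng.sample(households[household], n_persons):    # stage 3
                selected.append((block, household, person))
    return selected

# Hypothetical city: 30 blocks, 10 households per block, 4 persons per household.
city = {
    f"block_{b}": {
        f"house_{b}_{h}": [f"person_{b}_{h}_{p}" for p in range(1, 5)]
        for h in range(1, 11)
    }
    for b in range(1, 31)
}

sample = three_stage_sample(city, n_blocks=6, n_households=3, n_persons=1, seed=5)
print(len(sample), "individuals selected; first:", sample[0])
```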

The unit costs of cluster sampling are much lower than those of other probability sampling designs. However, cluster sampling is exposed to greater bias at each stage of sampling.

Double Sampling

This plan is adopted when further information is needed from a subset of the group from which some information has already been collected for the same study. A sampling design where initially a sample is used in a study to collect some preliminary information of interest, and later a sub-sample of this primary sample is used to examine the matter in more detail, is called double sampling.
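
In code, double sampling amounts to drawing a sub-sample from the primary sample rather than from the full frame. The frame, the two sample sizes, and the function name double_sample below are assumptions for the sake of the example.

```python
import random

def double_sample(frame, n_first, n_second, seed=None):
    """Draw a primary sample, then a sub-sample from that primary sample."""
    rng = random.Random(seed)
    first = rng.sample(frame, n_first)      # preliminary information collected here
    second = rng.sample(first, n_second)    # examined in more detail later
    return first, second

# Hypothetical frame of 2,000 respondents.
frame = [f"respondent_{i}" for i in range(1, 2001)]
first, second = double_sample(frame, n_first=200, n_second=40, seed=9)
print(len(first), "in the primary sample;", len(second), "in the sub-sample")
```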

What is the Appropriate Sample Design?

A researcher who must make a decision concerning the most appropriate sample design for a specific project will identify a number of sampling criteria and evaluate the relative importance of each criterion before selecting a sample design. The most common criteria are discussed below.

Degree of Accuracy

Selecting a representative sample is, of course, important to all researchers. However, the amount of error that can be tolerated may vary from project to project, especially when cost saving or another benefit may be a trade-off for a reduction in accuracy.

Resources

The costs associated with the different sampling techniques vary tremendously. If the researcher's financial and human resources are restricted, this limitation of resources will eliminate certain methods. For a graduate student working on a master's thesis, conducting a national survey is almost always out of the question because of limited resources. Managers, who usually weigh the cost of research against the value of the information, often will opt to save money by using a non-probability sampling design rather than make the decision to conduct no research at all.

Advance Knowledge of the Population

Advance knowledge of population characteristics, such as the availability of lists of population members, is an important criterion. The lack of an adequate list may automatically rule out any type of probability sampling.

National versus Local Project

Geographic proximity of population elements will influence sample design. When population elements are unequally distributed geographically, cluster sampling may become more attractive.

Need for Statistical Analysis

The need for statistical projections based on the sample is often a criterion. Non-probability sampling techniques do not allow researchers to use statistical analysis to project the data beyond the sample.
