DATA TRANSFORMATION: Indexes and Scales, Scoring and Score Index


Research Methods STA630

Lesson 30

DATA TRANSFORMATION

Data transformation is the process of changing data from their original form to a format that is more
suitable for a data analysis that will achieve the research objectives. Researchers often modify
the values of scalar data or create new variables. For example, many researchers believe that response
bias will be lower if interviewers ask consumers for their year of birth rather than their age, even though
the objective of the data analysis is to investigate respondents' age in years. This does not present a
problem for the research analyst, because a simple data transformation is possible. The raw data coded
as birth year can easily be transformed to age by subtracting the birth year from the current year.
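
The birth-year transformation described above can be sketched in Python. This is only a minimal
illustration: the function name age_from_birth_year, the sample birth years, and the survey year 2024
are assumptions made for the example and do not come from the lesson.

```python
def age_from_birth_year(birth_year, current_year):
    """Transform a raw birth-year code into age in years."""
    if birth_year > current_year:
        raise ValueError("birth year cannot be later than the current year")
    return current_year - birth_year


# Hypothetical raw data: birth years as coded from the questionnaire.
birth_years = [1990, 1975, 2001]
current_year = 2024  # assumed survey year, for illustration only

# New variable "age", derived from the raw variable "birth year".
ages = [age_from_birth_year(year, current_year) for year in birth_years]
print(ages)
```

Note that this simple subtraction gives the age a respondent reaches during the survey year, so it can
differ by one year from the age on the interview date.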

Collapsing or combining categories of a variable is a common data transformation that reduces the
number of categories. For example, the five response categories of a Likert scale question may
be combined as follows: the "strongly agree" and "agree" response categories are combined into a
single category, and the "strongly disagree" and "disagree" response categories are combined into
another. The middle category is left unchanged, so the five-category scale collapses to three.
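
Collapsing categories amounts to a lookup from old codes to new codes. The sketch below shows one
possible way to do it in Python; the dictionary name COLLAPSE, the function name, the sample
responses, and the label "undecided" for the middle category are assumptions made for illustration.

```python
# Mapping from the five original Likert categories to three collapsed ones.
# The label "undecided" for the middle category is an assumption.
COLLAPSE = {
    "strongly agree": "agree",
    "agree": "agree",
    "undecided": "undecided",
    "disagree": "disagree",
    "strongly disagree": "disagree",
}


def collapse_responses(responses):
    """Recode each five-category response into its three-category equivalent."""
    return [COLLAPSE[response] for response in responses]


# Hypothetical raw responses to one question.
responses = ["strongly agree", "disagree", "undecided", "agree", "strongly disagree"]
collapsed = collapse_responses(responses)
print(collapsed)
```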

Creating new variables by re-specifying the data with numeric or logical transformations is another
important data transformation. For example, a Likert summated scale reflects the combination of scores
(raw data) from various attitudinal statements. The summative score for an attitude scale with three
statements is calculated as follows:

Summative Score = Variable 1 + Variable 2 + Variable 3

This calculation can be accomplished by using simple arithmetic or by programming a computer with a

data transformation equation that creates the new variable "summative score."
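
As a rough sketch, such a data transformation equation could be programmed in Python as follows.
The record layout, the key names variable_1 to variable_3, and the item scores are hypothetical
and serve only to illustrate the calculation.

```python
def summative_score(*item_scores):
    """Add the scores of the individual attitude statements."""
    return sum(item_scores)


# Hypothetical raw data: one record per respondent, three statements each.
respondents = [
    {"variable_1": 4, "variable_2": 5, "variable_3": 3},
    {"variable_1": 2, "variable_2": 1, "variable_3": 2},
]

# Create the new variable "summative_score" for every respondent.
for record in respondents:
    record["summative_score"] = summative_score(
        record["variable_1"], record["variable_2"], record["variable_3"]
    )

print([record["summative_score"] for record in respondents])
```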

Researchers have created numerous scales and indexes to measure social phenomena. For
example, scales and indexes have been developed to measure the degree of formalization in bureaucratic
organizations, the prestige of occupations, the adjustment of people in marriage, the intensity of group
interaction, the level of social activity in a community, and the level of socio-economic development
of a nation.

Keep two things in mind. First, every social phenomenon can be measured. Some constructs can be
measured directly and produce precise numerical values (e.g. family income). Other constructs require
the use of surrogates or proxies that indirectly measure a variable (e.g. job satisfaction). Second, a
lot can be learned from measures used by other researchers. We are fortunate to have the work of
thousands of researchers to draw on. It is not always necessary to start from scratch. We can use a
past scale or index, or we can modify it for our own purposes. The process of creating measures for a
construct evolves over time. Measurement is an ongoing process with constant change; new concepts are
developed, theoretical definitions are refined, and scales or indexes that measure old or new constructs
are improved.

Indexes and Scales

The terms scale and index are often used interchangeably. One researcher's scale is another's index.
Both produce ordinal- or interval-level measures of a variable. To add to the confusion, scale and index
techniques can be combined in one measure. Scales and indexes give a researcher more information
about variables and make it possible to assess the quality of measurement. Scales and indexes increase
reliability and validity, and they aid in data reduction; that is, they condense and simplify the
information that is collected.

A scale is a measure in which the researcher captures the intensity, direction, level, or potency of a
variable construct. It arranges responses or observations on a continuum. A scale can use a single
indicator or multiple indicators. Most scales are at the ordinal level of measurement.


An index is a measure in which a researcher adds or combines several distinct indicators of a construct

into a single score. This composite score is often a simple sum of multiple indicators. It is used for

content or convergent validity. Indexes are often measured at the interval or ratio level.

Researchers sometimes combine the features of scales and indexes in a single measure. This is common
when a researcher has several indicators that are scales. He or she then adds these indicators together to
yield a single score, thereby creating an index.

Unidimensionality: It means that all the items in a scale or index fit together, or measure a single
construct. Unidimensionality says: if you combine several specific pieces of information into a single
score or measure, have all the pieces measure the same thing (each subdimension is part of the
construct's overall content).

For example, we define the construct "feminist ideology" as a general ideology about gender. Feminist
ideology is a highly abstract and general construct. It includes specific beliefs and attitudes towards
social, economic, political, family, and sexual relations. The ideology's five belief areas are parts of a
single general construct. The parts are mutually reinforcing and together form a system of beliefs about
the dignity, strength, and power of women.

Index Construction

You may have heard about the consumer price index (CPI). The CPI, which is a measure of inflation, is
created by totaling the cost of buying a list of goods and services (e.g. food, rent, and utilities) and
comparing the total to the cost of buying the same list in the previous year. An index is a combination
of items into a single numerical score. Various components or subgroups of a construct are each
measured, and then combined into one measure.

There are many types of indexes. For example, if you take an exam with 25 questions, the total number

of questions correct is a kind of index. It is a composite measure in which each question measures a

small piece of knowledge, and all the questions scored correct or incorrect are totaled to produce a

single measure.

One way to demonstrate that indexes are not very complicated is to use one. Answer yes or no to the
seven questions that follow on the characteristics of an occupation. Base your answers on your thoughts
regarding the following four occupations: long-distance truck driver, medical doctor, accountant, and
telephone operator. Score each answer 1 for yes and 0 for no.

1. Does it pay a good salary?

2. Is the job secure from layoffs or unemployment?

3. Is the work interesting and challenging?

4. Are its working conditions (e.g. hours, safety, time on the road) good?

5. Are there opportunities for career advancement and promotion?

6. Is it prestigious or looked up to by others?

7. Does it permit self-direction and the freedom to make decisions?

Total the seven answers for each of the four occupations. Which had the highest and which had the
lowest score? The seven questions are our operational definition of the construct "good occupation."
Each question represents a subpart of our theoretical definition.
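
The totaling step of this exercise can be sketched in Python. The yes/no answers below are those of
one hypothetical respondent and are invented purely for illustration; the names QUESTIONS, answers,
and index_score are likewise assumptions.

```python
QUESTIONS = 7  # the seven yes/no questions listed above

# Hypothetical answers (1 = yes, 0 = no) from one imaginary respondent,
# in the order of questions 1 to 7. These are illustrative, not real data.
answers = {
    "long-distance truck driver": [1, 0, 0, 0, 0, 0, 1],
    "medical doctor": [1, 1, 1, 0, 1, 1, 1],
    "accountant": [1, 1, 0, 1, 1, 0, 0],
    "telephone operator": [0, 0, 0, 1, 0, 0, 0],
}


def index_score(item_answers):
    """Total the yes/no answers into a single index score."""
    if len(item_answers) != QUESTIONS:
        raise ValueError("expected one answer per question")
    if any(answer not in (0, 1) for answer in item_answers):
        raise ValueError("answers must be scored 1 for yes or 0 for no")
    return sum(item_answers)


scores = {occupation: index_score(a) for occupation, a in answers.items()}
highest = max(scores, key=scores.get)
lowest = min(scores, key=scores.get)
print(scores, highest, lowest)
```

With these invented answers the index can range from 0 to 7 for each occupation; a different
respondent would of course produce different totals.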

Creating indexes is so easy that it is important to be careful that every item in the index has face

validity. Items without face validity should be excluded. Each part of the construct should be measured

with at least one indicator. Of course, it is better to measure the parts of a construct with multiple

indicators.


Another example of an index is a college quality index. Our theoretical definition says that a high
quality college has six distinguishing characteristics: (1) fewer students per faculty member, (2) a
highly educated faculty, (3) more books in the library, (4) fewer students dropping out of college,
(5) more students who go on to advanced degrees, and (6) faculty members who publish books or
scholarly articles. We score 100 colleges on each item, and then add the scores for each college to
create an index score of college quality that can be used to compare colleges.

Indexes can be combined with one another. For example, in order to strengthen the college quality
index, we add a sub-index on teaching quality. The sub-index contains eight elements: (1) average size
of classes, (2) percentage of class time devoted to discussion, (3) number of different classes each
faculty member teaches, (4) availability of faculty to students outside the classroom, (5) currency and
amount of reading assigned, (6) degree to which assignments promote learning, (7) degree to which
faculty get to know each student, and (8) student ratings of instruction. Similar sub-index measures can
be created for other parts of the college quality index. They can be combined into a more global
measure of college quality. This further elaborates the definition of the construct "quality of college."

Weighting

An important issue in index construction is whether to weight items. Unless it is otherwise stated,
assume that an index is unweighted. Likewise, unless we have a good reason for assigning different
weights, use equal weights. An unweighted index gives each item equal weight. It involves adding up
the items without modification, as if each were multiplied by 1 (or -1 for items that are negative).
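
The difference between equal weights and differing weights can be sketched in Python. This is a
minimal illustration under assumed values: the item scores and the weights (2, 1, 1) are hypothetical,
and the function name index_score is not taken from the lesson.

```python
def index_score(items, weights=None):
    """Combine item scores into an index.

    With no weights given, every item is multiplied by 1 (an unweighted index).
    """
    if weights is None:
        weights = [1] * len(items)
    if len(weights) != len(items):
        raise ValueError("need exactly one weight per item")
    return sum(weight * item for weight, item in zip(weights, items))


items = [3, 4, 2]  # hypothetical item scores

unweighted = index_score(items)            # each item counts equally
weighted = index_score(items, [2, 1, 1])   # first item counts double (assumed weights)
print(unweighted, weighted)
```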

Scoring and Score Index

In one of our previous discussions we tried to measure job satisfaction. It was operationalized with the
help of dimensions and elements. We constructed a number of statements on each element with five
response categories using a Likert scale, i.e. strongly agree, agree, undecided, disagree, and strongly
disagree. We can score each of these items from 1 to 5 depending upon the degree of agreement with
the statement. The statements were both positive and negative. For positive statements we
can score straight away from 5 to 1, i.e. from strongly agree to strongly disagree. For negative
statements we have to reverse the score, i.e. 1 for "strongly agree," 2 for "agree," 3 for "undecided,"
4 for "disagree," and 5 for "strongly disagree." The reason is that a negative multiplied by a negative
becomes positive: a person who strongly disagrees with a negative statement is giving a positive
response, so we give a score of 5 in this example.

In our example, let us say there were 23 statements measuring different elements and dimensions of
job satisfaction. Since on each statement the respondent could get a minimum score of 1 and a
maximum score of 5, on 23 statements a respondent could get a minimum score of 23 (23 x 1) and a
maximum score of 115 (23 x 5). In this way the score index ranges from 23 to 115, with the lower end
showing minimum job satisfaction and the upper end showing the highest job satisfaction. In reality we
may not find anyone at the extremes; rather, the respondents are likely to be spread along this
continuum.

We could use the raw scores of the independent and dependent variables and apply appropriate
statistics for testing the hypothesis. We could also divide the score index into categories such as
"high job satisfaction" and "low job satisfaction" for presentation in a table. We can then
cross-classify job satisfaction with some other variable and apply appropriate statistics for testing
the hypothesis.
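
The scoring procedure above can be sketched in Python. Several details here are assumptions made
only for illustration: the split of the 23 statements into 15 positive and 8 negative ones, the
sample responses, and the use of the midpoint of the range (69) as the cutoff between "low" and
"high" job satisfaction. The lesson itself does not specify any of these.

```python
SCALE_MIN, SCALE_MAX = 1, 5
N_STATEMENTS = 23

# Scores for positive statements: strongly agree = 5 ... strongly disagree = 1.
LIKERT = {
    "strongly agree": 5,
    "agree": 4,
    "undecided": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def score_item(response, negative=False):
    """Score one response, reversing the score for negative statements."""
    raw = LIKERT[response]
    return (SCALE_MIN + SCALE_MAX - raw) if negative else raw


def score_index(responses, negative_flags):
    """Sum the (possibly reversed) item scores into a single score index."""
    if len(responses) != len(negative_flags):
        raise ValueError("need one negative/positive flag per response")
    return sum(score_item(r, n) for r, n in zip(responses, negative_flags))


index_min = N_STATEMENTS * SCALE_MIN  # 23
index_max = N_STATEMENTS * SCALE_MAX  # 115


def satisfaction_category(total, cutoff=(index_min + index_max) / 2):
    """Divide the score index into two categories (midpoint cutoff is assumed)."""
    return "high job satisfaction" if total >= cutoff else "low job satisfaction"


# Hypothetical respondent: agrees with 15 positive statements and
# disagrees with 8 negative statements (assumed split, for illustration).
negative_flags = [False] * 15 + [True] * 8
responses = ["agree"] * 15 + ["disagree"] * 8

total = score_index(responses, negative_flags)
print(total, satisfaction_category(total))
```

Reversing with SCALE_MIN + SCALE_MAX - raw maps 5 to 1, 4 to 2, and so on, which matches the
reversed scores listed for negative statements.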
