CHAPTER 6

of A Judge's Deskbook on the Basic Philosophies and Methods of Science,
by Shirley A. Dobbin, Ph.D., and Sophia I. Gatowski, Ph.D.

An Introduction to the Scientific Methods of Survey Research

A survey is a set of one or more questions asked of respondents (i.e., subjects). A survey can ask people about their attitudes, beliefs, plans, health, work, income, life satisfactions and concerns, consumer preferences, political views, and so on. Virtually any human issue can be surveyed.

Learning Objectives for Chapter 6

Upon completion of this chapter, the reader should be able to:

  • Articulate the three components of total survey design;
  • Understand different sampling procedures used in survey research and how to evaluate them;
  • Understand the importance of considering non-response rates;
  • Understand the importance of good question construction and the implications of poor question construction;
  • Recognize potential biases in the manner in which questions are designed and survey instruments are constructed;
  • Understand the importance of pre-testing survey instruments; and
  • Critically evaluate survey methodology and results.

Surveys in Court

Courts have been slow to admit survey evidence. In the 1930s, courts viewed such evidence as hearsay and hence ruled it inadmissible, particularly since those surveyed could not be cross-examined in court. By the 1950s, some courts held that surveys were not hearsay since they were not being used to prove the truth of what respondents said. Other courts were accepting of surveys as evidence of "present state of mind, attitude, or belief," a recognized exception to the hearsay rule. Today, in the case of surveys, the question for the court has become: "Was the survey conducted in accordance with generally accepted survey principles, and were the results used in a statistically correct way?"(1)

This chapter provides an overview of survey methodology. It became clear from the results of the national survey that judges had little knowledge about how best to evaluate survey methodology and survey results. Indeed, many judges expressed distrust of survey results and dissatisfaction with their ability to adequately address methodological issues pertaining to survey research. While it is true that the results of poorly designed and poorly conducted surveys should be viewed with skepticism, the same can be said for the results of poorly designed experimental or quasi-experimental research (e.g., if significant threats to validity have not been appropriately addressed). Like other research methodologies, survey research relies upon a range of methodologies and procedures that have been agreed upon by those who do the research to be characteristic of well designed and properly conducted surveys.

It is important to recognize, however, that survey research does present some unique challenges and that it is uniquely prone to the influence of bias. Indeed, how specific questions are written, the order in which they are presented, the way in which they are asked, and of whom they are asked all influence the validity and reliability of the survey results. Thus, the claim that survey questions can be designed to get whatever the researcher wants is, to some extent, true. However, such surveys are not representative of well designed surveys. Well designed survey research provides a powerful tool for gathering valid, reliable, and useful information. Throughout this chapter, the methods and techniques of properly designed survey research will be presented. In addition, attention will be drawn to those areas which are potentially problematic and require particular focus.

Like all measurement techniques in all scientific disciplines, survey measurement is not error-free. The procedures used to conduct a survey have a major effect on the likelihood that the resulting data will describe accurately what is intended to be described. The content and execution of a survey must be scrutinized to determine if the survey provides relevant, reliable, and valid data on the issue before the court. The purpose and goals of the survey should be clearly stated at the outset, and they should be relevant to the issue at hand. That is, a statement of the problem to be studied should be clearly articulated.

Some Examples of What Surveys Can Measure

  • Attitudes and Preferences
  • Beliefs
  • Past Experiences
  • Levels of Knowledge
  • Census Information

 

Some Examples of How Survey Evidence Might be Used in Court

  • Obscenity/Contemporary Community Standards
  • Trademark Infringement
  • Deceptive Advertising
  • Employment Discrimination
  • Assessment of Damages
  • Mass Tort Aggregation
  • Change of Venue
  • Expansion of Voir Dire

To determine the relevance, reliability, and validity of a survey, the total survey design(2) must always be reviewed. In evaluating the total survey design, three different methodological components of the survey must be evaluated critically:

 

  • Sampling Procedures
  • Question Construction
  • Interviewing Procedures

These three components taken together constitute what is called the total survey design. It is not uncommon for researchers to fail to design high-quality procedures in all three of these areas. Researchers often pay attention to only one or two of the primary design features. Current best practice in survey research, however, requires an examination of all three design areas. As with any kind of scientific evidence, it is important to critically evaluate survey methodology.

To what extent do judges around the country define surveys as "scientific"?

Of those judges surveyed who believed that "scientific" knowledge can be distinguished from other forms of technical or specialized knowledge (n=244), only 11% believed that surveys constitute "science." The majority of judges (80%) believed that surveys constitute a form of technical knowledge. A number of the judges (19%) expressed the opinion that the results obtained from surveys are too subjective and sufficiently lacking in validity and reliability to be considered scientific. Some judges recognized the importance of sampling procedures, question construction and the like, but expressed uncertainty about how best to evaluate the survey instrument. Eighteen percent of the judges believed that survey results reflected nothing more than questions designed to get what the researcher wanted and statistics manipulated to provide a particular result.


Before going any further, stop and reflect ...

  • How might the results of a survey be used in a specific case? Can you think of some examples of cases in which survey results might be proffered?
  • Would you classify survey research as "scientific" or as technical knowledge? Why?
  • How is survey research like other scientific methods? How is it unlike other scientific methodologies?
  • Given what you know about the scientific method, what information would you expect to hear from an expert presenting results obtained from a survey?

I. Sampling Procedures

  • Defining the Population

One of the first steps in designing a survey, or in deciding whether an existing survey is relevant, is to identify the target population. The target population consists of all the population elements (e.g., objects, individuals, or other social units) whose attitudes, perceptions, behaviors, or knowledge the survey results are intended to represent.

Frequently, however, the target population includes members who are inaccessible or who cannot be identified in advance. As a consequence, some decisions must be made when identifying the sample population who will actually be sampled and from whom data will actually be gathered.

Inferences will be drawn about the target population of interest based upon the information obtained from the sample population. It is therefore critically important that the sample population be properly selected and that it is an accurate reflection of the target population. Moreover, when evaluating the relevance of survey results, it is important to consider whether the target population used for the survey is relevant to the specific issue in question.

A survey report must include a clear definition and description of both the target population and the sample population, as well as a discussion of the differences between the two and an evaluation of how those differences might influence the results of the survey and their interpretation. A survey report must also include a detailed discussion of how the sample population was selected. The way to evaluate a sample population is not by the characteristics of the sample, but by the process by which the sample was selected.

  • The Sampling Frame

The sampling frame is the set of objects, events, or people that has a chance to be selected, given the sampling approach that is chosen. Any sample selection procedure will give some elements (e.g., individuals) a chance to be included in the sample. The first step in evaluating the quality of a sample is to define the sampling frame. Most sampling schemes fall into three general classes:

1. sampling from a more or less complete list of individuals in the population to be studied;

2. sampling from a set of people who go somewhere or do something that enables them to be sampled (e.g., patients receiving a particular type of medical treatment); and

3. sampling in two or more stages, with the first stage involving sampling something other than the individuals to be selected, but within which the individuals are contained (e.g., a city block or metropolitan area). In two or more steps, these primary units are sampled, and eventually a list of individuals (or other identified sampling units) is created from which a final sample selection is made.

There are three characteristics of a sampling frame that should be evaluated:

  • Comprehensiveness;
  • Probability of Selection; and
  • Efficiency.

i. Comprehensiveness

A sample can only be representative of the sampling frame -- that is, the population that actually had a chance to be selected. Most sampling approaches leave out at least a few people from the population the researcher wants to study. Although some sample lists (e.g., registered voters, telephone directories, people with driver's licenses, homeowners) cover large segments of some populations, they also omit major segments with distinctive characteristics (e.g., a telephone directory excludes those individuals with unlisted telephone numbers; a list of homeowners excludes people who rent). A key part of any sampling scheme is determining the percentage of the study population that has a chance of being selected and the extent to which those that are excluded are distinctive.

ii. Probability of Selection

It is essential that the researcher, or those evaluating the research, be able to specify the probability of selection for each individual selected (e.g., individuals who appear on a list more than once have a higher chance of selection than those individuals who appear on a list only once).
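To make the parenthetical example concrete, the short Python sketch below computes each person's chance of being chosen on a single random draw from a list that contains a duplicate entry. The names on the list and the function `selection_probabilities` are hypothetical, introduced only for illustration.

```python
from collections import Counter
from fractions import Fraction

def selection_probabilities(listing):
    """Chance that each person is picked on one random draw from the list."""
    counts = Counter(listing)   # how many times each person appears
    total = len(listing)        # total number of entries on the list
    return {person: Fraction(n, total) for person, n in counts.items()}

# Hypothetical list on which "Baker" appears twice
listing = ["Adams", "Baker", "Baker", "Chen"]
probs = selection_probabilities(listing)
print(probs["Baker"], probs["Adams"])  # prints: 1/2 1/4
```

A person listed twice is twice as likely to be drawn as a person listed once, which is why duplicate entries must be identified before the probability of selection can be specified.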

iii. Efficiency

In some cases, sampling frames include units that are not among those that the researcher wants to sample and should be deleted. The researcher must be able to identify how the appropriate persons in the sample were included and how the inappropriate persons were excluded.

Questions to consider when evaluating survey evidence ...

  • Was the survey designed to address the relevant question of interest?
  • Was the design of the survey, its administration, and the interpretation of the results appropriately controlled to ensure objectivity?
  • Were the experts who designed, conducted, and/or analyzed the survey appropriately skilled and experienced?
  • Were the experts who presented the results of the surveys conducted by others appropriately qualified by skill and experience?

Probability Sampling

When a sample is chosen for a study, the primary objective is to draw a sample that approximates the target population as closely as possible. This is accomplished by selecting members of the population so that every member has a known and specified probability of being included in, or excluded from, the sample. This is known as probability sampling. The use of probability sampling techniques maximizes both the representativeness of the survey results and the ability to assess the accuracy of estimates obtained from the survey.

A probability sample is drawn from a population at random. That is, no systematic bias is permitted to creep into the selection process so that more people of any one kind are included in the sample than ought to be, given their numbers in the population and the goals of the survey.

Although a variety of probability sampling techniques exist, and they range in level of complexity, a review of the primary types of sampling techniques provides a basic understanding of the sampling process.

Simple Random Sampling

Simple random sampling approximates drawing a sample out of a hat -- members of a population are selected one at a time, independent of each other and without replacement. Once a unit is selected, it has no further chance to be selected (e.g., once a name is drawn out of a hat, it is not replaced and therefore cannot be re-selected).

For example, assume the researcher has a list of 8,500 individuals and each individual appears on the list only once. The goal of the survey is to select a simple random sample of 100 individuals. The researcher would number each individual from 1 to 8,500 and then using a computer, a table of random numbers, or through some other means of generating random numbers, the researcher would produce 100 different numbers occurring between 1 and 8,500. The individuals corresponding to the 100 numbers chosen would constitute a simple random sample of that population of 8,500.
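The selection just described can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure; the function name `simple_random_sample` and the fixed seed (used only so the draw can be repeated) are assumptions introduced here.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct ID numbers from 1..population_size."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no ID can be chosen twice
    return rng.sample(range(1, population_size + 1), sample_size)

# 100 individuals chosen from a numbered list of 8,500
selected = simple_random_sample(8500, 100, seed=1)
print(len(selected), len(set(selected)))  # prints: 100 100
```

The individuals whose list numbers appear in `selected` would make up the sample.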

Systematic Sampling

When drawing a systematic sample from a list, the researcher first determines the number of entries on the list and the number of elements from the list that are to be selected. Dividing the latter by the former will produce a sampling fraction.

For example, assume there is a list of 8,500 people and the researcher requires a sample of 100 -- 1/85 of the list (100/8,500 or 1 out of every 85 individuals) is to be included in the sample. In order to select a systematic sample, the starting point is designated by choosing a random number from 1 to 85 and then from that number taking every 85th person on the list (e.g., 24 is randomly selected as the starting number, then person 109 is selected, followed by person 194, 279, etc.). It is important to examine the population list to ensure that the list is not ordered in some systematic way or according to some re-occurring pattern that will affect the sample (e.g., if a list is always ordered as male/female then it may result in an over-representation of one gender).
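A short Python sketch of this procedure follows. It assumes, as in the example, that the list size divides evenly by the sample size; the function name `systematic_sample` and its parameters are hypothetical.

```python
import random

def systematic_sample(list_size, sample_size, start=None, seed=None):
    """Take every k-th list position, beginning at a random start in 1..k."""
    interval = list_size // sample_size  # 8,500 // 100 = 85
    if start is None:
        # Random starting point between 1 and the sampling interval
        start = random.Random(seed).randint(1, interval)
    return [start + k * interval for k in range(sample_size)]

# Reproduce the example in the text: starting number 24, every 85th person
picks = systematic_sample(8500, 100, start=24)
print(picks[:4])  # prints: [24, 109, 194, 279]
```

Only the starting point is random; every later selection follows from it, which is why a list ordered in a repeating pattern can distort the sample.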

Stratified Sampling

When a simple random sample is drawn, each new selection is independent and unaffected by any selections that come before it. As a result of this process, any of the characteristics of the sample may, by chance, differ somewhat from the population from which it is drawn. Often, little is known about the characteristics of individual population members before data collection. It is not uncommon, however, that at least a few characteristics of the population can be identified at the time of sampling. When this is the case, there is a possibility of structuring the sampling process to reduce the normal sampling variation, thereby producing a sample that is likely to be more efficient in its ability to reflect the total population.

Differential Probabilities of Selection

Sometimes stratification is used as a first step to vary the probability of selection across various population subgroups. For example, a target population may be subdivided on the basis of geographic location, membership in some group, ethnicity, and so forth. It is important to note that each element in the target population can belong to only one stratum. That is, the strata must be mutually exclusive.

The probability of selecting any given element within a stratum may differ across strata. For example, suppose stratum 1 comprises all African American individuals living in City X at a given point in time (100,000 individuals) and stratum 2 comprises all Hispanic American individuals living in the same city at the same time (10,000 individuals). Because there are more African American individuals living in the city than Hispanic American individuals, the probability of a specific African American individual being selected in stratum 1 (1/100,000) is less than the probability of a specific Hispanic American individual being selected in stratum 2 (1/10,000). If the researcher wants a sample population that contains an equal number of African American and Hispanic American individuals, the researcher must either over-sample from stratum 2 (Hispanic Americans) (i.e., increase the probability of a specific individual being selected) or under-sample from stratum 1 (African Americans) (i.e., decrease the probability of a specific individual being selected).
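The arithmetic behind differential probabilities of selection can be illustrated with a brief Python sketch. The stratum sizes come from the example above; the figure of 500 respondents per stratum and the function name `stratum_selection_probability` are assumptions made only for illustration.

```python
from fractions import Fraction

def stratum_selection_probability(stratum_size, n_sampled):
    """Chance that a given member is chosen when n_sampled are drawn at random."""
    return Fraction(n_sampled, stratum_size)

strata = {"stratum 1": 100_000, "stratum 2": 10_000}
n_per_stratum = 500  # hypothetical: equal numbers drawn from each stratum

for name, size in strata.items():
    print(name, stratum_selection_probability(size, n_per_stratum))
# prints:
# stratum 1 1/200
# stratum 2 1/20
```

Under these assumed numbers, drawing equal sample sizes makes a member of the smaller stratum ten times as likely to be selected as a member of the larger one, which is what over-sampling the smaller stratum means in practice.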

Cross-Sectional vs. Longitudinal Survey Designs

Cross-Sectional Design

A cross-sectional design involves administering the survey to a group of people at a given point in time, yielding data on the measured characteristics as they exist at the time of the survey. The information can be completely descriptive or it can involve testing relationships among different characteristics of the sample.

Longitudinal Design

The longitudinal or panel design involves administering the survey to the same group of people at different points in time. Longitudinal surveys make it possible to assess changes over time within individuals. It is often difficult, however, to obtain subjects who are willing to be surveyed several times and often large numbers of people drop out of the study before it is completed.


Stratum: a population characteristic (e.g., gender, ethnicity, geographic location) that serves as the basis on which the population can be divided; each population element can belong to only one stratum, and the strata are mutually exclusive.


Questions to consider when evaluating survey evidence ...

  • Was the sampling frame clearly defined?
  • Was the sampling frame comprehensive?
  • Were probabilities for selection of elements known?
  • Was the sample drawn using specifiable probabilities of selection?
  • Was the sample selection free of bias, or were some categories of people likely to be omitted or over- or under-represented?

The Importance of Considering Non-Responses

The accuracy of any particular inference from the sample to the target population depends on who provides an answer to a particular question. In every survey, there are some people who agree to be respondents and answer every question, others agree to be respondents but do not answer every question, and there are still others who refuse to be respondents. There are three categories of those selected to be in a sample who do not actually provide data:

1. those whom the data collection procedures do not reach, thereby not giving them a chance to answer the questions (e.g., wrong telephone numbers, people who screen calls with answering machines);

2. those asked to provide data but who refuse to; and

3. those asked to provide data but who are unable to perform the task required of them (e.g., people who do not speak the researcher's language, people whose reading and writing skills preclude filling out self-administered questionnaires).

The effect of non-responses on survey estimates depends on the percentage not responding and the extent to which those not responding are biased -- that is, the extent to which non-respondents differ systematically from those in the sample who did respond.
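The two factors named above can be combined in a simple calculation: the error introduced by relying on respondents alone equals the non-response rate multiplied by the difference between respondents and non-respondents. The Python sketch below illustrates this with invented figures; in practice the answers of non-respondents are usually unknown, so the numbers and the function name `nonresponse_bias` are purely hypothetical.

```python
def nonresponse_bias(response_rate, respondent_mean, nonrespondent_mean):
    """Gap between the respondent-only estimate and the full-sample value."""
    nonresponse_rate = 1.0 - response_rate
    return nonresponse_rate * (respondent_mean - nonrespondent_mean)

# Hypothetical: 60% respond; 50% of respondents hold a given view,
# while only 30% of non-respondents would have reported that view.
bias = nonresponse_bias(0.60, 0.50, 0.30)
print(f"{bias:.2f}")  # prints: 0.08
```

In this invented case the survey would report 50% when the value for the full sample is 42%. The overstatement shrinks if either the non-response rate falls or non-respondents resemble respondents more closely.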

Although there is no agreed-upon standard for a minimum acceptable response rate, it is generally the case that a survey with a higher response rate will produce a better and less biased sample than one that has a higher level of non-response.

Examples of Biases in Non-Responses

Mail Surveys -- people who have a particular interest in the subject matter or the research itself are more likely to return mail questionnaires than those who are less interested; better-educated or more literate people usually return mail questionnaires more quickly than those with less education.

Telephone Surveys -- certain types of people will tend not to be home during certain times of the day (e.g., if telephone surveys are conducted between 9 a.m. and 5 p.m. Mondays to Fridays the sample will yield a high proportion of homemakers, retired people, and the unemployed).


Different Methods of Data Collection: In-Person, Telephone, and Mail Surveys

In-Person Interviews

Advantages: probably the most effective way of enlisting cooperation; rapport and confidence-building are facilitated; longer interviews are easier to complete; interviewer can address respondent concerns directly; interviewer controls the flow, pace, and order of the interview

Disadvantages: can be very costly; requires highly trained interviewers who are located near interview respondents; data collection typically takes longer; some respondents (e.g., those in areas with high crime rates, or rural areas) may be more easily accessed by other data collection modes

Telephone Interviews

Advantages: lower cost than in-person interviews; helpful if population to be sampled is large and/or geographically diverse; better access to certain populations; shorter data collection period; interviewer staffing and supervision easier (e.g., interviewers do not have to be located near to sample); random digit dialing (RDD) techniques can be used (RDD provides coverage of households with both listed and unlisted numbers by generating numbers at random from the frame of all possible telephone numbers -- e.g., numbers with a particular prefix); better response rate than mail surveys

Disadvantages: omits respondents without telephones; non-response is higher than with in-person interviewing; limits use of visual aids and interviewer observations; may be less appropriate for sensitive or personal questions

Mail Surveys

Advantages: relatively low cost; requires minimal staff and facilities; provide access to widely dispersed samples or to samples that might otherwise be inaccessible; respondents can take time to think about answers or look up records

Disadvantages: difficult to enlist cooperation; requires up-to-date and complete addresses; often have low response rates; some potential respondents may not have the necessary reading and writing skills to complete the survey without assistance


Sources of Error in Survey Research

  • Sampling Procedures Used
  • Question Construction
  • Interviewers
  • Coding of Responses
  • Data Entry
  • Interpretation

Methodological Issues to Consider

  • Sampling Procedures
  • Question Design
  • Demand Characteristics
  • Interviewer Biases
  • Interpretation of Results
  • Internal Validity
  • External Validity

Target Population: the larger population that consists of all the elements (e.g., objects, individuals, or other social units) whose attitudes, perceptions, behaviors, or knowledge the survey results are intended to represent; the larger population of interest

Population Element: a single member of the population

Sample Population: the smaller population of elements from whom information will actually be gathered; information gathered from the sample population will be used to infer information about the target population

Inferences will be drawn about the target population based upon the information from the sample population.


Questions to consider when evaluating survey evidence ...

  • Was the target population identified appropriately and defined properly?
  • Was the defined target population relevant to the issue in question?
  • Did the population members constitute individuals whose attitudes and/or behaviors are relevant to the dispute?
  • Did the sample population adequately reflect the target population -- the individuals whose attitudes and behaviors are relevant to the issue in question?


Sampling Frame: the set of objects, events, or people that has a chance to be selected; the sampling frame includes an identification of the sources (e.g., telephone book) from which elements will be drawn, a specification of the probability for selection, and a detailed discussion of the sampling process


Questions to consider when evaluating survey evidence ...

  • Was the sampling frame appropriately specified -- sources of information, sampling procedures to be used, probability of an element being selected?
  • Was the sampling frame appropriately comprehensive -- did it include all necessary population elements?


Before going any further, stop and reflect ...

  • A concern expressed by many judges in the national survey was that survey questions could be designed to elicit whatever information the researcher wanted. Think about how you would ask questions on a survey. How might you shape a question to elicit a certain response?
  • Recognizing that the way in which questions are designed may influence the answer given, can you think of some ways in which researchers can reduce the potential bias of question design?

Designing Good Questions

Good questions are reliable (providing consistent measures in comparable situations) and valid (answers correspond to what they are intended to measure). Designing good, reliable and valid questions presents a number of challenges to the survey researcher. For example, the researcher has to ensure that the questions are written in such a way that the respondent will understand what the question means and that the respondent's understanding of the question matches what the researcher intended. The researcher also has to construct questions in such a way as to minimize biases that may be inherent in the way in which the question is worded, multiple choice answers are provided, rating scales (e.g., on a scale of 0 to 5) are developed, questions are ordered, and so forth.(3)

The Use of Filter Questions

Filter questions are used to screen out respondents who do not have an opinion on, or any knowledge of, the issue being addressed. For example, some survey respondents may have no opinion on an issue under investigation, either because they have never thought about it before, or because the question mistakenly assumes a familiarity with the issue. There are three approaches researchers generally use to deal with a 'don't know' possibility.

1. Simply ask the question directly and rely on the respondent to volunteer a 'don't know' response. Faced with a direct question, however, respondents may be unwilling to admit a lack of knowledge and instead guess at the answer.

2. Include a quasi-filter question to reduce guessing by providing a 'don't know' response alternative. By signaling to the respondent that it is acceptable not to know the answer, the filter reduces the demand for an answer and, as a result, the inclination to hazard a guess is reduced. Respondents are more likely to endorse a 'don't know' if it is mentioned explicitly by the interviewer than if it is merely accepted when the respondent spontaneously offers it as a response.

3. Include full-filter questions, that is, questions that lay the groundwork for the substantive question by first asking the respondent if he has an opinion about the issue. The interviewer then asks the substantive question only of those respondents who indicate that they have an opinion on the issue.

The choice among these three options and the way they are used can affect the rate of 'don't know' and 'no opinion' responses that a given question will evoke. For example, respondents are more likely to say they have no opinion when a full-filter question is used than if a quasi-filter question is used. It is important to recognize that the use of full-filter questions may produce an under-reporting of opinions. For example, full-filter questions may discourage respondents who actually have opinions from offering them by conveying the implicit suggestion that the respondent can avoid difficult or time-consuming follow-up questions by saying that he has no opinion.

In sum, a survey that uses full-filter questions tends to provide a conservative estimate of the number of respondents holding an opinion, while a survey that uses neither full-filter nor quasi-filter questions tends to over-estimate the number of respondents with opinions because some respondents offering opinions are guessing.

Open-Ended vs. Close-Ended Questions

When constructing a questionnaire, the researcher must decide what form, or forms, questions will take. Survey questions can be classified broadly into two forms: open and closed. Open-ended questions ask for a reply in the respondents' own words -- no answers are suggested. Close-ended questions (e.g., multiple choice questions) ask respondents to choose one of two or more response alternatives suggested to them.

Open-ended questions that do not provide answers allow respondents to answer according to their own frames of reference, without having to choose among specific alternatives suggested by the interviewer. Open-ended questions generally reveal what is most salient to respondents, what things are foremost in their minds. However, open-ended questions can also elicit a great deal of repetitious, irrelevant material. Respondents may miss the point of the question or engage in long, awkward silences as they try to organize and articulate their thoughts. The interviewer must then skillfully probe to bring respondents back to the subject, to clarify responses, and to encourage elaboration (the appropriate use of probes will be discussed later in this chapter). Individuals also differ a great deal in their ability to articulate their thoughts, with the result that differences in responses may reflect differences in ability to express opinions as much as real differences in shades of opinion. Close-ended questions ensure that respondents will choose among alternatives of interest to the investigator; but the list of alternatives might suggest answers that respondents had not thought of before, or force respondents into what may be an unnatural frame of reference, and they generally do not permit respondents to express their exact meaning.

The value of any open- or close-ended question depends on the information the question is intended to elicit. Open-ended questions are more appropriate when the survey is attempting to gauge what comes first to a respondent's mind, but close-ended questions are suitable for assessing the choices between well-identified options or obtaining ratings on a clear set of alternatives.

Rating Frequencies of Behavior

Often surveys will ask respondents to indicate how frequently they engage in a particular behavior during a specified period of time. This is a common type of close-ended question in which the respondent is given alternatives of, for example, "once a week," "twice a week," "one week a month," etc. How the response alternatives are presented in the close-ended format influences the respondent's judgment of frequency. For example, the response categories may provide implicit cues to the respondent about how rare or common the researcher expects the event to be. Researchers often use response categories of "sometimes" or "frequently," but the meaning of those terms may differ depending upon the issue being addressed and the person providing the answer (e.g., does the respondent define "sometimes" in the same way that the researcher does?). The influence of response categories on reporting of frequencies of behavior tends to be more pronounced when respondents have difficulty recalling behaviors, either due to poor memory or because the behavior is not very distinctive.

Rating Scales

Rating scales are frequently used in survey research. For example, respondents might be asked to provide a rating of the amount of experience they have had with some event on a scale from 0 (no experience) to 10 (a great deal of experience).

Research has shown that the scale used can influence the answer given (e.g., a scale of 0-10 vs. a scale of -5 to +5). For example, respondents in a research study were asked the following question: "How successful would you say you have been in life?"(4) Respondents were then asked to rate their answer on a scale ranging from "not at all successful" to "extremely successful." One group of respondents was told to rate their answers on a scale of 0 (not at all successful) to 10 (extremely successful), while the other group was told to rate their answers on a scale of -5 (not at all successful) to +5 (extremely successful). The research demonstrated that differences between the endpoints of the scales influenced the respondents' interpretation of what is meant by "not at all successful." That is, when "not at all successful" was assigned a value of 0 on the 0 to 10 scale, respondents interpreted "not at all successful" to mean the absence of outstanding achievements. By contrast, when "not at all successful" was assigned a value of -5 on the -5 to +5 scale, respondents interpreted "not at all successful" to mean the presence of explicit failures.(5)
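The two scales in the study above are arithmetically interchangeable, which is what makes the finding notable. The short Python sketch below is an illustration only and is not part of the cited research; the function name `rescale` is hypothetical. It shows that every point on a -5 to +5 scale maps onto exactly one point on a 0 to 10 scale, so any difference in answers must come from how respondents read the numbers rather than from the numbers themselves.

```python
def rescale(value, old_min, old_max, new_min, new_max):
    """Linearly map a rating from one numeric scale onto another."""
    if not old_min <= value <= old_max:
        raise ValueError("rating falls outside the original scale")
    # Multiply before dividing so whole-number ratings stay exact.
    return (value - old_min) * (new_max - new_min) / (old_max - old_min) + new_min


# Each point on the -5..+5 scale has one counterpart on the 0..10 scale.
mapping = {v: rescale(v, -5, 5, 0, 10) for v in range(-5, 6)}
for bipolar, unipolar in mapping.items():
    print(f"{bipolar:+d} on the -5..+5 scale corresponds to {unipolar:g} on the 0..10 scale")
```

A researcher who pooled answers from both versions after such a conversion would be assuming that the labels mean the same thing on each scale, which is the assumption the study calls into question.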

These findings, as well as findings from other research on survey methodology, indicate that answers given are influenced by the manner in which the questions and the response categories are designed. That is, respondents draw on implicit cues in the question and response categories when providing answers. This is especially true if questions are difficult or ambiguous. It is therefore important that researchers pay particular attention to how questions and response categories are designed and to acknowledge the potential influence of implicit cues on answers given.

The Use of Probes

When questions allow respondents to express their opinions in their own words, some of the respondents may give ambiguous or incomplete answers. In such cases, interviewers may be instructed to record any answer the respondent gives and move on to the next question, or they may be instructed to probe to obtain a more complete response or to clarify the meaning of a response. If probes are used, the wording of probes should be clearly defined and they should be used consistently across all interviewers. If probes are not used systematically across all interviewers, the probes themselves may introduce bias into the results (e.g., some interviewers may probe more than others and elicit more complete responses while other interviewers may under-probe or the probes themselves may provide implicit cues about how the respondent should answer the question).

The Importance of Question Order and Context

It is important to realize that the order of questions may influence the responses given. Each question-response provides context for the next question-response. That is, how a respondent interprets the meaning of a question is influenced by questions that come before it. Thus, the order of questions may be a confounding variable. Care should be taken in how questions are ordered and according to what logical sequence. Particular care should be taken in the placement of sensitive questions (e.g., questions about personal issues that might make the respondent uncomfortable should not be the initial questions, as such placement might interfere with the respondent's willingness to continue).

Often survey researchers will develop two versions of a survey instrument. Each version reflects a different ordering of questions, or "blocks" of questions. Statistical checks can then be conducted to determine whether there were any order effects - that is, whether differences in responses can be attributed to the way in which questions were ordered.
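One common way to run such a check is a chi-square test of independence comparing the answers to the same question across the two versions. The following Python sketch is a minimal illustration under stated assumptions: the response counts are invented, the function name `chi_square_two_versions` is hypothetical, and the 5 percent critical value of 5.991 applies only to a table with two versions and three answer categories (two degrees of freedom).

```python
def chi_square_two_versions(counts_a, counts_b):
    """Pearson chi-square statistic comparing answer counts from two survey versions."""
    if len(counts_a) != len(counts_b):
        raise ValueError("both versions must use the same answer categories")
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand_total = total_a + total_b
    statistic = 0.0
    for a, b in zip(counts_a, counts_b):
        column_total = a + b
        if column_total == 0:
            continue  # a category nobody chose adds nothing to the statistic
        expected_a = total_a * column_total / grand_total
        expected_b = total_b * column_total / grand_total
        statistic += (a - expected_a) ** 2 / expected_a
        statistic += (b - expected_b) ** 2 / expected_b
    return statistic


# Hypothetical counts of "agree", "neutral", "disagree" for one question.
version_a = [60, 25, 15]  # question asked early in the instrument
version_b = [45, 25, 30]  # same question asked after a related block

statistic = chi_square_two_versions(version_a, version_b)
CRITICAL_VALUE_05_DF2 = 5.991  # 5% level, (2 - 1) * (3 - 1) = 2 degrees of freedom
order_effect_suspected = statistic > CRITICAL_VALUE_05_DF2
print(f"chi-square = {statistic:.3f}; order effect suspected: {order_effect_suspected}")
```

A statistic above the critical value suggests that the distribution of answers differs between versions by more than chance alone would explain, provided respondents were assigned to the two versions at random.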

The Importance of Pilot-Testing

In a pilot, or pretest, the proposed survey is administered to a small sample of individuals who are the same as, or very similar to, the individuals who would be eligible to participate in a full-scale survey. During the pilot test, the researchers observe the respondents for any difficulties they may have with questions and probe the source of any difficulties so that questions can be rephrased if confusion or other problems arise. The length of the survey, both in terms of the number of questions and the time it takes to complete, can also be reviewed during the pilot test. If a survey instrument is overly long and cumbersome, or if it takes a long time to complete, respondents are less likely to agree to participate, or, if they agree, they are less likely to complete the survey. For self-administered surveys, pilot tests are often conducted with a focus or discussion group of individuals who each complete the survey and then provide feedback to the researchers regarding question clarity, areas of confusion, and so forth.

III. Interviewing Techniques

Although not all surveys involve interviewing (some surveys have respondents answer self-administered questions), it is certainly common to use an interviewer to ask questions and record answers. When interviewers are used, it is important to ensure that the interviewer does not influence the answers respondents give, while at the same time maximizing the accuracy with which questions are answered.

Interviewers must be properly trained on the overall purpose and goals of the survey, the question-specific objectives, and proper interview techniques (e.g., reading questions as stated, the appropriate use of probes). The more complex the survey, the more highly trained the interviewers should be. Inadequately trained and supervised interviewers can be a serious source of inaccuracy and uncertainty in survey research.

Researchers can use a variety of validation techniques to ensure that the survey is administered in such a way as to minimize error and bias. For example:

  • researchers can, and in fact should, monitor interviews as they occur and closely supervise interviewers;
  • researchers can contact a small sample of respondents to ensure that the interview took place and that they were qualified to participate; and
  • researchers can compare the work done by each individual interviewer - for example, by reviewing the interviews and individual responses recorded by each interviewer, any response patterns and inconsistencies can be identified and addressed.
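The last of these checks, comparing the work of individual interviewers, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a standard procedure: the records are invented, the function names are hypothetical, the share of "don't know" answers is only one of many patterns a researcher might review, and the tolerance of 0.25 is an arbitrary choice.

```python
from collections import defaultdict


def dont_know_rates(records):
    """Return {interviewer: share of recorded answers that were "don't know"}."""
    totals = defaultdict(int)
    dont_knows = defaultdict(int)
    for interviewer, answer in records:
        totals[interviewer] += 1
        if answer == "don't know":
            dont_knows[interviewer] += 1
    return {name: dont_knows[name] / totals[name] for name in totals}


def flag_interviewers(records, tolerance=0.25):
    """List interviewers whose rate differs from the pooled rate by more than tolerance."""
    pooled = sum(1 for _, answer in records if answer == "don't know") / len(records)
    rates = dont_know_rates(records)
    return sorted(name for name, rate in rates.items() if abs(rate - pooled) > tolerance)


# Hypothetical records of (interviewer, recorded answer).
records = (
    [("A", "yes")] * 8 + [("A", "don't know")] * 2
    + [("B", "yes")] * 9 + [("B", "don't know")] * 1
    + [("C", "yes")] * 4 + [("C", "don't know")] * 6
)

flagged = flag_interviewers(records)
print("Interviewers to review:", flagged)
```

A flag of this kind is a prompt for closer review of that interviewer's work, not proof of error; differences may also reflect the respondents each interviewer happened to be assigned.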

Characteristics of Good Questions

  • Questions should be clearly written, complete, and concise
  • Question meaning should be clear and consistent for all respondents
  • Questions should not be double-barreled (i.e., a single question that actually contains two questions)
  • Questions should not include double negatives
  • Questions and response alternatives should not be "loaded" (e.g., response alternatives should present both sides of the issue; not use overly emotion-laden terminology)
  • Questions should have mutually exclusive response categories
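The last requirement, mutually exclusive response categories, can be checked mechanically when the categories are numeric ranges, such as frequency-of-behavior alternatives. The Python sketch below is an illustration only; the function name `check_categories` and the example categories are hypothetical. It also reports gaps, since a respondent whose true answer falls between categories has no accurate option to choose.

```python
def check_categories(categories, lowest, highest):
    """Find overlaps and gaps among (label, low, high) inclusive whole-number ranges."""
    problems = []
    next_uncovered = lowest  # smallest value not yet covered by any category
    for label, low, high in sorted(categories, key=lambda c: c[1]):
        if low < next_uncovered:
            problems.append(f"overlap at '{label}'")
        elif low > next_uncovered:
            problems.append(f"gap before '{label}'")
        next_uncovered = max(next_uncovered, high + 1)
    if next_uncovered <= highest:
        problems.append("gap at top of range")
    return problems


# Hypothetical alternatives for "How many times in the past 30 days ...?"
well_formed = [("never", 0, 0), ("1-2 times", 1, 2), ("3-5 times", 3, 5), ("6 or more", 6, 30)]
flawed = [("never", 0, 0), ("1-3 times", 1, 3), ("3-5 times", 3, 5), ("7 or more", 7, 30)]

print(check_categories(well_formed, 0, 30))
print(check_categories(flawed, 0, 30))
```

In the flawed set, a respondent who did something three times could choose either of two categories, and one who did it six times has no category at all.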


Designing Good Questions: Issues to Consider

  • How the wording of questions may influence responses given
  • Whether the respondent interpreted questions in the manner intended by the researcher
  • Whether respondents with no opinions or no knowledge of specific issues were screened through filter questions and, if not, whether the consequences of not filtering respondents were recognized
  • How response categories, including rating scales, provide implicit cues to the respondent about how to answer the question
  • How probes, if used improperly or inconsistently, can influence answers given
  • How question order may influence responses

CRITICAL QUESTIONS REVIEWED

  • Were the experts who designed, conducted, and/or analyzed the survey appropriately skilled and experienced?
  • Were the experts who presented the results of the surveys conducted by others appropriately qualified by skill and experience?
  • Was the target population identified appropriately and properly defined?
  • Was the target population relevant to the issue in question?
  • Did the population members constitute individuals whose attitudes and/or behaviors are relevant to the issue in question?
  • Did the sample population adequately reflect the target population; that is, the individuals whose attitudes and behaviors are relevant to the issue in question?
  • Was the sampling frame appropriately specified, including the sources of information, the sampling procedures to be used, and the probability of an element being selected?
  • Was the sampling frame appropriately comprehensive; that is, did it include all necessary population elements?
  • Was the sample drawn using specifiable probabilities for selection?
  • Was the sample free of bias, or were some categories of people likely to be omitted or over- or under-represented?
  • Was the mode of data collection selected appropriate (e.g., in-person, telephone, mail)?
  • Were the disadvantages of the selected collection mode acknowledged?
  • Were questions constructed appropriately (e.g., clear, concise, consistent meaning)?
  • Were filter questions (e.g., "don't know's") used appropriately?
  • Were open- and close-ended questions used? How was the use of each type of question justified?
  • Were the potential biasing influences of question construction and response categories adequately acknowledged and minimized?
  • Were steps taken to guard against order and context effects?
  • Were appropriate pilot tests conducted and feedback incorporated into the final survey instrument?
  • Were interviewers appropriately selected, trained, and supervised?
  • What procedures were used to ensure and determine that the survey was administered in such a way as to minimize bias?

Endnotes:

1. For a discussion of the unique legal issues created by the use of survey results in legal proceedings, see Diamond, S.S. (1994). "Reference Guide on Survey Research." In Reference Manual on Scientific Evidence. Federal Judicial Center. Washington, D.C.: Government Printing Office, pgs. 221-272; Shapiro, M. et al. (1997). "Guide to Survey Research." In Black, B. and Lee, P.W. (Eds.). Expert Evidence: A Practitioner's Guide to Law, Science and the FJC Manual. West Publishing Co., pgs. 159-194.

2. Dillman, D.A. (1978). Mail and Telephone Surveys: The Total Design Method. New York: Wiley.

3. For a good overview of some of the problems and challenges of designing good survey questions see Schwarz, N. (1999). "Self-Reports: How the Questions Shape the Answers." American Psychologist, Vol. 54(2), pgs. 93-105. This article presents an overview of the issues and provides many illustrative examples, and it does so in a readable and accessible format.

4. Schwarz, N., Knauper, B., Hippler, H.J., Noelle-Neumann, E., and Clark, F. (1991). "Rating Scales: Numeric Values May Change the Meaning of Scale Labels." Public Opinion Quarterly, Vol. 49, pgs. 388-395, in Schwarz (1999), supra note 3.

5. Ibid.

GLOSSARY

close-ended questions questions which offer a series of response alternatives (answers) among which the respondent must choose

cross-sectional survey administering the survey to a group of people at a given point in time, yielding data on the measured characteristics as they exist at the time of the survey; the information can be completely descriptive or it can involve testing relationships among different characteristics of the sample

differential probabilities of selection when a sample population is stratified and the probability of selecting any given element within a stratum differs across strata

filter question a screening question that asks the respondent if he or she has an opinion about, or knowledge of, the issue; only those with an opinion about, or knowledge of, the issue are then asked the follow-up question

full-filter question asking respondents if they have an opinion on the issue and then asking the follow-up question only of those who have an opinion

longitudinal survey administering the survey to the same group of people at different points in time; makes it possible to assess changes over time within individuals; often difficult, however, to obtain subjects who are willing to be surveyed several times and often large numbers of people drop out of the study before it is completed

open-ended questions questions designed to allow respondents to answer in their own words; no answers are suggested

order effects responses to survey questions may be influenced by the order in which questions are asked; order effects introduce a confound into the research and draw into doubt conclusions drawn about the respondents' answers

population element a single member of the population

probability sampling every member of the population has a known and specified probability of being included in the sample

quasi-filter question providing the respondent with a "don't know" or "no opinion" response option; reduces the demand for an answer and reduces guessing on the part of the respondent

sampling frame the set of objects, events, or people that has a chance to be selected; includes an identification of the sources (e.g., telephone book) from which elements will be drawn, a specification of the probability for selection, and a detailed discussion of the sampling process

sample population the smaller population of elements from whom information will actually be gathered; information gathered from the sample population will be used to infer information about the target population

simple random sampling elements of a population are selected one at a time, independently of one another and without replacement; once an element is selected, it has no further chance to be selected

stratified sampling when at least a few characteristics of the population can be identified at the time of sampling, the sampling process can be structured so that the population is organized or stratified according to known characteristics, thereby producing a sample that is more likely to reflect the total population

stratum a population characteristic (e.g., gender, geographic location) that serves as the basis on which the population can be divided; the strata are mutually exclusive (i.e., a population element can only exist in one stratum)

target population the larger population that consists of all the elements (e.g., objects, individuals, or other social units) whose attitudes, perceptions, behaviors, or knowledge the survey results are intended to represent; the larger population of interest

SUGGESTED READINGS

Converse, J. and Presser, S. (1986). Survey Questions: Handcrafting the Standardized Questionnaire. Newbury Park: Sage Publications.

Diamond, S.S. (1994). "Reference Guide on Survey Research." In Reference Manual on Scientific Evidence. Federal Judicial Center. Washington, D.C.: Government Printing Office.

Dillman, D.A. (1978). Mail and Telephone Surveys: The Total Design Method. New York: Wiley.

Fowler, F. J. (1993). Survey Research Methods, 2nd Edition. Newbury Park: Sage Publications.

Schwarz, N. (1999). "Self-Reports: How the Questions Shape the Answers." American Psychologist, Vol. 54(2), pgs. 93-105.

Shapiro, M., Chase, J.L., Eshelman, R.L., Bode, H.J., Apjohn, N.G., and Leibensperger, E.P. (1997). "Guide to Survey Research." In Black, B. & Lee, P.W. (Eds.) Expert Evidence: A Practitioner's Guide to Law, Science, and the FJC Manual. West Publishing Co., pgs. 159-194.
