CHAPTER 5

of A Judge's Deskbook on the Basic Philosophies and Methods of Science,
by Shirley A. Dobbin, Ph.D., and Sophia I. Gatowski, Ph.D.

An Introduction to the Experimental Method

This chapter provides an introduction to the scientific method used in quantitative experimental and quasi-experimental research studies. The goal of this chapter is not to provide judges with the level of understanding necessary to design or conduct an experiment, nor is the goal to provide an in-depth and detailed discussion of scientific methods, validity and reliability concerns, and the like. Rather, the goal of this chapter is to provide judges with sufficient background knowledge about the general methods of experimental research and the key principles and concepts which underlie the scientific method, so that judges can ask informed questions, understand the answers (or know when to ask more questions) and make informed decisions about the admissibility of proffered scientific evidence. Recall that the questions and issues discussed herein are presented as indicators of the issues scientific experts should be addressing when proffering science for use in the court. These issues represent what judges should be listening for and what they should be asking about when the expert does not offer the appropriate information.

Upon completion of this chapter, you should be able to:

  • Articulate why it is important for judges to have a basic understanding of the experimental method;
  • Understand the importance of establishing a causal relationship in experiments and the difference between causality and correlation;
  • Understand the research process including:
    • hypothesis-testing
    • the difference between pure experimental and quasi-experimental designs and why this distinction is important
    • the importance of validity, including different types of threats to validity
    • the importance of reliability
  • Critically evaluate the general process of experimental research

Cause vs. Correlation

An experiment is a test of a causal proposition: Do changes in variable A cause systematic changes in variable B? However, extreme care must be taken in assuming that a cause-and-effect relationship has been demonstrated. This is especially true of quasi-experimental designs -- such designs must be evaluated critically to ensure that validity threats that may undermine the causal relationship or introduce extraneous variables are reduced or eliminated.

It is important to realize that correlated events are not necessarily causally related. Two events are correlated when the presence of a high value of one variable is regularly associated with a high or low value of another. Although a correlation shows that a relationship exists between two variables, that relationship may result from a common cause (both variables are affected by a third variable, not each other) or from the method used to gather the data.
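The common-cause case can be illustrated with a small simulation. The sketch below is hypothetical throughout: the variables A, B, and C, the noise levels, and the sample size are assumptions chosen for illustration, not data from any study. A and B are each generated from a third variable C and never from each other, yet they come out strongly correlated (about 0.8 in expectation for these assumed parameters).

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_common_cause(n=2000, seed=0):
    """Generate A and B so that each depends on C, never on the other."""
    rng = random.Random(seed)
    a_values, b_values = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)                      # hypothetical common cause
        a_values.append(c + rng.gauss(0, 0.5))   # A = C + independent noise
        b_values.append(c + rng.gauss(0, 0.5))   # B = C + independent noise
    return a_values, b_values


a, b = simulate_common_cause()
r = pearson_r(a, b)
print(f"correlation between A and B: {r:.2f}")
```

Nothing in the code lets A influence B or B influence A, so the correlation it reports cannot be read as evidence of a causal link between them.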

The Experimental Research Process: An Overview

The research process can be viewed as the overall scheme of systematic scientific activities which experimental researchers engage in with the goal of producing new knowledge. The research process consists of seven principal stages: (1) statement of the problem; (2) hypothesis development and hypothesis-testing; (3) research design; (4) measurement; (5) data collection; (6) data analysis; and (7) generalization. Each of these stages is interrelated with theory, in the sense that the theory both affects and is affected by them.

The most characteristic feature of the research process is its cyclical nature. Research usually starts with a problem and ends in a tentative empirical generalization. The generalization at the end of one cycle is the beginning of the next cycle. This cyclical process continues indefinitely, reflecting the progress of a scientific discipline and the ongoing accumulation of scientific knowledge. The research process is also self-correcting. Tentative generalizations to research problems are tested logically and empirically. If these generalizations are rejected, new ones are formulated and tested. In the process of reformulation, all the research operations are re-evaluated because the rejection of a tentative hypothesis might be due to a number of variables including deficiencies in research design, measurement, and data analysis, as well as improperly developed theoretical constructs.

In designing and carrying out research, the researcher moves through all phases of the research process beginning with ideas that are then refined and developed into one or more specific questions. The researcher then designs the procedures to be used to answer the questions and proceeds with the observations. The design phase is crucial in scientific research. The researcher must carefully and systematically plan each step, asking and answering a variety of conceptual and procedural questions along the way. The process begins with an idea that is refined into a statement of the problem.

Correlation: an increase or decrease in one variable is associated with a corresponding increase or decrease in a second variable; correlation does not equal causation.

Positive Correlation: an increase in one variable is associated with an increase in another variable.

Negative Correlation: an increase in one variable is associated with a decrease in another variable.
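The Deskbook does not name a particular statistic for correlation. One widely used measure, offered here as an assumption rather than as the authors' method, is the Pearson correlation coefficient for n paired observations (x_i, y_i) with sample means x-bar and y-bar:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The coefficient r lies between -1 and +1. A value above zero corresponds to a positive correlation, a value below zero to a negative correlation, and a value near zero to little or no linear association. A large value of r in either direction says nothing, by itself, about cause.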


Establishing Cause

Three requirements must be met before a causal connection between two events can be inferred:

  1. Covariation: two events must vary together; a change in one variable must be accompanied by a change in the other variable
  2. Temporal Ordering: in order for a variable to cause a change in the other, the cause must precede the effect
  3. No (or severely minimized) extraneous variables: all other possible causal variables must be ruled out.

The third requirement gives the experimental method its particular strength -- but it also fuels the argument that experimental results have no "real world" meaning. That is, the very act of controlling the situation carefully enough to eliminate extraneous or unwanted variables may make the situation so far removed from the real world that the results have no "real" meaning.

Induction and Deduction Revisited

Both induction and deduction are rational processes that are used constantly by scientists. It is the combination of these two kinds of thinking -- induction and deduction -- that characterizes science. When a researcher begins with empirical observations and, based upon those observations, infers constructs or theories, he is engaged in inductive reasoning. When these constructs or theories then serve as a basis for making predictions about new, specific observations, he has engaged in deductive reasoning. From the specific observation to the general idea; from the general idea back to the more specific observation; induction and deduction. The scientist uses both processes to build conceptual models and to validate them.

Recall that science is generally characterized as

  • the systematic organization of information about the world with the goal of discovering new relationships among natural phenomena
  • endeavoring to explain why phenomena occur and how they are related, and
  • the formulation of explanations in such a way that they can be empirically tested.

Recall that philosophers of science from a social constructivist perspective argue that scientific discovery is embedded in a social and political context.

Statement of the Problem

What a researcher identifies as a problem or an issue worthy of study is often influenced by attitudes about what constitutes legitimate science and what constitutes a legitimate problem to study, disciplinary training in particular theories and methodologies, the process of peer review, and the degree of institutional and financial support for different types of research (e.g., universities, research institutions, or funding agencies).

A researcher's initial idea may be somewhat vague but it does identify the variables to be studied. The researcher's initial ideas are transformed into a statement of the problem by building a prediction into the question. In experimentation, the problem statement focuses on a causal prediction (Does variable A cause a specific change in variable B?). The nature of the expected effect is also clearly stated (i.e., whether the predicted effect of variable A is to increase or decrease the level or occurrence of variable B).

Articulating the statement of the problem is an important early step in designing experimental research. In articulating the statement of the problem, the researcher should have clearly identified and defined the major variables to be studied and clearly specified the nature of the predicted effect of one variable on another variable.

Statement of the Problem

  1. the identification of at least two variables
  2. a statement about an expected relationship between the identified variables
  3. an indication of the nature or direction (e.g. increase or decrease) of the causal effect.


Questions to consider when evaluating scientific evidence:

  • Do you have a clear understanding of what the research was designed to study?
  • Do you understand the nature of the predicted relationship? That is, did the researcher clearly articulate the statement of the problem?

Hypothesis Development and Hypothesis-Testing

Most scientific research studies are attempts to test an hypothesis formulated by the researcher. An hypothesis is a type of idea; it states that two or more variables are expected to be related to one another. The research hypothesis has its beginning in initial ideas which are often vague and overly general. The researcher must carefully refine these initial ideas into a statement of the problem drawing on observations of the phenomenon, as well as a thorough review of previous research.

A well constructed hypothesis should be internally consistent and logical. An hypothesis that is clearly illogical or self-contradictory on its face should be rejected. An hypothesis should also be examined to ensure that it really provides insight and understanding into why observed phenomena occur; an hypothesis that requires constant modification to explain away contradictory results should be treated with skepticism. And a valid hypothesis in experimental research should also be able to survive experimental or observational tests that will show it to be false if it is wrong.

The researcher designs a study or experiment to test an hypothesis. Hypothesis-testing is a critical part of the experimental research process. The researcher makes a specific prediction concerning the outcome of the experiment. If the prediction is confirmed by the results of the experiment, the hypothesis is supported. If the prediction is not confirmed, the researcher will either reject the hypothesis or conduct further research using different methods to test the hypothesis.

Developing the research hypothesis is a major task for the researcher. It is the research hypothesis that is tested through the processes of systematically making, measuring, analyzing, and interpreting empirical observations under controlled conditions.

 

Hypothesis: a claim or prediction that two or more variables are expected to be related to one another.

Hypothesis-testing: the process of systematically testing an hypothesis

Research Hypothesis:

  1. identifies and operationalizes the independent and dependent variables;
  2. states the relationship between the independent and dependent variable; and
  3. allows for the possibility of empirically testing the relationship.

The research hypothesis is a complex statement that actually incorporates two hypotheses:

i. The Null Hypothesis; and

ii. The Experimental (or Causal or Alternative) Hypothesis.

i. The Null Hypothesis ( H0 )

The null hypothesis is what its name suggests; null means 'none.' The null hypothesis states that there is no difference between the two conditions beyond chance differences (e.g., Variable A has no effect on Variable B). If a statistically significant difference is found, the null hypothesis is rejected. If the difference is found to be within the limits of chance, it is concluded that there is insufficient evidence to reject the null hypothesis.
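The decision rule described above can be sketched with a permutation test, one of several ways to ask whether a difference lies "within the limits of chance." The Deskbook does not prescribe this test; it is used here only because it needs no statistical tables. The scores, the function name `permutation_p_value`, and the 0.05 threshold are all assumptions made for illustration.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Under H0 the group labels are arbitrary, so the scores are repeatedly
    reshuffled between the groups to see how often chance alone produces a
    difference at least as large as the one actually observed.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Hypothetical scores, invented purely for illustration.
control = [10, 12, 11, 13, 12, 11, 10, 12]
treated = [15, 17, 16, 18, 17, 16, 15, 17]

p = permutation_p_value(control, treated)
alpha = 0.05  # conventional significance threshold (an assumption here)
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"p = {p:.4f}: {decision}")
```

Note the wording of the two outcomes: the code either rejects H0 or fails to reject it. It never "accepts" the null hypothesis, which mirrors the language of the paragraph above.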

ii. The Experimental (or Causal or Alternative) Hypothesis ( H1 )

The experimental hypothesis states that a particular variable has a predicted effect on another variable. The nature of this effect or relationship can be stated in two ways: (1) The manipulation of the first variable causes an increase in the level of the second variable (an increase in variable A causes an increase in variable B); or (2) the manipulation of the first variable causes a decrease in the second variable (an increase in variable A causes a decrease in variable B).

The statement of the problem is converted into a research hypothesis when the theoretical concepts in the problem statement are described in terms of their procedures for measurement or manipulation. This process is called operationalization, or creating an operational definition of the concepts. By combining the statement of the problem and operational definitions within experimental research, the researcher makes a prediction about the effects of the specific, operationally defined independent variable on the specific, operationally defined dependent variable.

The independent variable (IV) is the presumed cause of some outcome under study; changes in an independent variable are hypothesized to have an effect on the outcome or behavior of interest. The independent variable is the variable that is experimentally manipulated by the researcher.

The dependent variable (DV) is a presumed effect. The dependent variable is predicted to change as a result of the manipulation of the independent variable. The value of the dependent variable (e.g., score) is dependent on the value of the independent variable.

It is very important that the independent variable and the dependent variable are clearly defined and clearly articulated. The operational definition of both the independent and dependent variable should accurately define what is meant by a particular variable. If the independent and dependent variables are not clearly defined and articulated -- that is, they are not clearly operationalized -- then it is difficult to determine with any degree of certainty whether or not the researcher has actually studied what she intended to study and whether the results of the experiment are valid and reliable.

Operational Definition: a description of an independent or dependent variable, stated in terms of how the variable is to be measured or manipulated.

 

Independent Variable: the presumed cause of some outcome under study; the experimentally manipulated variable; changes in an independent variable are hypothesized to have an effect on the outcome or behavior of interest.

 

Dependent Variable: a measure of presumed effect in a study; predicted to change as a result of the manipulation of the independent variable; the value of the dependent variable (e.g. score) is dependent on the value of the independent variable.

Questions to consider when evaluating scientific evidence:

  • How were independent and dependent variables operationalized? That is, do you clearly understand what each variable means (as indicated by operational definitions)?
  • Did the operational definitions adequately capture the full conceptual meaning of the variables?
  • Do you have a clear understanding of how the researcher intended to measure changes in the variables?
  • Did the researcher actually measure changes in the variables in the way originally intended?

III. Experimental Research Design

The components of the experimental process discussed so far underscore that careful and systematic planning and development are critical for well conducted scientific research. The research design is the detailed customized process for the systematic testing of the research hypothesis. The research design should have been clearly described in detail - it serves as a road map for other researchers to follow when replicating the research and it provides a basis for a critical review of the research methodology. Note that the experimental design refers to both the activity involved in the detailed planning of the experiment and to the detailed plan itself.

There is a great deal of variability in how research is designed and conducted. Each type of design carries with it strengths and weaknesses and different designs are more appropriate for answering certain kinds of research questions than others. The task of the researcher is to develop a research design that properly and appropriately tests the research hypothesis.

Experimental Variance

Variance is a necessary part of experimentation - without variation there would be no differences to test. When an experiment is conducted, the researcher predicts variation and hopes to determine that the variation between two or more research groups is due to experimental manipulation of the independent variable. However, as much as variation between experimental groups is the goal, the researcher (and those evaluating the research) must be cautious about unwanted or extraneous variation. Unwanted variation can occur in any study and can threaten the validity of the study by allowing for alternative explanations of results. This reduces confidence in drawing causal inference, in generalizing beyond the sample, and in interpreting results. Two primary forms of variance are:

  • Systematic Between-Groups Variance; and
  • Non-Systematic Within-Groups Variance.

  • Systematic Between-Groups Variance

For purposes of illustration, let us assume that a study has three levels of the independent variable (e.g., three different dosages of a given drug) and three groups, each of which gets one of the dosages. The researcher predicts that the dependent measure will differ across each of the three groups depending on the level of the independent variable (drug dosage) for that group. If the between-groups variance is not significantly high -- that is, the groups are essentially the same on the dependent measure -- then the study provides no evidence that the independent variable had an effect. Thus, a significantly high between-groups variance is needed to support the research hypothesis that the independent variable influenced the dependent variable as predicted.

The Concept of Falsifiability Revisited: Testing the Null Hypothesis

Popper argued that the objective of testing is the refutation of the hypothesis. When a theory's predictions are falsified, the theory will be rejected. Those theories that survive falsification are said to be corroborated and tentatively accepted, but not positively confirmed. Popper's notion of falsification is based upon the assumption that even after the testing and corroboration of predictions, an hypothesis has only achieved the status of 'not yet disconfirmed.'

In the process of hypothesis-testing, a finding that the difference between two experimental conditions is within the realm of chance results in a failure to reject the null hypothesis. Following the logic of Popper's concept of falsification, the null hypothesis has not yet been disconfirmed -- but the potential for rejecting the null hypothesis exists. That is, it is still possible for the null hypothesis to be rejected given future tests of that hypothesis. According to Popper, a theory that does not have the potential to be falsified in this way is not scientific.

It is important to note, however, that while it may be the case that the null hypothesis cannot be absolutely accepted, in many practical contexts decisions have to be made and actions have to be taken as though the null hypothesis were true. This is especially true in applied research where decisions have to be made based upon imperfect knowledge.

 

 

However, even if the between-groups variance is high, the researcher must be careful in drawing conclusions about a causal relationship. The significant difference may be due to the systematic effects of the independent variable as predicted by the research hypothesis (experimental variance), to the systematic effects of uncontrolled extraneous variables (extraneous variance), or to a combination of the two. Thus, between-groups variance is a function of both experimental effects and confounding effects. High experimental variance is important for the experiment. High extraneous variance is a serious problem and it makes conclusions about causality difficult, if not impossible, to draw. Thus, it is critically important that researchers seek to maximize the experimental variance and control for, or minimize, the extraneous variance.

  • Non-Systematic Within-Groups Variance

The term error variance is often used to denote the non-systematic within-groups variability. Error variance is due to random factors that affect only some subjects within a group. For example, some subjects may score lower than other subjects in the group because they are tired or anxious, or because of individual differences in personality, motivation, or interest. Error variance is also increased by experimenter errors or equipment variations that cause measurement errors for some subjects in a group but not for others. Error variance is the variation among individuals within a group that is due to chance factors. Because no two individuals are alike and no procedures are perfect, there will always be some degree of error variance. Error variance is random and therefore has random effects. If random errors cause some subjects to score lower than others, it is reasonable to assume that random error will cause other subjects to score higher -- that is, the effects of within-groups random error tend to cancel each other out.
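The claim that random errors tend to cancel can be sketched numerically. In the hypothetical example below, every subject has the same assumed "true" score of 50, and each observed score adds an independent random error. The true score, the error spread, and the group sizes are assumptions chosen for illustration.

```python
import random
import statistics


def mean_observed_score(true_score, n_subjects, error_sd=5.0, seed=0):
    """Average of observed scores, each equal to the true score plus random error."""
    rng = random.Random(seed)
    observed = [true_score + rng.gauss(0, error_sd) for _ in range(n_subjects)]
    return statistics.fmean(observed)


small_group = mean_observed_score(50, 10)
large_group = mean_observed_score(50, 10_000)
print(f"mean of 10 subjects:     {small_group:.2f}")
print(f"mean of 10,000 subjects: {large_group:.2f}")
```

Individual scores still scatter widely around 50, but because the errors fall on both sides, the average of a large group sits very close to the true score. The cancellation applies to the group mean, not to any single subject's score.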

To show a causal effect of the independent variable on the dependent variable, the experimental variance must be high and not be masked by too much extraneous or error variance. The greater the extraneous and/or error variance, the more difficult it becomes to show the effects of systematic, experimental variance. In experimentation, each study is designed to maximize the experimental variance, control extraneous variance, and minimize error variance.

  • Researchers must be aware of, explicitly acknowledge, and act to minimize threats to validity.
  • Researchers must ensure that subjects in different research groups are as similar as possible.
  • Researchers must ensure that subjects in experimental and control groups are treated in the same way with the exception of the experimental manipulation.
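The relationship between between-groups and within-groups variance discussed above is commonly summarized as a ratio (the F ratio of a one-way analysis of variance, a technique the Deskbook does not name here, so its use is an assumption). The sketch below computes that ratio for three hypothetical dosage groups; the scores and the function name `variance_components` are invented for illustration.

```python
import statistics


def variance_components(groups):
    """Return (between-groups mean square, within-groups mean square, F ratio)."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.fmean(all_scores)
    k = len(groups)
    n_total = len(all_scores)

    # Between-groups: how far each group mean lies from the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-groups (error): how far each score lies from its own group mean.
    ss_within = sum(
        (score - statistics.fmean(g)) ** 2 for g in groups for score in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical dependent-measure scores for three dosage groups.
low_dose = [4, 5, 6]
medium_dose = [7, 8, 9]
high_dose = [10, 11, 12]

ms_between, ms_within, f_ratio = variance_components(
    [low_dose, medium_dose, high_dose]
)
print(ms_between, ms_within, f_ratio)
```

A large ratio indicates that the groups differ by more than the scatter within them would suggest. The ratio cannot, however, separate experimental variance from extraneous variance: both are systematic and both inflate the numerator. Only the research design can rule out the latter.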

When critically evaluating research, it is important to ask some fundamental questions with respect to the research design:

  • Did the researcher use an experimental or quasi-experimental design?
  • Did the researcher appropriately control for extraneous, or confounding, variables that might influence the nature and strength of the relationship between the variables?
  • Did the researcher appropriately acknowledge and diminish threats to validity?

The answers to these questions place boundaries and limits on the type of interpretations that can be made about the research results obtained and the level of confidence with which one can make causal inferences about the relationship between the independent and dependent variables.

  • Did the researcher use an experimental or quasi-experimental design?

The primary difference between experimental and quasi-experimental research design is the degree of control the researcher exercises over the research situation and whether or not the researcher is able to randomly assign individuals (i.e., subjects) to different research groups.

Pure experimental designs are characterized by the ability to randomly assign subjects to different experimental conditions. Moreover, in pure experiments researchers have a great deal of control over the research groups and the research environment. In experimental research the researcher is generally able to manipulate the research situation or condition, to make causal predictions about the outcome, and to observe the resulting outcome. It is because the researcher has the ability to randomly assign subjects and manipulate the research situation, that he can draw causal inferences about the effect of one variable (i.e., the independent variable) on another variable (i.e., the dependent variable).

In pure experiments the control of variance is maximized through the use of random assignment of subjects and control groups. A control group serves as a basis of comparison for some other, experimental group. The ideal control group is similar to the experimental group on all variables except the variable of interest (the independent variable). This is achieved through random assignment of subjects. There are a large number of experimental research designs that vary in complexity; however, an in-depth discussion of experimental research design is beyond the scope and purpose of this Deskbook.
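Random assignment itself is mechanically simple. A minimal sketch follows, assuming subjects are identified by number and that groups should be of equal size; the function name `randomly_assign` and the group labels are hypothetical.

```python
import random


def randomly_assign(subject_ids, group_names, seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    assignment = {name: [] for name in group_names}
    for index, subject in enumerate(shuffled):
        group = group_names[index % len(group_names)]
        assignment[group].append(subject)
    return assignment


groups = randomly_assign(range(1, 21), ["experimental", "control"], seed=42)
for name, members in groups.items():
    print(name, sorted(members))
```

The key property is that chance, not the researcher or the subjects, decides who lands in which group. No characteristic of the subjects (age, health, motivation) can then be systematically tied to group membership, although chance imbalances remain possible in small samples.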

Pure experiments provide the highest degree of control, which allows causal conclusions to be drawn with the highest degree of confidence. However, there are times when the standards of a 'true' or 'pure' experiment cannot be met, but the researcher still wants to answer a causal question. In these situations, a quasi-experimental design ("quasi" means "similar to") can be used.

Quasi-experimental designs have the same general form as experimental designs including a causal hypothesis and some type of manipulation to compare two or more conditions or groups. However, in quasi-experimental designs researchers have less control over the research environment and do not randomly assign individuals to different research groups. That is, quasi-experimental research does not use random assignment to create equivalent comparison groups from which experimental cause is inferred. Instead, comparisons are made between non-equivalent groups that differ from each other in ways other than the presence or absence of some experimental variable whose effects are being tested. Quasi-experimentation requires separating the effects of an experimental manipulation from the effects due to the original non-comparability of the research groups. In order to separate these effects, the researcher must explicate specific threats to valid causal inferences and find some way to overcome, or at least minimize, these threats. The random assignment of subjects to experimental groups (or conditions) in pure experiments prevents most threats to validity, but in quasi-experiments the threats must be explicitly identified and handled. As with experimental designs, there are a variety of quasi-experimental research designs.

When evaluating scientific research, the distinction between experimental and quasi-experimental methods, and the implications of those distinctions, is an important one to recognize. For example, whether the research is experimental or quasi-experimental will influence whether random assignment of subjects and control groups are used; the extent to which confounding variables need to be explicated and controlled for; the extent to which a researcher can make a causal inference; and the degree to which one can have confidence in that inference.

  • Did the researcher appropriately control for extraneous, or confounding variables that might influence the nature and strength of the relationship between the variables?

When critically evaluating the results of any experimental or quasi-experimental research, it is important to consider whether or not the results might be due to, or at least affected by, the influence of some extraneous or confounding variable. In experimental research cause-and-effect conclusions are justified only when, all other things being equal, the manipulation of the independent variable leads to a resultant change in the dependent variable. The fact that other things must be kept equal, means that the experimental and control groups must be as identical as possible. Other possibly influencing variables must be held constant so that the only thing that really varies among the groups is the independent variable.

When other factors are inadvertently allowed to vary, these factors confound the results. Confounding variables can take a variety of forms. For example, confounds might include differences in the characteristics of the subjects (e.g., in a study of the effects of a particular kind of medication, the fact that all of the subjects in group 1 are over the age of 50 and all the subjects in group 2 are under the age of 50 would be a confound -- the potential relationship between taking the medication and some observed change might be due more to differences in age between the two groups than to dosage), characteristics of the research environment (e.g., differences in the results of an experiment conducted in two labs might be due more to the cleanliness of the labs), or characteristics of the instruments used (e.g., perhaps one instrument is better calibrated than another).

Confounding variables should have been anticipated by the researcher and their potential influence should have been reduced through experimental controls. Controlling or eliminating confounding variables is crucial. Each confounding variable is a threat to the validity of the experiment. This is especially true for quasi-experimental research designs.
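The age example above can be sketched as a simulation. Everything in it is hypothetical: the outcome is constructed to depend on age alone, so the medication has no effect at all, and the age range, noise level, and sample size are assumptions chosen for illustration. The code compares the medicated and unmedicated groups twice, once with groups split by age and once with groups formed by random assignment.

```python
import random
import statistics


def outcome(age, rng):
    """Hypothetical outcome that depends on age only; medication does nothing."""
    return 100 - 0.5 * age + rng.gauss(0, 2)


def mean_difference(medicated_ages, unmedicated_ages, rng):
    """Mean outcome of the medicated group minus that of the unmedicated group."""
    medicated = [outcome(age, rng) for age in medicated_ages]
    unmedicated = [outcome(age, rng) for age in unmedicated_ages]
    return statistics.fmean(medicated) - statistics.fmean(unmedicated)


rng = random.Random(1)
ages = [rng.randint(20, 80) for _ in range(400)]

# Confounded design: everyone over 50 is medicated, everyone else is not.
over_50 = [age for age in ages if age > 50]
under_51 = [age for age in ages if age <= 50]
confounded_gap = mean_difference(over_50, under_51, rng)

# Randomized design: shuffle the same subjects and split them in half.
rng.shuffle(ages)
randomized_gap = mean_difference(ages[:200], ages[200:], rng)

print(f"gap with age-split groups:  {confounded_gap:.2f}")
print(f"gap with randomized groups: {randomized_gap:.2f}")
```

With age-split groups, a sizable gap appears even though the medication is inert by construction; the gap is produced entirely by the age difference between the groups. With randomized groups, the ages are balanced on average and the gap shrinks toward zero.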

  • Did the researcher appropriately acknowledge, control for, and diminish threats to validity?

A major concern in research is the validity of the procedures and conclusions.

  • A valid measure measures what it is supposed to measure
  • A valid research design tests what it is supposed to test

Regardless of whether it has its basis in the physical, biological, social, or behavioral sciences, research influences our personal choices, the quality of our lives, our culture, and our environment. For example, psychological testing is prevalent in modern American society; it is estimated that hundreds of millions of achievement and intelligence tests are administered each year. Advances in the biological sciences, such as DNA testing, have impacted case law and the administration of justice; people who were convicted for crimes they did not commit are being freed from prison on the strength of DNA evidence, while others who might otherwise have gone unpunished for their crimes are being rightfully prosecuted. Children are being removed from their families on the basis of predictions of future likelihood of parental abuse or on assessments of poor parental competency. Examining the validity of research enables us to say with some confidence that we have achieved a degree of understanding with respect to some phenomenon; that we are measuring what we intend to measure; that the results achieved can safely be said to be specific to the conditions we created; and that the evidence we have gathered truly supports the conclusions we have drawn.

There are four main types of validity that are important to consider when evaluating the conclusions and interpretations made about a scientific study.

 

Types of Validity

Statistical Conclusion Validity

Construct Validity

External Validity

Internal Validity

  • Statistical Conclusion Validity

When statistical procedures are used to test the null hypothesis, a statement is being made about the statistical validity of the results. A threat to statistical conclusion validity occurs when there are concerns about the adequacy and appropriateness of the conclusion to reject or fail to reject the null hypothesis. There are three primary threats to statistical conclusion validity:

  • the possibility that the measures (e.g., calibrated instruments in a laboratory, surveys) used to assess the dependent variable are unreliable -- that is, the measures cannot be depended upon to measure true changes;
  • the possibility that experimental treatments are not consistently implemented across subjects, time, or experimenters; and
  • the possibility that the researcher has violated the assumptions that underlie specific statistical tests.

 

  • Construct Validity

Hypotheses are bound to theoretical ideas or constructs. Construct validity refers to how well the study's results support the theory or constructs behind the research and whether the theory supported by the findings provides the best available theoretical explanation of the results. There must be congruence between the conceptual definition and the operational definition. The level of congruence between the conceptual and operational definitions is closely related to the level of confidence the researcher can have in the construct validity of the theory under study. To help reduce threats to construct validity, the researcher should have clearly stated definitions and should have carefully built the hypotheses on solid, well-validated constructs. The theoretical bases for the research should be clear and well supported, with rival theories carefully ruled out.

  • External Validity

In the strictest sense, the results of an experiment are limited to those subjects and conditions used in the particular experiment. However, researchers typically want to be able to generalize the results beyond the specific conditions and subjects, and to be able to apply the findings to other similar subjects and conditions. External validity refers to the degree to which researchers can generalize the results of the research to other subjects, conditions, times, and places.

  • Internal Validity

Internal validity is of great concern to the researcher because it involves the very essence of experimentation -- the demonstration of causality. In an experiment, internal validity addresses the question: "Was the independent variable, and not some confounding (extraneous) variable, responsible for the observed changes in the dependent variable?" Internal validity refers to the approximate validity with which the researcher can infer that a relationship between two variables is causal or that the absence of a relationship implies the absence of cause. Threats to internal validity are especially salient for quasi-experimental designs.

It is possible that there exists more than one threat to internal validity in any given situation. The net biasing effect of internal validity threats depends on the number of threats, whether the existing threats to internal validity are similar or different in the direction of bias, and on the magnitude of any bias they cause independently. The more numerous and powerful the validity threats and the more similar the direction of their effects, the more likely it is that a false causal inference will be made.

In a classic text, Cook and Campbell (1) summarized the major types of confounding variables that can affect the results of experimental, and especially quasi-experimental, research designs and thus lead to erroneous causal inferences. The primary threats to internal validity, each defined in the Glossary, include history, maturation, testing, instrumentation, regression toward the mean, selection, attrition, and diffusion of treatment. A detailed discussion of the various threats to internal validity is beyond the scope and purpose of this Deskbook.

There are also threats to internal validity that are due to subject and experimenter factors. The expectations and biases of both the researcher and the subjects can systematically affect the results of the experiment in subtle ways, thus reducing the study's validity. Factors such as motivation, knowledge, expectations, and information or misinformation about the study can be powerful influences on behavior.

Recall that the major objective of an experiment is to demonstrate with confidence that the manipulated independent variable is the major cause of the observed changes in the dependent variable. When causality is not clear because some variable other than the independent variable may have caused the effect, then the internal validity of the research is threatened. While it may be that the independent variable did have some causal influence on the dependent variable, potential threats to internal validity that have not been explicitly identified and handled by the researcher undermine the ability to infer a direct causal relationship between the independent and dependent variable. Thus, threats to internal validity reduce the confidence one can have in the causal relationship between the independent and dependent variable.
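The way a confounding variable can masquerade as a causal effect can be sketched with a small numerical illustration. The counts below are invented, and the labels ("treated", "untreated", and a confounder with "low" and "high" levels) are hypothetical names chosen for this sketch only. The groups were not formed by random assignment, so most treated subjects happen to fall at the high level of the confounder.

```python
# Hypothetical counts, invented for illustration:
# (group, confounder level) -> (number of subjects, number showing the outcome)
counts = {
    ("treated", "low"): (20, 2),
    ("untreated", "low"): (80, 8),
    ("treated", "high"): (80, 40),
    ("untreated", "high"): (20, 10),
}

def outcome_rate(group, level=None):
    """Proportion showing the outcome in a group, optionally within one confounder level."""
    cells = [value for (g, lv), value in counts.items()
             if g == group and (level is None or lv == level)]
    subjects = sum(n for n, _ in cells)
    outcomes = sum(k for _, k in cells)
    return outcomes / subjects

# Comparison that ignores the confounder
crude_difference = outcome_rate("treated") - outcome_rate("untreated")

# Same comparison made separately at each level of the confounder
within_level = {
    level: outcome_rate("treated", level) - outcome_rate("untreated", level)
    for level in ("low", "high")
}

print(crude_difference)
print(within_level)
```

In this invented data set the treated group shows the outcome far more often than the untreated group overall, yet the difference disappears once subjects at the same confounder level are compared. The apparent effect is produced entirely by the uneven distribution of the confounder across groups, which is the kind of question a judge might ask an expert to address.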

Experimental control procedures are needed to counteract threats to validity so that researchers can have confidence in their conclusions. Many control procedures are available to meet the variety of threats to validity. But not every threat to validity is likely to occur in every experiment; thus, not every control measure is needed in every experiment. Although some control procedures are of general value and therefore applicable to nearly all scientific studies, many of the available controls must be carefully selected to meet the particular threats to validity present in the study. However, it is important to realize that controls are necessary in all kinds of research.
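One control procedure of general value is random assignment, defined in the Glossary as giving all subjects an equal chance of being assigned to a given experimental condition. A minimal sketch of how such an assignment might be carried out follows; the function name, the subject numbers, and the seed value are all hypothetical, and real studies may use more elaborate schemes.

```python
import random

def randomly_assign(subjects, seed=None):
    """Split subjects at random into two groups of (nearly) equal size."""
    rng = random.Random(seed)   # a fixed seed makes the assignment reproducible
    shuffled = list(subjects)   # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subject_ids = list(range(1, 21))  # twenty hypothetical subject numbers
experimental_group, control_group = randomly_assign(subject_ids, seed=2024)

print(sorted(experimental_group))
print(sorted(control_group))
```

Because chance alone decides who lands in which group, pre-existing differences between subjects tend to balance out across conditions rather than line up with the manipulation.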

IV. Measurement

Measurement in experimental and quasi-experimental research can take a variety of forms. However, what is most important is to assess the reliability of the measures used. Reliability is concerned with whether repeated efforts to measure the same phenomenon come up with the same results. Much of the work on reliability has been done in conjunction with testing.

Test-retest procedures examine the consistency of answers given by the same people to the same test items, or a parallel set of items, at two different test administrations. Test-retest reliability is reported as a correlation (a value of 1.0 indicates a test with perfect reliability; a value of 0.0 indicates the test has no reliability). When a measure consists of a number of items, internal consistency checks look at people's responses to different subsets of items in the instrument.
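As a rough illustration of where a test-retest coefficient comes from, the sketch below computes the Pearson correlation between two administrations of the same test. The scores are invented and the function name is hypothetical; an actual reliability study would involve many more subjects.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for the same five people tested on two occasions
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 12, 16, 17, 21]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 3))
```

A coefficient close to 1.0, as these invented scores produce, means that people kept nearly the same relative standing from one administration to the next.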

Reliability is not an inherent quality of the measure but a quality of the measure used in a particular context. Even when an existing measure is reported to have high reliability, it may not be very reliable when used by different raters with different subjects.

V. Data Collection

There is a great deal of diversity in how data can be collected. But, regardless of data collection procedures used, the procedures should have been well articulated as part of the research design and justifications should have been given for the selection of one collection procedure over another potentially appropriate procedure. As discussed earlier, it is important that potential confounds are identified and minimized.

VI. Data Analysis

Chapter 9 presents a brief overview of basic statistical procedures and principles. As with the methodologies used, the data analysis plan should have been carefully laid out in the design of the research. Different statistical tests rest on different assumptions about the underlying population of study, the nature of the inference being made, and the appropriate boundaries within which statistical significance can be determined. Inappropriate selection of statistical tests, violations of underlying statistical assumptions, and incorrect inferences of causality and significance represent significant threats to statistical conclusion validity and to the confidence with which results can be accepted.
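To make the logic of testing a null hypothesis concrete, here is a minimal sketch of an exact permutation test. It is shown only because it rests on few assumptions about the underlying population; it is one of many possible tests and not one this Deskbook prescribes. The scores and the function name are hypothetical.

```python
from itertools import combinations

def permutation_p_value(treatment, control):
    """One-sided exact permutation test for a difference in group means.

    Returns the proportion of all possible regroupings of the pooled scores
    whose mean difference is at least as large as the one actually observed.
    """
    pooled = treatment + control
    n_treat = len(treatment)
    n_control = len(pooled) - n_treat
    total = sum(pooled)

    def mean_diff(treat_sum):
        return treat_sum / n_treat - (total - treat_sum) / n_control

    observed = mean_diff(sum(treatment))
    at_least_as_large = 0
    n_regroupings = 0
    # Enumerate every way the pooled scores could have been split into groups
    for idx in combinations(range(len(pooled)), n_treat):
        n_regroupings += 1
        diff = mean_diff(sum(pooled[i] for i in idx))
        if diff >= observed - 1e-12:  # small tolerance for floating-point error
            at_least_as_large += 1
    return at_least_as_large / n_regroupings

treatment_scores = [14, 15, 16, 17]  # invented scores
control_scores = [10, 11, 12, 13]    # invented scores
p_value = permutation_p_value(treatment_scores, control_scores)
print(p_value)
```

With these invented scores, only one of the 70 possible regroupings yields a difference as large as the observed one, so the p-value is 1/70 (about 0.014) and the null hypothesis would be rejected at the conventional 0.05 level. Had the groups overlapped heavily, many regroupings would match or exceed the observed difference and there would be insufficient evidence to reject it.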

VII. Generalization

As briefly discussed in the section on external validity, the results of an experiment are technically limited to those subjects and conditions used in the particular experiment. However, it is typically the case that researchers want to be able to generalize the results beyond the specific conditions and subjects, and to be able to apply the findings to other similar subjects and conditions. The extent to which research findings are generalizable to another group of subjects, to other situations, and other times, depends a great deal on the amount of care taken in designing the research and minimizing threats to validity (especially threats to external validity). Although it is often the case that the results of well-designed research can be generalized, there must be some common relationship or characteristic across individuals, situations, or time.

In most cases, determining the generalizability of research findings to other people, settings, and time involves a replication of the research on another group of subjects, or in a different location, or under different constraints. As discussed at the beginning of this chapter, the research process is cyclical -- findings of one study provide questions for the next study.

CRITICAL QUESTIONS REVIEWED

 

  • Do you have a clear understanding of what the research was designed to study? Do you understand the nature of the predicted relationship? That is, did the researcher clearly articulate the statement of the problem?

  • How were independent and dependent variables operationalized? That is, do you clearly understand what each variable means (as indicated by operational definition)?
  • Do the operational definitions adequately capture the conceptual meaning of the variables?

  • Do you have a clear understanding of how the researcher measured changes in the variables?

  • Did the researcher clearly outline the research design? Do you have a good general overview of the major research steps involved?

  • Did the researcher provide adequate justifications for why decisions were made, especially if alternative methods were also appropriate?

  • Did the researcher use an experimental or quasi-experimental design?

  • If an experimental design, was random assignment used? Was an appropriate control group used?

  • If a quasi-experimental design, were appropriate steps taken to avoid, minimize or eliminate potential threats to validity? Was the appropriate level of care taken in expressing the strength or nature of the causal relationship?

  • Did the researcher appropriately control for extraneous, or confounding, variables that might influence the nature and strength of the relationship between the variables?

  • Did the researcher appropriately acknowledge and diminish threats to validity?

  • How was the reliability of the measures determined?

Endnotes:

1. Cook, T. D. & Campbell, D.T. (1979). Quasi-Experimentation: Design & Analysis Issues for Field Settings. Houghton Mifflin Company: Boston.

GLOSSARY

 

attrition a threat to internal validity when subjects drop out of a study differentially; if the number and type of people dropping out are not evenly distributed across groups, there may be a biasing effect

cause three elements to establishing cause: covariation between variables; temporal ordering; and no (or severely minimized) confounds

confounding variable an extraneous variable, unrelated to the experimental relationship of interest, that interferes with the researcher's ability to draw a causal connection between the independent variable and the dependent variable; the influence of the independent variable (Variable A) on the dependent variable (Variable B) cannot be disentangled from the possible influence of a confounding variable (Variable C)

construct validity refers to how well the study's results support the theory or constructs behind the research and whether the theory supported by the findings provides the best available theoretical explanation of the results

control (experimental) the major function of experimental controls is to rule out (or severely diminish) threats to a valid causal inference; control is used in three general ways: the ability to control the situation in which an experiment is being conducted so as to keep out extraneous forces; the ability to determine which subjects/conditions receive a particular experimental manipulation at a particular time; and the attempt to control for the knowledge, experiences, attitudes, and expectations of subjects

control group a group of subjects used in experimental research that serves as a basis of comparison for other (experimental) groups; the ideal control group is similar to the experimental group on all variables except the independent variable

correlation two events are correlated when the presence of a high value of one variable is regularly associated with a high or low value of another

covariation two events vary together; a change in one variable is associated with, but not necessarily caused by, another variable

deception the researcher misinforms the subjects about the purpose of the experiment, the hypothesis being tested, etc.; a technique to control for subject and experimenter effects that may threaten the validity of the experiment

demand characteristics cues implicit in the research setting or procedures, or unintentionally communicated by the researcher, that provide information to the subjects on how they should behave and/or information about the purpose and goals of the research

dependent variable a measure of presumed effect in a study; the DV is predicted to change as a result of the manipulation of the independent variable

diffusion of treatment a threat to internal validity when subjects in different experimental conditions are in close proximity and are able to communicate with each other; earlier subjects may "give away" the procedures used to those scheduled later

double blind procedure the researcher does not know which condition subjects have been assigned to and subjects do not know which condition they have been assigned to; a technique to reduce threats to validity

error variance random error that results from differences between subjects, experimenter errors, and/or equipment variations

experimental hypothesis states the effect the independent variable is predicted to have on the dependent variable

experimental studies characterized by the ability of the researcher to: manipulate the situation or condition; make predictions about the outcome; and observe the resulting outcome; because of the use of manipulation and experimental controls, it is possible to make causal inferences about the effect of the manipulation on the outcome

experimenter effects the possibility that the expectations, motivations, and biases of the researcher may systematically affect the results of the experiment in subtle ways, thus reducing the study's validity

experimenter expectancies the expectation of the scientist that the research may turn out in a particular way or have particular findings; such expectations may influence data selection, the way in which the research is designed, the statistical procedures used, and the interpretation of results

experimental variance differences between research groups are due to systematic effects of the independent variable on the dependent variable as predicted by the research hypothesis

extraneous variance differences between research groups are due to uncontrolled or extraneous variables rather than to the systematic effects of the independent variable on the dependent variable

external validity the degree to which researchers can generalize the results of the research to other subjects, conditions, times, and places

falsifiability a theory is only scientific to the extent that there is a potential for falsification; the goal of falsification is to refute (prove incorrect) a theory based upon observations gained through the scientific method

history effect a threat to internal validity when an observed effect might be due to an event which takes place between the pretest and the posttest, when this event is not the event of interest, or when the outcome or the effects of the experiment might be due to different life experiences of the subjects

hypothesis a type of idea; it states that two or more variables are expected to be related to one another

hypothesis-testing the process of systematically testing an hypothesis

independent variable the presumed cause of some outcome under study; the IV is the experimentally manipulated variable; changes in an independent variable are hypothesized to have an effect on the outcome or behavior of interest

instrumentation a threat when an effect might be due to a change in the measuring instrument between pretest and posttest or to the researchers becoming more proficient over time in administering tests or in making observations

internal consistency responses to different subsets of questions are compared for consistency; reliability check

internal validity refers to the approximate validity with which the researcher can infer that a relationship between two variables is causal or that the absence of a relationship implies the absence of cause

maturation effect a threat to internal validity when an observed effect might be due to the respondent's growing older, wiser, stronger, and more experienced and not due to the experimental manipulation

negative correlation an increase in one variable is associated with a decrease in another variable

non-experimental studies descriptive rather than predictive; they can demonstrate that a relationship exists between antecedent relationships and outcomes, but in most cases they cannot establish a causal connection

non-systematic within-group variance differences between subjects in the same group that are due to random influences; variability within a group

null hypothesis the null hypothesis states that there is no difference between the two conditions beyond chance differences; if a statistically significant difference is found, the null hypothesis is rejected, if the difference is found to be within the limits of chance, it is concluded that there is insufficient evidence to reject the null hypothesis

operational definition a description of an independent or dependent variable, stated in terms of how the variable is to be measured or manipulated

placebo effect merely being in the experiment produces the predicted effect in subjects, regardless of whether or not they were actually part of the experimental group

positive correlation an increase in one variable is associated with an increase in another variable

quasi-experimental studies incorporates experimental manipulations, outcome measures and comparison groups, but does not involve the random assignment of subjects to different experimental conditions

random assignment all subjects have an equal chance of being assigned to a given experimental condition; a procedure used to ensure that experimental conditions do not differ significantly from each other

regression toward the mean a threat to internal validity when the researcher selects subjects because their scores on a measure are extreme; on a second testing, scores have a tendency to be less extreme regardless of the experimental manipulation

reliability the extent to which multiple measures of a phenomenon produce the same results; the extent to which a measure is free from random error

research hypothesis a precise and formal statement of the research question; identifies and operationalizes the independent and dependent variables; states a relationship clearly between the independent and dependent variable; and clearly allows for the possibility of empirically testing the relationship

sample consists of members of the population who have been selected for observation in an empirical study; a sample should be representative of the larger population of interest; if the sample is not representative, then sampling bias exists and generalizations made on the basis of the results obtained from the sample are likely to be inaccurate and lack external validity

selection a threat to internal validity when care is not taken by the researcher to ensure that two or more groups being compared are equivalent before the manipulation begins

sequencing effects a threat when subjects are exposed to more than one experimental condition, their experiences in earlier conditions may influence their experiences in later conditions; if the order of presentation of conditions for all subjects is condition A, followed by condition B, followed by condition C, then systematic confounding effects can occur

single blind procedure the subjects do not know whether they have been assigned to the experimental or control conditions; a technique to reduce threats to validity

statement of the problem includes: identification of at least two variables; a statement about an expected relationship between identified variables; and an indication of the nature of the causal effect

subject effects the possibility that the expectations, motivations, and biases of the subjects may systematically affect the results of the experiment in subtle ways, thus reducing the study's validity

statistical conclusion validity are the results due to some systematic factor or are they due merely to chance variations (e.g., are measures reliable? have statistical assumptions been violated?)

systematic between-groups variance systematic differences between research groups on measures of the dependent variable as a function of different levels of the independent variable; includes influence of both experimental variance and extraneous variance

temporal ordering in order for one variable to cause a change in another, the cause must precede the effect; a necessary component of establishing cause

test-retest procedures used to examine consistency in answers given by the same people to the same test items, or a parallel set of items, at two different test administrations; reported as a correlation; reliability check

testing effect a threat to internal validity when an effect might be due to the number of times particular responses are measured or to the fact that the questionnaire itself alerts subjects to the subject matter or the goals of the research

validity concerned with whether the experiment/instruments/measures are really testing what they are supposed to test

SUGGESTED READINGS:

Topic-related resources prepared for the judiciary and legal community:

Black, B., Hoffman, J.M., Dunbar, J.F., Hogan, C.A., & Lavender, G.W. (1997). "The Law of Expert Testimony - A Post-Daubert Analysis." In Black, B. & Lee, P.W. (Eds.) Expert Evidence: A Practitioner's Guide to Law, Science, and the FJC Manual, West Publishing, pgs. 34-46.

For more technical treatments of the topic:

Cook, T. D. & Campbell, D.T. (1979). Quasi-Experimentation: Design & Analysis Issues for Field Settings. Houghton Mifflin Company: Boston.

Manly, B.F.J. (1992). The Design and Analysis of Research Studies. Cambridge: Cambridge University Press.

Montgomery, D.C. (1991). Design and Analysis of Experiments, 3rd Edition. New York: John Wiley
