Pre-test and post-test designs are widely used in behavioral research, mainly to compare groups and/or to measure the change resulting from experimental treatments. A pre- and post-test design is an experiment in which measurements are taken on individuals both before and after they receive some treatment.
Pre Test and Post Test Design
- Administer a pretest to a group of people and record their scores, then administer some treatment designed to change those scores.
- Administer a posttest to the same group of people and record their scores.
- Analyze the difference between the pretest and posttest scores.
Example: All students in a certain class take a pretest. The teacher then uses a certain teaching technique for a week and administers a posttest of similar difficulty. The teacher then analyzes the differences between the pretest and posttest scores to see whether the teaching technique had a significant effect on the scores.
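The analysis step in this example is often done with a paired (dependent-samples) t-test on the score differences. The following is a minimal sketch of that approach, not a method prescribed by the text; the function name `paired_t` and the eight students' scores are hypothetical and exist only for illustration.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (mean gain, t statistic, degrees of freedom) for paired scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    # One difference per student: posttest minus pretest
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_gain = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    t = mean_gain / (sd / math.sqrt(n))
    return mean_gain, t, n - 1


# Hypothetical scores for eight students (illustrative, not real data)
pre = [62, 70, 55, 68, 74, 60, 66, 71]
post = [68, 74, 61, 70, 80, 63, 72, 75]

mean_gain, t, df = paired_t(pre, post)
print(f"mean gain = {mean_gain:.3f}, t({df}) = {t:.2f}")
```

The t statistic would then be compared with a critical value from the t distribution with n - 1 degrees of freedom. Even a significant result only shows that scores changed, not that the teaching technique caused the change.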
Pre-Test and Post-Test design with control group
- Randomly assign individuals to a treatment group or a control group, administer the same pretest to everyone, and record their scores.
- Administer some treatment procedure to the individuals in the treatment group and administer some standard procedure to the individuals in the control group.
- Administer the same posttest to people in both groups.
- Compare the change from pretest to posttest scores in the treatment group with the change in the control group.
Example: A teacher randomly assigns half of her class to a control group and the other half to a treatment group. She then uses a standard teaching technique with the control group and a new teaching technique with the treatment group. She does this for a week and then administers a posttest of similar difficulty to all students.
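The random assignment and the comparison of score changes in this design can be sketched as follows. This is an illustrative sketch only: the helper names `assign_groups` and `mean_gain`, the student labels, and all scores are hypothetical, and the seed is fixed only so that the split is reproducible.

```python
import random
import statistics


def assign_groups(participants, seed=None):
    """Shuffle participants and split them into (treatment, control) halves."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_gain(ids, pre, post):
    """Average posttest-minus-pretest change for the given participant ids."""
    return statistics.mean(post[i] - pre[i] for i in ids)


# Hypothetical pretest and posttest scores keyed by student label
pre = {"s1": 61, "s2": 70, "s3": 55, "s4": 68, "s5": 74, "s6": 60, "s7": 66, "s8": 72}
post = {"s1": 66, "s2": 73, "s3": 62, "s4": 70, "s5": 79, "s6": 63, "s7": 71, "s8": 74}

treatment, control = assign_groups(pre, seed=42)
difference = mean_gain(treatment, pre, post) - mean_gain(control, pre, post)
print(f"treatment gain minus control gain: {difference:.2f}")
```

Comparing the two gains, rather than looking at the treatment group alone, is what lets the control group absorb changes that would have happened without the new technique.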
Possible problems with internal validity
Internal validity refers to the degree to which a study establishes a reliable cause and effect relationship between a treatment and an outcome. In a pre and post test design experiment, there are several factors that could affect internal validity, including:
History: Individuals experience some event outside of the study that affects measurements before and after a treatment.
Maturation: Biological changes in participants affect measurements before and after a treatment.
Attrition: an individual leaves the study before a further measurement can be made.
Regression to the mean: People who score extremely high or low on some measure tend to score closer to the average the next time, regardless of the treatment in which they participate.
Selection bias: the individuals in the treatment group and the control group are not really comparable.
Random selection and random assignment of individuals to groups can often minimize these threats to internal validity, but not in all cases.
Basic Pre-Test and Post-Test experimental designs
Internal validity is the degree to which the experimental treatment makes a difference (or causes a change) in specific experimental settings. External validity is the degree to which the treatment effect can be generalized across populations, settings, treatment variables, and measurement instruments.
Factors that threaten internal validity are: history, maturation, pre-test effects, instruments, statistical regression toward the mean, differential selection of participants, mortality, and interactions of factors.
Threats to external validity include: interaction effects of selection biases and treatment, the reactive interaction effect of pretesting, the reactive effect of experimental procedures, and multiple-treatment interference. For a detailed discussion of the threats to internal and external validity, readers can consult Bellini and Rumrill.
The notation used is:
Y1 = pre-test scores,
T = experimental treatment,
Y2 = post-test scores,
D = Y2 − Y1 (gain scores),
and RD = randomized design (random selection and assignment of participants to groups and then random assignment of groups to treatments).
ANCOVA with Pre Test and Post Test data
The purpose of using pre-test scores as a covariate in ANCOVA with a pretest-posttest design is to (a) reduce the error variance and (b) eliminate systematic bias. With randomized designs, the main goal of ANCOVA is to reduce the error variance, because random assignment of subjects to groups protects against systematic bias. It is important to note that when pre-test scores are unreliable, treatment effects can be severely biased in non-randomized designs.
Another problem with ANCOVA relates to differential growth on the dependent variable among subjects in intact or self-selected groups. Pretest differences (systematic bias) between groups can affect interpretations of posttest differences. Recall that ANCOVA rests on assumptions such as randomization, a linear relationship between the pretest and posttest scores, and homogeneity of the regression slopes.
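To make the covariate adjustment concrete, the sketch below computes the ANCOVA-adjusted difference between two group means using the pooled within-group regression slope of posttest on pretest. It assumes a common slope in both groups (the homogeneity assumption noted above) and omits the significance test; the function names and all scores are hypothetical.

```python
import statistics


def pooled_within_slope(groups):
    """Pooled within-group slope of posttest on pretest.

    `groups` is a list of (pre_scores, post_scores) pairs, one per group.
    """
    sxy = 0.0
    sxx = 0.0
    for pre, post in groups:
        mean_pre = statistics.mean(pre)
        mean_post = statistics.mean(post)
        sxy += sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
        sxx += sum((x - mean_pre) ** 2 for x in pre)
    return sxy / sxx


def ancova_adjusted_difference(treat, control):
    """Return (adjusted posttest difference, pooled slope) for two groups."""
    slope = pooled_within_slope([treat, control])
    (pre_t, post_t), (pre_c, post_c) = treat, control
    raw_diff = statistics.mean(post_t) - statistics.mean(post_c)
    pre_gap = statistics.mean(pre_t) - statistics.mean(pre_c)
    # Remove the part of the posttest gap predicted by the pretest gap
    return raw_diff - slope * pre_gap, slope


# Hypothetical (pretest, posttest) scores for each group
treat = ([54, 64, 74], [63, 68, 73])
control = ([50, 60, 70], [55, 60, 65])

adjusted, slope = ancova_adjusted_difference(treat, control)
print(f"pooled slope = {slope:.2f}, adjusted difference = {adjusted:.2f}")
```

In this toy data the treatment group starts with a higher pretest mean, so the adjusted difference is smaller than the raw posttest difference. With random assignment the pretest gap is expected to be near zero, and the covariate mainly serves to reduce error variance.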
ANOVA and ANCOVA in Gain Scores
In an attempt to avoid problems that could be created by a violation of these assumptions, some researchers use ANOVA on gain scores, without realizing that the same assumptions are required for the analysis of gain scores. Previous research has shown that when the regression slope is equal to 1, ANCOVA and ANOVA on gain scores produce the same F ratio, with the gain score analysis being slightly more powerful because of the degree of freedom lost with the analysis of covariance.
When the regression slope is not equal to 1, which is usually the case, ANCOVA will result in a more powerful test. ANCOVA can also be adapted when its assumptions do not hold. For example, if the relationship between the pretest and posttest scores is not linear, the ANCOVA can be expanded to include a quadratic or cubic component. Or, if the regression slopes are not equal, ANCOVA can lead to a procedure such as the Johnson-Neyman technique, which provides regions of significance.
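The link between the two analyses can be seen by writing both as linear models. The following is a sketch in standard linear-model notation: the symbols \mu, \mu', \tau_j, \beta, and \varepsilon_{ij} are introduced here for illustration and do not appear in the original text, while Y1, Y2, and D follow the notation defined earlier (i indexes participants and j indexes groups).

```latex
\begin{align*}
  % ANCOVA: posttest modeled with the pretest as covariate
  Y_{2ij} &= \mu + \tau_j + \beta\,(Y_{1ij} - \bar{Y}_{1}) + \varepsilon_{ij} \\
  % Gain-score ANOVA: the same model with the slope fixed at \beta = 1
  D_{ij} = Y_{2ij} - Y_{1ij} &= \mu' + \tau_j + \varepsilon_{ij},
  \qquad \mu' = \mu - \bar{Y}_{1}
\end{align*}
```

Setting \beta = 1 in the first model and moving the pretest term to the left-hand side gives the second, so gain-score ANOVA can be read as ANCOVA with the slope constrained to 1 rather than estimated from the data.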
ANOVA on residual scores
Compared with the ANCOVA model, the ANOVA on residual scores is less powerful, and some authors recommend that it be avoided. Maxwell, Delaney, and Manheimer warned researchers about a common misconception that the ANOVA on residual scores is the same as the ANCOVA. They showed that:
(a) when the residuals are obtained from the pooled within-group regression coefficient, the ANOVA on residual scores results in an inflated significance level α, and (b) when the regression coefficient for the total sample of all groups combined is used, the ANOVA on residual scores yields an inappropriately conservative test.
Measurement of the Dependent Variable in Pre Test and Post Test
Certainties in the Measurement of the Variable
If the mean post-test score is better than the mean pre-test score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, this often cannot be concluded with a high degree of certainty because there may be other explanations as to why the post-test scores are better. One category of alternative explanations is history: events outside the study that occur between the pre-test and the post-test. In a study of students' attitudes toward drugs, for instance, perhaps an anti-drug program was broadcast on television and many of the students saw it, or perhaps a celebrity died of a drug overdose and many of the students found out. Another category of alternative explanations is called maturation. If it were a one-year program, participants could become less impulsive or better reasoners, and this could be responsible for the change.
Changes in the Dependent Variable
Another alternative explanation for a change in the dependent variable in a pretest-posttest design is regression to the mean. This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly rolls a 220 will almost certainly score lower in the next game. The bowler's score will "regress" toward the mean score of 150.
Regression to the mean can be a problem when participants are selected for further study because of their extreme scores. Imagine, for example, that only students who scored especially poorly on a fraction test receive a special training program and are then retested. Regression to the mean practically guarantees that their scores will be higher even if the training program has no effect.
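A small simulation can illustrate this effect. The sketch below assumes each observed score is a stable "ability" plus independent random noise on each testing occasion; the function name, the distribution parameters, and the 10% selection cutoff are all hypothetical choices made for illustration. No treatment of any kind is applied between the two tests.

```python
import random
import statistics


def simulate_retest(n=2000, cutoff_fraction=0.1, seed=0):
    """Mean first and second test scores of the lowest first-test scorers."""
    rng = random.Random(seed)
    # Stable ability plus independent measurement noise on each occasion
    ability = [rng.gauss(70, 8) for _ in range(n)]
    test1 = [a + rng.gauss(0, 8) for a in ability]
    test2 = [a + rng.gauss(0, 8) for a in ability]
    # Select the students with the lowest first-test scores
    k = int(n * cutoff_fraction)
    lowest = sorted(range(n), key=lambda i: test1[i])[:k]
    first_mean = statistics.mean(test1[i] for i in lowest)
    second_mean = statistics.mean(test2[i] for i in lowest)
    return first_mean, second_mean


first, second = simulate_retest()
print(f"selected group: first test {first:.1f}, retest {second:.1f}")
```

Because the selected students were chosen partly for unlucky noise on the first test, their retest mean moves back toward the overall average even though nothing was done to them.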
A closely related and extremely important concept in psychological research is spontaneous remission. This is the tendency for many medical and psychological problems to improve over time without any treatment. The common cold is a good example. If 100 common cold sufferers were measured today for symptom severity, given a bowl of chicken soup every day, and then had their symptom severity measured again in a week, they would likely improve a lot.
However, this does not mean that the chicken soup was responsible for the improvement, as they would have improved a lot without any treatment. The same goes for many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. Reviewing the results of several depression treatment studies, researchers Michael Posternak and Ivan Miller found that participants in waiting-list control conditions improved by an average of 10 to 15% before receiving any treatment. Therefore, in general, great care must be taken when inferring causality from pretest-posttest designs.
Bellini and P. Rumrill, Research in rehabilitation counseling, Springfield, IL: Charles C. Thomas.
R.D. Bock, Basic issues in the measurement of change, in: Advances in Psychological and Educational Measurement, D.N.M. DeGruijter and L.J.Th. Van der Kamp, eds, John Wiley & Sons, NY, 1976, pp. 75–96.
A.D. Bryk and H.I. Weisberg, Use of the nonequivalent control group design when subjects are growing, Psychological Bulletin 85 (1977), 950–962.
I.S. Cahen and R.L. Linn, Regions of significant criterion difference in aptitude-treatment interaction research, American Educational Research Journal 8 (1971), 521–530.