Errors in the collection of statistical data can be classified as follows:
Data management errors
According to Gawande (2010), data management errors are often more idiosyncratic than systematic.
In one case, a group accidentally used reverse-coded variables, which made their conclusions the opposite of what the data supported. In another case, the authors received an incomplete data set because entire categories of data had been omitted; when this was corrected, the qualitative conclusions did not change, but the quantitative conclusions changed by a factor of more than 7. These idiosyncratic data management errors can occur in any project and, like statistical analysis errors, can be corrected by reanalyzing the data. In some cases, idiosyncratic errors can be avoided by following checklists.
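To illustrate how a mistake in coding direction can reverse a conclusion, here is a minimal Python sketch. The data are made up for illustration and do not come from any study cited here, and `reverse_code` and `covariance` are hypothetical helper names. The sketch shows the sign of an association flipping when a reverse-scored item is analyzed without being recoded.

```python
def reverse_code(value, low, high):
    """Map a score on a low..high scale to its mirror image (e.g. 1 <-> 5)."""
    return low + high - value


def covariance(xs, ys):
    """Sample covariance; its sign gives the direction of the association."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical data: a 1-5 item that was stored reverse-scored, and an outcome.
raw_item = [5, 4, 4, 2, 1]
outcome = [10, 12, 13, 17, 20]

recoded_item = [reverse_code(v, 1, 5) for v in raw_item]

cov_raw = covariance(raw_item, outcome)          # negative: misleading direction
cov_recoded = covariance(recoded_item, outcome)  # positive: intended direction
```

A checklist item such as "confirm the scoring direction of every variable before analysis" is the kind of safeguard that could catch this.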
Errors in storage and data exchange
Errors in the long-term storage and exchange of data can make findings unverifiable because the data are not available for re-analysis. Many meta-analysts have tried to obtain additional information about a study but were unable to do so because the authors did not respond, could not find the data, or were not sure how they had calculated their original results.
Statistical analysis errors
Statistical analysis errors involve methods that do not reliably support the conclusions. They can occur if the underlying assumptions of the analyses are not met, erroneous values are used in the calculations, the statistical code is mis-specified, incorrect statistical methods are chosen, or the result of a statistical test is misinterpreted, regardless of the quality of the underlying data.
Causes of statistical analysis errors
Bent et al. (2003) point to the following causes:
First, the incorrect analysis of group-randomised trials
Such an analysis may inappropriately and implicitly assume that observations are independent. Worse, when there is only one cluster per group, the clusters are completely confounded with the treatment, leaving zero degrees of freedom to test the group effects. This error has also led to a retraction.
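The zero-degrees-of-freedom problem can be made concrete with a short Python sketch. It assumes a simple cluster-level analysis (for example, a t-test or one-way ANOVA on one summary value per cluster), for which the degrees of freedom are the total number of clusters minus the number of groups. The function name `cluster_level_df` is hypothetical.

```python
def cluster_level_df(clusters_per_group):
    """Degrees of freedom for comparing groups using one summary per cluster.

    Assumes a simple cluster-level analysis:
    df = total number of clusters - number of groups.
    """
    return sum(clusters_per_group) - len(clusters_per_group)


df_ten_each = cluster_level_df([10, 10])  # 18 degrees of freedom
df_one_each = cluster_level_df([1, 1])    # 0: cluster and treatment cannot be separated
```

However many individuals are measured inside each cluster, the second design leaves nothing with which to estimate between-cluster variation.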
Second, effect sizes for meta-analyses may mishandle multiple treatment groups
Examples include assuming independence even though the groups share a control group, or not using the correct variance component in the calculations. Meta-analytic estimates built on these effect sizes may in turn be incorrect, and some have required correction.
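Why a shared control group breaks independence can be sketched in a few lines of Python. The sketch assumes the effect size is a raw (unstandardised) mean difference, treatment mean minus control mean. Two such differences that use the same control group then share the sampling error of the control mean, so their covariance equals the variance of that mean. The function name and the numbers are hypothetical.

```python
def shared_control_covariance(sd_t1, n_t1, sd_t2, n_t2, sd_c, n_c):
    """2x2 covariance matrix of two raw mean differences sharing one control.

    d1 = mean(treatment 1) - mean(control)
    d2 = mean(treatment 2) - mean(control)
    """
    var_control_mean = sd_c ** 2 / n_c
    var_d1 = sd_t1 ** 2 / n_t1 + var_control_mean
    var_d2 = sd_t2 ** 2 / n_t2 + var_control_mean
    cov_d1_d2 = var_control_mean  # nonzero: the two effect sizes are not independent
    return [[var_d1, cov_d1_d2], [cov_d1_d2, var_d2]]


# Hypothetical summary statistics for a three-arm study.
matrix = shared_control_covariance(sd_t1=10, n_t1=50, sd_t2=12, n_t2=40, sd_c=10, n_c=50)
```

Treating the off-diagonal entry as zero counts the control group's information twice and overstates the precision of a pooled estimate.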
Third, it is inappropriate to compare the nominal significance of two independent statistical tests as a means of drawing a conclusion about differential effects.
This “differences in nominal significance” (DINS) error is sometimes made in studies with more than one group, in which the final measurements are compared with baseline separately for each group. If one comparison is significant and the other is not, an author may erroneously conclude that the two groups differ.
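A small numerical sketch shows how this error arises. It uses hypothetical change-from-baseline estimates and standard errors, a normal (z) approximation for the p-values, and assumes the two groups are independent. One within-group test is significant and the other is not, yet the direct between-group test is nowhere near significance.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(estimate, standard_error):
    """Two-sided p-value from a normal (z) approximation."""
    z = abs(estimate / standard_error)
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical change-from-baseline estimates and their standard errors.
change_a, se_a = 2.0, 0.9
change_b, se_b = 1.5, 0.9

p_a = two_sided_p(change_a, se_a)  # below 0.05
p_b = two_sided_p(change_b, se_b)  # above 0.05

# The appropriate comparison tests the between-group difference directly.
diff = change_a - change_b
se_diff = sqrt(se_a ** 2 + se_b ** 2)  # assumes independent groups
p_diff = two_sided_p(diff, se_diff)    # well above 0.05
```

Only `p_diff` addresses the question of whether the groups differ; comparing `p_a` with `p_b` does not.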
Effects of statistical analysis errors
The effects of these errors on conclusions can be serious. However, when treatment effects are misanalyzed, we often cannot immediately say that the conclusions are false; we can only say that the analyses are not reliable for statistical inference and conclusions. Resolving the problem then requires contacting the authors and the journal editors. In other cases, the conclusions may obviously be wrong.
If a DINS error is made in a study and the point estimates for each group are identical, it is clear that the appropriate test between groups would not be statistically significant. Fortunately, the nature of statistical errors is such that, if authors and journals are willing and the underlying data are sound, the analysis errors can be corrected. Unfortunately, correcting errors often requires an arduous process that highlights the limitations of the self-correcting nature of science.
Even when there is no error in the data or analysis, research filtered through the lens of poor logic can distort the findings, leading to conclusions that do not follow from the data, the analysis, or the fundamental premises.
Classical logical fallacies appear in the literature. “Cum hoc, ergo propter hoc” (with this, therefore, because of this; common from cross-sectional data) and “post hoc, ergo propter hoc” (after this, therefore, because of this; common with longitudinal data) are two examples of logic errors that assume that the observed associations are sufficient evidence for causality. Assuming causality from observational evidence is common.
In some cases, articles are careful to describe associations rather than make claims of causality, such as: “Dietary factors were estimated to be associated with a substantial proportion of deaths from heart disease, stroke, and type 2 diabetes.” However, the media or subsequent communications from the authors can succumb to these fallacies (e.g., “Our Nation’s Nutritional Crisis: Nearly 1,000 Cardiovascular and Diabetes Deaths Every Day (!) due to poor diet.”).
Arguments in logic errors
Arguments based on authority, reputation and ad hominem reasoning are also common. These arguments may focus on the characteristics of the authors, the caliber of a journal, or the prestige of the authors’ institutions to reinforce or refute a study. In one example of ad hominem reasoning, an author was disparagingly identified only as a consultant to the chemical industry with a competing interest, as a way of passively dismissing his arguments; the same critique also argued from authority and reputation by contrasting the other author’s arguments unfavorably with those of independent scientific entities. Such arguments can serve as useful heuristics for making everyday decisions. Using them to support or refute the quality of evidence in published articles is tangential to science.
Other logical fallacies are evident in the literature, but one that unites the others is to argue that the conclusions drawn from erroneous research are false: the “fallacy of the fallacy.” The identification of an error in an article or reasoning cannot be used to say that the conclusions are wrong. Rather, we can only say that the conclusions are unreliable until further analysis.
Communication errors do not necessarily affect the data and methods, but are failures in the logic used to connect the results to the conclusions. In the simplest case, communication can be overzealous, extrapolating beyond what a study can tell us.
Communication Error Example 1
Authors discussing the benefits and limitations of animal trials in predicting cancer risk in humans point out that the problem with animal testing is that the results of animal trials are often incorrectly extrapolated to humans.
Studies are reported in which the doses given to animals were orders of magnitude higher than those expected for humans. In one study, animals were given a dose of daminozide (a plant growth regulator) that a human could reach only by consuming 28,000 pounds of apples a day for 10 years: an extrapolation error in both species and dose.
Communication Error Example 2
Other forms of erroneous extrapolation are evident. A study of responses to small 1-day exposures may be inadequate to extrapolate to chronic exposures. Publication, reporting, and citation biases are other forms of communication error that can produce a kind of erroneous data when a collection of scientific reports is itself treated as data. If scientists do not publish some results for whatever reason, the body of data used to summarize our scientific knowledge (e.g., in a meta-analysis) is incomplete.
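How an incomplete literature can mislead a summary can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real literature: every study estimates the same true effect with the same standard error, and only estimates that are positive and significant at the conventional 1.96 threshold get "published". All names and numbers are hypothetical.

```python
import random
from statistics import mean

TRUE_EFFECT = 0.1
STANDARD_ERROR = 0.2


def simulate_studies(true_effect, standard_error, n_studies, seed=0):
    """Draw one effect estimate per study around a common true effect."""
    rng = random.Random(seed)
    return [rng.gauss(true_effect, standard_error) for _ in range(n_studies)]


estimates = simulate_studies(TRUE_EFFECT, STANDARD_ERROR, n_studies=2000)

# Hypothetical publication filter: only positive, significant results appear.
published = [e for e in estimates if e / STANDARD_ERROR > 1.96]

mean_all = mean(estimates)         # close to the true effect
mean_published = mean(published)   # necessarily above 1.96 * STANDARD_ERROR
```

A summary built only from `published` overstates the effect, even though each individual study was conducted and analyzed correctly.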
P-Hacking and P-Fiddling
“P-hacking” and related practices, known by names such as researcher degrees of freedom and “p-fiddling”, represent a form of selective reporting and can also be considered statistical analysis errors. In most cases, there is no single, universally agreed method for analyzing a particular data set, so trying multiple analyses can be scientifically prudent as a way to check the robustness of the results.
However, p-hacking uses the P-value of an analysis as the rule by which a particular analysis is chosen, rather than the suitability of the analysis itself, often without fully disclosing how that P-value was obtained. The conclusions are questionable because undisclosed flexibility in data collection and analysis allows almost anything to be presented as significant. A striking example is the publication of seemingly highly significant results on the “Bible Code”, which were later debunked as a variant of p-hacking.
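A back-of-the-envelope calculation shows why choosing an analysis by its P-value is so risky. The sketch below assumes that every analysis tests a true null hypothesis and that the analyses are independent. Alternative analyses of the same data are usually correlated, so this is an idealisation rather than an exact figure. The function name is hypothetical.

```python
def chance_of_false_positive(n_analyses, alpha=0.05):
    """Probability that at least one of n independent null tests has p < alpha."""
    return 1 - (1 - alpha) ** n_analyses


one_analysis = chance_of_false_positive(1)       # 0.05
twenty_analyses = chance_of_false_positive(20)   # roughly 0.64
```

Under these assumptions, a researcher who tries twenty analyses and reports only the best one has roughly a two-in-three chance of a "significant" finding when no effect exists.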
Contributing factor issues
According to Verhulst et al. (2012), scientists are humans who make mistakes and ill-informed guesses, sometimes with the best of intentions. Scientific processes are designed to limit these human weaknesses, yet humans continue to communicate results derived from flawed methods, data, or interpretations. Sometimes, mistakes only become apparent with time and technological improvements. Understanding and identifying what contributes to the errors that cloud scientific processes can be key to improving the robustness of scientific findings.
An obvious contributing theme is simple ignorance, whether of an individual, the research team, a reviewer, the editors, or others. Although the existence of these errors has been catalogued and published, this only establishes that the errors are known to the scientific community at large, not necessarily to each individual. In other words, these errors are “unknown knowns”: errors known to science, but not to a particular scientist.
Bad examples in the literature can, by themselves, perpetuate ignorance. Effective post-publication peer review can be particularly helpful in mitigating ignorance, because the errors it identifies serve as instructive examples of what not to do. It is also important to recognize that some errors have not yet been made, identified, or corrected, and are therefore currently unknown. Time may be the most critical component in revealing these as-yet-unidentified errors.
Poor study conception
An ill-conceived study creates fundamental problems for the rest of the process of conducting, analyzing, and reporting the research. At its inception, a study can branch into hypothesis generation or hypothesis testing, although the two branches certainly inform each other. If a study is started with the intention of making a discovery but without a clear scientific plan, the decisions made along the way will follow the data.
This is not a problem in itself, as long as the final results are reported as a meandering exploration. In contrast, poorly planned testing of a hypothesis can allow researchers to choose variations in methods or analyses based not on a rigorous question or theory, but on interests and expectations. A frequently cited example is the experience of C. Glenn Begley, who, after failing to replicate the results of another research group, was told by one of the original authors that the experiment had been tried several times, but that they had published only the results that gave the best picture.
HARKing or Post Hoc Storytelling
Hypothesizing after the results are known (so-called HARKing, or post hoc storytelling) provides the façade of a carefully conducted study, but in fact the path from hypothesis, through data collection, to rigorous conclusions is short-circuited by looking at the results and applying a story that fits the data. In some respects, Gregor Mendel’s classical studies of pea genetics are consistent with this model, with data probably too perfect to have arisen naturally.
Publications serve as academic currency, so academics may be pressured to publish something, sometimes anything, to increase that currency, gain tenure, or maintain funding. This is the so-called “publish or perish” paradigm. Given the growth in the number of journals, there are fewer barriers to publishing, and a more modern expectation may include the desire to publish in higher-ranking journals, get more publicity, or report positive, novel, or exciting results.
There may also be personal expectations. After months of experimentation or years of data collection, a researcher wants to get something “useful” out of a project. But not everything is worth publishing if it does not contribute knowledge. If the data are bad, the methods flawed, or the conclusions invalid, the publication will not add to knowledge and may even detract from it. Publication-based, goal-oriented pressure can move behavior away from rigorous science. In 1975, Paul Feyerabend expressed concern about the increase in publications without a concomitant increase in knowledge. He wrote that most scientists of the day were devoid of ideas, full of fear, and intent on producing some insignificant result in order to add to the avalanche of inane articles that constituted scientific progress in many areas.
Many scientists began their lines of research out of an innate interest: a deep curiosity, a desire for discovery, or a personal connection to a real-world problem. Conducting experiments, analyzing data, and observing the world are not only aspects of science, but also represent personal interests and passions. Thus, when results show something interesting, passion and enthusiasm risk overriding the fact that science is designed to be the great antidote to the poison of enthusiasm and superstition.
Whether it’s time, staff, education, or money, rigorous science requires resources. Insufficient resources can encourage mistakes. If time is short, the proper checks for rigour may be skipped. If there are too few staff, a team may be unable to complete a project. If there is too little education, the necessary expertise may be lacking. A rigorous methodology may therefore be out of reach, and practical compromises have to be made, sometimes at the expense of rigour.
Insufficient checking of methods, results, or conclusions due to a conflict of priorities can also contribute to introducing or ignoring errors. A researcher may consciously know that he or she should not make certain mistakes or take certain shortcuts, but priorities can compete for resources, attention, or willpower. The result can be sloppy science, negligent behavior, or distortion of observations. In fact, scientists may differ in how they deal with such conflicts. According to a meta-analysis, greater creativity is associated with lower levels of conscientiousness.
It is often impossible to determine whether the authors succumbed to these conflicts of priorities, intentionally deviated from scientific rigor, or made honest mistakes. The most common discourse on priorities revolves around the disclosure of potential financial conflicts, but there are many other sources of conflict. When individuals fully believe in an idea, or have built an entire career and image on it, publishing something that contradicts it would conflict with an entrenched ideology. In other cases, the ideology that an author champions is considered righteous.
Verhulst B, Eaves L, Hatemi PK (2012) Correlation not causation: The relationship between personality traits and political ideologies. Am J Pol Sci 56:34–51, and erratum (2016) 60:E3–E4.
Bent S, Tiedt TN, Odden MC, Shlipak MG (2003) The relative safety of ephedra compared with other herbal products. Ann Intern Med 138:468–471, and correction (2003) 138:1012.
Gawande A (2010) The Checklist Manifesto: How to Get Things Right (Metropolitan Books, New York).