Questionable research practices (QRPs) and research misbehavior are activities that are not transparent, ethical, or fair and thus threaten scientific integrity and the publishing process. However, the fact that QRPs are hard to identify and define – and sometimes easy to get away with – makes them “questionable” rather than technically illegal. To be an ethical researcher, you should keep QRPs out of all your work.
The prevalence of QRPs and research misbehaviors in science has had dangerous consequences, such as the replication crisis. At the same time, QRPs have also inspired many solutions, such as pre-registration and preprinting.
But what are QRPs? You might not even know you’re doing them, especially when you’re under pressure. This article explains what QRPs are, why they’re wrong, and how you can avoid doing them (and damaging your career).
What is a questionable research practice?
QRPs and research misbehaviors are decisions made during the research process that raise questions regarding your work’s rigor and precision.
More simply put: QRPs are practices that are not transparent. Transparency and reproducibility are always things to strive for in your work: for your own good, for your field, and for science as a whole.
QRPs range from low-grade violations, like selectively citing your own work to increase your visibility, to more serious and career-defining violations, like running multiple analyses until you get a significant result or writing hypotheses after knowing the results. Engaging in QRPs inflates the rate of false positives and, ultimately, contributes to the so-called replication crisis and a lack of trust in the scientific process.
The term QRP was popularized in a 2012 article by John, Loewenstein, and Prelec. These authors distinguished between fraud, such as data falsification, and QRPs. Although fraud is rare, QRPs are fairly common in most research areas. In fact, an estimated one in two researchers has engaged in at least one QRP over the last three years.
Why are QRPs done?
Researchers engage in QRPs to meet the highly competitive demands of academic journals and to publish (or be seen to publish) significant results. This pressure stems from a general notion that negative results aren’t useful and that you can only publish statistically significant findings. On top of that, the high rejection rate of most journals puts researchers under further pressure to “publish or perish.”
These external pressures push researchers to use QRPs selectively, knowingly or not, to publish “significant” findings. The inconsistent and selective use of these practices is what makes them dangerous to academic pursuits and to the scientific process in general.
Imagine the following scenario: You run your analyses and don’t get a significant result. You look into your dataset and find one single outlier that, when removed, gives you significance. You remove the outlier, obtain significant results, and publish your article in a high-ranking journal.
But would you have done the same thing if the results had been significant to begin with?
The answer is most likely no. Researchers mainly engage in QRPs to publish significant results. This then increases the discoverability of research and, thus, citation score. And this leads to more funding and potentially to job security in the form of a tenured position or promotion.
In this process, however, you risk the very thing you want to uphold as a researcher—the rigor and precision of the scientific process. What’s worse is that some researchers may be engaging in QRPs without even knowing it.
The most common types of QRP, and how you can avoid them
The most common QRPs are: not accurately recording your research process, improper referencing, selective reporting, p-hacking, HARK-ing, collecting more data after seeing the results, not discussing contrary evidence, and failing to share your data. We’ve listed them in the order in which they can occur during the research process, as well as presented solutions to make sure you don’t engage in them.
Not accurately recording your research process
A researcher must keep a careful record of all the steps and decisions in their research process. Not doing so is a QRP because it violates one of the most important practices in research – proper documentation.
Everything in science and research must be documented step-by-step in as much detail as possible. This includes all steps from the conceptualization of the research project, to sampling plans, materials used, data manipulations and management, analysis, results, and next steps.
Ultimately, scientific record keeping must include why something was done, how it was done, who did it, and who it was for. This record should be written in enough detail so that others can understand why you did something and replicate how you did it.
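One way to make the "why, how, who, and for whom" of each step hard to forget is to log every step as a structured entry. The sketch below is only an illustration: the `RecordEntry` class, its field names, and the example values are all hypothetical, and a paper lab notebook or an electronic lab notebook can serve the same purpose.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class RecordEntry:
    """One entry in a research log (all field names are illustrative)."""
    when: str      # ISO date on which the step was carried out
    who: str       # person who carried out the step
    what: str      # what was done
    how: str       # procedure, settings, software versions
    why: str       # reasoning behind the decision
    for_whom: str  # project or collaborator the work was done for


def to_log_line(entry: RecordEntry) -> str:
    """Serialize an entry as one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(entry), sort_keys=True)


# Hypothetical example entry
entry = RecordEntry(
    when=date(2024, 1, 15).isoformat(),
    who="A. Researcher",
    what="Removed trials flagged by the pre-registered exclusion rule",
    how="Script clean_data.py, rule: reaction time outside 200-3000 ms",
    why="Rule was fixed in the protocol before data collection",
    for_whom="Example project",
)
print(to_log_line(entry))
```

Because every field is required, an entry cannot be created without stating the reasoning behind the step, which is usually the part that gets left out.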
Failing to list all the steps of a procedure can lead to dangerous consequences. For example, if you don’t record the exact dosage or timing at which a drug should be given, someone following your procedure could cause negative health effects. Additionally, not recording all of the steps in your process will make it difficult for others to reproduce your research.
One way to avoid this QRP is to write detailed protocols for your research process before starting a project. A protocol should be written in sufficient detail for any lab member or other interested researcher to execute it perfectly and should ideally be registered in an open-access platform like BMJ Open or OSF registries. Someone else has to be able to repeat your work.
Also, lab managers and research advisors need to set clear standards for record-keeping processes in their research groups. This includes delegating record-keeping to certain team members, providing examples of well-maintained records, providing training to ensure colleagues are up to date on best practices for record-keeping, and ensuring adherence to these standards.
Improper referencing of concepts, techniques, etc. (or no referencing at all)
Improper referencing is another QRP in which an idea or concept is attributed to the wrong source or authors. For instance, authors may take information from one paper and cite that paper instead of the original source of the concept. This can easily happen if background reading isn’t done correctly or thoroughly.
The goal of referencing is to give credit to the correct source. Improper referencing is dangerous because it gives credit to the wrong source, which can amount to plagiarism, intentional or not. This can have serious consequences like suspension. Correct referencing is also important because it distinguishes your own ideas from other authors’ ideas.
To avoid improper referencing, you should always cite the original idea or concept. You can do this by finding the original paper or citation and citing that one in your work, and not another article that cited it.
Another way to avoid improper referencing is to always use a citation manager like Zotero or Mendeley. This will help you organize your references and ensure they are correctly cited in your text.
Related to improper citations, failing to give credit to those who did the research work or came up with the original ideas is also a QRP. This includes not crediting authors who contributed to the research or writing process. It can be avoided by including a CRediT authorship statement in your article, which describes in more detail which phase of the research each contributor was responsible for.
Selective reporting
Selective reporting refers to situations in which researchers only report results, variables, conditions, or even studies that are significant or consistent with their predictions. This practice is also known as “cherry picking.” Selective reporting can also mean preferentially discussing studies that worked while not disclosing studies that didn’t, or failing to report all experimental conditions, variables, or data (e.g., outliers).
Imagine you conduct a study where you include two experimental groups and one control group. However, during the research, participants in one of the experimental groups do not complete the tasks according to the instructions, which impacts data quality. So, you decide not to include these data in the final paper and to only mention the one experimental group in the methods. Readers are therefore led to believe that the experiment “worked” as intended.
This practice is very misleading because it doesn’t present all the experiment’s findings and can make the experiment seem “successful” when it actually wasn’t. In good research, all conditions, experiments, and results need to be reported, justified, and explained. Negative results, and the reasons why something did not work, also need to be properly reported. Reporting negative results is important because they can help guide other researchers and prevent them from going down similar unpromising paths.
This QRP is also dangerous because if you only include results or studies consistent with your predictions, you’re missing all the information in non-included analyses or studies. This missing information may trigger a new way of thinking or doing things, which we’d never know if it isn’t adequately presented and explained.
One way to avoid this QRP is to create a careful study design that can accommodate all the variables and interactions you have predictions for. You’ll then need to conduct studies that have enough statistical power for the study design and the number of variables you’ll have. Studies with low statistical power are likely to miss effects that really exist, so having sufficient power leads to fewer failed studies.
Power analysis should be conducted a priori, before starting your study. R packages such as Superpower or pwr can help you determine the minimum sample size required for your statistical tests. If these aspects are considered in advance, you’ll still have a sound justification for having run the study, even if it didn’t work.
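To give a feel for what an a priori power analysis produces, here is a minimal sketch for the simplest case: comparing the means of two equally sized groups with a two-sided test. It uses the textbook normal approximation rather than the exact calculations performed by dedicated packages, and the function name `n_per_group` and the example effect size (Cohen's d = 0.5) are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


print(n_per_group(0.5, power=0.80))  # 63
print(n_per_group(0.5, power=0.90))  # 85
```

Exact t-test-based calculations typically give slightly larger numbers than this approximation, so treat the result as a rough lower bound and use a dedicated package for the figure you actually pre-register.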
If you do need to exclude some data or variables from your study, another way to avoid this QRP is to create a set of clearly defined exclusion criteria. Justifications for excluding data or variables need to be outlined and explained in advance, ideally in a pre-registration or protocol. This will help you avoid arbitrarily excluding data, which can bias your results, and make your research process more transparent.
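In practice, pre-defined exclusion criteria can be written down as fixed constants and applied mechanically, with every exclusion logged so that it can be reported. The sketch below assumes a hypothetical reaction-time study; the thresholds and the function name `apply_exclusions` are illustrative only, not recommended values.

```python
# Exclusion criteria fixed BEFORE data collection (hypothetical values).
MIN_RT_MS = 200   # faster responses are treated as anticipations
MAX_RT_MS = 3000  # slower responses are treated as lapses of attention


def apply_exclusions(reaction_times_ms):
    """Split observations into kept values and a log of exclusions.

    Returns (kept, excluded), where excluded holds (index, value, reason)
    tuples so that every removed observation can be reported in the paper.
    """
    kept, excluded = [], []
    for index, rt in enumerate(reaction_times_ms):
        if rt < MIN_RT_MS:
            excluded.append((index, rt, f"below {MIN_RT_MS} ms"))
        elif rt > MAX_RT_MS:
            excluded.append((index, rt, f"above {MAX_RT_MS} ms"))
        else:
            kept.append(rt)
    return kept, excluded


kept, excluded = apply_exclusions([150, 450, 520, 3500, 610])
print(kept)      # [450, 520, 610]
print(excluded)  # [(0, 150, 'below 200 ms'), (3, 3500, 'above 3000 ms')]
```

The key design choice is that the rule does not look at the outcome of the analysis: the same observations are removed whether or not the result turns out to be significant.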
P-hacking
P-value hacking, or p-hacking, refers to a situation where many different analyses are carried out to discover statistically significant results when, in reality, there is no effect. In other words, p-hacking is the process of running statistical analyses on a data set until you get a significant finding.
You engage in p-hacking when you exclude certain participants without prior justification just to re-run your analyses and get significance, stop collecting data as soon as you reach p < 0.05, or include many outcomes in your study but only report the significant ones in your paper (similar to selective reporting).
This QRP inflates the rate of false positives, that is, conclusions that effects are present when they are not. This can distort meta-analytic findings and give incorrect notions about a particular field or question in general. P-hacking is also questionable because researchers have already decided what the data should show rather than reporting what it actually shows, adding more bias and reducing objectivity.
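A quick calculation shows how fast false positives accumulate. Assuming the tests are independent and no real effect exists (both simplifying assumptions), the chance of at least one "significant" result among m tests at level alpha is 1 - (1 - alpha)^m. The function name below is illustrative.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among n_tests independent
    tests when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


for m in (1, 5, 10, 20):
    print(f"{m:>2} tests: {familywise_error_rate(m):.3f}")
# Output:
#  1 tests: 0.050
#  5 tests: 0.226
# 10 tests: 0.401
# 20 tests: 0.642
```

So a researcher who tries ten independent analyses on null data has roughly a 40% chance of finding something to report, even though each individual test keeps its nominal 5% error rate.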
One solution to p-hacking is pre-registration, which requires you to create a detailed analysis plan prior to beginning your study. This will help you decide the criteria for statistical tests beforehand, which you can then follow in your analysis. Registered reports are also a great way to avoid p-hacking because they essentially guarantee publication regardless of the results.
Another way to avoid this QRP is to perform a Bonferroni correction. This method adjusts the significance threshold (or, equivalently, the p-values) to account for the number of statistical tests performed on the dataset. In other words, the Bonferroni correction reduces the chance of getting false positives. However, this method becomes very conservative when you’re testing a large number of hypotheses, which can lead to more false negatives.
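The Bonferroni correction itself is simple enough to sketch in a few lines: each p-value is multiplied by the number of tests (capped at 1) and then compared with the usual threshold. The p-values below are made up for illustration, and the function name `bonferroni` is an assumption; statistical packages offer this and less conservative alternatives.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and the resulting decisions.

    Each p-value is multiplied by the number of tests and capped at 1.0.
    A hypothesis is rejected if its adjusted p-value is at most alpha.
    """
    n_tests = len(p_values)
    adjusted = [min(1.0, p * n_tests) for p in p_values]
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject


# Hypothetical p-values from five tests on the same dataset
p_values = [0.001, 0.012, 0.03, 0.04, 0.2]
adjusted, reject = bonferroni(p_values)
print(reject)  # [True, False, False, False, False]
```

Without the correction, four of these five tests would fall below 0.05. After it, only the first one survives, which also illustrates how conservative the method can be.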
Sharing your raw data in an open-access repository such as OSF or GitHub, or placing it on a preprint platform such as Research Square, can help to avoid p-hacking by allowing other researchers to repeat your analysis and verify your findings.
Hypothesizing after results are known (HARK-ing)
HARK-ing is the process where researchers present post-hoc hypotheses as though they were a priori. In other words, you present a hypothesis made after results are collected as though it had been predicted from the start. This QRP may occur when researchers see that results don’t align with their initial predictions, and so change the hypotheses to align with the existing data.
HARK-ing also entails excluding a priori hypotheses if they didn’t work. It can also include taking hypotheses from literature found after the fact and presenting them as though they had been predicted from the start.
The main danger of HARK-ing is that hypotheses always appear to be confirmed and are never falsified, which prevents the research community from finding out why some hypotheses are wrong. HARK-ing also adds to the replication crisis seen in many fields because if hypotheses are always tailored to a specific sample, they cannot be generalized or replicated.
HARK-ing can also lead to numerous research biases. It blurs the line between confirmatory and exploratory research, and it hides valuable information about contrary evidence or things that didn’t work. And if encouraged by advisers or professors, this QRP can also pass the wrong model of science on to students.
The best way to avoid this QRP is to encourage transparency across the entire research process.
First, clearly define and list all your hypotheses before data collection or analysis, ideally with pre-registration or registered reports.
Second, ensure you have the proper design and statistical power to test your hypotheses. Then, report all results, even if they are against your initial hypotheses.
Finally, if your results generated new ideas or hypotheses, be transparent about that in the discussion section.
Collecting more data after seeing the results
Collecting more data after seeing the results is when researchers decide to get more data once they’ve already analyzed their results. Usually, they do this because the results are not significant or not going in the direction they expected.
Checking the data during data collection is not a problem in itself. It becomes problematic when researchers repeatedly check the data and stop collecting as soon as they see significance. This is dangerous because it capitalizes on sampling error and inflates the estimated effect size at the moment the experiment is stopped.
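A small simulation can make the cost of stopping at the first significant result concrete. The sketch below illustrates a closely related consequence, the inflated false-positive rate. It makes several simplifying assumptions that are not part of any real study: data are drawn from a standard normal distribution with a true mean of zero (so any "significant" result is a false positive), a z-test with known standard deviation is used, and the researcher peeks after every 10 observations up to a maximum of 100.

```python
import math
import random
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided test at alpha = 0.05


def is_significant(total: float, n: int) -> bool:
    """Z-test of 'mean = 0' for n draws with known SD 1, given their sum."""
    return abs(total / math.sqrt(n)) > Z_CRIT


def false_positive_rate(peek: bool, n_max: int = 100, step: int = 10,
                        n_sims: int = 2000, seed: int = 1) -> float:
    """Share of simulated null studies that end with a significant result.

    With peek=True the data are tested after every `step` observations and
    collection stops at the first significant result. With peek=False the
    data are tested once, at the planned sample size n_max.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        total = 0.0
        for n in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)  # the null is true: mean is 0
            at_look = peek and n % step == 0
            if (at_look or n == n_max) and is_significant(total, n):
                hits += 1
                break
    return hits / n_sims


fixed = false_positive_rate(peek=False)
peeking = false_positive_rate(peek=True)
print(f"Fixed sample size:      {fixed:.3f}")
print(f"Stop when significant:  {peeking:.3f}")
```

With a fixed sample size the rate should stay close to the nominal 5%, while stopping at the first significant look should push it well above that, even though no effect exists in the simulated data.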
Like HARK-ing, collecting more data after seeing the results is another QRP in which decisions are made post-hoc. This can be avoided by setting, a priori, a stopping criterion for when data collection should end.
You can do this by performing a power analysis to determine the minimum sample size needed to detect the effect of interest, assuming 80% or 90% power, using packages such as Superpower.
Importantly, the sampling strategy you used must be openly discussed in the paper. This will help future researchers take any biases from small sample sizes into account.
Not discussing contrary evidence
When researchers don’t discuss evidence that goes against their results or hypotheses, their work is incomplete and less valid.
When this happens, researchers will only present articles that fit with their paper and fail to consider contrary views or findings. This paints an overly positive picture of the topic. That can prevent readers from getting the full scope of the information necessary to draw meaningful conclusions.
This QRP is especially dangerous in the medical or healthcare community, where it is critical to disclose any adverse effects of drugs or reasons why a drug didn’t work. Failure to disclose contrary medical evidence is unethical and can lead to harmful public health outcomes.
For example, in 1999 Merck launched a new painkiller called Vioxx, claiming it came with fewer gastrointestinal problems. However, the company heavily downplayed the serious side effects of increased heart issues and stroke that the drug caused. The concealment of this contrary evidence had serious consequences for many people and ultimately for Merck itself, which paid over $58 million in settlements.
This QRP can be avoided by first presenting all the different viewpoints or theories related to your research question in the introduction of your paper, so that readers understand the full state of the art. During the work, you must pay careful attention to and explain all results, even ones that completely contradict your predictions or expectations. Then, in the discussion section, you need to clearly explain how your results relate to all the different viewpoints presented earlier.
Not sharing your data
Not sharing your data refers to the common practice of not making raw data, analyses, and analysis scripts available in open-access repositories. This practice is questionable because the work cannot be replicated if the raw data is unavailable. Additionally, other researchers are left wondering how you got the results you got.
Although many professors and advisors have not encouraged data sharing over the last few decades, the tide is turning. Many open-access repositories like OSF and global initiatives like the UK Reproducibility Network encourage open sharing of data and analysis code to promote replicability and transparency.
To avoid this QRP, upload your raw data, analysis code, and scripts to an open-access repository once you’ve finished your project. You can then refer to the location of your data in the paper when you submit it to the journal.