Supporting reproducibility in academia by focusing on processes

The institutions of an international research network are exploring and sharing ways to support reproducibility by focusing on the research process, not just on the outputs. Here they summarize several approaches discussed at a recent online workshop.

The international family of Reproducibility Networks is a strategic community effort to promote transparent and trustworthy research practices in the academic research system. The UK Reproducibility Network (UKRN) has grown rapidly since it was established in 2019. It supports peer-led collaborations between its 32 members, as well as a number of other institutions that have local reproducibility networks. Regular online events provide a forum to share initiatives and ideas across the membership. Here we report on a recent workshop, “Total Quality Management in Academia”.

To ensure high-quality outputs, many sectors outside academia, such as manufacturing, do not focus on assessing carefully selected outputs, but instead probe the day-to-day operations and procedures most critical to the quality and value of the work. The workshop explored how some of the quality-control methods used in those sectors could be applied to academic research in order to enhance research quality.

At the core of this approach, termed total quality management (TQM), is the proposition that if we take care of the process, the results will take care of themselves. The workshop presented an overview of quality management within the pharmaceutical industry, as well as reproducibility pilots currently being set up or actively conducted at a range of UKRN member institutions.

UKRN online workshop Total Quality Management in Academia – Marcus Munafo (UKRN)

Quality Assurance in Pharma – Fiona Booth, University of Bristol

The opening presentation explored the evolution of the risk-based approach to quality management adopted within the pharmaceutical industry.

Traditional quality controls and checks are highly resource-intensive and tend to take a "one-size-fits-all" approach to quality management. Over the last 15 years, risk-based quality management has become the regulatory standard for all aspects of pharmaceutical research and manufacturing. Quality risk management [1] allows teams to focus their attention and resources on the aspects of research and manufacturing which have the greatest impact on quality by:

  • Identifying which data are critical to a particular process or research project;
  • Developing a deep understanding of the flow of critical data from the point of generation, through the analysis and reporting stages to retention and archiving;
  • Designing controls to ensure that critical data have the highest possible integrity (they are attributable, contemporaneous, legible, original and complete);
  • Actively monitoring data quality and assessing it against predefined quality tolerance limits (a minimal sketch of such a check follows this list);
  • Proactively identifying potential data quality issues before they occur. When errors and issues do arise, they are discussed openly and transparently, with learnings shared amongst teams.
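
To make the idea of predefined quality tolerance limits concrete, the sketch below shows how a research team might script such a check. It is a hypothetical illustration only: the field names (primary_outcome, age), thresholds and eligibility window are invented for this post and are not taken from the workshop or from any regulatory guidance.

```python
# Hypothetical sketch of monitoring data quality against predefined tolerance
# limits; the field names, thresholds and eligibility window are invented for
# illustration and are not taken from the workshop or any regulatory guidance.
import pandas as pd

# Tolerance limits agreed before data collection begins.
LIMITS = {
    "max_missing_fraction": 0.05,  # at most 5% missing primary-outcome values
    "max_out_of_range_ages": 0,    # no participants outside the 18-65 eligibility window
}

def check_quality(df: pd.DataFrame) -> dict:
    """Compute each quality metric and whether it stays within its limit."""
    missing = df["primary_outcome"].isna().mean()
    out_of_range = int(((df["age"] < 18) | (df["age"] > 65)).sum())
    return {
        "missing_fraction": (missing, missing <= LIMITS["max_missing_fraction"]),
        "out_of_range_ages": (out_of_range, out_of_range <= LIMITS["max_out_of_range_ages"]),
    }

# Toy dataset: one missing outcome and one under-age participant breach the limits.
df = pd.DataFrame({"primary_outcome": [1.2, None, 0.8, 1.1], "age": [25, 40, 17, 33]})
for metric, (value, ok) in check_quality(df).items():
    print(f"{metric}: {value} -> {'within limits' if ok else 'LIMIT BREACHED, review needed'}")
```

In practice such checks would run routinely as data accumulate, so that breaches trigger review while the study is still underway rather than after the analysis is complete.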

Adopting a risk management mindset needs a shift in thinking; errors must be anticipated, and research processes designed in such a way that the dataset can tolerate those errors. It helps us to make the best use of our resources, as it is rarely realistic or efficient to try to eliminate all errors.

Risk management approaches can be applied to almost any research project; they are adaptable and scalable and provide valuable information on the robustness of the methodology in a way which can be shared at the point of dissemination to aid reproducibility.

Spot checks for reproducibility in a University setting – Mark Kelson, University of Exeter

The second presentation covered a nascent proposal concerning spot checks of research in a university setting. Borrowing heavily from an existing scheme run by the Sainsbury Laboratory in Cambridge, this proposal would see three recent preprints or published works selected at random from the entire University’s output.

In each piece the most critical table, figure or finding would be identified (typically covering those results presented in the abstract). The lead or corresponding author would then be invited to share the supporting data and code with an independent evaluator, who would assess the reproducibility of the work.

The central idea is that this process is light touch (most authors’ works will not be selected) and supportive (feedback will be provided regarding how reproducible the work is).

Many of the details are critical, and exclusions to the scheme would be numerous (e.g., in cases where data or code are commercially sensitive or concern potentially identifiable individuals). Similarly, work relying on proprietary software or results from computationally expensive work would need to be considered carefully. In such cases, a graduated check might be possible (i.e., if the data cannot be shared, then perhaps the code alone could be; if neither code nor data is shareable, perhaps a report could be written by someone quasi-independent of the original team).

A pilot approach in a small number of departments would seem a reasonable next step to assess the feasibility of such a scheme. Documents for senior approval are being developed.

Pre-submission certification of computational reproducibility – Reny Baykova, University of Sussex

The third presentation introduced a pilot scheme which offers researchers the opportunity to have an independent statistician check and certify the computational reproducibility of papers before they are submitted for publication.

Computational reproducibility refers to the ability to get the same numerical results, figures, and inferential conclusions when rerunning the same analysis pipeline on the same dataset. The proposed scheme is open to researchers affiliated with the School of Psychology at the University of Sussex and focuses on quantitative research. Participation is voluntary, and the key aims of the project are to help researchers gain the skills to conduct reproducible research, and to create a demand for more rigorous reproducibility practices coming from researchers themselves.
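
As an illustration of what this means in practice, the short script below reruns a fully scripted, seeded analysis and confirms that it returns exactly the same number each time. It is a toy example written for this post; the generated data and bootstrap step are placeholders, not part of the Sussex scheme.

```python
# Toy illustration of computational reproducibility: a fully scripted, seeded
# pipeline returns exactly the same number every time it is rerun. The data
# generation and bootstrap step are placeholders invented for this post.
import numpy as np

def run_pipeline(seed: int = 2023) -> float:
    rng = np.random.default_rng(seed)                 # fixed seed makes every random step repeatable
    data = rng.normal(loc=0.5, scale=1.0, size=100)   # stands in for the shared dataset
    boot_means = [rng.choice(data, size=data.size, replace=True).mean()
                  for _ in range(1000)]               # a simple bootstrap "analysis"
    return float(np.mean(boot_means))

# Rerunning the same pipeline reproduces the same result exactly.
assert run_pipeline() == run_pipeline()
print(run_pipeline())
```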

When researchers decide to submit their paper for a reproducibility check, the independent statistician examines the shared materials, points out any results that do not reproduce, and provides suggestions on how to improve the reproducibility of the study. After the researchers have updated the materials and manuscript, the independent statistician completes and uploads a reproducibility report online. Researchers can then highlight that their paper has been certified as computationally reproducible and reference the reproducibility report in their manuscript, thus setting their work apart in terms of transparency and rigour in the eyes of editors, reviewers, and readers.

Our long-term aim is that, as more and more researchers recognize the benefits of submitting their studies to a reproducibility check, the practice will become a standard part of the research process and help restore confidence in the integrity of science.

ReproduceMe: A pilot project on computational reproducibility – Daniel Baker, University of York

The final presentation described a pilot project currently underway at the University of York. This project acknowledges the substantial technical barriers to generating high-quality work that is computationally reproducible, and aims to provide ‘reproducibility as a service’. Over a six-month period, the intention is to take 10 completed scientific manuscripts and rework them into fully reproducible documents using markdown-based tools. These allow explanatory text to be combined with executable code, so that a full PDF-format manuscript is generated, including running all analyses and producing figures. Successful implementation, along with publicly available data and code, means that anyone (the reader, reviewer, or editor of a paper) can see precisely the analysis steps that were conducted to transform the raw data into the final paper, greatly simplifying auditing and certification procedures (see previous sections).
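
A minimal sketch of what such a document can look like is shown below. It uses a Quarto-style layout with an embedded Python chunk purely for illustration; the York pilot may well use a different markdown dialect, language or toolchain, and the analysis shown is invented.

````markdown
---
title: "A fully reproducible manuscript (illustrative example)"
format: pdf
---

## Results

```{python}
#| label: fig-means
#| fig-cap: "Group means, generated by the embedded analysis code."
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2023)          # fixed seed: reruns give identical numbers
group_a = rng.normal(0.0, 1.0, size=50)    # placeholder data standing in for a real dataset
group_b = rng.normal(0.5, 1.0, size=50)
diff = round(float(group_b.mean() - group_a.mean()), 2)

plt.bar(["A", "B"], [group_a.mean(), group_b.mean()])
plt.ylabel("Mean score")
plt.show()
```

The mean difference between groups was `{python} diff`, computed when the
document itself is rendered, so the number in the text can never drift from
the underlying analysis.
````

Rendering such a document reruns every chunk and regenerates the PDF, so the text, numbers and figures are always produced from the same code and data.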

At the time of the workshop, production had been completed on one paper, and a further nine had been volunteered by members of the host department. Training materials and standardized forms for onboarding and feedback have also been developed. In addition to generating reproducible documents, substantial progress has been made towards implementing reproducible computing environments, using tools such as Docker [2] and GitHub [3]. The aim here is to ‘freeze’ the versions of programming languages and software packages used for the original analysis, so that exactly the same tools remain available in perpetuity. Thanks in part to a helpful suggestion from an audience member at the workshop (Dr Lincoln Colling, of the University of Sussex), we now have a fully automated build pipeline that performs all analyses on a remote server, and therefore does not interfere with software installed locally on the user’s computer. In the long term we hope to secure resources to provide reproducibility services on an ongoing basis.
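
As a rough illustration of the ‘freezing’ step, a container recipe along the following lines pins the language and package versions an analysis depends on. The base image, package versions and file names here are invented for this post, not those used in the York pilot.

```dockerfile
# Illustrative only: pin an exact interpreter version via the base image.
FROM python:3.11.4-slim

# Pin exact package versions so the same tools remain available in perpetuity.
RUN pip install --no-cache-dir numpy==1.25.0 pandas==2.0.3 matplotlib==3.7.2

# Copy the analysis code and data into the image (hypothetical layout).
WORKDIR /analysis
COPY . /analysis

# Rebuilding the results then always happens inside this frozen environment.
CMD ["python", "run_analysis.py"]
```

A continuous-integration service (for example GitHub Actions) can then rebuild such an image and rerun the analysis on a remote server after every change, which is broadly the kind of automated remote build pipeline described above.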

Conclusions

The workshop concluded with a discussion of the barriers to implementing quality checks in academic research. There was acknowledgement that this type of approach needs a shift in research culture so that it becomes the norm and is welcomed as a positive process.

The availability of personnel with the necessary skills and time to conduct such checks was also noted as a potential barrier. Specific career pathways for those with relevant skills – for example, in open research practices – would help ensure this becomes an integral part of our research culture.

You can watch a recording of the workshop via this link. UKRN online events are open to all; further information can be found here. The next UKRN event is "Octopus.ac: A new academic publishing model to improve research culture" on 15 June 2023.

  1. ICH Guideline for Good Clinical Practice E6 (R2) – Scientific guideline, European Medicines Agency (europa.eu)
  2. https://www.docker.com/
  3. https://github.com/

All workshop presenters contributed to the creation of this blog post.


Photo by Testalize.me on Unsplash 
