Fighting irreproducibility in preclinical medicine using a meta-analytical approach for detecting flaws in behavior-based testing

Otto Kalliokoski

One of the greatest problems in present-day medical research is irreproducible results. Frequently, the results of a study conducted in mice in one laboratory cannot be reproduced in another laboratory, even when the same kind of mice is used. When results fail to replicate, the implication is that the original findings were unreliable, or simply incorrect. “Double checking” studies in this manner takes time, however. In the meantime, therapies and drugs can be developed based on the original findings; drugs and therapies that will invariably turn out to be ineffective. In addition to the societal costs of failed investments in medicine and of patients waiting in vain for therapies that were never effective in the first place, irreproducible findings are a major source of wasted (laboratory) animal lives.

Neuroscience is the medical field that generates the fewest studies that can be successfully replicated. Laboratory animal studies in neuroscience often rely on testing behavior – rats running mazes to test memory, mice pressing levers in Skinner boxes to test motivation, etc. The irreproducible results in neuroscience can often be traced back to these behavioral tests. Some of the tests are half a century old, the theory behind them is murky, and they are often needlessly cruel.

We aim to drive a move away from these outdated methods using meta-analytical investigations of historical data. If we want to avoid perpetuating the same mistakes, it is important to explain how a method can be used for decades without measuring anything particularly useful. We need to clearly convey how sharing primarily those results that conformed to expectations, and suppressing those that did not (what is known as “publication bias”), has given flimsy behavioral tests an air of legitimacy and reliability. Using modern statistical methods and clear, no-nonsense communication, we aim to convey the message that many of the established behavioral tests are long overdue for replacement.


