Rustam Gilfanov: Automation and Big Data, the Clinical Trials of Tomorrow

By Rustam Gilfanov

An attempt to glimpse the future — the 2020s or even 2030s.

Today, clinical trials involving volunteers are the only way to prove the effectiveness and safety of a new medication. This was not always the case, however. Less than a century ago, tests on rodents were enough for manufacturers to start releasing their drugs on the market — often leading to a tragic outcome.

What is the history of clinical trials, and why does the format need an overhaul and a fresh start? What will trials look like in the next few years, and how can big data contribute?

Clinical trials before modern-day protocols

Animal tests alone cannot predict how an unknown drug will affect the human body. It took scientists years of trial and error to understand that, and some of those errors are considered unforgivable.

Human trials were performed as far back as the 1950s, with protocols similar to the ones applied today [1]. The first full-scale clinical trial took place in 1954, when developers tested a polio vaccine [2]. Yet until the 1960s, pharmaceutical companies were not required to test an active ingredient on human volunteers before launching its mass production.

In 1937, the S.E. Massengill Company was selling liquid sulfanilamide as a treatment for infections in children. The substance was difficult to dissolve, so the company decided to use diethylene glycol as the solvent. The formula went into production and then hit the shelves, although its toxicity had not been tested even on animals. Soon, the Food and Drug Administration received a report from Oklahoma stating that eight children and one adult had died after taking the elixir. The drug was promptly recalled, but in the end, there were 107 deaths [3].

The turning point was the thalidomide disaster [4]: after rat tests, its developers decided it could be administered to humans as a sedative. West German authorities facilitated its market release, and soon thalidomide was on sale in almost 50 countries. The results were catastrophic: 40,000 people suffered from polyneuropathy, a peripheral nerve disorder. Approximately 10,000 babies, whose mothers had been taking thalidomide, were born with severe limb defects; half of the newborns did not survive [5].

Soon, testing methods were transformed, with governments taking medication releases under control and imposing stringent clinical trial rules for all pharmaceutical companies. Further studies revealed that different animal species react differently to the same substance. That is why new drugs began to be tested on rats, hamsters, rabbits, monkeys, and other animals. Active substances were also tested on cell cultures in vitro and on human volunteers.

Clinical trials today: types and design

The World Health Organization defines a clinical trial as any research study that prospectively assigns human participants or groups of humans to one or more health-related interventions to evaluate the effects on health outcomes.

There are two types of clinical trials: observational and experimental. In the former case, scientists simply collect data while monitoring a natural process and conclude whether the analyzed drug facilitated recovery or caused any adverse effect. The latter type is more widespread. Participants are divided into two groups: one group gets the medicine, while the other takes a placebo or a substance already available on the market [6]. At the same time, there is no single trial protocol — depending on the medication objectives, drug developers have to plan and design anew every time, turning to the academic community for advice.
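To illustrate the two-group design, here is a minimal sketch of how participants might be split by chance into a treatment arm and a control arm. It uses permuted blocks, one common way to keep the arms balanced. The function name, block size, and participant IDs are illustrative assumptions and do not come from any specific trial protocol.

```python
import random


def block_randomize(participant_ids, block_size=4, seed=None):
    """Assign participants to 'treatment' or 'control' in balanced blocks.

    Hypothetical helper for illustration only; block_size must be even.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even")
    rng = random.Random(seed)
    allocation = {}
    for start in range(0, len(participant_ids), block_size):
        block = participant_ids[start:start + block_size]
        # Each full block contains both arms in equal numbers, in random order.
        arms = ["treatment", "control"] * (block_size // 2)
        rng.shuffle(arms)
        for pid, arm in zip(block, arms):
            allocation[pid] = arm
    return allocation


participants = [f"P{i:03d}" for i in range(1, 13)]  # made-up IDs
allocation = block_randomize(participants, block_size=4, seed=42)
counts = {
    arm: list(allocation.values()).count(arm)
    for arm in ("treatment", "control")
}
print(counts)  # {'treatment': 6, 'control': 6}
```

Because every full block holds the same number of treatment and control slots, the two arms stay the same size no matter how the shuffle falls. In a blinded trial, a table like this would be kept away from the people who deal with participants directly.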

The trial protocol is the document that helps analyze testing phases [7]. Without it, the regulator will not grant permission to start the trial. The protocol specifies the trial goals and location, its sample size, and the criteria for selecting candidates. Participants must be fully aware that they are taking part in an experiment and must give their prior consent. The informed consent given by every patient is based on the ethical principles stipulated by the World Medical Association Declaration of Helsinki. Each participant confirms it by signing and dating the informed consent form required by the GCP standard (ICH Harmonized Tripartite Guideline for GCP) [8].

The protocol also describes procedures, data processing, and analysis methods. The “gold standard” of today is the randomized, double-blind, controlled clinical trial. Trials consist of four phases, three of which are mandatory. In phase one, the drug is administered to 15-200 healthy people; in phase two, to up to 300 patients with the disease the new medicine is supposed to treat.

Phase two may last from several months to two years. In phase three, the medication is compared to standard treatment and tested on 300-3,000 patients for up to four years. Phase four can be initiated after the drug has been registered; it helps reveal long-term adverse effects and rare contraindications, with thousands of people taking part.

The final phase is also called post-marketing monitoring. Collecting this “field data” can be a challenge, yet without it, it is almost impossible to draw a comprehensive conclusion about a medication’s efficacy and safety, expand its indications, or detect potential adverse effects.

Thus, the full cycle of clinical development takes over 10 years on average before the new drug makes its way to the market. State regulators need to be confident that clinical trials adhere to the GCP standard before they authorize the medication [9].

Sometimes, accelerated registration is possible when there is an urgent medical need related to a severe disease with few therapeutic options. This procedure is relevant for orphan diseases, which affect only 10 out of 100,000 people. Many manufacturers see no benefit in making such medications, as their development requires multimillion investments that may not pay off. This is why many countries facilitate authorization of orphan drugs to incentivize pharmaceutical companies.

Accelerated registration was also applied during the coronavirus pandemic. In December 2021, the European Medicines Agency recommended granting conditional authorization to Nuvaxovid, a vaccine developed by the US company Novavax. Even after that, however, the manufacturer is still obliged to provide full information on its clinical trials.

Too long, too expensive: why we need to revise the trial format

Unfortunately, breakthrough medicines are still out of reach for most patients. In the late 2010s, several so-called “genetic pills” were released, such as Luxturna, which restores eyesight, Glybera, which helps treat lipoprotein lipase deficiency, and Kymriah, which is used against acute lymphoblastic leukemia. The price of these drugs can reach USD 1 million. One shot of Zolgensma, a medication against spinal muscular atrophy in children, costs over USD 2 million.

These medications are truly revolutionary, but bringing them to market faster and reducing their prices requires an overhaul of current approaches to clinical trials. Pharmaceutical companies complain of a lack of volunteers, numerous ethical and legal constraints, and time- and money-consuming trial procedures. All of this leads to exorbitant prices for medications that, for obvious reasons, have no rivals on the market.

Computer technologies: automation

The author proposes to revise the process of clinical trials using computer technologies that could assist scientists at different stages of drug development.

The development starts with defining the molecular target that needs to be impacted to change the course of the disease [10]. Once the target has been found, it is necessary to find a substance capable of binding to it. According to estimates, there can be up to 10^63 possible organic compounds of up to 30 atoms in size; thanks to computational methods, there is no need to look through all of them.
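A rough back-of-the-envelope calculation shows why an exhaustive search is out of the question. It takes the estimate above as 10^63 compounds; the screening rate of one billion compounds per second is a deliberately generous, made-up assumption used only for illustration.

```python
CHEMICAL_SPACE = 10**63            # upper estimate of compounds with up to 30 atoms
RATE_PER_SECOND = 10**9            # hypothetical screening speed (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600

# Whole years needed to evaluate every compound once at the assumed rate.
years_needed = CHEMICAL_SPACE // (RATE_PER_SECOND * SECONDS_PER_YEAR)

# Report only the order of magnitude (number of digits minus one).
print(f"about 10^{len(str(years_needed)) - 1} years")  # about 10^46 years
```

For comparison, the universe is roughly 1.4 × 10^10 years old, so even this unrealistically fast brute-force screening would never finish. Computational methods are useful precisely because they narrow the search to a small, promising part of chemical space.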

Even now, in silico modeling is applied in every drug development project, whether it involves small molecules, proteins, or gene therapy drugs. The experimental stage requires synthesizing and testing hundreds of thousands of candidates selected by the computer; only a few of them will reach preclinical and clinical trials. These processes need more sophisticated automation.

Automation makes it possible to produce and analyze multiple substances under the same conditions while minimizing the human factor and greatly reducing development time. Of course, this is closely tied to computing methods, mainly machine learning.

The next stage of automation would require uniting all laboratory equipment into one system controlled by a single software platform [11]. A perfect scenario would bring together all stages, from modeling to preclinical trials. For now, this scenario is almost impossible, but some companies are taking their first steps in that direction, such as AstraZeneca with its NiCoLab platform [12]. Laboratory devices can be equipped with sensors to track all stages of the experiment [13]. This approach is called the Internet of Things, and it is capable of boosting lab performance by 30-40% [14].

This method can be expanded to clinical trials as well: thanks to telemedicine and various gadgets, the researchers can monitor the status of the volunteers in real time.
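As a minimal sketch of what such real-time monitoring could involve, the snippet below flags readings from a wearable device that fall outside an allowed range. The function name, the heart-rate values, and the thresholds are all invented for illustration and are not clinical guidance.

```python
def flag_readings(readings, low, high):
    """Return the (timestamp, value) pairs that fall outside [low, high]."""
    return [(t, v) for t, v in readings if not (low <= v <= high)]


# Made-up heart-rate stream (beats per minute) from a hypothetical wearable.
heart_rate = [
    ("09:00", 72),
    ("09:05", 75),
    ("09:10", 131),
    ("09:15", 48),
    ("09:20", 80),
]

alerts = flag_readings(heart_rate, low=50, high=120)
print(alerts)  # [('09:10', 131), ('09:15', 48)]
```

A real system would receive such readings continuously and pass the flagged ones to the research team, but the core check is the same range comparison.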

In silico: from animal to humans

When it comes to preclinical trials, developers can now design a complete virtual “human model”, digitally recreating an organism to study various biological processes in silico. Whereas in vivo refers to experiments in living organisms and in vitro to experiments outside them, in cell cultures or cell-free systems under lab conditions, the in silico approach covers modeling the behavior of single molecules, biochemical processes, and even the functioning of specific physiological systems [15].

This modeling is a costly process, yet it provides almost infinite opportunities for research, testing existing substances, and monitoring various types of therapy.

Big data as a resource

Another promising direction involves big data technologies that can accelerate clinical trials. For instance, analytical systems can screen several databases for trial candidates. Manufacturers are trying to get access to patients’ medical records and strike deals with IT companies competent in big data analysis. In February 2018, the pharmaceutical holding Roche paid USD 2 billion to purchase all shares of Flatiron Health, a startup that collects clinical data on cancer patients.
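The candidate screening mentioned above can be pictured as a filter applied across several record sources. The sketch below is a simplified illustration under stated assumptions: the record fields, the eligibility criteria, and the patient data are all hypothetical, and real systems work with far richer and messier records.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    patient_id: str
    age: int
    diagnosis: str
    consented: bool  # patient agreed to be contacted about trials


def screen_candidates(databases, diagnosis, min_age, max_age):
    """Return de-duplicated IDs of consenting patients who match the criteria."""
    seen = set()
    matches = []
    for records in databases:
        for rec in records:
            eligible = (
                rec.consented
                and rec.diagnosis == diagnosis
                and min_age <= rec.age <= max_age
            )
            if eligible and rec.patient_id not in seen:
                seen.add(rec.patient_id)
                matches.append(rec.patient_id)
    return matches


# Made-up records from two hypothetical sources.
clinic_a = [
    PatientRecord("A1", 54, "type 2 diabetes", True),
    PatientRecord("A2", 71, "type 2 diabetes", True),   # too old for this trial
    PatientRecord("A3", 47, "asthma", True),            # different diagnosis
]
clinic_b = [
    PatientRecord("A1", 54, "type 2 diabetes", True),   # same patient, second source
    PatientRecord("B7", 60, "type 2 diabetes", False),  # no consent, skipped
    PatientRecord("B9", 38, "type 2 diabetes", True),
]

candidates = screen_candidates([clinic_a, clinic_b], "type 2 diabetes", 18, 65)
print(candidates)  # ['A1', 'B9']
```

Checking consent first and removing duplicates across sources reflects two practical concerns: only patients who agreed should be contacted, and the same person often appears in more than one database.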

South Korean companies Hanshin Medipia and Infinity Care actively use the Longenesis blockchain platform to facilitate their biomedical research. This technology automates the process of obtaining patients’ consent to medical intervention: platform users agree to take part in a medical study or a trial. The system works with pharmaceutical companies and research institutes to help them browse through anonymous metadata and see what information is available. Patients are then invited to join a test or a study, or to provide their information for drug effectiveness assessment. This significantly speeds up routine working procedures.

Big data also helps forecast adverse effects of certain components and compounds even before the trial. The analytical method includes checking hundreds of various substance characteristics and saves the manufacturer’s time and money.

The information on how the drug performs “in the field” is collected either after the clinical trials or during phase four. The number of trials involving big data analysis already exceeded 300 in 2017, according to Reuters, which cites the website of global clinical trials. Companies are most interested in researching cancers, heart diseases, and respiratory disorders [16].

In March 2018, it became known that the world’s largest pharmaceutical companies have divisions collecting and using real-life data on various diseases. These activities include diabetes studies by AstraZeneca and Sanofi, joint research on stroke prevention by Pfizer and Bristol-Myers Squibb, and a Takeda Pharmaceutical project on intestinal disorders.

Despite the numerous opportunities for applying big data in the industry, several constraints must be taken into consideration. First, pharmaceutical companies often do not know how to use the data correctly. Second, the data is often unstructured, incoherent, and presented in different formats, so it has little real value as it stands. Finally, its use requires fully digitalized patient consent, and getting patients engaged in this process can also be problematic.

Personalized drug development

Big data methods have vast prospects in personalized medicine, the bleeding edge of healthcare development. In the context of this article, we are interested in the drug-on-demand segment. This may sound incredible, yet big data may help customize medications for individual patients, thus boosting the effectiveness of unique therapy.

By analyzing large amounts of data, scientists will be able to detect disease patterns and forecast disease progression and possible virus mutations. Meanwhile, developers will get the chance to profile risks, focus on preventing diseases, and predict which drugs will be in demand in the future.

It seems evident that big data, automation, and digitalization have a lot to offer for advancing clinical trials — and their full-scale implementation will revolutionize medicine.

About the Author

Rustam Gilfanov

Rustam Gilfanov is a famous IT entrepreneur, a founder of a large IT company, and a partner of the LongeVC Fund.


  1. Doll R. Controlled trials: the 1948 watershed. BMJ. 1998; 317: 1217-1220.
  2. Drinker P, McKhann CF. The use of a new apparatus for the prolonged administration of artificial respiration: I. A fatal case of poliomyelitis. JAMA. 1929; 92: 1658-1660.
  3. Elixir Sulfanilamide-Massengill: Report of the United States Secretary of Agriculture. Cal West Med. 1938 Jan; 48(1): 68-70. PMID: 18744371; PMCID: PMC1705805.
  4. Arlen RR, Wells PG. Inhibition of thalidomide teratogenicity by acetylsalicylic acid: evidence for Prostaglandin H Synthase-catalyzed bioactivation of thalidomide to a teratogenic reactive intermediate. J Pharm and Exp Thera (JPET). 1996; 277: 1649-1658.
  5. Vargesson N. Thalidomide-induced limb defects: resolving a 50-year-old puzzle. BioEssays. 2009; 31(12): 1327-1336.
  6. Nellhaus EM, Davies TH. Evolution of Clinical Trials throughout History. Marshall Journal of Medicine. 2017; 3(1): Article 9. DOI:
  7. O’Brien K, Wright J. How to write a protocol. J Orthod. 2002; 29(1): 58-61.
  8. ICH harmonized tripartite guideline: Guideline for Good Clinical Practice. J Postgrad Med. 2001; 47: 45-50.
  9. ICH harmonised guideline integrated addendum to ICH E6(R1): Guideline for Good Clinical Practice ICH E6(R2). ICH Consensus Guideline.
  10. Naveja JJ, Medina-Franco JL. Finding Constellations in Chemical Space Through Core Analysis. Front. Chem. 2019; 7.
  11. Hayden EC. The automated lab. Nature. 2014; 516: 131-132.
  12. Chubb P. From simple automation to smart labs: the future for drug discovery. Eur. Pharm. Rev. 2020.
  13. Chubb P. Productivity in pharma could be drastically improved with IoT implementation. Eur. Pharm. Rev. 2020.
  14. Pearson S. Biopharma’s Lab of the Future Can’t Wait. GEN. 2020.
  15. Sieburg HB. Physiological studies in silico. In: Complex systems 1990. SFI Series «Studies in the Sciences of Complexity». 1991; 12: 321.
  16. Hirschler B. Big pharma, big data: why drugmakers want your health records. 2018. 15 Available at:

