Generate deeper and more agile clinical evidence, thanks to the use of AI

Fill in this form carefully so that we can check if there is a match between you and us:

We will contact you even if we are not interested in your research project.

Hey, look! Before you continue scrolling, this is of INTEREST TO YOU:​

Big Pharma's mistake with AI almost cost them their jobs.

“Be careful what you wish for, it may come true.” Oscar Wilde

Yesterday I told you how there are pharmas that commission AI algorithms and don’t know what they are getting into.

What I was telling you was not a theoretical idea. It is something concrete that happened to us with a big pharma in one of its subsidiaries.

I’m going to tell you about it because it represents a great lesson. 

One that you should not forget.

Look, these evidence generation teams sometimes commission AI projects because it’s cool, because it sounds sophisticated and because they want to impress the KOLs.

That’s all fine.
The problem is not realizing the power of well-analyzed data.

I don’t mean analyzed using SPSS or R, with old school statistical models.

I don’t mean pre-2015 technology. It’s not that.

I mean using neural networks, machine learning, to find signals in the noise.

To solve key research questions that come from key business questions.

But the problem with not thinking it through, with acting on fads, is that what happened to this pharma can happen to you.

They came up with models that predict their efficacy and safety versus their competitors.

We are talking about a very prevalent indication that moves millions of dollars.

And of course, the algorithm spoke.

And it ordered which drugs were better or worse for each type of patient.

Fortunately, by pure chance, the results were in line with the current guidelines.

And there were no major repercussions.

But it was pure luck.

Because they had not foreseen any of this.

And when they saw the spectacular results, they ran nervously from one place to another.

This cannot be done like this. We have to think deeper. Really, deeper.

If you commission a machine learning project to predict response in subsets of patients you are likely to beat, nay, crush, the others. But think through what you will do with the algorithm when you have it in hand.

Think about it deeply because you’re going to need it.

High-validity Virtual Registries
across 17 Therapeutic Areas in 13 countries


  • Atrial Fibrillation
  • Coronary disease and Diabetes Mellitus type II



  • Diabetes Mellitus Type I & II
  • Hyperparathyroidism



  • Chronic Lymphocytic Leukemia
  • Multiple Myeloma
  • Amyloidosis
  • Hemophilia B


  • Sepsis


  • Primary & Secondary Immunodeficiencies


  • Ischemic Stroke
  • Motor Neuron Disease/Amyotrophic lateral sclerosis
  • Multiple Sclerosis
  • Post-stroke spasticity with Botulinum Toxin


  • Postpartum Bleeding


  • Conjunctivitis

Traumatology & Orthopedics

  • Hip Fractures
  • Second Hip Fractures
  • Carpal Tunnel

An international oncology study of Artificial Intelligence applied to electronic medical records:

This is a unique collaborative study between the Head and Neck Cancer International Group (HNCIG) and Savana.

The first study of its kind for head and neck cancer, HNC-TACTIC is a multi-language, multi-center, retrospective, real-world evidence study analyzing Electronic Medical Records (EMRs).

The study aims to describe patients with head and neck squamous cell carcinoma (HNSCC) in a real-world setting.


What we do:

  • Sometimes the simplest way can also be the best one.
  • And the best way is not a cut of a database, nor a group of preselected variables, nor a certain number of patients. It’s not that. Nor is it a very costly registry.
  • The best way we can imagine is to retrieve the actual complete information about what is happening at the points of care.
  • It’s having access to the complete medical records information.

Every single patient. Every single variable.
The most realistic data source possible.

  • In order to get this, you basically need:
    • A combined team of data scientists and clinicians with experience in research (in our case, led by oncologists).
    • A system able to retrieve information from any healthcare provider (as long as they have electronic medical records, or EMRs; paper doesn’t work).
    • Natural Language Processing, because 80% of the variables and outcomes are going to be in the clinical narratives’ free text.

This system is exactly what we created.

And how, in practice, is getting the information through this methodology better?

  • The key is our team helping researchers select which fragments of data are meaningful for the objectives of the investigation.
  • In fact, because we wanted to convey how much deeper we go into the data compared to anyone else, thanks to AI (true AI, not buzzword AI), we called the result of this methodology Deep Real World Evidence.
  • As you have probably experienced, existing databases collect clinical data, but with considerable gaps caused by the recording limitations of current methods.
  • Deep Real World Evidence from EMR offers much (not a bit, but much) greater insight into the routine clinical care of patients throughout all stages of the disease.
  • By combining free text with other data sources (e.g. laboratory data, pathology, genomics, etc.), an in silico registry is generated to describe the patient population with the defined disease and their associated clinical conditions and treatments, and to develop predictive models.

Deep data layers analysis:

If we do our job well, there is no need for:

  • Observational studies.
  • Traditional registries.
  • Classical disease databases.

In some situations, AI-generated RWE makes
a bigger difference than in others.

Below you can find the cases where we discovered, along with our customers, that deep-RWE brings a higher benefit compared to traditional RWE:

01 - Pipeline

Killing the programs with lower chances of success as early as possible, thanks to more accurate and reliable knowledge than market research activities provide. Focus on:

  • Patients’ characteristics and subpopulations.
  • Patient journeys.

02 - Clinical Development

Increased speed at lower cost thanks to:

  • Uncertainty reduction in patient availability, outcomes, non-responders & effect-size. Focus on:
    • Epidemiology and patient pathway.
    • Predictive factors.
    • Standard of care.
  • Higher likelihood of technical success with:
    • Subpopulations for randomized clinical trials, avoiding effect-size dilution.
    • Endpoints identification.
  • RWD control arms.

03 - Regulatory Dossier

  • Stronger clinical development package
    • Identifying unmet need.
    • Indirect treatment comparison: standard of care & outcomes. 
  • Satisfying post-authorization commitments:
    • With high quality data for drug description and effectiveness/safety.
    • Meeting deadlines for deliverables.
    • At lower cost.
  • Identifying surrogate endpoints:
    • Facing immaturity of primary/hard clinical trial endpoints.
    • Targeting endpoints impacting on patients’ lives.
    • Demonstrating surrogacy.
  • Predicting non/poor-responders:
    • Narrowing the indication to a more reimbursable population, thanks to reducing outcomes uncertainty, thus getting higher price per value.

04 - Differentiation & Adoption

  • Real world benefits against competitors.
  • Identification of differential patient relevant endpoints.
  • Identification of predictive factors of response.
  • Optimal treatment sequence.
  • Subpopulation clustering.

05 - Institutional & KOL development

  • Complementary Evidence generation to Clinical Development and R&D:
    • Ensuring reliability thanks to independence.
    • Achieving diversity & reproducibility.
    • At lower cost.
  • Strategic alliances & partnerships with KOLs & sites:
    • Faster execution with just in time response to unpredictable needs.

Discover how we do it:

Drug discovery: beyond EMR and into genomics.

Once we have facilitated the most difficult part, which is extracting variables from free text (clinical characteristics, comorbidities, signs and symptoms, adverse events or outcomes), we can also combine all this unstructured information with other structured data layers (genomics, transcriptomics, proteomics and imaging) which can be sourced both from our worldwide network of hospitals and from clinical trial databases.

Savana works with its premium partners in order to offer a combined proposal:

You need to know that we did this before… many times

We invested millions and years in developing a methodology by which we can infer the variables from the EMR while maintaining quality and controlling bias. The result is a methodology whose results are replicable and thus generalizable.

We collaborate with a network of 200 hospitals across Western Europe and the Americas.

And yes, we are absolutely the only ones who do this at a multilingual level!


You don’t have to. You just need to go to our peer-reviewed publications, both clinical and technical, where our methodology has been scrutinized and proven.

In our publications you will also find validations of the AI models we have created.

It depends on what you understand by more complicated. If you only need one pair of shoes, it’s easier to just make it. But if you need thousands of shoes, the only way is to build a factory.

If you want to generate real world evidence about a disease or a drug, you will normally want a) very granular information and b) new mathematical models to find new associations and hypotheses. If so, this is your method. If you would rather spend millions and years creating a registry, this is not for you.

It really depends on how deep you want to get into the information. If you want the information in 1 month, then you’d better go for a database cut. But if you want to own a dynamic registry, navigate it, and query it in search of new insights, and you can wait some months to have this, then it’s definitely worth it.

We are just enjoying the result of years of focused investment into being the best at mining medical records for real world evidence generation purposes. There is no magic in it. All we are doing is applying state of the art AI and the scientific method to clinical research.

No. For the amount of information you get, it is far more cost-efficient than any traditional way of doing things.

Of course not. You are the only one who has the clinical question, and you will need to guide our team until we are sure that they understand the exact problem you are trying to solve. Aside from that, agreements with hospitals are tough, and in our experience what works best is to convince them by approaching them together, so our collaboration will serve to accelerate the project.

We normalize the clinical concepts according to the SNOMED CT ontology, with variables added by Savana’s internal medical staff in those cases not covered by SNOMED CT. Mapping to OMOP is also part of the process when required.
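To make the idea concrete, here is a minimal Python sketch of this kind of normalization. It is only an illustration under stated assumptions, not Savana’s actual pipeline: the synonym table, the local extension codes and the `normalize` function are all hypothetical, and the two SNOMED CT identifiers are included purely as examples.

```python
from typing import Optional, Tuple

# Two SNOMED CT concept identifiers, used here purely as examples.
SNOMED_CT = {
    "atrial fibrillation": "49436004",
    "type 2 diabetes mellitus": "44054006",
}

# Hypothetical local codes standing in for variables added by internal
# medical staff when no suitable SNOMED CT concept is available.
LOCAL_EXTENSION = {
    "second hip fracture": "LOCAL-0001",
}

# Hypothetical synonym table: maps abbreviations to a preferred term.
SYNONYMS = {
    "af": "atrial fibrillation",
    "t2dm": "type 2 diabetes mellitus",
}


def normalize(mention: str) -> Optional[Tuple[str, str]]:
    """Return (code system, code) for a text mention, or None if unknown."""
    # Lowercase and collapse whitespace so lookups are consistent.
    term = " ".join(mention.lower().split())
    term = SYNONYMS.get(term, term)
    # Try the standard ontology first, then the local extension.
    if term in SNOMED_CT:
        return ("SNOMED CT", SNOMED_CT[term])
    if term in LOCAL_EXTENSION:
        return ("local", LOCAL_EXTENSION[term])
    return None
```

The lookup order mirrors the description above: the standard ontology is tried first, and the locally maintained variables act only as a fallback.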

Yes. Savana is compatible with other similar platforms. Other types of repositories based on structured text or free text can be complementary to the information processed by Savana.

Savana is able to extract both structured and unstructured data (free text). 

Thanks to natural language processing techniques that extract clinical content from the free text of the electronic medical record, Savana lets doctors carry out research on pathologies and/or patient groups in real time and at any time, which has not been possible until now. Those results can also be published in a scientific journal.

Savana facilitates massive and very fast extraction of clinical variables found in the free text of EMRs, which replaces the current work of manually reviewing chart by chart. 

Structured data such as pharmacy, laboratory or genomics can also be extracted and added to the database if required.

Clinical documents. We are the company that has processed the largest number of documents of this type worldwide, which makes our algorithms among the most trained for this purpose. Savana has been implemented in 200+ sites across 16 countries for years, and its use has generated numerous scientific publications answering questions in multiple therapeutic areas.

Savana is compatible with all EMR systems, regardless of format and source. Our technology is vendor agnostic. The only limitation is that the documents must be text, not images. The preferred document formats for the information extracted from the EMR are CSV, JSON, XML and DB; other data exchange formats may also be supported, subject to our prior assessment.
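As a rough illustration of what being format agnostic can look like, the Python sketch below reads the same clinical note from CSV, JSON and XML into one common record shape using only the standard library. The field names (`patient_id`, `note`), the XML layout and the helper functions are assumptions made for this example, not Savana’s actual input specification.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def from_csv(text):
    # Each CSV row becomes a dict keyed by the header line.
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def from_json(text):
    # Expects a JSON array of objects; values are cast to strings
    # so that every source yields the same record shape.
    return [{k: str(v) for k, v in rec.items()} for rec in json.loads(text)]


def from_xml(text):
    # Expects <documents><document><field>value</field>...</document></documents>.
    root = ET.fromstring(text)
    return [{child.tag: (child.text or "") for child in doc} for doc in root]


# Hypothetical sample inputs carrying the same note in three formats.
csv_text = "patient_id,note\n1,Patient reports chest pain\n"
json_text = '[{"patient_id": 1, "note": "Patient reports chest pain"}]'
xml_text = (
    "<documents><document><patient_id>1</patient_id>"
    "<note>Patient reports chest pain</note></document></documents>"
)

records = from_csv(csv_text) + from_json(json_text) + from_xml(xml_text)
```

Once every source is reduced to the same list of dictionaries, downstream text processing does not need to know which system or format a note came from.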

There are no special requirements beyond the usual ones for a healthcare provider’s IT (internet connection, common operating systems, etc.).

It enables the export of all the data in different formats for use by other artificial intelligence tools or statistical tools, such as SPSS or R.

No, the hospital has the processed information at its disposal to use as it deems appropriate.

No. Every site must opt in or out once we have a new study protocol. That way they always keep control over their data. Of course, every hospital can also suggest a study to the rest of the network.

Complete the info, and a KAM will contact you ASAP:

Want to use it?:

Start with your proposed AI + RWE use case:

This is the first step for AI + RWE: