
The most realistic clinical data possible

Savana is unique in curating RWE from different sites, across multiple countries and languages.


These are some of our Publications


ERS - publications logo
An association between the severity of coronavirus disease 2019 (COVID-19) and the presence of certain chronic conditions has been suggested.
Journal of Clinical Medicine logo
Patients with Chronic Obstructive Pulmonary Disease (COPD) have a higher prevalence of coronary ischemia and other factors that put them at risk

Generating RWE across 16 countries and 5 languages


3 billion

Electronic Medical Records

Years of experience in AI

16

Countries

5

Languages

200+

Healthcare providers

6 improved clinical practices


We send several weekly emails for you to start making friends with AI.

Sign up if you want to generate evidence and improve clinical practice beyond classical methods.

Subscribe here:

By the way, if you register you will receive our masterclass about AI in healthcare.

These days there are countless companies doing RWE. And they all push the same messages, even though no one really knows what they mean.

Statements like:

Improving patient outcomes with better insights.

End-to-end data platform for collecting and connecting the information.

Better, more and affordable real-world evidence at scale.

Information processing, analytics & research platform.

Accelerating health research through automatic data capture.

Connected intelligence for real-world analytics.

Lately, they have also started to drop in a few cool buzzwords, such as #AI, #MachineLearning and #NaturalLanguageProcessing.

But as YOU know, medicine is not simple.

And at the end of the day, predictive value in real, prospective patients ends up unmasking everything.

And that cognitive-computing-analytics-Alpha-power-project is not going to improve the quality of your evidence generation nor is it going to do anything for the patients.

In fact, all these groups have good intentions; they ultimately want to communicate two ideas which are totally right:

01. Artificial Intelligence:

Artificial Intelligence is mathematics supported by computation, which, when used effectively, can generate analyses that go beyond classical statistics.

  • In addition, today we have technology that we did not have 10 years ago and that allows us to handle enormous amounts of information.
  • If you manage all this well, with a good clinical question and with order, you can make a leap in the quality of the evidence.

02. Electronic Medical Records:

Electronic Medical Records are an immense source of valuable clinical information that formerly could only be extracted partially, manually, slowly and at great expense. Now there is technology that allows it to be extracted at scale.

These two statements are true because some things have changed:

AI and machine learning techniques have exploded.


Some people think that AI is an almighty computer that you can talk to and that will give you all the answers about all the patients, or an infinite, automatically generated database. But in short, what it is, and what we now have, is a tool that allows us to make real progress towards finding associations and new variables.

Natural Language Processing is now very robust, and it lets us extract the roughly 80% of the information that sits in narrative free text.


And of course now we have Electronic Medical Records (not paper), with file formats and standards that are getting better at talking to each other.


The consequence is that traditional registries have been outperformed.

And so have observational studies, manual chart reviews, claims databases, ICD codes…

Because now we can extract the complete information from medical records.

Not just the claims, or just a few hundred variables that someone decided to register. Now we can have much more realistic and flexible databases.

And even better, we can update them directly from the information systems at the points of care, without intermediaries. 

Just as biotech has improved exponentially, so have the possibilities for RWE.

And the timing is perfect, because RWE is in greater demand than ever for decision making.

But what is the problem with this new AI approach to RWE?

Very simple. It is not research-grade. All these players have a tech angle, but not a science angle. 

If you request information from the same site several times, you will get different variables. It’s not robust. It’s not validated.

That’s why Savana’s absolute focus for 8 years has been on developing a methodology through which AI tech meets science standards, controlling bias and missing data.

This way, the generation of information from medical records is replicable and accurate.


In other words, if we create the same pragmatic registry several times, we will get the same variables. And we do it without using complicated AI technicalities as an excuse to lower the standards.

And since we have strategies to harmonize this automatic extraction among sites, countries and languages, the information becomes generalizable at an international level.

And once we meet the scientific standards, we can start enjoying the richness of information.

We can get so much deeper into the information that we needed a name to express the amount of insight we were finding.

We called this Deep RWE.

  • Deep RWE basically incorporates richer clinical characteristics into high-validity pragmatic registries.
  • That offers an opportunity to understand populations across clinical criteria while also supporting care pathway enhancement.
  • It is possible to pragmatically achieve high visibility into cohorts of interest, interventions, and clinical and financial outcomes.
  • By engaging health systems directly, we go to the source for the highest quality phenotype data.
  • Moving from hundreds of variables to tens of thousands allows us to query the databases with disruptive questions about biomarkers.

This approach is extremely flexible, as you can go with one single specific clinical question or with a whole approach to a disease, retrospectively (for example 5 years) and prospectively, with updates of information at the requested frequency.

It's like having a dynamic registry, but without having to create it.


Applying high-validity Pragmatic Registries
across 16 Therapeutic Areas in 16 countries

An international oncology study of Artificial Intelligence applied to electronic medical records:

This is a unique collaborative study between the Head and Neck Cancer International Group (HNCIG) and Savana.

The first study of its kind for head and neck cancer, HNC-TACTIC is a multi-language, multi-center, retrospective, real-world evidence study analyzing Electronic Medical Records (EMRs).

The study aims to describe patients with head and neck squamous cell carcinoma (HNSCC) in a real-world setting.

 

What we do:

  • Sometimes the simplest way can also be the best one.
  • And the best way is not a cut of a database, nor a group of preselected variables, nor a certain number of patients. It’s not that. Nor is it a very costly registry.
  • The best way we can imagine is to retrieve the actual complete information about what is happening at the points of care.
  • It’s having access to the complete information in the medical records.

Every single patient. Every single variable.
The most realistic data source possible.


  • In order to get this, you basically need:
    • A combined team of data scientists and clinicians with experience in research (in our case, led by oncologists).
    • A system able to retrieve information from any healthcare provider (as long as they have electronic medical records, or EMRs; paper doesn’t work).
    • Natural Language Processing, because 80% of the variables and outcomes are going to be in the clinical narratives’ free text.

This system is exactly what we created.


And how, in practice, is getting the information through this methodology better?


  • The key is our team helping researchers select which fragments of meaningful data are needed to satisfy the objectives of the investigation.
  • In fact, because we had to signify how much deeper we get into data compared to anyone else, thanks to AI (true AI, not buzzword AI), we called the result of this methodology Deep Real World Evidence.
  • As you have probably experienced, existing databases collect clinical data, but with considerable gaps due to recording limitations in current methods.
  • Deep Real World Evidence from EMR offers much (not a bit, but much) greater insight into the routine clinical care of patients throughout all stages of the disease.
  • By combining free text with other data sources (e.g. laboratory data, pathology, genomics, etc.), an in silico registry is generated to describe the patient population with the defined disease, their associated clinical conditions and treatments, and to develop predictive models.

Deep data layers analysis:


If we do our job well, there is no need for:

  • Observational studies.
  • Traditional registries.
  • Classical disease databases.

Drug discovery: beyond EMR and into genomics.

Once we have facilitated the most difficult part, which is extracting variables from free text (clinical characteristics, comorbidities, signs and symptoms, adverse events or outcomes), we can also combine all this unstructured information with other structured data layers (genomics, transcriptomics, proteomics and imaging) which can be sourced both from our worldwide network of hospitals and from clinical trial databases.

Savana works with its premium partners in order to offer a combined proposal:

We tested our scientific and technological capabilities through the following Reality Checks:

01 - Crohn's Disease

WHAT

  • An AI predictive model for Crohn’s disease relapses.

WHY THIS WORK IS REMARKABLE 

  • Because through the analysis of almost 6,000 patients and the ranking of 25,000 variables, it created one of the first AI algorithms in Inflammatory Bowel Disease.

WHAT WAS THE CONSEQUENCE 

  • Gastroenterologists now have an available predictive model for this disease.

WHAT THIS WORK DEMONSTRATES

  • The RWE generated through NLP applied to EMR, in combination with a Machine Learning approach, facilitates the generation of predictive models in inflammatory diseases.
European Journal of Gastroenterology & Hepatology logo

02 - Coronary Type 2 Diabetes

WHAT

  • High rates of cardiovascular events in a large real-world series of PCI-revascularized patients with Type 2 Diabetes and Coronary Artery Disease with no history of myocardial infarction or stroke.

WHY THIS WORK IS REMARKABLE

  • Because through NLP it was possible to analyse more than 200,000 diabetes patients from 12 representative hospitals in a European region, without having to create any database or registry.

WHAT WAS THE CONSEQUENCE 

  • Knowing the prevalence made it easier to reach agreements on the most appropriate management of the disease.

WHAT THIS WORK DEMONSTRATES 

  • NLP applied to EMR is an improved and more innovative method for generating epidemiology for almost any disease.

03 - Systemic corticosteroids - Bronchial Asthma

WHAT 

  • Systemic corticosteroids are frequently prescribed to patients with asthma, especially in primary care. Their use is associated with a greater number of adverse events.

WHY THIS WORK IS REMARKABLE

  • Because it was able to jointly analyze patients from both Primary Care and Specialized Care, in a healthcare system where the information from these two environments was previously disconnected.

WHAT WAS THE CONSEQUENCE 

  • Awareness was raised about the overprescription of systemic corticosteroids in clinical practice.

WHAT THIS WORK DEMONSTRATES 

  • Savana’s NLP and Machine Learning techniques represent a robust way of identifying variability and quality issues in clinical practice.
Journal of Investigational Allergology and Clinical Immunology logo

04 - COVID

WHAT

  • Inhaled corticosteroids may be associated with a protective effect against severe COVID-19.

WHY THIS WORK IS REMARKABLE

  • Because these results were consistent (months ahead) with the NEJM-published RECOVERY clinical trial, led by Oxford.

WHAT WAS THE CONSEQUENCE 

  • The pulmonologists were able to reinforce their observed clinical impression regarding steroids and COVID, in real time during the pandemic. 

WHAT THIS WORK DEMONSTRATES 

  • Savana’s NLP and ML methodology is reliable for establishing associations between disease and treatments, with no manual data collection efforts required.
European Respiratory Journal logo

05 - COPD

WHAT 

  • Clinical management of COPD in a European region.

WHY THIS WORK IS REMARKABLE

  • Because it scanned a complete population of more than 1 million, analysing every patient with COPD in a few days, avoiding the manual creation of any registry or database.

WHAT WAS THE CONSEQUENCE 

  • Decisions regarding innovative therapies in COPD were taken.

WHAT THIS WORK DEMONSTRATES 

  • Savana’s NLP and ML methodology enables clinicians to make informed decisions about drug indications.

06 - COVID in COPD

WHAT

  • A higher incidence of COVID-19 in COPD patients and higher rates of hospital admissions and mortality, mainly associated with pneumonia.

WHY THIS WORK IS REMARKABLE

  • Because it was able to scan every patient with COPD and COVID in a European region in the middle of the first wave of the pandemic. Applying NLP to EMRs at scale enabled us to extract pioneering insights faster than any traditional method could have achieved.

WHAT WAS THE CONSEQUENCE

  • It was the first source of information confirming the intersection of these two conditions.

WHAT THIS WORK DEMONSTRATES

  • Savana’s Research Network and methodology facilitate the accelerated generation of updated evidence in a novel manner unrivalled by more traditional methods of data collection.
Journal of Clinical Medicine logo


Technology and analysis: You choose.

Some healthcare providers are very good at data science, AI and even Natural Language Processing.

In those cases, they don't need our technology.


We simply make sure that the data they produce is harmonized with the other sites in our network, so they can jointly conduct multisite research projects.

When sites become part of the dRWE research ecosystem, they are invited to participate in national and international research studies, contributing to evidence generation, accelerating health science and improving patient care.

Savana informs each site about the ongoing research studies in which they may participate, sponsored by private and/or public institutions. If a site is interested, and following ethics committee approval, there is no longer a requirement to complete Case Report Forms (CRFs), since the data is already structured and available. It is simply a matter of sharing the specific data agreed in the study protocol.

Sometimes it's exactly the opposite. What sites want is our technology, not our data analysis capabilities.


They need our platform to curate their data. They appreciate the power to unlock all of the clinical value embedded within existing Electronic Medical Records and reuse it themselves for different purposes, ranging from research projects (for example through a grant) to clinical trial recruitment and even predictive modelling for management.

Even though NLP can be done by many, they appreciate that we have read more than 3 billion clinical documents, so our engine is well trained. Sometimes it’s better not to reinvent the wheel.

And sometimes they just want both: NLP and also our analysis. We are happy with that too.​


Scientific institutions also benefit from our approach

This partnership will accelerate the use of data from de-identified EMRs to monitor disease progression and outcomes in UK patients hospitalised with COVID-19 as part of the international Big COVIData study.

This partnership will unlock the clinical data from de-identified free text in Electronic Medical Records to establish a predictive model that identifies patients who may have COPD but have not yet been diagnosed, and to identify medications that are delivering positive outcomes while being affordable to all.

Of course not everything is easy.

Even though we apply a federated model where data never belongs to us, hospitals are often a pain with regard to agreements around medical records, and the EMR providers usually don’t help.

But we have the ability to give enough value to healthcare providers so that everybody wins.


What we offer to the sites is:

  • The experience of a company focused exclusively on EMR reuse. Not EMR reuse and proteomics. Or Medicine and banking and retail.
  • We have read 3 billion clinical documents. That is a lot.
  • We have invested in creating our system so that it’s multilingual by design.
  • We have clinicians involved full time in every aspect of our projects.

The problem we essentially solved for the first time is bringing the inferences of variables (from clinical notes) to scale.

And the solution we provide is the interaction between clinicians and data scientists:

This is our healthcare provider network

involving +200 sites

We send several weekly emails for you to start making friends with AI.

It's not for those who want to generate evidence by traditional means, filling registries by hand and using logistic regression. This is for those who want to leverage machine learning in order to generate evidence in a more automated way and with higher granularity of variables.

Subscribe here:
