Datawatch: Analytics Goes Viral – How Data is Used to Help Predict, Prevent and Curtail Outbreaks


In the latest blog in our Datawatch series, we look at the role analytics plays in keeping outbreaks at bay – from cholera in the 1800s to COVID-19 today.

COVID-19 has changed the world dramatically in a short space of time, presenting new challenges for world leaders and medical experts alike. In fighting it, we’ve had to use all the tools at our disposal, and past experience tells us that advanced analytics is perhaps the most powerful weapon in our armory.

You’d be forgiven for thinking that analytics and data science are relatively new tools that today give us an advantage in our fight against viral outbreaks. In a sense you would be right: the tools and techniques used by data scientists have evolved significantly in recent years. But analytics has actually been used in this way for over a century, with one of the earliest examples taking place way back in 1854.

At that time, the residents of Victorian London were in the midst of a rampant cholera epidemic that had killed more than 600 people in a week. Little was known about these kinds of outbreaks back then, and many people assumed that cholera was an airborne disease. However, thanks to some rudimentary data analysis and modeling, Dr. John Snow was soon able to put this misunderstanding to bed.

Long before GIS maps were ever a thing, Dr. Snow began to gather data related to the cholera deaths and plot them, by hand, on a map of London. As a result of this early form of data visualization, Snow was able to trace the source of the outbreak to a water pump on Broad Street. The pump handle was removed, and the outbreak was stopped in its tracks.
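To give a flavor of the reasoning behind Snow's map, here is a minimal sketch in Python. It assigns each recorded death to its nearest water pump and counts the totals. The coordinates, the pump names other than Broad Street, and the helper functions are all made up for illustration; this is not Snow's actual data or method, just one simple way to express the same idea.

```python
from collections import Counter
from math import hypot

# Hypothetical pump positions on an arbitrary grid (not real coordinates).
pumps = {
    "Broad Street": (5.0, 5.0),
    "Pump B": (1.0, 8.0),
    "Pump C": (9.0, 2.0),
}

# Made-up locations of recorded deaths, for illustration only.
deaths = [(4.5, 5.2), (5.3, 4.8), (6.0, 5.5), (4.0, 4.0), (1.5, 7.5), (5.1, 6.2)]


def nearest_pump(point, pumps):
    """Return the name of the pump closest to a point (straight-line distance)."""
    return min(
        pumps,
        key=lambda name: hypot(point[0] - pumps[name][0], point[1] - pumps[name][1]),
    )


def deaths_per_pump(deaths, pumps):
    """Count how many deaths fall closest to each pump."""
    return Counter(nearest_pump(d, pumps) for d in deaths)


counts = deaths_per_pump(deaths, pumps)
for name, total in counts.most_common():
    print(f"{name}: {total} deaths nearby")
```

A pump that accounts for far more nearby deaths than its neighbors becomes the obvious place to investigate first. Straight-line distance is a simplification; walking distance along the streets would be a closer match to how people actually fetched water.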

Amazingly, these same techniques are still used today – although visualization has improved somewhat. You can see how Dr. Snow’s work might look if it were carried out today, here.

Big data, analytics, and the fight against COVID-19

Although the theory behind this technique is still widely used today, we now have tools at our disposal that Dr. Snow could only have dreamt of back in 1854 – most notably, huge computational power that allows us to crunch massive amounts of data in record time.

This technology has played a huge part in our battle against the recent COVID-19 pandemic, helping medical experts and world leaders identify the right responses, develop the right solutions, and plot the best routes to recovery. Here are just three ways analytics has helped us to fight the pandemic.

Tracking the spread of the virus

Tracking the spread of COVID-19 has been essential in our battle to mitigate and overcome its impacts. It’s interesting to note that in this instance, analytics played a part in tracking COVID-19 before most of us even knew it existed.

In 2019, an AI system belonging to an outbreak risk startup called BlueDot detected some similarities between what the press was calling ‘a strain of pneumonia’ in Wuhan and the SARS outbreak of 2003.

Since this initial discovery, BlueDot has continued to track the spread of COVID and monitor its movements, using AI to analyze a wealth of unstructured data, including social media posts and news reports.

Social media can actually play a huge role in situations like this. By applying sentiment analysis to unstructured social data, it’s possible to track everything from the regions the virus has spread to, to the attitudes to proposed legislative responses and government guidance.

All of this data can then feed into action plans and help health officials respond more appropriately, accurately defining the best social distancing and quarantine measures, for instance.
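As a rough illustration of how sentiment can be summarized by region, the sketch below scores short posts against a tiny word list and averages the scores per region. The posts, the regions and the word lists are invented for this example, and real systems rely on trained language models rather than a handful of keywords.

```python
from collections import defaultdict

# Toy word lists, assumed for illustration; real lexicons are far larger.
POSITIVE = {"safe", "support", "relieved", "good", "helpful"}
NEGATIVE = {"worried", "angry", "unfair", "bad", "confusing"}

# Made-up (region, post) pairs standing in for unstructured social data.
posts = [
    ("North", "Relieved the new guidance is clear and helpful"),
    ("North", "I support the distancing rules"),
    ("South", "Worried and angry about the new rules"),
    ("South", "The guidance is confusing"),
]


def score(text):
    """Positive word count minus negative word count for one post."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def sentiment_by_region(posts):
    """Average the per-post scores for each region."""
    scores = defaultdict(list)
    for region, text in posts:
        scores[region].append(score(text))
    return {region: sum(vals) / len(vals) for region, vals in scores.items()}


summary = sentiment_by_region(posts)
for region, average in summary.items():
    print(f"{region}: average sentiment {average:+.1f}")
```

A regional average that turns sharply negative after new guidance is announced could be a prompt to look more closely at how that guidance is being received.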

Developing vaccines

As the pandemic trundled on into its second year, it became apparent that this wasn’t going to be something that just went away. And this meant that vaccination was our best chance of life returning to normality.

The problem, though, is that developing a vaccine typically takes years. Before Pfizer and AstraZeneca, the mumps vaccine held the record for the fastest to be developed, and that took almost half a decade.

However, thanks to advances in analytics and AI, a COVID vaccine was approved and made available for emergency use within a year of the virus’ outbreak.

A large part of this was down to global cooperation, and the fact that virologists have encountered coronaviruses before. But data analysis and tools like AI and machine learning were also significant factors.

For example, AlphaFold, an AI system developed by Google’s DeepMind, was used to predict the structures of proteins that could help the virus spread – a vital part of understanding how a virus works and how it can be contained.

AlphaFold is a state-of-the-art system that can predict the structure of proteins based on their amino acid sequence. This system was used to investigate proteins associated with COVID, before the information was made openly available to scientists working on the vaccine.

With the same aim, AI and natural language processing have played a big part in applying analytics to the COVID-19 Open Research Dataset – a collection of almost 500,000 scholarly articles that are gathered from across the world and made openly available to the global research community.

Elsewhere, in a lab in Tennessee, the world’s second-fastest supercomputer has been crunching data in an attempt to understand how the virus behaves, analyzing 2.5 billion genetic combinations to ascertain how COVID attacks the human body.

Responding at the right time in the right way

COVID-19 has been perhaps the toughest test imaginable for healthcare institutions worldwide. With resources in short supply, difficult decisions have had to be made each day. For instance, what critical assets are needed in each location? And where and when will hospital beds be required as the virus moves through populations?

These questions can’t be answered by leafing through spreadsheets – there’s simply too much data, too many variables, and a picture that changes each day. However, using advanced analytics, healthcare officials have been able to make these key decisions based on vital, actionable and timely insights.

For example, epidemiological models have been useful in forecasting infection spread throughout regions, helping healthcare workers to predict the potential numbers of infected people that will require medical treatment – and what that level of treatment will look like.
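One of the simplest epidemiological models is the SIR model, which splits a population into susceptible, infected and recovered groups and moves people between them day by day. The sketch below is a minimal version in Python. Every number in it (population, infection rate, recovery rate, and the share of infected people needing a hospital bed) is an assumption chosen for illustration, not an estimate for COVID-19, and the models used by health authorities are considerably more detailed.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Run a basic daily-step SIR model.

    beta  -- assumed infections caused per infected person per day
    gamma -- assumed fraction of infected people who recover each day
    Returns a list of (susceptible, infected, recovered) tuples, one per day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Illustrative parameters only.
history = simulate_sir(
    population=1_000_000, initial_infected=10, beta=0.3, gamma=0.1, days=180
)

# Find the day with the most people infected at once.
peak_day, (_, peak_infected, _) = max(enumerate(history), key=lambda item: item[1][1])

hospital_share = 0.05  # assumed share of infected people needing a bed
print(f"Peak on day {peak_day}: about {peak_infected:,.0f} infected")
print(f"Beds needed at peak: about {peak_infected * hospital_share:,.0f}")
```

Changing beta is a crude way to explore scenarios: a lower value, standing in for measures such as social distancing, pushes the peak later and makes it smaller, which is the kind of comparison planners use when estimating demand for treatment.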

Predictive simulation and scenario modeling have also been used to help forecast the required number of healthcare workers based on given scenarios, along with the strain outbreaks may place on healthcare services. This data has then fed directly into national lockdown plans.

One example of this in action can be seen at the Sheba Medical Center in Israel, where data-driven forecasting is used to optimize the allocation of resources before outbreaks even strike. The center has used machine learning to crunch data related to confirmed cases, deaths, test results, contact tracing and the availability of medical resources, to ensure it’s prepared for what lies around the corner.

The center also led a national competition to develop the best technology for predicting the deterioration rate of COVID patients.

A step change for virology

The scale and speed of COVID-19’s spread have been unparalleled. But the scale of our response to it has been equally impressive. Using the latest analytics techniques, healthcare workers have been able to prepare for unpredictable scenarios, governments have gained insights into the best actions for keeping people safe, and businesses have been able to take measured approaches to adapt to the world around them.

In the midst of this pandemic, it’s hard to find many, if any, positives, but the lessons learned during COVID-19 will have a huge effect on the way we tackle similar events in the future.

Whether it’s developing vaccines, ensuring the appropriate resources are in the right place at the right time, or fast-tracking our understanding of the situation to keep as many people safe as possible, analytics is able to provide the answers to the most complex questions these situations present. And it’s been doing so since the 1800s.


Nitin Aggarwal is the VP & Business Head of Analytics at The Smart Cube, a global provider of analytics and procurement intelligence solutions.