A guide to health data analytics and BI

Analytics and BI are essential for business. But what if you handle sensitive health data? Here, we explain what you need for health data BI and analytics.

Analytics and business intelligence (BI) are both essential if you want to grow your business in today’s competitive market. They give you deep insights into the wants and needs of your customers. Healthcare, in particular, could be transformed by good BI. But almost every jurisdiction in the world has strict rules governing the use of health data. Fortunately, it is still possible to perform BI on such data while remaining compliant. In this post, we look at the steps you need to take to build compliant apps for health data analytics and BI.

The legal constraints on handling health data

Health data is one of the most tightly regulated forms of personal data in the world. Within the EU, personal data is protected by the General Data Protection Regulation (GDPR). Under GDPR, health data counts as a special category, receiving greater protection than other personal data. Some EU states go further. For instance, in France health data must also be stored in compliance with HDS. Within the USA, HIPAA (the Health Insurance Portability and Accountability Act) regulates how healthcare providers, insurers and their business associates use and disclose health data. Other jurisdictions have specific cybersecurity rules, adding strict technical requirements on how data is collected and stored.

The 4 steps for health data BI

So, what are the technical and administrative steps you need to take if you want to use health data for analytics or BI?

1. Collecting the data

The specific rules for collecting health data vary depending on the jurisdiction you operate in. For instance, in the EU, personal data can only be processed if you have a legal basis for doing so. For many apps, this basis is consent. But health data is a special category, so you also need to meet one of the conditions set out in Article 9 of the GDPR. This might be explicit consent, but more often it is public interest in the area of public health or scientific research purposes. The important thing is that, whatever basis you use to collect the data, you will probably need to get suitable consent if you want to use this data for marketing or BI.

2. Storing the data

Every jurisdiction has slightly different rules for storing health data. In the USA, stored data must be protected in line with the HIPAA Security Rule, which was strengthened by the Health Information Technology for Economic and Clinical Health Act (HITECH). Within the EU, the GDPR expects appropriate security measures and explicitly names encryption and pseudonymisation among them. Some EU states impose even stricter rules, like the French requirement to comply with the HDS (Hébergeurs de Données de Santé). In general, the data will need to be properly encrypted and pseudonymised before you store it.

Health data needs stronger protection than the default encryption offered by clouds like Azure or AWS. The idea is to raise the bar for any hacker who may try to access the data. So, you should use application-level encryption, where every user’s data is encrypted with a different key. You also need to separate the personal data from the sensitive data and store these separately, linking them with a pseudonym. This process of pseudonymisation can be hard with health data because personal data is often embedded in it. For instance, X-ray images may have the patient’s details embedded within them.
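To make the idea of pseudonymisation more concrete, here is a minimal Python sketch. It splits one record into an identity part and a health part and links the two with a random pseudonym. The field names and the two in-memory stores are assumptions made for illustration only. A real system would use separate, encrypted databases, and the encryption itself is out of scope here.

```python
import secrets

# Hypothetical field names; a real schema will differ.
IDENTITY_FIELDS = {"name", "email", "date_of_birth"}


def pseudonymise(record, identity_store, health_store):
    """Split one record into identity and health parts linked by a pseudonym."""
    # The pseudonym is random, so it reveals nothing about the patient.
    pseudonym = secrets.token_hex(16)
    identity_store[pseudonym] = {
        k: v for k, v in record.items() if k in IDENTITY_FIELDS
    }
    health_store[pseudonym] = {
        k: v for k, v in record.items() if k not in IDENTITY_FIELDS
    }
    return pseudonym


# Stand-ins for two separate databases.
identity_store = {}
health_store = {}

record = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "date_of_birth": "1980-04-02",
    "diagnosis": "hypertension",
    "systolic_bp": 150,
}
pid = pseudonymise(record, identity_store, health_store)
```

Anyone who only has access to the health store sees a diagnosis and a blood pressure reading against a random token. Re-identifying the patient requires access to the identity store as well.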

3. Using the data

Once you have stored the data, you will be able to use it for your main business case. This is known as primary use of data. For many use cases, you will need access to the raw data. Here are three common digital health use cases:

  • Health data analytics. Applying big data analytics and AI to health data can be transformative. We are already seeing advances in oncology, patient outcomes and epidemiology. However, you need to use the raw health data in order to run your machine learning models.
  • Telemedicine. One of the key advances in digital health has been remote monitoring and diagnosis. For instance, patients can wear portable ECGs that are monitored in real time to identify cardiac events. This requires patients and doctors to exchange sensitive data. There's also a need to securely record the results of any consultations.
  • Medical IoT. The internet of things is transforming many industries. Within healthcare, applications for IoT include patient monitoring, smart devices and wearables. Often, these require health data to be shared with health professionals and there's always a requirement for collecting and storing the data.

4. Anonymising the data

Before you can do health data BI, you need to provide a suitable data feed. What “suitable” means depends on many factors including:

  • The jurisdiction you operate in. For instance, in the USA it may be enough to apply suitable pseudonymisation. But in the EU, unless you have explicit consent, you usually need to use anonymisation.
  • Whether you want to share the data with your internal analysts or a third party. Sharing with a third party may require you to sign Business Associate Agreements (BAAs) or Data Processing Agreements (DPAs).
  • Exactly what data you want to use for analytics. If you don’t need the sensitive health data, you will be able to follow less stringent rules.
  • Whether you need to export the data to a different jurisdiction. In many multinational companies, the data may need to be aggregated at a central location before being analysed.

You will need to speak to lawyers or consultants to understand exactly what you need to do at this stage. You will also need to get technical guidance as to how the data should be made available. There are many choices ranging from classical structured databases through to cloud-style data lakes.

When do I need to anonymise?

In some jurisdictions (notably, the USA), if you have correctly applied pseudonymisation, you may be able to share the data freely. However, in the EU and other jurisdictions, health data BI is only really possible if you have anonymised the data. Anonymisation is the process of stripping out all direct and indirect identifiers from the data so that it can no longer be associated with the original patient. This is a really good example of the data minimisation principle in GDPR.
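As a rough illustration of stripping identifiers, the Python sketch below drops direct identifiers and coarsens two indirect ones (age and postcode). The field names, the ten-year age bands and the two-character postcode prefix are all assumptions chosen for the example. On its own, this kind of processing does not guarantee that the result is legally anonymous.

```python
# Hypothetical field names; adjust to your own schema.
DIRECT_IDENTIFIERS = {"name", "email", "patient_id"}


def generalise_age(age, width=10):
    """Replace an exact age with a band such as '40-49'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def reduce_identifiers(record):
    """Drop direct identifiers and coarsen indirect ones."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in out:
        out["age"] = generalise_age(out["age"])
    if "postcode" in out:
        # Keep only the first two characters of the postcode.
        out["postcode"] = out["postcode"][:2] + "***"
    return out


row = reduce_identifiers({
    "patient_id": "P-001",
    "name": "Jane Doe",
    "age": 47,
    "postcode": "38122",
    "diagnosis": "asthma",
})
```

Here `row` keeps the diagnosis but only an age band and a postcode prefix, which is often enough for aggregate BI questions.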

Anonymisation isn’t a new concept – as far back as the 19th century, the US Census Bureau was trying to anonymise data. However, as more datasets became available and computing power increased, it became harder and harder to anonymise data successfully. Nowadays, the standard approaches to anonymisation include k-anonymity, differential privacy and synthetic data. But all of these have known issues that can reduce the degree of anonymity. The correct anonymisation solution will depend on your specific needs and circumstances. Please contact us to discuss your requirements.
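To give a feel for one of these approaches, here is a small Python sketch that measures k-anonymity. A dataset is k-anonymous if every combination of indirect identifiers is shared by at least k rows. The column names and sample rows are invented for illustration, and a high k does not by itself prove that the data is anonymous.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifiers."""
    groups = Counter(
        tuple(row[q] for q in quasi_identifiers) for row in rows
    )
    return min(groups.values()) if groups else 0


# Invented sample data with already-generalised age and postcode.
rows = [
    {"age": "40-49", "postcode": "38***", "diagnosis": "asthma"},
    {"age": "40-49", "postcode": "38***", "diagnosis": "diabetes"},
    {"age": "50-59", "postcode": "38***", "diagnosis": "asthma"},
]

k = k_anonymity(rows, ["age", "postcode"])
```

In this sample the third row is the only one in its age band, so `k` is 1 and that patient could be singled out. Typical fixes are to generalise further or to suppress such rows.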

Putting it all together

So, now you understand all the components, how do you put them together to create a complete system? Obviously, there are several possible architectures, depending on your specific circumstances. Below we look at two of these that cover the most common scenarios for health data analytics and BI.

Single jurisdiction

If you operate within a single jurisdiction, you just need to follow the steps above. The data is collected and stored in a compliant fashion. The primary use for the data (e.g. your digital health app) accesses the data through a secure API that logs every access. For secondary use (e.g. health data analytics and BI), you create a compliant feed of the data and anonymise it before sharing. The diagram below shows how to do this.

Figure: How to use health data for BI.
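The single-jurisdiction flow can also be sketched in a few lines of Python. The class below is a toy, in-memory stand-in written for this post, not a real product API. Primary use reads the raw record and leaves an audit entry, while secondary use only receives a reduced feed. Authentication and encryption are left out, and the field whitelist is only a placeholder for a proper anonymisation step.

```python
from datetime import datetime, timezone


class HealthDataService:
    """Toy in-memory stand-in for a compliant health data store."""

    def __init__(self, records):
        self._records = records  # pseudonym -> health record
        self.access_log = []     # audit trail of primary-use reads

    def get_record(self, caller, pseudonym):
        """Primary use: return the raw record and log who read it and when."""
        self.access_log.append({
            "caller": caller,
            "pseudonym": pseudonym,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return self._records[pseudonym]

    def bi_feed(self, allowed_fields):
        """Secondary use: yield rows with only the allowed fields, no pseudonym."""
        for record in self._records.values():
            yield {k: v for k, v in record.items() if k in allowed_fields}


service = HealthDataService({
    "a1": {"age_band": "40-49", "diagnosis": "asthma", "notes": "free text"},
})
raw = service.get_record("dr_smith", "a1")
feed = list(service.bi_feed({"age_band", "diagnosis"}))
```

Keeping the two access paths separate means the BI team never needs credentials for the raw data, and every primary-use read can be audited later.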

Multiple jurisdictions

When you operate in multiple jurisdictions, things are a little more complex. Of course, you can treat each jurisdiction as a stand-alone system. But what if you need to collect and analyse the data centrally? Unfortunately, there’s no one-size-fits-all solution. There are so many variables that this has to be looked at on a case-by-case basis. If you want more information, we would be happy to give you advice based on your specific situation.

How Chino.io makes it easy

Implementing either of the above architectures is technically challenging. Just getting the encryption, pseudonymisation and related security requirements right can take you months. Then there’s the need to understand all the specific requirements in each jurisdiction. Here at Chino.io, we specialise in storing health data. We work with companies all over the world, including in the EU, USA, Australia and South Africa. Our experts can also advise you on exactly which architecture will work for you. Our technology solves all the technical issues and can be installed on any server or cloud. To learn more, download our eBook on building compliant health apps for GDPR and HIPAA.