
The Patient Generated Health Data Deluge

Imagine a patient comes to you with a year’s worth of blood pressure data on a thumb drive. You don’t know what device they used to gather the data or whether they took the measurements correctly. You also ask yourself what position they were in when they took their blood pressure; you have no context. The only far-fetched part of this scenario is the thumb drive itself; with today’s technology you are more likely to get real-time patient vitals streamed to your dashboard. What are you supposed to do with all of this patient generated health data (PGHD)? If you miss a diagnosis of a possible underlying condition, you could be liable. Are we prepared to handle the coming deluge of patient generated data from our wearable devices and wireless monitors? Are we prepared to shift from getting health data only at our biannual checkups to getting loads of data from our devices 24/7?

Patient generated health data is a double-edged sword. It allows clinicians to get a better picture of their patients’ overall health on a day-to-day basis. With more data points you can see how a patient is doing over time, within different life contexts, and spot trends in their health. More importantly, it engages patients to monitor and be proactive about their own health. However, there are drawbacks to patient generated data, such as its quality, volume, and context.

For clinicians, loads of data can become a cumbersome liability. If a physician happens to miss a diagnosis, what will happen, and will they be liable? How can we expect them to go through these volumes of data on their already busy schedules and constrained budgets? Physicians barely have time to talk with their patients; how can we expect them to analyze their data too?

Only a small percentage of the population actually reports their health data to their clinician. Imagine if even 50% of the population reported their health data on a daily basis. How would that burden physicians, and can our current healthcare system handle the additional strain? Health is an accumulation of your daily habits, and we cannot get a good picture of it from a few widely spaced measurements of vitals. The deluge of data allows clinicians to get a better overall picture of your average health, so we must find a way to handle it all.

What are some sources of PGHD?

  1. Device generated data
  2. Self-reported data
  3. Contextual data

It seems like every day there is a new fitness tracker, which is disappointing because they essentially all measure the same vitals and metrics. Still, we now have access to cheap sensors that can measure anything from our steps, heart rate, and blood pressure to even our posture. They constantly measure and record your vitals, giving you a fine-grained view of your health data. This depth of information has never before been available for clinicians to analyze, and neither has the volume and velocity of data these devices produce. There are drawbacks to using cheap commodity sensors, though. We have all heard of commodity sensors and devices producing inaccurate data, so we must be stringent about the data sources we choose to base our diagnoses on. Data that is fundamentally flawed at its source, such as readings from miscalibrated devices, does more harm than good.
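
As a minimal illustration of what that stringency could look like in practice, here is a short Python sketch that screens incoming readings for obviously implausible values before they ever reach a clinician. The metric names and acceptable ranges are assumptions for illustration, not clinical reference limits.

```python
# A minimal sketch of plausibility checks on incoming device readings.
# The acceptable ranges below are illustrative assumptions, not clinical limits.

PLAUSIBLE_RANGES = {
    "heart_rate_bpm": (30, 220),
    "systolic_mmhg": (70, 250),
    "diastolic_mmhg": (40, 150),
    "respiratory_rate_bpm": (5, 60),
}

def is_plausible(metric: str, value: float) -> bool:
    """Return True if the reading falls inside the assumed plausible range."""
    low, high = PLAUSIBLE_RANGES[metric]
    return low <= value <= high

# Example: flag readings from a possibly miscalibrated cuff.
readings = [("systolic_mmhg", 118), ("systolic_mmhg", 310), ("heart_rate_bpm", 64)]
flagged = [(m, v) for m, v in readings if not is_plausible(m, v)]
print(flagged)  # [('systolic_mmhg', 310)]
```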

What volume and velocity of data might typically be gathered in a given day? For example, if you measure only four common vitals (heart rate, step count, blood pressure, and respiratory rate), you could be talking about 60,000 data points in a single day! Granted, they will be presented in graphical form so you can pinpoint trends in the data. This is valuable longitudinal data, which paints a picture of a patient's overall health rather than just a small snapshot.
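
To see where a number on that order comes from, here is a rough back-of-the-envelope estimate in Python. The sampling intervals are assumptions and real devices vary widely, but one plausible combination lands right around that 60,000 figure.

```python
# A rough estimate of daily data volume from a wearable measuring four vitals.
# The sampling intervals are assumptions; actual devices differ.

SECONDS_PER_DAY = 24 * 60 * 60

SAMPLES_PER_DAY = {
    "heart_rate": SECONDS_PER_DAY // 2,   # assume one reading every 2 seconds
    "respiratory_rate": SECONDS_PER_DAY // 5,  # assume one reading every 5 seconds
    "step_count": 24 * 60,                # assume one reading per minute
    "blood_pressure": 24,                 # assume hourly cuff measurements
}

total = sum(SAMPLES_PER_DAY.values())
print(SAMPLES_PER_DAY)
print(f"~{total:,} data points per patient per day")  # roughly 62,000
```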

Self-reported health data can be strongly biased, which can result in inaccurate data. As patients we are biased and, more often than not, think we are healthier than we actually are. We show it by claiming we didn’t eat that second serving or saying we walked 4 miles instead of just 1. Better is self-reported contextual data: it is valuable to have users provide context to their health data. We can gain a lot of insight if a patient tells us he is preparing for a big speech or playing ball outside with his friends. Contextual data can even allow patients to track how they feel after taking certain medications. Human feeling is something we don’t have an easy way of measuring, for now at least…
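
One lightweight way to capture this is to attach the patient's own context tags and feeling notes directly to each device reading. The sketch below shows a hypothetical record structure; the field names and tags are assumptions, not any existing standard.

```python
# A minimal sketch of pairing a device reading with self-reported context.
# Field names and example values are illustrative assumptions.

from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class ContextualReading:
    timestamp: datetime
    metric: str                                   # e.g. "heart_rate_bpm"
    value: float
    context_tags: List[str] = field(default_factory=list)  # patient-supplied context
    self_reported_feeling: Optional[str] = None             # free-text feeling note

reading = ContextualReading(
    timestamp=datetime(2016, 5, 12, 14, 30),
    metric="heart_rate_bpm",
    value=96,
    context_tags=["preparing for a big speech"],
    self_reported_feeling="anxious",
)
print(reading)
```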

The burden that data places on healthcare professionals

It is crazy to think that doctors are now liable for the tremendous amount of healthcare data generated by their patients. As if doctors did not already have enough to be liable for, health data now gets dumped on them as well. Data integrity and security are also huge liabilities when dealing with PGHD. Consumers have even filed a lawsuit against Fitbit over inaccurate data readings. On top of that, when patients demand that their data be used, physicians need to invest in software and staff to make that possible.

Can doctors and clinicians keep up with the demands of analyzing data? Clinicians shouldn’t have to be trained to interpret large data sets; that is where we, as engineers, can play a big role. We can make it easier to summarize and interpret the data for clinicians and present it in a format they are used to seeing, one they can understand and make meaningful clinical decisions from.
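
As a concrete example of that kind of summarization, the sketch below collapses raw blood pressure readings into a per-day summary with a simple review flag. The column names and the 140 mmHg threshold are assumptions for illustration, not clinical guidance.

```python
# A minimal sketch of condensing raw readings into a clinician-friendly daily summary.
import pandas as pd

# One row per reading: timestamp plus systolic blood pressure.
raw = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2016-05-10 08:00", "2016-05-10 20:00",
        "2016-05-11 08:00", "2016-05-11 20:00",
    ]),
    "systolic_mmhg": [118, 126, 148, 151],
})

# Collapse to one row per day with summary statistics.
daily = (
    raw.set_index("timestamp")["systolic_mmhg"]
       .resample("D")
       .agg(["mean", "min", "max", "count"])
)
daily["flag"] = daily["mean"] > 140   # assumed threshold for "worth a closer look"
print(daily)
```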

Solutions that can help with the data deluge problem

From a doctor’s perspective, they must be able to cut through the noise and focus on just the relevant data that can help them make the right diagnosis and offer the best treatment plan. We cannot expect them to analyze rows and rows of data and pore over never-ending graphs. We have to leverage technology not only to gather all of this data but also to help interpret it for us. With machine learning we can develop algorithms that perform pattern recognition, clustering, and recommendation on volumes of data. This type of technology is already available on the market to help with clinical decision support.
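
As one sketch of how that could work, the example below uses scikit-learn’s IsolationForest to flag the few unusual days in a synthetic stream of daily vitals, so only those days surface for clinician review. The synthetic data and the contamination setting are assumptions; this is not a clinically validated model.

```python
# A minimal sketch of unsupervised anomaly detection over daily vital summaries.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# 90 days of (mean heart rate, mean systolic BP); most days are ordinary.
days = rng.normal(loc=[70, 120], scale=[5, 8], size=(90, 2))
days[[12, 57]] = [[110, 165], [45, 95]]   # two unusual days injected for illustration

model = IsolationForest(contamination=0.05, random_state=0).fit(days)
unusual_days = np.where(model.predict(days) == -1)[0]
print(unusual_days)   # indices of days flagged for clinician review
```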

For example, a researcher has come up with a device that measures your heartbeats and, through machine learning, can detect whether you might have atrial fibrillation. We need smart decision support algorithms to analyze the data for us. As the amount and variety of data increases, we cannot rely on clinicians alone to analyze it all. Recommendation systems and pattern recognition software need to become our eyes and brains, processing the mounds and mounds of data that will be produced by each patient.
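
Without speaking for that researcher’s exact method, the common intuition is that atrial fibrillation tends to produce highly irregular gaps between beats. The sketch below scores a recording by the variability of its inter-beat (RR) intervals; the 0.15 threshold and the sample intervals are assumptions for illustration, not a validated cutoff.

```python
# A minimal sketch of screening heartbeat recordings by inter-beat irregularity.
import numpy as np

def irregularity_score(rr_intervals_s: np.ndarray) -> float:
    """Coefficient of variation of the inter-beat (RR) intervals, in seconds."""
    return float(np.std(rr_intervals_s) / np.mean(rr_intervals_s))

regular = np.array([0.82, 0.80, 0.81, 0.83, 0.80, 0.82])     # steady rhythm
irregular = np.array([0.62, 1.05, 0.71, 0.94, 0.55, 1.10])   # erratic rhythm

for name, rr in [("regular", regular), ("irregular", irregular)]:
    score = irregularity_score(rr)
    print(name, round(score, 3), "flag for review" if score > 0.15 else "ok")
```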

Health data is vital to the next stage of making medicine affordable and detecting ailments before they grow into larger problems. At the end of the day, though, we must not rely solely on technology to tell us how we feel and what our health is. Biology has already developed sophisticated control mechanisms and sensing capabilities to determine whether our health is deteriorating. We must also improve our ability to listen to the sensors already in our bodies, finely tuned by evolution.