E4 Bytes: Systematic Benchmarking of Physiological Sensors

Researchers at the University of Salzburg and Harvard University benchmarked the measurements taken by wearable sensors, such as the Empatica E4, against well-calibrated laboratory equipment.

Wearable sensors are increasingly used in research, as well as for personal and private purposes. A variety of scientific studies are based on physiological measurements from such comparatively low-cost wearables. However, only a few research studies stringently evaluate and compare wearable sensors, including their shortcomings, before carrying out a scientific study. Beyond a formal comparison of technical specifications, it is therefore often unclear how accurate measurements from wearables are compared to measurements from well-calibrated, high-quality laboratory equipment. As a consequence, the quality and comparability of measurements from different devices are often uncertain as well.

In this paper, we demonstrate an approach to quantify the accuracy of low-cost wearables in comparison to high-quality laboratory sensors. To this end, we developed a benchmark framework for physiological sensors that covers the entire workflow, from sensor data acquisition to the computation and interpretation of diverse correlation and similarity metrics. We evaluated this framework in a study with 18 participants. Each participant was equipped with one high-quality laboratory sensor and two wearables, one of which was the Empatica E4. These three sensors simultaneously measured physiological parameters such as ECG, galvanic skin response, and skin temperature, among others, while the study participants cycled on an ergometer following a predefined routine (warm-up phase, stepwise increase of resistance, cool-down phase).
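To give a rough idea of what a correlation metric and a similarity metric look like in such a benchmark, the sketch below compares two time-aligned heart rate series. It is only an illustration under stated assumptions: the choice of Pearson correlation and root-mean-square error is ours and is not confirmed as the set of metrics used in the paper, the function names `pearson` and `rmse` are hypothetical, and the sample values are made up for the example rather than taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Made-up, already time-aligned heart rate samples in beats per minute.
# These are illustrative values only, NOT data from the study.
reference_hr = [72.0, 75.0, 81.0, 90.0, 98.0, 104.0, 97.0, 85.0]
wearable_hr = [71.0, 76.0, 80.0, 92.0, 97.0, 105.0, 95.0, 86.0]

r = pearson(reference_hr, wearable_hr)
err = rmse(reference_hr, wearable_hr)
print(f"Pearson r = {r:.3f}, RMSE = {err:.2f} bpm")
```

The two numbers answer different questions: a high correlation says the wearable follows the same trend as the reference sensor, while a low error says the absolute values are close. In practice the two signals would first have to be resampled to a common rate and synchronized in time, which this sketch assumes has already happened.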

The results of our benchmarking show that cardiovascular parameters (heart rate, inter-beat interval, heart rate variability) exhibit high correlations and similarities. Measurement of galvanic skin response, which is a more delicate undertaking, resulted in lower, but still reasonable correlations and similarities. We conclude that the benchmarked wearables provide physiological measurements such as heart rate and inter-beat interval with an accuracy close to that of the professional high-end sensor, but the accuracy varies more for other parameters, such as galvanic skin response. The Empatica E4 demonstrated extraordinary stability and high quality in the measurements.


[1] Sagl, G., Resch, B., Petutschnig, A., Kyriakou, K., Liedlgruber, M., & Wilhelm, F. H. (2019). Wearables and the Quantified Self: Systematic Benchmarking of Physiological Sensors. Sensors, 19(20), 4448.