SleepVST: Sleep Staging from Near-Infrared Video Signals using Pre-Trained Transformers
Carter JF., Jorge J., Gibson O., Tarassenko L.
Advances in camera-based physiological monitoring have enabled the robust, non-contact measurement of respiration and the cardiac pulse, which are known to be indicative of the sleep stage. This has led to research into camera-based sleep monitoring as a promising alternative to 'gold-standard' polysomnography, which is cumbersome, expensive to administer, and hence unsuitable for longer-term clinical studies. In this paper, we introduce SleepVST, a transformer model which enables state-of-the-art performance in camera-based sleep stage classification (sleep staging). After pretraining on contact sensor data, SleepVST outperforms existing methods for cardio-respiratory sleep staging on the SHHS and MESA datasets, achieving total Cohen's kappa scores of 0.75 and 0.77 respectively. We then show that SleepVST can be successfully transferred to cardio-respiratory waveforms extracted from video, enabling fully contact-free sleep staging. Using a video dataset of 50 nights, we achieve a total accuracy of 78.8% and a Cohen's κ of 0.71 in four-class video-based sleep staging, setting a new state-of-the-art in the domain.
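
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how total accuracy and Cohen's κ can be computed from per-epoch stage labels. It is illustrative only and is not the authors' evaluation code: the function names, the four stage codes (`W`, `L`, `D`, `R`) and the toy label sequences are assumptions made here, not values taken from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of epochs where the predicted stage matches the reference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of the product of marginal frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Both sequences contain a single identical class; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical four-class labels (assumed coding: W = wake, L = light sleep,
# D = deep sleep, R = REM). These are toy values, not data from the paper.
reference = ["W", "W", "L", "L", "L", "D", "R", "R"]
predicted = ["W", "L", "L", "L", "L", "D", "R", "W"]

acc = accuracy(reference, predicted)
kappa = cohens_kappa(reference, predicted)
print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
```

Because κ subtracts the agreement expected from the class marginals alone, it is typically lower than raw accuracy when stage distributions are imbalanced, which is one reason both figures are commonly reported together in sleep staging.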