A team of Stanford Medicine researchers has released SleepFM Clinical, a multimodal sleep foundation model that learns from clinical polysomnography and predicts long-term disease risk from a single night's sleep. The findings were published in Nature Medicine, and the team has released the clinical code as the open-source sleepfm-clinical repository on GitHub under the MIT license.
From overnight polysomnography to general representations
Polysomnography records brain activity, eye movements, heart signals, muscle tone, breathing effort, and oxygen saturation throughout the night in a sleep laboratory. Although it is the gold-standard test in sleep medicine, most clinical workflows use it only for sleep staging and sleep apnea diagnosis. The research team treats these multichannel signals as dense physiological time series and trains a foundation model to learn representations that are shared across all modalities.
SleepFM is trained on roughly 585,000 hours of sleep recordings from about 65,000 people drawn from multiple cohorts. The largest cohort comes from the Stanford Sleep Medicine Center, where about 35,000 adults and children underwent overnight studies between 1999 and 2024. The clinical cohorts are linked to electronic health records, which later enable survival analysis for hundreds of disease categories.

Model architecture and pretraining objective
At the modeling level, SleepFM uses a convolutional backbone to extract local features from each channel, followed by attention-based aggregation across channels and a temporal transformer that operates over short segments of the night. The same core architecture already appeared in earlier work on SleepFM for sleep staging and sleep-disordered breathing detection, where learning joint embeddings across brain activity, electrocardiogram, and respiratory signals was shown to improve downstream performance.
The pretraining objective is leave-one-out contrastive learning. For each short time segment, the model builds separate embeddings for each modality group, such as brain signals, cardiac signals, and respiratory signals, and learns to align these modality embeddings so that any subset can predict the joint representation of the remaining modalities. This approach makes the model robust to missing channels and to the heterogeneous recording montages that are common in real-world sleep studies.
After pretraining on unlabeled polysomnography, the backbone is frozen and small task-specific heads are trained on top of it. For standard sleep tasks, a lightweight recurrent or linear head maps embeddings to sleep stages or apnea labels. For clinical risk prediction, the model aggregates the full night into a single patient-level embedding, concatenates basic demographics such as age and sex, and feeds this representation into a Cox proportional hazards layer for time-to-event modeling.
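To make the channel-aggregation step concrete, here is a minimal sketch of attention pooling over per-channel feature vectors. It is not taken from the sleepfm-clinical code: the function name `attention_pool`, the single learned `query` vector, and the scaled dot-product scoring are illustrative assumptions. The point it shows is that the pooled output has the same size no matter how many channels a recording contains.

```python
import math

def attention_pool(channel_features, query):
    """Pool a variable number of per-channel feature vectors into one vector.

    channel_features: list of equal-length feature vectors, one per channel.
    query: hypothetical learned vector used to score each channel (assumption).
    Returns (pooled_vector, attention_weights).
    """
    scale = math.sqrt(len(query))
    # Scaled dot-product score for each channel.
    scores = [sum(q * x for q, x in zip(query, f)) / scale for f in channel_features]
    # Numerically stable softmax over channels.
    top = max(scores)
    exp_scores = [math.exp(s - top) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    # Weighted sum of channel features.
    dim = len(channel_features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, channel_features)) for d in range(dim)]
    return pooled, weights

# Two recordings with different channel counts pool to vectors of the same size.
query = [0.3, -0.1]
pooled_two, _ = attention_pool([[1.0, 2.0], [0.5, 0.0]], query)
pooled_three, _ = attention_pool([[1.0, 2.0], [0.5, 0.0], [2.0, 1.0]], query)
```

Because the softmax runs over whatever channels are present, a montage with a missing lead simply contributes fewer terms to the weighted sum.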
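The leave-one-out idea can be sketched as an InfoNCE-style loss in which each modality's embedding for a segment is pulled toward the mean embedding of the other modalities for the same segment, with other segments in the batch acting as negatives. This is a simplified illustration, not the authors' implementation: the cosine similarity, the `temperature` value, and the use of a plain mean over the remaining modalities are assumptions.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def mean_vectors(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def leave_one_out_loss(embeddings, temperature=0.1):
    """embeddings: dict mapping modality name -> list of segment vectors.

    All modalities list the same segments in the same order. For each held-out
    modality, segment i should match the mean of the other modalities at
    segment i (positive) better than at any other segment j (negatives).
    """
    modalities = list(embeddings)
    batch = len(embeddings[modalities[0]])
    total, count = 0.0, 0
    for held_out in modalities:
        others = [m for m in modalities if m != held_out]
        anchors = [l2_normalize(v) for v in embeddings[held_out]]
        targets = [
            l2_normalize(mean_vectors([embeddings[m][i] for m in others]))
            for i in range(batch)
        ]
        for i in range(batch):
            logits = [dot(anchors[i], targets[j]) / temperature for j in range(batch)]
            top = max(logits)
            log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
            total += log_denominator - logits[i]  # cross-entropy with target i
            count += 1
    return total / count

# Toy batch of two segments: aligned modalities give a much lower loss
# than a batch in which the brain embeddings are swapped between segments.
aligned = {
    "brain": [[1.0, 0.0], [0.0, 1.0]],
    "cardiac": [[1.0, 0.0], [0.0, 1.0]],
    "respiratory": [[1.0, 0.0], [0.0, 1.0]],
}
misaligned = dict(aligned, brain=[[0.0, 1.0], [1.0, 0.0]])
loss_aligned = leave_one_out_loss(aligned)
loss_misaligned = leave_one_out_loss(misaligned)
```

Since the target is an average over whichever modalities remain, the same loss can be computed when a recording lacks one modality group, which is one way to read the robustness claim above.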
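The risk-prediction head described above can be sketched in a few lines. The article does not say how the night is aggregated, so mean pooling over segment embeddings is an assumption here, as are the function names, the age scaling, and the 0/1 sex encoding. The loss is the standard Cox negative log partial likelihood with naive handling of tied event times.

```python
import math

def patient_features(segment_embeddings, age_years, is_male):
    """Mean-pool segment embeddings (assumption) and append basic demographics."""
    n = len(segment_embeddings)
    pooled = [sum(col) / n for col in zip(*segment_embeddings)]
    return pooled + [age_years / 100.0, 1.0 if is_male else 0.0]

def cox_log_risk(features, weights):
    """Linear Cox layer: the log relative hazard is a weighted sum of features."""
    return sum(w * x for w, x in zip(weights, features))

def cox_neg_log_partial_likelihood(log_risks, times, events):
    """Average negative log partial likelihood over observed events.

    times: follow-up time per patient; events: True if the diagnosis was
    observed, False if the patient was censored.
    """
    loss, n_events = 0.0, 0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue
        # Risk set: everyone still under observation at time t_i.
        at_risk = [log_risks[j] for j, t_j in enumerate(times) if t_j >= t_i]
        top = max(at_risk)
        log_sum = top + math.log(sum(math.exp(r - top) for r in at_risk))
        loss += log_sum - log_risks[i]
        n_events += 1
    return loss / max(n_events, 1)

features = patient_features([[1.0, 3.0], [3.0, 5.0]], age_years=50, is_male=True)
times, events = [1.0, 2.0, 3.0], [True, True, False]
# Scores that rank early events as high risk fit better than the reverse ranking.
good_fit = cox_neg_log_partial_likelihood([2.0, 1.0, 0.0], times, events)
poor_fit = cox_neg_log_partial_likelihood([0.0, 1.0, 2.0], times, events)
```

Only the small weight vector of the Cox layer is trained in this setup; the embeddings come from the frozen backbone.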
Benchmarks for sleep staging and apnea
Before moving to disease prediction, the research team verified that SleepFM is competitive with specialist models on standard sleep analysis tasks. Earlier work had already shown that a simple classifier on SleepFM embeddings outperforms end-to-end convolutional networks at classifying sleep stages and detecting sleep-disordered breathing, with improved macro AUROC and AUPRC on several public datasets.
In this clinical study, the same pretrained backbone is reused for sleep staging and apnea severity classification across a multicenter cohort. The results reported in the paper show that SleepFM matches or outperforms existing tools such as traditional convolutional models and other automated sleep staging systems, which supports the view that the representation captures core sleep physiology and not just statistical artifacts of a single dataset.
Predicting 130 diseases and mortality from a single night's sleep
The central contribution of this Stanford paper is disease prediction. The research team mapped diagnosis codes from the Stanford electronic health records to phecodes, defining more than 1,000 candidate disease groups. For each phecode, the team computes the time to first diagnosis after the sleep study and fits a Cox model on top of the SleepFM embedding.
SleepFM identifies 130 disease outcomes whose risk can be predicted from a single night of polysomnography with strong discriminative power. These include death from any cause, dementia, myocardial infarction, heart failure, chronic kidney disease, stroke, atrial fibrillation, some cancers, and multiple psychiatric and metabolic disorders. For many of these conditions, performance metrics such as the concordance index and the area under the receiver operating characteristic curve fall in ranges comparable to established risk scores, even though the model uses only sleep data and basic demographics.
The report also notes that for some cancers, pregnancy complications, cardiovascular diseases, and mental health disorders, SleepFM-based predictions reach accuracy levels of roughly 80% over multi-year risk windows. This suggests that subtle patterns in the coordination between brain, heart, and breathing signals carry information about underlying disease processes that are not yet clinically visible.
Comparison with simpler baselines
To assess the added value, the research team compared the SleepFM-based risk model with two baselines. The first uses only demographic features such as age, sex, and BMI. The second is an end-to-end model trained directly on polysomnography recordings and outcomes, without any unsupervised pretraining. Across most disease categories, pretrained SleepFM representations combined with a simple survival head yield higher concordance and higher long-horizon AUROC than both baselines.
The comparison indicates that the gain comes not from a complex prediction head but from a foundation model that has learned a general representation of sleep physiology. In practice, this means clinical centers can reuse a single pretrained backbone and train small site-specific heads on relatively modest labeled cohorts while still approaching state-of-the-art performance.
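The time-to-first-diagnosis label for one patient and one phecode can be sketched as follows. The function name, the use of a last follow-up date for censoring, and the choice to exclude patients who were already diagnosed at the time of the sleep study are assumptions for illustration; the article does not spell out these details.

```python
from datetime import date

def time_to_first_diagnosis(study_date, diagnosis_dates, last_followup):
    """Return (duration_days, event_observed) for one patient and one phecode.

    Returns None when a diagnosis predates or coincides with the sleep study
    (assumption: such prevalent cases are excluded from the risk analysis).
    """
    if any(d <= study_date for d in diagnosis_dates):
        return None
    in_window = [d for d in diagnosis_dates if d <= last_followup]
    if in_window:
        # Event: count days from the sleep study to the first diagnosis.
        return (min(in_window) - study_date).days, True
    # Censored: no diagnosis recorded before follow-up ended.
    return (last_followup - study_date).days, False

study = date(2010, 1, 1)
followup_end = date(2015, 1, 1)
incident = time_to_first_diagnosis(study, [date(2012, 1, 1), date(2011, 1, 1)], followup_end)
censored = time_to_first_diagnosis(study, [], followup_end)
prevalent = time_to_first_diagnosis(study, [date(2009, 6, 1)], followup_end)
```

The resulting (duration, event) pairs are the inputs a Cox model needs, with censored patients contributing only to the risk sets.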
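For readers unfamiliar with the concordance index, the sketch below computes Harrell's C for a small set of patients. It is a plain reference implementation for illustration, not the evaluation code used in the paper. A pair of patients is comparable when the one with the shorter follow-up time had an observed event, and the pair is concordant when that patient also received the higher risk score.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    Ties in risk score count as half. Returns NaN if no pair is comparable.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")

times = [1, 2, 3, 4]
events = [True, True, False, True]
perfect = concordance_index(times, events, [4, 3, 2, 1])
reversed_ranking = concordance_index(times, events, [1, 2, 3, 4])
uninformative = concordance_index(times, events, [1, 1, 1, 1])
```

A value of 0.5 corresponds to a ranking no better than chance and 1.0 to a perfect ranking, which is the scale on which the model and the baselines are compared.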
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of Marktechpost, an artificial intelligence media platform that stands out for in-depth coverage of machine learning and deep learning news that is both technically sound and easily understood by a wide audience. The platform has over 2 million monthly views, reflecting its popularity among readers.


