Construction and analysis of the multimodal physiological database of depression emotion expression

Nan LI, Yao LI, Yongjie ZHOU, Rongfeng SU, Nan YAN, Lan WANG

Journal of Tsinghua University (Science and Technology), 2025, Vol. 65, Issue 7: 1320-1327.

DOI: 10.16511/j.cnki.qhdxxb.2025.26.031
Man-machine Speech Communication

Abstract

Objective: Adolescent depression has become an urgent global health issue, with a considerable increase in mental health problems among young people. Traditional diagnostic methods for depression rely largely on self-reported symptoms and subjective evaluation, which can lead to underdiagnosis or misdiagnosis, especially in teenagers who may conceal their symptoms because of stigma. This study aims to fill this gap by developing a multimodal physiological signal database designed specifically for adolescent depression. The database incorporates several physiological signals, including speech and heart rate data, to make depression diagnosis more objective. The goal is to provide a tool that improves diagnostic accuracy and offers insight into the autonomic nervous system (ANS) dysfunction associated with depression, thus paving the way for effective therapeutic interventions.

Methods: This study recruited 86 adolescents aged 12 to 20 years, all native Mandarin speakers. Data collection covered multiple physiological modalities: speech audio recordings, electrocardiogram (ECG) signals, and blood pressure readings. The participants completed emotion elicitation tasks based on cognitive psychology principles. These tasks were designed to trigger a range of emotional responses, from neutral to positive and negative states, and allowed real-time collection of vocal and physiological data under varying emotional conditions. ECG signals were analyzed to assess heart rate variability (HRV), a key marker of ANS function. Statistical methods and machine learning algorithms were used to analyze the relationship between vocal characteristics (pitch and speech energy) and physiological markers (HRV) and to uncover patterns that differentiate depressed individuals from healthy controls.
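As an illustration of the HRV step described in the Methods, the following is a minimal sketch of two common time-domain HRV measures, SDNN and RMSSD, computed from R-peak timestamps. The abstract does not state which HRV indices or software the authors used, so the choice of indices, the function name `hrv_time_domain`, and the assumption that R peaks have already been detected are all hypothetical.

```python
import math
import statistics


def hrv_time_domain(r_peak_times_s):
    """Return SDNN, RMSSD (both in ms) and mean heart rate (bpm).

    r_peak_times_s: R-peak timestamps in seconds, in increasing order.
    Assumption: peaks were already detected and artifact-corrected.
    """
    if len(r_peak_times_s) < 3:
        raise ValueError("need at least three R peaks")

    # RR intervals (time between successive beats) in milliseconds.
    rr_ms = [
        (later - earlier) * 1000.0
        for earlier, later in zip(r_peak_times_s, r_peak_times_s[1:])
    ]

    # SDNN: sample standard deviation of the RR intervals.
    sdnn = statistics.stdev(rr_ms)

    # RMSSD: root mean square of successive RR differences.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # Mean heart rate derived from the mean RR interval.
    mean_hr = 60000.0 / statistics.fmean(rr_ms)

    return {"sdnn_ms": sdnn, "rmssd_ms": rmssd, "mean_hr_bpm": mean_hr}
```

Lower SDNN and RMSSD values correspond to reduced HRV. In practice a dedicated toolbox would normally handle peak detection and artifact correction before this step.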
Results: The analysis revealed significant differences between the depressed and control groups in speech patterns and physiological responses. Adolescents with depression showed reduced pitch variability and lower speech energy, both indicative of emotional blunting, a common symptom of depression. These vocal changes were strongly correlated with reduced HRV, which signals impaired ANS function. Integrating the two data types (speech and physiological signals) not only confirmed the presence of ANS dysregulation in depressed adolescents but also provided a new framework for identifying vocal biomarkers as indicators of depression severity. In addition, the combination of physiological and vocal features yielded better discriminatory power than either modality alone, improving the overall precision of depression diagnosis.

Conclusions: The multimodal physiological database presented here is an important step toward the objective diagnosis of adolescent depression. By combining speech analysis with physiological markers such as HRV, this study offers a comprehensive tool for diagnosing depression with increased accuracy. The database provides a valuable resource for clinicians and researchers and opens new avenues for personalized treatment based on objective physiological data. This work also highlights the critical role of the ANS in the pathology of depression and underscores the importance of integrating multimodal data in future psychiatric diagnostics. The database has the potential to change how adolescent depression is diagnosed and treated by providing a more nuanced understanding of the neurophysiological mechanisms underlying the disorder.
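The vocal measures named in the Results (pitch variability and speech energy) and their correlation with an HRV index could be computed along the following lines. This is a sketch under stated assumptions, not the authors' pipeline: the abstract does not give feature definitions, so pitch variability is taken here as the standard deviation of the F0 contour in semitones, speech energy as the RMS amplitude, and the association as Pearson's r. All function names are hypothetical.

```python
import math
import statistics


def pitch_variability_semitones(f0_hz):
    """SD of an F0 contour in semitones relative to its median.

    Assumption: unvoiced frames are encoded as 0 and are skipped.
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    ref = statistics.median(voiced)
    semitones = [12.0 * math.log2(f / ref) for f in voiced]
    return statistics.stdev(semitones)


def rms_energy(samples):
    """Root-mean-square amplitude of a sequence of audio samples."""
    if not samples:
        raise ValueError("empty signal")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / (norm_x * norm_y)
```

With one pitch-variability value and one HRV value (for example RMSSD) per participant, `pearson_r(pitch_sd_values, rmssd_values)` gives the across-participant association. A positive coefficient would be consistent with the reported pattern of flatter pitch accompanying lower HRV.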

Key words

depression / multimodal physiological signal database / autonomic nervous system

Cite this article

Nan LI, Yao LI, Yongjie ZHOU, et al. Construction and analysis of the multimodal physiological database of depression emotion expression[J]. Journal of Tsinghua University (Science and Technology), 2025, 65(7): 1320-1327. https://doi.org/10.16511/j.cnki.qhdxxb.2025.26.031

RIGHTS & PERMISSIONS

All rights reserved. Unauthorized reproduction is prohibited.