Objective: Adolescent depression has become an urgent global health issue, reflecting a considerable increase in mental health problems among young people. Traditional diagnostic methods for depression often rely on self-reported symptoms and subjective evaluations, which can lead to underdiagnosis or misdiagnosis, especially in teenagers who may conceal their symptoms because of stigma. This study aims to fill this gap by developing a multimodal physiological signal database designed specifically for adolescent depression. The database incorporates several physiological signals, including speech and heart rate data, to make depression diagnosis more objective. The goal of this work is to provide a tool that improves diagnostic accuracy and offers insights into the autonomic nervous system (ANS) dysfunction associated with depression, thus paving the way for effective therapeutic interventions. Methods: This study recruited 86 adolescents aged 12 to 20 years who were native Mandarin speakers. Data collection covered multiple modalities: speech audio recordings, electrocardiogram (ECG) signals, and blood pressure readings. Participants completed emotion elicitation tasks based on cognitive psychology principles. These tasks were designed to trigger a range of emotional responses, from neutral to positive and negative states, and allowed vocal and physiological data to be collected in real time under varying emotional conditions. ECG signals were analyzed to assess heart rate variability (HRV), a key marker of ANS function. Statistical methods and machine learning algorithms were used to analyze the relationship between vocal characteristics (pitch and speech energy) and physiological markers (HRV) and to uncover patterns that differentiate depressed individuals from healthy controls.
Results: The analysis revealed significant differences between the depressed and control groups in speech patterns and physiological responses. Adolescents with depression showed reduced pitch variability and lower speech energy, both indicative of emotional blunting, a common symptom of depression. These vocal changes were strongly correlated with reduced HRV, which signals impaired ANS function. Integrating the two data types (speech and physiological signals) not only confirmed the presence of ANS dysregulation in depressed adolescents but also provided a new framework for identifying vocal biomarkers as reliable indicators of depression severity. In addition, multimodal data improved the overall precision of depression diagnosis: the combination of physiological and vocal features yielded better discriminatory power than either modality alone. Conclusions: The multimodal physiological database presented here represents an important step toward the objective diagnosis of adolescent depression. By combining speech analysis with physiological markers such as HRV, this study offers a comprehensive tool for diagnosing depression with increased accuracy. The database not only provides a valuable resource for clinicians and researchers but also opens new avenues for personalized treatment approaches based on objective physiological data. Furthermore, this work highlights the critical role of the ANS in the pathology of depression and underscores the importance of integrating multimodal data in future psychiatric diagnostics. In conclusion, this database has the potential to revolutionize how adolescent depression is diagnosed and treated by providing a nuanced understanding of the neurophysiological mechanisms underlying this disorder.
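
The HRV analysis mentioned in the Methods can be illustrated with a minimal sketch. The abstract does not state which HRV indices were computed, so the two standard time-domain measures used here (SDNN and RMSSD) are an assumption, as are the function name `hrv_time_domain` and the example RR intervals, which are made-up values and not data from the study.

```python
import math


def hrv_time_domain(rr_ms):
    """Return (SDNN, RMSSD) in milliseconds for a sequence of RR intervals in ms.

    Hypothetical helper for illustration; not the study's actual pipeline.
    """
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")

    mean_rr = sum(rr_ms) / len(rr_ms)

    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (len(rr_ms) - 1))

    # RMSSD: root mean square of successive RR-interval differences.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    return sdnn, rmssd


# Illustrative RR intervals in milliseconds (invented for this example).
rr_example = [800, 810, 790, 805, 795]
sdnn, rmssd = hrv_time_domain(rr_example)
```

Lower values of either index correspond to the reduced HRV that the Results associate with impaired ANS function. In practice, RR intervals would first be extracted from the ECG by R-peak detection and cleaned of ectopic beats, steps that this sketch omits.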