Objective: High-arch dams require reliable long-term monitoring to ensure safety in complex operating environments and under extreme loads. Vibration-based operational modal analysis under ambient excitation is well-suited for continuous deployment due to its passive and minimally intrusive nature. However, the vibration response under these conditions is often weak and susceptible to noise and non-stationary excitation. When covariance-driven stochastic subspace identification (SSI-COV) is applied, the stabilization diagram frequently becomes cluttered with a mixture of physical poles and spurious poles, complicating manual pole selection and diminishing the effectiveness of automated pole clustering, particularly for densely spaced modes and weakly excited higher-order modes. This study aims to enhance automated modal identification for high-arch dams by refining the clustering distance metric to better separate physical poles from spurious ones. Methods: A classical workflow that combines SSI-COV with density-based spatial clustering of applications with noise (DBSCAN) is adopted and enhanced by redesigning the clustering distance metric. Conventional metrics typically use a weighted summation of frequency difference and mode-shape similarity, which may not fully capture the relationship between the two features and may falter when identifying densely spaced modes. In this study, a coupled distance formulation is introduced that directly integrates the modal assurance criterion (MAC) with the absolute frequency deviation by placing the MAC in the denominator. When the mode-shape correlation between two poles is weak and MAC approaches 0, the distance increases significantly. By contrast, when the correlation is strong and MAC approaches 1, the distance reduces to the absolute frequency deviation. 
Consequently, pole pairs that simultaneously exhibit small frequency differences and highly consistent mode shapes are assigned minimal clustering distances, whereas those with large frequency differences or inconsistent mode shapes are pushed apart. This leads to a clearer separation of physical and spurious modes in the stabilization diagram, thus meeting the requirements for automated clustering-based interpretation. A statistical analysis of clustering distances is then performed using the stabilization diagram from a high-arch dam dataset. Finally, the method is validated through two case studies. The first involves a five-degree-of-freedom numerical system excited by broadband white noise with added measurement noise; the responses are segmented into consecutive windows to test both single-window identification and continuous modal tracking. The second case utilizes multisensor field vibration data from an actual high-arch dam, including a representative short-duration record and a multiday dataset for continuous monitoring. For each case, clustering distances are computed with the proposed formulation, the poles are clustered with DBSCAN, and modal frequencies and damping ratios are extracted to evaluate clustering accuracy and the performance of automated identification. Results: The distance-based statistical analysis reveals that the proposed metric enhances separability. Pole pairs that meet both feature-consistency conditions are concentrated within a compact distance interval, whereas partially consistent or inconsistent pairs shift toward larger distances. In the numerical example, the proposed method produces physical clusters that are less prone to absorbing noise points than those obtained with the baseline metric, leading to an approximately 31% increase in identified modal poles for weakly excited higher-order modes. 
In the real dam case, the baseline metric generates an excessive number of closely packed clusters, making it difficult to form effective clusters with clear and interpretable boundaries. By contrast, the proposed method clearly identifies three clusters for the high-arch dam and achieves a 34% increase in recognized poles for the relatively higher-order mode during continuous identification. Across both cases, the improvement is most pronounced for relatively higher-order modes, for which the number of identified modal poles increases by approximately one-third relative to the baseline approach. Conclusions: By integrating frequency and MAC in a division-based formulation, the proposed metric enhances the compactness of the identified clusters and enables stable distinction between physical and spurious poles, while also improving the identification of weakly excited higher-order vibration modes. This directly enhances the robustness of DBSCAN-based automated modal identification and continuous modal tracking for high-arch dams under ambient excitation. The method can be easily incorporated into existing SSI-COV workflows, as it mainly updates the distance-computation step, providing a practical solution for reliable long-term vibration-based dam monitoring.
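
To illustrate the distance-computation step described in the Methods, the following Python sketch builds a pairwise distance matrix and passes it to DBSCAN. It is a minimal sketch under stated assumptions, not the authors' implementation: the exact form of the coupled distance is assumed here to be |f_i - f_j| / MAC_ij, which is consistent with the limiting behaviour described above (large distance as MAC approaches 0, absolute frequency deviation as MAC approaches 1). The function names, the `mac_floor` guard, the toy poles, and the `eps` and `min_samples` values are all hypothetical choices for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def mac(phi_a, phi_b):
    """Modal assurance criterion between two mode-shape vectors (real or complex)."""
    num = abs(np.vdot(phi_a, phi_b)) ** 2
    den = np.vdot(phi_a, phi_a).real * np.vdot(phi_b, phi_b).real
    return float(num / den)


def coupled_distance_matrix(freqs, shapes, mac_floor=1e-12):
    """Pairwise distance |f_i - f_j| / MAC_ij (assumed form of the coupled metric).

    mac_floor is a hypothetical guard that avoids division by zero
    when two mode shapes are exactly orthogonal.
    """
    freqs = np.asarray(freqs, dtype=float)
    n = freqs.size
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            m = max(mac(shapes[i], shapes[j]), mac_floor)
            dist[i, j] = dist[j, i] = abs(freqs[i] - freqs[j]) / m
    return dist


# Toy poles (hypothetical): two closely spaced modes with orthogonal shapes,
# plus one pole at 1.03 Hz whose shape matches neither mode.
freqs = [1.00, 1.01, 1.02, 1.03, 1.05, 1.06, 1.07]
shapes = np.array(
    [[1, 0, 0]] * 3 + [[0, 0, 1]] + [[0, 1, 0]] * 3,
    dtype=float,
)

dist = coupled_distance_matrix(freqs, shapes)

# DBSCAN accepts a user-supplied distance matrix via metric="precomputed".
labels = DBSCAN(eps=0.03, min_samples=2, metric="precomputed").fit(dist).labels_
print(labels)
```

In this toy setting the two groups of poles fall into separate clusters even though their frequencies are close, and the pole with the inconsistent mode shape receives the DBSCAN noise label (-1). Because only the distance matrix changes, a sketch like this could replace the distance step of an existing pipeline without touching the SSI-COV or clustering stages.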