A Machine Learning-Based Framework for Structured Traffic Risk Prediction Using Spatiotemporal Data
Keywords:
Data mining, Traffic risk prediction, Machine learning
Abstract
Timely and effective rescue after a traffic accident is critical to protecting citizens' lives and property. Accurately predicting traffic accident risk, and dispatching on-site investigation and rescue as soon as a potential hazard is identified, would significantly improve response efficiency and public safety. Existing traffic risk prediction models have two principal limitations. First, their accuracy is often compromised because analysts consider too few of the factors that induce traffic accidents and are constrained by the limits of their own experience and domain knowledge. Second, most models assess risk solely by the probability that an accident occurs and neglect the effect of accident severity on the overall risk level. To address these limitations, this study proposes a structured real-time traffic risk prediction model that decomposes overall traffic risk into two hierarchical components: accident occurrence probability and accident severity. This two-part structure clarifies how different factors contribute to overall risk and adapts more readily to the dynamic nature of transportation environments. The methodological framework proceeds in four steps. First, the target data are preprocessed. Second, data mining techniques, including autocorrelation analysis and Pearson correlation analysis, are used to examine the spatiotemporal factors that influence traffic accidents. Third, features are selected based on the results of this analysis. Finally, the traffic risk prediction model is built by combining the PU-Learning (positive-unlabeled learning) algorithm with a random forest model.
Experimental analysis demonstrates that the proposed model achieves superior predictive performance compared to benchmark models, confirming its effectiveness in addressing the challenges inherent in dynamic transportation environments.
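To make the analysis steps named in the abstract concrete, the following is a minimal sketch of Pearson-based feature screening, lag autocorrelation, and a two-component risk score. It is an illustration under stated assumptions, not the model evaluated in this study: the function names, the 0.3 correlation threshold, the severity weights, and the product form (occurrence probability times expected severity) are all hypothetical choices. The PU-Learning and random forest stage is not shown here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant series: correlation is undefined, treat as 0
    return cov / (sx * sy)


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a positive lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    if var == 0 or lag >= n:
        return 0.0
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var


def select_features(features, target, threshold=0.3):
    """Keep feature names whose |Pearson r| with the target meets the
    threshold. The threshold value is an assumption for illustration."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) >= threshold]


def structured_risk(p_accident, severity_probs, severity_weights):
    """Combine the two components as probability x expected severity.
    The product form and the weights are assumptions, not the paper's rule."""
    expected_severity = sum(p * w
                            for p, w in zip(severity_probs, severity_weights))
    return p_accident * expected_severity


# Hypothetical toy data: accident counts and two candidate features.
accident_counts = [0, 2, 4, 6]
candidate_features = {
    "rainfall": [0, 1, 2, 3],    # moves with the accident counts
    "noise": [1, -1, -1, 1],     # uncorrelated with the accident counts
}
selected = select_features(candidate_features, accident_counts)
risk = structured_risk(0.2, [0.5, 0.3, 0.2], [1, 2, 3])
```

In this sketch a feature is kept only when its linear association with the accident series is strong enough, and the autocorrelation function can be applied to the accident series itself to check for periodic structure such as daily or weekly cycles before time-based features are built.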

