Using traffic data to identify land-use characteristics based on ensemble learning approaches

Jiahui Zhao

Zhibin Li

Pan Liu

DOI: https://doi.org/10.5198/jtlu.2023.2218

Keywords: Land use identification, Ensemble learning, Human activity data


Abstract

Land-use identification, which quantifies the types and intensity of human activities at a regional level, is a critical step in ongoing land-use planning. One limitation of current land-use identification practice is its reliance on theory-driven models built on survey and socioeconomic data, which are often costly and time-consuming to collect. Another limitation is that most identification methods cannot incorporate the effect of daily human activity, so significant spatial heterogeneity is ignored. To address these limitations, a novel land-use identification framework is proposed that quantifies land-use characteristics using traffic-flow and traffic-event data. Two widely used ensemble learning methods, Random Forest and AdaBoost, are introduced to classify land-use type and fit land-use density. The case study collected transit vehicle positions, traffic events, and geo-tagged data at the regional level in the San Francisco Bay Area, California. The results demonstrate that the framework with ensemble learning identifies land-use characteristics accurately in both the type classification and density regression tasks. Compared with other models, the results improved on average by 12.63%, 12.84%, 11.05%, 5.44%, and 12.84% for Area Under the ROC Curve (AUC), Classification Accuracy (CA), F-Measure (F1), Precision, and Recall, respectively, in the classification tasks, and by 56.81%, 21.20%, and 47.29% for Mean Squared Error (MSE), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE), respectively, in the regression tasks. The Random Forest model performed better on labels with high regularity, such as education, residence, and work activities. Beyond accuracy, a correlation analysis of the error term showed that the results were consistent with common-sense understanding of land-use characteristics, demonstrating the interpretability of the proposed framework.
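The evaluation metrics named in the abstract follow standard definitions, and a minimal sketch of them may help when reading the reported percentages. The function names, the toy labels, and the relative-improvement formula below are illustrative assumptions, not the authors' code or data; in particular, the abstract does not state how its improvement percentages were computed. AUC is omitted because it needs predicted scores rather than hard labels.

```python
import math


def classification_metrics(y_true, y_pred, positive=1):
    """CA, Precision, Recall, and F1 for one label treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    ca = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"CA": ca, "Precision": precision, "Recall": recall, "F1": f1}


def regression_metrics(y_true, y_pred):
    """MSE, RMSE, and MAE for a density-style regression target."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae}


def relative_improvement(baseline, model, higher_is_better=True):
    """Percentage change relative to a baseline (assumed formula).

    For error metrics such as MSE, a lower value is better, so the sign flips.
    """
    if higher_is_better:
        return (model - baseline) / baseline * 100
    return (baseline - model) / baseline * 100


# Toy labels and values, made up purely for illustration.
cls = classification_metrics([1, 0, 1, 1, 0, 1, 0, 0], [1, 0, 0, 1, 0, 1, 1, 0])
reg = regression_metrics([2.0, 4.0, 6.0], [1.0, 4.0, 8.0])
print(cls)
print(reg)
print(relative_improvement(4.0, 2.0, higher_is_better=False))
```

Because RMSE is the square root of MSE, the same pair of models generally yields different percentage improvements for the two metrics, which is consistent with the abstract reporting separate figures for MSE and RMSE.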

