Analyzing spatiotemporal congestion pattern on urban roads based on taxi GPS data

Kaisheng Zhang, Daniel (Jian) Sun, Suwan Shen, Yi Zhu


With the spread of in-vehicle data collection devices, GPS trajectories have in recent years become a primary data source for identifying traffic congestion and understanding the operational state of road networks. This study investigates the relationship between traffic congestion and the built environment, including traffic-related factors and land use. Fuzzy C-means clustering was first used to examine the 24-hour congestion patterns of urban road segments; a spatial autoregressive moving average (SARMA) model was then applied to the clustering output to relate the built environment to the 24-hour congestion pattern. The clustering classified the road segments into four congestion levels, while the regression explained the impact of 12 traffic-related and land use factors on the road congestion pattern. Continuous congestion was found to occur mainly in the city center, and factors such as road type, nearby bus stations, nearby ramps, and commercial land use have a large impact on congestion formation. By combining Fuzzy C-means clustering with quantitative spatial regression, the overall evaluation process can help assess the spatiotemporal level of service of traffic from a congestion perspective.


Congestion Pattern, Taxi GPS Data, Fuzzy C-means Cluster, Spatiotemporal Regression, Built Environment Factor
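
The clustering step named in the abstract can be illustrated with a minimal sketch of the standard Fuzzy C-means iteration (Bezdek's formulation). It is not the authors' implementation. The function name `fuzzy_c_means`, the fuzzifier `m = 2`, the toy four-slot speed-ratio profiles, and the choice of two clusters are all assumptions made for illustration; the study itself works with 24-hour patterns and four congestion levels.

```python
import random

def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / w_sum
                            for d in range(dim)])
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1)).
        new_u = []
        for p in points:
            dist = [max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                    for ctr in centers]
            new_u.append([1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0))
                                    for j in range(c))
                          for k in range(c)])
        shift = max(abs(new_u[i][k] - u[i][k])
                    for i in range(n) for k in range(c))
        u = new_u
        if shift < tol:  # stop once memberships have stabilized
            break
    return centers, u

# Hypothetical profiles: ratio of observed speed to free-flow speed in four
# time slots per road segment (made-up numbers, not data from the study).
profiles = [
    [0.30, 0.40, 0.35, 0.30],  # slow all day
    [0.35, 0.38, 0.30, 0.32],  # slow all day
    [0.90, 0.85, 0.95, 0.90],  # near free flow
    [0.88, 0.90, 0.92, 0.87],  # near free flow
]
centers, memberships = fuzzy_c_means(profiles, c=2)
# Hard label per segment: the cluster with the highest membership degree.
labels = [row.index(max(row)) for row in memberships]
```

Unlike hard clustering, each segment keeps a degree of membership in every cluster, so a segment that is congested only part of the day can sit between levels instead of being forced into one.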



