3D Pedestrian Trajectory Prediction using Deep Learning from Kinect Data

Jafari, Akbar; Hosseininaveh, Ali; Mahmoodian, Mojtaba

doi:10.61882/jgit.13.1.1

Engineering Journal of Geospatial Information Technology

نشریه علمی-پژوهشی مهندسی فناوری اطلاعات مکانی

Volume 13, Issue 1 (6-2025)

jgit 2025, 13(1): 1-17

Akbar Jafari, Ali Hosseininaveh*, Mojtaba Mahmoodian

K.N.Toosi University of Technology

Abstract:

Pedestrian trajectory prediction is a critical challenge in computer vision and intelligent transportation systems because it directly affects the safety and decision-making of autonomous systems. Most existing approaches rely on two-dimensional RGB data and recurrent neural networks such as long short-term memory (LSTM) networks; they neglect the depth dimension and therefore cannot accurately estimate distances between pedestrians and surrounding objects. In this study, we propose a three-dimensional LSTM (3D-LSTM) model that uses RGB-D data from a fixed Kinect sensor to predict pedestrian positions in metric three-dimensional space. The proposed framework comprises depth extraction from stereo images, coordinate normalization, and LSTM-based sequence modeling to forecast future pedestrian positions in (X, Y, Z) coordinates. Experimental evaluations on the École Polytechnique Fédérale de Lausanne (EPFL) dataset show that the 3D prediction accuracy (average RMSE: 15.7 cm) is comparable to that of conventional two-dimensional methods, while additionally providing real-world distance and spatial-interaction information that is crucial for collision avoidance and motion planning. The results indicate that incorporating the third dimension does not degrade performance; instead, it improves the ability of intelligent systems to make safer and better-informed decisions in dynamic environments. This approach lays the groundwork for advanced navigation and autonomous driving systems with enhanced three-dimensional situational awareness.
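The abstract names the pipeline stages but gives no formulas, so the following is only a minimal sketch of how the geometric steps could look. It assumes a pinhole camera model for turning a pixel and its depth into metric (X, Y, Z) coordinates, min-max scaling for coordinate normalization, and RMSE taken over the Euclidean 3D error. The intrinsic parameters (FX, FY, CX, CY) and all function names are hypothetical and are not taken from the paper; the LSTM itself is omitted.

```python
import math

# Hypothetical pinhole intrinsics (focal lengths and principal point, in pixels).
# Illustrative values only; not the calibration used in the paper.
FX, FY, CX, CY = 365.0, 365.0, 256.0, 212.0


def back_project(u, v, depth_m):
    """Map pixel (u, v) with depth in metres to camera-frame (X, Y, Z) in metres."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return (x, y, depth_m)


def fit_min_max(points):
    """Return the per-axis minimum and maximum over a list of 3D points."""
    mins = tuple(min(p[i] for p in points) for i in range(3))
    maxs = tuple(max(p[i] for p in points) for i in range(3))
    return mins, maxs


def normalize(point, mins, maxs):
    """Scale each axis to [0, 1]; an axis with no spread maps to 0.0."""
    return tuple(
        (point[i] - mins[i]) / (maxs[i] - mins[i]) if maxs[i] > mins[i] else 0.0
        for i in range(3)
    )


def denormalize(point, mins, maxs):
    """Invert normalize() so predictions can be reported in metres."""
    return tuple(point[i] * (maxs[i] - mins[i]) + mins[i] for i in range(3))


def rmse_3d(predicted, actual):
    """Root-mean-square Euclidean error between two equal-length 3D point lists."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("point lists must be non-empty and of equal length")
    squared = [
        sum((p[i] - a[i]) ** 2 for i in range(3))
        for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Two observed pixel positions of one pedestrian, with depths in metres.
    track = [back_project(256, 212, 2.0), back_project(300, 212, 2.5)]
    mins, maxs = fit_min_max(track)
    scaled = [normalize(p, mins, maxs) for p in track]
    restored = [denormalize(p, mins, maxs) for p in scaled]
    print(rmse_3d(restored, track))
```

In a setup like this, a sequence model would be trained on the normalized coordinates and its outputs denormalized before the error is computed, so that the reported RMSE is in metric units such as the centimetres quoted in the abstract.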

Keywords: Pedestrian trajectory, trajectory prediction, deep learning, LSTM network

Type of Study: Research | Subject: Aerial Photogrammetry
Received: 2023/06/10 | Accepted: 2024/05/26 | ePublished ahead of print: 2025/03/17 | Published: 2025/08/31

Jafari A, Hosseininaveh A, Mahmoodian M. 3D Pedestrian Trajectory Prediction using Deep Learning from Kinect Data. jgit 2025; 13(1): 1-17
URL: http://jgit.kntu.ac.ir/article-1-918-en.html

Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
