Skeleton based Human Action Recognition using a Structured-Tree Neural Network


  •   Muhammad Sajid Khan

  •   Andrew Ware

  •   Misha Karim

  •   Nisar Bahoo

  •   Muhammad Junaid Khalid


The ability of automated technologies to correctly identify human actions provides considerable scope for systems that make use of human-machine interaction. Thus, automatic 3D Human Action Recognition is an area that has seen significant research effort. In the work described here, a human's everyday 3D actions recorded in the NTU RGB+D dataset are identified using a novel structured-tree neural network. The nodes of the tree represent the skeleton joints, with the spine joint represented by the root. The connection between a child node and its parent is known as the incoming edge, while the reciprocal connection is known as the outgoing edge. The use of a tree structure leads to a system that intuitively maps to human movements. The classifier uses the change in displacement of joints and the change in the angles between incoming and outgoing edges as features for classifying the actions performed.

Keywords: Structured-Tree Neural Network (STNN), Skeleton, Human Action Recognition (HAR)
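The feature extraction described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the joint names and parent map below are assumptions (the actual NTU RGB+D skeleton has 25 joints), and only the two features named in the abstract are computed, the per-joint displacement between frames and the angle at each joint between its incoming edge (parent to joint) and each outgoing edge (joint to child).

```python
import math

# Illustrative parent map with the spine joint as the tree root.
# The joint set here is a small assumed subset, not the full NTU RGB+D layout.
PARENT = {
    "spine": None,          # root node
    "neck": "spine",
    "head": "neck",
    "left_shoulder": "neck",
    "left_elbow": "left_shoulder",
    "left_hand": "left_elbow",
}

def displacement(p_prev, p_curr):
    """Change in a joint's 3D position between two consecutive frames."""
    return math.dist(p_prev, p_curr)

def edge_angle(parent, joint, child):
    """Angle (radians) at `joint` between its incoming edge
    (parent -> joint) and one outgoing edge (joint -> child)."""
    in_edge = [joint[i] - parent[i] for i in range(3)]
    out_edge = [child[i] - joint[i] for i in range(3)]
    dot = sum(a * b for a, b in zip(in_edge, out_edge))
    norm = math.hypot(*in_edge) * math.hypot(*out_edge)
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def frame_features(prev_pose, curr_pose):
    """Per-joint features for one frame pair: displacement for every joint,
    plus incoming/outgoing edge angles where both edges exist."""
    feats = {}
    for joint, parent in PARENT.items():
        feats[joint] = {"disp": displacement(prev_pose[joint], curr_pose[joint])}
        children = [c for c, p in PARENT.items() if p == joint]
        if parent is not None and children:
            feats[joint]["angles"] = [
                edge_angle(curr_pose[parent], curr_pose[joint], curr_pose[c])
                for c in children
            ]
    return feats
```

In this sketch a pose is a dict mapping joint names to (x, y, z) tuples; the resulting feature dict would then be flattened and fed to the classifier, a step omitted here.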






How to Cite
Khan, M., Ware, A., Karim, M., Bahoo, N. and Khalid, M. 2020. Skeleton based Human Action Recognition using a Structured-Tree Neural Network. European Journal of Engineering and Technology Research. 5, 8 (Aug. 2020), 849-854. DOI: