Evaluating Kinect, OpenPose and BlazePose for Human Body Movement Analysis on a Low Back Pain Physical Rehabilitation Dataset

Aleksa Marusic
Autonomous Systems and Robotics Lab, Computer Science and System Engineering (U2IS), ENSTA Paris, Institut Polytechnique de Paris

Sao Mai Nguyen
Autonomous Systems and Robotics Lab, U2IS, ENSTA Paris, Institut Polytechnique de Paris and Depart. Informatique, IMT Atlantique

Adriana Tapus
Autonomous Systems and Robotics Lab, Computer Science and System Engineering (U2IS), ENSTA Paris, Institut Polytechnique de Paris

ABSTRACT
Analyzing human motion is an active research area with various applications. In this work, we focus on human motion analysis in the context of physical rehabilitation using a robot coach system. Computer-aided assessment of physical rehabilitation entails evaluating patient performance in completing prescribed rehabilitation exercises, based on processing movement data captured with a sensory system, such as RGB and RGB-D cameras. As 2D and 3D human pose estimation from RGB images has made impressive improvements, we aim to compare the assessment of physical rehabilitation exercises using movement data obtained from both an RGB-D camera (Microsoft Kinect) and estimation from RGB videos (OpenPose and BlazePose algorithms). A Gaussian Mixture Model (GMM) is fitted to position (and orientation) features, with performance metrics defined from the log-likelihood values of the GMM. The evaluation is performed on a medical database of clinical patients carrying out low back pain rehabilitation exercises, previously coached by the robot Poppy.

ACM Reference Format:
Aleksa Marusic, Sao Mai Nguyen, and Adriana Tapus. 2023. Evaluating Kinect, OpenPose and BlazePose for Human Body Movement Analysis on a Low Back Pain Physical Rehabilitation Dataset. In Companion of the 2023 ACM/IEEE International Conference on Human-Robot Interaction (HRI ’23 Companion), March 13–16, 2023, Stockholm, Sweden. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3568294.3580153

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
HRI ’23 Companion, March 13–16, 2023, Stockholm, Sweden
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9970-8/23/03. . . $15.00
https://doi.org/10.1145/3568294.3580153

1 INTRODUCTION
Physical rehabilitation affects an increasing number of people and is usually prescribed to patients who suffer from certain disabilities, or need to restore functional abilities, usually after an injury or surgery. One of the most important conditions is low back pain (LBP), which is the leading cause of disability worldwide [4].

In rehabilitation programs, a clinician instructs patients on how to perform rehabilitation exercises and then monitors their performance in a clinical setting. Such treatment depends on the availability of physiotherapists. In order to increase flexibility, home-based rehabilitation is often used instead of clinic-based rehabilitation. In such systems, a physiotherapist makes a rehabilitation plan consisting of several recommended exercises. Patients then perform the exercises at home and visit the clinic periodically for progress assessment. However, the lack of supervision and of timely feedback by a healthcare professional is considered the main factor in decreasing the engagement of patients throughout the months-long repetition of physical exercises. Low motivation and poor supervision increase the chances of incorrect performance of the exercises by the patient, which slows down the recovery process and increases the risk of re-injury [13].

A solution for this problem can be a robot coach for physical rehabilitation exercises. The robot should be able to learn how to perform a rehabilitation exercise as well as to assess patients’ movements. The system used in this paper is composed of an open-source humanoid robot called Poppy [10], and a depth camera (Microsoft Kinect v2), as detailed in [7, 8]. Figure 1 shows the setting of this system. The Kinect sensor is used to capture the human motion of both the therapists and the patients. To be able to provide feedback, human motion analysis is therefore crucial.

Figure 1: Setting of the system including a Microsoft Kinect v2 and an open source humanoid robot called Poppy

Analyzing human motion is a very active research topic today, with applications in several domains such as sports sciences [6], action and gesture recognition [3, 5, 9], and range-of-motion estimation [2]. Physiotherapists and GPs draw on extensive experience to classify a certain motion as correct or incorrect. Developing an automatic system for such a task is therefore not easy, due to a wide diversity of movements and a certain degree of subjectivity [2].
Obtaining precise movement data by motion sensors is crucial. In [13], the authors review different motion capture sensors used for rehabilitation exercise evaluation. One way is to use optical motion tracking systems, which place a set of markers on a patient’s body, ankles, and key body parts, thus obtaining the precise position of each joint at each time. Although such systems are highly accurate and reliable, they are very expensive and need to be (re)installed every time the patient does an exercise. The other, non-invasive approach is the use of depth cameras, which provide both the position and the orientation of skeleton joints. These systems have become very popular as they are cheap and easy to use. The most popular such system is the Microsoft Kinect sensor, which has been used in most of the related works. Finally, a standard vision camera can be used as a motion sensor as well. For years, such cameras limited the achievable evaluation accuracy, but nowadays, with the huge development of computer vision and deep learning techniques, it is possible to estimate skeleton joint positions, and even orientations, from plain RGB images, for instance with the OpenPose and BlazePose algorithms [12, 14].

Our work aims to evaluate the latter two sensor systems and investigates whether standard vision cameras, using the OpenPose and BlazePose algorithms [12, 14], can achieve results comparable with motion captured by the Microsoft Kinect system.

The rest of the paper is structured as follows. Section 2 provides an overview of the sensory data and the dataset used, while Section 3 describes the implemented method for rehabilitation assessment. The results are summarized in Section 4. The conclusions and discussions are part of Section 5.

2 DATASET

2.1 Skeleton data
A Human Pose Skeleton represents the orientation of a person in a graphical format. It is a set of coordinates that can be connected to describe the pose of the person. Points/coordinates in the skeleton are called joints or keypoints, while the connections between them are called pairs or limbs.
Figures 2 and 3 show the joints of the Kinect and OpenPose¹ skeletons.

Figure 2: Kinect skeleton
Figure 3: OpenPose skeleton

¹ https://maelfabien.github.io/tutorials/open-pose/#run-openpose

2.2 Dataset
The Keraal dataset has been recorded within a long-term rehabilitation program, targeting low back pain. The data includes recordings from healthy subjects but, more importantly, of rehabilitation patients. The data is extracted from a 4-week evolution of each patient. The videos collected from patients were annotated by a physiotherapist, using the Anvil video annotation research tool. Each exercise was labeled as either correct or incorrect.
This section describes the protocol and rationale for how the Keraal dataset was created. It describes the participants included, the hardware, and the experimental protocol used in the dataset. It is made available on http://nguyensmai.free.fr/KeraalDataset.html.

2.2.1 Rehabilitation program. 31 patients, aged 18 to 70 years, were recruited in the double blind study. This prospective, centrally randomized, controlled, single-blind, and bi-centric study was conducted from October 2017 to May 2019. 12 patients suffering from low-back pain were included in the Robot Supervised Rehabilitation Group, and were asked by a humanoid robot coach to perform each of the three predefined exercises as well as they could from its demonstration. The details of this clinical trial, including the patient care, the rehabilitation sessions, the robot coach, the inclusion and exclusion criteria, the characteristics of the patients, and the efficiency of the care, have been reported in Blanchard et al. [1].

2.2.2 Exercises and errors. A list of three exercises has been chosen in conjunction with therapists as common rehabilitation exercises that are also used for low-back pain treatment. Illustrations of these exercises can be seen as exercises 1, 2 and 3 of http://keraal.enstb.org/exercises.html. The 3 exercises are centered on spine stretching: a left rotation of the trunk followed by a right rotation, a left and right lateral bending of the trunk, and a breathing exercise with the upper limbs flexed 90° at shoulder and elbow.
A list of common errors was defined in conjunction with the experience of the therapists from the CHRU Brest.

2.2.3 Participants. The dataset contains data from three groups:
• Group1A: Rehabilitation patients: 14 recordings per exercise among the daily sessions of 6 patients were annotated.
• Group2A: Healthy participants: 6 healthy adults are free to execute each exercise correctly or with errors. 51 recordings per exercise were annotated.
• Group3: Healthy participants: 3 healthy adults perform the exercises correctly and simulate the identified errors.

2.2.4 Sensor system. Using the Microsoft Kinect V2 sensor, we obtained the RGB video with the skeleton drawn, and the skeleton joint position and orientation information. From the RGB videos, we can also obtain additional estimates of joint positions and
orientations using the human body keypoint detection libraries OpenPose [14] and BlazePose [12]. Moreover, as the Vicon system is considered the best system for precision, for Group3 we also recorded motion capture (MoCap) data using the Vicon system. For synchronization purposes, the two systems were activated simultaneously.

Table 1: F1 scores on Kinect trained data
Exercise type / training group   group1A   group2A   group3
CTK                              0.57      0.92      0.95
ELK                              0.25      0.53      0.34
RTK                              0.48      0.88      0.62
Average                          0.43      0.78      0.64

Table 2: F1 scores on OpenPose trained data
Exercise type / training group   group1A   group2A   group3
CTK                              0.4       0.94      0.56
ELK                              0.25      0.67      0.51
RTK                              0.45      0.97      0.63
Average                          0.37      0.86      0.57

3 METHODOLOGY

3.1 Gaussian Mixture Model
A Gaussian mixture model (GMM) belongs to the group of probabilistic models and represents data with a mixture of Gaussian probability density functions.
As in [11], we encode the movement point positions as a Gaussian Mixture Model (GMM) over $\theta = [t, x]$, where $t$ is the timestamp and $x$ the joint positions:

$$p(\theta) = \sum_{i=1}^{K} \phi_i \, \mathcal{N}(\mu_i, \Sigma_i) \qquad (1)$$
where the $i$-th mixture component is a normal distribution with weight $\phi_i$, mean $\mu_i$, and covariance matrix $\Sigma_i$. Each Gaussian of the mixture is thus defined by:

$$\mu_i = \begin{pmatrix} \mu_i^{t} \\ \mu_i^{x} \end{pmatrix}, \qquad \Sigma_i = \begin{pmatrix} \Sigma_i^{t} & \Sigma_i^{xt} \\ \Sigma_i^{xt} & \Sigma_i^{x} \end{pmatrix} \qquad (2)$$

where the indices $t$ and $x$ refer to time and position, respectively. The parameters $\phi_i$, $\mu_i$, $\Sigma_i$ are learned by Expectation-Maximisation (EM) from the skeleton data of the movements captured by the Kinect or estimated with OpenPose or BlazePose.
For motion assessment, the GMM is trained only on correct performances. In order to evaluate an observed sequence $X$ during tests, we consider the negative log-likelihood that the given sequence $X$ has been generated by the learned GMM. Correct motion sequences are expected to result in a lower negative log-likelihood than incorrect motion sequences. Hence, in order to detect incorrect motion sequences, we flag all motion sequences whose negative log-likelihood is higher than a threshold.

3.2 GMM in Riemannian manifold
While joint positions are naturally viewed in 3D Euclidean space, the quaternions that we obtain from Kinect data, which has both position and orientation, can be represented as elements of the 3-sphere $S^3$, which is a 3-dimensional Riemannian manifold. Unlike Euclidean space, such a space is not linear, so the mean and the covariance cannot be calculated directly. However, we can consider tangent spaces as a linear approximation and map a point from the manifold to the tangent space and vice versa.

4 RESULTS AND DISCUSSION
We compare the ability of the two baselines to detect incorrect motion sequences, while only correct demonstrations are available during training. We evaluate our methods for Kinect, OpenPose and BlazePose data on three rehabilitation exercises for three different groups of performances, as explained above. In other words, we trained a different GMM model for each combination of data type used (Kinect / BlazePose: 2D positions and 3D positions + orientations; OpenPose: 2D positions), exercise type (CTK, ELK, RTK), and performing group (1A, 2A, and 3). To alleviate the issues caused by class imbalance and scarce data, for each such combination we performed 4-fold cross-validation: we split the correct performances into 4 groups and trained four GMMs, each time leaving a different fold for validation and training on the remaining three folds.
In order to evaluate the performance of the GMMs, we compute the F1-score from correct and incorrect detections. We evaluate this measure for varying thresholds, from the minimum to the maximum value recorded for the negative log-likelihood of the GMM baseline. We calculate negative log-likelihood values for all validation data (all incorrect performances for that combination and the one fold of correct performances left out for validation) and find the threshold to be used for classification by calculating the F1 score for every possible threshold and choosing the threshold with the highest F1 score. Then, we average the F1 scores obtained over the four folds to obtain the F1-score for one combination. These scores can be seen in Tables 1 and 2.
By comparing the reported results, we can see some differences in F1 and accuracy scores for one concrete combination of exercise type and performing group, but the overall scores are comparable. Differences in specific cases can be better understood by looking at the confusion matrices shown in Figures 4a, 4b and 4c, where we report the confusion matrices per group and exercise for each skeleton data, using either 2D positions with GMM or 3D positions and orientations with GMM in Riemannian manifold. They show that the confusion matrices are quite similar, and thus that the model makes similar mistakes, for all the skeleton data analyzed. Thus, we can hypothesize that OpenPose, BlazePose or Kinect data are equivalent for automatic rehabilitation motion evaluation.
For group3, exercise CTK, comparing Kinect and OpenPose data, we can see that the main problem is the rather large number of incorrect exercises. Therefore, even small differences in the number of correct performances classified wrongly make big differences in F1 scores between the two baselines.
On the other hand, by comparing Kinect and OpenPose data for exercise CTK for group 1A, we can see a slightly larger number of incorrect exercises recognized as correct in the OpenPose GMM. One possible explanation for this is the difference in calculated F1 scores
across the 4 folds. In this specific case, the minimum F1 score was 0.35 while the maximum score was 0.53.

Figure 4: Confusion matrices for each performing group: (a) Group 1A, (b) Group 2A, (c) Group 3. Each row represents a data type (BlazePose 2D joint position, BlazePose 3D position + orientation, Kinect 2D, Kinect 3D + orientation, OpenPose). Each column represents an exercise (CTK, ELK, RTK).

5 CONCLUSIONS
We identified the need for automatic coaching of physical rehabilitation exercises and the importance of quality motion assessment. We investigated whether a simple RGB camera has comparable results with Kinect by utilizing deep learning for pose estimation.
We evaluated and compared a baseline motion analysis algorithm, a Gaussian Mixture Model, on Kinect and OpenPose / BlazePose data, for the task of rehabilitation motion assessment. We demonstrated that, on average, results obtained through Kinect, OpenPose and BlazePose data were quite comparable, which shows that simple RGB cameras have a potential to be used as the main sensor for collecting movement data. As RGB cameras are easily accessible for laymen through any computer webcam or smartphone, and are often cheaper, we can expect an increasing use of OpenPose, BlazePose and similar pose estimation algorithms.
Although the analysis algorithm was not the focus in this work, we could see that the targeted tasks are still quite challenging. In that sense, some more sophisticated methods, possibly with the usage of deep learning and / or graph neural networks, can be designed in order to better assess rehabilitation movements.

REFERENCES
[1] Blanchard A, Nguyen S M, Devanne M, Simonnet M, Le Goff-Pronost M, and Rémy-Néris O. 2022. Technical Feasibility of Supervision of Stretching Exercises by a Humanoid Robot Coach for Chronic Low Back Pain: The R-COOL Randomized Trial. BioMed Research International 2022 (mar 2022), 1–10.
[2] Miron A, Sadawi N, Waidah I, Hussain H, and Grosan C. 2021. IntelliRehabDS (IRDS): A Dataset of Physical Rehabilitation Movements. Data 6, 5 (2021). https://doi.org/10.3390/data6050046
[3] Tapus A., Bandera A., Vazquez-Martin R., and Calderita L. V. 2018. Perceiving the person and their interactions with the others for social robotics - A review. Pattern Recognition Letters (2018). https://doi.org/10.1016/j.patrec.2018.03.006
[4] Wu A, March L, Zheng X, Huang J, Wang X, Zhao J, Blyth FM, Smith E, Buchbinder R, and Hoy D. 2020. Global low back pain prevalence and years lived with disability from 1990 to 2017: estimates from the Global Burden of Disease Study 2017. Ann Transl Med. 8, 6 (Jan. 2020), 299. https://doi.org/10.21037/atm.2020.02.175
[5] Glowinski D, Dael N, Camurri A, Volpe G, Mortillaro M, and Scherer K. 2011. Toward a Minimal Representation of Affective Gestures. IEEE Transactions on Affective Computing 2, 2 (2011), 106–118. https://doi.org/10.1109/T-AFFC.2011.7
[6] Kulić D, Venture G, and Nakamura Y. 2009. Detecting changes in motion characteristics during sports training. 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (2009), 4011–4014.
[7] Maxime Devanne and Sao Mai Nguyen. 2017. Multi-level motion analysis for physical exercises assessment in kinaesthetic rehabilitation. In 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids). 529–534. https://doi.org/10.1109/HUMANOIDS.2017.8246923
[8] Maxime Devanne, Sao Mai Nguyen, Olivier Remy-Neris, Béatrice Le Gales-Garnett, Gilles Kermarrec, and André Thepaut. 2018. A Co-design Approach for a Rehabilitation Robot Coach for Physical Rehabilitation Based on the Error Classification of Motion Errors. In IEEE International Conference on Robotic Computing (IRC). 352–357.
[9] Aggarwal J.K. and Ryoo M.S. 2011. Human Activity Analysis: A Review. ACM Comput. Surv. 43, 3, Article 16 (apr 2011), 43 pages. https://doi.org/10.1145/1922649.1922653
[10] Matthieu L. 2014. Poppy: open-source, 3D printed and fully-modular robotic platform for science, art and education. Theses. Université de Bordeaux.
[11] Nguyen S M and Tanguy P. 2016. Cognitive architecture of a humanoid robot for coaching physical exercises in kinaesthetic rehabilitation. In International Workshop on Cognitive Robotics.
[12] Bazarevsky V, Grishchenko I, Raveendran K, Zhu T, Zhang F, and Grundmann M. 2020. BlazePose: On-device Real-time Body Pose tracking. CoRR abs/2006.10204 (2020).
[13] Liao Y, Vakanski A, Xian M, Paul D, and Baker R. 2020. A review of computational approaches for evaluation of rehabilitation exercises. Computers in Biology and Medicine 119 (2020), 103687.
[14] Cao Z, Hidalgo Martinez G, Simon T, Wei S, and Sheikh Y. A. 2019. OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019).
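
As an illustration of the assessment procedure described in Sections 3.1 and 4, the following minimal sketch scores a sequence by its negative log-likelihood under a GMM and then sweeps thresholds to find the one with the highest F1 score. It is not the implementation used in this work and rests on several assumptions: the function names are hypothetical, the covariances are taken to be diagonal (whereas Equation (2) uses full time-position blocks), the mixture parameters are supplied by hand rather than learned by EM, the negative log-likelihood is averaged per frame, and "incorrect" is treated as the positive class for the F1 score.

```python
import math


def gmm_neg_log_likelihood(sequence, weights, means, variances):
    """Mean negative log-likelihood of a sequence under a diagonal GMM.

    sequence: list of feature vectors theta = [t, x_1, ..., x_d].
    weights, means, variances: per-component parameters, assumed to be
    already fitted (e.g. by EM); diagonal covariance is an assumption.
    """
    total = 0.0
    for theta in sequence:
        log_terms = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for value, m, v in zip(theta, mu, var):
                log_p += -0.5 * (math.log(2.0 * math.pi * v) + (value - m) ** 2 / v)
            log_terms.append(log_p)
        # log-sum-exp over components for numerical stability
        top = max(log_terms)
        total += top + math.log(sum(math.exp(lt - top) for lt in log_terms))
    return -total / len(sequence)


def f1_score(predicted, actual):
    """F1 score with True (incorrect movement) as the positive class."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(nll_scores, is_incorrect):
    """Try every observed score as a threshold and keep the best F1.

    A sequence is flagged as incorrect when its negative log-likelihood
    is strictly higher than the threshold.
    """
    best_threshold, best_f1 = None, -1.0
    for threshold in sorted(set(nll_scores)):
        predicted = [score > threshold for score in nll_scores]
        score = f1_score(predicted, is_incorrect)
        if score > best_f1:
            best_threshold, best_f1 = threshold, score
    return best_threshold, best_f1
```

In a cross-validation setting such as the one in Section 4, `best_f1_threshold` would be called once per fold on the validation scores, and the resulting F1 values averaged over the folds.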
