Intelligent Salat Posture Monitoring
Md Mozasser Rahman1, Rayan Abbas Ahmed Alharazi2, Muhammad Khairul Imban b Zainal Badri2
1Department of Mechanical Engineering Technology, Faculty of Engineering Technology, Universiti Tun Hussein Onn Malaysia, Johor, Malaysia
2Department of Mechanical Engineering, Kulliyyah of Engineering, International Islamic University Malaysia, Selangor, Malaysia
Corresponding Author:
Md Mozasser Rahman
Department of Mechanical Engineering Technology, Faculty of Engineering Technology, Universiti Tun Hussein Onn Malaysia
Km 1, Jalan Panchor, 84600 Pagoh, Muar, Johor Darul Ta’zim, Malaysia
Email: [email protected]
1. INTRODUCTION
Salat (prayer) is one of the main pillars of Islam and one of the most important aspects of the faith. Prophet Muhammad (peace be upon him) received the commandment of salat during Isra' and Mi'raj (the night journey), giving hope to humanity once more after it had lost the light on how to worship the true and only God, The Almighty Allah (Glorious and Exalted is He). At that time, Muslims learned how to perform salat by following the words and actions of Prophet Muhammad (peace be upon him), observing them directly with the senses of sight and hearing. In other words, the technique used to learn salat then relied solely on the human senses to detect the correct movements and words. Although Muslims of that era could learn salat only through the Prophet's words and actions, the teaching and learning were highly effective, as Muslims today still perform salat the same way as the Prophet.
As the world grows older, some Muslims tend to forget the proper way to perform salat as they become absorbed in worldly matters. Today, technology is improving at a very fast pace, and a great deal of research and development has been conducted to improve our lives. This places a responsibility on Muslim researchers to develop technologies that benefit both this world and the hereafter. With this goal, developing the "Intelligent Salat Monitoring and Training System" as an educational tool could help many Muslims learn and recognize the proper way of performing salat. Several studies have addressed activity monitoring of salat. Alobaid and Rasheed [1] and Al-Ghannam and Al-Dossari [2] used smartphone technology to recognize salat activities, while Rabbi et al. [3] and Ibrahim and Ahmad [4] assessed salat activities using electromyographic (EMG) signals. Machine vision and image processing are therefore the key technologies to learn and master in order to build a salat inspection and training system for the Muslim community.
Computer vision is used to inspect and track human movement in various fields, such as sports, health care, and even games. As Muslims, we usually learn how to perform salat by following others: we observe their postures and movements mainly with our eyes, then analyze and judge whether the salat is correct or wrong. Combining this technology with the religious aspect therefore offers many advantages. Using the system, a user can learn the proper way of salat by comparison against the correct postures stored in the database, and the system gives proper feedback in the form of words and numbers indicating the percentage of error in the salat movement.
Many researchers have developed algorithms to detect human parts such as the face and hands, as well as movements and postures. Some of these algorithms can detect the posture of the human body [5]. Different algorithms impose different requirements on the system, so careful consideration is needed to build a fully functional one. Approaches and algorithms for the salat inspection and training system must be chosen so that the images to be measured and compared do not lack the information the system needs. Elements such as viewing angle, size, color, and texture of the image need to be measured using multiple algorithms to obtain accurate results, so that the system does not give wrong feedback to the user. To address these problems, MATLAB is used to implement and test the proposed methods.
Muslim communities and converts to Islam across the world are in dire need of basic knowledge of salat. A salat inspection and training system can teach and share this knowledge with ease. The system addresses several problems: i) it helps Muslims across the world learn the correct way of salat anytime and anywhere; ii) it reduces the time, cost, and space required for learning salat; iii) it protects learners from fake preachers when learning salat; iv) it serves Muslims who feel embarrassed to learn salat from others; and v) it helps newly converted Muslims learn salat with ease.
Several movements in salat are considered essential [6]. These movements and postures must be performed correctly so that Allah will accept the salat. Notably, salat consists of a sequence of movements that form one complete cycle, known as a raka'ah. The sequence of one complete cycle is shown in Figure 1.
2. RELATED WORKS
2.1. Human body modelling in machine vision
Human motion and pose recognition can be categorized into two types of models: model-based and appearance-based methods. Model-based object tracking algorithms rely on simple CAD (computer-aided design) wire models of objects, as shown in Figure 2. Using such models, the start and end points of the lines can be
drawn correctly onto the image plane, granting real-time tracking of objects at a small computational cost. Appearance-based methods use no a priori knowledge of the data present in the image. Instead, they analyze the data using the statistics of the available dataset in the database to extract the modes, grouping the data in the best possible way. According to Azad et al., the appearance-based method uses various algorithms to describe the object [7]. In other words, appearance-based approaches are more reliable in many types of situations because they do not require a specific object model.
Hand gestures are a natural means of information transmission, which makes hand gesture recognition (HGR) one of the research hotspots in the field of human–machine interfaces (HMI) [10]. The two-dimensional (2D) contour representation projects the human body from three-dimensional (3D) space onto the 2D image plane [11]–[13], approximating the body with deformable contours, ribbons, or cardboard segments [14]. Figure 5 shows 2D images of the hand. The 3D representation describes the parts of the human body in 3D space using a combination of cylinders, as shown in Figure 6 [15]; other shapes, such as cones or spheres, can also be used to represent the human body.
3. METHOD
3.1. Actual picture and mechanical design
To design the salat inspection and training system, we used polyvinyl chloride (PVC) pipe as the base of the design. Portability was the first priority, since a fixed structure would require a large space for placement and storage. Using PVC pipe, the system can be assembled and disassembled easily in less than five minutes. The base of the system is built from a combination of plain-ended pipes, equal elbows, end caps, and equal tees. Figure 7(a) shows the actual picture of the system, and the isometric view is shown in Figure 7(b). In the actual picture, two black lines are drawn on the base carpet. The middle black line marks the initial position for at-tawarrok; the user sits there until the system finishes the inspection.
The black line near the back camera marks the initial position for takbiratul ihram, ruku', and sujud. The user performs each posture of the salat at its respective initial position, marked with the black lines. Two cameras are installed in the system, as shown in Figure 7(b). One camera, mounted higher, is installed at the front to inspect the front part of the body, such as the hands and head. The second camera, mounted lower, is installed at the back to inspect the back part of the body, such as the legs. The front camera inspects the postures for takbiratul ihram, ruku', and sujud, while the back camera inspects the posture during at-tawarrok.
Two servo motors are installed below the cameras; their function is to change the camera angle when recording the video of the user's salat. Because of these features, the system can be carried and deployed anywhere. A force-sensing resistor is installed in the base carpet to inspect the user during sujud; it checks whether the parts of the body, such as the forehead and nose, are touching the ground.
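As an illustration of the hardware interface, the following minimal MATLAB sketch (not the authors' exact code) shows how the camera servo and the force-sensing resistor could be driven through the MATLAB Support Package for Arduino; the pin names 'D9' and 'A0' and the contact threshold are assumptions.

```matlab
% Illustrative hardware sketch: steer a camera servo and read the FSR.
a = arduino();                     % connect to the board (default setup assumed)
s = servo(a, 'D9');                % servo mounted below a camera (pin assumed)
writePosition(s, 0.5);             % aim the camera; 0..1 maps to 0..180 degrees
fsrVolts = readVoltage(a, 'A0');   % voltage across the FSR divider (pin assumed)
isTouching = fsrVolts > 1.0;       % assumed contact threshold in volts
```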
Figure 7. Salat monitoring and training system: (a) actual picture and (b) isometric view
$$\mathrm{Gray} = \left(0.2126 \times \mathrm{Red}^{2.2} + 0.7152 \times \mathrm{Green}^{2.2} + 0.0722 \times \mathrm{Blue}^{2.2}\right)^{1/2.2} \qquad (1)$$
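A minimal MATLAB sketch of (1) is given below; note that, unlike MATLAB's built-in rgb2gray (which uses linear luma weights), this conversion applies the 2.2 gamma weighting of (1).

```matlab
function gray = gammaGray(rgb)
% GAMMAGRAY  Gamma-weighted grayscale conversion implementing (1).
    rgb = im2double(rgb);                            % scale pixels to [0, 1]
    R = rgb(:,:,1); G = rgb(:,:,2); B = rgb(:,:,3);  % split color channels
    gray = (0.2126*R.^2.2 + 0.7152*G.^2.2 + 0.0722*B.^2.2).^(1/2.2);
end
```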
Furthermore, the input image is converted to grayscale so that it can be matched against the template images stored in the database. A matching approach must then be chosen. The matching process moves the template image to all possible positions in a larger source image and computes a numerical index that indicates how well the template matches the image at that position. One well-known matching measure is the Euclidean distance. Let I be a gray-level image and g a gray-value template of size n×m:
$$d(I, g, r, c) = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{m} \left( I(r+i,\, c+j) - g(i,j) \right)^2} \qquad (2)$$
A more robust measure is the normalized correlation coefficient:

$$cor = \frac{\sum_{i=0}^{N-1} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=0}^{N-1} (x_i - \bar{x})^2 \cdot \sum_{i=0}^{N-1} (y_i - \bar{y})^2}} \qquad (3)$$
where,
x is the template gray-level image,
x̄ is the average gray level in the template image,
y is the source image section,
ȳ is the average gray level in the source image section, and
N is the number of pixels in the section.
The value of cor lies between −1 and +1, with larger values representing a stronger relationship between the two images.
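The following hedged MATLAB sketch illustrates (2) and (3) with a brute-force sliding window; the file names are placeholders, and corr2 (Image Processing Toolbox) computes exactly the coefficient in (3).

```matlab
% Sketch: slide template g over source image I and score every position.
I = im2double(rgb2gray(imread('frame.png')));     % source frame (placeholder path)
g = im2double(rgb2gray(imread('template.png')));  % database template (placeholder)
[n, m]   = size(g);
[Ir, Ic] = size(I);

best = -Inf;
for r = 0:Ir-n                                    % all vertical offsets
    for c = 0:Ic-m                                % all horizontal offsets
        patch = I(r+1:r+n, c+1:c+m);              % window under the template
        d   = sqrt(sum((patch - g).^2, 'all'));   % Euclidean distance, (2)
        cor = corr2(patch, g);                    % correlation coefficient, (3)
        if cor > best
            best = cor;  bestPos = [r, c];        % best-matching position
        end
    end
end
```

In practice, MATLAB's normxcorr2 computes the normalized cross-correlation over all offsets at once and is far faster than this explicit loop.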
As we know, the correlation matching result never shows a 100% match, because the images always differ in small details. Therefore, a threshold should be applied to the correlation result; the threshold can be set near the highest matching value observed. For the feature extraction of this system, we considered histogram of oriented gradients (HOG) descriptors, as shown in Figure 8, to demonstrate how the system performs. Figure 8(a) shows the histogram of an image, and the obtained HOG feature of the image is shown in Figure 8(b).
However, many descriptors can be used alone or in combination. Other descriptors apply a similar operation, converting image pixels into votes according to their intensity values in the range 0–255 (0 for black and 255 for white). Dividing the feature space into log-polar bins instead of square bins is a commonly used approach. Descriptors are needed so that the computer can identify the image;
the more image descriptors the system is trained with, the lower the system error becomes. Combining multiple types of descriptors, such as the scale-invariant feature transform (SIFT), gradient location orientation histogram (GLOH), and speeded up robust features (SURF), can further enhance the performance of the system.
Figure 8. Process of calculating HOG: (a) creating a histogram from the image and (b) visualization of the HOG features of the image
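To make the HOG step concrete, the sketch below (an illustration only, assuming the Computer Vision Toolbox and an 8×8 cell size) extracts HOG features from a captured frame and a database template and compares them by relative feature distance; the file names are placeholders.

```matlab
% Sketch: extract and compare HOG features of a frame and a template.
img  = rgb2gray(imread('posture.png'));          % captured frame (placeholder)
tmpl = rgb2gray(imread('correct_posture.png'));  % database template (placeholder)
img  = imresize(img, size(tmpl));                % align sizes so HOG vectors match

[fImg, visImg] = extractHOGFeatures(img,  'CellSize', [8 8]);
fTmpl          = extractHOGFeatures(tmpl, 'CellSize', [8 8]);

err = norm(fImg - fTmpl) / norm(fTmpl);          % relative feature difference
figure; imshow(img); hold on; plot(visImg);      % overlay, as in Figure 8(b)
```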
A sample HOG representation of both the correct and wrong positions during takbiratul ihram is shown in Figure 9. The correct position, with the hands raised above the shoulders, and its descriptor are shown in Figure 9(a), whereas the wrong position, with the hands below the shoulders, is shown in Figure 9(b). The descriptor of each image, on the right side, shows the intensity variation in the images.
Figure 9. Comparison of correct and wrong positions of takbiratul ihram using the HOG representation of the image: (a) correct position and (b) wrong position
Both HGR and HOG descriptors were used to match the two positions, as shown in Figure 10; the error of the result can be recognized from the difference between the two extreme points. The difference in strong corners between the two overlaid images represents the amount of unmatched features, i.e., the error between the two matched images. The larger this difference, the more likely the salat position performed by the worshipper is wrong. Therefore, we increase the number of extracted features by increasing the number of corners, and we threshold the matching result so that the system flags the position as wrong if the difference between the two images exceeds 30%.
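A hedged sketch of this corner-based check is given below; the choice of the Harris detector, the descriptor matcher, and the reading of the 30% threshold as the fraction of unmatched corners are our assumptions, not the exact implementation.

```matlab
% Sketch: compare corner features of two postures and apply the 30% threshold.
A = rgb2gray(imread('user_posture.png'));        % user frame (placeholder)
B = rgb2gray(imread('reference_posture.png'));   % reference image (placeholder)

cA = detectHarrisFeatures(A);  cB = detectHarrisFeatures(B);
[fA, vA] = extractFeatures(A, cA);               % descriptors at valid corners
[fB, vB] = extractFeatures(B, cB);

pairs     = matchFeatures(fA, fB);               % indices of matched corners
unmatched = 1 - size(pairs, 1) / min(cA.Count, cB.Count);
isWrong   = unmatched > 0.30;                    % flag posture as wrong
```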
Figure 10. The result of the matching process using HGR and HOG
Figure 11. Graphical user interface (GUI) of Intelligent Salat Monitoring System
4. RESULTS AND DISCUSSION
Once the system finishes recording the video, it filters the region of interest and converts the color from RGB to grayscale. The image is then matched against the database using template matching with the Euclidean distance as the measure. To improve accuracy, gray-level correlation is used to increase the system's performance.
If the salat is performed correctly within the programmed threshold, shown in the graph, a message saying "GOOD PERFORMANCE" pops up in green text. Otherwise, the system suggests the correct postures in red text, indicating that the performance of the salat is bad. This stage applies to takbiratul ihram, ruku', and at-tawarrok only. For sujud, a special graph showing the number of force readings is presented to the user together with the performance. In this way, the system trains the user until the correct posture is performed.
The front camera records the video during takbiratul ihram, ruku', and sujud, and the system gives feedback to the user through pictures and a graph. Figure 12 depicts a good takbiratul ihram: the left side is a screenshot of the recorded video, the middle graph shows the matching percentage, and the notification image is shown on the right side. Figure 13 depicts bad performance during takbiratul ihram with the same layout.
The performance of a correct ruku' is shown in Figure 14, again with the screenshot of the recorded video on the left, the matching percentage in the middle, and the notification image on the right. Figure 15 presents an incorrect ruku': on the left is the template for the incorrect posture stored in the database, followed by the bad ruku' performance and the matching percentages, which do not reach the threshold.
If the system finds a frame with a 98% match, a green rectangle appears on the region of interest. If the match lasts for more than three seconds, the posture is considered correct and a notification pops up. If the match does not last more than three seconds, the system notifies the student via a pop-up image with a comment on it. This is the feedback mechanism of the salat inspection and training system.
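The following sketch illustrates this temporal rule under assumed parameters (a 15 fps camera and placeholder message text): a streak counter tracks consecutive frames at or above the 98% match threshold, and msgbox displays the feedback.

```matlab
% Sketch: posture is correct if the match stays >= 98% for over 3 seconds.
fps = 15;  needFrames = 3 * fps;                 % assumed camera frame rate
streak = 0;  correct = false;
for k = 1:numel(matchScores)                     % matchScores: per-frame match in [0,1]
    if matchScores(k) >= 0.98                    % frame meets the 98% threshold
        streak = streak + 1;
        if streak > needFrames, correct = true; break; end
    else
        streak = 0;                              % streak broken, start over
    end
end
if correct
    msgbox('GOOD PERFORMANCE');                  % shown in green in the actual GUI
else
    msgbox('Please correct your posture');       % placeholder corrective message
end
```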
Figure 12. Takbiratul ihram: from the left, the recorded video, the matching percentage, and the notification image
All postures use the same feedback concept except sujud, for which additional information is provided. A graph of the force-sensing resistor readings indicates that the user is performing sujud: whenever the user's nose and forehead touch the sensor located in the base carpet, the force triggers the sensor and the reading appears in MATLAB. Twelve force readings during sujud are set as the threshold indicating good sujud performance. Figure 16 shows the force sensor readings used to verify the performance of sujud: Figure 16(a) shows the feedback for good sujud performance, while Figure 16(b) indicates bad sujud performance.
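A sketch of the sujud check under the same assumed Arduino interface is shown below; the 0.1 s sampling interval, the window length, and the voltage threshold are illustrative, while the 12-reading threshold comes from the text.

```matlab
% Sketch: count FSR triggers during sujud and compare with the 12-reading threshold.
a = arduino();                        % same interface as the earlier hardware sketch
hits = 0;
for k = 1:120                         % assumed sampling window during sujud
    if readVoltage(a, 'A0') > 1.0     % forehead/nose pressing the FSR (assumed)
        hits = hits + 1;
    end
    pause(0.1);                       % assumed 0.1 s sampling interval
end
goodSujud = hits >= 12;               % 12-reading threshold from the text
```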
Figure 16. Force-sensing resistor readings during sujud showing the performance of (a) good sujud and (b) bad sujud
5. CONCLUSION
In conclusion, the first objective is to learn the correct postures in salat, whereby we analyze books
of hadith and Muslim scholars in the literature review section. The second objective is to develop an image-
processing algorithm using MATLAB, which is also achieved as the result show quite a good performance.
The third objective of this study is to test the salat performance and provide feedback to the user. This is also
achieved as the output result of the matching image pop up the message about the salat performance and train
the user by giving the correct instructions regarding the current postures of the salat. The results are quite
accurate, as the method proposed is able to identify and match the pattern to recognize up to 90% and inform
the user about their salat performance. The reading in the graph is more accurate when the user performs
salat using the system itself because the camera angle is fixed. Although the posture is correct, some results
show errors when the lighting is bad. This is because the pattern matching in MATLAB is confused when the
lighting is insufficient. It will affect the results of pattern matching, for example, the posture of salat is
correct, but the system keeps on giving bad performance feedback to the user. This issue can be solved by
using the system under sufficient light; hence, increasing the accuracy of the overall system. It is
recommended to bright room to ensure clear images captured. The camera angle also needs to be fixed and
constant between the database and the correct and wrong image for the system to detect the pattern without
an error.
ACKNOWLEDGEMENTS
Communication of this research is made possible through monetary assistance by Universiti Tun
Hussein Onn Malaysia and the UTHM Publisher’s Office via Publication Fund E15216.
REFERENCES
[1] O. Alobaid and K. Rasheed, “Prayer activity recognition using an accelerometer sensor,” in 2018 World Congress in Computer
Science, Computer Engineering and Applied Computing, CSCE 2018 - Proceedings of the 2018 International Conference on
Artificial Intelligence, ICAI 2018, 2018, pp. 271–277.
[2] R. Al-Ghannam and H. Al-Dossari, “Prayer activity monitoring and recognition using acceleration features with mobile phone,”
Arabian Journal for Science and Engineering, vol. 41, no. 12, pp. 4967–4979, May 2016, doi: 10.1007/s13369-016-2158-7.
[3] M. F. Rabbi, N. Wahidah Arshad, K. H. Ghazali, R. Abdul Karim, M. Z. Ibrahim, and T. Sikandar, “EMG activity of leg muscles
with knee pain during islamic prayer (salat),” in Proceedings - 2019 IEEE 15th International Colloquium on Signal Processing
and its Applications, CSPA 2019, Mar. 2019, pp. 213–216, doi: 10.1109/CSPA.2019.8696025.
[4] F. Ibrahim and S. A. Ahmad, “Assessment of upper body muscle activity during salat and stretching exercise: A pilot study,” in
Proceedings - IEEE-EMBS International Conference on Biomedical and Health Informatics: Global Grand Challenge of Health
Informatics, BHI 2012, Jan. 2012, pp. 412–415, doi: 10.1109/BHI.2012.6211603.
[5] R. R. Porle, A. Chekima, F. Wong, M. Mamat, N. Parimon, and Y. F. A. Gaus, “Two-dimensional human pose estimation for the
upper body using histogram and template matching techniques,” in 2013 1st International Conference on Artificial Intelligence,
Modelling & Simulation (AIMS), 2013, pp. 249–254, doi: 10.1109/AIMS.2013.46.
[6] N. A. Jaafar, N. A. Ismail, and Y. A. Yusoff, “An investigation of motion tracking for solat movement with dual sensor
approach,” ARPN Journal of Engineering and Applied Sciences, vol. 10, no. 23, pp. 17981–17986, 2015.
[7] P. Azad, T. Asfour, and R. Dillmann, “Combining appearance-based and model-based methods for real-time object recognition
and 6D localization,” in 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct. 2006, pp. 5339–5344,
doi: 10.1109/IROS.2006.282094.
[8] Y. Yang and D. Ramanan, “Articulated pose estimation with flexible mixtures-of-parts,” in Proceedings of the IEEE Computer
Society Conference on Computer Vision and Pattern Recognition, 2011, pp. 1385–1392, doi: 10.1109/CVPR.2011.5995741.
[9] M. Lora, S. Ghidoni, M. Munaro, and E. Menegatti, “A geometric approach to multiple viewpoint human body pose estimation,”
in 2015 European Conference on Mobile Robots, ECMR 2015 - Proceedings, 2015, pp. 1–6, doi: 10.1109/ECMR.2015.7324195.
[10] J. L. Raheja, K. Das, and A. Chaudhary, “Fingertip detection: a fast method with natural hand,” International Journal of
Embedded Systems and Computer Engineering, vol. 3, no. 2, pp. 85–89, 2012, doi: 10.48550/arXiv.1212.0134.
[11] L. Guo, Z. Lu, and L. Yao, “Human-machine interaction sensing technology based on hand gesture recognition: a review,” IEEE
Transactions on Human-Machine Systems, vol. 51, no. 4, pp. 300–309, Aug. 2021, doi: 10.1109/THMS.2021.3086003.
[12] P. Parvathy, K. Subramaniam, G. K. D. Prasanna Venkatesan, P. Karthikaikumar, J. Varghese, and T. Jayasankar, “Development
of hand gesture recognition system using machine learning,” Journal of Ambient Intelligence and Humanized Computing, vol. 12,
no. 6, pp. 6793–6800, 2021, doi: 10.1007/s12652-020-02314-2.
[13] M. M. Islam, S. Siddiqua, and J. Afnan, “Real time hand gesture recognition using different algorithms based on American sign
language,” 2017, doi: 10.1109/ICIVPR.2017.7890854.
[14] J. K. Aggarwal and S. Park, “Human motion: modeling and recognition of actions and interactions,” in Proceedings - 2nd
International Symposium on 3D Data Processing, Visualization, and Transmission. 3DPVT 2004, 2004, pp. 640–647, doi:
10.1109/TDPVT.2004.1335299.
[15] M. W. Lee and R. Nevatia, “Human pose tracking in monocular sequence using multilevel structured models,” IEEE Transactions
on Pattern Analysis and Machine Intelligence, vol. 31, no. 1, pp. 27–38, 2009, doi: 10.1109/TPAMI.2008.35.
[16] P. N. Maraskolhe and A. S. Bhalchandra, “Analysis of facial expression recognition using histogram of oriented gradient (HOG),”
in Proceedings of the 3rd International Conference on Electronics and Communication and Aerospace Technology, ICECA 2019,
2019, pp. 1007–1011, doi: 10.1109/ICECA.2019.8821814.
[17] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Proceedings - 2005 IEEE Computer Society
Conference on Computer Vision and Pattern Recognition, CVPR 2005, 2005, vol. 1, pp. 886–893, doi: 10.1109/CVPR.2005.177.
[18] S. Zhang and X. Wang, “Human detection and object tracking based on histograms of oriented gradients,” in Proceedings -
International Conference on Natural Computation, 2013, pp. 1349–1353, doi: 10.1109/ICNC.2013.6818189.
[19] K. V. V. Kumar and P. V. V. Kishore, “Indian classical dance mudra classification using HOG features and SVM classifier,”
International Journal of Electrical and Computer Engineering (IJECE), vol. 7, no. 5, pp. 2537–2546, Oct. 2017, doi:
10.11591/ijece.v7i1.pp2537-2546.
[20] C. Wattanapanich, H. Wei, and W. Petchkit, “Investigation of robust gait recognition for different appearances and camera view
angles,” International Journal of Electrical and Computer Engineering (IJECE), vol. 11, no. 5, pp. 3977–3987, Oct. 2021, doi:
10.11591/ijece.v11i5.pp3977-3987.
[21] L. Wang, W. Hu, and T. Tan, "Recent developments in human motion analysis," Pattern Recognition, vol. 36, no. 3, pp. 585–601, 2003, doi: 10.1016/S0031-3203(02)00100-0.
[22] O. C. Ibe, Markov process for stochastic modeling, 2nd ed. Waltham, MA, USA: Elsevier, 2009.
[23] M. Rocha and P. G. Ferreira, “Hidden markov models,” in Bioinformatics Algorithms, vol. 54, Elsevier, 2018, pp. 255–273.
[24] L. Wang, Y. Zhang, and J. Feng, “On the Euclidean distance of images,” IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 27, no. 8, pp. 1334–1339, Aug. 2005, doi: 10.1109/TPAMI.2005.165.
[25] A. F. Bobick and J. W. Davis, “The recognition of human movement using temporal templates,” IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 23, no. 3, pp. 257–267, Mar. 2001, doi: 10.1109/34.910878.
BIOGRAPHIES OF AUTHORS