Machine learning based tuberculosis (ML-TB) health predictor model: early TB health disease prediction with ML models for prevention in developing countries

PeerJ Computer Science

Introduction

Tuberculosis (TB) is one of the leading fatal infectious diseases, defined as “an infectious disease caused by bacillus Mycobacterium tuberculosis”. TB primarily affects the human respiratory tract, i.e., the lungs. According to the World Health Organization (WHO), Pakistan ranks fifth among the 30 high TB-burden countries globally (The World Health Organization, 2024a). To control this lethal disease, reduce the death rate, and ensure good health and well-being, it is essential to put forward an automated solution for diagnosing TB in its initial phase. The integration of cutting-edge technologies into the medical and health sciences, called Healthcare Informatics (HI), aims to produce innovative digital solutions and novel tools that save human lives effectively. Consequently, this research studies an automated machine learning (ML) solution for predicting TB manifestation in an individual at an early stage.

The contribution of this research is four-fold: (a) the proposed systematic architecture of the ML-TB predictive model; (b) novel data collection from hospitals via a self-administered TB screening questionnaire; (c) data analysis in Python to generate the targeted ML solutions using the primary signs, symptoms and risk factors of TB; (d) the context of the research is Sindh, a high TB-burden province of Pakistan. The major highlights and insights of the article are: (a) the implemented ML algorithms achieved high accuracy; (b) considering primary TB features for automated diagnosis is vital for doctors to increase awareness and protect individuals from TB complications; and (c) the produced solution helps doctors diagnose and treat patients in the primary phase of TB, which can reduce the TB death rate.

The rest of the article is organized as follows. “Healthcare Informatics (HI) and Machine Learning (ML)” defines Healthcare Informatics and machine learning. “The Targeted Disease: Tuberculosis (TB)” discusses the targeted disease. “Literature Review” reviews the literature and identifies the rationale for the research problem. “ML-TB Predictor” explains the methodology adopted in the research. “Data Analysis” presents the data analysis. “Results and Discussion” presents the experimental settings, results and discussion. Finally, “Conclusions” concludes the article and suggests future research directions.

Healthcare informatics and machine learning

The U.S. National Library of Medicine described HI as “an interdisciplinary study of design, development, adoption, and implementation of modern technical apparatuses and processes in healthcare system” (Parry, 2014). Computer systems, clinical guidelines, formal diagnostic taxonomies, and information and communication systems are some of the resources used in HI. Computational intelligence is used to manage patient healthcare, either individually or in groups. In general, HI aims to improve the overall efficacy of patient healthcare (Holzinger, 2016). At present, machine learning (ML), a subdomain of Artificial Intelligence (AI), is significantly integrated into HI. Arthur Samuel, a pioneer of computer gaming and AI, coined the term ML in 1959 and described it as the branch of research that enables computers to learn without being explicitly programmed (The World Health Organization, 2024b). The ultimate goal of ML is to develop machines (algorithms) that learn, check, and improve themselves and make decisions. An ML system learns from past data (i.e., training), constructs a mathematical model, and provides predictions or decisions whenever it receives new data (i.e., testing). The amount of data influences the accuracy of the predicted output, as shown in Fig. 1. ML combines computer science and statistics to create predictive models; the former focuses on problem solving and on deciding whether problems are solvable at each phase, while the latter emphasizes data modeling, reasoning, and reliability measurement.

Figure 1: Machine learning ML-TB predictor work flow.

The contribution of ML to HI is directed towards predictive analytics, disease diagnosis, therapeutic formation, and other areas. For efficient and precise diagnosis, ML combines healthcare with cutting-edge computer technology and assists medical experts in providing superior care.

The targeted disease: tuberculosis

Kumar & Clark (2012) define TB as a communicable disease caused by bacilli called Mycobacterium tuberculosis, which usually spread through the air when an infected person coughs, spits, speaks, or sneezes. It primarily affects the human respiratory tract (i.e., the lungs). However, the bacterium can also spread to other parts of the body. The former type is pulmonary TB, while the latter is extra-pulmonary TB. The infection can be in one of two states: latent TB or active TB. In latent TB, the mycobacteria remain dormant in the subject, whereas in active TB the subject exhibits the relevant signs and symptoms. According to WHO, TB is among the top 10 fatal diseases globally. Pakistan ranks fifth among high TB-burden countries and fourth in the prevalence of multidrug-resistant TB (MDR-TB). In Pakistan, millions of people are infected with TB and 66% of the patients die every year. Since the turn of the century, the Pakistan National TB Program has made significant improvements in TB detection and treatment. At the beginning of the previous decade, the detection rate for TB cases rose from 19% to 84%. Within the same period, the success rate of TB treatment rose to 91% (The World Health Organization, 2024b). Although these statistics show remarkable improvement, it is important to continue research on treatment and diagnosis methods. Consequently, this research aims to improve TB detection using state-of-the-art ML methods.

Literature review

In this section, the existing literature is reviewed to provide a thorough understanding of the subject. Various ML types, specifically interactive machine learning (iML) and automated machine learning (aML), as well as ML approaches including regular and ensemble ML, are examined. Additionally, learning paradigms such as supervised, unsupervised, and semi-supervised learning, along with algorithms like decision trees, support vector machines, naïve Bayes, regression analysis, neural networks, k-nearest neighbor, k-means clustering, genetic algorithms, deep learning, and ensembles, have been explored in HI in our previous studies (Karmani et al., 2018, 2020) and other studies (Tasci, Uluturk & Ugur, 2021; Li et al., 2023; Zhang et al., 2023; Islam et al., 2022). Yahiaoui, Er & Yumusak (2017) used the support vector machine (SVM) method to detect TB at an early stage, achieving 96.68% accuracy with low running costs. However, the study does not compare SVMs with other approaches, so its superiority cannot be established. Alcantara et al. (2017) proposed a deep convolutional neural network model that incorporates mobile health technologies to improve TB detection in Peru, with accuracy rates of 89.6% for binary classification and 62.7% for multi-class classification. Nevertheless, the reliability and accessibility of mobile health technology may influence the outcomes because environmental settings vary. Shahaboddin et al. (2014) detected TB using hybrid ML techniques that included artificial immune systems (AISs), genetic algorithms, and neural networks, with 99.14% classification accuracy, 87% sensitivity, and 86.12% specificity. However, such hybrid ML systems can be computationally complex and resource-intensive, restricting their practical viability, particularly in resource-constrained contexts.
Er, Yumusak & Temurtas (2012), (2010) used AIS to identify multiple chest diseases, including TB, attaining 90% accuracy. In another study, they used multiple neural networks, including ANN, MLNN, PNN, LVQNN, GRNN, Bayesian networks, and RBFNN, and achieved a 90% classification accuracy rate with MLNN. Yet AIS and certain NN architectures can be difficult to scale to larger datasets, limiting their performance and generalizability. Furthermore, owing to hyperparameter tuning, such techniques tend to be time-consuming and do not always yield optimal outcomes, particularly with limited computational resources (Shahaboddin et al., 2014).

As shown in Table 1, prior studies on TB diagnosis have relied on complicated pathological features, and no such studies have been carried out in Pakistan, despite its high TB prevalence.

Table 1: Literature review of ML-TB predictor.
| Ref# | Dataset | Features | ML algorithm | Training procedure | Success ratio (%) |
|---|---|---|---|---|---|
| El-Solh et al. (1999) | 563 patients, State University of New York | 21 | GRNN (1 hidden layer) | - | 92.30 |
| dos Santos et al. (2007) | 136 patients' medical reports, University Hospital, Rio de Janeiro, Brazil | 26 | MLNN with BP (1 hidden layer) | | 77 |
| Er, Temurtas & Tanrikulu (2010) | 150 patients' medical reports, Diyarbakir Chest Diseases Hospital, Turkey | 38 | GRNN (1 hidden layer); MLNN (1 hidden layer); MLNN (2 hidden layers) | BPwM; LM | 93.18; 93.04; 93.24; 93.93; 95.08 |
| Elveren & Yumusak (2011) | 150 patients' medical reports, Diyarbakir Chest Diseases Hospital, Turkey | 38 | MLNN (2 hidden layers) | GA | 94.88 |
| Dongardive et al. (2011) | 250 sample reports, T.B. Hospitals, Mumbai | 19 | Decision tree | IDT | 94.50 |
| Omisore, Samuel & Atajeromavwo (2017) | 10 TB patients from St. Francis Catholic Hospital, Delta State, Nigeria | 24 | MLNN | GA + NN + Fuzzy Logic | 70 |
| Lakhani & Sundaram (2017) | Montgomery County (MC), Shenzhen, Thomas Jefferson University Hospital and Belarus Tuberculosis Portal | CXRs | CNN | Ensemble; Radiologist Augmented | 96.00; 98.70 |
| Hwang et al. (2016) | Private Dataset, Montgomery County (MC) and Shenzhen Datasets | CXRs | CNN | Modified AlexNet | 90.00 |
| Lopes & Valiati (2017) | Montgomery County (MC), Shenzhen | CXRs | CNN | Ensemble | 84.60 |
| Alcantara et al. (2017) | 5,000 CXRs from Partners in Health at Peru and various other image DBs | CXRs | CNN | Binary and Multi-class Classification | 89.60; 62.07 |
DOI: 10.7717/peerj-cs.2397/table-1

The rationale for undertaking this research in Pakistan, notably in Sindh province, arises from the region’s high prevalence of TB as well as its particular epidemiological profile, healthcare infrastructure, and socio-cultural challenges. TB epidemiology in Pakistan differs from that in other regions, requiring context-specific interventions for effective control. Inadequate healthcare infrastructure, along with cultural and socioeconomic dynamics, impedes TB detection and control initiatives, reinforcing the need for innovative methods tailored to the Pakistani context. Accordingly, the present study aims to improve early detection and treatment initiation by developing an automated TB diagnostic model based on the analysis of preliminary signs, symptoms, and risk factors. The research is expected to help reduce TB-related morbidity and mortality in the country, contributing to positive health outcomes for the affected populations.

ML-TB predictor

In terms of research philosophy, the present study is quantitative, based on the positivist paradigm, and adopts a deductive approach. By nature, it is analytical research concerned with determining cause-and-effect associations between two or more variables. By purpose, it is fundamental (basic or pure) research that produces knowledge and understanding of natural occurrences. The research design is conclusive (i.e., decision making), as shown in Fig. 2. The study adopted a low-cost, efficient, and precise survey method for collecting data about a population, administered to a large sample. The survey was designed and administered in accordance with the research guidelines of the University. Furthermore, this study was reviewed and approved by the Institutional Bioethics Committee (IBC) of the University (Ref. No. ORIC/SU/311). Written consent forms were obtained from the participants, recording their agreement and the hospital details.

Figure 2: Research design and steps.

The main research instrument was a cross-sectional, self-administered TB screening questionnaire in four parts: personal information about the patient, exploratory questions relevant to TB, supplementary information, and comments/feedback. The questionnaire was quasi-structured, consisting principally of dichotomous (Yes/No) questions along with two partially open-ended questions. To reduce bias, the research used random probability sampling. The questionnaire was pre-tested to ascertain its precision and psychometric reliability. In total, 1,200 questionnaires were distributed evenly across three hospitals in Sindh: the Institute of Chest Diseases (TB Sanatorium) Kotri, Liaquat University of Medical and Health Sciences (LUMHS) Jamshoro, and Civil Hospital Hyderabad. The questionnaires were administered by the researcher and answered by the doctors consulting TB-infected patients at these hospitals, resulting in varied response rates. The highest response rate (90%) was obtained from the Institute of Chest Diseases (TB Sanatorium) Kotri, followed by 77.5% from Civil Hospital Hyderabad and 70% from LUMHS Jamshoro. In total, 950 responses were received, a response rate of 79.17%. After eliminating 190 responses with partial data (i.e., records with missing values), 760 responses were retained for further analysis, as summarized in Table 2.

Table 2: Statistical summary of the survey responses.
| Hospitals | Questionnaires distributed | Questionnaires returned |
|---|---|---|
| Institute of Chest Diseases (TB Sanatorium), Kotri | 400 | 360 |
| Liaquat University of Medical and Health Sciences, Jamshoro | 400 | 280 |
| Civil Hospital, Hyderabad | 400 | 310 |
| Aggregate | 1,200 | 950 |
| Ambiguous questionnaires (−) | | 190 |
| Final sample of the study | | n = 760 |
DOI: 10.7717/peerj-cs.2397/table-2
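
The response rates reported for the survey follow directly from the counts in Table 2. The short Python sketch below reproduces that arithmetic; the variable names are illustrative and are not taken from the study's notebook.

```python
# Distributed and returned questionnaire counts per hospital, from Table 2.
distributed = {"ICD Kotri": 400, "LUMHS Jamshoro": 400, "Civil Hospital Hyderabad": 400}
returned = {"ICD Kotri": 360, "LUMHS Jamshoro": 280, "Civil Hospital Hyderabad": 310}
excluded_incomplete = 190  # responses dropped because of partial data

# Per-hospital response rate in percent.
response_rate = {h: 100 * returned[h] / distributed[h] for h in distributed}

total_distributed = sum(distributed.values())
total_returned = sum(returned.values())
final_sample = total_returned - excluded_incomplete

# Rate for all returned questionnaires, and for those kept after exclusions.
overall_rate = 100 * total_returned / total_distributed
usable_rate = 100 * final_sample / total_distributed

for hospital, rate in response_rate.items():
    print(f"{hospital}: {rate:.1f}%")
print(f"Overall: {overall_rate:.2f}%  Usable: {usable_rate:.2f}%  n = {final_sample}")
```

The usable rate is computed against all 1,200 distributed questionnaires, which is one plausible reading of how an aggregate rate after exclusions would be defined.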

After exclusions, the usable response rate was 63.3% (760 of the 1,200 distributed questionnaires). The summary in Table 3 shows that, of the 760 patients, 58.9% were male and 41.1% were female. The largest share of responses came from the 31–40 age bracket, while the smallest came from the 61–70 age bracket. The imbalance across age groups was addressed by training the models with weighted parameters. Figure 3 illustrates the proposed systematic architecture, which predicts TB at an early stage by applying ML algorithms to the collected data sample.

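Table 3: Distribution of participants by age group.
| Age groups | ICD Kotri, Male | ICD Kotri, Female | LUMHS Jamshoro, Male | LUMHS Jamshoro, Female | Civil Hospital Hyderabad, Male | Civil Hospital Hyderabad, Female | Net ratio |
|---|---|---|---|---|---|---|---|
| 0–10 | | | | | | | |
| 11–20 | 24 | 10 | 20 | 8 | 14 | 10 | 9.30% |
| 21–30 | 34 | 22 | 43 | 17 | 17 | 14 | 25.80% |
| 31–40 | 44 | 38 | 42 | 25 | 39 | 22 | 37.40% |
| 41–50 | 36 | 31 | 19 | 11 | 28 | 23 | 24.70% |
| 51–60 | 30 | 22 | 17 | 19 | 18 | 22 | 2.30% |
| 61–70 | 15 | 14 | 3 | 1 | 5 | 3 | 0.50% |
| Total | 183 | 137 | 144 | 81 | 121 | 94 | 100% |
| Total (%) | 57.18% | 42.80% | 64% | 36% | 56.27% | 43.70% | |
| Grand total | 320 (ICD Kotri) | | 225 (LUMHS Jamshoro) | | 215 (Civil Hospital Hyderabad) | | 760 |
DOI: 10.7717/peerj-cs.2397/table-3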
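
The article does not specify which weighting scheme was used to handle the age-group imbalance. As an illustration only, the sketch below applies one common heuristic, inverse-frequency weights w_g = n / (k × n_g), where n is the sample size, k the number of groups and n_g the size of group g. The group sizes are obtained by summing the per-hospital counts in Table 3; the variable names and the choice of heuristic are assumptions.

```python
# Participants per age group, summed over the three hospitals and both sexes (Table 3).
age_group_counts = {
    "11-20": 24 + 10 + 20 + 8 + 14 + 10,
    "21-30": 34 + 22 + 43 + 17 + 17 + 14,
    "31-40": 44 + 38 + 42 + 25 + 39 + 22,
    "41-50": 36 + 31 + 19 + 11 + 28 + 23,
    "51-60": 30 + 22 + 17 + 19 + 18 + 22,
    "61-70": 15 + 14 + 3 + 1 + 5 + 3,
}

n_samples = sum(age_group_counts.values())
n_groups = len(age_group_counts)

# Inverse-frequency weights: rare groups get weights above 1, common groups below 1.
# ASSUMPTION: this heuristic is illustrative; the study does not name its scheme.
group_weights = {g: n_samples / (n_groups * c) for g, c in age_group_counts.items()}

for group, weight in group_weights.items():
    print(f"{group}: n = {age_group_counts[group]:3d}, weight = {weight:.2f}")
```

Under this scheme every record in a group would carry that group's weight as a per-sample weight during training, so each age group contributes equally to the loss in aggregate.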

Figure 3: The systematic architecture of the proposed ML-TB predictor.

Data analysis

The first crucial step in creating an ML model is data preprocessing, which prepares the raw data for the model, since real-world data usually contains noise, missing values, and unusable formats. Data preprocessing is therefore a significant procedure for cleaning and formatting the data in accordance with the required ML model, and it also supports the efficiency and accuracy of the model. The data analysis tools used in this research include MS Excel, Jupyter Notebook and the Python programming language. Box 1 depicts the systematic data preprocessing for the current study in sequential order, including the statistical information shown in Tables 4 and 5 and the visualizations in Figs. 4 and 5. The preprocessed dataset comprises the following features and their possible values (as described in Table 5): cough (varies in severity and frequency); cough duration (mild, occasional, intermittent, moderate, frequent or persistent); cough type (productive or non-productive); mucus (clear, bloody, or none), which helps to identify the type of infection; chest pain; breathe state (normal or dyspnea); body temperature (fever or normal); chills; pulmonary effusion (presence of fluid in the lungs); Erythrocyte Sedimentation Rate (ESR) value (disturbed or normal); diet (nutritional status); physique (healthy body structure or weight loss); energy adequacy (stamina level, fatigue or fit); smoking; crowding (living in cramped settings); and exposed (in direct contact with TB risk factors or pathogens).

Table 4: Imported dataset (760 rows × 18 columns).
| Patient ID | Cough | Cough duration | Cough type | Mucus | Chest pain | Breathe state | Body temperature | Chills | Pulmonary effusion | ESR value | Diet | Physique | Energy adequacy | Smoking | Crowding | Exposed | Prediction |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | No | 0 | No | No | Yes | Normal | Normal | No | No | Normal | Malnutrition | Weight loss | Fatigue | Yes | Yes | Yes | Not suspected |
| 2 | No | 0 | No | No | Normal | Normal | Normal | No | No | Disturbed | Malnutrition | Weight loss | Fatigue | No | Yes | Yes | Not suspected |
| 3 | No | 0 | Non productive | No | Yes | Normal | Normal | No | No | Disturbed | Malnutrition | Healthy | Fit | Yes | Yes | No | Not suspected |
| 4 | No | 0 | No | No | Yes | Dyspnea | Normal | No | No | Normal | Balanced | Weight loss | Fit | Yes | Yes | No | Not suspected |
| 5 | No | 0 | No | No | Normal | Normal | High | No | No | Normal | Malnutrition | Weight loss | Fatigue | No | Yes | Yes | Not suspected |
| 756 | Yes | 0 | Productive | No | Yes | Dyspnea | High | No | No | Disturbed | Balanced | Healthy | Fit | Yes | No | No | TB suspected |
| 757 | Yes | 3 | Productive | Bloody | Normal | Normal | High | No | No | Normal | Malnutrition | Healthy | Fit | Yes | Yes | No | TB suspected |
| 758 | Yes | 3 | Productive | Bloody | Yes | Dyspnea | Normal | No | No | Disturbed | Balanced | Healthy | Fit | Yes | Yes | No | TB suspected |
| 759 | Yes | 3 | Productive | Bloody | Yes | Normal | Normal | No | No | Disturbed | Malnutrition | Healthy | Fit | Yes | No | Yes | TB suspected |
| 760 | Yes | 3 | Productive | Bloody | Yes | Dyspnea | High | No | No | Disturbed | Malnutrition | Weight loss | Fatigue | Yes | Yes | No | TB suspected |
DOI: 10.7717/peerj-cs.2397/table-4
Table 5: Description of dataset features.
| Features | Possible values |
|---|---|
| Cough | {0 = No, 1 = Yes} |
| Cough duration | {0 = No cough, 0.5 = Mild, 1 = Occasional, 1.5 = Intermittent, 2 = Moderate, 2.5 = Frequent, 3 = Persistent} |
| Cough type | {0 = No, 1 = Non-Productive, 2 = Productive} |
| Mucus | {0 = Bloody, 1 = Clear, 2 = No} |
| Chest pain | {0 = No, 1 = Normal, 2 = Yes} |
| Breathe state | {0 = Dyspnea, 1 = Normal} |
| Body temperature | {0 = High, 1 = Normal} |
| Chills | {0 = No, 1 = Yes} |
| Pulmonary effusion | {0 = No, 1 = Yes} |
| ESR value | {0 = Disturbed, 1 = Normal} |
| Diet | {0 = Balanced, 1 = Malnutrition} |
| Physique | {0 = Healthy, 1 = Weight Loss} |
| Energy adequacy | {0 = Fatigue, 1 = Fit} |
| Smoking | {0 = No, 1 = Yes} |
| Crowding | {0 = No, 1 = Yes} |
| Exposed | {0 = No, 1 = Yes} |
DOI: 10.7717/peerj-cs.2397/table-5
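
The integer codes in Table 5 appear to follow the alphabetical order of the category labels, which is what a standard label encoder would produce. This is an inference from the table, not something the article states. The sketch below rebuilds the codes by sorting the labels and then encodes one record from Table 4 (patient 760). The names `CATEGORIES`, `ENCODERS` and `encode_record` are hypothetical, the ESR label is spelled "Disturbed" as in Table 4, and label capitalisation is aligned with Table 5.

```python
# Category labels per feature, as listed in Table 5 (ESR label spelled as in Table 4).
CATEGORIES = {
    "Cough": ["No", "Yes"],
    "Cough type": ["No", "Non-Productive", "Productive"],
    "Mucus": ["Bloody", "Clear", "No"],
    "Chest pain": ["No", "Normal", "Yes"],
    "Breathe state": ["Dyspnea", "Normal"],
    "Body temperature": ["High", "Normal"],
    "Chills": ["No", "Yes"],
    "Pulmonary effusion": ["No", "Yes"],
    "ESR value": ["Disturbed", "Normal"],
    "Diet": ["Balanced", "Malnutrition"],
    "Physique": ["Healthy", "Weight Loss"],
    "Energy adequacy": ["Fatigue", "Fit"],
    "Smoking": ["No", "Yes"],
    "Crowding": ["No", "Yes"],
    "Exposed": ["No", "Yes"],
}

# Alphabetical label encoding: code = position of the label in the sorted label list.
ENCODERS = {
    feature: {label: code for code, label in enumerate(sorted(labels))}
    for feature, labels in CATEGORIES.items()
}


def encode_record(record):
    """Map a dict of raw answers to numeric codes; cough duration is already numeric."""
    encoded = {}
    for feature, value in record.items():
        if feature == "Cough duration":
            encoded[feature] = float(value)
        else:
            encoded[feature] = ENCODERS[feature][value]
    return encoded


# Patient 760 from Table 4.
patient_760 = {
    "Cough": "Yes", "Cough duration": 3, "Cough type": "Productive", "Mucus": "Bloody",
    "Chest pain": "Yes", "Breathe state": "Dyspnea", "Body temperature": "High",
    "Chills": "No", "Pulmonary effusion": "No", "ESR value": "Disturbed",
    "Diet": "Malnutrition", "Physique": "Weight Loss", "Energy adequacy": "Fatigue",
    "Smoking": "Yes", "Crowding": "Yes", "Exposed": "No",
}

print(encode_record(patient_760))
```

Note that alphabetical codes carry no clinical ordering: for example, bloody mucus is coded 0 and no mucus 2, so models that treat the codes as magnitudes may benefit from a one-hot encoding instead.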

Figure 4: Dataset features.

Figure 5: Correlation matrix.

Results and discussion

This section provides the performance metrics, results and discussion of the generated ML-TB predictors. The present research adopted five supervised ML algorithms. A decision tree (DT) is a top-down hierarchical structure with three types of nodes: the root (top-most) node, internal (non-leaf) nodes, and terminal (leaf) nodes. Each internal node tests a condition, each branch represents a test outcome, and each terminal node represents a class label. DTs handle both classification and regression problems and both numerical and categorical data. Gaussian naïve Bayes (GNB) is a variant of naïve Bayes that assumes the features follow a normal distribution, making it appropriate for continuous data. Logistic regression (LRC) predicts a categorical dependent variable from independent variables by passing a linear combination of the inputs through the non-linear logistic (sigmoid) function. Its parameters are fitted by maximum likelihood estimation, and it can be binary, multinomial, or ordinal.
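
For reference, the standard textbook forms of the two probabilistic classifiers described above are given below. The symbols are introduced here for illustration and do not appear in the original article: $x_i$ is the $i$-th feature, $c$ a class label, $\mu_{ic}$ and $\sigma_{ic}^2$ the mean and variance of feature $i$ within class $c$, and $w$, $b$ the logistic regression weights and bias.

Gaussian naïve Bayes models each feature likelihood as a normal density and predicts the class with the largest posterior score:

$$P(x_i \mid y = c) = \frac{1}{\sqrt{2\pi\sigma_{ic}^2}} \exp\left(-\frac{(x_i - \mu_{ic})^2}{2\sigma_{ic}^2}\right), \qquad \hat{y} = \arg\max_{c} \; P(y = c) \prod_{i} P(x_i \mid y = c)$$

Binary logistic regression maps a linear combination of the features to a probability through the sigmoid function:

$$P(y = 1 \mid x) = \frac{1}{1 + e^{-(w^{\top} x + b)}}$$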

Neural networks (NNs), or artificial NNs, are layered networks of artificial neurons (nodes) with three major components: learning rules, network architecture (one input layer, one output layer, and one or more hidden layers), and an activation function (such as linear, sigmoid, or hyperbolic tangent). Each neuron computes the weighted sum of its inputs plus a bias, passes it through the activation function, and generates the corresponding output. AdaBoost (Adaptive Boosting) is an ensemble method that combines numerous weak classifiers into a single strong classifier. It was the first successful boosting technique devised for binary classification. These five ML algorithms were chosen for the current study because of their suitability for classification tasks and their prior performance in medical diagnostics. More specifically, DT was picked for its interpretability, GNB for its simplicity and efficacy with small datasets, LRC for its simplicity in binary classification, AdaBoost for its ability to handle imbalanced datasets, and NN for its ability to model complex patterns. For the performance analysis of each classifier, the research used five different metrics:

Accuracy is the fraction of all predictions that the ML model gets right. Mathematically,

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP stands for true positive, TN stands for true negative, FP stands for false positive and FN stands for false negative.

Precision is the fraction of positive predictions that are actually positive. Mathematically,

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

where TP stands for true positive and FP stands for false positive.

Recall (sensitivity or true positive rate) is the fraction of actual positives that the ML model correctly identifies. Mathematically,

$$\mathrm{Recall} = \mathrm{Sensitivity} = \frac{TP}{TP + FN}$$

where TP stands for true positive and FN stands for false negative.

F1 score (F-measure or F score) is the harmonic mean of the precision and recall values. Its best value is 1 and its worst value is 0. Mathematically,

$$F1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

A confusion matrix is an N×N matrix that summarizes the performance of a classification model, where N represents the number of target classes. It gives a comprehensive picture of the model's accuracy and of the errors it makes by comparing the actual target values with the predicted values, as shown in Tables 6 and 7. Table 7 presents the confusion matrices for each classifier when diagnosing instances as TB positive (1) or TB negative (0).

Table 6: Confusion matrix.
| Actual values | Predicted values: 0 (False) | Predicted values: 1 (True) |
|---|---|---|
| 0 (False) | True Negative (TN) | False Positive (FP) |
| 1 (True) | False Negative (FN) | True Positive (TP) |
DOI: 10.7717/peerj-cs.2397/table-6
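Table 7: Confusion matrix.
| Classifier | Actual value | Predicted 0 | Predicted 1 |
|---|---|---|---|
| (a) DT classifier | 0 | 94 | 11 |
| | 1 | 7 | 116 |
| (b) GNB classifier | 0 | 92 | 13 |
| | 1 | 12 | 111 |
| (c) AdaBoost classifier | 0 | 100 | 5 |
| | 1 | 10 | 113 |
| (d) NN classifier | 0 | 96 | 9 |
| | 1 | 7 | 116 |
| (e) LRC classifier | 0 | 94 | 11 |
| | 1 | 11 | 112 |
DOI: 10.7717/peerj-cs.2397/table-7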
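
The four metric definitions above can be applied directly to the counts in Table 7. The Python sketch below does so for all five classifiers, reading each matrix with the layout of Table 6 (rows are actual values, columns are predicted values). The dictionary and function names are illustrative and are not taken from the study's code.

```python
# Confusion-matrix counts from Table 7, stored as (TN, FP, FN, TP) per classifier.
CONFUSION = {
    "DT": (94, 11, 7, 116),
    "GNB": (92, 13, 12, 111),
    "AdaBoost": (100, 5, 10, 113),
    "NN": (96, 9, 7, 116),
    "LRC": (94, 11, 11, 112),
}


def metrics(tn, fp, fn, tp):
    """Return accuracy, precision, recall and F1 as percentages rounded to 2 decimals."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    values = {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
    return {name: round(100 * value, 2) for name, value in values.items()}


results = {clf: metrics(*counts) for clf, counts in CONFUSION.items()}
for clf, m in results.items():
    print(clf, m)
```

Each matrix sums to 228 test records, which is 30% of the 760-record sample, and the computed values agree with the accuracy, precision, recall and F1 figures reported in the article.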

The experiments were carried out in Jupyter Notebook (Python). The ML-TB predictor models were trained using a 70-30 train-test split, hyperparameters were tuned via grid search, and cross-validation was applied to ensure robustness. The individual and comparative performance of each classifier is shown in Figs. 6 and 7, respectively. The implemented classifiers DT, GNB, LRC, AdaBoost and NN obtained accuracy rates of 92.11%, 89.04%, 90.35%, 93.42% and 92.98%, respectively; precision rates of 91.34%, 89.52%, 91.06%, 95.76% and 92.8%, respectively; recall rates of 94.31%, 90.24%, 91.06%, 91.87% and 94.31%, respectively; and F1 scores of 92.8%, 89.88%, 91.06%, 93.78% and 93.55%, respectively. The outcomes indicate that each implemented algorithm successfully predicted TB. However, statistical testing revealed significant differences between classifiers, with p-values < 0.05 for the accuracy and precision metrics. Confidence intervals were calculated to measure the robustness of the outcomes. Feature importance analysis, including Gini, permutation and SHAP values, showed that persistent cough and weight loss were the most influential symptoms in predicting TB. The AdaBoost ensemble performed best according to the accuracy, precision and F1 score metrics. However, sensitivity (recall) is prioritized because of its importance in identifying as many TB cases as possible. Thus, the DT and NN classifiers are considered the better approaches in terms of recall.
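
The article reports a 70-30 split and cross-validation but does not give the random seed, the number of folds, or the search grid. The sketch below illustrates only the index bookkeeping behind such a protocol, using the Python standard library; the seed of 42, the five folds and both function names are assumptions made for illustration.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.30, seed=42):
    """Shuffle record indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


train_idx, test_idx = train_test_split_indices(760)
print(len(train_idx), len(test_idx))

# Cross-validation runs on the training portion only, so the test set stays unseen
# while candidate hyperparameter settings are compared.
for fold_no, (tr, va) in enumerate(k_fold_indices(train_idx), start=1):
    print(f"fold {fold_no}: train = {len(tr)}, validation = {len(va)}")
```

With 760 records, a 30% hold-out gives 228 test records, which matches the total of each confusion matrix in Table 7. In a grid search, every candidate setting would be scored by its average validation performance across the folds, and only the selected model would be evaluated on the held-out test set.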

Figure 6: Performance analysis of individual ML-TB predictor models.

Figure 7: Comparative analysis of ML-TB predictor models based on performance metrics.

Conclusions

In summary, this research represents a substantial advance in TB diagnosis tailored to the context of Pakistan, particularly Sindh, where the prevalence of TB is exceptionally high. The research addressed an essential gap in automated TB diagnosis by focusing on context-specific needs and acquiring a dataset from three notable hospitals in Sindh: ICD Kotri, LUMHS Jamshoro, and Civil Hospital Hyderabad. The analysis, covering ML algorithms such as DT, GNB, LRC, AdaBoost, and NN, yielded positive outcomes. While AdaBoost attained the highest accuracy (93.42%), precision (95.76%), and F1 score (93.78%), the DT and NN classifiers achieved the highest recall (94.31%). This study not only improves TB detection with ML algorithms but also promotes context-dependent research in healthcare. The study has limitations, including a limited feature set and the absence of evaluation measures such as the ROC curve (AUC) and calibration metrics; future work intends to address these by augmenting the dataset and leveraging aML tools like WEKA. In addition, future work aims to widen the scope to include extra-pulmonary TB manifestations, improve the model with reinforcement learning, and develop an ML-TB diagnostic tool for seamless integration into medical settings. By turning research findings into practical tools, the study aims to positively influence healthcare provision and contribute to better patient outcomes.

Integrating ML into TB diagnostics can transform early detection and treatment, leading to good health and well-being. By implementing accurate ML models, healthcare systems can improve diagnostic precision. Pilot software deployments in high-burden regions will help refine these tools, while comprehensive training for healthcare providers will support operational use and data management. Successful implementation requires robust monitoring frameworks and improved data collection and privacy standards. Public awareness campaigns and patient engagement will further drive acceptance, while continued research and global partnerships will sustain innovation. Implementing such a policy would strengthen the fight against TB and ultimately reduce the TB mortality rate.

Supplemental Information

Questionnaire (English).

DOI: 10.7717/peerj-cs.2397/supp-4