Collection of Imam Riadi's International Journals
International Journal of
Computer Science
& Information Security
IJCSIS
ISSN (online): 1947-5500
Please consider contributing to, and/or forwarding to the appropriate groups, the following opportunity to submit and publish original scientific results.
The topics suggested by this issue can be discussed in terms of concepts, surveys, state of the art, research,
standards, implementations, running experiments, applications, and industrial case studies. Authors are invited
to submit complete unpublished papers, which are not under review in any other conference or journal in the
following, but not limited to, topic areas.
See authors guide for manuscript preparation and submission guidelines.
Indexed by Google Scholar, DBLP, CiteSeerX, Directory for Open Access Journal (DOAJ), Bielefeld
Academic Search Engine (BASE), SCIRUS, Scopus Database, Cornell University Library, ScientificCommons,
ProQuest, EBSCO and more.
Deadline: see web site
Notification: see web site
Revision: see web site
Publication: see web site
A journal cannot be great without a dedicated editorial team of editors and reviewers.
On behalf of IJCSIS community and the sponsors, we congratulate the authors and thank the
reviewers for their outstanding efforts to review and recommend high quality papers for
publication. In particular, we would like to thank the international academia and researchers for
continued support by citing papers published in IJCSIS. Without their sustained and unselfish
commitments, IJCSIS would not have achieved its current premier status, making sure we deliver
high-quality content to our readers in a timely fashion.
“We support researchers to succeed by providing high visibility & impact value, prestige and
excellence in research publication.” We would like to thank you, the authors and readers, the
content providers and consumers, who have made this journal the best possible.
Open Access This Journal is distributed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium,
provided you give appropriate credit to the original author(s) and the source.
Bibliographic Information
ISSN: 1947-5500
Monthly publication (Regular Special Issues)
Commenced Publication since May 2009
1. PaperID 31011701: Machine Learning Techniques to Recognize Multilingual Characters using HOG
Features (pp. 1-8)
2. PaperID 31011703: Image Steganography between Firefly and PSO Algorithms (pp. 9-21)
* Ziyad Tariq Mustafa Al-Ta’i, * Jamal Mustafa Abass, ** Omar Y. Abd Al-Hameed
* Department of Computer Science - College of Science - University of Diyala
** Computer Science Department – University of Garmian
3. PaperID 31011704: Farsi Text Localization in Natural Scene Images (pp. 22-30)
M. Samaee, Department of Electrical and Computer Engineering, Amirkabir University of Technology, Tehran, Iran
H. Tavakoli, Department of Electrical Engineering, Shahed university of Tehran, Tehran, Iran
4. PaperID 31011705: In Silico Screening and Pathway Analysis of Disease-Associated nsSNPs of MITF Gene:
A study on Melanoma (pp. 31-54)
Muhammad Naveed (*1,2), Fiza Anwar (1), Syeda Khushbakht kazmi (1), Fadwa Tariq (1), Sana Tehreem (1),
Ghulam Abbas (1), Humayun Irshad (2), Pervez Anwar (2), Aitizaz Ali (3), Muzamil Mehboob (3)
(1) Department of Biochemistry and Biotechnology, University of Gujrat, Pakistan 50700
(2) Department of Biotechnology, Faculty of Sciences, University of Gujrat, Sialkot campus, Pakistan 51310
(3) Department of Computer Sciences, University of Gujrat, Sialkot campus, Pakistan 51310
5. PaperID 31011709: Ticket based Secure Authentication Scheme using NTRU Cryptosystem in Wireless
Sensor Network (pp. 55-66)
Iqbaldeep Kaur (1), Harnain Kour (2), Dr. Amit Verma (1*)
(1) Associate Professor, Computer Science & Engineering, Chandigarh Engineering College, Landran, Punjab, India
(2) M. Tech. Research Scholar, Computer Science & Engineering, Chandigarh Engineering College, Landran, Punjab, India
(1*) Professor and HOD, Computer Science & Engineering, Chandigarh Engineering College, Landran, Punjab, India
Hazel Esperanza Loya Larios, Raúl Santiago Montero, David Asael Gutiérrez Hernández, Agustino Martínez Antonio, Luis Ernesto Mancilla Espinoza
Tecnológico Nacional de México, Instituto Tecnológico de León, División de Estudios de Posgrado e Investigación, Av. Tecnológico S/N - Fracc. Industrial Julián de Obregón, León, Guanajuato, México - C.P. 37290
7. PaperID 31011713: LSBSM: A Novel Method for Identification of Near Duplicates in Web Documents (pp.
72-78)
8. PaperID 31011718: A Method for Arabic Documents Plagiarism Detection (pp. 79-85)
Yahya A. Abdelrahman, Department of Computer Science, Sudan University of Science and Technology, Khartoum,
Sudan
Ahmed Khalid, Department of Computer, Najran University, Najran KSA
Izzeldin M. Osman, Department of Computer Science, Sudan University of Science and Technology Khartoum,
Sudan
9. PaperID 31011719: Controlling Future Intelligent Smart Homes using Wireless Integrated Network Systems
(pp. 86-112)
Rustom Mamlook *, Omer Fraz Khan, Mohannad Maher Haddad, Hatem Salim Koofan, Said Mahad Tabook
Department of Electrical & Computer Engineering, Dhofar University, Sultanate of Oman
10. PaperID 31011720: Role of Stakeholders in Requirement Change Management (pp. 113-117)
11. PaperID 31011722: Steganography in DCT-based Compressed Images through Modified Quantization and
Matrix Encoding (pp. 118-126)
12. PaperID 31011724: A Comparison Study on Text Detection in Scene Images Based on Connected
Component Analysis (pp. 127-139)
Abdel-Rahiem A. Hashem (1), Mohd. Yamani Idna Idris (2), Ahmed Gawish (3), Moumen T. ElMelegy (4)
(1) Mathematics Department, Faculty of science, Assiut University, Assiut 71516, Egypt. Exchange student program
in UM university, Malaysia
(2) Faculty of Computer Science and Information Technology, University of Malaya, Malaysia.
(3) Vision and Image Processing (VIP) Lab, Department of Systems Design Engineering, University of Waterloo,
Waterloo, Canada
(4) Electrical Engineering Department, Assiut University, Assiut 71516, Egypt
13. PaperID 31011726: Searching of a Route through Implementation of Neural Network in Visual Prolog (pp.
140-144)
Elitsa Zdravkova,
Department of Computer Systems and Technologies, Shumen University "Konstantin Preslavsky", Shumen, Bulgaria
14. PaperID 31011727: Prediction of Nephrolithiasis Based on Extracted Features of X-Ray Images Using
Artificial Neural Networks (pp. 145-156)
15. PaperID 31011729: Multimodal Cumulative Class-Specific Linear Discriminant Regression for Cloud
Security (pp. 157-165)
Savitha G., Computer Science and Engineering, B.N.M. Institute of Technology, Bangalore, India
Dr. Vibha Lakshmikantha, Computer Science and Engineering, B.N.M. Institute of Technology, Bangalore, India
Dr. K. R. Venugopal, Computer Science and Engineering, Visvesvaraya College of Engineering, Bangalore, India
16. PaperID 31011730: Generic Architecture for Information Availability (GAIA) a High Level Agent Oriented
Methodology (pp. 166-171)
17. PaperID 31011734: Microstrip Patch Antenna with Defected Ground for L, S and C Band Applications (pp.
172-179)
18. PaperID 31011739: Extracting Words’ Polarity with Definition and Examples (pp. 180-190)
19. PaperID 31011740: An Efficient and Secure One Way Cryptographic Hash Function with Digest Length of
1024 Bits (pp. 191-198)
Justice Nueteh Terkper, Department of Computer Science, Kwame Nkrumah University of Science and Technology,
Kumasi, Ghana
James Ben Hayfron-Acquah, Department of Computer Science, Kwame Nkrumah University of Science and
Technology, Kumasi, Ghana
Frimpong Twum, Department of Computer Science, Kwame Nkrumah University of Science and Technology,
Kumasi, Ghana
20. PaperID 31011744: Natural Terrain Feature Identification using Integrated Approach of Cuckoo Search
and Intelligent Water Drops Algorithm (pp. 199-215)
21. PaperID 31011747: Tree Based Cluster Energy Aware Routing In Wireless Sensor Networks (pp. 216-226)
22. PaperID 31011750: Security-as-a-Service in Cloud Computing (SecAAS) (pp. 227-230)
Baby Marina, Information Technology, SBBU, Shaheed Benazirabad
Dr. Irfana Memon, CSE, QUEST, Nawabshah
Fatima, telecommunication, QUEST, Nawabshah
23. PaperID 31011752: A New Enhanced Automated Fuzzy-Based Rough Decision Model (pp. 231-238)
24. PaperID 31011757: Secrecy Capacity of a Rayleigh Fading Channel under Jamming Signal (pp. 239-246)
25. PaperID 31011759: ETL Based Query Processing Architecture for Sensornet (pp. 247-254)
Dileep Kumar, Department of Information Media, The University of Suwon, Hwaseong-si South Korea
Jangyoung Kim, Department of Computer Science, The University of Suwon, Hwaseong-si South Korea
26. PaperID 31011760: Link Prediction in Social Networks Based on Similarity Criteria and Behavioral
Patterns of Users (pp. 255-264)
Farnaz Sabzevari *, Islamic Azad University, Damavand Branch, Department of computer, Tehran, Iran
Ali HaroonAbadi, Islamic Azad University, Central Tehran Branch, Department of computer, Tehran, Iran
Javad Mir Abedini, Islamic Azad University, Central Tehran Branch, Department of computer, Tehran, Iran
27. PaperID 31011761: Ear Biometric System Using Speeded-up Robust Features and Principal Component
Analysis (pp. 265-269)
Dr. Habes Alkhraisat, Department of Computer Science, Al-Balqa Applied University, Asalt, Jordan
28. PaperID 31011764: Simulation of Various QAM Techniques Used in DVBT2 & Comparison for Various
BER Vs SNR (pp. 270-276)
Sneha Pandya, C. U. Shah University, Wadhwan.
Nimit Shah, Electrical Engg, C. U. Shah College of Engg & Technology.
Dr. C. R. Patel, V.V.P. Engineering College, Rajkot
29. PaperID 31011765: Assessing e-Government Systems Success in Jordan (e-JC): A Validation of TAM and IS
Success Model (pp. 277-304)
30. PaperID 31011766: Mining Student Data Using CRISP-DM Model (pp. 305-316)
31. PaperID 31011769: Malware-Free Intrusion: A Novel Approach to Ransomware Infection Vectors (pp. 317-
325)
Aaron Zimba, Department of Computer Science and Technology, University of Science and Technology Beijing,
Beijing 100083, China
32. PaperID 31011770: Network Forensics for Detecting Flooding Attack on Web Server (pp. 326-331)
33. PaperID 31011771: Adaptive Scheme for Application Methods Offloading in Mobile Cloud Computing (pp.
332-339)
34. PaperID 31011773: Spectral Unmixing From Hyperspectral Imagery Using Modified Gram Schmidt
Orthogonalization and NMF (pp. 340-345)
Neetu N. Gyanchandani, Department of Electronics Engineering, Research Scholar, GHRCE, Nagpur, India
Dr. A. A. Khurshid, HOD, Electronics Engineering, RCOEM, Nagpur, India
Dr. Sanjay Dorle, HOD, Department of Electronics Engineering, GHRCE, Nagpur, India
35. PaperID 31011774: Hyperspectral Image Compression Methods: A Review (pp. 346-350)
Neetu N. Gyanchandani, Department of Electronics Engineering, Research Scholar, GHRCE, Nagpur, India
Dr. A. A. Khurshid, HOD, Electronics Engineering, RCOEM, Nagpur, India
Dr. Sanjay Dorle, HOD, Department of Electronics Engineering, GHRCE, Nagpur,India
36. PaperID 31011777: MQA: Mobility’s Quantification Algorithm in AODV Protocol (pp. 351-361)
37. PaperID 31011778: Testing Coverage based Software Reliability Models: Critical Analysis and Ranking
based on Weighted Criterion (pp. 362-371)
Manohar Singh, Research Scholar, Department of Computer Science, OPJS University, Churu, Rajasthan, India
Dr. Vaibhav Bansal, Associate Professor, Department of Computer Science, OPJS University, Churu, Rajasthan,
India
38. PaperID 31011783: Accelerating a Secure Communication Channel Construction Using HW/ SW Co-design
(pp. 372-377)
39. PaperID 31011789: An Efficient Zone-Based Routing Protocol for WSN (pp. 378-396)
40. PaperID 31011790: Zone Hierarchical Routing Protocol with Data Aggregation (pp. 397-405)
Kamal Beydoun,
Department of Computer Science, Lebanese University, Beirut, Lebanon
Muhammad Itqan Mazdadi, Department of Informatics Engineering, Islamic University of Indonesia, Yogyakarta,
Indonesia
Imam Riadi, Department of Information System, Ahmad Dahlan University, Yogyakarta, Indonesia
Ahmad Luthfi, Department of Informatics Engineering, Islamic University of Indonesia, Yogyakarta, Indonesia
42. PaperID 31121621: Evaluating Maintainability of Open Source Software: A Case Study (pp. 411-429)
Feras Hanandeh (1), Ahmad A. Saifan (2), Mohammed Akour (3), Noor Al-Hussein (4), Khadijah Shatnawi (5)
(1) The Hashemite University, Zarqa, Jordan.
(2, 3) Software Engineering Department, Faculty of IT, Yarmouk University, Irbid, Jordan.
(4, 5) CIS Department, Faculty of IT, Yarmouk University, Irbid, Jordan.
43. PaperID 31121622: Classification of Human Vision Discrepancy during Watching 2D and 3D Movies Based
on EEG Signals (pp. 430-436)
44. PaperID 31121623: A New Brain-Computer Interface System Based on Classification of the Gaze on Four
Rotating Vanes (pp. 437-443)
45. PaperID 301116179: Decentralized Access Control with Anonymous Authentication for Secure Data
Storage on Cloud (pp. 444-449)
Shraddha Mokle, Department of Computer Engineering, Modern Education Society's College of Engineering, Pune,
India
Prof. Nuzhat F Shaikh, Department of Computer Engineering, Modern Education Society's College of Engineering,
Pune, India
46. PaperID 301116214: Data partition and Aggregation in MapReduce to Improve Processing time (pp. 450-
456)
47. PaperID 311016184: Studying the Numerical Methods for Calculating Bi-Phase Fluid Flow (pp. 457-463)
48. PaperID 30111603: A Novel Simple Method to Select Optimal k in k-Nearest Neighbor Classifier (pp. 464-
469)
49. PaperID 301116112: A proposed Method for Face Image Edge Detection Using Markov Basis (pp. 470-476)
50. PaperID 301116229: Face Recognition Age Invariant: A Closer Look (pp. 477-482)
51. PaperID 31011782: Comparative Study for Selection of an Item Based on Multi-Criteria DSS (pp. 483-492)
52. PaperID 31011787: Implementation of Indian Sign Language Recognition System using Scale Invariant
Feature Transform (SIFT) (pp. 493-507)
53. PaperID 301116196: Indexes’ Optimal Selection for Data Warehouse Quality (pp. 508-514)
54. PaperID 301116231: A Survey on Different Methods of Software Cloning and Detection (pp. 515-535)
55. PaperID 31011776: Performance comparison of Adaptive OFDM Pre-and Post-FFT Beamforming System
(pp. 536-543)
56. PaperID 31011735: Designing of Cloud Storage using Python Language (pp. 544-552)
Shipra Goel
Abstract- Link prediction is one of the most important and common tasks in social network analysis and network graph analysis. Link prediction estimates the possibility of a connection being established between two vertices that currently have no relationship, using the information available in the network and knowledge of the connections that already exist. A variety of link prediction methods has been presented for social networks. The features used to determine similarity, extracted from the network graph, include local and global characteristics: local features have the advantage of speed, while global features have the advantage of precision. In this study, the aim is to implement a system for link prediction between users by clustering users on their profile properties and then applying the Friend Link algorithm within each cluster. Using these techniques, the precision of prediction can be raised: compared with the spectral link method, the precision of the proposed method improved by close to 4% on average.
I. INTRODUCTION
Social networks are dynamic networks in which the members, and the links between them, are constantly increasing. The chain of these links is incomplete, either because of the way the network evolves or because some relationships are simply not reflected in the network and so appear torn or lost. Therefore, one of the important issues in social networks is link prediction: predicting the presence or absence of a link between two vertices of a social network in the future. It is an important tool for social network analysis.
Graphs are used to represent social networks: nodes play the role of members, and edges represent the communication between these people [1][2]. In this paper, we create a recommender system on the web using graph theory and a similarity measure. The second section covers the basic concepts and definitions needed in the following parts. The work performed so far on the subject of this study is analyzed in the third section. The proposed technique is described in section four, the simulation and evaluation of the proposed method are described in the fifth section, and the conclusion is given in the sixth section.
255 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
II. BACKGROUND
This section introduces the general concepts and definitions that will be used in the following sections.
A. LINK MINING
The topic of link mining was first proposed formally in [3][4]; link prediction is considered one of its sub-fields. Sometimes links arise accidentally in a network due to errors. These incorrect links can disrupt the network structure and its study; with the help of link prediction, they can be identified and removed from the network [5].
Link prediction solutions can be divided into three groups:
First group: solutions based on similarity criteria. These methods use the structural characteristics of the network graph to recognize similarities between network nodes, and are themselves of three kinds: local, quasi-local, and global.
Second group: solutions based on maximum likelihood. While studying the network structure, these solutions extract rules and features that increase the probability of links.
Third group: solutions based on statistics. These methods use statistical models and the relevant distributions for link prediction.
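As an illustration of the first group, the following minimal sketch (not from the paper) computes three classic local similarity criteria from adjacency sets: common neighbors, Jaccard, and Adamic-Adar. The toy graph is invented.

```python
import math

def common_neighbors(adj, x, y):
    # Local criterion: number of neighbors shared by x and y.
    return len(adj[x] & adj[y])

def jaccard(adj, x, y):
    # Local criterion: shared neighbors normalized by the union of neighbors.
    union = adj[x] | adj[y]
    return len(adj[x] & adj[y]) / len(union) if union else 0.0

def adamic_adar(adj, x, y):
    # Local criterion: rare shared neighbors contribute more weight.
    return sum(1.0 / math.log(len(adj[z]))
               for z in adj[x] & adj[y] if len(adj[z]) > 1)

# Toy graph stored as adjacency sets.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}
print(common_neighbors(adj, "b", "d"))  # b and d share a and c -> 2
```

Local criteria such as these only look one hop around each pair, which is why they are fast but less precise than global measures.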
D. CLUSTERING
Clustering is the process of grouping objects into clusters such that the members of each cluster have the maximum similarity to one another and the minimum similarity to members of other clusters [7]. These methods are divided into two groups:
A group that represents each cluster by the central point (mean) of its data, like the K-means algorithm [8][9].
A group that represents each cluster by the data point nearest to the center of the cluster; the K-medoid algorithm belongs to this group.
E. RECOMMENDER SYSTEMS
Recommender systems are systems that help users find and select their desired items. Naturally, such a system cannot make recommendations without accurate information about users and their desired items (for example movies, music, books, and so on) [10].
In this part, the work performed in the field of data clustering is reviewed.
The profile features used, with their explanations, include: age; weight; height; interest in music ("I like music"); interest in movies ("I like movies"); and relation to children.
clustering. Secondly, many features in practice played no role in increasing the efficiency and in fact reduced it. Therefore, the precision and efficiency of clustering can be increased by identifying these features.
D. PROCESS OF CLUSTERING
a) Obtaining the centers of the clusters, which are in fact the mean points of each cluster.
b) Assigning each data point to the cluster whose center is the shortest distance away.
c) In the simple form of this method, first as many points as the required number of clusters are selected at random. The data are then assigned to one of these clusters according to similarity, giving the initial clusters. In each repetition, the new cluster centers are calculated as the average of each cluster's data, and the data are assigned again to the new clusters. This process continues until the assignments no longer change. The objective function is shown in equation (1):
J = Σ_{j=1}^{k} Σ_{x_i ∈ C_j} ‖x_i − c_j‖²    (1)
where c_j is the center of cluster j.
Steps b and c are repeated until there is no further change in the cluster centers.
Calculating the distance between two data points is central to clustering: the distance tells us how close two points are, and accordingly whether they should be placed in the same cluster.
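A minimal sketch of this assign-and-update loop with Euclidean distance follows. It is an illustration only; the paper's experiments use Weka's K-means, and the point values here are invented.

```python
import math
import random

def euclidean(p, q):
    # How "close" two data points are, per the distance discussion above.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def kmeans(points, k, iters=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)               # random initial centers
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                          # assign each point to the
            j = min(range(k), key=lambda i: euclidean(p, centers[i]))
            clusters[j].append(p)                 # nearest cluster center
        new_centers = [
            tuple(sum(vals) / len(cl) for vals in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)      # recompute centers as means
        ]
        if new_centers == centers:                # stop when nothing changes
            break
        centers = new_centers
    return centers, clusters

pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
centers, clusters = kmeans(pts, 2)
```

With two well-separated groups of points, the loop converges in a few iterations regardless of which initial centers are drawn.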
In the next section, a new similarity measure is defined to express the proximity between nodes of the graph. If x and y are two nodes of a graph and Sim is a function that holds similarities, then the higher the similarity score between two nodes, the higher the probability that they will become friends:

Sim(x, y) = Σ_{i=2}^{L} (1 / (i − 1)) · |paths_{x,y}^{i}| / Π_{j=2}^{i} (n − j)

wherein L is the maximum length of a path considered between the nodes of the graph, n is the number of nodes, and |paths_{x,y}^{i}| is the number of paths of length i between x and y.
In this algorithm, cyclic routes are not considered in the similarity measurement. The new matrix therefore contains the similarity of each pair of users according to the aforementioned relationship. Eventually, the users with the highest ratings are suggested to the target user.
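A minimal sketch of this Friend Link-style score follows (the formula follows Papadimitriou et al. [13]; the toy graph and the choice L = 2 are illustrative, not from the paper's data):

```python
def count_simple_paths(adj, x, y, length):
    # Number of cycle-free paths with exactly `length` edges from x to y.
    def dfs(node, remaining, visited):
        if remaining == 0:
            return 1 if node == y else 0
        return sum(dfs(nxt, remaining - 1, visited | {nxt})
                   for nxt in adj[node] if nxt not in visited)
    return dfs(x, length, {x})

def friendlink(adj, x, y, L):
    # Sim(x, y) = sum_{i=2..L} 1/(i-1) * |paths_xy^i| / prod_{j=2..i} (n - j)
    n = len(adj)
    score = 0.0
    for i in range(2, L + 1):
        norm = 1.0
        for j in range(2, i + 1):
            norm *= (n - j)                # number of possible length-i paths
        score += count_simple_paths(adj, x, y, i) / ((i - 1) * norm)
    return score

# A 4-node square graph: a-b, a-c, b-d, c-d.
adj = {"a": {"b", "c"}, "b": {"a", "d"}, "c": {"a", "d"}, "d": {"b", "c"}}
score = friendlink(adj, "a", "d", L=2)    # two length-2 paths: a-b-d, a-c-d
```

Because the path count is normalized by the number of possible paths of each length, shorter paths dominate the score, matching the intuition that close nodes are more likely to become friends.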
A. IMPLEMENTATION TOOLS
The proposed method is implemented using Matlab and Weka. Clustering is performed with Weka, and the results are validated in Matlab. Table 2 shows the silhouette factor for K-means clustering with different values of K; according to the results, K = 6 gives the best clustering.
a) A dataset of users' profiles, including gender, age, interests, education status, and so on, used for clustering the users.
b) A dataset of the communications between users, which defines the links among them; this part is used for link prediction. We work on 2000 records from this database to evaluate our proposed method.
TABLE 2: SILHOUETTE FACTOR FOR DIFFERENT VALUES OF K
K    Silhouette
5    0.3425
6    0.4155
7    0.3749
8    -0.1224
9    -0.0322
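The silhouette factor used to select K can be computed as below. This is a self-contained sketch with invented 2-D points, not the Matlab validation code.

```python
import math

def silhouette(points, labels):
    # Mean silhouette: for each point, a = mean distance to its own cluster,
    # b = mean distance to the nearest other cluster; score = (b - a) / max(a, b).
    def dist(p, q):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(p, q)))
    n = len(points)
    scores = []
    for i in range(n):
        same = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not same:
            scores.append(0.0)  # singleton clusters get a neutral score
            continue
        a = sum(dist(points[i], points[j]) for j in same) / len(same)
        b = min(
            sum(dist(points[i], points[j]) for j in others) / len(others)
            for lab in set(labels) if lab != labels[i]
            for others in [[j for j in range(n) if labels[j] == lab]]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / n

pts = [(0, 0), (0, 1), (5, 5), (5, 6)]
# A clean 2-cluster split scores higher than a mixed-up assignment.
good = silhouette(pts, [0, 0, 1, 1])
bad = silhouette(pts, [0, 1, 0, 1])
```

Running this for several candidate values of K and keeping the maximum mean silhouette is exactly the selection procedure behind Table 2.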
The first part relates to the clustering of users' features, which were listed in the previous section. Figure 1 shows the clustering.
Precision: the recognition precision for each cluster, printed in the output; it is obtained by dividing the number of correctly estimated links by the total number of estimated links in the test data.
Recall: this value is also reported per cluster; it is obtained by dividing the number of correctly estimated links by the number of correct links in the test data.
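Both definitions reduce to set overlap between the estimated and the true links; a minimal sketch with invented link sets:

```python
def precision_recall(predicted, actual):
    # predicted: set of estimated links; actual: set of correct links in the test data.
    correct = predicted & actual
    precision = len(correct) / len(predicted) if predicted else 0.0
    recall = len(correct) / len(actual) if actual else 0.0
    return precision, recall

pred = {("a", "b"), ("a", "c"), ("b", "d")}
true_links = {("a", "b"), ("b", "d"), ("c", "d")}
p, r = precision_recall(pred, true_links)  # 2 of 3 predictions correct; 2 of 3 true links found
```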
FIGURE 1: CLUSTERS

Cluster    Number of members
1    450
2    256
3    379
4    385
5    285
6    246
Cluster    Precision    Recall
1    0.938    0.939
2    0.898    0.907
3    0.891    0.899
4    0.928    0.925
5    0.896    0.898
6    0.879    0.895
FIGURE 2: COMPARISON OF THE PRECISION OF THE PROPOSED METHOD WITH SPECTRAL LINK
FIGURE 3: COMPARISON OF THE RECALL OF THE PROPOSED METHOD WITH SPECTRAL LINK
As can be seen, the precision and recall of the proposed method exceed those of the Spectral Link algorithm in every cluster.
VI. CONCLUSION
The main purpose of this study is to improve the precision and efficiency of predicting the similarity of people and of making suggestions to them in a social network. The results show that our proposed method reduces complexity and increases precision. It therefore also increases the precision of suggestions in social networks, making it possible to identify the people most similar to a given user in each cluster. One clustering technique was used in this study, and then the Friend Link algorithm was applied within each cluster. To further improve precision and efficiency in future work, a supervised classifier such as a neural network could be applied after clustering, before running the Friend Link algorithm.
REFERENCES
[1] Jannach, D., Zanker, M., Felfernig, A. and Friedrich, G., 2010. "Recommender Systems: An Introduction",Cambridge University
Press. New York, 2010.–352 P.
[2] Konstan, J., & Riedl, J., 2012, “Recommender systems: from algorithms to user experience”, Springer, vol. 22, pp. 101-123.
[3] Getoor, L. and Diehl, C. P., 2005. "Link mining: a survey". ACM SIGKDD Explorations Newsletter, vol. 7, pp. 3-12.
[4] Lü, L. and Zhou, T., 2011. "Link prediction in complex networks: A survey". Physica A: Statistical Mechanics and its Applications, vol. 390, pp. 1150-1170.
[5] Liben-Nowell, D. and Kleinberg, J., 2007. "The link-prediction problem for social networks". Journal of the American Society for Information Science and Technology, vol. 58, pp. 1019-1031.
[6] Schifanella, R., Barrat, A., Cattuto, C., Markines, B., Menczer, F., 2010, “Folks in folksonomies: social link prediction from shared
metadata.”, In: Proceedings 3rd ACM International Conference on Web Search and Data Mining (WSDM’2010), New York, NY, pp. 271–
280.
[7] Cui, X., and Wang, F., 2015.” An Improved Method for K-Means Clustering.” In 2015 International Conference on Computational
Intelligence and Communication Networks (CICN) IEEE ,pp. 756-759.
[8] Gupta, H. and Srivastava, R., 2014. “k-means Based Document Clustering with Automatic “k” Selection and Cluster Refinement”.
International Journal of Computer Science and Mobile Applications, 2(5), pp.7-13.
[9] Cui, X. and Wang, F., 2015, “An Improved Method for K-Means Clustering”. In International Conference on Computational
Intelligence and Communication Networks (CICN) (pp. 756-759). IEEE.
[10] Zhang, J. and Philip, S.Y., 2014. “Link Prediction across Heterogeneous Social Networks: A Survey”. SOCIAL NETWORKS.
[11] Newman, M. E. J. and Girvan, M., 2004. "Finding and evaluating community structure in networks". Phys. Rev. E 69, 026113.
[12] Symeonidis, P., Iakovidou, N., Mantas, N. and Manolopoulos, Y., 2013. “From biological to social networks: Link prediction based on
multi-way spectral clustering.” Journal of Data & Knowledge Engineering , vol.87, pp.226-242.
[13] Papadimitriou, A., Symeonidis, P., & Manolopoulos, Y., 2012, "Fast and accurate link prediction in social networking systems",
Journal of Systems and Software, vol.85, pp. 2119-2132.
Ear Biometric System Using Speeded-up Robust Features and Principal Component Analysis
Habes Alkhraisat1
1 Department of Computer Science, Al-Balqa Applied University, Asalt, Jordan; [email protected]
Abstract— Recently, the identification of individuals using personal biometric features has become widely used in security monitoring, access control, and criminal investigation systems. Nowadays, fingerprints, iris, and face are the most popular biometric characteristics used in biometric systems. In recent years, interest in ear recognition techniques has been increasing. The outer ear is a universal, unique, permanent, measurable, and high-performing biometric characteristic, and its structure does not change as a person ages. Therefore, in recent decades many experiments have been conducted on ear biometric features. This article presents a robust technique for improving the performance of ear recognition. The proposed technique combines the advantages of Speeded-Up Robust Features (SURF) for feature extraction, Principal Component Analysis (PCA) to reduce the feature vector to a lower dimension, which improves computational efficiency, and the scalable K-means++ algorithm for feature clustering. The experimental results demonstrate the robustness, accuracy, efficiency, and performance of the new technique.
Keywords — Ear recognition, Feature extraction, Speeded-up robust features, scalable K-means++, Principal
components analysis
—————————— ——————————
1 INTRODUCTION
Biometrics is the process of identifying an individual using physiological or behavioral characteristics [1]. Nowadays, various biometric characteristics have been studied, such as fingerprints, face, iris, and ear; ear characteristics are a comparatively new biometric for identifying individuals. The French criminologist Bertillon discovered that it is possible to identify individuals based on the shape of their outer ear [2], and the first ear recognition system, based on seven ear features, was proposed by the American police officer Iannarelli [3].
The ear has a unique and permanent structure, and its appearance does not change as a person ages. Besides that, the acquisition of ear images does not require a person's cooperation. Therefore, the ear seems suitable for recognizing personal identity based on features derived from ear images, and interest in ear biometric systems has grown in the last two decades.
The ear is an ideal biometric candidate due to the following characteristics: (i) its structure is rich and stable, and consistent over the lifetime of an individual; (ii) its structure is not affected by pose or facial expression; (iii) it is collectable; and (iv) it is immune from the privacy, anxiety, and hygiene problems that affect several other biometrics.
The human outer ear is formed by the outer helix, the antihelix, the lobe, the tragus, the antitragus, and the concha (figure 1).
Research into age- and sex-related changes in the human ear has shown that the outer ear maintains its structure as a person ages [4][5][6]. The study in [7] demonstrates that short periods of time do not affect the recognition rate, although the potential effect of aging on biometric ear recognition is still subject to further research and has yet to be fully explored scientifically.
Recently, among both forensic scientists and in anatomists' and anthropologists' circles, it has become accepted that the structure of the external ear enables identification of individuals [8]. Generally, anthropologists recommend the shape of the external ear to differentiate between individuals [9].

Fig. 1. Characteristics of the human ear

This paper aims to develop a robust ear recognition system by integrating the advantages of the following techniques: Speeded-Up Robust Features (SURF) [10], Principal Component Analysis (PCA) [11], and the scalable K-means++ algorithm [12]. The motivation of this paper is the performance and efficiency of an ear recognition scheme with scale and pose invariance. The proposed method consists of four stages. It starts by constructing the ear SURF descriptors. The second stage combines the SURF descriptors with the PCA algorithm to extract and construct the ear local descriptors. The third stage is concerned with clustering the ear local descriptors by applying the scalable K-means++ algorithm. Finally, the classification of the ear images is carried out by calculating
265 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
2 Vol. 15, No. 2, February 2017 IEEE TRANSACTIONS ON JOURNAL NAME, MANUSCRIPT ID
local and global similarities. The Hessian matrix 𝐻(𝑥, 𝜎) at scale 𝜎 for a point (𝑥, 𝑦)
The remainder of this paper is organized as follows. in an image 𝐼, is defined as follows:
Section 2 discusses all stages of the proposed method. The
experimental results are demonstrated in Section 3. Section 𝐿𝑥𝑥 (𝑥, 𝜎) 𝐿𝑥𝑦 (𝑥, 𝜎) (2)
𝐻(𝑥, 𝜎) = [ ]
4 provides final conclusions. 𝐿𝑦𝑥 (𝑥, 𝜎) 𝐿𝑦𝑦 (𝑥, 𝜎)
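The determinant of this Hessian is what SURF thresholds to detect interest points. As a rough illustration only (SURF itself approximates the second derivatives with box filters over an integral image, which this sketch does not do), the following numpy code builds the response map of Eq. (2) from finite-difference second derivatives of a Gaussian-smoothed image:

```python
import numpy as np

def hessian_det(img, sigma=1.2):
    """Approximate det(H) response map (Eq. 2) using second
    derivatives L_xx, L_yy, L_xy of a Gaussian-smoothed image."""
    # build a separable Gaussian kernel at scale sigma
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    # smooth columns, then rows
    sm = np.apply_along_axis(lambda v: np.convolve(v, g, mode="same"), 0, img)
    sm = np.apply_along_axis(lambda v: np.convolve(v, g, mode="same"), 1, sm)
    # finite-difference second derivatives of the smoothed image
    Ly, Lx = np.gradient(sm)
    Lyy, Lyx = np.gradient(Ly)
    Lxy, Lxx = np.gradient(Lx)
    return Lxx * Lyy - Lxy * Lyx   # determinant of the 2x2 Hessian

# a bright blob should give a strong positive response at its centre
yy, xx = np.mgrid[0:41, 0:41]
blob = np.exp(-((xx - 20) ** 2 + (yy - 20) ** 2) / 20.0)
resp = hessian_det(blob)
peak = np.unravel_index(np.argmax(np.abs(resp)), resp.shape)
```

The response is maximal (and positive, since L_xx and L_yy share the same sign) at the blob centre, which is exactly the behaviour the detector exploits.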
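The second stage, compressing the SURF descriptors into the low-dimensional PCA-SURF space, can be sketched as follows. The 64-D descriptors here are synthetic, and the choice of 20 retained components is an illustrative assumption, not a value taken from the paper:

```python
import numpy as np

def pca_project(descriptors, n_components=20):
    """Compress 64-D SURF-like descriptors into a compact
    PCA-SURF space (second stage of the method)."""
    mean = descriptors.mean(axis=0)
    centred = descriptors - mean
    # principal axes from the SVD of the centred descriptor matrix
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:n_components]
    return centred @ basis.T, mean, basis

rng = np.random.default_rng(0)
surf_like = rng.normal(size=(200, 64))   # 200 hypothetical 64-D descriptors
low_dim, mean, basis = pca_project(surf_like, n_components=20)
```

New descriptors from a test image would be projected with the same `mean` and `basis` learned from the training set.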
Fig. 6. Ear images of different clustering sub-regions.

The fast index matching is applied to each feature in the sub-regions of I_t to remove the extremely different features and retain the similar features in the same sub-region. The local similarity S_L is calculated as follows:

    S_L(I_t, I_r) = (1/k) Σ_{i=1..k} ( max( d(f_ti^x, f_ri^y) ) × w_i ),
    x ∈ (1, …, m_ti),  y ∈ (1, …, m_ri)          (6)

where:
    I_t is the test image;
    I_r is the reference image;
    d(f_ti^x, f_ri^y) is the correlation between each pair of features associated with the i-th sub-regions of I_t and I_r:

    d(f_ti^x, f_ri^y) = (f_ti^x · f_ri^y) / (‖f_ti^x‖ ∙ ‖f_ri^y‖)          (7)

    w_i is the relative weight of the i-th sub-region;
    max( d(f_ti^x, f_ri^y) ) is the maximal similarity in the i-th sub-region.

The global similarity is computed using the inline-point similarity match(I_t, I_r) and the maximal cosine correlation max( d(f_ti^x, f_ri^y) ) as follows:

    S_G(I_t, I_r) = match(I_t, I_r) × max( d(f_ti^x, f_ri^y) )          (8)

EXPERIMENTAL RESULTS

The performance and efficiency of the proposed algorithm are tested using the AMI Ear Database, created by Esther Gonzalez at the Computer Science department of Universidad de Las Palmas de Gran Canaria (ULPGC). It includes 700 ear images of 100 individuals. For each individual, six right ear images and one left ear image were taken; each sample was captured from a slightly different pose and distance. Five images were of the right ear with the individual facing forward (FRONT), looking right and left (RIGHT, LEFT), and looking down and up (DOWN, UP). The sixth image of the right profile was taken with the subject facing forward but with a different camera focal length (ZOOM). The last image (BACK) was of the left side (left ear), with the subject facing forward. The image resolution is 492 × 702 pixels. Some images from the AMI Ear Database are shown in figure 8. The first 30 individuals have been used for training; for the remaining 70 subjects, we used sessions 2 to 6 for training and session 7 for testing. This implies that we have 560 training images and 140 testing images.

The study shows that PCA-based SURF is suitable for ear recognition. In Table 1, the variation of the recognition rate is shown for five different cluster types, which are illustrated in Figure 5. The results of the proposed method have been evaluated using invariant ear images categorized as follows: Normal; Rotated; and Rotated with a change in Contrast.
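A small Python sketch of Eqs. (6)–(8) on synthetic descriptors may make the similarity computation concrete. The brute-force maximum over all descriptor pairs stands in for the fast index matching, and equal sub-region weights are an assumption for the example:

```python
import numpy as np

def cosine(a, b):
    # Eq. (7): normalised correlation between two descriptors
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def local_similarity(test_regions, ref_regions, weights):
    # Eq. (6): weighted mean over the k sub-regions of the best
    # cross-region descriptor correlation
    k = len(test_regions)
    best = [max(cosine(ft, fr) for ft in T for fr in R)
            for T, R in zip(test_regions, ref_regions)]
    return sum(b * w for b, w in zip(best, weights)) / k, max(best)

def global_similarity(match_score, max_corr):
    # Eq. (8): inline-point match score scaled by the maximal correlation
    return match_score * max_corr

rng = np.random.default_rng(1)
T = [rng.normal(size=(5, 16)) for _ in range(3)]   # 3 sub-regions, 5 descriptors each
S_L, max_corr = local_similarity(T, T, weights=[1.0, 1.0, 1.0])
S_G = global_similarity(1.0, max_corr)
```

Comparing an image with itself yields S_L = S_G = 1, the expected upper bound of the similarity scores.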
Table 1. Recognition Rate (%) with five different cluster types

    Method | Normal | Rotated 180° | Rotated 90° | Rotated −90° | Rotated and Contrast
    A      | 99.0   | 98.8         | 99.0        | 99.0         | 98.8
    B      | 98.5   | 98.2         | 98.4        | 98.4         | 98.0
    C      | 97.0   | 96.8         | 96.8        | 96.8         | 96.5
    D      | 98.1   | 98.1         | 98.1        | 98.1         | 97.8
    E      | 99.5   | 99.2         | 99.5        | 99.5         | 99.2

6 CONCLUSIONS

In this paper, we proposed PCA-SURF features for an effective ear recognition system as a method for identifying an individual using the ear print. The proposed method consists of four stages. It starts by constructing the ear SURF descriptors. The second stage combines the SURF descriptors with the PCA algorithm to extract and construct the ear local descriptors; the PCA encodes a high-dimensional descriptor into a compact low-dimensional space called the PCA-SURF. The third stage clusters the extracted ear local descriptors by applying the scalable K-means++ algorithm. Finally, the classification of the ear images is carried out by calculating local and global similarities.

Experimental results show that the performance of the proposed method is good and robust to accessory, expression, pose and age variations, rotation, and lighting. In addition, due to the use of PCA for feature-space reduction, scalable K-means++ clustering, and fast indexing in the matching stage, the proposed method has lower computational complexity and a shorter computation time for feature matching.

REFERENCES

[5] …293, 2007.
[6] L. Meijerman, G. Maat and C. Van Der Lugt, "Cross-Sectional Anthropometric Study," Journal of Forensic Sciences, vol. 52, no. 2, pp. 286-293, 2007.
[7] M. Ibrahim, M. Nixon and S. Mahmoodi, "The effect of time on ear biometrics," in The First International Joint Conference on Biometrics, Washington DC, 2011.
[8] J. Kasprzak, "Identification of ear impressions in Polish forensics," Problems of Forensic Sciences, vol. 57, pp. 168-174, 2001.
[9] R. Purkait and P. Singh, "A test of individuality of human external ear pattern: Its application in the field of personal identification," Forensic Science International, vol. 178, no. 2-3, pp. 112-118, 2008.
[10] H. Bay, A. Ess, T. Tuytelaars and L. Van Gool, "Speeded-up robust features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346-359, 2008.
[11] I. Jolliffe, Principal Component Analysis, John Wiley & Sons, Ltd., 2002.
[12] B. Bahmani, B. Moseley, A. Vattani, R. Kumar and S. Vassilvitskii, "Scalable K-Means++," in Proceedings of the VLDB Endowment, Istanbul, 2012.
[13] S. D. Lin, B.-F. Lin and J.-H. Lin, "Combining Speeded-Up Robust Features with Principal Component Analysis in a Face Recognition System," International Journal of Innovative Computing, Information and Control, vol. 8, no. 12, pp. 8545-8556, 2012.
[14] D. Arthur and S. Vassilvitskii, "k-means++: The Advantages of Careful Seeding," in Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, 2007.
[15] B. Arbab-Zavar and M. Nixon, "Robust Log-Gabor Filter for Ear Biometrics," in The 18th International Conference on Pattern Recognition, Florida, 2008.
I. INTRODUCTION
This paper describes the Digital Video Broadcasting – Terrestrial (DVB-T) standard as the one that is replacing the existing analog standards currently used across the globe. The most important part of such standards is the retrieval of a perfect signal at the receiver end, excluding the effects of the channels it passes through, the noise, and the timing jitter. In the transmission being carried out, the data (audio, video, picture information, or randomized data) is processed for coded orthogonal frequency division multiplexing (COFDM) before being modulated using a QAM (Quadrature Amplitude Modulation) constellation and mapped into groups of blocks. After formation of the blocks, an IFFT (Inverse Fast Fourier Transform) of 2048 or 8192 points is carried out, which determines the bandwidth requirement and the number of subcarriers. Some of these subcarriers are kept in reserve for pilot symbols, which are needed for efficient reception of the signals, while others are used for guard bands. [1][2]
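The mapping-plus-IFFT chain described above can be sketched for the 2K mode as follows. The 4-QAM mapping and the absence of pilots and guard bands are simplifying assumptions for the sketch, not the full DVB-T frame structure:

```python
import numpy as np

# Minimal COFDM round trip: map random bits to 4-QAM, place them on an
# N-point IFFT grid, and recover them with an FFT at an ideal receiver.
N = 2048                                  # "2K" mode IFFT size
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(N, 2))
# Gray-mapped 4-QAM: each bit pair becomes one unit-energy constellation point
symbols = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)
time_signal = np.fft.ifft(symbols) * np.sqrt(N)   # one OFDM symbol
recovered = np.fft.fft(time_signal) / np.sqrt(N)  # noiseless demodulation
```

In a real chain, channel distortion and noise sit between the IFFT and the FFT, which is what the pilot symbols and error-correction coding are there to combat.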
Limitations of DVB-T: Though there are many virtues to implementing the DVB-T system, it also has shortcomings that cannot be neglected. The first and most important limitation concerns the bit rates it supports: they are limited and not compatible with existing and rapidly changing wireless standards. For the transmission of HDTV (high-definition television), and to accommodate more broadcast channels, there was a strong need for a new standard. [3]
The second thing it lacked was interaction with the user, which needed to be upgraded.
The third limitation of the DVB-T system is its much inferior performance under portability or mobility, which restricted its usage in moving vehicles.
Last but not least is the issue of Single Frequency Networks (SFNs), where repeated signals create interference with their own versions of the signal and damage the quality of reception. [3]
A new standard that provides enhanced capacity and the required robustness in the terrestrial scenario is the second-generation standard of DVB-T, popularly known as DVB-T2. It was designed to support fixed receivers while also being equipped with the required mobility, and to maintain the spectrum characteristics of its predecessor standard, DVB-T. Figure 1 shows the functional block diagram of a DVB-T2 transmitter. [4] The most important change is in its error-correction strategy, which has been inherited from DVB-S2.
A combination of LDPC (Low-Density Parity-Check) codes and BCH (Bose-Chaudhuri-Hocquenghem) codes improves the performance by a great amount, giving robustness in receiving the signal efficiently. These FEC (forward error correction) coding techniques are far better than the convolutional codes used in DVB-T for the same purpose. As far as the modulation technique is concerned, DVB-T2 uses the same OFDM as DVB-T, but it introduces longer symbols with 16K and 32K carriers so that the guard interval can be lengthened without damaging the spectral efficiency. The second generation provides combinations of different numbers of carriers and guard-interval lengths, and hence it becomes a very flexible standard that can be used with any of the multiple combinations. [5]
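The guard interval is realised as a cyclic prefix, a copy of the tail of the OFDM symbol prepended to it. A minimal sketch, with a toy symbol length rather than a DVB frame:

```python
import numpy as np

def add_guard_interval(ofdm_symbol, fraction):
    """Prepend a cyclic prefix of the given fraction (e.g. 1/4, 1/32)
    of the symbol length, as DVB-T/T2 guard intervals do."""
    g = int(len(ofdm_symbol) * fraction)
    return np.concatenate([ofdm_symbol[-g:], ofdm_symbol])

sym = np.arange(32, dtype=complex)        # toy 32-sample OFDM symbol
guarded = add_guard_interval(sym, 1/4)    # 8-sample guard interval
```

A longer symbol (16K or 32K carriers) lets the absolute guard duration grow while the fractional overhead, and hence the spectral efficiency, stays fixed, which is the design point the paragraph above describes.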
A very important modification offered by DVB-T2 is the presence of 8 different scattered pilot patterns, the choice among which is made according to the parameters of the current transmission. Because of all these changes and the updated modulation techniques, a new standard has emerged giving the best possible spectral efficiency. In the block diagram, it can be observed that interleaving is carried out in multiple folds (bit interleaver, time interleaver, and then frequency interleaver) to avoid bursts of errors as much as possible and to randomise the error pattern within the LDPC frame. [5]
The Bit Error Rate obtained from the internal decoder is taken into account for all results compared in this article. For a fair comparison between DVB-T and DVB-T2, quasi-error-free (QEF) BERs of 2·10⁻⁴ and 10⁻⁷ must be considered for DVB-T and DVB-T2 after the convolutional and LDPC decoders,
respectively. [6] If these QEF reference values are considered, a gain of 6 dB can be obtained between the two standards for an additive white Gaussian noise (AWGN) channel model, and nearly 4 dB in a Rayleigh channel.
TABLE I
COMPARISON OF DVB-T AND DVB-T2

    DVB-T                                        | DVB-T2
    FEC by Reed-Solomon & convolutional codes    | FEC by LDPC & BCH codes
    Modes: QPSK, 16- & 64-QAM                    | QPSK, 16-, 64- & 256-QAM
    Guard intervals 1/4, 1/8, 1/16, 1/32         | 1/4, 19/256, 1/8, 1/16, 1/32, 19/128
    FFT size up to 8K                            | FFT size up to 32K
    Scattered pilots 12%                         | Scattered pilots 1%
    Continual pilots 2.6%                        | Continual pilots 0.35%
The same is applied using 64-QAM as the basic modulation technique in DVB-T2; the graph shows the value of BER for a given value of SNR, and the gradual improvement in BER can be observed. The resulting graph is shown in Figure 3.
The generally accepted permissible BER of 10⁻⁵ is achieved in 64-QAM by maintaining the SNR at approximately 11.86 dB.
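For context, the uncoded BER of Gray-mapped square M-QAM over AWGN can be approximated with the standard textbook formula. Note that the article's thresholds (e.g. 10⁻⁵ at about 11.86 dB for 64-QAM) are for the coded DVB-T2 system, so this uncoded curve sits well above them:

```python
import math

def qam_ber_awgn(M, snr_db):
    """Approximate Gray-coded square M-QAM bit error rate over AWGN
    (textbook approximation; not the coded DVB-T2 BER)."""
    snr = 10 ** (snr_db / 10)          # linear Es/N0
    k = math.log2(M)                   # bits per symbol
    # Q-function via erfc: Q(x) = 0.5 * erfc(x / sqrt(2))
    q = 0.5 * math.erfc(math.sqrt(3 * snr / (M - 1)) / math.sqrt(2))
    return (4 / k) * (1 - 1 / math.sqrt(M)) * q
```

The formula reproduces the qualitative behaviour reported here: BER falls monotonically with SNR, and at equal SNR 64-QAM performs worse than 16-QAM.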
In Figure 4, the BER vs. SNR graphs are plotted for both techniques so that a close comparison can be made and the required technique chosen.
The proposed research work was carried out to find the best and optimized method of implementing DVB-T2 using the most efficient and promising QAM technique, and also the optimized value of SNR for an acceptable value of BER. Just as DVB-T2 was applied to a random data stream, it has been applied to a video signal and the various results obtained. [8] This entire work is carried out in order to further obtain the same results on video signals and later to implement them for Digital Video Broadcasting – Handheld (DVB-H), which is also becoming popular.
Figure 5 shows, for 4-QAM (very similar in results and performance to QPSK), the good and the best reconstructions of the video for SNR values of 1.80 dB and 1.98 dB respectively.
Figure 6 shows, for 16-QAM, the good and the best reconstructions of the video for SNR values of 7.1 dB and 7.90 dB respectively.
Figure 7 shows, for 64-QAM, the good and the best reconstructions of the image for SNR values of 11.79 dB, 11.81 dB and 11.86 dB respectively.
Table II compares 16-QAM and 64-QAM as applied to random data bits generated with DVB-T2, recording the achieved BER for gradually increasing values of SNR. From the videos reconstructed at gradually increasing SNRs, it has been observed that the BER becomes permissible only beyond a specific SNR, which brings into existence a trade-off between BER and SNR. [9]
TABLE II
COMPARISON OF DVB-T2 USING 16- AND 64-QAM TECHNIQUES

          DVB-T2, 16-QAM           DVB-T2, 64-QAM
          SNR        BER           SNR        BER
    1     7.5 dB     10⁻⁵          11.86 dB   10⁻⁵
    2     4 dB       0.8 × 10⁻²    7.5 dB     0.5 × 10⁻¹
    3     2 dB       10⁻¹          2 dB       0.5 × 10⁻¹
V. CONCLUSION
DVB-T2 offers data rates 50 to 90 percent higher than DVB-T for the same level of robustness. The increase results from the following improvements:
• improved forward error correction,
• rotated constellation diagrams, and
• larger SFNs with flexible pilot patterns.
This certainly makes it the better choice when introducing DTT or adding HD services to the terrestrial platform. However, accurate definition of the key parameters of the DVB-T2 system is more critical in planning DVB-T2 networks than it is for DVB-T.
Another important conclusion is that, using 64- and 16-QAM respectively, a similar accepted value of BER can be achieved by applying a higher SNR.
The benefit of choosing the higher-order formats is that there are more points within the constellation, so it is possible to transmit more bits per symbol. The shortcoming is that the constellation points are closer together, and so the link is more susceptible to noise. As a result, higher-order versions of QAM are only used when there is a sufficiently high signal-to-noise ratio. As the order of QAM increases, the number of bits accommodated increases, but this is achieved by either spending more signal power or compromising the BER.
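The shrinking distance between constellation points can be checked numerically. The sketch below normalises each square constellation to unit average symbol energy and computes its minimum inter-point distance:

```python
import numpy as np

def min_distance(M):
    """Minimum distance between points of a square M-QAM constellation
    normalised to unit average symbol energy."""
    m = int(np.sqrt(M))
    levels = np.arange(-(m - 1), m, 2, dtype=float)   # e.g. -3,-1,1,3 for 16-QAM
    pts = np.array([complex(i, q) for i in levels for q in levels])
    pts /= np.sqrt(np.mean(np.abs(pts) ** 2))         # unit average power
    return min(abs(a - b) for idx, a in enumerate(pts) for b in pts[idx + 1:])
```

At unit power, min_distance(16) ≈ 0.63 while min_distance(64) ≈ 0.31: each step up in order roughly halves the noise margin, which is why 64-QAM demands the higher SNR noted above.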
ACKNOWLEDGMENT
The research work I have carried out is a collective effort of my contributions and valuable inputs from experts in this area, including my internal supervisor from C. U. Shah University, Dr. Nimit Shah, whose insights are powerful and whose suggestions are very innovative; I thank him wholeheartedly. I also extend my heartfelt thanks to my supervisor, Dr. Charmy Patel, for guiding my work with her valuable inputs and carving my path for this research work.
REFERENCES
[1] Digital Video Broadcasting (DVB); Frame structure channel coding and modulation for a second generation digital terrestrial television broadcasting system (DVB-T2), ETSI Std. EN 302 755 V1.1.1, Sep. 2009.
[2] M. Mendicute, I. Sobrón, L. Martínez and P. Ochandiano, "DVB-T2: New Signal Processing Algorithms for a Challenging Digital Video Broadcasting Standard."
[3] J. Bigeni, "DVB-T2 Performance Comparison with other Standards," DVB – NCA Seminar, Bangkok, 18-19 August 2010.
[4] L. Polak and T. Kratochvil, "DVB-T and DVB-T2 Performance in Fixed Terrestrial TV Channels," IEEE, 2012.
[5] I. Eizmendi, M. Velez, D. Gómez-Barquero, J. Morgade, V. Baena-Lecuyer and M. Slimani, "DVB-T2: The Second Generation of Terrestrial Digital Video Broadcasting System," IEEE Transactions on Broadcasting, vol. 60, no. 2, June 2014.
[6] Digital Video Broadcasting (DVB); Framing structure, channel coding and modulation for digital terrestrial television (DVB-T), ETSI Std. EN 300 744 V1.6.1, Jan. 2009.
[7] DVB Fact Sheet, August 2014.
[8] S. L. Linfoot, "A Comparison of 64-QAM and 16-QAM DVB-T under Long Echo Delay Multipath Conditions," IEEE Transactions on Consumer Electronics, vol. 49, no. 4, November 2003.
[9] C. E. Herrero and C. A. López Arranz, "Design of a DVB-T2 simulation platform and network optimization with Simulated Annealing."
Jordan
¹ Department of Management Information Systems, Girne American University, Kyrenia, Turkish Republic of Northern Cyprus, via Mersin 10, Turkey
* Corresponding author. E-mail: [email protected]
¶ These authors contributed equally to this work.
Abstract
Drawing on the information systems success model and the technology acceptance model, this article examines the impact of system quality, information quality, and service quality on perceived usefulness, perceived ease of use, and citizens' attitudes toward the use of the e-JC system. Data analysis involving 398 randomly selected subjects was conducted to test these propositions, and general support was found for all the interactions. Results from structural equation modeling indicate that the information, system and service quality of e-government influence citizens' perceived ease of use and perceived usefulness, which in turn influence citizens' adoption of and attitudes toward use of the e-government system. The findings endorse the model of interest and also contribute to the literature by strengthening researchers' theoretical and practical understanding of the effects of information, system, and service quality in developing e-government systems.
1. Introduction
Technological innovation aimed at offering citizens improved and equitable access to public services is popularly known as electronic government (e-Government). This innovation has been accepted and embraced by many countries throughout the globe, more specifically by western industrialized nations with the necessary infrastructure and educated citizens. Research on e-Government adoption has, to date, focused on developed countries in the Western world, which leaves room for exploration in non-western countries like Jordan. Judging by the themes from developed countries (Hsieh, Huang, & Yen, 2013; Krishnan, Teo, & Lim, 2013), e-Government research specifically in Arab countries has not received equal attention.
… individuals' access to public data and services (United Nations, 2003). It is also the strategic use of … better service quality, and more democratic participation (Yang & Rho, 2007). Academicians and practitioners championing a utopian image argue that advances in IS will not only deconstruct hierarchical forms of social and organizational structure, but also decentralize the forms of information flow in a network relationship among citizens (Blanchard & Horan, 1998;
Jordan is considered a developing country, with more than 60% urban population and about 76% of households having Internet access as of 2015 (Mohammad, 2015). Very little about e-Government in Jordan has been investigated or published. One study found that e-Government applications in Jordan lack the standard features required for such applications, and that the system failed to take account of citizens' needs and expectations (AL-Soud, & Nakata, 2010). Given this, the author formulated the following research questions:
• Do information quality, system quality, and service quality enhance perceived usefulness of the e-JC system?
• Do information quality, system quality, and service quality enhance perceived ease of use of the e-JC system?
The existing e-Government portals in Jordan are standalone; in other words, each ministry has its own portal, and end users (citizens) must register and create a username and a password for each. Given the number of ministries in Jordan, citizens and businesses wishing to use e-Government services must remember each ministry's username and password. This has generated redundant information, and the resulting information overload and increased complexity on the citizen's side often discourage citizens from using such services, which has hindered the growth of e-government applications and increased the pressure on public servants and resources. Drawing on the information systems success model (DeLone & McLean) and the Technology Acceptance Model (TAM), this study presents an empirically validated model for measuring the success and acceptance of e-government systems from the citizens' perspective. The low adoption and use of e-government services by end users thus remain major barriers to successful e-government implementation. The …
2. Theoretical framework
… altering the way things are done in key areas like government-to-public interaction and public service delivery. E-government is not limited to the above but also touches on other important aspects: … applications can help a country better deliver government services to its citizens, with improved information sharing and accessibility (World Bank, 2010). Contemporary scholars have argued that e-government can deliver efficient, effective and transparent services to the public, a notion supported by substantial empirical research (e.g., Affisco & Soliman, 2006; Reddick & …).
Originally introduced by Davis in 1989, the TAM builds on the social-psychology theory of reasoned action and aims to model user acceptance of IS applications. It is one of the most used frameworks for measuring individuals' willingness to adopt and use a particular technology. The model has two famous and extensively used constructs, namely "perceived usefulness" (PU) and "perceived ease of use" (PEOU). PU is "the degree to which an end-user considers and believes that the use of an IS application will enhance task performance". In the context of e-Government applications, the concept is assumed to influence the usage of an e-Government portal (Wirtz et al., 2015). PEOU is "the degree to which an end-user believes a system will be free of effort to a larger or lesser extent" (Venkatesh et al., 2003; Shen & Chiou, 2010). In the context of e-Government applications, the concept has been used successfully in several e-Government studies.
According to Horst et al. (2007), PU is seen as the most important motivator for a user's willingness to adopt and use a technology, irrespective of education, location or culture. These attributes of TAM are deemed the key factors shaping end-users' behavioral intention to use a system. The TAM framework has been utilized in much research across disciplines associated with technological innovation and development. For instance, Kwon (2000) adopted the model in a study of technology adoption in the cellular telephone industry, and it has been applied to personal digital assistant usage (Liang et al., 2003). Pavlou (2003) adopted the model in an electronic commerce setting. Additionally, the framework was used on online shopping sites to evaluate online consumer behavior (Koufaris, 2002), and for the World Wide Web (Lederer et al., 2000). Recent studies have also employed the model to measure the willingness of citizens to use e-Government applications (e.g., Alghamdi, & Beloff, 2015; AL-Athmay, Fantazy, & Kumar). The majority of these studies assumed that the PU and PEOU of the TAM can adequately capture the overall perceived value of using an e-Government application or system. Scholars like Bagozzi et al. (2007), who were among the pioneers of the TAM, have nevertheless acknowledged certain limitations of the model. This study is motivated to use it as a theoretical lens because it presents the genesis of all …
DeLone and McLean (2004) proposed the IS success model, which consists of the following variables: system, information and service quality, perceived usefulness, user satisfaction, system usage, and net benefit. The main aim of the model is to test system usage (Rai et al., 2002). As mentioned earlier, adoption and usage of a system continue to be considered IS success variables in most studies and are widely used by IS researchers (McGill et al., 2003). The addition and integration of a new framework (i.e., the IS success model) opens possibilities for discovering other unknown factors, thereby providing opportunities for …
System quality (SQ): system quality refers to "the degree to which the system is easy to use to accomplish tasks" (Schaupp, Boudreau, & Gefen 2006). The construct considers performance characteristics, functionality, and usability, among others (McKinney, Kanghyun, & Zahedi, 2002). Accordingly, in an e-Government context, we adopted the following definition of system quality: "the ability of an e-government system to provide its citizens with accurate, reliable, relevant, and easy to understand information". It represents the performance of the system in terms of ease of use, user-friendliness, and usability (Wang, & Liao, 2008). Previous research in other technological contexts suggests that the construct can influence PU, PEOU and citizens' attitude toward use of a system.
Information quality (IQ): information quality refers to the degree to which the quality of the information that the portal provides, and its usefulness to the user, enables them to accomplish the stated goal of the system. Information quality is considered one of the most important success factors when investigating the overall IS success of any given system (McKinney et al., 2002). In the context of e-Government, it is defined as "the ability of the e-government system to provide its citizens with new, accurate, clear, and easy-to-understand information" (Urbach et al., 2010). It is also referred to as the quality of the e-Government system's output and is measured by different semantic attributes (Wang, & Liao, 2008). Previous research in other technological contexts suggests that the construct can influence PU, PEOU and citizens' attitude toward use of a system.
Service quality (SRQ) - encompass measures of the overall service performance and assistances
competency of service personnel as noted by (Chang & King, 2005; Pitt, Watson, & Kavan,
1995). Petter, DeLone and McLean (2013) recently noted that the SRQ variable in an IS model
“captures the general quality of an e-Government system from the perspective of readiness of
personnel to provide proper service, safety of transactions when using the e-government system,
availability of the system to users, individual attention of IS personnel, and providing specific
needs for users”. Previous research on other technological context suggested that the construct
can influence PU, PEOU and citizen’s attitude toward use of a system. Based on the extent
literature the following hypotheses were developed and diagrammatically presented in figure 1.
H1a: Information quality will have a positive effect on citizenships perceived usefulness of e-JC
system.
H1b: Information quality will have a positive effect on citizenships perceived ease of use of e-JC
system.
H2a: System quality will have a positive effect on the perceived usefulness of e-JC system.
H2b: System quality will have a positive effect on perceived ease of use of e-JC system.
H3a: Service quality will have a positive effect on the perceived usefulness of e-JC system.
8
284 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
H3b: Service quality will have a positive effect on the perceived ease of use of the e-JC system.
H4: Perceived usefulness of the e-JC system will have a positive effect on citizens' attitude toward use.
H5: Perceived ease of use of the e-JC system will have a positive effect on citizens' attitude toward use.
Figure 1. The research model: information quality (IQ), system quality (SQ), and service quality (SEQ) are modeled as antecedents of perceived usefulness (PU) and perceived ease of use (PE), which in turn shape attitude toward use of e-government.
3. Research Method
This study employed a survey method to test and analyze the research model presented in Figure 1. The participants were Jordanian citizens who had used the e-government system; they were randomly selected and asked to participate voluntarily. A survey was developed using validated instruments from previous relevant studies in English. Prior to administering the survey, back-translation was conducted, since most Jordanians speak and understand Arabic; following Brislin's (1970) recommendations, the scale items were translated from English to Arabic and back again. As a next step, a Q-sort pilot study was conducted. The Q-sort method is "an iterative process in which the degree of agreement between judges forms the basis of assessing construct validity and improving the reliability of the constructs" (Nahm, Rao, Solis-Galvan, & Ragu-Nathan, 2002, p. 114). The outcome of the pilot study was satisfactory, and the main survey was subsequently administered. The final survey instrument consisted of 25 items (Table 1). Although the items were adapted from previous research, the author modified them to fit the context of the current study. A five-point Likert scale was employed to measure the items.
Five hundred questionnaires were distributed, of which four hundred and thirteen were returned, yielding a response rate of 82.6%. Of the 413 returned questionnaires, 15 had missing data and were eliminated from the study. This response rate is comparable with those of other studies that have examined e-government. The final study sample comprised 398 Jordanian citizens: 228 (57.3%) males and 170 (42.7%) females; 116 (29.1%) were single and the rest married. The demographic data also show that 55 (13.8%) of the participants held a high school diploma, 94 (23.6%) had some college education, 178 (44.6%) held bachelor's degrees, and the rest held higher degrees. The ages of the participants ranged between 18 and 35 years.
Before testing the structural model, I tested the measurement model and assessed the relationships between the observed variables and their underlying constructs, which were allowed to inter-correlate freely. More specifically, construct validity was assessed via confirmatory factor analysis (CFA) with IBM SPSS AMOS v21. First, a single-factor test was conducted to assess the likelihood of common method bias (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003); the result yielded a poor fit, suggesting that the dataset is not affected by common method variance (CMV).
Next, the author conducted a six-factor model test; the initial CFA results provided low model-fit statistics. Two items from adoption/attitude toward use and two from perceived ease of use were then eliminated due to low standardized loadings (<0.50) and/or cross-loadings, as recommended
by Hair et al. (1998). The results [Chi-square: χ2 = 501.36; d.f. = 194; p = .000; Goodness-of-Fit Index (GFI) = .89; Normed Fit Index (NFI) = .93; Comparative Fit Index (CFI) = .95; Tucker-Lewis Index (TLI) = .95; Root Mean Square Error of Approximation (RMSEA) = .063; relative χ2 = 2.58] show that the model conforms to the criteria suggested by Ullman (2006), Byrne (1994), Browne and Cudeck (1993), and Tucker and Lewis (1973). See Figure 2.
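As a quick arithmetic check, the relative chi-square reported above follows directly from the chi-square and degrees of freedom. A minimal sketch, using the values from the text and common rule-of-thumb cutoffs (relative χ2 < 3, RMSEA < .08; these thresholds are conventions, not fixed rules):

```python
# Relative chi-square (chi^2 / df) and a simple rule-of-thumb fit check.
chi_square = 501.36
df = 194
relative_chi2 = chi_square / df
print(round(relative_chi2, 2))  # 2.58, matching the reported value

rmsea = 0.063  # RMSEA reported in the text
acceptable = relative_chi2 < 3 and rmsea < 0.08
print(acceptable)  # True
```

This reproduces the reported relative χ2 of 2.58 and confirms both indices fall within the commonly cited acceptable ranges.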
The retained item loadings exceeded .50; Cronbach's alphas were all above the .60 benchmark; composite reliability (CR) and average variance extracted (AVE) were also above the .50 benchmark (Hair et al., 1998). Discriminant validity is established when the estimated correlations between the variables are below 0.85 (Abubakar & Ilkan, 2016). These results confirm convergent and discriminant validity among our measures (see Table 1). First, bivariate correlations were computed among the variables.
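The CR and AVE benchmarks mentioned above can be computed directly from standardized factor loadings. A minimal sketch using the standard Fornell-Larcker-style formulas; the loadings below are hypothetical and for illustration only (the study's actual item loadings appear in Table 1):

```python
# Composite reliability (CR) and average variance extracted (AVE)
# computed from standardized factor loadings of one construct.
def composite_reliability(loadings):
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + error)

def average_variance_extracted(loadings):
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical loadings for a three-item construct (illustrative only)
loads = [0.86, 0.78, 0.79]
print(round(composite_reliability(loads), 3))        # 0.852, above the .50 benchmark
print(round(average_variance_extracted(loads), 3))   # 0.657, above the .50 benchmark
```

Both values exceed the .50 benchmark cited in the text, which is the condition for convergent validity under this criterion.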
Second, structural equation modeling (SEM) was used to evaluate the proposed and alternative models via path analysis in AMOS. Means, standard deviations, and correlations of the study variables are reported in Table 2.
Table 1 (excerpt). Measurement items with standardized loadings and t-values:
The e-Government system provides the precise information you need. (loading = .86, t = 16.82)
When you have a problem, the e-Government system service shows a sincere interest in solving it. (loading = .78, t = 16.92)
You feel safe in your transactions with the e-Government system service. (loading = .79, t = 17.15)
Perceived usefulness
The e-Government system makes it easier to deliver public services. (loading = .86, t = 18.59)
The e-Government system is useful for public service activities. (loading = .78, t = 16.66)
It is very likely that I will use the e-Government system in the near future. (item dropped)
Table 2 presents the means and bivariate correlations among the observed variables. As seen in the table, information quality was positively correlated with perceived usefulness (r = .707, p < .01) and perceived ease of use (r = .627, p < .01). The data also reveal that system quality was positively correlated with perceived usefulness (r = .636, p < .01) and perceived ease of use (r = .674, p < .01). In addition, correlation analysis shows that service quality correlates positively with perceived usefulness (r = .689, p < .01) and perceived ease of use (r = .621, p < .01). Finally, the relationship between perceived usefulness and citizens' adoption/attitude toward use of the e-government system was positive and significant (r = .642, p < .01). Similarly, a significant and positive correlation was found between perceived ease of use and citizens' adoption/attitude toward use of the e-government system (r = .589, p < .01). The strong associations between the research measures provide preliminary support for all the hypothesized relationships.
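Bivariate correlations of the kind reported above are plain Pearson coefficients. A minimal sketch computed from scratch on toy five-point Likert responses; the values are illustrative only, not the study data:

```python
# Pearson correlation coefficient computed from first principles.
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy 5-point Likert responses (illustrative only)
info_quality = [4, 5, 3, 4, 2, 5]
usefulness = [4, 5, 2, 4, 3, 5]
print(round(pearson_r(info_quality, usefulness), 3))
```

In practice such coefficients would be produced by a statistics package (here, AMOS/SPSS) rather than by hand, but the underlying computation is exactly this ratio of covariance to the product of standard deviations.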
[Table 2: means, standard deviations, and correlations of variables 1-6; values not preserved in this excerpt]
Notes: CR, composite reliability; α, Cronbach's alpha; AVE, average variance extracted; **correlations are significant at the .01 level.
Structural path estimates (standardized coefficient, t-value):
Information quality (IQ) → Perceived usefulness: 0.336 (t = 10.981)***
Information quality (IQ) → Perceived ease of use: 0.162 (t = 5.203)***
System quality (SQ) → Perceived usefulness: 0.172 (t = 5.421)***
System quality (SQ) → Perceived ease of use: 0.366 (t = 11.317)***
Service quality (SEQ) → Perceived usefulness: 0.321 (t = 10.211)***
Service quality (SEQ) → Perceived ease of use: 0.227 (t = 7.092)***
Perceived usefulness → Adoption/Attitude: 0.504 (t = 9.541)***
Perceived ease of use → Adoption/Attitude: 0.354 (t = 6.575)***
Notes: *Significant at the p < 0.05 level (two-tailed); **significant at the p < 0.01 level (two-tailed).
In order to test the hypotheses formally, a structural equation analysis of the relationships was conducted; the resulting path estimates are shown in Figure 3. Hypotheses 1a and 1b state that information quality has positive effects on perceived usefulness (β = .336, p < .01) and perceived ease of use (β = .162, p < .01). In agreement with findings in the literature, [H1 gained support].
Hypotheses 2a and 2b state that system quality has positive effects on perceived usefulness (β = .172, p < .01) and perceived ease of use (β = .366, p < .01). In agreement with findings in the literature, [H2 gained support]. Hypotheses 3a and 3b state that service quality has positive effects on perceived usefulness (β = .321, p < .01) and perceived ease of use (β = .227, p < .01). In agreement with findings in the literature, [H3 gained support]. Hypothesis 4 states that perceived usefulness has a positive effect on adoption/attitude toward use (β = .504, p < .01). In agreement with findings in the literature, [H4 gained support]. Hypothesis 5 states that perceived ease of use has a positive effect on adoption/attitude toward use (β = .354, p < .01). In agreement with findings in the literature, [H5 gained support].
This study examined citizens' adoption of e-Government systems in Jordan. This article proposed and developed a sophisticated model for the Jordanian government; the developed e-government portal model integrated basic user-centric concepts from technology acceptance and online service quality research, which provides a more nuanced understanding of TAM and the IS success model. More generally, the integration of portal-related measures such as perceived ease of use, perceived usefulness, information quality, system quality, and service quality allows for a more sophisticated modeling approach. This combination also enriched the tested factors and provides a broader explanation of the interactions among them.
The empirical evidence from this study enriches research on the factors associated with citizens' adoption of, and attitudes toward use of, the e-government system. Consistent with previous research in other industries and with e-government studies in other countries that utilized TAM and the IS success model, this article found that information, system, and service quality exert a pronounced impact on the perceived usefulness and perceived ease of use of the e-government system. Additionally, there is support for previous evidence that perceived usefulness and perceived ease of use are important antecedents of technology adoption/attitudes toward use. Strategies aimed at enhancing system, information, and service quality may not be enough for the adoption of a new system, though: the results provide strong support for the view that usefulness and ease-of-use perceptions must also be addressed. Intuitively speaking, nation-wide interventions that promote awareness of and openness toward new technology are critical development tools for the Jordanian government.
The author would like to point out certain limitations of the paper. First, the results possibly suffer from single-method bias due to the study's cross-sectional nature, which may prevent proper assessment of intertemporal variations; hence, future studies should adopt a longitudinal design to validate the current findings. Second, data were collected in Amman, so the interactions between the variables might not hold for the whole of Jordan, and this limits generalizability to other countries and cultural contexts. Lastly, it is our view that a quasi-
References
framework for service delivery. Business Process Management Journal, 12(1), 13– 21.
Alanezi, M. A., Kamil, A., & Basri, S. (2010). A proposed instrument dimension for measuring
Alghamdi, S., & Beloff, N. (2015).Exploring determinants of adoption and higher utilization for
e-Government: A study from business sector perspective in Saudi Arabia. Computer Science and
Al-Soud, A.R., Al-Yaseen, H., & Al-Jaghoub, S.H. (2014). Jordan’s e-Government at the
crossroads. Transforming Government: People, Process and Policy, 8(4), 597 - 619:
AL-Athmay, A.A.A., Fantazy, K., & Kumar, V. (2016) "E-government adoption and user’s
Blanchard, A., & Horan, T. (1998). Virtual communities and social capital. Social Science
Abubakar, A.M., & Ilkan, M. (2016). Impact of online WOM on destination trust and intention to
http://dx.doi.org/10.1016/j.jdmm.2015.12.005
Brislin, R. W. (1970). Back-translation for cross-cultural research. Journal of Cross-Cultural Psychology, 1(3), 185-216.
approaches in US public agencies. In Heeks, R., (ed), reinventing government in the information
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In: K. A. Bollen
& J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Beverly Hills, CA: Sage.
Byrne, B. M. (1994). Structural equation modeling with EQS and EQS/Windows. Thousand
Chang, J.C.J., & King, W.R. (2005). Measuring the performance of information systems: A
Chang, I.C., Li, Y.C., Hung, W.F., & Hwang, H.G. (2005). An empirical study on the impact of
Chen, S. (2011). Understanding the Effects of Technology Readiness, Satisfaction and Electronic
Research, 1(3)
Chen, L., Soliman, K.S., Mao, E., & Frolick, M.N. (2000). Measuring user satisfaction with data
Chiu, C.M., Chiu, C.S., & Chang, H.C. (2007). Examining the integrated influence of fairness
and quality on learners’ satisfaction and Web-based learning continuance intention. Information
Davis, F.D. (1989). Perceived usefulness, ease of use, and user acceptance of information
Davis, F., Bagozzi, R., & Warshaw, P. (1989). User acceptance of computer technology: a
Delone, W.H., & McLean, E.R. (2003). The DeLone and McLean Model of Information Systems
Frissen, P. (1997). The virtual state: Post modernization, informatisation, and public
administration. In Loader, B. D., (ed.), the governance of cyberspace. London: Routledge. 111–
125.
Hair, J.F. Jr , Anderson, R.E., Tatham, R.L., & Black, W.C. (1998). Multivariate Data Analysis,
Hassanzadeh, A., Kanaani, F., & Elahi, S. (2012). A model for measuring e-learning systems
http://dx.doi.org/10.1016/j.eswa.2012.03.028.
Horst, M., Kuttschreuter, M., & Gutteling, J. (2007). Perceived usefulness, personal experiences,
Hu, P. J. H., Brown, S. A., Thong, J. Y., Chan, F. K., & Tam, K. Y. (2009). Determinants of
service quality and continuance intention of online services: The case of eTax. Journal of the
Hsieh, P.H., Huang, C.S., & Yen, D.C. (2013). Assessing web services of emerging economies in
Kim, K., & Prabhakar, B. (2000). Initial trust, perceived risk, and the adoption of internet
banking. Paper presented at the 21st International Conference on Information Systems, Brisbane,
Australia.
King, W. R., & He, J. (2006). A meta-analysis of the technology acceptance model. Information
Klein, H. (1999). Tocqueville in cyberspace: Using the Internet for citizen associations. The
Koufaris, M. (2002). Applying the technology acceptance model and flow theory to online
Krishnan, S., Teo, T.S.H., & Lim, V.K.G. (2013). Examining the relationships among e-
Kwon, H. (2000). A test of the technology acceptance model: the case of cellular telephone
adoption. Proceedings of the 33rd Annual Hawaii International Conference on System Sciences,
Lean, O.K., Zailani, S., Ramayah, T., & Fernando, Y. (2009). Factors influencing intention to use
Liang, H., Xue, Y., & Byrd, T.A. (2003). PDA usage in healthcare professionals: testing an
372-389.
McGill, T., Hobbs, V., & Klobas, J. (2003). User-developed applications and information systems success: A test of DeLone and McLean's model. Information Resources Management
McKinney, V., Kanghyun, Y., & Zahedi, F.M. (2002). The measurement of web customer
296–315
http://www.jordantimes.com/news/local/internet-penetration-rises-76-cent-q1
Nahm, A. Y., Rao, S. S., Solis-Galvan, L. E., & Ragu-Nathan, T. (2002). The Q-sort method:
Assessing reliability and construct validity of questionnaire items at a pre-testing stage. Journal
Ozkan, S., & Koseler, R. (2009). Multi-dimensional students’ evaluation of e-learning systems in
the higher education context: An empirical investigation. Computers & Education, 53(4), 1285-
1296,
Parent, M., Vandebeek, C., & Gemino, A. (2004). Building citizen trust through e-government.
Proceedings of the 37th Hawaii International Conference on System Sciences – 2004, Big Island,
Petter, S., DeLone, W., & McLean, E.R. (2013). Information Systems Success: The Quest for the
Pitt, L.F., Watson, R.T., & Kavan, C.B. (1995). Service quality: a measure of information
Podsakoff, P. M., MacKenzie, S. B., Lee, J.Y., & Podsakoff, N. P. (2003). Common method
biases in behavioral research: A critical review of the literature and recommended remedies.
Qutaishat F.H. (2013). Users’ Perceptions towards Website Quality and Its Effect on Intention to
Rai, A., Lang, S.S., & Welker, R.B. (2002). Assessing the validity of IS success models: An
empirical test and theoretical analysis. Information Systems Research, 13(1), 50-69
Reddick, C. G., & Roy, J. (2013). Business perceptions and satisfaction with e-government:
Shen, C.C., & Chiou, J.-S. (2010). The impact of perceived ease of use on internet service adoption: The moderating effects of temporal distance and perceived risk. Computers in Human
Tucker, L. and Lewis, C. (1973) A reliability coefficient for maximum likelihood factor analysis.
Using multivariate statistics, (5th ed.; pp. 653–771). Boston: Allyn & Bacon
United Nations (UN). (2003). World public sector report 2003: E-government at the crossroads.
Urbach, N., Smolnik, S., & Riempp, G. (2010). An empirical investigation of employee portal
Venkatesh, V., Moris, M.G., & Davis, G.B. (2003). User acceptance of information technology:
Wang, Y., & Liao, Y. (2008). Assessing e-Government systems success: A validation of the
DeLone and McLean model of information systems success. Government Information Quarterly,
25(4), 717-733,
Weerakkody, V., Janssen, M., & Dwivedi, Y. K. (2011). Transformational change and business
process reengineering (BPR): Lessons from the British and Dutch public sector. Government
Wirtz, B.W., Piehler, R., & Daiser, P. (2015). E-Government Portal Characteristics and
Local Administration Portals. Journal of Nonprofit & Public Sector Marketing, 27(1), 70-98.
http://www.worldbank.org/en/topic/ict/brief/e-gov-resources#egov
Yang, K., & Rho, S. (2007) E-Government for Better Performance: Promises, Realities, and
Yildiz, M. (2007). E-government research: Reviewing the literature, limitations, and ways
Abstract: The educational system faces several challenges; one of them is identifying
the factors which have an effect on the students’ performance. This paper aims to
apply the Cross-Industry Standard Process for Data Mining (CRISP-DM) on the
student database from the Education directorate in the Al Karak region, in order to
identify the main attributes that may influence the performance of the student. The
C4.5 Decision Tree has been used to build a classification model that has the ability to
predict the final grade in a computer course.
Key Words: CRISP-DM Model, Data Mining, Classification, C4.5 Decision Trees,
Student Data, Education.
1. Introduction
Data mining can be defined as extracting knowledge from a large database. Recently, there has been growing demand for data mining techniques in various fields, such as education, the telecommunication industry, and banking, to enhance performance (Han et al., 2011).
For example, Calvo-Flores (2006) tried to identify the students who have not passed
courses at the Cordoba University; the Artificial Neural Network (ANN) model was
used to predict the students’ marks, the data having been obtained from the Moodle
logs. The database contained 240 students who had an account on the Moodle system
from Cordoba University who were enrolled in the programming course. However, the scarcity of student patterns forced the authors to add random noise to create new student patterns.
The main goal of this study is identifying the factors affecting the performance of
students, particularly the students’ grade in the computer course, where a
classification model is built to predict the student's grades.
attributes, building the classification model, using the classification model to predict
the student performance and evaluating the classification model (Chapman, 2000).
In this phase, the students' data was collected using questionnaires distributed among the students; the students' ages ranged from thirteen to sixteen (seventh grade to tenth grade). The data was collected from five schools, taking into account the geographic distribution of these schools to cover different areas within Al Karak city.
All the student attributes were available in the questionnaire. About 600 questionnaires were distributed among the students, and after verifying their validity, about 85 questionnaires were manually eliminated due to incomplete data. Therefore, about 515 questionnaires could be used.
Figure 1 shows how the collected data was prepared as a table in the Attribute-Relation File Format (ARFF). After that, the ARFF file was loaded into the WEKA data mining toolkit, as Figure 2 shows.
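The ARFF layout mentioned above is a plain-text header of attribute declarations followed by comma-separated data rows. A minimal sketch that writes and re-reads a tiny hypothetical student file; the attribute names (gender, has_computer, computer_grade) are illustrative placeholders, not the study's actual nine attributes:

```python
# Write a minimal ARFF file (WEKA's Attribute-Relation File Format)
# and read its data rows back using only the standard library.
arff_text = """@relation student_performance
@attribute gender {male, female}
@attribute has_computer {yes, no}
@attribute computer_grade {excellent, good, fail}
@data
male,yes,good
female,no,fail
female,yes,excellent
"""

with open("students.arff", "w") as f:
    f.write(arff_text)

rows = []
with open("students.arff") as f:
    in_data = False
    for line in f:
        line = line.strip()
        if line.lower() == "@data":
            in_data = True       # everything after @data is instance data
        elif in_data and line:
            rows.append(line.split(","))

print(len(rows))   # 3 data rows
print(rows[0])     # ['male', 'yes', 'good']
```

A file in exactly this shape is what the WEKA Explorer expects when the dataset is opened for classification.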
The questionnaire contained about 15 attributes. Since the collected attributes may include irrelevant ones that could reduce the performance of the classification model, a feature selection approach was used to select the most appropriate set of attributes; six attributes were excluded because they were not related. Finally, nine attributes and one class (computer grade) were adopted. The attributes, their descriptions, their possible values, and their types are presented in Table 1.
The WEKA toolkit was used to build the classification model, applying the C4.5 decision tree technique, as seen in Figure 3. The C4.5 implementation in WEKA is known as J48 and has several different configurations, as seen in Figure 4. The decision tree is a good and practical technique since it is fast and can be converted into classification rules (Al-Radaideh, 2006).
The experimental work was repeated many times until a tree model with a good percentage of correct classifications was obtained; after repeating the experiment more than 70 times, a classification model was adopted.
To build the decision tree, the gain ratio was used to determine the most effective attribute for each split and the most appropriate root node; the attribute with the highest gain ratio was the "fail" attribute, so it is located at the top of the decision tree. To build the whole decision tree, this process was repeated several times, as seen in Figure 5.
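Gain ratio, C4.5's splitting criterion, is the information gain of a candidate split divided by the split information of the attribute. A minimal sketch on a toy binary split; the class counts below are illustrative, not taken from the study data:

```python
# Gain ratio (C4.5 splitting criterion): information gain normalized
# by the split information of the candidate attribute.
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

def gain_ratio(parent_counts, child_counts):
    total = sum(parent_counts)
    # Information gain: parent entropy minus weighted child entropies.
    gain = entropy(parent_counts) - sum(
        sum(child) / total * entropy(child) for child in child_counts
    )
    # Split information: entropy of the branch sizes themselves.
    split_info = entropy([sum(child) for child in child_counts])
    return gain / split_info

# Toy example: 14 students (9 pass, 5 fail) split by a binary attribute.
parent = [9, 5]
children = [[6, 1], [3, 4]]   # pass/fail counts in each branch
print(round(gain_ratio(parent, children), 3))
```

The attribute with the highest such value among all candidates becomes the split node, which is how the "fail" attribute ends up at the root of the tree described above.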
The decision tree of the adopted model is presented in Figure 6; the adopted model's size is about 27 nodes and the number of leaves (rules) is 14.
To predict student performance (students' marks), the decision tree was used; the rules were generated by following every path of the decision tree, yielding 14 rules. Table 2 presents the generated rules of the adopted model: the first column gives the rule number, the second the rule itself, the third the number of students who satisfy the rule, the fourth the rule accuracy (the number of students who satisfy the rule divided by the number of students covered by the rule), and the fifth the number of attributes contained in each rule.
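The rule accuracy defined above is simply coverage-normalized correctness. A minimal sketch; the counts are illustrative, not the actual figures from Table 2:

```python
# Rule accuracy: students the rule classifies correctly divided by
# all students the rule covers, expressed as a percentage.
def rule_accuracy(satisfying, covered):
    return satisfying / covered * 100

# Illustrative counts for one rule (not taken from Table 2)
print(round(rule_accuracy(90, 103), 1))  # 87.4
```

The same ratio is computed once per rule and then used to rank the 14 rules from strongest to weakest.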
In Table 2, the rules are arranged by accuracy in descending order: the accuracy of the strongest rule is about 87.4% and that of the weakest rule about 56.8%. The rules were arranged this way in order to identify the most significant rules. Note also that the longest rule contains seven attributes, while the shortest contains just one attribute, which makes it the strongest rule.
In order to evaluate the performance of the adopted model, the classification accuracy is usually used:

Accuracy = C / D

where C is the number of instances classified correctly and D is the number of all instances in the database. Table 3 shows the confusion matrix for the adopted model; from this confusion matrix, we can calculate the accuracy of the adopted model.
Table 3 shows the predicted classes against the actual classes. The highlighted numbers represent all instances classified correctly, totaling 210 instances. The grand total represents the number of all instances (students' marks) in the database, namely 515.
Accuracy = C / D = (210 / 515) × 100 = 40.77%
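The overall accuracy is the sum of the confusion-matrix diagonal over the total instance count. A minimal sketch; the 3×3 matrix below is hypothetical (the real matrix is in Table 3) but is chosen so that its diagonal sums to 210 and its grand total to 515, matching the reported figures:

```python
# Overall classification accuracy from a confusion matrix:
# correct predictions (the diagonal) over all instances.
def accuracy(matrix):
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total * 100

# Hypothetical confusion matrix: diagonal sums to 210, total to 515.
cm = [
    [80, 40, 30],
    [35, 70, 45],
    [50, 105, 60],
]
print(round(accuracy(cm), 2))  # 40.78 (the text truncates 210/515 to 40.77%)
```

Any matrix with the same diagonal sum and grand total yields the same overall accuracy, since the off-diagonal layout only affects per-class error rates.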
We notice that the accuracy of the classification model is not high, which indicates that the collected questionnaires and attributes are not sufficient to generate a high-quality classification model.
3. Conclusion
This paper used data mining techniques to evaluate student performance and to identify the factors affecting student performance at schools belonging to the Education Directorate in the Al Karak region. The resulting classification model can be used to predict students' marks, but the collected data is not sufficient to build a high-quality classification model.
This paper found that the "fail", "father education", and "having computer" attributes are the most effective attributes influencing students' performance, while the rest of the attributes showed no influence on students' performance.
For future work, reliable information should be obtained to enhance the quality of the
classification model, and then the classification model should be combined with the
educational systems to provide deeper knowledge about the students’ behaviour.
References
Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., & Wirth,
R. (2000). CRISP-DM 1.0 Step-by-step data mining guide.
Han, J., Kamber, M., & Pei, J. (2011). Data Mining: Concepts and Techniques. San Francisco, CA: Morgan Kaufmann Publishers.
Mierle, K., Laven, K., Roweis, S., & Wilson, G. (2005). Mining student CVS
repositories for performance indicators. In ACM SIGSOFT Software Engineering
Notes Vol. 30, No. 4, pp. 41-45.
Abstract— The Internet is so diverse that at any given instance someone is clicking a link, opening a file, downloading an email attachment and so forth. Such seemingly benign actions do not always return the expected outcome, because attackers leverage them to spread their malware. Malware today spans a broad spectrum of software with varying characteristics, one of which is Ransomware. Ransomware has come to claim its place in the malware wild due to the philosophy of extortion behind its operations. Ransomware threat actors are seeking ways to deliver their malware payload without generating suspicion through unusual network traffic and system calls, and with little user input, if any at all. Malware-free intrusions present attack vectors desirable to Ransomware threat actors in this respect in that they do not employ extra malicious code which would otherwise be detected by intrusion detection and prevention systems. We in this paper explore the utilization of malware-free backdoors for Ransomware payload delivery over a network with RDP-based remote access. We further show that leveraging such backdoors requires no user input while providing a high probability of success, thus expanding the available attack surface.

I. INTRODUCTION

The rise of the Internet has likewise seen the emergence of related cyber-attacks, and the two are seen not to occupy opposite ends of the continuum. The Internet was initially built without security in mind [1], implying that all technologies that jump onto this bandwagon need to address the associated security concerns in their respective niche, but unfortunately this is not the case. Due to the vast number of technologies integrated into the Internet today, the variety of attacks is extensively wide, correlating with the incepting technologies. There are many metrics and parameters used to classify cyber-attacks, but they can broadly be classified as targeted or non-targeted attacks [2]. Non-targeted attacks usually don't have a specific target and tend to be the work of novices and script kiddies, as opposed to targeted attacks. On the contrary, targeted attacks are the work of highly skilled technical people who might be working on an individual basis, for organized crime groups, for big corporations or even governments. This class of attackers employs sophisticated techniques to compromise and victimize their targets. The use of malicious software in this domain is not uncommon. Attackers use a wide range of malware, not limited to viruses, worms, trojans, rootkits etc., to achieve their ultimate goal. One new breed of malware, coined Ransomware [3], employs a new philosophy altogether, that of extortion, as a means to achieve the end goal. Unlike conventional malware, which usually seeks to replicate, delete files, exfiltrate data or extensively consume system resources, Ransomware imposes some form of denial of service on either the system or system resources, such as files, until a ransom is paid. One class of Ransomware uses encryption to encrypt victim files and demands a ransom before decryption. This type of malware has targeted critical industries [4] where the victim has had to pay as the only way out, due to the vitality of on-demand access to data. Figure 1 below shows the distribution of Ransomware attacks on different sectors of the economy for 2016 [15].

Figure 1. Ransomware Infections by Organization Sector, January 2015 – April 2016 [15]

It is estimated that Ransomware has cost victims millions of dollars [5] while enriching the criminals behind it. As with all cyber-attacks, attacks via Ransomware cast a wide spectrum of attack vectors. These are the ways and means through which Ransomware is spread and delivered to the potential victim. The attacker is therefore tasked with finding optimal ways of infecting victims, and Ransomware is known to use some of the common attack vectors employed by other malware. Some of these attack vectors generate suspicious
network traffic and issue unusual system calls, something very undesirable to attackers as this tends to raise a red flag.

Most malware requires some form of user input to effectively carry out an infection. However, most user systems implement Intrusion Detection Systems (IDS) which detect and alert the user of potential harm if certain actions are performed. This detection is inclusive of Ransomware. Attackers therefore seek to employ methods and tactics which do not require user interaction and generate as little noise as possible, if any at all. For attacks such as those employed by Advanced Persistent Threats (APT), stealthiness and a low threshold of noise are of the essence, as such threat actors seek to maintain an undetected persistent presence for a long time [6]. Therefore an attack vector which is stealthy and less noisy is likewise desirable to Ransomware threat actors. One such attack vector is backdoor implantation leveraging the pre-authentication services available in almost every version of the Windows operating system. The accessibility backdoor in Windows is actualized by replacing an accessibility binary executable with a system file capable of granting system-level access before one even logs in. This backdoor is documented [7] to be present in important sectors such as education, judiciary, government etc., at the disposal of Ransomware threat actors. This attack vector is especially attractive to attackers in that it does not involve any malicious code. This implies that all IDSs which are signature based [8, 9, 10] are incapable of detecting it, and since the rationale behind the backdoor is to utilize system resources and files to covertly achieve a goal, a behavioral-based IDS will find it particularly hard to detect since there is no anomalous behavior to evaluate.

This paper explores the utilization of the aforementioned backdoor attack vector for Ransomware payload delivery. We investigated the delivery of the malware payload in the presence of IDS on different versions of the Windows operating system, from Windows XP to Windows 10. The Windows operating system is chosen as the victim on the pretext that it's the most widely used operating system [11], hence the obvious casualty of such attacks. We leverage the built-in RDP-based remote access functionality in these systems to establish an RDP session without any login at all, deliver the malware payload and confirm the result. We contend that such an attack vector increases the attack surface of Ransomware attacks, with a high probability of inflicting maximum damage without any direct user input.

The rest of the paper is organized as follows: Section II provides background information and concepts, whilst the attack model for Ransomware payload delivery and the analysis thereof are discussed in Section III. Experiment simulations are presented in Section IV, best practices and mitigation techniques are presented in Section V, and we conclude the paper in Section VI.

II. BACKGROUND AND CONCEPTS

The networks that build up the Internet are thought to be like an egg. There is an obvious hard network perimeter that requires penetration into the softer inner core, which permits lateral traversal once the attacker obtains access. Therefore, Ransomware attackers try to find ways of penetrating a target network and will utilize lapses in security configurations. These lapses may include poor security implementations, vulnerabilities imposed by software built without security in mind, social engineering etc. To this effect, Ransomware mainly comes in two flavors: non-encrypting Ransomware, also known as Locker Ransomware, and encrypting Ransomware, also known as Crypto Ransomware. The diagram below in Figure 2 shows the Microsoft report for the relative distribution of Ransomware variants.

Figure 2. Relative distribution of different Ransomware variants. [16]

Using Figure 2, we in this paper consider the most common Ransomware variants of the Locker and Crypto Ransomware, whose characteristics are later documented in Section IV in the analysis stage.

A. Locker Ransomware

This is a less common type of Ransomware which basically locks down the victim's system and its applications while disabling user input, to prevent the user from operating the system at all. The victim is usually extorted with the claim that they have engaged in some cyber-crime such as copyright infringement, child pornography, money laundering etc., and that they need to pay some fee, usually in the form of bitcoin [12], before the charge is disposed of. The emphasis usually is that the victim won't be able to use the system and will be in trouble with the law unless a payment is made. Some variants of this malware are capable of modifying the Master Boot Record (MBR) and even the partition table. Only limited system functionality is made available, such as numeric functions and limited mouse movements, to enable the victim to enter and pay the ransom amount on the displayed Ransomware screen. This strain of Ransomware usually leaves the system and user files uncorrupted [13], and the system can usually be recovered offline via a technical hack or otherwise, because of the weak techniques employed.

B. Crypto Ransomware

This is by far the most common type of Ransomware [14] and employs encryption techniques to achieve resource inaccessibility. This Ransomware variant silently infects the victim and communicates with its Command and Control (C2) servers, if need be, to download the relevant encryption keys. The malware then extracts the keys and encrypts targeted user files, which become inaccessible without the decryption key.
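The key flow just described can be modeled with a deliberately harmless toy: a stub stands in for the C2 round trip, and a SHA-256 keystream stands in for a real cipher, but the essential property, that files become unreadable without the downloaded key, is the same. All names are illustrative and taken from no real sample.

```python
import hashlib

def fetch_key_from_c2(victim_id: str) -> bytes:
    # Stub for the C2 round trip: in a real infection the key is
    # generated server-side; here it is derived locally for the demo.
    return hashlib.sha256(b"c2-demo-secret|" + victim_id.encode()).digest()

def keystream(key: bytes, n: int) -> bytes:
    # Counter-style keystream from repeated hashing (toy, NOT secure).
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])

def xor_transform(data: bytes, key: bytes) -> bytes:
    # XOR with the keystream; applying it twice restores the data,
    # mirroring "decryption upon payment of the ransom" in the text.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

key = fetch_key_from_c2("victim-host-01")
original = b"contents of a targeted user document"
locked = xor_transform(original, key)    # unreadable without the key
restored = xor_transform(locked, key)    # only the key holder can do this
```

The asymmetry of the real scheme, where a public key encrypts and only the attacker's private key decrypts, is what removes any recovery path for the victim; the toy's symmetric XOR only illustrates the dependence on a C2-supplied secret.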
Unlike Locker Ransomware, Crypto Ransomware does not lock down the system; it only encrypts user data and, after completion of encryption, displays a message that the victim's data files are no longer accessible and can only be accessed via decryption upon payment of the ransom. The attacker holds the decryption keys and promises to avail them once the ransom demand is met. Whether the attacker avails the decryption keys upon payment of the ransom is a debate of circumstance, but one thing for sure is that there's no guarantee that the keys will be provided after paying the ransom. The diagram in Figure 3 below shows the general structure of Crypto Ransomware.

Figure 3. General structure of Crypto Ransomware: a main body (payload) carrying an encryption algorithm and an encryption key.

D. Infection Vectors

There are different ways in which attackers deliver their Ransomware payload to the victim. They differ in their degree of complexity and effectiveness. Here we discuss the most prevalent ones and elaborate how the attack vector considered in this paper contributes to the attack surface.

1) Malicious Emails

This is one of the most common ways of Ransomware delivery. The payload is delivered as an attachment of emails sent through spam using botnets and other compromised hosts. The victim is socially engineered into interacting with the attachment: by directly opening an attachment which executes the Ransomware payload, by opening a malicious file which in turn initiates payload delivery via a macro, or by clicking on a URL in the email which redirects to an exploit kit, which in turn runs to find vulnerabilities on the target system and executes the Ransomware payload thereafter. Spam attachments, as shown in Figure 4 below according to Cisco Security Research [25], usually carry files of different formats which could be used to deliver the Ransomware payload.
3) Exploit Kits (EKs)

EKs are software packages which scan for vulnerabilities with the purpose of malware installation upon successful discovery. These scans are run on third-party servers and inject code into different portions of the server depending on the context, which in turn redirects server visitors to the malware. The Angler EK [20], for example, accounted for close to 20 million attacks thwarted by Symantec.

4) Other Attack Vectors

There are many other attack vectors employed to attain a successful Ransomware attack. Some of these include the injection of redirect links in JavaScript, Malvertising, Drive-by-Downloads [21, 22, 23] etc.

III. THE ATTACK MODEL

We now formulate the attack model based on the preferred infection vector from the preceding section. We build our attack model on a set of conceptual units which serve as the basic building blocks of the whole attack process. The delivery of Ransomware to a victim can therefore be envisioned as a process with an attacking agent carrying out the attack by acquiring a set of assets after performing some action, with the sole purpose of reaching the goal: the delivery and successful execution of the malware on the targeted host.

A. Attacking Agent

This is the subject of the attack process who carries out actions towards the object, which might be a host, network or system. The agent can be software, e.g. EKs, a human actor, or a combination of both. We take the agent of our model to be a highly skilled threat actor with a considerable level of sophistication in terms of traceability and stealthiness.

B. Assets

These are resources which the agent requires in order to further the attack. Such resources include, but are not limited to, information about the host such as the operating system, IP address, open ports, TCP/UDP connectivity and so forth. It's important to note that such information might be reusable throughout the attack process, hence the need for constant verification for consistency.

C. Actions

These are requests made by the agent with specified input parameters and an expected return. There are preconditions that have to be met for a given action to return the correct output. The output of an action can be either true, in which case the returned parameters further the attack, or false, where the returned value denotes that further pursuance of the chosen attack vector does not yield fruition.

D. Goals

These are the treasures that the attacker seeks to attain. There's the ultimate goal, Ransomware delivery and execution in our context, but also other sub-goals of the attack process which act as pivots for further attacks. These are only reached when the returned value of a certain action is true.

We now employ an attack tree [24] for our model consideration. The diagram in Figure 5 below shows an attack tree with different infection vectors. The root node is denoted by G0 and is the attacker's ultimate goal. The rest of the nodes and leaves are denoted as follows: G1 – attack via offline or out-of-band payload delivery, G2 – network attack payload delivery, G3 – authentication attack, G4 – payload delivery via USB flash drive, G5 – payload delivery through optical media, G6 – payload delivery via Bluetooth, G7 – payload delivery via EKs, G8 – payload delivery through spam email, G9 – payload delivery via brute-forcing, G10 – payload delivery through malware-free intrusion and Gn+ – payload delivery via other network-based infection vectors.

Figure 5. Attack Tree of Infection Vectors: the root node G0 (system-level access) decomposes into intermediate sub-goal nodes G1, G2 and G3, whose children G4, G5, G6, G7, G8, G9, G10 and Gn+ are leaf nodes representing atomic attacks.

All the intermediate nodes in the resulting graph decompose into children nodes sharing a disjunctive OR association. This implies that only one node needs to be true to traverse to the upper node, meaning that if the route through G1 is traversed, the attacker has the option of starting with G4, G5 or G6. The same is true if the attack route pursued traverses through G2: the attacker can start with the leaves G7, G8 or Gn+.

The adjacency matrix of the resulting graph, with rows and columns ordered (G0, G2, G3, G7, G8, G9, G10, Gn+) as the edges described above imply, is:

     | 0 1 0 0 0 0 0 0 |
     | 1 0 1 1 1 1 0 0 |
     | 0 1 0 0 0 0 1 1 |
AG = | 0 1 0 0 0 0 0 0 |
     | 0 1 0 0 0 0 0 0 |
     | 0 0 1 0 0 0 0 0 |
     | 0 0 1 0 0 0 0 0 |
     | 0 1 0 0 0 0 0 0 |

If the attacker instead opts to use the path through G3, he likewise has the option of starting with either leaf G9 or G10. Pursuance of attack vectors through G1 is practically daunting because of the constraints imposed by the need for physical presence. Moreover, it even limits the target audience which can be reached. We therefore drop this attack path in our
attack consideration and generate the adjacency square matrix AG of the 8th order, shown above, for the resultant attack graph. We thus deduce five attack scenarios from the adjacency matrix AG, corresponding to the following paths:

P7: {G7, G2, G0}
P8: {G8, G2, G0}
Pn+: {Gn+, G2, G0}
P9: {G9, G3, G2, G0}
P10: {G10, G3, G2, G0}

EKs require the existence of an exploit before materialization, meaning a target without the vulnerabilities being sought by the EK won't be susceptible to the attack. We thus drop the first path, P7. Spam email largely depends on user input, and a user with up-to-date Internet hygiene would rarely fall prey to such. Moreover, spam emails are subject to filtering by spam filters, giving no assurance of success of payload delivery. We likewise drop path P8. In authentication attacks, brute-forcing passwords is subject to system lockdown upon multiple failed attempts. We in this regard likewise drop path P9. Malware-free intrusions, on the other hand, do not require any vulnerabilities for exploits and neither do they require any user input. Once the malware-free intrusion backdoor is identified, the attacker can without difficulty deliver his Ransomware payload directly to the victim, hence the actualization of the attack. We therefore base our experiment simulations solely on the attack path P10 in the following section.

IV. EXPERIMENT SIMULATIONS AND ANALYSIS

Our simulation environment consists of two networks separated by a simulated Internet as shown in Figure 6 below. The attacker is located in one subnet whilst the targeted hosts reside in a different subnet altogether. In practice the attacker can reside in the same network as his victims, but that's a rarity from a logical point of view in as far as Ransomware is concerned. The threat actor, the attacking agent defined in the attack model, ran from Kali Linux whilst the targeted hosts ran on Windows XP, Windows Vista, Windows 7, Windows 8 and Windows 10.

Figure 6. Experiment setup for Ransomware delivery: the Ransomware adversary and the victim network reside in separate subnets connected through a switch, router and simulated Internet.

We implant the accessibility backdoor on target hosts via registry manipulations, by setting cmd.exe as the debugger for specified accessibility suite executable binaries and by swapping a specific accessibility suite binary executable with cmd.exe in the %systemroot%\System32\ directory. We further activate RDP-based remote access on the targeted hosts. We acquire and verify Ransomware samples with VirusTotal and Malwr [26, 37] for delivery to the victims. Since the malware is active and harmful, we perform the test in a securely built environment as specified by common scientific guidelines [27, 28, 29, 30], while maintaining limited, regulated Internet access via NAT integrated in VirtualBox. We in addition used the following tools for analysis of the Ransomware: Process Monitor [31] for verification of process activity, Regshot for registry alteration monitoring, API Monitor [32] for observance of issued system calls, and ApateDNS and Netcat [33, 34] for emulating some C2 servers. We use Nmap [35] for reconnaissance attacks, where the actions of the defined attack model returned assets in the form of a list of host IP addresses, operating system types, open ports and running services. We did obscure an RDP port on one of the hosts, and it was discovered to be running RDP services upon probing for the service banner. We also employed the services of an automated script [36] for automated backdoor discovery. We set up an FTP server on the attacker's machine to host the Ransomware payload. The snapshot in Figure 7 below shows successful malware delivery on one of the targeted hosts.

We ran the attack by deploying the Ransomware to the victims using the malware-free intrusion infection. First a reconnaissance attack was carried out, which revealed the list of available hosts and their respective running services upon port scans and banner grabbing. The obscured port likewise revealed that the RDP service was running. The automated script referenced earlier was employed to probe the availability of the backdoor, and five backdoors were discovered on separate hosts. As can be seen in Figure 7, the Ransomware payload file, named Invoice.zip, is a small file in the range of ~10 KB and might not raise suspicion to the benign user. Such files are usually attached to some spam email with a catchy subject to raise interest from the would-be victim.

We observed from the pursued attack vector that it did not require any user action so long as the backdoor was present and the RDP service active. Table I below summarizes some of the known activities and properties of the Ransomware payloads.

TABLE I. RANSOMWARE ATTACK ACTIVITIES

Family Name   Variant   Encryption   Delete Files   MBR Alteration   Steals Info
Cryptowall    Crypto        X
FakeBSOD      Locker                      X              X               X
Brolo         Locker        X             X              X               X
CTB-Locker    Crypto        X             X
Teslacrypt    Crypto        X
Reveton       Locker                      X                              X
Seftad        Locker                      X              X               X
Cerber        Crypto        X             X

Ransomware threat actors might have specific targets in mind, but using this attack vector would increase the surface area of infection, and since the motive is to extort as much money as
possible regardless of the target victim, this attack vector is especially attractive for maximizing profits.

Different Ransomware families have specific file activity upon infection. It's notable that the Locker variant of Ransomware does not employ encryption. The deletion carried out by the various families differs in the target files and in the subject implementing the deletion. Some attackers are known to remotely delete the files upon failure to pay the ransom, while in other cases the Ransomware payload itself deletes target files to reduce any possibility of recovery. Families which employ asymmetric encryption largely depend on the C2 for generation and deployment of encryption keys. The public key is always used for encryption. There are a number of Ransomware strains out in the wild, but the majority belong to existing families, only that they introduce some additional functionalities, e.g. changing from symmetric encryption to

activities shown in Table I. The chain comprises five stages, as shown in Figure 8, otherwise elaborated below as follows:

1) Malware-free Intrusion Backdoor Discovery

The attacker in this case seeks to find the accessibility backdoor which, when invoked, avails a system-level access console via RDP-based remote access. The attacker does not concern himself with the implantation of the backdoor; his main objective is to determine whether the backdoor exists or not. We base this assumption on the fact that this type of backdoor is documented [7, 38] to exist on critical networks of educational institutions, governments, manufacturing industries, the legal sector, gaming companies etc. Moreover, we contend that a determined attacker with a specific target might employ other techniques to achieve this backdoor, or any of this type, which does not necessarily
3) Infection and Initial Execution

Once the Ransomware is delivered to the victim via FTP, the first action it carries out is to beacon out to the C2 servers. This is common for the variants which employ public key encryption. Cases of the Ransomware failing to successfully encrypt are documented [39], where the failure is attributed to an inability to establish contact with the C2 servers. It should be noted at this point that the attacker does not engage further in the attack process but rather awaits the Ransomware to carry out its tasks. Another point worth noting is that some variants are known not to execute instantly upon infection, but hibernate in efforts to avoid detection.

Figure 8. Infection Chain for Ransomware Attack via Malware-free Intrusion Vector: (1) backdoor discovery on the targeted hosts, (2) FTP payload download to the victim, (3) payload execution and C2 beaconing, (4) encryption key download and file encryption, and (5) user notification about the ransom.

4) File Encryption

This is the stage where the payload actually encrypts the targeted files using keys obtained from the C2 servers. The encryption is selective in that it does not encrypt system files but user files, not limited to images, text documents, pdf documents, database files, Excel files, PowerPoint etc. Crypto Ransomware is not known to attack system files since, from a logical point of view, it might then be cumbersome to deliver the ransom notice. Once the files are encrypted, only the attacker with the decryption keys has the means of making the data accessible again, hence the ransom, but there is no guarantee that the attacker will keep their word. Some variants, as shown in Table I, do delete original files while others don't. Others are known to delete shadow files so as to prevent any possibility of the system restoration available in Windows. Crypto Ransomware threat actors suffer from the challenge of key management. Clearly, using the same key for multiple encryptions, hence decryptions, increases the chances of key discovery once a victim pays the ransom, as the key would be reusable. Generating multiple keys introduces the challenge of key management and increases the chances of key discovery.

5) Ransom Notification

Once the Ransomware is done encrypting and deleting files in accordance with its characteristics, a ransom note is delivered to the screen of the victim. Attackers employ a myriad of scare tactics to intimidate the victim into succumbing. Some of the scare tactics employed try to prey on the ignorance of the victim. There is the process of payment, which is carried out in different forms, the most notable being via bitcoin using the Tor network to eliminate any chances of traceability.

With the infection chain of a Ransomware attack via malware-free intrusion defined, we now look at how this infection vector fares in comparison to others in terms of intractability, while putting both the attack and victim into context. Table II below summarizes the comparisons thereof.

TABLE II. INFECTION VECTOR CHARACTERISTICS

Infection Vector         Victim Action   Exploit Dependence   Mule Carrier   Repudiation
Spam Mail                                       X                                 X
Brute-Forcing                  X                X                   X
Exploit Kits
Malware-Free Intrusion         X                X                   X

It's evident from the above table that the malware-free intrusion infection vector shares some commonalities with the brute-forcing vector, all due to the fact that these two methods require first access to the victim's machine. Though this may result in requiring a lot of input parameters for the attack to materialize, the benefits outweigh attack paths pursued via other means. It's worth noting that though these two vectors may share some commonalities, brute-forcing is subject to a lot of hurdles compared to malware-free intrusions, and the latter is therefore a better attack vector.

V. MITIGATION AND BEST PRACTICES

In as far as prevention and detection are concerned, we approach them twofold: against the Ransomware attack itself and against the malware-free intrusion. This is so because, as earlier shown in the attack model, the absence of the malware-free intrusion backdoor implicitly entails that a Ransomware infection vector in this regard would not be feasible.

There are a number of suggested solutions against Ransomware attacks, the majority directed towards prevention rather than recovery. The most echoed of these is "prevention is better than cure", where offline backup of data is strongly emphasized. Offline is stressed due to the fact that some Ransomware families are known to search for any network-attached storage and any network resources, and induce an attack if the target files are present. Offline backup is arguably the best solution because Ransomware variants keep mutating and new ones keep emerging with new techniques altogether. It is also recommended to keep the anti-virus updated so as to include new Ransomware signatures and anomaly behavior in the IDS and IPS engines. Good Internet hygiene is another recommended house-keeping activity; users, whether technical or otherwise, should likewise be educated on the importance of safe Internet browsing, since Ransomware attacks are mainly directed towards the Internet and the users thereof.

One solution against Ransomware attacks is to keep restore points on the system. However, this method works with earlier variants of Ransomware, which did not attack the restore utility in the Windows operating system. Newer mutated and updated versions of Ransomware seek to delete system restore points
via the vssadmin.exe file to prevent system restoration, i.e. the Ransomware depends on local system resources to make recovery impossible. Trivial methods have been suggested [40] to prevent access to the above file by making a backup of the file and renaming the original file. If this is implemented, the Ransomware fails to find the vssadmin.exe responsible for removing restore points, consequently failing to explicitly prevent a system restore. Though the Ransomware at this point might be able to encrypt the targeted files, restoration is achieved by removing the Ransomware payload first, assuming it didn't delete itself, then renaming the backed-up file to vssadmin.exe and performing a system restore. Nevertheless, this is in the hope that the Ransomware doesn't compute hashes to check collisions with the targeted file.

Considering that Crypto Ransomware is the more resilient of the two Ransomware variants, other suggested solutions are process signing and traffic monitoring by the IPS. Crypto Ransomware always tries to beacon back to C2 for further instructions, and in this prevention approach the IPS signs all processes in the system and monitors and logs process activity on the network. Communication to C2 servers can be sighted as unusual traffic and the necessary steps taken to prevent further damage. This technique has shortfalls in that C2 servers are not static resources. Attackers employ different techniques to keep their C2 servers dynamic and difficult to trace. Moreover, C2 servers can be anything from normal compromised user machines to botnets controlled by the attacker. In this case, Deep Packet Inspection (DPI) could be employed to examine the payload of communication with C2s, but DPI is known to be slow and costly for high-bandwidth applications.

Countering malware-free intrusions calls for addressing the three attack vectors that make them possible. Since the intrusion discussed herein is a result of backdoor planting via accessibility tools, ultimate prevention and mitigation call for the prohibition of interactive console access at pre-authentication, most importantly over RDP-based remote access. One way to detect the presence of the backdoor is by hashing all binary executables in the %systemroot%\System32\ directory. Any hash collision is an indicator of compromise. The second method of detecting the backdoor is by checking registry

It's worth noting that security via port obscurity is not forthcoming, due to the fact that service banner probes do reveal the actual running service.

VI. CONCLUSION

Ransomware attacks keep evolving and so do the methods and techniques employed to carry out the attacks. The most common methods of Ransomware payload delivery involve some third party and require some action of the user. Moreover, these types of attack vectors also involve some form of malware for initialization of the attack before Ransomware infection, which might otherwise be detectable by the IDS. Malware-free intrusions introduce a new attack vector desirable to the attacker in that it does not require a third-party mule and neither does it require any action from the user. As demonstrated in this paper, all the attacker needs to do is download the payload once the victim's system has been penetrated. We in this paper explored the accessibility backdoor as a malware-free infection vector where system-level access is gained over an RDP session at pre-authentication, without logging in at all. Since the console accessed at pre-authentication runs with system-level permission, the Ransomware does not need any user action, as it will run under system root, the highest permission in the system. Furthermore, this infection vector does not rely on any exploit whatsoever; all versions of Windows systems are susceptible to the attack via this infection vector so long as RDP-based remote access is activated, and all versions of Windows considered in this paper, from Windows XP to Windows 10, ship with RDP by default. The attacker need not worry about the implantation of the backdoor, because recent study has shown that a lot of systems running on the Internet today have this backdoor, as evidenced by the SHODAN search engine.

Ransomware attacks pursued through this attack vector ought to be countered by addressing the security loopholes resulting from the implantation of the backdoor. Until Microsoft finds a way to implement context detection of cmd.exe execution at pre-authentication, i.e. prohibition of execution of cmd.exe or any other system binary that avails system-level access at pre-login, the backdoor will continue to exist. Since this backdoor
entries to check whether cmd.exe has been set as the debugger has already been used in APT attacks, it only remains to be
to any of the accessibility tools. The presence of such a setting documented for use in Ransomware attacks. Another security
likewise is a clear indication of compromise. Network Level implementation to thwart infection via this vector is ensuring
Authentication (NLA), a feature that has been introduced in system integrity check that a system executable binary capable
newer versions of Windows starting with Vista, prevents of providing system level access at pre-authentication is not
establishment of an RDP session before authentication. This set as a debugger to any of the accessibility tools. Though the
implies that activation of NLA will ultimately see the concept of introduction of NLA in newer versions of Windows
thwarting of this malware-free intrusion. However, it must be somewhat helps prevent the backdoor, it should be extended to
stated that NLA imposes requirements such as belonging to a pre-authentication attacks and not limited to denial of service
network domain and a third entity for credential handling, attacks as originally intended.
requirements that are not befitting to the average independent There are other Ransomware infection vectors that do not
user. This solution therefore can only work in isolated require user action neither a third party carrier, like brute-
instances. forcing, but their effectiveness is hindered by a couple of other
Since one of the actions of the attack model involves RDP factors. Malware-free intrusion promises a better turnout to the
service discovery, and this intrusion is only actualized via attacker.
RDP, closing RDP ports and service will prevent this attack.
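The two backdoor checks described above (hash collisions among System32 binaries, and an accessibility tool whose registry "Debugger" value points at cmd.exe) can be sketched in Python. This is an illustrative sketch, not the authors' tooling: the file contents and registry values are assumed to have been collected elsewhere, and the function names and toy data are hypothetical.

```python
import hashlib

# Accessibility tools commonly hijacked for the pre-authentication backdoor.
ACCESSIBILITY_TOOLS = {"sethc.exe", "utilman.exe", "osk.exe",
                       "narrator.exe", "magnify.exe", "displayswitch.exe"}

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def binary_replacement_iocs(binaries):
    """Check 1: hash collisions inside System32.

    binaries maps an executable name to its file contents (assumed read
    from %systemroot%\\System32\\). An accessibility tool hashing to the
    same value as cmd.exe has been overwritten with a shell."""
    cmd_hash = sha256(binaries["cmd.exe"])
    return sorted(name for name in ACCESSIBILITY_TOOLS
                  if name in binaries and sha256(binaries[name]) == cmd_hash)

def debugger_iocs(ifeo_debuggers):
    """Check 2: Image File Execution Options "Debugger" values.

    ifeo_debuggers maps an executable name to its registry Debugger value
    (None when unset); any accessibility tool with a debugger set is an
    indicator of compromise."""
    return sorted(name for name, dbg in ifeo_debuggers.items()
                  if name.lower() in ACCESSIBILITY_TOOLS and dbg)

# Toy snapshot: sethc.exe has been replaced by a copy of cmd.exe, and
# utilman.exe has cmd.exe registered as its "debugger".
binaries = {"cmd.exe": b"MZ...cmd", "sethc.exe": b"MZ...cmd",
            "osk.exe": b"MZ...osk"}
registry = {"utilman.exe": r"C:\Windows\System32\cmd.exe", "osk.exe": None}
print(binary_replacement_iocs(binaries))  # ['sethc.exe']
print(debugger_iocs(registry))            # ['utilman.exe']
```

Either hit is an indicator of compromise and warrants removing the implant before any restore is attempted.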
324 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
AUTHORS PROFILE

Aaron Zimba is currently a PhD student at the University of Science and Technology Beijing, in the Department of Computer Science and Technology. He received his Master and Bachelor of Science degrees from the St Petersburg Electrotechnical University in St Petersburg in 2009 and 2007 respectively. He is also a member of the IEEE. His main research interests include Network Security Models, Network & Information Security and Cloud Computing Security.
Abstract- A flooding attack is one of the serious threats to network security on web servers: it costs bandwidth and overloads both the user and the service provider's web server. The first step in recognizing a network flooding attack is to apply an Intrusion Detection System (IDS) such as Snort, an open source system that can detect flooding attacks using its own special rules. All activity recorded by Snort is stored in a log file that records all activity in the network traffic. In the investigation stage, these log files are processed with the forensic process model method to find evidence. Analysis of the research scenario identified 15 IP addresses recorded performing illegal actions against the web server. This research therefore successfully detected a flooding attack on the network by performing forensics on the web server.

Keyword: Flooding, IDS, Snort, Network Forensics

I. INTRODUCTION

In this era, threats and attacks on network security keep increasing, because the ease of access to web servers and the wide availability of resources make it easier for hackers to find vulnerabilities and attack them. A web server is an application server that serves content over HTTP or HTTPS, receiving requests from the browser and sending pages back[1]. The web server records the data of every visitor in log files[2]; these log files are very helpful when problems occur on the web server[3]. In this regard, network forensics, a branch of digital forensics[4], uses scientifically proven techniques to collect, identify, test, analyze and document digital evidence from several sources, reconstructing network events in order to find the source of the attack[5].

There are many possible attacks on a web server, one of which is the flooding attack. A flooding attack aims to degrade[6] or stop services, carried out from one computer or from many computers simultaneously, by consuming the resources of the target computer until it can no longer function properly[7]. It is therefore necessary to shed light on and reconstruct the attack through analysis of the attack evidence. To combat this, an Intrusion Detection System (IDS)[8] such as the Snort IDS can be used for the detection and identification of flooding attacks[9]. Snort detects intruders with its own rules, acting as a packet sniffer to observe the data traffic on computer networks[10].

Detection of a flooding attack on a web server is carried out on the forensic evidence using the forensic process model approach, i.e. the forensic phases of gathering information, examination, analysis and reporting[5]. The topic of this research is therefore the detection of a flooding attack on a web server[11], covering the flooding attack detection process and the reconstruction of the attack characteristics from the log file recorded by the Snort Intrusion Detection System (IDS)[12]. The detection process aims to help network administrators minimize the manual work involved in searching the data of each visitor for evidence of a deliberate flooding attack on the web server[13].

The network forensic research was carried out on the computer network of the Bureau of Information and Communication Technology (ICT)[14] at the University of Muhammadiyah Magelang. According to the ICT network administrator, most attacks on the University of Muhammadiyah Magelang are flooding attacks; a network forensic investigation at the university is therefore required.

II. BASIC THEORY
A. Network Forensic
Network forensics is defined in [19] as the capture, recording, and analysis of network events in order to discover the source of security attacks or other problem incidents. In other words, network forensics involves capturing, recording and analyzing network traffic. The network data is derived from existing network security appliances such as a firewall or
intrusion detection system, examined for attack
characterization, and investigated to trace back to the attacker.
In many cases, certain crimes do not break network security policies but might still be legally prosecutable. Such crimes can be handled only by network forensics.
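Capturing and recording typically yields a libpcap capture (the p.cap files discussed later in this paper use this format). As a hedged illustration, not part of the paper's toolchain, a minimal walk over the records of a classic libpcap capture might look like this:

```python
import struct

def read_pcap_records(data: bytes):
    """Yield (timestamp, captured_bytes) for every record in a classic
    libpcap capture. Only the classic (non-pcapng) format is handled."""
    magic, = struct.unpack_from("<I", data, 0)
    if magic == 0xA1B2C3D4:          # file written in little-endian order
        endian = "<"
    elif magic == 0xD4C3B2A1:        # file written in big-endian order
        endian = ">"
    else:
        raise ValueError("not a classic libpcap capture")
    offset = 24                      # the global header is 24 bytes
    while offset + 16 <= len(data):  # each record header is 16 bytes
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset)
        offset += 16
        yield ts_sec + ts_usec / 1e6, data[offset:offset + incl_len]
        offset += incl_len

# Build a one-packet capture in memory to exercise the reader.
header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
record = struct.pack("<IIII", 100, 500000, 4, 4) + b"\x01\x02\x03\x04"
packets = list(read_pcap_records(header + record))
print(packets)  # [(100.5, b'\x01\x02\x03\x04')]
```

In practice an analyst would hand such captures to Wireshark or Snort, as the rest of the paper does; the sketch only shows the record layout being analyzed.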
B. Forensic Process Model
The forensic process model used in this research consists of four phases: collection (gathering information), examination, analysis and reporting[5].

C. Intrusion Detection System (IDS)
An Intrusion Detection System (IDS)[11] is a software application or hardware device that can detect suspicious activity in a network system. An IDS can inspect inbound and outbound traffic in a system or network, analyze it, and find evidence of infiltration attempts[15]. An IDS is passive: it can only detect the presence of an intruder and inform the network administrator that there is an attack on, or disruption to, the network. Intrusion Detection Systems are divided into two types, namely[16]:

Network-based Intrusion Detection System (NIDS)
All traffic flowing into a network is analyzed to find out whether there was an attempted attack or intrusion into the network system.

Host-based Intrusion Detection System (HIDS)
The activities of an individual host on the network are monitored to determine whether they constitute an attempted attack or intrusion.

D. Snort
Snort is software to detect intrusion into a system[17], capable of analyzing traffic in real time and logging IP addresses, and able to analyze ports and detect all sorts of attacks from outside[18]. Snort works in three modes, as shown in figure 1, namely:

Packet sniffer mode
In packet sniffer mode, Snort works as a sniffer to see the data traffic on computer networks.

Packet logger mode
In packet logger mode, packets on the network are recorded to disk rather than analyzed.

Intrusion detection mode
In this mode, Snort serves to detect attacks made through a computer network.

Figure 1: Snort detection

III. METHODOLOGY

The flooding attack detection configuration phase consists of configuring the Snort Intrusion Detection System (IDS). This configuration is performed to detect a flooding attack on a web server; after the Snort IDS has been configured, the next step is to conduct simulated flooding attacks to test whether the Snort IDS has been successfully installed. Snort log files can be saved in a p.cap file, which is then analyzed to obtain the forensic evidence of the intruder on the web server.

A. Intrusion Detection System (IDS) Snort Configuration

The Snort IDS configuration phase is performed so that every data request, whether a legitimate request or an attack, is detected. After configuring Snort, the rules are then configured in accordance with the rule set owned by Snort in order to detect flooding attacks.

B. Flooding Attack Scenario

The flooding attack scenario phase was conducted to test whether the Snort IDS configuration on the web server had been successfully installed. The simulation was performed using the LOIC tool to test whether the Snort IDS detects flooding attacks. The drill began by sending IP packets to a target after selecting the port to be attacked.

Figure 2: Simulation Flooding attack

The flooding attack scenario, as shown in figure 2, was carried out from several directions connected to the Internet, using the LOIC tool. The server of the University of Muhammadiyah Magelang was targeted by the flooding attack simulation to test whether the Snort IDS worked well. During the simulation, flooding attacks were carried out for about 15 minutes.
While the simulated attack ran, the Snort IDS on the target server captured the flooding traffic, producing a Snort log file in p.cap form.
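Once such a p.cap file exists, the per-source rate check that the analysis relies on later in the paper (sources exceeding roughly 30 packets within one second) can be sketched as below. The function name, threshold constant, and toy capture are illustrative assumptions, not the authors' tooling:

```python
from collections import defaultdict

FLOOD_THRESHOLD = 30  # packets from one source within a one-second window

def flag_flooding_sources(packets):
    """packets: iterable of (timestamp_in_seconds, source_ip) pairs.

    Returns the set of source IPs that sent more than FLOOD_THRESHOLD
    packets inside any one-second bucket of the capture."""
    buckets = defaultdict(int)
    for ts, src in packets:
        buckets[(src, int(ts))] += 1   # count packets per (source, second)
    return {src for (src, _sec), count in buckets.items()
            if count > FLOOD_THRESHOLD}

# Toy capture: one noisy source and one quiet one.
capture = [(0.01 * i, "118.96.55.120") for i in range(40)]  # 40 pkts in 1 s
capture += [(0.2 * i, "10.0.0.5") for i in range(5)]        # 5 pkts in 1 s
print(flag_flooding_sources(capture))  # {'118.96.55.120'}
```

The same bucketing idea, applied to the recorded capture, yields the list of offending addresses reported in the results.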
Examination Phase
In the examination phase, forensic investigators use the Snort IDS to examine the log file found in the Snort capture (p.cap) by entering parameters that are passed to Snort. The inspection process goes through the phases shown in figure 5.
Figure 3: Topology University of Muhammadiyah Magelang
Figure 6: Load Average

After the Snort log files have been recorded, the log file is retrieved and analyzed using Wireshark to obtain the forensic evidence. The capture shows requests exceeding 30 packets in one second. When this is detected, the Snort rules give a warning message in the alerts, as shown in figure 7.

Figure 8: UDP Follow

From the collected lines, any single line can be selected to analyze each part of the frame representing a flooding attack packet. For the IP address 118.96.55.120, the frame length is in the 70-byte range (74 Bytes). In the Internet Protocol Version 4 section, the source IP reads 118.96.55.120 and the destination IP address is 203.6.149.136, with a header length of 20 Bytes and a total length of 60. In the User Datagram Protocol section, the source port reads 52883 and the destination port reads 80. If the filter ip.src == 118.96.55.120 is applied again and another frame is inspected, the source port is not fixed, but it stays within a large range (ports 51000-64000). Analysis of the log file identified 15 IP addresses that carried out illegal flooding attacks against the web servers.

In addition, the analysis continued with the endpoint statistics module in Wireshark, which was used to collect the attack packets contained in the Snort Intrusion Detection System (IDS) log files recorded during the attack simulation. Figure 9 below shows that each IP address has a different load for each packet and a different speed in its bytes.

V. CONCLUSION

The IDS applied in the scenario of this study worked as expected: the system can record the activities of the network in the form of log files with the p.cap extension, and these files can be analyzed with the Wireshark tool. Based on the analysis that has been done, it was found that 15 IP addresses performed illegal actions against the web server, which led to traffic overload.

By applying the forensic process model, an IDS on a web server can be used to help meet the forensic needs of the University of Muhammadiyah Magelang; in addition, the administrator can monitor and prevent future attacks.
TABLE 10: SNORT LOG FILE

No. | Timestamp        | Source IP      | Dest. IP    | Protocol | Src Port | Dst Port | Payload
 1  | 7/10/2016 17:26  | 203.6.149.140  | 203.x.x.136 | ICMP     | -        | -        | 40ddf957603e0e006e69746f72696e6763616374692d6d6f…
 2  | 7/10/2016 16:32  | 112.78.32.170  | 203.x.x.136 | UDP      | 52658    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
 5  | 8/10/2016 19:45  | 180.253.133.16 | 203.x.x.136 | UDP      | 60052    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
 6  | 8/10/2016 20:26  | 180.253.128.44 | 203.x.x.136 | UDP      | 63749    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
 7  | 8/10/2016 20:09  | 180.254.95.85  | 203.x.x.136 | UDP      | 53820    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
 8  | 8/10/2016 20:15  | 180.254.89.62  | 203.x.x.136 | UDP      | 61246    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
 9  | 8/10/2016 20:51  | 180.254.66.63  | 203.x.x.136 | UDP      | 54948    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
10  | 8/10/2016 20:07  | 36.73.104.81   | 203.x.x.136 | UDP      | 53817    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
11  | 8/10/2016 20:08  | 36.73.54.59    | 203.x.x.136 | UDP      | 53814    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
12  | 8/10/2016 20:20  | 36.81.87.139   | 203.x.x.136 | UDP      | 63748    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
13  | 8/10/2016 20:25  | 36.81.26.141   | 203.x.x.136 | UDP      | 63756    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
14  | 10/10/2016 6:34  | 36.81.47.197   | 203.x.x.136 | UDP      | 55291    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
15  | 10/10/2016 6:34  | 36.81.35.5     | 203.x.x.136 | UDP      | 56328    | 80       | 69732066696e6520746f6f2e204465737564657375646573...
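The repeated UDP payload prefix in Table 10 is printable ASCII. Decoding it (a quick illustrative check in Python, not part of the paper's procedure) shows a fragment of the well-known default message string sent by the LOIC tool used in the simulation:

```python
# Hex payload prefix taken from the UDP rows of Table 10.
payload_hex = "69732066696e6520746f6f2e204465737564657375646573"
message = bytes.fromhex(payload_hex).decode("ascii")
print(message)  # is fine too. Desudesudes
```

The decoded fragment matches the tail of LOIC's default payload text, corroborating that the captured flood came from that tool.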
REFERENCES
[1] J. D. Ndibwile and A. Govardhan, "Web Server Protection against Application Layer DDoS Attacks using Machine Learning and Traffic Authentication," pp. 261–267, 2015.
[2] A. Iswardani and I. Riadi, "Denial Of Service Log Analysis Using Density K-Means Method," vol. 83, no. 2, pp. 299–302, 2016.
[3] T. A. Cahyanto and Y. Prayudi, "Web Server Logs Forensic Investigation to Find Attack's Digital Evidence Using Hidden Markov Models Method," Snati, pp. 15–19, 2014.
[4] K. K. Sindhu and B. B. Meshram, "Digital Forensics and Cyber Crime Datamining," vol. 2012, no. July, pp. 196–201, 2012.
[5] R. Utami Putri and J. E. Istiyanto, "Network Forensic Analysis Case Studies SQL Injection Attacks on Server Universitas Gadjah Mada," Int. J. Comput. Sci. Secur., vol. 6, no. 2, 2012.
[6] R. K. Idowu, R. C. M, and Z. A. L. I. Othman, "Denial Of Service Attack Detection Using Trapezoidal Fuzzy Reasoning Spiking Neural P," vol. 75, no. 3, pp. 397–404, 2015.
[7] E. Lee, "Detection Of Flooded Areas From Multitemporal Sar Images," 2016 Second International Conference on Science Technology Engineering And Management, 2016.
[8] V. Shah and A. K. Aggarwal, "Heterogeneous fusion of IDS alerts for detecting DOS attacks," Proc. - 1st Int. Conf. Comput. Commun. Control Autom. ICCUBEA 2015, pp. 153–158, 2015.
[9] A. Dewiyana, A. Hadi, U. Mara, U. Mara, and U. Mara, "IDS Using Mitigation Rules Approach to Mitigate ICMP Attacks," 2013.
[10] A. Saboor, M. Akhlaq, and B. Aslam, "Experimental Evaluation of Snort against DDoS Attacks under Different Hardware Configurations," pp. 31–37, 2013.
[11] N. M. Lanke and C. H. R. Jacob, "Detection of DDOS Attacks Using Snort Detection," vol. 2, no. 9, pp. 13–17, 2014.
[12] A. I. Technology, T. Nadu, and T. Nadu, "Flow Based Multi Feature
Inference Model For Detection Of DDoS Attacks In Network Immune," vol. 67, no. 2, pp. 519–526, 2014.
[13] S. Sharma, "On Selection of Attributes for Entropy Based Detection of DDoS," pp. 1096–1100, 2015.
[14] "Guide to Integrating Forensic Techniques into Incident Response."
[15] "Introduction to Snort A. Sniffer Mode," pp. 1–11.
[16] H. Toumi, A. Eddaoui, and M. Talea, "Cooperative Intrusion Detection System Framework Using Mobile Agents For Cloud Computing," vol. 70, no. 1, 2014.
[17] H. A. D. Eugene C. Ezin, "Java-Based Intrusion Detection System in a Wired Network," vol. 9, no. 11, 2011.
[18] B. Khadka, C. Withana, A. Alsadoon, and A. Elchouemi, "Distributed Denial of Service attack on Cloud: Detection and Prevention," 2015.
[19] K. Nguyen, D. Tran, Ma, and D. Sharma, "An Approach to Detect Network Attacks Applied for Network Forensics," pp. 655–660, 2014.
∗ Department of Mathematics, Faculty of Science, Al-Azhar University, Cairo, Egypt
2 [email protected]
Abstract—Recently, many mobile applications have appeared that consume considerable computing power and energy. Offloading schemes therefore move computing and data storage away from the mobile phone to a powerful cloud server with large storage space, by offloading methods or services to the cloud server. However, offloading might consume more energy than local processing of the data when the size of the code is small. This paper introduces a new offloading scheme based on decision making approaches. The scheme considers each service or process in a mobile application as a set of methods and can offload a group of methods based on a defined cost model. It enumerates the set of solutions based on decision making approaches to find all feasible solutions which satisfy the cost model, and then selects the best feasible solution among them. The conducted simulation results show that the offloading performance of the proposed scheme is much better than that of the local processing scheme.

I. INTRODUCTION

Mobile Cloud Computing refers to an infrastructure where both the data storage and the data processing happen outside of the mobile device. Mobile cloud applications move the computing power and data storage away from mobile phones to the cloud, which brings applications and mobile computing not just to smartphone users but to a much broader range of mobile subscribers [1]. Computation offloading moves intensive methods of mobile applications to run remotely on a rich resource such as the cloud. In the case of code compilation, offloading might consume more energy than local processing when the size of the code is small, so offloading is not always an effective way to save energy on a mobile device [2]. For example, when the size of the altered code after compilation is 500 KB, offloading consumes about 5% of a device's battery for its communication, while local processing consumes about 10% of the battery for its computation; in this case, offloading can save up to 50% of the battery. However, when the size of the altered code is 250 KB, the efficiency is reduced to 30%. Computation offloading may also require a large amount of data to be transferred at runtime, in which case higher latencies may occur.

In recent years, many studies have appeared that support remote execution for mobile applications on the cloud in order to increase performance and reduce energy consumption [3], [4]. Generally, there are two main approaches to performing remote execution. The first approach is to use full-process or full-VM (Virtual Machine) migration, as in CloneCloud [5]: the full process or VM is migrated to the rich infrastructure for remote execution. The second approach offloads only the intensive methods or services of an application for remote execution [6], [7], [8]. This approach leads to large energy savings because it is fine-grained: only the sub-parts that benefit from remote execution are moved [6], [7], [8].

To solve these problems, this paper proposes a method-based offloading scheme for mobile applications. The proposed scheme considers each service or process in a mobile application as a set of methods and can offload a group of methods based on a defined cost model. It enumerates the set of solutions based on decision making approaches to find all feasible solutions which satisfy the cost model, and then selects the best feasible solution among them.

The rest of this paper is organized as follows: related work is introduced in Section II. Section III describes the offloading problem formulation in MCC. Section IV explains the proposed scheme. Section V introduces the application scenario using the proposed scheme. Section VI introduces the performance and qualitative evaluation of the proposed scheme, and Section VII concludes the paper.

II. RELATED WORK

In recent years, many studies have appeared to support remote execution for mobile applications on the cloud [5], [6], [7], [8]. In the rest of this section, these related works are introduced in detail.

A. CloneCloud

CloneCloud was introduced by B. Chun [5] in 2011. The concept of CloneCloud is based on creating a virtual Smartphone, called a clone, on the cloud. The clone has more hardware, software, network and energy resources, which provides a more suitable environment for processing complicated tasks. The partitioning mechanism in CloneCloud divides the application into software blocks based on intensity of energy consumption or
332 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
computing. Some of these blocks run on the smartphone and others run on the clone. Once the virtual clone is available, the computing- or energy-intensive blocks are offloaded to the cloud for processing. Once those execution blocks have completed, the output is passed from the clone on the cloud back to the smartphone.

The main disadvantages of the CloneCloud approach are: (1) it cannot access native resources that are not already virtualized and are not available on the clone; (2) an offloading decision may be wrong if the offloaded task consumes more energy than running locally; (3) computation offloading may require a large amount of data to be transferred at runtime, and then higher latencies may occur.

B. Giurgiu et al. Model

Giurgiu et al. [7] proposed a model that focuses on offloading intensive parts of applications to execute remotely on the cloud/server to optimize latency, data transfer delay and cost. The core of this model uses the R-OSGi [9] and AlfredO [10] frameworks for the management and deployment of applications. R-OSGi is an enhanced version of OSGi that supports multiple VMs residing on distributed servers, whereas the primary objective of OSGi is to assist with the decomposition and coupling of applications into modules, called bundles. The proposed model divides each mobile application into a presentation layer, a logical layer and a data access layer. AlfredO distributes the bundles of these layers between the smartphone and the server. The bundles of the presentation layer reside on the smartphone, while the bundles of the logical layer are distributed between the server and the smartphone. Moreover, the bundles of the data layer are fully deployed on the server to minimize the data access delay.

The main disadvantages of Giurgiu et al. are: (1) the cost model has focused so far only on the mobile device and has assumed the server's resources to be infinite; (2) CPU consumption and energy consumption are not included in the optimization problem; (3) offloading bundles needs high bandwidth.

C. MAUI Model

MAUI [6] enables developers to produce an initial partitioning of their applications with minimal effort. The developer simply annotates as remoteable those methods that the MAUI runtime should consider offloading to a MAUI server to minimize the energy consumption of mobile devices. In MAUI, the application partitioning is dynamic and the offloading is done on the basis of methods instead of complete application modules, to minimize the offloading delay. However, MAUI creates two versions of the smartphone application, for local and remote execution. In MAUI, the mobile device consists of three main components: the solver interface, the profiler and the client proxy. The solver interface provides interaction with the solver and facilitates the offloading decision making. The profiler collects information regarding the application's energy consumption and data transfer requirements. The client proxy deals with the method offloading and data transfer. Similarly, the server side consists of a profiler, a server proxy, a solver and a controller. The profiler and server proxy work similarly to those on the smartphone. The solver is the main decision engine of MAUI: it holds the call graph of the applications and the scheduled methods. Lastly, the controller is responsible for authentication and resource allocation for incoming requests. However, offloading a single method is less beneficial than offloading multiple methods. Another weakness of MAUI is that if the programmer forgets to mark methods for remote execution, MAUI will not be able to offload them. Also, MAUI does not consider the execution time in its optimization cost, although it can predict the execution of a method. Moreover, the MAUI profilers consume processing power, memory and energy, which is an overhead on the smartphone.

D. AMBEO

AMBEO [8] divides an application into three layers: (1) a presentation layer, which contains the user interface and resides on the smartphone; (2) a logical layer, which contains computation methods and is distributed between the cloud and the smartphone according to a determined optimal cost that takes the memory constraint into account; and (3) a data layer, which contains data and data access methods and is fully deployed on the cloud to minimize data access over the data layer. Instead of offloading a whole service or a whole application to the cloud, AMBEO works with the methods of each service by adaptively offloading some logical-layer methods of a service based on a determined cost model. However, AMBEO decides the placement of each method separately based on its local and remote costs, taking into account its needed memory and the available memory; if another method needs to run at the same time, the decision may be wrong.

III. OFFLOADING PROBLEM IN MCC

In MCC, the decision of computation offloading is an extremely complex process and is affected by the nature of the application. Therefore, the offloading problem is how services or methods can be offloaded such that mobile devices save energy while keeping high service performance and minimum time delay.

A. Assumptions and Models

The MCC model consists of a mobile node MN and a cloud server node CS. MN can communicate with CS using advanced wireless technology to exchange data, applications, and services. In addition, the assumptions that must be met in this MCC model are: (1) developers can apply the Model-View-Controller (MVC) [11] design pattern explicitly and rigorously to isolate the application's logical layer from the user interface and data layer of any mobile application; (2) any method that interacts with a user or needs to access device hardware belongs to the user interface; (3) the energy consumption of each hardware component of MN, such as the LCD, CPU and Wi-Fi, can be measured separately by using a measurement application model for the energy
consumption on a mobile device (e.g., an Android phone) on the fly [12]; and (4) any mobile application consists of a set of methods, and each method consists of a set of instructions that can be determined at run time. The set of methods in the logical layer of a certain service which can be offloaded is denoted as SM = {m_i, 1 ≤ i ≤ n}. Each m_i ∈ SM has several metadata properties, such as memory cost mem_i and code size cod_i. The number of instructions in the code of a method i is denoted as I. The data sizes that a method i needs to send and receive are denoted by send_i and recv_i, respectively. In addition, the speed of executing any instruction by a mobile node is denoted by Speed_MN. Finally, there are n methods that can be offloaded for an application or a service.

B. Problem Formulation

To formulate the offloading problem in MCC, the local and offloading costs for methods are first modeled based on the aforementioned assumptions and models. The local execution time T_{i,local} for a method i can be determined using the number of instructions I and the mobile execution speed Speed_MN as follows:

    T_{i,local} = I / Speed_{MN}    (1)

The local energy consumption E_{i,local} for a method i can be determined from the local execution time:

    E_{i,local} = P_{i,local} * T_{i,local}    (2)

where P_{i,local} is the power for local execution per second. Here, x_i is introduced for a method i to indicate whether the method is executed locally or remotely (x_i = 1 if method i runs remotely and x_i = 0 if it runs locally). Using x_i, the data sizes which will be sent, D_{i,s}, and received, D_{i,r}, are defined as follows:

    D_{i,s} = send_i * x_i    (3)

    D_{i,r} = recv_i * x_i    (4)

The time cost of offloading a method i to the remote cloud, T_{i,offload}, can be expressed as the sum of the waiting time for getting results from the cloud and the transfer time (including sending and receiving) as follows:

    T_{i,offload} = I / Speed_{cloud} + D_{i,s} / B_{i,s} + D_{i,r} / B_{i,r}    (5)

where Speed_cloud is the remote (cloud) execution speed, and B_{i,s} and B_{i,r} are the bandwidths for sending and receiving, respectively.

The energy cost of offloading a method i to the remote cloud, E_{i,offload}, can be expressed as the sum of the energy consumed while waiting for results from the cloud, E_{i,idle}, and the energy consumed transferring data (sending E_{i,s} and receiving E_{i,r}):

    E_{i,offload} = E_{i,s} + E_{i,idle} + E_{i,r}    (6)

where E_{i,idle} is the idle time of the mobile device while waiting to get a result from the cloud, t_{i,idle}, multiplied by the idle power consumption per second, P_{i,idle}:

    E_{i,idle} = P_{i,idle} * t_{i,idle}    (7)

The energy consumption for sending data, E_{i,s}, is the time for sending data from the mobile to the cloud, t_{i,s}, multiplied by the power consumption for sending per second, P_{i,s}:

    E_{i,s} = P_{i,s} * t_{i,s}    (8)

The energy consumption for receiving data from the cloud, E_{i,r}, is the time for receiving data from the cloud, t_{i,r}, multiplied by the power consumption for receiving per second, P_{i,r}:

    E_{i,r} = P_{i,r} * t_{i,r}    (9)

The idle time of the mobile device while waiting for a result from the cloud server can be treated as the execution time on the remote cloud, so E_{i,offload} can be written as follows:

    E_{i,offload} = P_{i,idle} * I / Speed_{cloud} + P_{i,s} * D_{i,s} / B_{i,s} + P_{i,r} * D_{i,r} / B_{i,r}    (10)

Using Eq. 1 and Eq. 5, the total execution time cost for n methods is

    C_time = Σ_{i=1}^{n} ( T_{i,local} * (1 - x_i) + T_{i,offload} * x_i )    (11)

Also, using Eq. 2 and Eq. 10, the total energy consumption cost for n methods is

    C_energy = Σ_{i=1}^{n} ( E_{i,local} * (1 - x_i) + E_{i,offload} * x_i )    (12)

Note that the memory cost on the mobile device for the n methods that run locally can be calculated as follows:

    C_memory = Σ_{i=1}^{n} mem_i * (1 - x_i)    (13)

In addition, the data transfer cost for remote execution of n methods includes the transfer cost between related methods which are not at the same execution location, i.e., when the output of one method is an input of another. It is determined by the following equation:

    C_transfer = Σ_{i=1}^{n} cod_i * x_i + Σ_{i=1}^{n} Σ_{j=1}^{k} tr_i * (x_i XOR x_j)    (14)

where k is the number of related methods. Using Equations 11, 12, 13 and 14, the total overall cost for n methods can be written as:

    C_total = C_transfer * W_tr + C_memory * W_mem + C_energy * W_energy + C_time * W_time    (15)
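As a concrete illustration, the cost model above can be sketched in Python. This is a minimal sketch, not the paper's implementation: all numeric parameters (speeds, powers, bandwidths) and the two-method table are invented for demonstration, the related-method XOR term of Eq. 14 and the energy constraint of Eq. 19 are omitted for brevity, and the decision-tree search of Section IV is realized as brute-force enumeration of all 2^n assignments.

```python
from itertools import product

# Assumed parameters (illustrative only, not values from the paper).
SPEED_MN = 1e8        # mobile execution speed, instructions/s
SPEED_CLOUD = 1e9     # cloud execution speed, instructions/s
B_S = B_R = 1e6       # send/receive bandwidths B_{i,s}, B_{i,r}, bytes/s
P_LOCAL, P_IDLE, P_S, P_R = 0.9, 0.3, 1.0, 0.8   # powers, watts
WEIGHTS = dict(w_tr=0.25, w_mem=0.25, w_energy=0.25, w_time=0.25)  # Eq. 16

def t_local(m):       # Eq. 1: T_{i,local} = I / Speed_MN
    return m["I"] / SPEED_MN

def e_local(m):       # Eq. 2: E_{i,local} = P_{i,local} * T_{i,local}
    return P_LOCAL * t_local(m)

def t_offload(m):     # Eq. 5: cloud execution time + send time + receive time
    return m["I"] / SPEED_CLOUD + m["send"] / B_S + m["recv"] / B_R

def e_offload(m):     # Eq. 10: idle + send + receive energy
    return (P_IDLE * m["I"] / SPEED_CLOUD
            + P_S * m["send"] / B_S
            + P_R * m["recv"] / B_R)

def total_cost(methods, x):
    # Eqs. 11-15; C_transfer keeps only the code-size term of Eq. 14.
    c_time = sum(t_local(m) * (1 - xi) + t_offload(m) * xi
                 for m, xi in zip(methods, x))
    c_energy = sum(e_local(m) * (1 - xi) + e_offload(m) * xi
                   for m, xi in zip(methods, x))
    c_memory = sum(m["mem"] * (1 - xi) for m, xi in zip(methods, x))
    c_transfer = sum(m["code"] * xi for m, xi in zip(methods, x))
    w = WEIGHTS
    return (c_transfer * w["w_tr"] + c_memory * w["w_mem"]
            + c_energy * w["w_energy"] + c_time * w["w_time"])

def best_feasible_path(methods, avail_memory):
    """Walk all 2^n leaves of the decision tree, drop memory-infeasible
    assignments (Eq. 18), and return the cheapest remaining one."""
    best_x, best_c = None, float("inf")
    for x in product((0, 1), repeat=len(methods)):
        mem = sum(m["mem"] * (1 - xi) for m, xi in zip(methods, x))
        if mem > avail_memory:
            continue  # infeasible path
        c = total_cost(methods, x)
        if c < best_c:
            best_x, best_c = x, c
    return best_x, best_c

# Two hypothetical methods: m0 is compute-heavy with little data,
# m1 is light but data-heavy. mem and code in KB, send/recv in bytes.
methods = [
    {"I": 5e7, "send": 2e5, "recv": 1e5, "mem": 40, "code": 100},
    {"I": 5e5, "send": 8e5, "recv": 8e5, "mem": 10, "code": 150},
]
x_loose, _ = best_feasible_path(methods, avail_memory=60)
x_tight, _ = best_feasible_path(methods, avail_memory=45)
```

With these illustrative numbers the transfer and memory terms dominate: with 60 KB of free memory both methods stay local, x = (0, 0), while tightening the memory to 45 KB forces the compute-heavy method to the cloud, x = (1, 0).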
In Eq. 15, W_tr, W_energy, W_time and W_mem are the weights of the transfer, energy, time and memory costs, respectively. These weights represent the importance of these costs in the offloading process, and their sum must equal 1:

    W_tr + W_mem + W_energy + W_time = 1    (16)

Here, the solution x_1, x_2, ..., x_n represents the required offloading partitioning of the application.

The objective of the offloading problem in MCC is to minimize the overall cost C_total as much as possible while taking the resource constraints of the mobile device into account. So, the objective function of this problem can be written as follows:

    min_{x_i ∈ {0,1}} C_total    (17)

such that

    Σ_{i=1}^{n} mem_i * (1 - x_i) ≤ avail_memory    (18)

    C_energy ≤ avail_energy    (19)

where constraint 18 means that the memory cost of the resident methods cannot exceed the available memory on the mobile device, and constraint 19 means that the energy cost cannot exceed the available energy of the mobile device.

IV. PROPOSED SCHEME

Here, a new scheme called the Application Decision Making-Based Scheme for Method Offloading (ADBMO) is proposed to solve the offloading problem formulated in the previous section.

A. Basic Idea

The basic idea of ADBMO rests on five points: (1) dividing each mobile application into three layers: a presentation layer, a logical layer and a data access layer; (2) considering each service or process in each layer as a set of methods; (3) the methods of the presentation layer reside on the mobile device; (4) the methods of the data layer are fully deployed on the cloud to minimize data access; and (5) the methods of the logical layer are distributed between a cloud server and a mobile device using a decision tree that enumerates the set of solutions based on decision-making approaches to find all feasible solutions that satisfy the cost model, and then selects the best feasible solution among them, as shown in Fig. 1.

Fig. 1: Decision tree (each method m_i, from m_1 down to m_n, contributes two branches: a local-cost edge and an offloading-cost edge)

B. Problem Reformulated

Using the decision tree, the problem can be reformulated as finding the path (solution) with the minimal total cost, as follows.

Objective: find the minimal total cost path of n methods, MP_bf(n), such that

    MP_bf(n) ∈ FP(n)    (20)

    C_{total,MP_bf(n)} = min { C_{total,fp_h(n)} } ∀ fp_h(n) ∈ FP(n)    (21)

    Σ_{i=1}^{n} mem_{i,MP_bf(n)} ≤ avail_memory    (22)

where FP(n) = {fp_h(n) : 1 ≤ h ≤ H} is the set of all feasible paths (solutions) over the n methods in the decision tree, and H is the number of feasible paths. Constraint 21 means that the selected path has the minimum cost among all feasible paths. Constraint 22 means that the memory cost of the resident methods cannot exceed the available memory on the mobile device, where mem_{i,fp_h(n)} = 0 if method m_i in this path is offloaded to the cloud server.

C. Proposed Algorithm

ADBMO consists of two phases, based on its basic idea, as follows.

• Phase 1: Profile phase, which determines at run time the current state of the mobile device, the cloud server, each method in the logical layer, and network conditions such as bandwidth. Before each method is invoked, ADBMO determines whether the invocation should run locally or remotely. ADBMO measures the characteristics of the mobile device and the cloud server at initialization time, and it continuously monitors the network characteristics, because these can often change and a stale measurement may force the algorithm to make the wrong decision on whether a method should be offloaded. Therefore, the profile phase contains four profiling components:
  – Device Profiling: In this profiling, ADBMO determines the energy consumption of the mobile device by using a measurement application model for on-the-fly energy measurement on a mobile device (e.g., an Android phone), such as PowerTutor [12]. In addition, it measures the processor speed, the available memory of the mobile device, battery consumption, the amount of data transferred and the memory used.
  – Cloud Profiling: In this profiling, ADBMO determines the value of the processor speed of the cloud
server. ADBMO assumes that this value can be determined by some means from the cloud.
  – Method Profiling: In this profiling, ADBMO determines whether a method belongs to the presentation layer, the logical layer or the data layer by using metadata stored in a manifest file and Java reflection, as in OSGi [13], which has traditionally been used to decompose and loosely couple Java applications into software modules. Then, for each method that belongs to the logical layer, it determines the characteristics of the method, such as code size, memory used and amount of data transferred.
  – Network Profiling: In this profiling, ADBMO monitors the network and gathers all information about it, such as the Internet connection availability of the mobile device and the current bandwidth of the network, to assess its quality.

Fig. 2: ADBMO phases

• Phase 2: Decision phase, which determines whether each method will run on the mobile device or on the cloud based on the information calculated in the profile phase and the local and offloading costs described in Section III. In this phase, the offloading decision for a method depends on four factors: 1) characteristics of the mobile device; 2) characteristics of the cloud; 3) characteristics of the methods; and 4) characteristics of the network, such as network bandwidth. For each method there are two execution cases: the first execution time case, which means the method runs for the first time, and the consequent execution time case, which means the method runs for the second time or more. These two cases are described as follows.
  – First execution time case: If a method runs for the first time, the methods of the logical layer are handled as follows. (1) If a disconnection occurs, ADBMO runs the method on the mobile device; in this case, the application's energy consumption only incurs a small penalty cost due to the attempted offloading to the cloud. (2) If a cloud server is available, ADBMO executes the following steps: (i) calculating the local time and local energy consumption costs using Eqs. 1 and 2, as well as the local memory cost for this method; (ii) calculating the offloading time and offloading energy consumption costs using Eqs. 5 and 10, as well as the data transfer delay cost, which is represented by the sum of the code size and the sending and receiving data costs; (iii) enumerating the set of solutions using the decision tree, in which each node represents a method, each method has two edges (one for the local cost and another for the offloading cost), and paths from the root to the leaves represent the set of solutions; (iv) deleting the infeasible solutions, i.e., those whose total memory cost of methods is larger than the available memory; (v) finding the feasible path that has the best value among all feasible paths; (vi) saving the offloading decision (i.e., the value of x_i) and the actual execution time of the method, T_act. This actual execution time can be used to determine the actual number of instructions, I_act, of this method with respect to the number of instructions defined in Eq. 1 or Eq. 5 according to x_i.
  – Consequent execution time case: The method runs for the second or a later time. In this case, there are two cases for the model parameters, such as available memory or network bandwidth. (1) Changed case: the values of these parameters have changed from their values in the previous run. In this case, ADBMO repeats the two phases, profiling and decision, as in the first execution. (2) Unchanged case: the values of these parameters have not changed from their values in the previous run. In this case, ADBMO compares the number of instructions of the method, I (determined in the method profiling step), with the actual number of instructions, I_act, to determine the degree of change of these parameters. If the result is in the interval [0.0, 0.2], the change is called Low; if the result is in the interval [0.21, 0.6], the change is called Medium; otherwise the change is called High. In case of a Low change, ADBMO takes the same decision as in the previous execution, while in case of a Medium or High change, ADBMO repeats its two phases as in the first execution, as shown in Fig. 2.

V. APPLICATION SCENARIO: MOBILE FACE RECOGNITION SYSTEM

Identification of people is a major challenge faced by the visually impaired. The increase in the computation capability of cloud servers and mobile devices motivates the development of applications that can assist visually impaired persons. Here, the proposed face detection and recognition application is designed to take advantage of ADBMO and is intended to assist visually impaired users in locating and identifying people that they know, as follows.
Presentation layer (user interface): in this layer, the application accesses the video feed from the device's camera, and after the person is identified, the application displays the results to the user. This layer runs on the mobile device.

Logical layer (detection): the main goal of this layer is to detect faces. Face detection can be regarded as a specific case of object-class detection, in which the task is to find the locations and sizes of all objects in an image that belong to a given class. Face detection algorithms focus on the detection of frontal human faces. When a face is detected, a bounding box is drawn around it; this bounding box is used to extract and save the face of a person by cropping the area inside it. Once a face is detected, the next detection is performed after a delay to avoid overwhelming the user with constant detection notifications. The methods of this layer need high computation resources and consume time and energy, so all methods in this layer are distributed between a cloud server and a mobile device by using ADBMO.

Data layer (recognition): this layer complements the detection by identifying the person using the detected face. The image is matched against the images stored in a database. This is done by running the recognition program, which searches the internal database for a match using the saved face image as input. This layer has two states, offline and online, based on the status of the network connection. In the offline state, after a face is detected, a temporary image is captured and saved. After the image is saved, it is used to identify the person by searching for a possible match in the application's internal database. As soon as the person is identified, the result is displayed to the user. In the online state, the video feed is still accessed for scanning and detecting faces, and a temporary image is captured and saved when a face is detected in the same manner as in the offline state. However, instead of attempting to identify the person locally, the image is sent to cloud servers for identification. After the person is identified, the results are sent back to the application and displayed to the user.

A. Simulation Results and Analysis

• Scenario A: In this scenario, ADBMO uses different numbers of methods to compare the cost of running methods locally, the cost of running them using AMBEO, and the cost of running them using ADBMO. The number of methods is (5, 7, 10, 15, 25), the available memory is 128 MB, and the code size of method i is 100 + 50 * (i - 1) KB. Fig. 3 shows the cost of running methods on the local device, using AMBEO, and using ADBMO against the number of methods. Fig. 4 shows that the total memory cost of the first 10 methods is 3575 KB, which is less than the available memory. The cost of running methods locally is less than

Fig. 3: Cost vs. number of methods
• Scenario B: In this scenario, 15 methods are used, each method M_i has a code size of 100 + 50 * (i - 1) KB, and different sizes of available memory are used to compare the cost of running methods locally, using AMBEO, and using ADBMO. Fig. 5 shows the cost of running methods on the local device, using AMBEO, and using ADBMO against the available memory; running methods using AMBEO appears better than ADBMO and local execution. But Fig. 6 makes clear that if all methods run at the same time, the memory cost reaches the maximum available memory, and the algorithm that runs the methods correctly is ADBMO.

Fig. 6: Memory cost vs. available memory

B. Qualitative Comparison

In this section, ADBMO is compared qualitatively with some existing approaches according to the following criteria:

• Adaptive with network change (ANC): the approach continuously monitors the network and adapts to network disconnection.
• Saving energy (SE): the cost model includes an energy consumption cost parameter.
• Memory cost (MC): the cost model includes a memory cost parameter and takes into account the available memory of the mobile device.
• Offloading level (OffL): whether the offloaded entities that are moved into the cloud are class objects, threads, software modules, or methods.
• Prediction of second execution (PSE): whether the algorithm can predict the second execution.
• Minimize data transfer (MDT): the cost model includes the amount of data transferred, to avoid data traffic.
• Multiple methods offloading (MMO): the algorithm takes the offloading decision for a group of methods together rather than for each method alone.

According to these qualitative parameters, the best criteria are as follows: ANC is Yes, SE is Yes, MC is Yes, OffL is methods, PSE is Yes, and MDT is Yes. The qualitative evaluation is shown in Table II. As shown in Table II, ADBMO satisfies all requirements of the best criteria among the existing approaches.

VII. CONCLUSION

In this paper, the offloading problem in mobile cloud computing was introduced, together with the many studies that have appeared to support remote execution of mobile applications on the cloud. In addition, a new offloading algorithm called ADBMO was proposed. ADBMO provides method-level code offloading, which improves performance and saves energy on the mobile device, and it deals with the computation offloading challenges. ADBMO can decide which methods will run on the local device and which must be offloaded to the cloud for a group of methods running at the same time, based on their running costs and the available memory. The performance of the ADBMO algorithm has been evaluated through extensive simulation with different values of available memory and different numbers of methods. The simulation results demonstrated that ADBMO is better than AMBEO and other existing models when several methods run at the same time.

REFERENCES

[1] H. T. Dinh, C. Lee, D. Niyato, and P. Wang, "A survey of mobile cloud computing: architecture, applications, and approaches," Wireless Communications and Mobile Computing, vol. 13, no. 18, pp. 1587–1611, 2013.
[2] K. Kumar and Y.-H. Lu, "Cloud computing for mobile users: Can offloading computation save energy?" Computer, vol. 43, no. 4, pp. 51–56, 2010.
[3] M. Shiraz, A. Gani, R. H. Khokhar, and R. Buyya, "A review on distributed application processing frameworks in smart mobile devices for mobile cloud computing," IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1294–1313, 2013.
[4] A. Khan, M. Othman, S. Madani, and S. Khan, "A survey of mobile cloud computing application models," 2013.
Dr. Sanjay Dorle
HOD, Department of Electronics Engineering
GHRCE, Nagpur, India
Abstract— In a hyperspectral image, every pixel shows some combination of a mixed pattern of endmembers, and unmixing is an excellent approach for finding those endmembers, the abundance of the endmembers, and hence the respective spectral signatures. Atmospheric interferers are a potential source of errors in spectral unmixing, so it becomes very challenging to identify the endmembers. The problem usually faced in extracting information from hyperspectral data is the presence of mixed spectral information. This problem can be solved by making use of unmixing techniques. The ultimate goal is to devise a feasible and effective method for unmixing based on modified Gram-Schmidt orthogonalization, LSMA and NMF.

Keywords— hyperspectral image, abundance, NMF, endmembers, LSMA, unmixing.

I. INTRODUCTION

Information regarding the materials present in a hyperspectral scene can be extracted from the spectral properties of those materials. With this information, analysts can separate or identify objects within the scene. Hyperspectral data can be interpreted in two ways: in a purely physical manner or in a statistical manner. In the physical interpretation, a physical variable (like radiance) is presented to the observer by each pixel. For further analysis of the physical interpretation, scientific methodologies are used; for example, to identify particular objects, their spectral properties are matched with absorption properties. But not all scenes are atmospherically corrected and properly calibrated, nor is it necessary that all analysts have a spectroscopy background, and dealing accurately with millions of spectra per image is difficult. So, automated systems are required for the identification and exploitation of hyperspectral images.

In the statistical approach, the variables of the data under analysis are treated as statistical variables, which tends to reduce the complexity. The statistical approach used in Orthogonal Subspace Projections [1] combines redundant bands and tends to reduce the dimensionality of the data. The resulting image has some mathematical relation, like an eigenvector, with the input image, and does not carry a physical relation.

Sometimes both of the discussed methods go hand in hand, as in the case of endmember spectra determination and mixing, where the image is first transformed statistically into principal components, and then the selection of the groups characterizing the image is done manually. The selected groups indicate pure pixels, or endmembers, with unique spectral properties, and the rest of the pixels are considered mixed pixels. A mixture of two or more individual spectral signatures is known as a mixed pixel. Material maps are produced, and the resulting images give the fractional abundance of the spectrum of each endmember. The advantage of this is that it deals with both the radiance and physical axes, or transforms data from the radiance axis to the physical one. So, automated unmixing methods are preferred to reduce the complexity of finding endmembers.

Different endmember extraction techniques include the Pixel Purity Index [2], where dimensionality reduction is applied to the original data, which results in skewers. With the Pixel Purity Index it is not possible to get the final list of endmembers, because the selection of skewers is done randomly [3]. In the N-FINDR method, a simplex detecting the selected pixels is generated, but the disadvantage is that recalculation of the pixel volume increases the computation of the algorithm and also leads to sensitivity to noise [4]. VCA uses the principle that endmembers are the vertices of a simplex: the positive cone of the hyperspectral data is projected onto a hyperplane, generating a simplex with vertices. VCA determines the number of endmembers and solves the computational complexity, but the
340 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
Assume
α = (α1, α2, ..., αp)^T : P×1
so,
r = Mα + n
where r is the linear mixture of the target spectral signatures with mixing coefficients (α1, α2, ..., αp) and n is additive noise. Fractional images are grayscale images representing the abundance fractions of the mixed objects. LSMA generates fractional images and reflectance spectra with respect to the endmembers generated in the previous stage.
IV. NMF
A ≈ WH, with W ∈ R^(m×k)
min f(W, H) = (1/2) ||A − WH||²_F subject to W ≥ 0, H ≥ 0
For this, the Euclidean (Frobenius) distance is used.
The bands span up to 900 nm with a bandwidth of 3.2 nm. The data was collected by the Florida Environmental Research Institute, with lines equal to 952, number of bands equal to 156, and samples equal to 952 (http://opticks/sampledata/samson/). A few bands of the image read in MATLAB are shown in Fig.2.
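The factorization A ≈ WH under the nonnegativity constraints, minimizing the Frobenius objective above, can be sketched with the classic multiplicative update rules of Lee and Seung [11]; this is an illustrative sketch on toy data, not the exact solver used in the paper:

```python
import numpy as np

def nmf(A, k, iters=300, eps=1e-9):
    """Factor nonnegative A (m x n) into W (m x k) and H (k x n)
    by minimizing 1/2 * ||A - W H||_F^2 with multiplicative updates."""
    m, n = A.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)   # update coefficients
        W *= (A @ H.T) / (W @ H @ H.T + eps)   # update basis
    return W, H

# Toy nonnegative matrix of mixed "spectra" (illustrative data only)
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 1.0, 1.0]])
W, H = nmf(A, k=2)
print(np.linalg.norm(A - W @ H))  # reconstruction error shrinks toward 0
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative throughout, which is exactly the property exploited for abundance fractions.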
Out of the total 156 bands of the input image, a few bands are presented in Fig.2. The outputs given by the modified Gram-Schmidt orthogonalisation endmember extraction algorithm are the endmembers, the orthogonalized endmembers (Q) and the orthonormalized endmembers (U). The detected endmembers have their specific spectral signatures, plotted as spectral band versus radiance. Fig.3 shows the spectral signatures for all extracted endmembers. Also, the variance value and elapsed time for the endmembers are calculated, as shown in Table 1. Endmember details like pixel value and the orthogonalized and orthonormalized vectors are given as input to LSMA.
Fig.3(b): Reflectance spectrum for endmember 2
The grayscale fractional abundance images, which have to be coded, are produced by LSMA as shown in Fig.4.
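The orthogonalisation step mentioned above, which produces the orthogonalized vectors Q and the orthonormalized vectors U from the endmember pixel vectors, can be sketched with a plain modified Gram-Schmidt pass; shapes and data here are illustrative, not the authors' implementation:

```python
import numpy as np

def modified_gram_schmidt(E):
    """Columns of E are endmember pixel vectors (bands x endmembers).
    Returns Q (orthogonalized, unnormalized) and U (orthonormalized)."""
    E = E.astype(float)
    n_bands, n_end = E.shape
    Q = np.zeros_like(E)
    U = np.zeros_like(E)
    for j in range(n_end):
        v = E[:, j].copy()
        # Subtract projections onto the previously computed directions
        for i in range(j):
            v -= (U[:, i] @ v) * U[:, i]
        Q[:, j] = v                       # orthogonal but unnormalized
        U[:, j] = v / np.linalg.norm(v)   # unit-length version
    return Q, U

# Two toy "endmember" vectors over three bands (illustrative)
E = np.array([[3.0, 1.0],
              [4.0, 2.0],
              [0.0, 2.0]])
Q, U = modified_gram_schmidt(E)
print(np.round(U.T @ U, 6))  # identity: columns of U are orthonormal
```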
When there are many attributes, or the attributes are ambiguous in nature with weak predictability, NMF can be used, because NMF utilizes multivariate analysis with linear algebra and can produce effective patterns. Here the input data matrix A is decomposed into two matrices of lower rank, W and H. Matrix W carries the basis vectors, whereas H carries the associated coefficients or weights. W and H are modified in an iterative manner so that their product approaches A. Nonnegative sparse bases and sparse weightings can be obtained, and the original data structure is preserved. Since NMF is a dimensionality reduction method, it further helps in compression.
Here, the nonnegative matrix factorization (NMF) method is used for solving the spectral unmixing problem subject to the nonnegativity constraint on abundance fractions.
Fig.3(c): Reflectance spectrum for endmember 3
Table 1: Variance value and elapsed time for the extracted endmembers (only two rows are recoverable):
Endmember4  26.455712362002  1.2380249147618  51.00
Endmember6  22.384385738664  1.2300259316586  52.00
REFERENCES
[3] J. Theiler, D. D. Lavenier, N. R. Harvey, S. J. Perkins et al., "spectral end member determination in hyperspectral data," Proc. SPIE, 3753, pp. 266-275, 1999.
[5] Sebastian Lopez, Pablo Horstrand, Gustavo M. Callico, "A Low-Computational-Complexity Algorithm for Hyperspectral Endmember Extraction: Modified Vertex Component Analysis," IEEE Geoscience and Remote Sensing Letters, Vol. 9, No. 3, May 2012.
[6] Chein-I Chang, Cheng Wu, Wei-min Liu, "A New Growing Method for Simplex-Based Endmember Extraction Algorithm."
[7] Raul Guerra, Lucana Santos, Sebastian Lopez and Roberto Sarmiento, "A New Fast Algorithm for Linearly Unmixing Hyperspectral Images," IEEE Transactions on Geoscience and Remote Sensing, 2015.
[8] J. B. Adams, M. O. Smith, and A. R. Gillespie, "Image spectroscopy: Interpretation based on spectral mixture analysis," in Remote Geochemical Analysis: Elemental and Mineralogical Composition, C. M. Pieters and P. A. Englert, Eds. Cambridge, U.K.: Cambridge Univ. Press, pp. 145-166, 1993.
[9] R. Bro and S. De Jong, "A fast non-negativity-constrained least squares algorithm," Journal of Chemometrics, 1997.
[10] Jingu Kim and Haesun Park, "Toward Faster Nonnegative Matrix Factorization: A New Algorithm and Comparisons," Eighth IEEE International Conference on Data Mining (ICDM'08), Pisa, Italy, Dec. 2008.
[11] D. Lee and H. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature, Vol. 401, pp. 788-791, 1999.
[12] J. Bowles, J. Antoniades, M. Baumback, J. Grossmann, D. Haas, P. Palmadesso, and J. Stracka, "Real time analysis of hyperspectral data sets using NRL's ORASIS algorithm," Proceedings of the SPIE, vol. 3118, p. 38, 1997.
AUTHORS PROFILE
Neetu N. Gyanchandani
Research Scholar, Department of Electronics Engineering, GHRCE, Nagpur, India
Dr. A. A. Khurshid
Professor, Department of Electronics Engineering, RCOEM, Nagpur, India
Dr. Sanjay Dorle
HOD, Department of Electronics Engineering, GHRCE, Nagpur, India
Abstract— Due to their high spectral and spatial resolution, hyperspectral data have become a most popular source of information. Extracting the required information from a large hyperspectral volume is a very difficult task, since the data cubes require large memory and transmission space. Many compression methods and techniques are being developed to achieve this goal, and they are presented here.
In recent years, dramatic growth in remote sensing applications and platforms has been observed, both airborne and spaceborne. Data obtained from remote sensing faces challenges in acquisition, transmission, analysis and storage. Accurate analysis or information extraction is better with high-quality data, but this increases the data volume. The hyperspectral data generated by NASA JPL's Airborne Visible/Infrared Imaging Spectrometer for a spectrum of reflected light is around 500 megabytes per flight, and it serves purposes like geological mapping, environmental monitoring, disaster assessment, target recognition, urban growth analysis, vegetation classification, defense and many more.
The different steps involved in compression [encoding] of the hyperspectral image are given in Fig.2. These steps include:
1. Pre-processing unit - Different reversible processes are used and shared with the decoder for decompression to improve the performance of compression, e.g. reordering of bands, principal component analysis, normalization and preclustering.
2. Compression unit - Standard compression techniques like vector quantization, transform coding or predictive coding can be used.
I. INTRODUCTION
Hyperspectral data is three-dimensional. The data is a stack of two-dimensional images, with each 2D image corresponding to the radiation received by the sensor at a specific wavelength. These images are also called band images or bands.
Another way to view hyperspectral data is by means of pixel vectors, where the data from the same pixel location in each band is used to create a multi-dimensional vector. Here the elements of the pixel vector correspond to the energy reflected from the surface of the earth at specific wavelengths.
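The two views described above, band images and pixel vectors, are two reshapes of the same data cube; a small NumPy sketch with made-up dimensions:

```python
import numpy as np

# Synthetic hyperspectral cube: rows x cols x bands (dimensions illustrative)
rows, cols, bands = 4, 5, 156
cube = np.random.rand(rows, cols, bands)

# View 1: band images, one 2D image per wavelength
band_10 = cube[:, :, 10]          # shape (4, 5)

# View 2: pixel vectors, one spectrum per spatial location
pixels = cube.reshape(-1, bands)  # shape (20, 156), one row per pixel
spectrum = pixels[0]              # reflected energy across all wavelengths
print(band_10.shape, pixels.shape, spectrum.shape)
```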
was designed to have good compression without much loss in classification accuracy; it gave a compression ratio near 43:1.
3. Qian et al. [6] developed fast VQ lossy compression techniques in which segmentation of the image is done and codebooks are utilized; the processing speed of compression is improved by around 1000%, with an average fidelity penalty of 1 dB.
4. Qian [7] again developed the generalized Lloyd algorithm for hyperspectral imagery in 2004, with which hyperspectral data can be analyzed with fewer computational iterations and with improvement in the partition distance.
B. Transform coding
In transform coding, a set of product values is produced: the products of the original sample values with a set of basis vectors, where the basis vectors are sinusoids of increasing frequency. The resulting coefficients, generated by adding up the product values, indicate the frequency content of the original signal. A product value depends on the shapes of the original samples and the basis vector: if both are similar it is positive, otherwise it is near zero. With high correlation between adjacent samples, the data bits needed for transmission can be effectively reduced. Some transform coding algorithms discussed in the literature are [SPIHT] [8] and the context-based approach shown in the JPEG 2000 image compression standard [9].
C. Predictive coding
In predictive coding, data already sent to the decoder is utilized for the prediction of the current data. In hyperspectral predictive coding, a linear combination of values from pixels which are spectrally and spatially adjacent to the current pixel is used for prediction. The selection of pixels for prediction depends on the order in which data is given to the algorithm. The two main formats in this regard are 1) BIL: band interleaved by line, and 2) BSQ: band sequential.
1. Roger and Cavenor [10] started with DPCM, with a 1.6-2.0:1 compression ratio. They were followed by Aiazzi et al. [11], who used fuzzy logic for the prediction coefficients, giving a 20% improvement in compression ratio compared to [10]. Then Wu and Memon [12] developed context-based adaptive lossless image coding [CALIC] and a 3D version of CALIC [13].
2. Magli et al. modified the CALIC algorithm by using previously generated bands for prediction, and the new M-CALIC algorithm has given excellent results in lossless and near-lossless applications for data in band-interleaved format.
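The transform coding description above can be made concrete with a hand-built orthonormal DCT-II basis (sinusoidal basis vectors of increasing frequency); this is an illustrative sketch, not any of the cited coders:

```python
import numpy as np

N = 8
n, k = np.meshgrid(np.arange(N), np.arange(N))
# Orthonormal DCT-II basis: row k is a sinusoid of frequency k
basis = np.cos(np.pi * (2 * n + 1) * k / (2 * N)) * np.sqrt(2.0 / N)
basis[0, :] /= np.sqrt(2.0)

x = np.linspace(0.0, 1.0, N)           # smooth, highly correlated samples
coeffs = basis @ x                      # sums of sample x basis products
kept = np.where(np.abs(coeffs) > 0.05, coeffs, 0.0)  # drop small coefficients
x_hat = basis.T @ kept                  # reconstruct from the kept coefficients
print(np.max(np.abs(x - x_hat)))        # small error from few coefficients
```

Most of the signal's energy lands in a few low-frequency coefficients, which is why the small ones can be dropped or coarsely quantized before transmission.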
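The predictive coding idea can likewise be sketched as plain DPCM over one line of correlated pixel values; the previous-sample predictor below is the simplest possible choice, not the predictor of any of the cited methods [10]-[13]:

```python
import numpy as np

def dpcm_encode(samples):
    """DPCM: predict each sample as the previous decoded sample and
    transmit only the prediction residuals (lossless for integer data)."""
    residuals = np.empty_like(samples)
    prev = 0
    for i, s in enumerate(samples):
        residuals[i] = s - prev   # residual = actual - prediction
        prev = s                  # the decoder knows this value too
    return residuals

def dpcm_decode(residuals):
    return np.cumsum(residuals)   # prediction + residual recovers each sample

# Highly correlated "pixel values along one band" (illustrative data)
band = np.array([100, 102, 101, 105, 107, 106], dtype=np.int64)
res = dpcm_encode(band)
print(res)                        # small residuals are cheap to entropy-code
assert (dpcm_decode(res) == band).all()
```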
[17] Hanye Pu, Zhao Chen, Bin Wang and Wei Xia, "Constrained Least Squares Algorithms for Nonlinear Unmixing of Hyperspectral Imagery," IEEE Transactions on Geoscience and Remote Sensing, Vol. 53, No. 3, pp. 1287-1303, March 2015.
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 1, January 2017 1
Abstract— Mobility is one of the basic features that define an ad hoc network, an asset that leaves the field free for the nodes to move. This most important aspect of this kind of network turns into a great disadvantage when it comes to commercial applications; take as an example the automotive networks that allow communication between a group of vehicles. The ad hoc on-demand distance vector (AODV) routing protocol, designed for mobile ad hoc networks, has two main functions. First, it enables route establishment between a source and a destination node by initiating a route discovery process. Second, it maintains the active routes, which means finding alternative routes in case of a link failure and deleting routes when they are no longer desired. In a highly mobile network these are demanding tasks to perform efficiently and accurately. In this paper, we focus on the first point, enhancing the local decision of each node in the network by quantifying the mobility of its neighbors.
Index Terms—Ad hoc, mobility, RSSI, AODV, localization, distance, relative speed, degree of spatial dependence, GPS-free.
(Internet Engineering Task Force), in RFC 3561. The protocol's algorithm creates routes between nodes only when the routes are requested by the source nodes, giving the network the flexibility to allow nodes to enter and leave the network at will. Routes remain active only as long as data packets are traveling along the paths from the source to the destination. When the source stops sending packets, the path times out and is closed.
Route maintenance is done by periodically sending short "HELLO" messages; if three consecutive messages are not received from a neighbor, the link in question is deemed to have failed. When a link between two nodes of a routing path becomes faulty, the nodes broadcast packets to indicate that the link is no longer valid. Once the source is notified, it can restart a route discovery process.
In this paper we propose a solution that enables each node in the network to determine the location of its neighbors in order to create a more stable, less mobile route. For that purpose, we locally quantify a node's distances to its neighbors as the metric of mobility, using the AODV protocol.
The remainder of this paper is organized as follows. Section 2 briefly describes the AODV protocol. In Section 3, a summary of related work is presented. Sections 4, 6 and 8 present how to quantify, evaluate and estimate mobility in an ad hoc network (distance, relative speed, degree of spatial dependence). Sections 5, 7 and 9 show the algorithms used for the quantification of the mobility metrics in the AODV protocol. Section 10 presents some simulations and results. Finally, Section 11 concludes this paper.
AODV maintains its routing tables according to their use: a neighbor is considered active as long as the node delivers packets for a given destination; beyond a certain time without transmission to the destination, the neighbor is considered inactive. A routing table entry is considered active if at least one active neighbor uses it; the path between source and destination through active routing table entries is called the active path. If a link failure is detected, all entries of the routing tables participating in the active path are removed.
AODV (ad hoc on-demand distance vector routing) is a routing protocol for mobile networks capable of both unicast and multicast routing. It is an on-demand algorithm, that is to say it builds routes between nodes only when requested by source nodes, and it maintains these routes as long as the sources need them.
II. AD HOC ON-DEMAND DISTANCE VECTOR
AODV is an on-demand protocol which is capable of providing unicast, multicast [7] and broadcast communication, as well as Quality of Service (QoS) aspects [8], [9]. It combines the route discovery and route maintenance mechanisms of DSR (RFC 4728) [10], involving the sequence number (which maintains the consistency of routing information), with the periodic updates of DSDV [11].
During route discovery, AODV maintains on each transit node information on the discovered route; the AODV routing tables contain:
- The destination address
- The next node
- The distance in number of nodes to traverse
- The sequence number of the destination
- The expiry date of the table entry
When a node receives a route discovery packet (RREQ), it also notes in its routing table the information from the source node and from the node that just sent it the packet, so that it will be able to retransmit the response packet (RREP). This means that the links are necessarily symmetrical. The destination sequence number field of a route discovery request is null if the source has never been linked to the destination; otherwise it uses the last known sequence number. The source also indicates in this query its own sequence number. When an application sends a route discovery, the source waits for a moment before rebroadcasting its search query (RREQ); after a number of trials, it declares the destination unreachable.
A. Route Discovery
When a source node wants to establish a route to a destination for which it does not yet have one, it broadcasts a Route Request packet through the network.
Table 1: Route Request contents
- Broadcast ID
- IP source
- Destination address
- Hop count
- Sequence number of the source
- Sequence number of the destination
B. The return path
A node receiving a Route Request records in its routing table the IP address of the source node, the sequence number, the number of hops separating it from the source, and the IP address of the neighbor that just sent it this request. If it is the destination, or if it has a route to the destination (with a sequence number higher than or equal to the one included in the Route Request), the node will issue a Route Reply packet. Otherwise, it rebroadcasts the Route Request. The nodes keep track of source IPs and Route Request broadcast IDs; if they receive a Route Request they have already processed, they discard it and do not retransmit it.
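The RREQ handling rules above (suppress duplicates by source and broadcast ID, record the reverse route via the sender, reply only when the node is the destination or holds a fresh-enough route) can be sketched as follows; the structures and field names are illustrative, not an implementation of RFC 3561:

```python
# Sketch of a node processing a Route Request, per the rules above.
# Field names and structures are illustrative only.

class Node:
    def __init__(self, addr):
        self.addr = addr
        self.routes = {}      # destination -> (next_hop, hops, dest_seq)
        self.seen = set()     # (source_ip, broadcast_id) pairs already treated

    def handle_rreq(self, rreq, from_neighbor):
        key = (rreq["source_ip"], rreq["broadcast_id"])
        if key in self.seen:          # duplicate: discard, do not retransmit
            return "discard"
        self.seen.add(key)
        # Record the reverse route toward the source via the sender
        self.routes[rreq["source_ip"]] = (from_neighbor,
                                          rreq["hop_count"] + 1,
                                          rreq["source_seq"])
        have = self.routes.get(rreq["dest_ip"])
        fresh = have is not None and have[2] >= rreq["dest_seq"]
        if self.addr == rreq["dest_ip"] or fresh:
            return "send_rrep"        # answer with a Route Reply
        return "rebroadcast"          # otherwise flood the RREQ onward

n = Node("C")
rreq = {"source_ip": "A", "broadcast_id": 1, "dest_ip": "D",
        "hop_count": 1, "source_seq": 5, "dest_seq": 0}
print(n.handle_rreq(rreq, from_neighbor="B"))  # "rebroadcast"
print(n.handle_rreq(rreq, from_neighbor="B"))  # duplicate -> "discard"
```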
Pr: the receiving power.
Pt: the transmitting power.
Gt: the gain of the transmitting antenna, i.e. its ability to radiate in a particular direction in space.
Gr: the gain of the receiving antenna, i.e. its ability to couple the energy radiated from a direction in space.
λ: the wavelength.
L: the system loss factor, which has nothing to do with the transmission.
d: the distance between the antennas.
These quantities are related by the free-space equation Pr = Pt·Gt·Gr·λ² / ((4π)²·d²·L). Then, to calculate the distance between two nodes equipped with transmitting antennas, the formula is d = (λ / 4π) · √(Pt·Gt·Gr / (Pr·L)).
α is calculated using this formula:
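Under the free-space (Friis) model, a node that measures the received power and knows the transmit-side parameters can invert the relation to estimate the inter-node distance; a sketch with illustrative values, assuming ideal free-space propagation:

```python
import math

def friis_pr(pt, gt, gr, lam, d, L=1.0):
    """Received power from the free-space Friis equation."""
    return pt * gt * gr * lam**2 / ((4 * math.pi) ** 2 * d**2 * L)

def distance_from_pr(pr, pt, gt, gr, lam, L=1.0):
    """Invert Friis to estimate the node-to-node distance from Pr."""
    return (lam / (4 * math.pi)) * math.sqrt(pt * gt * gr / (pr * L))

lam = 0.125          # ~2.4 GHz wavelength in metres (illustrative)
pr = friis_pr(pt=0.1, gt=1.0, gr=1.0, lam=lam, d=30.0)
print(distance_from_pr(pr, pt=0.1, gt=1.0, gr=1.0, lam=lam))  # ~30.0 m
```

In practice RSSI-based ranging is noisy, so such an estimate is only an approximation of the true distance.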
- x diffuses the RREQ.
- Each node receiving the RREQ calculates the distance between itself and the neighbor that sent it the RREQ (in this part we use the exact distance, or the distance using Pr) and broadcasts its table [neighbors-distance] to its neighbors.
For b, the coordinates are:
In this part, we propose to use one of those methods in the first function of the AODV protocol (route establishment between a source and a destination). A node x wants to communicate with a node y:
- x diffuses the RREQ.
- Each node receiving the RREQ calculates the distance between itself and the neighbor that sent it the RREQ (in this part we use the exact distance, or the distance using Pr) and broadcasts its table [neighbors-distance-time] to its neighbors.
- Each node calculates the relative speed between itself and its neighbors using the preceding formula.
Using the theorem of Al-Kashi (the law of cosines), the relative direction can be reformulated.
Using one of the methods for the quantification of distances cited above, every node can calculate the movement speed of its neighbors. By definition, the relative speed is the variation over time of the distance between two mobile nodes.
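Since the relative speed is defined as the variation over time of the inter-node distance, a node can estimate it from two successive entries of its [neighbors-distance-time] table; a minimal sketch with illustrative numbers:

```python
def relative_speed(d1, t1, d2, t2):
    """Relative speed = variation of the inter-node distance over time."""
    return (d2 - d1) / (t2 - t1)

# Two successive [neighbor, distance, time] observations (illustrative)
v = relative_speed(d1=40.0, t1=1.0, d2=46.0, t2=3.0)
print(v)  # 3.0 m/s: a positive value means the neighbor is moving away
```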
6.1. Environment
The network size considered for our simulations is 1000 m × 1000 m. The nodes have the same configuration, in particular the TCP protocol for the transport layer and Telnet for the application layer. The time for each simulation is 60 s. For each simulation, the mobility of the nodes is represented by the choice of a uniform speed between Vmin = 0 and Vmax = 100 m/s. The nodes are moved after a random choice of a new destination, without leaving the 1000 m × 1000 m network.
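The mobility pattern described above, a uniform speed draw followed by a random destination inside the 1000 m × 1000 m area, can be sketched as:

```python
import random

AREA = 1000.0            # side of the square simulation area (m)
V_MIN, V_MAX = 0.0, 100.0

def next_move():
    """Pick a uniform speed and a random destination inside the area,
    as in the simulation environment described above."""
    speed = random.uniform(V_MIN, V_MAX)
    dest = (random.uniform(0.0, AREA), random.uniform(0.0, AREA))
    return speed, dest

random.seed(1)
speed, dest = next_move()
print(speed, dest)  # node never leaves the 1000 m x 1000 m network
```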
Figure 12: The distance, the relative speed and the degree of spatial dependence between node 1 and its neighbor node 8 during the first 3 s of the simulation.
This algorithm can be used in all environments, with or without GPS. The metric calculated describes the similarity of the velocities of two nodes using the distance and the relative speed.
0.020 - 0.84 s: the metric is high, which describes the similarity of the velocities of nodes 1 and 8; this is reflected in the near-stability of the distances and relative speeds in this interval.
0.94 - 1.75 s: the metric decreases, which describes the difference of the velocities of nodes 1 and 8.
1.75 - 2.9 s: the metric increases, which describes the similarity of the velocities of nodes 1 and 8.
XI. CONCLUSION
In this paper, we tried to calculate a local mobility metric between a node and its neighbors in the AODV routing protocol for ad hoc networks. This mobility metric can be used to choose a stable route for transmitting data, and thus improve the Quality of Service in this kind of network. To make this proposition more feasible, we presented three methods to calculate the distance between two nodes. First, we use the exact distance with a GPS, or using RSSI. In case absolute positioning is not accessible, we propose our improved GPS-free implementation in the AODV protocol. Using one of those methods, we can calculate two other mobility metrics: the relative speed and the degree of spatial dependence.
REFERENCES
[5] C. Perkins, E. Belding-Royer and S. Das, "Ad hoc On-demand Distance Vector routing," Request For Comments (Proposed Standard) 3561, Internet Engineering Task Force, July 2003.
[6] http://datatracker.ietf.org/wg/manet/charter/
[7] C. Cordeiro, H. Gossain and D. Agrawal, "Multicast over Wireless Mobile Ad Hoc Networks: Present and Future Directions," vol. 17, no. 1, pp. 52-59, January/February 2003.
[8] Sung-Ju Lee, Elizabeth M. Royer and Charles E. Perkins, "Scalability Study of the Ad Hoc On-Demand Distance Vector Routing Protocol," ACM/Wiley International Journal of Network Management, vol. 13, no. 2, pp. 97-114, March 2003.
[9] Ian Chakeres and Elizabeth M. Belding-Royer, "AODV Implementation Design and Performance Evaluation," special issue on Wireless Ad Hoc Networking of the International Journal of Wireless and Mobile Computing (IJWMC), 2005.
[10] David B. Johnson, David A. Maltz and Josh Broch, "DSR: The Dynamic Source Routing Protocol for Multi-Hop Wireless Ad Hoc Networks," Proceedings of INMC, 2004.
[11] Guoyou He, "Destination-Sequenced Distance Vector (DSDV) Protocol," Technical report, Helsinki University of Technology, Finland, 2 Dec 2003.
[12] P. Johansson, T. Larsson, N. Hedman, B. Mielczarek and M. Degermark, "Scenario-based Performance Analysis of Routing Protocols for Mobile Ad Hoc Networks," Proc. ACM Mobicom 1999, Seattle, WA, August 1999.
[13] X. Hong, M. Gerla, G. Pei and C.-C. Chiang, "A Group Mobility Model for Ad Hoc Wireless Networks," Proc. ACM/IEEE MSWiM '99, Seattle, WA, August 1999.
[14] N. Enneya, K. Oudidi and M. Elkoutbi, "Network Mobility in Ad hoc Networks," Computer and Communication Engineering, ICCCE 2008, International Conference on, 13-15 May 2008, Kuala Lumpur, Malaysia.
[15] N. Enneya, M. El Koutbi and A. Berqia, "Enhancing AODV Performance based on Statistical Mobility Quantification," Information and Communication Technologies, ICTTA '06, 2nd (Volume 2), pp. 2455-2460.
[16] Ahmad Norhisyam Idris, Azman Mohd Suldi and Juazer Rizal Abdul Hamid, "Effect of Radio Frequency Interference (RFI) on the Global Positioning System (GPS) Signals," 2013 IEEE 9th International
[19] http://revue.sesamath.net/spip.php?article362
Abstract: This paper focuses on testing-coverage-based Software Reliability Growth Models (SRGMs). We analyse the various SRGMs that incorporate testing coverage and propose a testing-coverage-based SRGM. We also suggest a weighted criterion for the ranking and analysis of SRGMs. The proposed model, together with various existing testing-coverage-based SRGMs, is examined and analysed using two real-time software failure data sets. We also rank the various SRGMs based on weighted comparison criteria values. We find that the proposed model provides significantly improved goodness-of-fit, and we conclude with the ranking of the SRGMs.
I. SECTION – 1
A. Introduction:
Software reliability models provide quantitative measures of the reliability of software systems during the software development process. In software engineering, the reliability of a software system is considered a key characteristic of software quality. Achieving a given level of software quality forms the main basis for deciding the release schedule of the software. Besides testing efficiency and testing effort, various other factors are incorporated during reliability estimation, such as fault complexity, debugging time lag, etc. There are many other factors that greatly influence reliability growth. Among all of them, one factor that plays a critical role in reliability assessment is testing coverage. Testing coverage is a measure that enables a software developer to evaluate the quality of the tested software and determine how much additional effort is required to improve the software reliability.
Among all SRGMs, a large family of stochastic reliability models based on a non-homogeneous Poisson process, known as NHPP reliability models, has been widely used to track reliability development during software testing. These models enable software developers to estimate software reliability in a quantitative manner. They have also been effectively used to provide guidance in making decisions such as when to conclude testing the software or how to distribute the available resources. However, software development is a very intricate process, and there are still issues that have not yet been addressed; testing coverage is among those issues.
Testing coverage is an essential measure for both the developers and the clients of software products. It can assist software developers in estimating the quality of the tested software and determining how much additional effort is required to improve the reliability of the software. Testing coverage, on the other hand, can offer customers a quantitative assurance criterion when they plan to buy or use software products.
Gokhale et al. [1] analysed the effect of testing coverage and derived several types of coverage functions for existing NHPP-based SRGMs. Yamada et al. [2] developed software reliability models with a testing-domain coverage ratio. Pham and Zhang [3] proposed an NHPP software reliability model that obtains the coverage function for various testing efficiencies. Kapur et al. [4] also suggested an S-shaped testing-coverage-based SRGM.
In this paper, the study focuses on the critical analysis and ranking of various existing testing-coverage-based software reliability growth models, based on weighted criteria values, using two data sets of real-time failure data. The rest of this paper is organized as follows. Section 2 describes testing-coverage SRGMs from the literature and also proposes a testing-coverage SRGM for the analysis and ranking of SRGMs based on the weighted criteria. In Section 3, we discuss the comparison criteria for SRGMs. In Section 4, we explain our new ranking methodology. Section 5 shows the experimental results on two different real data sets; the parameter estimates, goodness-of-fit comparison and ranking of the various SRGMs are also included in this section. The conclusion and remarks are given in Section 6.
II. SECTION - 2
A. Description of the Testing Coverage SRGMs:
In this section, we discuss the different testing effort functions and various existing testing-coverage-based software reliability models, and we also propose a testing-coverage SRGM for critical analysis and ranking.
1) Testing Effort functions:
Testing effort plays an important role in the testing process of software development. Testing effort is measured by the number of executed test cases, the CPU time spent in the testing phase, the amount of manpower, etc. The consumption curve of testing resources over the testing period can be thought of as the testing effort curve. A testing effort function describes the distribution or consumption of testing effort, such as test cases executed, CPU testing hours and manpower, during the testing phase. Yamada et al. [5][6][7], Kapur et al. [8], Kuo et al. [14] and Huang et al. [10][11] suggested software reliability growth models explaining the relationship among testing time, testing effort expenditure and the number of software faults detected.
Yamada et al. [5][6][7] proposed Weibull-type distributions to describe the testing effort function, as given below:
• Exponential testing effort function: The exponential curve is used for processes that decline monotonically to an asymptote. The cumulative testing effort consumed in (0, t] is given as:
W(t) = α(1 − e^(−βt))
• Rayleigh testing effort function: The Rayleigh curve is frequently used as an alternative to the exponential curve. This curve predicts the cost and the schedule of software development. The testing effort function is given as:
W(t) = α(1 − e^(−βt²))
• Weibull testing effort function: The Weibull testing effort function is given as:
W(t) = α(1 − e^(−βt^l))
The exponential and Rayleigh functions are special cases of the Weibull function for l = 1 and l = 2 respectively.
• Logistic testing effort function: The logistic testing effort function over the time period (0, t] can be defined as:
W(t) = α / (1 + β e^(−bt)), with W(0) = α / (1 + β)
The parameters used in the above testing effort functions are:
α: the total amount of testing effort expenditure required for software testing
β: the scale parameter
l: the shape parameter
b: a constant
W(t): the testing effort function
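Assuming the standard Weibull-type and logistic forms reconstructed above, the four testing effort functions can be evaluated side by side; all parameter values below are illustrative only:

```python
import math

def weibull_tef(t, alpha, beta, l):
    """Cumulative Weibull-type testing effort W(t) = alpha*(1 - exp(-beta*t^l));
    l = 1 gives the exponential TEF and l = 2 the Rayleigh TEF."""
    return alpha * (1.0 - math.exp(-beta * t**l))

def logistic_tef(t, alpha, beta, b):
    """Logistic testing effort W(t) = alpha / (1 + beta*exp(-b*t))."""
    return alpha / (1.0 + beta * math.exp(-b * t))

alpha = 100.0   # total testing-effort expenditure (illustrative)
for t in (0.0, 5.0, 20.0):
    print(t,
          round(weibull_tef(t, alpha, beta=0.1, l=1), 2),   # exponential
          round(weibull_tef(t, alpha, beta=0.01, l=2), 2),  # Rayleigh
          round(logistic_tef(t, alpha, beta=9.0, b=0.3), 2))
```

All four curves rise monotonically toward the asymptote α, the total effort eventually consumed.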
2) Testing Coverage based SRGM:
In general, a test coverage measure is defined as how well a test covers all the potential fault sites in a software system. A large variety of test coverage measures exists in the literature [12]. Malaiya et al. [13], based on previous work, made the initial attempt to relate testing coverage to software reliability. Gokhale et al. [1] proposed a model analysing the effect of testing coverage and derived several forms of coverage functions for various existing NHPP models. What we consider in this paper is the G-O Exponential Testing Coverage SRGM [1]. Yamada et al. [14] also proposed an S-shaped testing coverage SRGM for software error detection. Pham and Zhang [15] also suggested a testing coverage model incorporating the concept of testing efficiency. Kapur et al. [16] also suggested a testing-efficiency S-shaped coverage-dependent SRGM. Inoue et al. [17] proposed a flexible testing coverage SRGM. Yamada et al. also proposed exponential and Rayleigh testing coverage SRGMs [18]. The mean value function m(t) and coverage function c(t) of the above-said existing models considered for analysis and ranking are given in Table-1. In addition to these SRGMs based on testing coverage, a testing coverage based SRGM is proposed. The proposed model, incorporating testing coverage and testing effort, is based on the basic assumptions of the NHPP software failure phenomenon; it is also assumed that identified errors are removed perfectly and that no additional faults are introduced during the process. The cumulative testing effort function is modelled by a logistic function. The cumulative testing effort consumed in the interval (0, t] is given as:
W(t) = α / (1 + A·e^(−βt))
where W(0) = α/(1 + A) and α, A and β are constants.
The coverage function c(t) of the model is considered as:
c(t) = (1 − e^(−b·W(t))) / (1 + k·e^(−b·W(t)))
The Mean Value Function of proposed SRGM is
363 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
m(t) = a(1 − e^(−b·W(t))) / (1 + k·e^(−b·W(t)))
where b and k are constants, and a is a constant representing the number of faults lying in the software at the start of
testing.
Testing Coverage Based SRGMs, with mean value function m(t) and coverage function c(t), and reference:
Model-1: G-O Exponential Testing Coverage SRGM [1]
    m(t) = a(1 − e^(−bt)); c(t) = 1 − e^(−bt)
Model-2: Yamada et al. S-Shaped Testing Coverage SRGM [14]
    m(t) = a(1 − (1 + bt)e^(−bt)); c(t) = 1 − (1 + bt)e^(−bt)
Model-3: Pham et al. Testing Efficiency Coverage SRGM [15]
    c(t) = 1 − (1 + bt)e^(−bt); m(t) takes the testing-efficiency form given in [15]
Model-4: Kapur et al. Testing Efficiency S-Shaped Coverage SRGM [16]
    c(t) = 1 − (1 + bt + b²t²/2)e^(−bt); m(t) takes the testing-efficiency form given in [16]
Model-5: Pham et al. Testing Efficiency S-Shaped Coverage SRGM [19]
    m(t) = a(1 − (1 + (b + d)t + bdt²)e^(−bt)); c(t) = 1 − (1 + bt)e^(−bt)
Model-6: Inoue et al. Flexible Testing Coverage SRGM [17]
    m(t) = a(1 − e^(−s·c(t))); c(t) = (1 − e^(−bt)) / (1 + ψ·e^(−bt)), ψ = (1 − r)/r
Model-7: Yamada et al. Testing Coverage SRGM [18]
    m(t) = a(1 − e^(−r·W(t))); W(t) = α(1 − e^(−βt²)) (Rayleigh); W(t) = α(1 − e^(−βt)) (Exponential)
Model-8: Proposed Testing Coverage SRGM
    m(t) = a(1 − e^(−b·W(t))) / (1 + k·e^(−b·W(t))); c(t) = (1 − e^(−b·W(t))) / (1 + k·e^(−b·W(t))); W(t) = α / (1 + A·e^(−βt))
Table-1: Summary of the SRGM based on Testing Coverage Model and Proposed Model
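Under the logistic-effort form of the proposed Model-8, m(t) can be evaluated numerically; the parameter values below are illustrative stand-ins, not the paper's estimated parameters:

```python
import math

def logistic_effort(t, alpha, A, beta):
    # Cumulative testing effort W(t) = alpha / (1 + A * exp(-beta * t))
    return alpha / (1 + A * math.exp(-beta * t))

def coverage(w, b, k):
    # c = (1 - exp(-b*w)) / (1 + k * exp(-b*w)): inflection-type coverage
    e = math.exp(-b * w)
    return (1 - e) / (1 + k * e)

def mean_value(t, a, b, k, alpha, A, beta):
    # m(t) = a * c(W(t)): expected cumulative faults detected by time t
    return a * coverage(logistic_effort(t, alpha, A, beta), b, k)

# m(t) is non-decreasing in t and bounded above by a (the total fault content)
params = dict(a=140, b=0.02, k=0.5, alpha=90, A=12, beta=0.3)
values = [mean_value(t, **params) for t in range(0, 30, 5)]
assert all(x <= y for x, y in zip(values, values[1:]))
assert 0 <= values[-1] <= params["a"]
```

The monotonicity check reflects the perfect-debugging assumption stated above: detected faults are removed and no new faults are introduced, so the expected cumulative count can only grow toward a.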
III. SECTION – 3
A. Comparison Criteria:
To investigate the effectiveness of the SRGMs, the comparison criteria used to compare the models quantitatively are
Mean Squared Error (MSE), BIAS, Variance, Mean Absolute Error (MAE), Mean Error of Prediction (MEOP), Predictive
Ratio Risk (PRR), Accuracy of Estimation (AE), Sum of Squared Errors (SSE), R2 and Root Mean Square Predictive Error
(RMPSE). For each criterion except R2, a lower value indicates a better goodness-of-fit for the SRGM; for R2, a value
close to 1 indicates a better goodness-of-fit of the model [9]. The summary of the comparison criteria is
given in Table-2.
Metrics Formula
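A few of the listed criteria can be sketched with their common definitions; denominators vary slightly across the SRGM literature (for example n versus n − k for MSE), so these are the plain sample versions, not necessarily the exact formulas of Table-2:

```python
def mse(actual, pred):
    # Mean Squared Error of the fitted m(t) against the observed fault counts
    return sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual)

def bias(actual, pred):
    # Mean of the prediction errors (signed)
    return sum(p - a for a, p in zip(actual, pred)) / len(actual)

def mae(actual, pred):
    # Mean Absolute Error
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

def r_squared(actual, pred):
    # Coefficient of determination; closer to 1 means a better fit
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, pred))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

actual = [10.0, 20.0, 30.0]
perfect = [10.0, 20.0, 30.0]
assert mse(actual, perfect) == 0.0
assert r_squared(actual, perfect) == 1.0
```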
W = [ W11  W12  …  W1m
      W21  W22  …  W2m
       ⋮     ⋮        ⋮
      Wn1  Wn2  …  Wnm ]
where W_ij = 1 − Z_ij, for i = 1 to n and j = 1 to m.
Z_ij is the criteria rating of the j-th criterion of the i-th model. There are two cases to calculate the criteria rating.
When the smaller criteria value best fits the actual data, the criteria rating is calculated as:
Z_ij = ((CMAX)_j − C_ij) / ((CMAX)_j − (CMIN)_j) = (Criteria Maximum Value − Criteria Value) / (Criteria Maximum Value − Criteria Minimum Value)
When the larger criteria value best fits the actual data, the criteria rating is calculated as:
Z_ij = (C_ij − (CMIN)_j) / ((CMAX)_j − (CMIN)_j) = (Criteria Value − Criteria Minimum Value) / (Criteria Maximum Value − Criteria Minimum Value)
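The two rating cases amount to a min-max normalization of each criterion column; a small sketch with hypothetical criterion values:

```python
def criteria_rating(column, smaller_is_better=True):
    """Min-max criteria rating Z_ij for one criterion across all models."""
    cmax, cmin = max(column), min(column)
    span = cmax - cmin
    if smaller_is_better:
        return [(cmax - c) / span for c in column]
    return [(c - cmin) / span for c in column]

# Hypothetical MSE column for four models (smaller is better)
mse_col = [6.7, 3.3, 1.8, 19.5]
z = criteria_rating(mse_col)
w = [1 - zi for zi in z]                       # weight W_ij = 1 - Z_ij
best = mse_col.index(min(mse_col))
assert z[best] == 1.0                          # best-fitting model gets rating 1
assert w[best] == 0.0                          # ... and therefore weight 0
```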
Step 3: Weighted Criteria Value:
The weighted criteria value is calculated by multiplying the criteria value of each criterion by its weight. Let V_ij be the
weighted criteria value of the j-th criterion of the i-th model; it is calculated as:
V_ij = W_ij ∗ C_ij
The weighted criteria value matrix V is given as:
V = [ V11  V12  …  V1m
      V21  V22  …  V2m
       ⋮     ⋮        ⋮
      Vn1  Vn2  …  Vnm ]
Step 4: Permanent Value of Model:
The weighted mean value of all criteria is called the permanent value of the model. The permanent value of model i is given as:
P_i = (Σ_j V_ij) / (Σ_j W_ij), for i = 1 to n
Step 5: Ranking of Models:
The ranking of the models is proposed on the basis of the permanent value of the model. A model with a smaller permanent
value is ranked better than a model with a larger permanent value. Thus ranks for all models are
obtained by comparing permanent values.
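Steps 3, 4 and 5 can be sketched together; the matrices below are hypothetical numbers, not the paper's data:

```python
def permanent_values(C, W):
    """P_i = sum_j(W_ij * C_ij) / sum_j(W_ij) for each model i (Steps 3 and 4)."""
    P = []
    for c_row, w_row in zip(C, W):
        V = [w * c for w, c in zip(w_row, c_row)]   # weighted criteria values
        P.append(sum(V) / sum(w_row))               # weighted mean over criteria
    return P

def rank(P):
    # Step 5: smaller permanent value ranks better (rank 1 = best)
    order = sorted(range(len(P)), key=lambda i: P[i])
    ranks = [0] * len(P)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

C = [[6.7, 2.7], [3.3, 1.6], [1.0, 0.9]]   # three models, two criteria
W = [[0.9, 0.8], [0.4, 0.3], [0.0, 0.1]]   # weights W_ij = 1 - Z_ij from Step 2
P = permanent_values(C, W)
assert rank(P) == [3, 2, 1]                # lowest permanent value ranks first
```

Note that a model whose weights sum to zero (the best fit on every criterion) would need special handling; in the paper's tables every model keeps at least one non-zero weight.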
V. SECTION – 5
A. Data Validation, Data Set and Data Analysis
1) Model Validation:
To illustrate the estimation procedure of the software reliability growth models (existing as well as proposed), we have
carried out data analysis on two real software data sets. The first data set (Data Set-1) is cited from Musa, Iannino and Okumoto [20]. The
software was a real-time control system, developed by Bell Laboratories, for which 21700 object instructions were delivered; the data represent the faults observed during system testing over 25 hours of CPU time. The second data set
(Data Set-2) is obtained from H. Pham [21]. In this data set the number of faults detected in each week of testing is recorded, along with the
cumulative number of faults since the start of testing; testing consumed 416 hours per week. It
provides the cumulative number of faults for each week up to 21 weeks, with 43 failures observed during system testing over 8736
hours of CPU time. The parameters of the models have been estimated using the statistical package SPSS. The results of the
parameter estimation of the SRGMs using Data Set-1 and Data Set-2 are shown in Table-3 and Table-4, respectively.
The fitting of the various SRGMs and the proposed model to Data Set-1 and Data Set-2 is graphically illustrated in Figure-1
and Figure-2, respectively. The fit between the estimated values of the proposed SRGM and the actual fault values for Data Set-1 and Data Set-2
is shown in Figure-3 and Figure-4, respectively.
[Plot: Cumulative Number of Faults (0 to 160) versus Execution Time in Hours (0 to 30); curves for the Actual Data, Models 1 to 7 and the Proposed Model-8.]
Figure-1: Goodness of Fit Curves for various SRGMs and Proposed Model for Data Set-1 (Musa et al.)
[Plot: Cumulative Number of Faults (0 to 45) versus System Test Hours (0 to 10000); curves for the Actual Data, Models 1 to 7 and the Proposed Model-8.]
Figure-2: Goodness of Fit Curves for various SRGMs and Proposed Model for Data Set-2 (Pham)
[Plot: Cumulative Number of Faults versus Execution Time in Hours; curves for the Actual Data and the Proposed Model-8.]
Figure-3: Goodness of Fit Curves for Actual Data and Proposed Model for Data Set-1 (Musa et al.)
[Plot: Cumulative Number of Faults versus System Test Hours; curves for the Actual Data and the Proposed Model-8.]
Figure 4: Goodness of Fit Curves for Actual Data and Proposed Model for Data Set-2 (Pham)
4) Ranking of SRGMs based on Weighted Criterion:
The ranking of these eight software reliability growth models (seven existing and one proposed), based on the ten criteria
values described above, is calculated using weighted criterion values for Data Set-2. The weighted values of the criteria
are shown in Table-7 and the permanent values of the models and their ranking are shown in Table-8.
Model             R2     MSE     BIAS   Variance  RMPSE   MAE    MEOP   AE     SSE      PRR     SUM
Model-1           0.969  6.723   2.740  12.260    12.560  3.030  2.880  0.053  236.610  10.070  287.895
Model-2           0.985  3.273   1.570  7.020     7.190   1.730  1.650  0.032  62.370   23.620  109.440
Model-3           0.985  3.455   1.570  7.020     7.190   1.830  1.730  0.032  62.370   23.620  397.335
Model-4           0.984  3.733   1.560  6.960     7.130   1.810  1.720  0.017  67.160   75.750  794.670
Model-5           0.985  3.410   1.500  6.710     6.870   1.750  1.660  0.027  62.370   15.230  100.512
Model-6           0.992  1.783   1.050  4.680     4.790   1.220  1.160  0.002  33.660   8.430   57.767
Model-7           0.910  19.547  3.780  16.900    17.320  4.180  3.970  0.081  391.290  23.300  481.278
Proposed Model-8  0.995  1.037   0.860  3.860     3.950   1.010  0.950  0.025  21.420   1.710   35.817
Maximum           0.995  19.547  3.780  16.900    17.320  4.180  3.970  0.081  391.290  75.750  481.278
Minimum           0.910  1.037   0.860  3.860     3.950   1.010  0.950  0.002  21.420   1.710   35.817
Table-7: Weighted Values of Criteria of various SRGMs
Criteria Weighted Matrix W of 8 SRGMs (rows) and 10 comparison criteria (columns) is given below:
W = [ 0.306  0.31  0.644  0.64  0.644  0.64  0.639  0.65  0.582  0.11
      0.118  0.12  0.243  0.24  0.242  0.23  0.232  0.38  0.111  0.30
      0.118  0.13  0.243  0.24  0.242  0.26  0.258  0.38  0.111  0.30
      0.129  0.15  0.240  0.24  0.238  0.25  0.255  0.19  0.124  1.00
      0.118  0.13  0.219  0.22  0.218  0.23  0.235  0.32  0.111  0.18
      0.035  0.04  0.065  0.06  0.063  0.07  0.070  0.00  0.033  0.09
      1.00   1.00  1.00   1.00  1.00   1.00  1.00   1.00  1.00   0.29
      0.00   0.00  0.00   0.00  0.00   0.00  0.00   0.29  0.00   0.00 ]
Weighted Criteria Values Matrix V of 8 SRGMs (rows) and 10 comparison criteria (columns) is given below:
V = [ 0.30  2.07   1.76  7.90   8.09   1.93  1.84  0.03  137.66  1.14
      0.12  0.40   0.38  1.70   1.74   0.39  0.38  0.01  6.91    6.99
      0.12  0.45   0.38  1.70   1.74   0.47  0.45  0.01  6.91    6.99
      0.13  0.54   0.37  1.65   1.70   0.46  0.44  0.00  8.31    75.75
      0.12  0.44   0.33  1.47   1.50   0.41  0.39  0.01  6.91    2.78
      0.04  0.07   0.07  0.29   0.30   0.08  0.08  0.00  1.11    0.77
      0.91  19.55  3.78  16.90  17.32  4.18  3.97  0.08  391.29  6.79
      0.00  0.00   0.00  0.00   0.00   0.00  0.00  0.01  0.00    0.00 ]
Model Permanent Value and Ranking
Model             Sum of Weights  Sum of Weighted Values  Permanent Value  Rank
Model-1           5.162           162.71                  31.52            6
Model-2           2.212           19.02                   8.60             5
Model-3           7.373           19.22                   2.61             3
Model-4           2.811           89.35                   31.78            7
Model-5           1.980           14.34                   7.24             4
Model-6           4.792           2.81                    0.59             2
Model-7           9.292           464.77                  50.02            8
Proposed Model-8  0.291           0.01                    0.03             1
Table-8: The Permanent Values of Models and Ranking
6) Data Analysis:
It is clear from Table-5 that, for Data Set-1, the values of the comparison criteria MSE, BIAS, Variance, RMPSE, AE and PRR of the proposed
SRGM are the lowest compared to the existing SRGMs, and the value of R2 is very close to 1, which shows the goodness-of-fit of
the proposed model. Table-6 shows that, for Data Set-2, the values of the comparison criteria MSE, BIAS, Variance, RMPSE, MAE, MEOP, SSE
and PRR of the proposed SRGM are also the lowest compared to the existing SRGMs, and the value of R2 is very close to 1, which
further shows the goodness-of-fit of the proposed model. It is also clear from Table-8 that the ranking of the proposed model is 1 as
compared to the existing SRGMs.
VI. SECTION - 6
A. Conclusion:
In this paper, we studied various existing software reliability growth models based on testing coverage and also
proposed a testing coverage software reliability growth model. We also explained a new ranking methodology based on
weighted criteria and used it to evaluate the software reliability growth models. The results of the comparison criteria for both data sets,
Data Set-1 and Data Set-2, show the goodness-of-fit of the proposed model. This paper also addresses the issue of optimal
selection of testing coverage SRGMs based on weighted criteria. The weighted criteria method is suitable for ranking
software reliability growth models on a set of criteria taken all together, and it uses a relatively
simple mathematical formulation and straightforward calculation. We conclude that the proposed
testing coverage SRGM ranks 1 under the weighted criteria method, which also matches the goodness-of-fit results obtained from the individual comparison criteria.
VII. REFERENCE:
[1]. Gokhale SS, Philip T, Marinos PN, Trivedi KS (1996) Unification of finite failure non-homogeneous Poisson process models through test
coverage. In: Proceedings 7th International Symposium on Software Reliability Engineering, White Plains, pp 299–307.
[2]. Shaik Mohammad Rafi et al., "Software Reliability Growth Model with Logistic-Exponential Test-Effort Function and Analysis of
Software Release Policy", (IJCSE) International Journal on Computer Science and Engineering, Vol. 02, No. 02, 2010, 387-399.
[3]. Pham H, Zhang X (2003) NHPP software reliability and cost models with testing coverage. Eur J Oper Res 145(2):443–454.
[4]. Kapur PK, Singh O, Gupta A (2005) Some modeling peculiarities in software reliability. In: Kapur PK, Verma AK (eds)
Quality, reliability and infocom technology, trends and future directions. Narosa Publications Pvt. Ltd., New Delhi, pp 20–34.
[5]. H. Pham, Software Reliability, Springer, Berlin, 2000.
[6]. H. Pham, X. Zhang, Software release policies with gain in reliability justifying the costs, Annals of Software Engineering 8 (1999) 147–
166.
[7]. H. Pham, Software reliability, in: J.G. Webster (Ed.), Wiley Encyclopedia of Electrical and Electronics Engineering, Wiley, New York,
2000.
[8]. A. Wood, Predicting software reliability, IEEE Computer 11 (1996) 69–77.
[9]. Singh M, Bansal V, "Parameter Estimation and Validation Testing Procedures for Software Reliability Growth Model",
International Journal of Science and Research (IJSR), ISSN (Online): 2319-7064, 5(12), 1675-1680, 2016.
[10]. S. Yamada, Software quality/reliability measurement and assessment: Software reliability growth models and data analysis, Journal of
Information Processing 14 (3) (1991) 254–266.
[11]. S. Yamada, K. Tokuno, S. Osaki, Imperfect debugging models with fault introduction rate for software reliability assessment,
International Journal of Systems Science 23 (12) (1992).
[12]. Malaiya YK, Li MN, Bieman JM, Karcich R (2002) Software reliability growth with test coverage. IEEE Trans Reliability 51(4):420–
426.
[13]. Malaiya YK, Li N, Bieman J, Karcich R, Skibbe B (1994) The relationship between test coverage and reliability. In: Proceedings of the
5th International Symposium Software Reliability Engineering, Monterey, CA, pp 186–195.
[14]. Yamada S, Ohba M, Osaki S (1983) S-shaped software reliability growth modelling for software error detection. IEEE Trans Reliability
R-32(5):475–484.
[15]. Pham H, Zhang X (2003) NHPP software reliability and cost models with testing coverage. European Journal Operational Research
145(2):443–454.
[16]. Kapur PK, Singh O, Gupta A (2005) Some modelling peculiarities in software reliability. In: Kapur PK, Verma AK (eds)
Quality, reliability and infocom technology, trends and future directions. Narosa Publications Pvt. Ltd., New Delhi, pp 20–34.
[17]. Inoue S, Yamada S (2008) Two dimensional software reliability assessment with testing coverage. The 2nd International Conference on
Secure System Integration and Reliability Improvement, pp 150–155
[18]. Yamada S, Ohtera H and Narithisa H, “Software Reliability Growth Models with Testing Effort” IEEE Trans. on Reliability, R-35 (1),
pp. 19-23, 1986.
[19]. Pham H (2006) System software reliability. Springer Series in Reliability Engineering, Springer-Verlag, London.
[20]. Musa JD, Iannino A and Okumoto K, Software Reliability: Measurement, Prediction, Applications, McGraw Hill, 1987.
[21]. Pham H, "An Imperfect-debugging Fault-detection Dependent-parameter Software", International Journal of Automation and
Computing, 04(4), October 2007, 325-328, DOI: 10.1007/s11633-007-0325-8.
Abstract— A secure communication channel is an imperative part of any significant communication in different systems. In this paper, two well-known algorithms, RSA and Diffie-Hellman, are combined in order to form a secure communication channel. This paper uses RSA for authentication and Diffie-Hellman for key exchange. After formation of the channel, the nodes will use the exchanged key for encryption of their messages. Moreover, construction of the secure communication channel is accelerated using hardware/software (HW/SW) co-design, so that it uses the advantages of both hardware and software. A software-only implementation is time consuming, while an implementation on a HW/SW co-design platform speeds up the secure communication. The experimental results show that this kind of design is 6 times faster than a software-only one. In addition, the memory overhead and computational overhead of this method are approximately trivial. The implementation in this paper was done with the Xilinx Embedded Development Kit (EDK) software tool, which is a suitable means to implement different HW/SW co-design projects.

Keywords—Secure Communication Channel, Authentication, Key Exchange, RSA Protocol, Diffie-Hellman Protocol, Field Programmable Gate Array (FPGA), EDK Tool, Co-design.

I. INTRODUCTION

In modern days of vast use of embedded systems, attention to their non-functional requirements has risen; one of the key concepts is security, which consists of three main aspects: confidentiality, integrity and availability. Since availability is usually covered by the functional requirements, the focus is usually on the other two. The known solutions for these are encryption and authentication. For encryption, the 'key' plays the most important role, and a secure mechanism for key exchange is vital. Moreover, in unsecure channels there is the possibility of man-in-the-middle attacks [1], which focuses the attention toward authentication. Therefore, provision of security to communication nodes is indispensable. In this paper, a secure protocol is implemented to connect two (or more) embedded systems. The protocol steps are explained in the following.

1- Node Setting
2- Authentication
3- Key Exchange

Authentication is done with the help of another node as a trusted third party, which in this situation is the constructor of the system. The RSA algorithm is an asymmetric cryptography algorithm, which is used for digital signatures and confidential applications [2]. The Diffie-Hellman algorithm is a key-sharing algorithm in which both nodes securely obtain the same key without transmission of any key over the channel [3]. In this paper, key exchange is similar to the SPEKE algorithm, with the prominent difference that hashing functions are removed and the asymmetric signature algorithm RSA is added [4].

This paper proposes a new design to accelerate the construction of a secure communication channel; then, it evaluates the implementation of this design. A hardware implementation usually has better performance compared with a software one, while the latter has better flexibility, usage of the empty space of the processor, and a lack of hardware design complexity and area overhead. Therefore, this secure communication is implemented using HW/SW co-design to access the beneficial aspects of both sides. The goal of this paper is to increase the speed of secure communication construction and enhance its performance with enough flexibility and usage of the empty space of the processor. To obtain this goal, the complex and time-taking part of the algorithm should be implemented in hardware. Modular exponentiation is the part of the algorithm that takes a long execution time in a software implementation because of its complexity. The hardware part is implemented in VHDL.

FPGAs are suitable candidates for a HW/SW co-design aimed at higher performance [5, 6], and microprocessors are used for higher flexibility and more features. Implementing modular exponentiation on an FPGA speeds up the construction of a secure communication channel. In this paper, modular exponentiation is used in both algorithms of the proposed protocol; the time-taking parts are mapped into the FPGA logic blocks, whereas generation of public and private keys is performed by a software processor (i.e., MicroBlaze) in a Xilinx FPGA. Therefore, using HW/SW co-design, a high speed secure communication channel is constructed.

There is a cost of area with increasing the speed; but since in the HW/SW co-design the area overhead is very little, it can usually be placed in the empty areas of the
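Modular exponentiation, singled out above as the time-taking kernel, is typically computed by square-and-multiply, iterating once per exponent bit; a minimal software sketch (Python, purely illustrative; the paper's actual kernel is VHDL on the FPGA):

```python
def mod_exp(base, exp, mod):
    """Left-to-right square-and-multiply modular exponentiation.

    Hardware implementations walk the exponent bits the same way, which is
    why the operation maps naturally onto FPGA logic.
    """
    result = 1
    base %= mod
    for bit in bin(exp)[2:]:                 # most significant bit first
        result = (result * result) % mod     # square at every step
        if bit == "1":
            result = (result * base) % mod   # multiply when the bit is set
    return result

# Agrees with Python's built-in three-argument pow()
assert mod_exp(7, 103, 143) == pow(7, 103, 143)
assert mod_exp(5, 117, 19) == pow(5, 117, 19)
```

The loop runs in O(log exp) multiplications, but each multiplication on thousand-bit operands is what makes the software version slow; the co-design moves exactly this loop into hardware.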
FPGA; but if the design were done using only hardware, not only would it lack the required flexibility, but it would also usually require so much area that a bigger FPGA would be needed.

This kind of implementation is 6 times faster than another one implemented using software only. Moreover, its area and computational overhead are measured; because of their trivial values, these overheads can be ignored in comparison with the obtained speed.

The rest of the paper is organized as follows: related works are presented in Section II. Section III covers the background knowledge of the context. Section IV depicts the proposed method and implementation details; Section V discusses the experimental results, and the paper is wrapped up with the conclusion in Section VI.

II. RELATED WORKS

Some HW/SW co-design projects were done in previous years, but the proposed method is a new method to accelerate a secure communication channel.

In 2009, at Bristol University, a HW/SW co-design of public-key cryptography for the Secure Socket Layer (SSL) was executed in embedded systems. In this work, the hardware part included the complex processor SPARC V8 with a set of mathematical operations on elliptic curves, while the matrix of the secure socket layer was implemented in the software part. This implementation increased the speed of the public-key operations in the handshaking process of the secure socket layer. The result of the work on a 20 MHz SPARC processor was 10 times faster than a software-only implementation [7]. The DMA controller, ECC accelerator and total design were remarkably large.

In 2010, a project was done for elliptic curve cryptography on the PicoBlaze microcontroller. In this project, a scalable elliptic curve cryptographic processor with limited resources was set on an FPGA. The result of this work was compared with implementations on different Xilinx FPGAs (Spartan) based on scalability and area overhead [8].

In 2010, the RSA algorithm was implemented based on two methods: one of them used hardware only and the other used software only on the 8051 microcontroller. This paper showed that there is no single complete method; therefore, there is a trade-off between performance and flexibility based on the type of application. The result of this paper was that the speed of the hardware-only implementation was 4 times higher than the other one [9].

In 2012, the hash algorithm SHA-256 was implemented using HW/SW co-design and its performance had a tangible boost. This project used pipelining methods beside the hardware part. The framework of this work was the Virtex family, and it doubled the operational power [10].

[11] discusses the implementation of AES in a wireless sensor network using HW/SW co-design. Regarding the environment, they focused mainly on the power consumption of the implementation. Their implementation shows 6 times better performance than the software-only one.

[12] has provided a co-design flow to use a high level description of a system and later take parts of it for hardware acceleration. They implemented a hardware/software co-design of AES on NIOS II with a hardware accelerator. They described the algorithm in Catapult C, which converts C code into an RTL language. They reached around 8 times better performance in co-design compared to a pure software implementation of AES.

In 2015, a high throughput wireless communication system was implemented using HW/SW co-design. The goal of this work was reducing hardware design effort and time in order to provide a reliable design [13].

[14] proposed a HW/SW co-design of RSA in order to obtain performance and flexibility. It adopted the Xilinx Zynq-7000 SOC platform, which integrates a dual-core ARM-A9 system with Xilinx programmable logic.

III. BACKGROUND KNOWLEDGE

This section presents the different steps of the two algorithms RSA and Diffie-Hellman. This paper concerns two secure functions in embedded systems: 1) authentication, 2) encryption. For authentication, we use asymmetric cryptography to ensure the identity of each end of the communication. For encryption, the algorithm itself is flexible, but for its preparation we use a secure key-sharing algorithm. We call the combination of authentication and key-sharing a setup suite of the secure channel. The main focus of this paper is to accelerate the implementation of this suite using the co-design approach.

A. Authentication

One of the most imperative dangers in an unsecure environment is masquerading. This kind of attack may have remarkable negative effects on systems. The known defense mechanism is authentication; each node makes sure about the identity of the node it communicates with at the beginning of the communication. There are many protocols that can perform this function. A digital signature of a trusted third party (TTP) is one of the most secure protocols. This method is used for security and authentication of various websites in the secure socket layer (SSL). Its drawback is that finding the trusted third party is arduous; in this paper, the constructor of the embedded system is identified as the trusted third party.

Asymmetric cryptography can be used for authentication; it needs two different keys for encryption and decryption, whereas symmetric cryptography requires only one key. Therefore, a sender can send information securely to anyone. To do that, one key is identified as the public key and the other as the private key, which is available only to the receiver.

Another application of asymmetric cryptography is the digital signature; if the sender of a packet encrypts it using its private key, any node can decrypt it using the associated public key. In this situation, in contrast with encryption, the focus is on the integrity aspect of security, and the receiver
is assured that the packet is approved by the sender using hence, key renewal is suggested on regular basis for each
its private key and it has the sender’s signature. communication to decrease possibility of finding those
RSA is the most famous and applicable algorithm among keys.
asymmetric algorithms. Security of RSA is related to There are many algorithms for secure key exchange; one
complexity of solving the problem of decomposition into of them is Diffie-Hellman. Its security is based on the
prime numbers. The various steps of this algorithm are complexity of solving the modulus logarithm. Different
presented in the following [2]. steps of this algorithm is informally explained in the
1- Two large prime numbers (𝑃, 𝑄) are selected. following [3].
2- 𝜑(𝑁) = (1 − 𝑃)(1 − 𝑄), 𝑁 = 𝑃 ∗ 𝑄 1- Both nodes select one modulus (𝑁) and one prime
3- To generate public key (e), one number is selected number (𝑃).
between 1 and (N) such that it should be relatively 2- Each of both nodes selects one value (𝑎), and sends
prime to (N). the value 𝑃𝑎 to another node.
4- To generate private key (d), 𝑒. 𝑑 ≡ 1 𝑚𝑜𝑑 𝜙(𝑁) 3- Both nodes generate the same value (𝐾𝑒𝑦), which is
5- 𝐸𝑛𝑐𝑟𝑦𝑝𝑡(𝑚) = 𝑐 ≡ 𝑚𝑒 𝑚𝑜𝑑 𝑁 received value 𝑃𝑎 to the exponent of their private
𝐷𝑒𝑐𝑟𝑦𝑝𝑡(𝑐) = 𝑚 ≡ 𝑐 𝑑 𝑚𝑜𝑑 𝑁 number.
Trusted third party selects both of public (𝑃𝑈𝑇 ) and Figure 3 formally describes the steps which were
private key (𝑃𝑅𝑇 ) for itself; these keys are used for digital described above.
signature. Constructor of the embedded system can hard- 𝑎𝑙𝑖𝑐𝑒 ↔ 𝑏𝑜𝑏: 𝑃, 𝑁
code all the required information including node identification number, public key, private key, and modulus of operation in that node [2]. Figure 1 shows all fields, which are set in each node.

∃x: TTP → x: { id_x, PU_x, PR_x, N, [id_x, PU_x]_PRT }

Figure 1. Required fields in each node

Each node sends its unique identification number along with its signature in a packet; the receiver then compares the node number in that packet with the received number; if these two numbers are equal, authentication succeeds and the public key is approved. The second node should perform this operation, too [2]. Figure 2 presents the communication steps for authentication.

alice → bob: { id_a, [id_a, PU_a]_PRT }
bob: id'_a, PU_a = [[id_a, PU_a]_PRT]_PUT ; if (id_a = id'_a) OK
bob → alice: { id_b, [id_b, PU_b]_PRT }
alice: id'_b, PU_b = [[id_b, PU_b]_PRT]_PUT ; if (id_b = id'_b) OK

Figure 2. Communication steps for authentication

At this point authentication ends: if the node number in the packet equals the received one, the node number is approved. In the next step, key exchange should be done, which is explained in the following subsection.

B. Key Exchange

Eavesdropping can be catastrophic depending on the context of the application; the appropriate defence against this kind of attack is cryptography. In cryptography, security rests on the key; therefore, all information may be in danger if that key is compromised. One solution is asymmetric cryptography, but its computational overhead is higher than that of symmetric schemes. Moreover, keys might be found by exhaustive search. The required communication is shown in Figure 3:

alice chooses a; A ≡ P^a mod N
bob chooses b; B ≡ P^b mod N
alice: Key ≡ B^a mod N (Key = (P^b)^a mod N)
bob: Key ≡ A^b mod N (Key = (P^a)^b mod N)

Figure 3. Required communication to generate the public key

IV. THE PROPOSED HW/SW CO-DESIGN

The prominent part of most modern electronic systems is their digital components, which provide a hardware platform on which software programs are executed. The goal of HW/SW co-design is to provide the merits of hardware and software at the same time. The introduction of Computer-Aided Design (CAD) has made HW/SW co-design a hot topic, since there is a tangible preference for HW/SW co-design tools, which play an imperative role in the market [15].

In this paper, the Xilinx Embedded Development Kit (EDK) software tool provides enough capacity for co-design. To create a processor, the Xilinx Platform Studio (XPS) tool is used. Moreover, custom peripherals are added using the import peripheral wizard. In the proposed method, modular exponentiation, which is the time-consuming part of the algorithm, is implemented in hardware, and the rest of the algorithm is implemented in software to obtain better performance with acceptable flexibility.

A. Implementation

In order to implement the proposed secure communication channel, the EDK tool is used; its processor is a MicroBlaze system. The MicroBlaze processor is a 32-bit Reduced Instruction Set Computer (RISC) architecture optimized for implementation in Xilinx FPGAs, with separate 32-bit instruction and data buses running at full speed to execute programs and access data from both on-chip and external memory at the same time. MicroBlaze is a configurable and user-friendly processor that can be used across FPGA and all-programmable SoC families.
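The Diffie–Hellman exchange of Figure 3, together with the square-and-multiply modular exponentiation that the co-design offloads to hardware, can be sketched in software. This is a toy illustration with tiny, made-up constants (P, N, a, b); real deployments use large primes and the TTP-signed keys described above.

```python
def mod_exp(base, exp, n):
    """Square-and-multiply modular exponentiation: base**exp mod n."""
    result = 1
    base %= n
    while exp > 0:
        if exp & 1:                        # multiply step for each set exponent bit
            result = (result * base) % n
        base = (base * base) % n           # square step
        exp >>= 1
    return result

# Toy public parameters (illustrative only).
P, N = 5, 23

a, b = 6, 15                  # private values chosen by alice and bob
A = mod_exp(P, a, N)          # alice sends A = P^a mod N
B = mod_exp(P, b, N)          # bob sends B = P^b mod N

key_alice = mod_exp(B, a, N)  # alice computes (P^b)^a mod N
key_bob = mod_exp(A, b, N)    # bob computes (P^a)^b mod N
assert key_alice == key_bob   # both sides derive the same shared key
```

The `mod_exp` loop is the kernel whose hardware implementation gives the speedup reported in Section V.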
374 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
Figure 8. Hardware view of the proposed method with its peripherals

V. EXPERIMENTAL RESULTS

This section demonstrates the boost in the running speed of the secure communication channel construction using the proposed HW/SW co-design. Table 1 presents the running time of the modular exponentiation part and of the whole proposed protocol. As can be observed, this part is the time-consuming part of the protocol because it is called 8 times in the protocol.

Table 1. Running time of modular exponentiation and the whole proposed method
Running Time of the Modular Exponentiation (PS): 96,688,000
Running Time of the Proposed Method (PS): 773,505,000

Table 2 shows the running time of the proposed protocol in two different implementations: a software-only implementation and the HW/SW co-design one. It is clear that the latter is 6 times faster than the former.

Table 2. Comparison of the software-only implementation and the HW/SW co-design based on running time
Type of Implementation: Running Time of the Proposed Method (PS)
Software-Only: 773,505,000
HW/SW Co-design: 128,919,928

Table 3 presents the memory and computational overhead of the HW/SW co-design implementation in order to analyze its hardware overhead. The memory overhead of the implementation can be neglected, since this amount can usually be found unused in modern FPGA applications.

Table 3. Memory and computational overhead of the proposed method
Memory Overhead (KB): 340,640
Computational Overhead (ms): 130,041

There is a trade-off between these overheads and the speed acceleration, which can be tuned for various desired applications.

VI. CONCLUSIONS

This paper proposes a new method to accelerate the construction of a secure communication channel using HW/SW co-design. The method uses the strengths of both software and hardware to obtain a more suitable result. In this design, the time-consuming part of the proposed method is implemented in hardware in order to obtain better performance; the rest of the algorithm is implemented in software in order to get higher flexibility.

EDK is a suitable tool to implement HW/SW co-design. The results of the implementation show that this method achieves a remarkable acceleration of about 6 times over the software-only implementation, with low hardware overhead. The results also show that the proposed method obtains better performance than existing methods of secure communication channel construction.

REFERENCES

1. Johnston, A. M., Gemmell, P. S. "Authenticated Key Exchange Provably Secure against the Man-in-the-Middle Attack", Journal of Cryptology, pp. 139-148, 2001.
2. Rivest, R. L., Shamir, A., Adleman, L. "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems", Communications of the ACM, 21(2), pp. 120-126, 1978.
3. Diffie, W., Hellman, M. "New Directions in Cryptography", IEEE Transactions on Information Theory, 22(6), pp. 644-654, 1976.
4. Jablon, D. P. "Strong Password-Only Authenticated Key Exchange", ACM SIGCOMM Computer Communication Review, 26(5), pp. 5-26, 1996.
5. Ferrandi, F., Santambrogio, M. D., Sciuto, D. "A Design Methodology for Dynamic Reconfiguration: The Caronte Architecture", Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium, pp. 4-8, April 2005.
6. Mhadhbi, I., Litayem, N., Othman, S. B., Saoud, S. B. "Impact of Hardware/Software Partitioning and MicroBlaze FPGA Configurations on the Embedded Systems Performances", in Complex System Modelling and Control through Intelligent Soft Computations, Springer International Publishing, pp. 711-744, 2015.
7. Koschuch, M., Grobschadl, J., Page, D., Grabher, P., Hudler, M., Kruger, M. "Hardware/Software Co-design of Public-Key Cryptography for SSL Protocol Execution in Embedded Systems", ICICS 2009, LNCS 5927, pp. 63-79, 2009.
8. Hassan, M. N., Benaissa, M. "A Scalable Hardware/Software Co-design for Elliptic Curve Cryptography on PicoBlaze Microcontroller", Circuits and Systems (ISCAS), IEEE, 2010.
9. Uhsadel, L., Ullrich, M., Verbauwhede, I., Preneel, B. "HW/SW Co-design of RSA on 8051", European Workshop on Microelectronics Education, pp. 41-44, 2012.
10. Michail, H., Athanasiou, G., Kritikakou, A., Goutis, C., Gregoriades, A., Papadopoulou, V. "Ultra High Speed SHA-256 Hashing Cryptographic Module for IPSEC Hardware/Software Codesign", Proceedings of the International Conference on Security and Cryptography (SECRYPT), IEEE, pp. 1-5, July 2010.
11. Otero, C. T. O., Tse, J., Manohar, R. "AES Hardware-Software Co-design in WSN", Asynchronous Circuits and Systems (ASYNC), 21st IEEE International Symposium on, pp. 85-92, May 2015.
ABSTRACT
A Wireless Sensor Network is a collection of sensor nodes that cooperate with each other to send data to a base station.
These nodes have limited resources in terms of energy, memory, and processing power. Energy conserving
communication is one of the main challenges of wireless sensor networks. Much research has focused
on saving energy and extending the lifetime of these networks. Architectural approaches, like hierarchical structures,
tend to organize network nodes in order to save energy. Most of these protocols need background information on the
network for them to be efficient. In this paper, we describe a new approach for organizing large sensor networks into
zones, based on the number of hops, to address the following issues: large-scale, random network deployment, energy
efficiency and small overhead. This network architecture enables a hierarchical network view, with the purpose of
offering efficient routing protocols based on zone partitioning. Simulations undertaken demonstrate that our approach
is energy-efficient; this is highlighted by the reduction of traffic overhead.
KEYWORDS
communicating the collected data via nodes to the base station. Despite the advantages of clustering protocols, most of them require further information on the network (e.g. node energy, connectivity, geographical position). That leads to an overload in the network due to the number of sent packets. Consequently, both the energy and lifetime of the network decrease.

In wireless, mobile and multi-hop networks, routing protocols should be able to deal with random node deployment. This means that even though sensors' positions are known (manually deployed), no particular hypotheses concerning their neighbors can be made due to the large scale of the network (neighborhood discovery protocols need to be implemented). Therefore, sensor networks are considered a subclass of ad hoc networks because of the absence of an infrastructure. Thus, ad hoc networking may influence some routing approaches in wireless sensor networks, with respect to their topologies. In hierarchical structures, topology control can be applied to minimize the set of active nodes (switching off some of them to preserve energy) or to define coordination tasks for some particular nodes. We are interested in hierarchical structures because flat architectures generally depend on the size of the network, which makes routing approaches difficult to scale. In either approach, an important issue that needs to be addressed is the most crucial aspect: energy efficiency.

In our work, we propose a new approach of node grouping (ZHRP [1]) in zones for large WSNs, where zone construction uses the number of hops as the metric. No other information on the network is needed. For this purpose, an inexpensive neighborhood discovery algorithm is proposed. The idea is to distribute routing roles between nodes inside a zone, avoiding cluster management (including cluster head election and rotation, and cluster construction as in classical hierarchical approaches). The zone topology we propose does not intend to give a management role to specific nodes: the nodes on the zone border will help routing between the zones, and all nodes of a zone have the same function inside their zone. Moreover, it does not need prior information on the network.

2 Related work
Energy consumption is one of the main challenges in wireless sensor networks. Energy saving assures a long lifetime for the system. Another main goal is reducing the size of the stored data (e.g. the routing table) in each node of the network. Recent research in Wireless Sensor Networks focuses on increasing the lifetime of the system by decreasing the energy consumption of each node in the network ([2], [3], [4]). Because of the importance of energy consumption optimization, particular interest is oriented towards routing protocols.

2.1 WSN Routing Protocols
Ad hoc routing protocols (AODV [5], DSR [6], and DSDV [7]) may be used as network protocols for sensor networks. However, such approaches will generally not be good candidates for sensor networks, for the following main reasons ([8]): (i) sensors have low battery power and low memory availability; (ii) the routing table size scales with the network size. According to the structure of the network, the routing protocols in WSN are classified as follows ([9]):

2.1.1 Flat-based routing
In flat networks, each node typically plays the same role and sensor nodes collaborate to communicate the sensed data. Due to the large number of such nodes, it is not feasible to address each node. This consideration has led to data centric routing, where the BS (base station) sends queries to certain regions and waits for the data from the sensors located in the selected regions. Early works on data centric routing, e.g. SPIN [2] and directed diffusion [10], were shown to save energy through data negotiation and redundant data elimination.

2.1.2 Location based routing
In this type of routing, sensor nodes are addressed by means of their locations. The distance between neighboring nodes can be estimated on the basis of incoming signal strengths. Relative coordinates of neighboring nodes can be obtained by exchanging such information between neighbors, as in the GEAR [11] and SPAN [12] protocols. Alternatively, it may be possible to obtain location information using existing infrastructure, such as the satellite-based GPS (Global Positioning System), if the nodes are equipped with a low power GPS receiver, as in the GAF protocol [13].
2.1.3 Hierarchical routing
Hierarchical routing (Table 1), originally proposed in wired networks, is a well-known technique with advantages related to scalability and efficient communications. The concept of hierarchical network architecture is also used to perform energy-efficient routing in wireless sensor networks. That is because in a hierarchical architecture, higher-energy nodes can be used to process and send information while lower-energy nodes can be used to perform the sensing in the proximity of the target. LEACH [14] and HPAR [15] are two known hierarchical routing protocols. Hierarchical architectures are efficient ways to lower energy consumption, performing data aggregation and fusion in order to decrease the number of messages transmitted to the BS.

Table 1. Some Available Hierarchical Routing Protocols

Cluster heads are generally selected based on parameters such as remaining energy and connectivity with other nodes ([18], [14], [19]).

There are many existing clustering protocols. LEACH [14] is a distributed clustering-based protocol that uses randomized rotation of the CHs to evenly distribute the energy load among the sensors in the network. LEACH assumes that the fixed sink is located far from the sensors and that all sensors in the network are homogeneous and battery-constrained. Lin's protocol [20] is a distributed clustering technique for large multi-hop mobile wireless networks. The cluster structure is controlled by the hop distance. In each cluster, one of the nodes is designated as cluster head. Other nodes join a cluster if they are within a predetermined maximum number of hops from the cluster head. HEED [21] is a distributed clustering protocol that periodically selects cluster heads according to a hybrid function of their residual energy and a secondary parameter, such as node proximity to its neighbors or node degree.
Here are some hierarchical protocols listed according to their classes:

The challenge addressed in this paper presents an approach of virtual structuring of networks without using topology control techniques. Our contribution to topology construction addresses two main issues in WSNs: distributed approaches and energy efficiency. Moreover, our approach is independent of the embedded sensor technology (whether or not the transmission power can be varied); the only parameter considered is the current node's transmission range. The algorithm is executed simultaneously with the neighborhood discovery protocol for random sensor node deployments.

3 Our Approach
In this section we will detail the new approach for sensor grouping in multiple zones.

Nodes are classified into zones. In each zone, an Intra-Zone Routing Table will be constructed at all nodes. Then, at the Border Nodes, an Inter-Zone Routing Table is constructed. When a node wants to send a packet, it uses the Intra-Zone Routing Table to send the packet to one of its zone's Border Nodes. The Border Node then uses the Inter-Zone Routing Table to send the packet to the destination zone. At the destination zone, a Border Node will receive the packet, then it uses its Intra-Zone Routing Table to send the packet to the destination node.
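The grouping idea above (zones built purely from hop counts) can be illustrated with a multi-source breadth-first search over the connectivity graph; the initiator set and first-come tie-breaking below are hypothetical simplifications, not the paper's exact construction.

```python
from collections import deque

def build_zones(adjacency, initiators):
    """Assign each node to the zone of the hop-nearest initiator (multi-source BFS).

    adjacency:  dict node -> list of neighbor nodes
    initiators: nodes that each seed one zone (hypothetical choice)
    """
    zone = {init: i for i, init in enumerate(initiators)}
    queue = deque(initiators)
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in zone:          # the first zone to reach a node claims it
                zone[v] = zone[u]
                queue.append(v)
    return zone

# A small line network 0-1-2-3-4-5, with zones seeded at nodes 0 and 5:
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
zones = build_zones(adj, [0, 5])
# Nodes 0-2 end up in zone 0, nodes 3-5 in zone 1.
```

Only hop counts are used, matching the claim that no other network information (energy, position, connectivity statistics) is needed.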
3.1.2 Intra-Zone Routing Table Construction Stage
The second stage in ZHRP is the Intra-Zone Routing Table Construction [26]. In this stage, nodes in the same zone learn the minimal path to send packets to each other. Each entry in the Intra-Zone table contains the attributes shown in Table 4.

destNodeId: Destination node Id
nextHopId: Next hop Id
M (metric): Number of nodes
nodeType: Node type of the destination node (Border or Normal)
neighZonesId: List of neighboring zones' ids, if the destination node is of type Border

Table 4: Intra-Zone Routing Table Entry Fields

During the ZHRP Intra-Zone Routing Table Construction stage, nodes exchange packets to complete the phase. The exchanged packets have the following structure (Table 5).

srcId: Node Id of the sender
zoneId: Zone id of the destination node
destinationId: Destination node Id
nextHopId: Next node Id
M (metric): Metric computed in number of nodes
nodeType: Node type of the sender node
borderTable: If the sender node is a Border node, then it sends the Border Table

Table 5: Intra-Zone Routing Table Packet Fields

The Intra-Zone table is constructed based on the Distance-Vector Algorithm (Bellman-Ford [27]).
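The entry and packet layouts of Tables 4 and 5 can be written down as record types; the sketch below just mirrors the listed fields, with Python names and types chosen here for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class IntraZoneEntry:
    """One Intra-Zone Routing Table entry (the fields of Table 4)."""
    dest_node_id: int          # destNodeId: destination node Id
    next_hop_id: int           # nextHopId: next hop Id
    metric: int                # M: path length in number of nodes
    node_type: str             # "BORDER" or "NORMAL"
    neigh_zones_id: list = field(default_factory=list)  # only for Border destinations

@dataclass
class IntraTablePacket:
    """One Intra-Zone construction packet (the fields of Table 5)."""
    src_id: int                # srcId: sender node Id
    zone_id: int               # zoneId of the destination node
    destination_id: int        # destinationId: destination node Id
    next_hop_id: int           # nextHopId: next node Id
    metric: int                # M: metric in number of nodes
    node_type: str             # node type of the sender
    border_table: list = field(default_factory=list)    # sent only by Border nodes
```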
The algorithm of Intra-Zone Routing Table Construction is composed of two steps; Figure 3 shows the first step. Each node broadcasts an Intra-Table Construction packet specifying its zoneId, nodeType and borderTable (if it is a Border node). Any node that receives an Intra-Zone Construction Packet checks whether the zoneId of the sender is the same as its own zoneId. If it isn't, it ignores the packet; else, if the received packet's sender zoneId is the same as the receiver's zoneId (the packet comes from the same zone) and there is no entry in the Intra-Zone Table for that sender node, then the node adds an entry to the Intra-Zone table with:

• destinationId as the sender nodeId.
• metric M equal to 1.
• nextHopId the same as the sender nodeId.
• nodeType as the nodeType in the packet.
• If the nodeType is Border, it adds the BorderTable from the packet.

Hence every node knows its neighbors.

In the second step, each receiving node checks whether its Intra-Table already has an entry for the received packet's destinationId. If so, the node checks whether the packet's metric M is less than the entry's metric in the Intra-Table, and updates the entry with the new values from the packet (nextHopId, M, nodeType, BorderTable). Then, the node rebroadcasts the modified entry.

If there is no entry for the received packet's destinationId, the node adds a new entry to the Intra-Table, setting the fields (destinationId, nextHopId, nodeType, M, BorderTable) from the received packet. Then it broadcasts the newly added entry. After this phase, each node will have an Intra-Table with shortest paths for packets to be sent to their destination. The routing algorithm will be discussed later in the ZHRP Data Routing section.

3.1.3 Inter-Zone Routing Table Construction Stage
The third stage in ZHRP is the Inter-Zone Routing Table Construction [25]. In this stage, all Border Nodes will have an Inter-Zone Routing Table which contains information about other zones and the paths that should be taken to reach them.
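The Intra-Zone receive-and-update rule above (add unknown destinations, replace an entry when a cheaper path arrives, rebroadcast on change) is essentially Bellman–Ford relaxation. A minimal sketch, assuming the table is a plain dict and that the received metric is incremented by one hop through the sender (one plausible reading of the update step):

```python
def handle_intra_packet(table, my_zone, pkt):
    """Process one Intra-Zone construction packet.

    table: dict destinationId -> {"next_hop", "metric", "node_type"}
    pkt:   dict with keys zoneId, srcId, destinationId, metric, nodeType
    Returns True when the table changed, i.e. the entry should be rebroadcast.
    """
    if pkt["zoneId"] != my_zone:            # packets from other zones are ignored
        return False
    dest = pkt["destinationId"]
    new_metric = pkt["metric"] + 1          # one extra hop through the sender
    entry = table.get(dest)
    if entry is None or new_metric < entry["metric"]:
        table[dest] = {"next_hop": pkt["srcId"],   # route via the sender
                       "metric": new_metric,
                       "node_type": pkt["nodeType"]}
        return True                         # changed: rebroadcast the new entry
    return False

# First-step broadcast from neighbor 7 about itself (metric 0 before the hop):
table = {}
handle_intra_packet(table, my_zone=1,
                    pkt={"zoneId": 1, "srcId": 7, "destinationId": 7,
                         "metric": 0, "nodeType": "NORMAL"})
# table now maps 7 -> next_hop 7 with metric 1: the neighbor is known.
```

Repeated application of this rule converges to the shortest intra-zone paths, as the Bellman-Ford reference [27] guarantees.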
The second stage is the stage where zones exchange their Inter-Zone Routing Tables with each other to complete them.

In the first stage, each Chief-Node constructs the initial Inter-Zone Routing Table from its Intra-Zone Table: for each entry in the Intra-Zone table whose nodeType is Border-Node, and for each zone in its neighZonesId list, an entry is added to the Inter-Zone Routing Table, setting the destZoneId and the nextZoneId to that zone and the zoneMetric as computed during Intra-Zone Table construction. Figure 5 shows how the initial Inter-Zone Routing Table is constructed at the Chief Border Node.

Table 7 shows the packet fields that are used during the Inter-Zone Routing Table construction.

srcId: Source node Id
nextHopId: Next hop node Id
srcZoneId: The zoneId of the source node
subject: Packet subject
interZoneRoutingTable: The Inter-Zone Routing Table produced by the BORDER-CHIEF
finalDestId: The final destination nodeId

Table 7: Inter-Zone Routing Table Packet
PART C :
IF (n.ZoneId ≠ P.ZoneId) THEN
    Find entryRT in IntraZoneRoutingTable | entryRT.NodeType = CHIEF-BORDER
    Send a packet P’(n.NodeId, entryRT.NextHopId, entryRT.DestNodeId, n.ZoneId, P.Subject, P.InterZoneRoutingTable)
ELSE
    IF (n.NodeId = P.FinalDestId) THEN
        IF (P.Subject = UPDATE_TABLE) THEN
            For each entryBT in BorderTable DO
                Choose randomly node from entryBT.BorderNodesIds
                Send a packet P’ (n.NodeId, node, NULL, n.ZoneId, P.Subject, P.ZoneT)
            ENDDO
        ELSE
            Save P.ZoneT in n.InterZoneRoutingTable
        ENDIF
    ELSE
        Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = P.FinalDestId
        Send a packet P’(n.NodeId, entryRT.NextHopId, P.DestNodeId, n.ZoneId, P.Subject, P.ZoneT)
    ENDIF
ENDIF
ENDIF
ENDIF
OUTPUT : InterZoneRoutingTable at BORDER nodes
Figure 7 shows how nodes act when receiving an Inter-Zone Routing Table. When a Normal node receives an Inter-Zone Routing Table Construction Packet, it just forwards it to its destination. If a Border node receives an Inter-Zone Routing Table Construction Packet from another zone, it changes the nextZoneId of all the packet's Inter-Table entries to the packet's source zoneId and then forwards the packet to the Chief-Border node. If the received packet is from the same zone and the packet's destinationNodeId is the current Border node, then:

If the packet's subject is UPDATE_TABLE, forward the packet to the neighbor Border nodes in the Border Table.

If the subject is COMPLETE_TABLE, save the packet's Inter-Zone Routing Table in the current Border node. If the packet's destinationNodeId is not the current Border node, forward it to the packet's destinationNodeId (getting the nextHopId from the Intra-Table).

If the Chief-Border receives the packet, it updates its Inter-Zone Routing Table. Then, if the Inter-Zone Routing Table is complete (number of zones = number of entries), it sends the Inter-Table to all Border nodes in the same zone, setting the packet's subject to COMPLETE_TABLE.

If the Inter-Zone Routing Table is not complete, the Chief-Border sends its Inter-Table to all Border nodes in the same zone and to the neighbor nodes of neighbor zones, setting the packet's subject to UPDATE_TABLE.

3.2 ZHRP Data Routing
Once the Intra-Zone and Inter-Zone Routing Tables are constructed, data routing can be accomplished easily. The packet structure used in Data Routing contains the following entries:

SrcId: Global id of the sender node
LocalDestId: Destination node Id in the zone
NextHopId: Node Id of the next hop
FinalDestId: Final destination global node Id
DestZoneId: Destination Zone Id
Data: The data to be sent

Table 9: Data Routing Packet Fields

3.2.1 Sending Data Packet
As described in [28], when a node n1 wants to send data to a node n2, it follows the algorithm in Figure 8. If the srcZoneId equals the destZoneId (same zone), then the Intra-Table is used to find the corresponding information to build the packet. Else, if the srcZoneId is not equal to the destZoneId, then: if n1 is a Normal node, it searches the Intra-Table for a Border-Node that is a neighbor of the destination zone. If such Border-Nodes are found, the packet is sent to one of them; if none are found, the packet is sent to any Border-Node.

If the sender node n1 is a Border-Node, then if there exists a node in the borderTable such that the destinationZoneId equals the neighbor border node's zoneId, the packet is sent to it.

Else, if there is no such Border Node in the Border Table, the Inter-Table is searched for an entry (interRecord) such that interRecord.destZoneId equals the packet's destinationZoneId; then a borderTable record (borderTableRecord) is found such that interRecord.NextZoneId equals borderTableRecord.zoneId, and the packet is sent to borderTableRecord.nodeId.

If no such borderTableRecord exists, the Intra-Table is searched for a Border Node that is a neighbor of the destZoneId, and the packet is sent to that border node.
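The sending rules of 3.2.1 can be condensed into a next-hop chooser. This is a simplified sketch: the table shapes are stand-ins for Tables 4-9 (one border node per neighboring zone, no random choice), so it illustrates the case analysis rather than the exact packet handling.

```python
from types import SimpleNamespace

def choose_next_hop(node, dest_id, dest_zone):
    """Pick the next hop for a data packet, following the ZHRP send rules."""
    if dest_zone == node.zone_id:                  # same zone: use the Intra-Table
        return node.intra_table[dest_id]["next_hop"]
    if node.node_type == "BORDER":
        if dest_zone in node.border_table:         # destination zone is adjacent
            return node.border_table[dest_zone]
        next_zone = node.inter_table.get(dest_zone)  # else route via the Inter-Table
        if next_zone in node.border_table:
            return node.border_table[next_zone]
    # Normal node (or no usable border record): head for a border node that
    # neighbors the destination zone, else fall back to any border node.
    for entry in node.intra_table.values():
        if entry["node_type"] == "BORDER" and dest_zone in entry["neigh_zones"]:
            return entry["next_hop"]
    for entry in node.intra_table.values():
        if entry["node_type"] == "BORDER":
            return entry["next_hop"]
    return None

# A Normal node in zone 1 that knows border node 2 (a neighbor of zone 3):
n = SimpleNamespace(zone_id=1, node_type="NORMAL",
                    intra_table={2: {"next_hop": 2, "node_type": "BORDER",
                                     "neigh_zones": [3]},
                                 4: {"next_hop": 4, "node_type": "NORMAL",
                                     "neigh_zones": []}},
                    border_table={}, inter_table={})
```

Intra-zone traffic never touches the Inter-Table, which is what keeps per-node state small compared with flat proactive routing.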
PART C :
IF (∃ entryBT in BorderTable | n’.ZoneId = entryBT.neighZoneId) THEN
    Choose randomly node from entryBT.borderNodesIds
    Send P (n.NodeId, node, node, n’.NodeId, n’.ZoneId, DATA)
ELSE
    Find entryZone in InterZoneRoutingTable | entryZone.DestZoneId = n’.ZoneId
    IF (∃ entryBT in BorderTable | entryZone.NextZoneId = entryBT.neighZoneId) THEN
        Choose randomly node from entryBT.borderNodesIds
        Send P (n.NodeId, node, node, n’.NodeId, n’.ZoneId, DATA)
    ELSE
        TempNodes := ∅
        For each entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER DO
            For each zone in entryRT.ZoneIds DO
                IF (zone = entryZone.NextZoneId) THEN
                    Save entryRT.DestNodeId in TempNodes
                ENDIF
            ENDDO
        ENDDO
        Choose randomly destNode from TempNodes
        Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = destNode
        Send a packet P (n.NodeId, destNode, entryRT.NextHopId, n’.NodeId, n’.ZoneId, DATA)
    ENDIF
ENDIF
ENDIF
ENDIF
PART B :
IF (n.ZoneId = P.DestZoneId) THEN
    Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = P.FinalDestId
    Send a packet P’ (n.NodeId, P.FinalDestId, entryRT.NextHopId, P.FinalDestId, P.DestZoneId, P.data)
ELSE
    PART C :
    PART C.1 :
    IF (n.NodeType = NORMAL) THEN
        Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = P.LocalDestId
        Send a packet P’ (n.NodeId, P.LocalDestId, entryRT.NextHopId, P.FinalDestId, P.DestZoneId, P.data)
    ELSE
        PART C.2 :
        IF (∃ entryBT in BorderTable | entryBT.neighZoneId = P.DestZoneId) THEN
            Choose randomly node from entryBT.borderNodesIds
            Send P’(n.NodeId, node, node, P.FinalDestId, P.DestZoneId, P.data)
        ELSE
            PART C.3 :
            TempNodes := ∅
            For each entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER DO
                For each zone in entryRT.ZoneIds DO
                    IF (zone = P.DestZoneId) THEN
                        Save entryRT.DestNodeId in TempNodes
                    ENDIF
                ENDDO
            ENDDO
            IF (TempNodes ≠ ∅) THEN
                Choose randomly destNode from TempNodes
                Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = destNode
                Send P’(n.NodeId, destNode, entryRT.NextHopId, P.FinalDestId, P.DestZoneId, P.data)
            ELSE
                PART C.4 :
                Find entryZone in InterZoneRoutingTable | entryZone.DestZoneId = P.DestZoneId
                IF (∃ entryBT in BorderTable | entryBT.neighZoneId = entryZone.NextZoneId) THEN
                    Choose randomly node from entryBT.borderNodesIds
                    Send P’(n.NodeId, node, node, P.FinalDestId, P.DestZoneId, P.data)
                ELSE
                    PART C.5 :
                    TempNodes := ∅
                    For each entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER DO
                        For each zone in entryRT.ZoneIds DO
                            IF (zone = entryZone.NextZoneId) THEN
                                Save entryRT.DestNodeId in TempNodes
                            ENDIF
                        ENDDO
                    ENDDO
                    Choose randomly destNode from TempNodes
                    Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = destNode
                    Send a packet P’ (n.NodeId, destNode, entryRT.NextHopId, P.FinalDestId, P.DestZoneId, P.data)
                ENDIF
            ENDIF
        ENDIF
    ENDIF
ENDIF
ENDIF
3.2.2 Receiving Data Packet
As described in [28], when a node n receives a packet p, it acts as follows (Figure 9).

Following the same procedure, the packet reaches a Border Node in zone Z9. Finally, this node uses its Intra-Zone Routing Table to forward the packet to Dest.
zones and the zone radius R change. The radius can take the values 5, 15, and 25, while the number of zones varies among 5, 10, 15, 20, and 25. The simulation takes place with 200, 300, and 400 nodes.

4.2.1 ZHRP Zones Construction
Figure 12 and Figure 13 show that an increase in R correlates with an increase in the number of sent and received packets. However, when R takes the value 15 or 25, we get an equal number of sent and received packets. That is because the zones were already neighbors, but the nodes within the radius are fewer than 10. Increasing R lets more nodes join the zone, hence more packet exchanges occur. When the number of zones increases, the number of packets sent/received increases, because increasing the number of zones increases the number of inviting nodes, so more INVITATION messages are transmitted.

4.2.2 ZHRP Intra-Zone Routing Table Construction
As shown in Figure 14 and Figure 15, in Intra-Zone Routing Table construction, when the number of zones increases, the number of sent and received packets decreases. This is because the number of nodes in each zone decreases when the number of zones increases, so the communication between nodes in the same zone decreases. When R has the value of 10 or more, the values become the same for the same number of zones and number of nodes. This is because the zone is fully constructed (has Border nodes) before R reaches 10. The number of sent packets decreases from 200 to 75 when the number of zones increases from 5 to 25, while the number of received packets decreases from 550 to 250.
of the received ones will reach 700. When the number of nodes = 200, the number of sent packets reaches 650 and the number of received packets reaches 450. The peaks in the graphs show the maximum number of packets sent and received that the WSN can reach as the number of zones increases. When the number of zones is less than the value at the peak, if a zone wants to reach another zone, it must pass through many zones to reach the destination zone. But when the number of zones is greater than the value at the peak, a zone will have many neighbors. Hence, in order for a zone to reach another zone, it must pass through fewer zones, because it will have more neighbors, and the probability for the destination zone to be a neighbor will increase. For 400 nodes, the peak is at number of zones = 20. For 300 nodes, the peak is at 15. For 200 nodes, the peak is not reached before the number of zones equals 25.
reduce this complexity by the two-level routing tables that we construct. Next, we are interested in the space complexity, in terms of the number of bytes occupied by the involved data structures. As far as we know, no other routing mechanism proposed in the literature for wireless sensor networks considers this metric. The formula for computing the size (in bytes) of the routing data structures is given in Table I, for N deployed nodes, when nZ zones are constructed, each zone having on average nB border nodes.
4.4 Lower bound for the number of zones
The previous metric does not only estimate the size of the data structures needed in order to assure pro-active routing based on routing tables; it also gives a lower bound on the number of zones. This computation is based on a memory limit imposed on sensors with respect to the total memory capacity of a sensor. Depending on the technology used, this total capacity may vary. Therefore, we make the following assumption: nodes technically dispose of MEM_RAM RAM memory capacity. Obviously, only a fraction of the total available memory can be used for protocol data structures. We model this fraction by the MAX_MEM_PRCTG percentage (%). Considering the theoretical memory capacity needed by the protocol for a normal node, we have
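The lower bound described above can be sketched numerically. The snippet below is only an illustration under our own assumptions, not the paper's formula: ENTRY_SIZE (bytes per intra-zone routing-table entry) and the even-split approximation of N/nZ nodes per zone are hypothetical values chosen for the example.

```python
import math

# Assumed (hypothetical) parameters -- not taken from the paper.
MEM_RAM = 4 * 1024       # total RAM per node, in bytes
MAX_MEM_PRCTG = 0.25     # fraction of RAM usable for protocol data structures
ENTRY_SIZE = 8           # assumed bytes per intra-zone routing-table entry
N = 400                  # number of deployed nodes

budget = MEM_RAM * MAX_MEM_PRCTG   # per-node memory budget, in bytes

# If zones split the N nodes roughly evenly, each node stores about N/nZ
# entries, so (N/nZ) * ENTRY_SIZE <= budget  =>  nZ >= N * ENTRY_SIZE / budget.
min_zones = math.ceil(N * ENTRY_SIZE / budget)
print(min_zones)  # → 4
```

With these example values, fewer than 4 zones would make the intra-zone routing table exceed the memory budget of a normal node.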
5. Battery Consumption

Figure 18: Z10 Sent Packets

Table 10. Characteristics of MICA2 Sensor
CPU Consumption           8 mAh
Receiving Consumption     10 mAh
Transmitting Consumption  27 mAh
Initial Energy            2900 mAh
Voltage                   3 V
Data Transfer Rate        38400 bits/s
Communication Range       500 ft
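As a rough illustration of how the Table 10 figures translate into a per-packet cost, the sketch below estimates the energy drawn to transmit one packet. The 128-byte packet size is our own assumption for the example; only the current, voltage, and data rate come from the table.

```python
# Values from Table 10 (MICA2 sensor)
TX_CURRENT_MA = 27     # transmitting consumption
VOLTAGE_V = 3          # supply voltage
RATE_BPS = 38400       # data transfer rate

PACKET_BYTES = 128     # assumed packet size (not from the table)

# Time on air for one packet, then energy = power * time.
tx_time_s = PACKET_BYTES * 8 / RATE_BPS              # ~0.027 s
energy_j = (TX_CURRENT_MA / 1000) * VOLTAGE_V * tx_time_s
print(round(energy_j * 1000, 3))  # → 2.16 (millijoules per packet)
```

This kind of estimate is what makes reducing the number of transmitted packets (as the following sections measure) directly translate into battery lifetime.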
Figure 23: Zone Construction Received Packets

Figure 26: Inter-Zone Routing Table Sent Packets

5.2 Battery Consumption in Intra-Zone Routing Table Construction

Battery consumption for the Intra-Zone Routing Table Construction stage is shown in Figure 24 for the sent packets, and in Figure 25 for the received packets.
ABSTRACT
A wireless sensor network (WSN) is a network formed by a large number of sensor nodes, where each node is equipped with a sensor to detect physical phenomena such as light, heat, or pressure. WSNs are regarded as a revolutionary information-gathering method for building information and communication systems that will greatly improve the reliability and efficiency of infrastructure systems. Compared with wired solutions, WSNs feature easier deployment and better flexibility of devices. In energy-constrained sensor network environments, it is unsuitable, in view of the limited battery power, processing ability, storage capacity and communication bandwidth, for each node to transmit its data to the sink node. This is because, in sensor networks with high coverage, the information reported by neighboring nodes has some degree of redundancy; transmitting data separately from each node thus consumes the bandwidth and energy of the whole sensor network, which shortens the lifetime of the network. To avoid these problems, data aggregation techniques have been introduced. Data aggregation is the process of integrating multiple copies of information into one copy, performed at intermediate sensor nodes, that is effective and able to meet user needs. In this paper, we propose a data aggregation solution for the routing protocol ZHRP (Zone Hierarchical Routing Protocol). This solution efficiently improves the lifetime of the WSN.
KEYWORDS
Wireless Sensor Networks — Hierarchical Routing — Data Aggregation.
Figure 1: Sensor Node Architecture

1.2 Wireless Sensor Networks

Sensor nodes offer a powerful combination of distributed sensing, computing and communication. The ever-increasing capabilities of these tiny sensor nodes, which include sensing, data processing, and communicating, enable the realization of WSNs based on the collaborative effort of a number of sensor nodes. They enable a wide range of applications and, at the same time, offer numerous challenges due to their peculiarities, primarily the stringent energy constraints to which sensing nodes are typically subjected. A Wireless Sensor Network is a highly distributed and randomly deployed wireless network consisting of a large number of sensor nodes called motes. These nodes work with each other to sense data from the environment and send them to the base station over a large area. A very important factor in the lifetime of the WSN is the energy consumption: since sensor nodes are driven by small batteries, they have a limited energy resource. When sensors sense data, compute or communicate, they consume energy, hence the lifetime of the WSN will decrease. Therefore, the battery consumption should be decreased efficiently to increase the network lifetime. Energy consumption is not the only challenge for WSNs; there are other challenges like limited memory, limited processing power, and limited communication range.
WSN nodes have a limited transmission range, so they cannot communicate with the base station directly and must cooperate with each other to deliver the data packets to the base station. The base station is responsible for collecting data from the WSN. Nodes send data packets to their neighboring nodes, which are in the range of the transmitting nodes, and these then forward the packets to their neighbors until the base station. This consumes power because of packet transmission, thus communication should be reduced to a minimum in order to lower the battery consumption and, as a result, increase the network's lifetime.

1.2 Wireless Sensor Characteristics

Every WSN has some common characteristics, such as:
• Infrastructure-less: A WSN initially has no structure, but it may define a structure after deployment.
• Large Area and Large Number of Nodes: A WSN contains a large number of sensor nodes and can cover a very large area.
• Many Interferences: Nodes in the WSN may receive many packets at the same time. Packets may collide and be lost.
• Security Issues: WSNs are highly exposed to security breaches, and nodes can be hacked easily.
• Limited Transmission Range: Nodes are tiny and have small antennas and small batteries, so their transmission range is limited.
• Limited Memory: Because of the small size of the nodes, the nodes contain small and limited memories.
• Limited Computing Power: The processing unit in the nodes is small and has limited resources due to the size of the nodes.
• Dynamic Topology: Nodes may die, be added, or even move. Therefore, the WSN will dynamically change its structure.

1.3 Wireless Sensor Applications

Wireless Sensor Networks (WSNs) are used in many applications that are divided into three categories:
• Monitoring of areas
o Environment and Habitat: forest fire detection, animal monitoring
o Military: monitoring friendly forces, ammunition
o Agriculture: farming
• Monitoring of objects
o Structures: critical building monitoring, machine status
o Medical Diagnosis: blood pressure monitoring
• Monitoring both areas and objects
o Asset Tracking: vehicle tracking
o HealthCare: monitoring patients
o Disaster Management: volcanic monitoring

2. Related Work

2.1 Routing Protocols in WSN

Energy consumption is one of the main challenges in wireless sensor networks. Energy saving assures a long lifetime for the system. Another main goal is reducing the size of the stored data (e.g. routing tables) in each node of the network. Clustering is an important technique for prolonging the system lifetime and reducing the size of the stored data. In clustering, nodes are gathered in several groups, generally disjoint, which are named clusters. Each cluster has a cluster head (CH). The nodes collect data and send it to the CH, which forwards this data to the final user or Base Station (BS). CHs can communicate with the Base Station directly or via other CHs. There are many existing clustering protocols. LEACH [22] is a distributed clustering-based protocol that uses randomized rotation of the CHs to evenly distribute the energy load among the sensors in the network. LEACH assumes that the fixed sink is located far from the sensors and that all sensors in the network are homogeneous and battery-constrained. Lin's protocol [23] is a distributed clustering technique for large multi-hop mobile wireless networks. The cluster structure is controlled by the hop distance. In each cluster, one of the nodes is designated as cluster head. Other nodes join a cluster if they are within a predetermined maximum number of hops from the cluster head. HEED [24] is a distributed clustering protocol that periodically selects cluster heads according to a hybrid function between their residual energy and a secondary
parameter, such as node proximity to its neighbors or node degree. In the CES distributed protocol [25], each sensor computes its weight based on the k-density, the residual energy and the mobility features. Then it broadcasts the weight to its 2-hop neighborhood. The sensor node having the greatest weight in its 2-hop neighborhood becomes the cluster head, and its neighboring sensors join its cluster. SPAN [26] is a distributed, randomized protocol in which nodes make local decisions on whether to sleep or to join a coordinator that rotates at times. Each node makes its decision depending on the amount of available energy on the node and on its degree (the number of its neighbors when the node is active). SPAN is a protocol that operates under the routing layer and above the MAC and physical layers. The routing layer uses information SPAN provides, and SPAN leverages any power-saving features of the underlying MAC layer [26]. The centralized PEGASIS protocol [27] constructs chains instead of clusters. Each node delivers the sensed data to the nearest neighbor node. One sensor node on the chain is assigned as the cluster head node that delivers the sensed data to the base station. The head node is selected by turns; this technique allows even energy consumption in wireless sensor networks. However, the PEGASIS protocol causes redundant data transmissions, since one of the nodes on the chain is selected as the head node regardless of the base station's location. In [28], the authors propose the enhanced PEGASIS protocol based on the "concentric clustering" scheme to solve this problem. It means that clusters have the shape of concentric circles. Similar to PEGASIS, the SHORT protocol [29] adopts centralized approaches and requires a powerful BS to take the responsibility of managing the network topology and to calculate the routing path and time schedule for data collection.
Most topologies based on clusters assume that cluster heads are high-energy nodes whose transmission power can be adapted in order to reach the base station at far distances and to communicate directly with other cluster heads. Another assumption is that nodes within a cluster can directly communicate with the cluster head. In SHORT, HEED, CES, PEGASIS and Enhanced PEGASIS, all nodes are supposed to have the ability to modify the transmission power in order to control topology. The LEACH radio model [22] is used for these protocols. The requirement of adaptive and dynamic transmission power modification can be prohibitive, especially for sensors not equipped with a transmission amplifier. SPAN uses the radio model of the Cabletron RoamAbout 802.11 card, which has a fixed transmission range and does not support power control. Lin's protocol does not mention the radio model used for simulations. The transmission range defines the set of neighbors for a sensor node, those able to receive the transmitted signals. Because variation of the transmission range consumes more resources, virtual topologies should be proposed for sensor networks that are made of sensors with fixed transmission power. The challenge addressed in this paper presents an approach of virtual structuring of networks without using a topology control technique. Our contribution to topology construction addresses two main issues in WSNs: distributed approaches and energy efficiency. Moreover, our approach is independent of the embedded sensor technology (being able to vary the transmission power or not); the only parameter considered is the current node's transmission range. The algorithm is executed simultaneously with the neighborhood discovery protocol for random sensor node deployments. In the next section, we will detail our work for structuring wireless sensor networks into zones, which is not a real clustering algorithm like the cited related work. Therefore, no cluster heads exist in our topology; no other information on the network (e.g. geographic position) is required.

2.2 Data Aggregation in WSN

Sensor networks are distributed event-based systems that differ from traditional communication networks in several ways: sensor networks have severe energy constraints, redundant low-rate data, and many-to-one flows. Data-centric mechanisms that perform in-network aggregation of data are needed in this setting for energy-efficient information flow. Because of the requirement of unattended operation in remote or even potentially hostile locations, sensor networks are extremely energy-limited. However, since various sensor nodes often detect common phenomena, there is likely to be some redundancy in the data the various sources communicate to a particular sink. In-network filtering and processing techniques can help to conserve the scarce energy resources. Data aggregation has been put forward as an essential paradigm for wireless routing in sensor networks [3, 6]. The idea is to combine the data coming from different sources: eliminating redundancy, minimizing the number of transmissions and thus saving energy. This paradigm shifts the focus from the traditional address-centric approaches for networking (finding short routes between pairs of addressable end-nodes) to a more data-centric approach (finding routes from multiple sources to a single destination that allow in-network consolidation of redundant data).
Data aggregation is the process of collecting and summarizing data from the sensor nodes in a way that reduces the communications between nodes, so the energy consumption of the nodes is decreased, hence increasing the network lifetime.

Figure 2: Data Aggregation

Figure 2 shows how sensor nodes S1, S2 ... Sn send their packets S'1, S'2 … S'n to a data collector node called the aggregator node A, which collects the data and eliminates the redundant data. The aggregator uses some method (f in Figure 2) to remove redundant data and produce the aggregated, filtered data y'. These methods could be statistical methods as in [13], probabilistic methods as in [14], or artificial intelligence as in [15]. The filtered data y' is then sent to the base station R.
The main goal of data aggregation protocols is to gather and aggregate data in an energy-efficient manner. These protocols can be classified as structure-based and structure-free data aggregation protocols. In structure-based protocols, data are transmitted to the base station by creating a chain [6], a tree (EIPDAP [2]), a cluster [16], a tree-cluster [17], or hierarchical clustering [18].
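The aggregator function f can be sketched concretely. The snippet below is a generic illustration under our own assumptions, not the paper's method: readings are plain numbers, and f is a stand-in combining duplicate elimination (within a hypothetical tolerance) with averaging of the survivors.

```python
def aggregate(readings, tolerance=0.5):
    """Hypothetical aggregation function f: drop near-duplicate
    readings, then summarize the survivors as one value (their mean)."""
    kept = []
    for value in readings:
        # A reading within `tolerance` of one already kept is redundant.
        if all(abs(value - k) > tolerance for k in kept):
            kept.append(value)
    return sum(kept) / len(kept)

# Readings from nodes S1..Sn reporting the same phenomenon; 35.0 is a
# genuinely distinct observation that must survive the filtering.
y_prime = aggregate([21.0, 21.1, 20.9, 35.0])
print(y_prime)  # → 28.0: one packet (y') instead of four
```

The aggregator A forwards only y' to the base station R, which is where the energy saving comes from.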
3. EIPDAP

The Efficient Integrity-Preserving Data Aggregation Protocol (EIPDAP) [2] is an aggregation protocol that can verify the integrity of the aggregation result immediately after receiving the aggregation result and the corresponding authentication information. The integrity verification is not done through another query-and-forward phase; for this reason, energy consumption and communication delay will be reduced significantly.
EIPDAP needs some network assumptions to work. The first assumption is that the base station needs to be powerful, with a transmission range large enough to cover the wireless sensor network in order to broadcast messages to all nodes directly, because the base station needs to broadcast an authenticated query before the aggregation phase. The second assumption is that the wireless sensor network should form a tree topology with the base station as the root.
EIPDAP is based on the elliptic curve discrete logarithm with a hierarchical aggregator topology. EIPDAP's goal is to prevent stealthy attacks where the attacker tries to send wrong data to the base station and make it accept them. Each node in the wireless sensor network should have a unique identifier s and private keys r and l ∈ Zp, and shares a private key sk and a private point Ө ∈ the cyclic elliptic group E(Zp) with the base station. Also, the generator point G ∈ E(Zp) is preloaded to all the nodes. In addition, two parameters and such that = and = are preloaded to all nodes.
EIPDAP is accomplished in three main phases: query dissemination, aggregation-commit, and result checking.

3.1 Query Dissemination

In the dissemination phase, aggregation tree information is collected; if the aggregation tree is not constructed, then it is constructed during this phase. Then the base station calculates path-keys and an edge key for each node, encrypts them with the secret key shared between the base station and the node, and sends them to the corresponding node.

3.2 Aggregation-Commit

3.3 Result Checking

In the result-checking phase, the base station verifies the integrity of the aggregated values with the two tags. As a result, the base station can preserve the integrity of the aggregated data immediately after receiving the aggregated data and their corresponding authentication information, so it will reduce the energy consumption, unlike other protocols, which make another query phase to check the integrity of the aggregated data. EIPDAP is energy efficient because the result-checking phase is done in the base station, hence there is no congestion in the aggregation tree during the result-checking phase.

4. Integration of EIPDAP in ZHRP

As we have described previously, the ZHRP protocol splits the WSN into disjoint zones. Each zone has one Inviting node, Normal nodes and Border Nodes. Each node has an Intra-Zone Routing Table that contains the path cost to reach a destination node in the same zone. This path is the minimal one among the possible paths. The Border nodes have a Border Table to know their neighbor nodes from the neighbor zones. In addition, the Border nodes have the Inter-Zone Routing Table that contains the cost of passing through a zone when sending a packet. In ZHRP, the packet passes through the shortest route. Therefore, the packet will pass through fewer nodes until it reaches its destination. However, if there are x nodes that want to send data packets to their destinations, each packet will have a route to pass through; hence there will be x routes, and therefore many nodes will have to send and receive packets. To decrease the number of routes, packets must be combined into one packet as much as possible, to reduce the number of packets and the number of routes.
Data aggregation is the process of collecting and summarizing data. This process will reduce the amount of data to be sent from one node to another, which will reduce communications and decrease the energy consumption, increasing the WSN lifetime.
In ZHRP, if an event occurs, it may happen that many nodes in the same zone send the same sensed data (event) to the same destination, so there will be x packets, each with its own path, and the packets may be redundant. This scenario wastes energy. Therefore, to solve this problem, we build aggregation trees in all the zones. These aggregation trees will collect data, summarize them, and send them as one packet to the destination node. In this way, we highly reduce the number of packets to be sent from one zone to another. The x packets may become one packet. This will decrease the number of send and receive actions at the nodes, so energy consumption will be efficient.

4.1 Aggregation Tree Construction Algorithm

Table 1: Tree Construction Packet Fields
have treeId equal to the root nodeId. During the tree construction, a packet with the fields illustrated in Table 1 is used.
When a node receives a tree construction packet, if the node joins the tree, it must send a child packet to the sender of the construction packet to tell it that it is its child. Table 2 shows the fields of the child packet.

Table 2: Child Packet Fields

Figure 4 shows the trees in a zone with two border nodes:

receive more packets from other children. Then it will add its data and apply an aggregation algorithm on the collected data, such as filtering, addition, or subtraction. Then the node that received the data packets will send the resulting data as one packet to its parent, and the parent node will do the same task until the data packet reaches the root node, which is a Border Node.
When a Border node receives data from its children, it will send the data using ZHRP routing to the destination node.

5. ZHRP vs ZHRP with Aggregation: Routing Scenario

Figure 7 describes what happens when an event occurs at zone Z1 and how the nodes that detect that same event send their packets to the destination base station B using ZHRP routing (Table 4).
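The per-node behaviour described above — collect the children's packets, aggregate, forward one packet to the parent — can be sketched as follows. This is a simplified model under our own assumptions (numeric payloads, summation as the example aggregation algorithm, a dict-based tree, and an illustrative topology); the packet fields of Tables 1 and 2 are omitted.

```python
def send_up(node, children, data):
    """Aggregate the subtree rooted at `node` into a single value.

    children: dict mapping a node to the list of its child nodes
    data:     dict mapping each node to its own sensed reading
    Each node waits for its children's packets, adds its own reading,
    and forwards one combined packet to its parent (here: returns it).
    """
    child_packets = [send_up(c, children, data) for c in children.get(node, [])]
    return data[node] + sum(child_packets)  # aggregation = addition (example)

# A small zone whose tree is rooted at border node b2 (topology is illustrative):
children = {"b2": ["n2", "n5"], "n2": ["n1"], "n5": ["n4", "n7"]}
data = {"b2": 1, "n2": 1, "n1": 1, "n5": 1, "n4": 1, "n7": 1}

packet = send_up("b2", children, data)
print(packet)  # → 6: six readings reach the border node as one packet
```

The border node then forwards this single packet toward the destination using ZHRP routing, instead of relaying six separate packets.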
assumed: no interference, no interruptions, and no packet loss, as we are simulating at the network layer. The simulation takes place in a field of size 200m x 200m. The sensor transmission range is 15m.

6.1 Sent and received packets – Aggregation Tree Construction

During Aggregation Tree Construction, as shown in Figure 9 and Figure 10, the number of Sent and Received packets at each node is very small. This number does not increase when R and the number of zones change, because nodes communicate with their direct neighbors only.

Figure 9: Aggregation Tree Construction - Sent Packets

Table 5: ZHRP routing path with aggregation for the same event
Node  Path         Sent Packets
n1    n1,n2,b2     2
n3    n3,n4,n5,b2  3
n6    n6,n7,n5,b2  3
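The "Sent Packets" column of Table 5 is consistent with one transmission per hop along each path, as this quick check illustrates:

```python
# Paths from Table 5 (nodes reporting the event toward border node b2)
paths = {
    "n1": ["n1", "n2", "b2"],
    "n3": ["n3", "n4", "n5", "b2"],
    "n6": ["n6", "n7", "n5", "b2"],
}

# One transmission per hop: a path of k nodes costs k - 1 sent packets.
sent = {src: len(path) - 1 for src, path in paths.items()}
print(sent)  # → {'n1': 2, 'n3': 3, 'n6': 3}, matching Table 5
```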
Figure 12: Number of zones Z = 20

Figure 15: Number of zones Z = 20

Figure 11, Figure 12, and Figure 13 show that when events happen, the number of packets Sent in ZHRP is much higher than that of ZHRP with tree aggregation. The results show that the number of Sent packets decreased from 15994 to 3480, i.e. the number of sent packets is decreased by about 80%. Therefore, the integration is very efficient: it will decrease the energy consumption, and the WSN lifetime will increase.
For the routing scenario, the battery energy consumption percentage decreased (in the worst case) from 40% to 10% after adding the aggregation to the ZHRP.
8. References
[1] K. Beydoun and V. Felea, “Energy-efficient WSN
infrastructure,” in Collaborative Technologies and Systems,
2008. CTS 2008. International Symposium on, 2008, pp. 58–
65.
[2] L. Zhu, Z. Yang, M. Li, and D. Liu, “An Efficient Data
Figure 18: Aggregation Tree Construction - Received Packets Aggregation Protocol Concentrated on Data Integrity in
Wireless Sensor Networks,” Int. J. Distrib. Sens. Netw., vol.
Battery consumption in ZHRP with Aggregation Tree routing 2013, pp. 1–9, 2013.
are shown in Figure 19for the sent packets, and in Figure 20for [3] Alberto Camilli, Carlos E. Cugnasca, Antonio M. Saraiva,
the received packets. André R. Hirakawa, Pedro L.P. Corrêa, “From wireless
sensors to field mapping: Anatomy of an application for
precision agriculture”, in: Computers and Electronics in
Agriculture archive. Volume 58 Issue 1, August 2007 Pages
25-36.
[4] L. Subramanian and R. Katz, “An architecture for building
self-configurable systems,” in Proceedings of IEEE/ACM
Workshop on Mobile Ad, Hoc Networking and Computing,
Boston, MA, 2000.
[5] Heinzelman W, Chandrakasan A, Balakrishnan H.
“Energy-efficient communication protocol for wireless
microsensor networks”. In: Proceedings of the 33rd Hawaii
international conference on system sciences (HICSS ’00);
Figure 19: ZHRP with Aggregation Tree Sent Packets 2000. p. 3005–14.
[6] Lindsey S, Raghavendra C. PEGASIS: “power-efficient
gathering in sensor information systems”. In: IEEE aerospace
conference proceedings, vol. 3; 2002. p. 1125–30.
[7] Hamma, T. Katoh, T. Bista, B.B. Takata, T. “An Efficient
ZHLS Routing Protocol for Mobile Ad Hoc Networks”. 17th
International Conference on Database and Expert Systems
Applications. 2006, pp. 66-70.
[8] K. Beydoun and V. Felea, “WSN hierarchical routing
protocol taxonomy,” in Telecommunications (ICT), 2012 19th
International Conference on, 2012, pp. 1–6.
[9] K. Beydoun, V. Felea, and H. Guyennet, “Wireless sensor
network infrastructure: construction and evaluation,” in
Figure 20: ZHRP with Aggregation Tree Received Packets
Wireless and Mobile Communications, 2009. ICWMC’09.
Fifth International Conference on, 2009, pp. 279–284.
[10] K. Beydoun and V. Felea, “Wireless sensor networks
7. Conclusion routing over zones,” in Software, Telecommunications and
The introduction of data aggregation brings benefits in both saving energy and obtaining accurate information. In sensor networks, the energy consumed in transmitting data is much greater than that consumed in processing it. Therefore, using the nodes' local computing and storage capacity, data aggregation operations remove large quantities of redundant information, minimizing the amount of transmission and saving energy. As previously shown, the addition of aggregation to ZHRP has reduced the number of sent and received packets on a route from source to destination: from 41 packets down to 20. These results demonstrate the energy efficiency of the approach. The simulation also shows that the aggregation trees are constructed with a small number of packets, so building the aggregation tree does not itself consume much energy. In the routing-with-aggregation scenario, the numbers of sent and received packets decrease by 80% and 75% respectively; hence, aggregation trees increase the WSN lifetime by a noticeable amount. The results clearly show that battery consumption during ZHRP with aggregation-tree construction does not exceed 1% of the battery energy.
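The packet-count comparison above can be reproduced with a toy model (an illustration, not the paper's simulator): without aggregation every reading is relayed hop by hop to the sink, so the total transmission count is the sum of node depths, while with in-network aggregation each node forwards a single fused packet per round. A minimal sketch, assuming the tree is given as a child-list dictionary:

```python
def packets_without_aggregation(tree, root):
    """Each node's reading is relayed hop by hop to the sink:
    total transmissions = sum over all nodes of their depth."""
    def count(node, depth):
        return depth + sum(count(c, depth + 1) for c in tree.get(node, []))
    return count(root, 0)

def packets_with_aggregation(tree, root):
    """Each non-root node sends exactly one fused packet upstream,
    regardless of how many descendants it has."""
    def nodes(node):
        return 1 + sum(nodes(c) for c in tree.get(node, []))
    return nodes(root) - 1
```

The exact 41-to-20 reduction reported above depends on the simulated topology, but any non-trivial tree shows the same effect.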
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
Abstract— Network forensics is complicated and worth studying. One of the interesting parts of a network is the router, which manages the connections for all logical network activity. In network forensics we need the traffic log to analyze the activity of any computer connected to the network, in order to know what attackers did. On the other hand, not all information can be obtained from the traffic log if the network did not record sniffed traffic; in that case other resources, such as the information held on the router, must be found. To access information on a router such as RouterOS on Mikrotik devices, some data can be obtained by using the API to access the router remotely. The purpose of this paper is to explore how to perform forensics on RouterOS-based Mikrotik devices and to develop a remote application that extracts router data using the API service. As a result, the acquisition process could obtain valuable data from the router as digital evidence for exploring information about network attack activity.

Keywords- Network; Router; Live Forensics; API; Logs

I. INTRODUCTION

Network forensics is the part of digital forensics that focuses on monitoring and analyzing data traffic on the network. The type of data handled is dynamic, which differs from the static data of traditional digital forensics [1][2]. With the increasing presence of digital devices, information storage, and network traffic, cyber forensics faces a growing number of cases of increasing complexity.

Digital evidence is usually taken from network traffic logs derived from sniffing or monitoring activity and then analyzed [2][3]. In addition to the actual network traffic logs, the router device can also yield valuable information about a network. Information that may be found on the router includes admin login activity, the list of client IP addresses, MAC addresses, the network configuration, the firewall configuration, etc.

An API (Application Programming Interface) is a set of routines, functions, and protocols that programmers can use when building software for a particular operating system. An API allows programmers to use standard functions to interact with the operating system, and is one method of abstraction, usually (but not always) between low-level and high-level software. The RouterOS Mikrotik API was introduced and has been in use since version 3.0 [4][5].

Against this background, this study gathers and analyzes the digital evidence contained in RouterOS by using the API as a tool to obtain information on network activity for forensics.

II. RELATED WORKS

Several previous studies have been done on digital forensics. Log management systems have been developed for several years, such as Kiwi Syslog, Snare BackLog, SpectorSoft Server Manager, ManageEngine, and Splunk log management [6]. These log management systems help forensic investigators analyze and determine an approach to detecting network attacks [7]. The most notable research on network forensics is the development of a method ontology for intelligent network forensics analysis [8].

Some research on router forensics has been done on several router devices such as Cisco, TP-Link, Ubiquiti, etc. Most studies of router forensics include handling DHCP to determine the IP address of a client computer, extracted from device memory [9]. Not only physical devices but also virtual models of routers have yielded information for digital forensics [10].

The acquisition of data from household and small-business wireless routers also provides an overview of how data is retrieved from a router. In addition, that work mapped the correlation of NAT TCP flows on private wireless networks to TCP flows to the internet, as well as the mechanism relating logged IPs and TCP ports [11].

III. BASIC THEORY

A. Network Forensics

Network forensics is the forensic field that focuses on the network and its associated devices. It is an attempt to find information about the attacker and to look for potential evidence after an attack or incident. There is a variety of attacks, including probing, DoS, user to root (U2R), and remote to local (R2L).
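The RouterOS API described in the introduction is a simple TCP protocol (port 8728) of length-prefixed "words" grouped into sentences. The length-prefix scheme below follows the published MikroTik API specification; the sentence helper and the word value are only illustrative (a real acquisition tool would also perform the /login exchange over a socket):

```python
def encode_length(n):
    # Word-length prefix per the published RouterOS API protocol:
    # short lengths take one byte; longer lengths set high-bit flags.
    if n < 0x80:
        return bytes([n])
    if n < 0x4000:
        return (n | 0x8000).to_bytes(2, "big")
    if n < 0x200000:
        return (n | 0xC00000).to_bytes(3, "big")
    if n < 0x10000000:
        return (n | 0xE0000000).to_bytes(4, "big")
    return b"\xF0" + n.to_bytes(4, "big")

def encode_sentence(*words):
    # A sentence is a series of length-prefixed words ended by a zero byte.
    out = b""
    for w in words:
        data = w.encode()
        out += encode_length(len(data)) + data
    return out + b"\x00"
```

Sending, for example, encode_sentence("/log/print") after login asks the router to return its activity log, which is the first data set examined below.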
TABLE 1. SAMPLE DATA OF LOG ACTIVITY
Time      Topic                   Message
16:07:31  system,error,critical   login failure for user pengelola from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user pengelola from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user pengelola from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user sa from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user sa from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user root from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user sa from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user root from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user administrator from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user administrator from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user root from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user administrator from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user administrator from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user pengelola from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user sa from 192.168.1.246 via ftp
16:07:31  system,error,critical   login failure for user root from 192.168.1.246 via ftp
16:07:32  system,error,critical   login failure for user power from 192.168.1.246 via ftp
16:07:32  system,error,critical   login failure for user power from 192.168.1.246 via ftp
16:07:32  system,error,critical   login failure for user power from 192.168.1.246 via ftp
16:07:32  system,error,critical   login failure for user kasir from 192.168.1.246 via ftp
16:07:32  system,error,critical   login failure for user power from 192.168.1.246 via ftp
16:07:33  system,error,critical   login failure for user admin from 192.168.1.246 via ftp
16:07:34  system,info             account user admin logged out from 192.168.1.246 via ftp
16:08:46  system,info             account user admin logged in from 192.168.1.246 via telnet
16:11:26  system,info             user jebol added by admin
16:12:51  system,info             simple queue changed by admin
16:13:29  system,info             account user investigator logged in from 192.168.1.243 via api
Note: the highlighted data is colored red.

The analysis process should be able to link information from different variables, combining each piece of information with the others to explain an event or attack activity. The stages of the data analysis are shown in Figure 7:

Figure 7. Data Analysis Stages: Activity Log (find suspected activity) -> ARP List (IP address, MAC address) -> DHCP Server Leases (IP address, MAC address, hostname)

Observation of the data starts with the log activity in Table 1, which shows that IP address 192.168.1.246 made 38 failed login requests between 16:07:30 and 16:07:32. Such behavior is impossible for a human, who cannot make 38 requests in 2 seconds; this makes 192.168.1.246 a suspected address. At 16:08:46, 192.168.1.246 successfully logged in via telnet, which means it gained full access to the router. This action was followed by the addition of a new user named "jebol" to the router at 16:11:26.

To validate the attack activity, we collected the network log from network sniffing, which records the traffic activity on the network. Using the tool Wireshark, we observed a .pcap file to explore the FTP service communication activity, as shown in Figure 8. As expected, the same FTP activity from address 192.168.1.246 was found.

Figure 8. Observation of the Network Traffic Log with Wireshark

The search continued in the subsequent data by examining the ARP list, which shows which IP address is owned by which MAC address on the network. Observation of Table 2 shows that the MAC address of 192.168.1.246 is 00:0C:29:48:0B:0A.

TABLE 2. ARP LIST
IP Address      Mac Address        Interface
192.168.1.243   00:26:6C:98:CE:C3  ether2
172.16.150.1    00:1E:67:CF:1A:B1  ether1
192.168.1.242   CC:07:AB:8F:06:9D  ether2
192.168.1.254   14:F6:5A:67:CF:59  ether2
192.168.1.246   00:0C:29:48:0B:0A  ether2
192.168.1.251   74:2F:68:9D:26:35  ether2
192.168.1.249   00:21:5D:4C:D7:D0  ether2
192.168.1.250   60:D9:A0:64:36:2C  ether2
Note: the highlighted data is colored red.

In addition, to learn the hostname of the attacker's computer and to validate the address, the DHCP Leases field needs to be observed. The DHCP Leases data is shown in Table 3:

TABLE 3. DHCP LEASES
IP Address      Mac Address        Host Name
192.168.1.245   58:A2:B5:82:5D:08  android-d803df206d5dfd68
192.168.1.244   54:27:1E:B8:98:EF  Falcon-00
192.168.1.241   74:29:AF:EB:17:CF  POLICE
192.168.1.242   CC:07:AB:8F:06:9D  android-ab0a5c691d5a4e06
192.168.1.246   00:0C:29:48:0B:0A  HACKER
192.168.1.248   74:E5:43:6E:4B:6D  Billy-PC
192.168.1.249   00:21:5D:4C:D7:D0  Puniyas
192.168.1.243   00:26:6C:98:CE:C3  mazda
Note: the highlighted data is colored red.
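The three stages of Figure 7 amount to filtering the activity log for implausibly fast bursts of login failures and then joining the suspect IP against the ARP list and DHCP leases. A sketch of that pipeline (the 10-failures-in-2-seconds threshold is an illustrative assumption, not a value from the paper; the sample values come from Tables 1-3):

```python
from collections import defaultdict

def find_suspects(entries, min_failures=10, window=2):
    # entries: (epoch_seconds, source_ip, message) tuples from the activity log.
    # Flag addresses whose failed logins cluster faster than a human could type.
    failures = defaultdict(list)
    for t, ip, msg in entries:
        if "login failure" in msg:
            failures[ip].append(t)
    return sorted(ip for ip, ts in failures.items()
                  if len(ts) >= min_failures and max(ts) - min(ts) <= window)

def profile(ip, arp_list, dhcp_leases):
    # Figure 7 stages: suspect IP -> MAC (ARP list) -> hostname (DHCP leases).
    return {"ip": ip,
            "mac": arp_list.get(ip),
            "hostname": dhcp_leases.get(ip)}
```

Running profile("192.168.1.246", ...) over the acquired tables resolves the suspect to MAC 00:0C:29:48:0B:0A and hostname HACKER, matching the manual analysis above.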
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
Abstract

Maintainability is one of the most important quality attributes affecting the quality of software. Four factors affect the maintainability of software: analyzability, changeability, stability, and testability. Open source software (OSS) is developed through collaborative work by volunteers around the world, under different management styles, and open source code is updated and modified continually from its first release. Therefore, there is a need to measure the quality, and specifically the maintainability, of such code. This paper discusses maintainability for open source software in three domains: education, business, and games. It also identifies the metrics that most directly affect the maintainability of software. Analysis of the results demonstrates that the OSS in the education domain is the most maintainable code, and that the cl_stat metric (number of executable statements) has the highest degree of influence on the calculation of maintainability in all three domains.
1. Introduction
Software maintenance is a primary phase of the software development life cycle. Several studies report that this phase is the most effort- and time-consuming: for example, the authors in [1] report that the software maintenance phase takes around seventy percent of total resources and 40%-60% of total software lifecycle effort. Maintainable software decreases maintenance cost and effort. Software maintainability can be defined as "the degree to which an application is understood, repaired or enhanced" [2].

Four factors affect the maintainability of software: (1) analyzability, which measures the ability to identify a fault or failure within the software; (2) changeability, the capability to modify software products; (3) stability, the capability to avoid unexpected effects from changing the software product; and (4) testability, the capability to test and validate the modified software product [3].

There are software metrics that can be used to measure maintainability; software metrics serve as predictors to assess and predict software quality.

Although maintenance is a critical task, it is often poorly managed due to inadequate measurement, so precise criteria for measuring software maintenance are needed [5]. Finding a tool that provides accurate and relevant results is not easy and remains a significant issue. Moreover, some tools are dedicated to particular development languages and other tools have an interface that requires an
extensive training [6]; indeed, some tools can measure only specific metrics. Moreover, tools vary in how well they evaluate systems of different sizes [7].

Many researchers depend on open source software because it is free in terms of licensing, which also makes it attractive to industrial organizations. Moreover, open source software is kept up to date over time by different developers. Therefore, the quality of such code needs to be measured, and in particular the maintainability of open source software [4].
Open source software (OSS) is developed through collaborative work by volunteers around the world, under different management styles. There are issues with OSS, including a lack of attention to user-interface design, which reduces its use, and a lack of documentation, which is a serious problem: many OSS projects are poorly documented since their developers have no contractual responsibility [8]. OSS is sometimes more secure than proprietary software, but there are drawbacks in OSS security: many-eyes review helps find vulnerabilities, yet attackers can also find those vulnerabilities more easily [9]. There are also some usability problems in OSS, though they are not significant [10]. The quality of open source software therefore needs to be studied deeply, and maintainability is one of the major factors used to assess it.

Empirical studies suggest that OSS is more maintainable than proprietary software. The design-metrics literature indicates, however, that after a long period of maintenance OSS becomes harder to maintain, with deterioration in maintainability expected over time [11].

Many studies have been conducted on evaluating the maintainability of OSS. Some research evaluates whether the maintainability of OSS differs from that of closed software; other research shows the variance of maintainability measurements across different versions of the same OSS; and some research examines the impact of design metrics on maintainability.

In this paper, we answer the following questions: Do the domains of OSS have a direct effect on maintainability? Which available metrics most directly affect the maintainability of software? Which tool can we trust to provide an accurate measurement of maintainability?
2. Background
2.1 Software Metrics and Tools
In this paper the Chidamber and Kemerer (C&K) object-oriented metrics have been used. The metrics are as described in [12]:

• WMC (Weighted Methods per Class): "total complexity of all class methods".
• DIT (Depth of Inheritance Tree): "the maximum length from a node to the root of the tree".
• NOC (Number Of Children): "the number of immediate subclasses".
• CBO (Coupling Between Objects): "two classes are coupled when a method declared in one class uses a method or instance variable of the other class".
• RFC (Response For a Class): "the number of methods that can be invoked in response to a message sent to an object of the class".
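For illustration (this is not code from [12]), the inheritance-based metrics DIT and NOC can be computed from a simple child-to-parent map of the class hierarchy:

```python
def dit(cls, parent):
    # Depth of Inheritance Tree: number of edges from the class up to its root.
    depth = 0
    while cls in parent:
        cls = parent[cls]
        depth += 1
    return depth

def noc(cls, parent):
    # Number Of Children: immediate subclasses only, not all descendants.
    return sum(1 for c, p in parent.items() if p == cls)
```

For the hierarchy A <- B <- D with C also under A, class D has DIT 2 and class A has NOC 2.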
3. Literature Review
Some works compare maintainability across different versions of software. Mukti Chauhan and Monika Sharma [13] calculate maintainability through the MI equation for two OSS projects, with several versions of each, using McCabe cyclomatic complexity, Halstead volume (Hal.Vol) and the LOC metric. For JasperReports, starting from version 1.0.0, the MI reached its highest value in version 2.0.0 and then decreased, with the minimum MI in versions 4.0.0 and 5.0.0. For Apache, they noticed that MI increased from version 1.5.3 to 1.6.4 and rose further to its highest value in version 1.8.0. The authors observed that MI increased in both projects as average cyclomatic complexity decreased, while the Halstead-volume and LOC results alternated between increases and decreases, so they concluded that the three metrics have a compound impact on the maintainability index.
Ruchika and Anuradha [14] measured maintainability for two versions of each of five database-intensive software systems. They proposed two metrics useful for predicting maintainability in database-intensive applications: NODBC (number of database connections) and CCR (code-to-comment ratio). The authors calculated the values of MI, CC, DIT, CBO, and LOC on two versions of each application (original and modified), defined change as the number of lines added, deleted, or modified in the source code, and calculated the change between the two versions: the actual value of change (AV) and the predicted value (PV), obtained using FFNN modeling. They then analyzed the correlation of NODBC and CCR with the value of change to validate their proposed metrics.
Much of the literature computes maintainability based on the MI index. Similar to [15], we examine three open source software systems. Anita Ganpati [15] examined the maintainability of four OSS projects through a formula based on average Halstead volume, average cyclomatic complexity per module, and average lines of code per module, showing that the maintainability value differs from one application to another: Mozilla Firefox obtained the highest maintainability value whereas Apache obtained the lowest, so the author concluded that Mozilla is more maintainable than Apache.
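The MI equation these studies rely on is commonly written in its classic three-metric form (due to Oman and Hagemeister) as MI = 171 - 5.2*ln(avg Halstead volume) - 0.23*(avg cyclomatic complexity) - 16.2*ln(avg LOC). A direct sketch:

```python
import math

def maintainability_index(avg_halstead_volume, avg_cyclomatic, avg_loc):
    """Classic three-metric maintainability index: higher is more
    maintainable, with 85 and above commonly read as highly maintainable."""
    return (171
            - 5.2 * math.log(avg_halstead_volume)
            - 0.23 * avg_cyclomatic
            - 16.2 * math.log(avg_loc))
```

This makes the compound behavior reported in [13] easy to see: growth in any of the three averages pulls MI down, so the index can rise or fall depending on which metric dominates between versions.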
Nahlah Najm [16] suggested a new equation for calculating the maintainability index based on the factored MI formula, which consists of LOC, cyclomatic complexity, and Halstead volume. The suggested formula, derived from calculations on the old formula, depends on LOC only and produces results close to the old MI formula with less effort and calculation time. Unlike [15, 16], we examine maintainability based on the formulas for its four sub-characteristics.
Much research assesses maintainability using various models and techniques. Similar to [17], we rely on the C&K metrics to assess maintainability. Those researchers reviewed OO metrics to construct a model that predicts OSS maintainability. They compared metrics applicable to OSS with those applicable to OO software and found that many OO metrics apply to OSS, but a few (such as LOD and LCN) do not. They tried to find a relationship between maintainability and coupling, cohesion, and complexity, and found that maintainability has an inverse relationship with complexity, which increases with LOC, WMC, NAM, DIT, CBO, REF, LCOM, CC, EHF, and NCB. Unlike [18], we focus on the main sub-characteristics of maintainability: analyzability, changeability, stability, and testability. RimiSuini [18] compared the main characteristics of software quality, such as portability, usability, modifiability and maintainability, using various quality models such as McCall, Boehm, FURPS, Gilb, etc., and then compared the sub-characteristics of maintainability, such as changeability, readability, stability, simplicity, modifiability, localizability and compatibility, using the same models. He evaluated the maintainability of software products in order to reduce maintenance cost and effort, and found that inappropriate handling of maintenance stems from a shortage of reliable maintainability measurement.
S. W. A. Rizvi and R. A. Khan [19] predicted maintainability based on understandability and modifiability. They developed three models: an understandability model and a modifiability model, each quantified in terms of size and structural complexity, used to establish a maintainability model through multiple linear regression. They concluded that the maintainability model shows both understandability and modifiability to be strongly related to maintainability. While we depend on the Java language in our experimental study, [20] assessed maintainability using projects from different programming languages, focusing on the relationship between internal quality attributes, such as size, inheritance, coupling and complexity, and external quality attributes such as maintainability.

Amjed Tahir and Rodina Ahmad [1] developed an AOP technique to capture important aspects of system behavior at run time using dynamic metrics collected by adding extra code to the source, and proposed maintainability as a dynamic metric. To implement their framework, named maintainability dynamic metrics-AOP, they selected the DCBO metric to measure dynamic coupling, collected at run time by injecting a piece of code into the source. They tested their framework using two simple applications, Address Book and Paint, and found that AOP was effective in capturing the maintainability dynamic metric and can derive the metric from source code.
A wide range of metrics is available for predicting maintainability, such as MOOD, QMOOD and C&K. The authors of [21] aimed to find a relationship between a number of metrics and maintainability using a large set of Java open source projects: they collected 15 design metrics from 148 Java open source projects, mainly from five domains (software development tools, communication, system, internet, and DB) selected by download count, using the MI equation and linear regression. They concluded that increases in the number of method parameters, the control-flow complexity of methods, the number of attributes, and the amount of polymorphism decrease maintainability, whereas increases in the number of methods, the number of child classes, and the depth of the inheritance tree increase maintainability; other factors, such as the amount of methods and the number of classes, showed no significant effect. Finally, they found that the average control-flow complexity per method (OSAVG) is the most important maintainability factor.
Bahavna Katoch and Loverpreet Kaour Shah [22] present a comparison between the MOOD and QMOOD metrics for measuring the maintainability of object-oriented designs. They found that MOOD
metrics are useful for object-oriented programs and give replicable measurements, whereas QMOOD shows the relation between design properties and quality attributes through an equation for measuring the maintainability quality factor. Furthermore, Jubair [23] developed a system based on the MOOD metrics to assess large Java programs. He identified each MOOD metric and the formula used to calculate it, and presented the correlation between the MOOD metrics and the characteristics of the object-oriented model. Three input Java programs with different designs and complexities were tested and evaluated in an experimental study, using an equation to measure each metric together with a weight factor reflecting each metric's importance. He concluded that the system succeeds in evaluating Java programs and in checking the quality of student programs. In contrast, in this paper we depend on the C&K metrics rather than MOOD and QMOOD.
Sanjay and Ajay [24] studied the impact of object-oriented metrics on maintainability; they describe different types of OO metrics and focus on the CK metrics, taking each CK metric through an empirical study and analyzing its impact on maintainability. They concluded that lower CK metric values produce more maintainable software. The authors of [25] relied on the C&K and MOOD metrics: they took two open source projects, MARF and GIPSY, and measured their maintainability using MOOD and CK metrics with numerous tools (Logiscope, JDeodorant, Marcraft, McCabe), listing the advantages and disadvantages of each tool. They found that maintainability is affected by four factors (analyzability, testability, changeability and stability), measured each factor in isolation using different tools on different classes taken from the two case studies, and then validated the measurement values produced by the tools using different test cases at different units. They also compared the results of the two case studies across different classes and levels and gave recommendations for improving maintainability. A further comparison was made between the C&K and MOOD metrics: they proposed that C&K is better for design decisions and focuses on the class level, whereas MOOD measures the quality of the overall project. Finally, the authors showed that internal quality factors have a strong impact on external quality factors. Similar to [25], we use the same formula to compute maintainability for three OSS projects (File Transfer and Chat, Faculty Book System, and Car Sale System) using two tools: Understand for Java and Eclipse.
4. Research Methodology
In our maintainability evaluation approach, we selected three different open source software systems corresponding to three different domains: education, business and games. The evaluation approach consists of the following steps:

1. Select the OSS applications.
2. Run the three domains' applications in both tools, Understand and Eclipse.
3. Evaluate the four attributes that affect maintainability: changeability, analyzability, testability, and stability.
4. Evaluate the maintainability.
5. Assess maintainability for the different OSS domains.
6. Analyze the generated results.
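Step 4 combines the four sub-characteristic scores into a single maintainability value. The exact combination formula (taken from [25]) is not reproduced here, so the sketch below simply averages normalized scores as an illustrative stand-in:

```python
def maintainability(analyzability, changeability, stability, testability):
    # Illustrative only: the actual combination formula follows [25].
    # Here the four sub-characteristic scores (each normalized to 0..1)
    # are weighted equally and averaged.
    scores = (analyzability, changeability, stability, testability)
    assert all(0.0 <= s <= 1.0 for s in scores), "scores must be in 0..1"
    return sum(scores) / len(scores)
```

Any weighted combination could be substituted; equal weights merely keep the sketch neutral about which sub-characteristic matters most.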
File Transfer and Chat Project (Message Sending) is “a system has been developed in Java 1.3
which is based on Object Oriented Methodology. There are several packages in Java, but mainly
swing packages and networking has been utilized in developing this project. it works under two
modules, namely Active and Passive. Only passive clients can receive files, but active clients can
send and receive files as well. Any kind of files, including .fmx files, .exe files and more, can be
sent using this system”. [27]
Car sales management system is “ a project is written in java. It aims to make car sales more
easier through searchable criteria, such as car model, car price, car specification, speed, average
and many other factors”.[28]
A faculty book system is “a project is written in JAVA and can be used as a major project by
students. This project is used to keep record of faculty books and removed the manual work. There
is a full record of all the faculty of college and school in this system and whenever any book is
issued to any teacher; it is added to the faculty book system along with date of issue and return
date.” [29].
In a nutshell, these applications consist of almost the same number of files, packages and lines of code. Table 1 describes the three OSS:
Metric      Definition
cl_wmc      Sum of the static complexities (cyclomatic complexities) of the class methods
cl_comf     Ratio between the number of lines of comments in the module and the total number of lines
in_bases    Number of classes from which the class inherits, directly or not
cu_cdused   Number of classes directly used by this class
cl_stat     Number of executable statements in all methods and initialization code of a class
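Of the metrics in the table, cl_comf is straightforward to compute directly. A sketch for Java-style // line comments (block comments are omitted for brevity; the function name mirrors the table's metric and is not a real tool API):

```python
def cl_comf(source):
    # Ratio of comment lines to total lines, per the table's definition.
    lines = source.splitlines()
    if not lines:
        return 0.0
    comments = sum(1 for ln in lines if ln.strip().startswith("//"))
    return comments / len(lines)
```

A file whose lines alternate between comments and statements would score 0.5, the kind of per-class value the tools aggregate in the experiments below.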
5. Experimental Results
For our experimental study, we chose three projects from three different domains, all of approximately the same size according to their LOC values. For each project, we calculate the maintainability of each class according to our formula, then find the mean value of each metric per class, and from these values we calculate the maintainability of the overall project.
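The exact weights of the maintainability formula are not restated here, so the following Python sketch only illustrates the aggregation pipeline described above (per-class metric values combined into a class score, then averaged into a project score) using placeholder weights and metric values.

```python
# Illustrative aggregation only: `WEIGHTS` and the metric values below are
# placeholders, not the paper's actual formula coefficients or data.
from statistics import mean

WEIGHTS = {"cl_wmc": 0.4, "cu_cdused": 0.3, "cl_stat": 0.3}  # hypothetical

def class_maintainability(metrics: dict) -> float:
    # Higher score means more complex, hence less maintainable (Section 5).
    return sum(WEIGHTS[m] * v for m, v in metrics.items())

def project_maintainability(classes: list) -> float:
    # Project value = mean of the per-class values, as described in the text.
    return mean(class_maintainability(c) for c in classes)

classes = [
    {"cl_wmc": 42, "cu_cdused": 53, "cl_stat": 162},  # a "poor" class
    {"cl_wmc": 2,  "cu_cdused": 15, "cl_stat": 21},   # a "good" class
]
print(project_maintainability(classes))
```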
5.1 Evaluating maintainability using the Understand tool
After analyzing the file transfer and chat system with the Understand tool, Figure 1 presents the maintainability factor values for each class in this domain.
Figure 1
Figure 1 shows that the changeability factor takes the highest value among all factors, testability is second, then analyzability, while stability varies relative to the other factors. Based on the results presented in Figure 2, we chose a fair and a poor class and calculated the min, mean and max values for each metric included in the formula.
Figure 2
Table 3 shows the min, mean and max metric values for the Chat.java class, which we classify as a poor class.
Table 3: The metric values for the Chat.java class
Class Name: Chat.java
• The cl_wmc value is close to the max value; therefore, the complexity of the class increases.
• The cd_cdused value needs to be decreased in order to make the class easy to change.
• The high cl_stat value indicates a larger class size, which leads to a complex class.
• The number of cl_func should be decreased.
• The value of cl_func_pub is 8 out of 10 cl_func; this ratio is high, so public methods should be replaced with protected and private ones, keeping only the externally accessible methods public.
• The high cl_data value indicates high coupling, leading to a more complex system, and the high cl_data_publ value indicates low information hiding (low encapsulation).
In contrast, Table 4 presents clientform.java as a good class in the file transfer and chat system.
Table 4: The metric values for the clientform.java class
Class Name: clientform.java
Metric           Value      Min        Mean       Max
cl_wmc           2          2          15.416667  42
cl_comf (ratio)  0.129032   0.0098039  0.0627238  0.1315789
cd_cdused        15         2          19.416667  53
cl_stat          21         11         54.416667  162
cl_func          1          1          5.5        12
cl_data          2          2          10.416667  26
cl_func_pub      0          0          3.8181818  12
cu_cduser        8          2          10.545455  30
After analyzing the Faculty Book System with the Understand tool, Figure 3 presents the maintainability factor values for each class in this domain.
Figure 3
Similar to the file transfer and chat system, the changeability factor takes the highest value among all factors, followed by testability, analyzability, and stability respectively. Based on the results presented in Figure 4, we chose a fair and a poor class and calculated the min, mean and max values for each metric included in the formula.
Figure 4
Table 5 shows the min, mean and max values for the jfrmfacultybook.java class, which we consider a poor class.
• The cl_wmc value is close to the max value, which means it is a complex class.
• The cd_cdused value needs to be decreased in order to make the class easy to change.
• The high cl_stat value (230) indicates a larger class size, which leads to a complex class.
• The number of cl_func should be decreased.
• The value of cl_func_pub is 8 out of 12 cl_func; this ratio is high, so public methods should be replaced with protected and private ones, keeping only the externally accessible methods public.
• The value of cl_data is close to the maximum value, which indicates high coupling that leads to a more complex system.
In contrast, Table 6 presents jfrmabout.java as a good class in the Faculty Book System.
• cl_wmc is low, which indicates low complexity and leads to high maintainability.
• cd_cdused is at the mean value.
• cl_stat has a low value (43 out of 347), which means we have a small class.
• There are only 2 methods declared in this class, and both are declared as public.
After analyzing the Car Sale System with the Understand tool, Figure 5 presents the maintainability factor values for each class in this domain.
Figure 5
The changeability factor holds the highest value among all factors, followed by testability, while the values for analyzability and stability vary from one class to another. Based on the results presented in Figure 6, we chose a fair and a poor class and calculated the min, mean and max values for each metric included in the formula.
Figure 6
Table 7 shows the min, mean and max values for the CarSaleSystem.java class, which we consider a poor class.
• The cl_wmc value is at the max value; therefore, the complexity of the class increases, which leads to less maintainable software.
• The cd_cdused value is at the max value and needs to be decreased in order to make the class easy to change.
• cl_stat is at the max value, which indicates a larger class size that leads to a complex class.
• The number of cl_func should be decreased.
• All methods in the class are declared public, so public methods should be replaced with protected and private ones, keeping only the externally accessible methods public.
• The high cl_data value indicates high coupling between the methods of the class, and the high cl_data_pub value indicates low information hiding (low encapsulation).
In contrast, Table 8 presents manfacturer.java as a good class in the Car Sale System.
Table 8: The metric values for the manfacturer.java class
Class Name: manfacturer.java
Metric           Value      Min        Mean      Max
cl_wmc           8          5          17.1      51
cl_comf (ratio)  0.613793   0.211765   0.339118  0.613793
cd_cdused        4          4          9.7       27
cl_stat          12         12         52.3      112
cl_func          7          4          8.8       20
cl_data          2          2          10.6      21
cl_func_pub      6          2          7.2       20
cu_cduser        3          3          11.5      21
• cl_wmc is low, which indicates low complexity and leads to high maintainability.
• cd_cdused is low, meaning low coupling, which is better.
• cl_stat has a low value (12 out of 112), resulting in a small class.
• There are 7 methods declared in this class, which indicates low complexity.
• The number of public methods should be decreased, as cl_func_pub is 6 of the 7 methods.
• cl_data_pub is low, which indicates higher encapsulation.
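The good/poor judgments in Tables 3 to 8 come down to comparing a class's metric values against the project-wide min, mean and max. Below is a minimal Python sketch of that comparison; the flagging rule and the 0.8 threshold are illustrative choices, not the paper's.

```python
# Sketch of the comparison done in Tables 3-8: flag a class as "poor" when
# most of its metric values sit close to the project-wide maximum.
# The 0.8 threshold and the majority rule are assumptions for illustration.

def flag_class(values: dict, maxima: dict, threshold: float = 0.8) -> str:
    near_max = sum(1 for m, v in values.items() if v >= threshold * maxima[m])
    return "poor" if near_max > len(values) / 2 else "good"

# Max values taken from Table 8 (Car Sale System project).
maxima = {"cl_wmc": 51, "cd_cdused": 27, "cl_stat": 112, "cl_func": 20}
# manfacturer.java's own values from Table 8.
manfacturer = {"cl_wmc": 8, "cd_cdused": 4, "cl_stat": 12, "cl_func": 7}
print(flag_class(manfacturer, maxima))
```

None of manfacturer.java's values approach the project maxima, so the sketch flags it "good", matching the paper's assessment.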
Higher maintainability values indicate more complex, and therefore less maintainable, software. Although the Business domain is larger than the Social domain by 109 LOC, the Social and Business domains have almost the same maintainability value. This indicates that the two domains have the same degree of maintainability. The Education domain has the highest maintainability value, which means it is the most complex domain with the least maintainable software.
We also measured maintainability for the Business and Education domains with the Eclipse tool, which produced measurements close to those of the Understand tool. The Business LOC is 1401, and the Education LOC is 1945. Table 10 summarizes the maintainability values for the two domains.
We found that cl_stat has the most impact on the maintainability values, followed by cl_wmc and cd_cdused. Figures 7, 8 and 9 show the metric values that have the highest impact in calculating maintainability for the three domains.
different projects. Our projects could include some other metrics that have a direct effect on the calculations.
The tools used are also a threat to validity. Each tool has several advantages and disadvantages, listed below:
Disadvantages:
• Trial version.
• Cannot generate the mean, min and max values for each metric.
5.5.2 Eclipse Tool
Advantages:
• Available for free.
• Generates the mean, max, min and standard deviation values for each metric.
Disadvantages:
• Does not generate measurements for all metrics, such as …..
• For small projects, some metrics are not generated.
• Cannot analyze a project containing errors.
6. Conclusion
Maintainability is the most important quality attribute affecting the quality of software. Based on the C&K metrics, we computed the maintainability of three different domains using two different tools.
We conclude that different domains produce different maintainability measurements, and that the factor most affecting maintainability is changeability. From the results of the Understand and Eclipse Metrics tools, we deduce that different tools generate similar results for maintainability; we trust the Understand tool to provide the more accurate maintainability measurements, given its advantages over the Eclipse tool.
References:
[1] Tahir, Amjed, and Rodina Ahmad. "An AOP-Based Approach for Collecting Software
Maintainability Dynamic Metrics." Computer Research and Development, 2010 Second
International Conference on. IEEE, 2010.
[2] http://www.castsoftware.com/glossary/software-maintainability. Accessed 2017.
[3] Ghosh, Soumi, and Sanjay Kumar Dubey. "Fuzzy Maintainability Model for Object Oriented Software System." (2012).
[4] Johari, Kalpana, and Arvinder Kaur. "Validation of object oriented metrics using open source software system: an empirical study." ACM SIGSOFT Software Engineering Notes 37.1 (2012): 1-4.
[5] Rizvi, S. W. A., and R. A. Khan. "Maintainability estimation model for object-oriented software in design phase (MEMOOD)." arXiv preprint arXiv:1004.4447 (2010).
[6] Albeladi, Abdulrhman, et al. "Toward Software Measurement and Quality Analysis of MARF and GIPSY Case Studies: a Team 13 SOEN6611-S14 Project Report." arXiv preprint arXiv:1407.0063 (2014).
[7] Lincke, Rüdiger, Jonas Lundberg, and Welf Löwe. "Comparing software metrics tools." Proceedings of the 2008 International Symposium on Software Testing and Analysis. ACM, 2008.
[8] Levesque, M. (2005). Fundamental issues with open source software development (originally
published in Volume 9, Number 4, April 2004). First Monday.
[9] Open-source software security. Available at: https://en.wikipedia.org/wiki/Open-source_software_security. Accessed Dec 11, 2016.
[10] Nichols, D., Twidale, M. (2002). The Usability of Open Source Software. First Monday, 8(1). Available at: http://firstmonday.org/article/view/1018/939#n2.
[11] Ayalew, Y., & Mguni, K. (2013). An Assessment of Changeability of Open Source
Software. Computer and Information Science, 6(3), p68.
[12] Gulia, Preeti, and Rajender Singh Chhillar. "Design based Object-Oriented Metrics to Measure Coupling and Cohesion." Journal of Management & Computing Sciences (IJMCS) 1.3 (2011): 42.
[13] Mukti Chauhan, Monika Sharma. "Predicting Maintainability Of Open Source Software: An Empirical Approach." IJERT, Volume 2, Number 6, June 2013, Pages 3333-3336.
[14] Malhotra, Ruchika, and Anuradha Chug. "An empirical study to redefine the relationship between software design metrics and maintainability in high data intensive applications." Proceedings of the World Congress on Engineering and Computer Science. Vol. 1. 2013.
[15] Ganpati, Anita, Arvind Kalia, and Hardeep Singh. "A Comparative Study of Maintainability
Index of Open Source Software." International Journal of Software and Web Sciences 3.2 (2012):
69-73.
[16] Najm, Nahlah MAM. "Measuring Maintainability Index of a Software Depending on Line of Code Only." IOSR Journal of Computer Engineering (IOSR-JCE), Volume 16, Issue 2, Ver. VII (Mar-Apr. 2014), PP 64-69.
[17] Bakar, A. D., Sultan, A. M., Zulzalil, H., & Din, J. (2012). Review on 'Maintainability' Metrics in Open Source Software. International Review on Computers and Software, 7(3).
[18] Saini, Rimmi, Sanjay Kumar Dubey, and Ajay Rana. "Analytical study of maintainability
models for quality evaluation." Indian Journal of Computer Science and Engineering 2.3 (2011):
449-454.
[19] Rizvi, S. W. A., and R. A. Khan. "Maintainability estimation model for object-oriented software in design phase (MEMOOD)." arXiv preprint arXiv:1004.4447 (2010).
[20] Ghosh, Soumi, and Sanjay Kumar Dubey. "Fuzzy Maintainability Model for Object Oriented Software System." (2012).
[21] Zhou, Yuming, and Baowen Xu. "Predicting the maintainability of open source software
using design metrics." Wuhan University Journal of Natural Sciences13.1 (2008): 14-20.
[22] Katoch, Bhavna, and Lovepreet Kaur Shah. "A Systematic Analysis on MOOD and QMOOD
Metrics." International Journal of Current Engineering and Technology 4.2 (2014): 620-622.
[23] Al-Ja’Afer, J., and K. Sabri. "Metrics for object oriented design (MOOD) to assess Java
programs." King Abdullah II school for information technology, University of Jordan,
Jordan (2004).
[24] Dubey, Sanjay Kumar, and Ajay Rana. "Assessment of maintainability metrics for object-
oriented software system." ACM SIGSOFT Software Engineering Notes 36.5 (2011): 1-7.
[25] Albeladi, Abdulrhman, et al. "Toward Software Measurement and Quality Analysis of
MARF and GIPSY Case Studies a Team 13 SOEN6611-S14 Project Report." arXiv preprint
arXiv:1407.0063 (2014).
[26] Lincke, Rüdiger, Jonas Lundberg, and Welf Löwe. "Comparing software metrics tools." Proceedings of the 2008 International Symposium on Software Testing and Analysis. ACM, 2008.
Abstract— Besides the important applications of Electroencephalogram (EEG) signals, such as recognizing different mental diseases, other aspects of EEG utilization, such as biometrics, music and entertainment, are striking nowadays. To make a good interface between human brains and the surrounding environment, the brain-computer interface (BCI) has created a remarkable evolution in this field. In this paper, EEG signals acquired while watching 2D and 3D movies are investigated. A sample of nine healthy volunteers (age range 18-30) contributed to the experiments, which consist of two parts: first, subjects watched a 2D movie and then watched the same movie in 3D mode. After data acquisition, to predict the states of the brain, the signals are sent to the feature extraction stage. The Fast Fourier Transform (FFT) is used to extract features, which are then classified by the "Classification Learner App". Two kinds of Support Vector Machine (SVM) classifier and the fine kind of k-nearest neighbors (kNN) were used as classifiers in this study. To understand which frequency bands are more effective in the EEG signals during watching 2D and 3D movies, these combinations of EEG bands are used as features: the delta, theta, alfa, beta and gamma bands abbreviated as "all bands"; the delta, theta and alfa bands as "low frequency bands"; theta, alfa and beta as "middle bands"; and the alfa, beta and gamma bands as "high frequency bands". Finally, comparing the results, the classification accuracy of "all bands" in channel T5 for the Quadratic SVM was the highest.
Keywords— EEG, brain-computer interface, 2D and 3D movies, kNN, SVM, Classification Learner App
been investigated as the main topic of this research. Wajid et al. have analyzed five regions of the brain and proposed a new method based upon classification of states of 2D and 3D game data [7]. Kaiyang et al. have used EEG signals to predict the movement intent of the human body [8]. Norizam et al., by analyzing EEG signals in the lab, interpreted human thought; despite the low accuracy of this work, the study can create the LabVIEW block diagram for testing [9]. In addition to medical uses of BCI, there are also some recreational applications. Nowadays, there is significant progress in the realm of three-dimensional images and videos. For instance, Khairuddin et al., by gathering adults' EEG signals during video game play in 2D and 3D, concluded that their method may be useful in quantifying the EEG signals during 2D and 3D visualization [10]. In 2014, another team showed that during play in 3D mode there is an increase in the theta and alpha bands at the frontal and occipital regions, while in 2D play higher beta and gamma activity was found, chiefly in the temporal lobes [11]. In [12], a research team measured the brain activities of viewers during watching 2D, 2.5D and 3D motion pictures and compared them with each other; their results showed that the relative intensity of the α-frequency band of the 2.5D viewer was lower than that of the 2D viewer, while that of the 3D viewer remained at a similar intensity. Other studies have shown that the obtained brain waves in the α-frequency band are not related to visual perception, although incompatible results have been reported by a few previous studies [13, 14].
As known, EEG is used to study the human brain's visual perception, besides other applications. Our goal is to design a pattern recognition system for classifying brain signals during watching 2D and 3D movies by applying EEG analysis. We also want to know which bands of the EEG signal play an important role in our study. In this system, we work with eight subjects. By using a common feature extraction method and a high-performance classification algorithm, we hope that this pattern helps us to identify people in both 2D and 3D movie watching. Furthermore, by exploring the difference between brain activities during these two cases, it is hoped that this pattern can be executable in BCI applications. In other words, the subject could watch a 2D motion image to switch a device ON and a 3D motion image to switch it OFF. This study is a first step toward designing and implementing a new, fast, simple BCI system.
The structure of the paper is as follows: after the preliminaries, the experimental setup, the results and, finally, the conclusion and discussion sections are given. The structure of brain signal recording is shown in Figure 1.
Fig. 1. The structure of brain signal recording
A. Subjects
Eight subjects, three women and five men, aged between 18 and 30 years, participated in this investigation. None of the subjects had problems with their vision. In the next step, all of the subjects were informed about the experimental conditions. All of them were asked to sit on a chair almost one meter away from the TV (LG 32 inch) stand, to relax, and to focus on the television screen. Subjects were asked to keep unessential movements to a minimum during the trial. Two sessions were organized for each subject. Subjects first watched the 2D movie; then, after stopping the program and a one-minute gap, recording continued while they watched the same movie in 3D. A two-minute and twelve-second 2D clip and the same two-minute and twelve-second 3D clip of the AVATAR movie [15] were shown to the subjects.
B. EEG Data Acquisition
To demonstrate the use of EEG signals in detecting the differences between 2D and 3D movie watching, a data collection scenario was designed. The EEG data corresponding to 2D and 3D movie watching were recorded with the Brain Quick EEG System (Micromed, Italy) from 18 scalp locations (the Fp1, Fp2, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, T6, Fz and Pz channels) based on the international 10-20 system, with Cz used as the reference. EEG data were recorded with a sampling rate of 512 samples per second. The related EEG cap has 19 channels in total, and all of the recorded channels were selected for analysis.
C. Data Preprocessing
In this paper, the raw EEG data are transferred to a computer via flash memory and saved as an .edf file; a MATLAB program then converts the data to .mat format, ready for analysis. Different preprocessing methods can be applied to the raw data in order to remove line noise, polarization noise, eye movements, and muscular activities. In this study, we used a band-pass filter to elicit the desired signal between 0.1 and 120 Hz, and a 50 Hz notch filter to remove line noise. A mean normalization process was applied to each epoch as in (1) [16].
X_N = (x − x̄) / max|x − x̄|          (1)
D. Epoch Category
After data collection and preprocessing, the EEG signals are divided into 1-second epochs. For each movie viewer, there are 280 1-second epochs in total (140 epochs for 2D and 140 epochs for 3D). The described data set is given in Table II.
TABLE II. DESCRIBING THE SELECTION OF THE DATA SET
III. FEATURE EXTRACTION
Besides time reduction, high performance of the feature extraction is very important in signal analysis. In order to highlight the effectiveness and efficiency of this BCI system, we deliberately concentrate on the simplest algorithm, to decrease the calculations and shorten the latency [6].
To describe the characteristics of the EEG signals and transform them to the frequency domain, the Fast Fourier Transform (FFT) is used as in (2). In this study, all of the channels were used to extract features. By using this analysis, converting the raw data
from its original domain (time) to the frequency domain is made possible. Using the Discrete Fourier Transform (DFT), discrete-time sequences are converted into their discrete-frequency versions, as given by (3). The FFT is an efficient algorithm for calculating a sequence's DFT [17]. The FFT is available as the function fft() in MATLAB, which is used in this study.
X(f) = F(x(t)) = ∫_{-∞}^{+∞} x(t) e^{-2πjft} dt          (2)
X(k) = Σ_{i=0}^{n-1} x_i e^{-j2πik/n},  for k = 0, 1, ..., n-1          (3)
where in (2), x(t) is the time-domain signal and X(f) is its Fourier transform, and in (3), x is the input sequence, X is its DFT, and n is the number of samples [18].
After signal transformation, for each epoch of the 18 channels the five bands of the EEG signal (delta, theta, alpha, beta, and gamma) were obtained. For each epoch, the samples' average for each band is calculated in order to reduce the dimension of the EEG signals. In this way, for each epoch in one channel, 5 features were extracted and, as mentioned, 18 channels were used. So, 90 (18*5) features were prepared for each epoch.
Because this investigation has binary classes, the prediction speed is fast, whereas in a multiclass model this speed is generally medium. Furthermore, the Linear SVM classifier has medium memory usage and easy interpretability; the flexibility of this model is low, and it separates classes linearly. In the Quadratic SVM model, the prediction speed is higher than that of the linear model, memory usage is large, and interpretability is hard; the flexibility of this model is medium.
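The epoch-to-feature pipeline described above (mean normalization per equation (1), 1-second epochs, FFT magnitudes averaged per band over 18 channels) can be sketched end to end. This is an illustrative NumPy reimplementation, not the authors' MATLAB code; the band edges are conventional values assumed here, and the input signal is synthetic.

```python
import numpy as np

FS = 512                                          # sampling rate (Section B)
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alfa": (8, 13),
         "beta": (13, 30), "gamma": (30, 100)}    # Hz; assumed band edges

def mean_normalize(epoch):
    """Equation (1): X_N = (x - mean(x)) / max|x - mean(x)|."""
    centered = epoch - epoch.mean()
    return centered / np.abs(centered).max()

def band_features(epoch):
    """Mean FFT magnitude per EEG band for one 1-second epoch."""
    spectrum = np.abs(np.fft.rfft(epoch))         # DFT of eq. (3) via FFT
    freqs = np.fft.rfftfreq(len(epoch), d=1 / FS)
    return [spectrum[(freqs >= lo) & (freqs < hi)].mean()
            for lo, hi in BANDS.values()]

def epoch_features(recording):
    """Split an (n_channels, n_samples) recording into 1-s epochs and
    return one feature vector (n_channels * 5 values) per epoch."""
    n_channels, n_samples = recording.shape
    vectors = []
    for start in range(0, n_samples - FS + 1, FS):
        feats = []
        for ch in range(n_channels):
            feats.extend(band_features(
                mean_normalize(recording[ch, start:start + FS])))
        vectors.append(feats)
    return np.array(vectors)

# Synthetic stand-in for one viewer: 18 channels, 280 s (140 s 2D + 140 s 3D).
rng = np.random.default_rng(0)
recording = rng.standard_normal((18, 280 * FS))
X = epoch_features(recording)
print(X.shape)    # 280 epochs x 90 (18*5) features
```

The band-pass (0.1-120 Hz) and 50 Hz notch filtering of Section C are omitted here; in practice they would be applied to the raw signal (e.g. with scipy.signal) before epoching.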
over the size of the testing set. Table III shows the classification accuracy obtained using 18 channels and the three classifiers for all EEG bands. The average of each classifier over all subjects in each channel was also computed separately in the last row. It can be seen that channel T5 with the Quadratic SVM has the best performance, with 69.23%.
To understand which bands are more effective in the EEG signals during watching 2D and 3D movies, the low frequency bands, consisting of delta, theta and alfa, are used as features. Table IV shows the classification accuracy obtained using 18 channels and the three classifiers for the low frequency bands. As in Table III, the average of each classifier over all subjects in each channel was computed separately in the last row. For the low frequency bands, channel Fp1 with the Linear SVM has the best performance, with 64.50%. Continuing the frequency band evaluation, the middle and high frequency bands were used as features. Table V and Table VI show the classification accuracy obtained using 18 channels and the three classifiers for the middle and high frequency bands, respectively. The average of each classifier over all subjects in each channel was computed separately in the last row of each table. For the middle frequency bands, channel Fp1 with the Linear SVM has the best performance, with 62.23%. For the high frequency bands, channel C4 with the Quadratic SVM has the best performance, with 64.69%. In brief, we can see that the high frequency bands are somewhat more effective than the other bands.
VI. CONCLUSION
This paper proposed a new BCI system based on watching 2D and 3D movies. It also tried to find which bands of the EEG signals have more effect while watching 2D and 3D movies. Four different combinations of EEG bands were used as features: the delta, theta, alfa, beta and gamma bands abbreviated as "all bands"; the delta, theta and alfa bands as low frequency bands; theta, alfa and beta as middle bands; and the alfa, beta and gamma bands as high frequency bands. The results showed that all bands can be used in classification with a reasonable accuracy of about 70%. The results also show that the high frequency bands are somewhat more effective than the other bands. The classification results were prepared with the help of the Classification Learner App in the Statistics and Machine Learning Toolbox of MATLAB. This scheme has the potential to be used in BCI systems and even in automated diagnosis of fatigue. In the future, different classifiers and feature extraction methods can be employed to observe which one may provide even better classification results.
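As a rough Python analogue of the Classification Learner App workflow above, the sketch below cross-validates the same three classifier types (fine kNN, Linear SVM, and a polynomial-kernel SVM of degree 2 standing in for the Quadratic SVM) using scikit-learn. The features and class separation are synthetic, for illustration only; in the study each sample would be one epoch's band features, labelled 2D or 3D.

```python
# Illustrative stand-in for the MATLAB Classification Learner App comparison.
# The data below is synthetic; no accuracy here reproduces the paper's tables.
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.standard_normal((280, 5))          # 280 epochs x 5 band features
y = np.repeat([0, 1], 140)                 # 140 "2D" + 140 "3D" epochs
X[y == 1] += 0.5                           # synthetic class separation

classifiers = {
    "kNN (fine)": KNeighborsClassifier(n_neighbors=1),
    "Linear SVM": SVC(kernel="linear"),
    "Quadratic SVM": SVC(kernel="poly", degree=2),
}
accs = {}
for name, clf in classifiers.items():
    accs[name] = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {accs[name]:.2%}")
```

Per-channel results like those in Tables III to VI would come from repeating this cross-validation once per channel's feature subset.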
TABLE III. CLASSIFICATION ACCURACY OBTAINED USING 18 CHANNELS AND 3 CLASSIFICATION METHODS FOR ALL EEG BANDS
TABLE IV. CLASSIFICATION ACCURACY OBTAINED USING 18 CHANNELS AND 3 CLASSIFICATION METHODS FOR THE DELTA, THETA, AND ALFA BANDS
Sub Classifier FP1 FP2 F4 F3 F8 F7 FZ T4 T3 C4 C3 T6 T5 P4 P3 PZ O2 O1
S1 kNN 56.3 55.2 56.7 45.6 52.4 56 62.3 49.6 54.8 50.4 49.2 56.7 52 52 46.4 67.1 49.2 45.2
Linear SVM 61.5 62.7 63.5 52 62.3 56.7 59.5 55.6 55.2 55.2 57.1 56.7 60.7 50.4 57.5 68.3 55.2 45.2
Quadratic SVM 60.7 57.7 63.9 51.6 61.3 60.7 60.3 50 57.5 55.6 48.8 53.2 56.7 52.8 57.9 67.1 55.2 62.3
S2 kNN 46.7 47.1 59 51.6 50.4 49.2 52.5 43 50 49.6 57.4 52.9 69.3 54.1 53.3 58.2 43 66
Linear SVM 54.5 54.9 55.7 60.7 56.1 60.2 54.5 50.8 54.9 54.1 56.1 53.3 80.3 55.3 58.6 54.5 54.5 69.3
Quadratic SVM 51.6 55.7 59 56.6 51.2 58.6 55.7 52.9 55.3 52.5 55.7 55.3 77.9 49.2 58.2 53.3 54.9 66.4
S3 kNN 60.7 57.8 60.2 54.5 61.1 52.5 55.3 61.1 53.7 61.1 55.7 52 65.2 64.3 58.6 64.8 60.2 60.7
Linear SVM 68.4 68 52.5 53.7 65.2 60.7 61.9 63.1 49.2 64.8 63.5 57 73 71.7 63.1 64.3 70.5 68.4
Quadratic SVM 63.5 66.4 48.8 56.6 65.2 60.7 62.3 62.7 54.1 62.3 63.5 56.6 71.3 70.5 63.5 63.1 68.4 69.3
S4 kNN 59.5 63.2 61.6 62.8 51.7 53.7 61.2 55 50.4 61.6 58.7 48.8 56.2 46.3 53.7 55.8 49.2 59.9
Linear SVM 68.2 62 67.4 70.7 59.5 62 66.5 59.1 57.4 71.9 53.3 57 57.9 55.8 62.4 63.2 55.4 47.5
Quadratic SVM 63.6 61.6 68.6 68.2 59.1 63.2 63.2 62.4 57 70.7 58.7 54.5 57 51.7 59.1 59.9 57.9 55.8
S5 kNN 53.7 48.8 59.4 51.6 48 49.2 54.5 55.3 49.2 45.5 56.1 54.1 54.9 81.1 45.1 50 52 53.7
Linear SVM 55.3 50.4 67.2 46.7 44.7 53.3 52 59.4 51.6 53.7 48.4 51.2 54.5 82 52.9 52.9 51.6 47.5
Quadratic SVM 50.4 49.6 66.8 50.4 48.4 50 50.8 64.8 52.5 49.2 49.2 51.2 48.4 78.7 53.3 47.5 52.9 47.5
S6 kNN 52.5 56.1 56.1 43 50 49.2 50.4 48.4 70.9 49.2 48 54.5 50 51.2 51.2 50 50.4 48.4
Linear SVM 52.9 58.2 52.5 50 52.9 49.6 43.4 51.2 71.7 58.2 52.9 61.5 48.8 49.6 57.4 53.7 54.1 50
Quadratic SVM 58.2 57.4 58.2 55.3 56.6 52 48 45.5 61.9 58.2 54.5 57 54.5 56.6 58.6 54.9 57 59.8
S7 kNN 88.9 49.2 60.7 62.7 49.6 59.8 56.6 54.1 61.1 45.9 50 59.8 49.2 61.5 95.1 51.6 54.5 47.5
Linear SVM 91.8 61.5 60.7 68 68.9 60.7 57.4 56.6 58.6 47.1 63.5 64.8 55.7 58.6 72.5 55.7 60.2 57.8
Quadratic SVM 89.3 61.1 59 62.3 68 72.1 63.9 54.9 60.7 58.2 62.3 66 54.1 66 89.8 61.9 61.5 58.6
S8 kNN 59 51.2 52.5 55.3 63.9 60.2 59.4 52 47.5 59 48.8 49.2 53.7 73.8 57.8 53.3 47.1 48.8
Linear SVM 64.3 55.7 57 60.7 57.4 63.5 53.7 55.3 49.2 54.1 53.7 50 61.5 69.7 62.7 51.2 57.4 60.2
Quadratic SVM 62.3 54.5 57.4 59.4 59.8 63.9 57 52.9 53.3 56.1 54.1 50.8 58.2 72.1 69.3 51.6 55.7 61.9
S9 kNN 59 53.3 57 52.9 63.9 59 59.8 51.2 48 58.2 45.1 51.2 53.7 72.1 55.3 49.6 45.9 49.2
Linear SVM 63.9 57.4 58.2 58.6 54.9 62.7 55.7 54.9 50.4 54.9 54.5 51.6 62.3 70.5 63.1 49.2 55.7 60.7
Quadratic SVM 61.5 59.4 55.3 59.8 61.9 63.1 58.2 52.5 50 53.7 58.2 48.4 57 73 66 43 55.3 61.5
Ave kNN 59.5 53.55 58.14 53.3 54.56 54.3 56.89 52.19 53.9 53.39 52.12 53.2 56.0 61.83 57.39 55.6 50.17 53.27
Linear SVM 64.5 58.98 59.42 57.9 57.99 58.8 56.07 56.23 55.3 57.12 55.89 55.9 61.6 62.63 61.14 57 57.18 56.29
Quadratic SVM 62.3 58.16 59.67 57.8 59.06 60.4 57.72 55.4 55.8 53.39 56.12 54.7 59.4 63.4 63.97 55.82 57.65 60.35
TABLE V. CLASSIFICATION ACCURACY OBTAINED USING 18 CHANNELS AND 3 CLASSIFICATION METHODS FOR THETA, ALFA AND BETA BANDS
Sub Classifier FP1 FP2 F4 F3 F8 F7 FZ T4 T3 C4 C3 T6 T5 P4 P3 PZ O2 O1
S1 kNN 44.3 53.7 61.5 52.9 67.6 57 53.7 49.2 52 50.8 51.2 51.6 67.2 46.7 53.3 48 52 66.8
Linear SVM 57.8 65.6 57 60.7 71.7 61.5 50.8 61.9 59.8 52.9 61.1 62.3 77 57.8 54.5 59.8 57.4 73.4
Quadratic SVM 55.3 64.8 53.3 61.1 70.5 60.7 50.8 60.2 58.6 54.1 57 57 78.3 56.6 51.6 57.4 56.6 73.8
S2 kNN 43.4 53.3 59.8 52 69.7 59.4 51.2 48.4 49.2 52.9 50.8 50 68 48.8 52.9 51.2 51.2 66.4
Linear SVM 61.1 67.2 57 62.3 72.5 61.5 53.3 63.1 61.1 54.5 62.3 63.9 76.2 59 52.5 54.9 55.7 75.4
Quadratic SVM 54.5 65.6 57.4 58.6 69.7 61.5 46.3 58.6 60.2 54.1 56.6 60.2 78.3 54.5 48.4 53.7 54.5 73.8
S3 kNN 59.8 60.2 57.4 48 59.4 59.8 55.7 54.9 45.5 62.3 61.9 56.6 56.1 59.8 52.9 54.5 54.9 47.1
Linear SVM 62.7 61.5 50 55.7 59.8 58.2 59.4 59 52.5 64.8 66 56.6 54.5 63.9 58.2 59.8 68 56.6
Quadratic SVM 52.5 66.4 48 58.2 58.6 56.6 59.8 57 47.5 61.9 64.8 51.2 50.8 60.2 57 63.1 65.6 57
S4 kNN 58.7 56.2 66.1 65.7 58.7 48.8 68.6 55 55 68.2 46.3 56.2 52.9 50.8 51.7 52.9 56.6 52.9
Linear SVM 64.5 66.5 70.2 73.6 59.1 60.3 72.3 56.6 57.4 76.4 52.1 70.2 59.1 52.9 57.4 58.3 59.5 50
Quadratic SVM 63.2 66.1 71.9 69.4 60.7 57.4 70.2 53.7 59.1 73.6 50.4 69 61.6 52.9 52.1 58.7 63.6 52.9
S5 kNN 43.9 50 56.1 53.3 42.6 45.9 52 65.2 52.9 54.1 56.6 64.3 54.5 75.4 48.4 52 54.1 51.6
Linear SVM 50 51.2 56.6 48.8 49.6 47.1 47.5 54.1 48.8 47.1 43.4 45.9 43.4 80.7 50 44.3 49.6 51.2
Quadratic SVM 50 50 60.7 49.2 54.9 51.2 49.6 59 49.2 50.4 46.3 48.8 48.8 60.7 48.8 48.4 47.5 50
S6 kNN 59.8 58.2 53.7 50 50 60.2 57 57.4 50.4 60.7 53.7 53.7 49.2 49.2 51.2 45.9 56.6 52.5
Linear SVM 51.6 53.7 51.2 60.7 60.7 60.2 52.9 41.8 47.5 72.1 57.4 49.6 48 48.8 56.6 55.3 55.3 50.8
Quadratic SVM 61.1 55.3 62.3 65.6 55.7 57.4 59.8 58.2 51.2 71.3 58.6 48.8 54.1 53.7 52 55.3 57.4 52
S7 kNN 91 49.2 59 59.4 56.1 63.5 51.6 52.9 58.2 57 49.2 60.2 48.8 60.2 98 52.9 50.4 55.7
Linear SVM 94.3 59.8 51.6 68.4 67.6 55.7 55.3 57.8 61.5 57.4 49.6 58.2 50.8 56.6 84.8 54.1 61.5 52.9
Quadratic SVM 93 59 59 70.1 66 69.7 55.7 54.5 65.6 60.2 53.3 53.7 52.5 64.8 99.2 54.5 59.8 59.4
S8 kNN 59.4 57.8 45.5 49.6 58.2 59.4 62.3 55.7 49.2 60.2 67.6 64.3 67.6 75 61.9 47.5 41.8 51.2
Linear SVM 67.2 66 48.4 60.2 55.7 50.8 52.9 57 56.6 68 70.9 65.2 66 75.8 65.2 53.7 52.9 58.2
Quadratic SVM 59.8 62.3 50 60.7 55.7 52 45.1 51.6 60.2 71.3 69.3 68 69.7 78.7 70.5 53.3 58.2 57.8
S9 kNN 63.1 45.5 54.5 58.6 56.1 52.9 53.3 69.7 50.4 50.4 61.9 52.9 49.6 48.8 58.6 41 50.8 50.4
Linear SVM 50.8 52.9 49.6 50.8 51.6 50.4 49.6 65.2 53.3 53.7 50.4 51.2 48 49.6 57.4 50 51.6 50.8
Quadratic SVM 52 50 53.3 49.6 50.4 53.7 53.3 65.6 48.4 51.6 52 49.6 48.8 47.5 50.8 48 46.3 52.5
Ave kNN 58.15 53.79 57.06 54.39 57.6 56.3 56.16 56.49 51.43 57.4 55.4 56.6 57.1 57.19 58.77 49.55 52.05 54.96
Linear SVM 62.23 60.49 54.63 60.14 60.93 56.1 54.89 57.39 55.39 60.77 57.0 58.1 58.12 60.57 59.63 54.47 56.84 57.7
Quadratic SVM 60.16 59.94 57.33 60.28 60.25 57.8 54.52 57.6 55.56 60.95 56.4 56.2 60.33 58.85 58.94 54.72 56.62 58.8
434 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
TABLE VI. CLASSIFICATION ACCURACY OBTAINED USING 18 CHANNELS AND 3 CLASSIFICATION METHODS FOR THE ALPHA, BETA AND GAMMA BANDS
Sub Classifier FP1 FP2 F4 F3 F8 F7 FZ T4 T3 C4 C3 T6 T5 P4 P3 PZ O2 O1
S1 kNN 56 55.2 67.9 46.4 57.9 54.8 48.4 57.5 66.7 64.7 51.2 63.9 75.5 65.9 63.5 58.7 90.1 63.9
Linear SVM 63.5 65.1 57.1 52 59.9 63.1 59.9 60.3 78.2 60.7 50.4 79 65.1 70.2 60.7 62.7 91.3 66.3
Quadratic SVM 64.7 65.1 56 59.1 61.5 66.3 59.1 58.3 77.8 75.4 48 78.6 68.7 71 61.1 67.5 90.5 69.8
S2 kNN 50.8 70.1 57.4 48.4 67.6 54.9 48 53.7 53.7 49.6 51.6 54.5 59 53.7 49.2 52 55.7 57.8
Linear SVM 64.8 78.3 52.5 59.8 72.1 61.9 53.3 68 67.6 52.5 54.5 68 65.2 61.5 50.8 55.7 62.3 75.4
Quadratic SVM 62.7 79.9 46.7 61.1 71.7 57.4 51.2 65.6 65.2 57 59.4 70.1 66.8 58.2 51.6 55.3 59.4 73
S3 kNN 59.4 62.7 57 49.2 54.5 59.4 54.5 54.5 46.3 60.7 59.4 49.6 54.9 57.4 53.7 56.6 64.3 54.9
Linear SVM 60.7 63.5 52.9 57 59.4 58.6 57 57.8 55.3 64.3 66.8 55.7 53.7 64.8 59 60.7 68 57.4
Quadratic SVM 69.3 64.8 50.4 59.8 58.2 54.1 59.8 61.1 52.9 63.9 63.1 53.3 49.6 60.7 60.2 61.5 61.1 59.4
S4 kNN 59.5 57.9 69 63.2 64.5 50 65.3 57.4 62.4 70.7 60.7 71.5 73.1 51.7 64.5 48.3 57 62
Linear SVM 62.4 66.1 73.6 74.8 68.2 57.4 72.7 69 65.3 76 56.2 76 82.6 51.7 68.2 59.5 57 47.9
Quadratic SVM 66.1 66.1 71.9 73.6 68.2 62.4 74.8 69.4 71.1 72.3 57.4 76 78.9 51.7 66.1 53.3 63.2 50.4
S5 kNN 50 58.2 53.3 54.9 50.8 50.8 51.2 57 52 49.2 59.4 71.7 56.1 53.7 52.9 56.1 68.4 49.2
Linear SVM 47.1 49.6 60.2 46.3 49.2 43.4 45.9 50.4 52.9 46.3 50.8 46.7 44.3 47.1 49.2 50 48.4 50
Quadratic SVM 50 50 52.5 48 51.2 52 48.4 54.1 51.2 48.8 54.1 49.6 52.5 45.9 50 49.2 48 50.8
S6 kNN 50.8 50 60.7 51.6 54.5 49.2 53.7 56.6 49.6 61.5 54.1 59 54.9 46.7 43 52 52.9 52.5
Linear SVM 54.1 52.9 50.8 58.6 58.2 59 56.6 48 57 69.7 54.5 52.9 50 58.2 56.6 57 57 49.6
Quadratic SVM 49.6 49.6 61.1 59.8 61.1 57.4 55.3 55.7 54.9 70.1 51.6 54.1 53.7 52.9 52.9 55.7 56.1 53.3
S7 kNN 91.4 51.6 53.7 65.2 54.1 54.5 47.1 57 53.3 60.7 53.7 58.6 57.8 59.8 99.2 50.8 52 55.7
Linear SVM 95.1 59.8 57.8 74.2 60.7 60.7 61.1 56.1 61.9 67.6 51.6 61.1 60.2 62.3 52.9 49.2 63.1 62.7
Quadratic SVM 94.3 62.7 61.1 61.9 67.6 64.8 57.8 48.8 63.5 66.4 47.1 54.5 54.9 65.6 98.8 58.2 61.5 64.8
S8 kNN 61.5 61.1 49.2 55.3 58.2 50.8 53.3 50.8 54.1 57.4 69.7 66.4 66 71.7 67.2 49.2 49.2 50
Linear SVM 65.2 68 46.7 57.8 52.9 43.9 52 57.8 60.2 70.9 78.3 66.8 66.8 77.9 66.4 60.7 62.7 49.2
Quadratic SVM 63.9 65.6 50.8 58.2 61.1 55.7 50.4 51.2 60.2 70.9 76.2 68 69.7 64.3 73.8 54.9 58.6 49.2
S9 kNN 59 47.1 52.9 54.5 59.4 52.9 52.5 75.8 49.2 50 63.5 49.2 50.8 63.5 56.6 43.98 54.1 48.8
Linear SVM 51.2 52 53.7 53.3 52.5 50.8 51.6 68.4 51.2 52.9 48.8 51.2 49.2 53.7 55.3 50.4 51.6 49.2
Quadratic SVM 52.9 47.1 54.1 50.4 50.4 49.6 50 67.2 52 57.4 51.2 48 49.2 46.7 48.8 48 49.6 48.8
Ave kNN 59.8 57.1 57.9 54.3 57.9 53.0 52.67 57.8 54.15 58.28 58.15 60.49 60.9 58.2 61.09 51.96 60.42 54.98
Linear SVM 62.6 61.7 56.15 59.32 59.24 55.43 56.68 59.5 61.07 62.33 56.88 61.94 59.6 60.8 57.68 56.22 62.38 56.42
Quadratic SVM 63.7 61.2 56.07 59.1 61.23 57.75 56.32 59.0 60.98 64.69 56.46 61.36 60.4 57.4 63.32 55.96 60.89 57.73
REFERENCES

[1] Q. Wang, O. Sourina, and M. Nguyen, "EEG-based "Serious" Games Design for Medical Applications," 2010 International Conference on Cyberworlds, pp. 270-276, 2010.
[2] B. Rebsamen, E. Burdet, C. Guan, H. Zhang, C. L. Teo, Q. Zeng, et al., "A brain-controlled wheelchair based on P300 and path guidance," pp. 1101-1106, 2006.
[3] D. C. Hammond, "What is neurofeedback?," Journal of Neurotherapy, vol. 10, pp. 25-36, 2006.
[4] J. Carmena, M. Lebedev, R. Crist, J. O'Doherty, D. Santucci, D. Dimitrov, P. Patil, C. Henriquez, and M. A. L. Nicolelis, "Learning to control a brain–machine interface for reaching and grasping by primates," PLoS Biol. 1, pp. 193–208, 2003.
[5] J. Collinger, B. Wodlinger, J. Downey, W. Wang, E. Tyler-Kabara, D. Weber, A. McMorland, M. Velliste, M. L. Boninger, and A. B. Schwartz, "High-performance neuroprosthetic control by an individual with tetraplegia," Lancet 381, pp. 557–564, 2013.
[6] X. Huang, S. Altahat, D. Tran, and D. Sharma, "Human Identification with Electroencephalogram (EEG) Signal Processing," ISCIT, pp. 1021-1026, 2012.
[7] W. Mumtaz, L. Xia, A. Malik, and M. Yasin, "EEG Classification of Physiological Conditions in 2D/3D Environments Using Neural Network," Annual International Conference of the IEEE EMBS, Osaka, Japan, 2013.
[8] K. Li, X. Zhang, and Y. Du, "A linear SVM based classification of EEG for predicting the movement intent of human body," 10th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI), pp. 402-406, 2013.
[9] N. Sulaiman, Ch. Chee Hau, A. Abdul Hadi, M. Mustafa, and Sh. Jadin, "Interpretation of Human Thought Using EEG Signals and LabVIEW," IEEE International Conference on Control System, Computing and Engineering, pp. 384-388, 2014.
[10] H. R. Khairuddin, A. S. Malik, W. Mumtaz, N. Kamel, and L. Xia, "Analysis of EEG Signals Regularity in Adults during Video Game Play in 2D and 3D," 35th Annual International Conference of the IEEE EMBS, Osaka, Japan, pp. 2064-2067, 2013.
[11] H. R. Khairuddin, A. S. Malik, and N. Kamel, "EEG Topographical Maps Analysis for 2D and 3D Video Game Play," IEEE, 2014.
[12] S. Kim and D. Kim, "Differences in the Brain Waves of 3D and 2.5D Motion Picture Viewers," 2012.
[13] Y. Jin, O. J. Halloran, L. Plon, C. A. Sandman, and S. G. Potkin, "Alpha EEG predicts visual reaction time," Int J Neurosci 116, pp. 1035-1044, 2009.
[14] E. Callaway and R. S. Layne, "Interaction between the visual evoked response and two spontaneous biological rhythms: The EEG alpha cycle and the cardiac arousal cycle," Annal New York Acad Sci 112, pp. 421-431, 1964.
[15] https://www.youtube.com/watch?v=g7ps5TWzJ-o
[16] L. J. Cao, K. S. Chua, W. K. Chong, H. P. Lee, and Q. M. Gu, "A comparison of PCA, KPCA and ICA for dimensionality reduction in support vector machine," Neurocomputing, 55, pp. 321–336, 2003.
[17] M. Shaker, "EEG Waves Classifier using Wavelet Transform and Fourier Transform," World Academy of Science, Engineering and Technology, pp. 723-728, 2007.
[18] N. Manshouri and T. Kayikcioglu, "Classification of 2D and 3D videos based on EEG waves," Signal Processing and Communication Application Conference (SIU), 2016.
[19] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, second edition, Wiley-Interscience, 2001.

AUTHORS PROFILE

Negin Manshouri received the B.Sc. degree in Telecommunication Engineering from Islamic Azad University in 2010. She gained practical experience in the fields of microwave and mobile communication as an antenna designer. Her research interests include the design and analysis of various kinds of microstrip and ultra-wideband antennas, as well as biomedical engineering. She received the M.S. degree in Telecommunication Engineering from Islamic Azad University in 2013. She is currently working toward the Ph.D. degree in Biomedical Engineering at Karadeniz Technical University, Trabzon, Turkey.
Abstract— A brain-computer interface (BCI) is a device that enables direct communication between humans and computers by analyzing neural signals and transforming them into digital signals. This study presents a novel BCI system based on EEG signals recorded while the subject gazes at rotating vanes. The system aims to identify four different rotating vanes from EEG signals that represent commands in a limited visual space. The vanes have the following specifications: the first vane rotates slowly anti-clockwise, the second rotates fast anti-clockwise, the third rotates slowly clockwise, and the fourth rotates fast clockwise. All signals were obtained at the Department of Electrical and Electronics Engineering, Karadeniz Technical University, from 4 healthy human subjects aged between 25 and 32 years. Features are extracted from 1-sec epochs of the EEG using the Fast Fourier Transform (FFT). We use k-Nearest Neighbor (k-NN) and Support Vector Machine (SVM) algorithms to classify these features. Our results demonstrate that SVM was more accurate than k-NN. The proposed algorithm is efficient in the classification phase, with a mean accuracy of 81.51% for 4 subjects on 1-sec epochs.

Keywords—Brain-computer interface; Support Vector Machine; Electroencephalography; Feature extraction; Fast Fourier transform; k-nearest neighbor algorithm.

I. INTRODUCTION

A brain-computer interface (BCI) is a technology that provides a direct connection pathway between the brain of a physically disabled patient and an external device or computer. BCI research aims to give physically disabled patients a non-muscular way to communicate with others, such as a spelling system for speech or letter writing, and to control an external device such as an environmental control system.

BCI systems have developed rapidly in recent years, because they may be the only possible solution for people who are unable to communicate via conventional means due to severe motor disabilities [1, 2, 3]. Over the past few decades, noninvasive brain imaging methods have commonly been employed in BCIs, and electroencephalography (EEG) signals are often used in BCI systems in the field of biomedical engineering. EEG has advantages such as low risk, low cost, and easy measurability, so it can be applied and tested on a large human population [2, 4]. In addition, EEG is an electrical signal with high temporal resolution, generated by neuronal dynamics and recorded from the scalp. A BCI system therefore records these brain signals and translates them into artificial outputs or commands; in other words, features of EEG signals can be turned into actions in the real world.

Although BCI development is a very young research area, many BCI methods have been proposed in the literature. One of the best-known is Steady-State Visually Evoked Potentials (SSVEPs). When the retina is excited by a stimulus that flashes at a frequency higher than 6 Hz, the brain generates electrical activity at the same frequency and at its multiples or harmonics. The stimulus produces a stable Visual Evoked Potential (VEP) in the human visual system, called the Steady-State Visually Evoked Potential (SSVEP). In this paradigm, to produce such potentials, the user gazes at a target block that flickers (for example, using LEDs) at a certain frequency on a screen [5]. Flickering stimuli of different frequencies with constant intensity can elicit SSVEPs with maximum amplitude in the low (5-12 Hz), medium (12-25 Hz), and high (25-50 Hz) frequency bands, respectively [6, 7]. Another well-known method is the mental-task BCI, in which users think of different mental tasks; different tasks activate different areas of the brain, and multi-channel EEG recordings are needed to recognize the distinct EEG patterns that differentiate the tasks. In a recent study [8], researchers used EEG to control an electronic device: the classification of a three-class mental-task-based BCI was presented, using the Hilbert–Huang transform as feature extractor and a fuzzy particle swarm optimization with a cross-mutated artificial neural network as classifier. The three mental tasks were letter composing, arithmetic, and rolling a Rubik's cube forward, which corresponded to left, right, and forward commands for a wheelchair, respectively. Oddball paradigms have been used in BCIs to generate event-related potentials (ERPs), like the P300 wave, on targets selected by the user. The P300 visual evoked potential is another kind of EEG response, extracted around 300–600 ms after the onset of a visual stimulus. A P300 speller is based on this principle: detecting P300 waves allows the user to write characters. A new method for the detection of P300 waves was presented by Hubert et al. [9], based on a convolutional neural network (CNN). The proposed method has
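The SSVEP mechanism described above — a stimulus flashing at a frequency above 6 Hz evoking brain activity at that frequency and its harmonics — can be illustrated with a short self-contained sketch. This is a minimal illustration on a synthetic signal, not the authors' code; the 7 Hz stimulus frequency and 256 Hz sampling rate are assumptions chosen for the example.

```python
import numpy as np

fs = 256          # sampling rate in Hz (assumed for the example)
f_stim = 7.0      # flicker frequency above 6 Hz, as in the SSVEP paradigm
t = np.arange(fs) / fs  # one 1-sec epoch

# Synthetic "EEG": response at the stimulus frequency plus its 2nd harmonic and noise
rng = np.random.default_rng(0)
x = (np.sin(2 * np.pi * f_stim * t)
     + 0.5 * np.sin(2 * np.pi * 2 * f_stim * t)
     + 0.2 * rng.standard_normal(fs))

# Magnitude spectrum of the epoch; for a 1-sec window, bin k corresponds to k Hz
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(fs, d=1 / fs)

peak = freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC bin
print(peak)  # the dominant bin sits at the 7 Hz stimulus frequency
```

The second-largest peak sits at the 14 Hz harmonic, which is exactly the frequency-plus-harmonics structure an SSVEP detector looks for.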
D. Data Analysis
level of 25%). Many multi-class classifiers have recently been used in difficult pattern recognition problems with great success. Support Vector Machine (SVM) is a popular machine learning method for classification, regression, and other learning tasks. The k-NN algorithm is another multi-class classifier. The proposed algorithm was compared with multi-class SVM and the k-NN algorithm. A summary of these algorithms is given below.

A. k-NN Algorithm

The k-NN is one of the easiest algorithms to implement among the existing classification algorithms. First, the number of nearest neighbors to the unknown sample, k, must be determined. The Euclidean distance is commonly used to find the nearest neighbors to the sample. Then the label that occurs most often among these neighbors is determined, and the unknown sample is given this majority label. In binary classification problems it is beneficial to use odd values of k, because they avoid ties when deciding on a label [17].

In this study, K-fold cross validation (K-FCV) was used to determine the optimum k value. The minimum number of epochs in the training set for each speed was 120, so the optimum k value was searched in the interval between 1 and 50 with a step size of 2.

B. Support Vector Machine (SVM) Algorithm

Among the many methods for solving classification problems, the support vector machine (SVM) is one of the most popular supervised learning algorithms due to its generalization ability [18]. SVM adopts a nonlinear kernel function to transform the input data into a higher-dimensional feature space, where the problem can be formulated as a quadratic optimization problem. In the iterative learning process of SVM, the optimal hyperplane with the maximum margin between the classes in the higher-dimensional feature space is searched for. We utilized SVM with a radial basis function kernel, chosen because it has fewer hyperparameters than other kernels. This kernel function is specified by the scaling factor σ. To find the best σ value, we searched the interval between 0.1 and 4.5 with a step size of 0.2; the optimum σ value was again determined using K-FCV.

V. RESULTS

As mentioned above, we have a four-class classification problem in this study. The classification was carried out in two stages. In the first stage, to reduce the number of channels and to understand which channels perform best, we classified each of the 18 channels separately. In the second stage, the seven channels with the highest accuracy in the first stage were selected, and by using these channels together we improved the performance of the proposed method.

A. One-Channel Classification

In the first stage, the EEG signal of a channel was divided into 1-sec epochs. After normalization, the 960 epochs were divided into training and testing sets. For each epoch in the training set, five features were extracted using the FFT, and the classifiers were trained separately, using K-FCV to determine the classifier parameters. After training, five FFT features were extracted for each epoch in the testing set, these features were classified, and the classification result (CR) was calculated for each classifier separately. A flow chart of the designed system is shown in Figure 3. The classification result is defined as the percentage of epochs classified correctly over the size of the testing set. To verify the results, this procedure was repeated 10 times with different distributions of the training and testing sets; in each run the training and testing sets were selected randomly, and the classifier parameters (k for k-NN and σ for SVM) were calculated for each training set using K-FCV. The means of the classification results over these 10 runs (for each channel separately) and the standard deviations for the k-NN classifier are provided in Table 2. As seen in Table 2, the mean of each channel over the four subjects was calculated. Similarly, the results of multi-class SVM classification for each channel are shown in Table 3.

B. Multi-Channel Classification

In the second stage, the seven channels with the maximum accuracy in the first stage (for each classifier separately) were selected and used together to improve the performance of the proposed method. In this case we have 35 features (7×5) for each 1-sec epoch. These channels are Fp1, Pz, T3, P4, O1, T4 and T6 for both classifiers; in this way the channel-reduction process is carried out. All methods used in the first stage were also used in this stage. We additionally selected the 5, 4, 3 and 2 channels with the maximum accuracy in the first stage. The means and standard deviations of the classification accuracy for k-NN and SVM are shown in Table 3 and Table 4, respectively. As the tables show, the best seven channels are the same for the two classifiers, which indicates that these channels play an important role in our study; however, the best two or three channels differ between classifiers. For example, channels Fp1 and Pz perform better with the k-NN classifier, while channels T3 and P4 are better with the SVM classifier.

On the other hand, the features of all channels were also used for each epoch, giving 90 (18×5) features per epoch. The results of these classifications are also shown in the tables. The best classification accuracy is about 81.51%, obtained when all channels were used with the SVM classifier.
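The per-channel pipeline described in this section — five FFT features per 1-sec epoch, a Euclidean-distance k-NN with majority voting, and K-fold cross validation to choose k from 1–49 in steps of 2 — can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation; the choice of five frequency-band features and the synthetic class structure are assumptions.

```python
import numpy as np

def fft_features(epoch, fs=256, bands=((1, 4), (4, 8), (8, 13), (13, 25), (25, 45))):
    """Five features per 1-sec epoch: mean FFT magnitude in five bands (assumed bands)."""
    mag = np.abs(np.fft.rfft(epoch))
    freqs = np.fft.rfftfreq(len(epoch), d=1 / fs)
    return np.array([mag[(freqs >= lo) & (freqs < hi)].mean() for lo, hi in bands])

def knn_predict(train_x, train_y, x, k):
    """Euclidean-distance k-NN with majority vote, as described in Section A."""
    d = np.linalg.norm(train_x - x, axis=1)
    votes = train_y[np.argsort(d)[:k]]
    return np.bincount(votes).argmax()

def kfcv_select_k(train_x, train_y, ks, folds=5):
    """Pick k by K-fold cross validation (the paper searches 1..50 with step 2)."""
    idx = np.arange(len(train_x))
    best_k, best_acc = ks[0], -1.0
    for k in ks:
        accs = []
        for f in range(folds):
            val = idx[f::folds]
            trn = np.setdiff1d(idx, val)
            preds = np.array([knn_predict(train_x[trn], train_y[trn], x, k)
                              for x in train_x[val]])
            accs.append(np.mean(preds == train_y[val]))
        if np.mean(accs) > best_acc:
            best_k, best_acc = k, float(np.mean(accs))
    return best_k

# Synthetic stand-in for one channel: 4 vane classes of 1-sec epochs at 256 Hz,
# each class given one dominant rhythm (a toy assumption, not real EEG)
rng = np.random.default_rng(1)
fs, per_class = 256, 60
t = np.arange(fs) / fs
epochs, labels = [], []
for cls, f in enumerate((3, 6, 10, 20)):
    for _ in range(per_class):
        epochs.append(np.sin(2 * np.pi * f * t) + 0.3 * rng.standard_normal(fs))
        labels.append(cls)
X = np.array([fft_features(e, fs) for e in epochs])
y = np.array(labels)

# Random split into training and testing sets, as in the paper
perm = rng.permutation(len(X))
tr, te = perm[:len(X) // 2], perm[len(X) // 2:]
k = kfcv_select_k(X[tr], y[tr], ks=range(1, 51, 2))
acc = float(np.mean([knn_predict(X[tr], y[tr], x, k) == yy
                     for x, yy in zip(X[te], y[te])]))
print(k, acc)  # chosen k and the classification result (CR) on held-out epochs
```

An RBF-kernel SVM with σ searched over 0.1–4.5 in steps of 0.2 would slot into the same K-FCV loop in place of `knn_predict`.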
acknowledge, the end user requires security for data that will be stored anywhere on the cloud. The end user may want to share images, comments, etc., and does not wish his/her identity to be known by others. Even so, the user must be able to prove to the other users that she/he is a verified user who stored the data, without revealing that identity.

Previous research on access control in the cloud was not decentralized in nature, and even some decentralized or fragmented approaches do not support user identification. Privacy-preserving authenticated access control for the cloud is supported by previous work, but it takes a centralized approach in which a single key distribution center distributes attributes and secret keys to the users present in the system.

II. LITERATURE SURVEY

Existing schemes use symmetric-key or ABE approaches that do not support user identification. Past research provided privacy-preserving authenticated access control in the cloud. A decentralized approach has been proposed in existing work, but that scheme does not authenticate users who need to remain anonymous while accessing data. Earlier work proposed a distributed access control mechanism in the cloud, but that scheme does not provide authentication of the users present in the system. Another important limitation was that only the owner of a document, i.e., its creator/uploader, could write to the stored record; other users were not able to write to a document stored on the cloud. In past research, write access was given only to the owner/creator of a particular document and not to viewers/readers, which was a disadvantage.

Cloud storage servers are liable to suffer from Byzantine failure, whereby a storage server can fail in arbitrary ways. Research has proposed letting the authority revoke user attributes with little effort [3]. A cloud can also be affected by server-colluding attacks and data tampering: in a server-colluding attack, the adversary can compromise storage servers so that it can exchange or modify data records as long as they remain internally consistent. Data encryption is expected to provide secure data storage on the cloud. However, the data is often changed, and this dynamic property should be taken into account when designing efficient secure storage schemes. Efficient search on encrypted data is also an important concern in clouds: the cloud should not learn the query or the operations performed on the data, but it should be able to return the set of records that satisfy the query. This is achieved by means of searchable encryption.

The key distribution center is a centralized approach in which a single Key Distribution Center distributes secret keys and attributes to all the users present. If it is centralized, the single Key Distribution Center is a single point of failure; with the failure of a single secret key, the whole system can collapse. A Key Distribution Center is also hard to maintain because of the large number of clients present in the cloud for data sharing or storage. This underlines that clouds should take a decentralized approach for distributing secret keys. Nowadays it is quite natural for clouds to have many key distribution centers [4] at different remote locations in the network.

The ABE scheme explained in [5] has a set of attributes defined with a unique ID, and one or more classes are defined. In KP-ABE [6], the sender has an access policy to encrypt the data; a writer whose attributes and keys have been revoked cannot write back stale information. The receiver obtains attributes and secret keys from the attribute authority and can decrypt the data if it has matching attributes. In the ciphertext-policy variant, CP-ABE, the receiver holds the access policy, presented in the form of a tree with attributes as leaves and a monotonic access structure with OR, AND and other threshold gates [7] [6].

All these approaches take a centralized approach and allow only one Key Distribution Center, which is a single point of failure. Chase proposed a multi-authority attribute-based encryption scheme, in which several Key Distribution Center authorities (coordinated by a trusted authority) distribute attributes and secret keys to users. A multi-authority ABE protocol was later studied which required no trusted authority, but it required each user to have attributes from all of the Key Distribution Centers. Recently, [9] proposed a fully decentralized ABE in which users can have zero or more attributes from each authority and no trusted server is required. In all these cases, decryption at the user's end is computation-intensive, so these schemes may be inefficient when users access data using their mobile devices. To address this issue, research has proposed outsourcing the decryption task to a proxy server, so that the user can operate with minimal resources (for example, handheld devices). However, the presence of a single proxy and Key Distribution Center makes this less effective than decentralized approaches. Neither of these approaches offers a way to authenticate users anonymously, yet verified users may need to remain anonymous while accessing the cloud [5]. Recently, several techniques have taken a decentralized approach and provide authentication without revealing the identity of the users; as mentioned in the previous section, however, they are prone to replay attacks.

III. SYSTEM ARCHITECTURE

The framework is an architecture for distributed access control of data stored in the cloud, such that only verified users with the right attributes can access it. It is a decentralized access control scheme with anonymous authentication, which prevents replay attacks and supports user revocation. The cloud does not know the identity of the user who stores data, but only
verifies the user's credentials. Key distribution is done in a decentralized way. The costs are comparable to existing centralized approaches, and the expensive operations are performed by the cloud. As figure 3.1 shows, in the system architecture of the secure framework, the user is the up-loader under an access policy.

A. Controllers of the Decentralized Access Control
1. Access control system
2. Access policy management
3. Anonymous executive
4. User revocation
5. Security control

On revocation, the owners should change the stored data and send the updated information to the remaining users.

5) Security control
It secures data on the basis of the access policy and the access control technique. Security controls are safeguards or countermeasures to avoid, counteract or minimize security risks relating to personal property. We only consider how to audit the integrity of shared data in the cloud under static group keys; that is, the group key is pre-defined before shared data is created in the cloud, and the membership of users in the group is not changed during data sharing. The original user is in charge of deciding who can share her data before outsourcing the data to the cloud.
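The static-group-key integrity audit described above can be sketched with a message authentication code: every member holding the pre-defined group key can recompute and compare the tag over the shared data. This is a minimal sketch under stated assumptions, not the paper's construction; the 32-byte key and the choice of HMAC-SHA256 are ours.

```python
import hmac
import hashlib

# Static group key, pre-defined before any shared data is created
# (assumed 32-byte key; the paper does not specify key format or MAC algorithm)
group_key = b"\x01" * 32

def tag(shared_data: bytes) -> bytes:
    """Integrity tag over shared data, verifiable by any group member."""
    return hmac.new(group_key, shared_data, hashlib.sha256).digest()

data = b"shared record"
t = tag(data)

# Any member holding the static group key can audit integrity:
print(hmac.compare_digest(t, tag(data)))         # True: data unchanged
print(hmac.compare_digest(t, tag(data + b"x")))  # False: data was modified
```

Because the group key is static, revoking a member would require re-keying the whole group, which is exactly why the text restricts the audit to unchanged group membership.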
B. Set Theory
to modify or any other process performed on creator’s Hashing algorithm which works in one way
document. Writer who is authorized can modify or rewrite the encryption fashion used to maintain integrity of signature. As
creator’s document. hash algorithm are sensitive for a small change, even a single
space is added to Di message digest Mi will change hence can
Process (Ux, Ai, Ci) detect any change done on Di and thus maintain its integrity.
Ux – Authorized User Takes as info a message of self-assertive length and produces
Ai – Authorized Data as yield a 128 piece unique mark or message summaries of the
Ci - Ciphertext data. It is guessed that it is computationally infeasible to
deliver two messages having the same message digest.
X- Input of System: Data as Input { N,Type,Id,CD}.
Expected where a huge document must be compacted in a
Y- Output of System: Efficiency and accuracy providing the protected way before being encoded with a private key under
data security and authorization, e.g. an open-key cryptosystem such as PGP.

T - Set of steps to be performed, from verification to uploading the data on the cloud.

Output: (cipher text, user revocation, decrypted data)

B = ciphertext (∑ Ux, Pi)

where Ux = number of users involved in data storage and Pi = authorized process (user revocation).

IV. SECURE DATA STORAGE ON CLOUD USING DECENTRALIZED ACCESS CONTROL

TABLE I
Scheme Comparative Result

Scheme   | Centralized/Decentralized | Read/Write Access | Secure data storage | Type of Access Control                                      | User Revocation
[12]     | Centralized               | 1-W-M-R           | No authentication   | ABE                                                         | Yes
[13]     | Decentralized             | M-W-M-R           | Authentication      | ABE                                                         | Yes
Proposed | Decentralized             | M-W-M-R           | Authentication      | Proposed access control (encryption with digital signature) | Yes

As the system architecture (Fig. 3.1) shows, signature generation takes the input file and the access policy as input. Digital signature generation keeps a user's identity anonymous to other users on the cloud; every user identity is checked through the attached digital signature.

Signature Generation for File Encryption:

Access Policy for Creator:
• The user logs in with his specific access policy, R1 = up-loader. The creator's access policy requests the KGC (key generation center) to generate keys.
• KGC: the key generation center, which creates random keys for the creator, KGC: {k1, k2, k3, ...}.
• The proposed system is based on decentralized access control for storing data, so more than one KGC is used for better performance. The KGCs are placed in a distributed area, and whatever keys the KGCs generate are distributed to the access policy, i.e. the up-loader.
• The up-loader can select any key generated by a KGC and proceed to select a file F1 for uploading.
• The file F1 to be uploaded to the cloud needs a generated signature: Di is the signature for the file, which keeps the user's identity anonymous to other users and also proves the user to be valid.
• The digital signature Di is generated and attached, and the file Fi is encrypted with the encryption algorithm [8] to generate the cipher text Ci.
• The user then requests the CSP to upload the cipher text Ci to the cloud.
• The cipher text Ci is uploaded to the cloud.

V. RESULTS

We consider other encryption algorithms and compare their performance against the proposed system's algorithm. The proposed scheme generates a digital signature and applies the encryption algorithm with the decentralized Key Distribution Center and KGCs for secure storage. The scheme comparison is listed in Table I.

The criteria are S1 - fine-grained access control, S2 - data confidentiality, S3 - scalability, and S4 - user revocation. This comparison is listed in Table II.
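The upload flow above (KGC key generation, signature Di, encryption to cipher text Ci) can be sketched in Python. This is a minimal illustration only: HMAC-SHA256 stands in for the paper's digital-signature scheme, a toy SHA-256 counter keystream stands in for the Blowfish cipher of [8], and all function names are ours, not the paper's.

```python
import hashlib
import hmac
import os

def kgc_generate_keys(n=3):
    """Stand-in KGC: generate a set of random symmetric keys {k1, k2, k3, ...}."""
    return [os.urandom(16) for _ in range(n)]

def sign(data, key):
    """Stand-in for the digital signature Di (HMAC-SHA256, not the paper's scheme)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def keystream_crypt(data, key):
    """Toy SHA-256 counter keystream standing in for the Blowfish cipher of [8].
    XOR is its own inverse, so the same call encrypts and decrypts."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

def upload_flow(file_bytes):
    """Sign file F1, then encrypt Di || F1 into cipher text Ci for upload to the CSP."""
    keys = kgc_generate_keys()                  # KGC generates random keys
    k1 = keys[0]                                # up-loader selects any generated key
    di = sign(file_bytes, k1)                   # digital signature Di for the file
    ci = keystream_crypt(di + file_bytes, k1)   # cipher text Ci
    return k1, ci                               # Ci is what gets uploaded to the cloud

key, ci = upload_flow(b"example file contents")
plaintext = keystream_crypt(ci, key)            # decryption by an authorized user
assert plaintext[32:] == b"example file contents"
assert hmac.compare_digest(plaintext[:32], sign(b"example file contents", key))
```

Verifying the recovered signature against the recovered file is what lets a reader check validity without learning the up-loader's identity, mirroring the anonymity claim in the scheme.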
447 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
TABLE II
Criteria Results

Criteria | ABE | KP-ABE | Proposed
S1       | N   | Y      | N
S2       | Y   | Y      | Y
S3       | N   | N      | Y
S4       | Y   | Y      | Y

The graph shows that the proposed ABE with Blowfish gives good efficiency in terms of time, as its running time is lower for the same file F. Time is measured in milliseconds.
The proposed scheme is more secure, provides the required revocation, and is faster than the other schemes.

VI. CONCLUSION

The proposed system provides secure data storage on the cloud with anonymous authentication. It achieves scalability and high performance and prevents replay attacks. The identity of the user who stores information is not known to the cloud; the scheme only verifies the user's credentials. Accessing cloud data as a verified user in a decentralized network is beneficial and robust, and the overall communication and storage cost compares favorably with non-decentralized techniques. The proposed scheme is more secure and robust, as the performance results show, and provides high performance with minimum storage requirements.
Fig. 4.1. Storage Performance

Fig. 4.1 shows a comparison graph of the storage required by the algorithms to store the encrypted form of a document. The proposed algorithm, ABE with Blowfish, takes less storage irrespective of the number of attributes input by the data owner; the size of the cipher text does not increase exponentially. (Blue: existing system; green: proposed system.) Storage required is represented in bytes.

Fig. 4.2. Time Performance

Fig. 4.2 shows a comparison graph of the time required by each algorithm to compute the encrypted form of a document, i.e. the algorithm running time. The proposed algorithm, ABE with Blowfish, takes less time irrespective of the number of attributes input by the data owner; the size of the cipher text does not increase exponentially. (Blue: existing system; green: proposed system.)

References
[1] H. Li, Y. Dai, L. Tian, and H. Yang, "Identity-Based Authentication for Cloud Computing," Proc. First Int'l Conf. Cloud Computing (CloudCom), pp. 157-166, 2009.
[2] S. Ruj, A. Nayak, and I. Stojmenovic, "DACC: Distributed Access Control in Clouds," Proc. IEEE TrustCom, 2011.
[3] C. Wang, Q. Wang, K. Ren, N. Cao, and W. Lou, "Toward Secure and Dependable Storage Services in Cloud Computing," IEEE Trans. Services Computing, Apr.-June 2012.
[4] S. Seenu Iropia and R. Vijayalakshmi, "Decentralized Access Control of Data Stored in Cloud Using Key-Policy Attribute Based Encryption," Int'l Journal of Inventions in Computer Science and Engineering, ISSN (print): 2348-3431, 2014.
[5] A. Sahai and B. Waters, "Fuzzy Identity-Based Encryption," Proc. Ann. Int'l Conf. Advances in Cryptology (EUROCRYPT), pp. 457-473, 2005.
[6] G. Wang, Q. Liu, and J. Wu, "Hierarchical Attribute-Based Encryption for Fine-Grained Access Control in Cloud Storage Services," Proc. 17th ACM Conf. Computer and Comm. Security (CCS), 2010.
[7] J. Bethencourt, A. Sahai, and B. Waters, "Ciphertext-Policy Attribute-Based Encryption," Proc. IEEE Symp. Security and Privacy, pp. 321-334, 2007.
[8] A. Alabaichi, "Security Analysis of Blowfish Algorithm," Proc. Second Int'l Conf. Informatics and Applications (ICIA), 2013.
[9] M. Chase, "Multi-Authority Attribute Based Encryption," Proc. Fourth Conf. Theory of Cryptography (TCC), pp. 515-534, 2007.
[10] S. Yu, C. Wang, K. Ren, and W. Lou, "Attribute Based Data Sharing with Attribute Revocation," Proc. ACM Symp. Information, Computer and Comm. Security (ASIACCS), 2010.
[11] H. K. Maji, M. Prabhakaran, and M. Rosulek, "Attribute-Based Signatures: Achieving Attribute-Privacy and Collusion Resistance," IACR Cryptology ePrint Archive, 2008.
[12] F. Zhao, T. Nishide, and K. Sakurai, "Realizing Fine-Grained and Flexible Access Control to Outsourced Data with Attribute-Based Cryptosystems," Proc. ISPEC, LNCS vol. 6672, Springer, pp. 83-97, 2011.
[13] Shraddha Mokle and Nuzhat Shaikh, "Decentralized Access Control Schemes for Data Storage on Cloud," Computer Science and Engineering, 6(1), pp. 1-6, 2016. doi: 10.5923/j.computer.20160601.01
Abstract

The MapReduce model processes large-scale data by exploiting parallel map tasks and reduce tasks. The network traffic generated in the middle phase, i.e. the shuffle, is usually ignored when improving the performance of MapReduce jobs, yet reducing this network traffic helps improve performance significantly. Usually a hash function is used to partition the intermediate data among reduce tasks, which is not traffic-efficient because the network topology and the data size associated with each key are not taken into consideration. A data partition algorithm is proposed to decrease the network traffic cost of a MapReduce job. We also consider the aggregator placement problem, in which each aggregator is responsible for reducing the data. Data aggregation is carried out by considering various terms such as word count, word frequency, document frequency, TF, and IDF. Considering all these factors, the data uploaded by the user is aggregated, which helps reduce the processing time compared with the processing time without aggregation.

1. Introduction

MapReduce has evolved as an efficient model to process data in huge data centers. Data sets are generated by MapReduce, which is a programming model. A set of intermediate key/value pairs is generated by processing a key/value pair with the map function. The reduce function is responsible for merging all the corresponding values linked to the same intermediate key on a machine. The run-time system is responsible for storing information regarding the input data partition and for deciding which particular program is to be executed, i.e. scheduling among the different machines; inter-machine communication and machine failures are also managed by the run-time system.

Computation in MapReduce basically has two phases, map and reduce. First, in the map phase, the input is rearranged so that the required computation is achieved by applying a particular algorithm to small parts of the data. The later phase is reduce, as mentioned earlier, and both phases are performed in parallel on large-scale data. A MapReduce job should be considered as consisting of three phases rather than only
map and reduce when considering MapReduce system performance.

The "shuffle" phase is the one occurring in between the two phases, i.e. map and reduce, and can be referred to as the data transfer phase. The output given by the mappers is combined and sent to the compute nodes that perform the corresponding reduce operations; this transfer is scheduled in the shuffle phase. MapReduce performance relies mostly on the manner in which the tasks associated with the map, reduce, and shuffle phases are scheduled.

While current techniques deal with scheduling workflow implementations on grids, similar techniques are not helpful for scheduling MapReduce jobs. The user has to initially define the position of the reducers, which depends on two factors: latitude and longitude. Before allocating the data to a reducer, the data is initially partitioned as mentioned above and later aggregated before being allocated to the nearest reducer. Different terms like word count, word frequency, document frequency, TF, and IDF are used for aggregation. Considering all these factors, the data uploaded by the user is aggregated, which helps to reduce the processing time compared to the processing time without aggregation. Another added dispute that occurs while using a MapReduce job is big data.

2. Present State of Art

Data locality means that, when map tasks are allocated, we take into consideration the closest machines that are able to store the input data chunks allocated to the map tasks. A local machine is considered for each and every task. When the data chunk linked to a particular task is stored locally, the task is said to be a local task. The machine where a task is allocated remotely is referred to as the remote machine for that task, and the task on that remote machine is called a remote task. Referring to the term locality, we can also consider the portion of tasks that run on local machines.

Advancing locality can reduce both the processing time of map tasks and the network traffic load, given that fewer map tasks need to obtain data remotely. However, allocating tasks only to local machines might result in uneven allocation of tasks among the machines: a few machines might carry heavy network traffic while the remaining ones stay idle. For achieving a proper balance between data locality and load balancing, a map-scheduling algorithm (or a simple scheduling algorithm) that assigns map tasks to machines is considered.

Scheduling is one of the most crucial aspects of MapReduce because of the various issues that have been noticed, and additional issues arise while scheduling in MapReduce. To defeat these problems, numerous algorithms considering different techniques and approaches have been examined and proposed. Several of these algorithms concentrate on escalating the quality of data locality, while a few help to guarantee synchronized processing. Several of them cover implementations that decrease the overall processing time.

Several implementations of the MapReduce interface have been designed. The correct selection of implementation thoroughly relies on their
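As a concrete reference point, the map/shuffle/reduce flow and the default hash partitioning criticized above (it ignores topology and per-key data size) can be sketched in Python; the names and data are illustrative, not from the paper.

```python
from collections import defaultdict

def map_phase(text):
    """Map: emit an intermediate (key, value) pair per word (word count)."""
    return [(word, 1) for word in text.split()]

def shuffle(pairs, num_reducers):
    """Default hash partitioning: key -> hash(key) mod R, which is exactly the
    traffic-oblivious assignment the paper criticizes (no topology awareness)."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        partitions[hash(key) % num_reducers][key].append(value)
    return partitions

def reduce_phase(partition):
    """Reduce: merge all values linked to the same intermediate key."""
    return {key: sum(values) for key, values in partition.items()}

docs = {1: "big data map reduce", 2: "map reduce map"}
intermediate = [kv for text in docs.values() for kv in map_phase(text)]
results = {}
for part in shuffle(intermediate, num_reducers=2):
    results.update(reduce_phase(part))
# results == {'big': 1, 'data': 1, 'map': 3, 'reduce': 2}
```

A traffic-aware partitioner would replace `hash(key) % num_reducers` with an assignment that also weighs the data volume per key and the network distance to each reducer, which is the direction the proposed data partition algorithm takes.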
Algorithm :
• Combine module: the shuffling cost is reduced by performing a local reduce process on the key/value pairs generated by the mapper; hence this module can be considered a particular type of reducer.

The processing time is compared with the one where random aggregation is carried out. The processing time is achieved by considering the TF-IDF values and the several different occurrences in the data. The processing time is considerably smaller using aggregation, as compared to the processing time without aggregation.
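The aggregation terms used above (word count, document frequency, TF, IDF) can be computed as in the following minimal sketch; the corpus and function names are illustrative, not the paper's implementation.

```python
import math
from collections import Counter

def tf(term, doc):
    """Term frequency: occurrences of `term` divided by the document length."""
    return doc.count(term) / len(doc)

def df(term, corpus):
    """Document frequency: number of documents containing `term`."""
    return sum(1 for doc in corpus if term in doc)

def idf(term, corpus):
    """Inverse document frequency: log(N / df), 0 if the term never occurs."""
    d = df(term, corpus)
    return math.log(len(corpus) / d) if d else 0.0

corpus = [["map", "reduce", "shuffle"],
          ["map", "map", "data"],
          ["data", "locality"]]

word_count = Counter(w for doc in corpus for w in doc)   # overall word count
score = tf("map", corpus[1]) * idf("map", corpus)        # TF-IDF of "map" in doc 1
```

Aggregating on such scores lets frequent, low-information keys be merged early, which is what shrinks the data volume reaching the shuffle phase.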
Abstract- In shipbuilding calculations and in reviewing marine phenomena, the numerical solution of bi-phase (two-phase) fluid flow plays a major role, and many studies are conducted in seas, lakes, and canals. The numerical solution of this flow is described by the Navier-Stokes, continuity, and surface-fitted equations. Because the bi-phase physical properties of the flow require discretization of the governing equations, coupling the above equations is important. In this paper, the distribution of the bi-phase fluid is obtained over the whole calculation range by solving the surface-fitted equations with an interface capturing method; therefore, the equations governing the fluid flow are solved for a bi-phase fluid. For coupling the velocity and pressure fields, the "fractional step method" was used. A wise selection in every part of the numerical solution algorithm, together with command of the available options conforming to the requirements of the problems at hand, resulted in an efficient numerical method that is the basis of this paper, which aims to provide a platform for this issue.

Keywords- bi-phase flow, fractional step method, interface capturing method
I. INTRODUCTION

Given the importance of multi-phase processes in different industries such as oil, gas, and petrochemical companies, as well as national holdings such as the Water & Sewage Company and the Ministry of Power, it is very important to know more about this issue: properly predicting the flow system, reducing the risks from phase changes such as cavitation, modeling multi-phase flows, and knowing their performance. Predicting the properties of the different phases, such as temperature and pressure, and detecting how their interface changes considerably helps in reviewing and analyzing multi-phase systems. On the other hand, command of this field reduces the cost of numerical simulations and improves prediction accuracy at critical points. For this purpose, numerical simulation of this type of flow can be very useful for determining the optimal operating conditions of devices. Simulating multi-phase flow with the help of computational fluid dynamics requires models in which the volumetric fraction of the fluid can be accurately represented. For this reason, modeling the interface between multiple phases, particularly in very sensitive situations such as detecting the systems governing it, is very complicated. Like all numerical modeling, producing the meshed geometry is one of the requirements for simulating multi-phase flows; here, desirable gridding according to the nature of the governing equations and proper zoning of the solution field are very important for predicting the interface. After gridding and selecting an appropriate method, by determining boundary conditions conforming to the flow physics, it is possible to analyze the multi-phase flow process. By conducting a proper analysis, one can investigate parameters such as the volumetric fraction of the fluid, pressure, and velocity.

Fluid flow occurs together with a free surface in most scientific problems. Modeling such flow is one of the common issues in computational fluid dynamics, and many studies are being conducted in this field. The problem discussed comprises two general parts: solving the equations governing the fluid flow (the Navier-Stokes and continuity equations) and free surface modeling.
A. Navier-Stokes Equations

Solving the Navier-Stokes equations requires choosing an algorithm for coupling the velocity and pressure fields. Methods for solving the fundamental governing equations (three linear momentum equations and a continuity equation) can be divided into two approaches, i.e. simultaneous solution and sequential solution. The simultaneous solution approach comes with a high computational cost, particularly in big problems; in comparison, there are approaches in which the continuity condition is satisfied by solving the equation once in each time step, so the sequential solution approach has been more developed. The main issue in the sequential solution approach is the lack of an explicit equation for pressure. For this, approaches such as (1) predictor-corrector, (2) artificial compressibility, and (3) projection or fractional step have been formed.

Predictor-Corrector approach: starting from an initial guess of the velocity and pressure fields, the equations are solved and, over several iterations of sequentially correcting velocity and pressure, a field is obtained that satisfies all governing equations and continuity over the calculation field.

Artificial Compressibility approach: this approach is based on the idea of converting the equations governing incompressible flow (of elliptic-parabolic nature) to equations governing compressible flow (of hyperbolic nature) and using the approaches developed for such flows. In this case, a pseudo-time pressure term scaled by the artificial compressibility parameter β is added to the continuity equation.

Projection or Fractional Step approach: in this method, the linear momentum equations are solved while applying the continuity condition in a few steps, with no need for iteration. Since no iteration among the governing equations is needed and transient flows govern the ocean environment, the fractional step approach is considered an appropriate choice; thus, owing to the increased solution speed, the fractional step approach was selected [1]. This approach was recommended by Chorin [2, 3] and developed by others [4, 5, 6, 7].
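As a sketch of the splitting described above, a classical Chorin-type projection (fractional step) scheme, consistent with [2, 6], can be written as follows; the exact variant used in the paper may differ.

```latex
% Chorin-type projection / fractional step (illustrative form)
\begin{aligned}
  \frac{\mathbf{u}^{*} - \mathbf{u}^{n}}{\Delta t} &=
      -(\mathbf{u}^{n}\cdot\nabla)\mathbf{u}^{n}
      + \nu \nabla^{2}\mathbf{u}^{n}
      && \text{(intermediate velocity, pressure omitted)} \\
  \nabla^{2} p^{\,n+1} &= \frac{\rho}{\Delta t}\,\nabla\cdot\mathbf{u}^{*}
      && \text{(pressure Poisson equation)} \\
  \mathbf{u}^{n+1} &= \mathbf{u}^{*}
      - \frac{\Delta t}{\rho}\,\nabla p^{\,n+1}
      && \text{(projection so that } \nabla\cdot\mathbf{u}^{n+1}=0\text{)}
\end{aligned}
```

Each time step thus enforces continuity once, with no inner iteration between the momentum and pressure equations, which is the speed advantage cited above.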
B. Free surface

Numerical approaches in this field can be divided into two main categories [8]:
• Interface Tracking Methods
• Interface Capturing Methods

In the interface tracking methods, a sharp interface is provided by marking the free surface with particles on the interface or by using surface fitting methods [9]. In the first approach, particles moved by the local fluid velocity are followed in a Lagrangian frame [10]; this approach is also used for 3D problems [11]. In the second approach, the calculation grid follows the free surface to satisfy the kinematic and dynamic conditions [12]. Although these approaches determine the accurate position of the free surface, the algorithms used encounter fundamental limitations in modeling complicated geometries such as wave breaking [13]. In comparison to the aforementioned approaches, there are interface capturing methods, which model the free surface by using particles on the interface of the fluid, by solving a convection equation for the capturing ratio, or by calculating the distance to the free surface. Interface capturing methods are generally able to model complicated interface geometries and great deformations; for this reason, they are considered a desirable choice for reviewing bi-phase flow. In this state, the interface of both fluids is considered as a discontinuity in the physical properties of the calculation range. In this case, most schemes based on interface capturing model the free surface by solving a convection equation,
for the distribution of the two fluid phases in each cell of the calculation grid, thereby obtaining the volumetric ratio. Proper discretization of the convection equation of the free-surface quantity is very important, with considerable studies conducted on it:
A) Marking the free surface by particles on the interface
B) Conforming the computational grid
C) Using particles in the fluid
D) Calculating the volumetric or capturing ratio of the two phases.
\frac{\partial}{\partial t}\int_{V_P} u_k \, dV + \int_{A_P} u_k (\mathbf{u}\cdot\mathbf{n}) \, dA = \int_{A_P} \nu \, \nabla u_k \cdot \mathbf{n} \, dA - \frac{1}{\rho}\int_{A_P} p \, \mathbf{i}_k \cdot \mathbf{n} \, dA + \int_{V_P} g_k \, dV \qquad (2)
Where "u" is the velocity vector and "n" is the unit vector perpendicular to the cell surface. Also, assuming k = 1, 2, 3, uk are the velocity components (u, v, w) respectively, xk are the spatial components (x, y, z) respectively, and gk are the gravity components (gx, gy, gz) respectively. In addition, ρ indicates density, ν indicates kinematic viscosity, Vp is the volume of cell P, and Ap is the surface of this cell. The first term in the above equation is called the "unsteady term". The second term in "(2)" indicates the mechanism of convection of linear momentum (the "convection term"), and the third term indicates the mechanism of the "diffusion term" in this convection. The fourth term indicates surface forces (pressure) and the fifth term indicates body forces (gravity). The density and viscosity of the effective fluid in any cell are calculated by "(3)":
ρ = α ρ1 + (1 − α) ρ2,   ν = α ν1 + (1 − α) ν2   (3)

where indices 1 and 2 indicate the two phases of the fluid. The volume ratio α (the fraction of the two fluids present in any calculation cell) at the solution interface is given by "(4)":

α = 1 (cell filled with fluid 1);  α = 0 (cell filled with fluid 2);  0 < α < 1 (interface cell)   (4)

Using the continuity equation "(1)" and the definition of the effective fluid properties ("3") results in "(5)" for the convection of the volume ratio α:

∂α/∂t + ∇·(α u) = 0   (5)

Discretization of the convection equation of the volume ratio of the two fluid phases "(5)" is very important.
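Equation (5) can be discretized in many ways; as a minimal illustration (not the paper's scheme), a first-order upwind update of α on a uniform 1D grid with constant velocity can be sketched as follows.

```python
def advect_alpha(alpha, u, dx, dt):
    """One explicit first-order upwind step of the 1D volume-fraction
    transport equation (5): d(alpha)/dt + u * d(alpha)/dx = 0.

    Assumes a uniform grid and constant velocity u > 0; the leftmost cell
    keeps its value (a simplified inflow boundary for illustration).
    """
    c = u * dt / dx                      # CFL number; stable for c <= 1
    new = alpha[:]
    for i in range(1, len(alpha)):
        # for u > 0 the upwind cell is the one to the left
        new[i] = alpha[i] - c * (alpha[i] - alpha[i - 1])
    return new

# sharp interface at x = 0.5: alpha = 1 (fluid 1) left, 0 (fluid 2) right
n, dx = 101, 0.01
alpha = [1.0 if i * dx < 0.5 else 0.0 for i in range(n)]
for _ in range(20):                      # advance 20 steps at CFL = 0.5
    alpha = advect_alpha(alpha, u=1.0, dx=dx, dt=0.005)
# the interface has moved 0.1 to the right, smeared by upwind diffusion
```

The smearing visible after a few steps is exactly why the text stresses that the discretization of (5) matters: practical interface capturing schemes use higher-order or compressive fluxes to keep the interface sharp.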
D. Coordinate System and Computational Grid

Deciding on a coordinate system comprises two points: first, using an inertial or a non-inertial coordinate system [14], and second, using Cartesian or non-Cartesian velocity components [15]. Converting the continuous physical space to computational space is conducted by producing a computational grid; here there are three main concerns, including the structure of the grid, the properties of the grid, and the method of producing the grid. Structurally, computational grids can be divided into three categories, including the "Structured Grid", "Block-Structured Grid", and "Unstructured Grid", as indicated in "Fig. 2". By comparing the advantages and disadvantages mentioned above, one can use an unstructured grid as a proper choice for simulating the two-phase turbulent flow governing the sea environment. Some characteristics of a good computational grid include: a fine grid in the regions with severe gradients, helping an even distribution of the error; slow changes in the size of neighboring cells, which is desirable for the accuracy of the solution; and using quadrilateral cells in 2D and hexahedral cells in 3D, particularly next to the wall, which is effective in reducing the error. Considering all such specifications when producing the grid, using numerical approaches, differential equations, or different variables [16], makes this one of the most important parts of the numerical solution.
TABLE I
TYPES OF COMPUTATIONAL GRID BASED ON STRUCTURE

Grid type  | Characteristics                   | Advantages              | Disadvantages
Structured | Possibility of following the grid | Determined neighborhood | Time-consuming gridding
Figure 2. Classification of grids based on structure; (a) structured, (b) block-structured; (c) unstructured
E. Rayleigh-Taylor Instability

When a heavy fluid is located on top of a light fluid, the buoyancy force displaces both fluids. This unstable configuration is called the Rayleigh-Taylor problem. The displacement pattern of both fluids depends on the preliminary perturbation given to the calculation range. Many works have been conducted on this case; for example, one changes the initial distribution of the fluids or gives an initial velocity to the calculation range. Here, the initial velocity "(6)" of both fluids has been considered as the initial perturbation.
(6)  [piecewise initial perturbation velocity for both fluids, in terms of the amplitude A and the cell width Δy]
In this equation, A is the perturbation amplitude and Δy is the width of the cells used in the uniform grid. For comparing the accuracy of results, the Reynolds number based on the heavy fluid has been used (as indicated in "Fig. 3"). In the initial time steps, the deformation of the interface has a symmetrical nature, and this symmetry is then lost continuously. For reviewing the effect of the Reynolds number on interface formation, the Rayleigh-Taylor instability with an initial velocity applied to the computational range was modeled for different Reynolds numbers, as indicated in "Fig. 3".

Figure 3. Influence of increased Reynolds number on displacement, velocity, and time spent in both fluids; (a) Re = 353; (b) Re = 484; (c) Re = 602.
II. CONCLUSION

This study investigated the numerical solution of bi-phase flow and the type of grid for modeling the interface between two phases, particularly under very sensitive conditions such as detecting the governing systems, which are very complicated. In this case, most projects end in interface capturing methods and the surface-fitted method: by solving a convection equation for the distribution of the two fluid phases in each cell of the calculation grid, they can obtain the volume ratio. The numerical solution of this flow is possible with the Navier-Stokes and continuity equations and the surface-fitted method. Therefore, coupling the above equations based on the physical properties of the two-phase fluid is very important, requiring discretization of the governing equations.
REFERENCES
[1] J. H. Ferziger and M. Peric, "Computational Methods for Fluid Dynamics", 3rd ed., Springer, 2002.
[2] A. J. Chorin, "Numerical solution of the Navier-Stokes equations", Math. Comput. 22, 745, 1968.
[3] A. J. Chorin, "On the convergence of discrete approximations to the Navier-Stokes equations", Math. Comput. 23, 341, 1969.
[4] K. Goda, "A multiphase technique with implicit difference schemes for calculating two- or three-dimensional cavity flows", J. Computational Physics 30, 76, 1979.
[5] J. B. Bell, P. Colella, and H. Howell, "An efficient second-order projection method for viscous incompressible flow", Proc. Tenth AIAA Computational Fluid Dynamics Conference, AIAA, p. 360, 1991.
[6] J. Kim and P. Moin, "Application of a fractional-step method to incompressible Navier-Stokes equations", J. Comput. Phys. 59, 308, 1985.
[7] J. Van Kan, "A second-order accurate pressure-correction scheme for viscous incompressible flow", SIAM J. Sci. Comput. 7, 870, 1986.
[8] R. Panahi, E. Jahanbakhsh, and M. S. Seif, "Comparison of Interface Capturing Methods in Two Phase Flow", Iranian J. Science & Technology, Transaction B: Technology, No. B6, 2005.
[9] A. Dervieux and F. Thomasset, "A finite element method for the simulation of Rayleigh-Taylor instability", IRIA-LABORIA report, F-78150 Le Chesnay, 1979.
[10] B. D. Nichols and C. W. Hirt, "Calculating three-dimensional free surface flows in the vicinity of submerged (PCBFC) method for the analysis of free-surface flow", Int. J. Num. Methods Fluids, Vol. 15, pp. 1213-1237, 1973.
[11] S. Muzaferija, M. Peric, and V. Seidl, "Computation of flow around a circular cylinder in a channel", Internal Report, Institut für Schiffbau, University of Hamburg, 1995.
[12] A. P. Clarke and R. I. Issa, "A numerical model of slug flow in vertical tubes", Dept. of Mechanical Engineering, Imperial College, London, Internal Report, 1995.
[13] J. Ferziger and M. Peric, "Computational Methods for Fluid Dynamics", 3rd rev. ed., Springer Verlag, 1995.
[14] F. M. White, "Fluid Mechanics", 4th ed., McGraw-Hill, 2001.
[15] M. C. Melaaen, PhD Thesis, University of Trondheim, 1990.
[16] A. S. Arcilla, J. Hauser, P. R. Eiseman, and J. F. Thompson (eds.), "Numerical Grid Generation in Computational Fluid Dynamics and Related Fields", North-Holland, Amsterdam, 1991.
Abstract— The k-nearest neighbor (k-NN) algorithm is one of the traditional methods used in classification. It assigns an unseen point to the dominant class among its k nearest neighbors within the data set. However, the lack of a formal framework for selecting the neighborhood size k is problematic. This article investigates a novel method for calculating the optimum value of k using cross-validation techniques. The proposed method is fully automatic, with no user-set parameters, and it is also tested on different benchmark data sets in comparison with other popular methods for selecting k.

Keywords—K-fold cross-validation; optimum k; k-nearest neighbor; leave-one-out; pattern recognition

I. INTRODUCTION

Classification is one of the most active research areas and an important measure in many applications. It is the allocation of unknown samples to a known class based on a feature vector. Selection of a classifier depends on the kind of problem, the features used, and other parameters of the problem. Unsatisfactory classification occurs when feature vectors have overlapping areas; in this case, an optimum decision boundary should be found such that the probability of misclassification is minimized [1]. The Classification Accuracy Rate (CAR) is one of the important parameters of a classifier's performance: CAR is the percentage of trials classified correctly in the testing data over the total number of testing trials.

k-nearest neighbor classification is one of the fastest, easiest to implement, and most common algorithms among the existing classification algorithms for statistical pattern recognition [2, 3]. It forms a finite partition X1, X2, ..., XJ of the sample space X such that an unknown observation x is classified into the jth class if x ∈ Xj. The performance of a nearest neighbor classifier depends on the distance function and the value of the neighborhood parameter k. There are several ways to calculate the distance between two points, including the Minkowski distance, Euclidean distance, City block (Manhattan) distance, and Canberra distance.

The value of k plays a major role in the performance of a nearest neighbor classifier. If k is too small, then the result can be sensitive to noise points; on the other hand, if k is too large, then the neighbors may include too many points from other classes [4]. In many classification studies, methods of selecting k have not been stated and, in some studies, k has been selected by trial and error. In the study by Duda et al. [5], the best k was selected using (1) for any data set:

m = √n    (1)

where n is the number of observations in the training data set, and the nearest integer value of m is determined as the best k value. In this algorithm, k is a function of the training data set. Enas and Choi [6] accomplished a simulation study and suggested scaling k as n^(2/8) or n^(3/8), where n is again the number of observations in the training data set; in this algorithm, the value of k also depends on the training data set. In brief, no method dominates the literature, and simply setting k = 1 or choosing k via cross-validation appear to be the most popular methods [7]. The advantage of cross-validation is that k-NN classifies testing observations with awareness of and acquaintance with the training data set; as a result, it influences the misclassification rate.

In some papers, empirical algorithms have been used, like K-fold cross-validation (K-FCV). The best k value is selected by the maximum value of classification accuracy. In some studies, the k-NN algorithm is trained by K-FCV, in which the best k is selected according to the maximum classification accuracy rate [8, 9, 10]. In another paper, Onder A. and Temel K. [11] used the leave-one-out cross-validation (LOO-CV) method to determine the optimum k value. They utilized the LOO-CV method since it makes the best use of the available data and avoids the problems of random selections; however, this algorithm has a high response time when the data set is large. In another k-selection algorithm, Temel K. and Onder A. [12] used a sub-sampling method. They repeated this method 30 times and computed each CAR on the validation set for different k values. Then, they
Chebyshev distance, and Bray Curtis distance (Sorensen selected k of maximum CAR and used it in testing data set. As
distance). It is worth mentioning that Euclidean distance method can be seen from literature, in many studies, the value of k is
is commonly used in k-NN algorithm. If the observations are not selected by many trials on the training and validation sets. But
of comparable units and scales, it is meaningful to standardize these selected methods are often based on chance and so they are
them before using the Euclidean distance. not acceptable. In this work, an awareness algorithm for
selecting optimum k using cross-validation methods is proposed.
The other parameter, which controls the volume of the This algorithm identify the data set and then select optimum k.
neighborhood and consequently the smoothness of the density It does not select k by chance. The performance of the proposed
estimates, is k number of neighbors. It plays a very important
464 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
method was tested using an artificial data set and six different medical data sets downloaded from the University of California, Irvine (UCI) machine learning repository.

The rest of the paper is organized as follows: the next section describes the data sets and introduces the popular cross-validation methods. In Section 3, we explain the proposed algorithm and the experimental results and compare our algorithm with other classical algorithms. Finally, the conclusion is presented in the last section.

II. MATERIAL AND METHODS

A. Description of data set

In the following subsections, we describe the data sets used.

a. Description of artificial data set

Three different types of data set with different variances and means as well as different distributions in two classes (class1 and class2) were made. These three types (A, B, and C) of distributions made different hypotheses, and so changes in sample distributions affected the operation of the classifiers in various ways. In Fig. 1, these three types of distributions are presented. Observations were 2-dimensional and their number was the same in each class. The variance and mean of the different distributions of the data set are listed in Table I. In this table, Dim. denotes dimension. For example, in type 'A', the mean of the first component of class1 is 3 and that of the second component is 6. Also, the variance of the first component of class1 is 1.5 and that of the second component is 15. The means of the first and second components of class2 are 3 and 6, respectively. The variances of the first and second components of class2 are 15 and 1.5, respectively.

Fig. 1. Three types of distributions generated by the rand function with seed 12 in the Matlab environment.

TABLE I. MEAN AND VARIANCE OF ARTIFICIAL DATA SETS WITH DIFFERENT DISTRIBUTIONS

Class / distribution (First Dim., Second Dim.)      Type A      Type B      Type C
Mean, Class1                                        (3,6)       (3,6)       (3,6)
Mean, Class2                                        (3,6)       (3,6)       (3,10)
Variance, Class1                                    (1.5,15)    (8,8)       (1.5,1.5)
Variance, Class2                                    (15,1.5)    (1.5,1.5)   (1.5,1.5)

To assess the effectiveness of the proposed algorithm, it was tested on data sets with 200 and 1000 observations separately. Table II shows the details of the artificial data sets.

TABLE II. ARTIFICIAL DATA SET CHARACTERISTICS

Dataset Name   Total No. of Observations   Total No. of Features   Total No. of Classes
Type A         200 / 1000                  2                       2
Type B         200 / 1000                  2                       2
Type C         200 / 1000                  2                       2

b. Description of UCI data set

To evaluate the effectiveness of the proposed algorithm on real data, classification of data from the UCI machine learning repository was performed. The six data sets used for this evaluation are described in detail on the UCI website. A summary of each data set is given in Table III.

TABLE III. UCI DATA SET CHARACTERISTICS

Data set Name             Total No. of Instances   Total No. of Features   Total No. of Classes
Breast Cancer Wisconsin   699                      10                      2
Pima Indians Diabetes     768                      8                       2
Bupa                      345                      6                       2
Heart                     270                      13                      2
Thyroid                   7200                     21                      3
Iris                      150                      4                       3

a. K-fold cross-validation

K-fold cross-validation (K-FCV) is one of the most widely adopted criteria for assessing the performance of a model and for selecting a hypothesis within a class. An advantage of this method, over simple training-testing data splitting, is the repeated use of the whole available data for both building a learning machine and testing it. Hence, it reduces the risk of (un)lucky splitting [13]. In the K-fold cross-validation method, the data set is randomly split into K subsets of equal size and the method is repeated K times. Each time, one of the K subsets is
used as the validation set and the other K−1 subsets are put together to form the pre-training set.

We illustrate the use of the proposed method with an example. If K = 10, the training data set is divided into 10 parts; in each iteration, 9 parts are used for pre-training and the rest for pre-testing. Then, we run the k-nearest neighbor classifier on the samples. The CAR value for every candidate k is calculated 10 times (since K = 10). The average CAR is calculated for each k over these 10 repetitions. Finally, the k with the highest CAR is selected; this value of k is the optimum k. Our results are demonstrated in Table IV for distribution type A (200 samples in total). As shown in this table, the average CAR (over 10 folds) for k = 7 neighbors is 0.965, so it is the optimum k among the 25 candidate neighbor values. A common problem in cross-validation methods is the number of folds into which the training set is divided. In this paper, we checked two kinds of folds.

TABLE IV. SELECTION OF K VALUE FOR DISTRIBUTION TYPE 'A' BY PROPOSED METHOD

k / folds   1     2   3     4     5     6     7     8     9     10    Avg. CAR
k=1         0.90  1   0.9   1     0.85  0.8   0.9   0.95  1     0.85  0.915
k=3         0.95  1   0.9   0.95  0.85  0.95  0.95  0.9   1     0.85  0.930
k=5         0.95  1   1     0.95  0.85  0.95  0.95  0.9   0.95  0.9   0.940
k=7         0.95  1   1     1     0.9   0.95  1     0.95  0.95  0.95  0.965
k=9         0.95  1   0.9   1     0.9   0.95  0.9   0.95  0.95  0.95  0.945
k=11        0.95  1   0.9   1     0.9   1     0.9   0.95  0.95  0.95  0.950
k=13        0.95  1   0.9   1     0.9   1     0.9   0.95  0.95  0.95  0.950
k=15        0.95  1   0.9   1     0.9   1     0.9   0.95  0.95  0.95  0.950
k=17        0.95  1   0.9   1     0.9   1     0.9   0.95  0.95  0.95  0.950
k=19        0.95  1   0.9   1     0.9   1     0.9   0.95  0.95  0.95  0.950
k=21        0.95  1   0.9   1     0.9   0.95  0.9   0.95  0.95  0.95  0.945
k=23        0.95  1   0.85  1     0.9   1     0.95  0.95  0.95  0.85  0.940
k=25        0.90  1   0.85  1     0.9   1     0.9   0.95  0.95  0.8   0.925

b. Leave-one-out cross-validation

LOO-CV is a particular case of K-FCV with K = N, where N is the size of the training set. Hence, the validation sets are all of size one. As in the other algorithms, the training data set is divided into two groups. The LOO-CV procedure takes one out of the N observations and uses the remaining N−1 observations as the training set for deriving the parameters of the classifier [14]. This process is repeated for all N observations to obtain the estimate of the classification accuracy.

The proposed method was applied with LOO-CV in the same way as with K-FCV. If K = N, the training data set is divided into N parts; in each iteration, K−1 parts are used for pre-training and the rest for pre-testing. Then, we run the k-nearest neighbor classifier on the samples. The CAR value is calculated for all values of k, N times (since K = N). The average CAR is calculated for each k over these N repetitions. Finally, the k with the highest CAR is selected; this value of k is the optimum k for that data set.

III. EXPERIMENTAL RESULTS

We begin with three types of artificial data set on two-class classification problems. These data sets were described earlier in Section 2. MATLAB R2014a was the environment used for creating the data sets. By using artificial data sets, the number of available observations was controlled and noise was added according to the experimental purpose. The rand function in MATLAB R2014a was used to make the artificial data sets with "seed 12". The proposed algorithm was tested and checked on three different distributions of the data set. For all the artificial data sets, the results were reported based on both the Euclidean and City block (Manhattan) distances. The data sets were randomly divided into training and testing observations of equal number. For example, when there were 200 observations in total, in each partition 100 observations (50 from each class) were used for training and 100 observations (50 from each class) for testing. To test the effect of the fold in cross-validation, K was set to 10, 20, and N (for LOO-CV). When there were 200 or 1000 observations in total, N was 100 and 500, respectively. In the training step, the proposed method tried to find the optimum k value between 1 and 25 or between 1 and 50, when the total number of observations was 200 and 1000, respectively. To verify the proposed method, it was repeated 10 times on each data set. Table V shows the CARs and standard deviations for the three different distributions of the artificial data set when the total number of observations was 1000; Table VI shows the results for data sets with 200 observations in total. These results were compared with those of classical training methods.

Here, along with the artificial data sets, six real data sets (Breast Cancer Wisconsin, Pima Indians Diabetes, Bupa, Heart, Thyroid, and Iris) were used for illustration. For all the data sets where the measurement variables were of the same unit and scale, the results were computed based on both the Euclidean and Manhattan distances. Those sets were formed by randomly partitioning the data into training and testing observations of almost the same number. To test the effect of the fold on cross-validation, K was set to 10, 20, and N (for LOO-CV), as for the artificial data sets. To satisfy the values K = 10 and K = 20 in some real data sets, the numbers of training and testing observations were not equal. For example, when 268 observations of the Pima data set were used, 140 observations (70 from each class) formed the training partition and 128 observations (64 from each class) were used for testing. In the training step, the proposed method tried to find the optimum k value between 1 and 25 in all the data sets. To verify the proposed method, it was repeated 10 times on each data set. Tables VII and VIII show these results for the two-class and three-class problems; the CARs and standard deviations for all data sets are shown in these tables. The present results were compared with those of classical training methods. As shown in Tables V, VI, VII, and VIII, the proposed method achieved a 0.1%–4% higher CAR than the other methods.

When the number of observations in the data set is high, the learning task needs a long time and using LOO-CV is not suitable. In this case, K-FCV is a good way to perform the learning task. The results showed that the present algorithm with K-FCV was a very good way of finding the optimum value of k when the number of observations was high, because the response time of the proposed algorithm was very low. Also, these results illustrated that the size of the folds in the K-FCV algorithm did not greatly affect the results. As mentioned, the results were computed based on both the Euclidean and Manhattan distances, and it was found that the kind of distance did not affect the proposed method.
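The paper's experiments were run in MATLAB; purely as an illustration, the fold-and-vote procedure described above (split the training data into K folds, compute the CAR of every candidate k on each validation fold, keep the k with the highest average CAR; K = N gives LOO-CV) can be sketched in plain Python. All names here (`knn_predict`, `select_k_by_cv`, `k_sqrt_rule`) and the toy Gaussian data are our own assumptions, not the authors' code:

```python
import math
import random
from collections import Counter

def knn_predict(train, test_points, k):
    """train: list of (features, label) pairs. Classify each test point by
    majority vote among its k nearest training points (Euclidean distance)."""
    preds = []
    for x in test_points:
        nearest = sorted(train, key=lambda p: math.dist(p[0], x))[:k]
        votes = Counter(label for _, label in nearest)
        preds.append(votes.most_common(1)[0][0])
    return preds

def select_k_by_cv(data, k_values, n_folds=10, seed=12):
    """Split the training data into n_folds folds, compute the CAR of every
    candidate k on each validation fold, and return the k with the highest
    average CAR. Setting n_folds = len(data) gives LOO-CV."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    scores = {}
    for k in k_values:
        cars = []
        for f in range(n_folds):
            val = folds[f]                                    # validation fold
            tr = [p for g in range(n_folds) if g != f for p in folds[g]]
            preds = knn_predict(tr, [x for x, _ in val], k)
            cars.append(sum(p == lab for p, (_, lab) in zip(preds, val)) / len(val))
        scores[k] = sum(cars) / len(cars)                     # average CAR over folds
    return max(scores, key=scores.get), scores

def k_sqrt_rule(n):
    """Duda et al.'s rule of thumb: k is the nearest integer to sqrt(n)."""
    return max(1, round(math.sqrt(n)))

# Toy 2-D, two-class data loosely in the spirit of the paper's artificial sets.
rng = random.Random(12)
data = ([((rng.gauss(3, 1), rng.gauss(6, 1)), 0) for _ in range(100)] +
        [((rng.gauss(6, 1), rng.gauss(3, 1)), 1) for _ in range(100)])
best_k, scores = select_k_by_cv(data, k_values=range(1, 26, 2))
print(best_k, k_sqrt_rule(len(data)))   # CV choice vs. the sqrt(n) rule (sqrt(200) ≈ 14)
```

Unlike the fixed √n rule, the cross-validated choice adapts to the actual class overlap in the training data, which is the point the paper argues.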
TABLE V. RESULTS FOR ALL DISTRIBUTIONS OF ARTIFICIAL DATA SET WITH 1000 OBSERVATIONS IN TOTAL
Algorithms/ Folds Methods Type A Type B Type C
City block, Proposed method 0.9221±0.0075 0.9225±0.0030 0.8971±0.0056
10-Fold Onder’ method 0.9118±0.0057 0.9095±0.0053 0.8949±0.0053
Duda’ method 0.9112±0.0058 0.9079±0.0071 0.8978±0.0066
City block, Proposed method 0.9236±0.0060 0.9243±0.0075 0.8781±0.0082
20-Fold Onder’ method 0.9170±0.0103 0.9036±0.0090 0.872 ±0.0119
Duda’ method 0.9133±0.0079 0.9086±0.0077 0.8984±0.0089
City block, Proposed method 0.9224±0.0062 0.9241±0.0034 0.9035±0.0069
N-Fold Onder’ method 0.9051±0.0057 0.8784±0.0067 0.8530±0.0095
Duda’ method 0.904±0.0055 0.9085±0.0067 0.9035±0.0062
Euclidean, Proposed method 0.9240 ±0.0055 0.9245±0.0081 0.8903±0.0056
10-Fold Onder’ method 0.9023±0.0094 0.9008±0.0073 0.8974±0.0086
Duda’ method 0.9025± 0.0068 0.9110±0.0094 0.9013±0.0072
Euclidean, Proposed method 0.9264±0.0046 0.9249±0.0049 0.8990±0.0060
20-Fold Onder’ method 0.9270±0.0057 0.9042±0.0056 0.8907±0.0104
Duda’ method 0.9243± 0.0037 0.9087±0.0068 0.8997±0.0066
Euclidean, Proposed method 0.9184±0.0053 0.9235±0.0044 0.9007±0.0070
N-Fold Onder’ method 0.9049±0.0044 0.8798±0.0046 0.8480±0.0064
Duda’ method 0.9190±0.0054 0.9094±0.0050 0.9027±0.0068
TABLE VI. RESULTS FOR ALL DISTRIBUTIONS OF ARTIFICIAL DATA SET WITH 200 OBSERVATIONS IN TOTAL
Algorithms/Folds Methods Type A Type B Type C
City block, Proposed method 0.9190±0.0160 0.9130±0.0071 0.9010±0.0204
10-Fold Onder’ method 0.9020±0.0132 0.9025±0.0175 0.8950±0.0255
Duda’ method 0.9065±0.0194 0.8810±0.0122 0.9020±0.0254
City block, Proposed method 0.9250±0.0131 0.8860±0.0209 0.9030±0.0153
20-Fold Onder’ method 0.9175±0.0223 0.8685±0.0253 0.8830±0.0275
Duda’ method 0.9125±0.0136 0.8685±0.0176 0.9025±0.0174
City block, Proposed method 0.9145±0.0172 0.9080±0.0130 0.8995±0.0174
N-Fold Onder’ method 0.8990±0.0242 0.8750±0.0111 0.8610±0.0250
Duda’ method 0.9095±0.0109 0.8790±0.0250 0.8970±0.0189
Euclidean, Proposed method 0.9140±0.0287 0.9115±0.0251 0.9110±0.0223
10-Fold Onder’ method 0.9025±0.0241 0.9055±0.0318 0.9025±0.0190
Duda’ method 0.9075±0.0241 0.8910±0.0204 0.9000±0.0204
Euclidean, Proposed method 0.9180±0.0125 0.9095±0.0205 0.9085±0.0257
20-Fold Onder’ method 0.9080±0.0236 0.8915±0.0300 0.8935±0.0252
Duda’ method 0.9180±0.0226 0.8695±0.0206 0.9095±0.0251
Euclidean, Proposed method 0.9220±0.0225 0.9075±0.0206 0.8960±0.0313
N-Fold Onder’ method 0.9015±0.0267 0.8845±0.0254 0.8660±0.0273
Duda’ method 0.9005±0.0228 0.8660±0.0254 0.8990±0.0211
TABLE VII. RESULTS FOR REAL DATA SET WITH TWO CLASSES
algorithms/types Methods Pima Wisconsin Bupa Heart
(140/128) (120/119) (80/65) (60/60)
City block, Proposed method 0.7078±0.0211 0.9647±0.0147 0.6277±0.0342 0.6967±0.0276
10-Fold Onder’ method 0.7016±0.0239 0.9597±0.0180 0.6362±0.0441 0.6833±0.0340
Duda’ method 0.7094±0.0281 0.9513±0.0174 0.6269±0.0329 0.6842±0.0307
City block, Proposed method 0.7399±0.0133 0.9739±0.0118 0.6408±0.0301 0.6992±0.0268
20-Fold Onder’ method 0.7063±0.0178 0.9567±0.0074 0.6162±0.0608 0.6883±0.0500
Duda’ method 0.7063±0.0220 0.9504±0.0071 0.6231±0.0274 0.6900±0.0378
City block, Proposed method 0.7364±0.0220 0.9768±0.0187 0.6377±0.0574 0.7033±0.0233
N-Fold Onder’ method 0.6465±0.0232 0.9592±0.0179 0.5900±0.0514 0.6367±0.0193
Duda’ method 0.7145±0.0096 0.9555±0.0171 0.6438±0.0570 0.6958±0.0252
Euclidean, Proposed method 0.7061±0.0210 0.9772±0.0099 0.6223±0.0268 0.6375±0.0255
10-Fold Onder’ method 0.6926±0.0252 0.9667±0.0156 0.6285±0.0366 0.6417±0.0255
Duda’ method 0.6918±0.0303 0.9535±0.0098 0.6192±0.0374 0.6475±0.0319
Euclidean, Proposed method 0.6887±0.0315 0.9873±0.0075 0.6208±0.0399 0.6592±0.0234
20-Fold Onder’ method 0.6871±0.0268 0.9618±0.0121 0.6262±0.0359 0.6250±0.0412
Duda’ method 0.6930±0.0281 0.9605±0.0084 0.6315±0.0335 0.6317±0.0222
Euclidean, Proposed method 0.7016±0.0312 0.9723±0.0060 0.6246±0.0351 0.6350±0.0340
N-Fold Onder’ method 0.6535±0.0334 0.9664±0.0114 0.5831±0.0427 0.6033±0.0261
Duda’ method 0.6875±0.0128 0.9588±0.0092 0.6192±0.0361 0.6267±0.0218
TABLE VIII. RESULTS FOR REAL DATA SET WITH THREE CLASSES

algorithms/types      Methods           Iris (25/25)     Thyroid (90/76)
City block, 10-Fold   Proposed method   0.9573±0.0216    0.7030±0.0405
                      Onder' method     0.9480±0.0160    0.6825±0.0397
                      Duda' method      0.9587±0.0203    0.6829±0.0582
City block, 20-Fold   Proposed method   0.9573±0.0216    0.7086±0.0283
                      Onder' method     0.9427±0.0199    0.6833±0.0348
                      Duda' method      0.9520±0.0157    0.6601±0.0342
City block, N-Fold    Proposed method   0.9680±0.0129    0.6850±0.0299
                      Onder' method     0.9507±0.0167    0.6632±0.0248
                      Duda' method      0.9680±0.0157    0.6658±0.0308
Euclidean, 10-Fold    Proposed method   0.9780±0.0143    0.6533±0.0315
                      Onder' method     0.9613±0.0098    0.6298±0.0361
                      Duda' method      0.9667±0.0169    0.6167±0.0436
Euclidean, 20-Fold    Proposed method   0.9627±0.0371    0.6512±0.0233
                      Onder' method     0.9627±0.0138    0.6373±0.0339
                      Duda' method      0.9500±0.0327    0.5982±0.0314
Euclidean, N-Fold     Proposed method   0.9560±0.0209    0.6512±0.0409
                      Onder' method     0.9560±0.0090    0.6386±0.0326
                      Duda' method      0.9627±0.0225    0.6184±0.0333

IV. CONCLUSION

The k-nearest neighbor (k-NN) algorithm is one of the most popular supervised, lazy-learning classification methods. k-NN's performance is highly competitive with other techniques. There are several key issues that affect the performance of k-NN, one of which is the distribution of the data set. The value of k plays an important role in the decision on an unknown pattern in k-NN, so it can be considered another such issue. In this paper, a new method was presented for selecting the optimum number of nearest neighbors k using cross-validation methods in the k-NN algorithm. The results were computed based on both the Euclidean and Manhattan distances. It was found that the kind of distance and the number of folds in the cross-validation methods did not affect the proposed method. The proposed method was also fully automatic with no user-set parameters. In the experiments, the proposed algorithm could decide the optimum k value in comparison with other algorithms according to the achieved classification accuracy rates (CAR). This algorithm was applied to different distributions of artificial and real data sets.

REFERENCES
[1] Yooii K. Kim and Joon H. Han, "Fuzzy K-NN Algorithm using Modified K-Selection", International Conference on Fuzzy Systems and The Second International Fuzzy Engineering Symposium, IEEE, 1995.
[2] Fix, E. and Hodges, J.L., "Discriminatory analysis, nonparametric discrimination: consistency properties", International Statistical Review, Project 21-49-004, Report 4, pp. 261–279, US, 1951.
[3] Cover, T.M. and Hart, P.E., "Nearest neighbor pattern classification", IEEE Trans. Inform. Theory, 13, pp. 21–27, 1967.
[4] Wu, X., Kumar, V., Quinlan, J.R., et al., "Top 10 algorithms in data mining", Knowledge and Information Systems, Vol. 14, Issue 1, pp. 1–37, 2008.
[5] Duda, R.O., Hart, P.E., and Stork, D.G., "Pattern Classification", 2nd edition, John Wiley, 2000.
[6] Enas, G.G. and Choi, S.C., "Choice of the smoothing parameter and efficiency of k-nearest neighbor classification", Comput. Math. Applic. A, 12, pp. 235–244, 1986.
[7] Ripley, B.D., "Pattern Recognition and Neural Networks", Cambridge University Press, Cambridge, 1996.
[8] Efron, B., "Estimating the Error Rate of a Prediction Rule: Improvement on Cross-Validation", Journal of the American Statistical Association, Vol. 78, No. 382, pp. 316–331, 1983.
[9] Huang, P. and Lee, C.H., "Automatic Classification for Pathological Prostate Images Based on Fractal Analysis", IEEE Transactions on Medical Imaging, Vol. 28, No. 7, July 2009.
[10] Onder, A. and Temel, K., "Wavelet Transform Based Classification of Invasive Brain Computer Interface Data", Radioengineering, 20(1), pp. 31–38, 2011.
[11] Onder, A. and Temel, K., "Comparative Performance Assessment of Classifiers in Low-Dimensional Feature Space Which are Commonly Used in BCI Applications", Elektrorevue, 2(4), pp. 58–63, 2011.
[12] Temel, K. and Onder, A., "A Polynomial Fitting and k-NN Based Approach for Improving Classification of Motor Imagery BCI Data", Pattern Recognition Letters, 31(11), pp. 1207–1215, 2010.
[13] Anguita, D., Ridella, S., and Fabio, R., "K-Fold Generalization Capability Assessment for Support Vector Classifiers", Proceedings of the International Joint Conference on Neural Networks, Montreal, Canada, July 31–August 4, 2005.
[14] Alippi, C. and Roveri, M., "Virtual k-fold cross validation: an effective method for accuracy assessment", in Proc. IEEE International Joint Conference on Neural Networks (IEEE IJCNN 2010), Barcelona, Spain, July 18–23, 2010.

AUTHORS PROFILE

Masoud Maleki received his Master degree in Electronic Engineering from Azad University, Iran, in 2010 and the B.Sc. degree in Telecommunication Engineering from Azad University, Iran, in 2007. He is currently pursuing his Ph.D. in Biomedical Engineering at Karadeniz Technical University, Trabzon, Turkey. His research interests are signal and image processing and brain-computer interfacing.

Negin Manshouri received the B.Sc. degree in Telecommunication Engineering from Islamic Azad University in 2010. She has practical experience in the fields of microwave and mobile communication as an antenna designer. Her research interests include the design and analysis of different kinds of microstrip antennas, ultra-wideband antennas, and the field of biomedical engineering. She received the M.S. degree in Telecommunication Engineering from Islamic Azad University in 2013. She is currently working toward the Ph.D. degree in Biomedical Engineering at Karadeniz Technical University, Trabzon, Turkey.
Abstract—Edge detection is one of the basic steps in image processing, image pattern recognition, image analysis and computer vision techniques, and it helps to solve many complex problems. An edge is determined on the basis of the boundary between two areas of different color intensity in the image. In this paper, we suggest an edge detection method for facial expression detection, to pick up the edges of the eyes, mouth and other facial features of a human face image. We compare the facial expressions obtained with the suggested method with those resulting from traditional edge detection methods (Sobel, Prewitt, Kirsh, Robinson, Marr-Hildreth, LoG and Canny edge detection). The method depends on a suggested filter resulting from combining some elements of the Markov basis found by H.H. Abbas and H.S. Mohammed Hussein in 2014 with Laplace filters. The results of applying this suggested method to color or gray images are more accurate and clearer than those of the traditional methods.

Keywords: image processing, edge detection, color and gray image, Laplace filter, Gaussian, Markov basis.

I. INTRODUCTION

Edge detection is one of the most commonly used operations in image processing and pattern recognition; the reason for this is that edges form the outline of an object. An edge is the boundary between an object and the background, and indicates the boundary between overlapping objects. Since computer vision involves the identification and classification of objects in an image, edge detection is an essential tool. Efficient and accurate edge detection leads to an increase in the performance of subsequent image processing techniques, including image segmentation, object-based image coding, and image retrieval. The purpose of edge detection in general is to significantly reduce the amount of data in an image, while preserving the structural properties to be used for further image processing. The major property of an edge detection technique is its ability to extract the exact edge line with good orientation. The main features can be extracted from the edges of an image, and these features are used by advanced computer vision algorithms; edge detection is thus a major tool for image analysis. Spatial masks can be used to detect all three types of discontinuities in an image. There are many edge detection techniques: Roberts edge detection, Sobel edge detection, Prewitt edge detection, Kirsh edge detection, Robinson edge detection, Marr-Hildreth edge detection, LoG edge detection and Canny edge detection [1][2].

II. RELATED WORKS

Dr. S. Vijayarani and Mrs. M. Vinupriya (October 2013): edge detection methods transform original images into edge images using the changes of grey tones in the image. Two edge detection algorithms, namely the Canny and Sobel edge detection algorithms, are used to extract edges from facial images, which are then used to detect faces. Performance factors, namely accuracy and speed, are analyzed to find out which algorithm works better. From the experimental results, it is observed that the Canny edge detection algorithm works better than the Sobel edge detection algorithm [3].

Abdallah A. Alshennawy and Ayman A. Aly (2009): this method, based on a fuzzy logic reasoning strategy, is proposed for edge detection in digital images without determining the threshold value. The proposed approach begins by segmenting the images into regions using a floating 3x3 binary matrix. The
edge pixels are mapped to a range of values distinct from each other. The robustness of the proposed method's results for different captured images is compared to those obtained with the linear Sobel operator [4].

Bijay Neupane, Zeyar Aung and Wei Lee Woon (April 2012): many approaches to edge detection have been proposed; the basic approach is to search for an abrupt change in color, intensity or other properties. We propose a new method for edge detection which uses k-means clustering, where different properties of image pixels are used as features. We analyze the quality of the different clusterings obtained using different k values (i.e., the predefined number of clusters) in order to choose the best number of clusters. The advantage of this approach is that it shows higher noise resistance compared to existing approaches [5].

Muthukrishnan. R and M. Radha (Dec 2011): interpretation of image contents is one of the objectives in computer vision. In image interpretation, the partition of the image into object and background is a crucial step. Segmentation separates an image into its component regions or objects. Image segmentation needs to segment the object from the background to read the image properly and identify the content of the image carefully. In this context, edge detection is a fundamental tool for image segmentation [6].

Consider cells i = i1...im. A non-negative integer xi ∈ N denotes the frequency of a cell i, where N = {0, 1, 2, 3, ...}. The set of frequencies x = {xi}, i ∈ I, is called a contingency table. A contingency table x = {xi}, i ∈ I, can be written as an n-dimensional column vector of non-negative integers in N^n. Let Z be the set of integers and let aj ∈ Z^n, j = 1, ..., v, denote fixed column vectors consisting of integers.

The v-dimensional column vector t = (t1, t2, ..., tv)′ is defined by tj = aj′x, where ′ denotes the transpose of a vector or a matrix. If A = [a1 ... av]′ is the v × n matrix with jth row aj′, then t = Ax. The set A^(−1)(t) = {x ∈ N^n : Ax = t} (the t-fiber) is the set of contingency tables x for a given t, and it is considered for performing similar tests. If ∼ is the relation defined by x1 ∼ x2 if and only if x1 − x2 belongs to the kernel of A, ker(A), then this relation is an equivalence relation on N^n and the set of t-fibers is the set of its equivalence classes.

An n-dimensional column vector z = {zi} ∈ Z^n is called a move if z ∈ ker(A), i.e., Az = 0. A finite set of moves B is called a Markov basis if, for all t, A^(−1)(t) constitutes one B-equivalence class [9].

They proved that the number of elements in B equals (n² − 3n)/3. If n = 9, then the number of elements in B is (9² − 3×9)/3 = 18; these elements are shown below [8].
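As a concrete check of the move definition above: for the independence model of a 3×3 contingency table, A maps a flattened table to its row and column sums, so Az = 0 means exactly that every margin of the table vanishes. A minimal Python sketch (the `margins` helper and the choice of this standard A are our illustration, not the paper's code):

```python
def margins(z):
    """Row and column sums of a 3x3 table. For the independence model, the
    matrix A maps a flattened table to exactly these margins, so Az = 0
    means all six margins vanish."""
    rows = [sum(r) for r in z]
    cols = [sum(c) for c in zip(*z)]
    return rows + cols

# Two of the displayed basis elements, read as 3x3 tables:
Z11 = [[0, 0, 0], [-1, 1, 0], [1, -1, 0]]
Z12 = [[-1, 0, 1], [1, 0, -1], [0, 0, 0]]

# A move must lie in ker(A): every row and column margin is zero.
assert margins(Z11) == [0] * 6
assert margins(Z12) == [0] * 6

# The stated count of basis elements for n = 9 cells:
n = 9
print((n**2 - 3 * n) // 3)   # 18
```

Adding a move to a table leaves the margins t = Ax unchanged, which is why these elements connect tables within the same t-fiber.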
       0  0  0              −1  0  1
Z11 = −1  1  0 ,     Z12 =   1  0 −1
       1 −1  0               0  0  0

       0  0  0               0 −1  1
Z13 = −1  0  1 ,     Z14 =   0  1 −1
       1  0 −1               0  0  0

       0 −1  1               0  0  0
Z15 =  0  0  0 ,     Z16 =   0 −1  1
       0  1 −1               0  1 −1

      −1  1  0              −1  0  1
Z17 =  0  0  0 ,     Z18 =   0  0  0
       1 −1  0               1  0 −1

IV. PROPOSED FILTERS FOR EDGE DETECTION

For n = 3, we will use some elements of a Markov basis B and Laplace filters to generate a new filter. The following steps illustrate this.

A. Filter Generation

Step 1
We use the Markov basis elements Zi and Zj to find
Z1′ = Σ_{i=1}^{5} Zi + Σ_{j=14}^{18} Zj

Step 2
We combine the filter obtained in Step 1 with the Laplace filter to obtain the following filter:
Z1″(x, y) = L(x, y) + Z1′(x, y),  Z1′ ∈ M
where
          −2 −3 −2
L(x, y) = −3 20 −3
          −2 −3 −2

Step 3
We add one to the center of the filter (Z1″) to obtain the filter (Z1*):
                        0 0 0
Z1*(x, y) = Z1″(x, y) + 0 1 0
                        0 0 0
Then
            −2 −4 −1
Z1*(x, y) = −3 20 −2
            −2 −1 −4

Step 4
We combine the filter obtained in Step 3 with a special filter (F) to obtain the following filter (Z2*):
Z2*(x, y) = Z1*(x, y) + F(x, y)
where
          F1 F2 F3
F(x, y) = F4 a  F6
          F7 F8 F9

Step 1
Let f(x, y) be a color image of dimension m × n × 3 or a gray image of dimension m × n.

Step 2
The convolution process with the new 3 × 3 filter: let f(x, y) be an image matrix of dimension m × n, which can be written as follows:
Make the two arrays as the following form : g(1, 1) = (f12 ∗ z11 ) + (f13 ∗ z12 ) + (f14 ∗ z13 )
+ (f22 ∗ z21 ) + (f23 ∗ z22 ) + (f24 ∗ z23 )
+ (f32 ∗ z31 ) + (f33 ∗ z32 ) + (f34 ∗ z33 )
Fig. 3: Multiplying the filter with the image.

So,

g(m, n) = (f(m−2)(n−2) ∗ z11) + (f(m−2)(n−1) ∗ z12) + (f(m−2)(n) ∗ z13)
        + (f(m−1)(n−2) ∗ z21) + (f(m−1)(n−1) ∗ z22) + (f(m−1)(n) ∗ z23)
        + (f(m)(n−2) ∗ z31) + (f(m)(n−1) ∗ z32) + (f(m)(n) ∗ z33)

Step 3
Smooth the image using a Gaussian or median filter.

Step 4

Step 6
Combine the three image layers (R1, G1, B1) to reshape the color image.

Step 7

V. EXPERIMENTAL RESULT

In this section we present various edge detection techniques, namely the Roberts, Sobel, Prewitt, Kirsch, Robinson, Marr-Hildreth, LoG, and Canny edge detectors. The edge detection techniques were implemented using MATLAB R2015b and tested on images. The objective is to produce edges extracting the principal features of the image.

Fig. 4: Original gray image with the result of the suggested method (panels: (c) Original, (d) Suggested method).
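The construction of the suggested filter from Section IV can also be verified numerically. This NumPy sketch (not the authors' code) recovers the Markov-basis contribution Z10 from the published Z1∗ and L, and confirms that, like any sum of Markov basis elements, its rows and columns sum to zero:

```python
import numpy as np

L = np.array([[-2, -3, -2],
              [-3, 20, -3],
              [-2, -3, -2]])        # Laplace-type filter L(x, y) from Step 2
Z1_star = np.array([[-2, -4, -1],
                    [-3, 20, -2],
                    [-2, -1, -4]])  # suggested filter Z1*(x, y) from Step 3
center = np.zeros((3, 3), dtype=int)
center[1, 1] = 1                    # the value added to the filter center

# Markov-basis contribution: Z10 = Z1* - center - L
Z10 = Z1_star - center - L
print(Z10.sum(axis=0).tolist(), Z10.sum(axis=1).tolist())  # [0, 0, 0] [0, 0, 0]
```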
(a) Original (b) Roberts (c) Sobel (d) Prewitt (e) Kirsch (f) Robinson
(results shown for both the gray and the color test image)
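A rough Python counterpart to the panel comparison above can be built with SciPy (the paper used MATLAB R2015b; the toy image and detector selection here are illustrative):

```python
import numpy as np
from scipy import ndimage

# Toy test image: a bright square on a dark background
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0

# Sobel and Prewitt gradient magnitudes
sobel_mag = np.hypot(ndimage.sobel(img, axis=0), ndimage.sobel(img, axis=1))
prewitt_mag = np.hypot(ndimage.prewitt(img, axis=0), ndimage.prewitt(img, axis=1))

# Laplacian of Gaussian (Marr-Hildreth style): edges lie near zero crossings
log_resp = ndimage.gaussian_laplace(img, sigma=1.5)

# The detectors respond on the square's border and are silent in flat regions
print(sobel_mag[8, 16] > 0, sobel_mag[16, 16] == 0)  # True True
```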
Fig. 6: Original color image with the result of the suggested method (panels: (c) Original, (d) Suggested method).

REFERENCES

[1] Soumya Dutta and Bidyut Baran Chaudhuri, "A Color Edge Detection Algorithm in RGB Color Space", International Conference on Advances in Recent Technologies in Communication and Computing, November 21, 2009.
[2] John Canny, "A computational approach to edge detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(6):679–698, Nov. 1986.
[3] S. Vijayarani and M. Vinupriya, "Performance Analysis of Canny and Sobel Edge Detection Algorithms in Image Mining", International Journal of Innovative Research in Computer and Communication Engineering, Vol. 1, Issue 8, October 2013.
Abstract—Face recognition from images is a prevalent subject in biometrics research. Many public places have surveillance cameras for video capture, and these cameras have important value for safety purposes. It is extensively recognized that face recognition plays a significant role in surveillance frameworks, as it does not need the subject's assistance. The real benefits of face-based identification over other biometrics are uniqueness and acceptance. We first give a basic overview of face recognition and the diverse parameters that affect face shape, structure, and texture. Age estimation tasks merged with an aging method are used to calculate age; the estimated age, together with a vector-generating function or the feature vector of the real image, is then used to generate feature vectors at the destination age. Structure and texture vectors represent a facial image by projecting it into the eigenspace of structure. In this article the authors focus primarily on the domain of face recognition, give an overview of the existing research that has been initiated in this area, and discuss the benefits and limitations of the approaches proposed in the literature.

Keywords: Face Recognition, Aging

I. INTRODUCTION

The human face plays a very significant role in social interaction and carries an individual's identity. With the human face as a factor for security, machine analysis of faces has become a large field of study spanning numerous disciplines, for instance image processing, pattern recognition, computer vision, and neural networks. Biometric face recognition technology has received critical consideration from both neuroscientists and computer vision researchers in recent years because of its potential for a large assortment of applications, both law-enforcement and non-law-enforcement, for example passports, credit cards, photo IDs, driving licenses, mug shots, and real-time matching of surveillance video images. As contrasted with biometric systems using fingerprint and iris, face recognition has distinct benefits due to its non-contact procedure: a face picture can be captured from a large distance without touching the individual being checked, and checking does not require contact with the individual. Moreover, face recognition helps curb fraud, because face pictures that have been captured and archived can later help verify an individual. Research interest in the area of face recognition is the result of the following facts: 1) increased emphasis on commercial research work, 2) increased requirements for surveillance-related frameworks because of drug trafficking, 3) the re-emergence of neural networks with emphasis on real-time computation and variation, and 4) the availability of real-time hardware.

Facial biometric checking systems are used to verify the identity of persons attempting access among different frameworks and access control systems. Facial matching makes use of digital pictures of the face saved in a database and on a card. A digital picture is taken at registration into the framework, and is then matched against a live picture of the person upon an access attempt, in a procedure known as "matching". The output of the matching algorithm can be improved (that is, false matches and false accepts can be reduced) if the quality of the facial picture is increased. Because of this, the latest standards for biometric facial pictures specify normative demands on facial picture quality and provide good practices for biometric facial picture capture.

Face recognition is a simple task for humans: experiments in [12-20] have shown that even day-old children are able to differentiate among known faces. Facial recognition uses different parameters of the face, including the upper outlines, eye sockets, the areas around the cheekbones, the side view of the face, and the positions of the nose and eyes, to perform checking and verification. Most methods are resistant to changes in hairstyle, as they do not use the face area close to the hairline. When used in identification mode, a facial recognition method commonly gives a list of near matches rather than a single specific match (as fingerprint and iris-scan methods do). The applicability of a facial recognition method is closely bounded by the quality of the facial picture: a low-quality picture is far more likely to result in enrolment and matching problems than a high-quality picture. For example, large picture databases, including those built from drivers' licenses or passports, carry photographs of marginal quality, and performing matches against them leads to reduced accuracy. Similar well-known errors occur with surveillance deployments. If facial pictures for enrolment and checking are acquired from a clear view with high-quality devices, system output improves substantially. For facial recognition at more than short distances, there is a strong relationship between camera quality and discrimination ability. Methods for 2D face recognition can be categorized into three techniques: analytic, holistic, and hybrid. While analytic techniques match salient facial components or parts detected from the face, holistic techniques make use of data derived from the full face pattern. By merging both local and global components, hybrid techniques attempt to generate a more complete representation of the facial picture. Face recognition relying upon geometric parameters of a face is a very intuitive approach to face
recognition. The first automated face recognition method was described in [2]: marker points, for example the locations of eyes, ears, and nose, were used to build a feature vector, and recognition was performed by evaluating the Euclidean distance between the feature vectors of a probe and a reference picture. This type of technique is robust against changes in illumination by its nature, but has a serious drawback: accurate registration of the marker points is difficult, even with state-of-the-art algorithms. Later work on geometric recognition was carried out in [3]. The eigenfaces technique described in [4-28] takes a holistic approach to face recognition: a facial picture is a point in a high-dimensional image space, and a lower-dimensional representation is found in which classification becomes simple. The lower-dimensional subspace is computed with Principal Component Analysis, which finds the axes with maximum variance. While this kind of transformation is optimal from a reconstruction standpoint, it does not take class labels into account. In situations where the variance is generated by external sources, such as illumination, the axes with highest variance do not necessarily contain discriminative information at all, so classification becomes impractical; therefore class-specific projection with Linear Discriminant Analysis was applied to face recognition in [11]. The general idea is to minimize the variance within a class while maximizing the variance between classes at the same time. Subsequently, various procedures for local feature extraction emerged. To avoid the high dimensionality of the input data, only local regions of a picture are described; the extracted features are (ideally) more robust against partial occlusion, illumination, and small sample size. Algorithms used for local feature extraction include Gabor wavelets, the Discrete Cosine Transform, and Local Binary Patterns [9]. It remains an open research question how best to preserve spatial information when applying local feature extraction, because spatial information is potentially useful. As with all biometrics, there are four stages: sample capture, feature extraction, template comparison, and matching. [12] describes the process flow of facial-scan techniques. Enrolment generally consists of a 20-30 second procedure whereby numerous pictures are taken of a single face. The set of pictures should include slightly different angles and facial expressions, to allow for robust matching. After enrolment, distinctive features are extracted, resulting in the creation of a template. The template is much smaller than the picture from which it is derived: a facial picture can require 15 kB to 30 kB, while templates range from 84 bytes to 3,000 bytes. Small templates are typically used for 1:N matching.

Verification and identification follow the same steps. In verification, the individual being imaged is a cooperative user, rather than uncooperative or non-cooperative: the individual claims an identity through a login name or a token, stands or sits before the camera for a few seconds, and is either matched or not matched. This comparison is based on the similarity of the newly created match template against the reference template or templates on record. The point at which two templates are similar enough to match, known as the threshold, can be adjusted for different personnel, PCs, times of day, and other factors. System design and development for facial-scan verification versus identification differ in an expansive number of ways. The real contrast is that identification does not use a claimed identity: rather than acquiring a PIN or user name and conveying confirmation or denial of the claim, identification systems attempt to answer the question "Who am I?" [15]. When there is only a modest group of enrollees in the database, this task is not demanding; as databases grow large, into the tens and hundreds of thousands, the task becomes substantially more difficult. The system may only be able to narrow the database to a set of likely candidates; human intervention may then be needed at the final verification stages. A second variable in identification is the distance between the target subjects and the capture device. In verification one assumes a cooperative audience, composed of subjects who are motivated to use the system correctly. Facial-scan systems, depending on the exact type of deployment, may also have to be optimized for non-cooperative and uncooperative subjects [17]. Non-cooperative subjects are unaware that a biometric system is in place, or do not care about being recognized or avoiding recognition. Uncooperative subjects actively avoid recognition, and may use disguises or take evasive measures. Facial-scan technologies are much more capable with cooperative subjects, and are wholly unequipped to recognize uncooperative subjects [16]. A facial recognition device is one that views a photo or video of a person and compares it to one in a database. It does this by examining the structure, shape, and proportions of the face; the distances between the eyes, nose, mouth, and jaw; the upper outlines of the eye sockets; the sides of the mouth; the location of the nose and eyes; and the area surrounding the cheekbones. To keep a subject from using a photograph or mask while being checked by a facial recognition program, several security measures have been established: at the moment the user is being checked, they may be asked to blink, smile, or nod their head. Another security feature is the use of facial thermography to record the heat pattern of the face. The essential facial recognition techniques are feature analysis, neural networks, eigenfaces, and automatic face processing [17-18]. Some facial recognition software algorithms identify faces by extracting features from a photo of the subject's face. Other algorithms normalize a gallery of face pictures and then compress the face data, saving only the data in the picture that is useful for facial recognition [21-22]. A test picture is then compared against the stored face data. A relatively new technique on the market is three-dimensional facial recognition. This method uses 3-D sensors to capture information about the shape of a face. The information is then used to identify distinctive features on the face, for example the contours of the eye sockets, nose, and chin. Advantages of 3-D facial recognition are that it is not affected by changes in lighting, and it can recognize a face from a range of viewpoints, including profile. Another new strategy in facial recognition uses the visual details of the skin, as captured in standard digital or scanned pictures.
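The eigenfaces pipeline summarized earlier in this section (center the images, project onto the axes of maximum variance, then match by nearest neighbour in the subspace) fits in a few lines of NumPy. This is a generic sketch, not the code of any surveyed system, and the toy 8x8 "faces" are random stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy gallery: 20 "face images" of 8x8 pixels, flattened to 64-vectors
X = rng.normal(size=(20, 64))
mean_face = X.mean(axis=0)
Xc = X - mean_face

# Principal axes via SVD of the centered data: rows of Vt are ordered
# by decreasing variance, so the first k rows are the "eigenfaces"
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 5
eigenfaces = Vt[:k]

def project(img: np.ndarray) -> np.ndarray:
    """Represent an image by its k coefficients in the eigenface subspace."""
    return eigenfaces @ (img - mean_face)

# Nearest-neighbour matching in the subspace: a probe identical to a
# gallery image should match itself
probe = X[7]
gallery = np.stack([project(x) for x in X])
dists = np.linalg.norm(gallery - project(probe), axis=1)
print(int(np.argmin(dists)))  # 7
```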
This procedure is called skin texture analysis, which turns the unique lines, patterns, and spots apparent in a person's skin into a mathematical space. Preliminary tests have demonstrated that using skin texture analysis as part of facial recognition can improve identification performance by 20 to 25%. Advantages of facial recognition are that it is not intrusive and can be done from a distance, even without the individual being aware that they are being scanned. What separates facial recognition from other biometric techniques is that it can be used for surveillance purposes, as in searching for wanted criminals, suspected terrorists, and missing children. Facial recognition can be done from far away with no contact with the subject, so they may be unaware they are being checked. Facial recognition is better suited to verification than to identification purposes, as it is simple for an individual to change their facial features with a disguise or mask. The environment is a consideration, as are subject motion and camera focus. Facial recognition, when used in combination with another biometric method, can improve verification and identification results significantly. Another proposed algorithm addresses face matching across a large aging gap. The algorithm is based on a neural network: face recognition is trained with a neural network, which outperformed the existing algorithms with high accuracy. Furrows may change in two major ways: (1) a new wrinkle may arise in the middle of a ridge; (2) one wrinkle may bifurcate and form a letter Y. The difference between (1) and (2) is not a sharp one, because one of the sides of the ridge in case (1) may wear down, or be narrow and shallow and not leave an imprint, thus converting it into case (2); conversely, case (2) may change into (1). The location of origin of a new wrinkle is nevertheless determined, and thus it forms a unique pattern for identifying a person [28].

II. OVERVIEW OF SOME EXISTING ALGORITHMS

Haibin Ling et al. [6] describe face verification across age progression using discriminative methods. They concentrate on designing and assessing discriminative approaches, which directly handle verification tasks with no explicit age modeling, a difficult problem in itself. To begin with, they show that gradient orientation (GO), obtained after discarding magnitude information, gives a simple but powerful representation for this problem. This representation is further improved when hierarchical information is used, resulting in the gradient orientation pyramid (GOP). When combined with a support vector machine (SVM), GOP shows excellent performance in every experiment, in comparison with seven different approaches including two commercial systems. The evaluation is conducted on the FGnet dataset and two large passport datasets, one of which is the largest ever reported for such recognition tasks. Further, exploiting these datasets, they empirically study how age gaps and related issues affect recognition algorithms. They discovered, surprisingly, that the additional difficulty of verification produced by an age difference becomes saturated once the difference is bigger than 4 years, for differences of up to 10 years. Additionally, they found that image quality and eyewear present more of a challenge than facial hair.

Haibin Ling et al. [8] present a study of face recognition as people age. First, they use the gradient orientation pyramid for this task. Discarding gradient magnitude and using a hierarchical technique, they find that the resulting descriptor yields a robust and discriminative representation. With this descriptor, they pose face verification as a two-class problem and use an SVM as the classifier. The procedure is applied to two passport datasets, each containing more than 1,800 picture pairs per subject with large age gaps. Although simple, the strategy beats the previously tested Bayesian technique and other descriptors, including intensity difference and gradient with magnitude, and it performs as well as two commercial systems. Second, for the first time, they empirically study how age differences affect recognition performance. The work demonstrates that, although aging adds difficulty to the recognition task, it does not surpass illumination or expression as a confounding factor.

Simone Bianco et al. [1] describe large age-gap face verification by feature injection in deep networks. They introduce a novel technique for face verification across large age gaps, together with a dataset containing age variations in the wild (the LAG dataset), with images covering face variation from childhood to old age. Fine-tuning of the neural network is performed in a Siamese architecture using a contrastive loss function. A feature injection layer is introduced by the authors to boost verification accuracy.

Jiaji Huang et al. [4] describe a geometry-aware deep transform. The article contains a novel deep learning objective formulation that unifies both classification and metric learning criteria. They give a geometry-aware deep transform to enable a non-linear, discriminative, and robust feature transform, which shows competitive performance on small training sets for both synthetic and real-world data. They further support the proposed framework with a formal (K, epsilon)-robustness analysis.

Dihong Gong et al. [3] describe a maximum entropy feature descriptor for age-invariant face recognition. In this article they design a new technique to reduce the matching problem in age-invariant face recognition. First, a new maximum entropy feature descriptor is developed that encodes the microstructure of a facial picture into a sequence of discrete codes of maximal entropy. By densely sampling the encoded face picture, sufficient discriminative and expressive information can be extracted for further analysis. A new matching technique is additionally designed, known as identity factor analysis (IFA), to estimate the probability that two faces have the same underlying identity. The effectiveness of the framework is confirmed by extensive experiments on two face aging
datasets, MORPH (the biggest public-domain face aging dataset) and FGNET. They additionally perform experiments on the renowned LFW dataset to demonstrate the excellent generalizability of the new technique.

Guosheng Hu et al. [2] describe "When face recognition meets with deep learning: an evaluation of convolutional neural networks for face recognition". The article gives a broad assessment of CNN-based face recognition systems (CNN-FRS) on a common ground to make the work easily reproducible. In particular, they use the public database LFW to train CNNs, unlike most existing CNNs trained on private databases. They propose three CNN architectures, which are the first reported architectures trained using LFW data. The article analyses significant variations of CNN models and quantifies the impact of particular implementation choices, and it verifies numerous useful properties of CNN-FRS. For example, the dimensionality of the learned features can be significantly reduced without adverse impact on face recognition accuracy. In addition, a traditional metric learning technique exploiting CNN-learned features is evaluated. Experiments show that two critical factors for good CNN-FRS performance are the fusion of multiple CNNs and metric learning.

III. AGE ESTIMATION

Age estimation uses feature vectors to judge the age of the subject in a face picture. Its purpose is to model the relationship between real age and the facial feature vector. Here an aging function is used for age estimation. In article [7], the observation that people who look alike tend to age alike was exploited, so an appearance-specific aging function was proposed. We suppose instead that people who look alike may nevertheless age in unequal ways; for instance, the ages of two people with the same appearance might differ a great deal. The style in which a person ages is called his aging way, and we assume that people who age in the same style share the same aging function. A new age estimation strategy is therefore designed which integrates aging functions with aging-way classification.

A. Aging Functions

For a picture, after facial feature extraction, we obtain a pair of shape and texture feature vectors, which are merged to form the facial feature vector. An aging function with polynomial structure is used to model the relationship between the feature vector and age, as given in equation (1):

age = f(b) = … + …   (1)

In equation (1), 'age' is the estimated age, f is the aging function, and b is the feature vector; the remaining terms are the parameter vector and an offset. For the aging function, the parameters are estimated from a set of training samples with known real ages. The aim is to minimize the discrepancy between real ages and estimated ages, and the least-squares criterion is adopted. If one aging function is applied to every test picture, this function is known as the global aging function.

B. Aging Way Classification

The aim of aging-style classification is to sort subjects into particular classes depending on their aging ways. A subject's face picture is classified according to his aging-way classification output. The aging way of a single subject is represented by a vector formed from the estimation errors given by the global aging function and the real ages. For instance, a subject with 3 face pictures at known real ages, with corresponding estimation errors of the global aging function, has an aging-style vector built from those (age, error) pairs; K-means clustering is used for the aging-way classification. The distance between two aging ways is defined as:

… = ∑ … − ∑ …   (2)

In equation (2), e_i is the estimation error given by the global aging function and N is the number of samples. After that, an aging function is fitted for each class. For a test picture with feature vector b, the similarity to the i-th class is computed as:

… = max | … | ,  i = 1, …, …   (3)

where … is the feature vector of a sample in the corresponding class and the maximum is taken over the training samples of that class. The calculated age of the test picture is then:

… = ∑ … / ∑ …   (4)

IV. FACE RECOGNITION ROBUST TO AGE VARIATION

In the following section, a face recognition framework robust to age variation is outlined, as given in Figure 1. The framework contains two noteworthy modules: age simulation and face recognition. In the first module, the faces in the database and the test face are transformed to a single common normalizing age to reduce age variation. The face recognition module carries out the conventional functions of face recognition, for instance feature extraction, similarity sorting, and output of the recognition result.
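The least-squares fit behind the global aging function of Section III (equation (1)) can be sketched as follows; the linear-plus-offset form, the variable names, and the synthetic data are illustrative assumptions rather than the authors' exact polynomial model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic training set: feature vectors b with known real ages
B = rng.normal(size=(50, 4))                  # 50 samples, 4-dim features
w_true = np.array([2.0, -1.0, 0.5, 3.0])
real_ages = B @ w_true + 30.0 + rng.normal(scale=0.1, size=50)

# Global aging function fitted by least squares: age ~ w.b + offset
A = np.hstack([B, np.ones((50, 1))])          # append a column for the offset
params, *_ = np.linalg.lstsq(A, real_ages, rcond=None)

est_ages = A @ params
errors = real_ages - est_ages                 # per-sample estimation errors:
                                              # the raw material that Section
                                              # III.B clusters into aging ways
print(bool(np.abs(errors).mean() < 1.0))      # True: the fit recovers the ages
```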
Fig. 2. Age simulation system

A. Facial Feature Extraction

A face picture contains shape information and texture information; a facial picture can be separated into these two kinds of data, and given both, the facial picture can be reconstructed.

1) Shape and Texture Information Extraction

Shape information is represented by the coordinates of 101 landmarks on the face (Figure 3), and the Active Shape Model [9] is used to extract the landmarks. After the landmarks are located, a triangle-based affine transform is used to warp the face picture to the texture picture at a typical shape. This texture picture is the texture information of the face picture. Commonly the mean shape of a set of faces is used as the typical shape.

REFERENCES

[1] S. Bianco, "Large age-gap face verification by feature injection in deep networks," arXiv preprint arXiv:1602.06149, 2016.
[2] G. Hu et al., "When face recognition meets with deep learning: an evaluation of convolutional neural networks for face recognition," Proceedings of the IEEE International Conference on Computer Vision Workshops, 2015.
[3] D. Gong et al., "A maximum entropy feature descriptor for age invariant face recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[4] J. Huang et al., "Geometry-aware deep transform," Proceedings of the IEEE International Conference on Computer Vision, 2015.
[5] A. Mall et al., "Skin tone based face recognition and training using neural network," IJETAE, ISSN 2250-2459, Vol. 2, Issue 9, pp. 1-5, September 2012.
[6] H. Ling et al., "Face verification across age progression using discriminative methods," IEEE
Transactions on Information Forensics and Security 5.1 (2010): 82-91.
[7] D. Maturana et al., "Face recognition with local binary patterns, spatial pyramid histograms and naive Bayes nearest neighbor classification," International Conference of the Chilean Computer Science Society (SCCC), IEEE, 2009.
[8] H. Ling et al., "A study of face recognition as people age," 2007 IEEE 11th International Conference on Computer Vision, IEEE, 2007.
[9] Y. Rodriguez, "Face detection and verification using local binary patterns," Report LIDIAP-REPORT-2006-022, IDIAP, 2006.
[10] A. K. Jain et al., "Biometrics: a tool for information security," IEEE Transactions on Information Forensics and Security 1.2 (2006): 125-143.
[11] C. Turati et al., "Newborns' face recognition: Role of inner and outer facial features," Child Development 77.2 (2006): 297-311.
[12] F. Cardinaux et al., "User authentication via adapted statistical models of face images," IEEE Transactions on Signal Processing 54.1 (2005): 361-373.
[13] T. Ahonen et al., "Face recognition with local binary patterns," European Conference on Computer Vision, Springer Berlin Heidelberg, 2004.
[14] A. K. Jain et al., "An introduction to biometric recognition," IEEE Transactions on Circuits and Systems for Video Technology 14.1 (2004): 4-20.
[15] W. Zhao et al., "Face recognition: A literature survey," ACM Computing Surveys (CSUR) 35.4 (2003): 399-458.
[16] C. L. Kotropoulos et al., "Frontal face authentication using discriminating grids with morphological feature vectors," IEEE Transactions on Multimedia 2.1 (2000): 14-26.
[17] B. Duc et al., "Face authentication with Gabor information on deformable graphs," IEEE Transactions on Image Processing 8.4 (1999): 504-516.
[18] P. N. Belhumeur et al., "Eigenfaces vs. fisherfaces: Recognition using class specific linear projection," IEEE Transactions on Pattern Analysis and Machine Intelligence 19.7 (1997): 711-720.
[19] L. Wiskott et al., "Face recognition by elastic bunch graph matching," IEEE Transactions on Pattern Analysis and Machine Intelligence 19.7 (1997): 775-779.
[20] S. Gerl et al., "3-D human face recognition by self-organizing matching approach," Pattern Recognition and Image Analysis 7 (1997): 38-46.
[21] R. Brunelli et al., "Person identification using multiple cues," IEEE Transactions on Pattern Analysis and Machine Intelligence 17.10 (1995): 955-966.
[22] R. Campbell et al., "The development of differential use of inner and outer face features in familiar face identification," Journal of Experimental Child Psychology 59.2 (1995): 196-210.
[23] R. Chellappa et al., "Human and machine recognition of faces: A survey," Proceedings of the IEEE 83.5 (1995): 705-741.
[24] N. M. Allinson et al., "Face recognition: combining cognitive psychology and image engineering," Electronics & Communication Engineering Journal 4.5 (1992): 291-300.
[25] R. Brunelli et al., "Face recognition through geometrical features," European Conference on Computer Vision, Springer Berlin Heidelberg, 1992.
[26] M. A. Turk et al., "Face recognition using eigenfaces," Proceedings CVPR '91, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, 1991.
[27] M. Turk et al., "Eigenfaces for recognition," Journal of Cognitive Neuroscience 3 (1991): 71-86.
[28] F. Galton, "Personal identification and description," Nature, 1888.
[29] R. J. Baron, "A bibliography on face recognition," The SISTM Quarterly Incorporating the Brain Theory Newsletter, II(3): 27-36, 1979.
[30] T. Kanade, "Picture processing system by computer complex and recognition of human faces," PhD thesis, Kyoto University, November 1973.
Abstract
Recommender systems give users a pathway to the best solution in their area of interest. They have gained prominence in the fields of Information Technology, e-commerce, etc., inferring personalized recommendations by effectively pruning a universal set of choices so that end users can identify content of interest. A number of multi-criteria decision-support algorithms are available for generating priority-based recommendations, including the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) and the Analytic Hierarchy Process (AHP). This paper focuses mainly on the user-to-item based filtering technique: a comparative study is conducted between TOPSIS and AHP for selecting a mobile phone based on filtering.
Keywords
Introduction
Decision Support Systems are computer-based systems that bring together information from a
variety of sources, assist in the organization and analysis of information and facilitate the
evaluation of assumptions underlying the use of specific models. In other words, these systems
allow decision makers to access relevant data across the organization as they need it to make
choices among alternatives. Most decision-making processes supported by DSS are based on
decision analysis, most commonly multi-criteria decision making (MCDM). MCDM involves
evaluating and combining alternatives' characteristics on two or more criteria or attributes in
order to rank, sort or choose from among the alternatives. Nowadays a smart phone has become
a necessity in everybody's life. Since a new model comes onto the market every day, users get
confused when selecting a mobile phone to buy. To select the most suitable mobile phone
among various alternatives, the decision maker must consider meaningful criteria and possess
special knowledge of the phone specifications. In this study, the evaluation criteria for decision
making are selected from studies in the literature and from discussions with the target audience.
The number of alternatives and conflicting criteria is increasing very rapidly, so robust
evaluation models are crucial in order to incorporate several conflicting criteria effectively.
With its need to trade off multiple criteria, a selection problem such as mobile phone selection
is a multi-criteria decision-making (MCDM) problem. A number of recommender system
algorithms / methods exist for selecting an item using item-based,
user-based and user-item based filtering techniques [1]. The analytic hierarchy process (AHP),
fuzzy multiple-attribute decision-making models, linear 0-1 integer programming, the weighted
average method, genetic algorithms, etc. are some of these methods. A review of [1] motivated
this comparative study between TOPSIS and AHP for selecting an item using a DSS that
recommends the most suitable item based on user reviews. The rest of the paper is organised
as follows: Section II describes the MCDM methods TOPSIS and AHP; Section III provides
the comparison of TOPSIS and AHP; Section IV contains the experimental results obtained,
followed by Section V, which concludes.
MCDM Methods
a. TOPSIS
The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) [2] [3] [4] is
a multi-criteria decision analysis method. It is based on the concept that the chosen alternative
should have the shortest geometric distance from the positive ideal solution (PIS) [5] and the
longest geometric distance from the negative ideal solution (NIS)[5]. It is a method of
compensatory aggregation that compares a set of alternatives by identifying weights for each
criterion, normalising scores for each criterion and calculating the geometric distance between
each alternative and the ideal alternative, which is the best score in each criterion. An
assumption of TOPSIS is that the criteria are monotonically increasing or
decreasing. Normalisation is usually required as the parameters or criteria are often of
incongruous dimensions in multi-criteria problems [6][7] . Compensatory methods such as
TOPSIS allow trade-offs between criteria, where a poor result in one criterion can be negated
by a good result in another criterion. This provides a more realistic form of modelling than
non-compensatory methods, which include or exclude alternative solutions based on hard cut-
offs [8].
Step 1
Create an evaluation matrix consisting of m alternatives and n criteria, with the intersection
of each alternative and criterion given as xij; we therefore have a matrix (xij)m×n.
Step 2
The matrix (xij)m×n is then normalised to form the matrix R = (rij)m×n, using the normalisation
method rij = xij / √( ∑k=1..m xkj² ), for i = 1, 2, …, m and j = 1, 2, …, n.
Step 3
Calculate the weighted normalised decision matrix tij = rij · wj, where wj = Wj / ∑k Wk so
that ∑wj = 1 and Wj is the original weight given to the indicator vj, j = 1, 2, 3, …, n.
Step 4
Determine the worst alternative (Aw) and the best alternative (Ab):
Ab = { maxi tij for benefit criteria; mini tij for cost criteria },
Aw = { mini tij for benefit criteria; maxi tij for cost criteria }, for j = 1, 2, …, n.
Step 5
Calculate the L2-distance between the target alternative i and the worst condition Aw,
diw = √( ∑j=1..n (tij − twj)² ),
and similarly the distance to the best condition Ab,
dib = √( ∑j=1..n (tij − tbj)² ),
where diw and dib are the L2-norm distances from the target alternative i to the worst and best
conditions, respectively.
Step 6
Calculate the similarity to the worst condition, siw = diw / (diw + dib), with 0 ≤ siw ≤ 1:
siw = 1 if and only if the alternative solution has the best condition;
siw = 0 if and only if the alternative solution has the worst condition.
Step 7
Rank the alternatives according to siw (i = 1, 2, …, m); the alternative with the largest siw is preferred.
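The TOPSIS steps above can be sketched compactly in Java. This is a minimal illustration, not the authors' implementation; the class and method names (Topsis, closeness) are our own, and the benefit array marks which criteria are benefit rather than cost criteria.

```java
// Minimal TOPSIS sketch: vector normalisation, weighting, ideal solutions,
// L2 distances and the closeness coefficient s_iw = d_iw / (d_iw + d_ib).
public class Topsis {

    // Step 2: r[i][j] = x[i][j] / sqrt(sum_k x[k][j]^2)
    static double[][] normalise(double[][] x) {
        int m = x.length, n = x[0].length;
        double[][] r = new double[m][n];
        for (int j = 0; j < n; j++) {
            double norm = 0;
            for (int i = 0; i < m; i++) norm += x[i][j] * x[i][j];
            norm = Math.sqrt(norm);
            for (int i = 0; i < m; i++) r[i][j] = x[i][j] / norm;
        }
        return r;
    }

    // Steps 3-6: returns the closeness coefficient of every alternative.
    static double[] closeness(double[][] x, double[] w, boolean[] benefit) {
        int m = x.length, n = x[0].length;
        double[][] r = normalise(x);
        double[][] t = new double[m][n];              // weighted normalised matrix
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++) t[i][j] = r[i][j] * w[j];
        double[] best = new double[n], worst = new double[n];
        for (int j = 0; j < n; j++) {
            double max = Double.NEGATIVE_INFINITY, min = Double.POSITIVE_INFINITY;
            for (int i = 0; i < m; i++) {
                max = Math.max(max, t[i][j]);
                min = Math.min(min, t[i][j]);
            }
            best[j]  = benefit[j] ? max : min;        // A_b
            worst[j] = benefit[j] ? min : max;        // A_w
        }
        double[] s = new double[m];
        for (int i = 0; i < m; i++) {
            double dib = 0, diw = 0;
            for (int j = 0; j < n; j++) {
                dib += (t[i][j] - best[j])  * (t[i][j] - best[j]);
                diw += (t[i][j] - worst[j]) * (t[i][j] - worst[j]);
            }
            s[i] = Math.sqrt(diw) / (Math.sqrt(diw) + Math.sqrt(dib));
        }
        return s;                                     // rank by descending s[i]
    }
}
```

Ranking the alternatives by descending closeness coefficient completes Step 7.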
b. AHP
The analytic hierarchy process (AHP)[9] is a structured technique for organizing and
analysing complex decisions, based on mathematics and psychology. It has particular
application in group decision making, and is used around the world in a wide variety
of decision situations. Rather than prescribing a "correct" decision, the AHP helps decision
makers find one that best suits their goal and their understanding of the problem. It provides a
comprehensive and rational framework for structuring a decision problem, for representing and
quantifying its elements, for relating those elements to overall goals, and for evaluating
alternative solutions.
If A is compared with B for a criterion and the preference value is 3, then the preference value
of comparing B with A is 1/3.
This step is referred to as synthesisation: divide each value in a column by its corresponding
column sum to normalise the preference values.
The row averages of the normalised matrix give the weight of each criterion, indicating which
one is the most important and which one is the least important. For each alternative, the
average (weight) values of the criteria are multiplied by the alternative's respective values and
summed up. Based on the overall score, the alternative with the highest score obtained is the
most recommended one.
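The synthesisation and scoring procedure above can be sketched in Java. This is an illustrative sketch (class and method names Ahp, weights and score are our own), using the column-sum normalisation and row-average approximation of the priority vector described in the text.

```java
// Minimal AHP sketch: derive criterion weights from a reciprocal pairwise
// comparison matrix (synthesisation), then score an alternative.
public class Ahp {

    // Divide each entry by its column sum, then average each row -> weights.
    static double[] weights(double[][] pairwise) {
        int n = pairwise.length;
        double[] colSum = new double[n];
        for (int j = 0; j < n; j++)
            for (int i = 0; i < n; i++) colSum[j] += pairwise[i][j];
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            double rowAvg = 0;
            for (int j = 0; j < n; j++) rowAvg += pairwise[i][j] / colSum[j];
            w[i] = rowAvg / n;        // average of the normalised row; weights sum to 1
        }
        return w;
    }

    // Overall score of one alternative: weighted sum of its per-criterion values.
    static double score(double[] w, double[] alternative) {
        double s = 0;
        for (int j = 0; j < w.length; j++) s += w[j] * alternative[j];
        return s;
    }
}
```

For example, with two criteria where the first is rated 3 times as important as the second (preference 3 one way, 1/3 the other), the derived weights are 0.75 and 0.25.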
The aim of this study is to propose a multi-criteria decision making (MCDM) approach to
evaluate the mobile phone options in respect to the users' preferences order. Firstly, the most
desirable features influencing the choice of a mobile phone are identified. This is realized
through a survey conducted among the target group, the experiences of the telecommunication
sector experts, and studies in the literature. Here, the target group consists of a few students in
our college. Two MCDM methods, namely TOPSIS and AHP, are then used in the evaluation
procedure. First, both algorithms are implemented in the Java programming language.
Second, the values of all the criteria for each alternative are calculated. Finally, the most
efficient algorithm is identified based on execution time.
In this paper, a comparison of three existing mobile phones from renowned companies, namely
iPhone, Samsung and Windows phones, serves to validate the model by testing the
propositions that were developed. First of all, the evaluation criteria considered for the
selection decision are Price, Memory, Camera, Battery and Ease of use; here, price is
considered a benefit factor. The weights of the main criteria are based on the decision makers'
subjective judgments. A pair-wise comparison matrix of the main criteria and the calculation
of the weights are given, followed by the end result and the time taken for execution.
Experimental Results
The following shows the implementation of the TOPSIS and AHP algorithms in the Java
programming language on the Eclipse platform. This makes the calculations easier and
reduces the time needed to reach results.
Figure 1: Giving the required alternatives and criteria as input using Java programming
Figure 4: Distances between the positive and negative ideal solutions and the alternatives,
computed in Java
Figure 5: The closeness coefficients, solution and time taken for execution, computed in Java
Figure 6: Giving the required criteria values as input using Java programming
Figure 7: Giving the required input values for each alternative based on each criterion
Figure 10: Normalized matrices for each alternative based on each criterion
Figure 11: Overall score and overall time taken for execution
Based on the execution, the results show that Samsung is ranked highest by both algorithms.
We also observe that TOPSIS takes only 5 milliseconds to execute, while AHP takes
75 milliseconds. Therefore, based on this comparative study, TOPSIS can be said to generate
results more efficiently and in less time.
Conclusion
The most popular multi-criteria decision-making algorithms, TOPSIS and AHP, become very
complicated and calculation-heavy when there are more than four criteria and alternatives for a
particular problem. Implementing them in Java therefore not only increases the accuracy of the
results but also makes it easy to handle any number of alternatives and criteria. Hence, based
on the comparative study made, the time complexity for TOPSIS is O(n), which is lower than the
time complexity of AHP, O(lg n). Therefore, we conclude that TOPSIS gives results more
efficiently and takes less time to execute.
References
Electronics & Telecommunication
Government Engineering College, Raipur, INDIA.
Abstract- Due to a lack of awareness of the deaf and hard-of-hearing community in India, sign language recognition
is a crucial tool commonly developed for this community. Sign language, built from different sign patterns, is the only
mode of communication between them. The scale invariant feature transform has been used for feature extraction, as its
features are invariant to translation, rotation and scaling. The static images of hand gestures are pre-processed using the
scale invariant feature transform algorithm, and all twenty-six hand gestures (A to Z) are trained with the 130 images
present in the database. This paper shows the matching between an input image and the database images based on the
features extracted using the scale invariant feature transform algorithm. It reports 98.7% matching accuracy, since the
matching is done by automatically varying the threshold and distance ratio.
Key words- Indian Sign Language (ISL), hand gesture, scale invariant feature transform, validity ratio.
I. INTRODUCTION
Detecting and understanding hand and body gestures is becoming a very important and challenging task in computer
vision. The importance of the problem is easily illustrated by the natural gestures we use alongside verbal and
nonverbal communication. The use of hand gestures in support of verbal communication is so deep that they are even
used in communication where people have no visual contact. There are different approaches for recognizing hand
gestures in the engineering community, some of which require wearing marked gloves or attaching extra hardware to
the body of the subject. From the user's point of view these approaches are considered intrusive, and so less feasible
for real-world applications. Additionally, recognition of the shape of the hand is also important in some applications
such as sign language recognition and human-computer interaction.
The sign language used by the deaf in India is mainly learnt through a particular approach. There are varieties of
sign language used in India that can be classified on a regional basis, and many more informal sign languages,
referred to as 'home sign'. Although these sign languages appear to be interrelated, it is difficult to trace the exact
path of development of any one language. Earlier research works have shown the interrelationship of these
languages and that all of them contribute to the development of ISL (Indian Sign Language).
The scale invariant feature transform approach has been used for object, scene and hand-gesture recognition. The
extracted features are invariant and allow reliable matching between various views of an object. They are highly
stable, invariant to image rotation and scaling, and robust to distortion, noise and changes in illumination. The
approach is efficient and can extract a large number of features from a given image [1, 2, 3, 4]. A novel approach
for hand-pose recognition analyses the textures and key geometrical features of the hand; the feature extraction
technique discussed there is more complex because the abduction and movement angles of the fingers and their
internal variations are considered [5]. A hand gesture recognition system has been built to recognize the alphabets
of ISL; the processing time required is high, since it must pass through four different modules [6]. A novel
approach for human-robot interaction (HRI) was suggested, which found its limitations in real-time applications
where different speech and hearing problems are considered [7].
One author developed a Sign Language biometrics system for south Indian languages. The system describes a
set of thirty-two signs, each representing the binary 'UP' and 'DOWN' positions of the five fingers of the right
palm. The system is applicable to Tamil text only and is restricted to the five fingers of a single hand against the
same background [8]. A vision-based multi-feature classifier has been implemented for sign language recognition:
the system first builds a dynamic sign language appearance model, and classification is then done with an SVM.
The experiment was carried out over thirty groups of Chinese finger-alphabet images, and the results proved that
this appearance-modelling technique is simple, efficient and effective for characterizing hand gestures; the system
nonetheless has limitations in real time with dynamic hand gestures [9]. Real-time sign language recognition
using hand gestures has also been proposed: hand gestures are recognized based on Haar features and the K-means
algorithm, which was used to reduce the number of extracted features and thereby the processing complexity. This
technique was restricted to a small database [10]. A unified framework was suggested that simultaneously performs
spatial segmentation, temporal segmentation and recognition. Its performance was evaluated on two challenging
applications: recognition of hand-signed digits gestured by users wearing short-sleeved shirts, and retrieval of
occurrences of signs of interest from a video database containing continuous, non-segmented American Sign
Language (ASL). This method is limited to five signs of ASL [11]. A novel algorithm for hand-gesture image
detection and recognition has been proposed. The recognition process includes two phases: (a) model construction
and (b) sign language identification. This algorithm provides 94% accuracy for hand-gesture recognition, and the
accuracy is adversely affected if the orientation of the hand gesture is skewed or if the image portion from the wrist
to the arm is wrong [12].
A two-level language identification system has been developed using acoustic features. First, the system
identifies the family of the spoken language; then the input is fed to the second level to identify the particular
language within the corresponding family. The system uses hidden Markov models (HMM), Gaussian mixture models (GMM) and
artificial neural networks (ANN). The system could not achieve good accuracy [13]. An effective use of a glove has
been proposed for implementing an interactive sign-language teaching programme. A sign-language glove requires
flex sensors mounted on the fingers of the glove, whose resistance changes according to finger position, which is
difficult for the deaf to grasp [14]. Another paper represented the facial expressions of the signer's face using
probabilistic principal component analysis (PPCA). A test was conducted to recognize six isolated facial expressions
in American Sign Language (ASL); the recognition accuracy reported for ASL facial expressions was 91.76% in
person-dependent tests and 87.71% in person-independent tests. The proposed system is not fully automatic and
therefore achieved lower accuracy [15]. A further proposed system helps the hearing-impaired to communicate more
fluently with normal people. A segmentation method described in that paper is used to separate the right and left
regions of the image frame. The system requires about 3218 mean epochs on average to train the network model,
which exceeds the expected training time; furthermore there was confusion in the first, fourth, eighth and ninth signs,
which greatly reduced the accuracy [16]. Another paper introduced the first automatic Arabic Sign Language (ArSL)
recognition system based on hidden Markov models (HMMs). The system operates in several modes, including
offline, online, signer-dependent and signer-independent modes. Experimental results demonstrated that the system
has a high recognition rate in all modes. It did not rely on data gloves or other such input devices, allowing deaf
signers to perform gestures freely and naturally, but it is limited to Arabic sign language only [17]. Various methods
of feature extraction have been proposed for character recognition; the training of Devnagri character recognition
using HMM and neural networks has also been described, achieving 100 percent recognition on a small
database [18, 19, 20, 21]. Various researchers have made significant contributions in the field of sign languages
from different countries and regions; literature based on different methods and findings with adequate results has
also been presented [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37].
As there is no standard database available on the internet for Indian Sign Language recognition, we searched
during an extensive literature survey and found an image database consisting of twenty-six gestures of Indian Sign
Language (ISL). Figure 1 shows the ISL image available on the internet. As this image is not adequate for hand
gesture recognition, it forced us to generate a new database for ISL corresponding to the images available with us.
To generate this new ISL database we made use of a Nikon Coolpix-L24 14-megapixel digital camera, which gives
a clear image even in the dark. Photographs of the gestures corresponding to all the alphabets of English, i.e. A to Z,
were taken from five different persons of various age groups. In total 130 images were taken, all of the same size,
1024 x 768 pixels. Generally the processing time is very high if the image size is large, so we reduced the size of the
images to a great extent while ensuring no loss of information to be used for sign language recognition. We
therefore converted these images to a standard size of 284 x 215 pixels (Figure 1). Among these 130 images,
seventy-eight were used for training and the remaining fifty-two for testing. Fig. 2 shows the
image database collected as a real-time image database of an individual. This database consists of twenty-six
images of size 284 x 215 pixels. It is further used for recognition, with the images subjected to various
pre-processing and post-processing stages.
Figure 2. Database acquisition for class 1.
IV. METHODOLOGY
The overall ISL recognition system is shown in Figure 3. First of all, we need to initialize the important
parameters for the SIFT (scale invariant feature transform) match algorithm [4]-[7]. The first parameter is the
distance ratio, whose value is set to 0.65; the threshold value is then set to 0.035.
The scale invariant feature transform (SIFT) algorithm was introduced by Lowe and has been used for feature
extraction. It is one of the most widely used algorithms thanks to its stability over image translation, rotation and
scaling, and it is to some extent invariant to changes in illumination and camera viewpoint. The features are well
localized in both the spatial and frequency domains, reducing the likelihood of disruption by occlusion, clutter or
noise. Large numbers of features can be extracted from typical images with efficient algorithms. Additionally, the
features are highly distinctive, which allows a single feature to be correctly matched with high probability against a
large database of features, providing a basis for object and scene recognition. The following are the major stages of
computation used to generate the set of image features (Figure 3).
Let us discuss how the data flows in each phase of the SIFT algorithm. The input to the SIFT algorithm is the set
of N² pixels of an N x N image. Only a small fraction of these pixels typically turn out to be extrema; let 0 < α < 1
be this fraction, so αN² extrema travel to the next, key-point detection, phase. Only a small fraction of these extrema
qualify as key points; let 0 < β < 1 be this fraction, so nominally there are αβN² key points at this stage. Orientation
assignment then re-examines all N² points in the image to see whether any point of great magnitude has been lost;
let a fraction γ of the image pixels qualify as these extra key points. The compute-descriptor phase converts these
points into vectors, which then become features. The number of feature descriptors output by the SIFT algorithm is
nominally (αβ + γ)N² for an N x N image.
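The data-flow argument above can be made concrete with a back-of-envelope calculation. The sketch below is illustrative only: the class name and the values of α, β and γ are our own assumptions, not measurements from the paper.

```java
// Nominal count of SIFT feature descriptors for an N x N image, following
// the data-flow argument: alpha*beta*N^2 key points survive detection, and a
// fraction gamma of the remaining pixels is recovered during orientation
// assignment, giving alpha*beta*N^2 + gamma*(N^2 - alpha*beta*N^2).
public class SiftBudget {
    static long descriptors(int n, double alpha, double beta, double gamma) {
        double pixels = (double) n * n;            // N^2 input pixels
        double keypoints = alpha * beta * pixels;  // survive extrema + key-point tests
        double recovered = gamma * (pixels - keypoints); // re-added by orientation step
        return Math.round(keypoints + recovered);  // ~ (alpha*beta + gamma) * N^2
    }
}
```

For example, with N = 100, α = 0.1, β = 0.5 and γ = 0.01, the nominal descriptor count is 500 + 0.01 · 9500 = 595, close to the (αβ + γ)N² = 600 approximation.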
In this phase the algorithm identifies the points that are stable under image rotation and translation and that are
minimally affected by noise and small distortion. The algorithm computes 'scale', 'difference of Gaussian' and
'extrema' over several 'octaves' [5],[7],[8].
This stage attempts to identify those locations and scales that are identifiable from different views of the same
object. This can be efficiently achieved using a "scale space" function; further, it has been shown under reasonable
assumptions that it should be based on the Gaussian function. Specifically, the difference-of-Gaussian image
D(x, y, σ) is given by
D(x, y, σ) = L(x, y, kσ) − L(x, y, σ)
where L(x, y, kσ) is the convolution of the original image I(x, y) with the Gaussian blur G(x, y, kσ) at scale
kσ, i.e.
L(x, y, σ) = G(x, y, σ) ∗ I(x, y)
where ∗ is the convolution operator, G(x, y, σ) is a variable-scale Gaussian and I(x, y) is the input image. Once
difference-of-Gaussian (DoG) images have been obtained, key points are identified as local minima/maxima of the
DoG images across scales. This is done by comparing each pixel in the DoG images to its eight neighbours at the
same scale, as shown in Figure 4, and to the nine corresponding neighbouring pixels in each of the neighbouring
scales. If the pixel value is the maximum or minimum among all compared pixels, it is selected as a candidate key
point.
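The 26-neighbour extremum test just described can be sketched as follows. This is a minimal illustration under assumed names (DogExtrema, isExtremum); a real implementation would also iterate over octaves and handle image borders.

```java
// A pixel in the middle DoG scale is a candidate key point if it is strictly
// greater (or strictly smaller) than all 26 neighbours in the 3x3x3 cube
// spanning its own scale and the two neighbouring scales.
public class DogExtrema {
    // dog[s][y][x]: stack of DoG images; (s, y, x) must not lie on a border.
    static boolean isExtremum(double[][][] dog, int s, int y, int x) {
        double v = dog[s][y][x];
        boolean isMax = true, isMin = true;
        for (int ds = -1; ds <= 1; ds++)
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++) {
                    if (ds == 0 && dy == 0 && dx == 0) continue;  // skip centre
                    double nb = dog[s + ds][y + dy][x + dx];
                    if (nb >= v) isMax = false;
                    if (nb <= v) isMin = false;
                }
        return isMax || isMin;   // candidate key point
    }
}
```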
To determine the difference of Gaussians we only need two corresponding points, and to check whether a point
is an extremum we only need the twenty-six difference-of-Gaussian points in the
3 scales around it. Thus it is possible to execute these operations over octaves in many different ways. Figure 5
shows the flow chart of the first phase of the SIFT algorithm.
In the first phase the algorithm determines the αN² extrema and then further examines them to obtain the αβN²
key points that ultimately become the key points of the image. In this phase, candidates that lie on an edge of the
image or that correspond to points of low contrast are handled: such points are generally not useful as features,
being unstable over image variation, and hence are rejected. For rejecting low-contrast points, each extremum is
examined using a technique that involves solving a 3 x 3 linear system, which takes constant time. To detect edge
responses, a 2 x 2 matrix is generated and a simple computation performed on it to produce a ratio of principal
curvatures; this quantity is simply compared with a threshold value to decide whether or not the extremum is
rejected [8].
As per the above discussion, once the input image passes through the second phase of the SIFT algorithm, the
SIFT key points detected for both the training and test images are shown by red circles in Figure 6.
Figure 6: SIFT key points for the training (left) and test image (right).
Orientation Assignment
The nominal number of key points at the start of this phase is αβN². This phase adds to the set of key points
on the basis of magnitude and orientation; the magnitude and orientation of every point can be calculated as
follows [4],[7].
Non-key points whose magnitudes are close to the peak magnitude are added as new key points. The total
number of key points at the end of this phase is:
αβN² + γ(N² − αβN²) = αβN²(1 − γ) + γN² ≅ N²(αβ + γ) (5)
So the computation of magnitude and orientation can be done in constant time per point, and subsequently the
total number of key points matched between the training and test images is as shown in Figure 7.
(a) (b)
Figure 7: Matched key points between (a) and (b) after orientation assignment and centralization.
all the values of these histograms. Since there are 4 x 4 = 16 histograms, each with eight bins, the vector has
128 elements. This vector is then normalized to unit length in order to enhance invariance to affine changes in
illumination. To reduce the effects of non-linear illumination a threshold of 0.2 is applied and the vector is
normalized again.
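The normalise-clamp-renormalise step above can be sketched as follows. The class and method names are our own, and the routine works for a vector of any length, though in SIFT the descriptor has 128 elements.

```java
// Descriptor post-processing: normalise to unit length, clamp each entry at
// 0.2 to damp non-linear illumination effects, then renormalise.
public class SiftDescriptor {
    static double[] normalise(double[] v) {
        double[] out = v.clone();
        unit(out);
        for (int i = 0; i < out.length; i++) out[i] = Math.min(out[i], 0.2);
        unit(out);
        return out;
    }

    private static void unit(double[] v) {
        double norm = 0;
        for (double d : v) norm += d * d;
        norm = Math.sqrt(norm);
        if (norm > 0) for (int i = 0; i < v.length; i++) v[i] /= norm;
    }
}
```

For instance, the vector (3, 4) normalises to (0.6, 0.8); both entries are clamped to 0.2 and renormalising yields (1/√2, 1/√2), so a single dominant gradient can no longer overwhelm the descriptor.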
An algorithm for obtaining the validity ratio consists of the following steps:
Step 1: Get the matched key-point data from the SIFT algorithm. Figure 8 shows an example of the key points
extracted from the database image and the input image.
Step 2: Let image 1 be the database image and image 2 be the input image, as shown in Figure 9. Let d11 to d61
be the distance factors of the key points from the centre of image 1, and d12 to d62 be the distance factors of the key
points from the centre of image 2.
Step 3: If the number of matched key points is greater than 2, calculate the distance of the matched key points to
the centre of the key points using the formulas given below:
dT1 = ∑ di1, i = 1, …, M (6)
dT2 = ∑ di2, i = 1, …, M (7)
where dT1 and dT2 are the summations of all the distances for image 1 and image 2 respectively, and M denotes
the number of matched points.
DAR = (distance between the matched key point and the centre of the key points) / (total distance) (8)
The DAR is calculated for both the database image and the input image.
Step 4: Distance masking is necessary in order to determine the similar pattern of the matched key points relative
to the centre of the matched key points. The distance mask is calculated by checking whether the absolute value of
the difference between the two DARs is below the threshold value.
Step 5: The total number of valid points is obtained by summing the distance mask. If the number of matched key
points is not greater than two, the number of valid points is directly zero.
Step 6: Finally, to calculate the validity ratio of the key points, we simply divide the number of valid matched key
points by the number of matched key points. The higher the validity ratio, the more key points of the two images
are matched; the maximum validity ratio gives the matched result.
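Steps 1-6 above can be sketched in Java. The names (ValidityRatio, ratio) and the exact masking rule (absolute difference of the two DARs below the threshold) are our reading of the steps, not the authors' code.

```java
// Validity ratio of matched key points between a database image and an
// input image, per the six steps above.
public class ValidityRatio {
    // d1[i], d2[i]: distance of matched key point i from the centre of the
    // database image and the input image respectively (same ordering).
    static double ratio(double[] d1, double[] d2, double threshold) {
        int m = d1.length;                    // M, the number of matched key points
        if (m <= 2) return 0;                 // Step 5: too few matches -> zero valid points
        double t1 = 0, t2 = 0;                // dT1, dT2: total distances (eqs. 6, 7)
        for (int i = 0; i < m; i++) { t1 += d1[i]; t2 += d2[i]; }
        int valid = 0;
        for (int i = 0; i < m; i++) {
            double dar1 = d1[i] / t1;         // DAR for the database image (eq. 8)
            double dar2 = d2[i] / t2;         // DAR for the input image
            if (Math.abs(dar1 - dar2) < threshold) valid++;   // Step 4: distance mask
        }
        return (double) valid / m;            // Step 6: validity ratio
    }
}
```

With identical distance patterns the ratio is 1; the more the two patterns disagree, the more matches fail the mask and the lower the ratio.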
V. RESULT
Recognition is performed by first matching each key point independently against the database of key points
extracted from the training images. Many of these initial matches will be incorrect due to ambiguous features or
features that arise from background clutter. Therefore, clusters of at least three features are first
502 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
known that agree on AN object and its cause, as these clusters have a far higher likelihood of being correct than
individual feature matches. The careful geometric fitting of every cluster’s results accustomed settle for or reject the
interpretation. When applying these higher than six steps for validity quantitative relation, we tend to get 3 matching
result for a check image. As per the result shown in Figure 10, the key points of check image are matched with
information image numbers four, 23 and 24. It’s been ascertained that seven matches are found for image four, 3
matches are found for image twenty three and thirty 2 matches are found for image twenty four of the information.
The amount twenty four corresponds to character ‘X’ within the alphabet. Therefore recognizing character is ‘X’.
The combined result for these 3 characters is shown in Figure 11.
Figure 11: Combined matching result for images 4, 23 and 24 of the database.
As stated earlier, in the first test the value of the important parameter, the distance ratio, is 0.65 and the threshold value is kept at 0.035. The algorithm therefore gives us the three outputs whose key points are most closely matched with the key points of the input image. During the second test the important parameter is again the distance ratio, which is now incremented by 0.5, while the threshold value is decremented by 0.05. At the end of the second test, the algorithm gives us the final output.
503 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
It has been observed that in the first test the key points of images number 4, 23 and 24 are most likely matched with the key points of the input image. We have also shown the combined result for all the matched points. Therefore, to distinguish them properly, we consider the individual results of each image shown in Table 1; the equivalent bar-chart representation is shown in Figure 12. Table 1 shows the individual results of the images that are closely matched with the test image. The test image is 'X' in Figure 11. The algorithm initially gives the three best results, and out of these three the final result is selected depending upon the x-y position, the number of matched points, the number of valid matched points, and the validity ratio. Table 1 shows that images number 4, 23 and 24 are matched with the test image (last column). Out of these three images it has been observed that row 3 has 32 matched points equal to those of the test image, 30 of which are valid matched points, and its validity ratio is again near unity. Row 3 also shows that the y-positions of the two images are almost equal (98.708 and 99.6). So the algorithm finally gives us the matched result, i.e. image 24 (character 'X'). A similar interpretation is given in Figure 13.
Figure 12: Bar chart results for (a) image 4 (b) image 23 and (c) image 24.
Figure 12 presents a bar-chart description of the results, which are explained as follows:
504 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
Figure 14 provides a detailed discussion of the matching between the input image and the database images. As mentioned earlier, image 4 corresponds to gesture D, image 23 corresponds to gesture W and image 24 corresponds to gesture X. The input image is of gesture X. The first row shown in Figure 13 provides information about the X-Y positions of the input image and the database image. Bars 2 and 4 of image 24 are nearly equal, indicating that the X-Y positions of the input image and the database image are the same; the X-Y positions of the other two images were different. Similarly, column 2 provides information about the number of matched key points and valid key points between the input image and each possibly matched image. Bars 5 and 6 of image 24 show that this image has 32 key points matched with the input image, of which 30 are valid key points. Finally, the validity ratio, shown by bars 7 and 8, is highest for image 24 compared to the other two. This shows that the input gesture X is most likely matched with image 24 present in the database. To yield the final output, the system automatically increments the distance ratio by 0.5 and decrements the threshold value by 0.03. This process of checking improves the accuracy of matching.
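The two-test scheme described above can be sketched as follows; match_fn is a hypothetical scoring function standing in for the SIFT matching and validity-ratio computation, and the parameter updates use the values quoted in the text:

```python
def two_test_match(candidates, match_fn, dist_ratio=0.65, threshold=0.035):
    """First test: score every database image and keep the three best.
    Second test: rescore those three with the distance ratio incremented
    by 0.5 and the threshold decremented by 0.03; keep the single best."""
    first = {c: match_fn(c, dist_ratio, threshold) for c in candidates}
    top3 = sorted(first, key=first.get, reverse=True)[:3]
    return max(top3, key=lambda c: match_fn(c, dist_ratio + 0.5,
                                            threshold - 0.03))
```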
Bars 5 and 6: matched key points = 7 and valid key points = 5 for image 4; matched key points = 3 and valid key points = 1 for image 23; matched key points = 32 and valid key points = 30 for image 24. The numbers of matched key points and valid key points are greater for image 24 than for the other two images.

Bars 7 and 8: validity ratio = 0.7143 for image number 4; validity ratio = 0.3333 for image number 23; validity ratio = 0.9375 for image number 24. The validity ratio is highest for image 24; the matched character corresponding to number 24 is 'X'.
Conclusion

This paper describes the implementation of an Indian Sign Language recognition system using the Scale Invariant Feature Transform (SIFT). SIFT offers features that are stable under translation, rotation and scaling, and that remain stable against a cluttered background. The input image to be recognized is fed to the system; the system acquires the SIFT features of the input image and compares these features with the images present in the database. During the first test, the system gives the three best results that are most likely to match the input image for constant values of the threshold and the distance ratio. Since the system returns three outputs, the algorithm must go through a second test. During the second test, the algorithm automatically increases the distance ratio by 0.5 and decrements the threshold value by 0.03, yielding the final result. This method gives an accuracy of 98.7% for the ISL database.
Abstract—Query answering performance in a DW is highly improved by the bitmap index. The efficiency of complex query processing is greatly increased through the use of Boolean operations (AND, OR, NOT, etc.) in the selection predicate on multiple indexes, and the bitmap index uses storage space efficiently for attributes with low cardinality. Space and time are the most important metrics for evaluating the performance of an index; in the case of the bitmap index, the focus is on a time-optimal index. In this paper, we propose an algorithm, called multi Bitmap Index (M-BI), to increase query performance and reduce response time using Boolean operations (AND, OR). A bitmap index has been implemented on a group of tables, and then on multiple columns in a table, so that a query on any two columns of the table can be retrieved at the same time, swapping the columns depending on the value of cardinality. This study shows that the proposed method is better than existing bitmap indexing techniques on a single column and a single table. One result showed that query access time with multi Bitmap Index M-BI selection was 4 milliseconds (ms), while the same queries selected directly from the DW took 38 milliseconds. This shows that query performance through bitmap index access is better than direct access through the DW by 892.10%.

Keywords: Data Warehouse, indexing, Bitmap index, query processing.

I. INTRODUCTION

A data warehouse (DW) stores large historic data collected from multiple heterogeneous data sources to support complex queries for strategic decision making. These complex queries require aggregated data and want results to be realized in a minimum response time; high response time results from running queries on the DW [1]. In a DW, speeding up query processing is considered a sensitive issue. To solve this problem, mechanisms like summary tables and indexes can be used. However, although summary tables perform well for predefined queries, indexing is a better solution for saving space and time during query processing without the use of additional hardware. The challenge is to find a suitable type of index that would improve the performance of a query. To speed up processing, some relational database management systems implement new indexing techniques, such as bitmap indexing. Bitmap indexes have a specific structure for quick data retrieval [2]. Due to the low cost of both maintenance and construction, the bitmap index structure is used effectively in DWs: it minimizes the query response time and consequently increases query answering efficiency [3].

II. BACKGROUND

In this section, we present some previous studies on the efficiency of the bitmap index in the data warehouse, along with the algorithms used in them that are close and related to our study.

• In 2012, Hamad, M. M., and Abdul-Raheem, M. achieved a number of results: the performance of query answering in data warehouses is highly improved by the bitmap index, and using bitwise operations (AND, OR) greatly increases the efficiency of complex query processing. A prototype data warehouse, "STUDENTS DW", was built according to Inmon's data warehouse conditions; this prototype holds students' information [3].
• In 2014, Kausar, Firdous, Sh Odah Al Beladi and Kholoud AL Shammari offered a comparison and analysis of related work from past resources concerning bitmap indexing for data warehouses. The research aims to analyze and compare a number of bitmap indexing techniques for data warehouses: Scatter Bitmap Index Optimization, Enhanced Encoded Bitmap Index, Evaluating the Iceberg Query through the Compressed Bitmap Index, and Determining the Bitmap Join Index through Data Mining Techniques. The comparison is in terms of the strengths, weaknesses, differences and similarities of each technique. Finally, the research gives a helpful explanation of the importance of the DW and the main related techniques to enhance DW bitmap indexing [4].

• In 2016, Garhwani, Chiranjeev D., Shreya S. Kandekar and Payal S. Chirde presented a comprehensive review on processing large data sets. Set predicates combined in a group allow selection of dynamically formed groups and set values. They presented a compressed-bitmap-index-based approach using variable-length coding to process large datasets. They observed that the bitmap index has the following benefits: saving disk access by avoiding tuple scans on tables with many attributes, and reducing computation time by conducting bitwise operations [5].

From the above, we conclude that query optimization is the ultimate goal of enhancing the performance of the bitmap index. Time and space are the most important metrics for evaluating index performance. The bitmap index structure is used effectively in DWs due to its minimal cost of construction and maintenance. The bitmap index structure increases query answering efficiency by minimizing query response time, using Boolean operations (AND, OR, NOT) in the selection predicate on multiple indexes, and uses storage space efficiently for attributes with low cardinality.

III. CONCEPT IN BITMAP INDEX

For read-only or read-mostly data, the bitmap index is considered one of the most efficient indexing methods available for speeding up multi-dimensional range queries, while traditional tree-based indexing structures are designed for datasets that change frequently over time [6]. Bitmap indexes improve complex query performance by applying low-cost Boolean operations such as OR, AND and NOT in the selection predicate on multiple indexes at one time, reducing the search space before going to the primary source data [7]. Each column has its own factors, which are the criteria for choosing a proper index. These factors are explained below:

• Distribution: The column distribution is the frequency of occurrence of each distinct value of the column. The distribution of a column guides the determination of which indexing technique should be adopted [2].

• Value range: The range of values of an indexed column guides the selection of an appropriate index technique. For example, if the range of a high-cardinality column is small, then a bitmap-based indexing technique should be used [2].

• Cardinality: The cardinality of a column is the number of different values in that column. It is better to know whether the cardinality of an indexed column is high or low, since an indexing technique may work efficiently only with either high or low cardinality; e.g., the bitmap index only works well with low-cardinality data [2].

• Read-only attributes: An attribute with stable values, once stored in the DW, will not be changed or modified. This means that the current values are fixed and new values are appended to them. The read-only/mostly attribute is a key factor in choosing a column to be indexed, since modifying a value in an indexed column means rebuilding the entire index, which consumes time and space and causes inefficient performance for the DBMS [3].

This paper chooses cardinality as the indexing factor; we depend on the concepts of distribution and value range to select which technique is suitable according to the value of cardinality, and so the bitmap index technique is chosen. In our work, we have stable data that will not change; in this case the index of these data will also be stable, and when we add new data, we need to add new values to the existing index without changing or modifying the original one, depending on the concept of read-only attributes.
A. Features of Selecting an Appropriate Bitmap Index [2]

When choosing a bitmap indexing technique, there are many features that need to be taken into consideration:

• The index should take small space and utilize that space efficiently.
• The index should be able to operate with other indexes to filter out records before accessing the original data.
• The index should support complex and ad-hoc queries, and speed up join operations.
• The index should be easy to build (dynamically generated), implement and maintain.

B. Basic Bitmap Index

For an attribute with c distinct values, the basic bitmap index generates c bitmaps with R bits each, where R is the number of records (rows) in the data set. If the attribute in a record has a specific value, the corresponding bit in the bitmap is set to "1"; otherwise the bit is "0". That is, a bit with value 1 indicates that a particular row has the value represented by that bitmap. Figure 1 shows a simple bitmap index with 4 bitmaps (B0-B3) [6].

IV. THE PROBLEM

For decision-makers, the data warehouse system has recently become more and more important. Most queries against a large data warehouse are iterative and complex. In the data warehouse environment, the ability to answer these queries efficiently is considered a critical issue. The performance of queries, especially ad-hoc queries, would be greatly improved if the right index structures were built on the columns. In this paper, we propose an algorithm to enhance query performance and decrease the response time using Boolean operations (AND, OR), called multi Bitmap Index (M-BI).

V. FILE STRUCTURE OF A COMPANY SYSTEM DW

At this stage, the first stage of our proposed system, we built the data warehouse tables (a company system) in the SQL Server 2012 environment, and these tables were filled with a large number of records; creating these tables was based on the requirements of any company system. Clients is where all the information about the clients is saved; Suppliers is the same but for suppliers; Invoices fall into two types, client invoices and supplier invoices, each of which has two tables, a main table and a detailed one, e.g. (Invoices) and (Invoices-Details). After creating the company system data warehouse, it becomes ready for importing into the Visual Basic.NET 2013 environment for completing the next steps of the proposed system.

VI. BITMAP INDEX DESIGN

In this section, we propose an algorithm to enhance query performance and decrease the response time with bitmap indexing for the data warehouse; we examine how query processing is affected by the bitmap index, and study the efficiency of the query answering produced by the bitmap index. Figure 2 shows the block diagram that describes the overall work.

A. The Proposed System

In this section, we demonstrate and explain the main steps (phases) of the proposed design of the Multi Bitmap Indexes algorithm (M-BI). Figure 3 shows the main phases of the proposed system.
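The basic bitmap index described in Section III-B (c bitmaps of R bits each for an attribute with c distinct values) can be sketched in a few lines; the example column is illustrative, not data from the paper:

```python
def build_bitmap_index(values):
    """Build one bitmap of R bits per distinct attribute value:
    bit `row` of bitmap B_v is 1 iff record `row` holds value v."""
    index = {}
    for row, v in enumerate(values):
        # setdefault allocates a fresh all-zero bitmap for a new value
        index.setdefault(v, [0] * len(values))[row] = 1
    return index

# Four records with three distinct values -> three 4-bit bitmaps.
colour = ["red", "blue", "red", "green"]
index = build_bitmap_index(colour)
# index["red"] is [1, 0, 1, 0]: rows 0 and 2 hold "red".
```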
C. Cardinality of a Suggested DW

In this section, the term "cardinality" is explained and implemented practically. As previously mentioned, cardinality refers to the number of distinct values in a column. In our project, we design a tool that calculates the cardinality of each column. The first step is creating a function in the database named "Cardinality", whose task is to calculate the cardinality for a given column name.
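The counting that such a "Cardinality" function performs reduces to the number of distinct values in a column; a minimal sketch over hypothetical records (the table and its contents are invented for illustration):

```python
def cardinality(rows, column):
    """Cardinality of a column: the number of distinct values in it."""
    return len({row[column] for row in rows})

# Hypothetical records standing in for one DW table.
clients = [
    {"Name": "Ali",  "City": "Baghdad"},
    {"Name": "Sara", "City": "Basra"},
    {"Name": "Omar", "City": "Baghdad"},
]
# cardinality(clients, "City") is 2; cardinality(clients, "Name") is 3.
```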
several tables (Warehouse Supplying Invoice, Invoices, Supplying Invoice Details and Clients). We choose a table and then calculate the cardinality of each column of the chosen table according to the cardinality algorithm. After that, we retrieve a query on any two columns in that table at the same time, swapping the columns depending on the value of cardinality. When the M-BI algorithm was applied, the outputs showed a decrease in query response time, which proves the efficiency of the proposed algorithm. We performed experiments to test the response time of the bitmap index. The experiments applied to the project consist of three steps, as follows.

Step one: search any table, choose any column, and compare the search using the bitmap index with the search without it. A search is done on three groups of records: 105000, 450000, and 750000.

SELECT * FROM [TBL-CLIENTS]
where [Email] = '[email protected]'

SELECT * FROM [TBL-INVOICE-DETAILS]
where [Quantity] = 3

Step two: search any table, choose any two columns, and compare the search using the bitmap index with the search without it, using Boolean operations (AND, OR). A search is done on three groups of records: 105000, 450000, and 750000.

SELECT * FROM [TBL-CLIENTS]
where ([Email] = '[email protected]' OR [Name] = 'Ali Saeb Razzaq')

SELECT * FROM [TBL-INVOICE-DETAILS]
where ([Rto] = 7283 AND [Quantity] = 3)

SELECT * FROM [TBL-SUPPINVOICE-DETAILS]
where ([Rto] = 6350 AND [Quantity] = 84)

TABLE II: Response Time of Query Answering in Step Two of Algorithm
Step three: search any table and choose any two columns with Boolean operations (AND, OR), where the columns are swapped under the proposed algorithm based on the cardinality values, and compare with the case in which the columns are not swapped. A search is done on three groups of records: 105000, 450000, and 750000, as shown in Table IV.

TABLE IV: Response Time of Query Answering in Step Three of Algorithm
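The bitmap evaluation timed in these experiments can be sketched as follows. The table contents and values are hypothetical stand-ins for [TBL-INVOICE-DETAILS]; note that the column swap by cardinality only changes the order in which bitmaps are fetched and combined, not the bitwise result:

```python
def build_bitmap_index(values):
    """One bitmap of R bits per distinct value (Section III-B)."""
    index = {}
    for row, v in enumerate(values):
        index.setdefault(v, [0] * len(values))[row] = 1
    return index

# Hypothetical stand-ins for the Rto and Quantity columns.
rto      = [7283, 6350, 7283, 7283, 9999]
quantity = [3,    84,   3,    7,    3]
idx_rto, idx_qty = build_bitmap_index(rto), build_bitmap_index(quantity)

def bitmap_and(bm1, bm2):
    """Row numbers satisfying both predicates, via bitwise AND."""
    return [i for i, (a, b) in enumerate(zip(bm1, bm2)) if a & b]

# WHERE Rto = 7283 AND Quantity = 3  ->  rows 0 and 2
rows = bitmap_and(idx_rto[7283], idx_qty[3])
```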
and space in order to deal with updates and deletions of the data source, and to account for the cardinality of multiple columns by considering the updated data records without re-computing the whole process.

2. Increasing the number of bitwise operations decreases the query response time; this leads to improved query performance.

3. Bitmap indexes provide better performance when queries use combinations of multiple conditions with OR/AND operators.

4. Bitmap index efficiency increases with the number of records.

5. The bitmap index works more efficiently when the columns are swapped depending on the value of cardinality.

X. CONCLUSIONS AND FUTURE WORK

Complex and interactive queries are often found in the data warehouse environment. They consume a long time to access, find and retrieve answers. Space and time are the most important metrics to evaluate the performance of an index, and in the case of the bitmap index the focus was on a time-optimal index. Several bitmap indexes have been introduced, aiming to reduce the space requirement and improve query processing time. In this paper, we proposed multi Bitmap Index M-BI: an efficient algorithm to improve query performance and decrease the response time of the bitmap index in a DW. Results obtained during this work indicate that query optimization is achieved through decreasing the query response time. Our future work will explore techniques to choose other structures (e.g. join indexes and materialized views) for database design in addition to indexes.

REFERENCES
[1] Bhosale P., et al. "Efficient Indexing Techniques On Data Warehouse." International Journal of Scientific & Engineering Research, Volume 4, Issue 5, May 2013.
[2] Abdulhadi, Zainab Qays, Zhang Zuping, and Hamed Ibrahim Housien. "Bitmap Index as Effective Indexing for Low Cardinality Column in Data Warehouse." International Journal of Computer Applications, 2013.
[3] Hamad, Murtadha M., and Muhammed Abdul-Raheem. "Evaluation Of Bitmap Index Using Prototype Data Warehouse." International Journal of Computers & Technology, Volume 2, No. 2, April 2012.
[4] Kausar, Firdous, Sh Odah Al Beladi, and Kholoud AL Shammari. "Comparative Analysis of Bitmap Indexing Techniques in Data Warehouse." International Journal of Emerging Technology and Advanced Engineering, 2014.
[5] Garhwani, Chiranjeev D., Shreya S. Kandekar, and Payal S. Chirde. "A Review on Query Processing and Optimization in SQL with different Indexing Techniques." (IJ) of Advanced Research in Computer and Communication Engineering, 2016.
[6] Mei, Ying, Kaifan Ji, and Feng Wang. "A Survey on Bitmap Index Technologies for Large-Scale Data Retrieval." Intelligent Networks and Intelligent Systems, 2013 6th International Conference on. IEEE, 2013.
[7] Guadalupe Canahuate, Tan Apaydin. "Secondary Bitmap Indexes with Vertical and Horizontal Partitioning." ACM EDBT 2009, March 24-26, 2009, Saint Petersburg, Russia.
clones and preventing them from reoccurring is important [69].

Generally, copy-paste activities result in clones. This practice can help developers improve productivity, especially in the case of device drivers that look the same for many devices [72]. There are other reasons for duplicate code, such as coding style and performance [13]. Accidental cloning is another phenomenon, caused by the use of the same API while implementing protocols [3]. In the literature many other reasons for duplicate code were found, as explored in [8], [13], [47], [48], [62], and [75].

The research found in the literature also shows that there are serious problems caused by code cloning in real-world software systems [5], [22], [8], [13], [28], [47], [48], [63], [62], and [75]. In the presence of clones, system functionality is not affected, but they cause maintenance issues and development becomes very expensive [81]. Clones have a negative impact on the evolution of software systems [33], [32], [31] in terms of comprehensibility, maintainability and quality. Update anomalies increase with code cloning [12]. Duplicate code also causes issues when finding and fixing bugs. Missing procedural abstraction and missing inheritance are other problems when there is too much cloning in a system [28]. From the literature it is understood that the financial impact of cloning is very high. After delivery of a system, the cost of maintenance is estimated at 40% to 70% of the total cost [36]. In existing systems a significant amount of code is duplicated [57], [72], and [43]. According to Baker's research [8], in large systems approximately 13% - 20% of code is duplicated. According to Lague et al. [69], with respect to function clones, clones are reported at 6.4% to 7.5%, while Baxter et al. [13] opined that 12.7% of the code in software systems is in the form of clones.

Industrial source code, according to Mayrand et al. [75], contains clones at 5% to 20%. In the case of large software systems, as per Kapser & Godfrey [49], clones are at 10% to 15%. A COBOL system with object-oriented code was found to have 50% duplicated code [28]. From this it is understood that there is a huge amount of duplicated code and that it causes maintenance issues. Fortunately, many researchers contributed mechanisms for finding software clones [7], [9], and [13]. Nevertheless, there is no single definition of code cloning. Many terms are used, such as identical program fragment [13], duplicate code [28], clone [68] and so on.

Many researchers, as found in [13], [48], [75], [62], [11], [7], [28], [68], [59], contributed mechanisms to detect clones, and there are refactoring approaches to remove code clones [10], [30], [40], [29], [61]. However, refactoring clones was not found to be a perfect solution, though it can improve the quality of code. Sometimes, even refactoring may not improve quality [57]. Cordy researched this issue and found that in some financial systems it is not advisable to refactor the clones, in the context of risk management [23]. Another point of view in the literature is that clones are useful in some scenarios [54], [55]. However, there is overall support for the fact that
| Properties | Baker [7], [8] | Johnson [45], [102] | Ducasse et al. [28], [103] | Marcus & Maletic [104] |
|---|---|---|---|---|
| Transformation of code | Removes white spaces and comments | White spaces and comments are removed | Applies syntactical transformation after removing white spaces and comments | Tokens and delimiters are removed |
| Representation of code | Token strings are parameterized | Sub-strings are represented as fingerprints | Effective sequence of lines | Text appears like natural language |
| Technique for comparison | Employed sub-trees for token matching | String matching based on Karp-Rabin fingerprinting | Employed a method named Dynamic Pattern Matching | A method based on graph theory |
| Complexity in the method | O(n+m); n denotes input lines and m denotes # of matches found | Not Available | O(n^2); n represents input lines | Not Available |
| Granularity of comparison | Tokens in a line | Sub-strings in code | Line in code | Paragraphs, sentences and words |
| Granularity of code | Based on threshold with minimum 15 lines | Based on free threshold with minimum 50 lines | Based on free threshold and longest matches | Code segments and functions |
| Similarity of clone | Parameterized and exact matches | Exact matches that are repeated | Only exact matches | Near miss or exact ADT |
| Language independence | A lexer is required | Does not need lexer/parser | A lexer is required | Only source code text is considered |
| Types of Output | Clone class or clone pair in textual format | - | Clone pair in the form of text | Clone pair in the form of text |
| Refactoring of clone | Human intervention is required | Human intervention is required | Human intervention is required | Human intervention is required |

Table 2 – String based clone detection techniques
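The text-based family surveyed above can be illustrated with a short sketch. The following is a simplified, hypothetical version of line-based duplicate detection in the spirit of the Ducasse et al. approach: lines are normalized (white space and end-of-line comments removed) and equal normalized lines are matched via a hash table; it omits the dynamic pattern matching the real technique uses.

```python
# Simplified line-based clone detection sketch (in the spirit of the
# text-based techniques in Table 2; not any tool's actual algorithm).
import re
from collections import defaultdict

def normalize(line):
    """Drop end-of-line comments and collapse white space, as the
    surveyed text-based tools do in their transformation step."""
    line = re.sub(r"//.*|#.*", "", line)   # strip end-of-line comments
    return re.sub(r"\s+", "", line)        # remove all white space

def find_duplicate_lines(source_a, source_b):
    """Return pairs (line_in_a, line_in_b) whose normalized text matches."""
    buckets = defaultdict(list)
    for i, line in enumerate(source_a.splitlines()):
        key = normalize(line)
        if key:
            buckets[key].append(i)
    pairs = []
    for j, line in enumerate(source_b.splitlines()):
        key = normalize(line)
        for i in buckets.get(key, []):
            pairs.append((i, j))
    return pairs

a = "x = x + 1  # increment\ny = 0\n"
b = "y=0\nx=x+1\n"
print(find_duplicate_lines(a, b))  # → [(1, 0), (0, 1)]
```

A real detector would then stitch runs of adjacent matched lines into clone pairs that meet the length threshold shown in the table.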
| Properties | (tool 1) | (tool 2) | (tool 3) |
|---|---|---|---|
| Technique for comparison | Token matching based on suffix-tree | Token matching based on suffix-tree | Employed a technique named frequent subsequence mining |
| Complexity | O(n+m); n denotes input lines and m denotes # of matches | O(n); n denotes length of source code | O(n^2); n denotes lines of code |
| Granularity in cloning | Based on free threshold with minimum of 15 lines | Based on free threshold with minimum of 30 tokens | Based on free threshold pertaining to functions and basic blocks |
| Similarity in cloning | Parameterized and exact matches | Near miss or exact matches with possible gaps | Near miss or exact matches with possible gaps |
| Language independence | A lexer is required | Lexer with transformation rules are required | A full parser is required |
| Types of Output | Clone class and clone pair in textual format | Clone class and clone pair in textual format | Clone pair |
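The "parameterized" similarity that the token-based techniques above report can be sketched in a few lines. The fragment below is a hypothetical illustration only: it replaces identifiers and literals with positional parameter symbols so that fragments differing only in naming compare equal; real tools then search the token stream with suffix trees rather than comparing whole fragments.

```python
# Parameterized token matching sketch (illustrative of the token-based
# techniques above; real tools use suffix trees for sub-quadratic search).
import re

TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def parameterize(code):
    """Tokenize and replace identifiers/literals with positional
    parameters, so fragments that differ only in naming get the same
    token string (a 'parameterized clone')."""
    names = {}
    out = []
    for tok in TOKEN.findall(code):
        if re.fullmatch(r"[A-Za-z_]\w*|\d+", tok):
            out.append("P%d" % names.setdefault(tok, len(names)))
        else:
            out.append(tok)
    return tuple(out)

# Two fragments that are parameterized clones of each other.
f1 = "total = total + price"
f2 = "sum = sum + cost"
print(parameterize(f1) == parameterize(f2))  # → True
```

Note that the mapping is positional, so consistent renaming matches while inconsistent renaming (e.g. `a + b` versus `a + a`) does not.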
| Properties | (tool 1) | (tool 2) | (tool 3) | (tool 4) |
|---|---|---|---|---|
| Granularity in cloning | Threshold based free tree similarity | Segment or gram free | Free threshold based (5 statements usually) | Threshold based or fixed |
| Similarity in cloning | Near miss and exact | Parameterized and exact | Near miss and exact | Near miss with gap and exact |
| Properties | (tool 1) | (tool 2) | (tool 3) | (tool 4) |
|---|---|---|---|---|
| Transformation of code | Feature vectors are constructed | To AST and then IRL | HTML tags are extracted and composite tags are extracted | eMetrics are used to identify function clones based on name similarity |
| Representation of code | Represented as AST | Represented as parse tree | Represented as AST in XML | Represented as AST in XML |
| Technique for comparison | Employed tree matching technique | Employed tree matching technique | Employed frequent item set concept | Graph theory is employed |
| Complexity | O(N); N denotes AST nodes | O(S1 S2); S1 denotes # of nodes of first tree, S2 denotes # of nodes of second tree | O(k n^2); n denotes statements part of clones, k denotes clones with maximal size | Not Available |
| Granularity in comparison | Granularity is made using AST node level | Tree node or token is the granularity | One line of content | AST node is the granularity |
| Granularity in cloning | Threshold based free tree similarity | Segment or gram or free | Threshold based or five statements | Threshold based or fixed |
| Similarity in cloning | Near miss or exact | Near miss and exact | Parameterized and exact | Near miss with gap and exact |
[23] J.R. Cordy. Comprehending reality: Practical challenges to software maintenance automation. In Proceedings

[29] Richard Fanta, Vaclav Rajlich. Removing Clones from the Code. Journal of Software Maintenance:
[33] Reto Geiger. Evolution Impact of Code Clones. Diploma Thesis, University of Zurich, October 2005.

[34] Reto Geiger, Beat Fluri, Harald C. Gall and Martin Pinzger. Relation of code clones and change couplings. In Proceedings of the 9th International Conference on Fundamental Approaches to Software Engineering (FASE'06), pp. 411-425, Vienna, Austria, March 2006.

[35] M.W. Godfrey, D. Svetinovic, and Q. Tu. Evolution, growth, and cloning in Linux: A case study. In CASCON workshop on Detecting duplicated and near duplicated structures in large software systems: Methods and applications, October 2000.

[36] Penny Grubb, and Armstrong A. Takang. Software Maintenance Concepts and Practice. 2nd edn. World Scientific (2003).

[37] Yoshiki Higo, Toshihiro Kamiya,

[39] Yoshiki Higo, Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue. ARIES: Refactoring Support Environment based on Code Clone Analysis. In Proceedings of the 8th IASTED International Conference on Software Engineering and Applications, Cambridge, MA, USA, November 2004.

[40] Yoshiki Higo, Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue. Refactoring Support Based on Code Clone Analysis. In Proceedings of the 5th International Conference on Product Focused Software Process Improvement (PROFES'04), pp. 220-233, Kansai Science City, Japan, April 2004.

[41] Zhenming Jiang, and Ahmed E. Hassan. A Framework for Studying Clones in Large Software Systems. In Proceedings of the Seventh IEEE International Working Conference on Source Code Analysis and Manipulation
Proceedings of the 1993 Conference of the Centre for Advanced Studies Conference (CASCON'93), pp. 171-183, Toronto, Canada, October 1993.

[46] J. Howard Johnson. Visualizing textual redundancy in legacy source. In Proceedings of the 1994 Conference of the Centre for Advanced Studies on Collaborative research (CASCON'94), pp. 171-183, Toronto, Canada, 1994.

[51] Cory Kapser, and Michael Godfrey. Toward a taxonomy of clones in source code: A case study. In Proceedings of the Conference on Evolution of Large Scale Industrial Software Architectures (ELISA '03), pp. 67-78, Amsterdam, The Netherlands, September 2003.

[52] Cory Kapser, and Michael Godfrey. Aiding Comprehension of Cloning Through Categorization. In Proceedings of the 7th International
[55] Cory Kapser and Michael W. Godfrey. "Cloning Considered Harmful" Considered Harmful: A case study of the positive and negative effects. Empirical Software Engineering (invited for publication), 2007.

[56] Miryung Kim, Lawrence Bergman, Tessa Lau, David Notkin. An Ethnographic Study of Copy and Paste Programming Practices in OOPL. In Proceedings of the 3rd International ACM-IEEE Symposium on Empirical Software Engineering (ISESE'04), pp. 83-92, Redondo Beach, CA, USA, August 2004.

[60] Raghavan Komondoor and Susan Horwitz. Using Slicing to Identify Duplication in Source Code. In Proceedings of the 8th International Symposium on Static Analysis (SAS'01), Vol. LNCS 2126, pp. 40-56, Paris, France, July 2001.

[61] Raghavan Komondoor and Susan Horwitz. Effective, Automatic Procedure Extraction. In Proceedings of the 11th IEEE International Workshop on Program Comprehension (IWPC'03), pp. 33-42, Portland, Oregon, USA, May 2003.
Abstract— There is a great need to improve the performance and capacity of mobile communication systems to fulfill the ever-increasing demand for bandwidth. The main problems affecting mobile communications and limiting their capacity and performance are multi-path fading, co-channel interference (CCI) and inter-symbol interference (ISI). To overcome such problems, Orthogonal Frequency Division Multiplexing (OFDM) and Adaptive Antenna Arrays (AAA) are used to increase the overall performance. In this paper, simulation is used to investigate the behavior of the pre-FFT and post-FFT beamformers for an OFDM system; two algorithms were considered: Least Mean Squares (LMS) and Minimum Bit Error Rate (MBER). The simulation results show that binary phase shift keying (BPSK) signaling based on the MBER technique utilizes the antenna array elements more efficiently than the LMS technique in both the pre-FFT and post-FFT cases; the best performance was obtained in the case of MBER with post-FFT.

Index Terms— MBER, OFDM, Pre-FFT, Post-FFT, MMSE, Beamforming, Smart Antenna.

I. INTRODUCTION

Orthogonal Frequency Division Multiplexing (OFDM) is an efficient technique for high-speed communications. It is used over severe multipath fading channels, especially where the delay spread is larger than the symbol duration [1-15]. OFDM offers high bit rate transmission because it is less affected by ISI resulting from multipath fading channels.

Also, the use of AAA is efficient to suppress CCI by forming a beam pattern that keeps the desired signal, using spatial processing techniques. Adaptive beamforming can separate transmitted signals on the same carrier frequency, provided that they are separated in the spatial domain. The beamforming processing combines the signals received by the different elements of an antenna array to form a single output. In addition, antenna arrays can mitigate the effect of ISI and relax the design of the channel equalizer [1].

In OFDM systems, it is possible to apply AAA beamforming in the time domain (Pre-FFT) or the frequency domain (Post-FFT). Pre-FFT processing is done in the time domain, which results in fewer computations because only one FFT operation is required; however, there is a slight performance degradation. On the other hand, Post-FFT provides better performance; however, more complex computations are required because spatial processing of individual subcarriers is performed by applying the FFT operation to the received signal of each antenna [1]. In [3], pre-FFT least mean square (LMS) beamforming for OFDM systems was analyzed in an additive Gaussian noise channel. Adaptive MBER beamforming was analyzed in [5] for single carrier modulation and in [6] for OFDM systems in an additive Gaussian noise channel. A class of MBER algorithms was studied in [5] and combined with space-time coding in [7].

In earlier papers, a lot of work was done on Pre-FFT as it is computationally efficient compared to Post-FFT. In some papers, Post-FFT was investigated [9], but it resulted in more computations due to the selection of adaptation algorithms used to calculate the weight vector. In [10] and [11] the MMSE and MBER beamformers for Pre-FFT and Post-FFT OFDM are studied, respectively, without investigating several factors affecting performance, such as the antenna array and angle spread. Pre-FFT and Post-FFT methods were investigated with LMS in a selective fading channel in [8]. Since wireless standards such as IEEE 802.11 and 802.16 use pilot subcarriers in their structures, our focus in this paper will be on suppressing CCI and mitigating multipath interference in pilot-assisted OFDM systems. In this paper, we considered the following factors affecting the performance of both Pre-FFT and Post-FFT beamformers: the number of antennas, the power of the noise and interferences, and the presence of frequency selective channels in addition to directional interferences and angle spread, all of which affect the array performance.

In addition, we analyzed the MBER algorithm in a practical channel model in both Pre-FFT and Post-FFT methods and compared it with the LMS algorithm. Our main focus was on Post-FFT using the MBER algorithm, which is less complex than LMS; this in turn results in a less complex overall system and good performance.

The remainder of this paper is organized as follows: Section II describes the system model and beamforming schemes. Section III describes the adaptive algorithms. Sections IV and V clarify the computational complexity and the convergence rate for the simulated system, respectively. In Section VI simulation results are provided. Finally, conclusions and possible directions for future work are presented in Section VII.
This data is interpreted as frequency-domain data in an OFDM system and is subsequently converted to a time-domain signal by an IFFT operation [8]. The output of the IFFT is transmitted to the channel after the addition of a cyclic prefix (CP). This process can be written as

y_m = \frac{1}{\sqrt{K}} F^H x_m, \quad 1 \le m \le M   (1)

where

y_m = [y_m(1), y_m(2), \ldots, y_m(K)]^T   (2)

F = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & e^{-j\frac{2\pi}{K}(1)(1)} & \cdots & e^{-j\frac{2\pi}{K}(1)(K-1)} \\ \vdots & & \ddots & \vdots \\ 1 & e^{-j\frac{2\pi}{K}(K-1)(1)} & \cdots & e^{-j\frac{2\pi}{K}(K-1)(K-1)} \end{bmatrix}   (3)

representing the FFT operation matrix,

x_m = [x_m(1), x_m(2), \ldots, x_m(K)]^T   (4)

and H denotes the Hermitian transpose of a matrix. In order to add the CP, y_m is cyclically extended to generate \tilde{y}_m by inserting the last v elements of y_m at its beginning, i.e.

\tilde{y}_m = \begin{bmatrix} J_v \\ I_K \end{bmatrix} y_m   (5)

where J_v contains the last v rows of the size-K identity matrix I_K.

[Fig. 1: Block diagram of the OFDM transmitter: signal mapping, IFFT and CP addition, followed by the channel with interference and noise, and the OFDM receiver with antenna signals r_1(k), ..., r_P(k).]

where \alpha_{m,l} denotes a complex random number representing the l-th channel coefficient of the m-th source and \delta(\cdot) is the delta function. The signal at the receiver of an OFDM system is described by equation (7), where the CP is assumed to be longer than the channel length (v > L); thus, the received signal on the p-th antenna of a Uniform Linear Array (ULA) for one OFDM symbol can be written as:

r_p(k) = \sum_{m=1}^{M} \sum_{l=0}^{L-1} \alpha_{m,l} \, \tilde{y}_m(k + v - l) \, e^{-j\frac{2\pi}{\lambda}(p-1) d \cos(\theta_{m,l})} + \eta_p(k), \quad 1 \le p \le P, \; 1 \le k \le K   (7)

where \eta_p(k) represents the channel noise entering the p-th antenna and \theta_{m,l} denotes the direction of arrival (DOA) of the l-th path of the m-th source. Without loss of generality, we have assumed that the channels of all sources have the same length L.

B. PRE-FFT BEAMFORMING:

MMSE is implemented by sending known pilot symbols and comparing them at the receiver with their known values to generate an error signal; this error signal is used to correct errors in the data bits received on the same channel. If there are a total of Q pilot symbols in every OFDM symbol, then we define two K x 1 vectors d_q and Z_q such that the k-th element of d_q is zero if k is a data subcarrier and is the known pilot value if k is a pilot subcarrier [8]. Similarly, the k-th element of Z_q is zero if k is a data subcarrier and is the received pilot value if k is a pilot subcarrier. Therefore, the error signal in the frequency domain is given by:

E_q = d_q - Z_q   (8)

This error signal must be converted to the time domain for the weight adjustment algorithm. Therefore,

e = \frac{1}{\sqrt{K}} F^H E_q   (9)

where e is the vector of error samples in the time domain,

e = [e(1)\; e(2)\; \cdots\; e(K)]^T   (10)

Consequently, the Pre-FFT weights are updated as shown later in the Adaptation Algorithms section of this paper.
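Equations (1)-(5) amount to a unitary IFFT followed by a cyclic-prefix copy, which can be sketched in a few lines of NumPy. The subcarrier count K, CP length v and the BPSK symbols below are illustrative choices, not the paper's simulation parameters.

```python
# OFDM symbol generation per Eqs. (1)-(5): unitary IFFT, then cyclic prefix.
import numpy as np

def ofdm_modulate(x, v):
    """x: K frequency-domain symbols; v: cyclic prefix length."""
    K = len(x)
    # y = (1/sqrt(K)) F^H x, the unitary IFFT of Eq. (1).
    # numpy's ifft scales by 1/K, so rescale to 1/sqrt(K).
    y = np.fft.ifft(x) * np.sqrt(K)
    # Eq. (5): prepend the last v samples of y as the CP.
    return np.concatenate([y[-v:], y])

K, v = 16, 4
rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=K) + 0j     # BPSK on K subcarriers
y_cp = ofdm_modulate(x, v)
assert len(y_cp) == K + v
# Receiver side: drop the CP, apply the unitary FFT, recover the symbols.
x_hat = np.fft.fft(y_cp[v:]) / np.sqrt(K)
print(np.allclose(x_hat, x))  # → True
```

Because F is unitary under the 1/sqrt(K) scaling, the FFT at the receiver inverts the IFFT exactly; the CP only buys circular-convolution structure against a dispersive channel, as Eq. (7) assumes with v > L.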
C. Post-FFT Beamforming:

It should be noted that the number of subcarriers is 64 in post-FFT compared to 16 carriers in pre-FFT, thus resulting in more computations in post-FFT. As shown in Fig. 2 (block diagram of the Post-FFT beamforming), the received time-domain signal of each antenna is first converted to the frequency domain, then beamforming is performed on each subcarrier [1],[8]. If R_{m,k} denotes the m-th subcarrier of the k-th antenna, then the (frequency-domain) output signal of the m-th subcarrier is given by:

Y(m) = \sum_{k=1}^{K} w^*_{m,k} R_{m,k}, \quad 1 \le m \le N   (11)

In Eq. (11), w_{m,k} represents the weight associated with R_{m,k}. In Fig. 2 one weight is applied to every subcarrier, because we assume that all subcarriers are pilots. Since there exist only a few pilots in each OFDM block, every group of adjacent data subcarriers is clustered under one pilot symbol and the weight of that pilot symbol is applied to all data subcarriers in the cluster [1],[8].

[Fig. 2: Block diagram of frequency-domain (post-FFT) beamforming: each antenna signal r_k(n) is A/D converted and passed through an FFT, per-subcarrier weights w_{m,k} are applied, and a pilot separator feeds the weight control and update block.]

III. ADAPTIVE ALGORITHMS

The adaptive beamforming algorithms are used to update the weight vectors periodically to track the signal source in a time-varying environment by adaptively modifying the system's antenna pattern so that nulls are generated in the directions of the interference sources.

A. Least Mean Square (LMS) Algorithm:

The LMS algorithm is a method of stochastically implementing the steepest descent algorithm. Successive corrections to the weight vector in the direction of the negative of the gradient vector eventually lead to the Minimum Mean Square Error (MMSE), at which point the weight vector assumes its optimum value. The equations employed are:

W(k) = W(k-1) + 2\mu \, r(k) \, e^*(k), \quad 1 \le k \le K   (12)

where \mu is the step size parameter and * represents the complex conjugate. The last update at the end of each OFDM block, W(K), is used as the initial value of the next block. The mean square error increases as the step size increases and decreases as the step size decreases.

B. Minimum Bit Error Rate (MBER) Algorithm

The block diagram of the Pre-FFT beamforming is shown in Fig. 3; the estimate of the transmitted bit b_1(k) is given by

\hat{b}_1(k) = \begin{cases} +1, & \mathrm{Re}(\hat{z}(k)) \ge 0 \\ -1, & \mathrm{Re}(\hat{z}(k)) < 0 \end{cases}   (13)

where Re(\hat{z}(k)) denotes the real part of \hat{z}(k).

[Fig. 3: Block diagram of the Pre-FFT OFDM adaptive receiver: each antenna signal r_p(k) is down-converted and A/D converted, weighted by w_p^*, summed, passed through the FFT, and the spatial weight control and update block adapts the weights.]

In this section, Pre-FFT adaptive beamforming based on the MBER criterion is introduced to obtain the optimum weight set. The theoretical MBER solution for the Pre-FFT OFDM beamformer is obtained in [4-7], where the channel is assumed to be non-dispersive with additive Gaussian noise. The error probability (BER cost function) of the frequency-domain signal of the beamformer is given by:

P_E(W) = \mathrm{Prob}\{\mathrm{sgn}(b_1(k)) \, \mathrm{Re}(\hat{z}(k)) < 0\}   (14)

where sgn(·) is the sign function. The weight vector that minimizes the BER is then defined as

W = \arg\min_W P_E(W)   (15)

From equation (14), define the signed decision variable

\hat{z}_s(k) = \mathrm{sgn}(b_1(k)) \, \mathrm{Re}(\hat{z}(k)) = \mathrm{sgn}(b_1(k)) \, \mathrm{Re}(\bar{z}(k)) + \eta'(k)   (16)

where

\hat{z}(k) = W^H [\bar{r}(k) + \eta(k)] F(k)   (17)

and

\eta'(k) = \mathrm{sgn}(b_1(k)) \, \mathrm{Re}(W^H \eta(k) F(k))   (18)

\hat{z}_s(k) is the error indicator for the binary decision: when it is positive the decision is correct, else an error occurred. F(k) is the k-th column of F. Notice that F is a unitary
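A minimal numerical sketch of the block-wise LMS recursion of Eq. (12) is given below. The array geometry, source angle, pilot sequence, noise level and step size are hypothetical stand-ins rather than the paper's simulation setup, and the snapshot model is narrowband; only the weight update W(k) = W(k-1) + 2 mu r(k) e*(k) is taken from the text.

```python
# LMS beamformer sketch per Eq. (12): W(k) = W(k-1) + 2*mu*r(k)*conj(e(k)).
import numpy as np

rng = np.random.default_rng(0)
P, K, mu = 4, 500, 0.01            # antennas, training samples, step size

# Hypothetical setup: BPSK pilots arriving from a known direction (ULA,
# half-wavelength spacing), plus complex Gaussian noise.
theta = np.deg2rad(70)
steer = np.exp(-1j * np.pi * np.arange(P) * np.cos(theta))
b = rng.choice([-1.0, 1.0], size=K)                 # known pilot bits
r = np.outer(steer, b) + 0.1 * (rng.standard_normal((P, K))
                                + 1j * rng.standard_normal((P, K)))

W = np.zeros(P, dtype=complex)
for k in range(K):
    z = W.conj() @ r[:, k]                 # beamformer output W^H r(k)
    e = b[k] - z                           # error against the known pilot
    W = W + 2 * mu * r[:, k] * np.conj(e)  # Eq. (12)

# After training, hard decisions should match the pilots.
z = W.conj() @ r
print(np.mean(np.sign(z.real) == b))  # close to 1.0 after convergence
```

The recursion is stable here because 2*mu times the snapshot energy is well below one; as the text notes, a larger step size raises the residual mean square error while a smaller one lowers it at the cost of slower convergence.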
matrix, so \eta'(k) is still Gaussian with zero mean and variance \sigma_n^2 W^H W.

The conditional probability density function (pdf), given the channel coefficients \alpha_{m,l}, of the error indicator \hat{z}_s(k) is a mixed sum of Gaussian distributions [5], i.e.,

p_z(\hat{z}_s) = \frac{1}{K \sqrt{2\pi}\, \sigma_n \sqrt{W^H W}} \sum_{k=1}^{K} \exp\!\left( -\frac{(\hat{z}_s - \mathrm{sgn}(b_1(k))\, \mathrm{Re}(\bar{z}(k)))^2}{2 \sigma_n^2 W^H W} \right)   (19)

This is a good indicator of the beamformer's BER performance, because deriving a closed form for the average error probability is not an easy process. Therefore, we use the gradient of the conditional error probability to update the weight vector. The conditional error probability of the beamformer, given the channel coefficients \alpha_{m,l}, is given by [4-7]:

P_E(W) = \frac{1}{K} \sum_{k=1}^{K} \frac{1}{\sqrt{2\pi}} \int_{q_k(W)}^{\infty} \exp\!\left(-\frac{u^2}{2}\right) du = \frac{1}{K} \sum_{k=1}^{K} Q(q_k(W))   (20)

where

u = \frac{\hat{z}_s - \mathrm{sgn}(b_1(k))\, \mathrm{Re}(\bar{z}(k))}{\sigma_n \sqrt{W^H W}}   (21)

Q(\cdot) is the Gaussian error function, and

q_k(W) = \frac{\mathrm{sgn}(b_1(k))\, \mathrm{Re}(\bar{z}(k))}{\sigma_n \sqrt{W^H W}}   (22)

In an OFDM system, it is assumed that there are pilot signals in every symbol for channel estimation [1]-[3]. The pilot signals are also used to adaptively update the weight vector of the beamformer. The transmitted pilot signal vector of the desired user, x_{1p}, and the received pilot signal vector \hat{z}_p in the frequency domain can be written as follows:

x_{1p} = [x_1(1), 0, \ldots, x_1(\Delta p + 1), 0, \ldots, x_1((K_p - 1)\Delta p + 1), 0, \ldots]   (23)

\hat{z}_p = [\hat{z}(1), 0, \ldots, \hat{z}(\Delta p + 1), 0, \ldots, \hat{z}((K_p - 1)\Delta p + 1), 0, \ldots] = W^H \bar{R} F_p

where

F_p = \begin{bmatrix} 1 & 0 & \cdots & 1 & 0 \\ 0 & e^{-j 2\pi (1)(\Delta p)/K} & \cdots & e^{-j 2\pi (1)((K_p - 1)\Delta p)/K} & 0 \\ \vdots & & & & \vdots \\ 1 & 0 & e^{-j 2\pi (K-1)(\Delta p)/K} & \cdots & e^{-j 2\pi (K-1)((K_p - 1)\Delta p)/K} & 0 \end{bmatrix}   (24)

The method of approximating a conditional pdf known as a kernel density or Parzen window-based estimate [5-7] is used to estimate the conditional error probability, given the channel coefficients \alpha_{m,l}, in OFDM systems. Given a block of K_p training samples \{r(k), b_1(k)\}, a kernel density estimate of the conditional pdf given the channel coefficients \alpha_{m,l} at the pilot locations is given by

\hat{p}(\hat{z}) = \frac{1}{K_p \sqrt{2\pi}\, \rho_n \sqrt{W^H W}} \sum_{k=0}^{K_p - 1} \exp\!\left( -\frac{(\hat{z} - \hat{z}(k))^2}{2 \rho_n^2 W^H W} \right)   (25)

where the kernel width \rho_n is related to the noise standard deviation \sigma_n. From this estimated pdf, the estimated BER is given by:

\hat{P}_E(W) = \frac{1}{K_p} \sum_{k=0}^{K_p - 1} Q(\hat{q}_k(W))   (26)

where

\hat{q}_k(W) = \frac{\mathrm{sgn}(b_1(k \Delta p + 1))\, \mathrm{Re}(W^H \bar{R} F_p(k \Delta p + 1))}{\rho_n \sqrt{W^H W}}   (27)

and F_p(k \Delta p + 1) is the (k \Delta p + 1)-th column of F_p. From this estimated conditional pdf, given the channel coefficients \alpha_{m,l}, the gradient of the estimated BER is given by [4-7]:

\nabla \hat{P}_E(W) = -\frac{1}{2 K_p \sqrt{2\pi}\, \rho_n \sqrt{W^H W}} \sum_{k=0}^{K_p - 1} \exp\!\left( -\frac{(\mathrm{Re}(\hat{z}(k \Delta p + 1)))^2}{2 \rho_n^2 W^H W} \right) \mathrm{sgn}(b_1(k \Delta p + 1))\, \bar{R} F_p(k \Delta p + 1)   (28)

Now a block-data adaptive MBER algorithm is obtained from the gradient of \hat{P}_E(W). For each OFDM symbol, we can find the optimum weight vector W by the steepest-descent gradient algorithm [5]:

\nabla P_E(W) = -\frac{1}{\sqrt{2\pi}\, \rho_n} \exp\!\left( -\frac{(\mathrm{Re}(\hat{z}(k \Delta p + 1)))^2}{2 \rho_n^2} \right) \mathrm{sgn}(b_1(k \Delta p + 1))\, \bar{R} F_p(k \Delta p + 1), \quad 1 \le k \le K_p   (29)

That is to say, the weight vector W can be updated K_p times in one OFDM symbol. Thus the complexity is reduced and, consequently, the update equation is given by

W(k+1) = W(k) - \mu \nabla P_E(W) = W(k) + \mu \frac{1}{\sqrt{2\pi}\, \rho_n} \exp\!\left( -\frac{(\mathrm{Re}(\hat{z}(k \Delta p + 1)))^2}{2 \rho_n^2} \right) \mathrm{sgn}(b_1(k \Delta p + 1))\, \bar{R} F_p(k \Delta p + 1), \quad 1 \le k \le K_p   (30)

where \mu is a step size.
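A small numerical sketch of the sample-by-sample update of Eqs. (29)-(30) for the BPSK case might look as follows. The array response, kernel width, step size and pilot data are hypothetical, and the received-signal model is a simplified narrowband stand-in for the paper's Pre-FFT pilot chain; the sketch keeps only the structure of the update (a Gaussian-kernel weight on each pilot's error direction) plus the normalization step of Table I.

```python
# Sample-by-sample MBER-style weight update per Eqs. (29)-(30), BPSK case.
import numpy as np

rng = np.random.default_rng(1)
P, Kp, mu, rho = 4, 64, 0.05, 0.3   # antennas, pilots, step, kernel width

# Hypothetical pilots: BPSK bits from a known direction plus noise.
theta = np.deg2rad(70)
steer = np.exp(-1j * np.pi * np.arange(P) * np.cos(theta))
b = rng.choice([-1.0, 1.0], size=Kp)            # pilot bits b_1(k)
R = np.outer(steer, b) + 0.2 * (rng.standard_normal((P, Kp))
                                + 1j * rng.standard_normal((P, Kp)))

W = 0.01 * np.ones(P, dtype=complex)            # initialization as in Table I
for k in range(Kp):
    z = W.conj() @ R[:, k]                      # beamformer output
    # Eq. (30): Gaussian kernel weight on this pilot's contribution.
    g = np.exp(-z.real**2 / (2 * rho**2)) / (np.sqrt(2 * np.pi) * rho)
    W = W + mu * g * np.sign(b[k]) * R[:, k]
    W = W / np.linalg.norm(W)                   # normalize, as in Table I

decisions = np.sign((W.conj() @ R).real)        # Eq. (13) detector
print(np.mean(decisions == b))  # close to 1.0 in this easy scenario
```

Note the characteristic MBER behavior: the kernel term g is large only when the decision variable is near the decision boundary, so samples that are already decided confidently contribute almost nothing to the update.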
Table I. MBER algorithm summary.
Initialization: i = 1, \mu = 0.01, block size K = 64; calculate the noise variance \sigma_n; initial weight vector W = 0.01 * ones(N, 1).
Outer loop (1 : floor(all bits / block size)):
  Form a block of data from the received signals.
  Inner loop (while k <= K_p):
    Calculate the gradient matrix over the block from equation (29).
    Update the weight matrix as W(k) = W(k-1) - \mu(k) \nabla \hat{P}_E from equation (30).
    Normalize the solution: W(k+1) = W(k+1) / ||W(k+1)||.
  End of inner loop.
  Determine the detected signals in order to be used for calculating the BER.
  Increment the block number: i = i + 1.
End of outer loop.

V. CONVERGENCE RATE

VI. SIMULATION RESULTS

In this section, simulations are conducted to evaluate the performance of the proposed adaptive beamforming for the LMS and MBER algorithms in a variety of channel conditions. We assumed a perfectly synchronized OFDM system with a CP length larger than the channel length, 64 subcarriers (pilot + data), BPSK modulation, one desired source and two interferences with equal powers. The desired and interference sources were placed at 70°, 20°, and 120°, respectively. We further assumed normalized channels with different lengths and real coefficients of 0.864, 0.435, 0.253 and 0 for all sources, and an angle spread of 15° [1,8]. Pilots were assumed to be distributed uniformly in the OFDM block and the first subcarrier in every cluster was taken as a pilot.

Fig. 5 shows a comparison similar to that in the previous figure, but for SIR = -3 dB. Also in this case, the performance of the post-FFT is better than the pre-FFT, and it is observed that the
BER of the post-FFT MBER beamformer is superior to that of LMS under moderate SNR.

[Array pattern plots (gain in dB versus observation angle, 0°-180°) comparing Pre-FFT-LMS and Pre-FFT-MBER.]

all paths of all sources. If this condition is not met, the performance of the pre-FFT will degrade [1,8]. Fig. 8 shows the effect of the number of antennas on the BER. Note that the MBER algorithm results in a lower BER.

Fig. 9 shows that the pre-FFT scheme has better results with wider angle spread, while post-FFT exhibits better performance with narrower angle spread. This is shown for a system with 8 antennas and SNR = 10 dB; the performance of MBER was better than that of LMS [1,8].
[Figure: average BER versus number of OFDM symbols at SNR = 0, 4, 6, 8 and 10 dB, and MSE (dB) versus number of OFDM symbols; post-FFT convergence performance of MBER at different SNR.]

Fig. (10.A, 10.B). Convergence performance at different SNR.

Fig. 12. MSE plots of the two schemes when the post-FFT performance is better than the pre-FFT.

Fig. 12 shows the MSE curves of the pre-FFT and post-FFT schemes at SNR = 25 dB. Interestingly, while the MSE curve of the post-FFT scheme is better than that of the pre-FFT scheme, its convergence …

[Figure: magnitudes of the beamformer weights (|weights|).]

VII. CONCLUSION

In this paper, we studied the MBER beamformer for pre-FFT and post-FFT OFDM adaptive antenna arrays. A multipath (frequency-selective fading) channel model is considered, and the MBER and LMS algorithms are compared. The MBER beamformer has advantageous characteristics such as better BER performance, lower computational complexity and shorter training. We compared the post-FFT and pre-FFT performance; our results show that the post-FFT scheme achieves better BER but requires more computations, resulting in a more complex system. We considered different cases regarding the number of antennas, the angle separation and the channel paths.

Future work will cover a combined system of post-FFT and pre-FFT, and the complexity and performance of such a system.
[6] Lingyan Fan, Haibin Zhang and Chen He, "Minimum bit error beamforming for Pre-FFT OFDM adaptive antenna array", IEEE International Conference, 25-28 Sept. 2005.
[7] Said Elnoubi, Waleed Abdallah, Mohamed M. M. Omar, "Minimum bit error rate beamforming combined with space-time block coding", The International Conference on Communications and Information Technology (ICCIT 2011), Aqaba, Jordan, March 2011.
[8] S. R. Seydnejad, and S. Akhzari, “Performance Evaluation of Pre-FFT
Beamforming Methods in Pilot -Assisted SIMO-OFDM systems”,
Telecommunication Systems, Springer Science and Business Media,
March 2015.
[9] S. Hara, M. Budsabathon, and Y. Hara, "A Pre-FFT OFDM adaptive antenna array with eigenvector combining", 2004 IEEE International Conference on Communications, vol. 4, pp. 2412-2416, June 2004.
[10] M. S. Heakle, M. A. Mangoud, and S. Elnoubi “LMS Beamforming
Using Pre and Post-FFT Processing for OFDM Communication
Systems,” Proc. of the 24th National Radio & Science Conference
(NRSC 2007) Mar. 2007.
[11] A. M. Mahros, I. Elzahaby, M. M. Tharwat, S. Elnoubi, “Beamforming
processing for OFDM communication systems,” Proc. of INCT 2012,
Istanbul, TURKEY, Oct. 2012.
[12] 3GPP, Release 8 V0.0.3, "Overview of 3GPP Release 8: Summary of all Release 8 Features", November 2008.
[13] M. Hsieh and C. Wei, "Channel Estimation for OFDM Systems based on Comb-Type Pilot arrangement in frequency selective fading channels", IEEE Transactions on Consumer Electronics, vol. 44, no. 1, pp. 217-225, Feb. 1998.
[14] 3GPP TS 36.211-Physical Channels and Modulation, 3GPP Technical
Specification, Rev. 8.9.0, 2009.
[15] M. Morelli and U. Mengali, “A Comparison of Pilot-aided Channel
Estimation Methods for OFDM Systems,” IEEE Transactions on Signal
Processing, vol. 49, pp.3065–3073, December 2001.
Assistant Professor
Abstract:
Cloud computing is a metaphor for connected network elements rendering services over the Internet. The cloud can also store big data. Data can be accessed anywhere at any time, and data need never be lost. Python is an interpreted, object-oriented language that is dynamic in nature. It is widely accepted for rapid application development. Its syntax is simple and its readability is high, so its maintenance cost is low, and Python code is easy to debug. This research paper therefore uses Python to design cloud storage.
Keywords:
Introduction:
Clouds provide unified object storage for organizations with large amounts of data for data analytics and for data archiving. Cloud storage is multi-regional in nature, with high availability and high QPS (queries per second). This research paper uses Python to design cloud storage.
1. … or the boto auth plugin framework. It provides OAuth 2.0 credentials that can be used
with Cloud Storage.
Setup for the boto library and the oauth2 plugin depends on the system you are using;
use the setup examples below as guidance. These commands install pip and then use pip
to install the other packages. The last three commands test importing the two modules
to verify the installation.
wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo apt-get update
wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo yum install gcc openssl-devel python-devel python-setuptools libffi-devel
sudo pip install virtualenv
virtualenv venv
source ./venv/bin/activate
(venv) pip install gcs-oauth2-boto-plugin
(venv) python
>>>import boto
>>>import gcs_oauth2_boto_plugin
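The two import checks above can be automated; a small helper (hypothetical, not part of the tutorial) reports which modules failed to import, so an install script can fail fast:

```python
import importlib

def missing_modules(names):
    # Try to import each module; return the names that are not installed.
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# For this tutorial you would check ['boto', 'gcs_oauth2_boto_plugin'].
print(missing_modules(['os', 'definitely_not_installed_xyz']))
```

An empty list means every module imported cleanly, matching the manual `>>> import` test shown above.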
You can configure your boto configuration file to use service account or user account
credentials. Service account credentials are the preferred type of credential to use when
authenticating on behalf of a service or application. User account credentials are the
preferred type of credentials for authenticating requests on behalf of a specific user (i.e., a
human). For more information about these two credential types, see Supported Credential
Types.
1. Use an existing service account or create a new one, and download the associated
private key.
4. Click the drop-down box below Service account, then click New service
account.
5. Enter a name for the service account in Name.
6. Use the default Service account ID or generate a different one.
7. Select the Key type: JSON or P12.
8. Click Create.
A Service account created window is displayed and the private key for
the Key type you selected is downloaded automatically. If you selected a
P12 key, the private key's password ("notasecret") is displayed.
9. Click Close.
You need to get the private key in PKCS12 format. By default, when you create a
new key, the JSON format of the private key is downloaded.
2. Configure the .boto file with the service account. You can do this with gsutil:
gsutil config -e
The command will prompt you for the service account email address and the
location of the service account private key (.p12). Be sure to have the private key
on the computer where you are running the gsutil command.
3. If you don't already have a .boto file create one. You can do this with gsutil.
gsutil config
The credential's associated ID and secret are displayed. You can also view
these later after the key is created.
5. Click OK.
5. Edit the .boto file. In the [OAuth2] section, specify the client_id and client_secret
values with the ones you generated.
6. Run the gsutil config command again to generate a refresh token based on the
client ID and secret you entered.
If you get an error message that indicates the .boto cannot be backed up, remove
or rename the backup configuration file .boto.bak.
1. Set the client_id and the client_secret in the .boto config file. This is the
recommended option, and it is required for using gsutil with your new
.boto config file.
2. Set environment variables OAUTH2_CLIENT_ID and
OAUTH2_CLIENT_SECRET.
3. Use the SetFallbackClientIdAndSecret function as shown in the examples
below.
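The three options above form a precedence chain: the .boto config wins, then the environment variables, then the fallback passed to SetFallbackClientIdAndSecret. A minimal sketch of that ordering (hypothetical helper written for illustration; gsutil and the plugin implement this internally):

```python
import os

def resolve_oauth2_client_id(boto_config, environ=None, fallback=None):
    # Precedence mirrors options 1-3 above: the [OAuth2] section of the
    # .boto file, then the OAUTH2_CLIENT_ID environment variable, then
    # the fallback value.
    if environ is None:
        environ = os.environ
    return (boto_config.get('client_id')
            or environ.get('OAUTH2_CLIENT_ID')
            or fallback)
```

For example, a client_id set in the .boto file is used even when the environment variable is also set.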
To start this tutorial, use your favorite text editor to create a new Python file. Then, add the
following directives, import statements, configuration, and constant assignments shown.
Note that in the code here, we use the SetFallbackClientIdAndSecret function as a fallback for
generating refresh tokens. See Using application credentials for other ways to specify a fallback.
If you are using a service account to authenticate, you do not need to include the fallback logic.
#!/usr/bin/python
import boto
import gcs_oauth2_boto_plugin
import os
import shutil
import StringIO
import tempfile
import time
GOOGLE_STORAGE = 'gs'
# URI scheme for accessing local files.
LOCAL_FILE = 'file'
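These two scheme constants are combined with bucket and object names into URIs of the form gs://bucket/object or file://path. A small helper (hypothetical; boto.storage_uri parses such strings itself) shows the composition:

```python
GOOGLE_STORAGE = 'gs'
LOCAL_FILE = 'file'

def make_uri(scheme, bucket, obj=None):
    # Build '<scheme>://<bucket>' or '<scheme>://<bucket>/<object>'.
    uri = '%s://%s' % (scheme, bucket)
    if obj is not None:
        uri += '/' + obj
    return uri

print(make_uri(GOOGLE_STORAGE, 'dogs-123', 'labrador.txt'))
```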
Creating buckets
The following code creates two buckets. Because bucket names must be globally unique (see the
naming guidelines), a timestamp is appended to each bucket name to help guarantee uniqueness.
If these bucket names are already in use, you'll need to modify the code to generate unique
bucket names.
Note: Cloud Storage has kept the concept of default project from earlier versions of the product.
A default project exists for interoperability reasons. For more information and to learn how to set
a default project, see Setting a default project. The existence of a default project affects the way
the code shown below is written.
now = time.time()
CATS_BUCKET = 'cats-%d' % now
DOGS_BUCKET = 'dogs-%d' % now
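The timestamp suffix above can be factored into a helper that makes the uniqueness scheme explicit (a hypothetical helper using the same '%d' formatting as the snippet above):

```python
import time

def unique_bucket_name(prefix, now=None):
    # Append the integer Unix time so repeated runs pick fresh names.
    # Names are only *likely* unique; real code should handle collisions.
    if now is None:
        now = time.time()
    return '%s-%d' % (prefix, now)

CATS_BUCKET = unique_bucket_name('cats')
DOGS_BUCKET = unique_bucket_name('dogs')
```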
Listing buckets
Uploading objects
To upload objects, create a file object (opened for read) that points to your local file and a
storage URI object that points to the destination object on Cloud Storage. Then call the
set_contents_from_file() instance method on the destination key, specifying the file handle as the argument.
filename = 'labrador.txt'
localfile = open(filename)  # local source file, opened for read (names assumed from context)
dst_uri = boto.storage_uri(
    DOGS_BUCKET + '/' + filename, GOOGLE_STORAGE)
# The key-related functions are a consequence of boto's
# interoperability with Amazon S3 (which employs the
# concept of a key mapping to localfile).
dst_uri.new_key().set_contents_from_file(localfile)
print 'Successfully created "%s/%s"' % (
    dst_uri.bucket_name, dst_uri.object_name)
Listing objects
To list all objects in a bucket, call storage_uri() and specify the bucket's URI and the Cloud
Storage URI scheme as the arguments. Then, retrieve a list of objects using the get_bucket()
instance method.
The following code reads objects in DOGS_BUCKET and copies them to both your home
directory and CATS_BUCKET. It also demonstrates that you can use the boto library to operate
against both local files and Cloud Storage objects using the same interface.
dest_dir = os.getenv('HOME')
for filename in ('collie.txt', 'labrador.txt'):
    src_uri = boto.storage_uri(
        DOGS_BUCKET + '/' + filename, GOOGLE_STORAGE)
    # Read the source object into an in-memory buffer.
    object_contents = StringIO.StringIO()
    src_uri.get_key().get_file(object_contents)
    local_dst_uri = boto.storage_uri(
        os.path.join(dest_dir, filename), LOCAL_FILE)
    bucket_dst_uri = boto.storage_uri(
        CATS_BUCKET + '/' + filename, GOOGLE_STORAGE)
    # Write the buffer contents to both destinations.
    for dst_uri in (local_dst_uri, bucket_dst_uri):
        object_contents.seek(0)
        dst_uri.new_key().set_contents_from_file(object_contents)
    object_contents.close()
The following code grants the specified Google account FULL_CONTROL permissions for
labrador.txt. Remember to replace valid-email-address with a valid Google account email
address.
This code retrieves and prints the metadata associated with a bucket and an object.
To conclude this tutorial, this code deletes the objects and buckets that you have created. A
bucket must be empty before it can be deleted, so its objects are first deleted.
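The sequence of operations in this tutorial (create buckets, upload, list, copy, then delete objects before buckets) can be exercised end-to-end with an in-memory stand-in. This is a hypothetical mock, not the boto API; it only mirrors the call order described above, including the rule that a bucket must be empty before deletion:

```python
class FakeStorage(object):
    # Minimal in-memory model: bucket name -> {object name: contents}.
    def __init__(self):
        self.buckets = {}

    def create_bucket(self, name):
        self.buckets[name] = {}

    def upload(self, bucket, obj, contents):
        self.buckets[bucket][obj] = contents

    def list_objects(self, bucket):
        return sorted(self.buckets[bucket])

    def copy(self, src_bucket, obj, dst_bucket):
        self.buckets[dst_bucket][obj] = self.buckets[src_bucket][obj]

    def delete_bucket(self, name):
        # Mirror Cloud Storage's rule: a bucket must be empty first.
        if self.buckets[name]:
            raise ValueError('bucket not empty: %s' % name)
        del self.buckets[name]

storage = FakeStorage()
storage.create_bucket('dogs-1')
storage.create_bucket('cats-1')
storage.upload('dogs-1', 'labrador.txt', 'woof')
storage.copy('dogs-1', 'labrador.txt', 'cats-1')
names = storage.list_objects('cats-1')
# Clean up: delete objects first, then the buckets themselves.
for bucket in ('dogs-1', 'cats-1'):
    for obj in list(storage.buckets[bucket]):
        del storage.buckets[bucket][obj]
    storage.delete_bucket(bucket)
```

Swapping the mock for real boto storage_uri calls preserves the same order of operations.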
Assist. Prof, Kanwalvir Singh Dhindsa, B.B.S.B.Engg.College, Fatehgarh Sahib (Punjab), India
Dr. Jamal Ahmad Dargham, School of Engineering and Information Technology, Universiti Malaysia Sabah
Mr. Nitin Bhatia, DAV College, India
Dr. Dhavachelvan Ponnurangam, Pondicherry Central University, India
Dr. Mohd Faizal Abdollah, University of Technical Malaysia, Malaysia
Assist. Prof. Sonal Chawla, Panjab University, India
Dr. Abdul Wahid, AKG Engg. College, Ghaziabad, India
Mr. Arash Habibi Lashkari, University of Malaya (UM), Malaysia
Mr. Md. Rajibul Islam, Ibnu Sina Institute, University Technology Malaysia
Professor Dr. Sabu M. Thampi, LBS Institute of Technology for Women, Kerala University, India
Mr. Noor Muhammed Nayeem, Université Lumière Lyon 2, 69007 Lyon, France
Dr. Himanshu Aggarwal, Department of Computer Engineering, Punjabi University, India
Prof R. Naidoo, Dept of Mathematics/Center for Advanced Computer Modelling, Durban University of Technology,
Durban,South Africa
Prof. Mydhili K Nair, Visweswaraiah Technological University, Bangalore, India
M. Prabu, Adhiyamaan College of Engineering/Anna University, India
Mr. Swakkhar Shatabda, United International University, Bangladesh
Dr. Abdur Rashid Khan, ICIT, Gomal University, Dera Ismail Khan, Pakistan
Mr. H. Abdul Shabeer, I-Nautix Technologies,Chennai, India
Dr. M. Aramudhan, Perunthalaivar Kamarajar Institute of Engineering and Technology, India
Dr. M. P. Thapliyal, Department of Computer Science, HNB Garhwal University (Central University), India
Dr. Shahaboddin Shamshirband, Islamic Azad University, Iran
Mr. Zeashan Hameed Khan, Université de Grenoble, France
Prof. Anil K Ahlawat, Ajay Kumar Garg Engineering College, Ghaziabad, UP Technical University, Lucknow
Mr. Longe Olumide Babatope, University Of Ibadan, Nigeria
Associate Prof. Raman Maini, University College of Engineering, Punjabi University, India
Dr. Maslin Masrom, University Technology Malaysia, Malaysia
Sudipta Chattopadhyay, Jadavpur University, Kolkata, India
Dr. Dang Tuan NGUYEN, University of Information Technology, Vietnam National University - Ho Chi Minh City
Dr. Mary Lourde R., BITS-PILANI Dubai , UAE
Dr. Abdul Aziz, University of Central Punjab, Pakistan
Mr. Karan Singh, Gautam Buddha University, India
Mr. Avinash Pokhriyal, Uttar Pradesh Technical University, Lucknow, India
Associate Prof Dr Zuraini Ismail, University Technology Malaysia, Malaysia
Assistant Prof. Yasser M. Alginahi, Taibah University, Madinah Munawwarrah, KSA
Mr. Dakshina Ranjan Kisku, West Bengal University of Technology, India
Mr. Raman Kumar, Dr B R Ambedkar National Institute of Technology, Jalandhar, Punjab, India
Associate Prof. Samir B. Patel, Institute of Technology, Nirma University, India
Dr. M.Munir Ahamed Rabbani, B. S. Abdur Rahman University, India
Asst. Prof. Koushik Majumder, West Bengal University of Technology, India
Dr. Alex Pappachen James, Queensland Micro-nanotechnology center, Griffith University, Australia
Assistant Prof. S. Hariharan, B.S. Abdur Rahman University, India
Asst Prof. Jasmine. K. S, R.V.College of Engineering, India
Mr Naushad Ali Mamode Khan, Ministry of Education and Human Resources, Mauritius
Prof. Mahesh Goyani, G H Patel College of Engg. & Tech, V.V.N, Anand, Gujarat, India
Dr. Mana Mohammed, University of Tlemcen, Algeria
Prof. Jatinder Singh, Universal Institution of Engg. & Tech. CHD, India
Mrs. M. Anandhavalli Gauthaman, Sikkim Manipal Institute of Technology, Majitar, East Sikkim
Dr. Bin Guo, Institute Telecom SudParis, France
Mrs. Maleika Mehr Nigar Mohamed Heenaye-Mamode Khan, University of Mauritius
Prof. Pijush Biswas, RCC Institute of Information Technology, India
Mr. V. Bala Dhandayuthapani, Mekelle University, Ethiopia
Dr. Irfan Syamsuddin, State Polytechnic of Ujung Pandang, Indonesia
Mr. Kavi Kumar Khedo, University of Mauritius, Mauritius
Mr. Ravi Chandiran, Zagro Singapore Pte Ltd. Singapore
Mr. Milindkumar V. Sarode, Jawaharlal Darda Institute of Engineering and Technology, India
Dr. Shamimul Qamar, KSJ Institute of Engineering & Technology, India
Dr. C. Arun, Anna University, India
Assist. Prof. M.N.Birje, Basaveshwar Engineering College, India
Prof. Hamid Reza Naji, Department of Computer Enigneering, Shahid Beheshti University, Tehran, Iran
Assist. Prof. Debasis Giri, Department of Computer Science and Engineering, Haldia Institute of Technology
Subhabrata Barman, Haldia Institute of Technology, West Bengal
Mr. M. I. Lali, COMSATS Institute of Information Technology, Islamabad, Pakistan
Dr. Feroz Khan, Central Institute of Medicinal and Aromatic Plants, Lucknow, India
Mr. R. Nagendran, Institute of Technology, Coimbatore, Tamilnadu, India
Mr. Amnach Khawne, King Mongkut’s Institute of Technology Ladkrabang, Ladkrabang, Bangkok, Thailand
Dr. P. Chakrabarti, Sir Padampat Singhania University, Udaipur, India
Mr. Nafiz Imtiaz Bin Hamid, Islamic University of Technology (IUT), Bangladesh.
Shahab-A. Shamshirband, Islamic Azad University, Chalous, Iran
Prof. B. Priestly Shan, Anna University, Tamilnadu, India
Venkatramreddy Velma, Dept. of Bioinformatics, University of Mississippi Medical Center, Jackson MS USA
Akshi Kumar, Dept. of Computer Engineering, Delhi Technological University, India
Dr. Umesh Kumar Singh, Vikram University, Ujjain, India
Mr. Serguei A. Mokhov, Concordia University, Canada
Mr. Lai Khin Wee, Universiti Teknologi Malaysia, Malaysia
Dr. Awadhesh Kumar Sharma, Madan Mohan Malviya Engineering College, India
Mr. Syed R. Rizvi, Analytical Services & Materials, Inc., USA
Dr. S. Karthik, SNS College of Technology, India
Mr. Syed Qasim Bukhari, CIMET (Universidad de Granada), Spain
Mr. A.D.Potgantwar, Pune University, India
Dr. Himanshu Aggarwal, Punjabi University, India
Mr. Rajesh Ramachandran, Naipunya Institute of Management and Information Technology, India
Dr. K.L. Shunmuganathan, R.M.K Engg College , Kavaraipettai ,Chennai
Dr. Prasant Kumar Pattnaik, KIST, India.
Dr. Ch. Aswani Kumar, VIT University, India
Mr. Ijaz Ali Shoukat, King Saud University, Riyadh KSA
Mr. Arun Kumar, Sir Padam Pat Singhania University, Udaipur, Rajasthan
Mr. Muhammad Imran Khan, Universiti Teknologi PETRONAS, Malaysia
Dr. Natarajan Meghanathan, Jackson State University, Jackson, MS, USA
Mr. Mohd Zaki Bin Mas'ud, Universiti Teknikal Malaysia Melaka (UTeM), Malaysia
Prof. Dr. R. Geetharamani, Dept. of Computer Science and Eng., Rajalakshmi Engineering College, India
Dr. Smita Rajpal, Institute of Technology and Management, Gurgaon, India
Dr. S. Abdul Khader Jilani, University of Tabuk, Tabuk, Saudi Arabia
Mr. Syed Jamal Haider Zaidi, Bahria University, Pakistan
Assist. Prof. Nisheeth Joshi, Apaji Institute, Banasthali University, Rajasthan, India
Associate Prof. Kunwar S. Vaisla, VCT Kumaon Engineering College, India
Prof. Anupam Choudhary, Bhilai School of Engg., Bhilai (C.G.), India
Mr. Divya Prakash Shrivastava, Al Jabal Al garbi University, Zawya, Libya
Associate Prof. Dr. V. Radha, Avinashilingam Deemed university for women, Coimbatore.
Dr. Kasarapu Ramani, JNT University, Anantapur, India
Dr. Anuraag Awasthi, Jayoti Vidyapeeth Womens University, India
Dr. C G Ravichandran, R V S College of Engineering and Technology, India
Dr. Mohamed A. Deriche, King Fahd University of Petroleum and Minerals, Saudi Arabia
Mr. Abbas Karimi, Universiti Putra Malaysia, Malaysia
Mr. Amit Kumar, Jaypee University of Engg. and Tech., India
Dr. Nikolai Stoianov, Defense Institute, Bulgaria
Assist. Prof. S. Ranichandra, KSR College of Arts and Science, Tiruchencode
Mr. T.K.P. Rajagopal, Diamond Horse International Pvt Ltd, India
Dr. Md. Ekramul Hamid, Rajshahi University, Bangladesh
Mr. Hemanta Kumar Kalita , TATA Consultancy Services (TCS), India
Dr. Messaouda Azzouzi, Ziane Achour University of Djelfa, Algeria
Prof. (Dr.) Juan Jose Martinez Castillo, "Gran Mariscal de Ayacucho" University and Acantelys research Group,
Venezuela
Dr. Jatinderkumar R. Saini, Narmada College of Computer Application, India
Dr. Babak Bashari Rad, University Technology of Malaysia, Malaysia
Dr. Nighat Mir, Effat University, Saudi Arabia
Prof. (Dr.) G.M.Nasira, Sasurie College of Engineering, India
Mr. Varun Mittal, Gemalto Pte Ltd, Singapore
Assist. Prof. Mrs P. Banumathi, Kathir College Of Engineering, Coimbatore
Assist. Prof. Quan Yuan, University of Wisconsin-Stevens Point, US
Dr. Pranam Paul, Narula Institute of Technology, Agarpara, West Bengal, India
Assist. Prof. J. Ramkumar, V.L.B Janakiammal college of Arts & Science, India
Mr. P. Sivakumar, Anna university, Chennai, India
Mr. Md. Humayun Kabir Biswas, King Khalid University, Kingdom of Saudi Arabia
Mr. Mayank Singh, J.P. Institute of Engg & Technology, Meerut, India
HJ. Kamaruzaman Jusoff, Universiti Putra Malaysia
Mr. Nikhil Patrick Lobo, CADES, India
Dr. Amit Wason, Rayat-Bahra Institute of Engineering & Bio-Technology, India
Dr. Rajesh Shrivastava, Govt. Benazir Science & Commerce College, Bhopal, India
Assist. Prof. Vishal Bharti, DCE, Gurgaon
Mrs. Sunita Bansal, Birla Institute of Technology & Science, India
Dr. R. Sudhakar, Dr.Mahalingam college of Engineering and Technology, India
Dr. Amit Kumar Garg, Shri Mata Vaishno Devi University, Katra(J&K), India
Assist. Prof. Raj Gaurang Tiwari, AZAD Institute of Engineering and Technology, India
Mr. Hamed Taherdoost, Tehran, Iran
Mr. Amin Daneshmand Malayeri, YRC, IAU, Malayer Branch, Iran
Mr. Shantanu Pal, University of Calcutta, India
Dr. Terry H. Walcott, E-Promag Consultancy Group, United Kingdom
Dr. Ezekiel U OKIKE, University of Ibadan, Nigeria
Mr. P. Mahalingam, Caledonian College of Engineering, Oman
Dr. Mahmoud M. A. Abd Ellatif, Mansoura University, Egypt
Mr. Souleymane Balla-Arabé, Xi’an University of Electronic Science and Technology, China
Mr. Mahabub Alam, Rajshahi University of Engineering and Technology, Bangladesh
Mr. Sathyapraksh P., S.K.P Engineering College, India
Dr. N. Karthikeyan, SNS College of Engineering, Anna University, India
Dr. Binod Kumar, JSPM's, Jayawant Technical Campus, Pune, India
Assoc. Prof. Dinesh Goyal, Suresh Gyan Vihar University, India
Mr. Md. Abdul Ahad, K L University, India
Mr. Vikas Bajpai, The LNM IIT, India
Dr. Manish Kumar Anand, Salesforce (R & D Analytics), San Francisco, USA
Assist. Prof. Dheeraj Murari, Kumaon Engineering College, India
Assoc. Prof. Dr. A. Muthukumaravel, VELS University, Chennai
Mr. A. Siles Balasingh, St.Joseph University in Tanzania, Tanzania
Mr. Ravindra Daga Badgujar, R C Patel Institute of Technology, India
Dr. Preeti Khanna, SVKM’s NMIMS, School of Business Management, India
Mr. Kumar Dayanand, Cambridge Institute of Technology, India
Dr. Syed Asif Ali, SMI University Karachi, Pakistan
Prof. Pallvi Pandit, Himachal Pradesh University, India
Mr. Ricardo Verschueren, University of Gloucestershire, UK
Assist. Prof. Mamta Juneja, University Institute of Engineering and Technology, Panjab University, India
Assoc. Prof. P. Surendra Varma, NRI Institute of Technology, JNTU Kakinada, India
Assist. Prof. Gaurav Shrivastava, RGPV / SVITS Indore, India
Dr. S. Sumathi, Anna University, India
Assist. Prof. Ankita M. Kapadia, Charotar University of Science and Technology, India
Mr. Deepak Kumar, Indian Institute of Technology (BHU), India
Dr. Rajan Gupta, GGSIP University, New Delhi, India
Assist. Prof M. Anand Kumar, Karpagam University, Coimbatore, India
Mr. Arshad Mansoor, Pakistan Aeronautical Complex
Mr. Kapil Kumar Gupta, Ansal Institute of Technology and Management, India
Dr. Neeraj Tomer, SINE International Institute of Technology, Jaipur, India
Assist. Prof. Trunal J. Patel, C.G.Patel Institute of Technology, Uka Tarsadia University, Bardoli, Surat
Mr. Sivakumar, Codework solutions, India
Mr. Mohammad Sadegh Mirzaei, PGNR Company, Iran
Dr. Gerard G. Dumancas, Oklahoma Medical Research Foundation, USA
Mr. Varadala Sridhar, Varadhaman College Engineering College, Affiliated To JNTU, Hyderabad
Assist. Prof. Manoj Dhawan, SVITS, Indore
Assoc. Prof. Chitreshh Banerjee, Suresh Gyan Vihar University, Jaipur, India
Dr. S. Santhi, SCSVMV University, India
Mr. Davood Mohammadi Souran, Ministry of Energy of Iran, Iran
Mr. Shamim Ahmed, Bangladesh University of Business and Technology, Bangladesh
Mr. Sandeep Reddivari, Mississippi State University, USA
Assoc. Prof. Ousmane Thiare, Gaston Berger University, Senegal
Dr. Hazra Imran, Athabasca University, Canada
Dr. Setu Kumar Chaturvedi, Technocrats Institute of Technology, Bhopal, India
Mr. Mohd Dilshad Ansari, Jaypee University of Information Technology, India
Ms. Jaspreet Kaur, Distance Education LPU, India
Dr. D. Nagarajan, Salalah College of Technology, Sultanate of Oman
Dr. K.V.N.R.Sai Krishna, S.V.R.M. College, India
Mr. Himanshu Pareek, Center for Development of Advanced Computing (CDAC), India
Mr. Khaldi Amine, Badji Mokhtar University, Algeria
Mr. Mohammad Sadegh Mirzaei, Scientific Applied University, Iran
Assist. Prof. Khyati Chaudhary, Ram-eesh Institute of Engg. & Technology, India
Mr. Sanjay Agal, Pacific College of Engineering Udaipur, India
Mr. Abdul Mateen Ansari, King Khalid University, Saudi Arabia
Dr. H.S. Behera, Veer Surendra Sai University of Technology (VSSUT), India
Dr. Shrikant Tiwari, Shri Shankaracharya Group of Institutions (SSGI), India
Prof. Ganesh B. Regulwar, Shri Shankarprasad Agnihotri College of Engg, India
Prof. Pinnamaneni Bhanu Prasad, Matrix vision GmbH, Germany
Dr. Shrikant Tiwari, Shri Shankaracharya Technical Campus (SSTC), India
Dr. Siddesh G.K., Dayananda Sagar College of Engineering, Bangalore, India
Dr. Nadir Bouchama, CERIST Research Center, Algeria
Dr. R. Sathishkumar, Sri Venkateswara College of Engineering, India
Assistant Prof (Dr.) Mohamed Moussaoui, Abdelmalek Essaadi University, Morocco
Dr. S. Malathi, Panimalar Engineering College, Chennai, India
Dr. V. Subedha, Panimalar Institute of Technology, Chennai, India
Dr. Prashant Panse, Swami Vivekanand College of Engineering, Indore, India
Dr. Hamza Aldabbas, Al-Balqa’a Applied University, Jordan
Dr. G. Rasitha Banu, Vel's University, Chennai
Dr. V. D. Ambeth Kumar, Panimalar Engineering College, Chennai
Prof. Anuranjan Misra, Bhagwant Institute of Technology, Ghaziabad, India
Ms. U. Sinthuja, PSG college of arts &science, India
Dr. Ehsan Saradar Torshizi, Urmia University, Iran
Dr. Shamneesh Sharma, APG Shimla University, Shimla (H.P.), India
Assistant Prof. A. S. Syed Navaz, Muthayammal College of Arts & Science, India
Assistant Prof. Ranjit Panigrahi, Sikkim Manipal Institute of Technology, Majitar, Sikkim
Dr. Khaled Eskaf, Arab Academy for Science ,Technology & Maritime Transportation, Egypt
Dr. Nishant Gupta, University of Jammu, India
Assistant Prof. Nagarajan Sankaran, Annamalai University, Chidambaram, Tamilnadu, India
Assistant Prof.Tribikram Pradhan, Manipal Institute of Technology, India
Dr. Nasser Lotfi, Eastern Mediterranean University, Northern Cyprus
Dr. R. Manavalan, K S Rangasamy college of Arts and Science, Tamilnadu, India
Assistant Prof. P. Krishna Sankar, K S Rangasamy college of Arts and Science, Tamilnadu, India
Dr. Rahul Malik, Cisco Systems, USA
Dr. S. C. Lingareddy, ALPHA College of Engineering, India
Assistant Prof. Mohammed Shuaib, Interal University, Lucknow, India
Dr. Sachin Yele, Sanghvi Institute of Management & Science, India
Dr. T. Thambidurai, Sun Univercell, Singapore
Prof. Anandkumar Telang, BKIT, India
Assistant Prof. R. Poorvadevi, SCSVMV University, India
Dr Uttam Mande, Gitam University, India
Dr. Poornima Girish Naik, Shahu Institute of Business Education and Research (SIBER), India
Prof. Md. Abu Kausar, Jaipur National University, Jaipur, India
Dr. Mohammed Zuber, AISECT University, India
Prof. Kalum Priyanath Udagepola, King Abdulaziz University, Saudi Arabia
Dr. K. R. Ananth, Velalar College of Engineering and Technology, India
Assistant Prof. Sanjay Sharma, Roorkee Engineering & Management Institute Shamli (U.P), India
Assistant Prof. Panem Charan Arur, Priyadarshini Institute of Technology, India
Dr. Ashwak Mahmood muhsen alabaichi, Karbala University / College of Science, Iraq
Dr. Urmila Shrawankar, G H Raisoni College of Engineering, Nagpur (MS), India
Dr. Krishan Kumar Paliwal, Panipat Institute of Engineering & Technology, India
Dr. Mukesh Negi, Tech Mahindra, India
Dr. Anuj Kumar Singh, Amity University Gurgaon, India
Dr. Babar Shah, Gyeongsang National University, South Korea
Assistant Prof. Jayprakash Upadhyay, SRI-TECH Jabalpur, India
Assistant Prof. Varadala Sridhar, Vidya Jyothi Institute of Technology, India
Assistant Prof. Parameshachari B D, KSIT, Bangalore, India
Assistant Prof. Ankit Garg, Amity University, Haryana, India
Assistant Prof. Rajashe Karappa, SDMCET, Karnataka, India
Assistant Prof. Varun Jasuja, GNIT, India
Assistant Prof. Sonal Honale, Abha Gaikwad Patil College of Engineering Nagpur, India
Dr. Pooja Choudhary, CT Group of Institutions, NIT Jalandhar, India
Dr. Faouzi Hidoussi, UHL Batna, Algeria
Dr. Naseer Ali Husieen, Wasit University, Iraq
Assistant Prof. Vinod Kumar Shukla, Amity University, Dubai
Dr. Ahmed Farouk Metwaly, K L University
Mr. Mohammed Noaman Murad, Cihan University, Iraq
Dr. Suxing Liu, Arkansas State University, USA
Dr. M. Gomathi, Velalar College of Engineering and Technology, India
Assistant Prof. Sumardiono, College PGRI Blitar, Indonesia
Dr. Latika Kharb, Jagan Institute of Management Studies (JIMS), Delhi, India
Associate Prof. S. Raja, Pauls College of Engineering and Technology, Tamilnadu, India
Assistant Prof. Seyed Reza Pakize, Shahid Sani High School, Iran
Dr. Thiyagu Nagaraj, University-INOU, India
Assistant Prof. Noreen Sarai, Harare Institute of Technology, Zimbabwe
Assistant Prof. Gajanand Sharma, Suresh Gyan Vihar University Jaipur, Rajasthan, India
Assistant Prof. Mapari Vikas Prakash, Siddhant COE, Sudumbare, Pune, India
Dr. Devesh Katiyar, Shri Ramswaroop Memorial University, India
Dr. Shenshen Liang, University of California, Santa Cruz, US
Assistant Prof. Mohammad Abu Omar, Limkokwing University of Creative Technology- Malaysia
Mr. Snehasis Banerjee, Tata Consultancy Services, India
Assistant Prof. Kibona Lusekelo, Ruaha Catholic University (RUCU), Tanzania
Assistant Prof. Adib Kabir Chowdhury, University College Technology Sarawak, Malaysia
Dr. Ying Yang, Computer Science Department, Yale University, USA
Dr. Vinay Shukla, Institute Of Technology & Management, India
Dr. Liviu Octavian Mafteiu-Scai, West University of Timisoara, Romania
Assistant Prof. Rana Khudhair Abbas Ahmed, Al-Rafidain University College, Iraq
Assistant Prof. Nitin A. Naik, S.R.T.M. University, India
Dr. Timothy Powers, University of Hertfordshire, UK
Dr. S. Prasath, Bharathiar University, Erode, India
Dr. Ritu Shrivastava, SIRTS Bhopal, India
Prof. Rohit Shrivastava, Mittal Institute of Technology, Bhopal, India
Dr. Gianina Mihai, "Dunarea de Jos" University of Galati, Romania
Assistant Prof. Ms. T. Kalai Selvi, Erode Sengunthar Engineering College, India
Assistant Prof. Ms. C. Kavitha, Erode Sengunthar Engineering College, India
Assistant Prof. K. Sinivasamoorthi, Erode Sengunthar Engineering College, India
Assistant Prof. Mallikarjun C Sarsamba, Bheemanna Khandre Institute of Technology, Bhalki, India
Assistant Prof. Vishwanath Chikaraddi, Veermata Jijabai Technological Institute (Central Technological Institute), India
Professor Yousef Farhaoui, Moulay Ismail University, Errachidia, Morocco
Dr. Parul Verma, Amity University, India
Assistant Prof. Madhavi Dhingra, Amity University, Madhya Pradesh, India
Assistant Prof. G. Selvavinayagam, SNS College of Technology, Coimbatore, India
Professor Kartheesan Log, Anna University, Chennai
Professor Vasudeva Acharya, Shri Madhwa vadiraja Institute of Technology, India
Dr. Asif Iqbal Hajamydeen, Management & Science University, Malaysia
Assistant Prof. Mahendra Singh Meena, Amity University Haryana
Assistant Professor Manjeet Kaur, Amity University Haryana
Dr. Mohamed Abd El-Basset Matwalli, Zagazig University, Egypt
Dr. Ramani Kannan, Universiti Teknologi PETRONAS, Malaysia
Assistant Prof. S. Jagadeesan Subramaniam, Anna University, India
Assistant Prof. Dharmendra Choudhary, Tripura University, India
Assistant Prof. Deepika Vodnala, SR Engineering College, India
Dr. Kai Cong, Intel Corporation & Computer Science Department, Portland State University, USA
Dr. Kailas R Patil, Vishwakarma Institute of Information Technology (VIIT), India
Dr. Omar A. Alzubi, Faculty of IT / Al-Balqa Applied University, Jordan
Assistant Prof. Kareemullah Shaik, Nimra Institute of Science and Technology, India
Assistant Prof. Chirag Modi, NIT Goa
Dr. R. Ramkumar, Nandha Arts And Science College, India
Dr. Priyadharshini Vydhialingam, Bharathiar University, India
Dr. P. S. Jagadeesh Kumar, DBIT, Bangalore, Karnataka
Dr. Vikas Thada, AMITY University, Pachgaon
Dr. T. A. Ashok Kumar, Institute of Management, Christ University, Bangalore
Dr. Shaheera Rashwan, Informatics Research Institute
Dr. S. Preetha Gunasekar, Bharathiyar University, India
Asst Professor Sameer Dev Sharma, Uttaranchal University, Dehradun
Dr. Zhihan Lv, Chinese Academy of Sciences, China
Dr. Ikvinderpal Singh, Trai Shatabdi GGS Khalsa College, Amritsar
Dr. Umar Ruhi, University of Ottawa, Canada
Dr. Jasmin Cosic, University of Bihac, Bosnia and Herzegovina
Dr. Homam Reda El-Taj, University of Tabuk, Kingdom of Saudi Arabia
Dr. Mostafa Ghobaei Arani, Islamic Azad University, Iran
Dr. Ayyasamy Ayyanar, Annamalai University, India
Dr. Selvakumar Manickam, Universiti Sains Malaysia, Malaysia
Dr. Murali Krishna Namana, GITAM University, India
Dr. Smriti Agrawal, Chaitanya Bharathi Institute of Technology, Hyderabad, India
Professor Vimalathithan Rathinasabapathy, Karpagam College Of Engineering, India
IJCSIS 2017-2018
ISSN: 1947-5500
http://sites.google.com/site/ijcsis/
The International Journal of Computer Science and Information Security (IJCSIS) is a premier
scholarly venue in the areas of computer science and information security. IJCSIS provides a high-profile,
leading-edge platform for researchers and engineers alike to publish state-of-the-art research in the
respective fields of information technology and communication security. The journal features a diverse
mixture of articles covering both core and applied computer science topics.
Authors are solicited to contribute by submitting articles that illustrate research results, projects,
surveying works and industrial experiences that describe significant advances in, but not limited to, the
following areas. Submissions may span a broad range of topics, e.g.:
Track A: Security
Access control, Anonymity, Audit and audit reduction & Authentication and authorization, Applied
cryptography, Cryptanalysis, Digital Signatures, Biometric security, Boundary control devices,
Certification and accreditation, Cross-layer design for security, Security & Network Management, Data and
system integrity, Database security, Defensive information warfare, Denial of service protection, Intrusion
Detection, Anti-malware, Distributed systems security, Electronic commerce, E-mail security, Spam,
Phishing, E-mail fraud, Viruses, worms, Trojan protection, Grid security, Information hiding and
watermarking & Information survivability, Insider threat protection, Integrity
Intellectual property protection, Internet/Intranet Security, Key management and key recovery, Language-
based security, Mobile and wireless security, Mobile, Ad Hoc and Sensor Network Security, Monitoring
and surveillance, Multimedia security, Operating system security, Peer-to-peer security, Performance
Evaluations of Protocols & Security Application, Privacy and data protection, Product evaluation criteria
and compliance, Risk evaluation and security certification, Risk/vulnerability assessment, Security &
Network Management, Security Models & protocols, Security threats & countermeasures (DDoS, MiM,
Session Hijacking, Replay attack, etc.), Trusted computing, Ubiquitous Computing Security, Virtualization
security, VoIP security, Web 2.0 security, Active Defense Systems, Adaptive
Defense Systems, Benchmark, Analysis and Evaluation of Security Systems, Distributed Access Control
and Trust Management, Distributed Attack Systems and Mechanisms, Distributed Intrusion
Detection/Prevention Systems, Denial-of-Service Attacks and Countermeasures, High Performance
Security Systems, Identity Management and Authentication, Implementation, Deployment and
Management of Security Systems, Intelligent Defense Systems, Internet and Network Forensics, Large-
scale Attacks and Defense, RFID Security and Privacy, Security Architectures in Distributed Network
Systems, Security for Critical Infrastructures, Security for P2P systems and Grid Systems, Security in E-
Commerce, Security and Privacy in Wireless Networks, Secure Mobile Agents and Mobile Code, Security
Protocols, Security Simulation and Tools, Security Theory and Tools, Standards and Assurance Methods,
Trusted Computing, Viruses, Worms, and Other Malicious Code, World Wide Web Security, Novel and
emerging secure architecture, Study of attack strategies, attack modeling, Case studies and analysis of
actual attacks, Continuity of Operations during an attack, Key management, Trust management, Intrusion
detection techniques, Intrusion response, alarm management, and correlation analysis, Study of tradeoffs
between security and system performance, Intrusion tolerance systems, Secure protocols, Security in
wireless networks (e.g. mesh networks, sensor networks, etc.), Cryptography and Secure Communications,
Computer Forensics, Recovery and Healing, Security Visualization, Formal Methods in Security, Principles
for Designing a Secure Computing System, Autonomic Security, Internet Security, Security in Health Care
Systems, Security Solutions Using Reconfigurable Computing, Adaptive and Intelligent Defense Systems,
Authentication and Access control, Denial of service attacks and countermeasures, Identity, Route and
Location Anonymity schemes, Intrusion detection and prevention techniques, Cryptography, encryption
algorithms and Key management schemes, Secure routing schemes, Secure neighbor discovery and
localization, Trust establishment and maintenance, Confidentiality and data integrity, Security architectures,
deployments and solutions, Emerging threats to cloud-based services, Security model for new services,
Cloud-aware web service security, Information hiding in Cloud Computing, Securing distributed data
storage in cloud, Security, privacy and trust in mobile computing systems and applications, Middleware
security & Security features: middleware software is an asset on
its own and has to be protected, interaction between security-specific and other middleware features, e.g.,
context-awareness, Middleware-level security monitoring and measurement: metrics and mechanisms
for quantification and evaluation of security enforced by the middleware, Security co-design: trade-off and
co-design between application-based and middleware-based security, Policy-based management:
innovative support for policy-based definition and enforcement of security concerns, Identification and
authentication mechanisms: Means to capture application specific constraints in defining and enforcing
access control rules, Middleware-oriented security patterns: identification of patterns for sound, reusable
security, Security in aspect-based middleware: mechanisms for isolating and enforcing security aspects,
Security in agent-based platforms: protection for mobile code and platforms, Smart Devices: Biometrics,
National ID cards, Embedded Systems Security and TPMs, RFID Systems Security, Smart Card Security,
Pervasive Systems: Digital Rights Management (DRM) in pervasive environments, Intrusion Detection and
Information Filtering, Localization Systems Security (Tracking of People and Goods), Mobile Commerce
Security, Privacy Enhancing Technologies, Security Protocols (for Identification and Authentication,
Confidentiality and Privacy, and Integrity), Ubiquitous Networks: Ad Hoc Networks Security, Delay-
Tolerant Network Security, Domestic Network Security, Peer-to-Peer Networks Security, Security Issues
in Mobile and Ubiquitous Networks, Security of GSM/GPRS/UMTS Systems, Sensor Networks Security,
Vehicular Network Security, Wireless Communication Security: Bluetooth, NFC, WiFi, WiMAX,
WiMedia, others
Track B: Computer Science
This Track will emphasize the design, implementation, management and applications of computer
communications, networks and services. Topics of mostly theoretical nature are also welcome, provided
there is clear practical potential in applying the results of such work.
Broadband wireless technologies: LTE, WiMAX, WiRAN, HSDPA, HSUPA, Resource allocation and
interference management, Quality of service and scheduling methods, Capacity planning and dimensioning,
Cross-layer design and Physical layer based issue, Interworking architecture and interoperability, Relay
assisted and cooperative communications, Location and provisioning and mobility management, Call
admission and flow/congestion control, Performance optimization, Channel capacity modeling and analysis,
Middleware Issues: Event-based, publish/subscribe, and message-oriented middleware, Reconfigurable,
adaptable, and reflective middleware approaches, Middleware solutions for reliability, fault tolerance, and
quality-of-service, Scalability of middleware, Context-aware middleware, Autonomic and self-managing
middleware, Evaluation techniques for middleware solutions, Formal methods and tools for designing,
verifying, and evaluating, middleware, Software engineering techniques for middleware, Service oriented
middleware, Agent-based middleware, Security middleware, Network Applications: Network-based
automation, Cloud applications, Ubiquitous and pervasive applications, Collaborative applications, RFID
and sensor network applications, Mobile applications, Smart home applications, Infrastructure monitoring
and control applications, Remote health monitoring, GPS and location-based applications, Networked
vehicles applications, Alert applications, Embedded Computer Systems, Advanced Control Systems, and
Intelligent Control: Advanced control and measurement, computer and microprocessor-based control,
signal processing, estimation and identification techniques, application-specific ICs, nonlinear and
adaptive control, optimal and robust control, intelligent control, evolutionary computing, and intelligent
systems, instrumentation subject to critical conditions, automotive, marine and aero-space control and all
other control applications, Intelligent Control System, Wired/Wireless Sensor, Signal Control System.
Sensors, Actuators and Systems Integration: Intelligent sensors and actuators, multisensor fusion, sensor
array and multi-channel processing, micro/nano technology, microsensors and microactuators,
instrumentation electronics, MEMS and system integration, wireless sensor, Network Sensor, Hybrid
Sensor, Distributed Sensor Networks. Signal and Image Processing: Digital signal processing theory,
methods, DSP implementation, speech processing, image and multidimensional signal processing, Image
analysis and processing, Image and Multimedia applications, Real-time multimedia signal processing,
Computer vision, Emerging signal processing areas, Remote Sensing, Signal processing in education.
Industrial Informatics: Industrial applications of neural networks, fuzzy algorithms, Neuro-Fuzzy
application, bioinformatics, real-time computer control, real-time information systems, human-machine
interfaces, CAD/CAM/CAT/CIM, virtual reality, industrial communications, flexible manufacturing
systems, industrial automated process, Data Storage Management, Hard disk control, Supply Chain
Management, Logistics applications, Power plant automation, Drives automation. Information Technology,
Management of Information Systems: Management information systems, Information Management,
Nursing information management, Information System, Information Technology and their application, Data
retrieval, Data Base Management, Decision analysis methods, Information processing, Operations research,
E-Business, E-Commerce, E-Government, Computer Business, Security and risk management, Medical
imaging, Biotechnology, Bio-Medicine, Computer-based information systems in health care, Changing
Access to Patient Information, Healthcare Management Information Technology.
Communication/Computer Network, Transportation Application: On-board diagnostics, Active safety
systems, Communication systems, Wireless technology, Communication application, Navigation and
Guidance, Vision-based applications, Speech interface, Sensor fusion, Networking theory and technologies,
Transportation information, Autonomous vehicle, Vehicle application of affective computing, Advanced
Computing Technology and their Applications: Broadband and intelligent networks, Data Mining, Data
fusion, Computational intelligence, Information and data security, Information indexing and retrieval,
Information processing, Information systems and applications, Internet applications and performances,
Knowledge based systems, Knowledge management, Software Engineering, Decision making, Mobile
networks and services, Network management and services, Neural Network, Fuzzy logics, Neuro-Fuzzy,
Expert approaches, Innovation Technology and Management: Innovation and product development,
Emerging advances in business and its applications, Creativity in Internet management and retailing, B2B
and B2C management, Electronic transceiver device for Retail Marketing Industries, Facilities planning
and management, Innovative pervasive computing applications, Programming paradigms for pervasive
systems, Software evolution and maintenance in pervasive systems, Middleware services and agent
technologies, Adaptive, autonomic and context-aware computing, Mobile/Wireless computing systems and
services in pervasive computing, Energy-efficient and green pervasive computing, Communication
architectures for pervasive computing, Ad hoc networks for pervasive communications, Pervasive
opportunistic communications and applications, Enabling technologies for pervasive systems (e.g., wireless
BAN, PAN), Positioning and tracking technologies, Sensors and RFID in pervasive systems, Multimodal
sensing and context for pervasive applications, Pervasive sensing, perception and semantic interpretation,
Smart devices and intelligent environments, Trust, security and privacy issues in pervasive systems, User
interfaces and interaction models, Virtual immersive communications, Wearable computers, Standards and
interfaces for pervasive computing environments, Social and economic models for pervasive systems,
Active and Programmable Networks, Ad Hoc & Sensor Network, Congestion and/or Flow Control, Content
Distribution, Grid Networking, High-speed Network Architectures, Internet Services and Applications,
Optical Networks, Mobile and Wireless Networks, Network Modeling and Simulation, Multicast,
Multimedia Communications, Network Control and Management, Network Protocols, Network
Performance, Network Measurement, Peer to Peer and Overlay Networks, Quality of Service and Quality
of Experience, Ubiquitous Networks, Crosscutting Themes – Internet Technologies, Infrastructure,
Services and Applications; Open Source Tools, Open Models and Architectures; Security, Privacy and
Trust; Navigation Systems, Location Based Services; Social Networks and Online Communities; ICT
Convergence, Digital Economy and Digital Divide, Neural Networks, Pattern Recognition, Computer
Vision, Advanced Computing Architectures and New Programming Models, Visualization and Virtual
Reality as Applied to Computational Science, Computer Architecture and Embedded Systems, Technology
in Education, Theoretical Computer Science, Computing Ethics, Computing Practices & Applications
Authors are invited to submit papers by e-mail to [email protected]. Submissions must be original
and must not have been published previously or be under consideration for publication elsewhere while
being evaluated by IJCSIS. Before submission, authors should carefully read the journal's Author
Guidelines, located at http://sites.google.com/site/ijcsis/authors-notes.
© IJCSIS PUBLICATION 2017
ISSN 1947 5500
http://sites.google.com/site/ijcsis/