
IJCSIS Vol. 15 No. 2 Part II, February 2017

ISSN 1947-5500

International Journal of
Computer Science
& Information Security

© IJCSIS PUBLICATION 2017


Pennsylvania, USA
Indexed and technically co-sponsored by: [sponsor logos]
IJCSIS
ISSN (online): 1947-5500

Please consider contributing to, and/or forwarding to the appropriate groups, the following opportunity to submit and publish
original scientific results.

CALL FOR PAPERS


International Journal of Computer Science and Information Security (IJCSIS)
January-December 2017 Issues

The topics suggested for this issue can be discussed in terms of concepts, surveys, state of the art, research,
standards, implementations, running experiments, applications, and industrial case studies. Authors are invited
to submit complete unpublished papers, which are not under review in any other conference or journal, in the
following (but not limited to) topic areas.
See the authors' guide for manuscript preparation and submission guidelines.

Indexed by Google Scholar, DBLP, CiteSeerX, Directory of Open Access Journals (DOAJ), Bielefeld
Academic Search Engine (BASE), SCIRUS, Scopus, Cornell University Library, ScientificCommons,
ProQuest, EBSCO and more.
Deadline: see web site
Notification: see web site
Revision: see web site
Publication: see web site

Context-aware systems
Agent-based systems
Networking technologies
Mobility and multimedia systems
Security in network, systems, and applications
Systems performance
Evolutionary computation
Networking and telecommunications
Industrial systems
Software development and deployment
Knowledge virtualization
Autonomic and autonomous systems
Systems and networks on the chip
Bio-technologies
Knowledge for global defense
Knowledge data systems
Information Systems [IS]
Mobile and distance education
IPv6 Today - Technology and deployment
Intelligent techniques, logics and systems
Modeling
Knowledge processing
Software Engineering
Information technologies
Optimization
Internet and web technologies
Complexity
Digital information processing
Natural Language Processing
Cognitive science and knowledge
Speech Synthesis
Data Mining

For more topics, please see the web site: https://sites.google.com/site/ijcsis/

For more information, please visit the journal website (https://sites.google.com/site/ijcsis/)


 
Editorial
Message from the Editorial Board
It is our great pleasure to present the February 2017 issue (Volume 15, Number 2) of the
International Journal of Computer Science and Information Security (IJCSIS). High-quality
research, survey and review articles are contributed by experts in the field, promoting insight into
and understanding of the state of the art and trends in computer science and technology. The journal especially
provides a platform for high-caliber academics, practitioners and PhD/Doctoral graduates to
publish completed work and the latest research outcomes. According to Google Scholar,
papers published in IJCSIS have so far been cited over 9,800 times, and the journal is experiencing
steady and healthy growth. Google statistics show that IJCSIS has taken its first steps toward
becoming an international and prestigious journal in the field of computer science and information
security. There have been many improvements to the processing of papers; we have also
witnessed significant growth in interest, reflected both in a higher number of submissions and in
the breadth and quality of those submissions. IJCSIS is indexed in major
academic/scientific databases and important repositories, such as Google Scholar, Thomson
Reuters, ArXiv, CiteSeerX, Cornell University Library, Ei Compendex, Scopus, DBLP, DOAJ,
ProQuest, ResearchGate, Academia.edu and EBSCO, among others.

A great journal cannot be made without a dedicated editorial team of editors and reviewers.
On behalf of the IJCSIS community and the sponsors, we congratulate the authors and thank the
reviewers for their outstanding efforts in reviewing and recommending high-quality papers for
publication. In particular, we would like to thank the international academic community and researchers for
their continued support in citing papers published in IJCSIS. Without their sustained and unselfish
commitment, IJCSIS would not have achieved its current premier status, delivering
high-quality content to our readers in a timely fashion.

"We support researchers to succeed by providing high visibility, impact value, prestige and
excellence in research publication." We would like to thank you, the authors and readers, the
content providers and consumers, who have made this journal the best possible.

For further questions or other suggestions, please do not hesitate to contact us at
[email protected].

A complete list of journals can be found at:
http://sites.google.com/site/ijcsis/
IJCSIS Vol. 15, No. 2, February 2017 Edition
ISSN 1947-5500 © IJCSIS, USA.
Journal Indexed by (among others):

Open Access This Journal is distributed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium,
provided you give appropriate credit to the original author(s) and the source.
Bibliographic Information
ISSN: 1947-5500
Monthly publication (Regular and Special Issues)
Published since May 2009

Editorial / Paper Submissions:
IJCSIS Managing Editor ([email protected])
Pennsylvania, USA
Tel: +1 412 390 5159
IJCSIS EDITORIAL BOARD

Editorial Board:
Dr. Shimon K. Modi, Director of Research, BSPA Labs, Purdue University, USA
Professor Ying Yang, PhD, Computer Science Department, Yale University, USA
Professor Hamid Reza Naji, PhD, Department of Computer Engineering, Shahid Beheshti University, Tehran, Iran
Professor Yong Li, PhD, School of Electronic and Information Engineering, Beijing Jiaotong University, P. R. China
Professor Mokhtar Beldjehem, PhD, Sainte-Anne University, Halifax, NS, Canada
Professor Yousef Farhaoui, PhD, Department of Computer Science, Moulay Ismail University, Morocco
Dr. Alex Pappachen James, Queensland Micro-nanotechnology Center, Griffith University, Australia
Professor Sanjay Jasola, Gautam Buddha University
Dr. Siddhivinayak Kulkarni, University of Ballarat, Ballarat, Victoria, Australia
Dr. Reza Ebrahimi Atani, University of Guilan, Iran
Dr. Dong Zhang, University of Central Florida, USA
Dr. Vahid Esmaeelzadeh, Iran University of Science and Technology
Dr. Jiliang Zhang, Northeastern University, China
Dr. Jacek M. Czerniak, Casimir the Great University in Bydgoszcz, Poland
Dr. Binh P. Nguyen, National University of Singapore
Professor Seifedine Kadry, American University of the Middle East, Kuwait
Dr. Riccardo Colella, University of Salento, Italy
Dr. Sedat Akleylek, Ondokuz Mayis University, Turkey
Dr. Basit Shahzad, King Saud University, Riyadh, Saudi Arabia
Dr. Sherzod Turaev, International Islamic University Malaysia

Guest Editors / Associate Editors:
Dr. Riktesh Srivastava, Associate Professor, Information Systems, Skyline University College, Sharjah, PO 1797, UAE
Dr. Jianguo Ding, Norwegian University of Science and Technology (NTNU), Norway
Dr. Naseer Alquraishi, University of Wasit, Iraq
Dr. Kai Cong, Intel Corporation & Computer Science Department, Portland State University, USA
Dr. Omar A. Alzubi, Al-Balqa Applied University (BAU), Jordan
Dr. Jorge A. Ruiz-Vanoye, Universidad Autónoma del Estado de Morelos, Mexico
Prof. Ning Xu, Wuhan University of Technology, China
Dr. Bilal Alatas, Department of Software Engineering, Firat University, Turkey
Dr. Ioannis V. Koskosas, University of Western Macedonia, Greece
Dr. Venu Kuthadi, University of Johannesburg, Johannesburg, RSA
Dr. Zhihan Lv, Chinese Academy of Science, China
Prof. Ghulam Qasim, University of Engineering and Technology, Peshawar, Pakistan
Prof. Dr. Maqbool Uddin Shaikh, Preston University, Islamabad, Pakistan
Dr. Musa Peker, Faculty of Technology, Mugla Sitki Kocman University, Turkey
Dr. Wencan Luo, University of Pittsburgh, USA
Dr. Ijaz Ali Shoukat, King Saud University, Saudi Arabia
Dr. Yilun Shang, Tongji University, Shanghai, China
Dr. Sachin Kumar, Indian Institute of Technology (IIT) Roorkee

ISSN 1947-5500 Copyright © IJCSIS, USA.


TABLE OF CONTENTS

1. PaperID 31011701: Machine Learning Techniques to Recognize Multilingual Characters using HOG
Features (pp. 1-8)

Sreerama Murthy Velaga


Professor, Department of Computer Science & Engineering, GMR Institute of Technology, Rajam, Andhra Pradesh,
India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

2. PaperID 31011703: Image Steganography between Firefly and PSO Algorithms (pp. 9-21)

* Ziyad Tariq Mustafa Al-Ta'i, * Jamal Mustafa Abass, ** Omar Y. Abd Al-Hameed
* Department of Computer Science, College of Science, University of Diyala
** Computer Science Department, University of Garmian

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

3. PaperID 31011704: Farsi Text Localization in Natural Scene Images (pp. 22-30)

M. Samaee, Department of Electrical and Computer Engineering, Amirkabir University of Technology, Tehran, Iran
H. Tavakoli, Department of Electrical Engineering, Shahed University of Tehran, Tehran, Iran

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

4. PaperID 31011705: In Silico Screening and Pathway Analysis of Disease-Associated nsSNPs of MITF Gene:
A study on Melanoma (pp. 31-54)

Muhammad Naveed (*1,2), Fiza Anwar (1), Syeda Khushbakht kazmi (1), Fadwa Tariq (1), Sana Tehreem (1),
Ghulam Abbas (1), Humayun Irshad (2), Pervez Anwar (2), Aitizaz Ali (3), Muzamil Mehboob (3)
(1) Department of Biochemistry and Biotechnology, University of Gujrat, Pakistan 50700
(2) Department of Biotechnology, Faculty of Sciences, University of Gujrat, Sialkot campus, Pakistan 51310
(3) Department of Computer Sciences, University of Gujrat, Sialkot campus, Pakistan 51310

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

5. PaperID 31011709: Ticket based Secure Authentication Scheme using NTRU Cryptosystem in Wireless
Sensor Network (pp. 55-66)

Iqbaldeep Kaur (1), Harnain kour (2), Dr. Amit Verma (1*)
(1) Associate Professor, Computer Science& Engineering, Chandigarh Engineering College, Landran, Punjab,
India
(2) M. Tech. Research Scholar, Computer Science & Engineering, Chandigarh Engineering College, Landran,
Punjab, India
(1*) Professor and HOD, Computer Science& Engineering, Chandigarh Engineering College, Landran, Punjab,
India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]


6. PaperID 31011712: Shape Descriptor Analysis for DNA Classification using Digital Image Processing (pp. 67-71)

Hazel Esperanza Loya Larios, Raúl Santiago Montero, David Asael Gutiérrez Hernández, Agustino Martínez
Antonio Luis Ernesto Mancilla Espinoza
Tecnológico Nacional de México. Instituto Tecnológico de León. División de Estudios de Posgrado e Investigación.
Av. Tecnológico S/N - Fracc. Industrial Julián de Obregón. León, Guanajuato, México - C.P. 37290
Tecnológico Nacional de México, Instituto Tecnológico de León, División de Estudios de Posgrado e Investigación,
León Guanajuato, México

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

7. PaperID 31011713: LSBSM: A Novel Method for Identification of Near Duplicates in Web Documents (pp.
72-78)

Lavanya Pamulaparty, Research Scholar, Department of CSE, JNTUH, Hyderabad, India


Dr. C.V. Guru Rao, Department of CSE, S R Engineering College, JNT University, Warangal, India
Dr. M. Sreenivasa Rao, Department of CSE, School of IT, JNT University, Hyderabad, India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

8. PaperID 31011718: A Method for Arabic Documents Plagiarism Detection (pp. 79-85)

Yahya A. Abdelrahman, Department of Computer Science, Sudan University of Science and Technology, Khartoum,
Sudan
Ahmed Khalid, Department of Computer, Najran University, Najran KSA
Izzeldin M. Osman, Department of Computer Science, Sudan University of Science and Technology Khartoum,
Sudan

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

9. PaperID 31011719: Controlling Future Intelligent Smart Homes using Wireless Integrated Network Systems
(pp. 86-112)

Rustom Mamlook *, Omer Fraz Khan, Mohannad Maher Haddad, Hatem Salim Koofan, Said Mahad Tabook
Department of Electrical & Computer Engineering, Dhofar University, Sultanate of Oman

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

10. PaperID 31011720: Role of Stakeholders in Requirement Change Management (pp. 113-117)

Haya Majid Qureshi, Rabia Hameed Malik, Wafa Qureshi


Department of Computer Science, COMSATS University, Pakistan

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

11. PaperID 31011722: Steganography in DCT-based Compressed Images through Modified Quantization and
Matrix Encoding (pp. 118-126)

K. Rosemary Euphrasia (1), M. Mary Shanthi Rani (2)


(1) Dept. of Computer Science, Fatima College, Madurai, TamilNadu. India.
(2) Dept. of Comp. Sci. and Applications, Gandhigram Rural Institute, Deemed University Gandhigram, TamilNadu.
India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

12. PaperID 31011724: A Comparison Study on Text Detection in Scene Images Based on Connected
Component Analysis (pp. 127-139)

Abdel-Rahiem A. Hashem (1), Mohd. Yamani Idna Idris (2), Ahmed Gawish (3), Moumen T. ElMelegy (4)
(1) Mathematics Department, Faculty of science, Assiut University, Assiut 71516, Egypt. Exchange student program
in UM university, Malaysia
(2) Faculty of Computer Science and Information Technology, University of Malaya, Malaysia.
(3) Vision and Image Processing (VIP) Lab, Department of Systems Design Engineering, University of Waterloo,
Waterloo, Canada
(4) Electrical Engineering Department, Assiut University, Assiut 71516, Egypt

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

13. PaperID 31011726: Searching of a Route through Implementation of Neural Network in Visual Prolog (pp.
140-144)

Elitsa Zdravkova,
Department of Computer Systems and Technologies, Shumen University "Konstantin Preslavsky", Shumen, Bulgaria

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

14. PaperID 31011727: Prediction of Nephrolithiasis Based on Extracted Features of X-Ray Images Using
Artificial Neural Networks (pp. 145-156)

G. Sumana (1), G. Anjan Babu (2)


(1) Sri Padmavathi Mahila Viswa Vidyalaya, Tirupati, India
(2) Professor, Department of Computer Science, Sri Venkateswara University, Tirupati, India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

15. PaperID 31011729: Multimodal Cumulative Class-Specific Linear Discriminant Regression for Cloud
Security (pp. 157-165)

Savitha G., Computer Science and Engineering, B.N.M. Institute of Technology, Bangalore, India
Dr. Vibha Lakshmikantha, Computer Science and Engineering, B.N.M. Institute of Technology, Bangalore, India
Dr. K. R. Venugopal, Computer Science and Engineering, Visvesvaraya College of Engineering, Bangalore, India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

16. PaperID 31011730: Generic Architecture for Information Availability (GAIA) a High Level Agent Oriented
Methodology (pp. 166-171)

Obinnaya Chinecherem Omankwu, Chikezie Kenneth Nwagu, Hycient Inyiama


(1) Computer Science Department, Michael Okpara University of Agriculture, Umudike, Umuahia, Abia State,
Nigeria
(2) University Department, Mantrac Nigeria Limited, Lagos, Nigeria,
(3) Computer Engineering Department, Nnamdi Azikiwe University, Awka, Anambra State, Nigeria.

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

17. PaperID 31011734: Microstrip Patch Antenna with Defected Ground for L, S and C Band Applications (pp.
172-179)

Karmjeet Kaur, Jagtar Singh Sivia, David Gupta


Department of Electronics and Communication Engineering, Yadawindra College of Engineering, Punjabi
University GKC, Talwandi Sabo, Bathinda, Punjab, India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

18. PaperID 31011739: Extracting Words’ Polarity with Definition and Examples (pp. 180-190)

Tariq Naeem, Fazal Masud Kundi, Sheikh Muhammad Saqib


Institute of Computing and Information Technology, Gomal University, D. I. Khan. Pakistan

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

19. PaperID 31011740: An Efficient and Secure One Way Cryptographic Hash Function with Digest Length of
1024 Bits (pp. 191-198)

Justice Nueteh Terkper, Department of Computer Science, Kwame Nkrumah University of Science and Technology,
Kumasi, Ghana
James Ben Hayfron-Acquah, Department of Computer Science, Kwame Nkrumah University of Science and
Technology, Kumasi, Ghana
Frimpong Twum, Department of Computer Science, Kwame Nkrumah University of Science and Technology,
Kumasi, Ghana

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

20. PaperID 31011744: Natural Terrain Feature Identification using Integrated Approach of Cuckoo Search
and Intelligent Water Drops Algorithm (pp. 199-215)

Iqbaldeep Kaur, Parminder Kaur, Amit Verma


Computer Science& Engineering, Chandigarh Engineering College,Landran, Punjab, India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

21. PaperID 31011747: Tree Based Cluster Energy Aware Routing In Wireless Sensor Networks (pp. 216-226)

Thirupathi Regula, Dr. Mohammed Ali Hussain


Dept. of Computer Science & Engineering, Shri Venkateshwara University Gajraula, Amroha, Uttar Pradesh, India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

22. PaperID 31011750: Security-as-a-Service in Cloud Computing (SecAAS) (pp. 227-230)
Baby Marina, Information Technology, SBBU, Shaheed Benazirabad
Dr. Irfana Memon, CSE, QUEST, Nawabshah
Fatima, telecommunication, QUEST, Nawabshah

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

23. PaperID 31011752: A New Enhanced Automated Fuzzy-Based Rough Decision Model (pp. 231-238)

Mohamed S.S.Basyoni, Ahmed Mohamed Gad Allah, Hesham A. Hefny.


Cairo University, Institute of Statistical Studies and Research, Department of Computer and Information Sciences

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

24. PaperID 31011757: Secrecy Capacity of a Rayleigh Fading Channel under Jamming Signal (pp. 239-246)

Habiba Akter, Department of ECE, East West University, Dhaka, Bangladesh


Md. Mojammel Islam, Department of ECE, East West University, Dhaka, Bangladesh
Md. Imdadul Islam, Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka,
Bangladesh.
M. R. Amin, Department of Electronics and Communications Engineering, East West University, Dhaka, Bangladesh

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

25. PaperID 31011759: ETL Based Query Processing Architecture for Sensornet (pp. 247-254)

Dileep Kumar, Department of Information Media, The University of Suwon, Hwaseong-si South Korea
Jangyoung Kim, Department of Computer Science, The University of Suwon, Hwaseong-si South Korea

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

26. PaperID 31011760: Link Prediction in Social Networks Based on Similarity Criteria and Behavioral
Patterns of Users (pp. 255-264)

Farnaz Sabzevari *, Islamic Azad University, Damavand Branch, Department of computer, Tehran, Iran
Ali HaroonAbadi, Islamic Azad University, Central Tehran Branch, Department of computer, Tehran, Iran
Javad Mir Abedini, Islamic Azad University, Central Tehran Branch, Department of computer, Tehran, Iran

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

27. PaperID 31011761: Ear Biometric System Using Speeded-up Robust Features and Principal Component
Analysis (pp. 265-269)

Dr. Habes Alkhraisat, (Al-Balqa Applied University) Department of Computer Science Al-balqa Applied University,
Asalt, Jordan

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

28. PaperID 31011764: Simulation of Various QAM Techniques Used in DVB-T2 & Comparison for Various
BER vs SNR (pp. 270-276)
Sneha Pandya, C. U. Shah University, Wadhwan.
Nimit Shah, Electrical Engg, C. U. Shah College of Engg & Technology.
Dr. C. R. Patel, V.V.P. Engineering College, Rajkot

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

29. PaperID 31011765: Assessing e-Government Systems Success in Jordan (e-JC): A Validation of TAM and IS
Success Model (pp. 277-304)

Arif Sari *, Murat Akkaya, Bashar Abdalla


Department of Management Information Systems, Girne American University, Kyrenia, Turkish Republic of
Northern Cyprus, via Mersin 10, Turkey

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

30. PaperID 31011766: Mining Student Data Using CRISP-DM Model (pp. 305-316)

Layth Almahadeen, Murat Akkaya, Arif Sari


Department of Management Information Systems, Girne American University

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

31. PaperID 31011769: Malware-Free Intrusion: A Novel Approach to Ransomware Infection Vectors (pp. 317-325)

Aaron Zimba, Department of Computer Science and Technology, University of Science and Technology Beijing,
Beijing 100083, China

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

32. PaperID 31011770: Network Forensics for Detecting Flooding Attack on Web Server (pp. 326-331)

Desti Mualfah, Department of Informatics, Islamic University of Indonesia, Yogyakarta, Indonesia


Imam Riadi, Department of Information Systems, Ahmad Dahlan University, Yogyakarta, Indonesia

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

33. PaperID 31011771: Adaptive Scheme for Application Methods Offloading in Mobile Cloud Computing (pp.
332-339)

Ahmed. A. A. Gad-ElRab, Farouk. A. Emara


Department of Mathematics, Faculty of Science, Al-Azhar University, Cairo, Egypt

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

34. PaperID 31011773: Spectral Unmixing from Hyperspectral Imagery Using Modified Gram-Schmidt
Orthogonalization and NMF (pp. 340-345)

Neetu N. Gyanchandani, Department of Electronics Engineering, Research Scholar, GHRCE, Nagpur, India
Dr. A. A. Khurshid, HOD, Electronics Engineering, RCOEM, Nagpur, India
Dr. Sanjay Dorle, HOD, Department of Electronics Engineering, GHRCE, Nagpur, India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

35. PaperID 31011774: Hyperspectral Image Compression Methods: A Review (pp. 346-350)

Neetu N. Gyanchandani, Department of Electronics Engineering, Research Scholar, GHRCE, Nagpur, India
Dr. A. A. Khurshid, HOD, Electronics Engineering, RCOEM, Nagpur, India
Dr. Sanjay Dorle, HOD, Department of Electronics Engineering, GHRCE, Nagpur,India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

36. PaperID 31011777: MQA: Mobility’s Quantification Algorithm in AODV Protocol (pp. 351-361)

Meryem SAADOUNE, Abdelmajid HAJAMI, Hakim ALLALI,


LAVETE Laboratory, Univ. HASSAN 1st, FSTS, Settat, Morocco

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

37. PaperID 31011778: Testing Coverage based Software Reliability Models: Critical Analysis and Ranking
based on Weighted Criterion (pp. 362-371)

Manohar Singh, Research Scholar, Department of Computer Science, OPJS University, Churu, Rajasthan, India
Dr. Vaibhav Bansal, Associate Professor, Department of Computer Science, OPJS University, Churu, Rajasthan,
India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

38. PaperID 31011783: Accelerating a Secure Communication Channel Construction Using HW/ SW Co-design
(pp. 372-377)

Roghayeh Mojarad, Hossain Kordestani


Department of Computer Engineering and Information Technology, Amirkabir University of Technology (Tehran
Polytechnic)

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

39. PaperID 31011789: An Efficient Zone-Based Routing Protocol for WSN (pp. 378-396)

Kamal Beydoun, Khodor Hammoud,


Department of Computer Science, Lebanese University, Beirut, Lebanon

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

40. PaperID 31011790: Zone Hierarchical Routing Protocol with Data Aggregation (pp. 397-405)

Kamal Beydoun,
Department of Computer Science, Lebanese University, Beirut, Lebanon

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]


41. PaperID 31011792: Live Forensics on RouterOS using API Services to Investigate Network Attacks (pp.
406-410)

Muhammad Itqan Mazdadi, Department of Informatics Engineering, Islamic University of Indonesia, Yogyakarta,
Indonesia
Imam Riadi, Department of Information System, Ahmad Dahlan University, Yogyakarta, Indonesia
Ahmad Luthfi, Department of Informatics Engineering, Islamic University of Indonesia, Yogyakarta, Indonesia

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

42. PaperID 31121621: Evaluating Maintainability of Open Source Software: A Case Study (pp. 411-429)

Feras Hanandeh (1), Ahmad A. Saifan (2), Mohammed Akour (3), Noor Al-Hussein (4), Khadijah Shatnawi (5)
(1) The Hashemite University, Zarqa, Jordan.
(2, 3) Software Engineering Department, Faculty of IT, Yarmouk University, Irbid, Jordan.
(4, 5) CIS Department, Faculty of IT, Yarmouk University, Irbid, Jordan.

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

43. PaperID 31121622: Classification of Human Vision Discrepancy during Watching 2D and 3D Movies Based
on EEG Signals (pp. 430-436)

Negin Manshouri, Masoud Maleki, Temel Kayıkçıoğlu


Department of Electrical and Electronics Engineering, Karadeniz Technical University, Trabzon, Turkey

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

44. PaperID 31121623: A New Brain-Computer Interface System Based on Classification of the Gaze on Four
Rotating Vanes (pp. 437-443)

Masoud Maleki, Negin Manshouri, Temel Kayıkcioglu


Department of Electrical and Electronics Engineering, Karadeniz Technical University, Trabzon, Turkey

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

45. PaperID 301116179: Decentralized Access Control with Anonymous Authentication for Secure Data
Storage on Cloud (pp. 444-449)

Shraddha Mokle, Department of Computer Engineering, Modern Education Society's College of Engineering, Pune,
India
Prof. Nuzhat F Shaikh, Department of Computer Engineering, Modern Education Society's College of Engineering,
Pune, India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

46. PaperID 301116214: Data Partition and Aggregation in MapReduce to Improve Processing Time (pp. 450-456)

Priya P. Gawande, Modern Education Society’s College of Engineering, Pune


Nuzhat F. Shaikh, Modern Education Society’s College of Engineering, Pune

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

47. PaperID 311016184: Studying the Numerical Methods for Calculating Bi-Phase Fluid Flow (pp. 457-463)

Behrouz Aghaei *, Afshin Mohseni Arasteh


North Branch, Islamic Azad University, Tehran, Iran

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

48. PaperID 30111603: A Novel Simple Method to Select Optimal k in k-Nearest Neighbor Classifier (pp. 464-469)

Masoud Maleki, Negin Manshouri, Temel Kayıkçıoğlu


Department of Electrical and Electronics Engineering, Karadeniz Technical University, Trabzon, Turkey

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

49. PaperID 301116112: A proposed Method for Face Image Edge Detection Using Markov Basis (pp. 470-476)

Husein Hadi Abbass, Zainab Radhi Mousa


Department of Mathematics, Faculty of Education for Girls, University of Kufa, Najaf, Iraq

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

50. PaperID 301116229: Face Recognition Age Invariant: A Closer Look (pp. 477-482)

Divyanshu Sinha, KCCITM, Noida, India


Dr. JP Pandey, KNIT, Sultanpur, India
Dr. Bhavesh Chauhan, ABESIT

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

51. PaperID 31011782: Comparative Study for Selection of an Item Based on Multi-Criteria DSS (pp. 483-492)

Viharika Padma, Anurag Group of Institutions, Hyd, India


Sanjana B L, Anurag Group of Institutions, Hyd, India
M Varaprasad Rao, Anurag Group of Institutions, Hyd, India

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

52. PaperID 31011787: Implementation of Indian Sign Language Recognition System using Scale Invariant
Feature Transform (SIFT) (pp. 493-507)

Sandeep Baburao Patil (1), Rajesh H. Talwekar (2)


(1) Electronics & Telecommunication, Faculty of Engineering and Technology of Shri Shankaracharya Technical
Campus, Chhattisgarh Swami Vivekanand Technical University, Bhilai, India.
(2) Electronics & Telecommunication, Government Engineering College, Raipur, India
Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

53. PaperID 301116196: Indexes’ Optimal Selection for Data Warehouse Quality (pp. 508-514)

Dr. Murtadha M. Hamad, Mohanad Ahamed Salih


Department of Computer Science, College of Computer Sciences & Information Technology, University of Anbar
Baghdad, Iraq

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

54. PaperID 301116231: A Survey on Different Methods of Software Cloning and Detection (pp. 515-535)

Syed Mohd Fazalul Haque, Maulana Azad National Urdu University


V. Srikanth, K L University
E. Sreenivasa Reddy, Acharya Nagarjuna University

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

55. PaperID 31011776: Performance Comparison of Adaptive OFDM Pre- and Post-FFT Beamforming System
(pp. 536-543)

Waleed Abdallah, Yousef Abuzir, Mohamad Khdair,


Faculty of Technology and Applied Sciences, Al-Quds Open University, Jerusalem, Palestine

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]

56. PaperID 31011735: Designing of Cloud Storage using Python Language (pp. 544-552)

Shipra Goel

Full Text: PDF [Academia.edu | Scopus | Scribd | Archive | ProQuest]


International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

Link Prediction in Social Networks Based on Similarity Criteria and Behavioral Patterns of Users

Farnaz Sabzevari *, MA in Computer Eng-Software, Islamic Azad University, Damavand Branch, Department of Computer, Tehran, Iran ([email protected])
Ali HaroonAbadi, PhD in Computer Eng-Software, Islamic Azad University, Central Tehran Branch, Department of Computer, Tehran, Iran ([email protected])
Javad Mir Abedini, PhD in Computer Eng-Software, Islamic Azad University, Central Tehran Branch, Department of Computer, Tehran, Iran ([email protected])

Abstract- Link prediction is one of the most important and common tasks in social network analysis and network graph analysis. Link prediction estimates the possibility that a connection will be established between two vertices that currently have no relationship, using the information available in the network and knowledge of the connections that already exist. A variety of link prediction methods have been presented for social networks. The features used to determine similarity, extracted from the network graph, include local and global characteristics: local features have the advantage of speed, while global features have the advantage of precision. In this study, the aim is to implement a link prediction system by clustering users on their profile properties and then applying the Friend Link algorithm within each cluster. By using these techniques, the precision of prediction can be raised: the precision of the proposed method improves on the spectral link method by close to 4% on average.

Keywords: Link prediction, social network, clustering, Friend Link

I. INTRODUCTION
Social networks are dynamic networks whose members, and the linkages between them, are constantly increasing. The chain of these links is incomplete, either because of the growth process or because of links that are not reflected in these networks and have been torn or lost. Therefore, one of the important issues in social networks is link prediction, which concerns the presence or absence of a link between two vertices of a social network in the future, and it is an important tool for social network analysis.

Graphs are used to display social networks: nodes in the graph play the role of members, and edges play the role of communication between these people [1][2]. In this paper, we are going to create a recommender system on the web by using graph theory and a similarity measure. The second section covers the basic concepts and definitions needed in the following parts. The work performed so far on the subject of this study is analyzed in the third section. The proposed technique is described in section four; in the fifth section the simulation and evaluation of the proposed method are described; and finally the sixth section presents the conclusion.

255 https://sites.google.com/site/ijcsis/
ISSN 1947-5500

II. BACKGROUND

In this section we review the general concepts and definitions that will be used in the following sections.

A. LINK MINING
Link mining was formally proposed as a topic in [3][4], with link prediction considered one of its sub-fields. Sometimes in networks, some links arise accidentally due to an error in the network. These incorrect links can disrupt the network structure and its study; with the help of link prediction, such links can be identified and removed from the network [5].

B. MODELING OF SOCIAL NETWORKS

Modeling is simulation on a smaller scale than the original object. Two main forms are used to display social networks: a graph-based display and a matrix display. The models presented for social networks are mathematical models: they describe a social network with the help of the mathematical tools of graphs and matrices. While matrices are suitable for small social networks, graphs are usually suitable to display networks in different fields such as computer science, sociology, biology, etc. [6].
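Both displays can be sketched in a few lines; the member names and edges below are invented for the example.

```python
# Build the two standard displays of a small social network:
# an edge list (graph view) and an adjacency matrix (matrix view).
members = ["Ana", "Ben", "Cleo", "Dan"]               # hypothetical users
edges = [("Ana", "Ben"), ("Ben", "Cleo"), ("Ana", "Dan")]

index = {name: i for i, name in enumerate(members)}
n = len(members)

# Undirected network: the adjacency matrix is symmetric.
adj = [[0] * n for _ in range(n)]
for u, v in edges:
    adj[index[u]][index[v]] = 1
    adj[index[v]][index[u]] = 1

for row in adj:
    print(row)
```

The edge list stays compact for sparse networks, while the matrix form supports the algebraic operations used later (e.g. matrix powers).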

C. LINK PREDICTION METHODS

Different strategies of link prediction can be divided into three categories:

First group: solutions based on similarity criteria. These methods use structural characteristics of the network graph to recognize similarities between network nodes, and fall into three sub-groups: local, quasi-local, and global.

Second group: solutions based on maximum likelihood. In these solutions, while studying the network structure, rules and features that increase the probability of links are extracted.

Third group: solutions based on statistics. In these methods, statistical models and the relevant distributions are used for link prediction.
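As an illustration of the first group, the common-neighbours index (a standard local similarity criterion, used here only as an example) can be sketched as follows; the four-node network is invented for the illustration.

```python
# Local similarity scoring for link prediction: the common-neighbours
# index, the simplest member of the "local" family described above.
from itertools import combinations

edges = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")}  # toy network

def neighbours(node):
    return ({v for u, v in edges if u == node}
            | {u for u, v in edges if v == node})

def common_neighbours(x, y):
    return len(neighbours(x) & neighbours(y))

nodes = {u for e in edges for u in e}
# Score every currently unlinked pair; a higher score suggests a more
# likely future link.
scores = {
    (x, y): common_neighbours(x, y)
    for x, y in combinations(sorted(nodes), 2)
    if (x, y) not in edges and (y, x) not in edges
}
print(scores)   # → {('A', 'D'): 1, ('B', 'D'): 1}
```

Local indices such as this one read only the immediate neighbourhood of each pair, which is what gives the local family its speed advantage.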

D. CLUSTERING
Clustering is the process of grouping objects into clusters such that the members of each cluster have the maximum similarity to each other and the minimum similarity to members of other clusters [7]. Clustering methods are divided into two groups:

A group that represents each cluster by the central point of its data, like the K-means algorithm [8][9].

A group that represents each cluster by the data point nearest to the center of the cluster; the K-medoid algorithm is from this group.


E. RECOMMENDER SYSTEMS
Recommender systems are systems that help users find and select their desired items. Naturally, such a system cannot make recommendations without accurate information about users and their desired items (for example movies, music, books, etc.) [10].

III. RELATED WORKS

In this part, work performed in the field of data clustering and link prediction is reviewed.

A. CLUSTERING WITH THE LOUVAIN COMMUNITY DETECTION ALGORITHM

The main purpose of community detection is to place similar nodes in the same cluster. Optimization of the modularity function is one of the most widely used methods for community detection. The modularity measure presented by Newman and Girvan [11] is one of the best-known functions for measuring communities. The modularity function always lies between zero and one, where a high value indicates a proper division of the graph. The Louvain algorithm clusters the graph by maximizing the modularity function.
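The modularity of a given partition can be computed directly from its definition, Q = sum over communities c of (L_c/m - (d_c/2m)^2), where m is the total number of edges, L_c the number of edges inside community c, and d_c the total degree of community c. A minimal sketch, on a graph invented for the example:

```python
# Newman-Girvan modularity of a given partition, computed directly
# from its definition: Q = sum_c ( L_c/m - (d_c / 2m)^2 ).
def modularity(edges, communities):
    m = len(edges)
    community_of = {}
    for c, members in enumerate(communities):
        for node in members:
            community_of[node] = c
    inside = [0] * len(communities)   # L_c: intra-community edges
    degree = [0] * len(communities)   # d_c: total degree of community c
    for u, v in edges:
        degree[community_of[u]] += 1
        degree[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            inside[community_of[u]] += 1
    return sum(lc / m - (dc / (2 * m)) ** 2
               for lc, dc in zip(inside, degree))

# Two well-separated triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(round(modularity(edges, [{0, 1, 2}, {3, 4, 5}]), 4))   # → 0.3571
```

The natural two-community split scores well above zero, while the trivial one-community partition scores exactly zero, matching the statement that higher values indicate a proper division.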

B. THE SPECTRAL LINK RECOMMENDER ALGORITHM

This algorithm uses multi-way spectral clustering methods, introducing ways to find nearby nodes in the network. Its input is the graph data created by connecting nodes in the social network; its output is a matrix of similarities between the nodes of the graph. People can then be offered to users as friends based on their similarity scores [12].

C. THE FRIEND LINK RECOMMENDER ALGORITHM

This algorithm focuses on the links that connect the nodes of an online social network. Its basis is a new node-similarity index that exploits both local and global characteristics of a network. The Friend Link method finds the similarity between nodes in an undirected graph constructed from the communication data. The input of the Friend Link algorithm is the set of connections of a graph G, and its output is a similarity matrix between any two nodes of G. Accordingly, friendship offers can be based on the weights in the similarity matrix: after running the Friend Link algorithm, we obtain the similarity between any two nodes of the graph and offer friends in order of importance [13].

IV. PROPOSED METHOD

The proposed method proceeds in two stages. First, the K-means clustering technique clusters users using the features of their profiles; then the Friend Link algorithm mentioned in the previous section performs link prediction. New online users are led to the appropriate cluster, and friends are suggested to them based on the similarity measure.

A. FEATURE EXTRACTION FROM USERS' PROFILES IN THE PROPOSED METHOD

The features that are required for clustering and should be extracted from users' profiles are displayed in Table 1. Feature selection is important from two aspects: first, selecting some of the features instead of all of them reduces the clustering time; secondly, many features in practice play no role in increasing the efficiency and in fact reduce it. Therefore, the precision and efficiency of clustering can be increased by identifying these features.

TABLE 1: FEATURES USED FOR CLUSTERING

Feature
Profile status (public = 1, private = 0)
Completion percentage of personal information
Gender (Male = 1, Female = 0)
Age
Weight
Height
Interest in music ("I like music")
Interest in movies ("I like movies")
Relation to children

B. PROCESS OF THE PROPOSED METHOD

The proposed method consists of two stages, as follows:

a) Clustering the database based on the features of users' profiles

b) Applying the Friend Link algorithm in each cluster for link prediction

C. CLUSTERING APPROACH IN THE PROPOSED METHOD

In this section we cluster users using the feature vectors extracted from their profiles. The purpose of this clustering is to locate similar users in the same cluster. The K-means algorithm mentioned in the previous sections is used for clustering; it uses the Euclidean distance to calculate the distance between users.

D. PROCESS OF CLUSTERING
a) Obtain the centers of the clusters, which are in fact the average points of each cluster.
b) Assign each data point to the cluster whose center is at the shortest distance from it.
c) In the simple form of this method, first some points are randomly selected, equal to the number of clusters required. Then the data are attributed to one of these clusters according to similarity, so new clusters are obtained. By repeating this procedure, in each iteration new cluster centers are calculated as the average of the data, and the data are again attributed to the new clusters. This process continues as long as the assignments keep changing. The objective function is shown in equation (1):

J = Σ_{j=1}^{k} Σ_{i=1}^{n} || x_i^(j) − c_j ||²    (1)

where || x_i^(j) − c_j || is the measure of distance between the data point x_i^(j) and the cluster center c_j, and c_j is the center of cluster j.

The basic algorithm for this method is:

a) K points are selected as the clusters' centers.

b) Each sample is attributed to the cluster whose center has the least distance to it.
c) After all data belong to one of the clusters, a new point is calculated as the center of each cluster (the average of the points belonging to that cluster).

Stages b and c are repeated as long as there is any change in the cluster centers.

Calculating the distance between two data points is very important in clustering: by calculating the distance between two data points, one can understand how close they are, and accordingly put them in the same cluster.
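The assignment and update steps above can be sketched as a short loop; the one-dimensional feature values and initial centers below are invented for the example.

```python
# Minimal K-means over 1-D feature values, following the steps above:
# assign points to the nearest center, recompute centers as averages,
# and repeat until the centers stop changing.
def kmeans(points, centers):
    while True:
        # Assignment step: nearest center by squared distance.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: (p - centers[c]) ** 2)
            clusters[j].append(p)
        # Update step: each center becomes the mean of its cluster.
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:          # converged: no change
            return centers, clusters
        centers = new_centers

centers, clusters = kmeans([1.0, 1.2, 0.8, 8.0, 8.4, 7.6], [0.0, 10.0])
print(centers)    # two cluster centers near 1.0 and 8.0
```

In the proposed method the points are multi-dimensional profile feature vectors and the distance is Euclidean; the structure of the loop is unchanged.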

E. INTRODUCTION OF THE SILHOUETTE FACTOR

The silhouette factor is one of the most common ways of internal evaluation of clusters. This method works based on calculating the cohesion and separation of the data, and its value is in the range of -1 to 1. The method performs its calculations on the points of all clusters.
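A minimal sketch of the silhouette computation from its usual per-point definition, s = (b - a) / max(a, b), where a is the mean distance to the point's own cluster and b the lowest mean distance to any other cluster; the one-dimensional points are invented and absolute difference serves as the distance.

```python
# Silhouette factor: mean of (b - a) / max(a, b) over all points.
def silhouette(clusters):
    scores = []
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            others = [q for q in cluster if q is not p]
            if not others:
                continue                      # singleton: score undefined
            a = sum(abs(p - q) for q in others) / len(others)
            b = min(sum(abs(p - q) for q in other) / len(other)
                    for cj, other in enumerate(clusters)
                    if cj != ci and other)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

tight = [[1.0, 1.1, 0.9], [9.0, 9.2, 8.8]]    # well-separated clusters
print(round(silhouette(tight), 3))             # close to +1
```

Well-separated, compact clusters score near +1; overlapping clusters drive the factor toward 0 and below, which is why the factor is used below to pick the number of clusters K.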

F. LINK PREDICTION SYSTEM

After the clustering process and identification of the clusters, we implement the Friend Link algorithm within each of the clusters. When the adjacency matrix of the graph G is raised to the second power, it shows the number of paths of length two that exist between each pair of nodes of the graph. The elements of this matrix are then updated.

In the next section, a new similarity measure is defined to express the proximity between nodes of the graph. If v_i and v_j are two nodes of a graph and "Sim" is a function that holds similarities, then the higher the similarity score between two nodes, the higher the probability that they will become friends.
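The matrix-power observation can be checked directly: for an adjacency matrix A, entry (i, j) of A² equals the number of paths of length two from node i to node j. A small sketch on a three-node path graph:

```python
# Squaring the adjacency matrix: entry (i, j) of A^2 counts the
# length-2 paths between nodes i and j.
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Path graph 0 - 1 - 2: exactly one 2-path joins node 0 to node 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
A2 = matmul(A, A)
print(A2[0][2])   # → 1
```

Note that the diagonal of A² counts closed 2-walks (each node's degree), which is one reason the elements must be updated before being used as path counts.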

sim(v_i, v_j) = Σ_{ℓ=2}^{L} (1 / (ℓ − 1)) · |paths_{i,j}^{ℓ}| / Π_{k=2}^{ℓ} (n − k)

wherein:

n is the number of vertices in the graph G,

L is the maximum length of a path considered between the nodes of the graph,

1/(ℓ − 1) is the attenuation factor that weights the paths according to their length ℓ,

|paths_{i,j}^{ℓ}| is the number of all paths of length ℓ from v_i to v_j, and

Π_{k=2}^{ℓ} (n − k) is the number of all possible paths of length ℓ from v_i to v_j.

In this algorithm, paths containing cycles are not considered in the similarity measurement.

The new matrix therefore contains the similarity of each pair of users according to the aforementioned relationship. Eventually, the users with the highest ratings are suggested to the target user.
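Assuming the attenuation factor 1/(ℓ − 1) and the normalization Π_{k=2..ℓ}(n − k) of the Friend Link formulation in [13] (the formula above is reconstructed from that source), the similarity can be sketched as below; the square graph and the helper names are invented, and cycle-free paths are counted by depth-first search, matching the no-cycles rule.

```python
# Sketch of the Friend Link similarity: simple (cycle-free) paths of
# length 2..L are counted by DFS, attenuated by 1/(l - 1), and
# normalised by prod_{k=2..l}(n - k).
def friendlink(adj, i, j, L):
    n = len(adj)

    def count_paths(u, target, length, visited):
        if length == 0:
            return 1 if u == target else 0
        return sum(count_paths(v, target, length - 1, visited | {v})
                   for v in range(n) if adj[u][v] and v not in visited)

    score = 0.0
    for l in range(2, L + 1):
        possible = 1
        for k in range(2, l + 1):
            possible *= (n - k)
        score += (1.0 / (l - 1)) * count_paths(i, j, l, {i}) / possible
    return score

# Square graph 0-1-2-3-0: nodes 0 and 2 are joined by two 2-paths.
adj = [[0, 1, 0, 1],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [1, 0, 1, 0]]
print(round(friendlink(adj, 0, 2, 3), 3))
```

The exhaustive DFS is exponential in L and serves only to make the formula concrete; on real network data, matrix-power approximations of the path counts are the practical route.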

V. SIMULATION AND EVALUATION OF THE PROPOSED METHOD

In this section we present the simulation results of the proposed method, along with details of the implementation, the data used, the results, and a comparison with previous studies.

A. IMPLEMENTATION TOOLS
Matlab and Weka were used to implement the proposed method.

Clustering was performed using Weka, and the results were validated in Matlab. Table 2 shows the silhouette factor calculated for K-means clustering with different values of K; according to the results, K = 6 gives the best clustering with the K-means method.

B. DATABASE USED IN THE PROPOSED METHOD

The dataset, from Stanford University, contains more than ten years of user information, covering about 1.6 million users and 31 million communications among them. The dataset is composed of two parts:

a) A dataset of users' profiles, including gender, age, interests, education status, etc., which is used for clustering users.
b) A dataset of communications between users, which defines the connections among them; this part is used for link prediction. We work on 2000 records from this database to evaluate our proposed method.

TABLE 2: SILHOUETTE FACTOR OBTAINED FOR K CLUSTERS

Number of cluster Silhouette factor

5 0.3425

6 0.4155

7 0.3749

8 -0.1224

9 -0.0322


The first part relates to the clustering of users' features, which were listed in the previous section. Figure 1 shows the clustering.

The number of people in each cluster can be seen in Table 3.

C. EVALUATION OF THE PROPOSED METHOD

To evaluate the performance of the proposed method, two evaluation criteria are used: precision and recall.

Precision: the recognition precision reported for each cluster, obtained by dividing the number of correctly estimated links by the total number of estimated links in the test data.

Recall: also reported per cluster, obtained by dividing the number of correctly estimated links by the number of correct links in the test data.
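Both criteria can be computed directly from these definitions; the predicted and actual link sets below are invented for illustration.

```python
# Precision and recall exactly as defined above: predicted links are
# compared against the links that actually exist in the test data.
def precision_recall(predicted, actual):
    correct = len(predicted & actual)              # correctly estimated links
    precision = correct / len(predicted)           # correct / all estimated
    recall = correct / len(actual)                 # correct / all true links
    return precision, recall

predicted = {(1, 2), (1, 3), (2, 4), (3, 5)}       # links the system suggests
actual = {(1, 2), (2, 4), (3, 5), (4, 5), (2, 5)}  # links in the test data
p, r = precision_recall(predicted, actual)
print(p, r)   # → 0.75 0.6
```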

FIGURE 1: CLUSTERS

TABLE 3: PEOPLE IN EACH CLUSTER

Cluster number Number of people in each cluster

1 450
2 256

3 379

4 385

5 285

6 246


D. EVALUATION RESULTS OF PROPOSED METHOD

TABLE 4 : EVALUATION RESULTS OF PROPOSED METHOD

Cluster number Precision Recall

1 0.938 0.939

2 0.898 0.907
3 0.891 0.899
4 0.928 0.925
5 0.896 0.898
6 0.879 0.895

As can be seen in Table 4, link prediction is performed with high precision.

E. COMPARISON OF THE PROPOSED METHOD

Finally, we compared the proposed method with the Spectral Link method introduced earlier, on the same database. The comparison can be seen in the following charts: Figure 2 and Figure 3 show the comparison of the proposed method with Spectral Link.

FIGURE 2 :COMPARE THE PRECISION OF THE PROPOSED METHOD WITH SPECTRAL LINK


FIGURE 3 :COMPARE THE RECALL OF THE PROPOSED METHOD WITH SPECTRAL LINK

As can be seen, the precision and recall of the proposed method are higher than those of the Spectral Link algorithm in each cluster.

VI. CONCLUSION
The main purpose of this study is to improve the precision and efficiency of predicting the similarity of people and making suggestions to them in a social network. The results show that our proposed method reduces complexity and increases precision. It therefore also increases the precision of suggestions in social networks, which makes it possible to identify the people most similar to the user in each cluster. One clustering technique was used in this study, and the Friend Link algorithm was then applied in each cluster. To further improve precision and efficiency in future work, a supervised classifier such as a neural network can be applied after clustering, before applying the Friend Link algorithm.


REFERENCES
[1] Jannach, D., Zanker, M., Felfernig, A. and Friedrich, G., 2010. "Recommender Systems: An Introduction", Cambridge University Press, New York, 2010, 352 p.

[2] Konstan, J., & Riedl, J., 2012, “Recommender systems: from algorithms to user experience”, Springer, vol. 22, pp. 101-123.

[3] Getoor, L. and Diehl, C. P., 2005. "Link mining: a survey," ACM SIGKDD Explorations Newsletter ACM Digital Library , vol. 7, pp.
3-12,

[4] Lü, L. and Zhou, T., 2011. "Link prediction in complex networks: A survey," Physica A: Statistical Mechanics and its Applications,
vol. 390, pp. 1150-1170,

[5] Liben‐Nowell, D. and Kleinberg, J., 2007. "The link‐prediction problem for social networks," Journal of the American society for
information science and technology, vol. 58, pp. 1019-1031,

[6] Schifanella, R., Barrat, A., Cattuto, C., Markines, B., Menczer, F., 2010, “Folks in folksonomies: social link prediction from shared
metadata.”, In: Proceedings 3rd ACM International Conference on Web Search and Data Mining (WSDM’2010), New York, NY, pp. 271–
280.

[7] Cui, X., and Wang, F., 2015.” An Improved Method for K-Means Clustering.” In 2015 International Conference on Computational
Intelligence and Communication Networks (CICN) IEEE ,pp. 756-759.

[8] Gupta, H. and Srivastava, R., 2014. “k-means Based Document Clustering with Automatic “k” Selection and Cluster Refinement”.
International Journal of Computer Science and Mobile Applications, 2(5), pp.7-13.

[9] Cui, X. and Wang, F., 2015, “An Improved Method for K-Means Clustering”. In International Conference on Computational
Intelligence and Communication Networks (CICN) (pp. 756-759). IEEE.

[10] Zhang, J. and Philip, S.Y., 2014. “Link Prediction across Heterogeneous Social Networks: A Survey”. SOCIAL NETWORKS.

[11] Newman, M. E. J. and Girvan, M., 2004. "Finding and evaluating community structure in networks", Phys. Rev. E 69, 026113.

[12] Symeonidis, P., Iakovidou, N., Mantas, N. and Manolopoulos, Y., 2013. “From biological to social networks: Link prediction based on
multi-way spectral clustering.” Journal of Data & Knowledge Engineering , vol.87, pp.226-242.

[13] Papadimitriou, A., Symeonidis, P., & Manolopoulos, Y., 2012, "Fast and accurate link prediction in social networking systems",
Journal of Systems and Software, vol.85, pp. 2119-2132.


Ear Biometric System Using Speeded-up Robust Features and Principal Component Analysis
Habes Alkhraisat1

1 Department of Computer Science, Al-Balqa Applied University, Al-Salt, Jordan; [email protected]

Abstract— Recently, identification of individuals using personal biometric features has become widely used in security monitoring, access control, and criminal investigation systems. Nowadays, fingerprints, iris, and face are the most popular biometric characteristics used in biometric systems. In recent years, ear recognition techniques have received increasing attention. The outer ear is a universal, unique, permanent, measurable, and high-performing biometric characteristic, and its structure does not change as a person ages. Therefore, in recent decades many experiments have been conducted on ear biometric features. This article presents a robust technique for improving the performance of ear recognition. The proposed technique combines the advantages of Speeded-Up Robust Features (SURF) for feature extraction, Principal Component Analysis (PCA) to reduce the feature vector to a lower dimension, which improves the computational efficiency, and the scalable K-means++ algorithm for feature clustering. The experimental results demonstrate the robustness, accuracy, efficiency, and performance of the new technique.

Keywords — Ear recognition, Feature extraction, Speeded-up robust features, scalable K-means++, Principal component analysis
——————————  ——————————

1 INTRODUCTION
Biometrics is the process of identifying an individual using physiological or behavioral characteristics [1]. Nowadays, various biometric characteristics have been studied, like fingerprints, face, iris, and ear; the ear is a comparatively new biometric feature. The French criminologist Bertillon discovered that it is possible to identify individuals based on the shape of their outer ear [2], and the first ear recognition system, based on seven ear features, was proposed by the American police officer Iannarelli [3].

The ear has a unique and permanent structure, as its appearance does not change with increasing age. Besides that, the acquisition of ear images does not require a person's cooperation. Therefore, the ear seems suitable for recognition of personal identity based on features derived from ear images, and the interest in ear biometric systems has grown in the last two decades.

The ear is an ideal biometric candidate due to the following characteristics: (i) its structure is rich and stable, and it is consistent over the lifetime of an individual; (ii) its structure is not affected by pose and facial expression; (iii) it is collectable; and (iv) it is immune from the privacy, anxiety, and hygiene problems associated with several other biometrics.

The human outer ear is formed by the outer helix, the antihelix, the lobe, the tragus, the antitragus, and the concha (figure 1).

Fig. 1. Characteristics of the human ear

Research into age- and sex-related changes in the human ear has shown that the outer ear maintains its structure with increasing age [4][5][6]. The study in [7] demonstrates that short periods of time do not affect the recognition rate, even though the potential effect of aging on biometric ear recognition is still subject to further research and has yet to be fully explored scientifically.

Recently, both among forensic scientists and in anatomists' and anthropologists' circles, it has become an accepted fact that the structure of the external ear enables identification of individuals [8]. Generally, anthropologists recommend the shape of the external ear to differentiate between individuals [9].

This paper aims to develop a robust ear recognition system by integrating the advantages of the following techniques: Speeded-Up Robust Features (SURF) [10], Principal Component Analysis (PCA) [11], and the scalable K-means++ algorithm [12]. The motivation of this paper is the performance and efficiency of an ear recognition schema with scale and pose invariance. The proposed method consists of 4 stages. It starts by constructing ear SURF descriptors. The second stage combines the SURF descriptor with the PCA algorithm to extract and construct the ear local descriptors. The third stage is concerned with clustering the ear local descriptors by applying the scalable K-means++ algorithm. Finally, the classification of the ear images is carried out by calculating local and global similarities.

The remainder of this paper is organized as follows. Section 2 discusses all stages of the proposed method. The experimental results are demonstrated in Section 3. Section 4 provides final conclusions.

2 EAR RECOGNITION SYSTEM

The proposed ear recognition system is composed of the following 3 main phases (figure 2): feature extraction, feature clustering, and feature classification.

Fig. 2. Overview of ear recognition system architecture

The ear feature extraction phase has an important effect on the accuracy of ear identification and the matching process. This phase combines Speeded-Up Robust Features (SURF) [10] and Principal Component Analysis (PCA) [11] to construct the features' local descriptors. SURF detectors locate the interest points by applying the fast Hessian matrix and extract the feature vectors of the interest points as 64-dimensional SURF descriptors; for fast indexing and computational efficiency, PCA is applied to the interest points to speed up the matching process.

Once the features are extracted from the ear image, the objective is to classify them. In the feature classification phase, the local descriptors of the ear image are first divided into several sub-regions by applying the scalable K-means++ algorithm, and the features of each sub-region are compared separately; then the local and global similarities are calculated and integrated to classify the ear images. The classification strategy is similar to [13].

Figure 3 illustrates all stages of the proposed ear recognition method.

Fig. 3. Proposed ear recognition scheme

2.1 Interest point detection

As in Figure 3, the first step of feature extraction is interest point detection at different scales. For this purpose, the SURF detector [10] applies a determinant-of-Hessian-matrix approximation to integral images, which reduces the computation time and detects blob-like structures at locations where the determinant is maximal.

The value of the integral image I_Σ(x) at a location x = (x, y) in an input image I represents the sum of the pixel values of I within the rectangular region above and to the left of (x, y):

I_Σ(x) = Σ_{i≤x} Σ_{j≤y} I(i, j)    (1)

The Hessian matrix H(x, σ) at scale σ for a point x = (x, y) in an image I is defined as follows:

H(x, σ) = [ L_xx(x, σ)  L_xy(x, σ)
            L_yx(x, σ)  L_yy(x, σ) ]    (2)

2.2 Interest point description

In [10], SURF local feature descriptors describe a pixel in an image using its local content; they provide a unique and robust description of an image feature under various conditions, including small deformations, localization errors, and rotations.

The construction of SURF descriptors consists of identifying a reproducible orientation based on information from a circular region around the interest point and then constructing a square region aligned to the selected orientation. Finally, features between two images are matched. The square region is split up equally into 4×4 smaller squares. The Haar wavelet responses in the horizontal (d_x) and vertical (d_y) directions within each sub-region, weighted with a Gaussian (σ = 3.3s) centred at the interest point, are computed and summed up to construct the first set of entries in the feature vector of the interest point. The sums of the absolute values of the horizontal (d_x) and vertical (d_y) responses are also extracted and concatenated with this set of entries in the feature vector.

The dominant orientation is estimated by calculating the sum of all responses within a sliding orientation window of size π/3 (figure 4).

2.3 Fast indexing for matching

To speed up the matching stage, the SURF descriptor includes the sign of the Laplacian for the blob-type interest points. The sign of the Laplacian distinguishes dark blobs on bright backgrounds from the reverse situation. Only features with the same sign of the Laplacian are compared, which allows for faster matching and a lower computational cost of matching, without reducing the descriptor's performance.
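Equation (1) can be illustrated with a short sketch: the integral image is built in a single pass, after which the sum over any axis-aligned box (as used by the fast Hessian computation) needs only four lookups. The 3×3 image is invented for the example.

```python
# Integral image as in equation (1): each cell holds the sum of all
# pixels above and to the left, so any box sum costs four lookups.
def integral_image(img):
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def box_sum(ii, x0, y0, x1, y1):
    """Sum of img[y0..y1][x0..x1] from four integral-image lookups."""
    total = ii[y1][x1]
    if x0 > 0:
        total -= ii[y1][x0 - 1]
    if y0 > 0:
        total -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        total += ii[y0 - 1][x0 - 1]
    return total

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(ii[2][2], box_sum(ii, 1, 1, 2, 2))   # → 45 28
```

This constant-time box sum is what makes the box-filter approximation of the Hessian determinant cheap at every scale.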

Fig. 4. Orientation assignmen Fig. 5. Feature vectors’ projection process

3.4. Principal Component Analysis


To reduce the dimension of descriptor to a lower di- 2.5. Feature clustering
mension, and to improve the computation efficiency, the
Principal Components Analysis (PCA) is applied to the 64- An ear image region is divided into 4 sub-regions by
dimension SURF descriptor. applying the scalable k-means++ algorithm [14]. Each sub-
The dimensionality reduction of feature vector im- resion features are compared separately. Scalable K-
proves the computational efficiency, simplifies, reduces, means++ algorithm consists of two steps: select the first
and cleans the data without much loss of information, reduces the memory usage, and speeds up classification. PCA projects data from a higher dimension to a lower dimension such that the error incurred by reconstructing the data in the higher dimension is minimized.
In the proposed method, PCA [11] is applied to the matrix of feature vectors to construct a projection matrix that represents the useful and relevant information of the feature vectors. The projection matrix is a compact low-dimensional encoding of the high-dimensional matrix of feature vectors, called the PCA-SURF descriptors.
The feature vectors' projection involves the following steps:
(1) Obtain an n×t matrix M representing the set of training vectors of t images in the n-dimensional space.
(2) Compute the n-dimensional mean vector (μ) of the training set images as follows:
    μ = (1/t) Σ_{i=1}^{t} M_i  (1)
(3) Subtract the mean μ from each dimension of M to give the mean-adjusted matrix:
    w_i = M_i − μ  (2)
(4) Compute the total scatter (covariance) matrix by the following equation:
    S = Σ_{i=1}^{n} w_i w_i^T  (3)
(5) Calculate the eigenvectors U_L and corresponding eigenvalues λ_L of the covariance matrix S. Eigenvectors with high eigenvalues represent dimensions of greater variability.
(6) Sort the eigenvectors by decreasing eigenvalue. The eigenvectors with the highest eigenvalues are selected to form the projection matrix P.
(7) Transform the n×t matrix M into the n′×t matrix N by projecting the mean-adjusted matrix over the projection matrix, N = P×M.
The flowchart of the feature vectors' projection is shown in Figure 5.

center C uniformly at random from the data and take each point to the nearest centroid cluster. The scalable K-means++ involves the following steps:
(1) Uniformly at random pick an initial center C from X.
(2) Calculate the initial cost ψ = Σ_x d²(x, C)
(3) For O(log ψ) iterations do
    a. Calculate the oversampling factor ℓ = Ω(k)
    b. C′ ← sample each point x ∈ X independently with probability p_x = ℓ·d²(x, C)/ψ
    c. C ← C′ ∪ C
(4) End for
(5) For x ∈ C, set ω_x to be the number of points in X closer to x than any other point in C
(6) Recluster the weighted points in C into k clusters.
Figure 6 illustrates sample ear images of the five different clustering approaches.

3.6 Features Classification
For ear feature classification, the local and global similarities for the ear sub-regions are calculated and integrated to classify the ear images. The fast indexing method [7][14] is used for filtering the interest points, which reduces the complexity of computing similarity. Figure 7 illustrates the computation process for local and global similarity.
Suppose that the feature descriptors of a test ear image I_t in k sub-regions are denoted by:
    I_t = (f_1^1, ⋯, f_1^{m_1}, f_2^1, ⋯, f_2^{m_2}, ⋯, f_k^1, ⋯, f_k^{m_k})  (5)
where f_k^j is the j-th feature descriptor in the k-th sub-region of image I.
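The PCA projection steps (1)-(7) listed above can be sketched in Python with NumPy. This is a minimal illustrative sketch under the stated shapes (descriptors as columns of M), not the authors' implementation; the descriptor dimension of 64 in the example is only an assumption typical of SURF.

```python
import numpy as np

def pca_projection(M, n_prime):
    """Steps (1)-(7): project the n x t descriptor matrix M (one column
    per training vector) onto its top n' principal components."""
    mu = M.mean(axis=1, keepdims=True)        # (2) mean vector
    W = M - mu                                # (3) mean-adjusted matrix
    S = W @ W.T                               # (4) scatter/covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)      # (5) eigen-decomposition
    order = np.argsort(eigvals)[::-1][:n_prime]
    P = eigvecs[:, order].T                   # (6) projection matrix (n' x n)
    N = P @ W                                 # (7) n' x t PCA-SURF matrix
    return N, P, mu

# Example: 100 SURF descriptors of dimension 64 reduced to 32 dimensions
M = np.random.rand(64, 100)
N, P, mu = pca_projection(M, 32)
print(N.shape)  # (32, 100)
```

Because S is symmetric, `np.linalg.eigh` returns orthonormal eigenvectors, so the rows of P form an orthonormal basis of the retained subspace.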
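The scalable K-means++ initialization steps above (following [12]) can be sketched as follows. This is a simplified single-machine illustration of the sampling loop (the original algorithm is distributed); the point set, k, and the oversampling factor ℓ in the example are arbitrary illustrative choices.

```python
import math
import random

def d2(p, centers):
    """Squared Euclidean distance from point p to its nearest center."""
    return min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)

def scalable_kmeanspp_init(X, k, ell):
    # (1) pick one initial center uniformly at random
    C = [random.choice(X)]
    # (2) initial cost psi = sum over x of d^2(x, C)
    psi = sum(d2(x, C) for x in X)
    # (3) O(log psi) oversampling rounds with factor ell = Omega(k)
    for _ in range(max(1, int(math.log(psi + 1)))):
        cost = sum(d2(x, C) for x in X)
        if cost == 0:
            break
        # sample each x independently with probability ell * d^2(x, C) / cost
        sampled = [x for x in X if random.random() < ell * d2(x, C) / cost]
        C = C + sampled                        # (4) C <- C' union C
    # (5) weight each candidate center by the points closest to it
    weights = [0] * len(C)
    for x in X:
        nearest = min(range(len(C)), key=lambda i: d2(x, [C[i]]))
        weights[nearest] += 1
    # (6) the weighted candidates would now be reclustered into k clusters,
    # e.g. by running classical k-means++ [14] on them
    return C, weights

random.seed(42)
X = [(random.random(), random.random()) for _ in range(200)]
C, w = scalable_kmeanspp_init(X, k=5, ell=10)
print(len(C), sum(w))  # total weight equals |X| = 200
```

The candidate set C is typically much smaller than X, which is what makes the final reclustering step cheap.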

267 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

Fig. 6. Ear images of different clustering sub-regions.

The fast index matching is applied for each feature in the sub-region of I_t to remove the extremely different features and retain the similar features in the same sub-region. The local similarity S_L is calculated as follows:

    S_L(I_t, I_r) = (1/k) Σ_{i=1}^{k} (max(d(f_ti^x, f_ri^y)) × w_i),  x ∈ (1, ⋯, m_ti), y ∈ (1, ⋯, m_ri)  (6)

where:
  - I_t is the test image
  - I_r is the reference image
  - d(f_ti^x, f_ri^y) is the correlation between each pair of features associated with the i-th sub-regions of I_t and I_r:

    d(f_ti^x, f_ri^y) = (f_ti^x · f_ri^y) / (‖f_ti^x‖ ∙ ‖f_ri^y‖)  (7)

  - w_i is the relative weight for the i-th sub-region,
  - max(d(f_ti^x, f_ri^y)) is the maximal similarity in the i-th sub-region.
The global similarity is computed using the inline-point similarity match(I_t, I_r) and the maximal cosine correlation max(d(f_ti^x, f_ri^y)) as follows:

    S_G(I_t, I_r) = match(I_t, I_r) × max(d(f_ti^x, f_ri^y))  (8)

where match(I_t, I_r) is the number of validly matched features of I_t and I_r computed using the nearest-neighbor algorithm.
Finally, the local and global similarities are integrated to compute the final similarity value as follows:

    S = S_L × S_G  (9)

The final similarity value is used for ear recognition.

Fig. 7. Similarity computation

EXPERIMENTAL RESULTS

The performance and efficiency of the proposed algorithm is tested using the AMI Ear Database. The AMI Ear Database has been created by Esther Gonzalez at the Computer Science department of Universidad de Las Palmas de Gran Canaria (ULPGC). It includes 700 ear images of 100 individuals. For each individual, six right ear images and one left ear image were taken; each sample was captured from a slightly different pose and distance. Five images were of the right ear with the individual facing forward (FRONT), looking right and left (RIGHT, LEFT), and looking down and up (DOWN, UP). The sixth image of the right profile was taken with the subject facing forward but with a different camera focal length (ZOOM). The last image (BACK) was of the left side (left ear), with the subject facing forward. The image resolution is 492 x 702 pixels. Some images from the AMI Ear Database are shown in Figure 8.
The first 30 individuals have been used for training; for the remaining 70 subjects, we used sessions 2 until 6 for training and session 7 for testing. This implies that we have 560 training images and 140 testing images.
The study shows that PCA-based SURF is suitable for ear recognition. In Table 1, the variation of recognition is shown with five different cluster types, which are illustrated in Figure 6.
The results of the proposed method have been evaluated using invariant ear images categorized as follows: Normal image, Rotated image, and Rotated image with a change in Contrast.

Fig. 8. Sample images of 2 different test image sets
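The similarity computation of Eqs. (6)-(9) can be sketched as follows. This is a minimal illustration assuming each sub-region is represented as a list of descriptor vectors and that the sub-region weights w_i default to 1; the nearest-neighbor match count is passed in rather than computed.

```python
import numpy as np

def cosine(a, b):
    # Eq. (7): cosine correlation between two feature descriptors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def local_similarity(test_regions, ref_regions, weights=None):
    # Eq. (6): average over the k sub-regions of the maximal pairwise
    # cosine correlation, weighted by w_i
    k = len(test_regions)
    w = weights or [1.0] * k
    total = 0.0
    for i in range(k):
        best = max(cosine(ft, fr)
                   for ft in test_regions[i] for fr in ref_regions[i])
        total += best * w[i]
    return total / k

def global_similarity(test_regions, ref_regions, n_matches):
    # Eq. (8): number of validly matched features times the maximal
    # cosine correlation over all sub-regions
    best = max(cosine(ft, fr)
               for tr, rr in zip(test_regions, ref_regions)
               for ft in tr for fr in rr)
    return n_matches * best

# Tiny example: one sub-region with identical descriptors -> S close to 1
t = [[np.array([1.0, 0.0, 1.0])]]
r = [[np.array([1.0, 0.0, 1.0])]]
S = local_similarity(t, r) * global_similarity(t, r, n_matches=1)  # Eq. (9)
print(S)  # 1.0 up to floating-point error
```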


Table 1. Recognition Rate with five different cluster types

                        Recognition Rate (%)
Method   Normal   Rotated (degrees)          Rotated and
                  180      90      -90       Contrast
A        99.0     98.8     99.0    99.0      98.8
B        98.5     98.2     98.4    98.4      98.0
C        97.0     96.8     96.8    96.8      96.5
D        98.1     98.1     98.1    98.1      97.8
E        99.5     99.2     99.5    99.5      99.2

6 CONCLUSIONS
In this paper, we proposed PCA-SURF features for an effective ear recognition system as a method for identifying an individual using the ear print. The proposed method consists of 4 stages. It starts by constructing the ear SURF descriptors. The second stage combines the SURF descriptor with the PCA algorithm to extract and construct the ear local descriptors. The PCA encodes a high-dimensional descriptor into a compact low-dimensional space called the PCA-SURF. The third stage is concerned with clustering the extracted ear local descriptors by applying the scalable K-means++ algorithm. Finally, the classification of the ear images is carried out by calculating local and global similarities.
Experimental results show that the performance of the proposed method is quite good and robust to accessory, expression, pose and age variations, rotation, and lighting conditions.
In addition, due to the use of PCA for feature space reduction, scalable K-means++ clustering, and fast indexing in the matching stage, the proposed method has lower computational complexity and a lower computation time of feature matching.

REFERENCES

[1] A. Jain, P. Flynn and A. Ross, Handbook of Biometrics, New York: Springer Science & Business Media, 2007.
[2] A. Bertillon, La Photographie Judiciaire: Avec Un Appendice Sur La Classification Et L'Identification Anthropometriques, Paris: Gauthier-Villars, 1890.
[3] A. Iannarelli, "Ear identification," Forensic Identification Series, 1989.
[4] C. Sforza, G. Grandi, M. Binelli, D. Tommasi, R. Rosati and V. Ferrario, "Age- and sex-related changes in the normal human ear," Forensic Science International, vol. 187, no. 1-3, pp. 110-117, 2009.
[5] L. Meijerman, C. Van Der Lugt and G. Maat, "Cross-Sectional Anthropometric Study of the External Ear," Journal of Forensic Sciences, vol. 52, no. 2, pp. 286-293, 2007.
[6] L. Meijerman, G. Maat and C. Van Der Lugt, "Cross-Sectional Anthropometric Study," Journal of Forensic Sciences, vol. 52, no. 2, pp. 286-293, 2007.
[7] M. Ibrahim, M. Nixon and S. Mahmoodi, "The effect of time on ear biometrics," in The First International Joint Conference on Biometrics, Washington DC, 2011.
[8] J. Kasprzak, "Identification of ear impressions in Polish forensic practice," Problems of Forensic Sciences, vol. 57, pp. 168-174, 2001.
[9] R. Purkait and P. Singh, "A test of individuality of human external ear pattern: Its application in the field of personal identification," Forensic Science International, vol. 178, no. 2-3, pp. 112-118, 2008.
[10] H. Bay, A. Ess, T. Tuytelaars and L. Van Gool, "Speeded-up robust features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346-359, 2008.
[11] I. Jolliffe, Principal Component Analysis, John Wiley & Sons, Ltd., 2002.
[12] B. Bahmani, B. Moseley, A. Vattani, R. Kumar and S. Vassilvitskii, "Scalable K-Means++," in VLDB Endowment, Istanbul, 2012.
[13] L. D. Shinfeng, L. Bo-Feng and L. Jia-Hong, "Combining Speed-up Robust Features with Principal Component Analysis in Face Recognition System," International Journal of Innovative Computing, Information and Control, vol. 8, no. 12, pp. 8545-8556, 2012.
[14] D. Arthur and S. Vassilvitskii, "k-means++: The Advantages of Careful Seeding," in Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, 2007.
[15] B. Arbab-Zavar and M. Nixon, "Robust Log-Gabor Filter for Ear Biometrics," in The 18th International Conference on Pattern Recognition, Florida, 2008.


Simulation of Various QAM Techniques


Used in DVBT2 & Comparison for
Various BER Vs SNR
Sneha Pandya, Research Scholar, C.U.Shah University, Wadhwan.
Nimit Shah, Associate Professor, Electrical Engg, C.U.Shah College of Engg & Technology.
Dr. C.R.Patel, Professor, V.V.P. Engineering College, Rajkot
Abstract- This research paper compares DVBT and DVBT2 and presents a comparative analysis of both techniques. It also covers the simulation, on random data as well as video signals, of DVBT2 using 16- and 64-QAM modulation, comparing the SNR required for a given acceptable error rate. The trade-off between SNR and BER is analyzed, and the choice of a specific QAM technique is considered from various aspects.

I. INTRODUCTION
The paper describes the Digital Video Broadcasting – Terrestrial standard as the one which is replacing the existing analog standards currently used across the globe. The most important part of such standards is the retrieval of a perfect signal at the receiver end, excluding the effects of the channels it goes through, the noise, and timing jitter. In the transmission being carried out, the data – audio-video, picture information, or randomized data – is processed for coded orthogonal frequency division multiplexing (COFDM) before being modulated using a QAM (Quadrature Amplitude Modulation) constellation and mapped into groups of blocks. After formation of the blocks, an IFFT – Inverse Fast Fourier Transform – is carried out with 2048 or 8192 points, which determines the bandwidth requirement and the number of subcarriers. Some of these subcarriers are kept in reserve to be used for the pilot symbols – much needed for efficient reception of the signals – whereas others are used for guard bands as well. [1][2]
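The transmit chain just described (QAM constellation mapping, IFFT, guard subcarriers) can be sketched as follows. This is a simplified illustration assuming a 2048-point IFFT, a 1/4-length cyclic-prefix guard interval, and a toy 16-QAM mapping; it is not the full DVB-T signal path (no channel coding, pilot insertion, or framing).

```python
import numpy as np

def ofdm_symbol(bits, n_fft=2048, guard=1 / 4):
    """Map bits to 16-QAM, run an n_fft-point IFFT, prepend a cyclic prefix."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    b = np.asarray(bits).reshape(-1, 4)
    # 4 bits -> one 16-QAM symbol (2 bits each for I and Q; toy mapping)
    i = levels[b[:, 0] * 2 + b[:, 1]]
    q = levels[b[:, 2] * 2 + b[:, 3]]
    symbols = (i + 1j * q) / np.sqrt(10)        # unit average symbol power
    # place symbols on the first subcarriers; the rest stay empty,
    # standing in for the reserved pilot and guard-band carriers
    grid = np.zeros(n_fft, dtype=complex)
    grid[:len(symbols)] = symbols
    time = np.fft.ifft(grid)                    # to the time domain
    g = int(n_fft * guard)                      # guard interval length
    return np.concatenate([time[-g:], time])    # cyclic prefix + symbol

bits = np.random.randint(0, 2, 4 * 1200)        # 1200 active subcarriers
tx = ofdm_symbol(bits)
print(len(tx))  # 2048 + 512 = 2560 samples
```

Because the guard interval is a cyclic copy of the symbol tail, echoes shorter than the guard length do not cause inter-symbol interference.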
Limitations of DVBT: Though there are many virtues of implementing the DVB-T system, it also has shortcomings which cannot be neglected. The first and an important limitation concerns the bit rates it supports: they are limited and not compatible with existing and rapidly changing wireless standards. For the transmission of HDTV (high-definition television), and for accommodating more broadcast channels, there was a strong need for a new standard. [3]
The second thing it lacked was interaction with the user, which needed to be upgraded.
The third limitation of the DVB-T system is its hugely inferior performance with portability or mobility, which restricted its usage in moving vehicles.
Last but not least is the issue of Single Frequency Networks (SFNs), where repeated signals create interference with their own versions of the signal and damage the quality of reception. [3]


II. EVOLUTION OF DVBT2

A new standard which provides enhanced capacity and the required robustness in the terrestrial scenario is the second-generation standard of DVBT, popularly known as DVB-T2. It was designed to support fixed receivers but was also equipped with the required mobility, and to maintain the spectrum characteristics of its ancestor standard, i.e. DVBT. Figure 1 shows the functional block diagram of a DVB-T2 transmitter. [4] The most important change made is in its strategy for correcting errors, which has been inherited from DVBS2.

Figure 1. DVBT2 Functional Block Schematic

A combination of LDPC (Low Density Parity Check) codes and BCH (Bose-Chaudhuri-Hocquenghem) codes improves the performance by a great amount, giving robustness in receiving the signal efficiently. These FEC (forward error correction) coding techniques are far better at achieving the same purpose than the convolutional codes used in DVBT. As far as the modulation technique is concerned, DVBT2 uses the same OFDM as used in DVBT, but introduces longer symbols with 16K and 32K carriers so that the guard interval can be lengthened without damaging the spectral efficiency. The second generation provides combinations of different numbers of carriers and guard interval lengths, and hence it becomes a very flexible standard that can be used for any of the multiple combinations. [5]

A very important modification offered by DVBT2 is the presence of 8 different scattered pilot patterns, whose choice is made according to the parameters of the current transmission. Because of all these changes and the updated modulation techniques, a new standard has emerged giving the best possible spectral efficiency. In the block diagram, it can be observed that interleaving is carried out in multiple folds – bit interleaver, time interleaver, and then frequency interleaver – to avoid bursts of errors as much as possible and to randomise the error pattern within the LDPC frame. [5]

The Bit Error Rate obtained from the internal decoder is taken into account for all results compared in this article. For a justified comparison between DVBT and DVBT2, quasi-error-free (QEF) targets of BER = 2·10^-4 and BER = 10^-7 must be considered for DVB-T and DVB-T2 after the convolutional and LDPC decoders,


respectively. [6] If these QEF reference values are considered, a gain of 6 dB can be obtained between the two standards for an additive white Gaussian noise (AWGN) channel model, and nearly 4 dB in a Rayleigh channel. [9]

TABLE I
COMPARISON OF DVBT AND DVBT2

                     DVBT                                 DVBT2
FEC                  Reed-Solomon & convolutional codes   LDPC & BCH codes
Modulation modes     QPSK, 16- & 64-QAM                   QPSK, 16-, 64- & 256-QAM
Guard intervals      1/4, 1/8, 1/16, 1/32                 1/4, 19/256, 1/8, 19/128, 1/16, 1/32
FFT size             up to 8K                             up to 32K
Scattered pilots     12%                                  1%
Continual pilots     2.6%                                 0.35%

III. DVBT2 AS APPLIED TO RANDOM DATA - BER Vs SNR


This research paper includes the study of the DVBT2 technique, as modified from DVBT using the LDPC and BCH coding methods, which is verified on a random stream of data. [6]
After achieving the expected reconstruction results, this technique is applied to a video signal and the results of 4-, 16- and 64-QAM are compared.
It can be seen from the given graphs that for 16-QAM, used as the basic modulation technique in DVBT2, the graph shows the value of BER for a given value of SNR, and a gradual improvement (i.e., reduction) in the BER is achieved.
The permissible generalized value of BER, taken as 10^-5, is achieved in 16-QAM by maintaining the SNR at approximately 7.5 dB, as can be seen in the graph in Figure 2. [7]

Figure 2. DVBT2 BER Vs SNR – 16 QAM

The same is applied using 64-QAM as the basic modulation technique in DVBT2; the graph again shows the value of BER for a given value of SNR, and the gradual improvement in the BER can be observed. The resulting graph is shown in Figure 3.
The permissible generalized value of BER, taken as 10^-5, is achieved in 64-QAM by maintaining the SNR at approximately 11.86 dB.


Figure 3. DVBT2 BER Vs SNR – 64 QAM

Figure 4. DVBT2 BER Vs SNR – 16 and 64 QAM

In Figure 4, the graphs of BER vs SNR for both techniques are plotted so that a close comparison can be made and the required technique can be chosen.
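The BER-versus-SNR trade-off compared above can be illustrated with the standard closed-form approximation for Gray-coded square M-QAM over an AWGN channel. Note this is the uncoded case, so the absolute numbers differ from the LDPC/BCH-coded DVB-T2 curves in Figures 2-4; it only illustrates the relative behaviour of 16- and 64-QAM.

```python
import math

def qam_ber(ebno_db, M):
    """Approximate BER of Gray-coded square M-QAM over AWGN (uncoded)."""
    k = math.log2(M)                          # bits per symbol
    ebno = 10 ** (ebno_db / 10)               # Eb/N0 from dB
    arg = math.sqrt(3 * k * ebno / (M - 1))
    q = 0.5 * math.erfc(arg / math.sqrt(2))   # Q-function via erfc
    return (4 / k) * (1 - 1 / math.sqrt(M)) * q

for snr_db in (4, 8, 12, 16):
    print(snr_db, qam_ber(snr_db, 16), qam_ber(snr_db, 64))
```

At any given Eb/N0 the 64-QAM curve sits above the 16-QAM curve, consistent with the roughly 4 dB gap between the 16- and 64-QAM results reported for the 10^-5 target.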

IV. SIMULATION RESULTS:

The proposed research work was carried out to find the best and optimized method of implementing DVBT2 using the most efficient and promising QAM technique, and also the optimized value of SNR for an acceptable value of BER. Just as DVBT2 was applied to a random data stream, it has been applied to a video signal and the various results obtained. [8] This entire work is carried out in order to further obtain the same results on video signals and later to implement them in Digital Video Broadcasting – Handheld (DVBH), which is also becoming popular.

Figure 5 shows, for 4-QAM (very similar in results and performance to QPSK), the good and the best reconstruction of the video for SNR values of 1.80 dB and 1.98 dB respectively.


SNR = 1.80 dB SNR = 1.98 dB

Figure 5. DVBT2 4 QAM as applied to a video

Figure 6 shows, for 16-QAM, the good and the best reconstruction of the video for SNR values of 7.1 dB and 7.90 dB respectively.

SNR = 7.1 dB SNR = 7.90 dB

Figure 6. DVBT2 16 QAM as applied to a video

Figure 7 shows, for 64-QAM, the good and the best reconstruction of the image for SNR values of 11.79 dB, 11.81 dB and 11.86 dB respectively.
Table II shows the comparison of 16-QAM and 64-QAM as applied to randomly generated data bits under DVBT2, giving the achieved BER for gradually increasing values of SNR. From the videos reconstructed with gradually increasing SNRs, it has been observed that the BER becomes permissible after a specific SNR, and this brings into existence the trade-off between BER and SNR. [9]


SNR = 11.69 dB SNR = 11.90 dB

Figure 7. DVBT2 64 QAM as applied to a video

TABLE II
COMPARISON OF DVBT2 USING 16- AND 64-QAM TECHNIQUES

      DVBT2 - 16 QAM             DVBT2 - 64 QAM
      SNR        BER             SNR        BER
1     7.5 dB     10^-5           11.86 dB   10^-5
2     4 dB       0.8 x 10^-2     7.5 dB     0.5 x 10^-1
3     2 dB       10^-1           2 dB       0.5 x 10^-1

V. CONCLUSION

DVB-T2 offers data rates 50 to 90 percent higher than DVB-T for the same level of robustness. The increase results from the following improvements:
• Improved forward error correction
• Rotated constellation diagrams
• Larger SFNs and flexible pilot patterns
This certainly makes it the better choice when introducing DTT or adding HD services to the terrestrial platform. But accurate definition of the key parameters of the DVB-T2 system is more critical in planning DVB-T2 networks than it is for DVB-T.
Another important conclusion is that, comparing 64- and 16-QAM, the same accepted value of BER can be achieved with the higher-order format only by applying a higher SNR.
The benefit of choosing the higher-order formats is that there are more points included within the constellation, so it is possible to transmit more bits per symbol. The shortcoming is that the constellation points are closer together, and so the link is more susceptible to noise. As a result, higher-order versions of QAM are only used when there is a sufficiently high signal-to-noise ratio.
As the order of the QAM technique increases, the number of bits accommodated increases, but this can only be achieved by spending more signal power or by compromising the BER.


ACKNOWLEDGMENT
The research work I have carried out is a collective effort of my contributions and valuable inputs from experts in this area, including my internal supervisor from C.U.Shah University, Dr. Nimit Shah, whose insights are powerful and whose suggestions are very innovative; I thank him wholeheartedly. I also extend my heartfelt thanks to my supervisor Dr. Charmy Patel for guiding my work with her valuable inputs and carving my path for this research work.

REFERENCES
[1] Digital Video Broadcasting (DVB); Frame structure channel coding and modulation for a second generation digital terrestrial
television broadcasting system (DVB-T2), ETSI Std. EN 302 755 V1.1.1, Sep. 2009.
[2] DVB-T2: New Signal Processing Algorithms for a Challenging Digital Video Broadcasting Standard, Mikel Mendicute, Iker Sobrón,
Lorena Martínez and Pello Ochandiano
[3] DVB-T2 Performance Comparison with other Standards DVB – NCA Seminar - 18 – 19 August 2010 Bangkok John Bigeni
[email protected]
[4] DVB-T and DVB-T2 Performance in Fixed Terrestrial TV Channels, Ladislav Polak and Tomas Kratochvil, 978-1-4673-1118-2/12/$31.00 ©2012 IEEE
[5] DVB-T2: The Second Generation of Terrestrial Digital Video Broadcasting System, Iñaki Eizmendi, Manuel Velez, David Gómez-Barquero, Javier Morgade, Vicente Baena-Lecuyer, Mariem Slimani, IEEE Transactions on Broadcasting, vol. 60, no. 2, June 2014
[6] Digital Video Broadcasting (DVB); Frame structure channel coding and modulation for a second generation digital terrestrial
television broadcasting system (DVB-T), ETSI Std. EN 300 744 V1.6.1, Jan. 2009.
[7] DVB Fact Sheet - August 2014
[8] A Comparison of 64-QAM and 16-QAM DVB-T under Long Echo Delay Multipath Conditions by Scott L. Linfoot, Member, IEEE, IEEE
Transactions on Consumer Electronics, Vol. 49, No. 4, NOVEMBER 2003.
[9] Design of a DVB-T2 simulation platform and network optimization with Simulated Annealing, Carlos Enrique Herrero, Carlos Alberto
López Arranz.


Assessing e-Government systems success in Jordan (e-JC): A validation of TAM and IS Success model

Arif Sari1*, Murat Akkaya1, Bashar Abdalla1

1
Department of Management Information Systems, Girne American University,
Kyrenia, Turkish Republic of Northern Cyprus, via Mersin 10, Turkey

*Corresponding author
E-mail: [email protected]


These authors contributed equally to this work.


 

Abstract

Drawing on the information systems success model and the technology acceptance model, this article examines the impact of system quality, information quality, and service quality on perceived usefulness, perceived ease of use, and citizens' attitudes toward the use of the e-JC system. Data analysis involving 398 randomly selected subjects was conducted to test these propositions, and general support was found for all the interactions. Results from structural equation modeling delineate that the information, system, and service quality of e-government influence citizens' perceived ease of use and perceived usefulness, which in turn influence citizens' adoption of and attitudes toward use of the e-government system. The findings endorse the model of interest, and also contribute to the literature by strengthening researchers' theoretical and practical understanding of the effects of information, system, and service quality in developing e-government systems.

Keywords: e-government, Jordan, information quality, system quality, perceived usefulness, perceived ease of use.


 

1. Introduction

Technological innovations aimed at offering citizens improved and equitable access to public services are popularly known as electronic government (e-Government). This innovation has been accepted and embraced by many countries throughout the globe, more specifically western industrialized nations with the infrastructure and educated citizens. Given this, research on e-Government adoption has, to date, focused on developed countries in the Western world, thus providing additional space for exploration in non-western countries like Jordan. According to Muganda-Ochara (2010), e-Government is still a novel concept in many developing countries, as it is rarely used in government centered services. Similarly, the e-government literature consists of themes from developed countries (Hsieh, Huang, & Yen, 2013; Krishnan, Teo, & Lim, 2013); however, e-Government research, specifically in Arab countries, has not received equal attention.

Accordingly, e-government is seen as a computer-mediated activity designed to improve individuals' access to public data and services (United Nations, 2003). It is also the strategic use of IS by regional or national government to attain greater government efficiency, to provide better service quality, and a more democratic participation (Yang, & Rho, 2007). Academicians and practitioners championing a utopian image argue that advances in IS will not only deconstruct hierarchical forms of social and organizational structure, but also decentralize the forms of information flow in a network relationship among citizens (Blanchard, & Horan, 1998; Frissen, 1997; Klein, 1999).


 

Jordan is considered a developing country with more than 60% urban population, and about 76% of households have Internet access as of 2015 (Mohammad, 2015). Very little about e-Government in Jordan has been investigated or published. One study delineated that e-Government applications in Jordan lack the standard features required for such applications, and that the system failed to take account of citizens' needs and expectations (AL-Soud, & Nakata, 2010). Given this, the authors formulated the following research questions:
• Do information quality, system quality, and service quality enhance perceived usefulness and adoption intents?
• Do information quality, system quality, and service quality enhance perceived ease of use and adoption intents?

1.1. Importance of the study

The existing e-Government portals in Jordan are standalone; in other words, each ministry has its own portal, and as such end users (citizens) must register and create a username and a password. Given the number of ministries in Jordan, for citizens and businesses to benefit from and use e-Government services, they must remember each ministry's username and password. This has generated redundant information; this information overload and the increased complexity on the citizen's side often discourage them from using such services, which has hindered the growth of e-government applications and increased pressure on public servants and resources. Drawing on the information system success model (the DeLone & McLean model) and the Technology Acceptance Model (TAM), this study presents an empirically validated model for measuring the success and acceptance of e-government systems from the citizens' perspective. The low adoption and use of e-government services by end users remain major barriers to successful e-government implementation. The outcome of this study may address this barrier in Jordan.


 

2. Theoretical framework

E-government as a branch of scholarship has become a popular motto in public administration, altering the way things are done in key areas like government-to-public interaction and public service delivery. E-government is not limited to the above but also touches important aspects of public administration like transparency and accountability (Yildiz, 2007). E-government applications can help a country better deliver government services to its citizens, improve communication and understanding with enterprises, and empower nationals through information sharing and accessibility (World Bank, 2010). Contemporary scholars have argued that e-government can deliver efficient, effective and transparent services to the public, a notion supported by substantial empirical research (e.g., Affisco & Soliman, 2006; Reddick & Roy, 2013; Weerakkody et al., 2011).

Originally introduced by Davis in 1989, the TAM builds on the social psychology theory of reasoned action and aims to model user acceptance of IS applications. It is one of the most used frameworks to measure individuals' willingness to adopt and use a particular technology. The model has two famous and extensively used constructs, namely "perceived usefulness", popularly abbreviated as PU, and "perceived ease of use", popularly abbreviated as PEOU. PU is "the degree to which an end-user considers and believes that the use of an IS application will enhance task performance". In the context of e-Government applications, the concept is assumed to have influence on the usage of an e-Government portal (Wirtz et al., 2015). Whereas PEOU is "the degree to which an end-user believes a system will be free of effort to a larger or lesser extent", as noted by (Venkatesh et al., 2003; Shen & Chiou, 2010). In the context of e-


 

Government applications, the concept has been used successfully in some e-Government studies (Alanezi, Kamil, & Basri, 2010; Wirtz et al., 2015).

According to Horst et al. (2007), PU is seen as the most important motivator for a user's willingness to adopt, employ and use a technology, irrespective of education, location or culture. These attributes of TAM are deemed the key factors shaping end-users' behavioral intention to use a system. The TAM framework has been utilized in much research and in many disciplines associated with technological innovation and development. For instance, Kwon (2000) adopted the model in a study to evaluate technology adoption in the cellular telephone industry, and Liang et al. (2003) applied it to personal digital assistant usage. Pavlou (2003) adopted the model in an electronic commerce application to evaluate the factors and determinants of adoption.

Additionally, the framework was used in online shopping sites to evaluate online consumer behavior (Koufaris, 2002), and for the World Wide Web (Lederer et al., 2000). Recent studies have also employed the model to measure the willingness of citizens to use an e-Government application (e.g., Alghamdi, & Beloff, 2015; AL-Athmay, Fantazy, & Kumar). The majority of the studies assumed that the PU and PEOU of the TAM can adequately capture the overall perceived value of using an e-Government application or system. Scholars like Bagozzi et al. (2007) were among the pioneers of the TAM model; however, the authors acknowledged certain limitations of the model. This study is motivated to use it as a theoretical lens because it presents the genesis of all the models for measuring technology adoption and use.

DeLone and McLean (2004) came up with the IS success model, which consists of the following variables: system, information and service quality, perceived usefulness, user satisfaction, and

 

system usage and net benefit. The main aim of the model is to test system usage (Rai et al., 2002). As mentioned earlier, system adoption and usage continue to be considered IS success variables in most studies and are widely used by IS researchers (McGill et al., 2003). Integrating a new framework (i.e., the IS success model) opens possibilities for discovering other, as yet unknown, factors, thereby providing opportunities for extending the TAM based on context.

System quality (SQ) - system quality is defined as "the degree to which the system is easy to use to accomplish tasks" (Schaupp, Boudreau, & Gefen, 2006). The construct considers performance characteristics, functionality, and usability, among others (McKinney, Kanghyun, & Zahedi, 2002). Accordingly, in an e-Government context, we adopted the following definition of system quality: "the ability of an e-government system to provide its citizens with accurate, reliable, relevant, and easy to understand information". It represents the performance of the system in terms of ease of use, user-friendliness, and usability (Wang & Liao, 2008). Previous research in other technological contexts suggests that the construct can influence PU, PEOU, and citizens' attitude toward use of a system.

Information quality (IQ) - information quality refers to "the degree to which the quality of the information that the portal provides and its usefulness for the user enable them to accomplish the stated goal of the system". Information quality is considered one of the most important success factors when investigating the overall IS success of any given system (McKinney et al., 2002). In the context of e-Government, it is defined as "the ability of the e-government system to provide its citizens with new, accurate, clear, and easy-to-understand information" (Urbach et al., 2010). It is


 

also referred to as the quality of the e-Government system's output and is measured by different semantic attributes (Wang & Liao, 2008). Previous research in other technological contexts suggests that the construct can influence PU, PEOU, and citizens' attitude toward use of a system.

Service quality (SRQ) - service quality encompasses measures of the overall service performance and assistance related to an e-Government system, ranging from responsiveness, reliability, and empathy to the competency of service personnel (Chang & King, 2005; Pitt, Watson, & Kavan, 1995). Petter, DeLone, and McLean (2013) recently noted that the SRQ variable in an IS model "captures the general quality of an e-Government system from the perspective of readiness of personnel to provide proper service, safety of transactions when using the e-government system, availability of the system to users, individual attention of IS personnel, and providing specific needs for users". Previous research in other technological contexts suggests that the construct can influence PU, PEOU, and citizens' attitude toward use of a system. Based on the extant literature, the following hypotheses were developed and are presented diagrammatically in Figure 1.

H1a: Information quality will have a positive effect on citizens' perceived usefulness of the e-JC system.

H1b: Information quality will have a positive effect on citizens' perceived ease of use of the e-JC system.

H2a: System quality will have a positive effect on the perceived usefulness of the e-JC system.

H2b: System quality will have a positive effect on the perceived ease of use of the e-JC system.

H3a: Service quality will have a positive effect on the perceived usefulness of the e-JC system.


 

H3b: Service quality will have a positive effect on the perceived ease of use of the e-JC system.

H4: Perceived usefulness of the e-JC system will have a positive effect on citizens' adoption of/attitudes toward use of the e-JC system.

H5: Perceived ease of use of the e-JC system will have a positive effect on citizens' adoption of/attitudes toward use of the e-JC system.

[Figure 1 depicts the proposed model: Information quality (IQ), System quality (SQ), and Service quality (SEQ) feed into Perceived usefulness (PU) and Perceived ease of use (PE), which in turn feed into Attitude toward use of e-government.]


 

Fig 1: Proposed research model to test the system

3. Research Method

This study employed a survey method to test and analyze the research model presented in Figure 1. The participants were Jordanian citizens who had used the e-government system; they were randomly selected and asked to participate voluntarily. A survey was developed in English using validated instruments from previous relevant studies. Prior to administering the survey, back-translation was conducted, since most Jordanians speak and understand Arabic. Following Brislin's (1970) recommendations, this method was used to translate the scale items from English to Arabic and back again. Next, a Q-sort pilot study was conducted to remove ambiguities, prevent misinterpretation of the measurement items, and refine the study questionnaire.

The Q-sort method is "an iterative process in which the degree of agreement between judges forms the basis of assessing construct validity and improving the reliability of the constructs" (Nahm, Rao, Solis-Galvan, & Ragu-Nathan, 2002, p. 114). The outcome of the pilot study was satisfactory, and the main survey was subsequently administered. The final survey instrument consisted of 25 items (Table 1). Although the items were adapted from previous research, the author modified them to fit the context of the current study. A five-point Likert scale was employed, with anchors ranging from 1 = strongly disagree to 5 = strongly agree.


4. Results and Analysis

Five hundred questionnaires were distributed, of which four hundred and thirteen were returned, yielding a response rate of 82.6%. Of the 413 returned questionnaires, 15 had missing data and were eliminated from the study. This response rate is comparable with those of other e-government studies. The final sample comprised 398 Jordanian citizens: 228 (57.3%) males and 170 (42.7%) females, 344 Education (50%); 116 (29.1%) were single and the rest married. The demographic data also show that 55 (13.8%) of the participants held a high school diploma, 94 (23.6%) had some college education, 178 (44.6%) held bachelor's degrees, and the rest held higher degrees. The participants' ages ranged between 18 and 35 years, with a mean of 25 (SD = 2.42) years.

Before testing the structural model, I tested the measurement model and assessed the relationships between the observed variables and their underlying constructs, which were allowed to inter-correlate freely. More specifically, to address dimensionality and convergent and discriminant validity, all measures were subjected to confirmatory factor analysis (CFA) with IBM SPSS AMOS v21. First, a single-factor test was conducted to assess the likelihood of common method bias (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003); the result yielded a poor fit, suggesting that the dataset is not affected by common method variance (CMV).
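The single-factor test above was run as a CFA in AMOS, whose data and internals are not reproducible here. As a rough illustration of the underlying logic, the sketch below runs a Harman-style single-factor check (a simpler cousin of the CFA-based test) on synthetic stand-in data, since the real 398-respondent item matrix is not public:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the survey responses: three blocks of five items,
# each block sharing its own latent factor (so no single common factor).
n = 398
blocks = [rng.normal(size=(n, 1)) + 0.3 * rng.normal(size=(n, 5)) for _ in range(3)]
X = np.hstack(blocks)

# Harman's single-factor heuristic: share of total variance captured by the
# first principal component of the item correlation matrix.
corr = np.corrcoef(X, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]  # eigenvalues, largest first
first_share = eigvals[0] / eigvals.sum()

# A first factor well under ~50% of the variance argues against severe CMV.
print(first_share < 0.5)
```

This heuristic is weaker than the CFA test reported in the text and is shown only to illustrate what a single-factor check probes.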

Next, the author conducted a six-factor model test; the initial CFA results provided poor model fit statistics. Two items from adoption/attitude toward use and two from perceived ease of use were then eliminated due to standardized loadings below 0.50 and/or cross-loadings, as recommended


by Hair et al. (1998). The results [χ² = 501.36; d.f. = 194; p = .000; Goodness of Fit Index (GFI) = .89; Normed Fit Index (NFI) = .93; Comparative Fit Index (CFI) = .95; Tucker-Lewis Index (TLI) = .95; Root Mean Square Error of Approximation (RMSEA) = .063; relative χ² = 2.58] show that the model conforms to the criteria suggested by Ullman (2006), Byrne (1994), Browne and Cudeck (1993), and Tucker and Lewis (1973) (see Figure 2).
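As a quick arithmetic check, two of the reported fit indices can be recomputed directly from χ² = 501.36, d.f. = 194, and the final sample of 398 (standard SEM formulas; the exact AMOS internals are assumed):

```python
import math

chi_sq, df, n = 501.36, 194, 398  # reported CFA values and sample size

# Relative chi-square (chi^2 / d.f.); values below 3 are commonly acceptable.
relative_chi_sq = chi_sq / df

# RMSEA = sqrt(max(chi^2 - d.f., 0) / (d.f. * (N - 1))).
rmsea = math.sqrt(max(chi_sq - df, 0.0) / (df * (n - 1)))

print(round(relative_chi_sq, 2))  # 2.58, matching the value reported above
print(round(rmsea, 3))            # 0.063, matching the value reported above
```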

The retained item loadings exceeded .50; Cronbach's alphas were all above the benchmark of .60; CR and AVE values were also above the benchmark of .50 (Hair et al., 1998). Discriminant validity is established when the estimated correlations between the variables are below 0.85 (Abubakar & Ilkan, 2016). These results confirm convergent and discriminant validity among our measures (see Table 1). First, bivariate correlations were computed among the variables. Second, structural equation modeling (SEM) was used to evaluate the proposed and alternative models via path analysis in AMOS. Means, standard deviations, and correlations of the study variables are presented in Table 2.
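The CR and AVE figures in Table 2 can be reproduced (up to rounding of the published loadings) from the standardized loadings in Table 1. A minimal sketch, using the three system-quality items as an example:

```python
def composite_reliability(loadings):
    """CR = (sum l)^2 / ((sum l)^2 + sum(1 - l^2)), standardized loadings."""
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + error)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

sq = [0.86, 0.88, 0.73]  # system-quality loadings from Table 1
print(round(average_variance_extracted(sq), 2))  # 0.68, as in Table 2
print(round(composite_reliability(sq), 2))       # ~0.86; Table 2 reports .87
```

The small CR discrepancy presumably reflects the two-decimal rounding of the published loadings.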


Table 1. Psychometrics properties of the measures

Scale items loading t-value

Information quality (IQ)

The e-Government system provides the precise information you need. .86 16.82

The e-Government system provides sufficient information. .73 17.86

The e-Government system provides up-to-date information. .79 18.65

The e-Government system provides reliable and useful information. .78 -

System quality (SQ)

The e-Government system is user friendly. .86 16.65

The e-Government system is easy to use. .88 17.12

The e-Government system is usable .73 -

Service quality (SEQ)

When you have a problem, the e-Government system service shows a sincere interest in solving it. .78 16.92

You feel safe in your transactions with the e-Government system service. .79 17.15

The e-Government system service gives you individual attention. .82 -

Perceived usefulness

The e-Government system enables me to accomplish tasks .77 -


The e-Government system improves public service performance .86 18.65

The e-Government system increases public service productivity .89 19.58

The e-Government system enhances public service effectiveness .88 19.31

The e-Government system makes it easier to deliver public services .86 18.59

The e-Government system is useful for public service activities .78 16.66

Perceived ease of use

Learning the e-Government system is easy for me -* -*

Easy to get e-Government system to do what I want to do .72 14.86

The e-Government function is clear and understandable .81 17.24

e-Government system is flexible to interact with .90 19.51

Easy to become skillful at using e-Government system. -* -*

e-Government system is easy to use .78 -

Adoption/Attitude toward use

I am dependent on the e-Government system. .91 -

The frequency I use the e-Government system is high. .82 18.36

The tendency that I will use the e-Government system is high. -* -*

It is very likely that I will use the e-Government system in the near future -* -*

Note: -* item eliminated during the CFA due to low loading and/or cross-loading.



Fig 2: CFA diagram

Table 2 presents the means, standard deviations, and bivariate correlations among the observed variables. As seen in the table, information quality was positively correlated with perceived usefulness (r = .707, p < .01) and perceived ease of use (r = .627, p < .01). Next, the data show that system quality was positively correlated with perceived usefulness (r = .636, p < .01) and perceived ease of use (r = .674, p < .01). In addition, the correlation analyses show that service quality was positively correlated with perceived usefulness (r = .689, p < .01) and perceived ease of use (r = .621, p < .01). Finally, the relationship between perceived usefulness and citizens' adoption of/attitudes toward use of the e-government system was positive and significant (r = .642, p < .01). Similarly, a significant positive correlation was found between perceived ease of use and citizens' adoption of/attitudes toward use of the e-government system (r = .589, p < .01). These strong associations between the research measures provide preliminary support for all the hypothesized relationships.
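The p < .01 claims can be sanity-checked with the usual t-statistic for a Pearson correlation; a hypothetical illustration for the strongest IQ-PU correlation from Table 2:

```python
import math

def correlation_t(r, n):
    """t-statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# IQ-PU correlation from Table 2, with the final sample of 398 citizens.
t = correlation_t(0.707, 398)
print(round(t, 1))  # ~19.9, far above the ~2.59 two-tailed cutoff for p < .01
```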

Table 2. Means, standard deviations, and correlations of study variables

Variables 1 2 3 4 5 6

1. Information quality (IQ) -

2. System quality (SQ) .707** -

3. Service quality (SEQ) .700** .642** -

4. Perceived usefulness .707** .636** .689** -

5. Perceived ease of use .627** .674** .621** .647** -


6. Attitude toward use .585** .525** .621** .642** .589** -

Alpha (α) .89 .86 .84 .93 .88 .86

Composite reliability .89 .87 .84 .94 .88 .86

Average variance extracted .67 .68 .64 .71 .65 .75

Mean 2.98 3.20 2.79 3.09 3.28 2.49

Standard deviation .92 .89 .90 .87 .83 .98


Note: Composite scores for each variable were computed by averaging the respective item scores. SD, standard deviation; CR, composite reliability; α, Cronbach's alpha; AVE, average variance extracted. **Correlations are significant at the .01 level.

Fig 3: Path analysis


Table 3. Path analysis coefficients for the research model

Regressor variable Regressand variable Coefficient estimate t-statistic p

Information quality (IQ) Perceived usefulness 0.336 10.981 ***

Information quality (IQ) Perceived ease of use 0.162 5.203 ***

System quality (SQ) Perceived usefulness 0.172 5.421 ***

System quality (SQ) Perceived ease of use 0.366 11.317 ***

Service quality (SEQ) Perceived usefulness 0.321 10.211 ***

Service quality (SEQ) Perceived ease of use 0.227 7.092 ***

Perceived usefulness Adoption/Attitude 0.504 9.541 ***

Perceived ease of use Adoption/Attitude 0.354 6.575 ***

Notes: *Significant at the p < 0.05 level (two-tailed); **significant at the p < 0.01 level (two-tailed); ***significant at the p < 0.001 level (two-tailed).
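Table 3 reports estimates and t-statistics but not standard errors. Since t = estimate / SE for these paths, approximate SEs can be recovered; this is a back-of-the-envelope reconstruction, not output from the AMOS run, and the path labels are shorthand introduced here:

```python
# (estimate, t-statistic) pairs from Table 3; labels abbreviate the
# regressor -> regressand pairs (ATT = adoption/attitude toward use).
paths = {
    "IQ -> PU":    (0.336, 10.981),
    "IQ -> PEOU":  (0.162, 5.203),
    "SQ -> PU":    (0.172, 5.421),
    "SQ -> PEOU":  (0.366, 11.317),
    "SEQ -> PU":   (0.321, 10.211),
    "SEQ -> PEOU": (0.227, 7.092),
    "PU -> ATT":   (0.504, 9.541),
    "PEOU -> ATT": (0.354, 6.575),
}

# Recover SE = estimate / t for each path.
standard_errors = {p: est / t for p, (est, t) in paths.items()}
for path, se in standard_errors.items():
    print(f"{path}: SE ~ {se:.3f}")
```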

In order to test the hypotheses formally, a structural equation analysis of the relationships was conducted; the resulting path estimates are shown in Figure 3. Hypotheses 1a and 1b state that information quality has positive effects on perceived usefulness (β = .336, p < .01) and perceived ease of use (β = .162, p < .01). In agreement with findings in the literature, H1a and H1b gained support.


Hypotheses 2a and 2b state that system quality has positive effects on perceived usefulness (β = .172, p < .01) and perceived ease of use (β = .366, p < .01). In agreement with findings in the literature, H2a and H2b gained support.

Hypotheses 3a and 3b state that service quality has positive effects on perceived usefulness (β = .321, p < .01) and perceived ease of use (β = .227, p < .01). In agreement with findings in the literature, H3a and H3b gained support. Hypothesis 4 states that perceived usefulness has a positive effect on adoption/attitude toward use (β = .504, p < .01); H4 gained support. Hypothesis 5 states that perceived ease of use has a positive effect on adoption/attitude toward use (β = .354, p < .01); H5 gained support.

5. Conclusions and Recommendations


From a theoretical point of view, the results of this study highlight two critical aspects of e-Government systems in Jordan. This article proposed and developed a model for the Jordanian government in which the e-government portal integrates basic user-centric concepts from technology acceptance and online service quality research, providing a more nuanced understanding of the TAM and the IS success model. More generally, the integration of portal-related measures such as perceived ease of use, perceived usefulness, information quality, system quality, and service quality allows for a more sophisticated modeling approach. This combination also enriched the tested factors and provides a broader explanation of the interaction and interplay among the observed constructs.


The empirical evidence from this study enriches research on the factors associated with citizens' adoption of/attitudes toward use of the e-government system. Concurrent with previous research in other industries and with e-government studies in other countries that utilized the TAM and IS success model, this article found that information, system, and service quality exert a pronounced impact on the perceived usefulness and perceived ease of use of the e-government system. Additionally, there is support for previous evidence that perceived usefulness and perceived ease of use are important antecedents of technology adoption/attitudes toward use. Strategies aimed at enhancing system, information, and service quality may not be enough for the adoption of a new system, though: the results strongly suggest that usefulness and ease-of-use perceptions must also be addressed. Intuitively speaking, nation-wide interventions that promote awareness of, and a culture of openness to, new technology are critical development tools for the Jordanian government.

The author would like to point out certain limitations of the paper. First, the results possibly suffer from single-method bias due to the cross-sectional design, which prevents proper assessment of intertemporal variations; future studies should adopt a longitudinal design to validate the current findings. Second, data were collected in Amman, so the interactions between the variables might not apply to the whole of Jordan, and the outcomes have limited generalizability to other countries and cultural contexts. Lastly, it is our view that a quasi-experimental study would offer an immense contribution.


References

Affisco, J. F., & Soliman, K. S. (2006). E-government: A strategic operations management

framework for service delivery. Business Process Management Journal, 12(1), 13– 21.

Alanezi, M. A., Kamil, A., & Basri, S. (2010). A proposed instrument dimension for measuring

e-government service quality. International Journal of u-and e-Service, 3(4), 1–18.

Alghamdi, S., & Beloff, N. (2015). Exploring determinants of adoption and higher utilization for e-Government: A study from business sector perspective in Saudi Arabia. Proceedings of the 2015 Federated Conference on Computer Science and Information Systems (FedCSIS), IEEE, 1469-1479.

Al-Soud, A.R., Al-Yaseen, H., & Al-Jaghoub, S.H. (2014). Jordan's e-Government at the crossroads. Transforming Government: People, Process and Policy, 8(4), 597-619.

AL-Athmay, A.A.A., Fantazy, K., & Kumar, V. (2016). E-government adoption and user's satisfaction: An empirical investigation. EuroMed Journal of Business, 11(1), 57-83.

Blanchard, A., & Horan, T. (1998). Virtual communities and social capital. Social Science

Computer Review, 16, 293–307

Abubakar, A.M., & Ilkan, M. (2016). Impact of online WOM on destination trust and intention to travel: A medical tourism perspective. Journal of Destination Marketing & Management, http://dx.doi.org/10.1016/j.jdmm.2015.12.005

Brislin, R.W. (1970). Back-translation for cross-cultural research. Journal of Cross-Cultural Psychology, 1, 185-216.


Brown, D. (1999). Information systems for improved performance management: Development

approaches in US public agencies. In Heeks, R., (ed), reinventing government in the information

age; New York: Routledge.

Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In: K. A. Bollen

& J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Beverly Hills, CA: Sage.

Byrne, B. M. (1994). Structural equation modeling with EQS and EQS/Windows. Thousand

Oaks, CA: Sage Publications.

Chang, J.C.J., & King, W.R. (2005). Measuring the performance of information systems: A

functional scorecard. Journal of Management Information Systems, 22(1), 85–11.

Chang, I.C., Li, Y.C., Hung, W.F., & Hwang, H.G. (2005). An empirical study on the impact of

quality antecedents on tax payers’ acceptance of Internet tax-filing systems. Government

Information Quarterly, 22(3), 389–410

Chen, S. (2011). Understanding the Effects of Technology Readiness, Satisfaction and Electronic

Word-Of-Mouth on Loyalty in 3c Products. Australian Journal of Business and Management

Research, 1(3)

Chen, L., Soliman, K.S., Mao, E., & Frolick, M.N. (2000). Measuring user satisfaction with data

warehouses: an exploratory study. Information & Management, 37(3), 103-110.

Chiu, C.M., Chiu, C.S., & Chang, H.C. (2007). Examining the integrated influence of fairness

and quality on learners’ satisfaction and Web-based learning continuance intention. Information

Systems Journal, 17, 271–287.


Davis, F.D. (1989). Perceived usefulness, ease of use, and user acceptance of information

technology. MIS Quarterly, 13(3), 319-339

Davis, F., Bagozzi, R., & Warshaw, P. (1989). User acceptance of computer technology: a

comparison of two theoretical models. Management Science, 35(8), 982-1003.

Delone, W.H., & McLean, E.R. (2003). The DeLone and McLean Model of Information Systems

Success: A Ten-Year Update, Journal of Management Information Systems, 19(4), 9-30.

Frissen, P. (1997). The virtual state: Post modernization, informatisation, and public

administration. In Loader, B. D., (ed.), the governance of cyberspace. London: Routledge. 111–

125.

Hair, J.F. Jr., Anderson, R.E., Tatham, R.L., & Black, W.C. (1998). Multivariate Data Analysis (5th ed.). Upper Saddle River, NJ: Prentice Hall.

Hassanzadeh, A., Kanaani, F., & Elahi, S. (2012). A model for measuring e-learning systems success in universities. Expert Systems with Applications, 39(12), 10959-10966. http://dx.doi.org/10.1016/j.eswa.2012.03.028

Horst, M., Kuttschreuter, M., & Gutteling, J. (2007). Perceived usefulness, personal experiences,

risk perception and trust as determinants of adoption of eGovernment services in The

Netherlands. Computers in Human Behavior, 23, 1838-1852.


Hu, P. J. H., Brown, S. A., Thong, J. Y., Chan, F. K., & Tam, K. Y. (2009). Determinants of

service quality and continuance intention of online services: The case of eTax. Journal of the

American Society for Information Science and Technology, 60(2), 292–306.

Hsieh, P.H., Huang, C.S., & Yen, D.C. (2013). Assessing web services of emerging economies in an eastern country - Taiwan's e-government. Government Information Quarterly, 30(3), 267-276.

Kim, K., & Prabhakar, B. (2000). Initial trust, perceived risk, and the adoption of internet

banking. Paper presented at the 21st International Conference on Information Systems, Brisbane,

Australia.

King, W. R., & He, J. (2006). A meta-analysis of the technology acceptance model. Information

& Management, 43(6), 740–755.

Klein, H. (1999). Tocqueville in cyberspace: Using the Internet for citizen associations. The

Information Society, 15, 213–220.

Koufaris, M. (2002). Applying the technology acceptance model and flow theory to online

consumer behavior. Information Systems Research, 13(2), 205-223

Krishnan, S., Teo, T.S.H., & Lim, V.K.G. (2013). Examining the relationships among e-

government maturity, corruption, economic prosperity and environmental degradation: A cross-

country analysis. Information and Management, 50(8), 638-649.


Kwon, H. (2000). A test of the technology acceptance model: the case of cellular telephone

adoption. Proceedings of the 33rd Annual Hawaii International Conference on System Sciences,

Big Island, HI, USA.

Lean, O.K., Zailani, S., Ramayah, T., & Fernando, Y. (2009). Factors influencing intention to use

e-government services among citizens in Malaysia. International Journal of Information

Management, 29, 458-475.

Liang, H., Xue, Y., & Byrd, T.A. (2003). PDA usage in healthcare professionals: testing an

extended technology acceptance model. International Journal of Mobile Communications, 1(4),

372-389.

McGill, T., Hobbs, V., & Klobas, J. (2003). User-developed applications and information systems success: A test of DeLone and McLean's model. Information Resources Management Journal, 16, 24-45.

McKinney, V., Kanghyun, Y., & Zahedi, F.M. (2002). The measurement of web customer

satisfaction: an expectation and disconfirmation approach. Information Systems Research 13 (3),

296–315

Mohammad, G. (2015). Internet penetration rises to 76 per cent in Q1. http://www.jordantimes.com/news/local/internet-penetration-rises-76-cent-q1

Muganda-Ochara, N. (2010). Assessing irreversibility of an e-government project in Kenya: Implication for governance. Government Information Quarterly, 27, 89-97.


Nahm, A. Y., Rao, S. S., Solis-Galvan, L. E., & Ragu-Nathan, T. (2002). The Q-sort method:

Assessing reliability and construct validity of questionnaire items at a pre-testing stage. Journal

of Modern Applied Statistical Methods, 1(1), 114–125.

Ozkan, S., & Koseler, R. (2009). Multi-dimensional students' evaluation of e-learning systems in the higher education context: An empirical investigation. Computers & Education, 53(4), 1285-1296.

Parent, M., Vandebeek, C., & Gemino, A. (2004). Building citizen trust through e-government.

Proceedings of the 37th Hawaii International Conference on System Sciences – 2004, Big Island,

Hawaii, pp. 1-9.

Petter, S., DeLone, W., & McLean, E.R. (2013). Information Systems Success: The Quest for the

Independent Variables, Journal of Management Information Systems, 29(4), 7-62

Pitt, L.F., Watson, R.T., & Kavan, C.B. (1995). Service quality: a measure of information

systems effectiveness. MIS Quarterly 19 (2), 173–187

Podsakoff, P. M., MacKenzie, S. B., Lee, J.Y., & Podsakoff, N. P. (2003). Common method

biases in behavioral research: A critical review of the literature and recommended remedies.

Journal of Applied Psychology, 88(5), 879–903.

Qutaishat, F.H. (2013). Users' perceptions towards website quality and its effect on intention to use e-government services in Jordan. International Business Research, 6(1).

Rai, A., Lang, S.S., & Welker, R.B. (2002). Assessing the validity of IS success models: An

empirical test and theoretical analysis. Information Systems Research, 13(1), 50-69


Reddick, C. G., & Roy, J. (2013). Business perceptions and satisfaction with e-government:

Findings from a Canadian survey. Government Information Quarterly, 30(1), 1–9.

Shen, C.C., & Chiou, J.-S. (2010). The impact of perceived ease of use on internet service adoption: The moderating effects of temporal distance and perceived risk. Computers in Human Behavior, 26(1), 42-50.

Tucker, L., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38(1), 1-10.

Ullman, J. B. (2006). Structural equation modeling. In B. G. Tabachnick & L. S. Fidell (Eds.),

Using multivariate statistics, (5th ed.; pp. 653–771). Boston: Allyn & Bacon

United Nations (UN). (2003). World public sector report 2003: E-government at the crossroads.

New York: United Nations.

Urbach, N., Smolnik, S., & Riempp, G. (2010). An empirical investigation of employee portal success. The Journal of Strategic Information Systems, 19(3), 184-206.

Venkatesh, V., Moris, M.G., & Davis, G.B. (2003). User acceptance of information technology:

toward a unified view. MIS Quarterly, 27(3), 425-478.

Wang, Y., & Liao, Y. (2008). Assessing e-Government systems success: A validation of the DeLone and McLean model of information systems success. Government Information Quarterly, 25(4), 717-733.

Weerakkody, V., Janssen, M., & Dwivedi, Y. K. (2011). Transformational change and business

process reengineering (BPR): Lessons from the British and Dutch public sector. Government

Information Quarterly, 28(3), 320– 328.


Wirtz, B.W., Piehler, R., & Daiser, P. (2015). E-Government Portal Characteristics and

Individual Appeal: An Examination of E-Government and Citizen Acceptance in the Context of

Local Administration Portals. Journal of Nonprofit & Public Sector Marketing, 27(1), 70-98.

World Bank (2016). e-Government research and resources. http://www.worldbank.org/en/topic/ict/brief/e-gov-resources#egov

Yang, K., & Rho, S. (2007). E-government for better performance: Promises, realities, and challenges. International Journal of Public Administration, 30(11), 1197-1217.

Yildiz, M. (2007). E-government research: Reviewing the literature, limitations, and ways

forward. Government Information Quarterly, 646-665, ISSN 0740-624X.


Mining Student Data Using CRISP-DM Model

Layth Almahadeen, Murat Akkaya, Arif Sari


Department of Management Information Systems
Girne American University

Abstract: The educational system faces several challenges; one of them is identifying
the factors which have an effect on the students’ performance. This paper aims to
apply the Cross-Industry Standard Process for Data Mining (CRISP-DM) on the
student database from the Education directorate in the Al Karak region, in order to
identify the main attributes that may influence the performance of the student. The
C4.5 Decision Tree has been used to build a classification model that has the ability to
predict the final grade in a computer course.

Key Words: CRISP-DM Model, Data Mining, Classification, C4.5 Decision Trees,
Student Data, Education.

1. Introduction

Teachers and other officials at the Directorates of the Ministry of Education do not
have time to look through large databases to extract the information needed for
decision-making, so they try to obtain summaries of the data, basic rules, or new
patterns that make the decision-making process easier, faster and more accurate
(Mierle et al., 2005).

Data mining offers exactly that: it can be defined as the extraction of knowledge
from large databases. Recently, there has been a growing demand for data mining
techniques in various fields, such as education, the telecommunication industry and
banking, to enhance performance (Han et al., 2011).

The adoption of effective business applications, such as business intelligence tools, is
essential for educational institutions to improve their work environment and
performance. Several researchers have presented business intelligence tools such as
data mining as means of improving the performance of educational institutions.


For example, Calvo-Flores (2006) tried to identify the students who had not passed
courses at Cordoba University; an Artificial Neural Network (ANN) model was
used to predict the students’ marks, with the data obtained from Moodle logs. The
database contained 240 students from Cordoba University who had an account on
the Moodle system and were enrolled in the programming course. However, the lack
of student patterns forced the authors to add random noise to create new student
patterns.

Kotsiantis (2004) tried to take advantage of the student information recorded on
assignments to predict students' performance at the Hellenic Open University, which
uses distance learning methods. Six machine learning techniques were applied in two
experimental works, and in the end the authors found that the Naive Bayes technique
was the best.

Al-Radaideh (2006) used the CRISP-DM methodology to investigate student
attributes in order to identify those which have an effect on the performance of
students in courses; the database was obtained from undergraduate students who
studied the C++ course in the Faculty of Information Technology and Computer
Science at Yarmouk University. The authors used classification to achieve their
goals: a decision tree was applied and WEKA software was used to build the
classification model, which in turn was used to predict the performance of students.
The holdout method and the K-fold Cross-Validation (K-CV) method were used to
measure the classification accuracy, which was found to be low; the authors
explained that the collected attributes were not enough to build a good classification
model.

The main goal of this study is to identify the factors affecting the performance of
students, particularly students' grades in the computer course, by building a
classification model to predict those grades.

2. The Classification Model

To build the classification model, the CRISP-DM methodology was adopted. This
methodology has five phases: collecting the data of students, selecting the relevant

attributes, building the classification model, using the classification model to predict
the student performance and evaluating the classification model (Chapman, 2000).

2.1 Collecting the Data of Students

In this phase, the students' data was collected using questionnaires distributed to the
students; the students' ages ranged from thirteen to sixteen (seventh grade to tenth
grade). The data was collected from five schools, taking into account the geographic
distribution of these schools to cover different areas within the Al Karak city.

All the student attributes were available in the questionnaire. About 600
questionnaires were distributed to the students, and after verifying their validity,
about 85 were manually eliminated due to incomplete data. Therefore, the number of
usable questionnaires was about 515.

2.2 Selecting the Relevant Attributes

Figure 1 shows how the collected data was prepared as a table in Attribute-Relation
File Format (ARFF). After that, the ARFF file was loaded into the WEKA data
mining toolkit, as shown in Figure 2.
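For illustration, a minimal ARFF file with a subset of the attributes from Table 1 might look as follows. This is an assumption for exposition only (the records and exact attribute spellings are invented, not the authors' actual file):

```python
# Sketch: write a minimal ARFF file resembling the students dataset.
# Attribute names/values loosely follow Table 1; the data rows are made up.
arff_lines = [
    "@relation students",
    "@attribute st-gender {female,male}",
    "@attribute st-age numeric",
    "@attribute fail {yes,no}",
    "@attribute h-c {yes,no}",
    "@attribute c-g {A,B,C,D,E,F}",
    "@data",
    "female,14,no,yes,B",
    "male,15,yes,no,F",
]
with open("students.arff", "w") as f:
    f.write("\n".join(arff_lines) + "\n")
```

A file of this shape can then be opened directly from the WEKA Explorer's "Open file" dialog, as in Figure 2.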


Figure 1: The ARFF File for the Students Data.

Figure 2: Loading Data to WEKA.


The questionnaire contained about 15 attributes. Since the collected attributes may
include irrelevant ones that could reduce the performance of the classification
model, a feature selection approach was used to select the most appropriate set of
attributes; six attributes were excluded because they were not related. Finally, nine
attributes and one class (computer grade) were adopted. The attributes, their
descriptions, their possible values and their types are presented in Table 1.

#  | Attribute | Description                | Possible Values                                  | Attribute Type
1  | St-Gender | Student gender             | female, male                                     | symbolic
2  | St-Age    | Student age                | 13-16                                            | numeric
3  | F-e       | Father education           | PhD, MS, BS, primary, Diploma, Uneducated        | symbolic
4  | M-e       | Mother education           | PhD, MS, BS, primary, Diploma, Uneducated        | symbolic
5  | S-n       | School name                | Bathan, Falah, Thalagah, Zeen alsharaf, Thaneah  | symbolic
6  | fail      | Fail in previous classes   | Yes, No                                          | symbolic
7  | H-c       | Having computer            | Yes, No                                          | symbolic
8  | F-m       | Family members             | 3-12                                             | numeric
9  | M-i       | Monthly income             | 100-1750                                         | numeric
10 | C-g       | Computer grade (the class) | A, B, C, D, E, F                                 | symbolic

Table 1: The Attribute Description.

Note: A = 90 - 100, B = 80 - 89, C = 70 - 79, D = 60 - 69, E = 50 - 59, F < 50.
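The grade binning in the note above maps directly to a small helper (a hypothetical function for illustration, not part of the authors' pipeline):

```python
def letter_grade(mark):
    """Map a numeric mark (0-100) to the letter grades used in Table 1."""
    if mark >= 90:
        return "A"
    if mark >= 80:
        return "B"
    if mark >= 70:
        return "C"
    if mark >= 60:
        return "D"
    if mark >= 50:
        return "E"
    return "F"

print(letter_grade(95))  # A
print(letter_grade(49))  # F
```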

2.3 Building the Classification Model

The WEKA toolkit was used to build the classification model, applying the C4.5
decision tree technique, as seen in Figure 3. The C4.5 implementation in the WEKA
toolkit is known as J48 and it has several different configuration options, as seen in
Figure 4. The decision tree is a good and practical technique since it is fast and can
be converted into classification rules (Al-Radaideh, 2006).



Figure 3: List of Various Decision Trees Algorithms in WEKA.

Figure 4: J48 configuration setting window in WEKA.


To build the decision tree, the gain ratio was used to determine the most effective
attribute for splitting, which becomes the most appropriate root node; the attribute
with the highest gain ratio was the fail attribute, so it is located at the top of the
decision tree. To build the whole decision tree, this process was repeated several
times, as seen in Figure 5.

Figure 5: Decision Tree for Students Dataset (the first model).
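The gain-ratio selection used above can be sketched as a toy re-implementation (our own illustration of C4.5's split criterion, not WEKA's J48 code):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, attr_index, labels):
    """Information gain of splitting on one attribute, divided by the
    split information (C4.5's normalization of information gain)."""
    n = len(rows)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    split_info = entropy([row[attr_index] for row in rows])
    return gain / split_info if split_info else 0.0

# Tiny invented dataset: one attribute ("fail") and the class labels.
rows = [("yes",), ("yes",), ("no",), ("no",)]
labels = ["F", "F", "A", "B"]
print(gain_ratio(rows, 0, labels))
```

The attribute with the highest gain ratio over all candidates would be chosen as the root, and the procedure recurses on each partition.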

2.4 The Experimental Work & Predicting the Student Performance

The experimental work was repeated many times until a reasonable tree model with a
good percentage of correct classification was obtained. After repeating the experiment
more than 70 times, a classification model was adopted. The decision tree of the
adopted model is presented in Figure 6; the adopted model's size is 27 nodes and the
number of leaves (rules) is 14.


Figure 6: Decision Tree of the Adopted Model (J48).

To predict the student performance (marks of students), the decision tree was used;
by following every path of the decision tree, 14 rules were generated.

Table 2 presents the generated rules of the adopted model, where the first column
gives the rule number, the second the rule itself, the third the number of students
who satisfy the rule, the fourth the rule accuracy (the number of students who satisfy
the rule divided by the number of students covered by the rule), and the fifth the
number of attributes contained in the rule.


Table 2: The Generated Rules of the Adopted Model.

Rule # | Rule                                                                                          | Students # | Accuracy | Attribute #
1      | IF fail=yes THEN computer grade=F                                                             | 35         | 87.5%    | 1
2      | IF fail=no, h-c=yes, f-e=m THEN computer grade=A                                              | 16         | 80%      | 3
3      | IF fail=no, h-c=yes, f-e!=m, f-e!=p, s-n=f THEN computer grade=A                              | 12         | 75%      | 4
4      | IF fail=no, h-c=yes, f-e!=m, f-e=p, s-n=b THEN computer grade=B                               | 11         | 73.3%    | 4
5      | IF fail=no, h-c=yes, f-e!=m, f-e!=p, s-n!=f, m-e=b, f-m<=7 THEN computer grade=A              | 26         | 68.4%    | 6
6      | IF fail=no, h-c=no THEN computer grade=F                                                      | 78         | 66.1%    | 2
7      | IF fail=no, h-c=yes, f-e!=m, f-e=p, s-n!=b, m-e!=b, s-n!=t, mi>400, s-n=f THEN computer grade=A   | 13    | 65%      | 6
8      | IF fail=no, h-c=yes, f-e!=m, f-e=p, s-n!=b, m-e!=b, s-n!=t, mi<=400, sex=male THEN computer grade=D | 53  | 64.6%    | 7
9      | IF fail=no, h-c=yes, f-e!=m, f-e!=p, s-n!=f, m-e=b, f-m>7 THEN computer grade=B               | 16         | 64%      | 6
10     | IF fail=no, h-c=yes, f-e!=m, f-e=p, s-n!=b, m-e=b THEN computer grade=D                       | 22         | 61.1%    | 5
11     | IF fail=no, h-c=yes, f-e!=m, f-e!=p, s-n!=f, m-e!=b THEN computer grade=A                     | 85         | 58.2%    | 5
12     | IF fail=no, h-c=yes, f-e!=m, f-e=p, s-n!=b, m-e!=b, s-n!=t, mi<=400, sex=female THEN computer grade=D | 45 | 57.6%  | 7
13     | IF fail=no, h-c=yes, f-e!=m, f-e=p, s-n!=b, m-e!=b, s-n=t THEN computer grade=B               | 23         | 57.5%    | 5
14     | IF fail=no, h-c=yes, f-e!=m, f-e=p, s-n!=b, m-e!=b, s-n!=t, mi>400, s-n!=f THEN computer grade=E | 29      | 56.8%    | 6

As Table 2 shows, the rules are arranged in descending order of accuracy, where the
accuracy of the strongest rule is about 87.5% and the accuracy of the weakest rule is
about 56.8%. The rules were arranged in this way in order to identify the most
significant rule. We also note that the longest rule contains seven attributes, while
the shortest rule contains just one attribute, which makes it the strongest rule.
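A rule's accuracy, as defined above, can be reproduced in a few lines. The helper and the toy records below are our own illustration; the 35 satisfying records and 40 covered records are assumptions chosen so that the result matches Rule 1's reported 87.5%:

```python
def rule_accuracy(records, condition, predicted_grade):
    """Accuracy of a rule = correctly classified covered records / covered records."""
    covered = [r for r in records if condition(r)]
    if not covered:
        return 0.0
    correct = sum(1 for r in covered if r["c-g"] == predicted_grade)
    return correct / len(covered)

# Rule 1 from Table 2: IF fail=yes THEN computer grade=F
rule1 = lambda r: r["fail"] == "yes"

# Invented records: 35 satisfy the rule, 5 are covered but misclassified,
# 10 are not covered at all.
records = (
    [{"fail": "yes", "c-g": "F"}] * 35
    + [{"fail": "yes", "c-g": "D"}] * 5
    + [{"fail": "no", "c-g": "A"}] * 10
)
print(rule_accuracy(records, rule1, "F"))  # 35/40 = 0.875
```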


2.5 Evaluating the Adopted Model

To evaluate the performance of the adopted model, the classification accuracy is
usually used.

According to Rudner (2003), accuracy can be defined as the agreement between the
actual class values and the predicted class values, or as the percentage of database
instances classified correctly (Han, 2011). The simple classification accuracy can be
expressed as:

Accuracy = C / D

where C is the number of correctly classified instances and D is the total number of
instances in the database. Table 3 shows the confusion matrix for the adopted model,
from which the accuracy of the adopted model can be calculated.

Table 3: The Confusion Matrix for the Adopted Model

Table 3 shows the predicted classes and the actual classes. The highlighted numbers
represent the correctly classified instances, totalling 210. The total number
represents all instances (marks of students) in the database, equaling 515.


Therefore, the accuracy of the adopted model can be calculated as follows:

Accuracy = C / D = (210 / 515) × 100 = 40.77%

We notice that the accuracy of the classification model is not high. This indicates
that the collected questionnaires and attributes are not sufficient to generate a
high-quality classification model.
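The accuracy calculation generalizes to any confusion matrix: sum the diagonal and divide by the grand total. The small matrix below is a made-up 3-class stand-in used only to exercise the formula (it is not the paper's 6x6 matrix from Table 3):

```python
def accuracy(confusion):
    """Overall accuracy = correctly classified (diagonal) / all instances."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Toy 3-class matrix (rows = actual class, columns = predicted class).
m = [
    [50, 5, 5],
    [10, 30, 10],
    [5, 5, 20],
]
print(accuracy(m))  # diagonal sums to 100 out of 140 instances

# With the paper's reported totals: 210 correct out of 515 instances.
print(210 / 515 * 100)
```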

3. Conclusion

This paper used data mining techniques to evaluate student performance and to
identify the factors which affect student performance at schools belonging to the
Education Directorate in the Al Karak region. The resulting classification model can
be used to predict the marks of students, but the collected data is not sufficient to
build a high-quality classification model.

This paper found that the fail, father education and having computer attributes have
the greatest effect on the students' performance, while the rest of the attributes have
no influence on the students' performance.

For future work, reliable information should be obtained to enhance the quality of the
classification model, and then the classification model should be combined with the
educational systems to provide deeper knowledge about the students’ behaviour.

Such knowledge can be used by decision-makers to take the required actions to
enhance the quality of the educational system.


References

Al-Radaideh, Q. A., Al-Shawakfa, E. M., & Al-Najjar, M. I. (2006, December).
Mining student data using decision trees. In International Arab Conference on
Information Technology (ACIT'2006), Yarmouk University, Jordan.

Calvo-Flores, M. D., Galindo, E. G., Jiménez, M. P., & Piñeiro, O. P. (2006).
Predicting students’ marks from Moodle logs using neural network models. Current
Developments in Technology-Assisted Education, 1, 586-590.

Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., & Wirth,
R. (2000). CRISP-DM 1.0: Step-by-step data mining guide.

Han, J., Kamber, M., & Pei, J. (2011). Data Mining: Concepts and Techniques. San
Francisco, CA: Morgan Kaufmann Publishers.

Kotsiantis, S., Pierrakeas, C., & Pintelas, P. (2004). Predicting students’ performance
in distance learning using machine learning techniques. Applied Artificial
Intelligence, 18(5), 411-426.

Mierle, K., Laven, K., Roweis, S., & Wilson, G. (2005). Mining student CVS
repositories for performance indicators. ACM SIGSOFT Software Engineering
Notes, 30(4), 41-45.

Rudner, L. M. (2003). The classification accuracy of measurement decision theory.
Paper presented at the annual meeting of the National Council on Measurement in
Education, Chicago.


Malware-Free Intrusion: A Novel Approach to Ransomware Infection Vectors
Aaron Zimba
Department of Computer Science and Technology
University of Science and Technology Beijing
Beijing 100083, China
[email protected]

Abstract— The Internet is so diverse that at any given instance someone is clicking a link, opening a file, downloading an email attachment and so forth. Such seemingly benign actions do not always return the expected outcome because attackers leverage these actions to spread their malware. Malware today casts a broad spectrum of software with varying characteristics, one of which is Ransomware. Ransomware has come to claim its place in the malware wild due to the philosophy of extortion behind its operations. Ransomware threat actors are seeking ways to deliver their malware payload in ways that do not generate suspicion via unusual network traffic and system calls, by involving as little user input as possible, if any at all. Malware-free intrusions present attack vectors desirable to Ransomware threat actors in this respect in that they do not employ extra malicious code which would otherwise be detected by intrusion detection and prevention systems. We in this paper explore the utilization of malware-free backdoors for Ransomware payload delivery over a network with RDP-based remote access. We further show that leveraging such backdoors does not require user input while providing high probability levels of success, thus adding to the expansion of the available attack surface.

Keywords- Ransomware; Attack Vector; Backdoor; Remote Access

I. INTRODUCTION

The rise of the Internet has likewise seen the emergence of related cyber-attacks, and the two are seen not to occupy opposite ends of the continuum. The Internet was initially built without security in mind [1], implying that all technologies that jump onto this bandwagon need to address the associated security concerns in their respective niche, but unfortunately this is not the case. Due to the vast number of technologies integrated into the Internet today, the variety of attacks thereof is extensively wide, correlating to the incepting technologies. There are many metrics and parameters used to classify cyber-attacks, but they can broadly be classified as targeted or non-targeted attacks [2]. Non-targeted attacks usually don't have a specific target and tend to be works of novices and script kiddies, as opposed to targeted attacks. On the contrary, targeted attacks are the works of highly skilled technical people who might be working on an individual basis, for organized crime groups, for big corporations or even governments. This class of attackers employs sophisticated techniques to compromise and victimize their targets. The use of malicious software in this domain is not uncommon. Attackers use a wide range of malware not limited to viruses, worms, trojans, rootkits etc to achieve their ultimate goal. One new breed of malware coined Ransomware [3] employs a new philosophy altogether, that of extortion, as a means to achieve the end goal. Unlike conventional malware, which usually seeks to replicate, delete files, exfiltrate data or extensively consume system resources, Ransomware imposes some form of denial of service on either the system or system resources such as files until a ransom is paid. One class of Ransomware uses encryption to encrypt victim files and demands a ransom before decryption. This type of malware has targeted critical industries [4] where the victim has had to pay as the only way out due to the vitality of access to data on demand. Figure 1 below shows the distribution of Ransomware attacks on different sectors of the economy for 2016 [15].

Figure 1. Ransomware Infections by Organization Sector, January 2015 – April 2016 [15]

It is estimated that Ransomware has cost victims millions of dollars [5] while enriching the criminals behind it. As with all cyber-attacks, attacks via Ransomware cast a wide spectrum of attack vectors. These are the ways and means through which Ransomware is spread and delivered to the potential victim. The attacker is therefore tasked with finding optimal ways of infecting victims, and Ransomware is known to use some of the common attack vectors employed by other malware. Some of these attack vectors generate suspicious


network traffic and issue unusual system calls, something very undesirable to attackers as this tends to raise a red flag.

Most malware require some form of user input to effectively carry out an infection. However, most user systems implement Intrusion Detection Systems (IDS) which detect and alert the user of potential harm as a consequential result if certain actions are performed. This detection is inclusive of Ransomware. Attackers therefore seek to employ methods and tactics which do not require user interaction and generate as little noise as possible, if any at all. For attacks such as those employed by Advanced Persistent Threats (APT), stealthiness and a low threshold of noise are of great essence, as such threat actors seek to maintain an undetected persistent presence for a long time [6]. Therefore an attack vector which is stealthy and less noisy is likewise desirable to Ransomware threat actors. One such attack vector is backdoor implantation leveraging the pre-authentication services available in almost every version of the Windows operating system. The accessibility backdoor in Windows is actualized by replacing an accessibility binary executable with a system file capable of granting system-level access before one even logs in. This backdoor is documented [7] to be present in important sectors such as education, judiciary, government etc, at the disposal of Ransomware threat actors. This attack vector is especially attractive to attackers in that it does not involve any malicious code. This implies that all IDSs which are signature-based [8, 9, 10] are incapable of detecting it, and since the rationale behind the backdoor is to utilize system resources and files to covertly achieve a goal, a behavioral-based IDS will particularly find it hard to detect since there is no anomalous behavior to evaluate.

This paper explores the utilization of the aforementioned backdoor attack vector for Ransomware payload delivery. We investigated the delivery of the malware payload in the presence of IDS on different versions of the Windows operating system, from Windows XP to Windows 10. The Windows operating system is chosen as the victim on the pretext that it is the most widely used operating system [11], hence the obvious casualty of such attacks. We leverage the built-in RDP-based remote access functionality in these systems to establish an RDP session, without any login at all, to deliver the malware payload and confirm the result. We contend that such an attack vector increases the attack surface of Ransomware attacks with a high probability of inflicting maximum damage without any direct user input.

The rest of the paper is organized as follows: Section II provides background information and concepts, whilst the attack model for Ransomware payload delivery and the analysis thereof are discussed in Section III. Experiment simulations are presented in Section IV, best practices and mitigation techniques are presented in Section V, and we conclude the paper in Section VI.

II. BACKGROUND AND CONCEPTS

The networks that build up the Internet are thought to be like an egg: there is an obvious hard network perimeter that requires penetration into the softer inner core, which permits lateral traversal once the attacker obtains access. Therefore, Ransomware attackers try to find ways of penetrating a target network and will utilize lapses in security configurations. These lapses may include poor security implementations, vulnerabilities imposed by software built without security in mind, social engineering etc. To this effect, Ransomware mainly comes in two flavors: non-encrypting Ransomware, also known as Locker Ransomware, and encrypting Ransomware, also known as Crypto Ransomware. The diagram below in Figure 2 shows the Microsoft report for the relative distribution of Ransomware variants.

Figure 2. Relative distribution of different Ransomware variants [16].

Using Figure 2, we in this paper consider the most common Ransomware variants of the Locker and Crypto Ransomware types, whose characteristics are later documented in Section IV in the analysis stage.

A. Locker Ransomware
This is a less common type of Ransomware which basically locks down the victim's system and its applications while disabling user input to prevent the user from operating the system at all. The victim is usually extorted with claims that they have engaged in some cyber-crime such as copyright infringement, child pornography or money laundering, and that they need to pay some fee, usually in the form of bitcoin [12], before the charge is disposed of. The emphasis usually is that the victim won't be able to use the system and will be in trouble with the law unless a payment is made. Some variants of this malware are capable of modifying the Master Boot Record (MBR) and even the partition table. Only limited system functionality is made available, such as numeric functions and limited mouse movements, to enable the victim to enter and pay the ransom amount on the displayed Ransomware screen. This strain of Ransomware usually leaves the system and user files uncorrupted [13], and the system can usually be recovered offline via a technical hack or otherwise, because of the weak techniques employed.

B. Crypto Ransomware
This is by far the most common type of Ransomware [14] and employs encryption techniques to achieve resource inaccessibility. This Ransomware variant silently infects the victim and communicates with its Command and Control (C2) servers if need be to download the relevant encryption keys. The malware then extracts the keys and encrypts targeted user files, which become inaccessible without the decryption key.
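A simple defensive counterpart to the accessibility-binary swap described in this section is to check whether an accessibility executable has become byte-identical to a command shell. The sketch below is our own illustration, not part of the paper's method; the Windows paths in the comment are the commonly cited targets and are used here as assumptions:

```python
import hashlib

def sha256_of(path):
    """SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def looks_swapped(accessibility_binary, shell_binary):
    """True if the accessibility binary is byte-identical to the shell,
    the telltale sign of a sethc.exe/cmd.exe style backdoor swap."""
    return sha256_of(accessibility_binary) == sha256_of(shell_binary)

# Hypothetical usage on a Windows host:
# looks_swapped(r"C:\Windows\System32\sethc.exe",
#               r"C:\Windows\System32\cmd.exe")
```

In practice such a check would be run periodically or compared against known-good hashes from file integrity monitoring.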


Unlike Locker Ransomware, Crypto Ransomware does not lock down the system; it only encrypts user data and displays a message after completion of encryption that the victim's data files are no longer accessible and can only be accessed via decryption upon payment of the ransom. The attacker holds the decryption keys and promises to avail them once the ransom demand is met. Whether the attacker avails the decryption keys upon payment of the ransom is a debate of circumstance, but one thing for sure is that there is no guarantee that the keys will be provided after paying the ransom. The diagram in Figure 3 below shows the general structure of Crypto Ransomware.

Figure 3. General overview of Crypto Ransomware: a main body (payload) carrying the encryption key and the encryption algorithm.

Crypto Ransomware also comes in two flavors: Private-key Crypto Ransomware (PrCR) and Public-key Crypto Ransomware (PuCR). PrCR uses classical stream or block symmetric ciphers for encryption. Since key distribution is a known challenge in symmetric encryption, these Ransomware variants, e.g. CryptorBit [17], used a self-designed substitution cipher on the first 1024 bytes of the target file. These are thus crackable via cryptanalysis once the analyst gets hold of the Ransomware itself.

PuCR, on the other hand, employs public-key encryption, where the encryption and decryption keys are entirely different. In this approach, a pair of keys is generated by an asymmetric cryptosystem such as RSA; the public key used for encryption is delivered together with the payload whilst the private key used for decryption is kept in the hands of the attacker, who promises to release it upon payment of the ransom. Cryptowall [18], for example, uses a 2048-bit RSA public key for encryption, which is believed to be computationally infeasible to break without concerted efforts of distributed computing.

C. Command and Control (C2) Servers
C2 servers are the attacker's online infrastructure which generally coordinates operations of the infected hosts. This infrastructure can be a system which the attacker owns or a set of compromised hosts in the form of a botnet. Ransomware will usually beacon back to these servers once an infection is successful, and they may handle the remote distribution and encryption of the victim's files. It is these resources that are responsible for handling operations like payment mechanisms and other related tasks. It is worth noting that C2 servers play a vital role in the Ransomware attack chain in that if these servers are offline, the malware may not complete the attack process and the subsequent encryption.

D. Infection Vectors
There are different ways in which attackers deliver their Ransomware payload to the victim. They differ in degree of complexity and effectiveness. Here we discuss the most prevalent ones and elaborate how the attack vector considered in this paper contributes to the attack surface.

1) Malicious Emails
This is one of the most common ways of Ransomware delivery. The payload is delivered as an attachment from emails sent through spam using botnets and other compromised hosts. The victim is social-engineered into interacting with the attachment: by directly opening an attachment which executes the Ransomware payload, by opening a malicious file which in turn initiates payload delivery via a macro, or by clicking on a URL in the email which redirects to an exploit kit which in turn runs to find vulnerabilities on the target system and executes the Ransomware payload thereafter. Spam attachments, as shown in Figure 4 below according to Cisco Security Research [25], usually carry files of different formats which could be used to deliver the Ransomware payload.

Figure 4. Malicious files attached to spam emails [25].

2) Brute-force Authentication Credentials
Another attack vector in this domain finding growing usage is brute-forcing user credentials to different systems. Attackers employ automated scripts to achieve this task. Bucbi Ransomware [19] utilized this attack vector to obtain access to the system via RDP. We contend in this paper that Ransomware threat actors can alternatively leverage malware-free intrusion backdoors to obtain access through RDP, as some systems implement a lockout mechanism upon multiple failed authentication attempts.
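The public/private key separation that PuCR exploits can be demonstrated with textbook RSA on deliberately tiny numbers. This is an insecure toy for illustration only (real families such as Cryptowall use 2048-bit keys); it shows only that encrypting with the public key leaves decryption to whoever holds the private key:

```python
# Textbook RSA with toy parameters (insecure, illustration only).
p, q = 61, 53
n = p * q                   # modulus, part of both keys
phi = (p - 1) * (q - 1)     # Euler's totient of n
e = 17                      # public exponent, coprime with phi
d = pow(e, -1, phi)         # private exponent (modular inverse, Python 3.8+)

message = 65
ciphertext = pow(message, e, n)    # victim side: only the public key (e, n)
recovered = pow(ciphertext, d, n)  # attacker side: the private key (d, n)
print(message, ciphertext, recovered)
```

Without d, recovering the message requires factoring n, which is what makes a 2048-bit modulus computationally infeasible to break.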


3) Exploit Kits (EKs)
EKs are software packages which scan for vulnerabilities with the purpose of installing malware upon successful discovery. These scans are run against third-party servers, injecting code into different portions of the server depending on the context, which in turn redirects the server's visitors to the malware. The Angler EK [20], for example, accounted for close to 20 million attacks thwarted by Symantec.

4) Other Attack Vectors
There are many other attack vectors employed to attain a successful Ransomware attack. Some of these include the injection of redirect links in JavaScript, malvertising and drive-by downloads [21, 22, 23].

III. THE ATTACK MODEL

We now formulate the attack model based on the preferred infection vector from the preceding section. We build the model from a set of conceptual units which serve as the basic building blocks of the whole attack process. The delivery of Ransomware to a victim can therefore be envisioned as a process in which an attacking agent carries out the attack by acquiring a set of assets, after performing some action, with the sole purpose of reaching the goal: the delivery and successful execution of the malware on the targeted host.

A. Attacking Agent
This is the subject of the attack process who carries out actions towards the object, which might be a host, network or system. The agent can be software, e.g. EKs, a human actor, or a combination of both. We take the agent of our model to be a highly skilled threat actor with a considerable level of sophistication in terms of traceability and stealthiness.

B. Assets
These are resources which the agent requires in order to further the attack. Such resources include, but are not limited to, information about the host such as the operating system, IP address, open ports, TCP/UDP connectivity and so forth. It is important to note that such information might be reusable throughout the attack process, hence the need for constant verification for consistency.

C. Actions
These are requests made by the agent with specified input parameters and an expected return. There are preconditions that have to be met for a given action to return the correct output. The output of an action can be either true, in which case the returned parameters further the attack, or false, where the returned value denotes that further pursuance of the chosen attack vector does not yield fruition.

D. Goals
These are the treasures that the attacker seeks to attain. There is the ultimate goal, Ransomware delivery and execution in our context, but also other sub-goals of the attack process which act as pivots for further attacks. These are only reached when the returned value of a certain action is true.

We now employ an attack tree [24] for our model consideration. The diagram in Figure 5 below shows an attack tree with different infection vectors. The root node is denoted by G0 and it is the attacker's ultimate goal. The rest of the nodes and leaves are denoted as follows: G1 – attack via offline or out-of-band payload delivery; G2 – network attack payload delivery; G3 – authentication attack; G4 – payload delivery via USB flash drive; G5 – payload delivery through optical media; G6 – payload delivery via Bluetooth; G7 – payload delivery via EKs; G8 – payload delivery through spam email; G9 – payload delivery via brute-forcing; G10 – payload delivery through malware-free intrusion; and Gn+ – payload delivery via other network-based infection vectors.

Figure 5. Attack Tree of Infection Vectors (root node G0: system level access; intermediate nodes G1–G3: sub-goals; leaf nodes G4–G10, Gn+: atomic attacks)

All the intermediate nodes in the resulting graph decompose into children nodes sharing a disjunctive OR association. This implies that only one child node needs to be true to traverse to the upper node; if the route through G1 is traversed, the attacker has the option of starting with either G4, G5 or G6. The same is true if the attack route pursued traverses through G2: the attacker can start with either of the leaves G7, G8 or Gn+.

With rows and columns ordered {G0, G2, G3, G7, G8, Gn+, G9, G10}, the adjacency matrix reads:

AG =
| 0 1 0 0 0 0 0 0 |
| 1 0 1 1 1 1 0 0 |
| 0 1 0 0 0 0 1 1 |
| 0 1 0 0 0 0 0 0 |
| 0 1 0 0 0 0 0 0 |
| 0 1 0 0 0 0 0 0 |
| 0 0 1 0 0 0 0 0 |
| 0 0 1 0 0 0 0 0 |

If the attacker instead opts to use the path through G3, he likewise has the option of starting with either leaf G9 or G10. Pursuance of attack vectors through G1 is practically daunting because of the constraints imposed by the need for physical presence. Moreover, it even limits the target audience which can be reached. We therefore drop this attack path in our
attack consideration and generate the adjacency square matrix AG of the 8th order shown above for the resultant attack graph. We thus deduce five attack scenarios from the adjacency matrix AG corresponding to the following paths:

P7:  {G7, G2, G0}
P8:  {G8, G2, G0}
Pn+: {Gn+, G2, G0}
P9:  {G9, G3, G2, G0}
P10: {G10, G3, G2, G0}

EKs require the existence of an exploit before materialization, meaning a target without the vulnerabilities being sought by the EK will not be susceptible to the attack. We thus drop the first path P7. Spam email largely depends on user input, and a user with up-to-date Internet hygiene would rarely fall prey to such. Moreover, spam emails are subject to filtering by spam filters, giving no assurance of success of payload delivery. We likewise drop path P8. In authentication attacks, brute-forcing passwords is subject to system lockdown upon multiple failed attempts. We in this regard likewise drop the path P9. Malware-free intrusions, on the other hand, do not require any vulnerabilities for exploits and neither do they require any user input. Once the malware-free intrusion backdoor is identified, the attacker can without difficulty deliver his Ransomware payload directly to the victim, hence the actualization of the attack. We therefore base our experiment simulations solely on the attack path P10 in the following section.

IV. EXPERIMENT SIMULATIONS AND ANALYSIS

Our simulation environment consists of two networks separated by a simulated Internet as shown in Figure 6 below. The attacker is located in one subnet whilst the targeted hosts reside in a different subnet altogether. In practice the attacker can reside in the same network as his victims, but that is a rarity from a logical point of view in as far as Ransomware is concerned. The threat actor, the attacking agent defined in the attack model, ran from Kali Linux whilst the targeted hosts ran on Windows XP, Windows Vista, Windows 7, Windows 8 and Windows 10.

Figure 6. Experiment setup for Ransomware delivery (the Ransomware adversary and the victim network connected through a router and switch across the simulated Internet)

We implant the accessibility backdoor on target hosts via registry manipulations, by setting cmd.exe as the debugger for specified accessibility suite executable binaries and by swapping specific accessibility suite executable binaries with cmd.exe in the %systemroot%\System32\ directory. We further activate RDP-based remote access on the targeted hosts. We acquire and verify Ransomware samples with VirusTotal and Malwr [26, 37] for delivery to the victims. Since the malware is active and harmful, we perform the test in a securely built environment as specified by common scientific guidelines [27, 28, 29, 30] while maintaining limited, regulated Internet access via NAT integrated in VirtualBox. We in addition used the following tools for analysis of the Ransomware: Process Monitor [31] for verification of process activity, Regshot for monitoring registry alterations, API Monitor [32] for observance of issued system calls, and ApateDNS and Netcat [33, 34] for emulating some C2 servers. We use Nmap [35] for reconnaissance attacks, where the actions of the defined attack model returned assets in the form of a list of host IP addresses, operating system types, open ports and running services. We did obscure an RDP port on one of the hosts, and it was discovered to be running RDP services upon probing for the service banner. We also employed the services of an automated script [36] for automated backdoor discovery. We set up an FTP server on the attacker's machine to host the Ransomware payload. The snapshot in Figure 7 below shows successful malware delivery on one of the targeted hosts.

We ran the attack by deploying the Ransomware to the victims using the malware-free intrusion infection. First a reconnaissance attack was carried out which revealed the list of available hosts and their respective running services upon port scans and banner grabbing. The obscured port likewise revealed that the RDP service was running. The automated script referenced earlier was employed to probe the availability of the backdoor, and five backdoors were discovered on separate hosts. As can be seen in Figure 7, the Ransomware payload file, named Invoice.zip, is a small file in the range of ~10 KB and might not raise suspicion to the benign user. Such files are usually attached to some spam email with a catchy subject to raise interest from the would-be victim.

We observed from the pursued attack vector that it did not require any user action so long as the backdoor was present and the RDP service active. Table I below summarizes some of the known activities and properties of the Ransomware payloads.

TABLE I. RANSOMWARE ATTACK ACTIVITIES

Family Name | Variant | Encryption | Deletes Files | MBR Alteration | Steals Info
Cryptowall  | Crypto  | ✓ | ✓ | X | ✓
FakeBSOD    | Locker  | X | X | X | ✓
Brolo       | Locker  | X | X | X | X
CTB-Locker  | Crypto  | ✓ | X | X | ✓
Teslacrypt  | Crypto  | ✓ | ✓ | X | ✓
Reveton     | Locker  | X | ✓ | X | ✓
Seftad      | Locker  | X | X | ✓ | X
Cerber      | Crypto  | ✓ | ✓ | X | X

Ransomware threat actors might have specific targets in mind, but using this attack vector would increase the surface area of infection, and since the motive is to extort as much money as
possible regardless of the target victim, this attack vector is especially attractive for maximizing profits.

Different Ransomware families have specific file activity upon infection. It is notable that the Locker variant of Ransomware does not employ encryption. The deletions carried out by the various families differ in their target files and in the subject implementing the deletion. Some attackers are known to remotely delete the files upon failure to pay the ransom, while in other cases the Ransomware payload itself deletes target files to reduce any possibility of recovery. Families which employ asymmetric encryption largely depend on the C2 for the generation and deployment of encryption keys. The public key is always used for encryption. There are a number of Ransomware out in the wild, but the majority belong to existing families, only that they introduce some additional functionalities, e.g. changing from symmetric encryption to asymmetric encryption. Therefore, mutations of Ransomware, just like conventional malware, are not uncommon.

Figure 7. One of the Ransomware payloads successfully delivered to a victim

A. The Infection Chain
We now describe the infection chain for the crypto Ransomware used in our experiment. It depicts the life process of the Ransomware until it accomplishes the given task. Since the attack vector pursued in our setting is somewhat different from the common infection vectors in that it does not require user input or user action, the infection chain likewise emphasizes that the attack vector leverages malware-free intrusion as a result of the backdoor and is not a direct consequence of a user's action. The same chain applies to Locker Ransomware, with the only difference being the attack activities shown in Table I. The chain comprises five stages as shown in Figure 8, elaborated below as follows:

1) Malware-free Intrusion Backdoor Discovery
The attacker in this case seeks to find the accessibility backdoor which, when invoked, avails a system level access console via RDP-based remote access. The attacker does not concern himself with the implantation of the backdoor; his main objective is to determine whether the backdoor exists or not. We base this assumption on the fact that this type of backdoor is documented [7, 38] to exist on critical networks of educational institutions, governments, manufacturing industries, the legal sector, gaming companies etc. Moreover, we contend that a determined attacker with a specific target might employ other techniques to achieve this backdoor, or any backdoor of this type, which does not necessarily require user action. Once the backdoor has been discovered, the attacker goes on to the next step of initial payload delivery.

2) Payload Delivery via FTP
Once in the system console of the victim, the attacker could use whichever method is applicable to fetch the Ransomware from wherever it is harbored. In our experiment the Ransomware was assumed to be hosted on the C2 servers controlled by the attacker. For effective distribution of the payloads, attackers are known to host malware payloads on different mirrors depending on the location of the victim. We chose to fetch the Ransomware via FTP considering that this protocol is allowed by most firewalls and would not be red-flagged under normal conditions.
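Stage 1 presupposes locating hosts that answer RDP, and as noted in the experiment, moving the service to an obscured port does not hide it from a banner probe. A minimal sketch of such a probe follows; the request bytes follow the public RDP connection sequence (TPKT header, X.224 Connection Request, RDP negotiation request) and are meant as an illustration, not a production fingerprinting tool:

```python
import socket

# TPKT header + X.224 Connection Request carrying an RDP Negotiation Request.
# Treat the byte layout as illustrative of the documented RDP handshake.
RDP_NEG_REQ = bytes([
    0x03, 0x00, 0x00, 0x13,                          # TPKT: version 3, total length 19
    0x0e, 0xe0, 0x00, 0x00, 0x00, 0x00, 0x00,        # X.224 Connection Request
    0x01, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00,  # NEG_REQ: PROTOCOL_RDP
])

def looks_like_rdp(host, port, timeout=3.0):
    """Send an RDP negotiation request and report whether the reply is TPKT-framed.

    A service answering with a TPKT version byte (0x03) is very likely RDP,
    regardless of which port number it has been moved to -- which is why
    port obscurity does not hide the service.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(RDP_NEG_REQ)
            reply = sock.recv(19)
    except OSError:
        return False
    return len(reply) >= 4 and reply[0] == 0x03
```

A closed port, a timeout, or a non-TPKT reply (e.g. an HTTP error line) all yield False, so the probe distinguishes relocated RDP endpoints from unrelated services.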
3) Infection and Initial Execution
Once the Ransomware is delivered to the victim via FTP, the first action it carries out is to beacon out to the C2 servers. This is common for the variants which employ public key encryption. Cases of the Ransomware failing to successfully encrypt are documented [39], where the failure is attributed to the inability to establish contact with the C2 servers. It should be noted at this point that the attacker does not engage further in the attack process but rather awaits the Ransomware to carry out its tasks. Another point worth noting is that some variants are known not to execute instantly upon infection but hibernate in efforts to avoid detection.

Figure 8. Infection Chain for Ransomware Attack via Malware-free Intrusion Vector (1: backdoor discovery on the targeted hosts; 2: Ransomware payload download via FTP; 3: C2 beaconing and payload execution; 4: encryption key download and file encryption; 5: user notification about the ransom)

4) File Encryption
This is the stage where the payload actually encrypts the targeted files using keys obtained from the C2 servers. The encryption is selective in that it does not encrypt system files but user files, not limited to images, text documents, PDF documents, database files, Excel files, PowerPoint files etc. Crypto Ransomware is not known to attack system files since, from a logical point of view, it might then be cumbersome to deliver the ransom notice. Once the files are encrypted, only the attacker with the decryption keys has the means of making the data accessible again, hence the ransom, but there is no guarantee that the attacker will keep their word. Some variants, as shown in Table I, do delete original files while others do not. Others are known to delete shadow files so as to prevent any possibility of the system restoration available in Windows. Crypto Ransomware threat actors suffer from the challenge of key management: clearly, using the same key for multiple encryptions, hence decryptions, increases the chances of key discovery once a victim pays the ransom, as the key would be reusable. Generating multiple keys instead introduces the challenge of key management and likewise increases the chances of key discovery.

5) Ransom Notification
Once the Ransomware is done encrypting and deleting files in accordance with its characteristics, a ransom note is delivered to the screen of the victim. Attackers employ a myriad of scare tactics to intimidate the victim into succumbing. Some of the scare tactics employed try to prey on the ignorance of the victim. There is the process of payment, which is carried out in different forms, the most notable being via Bitcoin using the Tor network to eliminate any chances of traceability.

With the infection chain of a Ransomware attack via malware-free intrusion defined, we now look at how this infection vector fares in comparison to others in terms of intractability while putting both the attack and victim into context. Table II below summarizes the comparisons thereof.

TABLE II. INFECTION VECTOR CHARACTERISTICS

Infection Vector       | Victim Action | Exploit Dependence | Mule Carrier | Repudiation
Spam Mail              | ✓ | X | ✓ | X
Brute-Forcing          | X | X | X | ✓
Exploit Kits           | ✓ | ✓ | ✓ | ✓
Malware-Free Intrusion | X | X | X | ✓

It is evident from the above table that the malware-free intrusion infection vector shares some commonalities with the brute-forcing vector, all due to the fact that these two methods require first access to the victim's machine. Though this may result in requiring a lot of input parameters for the attack to materialize, the benefits outweigh attack paths pursued via other means. It is worth noting that though these two vectors may share some commonalities, brute-forcing is subject to a lot of hurdles compared to malware-free intrusions, and the latter is therefore a better attack vector.

V. MITIGATION AND BEST PRACTICES

In as far as prevention and detection are concerned, we approach the problem twofold: against the Ransomware attack itself and against malware-free intrusion. This is so because, as shown earlier in the attack model, the absence of the malware-free intrusion backdoor implicitly entails that a Ransomware infection vector in this regard would not be feasible.

There are a number of suggested solutions against Ransomware attacks, the majority directed towards prevention rather than recovery. The most echoed of these is "prevention is better than cure", where offline backup of data is strongly emphasized. Offline is stressed due to the fact that some Ransomware families are known to search for any network-attached storage and any network resources and induce an attack if the target files are present. Offline backup is arguably the best solution because Ransomware variants keep mutating, and new ones keep emerging with new techniques altogether. It is also recommended to keep the anti-virus updated so as to include new Ransomware signatures and anomaly behavior in the IDS and IPS engines. Good Internet hygiene is another recommended house-keeping activity; users, whether technical or otherwise, should likewise be educated on the importance of safe Internet browsing, since Ransomware attacks are mainly directed towards the Internet and the users thereof.

One solution against Ransomware attacks is to keep restore points on the system. However, this method works with earlier variants of Ransomware which did not attack the restore utility in the Windows operating system. Newer, mutated and updated versions of Ransomware seek to delete system restore points
via the vssadmin.exe file to prevent system restoration, i.e. the Ransomware depends on local system resources to make recovery impossible. Trivial methods have been suggested [40] to prevent access to the above file by making a backup of the file and renaming the original. If this is implemented, the Ransomware fails to find the vssadmin.exe responsible for removing restore points, consequently failing to explicitly prevent a system restore. Though the Ransomware at this point might still be able to encrypt the targeted files, restoration is achieved by first removing the Ransomware payload, assuming it did not delete itself, then renaming the backed-up file to vssadmin.exe and performing a system restore. Nevertheless, this is in the hope that the Ransomware does not compute hashes to check collisions with the targeted file.

Considering that Crypto Ransomware is the more resilient of the two Ransomware variants, other suggested solutions are process signing and traffic monitoring by the IPS. Crypto Ransomware always tries to beacon back to the C2 for further instructions, and in this prevention approach the IPS signs all processes in the system and monitors and logs process activity on the network. Communication to C2 servers can be sighted as unusual traffic and the necessary steps taken to prevent further damage. This technique has shortfalls in that C2 servers are not static resources; attackers employ different techniques to keep their C2 servers dynamic and difficult to trace. Moreover, C2 servers can be anything from normal compromised user machines to botnets controlled by the attacker. In this case, Deep Packet Inspection (DPI) could be employed to examine the payload of communication with C2s, but DPI is known to be slow and costly for high-bandwidth applications.

Countering malware-free intrusions calls for addressing the attack vectors that make them possible. Since the intrusion discussed herein is a result of backdoor planting via accessibility tools, ultimate prevention and mitigation calls for prohibition of interactive console access at pre-authentication, most importantly over RDP-based remote access. One way to detect the presence of the backdoor is by hashing all binary executables in the %systemroot%\System32\ directory; any hash collision is an indicator of compromise. The second method of detecting the backdoor is to check registry entries to see whether cmd.exe has been set as the debugger for any of the accessibility tools; the presence of such a setting is likewise a clear indication of compromise. Network Level Authentication (NLA), a feature introduced in newer versions of Windows starting with Vista, prevents the establishment of an RDP session before authentication. This implies that activation of NLA will ultimately see the thwarting of this malware-free intrusion. However, it must be stated that NLA imposes requirements such as belonging to a network domain and a third entity for credential handling, requirements that are not befitting to the average independent user. This solution therefore can only work in isolated instances.

Since one of the actions of the attack model involves RDP service discovery, and this intrusion is only actualized via RDP, closing RDP ports and the RDP service will prevent this attack. It is worth noting that security via port obscurity is not forthcoming, due to the fact that service banner probes do reveal the actual running service.

VI. CONCLUSION

Ransomware attacks keep evolving and so do the methods and techniques employed to carry out the attacks. The most common methods of Ransomware payload delivery involve some third party and require some action of the user. Moreover, these types of attack vectors also involve some form of malware for initialization of the attack before the Ransomware infection, which might otherwise be detectable by the IDS. Malware-free intrusions introduce a new attack vector desirable to the attacker in that it does not require a third-party mule and neither does it require any action from the user. As demonstrated in this paper, all the attacker needs to do is to download the payload once the victim's system has been penetrated. We in this paper explored the accessibility backdoor as a malware-free infection vector where system level access is gained over an RDP session at pre-authentication without logging in at all. Since the console accessed at pre-authentication has system level permission, the Ransomware does not need any user action as it will run under system root, the highest permission in the system. Furthermore, this infection vector does not rely on any exploit whatsoever; all versions of Windows systems are susceptible to the attack via this infection vector so long as RDP-based remote access is activated, and all versions of Windows considered in this paper, from Windows XP to Windows 10, ship with RDP by default. The attacker need not worry about the implantation of the backdoor because recent studies have shown that a lot of systems running on the Internet today have this backdoor, as evidenced by the SHODAN search engine.

Ransomware attacks pursued through this attack vector ought to be countered by addressing the security loopholes resulting from the implantation of the backdoor. Until Microsoft finds a way to implement context detection of cmd.exe execution at pre-authentication, i.e. prohibition of the execution of cmd.exe or any other system binary that avails system level access at pre-login, the backdoor will continue to exist. Since this backdoor has already been used in APT attacks, it only remains to be documented for use in Ransomware attacks. Another security implementation to thwart infection via this vector is ensuring, via system integrity checks, that a system executable binary capable of providing system level access at pre-authentication is not set as a debugger to any of the accessibility tools. Though the introduction of NLA in newer versions of Windows somewhat helps prevent the backdoor, it should be extended to pre-authentication attacks and not limited to denial of service attacks as originally intended.

There are other Ransomware infection vectors that do not require user action nor a third-party carrier, like brute-forcing, but their effectiveness is hindered by a couple of other factors. Malware-free intrusion promises a better turnout for the attacker.
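As a companion to the detection measures in Section V, the two backdoor checks recommended there (hash collisions among System32 binaries, and cmd.exe registered as an accessibility tool's debugger) can be sketched as follows. The directory scan takes an arbitrary folder, and the registry lookup is stubbed as a plain mapping, so the logic stays portable:

```python
import hashlib
from pathlib import Path

# Accessibility binaries commonly abused for the "sticky keys" style backdoor.
ACCESSIBILITY_TOOLS = {"sethc.exe", "utilman.exe", "osk.exe",
                       "narrator.exe", "magnify.exe", "displayswitch.exe"}

def sha256_of(path):
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def find_swapped_binaries(system32_dir):
    """Flag accessibility tools whose hash collides with cmd.exe (binary swap)."""
    system32 = Path(system32_dir)
    cmd_hash = sha256_of(system32 / "cmd.exe")
    return [p.name for p in sorted(system32.iterdir())
            if p.name.lower() in ACCESSIBILITY_TOOLS
            and p.is_file() and sha256_of(p) == cmd_hash]

def find_debugger_backdoors(ifeo_entries):
    """Flag entries that set a shell as 'Debugger' for an accessibility tool.

    `ifeo_entries` maps an executable name to its Debugger value, as it would
    be read from the Image File Execution Options registry key (the registry
    read itself is stubbed out here for portability).
    """
    return [exe for exe, debugger in ifeo_entries.items()
            if exe.lower() in ACCESSIBILITY_TOOLS
            and debugger and "cmd.exe" in debugger.lower()]
```

Either list being non-empty is an indicator of compromise in the sense described in Section V; on a real host the mapping would be populated from the IFEO registry key and the scan pointed at %systemroot%\System32\.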
REFERENCES
[1] Al-Salqan, Yahya Y. "Future trends in Internet security." In Distributed [23] McDowell, Karen. "Now that we are all so well-educated about spyware,
Computing Systems, 1997., Proceedings of the Sixth IEEE Computer can we put the bad guys out of business?." In Proceedings of the 34th
Society Workshop on Future Trends of, pp. 216-217. IEEE, 1997. annual ACM SIGUCCS fall conference: expanding the boundaries, pp.
[2] Thonnard, O., Bilge, L., O’Gorman, G., Kiernan, S. and Lee, M. 235-239. ACM, 2006.
"Industrial espionage and targeted attacks: Understanding the [24] Schneier, Bruce. "Attack trees." Dr. Dobb’s journal 24, no. 12, pp.21-29,
characteristics of an escalating threat." In International Workshop on 1999.
Recent Advances in Intrusion Detection pp. 64-85, 2012. [25] Catalin Cimpanu. (February 2017). "Spam Accounts for Two-Thirds of
[3] O'Gorman, Gavin, and Geoff McDonald. Ransomware: A growing All Email Volume, and It's Still Going Up." [Online] Available:
menace. Symantec Corporation, 2012. https://www.bleepingcomputer.com/news/security/spam-accounts-for-
[4] Richard Winton (February 2016). "Hollywood hospital pays $17,000 in two-thirds-of-all-email-volume-and-its-still-going-up/#comment_form.
bitcoin to hackers; FBI investigating" [Online] [Accesed February 2017].
Available:http://www.latimes.com/business/technology/la-me-ln- [26] VirusTotal - Free Online Virus, Malware and URL Scanner. [Online]
hollywood-hospital-bitcoin-20160217-story.html [Accessed 3rd January Available: https://www.virustotal.com/. [Accessed 20th December
2017] 2016].
[5] V. Weafer, “McAfee Labs Threats Report,” McAffe, March 2016 [27] Rossow, C., Dietrich, C.J., Grier, C., Kreibich, C., Paxson, V.,
[6] Gonzales, Daniel, Jeremy Kaplan, Evan Saltzman, Zev Winkelman, and Pohlmann, N., Bos, H., Van Steen, M.: Prudent practices for designing
Dulani Woods. "Cloud-trust-a security assessment model for malware experiments: status quo and outlook. In: 2012 IEEE
infrastructure as a service (IaaS) clouds." IEEE Transactions on Cloud Symposium on Security and Privacy (SP), pp.65–79. IEEE. 2012.
Computing (2015). [28] L. Zeltzer, “5 Steps to Building a Malware Analysis Toolkit Using Free
[7] D. Maldonado and T. M. Lares. "Sticky Keys to the Kingdom: Pre-Auth Tool.” Zektzer Security Corp. [Online] Available:
system RCE on Windows is more common than you think." DEFCON https://zeltser.com/build-malware-analysis-toolkit/. March 2015.
Conference, 2016. [29] Cuckoo Foundation. Cuckoo Sandbox: Automated Malware Analysis ).
[8] Sourabh Saxena. "Demystifying Malware Traffic." SANS Institute [Online] Available: http://www.cuckoosandbox.org. 2014.
InfoSec. August 2016. [30] M. Sikorski and A. Honig, “Practical Malware Analysis: The HandsOn
[9] Rieck Konrad, Philipp Trinius, Carsten Willems, and Thorsten Holz. Guide to Dissecting Malicious Software,” No Starch Press, 2012.
"Automatic analysis of malware behavior using machine learning." [31] M. Russinovich (2016, July). Process Monitor [Online]. Available:
Journal of Computer Security 19, no. 4, pp. 639-668. 2011. https://technet.microsoft.com/en-us/sysinternals/processmonitor.aspx
[10] Alazab Mamoun, Sitalakshmi Venkataraman and Paul Watters. [32] API Monitor. (2017). [Online] Available:
"Towards understanding malware behaviour by the extraction of API http://www.rohitab.com/apimonitor
calls." In Cybercrime and Trustworthy Computing Workshop (CTC), [33] ApateDNS. (January 2017). [Online] Available:
2010 Second, pp. 52-59. IEEE, 2010. https://www.fireeye.com/services/freeware.html/mandiant_apatedns.htm
[11] "Top 7 Desktop OSs on January 2017." [Online] Avaialable: l
AUTHORS PROFILE

Aaron Zimba is currently a PhD student at the University of Science and Technology Beijing in the Department of Computer Science and Technology. He received his Master and Bachelor of Science degrees from the St Petersburg Electrotechnical University in St Petersburg in 2009 and 2007 respectively. He is also a member of the IEEE. His main research interests include Network Security Models, Network & Information Security and Cloud Computing Security.
325 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
(IJCSIS) International Journal of Computer Science and Information Security,
Vol.15, No. 2, February 2017

Network Forensics For Detecting Flooding Attack On Web Server

Desti Mualfah
Department of Informatics, Islamic University of Indonesia, Yogyakarta, Indonesia
[email protected]

Imam Riadi
Department of Information System, Ahmad Dahlan University, Yogyakarta, Indonesia
[email protected]

Abstract— Flooding attack is one of the serious threats to network security on web servers; it results in loss of bandwidth and overload for both the user and the web server service provider. The first step in recognizing a network flooding attack is to apply an Intrusion Detection System (IDS) such as Snort. Snort is an open-source system that can be used to detect flooding attacks using its own special rules. All activities recorded by Snort are stored in a log file that records all activity in the network traffic. At the investigation stage, these log files are processed with the forensic process model method to find evidence. The analysis of this research scenario found 15 IP addresses recorded performing illegal actions on the web server. This research successfully detected a flooding attack on the network by performing forensics on the web server.

Keywords: Flooding, IDS, Snort, Network Forensics

I. INTRODUCTION

In this era, threats and attacks on network security keep increasing, because web servers, supported by ease of access and resource availability, give hackers more vulnerabilities to exploit. A web server is an application server that serves content over HTTP or HTTPS to the browser and sends it back in the form of pages [1].

The web server records data about every visitor in the form of log files [2]. These log files are very helpful when problems occur on the web server [3]. In this regard, network forensics, a branch of digital forensics [4], uses scientifically proven techniques to collect, identify, test, analyze, and document digital evidence from several sources in order to find the source of an attack on the network [5].

There are many possible attacks on a web server, one of which is the flooding attack. A flooding attack is an attack intended [6] to stop services, carried out from one computer or from many computers simultaneously, by spending the resources owned by the target computer until it is no longer able to function properly [7]. It is therefore necessary to trace and reconstruct the attack through analysis of the attack evidence. To combat this, an Intrusion Detection System (IDS) [8] can be used for detection and identification of flooding attacks, such as Snort IDS [9]. Snort can detect intruders with its own rules, working as a packet sniffer that observes the data traffic on computer networks [10].

Detection of a flooding attack on a web server is done with forensic evidence using the forensic process model approach, that is, forensic methods to gather information, followed by examination, analysis, and reporting [5]. Therefore, the topic raised in this research is the detection of a flooding attack on a web server [11], including the detection process and the reconstruction of the characteristics of the log file recorded by the Snort Intrusion Detection System (IDS) [12]. The detection process aims to help network administrators minimize the manual tasks undertaken when searching visitor data for evidence of a deliberate flooding attack on a web server [13].

This network forensic research was conducted on the computer network of the Bureau of Information and Communication Technology (ICT) [14] of the University of Muhammadiyah Magelang. According to the ICT network administrator, most attacks on the University of Muhammadiyah Magelang are flooding attacks; therefore, a network forensic investigation at the University of Muhammadiyah Magelang is required.

II. BASIC THEORY

A. Network Forensic

Network forensics is defined in [19] as the capture, recording, and analysis of network events in order to discover the source of security attacks or other problem incidents. In other words, network forensics involves capturing, recording and analyzing network traffic. The network data is derived from existing network security appliances such as a firewall or
intrusion detection system, examined for attack
characterization, and investigated to trace back to the attacker.
In many cases there are crimes which do not break network security policies but are nevertheless legally prosecutable; such crimes can be handled only by network forensics.
B. Model Process Forensic

The forensic process model used in this research consists of four stages: collection, examination, analysis, and reporting [5].

C. Intrusion Detection System (IDS)

An Intrusion Detection System (IDS) is a software application or hardware device that can detect suspicious activity in a network system [11]. An IDS can inspect inbound and outbound traffic in a system or network, perform analysis, and find evidence of infiltration attempts [15]. An IDS is passive in that it can only detect the presence of an intruder and inform the network administrator that there is an attack on or disruption to the network. IDS is divided into two types, namely [16]:

- Network-based Intrusion Detection System (NIDS)
  All traffic flowing into a network is analyzed to find whether there was an attempted attack or intrusion into the network system.

- Host-based Intrusion Detection System (HIDS)
  The activities of an individual host on the network are monitored to determine whether they are part of an attempted attack or intrusion.

D. Snort

Snort is software to detect intrusions on a system [17]; it is capable of analyzing traffic and logging IP addresses in real time, and it can analyze ports and detect all sorts of attacks from outside [18]. Snort works in three modes, as shown in figure 1, namely:

- Packet sniffer mode
  In packet sniffer mode, Snort works as a sniffer to see the data traffic on computer networks.

- Packet logger mode
  In packet logger mode, Snort records the packets on the network to disk without analyzing them.

- Intrusion detection mode
  In this mode, Snort serves to detect attacks made through a computer network.

Figure 1: Snort detection

III. METHODOLOGY

The flooding attack detection configuration phase consists of the Snort Intrusion Detection System (IDS). This configuration is performed to detect flooding attacks on a web server. After configuring Snort IDS, the next step is to conduct simulated flooding attacks to test whether Snort IDS has been successfully installed. Snort log files can be saved as a .pcap file, which is then analyzed to obtain forensic evidence of the intruder on the web server.

A. Intrusion Detection System (IDS) Snort Configuration

The Snort IDS configuration phase is performed to detect any request for data, whether a legitimate request or an attack. After configuring Snort, the rules are then configured in accordance with the rules owned by Snort to detect flooding attacks.

B. Flooding Attack Scenario

The flooding attack scenario phase was conducted to test whether the Snort IDS configuration on the web server was successfully installed. The simulation was performed using the LOIC tool to test whether Snort IDS detects flooding attacks. The drill began by sending IP packets to a target and selecting the port to be attacked.

Figure 2: Simulation Flooding attack

The flooding attack scenario, as shown in figure 2, was carried out from several directions connected to the Internet, using the LOIC tool. The server of the University of Muhammadiyah Magelang was targeted with a simulated flooding attack to test whether the Snort IDS worked well. During the simulation, flooding attacks were carried out for about 15 minutes. While the simulated attack was

flooding the target server, IDS Snort captured the traffic and produced a Snort log file in the form of a .pcap file.
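As a rough illustration of the rule-based detection described here, a Snort 2.x rule of this kind could look like the following sketch. This is our hypothetical example, not the authors' actual rule; the SID and message are invented, and the threshold of 30 packets per second is the one reported later in the analysis:

```
# Hypothetical Snort 2.x rule (not the authors' actual configuration):
# alert when one source sends more than 30 UDP packets to the web
# server port within one second.
alert udp any any -> $HOME_NET 80 ( \
    msg:"Possible UDP flooding attack"; \
    detection_filter: track by_src, count 30, seconds 1; \
    sid:1000001; rev:1; )
```

The `detection_filter` option tracks packets per source and only fires the alert once the per-second count is exceeded, which matches the thresholded behavior described for the experiment.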

IV. IMPLEMENTATION & RESULT


The analysis phase was used to reconstruct the results of the Snort log file to obtain evidence. The network topology of the University of Muhammadiyah Magelang is distributed, a development of a star topology; the ICT bureau of the University of Muhammadiyah Magelang is the center as well as the point where bandwidth is divided among the faculties. The topology can be seen in figure 3.

Figure 3: Topology University of Muhammadiyah Magelang

A. Implementation Model Process Forensic

The network forensics process model was implemented in the architectural design of network forensics for detecting flooding attacks on the web server of the University of Muhammadiyah Magelang. The detection of flooding attacks is seen in figure 4.2. In the simulated case, an attacker tries to attack the target web server, and Snort IDS detects the attempted flooding attack by matching the rules owned by Snort. Snort IDS records all data-delivery activity towards the target server, and these records are stored in a Snort log file. The intruder activity is then analyzed to look for forensic evidence by using Wireshark to reconstruct the characteristics of the log files recorded by Snort.

B. Model Process Forensic

The results of this analysis follow the four stages of the forensic process model:

- Phase Collection
  Evidence collection in this study used recordings of IDS traffic captures. The IDS was deployed for about three months during the study. The IDS reconstruction process begins after captured traffic matches a predetermined rule. The process of taking the payload of a flooding attack file in this study is shown in figure 4.

Figure 4: Data Collection Stages

- Phase Examination
  Forensic investigators used Snort IDS in examining the log file found in the Snort capture (.pcap) by entering parameters to be plugged into Snort. The inspection process goes through the phases in figure 5.

Figure 5: Detection IDS Snort

- Phase Analysis
  At this stage the log files are checked: the recovered log files are examined one by one to determine changes in the network and to see the timestamps. Flooding attacks become visible when requests to the web server of the University of Muhammadiyah Magelang increase as anomalous traffic; the flooding attacks sent by the attacker make the traffic rise. In addition to inspecting the traffic, the investigator used remote SSH; the increase in user requests can also be seen in the graph in figure 6.
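Because the evidence is handled as .pcap captures, the examination step can be illustrated with a minimal standard-library sketch. This is our illustration, not the authors' tooling: it walks a classic libpcap file (24-byte global header, then a 16-byte record header per packet) and counts the recorded packets:

```python
import io
import struct

def count_packets(stream):
    """Count packet records in a classic libpcap (.pcap) stream."""
    magic = stream.read(24)[:4]                  # 24-byte global header
    if magic == b"\xd4\xc3\xb2\xa1":             # little-endian capture
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":           # big-endian capture
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    count = 0
    while True:
        record = stream.read(16)                 # ts_sec, ts_usec, incl_len, orig_len
        if len(record) < 16:
            return count
        _, _, incl_len, _ = struct.unpack(endian + "IIII", record)
        stream.seek(incl_len, 1)                 # skip the captured bytes
        count += 1

# Synthetic two-packet capture, just for illustration:
header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
record = struct.pack("<IIII", 0, 0, 4, 4) + b"\x00" * 4
print(count_packets(io.BytesIO(header + record + record)))  # -> 2
```

The same record walk is what tools like Wireshark perform before applying display filters to the capture.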

Each selected line represents one frame of the flooding attack packets from IP address 118.96.155.120, with a frame length in the 70-byte range (74 Bytes). In the Internet Protocol Version 4 section, the source IP reads 118.96.155.120 and the destination IP 203.6.149.136, with a header length of 20 Bytes and a total length of 60. In the User Datagram Protocol section, the source port reads 52883 and the destination port 80. If the filter ip.src == 118.96.155.120 is applied again and another frame is investigated, the source port differs but remains within a large range (ports 51000-64000). The log file analysis identified 15 IP addresses that performed illegal flooding attacks on the web server.
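The per-source threshold check described above (more than 30 packets from one source within a second marks flooding) can be sketched as follows. The function and sample data are illustrative, not the authors' code:

```python
from collections import defaultdict

# Minimal sketch (not the authors' tooling): flag source IPs that send
# more than `threshold` packets within any one-second window, the same
# criterion used above to mark flooding traffic. Packets are given as
# (unix_timestamp, source_ip) pairs, e.g. exported from a .pcap file.
def find_flooders(packets, threshold=30):
    counts = defaultdict(int)              # (src_ip, whole second) -> packets
    for ts, src in packets:
        counts[(src, int(ts))] += 1
    return sorted({src for (src, _sec), n in counts.items() if n > threshold})

# Example: one source sends 40 packets within a single second, the other
# only one packet per second.
packets = [(100.0 + i / 50, "118.96.155.120") for i in range(40)]
packets += [(100.1, "36.81.35.5"), (101.2, "36.81.35.5"), (102.3, "36.81.35.5")]
print(find_flooders(packets))              # -> ['118.96.155.120']
```

Grouping by whole second keeps the sketch simple; a sliding one-second window would catch bursts that straddle a second boundary.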
Figure 6: Load Average

After the Snort log files are recorded, the log file is taken and analyzed using Wireshark to obtain the forensic evidence. In the capture, requests exceed 30 packets in one second; when this is detected, the Snort rules give a warning message in the alerts, as shown in figure 7.

Figure 7: Log Snort in Wireshark

With the help of the ip.src == filter, the flooding attack is then analyzed by selecting the rows one by one and opening the Follow UDP menu, which results in Figure 8.

In addition, the analysis continued with the endpoint statistics module in Wireshark, used to collect the attack packets contained in the Snort IDS log files during the attack simulation. Figure 9 shows that each IP address has a different load per packet and a different speed in its bytes.

Figure 9: Statistic Endpoint Snort

- Phase Reporting
  The reporting stage is the last stage in the forensic process model. This stage presents all the findings of this study. Based on the analysis that has been done, 15 IP addresses were obtained as the findings of this research scenario, as shown in Table 10.

Figure 8: UDP Follow

V. CONCLUSION

The IDS applied in the scenario of this study worked as expected: the system recorded the network activities in log files with the .pcap extension, and these files can be analyzed with the Wireshark tool. Based on the analysis that has been done, it was found that 15 IP addresses performed illegal actions against the web server, which led to traffic overload.

By applying the forensic process model, the IDS on the web server can be used to help meet the forensic needs of the University of Muhammadiyah Magelang; in addition, the administrator can monitor and prevent future attacks.

TABLE 10: FILE LOG SNORT

No. | Timestamp       | Source IP      | Dest. IP    | Protocol | Source Port | Dest. Port
  1 | 7/10/2016 17:26 | 203.6.149.140  | 203.x.x.136 | ICMP     | -           | -
  2 | 7/10/2016 16:32 | 112.78.32.170  | 203.x.x.136 | UDP      | 52658       | 80
  3 | 8/10/2016 19:28 | 36.73.51.196   | 203.x.x.136 | UDP      | 58894       | 80
  4 | 8/10/2016 20:36 | 118.96.155.120 | 203.x.x.136 | UDP      | 52882       | 80
  5 | 8/10/2016 19:45 | 180.253.133.16 | 203.x.x.136 | UDP      | 60052       | 80
  6 | 8/10/2016 20:26 | 180.253.128.44 | 203.x.x.136 | UDP      | 63749       | 80
  7 | 8/10/2016 20:09 | 180.254.95.85  | 203.x.x.136 | UDP      | 53820       | 80
  8 | 8/10/2016 20:15 | 180.254.89.62  | 203.x.x.136 | UDP      | 61246       | 80
  9 | 8/10/2016 20:51 | 180.254.66.63  | 203.x.x.136 | UDP      | 54948       | 80
 10 | 8/10/2016 20:07 | 36.73.104.81   | 203.x.x.136 | UDP      | 53817       | 80
 11 | 8/10/2016 20:08 | 36.73.54.59    | 203.x.x.136 | UDP      | 53814       | 80
 12 | 8/10/2016 20:20 | 36.81.87.139   | 203.x.x.136 | UDP      | 63748       | 80
 13 | 8/10/2016 20:25 | 36.81.26.141   | 203.x.x.136 | UDP      | 63756       | 80
 14 | 10/10/2016 6:34 | 36.81.47.197   | 203.x.x.136 | UDP      | 55291       | 80
 15 | 10/10/2016 6:34 | 36.81.35.5     | 203.x.x.136 | UDP      | 56328       | 80

Payload: row 1 (ICMP) carries 40ddf957603e0e006e69746f72696e6763616374692d6d6f…; rows 2-15 (UDP) all carry the same payload, 69732066696e6520746f6f2e204465737564657375646573...
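The UDP rows of Table 10 all carry the same hex-encoded ASCII payload, and decoding it with the standard library is a quick way to fingerprint the attack tool during analysis (the recovered text matches the default message sent by the LOIC tool used in the simulation):

```python
# Concatenate the two payload fragments shown for the UDP rows of
# Table 10 and decode them from hexadecimal to ASCII text.
payload_hex = "69732066696e6520746f6f2e204465" "737564657375646573"
print(bytes.fromhex(payload_hex).decode("ascii"))  # -> is fine too. Desudesudes
```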

REFERENCES
[1] J. D. Ndibwile and A. Govardhan, “Web Server Protection against Application Layer DDoS Attacks using Machine Learning and Traffic Authentication,” pp. 261–267, 2015.
[2] A. Iswardani and I. Riadi, “Denial Of Service Log Analysis Using Density K-Means Method,” vol. 83, no. 2, pp. 299–302, 2016.
[3] T. A. Cahyanto and Y. Prayudi, “Web Server Logs Forensic Investigation to Find Attack’s Digital Evidence Using Hidden Markov Models Method,” Snati, pp. 15–19, 2014.
[4] K. K. Sindhu and B. B. Meshram, “Digital Forensics and Cyber Crime Datamining,” vol. 2012, no. July, pp. 196–201, 2012.
[5] R. Utami Putri and J. E. Istiyanto, “Network Forensic Analysis Case Studies SQL Injection Attacks on Server Universitas Gadjah Mada,” Int. J. Comput. Sci. Secur., vol. 6, no. 2, 2012.
[6] R. K. Idowu, R. C. M, and Z. A. L. I. Othman, “Denial Of Service Attack Detection Using Trapezoidal Fuzzy Reasoning Spiking Neural P,” vol. 75, no. 3, pp. 397–404, 2015.
[7] E. Lee, “Detection Of Flooded Areas From Multitemporal Sar Images,” 2016 Second International Conference on Science Technology Engineering And Management, 2016.
[8] V. Shah and A. K. Aggarwal, “Heterogeneous fusion of IDS alerts for detecting DOS attacks,” Proc. 1st Int. Conf. Comput. Commun. Control Autom. (ICCUBEA 2015), pp. 153–158, 2015.
[9] A. Dewiyana and A. Hadi, “IDS Using Mitigation Rules Approach to Mitigate ICMP Attacks,” 2013.
[10] A. Saboor, M. Akhlaq, and B. Aslam, “Experimental Evaluation of Snort against DDoS Attacks under Different Hardware Configurations,” pp. 31–37, 2013.
[11] N. M. Lanke and C. H. R. Jacob, “Detection of DDOS Attacks Using Snort Detection,” vol. 2, no. 9, pp. 13–17, 2014.
[12] A. I. Technology, T. Nadu, and T. Nadu, “Flow Based Multi Feature

Inference Model For Detection Of DDoS Attacks In Network
Immune,” vol. 67, no. 2, pp. 519–526, 2014.
[13] S. Sharma, “On Selection of Attributes for Entropy Based Detection
of DDoS,” pp. 1096–1100, 2015.
[14] “Guide to Integrating Forensic Techniques into Incident Response.”
[15] “Introduction to Snort: A. Sniffer Mode,” pp. 1–11.
[16] H. Toumi, A. Eddaoui, and M. Talea, “Cooperative Intrusion
Detection System Framework Using Mobile Agents For Cloud
Computing,” vol. 70, no. 1, 2014.
[17] H. A. D. Eugene C. Ezin, “Java-Based Intrusion Detection System
in a Wired Network,” vol. 9, no. 11, 2011.
[18] B. Khadka, C. Withana, A. Alsadoon, and A. Elchouemi,
“Distributed Denial of Service attack on Cloud : Detection and
Prevention,” 2015.
[19] Nguyen, K., Tran, D., Ma, and Sharma, D., “An Approach to Detect Network Attacks Applied for Network Forensics,” 2014, pp. 655–660.


Adaptive scheme for application methods offloading in mobile cloud computing

Ahmed A. A. Gad-ElRab and Farouk A. Emara
Department of Mathematics, Faculty of Science, Al-Azhar University, Cairo, Egypt
[email protected], [email protected]

Abstract—Recently, many mobile applications consume high computing power and energy. Offloading schemes therefore move the computing power and data storage away from the mobile phone to a powerful cloud server with large storage space, by offloading methods or services to the cloud server. However, offloading might consume more energy than local processing of the data when the size of the code is small. This paper introduces a new offloading scheme based on decision-making approaches. The scheme considers each service or process in a mobile application as a set of methods and can offload a group of methods based on a defined cost model. Also, it enumerates the set of solutions based on decision-making approaches to find all feasible solutions which satisfy the cost model, and then selects the best feasible solution among them. The conducted simulation results show that the offloading performance of the proposed scheme is much better than the local processing scheme.

I. INTRODUCTION

Mobile Cloud Computing (MCC) refers to an infrastructure where both the data storage and the data processing happen outside of the mobile device. Mobile cloud applications move the computing power and data storage away from mobile phones to the cloud, which brings applications and mobile computing not just to smartphone users but to a much broader range of mobile subscribers [1]. Computation offloading moves intensive methods of mobile applications to run remotely on a rich resource such as the cloud. In the case of code compilation, offloading might consume more energy than local processing when the size of the code is small. So, offloading is not always an effective way to save the energy of a mobile device [2]. For example, when the size of the altered code after compilation is 500 KB, offloading consumes about 5% of a device's battery for its communication, while local processing consumes about 10% of the battery for its computation. In this case, offloading can save up to 50% of the battery. However, when the size of the altered code is 250 KB, the efficiency reduces to 30%. Also, computation offloading may require a large amount of data to be transferred at runtime, and then higher latencies may occur.

In recent years, a lot of studies have appeared to support remote execution for mobile applications on the cloud to increase performance and reduce energy consumption [3], [4]. Generally, there are two main approaches to performing remote execution. The first approach is to use full process or full VM (Virtual Machine) migration, as in CloneCloud [5]: the full process or VM can be migrated to the rich infrastructure to execute remotely. The second approach offloads only the intensive methods or services of applications to execute remotely [6], [7], [8]. This approach leads to large energy savings because it is fine-grained: only the sub-parts that benefit from remote execution are executed remotely [6], [7], [8].

To solve these problems, this paper proposes a method-based offloading scheme for mobile applications. The proposed scheme considers each service or process in a mobile application as a set of methods and can offload a group of methods based on a defined cost model. It enumerates the set of solutions based on decision-making approaches to find all feasible solutions which satisfy the cost model, and then selects the best feasible solution among them.

The rest of this paper is organized as follows: related work is introduced in Section II. Section III describes the offloading problem formulation in MCC. Section IV explains the proposed scheme. Section V introduces the application scenario using the proposed scheme. Section VI introduces the performance and qualitative evaluation of the proposed scheme, and Section VII concludes the paper.

II. RELATED WORK

In recent years, a lot of studies have appeared to support remote execution for mobile applications on the cloud [5], [6], [7], [8]. In the rest of this section, these related works are introduced in detail.

A. CloneCloud

CloneCloud was introduced by B. Chun [5] in 2011. The concept of CloneCloud is based on creating a virtual smartphone called a clone. The clone on the cloud has more hardware, software, network and energy resources, which provides a more suitable environment for processing complicated tasks. The partitioning mechanism in CloneCloud divides the application into software blocks based on whether they are intensive in energy consumption or


computing. Some of these blocks run on the Smartphone and others run on the clone. Once the virtual clone is available, the computing- or energy-intensive blocks are offloaded to the cloud for processing. Once those execution blocks have completed, the output is passed from the clone on the cloud back to the Smartphone.

The main disadvantages of the CloneCloud approach are: (1) access to native resources that are not already virtualized and are not available on the clone; (2) an offloading decision may not be correct if the offloaded task consumes more energy than running it locally; (3) computation offloading may require a large amount of data to be transferred at runtime, and then higher latencies may occur.

B. Giurgiu et al. Model

Giurgiu et al. [7] proposed a model that focuses on offloading intensive parts of applications to execute remotely on the cloud/server to optimize latency, data transfer delay and cost. The core of this model uses the R-OSGi [9] and AlfredO [10] frameworks for the management and deployment of applications. R-OSGi is an enhanced version of OSGi that supports multiple VMs residing on distributed servers, whereas the primary objective of OSGi is to assist with the decomposition and coupling of applications into modules, called bundles. The proposed model divides each mobile application into a presentation layer, a logical layer and a data access layer. AlfredO distributes the bundles of these layers between the Smartphone and the server. The bundles of the presentation layer reside on the Smartphone, while the bundles of the logical layer are distributed between the server and the Smartphone. Moreover, the bundles of the data layer are fully deployed on the server to minimize the data access delay.

The main disadvantages of Giurgiu et al. are: (1) the cost model has focused so far only on the mobile device and has assumed the server's resources to be infinite; (2) CPU consumption and energy consumption are not included in the optimization problem; (3) offloading a bundle needs high bandwidth.

C. MAUI Model

MAUI [6] enables developers to produce an initial partitioning of their applications with minimal effort. The developer simply annotates as remoteable those methods that the MAUI runtime should consider offloading to a MAUI server to minimize the energy consumption of mobile devices. In MAUI, the application partitioning is dynamic, and the offloading is done on the basis of methods instead of complete application modules to minimize the offloading delay. However, MAUI creates two versions of the smartphone application, for local and remote execution. In MAUI, the mobile device consists of three main components: solver interface, profiler and client proxy. The solver interface provides interaction with the solver and facilitates the offloading decision making. The profiler collects information regarding the application's energy consumption and data transfer requirements. The client proxy deals with the method offloading and data transfer. Similarly, the server side consists of a profiler, a server proxy, a solver and a controller. The profiler and server proxy work similarly to those on the smart phone. The solver is the main decision engine of MAUI; it holds the call graph of the applications and the scheduled methods. Lastly, the controller is responsible for authentication and resource allocation for incoming requests. However, single-method offloading is less beneficial compared to multiple-method offloading. Another weakness of MAUI is that if the programmer forgets to mark methods (for remote execution), MAUI will not be able to offload those methods. Also, MAUI does not consider the execution time in its optimization cost, although it can predict the execution of a method. Nevertheless, the MAUI profilers consume processing power, memory and energy, which is an overhead on the smart phone.

D. AMBEO

AMBEO [8] divides an application into three layers: (1) a presentation layer, which contains the user interface and resides on the smart phone; (2) a logical layer, which contains computation methods and is distributed between the cloud and the smart phone according to a determined optimal cost that takes the memory constraint into account; and (3) a data layer, which contains the data and the data access methods and is fully deployed on the cloud to minimize the data access over the data layer. Instead of offloading a whole service or a whole application to the cloud, AMBEO works with the methods of each service by adaptively offloading some of the logical-layer methods of a service based on a determined cost model. However, AMBEO decides the running of each method separately based on its local and remote costs, taking into account its needed memory and the available memory; if another method needs to run at the same time, the decision may be wrong.

III. OFFLOADING PROBLEM IN MCC

In MCC, the decision of computation offloading is an extremely complex process and is affected by the nature of the application. Therefore, the offloading problem is how services or methods can be offloaded such that mobile devices save their energy while keeping high service performance and minimum time delay.

A. Assumptions and Models

The MCC model consists of a mobile node MN and a cloud server node CS. MN can communicate with CS using advanced wireless technology to exchange data, applications, and services. In addition, the assumptions that must be met in this MCC model are: (1) developers can apply the Model-View-Controller (MVC) [11] design pattern explicitly and rigorously to isolate the application logical layer from the user interface and data layers of any mobile application; (2) any method that interacts with a user or needs to access device hardware belongs to the user interface; (3) the energy consumption of each hardware component of MN, such as the LCD, CPU and Wi-Fi, can be measured separately by using a measurement application model for the energy

333 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

consumption on a mobile device (e.g., Android phone) on the fly [12]; and (4) any mobile application consists of a set of methods, and each method consists of a set of instructions that can be determined at run time. The set of methods in the logical layer of a certain service which can be offloaded is denoted as SM = {m_i, 1 ≤ i ≤ n}. Each m_i ∈ SM has several metadata properties, such as memory cost mem_i and code size cod_i. The number of instructions in the code size cod_i of a method i is denoted as I. The data sizes needed for a method i to be sent or received are denoted by send_i and recv_i, respectively. In addition, the speed of executing any instruction by a mobile node is denoted by Speed_{MN}. Finally, there are n methods that can be offloaded for an application or a service.

B. Problem Formulation

To formulate the offloading problem in MCC, the local and offloading costs for methods are first modeled based on the aforementioned assumptions and models. The local execution time T_{i,local} for a service i can be determined using the number of instructions I and the mobile execution speed Speed_{MN} as follows:

    T_{i,local} = I / Speed_{MN}    (1)

The local energy consumption E_{i,local} for a service i can be determined using the local execution time and the local execution power:

    E_{i,local} = P_{i,local} * T_{i,local}    (2)

where P_{i,local} is the power for local execution per second. Here, a variable x_i is introduced for each method i to indicate whether the method is executed locally or remotely (x_i = 1 if method i runs remotely and x_i = 0 if it runs locally). Using x_i, the data sizes which will be sent, D_{i,s}, and received, D_{i,r}, are defined as follows:

    D_{i,s} = send_i * x_i    (3)

    D_{i,r} = recv_i * x_i    (4)

The time cost for offloading a method i to the remote cloud, T_{i,offload}, can be expressed as the sum of the waiting time for getting results from the cloud and the transfer time (including sending and receiving):

    T_{i,offload} = I / Speed_{cloud} + D_{i,s} / B_{i,s} + D_{i,r} / B_{i,r}    (5)

where Speed_{cloud} is the remote execution speed (cloud speed), and B_{i,s} and B_{i,r} are the bandwidths for sending and receiving, respectively.

The energy cost for offloading a method i to a remote cloud, E_{i,offload}, can be expressed as the sum of the energy consumed while waiting for results from the cloud, E_{i,idle}, and while transferring data (sending, E_{i,s}, and receiving, E_{i,r}):

    E_{i,offload} = E_{i,s} + E_{i,idle} + E_{i,r}    (6)

where E_{i,idle} is the product of the idle time of the mobile device while waiting for a result from the cloud, t_{i,idle}, and the idle power consumption per second, P_{i,idle}:

    E_{i,idle} = P_{i,idle} * t_{i,idle}    (7)

The energy consumption for sending data, E_{i,s}, is the product of the time for sending data from mobile to cloud, t_{i,s}, and the power consumption for sending per second, P_{i,s}:

    E_{i,s} = P_{i,s} * t_{i,s}    (8)

The energy consumption for receiving data from the cloud, E_{i,r}, is the product of the receive time t_{i,r} and the receive power per second P_{i,r}:

    E_{i,r} = P_{i,r} * t_{i,r}    (9)

The idle time of the mobile device while waiting for a result from the cloud server can be treated as the execution time on the remote cloud, so E_{i,offload} can be written as:

    E_{i,offload} = P_{i,idle} * I / Speed_{cloud} + P_{i,s} * D_{i,s} / B_{i,s} + P_{i,r} * D_{i,r} / B_{i,r}    (10)

By using Eq. 1 and Eq. 5, the total execution-time cost for n methods is

    C_{time} = Σ_{i=1}^{n} ( T_{i,local} * (1 − x_i) + T_{i,offload} * x_i )    (11)

Also, by using Eq. 2 and Eq. 10, the total energy-consumption cost for n methods is

    C_{energy} = Σ_{i=1}^{n} ( E_{i,local} * (1 − x_i) + E_{i,offload} * x_i )    (12)

Note that the memory cost on the mobile device for the methods that run locally can be calculated as follows:

    C_{memory} = Σ_{i=1}^{n} mem_i * (1 − x_i)    (13)

In addition, the data transfer cost for remote execution of n methods includes the transfer cost between related methods which are not at the same execution location (i.e., the output of one method is an input of another); it is determined by the following equation:

    C_{transfer} = Σ_{i=1}^{n} cod_i * x_i + Σ_{i=1}^{n} Σ_{j=1}^{k} tr_i * (x_i XOR x_j)    (14)

where k is the number of related methods. By using Equations 11, 12, 13, and 14, the total overall cost for n methods can be written as:

    C_{total} = C_{transfer} * W_{tr} + C_{memory} * W_{mem} + C_{energy} * W_{energy} + C_{time} * W_{time}    (15)
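As a concrete illustration, the per-method cost terms of Eqs. 1, 2, 5 and 10 can be sketched in C++ (the paper's stated implementation language); the struct layout, field names and the sample numbers in the test are illustrative assumptions, not values from the paper's simulator.

```cpp
#include <cassert>

// Per-method inputs of the cost model.
struct Method {
    double I;     // number of instructions
    double send;  // data to send to the cloud (KB)
    double recv;  // data to receive from the cloud (KB)
};

// Device, cloud and network parameters.
struct Env {
    double speedMN;     // mobile execution speed (instructions/s)
    double speedCloud;  // cloud execution speed (instructions/s)
    double Bs, Br;      // send / receive bandwidth (KB/s)
    double Plocal;      // local execution power per second
    double Pidle;       // idle power per second while waiting
    double Ps, Pr;      // send / receive power per second
};

// Eq. 1: local execution time.
double timeLocal(const Method& m, const Env& e) { return m.I / e.speedMN; }

// Eq. 2: local energy consumption.
double energyLocal(const Method& m, const Env& e) { return e.Plocal * timeLocal(m, e); }

// Eq. 5: offloading time = cloud execution + sending + receiving.
double timeOffload(const Method& m, const Env& e) {
    return m.I / e.speedCloud + m.send / e.Bs + m.recv / e.Br;
}

// Eq. 10: offloading energy = idle wait + sending + receiving.
double energyOffload(const Method& m, const Env& e) {
    return e.Pidle * m.I / e.speedCloud + e.Ps * m.send / e.Bs + e.Pr * m.recv / e.Br;
}
```

With a slow mobile processor and a much faster cloud, a compute-heavy method already satisfies timeOffload < timeLocal at moderate bandwidths, which is exactly the trade-off Eqs. 11 and 12 aggregate over all n methods.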


where W_{tr}, W_{energy}, W_{time}, W_{mem} are the weights of the transfer, energy, time, and memory costs, respectively. These weights represent the importance of these costs in the offloading process, and they must sum to 1:

    W_{tr} + W_{mem} + W_{energy} + W_{time} = 1    (16)

Here, the solution x_1, x_2, ..., x_n represents the required offloading partitioning of the application.

The objective of the offloading problem in MCC is to minimize the overall cost C_{total} as much as possible while taking into account the resource constraints of the mobile device. So, the objective function of this problem can be written as follows:

    min_{x_i ∈ {0,1}} C_{total}    (17)

such that

    Σ_{i=1}^{n} mem_i * (1 − x_i) ≤ avail_{memory}    (18)

    C_{energy} ≤ avail_{energy}    (19)

where constraint 18 means that the memory cost of the resident methods cannot exceed the available memory on the mobile device, and constraint 19 means that the energy cost cannot exceed the available energy of the mobile device.

IV. PROPOSED SCHEME

Here, a new scheme called Application Decision Making-Based Scheme for Method Offloading (ADBMO) is proposed to solve the offloading problem formulated in the previous section.

A. Basic Idea

The basic idea of ADBMO rests on five points: (1) dividing each mobile application into three layers: a presentation layer, a logical layer and a data access layer; (2) considering each service or process in each layer as a set of methods; (3) the methods of the presentation layer reside on the mobile device; (4) the methods of the data layer are fully deployed on the cloud to minimize data access; and (5) the methods of the logical layer are distributed between a cloud server and a mobile device using a decision tree that enumerates the set of candidate solutions based on decision-making approaches, finds all feasible solutions which satisfy the cost model, and then selects the best feasible solution among them, as shown in Fig. 1.

Fig. 1: Decision tree (each method m_1, ..., m_n branches on its local cost vs. its offloading cost)

B. Problem Reformulated

By using the decision tree, the problem can be reformulated as finding the path (solution) with minimal total cost, as follows.

Objective: find the minimal-total-cost path of n methods, MP_{bf}(n), such that

    MP_{bf}(n) ∈ FP(n)    (20)

    C_{total,MP_{bf}(n)} = min { C_{total,fp_h(n)} } ∀ fp_h(n) ∈ FP(n)    (21)

    Σ_{i=1}^{n} mem_{i,MP_{bf}(n)} ≤ avail_{memory}    (22)

where FP(n) = {fp_h(n) : 1 ≤ h ≤ H} is the set of all feasible paths (solutions) of n methods in the decision tree and H is the number of feasible paths. Constraint 21 means that the selected path has the minimum cost among all feasible paths. Constraint 22 means that the memory cost of the resident methods cannot exceed the available memory on the mobile device, where mem_{i,fp_h(n)} = 0 if the method m_i in this path will be offloaded to the cloud server.

C. Proposed Algorithm

ADBMO consists of two phases based on its basic idea, as follows.

• Phase 1: Profile phase, which determines at run time the current state of the mobile device, the cloud server, each method in the logical layer, and the network conditions such as bandwidth. Before each method is invoked, ADBMO determines whether the method invocation should run locally or remotely. ADBMO measures the characteristics of the mobile device and the cloud server at initialization time and continuously monitors the network characteristics, because these can often change and a stale measurement may force the algorithm to make the wrong decision on whether a method should be offloaded or not. Therefore, the profile phase contains four profiling components:
  – Device Profiling: In this profiling, ADBMO determines the energy consumption of the mobile device by using a measurement application model for the energy consumption on a mobile device (e.g., an Android phone) on the fly, such as PowerTutor [12]. In addition, it measures the processor speed, the available memory of the mobile, the battery consumption, the amount of data transferred and the memory used.
  – Cloud Profiling: In this profiling, ADBMO determines the value of the processor speed of a cloud


server. ADBMO assumes that this value can be determined by some means from the cloud.
  – Method Profiling: In this profiling, ADBMO determines whether a method belongs to the presentation layer, the logical layer or the data layer by using metadata stored in a manifest file and Java reflection, such as OSGi [13], which has traditionally been used to decompose and loosely couple Java applications into software modules. Then, for each method that belongs to the logical layer, it determines the characteristics of the method, such as code size, memory used and amount of data transferred.
  – Network Profiling: In this profiling, ADBMO monitors the network and gathers all information about it, such as the Internet connection availability of the mobile device and the current bandwidth of the network, to show its quality.

Fig. 2: ADBMO phases

• Phase 2: Decision phase, which determines whether each method will run on the mobile device or on the cloud, based on the information calculated in the profile phase and on the local and offloading costs described in Section III. In this phase, the offloading decision of a method depends on four factors: 1) the characteristics of the mobile device, 2) the characteristics of the cloud, 3) the characteristics of the methods, and 4) the characteristics of the network, such as the network bandwidth. For each method, there are two execution cases: the first execution time case, which means that the method will be run for the first time, and the consequent execution time case, which means that the method will be run for the second time or more. These two cases are described as follows.
  – First execution time case: If a method will be run for the first time, the methods of the logical layer are handled as follows: (1) If a disconnection occurs, ADBMO runs the method on the mobile device; in this case, the application's energy consumption only incurs a small penalty cost due to offloading to the cloud. (2) If a cloud server is available, ADBMO executes the following steps: (i) calculating the local time and energy consumption costs by using Eqs. 1 and 2, as well as the local memory cost for this method; (ii) calculating the offloading time and energy consumption costs by using Eqs. 5 and 10, as well as the data transfer delay cost, which is represented by the sum of the code size and the receiving and sending data costs; (iii) enumerating the set of solutions using the decision tree, in which each node represents a method, each method has two edges (one for the local cost and another for the offloading cost), and paths from the root to the leaves represent the set of solutions; (iv) deleting the unfeasible solutions, i.e., those whose total memory cost of resident methods is larger than the available memory; (v) finding the best feasible path, which has the best value among all feasible paths; and (vi) saving the offloading decision (i.e., the value of x_i) and the actual execution time of the method, T_act. This actual execution time can be used to determine the actual number of instructions, I_act, of this method with respect to the number of instructions of a method defined in Eq. 1 or Eq. 5, according to x_i.
  – Consequent execution time case: This means that a method will be run for the second or more time. In this case, there are two cases for the model parameters, such as the available memory or the network bandwidth. (1) Changed case: the values of these parameters have changed from their values in the previous run. In this case, ADBMO repeats the two phases, the profiling and decision phases, as in the first execution. (2) Unchanged case: the values of these parameters have not changed from their values in the previous run. In this case, ADBMO compares the number of instructions of the method, I (determined in the method profiling step), and the actual number of instructions, I_act, to determine the degree of change of these parameters. If the result is in the interval [0.0, 0.2], the change is called Low; if the result is in the interval [0.21, 0.6], the change is called Medium; otherwise the change is called High. In case of a Low change, ADBMO takes the same decision as in the previous execution, while in case of a Medium or High change, ADBMO repeats its two phases as in the first execution, as shown in Fig. 2.

V. APPLICATION SCENARIO: MOBILE FACE RECOGNITION SYSTEM

Identification of people is a major challenge faced by the visually impaired. The increase in the computation capability of cloud servers and mobile devices gives motivation to develop applications that can assist visually impaired persons. Here, the proposed face detection and recognition application is designed to take advantage of ADBMO and is intended to assist visually impaired users in locating and identifying people that they know, as described below.
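Before moving to the application layers, the Low/Medium/High test of the consequent-execution case can be made concrete. The thresholds come from the text above; the relative-difference ratio between I and I_act is an assumed formula for illustration, since the paper does not spell out the exact expression:

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Classify the degree of change between the profiled instruction count I and
// the actual count Iact recovered from the measured execution time (via Eq. 1
// or Eq. 5, depending on where the method ran). The ratio below is an assumed
// relative-difference measure; the paper only specifies the thresholds.
std::string changeDegree(double I, double Iact) {
    double d = std::fabs(I - Iact) / I;
    if (d <= 0.2) return "Low";     // reuse the previous offloading decision
    if (d <= 0.6) return "Medium";  // re-run the profile and decision phases
    return "High";                  // re-run the profile and decision phases
}
```

A Low result lets ADBMO reuse the cached decision and skip both phases entirely, which is the point of distinguishing the consequent-execution case.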


Presentation layer (user interface): in this layer, the application accesses the video feed from the device's camera and, after the person is identified, displays the results to the user. This layer runs on the mobile device.

Logical layer (detection): the main goal of this layer is to detect faces. Face detection can be regarded as a specific case of object-class detection, in which the task is to find the locations and sizes of all objects in an image that belong to a given class. Face detection algorithms focus on the detection of frontal human faces. When a face is detected, a bounding box is drawn around it. This bounding box is used to extract and save the face of a person by cropping the area inside it. Once a face is detected, the next detection is performed after a delay to avoid overwhelming the user with constant detection notifications. The methods of this layer need high computation resources and consume time and energy, so all methods in this layer are distributed between a cloud server and a mobile device by using ADBMO.

Data layer (recognition): this layer complements the detection by identifying the person using the detected face. The image is matched against the images stored in a database. This is done by running the recognition program, which searches the internal database for a match using the saved face image as input. This layer has two states, offline and online, based on the status of the network connection. In the offline state, after a face is detected, a temporary image is captured and saved. After the image is saved, it is used to identify the person by searching for a possible match in the application's internal database. As soon as the person is identified, the result is displayed to the user. In the online state, the video feed is still accessed for scanning and detecting faces, and a temporary image is captured and saved when a face is detected, in the same manner as in the offline state. However, instead of attempting to identify the person locally, the image is sent to the cloud servers for identification. After the person is identified, the results are sent back to the application and displayed to the user.

VI. PERFORMANCE EVALUATION

Firstly, ADBMO is implemented using the C++ programming language. Because the results depend on the number of methods of a mobile cloud application and on the available memory, different scenarios are generated for the number of methods and for the available memory (two scenarios). The simulation parameters, such as the mobile processor speed, the cloud processor speed, the power consumed by the mobile in the idle case, and the power consumed by the mobile for sending and receiving data, are shown in Table I.

TABLE I: Simulation Parameters.

Parameter | Value
Mobile processor speed | 0.6 GHz
Cloud processor speed | 2.8 GHz
Consumed power by mobile in idle case | 0.89 J
Consumed power by mobile for sending data | 1.6 J
Consumed power by mobile for receiving data | 1.6 J
Bandwidth | 3 Mbps

A. Simulation Results and Analysis

• Scenario A: In this scenario, ADBMO uses different numbers of methods to compare the cost of running a method locally, the cost of running it by using AMBEO, and the cost of running it using ADBMO. The numbers of methods are (5, 7, 10, 15, 25), the available memory is 128 MB, and the code size of method i is 100 + 50*(i − 1) KB. Fig. 3 shows the cost of running methods on the local device, by using AMBEO and by using ADBMO, against the number of methods. Fig. 4 shows that the total memory cost of the first 10 methods is 3575 KB, which is less than the available memory. The cost of running these methods locally is less than running them on the cloud server, so both AMBEO and ADBMO decide to run them on the mobile. Starting from method M11, the offloading cost is less than the local cost, so ADBMO decides to run these methods on a cloud server. From Fig. 3 and Fig. 4, note that the total cost of running the different numbers of methods at the same time using the ADBMO algorithm is equal to or less than running locally, and so is the total memory cost. Also, we note that the total memory cost of running 25 methods locally at the same time is 19250 KB, which is higher than the maximum available memory; but when running by using ADBMO, the total memory cost is 3575 KB and the total cost is less than running locally.

Fig. 3: Cost vs. number of methods

Fig. 4: Memory cost vs. number of methods
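The Scenario A behaviour (run the cheap methods locally and offload from the point where offloading becomes cheaper, subject to the memory constraint) can be reproduced with a brute-force search over all decision-tree paths x ∈ {0,1}^n, as formulated in Section IV-B. The sketch below uses assumed per-method cost arrays, not the simulator's actual data:

```cpp
#include <cassert>
#include <vector>

// Exhaustively search every path of the decision tree (bit i of x = 1 means
// "offload method i"), discard paths whose resident-memory cost exceeds the
// available memory (Eqs. 18/22), and return the cheapest feasible path.
int bestPartition(const std::vector<double>& localCost,
                  const std::vector<double>& offloadCost,
                  const std::vector<double>& mem,
                  double availMemory) {
    const int n = static_cast<int>(localCost.size());
    int best = -1;
    double bestCost = 0.0;
    for (int x = 0; x < (1 << n); ++x) {
        double cost = 0.0, residentMem = 0.0;
        for (int i = 0; i < n; ++i) {
            if (x & (1 << i)) {
                cost += offloadCost[i];          // x_i = 1: run on the cloud
            } else {
                cost += localCost[i];            // x_i = 0: run on the mobile
                residentMem += mem[i];           // method stays resident
            }
        }
        if (residentMem > availMemory) continue; // infeasible path
        if (best == -1 || cost < bestCost) { bestCost = cost; best = x; }
    }
    return best; // -1 if no feasible path exists
}
```

For two methods where the second is cheaper to offload, the search returns x = 10 in binary (method 1 local, method 2 offloaded); tightening availMemory forces every method to the cloud, mirroring the 25-method case above.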


• Scenario B: In this scenario, 15 methods are used, each method M_i has code size 100 + 50*(i − 1) KB, and different sizes of available memory are used to compare the cost of running a method locally, the cost of running it by using AMBEO, and the cost of running it using ADBMO. Fig. 5 shows the cost of running methods on the local device, by using AMBEO and by using ADBMO, against the available memory. It is clear that running a method by using AMBEO is better than using ADBMO or running locally. But Fig. 6 makes it clear that if all methods run at the same time, the memory cost reaches the maximum available memory, and the algorithm that runs the methods correctly is ADBMO.

Fig. 5: Cost vs. available memory

Fig. 6: Memory cost vs. available memory

B. Qualitative Comparison

In this section, ADBMO is compared qualitatively with some existing approaches according to the following criteria:
• Adaptive with network change (ANC): the approach continuously monitors the network and is adaptive to network disconnection.
• Saving energy (SE): the cost model includes an energy consumption cost parameter.
• Memory cost (MC): the cost model includes a memory cost parameter and takes into account the available memory of the mobile device.
• Offloading level (OffL): the offloaded entities which need to move into the cloud are class objects, threads, software modules, or methods.
• Prediction of second execution (PSE): whether the algorithm can predict the second execution.
• Minimize data transfer (MDT): the cost model includes the amount of data transfer to avoid data traffic.
• Multiple methods offloading (MMO): the algorithm takes the offloading decision for a group of methods together rather than for every method alone.

TABLE II: Qualitative Comparison

Approach | ANC | SE | MC | OffL | PSE | MDT | MMO
[5] | No | No | No | threads | No | No | -
[7] | Yes | No | Yes | software modules (bundles) | No | Yes | -
[6] | Yes | Yes | No | methods | Yes | No | -
[8] | Yes | Yes | Yes | methods | Yes | Yes | No
ADBMO | Yes | Yes | Yes | methods | Yes | Yes | Yes

According to the qualitative parameters, the best criteria are as follows: ANC is Yes, SE is Yes, MC is Yes, OffL is methods, PSE is Yes, and MDT is Yes. The qualitative evaluation is shown in Table II. As shown in Table II, ADBMO satisfies all requirements of the best criteria among the existing approaches.

VII. CONCLUSION

In this paper, the offloading problem in mobile cloud computing and the studies that have appeared to support remote execution of mobile applications on the cloud are introduced. In addition, a new offloading algorithm called ADBMO is proposed. ADBMO provides method-level code offloading, which improves performance and saves energy on the mobile device, and it deals with the computation offloading challenges. ADBMO can decide, for a group of methods that will run at the same time, which methods will run on the local device and which must be offloaded to the cloud, based on their running costs and the available memory. The performance of the ADBMO algorithm has been evaluated through extensive simulation with different values of available memory and different numbers of methods. The simulation results demonstrate that ADBMO is better than AMBEO and other existing models when several methods run at the same time.

REFERENCES

[1] H. T. Dinh, C. Lee, D. Niyato, and P. Wang, "A survey of mobile cloud computing: architecture, applications, and approaches," Wireless Communications and Mobile Computing, vol. 13, no. 18, pp. 1587–1611, 2013.
[2] K. Kumar and Y.-H. Lu, "Cloud computing for mobile users: Can offloading computation save energy?" Computer, vol. 43, no. 4, pp. 51–56, 2010.
[3] M. Shiraz, A. Gani, R. H. Khokhar, and R. Buyya, "A review on distributed application processing frameworks in smart mobile devices for mobile cloud computing," Communications Surveys & Tutorials, IEEE, vol. 15, no. 3, pp. 1294–1313, 2013.
[4] A. Khan, M. Othman, S. Madani, and S. Khan, "A survey of mobile cloud computing application models," 2013.


[5] B.-G. Chun and P. Maniatis, "Augmented smartphone applications through clone cloud execution," in HotOS, vol. 9, 2009, pp. 8–11.
[6] E. Cuervo, A. Balasubramanian, D.-k. Cho, A. Wolman, S. Saroiu, R. Chandra, and P. Bahl, "MAUI: making smartphones last longer with code offload," in Proceedings of the 8th International Conference on Mobile Systems, Applications, and Services. ACM, 2010, pp. 49–62.
[7] I. Giurgiu, O. Riva, D. Juric, I. Krivulev, and G. Alonso, "Calling the cloud: enabling mobile phones as interfaces to cloud applications," in Middleware 2009. Springer, 2009, pp. 83–102.
[8] A. A. Gad-ElRab, T. Alzohairy, and F. A. Emara, "Application method-based efficient offloading scheme in mobile cloud computing," International Journal of Computer Applications, vol. 132, no. 3, pp. 1–8, December 2015. Published by Foundation of Computer Science (FCS), NY, USA.
[9] J. S. Rellermeyer, G. Alonso, and T. Roscoe, "R-OSGi: distributed applications through software modularization," in Proceedings of the ACM/IFIP/USENIX 2007 International Conference on Middleware. Springer-Verlag New York, Inc., 2007, pp. 1–20.
[10] J. S. Rellermeyer, O. Riva, and G. Alonso, "AlfredO: an architecture for flexible interaction with electronic devices," in Proceedings of the 9th ACM/IFIP/USENIX International Conference on Middleware. Springer-Verlag New York, Inc., 2008, pp. 22–41.
[11] G. E. Krasner, S. T. Pope et al., "A description of the model-view-controller user interface paradigm in the Smalltalk-80 system," Journal of Object Oriented Programming, vol. 1, no. 3, pp. 26–49, 1988.
[12] L. Zhang, B. Tiwana, Z. Qian, Z. Wang, R. P. Dick, Z. M. Mao, and L. Yang, "Accurate online power estimation and automatic battery behavior based power model generation for smartphones," in Proceedings of the Eighth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis. ACM, 2010, pp. 105–114.
[13] O. Alliance, "OSGi service platform, core specification, release 4, version 4.1," 2007.


Spectral Unmixing from Hyperspectral Imagery Using Modified Gram Schmidt Orthogonalization and NMF

Neetu N. Gyanchandani, Research Scholar, Department of Electronics Engineering, GHRCE, Nagpur, India
Dr. A. A. Khurshid, Professor, Department of Electronics Engineering, RCOEM, Nagpur, India
Dr. Sanjay Dorle, HOD, Department of Electronics Engineering, GHRCE, Nagpur, India

Abstract— In a hyperspectral image, every pixel shows some mixed combination of endmembers, and in order to find those endmembers, their abundances, and hence the respective spectral signatures, unmixing is an excellent approach. Atmospheric interferers are a potential source of errors in spectral unmixing; therefore, it becomes very challenging to identify the endmembers. The problem usually faced in extracting information from hyperspectral data is the presence of mixed spectral information. This problem can be solved by making use of an unmixing technique. The ultimate goal is to devise a feasible and effective method for unmixing based on modified Gram Schmidt orthogonalization, LSMA and NMF.

Keywords- hyperspectral image, abundance, NMF, endmembers, LSMA, unmixing.

I. INTRODUCTION

Information regarding the materials present in a hyperspectral scene can be extracted from the spectral properties of those materials. With this information, analysts can separate or identify objects within the scene. Hyperspectral data can be interpreted in two ways: in a purely physical manner or in a statistical manner. In the physical interpretation, a physical variable (like radiance) is presented to the observer by each pixel. For further analysis of the physical interpretation, scientific methodologies are used; for example, for the identification of particular objects, their spectral properties are matched with absorption properties. But not all scenes of data are atmospherically corrected and properly calibrated. Also, it is not necessary that all analysts have a spectroscopy background, and dealing accurately with millions of spectra per image is difficult. So, automated systems are required for the identification and exploitation of hyperspectral images.

In the case of the statistical approach, the variables of the data under analysis are considered as statistical variables, which tends to reduce the complexity. The statistical approach used in Orthogonal Subspace Projections [1] combines redundant bands and tends to reduce the dimensionality of the data. The resulting image has only a mathematical relation, like an eigenvector, with the input image and does not carry any physical relation.

Sometimes both of the discussed methods go hand in hand, as in the case of endmember spectra determination and mixing, where the image is transformed into principal components statistically, and then the selection of the groups characterizing the image is done manually. The selected groups indicate pure pixels, or endmembers, with unique spectral properties, and the rest of the pixels are considered mixed pixels. A mixture of two or more individual spectral signatures is known as a mixed pixel. Material maps are produced, and the resulting images give the fractional abundance of the spectrum of each endmember. The advantage of this is that it deals with both radiance and physical axes, or it transforms data from the radiance axis to the physical one. So, automated unmixing methods are preferred to reduce the complexity of finding endmembers.

Different endmember extraction techniques include the Pixel Purity Index [2], where dimensionality reduction is applied to the original data, resulting in skewers. With the Pixel Purity Index it is not possible to get the final list of endmembers, since the selection of skewers is done randomly [3]. In the N-FINDR method, a pixel-detecting simplex is generated, but the disadvantage is that the recalculation of the pixel volume increases the computation of the algorithm and also leads to sensitivity to noise [4]. In VCA, the principle that endmembers are vertices of a simplex is used. Here the positive cone of the hyperspectral data is projected on a hyperplane and generates a simplex with vertices. VCA determines the number of endmembers and solves the computational complexity, but the


final selection of endmembers is difficult. VCA has been further modified as modified vector component analysis [5]. SGA (Simplex Growing Algorithm) overcomes drawbacks of N-FINDR such as the determination of the number of endmembers [1], the unpredictable final selection set of endmembers and the computational complications. It starts with two vertices and measures vector by vector, growing a simplex; finally the vertices reach the number of endmembers [6]. NMF (Non-negative Matrix Factorization) does not rely on the pure pixel assumption. It gives two non-negative low-rank matrix factors that are applied to spectral unmixing. The resultant matrices of NMF give a more spontaneous interpretation. NMF is greatly used in spectral unmixing applications because it can simultaneously estimate endmembers and abundances. NMF is fast and efficient, and it shows a robust approach to noise. ORASIS (An Optical Real-time Adaptive Spectral Identification System) [12] is an autonomous endmember finding algorithm based on simplex shrinking; in it, a modified Gram Schmidt algorithm is used to factor the data matrix.

This paper is organized as follows. Section II presents the method of endmember extraction with the Gram Schmidt orthogonalization algorithm. Section III describes LSMA for unmixing, which gives the abundance vectors. Section IV explains the NMF method for solving the non-negativity constraint and gives details of the abundance estimation. Section V presents the experimental results. Lastly, Section VI gives the conclusions.

II. ENDMEMBER EXTRACTION

The general approach to unmixing hyperspectral data consists of endmember extraction followed by the estimation of fractional abundances. Here, endmember extraction is based on the modified Gram Schmidt orthogonalization used in the FUN algorithm [7]. The Gram Schmidt algorithm for endmember extraction has low computational complexity, since fewer matrix calculations are involved and there are relatively few complications, which improves the speed of performance. Along with this, previously computed information can be reused. In this compression algorithm, the modified Gram Schmidt orthogonalization method is used only for the generation of the endmembers and their reflectance spectra, and the unmixing part is done by LSMA and NMF.

The modified Gram Schmidt orthogonalization method [7], when applied to a hyperspectral dataset, calculates a specific number of endmembers as per the flow of the algorithm by checking pixel vectors against a stopping condition. Here the user can select the number of endmembers required depending upon the application and the required compression ratio, since the compression ratio is inversely proportional to the number of endmembers or targets selected. Along with the endmembers, the elapsed time required for the detection of each endmember is calculated, and the variance for each endmember is generated. The different steps involved are depicted in the flow chart (Fig. 1).

III. LSMA

Irrelevant information apart from the endmembers can be compressed. Let P be the number of targets and L the number of spectral bands. If P two-dimensional grayscale fractional images are coded and can represent the L bands of data, good compression can be achieved, since P < L. This can be achieved by LSMA with NMF. Linear spectral mixture analysis [8] is used for unmixing here.

Fig. 1 flow, step by step:
1. Extract the first endmember based on a centroid pixel.
2. Calculate an orthogonal projection of all pixels and save it in a matrix.
3. Calculate the stopping factor for each pixel in the matrix.
4. Extract a new endmember: select a pixel having a high orthogonal projection with respect to the previous endmember.
5. Calculate the percentage of information lost (a vector).
6. Compare the maximum value of the vector with α (generally 1).
7. If α is less than or equal to the maximum value of the vector, the pixel representing the maximum value is an endmember; go back for the next endmember extraction. Otherwise, the process is complete.
8. Output: 1. number of endmembers; 2. computation time for each endmember; 3. variance for each endmember; 4. reflectance spectra for each endmember; 5. orthogonalized endmembers; 6. orthonormalized endmembers.

Fig. 1: Design flow for the modified Gram Schmidt orthogonalization endmember extraction algorithm

Notation:
r: column pixel vector, L×1
P: number of targets [t1, t2, ..., tP]
M: target signature matrix [m1, m2, ..., mP], L×P

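As an illustration, the extraction loop of Fig.1 can be sketched in NumPy (a hypothetical sketch, not the authors' implementation: the first endmember is taken here as the pixel nearest the data centroid, and the residual norm after orthogonal projection plays the role of the "information lost" vector compared against α):

```python
import numpy as np

def extract_endmembers(X, alpha=1.0, max_p=10):
    """Greedy endmember extraction via Gram-Schmidt-style orthogonal
    projection. X: (L, N) matrix of N pixel vectors over L bands.
    Returns the column indices of the selected endmember pixels."""
    # First endmember: the pixel closest to the data centroid.
    centroid = X.mean(axis=1, keepdims=True)
    first = int(np.argmin(np.linalg.norm(X - centroid, axis=0)))
    idx = [first]
    Q = X[:, [first]] / np.linalg.norm(X[:, first])  # orthonormal basis
    for _ in range(max_p - 1):
        # Residual of every pixel after projection onto span(Q).
        R = X - Q @ (Q.T @ X)
        scores = np.linalg.norm(R, axis=0)           # "information lost"
        cand = int(np.argmax(scores))
        if scores[cand] < alpha:                     # stopping condition
            break
        idx.append(cand)
        q = R[:, cand] / scores[cand]                # Gram-Schmidt step
        Q = np.column_stack([Q, q])
    return idx
```

Each accepted pixel contributes one new orthonormal direction, so the loop also yields the orthogonalised and orthonormalised endmembers listed in the algorithm's output.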
341 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

III. LSMA

Assume
α = (α1, α2, ..., αp)^T (P×1),
so that
r = Mα + n.

r is a linear mixture of the target spectral signatures with mixing coefficients (α1, α2, ..., αp). Fractional images are grayscale images representing the abundance fractions of the mixed objects. LSMA generates the fractional images and the reflectance spectra with respect to the endmembers generated in the previous stage.

IV. NMF

For the smooth working of LSMA, two constraints are required to be imposed: the sum-to-one constraint and the nonnegativity constraint. Various methods have been developed for handling the nonnegativity constraint [11], but nonnegative matrix factorization provides better lower-rank approximations by factors. Bro and De Jong [9] developed an active set method to solve NNLS problems. NMF is used for the reduction of high-dimensional data.

Let the input matrix A, with every element nonnegative, be
A ∈ R^(m×n), with integer k < min(m, n).

NMF finds two factors W and H,
W ∈ R^(m×k), H ∈ R^(k×n),
where
W: bases matrix (m×k)
H: coefficient matrix (k×n)
A: input data matrix (m×n)
k: target low rank

such that
A ≈ WH,
min f(W, H) = (1/2)‖A − WH‖²_F, subject to W ≥ 0, H ≥ 0.

For this, the Euclidean (Frobenius) distance is used; W and H are nonnegative.

The active set method has the limitation that only one variable can be exchanged between working sets per iteration; as the number of known variables increases, the process becomes slow. To overcome this, NMF with the block pivoting method [10] is preferred.

V. EXPERIMENTAL RESULTS AND CONCLUSION

The experimental hyperspectral dataset used was generated by the SAMSON sensor, the instrument flown during the collect. It is a push-broom, visible to near-IR hyperspectral sensor covering the spectral range of 400nm-900nm with a bandwidth of 3.2nm. The data was collected by the Florida Environmental Research Institute, with 952 lines, 156 bands and 952 samples (https://opticks/sampledata/samson/). A few bands of the image read in MATLAB are shown in Fig.2.

Fig.2(a): Band 1    Fig.2(b): Band 10
Fig.2(c): Band 30   Fig.2(d): Band 50
Fig.2(e): Band 115  Fig.2(f): Band 156

Reflectance signatures for the six endmembers are given below.
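Given the target signature matrix M from Section II, the mixing model r = Mα + n of Section III can be inverted per pixel by ordinary least squares; the following is a minimal NumPy sketch with hypothetical shapes (the constrained estimation, with sum-to-one and nonnegativity, is what Section IV's NMF machinery addresses):

```python
import numpy as np

def unmix_lsma(M, R):
    """Unconstrained least-squares unmixing.
    M: (L, P) endmember signatures, R: (L, N) pixel vectors.
    Returns A: (P, N), one abundance column per pixel; each row of A,
    reshaped to the image grid, is one grayscale fractional image."""
    A, *_ = np.linalg.lstsq(M, R, rcond=None)
    return A
```

Coding the P rows of A instead of the L original bands is what makes the subsequent compression possible, since P < L.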



Fig.3(a): Reflectance spectrum for endmember 1
Fig.3(b): Reflectance spectrum for endmember 2
Fig.3(c): Reflectance spectrum for endmember 3
Fig.3(d): Reflectance spectrum for endmember 4

Out of the total 156 bands of the input image, a few bands are presented in Fig.2. The outputs given by the modified Gram-Schmidt orthogonalisation endmember extraction algorithm are the endmembers, the orthogonalised endmembers (Q) and the orthonormalised endmembers (U). The detected endmembers have their specific spectral signatures, which are plots of radiance against spectral band; Fig.3 shows the spectral signatures of all extracted endmembers. The variance and elapsed time for each endmember are also calculated, as shown in Table 1. Endmember details such as the pixel value and the orthogonalised and orthonormalised vectors are given as input to LSMA. The grayscale fractional abundance images which have to be coded are given by LSMA, as shown in Fig.4.

When there are many attributes, or they are ambiguous in nature with weak predictability, NMF can be used, because NMF utilizes multivariate analysis with linear algebra and can produce effective patterns. Here the input data matrix A is decomposed into two lower-rank matrices W and H: W carries the bases, whereas H carries the associated coefficients or weights. W and H are modified in an iterative manner so that their product approaches A. Nonnegative values of sparse bases and sparse weightings can be obtained, and the original data structure is preserved. Since NMF is a dimensionality reduction method, it further helps in compression. Here, the nonnegative matrix factorization (NMF) method is used to solve the spectral unmixing problem posed by the nonnegativity constraint on the abundance fractions.
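The iterative modification of W and H so that WH approaches A can be illustrated with the classical multiplicative updates of Lee and Seung [11] for the Frobenius objective (shown only as a sketch; the block pivoting NMF of [10] mentioned above converges faster):

```python
import numpy as np

def nmf(A, k, iters=500, eps=1e-9, seed=0):
    """Basic NMF: A (m x n, nonnegative) ~ W (m x k) @ H (k x n).
    Multiplicative updates keep W and H nonnegative at every step."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)   # update coefficients
        W *= (A @ H.T) / (W @ H @ H.T + eps)   # update bases
    return W, H
```

Because the updates only multiply by nonnegative ratios, nonnegativity of W and H is preserved automatically, which is exactly why NMF suits the abundance nonnegativity constraint.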


Table:1

Fig.3(e): Reflectance spectrum for endmember 5

Fig.4 Fractional abundance images

Here the centroid method is used for the first endmember selection [15]. From Table 1 it can be seen that the variance is maximum for endmember 1 and that the elapsed time it requires is the least. The results depicted in Table 1 and the reflectance spectra shown in Fig.3 are for 6 endmembers. The number of endmembers can be selected depending on the requirements of the application: compression, the classification of endmembers, and mixed or pure pixel identification. The abundance fractional images obtained here can be further utilized for image compression.

Fig.3(f): Reflectance spectrum for endmember 6
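As a rough illustration of the leverage involved (a naive band-count figure that ignores the overhead of transmitting the endmember spectra themselves), coding P = 6 fractional abundance images in place of the L = 156 SAMSON bands corresponds to a ratio of 156/6 = 26, and the ratio grows as fewer endmembers are kept:

```python
# Naive band-count compression ratio for LSMA-based coding:
# P abundance images are transmitted instead of L spectral bands.
def band_ratio(num_bands, num_endmembers):
    return num_bands / num_endmembers

print(band_ratio(156, 6))  # 26.0 for the 6-endmember SAMSON example
print(band_ratio(156, 3))  # 52.0 with only 3 endmembers
```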

Endmember     Variance                Elapsed time             Endmember vector
Endmember1    54.2969760141343e-006   98.4542457613837e-003    105.00
Endmember2    46.7027868842365e-006   1.34079702447145e+000    63.00
Endmember3    38.6916289499119e-006   1.23866205431319e+000    91.00
Endmember4    26.4557123620021e-006   1.23802491476186e+000    51.00
Endmember5    25.6432249737333e-006   1.22782081298849e+000    60.00
Endmember6    22.3843857386643e-006   1.23002593165869e+000    52.00

REFERENCES

[1] J. B. Adams, M. O. Smith, and A. R. Gillespie, "Image spectroscopy: Interpretation based on spectral mixture analysis," in Remote Geochemical Analysis: Elemental and Mineralogical Composition, C. M. Pieters and P. A. Englert, Eds. Cambridge, U.K.: Cambridge Univ. Press, pp. 145-166, 1993.
[2] J. Bioucas-Dias et al., "Hyperspectral unmixing overview: Geometrical, statistical and sparse regression-based approaches," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sensing, vol. 5, no. 2, pp. 354-370, April 2012.
[3] J. Theiler, D. Lavenier, N. R. Harvey, S. J. Perkins and J. J. Szymanski, "Using blocks of skewers for faster computation of pixel purity index," SPIE Proc., vol. 4132, pp. 61-71, 2000.
[4] M. E. Winter, "N-FINDR: an algorithm for fast autonomous spectral end-member determination in hyperspectral data," SPIE Proc., vol. 3753, pp. 266-275, 1999.
[5] Sebastian Lopez, Pablo Horstrand, Gustavo M. Callico, "A low-computational-complexity algorithm for hyperspectral endmember extraction: Modified vector component analysis," IEEE Geoscience and Remote Sensing Letters, vol. 9, May 2012.
[6] Chein-I Chang, Cheng Wu, Wei-min Liu, "A new growing method for simplex-based endmember extraction algorithm."
[7] Raul Guerra, Lucana Santos, Sebastian Lopez and Roberto Sarmiento, "A new fast algorithm for linearly unmixing hyperspectral images," IEEE Transactions on Geoscience and Remote Sensing, 2015.
[8] J. B. Adams, M. O. Smith, and A. R. Gillespie, "Image spectroscopy: Interpretation based on spectral mixture analysis," in Remote Geochemical Analysis: Elemental and Mineralogical Composition, C. M. Pieters and P. A. Englert, Eds. Cambridge, U.K.: Cambridge Univ. Press, pp. 145-166, 1993.
[9] R. Bro and S. De Jong, "A fast non-negativity-constrained least squares algorithm," Journal of Chemometrics, vol. 11, pp. 393-401, 1997.
[10] Jingu Kim and Haesun Park, "Toward faster nonnegative matrix factorization: A new algorithm and comparisons," Eighth IEEE International Conference on Data Mining (ICDM'08), Pisa, Italy, Dec. 2008.
[11] D. Lee and H. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature, vol. 401, pp. 788-791, 1999.
[12] J. Bowles, J. Antoniades, M. Baumback, J. Grossmann, D. Haas, P. Palmadesso, and J. Stracka, "Real time analysis of hyperspectral data sets using NRL's ORASIS algorithm," Proceedings of the SPIE, vol. 3118, p. 38, 1997.

AUTHORS PROFILE


Hyperspectral Image Compression Methods: A Review

Neetu N. Gyanchandani, Research Scholar, Department of Electronics Engineering, GHRCE, Nagpur, India
Dr. A. A. Khurshid, Professor, Department of Electronics Engineering, RCOEM, Nagpur, India
Dr. Sanjay Dorle, HOD, Department of Electronics Engineering, GHRCE, Nagpur, India

Abstract— Due to its high spectral and spatial resolution, hyperspectral data has become a most popular source of information. Extracting the required information from a large hyperspectral volume is a very difficult task, since the data cubes require large memory and transmission space. Many compression methods and techniques are being developed to achieve this goal, and they are presented here.

In recent years, dramatic growth in remote sensing applications and platforms has been observed, both airborne and spaceborne. Data obtained from remote sensing faces challenges in acquisition, transmission, analysis and storage. Accurate analysis or information extraction is better with high-quality data, but this increases the data volume. The size of the hyperspectral data generated by NASA JPL's Airborne Visible/Infrared Imaging Spectrometer for a spectrum of reflected light is around 500 megabytes per flight, which serves purposes like geological mapping, environmental monitoring, disaster assessment, target recognition, urban growth analysis, vegetation classification, defense and many more.

Index Terms—hyperspectral data, image compression, remote sensing and Airborne Visible/Infrared imaging spectrometer.

I. INTRODUCTION

Hyperspectral data is three-dimensional data: a stack of two-dimensional images, with each 2D image corresponding to the radiation received by the sensor at a specific wavelength. These images are also called band images, or bands. Another way to view hyperspectral data is by means of pixel vectors, where the data from the same pixel location in each band is used to create a multi-dimensional vector. Here the elements of the pixel vector correspond to the energy reflected, at a specific wavelength, from the surface of the earth.

Fig.1: Hyperspectral datacube

The different steps involved in compression (encoding) of the hyperspectral image are given in Fig.2. These steps include:

1. Pre-processing unit - Different reversible processes are used, and shared with the decoder for decompression, to improve the performance of the compression, e.g. reordering of bands, principal component analysis, normalization and preclustering.

2. Compression unit - Standard compression techniques like vector quantization, transform coding or prediction coding can be used.


3. Distortion measures - Suitable distortion measures, like absolute distortion measures or percentage maximum distortion measures, are utilized to determine the effect of the applied algorithm on the reconstructed data. The compressed data is later transmitted by means of a communication channel. The reverse process is applied in the decoder, which gives data in the required standard form to the classifier stage, as shown in Fig.3.

Fig.2: Generalised block diagram of the encoder (the actual data passes through the pre-processing unit and the compression unit to produce the compressed hyperspectral data; side information and a distortion measurement accompany the encoded data).

II. COMPRESSION TECHNIQUES

Many techniques have been introduced for lossy and lossless hyperspectral compression; selection among them is done on the basis of the application for which the data is required. The different techniques used for compression of hyperspectral data are discussed below.

A. Vector Quantisation

Here the image is first decomposed into a set of vectors. A training set is selected from a subset of the input vectors, followed by the generation of a codebook from the training set, generally using an iterative clustering algorithm. Lastly, for every input vector the codeword of the closest code vector in the codebook is found, and the codeword is transmitted.

Flowchart: image → formation of vectors → generation of the training set → generation of the codebook → quantization.
Fig.2: Flowchart for vector quantisation
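The flowchart steps can be sketched as follows (an illustrative NumPy implementation using a basic k-means-style iterative clustering codebook; the function names and parameters are hypothetical, not those of any cited scheme):

```python
import numpy as np

def train_codebook(training, k, iters=20, seed=0):
    """Generate a k-entry codebook from training vectors (N, d)
    with a simple iterative (k-means-style) clustering algorithm."""
    rng = np.random.default_rng(seed)
    codebook = training[rng.choice(len(training), k, replace=False)]
    for _ in range(iters):
        # Assign each training vector to its closest code vector.
        d = np.linalg.norm(training[:, None] - codebook[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):                      # recompute centroids
            if np.any(labels == j):
                codebook[j] = training[labels == j].mean(axis=0)
    return codebook

def quantise(vectors, codebook):
    """Return, for every input vector, the index (codeword) of its
    closest code vector; only these indices need to be transmitted."""
    d = np.linalg.norm(vectors[:, None] - codebook[None], axis=2)
    return d.argmin(axis=1)
```

The decoder only needs the codebook and the stream of indices, which is where the compression comes from.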


Fig.3: Generalized block diagram of the decoder (the compressed data passes through the decompression unit and a post-processing unit, using the side information, before reaching the classifier).

Depending on the method of decomposition, different methods have been invented for the generation of the set of vectors.

1. Ryan and Arnold [1] developed many normalization techniques for lossless coding, like mean/residual VQ [2], where the mean is subtracted from the input vector; normalized VQ [3], where the data is converted using zero mean and standard deviation; gain/shape VQ [4], in which shape vectors are generated by the Euclidean norm; and mean-normalized vector quantization (M-NVQ) [1], where the input vectors are preprocessed by normalizing with the mean. M-NVQ was the preferred approach due to its reduced contribution to the overhead for lossless coding.

2. Pickering and Ryan [5] developed optimized spatial M-NVQ, further followed by a spectral DCT. Optimized spatial M-NVQ with DCT showed a compression ratio 1.5 to 2.5 times better than optimized spatial M-NVQ alone. Improved M-NVQ [5]


was designed to have good compression without much loss in classification accuracy; it gave a compression ratio near 43:1.

3. Qian et al. [6] developed fast VQ lossy compression techniques, in which the image is segmented and codebooks are utilized; the processing speed of compression is improved by around 1000x for an average fidelity penalty of 1 dB.

4. Qian [7] again developed the generalized Lloyd algorithm for hyperspectral imagery in 2004, with which hyperspectral data can be analyzed with fewer computational iterations and an improvement in the distance of the partition.

B. Transform coding

In transform coding, a set of product values is produced: the products of the original sample values and a set of basis vectors, where the basis vectors are sinusoids of increasing frequency. The resulting coefficients, generated by adding up the product values, indicate the frequency content of the original signal. A product value depends on the shapes of the original sample and the basis vector: if both are the same it is positive, otherwise it is near zero. With high correlation of adjacent samples, the data bits needed for transmission can be effectively reduced. Some transform coding algorithms are discussed as follows.

1. KLT:
In the KLT, basis vectors are generated by considering the statistical properties of the data, and the transform can be used for reduction of the dimensionality of the data. The KLT is similar to PCA, with the difference that the KLT is data dependent. Its disadvantage is high computational cost, since the basis vectors must be recalculated and transmitted as side information. This problem can be solved with fixed basis vectors, as done by the DCT; but the DCT performs poorly as a spectral decorrelator. Pal et al. [7] developed a compression algorithm using KLT and JPEG2000 for the spectral and spatial data. It included the effect of the algorithm on classification accuracy; experiments showed that 99% pixel identification can be done with this algorithm.

2. DWT:
The discrete wavelet transform is a subband filtering process. The input data signal is filtered into low- and high-pass versions; the outputs are then subsampled by a factor of two, and a set of subband samples is generated. These subbands are the wavelet coefficients. With the DWT, fast computation with low memory requirements is possible. Depending on the type of entropy coding applied to the coefficients in the wavelet transform process, algorithms are divided into zerotree coding and context-based coding. The best zerotree coding given is set partitioning in hierarchical trees (SPIHT) [8], and the context-based approach is shown in the JPEG2000 image compression standard [9].

C. Predictive coding

In predictive coding, data already sent to the decoder is utilized for the prediction of the current data. In hyperspectral predictive coding, a linear combination of values from pixels that are spectrally and spatially adjacent to the current pixel is used for the prediction. Pixel selection for the prediction method depends on the order in which data is given to the algorithm. The two main formats in this regard are 1) BIL: band interleaved by line, and 2) BSQ: band sequential.

1. Roger and Cavenor [10] started with DPCM, with a 1.6-2.0:1 compression ratio. They were followed by Aiazzi et al. [11], who used fuzzy logic for the prediction coefficients, giving a 20% improvement in compression ratio compared to [10]. Then Wu and Memon [12] developed context-based adaptive lossless image coding (CALIC) and a 3D version of CALIC [13].

2. Magli et al. modified the CALIC algorithm by using previously generated bands for prediction, and the new M-CALIC algorithm has given excellent results in lossless and near-lossless applications for data in bit-interleaved format.

D. Linear mixture analysis

Hyperspectral data contains endmembers: pure pixel vectors, or targets. In linear mixture analysis, pixel vectors are expressed as the sum of a linear combination of the endmembers and noise. A pixel vector is represented by the equation

X = Mα + Y

where X is the pixel vector, Y is a column vector of noise, M is the matrix of targets, M = [t1, t2, ..., tp], with ti an endmember, and α is the column vector of abundances, α = [α1, α2, ..., αp], with αi the abundance fraction for the corresponding endmember.

The two constraints of the linear mixture model are the sum-to-one constraint, by which the abundance fractions always sum to one for a particular pixel vector, and the nonnegativity constraint: abundance fractions are always nonnegative. Each original pixel vector is represented by a linear combination of a relatively small number of targets.


1. Rupert et al. [14] developed a lossless method for coding abundance fractions. Here the means of clusters formed by an unsupervised clustering method are used as endmembers; a further iterative process generates the endmembers and the set of fractional abundances.

2. Bowles et al. [15] modified the approach of Rupert et al. of taking the means of clusters formed by unsupervised clustering as endmembers: it attempts to determine endmembers that may not be available in the original data. Here a two-dimensional spatial wavelet transform is used for compression of the abundance images.

3. Du and Chang [16] developed an entire linear mixture analysis based hyperspectral image compression system. It uses an unsupervised method for the selection of endmembers and generates abundance fractional images by linear mixture analysis. The abundance fractional images are encoded and transmitted, then unmixed and used for the regeneration of the original data at the receiver end. This compression technique shows high compression values, since the number of endmembers in a hyperspectral image is always less than the number of spectral bands.

4. Hanye Pu and Zhao Chen [17] developed a parameter-free algorithm for real applications. To overcome the problem, a constrained nonlinear least squares alternating iterative optimization algorithm is derived. Another problem is the joint mixture resulting from the linearity and nonlinearity in hyperspectral data; this can be solved by utilising a structured total least squares optimization approach.

5. Raul Guerra et al. developed a fast algorithm for linearly unmixing hyperspectral images. The FUN algorithm is based on the concept of orthogonal projection for endmember extraction with lower computational effort [18]. Here an endmember extraction and classification method is developed, and modified Gram-Schmidt orthogonalisation is used for the endmember extraction.

6. A new method for compression of hyperspectral data, as well as reconstruction with low complexity, was developed in [19]. Here the compressed hyperspectral data are obtained as per the compressive sensing principle. The proposed method uses an augmented-Lagrangian-type algorithm and showed high potential in real applications.

III. CONCLUSION

The data given by hyperspectral images is very important, but due to the large volume of data, much storage space is required and the transmission bandwidth increases. To overcome these problems, many methods using different techniques, of both lossy and lossless types, have been studied and introduced. Many 2D compression algorithms have been extended to 3D for better compression of hyperspectral data. Here, different techniques for image compression have been studied; but comparison between them is still a big challenge, due to the variation in the size, type and nature of the hyperspectral images used as input. Comparison of techniques would be possible with identical data sets and with similar performance measures.

With a change in the codebook vectors (number and length) and in the accuracy of the transmitted difference vectors, the vector quantisation process can be modified. In general, linear mixture analysis based methods are more effective, since they give a good compression ratio due to the transmission of the abundance fractional images only. If the data compression is application specific and the endmembers are selected prominently, then great compression ratios can be achieved with the different supervised and unsupervised linear mixing techniques.

REFERENCES

[1] Ryan M. and Arnold J., "The lossless compression of AVIRIS images by vector quantization," IEEE Transactions on Geosci. and Remote Sensing, vol. 35, no. 3, pp. 546-550, May 1997.
[2] Baker R. and Gray R., "Image compression using non-adaptive spatial vector quantization," Conference Record of the 16th Asilomar Conference on Circuits, Systems, pp. 55-61, Oct 1982.
[3] Murakami T., Asai K., "Vector quantizer of video signals," Electronics Letters, vol. 7, pp. 1005-1006, Nov 1982.
[4] Ramamurthi B. and Gersho A., "Image vector quantisation with a perceptually based classifier," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, San Diego, CA, vol. 2, pp. 32.10.1-32.10.4, Mar 1984.
[5] Pickering M. and Ryan M., "Efficient spatial-spectral compression of hyperspectral data," IEEE Transactions on Geosci. and Remote Sensing, vol. 39, no. 7, pp. 1536-1539, July 2001.
[6] Shen-En Qian, Williams D., "Vector quantization using spectral index based multiple subcodebooks for hyperspectral data compression," IEEE Transactions on Geosci. and Remote Sensing, vol. 38, no. 3, pp. 1183-1190, May 2000.
[7] Pal M., Brislawn C. and Brumby S., "Feature extraction from hyperspectral images compressed using the JPEG-2000 standard," Fifth IEEE Southwest Symposium on Image Analysis and Interpretation, pp. 168-172, 7-9 April 2002.

[8] Said A. and Pearlman W., "A new fast and efficient image codec based on set partitioning in hierarchical trees," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, pp. 243-250, Jun. 1996.
[9] Taubman D. and Marcellin M., "JPEG2000: Image Compression Fundamentals, Standards and Practice," Boston, MA: Kluwer, 2002.
[10] Roger R. and Cavenor M., "Lossless compression of AVIRIS images," IEEE Transactions on Image Processing, vol. 5, no. 5, pp. 713-719, May 1996.
[11] Aiazzi B., Alparone L., "Lossless compression of multi/hyperspectral imagery based on a 3D fuzzy prediction," IEEE Transactions on Geosci. and Remote Sensing, vol. 37, no. 5, pp. 2287-2294, Sept 1999.
[12] Wu X. and Memon N., "Context based, adaptive, lossless image coding," IEEE Transactions on Communications, vol. 45, pp. 437-444, Apr. 1997.
[13] Wu X. and Memon N., "Context based lossless interband compression: extending CALIC," IEEE Transactions on Image Processing, vol. 9, pp. 994-1001, Jun. 2000.
[14] Rupert S., Sharp M., "Noise constrained hyperspectral data compression," IEEE Geoscience and Remote Sensing Symposium IGARSS'01, vol. 1, pp. 94-96, July 2001.
[15] Bowles J., Wei Chen and Gillis D., "ORASIS framework: benefits to working with linear mixture models," IEEE Geosci. and Remote Sensing Symposium IGARSS'03, vol. 1, pp. 96-98, 21-25 July 2003.
[16] Du Q. and Chang C.-I., "Linear mixture analysis based compression for hyperspectral image analysis," IEEE Transactions on Geoscience and Remote Sensing, vol. 42, no. 4, pp. 875-891, April 2004.
[17] Hanye Pu, Zhao Chen, Bin Wang and Wei Xia, "Constrained least squares algorithms for nonlinear," IEEE Transactions on Geosci. and Remote Sensing, vol. 53, no. 3, pp. 1287-1303, March 2015.
[18] Raúl Guerra, Lucana Santos, Sebastián López and Roberto Sarmiento, "A new fast algorithm for linearly unmixing hyperspectral images," IEEE Transactions on Geosci. and Remote Sensing, vol. 53, no. 12, December 2015.
[19] C. Chang, C. Wu, W. Liu, and Y. Ouyang, "A new growing method for simplex-based endmember extraction algorithm," IEEE Transactions on Geosci. and Remote Sensing, vol. 44, no. 10, pp. 2804-2819, 2006.

International Journal of Computer Science and Information Security (IJCSIS), Vol. 15, No. 1, January 2017

MQA: Mobility's quantification algorithm in AODV protocol

Meryem SAADOUNE, LAVETE Laboratory, Univ. HASSAN 1st, FSTS, Settat, Morocco ([email protected])
Abdelmajid HAJAMI, LAVETE Laboratory, Univ. HASSAN 1st, FSTS, Settat, Morocco ([email protected])
Hakim ALLALI, LAVETE Laboratory, Univ. HASSAN 1st, FSTS, Settat, Morocco ([email protected])

Abstract—Mobility is one of the basic features that define an ad hoc network, an asset that leaves the field free for the nodes to move. This most important aspect of this kind of network turns into a great disadvantage when it comes to commercial applications; take as an example the automotive networks that allow communication between groups of vehicles. The ad hoc on-demand distance vector (AODV) routing protocol, designed for mobile ad hoc networks, has two main functions. First, it enables route establishment between a source and a destination node by initiating a route discovery process. Second, it maintains the active routes, which means finding alternative routes in case of a link failure and deleting routes when they are no longer desired. In a highly mobile network these are demanding tasks to perform efficiently and accurately. In this paper, we focus on the first point, to enhance the local decision of each node in the network by quantifying the mobility of its neighbors.

Index Terms—Ad hoc, mobility, RSSI, AODV, Localization, Distance, Relative speed, Degree of spatial dependence, GPS-free.

I. INTRODUCTION

Mobile ad hoc network (MANET) is an appealing technology that has attracted a lot of research effort. Ad hoc networks are temporary networks with a dynamic topology which do not have any established infrastructure, centralized administration, or the standard support devices regularly available in conventional networks [1]. Mobile ad hoc networks (MANETs) are sets of wireless mobile nodes that cooperatively form a network without infrastructure; those nodes can be computers or devices such as laptops, PDAs, mobile phones or pocket PCs with wireless connectivity. The idea of forming a network without any existing infrastructure originates from the days of the DARPA (Defense Advanced Research Projects Agency) packet radio network [2][3]. In general, an ad hoc network is a network in which every node is potentially a router and every node is potentially mobile. The presence of wireless communication and mobility makes an ad hoc network unlike a traditional wired network, and requires that the routing protocols used in an ad hoc network be based on new and different principles. Routing protocols for traditional wired networks are designed to support tremendous numbers of nodes, but they assume that the relative positions of the nodes will generally remain unchanged. In an ad hoc network, since the nodes are mobile, the network topology may change rapidly and unpredictably, and the connectivity among the terminals may vary with time. Moreover, since there is no fixed infrastructure in this network, each mobile node operates not only as a node but also as a router, forwarding packets to other mobile nodes in the network that are outside the range of the sender. Routing, as the act of transporting information from a source to a destination through intermediate nodes, is a fundamental issue for networks [4].

The problem that arises in the context of ad hoc networks is the adaptation of the transport method to the large number of units present, in an environment characterized by modest computing and storage capabilities and fast topology changes.

According to the way routes are created and maintained in the routing of data, routing protocols can be separated into three categories: proactive, reactive and hybrid protocols. The proactive protocols establish routes in advance based on the periodic exchange of routing tables, while the reactive protocols seek routes on demand. A third approach, which combines the strengths of the proactive and reactive schemes, is also presented; this is called a hybrid protocol.

The Ad hoc On-Demand Distance Vector routing protocol (AODV) [5] is a reactive routing protocol, standardized by the MANET [6] working group of the IETF


(Internet Engineering Task Force) in RFC 3561. The protocol's algorithm creates routes between nodes only when the routes are requested by the source nodes, giving the network the flexibility to allow nodes to enter and leave the network at will. Routes remain active only as long as data packets are traveling along the paths from the source to the destination; when the source stops sending packets, the path times out and is closed.

In this paper we propose a solution that enables each node in the network to determine the location of its neighbors, in order to create more stable and less mobile routes. For that purpose, we locally quantify a node's distances to its neighbors as the metric of mobility, using the AODV protocol.

The remainder of this paper is organized as follows. Section 2 briefly describes the AODV protocol. In Section 3, a summary of related work is presented. Sections 4, 6 and 8 present how to quantify, evaluate and estimate mobility in an ad hoc network (distance, relative speed, degree of spatial dependence). Sections 5, 7 and 9 show the algorithms used for the quantification of the mobility metrics in the AODV protocol. Section 10 presents some simulations and results. Finally, Section 11 concludes this paper.

II. AD HOC ON-DEMAND DISTANCE VECTOR

AODV is an on-demand protocol which is capable of providing unicast, multicast [7] and broadcast communication, as well as Quality of Service (QoS) aspects [8], [9]. It combines the route discovery and maintenance mechanisms of DSR (RFC 4728) [10], involving the sequence number (to maintain the consistency of routing information), and the periodic updates of DSDV [11].

During route discovery, AODV maintains on each transit node information about the discovered route; the AODV routing tables contain:
- The destination address
- The next node
- The distance in number of nodes to traverse
- The sequence number of the destination
- The expiry date of the table entry.

When a node receives a route discovery packet (RREQ), it also notes in its routing table information about the source node and the node that just sent it the packet, so that it will be able to retransmit the response packet (RREP). This means that the links are necessarily symmetrical. The destination sequence number field of a route discovery request is null if the source has never been linked to the destination; otherwise it uses the last known sequence number. The source also indicates in this query its own sequence number. When an application initiates a route discovery, the source waits for a moment before rebroadcasting its route search query (RREQ); after a number of trials, it concludes that the destination is unreachable.

Route maintenance is done by periodically sending short application messages called "HELLO"; if three consecutive messages are not received from a neighbor, the link in question is deemed to have failed. When a link between two nodes of a routing path becomes faulty, the nodes broadcast packets to indicate that the link is no longer valid. Once the source is notified, it can restart a route discovery process.

AODV maintains its routing tables according to their use: a neighbor is considered active as long as the node delivers packets for a given destination; beyond a certain time without transmission to the destination, the neighbor is considered inactive. A routing table entry is considered active if at least one active neighbor uses the path between source and destination through it; the path through active routing table entries is called the active path. If a link failure is detected, all entries of the routing tables participating in the active path are removed.

AODV is capable of both unicast and multicast routing. Its algorithm is on-demand, that is to say it builds routes between the nodes only when requested by the source nodes, and it maintains these routes as long as the sources need them.

A. Route Discovery

When a source node wants to establish a route to a destination for which it does not yet have a route, it broadcasts a Route Request packet through the network.

Route Request
- Broadcast ID
- IP source
- Destination address
- Hop counter
- Sequence number of the source
- Sequence number of the destination
Table 1: Route Request Contents

B. The return path

A node receiving a Route Request records in its routing table the IP address of the source node, the sequence number, the number of hops which separate it from the source, and the IP address of the neighbor that just sent it this request. If it is the destination, or if it has a route to the destination (with a sequence number higher than or equal to the one included in the Route Request), the node will issue a Route Reply packet. Otherwise, it rebroadcasts the Route Request. The nodes keep track of the IP sources and broadcast IDs of Route Requests; if they receive a Route Request they have already treated, they discard it and do not retransmit.
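The RREQ fields of Table 1 and the duplicate-suppression rule described above can be sketched as follows. This is a minimal illustration in Python: the class and field names are our paraphrase of Table 1, not AODV's actual wire format, which is defined in RFC 3561.

```python
from dataclasses import dataclass

@dataclass
class RouteRequest:
    """Illustrative sketch of the RREQ fields listed in Table 1."""
    broadcast_id: int    # with source_ip, uniquely identifies this RREQ
    source_ip: str       # IP address of the originating node
    destination_ip: str  # node for which a route is sought
    hop_count: int       # hops traversed so far ("hop counter" in Table 1)
    source_seq_num: int  # source's own sequence number
    dest_seq_num: int    # last known destination sequence number (0 if unknown)

def is_duplicate(seen: set, rreq: RouteRequest) -> bool:
    """Nodes keep track of (source IP, broadcast ID) pairs and discard
    RREQs they have already treated, as described in the text."""
    key = (rreq.source_ip, rreq.broadcast_id)
    if key in seen:
        return True
    seen.add(key)
    return False
```

A node would call `is_duplicate` on every incoming RREQ before deciding whether to answer or rebroadcast it.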
Route Reply
- Hop counter
- Destination address
- Sequence number of the destination
- Source address
- Lifetime
Table 2: Route Reply Contents

C. Routing Table

The Route Reply allows each node to update its routing table toward the destination; the Route Request allows it to record the way back to the source. The routing table contains the following information:

Routing Table
- Destination address
- Next hop
- Number of hops (hop count)
- Sequence number for the destination
- Expiration time for the route table entry (validity time for the route)
- Network interface
Table 3: Routing Table Contents

D. Sending Message

Once the source has received the Route Reply, it can start transmitting data packets to the destination. If, subsequently, the source receives a Route Reply containing a sequence number higher than or equal to the current one but with a smaller hop count, it will update its routing information for this destination and begin to use the better route.

E. Route Maintenance

A route is maintained as long as it continues to be active. A route is considered active as long as data packets pass periodically from the source to the destination. When the source stops transmitting data packets, after a finite lifetime the link expires and is deleted from the routing tables of the intermediate nodes. There is no acknowledgment of receipt, yet a node can detect a broken link: as long as it receives data, it concludes that the link is valid. A node can inform its neighbors of its presence by sending HELLO messages at regular intervals, so that they keep seeing it as active. If a link breaks while a route is active, the node at the end of the broken link transmits a route error packet to the source node. After receiving the route error, the source can reinitiate a discovery process.

III. RELATED WORK

In [12], a geometric mobility metric has been proposed to quantify the relative motion of nodes. The mobility measure between any pair of nodes is defined as their absolute relative speed taken as an average over time. This metric has certain deficiencies: first, it assumes a GPS-like scheme for the calculation of relative speeds, while in a MANET we cannot assume the existence of GPS, so we have to resort to other techniques for measuring relative mobility. Secondly, it is an "aggregate" mobility metric and does not characterize the local movement of the neighboring nodes with respect to another particular node.

The Reference Point Group Mobility Model (RPGM) proposed in [13] was useful for predictive group mobility management. In RPGM, each group has a logical "center", and the center's motion defines the entire group's motion behavior, including location, velocity, acceleration, etc.

In [14], the authors proposed a measure of network mobility which is relative and depends on neighborhood and link state changes. Each node estimates its relative mobility based on the changes of the links in its neighborhood. This measure of mobility has no unit; it is independent of any existing mobility model and is calculated at regular time intervals.

The mobility degree used in [15] was calculated from the change of each node's neighborhood over time. The node mobility degree represents, at a given time and for each node in the ad hoc network, the variations undergone in its neighborhood compared to the previous time. Thus, nodes that join and/or leave the neighborhood of a given node will surely have an influence on the evaluation of its mobility. However, the last two measures are not representative of the change of a node's motion with respect to another node.

We can see that none of the metrics described above is suitable for characterizing the relative mobility of nodes in a particular node's neighborhood in a MANET. Hence, we feel that there is a need to develop such a metric, which could then be used by any routing protocol.

IV. LOCAL QUANTIFICATION OF NEIGHBORING DISTANCE

In this section, we define how we estimate node mobility in an ad hoc network. Mobility is quantified locally and independently of the localization of a given node. We represent this local quantification of node mobility by calculating the distance between a node and its neighbors.

The quantification of the distance can be done using 3 methods:

Calculate the exact distance: this is done in two ways:

1st way: The distance calculation using GPS: This
operation is done by using a terminal capable of being localized through a satellite positioning system: GPS. The principle of localization by the GPS system is based on the use of satellite coordinates and the estimation of the distances between the receiver and the satellites. Distances are obtained from the estimation of the TOA (Time Of Arrival) of the signals transmitted by the satellites [16].

2nd way: Distance calculation by a function of a simulation environment, like NS2, OPNET or another simulator.

Calculate the distance using the RSSI (Received Signal Strength Indication): in case the absolute positioning is not accessible or dedicated equipment is not available, it is possible, in theory, to determine the distance between a transmitter and a receiver using the RSSI. RSSI is a generic radio receiver technology metric, which is usually invisible to the user of the device containing the receiver, but is directly known to users of wireless networking of the IEEE 802.11 protocol family. The distance using RSSI can be calculated using the Friis transmission formula:

Pr = Pt · Gt · Gr · λ² / ((4π)² · d² · L)

where:
Pr: the receiving power.
Pt: the transmitting power.
Gt: the gain of the transmitting antenna = ability to radiate in a particular direction in space.
Gr: the gain of the receiving antenna = ability to couple the energy radiated from a direction in space.
λ: the wavelength.
L: the system loss factor, which has nothing to do with the transmission.
d: the distance between the antennas.

Then, to calculate the distance between two nodes that are equipped with such antennas, the formula is:

d = (λ / 4π) · √(Pt · Gt · Gr / (Pr · L))

Calculate the distance using GPS-free [17]: In case the GPS is not accessible, we can use a GPS-free method to localize the neighbors of each node. This method uses a mobile reference to calculate the coordinates of all the nodes in the network, from which we can deduce the distance between any two nodes. In this part, we use the reception power to determine the distance between the reference and the other nodes. To implement this method in the AODV protocol, we have to choose the reference:

Choice of the reference

a and b are two nodes that want to communicate in a MANET network. We put i at the center of the coordinate system. Let:
N: the set of nodes in the network.
Pi: the set of one-hop neighbors of node i.
dij: the distance between nodes i and j.
Di: the set of all distances dij.
We choose i such that i ∈ Pa, i ≠ b and i ≠ a.
We choose p and q such that p, q ∈ Pa, dpq ≠ 0, pîq ≠ 180° and pîq ≠ 0°.

Node i defines its local coordinate system. We set the x-axis as the line (ip); the y-axis is perpendicular to it, on the side of q. Thus the system is defined:
The node i is the center of the system: ix = 0, iy = 0.
p is the node situated on the abscissa axis: px = dip, py = 0.
The node q is placed from the angle α = pîq: qx = diq cos α, qy = diq sin α.

Using the theorem of Al-Kashi (law of cosines):

dpq² = dip² + diq² − 2 · dip · diq · cos α

α is therefore calculated using this formula:

α = arccos((dip² + diq² − dpq²) / (2 · dip · diq))

Figure 1: The reference system

Once the reference is selected, the calculation of the coordinates of the nodes that belong to Pi is easy.
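The RSSI-based distance and the local coordinate construction above can be sketched in Python. The 2.4 GHz wavelength, unit antenna gains and unit loss factor used as defaults are our own illustrative assumptions, not values from the paper.

```python
import math

def friis_distance(pt, pr, gt=1.0, gr=1.0, lam=0.125, loss=1.0):
    """Distance from received power via the Friis formula
    Pr = Pt*Gt*Gr*lam^2 / ((4*pi*d)^2 * L), solved for d.
    lam = 0.125 m corresponds to the 2.4 GHz band (assumed default)."""
    return (lam / (4 * math.pi)) * math.sqrt(pt * gt * gr / (pr * loss))

def angle_al_kashi(d_ip, d_iq, d_pq):
    """Angle p-i-q from the three pairwise distances (law of cosines)."""
    return math.acos((d_ip**2 + d_iq**2 - d_pq**2) / (2 * d_ip * d_iq))

def local_coordinates(d_ip, d_iq, d_pq):
    """Local frame of node i: i at the origin, p on the x-axis,
    q placed from the angle alpha = p-i-q, as in Section IV."""
    alpha = angle_al_kashi(d_ip, d_iq, d_pq)
    i = (0.0, 0.0)
    p = (d_ip, 0.0)
    q = (d_iq * math.cos(alpha), d_iq * math.sin(alpha))
    return i, p, q
```

For example, distances dip = 3, diq = 4, dpq = 5 form a right triangle, so q lands on the positive y-axis of i's local frame.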
For b, the coordinates are obtained in the same way:

bx = dib cos(pîb), by = ± dib sin(pîb)

with pîb computed by Al-Kashi from dib, dip and dpb, and the sign of by resolved by consistency with the distances to p and q.

Then the final system is as follows:

Figure 2: The system of local localization

Changing the Benchmark:

We propose in this part to replace the reference by another one composed of the neighbors of a having the smallest distances. Thus, we choose i, p, q such that i, p, q ∈ Pa, i ≠ b, i ≠ a, pîq ≠ 180°, pîq ≠ 0°, and for every other neighbor k (k ∈ Pa, k ≠ a, b, i, p, q): dak > dai, dak > dap and dak > daq.

After the quantification of the distance between all nodes, we can describe the behavior of the nodes in the network by calculating the average of all the distances, Avg(dij). If the average is very high, we say that the network nodes are very agitated; otherwise the network is supposedly more stable.

V. ALGORITHM OF QUANTIFICATION DISTANCE IN THE AODV ROUTING PROTOCOL

In this part, we propose to use one of those methods in the first function of the AODV protocol (route establishment between a source and a destination).

• A node x wants to communicate with a node y.
- x diffuses the RREQ.
- Each node receiving the RREQ calculates the distance between itself and the neighbor that sent it the RREQ (in this part we use the exact distance or the distance using Pr) and broadcasts its table [neighbors-distance] to its neighbors.

To use the third method for the quantification of the distance, the algorithm has to change:

• A node x wants to communicate with a node y.
- x diffuses the RREQ.
- Each node receiving the RREQ calculates the distance between itself and the neighbor that sent it the RREQ (in this part we use the exact distance or the distance using Pr), broadcasts its table [neighbors-distance] to its neighbors, chooses as reference the neighbor with the smallest distance, and recalculates the new distances using the third method.

N.B.: the node which receives the RREQ plays the role of node a from the previous part.

VI. LOCAL QUANTIFICATION OF NEIGHBORING RELATIVE SPEED

Using one of the methods above, every node can calculate the movement speed of its neighbors. By definition, the relative speed is the variation over time of the distance between two mobiles.

a and b are two nodes that want to communicate in a MANET network.
Dab(t): the distance between nodes a and b at time t.
Va(b): the speed of b with respect to a.

Va(b) = dDab(t)/dt ≈ (Dab(t2) − Dab(t1)) / (t2 − t1)

The interpretation of the value of this metric is done according to its sign. If it is positive, the nodes move away from each other; otherwise, the nodes move toward each other.

After the quantification of the speed between all nodes, we can describe the behavior of the nodes in the network by calculating the average of all speeds, Avg(Vij). If the average is very high, we say that the network nodes are very agitated; otherwise the network is supposedly more stable.
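The relative-speed metric of Section VI reduces to a finite difference of successive distance measurements; a minimal sketch:

```python
def relative_speed(d_t1, d_t2, t1, t2):
    """Relative speed of b with respect to a, Va(b) = dDab/dt,
    approximated over the interval [t1, t2].
    Positive -> the nodes move apart; negative -> they approach."""
    return (d_t2 - d_t1) / (t2 - t1)

def average_speed(speeds):
    """Avg(Vij) over all measured pairs: a high value indicates an
    agitated network, a low value a more stable one (Section VI)."""
    return sum(speeds) / len(speeds)
```

For instance, a distance growing from 10 m to 14 m over 2 s gives Va(b) = +2 m/s (moving apart).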
VII. ALGORITHM OF QUANTIFICATION RELATIVE SPEED IN THE AODV ROUTING PROTOCOL

In this part, we propose to use one of those methods in the first function of the AODV protocol (route establishment between a source and a destination).

• A node x wants to communicate with a node y.
- x diffuses the RREQ.
- Each node receiving the RREQ calculates the distance between itself and the neighbor that sent it the RREQ (in this part we use the exact distance or the distance using Pr) and broadcasts its table [neighbors-distance-time] to its neighbors.
- Each node calculates the relative speed between itself and its neighbors using the preceding formula.

To use the third method for the quantification of the distance, the algorithm has to change:

• A node x wants to communicate with a node y.
- x diffuses the RREQ.
- Each node receiving the RREQ calculates the distance between itself and the neighbor that sent it the RREQ (in this part we use the exact distance or the distance using Pr) [16] and broadcasts its table [neighbors-distance-time] to its neighbors.
- Each node calculates the relative speed between itself and its neighbors using the preceding distances.
- We choose as reference the neighbor with the smallest value of speed and recalculate the new distances using the third method.

VIII. LOCAL QUANTIFICATION OF NEIGHBORING DEGREE OF SPATIAL DEPENDENCE

a and b are two nodes that want to communicate in a MANET network.

Degree of Spatial Dependence: it is the extent of similarity of the velocities of two nodes that are not too far apart [20]. Formally, with va and vb the velocity vectors of a and b:

Dsd(a, b) = RD(va, vb) · SR(va, vb)

RD: the Relative Direction (or cosine of the angle) between the two vectors is given by:

RD(va, vb) = (va · vb) / (|va| · |vb|)

SR: the Speed Ratio between the two vectors is given by:

SR(va, vb) = min(|va|, |vb|) / max(|va|, |vb|)

Using the theorem of Al-Kashi, the Relative Direction can be reformulated as:

RD(va, vb) = (|va|² + |vb|² − |va − vb|²) / (2 · |va| · |vb|)

Using one of the methods of quantification of distances cited above, every node can calculate the movement speed of its neighbors; by definition, the relative speed is the variation over time of the distance between two mobiles.

The value of Dsd(a, b) is high when the nodes a and b travel in more or less the same direction and at almost similar speeds. However, Dsd(a, b) decreases if the Relative Direction or the Speed Ratio decreases.

IX. ALGORITHM OF QUANTIFICATION OF THE DEGREE OF SPATIAL DEPENDENCE IN THE AODV ROUTING PROTOCOL

A. Calculation of the Distances and the relative speeds

In this part, we propose to use one of those methods in the first function of the AODV protocol (route establishment between a source and a destination).

• A node b wants to communicate with a node a.
- b diffuses the RREQ.
- Each node receiving the RREQ calculates the distance between itself and the neighbor that sent it the RREQ (in this part we use one of the methods listed in [16]) and broadcasts its table [neighbors-distance-time] to its neighbors.
- Each node calculates the relative speed between itself and its neighbors using the preceding formula.

At this point, we are sure that all nodes have all distances between themselves and their 2-hop neighbors.
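Under the assumption of 2-D velocity vectors (va, vb given as (x, y) tuples), the degree of spatial dependence of Section VIII can be sketched as:

```python
import math

def rd(va, vb):
    """Relative Direction: cosine of the angle between velocity vectors."""
    dot = va[0] * vb[0] + va[1] * vb[1]
    return dot / (math.hypot(*va) * math.hypot(*vb))

def rd_from_magnitudes(na, nb, n_diff):
    """Al-Kashi reformulation: RD from |va|, |vb| and |va - vb| only,
    usable when the vectors themselves are not directly known."""
    return (na**2 + nb**2 - n_diff**2) / (2 * na * nb)

def sr(va, vb):
    """Speed Ratio: min(|va|, |vb|) / max(|va|, |vb|)."""
    na, nb = math.hypot(*va), math.hypot(*vb)
    return min(na, nb) / max(na, nb)

def spatial_dependence(va, vb):
    """Dsd = RD * SR [20]: close to 1 when the two nodes move in
    similar directions at similar speeds, small otherwise."""
    return rd(va, vb) * sr(va, vb)
```

Two nodes with identical velocities give Dsd = 1; perpendicular motion gives Dsd = 0; same direction at half the speed gives Dsd = 0.5.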
B. Selection of the Benchmark and calculation of the degree of spatial dependence

a and b are two nodes that want to communicate in a MANET network. For this part, we use the condition that the node a has at least 2 neighbors; formally, N(a) > 1.

We choose a node i where i ∈ N(a) and i ≠ b, and aîb ≠ 180° and aîb ≠ 0°.

To optimize the selection of the benchmark, we can add the following condition before the calculation of the degree of spatial dependence.

N.B.: If b … N(i)

All parts of the whole algorithm are repeated during the simulation.

X. SIMULATIONS AND RESULTS

In the following simulations, we applied our proposition to the AODV protocol. For this, we used the NS-2 simulator [18], with its implementation of the AODV protocol in version NS-2.35, and Ying-3D [19] to represent some results in 3D.

6.1. Environment

The network size considered for our simulations is 1000 m × 1000 m. The nodes have the same configuration, in particular the TCP protocol for the transport layer and Telnet for the application layer. The time for each simulation is 60 s. For each simulation, the mobility of the nodes is represented by the choice of a uniform speed between Vmin = 0 and Vmax = 100 m/s. The nodes are moved after a random choice of the new destination, without leaving the network (1000 m × 1000 m).

6.2. Discussion of results

The results present the local quantification of the neighbors' metrics of mobility during the simulation. After the application of our proposition to the AODV protocol, we obtain the following.

In the following figures, we observe the change of the distance between node 1 and its neighbors in the first 3 s of the simulation.

Figure 1: Quantification of the Distance between node 1 and its neighbors during the first 3s of the simulation

Figure 2: Quantification of the Distance between node 1 and its neighbors during the first 3s of the simulation "Another observation angle"
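The mobility scenario of Section 6.1 (a 1000 m × 1000 m field, uniform speeds in [0, 100] m/s, random new destinations) can be sketched as a simple waypoint generator. The generator below is our reading of the text's description, not the actual NS-2 scenario file:

```python
import random

AREA = 1000.0              # 1000 m x 1000 m field (Section 6.1)
V_MIN, V_MAX = 0.0, 100.0  # uniform speed range in m/s
SIM_TIME = 60.0            # simulated seconds

def next_waypoint(rng):
    """Pick a random destination inside the field and a uniform speed,
    as in "random choice of the new destination without leaving the network"."""
    dest = (rng.uniform(0.0, AREA), rng.uniform(0.0, AREA))
    speed = rng.uniform(V_MIN, V_MAX)
    return dest, speed

def make_scenario(n_nodes, seed=1):
    """One initial waypoint per node, reproducible via the seed."""
    rng = random.Random(seed)
    return [next_waypoint(rng) for _ in range(n_nodes)]
```

Each node would move toward its waypoint at the drawn speed and draw a fresh waypoint on arrival, repeating until SIM_TIME elapses.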
Figure 3: The distance between node 1 and its neighbor node 8 during the first 3s of the simulation

Figure 4: The distance and the distance using RSSI between node 1 and its neighbor node 8 during the first 3s of the simulation

In the following figures, we observe the change of the Relative speed between node 1 and its neighbors in the first 3 s of the simulation.

Figure 5: Quantification of the Relative speed between node 1 and its neighbors during the first 3s of the simulation

Figure 6: Quantification of the Relative speed between node 1 and its neighbors during the first 3s of the simulation "Another observation angle"
Figure 7: The Relative Speed between node 1 and its neighbor node 8 during the first 3s of the simulation

Figure 8: The distance and the Relative Speed using RSSI between node 1 and its neighbor node 8 during the first 3s of the simulation

In the following figures, we observe the change of the degree of spatial dependence between node 1 and its neighbors in the first 3 s of the simulation.

Figure 9: Quantification of the degree of spatial dependence between node 1 and its neighbors during the first 3s of the simulation

Figure 10: Quantification of the degree of spatial dependence between node 1 and its neighbors during the first 3s of the simulation "Another observation angle"

Figure 11: The degree of spatial dependence between node 1 and its neighbor node 8 during the first 3s of the simulation
Figure 12: The distance, the Relative Speed and the degree of spatial dependence between node 1 and its neighbor node 8 during the first 3s of the simulation

This algorithm can be used in all environments, with or without GPS. The calculated metric describes the similarity of the velocities of two nodes using the distance and the relative speed.

0.020 – 0.84 s: the degree of spatial dependence is high, which describes the similarity of the velocity between nodes 1 and 8; this translates into the near stability of the distances and the relative speeds in this interval.

0.94 – 1.75 s: the degree of spatial dependence decreases, which describes the difference of the velocity between nodes 1 and 8.

1.75 – 2.9 s: the degree of spatial dependence increases, which describes the similarity of the velocity between nodes 1 and 8.

XI. CONCLUSION

In this paper, we calculated a local mobility metric between a node and its neighbors in the AODV routing protocol for ad hoc networks. This metric of mobility can be used to choose a stable route to transmit data and thus improve the Quality of Service in this kind of network. To make this proposition feasible, we presented three methods to calculate the distance between two nodes. First, we use the exact distance with a GPS or using RSSI. In case the absolute positioning is not accessible, we propose our improved GPS-free method implemented in the AODV protocol. Using one of those methods, we can calculate two other metrics of mobility: the relative speed and the degree of spatial dependence.

REFERENCES

[1] "Mobile Ad hoc Networking (MANET): Routing Protocol Performance Issues and Evaluation Considerations", Request for Comments 2501, IETF, January 1999.
[2] J. Jubin and J. D. Tornow, "The DARPA Packet Radio Network Protocols", Proceedings of the IEEE, Vol. 75, No. 1, pp. 21-32, Jan. 1987.
[3] B. M. Leiner, D. L. Neilson, F. A. Tobagi, "Issues in Packet Radio Network Design", Proceedings of the IEEE, Vol. 75, No. 1, pp. 6-20, Jan. 1987.
[4] Sabina Baraković, Suad Kasapović, and Jasmina Baraković, "Comparison of MANET Routing Protocols in Different Traffic and Mobility Models", Telfor Journal, Vol. 2, No. 1, 2010.
[5] C. Perkins, E. Belding-Royer and S. Das, "Ad hoc On-demand Distance Vector routing", Request For Comments (Proposed Standard) 3561, Internet Engineering Task Force, July 2003.
[6] http://datatracker.ietf.org/wg/manet/charter/
[7] C. Cordeiro, H. Gossain and D. Agrawal, "Multicast over Wireless Mobile Ad Hoc Networks: Present and Future Directions", vol. 17, no. 1, pp. 52-59, January/February 2003.
[8] Sung-Ju Lee, Elizabeth M. Royer and Charles E. Perkins, "Scalability Study of the Ad Hoc On-Demand Distance Vector Routing Protocol", ACM/Wiley International Journal of Network Management, vol. 13, no. 2, pp. 97-114, March 2003.
[9] Ian Chakeres and Elizabeth M. Belding-Royer, "AODV Implementation Design and Performance Evaluation", special issue on Wireless Ad Hoc Networking of the International Journal of Wireless and Mobile Computing (IJWMC), 2005.
[10] David B. Johnson, David A. Maltz and Josh Broch, "DSR: The Dynamic Source Routing Protocol for Multi-Hop Wireless Ad Hoc Networks", Proceedings of INMC, 2004.
[11] Guoyou He, "Destination-Sequenced Distance Vector (DSDV) Protocol", Technical report, Helsinki University of Technology, Finland, 2 Dec 2003.
[12] P. Johansson, T. Larsson, N. Hedman, B. Mielczarek, and M. Degermark, "Scenario-based Performance Analysis of Routing Protocols for Mobile Ad Hoc Networks", Proc. ACM Mobicom 1999, Seattle WA, August 1999.
[13] X. Hong, M. Gerla, G. Pei, and C.-C. Chiang, "A Group Mobility Model for Ad Hoc Wireless Networks", Proc. ACM/IEEE MSWiM '99, Seattle WA, August 1999.
[14] N. Enneya, K. Oudidi and M. Elkoutbi, "Network Mobility in Ad hoc Networks", International Conference on Computer and Communication Engineering (ICCCE 2008), 13-15 May 2008, Kuala Lumpur, Malaysia.
[15] N. Enneya, M. El Koutbi and A. Berqia, "Enhancing AODV Performance based on Statistical Mobility Quantification", Information and Communication Technologies, ICTTA '06, 2nd (Volume 2), pp. 2455-2460, 2006.
[16] Ahmad Norhisyam Idris, Azman Mohd Suldi and Juazer Rizal Abdul Hamid, "Effect of Radio Frequency Interference (RFI) on the Global Positioning System (GPS) Signals", 2013 IEEE 9th International
Colloquium on Signal Processing and its Applications, 8-10 March 2013, Kuala Lumpur, Malaysia.
[17] S. Capkun, M. Hamdi and J.-P. Hubaux, "GPS-free positioning in mobile Ad-Hoc networks", Hawaii International Conference On System Sciences (HICSS-34), January 3-6, 2001, Outrigger Wailea Resort.
[18] "The network simulator - ns-2", January 2006.
[19] http://revue.sesamath.net/spip.php?article362
[20] F. Bai, N. Sadagopan, and A. Helmy, "IMPORTANT: a framework to systematically analyze the impact of mobility on performance of routing protocols for ad hoc networks", INFOCOM 2003, April 2003.
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017
Testing Coverage based Software Reliability Models: Critical Analysis and Ranking based on Weighted Criterion

Manohar Singh
Research Scholar, Department of Computer Science
OPJS University, Churu, Rajasthan, India
[email protected]

Dr. Vaibhav Bansal
Associate Professor, Department of Computer Science
OPJS University, Churu, Rajasthan, India
Abstract: This paper focuses on Testing Coverage based Software Reliability Growth Models (SRGMs). We analyze the various SRGMs that incorporate testing coverage and propose a Testing Coverage based Software Reliability Growth Model. We also suggest a weighted criterion for the ranking and analysis of SRGMs. The proposed model, together with various existing SRGMs based on testing coverage, is examined and analyzed using two real-time software data sets. We have also ranked the various SRGMs based on weighted comparison criteria values. We find that the proposed model provides significantly improved goodness-of-fit, and we conclude with the ranking of the SRGMs.

Keywords: Testing Coverage, Software Reliability, Weighted Criterion.
I. SECTION – 1
A. Introduction:
Software reliability models provide quantitative measures of the reliability of software systems during the software development process. In software engineering, the reliability of a software system is considered a key characteristic of the quality of the software. Achieving a given level of software quality forms the main basis for deciding the release schedule of the software. Besides testing efficiency and testing effort, various other factors are incorporated during reliability estimation, such as fault complexity, debugging time lag, etc. There are many other factors that greatly influence reliability growth. Among all of them, one factor that plays a critical role in reliability assessment is Testing Coverage. Testing Coverage is a measure that enables software developers to evaluate the quality of the tested software and to determine how much additional effort is required to improve the software reliability.

Among all SRGMs, a large family of stochastic reliability models based on a non-homogeneous Poisson process, known as NHPP reliability models, has been widely used to track reliability development during software testing. These models facilitate software developers in estimating software reliability in a quantitative manner. They have also been effectively used to provide guidance in decisions such as when to conclude testing of the software or how to distribute the available resources. However, software development is a very intricate process and there are still issues that have not so far been addressed; testing coverage is among those issues.

Testing coverage is an essential measure for both the developers and the clients of software products. It can assist software developers in estimating the quality of the tested software and determining how much additional effort is required to improve the reliability of the software. Testing coverage, on the other hand, can offer customers a quantitative assurance criterion when they plan to buy or use software products.

Gokhale et al. [1] analyzed the effect of testing coverage and derived several types of coverage functions in NHPP-based SRGMs. Yamada et al. [2] developed software reliability models with a testing-domain coverage ratio. Pham and Zhang [3] proposed an NHPP software reliability model with coverage functions for various testing efficiencies. Kapur et al. [4] also suggested an S-shaped Testing Coverage based SRGM.

In this paper, the study is focused on the critical analysis and ranking of various existing testing coverage based software reliability growth models, based on weighted criteria values, using two data sets of real-time failure data. The rest of this paper is organized as follows. Section 2 describes Testing Effort and Coverage SRGMs in the literature and also proposes a Testing Coverage SRGM for the analysis and ranking of SRGMs based on the weighted criteria. In Section 3, we discuss the comparison criteria for SRGMs. In Section 4, we explain our new ranking methodology scheme. Section 5 shows the experimental results on two different real data sets; the parameter estimates for goodness-of-fit and the comparison, with ranking, of the various SRGMs are also included in this section. The conclusion and remarks are given in Section 6.

362 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

II. SECTION - 2
A. Description of the Testing Coverage SRGMs:
In this Section, we discuss the different testing effort functions and various existing testing-coverage-based software
reliability models, and also propose a Testing Coverage SRGM for critical analysis and ranking.
1) Testing Effort Functions:
Testing effort plays an important role in the testing process of software development. Testing effort is measured by, for
example, the number of executed test cases, the CPU time spent in the testing phase, or the amount of manpower. The consumption
curve of testing resources over the testing period can be thought of as the testing effort curve. A testing effort function describes
the distribution or consumption of testing effort (test cases executed, CPU testing hours, manpower, etc.) during the
testing phase. Yamada et al. [5][6][7], Kapur et al. [8], Kuo et al. [14] and Huang et al. [10][11] suggested software reliability growth
models explaining the relationship among the testing time, the testing effort expenditure and the number of software faults detected.
Yamada et al. [5][6][7] proposed Weibull-type distributions to describe the testing effort function, as given below:
• Exponential Testing Effort Function: The exponential curve is used for processes that decline monotonically to
an asymptote. The cumulative testing effort consumed in (0, t] is given as:
    W(t) = α(1 − e^(−βt))
• Rayleigh Testing Effort Function: The Rayleigh curve is frequently used as an alternative to the exponential curve. This
curve predicts the cost and the schedule of the software development. The testing effort function is given as:
    W(t) = α(1 − e^(−βt²))
• Weibull Testing Effort Function: The Weibull testing effort function is given as:
    W(t) = α(1 − e^(−βt^l))
The exponential and Rayleigh functions are special cases of the Weibull function for l = 1 and l = 2, respectively.
• Logistic Testing Effort Function: The logistic testing effort function over the time period (0, t] can be defined
as:
    W(t) = α / (1 + A e^(−βt)),  with W(0) = α/(1 + A)
The parameters used in the above testing effort functions are:
α: the total amount of testing effort expenditure required for software testing
β: the scale parameter
l: the shape parameter
A: a constant
W(t): the testing effort function
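The four testing effort functions above can be evaluated directly; a minimal sketch in Python, assuming the parameterisation reconstructed here (α total effort, β scale, l shape, A the logistic constant — the original symbols were lost in extraction):

```python
import math

def tef_exponential(t, alpha, beta):
    """Exponential TEF: W(t) = alpha * (1 - exp(-beta*t))."""
    return alpha * (1.0 - math.exp(-beta * t))

def tef_rayleigh(t, alpha, beta):
    """Rayleigh TEF: W(t) = alpha * (1 - exp(-beta*t^2))."""
    return alpha * (1.0 - math.exp(-beta * t * t))

def tef_weibull(t, alpha, beta, l):
    """Weibull TEF: W(t) = alpha * (1 - exp(-beta*t^l));
    l = 1 recovers the exponential curve and l = 2 the Rayleigh curve."""
    return alpha * (1.0 - math.exp(-beta * t ** l))

def tef_logistic(t, alpha, A, beta):
    """Logistic TEF: W(t) = alpha / (1 + A*exp(-beta*t)); W(0) = alpha/(1+A)."""
    return alpha / (1.0 + A * math.exp(-beta * t))
```

Each curve rises monotonically toward the asymptote α, the total testing effort eventually consumed.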
2) Testing Coverage based SRGMs:
In general, a test coverage measure describes how well a test covers all the potential fault sites in a software system. A
large variety of test coverage measures exists in the literature [12]. Malaiya et al. [13], building on previous work, made the initial
attempt to relate testing coverage to software reliability. Gokhale et al. [1] proposed a model analysing the effect of testing coverage
and derived several forms of coverage functions for various existing NHPP models; the G-O Exponential Testing Coverage
SRGM [1] is among the models considered in this paper. Yamada et al. [14] also proposed an S-shaped testing coverage SRGM for
software error detection. Pham and Zhang [15] suggested a testing coverage model incorporating the concept of testing efficiency.
Kapur et al. [16] also suggested a testing-efficiency S-shaped coverage-dependent SRGM. Inoue et al. [17] proposed a flexible
testing coverage SRGM. Yamada et al. also proposed exponential and Rayleigh testing coverage SRGMs [18]. The mean
value function m(t) and coverage function c(t) of these existing models, as considered for analysis and ranking, are given
in Table-1. In addition to these testing-coverage-based SRGMs, a new Testing Coverage SRGM is proposed. The
proposed model, incorporating testing coverage and testing effort, is based on the basic NHPP assumptions about the software
failure phenomenon; it is further assumed that identified errors are removed perfectly and that no additional faults are introduced
during the process. The cumulative testing effort function is modelled by a logistic function; the cumulative testing effort
consumed in the interval (0, t] is given as:
    W(t) = α / (1 + A e^(−βt))
where W(0) = α/(1 + A), and α, A and β are constants.
The coverage function c(t) of the model is considered as:
    c(t) = (1 − e^(−b·W(t))) / (1 + ψ e^(−b·W(t)))
The Mean Value Function of proposed SRGM is


    m(t) = a (1 − e^(−b·W(t))) / (1 + ψ e^(−b·W(t)))
where b and ψ are constants, and a is a constant representing the number of faults lying in the software at the start of
testing.
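Under the notation reconstructed above (logistic effort W(t) with parameters α, A, β; model constants b and ψ; a the initial fault content — symbol names are assumptions, since the original symbols were lost in extraction), the proposed model can be evaluated numerically; a hedged sketch:

```python
import math

def effort(t, alpha, A, beta):
    # Logistic cumulative testing effort: W(t) = alpha / (1 + A*exp(-beta*t))
    return alpha / (1.0 + A * math.exp(-beta * t))

def coverage(t, b, psi, alpha, A, beta):
    # Coverage function: c(t) = (1 - exp(-b*W(t))) / (1 + psi*exp(-b*W(t)))
    e = math.exp(-b * effort(t, alpha, A, beta))
    return (1.0 - e) / (1.0 + psi * e)

def mean_value(t, a, b, psi, alpha, A, beta):
    # Mean value function m(t) = a * c(t): expected cumulative detected faults
    return a * coverage(t, b, psi, alpha, A, beta)

# Illustrative (hypothetical) parameter values: growth is monotone, bounded by a
params = dict(a=100.0, b=0.05, psi=1.2, alpha=50.0, A=4.0, beta=0.3)
print([round(mean_value(t, **params), 2) for t in (5, 10, 1000)])
```

Because W(t) saturates at α, the mean value function saturates below a unless b·α is large, which is what the estimated parameters control in the experiments that follow.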

Testing Coverage Based SRGMs    Mean value function m(t) and Coverage Function c(t)    Reference

Model-1: G-O Exponential Testing Coverage SRGM    m(t) = a(1 − e^(−bt));  c(t) = 1 − e^(−bt)    [1]
Model-2: Yamada et al. S-Shaped Testing Coverage SRGM    m(t) = a(1 − (1 + bt)e^(−bt));  c(t) = 1 − (1 + bt)e^(−bt)    [14]
Model-3: Pham et al. Testing Efficiency Coverage SRGM    m(t) = (a/(p − β))(1 − ((1 + bt)e^(−bt))^(p−β));  c(t) = 1 − (1 + bt)e^(−bt)    [15]
Model-4: Kapur et al. Testing Efficiency S-Shaped Coverage SRGM    m(t) = (a/(p − β))(1 − ((1 + bt + b²t²/2)e^(−bt))^(p−β));  c(t) = 1 − (1 + bt + b²t²/2)e^(−bt)    [16]
Model-5: Pham et al. Testing Efficiency S-Shaped Coverage SRGM    m(t) = a(1 − (1 + (b + d)t + bdt²)e^(−bt));  c(t) = 1 − (1 + bt)e^(−bt)    [19]
Model-6: Inoue et al. Flexible Testing Coverage SRGM    m(t) = a(1 − e^(−α·c(t)));  c(t) = (1 − e^(−bt)) / (1 + ψ e^(−bt)),  ψ = (1 − r)/r    [17]
Model-7: Yamada et al. Testing Coverage SRGM    m(t) = a(1 − e^(−b·W(t)));  W(t) = α(1 − e^(−βt²)) (Rayleigh),  W(t) = α(1 − e^(−βt)) (Exponential)    [18]
Model-8: Proposed Testing Coverage SRGM    m(t) = a(1 − e^(−b·W(t))) / (1 + ψ e^(−b·W(t)));  c(t) = (1 − e^(−b·W(t))) / (1 + ψ e^(−b·W(t)));  W(t) = α / (1 + A e^(−βt))

Table-1: Summary of the SRGMs based on Testing Coverage and the Proposed Model
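As an illustration, taking the standard-form mean value functions of Model-1 and Model-2 together with the parameter estimates reported for Data Set-1 in Table-3 (Model-1: a = 135.857, b = 0.139; Model-2: a = 124.600, b = 0.357), the expected cumulative faults at the end of the 25-hour test can be computed:

```python
import math

def m_model1(t, a, b):
    # Model-1 (G-O exponential testing coverage): m(t) = a * (1 - exp(-b*t))
    return a * (1.0 - math.exp(-b * t))

def m_model2(t, a, b):
    # Model-2 (delayed S-shaped testing coverage):
    # m(t) = a * (1 - (1 + b*t) * exp(-b*t))
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))

# Parameter estimates for Data Set-1 taken from Table-3; the test ran 25 hours
print(round(m_model1(25, 135.857, 0.139), 1))
print(round(m_model2(25, 124.600, 0.357), 1))
```

Both predictions can then be compared against the 136 cumulative faults actually observed at 25 hours in Data Set-1.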

III. SECTION – 3
A. Comparison Criteria:
To investigate the effectiveness of SRGMs, various comparison criteria are used to compare the models quantitatively:
Mean Squared Error (MSE), Bias, Variation, Mean Absolute Error (MAE), Mean Error of Prediction (MEOP), Predictive-
Ratio Risk (PRR), Accuracy of Estimation (AE), Sum of Squared Errors (SSE), R2 and Root Mean Square Prediction Error
(RMSPE). For every criterion except R2, a lower value indicates a better goodness-of-fit of the SRGM; for R2, a value close
to 1 indicates a better goodness-of-fit of the model [9]. The summary of the comparison criteria is
given in Table-2.
Metrics    Formula
Mean Absolute Error (MAE)    MAE = Σ_{i=1..k} |m_i − m̂(t_i)| / (k − n)
    where n is the number of parameters in the model, k is the number of observations over the testing period, and m_i, m̂(t_i) are the actual and estimated cumulative faults at time t_i
Mean Error of Prediction (MEOP)    MEOP = Σ_{i=1..k} |m_i − m̂(t_i)| / (k − n + 1)
Sum of Squared Error (SSE)    SSE = Σ_{i=1..k} [m_i − m̂(t_i)]²
Mean Square Fitting Error (MSE)    MSE = Σ_{i=1..k} [m_i − m̂(t_i)]² / (k − n)
Coefficient of Multiple Determination (R²)    R² = 1 − Σ_{i=1..k} [m_i − m̂(t_i)]² / Σ_{i=1..k} [m_i − m̄]²
Predictive-Ratio Risk (PRR)    PRR = Σ_{i=1..k} [(m̂(t_i) − m_i) / m̂(t_i)]²
Bias    Bias = Σ_{i=1..k} (m̂(t_i) − m_i) / k
Variation    Variation = √( Σ_{i=1..k} (m_i − m̂(t_i) − Bias)² / (k − 1) )
Root Mean Square Prediction Error (RMSPE)    RMSPE = √(Bias² + Variation²)
Accuracy of Estimation (AE)    AE = |M_a − M_e| / M_a, where M_a and M_e are the actual and estimated cumulative numbers of detected errors after the test, respectively
Table-2: Comparison Criteria used for Goodness-of-Fit of SRGMs
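The criteria in Table-2 can be computed from the actual and estimated cumulative fault counts; a sketch (the Bias sign convention, estimated minus actual, is an assumption where the extraction is ambiguous):

```python
import math

def fit_criteria(actual, predicted, n_params):
    """Goodness-of-fit criteria of Table-2 for one fitted SRGM.

    actual, predicted: cumulative fault counts at the k observation times;
    n_params: number of parameters in the model (n in Table-2)."""
    k = len(actual)
    resid = [a - p for a, p in zip(actual, predicted)]
    sse = sum(r * r for r in resid)
    mean_a = sum(actual) / k
    bias = sum(p - a for a, p in zip(actual, predicted)) / k
    variation = math.sqrt(
        sum((a - p - bias) ** 2 for a, p in zip(actual, predicted)) / (k - 1))
    return {
        "SSE": sse,
        "MSE": sse / (k - n_params),
        "R2": 1.0 - sse / sum((a - mean_a) ** 2 for a in actual),
        "MAE": sum(abs(r) for r in resid) / (k - n_params),
        "MEOP": sum(abs(r) for r in resid) / (k - n_params + 1),
        "PRR": sum(((p - a) / p) ** 2 for a, p in zip(actual, predicted)),
        "Bias": bias,
        "Variation": variation,
        "RMSPE": math.sqrt(bias ** 2 + variation ** 2),
    }

# A perfect fit drives every error criterion to 0 and R2 to 1
c = fit_criteria([27, 43, 54, 64], [27.0, 43.0, 54.0, 64.0], n_params=2)
print(c["SSE"], c["R2"])  # prints 0.0 1.0
```

In practice k must exceed n_params, and m̂(t_i) must be nonzero for PRR to be defined.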
IV. SECTION – 4
A. Weighted Criteria Ranking Methodology
In this section, a new attempt is made to develop a deterministic quantitative model based on a weighted mean for the
purpose of ranking software reliability growth models. In this ranking scheme, the weight of each comparison criterion is
used; we have therefore named it the weighted criteria method. It involves the following steps:
Step1: Criteria Value Matrix:
Let us consider n SRGMs having m comparison criteria. The criteria value matrix C is given as:

        | C11    C12    ...  C1m   |
        | C21    C22    ...  C2m   |
        | C31    C32    ...  C3m   |
    C = | ...    ...    ...  ...   |
        | Cn1    Cn2    ...  Cnm   |
        | CMIN1  CMIN2  ...  CMINm |
        | CMAX1  CMAX2  ...  CMAXm |

Where Cij = value of criterion j for model i,
(CMIN)j = minimum value of criterion j over all models,
(CMAX)j = maximum value of criterion j over all models, for all i = 1 to n and j = 1 to m.
Step2: Criteria Weighted Matrix:
The Criteria weighted matrix W is given as:


        | W11  W12  ...  W1m |
    W = | W21  W22  ...  W2m |
        | ...  ...  ...  ... |
        | Wn1  Wn2  ...  Wnm |

Where Wij = 1 − Zij, for i = 1 to n and j = 1 to m.
Zij is the criteria rating of criterion j for model i. There are two cases for calculating the criteria rating.
When a smaller criterion value indicates a better fit to the actual data, the criteria rating is calculated as:
    Zij = ((CMAX)j − Cij) / ((CMAX)j − (CMIN)j)
When a larger criterion value indicates a better fit to the actual data, the criteria rating is calculated as:
    Zij = (Cij − (CMIN)j) / ((CMAX)j − (CMIN)j)
Step 3: Weighted Criteria Value:
The weighted criteria value is calculated by multiplying each criterion value by its weight. Let Vij be the
weighted criteria value of criterion j for model i, calculated as:
    Vij = Wij × Cij
The weighted criteria value matrix V is given as:

        | V11  V12  ...  V1m |
    V = | V21  V22  ...  V2m |
        | ...  ...  ...  ... |
        | Vn1  Vn2  ...  Vnm |
Step 4: Permanent Value of Model:
The weighted mean of the criteria values is called the permanent value of the model. The permanent value of model i is given as:

    Pi = ( Σ_{j=1..m} Vij ) / ( Σ_{j=1..m} Wij ),  for i = 1 to n
Step 5: Ranking of Models:
The ranking of the models is based on their permanent values: a model with a smaller permanent
value is ranked better than one with a larger permanent value. Ranks for all models are thus
assigned by comparing their permanent values.
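Steps 1–5 can be condensed into a short routine; a sketch assuming each criterion takes distinct values across the models and that no model is best on every criterion (otherwise a denominator vanishes):

```python
def weighted_criteria_ranking(C, larger_is_better):
    """Rank SRGMs by the weighted criteria method (Steps 1-5).

    C: one row per model, one column per comparison criterion.
    larger_is_better: per-criterion flag (True for R2, False for the rest)."""
    n, m = len(C), len(C[0])
    cmin = [min(row[j] for row in C) for j in range(m)]   # Step 1: CMIN row
    cmax = [max(row[j] for row in C) for j in range(m)]   # Step 1: CMAX row
    perm = []
    for row in C:
        w_sum = v_sum = 0.0
        for j, c in enumerate(row):
            if larger_is_better[j]:
                z = (c - cmin[j]) / (cmax[j] - cmin[j])
            else:
                z = (cmax[j] - c) / (cmax[j] - cmin[j])
            w = 1.0 - z            # Step 2: criteria weight
            w_sum += w
            v_sum += w * c         # Step 3: weighted criteria value
        perm.append(v_sum / w_sum) # Step 4: permanent value (weighted mean)
    order = sorted(range(n), key=lambda i: perm[i])
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r               # Step 5: smaller permanent value -> rank 1
    return perm, ranks

# Toy example with two criteria: one smaller-is-better, one larger-is-better
perm, ranks = weighted_criteria_ranking(
    [[1.0, 0.5], [2.0, 0.9], [1.5, 0.7]], [False, True])
print(perm, ranks)
```

Note that a model which is best on a criterion receives weight 0 for that criterion, so good performance lowers the permanent value, matching the rank-1 position of the proposed model in Table-8.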

V. SECTION – 5
A. Data Validation, Data Set and Data Analysis
1) Model Validation:
To illustrate the estimation procedure for the software reliability growth models (existing as well as proposed), we have
carried out data analysis on two real software data sets. Data Set-1 is cited from Musa, Iannino and Okumoto [20]. The
software was a real-time control system, for which the faults observed during system testing over 25 hours of CPU time were
recorded; 21700 object instructions were delivered for this system, which was developed by Bell Laboratories. Data Set-2
is obtained from H. Pham [21]. In this data set the number of faults detected in each week of testing is recorded, together with the
cumulative number of faults since the start of testing; testing amounts to 416 hours per week. It provides the cumulative number
of faults for each week up to 21 weeks, with 43 failures observed during 8736 hours of system testing. The parameters of the
models have been estimated using the statistical package SPSS. The results of the
parameter estimation of the SRGMs using Data Set-1 and Data Set-2 are shown in Table-3 and Table-4 respectively.

Testing Coverage based SRGM    Estimated Parameters
Model-1             135.857    0.139
Model-2             124.600    0.357
Model-3             0.312      61.252     0.9977
Model-4             0.0459     412.1673   0.9996
Model-5             123.995    0.217     -0.061
Model-6             357.020    0.004
Model-7             187.9150   0.008
Proposed Model-8    753.627    0.003      1.214
Table-3: Parameter Estimation of the SRGMs for Data Set-1 (Musa et al.)

Testing Coverage based SRGM    Estimated Parameters
Model-1 1959.921 0.0001
Model-2 62.305 0.119
Model-3 62.305 0.119 3.00E-009
Model-4 19.063 0.387 0.664
Model-5 64.869 0.106 -0.004
Model-6 1136.111 0.001
Model-7 208.937 0.005
Proposed Model-8 119.650 0.020 1.492
Table-4: Parameter Estimation of the SRGMs for Data Set-2 (Pham)
2) Comparison Criteria of SRGMs:
In this study we use 10 comparison criteria, namely R2, MSE, Bias, Variation, RMSPE, MAE, MEOP, AE, SSE and
PRR, for the analysis of the various software reliability growth models on both data sets. The results of the comparison criteria for
Data Set-1 and Data Set-2 are shown in Table-5 and Table-6 respectively. For every criterion except R2, a lower value
indicates a better goodness-of-fit of the SRGM; for R2, a value close to 1 indicates a better goodness-of-fit of the model.

Model \ Criteria    R2    MSE    BIAS    Variance    RMSPE    MAE    MEOP    AE    SSE    PRR
Model-1 0.966 33.811 4.77 23.35 23.83 6.27 5.96 0.03 778.12 14.83
Model-2 0.864 134.574 9.58 46.94 47.91 12.61 11.98 0.09 3094.99 121.45
Model-3 0.963 38.186 4.91 24.03 24.53 6.81 6.45 0.03 839.43 17.29
Model-4 0.965 36.454 11.95 58.53 59.74 16.59 15.72 0.17 4719.87 54.10
Model-5 0.886 181.081 8.72 42.72 43.60 6.44 6.10 0.08 2597.59 72.52
Model-6 0.984 15.893 14.88 72.89 74.39 5.58 5.29 0.09 6243.00 53.22
Model-7 0.898 101.418 8.34 40.85 41.7 10.97 10.42 0.12 2513.09 22.11
Proposed Model-8    0.974 26.538 4.60 22.51 22.98 6.38 6.05 0.01 800.63 11.76
Table-5: Comparison Criteria of the SRGMs for Data Set-1 (Musa et al.)

Model \ Criteria    R2    MSE    BIAS    Variance    RMSPE    MAE    MEOP    AE    SSE    PRR
Model-1 0.969 6.723 2.740 12.260 12.560 3.030 2.880 0.053 236.610 10.070
Model-2 0.985 3.273 1.570 7.020 7.190 1.730 1.650 0.032 62.370 23.620
Model-3 0.985 3.455 1.570 7.020 7.190 1.830 1.730 0.032 62.370 23.620
Model-4 0.984 3.733 1.560 6.960 7.130 1.810 1.720 0.017 67.160 75.750
Model-5 0.985 3.410 1.500 6.710 6.870 1.750 1.660 0.027 62.370 15.230
Model-6 0.992 1.783 1.050 4.680 4.790 1.220 1.160 0.002 33.660 8.430
Model-7 0.910 19.547 3.780 16.900 17.320 4.180 3.970 0.081 391.290 23.300
Proposed Model-8    0.995 1.037 0.860 3.860 3.950 1.010 0.950 0.025 21.420 1.710
Table-6: Comparison Criteria of the SRGMs for Data Set-2 (Pham)

3) Goodness of Fit for SRGM:


The fitting of the various SRGMs and the proposed model to Data Set-1 and Data Set-2 is graphically illustrated in Figure-1
and Figure-2 respectively. The fit between the estimated and the actual fault values for the proposed SRGM on Data Set-1 and
Data Set-2 is shown in Figure-3 and Figure-4 respectively.

[Plot omitted: "Goodness of fit curves for Data Set-1 (Musa et al.)" — cumulative number of faults versus execution time in hours (0–30), with curves for the actual data and Models 1–8]
Figure-1: Goodness of Fit Curves for various SRGMs and Proposed Model for Data Set-1 (Musa et al.)

[Plot omitted: "Goodness of fit curves for Data Set-2 (Pham)" — cumulative number of faults versus system test hours (0–10000), with curves for the actual data and Models 1–8]

Figure-2: Goodness of Fit Curves for various SRGMs and Proposed Model for Data Set-2 (Pham)


[Plot omitted: "Goodness of fit curves for Data Set-1 (Musa et al.)" — cumulative number of faults versus execution time in hours, with curves for the actual data and the proposed Model-8]
Figure-3: Goodness of Fit Curves for Actual Data and Proposed Model for Data Set-1 (Musa et al.)

[Plot omitted: "Goodness of fit curve for Data Set-2 (Pham)" — cumulative number of faults versus system test hours, with curves for the actual data and the proposed Model-8]

Figure 4: Goodness of Fit Curves for Actual Data and Proposed Model for Data Set-2 (Pham)
4) Ranking of SRGMs based on Weighted Criteria:
The ranking of the eight software reliability growth models (seven existing and one proposed), based on the ten criteria
values described above, is calculated using weighted criteria values for Data Set-2. The weighted values of the criteria
are shown in Table-7, and the permanent values of the models together with their ranking are shown in Table-8.


Model \ Criteria    R2    MSE    BIAS    Variance    RMSPE    MAE    MEOP    AE    SSE    PRR    SUM
Model-1 0.969 6.723 2.740 12.260 12.560 3.030 2.880 0.053 236.610 10.070 287.895
Model-2 0.985 3.273 1.570 7.020 7.190 1.730 1.650 0.032 62.370 23.620 109.440
Model-3 0.985 3.455 1.570 7.020 7.190 1.830 1.730 0.032 62.370 23.620 397.335
Model-4 0.984 3.733 1.560 6.960 7.130 1.810 1.720 0.017 67.160 75.750 794.670
Model-5 0.985 3.410 1.500 6.710 6.870 1.750 1.660 0.027 62.370 15.230 100.512
Model-6 0.992 1.783 1.050 4.680 4.790 1.220 1.160 0.002 33.660 8.430 57.767
Model-7 0.910 19.547 3.780 16.900 17.320 4.180 3.970 0.081 391.290 23.300 481.278
Proposed Model-8    0.995 1.037 0.860 3.860 3.950 1.010 0.950 0.025 21.420 1.710 35.817
Maximum 0.995 19.547 3.780 16.900 17.320 4.180 3.970 0.081 391.290 75.750 481.278
Minimum 0.910 1.037 0.860 3.860 3.950 1.010 0.950 0.002 21.420 1.710 35.817
Table-7: Weighted Values of Criteria of various SRGMs
Criteria Weighted Matrix of 8 SRGMs (Rows) and 10 Comparison Criteria (Columns) is given below:
        | 0.306  0.31  0.644  0.64  0.644  0.64  0.639  0.65  0.582  0.11 |
        | 0.118  0.12  0.243  0.24  0.242  0.23  0.232  0.38  0.111  0.30 |
        | 0.118  0.13  0.243  0.24  0.242  0.26  0.258  0.38  0.111  0.30 |
    W = | 0.129  0.15  0.240  0.24  0.238  0.25  0.255  0.19  0.124  1.00 |
        | 0.118  0.13  0.219  0.22  0.218  0.23  0.235  0.32  0.111  0.18 |
        | 0.035  0.04  0.065  0.06  0.063  0.07  0.070  0.00  0.033  0.09 |
        | 1.00   1.00  1.00   1.00  1.00   1.00  1.00   1.00  1.00   0.29 |
        | 0.00   0.00  0.00   0.00  0.00   0.00  0.00   0.29  0.00   0.00 |
Weighted Criteria Values Matrix of 8 SRGMs (Rows) and 10 Comparison Criteria (Columns) is given below:
        | 0.30  2.07   1.76  7.90   8.09   1.93  1.84  0.03  137.66  1.14  |
        | 0.12  0.40   0.38  1.70   1.74   0.39  0.38  0.01  6.91    6.99  |
        | 0.12  0.45   0.38  1.70   1.74   0.47  0.45  0.01  6.91    6.99  |
    V = | 0.13  0.54   0.37  1.65   1.70   0.46  0.44  0.00  8.31    75.75 |
        | 0.12  0.44   0.33  1.47   1.50   0.41  0.39  0.01  6.91    2.78  |
        | 0.04  0.07   0.07  0.29   0.30   0.08  0.08  0.00  1.11    0.77  |
        | 0.91  19.55  3.78  16.90  17.32  4.18  3.97  0.08  391.29  6.79  |
        | 0.00  0.00   0.00  0.00   0.00   0.00  0.00  0.01  0.00    0.00  |
Model Permanent Value and Ranking
Model Sum of Weight Sum of Weighted Values Permanent Value Rank
Model-1 5.162 162.71 31.52 6
Model-2 2.212 19.02 8.60 5
Model-3 7.373 19.22 2.61 3
Model-4 2.811 89.35 31.78 7
Model-5 1.980 14.34 7.24 4
Model-6 4.792 2.81 0.59 2
Model-7 9.292 464.77 50.02 8
Proposed Model-8    0.291    0.01    0.03    1
Table-8: The Permanent Values of Models and Ranking
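The figures in Table-8 can be checked for internal consistency: by Step 4 each permanent value is the sum of weighted values divided by the sum of weights, and by Step 5 the ranks follow ascending permanent values. A short check in Python:

```python
# Rows transcribed from Table-8:
# (model, sum of weights, sum of weighted values, permanent value, rank)
table8 = [
    ("Model-1", 5.162, 162.71, 31.52, 6),
    ("Model-2", 2.212, 19.02, 8.60, 5),
    ("Model-3", 7.373, 19.22, 2.61, 3),
    ("Model-4", 2.811, 89.35, 31.78, 7),
    ("Model-5", 1.980, 14.34, 7.24, 4),
    ("Model-6", 4.792, 2.81, 0.59, 2),
    ("Model-7", 9.292, 464.77, 50.02, 8),
    ("Model-8", 0.291, 0.01, 0.03, 1),
]
for name, w_sum, v_sum, perm, rank in table8:
    # Step 4: permanent value = Sum(Vij) / Sum(Wij) (up to rounding)
    assert abs(v_sum / w_sum - perm) < 0.05, name
# Step 5: a smaller permanent value earns a better (lower) rank
ranks_sorted = [row[4] for row in sorted(table8, key=lambda row: row[3])]
assert ranks_sorted == [1, 2, 3, 4, 5, 6, 7, 8]
print("Table-8 is internally consistent")
```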
6) Data Analysis:
It is clear from Table-5 that, for Data Set-1, the values of the comparison criteria MSE, Bias, Variance, RMSPE, AE and PRR for
the proposed SRGM are the lowest among the compared SRGMs, and the value of R2 is very close to 1, which shows the goodness-of-fit of


the proposed model. Table-6 shows that, for Data Set-2, the values of MSE, Bias, Variance, RMSPE, MAE, MEOP, SSE
and PRR for the proposed SRGM are likewise the lowest among the compared SRGMs, and the value of R2 is again very close to 1, which
further confirms the goodness-of-fit of the proposed model. It is also clear from Table-8 that the proposed model is ranked 1
among the SRGMs compared.
VI. SECTION - 6
A. Conclusion:
In this paper, we studied various existing software reliability growth models based on testing coverage and also
proposed a new testing coverage software reliability growth model. We also explained a new ranking methodology based on
weighted criteria and used it to evaluate the software reliability growth models. The comparison criteria results for both data sets,
Data Set-1 and Data Set-2, show the goodness-of-fit of the proposed model. This paper also addresses the issue of optimal
selection of testing coverage SRGMs based on weighted criteria. The weighted criteria method is suitable for ranking
software reliability growth models on a set of criteria taken all together, and it uses a relatively
simple mathematical formulation and a straightforward calculation. We conclude that the proposed
testing coverage SRGM is ranked 1 by the weighted criteria method, which also matches its goodness-of-fit under the individual comparison criteria.

Data Set #1: Real-Time Command and Control System


The data set was reported by Musa (1987) [20] based on failure data from a real-time command and control system, which
represents the failures observed during system testing for 25 hours of CPU time. The delivered number of object instructions for
this system was 21700, and the system was developed by Bell Laboratories.
Hour    Faults    Cumulative Faults
1 27 27
2 16 43
3 11 54
4 10 64
5 11 75
6 7 83
7 2 84
8 5 89
9 3 92
10 1 93
11 4 97
12 7 104
13 2 106
14 5 111
15 5 116
16 6 122
17 0 122
18 5 127
19 1 128
20 1 129
21 2 131
22 1 132
23 2 134
24 1 135
25 1 136
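Using the listing above, the Model-1 parameters reported in Table-3 (a ≈ 135.86, b ≈ 0.139) can be approximately reproduced by a coarse least-squares grid search (a sketch, not the SPSS estimation procedure used in the paper; the grid bounds are assumptions chosen around the reported estimates):

```python
import math

# Data Set #1 (Musa et al.): execution hour vs. cumulative number of faults
hours = list(range(1, 26))
faults = [27, 43, 54, 64, 75, 83, 84, 89, 92, 93, 97, 104, 106,
          111, 116, 122, 122, 127, 128, 129, 131, 132, 134, 135, 136]

def m_go(t, a, b):
    # Model-1 (G-O exponential coverage): m(t) = a * (1 - exp(-b*t))
    return a * (1.0 - math.exp(-b * t))

def sse(a, b):
    # Sum of squared errors between observed and fitted cumulative faults
    return sum((y - m_go(t, a, b)) ** 2 for t, y in zip(hours, faults))

# Coarse grid: a in 120..150 (step 1), b in 0.100..0.200 (step 0.001);
# a finer local search (or a statistics package) would refine these further
best_sse, a_hat, b_hat = min(
    (sse(a, b / 1000.0), a, b / 1000.0)
    for a in range(120, 151)
    for b in range(100, 201)
)
print(best_sse, a_hat, b_hat)
```

The grid minimum should land near the figures reported in Table-3 and Table-5 (a ≈ 136, b ≈ 0.139, SSE ≈ 778).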


Data Set #2: This data set (Data Set-2) is obtained from H. Pham [19]. The number of faults detected in each week of
testing is recorded, together with the cumulative number of faults since the start of testing; testing amounts to 416 hours per
week. The set provides the cumulative number of faults for each week up to 21 weeks, with 43 failures observed during 8736 hours
of system testing.

Week Index Faults Test Hours Cumulative Faults


1 3 416 3
2 1 832 4
3 0 1248 4
4 3 1664 7
5 2 2080 9
6 0 2496 9
7 1 2912 10
8 3 3328 13
9 4 3744 17
10 2 4160 19
11 4 4576 23
12 2 4992 25
13 5 5408 30
14 2 5824 32
15 4 6240 36
16 1 6656 37
17 2 7072 39
18 0 7488 39
19 0 7904 39
20 3 8320 42
21 1 8736 43

VII. REFERENCES:
[1]. Gokhale SS, Philip T, Marinos PN, Trivedi KS (1996) Unification of finite failure non homogeneous poison process models through test
coverage. In: Proceedings 7th International Symposium on Software Reliability Engineering, White Plains, pp 299–307.
[2]. Shaik Mohammad Rafi et al., "Software Reliability Growth Model with Logistic-Exponential Test-Effort Function and Analysis of
Software Release Policy", (IJCSE) International Journal on Computer Science and Engineering, Vol. 02, No. 02, 2010, 387-399.
[3]. Pham H, Zhang X (2003) NHPP software reliability and cost models with testing coverage. Eur J Oper Res 145(2):443–454.
[4]. Kapur PK, Singh O, Gupta A (2005) Somemodeling peculiarities in software reliability. In: Proceedings Kapur PK, Verma AK (eds)
Quality, reliability and infocom technology, trends and future directions. Narosa Publications Pvt. Ltd., New Delhi, pp 20–34.
[5]. H. Pham, Software Reliability, Springer, Berlin, 2000.
[6]. H. Pham, X. Zhang, Software release policies with gain in reliability justifying the costs, Annals of Software Engineering 8 (1999) 147–
166.
[7]. H. Pham, Software reliability, in: J.G. Webster (Ed.), Wiley Encyclopedia of Electrical and Electronics Engineering, Wiley, New York,
2000.
[8]. A. Wood, Predicting software reliability, IEEE Computer 11 (1996) 69–77.
[9]. Singh, M., Bansal, V., "Parameter Estimation and Validation Testing Procedures for Software Reliability Growth Model",
International Journal of Science and Research (IJSR), ISSN (Online): 2319-7064, 5(12), 1675-1680, 2016.
[10]. S. Yamada, Software quality/reliability measurement and assessment: Software reliability growth models and data analysis, Journal of
Information Processing 14 (3) (1991) 254–266.
[11]. S. Yamada, K. Tokuno, S. Osaki, Imperfect debugging models with fault introduction rate for software reliability assessment,
International Journal of Systems Science 23 (12) (1992).
[12]. Malaiya YK, Li MN, Bieman JM, Karcich R (2002) Software reliability growth with test coverage. IEEE Trans Reliability 51(4):420–
426.


[13]. Malaiya YK, Li N, Bieman J, Karcich R, Skibbe B (1994) The relationship between test coverage and reliability. In: Proceedings of the
5th International Symposium Software Reliability Engineering, Monterey, CA, pp 186–195.
[14]. Yamada S, Ohba M, Osaki S (1983) S-shaped software reliability growth modelling for software error detection. IEEE Trans Reliability
R-32(5):475–484.
[15]. Pham H, Zhang X (2003) NHPP software reliability and cost models with testing coverage. European Journal Operational Research
145(2):443–454.
[16]. Kapur PK, Singh O, Gupta A (2005) some modelling peculiarities in software reliability. In: Proceedings Kapur PK, Verma AK (eds)
Quality, reliability and infocom technology, trends and future directions. Narosa Publications Pvt. Ltd., New Delhi, pp 20–34.
[17]. Inoue S, Yamada S (2008) Two dimensional software reliability assessment with testing coverage. The 2nd International Conference on
Secure System Integration and Reliability Improvement, pp 150–155
[18]. Yamada S, Ohtera H and Narithisa H, “Software Reliability Growth Models with Testing Effort” IEEE Trans. on Reliability, R-35 (1),
pp. 19-23, 1986.
[19]. Pham H (2006) System software reliability., Reliability engineering series Springer Verlag, London
[20]. Musa JD, Iannino A and Okumoto K, Software Reliability: Measurement, Prediction, Applications, McGraw Hill, 1987.
[21]. Pham H, An Imperfect-debugging Fault-detection Dependent-parameter Software, “International Journal of Automation and
Computing” 04(4), October 2007, 325-328, DOI: 10.1007/s11633-007-0325-8.


Accelerating a Secure Communication Channel


Construction Using HW/ SW Co-design
Roghayeh Mojarad, Hossain Kordestani
Department of Computer Engineering and Information Technology
Amirkabir University of Technology (Tehran Polytechnic)
[email protected], [email protected]

Abstract— A secure communication channel is an imperative part of any significant communication in different systems. In
this paper, two well-known algorithms, RSA and Diffie-Hellman, are combined in order to form a secure
communication channel. This paper uses RSA for authentication and Diffie-Hellman for key exchange. After
formation of the channel, the nodes use the exchanged key for encryption of their messages. Moreover, construction of
the secure communication channel is accelerated using hardware/software (HW/SW) co-design, so that it exploits the
advantages of both hardware and software. A software-only implementation is time consuming, while an implementation on a
HW/SW co-design platform speeds up the secure communication. The experimental results show that this kind of design is 6
times faster than a software-only one. In addition, the memory overhead and computational overhead of this method are
approximately trivial. The implementation in this paper was done with the Xilinx Embedded Development Kit (EDK) software
tool, which is a suitable means to implement different HW/SW co-design projects.
Keywords—Secure Communication Channel, Authentication, Key Exchange, RSA Protocol, Diffie-Hellman
Protocol, Field Programmable Gate Array (FPGA), EDK Tool, Co-design.

I. INTRODUCTION
In modern days of vast use of embedded systems, attention to their non-functional requirements has risen; one of
the key concepts is security, which consists of the three main aspects of confidentiality, integrity and availability. Since
availability is usually covered by the functional requirements, the focus is usually on the other two. The known
solutions for these are encryption and authentication. For encryption, the 'key' plays the most important role, and a
secure mechanism for key exchange is vital. Moreover, in insecure channels there is the possibility of man-in-the-middle
attacks [1], which focuses attention on authentication. Therefore, provision of security to communication nodes is
indispensable. In this paper, a secure protocol is implemented for communication between two (or more) embedded
systems. The protocol steps, explained in the following, are:
1- Node Setting
2- Authentication
3- Key Exchange
Authentication is done with the help of another node acting as a trusted third party, which in this situation is the
constructor of the system. The RSA algorithm is an asymmetric cryptography algorithm, which is used for digital
signatures and confidential applications [2]. The Diffie-Hellman algorithm is a key-sharing algorithm in which both
nodes end up with the same key without transmission of any key over the channel [3]. In this paper, key exchange is
similar to the SPEKE algorithm, with the prominent difference that hashing functions are removed and the asymmetric
signature algorithm RSA is added [4].
This paper proposes a new design to accelerate construction of a secure communication channel, and then
evaluates the implementation of this design. A hardware implementation usually has better performance compared
with a software one, while the latter has better flexibility, usage of empty space of the processor, lack of hardware
design complexity, and no area overhead. Therefore, this secure communication is implemented using HW/SW
co-design to access the beneficial aspects of both sides. The goal of this paper is to increase the speed of secure
communication channel construction and to enhance its performance with enough flexibility and usage of the empty
space of the processor. To obtain this goal, the complex and time-consuming part of the algorithm should be
implemented in hardware. Modular exponentiation is the part of the algorithm which takes a long execution time in a
software implementation because of its complexity. The hardware part is implemented in VHDL.
FPGAs are suitable candidates for a HW/SW co-design aimed at higher performance [5, 6], while microprocessors
are used to obtain higher flexibility and more features. Implementing modular exponentiation on an FPGA speeds up
the construction of a secure communication channel. In this paper, modular exponentiation is used in both algorithms
of the proposed protocol; the time-consuming parts are mapped onto the FPGA logic blocks, whereas generation of
public and private keys is performed by a soft processor (i.e., MicroBlaze) in a Xilinx FPGA. Therefore, using HW/SW
co-design, a high-speed secure communication channel is constructed.
There is an area cost for increasing the speed; but since in the HW/SW co-design the area overhead is very
little, it can usually be placed in the empty areas of the


FPGA; but if the design were done using hardware only, not only would it lack the required flexibility, but it would also usually require so much area that a larger FPGA would be needed.
This kind of implementation is 6 times faster than a software-only one. Moreover, its area and computational overhead are measured; because of their trivial values, these overheads can be ignored in comparison with the obtained speed.
The rest of the paper is organized as follows: related works are presented in Section II. Section III covers the background knowledge of the context. Section IV presents the proposed method and implementation details; Section V discusses the experimental results, and the paper is wrapped up with conclusions in Section VI.

II. RELATED WORKS
Some HW/SW co-design projects have been done in previous years, but the proposed method is a new way to accelerate the construction of a secure communication channel.
In 2009, at Bristol University, a HW/SW co-design of public-key cryptography for the Secure Socket Layer (SSL) was implemented in embedded systems. In this work, the hardware part included a complex SPARC V8 processor with a set of elliptic-curve mathematical operations, while the matrix of the secure socket layer was implemented in the software part. This implementation increased the speed of the public-key operations in the SSL handshaking process. The result on a 20 MHz SPARC processor was 10 times faster than a software-only implementation [7]. The DMA controller, ECC accelerator and total design were remarkably large.
In 2010, a project implemented elliptic-curve cryptography on the PicoBlaze microcontroller. In this project, a scalable elliptic-curve cryptographic processor with limited resources was placed on an FPGA. The result of this work was compared with implementations on different Xilinx FPGAs (Spartan) in terms of scalability and area overhead [8].
Also in 2010, the RSA algorithm was implemented in two ways, one hardware-only and the other software-only, on the 8051 microcontroller. This work showed that there is no single complete method; rather, there is a trade-off between performance and flexibility depending on the type of application. The result was that the hardware-only implementation was 4 times faster than the software-only one [9].
In 2012, the SHA-256 hash algorithm was implemented using HW/SW co-design and its performance improved tangibly. This project used pipelining beside the hardware part. The framework of this work was the Virtex family, and it doubled the operational power [10].
[11] discusses the implementation of AES in a wireless sensor network using HW/SW co-design. Regarding the environment, they focused mainly on the power consumption of the implementation. Their implementation shows 6 times better performance than the software-only one.
[12] provided a co-design flow that uses a high-level description of a system and later takes parts of it for hardware acceleration. They implemented a hardware/software co-design of AES on NIOS II with a hardware accelerator. They described the algorithm in Catapult C, which converts C code into RTL. They reached around 8 times better performance in co-design compared to a pure software implementation of AES.
In 2015, a high-throughput wireless communication system was implemented using HW/SW co-design. The goal of this work was reducing hardware design effort and time in order to provide a reliable design [13].
[14] proposed a HW/SW co-design of RSA in order to trade off performance and flexibility. It adopted the Xilinx Zynq-7000 SoC platform, which integrates a dual-core ARM A9 system with Xilinx programmable logic.

III. BACKGROUND KNOWLEDGE
This section presents the steps of two algorithms, RSA and Diffie-Hellman. This paper concerns two secure functions in embedded systems: 1) authentication, 2) encryption. For authentication, we use asymmetric cryptography to ensure the identity of each end of the communication. For encryption, the algorithm itself is flexible, but for its preparation we use a secure key-sharing algorithm. We call the combination of authentication and key-sharing the setup suite of the secure channel. The main focus of this paper is to accelerate the implementation of this suite using a co-design approach.

A. Authentication
One of the most serious dangers in an unsecure environment is masquerading. This kind of attack may have remarkable negative effects on systems. The known defense mechanism is authentication: each node makes sure of the identity of the node it communicates with at the beginning of the communication. There are many protocols that can perform this function. A digital signature from a trusted third party (TTP) is one of the most secure; this method is used for security and authentication of various websites in the secure socket layer (SSL). Its drawback is that finding a trusted third party is arduous; in this paper, the constructor of the embedded system acts as the trusted third party.
Asymmetric cryptography can be used for authentication; it needs two different keys for encryption and decryption, whereas symmetric cryptography requires only one key. Therefore, a sender can send information securely to anyone: one key is identified as the public key and the other as the private key, which is available only to the receiver.
Another application of asymmetric cryptography is the digital signature: if the sender of a packet encrypts it using its private key, any node can decrypt it using the associated public key. In this situation, in contrast with encryption, the focus is on the integrity aspect of security, and the receiver
is assured that the packet is approved by the sender using its private key and carries the sender's signature.
RSA is the most famous and widely applied asymmetric algorithm. The security of RSA relies on the complexity of factoring into prime numbers. The steps of this algorithm are the following [2].
1- Two large prime numbers (P, Q) are selected.
2- N = P * Q, φ(N) = (P − 1)(Q − 1)
3- To generate the public key (e), a number between 1 and φ(N) is selected such that it is relatively prime to φ(N).
4- To generate the private key (d), e * d ≡ 1 mod φ(N)
5- Encrypt(m) = c ≡ m^e mod N; Decrypt(c) = m ≡ c^d mod N

The trusted third party selects a public key (PU_T) and a private key (PR_T) for itself; these keys are used for digital signatures. The constructor of the embedded system can hard-code all the required information, including the node identification number, public key, private key, and modulus of operation, in each node [2]. Figure 1 shows all the fields that are set in each node.

∃x: TTP → x: { id_x, PU_x, PR_x, N, [id_x, PU_x]_PR_T }
Figure 1. Required fields in each node

Each node sends its unique identification number along with its signature in a packet; the receiver then compares the node number inside the signed packet with the received number. If these two numbers are equal, authentication succeeds and the public key is approved. This operation should be done by the second node, too [2]. Figure 2 presents the communication steps for authentication.

alice → bob: { id_a, [id_a, PU_a]_PR_T }
bob: id_a', PU_a = [[id_a, PU_a]_PR_T]_PU_T; if (id_a = id_a') OK
bob → alice: { id_b, [id_b, PU_b]_PR_T }
alice: id_b', PU_b = [[id_b, PU_b]_PR_T]_PU_T; if (id_b = id_b') OK
Figure 2. Communication steps for authentication

Authentication ends at this point: if the node number in the packet equals the received one, the node is approved. In the next step, key exchange should be done, which is explained in the following subsection.

B. Key Exchange
Eavesdropping can be catastrophic depending on the context of the application; the appropriate method to defend against this kind of attack is cryptography. In cryptography, security is based on the key; therefore, all information may be in danger if that key is compromised. One solution is asymmetric cryptography, but its computational overhead is higher than that of symmetric schemes. Moreover, keys might be found by exhaustive search; hence, key renewal on a regular basis is suggested for each communication to decrease the possibility of finding those keys.
There are many algorithms for secure key exchange; one of them is Diffie-Hellman. Its security is based on the complexity of solving the discrete logarithm. The steps of this algorithm are informally explained in the following [3].
1- Both nodes select one modulus (N) and one prime number (P).
2- Each node selects one value (a) and sends the value P^a mod N to the other node.
3- Both nodes generate the same value (Key) by raising the received value to the power of their own private number.
Figure 3 formally describes the steps above.

alice ↔ bob: P, N
alice chooses a; A ≡ P^a mod N
bob chooses b; B ≡ P^b mod N
alice: Key ≡ B^a mod N (Key = (P^b)^a)
bob: Key ≡ A^b mod N (Key = (P^a)^b)
Figure 3. Required communication to generate the shared key

IV. THE PROPOSED HW/SW CO-DESIGN
The prominent part of most modern electronic systems is their digital components, which provide a hardware platform on which software programs are executed. The goal of HW/SW co-design is to provide the merits of hardware and software at the same time.
The introduction of Computer-Aided Design (CAD) has made HW/SW co-design a hot topic, since there is a tangible preference for HW/SW co-design tools, which play an imperative role in the market [15].
In this paper, the Xilinx Embedded Development Kit (EDK) software tool provides enough capacity for co-design. To create a processor, the Xilinx Platform Studio (XPS) tool is used. Moreover, custom peripherals are added using the import peripheral wizard.
In the proposed method, modular exponentiation, which is the time-consuming part of the algorithm, is implemented in hardware, and the rest of the algorithm is implemented in software, to obtain better performance with acceptable flexibility.

A. Implementation
In order to implement the proposed secure communication channel, the EDK tool is used; its processor is a MicroBlaze system. The MicroBlaze processor is a 32-bit Reduced Instruction Set Computer (RISC) architecture optimized for implementation in Xilinx FPGAs, with separate 32-bit instruction and data buses running at full speed to execute programs and access data from both on-chip and external memory at the same time. MicroBlaze is a configurable and user-friendly processor that can be used across FPGA and all-programmable SoC families.
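As a concrete illustration of the two algorithms of Section III, the toy-scale C sketch below runs RSA (steps 1-5) and a Diffie-Hellman exchange on top of one shared modular-exponentiation routine, which is exactly the operation the co-design moves into hardware. The tiny constants (p = 61, q = 53, e = 17, d = 2753; generator 5, modulus 23) are illustrative only; real deployments use multi-precision numbers of 1024 bits or more.

```c
#include <stdint.h>

/* Square-and-multiply modular exponentiation: base^exp mod n.
 * Safe without overflow here because n*n fits in 64 bits for the
 * toy moduli used below. */
static uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t n) {
    uint64_t result = 1;
    base %= n;
    while (exp > 0) {
        if (exp & 1)                      /* multiply step */
            result = (result * base) % n;
        base = (base * base) % n;         /* square step */
        exp >>= 1;
    }
    return result;
}

/* RSA toy parameters: P = 61, Q = 53 -> N = 3233, phi(N) = 3120,
 * e = 17 and d = 2753 satisfy e*d ≡ 1 mod phi(N). */
uint64_t rsa_encrypt(uint64_t m) { return mod_pow(m, 17, 3233); }
uint64_t rsa_decrypt(uint64_t c) { return mod_pow(c, 2753, 3233); }

/* Diffie-Hellman toy parameters: generator P = 5, modulus N = 23.
 * Each side publishes P^secret and raises the received value to
 * its own secret, yielding the same shared key on both sides. */
uint64_t dh_public(uint64_t secret) { return mod_pow(5, secret, 23); }
uint64_t dh_shared(uint64_t received, uint64_t secret) {
    return mod_pow(received, secret, 23);
}
```

With secrets a = 6 and b = 15, dh_shared(dh_public(15), 6) and dh_shared(dh_public(6), 15) yield the same key, and rsa_decrypt(rsa_encrypt(m)) == m for any m < 3233; every one of these calls is a modular exponentiation, which is why the protocol invokes that operation repeatedly and why it is the natural candidate for hardware acceleration.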
The backbone of the aforementioned architecture is a single-issue, 3-stage pipeline with 32 general-purpose registers (it does not have address registers like the Motorola 68000 processor), an Arithmetic Logic Unit (ALU), a shift unit, and two levels of interrupt. This basic design can be configured with more advanced features to tailor it to the exact needs of the target embedded application, such as: barrel shifter, divider, multiplier, single-precision floating-point unit (FPU), instruction and data caches, exception handling, debug logic, Fast Simplex Link (FSL) interfaces and others. This flexibility allows the user to balance the required performance of the target application against the logic area cost of the soft processor. Figure 4 shows a view of a MicroBlaze system [16].

Figure 4. MicroBlaze architecture

The MicroBlaze core is organized as a hardware architecture with dedicated bus interface units for data accesses and instruction accesses. MicroBlaze does not separate data accesses to I/O and memory (i.e., it uses memory-mapped I/O).
The processor has up to three interfaces for memory accesses: Local Memory Bus (LMB), IBM's On-chip Peripheral Bus (OPB), and Xilinx CacheLink (XCL). The LMB provides single-cycle access to on-chip dual-port block RAM (BRAM). The OPB interface provides a connection to both on-chip and off-chip peripherals and memory.
The CacheLink interface is intended for use with specialized external memory controllers. MicroBlaze also supports up to 8 Fast Simplex Link (FSL) ports, each with one master and one slave FSL interface. The FSL is a simple, yet powerful, point-to-point interface that connects user-developed custom hardware accelerators (co-processors) to the MicroBlaze processor pipeline in order to accelerate time-critical algorithms [16]. The FSL can write to and read from the processor and the FPGA.
In this implementation, an AFX Virtex-II Pro FG456 proto board is used. Board version C is selected as a co-processor in the EDK tool. In the proposed secure communication protocol, modular exponentiation is implemented in VHSIC Hardware Description Language (VHDL) in hardware, and the rest of the protocol is implemented in C in software. Figure 5 presents this division of tasks in the proposed method.

Figure 5. The overview of the proposed HW/SW co-design block (HW side: the modular-power unit, connected over FSL; SW side: select P and Q, calculate N and φ(N), generate e and d, select N and P, select a and b)

There are various hardware implementations of modular exponentiation. For the sake of speed, the shift-and-squaring algorithm is used in this application [17]; its pseudo-code is presented in Figure 6.

Figure 6. Pseudo-code of power using squaring [17]

In order to convert it to modular exponentiation, the two multiplications of this pseudo-code should be replaced by modular multiplication, whose pseudo-code is shown in Figure 7 [18].

Figure 7. Pseudo-code of modular multiplication [18]

After implementation of the proposed HW/SW co-design, the hardware view of the processor and its peripherals is presented in Figure 8.
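A software model can clarify what the shift-and-squaring datapath of Figures 6 and 7 computes: the exponent is scanned from its most significant bit, with a squaring on every iteration and a conditional multiply, and every multiplication is itself reduced modulo N. In the sketch below, the interleaved double-and-add multiplier is a simplified stand-in for the Montgomery multiplier of Figure 7 [18]; like a bit-serial hardware multiplier, it reduces after every step so no full-width product is ever formed. It assumes n < 2^63.

```c
#include <stdint.h>

/* Interleaved modular multiplication: a*b mod n computed bit by bit,
 * reducing after every shift/add so intermediates stay below 2n.
 * (Simplified stand-in for the Montgomery multiplier of Figure 7;
 * assumes n < 2^63 so acc << 1 cannot overflow.) */
static uint64_t mod_mul(uint64_t a, uint64_t b, uint64_t n) {
    uint64_t acc = 0;
    a %= n;
    for (int i = 63; i >= 0; i--) {
        acc = (acc << 1) % n;             /* shift (double) and reduce */
        if ((b >> i) & 1)
            acc = (acc + a) % n;          /* conditional add and reduce */
    }
    return acc;
}

/* Left-to-right square-and-multiply (Figure 6 with the multiplications
 * replaced by modular multiplication, as Figure 7 prescribes):
 * scan exponent bits MSB-first; square always, multiply when bit is 1. */
uint64_t mod_exp(uint64_t base, uint64_t exp, uint64_t n) {
    uint64_t result = 1 % n;
    for (int i = 63; i >= 0; i--) {
        result = mod_mul(result, result, n);      /* squaring step */
        if ((exp >> i) & 1)
            result = mod_mul(result, base, n);    /* multiply step */
    }
    return result;
}
```

In the co-design this loop body is what moves onto the FPGA; the MicroBlaze only streams operands and results over the FSL interface.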
Figure 8. Hardware view of the proposed method with its peripherals

V. EXPERIMENTAL RESULT
This section demonstrates the boost in the running speed of the secure communication channel construction using the proposed HW/SW co-design. Table 1 presents the running time of the modular exponentiation part and of the whole proposed protocol. As observed, this part dominates the running time of the protocol because it is called 8 times in the protocol.

Table 1. Running time of modular exponentiation and whole proposed method
Running Time of the Modular Exponentiation (PS)   Running Time of the Proposed Method (PS)
96,688,000                                        773,505,000

Table 2 shows the running time of the proposed protocol in two different implementations: software-only and HW/SW co-design. The latter is 6 times faster than the former (773,505,000 / 128,919,928 ≈ 6).

Table 2. Comparison of software-only implementation and HW/SW co-design based on running time
Type of Implementation    Running Time of the Proposed Method (PS)
Software-Only             773,505,000
HW/SW Co-design           128,919,928

Table 3 presents the memory and computational overhead of the HW/SW co-design implementation in order to analyze its hardware overhead. The memory overhead of the implementation can be neglected, since this amount can usually be found unused in modern FPGA applications.

Table 3. Memory and computational overhead of the proposed method
Memory Overhead (KB)          340,640
Computational Overhead (ms)   130,041

There is a trade-off between these overheads and the speed acceleration; it can be adjusted for various target applications.

VI. CONCLUSIONS
This paper proposes a new method to accelerate the construction of a secure communication channel using HW/SW co-design. The method uses the strengths of both software and hardware to obtain a more suitable result. In this design, the time-consuming part of the proposed method is implemented in hardware in order to obtain better performance; the rest of the algorithm is implemented in software in order to get higher flexibility.
EDK is a suitable tool to implement HW/SW co-design. The results of the implementation show that this method achieves a remarkable acceleration of about 6 times over the software-only implementation, with low hardware overhead. The results also show that the proposed method obtains suitable performance compared with existing methods of secure communication channel construction.

REFERENCES
1. Johnston, A. M., Gemmell, P. S. "Authenticated Key Exchange Provably Secure against the Man-in-the-Middle Attack", Journal of Cryptology, pp. 139-148, 2001.
2. Rivest, R. L., Shamir, A., Adleman, L. "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems", Communications of the ACM, 21(2), pp. 120-126, 1978.
3. Diffie, W., Hellman, M. "New Directions in Cryptography", IEEE Transactions on Information Theory, 22(6), pp. 644-654, 1976.
4. Jablon, D. P. "Strong password-only authenticated key exchange", ACM SIGCOMM Computer Communication Review, 26(5), pp. 5-26, 1996.
5. Ferrandi, F., Santambrogio, M. D., Sciuto, D. "A Design Methodology for Dynamic Reconfiguration: The Caronte Architecture", In Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium, pp. 4-8, April 2005.
6. Mhadhbi, I., Litayem, N., Othman, S. B., Saoud, S. B. "Impact of Hardware/Software Partitioning and MicroBlaze FPGA Configurations on the Embedded Systems Performances", In Complex System Modelling and Control through Intelligent Soft Computations, Springer International Publishing, pp. 711-744, 2015.
7. Koschuch, M., Grobschadl, J., Page, D., Grabher, P., Hudler, M., Kruger, M. "Hardware/Software Co-design of Public-Key Cryptography for SSL Protocol Execution in Embedded Systems", ICICS 2009, LNCS 5927, pp. 63-79, 2009.
8. Hassan, M. N., Benaissa, M. "A Scalable Hardware/Software Co-design for Elliptic Curve Cryptography on PicoBlaze Microcontroller", Circuits and Systems (ISCAS), IEEE, 2010.
9. Uhsadel, L., Ullrich, M., Verbauwhede, I., Preneel, B. "HW/SW co-design of RSA on 8051", In European Workshop on Microelectronics Education, pp. 41-44, 2012.
10. Michail, H., Athanasiou, G., Kritikakou, A., Goutis, C., Gregoriades, A., Papadopoulou, V. "Ultra High Speed SHA-256 Hashing Cryptographic Module for IPSEC Hardware/Software Codesign", Proceedings of the International Conference on Security and Cryptography (SECRYPT), IEEE, pp. 1-5, July 2010.
11. Otero, C. T. O., Tse, J., Manohar, R. "AES Hardware-Software Co-design in WSN", In Asynchronous Circuits and Systems (ASYNC), 21st IEEE International Symposium on, pp. 85-92, May 2015.
12. Feten, T., Halim, K., Younes, L. "Hardware/software co-design using high level synthesis for cryptographic module", In 2015 7th International Conference on Modelling, Identification and Control (ICMIC), IEEE, pp. 1-6, December 2015.
13. Sutisna, N., Hongyo, R., Lanante, L., Nagao, Y., Kurosaki, M., Ochi, H. "Fast design exploration with unified HW/SW co-verification framework for high throughput wireless communication system", In IC Design and Technology (ICICDT), 2016 International Conference on, IEEE, pp. 1-4, 2016.
14. Sharif, M. U., Shahid, R., Gaj, K., Rogawski, M. "Hardware-software codesign of RSA for optimal performance vs. flexibility trade-off", In Field Programmable Logic and Applications (FPL), 26th International Conference on, pp. 1-4, 2016.
15. De Micheli, G., Gupta, R. K. "Hardware/Software Co-Design", Proceedings of the IEEE, 85(3), pp. 349-365, March 1997.
16. http://ecasp.ece.iit.edu/tutorials.html, visited 25 September 2016 (MicroBlaze Embedded System Design with Xilinx EDK Tutorial).
17. Messerges, T. S., Dabbish, E. A., Sloan, R. H. "Power analysis attacks of modular exponentiation in smartcards", In Cryptographic Hardware and Embedded Systems, pp. 144-157, Springer Berlin Heidelberg, January 1999.
18. Montgomery, P. L. "Modular multiplication without trial division", Mathematics of Computation, 44(170), pp. 519-521, 1985.
An Efficient Zone-Based Routing Protocol for WSN

Kamal Beydoun¹, Khodor Hammoud²
¹ Department of Computer Science, Lebanese University, Beirut, Lebanon
² Department of Computer Science, Lebanese University, Beirut, Lebanon
Email: [email protected], [email protected]

ABSTRACT
A Wireless Sensor Network is a collection of sensor nodes that cooperate with each other to send data to a base station.
These nodes have limited resources in terms of energy, memory, and processing power. Energy conserving
communication is one of the main challenges of wireless sensor networks. Several studies and research are focused
on saving energy and extending the lifetime of these networks. Architectural approaches, like hierarchical structures,
tend to organize network nodes in order to save energy. Most of these protocols need background information on the
network for them to be efficient. In this paper, we describe a new approach for organizing large sensor networks into
zones, based on the number of hops, to address the following issues: large-scale, random network deployment, energy
efficiency and small overhead. This network architecture enables a hierarchical network view, with the purpose of
offering efficient routing protocols based on zone partitioning. Simulations undertaken demonstrate that our approach
is energy-efficient; this is highlighted by the reduction of traffic overhead.
KEYWORDS

Wireless Sensor Network — Hierarchical Routing.
1 Introduction

Technological advances in microelectronics and wireless communications have enabled environmental monitoring using small sensor devices grouped in new types of networks called wireless sensor networks (WSNs). WSNs are dense wireless networks made up of small, low-cost sensors (nodes), distributed randomly over a designated area. These sensor nodes collect and disseminate environmental data. Wireless sensor networks facilitate the monitoring and controlling of physical environments from remote locations with good accuracy. Sensor nodes have limited resources: small batteries, small memory and small processing power. They are equipped with both sensory devices allowing data sensing, and wireless transceivers that help them communicate. When detecting a stimulus, sensor nodes (called sources) generate data packets and transmit them through the network to one or several special nodes (called sinks). Direct communications would be possible if large transmission power at the transmitter node were used. However, in largely deployed networks, high transmission power would not be enough to reach the sink and would consume a lot of energy. This problem can be overcome by multi-hop communication; the number of hops should be minimized in order to save energy. Sensors thus play a double role: data generator and data router.

When a node wants to send a data packet, it sends it along a route. The process of sending data from source to destination is called routing. Routing protocols used in WSNs should fulfill requirements concerning energy efficiency, distributed operation and scalability. In order to address the challenge of energy-efficient, scalable WSN communication, most existing research uses hierarchical architectures such as cluster-based topologies. Clustering builds up groups of nodes, named clusters, according to some metrics. Each cluster has an elected Cluster Head, whose role is to assure membership management and routing, by
communicating the collected data via nodes to the base station. Despite the advantages of clustering protocols, most of them require further information on the network (e.g. node energy, connectivity, geographical position). That leads to an overload in the network due to the number of sent packets. Consequently, both the energy and the lifetime of the network decrease.

In wireless, mobile and multi-hop networks, routing protocols should be able to deal with random node deployment. Even though sensors' positions may be known (manual deployment), no particular hypotheses concerning their neighbors can be made, due to the large scale of the network (neighborhood discovery protocols need to be implemented). Therefore, sensor networks are considered a subclass of ad hoc networks because of the absence of an infrastructure. Thus, ad hoc networking may influence some routing approaches in wireless sensor networks, with respect to their topologies. In hierarchical structures, topology control can be applied to minimize the set of active nodes (switching off some of them to preserve energy) or to define coordination tasks for some particular nodes. We are interested in hierarchical structures because flat architectures generally depend on the size of the network, which makes routing approaches difficult to scale. In either approach, the most crucial issue to address is energy efficiency.

In our work, we propose a new approach of node grouping (ZHRP [1]) into zones for large WSNs, where zone construction uses the number of hops as the metric. No other information on the network is needed. For this purpose, an inexpensive neighborhood discovery algorithm is proposed. The idea is to distribute routing roles between nodes inside a zone, avoiding cluster management (including cluster head election and rotation, and cluster construction as in classical hierarchical approaches). The zone topology we propose does not give a management role to specific nodes: the nodes on the zone border help route between zones, and all nodes of a zone have the same function inside their zone. Moreover, it does not need prior information on the network.

2 Related work

Energy consumption is one of the main challenges in wireless sensor networks. Energy saving assures a long lifetime for the system. Another main goal is reducing the size of the data stored in each node of the network (e.g. the routing table). Recent research in Wireless Sensor Networks focuses on increasing the lifetime of the system by decreasing the energy consumption of each node in the network ([2], [3], [4]). Because of the importance of energy consumption optimization, particular interest is oriented towards routing protocols.

2.1 WSN Routing Protocols

Ad hoc routing protocols (AODV [5], DSR [6], and DSDV [7]) may be used as network protocols for sensor networks. However, such approaches will generally not be good candidates for sensor networks, for the two main following reasons ([8]): (i) sensors have low battery power and low memory availability; (ii) the routing table size scales with the network size. According to the structure of the network, routing protocols in WSNs are classified as follows ([9]):

2.1.1 Flat-based routing

In flat networks, each node typically plays the same role and sensor nodes collaborate to communicate the sensed data. Due to the large number of such nodes, it is not feasible to address each node individually. This consideration has led to data-centric routing, where the BS (base station) sends queries to certain regions and waits for the data from the sensors located in the selected regions. Early works on data-centric routing, e.g. SPIN [2] and directed diffusion [10], were shown to save energy through data negotiation and redundant data elimination.

2.1.2 Location-based routing

In this type of routing, sensor nodes are addressed by means of their locations. The distance between neighboring nodes can be estimated on the basis of incoming signal strengths. Relative coordinates of neighboring nodes can be obtained by exchanging such information between neighbors, as in the GEAR [11] and SPAN [12] protocols. Alternatively, it may be possible to obtain location information using existing infrastructure, such as the satellite-based GPS (Global Positioning System), if the nodes are equipped with a low-power GPS receiver, as in the GAF protocol [13].
2.1.3 Hierarchical routing

Hierarchical routing (Table 1), originally proposed for wired networks, is a well-known technique with advantages related to scalability and efficient communications. The concept of hierarchical network architecture is also used to perform energy-efficient routing in wireless sensor networks: in a hierarchical architecture, higher-energy nodes can be used to process and send information, while lower-energy nodes can be used to perform the sensing in the proximity of the target. LEACH [14] and HPAR [15] are two known hierarchical routing protocols. Hierarchical architectures are efficient ways to lower energy consumption, performing data aggregation and fusion in order to decrease the number of messages transmitted to the BS.

Table 1. Some Available Hierarchical Routing Protocols

2.2 WSN Clustering

Nodes are gathered in several groups, generally disjoint, which are named clusters. Each cluster has a Cluster Head (CH). The sensors collect data and send it to the CH. CHs can communicate with the Base Station (BS) directly or via other CHs. In some networks, the CHs, referred to as gateways, perform data aggregation and send only relevant information through long-haul radio communication to the BS. For that purpose, CHs have specialized processing and telecommunication capabilities, and fewer energy constraints. If the CH is elected only once, we describe these networks as "static" in terms of change of CHs ([16], [17]). On the contrary, in "dynamic" networks, nodes exchange the role of CH (re-election) according to several metrics like remaining energy and connectivity with other nodes ([18], [14], [19]).

There are many existing clustering protocols. LEACH [14] is a distributed clustering-based protocol that uses randomized rotation of the CHs to evenly distribute the energy load among the sensors in the network. LEACH assumes that the fixed sink is located far from the sensors and that all sensors in the network are homogeneous and battery-constrained. Lin's protocol [20] is a distributed clustering technique for large multi-hop mobile wireless networks. The cluster structure is controlled by the hop distance: in each cluster, one of the nodes is designated as cluster head, and other nodes join a cluster if they are within a predetermined maximum number of hops from the cluster head. HEED [21] is a distributed clustering protocol that periodically selects cluster heads according to a hybrid function of their residual energy and a secondary parameter, such as node proximity to its neighbors or node degree.

Figure 1: Clustering in WSN

Most cluster-based topologies assume that cluster heads are high-energy nodes whose transmission power can be adapted in order to reach the base station at far distances and to communicate directly with other cluster heads. Another assumption is that nodes within a cluster can directly communicate with the cluster head. The transmission range defines the set of neighbors of a sensor node: those able to receive the transmitted signals. Because varying the transmission range consumes more resources, virtual topologies should be proposed for sensor networks made of sensors with fixed transmission power.
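To make the hybrid-metric idea behind HEED-style re-election concrete, the sketch below ranks a node's chance of announcing cluster-head candidacy by its residual energy, with node degree used only as a secondary tie-breaking cost. All names, fields and thresholds here are hypothetical illustrations of the idea, not the published HEED algorithm [21].

```c
#include <stdint.h>

/* Hypothetical per-node state relevant to cluster-head election. */
typedef struct {
    uint16_t id;
    double   residual_energy;  /* energy left in the battery (J) */
    double   max_energy;       /* battery capacity (J) */
    uint8_t  degree;           /* number of one-hop neighbors */
} node_t;

/* Primary parameter: the probability of announcing candidacy grows
 * with the fraction of energy remaining, scaled by an initial
 * probability c_prob and clamped to a small floor so that every
 * node eventually gets a chance to become cluster head. */
double ch_probability(const node_t *n, double c_prob) {
    double p = c_prob * (n->residual_energy / n->max_energy);
    return (p < 0.0001) ? 0.0001 : p;
}

/* Secondary parameter: among competing candidates, prefer the lower
 * "cost"; here a denser neighborhood (higher degree) is cheaper
 * because one broadcast serves more cluster members. */
double ch_cost(const node_t *n) {
    return n->degree > 0 ? 1.0 / (double)n->degree : 1.0;
}
```

The point of the two-level metric is that energy alone decides who may volunteer, while the topological cost only breaks ties between volunteers; this is what lets such protocols balance load without flooding the network with extra state.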
Here are some hierarchical protocols listed according to their classes:

• Groups
  o SOP [22]: Nodes are classified into groups, then a hierarchical tree is formed to create a routing table at every node. Two metrics are used for choosing the routing path. The first is the minimum energy consumption for transmitting 1 bit; the second is the path capacity in terms of bits. Transmission is always done on the path with maximum capacity.
• Clusters
  o LEACH [14]: Two-hop communication is done: in the first hop, a node in a cluster sends its data directly to the cluster head of its cluster; in the second, the cluster head forwards the data directly to the base station.
• Chains
  o PEGASIS [23]: Nodes use codes in communication to reduce interference. A chain is formed, and each node sends data to its neighbor until the packet is received by the chain leader, which sends the packet to the base station.
• Zones
  o ZHLS [24]: Each zone consists of a center node and the nodes that are in range of the center node, where the range is a radius expressed in hops.
  o ZHRP [1]: A protocol consisting of zones. Unlike ZHLS, there is no center node.

The challenge addressed in this paper is an approach for virtually structuring networks without using a topology control technique. Our contribution to topology construction addresses two main issues in WSNs: distributed approaches and energy efficiency. Moreover, our approach is independent of the embedded sensor technology (able to vary the transmission power or not); the only parameter considered is the current node’s transmission range. The algorithm is executed simultaneously with the neighborhood discovery protocol for random sensor node deployments.

Our Approach

In this section we will detail the new approach for grouping sensors into multiple zones.

3 Zone Hierarchical Routing Protocol (ZHRP)

Zone Hierarchical Routing Protocol [1] is a protocol for constructing a large-scale WSN in which the nodes are randomly deployed. ZHRP is energy efficient and low cost, and it routes packets between nodes efficiently. ZHRP aims to group the nodes of a large WSN into disjoint zones without needing information, such as energy level or geographic location, from neighboring nodes. This decreases the number of packets sent from one node to another and the energy needed by the nodes, which in turn increases the lifetime of the WSN. In ZHRP, no node has a management role; nodes are deployed randomly.

3.1 ZHRP Stages

ZHRP divides the WSN into zones and saves useful routing information in each node (like nodeId, nextHopId, and number of hops). Each zone consists of an Inviting node, Normal nodes, and Border nodes. The Normal nodes can sense data and send and receive data packets. The Border nodes are like Normal nodes, but they are located at the border of the zone, are neighbors of other zones, and save information about the neighbor zones’ Border nodes. The Inviting node is a normal node with one and only one additional task: broadcasting a packet that invites nodes to join its zone. Each node has a Node ID, which is unique over the WSN.

Before it can be used for routing, ZHRP runs through three stages: the Zones Construction stage [25], the Intra-Zone Routing Table Construction stage [26], and the Inter-Zone Routing Table Construction stage [26].

Nodes are classified into zones. In each zone, an Intra-Zone Routing Table is constructed at all nodes. Then, at the Border Nodes, an Inter-Zone Routing Table is constructed. When a node wants to send a packet, it uses the Intra-Zone Routing Table to send the packet to one of its zone’s Border Nodes. The Border Node then uses the Inter-Zone Routing Table to send the packet to the destination zone. At the destination zone, a Border Node receives the packet and uses its Intra-Zone Routing Table to send the packet to the destination node.


3.1.1 Zones Construction Stage

The first stage in ZHRP is the Zones Construction stage [25]. In this stage, nodes are categorized into zones at low cost, without the need for a cluster head. Zones Construction depends on three parameters: R, zN, and N.

• R: the zone radius, which is the maximum number of nodes between the inviting node and the invited nodes.
• zN: the required number of zones.
• N: the number of nodes in the wireless sensor network.

During this stage, each node specifies some attributes: ZoneId, NodeType, and BorderTable.

ZoneId is the Id of the zone that the node belongs to. NodeType is the type of the node (Normal or Border node). Initially, nodes are Normal nodes, but they can change to Border nodes if they are located at the border of the zone.

BorderTable, shown in Table 2, is the table in which a Border node saves information about the neighbor zones’ Border nodes. Each record in the Border Table contains the attributes NodeId and ZoneId: NodeId is the Id of the node and ZoneId is the Id of the zone to which the node belongs.

NodeId  Id of the node
ZoneId  Id of the zone that the node belongs to
Table 2: Border Table

During the Zones Construction stage, nodes exchange packets to construct the zones. A single packet structure is used during this stage; its fields are shown in Table 3.

SrcId     Node id of the sender node
DestId    Node id of the destination node
ZoneId    Zone id of the sender node
Subject   Subject of the packet
NodeType  Type of the sender node
TTL       Packet’s Time to Live
Table 3: Zone Construction Packet Fields

The packet Subject can be: INVITATION, DISAGREEMENT, BORDER, or NEW_NODE. The algorithm of the Zones Construction phase is shown in Figure 2. Initially, the wireless sensor network consists of Normal nodes and Inviting nodes. The number of Inviting nodes is equal to the number of zones; initially, each zone consists of one Inviting node.

The Inviting nodes start the construction phase by broadcasting (only once) a construction packet of subject INVITATION to their neighbor nodes, which initially don’t belong to any zone. This construction packet has a TTL equal to R. When a node receives a packet, it first checks the Subject of the packet. If the Subject is INVITATION, the node then checks whether it has already joined a zone. If it is already in a zone and the received packet’s ZoneId is the same as the node’s ZoneId, it does nothing. But if they are different, the node changes its type to Border node, broadcasts a construction packet of Subject DISAGREEMENT and TTL 1, and adds a record to the BorderTable that specifies the packet sender’s node id and its zone.

If the node doesn’t already belong to a zone, it joins the zone from the packet and checks the received packet’s TTL. If the TTL is 0, the node changes its type to Border and broadcasts a packet of Subject BORDER and TTL 1. Else, if the TTL is greater than 0, the node broadcasts a packet of Subject INVITATION and a TTL equal to the received packet’s TTL minus 1. If the received packet belongs to a different zone and its subject is BORDER or DISAGREEMENT, then the node changes its type to Border and broadcasts a construction packet of subject BORDER and TTL 1. If the received packet’s subject is NEW_NODE, then the node sends a construction packet of subject INVITATION and TTL 1 to the sender node.

At the end of this phase, NodeType, ZoneId, and BorderTable will be specified for each node in the wireless sensor network.
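The INVITATION, BORDER, and DISAGREEMENT handling described above can be sketched compactly as follows. This is a simplification under stated assumptions: broadcasts are collected in an `outbox` rather than transmitted, the NEW_NODE case is omitted, and all field names are illustrative.

```python
class Node:
    """Minimal state of a ZHRP node during the Zones Construction stage."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.zone_id = None          # joined zone, if any
        self.node_type = "NORMAL"
        self.border_table = []       # records of neighbor zones' border nodes
        self.outbox = []             # packets this node would broadcast

    def on_construction_packet(self, pkt):
        # pkt: dict with keys "src_id", "zone_id", "subject", "ttl"
        if pkt["subject"] == "INVITATION":
            if self.zone_id is not None:
                if pkt["zone_id"] != self.zone_id:
                    # Neighbor zone detected: become a Border node.
                    self.node_type = "BORDER"
                    self.border_table.append(
                        {"node_id": pkt["src_id"], "zone_id": pkt["zone_id"]})
                    self.broadcast("DISAGREEMENT", ttl=1)
                # Same zone: do nothing.
            else:
                # Join the inviting zone, then keep propagating the
                # invitation or close the border when the TTL runs out.
                self.zone_id = pkt["zone_id"]
                if pkt["ttl"] == 0:
                    self.node_type = "BORDER"
                    self.broadcast("BORDER", ttl=1)
                else:
                    self.broadcast("INVITATION", ttl=pkt["ttl"] - 1)
        elif pkt["subject"] in ("BORDER", "DISAGREEMENT"):
            if self.zone_id is not None and pkt["zone_id"] != self.zone_id:
                self.node_type = "BORDER"
                self.broadcast("BORDER", ttl=1)

    def broadcast(self, subject, ttl):
        self.outbox.append({"src_id": self.node_id, "zone_id": self.zone_id,
                            "subject": subject, "ttl": ttl})
```

A node first joins a zone and forwards the invitation with a decremented TTL; an invitation from a second zone later turns it into a Border node.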


Figure 2: Zone Construction Packet Received Algorithm

3.1.2 Intra-Zone Routing Table Construction Stage

The second stage in ZHRP is the Intra-Zone Routing Table Construction [26]. In this stage, nodes in the same zone learn the minimal path for sending packets to each other. Each entry in the Intra-Zone table contains the attributes shown in Table 4.

destNodeId    Destination node Id
nextHopId     Next hop Id
M (metric)    Number of nodes
nodeType      Node type of the destination node (Border or Normal)
neighZonesId  List of neighboring zones’ ids, if the destination node is of type Border node
Table 4: Intra-Zone Routing Table Entry Fields

During the ZHRP Intra-Zone Routing Table Construction stage, nodes exchange packets to complete the phase. The exchanged packets have the structure shown in Table 5.

srcId          Node Id of the sender
zoneId         Zone id of the destination node
destinationId  Destination node Id
nextHopId      Next node Id
M (metric)     Metric computed in number of nodes
nodeType       Node type of the sender node
borderTable    If the sender node is a Border node, it sends the Border Table
Table 5: Intra-Zone Routing Table Packet Fields

The Intra-Zone table is constructed based on the Distance-Vector algorithm (Bellman-Ford [27]).
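The two structures of Tables 4 and 5 can be written down as simple records. This is a sketch only: the concrete types and defaults are assumptions, not prescribed by the protocol.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class IntraZoneEntry:
    """One row of the Intra-Zone Routing Table (cf. Table 4)."""
    dest_node_id: int
    next_hop_id: int
    metric: int                  # path length in number of nodes
    node_type: str               # "NORMAL" or "BORDER"
    # Only meaningful when the destination is a Border node.
    neigh_zones_id: list = field(default_factory=list)

@dataclass
class IntraZonePacket:
    """Packet exchanged while building the table (cf. Table 5)."""
    src_id: int
    zone_id: int
    destination_id: int
    next_hop_id: int
    metric: int
    node_type: str
    border_table: Optional[list] = None  # sent only by Border nodes
```

Keeping the table entry and the packet as distinct records mirrors the paper’s separation between stored routing state and what is advertised on the air.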


The algorithm of Intra-Zone Routing Table Construction is composed of two steps; Figure 3 shows the first step. Each node broadcasts an Intra-Table Construction packet specifying its zoneId, nodeType, and borderTable (if it is a Border node). Any node that receives an Intra-Zone Construction Packet checks whether the zoneId of the sender is the same as its own. If it isn’t, it ignores the packet. Else, if the received packet’s sender zoneId is the same as the receiver’s zoneId (the packet is received from the same zone) and there is no entry in the Intra-Zone Table for that sender node, the node adds an entry to the Intra-Zone table with:

• destinationId as the sender nodeId.
• metric M equal to 1.
• nextHopId same as the sender nodeId.
• nodeType as the nodeType in the packet.
• If the nodeType is Border, it adds the BorderTable from the packet.

Hence every node knows its neighbors.

Figure 3: Intra-Zone Routing Table Construction Step 1

Figure 4 describes the second step of Intra-Zone Routing Table construction. After knowing their neighbors, each node broadcasts its Intra-Zone Routing Table. When a node receives a packet, if the packet is from another zone, it ignores it; else the node checks whether its Intra-Table contains an entry for the packet’s destinationId. If so, the node checks whether the packet’s metric M is less than the entry’s metric in the Intra-Table and, if it is, updates the entry with the new values from the packet (nextHopId, M, nodeType, BorderTable). Then, the node rebroadcasts the modified entry.

Figure 4: Intra-Zone Routing Table Construction Step 2

If there is no entry for the received packet’s destinationId, the node adds a new entry to the Intra-Table, setting the fields (destinationId, nextHopId, nodeType, M, BorderTable) from the received packet. Then it broadcasts the newly added entry. After this phase, each node has an Intra-Table with the shortest paths for packets to be sent to their destinations. The routing algorithm will be discussed later in the ZHRP Data Routing section.

3.1.3 Inter-Zone Routing Table Construction Stage

The third stage in ZHRP is the Inter-Zone Routing Table Construction [25]. In this stage, all Border Nodes obtain an Inter-Zone Routing Table which contains information about other zones and the paths that should be taken to reach them. Each entry in the Inter-Zone Routing Table contains the fields presented in Table 6.

destZoneId  Destination zone Id
nextZoneId  Next Zone Id
zoneM       Zone Metric, which is the longest path (number of nodes) between two nodes in a zone. It is computed during the Intra-Table construction
Table 6: Inter-Zone Routing Table Entry Fields

This stage is also based on the Distance-Vector algorithm (Bellman-Ford [27]), which is applied between zones to form the Inter-Zone Routing Table. To avoid redundant computation, a Border node is chosen in every zone to accomplish the task of building the Inter-Zone Routing Table for the corresponding zone. This node is called the BORDER_CHIEF node. The BORDER_CHIEF node is the Border node with the highest identification (the border node with the highest nodeId); it is determined from the Intra-Zone Routing Table. The BORDER_CHIEF node then computes the Inter-Zone Routing Table and sends it to all other Border nodes.
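The table-update step of the Intra-Zone construction in Section 3.1.2 (Figure 4) is a classic distance-vector relaxation. The following is a minimal sketch; it assumes the advertised metric is counted with one extra hop through the sender, which the text does not spell out explicitly.

```python
def merge_intra_table(table, sender_id, sender_zone, my_zone, advertised):
    """Relax routes from a neighbor's advertised Intra-Zone entries.

    `table` maps destinationId -> {"next_hop": ..., "metric": ...};
    `advertised` maps destinationId -> {"metric": ...}. Returns the
    destinations whose entries changed and would be rebroadcast.
    """
    if sender_zone != my_zone:
        return []                      # packets from other zones are ignored
    changed = []
    for dest, adv in advertised.items():
        candidate = adv["metric"] + 1  # one extra hop through the sender
        if dest not in table or candidate < table[dest]["metric"]:
            table[dest] = {"next_hop": sender_id, "metric": candidate}
            changed.append(dest)
    return changed

table = {2: {"next_hop": 2, "metric": 1}}          # neighbor 2 known from step 1
merge_intra_table(table, sender_id=2, sender_zone=1, my_zone=1,
                  advertised={3: {"metric": 1}})   # node 2 reaches 3 in 1 hop
```

Rebroadcasting only the changed entries is what lets the tables converge without flooding the whole table on every update.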


srcId                  Source node Id
nextHopId              Next hop node Id
srcZoneId              The zoneId of the source node
subject                Packet Subject
interZoneRoutingTable  The Inter-Zone Routing Table produced by the BORDER-CHIEF
finalDestId            The final destination nodeId
Table 7: Inter-Zone Routing Table Packet

Table 7 shows the packet fields that are used during the Inter-Zone Routing Table construction. The packet Subject can be:

COMPL_TABLE   Complete the table
UPDATE_TABLE  Update the table
Table 8: Inter-Zone Routing Table Packet Subject

The Inter-Zone Routing Table construction is composed of two stages:

The first stage constructs the initial Inter-Zone Routing Table, which contains entries for the zones that the current zone can reach directly (neighbor zones).

The second stage is the stage in which zones exchange their Inter-Zone Routing Tables with each other to complete them.

In the first stage, each Chief node constructs the initial Inter-Zone Routing Table from its Intra-Zone Table: for each entry in the Intra-Zone table whose nodeType is Border-Node, and for each zone in that entry’s neighZonesId list, an entry is added to the Inter-Zone Routing Table setting destZoneId and nextZoneId to that zone, with zoneM as computed during Intra-Zone Table construction. Figure 5 shows how the initial Inter-Zone Routing Table is constructed at the Chief Border Node.

Figure 5: Initial Inter-Zone Routing Table Construction at Chief Node

In the second stage, the Chief-Border node sends its Inter-Zone Routing Table to all Border-Nodes in its zone and to the neighbor nodes from the Chief-Border’s Border Table, as shown in Figure 6.

IF (n.NodeType = CHIEF-BORDER) THEN
    PART A :
    TempZones := ∅
    For each entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER DO
        For each zone in entryRT.ZoneIds | zone ∉ TempZones DO
            Send a packet P (n.NodeId, entryRT.NextHopId, entryRT.DestNodeId, n.ZoneId, UPDATE_TABLE, n.InterZoneRoutingTable)
            Save zone in TempZones
        ENDDO
    ENDDO
    PART B :
    For each entryBT in BorderTable DO
        Choose randomly node from entryBT.BorderNodesIds
        Send a packet P’ (n.NodeId, node, NULL, n.ZoneId, UPDATE_TABLE, n.InterZoneRoutingTable)
    ENDDO
ENDIF

Figure 6. Chief Border Node Sends the Inter-Zone Routing Table


INPUT : InterZoneRoutingTable at CHIEF-BORDER nodes

When node n receives a packet P :

PART A :
IF (n.NodeType = NORMAL) THEN
    Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = P.FinalDestId
    Send a packet P’ (n.NodeId, entryRT.NextHopId, P.FinalDestId, n.ZoneId, P.Subject, P.ZoneT)
ELSE
    PART B :
    IF (n.NodeType = CHIEF-BORDER) THEN
        UPDATE n.InterZoneRoutingTable
        IF (InterZoneRoutingTable is not complete) THEN
            TempZones := ∅
            For each entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER DO
                For each zone in entryRT.ZoneIds | zone ∉ TempZones DO
                    Send a packet P’ (n.NodeId, entryRT.NextHopId, entryRT.DestNodeId, n.ZoneId, UPDATE_TABLE, n.InterZoneRoutingTable)
                    Save zone in TempZones
                ENDDO
            ENDDO
            For each entryBT in BorderTable DO
                Choose randomly node from entryBT.BorderNodesIds
                Send a packet P’ (n.NodeId, node, NULL, n.ZoneId, UPDATE_TABLE, n.InterZoneRoutingTable)
            ENDDO
        ELSE
            For each entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER DO
                Send a packet P’ (n.NodeId, entryRT.NextHopId, entryRT.DestNodeId, n.ZoneId, COMPL_TABLE, n.InterZoneRoutingTable)
            ENDDO
        ENDIF
    ELSE
        PART C :
        IF (n.ZoneId ≠ P.ZoneId) THEN
            Find entryRT in IntraZoneRoutingTable | entryRT.NodeType = CHIEF-BORDER
            Send a packet P’ (n.NodeId, entryRT.NextHopId, entryRT.DestNodeId, n.ZoneId, P.Subject, P.InterZoneRoutingTable)
        ELSE
            IF (n.NodeId = P.FinalDestId) THEN
                IF (P.Subject = UPDATE_TABLE) THEN
                    For each entryBT in BorderTable DO
                        Choose randomly node from entryBT.BorderNodesIds
                        Send a packet P’ (n.NodeId, node, NULL, n.ZoneId, P.Subject, P.ZoneT)
                    ENDDO
                ELSE
                    Save P.ZoneT in n.InterZoneRoutingTable
                ENDIF
            ELSE
                Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = P.FinalDestId
                Send a packet P’ (n.NodeId, entryRT.NextHopId, P.DestNodeId, n.ZoneId, P.Subject, P.ZoneT)
            ENDIF
        ENDIF
    ENDIF
ENDIF

OUTPUT : InterZoneRoutingTable at BORDER nodes

Figure 7. Inter-Zone Routing Table Construction


Figure 7 shows how nodes act when receiving an Inter-Zone Routing Table. When a Normal node receives an Inter-Zone Routing Table Construction Packet, it just forwards it toward its destination. If a Border node receives an Inter-Zone Routing Table Construction Packet from another zone, it changes the nextZoneId of all the packet’s Inter-Table entries to the packet’s source zoneId and then forwards the packet to the Chief-Border node. If the received packet is from the same zone and the packet’s destinationNodeId is the current Border node, then:

If the packet’s subject is UPDATE_TABLE, forward the packet to the neighbor Border nodes in the Border Table.

If the subject is COMPL_TABLE, save the packet’s Inter-Zone Routing Table in the current Border node. If the packet’s destinationNodeId is not the current Border node, forward it to the packet’s destinationNodeId (getting the nextHopId from the Intra-Table).

If the Chief-Border receives the packet, it updates its Inter-Zone Routing Table. If the Inter-Zone Routing Table is then complete (number of zones = number of entries), it sends the Inter-Table to all Border nodes in the same zone, setting the packet’s Subject to COMPL_TABLE.

If the Inter-Zone Routing Table is not complete, the Chief-Border sends its Inter-Table to all Border nodes in the same zone and to the neighbor nodes of neighbor zones, setting the packet’s Subject to UPDATE_TABLE.

3.2 ZHRP Data Routing

Once the Intra-Zone and Inter-Zone Routing Tables are constructed, data routing can be accomplished easily. The packet structure used in Data Routing contains the following fields:

SrcId        Global id of the sender node
LocalDestId  Destination node Id in the zone
NextHopId    Node Id of the next hop
FinalDestId  Final destination global node Id
DestZoneId   Destination Zone Id
Data         The data to be sent
Table 9: Data Routing Packet Fields

3.2.1 Sending Data Packet

As described in [28], when a node n1 wants to send data to a node n2, it follows the algorithm in Figure 8. If the srcZoneId equals the destZoneId (same zone), the Intra-Table is used to find the information needed to build the packet. Else, if the srcZoneId is not equal to the destZoneId, then if n1 is a Normal node it searches the Intra-Table for Border-Nodes that are neighbors of the destination zone. If such Border-Nodes are found, it picks one of them at random and sends the packet toward it. If none are found, it sends the packet to any Border-Node.

If the sender node n1 is a Border-Node, then if there exists a node in the borderTable such that the destination ZoneId equals that neighbor border node’s ZoneId, send the packet to it.

Else, if there is no such Border Node in the Border Table, search the Inter-Table for an entry (interRecord) such that interRecord.destZoneId equals the packet’s destination ZoneId, then find a borderTable record (borderTableRecord) such that interRecord.nextZoneId equals borderTableRecord.zoneId, and send the packet to borderTableRecord.nodeId.

If no such borderTableRecord exists, search the Intra-Table for a Border Node that is a neighbor of the destZoneId and send the packet to that border node.


When source node n wants to send DATA to destination node n’ :

PART A :
IF (n.ZoneId = n’.ZoneId) THEN
    Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = n’.NodeId
    Send a packet P (n.NodeId, n’.NodeId, entryRT.NextHopId, n’.NodeId, n’.ZoneId, DATA)
ELSE
    PART B :
    IF (n.NodeType = NORMAL) THEN
        TempNodes := ∅
        For each entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER DO
            For each zone in entryRT.ZoneIds DO
                IF (zone = n’.ZoneId) THEN
                    Save entryRT.DestNodeId in TempNodes
                ENDIF
            ENDDO
        ENDDO
        IF (TempNodes ≠ ∅) THEN
            Choose randomly destNode from TempNodes
            Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = destNode
            Send a packet P (n.NodeId, destNode, entryRT.NextHopId, n’.NodeId, n’.ZoneId, DATA)
        ELSE
            Find randomly entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER
            Send a packet P (n.NodeId, entryRT.DestNodeId, entryRT.NextHopId, n’.NodeId, n’.ZoneId, DATA)
        ENDIF
    ELSE
        PART C :
        IF (∃ entryBT in BorderTable | n’.ZoneId = entryBT.neighZoneId) THEN
            Choose randomly node from entryBT.borderNodesIds
            Send a packet P (n.NodeId, node, node, n’.NodeId, n’.ZoneId, DATA)
        ELSE
            Find entryZone in InterZoneRoutingTable | entryZone.DestZoneId = n’.ZoneId
            IF (∃ entryBT in BorderTable | entryZone.NextZoneId = entryBT.neighZoneId) THEN
                Choose randomly node from entryBT.borderNodesIds
                Send a packet P (n.NodeId, node, node, n’.NodeId, n’.ZoneId, DATA)
            ELSE
                TempNodes := ∅
                For each entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER DO
                    For each zone in entryRT.ZoneIds DO
                        IF (zone = entryZone.NextZoneId) THEN
                            Save entryRT.DestNodeId in TempNodes
                        ENDIF
                    ENDDO
                ENDDO
                Choose randomly destNode from TempNodes
                Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = destNode
                Send a packet P (n.NodeId, destNode, entryRT.NextHopId, n’.NodeId, n’.ZoneId, DATA)
            ENDIF
        ENDIF
    ENDIF
ENDIF

Figure 8. Data Packet Sending Algorithm


When a node n receives a data packet P :

PART A :
IF (n.NodeId = P.FinalDestId) THEN
    Process P.data
ELSE
    PART B :
    IF (n.ZoneId = P.DestZoneId) THEN
        Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = P.FinalDestId
        Send a packet P’ (n.NodeId, P.FinalDestId, entryRT.NextHopId, P.FinalDestId, P.DestZoneId, P.data)
    ELSE
        PART C :
        PART C.1 :
        IF (n.NodeType = NORMAL) THEN
            Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = P.LocalDestId
            Send a packet P’ (n.NodeId, P.LocalDestId, entryRT.NextHopId, P.FinalDestId, P.DestZoneId, P.data)
        ELSE
            PART C.2 :
            IF (∃ entryBT in BorderTable | entryBT.neighZoneId = P.DestZoneId) THEN
                Choose randomly node from entryBT.borderNodesIds
                Send a packet P’ (n.NodeId, node, node, P.FinalDestId, P.DestZoneId, P.data)
            ELSE
                PART C.3 :
                TempNodes := ∅
                For each entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER DO
                    For each zone in entryRT.ZoneIds DO
                        IF (zone = P.DestZoneId) THEN
                            Save entryRT.DestNodeId in TempNodes
                        ENDIF
                    ENDDO
                ENDDO
                IF (TempNodes ≠ ∅) THEN
                    Choose randomly destNode from TempNodes
                    Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = destNode
                    Send a packet P’ (n.NodeId, destNode, entryRT.NextHopId, P.FinalDestId, P.DestZoneId, P.data)
                ELSE
                    PART C.4 :
                    Find entryZone in InterZoneRoutingTable | entryZone.DestZoneId = P.DestZoneId
                    IF (∃ entryBT in BorderTable | entryBT.neighZoneId = entryZone.NextZoneId) THEN
                        Choose randomly node from entryBT.borderNodesIds
                        Send a packet P’ (n.NodeId, node, node, P.FinalDestId, P.DestZoneId, P.data)
                    ELSE
                        PART C.5 :
                        TempNodes := ∅
                        For each entryRT in IntraZoneRoutingTable | entryRT.NodeType = BORDER DO
                            For each zone in entryRT.ZoneIds DO
                                IF (zone = entryZone.NextZoneId) THEN
                                    Save entryRT.DestNodeId in TempNodes
                                ENDIF
                            ENDDO
                        ENDDO
                        Choose randomly destNode from TempNodes
                        Find entryRT in IntraZoneRoutingTable | entryRT.DestNodeId = destNode
                        Send a packet P’ (n.NodeId, destNode, entryRT.NextHopId, P.FinalDestId, P.DestZoneId, P.data)
                    ENDIF
                ENDIF
            ENDIF
        ENDIF
    ENDIF
ENDIF

Figure 9. Data Packet Receiving Algorithm


3.2.2 Receiving Data Packet

As described in [28], when a node n receives a packet p, it acts as follows (Figure 9). If n.nodeId equals p.finalDestId, the packet has arrived at its destination, so the node processes the data. If n.zoneId equals p.destZoneId, then p has arrived in the destination zone but not at the destination node; the node finds the entry in the Intra-Table whose destNodeId equals p.finalDestId and forwards the packet accordingly. If p is not in the destination zone, then if n.nodeType is Normal-Node, the node forwards the packet to p.localDestId. If n is a Border-Node and is a neighbor of p.destZoneId, it sends the packet to a neighbor node in that neighbor zone. If it is not a neighbor, it looks in the Intra-Table for a Border Node in the same zone that is a neighbor of p.destZoneId. If such a node exists, it forwards the packet to it. Else, the destination zone is not a neighbor of this zone, so the node uses the Inter-Table to determine to which zone the packet should be forwarded.

Figure 10 shows how a data packet is sent from Emett in Z4 to Dest in Z9. In Z4, Emett wants to send a packet to Dest in Z9. First, it uses its Intra-Zone Routing Table to send the packet to a Border Node in Z4. The Border Node uses the Inter-Zone Routing Table to send the packet to the next zone, Z7. In Z7, the Intra-Zone Routing Table is used to reach a Border Node, and then the Inter-Zone Routing Table to reach zone Z8. Z8 follows the same procedure, and the packet reaches a Border Node in zone Z9. Finally, this node uses its Intra-Zone Routing Table to forward the packet to Dest.

Figure 10. Packet Emitted from Zone 4 to Zone 9

4 Simulations and results

In this section, we show the implementation of all stages of the ZHRP protocol. The implementation has been done using a simulation framework called Omnet++ [29]. Omnet++ is not a simulator; it is a discrete event network simulation framework. In other words, it provides infrastructure and tools to build network simulations. A simulator based on Omnet++, the Castalia WSN simulator [30], is used to build the simulation.

4.1 ZHRP Error Ratio

In this simulation, the value of R (zone radius) was 5, 15, and 25 nodes, and the number of nodes was 200, 300, and 400. Figure 11 shows how the error ratio changes as R, the number of zones, and the number of packets change.

Figure 11. Error Ratio

In Figure 11, when R has a value of 10 or higher, the error ratio is always zero whatever the number of zones and the number of nodes. On the other hand, when R has the value 5, some nodes are not associated with any zone, but as the number of zones increases, the error ratio decreases until it becomes 0 at 25 zones. That is because the number of INVITING nodes increases as the number of zones increases, and the number of nodes in each zone increases as R increases, so more nodes join a zone.

4.2 ZHRP Installation

The ZHRP installation is implemented in three stages: Zones Construction, Intra-Zone Routing Table construction, and Inter-Zone Routing Table construction. In each stage, the number of sent and received packets is monitored as the number of


zones and the zone radius R change. The radius takes the values 5, 15, and 25, while the number of zones varies over 5, 10, 15, 20, and 25. The simulation takes place with 200, 300, and 400 nodes.

4.2.1 ZHRP Zones Construction

Figure 12 and Figure 13 show that an increase in R correlates with an increase in the number of sent and received packets. However, when R takes the value 15 or 25, we get equal numbers of sent and received packets. That is because the zones were already neighbors, but the nodes within the radius are fewer than 10. Increasing R lets more nodes join the zone, hence more packet exchanges occur. When the number of zones increases, the number of packets sent/received increases, because increasing the number of zones increases the number of inviting nodes, so more INVITATION messages are transmitted.

Figure 12: Zones Construction Sent Packets

Figure 13: Zones Construction Received Packets

4.2.2 ZHRP Intra-Zone Routing Table Construction

As shown in Figure 14 and Figure 15, in Intra-Zone Routing Table construction, when the number of zones increases, the number of sent and received packets decreases. This is because the number of nodes in each zone decreases when the number of zones increases, so the communication between nodes in the same zone decreases. When R has the value 10 or more, the values become the same for the same number of zones and number of nodes. This is because the zone is fully constructed (has Border nodes) before R reaches 10. The number of sent packets decreases from 200 to 75 when the number of zones increases from 5 to 25, while the number of received packets decreases from 550 to 250.

Figure 14: Intra-Zone Routing Table Sent Packets

Figure 15: Intra-Zone Routing Table Received Packets

4.2.3 ZHRP Inter-Zone Routing Table Construction

Figure 16 and Figure 17 show that when the number of zones increases, the number of packets sent and received increases. When the number of nodes = 400, the number of sent and received packets increases until it reaches a maximum of 900 for sending and 700 for receiving. When the number of nodes = 300, the number of sent packets reaches 890 and that


of the received ones reaches 700. When the number of nodes = 200, the number of sent packets reaches 650 and the number of received packets reaches 450. The peaks in the graphs show the maximum number of packets sent and received that the WSN can reach as the number of zones increases. When the number of zones is less than the value at the peak, a zone that wants to reach another zone must pass through many zones to reach the destination zone. But when the number of zones is greater than the value at the peak, each zone has many neighbors. Hence, in order for a zone to reach another zone, it passes through fewer zones, because it has more neighbors and the probability that the destination zone is a neighbor increases. For 400 nodes, the peak is at number of zones = 20. For 300 nodes, the peak is at 15. For 200 nodes, the peak is not reached before the number of zones equals 25.

Figure 16: Inter-Zone Routing Table Sent Packets

Figure 17: Inter-Zone Routing Table Received Packets

4.3 Memory capacity

Another important evaluation metric for the WSN routing protocol that we propose is the size of the data structure used for routing. Using table-driven algorithms may need significant memory space in the context of largely deployed networks. We already reduce this complexity by the two-level routing tables that we construct. Next, we are interested in the space complexity, in terms of the number of bytes occupied by the involved data structures. As far as we know, no other routing mechanism proposed in the literature for wireless sensor networks considers this metric. The formula for computing the size (in bytes) of the routing data structures is given in Table I, for N deployed nodes, when nZ zones are constructed, each zone having on average nB border nodes.

4.4 Lower bound for the number of zones

The previous metric does not only estimate the size of the data structure needed to assure pro-active routing based on routing tables; it also gives a lower bound on the number of zones. This computation is based on a memory limit imposed on sensors with respect to the total memory capacity of a sensor. Depending on the technology used, this total capacity may vary. Therefore, we make the following assumption: nodes technically dispose of MEM_RAM RAM memory capacity. Obviously, only a fraction of the total available memory can be used for protocol data structures; we express it by the MAX_MEM_PRCTG percentage (%). Considering the theoretical memory capacity needed by the protocol for a normal node, we obtain a constraint that gives a lower bound for the number of zones.

We neglected in our estimation the memory capacity for the border nodes. Meanwhile, memory constraints for border nodes can be included by judiciously varying the MAX_MEM_PRCTG constant.

4.5 Routing in ZHRP

In this simulation, a variable number of events takes place over the WSN. Nodes that receive those events send their data to a destination node in another zone. The total number of events varies between 300, 400, 500, and 600. The number of nodes is 400. The number of zones changes between 10, 20, and 30.

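The lower-bound computation of Sections 4.3–4.4 can be sketched numerically. This is a minimal illustration only: the per-entry size of 6 bytes and the uniform node-per-zone split are assumptions, since the paper's exact formula (Table I) is not reproduced in this excerpt.

```python
# Illustrative sketch of the Section 4.4 lower bound on the number of zones.
# ENTRY_BYTES and the uniform node-per-zone split are assumptions; the
# paper's exact formula (Table I) is not reproduced here.

ENTRY_BYTES = 6  # assumed size of one routing-table entry (destination, next hop, cost)

def intra_zone_table_bytes(n_nodes: int, n_zones: int) -> float:
    """Intra-zone routing table of a normal node: one entry per node in its zone."""
    return (n_nodes / n_zones) * ENTRY_BYTES

def min_zones(n_nodes: int, mem_ram: int, max_mem_prctg: float) -> int:
    """Smallest nZ such that a normal node's routing data fits in the
    MAX_MEM_PRCTG fraction of its MEM_RAM bytes of RAM."""
    budget = mem_ram * max_mem_prctg / 100.0
    n_z = 1
    while intra_zone_table_bytes(n_nodes, n_z) > budget:
        n_z += 1
    return n_z

# Example: 400 deployed nodes, 4 KiB of RAM, 10% reserved for routing data.
print(min_zones(400, 4096, 10))  # -> 6
```

With these (assumed) constants, at least 6 zones are needed for 400 nodes; a larger reserved percentage relaxes the bound, as the text notes for border nodes.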

4.5.1 Sending Packets
Figure 18 and Figure 19 show the number of sent packets when the number of zones equals 10 and 20, respectively.

Figure 18: Z10 Sent Packets

Figure 19: Z20 Sent Packets

4.5.2 Receiving Packets
Figure 20 and Figure 21 show the number of received packets when the number of zones equals 10 and 20, respectively.

Figure 20: Z10 Received Packets

Figure 21: Z20 Received Packets

5 Battery Consumption

Table 10. Characteristics of the MICA2 Sensor
CPU Consumption: 8 mAh
Receiving Consumption: 10 mAh
Transmitting Consumption: 27 mAh
Initial Energy: 2900 mAh
Voltage: 3 V
Data Transfer Rate: 38400 bits/s
Communication Range: 500 ft

Energy consumption has been calculated based on the characteristics of the MICA2 [31] sensor node, shown in Table 10. In this example, we calculated the battery consumption during ZHRP. We changed the number of nodes between 200, 300, and 400, and the number of zones between 5, 15, and 25; for R we used the values 5 and 25. We also calculated the battery consumption during the routing scenario using ZHRP.

5.1 Battery Consumption in Zones Construction
Battery consumption for the Zones Construction stage is shown in Figure 22 for the sent packets and in Figure 23 for the received packets.

Figure 22: Zone Construction Sent Packets
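A back-of-the-envelope version of this calculation can be made from the Table 10 figures. Two assumptions are made here: the 8/10/27 values are treated as current draws in mA (the table prints them as mAh), and a 30-byte packet is assumed; neither is stated in the text.

```python
# Rough per-packet charge model from the MICA2 figures in Table 10.
# Assumptions: the 10/27 values are current draws in mA, and packets
# are 30 bytes (240 bits); neither is specified in the paper.

DATA_RATE_BPS = 38400   # Table 10: data transfer rate
TX_CURRENT_MA = 27.0    # Table 10: transmitting
RX_CURRENT_MA = 10.0    # Table 10: receiving
INITIAL_MAH = 2900.0    # Table 10: initial energy

def packet_charge_mah(current_ma: float, packet_bits: int) -> float:
    """Charge drawn while the radio is busy with one packet: current x airtime."""
    airtime_hours = packet_bits / DATA_RATE_BPS / 3600.0
    return current_ma * airtime_hours

def packets_on_full_battery(packet_bits: int = 240) -> float:
    """Optimistic bound on send+receive packet pairs a node can handle,
    ignoring CPU and idle-listening draw."""
    per_pair = (packet_charge_mah(TX_CURRENT_MA, packet_bits)
                + packet_charge_mah(RX_CURRENT_MA, packet_bits))
    return INITIAL_MAH / per_pair
```

Because idle listening and CPU draw are ignored, this only bounds radio cost; it nevertheless shows why reducing the packet count, as ZHRP's aggregation later does, dominates lifetime.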


Figure 23: Zone Construction Received Packets

5.2 Battery Consumption in Intra-Zone Routing Table Construction
Battery consumption for the Intra-Zone Routing Table Construction stage is shown in Figure 24 for the sent packets and in Figure 25 for the received packets.

Figure 24: Intra-Zone Routing Table Sent Packets

Figure 25: Intra-Zone Routing Table Received Packets

5.3 Battery Consumption in Inter-Zone Routing Table Construction
Battery consumption for the Inter-Zone Routing Table Construction stage is shown in Figure 26 for the sent packets and in Figure 27 for the received packets.

Figure 26: Inter-Zone Routing Table Sent Packets

Figure 27: Inter-Zone Routing Table Received Packets

5.4 Battery Consumption in ZHRP Routing
Battery consumption in ZHRP routing is shown in Figure 28 for the sent packets and in Figure 29 for the received packets.

Figure 28: ZHRP Routing Sent Packets

Figure 29: ZHRP Routing Received Packets


6 Conclusion
A WSN is a collection of sensor nodes deployed randomly over a large area. These sensor nodes have limited resources, such as a small processing unit, battery, and memory, and a limited transmission range. The sensors need to cooperate to route and send data to the base station. However, this routing should be energy efficient in order to increase the WSN lifetime.
In this work, the WSN was introduced together with its characteristics and applications. We then described some routing protocols, which are very important for energy-consumption efficiency. ZHRP, a hierarchical routing protocol, was introduced in depth and its stages were described. ZHRP was also implemented and tested.

7 References
[1] K. Beydoun and V. Felea, "Energy-efficient WSN infrastructure," in Collaborative Technologies and Systems (CTS 2008), International Symposium on, 2008, pp. 58-65.
[2] Karl H. and Willig A., "A short survey of wireless sensor networks," Telecommunication Networks Group, Technical University Berlin, Berlin, October 2003.
[3] Akyildiz I.F., Su W., Sankarasubramaniam Y., and Cayirci E., "Wireless sensor networks: a survey," Computer Networks, Vol. 38, 2002, pp. 393-422.
[4] Tubaishat M. and Madria S., "Sensor Networks: An Overview," IEEE Potentials Magazine, Vol. 22(2), 2007, pp. 20-23.
[5] Perkins C., "Ad hoc On-Demand Distance Vector (AODV) Routing," Network Working Group, July 2003. http://rfc.sunsite.dk/rfc/rfc3561.html
[6] Broch D., Johnson D., and Maltz J., "DSR: The Dynamic Source Routing Protocol for Multihop Wireless Ad Hoc Networks," in C.E. Perkins (ed.), Ad Hoc Networking, Addison-Wesley, 2001, pp. 139-172.
[7] Bhagwat P. and Perkins C., "Highly Dynamic Destination-Sequenced Distance-Vector Routing (DSDV) for Mobile Computers," ACM SIGCOMM'94 Conference on Communications Architectures, Protocols and Applications, London, United Kingdom, 1994, pp. 234-244.
[8] Tilak S., Abu-Ghazaleh N.B., and Heinzelman W., "A Taxonomy of Wireless Micro-Sensor Network Models," Mobile Computing and Communications Review, Vol. 6, 2002, pp. 28-36.
[9] Kamal A.E. and Al-Karaki J.N., "Routing techniques in wireless sensor networks: a survey," IEEE Wireless Communications, Vol. 11, December 2004, pp. 6-28.
[10] Estrin D., Intanagonwiwat C., and Govindan R., "Directed diffusion: A scalable and robust communication paradigm for sensor networks," Proceedings of ACM MobiCom'00, Boston, 2000, pp. 56-67.
[11] Yu Y., Estrin D., and Govindan R., "Geographical and Energy-Aware Routing: A Recursive Data Dissemination Protocol for Wireless Sensor Networks," Technical Report, UCLA Computer Science Department, May 2001.
[12] Chen B., Jamieson K., Balakrishnan H., and Morris R., "SPAN: An energy-efficient coordination algorithm for topology maintenance in ad hoc wireless networks," Wireless Networks, Vol. 8, September 2002, pp. 481-494.
[13] Xu Y., Heidemann J., and Estrin D., "Geography-informed Energy Conservation for Ad-hoc Routing," Proceedings of the Seventh Annual ACM/IEEE International Conference on Mobile Computing and Networking, 2001, pp. 70-84.
[14] Heinzelman W., Chandrakasan A., and Balakrishnan H., "Energy-Efficient Communication Protocol for Wireless Microsensor Networks," Proceedings of the 33rd Hawaii International Conference on System Sciences (HICSS '00), 2000.
[15] Rus D., Li Q., and Aslam J., "Hierarchical Power-aware Routing in Sensor Networks," Proceedings of the DIMACS Workshop on Pervasive Networking, May 2001.
[16] Hebden P. and Pearce A.R., "Distributed Asynchronous Clustering for Self-Organisation of Wireless Sensor Networks," Proceedings of the Fourth International Conference on Intelligent Sensing and Information Processing (ICISIP-06), Bangalore, India, 2006, pp. 37-42.
[17] Niu Y., Y. Z., and Hua F., "Sub Cluster Aided Data Collection in Multihop Wireless Sensor Networks," IEEE Wireless Communications and Networking Conference (WCNC), Kowloon, China, 2007, pp. 3967-3971.
[18] Fahmy S. and Younis O., "Distributed clustering in ad-hoc sensor networks: A hybrid, energy-efficient approach," Proceedings of the IEEE INFOCOM Conference on Computer Communications, Hong Kong, 2004.
[19] Boutaba R. and Aoun B., "Clustering in WSN with Latency and Energy Consumption Constraints," Journal of Network and Systems Management, Vol. 14, September 2006.
[20] Lin H. and Chu Y., "A clustering technique for large multihop mobile wireless networks," Vehicular Technology Conference Proceedings, Tokyo, Japan, Vol. 2, 2000, pp. 1545-1549.
[21] Younis O. and Fahmy S., "HEED: A hybrid, energy-efficient, distributed clustering approach for ad hoc sensor networks," IEEE Transactions on Mobile Computing, Vol. 3, Issue 4, 2004, pp. 366-379.
[22] Subramanian L. and Katz R., "An architecture for building self-configurable systems," in Proceedings of the IEEE/ACM Workshop on Mobile Ad Hoc Networking and Computing, Boston, MA, 2000.
[23] Lindsey S. and Raghavendra C., "PEGASIS: power-efficient gathering in sensor information systems," in IEEE Aerospace Conference Proceedings, Vol. 3, 2002, pp. 1125-1130.
[24] Hamma T., Katoh T., Bista B.B., and Takata T., "An Efficient ZHLS Routing Protocol for Mobile Ad Hoc Networks," 17th International Conference on Database and Expert Systems Applications, 2006, pp. 66-70.
[25] K. Beydoun, V. Felea, and H. Guyennet, "Wireless sensor network infrastructure: construction and evaluation," in Wireless and Mobile Communications (ICWMC '09), Fifth International Conference on, 2009, pp. 279-284.
[26] K. Beydoun and V. Felea, "Wireless sensor networks routing over zones," in Software, Telecommunications and Computer Networks (SoftCOM), 2010 International Conference on, 2010, pp. 402-406.
[27] Walden D., "The Bellman-Ford Algorithm and 'Distributed Bellman-Ford'," [Online], 2009.
[28] Kamal Beydoun, "Conception d'un Protocole de Routage Hiérarchique pour les Réseaux de Capteurs," PhD report, Franche-Comté, 2009.
[29] OMNeT++ User Manual, 2014. https://omnetpp.org/
[30] A. Boulis, "Castalia: Simulator for Wireless Sensor Networks and Body Area Networks," User Manual [Online], 2009. https://castalia.forge.nicta.com.au
[31] Crossbow, MICA2 Datasheet [Online], 2009. http://www.xbow.com/products/Product_pdf_files/Wireless_pdf/MICA2_Datasheet.pdf


Zone Hierarchical Routing Protocol with Data Aggregation

Kamal Beydoun
Department of Computer Science, Lebanese University, Beirut, Lebanon
Email: [email protected]

ABSTRACT
A wireless sensor network (WSN) is a network formed by a large number of sensor nodes, each equipped with a sensor to detect physical phenomena such as light, heat, or pressure. WSNs are regarded as a revolutionary information-gathering method for building the information and communication systems that will greatly improve the reliability and efficiency of infrastructure systems. Compared with wired solutions, WSNs feature easier deployment and better flexibility of devices. In energy-constrained sensor network environments, it is unsuitable, in terms of battery power, processing ability, storage capacity, and communication bandwidth, for each node to transmit its data separately to the sink node: in sensor networks with high coverage, the information reported by neighboring nodes has some degree of redundancy, so transmitting data separately from each node consumes bandwidth and energy across the whole sensor network and shortens its lifetime. To avoid these problems, data aggregation techniques have been introduced. Data aggregation is the process of integrating multiple copies of information into one copy that is effective and able to meet user needs, performed at intermediate sensor nodes. In this paper, we propose a data aggregation solution for the routing protocol ZHRP (Zone Hierarchical Routing Protocol). This solution efficiently improves the lifetime of the WSN.

KEYWORDS
Wireless Sensor Networks — Hierarchical Routing — Data Aggregation.

1. Introduction
Recent technological advances led to the development of very small and low-cost sensor devices with computational, processing, data storage, and communication capabilities. These devices, called wireless sensor nodes, form a Wireless Sensor Network (WSN) when deployed in an area (indoors or outdoors). The initial development of WSNs was motivated by military applications such as enemy detection and battlefield surveillance. As years went by, considerable research effort has enabled the actual implementation and deployment of sensor networks tailored to the unique requirements of certain sensing and monitoring applications. Nowadays, WSNs are a very promising tool for monitoring events and are used in many other fields, such as agriculture, environmental monitoring of air and water pollution, greenhouses, health monitoring, structural monitoring, and more. Given the benefits offered by WSNs compared to wired networks, such as simple deployment, low installation cost, lack of cabling, and high mobility, WSNs present an appealing technology as a smart infrastructure for building and factory automation and for process-control applications.
WSNs raise many topics; the most important and challenging ones are the WSN structure and WSN routing. Routing allows sending a packet from source to destination by detecting the optimal routing path, taking energy consumption into consideration.
Routing protocols should accomplish some objectives to be efficient. The main objectives are:
• Minimize the communication between nodes to decrease the energy consumption.
• Perform reliable multi-hop communications.
• Find the optimal routing paths.
• Provide auto-configuration.
Routing protocols can be flat or hierarchical. In flat protocols, all nodes have the same roles, while in hierarchical routing, some nodes may have specific roles in the routing to minimize communications and make energy consumption efficient. Sensor nodes are tiny and cheap devices that use little energy. Each node mainly has three tasks: sense data, process data, and transmit data. Nodes can sense various types of data, such as temperature, humidity, pressure, and the presence or absence of an object, and can be attached to objects to capture characteristics like size, speed, and location.
In all three tasks, the sensor node consumes energy. Hence, decreasing the energy consumption during these three stages will greatly increase the WSN lifetime. According to [3], the battery consumption of transmitting data is higher than that of processing the same amount of data. Each sensor node consists of four main components, as shown in Figure 1:
• Sensing Subsystem: senses data from the environment.
• Processor Subsystem: processes data and performs operations.
• Transceiver Subsystem: sends and receives packets.
• Power Supply Subsystem: usually the battery.

Figure 1: Sensor Node Architecture

1.1 Wireless Sensor Networks
Sensor nodes offer a powerful combination of distributed sensing, computing, and communication. The ever-increasing capabilities of these tiny sensor nodes, which include sensing, data processing, and communicating, enable the realization of WSNs based on the collaborative effort of a number of sensor nodes. They enable a wide range of applications and, at the same time, pose numerous challenges due to their peculiarities, primarily the stringent energy constraints to which sensing nodes are typically subjected. A wireless sensor network is a highly distributed and randomly deployed wireless network consisting of a large number of sensor nodes called motes. These nodes work with each other to sense data from the environment over a large area and send them to the base station. A very important factor in the lifetime of the WSN is energy consumption: since sensor nodes are driven by small batteries, they have a limited energy resource. Whenever sensors sense data, compute, or communicate, they consume energy and the lifetime of the WSN decreases. Therefore, battery consumption should be decreased efficiently to increase the network lifetime. Energy consumption is not the only challenge for WSNs; there are others, such as limited memory, limited processing power, and limited communication range.
WSN nodes have a limited transmission range, so they cannot communicate with the base station directly; they must cooperate with each other to deliver the data packets to the base station, which is responsible for collecting data from the WSN. Nodes send data packets to their neighboring nodes, those within the range of the transmitting nodes, and these neighbors forward the packets onward until they reach the base station. This consumes power because of packet transmission, so communication should be reduced to a minimum in order to lower battery consumption and, as a result, increase the network's lifetime.

1.2 Wireless Sensor Characteristics
Any WSN has some common characteristics, such as:
• Infrastructure-less: A WSN initially has no structure, but it may define one after deployment.
• Large Area and Large Number of Nodes: A WSN contains a large number of sensor nodes and can cover a very large area.
• Many Interferences: Nodes in the WSN may receive many packets at the same time; packets may collide and be lost.
• Security Issues: WSNs are highly exposed to security breaches, and nodes can be hacked easily.
• Limited Transmission Range: Nodes are tiny and have small antennas and small batteries, so their transmission range is limited.
• Limited Memory: Because of the small size of the nodes, they contain small and limited memories.
• Limited Computing Power: The processing unit in the nodes is small and has limited resources due to the size of the nodes.
• Dynamic Topology: Nodes may die, be added, or even move; therefore, the WSN dynamically changes its structure.

1.3 Wireless Sensor Applications
Wireless Sensor Networks (WSNs) are used in many applications, which can be divided into three categories:
• Monitoring of areas
o Environment and Habitat: forest fire detection, animal monitoring
o Military: monitoring friendly forces, ammunition
o Agriculture: farming
• Monitoring of objects
o Structures: critical building monitoring, machine status
o Medical Diagnosis: blood pressure monitoring
• Monitoring both areas and objects
o Asset Tracking: vehicle tracking
o HealthCare: monitoring patients
o Disaster Management: volcanic monitoring

2. Related Work
2.1 Routing Protocols in WSN
Energy consumption is one of the main challenges in wireless sensor networks. Energy saving assures a long lifetime for the system. Another main goal is reducing the size of the stored data (e.g. the routing table) in each node of the network. Clustering is an important technique for prolonging the system lifetime and reducing the size of the stored data. In clustering, nodes are gathered in several groups, generally disjoint, named clusters. Each cluster has a cluster head (CH). The nodes collect data and send it to the CH, which forwards this data to the final user or Base Station (BS). CHs can communicate with the Base Station directly or via other CHs. There are many existing clustering protocols. LEACH [22] is a distributed clustering-based protocol that uses randomized rotation of the CHs to evenly distribute the energy load among the sensors in the network. LEACH assumes that the fixed sink is located far from the sensors and that all sensors in the network are homogeneous and battery-constrained. Lin's protocol [23] is a distributed clustering technique for large multi-hop mobile wireless networks. The cluster structure is controlled by the hop distance. In each cluster, one of the nodes is designated as cluster head; other nodes join a cluster if they are within a predetermined maximum number of hops from the cluster head. HEED [24] is a distributed clustering protocol that periodically selects cluster heads according to a hybrid function between their residual energy and a secondary

parameter, such as node proximity to its neighbors or node degree. In the CES distributed protocol [25], each sensor computes its weight based on the k-density, the residual energy, and the mobility features. Then it broadcasts the weight to its 2-hop neighborhood. The sensor node having the greatest weight in its 2-hop neighborhood becomes the cluster head, and its neighboring sensors join its cluster. SPAN [26] is a distributed, randomized protocol in which nodes make local decisions on whether to sleep or to join a coordinator role that rotates over time. Each node makes its decision depending on the amount of available energy on the node and on its degree (the number of its neighbors when the node is active). SPAN operates under the routing layer and above the MAC and physical layers: the routing layer uses information SPAN provides, and SPAN leverages any power-saving features of the underlying MAC layer [26]. The centralized PEGASIS protocol [27] constructs chains instead of clusters. Each node delivers the sensed data to the nearest neighbor node. One sensor node on the chain is assigned as the cluster head node that delivers sensed data to the base station. The head node is selected in turns; this technique evens out energy consumption in wireless sensor networks. However, the PEGASIS protocol causes redundant data transmissions, since one of the nodes on the chain is selected as the head node regardless of the base station's location. In [28], the authors propose the enhanced PEGASIS protocol based on the "concentric clustering" scheme to solve this problem: clusters have the shape of concentric circles. Similar to PEGASIS, the SHORT protocol [29] adopts centralized approaches and requires a powerful BS to take responsibility for managing the network topology and for calculating the routing path and time schedule for data collection.
Most topologies based on clusters assume that cluster heads are high-energy nodes whose transmission power can be adapted in order to reach the base station at far distances and to communicate directly with other cluster heads. Another assumption is that nodes within a cluster can directly communicate with the cluster head. In SHORT, HEED, CES, PEGASIS, and Enhanced PEGASIS, all nodes are supposed to have the ability to modify the transmission power in order to control topology. The LEACH radio model [22] is used for these protocols. The requirement of adaptive and dynamic transmission power modification can be prohibitive, especially for sensors not equipped with a transmission amplifier. SPAN uses the radio model of the Cabletron RoamAbout 802.11 card, which has a fixed transmission range and does not support power control. Lin's protocol does not mention the radio model used for simulations. The transmission range defines the set of neighbors for a sensor node, those able to receive the transmitted signals. Because variation of the transmission range consumes more resources, virtual topologies should be proposed for sensor networks made of sensors with fixed transmission power. The challenge addressed in this paper presents an approach of virtual structuring of networks without using topology control techniques. Our contribution to topology construction addresses two main issues in WSNs: distributed approaches, and energy efficiency. Moreover, our approach is independent of the embedded sensor technology (being able to vary the transmission power or not); the only parameter considered is the current node's transmission range. The algorithm is executed simultaneously with the neighborhood discovery protocol for random sensor node deployments. In the next section, we will detail our work for structuring wireless sensor networks into zones, which is not a real clustering algorithm like the cited related work. Therefore, no cluster heads exist in our topology, and no other information on the network (e.g. geographic position) is required.

2.2 Data Aggregation in WSN
Sensor networks are distributed event-based systems that differ from traditional communication networks in several ways: sensor networks have severe energy constraints, redundant low-rate data, and many-to-one flows. Data-centric mechanisms that perform in-network aggregation of data are needed in this setting for energy-efficient information flow. Because of the requirement of unattended operation in remote or even potentially hostile locations, sensor networks are extremely energy-limited. However, since various sensor nodes often detect common phenomena, there is likely to be some redundancy in the data the various sources communicate to a particular sink. In-network filtering and processing techniques can help to conserve the scarce energy resources. Data aggregation has been put forward as an essential paradigm for wireless routing in sensor networks [3, 6]. The idea is to combine the data coming from different sources, eliminating redundancy and minimizing the number of transmissions, and thus saving energy. This paradigm shifts the focus from the traditional address-centric approaches for networking (finding short routes between pairs of addressable end-nodes) to a more data-centric approach (finding routes from multiple sources to a single destination that allow in-network consolidation of redundant data).
Data aggregation is the process of collecting and summarizing data from the sensor nodes in a way that reduces the communications between nodes, so the energy consumption of the nodes is decreased, hence increasing the network lifetime.

Figure 2: Data Aggregation

Figure 2 shows how sensor nodes S1, S2, ..., Sn send their packets S'1, S'2, ..., S'n to a data collector node called the aggregator node A, which collects the data and eliminates the redundant data. The aggregator uses some method (f in Figure 2) to remove redundant data and produce the aggregated, filtered data y'. These methods can be statistical, as in [13], probabilistic, as in [14], or based on artificial intelligence, as in [15]. The filtered data y' is then sent to the base station R.
The main goal of data aggregation protocols is to gather and aggregate data in an energy-efficient manner. These protocols can be classified as structure-based and structure-free data aggregation protocols. In structure-based protocols, data are transmitted to the base station by creating a chain [6], a tree (EIPDAP [2]), a cluster [16], a tree-cluster [17], or a clustering hierarchy [18].
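As a toy illustration of the aggregation step f of Figure 2, the sketch below drops exact duplicates and averages the rest. The de-duplication-plus-mean rule is only a stand-in for the statistical, probabilistic, or AI-based methods cited in the text.

```python
# Toy version of the aggregation function f of Figure 2: the aggregator A
# collects readings S'1..S'n, drops exact duplicates as redundant, and
# forwards a single summary y' to the base station. Real methods are
# statistical, probabilistic, or AI-based, as cited in the text.

def aggregate(readings: list[float]) -> float:
    """f: eliminate redundant copies, then reduce to one value y'."""
    distinct = set(readings)              # duplicate elimination
    return sum(distinct) / len(distinct)  # summary sent instead of n packets

# Three sensors report the same temperature, one differs: 4 packets -> 1 value.
y_prime = aggregate([21.5, 21.5, 21.5, 22.0])
print(y_prime)  # -> 21.75
```

The point of the example is the traffic reduction: the base station receives one packet carrying y' instead of n near-identical readings.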

3. EIPDAP
Efficient Integrity-Preserving Data Aggregation Protocol (EIPDAP) [2] is an aggregation protocol that can verify the integrity of an aggregation result immediately after receiving the aggregation result and the corresponding authentication information. The integrity verification is not done through another query-and-forward phase; for this reason, energy consumption and communication delay are reduced significantly.
EIPDAP needs some network assumptions to work. The first assumption is that the base station must be powerful, with a transmission range large enough to cover the wireless sensor network, so that it can broadcast messages to all nodes directly, because the base station needs to broadcast an authenticated query before the aggregation phase. The second assumption is that the wireless sensor network should form a tree topology with the base station as the root.
EIPDAP is based on the elliptic curve discrete logarithm problem with a hierarchical aggregator topology. EIPDAP's goal is to prevent stealthy attacks, where the attacker tries to send wrong data to the base station and make it accept them. Each node in the wireless sensor network has a unique identifier s and private keys r and l ∈ Zp, and shares with the base station a private key sk and a private point Ө ∈ the cyclic elliptic group E(Zp). The generator point G ∈ E(Zp) is preloaded to all the nodes, together with two further parameters defined in [2].
EIPDAP is accomplished in three main phases: query dissemination, aggregation-commit, and result checking.

3.1 Query Dissemination
In the dissemination phase, aggregation tree information is collected; if the aggregation tree is not yet constructed, it is constructed during this phase. Then the base station calculates path-keys and an edge key for each node, encrypts them with the secret key shared between the base station and the node, and sends them to the corresponding node.

3.2 Aggregation-Commit

Figure 3: Aggregation-Commit phase

3.3 Result Checking
In the result-checking phase, the base station verifies the integrity of the aggregated values with the two tags. As a result, the base station can check the integrity of the aggregated data immediately after receiving the aggregated data and their corresponding authentication information, which reduces energy consumption, unlike other protocols that require another query phase to check the integrity of the aggregated data. EIPDAP is energy efficient because the result-checking phase is done at the base station; hence there is no congestion in the aggregation tree during the result-checking phase.

4. Integration of EIPDAP in ZHRP
As described previously, the ZHRP protocol splits the WSN into disjoint zones. Each zone has one Inviting Node, plus Normal and Border Nodes. Each node has an Intra-Zone Routing Table that contains the path cost to reach a destination node in the same zone; this path is the minimal one among the possible paths. The Border Nodes have a Border Table listing their neighbor nodes in the neighboring zones. In addition, the Border Nodes have the Inter-Zone Routing Table, which contains the cost of passing through a zone when sending a packet. In ZHRP, a packet follows the shortest route and therefore passes through fewer nodes until it reaches its destination. However, if x nodes want to send data packets to their destinations, each packet will have a route to pass through, hence there will be x routes, and many nodes will have to send and receive packets. To decrease the number of routes, packets must be combined into one packet as much as possible.
Data aggregation is the process of collecting and summarizing data. This process reduces the amount of data to be sent from one node to another, which reduces communications and decreases energy consumption, thereby increasing the WSN lifetime.
In ZHRP, when an event occurs, many nodes in the same zone may send the same sensed data (event) to the same destination, so there will be x packets, each with its own path, and the packets may be redundant. This scenario wastes energy. Therefore, to solve this problem, we build aggregation trees in all the zones. These aggregation trees collect data, summarize them, and send them as one packet to the destination node. In this way, we greatly reduce the number of packets to be sent from one zone to another: the x packets may become one packet. This decreases the number of send and receive actions at the nodes, so energy consumption becomes more efficient.

4.1 Aggregation Tree Construction Algorithm
Table 1: Tree Construction Packet Fields
SourceId: Node Id of the sending node
TreeId: Tree Id that the packet comes from
Level: Level of the sending node
ZoneId: Zone Id of the sending node

After finishing all ZHRP stages (the zone construction stage, the Intra-Zone Routing Table construction stage, and the Inter-Zone Routing Table construction stage), each zone starts constructing the aggregation trees. Each Border Node becomes the root of a tree and starts the construction of that tree; the tree will

International Journal of Computer Science and Information Security (IJCSIS), Vol. 15, No. 2, February 2017. ISSN 1947-5500. https://sites.google.com/site/ijcsis/
have a treeId equal to the root's nodeId. During the tree construction, a packet is used with the fields illustrated in Table 1.

When a node receives a tree construction packet and joins the tree, it must send a child packet to the sender of the construction packet to tell it that it is its child. Table 2 shows the fields of the child packet.

Table 2: Child Packet Fields
ChildId   The node Id of the child node
ZoneId    The zone Id of the child node

After tree construction, every node stores the fields shown in Table 3.

Table 3: Fields at each node after Tree Construction
TreeId      The Id of the tree the node belongs to
Level       The level of the node in the tree
ParentId    Node Id of the parent node
ChildsList  List of Ids of the children of the node

Figure 4 shows the trees in a zone with two border nodes.

Figure 4: Trees Constructed in a Zone

Figure 5 shows the tree construction algorithm. Each Border node broadcasts a tree construction packet with TreeId equal to its Node Id and Level equal to zero, as it is the root of the tree. When a node receives the construction packet, if the node is already in a tree it ignores the packet. If it has not joined yet, it joins the tree, sets its level to the level from the packet plus one, and sets its TreeId to the packet's tree Id. The node then also broadcasts construction packets. The node must send a Child packet to its parent to tell it that it is its child.

Figure 5: Tree Construction Algorithm

4.2 Tree Data Aggregation Algorithm
After the trees are constructed, when an event occurs near some nodes, those nodes must send their data to the base station or to any defined node in the WSN. Figure 6 shows how data aggregation occurs. Nodes that have sensed data must send it to their parents in the aggregation tree; these packets contain the data the node wants to send. When a node receives a data packet from one of its children, it waits for a specific time t (a parameter of our protocol) to receive more packets from other children. It then adds its own data and applies an aggregation algorithm on the collected data, such as filtering, addition, or subtraction. The node then sends the resulting data as one packet to its parent, and the parent node performs the same task until the data packet reaches the root node, which is a Border Node. When a Border node receives data from its children, it sends the data using ZHRP routing to the destination node.

Figure 6: Node activities when an event occurs

5. ZHRP vs ZHRP with Aggregation: Routing Scenario
Figure 7 describes what happens when an event occurs at zone Z1 and how the nodes that detect that same event send their packets to the destination base station B using ZHRP routing (Table 4).

Figure 7: Routing Scenario with ZHRP

Table 4: ZHRP routing path of the same event
Node  Path                                                      Sent Packets
n1    n1,n2,b1,b2,b3,b4,b10,b11,n10,n11,b12,b13,b14,n12,n13,B   15
n3    n3,n4,n7,b4,b10,b11,n10,n11,b12,b13,b14,n12,n13,B         13
n6    n6,b6,b7,b8,b10,b11,n10,n11,b12,b13,b14,n12,n13,B         13
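The construction and aggregation procedures of Sections 4.1 and 4.2 amount to a breadth-first flood from each Border node followed by a bottom-up merge. The following is a minimal sketch, not the authors' implementation: the topology, node names, and the use of `sum` as the aggregation function are made up for the example.

```python
from collections import deque

def build_tree(neighbors, roots):
    """BFS flood from each Border (root) node: the first construction
    packet a node hears is the tree it joins (level = sender's + 1)."""
    state = {r: {"treeId": r, "level": 0, "parent": None} for r in roots}
    queue = deque(roots)
    while queue:
        sender = queue.popleft()
        for n in neighbors[sender]:
            if n in state:          # already joined a tree: ignore the packet
                continue
            state[n] = {"treeId": state[sender]["treeId"],
                        "level": state[sender]["level"] + 1,
                        "parent": sender}   # the "child packet" back to sender
            queue.append(n)
    return state

def aggregate(state, readings, combine=sum):
    """Push each reading up to its tree root, combining values that meet
    at a node (stands in for the wait-time-t merge step)."""
    at_root = {}
    for node, value in readings.items():
        while state[node]["parent"] is not None:
            node = state[node]["parent"]
        at_root.setdefault(node, []).append(value)
    return {root: combine(vals) for root, vals in at_root.items()}
```

With a zone shaped like the Table 5 scenario (tree root b2), readings sensed at n1, n3, and n6 merge into a single value delivered at b2, which would then be routed on with ZHRP.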
Figure 8: ZHRP routing scenario with aggregation

Figure 8 describes what happens when an event occurs at zone Z1 and how the nodes send their packets to the destination base station B using ZHRP routing with the aggregation tree.

Table 5: ZHRP routing path with aggregation for the same event
Node  Path         Sent Packets
n1    n1,n2,b2     2
n3    n3,n4,n5,b2  3
n6    n6,n7,n5,b2  3

When an event happens in Z1, nodes n1, n3, and n6 receive the event. Each node must send a data packet to its parent in the aggregation tree; the tree root is b2. The packets travel as shown in Table 5. From b2 to B, normal ZHRP routing occurs, giving 12 more sent packets. Therefore, the total number of packets is 20, which is less than with ZHRP routing alone. Hence, energy consumption is decreased and the network lifetime is increased.

Note that when a node n in a source zone Z1 wants to send a data packet to a destination d in a destination zone Z2, node n first sends it to its parent in the aggregation tree. The parent then sends it, together with data from its other children, to its own parent, and so on, until the data packet reaches the root of the tree, which is a Border node in the source zone Z1. When the Border node receives the data packet, it forwards the packet toward the destination zone Z2 using its Inter-Zone Routing Table. When the packet is received by a Border node at the destination zone Z2, that node uses its Intra-Zone Routing Table to forward the received packet to the destination node d in Z2.

6. Simulation and Results
In this section, we show the implementation of all stages of the ZHRP protocol, the tree construction, and ZHRP with aggregation. The implementation has been done using a simulation framework called Omnet++ [19]. Omnet++ is not a simulator by itself; it is a discrete event network simulation framework. In other words, it provides the infrastructure and tools to build network simulations. A simulator based on Omnet++, the Castalia WSN simulator [20], is used to build the simulation. In this simulation, an ideal configuration was assumed: no interference, no interruptions, and no packet loss, as we are simulating at the network layer. The simulation takes place in a field of size 200m x 200m. The sensor transmission range is 15m.

6.1 Sent and received packets – Aggregation Tree Construction
During Aggregation Tree Construction, as shown in Figure 9 and Figure 10, the number of sent and received packets at each node is very small. This number does not increase when R and the number of zones change, because nodes communicate with their direct neighbors only.

Figure 9: Aggregation Tree Construction - Sent Packets

Figure 10: Aggregation Tree Construction - Received Packets

6.2 Sent packets – Routing
In this comparison, a variable number of events takes place over the WSN. Nodes that receive those events send their data to a destination node in another zone; we are therefore simulating the routing of sensed data. The total number of events varies between 300, 400, 500, and 600. The number of nodes is 400. The number of zones changes between 10, 20, and 30.

Figure 11: Number of zones Z = 10
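The totals in Tables 4 and 5 (and the 41-to-20 reduction cited later in the conclusion) follow directly from counting hops along each path. A small check, with the 12-hop b2-to-B leg taken from the text:

```python
def sent_packets(path):
    """Each hop along the path is one transmitted packet."""
    return len(path) - 1

# ZHRP alone: each source routes its own packet all the way to B (Table 4).
zhrp = {
    "n1": "n1,n2,b1,b2,b3,b4,b10,b11,n10,n11,b12,b13,b14,n12,n13,B",
    "n3": "n3,n4,n7,b4,b10,b11,n10,n11,b12,b13,b14,n12,n13,B",
    "n6": "n6,b6,b7,b8,b10,b11,n10,n11,b12,b13,b14,n12,n13,B",
}
total_zhrp = sum(sent_packets(p.split(",")) for p in zhrp.values())

# ZHRP + aggregation: short paths to the tree root b2 (Table 5),
# plus one aggregated packet routed from b2 to B (12 hops in the scenario).
tree_paths = {"n1": "n1,n2,b2", "n3": "n3,n4,n5,b2", "n6": "n6,n7,n5,b2"}
root_to_b = 12
total_aggregated = sum(sent_packets(p.split(","))
                       for p in tree_paths.values()) + root_to_b
```

This reproduces 41 sent packets for plain ZHRP versus 20 with aggregation for the same event.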

Figure 12: Number of zones Z = 20

Figure 11, Figure 12, and Figure 13 show that when events happen, the number of packets sent in ZHRP is much higher than in ZHRP with tree aggregation. The results show that the number of sent packets decreased from 15994 to 3480, i.e., by about 80%. Therefore, the integration is very efficient: it decreases the energy consumption, and the WSN lifetime increases.

Figure 13: Number of zones Z = 30

6.3 Received packets – Routing

Figure 14: Number of zones Z = 10

Figure 14, Figure 15, and Figure 16 show that when events happen, the number of packets received in ZHRP is much higher than in ZHRP integrated with tree aggregation. The results show that the number of received packets decreased from 701 to 201, i.e., by about 75%. Therefore, the integration is very efficient: it decreases the energy consumption, and the WSN lifetime increases.

Figure 15: Number of zones Z = 20

Figure 16: Number of zones Z = 30

6.4 Energy Consumption Results
Energy consumption has been calculated based on the characteristics of the MICA2 [21] sensor node shown in Table 6. We calculated the battery consumption during Aggregation Tree Construction, and the battery consumption during the routing scenario of ZHRP compared with ZHRP with aggregation. We vary the number of nodes between 200, 300, and 400, and the number of zones between 5, 15, and 25; for R we used 5 and 25.

Table 6: Characteristics of MICA2 Sensor
CPU Consumption           8 mA
Receiving Consumption     10 mA
Transmitting Consumption  27 mA
Initial Energy            2900 mAh
Voltage                   3 V
Data Transfer Rate        38400 bits/s
Communication Range       500 ft

Energy consumption for the Aggregation Tree Construction is shown in Figure 17 for the sent packets and in Figure 18 for the received packets. It is clear that the energy consumption decreases when the number of nodes increases, because the density of the network (number of nodes/m2) increases, so the construction algorithm demands less energy.

Figure 17: Aggregation Tree Construction - Sent Packets
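A rough way to turn the Table 6 figures into battery percentages is to charge each packet its airtime at the radio's current draw: a radio drawing I mA for t hours consumes I·t mAh of the 2900 mAh battery. This is a simplified model for illustration only, not the paper's energy accounting; in particular, the 256-bit packet size is an assumption of this sketch.

```python
# Constants taken from Table 6 (MICA2): transmit/receive current draw,
# data rate, and initial battery capacity.
TX_MA, RX_MA, RATE_BPS, BATTERY_MAH = 27.0, 10.0, 38400.0, 2900.0

def packet_mah(n_bits, draw_ma):
    """mAh consumed to move n_bits at the MICA2 data rate."""
    seconds = n_bits / RATE_BPS
    return draw_ma * seconds / 3600.0

def battery_pct(sent, received, packet_bits=256):
    """Percent of the battery spent on a given mix of sent and
    received packets (packet size is an illustrative assumption)."""
    used = (sent * packet_mah(packet_bits, TX_MA)
            + received * packet_mah(packet_bits, RX_MA))
    return 100.0 * used / BATTERY_MAH
```

Under such a model, cutting the sent and received packet counts by 80% and 75% cuts the routing-related battery percentage by the same proportions, which is consistent in spirit with the reductions reported above.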
Figure 18: Aggregation Tree Construction - Received Packets

Battery consumption in ZHRP with Aggregation Tree routing is shown in Figure 19 for the sent packets and in Figure 20 for the received packets.

Figure 19: ZHRP with Aggregation Tree Sent Packets

Figure 20: ZHRP with Aggregation Tree Received Packets

7. Conclusion
The introduction of data aggregation benefits both energy saving and the collection of accurate information. The energy consumed in transmitting data is much greater than that consumed in processing data in sensor networks. Therefore, using the nodes' local computing and storage capacity, data aggregation operations remove large quantities of redundant information to minimize the amount of transmission and save energy. As shown above, the addition of aggregation to ZHRP has reduced the number of sent and received packets on a route from source to destination: it decreased the number of sent and received packets from 41 to 20. The results prove the energy efficiency. The simulation also shows that the aggregation trees are constructed with a small number of packets; hence the addition of the aggregation tree does not consume much energy. During the routing-with-aggregation scenario, sent and received packets decrease by 80% and 75%, respectively. Hence, aggregation trees increase the WSN lifetime by a noticeable amount. The results clearly show that the battery consumption during ZHRP with Aggregation Tree construction is no more than 1% of the battery energy. For the routing scenario, the battery consumption percentage decreased (in the worst case) from 40% to 10% after adding aggregation to ZHRP.

8. References
[1] K. Beydoun and V. Felea, "Energy-efficient WSN infrastructure," in Collaborative Technologies and Systems (CTS 2008), International Symposium on, 2008, pp. 58–65.
[2] L. Zhu, Z. Yang, M. Li, and D. Liu, "An Efficient Data Aggregation Protocol Concentrated on Data Integrity in Wireless Sensor Networks," Int. J. Distrib. Sens. Netw., vol. 2013, pp. 1–9, 2013.
[3] A. Camilli, C. E. Cugnasca, A. M. Saraiva, A. R. Hirakawa, and P. L. P. Corrêa, "From wireless sensors to field mapping: Anatomy of an application for precision agriculture," Computers and Electronics in Agriculture, vol. 58, no. 1, Aug. 2007, pp. 25–36.
[4] L. Subramanian and R. Katz, "An architecture for building self-configurable systems," in Proceedings of IEEE/ACM Workshop on Mobile Ad Hoc Networking and Computing, Boston, MA, 2000.
[5] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan, "Energy-efficient communication protocol for wireless microsensor networks," in Proceedings of the 33rd Hawaii International Conference on System Sciences (HICSS '00), 2000, pp. 3005–3014.
[6] S. Lindsey and C. Raghavendra, "PEGASIS: Power-efficient gathering in sensor information systems," in IEEE Aerospace Conference Proceedings, vol. 3, 2002, pp. 1125–1130.
[7] T. Hamma, T. Katoh, B. B. Bista, and T. Takata, "An Efficient ZHLS Routing Protocol for Mobile Ad Hoc Networks," in 17th International Conference on Database and Expert Systems Applications, 2006, pp. 66–70.
[8] K. Beydoun and V. Felea, "WSN hierarchical routing protocol taxonomy," in Telecommunications (ICT), 2012 19th International Conference on, 2012, pp. 1–6.
[9] K. Beydoun, V. Felea, and H. Guyennet, "Wireless sensor network infrastructure: construction and evaluation," in Wireless and Mobile Communications (ICWMC '09), Fifth International Conference on, 2009, pp. 279–284.
[10] K. Beydoun and V. Felea, "Wireless sensor networks routing over zones," in Software, Telecommunications and Computer Networks (SoftCOM), 2010 International Conference on, 2010, pp. 402–406.
[11] K. Beydoun, "Conception d'un Protocole de Routage Hiérarchique pour les Réseaux de Capteurs," PhD report, Franche-Comté, 2009.
[12] D. Walden, "The Bellman-Ford Algorithm and 'Distributed Bellman-Ford'," [Online], 2009.
[13] W. Zhang, Y. Liu, S. K. Das, and P. De, "Secure data aggregation in wireless sensor networks: a watermark based authentication supportive approach," Pervasive and Mobile Computing, vol. 4, 2008, pp. 658–680.
[14] Huang and H. Leung, "An expectation maximization based interactive multiple model approach for collaborative driving," IEEE Trans. Intell. Transp. Syst., 2005, pp. 206–228.
[15] S. Croce, F. Marcelloni, and M. Vecchio, "Reducing power consumption in wireless sensor networks using a novel approach to data aggregation," Comput. J., vol. 51, 2007, pp. 227–239.
[16] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan, "Energy-efficient communication protocol for wireless microsensor networks," in Proceedings of the 33rd Hawaii

International Conference on System Sciences (HICSS '00), 2000, pp. 3005–3014.
[17] K. C. Huang, Y. S. Yen, and H. C. Chao, "Tree-clustered data gathering protocol (TCDGP) for wireless sensor networks," in Proceedings of Future Generation Communication and Networking (FGCN 2007), vol. 02, 2007, pp. 31–36.
[18] P. Mohanty and M. R. Kabat, "A hierarchical energy efficient reliable transport protocol for wireless sensor networks," Ain Shams Eng. J., vol. 5, 2014, pp. 1141–1155.
[19] Omnet++ User Manual, 2014. https://omnetpp.org/
[20] A. Boulis, "Castalia: Simulator for Wireless Sensor Networks and Body Area Networks," User Manual Online, 2009. https://castalia.forge.nicta.com.au
[21] Crossbow, MICA2 Data Sheet, [Online], 2009. http://www.xbow.com/products/Product_pdf_files/Wireless_pdf/MICA2_Datasheet.pdf
[22] W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan, "Energy efficient communication protocol for wireless microsensor networks," in Proceedings of the IEEE Hawaii International Conference on System Sciences, vol. 2, p. 10, 2000.
[23] H. Lin and Y. Chu, "A clustering technique for large multihop mobile wireless networks," in Vehicular Technology Conference Proceedings, Tokyo, Japan, vol. 2, 2000, pp. 1545–1549.
[24] O. Younis and S. Fahmy, "HEED: A hybrid, energy-efficient, distributed clustering approach for ad hoc sensor networks," IEEE Transactions on Mobile Computing, vol. 3, no. 4, 2004, pp. 366–379.
[25] M. Lehsaini, H. Guyennet, and M. Feham, "An Efficient Cluster-based Self-organization Algorithm for Wireless Sensor Networks," International Journal of Sensor Networks, Inderscience Publishers, vol. 6, no. 4, 2009.
[26] B. Chen, K. Jamieson, H. Balakrishnan, and R. Morris, "Span: An Energy-Efficient Coordination Algorithm for Topology Maintenance in Ad Hoc Wireless Networks," Wireless Networks, Springer, vol. 8, 2002, pp. 481–494.
[27] S. Lindsey and C. S. Raghavendra, "PEGASIS: Power-efficient gathering in sensor information systems," in IEEE Aerospace Conference Proceedings, vol. 3, 2002, pp. 1125–1130.
[28] S. Jung, Y. Han, and T. Chung, "The Concentric Clustering Scheme for Efficient Energy Consumption in the PEGASIS," in The 9th International Conference on Advanced Communication Technology, 2007, pp. 260–265.
[29] Y. Yang, W. Hui-Hai, and C. Hsiao-Hwa, "SHORT: Shortest Hop Routing Tree for Wireless Sensor Networks," in IEEE International Conference on Communications, vol. 2, 2006, pp. 3450–3454.


Live Forensics on RouterOS using API Services to Investigate Network Attacks

Muhammad Itqan Mazdadi, Department of Informatics Engineering, Islamic University of Indonesia, Yogyakarta, Indonesia, [email protected]
Imam Riadi, Department of Information System, Ahmad Dahlan University, Yogyakarta, Indonesia, [email protected]
Ahmad Luthfi, Department of Informatics Engineering, Islamic University of Indonesia, Yogyakarta, Indonesia, [email protected]

Abstract— Network forensics is complicated and worth studying. One of the interesting parts of a network is the router, which manages the connections for all logical network activity. In network forensics, we need traffic logs to analyze the activity of any computer connected to the network, in order to know what attackers did. On the other hand, not all information can be obtained from traffic logs if the network did not record sniffed traffic. In such cases, other resources are needed, such as information held by the router itself. On a router running RouterOS on Mikrotik devices, some data can be obtained remotely using the API. The purpose of this paper is to explore how to perform forensics on RouterOS-based Mikrotik devices and to develop a remote application that extracts router data using the API services. As a result, the acquisition process could obtain valuable data from the router as digital evidence to explore information about network attack activity.

Keywords- Network; Router; Live Forensics; API; Logs

I. INTRODUCTION
Network forensics is the part of digital forensics that focuses on monitoring and analysis of data traffic on the network. The type of data being handled is dynamic, in contrast to the static data handled in traditional digital forensics [1][2]. With the increasing presence of digital devices, information storage, and network traffic, cyber forensics faces a growing number of increasingly complex cases.

Digital evidence is usually taken from network traffic logs derived from sniffing or monitoring activity [2][3]. In addition to the actual network traffic logs, we can also get valuable information from the router device itself. Information that may be found on the router includes admin logging activity, the list of client IP addresses, MAC addresses, the network configuration, the firewall configuration, etc.

An API (Application Programming Interface) is a set of commands, functions, and protocols that programmers can use when building software for a particular operating system. An API allows programmers to use standard functions to interact with other operating systems. An API is one method of abstraction, usually (but not always) between low-level and high-level software. The RouterOS Mikrotik API was introduced and has been in use since version 3.0 [4][5].

Against this background, this study gathers information and conducts an analysis of the digital evidence contained in RouterOS, using the API (Application Programming Interface) as a tool to obtain information for network forensics.

II. RELATED WORKS
Several previous studies have been done on digital forensics. Log management systems have been developed for several years, such as Kiwi Syslog, Snare BackLog, SpectorSoft Server Manager, ManageEngine, and Splunk log management [6]. Log management systems help forensic investigators analyze and determine an approach to detect network attacks [7]. Notable research on network forensics includes the development of an ontology method for intelligent network forensics analysis [8].

Some research on router forensics has been done on several router devices, such as Cisco, TP-Link, and Ubiquiti. Most studies of router forensics include DHCP handling to determine the IP address of a client computer, extracted from device memory [9]. Beyond physical devices, virtual router models have also provided information for digital forensics [10].

Work on the acquisition of data from household and small-business wireless routers also provides an overview of how data is retrieved from a router. In addition, the authors mapped the correlations of NAT TCP flows on private wireless networks against TCP flows to the internet, as well as the mechanism relating logged IPs and TCP ports [11].

III. BASIC THEORY
A. Network Forensics
Network forensics is a forensic field that focuses on the area of networks and associated devices. It is an attempt to find attacker information and potential evidence after an attack or incident. There is a variety of attacks, including probing, DoS, user to root (U2R), and remote to


local. Network forensics is the process of capturing, annotating, and analyzing network activity in order to find digital evidence of an attack or crime committed using a computer network.

Digital evidence on the network can be identified from recognized attack patterns and from deviations from normal behavior. Network forensics covers a variety of activities and analysis techniques, such as analysis of the processes in an IDS, analysis of network traffic, and analysis of the network devices themselves; all are considered part of network forensics [2].

B. Live Forensics Method
Live forensics is a method used while the system is in a running state, because the data to be acquired would likely be lost if the system were turned off. Live forensics is usually applied in cases involving volatile memory. Volatile data is data stored in temporary media such as RAM, where it is very easily lost. The volatile data acquisition process should be carried out as soon as possible after the incident. The live forensics method is required in cases such as the acquisition of server logs and computer RAM.

One of the problems with live forensics is the modification or contamination of data, because the acquisition process is performed on the running system itself. When taking logs from a server, for example, the server also records the acquisition activity in its own logs. Forensic examiners must therefore be competent and understand the implications of their actions [12].

C. Mikrotik RouterOS
Mikrotik is a Linux-based operating system for a computer that functions as a router [12]. It is designed to manage networks from small scale, such as a home, to large scale, such as an office. Mikrotik was established in 1995 and was originally intended for Internet service companies (Internet Service Providers, ISPs). Currently, MikroTik provides services to many wireless ISPs for Internet access in many countries around the world, and it is also very popular in Indonesia [13].

Mikrotik on standard Personal Computer (PC) hardware is known for its stability, quality control, and flexibility for different types of data packets and their handling (routing). Mikrotik as a computer-based router brings many benefits to an ISP that wants to run multiple applications, from the lightest to the most advanced. In addition to routing, Mikrotik can be used for access management, bandwidth control, firewalling, hotspot systems, Virtual Private Network servers, and much more [13].

D. API (Application Programming Interface)
An API (Application Programming Interface) is a set of commands, functions, and protocols that programmers can use when building software for a particular operating system. It allows programmers to use standard functions to interact with other operating systems [4][10][14]. The concept of the API model is shown in Figure 1.

Figure 1. Use of API

One of the benefits of an API is that it makes it easy for developers to create applications that communicate with a device through a wide variety of programming languages. RouterOS provides an API service that can be used to connect custom software to the router device, so developers can create applications to suit their own needs. The API services on RouterOS can be used from programming languages such as Java, PHP, Perl, C++, etc. [4][5].

IV. SYSTEM REQUIREMENTS
To support the experiments in this research, the following hardware and software were used:
• Mikrotik RB750 with RouterOS version 6
• Access Point TP-Link MR3020
• ADSL Modem as the Internet source
• Laptops, a PC, and a smartphone as clients on the network
• Java SDK
• Netbeans for software development

The network topology implemented in this research is illustrated in Figure 2.

Figure 2. Network Topology (ADSL modem at 192.168.1.1 connected to the Mikrotik router at 192.168.1.2, with client subnets 192.168.2.0/24 and 192.168.3.0/24 behind a Hub/Switch and an Access Point)

This topology uses a Mikrotik RB750 router to distribute internet access from the ADSL modem to the clients through a switch/hub and an access point. The switch/hub is a bridge connecting PCs to the router over wired LAN ports. The access point connects laptops and smartphones over wireless connectivity.

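Custom clients like the one developed in the next section talk to RouterOS through its API service, which frames each command "word" with a variable-length length prefix and groups words into sentences terminated by an empty word. The following is a minimal Python sketch of that documented wire encoding (this research used Java; the sketch covers only the encoding step, not the socket handling or login handshake):

```python
def encode_length(n: int) -> bytes:
    """Variable-length length prefix used by the RouterOS API protocol."""
    if n < 0x80:
        return n.to_bytes(1, "big")
    if n < 0x4000:
        return (n | 0x8000).to_bytes(2, "big")
    if n < 0x200000:
        return (n | 0xC00000).to_bytes(3, "big")
    if n < 0x10000000:
        return (n | 0xE0000000).to_bytes(4, "big")
    return b"\xF0" + n.to_bytes(4, "big")

def encode_sentence(*words: str) -> bytes:
    """A sentence is a run of length-prefixed words ended by an empty word."""
    out = b"".join(encode_length(len(w.encode())) + w.encode()
                   for w in words)
    return out + b"\x00"   # zero-length word terminates the sentence
```

For example, `encode_sentence("/login", "=name=admin", "=password=x")` yields the byte string a client would write to the router's API TCP port before reading the reply sentences back with the inverse decoding.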

V. ACQUISITION TOOL DEVELOPMENT
Referring to the analysis of digital evidence on common routers in related research, we determined what information needs to be extracted from RouterOS. The information that should be extractable is log activity, ARP, IP addresses, DHCP leases, RouterBoard info, and DNS cache.

In RouterOS version 6, the API service is automatically active in the default configuration. The API allows the software that we develop to obtain data from the router over a communication protocol on port 8728 [4]. The workflow of the developed application is shown in Figure 3.

Figure 3. Process Workflow (a Java application environment communicating with the router through the API services)

The software is built with one simple form as the main menu. To perform data extraction, the investigator inputs the IP address of the router, a username, and a password. The tables on the form then show the extracted data, categorized by a tab bar for each field. A sample result of data acquisition is shown in Figure 4.

Figure 4. Sample Extraction Process

The extracted data is shown in separate tabs based on the information categories: Log Activity, IP Address List, ARP, DHCP Leases, RouterBoard Info, Users, and DNS Cache.

For the purposes of analysis, an export feature was added that allows the extracted data to be exported as an .xlsx spreadsheet file. This .xlsx file can be used as digital evidence for analysis. A sample exported .xlsx file is shown in Figure 5.

Figure 5. Sample Exported Data to Excel

VI. TEST AND RESULT
A. Attack Simulation Test
In the simulation stage, we perform an attack on the router using the hydra tool. The purpose of this attack is to leave footprints of attack activity on the router. Hydra is a tool for attacking network services such as the FTP service. Using the "dictionary attack" method, hydra attempts logins based on a dictionary of usernames and passwords. Hydra makes login requests until a login succeeds, or fails if no username and password in the dictionary list match. Hydra is used from the CLI (command-line interface), as shown in Figure 6.

Figure 6. Hydra Attack

In this case, it attacks the router at address 192.168.1.1 via the FTP service. The result is that the username "admin" and the password "qazwsx123" were found.

B. Acquisition & Analysis
The challenge of router forensics is the volatile data stored in memory, which is destroyed after a reboot or when the router is shut down. This condition is the reason why the live forensics method should be applied in this process: the acquisition should be performed as soon as possible after the incident [15]. The acquisition is initiated from a computer connected to the network as a client.

Using the application that has been developed, we extract the data from the router. A sample result of the acquisition is shown in Table 1.


TABLE 1. SAMPLE DATA OF LOG ACTIVITY

Time      Topic                  Message
16:07:31  system,error,critical  login failure for user pengelola from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user pengelola from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user pengelola from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user sa from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user sa from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user root from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user sa from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user root from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user administrator from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user administrator from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user root from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user administrator from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user administrator from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user pengelola from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user sa from 192.168.1.246 via ftp
16:07:31  system,error,critical  login failure for user root from 192.168.1.246 via ftp
16:07:32  system,error,critical  login failure for user power from 192.168.1.246 via ftp
16:07:32  system,error,critical  login failure for user power from 192.168.1.246 via ftp
16:07:32  system,error,critical  login failure for user power from 192.168.1.246 via ftp
16:07:32  system,error,critical  login failure for user kasir from 192.168.1.246 via ftp
16:07:32  system,error,critical  login failure for user power from 192.168.1.246 via ftp
16:07:33  system,error,critical  login failure for user admin from 192.168.1.246 via ftp
16:07:34  system,info            account user admin logged out from 192.168.1.246 via ftp
16:08:46  system,info            account user admin logged in from 192.168.1.246 via telnet
16:11:26  system,info            user jebol added by admin
16:12:51  system,info            simple queue changed by admin
16:13:29  system,info            account user investigator logged in from 192.168.1.243 via api
Note: the highlighted data is colored red.

Observation starts with the Log Activity in Table 1, which shows that IP address 192.168.1.246 produced 38 failed login requests between 16:07:30 and 16:07:32. Making 38 requests in 2 seconds is impossible as human behavior, so 192.168.1.246 becomes the suspected address. At 16:08:46 we found that 192.168.1.246 successfully logged in via telnet, which means it gained full access to the router. This action is followed by adding a new user named "jebol" to the router at 16:11:26.

To validate the attack activity, we collected the network log produced by sniffing, i.e., by recording the traffic activity on the network. Using Wireshark, we observed a .pcap file to explore the FTP service communication activity, as shown in Figure 8. As expected, the same FTP activity from address 192.168.1.246 was found there.

Figure 8. Observation of the Network Traffic Log with Wireshark

The search continued with the subsequent data by inspecting the ARP list, which shows which IP address is owned by which MAC address on the network. Observation of Table 2 shows that the MAC address of 192.168.1.246 is 00:0C:29:48:0B:0A.

TABLE 2. ARP LIST

IP Address     Mac Address        Interface
192.168.1.243  00:26:6C:98:CE:C3  ether2
172.16.150.1   00:1E:67:CF:1A:B1  ether1
192.168.1.242  CC:07:AB:8F:06:9D  ether2
192.168.1.254  14:F6:5A:67:CF:59  ether2
192.168.1.246  00:0C:29:48:0B:0A  ether2
192.168.1.251  74:2F:68:9D:26:35  ether2
192.168.1.249  00:21:5D:4C:D7:D0  ether2
192.168.1.250  60:D9:A0:64:36:2C  ether2
The analysis process should be able to link information across the different fields, complementing one piece of information with another to explain an event or attack activity. The stages of the data-field analysis are shown in Figure 7:

Figure 7. Data Stages Analysis: Activity Log (find the suspected activity) → ARP List (IP address, Mac address) → DHCP Server Leases (IP address, Mac address, hostname)

In addition, to learn the hostname of the attacker's computer and validate the address, the DHCP Leases field needs to be observed. The DHCP Leases data is shown in Table 3:

TABLE 3. DHCP LEASES

IP Address     Mac Address        Host Name
192.168.1.245  58:A2:B5:82:5D:08  android-d803df206d5dfd68
192.168.1.244  54:27:1E:B8:98:EF  Falcon-00
192.168.1.241  74:29:AF:EB:17:CF  POLICE
192.168.1.242  CC:07:AB:8F:06:9D  android-ab0a5c691d5a4e06
192.168.1.246  00:0C:29:48:0B:0A  HACKER
192.168.1.248  74:E5:43:6E:4B:6D  Billy-PC
192.168.1.249  00:21:5D:4C:D7:D0  Puniyas
192.168.1.243  00:26:6C:98:CE:C3  mazda
Note: the highlighted data is colored red.
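The three-stage analysis of Figure 7 can be sketched as a small script: flag the brute-forcing IP in the Activity Log, then resolve it through the ARP list and the DHCP leases. The sample records and the failure threshold below are illustrative values drawn from Tables 1-3, not the paper's actual tooling.

```python
from collections import Counter

def find_suspects(log_entries, threshold=10):
    """Count failed logins per source IP and flag IPs at or above the
    threshold (38 failures in 2 seconds, as in the case, is far beyond
    plausible human speed)."""
    failures = Counter(
        e["ip"] for e in log_entries if "login failure" in e["message"]
    )
    return [ip for ip, count in failures.items() if count >= threshold]

def identify(ip, arp_list, dhcp_leases):
    """Resolve a suspect IP to its MAC (ARP list) and hostname (DHCP leases)."""
    mac = next((a["mac"] for a in arp_list if a["ip"] == ip), None)
    host = next((d["host"] for d in dhcp_leases if d["ip"] == ip), None)
    return {"ip": ip, "mac": mac, "host": host}

# Illustrative records modeled on Tables 1-3:
log = [{"ip": "192.168.1.246", "message": "login failure for user sa"}] * 38
arp = [{"ip": "192.168.1.246", "mac": "00:0C:29:48:0B:0A"}]
leases = [{"ip": "192.168.1.246", "mac": "00:0C:29:48:0B:0A", "host": "HACKER"}]

for suspect in find_suspects(log):
    print(identify(suspect, arp, leases))
```

Joining the three fields on the IP address is exactly the cross-referencing an investigator performs manually; automating it makes the correlation repeatable over large acquisitions.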


As the result of the analysis, it reports that the attack came from the PC named "HACKER", which has MAC address 00:0C:29:48:0B:0A and IP address 192.168.1.246.

C. Reboot Test
In order to understand the characteristics of digital evidence on RouterOS, we rebooted the router to test which information still exists and which is lost. After the reboot, we compared the Activity Log, ARP, DHCP Leases, Interface, IP Address, Users, RouterBoard Info, and DNS Cache information before and after the reboot. As the result, we found that some information still exists but most of it is lost, as explained in Table 4.

TABLE 4. CHARACTERISTICS OF DIGITAL EVIDENCE ON ROUTEROS

No  Information Field  Behavior after reboot
1   Activity Log       Lost / volatile
2   ARP                Lost / volatile
3   DHCP Leases        Lost / volatile
4   Interface          Exists / non-volatile
5   IP Address         Exists / non-volatile
6   Users              Exists / non-volatile
7   RouterBoard Info   Exists / non-volatile
8   DNS Cache          Lost / volatile
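Table 4 implies an acquisition order for live forensics: volatile fields should be captured first, since a reboot destroys them. A small sketch of that prioritization, with field names and volatility flags taken directly from Table 4:

```python
# (field, survives_reboot) pairs per Table 4.
FIELDS = [
    ("Activity Log", False),
    ("ARP", False),
    ("DHCP Leases", False),
    ("Interface", True),
    ("IP Address", True),
    ("Users", True),
    ("Routerboard Info", True),
    ("DNS Cache", False),
]

def acquisition_order(fields):
    """Return field names with volatile (non-surviving) fields first,
    preserving the original order within each group."""
    volatile = [name for name, durable in fields if not durable]
    durable = [name for name, durable in fields if durable]
    return volatile + durable

print(acquisition_order(FIELDS))
```

Ordering the acquisition this way operationalizes the order-of-volatility principle: if the session is interrupted or the router reboots mid-acquisition, the irreplaceable fields have already been captured.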
router,” 2013 15th IEEE Int. Conf. Commun. Technol., pp. 334–339,
VII. CONCLUSIONS 2013.
[11] Z. Liu, Y. Chen, W. Yu, and X. Fu, “Generic network forensic data
Forensic against Router OS-based router devices can be acquisition from household and small business wireless routers,” 2010
done with methods of live forensics through the media API. IEEE Int. Symp. “A World Wireless, Mob. Multimed. Networks”,
Extraction of Router’s data with API gives us access to WoWMoM 2010 - Digit. Proc., 2010.
information on a wide variety of activities that are on the [12] A. M. Saliu, “Internet Authentication and Billing (Hotspot) System
Using MikroTik Router Operating System,” Int. J. Wirel. Commun.
network. Mob. Comput., vol. 1, no. 1, p. 51, 2013.
[13] I. Riadi, “Optimalisasi Keamanan Jaringan Menggunakan Pemfilteran
The developed application is success gained 9(nine) field Aplikasi Berbasis Mikrotik Pendahuluan Landasan Teori,” JUSI, Univ.
of data from the router, which are Log Activity, IP address Ahmad Dahlan Yogyakarta, vol. 1, no. 1, pp. 71–80, 2011.
List, ARP, DHCP Leases, RouterBoard Info, Users, and DNS [14] K. Liu and K. Xu, “Open service-aware mobile network API for 3rd
Cache. All of the data fields is used for observing network party control of network QoS,” Proc. - 2012 Int. Conf. Comput. Sci.
attack based on the scenario, but DNS Cache field role has no Electron. Eng. ICCSEE 2012, vol. 1, pp. 172–175, 2012.
[15] B. T. Fernalld and C. Lahaie, “Live System Forensics Patrick Leahy
correlation with FTP Services Attack case scenario. Analysis Center for Digital Investigation Champlain College.”
of the connected links between any variable field on
acquisition data can help digital forensic investigators to
determine an activity and source of Attacks from Networks.
In order to avoid lost of information, the acquisition
process of forensics should be perform as soon as possible
before the Router turned off or rebooted.
VIII. FUTURE WORKS
The future works of the research is necessary to extend the
knowledge of the current research. This paper is only
maintaining information from internal network attack. For the
future works, the exploration of forensics method to
investigate attacks from different scheme of topology is
necessary to obtain information of attack from external
network or from the internet source.


Evaluating Maintainability of Open Source Software: A Case Study


Feras Hanandeh1, Ahmad A. Saifan2, Mohammed Akour3, Noor Al-Hussein4, Khadijah Shatnawi5
1 The Hashemite University, Zarqa, Jordan, [email protected]
2,3 Software Engineering Department, Faculty of IT, Yarmouk University, Irbid, Jordan, [email protected], [email protected]
4,5 CIS Department, Faculty of IT, Yarmouk University, Irbid, Jordan, [email protected], [email protected]

Abstract
Maintainability is one of the most important quality attributes affecting software quality. Four factors affect the maintainability of software: analyzability, changeability, stability, and testability. Open source software (OSS) is developed through collaborative work done by volunteers around the world with different management styles, and open source code is updated and modified continuously from the first release. Therefore, there is a need to measure the quality, and specifically the maintainability, of such code. This paper discusses maintainability for three domains of open source software: education, business, and games. Moreover, it identifies the metrics that most directly affect software maintainability. Analysis of the results demonstrates that OSS in the education domain is the most maintainable code, and that the cl_stat metric (number of executable statements) has the highest degree of influence on the maintainability calculation in all three domains.

1. Introduction
Software maintenance is a primary phase of the software development life cycle. Several studies report that this phase is the most effort- and time-consuming. For example, the authors in [1] reported that the software maintenance phase takes around seventy percent of total resources and 40%-60% of the total software lifecycle effort. Having maintainable software decreases maintenance cost and effort. Software maintainability can be defined as "the degree to which an application is understood, repaired or enhanced" [2].
Four factors affect the maintainability of software: (1) analyzability, which measures the ability to identify a fault or failure within the software; (2) changeability, the capability to modify software products; (3) stability, the capability to avoid unexpected effects from changing the software product; and (4) testability, the capability to test and validate the modified software product [3].

There are software metrics which we can use to measure maintainability; software metrics serve as predictors to assess and predict software quality.
Although maintenance is a critical task, it is poorly managed due to inadequate measurement, so we need precise criteria for measuring software maintenance [5]. Finding a tool that provides accurate and relevant results is not an easy task, and it is a big issue. Moreover, some tools are dedicated to specific development languages, and other tools have an interface that requires an

extensive training [6]; indeed, some tools are only able to measure specific metrics. Moreover, they vary in how they evaluate systems of different sizes [7].
Many researchers depend on open source software because it is free in terms of licensing for industrial organizations. Moreover, open source software keeps being updated all the time by different developers. Therefore, there is a need to measure the quality of such code, and measuring the maintainability of open source software is needed [4].
Open source software (OSS) is developed through collaborative work done by volunteers around the world with different management styles. There are issues with OSS, including the lack of attention to user-interface design, which leads to less use of OSS, and the lack of documentation, which is a serious issue: many OSS projects are poorly documented since they carry no contractual responsibility [8]. OSS is sometimes more secure than proprietary software, since many-eyes review helps find vulnerabilities, but there are security drawbacks as well: attackers can also find vulnerabilities more easily [9]. There are also some usability problems in OSS, although they are not significant [10]. The quality of open source software therefore needs to be studied deeply, and maintainability is one of the major factors used to assess the quality of OSS.
Empirical studies indicate that OSS is more maintainable than proprietary software. The design-metrics literature, however, indicates that after a long period of maintenance OSS becomes harder to maintain, and deterioration in maintainability over time is expected for open source software [11].
Many studies have been conducted on evaluating the maintainability of OSS: some research evaluates whether the maintainability of OSS differs from that of closed software, other research shows the variance of maintainability measurements across different versions of the same OSS, and some research examines the impact of design metrics on maintainability.
In this paper, we answer the following questions: Does the domain of the OSS have a direct effect on maintainability? Which available metrics most directly affect the maintainability of software? Which tool can we trust to provide an accurate measurement of maintainability?
2. Background
2.1 Software Metrics and Tools
In this paper, the Chidamber and Kemerer (C&K) object-oriented metrics have been used. The metrics are described in [12]:
• WMC (Weighted Methods per Class): "Total complexity of all class methods".
• DIT (Depth of Inheritance Tree): "Defines the maximum length from a node to the root of the tree".
• NOC (Number Of Children): "Number of immediate subclasses".
• CBO (Coupling Between Objects): "Two classes are coupled when a method declared in one class uses a method or instance variable of the other class".
• RFC (Response For a Class): "Number of methods that can be invoked in response to a message sent to an object of the class".


• LCOM (Lack of Cohesion in Methods): "Number of different methods within a class that reference a given instance variable".
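Two of these metrics, DIT and NOC, can be computed directly from a class-to-superclass map. The tiny hierarchy below is a hypothetical illustration, not taken from the studied systems:

```python
def dit(cls, parent):
    """Depth of Inheritance Tree: number of edges from cls up to the root."""
    depth = 0
    while cls in parent:   # walk upward until a class with no recorded superclass
        cls = parent[cls]
        depth += 1
    return depth

def noc(cls, parent):
    """Number Of Children: count of immediate subclasses of cls."""
    return sum(1 for superclass in parent.values() if superclass == cls)

# Hypothetical hierarchy: Object <- Shape <- {Circle, Square}
parent = {"Shape": "Object", "Circle": "Shape", "Square": "Shape"}
print(dit("Circle", parent))  # 2
print(noc("Shape", parent))   # 2
```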
In this paper, two different tools have been used to conduct the experiment: Eclipse Metrics Plug-in 1.3.6 and the Understand tool. Eclipse is a Java-based open source environment used in computer programming that allows a software developer to create a customized integrated development environment (IDE) from plug-in components built by Eclipse members. It supports plug-ins that allow developers to develop and test code written in other languages. In this paper, we use the Eclipse Metrics Plug-in 1.3.6 to measure various maintainability metrics [7]. Understand for Java is an IDE that enables developers to understand, analyze, and maintain their source code. Understand allows developers to view their source code as packages, files, classes, and methods, and it is capable of analyzing projects with millions of lines of code written in different languages. Understand also provides charts that summarize the relationships between variables and procedures [7].

3. Literature Review
Some studies compare maintainability across different versions of software. In [13], Mukti Chauhan and Monika Sharma calculate maintainability through the MI equation for two OSS projects with several versions each, using McCabe cyclomatic complexity, Halstead volume (Hal.Vol), and the LOC metric. For JasperReports, the MI increased from version 1.0.0 to its highest value in version 2.0.0, then decreased in versions 4.0.0 and 5.0.0 to its minimum MI value. For Apache, they noticed that MI increased from version 1.5.3 to 1.6.4 and increased further to its highest value in version 1.8.0. The authors observed that MI increased in both projects while average cyclomatic complexity decreased, but Hal.Vol and LOC moved in different directions, so they concluded that the three metrics have a compound impact on the maintainability index.
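The MI equation these studies use combines exactly those three metrics. The widely cited three-metric form of the index is shown below; treating it as the variant used here is an assumption, since the paper does not print the exact equation:

```python
import math

def maintainability_index(halstead_volume, cyclomatic, loc):
    """Classic three-metric Maintainability Index (Oman & Hagemeister):
    MI = 171 - 5.2*ln(aveV) - 0.23*aveCC - 16.2*ln(aveLOC),
    with averages taken per module. Higher MI means easier to maintain."""
    return (171
            - 5.2 * math.log(halstead_volume)
            - 0.23 * cyclomatic
            - 16.2 * math.log(loc))

# Illustrative values: Halstead volume 100, cyclomatic complexity 5, 200 LOC.
print(round(maintainability_index(100.0, 5, 200), 1))  # roughly 60 on the 171-point scale
```

The compound behavior the authors observed falls out of the formula: a drop in cyclomatic complexity raises MI, but a simultaneous growth in Halstead volume or LOC can pull it back down.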
Ruchika and Anuradha [14] measure maintainability for two versions of five database-intensive software systems. They propose two metrics useful for predicting maintainability in database-intensive applications: NODBC (number of database connections) and CCR (code-to-comment ratio). The authors calculate the values of MI, CC, DIT, CBO, and LOC on two versions of each application (original and modified). They define change() as the number of lines added, deleted, or modified in the source code and calculate the value of change for the two versions: the actual value of change (AV) and the predicted value (PV) obtained using FFNN modeling. They then analyze the correlation between NODBC and CCR and the value of change, validating their proposed metrics.
Much of the literature computes maintainability based on the MI index. Similar to [15], we examine three open source software systems. Anita Ganpati [15] examines the maintainability of four OSS projects through a formula based on average Halstead volume, average cyclomatic complexity per module, and average lines of code per module, and shows that the maintainability value differs from one application to another: Mozilla Firefox gained the highest maintainability value, whereas Apache gained the lowest. The author concludes that Mozilla is more maintainable than Apache.
Nahlah Najm [16] suggests a new equation for calculating the maintainability index based on the factored formula (MI), which consists of LOC, cyclomatic complexity, and Halstead volume. The suggested formula, derived from calculations on the old formula, depends on LOC only and produces results close to the old MI formula with less effort and time to calculate. Unlike [15, 16], we examine maintainability through its four sub-characteristic formulas.


Much research assesses maintainability using various models and techniques. Similar to [17], we depend on the C&K metrics to assess maintainability. The researchers in [17] reviewed OO metrics to construct a model that predicts OSS maintainability. They compared metrics applicable to OSS with those applicable to OO software and found that many OO metrics apply to OSS, but a few (like LOD and LCN) do not. They also tried to find a relationship between maintainability and coupling, cohesion, and complexity, and found that maintainability has an inverse relationship with complexity, which increases as LOC, WMC, NAM, DIT, CBO, REF, LCOM, CC, EHF, and NCB increase. Unlike [18], we focus on the main sub-characteristics of maintainability: analyzability, changeability, stability, and testability.
RimiSuini [18] compares the results of main software quality characteristics, such as portability, usability, modifiability, and maintainability, using various quality models such as McCall, Boehm, FURPS, Glip, etc. He then compares the sub-characteristics of maintainability, such as changeability, readability, stability, simplicity, modifiability, localizability, and compatibility, using the same models. He evaluates the maintainability of software products in order to reduce maintenance cost and effort, and finds that inappropriate handling of maintenance stems from a shortage of reliable maintainability measurement.
S. W. A. Rizvi and R. A. Khan [19] predict maintainability based on understandability and modifiability. They developed three models: an understandability model and a modifiability model, both quantified in terms of size and structural complexity, which are combined into a maintainability model using multiple linear regression. They conclude that the maintainability model shows both understandability and modifiability to be strongly related to maintainability. While we depend on the Java language in our experimental study, [20] assesses maintainability using projects from different programming languages; they focus on the relationship between internal quality attributes such as size, inheritance, coupling, and complexity and external quality attributes like maintainability.
Amjed Tahir and Rodina Ahmad [1] developed an AOP technique to capture all important aspects of system behavior at run time using dynamic metrics collected by adding extra code to the source code; they propose maintainability as a dynamic metric. To implement their framework, named maintainability dynamic metrics-AOP, they select the DCBO metric to measure dynamic coupling, collected during run time by injecting a piece of code into the source code. They test their framework using two simple applications, Address Book and Paint, and find that AOP is effective in capturing the maintainability dynamic metric and can derive the metric from the source code.
A wide range of metric suites is available for predicting maintainability, such as MOOD, QMOOD, and C&K. In [21], the authors aim to find a relationship between a number of metrics and maintainability using a large set of Java open source projects. They collect 15 design metrics from 148 Java open source projects, mainly from five different domains (software development tools, communication, system, Internet, and DB) selected by the highest number of downloads, using the MI equation and linear regression. The authors conclude that increasing the number of method parameters, the control-flow complexity of methods, the number of attributes, and the amount of polymorphism decreases maintainability, whereas increasing the number of methods, the number of child classes, and the depth of the inheritance tree increases maintainability; other factors, such as the number of classes, have no significant effect on maintainability. Finally, they find that the average control-flow complexity per method (OSAVG) is the most important maintainability factor.
Bahavna Katoch and Loverpreet Kaour Shah [22] present a comparison between the MOOD and QMOOD metric suites for measuring the maintainability of object-oriented designs. They found that the MOOD metrics are useful for object-oriented programs and give replicable measurements, whereas QMOOD shows the relation between design properties and quality attributes through an equation for measuring the maintainability quality factor. Furthermore, Jubair [23] developed a system based on the MOOD metrics to assess large Java programs. He identifies each MOOD metric and the formula used to calculate it, and presents the correlation between the MOOD metrics and the characteristics of the object-oriented model. Three input Java programs with different designs and complexities are tested and evaluated in an experimental study, using an equation to measure each metric together with a weight factor reflecting the importance of each metric. He concludes that the system succeeds in evaluating Java programs and in checking the quality of student programs. In our paper, on the other hand, we depend on the C&K metrics rather than MOOD and QMOOD.
Sanjay and Ajay [24] study the impact of object-oriented metrics on maintainability. They describe different types of OO metrics, focus on the C&K metrics, examine each C&K metric in an empirical study, and analyze its impact on maintainability; they conclude that lower C&K metric values produce more maintainable software. The authors of [25] depend on both the C&K and MOOD metrics. They take two open source software systems, Marf and Gipsy, and measure their maintainability using the MOOD and C&K metrics with numerous tools such as Logiscope, JDeodorant, Marcraft, and McCabe, listing the advantages and disadvantages of each tool. They find that maintainability is affected by four factors (analyzability, testability, changeability, and stability) and measure each factor in isolation using different tools with different classes taken from the two case studies. They then validate the measurement values produced by the tools using different test cases at different units, compare the results of the two case studies across different classes and levels, and give recommendations for improving maintainability. They also compare the C&K and MOOD metrics, proposing that C&K is better for design decisions and focuses on the class level, whereas MOOD measures the quality of the overall project; finally, they illustrate that internal quality factors have a strong impact on external quality factors. Similar to [25], we use the same formula to compute maintainability for three OSS projects (File Transfer and Chat, Faculty Book System, and Car Sale System) using two tools, Understand for Java and Eclipse.

The researchers in [26] performed an experiment using ten software tools: Analyst4j, CCCC, C&K Java Metrics, Dependency Finder, Eclipse Metrics Plug-in 1.3.6, Eclipse Metrics Plug-in 3.4, OOMeter, Understand for Java, and VizzAnalyzer. They selected nine class-level OO metrics (CBO, DIT, LCOM-CK, LCOM-HS, LOC, NOC, NOM, RFC, and WMC) and applied their experiment to three Java projects (Jaim, jTcGUI, and ProGuard). They found that some metrics, like NOC, give the same result with different tools, but other metrics (like CBO and LCOM) give varying results from one tool to another; they conclude that metric values differ across tools and are tool-dependent. Mamdouh et al. [30] addressed the test-suite effectiveness of several OSS projects as an indicator of open source software quality. Two common techniques are utilized in their study: error seeding and mutation testing. The results show how the effectiveness of a test suite can give an indication of the studied systems' quality.


4. Research Methodology
In our maintainability evaluation approach, we selected three open source software systems corresponding to three different domains: education, business, and games. The evaluation approach consists of the following steps:
1. Select the OSS applications.
2. Run the applications from the three domains in both tools, Understand and Eclipse.
3. Evaluate the four attributes that affect maintainability: changeability, analyzability, testability, and stability.
4. Evaluate the maintainability.
5. Assess maintainability for the different OSS domains.
6. Analyze the generated results.

4.1 Select OSS applications to measure maintainability

After searching many websites and reading many papers, we chose three Java OSS projects in three different domains: the Car Sale Management System project for the business domain, the File Transfer and Chat project for the game domain, and the Faculty Book System for education. The following gives a brief description of each open source software system.

File Transfer and Chat (Message Sending) is "a system [that] has been developed in Java 1.3 which is based on Object Oriented Methodology. There are several packages in Java, but mainly swing packages and networking has been utilized in developing this project. It works under two modules, namely Active and Passive. Only passive clients can receive files, but active clients can send and receive files as well. Any kind of files, including .fmx files, .exe files and more, can be sent using this system" [27].
Car Sales Management System is "a project written in java. It aims to make car sales more easier through searchable criteria, such as car model, car price, car specification, speed, average and many other factors" [28].
Faculty Book System is "a project written in JAVA [that] can be used as a major project by students. This project is used to keep record of faculty books and removed the manual work. There is a full record of all the faculty of college and school in this system and whenever any book is issued to any teacher; it is added to the faculty book system along with date of issue and return date" [29].

In a nutshell, these applications consist of almost the same number of files, packages, and lines of code, as shown in Table 1, which describes the three OSS projects:


Table 1: Description of the three selected open source software

Measurements              Business (Car Sale)  FileTransferAndChat  Faculty Book System
Files                     12                   11                   10
Packages                  107                  105                  69
Classes                   14                   15                   15
Lines of Code             1391                 1939                 1282
# of statements executed  622                  1296                 673
Comment lines             795                  377                  70

4.2 Determine the tools required to assess the maintainability of OSS

Several tools have been proposed to predict maintainability. We selected the Understand for Java and Eclipse Metrics Plug-in 1.3.6 tools because they offer all the metrics required to evaluate maintainability according to the formula we base our work on.

4.3 Run the open source software and evaluate the maintainability

During this phase, we run the OSS (the File Transfer and Chat, Car Sale System, and Faculty Book System software) in both the Understand and Eclipse tools, then collect the required metrics to compute maintainability according to the following formulas, taken from [22]:

Maintainability = Analyzability + Changeability + Stability + Testability, where
Analyzability = cl_wmc + cl_comf + in_bases + cu_cdused
Changeability = cl_stat + cl_func + cl_data
Stability = cl_data_publ + cu_cdusers + in_noc + cl_func_publ
Testability = cl_wmc + cl_func + cu_cdused

The following table explains the definition of each metric used in the formulas.
Table 2: Description of software metrics [22]

Metric        Definition
cl_wmc        Sum of the static complexities of the class methods (cyclomatic number of the functions)
cl_comf       Ratio between the number of lines of comments in the module and the total number of lines
in_bases      Number of classes from which the class inherits, directly or not
cu_cdused     The number of classes directly used by this class
cl_stat       Number of executable statements in all methods and initialization code of a class
cl_func       The number of methods declared in the class
cl_data       The number of attributes declared in the class
cl_data_publ  The number of attributes declared in the class public section
cu_cdusers    The number of classes that directly use this class
in_noc        The number of classes which directly inherit from the class
cl_func_publ  The number of methods declared in the class public section
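The sub-characteristic formulas translate mechanically into code. In the sketch below, the metric values are purely illustrative placeholders, not measurements taken from the paper's tables:

```python
def maintainability(m):
    """Maintainability as the sum of the four sub-characteristics,
    per the formulas above; m maps metric names to class-level values."""
    analyzability = m["cl_wmc"] + m["cl_comf"] + m["in_bases"] + m["cu_cdused"]
    changeability = m["cl_stat"] + m["cl_func"] + m["cl_data"]
    stability = m["cl_data_publ"] + m["cu_cdusers"] + m["in_noc"] + m["cl_func_publ"]
    testability = m["cl_wmc"] + m["cl_func"] + m["cu_cdused"]
    return analyzability + changeability + stability + testability

# Illustrative class-level metric values (not from the paper's tables):
metrics = {
    "cl_wmc": 15, "cl_comf": 0.06, "in_bases": 1, "cu_cdused": 19,
    "cl_stat": 54, "cl_func": 6, "cl_data": 10,
    "cl_data_publ": 2, "cu_cdusers": 3, "in_noc": 0, "cl_func_publ": 4,
}
print(maintainability(metrics))
```

Applying this per class and averaging the metric values, as the study does, yields a project-level maintainability score that can be compared across the three domains.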

4.4 Assess the maintainability of the different OSS domains

According to the results of the previous step, we determine which domain's OSS is the most maintainable. Moreover, we can assess whether different tools predict the same value for the same input software, and which tool can be trusted more than the others. We then ask: what is the most predominant factor affecting the maintainability measurement, and what is the strongest predictor within each factor?

5. Experimental Result
For our experimental study, we chose three projects from three different domains, all of approximately the same size in terms of LOC. For each project, we calculate the maintainability of every class according to the formula above, then find the mean value of each metric over the classes, and from these values compute the maintainability of the overall project.
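The per-project aggregation described above (mean of each metric over the classes, then the maintainability formula applied to the means) can be sketched like this; the two class records are hypothetical.

```python
# Project-level maintainability: average each metric across classes, then
# apply the factor formulas of [22] to the mean values.

FORMULA = {
    "analyzability": ["cl_wmc", "cl_comf", "in_bases", "cu_cdused"],
    "changeability": ["cl_stat", "cl_func", "cl_data"],
    "stability": ["cl_data_publ", "cu_cdusers", "in_noc", "cl_func_publ"],
    "testability": ["cl_wmc", "cl_func", "cu_cdused"],
}

def project_maintainability(classes):
    # Mean value of every metric across the project's classes.
    names = classes[0].keys()
    mean = {n: sum(c[n] for c in classes) / len(classes) for n in names}
    # Maintainability = sum of the four factor values on the mean metrics.
    return sum(sum(mean[m] for m in ms) for ms in FORMULA.values())

# Two hypothetical classes (values are illustrative, not the paper's data):
classes = [
    {"cl_wmc": 27, "cl_comf": 0.08, "in_bases": 1, "cu_cdused": 39, "cl_stat": 85,
     "cl_func": 10, "cl_data": 18, "cl_data_publ": 2, "cu_cdusers": 23,
     "in_noc": 0, "cl_func_publ": 8},
    {"cl_wmc": 2, "cl_comf": 0.13, "in_bases": 1, "cu_cdused": 15, "cl_stat": 21,
     "cl_func": 1, "cl_data": 2, "cl_data_publ": 0, "cu_cdusers": 8,
     "in_noc": 0, "cl_func_publ": 0},
]

print(round(project_maintainability(classes), 3))
```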
5.1 Evaluating the maintainability using Understand tool

5.1.1 File transfer and chat system:

After analyzing the File Transfer and Chat System using the Understand tool, Figure 1 presents the maintainability factor values for each class in this domain.


Figure 1
Figure 1 shows that the changeability factor takes the highest value among all factors, testability is second, then analyzability, while stability varies relative to the other factors.
Based on the results presented in Figure 2, we choose a fair and a poor class and calculate the min, mean and max values for each metric included in the formula.

Figure 2

Table 3 shows the min, mean and max metric values for the Chat.java class, which we classify as a poor class.
Table 3 the metrics values for Chat.java Class
Class Name Chat.java

Metrics Value Min Mean Max


cl_wmc 27 2 15.41667 42
cl_comf ratio 0.075676 0.0098039 0.0627238 0.1315789
cd_cdused 39 2 19.416667 53
cl_stat 85 11 54.416667 162
cl_func 10 1 5.5 12
cl_data 18 2 10.416667 26
cl_func_pub 8 0 3.8181818 12
cu_cduser 23 2 10.545455 30

• The cl_wmc value is close to the max value; therefore, the complexity of the class is high.
• The cu_cdused value needs to be decreased in order to make the class easy to change.
• The high cl_stat indicates a larger class, which leads to a complex class.
• The number of cl_func should be decreased.
• The value of cl_func_publ is 8 out of 10 cl_func; this ratio is high, so we should replace public methods with protected and private methods and declare as public only the methods that must be accessible.
• The high cl_data value indicates high coupling, leading to a more complex system, and the high cl_data_publ value indicates low information hiding (low encapsulation).
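The min/mean/max comparison used in these bullet points can be sketched as a small helper; the values are taken from Tables 3 and 4, while the 0.5 cut-off for flagging a metric is our own assumption, not from the paper.

```python
# Place a class's metric value within the [min, max] range observed across the
# project's classes, and flag it when it sits in the upper half of the range.

def assess(value, lo, hi):
    if hi == lo:
        return "flat"
    position = (value - lo) / (hi - lo)  # 0.0 at the min, 1.0 at the max
    return "high" if position >= 0.5 else "low"

# cl_wmc of Chat.java: 27 with min 2 and max 42 (Table 3) -> flagged as high
print(assess(27, 2, 42))   # -> high
# cl_wmc of clientform.java: 2 with the same range (Table 4) -> low
print(assess(2, 2, 42))    # -> low
```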

In contrast, Table 4 presents clientform.java as a good class in the File Transfer and Chat System.
Table 4 the metrics values for clientform.java Class
Class Name clientform.java
Metrics Value Min Mean Max
cl_wmc 2 2 15.416667 42
cl_comf ratio 0.129032 0.0098039 0.0627238 0.1315789
cd_cdused 15 2 19.416667 53
cl_stat 21 11 54.416667 162
cl_func 1 1 5.5 12
cl_data 2 2 10.416667 26
cl_func_pub 0 0 3.8181818 12
cu_cduser 8 2 10.545455 30

• cl_wmc indicates low complexity, which leads to high maintainability.
• cu_cdused is 15 out of 53, which indicates low coupling, leading to a better design and an easily maintained class.
• cl_stat has a low value, 21 out of 162, meaning this is a small class, which is better.
• There is only one method declared in this class, meaning low complexity.
• There is no public method declared in the class.

5.1.2 Faculty Book System:



After analyzing the Faculty Book System using the Understand tool, Figure 3 presents the maintainability factor values for each class in this domain.

Figure 3
Similar to the File Transfer and Chat System, the changeability factor takes the highest value among all factors, followed by testability, analyzability, and stability respectively.
Based on the results presented in Figure 4, we choose a fair and a poor class and calculate the min, mean and max values for each metric included in the formula.

Figure 4

Table 5 shows the min, mean and max values for the jfrmfacultybook.java class, which we consider a poor class.


Table 5 the metrics values for jfrmfacultybook.java Class


Class Name jfrmfacultybook.java
Metrics Value Min Mean Max
cl_wmc 44 2 32.888889 74
cl_comf ratio 0.0839506 0.0839506 0.1426509 0.264
cd_cdused 27 2 21.888889 42
cl_stat 230 27 142.55556 347
cl_func 12 2 10.666667 19
cl_data 30 7 17.333333 41
cl_func_pub 8 2 8.6666667 19
cu_cduser 32 5 16.444444 36

• The cl_wmc value is close to the max value, meaning this is a complex class.
• The cu_cdused value needs to be decreased in order to make the class easy to change.
• The high cl_stat (230) indicates a larger class, which leads to a complex class.
• The number of cl_func should be decreased.
• The value of cl_func_publ is 8 out of 12 cl_func; this ratio is high, so we should replace public methods with protected and private methods and declare as public only the methods that must be accessible.
• The value of cl_data is close to the maximum, which indicates high coupling leading to a more complex system.

In contrast, Table 6 presents jfrmabout.java as a good class in the Faculty Book System.

Table 6 the metrics values for jfrmabout.java Class


Class Name jfrmabout.java
Metrics Value Min Mean Max
cl_wmc 2 2 32.888889 74
cl_comf ratio 0.1066667 0.0839506 0.1426509 0.264
cd_cdused 21 2 21.888889 42
cl_stat 43 27 142.55556 347
cl_func 2 2 10.666667 19
cl_data 7 7 17.333333 41
cl_func_pub 2 2 8.6666667 19
cu_cduser 12 5 16.444444 36

• cl_wmc is low, which indicates low complexity and leads to high maintainability.
• cu_cdused is at the mean value.
• cl_stat has a low value (43 out of 347), which means we have a small class.
• There are only 2 methods declared in this class, and all are declared as public.


5.1.3 Car Sale System:

After analyzing the Car Sale System using the Understand tool, Figure 5 presents the maintainability factor values for each class in this domain.

Figure 5

The changeability factor holds the highest value among all factors, followed by testability, but the values for analyzability and stability vary from one class to another.
Based on the results presented in Figure 6, we choose a fair and a poor class and calculate the min, mean and max values for each metric included in the formula.

Figure 6

Table 7 shows the min, mean and max values for the CarSaleSystem.java class, which we consider a poor class.


Table 7 the metrics values for CarSaleSystem.java Class


Class Name CarSaleSystem.java
Metrics Value Min Mean Max
cl_wmc 51 5 17.1 51
cl_comf ratio 0.2117647 0.211765 0.339118 0.613793
cd_cdused 27 4 9.7 27
cl_stat 112 12 52.3 112
cl_func 20 4 8.8 20
cl_data 17 2 10.6 21
cl_func_pub 20 2 7.2 20
cu_cduser 17 3 11.5 21

• The cl_wmc value equals the max value; therefore, the class is highly complex, which leads to less maintainable software.
• The cu_cdused value equals the max value and needs to be decreased in order to make the class easy to change.
• cl_stat equals the max value, which indicates a larger class that is more complex.
• The number of cl_func should be decreased.
• All methods in the class are declared public, so we should replace public methods with protected and private methods and declare as public only the methods that must be accessible.
• The high cl_data value indicates high coupling between the methods of the class, and the high cl_data_publ value indicates low information hiding (low encapsulation).

In contrast, Table 8 presents manfacturer.java as a good class in the Car Sale System.
Table 8 the metrics values for manfacturer.java Class
Class Name manfacturer.java
Metrics Value Min Mean Max
cl_wmc 8 5 17.1 51
cl_comf ratio 0.613793 0.211765 0.339118 0.613793
cd_cdused 4 4 9.7 27
cl_stat 12 12 52.3 112
cl_func 7 4 8.8 20
cl_data 2 2 10.6 21
cl_func_pub 6 2 7.2 20
cu_cduser 3 3 11.5 21

• cl_wmc is low, which indicates low complexity and leads to high maintainability.
• cu_cdused is low, meaning low coupling, which is better.
• cl_stat has a low value, 12 out of 112, resulting in a small class.
• There are 7 methods declared in this class, which indicates low complexity.
• The number of public methods should be decreased, as cl_func_publ is 6 of the 7 methods.
• cl_data_publ is low, which indicates higher encapsulation.

5.2 Comparing the maintainability values between the domains


After calculating the maintainability for each domain, Table 9 summarizes the measurements.

Table 9 The maintainability results for three domains


Domain      Maintainability value
Education   334.37
Social      171.37
Business    174.74

Higher maintainability values indicate more complex, and therefore less maintainable, software. Although the Business domain is bigger than the Social domain by 109 LOC, the Social and Business domains have almost the same maintainability value, indicating that the two domains have the same degree of maintainability. The Education domain has the highest maintainability value, which means it is the most complex domain and the least maintainable software.

5.3 Evaluating the maintainability using Eclipse Tool

We also measured maintainability for the Business and Education domains with the Eclipse tool, which produced measurements close to those of the Understand tool. The Business LOC is 1401, and the Education LOC is 1945. Table 10 summarizes the maintainability values for the two domains.

Table10 Maintainability values for Business and education domains

Domain     Factor           Value
Business   Analyzability    27.183
           Changeability    67.515
           Stability        26.486
           Testability      33.272
           Maintainability  154.456
Education  Analyzability    48.415
           Changeability    162.944
           Stability        37.648
           Testability      55.121
           Maintainability  304.128
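As a quick sanity check, the four factor values in Table 10 do sum to the reported maintainability figures:

```python
# Factor values from Table 10; maintainability is their plain sum.
business = {"analyzability": 27.183, "changeability": 67.515,
            "stability": 26.486, "testability": 33.272}
education = {"analyzability": 48.415, "changeability": 162.944,
             "stability": 37.648, "testability": 55.121}

print(round(sum(business.values()), 3))   # 154.456, as reported
print(round(sum(education.values()), 3))  # 304.128, as reported
```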

5.4 Results Analysis


According to the results in the table above, the Business domain is more maintainable than the Education domain, which is similar to the result obtained from the Understand tool.

We found that cl_stat has the largest impact on maintainability values, followed by cl_wmc and cu_cdused. Figures 7, 8 and 9 show the metric values that have the highest impact in calculating maintainability for the three different domains.

Figure 7 Metrics values for Business Domain

Figure 8 Metrics values for Social Domain

Figure 9 Metrics values for Education Domain

5.5 Threat To Validity


This paper presents a case study involving three open source Java systems from three different domains. The purpose of the case study is to reveal the relationship between the domain of an open source system and its level of maintainability. Both the evaluation process and the selected open source systems have some properties which may affect the validity of the results and the usability of the approach. First, we chose only one project per domain, which is not enough to draw a general conclusion about the relationship between the domain of open source Java systems and the level of maintainability.
Another threat to validity is the formula we used to calculate the four quality attributes that affect maintainability, because the formula used in [22] was calculated based on

different projects. Our projects could include some other metrics that have a direct effect on the calculations.
The tools used are also a threat to validity. The tools have several advantages and disadvantages, listed below.

5.5.1 Understand Tool


Advantages:
• Easy to install and learn.
• Comfortable view of the project architecture; Understand provides views of packages, files, classes, methods, and entities.
• Provides measurements of more metrics than the other tool.
• Generates a summary report for the overall project.
• Can analyze the code even when it contains errors.

Disadvantages:
• Only a trial version is available.
• Cannot generate the mean, min and max values for each metric.
5.5.2 Eclipse Tool
Advantages:
• Available for free.
• Generates the mean, max, min and standard deviation values for each metric.

Disadvantages:
• Does not generate measurements for all metrics, such as …
• For small projects, some metrics are not generated.
• Cannot analyze a project containing errors.

6. Conclusion

Maintainability is one of the most important quality attributes affecting the quality of software. Based on the C&K metrics, we computed the maintainability for three different domains using two different tools.
We conclude that different domains produce different maintainability measurements, and that the most important factor affecting maintainability is changeability. From the results of the Understand and Eclipse Metrics tools, we deduce that different tools generate similar results for maintainability; we trust the Understand tool to provide the more accurate maintainability measurement, given its advantages over the Eclipse tool.


References:
[1] Tahir, Amjed, and Rodina Ahmad. "An AOP-Based Approach for Collecting Software Maintainability Dynamic Metrics." Computer Research and Development, 2010 Second International Conference on. IEEE, 2010.
[2] http://www.castsoftware.com/glossary/software-maintainability. Accessed 2017.
[3] Ghosh, Soumi, and Sanjay Kumar Dubey. "Fuzzy Maintainability Model for Object Oriented Software System." 2012.
[4] Johari, Kalpana, and Arvinder Kaur. "Validation of object oriented metrics using open source software system: an empirical study." ACM SIGSOFT Software Engineering Notes 37.1 (2012): 1-4.
[5] Rizvi, S. W. A., and R. A. Khan. "Maintainability estimation model for object-oriented software in design phase (MEMOOD)." arXiv preprint arXiv:1004.4447 (2010).
[6] Albeladi, Abdulrhman, et al. "Toward Software Measurement and Quality Analysis of MARF and GIPSY Case Studies: a Team 13 SOEN6611-S14 Project Report." arXiv preprint arXiv:1407.0063 (2014).
[7] Lincke, Rüdiger, Jonas Lundberg, and Welf Löwe. "Comparing software metrics tools." Proceedings of the 2008 International Symposium on Software Testing and Analysis. ACM, 2008.
[8] Levesque, M. (2005). "Fundamental issues with open source software development" (originally published in Volume 9, Number 4, April 2004). First Monday.
[9] Open-source software security, available at: https://en.wikipedia.org/wiki/Open-source_software_security. Accessed Dec 11, 2016.
[10] Nichols, D., Twidale, M. (2002). "The Usability of Open Source Software." First Monday, 8(1). Available at: http://firstmonday.org/article/view/1018/939#n2.
[11] Ayalew, Y., & Mguni, K. (2013). "An Assessment of Changeability of Open Source Software." Computer and Information Science, 6(3), p. 68.
[12] Gulia, Preeti, and Rajender Singh Chhillar. "Design based Object-Oriented Metrics to Measure Coupling and Cohesion." Journal of Management & Computing Sciences (IJMCS) 1.3 (2011): 42.
[13] Mukti Chauhan, Monika Sharma. "Predicting Maintainability Of Open Source Software: An Empirical Approach." IJERT, Volume 2, Number 6, June 2013, Pages 3333-3336.
[14] Malhotra, Ruchika, and Anuradha Chug. "An empirical study to redefine the relationship between software design metrics and maintainability in high data intensive applications." Proceedings of the World Congress on Engineering and Computer Science. Vol. 1. 2013.
[15] Ganpati, Anita, Arvind Kalia, and Hardeep Singh. "A Comparative Study of Maintainability Index of Open Source Software." International Journal of Software and Web Sciences 3.2 (2012): 69-73.
[16] Najm, Nahlah MAM. "Measuring Maintainability Index of a Software Depending on Line of Code Only." IOSR Journal of Computer Engineering (IOSR-JCE), Volume 16, Issue 2, Ver. VII (Mar-Apr. 2014), PP 64-69.
[17] Bakar, A. D., Sultan, A. M., Zulzalil, H., & Din, J. (2012). "Review on 'Maintainability' Metrics in Open Source Software." International Review on Computers and Software, 7(3).
[18] Saini, Rimmi, Sanjay Kumar Dubey, and Ajay Rana. "Analytical study of maintainability models for quality evaluation." Indian Journal of Computer Science and Engineering 2.3 (2011): 449-454.
[19] Rizvi, S. W. A., and R. A. Khan. "Maintainability estimation model for object-oriented software in design phase (MEMOOD)." arXiv preprint arXiv:1004.4447 (2010).
[20] Ghosh, Soumi, and Sanjay Kumar Dubey. "Fuzzy Maintainability Model for Object Oriented Software System." 2012.
[21] Zhou, Yuming, and Baowen Xu. "Predicting the maintainability of open source software using design metrics." Wuhan University Journal of Natural Sciences 13.1 (2008): 14-20.
[22] Katoch, Bhavna, and Lovepreet Kaur Shah. "A Systematic Analysis on MOOD and QMOOD Metrics." International Journal of Current Engineering and Technology 4.2 (2014): 620-622.
[23] Al-Ja'Afer, J., and K. Sabri. "Metrics for object oriented design (MOOD) to assess Java programs." King Abdullah II School for Information Technology, University of Jordan, Jordan (2004).
[24] Dubey, Sanjay Kumar, and Ajay Rana. "Assessment of maintainability metrics for object-oriented software system." ACM SIGSOFT Software Engineering Notes 36.5 (2011): 1-7.
[25] Albeladi, Abdulrhman, et al. "Toward Software Measurement and Quality Analysis of MARF and GIPSY Case Studies: a Team 13 SOEN6611-S14 Project Report." arXiv preprint arXiv:1407.0063 (2014).
[26] Lincke, Rüdiger, Jonas Lundberg, and Welf Löwe. "Comparing software metrics tools." Proceedings of the 2008 International Symposium on Software Testing and Analysis. ACM, 2008.
[27] File transfer chat project, available at: http://www.codewithc.com/file-transfer-chat-project-java/. Accessed Dec 11, 2016.
[28] Car sales management system, available at: https://codecreator.org/projects/car-sales-management-system/. Accessed Dec 11, 2016.
[29] Faculty book system, available at: http://www.final-yearproject.com/2009/07/faculty-book-system.html. Accessed Dec 11, 2016.
[30] Mamdouh Alenezi, Mohammed Akour, Alaa Husein. "Test Suite Effectiveness: An Indicator for Open Source Software Quality." Open Source Software Computing (OSSCOM 2016), IEEE, 2016.


Classification of Human Vision Discrepancy during Watching 2D and 3D Movies Based on EEG Signals
Negin Manshouri, Masoud Maleki, Temel Kayıkçıoğlu
Department of Electrical and Electronics Engineering
Karadeniz Technical University
Trabzon, Turkey
[email protected], [email protected], [email protected]

Abstract—Besides important applications of electroencephalogram (EEG) signals, such as recognizing different mental diseases, other aspects of EEG utilization, such as biometrics, music, and entertainment, are striking nowadays. To make a good interface between human brains and the surrounding environment, the brain-computer interface (BCI) has created a remarkable evolution in this field. In this paper, EEG signals acquired during watching a 2D and a 3D movie are investigated. A sample of nine healthy volunteers (age range 18-30) contributed to the experiments, which consist of two parts: first, subjects watched a 2D movie and then watched the same movie in 3D mode. After data acquisition, to predict the states of the brain, the signals are sent to the feature extraction stage. The Fast Fourier Transform (FFT) is used to extract features, which are then classified by the "Classification Learner App". Two kinds of Support Vector Machine (SVM) classifier and the fine kind of k-nearest neighbors (kNN) were used as classifiers in this study. To understand which frequency bands are more effective in the EEG signals during watching 2D and 3D movies, these combinations of EEG bands are used as features: the delta, theta, alfa, beta and gamma bands, abbreviated as "all bands"; the delta, theta and alfa bands as "low frequency bands"; theta, alfa and beta as "middle bands"; and the alfa, beta and gamma bands as "high frequency bands". Finally, comparing the results, the classification accuracy of "all bands" in channel T5 for Quadratic SVM was the highest.

Keywords—EEG, brain-computer interface, 2D and 3D movies, kNN, SVM, Classification Learner App

I. INTRODUCTION

BCI is a novel communication channel between the human brain and an external device. Without using any words or activities, it enables people to control devices. Electroencephalography (EEG) is an attractive method used to record the electrical potential of the brain, and among the various methods of brain imaging, EEG is the most generally used one. This method is easy, portable and typically non-invasive, and helps us to demonstrate the activity of the brain by using suitable classification algorithms and feature extraction methods.

In recent research, BCI usage has progressed with the aim of helping disabled people and patients [1], controlling games [2], neurofeedback games [3], controlling a computer cursor [4], robotic limbs [5], etc. In [6], by focusing on one BCI application and using EEG signals, human identification has been investigated as the main topic of the research. Wajid et al. have analyzed five regions of the brain and proposed a new method based upon classification of states of 2D and 3D game data [7]. Kaiyang et al. have used EEG signals to predict the movement intent of the human body [8]. Norizam et al., by analyzing EEG signals in the lab, interpreted human thought; despite the low accuracy of this work, the study could create the LabVIEW block diagram to test [9]. In addition to medical uses of BCI, there are some other recreational applications too. Nowadays, there is significant progress in the realm of three-dimensional images and videos. For instance, Khairuddin et al., by gathering adults' EEG signals during video game play in 2D and 3D, concluded that their method may be useful in quantifying the EEG signals during 2D and 3D visualization [10]. In 2014, another team's discovery showed that during play in 3D mode there is an increase in the theta and alpha bands at the frontal and occipital regions, while in 2D play higher beta and gamma activity was found chiefly in the temporal lobes [11]. In [12], a research team measured the brain activities of viewers while watching 2D, 2.5D and 3D motion pictures and compared them with each other; their results showed that the relative intensity of the α-frequency band of the 2.5D viewer was lower than that of the 2D viewer, while that of the 3D viewer remained at a similar intensity. Other studies have shown that the obtained brain waves in the α-frequency band are not related to visual perception, although incompatible results have been reported by a few previous studies [13, 14].

The EEG signals in this research are collected spontaneously; in other words, the continuous, rhythmic potential changes of the brain cortex are called spontaneous EEG signals. These kinds of EEG are divided into five distinguished bands [8]. Table I shows the EEG frequency band division.

TABLE I. FREQUENCY BAND DIVISION OF THE SPONTANEOUS EEG
Band        Range of Frequency (Hz)
Delta (δ)   2-4
Theta (θ)   4-8
Alfa (α)    8-13
Beta (β)    13-30
Gamma (γ)   >30

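The band division of Table I translates directly into code. The sketch below (our own illustration, not the authors' implementation) averages FFT magnitudes inside each band to form one feature per band, assuming NumPy, 1-second epochs, and the 512 Hz sampling rate used in this paper; the 120 Hz upper limit for gamma is our assumption, matching the paper's band-pass filter.

```python
import numpy as np

# Band limits (Hz) from Table I; gamma's 120 Hz ceiling is an assumption.
BANDS = {"delta": (2, 4), "theta": (4, 8), "alfa": (8, 13),
         "beta": (13, 30), "gamma": (30, 120)}
FS = 512  # sampling rate in samples/second; one 1-second epoch per FFT

def band_features(epoch):
    """Mean FFT magnitude in each of the five bands -> 5 features per channel."""
    spectrum = np.abs(np.fft.rfft(epoch))
    freqs = np.fft.rfftfreq(len(epoch), d=1.0 / FS)
    return {name: spectrum[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}

# A hypothetical 1-second epoch: a pure 10 Hz sine wave (inside the alfa band).
t = np.arange(FS) / FS
feats = band_features(np.sin(2 * np.pi * 10 * t))
print(max(feats, key=feats.get))  # -> alfa
```

With 18 channels, stacking the five band features per channel yields the 90-dimensional feature vector described in Section III.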

As known, EEG is used to discuss the human brain's visual perception besides its other applications. Our goal is to design a pattern recognition system for classifying brain signals during watching 2D and 3D movies by applying EEG analysis. We also want to know which bands of the EEG signal play an important role in our study. In this system, we consider eight subjects. By using a common feature extraction method and a high-performance classification algorithm, we hope that this pattern helps us to identify people in both 2D and 3D movie watching. Furthermore, by exploring the difference in brain activity between these two cases, it is hoped that this pattern can be executable in BCI applications. In other words, the subject could watch a 2D motion image for ON and a 3D motion image for OFF of a device. This study is a first step toward designing and implementing a new, fast, simple BCI system.

The structure of the paper is as follows: after the preliminaries, the experimental setup, the results and, finally, the conclusion and discussion sections are given. The structure of brain signal recording is shown in Figure 1.

Fig. 1. The structure of brain signal recording

II. METHODOLOGY AND EXPERIMENTAL PARADIGM

A. Subjects
Eight subjects, three women and five men, aged between 18 and 30 years, participated in this investigation. None had problems with their vision. All subjects were informed about the experimental conditions. They were asked to sit on a chair almost one meter away from the TV (LG 32 inch) stand, to relax, and to focus on the television screen, keeping unessential movements to a minimum during the trial. Two sessions were organized for each subject: subjects first watched the 2D movie, and then, after stopping the program and a one-minute gap, recording continued while they watched the same movie in 3D. A two-minute and twelve-second 2D clip and a two-minute and twelve-second 3D clip of the AVATAR movie [15] were shown to the subjects.

B. EEG Data Acquisition
To demonstrate the use of EEG signals for detecting the differences between 2D and 3D movie watching, a data collection scenario was designed. The EEG data corresponding to 2D and 3D movie watching were recorded by the Brain Quick EEG System (Micromed, Italy) from 18 scalp locations (Fp1, Fp2, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, Fz and Pz) based on the international 10-20 system, with Cz used as the reference. EEG data were recorded at a sampling rate of 512 samples per second. The EEG cap has 19 channels in total, and all channels were selected for analysis.

C. Data Preprocessing
The raw EEG data are sent to a computer via flash memory and saved as a .edf file, then converted to .mat format by a MATLAB program, ready for analysis. Different preprocessing methods can be applied to the raw data in order to remove line noise, polarization noise, eye movements, and muscular activities. In this study, we used a band-pass filter to elicit the desired signal between 0.1 and 120 Hz, and a 50 Hz notch filter to delete line noise. A mean normalization process was applied to each epoch as in (1) [16]:

X_N = (x − x̄) / max|x − x̄|    (1)

where x, x̄ and X_N denote the original epoch, the mean of the original epoch, and the normalized epoch, respectively.

D. Epoch Category
After data collection and preprocessing, the EEG signals are divided into 1-second epochs. For a movie viewer, there are 280 1-second epochs in all. The data set is described in Table II.

TABLE II. DESCRIBING THE SELECTION OF THE DATA SET
1-second epochs for the Avatar movie: 280 epochs in total (140 epochs for 2D & 140 epochs for 3D)

III. FEATURE EXTRACTION

Besides time reduction, high performance of feature extraction is very important in signal analysis. In order to highlight the effectiveness and efficiency of this BCI system, we particularly concentrate on the simplest algorithm, to decrease the calculations and shorten latency [6].
To describe the characteristics of EEG signals and transform them to the frequency domain, the Fast Fourier Transform (FFT) is used as in (2). In this study, all of the channels were used to extract features. By using this analysis, converting the raw data


from its original domain (time) to the frequency domain is possible. The Discrete Fourier Transform (DFT) converts discrete-time sequences into their discrete-frequency versions, as given by (3). To calculate a sequence's DFT, the FFT is an effectual algorithm [17]. The FFT is available as the function fft() in MATLAB, which is used in this study.

X(f) = F(x(t)) = ∫_{−∞}^{+∞} x(t) e^{−2πjft} dt    (2)

X(k) = Σ_{i=0}^{n−1} x_i e^{−j2πik/n},  for k = 0, 1, ..., n−1    (3)

where in (2), x(t) is the time-domain signal and X(f) is its Fourier transform, and in (3), x is the input sequence, X is its DFT, and n is the number of samples [18].

After signal transformation, the five bands of the EEG signal (delta, theta, alpha, beta, and gamma) were obtained for each epoch of the 18 channels. For each epoch, the samples' average in each band is calculated in order to reduce the dimension of the EEG signals. In this way, 5 features were extracted for each epoch in one channel and, as mentioned, 18 channels were used, so 90 (18*5) features were prepared for each epoch.

IV. CLASSIFICATION BASED ON THE CLASSIFICATION LEARNER APP

After preprocessing, obtaining the clean EEG signals and extracting features, the dataset was prepared for classification. To select the most appropriate classifier for the chosen features, the properties of the classifiers must be known. The results in this study were classified using the Classification Learner App, in the Statistics and Machine Learning Toolbox of MATLAB. This is an app to train models and classify data using supervised machine learning. Considering two algorithms, the fine type of k-Nearest Neighbor (kNN) and two kinds of Support Vector Machine classifier (Linear and Quadratic SVM), the app has the ability to compare and assess these models. The steps of the Classification Learner are depicted in Figure 2.

A. kNN
The purpose of kNN is to assign an unseen point to the dominant class among its k nearest neighbors in the training dataset [19]. In other words, this simple algorithm stores all the samples and classifies a new sample based on its distances to its neighbors. The fine kNN type has medium prediction speed, medium memory usage, and hard interpretability. This model is flexible, and by applying this type of classifier, strict distinction between classes is possible.

B. SVM
This algorithm has an important role in EEG signal classification, separating groups of features. Because of the binary classes of this investigation, the prediction speed is fast, but in a multiclass model this speed is generally medium. Furthermore, the Linear SVM classifier has medium memory usage and easy interpretability; the flexibility of this model is low, and it makes a linear separation between classes. In the Quadratic SVM model, the prediction speed is higher than Linear SVM, memory usage is large, and interpretability is hard; the flexibility of this model is medium.

Fig. 2. The steps of using the "Classification Learner App"

V. RESULTS AND DISCUSSION

In this study we report results on the detection of the differences during watching 2D and 3D movies, for designing and improving the performance and reliability of BCI systems. We have a two-class classification problem. As mentioned above, to reduce the number of channels and understand which channels (or lobes) and which bands have the best performance in the classification, we separately classified the 18 channels in four different band types. In the first section, the EEG signal of a channel was divided into 1-second epochs. After normalization, the 140 epochs (for each class) were divided into training and testing sets. For each epoch from the training set, 5 features were extracted using the FFT. The classification in our study was performed in an interactive app environment in the Statistics and Machine Learning Toolbox for MATLAB called the "Classification Learner App". It offers different classifiers, of which we used only three, mainly due to their fast prediction speed and high accuracy: kNN, Quadratic SVM and Linear SVM. Calculating the accuracy and plotting the ROC curves, scatter plots and confusion matrix diagrams of a model are ways of displaying and visualizing results that the "Classification Learner App" provides. With this app, we can also train the classifier with K-FCV (in this study K=10). After training the classifiers
different class memberships, with maximum margin, we exported the models to the work space of MATLAB for
constructing an optimal hyperplane is important, one of these prediction of test epochs. The classification result was defined
methods that find optimal hyperplane is SVM. In this paper, two as the percentage of the number of epochs classified correctly
kinds of SVM classifier (linear and Quadratic SVM) are used.
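The DFT of (3) and the band-averaging step of Section III can be sketched in code. The paper uses MATLAB's fft(); the direct DFT below is an illustrative pure-Python equivalent, and the band edges (e.g., alpha 8-13 Hz) are common conventions assumed here rather than values given in the paper:

```python
import cmath
import math

def dft(x):
    """Direct DFT of eq. (3): X_k = sum_i x_i * exp(-j*2*pi*i*k/n)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * i * k / n) for i in range(n))
            for k in range(n)]

# Assumed band edges in Hz (conventional values, not stated in the paper):
BANDS = [(0.5, 4), (4, 8), (8, 13), (13, 30), (30, 45)]  # delta..gamma

def band_features(epoch, fs, bands=BANDS):
    """Average FFT magnitude per band for one epoch of one channel,
    i.e. one 5-dimensional feature vector per channel as in Section III."""
    X = dft(epoch)
    n = len(epoch)
    feats = []
    for lo, hi in bands:
        # bin k of an n-point DFT corresponds to frequency k*fs/n
        mags = [abs(X[k]) for k in range(n // 2 + 1) if lo <= k * fs / n < hi]
        feats.append(sum(mags) / len(mags))
    return feats

if __name__ == "__main__":
    fs, n = 512, 512  # one 1-s epoch at the paper's 512 samples/s
    # Synthetic 10 Hz sine: the alpha-band feature should dominate.
    epoch = [math.sin(2 * math.pi * 10 * t / fs) for t in range(n)]
    feats = band_features(epoch, fs)
    print(max(range(5), key=lambda i: feats[i]))  # -> 2 (alpha)
```

Concatenating these 5 features over the 18 channels yields the 90-dimensional epoch vector described above.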

432 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

over the size of the testing set. Table III shows the classification accuracy obtained using 18 channels and the three classifiers for all EEG bands. The average of each classifier over all subjects in each channel was also computed separately in the last row. It can be seen that channel T5 with the quadratic SVM has the best performance, with 69.23%.

To understand which bands are more effective in the EEG signals during watching 2D and 3D movies, the low frequency bands, comprising delta, theta, and alpha, were used as features. Table IV shows the classification accuracy obtained using 18 channels and the three classifiers for the low frequency bands. As in Table III, the average of each classifier over all subjects in each channel was computed separately in the last row. In the low frequency bands, channel Fp1 with the linear SVM has the best performance, with 64.50%. Continuing the frequency band evaluation, the middle and high frequency bands were used as features. Table V and Table VI show the classification accuracy obtained using 18 channels and the three classifiers for the middle and high frequency bands, respectively. The average of each classifier over all subjects in each channel was computed separately in the last row of each table. In the middle frequency bands, channel Fp1 with the linear SVM has the best performance, with 62.23%. In the high frequency bands, channel C4 with the quadratic SVM has the best performance, with 64.69%. In brief, we can see that the high frequency bands are slightly more effective than the other bands.

VI. CONCLUSION

This paper proposed a new BCI system based on watching 2D and 3D movies. It also tried to find which bands of the EEG signal have more effect during watching 2D and 3D movies. Four different combinations of EEG bands were used as features: the delta, theta, alpha, beta, and gamma bands together, abbreviated as "all bands"; the delta, theta, and alpha bands as the low frequency bands; the theta, alpha, and beta bands as the middle bands; and the alpha, beta, and gamma bands as the high frequency bands. The results showed that all bands can be used in classification with a reasonable accuracy of about 70%. The results also show that the high frequency bands are slightly more effective than the other bands. The classification results were prepared with the help of the Classification Learner App in the Statistics and Machine Learning Toolbox of MATLAB. This scheme has the potential to be used in BCI systems and even in the automated diagnosis of fatigue. In future work, different classifiers and feature extraction methods can be employed to observe which may provide even better classification results.
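The evaluation protocol described above (the kNN rule of Section IV-A, K-fold cross-validation with K = 10, and accuracy defined as correctly classified epochs over the test-set size) can be sketched as follows. The paper used MATLAB's Classification Learner App; this stand-alone pure-Python k-nearest-neighbor (k = 1, mirroring the "fine" preset) and the synthetic two-class feature vectors are illustrative substitutes only:

```python
import random
from collections import Counter

def knn_predict(train, x, k=1):
    """Assign x the dominant class among its k nearest training points."""
    nearest = sorted(
        train,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, test, k=1):
    """Fraction of test epochs classified correctly (Section V's metric)."""
    correct = sum(knn_predict(train, fv, k) == label for fv, label in test)
    return correct / len(test)

def kfold_accuracy(data, k_folds=10, k=1, seed=0):
    """Mean accuracy under K-fold cross-validation (K = 10 in the paper)."""
    data = data[:]
    random.Random(seed).shuffle(data)
    folds = [data[i::k_folds] for i in range(k_folds)]
    accs = []
    for i in range(k_folds):
        test = folds[i]
        train = [s for j, f in enumerate(folds) if j != i for s in f]
        accs.append(accuracy(train, test, k))
    return sum(accs) / len(accs)

if __name__ == "__main__":
    # Synthetic two-class "epochs": 140 per class, 5 features each,
    # class 0 centered at 0 and class 1 offset to 1 (well separated).
    rng = random.Random(1)
    data = [([rng.gauss(c, 0.3) for _ in range(5)], c)
            for c in (0, 1) for _ in range(140)]
    print(round(kfold_accuracy(data, 10, 1), 2))
```

On such well-separated synthetic data the cross-validated accuracy is near 1; on the real EEG features the paper reports channel-wise accuracies in the 50-70% range.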

TABLE III. CLASSIFICATION ACCURACY OBTAINED USING 18 CHANNELS AND 3 CLASSIFICATION METHODS FOR ALL EEG BANDS

Sub Classifier FP1 FP2 F4 F3 F8 F7 FZ T4 T3 C4 C3 T6 T5 P4 P3 PZ O2 O1


S1 kNN 56.7 59.5 56.7 52.8 56.7 53.2 55.2 53.2 66.7 58.3 44.4 70.2 53.6 67.9 54.4 66.3 88.1 59.5
Linear SVM 66.7 64.3 63.1 52 60.3 62.7 57.9 59.9 77.4 52.4 54.4 79.4 64.3 72.2 60.7 67.9 90.5 67.5
Quadratic SVM 61.1 60.3 60.7 53.2 60.7 61.1 59.9 56.7 75.4 69.4 50 77.4 65.5 67.1 65.1 66.7 88.5 65.1
S2 kNN 48.4 65.6 58.2 50.4 66.4 49.2 47.5 51.6 57.4 48.4 49.6 53.7 73.8 48.4 48.4 52.5 51.6 63.9
Linear SVM 63.1 78.3 55.7 57.8 73 57.8 48.8 69.7 66.8 53.7 56.1 66.8 77 61.9 54.9 53.7 57.4 73.8
Quadratic SVM 62.3 79.1 60.2 55.7 69.7 57.8 47.1 58.6 66.4 56.6 57.4 67.2 75.8 56.1 52.9 50 56.6 70.5
S3 kNN 62.7 66 60.02 59 63.5 59.4 52.9 58.2 49.6 62.7 52.9 53.7 53.7 61.9 53.3 57.4 60.7 68.9
Linear SVM 68.9 67.2 52.5 58.6 64.3 57.8 61.5 63.1 56.6 66.4 62.7 56.1 47.1 72.5 62.7 64.3 70.1 67.2
Quadratic SVM 72.5 71.7 51.6 64.8 65.6 58.6 63.9 64.3 55.7 66.8 63.5 55.7 56.1 69.7 62.3 62.7 70.9 61.9
S4 kNN 63.6 62.4 69.4 65.3 58.3 56.2 65.3 63.2 62 71.9 56.6 74 74.4 48.3 58.3 57 54.5 58.7
Linear SVM 69.4 65.7 72.7 75.2 69 62.4 74.8 67.4 66.9 78.1 56.6 80.6 82.6 54.1 68.6 61.2 59.9 40.5
Quadratic SVM 67.8 62 69.8 74.4 66.1 63.6 72.3 65.7 68.6 76 52.9 78.9 81 55.4 61.6 61.2 58.7 46.7
S5 kNN 51.6 47.1 55.7 54.1 43.4 53.7 52.5 50.8 48.8 48.8 63.5 66.4 57.4 86.9 47.5 51.6 56.6 55.7
Linear SVM 54.1 54.5 66 42.6 47.5 54.5 50.4 61.9 54.1 56.1 50.8 47.1 54.1 87.3 52.5 52.5 50.8 45.9
Quadratic SVM 54.5 50 63.1 48.8 46.7 46.7 46.3 59.8 55.7 56.1 52.5 51.6 52 85.2 50 54.9 48 48
S6 kNN 53.3 56.1 54.5 61.1 52.5 54.5 56.6 70.1 50.4 52.5 62.7 50.8 47.5 62.5 50.8 57.8 55.7 55.7
Linear SVM 50.8 51.6 59.8 54.9 49.2 50.8 52 67.6 54.1 52.5 48.4 53.3 50.4 52.9 57 48.8 50.4 54.9
Quadratic SVM 56.1 49.6 54.1 44.7 50.4 53.7 53.7 71.7 46.7 54.5 50.8 47.5 52.5 52 50.8 52.5 53.3 50.8
S7 kNN 48.4 56.6 61.9 49.6 57 60.2 51.6 57 53.7 62.7 57.8 48.4 56.1 52.5 51.2 52.5 52.9 51.2
Linear SVM 52.5 55.3 53.3 58.6 59.8 55.7 57.8 48.4 55.3 71.3 57.8 54.1 59.8 57.8 57 58.2 56.6 52
Quadratic SVM 54.9 53.7 62.3 59 58.2 53.3 54.9 61.5 57.4 67.6 59.4 51.6 54.9 59.8 56.6 56.1 55.7 56.6
S8 kNN 95.9 55.7 58.2 62.7 59 63.9 55.3 63.1 53.7 62.7 57.4 59.4 57 65.2 98.8 54.5 56.6 59.8
Linear SVM 95.1 56.1 60.2 73 68.4 66.8 60.2 60.2 61.9 66 62.3 65.6 66 61.9 82 51.2 59.4 68.9
Quadratic SVM 94.7 63.1 61.5 68.4 67.2 73 61.1 55.7 61.9 67.6 62.7 63.9 63.5 67.6 98.4 57.8 60.7 63.9
S9 kNN 67.2 63.1 59.8 59.4 65.6 58.2 58.2 52.9 51.2 59.4 66 67.6 60.2 81.1 69.7 50 44.3 47.5
Linear SVM 68.4 65.6 58.6 60.7 57.4 62.7 56.1 57.4 59 69.7 74.2 66.8 72.1 77 68.9 57 65.2 59.4
Quadratic SVM 72.5 64.8 53.7 56.6 63.5 64.8 54.5 58.2 51.6 68 76.6 68.9 74.2 79.9 74.2 58.2 57.8 58.2
Ave kNN 60.87 59.12 59.38 57.16 58.04 56.5 55.01 57.79 48.87 58.6 56.77 60.47 59.3 66.38 59.16 55.52 57.89 57.8
Linear SVM 65.44 62 60.21 59.27 60.88 59.02 57.72 61.74 61.34 62.92 58.15 63.04 60.49 66.4 62.7 57.2 62.26 58.9
Quadratic SVM 66.27 61.59 59.67 62.67 60.9 59.18 57.07 61.35 59.94 65.65 58.42 62.53 69.23 65.87 63.55 57.79 61.14 57.9
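The "Ave" row of Table III is the per-channel mean over the nine subjects for each classifier. As a quick check, averaging the kNN column for channel FP1 reproduces the tabulated 60.87:

```python
def channel_average(values):
    """Mean accuracy over subjects for one channel/classifier column."""
    return sum(values) / len(values)

# kNN accuracies for channel FP1, subjects S1..S9 (from Table III).
FP1_KNN = [56.7, 48.4, 62.7, 63.6, 51.6, 53.3, 48.4, 95.9, 67.2]

if __name__ == "__main__":
    print(round(channel_average(FP1_KNN), 2))  # -> 60.87
```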


TABLE IV. CLASSIFICATION ACCURACY OBTAINED USING 18 CHANNELS AND 3 CLASSIFICATION METHODS FOR DELTA, THETA, AND ALPHA BANDS
Sub Classifier FP1 FP2 F4 F3 F8 F7 FZ T4 T3 C4 C3 T6 T5 P4 P3 PZ O2 O1
S1 kNN 56.3 55.2 56.7 45.6 52.4 56 62.3 49.6 54.8 50.4 49.2 56.7 52 52 46.4 67.1 49.2 45.2
Linear SVM 61.5 62.7 63.5 52 62.3 56.7 59.5 55.6 55.2 55.2 57.1 56.7 60.7 50.4 57.5 68.3 55.2 45.2
Quadratic SVM 60.7 57.7 63.9 51.6 61.3 60.7 60.3 50 57.5 55.6 48.8 53.2 56.7 52.8 57.9 67.1 55.2 62.3
S2 kNN 46.7 47.1 59 51.6 50.4 49.2 52.5 43 50 49.6 57.4 52.9 69.3 54.1 53.3 58.2 43 66
Linear SVM 54.5 54.9 55.7 60.7 56.1 60.2 54.5 50.8 54.9 54.1 56.1 53.3 80.3 55.3 58.6 54.5 54.5 69.3
Quadratic SVM 51.6 55.7 59 56.6 51.2 58.6 55.7 52.9 55.3 52.5 55.7 55.3 77.9 49.2 58.2 53.3 54.9 66.4
S3 kNN 60.7 57.8 60.2 54.5 61.1 52.5 55.3 61.1 53.7 61.1 55.7 52 65.2 64.3 58.6 64.8 60.2 60.7
Linear SVM 68.4 68 52.5 53.7 65.2 60.7 61.9 63.1 49.2 64.8 63.5 57 73 71.7 63.1 64.3 70.5 68.4
Quadratic SVM 63.5 66.4 48.8 56.6 65.2 60.7 62.3 62.7 54.1 62.3 63.5 56.6 71.3 70.5 63.5 63.1 68.4 69.3
S4 kNN 59.5 63.2 61.6 62.8 51.7 53.7 61.2 55 50.4 61.6 58.7 48.8 56.2 46.3 53.7 55.8 49.2 59.9
Linear SVM 68.2 62 67.4 70.7 59.5 62 66.5 59.1 57.4 71.9 53.3 57 57.9 55.8 62.4 63.2 55.4 47.5
Quadratic SVM 63.6 61.6 68.6 68.2 59.1 63.2 63.2 62.4 57 70.7 58.7 54.5 57 51.7 59.1 59.9 57.9 55.8
S5 kNN 53.7 48.8 59.4 51.6 48 49.2 54.5 55.3 49.2 45.5 56.1 54.1 54.9 81.1 45.1 50 52 53.7
Linear SVM 55.3 50.4 67.2 46.7 44.7 53.3 52 59.4 51.6 53.7 48.4 51.2 54.5 82 52.9 52.9 51.6 47.5
Quadratic SVM 50.4 49.6 66.8 50.4 48.4 50 50.8 64.8 52.5 49.2 49.2 51.2 48.4 78.7 53.3 47.5 52.9 47.5
S6 kNN 52.5 56.1 56.1 43 50 49.2 50.4 48.4 70.9 49.2 48 54.5 50 51.2 51.2 50 50.4 48.4
Linear SVM 52.9 58.2 52.5 50 52.9 49.6 43.4 51.2 71.7 58.2 52.9 61.5 48.8 49.6 57.4 53.7 54.1 50
Quadratic SVM 58.2 57.4 58.2 55.3 56.6 52 48 45.5 61.9 58.2 54.5 57 54.5 56.6 58.6 54.9 57 59.8
S7 kNN 88.9 49.2 60.7 62.7 49.6 59.8 56.6 54.1 61.1 45.9 50 59.8 49.2 61.5 95.1 51.6 54.5 47.5
Linear SVM 91.8 61.5 60.7 68 68.9 60.7 57.4 56.6 58.6 47.1 63.5 64.8 55.7 58.6 72.5 55.7 60.2 57.8
Quadratic SVM 89.3 61.1 59 62.3 68 72.1 63.9 54.9 60.7 58.2 62.3 66 54.1 66 89.8 61.9 61.5 58.6
S8 kNN 59 51.2 52.5 55.3 63.9 60.2 59.4 52 47.5 59 48.8 49.2 53.7 73.8 57.8 53.3 47.1 48.8
Linear SVM 64.3 55.7 57 60.7 57.4 63.5 53.7 55.3 49.2 54.1 53.7 50 61.5 69.7 62.7 51.2 57.4 60.2
Quadratic SVM 62.3 54.5 57.4 59.4 59.8 63.9 57 52.9 53.3 56.1 54.1 50.8 58.2 72.1 69.3 51.6 55.7 61.9
S9 kNN 59 53.3 57 52.9 63.9 59 59.8 51.2 48 58.2 45.1 51.2 53.7 72.1 55.3 49.6 45.9 49.2
Linear SVM 63.9 57.4 58.2 58.6 54.9 62.7 55.7 54.9 50.4 54.9 54.5 51.6 62.3 70.5 63.1 49.2 55.7 60.7
Quadratic SVM 61.5 59.4 55.3 59.8 61.9 63.1 58.2 52.5 50 53.7 58.2 48.4 57 73 66 43 55.3 61.5
Ave kNN 59.5 53.55 58.14 53.3 54.56 54.3 56.89 52.19 53.9 53.39 52.12 53.2 56.0 61.83 57.39 55.6 50.17 53.27
Linear SVM 64.5 58.98 59.42 57.9 57.99 58.8 56.07 56.23 55.3 57.12 55.89 55.9 61.6 62.63 61.14 57 57.18 56.29
Quadratic SVM 62.3 58.16 59.67 57.8 59.06 60.4 57.72 55.4 55.8 53.39 56.12 54.7 59.4 63.4 63.97 55.82 57.65 60.35

TABLE V. CLASSIFICATION ACCURACY OBTAINED USING 18 CHANNELS AND 3 CLASSIFICATION METHODS FOR THETA, ALPHA, AND BETA BANDS
Sub Classifier FP1 FP2 F4 F3 F8 F7 FZ T4 T3 C4 C3 T6 T5 P4 P3 PZ O2 O1
S1 kNN 44.3 53.7 61.5 52.9 67.6 57 53.7 49.2 52 50.8 51.2 51.6 67.2 46.7 53.3 48 52 66.8
Linear SVM 57.8 65.6 57 60.7 71.7 61.5 50.8 61.9 59.8 52.9 61.1 62.3 77 57.8 54.5 59.8 57.4 73.4
Quadratic SVM 55.3 64.8 53.3 61.1 70.5 60.7 50.8 60.2 58.6 54.1 57 57 78.3 56.6 51.6 57.4 56.6 73.8
S2 kNN 43.4 53.3 59.8 52 69.7 59.4 51.2 48.4 49.2 52.9 50.8 50 68 48.8 52.9 51.2 51.2 66.4
Linear SVM 61.1 67.2 57 62.3 72.5 61.5 53.3 63.1 61.1 54.5 62.3 63.9 76.2 59 52.5 54.9 55.7 75.4
Quadratic SVM 54.5 65.6 57.4 58.6 69.7 61.5 46.3 58.6 60.2 54.1 56.6 60.2 78.3 54.5 48.4 53.7 54.5 73.8
S3 kNN 59.8 60.2 57.4 48 59.4 59.8 55.7 54.9 45.5 62.3 61.9 56.6 56.1 59.8 52.9 54.5 54.9 47.1
Linear SVM 62.7 61.5 50 55.7 59.8 58.2 59.4 59 52.5 64.8 66 56.6 54.5 63.9 58.2 59.8 68 56.6
Quadratic SVM 52.5 66.4 48 58.2 58.6 56.6 59.8 57 47.5 61.9 64.8 51.2 50.8 60.2 57 63.1 65.6 57
S4 kNN 58.7 56.2 66.1 65.7 58.7 48.8 68.6 55 55 68.2 46.3 56.2 52.9 50.8 51.7 52.9 56.6 52.9
Linear SVM 64.5 66.5 70.2 73.6 59.1 60.3 72.3 56.6 57.4 76.4 52.1 70.2 59.1 52.9 57.4 58.3 59.5 50
Quadratic SVM 63.2 66.1 71.9 69.4 60.7 57.4 70.2 53.7 59.1 73.6 50.4 69 61.6 52.9 52.1 58.7 63.6 52.9
S5 kNN 43.9 50 56.1 53.3 42.6 45.9 52 65.2 52.9 54.1 56.6 64.3 54.5 75.4 48.4 52 54.1 51.6
Linear SVM 50 51.2 56.6 48.8 49.6 47.1 47.5 54.1 48.8 47.1 43.4 45.9 43.4 80.7 50 44.3 49.6 51.2
Quadratic SVM 50 50 60.7 49.2 54.9 51.2 49.6 59 49.2 50.4 46.3 48.8 48.8 60.7 48.8 48.4 47.5 50
S6 kNN 59.8 58.2 53.7 50 50 60.2 57 57.4 50.4 60.7 53.7 53.7 49.2 49.2 51.2 45.9 56.6 52.5
Linear SVM 51.6 53.7 51.2 60.7 60.7 60.2 52.9 41.8 47.5 72.1 57.4 49.6 48 48.8 56.6 55.3 55.3 50.8
Quadratic SVM 61.1 55.3 62.3 65.6 55.7 57.4 59.8 58.2 51.2 71.3 58.6 48.8 54.1 53.7 52 55.3 57.4 52
S7 kNN 91 49.2 59 59.4 56.1 63.5 51.6 52.9 58.2 57 49.2 60.2 48.8 60.2 98 52.9 50.4 55.7
Linear SVM 94.3 59.8 51.6 68.4 67.6 55.7 55.3 57.8 61.5 57.4 49.6 58.2 50.8 56.6 84.8 54.1 61.5 52.9
Quadratic SVM 93 59 59 70.1 66 69.7 55.7 54.5 65.6 60.2 53.3 53.7 52.5 64.8 99.2 54.5 59.8 59.4
S8 kNN 59.4 57.8 45.5 49.6 58.2 59.4 62.3 55.7 49.2 60.2 67.6 64.3 67.6 75 61.9 47.5 41.8 51.2
Linear SVM 67.2 66 48.4 60.2 55.7 50.8 52.9 57 56.6 68 70.9 65.2 66 75.8 65.2 53.7 52.9 58.2
Quadratic SVM 59.8 62.3 50 60.7 55.7 52 45.1 51.6 60.2 71.3 69.3 68 69.7 78.7 70.5 53.3 58.2 57.8
S9 kNN 63.1 45.5 54.5 58.6 56.1 52.9 53.3 69.7 50.4 50.4 61.9 52.9 49.6 48.8 58.6 41 50.8 50.4
Linear SVM 50.8 52.9 49.6 50.8 51.6 50.4 49.6 65.2 53.3 53.7 50.4 51.2 48 49.6 57.4 50 51.6 50.8
Quadratic SVM 52 50 53.3 49.6 50.4 53.7 53.3 65.6 48.4 51.6 52 49.6 48.8 47.5 50.8 48 46.3 52.5
Ave kNN 58.15 53.79 57.06 54.39 57.6 56.3 56.16 56.49 51.43 57.4 55.4 56.6 57.1 57.19 58.77 49.55 52.05 54.96
Linear SVM 62.23 60.49 54.63 60.14 60.93 56.1 54.89 57.39 55.39 60.77 57.0 58.1 58.12 60.57 59.63 54.47 56.84 57.7
Quadratic SVM 60.16 59.94 57.33 60.28 60.25 57.8 54.52 57.6 55.56 60.95 56.4 56.2 60.33 58.85 58.94 54.72 56.62 58.8


TABLE VI. CLASSIFICATION ACCURACY OBTAINED USING 18 CHANNELS AND 3 CLASSIFICATION METHODS FOR ALPHA, BETA, AND GAMMA BANDS
Sub Classifier FP1 FP2 F4 F3 F8 F7 FZ T4 T3 C4 C3 T6 T5 P4 P3 PZ O2 O1
S1 kNN 56 55.2 67.9 46.4 57.9 54.8 48.4 57.5 66.7 64.7 51.2 63.9 75.5 65.9 63.5 58.7 90.1 63.9
Linear SVM 63.5 65.1 57.1 52 59.9 63.1 59.9 60.3 78.2 60.7 50.4 79 65.1 70.2 60.7 62.7 91.3 66.3
Quadratic SVM 64.7 65.1 56 59.1 61.5 66.3 59.1 58.3 77.8 75.4 48 78.6 68.7 71 61.1 67.5 90.5 69.8
S2 kNN 50.8 70.1 57.4 48.4 67.6 54.9 48 53.7 53.7 49.6 51.6 54.5 59 53.7 49.2 52 55.7 57.8
Linear SVM 64.8 78.3 52.5 59.8 72.1 61.9 53.3 68 67.6 52.5 54.5 68 65.2 61.5 50.8 55.7 62.3 75.4
Quadratic SVM 62.7 79.9 46.7 61.1 71.7 57.4 51.2 65.6 65.2 57 59.4 70.1 66.8 58.2 51.6 55.3 59.4 73
S3 kNN 59.4 62.7 57 49.2 54.5 59.4 54.5 54.5 46.3 60.7 59.4 49.6 54.9 57.4 53.7 56.6 64.3 54.9
Linear SVM 60.7 63.5 52.9 57 59.4 58.6 57 57.8 55.3 64.3 66.8 55.7 53.7 64.8 59 60.7 68 57.4
Quadratic SVM 69.3 64.8 50.4 59.8 58.2 54.1 59.8 61.1 52.9 63.9 63.1 53.3 49.6 60.7 60.2 61.5 61.1 59.4
S4 kNN 59.5 57.9 69 63.2 64.5 50 65.3 57.4 62.4 70.7 60.7 71.5 73.1 51.7 64.5 48.3 57 62
Linear SVM 62.4 66.1 73.6 74.8 68.2 57.4 72.7 69 65.3 76 56.2 76 82.6 51.7 68.2 59.5 57 47.9
Quadratic SVM 66.1 66.1 71.9 73.6 68.2 62.4 74.8 69.4 71.1 72.3 57.4 76 78.9 51.7 66.1 53.3 63.2 50.4
S5 kNN 50 58.2 53.3 54.9 50.8 50.8 51.2 57 52 49.2 59.4 71.7 56.1 53.7 52.9 56.1 68.4 49.2
Linear SVM 47.1 49.6 60.2 46.3 49.2 43.4 45.9 50.4 52.9 46.3 50.8 46.7 44.3 47.1 49.2 50 48.4 50
Quadratic SVM 50 50 52.5 48 51.2 52 48.4 54.1 51.2 48.8 54.1 49.6 52.5 45.9 50 49.2 48 50.8
S6 kNN 50.8 50 60.7 51.6 54.5 49.2 53.7 56.6 49.6 61.5 54.1 59 54.9 46.7 43 52 52.9 52.5
Linear SVM 54.1 52.9 50.8 58.6 58.2 59 56.6 48 57 69.7 54.5 52.9 50 58.2 56.6 57 57 49.6
Quadratic SVM 49.6 49.6 61.1 59.8 61.1 57.4 55.3 55.7 54.9 70.1 51.6 54.1 53.7 52.9 52.9 55.7 56.1 53.3
S7 kNN 91.4 51.6 53.7 65.2 54.1 54.5 47.1 57 53.3 60.7 53.7 58.6 57.8 59.8 99.2 50.8 52 55.7
Linear SVM 95.1 59.8 57.8 74.2 60.7 60.7 61.1 56.1 61.9 67.6 51.6 61.1 60.2 62.3 52.9 49.2 63.1 62.7
Quadratic SVM 94.3 62.7 61.1 61.9 67.6 64.8 57.8 48.8 63.5 66.4 47.1 54.5 54.9 65.6 98.8 58.2 61.5 64.8
S8 kNN 61.5 61.1 49.2 55.3 58.2 50.8 53.3 50.8 54.1 57.4 69.7 66.4 66 71.7 67.2 49.2 49.2 50
Linear SVM 65.2 68 46.7 57.8 52.9 43.9 52 57.8 60.2 70.9 78.3 66.8 66.8 77.9 66.4 60.7 62.7 49.2
Quadratic SVM 63.9 65.6 50.8 58.2 61.1 55.7 50.4 51.2 60.2 70.9 76.2 68 69.7 64.3 73.8 54.9 58.6 49.2
S9 kNN 59 47.1 52.9 54.5 59.4 52.9 52.5 75.8 49.2 50 63.5 49.2 50.8 63.5 56.6 43.98 54.1 48.8
Linear SVM 51.2 52 53.7 53.3 52.5 50.8 51.6 68.4 51.2 52.9 48.8 51.2 49.2 53.7 55.3 50.4 51.6 49.2
Quadratic SVM 52.9 47.1 54.1 50.4 50.4 49.6 50 67.2 52 57.4 51.2 48 49.2 46.7 48.8 48 49.6 48.8
Ave kNN 59.8 57.1 57.9 54.3 57.9 53.0 52.67 57.8 54.15 58.28 58.15 60.49 60.9 58.2 61.09 51.96 60.42 54.98
Linear SVM 62.6 61.7 56.15 59.32 59.24 55.43 56.68 59.5 61.07 62.33 56.88 61.94 59.6 60.8 57.68 56.22 62.38 56.42
Quadratic SVM 63.7 61.2 56.07 59.1 61.23 57.75 56.32 59.0 60.98 64.69 56.46 61.36 60.4 57.4 63.32 55.96 60.89 57.73

REFERENCES

[1] Q. Wang, O. Sourina, and M. Nguyen, "EEG-based "Serious" Games Design for Medical Applications," 2010 International Conference on Cyberworlds, pp. 270-276, 2010.
[2] B. Rebsamen, E. Burdet, C. Guan, H. Zhang, C. L. Teo, Q. Zeng, et al., "A brain-controlled wheelchair based on P300 and path guidance," pp. 1101-1106, 2006.
[3] D. C. Hammond, "What is neurofeedback?," Journal of Neurotherapy, vol. 10, pp. 25-36, 2006.
[4] J. Carmena, M. Lebedev, R. Crist, J. O'Doherty, D. Santucci, D. Dimitrov, P. Patil, C. Henriquez, and M. A. L. Nicolelis, "Learning to control a brain-machine interface for reaching and grasping by primates," PLoS Biol. 1, pp. 193-208, 2003.
[5] J. Collinger, B. Wodlinger, J. Downey, W. Wang, E. Tyler-Kabara, D. Weber, A. McMorland, M. Velliste, M. L. Boninger, and A. B. Schwartz, "High-performance neuroprosthetic control by an individual with tetraplegia," Lancet 381, pp. 557-564, 2013.
[6] X. Huang, S. Altahat, D. Tran, and D. Sharma, "Human Identification with Electroencephalogram (EEG) Signal Processing," ISCIT, pp. 1021-1026, 2012.
[7] W. Mumtaz, L. Xia, A. Malik, and M. Yasin, "EEG Classification of Physiological Conditions in 2D/3D Environments Using Neural Network," Annual International Conference of the IEEE EMBS, Osaka, Japan, 2013.
[8] K. Li, X. Zhang, and Y. Du, "A linear SVM based classification of EEG for predicting the movement intent of human body," 10th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI), pp. 402-406, 2013.
[9] N. Sulaiman, Ch. Chee Hau, A. Abdul Hadi, M. Mustafa, and Sh. Jadin, "Interpretation of Human Thought Using EEG Signals and LabVIEW," IEEE International Conference on Control System, Computing and Engineering, pp. 384-388, 2014.
[10] H. R. Khairuddin, A. S. Malik, W. Mumtaz, N. Kamel, and L. Xia, "Analysis of EEG Signals Regularity in Adults during Video Game Play in 2D and 3D," 35th Annual International Conference of the IEEE EMBS, Osaka, Japan, pp. 2064-2067, 2013.
[11] H. R. Khairuddin, A. S. Malik, and N. Kamel, "EEG Topographical Maps Analysis for 2D and 3D Video Game Play," IEEE, 2014.
[12] S. Kim and D. Kim, "Differences in the Brain Waves of 3D and 2.5D Motion Picture Viewers," 2012.
[13] Y. Jin, J. O'Halloran, L. Plon, C. A. Sandman, and S. G. Potkin, "Alpha EEG predicts visual reaction time," Int. J. Neurosci. 116, pp. 1035-1044, 2009.
[14] E. Callaway and R. S. Layne, "Interaction between the visual evoked response and two spontaneous biological rhythms: The EEG alpha cycle and the cardiac arousal cycle," Annals of the New York Academy of Sciences 112, pp. 421-431, 1964.
[15] https://www.youtube.com/watch?v=g7ps5TWzJ-o
[16] L. J. Cao, K. S. Chua, W. K. Chong, H. P. Lee, and Q. M. Gu, "A comparison of PCA, KPCA and ICA for dimensionality reduction in support vector machine," Neurocomputing, 55, pp. 321-336, 2003.
[17] M. Shaker, "EEG Waves Classifier using Wavelet Transform and Fourier Transform," World Academy of Science, Engineering and Technology, pp. 723-728, 2007.
[18] N. Manshouri and T. Kayikcioglu, "Classification of 2D and 3D videos based on EEG waves," Signal Processing and Communication Application Conference (SIU), 2016.
[19] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, second edition, Wiley-Interscience, 2001.

AUTHORS PROFILE

Negin Manshouri received the B.Sc. degree in Telecommunication Engineering from Islamic Azad University in 2010. She has practical experience in the fields of microwave and mobile communication as an antenna designer. Her research interests include the design and analysis of different kinds of microstrip and ultra-wideband antennas, as well as biomedical engineering. She received the M.S. degree in Telecommunication Engineering from Islamic Azad University in 2013. She is currently working toward the Ph.D. degree in Biomedical Engineering at Karadeniz Technical University, Trabzon, Turkey.


Masoud Maleki received his Master degree in Electronic Engineering from Azad University, Iran, in 2010, and the B.Sc. degree in Telecommunication Engineering from Azad University, Iran, in 2007. He is currently pursuing his Ph.D. in Biomedical Engineering at Karadeniz Technical University, Trabzon, Turkey. His research interests are Signal and Image Processing and Brain-Computer Interfacing.

Temel Kayıkçıoğlu received the Ph.D. degree in Electrical Engineering from Texas Tech University, USA, in 1993, the M.S. degree in Electrical Engineering from Karadeniz Technical University, Trabzon, Turkey, in 1986, and the B.Sc. degree in Electrical Engineering from Karadeniz Technical University, Trabzon, Turkey, in 1984. He is a Professor in the Department of Electrical and Electronics Engineering, Karadeniz Technical University, Trabzon, Turkey. His research interests include Signal and Image Processing, Image Reconstruction, Medical Image and Signal Processing, Computational Neuroscience, and Brain-Computer Interfacing. He has authored and co-authored many research papers in international journals and has presented his research work at various international conferences.


A New Brain-Computer Interface System Based on Classification of the Gaze on Four Rotating Vanes
Masoud Maleki, Negin Manshouri, Temel Kayıkcioglu
Department of Electrical and Electronics Engineering
Karadeniz Technical University
Trabzon, Turkey
[email protected], [email protected], [email protected]

Abstract—A brain-computer interface (BCI) is a device that enables direct communication between humans and computers by analyzing neural signals and transforming them into digital signals. This study presents a novel BCI system based on gaze on rotating vane-dependent EEG signals. This BCI system proposes to identify, from EEG signals, four different rotating vanes that represent commands in a limited visual space. The rotating vanes have these specifications: the first vane rotates slowly in an anti-clockwise manner, the second vane rotates fast in an anti-clockwise manner, the third vane rotates slowly in a clockwise manner, and the fourth vane rotates fast in a clockwise manner. All the signals were obtained at the Department of Electrical and Electronics Engineering, Karadeniz Technical University, from 4 healthy human subjects between 25 and 32 years old. The features are extracted from 1-sec epochs of the EEG using the Fast Fourier Transform (FFT). We use k-Nearest Neighbor (k-NN) and Support Vector Machine (SVM) algorithms to classify these features. Our results demonstrated that SVM was more accurate than k-NN. The proposed algorithm is efficient in the classification phase, with an obtained mean accuracy of 81.51% for 4 subjects in 1-sec epochs.

Keywords—Brain-computer interface; Support Vector Machine; Electroencephalography; Feature extraction; Fast Fourier transform; k-nearest neighbor algorithm.

I. INTRODUCTION

A brain-computer interface (BCI) is a technology which provides a direct connection pathway between the brain of a physically disabled patient and an external device or computer. BCI research aims to give physically disabled patients a non-muscular way to communicate with others, such as a spelling system for speech or writing a letter, and to control an external device such as an environmental control system.

BCI systems have developed rapidly in recent years, because these systems may be the only possible solution for people who are unable to communicate via conventional means because of severe motor disabilities [1, 2, 3]. In the past few decades, noninvasive brain imaging methods have commonly been employed in BCIs. Electroencephalography (EEG) signals are often used in BCI systems in the field of biomedical engineering. EEG has advantages such as lower risk and being inexpensive and easily measurable, so it can be applied and tested on a large human population [2, 4]. In addition, EEG is an electrical signal with high temporal resolution, generated by neuronal dynamics and recorded from the scalp. A BCI system therefore records these brain signals and translates them into artificial outputs or commands. In other words, features of EEG signals can act in the real world.

Although BCI development is a very young research area, many BCI-based methods have been proposed in the literature. One of the famous methods uses "Steady-State" Visually Evoked Potentials (SSVEPs). When the retina is excited by a stimulus that flashes at a frequency higher than 6 Hz, the brain generates electrical activity at the same frequency or at its multiples (harmonics). The stimulus produces a stable Visual Evoked Potential (VEP) in the human visual system, called the "Steady-State" Visually Evoked Potential (SSVEP). In this paradigm, to produce such potentials, the user gazes at a target block that flickers (for example, using LEDs) at a certain frequency on the screen [5]. Flickering stimuli of different frequencies with constant intensity can elicit SSVEPs with maximum amplitude in the low (5-12 Hz), medium (12-25 Hz) and high (25-50 Hz) frequency bands, separately [6, 7]. Another famous method is the Mental Task BCI. In this paradigm, users think of different mental tasks, so that different tasks activate different areas of the brain. Multi-channel EEG recordings are needed to recognize the distinct EEG patterns that differentiate the tasks. In a recent study [8], researchers used EEG to control an electronic device. In that paper, the classification of a three-class mental task-based brain-computer interface (BCI) was presented. The Hilbert-Huang transform was used as the feature extractor, and a fuzzy particle swarm optimization with cross-mutation-based artificial neural network as the classifier. The three relevant mental tasks were letter composing, arithmetic, and rolling a Rubik's cube forward, which meant left, right, and forward commands to the wheelchair, respectively. Oddball paradigms have been used in BCIs to generate event-related potentials (ERPs), like the P300 wave, on the targets selected by the user. The P300 visual evoked potential (VEP) is another kind of EEG response, extracted around 300-600 ms after the onset of a visual stimulus. A P300 speller was based on this principle, in which the detection of P300 waves allows the user to write characters. A new method for the detection of P300 waves was presented by Hubert et al. [9], based on a convolutional neural network (CNN). The proposed method has


detected P300 waves in the time domain. The monitoring of eye movement could help subjects to communicate with their environment and control devices. A number of techniques have been used to discern eye movements [10, 11, 12]. In recent research, Abdelkader et al. [13] proposed a simple algorithm for the offline recognition of four directions of eye movement from EEG signals. A single trial was used to make a decision. The proposed algorithm obtained accuracies of 50-85% for twenty subjects.

In this paper, a new fast and simple brain–computer


interface system based on the gaze on rotating vane-dependent
EEG signals was presented. Speed and simplicity in BCI
systems are very important factors. The proposed method can be
used in a biomedical engineering application to control an
electronic device, such as an electronic wheelchair or a robotic
arm. Clinically, physicians could become aware of the subject's
state using this method.

The organization of this paper is as follows: after the
introduction section, the experimental setup is provided. Then,
pre-processing, feature extraction and classification are
described, respectively. In the fifth section, the results are
provided. The conclusion and discussion are given in the sixth
section.

II. METHODS

A. Subjects

EEG signals were obtained from four subjects (3 males and
1 female) aged between 25 and 32 years at the Department of
Electrical and Electronics Engineering, Karadeniz Technical
University. The volunteers were labeled s1, s2, s3, and s4. None
had previous experience in using a BCI system. All measurements
were noninvasive and the volunteers were free to withdraw at any
time. Before selecting the volunteers, as a precaution, we asked
them about visual problems, headaches, family history of epilepsy
and problems related to brain damage. The subjects did not report
any problems.

B. Equipment and Setup of Stimulation Unit

For the development of the BCI system, the EEG signals
were acquired with a Brain Quick EEG System (Micromed, Italy).
The EEG signals were sampled at 512 samples/s and filtered
between 0.1 and 120 Hz by a band-pass filter. A 50 Hz notch
filter was also used to eliminate line noise. The electrodes were
placed on the scalp at locations based on the international 10-20
system. Eighteen EEG electrodes covering all lobes of the brain
were located according to this system and referenced to the
electrode Cz. These electrodes included Fp1, Fp2, F7, F3, Fz,
F4, F8, C3, C4, T3, T4, P3, P4, T5, T6, Pz, O1 and O2. The chair
was placed 1 m in front of the monitor. Fig. 1 shows the
experimental framework and tools.

Fig. 1. Experimental framework and tools for EEG recordings

Using Matlab 2014a, four rotating red vanes on a black
screen were designed. Under each vane, the letter 'A' was
written in white. The speed and direction of the rotation could be
controlled. Two rotation speeds were defined: one rotation per 5
sec (called slow rotating) and one rotation per 1 sec (called fast
rotating). The rotating vanes have these specifications, in order:
the first vane rotates slowly in an anti-clockwise manner, the
second vane rotates fast in an anti-clockwise manner, the third
vane rotates slowly in a clockwise manner and the fourth vane
rotates fast in a clockwise manner. A screenshot of the rotating
vanes is shown in Fig. 2.

Fig. 2. Rotating vanes designed by Matlab 2014a

C. Experimental Tasks

Before beginning to record, the subjects were asked to
calm down and relax on a chair for ten seconds. EEG recording
was done in four sessions. In each session we asked the subjects
to gaze at a vane for 4 min. In the first session, each subject
gazed at the clockwise rotating vane at slow speed. There was
a 2-min gap for relaxation. Afterwards, the subject was asked
to gaze at the anti-clockwise rotating vane at fast speed and,
after 2 min of relaxation, in the third session, the subject gazed
at the anti-clockwise rotating vane at slow speed. Finally, in the
fourth session, the subject gazed at the remaining vane. To
synchronize, before finishing each relaxation time, a beep
was issued.
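The stimulus configuration above (four vanes, two directions, two speeds) can be captured in a few lines. The original stimulus was implemented in Matlab 2014a, so the following Python fragment is only an illustrative sketch, and the names `VANES` and `vane_angle` are our own, not from the paper.

```python
import math

# (direction, period in seconds); direction -1 = anti-clockwise, +1 = clockwise.
# Slow = one rotation per 5 sec, fast = one rotation per 1 sec, ordered as in
# the text: vane 1 slow ACW, vane 2 fast ACW, vane 3 slow CW, vane 4 fast CW.
VANES = [(-1, 5.0), (-1, 1.0), (+1, 5.0), (+1, 1.0)]

def vane_angle(vane, t):
    """Rotation angle (radians, wrapped to [0, 2*pi)) of a vane at time t."""
    direction, period = VANES[vane]
    return (direction * 2.0 * math.pi * t / period) % (2.0 * math.pi)
```

For example, a fast vane returns to its starting angle every second, while a slow clockwise vane has turned a quarter circle after 1.25 s.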

438 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

D. Data Analysis

In these four sessions, the generated signals (separately for
each channel) were divided into 1-sec epochs. In this way,
240*4 epochs (240 epochs for each session) were generated per
subject. The epochs of each session were divided randomly into
two groups. The first group was called the training set (which
contained 120 epochs) and the second group was called the
testing set (which contained 120 epochs). The composition of the
data set is described in Table 1. To verify the results,
classification was repeated 10 times, and each time different
distributions of the training and testing sets were used.
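As a concrete illustration of the epoching and splitting described above (512 samples/s, 1-sec epochs, and a random half/half division into training and testing sets), here is a minimal Python sketch; the function names and the fixed random seed are our own choices, not from the paper.

```python
import random

FS = 512  # sampling rate (samples/s) of the recording setup

def make_epochs(signal, fs=FS):
    """Cut one channel's signal into non-overlapping 1-second epochs."""
    return [signal[i * fs:(i + 1) * fs] for i in range(len(signal) // fs)]

def split_train_test(epochs, seed=0):
    """Randomly divide the epochs of a session into two equal halves."""
    idx = list(range(len(epochs)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    return ([epochs[i] for i in idx[:half]],   # training set
            [epochs[i] for i in idx[half:]])   # testing set

# A 4-min session sampled at 512 samples/s gives 240 one-second epochs,
# split into 120 training and 120 testing epochs; four sessions give
# 240*4 = 960 epochs per subject and channel.
```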

TABLE I. SELECTION DESCRIPTION OF THE DATA SET FOR ONE SUBJECT IN A CHANNEL

Epoch length    Total                Per session               Split per session
1-sec epochs    960 epochs in total  240 epochs for each of    120 epochs for the training set,
                                     sessions 1, 2, 3 and 4    120 epochs for the test set

Fig. 3. Flow chart of the designed system

E. System Design for Classification of EEG

Fig. 3 shows the flowchart of the proposed EEG classification
method, which includes three parts: 1) pre-processing; 2) feature
extraction; and 3) classification. Each of these parts is explained
in the following.

F. Preprocessing

The amplitude of the signals can directly influence the
classification performance. Therefore, a normalization process
was applied to each epoch in order to reduce the impact of the
magnitude change. In this paper, a mean normalization process
was applied to each epoch as in (1) [14].

X_N = (x - x̄) / max|x - x̄|                                   (1)

Here x, x̄, and X_N denote the original epoch, the mean of the
original epoch and the normalized epoch, respectively.

III. FEATURE EXTRACTION

A. Fast Fourier Transform (FFT)

The Fourier transform is a method to convert time-domain
signals into the frequency domain; it is defined in (2). The
Discrete Fourier Transform (DFT) converts discrete-time
sequences into discrete-frequency versions, as given in (3). The
DFT of discrete-time signals is widely used for spectrum analysis.

X(f) = F{x(t)} = ∫_{-∞}^{+∞} x(t) e^{-j2πft} dt               (2)

X_k = Σ_{i=0}^{n-1} x_i e^{-j2πik/n},  k = 0, 1, …, n-1       (3)

where in (2), x(t) is the time-domain signal and X(f) is its Fourier
transform; in (3), x is the input sequence, X is its DFT, and n is
the number of samples [15]. The FFT is an optimized
implementation of the DFT, because the DFT is computationally
very intensive [16].

In this study, the generated epochs were used for extracting
features. As is known, there are five frequency rhythms in EEG
signals: the delta band (0-4 Hz, up to 75 microvolts in amplitude),
the theta band (4-7 Hz, 50-75 microvolts), the alpha band (8-12
Hz, 20-60 microvolts), the beta band (13-30 Hz, 2-20 microvolts),
and the gamma band (30-49 Hz, 20-60 microvolts) [17]. These
bands were extracted by the fast Fourier transform (FFT) method.
In this paper, we used the fft() function in Matlab for the
detection of the EEG signal bands. The mean of the absolute
power of the FFT in each band of an epoch was used as a feature.
In this way, for each epoch in one channel, five features were
extracted and, as mentioned, 18 channels were used. The number
of channels and the selection of channels may play an important
role in our study. From this study it appears that gaze and
perception capability are not consistent in every subject.
Therefore, caution should be taken about which channel is used
for the purposes of our BCI system.

IV. CLASSIFICATION

An algorithm that has been trained with labelled training
samples so as to be able to assign new unlabeled samples to one
of a fixed set of classes is called a classifier. In this study, we
have a four-class classification problem (i.e., a chance


level of 25%). Many multi-class classifiers have recently been
used in difficult pattern recognition problems with great success.
The Support Vector Machine (SVM) is a popular machine
learning method for classification, regression, and other learning
tasks. The k-NN algorithm is another multi-class classifier. The
proposed algorithm was compared with the multi-SVM and
k-NN algorithms. A summary of these algorithms is given below.

A. k-NN Algorithm

The k-NN is one of the easiest algorithms to implement
among the existing classification algorithms. First, in this
algorithm, the number of nearest neighbors to the unknown
sample must be determined. The Euclidean distance is commonly
used to calculate the nearest neighbors to the sample. Then, the
most frequent label among these neighbors is determined and the
unknown sample is given that label. In binary classification
problems, it is beneficial to use odd values of k, because they
avoid ties when deciding on a label [17].

In this study, to determine the optimum k value, the K-fold
cross validation (K-FCV) technique was used. The minimum
number of epochs in the training set for each speed was 120; so,
the optimum k value was searched in the interval between 1 and
50 with a step size of 2.

B. Support Vector Machine (SVM) Algorithm

Among the many methods for solving classification problems,
the support vector machine (SVM) is one of the most popular
supervised learning algorithms due to its generalization ability
[18]. SVM adopts a nonlinear kernel function to transform the
input data into a higher-dimensional feature space, where
classification can be formulated as a quadratic optimization
problem. In the iterative learning process of SVM, the optimal
hyperplane with the maximum margin between the classes in the
higher-dimensional feature space is searched for. We utilized
SVM with a radial basis function kernel. We chose this kernel
because the number of its hyperparameters is smaller than those
of other kernels. This kernel function is specified by the scaling
factor σ. To find the best σ value, we searched the interval
between 0.1 and 4.5 with a step size of 0.2. To determine the
optimum σ value, the K-FCV technique was used.

V. RESULTS

As mentioned above, in this study we have a four-class
classification problem. To explore the problem, classification
was done in two sections. In the first section, to reduce the
number of channels and to understand which channels perform
best in classification, we classified the 18 channels separately. In
the second section, the seven channels with the maximum
accuracy in the previous section were selected. By using these
channels together, we improved the performance of the proposed
method.

A. One Channel Classification

In the first section, the EEG signal of a channel was divided
into 1-sec epochs. After normalization, the 960 epochs were
divided into training and testing sets. For each epoch from the
training set, five features were extracted using the FFT. We then
trained the classifiers separately, using K-FCV to calculate the
classifier parameters. After training the classifiers, five features
were extracted using the FFT for each epoch from the testing set.
We then classified these features and calculated the classification
result (CR) for each classifier separately. The flow chart of the
designed system is shown in Fig. 3. The classification result was
defined as the percentage of the number of epochs classified
correctly over the size of the testing set. To verify the results,
this method was repeated 10 times with different distributions of
the training and testing sets. Each time, the training and testing
sets were selected randomly, and the classifier parameters (k for
k-NN and σ for SVM) were calculated for each training set using
K-FCV, separately. The means of the classification results over
these 10 runs (for each channel separately) and the standard
deviations of the classification results for the k-NN classifier are
provided in Table 2. As seen in Table 2, the mean of each
channel over the four subjects was also calculated. Similarly, the
results of multi-SVM classification for each channel are shown
in Table 3.

B. Multi-Channel Classification

In the second section, the seven channels that had the
maximum accuracy in the first section (for each classifier
separately) were selected. We used these seven channels together
to improve the performance of the proposed method. In this case,
we have 35 features (7*5) for each 1-sec epoch. These channels
are Fp1, Pz, T3, P4, O1, T4 and T6 for both classifiers. In this
way, the channel-reduction process is done. All the methods used
in the first section are also used in this section. We also selected
the 5, 4, 3 and 2 channels with the maximum accuracy in the
first section of classification. The means and standard deviations
of the classification accuracy for k-NN and SVM are shown in
Table 4 and Table 5, separately. As shown in the tables, the best
seven channels are the same for the two classifiers, which shows
that these channels play an important role in our study. But the
best two or three channels for classification are different for each
classifier. For example, channels Fp1 and Pz have better
performance with the k-NN classifier, while channels T3 and P4
are better with the SVM classifier.

On the other hand, the features of all channels for an epoch
were also used, so 90 (18*5) features were prepared for each
epoch. The results of these classifications are also shown in the
tables. The best classification accuracy is about 81.51%, when
all channels were used with the SVM classifier.
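Putting Sections F and III together — mean normalization as in (1), the DFT of (3), and the mean absolute FFT power in the five EEG bands — the per-epoch feature extraction can be sketched in pure Python. The paper used Matlab's fft(); here the DFT of (3) is written out directly, which is slow but follows the equation literally, and all names below are our own.

```python
import cmath

FS = 512          # sampling rate; a 1-sec epoch then has n = 512 samples
BANDS = {         # EEG bands (Hz) as listed in the text
    "delta": (0, 4), "theta": (4, 7), "alpha": (8, 12),
    "beta": (13, 30), "gamma": (30, 49),
}

def normalize(epoch):
    """Mean normalization, eq. (1): X_N = (x - mean(x)) / max|x - mean(x)|."""
    m = sum(epoch) / len(epoch)
    centered = [v - m for v in epoch]
    peak = max(abs(v) for v in centered) or 1.0  # guard for a constant epoch
    return [v / peak for v in centered]

def dft(x):
    """Direct DFT, eq. (3): X_k = sum_i x_i * exp(-j*2*pi*i*k/n)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * i * k / n) for i in range(n))
            for k in range(n)]

def band_features(epoch, fs=FS):
    """Five features per epoch: mean |X_k| over the DFT bins of each band.

    With a 1-sec epoch, bin k corresponds to exactly k Hz (resolution fs/n).
    """
    X = dft(normalize(epoch))
    n = len(epoch)
    feats = []
    for lo, hi in BANDS.values():
        bins = [abs(X[k]) for k in range(n // 2) if lo <= k * fs / n < hi]
        feats.append(sum(bins) / len(bins))
    return feats
```

Concatenating these five features over 18 channels gives the 90 (18*5) features per epoch used in the all-channel experiments.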
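Similarly, the evaluation loop of Sections IV and V — Euclidean k-NN with majority voting, the classification result (CR) as the percentage of correctly classified test epochs, and K-FCV selection of k over odd values from 1 to 49 — can be sketched as follows; this is an illustrative re-implementation under our own naming, not the authors' Matlab code.

```python
import math
from collections import Counter
from statistics import mean

def knn_predict(train_x, train_y, sample, k):
    """Label `sample` by majority vote among its k Euclidean nearest neighbors."""
    dists = sorted((math.dist(x, sample), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def classification_result(train_x, train_y, test_x, test_y, k):
    """CR: percentage of test epochs classified correctly."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return 100.0 * hits / len(test_y)

def select_k(train_x, train_y, folds=5, ks=range(1, 50, 2)):
    """Pick k by K-fold cross validation on the training set alone."""
    n = len(train_x)
    best_k, best_cr = None, -1.0
    for k in ks:
        fold_crs = []
        for f in range(folds):
            val = set(range(f, n, folds))  # every `folds`-th sample held out
            tx = [train_x[i] for i in range(n) if i not in val]
            ty = [train_y[i] for i in range(n) if i not in val]
            vx = [train_x[i] for i in sorted(val)]
            vy = [train_y[i] for i in sorted(val)]
            fold_crs.append(classification_result(tx, ty, vx, vy, k))
        if mean(fold_crs) > best_cr:
            best_cr, best_k = mean(fold_crs), k
    return best_k

# The tables report, per channel, the mean and standard deviation of the CR
# over 10 repetitions with freshly randomized training/testing splits
# (statistics.mean and statistics.stdev over the 10 CR values).
```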


TABLE II. RESULTS OF K-NN CLASSIFICATION FOR EACH CHANNEL


Subject 1 Subject 2 Subject 3 Subject 4 Mean of
each
Channels Mean std Mean std Mean std Mean std electrode
FP1 0,585417 0,021469 0,504167 0,014907 0,810417 0,014099 0,471667 0,028005 0,5929
C3 0,370417 0,011122 0,439583 0,0218 0,380833 0,012621 0,372917 0,029256 0,3909
FZ 0,350000 0,020177 0,362500 0,020338 0,500000 0,027536 0,362500 0,021225 0,3937
C4 0,424583 0,011506 0,416250 0,010249 0,363750 0,026418 0,428750 0,023227 0,4083
F4 0,286667 0,018317 0,305000 0,02571 0,540833 0,005229 0,413750 0,027457 0,3865
F3 0,415000 0,017763 0,391250 0,004974 0,486250 0,012638 0,356667 0,016231 0,4122
F7 0,388333 0,023274 0,402917 0,011932 0,487083 0,036126 0,371250 0,013929 0,4123
F8 0,304583 0,022755 0,387083 0,028565 0,620417 0,016043 0,342500 0,023726 0,4136
FP2 0,357500 0,007156 0,506250 0,018281 0,447500 0,015934 0,377083 0,015138 0,4220
T4 0,385000 0,013176 0,486250 0,015281 0,495833 0,018423 0,373750 0,021999 0,4352
T3 0,371667 0,005433 0,586250 0,012201 0,846667 0,013851 0,356667 0,031368 0,5403
T6 0,407917 0,01559 0,444583 0,011083 0,654167 0,012058 0,346250 0,020676 0,4632
T5 0,335417 0,015281 0,297500 0,026286 0,342500 0,024008 0,353333 0,012378 0,3321
P4 0,668333 0,022698 0,738333 0,013307 0,379583 0,021368 0,360417 0,01566 0,5366
P3 0,310417 0,017217 0,338333 0,023735 0,387917 0,026245 0,368333 0,002716 0,3512
Pz 0,583333 0,012148 0,612917 0,020338 0,727083 0,009838 0,601667 0,020958 0,6312
O2 0,378750 0,015408 0,378333 0,023552 0,527500 0,015548 0,365417 0,021072 0,4125
O1 0,547083 0,018078 0,479583 0,030562 0,537500 0,013819 0,360417 0,016002 0,4811

TABLE III. RESULTS OF MULTI-SVM CLASSIFICATION FOR EACH CHANNEL


Subject 1 Subject 2 Subject 3 Subject 4 Mean of
each
Channels Mean std Mean std Mean std Mean std electrode
FP1 0,487083 0,018946 0,362083 0,049573 0,411250 0,025962 0,517500 0,012638 0,4444
C3 0,329167 0,019874 0,412917 0,01995 0,359583 0,056281 0,375000 0,013176 0,3691
FZ 0,379583 0,012876 0,382083 0,015562 0,478750 0,017268 0,370833 0,005103 0,4028
C4 0,444583 0,015352 0,449583 0,02549 0,397083 0,018435 0,425000 0,007065 0,4290
F4 0,298333 0,009363 0,283333 0,01559 0,467500 0,011468 0,414583 0,021949 0,3659
F3 0,400417 0,008255 0,339583 0,027083 0,445000 0,049049 0,384167 0,020391 0,3922
F7 0,360833 0,02182 0,351667 0,023954 0,518750 0,012758 0,372917 0,011785 0,4010
F8 0,346250 0,024179 0,426250 0,007003 0,493750 0,003294 0,382917 0,011637 0,4122
FP2 0,402917 0,014328 0,416250 0,088577 0,429583 0,036306 0,371250 0,006815 0,4050
T4 0,412083 0,011748 0,459583 0,012004 0,503750 0,010458 0,378750 0,023413 0,4385
T3 0,472917 0,017861 0,567083 0,024322 0,457083 0,025069 0,477083 0,015095 0,4935
T6 0,392917 0,011063 0,402917 0,022273 0,565417 0,026963 0,404167 0,008961 0,4413
T5 0,317500 0,007454 0,307500 0,014176 0,335833 0,02345 0,378333 0,020968 0,3347
P4 0,485833 0,009501 0,537917 0,002083 0,487917 0,017217 0,457083 0,005705 0,4921
P3 0,339167 0,003423 0,352500 0,011354 0,397083 0,013709 0,360417 0,0216 0,3622
Pz 0,392500 0,0189 0,505833 0,032755 0,500417 0,018946 0,478333 0,008411 0,4692
O2 0,377083 0,020252 0,374167 0,033773 0,494167 0,015267 0,377917 0,007454 0,4151
O1 0,495833 0,005512 0,430417 0,022945 0,534583 0,01355 0,379167 0,005103 0,4600

TABLE IV. RESULTS OF MULTI-CHANNEL CLASSIFICATION FOR K-NN CLASSIFIER


Channels Subject 1 Subject 2 Subject 3 Subject 4 Average
Mean std Mean std Mean std Mean std
All channels 0,7379 0,0175 0,6443 0,0109 0,7141 0,0244 0,7025 0,0143 0,6997
Fp1,Pz,T3,P4,
O1,T4,T6 0,7091 0,0207 0,5850 0,0092 0,7033 0,0197 0,7120 0,0175 0,6773
Fp1,Pz,T3,P4,O1 0,7091 0,0244 0,6258 0,0265 0,6662 0,0226 0,7241 0,0120 0,6813
Fp1,Pz,T3,P4 0,7091 0,0242 0,8070 0,0235 0,6370 0,0200 0,7242 0,0310 0,7192
Fp1,Pz,T3 0,6043 0,0135 0,7674 0,0187 0,8382 0,0030 0,6780 0,0320 0,7215
Fp1,Pz 0,6075 0,0184 0,7176 0,0189 0,7251 0,0121 0,6550 0,0127 0,6757

TABLE V. RESULTS OF MULTI-CHANNEL CLASSIFICATION FOR SVM CLASSIFIER


Channels Subject 1 Subject 2 Subject 3 Subject 4 Average
Mean std Mean std Mean std Mean std
All channels 0,7945 0,0147 0,8156 0,0162 0,9166 0,0358 0,7337 0,0198 0,8151
T3,P4,Pz,O1,
Fp1,T4,T6 0,7937 0,0288 0,8093 0,0132 0,8812 0,0058 0,6808 0,0159 0,7913
T3,P4,Pz,O1,Fp1 0,8000 0,0294 0,7812 0,0058 0,8572 0,0162 0,6416 0,0068 0,7700
T3,P4,Pz,O1 0,7792 0,0178 0,7708 0,0117 0,7872 0,0132 0,6279 0,0152 0,7286
T3,P4,Pz 0,6766 0,0388 0,7968 0,0014 0,8058 0,0213 0,6566 0,1209 0,7340
T3,P4 0,6554 0,0290 0,7570 0,0214 0,7604 0,0340 0,6400 0,0376 0,6841


VI. CONCLUSION AND DISCUSSION

BCI is a kind of communication system that enables the
control of devices or communication with others through the
brain's signal activities alone, without using motor activities.
This paper presented a novel approach for brain-computer
interface systems. A simple algorithm was developed for the
offline identification of the gazed rotating vane from EEG
signals without any training phase. The results showed that the
proposed algorithm is promising for real-time applications.

In the future, we would like to design a suitable BCI system
based on rotating vanes. Reducing the number of channels to
make the user more comfortable, and using different methods for
feature extraction and classification to improve the classification
result, will be pursued in our future work. The goal is a
non-invasive, asynchronous, fast, and simple BCI system based
on EEG, because a BCI system with these properties is very
suitable for practical machine control, inexpensive, and
potentially portable. Real-time control of a wheelchair, flying a
helicopter, or driving a car, and even designing a spelling
system, are our aims using the proposed algorithm.

REFERENCES

[1] J.R. Wolpaw, N. Birbaumer, W.J. Heetderks, D.J. McFarland, P.H.
Peckham, G. Schalk, E. Donchin, L.A. Quatrano, C.J. Robinson and T.M.
Vaughan, "Brain-computer interface technology: a review of the first
international meeting," IEEE Transactions on Rehabilitation Engineering,
vol. 8, no. 2, pp. 164-173, Jun. 2000.
[2] S.P. Kelly, E.C. Lalor, C. Finucane, G. McDarby and R.B. Reilly, "Visual
spatial attention control in an independent brain-computer interface,"
IEEE Transactions on Biomedical Engineering, vol. 52, no. 9, pp.
1588-1596, Sept. 2005.
[3] X. Gao, D. Xu, M. Cheng and S. Gao, "A BCI-based environmental
controller for the motion-disabled," IEEE Transactions on Neural Systems
and Rehabilitation Engineering, vol. 11, no. 2, pp. 137-140, June 2003.
[4] C.H. Chen, M.S. Ho, K. Shyu, K.C. Hsu, K.W. Wang and P.L. Lee, "A
noninvasive brain computer interface using visually-induced near-infrared
spectroscopy responses," Neuroscience Letters, vol. 580, pp. 22-26, Sept.
2014, ISSN 0304-3940.
[5] B. He, Neural Engineering, 2nd ed., Springer, 2013.
[6] Z. Wu, Y. Lai, Y. Xia, D. Wu and D. Yao, "Stimulator selection in
SSVEP-based BCI," Medical Engineering and Physics, 2008.
[7] D. Zhu, J. Bieger, G. Garcia Molina and R.M. Aarts, "A survey of
stimulation methods used in SSVEP-based BCIs," Computational
Intelligence and Neuroscience, Hindawi Publishing Corporation, 2010.
[8] C. Rifai, S.H. Ling, P. Hunter, Y. Tran and H.T. Nguyen,
"Brain-computer interface classifier for wheelchair commands using
neural network with fuzzy particle swarm optimization," IEEE Journal of
Biomedical and Health Informatics, vol. 18, no. 5, September 2014.
[9] H. Cecotti and A. Gräser, "Convolutional neural networks for P300
detection with application to brain-computer interfaces," IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 3,
March 2011.
[10] Q. Ji, H. Wechsler, A.T. Duchowski and M. Flickner, "Special issue: eye
detection and tracking," Computer Vision and Image Understanding, vol.
98, pp. 1-3, 2005.
[11] S. Kawato and N. Tetsutani, "Detection and tracking of eyes for
gaze-camera control," Image and Vision Computing, vol. 22, pp.
1031-1038, 2004.
[12] J. Kim, "A simple pupil-independent method for recording eye
movements in rodents using video," Journal of Neuroscience Methods,
vol. 138, pp. 165-171, 2004.
[13] N. Abdelkader, H. Hideaki, Y. Natsue, S. Duk and K. Yasuharu,
"Classification of four eye directions from EEG signals for
eye-movement-based communication systems," Journal of Medical and
Biological Engineering, 2014.
[14] T.H. Dat, L. Shue and C. Guan, "Electrocorticographic signal
classification based on time-frequency decomposition and nonparametric
statistical modeling," in Proc. 28th IEEE EMBS Annual International
Conference, New York City, USA, 2006, pp. 2292-2295.
[15] A.V. Oppenheim and R.W. Schafer, Discrete-Time Signal Processing,
Prentice-Hall, pp. 611-619, 1989.
[16] C.S. Burrus and T.W. Parks, DFT/FFT and Convolution Algorithms,
Wiley Interscience, New York, 1985.
[17] T. Kayikcioglu, M. Maleki and K. Eroglu, "Fast and accurate PLS-based
classification of EEG sleep using single channel data," Expert Systems
with Applications, vol. 42, pp. 7825-7830, 2015.
[18] V. Vapnik, Statistical Learning Theory, Wiley, New York, NY, USA,
1998.

AUTHORS PROFILE

Masoud Maleki received his Master's degree in Electronic
Engineering from Azad University, Iran, in 2010 and the B.Sc.
degree in Telecommunication Engineering from Azad
University, Iran, in 2007. He is currently pursuing his Ph.D. in
Biomedical Engineering at Karadeniz Technical University,
Trabzon, Turkey. His research interests are signal and image
processing and brain-computer interfacing.

Negin Manshouri received the B.Sc. degree in
Telecommunication Engineering from Islamic Azad University
in 2010. She has practical experience in the fields of microwave
and mobile communication as an antenna designer. Her research
interests include the design and analysis of different kinds of
microstrip antennas, ultra-wideband antennas, and biomedical
engineering. She received the M.S. degree in Telecommunication
Engineering from Islamic Azad University in 2013. She is
currently working toward the Ph.D. degree in Biomedical
Engineering at Karadeniz Technical University, Trabzon, Turkey.

Temel Kayıkçıoğlu received the Ph.D. degree in Electrical
Engineering from Texas Tech University, USA, in 1993, the
M.S. degree in Electrical Engineering from Karadeniz Technical
University, Trabzon, Turkey, in 1986, and the B.Sc. degree in
Electrical Engineering from Karadeniz Technical University,
Trabzon, Turkey, in 1984. He is Professor in the Department of
Electrical and Electronics


Engineering, Karadeniz Technical University, Trabzon,
Turkey. His research interests include signal and image
processing, image reconstruction, medical image and signal
processing, computational neuroscience, and brain-computer
interfacing. He has authored and co-authored many research
papers in international journals and has presented his research
work at various international conferences.


Decentralized Access Control With Anonymous
Authentication for Secure Data Storage on Cloud

Shraddha Mokle
Department of Computer Engineering
Modern Education Society's College of Engineering
Pune, India
[email protected]

Prof. Nuzhat F Shaikh
Department of Computer Engineering
Modern Education Society's College of Engineering
Pune, India
[email protected]

Abstract - Cloud computing, as a prominent computing paradigm,
enables users to remotely store their data/information in a cloud so
as to enjoy scalable services on-demand. The issues of scalability
and data confidentiality in access control still remain open. The
proposed scheme addresses data storage, the time required for
accessing data and user revocation, and prevents replay attacks.
Access control is processed by decentralized key distribution
centers, which makes data encryption more secure. The generated
decentralized key distribution centers are then grouped by a key
generation center (KGC). The proposed system provides
authentication for the user, in which only authorized users are able
to securely store and view the stored information. User validation
and the access control scheme are introduced in a decentralized
manner, which is required for preventing replay attacks, and the
scheme supports modification of data stored in the cloud. The
access control scheme is gaining attention because it is important
that only approved users have access to valid services. The
proposed scheme supports creation, reading and modification of
data stored in the cloud, and prevents replay attacks. We also
address user revocation. The problems of validation, access
control and privacy protection should be solved simultaneously.

Keywords: Access control, Authentication, Key Generation Center,
Cloud storage, Attribute Based Encryption, Key Distribution Center.

I. INTRODUCTION

Cloud research is at its peak and is getting a great deal of
attention from both the academic and industrial worlds. Cloud
computing, as a prominent computing paradigm, stores
data/information remotely in a cloud so as to enjoy scalable
services on-demand. Cost savings and productivity enhancement
are achieved by small and medium-sized enterprises by using
cloud-based services to collaborate and to manage projects. Using
the Internet, clients can outsource their computation and data
storage to the cloud through cloud services such as platforms for
developers to write applications (e.g., Amazon's S3, Windows
Azure), applications (e.g., Web Apps) and infrastructure (e.g.,
Amazon's EC2). Data kept in clouds, for example medical
records and social networks, is extremely sensitive and needs
protection. Security and privacy are therefore very central and
basic issues in cloud computing. An important principle of secure
cloud computing is that the user should be authenticated before
initiating any transaction, and it must be guaranteed that other
users do not learn the identity of the user. A technical service for
law enforcement is needed to ensure more than just security and
privacy for data [1]. The cloud serves users who store data on it
for different purposes; likewise, the cloud is liable for the services
it provides. The cloud verifies the credibility of the user who
stores data on it.

A. Parallel and Distributed Systems

Parallel and distributed systems are a number of nodes
connected in a network: a collection of processing elements that
cooperate and communicate to achieve a common goal.
Node-to-node communication is done through messages. Green
computing also deals with the energy consumed during
information transfer between nodes in the network; an important
objective of green computing is to reduce cost by reducing this
energy consumption.

A usual security risk of intrusion through an access control
scheme is simply following an authorized user through a door: a
legitimate user will hold the door for the intruder. This risk factor
can be reduced by security awareness training of the people
involved, or by more active means such as turnstiles. In
high-security applications handling sensitive information, this
risk is minimized by using a sally port, sometimes called a
mantrap or security vestibule, where operator interaction is
required, presumably to secure authorized verification.

User identification is related to multiple fields such as
anthropology, art and antiques, where a common problem is
determining whether a given artifact was produced in a certain
period of history or by a certain person; likewise, identifying a
person's identity is often needed to securely access authentication
information and systems.

Access control for the cloud is gaining attention because it is
vital that only verified users have access to authorized services.
Different data are stored on cloud servers, and much of the stored
data is highly sensitive. Sensitive information can be pictures,
personal information or videos that are shared on social networks
with selected groups of users, or not shared with others. It is
possible to store the information/data safely in the cloud, but it is
also required to ensure the security of the user. Consider an
example: an end user would like to save sensitive data but does
not want to be


acknowledged; the end user requires security for the data that
will be stored anywhere on the cloud. An end user may want to
share images, comments, etc., and does not wish his/her identity
to be known by others. Even so, the user must be able to prove to
the other users that she/he is a verified user who stored the data,
without revealing the identity.

Previous research on access control in the cloud was not
decentralized in nature. Even some non-centralized or
fragmented approaches do not support user identification.
Privacy-preserving authenticated access control for the cloud is
supported by previous work, but it takes a non-decentralized
approach where a single key distribution center distributes
attributes and a single key to the users present in the system.

II. LITERATURE SURVEY

Earlier schemes use symmetric-key or ABE approaches,
which do not support user identification. Past research provided
privacy-preserving authenticated access control in the cloud. A
decentralized approach was proposed by others in existing work,
but their method does not authenticate users who want to remain
anonymous while accessing data. Past work has proposed
distributed access control mechanisms in the cloud; however,
those published schemes do not provide authentication of the
users present in the system. Another limitation was that only the
owner of a file, that is, its creator/uploader, could write to that
stored file; other users were not able to write to a file stored in
the cloud. In past research, write access was given only to the
owner/creator of a particular file and not to the viewers/readers;
this was the disadvantage.

Cloud servers are liable to suffer from Byzantine failure,
whereby a storage server can fail in arbitrary ways. Research
proposes giving the authority the power to revoke user attributes
with minimal effort [3]. A cloud is also liable to be affected by
server-colluding attacks and data modification. In a
server-colluding attack, the adversary can compromise storage
servers so that it can exchange or modify data records as long as
they remain internally consistent. Data encryption is expected to
provide secure data storage on the cloud. However, the data is
often changed, and this dynamic property should be taken into
account while designing efficient secure storage schemes.
Efficient search on encrypted data is also an important concern
in clouds. Clouds should not know the query being performed on
the data, but should still be able to return the set of records that
satisfy the query. This is achieved by means of searchable
encryption.

A Key Distribution Center is a centralized approach in which
a single Key Distribution Center distributes secret keys to all
users; such a center is hard to maintain because of the large
number of clients present in the cloud for data sharing or storage.
Therefore, it is emphasized that clouds should take a
decentralized approach for distributing secret keys. Nowadays it
is quite natural that clouds have many [4] key distribution
centers at different remote places in the system.

In the ABE scheme explained in [5], a set of attributes is
defined with a unique ID. One or more classes have been defined
in KP-ABE; in ABE, the sender has an access policy with which
to encrypt data [6]. A writer whose attributes and keys have been
revoked cannot write back stale information. The receiver gets
attributes and secret keys from the attribute authority and can
decrypt information if it has matching attributes. In
ciphertext-policy ABE (CP-ABE), the receiver has the access
policy presented in the form of a tree, with attributes as leaves
and a monotonic access structure with OR, AND and other
threshold gates [7] [6].

All these approaches are centralized and allow only one Key
Distribution Center, which is a single point of failure. Chase
proposed a multi-authority Attribute Based Encryption scheme,
in which there are several Key Distribution Center authorities
(coordinated by a trusted authority) which distribute attributes
and secret keys to users. A multi-authority ABE protocol was
also studied which required no trusted authority but requires
every user to have attributes from all the Key Distribution
Centers. Recently, [9] proposed a fully decentralized ABE in
which users could have zero or more attributes from each
authority and which did not require a trusted server. In all these
cases, decryption at the user's end is computation-intensive;
hence, these schemes may be inefficient when users access them
from mobile devices. To address this issue, research proposed
outsourcing the decryption task to a proxy server, so that the user
can decrypt with minimum resources (for instance, hand-held
devices). However, the presence of one proxy and one Key
Distribution Center makes it less efficient than decentralized
approaches. Both these approaches also had no way to
anonymously authenticate users, i.e., verified users who want to
remain anonymous while accessing the cloud [5]. Recently,
different techniques have taken a decentralized approach and
provide authentication without revealing the identity of the
users; as mentioned in the previous section, this remains prone to
replay attacks.

III. SYSTEM ARCHITECTURE

The system architecture provides distributed access control of
data stored in the cloud so that only verified
keys and it’s a scribe to all clients that are available. If it’s clients with right properties can get to them. Decentralize get
centralized then single Key Distribution Center is single to control strategy with mysterious confirmation, which gives
purpose of disappointment; with single mystery key anticipates replay assaults, client repudiation. The cloud does
disappointment entire framework can fall. Key Distribution not know the character of the client who stores data, yet just
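The monotone CP-ABE access trees mentioned above (attributes as leaves; OR, AND and threshold gates as internal nodes) can be illustrated structurally. This is a hedged sketch of the policy logic only, not of the underlying cryptography; the attribute names and the policy are illustrative, not from the paper:

```python
# Minimal sketch of a CP-ABE-style monotone access tree.
# A node is either an attribute (leaf) or a pair (k, children): the node is
# satisfied if at least k children are satisfied. OR is k=1, AND is k=len(children).
def satisfied(node, attrs):
    if isinstance(node, str):          # leaf: a single attribute
        return node in attrs
    k, children = node
    return sum(satisfied(c, attrs) for c in children) >= k

# Illustrative policy: (doctor AND cardiology) OR admin
policy = (1, [(2, ["doctor", "cardiology"]), "admin"])
assert satisfied(policy, {"doctor", "cardiology"})
assert satisfied(policy, {"admin"})
assert not satisfied(policy, {"doctor"})
```

In a real CP-ABE scheme the tree is embedded in the ciphertext and decryption succeeds only when the user's key attributes satisfy it; here the tree is evaluated directly to show the access structure.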

445 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

Key distribution is done in a decentralized way. The costs are comparable to the existing centralized approaches, and the expensive operations are performed by the cloud. As Figure 3.1 shows for the secure authentication framework, the user acts as the uploader under an access policy.

A. Controllers of the Decentralized Access Control
1. Access control system
2. Access policy management
3. Anonymous executive
4. User revocation
5. Security control

Fig. 3.1. System Architecture

1) Access control system
It provides access control based on user information. In this module the cloud verifies the users who are authenticated. Anonymous users are authenticated in the cloud by an encryption method. A unique client creates and shares data with the other users in the group through the cloud.

2) Access policy management
Authorization for individual users is provided for both authenticated users and anonymous users. Authorizations are given to users on the basis of key generation.

3) Anonymous executive
It provides an access policy based on user information. It provides security for user information based on the attribute-based encryption technique.

4) User revocation
It secures the information from revoked users and data attackers. Secret keys are issued with the minimal sets of attributes required to decrypt the information. The owners should change the stored data and send the updated information to the other users.

5) Security control
It secures data on the basis of the access policy and the access control technique. Security controls are safeguards or countermeasures to avoid, counteract or minimize security risks relating to personal property. We only consider how to audit the integrity of shared data in the cloud with static group keys. This means the group key is pre-defined before the shared data is created in the cloud, and the membership of users in the group is not changed during data sharing. The original user is in charge of deciding who can share her data before outsourcing the data to the cloud.

B. Set Theory

Let S be a system such that

S = {C, D, A}

S - proposed system,
C - Creator {
  The creator who creates the file is the owner of that document.
  The created document is stored in the database.
  The creator stores their data with some basic authentication. },
D - Data, A - Access,

where

Ui - a registered user with attributes {N, Type, Id, CD, SD} under the system S,
N = username,
Access Policy = Role (Uploader, Reader, Modifier)
  Modifier: a writer can read and write the file available on the cloud. Only an authorized user can choose the role of writer to update the file stored on the cloud.
  Reader: a reader reads the data from the cloud with the key associated with him/her.
Id - a random key generated as password,
CD - current date,
SD - subscription limit date.

Process: the authorization and authentication process of a user with data: authentication of the user, authorized data, ciphertext. Authenticated users can store and modify their data stored on the cloud. Readers are only allowed to read or monitor the creator's/writer's document.
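The role-based access policy above (uploader, modifier/writer, reader) can be sketched as a simple permission check. The role and action names below are illustrative choices, not the paper's exact notation:

```python
# Hypothetical sketch of the access policy roles described above.
ROLE_PERMISSIONS = {
    "uploader": {"create", "read", "write"},
    "modifier": {"read", "write"},
    "reader":   {"read"},
}

def is_allowed(role: str, action: str) -> bool:
    # A user may perform an action only if their role grants it.
    return action in ROLE_PERMISSIONS.get(role, set())

assert is_allowed("modifier", "write")
assert is_allowed("reader", "read")
assert not is_allowed("reader", "write")   # readers cannot modify the document
```

In the actual scheme this check is enforced cryptographically (a reader holds only a decryption key, not a signing/writing capability); the table here only models the intended policy.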


Readers cannot modify the creator's document or perform any other operation on it. A writer who is authorized can modify or rewrite the creator's document.

Process (Ux, Ai, Ci)
Ux - authorized user
Ai - authorized data
Ci - ciphertext

X - input of the system: data as input {N, Type, Id, CD}.
Y - output of the system: efficiency and accuracy, providing data security and authorization.
T - set of steps to be performed, from verification to uploading the data to the cloud.
Output: (ciphertext, user revocation, decrypted data)

B = ciphertext(∑ Ux, Pi)

where
Ux = number of users involved in data storage,
Pi = authorized process, user revocation.

IV. SECURE DATA STORAGE ON CLOUD USING DECENTRALIZED ACCESS CONTROL

As the system architecture in Fig. 3.1 shows, signature generation takes the input file and the access policy. Digital signature generation keeps a user's identity anonymous to the other users on the cloud; every user's identity is checked through the attached digital signature.

Signature Generation for File Encryption:
A hashing algorithm, which works in a one-way encryption fashion, is used to maintain the integrity of the signature. Since hash algorithms are sensitive to small changes, even a single space added to Di changes the message digest Mi; hence any change made to Di can be detected and its integrity maintained. The algorithm takes as input a message of arbitrary length and produces as output a 128-bit fingerprint or message digest of the data. It is conjectured to be computationally infeasible to produce two messages having the same message digest. It is intended for applications where a large document must be compressed in a secure way before being encrypted with a private key under a public-key cryptosystem such as PGP.

Access Policy for Creator:
• The user logs in with his specific access policy, which will be R1 = uploader. The creator's access policy will request the KGC (key generation center) to generate keys.
• KGC: the key generation center, which is able to create random keys for the creator, KGC: {k1, k2, k3, ...}.
• The proposed system is based on decentralized access control for storing data, so more than one KGC is used for better performance. The KGCs are placed in a distributed area, and the keys generated by the KGCs are distributed according to the access policy, i.e., to the uploader.
• The uploader can select any key generated by a KGC and proceed to select the file F1 for uploading.
• For the file F1 that is going to be uploaded to the cloud, a signature must be generated: Di is the signature for the file to be uploaded; it keeps the user's identity anonymous to other users while proving the user to be valid.
• The digital signature Di is generated and attached, and the file F1 is encrypted with the encryption algorithm [8] to generate the ciphertext Ci.
• The user then requests the CSP to upload the ciphertext Ci to the cloud.
• The ciphertext Ci is then uploaded to the cloud.

V. RESULTS

We consider some other encryption algorithms and compare their performance against the proposed system's algorithm. The proposed scheme generates a digital signature and applies the encryption algorithm with the decentralized Key Distribution Center and KGCs for secure storage.

TABLE I
Scheme Comparative Result

Scheme   | Centralized/Decentralized | Read/Write Access | Secure Data Storage | Type of Access Control                                      | User Revocation
[12]     | Centralized               | 1-W-M-R           | No authentication   | ABE                                                         | Yes
[13]     | Decentralized             | M-W-M-R           | Authentication      | ABE                                                         | Yes
Proposed | Decentralized             | M-W-M-R           | Authentication      | Proposed access control (encryption with digital signature) | Yes

The criteria are S1 - fine-grained access control, S2 - data confidentiality, S3 - scalability, and S4 - user revocation. This comparison is listed in Table II.
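The uploader-side pipeline described in Section IV (compute the digest Di, attach it, encrypt to the ciphertext Ci) can be sketched as follows. This is a hedged illustration: MD5 is used because it matches the 128-bit digest length mentioned above, and the XOR keystream is only an illustrative stand-in for the Blowfish cipher of [8], not the actual scheme:

```python
import hashlib

def digest_di(data: bytes) -> bytes:
    # One-way hash: 128-bit message digest Di of the file contents (MD5 assumed).
    return hashlib.md5(data).digest()

def keystream(key: bytes, n: int) -> bytes:
    # Counter-mode keystream derived from a KGC-issued key; a stand-in for
    # Blowfish [8], chosen only to keep the sketch self-contained.
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def encrypt_file(key: bytes, file_bytes: bytes) -> bytes:
    # Attach the digest Di to the file, then encrypt to the ciphertext Ci.
    plaintext = digest_di(file_bytes) + file_bytes
    ks = keystream(key, len(plaintext))
    return bytes(a ^ b for a, b in zip(plaintext, ks))

# Even a single added space changes the 128-bit digest entirely:
assert digest_di(b"file F1") != digest_di(b"file F1 ")
ci = encrypt_file(b"k1", b"file F1 contents")
assert ci != b"file F1 contents"
```

On download, the receiver decrypts, recomputes the digest over the file bytes and compares it with the attached Di; any mismatch reveals tampering.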


TABLE II
Criteria Results

Criteria | ABE | KP-ABE | Proposed
S1       | N   | Y      | N
S2       | Y   | Y      | Y
S3       | N   | N      | Y
S4       | Y   | Y      | Y

The proposed scheme is more secure, provides the required revocation, and is faster than the other schemes.

Fig. 4.1. Storage Performance

Fig. 4.1 shows a comparison graph of the storage required by each algorithm to store the encrypted form of a document. The proposed algorithm, ABE with Blowfish, takes less storage: irrespective of the number of attributes input by the data owner, the size of the ciphertext does not increase exponentially. (Blue: existing system; green: proposed system. Storage is measured in bytes.)

Fig. 4.2. Time Performance

Fig. 4.2 shows a comparison graph of the time required by each algorithm to compute the encrypted form of a document, i.e., the algorithm running time. The proposed algorithm, ABE with Blowfish, takes less time irrespective of the number of attributes input by the data owner. (Blue: existing system; green: proposed system.) The graph shows that the proposed ABE with Blowfish gives good time efficiency, as its running time is lower for the same file F. Time is measured in milliseconds.

VI. CONCLUSION

The proposed system provides secure data storage on the cloud with anonymous authentication. It achieves scalability and high performance and prevents replay attacks. The identity of the user who stores information is not known to the cloud; the scheme only verifies the user's credentials. Providing cloud data to verified users in a decentralized network is beneficial and robust; the overall communication and storage costs are lower compared with non-decentralized techniques. The proposed scheme is more secure and robust, as the performance results show, and the system provides high performance with a minimal storage requirement.

References
[1] H. Li, Y. Dai, L. Tian, and H. Yang, "Identity-Based Authentication for Cloud Computing," Proc. First Int'l Conf. Cloud Computing (CloudCom), pp. 157-166, 2009.
[2] S. Ruj, A. Nayak, and I. Stojmenovic, "DACC: Distributed access control in clouds," in IEEE TrustCom, 2011.
[3] C. Wang, Q. Wang, K. Ren, N. Cao, and W. Lou, "Toward Secure and Dependable Storage Services in Cloud Computing," IEEE Trans. Services Computing, Apr.-June 2012.
[4] S. Seenu Iropia and R. Vijayalakshmi, "Decentralized Access Control of Data Stored in Cloud using Key-Policy Attribute Based Encryption," International Journal of Inventions in Computer Science and Engineering, ISSN (print): 2348-3431, 2014.
[5] A. Sahai and B. Waters, "Fuzzy Identity-Based Encryption," Proc. Ann. Int'l Conf. Advances in Cryptology (EUROCRYPT), pp. 457-473, 2005.
[6] G. Wang, Q. Liu, and J. Wu, "Hierarchical Attribute-Based Encryption for Fine-Grained Access Control in Cloud Storage Services," Proc. 17th ACM Conf. Computer and Comm. Security (CCS), 2010.
[7] J. Bethencourt, A. Sahai, and B. Waters, "Ciphertext-Policy Attribute-Based Encryption," Proc. IEEE Symp. Security and Privacy, pp. 321-334, 2007.
[8] A. Alabaichi, "Security analysis of Blowfish algorithm," Informatics and Applications (ICIA), 2013 Second International Conference on, IEEE, 2013.
[9] M. Chase, "Multi-Authority Attribute Based Encryption," Proc. Fourth Conf. Theory of Cryptography (TCC), pp. 515-534, 2007.
[10] S. Yu, C. Wang, K. Ren, and W. Lou, "Attribute Based Data Sharing with Attribute Revocation," Proc. ACM Symp. Information, Computer and Comm. Security (ASIACCS), 2010.
[11] H. K. Maji, M. Prabhakaran, and M. Rosulek, "Attribute-based signatures: Achieving attribute-privacy and collusion resistance," IACR Cryptology ePrint Archive, 2008.
[12] F. Zhao, T. Nishide, and K. Sakurai, "Realizing fine-grained and flexible access control to outsourced data with attribute-based cryptosystems," in ISPEC, ser. Lecture Notes in Computer Science, vol. 6672, Springer, pp. 83-97, 2011.
[13] Shraddha Mokle and Nuzhat Shaikh, "Decentralized Access Control Schemes for Data Storage on Cloud," SAP, Computer Science and Engineering, 6(1), pp. 1-6, 2016. doi:10.5923/j.computer.20160601.01


[14] Sushmita Ruj, Milos Stojmenovic and Amiya Nayak, "Decentralized Access Control with Anonymous Authentication of Data Stored in Clouds," IEEE Transactions on Parallel and Distributed Systems.
[15] Shraddha Mokle and Nuzhat Shaikh, "Anonymous Authentication For Secure Data Stored On Cloud With Decentralized Access Control," IEEE 2016 International Conference on Wireless Communications, Signal Processing and Networking, March 23-25, 2016.


Data partition and Aggregation in MapReduce to Improve Processing time

1. Priya P. Gawande, Modern Education Society's College of Engineering, Pune.
2. Nuzhat F. Shaikh, Modern Education Society's College of Engineering, Pune.

Abstract

The MapReduce model processes large-scale data by exploiting parallel map tasks and reduce tasks. The network traffic generated in the middle phase, i.e., the shuffle, is usually ignored when improving the performance of MapReduce jobs, yet reducing the traffic generated in the network helps to improve performance efficiently. Usually a hash function is used to partition the intermediate data among the reduce tasks, which is not traffic-efficient because the network topology and the data size associated with each key are not taken into consideration. A data partition algorithm is proposed to decrease the network traffic cost of a MapReduce job. We also consider the aggregator placement problem, in which each aggregator is responsible for reducing the data. The data aggregation is carried out by considering various measures such as word count, word frequency, document frequency, TF and IDF. Using all these factors, the data uploaded by the user is aggregated, which reduces the processing time compared to the processing time without aggregation.

1. Introduction

MapReduce has evolved into an efficient means of processing data in large data centers. MapReduce is a programming model by which data sets are generated and processed. A set of intermediate key/value pairs is generated by processing each input key/value pair with the map function. Another function, the reduce function, is responsible for merging all the values that are linked to the same intermediate key on a machine. The run-time system is responsible for storing the information regarding the input data partition and for deciding which particular program is to be executed, i.e., it decides the scheduling among the different machines. Inter-machine communication and machine failures are also managed by the run-time system.

The computation in MapReduce is considered to have two basic phases, map and reduce. First, in the map phase, the input is rearranged so that the required computation is achieved by applying a particular algorithm to the small parts of the data. The later phase after map is reduce, as mentioned earlier, and both phases operate in parallel on large-scale data. When considering MapReduce system performance, however, a MapReduce job should be considered as consisting of three phases rather than only map and reduce.


The 'shuffle' phase is the phase occurring in between the two phases, i.e., map and reduce, and can be referred to as the data transfer phase. In the shuffle phase, the output given by the mappers is combined and sent to the compute nodes that perform the corresponding reduce operations. MapReduce performance therefore relies heavily on the manner in which the tasks associated with the map, reduce and shuffle phases are scheduled. While current techniques deal with scheduling workflow execution on grids, similar techniques are not helpful for scheduling MapReduce jobs. The user initially defines the position of the reducers, which depends on two factors: latitude and longitude. Before being allocated to a reducer, the data is first partitioned as mentioned above, and it is then aggregated before being allocated to the nearest reducer. Measures such as word count, word frequency, document frequency, TF and IDF are used for aggregation. Using all these factors, the data uploaded by the user is aggregated, which reduces the processing time compared to the processing time without aggregation. Another added challenge that occurs when using MapReduce jobs is big data.

2. Present State of the Art

Data locality means taking into consideration the closest machines that store the input data chunks when map tasks are allocated. A local machine is considered for each task: when the data chunk linked to a particular task is stored locally, the task is said to be a local task. A machine to which a task is allocated without holding its data is referred to as a remote machine for that task, and the task on that remote machine is called a remote task. When referring to locality, we can also consider the proportion of tasks that run on local machines.

Improving locality can reduce both the processing time of the map tasks and the network traffic load, given that fewer map tasks need to obtain data remotely. However, allocating tasks only to local machines may result in an uneven allocation of tasks among the machines; that is, a few machines may carry heavy network traffic while the rest remain idle. To achieve a proper balance between data locality and load balancing, a map-scheduling algorithm (or a simple scheduling algorithm) that assigns map tasks to machines is considered.

Scheduling is one of the most crucial aspects of MapReduce because of the various issues that have been observed and the additional issues that arise while scheduling in MapReduce. To overcome these problems, numerous algorithms using different techniques and approaches have been examined and proposed. Several of these algorithms concentrate on increasing data locality, while a few of them help to guarantee synchronized processing. Several of these algorithms also cover implementations that decrease the overall processing time.

Several implementations of the MapReduce interface are possible, and the correct choice of implementation depends thoroughly on the surroundings.

For instance, one implementation may be appropriate for a small shared-memory machine, while another is proper for a big multiprocessor, and yet another for a larger collection of networked machines. Different open-source implementations of MapReduce are available, and the applicability of MapReduce to a variety of problem domains has been studied.

Entity resolution (ER) identifies stored data that relates to the same entity across different data sources. It is a tedious task in database management research, and ER algorithms that use the MapReduce framework for cloud computing are required for the sheer volume of today's data collections. A large amount of research on blocking-based ER assumes that a single blocking key is associated with each entity, although in some applications multiple blocking keys may be associated with an entity. Similarity computation accelerates blocking-based ER. A blocking key has a specified value, generated by hashing the entity in ER. Blocking-based ER consists of two main steps: a blocking step and a matching step. Blocks are the groups generated by partitioning all the entities present in the datasets with the help of the blocking keys; in the blocking step, all the entities in a block share the same blocking key. The later matching step carries out similarity computations for all pairs of entities in every block and identifies the matches. The result is the set of all entity pairs with high similarity.

The implementation of Zput consists of three key operators:
1. Link: Link avoids the main issues of data duplication and relocation. It is needed to maintain the track of the associations among files and to generate references into the directories and files.
2. Checksum: A checksum is calculated over an arbitrary block of digital data to detect errors that might have been introduced during data transmission or storage.
3. Compression: Compression has two important benefits: a minimal amount of data needs to be stored, and data transfer is accelerated across the network or to/from disk.

3. Proposed system

To decrease the network traffic in a MapReduce job, data with similar keys is aggregated before being forwarded to remotely located reduce tasks. Although the combiner performs the same function, the combiner operates on the data generated by each map task individually and thus fails to exploit the data aggregation opportunity across multiple tasks.

Basically, a MapReduce application consists of three main functions: the map function, the partition function and the reduce function. The map function operates on a series of key/value pairs, processes them and outputs key/value pairs. Each output key/value pair is then allocated to a reducer by the partition function, which takes as input the key and the total number of reducers and returns the index of the reducer to which the corresponding key/value pair should be sent for further processing. The reduce function iterates through the values associated with a unique key and emits the output.
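The default partition function just described (take the key and the number of reducers, return a reducer index) can be sketched in a few lines. Python's built-in hash() stands in for the framework's hash function h:

```python
# Sketch of the default partition function: f(key) = h(key) % numberOfReducers.
# hash() is salted per interpreter run but is consistent within one run, which
# is all the property the shuffle needs.
def partition(key: str, number_of_reducers: int) -> int:
    return hash(key) % number_of_reducers

index = partition("cloud", 4)
assert 0 <= index < 4
assert partition("cloud", 4) == index   # a key always maps to the same reducer
```

Because this hash ignores both the network topology and the per-key data volume, it is exactly the traffic-oblivious behavior the proposed partition algorithm aims to improve on.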


Fig 1 : Map Reduce

The main motive is to reduce and improve the processing time by using data partition and aggregation for a MapReduce job, as shown in Fig. 2. We propose a distributed algorithm for splitting the actual large-scale problem into many subproblems, which can then be solved; the split data is later aggregated, which eventually decreases the processing time.

The user initially defines the position of the reducers, which depends on two factors: latitude and longitude. Both factors can be given as floating-point values, which are further required for deciding the data allocation to a particular reducer. Before being allocated to a reducer, the data is first partitioned as mentioned above; it is then aggregated before being allocated to the nearest reducer. The data aggregation is carried out by considering various measures such as word count (wc), word frequency (wf), document frequency (df), term frequency (TF) and IDF. Using all these factors, the data uploaded by the user is aggregated, which reduces the processing time compared to the processing time without aggregation.

The results generated finally suggest that the given proposals can significantly reduce the processing time. To support the analysis, we build an auxiliary graph with a three-layer structure.

Fig. 1. The network traffic minimization problem model.

Another added challenge that arises when dealing with a MapReduce job is big data. The basic idea is to split the actual large-scale problem into several small, distributively solvable subproblems. The distributed algorithm can usefully be applied to large databases, and the obtained results clearly show that the processing time is reduced efficiently compared to the traditional method.

Map and reduce tasks can in some cases partially overlap their execution to improve system throughput. After the user defines the two reducers, node1 and node2 are started; they handle the data given to them as input, aggregate the input data by using all the factors mentioned before (TF-IDF, word count, document count, word frequency and document frequency), and pass it on to the reducers.

Pairs of key/value are given as the inputs to the map function. When the system receives input from a MapReduce job, the map tasks (referred to as mappers) start on the compute nodes, and every map task applies the map function to each key/value pair (k1, v1) assigned to it. Zero or more key/value pairs (list(k2, v2)) can be generated for the same input key/value pair.

function map(String key, String value):
    // key: document name || value: document contents
    for each word w in value:
        EmitIntermediate(w, "1");

function reduce(String key, Iterator values):
    // key: a word || values: a list of counts
    int result = 0;
    for each v in values:
        result += ParseInt(v);
    Emit(AsString(result));

The local file system is used to store the intermediate results, which are sorted with the help of the keys. After completing all of the map tasks, MapReduce alerts the reduce tasks (processes referred to as reducers) to initiate their processing. The reducers fetch the output files from the map tasks in parallel, and merge-sort the files obtained from the map tasks to combine the key/value pairs into a set of new key/value pairs (k2, list(v2)), where all values with the same key k2 are grouped into a list and used as the input for the reduce function. The reduce function applies the user-defined processing logic to process the data.

Algorithm:

function TeraSort():
{
    // Initialize MapReduce objects
    void mr = MR_create(MPI_COMM_WORLD);
    // read files, loop and add key/value pairs using MR_kv_add
    while (listNum != null) {
        // using default sorted key distributions
        readingHandler = readPartition(mr, file_args);
    }
    // Use the Map function
    mainHandler = MR_map(mr, readingHandler);
    MR_collate(mr, NULL);
    MR_reduce(mr, &sum, NULL);
    // Sort by keys; keys have a unique order as explained above.
    // ncompare defines the sorting rules
    MR_sort_keys(mr, &ncompare);
    MR_Gather();
    DispOutput();
}

• Input and Output modules: The input module recognizes the data taken as input in several input formats and splits the input data into key/value pairs. This module permits the processing engine to work with different storage systems by allowing different input formats for parsing unlike data sources, such as binary files, text files, and even database files. Similarly, the output format of both the mapper and the reducer is specified by the output module.

• Combine module: The combine module reduces the shuffling cost by performing a local reduce over the key/value pairs generated by a mapper. Hence this module can be considered a particular type of reducer.

• Partition module: The partition module is used for shuffling the key/value pairs from the mappers to the reducers. The partition function is defined as f(key) = h(key) % numberOfReducers, where % stands for the mod operator and h(key) represents the hash value of a particular key. A key/value pair (k, v) is forwarded to the f(k)-th reducer. Several partition functions may be defined to support more complicated behavior.

• Group module: The data received from the different map processes is merged with the help of the group module. The merged data is converted into a single sorted run in the reduce phase. Specifying the group function allows the data to be merged more flexibly as a function of the map output key. For example, if the map output key is a combination of a number of attributes (sourceIP, destURL), the group function can compare only a subset of the attributes (sourceIP).

5. Results and Discussion

Extensive simulations are conducted to test the performance of the proposed distributed algorithm (DA). We compare the results of the distributed algorithm with those of the approach without aggregation, which is the default method in Hadoop. The aggregator placement algorithm is identified and then compared with the one where random aggregation is carried out. The processing time is obtained by considering the TF-IDF values and the several different occurrences in the data. The processing time is substantially smaller with aggregation than without aggregation.

Fig. 2. Processing time with aggregation and without aggregation

Fig 3: Aggregation of data by TFIDF, WF, DF, WC


6. Conclusion

The data uploaded by the user is partitioned and then aggregated by MapReduce to decrease the processing time and improve the efficiency of the MapReduce job. The data aggregation is carried out by considering terms such as word count, word frequency, document frequency, TF, and IDF. Considering all these factors, the uploaded data is aggregated, which helps to reduce the processing time compared to the processing time without aggregation. The data distribution is also done by considering the locality of the reducers: the nearest reducer is allocated about 75% of the data and the remainder is allocated to the other reducer.
Meng, “Zput: A speedy


Studying the Numerical Methods for Calculating Bi-Phase Fluid Flow
Behrouz Aghaei1*, Afshin Mohseni Arasteh1
1 North Branch, Islamic Azad University, Tehran, Iran. E-mail: [email protected]

Abstract- In calculations for shipbuilding and in the study of marine phenomena, the numerical solution of bi-phase fluid flow plays a major role, and much research is conducted on seas, lakes, and canals. The numerical solution of this flow is given by the Navier-Stokes, continuity, and surface-fitted equations. Because the bi-phase physical properties of the flow require discretization of the governing equations, coupling the above equations is important. In this paper, the distribution of the bi-phase fluid is obtained over the whole computational range by solving the surface-fitted equations with an interface capturing method; the equations governing the fluid flow are therefore solved for a bi-phase fluid. For coupling the velocity and pressure fields, the fractional step method was used. A wise selection in each part of the numerical solution algorithm, informed by a command of the available choices and conformed to the requirements of the problems at hand, results in an efficient numerical method; providing a platform for such selections is the aim of this paper.
Keywords- bi-phase flow, fractional step method, interface capturing method
I. INTRODUCTION
Given the importance of understanding multi-phase processes in industries such as oil, gas, and petrochemicals, as well as national holdings such as the Water & Sewage Company and the Ministry of Power, it is very important to know more about this issue: properly predicting the flow system, reducing the risks of phase changes such as cavitation, modeling multi-phase flows, and understanding their behavior. Predicting the properties of the different phases, such as temperature and pressure, as well as detecting how their interface changes, considerably helps in reviewing and analyzing multi-phase systems. Command of this field also reduces the cost of numerical simulations and improves prediction at critical points. For these purposes, numerical simulation of this type of flow can be very useful for determining the optimal operating conditions of devices. Simulating multi-phase flow with computational fluid dynamics requires models in which the volumetric fraction of the fluid can be accurately represented; for this reason, modeling the interface between the phases, particularly in very sensitive situations such as detecting the systems governing it, is very complicated. As in all numerical modeling, producing the grid geometry is one of the requirements for simulating multi-phase flows; desirable gridding according to the nature of the governing equations, as well as proper zoning of the solution field, is very important for predicting the interface. After gridding and selecting the appropriate method, by determining boundary conditions conforming to the flow physics, it becomes possible to analyze the
process of the multi-phase flow. By conducting a proper analysis, one can investigate different parameters such as the volumetric fraction values of the fluid, pressure, and velocity. In most scientific problems, fluid flow occurs together with a free surface. Modeling such a flow is one of the common issues in computational fluid dynamics, and many studies are being conducted in this field. The problem discussed here comprises two general parts: solving the equations governing the fluid flow (the Navier-Stokes and continuity equations) and modeling the free surface.

457 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

A. Navier-Stokes equations
Solving the Navier-Stokes equations requires choosing an algorithm for coupling the velocity and pressure fields. Methods for solving the fundamental equations governing the flow (three linear momentum equations and one continuity equation) can be divided into two families: simultaneous solution and sequential solution. The simultaneous solution approach comes with a high computational cost, particularly for large problems; in comparison, there are approaches in which the continuity condition is satisfied by solving the equation once per time step, so the sequential solution approach has been more fully developed. The main issue in the sequential approach is the lack of an explicit equation for the pressure. For this, approaches such as (1) Predictor-Corrector, (2) Artificial Compressibility, and (3) Projection or Fractional Step have been developed.
Predictor-Corrector approach: starting from an initial guess for the velocity and pressure fields, the equations are solved and, over several iterations that sequentially correct the velocity and pressure, a field is obtained that satisfies all governing equations and ensures continuity over the computational field. Artificial Compressibility approach: this approach is based on the idea of converting the equations governing incompressible flow, of elliptic-parabolic nature, into equations governing compressible flow, of hyperbolic nature, and then using the approaches developed for such flows. In this case, a pseudo-time pressure term scaled by the artificial compressibility parameter β is added to the continuity equation. Projection or Fractional Step approach: in this method, the momentum equations are solved while the continuity condition is applied in a few sub-steps, with no need for iteration. Given that no iteration among the governing equations is needed and that transient flows govern the ocean environment, the fractional step approach is an appropriate choice; thus, for its higher solution speed, the fractional step approach was adopted [1]. This approach was proposed by Chorin [2, 3] in the late 1960s and developed by others [4, 5, 6, 7].
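In outline, the fractional (projection) step advances the momentum equation without the new pressure, solves a Poisson equation for the pressure, and then projects the velocity onto a divergence-free field. A standard textbook form of the scheme (a summary, not transcribed from [2, 3]) is:

```latex
% Step 1 (predictor): advance momentum without the new pressure
\mathbf{u}^{*} = \mathbf{u}^{n} + \Delta t\left[-(\mathbf{u}^{n}\cdot\nabla)\mathbf{u}^{n}
               + \nu\,\nabla^{2}\mathbf{u}^{n} + \mathbf{g}\right]
% Step 2: pressure Poisson equation enforcing continuity
\nabla^{2} p^{\,n+1} = \frac{\rho}{\Delta t}\,\nabla\cdot\mathbf{u}^{*}
% Step 3 (projection): correct the velocity to a divergence-free field
\mathbf{u}^{n+1} = \mathbf{u}^{*} - \frac{\Delta t}{\rho}\,\nabla p^{\,n+1}
```

Taking the divergence of Step 3 and substituting Step 2 gives ∇·u^{n+1} = 0, which is why no iteration between the momentum and continuity equations is needed.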
B. Free surface
Numerical approaches in this field can be divided into two main categories [8]:
• Interface Tracking Methods
• Interface Capturing Methods
In the interface tracking methods, a sharp interface is provided by marking the free surface with particles on the interface or by using surface-fitting methods [9]. In the first approach, particles moved by the local fluid velocity are followed in a Lagrangian manner [10]; this approach is also used for 3D problems [11]. In the second approach, the computational grid follows the free surface so as to satisfy the kinematic and dynamic conditions [12]. Although these approaches determine the exact position of the free surface, the algorithms used run into fundamental limitations when modeling complicated geometries such as wave breaking [13]. In comparison, interface capturing methods model the free surface by using particles on the interface of the fluid, by solving a convection equation for the capturing ratio, or by calculating the distance to the free surface. Interface capturing methods are generally able to model complicated interface geometries, large deformations, and environmental effects; for this reason, they are considered a desirable choice for studying bi-phase flow. In this setting, the interface of the two fluids is treated as a discontinuity in the physical properties of the computational range. Most schemes in the interface capturing family model the free surface by solving a convection equation,


which distributes the two phases of the fluid over each cell of the computational grid and thereby yields the volumetric ratio. Proper discretization of the convection equation for the free-surface quantity is very important, and considerable studies have been conducted on it:
A) Marking the free surface by particles on the interface
B) Conforming the computational grid
C) Using particles in the fluid
D) Calculating the volumetric or capturing ratio of the two phases.

Figure 1. Surface Fitted Modeling Approaches


C. Fundamental Governing Equations
When modeling fluids with a free surface by calculating the distribution of the two fluid phases over the whole computational range, it can be assumed that a single effective fluid acts across the interface. The fundamental equations governing the pressure and velocity fields for a control volume P can then be given as below:
\frac{\partial}{\partial t}\int_{V_p} \rho \, dV + \int_{A_p} \rho\,(\mathbf{u}\cdot\mathbf{n})\, dA = 0 \qquad (1)

\frac{\partial}{\partial t}\int_{V_p} \rho u_k \, dV + \int_{A_p} \rho u_k\,(\mathbf{u}\cdot\mathbf{n})\, dA = \int_{A_p} \rho\nu\,\nabla u_k \cdot \mathbf{n}\, dA - \int_{A_p} p\,\mathbf{i}_k \cdot \mathbf{n}\, dA + \int_{V_p} \rho g_k \, dV \qquad (2)


where u is the velocity vector and n is the unit vector perpendicular to the cell surface. With k = 1, 2, 3, the U_k are the velocity components (u, v, w), the x_k are the spatial coordinates (x, y, z), and the g_k are the gravity components (g_x, g_y, g_z). In addition, ρ denotes density, ν denotes kinematic viscosity, V_p is the volume of cell P, and A_p is the surface of this cell. The first term in the above equation is the unsteady term. The second term in (2) represents the convection mechanism of linear momentum transport, and the third term represents the diffusion mechanism in this transport. The fourth term represents the surface (pressure) forces and the fifth term the body (gravity) forces. The effective density and viscosity of the fluid in any cell are calculated by (3):
\rho = \alpha\,\rho_1 + (1-\alpha)\,\rho_2, \qquad \nu = \alpha\,\nu_1 + (1-\alpha)\,\nu_2 \qquad (3)
where indices 1 and 2 indicate the two phases of the fluid. The volume ratio α (the fractional presence of the two fluids in any calculation cell) across the solution interface is given by (4):

\alpha = \begin{cases} 1, & \text{in fluid 1} \\ 0 < \alpha < 1, & \text{at the interface} \\ 0, & \text{in fluid 2} \end{cases} \qquad (4)

Using the continuity equation (1) and the definition of the effective fluid properties (3) results in (5) for the transport of the volume ratio α:

\frac{\partial \alpha}{\partial t} + \nabla\cdot(\alpha\,\mathbf{u}) = 0 \qquad (5)

Discretization of the convection equation (5) for the volume ratio of the two fluid phases is very important.
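A minimal 1D illustration of transporting the volume ratio and recovering the mixture properties (first-order upwind, uniform velocity u > 0; an illustrative sketch only, not the scheme used in the paper):

```python
def advect_alpha(alpha, u, dx, dt):
    # One explicit first-order upwind step of d(alpha)/dt + u*d(alpha)/dx = 0, u > 0.
    new = alpha[:]          # cell 0 acts as a fixed inflow boundary
    for i in range(1, len(alpha)):
        new[i] = alpha[i] - u * dt / dx * (alpha[i] - alpha[i - 1])
    return new

def effective_property(alpha, p1, p2):
    # Cell-wise mixture rule in the spirit of (3): p = alpha*p1 + (1 - alpha)*p2.
    return [a * p1 + (1.0 - a) * p2 for a in alpha]

# Sharp interface between fluid 1 (alpha = 1) and fluid 2 (alpha = 0) at cell 5.
alpha = [1.0] * 5 + [0.0] * 5
alpha = advect_alpha(alpha, u=1.0, dx=0.1, dt=0.05)   # CFL = u*dt/dx = 0.5
rho = effective_property(alpha, 1000.0, 1.0)          # water-like over air-like
print(alpha[5])  # → 0.5: upwinding has smeared the once-sharp interface
```

After a single step the interface cell already holds an intermediate value, showing why the choice of discretization for (5) matters: a naive upwind scheme diffuses the interface, which is what the sharper interface-capturing schemes discussed above are designed to avoid.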
D. Coordinate System and Computational Grid
Deciding on a coordinate system involves two choices: first, using an inertial or a non-inertial coordinate system [14], and second, using Cartesian or non-Cartesian velocity components [15]. Converting the continuous physical space to the computational space is done by producing a computational grid; here there are three main concerns: the structure of the grid, the properties of the grid, and the method of producing it. Structurally, computational grids can be divided into three categories, "Structured Grid", "Block-Structured Grid", and "Unstructured Grid", as indicated in "Fig. 2". Comparing the advantages and disadvantages summarized in Table I, the unstructured grid is a proper choice for simulating the two-phase turbulent flow governing the sea environment. Characteristics of a good computational grid include a fine grid in regions with severe gradients, which helps distribute the error evenly; slow changes in the size of neighboring cells, which is desirable for solution accuracy; and the use of quadrilateral cells in 2D and hexahedral cells in 3D, particularly next to the wall, which is effective in reducing the error. Considering all such specifications when producing the grid, whether by numerical approaches, differential equations, or different variables [16], makes grid generation one of the most important parts of the numerical solution.

TABLE I
TYPES OF COMPUTATIONAL GRID BASED ON STRUCTURE

Structured
  Characteristics: possibility of following the grid lines
  Advantages: determined neighborhood cells; structured programming; ordered algebraic equation system
  Disadvantages: time-consuming gridding; no control over grid fineness; limitations in complicated geometries

Block-structured
  Characteristics: possibility of following the grid lines within any block
  Advantages: determined neighborhood cells within any block; ordered algebraic equation system
  Disadvantages: time-consuming gridding; no control over grid fineness; need to store data on the connection of blocks

Unstructured
  Characteristics: -
  Advantages: easy gridding; no limitation in complicated geometries; full control over grid fineness
  Disadvantages: unspecified neighborhood of the cells; need to store cell data; complicated programming; non-ordered algebraic equation system

Figure 2. Classification of grids based on structure; (a) structured, (b) block-structured; (c) unstructured
E. Rayleigh-Taylor Instability
When a heavy fluid lies on top of a light fluid, the buoyancy force displaces both fluids. This unstable configuration is known as the Rayleigh-Taylor problem. The displacement pattern of the two fluids depends on the preliminary perturbation given to the computational range, and many works have addressed this, for example by changing the initial distribution of the fluids or by giving an initial velocity to the computational range. Here, the initial velocity (6) of both fluids has been taken as the initial perturbation.
(6)


In this equation, A is the perturbation amplitude and Δy is the width of the cells used in the uniform gridding. For comparing the accuracy of results, the Reynolds number based on the heavy fluid has been used. In the initial time steps, the deformation of the interface has a symmetrical nature, and this symmetry is then lost continuously. To review the effect of the Reynolds number on interface formation, the Rayleigh-Taylor instability with an initial velocity given to the computational range was modeled for different Reynolds numbers, as indicated in "Fig. 3".
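For reference, a Reynolds number based on the heavy fluid's properties can be computed as follows (the numerical values here are illustrative assumptions, not the ones used in the study):

```python
def reynolds(rho, U, L, mu):
    # Re = rho * U * L / mu, using the heavy fluid's density and viscosity.
    return rho * U * L / mu

# Example: water-like heavy fluid, 5 mm/s characteristic velocity, 0.1 m length scale.
print(reynolds(rho=1000.0, U=0.005, L=0.1, mu=1.0e-3))  # → 500.0
```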

Figure 3. Influence of increased Reynolds number on displacement, velocity, and elapsed time in both fluids; (a) Re = 353; (b) Re = 484; (c) Re = 602.

II. CONCLUSION
This study investigated the numerical solution of bi-phase flow and the type of grid for modeling the interface between two phases, particularly in very sensitive situations, such as detecting the governing systems, which are very complicated. Most schemes in the interface capturing and surface fitted families model the free surface by solving a convection equation for the distribution of the two fluid phases in each cell of the computational grid, from which the volume ratio is obtained; the numerical solution of this flow is possible through the Navier-Stokes and continuity equations together with the surface fitted method. Coupling the above equations based on the physical properties of the two-phase fluid is therefore very important and requires discretization of the governing equations.

REFERENCES
[1] Ferziger, J.H., Peric, M., "Computational methods for fluid dynamics", 3rd ed., Springer, 2002.
[2] Chorin, A.J., "Numerical solution of the Navier-Stokes equations", Math. Comput. 22, 745, 1968.
[3] Chorin, A.J., "On the convergence of discrete approximations to the Navier-Stokes equations", Math. Comput. 23, 341, 1969.
[4] Goda, K., "A multistep technique with implicit difference schemes for calculating two- or three-dimensional cavity flows", J. Computational Physics 30, 76, 1979.
[5] Bell, J.B., Colella, P., Howell, H., "An efficient second-order projection method for viscous incompressible flow", in Proceedings of the Tenth AIAA Computational Fluid Dynamics Conference, AIAA, p. 360, 1991.
[6] Kim, J., Moin, P., "Application of a fractional-step method to incompressible Navier-Stokes equations", J. Comput. Phys. 59, 308, 1985.
[7] Van Kan, J., "A second-order accurate pressure-correction scheme for viscous incompressible flow", SIAM J. Sci. Comput. 7, 870, 1986.
[8] Panahi, R., Jahanbakhsh, E., Seif, M.S., "Comparison of interface capturing methods in two phase flow", Iranian J. Science & Technology, Transaction B: Technology, Vol., No. B6, 2005.
[9] Dervieux, A., Thomasset, F., "A finite element method for the simulation of Rayleigh-Taylor instability", IRIA-LABORIA report, F-78150 Le Chesnay, 1979.
[10] Nichols, B.D., Hirt, C.W., "Calculating three-dimensional free surface flows in the vicinity of submerged (PCBFC) method for the analysis of free-surface flow", Int. J. Num. Methods Fluids, Vol. 15, pp. 1213-1237, 1973.
[11] Muzaferija, S., Peric, M., Seidl, V., "Computation of flow around a circular cylinder in a channel", Internal Report, Institut für Schiffbau, University of Hamburg, 1995.
[12] Clarke, A.P., Issa, R.I., "A numerical model of slug flow in vertical tubes", Dept. of Mechanical Engineering, Imperial College, London, Internal Report, 1995.
[13] Ferziger, J., Peric, M., "Computational methods for fluid dynamics", 3rd rev. ed., Springer Verlag, 1995.
[14] White, F.M., "Fluid mechanics", McGraw-Hill, 4th ed., 2001.
[15] Melaaen, M.C., PhD Thesis, University of Trondheim, 1990.
[16] Arcilla, A.S., Hauser, J., Eiseman, P.R., Thompson, J.F. (eds.), "Numerical grid generation in computational fluid dynamics and related fields", North-Holland, Amsterdam, 1991.


A Novel Simple Method to Select Optimal k in k-Nearest Neighbor Classifier
Masoud Maleki, Negin Manshouri, Temel Kayıkçıoğlu
Department of Electrical and Electronics Engineering
Karadeniz Technical University
Trabzon, Turkey
[email protected], [email protected], [email protected]

Abstract— The k-nearest neighbor (k-NN) algorithm is one of the traditional methods used in classification. It assigns an unseen point to the dominant class among its k nearest neighbors within the data set. However, the lack of a formal framework for selecting the neighborhood size k is problematic. This article investigates a novel method for calculating the optimum value of k using cross-validation techniques. The proposed method is fully automatic, with no user-set parameters, and it is tested on different benchmark data sets in comparison with other popular methods for selecting k.

Keywords—K-fold cross-validation; optimum k; k-nearest neighbor; leave-one-out; pattern recognition

I. INTRODUCTION

Classification is one of the most active research areas and an important measure in many applications. It is the allocation of unknown samples to a known class based on a feature vector. The selection of a classifier depends on the kind of problem, the features used, and other parameters of the problem. Unsatisfactory classification occurs when feature vectors have overlapping areas; in this case, an optimum decision boundary should be found such that the probability of misclassification is minimized [1]. The Classification Accuracy Rate (CAR) is one of the important measures of the performance of a classifier: CAR is the percentage of trials classified correctly in the testing data over the total number of testing trials.

k-nearest neighbor classification is one of the fastest, easiest to implement, and most common algorithms for statistical pattern recognition [2, 3]. It forms a finite partition X1, X2, ..., XJ of the sample space X such that an unknown observation x is classified into the jth class if x ∈ Xj. The performance of a nearest neighbor classifier depends on the distance function and on the value of the neighborhood parameter k. There are several ways to calculate the distance between two points, including the Minkowski, Euclidean, City block (Manhattan), Canberra, Chebyshev, and Bray-Curtis (Sorensen) distances. It is worth mentioning that the Euclidean distance is commonly used in the k-NN algorithm. If the observations are not of comparable units and scales, it is meaningful to standardize them before using the Euclidean distance.

The other parameter, which controls the volume of the neighborhood and consequently the smoothness of the density estimates, is the number of neighbors k. It plays a very important role in the performance of a nearest neighbor classifier. If k is too small, the result can be sensitive to noise points; on the other hand, if k is too large, the neighborhood may include too many points from other classes [4]. In many classification studies, the method for selecting k has not been stated and, in some studies, k has been selected by trial and error. In the study by Duda et al. [5], the best k was selected using (1) for any data set:

m = √n    (1)

where n is the number of observations in the training data set, and the nearest integer to m is taken as the best k value; in this algorithm, k is thus a function of the training data set. Enas and Choi [6] carried out a simulation study and suggested scaling k as n^(2/8) or n^(3/8), where n is again the number of training observations, so the value of k also depends on the training data set. In brief, no method dominates the literature, and simply setting k=1 or choosing k via cross-validation appear to be the most popular approaches [7]. The advantage of cross-validation is that k-NN classifies testing observations with awareness of and acquaintance with the training data set; as a result, it influences the misclassification rate.

In some papers, empirical algorithms have been used, like K-fold cross-validation (K-FCV), where the best k value is selected by the maximum value of classification accuracy. In some studies, the k-NN algorithm is trained by K-FCV, in which the best k is selected according to the maximum classification accuracy rate [8, 9, 10]. In another paper, Onder A. and Temel K. [11] used the leave-one-out cross-validation (LOO-CV) method to determine the optimum k value. They utilized LOO-CV since it makes the best use of the available data and avoids the problems of random selection; however, this algorithm has a high response time when the data set is large. In another k-selection algorithm, Temel K. and Onder A. [12] used a sub-sampling method: they repeated it 30 times, computed the CAR on the validation set for different k values, selected the k with maximum CAR, and used it on the testing data set. As can be seen from the literature, in many studies the value of k is selected by many trials on the training and validation sets; but these selection methods are often based on chance and so are not acceptable. In this work, an awareness algorithm for selecting the optimum k using cross-validation methods is proposed. This algorithm identifies the data set and then selects the optimum k; it does not select k by chance. The performance of the proposed


method was tested using an artificial data set and six different medical datasets, downloaded from the University of California, Irvine (UCI) machine learning data repository.

The rest of the paper is organized as follows: the next section describes the data sets and introduces the popular cross-validation methods. Section 3 explains the proposed algorithm and the experimental results and compares our algorithm with other classical algorithms. Finally, the conclusion is presented in the last section.

II. MATERIAL AND METHODS

A. Description of data set
In the following subsections, we describe the data sets used.

a. Description of artificial data set
Three types of data set (A, B, and C) with different variances, means, and distributions in two classes (class1 and class2) were made. These three types of distributions embody different hypotheses, so changes in sample distributions affect the operation of the classifiers in various ways. In Fig. 1, these three types of distributions are presented. Observations were 2-dimensional and their number was the same in each class. The variance and mean of the different distributions of the data set are listed in Table I, in which Dim. denotes dimension. For example, in type 'A', the mean of the first component of class1 is 3 and that of the second component is 6, while the variance of the first component of class1 is 1.5 and that of the second component is 15. The means of the first and second components of class2 are 3 and 6, respectively, and the variances of the first and second components of class2 are 15 and 1.5, respectively.

TABLE I. MEAN AND VARIANCE OF ARTIFICIAL DATA SETS WITH DIFFERENT DISTRIBUTIONS

                                        Type A      Type B      Type C
Mean (First Dim., Second Dim.)  Class1  (3,6)       (3,6)       (3,6)
                                Class2  (3,6)       (3,6)       (3,10)
Variance (First Dim., Second Dim.) Class1 (1.5,15)  (8,8)       (1.5,1.5)
                                Class2  (15,1.5)    (1.5,1.5)   (1.5,1.5)

To assess the effectiveness of the proposed algorithm, it was tested on data sets with 200 and 1000 observations separately. Table II shows the details of the artificial data sets.

TABLE II. ARTIFICIAL DATA SET CHARACTERISTICS

Dataset Name   Total No. of Observations   Total No. of Features   Total No. of Classes
Type A         200 / 1000                  2                       2
Type B         200 / 1000                  2                       2
Type C         200 / 1000                  2                       2

Fig. 1. Three types of distributions generated by the rand function with seed 12 in the Matlab environment

b. Description of UCI data set
To evaluate the effectiveness of the proposed algorithm on real data, classification of data from the UCI machine learning repository was performed. The six data sets used for this evaluation are described in detail at the UCI website. A summary of each data set is given in Table III.

TABLE III. UCI DATA SET CHARACTERISTICS

Data set Name            Total No. of Instances   Total No. of Features   Total No. of Classes
Breast Cancer Wisconsin  699                      10                      2
Pima Indians Diabetes    768                      8                       2
Bupa                     345                      6                       2
Heart                    270                      13                      2
Thyroid                  7200                     21                      3
Iris                     150                      4                       3

B. Proposed method for selecting optimum k
In the following subsections, we describe our methods for selecting the optimum k value using K-fold cross-validation and LOO-CV, separately.

a. K-fold cross-validation
K-fold cross-validation (K-FCV) is one of the most widely adopted criteria for assessing the performance of a model and for selecting a hypothesis within a class. An advantage of this method over simple training-testing data splitting is the repeated use of the whole available data for both building a learning machine and testing it; hence, it reduces the risk of (un)lucky splitting [13]. In the K-fold cross-validation method, the data set is randomly split into K subsets of equal size and the method is repeated K times. Each time, one of the K subsets is


used as the validation set and the other K-1 subsets are put
together to form the pre-training set.

We illustrate the use of the proposed method with an example.
If K=10, the training data set is divided into 10 parts; in each
iteration, 9 parts are used for pre-training and the remaining
part for pre-testing. Then, we check the k nearest neighbors for
the samples. The CAR value for every value of k is calculated 10
times (due to K=10), and the average CAR is calculated for each k
over these 10 repetitions. Finally, the k with the highest CAR is
selected; this value of k is the optimum k. Our results are
demonstrated in Table IV for distribution type A (200 samples in
total). As shown in this table, the average CAR (over 10 folds)
for k=7 neighbors is 0.965, so k=7 is the optimum among the
candidate values up to 25 nearest neighbors. A common problem in
cross-validation methods is the number of folds into which the
training set is divided. In this paper, we checked two kinds of
folds.

TABLE IV. SELECTION OF K VALUE FOR DISTRIBUTION TYPE 'A' BY PROPOSED METHOD

  k / fold   1     2   3     4   5     6     7     8     9     10    Avg. CAR
  k=1        0.90  1   0.9   1   0.85  0.8   0.9   0.95  1     0.85  0.915
  k=3        0.95  1   0.9   0.95 0.85 0.95  0.95  0.9   1     0.85  0.930
  k=5        0.95  1   1     0.95 0.85 0.95  0.95  0.9   0.95  0.9   0.940
  k=7        0.95  1   1     1   0.9   0.95  1     0.95  0.95  0.95  0.965
  k=9        0.95  1   0.9   1   0.9   0.95  0.9   0.95  0.95  0.95  0.945
  k=11       0.95  1   0.9   1   0.9   1     0.9   0.95  0.95  0.95  0.950
  k=13       0.95  1   0.9   1   0.9   1     0.9   0.95  0.95  0.95  0.950
  k=15       0.95  1   0.9   1   0.9   1     0.9   0.95  0.95  0.95  0.950
  k=17       0.95  1   0.9   1   0.9   1     0.9   0.95  0.95  0.95  0.950
  k=19       0.95  1   0.9   1   0.9   1     0.9   0.95  0.95  0.95  0.950
  k=21       0.95  1   0.9   1   0.9   0.95  0.9   0.95  0.95  0.95  0.945
  k=23       0.95  1   0.85  1   0.9   1     0.95  0.95  0.95  0.85  0.940
  k=25       0.90  1   0.85  1   0.9   1     0.9   0.95  0.95  0.8   0.925

b. Leave-one-out cross-validation

LOO-CV is a particular case of K-FCV with K=N, where N is the
size of the training set. Hence, the validation sets are all of
size one. As in the other algorithms, the training data set is
divided into two groups. The procedure of the LOO-CV method is to
take one out of the N observations and use the remaining N-1
observations as the training set for deriving the parameters of
the classifier [14]. This process is repeated for all N
observations to obtain the estimate of the classification
accuracy.

The proposed method was applied with LOO-CV in the same way as
with K-FCV. If K=N, the training data set is divided into N
parts; in each iteration, K-1 parts are used for pre-training and
the rest for pre-testing. Then, we check the k nearest neighbors
for the samples. The CAR value is calculated for all values of k,
N times (due to K=N), and the average CAR is calculated for each
k over these N repetitions. Finally, the k with the highest CAR
is selected; this value of k is the optimum k for that data set.

III. EXPERIMENTAL RESULTS

We begin with three types of artificial data sets on two-class
classification problems. These data sets were described earlier
in Section 2. MATLAB R2014a was the environment used for creating
the data sets. By using artificial data sets, the number of
available observations was controlled and noise was added
according to the experimental purpose. The rand function in
MATLAB R2014a was used to make the artificial data sets with
seed 12. The proposed algorithm was tested and checked on three
different distributions of the data set. For all the artificial
data sets, the results were reported based on both Euclidean and
Manhattan distances. The data sets were randomly divided into
training and testing observations of the same number. For
example, when there were 200 observations in total, in each
partition 100 observations (50 from each class) were used for
training and 100 observations (50 from each class) for testing.
To test the effect of the fold count in cross-validation, K was
set to 10, 20, and N (for LOO-CV). When there were 200 or 1000
observations in total, N was 100 and 500, respectively. In the
training stage, the proposed method tried to find the optimum k
value between 1 and 25, and between 1 and 50, when the total
number of observations was 200 and 1000, respectively. To verify
the proposed method, it was repeated 10 times on each data set.
Table V shows the CARs and standard deviations for the three
different distributions of the artificial data set when the total
number of observations was 1000. Table VI shows the results for
the data set with 200 observations in total. These results were
compared with those of classical training methods.

Here, along with the artificial data sets, six real data sets
(Breast Cancer Wisconsin, Pima Indians Diabetes, Bupa, Heart,
Thyroid and Iris) were used for illustration. For all the data
sets, where the measurement variables were of the same unit and
scale, the results were computed based on both Euclidean and
Manhattan distances. Those sets were formed by randomly
partitioning the data. Each data set was randomly divided into
training and testing observations of almost the same number. To
test the effect of the fold count in cross-validation, K was set
to 10, 20, and N (for LOO-CV), as for the artificial data sets.
To satisfy the values K=10 and K=20 in some real data sets, the
numbers of training and testing observations were not equal. For
example, when there were 268 observations in the Pima data set,
140 observations (70 from each class) formed the training
partition and 128 observations (64 from each class) were used for
testing. In the training stage, the proposed method tried to find
the optimum k value between 1 and 25 in all the data sets. To
verify the proposed method, it was repeated 10 times on each data
set. Tables VII and VIII show these results for the two-class and
three-class problems; the CARs and standard deviations for all
data sets are shown in these tables. The present results were
compared with those of classical training methods. As shown in
Tables V, VI, VII, and VIII, the proposed method showed a 0.1% -
4% higher CAR than the other methods.

When the number of observations in the data set is high, the
learning task needs a long time and using LOO-CV is not suitable.
In this case, K-FCV is a good way to carry out the learning task.
The results showed that the present algorithm with K-FCV was a
very good way of finding the optimum value of k when the number
of observations was high, because the response time of the
proposed algorithm was very low. These results also illustrated
that the size of the folds in the K-FCV algorithm did not greatly
affect the results. As mentioned, the results were computed based
on both Euclidean and Manhattan distances, and it was found that
the kind of distance did not affect the proposed method.
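The selection procedure described above (split the training data into K folds, compute the CAR of each candidate k on the held-out fold, average over the folds, and keep the k with the highest mean CAR) can be sketched as follows. This is a minimal editorial illustration on a toy two-class data set: the data, cluster centres and random seed are invented for the demonstration and are not those of the paper. Setting n_folds equal to the number of samples gives the LOO-CV variant.

```python
import random
from collections import Counter

def knn_predict(train, x, k):
    """Majority vote among the k nearest neighbours of x (squared Euclidean)."""
    nearest = sorted(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def select_k_by_cv(data, candidate_ks, n_folds):
    """Return the k with the highest mean classification accuracy rate (CAR)
    over n_folds cross-validation folds; n_folds == len(data) gives LOO-CV."""
    data = list(data)
    random.shuffle(data)
    folds = [data[i::n_folds] for i in range(n_folds)]
    best_k, best_car = None, -1.0
    for k in candidate_ks:
        cars = []
        for i in range(n_folds):
            pre_test = folds[i]                               # pre-testing fold
            pre_train = [p for j, f in enumerate(folds) if j != i for p in f]
            hits = sum(knn_predict(pre_train, x, k) == y for x, y in pre_test)
            cars.append(hits / len(pre_test))
        car = sum(cars) / n_folds                             # average CAR for this k
        if car > best_car:
            best_k, best_car = k, car
    return best_k, best_car

# Toy two-class training set: two Gaussian clusters (illustrative only).
random.seed(12)
data = [((random.gauss(0, 1), random.gauss(0, 1)), 0) for _ in range(100)] \
     + [((random.gauss(2.5, 1), random.gauss(2.5, 1)), 1) for _ in range(100)]
k_opt, car = select_k_by_cv(data, range(1, 26, 2), n_folds=10)
print("optimum k =", k_opt, "mean CAR =", round(car, 3))
```

The odd candidate values 1, 3, ..., 25 mirror Table IV; an odd k avoids ties in the two-class majority vote.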
TABLE V. RESULTS FOR ALL DISTRIBUTIONS OF ARTIFICIAL DATA SET WITH 1000 OBSERVATIONS IN TOTAL

  Distance / Folds      Method            Type A           Type B           Type C
  City block, 10-Fold   Proposed method   0.9221±0.0075    0.9225±0.0030    0.8971±0.0056
                        Onder' method     0.9118±0.0057    0.9095±0.0053    0.8949±0.0053
                        Duda' method      0.9112±0.0058    0.9079±0.0071    0.8978±0.0066
  City block, 20-Fold   Proposed method   0.9236±0.0060    0.9243±0.0075    0.8781±0.0082
                        Onder' method     0.9170±0.0103    0.9036±0.0090    0.8720±0.0119
                        Duda' method      0.9133±0.0079    0.9086±0.0077    0.8984±0.0089
  City block, N-Fold    Proposed method   0.9224±0.0062    0.9241±0.0034    0.9035±0.0069
                        Onder' method     0.9051±0.0057    0.8784±0.0067    0.8530±0.0095
                        Duda' method      0.9040±0.0055    0.9085±0.0067    0.9035±0.0062
  Euclidean, 10-Fold    Proposed method   0.9240±0.0055    0.9245±0.0081    0.8903±0.0056
                        Onder' method     0.9023±0.0094    0.9008±0.0073    0.8974±0.0086
                        Duda' method      0.9025±0.0068    0.9110±0.0094    0.9013±0.0072
  Euclidean, 20-Fold    Proposed method   0.9264±0.0046    0.9249±0.0049    0.8990±0.0060
                        Onder' method     0.9270±0.0057    0.9042±0.0056    0.8907±0.0104
                        Duda' method      0.9243±0.0037    0.9087±0.0068    0.8997±0.0066
  Euclidean, N-Fold     Proposed method   0.9184±0.0053    0.9235±0.0044    0.9007±0.0070
                        Onder' method     0.9049±0.0044    0.8798±0.0046    0.8480±0.0064
                        Duda' method      0.9190±0.0054    0.9094±0.0050    0.9027±0.0068
TABLE VI. RESULTS FOR ALL DISTRIBUTIONS OF ARTIFICIAL DATA SET WITH 200 OBSERVATIONS IN TOTAL

  Distance / Folds      Method            Type A           Type B           Type C
  City block, 10-Fold   Proposed method   0.9190±0.0160    0.9130±0.0071    0.9010±0.0204
                        Onder' method     0.9020±0.0132    0.9025±0.0175    0.8950±0.0255
                        Duda' method      0.9065±0.0194    0.8810±0.0122    0.9020±0.0254
  City block, 20-Fold   Proposed method   0.9250±0.0131    0.8860±0.0209    0.9030±0.0153
                        Onder' method     0.9175±0.0223    0.8685±0.0253    0.8830±0.0275
                        Duda' method      0.9125±0.0136    0.8685±0.0176    0.9025±0.0174
  City block, N-Fold    Proposed method   0.9145±0.0172    0.9080±0.0130    0.8995±0.0174
                        Onder' method     0.8990±0.0242    0.8750±0.0111    0.8610±0.0250
                        Duda' method      0.9095±0.0109    0.8790±0.0250    0.8970±0.0189
  Euclidean, 10-Fold    Proposed method   0.9140±0.0287    0.9115±0.0251    0.9110±0.0223
                        Onder' method     0.9025±0.0241    0.9055±0.0318    0.9025±0.0190
                        Duda' method      0.9075±0.0241    0.8910±0.0204    0.9000±0.0204
  Euclidean, 20-Fold    Proposed method   0.9180±0.0125    0.9095±0.0205    0.9085±0.0257
                        Onder' method     0.9080±0.0236    0.8915±0.0300    0.8935±0.0252
                        Duda' method      0.9180±0.0226    0.8695±0.0206    0.9095±0.0251
  Euclidean, N-Fold     Proposed method   0.9220±0.0225    0.9075±0.0206    0.8960±0.0313
                        Onder' method     0.9015±0.0267    0.8845±0.0254    0.8660±0.0273
                        Duda' method      0.9005±0.0228    0.8660±0.0254    0.8990±0.0211
TABLE VII. RESULTS FOR REAL DATA SET WITH TWO CLASSES

  Distance / Folds      Method            Pima (140/128)   Wisconsin (120/119)  Bupa (80/65)     Heart (60/60)
  City block, 10-Fold   Proposed method   0.7078±0.0211    0.9647±0.0147        0.6277±0.0342    0.6967±0.0276
                        Onder' method     0.7016±0.0239    0.9597±0.0180        0.6362±0.0441    0.6833±0.0340
                        Duda' method      0.7094±0.0281    0.9513±0.0174        0.6269±0.0329    0.6842±0.0307
  City block, 20-Fold   Proposed method   0.7399±0.0133    0.9739±0.0118        0.6408±0.0301    0.6992±0.0268
                        Onder' method     0.7063±0.0178    0.9567±0.0074        0.6162±0.0608    0.6883±0.0500
                        Duda' method      0.7063±0.0220    0.9504±0.0071        0.6231±0.0274    0.6900±0.0378
  City block, N-Fold    Proposed method   0.7364±0.0220    0.9768±0.0187        0.6377±0.0574    0.7033±0.0233
                        Onder' method     0.6465±0.0232    0.9592±0.0179        0.5900±0.0514    0.6367±0.0193
                        Duda' method      0.7145±0.0096    0.9555±0.0171        0.6438±0.0570    0.6958±0.0252
  Euclidean, 10-Fold    Proposed method   0.7061±0.0210    0.9772±0.0099        0.6223±0.0268    0.6375±0.0255
                        Onder' method     0.6926±0.0252    0.9667±0.0156        0.6285±0.0366    0.6417±0.0255
                        Duda' method      0.6918±0.0303    0.9535±0.0098        0.6192±0.0374    0.6475±0.0319
  Euclidean, 20-Fold    Proposed method   0.6887±0.0315    0.9873±0.0075        0.6208±0.0399    0.6592±0.0234
                        Onder' method     0.6871±0.0268    0.9618±0.0121        0.6262±0.0359    0.6250±0.0412
                        Duda' method      0.6930±0.0281    0.9605±0.0084        0.6315±0.0335    0.6317±0.0222
  Euclidean, N-Fold     Proposed method   0.7016±0.0312    0.9723±0.0060        0.6246±0.0351    0.6350±0.0340
                        Onder' method     0.6535±0.0334    0.9664±0.0114        0.5831±0.0427    0.6033±0.0261
                        Duda' method      0.6875±0.0128    0.9588±0.0092        0.6192±0.0361    0.6267±0.0218
TABLE VIII. RESULTS FOR REAL DATA SET WITH THREE CLASSES

  Distance / Folds      Method            Iris (25/25)     Thyroid (90/76)
  City block, 10-Fold   Proposed method   0.9573±0.0216    0.7030±0.0405
                        Onder' method     0.9480±0.0160    0.6825±0.0397
                        Duda' method      0.9587±0.0203    0.6829±0.0582
  City block, 20-Fold   Proposed method   0.9573±0.0216    0.7086±0.0283
                        Onder' method     0.9427±0.0199    0.6833±0.0348
                        Duda' method      0.9520±0.0157    0.6601±0.0342
  City block, N-Fold    Proposed method   0.9680±0.0129    0.6850±0.0299
                        Onder' method     0.9507±0.0167    0.6632±0.0248
                        Duda' method      0.9680±0.0157    0.6658±0.0308
  Euclidean, 10-Fold    Proposed method   0.9780±0.0143    0.6533±0.0315
                        Onder' method     0.9613±0.0098    0.6298±0.0361
                        Duda' method      0.9667±0.0169    0.6167±0.0436
  Euclidean, 20-Fold    Proposed method   0.9627±0.0371    0.6512±0.0233
                        Onder' method     0.9627±0.0138    0.6373±0.0339
                        Duda' method      0.9500±0.0327    0.5982±0.0314
  Euclidean, N-Fold     Proposed method   0.9560±0.0209    0.6512±0.0409
                        Onder' method     0.9560±0.0090    0.6386±0.0326
                        Duda' method      0.9627±0.0225    0.6184±0.0333

IV. CONCLUSION

The k-nearest neighbor (k-NN) algorithm is one of the most
popular supervised learning algorithms and a lazy learning
classification method. k-NN's performance is highly competitive
with other techniques. There are several key issues that affect
the performance of k-NN, one of which is the distribution of the
data set. The value of k plays an important role in the decision
on an unknown pattern in k-NN, so it can be considered another
issue. In this paper, a new method was presented for selecting
the optimum number of nearest neighbors k using cross-validation
methods in the k-NN algorithm. The results were computed based on
both Euclidean and Manhattan distances. It was found that the
kind of distance and the number of folds in the cross-validation
methods did not affect the proposed method. The proposed method
was also fully automatic, with no user-set parameters. In the
experiments, the proposed algorithm could decide the most optimal
k value in comparison with other algorithms according to the
achieved classification accuracy rates (CAR). This algorithm was
applied to different distributions of artificial and real data
sets.

REFERENCES
[1] Yooii K. Kim and Joon H. Han, "Fuzzy K-NN Algorithm using Modified K-Selection", 0-7803-2461-7/95, International Conference on Fuzzy Systems and The Second International Fuzzy Engineering Symposium, IEEE, 1995.
[2] Fix, E., Hodges, J.L., "Discriminatory analysis - nonparametric discrimination: consistency properties", International Statistical Review, Project 21-49-004, Report 4, pp. 261-279, US, 1951.
[3] Cover, T.M., Hart, P.E., "Nearest neighbor pattern classification", IEEE Trans. Inform. Theory, 13, pp. 21-27, 1968.
[4] Wu X., Kumar V. and J. Ross Quinlan, "Top 10 algorithms in data mining", Knowledge and Information Systems, Volume 14, Issue 1, pp. 1-37, 2008.
[5] Duda R., Hart P. E., and Stork D. G., "Pattern Classification", 2nd edition, John Wiley, 2000.
[6] Enas, G. G. and Choi, S. C., "Choice of the smoothing parameter and efficiency of k-nearest neighbor classification", Comput. Math. Applic. A, 12, pp. 235-244, 1986.
[7] Ripley, B.D., "Pattern Recognition and Neural Networks", Cambridge University Press, Cambridge, 1996.
[8] Efron B., "Estimating the Error Rate of a Prediction Rule: Improvement on Cross-Validation", Journal of the American Statistical Association, Vol. 78, No. 382, pp. 316-331, 1983.
[9] Huang, P., Lee, C. H., "Automatic Classification for Pathological Prostate Images Based on Fractal Analysis", IEEE Transactions on Medical Imaging, Vol. 28, No. 7, July 2009.
[10] Onder, A. and Temel, K., "Wavelet Transform Based Classification of Invasive Brain Computer Interface Data", Radioengineering, 20(1), pp. 31-38, 2011.
[11] Onder, A. and Temel, K., "Comparative Performance Assessment of Classifiers in Low-Dimensional Feature Space Which are Commonly Used in BCI Applications", Elektrorevue, 2(4), pp. 58-63, 2011.
[12] Temel, K. and Onder, A., "A Polynomial Fitting and k-NN Based Approach for Improving Classification of Motor Imagery BCI Data", Pattern Recognition Letters, 31(11), pp. 1207-1215, 2010.
[13] Anguita, D., Ridella, S., Fabio R., "K-Fold Generalization Capability Assessment for Support Vector Classifiers", Proceedings of the International Joint Conference on Neural Networks, Montreal, Canada, July 31 - August 4, 2005.
[14] Alippi C., M. Roveri, "Virtual k-fold cross validation: an effective method for accuracy assessment", in Proc. IEEE International Joint Conference on Neural Networks (IEEE IJCNN 2010), Barcelona, Spain, July 18-23, 2010.

AUTHORS PROFILE

Masoud Maleki received his Master degree in Electronic
Engineering from Azad University, Iran, in 2010 and his B.Sc.
degree in Telecommunication Engineering from Azad University,
Iran, in 2007. He is currently pursuing his Ph.D. in Biomedical
Engineering at Karadeniz Technical University, Trabzon, Turkey.
His research interests are Signal and Image Processing and
Brain-Computer Interfacing.

Negin Manshouri received the B.Sc. degree in Telecommunication
Engineering from Islamic Azad University in 2010. She has
practical experience in the fields of microwave and mobile
communication as an antenna designer. Her research interests
include the design and analysis of different kinds of microstrip
antennas, ultra-wideband antennas, and the field of biomedical
engineering. She received the M.S. degree in Telecommunication
Engineering from Islamic Azad University in 2013. She is
currently working toward the Ph.D. degree in Biomedical
Engineering at Karadeniz Technical University, Trabzon, Turkey.
Temel Kayıkçıoğlu received the Ph.D. degree in Electrical
Engineering from Texas Tech University, USA, in 1993, the M.S.
degree in Electrical Engineering from Karadeniz Technical
University, Trabzon, Turkey, in 1986, and the B.Sc. degree in
Electrical Engineering from Karadeniz Technical University,
Trabzon, Turkey, in 1984. He is a Professor in the Department of
Electrical and Electronics Engineering, Karadeniz Technical
University, Trabzon, Turkey. His research interests include
Signal and Image Processing, Image Reconstruction, Medical Image
and Signal Processing, Computational Neuroscience, and
Brain-Computer Interfacing. He has authored and co-authored many
research papers in international journals and has presented his
research work at various international conferences.
A PROPOSED METHOD FOR FACE IMAGE EDGE DETECTION USING MARKOV BASIS

Husein Hadi Abbass #1, Zainab Radhi Mousa #2
# Department of Mathematics, Faculty of Education for Girls, University of Kufa, Najaf, IRAQ
1 [email protected]
2 [email protected]
Abstract—Edge detection is one of the basic steps in image
processing, image pattern recognition, image analysis and
computer vision techniques, and it is involved in solving many
complex problems. An edge is determined on the basis of the
boundary between two different areas of color intensity in the
image. In this paper we suggest an edge detection method for
detecting facial expressions, picking up the edges of the eyes,
mouth and other facial features of a human face image. We
compare the facial expressions obtained with the suggested
method against those resulting from traditional edge detection
methods (Sobel, Prewitt, Kirsh, Robinson, Marr-Hildreth, LoG and
Canny edge detection). The method depends on a suggested filter
resulting from combining some elements of the Markov basis found
by H. H. Abbas and H. S. Mohammed Hussein in 2014 with Laplace
filters. The results of applying this suggested method to color
or gray images are more accurate and clearer than those of the
traditional methods.

Keywords: image processing, edge detection, color and gray
image, Laplace filter, Gaussian, Markov basis.

I. INTRODUCTION

Edge detection is one of the most commonly used operations in
image processing and pattern recognition; the reason for this is
that edges form the outline of an object. An edge is the
boundary between an object and the background, and indicates the
boundary between overlapping objects. Since computer vision
involves the identification and classification of objects in an
image, edge detection is an essential tool. Efficient and
accurate edge detection will lead to an increase in the
performance of subsequent image processing techniques, including
image segmentation, object-based image coding, and image
retrieval. The purpose of edge detection in general is to
significantly reduce the amount of data in an image, while
preserving the structural properties to be used for further
image processing. The major property of an edge detection
technique is its ability to extract the exact edge line with
good orientation. The main features can be extracted from the
edges of an image, and edge detection is a major step in image
analysis. These features are used by advanced computer vision
algorithms. Spatial masks can be used to detect all three types
of discontinuities in an image. There are many edge detection
techniques: Roberts edge detection, Sobel edge detection,
Prewitt edge detection, Kirsh edge detection, Robinson edge
detection, Marr-Hildreth edge detection, LoG edge detection and
Canny edge detection [1][2].

II. RELATED WORKS

Dr. S. Vijayarani and Mrs. M. Vinupriya (October 2013): edge
detection methods transform original images into edge images
using the changes of grey tones in the image. Two edge detection
algorithms, namely the Canny and Sobel edge detection
algorithms, are used to extract edges from facial images, which
are then used to detect faces. Performance factors, namely
accuracy and speed, are analyzed to find out which algorithm
works better. From the experimental results, it is observed that
the Canny edge detection algorithm works better than the Sobel
edge detection algorithm [3].

Abdallah A. Alshennawy and Ayman A. Aly (2009): a method based
on a fuzzy logic reasoning strategy is proposed for edge
detection in digital images without determining the threshold
value. The proposed approach begins by segmenting the images
into regions using a floating 3x3 binary matrix. The
edge pixels are mapped to a range of values distinct from each
other. The robustness of the results of the proposed method for
different captured images is compared to those obtained with the
linear Sobel operator [4].

Bijay Neupane, Zeyar Aung and Wei Lee Woon (April 2012): many
approaches to edge detection have been proposed; the basic
approach is to search for an abrupt change in color, intensity
or other properties. A new method is proposed for edge detection
which uses k-means clustering, where different properties of
image pixels are used as features. The quality of the different
clusterings obtained using different k values (i.e., the
predefined number of clusters) is analyzed in order to choose
the best number of clusters. The advantage of this approach is
that it shows higher noise resistance compared to existing
approaches [5].

Muthukrishnan. R and M. Radha (Dec 2011): interpretation of
image contents is one of the objectives in computer vision. In
image interpretation, the partition of the image into object and
background is a critical step. Segmentation separates an image
into its component regions or objects, and needs to separate the
object from the background to read the image properly and
identify its content carefully. In this context, edge detection
is a fundamental tool for image segmentation [6].

Genyun Sun, Qinhuo Liu, Qiang Liu, Changyuan Ji and Xiaowen Li
(2007): an edge detection algorithm based on the law of
universal gravity. The algorithm assumes that each image pixel
is a celestial body with a mass represented by its grayscale
intensity. Accordingly, each celestial body exerts forces onto
its neighboring pixels and in return receives forces from the
neighboring pixels. These forces can be calculated by the law of
universal gravity. The vector sums of all gravitational forces
along, respectively, the horizontal and the vertical directions
are used to compute the magnitude and the direction of signal
variations. The proposed algorithm was tested and compared with
conventional methods such as Sobel, LoG, and Canny [7].

III. THE MARKOV BASIS

In this section, we describe the Markov basis as in A. Takemura
and S. Aoki (2004). Let n be the number of elements in a finite
set I. An element i of I is called a cell; it is often a
multi-index i1...im. A non-negative integer xi in N denotes the
frequency of a cell i, where N = {0, 1, 2, 3, ...}. The set of
frequencies x = {xi}, i in I, is called a contingency table. A
contingency table x = {xi}, i in I, can be written as an
n-dimensional column vector of non-negative integers in N^n. Let
Z be the set of integers and let aj in Z^v, j = 1, ..., v,
denote fixed column vectors consisting of integers. The
v-dimensional column vector t = (t1, t2, ..., tv)' is defined by
tj = aj' x, where ' denotes the transpose of a vector or a
matrix. If A = [a1 ... av]' is the v x n matrix with jth row
aj', then t = Ax. The set A^-1(t) = {x in N^n : Ax = t} (the
t-fiber) is the set of contingency tables x for a given t, and
is considered for performing similar tests. If ~ is the relation
x1 ~ x2 if and only if x1 - x2 belongs to the kernel of A,
ker(A), this relation is an equivalence relation on N^n and the
set of t-fibers is the set of its equivalence classes. An
n-dimensional column vector z = {zi} in Z^n is called a move if
z belongs to ker(A), i.e., Az = 0. A finite set of moves B is
called a Markov basis if, for all t, A^-1(t) constitutes one
B-equivalence class [9].

They proved that the number of elements in B equals
(n^2 - 3n)/3. If n = 9 then the number of elements in B is
(n^2 - 3n)/3 = (9^2 - 3*9)/3 = 18, and these elements are [8]:

  Z1  = [ 1 -1  0 ;  -1  1  0 ;   0  0  0 ]
  Z2  = [ 0  0  0 ;   1 -1  0 ;  -1  1  0 ]
  Z3  = [ 1  0 -1 ;  -1  0  1 ;   0  0  0 ]
  Z4  = [ 0  0  0 ;   1  0 -1 ;  -1  0  1 ]
  Z5  = [ 0  1 -1 ;   0 -1  1 ;   0  0  0 ]
  Z6  = [ 0  1 -1 ;   0  0  0 ;   0 -1  1 ]
  Z7  = [ 0  0  0 ;   0  1 -1 ;   0 -1  1 ]
  Z8  = [ 1 -1  0 ;   0  0  0 ;  -1  1  0 ]
  Z9  = [ 1  0 -1 ;   0  0  0 ;  -1  0  1 ]
  Z10 = [-1  1  0 ;   1 -1  0 ;   0  0  0 ]
  Z11 = [ 0  0  0 ;  -1  1  0 ;   1 -1  0 ]
  Z12 = [-1  0  1 ;   1  0 -1 ;   0  0  0 ]
  Z13 = [ 0  0  0 ;  -1  0  1 ;   1  0 -1 ]
  Z14 = [ 0 -1  1 ;   0  1 -1 ;   0  0  0 ]
  Z15 = [ 0 -1  1 ;   0  0  0 ;   0  1 -1 ]
  Z16 = [ 0  0  0 ;   0 -1  1 ;   0  1 -1 ]
  Z17 = [-1  1  0 ;   0  0  0 ;   1 -1  0 ]
  Z18 = [-1  0  1 ;   0  0  0 ;   1  0 -1 ]

IV. PROPOSED FILTERS FOR EDGE DETECTION

For n = 3 we will use some elements of the Markov basis B and
Laplace filters to generate a new filter. The following steps
illustrate this.

A. Filter Generation

Step 1
We use the Markov basis elements to find
Z'1 = sum_{i=1..5} Zi + sum_{j=14..18} Zj. We obtain the
following filter:

  Z'1(x, y) = [ 0 -1  1 ;   0 -1  1 ;   0  2 -2 ]

Step 2
We combine the Markov filter obtained in step (1) with the
Laplace filter to obtain the following filter:

  Z''1(x, y) = L(x, y) + Z'1(x, y)

where

  L(x, y) = [-2 -3 -2 ;  -3 20 -3 ;  -2 -3 -2 ]

We get the following filter:

  Z''1(x, y) = [-2 -4 -1 ;  -3 19 -2 ;  -2 -1 -4 ]

Step 3
We add one to the center of the filter (Z''1) to obtain the
filter (Z*1):

  Z*1(x, y) = Z''1(x, y) + [ 0  0  0 ;   0  1  0 ;   0  0  0 ]

Then

  Z*1(x, y) = [-2 -4 -1 ;  -3 20 -2 ;  -2 -1 -4 ]

Step 4
We combine the filter obtained in step (3) with a special filter
(F) to obtain the following filter (Z*2):

  Z*2(x, y) = Z*1(x, y) + F(x, y)

where

  F(x, y) = [ F1 F2 F3 ;  F4  a F6 ;  F7 F8 F9 ]

B. Edge Detection

To detect the edges in an image, we use the following steps:

Step 1
Let f(x, y) be a color image with dimension m x n x 3 or a gray
image with dimension m x n.

Step 2
Apply the convolution process with the new 3 x 3 filter. Let
f(x, y) be an image matrix of dimension m x n, which can be
written as:

  f(x, y) = [ f11 f12 ... f1n ;
              f21 f22 ... f2n ;
              ...
              fm1 fm2 ... fmn ]
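The filter construction of Steps 1-3 can be reproduced numerically. The sketch below (an editorial illustration in NumPy, not code from the paper) sums the listed basis elements Z1..Z5 and Z14..Z18, adds the Laplace filter L, and adds one at the centre; it also checks that every element used is a valid move, i.e. that its row and column sums are zero, so adding it to a 3x3 table preserves the two-dimensional marginals t = Ax.

```python
import numpy as np

# Markov basis elements as listed in Section III (only the ones used in Step 1).
Z = {
    1:  [[ 1, -1,  0], [-1,  1,  0], [ 0,  0,  0]],
    2:  [[ 0,  0,  0], [ 1, -1,  0], [-1,  1,  0]],
    3:  [[ 1,  0, -1], [-1,  0,  1], [ 0,  0,  0]],
    4:  [[ 0,  0,  0], [ 1,  0, -1], [-1,  0,  1]],
    5:  [[ 0,  1, -1], [ 0, -1,  1], [ 0,  0,  0]],
    14: [[ 0, -1,  1], [ 0,  1, -1], [ 0,  0,  0]],
    15: [[ 0, -1,  1], [ 0,  0,  0], [ 0,  1, -1]],
    16: [[ 0,  0,  0], [ 0, -1,  1], [ 0,  1, -1]],
    17: [[-1,  1,  0], [ 0,  0,  0], [ 1, -1,  0]],
    18: [[-1,  0,  1], [ 0,  0,  0], [ 1,  0, -1]],
}
Z = {k: np.array(v) for k, v in Z.items()}

# Each element is a move: its row and column sums vanish, so adding it to a
# contingency table leaves the marginals (and hence t = Ax) unchanged.
for z in Z.values():
    assert not z.sum(axis=0).any() and not z.sum(axis=1).any()

# Step 1: Z'_1 = Z1 + ... + Z5 + Z14 + ... + Z18
Zp = sum(Z[i] for i in range(1, 6)) + sum(Z[j] for j in range(14, 19))

# Step 2: add the Laplace filter L.
L = np.array([[-2, -3, -2], [-3, 20, -3], [-2, -3, -2]])
Zpp = L + Zp

# Step 3: add one at the filter centre to get Z*_1.
Zstar = Zpp + np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]])

print(Zp.tolist())     # [[0, -1, 1], [0, -1, 1], [0, 2, -2]]      (matches Step 1)
print(Zstar.tolist())  # [[-2, -4, -1], [-3, 20, -2], [-2, -1, -4]] (matches Step 3)
```

Running this reproduces exactly the matrices Z'1, Z''1 and Z*1 stated in the steps above, which is a useful consistency check on the construction.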
And our filter Z*(x, y) is defined as follows:

  Z*(x, y) = [ Z*11 Z*12 Z*13 ;  Z*21 Z*22 Z*23 ;  Z*31 Z*32 Z*33 ]

Our filter is divided by (8/3):

  Z*(x, y) / (8/3) = Z(x, y) = [ Z11 Z12 Z13 ;  Z21 Z22 Z23 ;  Z31 Z32 Z33 ]  (3 x 3)

a. Overlay the convolution mask in the upper-left corner of the
image. Multiply coincident terms, sum, and put the result into
the image buffer at the location that corresponds to the mask's
current center.

Fig. 1: Multiplying the filter with the image.

Make the two arrays as in the figure, and calculate the
convolution equation for all pixels of the blurred matrix g:

  g(x, y) = f(x, y) (x) z(x, y) = sum_{i=1..3} sum_{j=1..3} f(i, j) * z(i, j)

So,

  g(1, 1) = (f11 * z11) + (f12 * z12) + (f13 * z13)
          + (f21 * z21) + (f22 * z22) + (f23 * z23)
          + (f31 * z31) + (f32 * z32) + (f33 * z33)

b. Move the mask one pixel to the right, multiply coincident
terms, sum, and place the new result into the buffer at the
location that corresponds to the new center location of the
convolution mask; continue to the end of the row.

Fig. 2: Multiplying the filter with the image.

After that we shift the filter z by one column, as in the
following:

  g(1, 2) = (f12 * z11) + (f13 * z12) + (f14 * z13)
          + (f22 * z21) + (f23 * z22) + (f24 * z23)
          + (f32 * z31) + (f33 * z32) + (f34 * z33)

We continue in this way until we reach g(p, q):

  g(a, b) = (f(a,b) * z11) + (f(a,b+1) * z12) + (f(a,b+2) * z13)
          + (f(a+1,b) * z21) + (f(a+1,b+1) * z22) + (f(a+1,b+2) * z23)
          + (f(a+2,b) * z31) + (f(a+2,b+1) * z32) + (f(a+2,b+2) * z33)

  for all 1 <= a <= (m + 2), 1 <= b <= (n + 2)
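The stepwise mask multiplication of (a)-(b) is an ordinary sliding-window sum with the 3x3 mask. A compact sketch follows (an editorial illustration; the step-edge test image is invented for the demonstration and is not from the paper):

```python
import numpy as np

def apply_mask(f, z):
    """g(a, b) = sum_{i,j} f(a+i, b+j) * z(i, j): slide the 3x3 mask z over
    the image f as in steps (a)-(b) above (no padding, so g is smaller)."""
    m, n = f.shape
    g = np.zeros((m - 2, n - 2))
    for a in range(m - 2):
        for b in range(n - 2):
            g[a, b] = np.sum(f[a:a + 3, b:b + 3] * z)
    return g

# The suggested filter Z*_1, divided by 8/3 as described above.
z = np.array([[-2, -4, -1], [-3, 20, -2], [-2, -1, -4]]) / (8 / 3)

# Toy image with a vertical step edge between columns 2 and 3.
f = np.zeros((6, 6))
f[:, 3:] = 10.0
g = apply_mask(f, z)
print(g[0].round(2).tolist())  # [0.0, -26.25, 30.0, 3.75]
```

The response is large where the mask straddles the step edge and small in the flat regions, which is the behaviour a sharpening/edge filter of this shape should exhibit.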
c. Lastly, apply the filter to the remaining part of the image.

Fig. 3: Multiplying the filter with the image.

So,

  g(m, n) = (f(m-2)(n-2) * z11) + (f(m-2)(n-1) * z12) + (f(m-2)(n) * z13)
          + (f(m-1)(n-2) * z21) + (f(m-1)(n-1) * z22) + (f(m-1)(n) * z23)
          + (f(m)(n-2) * z31) + (f(m)(n-1) * z32) + (f(m)(n) * z33)

The final form of the matrix g is:

  g(x, y) = [ g11 ... g1q ;  ... ;  gp1 ... gpq ]  (p x q), where p = m, q = n

  g(x, y) = [ g11 ... g1n ;  ... ;  gm1 ... gmn ]  (m x n)

Step 3
Smooth the image by using a Gaussian or median filter.

Step 4
We process a gray image to get a gray image with edges; for a
color image we treat each layer (Red (R), Green (G) and Blue
(B)) as a matrix to detect the edges, and later we recombine
them to get a color image with edges.

Step 5
The input filter is Z*3, which was previously created, with
dimension 3 x 3.

Step 6
Combine the three image layers (R1, G1, B1) to reshape the color
image.

Step 7
The resulting image is the edge-detected image, either a color
image or a gray image.

V. EXPERIMENTAL RESULT

We present in this section various edge detection techniques,
namely the Roberts edge detector, Sobel edge detector, Prewitt
edge detector, Kirsch, Robinson, Marr-Hildreth edge detector,
LoG edge detector and Canny edge detector. The edge detection
techniques were implemented using MATLAB R2015b and tested on
images. The objective is to produce edges that extract the
principal features of the image.

Fig. 4: Original gray image with the result of the suggested
method: (a) Original, (b) Suggested method, (c) Original, (d)
Suggested method.
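Steps 3-7 (smooth, filter each channel, recombine) can be sketched end-to-end. This is an editorial illustration: a 3x3 box blur stands in for the Gaussian/median smoothing of Step 3, and the toy RGB image is invented for the demonstration.

```python
import numpy as np

def detect_edges(img, z):
    """Per-channel pipeline of Steps 3-6: smooth each layer, convolve it with
    the mask z, then recombine the filtered layers (R1, G1, B1)."""
    def smooth(ch):                      # 3x3 box blur standing in for Step 3
        out = ch.copy()
        out[1:-1, 1:-1] = sum(ch[1 + di:ch.shape[0] - 1 + di,
                                 1 + dj:ch.shape[1] - 1 + dj]
                              for di in (-1, 0, 1) for dj in (-1, 0, 1)) / 9.0
        return out
    def conv(ch):                        # Step 2 mask multiplication
        m, n = ch.shape
        g = np.zeros((m - 2, n - 2))
        for a in range(m - 2):
            for b in range(n - 2):
                g[a, b] = np.sum(ch[a:a + 3, b:b + 3] * z)
        return g
    if img.ndim == 2:                    # gray image (m x n): a single layer
        return conv(smooth(img))
    layers = [conv(smooth(img[:, :, c])) for c in range(3)]
    return np.stack(layers, axis=2)      # Step 6: recombine the color layers

z = np.array([[-2, -4, -1], [-3, 20, -2], [-2, -1, -4]]) / (8 / 3)
rgb = np.zeros((8, 8, 3))
rgb[:, 4:, :] = 1.0                      # toy color image with a step edge
edges = detect_edges(rgb, z)
print(edges.shape)
```

Passing a single channel (a 2-D array) exercises the gray-image branch of Step 4, while a 3-D array goes through the per-layer R, G, B path.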
Fig. 5: Original gray image with the results of various edge
detection techniques: (a) Original, (b) Roberts, (c) Sobel, (d)
Prewitt, (e) Kirsch, (f) Robinson, (g) Marr-Hildreth, (h) LoG,
(i) Canny, (j) Fuzzy, (k) Laplacian, (l) Suggested method. The
Roberts, Sobel and Prewitt results deviated noticeably from the
others. Marr-Hildreth, LoG and Canny produce almost the same
edge map, and the Kirsch and Robinson edge maps are almost the
same. It is observed from the figure that the result of the
suggested method is superior by far to the other results.

Fig. 6: Original color image with the result of the suggested
method: (a) Original, (b) Suggested method, (c) Original, (d)
Suggested method.

Fig. 7: Testing the original color image with the other edge
detection techniques and the suggested method: (a) Original, (b)
Roberts, (c) Sobel, (d) Prewitt, (e) Kirsch, (f) Robinson, (g)
Marr-Hildreth, (h) LoG, (i) Canny, (j) Fuzzy, (k) Laplacian, (l)
Suggested method.

VI. CONCLUSION

In this paper, an attempt is made to review the edge detection
techniques which are based on discontinuities in intensity
levels. The relative performance of the various edge detection
techniques is assessed on an image by using MATLAB software. It
is observed from the results that the Marr-Hildreth, LoG and
Canny edge detectors produce almost the same edge map. The
suggested method detects the exact edges of the image without
noise and with high edge accuracy when compared with the
traditional edge detection methods.

REFERENCES
[1] Soumya Dutta and Bidyut Baran Chaudhuri, "A Color Edge Detection Algorithm in RGB Color Space", International Conference on Advances in Recent Technologies in Communication and Computing, November 21, 2009.
[2] John Canny, "A computational approach to edge detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(6), pp. 679-698, Nov. 1986.
[3] Dr. S. Vijayarani and Mrs. M. Vinupriya, "Performance Analysis of Canny and Sobel Edge Detection Algorithms in Image Mining", International Journal of Innovative Research in Computer and Communication Engineering, Vol. 1, Issue 8, October 2013.
475 https://sites.google.com/site/ijcsis/
ISSN 1947-5500

[4] Abdallah A. Alshennawy and Ayman A. Aly, "Edge Detection in Digital Images Using Fuzzy Logic Technique", World Academy of Science, Engineering and Technology 53, 2009.
[5] Bijay Neupane, Zeyar Aung and Wei Lee Woon, "A New Image Edge Detection Method using Quality-based Clustering", Technical Report DNA, April 2012.
[6] Muthukrishnan. R and M. Radha, "Edge Detection Techniques for Image Segmentation", International Journal of Computer Science & Information Technology (IJCSIT), Vol. 3, No. 6, Dec. 2011.
[7] Genyun Sun, Qinhuo Liu, Qiang Liu, Changyuan Ji and Xiaowen Li, "A novel approach for edge detection based on the theory of universal gravity", Pattern Recognition 40, pp. 2766-2775, 2007.
[8] H. H. Abbass and H. S. Mohammed Hussein, "On Toric Ideals for 3 × 3 × n-Contingency Tables with Fixed Two Dimensional Marginals, n is a Multiple of 3", European Journal of Scientific Research, Vol. 123, pp. 83-98, 2014.
[9] A. Takemura and S. Aoki, "Some Characterizations of Minimal Markov Basis for Sampling From Discrete Conditional Distributions", Annals of the Institute of Statistical Mathematics, pp. 1-17, 2003.


Face Recognition Age Invariant: A Closer Look


Divyanshu Sinha, KCCITM, Noida, India ([email protected])
Dr. JP Pandey, KNIT, Sultanpur, India
Dr. Bhavesh Chauhan, ABESIT, Ghaziabad, India

Abstract—Face recognition from images is a prevalent subject in biometrics research. Many public places have surveillance cameras for video capture, and these cameras are of considerable value for safety purposes. It is widely recognized that face recognition plays a significant role in surveillance frameworks, as it does not require the subject's assistance. The real benefits of face-based identification over other biometrics are uniqueness and acceptance. We first give a basic overview of face recognition and the diverse parameters that affect face shape, structure, and texture. Aging functions merged with an aging model are used to estimate age; the estimated age and the feature vector of the real image are then used to generate feature vectors at the target age. Shape and texture vectors represent a facial image by projecting it into the corresponding Eigen spaces. In this article the authors focus on the domain of age-invariant face recognition, give an overview of the existing research initiated in this area, and discuss the benefits and limitations of the approaches coined in the literature.

Keywords—Face Recognition, Aging

I. INTRODUCTION

The human face plays a very significant role in social interaction, and it carries an individual's identity. Using the human face as a key to security, machine recognition of faces has emerged as a large field of computation spanning numerous disciplines, for instance image processing, pattern recognition, computer vision, and neural networks. Biometric face recognition technology has received critical attention from both neuroscientists and computer vision researchers in the past years because of its potential for a large assortment of applications, both law enforcement and non-law enforcement, for example passports, credit cards, photo IDs, driving licenses, mug shots, and real-time matching of surveillance video images. As contrasted with other biometric systems using fingerprints and iris, face recognition has distinct benefits due to its non-contact procedure: face pictures can be captured from a large distance without the knowledge of the individual being checked, and verification does not require contact with the individual. Moreover, face recognition helps deter fraud, because face pictures that have been captured and archived can later help identify an individual. Research interest in the area of face recognition stems from the following facts: 1) increased emphasis on commercial research work, 2) increased requirement for surveillance-related frameworks because of drug trafficking, 3) re-emergence of neural networks with an emphasis on real-time computation and adaptation, and 4) the availability of real-time hardware.

A facial biometric checking system is used to verify the identity of persons within various framework and access control systems. Facial matching makes use of digital pictures of the face saved in a database and on a card. A digital picture is captured at registration into the framework, and is then matched against the live picture of a person upon an access attempt, in a procedure known as "matching". The output of the matching algorithm can be enhanced, that is, the rate of wrong matches and wrong accepts can be reduced, if the quality of the facial picture can be increased. Because of this, recent standards for biometric facial pictures specify normative demands for facial picture quality and provide good practices for biometric facial picture capture.

Face recognition is a simple task for humans; experiments in [12-20] have shown that even day-old children are able to differentiate between known faces. Facial recognition uses different parameters of the face, including the upper outlines, eye sockets, the areas around the cheekbones, the side view of the face, and the positions of the nose and eyes, to perform verification and identification. Most methods are resistant to changes in hairstyle, as they do not use the face region close to the hairline. When used in identification mode, facial recognition methods commonly give lists of near matches rather than a single specific match (as fingerprint and iris-scan methods do). The applicability of facial recognition methods is closely bound to the quality of the facial picture: low-quality pictures are much more likely to result in enrolment and matching problems than high-quality pictures. For example, large picture databases such as those for drivers' licenses or passports carry photographs of marginal quality, and performing matches against them leads to reduced accuracy. The same well-known errors occur with surveillance deployments. If facial pictures for enrolment and verification can be acquired in real settings with high-quality devices, system output improves substantially. For facial recognition at larger-than-usual distances, there is a strong relationship between camera quality and discrimination ability. Methods for 2D face recognition can be categorized into three groups: analytic, holistic, and hybrid techniques. While analytic techniques match salient facial components or parts detected from the face, holistic techniques make use of data derived from the full face pattern. By merging both local and global components, hybrid techniques attempt to generate a more complete representation of the facial picture. Face recognition relying upon the geometric parameters of a face is a very perceptive approach to face recognition.


The first automated face recognition method was explained in [2]: marker points, for example the locations of the eyes, ears, and nose, were used to build a feature vector, and recognition was done by evaluating the Euclidean distance between the feature vectors of a probe and a reference picture. This type of technique is robust against changes in illumination by its nature, but has large problems: precise registration of the marker points is tough, even with state-of-the-art algorithms. Further work on geometric recognition was carried out in [3]. The Eigenfaces technique explained in [4-28] takes a holistic approach to face recognition: a facial picture is a point in a high-dimensional image space, and a lower-dimensional representation of it is sought. The lower-dimensional subspace is calculated with principal component analysis, which finds the axes with maximum variance. This kind of transformation is optimal from a reconstruction standpoint, but it does not take class labels into account. In situations where the variance is generated by external factors such as lighting, the axes with high variance do not necessarily contain discriminative information, and classification becomes impractical; therefore a class-specific projection with Linear Discriminant Analysis was applied to face recognition in [11]. The usual arrangement is to reduce the variance within a class while increasing the variance between classes at the same time. Subsequently, various procedures for local feature extraction arose: to avoid the high dimensionality of the input data, only local regions of a picture are described, and the extracted features are (ideally) more robust against partial occlusion, illumination changes, and small sample size. Algorithms used for local feature extraction include Gabor Wavelets, the Discrete Cosine Transform, and Local Binary Patterns [9]. It is still an open research question how to preserve spatial information when applying local feature extraction, because spatial information is potentially useful. As with all biometrics, there are four stages: sample capture, feature extraction, template comparison, and matching; [12] clarifies the process flow of facial-scan techniques. Enrolment generally consists of a 20-30 second procedure whereby numerous pictures are taken of a single face. The set of pictures will include slightly different poses and facial expressions, to allow for more robust matching. After enrolment, distinctive features are extracted, resulting in the creation of a template. The template is much smaller than the picture from which it is derived: a facial picture can require 15 kB to 30 kB, while templates range from 84 bytes to 3000 bytes. Small templates are regularly used for 1:N matching.

Verification and identification follow the same steps. In verification, the individual being imaged is a cooperative subject, rather than uncooperative or non-cooperative: the individual claims an identity through a login name or a token, stands or sits before the camera for a few seconds, and is either matched or not matched. This comparison is based on the newly created match template measured against the reference template or templates on record. The point at which two templates are sufficiently similar to match, known as the threshold, can be adjusted for different personnel, PCs, times of day, and other factors. System design and development for facial-scan verification versus identification differ in an expansive number of ways. The real contrast is that identification does not use a claimed identity: rather than acquiring a PIN or user name and conveying confirmation or denial of the claim, identification frameworks attempt to answer the question "who am I?" [15]. When there is only a modest group of enrollees in the database, this task is not demanding; as databases grow large, into the tens and hundreds of thousands, the task becomes significantly more difficult. The system may only be able to narrow the database down to a number of likely candidates; human intervention may then be needed at the final verification stages. A second variable in identification is the relation between target subjects and the capture device. In identification one cannot assume an accommodating audience composed only of subjects who are motivated to use the system correctly. Facial-scan systems, depending on the exact type of deployment, may also have to be optimized for non-cooperative and uncooperative subjects [17]. Non-cooperative subjects are unaware that a biometric system is in place, or do not care whether they are recognized or not. Uncooperative subjects actively avoid recognition, and may use disguises or take evasive measures. Facial-scan technologies are much more capable with cooperative subjects, and are almost entirely incapable of recognizing uncooperative subjects [16]. A facial recognition device is one that views a photo or video of a person and compares it to one that is in the database. It does this by examining the structure, shape, and proportions of the face; the distances between the eyes, nose, mouth, and jaw; the upper outlines of the eye sockets; the sides of the mouth; the location of the nose and eyes; and the area surrounding the cheekbones. To keep a subject from using a photograph or mask while being checked in a facial recognition program, several attempts to establish security have been made: at the moment the user is being checked, they may be asked to blink, smile, or nod their head. Another security feature is the use of facial thermographs to record the heat pattern of the face. The essential facial recognition techniques are feature analysis, neural networks, Eigenfaces, and automatic face processing [17-18]. Some facial recognition software algorithms recognize faces by extracting features from a photo of the subject's face. Other algorithms normalize a gallery of face pictures and then compress the face data, saving only the data in the picture that can be used for facial recognition [21-22]. A test picture is then compared against the stored face data. A fairly new technique on the market is three-dimensional facial recognition. This method uses 3-D sensors to capture information about the shape of a face. The information is then used to identify distinctive features on the face, such as the contours of the eye sockets, nose, and chin. Advantages of 3-D facial recognition are that it is not affected by changes in lighting, and it can identify a face from a range of viewpoints, including a profile view. Another new approach in facial recognition uses the visual details of the skin, as captured in standard digital or scanned pictures. This procedure is called skin texture analysis.
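The Eigenfaces representation discussed above (projecting face images into a low-dimensional PCA subspace and matching compact templates) can be sketched with NumPy. The gallery here is random data, purely illustrative of the mechanics:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "gallery": 10 flattened face images of 16x16 = 256 pixels each.
gallery = rng.normal(size=(10, 256))

# PCA via SVD of the mean-centred data: rows of vt are the eigenfaces,
# ordered by decreasing variance along the corresponding axis.
mean_face = gallery.mean(axis=0)
centred = gallery - mean_face
u, s, vt = np.linalg.svd(centred, full_matrices=False)

k = 5                      # keep only the top-k eigenfaces
eigenfaces = vt[:k]        # shape (k, 256)

def project(img):
    """Template: coordinates of a face in the eigenface subspace."""
    return eigenfaces @ (img - mean_face)

templates = np.array([project(f) for f in gallery])   # shape (10, k)

# Recognition = nearest template in the subspace (1:N matching).
probe = gallery[3] + 0.01 * rng.normal(size=256)      # noisy copy of face 3
dists = np.linalg.norm(templates - project(probe), axis=1)
print(int(np.argmin(dists)))    # index of the best-matching gallery face
```

Note how the stored template (k numbers per face) is far smaller than the raw image, which mirrors the template sizes quoted above.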


Skin texture analysis turns the distinctive lines, patterns, and spots apparent in a person's skin into a mathematical space. Preliminary tests have demonstrated that using skin texture analysis as a part of facial recognition can improve identification performance by 20 to 25%. Advantages of facial recognition are that it is not intrusive and can be done from a distance, even without the individual being aware that they are being scanned. What separates facial recognition from other biometric techniques is that it can be used for surveillance purposes, as in searching for wanted criminals, suspected terrorists, and missing children. Facial recognition can be done from far away, with no contact with the subject, so that they are unaware they are being checked. Facial recognition is better suited to facial verification than to identification purposes, as it is easy for an individual to alter their facial features with a disguise or a mask, and so forth. The environment is a consideration, as are subject motion and camera focus. Facial recognition, when used in combination with another biometric method, can improve verification and identification results significantly. Another coined algorithm performs face matching across the huge gap of aging. The algorithm is based on a neural network, training for face recognition is done with the neural network, and the trained network outperformed existing systems and algorithms with high accuracy. Furrows may change in two major ways: (1) a new wrinkle may appear in the middle of a ridge; (2) one wrinkle may bifurcate and form a letter Y. The difference between (1) and (2) is not very exact, because one of the sides of the ridge in case (1) may, through wear, become narrow and low and not leave an imprint, thereby converting it into case (2); conversely, case (2) may be changed into (1). The location of the origin of a new wrinkle is, however, nonetheless distinctive, and thus it forms a unique pattern for identifying a person [28].

II. OVERVIEW OF SOME EXISTING ALGORITHMS

Haibin Ling et al. [6] describe face verification across age progression using discriminative methods. They concentrate on the problem of designing and evaluating discriminative approaches that directly handle verification tasks without any explicit age modeling, which is a difficult problem in itself. First, they find that the gradient orientation (GO), obtained by discarding magnitude information, provides a simple yet powerful representation for this problem. The representation improves further when hierarchical information is used, which results in the gradient orientation pyramid (GOP). Combined with a support vector machine (SVM), GOP shows excellent performance in every experiment, in comparison with seven different approaches including two commercial frameworks. The evaluation is conducted on the FGnet dataset and two large passport datasets, one of which is the largest reported for recognition tasks. Furthermore, exploiting these datasets, they empirically study how age gaps and related problems influence recognition algorithms. They found, surprisingly, that the additional difficulty of verification produced by age differences becomes saturated once the difference is bigger than 4 years, for differences of up to 10 years. Additionally, they find that picture quality and eyewear present more of a difficulty than facial hair.

Haibin Ling et al. [8] present a study of face recognition as people age. First, they again use the gradient orientation pyramid for this task; discarding gradient magnitude and using the hierarchical technique, they found that the resulting descriptor yields a robust and discriminative representation. With the descriptor, they frame face verification as a two-class problem and use an SVM as the classifier. The procedure is applied to two passport data sets containing more than 1,800 picture pairs per subject with huge age gaps. Although simple, the strategy beats the previously tested Bayesian technique and other descriptors, including intensity difference and gradient with magnitude. What is more, it performs as well as two commercial frameworks. Second, for the first time, they empirically study how age differences affect recognition performance. The work demonstrates that, although the aging process adds difficulty to the recognition task, it does not surpass illumination or expression as a confounding factor.

Simone Bianco et al. [1] describe huge age-gap face verification by feature injection in deep networks. They introduce a novel technique for face verification across large age gaps, along with a dataset containing age variations in the wild, the LAG dataset, whose images cover face variation from childhood to old age. Fine-tuning of the neural network is performed in a Siamese architecture using a contrastive loss function. A feature-injection layer is introduced by the authors to boost accuracy.

Jiaji Huang et al. [4] describe a geometry-aware deep transform. The article contains a novel deep learning objective formulation that unifies both classification and metric learning criteria. They give a geometry-aware deep transform to enable a non-linear, discriminative, and robust feature transform, which shows competitive performance on small training sets for both synthetic and real-world data. They further support the proposed structure with a formal (K, epsilon)-robustness analysis.

Dihong Gong et al. [3] describe a maximum entropy feature descriptor for age-invariant face recognition. In this article they design a new technique to reduce the matching problem in age-invariant face recognition. First, a new maximum entropy feature descriptor is developed that encodes the microstructure of a facial picture into a sequence of discrete codes with maximal entropy. By densely sampling the encoded face picture, sufficient discriminative and expressive information can be extracted for further analysis. A new matching technique is additionally designed, known as identity factor analysis (IFA), to estimate the probability that two faces have the same underlying identity. The effectiveness of the system is confirmed by extensive experiments on two face aging datasets.


These are MORPH (the biggest public-domain face aging dataset) and FGNET. They additionally report work on the renowned LFW dataset to demonstrate the excellent generalizability of the new technique.

Guosheng Hu et al. [2] describe "When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition". The article gives a broad assessment of CNN-based face recognition systems (CNN-FRS) on a common ground, to make the work easily reproducible. In particular, they use the public LFW database to train the CNNs, unlike most existing CNNs trained on private databases. They design three CNN architectures, which are the first reported structures trained using LFW data. The article analyses significant CNN models and evaluates the impact of particular implementation choices. They verify numerous useful properties of CNN-FRS; for instance, the dimensionality of the learned features can be significantly reduced without adverse impact on face recognition accuracy. In addition, a conventional metric learning technique exploiting CNN-learned features is evaluated. The experiments show that two critical factors for good CNN-FRS performance are the fusion of multiple CNNs and metric learning.

III. AGE ESTIMATION

Age estimation uses feature vectors to judge the age of the subject in a face picture. Its purpose is to model the relationship between real age and the facial feature vector. Here an aging function is adopted for age estimation. In [7], the observation that people who look alike tend to age alike was exploited, so an appearance-specific aging function was given. We assume instead that people who look alike may age in different ways; for instance, the ages of two people with similar appearance may differ a great deal. The way in which a person ages is called his aging way. However, we assume that people who age in the same way share the same aging function. So a new age estimation strategy is designed which combines aging functions and aging way classification.

A. Aging Functions

For a picture, after facial feature extraction, we get a pair of shape and texture feature vectors, which are merged to form the facial feature vector. An aging function with polynomial structure is used to model the relationship between the feature vector and age, as given in equation (1):

    age = f(b) = c0 + c^T b    (1)

In equation (1), 'age' is the estimated age, f is the aging function, b is the feature vector, c = (c1, ..., cn) is the parameter vector, and c0 is the offset. For the aging function, the parameters are estimated from a set of training samples with known real ages. The aim is to reduce the discrepancy between real ages and estimated ages, and the criterion of minimizing the squared error is adopted. If the aging function is applied to every test picture, this function is known as the global aging function.

B. Aging Way Classification

The point of aging way classification is to group subjects into particular classes depending on their aging ways. Each face picture of a subject is classified according to the subject's aging way classification result. The aging way of a subject is represented by a vector outlined by the estimation errors produced by the global aging function and the real ages. For instance, if a subject has three face pictures at real ages a1, a2, a3 and the estimation errors of the global aging function are e1, e2, e3, then his aging way is the vector ((a1, e1), (a2, e2), (a3, e3)). K-means clustering is used for aging way classification, and the distance between two aging ways p and q is defined as:

    d(p, q) = | (1/Np) Σi ei(p) − (1/Nq) Σj ej(q) |    (2)

In equation (2), ei is the estimation error produced by the global aging function, and N is the number of pictures in the aging way. After that, an aging function fi is fitted for each class. For a test picture with feature vector b, the similarity to the i-th class is computed as:

    si = max_j |b^T bij|,  j = 1, ..., Ni    (3)

where bij is the feature vector of the j-th training sample in the i-th class, and Ni is the number of training samples in the corresponding class. The estimated age of the test picture is then:

    age = Σi si fi(b) / Σi si    (4)

IV. FACE RECOGNITION ROBUST TO AGE VARIATION

In the following section, a face recognition framework robust to age variation is outlined, as given in Figure 1. The framework contains two noteworthy modules: age simulation and face recognition. In the first module, the face in the database and the test face are transformed to a single common normalizing age to reduce age variation. The face recognition module performs the conventional functions of face recognition, such as feature extraction, similarity sorting, and recognition result output.
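The least-squares fit of a linear aging function, equation (1), and the similarity-weighted combination of per-class estimates, equation (4), can be illustrated on synthetic data. The feature dimensions, coefficients, and the two "class" functions below are invented for this sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic training data: 200 faces with 6-dimensional feature vectors b,
# whose true ages follow age = c0 + c.b plus noise (the model of eq. (1)).
n, d = 200, 6
B = rng.normal(size=(n, d))
c_true = np.array([3.0, -1.0, 2.0, 0.5, 0.0, 1.5])
ages = 30.0 + B @ c_true + 0.1 * rng.normal(size=n)

# Fit the global aging function f(b) = c0 + c^T b by least squares.
X = np.hstack([np.ones((n, 1)), B])     # prepend a column of ones for the offset c0
coef, *_ = np.linalg.lstsq(X, ages, rcond=None)
c0, c = coef[0], coef[1:]

def f(b):
    """Fitted global aging function of eq. (1)."""
    return c0 + c @ b

# Eq. (4): combine per-class aging functions f_i weighted by similarities s_i.
# For simplicity the two hypothetical classes here reuse the same fitted f.
def estimate_age(b, class_fns, sims):
    sims = np.asarray(sims, dtype=float)
    return sum(s * g(b) for s, g in zip(sims, class_fns)) / sims.sum()

b_test = rng.normal(size=d)
print(round(float(estimate_age(b_test, [f, f], [0.7, 0.3])), 1))
```

Minimizing the squared error over the training set, as the text prescribes, is exactly what `np.linalg.lstsq` does here; with real data one would fit a separate `f_i` per aging-way cluster before applying equation (4).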


[Fig. 1: Age simulation face recognition system]

V. AGE SIMULATION

The age simulation framework contains three parts: facial feature extraction, age estimation, and age simulation. In the first part, a shape feature vector and a texture feature vector are extracted to represent a face. In the second part, the aging function and aging way classification are applied to estimate age. In the last part, the pair of shape and texture feature vectors is modified to that of the target age. Before facial features are extracted, in order to get good simulation output, some preprocessing is performed to enhance picture quality, including illumination normalization, scale adjustment, etc.

[Fig. 2: Age simulation system]

A. Facial Feature Extraction

A face picture contains shape data and an original texture picture; a facial picture can be separated into shape data and texture data. Given these two types of data, the facial picture can be reconstructed.

1) Shape and Texture Information Extraction

Shape data is represented by the coordinates of 101 landmarks on the face (Figure 3), and the Active Shape Model [9] is used to locate the landmarks. After the landmarks are located, a triangle-based affine transform is used to warp the face picture to the original texture picture at a typical shape. This original texture picture is the texture data of the face picture. Commonly the mean shape of a set of faces is used as the typical shape.

[Fig. 3: Shape and texture information]

2) Shape and Texture Vector Extraction

Principal Component Analysis is applied to the sets of landmarks and original texture pictures, yielding a shape Eigen space and a texture Eigen space. For a picture, after projection into these Eigen spaces, a shape feature vector and a texture feature vector are obtained; the merged feature vector is used for further processing.

VI. CONCLUSION

This paper gives an overview of the difficulties face recognition encounters due to the aging effect. A lot of research has already been done in this area, but due to aging there are still many issues that need to be considered. For example, in India the AADHAAR card is maintained to check individual identity; with the passing of time, techniques are required that can verify identity despite aging. In this article we give an outline of face recognition under aging, which can provide strong security for passports and other types of identity documents.

REFERENCES
[1] Bianco, Simone. "Large age-gap face verification by feature injection in deep networks." arXiv preprint arXiv:1602.06149 (2016).
[2] Hu, Guosheng, et al. "When face recognition meets with deep learning: an evaluation of convolutional neural networks for face recognition." Proceedings of the IEEE International Conference on Computer Vision Workshops. 2015.
[3] Gong, Dihong, et al. "A maximum entropy feature descriptor for age invariant face recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[4] Huang, Jiaji, et al. "Geometry-aware deep transform." Proceedings of the IEEE International Conference on Computer Vision. 2015.
[5] Anjana Mall et al. "Skin Tone Based Face Recognition and Training using Neural Network." IJETAE, ISSN 2250-2459, Volume 2, Issue 9, pp. 1-5, September 2012.
[6] Ling, Haibin, et al. "Face verification across age progression using discriminative methods." IEEE


Transactions on Information Forensics and Security 5.1 (2010): 82-91.
[7] Maturana, Daniel, et al. "Face recognition with local binary patterns, spatial pyramid histograms and naive Bayes nearest neighbor classification." Chilean Computer Science Society (SCCC), 2009 International Conference of the. IEEE, 2009.
[8] Ling, Haibin, et al. "A study of face recognition as people age." 2007 IEEE 11th International Conference on Computer Vision. IEEE, 2007.
[9] Rodriguez, Yann. "Face detection and verification using local binary patterns." No. LIDIAP-REPORT-2006-022. IDIAP, 2006.
[10] Anil K. Jain, et al. "Biometrics: a tool for information security." IEEE Transactions on Information Forensics and Security 1.2 (2006): 125-143.
[11] Turati, Chiara, et al. "Newborns' face recognition: Role of inner and outer facial features." Child Development 77.2 (2006): 297-311.
[12] Fabien Cardinaux et al. "User authentication via adapted statistical models of face images." IEEE Transactions on Signal Processing 54.1 (2005): 361-373.
[13] Timo Ahonen, et al. "Face recognition with local binary patterns." European Conference on Computer Vision. Springer Berlin Heidelberg, 2004.
[14] Anil K. Jain et al. "An introduction to biometric recognition." IEEE Transactions on Circuits and Systems for Video Technology 14.1 (2004): 4-20.
[15] Wenyi Zhao et al. "Face recognition: A literature survey." ACM Computing Surveys (CSUR) 35.4 (2003): 399-458.
[16] Constantine L. Kotropoulos et al. "Frontal face authentication using discriminating grids with morphological feature vectors." IEEE Transactions on Multimedia 2.1 (2000): 14-26.
[17] Duc, Benoit, et al. "Face authentication with Gabor information on deformable graphs." IEEE Transactions on Image Processing 8.4 (1999): 504-516.
[18] Peter N. Belhumeur et al. "Eigenfaces vs. fisherfaces: Recognition using class specific linear projection." IEEE Transactions on Pattern Analysis and Machine Intelligence 19.7 (1997): 711-720.
[19] Wiskott, Laurenz, et al. "Face recognition by elastic bunch graph matching." IEEE Transactions on Pattern Analysis and Machine Intelligence 19.7 (1997): 775-779.
[20] Gerl, Susann, et al. "3-D human face recognition by self-organizing matching approach." Pattern Recognition and Image Analysis c/c of Raspoznavaniye Obrazov i Analiz Izobrazhenii 7 (1997): 38-46.
[21] Roberto Brunelli et al. "Person identification using multiple cues." IEEE Transactions on Pattern Analysis and Machine Intelligence 17.10 (1995): 955-966.
[22] Ruth Campbell et al. "The development of differential use of inner and outer face features in familiar face identification." Journal of Experimental Child Psychology 59.2 (1995): 196-210.
[23] Rama Chellappa et al. "Human and machine recognition of faces: A survey." Proceedings of the IEEE 83.5 (1995): 705-741.
[24] N. M. Allinson et al. "Face recognition: combining cognitive psychology and image engineering." Electronics & Communication Engineering Journal 4.5 (1992): 291-300.
[25] Roberto Brunelli et al. "Face recognition through geometrical features." European Conference on Computer Vision. Springer Berlin Heidelberg, 1992.
[26] Matthew A. Turk et al. "Face recognition using eigenfaces." Computer Vision and Pattern Recognition, 1991. Proceedings CVPR '91, IEEE Computer Society Conference on. IEEE, 1991.
[27] M. Turk et al. "Eigenfaces for recognition." Journal of Cognitive Neuroscience 3 (1991): 71-86.
[28] Francis Galton. "Personal Identification and Description." Nature, 1888.
[29] R. J. Baron. "A bibliography on face recognition." The SISTM Quarterly Incorporating the Brain Theory Newsletter, II(3): 27-36, 1979.
[30] T. Kanade. Picture processing system by computer complex and recognition of human faces. PhD thesis, Kyoto University, November 1973.

482 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

Comparative study for selection of an item based on multi-criteria DSS

Viharika ([email protected]), IV B. Tech (CSE) II Sem, Anurag Group of Institutions, Hyd, India
Padma Sanjana B L ([email protected]), IV B. Tech (CSE) II Sem, Anurag Group of Institutions, Hyd, India
M Varaprasad Rao ([email protected]), Professor, CSE, Anurag Group of Institutions, Hyd, India

Abstract

Recommender systems lay a pathway for users by delivering the best solution in their area of
interest. Recommender systems have gained prominence in the fields of Information
Technology, e-commerce, etc., inferring personalized recommendations by effectively pruning
a universal set of choices so that end users can identify content of interest. A number of
multi-criteria decision support system algorithms are available for generating priority-based
recommendations, including the Technique for Order of Preference by Similarity to Ideal
Solution (TOPSIS) and the Analytic Hierarchy Process (AHP). This paper focuses mainly on the
user-to-item based filtering technique. Here, a comparative study is conducted between TOPSIS
and AHP for selecting a mobile phone based on filtering.

Keywords

Recommender Systems, TOPSIS, AHP, User-to-Item Filtering, Efficiency, Decision Support Systems.

Introduction

Decision Support Systems (DSS) are computer-based systems that bring together information from a
variety of sources, assist in the organization and analysis of information, and facilitate the
evaluation of assumptions underlying the use of specific models. In other words, these systems
allow decision makers to access relevant data across the organization as they need it to make
choices among alternatives. Most decision-making processes supported by DSS are based on
decision analysis, most commonly multi-criteria decision making (MCDM). MCDM involves
evaluating and combining alternatives' characteristics on two or more criteria or attributes in
order to rank, sort or choose from among the alternatives. Nowadays a smartphone has become
a necessity in everybody's life. Since a new model comes onto the market every day, users get
confused when selecting a mobile phone to buy. To select the most suitable mobile
phone among various alternatives, the decision maker must consider meaningful criteria and
possess special knowledge of the phone specifications. In this study, the evaluation criteria
for decision making are selected from the literature and from discussions with the target audience.
The number of alternatives and conflicting criteria is increasing very rapidly, so robust
evaluation models are crucial in order to incorporate several conflicting criteria
effectively. With its need to trade off multiple criteria, a selection problem such as mobile
phone selection is a multi-criteria decision-making (MCDM) problem. A number of
recommender system algorithms / methods exist for selecting an item using item-based,


user-based and user-item based filtering techniques [1]. The analytic hierarchy process (AHP),
fuzzy multiple attribute decision-making models, linear 0-1 integer programming, the weighted
average method, genetic algorithms, etc. are some of these methods. A review of [1] motivated
this comparative study between TOPSIS and AHP for selecting an item using a DSS
that recommends the most suitable item based on user reviews. The rest of the
paper is organised as follows: Section II describes the MCDM methods TOPSIS and AHP. Section
III provides the comparison of TOPSIS and AHP. Section IV contains the experimental results
obtained, followed by Section V, which concludes.

MCDM Methods

a. TOPSIS

The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) [2][3][4] is
a multi-criteria decision analysis method. It is based on the concept that the chosen alternative
should have the shortest geometric distance from the positive ideal solution (PIS) [5] and the
longest geometric distance from the negative ideal solution (NIS) [5]. It is a method of
compensatory aggregation that compares a set of alternatives by identifying weights for each
criterion, normalising scores for each criterion and calculating the geometric distance between
each alternative and the ideal alternative, which has the best score on each criterion. An
assumption of TOPSIS is that the criteria are monotonically increasing or
decreasing. Normalisation is usually required, as the parameters or criteria are often of
incongruous dimensions in multi-criteria problems [6][7]. Compensatory methods such as
TOPSIS allow trade-offs between criteria, where a poor result in one criterion can be negated
by a good result in another criterion. This provides a more realistic form of modelling than
non-compensatory methods, which include or exclude alternative solutions based on hard
cut-offs [8].

The TOPSIS process is carried out as follows:

Step 1

Create an evaluation matrix consisting of m alternatives and n criteria, with the intersection
of each alternative and criterion given as xij; we therefore have a matrix (xij)m×n.

Step 2

The matrix (xij)m×n is then normalised to form the matrix R = (rij)m×n, using the normalisation
method rij = xij / √(∑i=1..m xij²), i = 1,2,...,m; j = 1,2,...,n.

Step 3

Calculate the weighted normalised decision matrix


tij = rij · wj , i = 1,2,...,m; j = 1,2,...,n

where wj = Wj / ∑Wj , j = 1,2,...,n

so that ∑wj = 1, and Wj is the original weight given to criterion j.

Step 4

Determine the worst alternative (Aw) and the best alternative (Ab):

Aw = { <max(tij | i=1,2,...,m) | j ∈ J->, <min(tij | i=1,2,...,m) | j ∈ J+> } = { twj | j=1,2,...,n }

Ab = { <min(tij | i=1,2,...,m) | j ∈ J->, <max(tij | i=1,2,...,m) | j ∈ J+> } = { tbj | j=1,2,...,n }

where

J+ = { j = 1,2,...,n | j is associated with criteria having a positive impact }, and

J- = { j = 1,2,...,n | j is associated with criteria having a negative impact }.

Step 5

Calculate the L2-distance between the target alternative i and the worst condition Aw:

diw = √( ∑j=1..n (tij - twj)² ) , i = 1,2,...,m

and the distance between the alternative i and the best condition Ab:

dib = √( ∑j=1..n (tij - tbj)² ) , i = 1,2,...,m

where diw and dib are L2-norm distances from the target alternative i to the worst and best
conditions, respectively.

Step 6

Calculate the similarity to the worst condition:

Siw = diw /(diw + dib) 0 ≤ Siw ≤ 1 i=1,2,...m

Siw = 1 if and only if the alternative solution has the best condition;

Siw = 0 if and only if the alternative solution has the worst condition.

Step 7

Rank the alternatives according to Siw (i = 1,2,....m)
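The seven steps above can be sketched compactly in code. The paper's implementation is in Java on Eclipse; the sketch below is an illustrative Python version, and the 1-9 rating matrix, weights and benefit flags are hypothetical placeholders, not the authors' survey data.

```python
import math

def topsis(X, weights, benefit):
    """Rank alternatives with TOPSIS.

    X: m x n matrix (list of rows), alternatives vs. criteria.
    weights: n raw criterion weights (normalised here to sum to 1).
    benefit: n booleans, True if larger values are better for that criterion.
    Returns the closeness coefficients S_iw (higher = better).
    """
    m, n = len(X), len(X[0])
    total = sum(weights)
    w = [wj / total for wj in weights]  # weights normalised to sum to 1
    # Step 2: vector normalisation r_ij = x_ij / sqrt(sum_i x_ij^2)
    norms = [math.sqrt(sum(X[i][j] ** 2 for i in range(m))) for j in range(n)]
    # Step 3: weighted normalised matrix t_ij = w_j * r_ij
    T = [[w[j] * X[i][j] / norms[j] for j in range(n)] for i in range(m)]
    # Step 4: best (A_b) and worst (A_w) values per criterion
    cols = list(zip(*T))
    Ab = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    Aw = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    # Steps 5-6: L2 distances and similarity to the worst condition
    scores = []
    for row in T:
        d_b = math.sqrt(sum((t - a) ** 2 for t, a in zip(row, Ab)))
        d_w = math.sqrt(sum((t - a) ** 2 for t, a in zip(row, Aw)))
        scores.append(d_w / (d_w + d_b))
    return scores  # Step 7: rank alternatives by descending score

# Hypothetical 1-9 ratings for three phones on Price, Memory, Camera,
# Battery and Ease of use (all treated as benefit criteria, as in the paper).
X = [[7, 9, 9, 6, 9],   # iPhone
     [6, 8, 8, 8, 8],   # Samsung
     [4, 5, 6, 7, 6]]   # Windows
scores = topsis(X, weights=[3, 2, 2, 1.5, 1.5], benefit=[True] * 5)
```

Consistent with Step 6, an alternative attaining the best value on every criterion gets S_iw = 1, and one attaining the worst value on every criterion gets S_iw = 0.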


b. AHP

The analytic hierarchy process (AHP)[9] is a structured technique for organizing and
analysing complex decisions, based on mathematics and psychology. It has particular
application in group decision making, and is used around the world in a wide variety
of decision situations. Rather than prescribing a "correct" decision, the AHP helps decision
makers find one that best suits their goal and their understanding of the problem. It provides a
comprehensive and rational framework for structuring a decision problem, for representing and
quantifying its elements, for relating those elements to overall goals, and for evaluating
alternative solutions.

Steps involved are:

Step 1: Pair-wise Comparison

If A is compared with B for a criterion and the preference value is 3, then the preference value
of comparing B with A is 1/3.

Any criterion compared with itself must be equally preferred (preference value 1).

Fig: Standard scale used in AHP

Step 2: Developing Preferences within criteria

Prioritize the decision alternatives within each criterion. This is referred to as synthesization:

Sum the values in each column of the pairwise comparison matrices.

Divide each value in a column by its corresponding column sum to normalize the preference
values.

Average the values in each row.

The resulting vector of row averages is called the preference vector.

Step 3: Ranking the criteria


Determine the relative importance, or weight, of each criterion: which one is the most
important and which one is the least important.

Rank the criteria using pair-wise comparison.

Step 4: Developing overall ranking

The preference values of each alternative under each criterion are multiplied by the respective
criterion weights and summed for each alternative.

Based on the overall score, the alternative with the highest score is the most recommended one.
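Under the same caveat as before (the authors' implementation is in Java; the pairwise matrices below are hypothetical examples, not the paper's survey data), Steps 1-4 can be sketched as:

```python
def ahp_weights(P):
    """Preference vector from a pairwise comparison matrix P (Steps 1-2).
    P[i][j] is how strongly item i is preferred over item j, with
    P[j][i] = 1 / P[i][j] and P[i][i] = 1 (equally preferred).
    Synthesization: normalise each column by its sum, then average rows."""
    n = len(P)
    col_sums = [sum(P[i][j] for i in range(n)) for j in range(n)]
    normalised = [[P[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return [sum(row) / n for row in normalised]

def ahp_rank(crit_matrix, alt_matrices):
    """Overall score of each alternative (Step 4): weight each alternative's
    per-criterion preference by the criterion weight (Step 3) and sum."""
    cw = ahp_weights(crit_matrix)                   # criterion weights
    alt_w = [ahp_weights(M) for M in alt_matrices]  # one matrix per criterion
    n_alt = len(alt_matrices[0])
    return [sum(cw[c] * alt_w[c][a] for c in range(len(cw)))
            for a in range(n_alt)]

# Hypothetical example: two criteria (the first 3x as important as the
# second) and two alternatives compared pairwise under each criterion.
criteria = [[1, 3],
            [1 / 3, 1]]
alts = [[[1, 5], [1 / 5, 1]],    # pairwise matrix under criterion 1
        [[1, 1 / 2], [2, 1]]]    # pairwise matrix under criterion 2
scores = ahp_rank(criteria, alts)
```

Because every preference vector sums to 1, the overall scores also sum to 1; the alternative with the highest score is the one recommended.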

Comparative Study of TOPSIS and AHP

The aim of this study is to propose a multi-criteria decision making (MCDM) approach to
evaluate mobile phone options with respect to the users' preference order. Firstly, the most
desirable features influencing the choice of a mobile phone are identified. This is realized
through a survey conducted among the target group, the experience of telecommunication-sector
experts and the studies in the literature. Here, the target group consists of a few students in
our college. Two MCDM methods, namely TOPSIS and AHP, are then used in the evaluation
procedure. Firstly, both algorithms are implemented in the Java programming language.
Secondly, values for all the criteria for each alternative are calculated. Finally, based on the
time taken for execution, the most efficient algorithm is proposed.

In this paper, the comparison of three existing mobile phones of renowned companies, namely
iPhone, Samsung and Windows phones, serves to validate the model by testing the
propositions that were developed. First of all, the evaluation criteria considered for the
selection decision are Price, Memory, Camera, Battery and Ease of use. Here, price is considered
as a benefit factor. The weights of the main criteria are based on the decision makers'
subjective judgments. A pair-wise comparison matrix of the main criteria and the calculation
of the weights are given, followed by the end result and the time taken for execution.

Experimental Results

The following contains the implementation of the TOPSIS and AHP algorithms in the Java
programming language on the Eclipse platform. This makes the calculations easy and takes
less time to reach results.

Step by step implementation and results generated by TOPSIS are given below:


Figure 1: Giving the required alternatives and criteria as input using JAVA Programming

Figure 2: Normalized and weighted normalized matrices using JAVA Programming

Figure 3: Positive and Negative ideal solutions using JAVA programming


Figure 4: Distance between Positive and Negative ideal solutions to alternatives using
JAVA programming

Figure 5: The closeness coefficient, solution and time taken for execution using JAVA
programming

Step by step implementation and results generated by AHP are given below:

Figure 6: Giving the required criteria values as input using JAVA programming


Figure 7: Giving required input values for each alternative based on each criteria

Figure 8: Input matrix and Normalized matrix for criteria


Figure 9: Input matrices of each alternative for each criterion

Figure 10: Normalized matrices for each alternative based on each criterion

Figure 11: Overall score and overall time taken for execution

Based on the execution, the results show that Samsung is ranked highest by both algorithms.
We can also observe that TOPSIS takes only 5 milliseconds to execute, while AHP takes 75
milliseconds. Therefore, based on this comparative study, it can be proposed that TOPSIS
generates results more efficiently, in less time.

Conclusion

The most popular multi-criteria decision making algorithms, TOPSIS and AHP, become very
complicated and calculation-heavy when there are more than four criteria and alternatives for a
particular problem. So, implementing them in Java not only increases the accuracy of the
result but also makes it easy to handle any number of alternatives and criteria. Hence, based
on the comparative study made, the time complexity for TOPSIS is O(n), which is lower than the


time complexity of AHP. Therefore, we can conclude that TOPSIS gives results
more efficiently and takes less time to execute.

References

[1] Dr. M. Varaprasad Rao et al. "A survey on Recommender Systems." International Journal of
Computer Science and Information Security 14.5 (May 2016).
[2] Hwang, C.L.; Yoon, K. (1981). Multiple Attribute Decision Making: Methods and
Applications. New York: Springer-Verlag.
[3] Yoon, K. (1987). "A reconciliation among discrete compromise situations." Journal of the
Operational Research Society 38: 277-286. doi:10.1057/jors.1987.44.
[4] Hwang, C.L.; Lai, Y.J.; Liu, T.Y. (1993). "A new approach for multiple objective decision
making." Computers and Operations Research 20: 889-899. doi:10.1016/0305-0548(93)90109-v.
[5] Assari, A.; Mahesh, T.; Assari, E. (2012). "Role of public participation in sustainability
of historical city: usage of TOPSIS method." Indian Journal of Science and Technology 5(3):
2289-2294.
[6] Yoon, K.P.; Hwang, C. (1995). Multiple Attribute Decision Making: An Introduction.
California: SAGE Publications.
[7] Zavadskas, E.K.; Zakarevicius, A.; Antucheviciene, J. (2006). "Evaluation of Ranking
Accuracy in Multi-Criteria Decisions." Informatica 17(4): 601-618.
[8] Greene, R.; Devillers, R.; Luther, J.E.; Eddy, B.G. (2011). "GIS-based multi-criteria
analysis." Geography Compass 5/6: 412-432.
[9] Saaty, Thomas L.; Peniwati, Kirti (2008). Group Decision Making: Drawing out and
Reconciling Differences. Pittsburgh, Pennsylvania: RWS Publications. ISBN 978-1-888603-08-8.


IMPLEMENTATION OF INDIAN SIGN LANGUAGE RECOGNITION SYSTEM USING SCALE
INVARIANT FEATURE TRANSFORM (SIFT)

Sandeep Baburao Patil1, Dr. Rajesh H. Talwekar2

1 PhD Scholar (Electronics & Telecommunication), Faculty of Engineering and Technology,
Shri Shankaracharya Technical Campus, Chhattisgarh Swami Vivekanand Technical University,
Bhilai, INDIA.

2 Electronics & Telecommunication, Government Engineering College, Raipur, INDIA.

Abstract- Due to the lack of awareness of deaf and hard-of-hearing people in Asian countries, sign language recognition
is a crucial tool commonly developed for the deaf and hard-of-hearing community. Sign language is the only
mode of communication between them, achieved by generating different sign patterns. The Scale Invariant Feature Transform
has been used for feature extraction, as its features are invariant to translation, rotation and scaling. The static images of
hand gestures are pre-processed using the scale invariant feature transform algorithm, and all twenty-six hand
gestures (A to Z) are trained with the 130 images present in the database. This paper shows the matching between
an input image and the database images based on the features extracted using the scale invariant feature transform
algorithm. The method achieves 98.7% matching accuracy, since the matching is done by automatically varying the
threshold and distance ratio.

Key words- Indian Sign Language (ISL), hand gesture, scale invariant feature transform (SIFT), validity ratio.

I. INTRODUCTION
Detecting and understanding hand and body gestures is becoming a very important and challenging task in computer
vision. The importance of the problem can be easily illustrated by the use of natural gestures, which we
apply along with verbal and nonverbal communication. The use of hand gestures in support of
verbal communication runs so deep that they are even used in communications in which people have no visual contact.
There are different approaches to recognizing hand gestures in the engineering community, some of
which require wearing marked gloves or attaching additional hardware to the body of the subject. These approaches are,
from the user's point of view, considered intrusive, and so less feasible for real-world applications.
Additionally, recognition of the shape of the hand is also important in some applications such as sign
language recognition and Human Computer Interaction.

The sign language used by the deaf (in India) is mainly learnt through a particular approach. There
are varieties of sign language used in India that may be classified on a regional basis, and plenty
more informal sign languages, or 'home signs' as they are referred to. Although all these sign languages appear to be
interrelated, it is difficult to trace the precise path of development of any one language. It has been shown through
earlier research work that these languages are interrelated and that all of them contribute to the development of ISL
(Indian Sign Language).


II. LITERATURE SURVEY

A literature survey of various research on Indian sign language is reported as follows.

The approach of the scale invariant feature transform has been used for object, scene and hand gesture
recognition. The extracted features are invariant and perform reliable matching between various views of an object. The
extracted features are highly stable and invariant to image rotation and scaling, and they are
robust against distortion, noise and changes in illumination. The proposed approach is efficient and has the ability to
extract a sizeable number of features from a given image [1, 2, 3, 4]. A novel approach for hand pose recognition
analyses the textures and key geometrical features of the hand. The feature extraction technique discussed in
that paper is more complicated because the abduction and movement angle of the fingers and their
internal variations are considered [5]. A hand gesture recognition system has been built to recognize the alphabets
of ISL. The processing time required by the system is high, since it has to process through four different
modules [6]. A novel approach for human robot interaction (HRI) was suggested. This method finds its limitations in
real-time applications where different speech and hearing problems are considered [7].
An author developed a Sign Language recognition system for south Indian languages. The system describes a
set of thirty-two signs, each representing the binary 'UP' & 'DOWN' positions of the five fingers of the right
palm. The given system is applicable to Tamil text only, and restricted to the five fingers of a single hand against the same
background [8]. A vision-based multi-feature classifier has been implemented for sign language recognition.
The system first develops a dynamic sign language appearance model, and then classification is done
by an SVM technique. The experiment was carried out over thirty groups of Chinese finger alphabet images, and
the results proved that this appearance modelling technique is simple, efficient, and effective for characterizing
hand gestures. The system suffers limitations in real time with dynamic hand gestures [9]. Real-time sign
language recognition has been proposed using hand gestures. Hand gestures are
recognized based on Haar features and the K-means algorithm. The aforementioned algorithm was used to
reduce the number of extracted features, which reduced the processing complexity. This technique was restricted to a small
database [10]. A unified framework was suggested that simultaneously performed spatial segmentation, temporal
segmentation, and recognition. The performance of this approach was evaluated on two challenging applications:
recognition of hand-signed digits gestured by users wearing short-sleeved shirts, and retrieval of occurrences of
signs of interest from a video database containing continuous, non-segmented American Sign Language
(ASL). This method is limited to five signs of ASL [11]. A novel algorithm for
hand gesture image detection and recognition has been proposed. The recognition process includes two phases: (a)
model construction and (b) sign language identification. This algorithm provides 94% accuracy
for hand gesture recognition, and the accuracy is adversely affected if the orientation of the hand gesture is
skewed or if the part of the image from the wrist to the arm is wrong [12].

A two-level language identification system has been developed using acoustic features. Firstly, the system
identifies the family of the spoken language, and then it is fed to the second level to identify the particular language
within the corresponding family. The system uses hidden Markov models (HMM), Gaussian mixture models (GMM) and
artificial neural networks (ANN). The system could not accomplish good accuracy [13]. An effective use of a glove has been
proposed for implementing an interactive sign language teaching programme. The sign
language glove requires flex sensors mounted on the fingers of the glove, whose resistance
changes according to the finger position, which is difficult for the deaf to understand [14]. Another paper represented the facial
expressions of the signer's face using probabilistic principal component analysis (PPCA). A test was
conducted to recognize six isolated facial expressions in American Sign Language (ASL). The recognition
accuracy reported for ASL facial expressions was 91.76% in person-dependent tests and
87.71% in person-independent tests. The proposed system is not fully automatic and therefore achieved lower accuracy
[15]. The proposed system can help the hearing impaired to communicate more fluently with normal
people. A segmentation method described in that paper is employed to separate the right and left regions from
the image frame. The system needs about 3218 average mean epochs to train the network model, which exceeds
the training time. Furthermore, there was confusion among the first, fourth, eighth and ninth signs, which greatly reduced the
accuracy [16]. Another paper introduced the first automatic Arabic sign language (ArSL) recognition
system based on hidden Markov models (HMMs). The system operates in several modes, including offline, online,
signer-dependent and signer-independent modes. Experimental results demonstrated that the given system has a high
recognition rate for all modes. The system does not rely on the use of data gloves or other means
as input devices, and it permits deaf signers to perform gestures freely and naturally; this method is limited to
Arabic sign language only [17]. Various methods for feature extraction have been proposed for character recognition. The
authors also mention the training of Devnagari character recognition using HMMs and neural networks,
achieving 100 percent recognition on a small database [18, 19, 20, 21]. Various researchers have
made significant contributions in the field of sign languages from different countries and regions. Literature
based on different methods and findings with adequate results has also been presented [22, 23, 24, 25,
26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37].

III. DATABASE FOR INDIAN SIGN LANGUAGE (ISL)

As there is no standard database available on the internet for Indian Sign Language recognition, an
extensive literature survey located an image database consisting of twenty-six gestures of
Indian Sign Language (ISL). Figure 1 shows the ISL images available on the internet. As these images are not adequate
for hand gesture recognition, this forced us to generate a new database for ISL corresponding to the images available with us.
Thus, in order to generate a new database for ISL, we made use of a Nikon Coolpix-L24
14-megapixel digital camera, which provides a clear image even in the dark. Photos of the gestures
corresponding to all the alphabets of English, i.e. A to Z, were taken from five different persons of
various age groups. In total 130 images were taken, all of size 1024 x 768 pixels. Generally, the processing
time is very high if the image size is large, and hence we reduced the size of the images to a
great extent while ensuring no loss of information to be used for sign language recognition. So, we converted these images
into a standard size of 284 x 215 pixels (Figure 1). Among these 130 images, 78 images were
used for training purposes and the remaining 52 images were used for testing purposes. Figure 2 shows the
image database collected as a real-time image database of an individual. This database consists of twenty-six
images of size 284 x 215 pixels. This database is further used for recognition purposes, subjecting the images to
various pre-processing and post-processing stages.

Figure 1: Images of Indian Sign Language alphabets (A-Z)

Figure 2: Database acquisition for class 1.

IV. METHODOLOGY

The overall system for ISL recognition is shown in Figure 3. First of all, we need to
initialize the important parameters for the SIFT (scale invariant feature transform) match algorithm [4]-[7]. The first
parameter is the distance ratio, whose value is set to 0.65, and then the threshold value is set to 0.035.


Scale Invariant Feature Transform

The scale invariant feature transform (SIFT) algorithm was introduced by Lowe and has been used for feature
extraction. This algorithm is one of the most widely used owing to its stability over image
translation, rotation and scaling, and it is to some extent invariant to changes in illumination and camera viewpoint.
The features are well localized in both the spatial and frequency domains, reducing the likelihood of disruption by
occlusion, clutter, or noise. Large numbers of features can be extracted from typical images with efficient
algorithms. Additionally, the features are highly distinctive, which permits a single feature to be correctly
matched with high probability against a large database of features, providing a basis for object and scene
recognition. The following are the key stages of computation used to generate the set of image features (Figure
3).

Figure 3: Phases of SIFT algorithm.

Let us discuss how the data flows in each phase of the SIFT algorithm. The input to the SIFT
algorithm is the set of N² pixels of an N x N image. Only a small fraction of these pixels
usually turn out to be extrema. Let 0 < α < 1 be this fraction; therefore αN² extrema travel to the next, key
point detection, phase. Only a small fraction of these extrema qualify as key points. Let 0 < β < 1 be
this fraction; therefore there are nominally αβN² key points at this stage. Orientation assignment re-examines all
the N² points in the image to determine whether any points of great magnitude were lost. Let a fraction γ of the image
pixels qualify as these additional key points. The compute-descriptor phase converts these points into vectors
that are then the features. The number of feature descriptors output by the SIFT algorithm is nominally
(αβ + γ)N² for an N x N image.

Scale space extrema detection


In this part the algorithmic rule identifies the points that are stable with image rotation, translation and people that
are minimally suffering from noise and tiny distortion. The algorithmic rule computes ‘scale’, ‘difference of
Gaussian’ and ‘extrema’ over many ‘octaves’[5],[7],[8].

This stage makes an attempt to spot those locations and scales that are identifiable from totally different views of
an equivalent object. This could be expeditiously achieved employing a "scale space" perform. Any it's been shown
underneath affordable assumptions it should be supported the Gaussian perform. Specifically the distinction of
Gaussian image D(x, y, σ ) is given by

D(x, y, σ) =L(x, y, ki σ)-L(x, y, kj σ) (1)

where L(x, y ,kj σ) is that the convolution of the first image I(x, y) with the Gaussian blur G(x, y, Kσ) at scale
Kσ i.e

L(x, y, kσ )=G(x, y, Kσ )*I(x, y) (2)

Here * is the convolution operator, G(x, y, σ) is a variable-scale Gaussian and I(x, y) is the
input image. Once the difference-of-Gaussian (DoG) images have been obtained, key points are identified as local
minima/maxima of the DoG images across scales. This is done by comparing each pixel in the DoG
images to its eight neighbors at the same scale, as shown in Figure 4, and to the nine corresponding neighboring
pixels in each of the neighboring scales. If the pixel value is the maximum or minimum among all compared
pixels, it is selected as a candidate key point.

Figure 4: Extrema detection in an octave.

To determine the difference of Gaussians, we only need two corresponding points, and to check
whether a point is an extremum we only need the twenty-six difference-of-Gaussian points spread over


3 scales around it. Thus, it is possible to execute these operations over the octaves in many different ways. Figure
5 shows the flow diagram of the first phase of the SIFT algorithm.

Figure 5: Flow diagram of first phase of SIFT algorithm.
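As an illustration of the extremum test described above, the following sketch (hypothetical helper names, not the paper's code) checks whether a DoG sample exceeds all 26 of its neighbours: 8 at its own scale and 9 in each of the two adjacent scales.

```python
def is_extremum(dog, s, y, x):
    """Check whether dog[s][y][x] is a local extremum among its
    26 neighbours: 8 at the same scale plus 9 in each adjacent scale.
    `dog` is a list of 2-D arrays (lists of lists) of DoG values."""
    v = dog[s][y][x]
    neighbours = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if not (ds == 0 and dy == 0 and dx == 0)
    ]
    return v > max(neighbours) or v < min(neighbours)

# Tiny 3-scale, 3x3 example: the centre of the middle scale is the maximum.
dog = [
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    [[1, 1, 1], [1, 9, 1], [1, 1, 1]],
    [[2, 2, 2], [2, 3, 2], [2, 2, 2]],
]
print(is_extremum(dog, 1, 1, 1))  # True: 9 exceeds all 26 neighbours
```

In a full implementation this test runs at every pixel of every interior DoG scale in every octave; only the passing pixels become candidate key points.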

Key point detection

In the first phase the algorithm determines αN² extrema and then further refines them into αβN² key points, which
ultimately become the key points of that image. In this phase, candidates that lie on an edge of the image,
or that correspond to points of low contrast, are identified. These are generally not useful as features, since they are
unstable over image variation, and hence are rejected. To reject low-contrast points, each extremum is
examined using a technique that involves solving a 3 x 3 linear system, which takes constant time.
For edge rejection, a 2 x 2 Hessian matrix is generated and a simple computation is performed on it to produce a
ratio of principal curvatures. This quantity is simply compared with a threshold value to decide whether
the extremum is rejected or not [8].
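The edge-rejection step described above can be sketched as follows (a hypothetical illustration; `passes_edge_test` and its inputs are assumed names). It compares the squared trace of the 2 x 2 Hessian to its determinant, which is equivalent to thresholding the ratio of principal curvatures; the limit r = 10 is the value suggested by Lowe [1].

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Edge rejection: accept a key point only if the ratio of the
    principal curvatures of the 2x2 Hessian [dxx dxy; dxy dyy] is
    below r. Equivalent test: tr(H)^2 / det(H) < (r + 1)^2 / r,
    with det(H) > 0."""
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:          # curvatures of opposite sign: discard
        return False
    return tr * tr / det < (r + 1) ** 2 / r

print(passes_edge_test(10.0, 9.0, 0.0))   # True: nearly equal curvatures
print(passes_edge_test(100.0, 1.0, 0.0))  # False: edge-like response
```

The appeal of this form is that it avoids computing the eigenvalues themselves: only the trace and determinant of the Hessian are needed.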

As per the above discussion, once the input image passes through the second phase of the SIFT algorithm, the SIFT
key points detected for both the training and test images are shown by red circles in Figure 6.

(a) Train Image (b) Test Image


Figure 6: SIFT key points for the training (left) and test (right) images.

Orientation Assignment
The nominal number of key points at the beginning of this phase is αβN². This phase adds to the set of key points
on the basis of their magnitude and orientation. The magnitude and orientation for each point can be calculated
as follows [4],[7].

m(x, y) = √[(L(x + 1, y) − L(x − 1, y))² + (L(x, y + 1) − L(x, y − 1))²] (3)

θ(x, y) = tan⁻¹((L(x, y + 1) − L(x, y − 1)) ⁄ (L(x + 1, y) − L(x − 1, y))) (4)
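Eqs. (3) and (4) amount to finite pixel differences on the Gaussian-smoothed image; a minimal sketch follows (hypothetical function name, and `atan2` is used in place of a plain inverse tangent so the quadrant of the orientation is preserved).

```python
import math

def grad_mag_ori(L, x, y):
    """Gradient magnitude and orientation at (x, y) from pixel
    differences of the Gaussian-smoothed image L, per Eqs. (3)-(4)."""
    dx = L[y][x + 1] - L[y][x - 1]
    dy = L[y + 1][x] - L[y - 1][x]
    m = math.sqrt(dx * dx + dy * dy)
    theta = math.atan2(dy, dx)   # atan2 resolves the quadrant
    return m, theta

L = [[0, 0, 0],
     [1, 2, 4],   # horizontal difference at the centre: 4 - 1 = 3
     [0, 4, 0]]   # vertical difference at the centre: 4 - 0 = 4
m, theta = grad_mag_ori(L, 1, 1)
print(m)  # 5.0 (a 3-4-5 triangle)
```

Because each evaluation touches only four pixels, the per-point cost is constant, which is why the text below can claim constant-time computation per key point.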

Non-key points whose magnitudes are near the peak magnitude are added as new key points. The
total number of key points at the end of this phase is:
αβN² + γ(N² − αβN²) = αβN²(1 − γ) + γN² ≅ N²(αβ + γ) (5)

So the computation of magnitude and orientation can be done in constant time, and subsequently the total set of
key points matched between the training and test images is shown in Figure 7.

(a) (b)
Figure 7: Matched key points between (a) and (b) after orientation assignment and centralization.

Key point Descriptor


In this phase, the algorithm computes a descriptor for each key point identified. The descriptor vector is
computed such that it is highly distinctive and partially invariant to the remaining variations such as illumination,
3D viewpoint, etc. This step is performed on the image closest in scale to the key point's scale. First, a set of
orientation histograms is created on 4 x 4 pixel neighborhoods with eight bins each. The magnitude and
orientation values of samples in a 16 x 16 region around the key point are used to calculate the histograms, so
each histogram contains samples from a 4 x 4 sub-region of the original neighborhood region. The magnitudes are
further weighted by a Gaussian function with σ equal to one half the width of the descriptor window. The
descriptor then becomes a vector of


all the values of these histograms. Since there are 4 x 4 = 16 histograms, each with eight bins, the vector
has 128 elements. This vector is then normalized to unit length in order to improve invariance to affine changes in
illumination. To reduce the effects of non-linear illumination, a threshold of 0.2 is applied and the
vector is normalized again.
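The normalize-clip-renormalize step can be sketched as follows (a hypothetical helper, not the paper's code; the 0.2 clipping threshold is the one stated above).

```python
import math

def normalize_descriptor(v, clip=0.2):
    """SIFT descriptor post-processing: normalize the 128-D vector to
    unit length, clip each component at `clip` to damp non-linear
    illumination effects, then renormalize to unit length."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    v = [min(x / n, clip) for x in v]
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

# One dominant bin plus 127 small ones: the big value gets clipped.
d = normalize_descriptor([10.0] + [1.0] * 127)
print(abs(sum(x * x for x in d) - 1.0) < 1e-9)  # True: unit length
```

Clipping before the second normalization prevents a single dominant gradient direction from overwhelming the comparison between descriptors.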

Algorithm to Obtain the Validity Ratio

The algorithm for obtaining the validity ratio consists of the following steps:
Step 1: Get the matched key point data from the SIFT algorithm. Figure 8 shows an example of key points
extracted from the database and input images.

Key points missing in the input image          Key points present in both images

Figure 8: Key points extracted from training and test images.

Step 1: Let image 1 be the database image and image 2 be the input image, as shown in Figure 9.
For example, let d11 to d61 be the distance factors of the key points from the center of image 1, and d12 to
d62 be the distance factors of the key points from the center of image 2.

Figure 9: An example of key point extraction


Step 2: If the number of matched key points is larger than 2, then calculate the distance of each matched key point to
the center of the key points using the formulas given below.

dT1 = ∑ di1, i = 1, …, M (6)

dT2 = ∑ di2, i = 1, …, M (7)

where dT1 and dT2 are the sums of all distances for image 1 and image 2 respectively, and M denotes the
number of matched points.

Step 3: The distance array ratio (DAR) is defined as:

DAR = (distance between the matched key point and the center of the key points) / (total distance) (8)

The DAR for the database image and the input image is calculated as:

DAR1 = [d11/dT1, d21/dT1, …, dM1/dT1] (9)

DAR2 = [d12/dT2, d22/dT2, …, dM2/dT2] (10)

Step 4: Distance masking is needed in order to check that the matched key points and the center of the
matched key points follow a similar pattern. The distance mask is calculated by testing whether the absolute
difference of the DARs is below the threshold value:

Distance mask = (abs(DAR1 − DAR2) < threshold) (11)

Step 5: The total number of validity points is obtained by summing the distance mask. The validity points are given as

Validity points = Sum(Distance Mask) (12)

If the number of matched key points is not larger than two, then the number of valid points is directly zero.

Step 6: Finally, to calculate the validity ratio of the key points, we simply divide the number of valid
matched key points by the number of matched key points:

Validity Ratio = Sum(Distance Mask) / M (13)

The higher the validity ratio, the more key points of the two images are matched; the maximum validity
ratio gives the matched result.
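Steps 1-6 above can be collected into one routine. The sketch below uses hypothetical names, and the default threshold of 0.035 matches the value used in the first test of the experiments.

```python
import math

def validity_ratio(pts1, c1, pts2, c2, threshold=0.035):
    """Validity ratio of matched key points (Steps 1-6 above, a sketch).
    pts1/pts2: matched key point coordinates in the database and input
    images; c1/c2: the image centres; threshold: mask threshold."""
    M = len(pts1)
    if M <= 2:                               # Step 5: too few matches
        return 0.0
    dist = lambda p, c: math.hypot(p[0] - c[0], p[1] - c[1])
    d1 = [dist(p, c1) for p in pts1]         # distances to centre, image 1
    d2 = [dist(p, c2) for p in pts2]         # distances to centre, image 2
    dar1 = [d / sum(d1) for d in d1]         # Eq. (9)
    dar2 = [d / sum(d2) for d in d2]         # Eq. (10)
    mask = [abs(a - b) < threshold for a, b in zip(dar1, dar2)]  # Eq. (11)
    return sum(mask) / M                     # Eqs. (12)-(13)

# Identical point patterns give DARs that agree everywhere: ratio 1.0.
pts = [(10, 0), (0, 20), (30, 30), (5, 5)]
print(validity_ratio(pts, (0, 0), pts, (0, 0)))  # 1.0
```

Because the DARs are normalized by the total distance, the measure is insensitive to a uniform scaling between the two images, which is consistent with the scale invariance claimed for the overall system.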

V. RESULT

Recognition is performed by first matching each key point independently to the database of key points
extracted from the training images. Many of these initial matches will be incorrect due to ambiguous
features or features arising from background clutter. Therefore, clusters of at least three features are first

identified that agree on an object and its pose, as these clusters have a much higher probability of being correct than
individual feature matches. The careful geometric fitting of each cluster is then used to accept or reject the
interpretation. After applying the above six steps for the validity ratio, we get three matching
results for a test image. As per the result shown in Figure 10, the key points of the test image are matched with
database images number 4, 23 and 24. It has been observed that 7 matches are found for image 4, 3
matches are found for image 23 and 32 matches are found for image 24 of the database.
The number 24 corresponds to the character ‘X’ in the alphabet; therefore the recognized character is ‘X’.
The combined result for these three characters is shown in Figure 11.

Figure 10: Matching between training and test image.

Figure 11: Combined matching result for image 4, 23 and 24 of database images.

Matching results during the first test

As stated earlier, in the first test the values of the important parameters, the distance ratio and the threshold,
are 0.65 and 0.035 respectively. The algorithm therefore gives us the three outputs whose key points are most
closely matched with the key points of the input image. During the second test the important parameter is again
the distance ratio, which is now incremented by 0.5, while the threshold value is decremented by 0.05. At the
end of the second test, the algorithm gives us the final output.


It has been observed that in the first test the key points of images number 4, 23 and 24 are most likely
matched with the key points of the input image. We have also shown the combined result for all
the matched points. Therefore, to differentiate them properly we consider the individual results
of each image shown in Table 1; the equivalent bar chart illustration is shown in Figure 12.

TABLE 1: INDIVIDUAL RESULTS FOR THE THREE IMAGES

x-position  y-position  x-position  y-position  Number of       Number of valid  Validity  Image
(database)  (database)  (test)      (test)      matched points  matched points   ratio     number
157.19      132.69      150.42      99.30        7               5               0.714      4
113.97      41.140      148.20      66.56        3               1               0.333     23
150.93      98.708      142.15      99.6        32              30               0.937     24

Table 1 shows the individual results of the images that are closely matched with the test image. The test
image is ‘X’ in Figure 11. The algorithm initially gives the three best results, and out of these three the
final result is selected depending upon the x-y position, the number of matched points, the number of valid match
points, and the validity ratio. Table 1 shows that images number 4, 23 and 24 are matched with the test image
(last column). Out of these three images it has been observed that row three has 32 key points matched with the
test image, 30 of which are valid, and its validity ratio is again close to unity. Row three also shows nearly
equal y-positions (98.708 and 99.6). So the algorithm finally gives us the matched result, i.e. image 24
(character ‘X’). A similar interpretation is given in Figure 13.

(a) Image 4 (b) Image 23 (c) Image 24

Figure 12: Bar chart results for (a) image 4 (b) image 23 and (c) image 24.

Figure 12 presents a bar-chart description of the results, which is explained as follows:

• BAR (1 and 2): Center point of the database image in X and Y position.
• BAR (3 and 4): Center point of the test image in X and Y position.
• BAR (5): Number of matched key points.
• BAR (6): Number of valid matched key points.
• BAR (7): Validity ratio.
• BAR (8): Matched result (for example, 24, which denotes character ‘X’).


Figure 13 provides a detailed discussion of the matching between the input image and the database images. As
mentioned earlier, image 4 corresponds to gesture D, image 23 corresponds to gesture W and
image 24 corresponds to gesture X. The input image is of gesture X. The first row shown in Figure 13
provides the information concerning the X-Y positions of the input image and the database images. Bars 2 and 4 of
image 24 are nearly equal, indicating that the X-Y positions of the input image and the database image are the same;
the X-Y positions of the other two images are different. Similarly, the second row provides the information
concerning the number of matched key points and validity key points between the input and the possibly matched
images. Bars 5 and 6 of image 24 show that this image has 32 key points matched with the input image, with 30
validity key points. Finally, the validity ratio, as shown by bars 7 and 8, is highest for image 24 compared
to the other two. This shows that the input gesture X is most likely matched with image 24 present
in the database. To yield the final output, the system automatically increments the distance ratio by 0.5 and
decrements the threshold value by 0.03. This method of re-checking improves the accuracy of matching.

Bar number   image 1 (4.jpg)           image 2 (23.jpg)          image 3 (24.jpg)          Remarks

1, 2, 3, 4                                                                                 Bars 2 and 4 of image 3 (24.jpg) are equal; the Y position of the input image and the database image is the same, which is not found in the other images.

5 and 6      Matched key points = 7    Matched key points = 3    Matched key points = 32   The numbers of matched key points and valid key points are greater in image 3 than in the other two images.
             Valid key points = 5      Valid key points = 1      Valid key points = 30

7 and 8      Validity Ratio = 0.7143   Validity Ratio = 0.3333   Validity Ratio = 0.9375   The validity ratio is highest in image 3. The matched character, corresponding to number 24, is ‘X’.
             Number = 4                Number = 23               Number = 24

Figure 13: Comparison of input image with database image.

Conclusion
This paper describes the implementation of an Indian Sign Language recognition system using the Scale
Invariant Feature Transform (SIFT). SIFT gives features that are stable under translation, rotation and scaling, and
it also gives stable features over cluttered backgrounds. The input image to be recognized is fed to the system;
the system acquires the SIFT features of the input image and compares these features with the images present in the
database. During the first test, the system gives the three best results that are most likely to match the
input image, for constant values of the threshold and distance ratio. Since the system returns three outputs, the
algorithm must go on to a second test. During the second test, the algorithm automatically increases the distance
ratio by 0.5 and decrements the threshold value by 0.03, which yields the conclusion. This method gives an
accuracy of 98.7% on the ISL database.

REFERENCES

[1] David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-
110.
[2] David G. Lowe, "Object recognition from local scale-invariant features," International Conference on Computer Vision, Corfu, Greece
(September 1999), pp. 1150-1157.
[3] Lindeberg, Tony, “Feature Detection with Automatic Scale Selection,” International Journal of Computer Vision, vol. 30, no. 2, pp. 77-
116, 1998.
[4] Sandeep B. Patil and G.R. Sinha, 2015, “Intensity based distinctive feature extraction and matching using Scale invariant feature transform
for Indian Sign Language” 17th International conference on Mathematical Methods, Computational Techniques and Intelligent System
(MAMECTIS15) held in Tenerife, Canary Islands, Spain, January 10-12, 2015. Pp-245-255. The ISI Journals (with Impact factor from
Thomson Reuters), ISBN:978-1-61804-281-1.
[5] Bhuyan M.K, Kar M.K & Debanga R.N, 2011, Hand Pose Identification from Monocular Image for Sign Language Recognition, IEEE
International Conference on Signal and Image Processing Applications (ICSIPA2011), 3(12):378-383
[6] Ghotkar A.S, Hadap M, Khatal R, Khupase S, 2012, Hand Gesture Recognition for Indian Sign Language, International Conference on
Computer Communication and Informatics (ICCCI -2012), Coimbatore, INDIA, 2(4):9-12.
[7] Nandy 2010, Recognizing & Interpreting Indian Sign Language Gesture for Human Robot Interaction, International Conference on
Computer & Communication Technology |ICCCT’10|:15-21.
[8] Rajam 2011, Real Time Indian Sign Language Recognition, IEEE conference on Image processing, 38-43.
[9] Yang quan,2013, Chinese Sign Language Recognition Based On Video, IEEE/ASME International Conference on Advanced Intelligent
Mechatronics, 67-74.
[10] Arulkarthick V.J, Sangeetha D. and Umamaheswari S, 2012, Sign Language Recognition using K-Means Clustered Haar-Like Features and
a Stochastic Context Free Grammar, European Journal of Scientific Research, 78(1):74-84.
[11] Alon J. and Athitsos V, 2009, A Unified Framework for Gesture Recognition and Spatiotemporal Gesture Segmentation, IEEE Transactions
on Pattern analysis and Medical Intelligence, 31(9):1685-1699.
[12] Chou, 2012, An Encoding and Identification Approach for the Static Sign Language Recognition, IEEE/ASME International Conference on
Advanced Intelligent Mechatronic. 28-34
[13] Jothilakshmi S, Palanivel S. and Ramalingam V, 2012, A hierarchical language identification system for Indian languages, Digital Signal
Processing 2(2):544–553.
[14] Kadam, 2012, American Sign Language Interpreter, IEEE Fourth International Conference on Technology for Education.108-117.
[15] Nguyen T.D and Ranganath S, 2012, Facial expressions in American sign language: Tracking and recognition, Pattern Recognition: 1877–
1891.


[16] Paulraj M P, Palaniappan R, Yaacob S, Zanar A, 2011, A Phoneme Based Sign Language Recognition System using 2D Moment Invariant
Interleaving feature and Neural Network, IEEE Student Conference on Research and Development, 2(11):111-116.
[17] Ahmed P, Mishra G.S and Sahoo A.K, 2012, A proposed framework for Indian signs language recognition, International Journal of
Computer Applications, 2(2): 158-169.
[18] Patil S.B and Sinha G.R, 2012, Real Time Handwritten Marathi Numerals Recognition Using Neural Network, International Journal of
Information Technology and Computer Science, 4(12): 76-81.
[19] Patil S.B, Sinha G.R and Thakur K, 2012, Isolated Handwritten Devnagri Character Recognition Using Fourier Descriptor and HMM in
International Journal of Pure and Applied Sciences and Technology (IJPAST), 8(1): 69-74.
[20] Patil S.B. and Sinha G.R, 2012, Off-line mixed Devnagri numeral recognition using Artificial Neural Network, Advance in computational
Research, Bioinfo journal, 4(1): 38-41.
[21] Patil S.B, Sinha G.R and Patil V.S, 2011, Isolated Handwritten Devnagri Numerals Recognition Using HMM, IEEE Second International
conference on Emerging Applications of Information technology, Kolkata, ISBN: 978-1-4244-9683-9:185-189.
[22] Agarwal Ila, Johar S and Santhosh J, 2011, A Tutor For The Hearing Impaired (Developed Using Automatic Gesture Recognition),
International Journal of Computer Science, Engineering and Applications (IJCSEA), 1(4):49-61.
[23] Akmeliawati R, Kuang Y.C, 2007, Real-Time Malaysian Sign Language Translation using Colour Segmentation and Neural Network, IMTC
2007 Instrumentation and Measurement Technology Conference Warsaw, Poland, 1-3 May 2007, 1(2):1-6.
[24] Agrawal A and Rautaray S.S, 2011, Interaction with Virtual Game through Hand Gesture Recognition, International Conference on
Multimedia, Signal Processing and Communication Technologies, 203, 1(7):244-247.
[25] Bhosekar A, Kadam K, Ganu R. & Joshi S.D, 2012, American Sign Language Interpreter, IEEE Fourth International Conference on
Technology for Education, 6(12): 157-159
[26] Bharti P, Kumar D and Singh S, 2012, Sign Language to Number by Neural Network, International Journal of Computer Applications,
40(10): 38-45.
[27] Ulrike Zeshan, Madan M. Vasishta and Meher Sethna, 2005, Implementation of Indian Sign Language in Educational Settings, Asia Pacific
Disability Rehabilitation Journal:16(1):16-40.
[28] Chakraborty P, Mondal S, Nandy A, Prasad J.S , 2010, Recognizing & Interpreting Indian Sign Language Gesture for Human Robot
Interaction, Int’l Conf. on Computer & Communication Technology |ICCCT’10|,2(10), 52(11): 712-717.
[29] Chen X and Zhou Y, 2010, Adaptive Sign language recognition with exemplar extraction and MAP/IVFS, IEEE signal processing letter,
17(3):297-300
[30] Chung H. W, Chiu Y. H and Chi-Shiang Guo, 2004, Text Generation From Taiwanese Sign Language Using a PST-Based Language Model
for Augmentative Communication, IEEE Transcation On Neural System And Rehabilitation Engineering, 12(4):441-454.
[31] Debevc M, Kosec P & Rotovnik M, 2009, Accessible Multimodal web pages with Sign Language Translations for Deaf and Hard of Hearing
Users, 20th International Workshop on Database and Expert Systems Application 3(9): 279-283.
[32] Dasgupta T, Basu A, Mitra P, 2009, English to Indian sign language machine translation system: A structure transfer framework,
international conference on artificial intelligence (IICAI):118-124.
[33] Domingo A, Akmeliawati R & Kuang Ye Chow, 2007, Pattern Matching for Automatic Sign Language Translation System using LabVIEW,
International Conference on Intelligent and Advanced Systems 2007, 9(7):660-665.
[34] Dharaskar R.V, Futane P.R, , 2012, Video Gestures Identification And Recognition Using Fourier Descriptor And General Fuzzy Minmax
Neural Network For Subset Of Indian Sign Language, 12th International Conference on Hybrid Intelligent Systems (HIS):525-530.
[35] Dharaskar R.V, Futane P.R, 2011, HASTA MUDRA, An Interpretation of Indian Sign Hand Gestures, 3rd International Conference on
Electronics Computer Technology, 1: 337-380.


INDEXES’ OPTIMAL SELECTION FOR DATA WAREHOUSE QUALITY

Dr. Murtadha M. Hamad #1, Mohanad Ahamed Salih #2
# Department of Computer Science, College of Computer Sciences & Information Technology, University of Anbar, Baghdad, IRAQ
1 [email protected]
2 [email protected]

Abstract—Query answering performance in a DW is highly improved by the bitmap index. The efficiency of
complex query processing is greatly increased through the use of Boolean operations (AND, OR, NOT, etc.) in the
selection predicate on multiple indexes, and storage space is used efficiently for attributes with low cardinality. The most
important metrics for evaluating the performance of an index are space and time, and in the case of the bitmap index
the focus here is on a time-optimal index. In this paper, we propose an algorithm to increase query performance
and reduce the response time using Boolean operations (AND, OR), called multi Bitmap Index (M-BI). The bitmap
index has been implemented on a group of tables, followed by the implementation of a bitmap index on multiple
columns in a table, as well as retrieval of a query on any two columns in the table at the same time, with the
columns swapped depending on the cardinality value. This study shows that the proposed method is better than
other existing techniques of bitmap indexing on a single column and a single table. One of the results showed that
the query access time from the multi Bitmap Index (M-BI) selection was found to be 4 milliseconds (ms), while the
same queries selected directly from the DW took 38 milliseconds. This shows that query performance through bitmap
index access is better than direct access through the DW by 892.10%.
Keywords: Data Warehouse, indexing, Bitmap index, query processing.

I. INTRODUCTION

A data warehouse (DW) stores large historical data collected from multiple heterogeneous data sources
to support complex queries for strategic decision making. These complex queries require aggregated
data and want results to be realized in a minimum response time; running queries directly on the DW results
in high response time [1]. In a DW, speeding up query processing is considered a sensitive issue. To
solve this problem, some mechanisms like summary tables and indexes can be used. However, although
summary tables perform well for predefined queries, in order to save space and time during query
processing, indexing is a better solution that does not require additional hardware. The challenge is to find a
suitable type of index that would improve the performance of a query. To speed up processing, some
relational database management systems implement new indexing techniques, such as bitmap indexing.
Bitmap indexes have a specific structure for quick data retrieval [2]. Due to the low cost of both
maintenance and construction, the bitmap index structure is used effectively in the DW. The bitmap index
structure minimizes the query response time, and consequently increases query answering efficiency [3].

II. BACKGROUND

In this section, we present some of the previous studies on the efficiency of the bitmap index in
the data warehouse and some of the algorithms that are close and related to our study.

• In 2012, Hamad, M. M., and Abdul-Raheem, M.: this paper achieved a number of results, such as:
the performance of query answering in data warehouses is highly improved by the bitmap index;
through using bitwise operations (AND, OR), it highly increases the efficiency of complex query processing.
A prototype data warehouse, “STUDENTS DW”, has been built according to Inmon’s data warehouse
conditions. This prototype is built for students’ information [3].


• In 2014, Kausar, Firdous, Sh Odah Al Beladi and Kholoud AL Shammari: this research offers
a comparison and analysis of some of the related facts that have been drawn from past resources
concerning bitmap indexing for data warehouses. The research aims to analyze and compare a number
of techniques concerned with bitmap indexing for data warehouses. Those techniques are: Scatter
Bitmap Index Optimization, Enhanced Encoded Bitmap Index, Evaluating the iceberg query through
the Compressed Index of Bitmap, and determining the Join Index of Bitmap through Techniques of Data
Mining. The comparison is in terms of strengths, weaknesses, differences and similarities of each
technique. Finally, this research gives a helpful explanation of the importance of the DW and the main
related techniques to enhance DW bitmap indexing [4].

• In 2016, Garhwani, Chiranjeev D., Shreya S. Kandekar and Payal S. Chirde: this research
presented a comprehensive review on processing large data sets. Set predicates, combined in a group,
allow selection of dynamically formed groups and set values. They presented a compressed bitmap index
based approach using variable-length coding to process large datasets. They observed that the
bitmap index has the following benefits: saving disk access by avoiding tuple scans on a table with a large
number of attributes, and reducing computation time by conducting bitwise operations [5].

From the above, we conclude that query optimization is the ultimate goal of enhancing the
performance of the bitmap index. Time and space are the most important metrics for the evaluation of
index performance. The bitmap index structure is used effectively in DWs due to the minimal cost
of construction and maintenance. The structure of the bitmap index increases query answering
efficiency by minimizing the query response time, using the Boolean operations (AND, OR,
NOT) in the selection predicate on multiple indexes, and it uses storage space efficiently for attributes with low
cardinality.

III. CONCEPT IN BITMAP INDEX

For read-only or read-mostly data, the bitmap index is considered one of the most efficient indexing
methods available for speeding up multi-dimensional range queries, while traditional tree-based indexing
structures are designed for datasets that change frequently over time [6]. Bitmap indexes
improve complex query performance by applying low-cost Boolean operations such as OR, AND,
and NOT in the selection predicate on multiple indexes at one time, to reduce the search space before
going to the primary source data [7]. Each column has its own factors, which are the criteria to choose
a proper index. These factors are explained below:

• Distribution: The column distribution is the frequency of occurrence of each distinct value of
the column. The distribution of a column guides the determination of which indexing technique should be
adopted [2].

• Value range: The range of values of an indexed column guides the selection of an
appropriate index technique. For example, if the range of a high-cardinality column is small, then
a bitmap-based indexing technique should be used [2].

• Cardinality: The cardinality of a column is the number of different values in that column. It is
better to know whether the cardinality of an indexed column is high or low, since an indexing technique
may work efficiently only with either high or low cardinality; e.g. the bitmap index only works well with
low-cardinality data [2].

• Read-only attributes: An attribute of stable values which, once stored in the DW, will not
be changed or modified. This means that the current values are fixed and new values are appended to
them. The read-only/mostly attribute is a key factor in choosing a column to be indexed, since
modifying a value in an indexed column means rebuilding the entire index, which consumes time and
space, and causes inefficient performance for the DBMS [3].

This paper chooses cardinality as the indexing factor; we depend on the concepts of distribution
and value range to select which technique is suitable to use according to the cardinality value, so the
bitmap index technique is chosen. In our work, we have stable data that will not change; in this case the index
of these data will also be stable, and when we add new data we need to add new values to the existing index
without changing or modifying the original one, depending on the concept of read-only attributes.


A. Features of Selecting an Appropriate Bitmap Index [2]
When choosing a bitmap indexing technique, there are many features that need to be taken into
consideration:
• The index should take small space and utilize that space efficiently.
• The index should be able to operate with other indexes for filtering out the records before the access
to the original data.
• The index should support complex and ad-hoc queries, and also speed up join operations.
• The index should be easy to build (dynamically generated), implement and then maintain.

…are built on columns. In this paper, we propose an algorithm to enhance query performance and
decrease the response time using Boolean operations (AND, OR), called multi Bitmap Index (M-BI).

V. FILE STRUCTURE OF A COMPANY SYSTEM DW

At this stage, the first stage of our proposed system, we built the data warehouse
tables (company system) in the SQL Server 2012 environment, and these tables were filled with a large
number of records; creating these tables was based on the requirements of any company system. Clients
is where all the information about the clients is saved; Suppliers is the same but for suppliers. Invoices fall
into two types, client invoices and supplier invoices, each of which has two tables, a main table and a
detailed one, (Invoices) and (Invoices − Details) for example. After creating the company system data
warehouse, it becomes ready for importing into the Visual Basic.net 2013 environment for completing the
next steps of the proposed system.

VI. BITMAP INDEX DESIGN

In this section, we propose an algorithm to enhance query performance and decrease the response
time with bitmap indexing for the data warehouse. We examine how query processing is affected by the bitmap
index, and study the efficiency of the query answering produced by the bitmap index. Figure 2 shows the
block diagram that describes the overall work, as follows.

B. Basic Bitmap Index
The main concept of the bitmap index is that, for an attribute with c distinct values, the basic bitmap index generates
c bitmaps with R bits each, where R is the number of records (rows) in the data set. If the attribute in
the record has a specific value, the corresponding bit is set to “1”; otherwise the bit is “0”. That is,
a bit with value 1 indicates that a particular row has the value represented by the bitmap. Figure 1 shows a
simple bitmap index with 4 bitmaps (B0-B3) [6].
Fig. 1: a logical view of the basic bitmap index for a


variable named X. B0-B3 are bitmaps. [6]
Fig. 2: block diagram of multi Tables and multi
columns DW with Bitmap Index

IV. T HE P ROBLEM
For decision-makers, recently, data warehouse
system has become more and more important. A. The Proposed System
Most of the queries against a large data warehouse In this section, we demonstrate and explain the
are iterative and complex. In the data warehouse main steps (phases) of the proposed design to work
environment the ability to answer these queries Multi Bitmap Indexes algorithm (M- BI ) . Figure 3
efficiently is considered to be a critical issue. The shows the main phases of the proposed system .
performance of queries, especially ad hoc queries
would be greatly improved if the right index structures
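The basic bitmap index described in Section III-B above is easy to sketch in code. The following is an illustrative example only (toy data, not the paper's SQL Server implementation): for an attribute with c distinct values it builds c bitmaps of R bits each.

```python
# Illustrative sketch of the basic bitmap index (not the paper's code):
# for an attribute with c distinct values, build c bitmaps of R bits,
# where bit r of bitmap v is 1 iff row r holds value v.

def build_bitmap_index(column):
    """Return {value: list of R bits} for one attribute column."""
    bitmaps = {v: [0] * len(column) for v in set(column)}
    for row, value in enumerate(column):
        bitmaps[value][row] = 1
    return bitmaps

# Example: a column X with 4 distinct values yields 4 bitmaps (cf. Fig. 1).
x = [2, 1, 3, 0, 3, 1, 2, 0]
index = build_bitmap_index(x)
# Each bitmap has R = 8 bits, and exactly one bitmap is 1 in every row.
```

Because every row sets exactly one bit per attribute, equality predicates reduce to reading off one bitmap, which is what makes the structure attractive for read-mostly data.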


Fig. 3: flowchart of Bitmap Index using the M-BI Algorithm

B. Suggested Algorithm

In this section we propose our indexing method, called multi Bitmap Index (M-BI), to improve query performance and decrease the response time. The flowchart above was constructed according to the following algorithm.

Input: databases entered by the user.
Output: an answer to the query with an evaluation according to the response time.

Step 1: analyze the databases.
Step 2: find the cardinality for each (g) column in the table, where the value of the cardinality is [no. of distinct values / total no. of records]; check the cardinality of the attribute and select an appropriate indexing technique.
Step 3: apply the bitmap index on any column. If the value of the cardinality of column (x) = 1 or the cardinality of column (y) = 1, skip to Step 10.
Step 4: choose any two columns from any table.
Step 5: choose a bitwise operation, (OR) or (AND).
Step 6: if the value of the cardinality of column (x) < the value of the cardinality of column (y), then swap ((x), (y)).
Step 7: send the query to the server, start timing, and wait for the answer.
Step 8: the server processes the query and sends the answer to the client.
Step 9: the client receives the data from the server, stops timing and calculates the query response time.
Step 10: end.

C. Cardinality of a Suggested DW

In this section, the term "cardinality" will be explained and implemented practically. As previously mentioned, cardinality refers to the number of distinct values in the column. In our project, we design a tool that calculates the cardinality for each column. The first step is creating a function in the database named "Cardinality", whose task is calculating the cardinality according to the given column name.

Algorithm to calculate the cardinality

Input: multiple columns appearing in the 'where' clause.
Output: the number of distinct values in the column (i.e. the cardinality).

Step 1: the user enters the table name at the client machine.
Step 2: the client sends the name of each column to the server and waits for the answer.
Step 3: the function 'Cardinality' receives the column name, opens a new session with the COMPANY SYSTEM DW database, calculates the number of redundant data, and returns the result to the server. If the cardinality is equal to one, then the column does not have redundant data, such as ID, which is the primary key.
Step 4: the client receives the result from the server and shows it to the user, which is the cardinality of the column. Figure 4 is an example of the work of the algorithm, where it checks the cardinality of the table "Supplying Invoice Details".

Fig. 4: Checking the cardinality of the table "Supplying Invoice Details"

VII. IMPLEMENTATION OF SUGGESTED ALGORITHM

As a start, we create the tables of our DW according to our prototype "company", which contains
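The cardinality computation can also be sketched outside the DBMS. The following is an illustrative stand-in for the database "Cardinality" function described above (whose SQL source is not listed in the paper): it returns the ratio of distinct values to total records used in Step 2 of the M-BI algorithm, where a ratio of 1 marks a unique column such as a primary key.

```python
def cardinality(column):
    """Ratio of distinct values to total records for one column.
    A value of 1 means every record is distinct (e.g. a primary key),
    so the bitmap index is skipped for that column (Step 3 of M-BI).
    Illustrative only; the paper implements this as a DB function."""
    if not column:
        return 0.0
    return len(set(column)) / len(column)

# A low-cardinality column suits a bitmap index; an ID column does not.
status = ["open", "closed", "open", "open", "closed"]   # toy data
ids = [1, 2, 3, 4, 5]                                   # toy primary key
low = cardinality(status)   # 2 distinct / 5 records
high = cardinality(ids)     # every value distinct
```

In SQL the same ratio would come from something like a COUNT(DISTINCT col) over COUNT(*) query; the exact form of the paper's stored function is an assumption here.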


several tables (Warehouse, Supplying Invoice, Invoices, Supplying Invoice Details and Clients). We choose a table, and then we calculate the cardinality for each column of the chosen table according to the cardinality algorithm. After that we retrieve a query of any two columns in that table, at the same time swapping the columns depending on the value of the cardinality. When the M-BI algorithm was applied, the outputs proved a decrease in the query response time, which proves the efficiency of the proposed algorithm. We perform our experiments to test the response time of the bitmap index. The experiments applied on the project consist of three steps, as follows.

Step One: search any table and choose any column, using the bitmap index and, for comparison, without the bitmap index. A search is done on three groups of records: 105000, 450000, and 750000.

SELECT * FROM TBL-CLIENTS
where [Email] = '[email protected]'

SELECT * FROM TBL-INVOICE-DETAILS
where [Quantity] = 3

SELECT * FROM TBL-SUPPINVOICE-DETAILS
where [Quantity] = 84

TABLE I: Response Time of Query Answering in Step One of Algorithm

Fig. 5: The performance of query by direct access and access through Bitmap Index in Step One of Algorithm

Step Two: search any table and choose any two columns, using the bitmap index and, for comparison, without the bitmap index, with the Boolean operations (AND, OR). A search is done on three groups of records: 105000, 450000, and 750000.

SELECT * FROM [TBL-CLIENTS]
where ([Email] = '[email protected]' OR [Name] = 'Ali Saeb Razzaq')

SELECT * FROM [TBL-INVOICE-DETAILS]
where ([Rto] = 7283 AND [Quantity] = 3)

SELECT * FROM [TBL-SUPPINVOICE-DETAILS]
where ([Rto] = 6350 AND [Quantity] = 84)

TABLE II: Response Time of Query Answering in Step Two of Algorithm

Fig. 6: The performance of query by direct access and access through Bitmap Index in Step Two of Algorithm
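Step Two's two-column Boolean search can be mimicked with the bitmaps themselves: a predicate `col_a = v1 AND col_b = v2` is answered by bitwise-ANDing the two value bitmaps and reading off the surviving rows. The sketch below is illustrative only; the column names and values are toy stand-ins for the paper's tables, not its data.

```python
# Sketch of answering a two-column AND predicate with bitmap indexes
# (illustrative; rto/quantity values are invented, not the paper's data).

def build_bitmap_index(column):
    """{value: list of R bits}, bit r set iff row r holds the value."""
    bitmaps = {v: [0] * len(column) for v in set(column)}
    for row, value in enumerate(column):
        bitmaps[value][row] = 1
    return bitmaps

def and_query(index_a, value_a, index_b, value_b, n_rows):
    """Row numbers satisfying col_a = value_a AND col_b = value_b."""
    zeros = [0] * n_rows
    ba = index_a.get(value_a, zeros)
    bb = index_b.get(value_b, zeros)
    return [r for r in range(n_rows) if ba[r] & bb[r]]

rto = [7283, 6350, 7283, 7283, 6350]
quantity = [3, 3, 5, 3, 84]
rows = and_query(build_bitmap_index(rto), 7283,
                 build_bitmap_index(quantity), 3, len(rto))
```

An OR predicate works the same way with `|` instead of `&`; only rows whose bits survive the bitwise operation need to be fetched from the table, which is the source of the speed-up measured in Table II.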


Step Three: search any table and choose any two columns with the Boolean operations (AND, OR), where the columns are swapped under the proposed algorithm based on the cardinality values, compared against not swapping the columns. A search is done on three groups of records: 105000, 450000, and 750000.

SELECT * FROM [TBL-CLIENTS]
where ([Name] = 'Ali Saeb Razzaq' OR [Email] = '[email protected]')

SELECT * FROM [TBL-INVOICE-DETAILS]
where ([Quantity] = 3 AND [Rto] = 7283)

SELECT * FROM [TBL-SUPPINVOICE-DETAILS]
where ([Quantity] = 84 AND [Rto] = 6350)

TABLE III: Response Time of Query Answering in Step Three of Algorithm

Fig. 7: The performance of query by direct access and access through Bitmap Index in Step Three of Algorithm

VIII. QUERY OPTIMIZATIONS

Query optimization is the ultimate target of enhancing the performance of the bitmap index. Results obtained during the current work indicate query optimization through the decreasing of the query response time. The optimization in the queries was due to the use of the logical operations (AND, OR) and the swapping of columns, as shown in the following Table IV.

TABLE IV: Response Time of Query Answering in Step Three of Algorithm

The performance of queries by direct access and access through the bitmap index is represented in Figure 8.

Fig. 8: The performance of query by direct access and access through multi Bitmap Index Algorithm

From this performance anyone can judge the efficiency of bitmap index queries over queries that directly access the data warehouse. The response time of queries through bitmap index access was found to be 04 milliseconds, while through direct access it was found to be 38 milliseconds. Hence the efficiency of queries through bitmap index access over direct access is:

Efficiency = (Direct access of queries − Bitmap Index access of queries) / Bitmap Index access of queries × 100
= (38 − 04)/04 × 100
= 850%

i.e. bitmap index access is 850% more efficient than direct access.

IX. RESULTS DISCUSSION

As a result, we concluded some notes depending on observations and induction of the results of the query response times.

1. The bitmap index works more efficiently with multiple columns and the bitwise operation (AND) than with a single column, through the decrease of time


and space when dealing with updates and deletes of the data source, accounting the cardinality for multiple columns by considering the updated data records without re-computing the whole process.
2. Increasing the number of bitwise operations means decreasing the query response time; this leads to improved query performance.
3. Bitmap indexes provide better performance when queries use combinations of multiple conditions with OR/AND operators.
4. Bitmap index efficiency increases with an increasing number of records.
5. The bitmap index works more efficiently when the columns are swapped depending on the cardinality values.

X. CONCLUSIONS AND FUTURE WORK

Complex and interactive queries are often found in the data warehouse environment. They consume a long time to access, find and retrieve the answers to the queries. Space and time are the most important metrics to evaluate the performance of an index, and in the case of the bitmap index the focus was on an optimal-time index. Several bitmap indexes have been introduced, aiming to reduce the space requirement and improve query processing time. In this paper, we proposed multi Bitmap Index (M-BI), an efficient algorithm to improve query performance and decrease the response time of the bitmap index in a DW. Results obtained during this work indicate that query optimization is achieved through the decreasing of the query response time. Our future work will explore techniques to choose other structures (e.g. join indexes and materialized views) for database design in addition to indexes.

REFERENCES

[1] Bhosale P., et al. "Efficient Indexing Techniques On Data Warehouse." International Journal of Scientific & Engineering Research, Volume 4, Issue 5, May 2013.
[2] Abdulhadi, Zainab Qays, Zhang Zuping, and Hamed Ibrahim Housien. "Bitmap Index as Effective Indexing for Low Cardinality Column in Data Warehouse." International Journal of Computer Applications, 2013.
[3] Hamad, Murtadha M., and Muhammed Abdul-Raheem. "Evaluation Of Bitmap Index Using Prototype Data Warehouse." International Journal of Computers & Technology, Volume 2, No. 2, April 2012.
[4] Kausar, Firdous, Sh Odah Al Beladi, and Kholoud AL Shammari. "Comparative Analysis of Bitmap Indexing Techniques in Data Warehouse." International Journal of Emerging Technology and Advanced Engineering, 2014.
[5] Garhwani, Chiranjeev D., Shreya S. Kandekar, and Payal S. Chirde. "A Review on Query Processing and Optimization in SQL with different Indexing Techniques." International Journal of Advanced Research in Computer and Communication Engineering, 2016.
[6] Mei, Ying, Kaifan Ji, and Feng Wang. "A Survey on Bitmap Index Technologies for Large-Scale Data Retrieval." Intelligent Networks and Intelligent Systems, 2013 6th International Conference on. IEEE, 2013.
[7] Guadalupe Canahuate, Tan Apaydin. "Secondary Bitmap Indexes with Vertical and Horizontal Partitioning." ACM, EDBT 2009, March 24–26, 2009, Saint Petersburg, Russia.


A Survey on Different Methods of Software Cloning and Detection

Syed Mohd Fazalul Haque, Maulana Azad National Urdu University, [email protected]
V Srikanth, K L University, [email protected]
E. Sreenivasa Reddy, Acharya Nagarjuna University, [email protected]

Abstract

Software cloning or code cloning is the process of reusing code in software development. Duplicate copies made with or without significant changes are also known as code smells. Recent studies revealed that around 5% to 20% of the code in software systems consists of clones. Those clones mostly result from copying existing code snippets to promote reusability. One of the important issues with such clones is that a bug is propagated to all clones, and the detection and fixing process needs to be duplicated in all the clones, causing maintenance problems in terms of cost, time and effort. Refactoring of clones appeared to be a good solution as it can be done automatically. However, research revealed that refactoring is not desirable with respect to certain clones. Nevertheless, it is very acceptable at least to detect clones. In this paper, the present state of the art of software cloning is explored in terms of clone detection research. Clone definitions, clone taxonomies, detection approaches, tools, and applications of clones are discussed in this paper. We also throw light on whether code cloning is harmful, and there is proof to be affirmative. We believe that research on software cloning in the context of modern deployment approaches in the cloud, and on cloning approaches carried out illegally, is still open research. In future we endeavour to explore it, along with the usage of ontologies in clone detection and representation.

Index Terms – Software cloning, code cloning, clone detection, applications, clone detection tools

INTRODUCTION

In the software engineering domain, cloning has been around for a long time. It is achieved by copying a piece of code and reusing it with or without making any changes. These activities are prevalent in the industry. This phenomenon is named code cloning, and the duplicated code is known as a clone. There is an issue here: in later stages it is difficult to differentiate the original code from the clone, therefore both the original and the duplicated one are simply known as clones. Many researchers concluded that a software system with clones is likely to cause maintenance issues [45], [8]. The tendency of making clones not only causes maintenance problems but also errors [21], [71], [72]. Code clones are also known as a kind of bad smell [30]. There is subtle evidence in the literature that code clones have an adverse impact on the life-cycle of systems. Removing


clones and preventing them from reoccurring is important [69].

Generally, copy-paste activities result in clones. This practice can help developers to improve productivity, especially in the case of device drivers that look the same for many devices [72]. There are other reasons for duplicate code, such as coding style and performance [13]. Accidental cloning is another phenomenon, which is due to the usage of the same API while implementing protocols [3]. In the literature many other reasons were found for duplicate code, as explored in [8], [13], [47], [48], [62], and [75].

The research found in the literature also found that there are serious problems caused by code cloning in real-world software systems [5], [22], [8], [13], [28], [47], [48], [63], [62], and [75]. In the presence of clones, system functionality is not affected, but clones cause maintenance issues and development becomes very expensive [81]. Clones have a negative impact on the evolution of software systems [33], [32], [31] in terms of comprehensibility, maintainability and quality. Update anomalies are increased with code cloning [12]. Duplicate code also causes issues when finding and fixing bugs. Missing procedural abstraction and missing inheritance are other problems when there is too much cloning in a system [28]. From the literature it is understood that the financial impact due to cloning is very high. After delivery of a system, the cost of maintenance is estimated at 40% to 70% of the total cost [36]. In existing systems a significant amount of code is duplicated [57], [72], and [43]. According to Baker's research [8], in large systems approximately 13% - 20% of the code is duplicated. According to Lague et al. [69], with respect to function clones, clones are reported at 6.4% to 7.5%, while Baxter et al. [13] opined that 12.7% of the code in software systems is in the form of clones.

Industrial source code, according to Mayrand et al. [75], contains clones at 5% to 20%. In the case of large software systems, as per Kapser & Godfrey [49], clones are at 10% to 15%. A COBOL system with object-oriented code was found to have 50% duplicated code [28]. From this it is understood that there is a huge amount of duplicated code and that it causes maintenance issues. Fortunately many researchers bestowed mechanisms for finding software clones [7], [9], and [13]. Nevertheless, there is no single definition for code cloning; there are many terms used, such as identical program fragment [13], duplicate code [28], clone [68] and so on.

Many researchers found in [13], [48], [75], [62], [11], [7], [28], [68], [59] contributed mechanisms to detect clones, and there are refactoring approaches to remove code clones [10], [30], [40], [29], [61]. However, refactoring clones was not found to be a perfect solution, though it can improve the quality of code. Sometimes even refactoring may not improve quality [57]. Cordy did research on this issue and found that in some financial systems it is not advisable to refactor the clones in the context of risk management [23]. Another point of view in the literature is that clones are useful in some scenarios [54], [55]. However, there is overall support for the fact that


code duplication can cause maintenance problems [30], [47], [75], [26] and [74].

There are many software engineering tasks such as bug detection, virus detection, code compaction, evolution analysis, investigation of copyright infringement, plagiarism detection, aspect mining and understanding code quality. These tasks need to detect similar code [64]. However, the code duplicates cannot be found in XP software as it can improve performance [88].

REASONS FOR CODE DUPLICATION

Code duplication does not occur automatically. Clones are introduced either accidentally or due to the factors explored in [8], [13], [47], [48], [62], [75], [80], and [56]. The prime reasons include design, logic, and reusing code. Reuse by copy and paste is one of the reasons [56]. Forking, explored by Kapser & Godfrey [54], is another way of reusing code, which is reusing similar solutions. Reuse of logic, functionalities and design also causes code duplicates. For instance, device drivers in Linux have many duplicates [35].

When a system is developed, clones may be introduced in activities such as merging two systems, generative programming, and delay in restructuring. There are some scenarios in which clones are introduced for maintenance benefits. When there is risk in developing new code [23], to have a similar architecture [54], to speed up maintenance [80], to bring robustness to machine-critical systems, and for high-cost functionalities are the scenarios in which clones are intentionally used.

DRAWBACKS OF CODE DUPLICATION

Code duplication can have many drawbacks. They are bug propagation [45], [72], introduction of new bugs [44], [13], bad design [78], difficulty in change management [47], [75], increased maintenance cost [75], [78], increased resource requirements as the system size increases, and difficulty in making changes at the architectural level [47].

APPLICATIONS OF DETECTING CODE CLONES

There are many applications of detecting code clones in addition to code refactoring. Code duplicates are potential sources for library candidate identification. Once they are incorporated in a library, code reuse is possible without having the drawbacks. Code duplicates also help in understanding programs: the comprehension of one copy of code will help to understand other copies without much effort. Aspect mining research can get benefited from clones. The duplicated code in cross-cutting concerns can be separated and kept as a single aspect that can be reused across the application [16], [17].

Detection of clones can have many other applications, such as finding usage patterns [80], detection of malicious software, detection of plagiarism, and detection of copyright infringement [86], [8], and [48]. Cloning also helps in the research related to software evolution, due to the dynamic nature of clones across different versions [4], [5], [35], [27], and [76]. Clone detection can also help in


code compacting in order to reduce the size of source code [20], [25].

IS THE SOFTWARE CLONING HARMFUL?

Many researchers found that software clones are harmful [5], [8], [13], [19], [28], [47], [51], [62], and [75]. One main reason is that the effort to make changes to one copy is repeated several times [34]. Automatic refactoring is the possible way to overcome this problem [10], [13]. Nevertheless, some studies carried out in the recent past suggest that refactoring is not the best solution as it is not desirable in some cases. The cloning has many issues such as increased size of source code [10], reduced comprehensibility [82], and problems in creation and use [87]. Many studies found in the literature [50], [52], [49] proved that cloning is harmful. There was also research to find out the relation between change couplings and code clones. When a code change is required, the code clones cause change couplings and ultimately that results in maintenance problems. The researchers [73], [34], [57] made experiments and proved that cloning is harmful. The studies in [6], [67] and [57] found that code cloning causes inconsistencies in a software system; the effect will bring about issues in the maintenance of the system in the long run.

TYPES OF CODE CLONES

Code clones are of different types. However, there are two kinds of similarities in finding code clones: similarity in the code and similarity in the function. Copy-pasted code results in the first kind, while the second one is based on the functionality of the code.

Type of code clone | Description | More information
Type I | Code is similar but with different white spaces | Variations are in white spaces
Type II | Syntactically or structurally similar fragments | Variations are in identifiers, comments, layout, literals and types
Type III | Copied copies with modifications | Changes in code, and variations in identifiers, comments, layout, literals and types
Type IV | Functionally similar codes | Known as semantic clones

Table 1 – Different types of code clones
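Table 1's Type I clones (identical up to white space) suggest the simplest detection strategy: normalize the layout, then compare fragments for equality. The sketch below is illustrative only and is not any of the surveyed tools; the code fragments are invented.

```python
import re

def normalize(fragment):
    """Collapse white space so that Type I variants compare equal.
    Type II changes (e.g. renamed identifiers) still compare unequal."""
    lines = [re.sub(r"\s+", " ", ln).strip() for ln in fragment.splitlines()]
    return "\n".join(ln for ln in lines if ln)

a = "if (x > 0) {\n    y = x;\n}"
b = "if (x > 0)   {\n y = x; \n}"   # same code, different white space (Type I)
c = "if (x > 0) {\n    z = x;\n}"   # renamed identifier (Type II territory)
```

Detecting Type II clones additionally requires normalizing identifiers and literals, and Type IV clones require comparing behavior rather than text, which is why the surveyed techniques differ so widely.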


NEAR-MISS CLONES

When copied fragments are very similar to the original code, they are known as near-miss clones. Editing of unimportant things in the code causes near-miss clones. Type II clones come under near-miss clones. Sometimes slight changes in the source code may not show much difference; hence, in this sense, Type III clones are also considered near-miss clones.

CLONE DETECTION TOOLS AND TECHNIQUES

Many detection techniques are found in the literature, and most of the detection techniques are available in the public domain. These tools can detect different kinds of clones. The techniques are classified into different kinds based on the properties described below.

Normalization: These techniques do not directly compare clones. They compare after making some kind of transformation. For instance, they may remove white spaces and comments, and even use an alternative form of the code for detection of clones.

Source Representation: Different types of transformations are used to obtain a code representation. This is done to meet the requirements of the algorithm used to compare and detect clones. The code representation is used for comparison.

Comparison Granularity: Different levels of code granularity are possible. In the comparison phase there might be different kinds of granularity that are exploited by algorithms.

Comparison Algorithm: Selection of an algorithm also has an impact on the detection performance. Algorithms from different areas are used for duplicate detection; for instance, a sequence matching algorithm is used in bioinformatics.

Computational Complexity: This is another major concern while determining a clone detection algorithm. Again, the complexity depends on the kind of transformation and the granularity level.

Clone Similarity: The method of interest is used to detect clones. Some techniques expect exact clones, while others can make use of parameters and also near-miss clones.

Clone Granularity: Fixed and free are the two granularity levels. When there is a pre-defined boundary in the code, we can call it fixed, while where there is no such thing we can call it free.

Language independency: This is another major factor for clone detection. As software can support many languages, it is an important consideration in any system where clone detection is employed.

Groups of Clones: Clone classes and clone pairs are known as groups of clones. It is important to understand whether the clone detection software will return such groups of clones.

Clone refactoring: This is to find out whether the detection technique supports clone refactoring or not. Some techniques support automatic refactoring, while others need manual intervention.
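Several of the properties above (normalization, source representation, comparison granularity) come together in fingerprint-based comparison, of which Johnson's Karp-Rabin fingerprints are a classic instance. The rough sketch below hashes every window of normalized lines and reports windows sharing a fingerprint; it is illustrative only, simpler than a true Karp-Rabin rolling hash, and not Johnson's actual tool.

```python
from collections import defaultdict

def line_fingerprints(lines, window=3):
    """Hash every `window`-line sequence of normalized lines; windows
    that share a fingerprint are candidate clone pairs.
    (Fingerprint scheme in the spirit of, but simpler than, Karp-Rabin.)"""
    seen = defaultdict(list)
    normalized = [" ".join(ln.split()) for ln in lines]  # Type I normalization
    for i in range(len(normalized) - window + 1):
        fp = hash(tuple(normalized[i:i + window]))
        seen[fp].append(i)
    # keep only fingerprints occurring more than once (candidate clones)
    return {fp: pos for fp, pos in seen.items() if len(pos) > 1}

src = ["a = 1", "b = a + 1", "print(b)",
       "x = 9",
       "a = 1", "b = a + 1", "print(b)"]   # toy duplicated block
dupes = line_fingerprints(src)             # the 3-line block recurs at 0 and 4
```

Hash collisions make the reported windows candidates rather than confirmed clones, so real tools verify matches with an exact comparison afterwards, a design trade-off between speed and precision.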


Language Paradigm: It indicates the language paradigm employed for duplicate detection, that is, whether the detection technique is suitable for procedure-oriented programming or object-oriented programming.

Whatever the features of a technique, the detection process has two phases, known as the transformation phase and the comparison phase. These names are descriptive in nature: in the first phase code transformation takes place, while the second phase does the clone detection.

VISUALIZATION OF CLONES

Clone pairs and clone classes are the two kinds of information shown by all clone detection tools. Textual information such as the names of clones, the start and end positions and line numbers are provided by the tools. The clones differ in information such as type, degree of similarity, size and granularity. In big systems like httpd from Apache, CCFinder finds a huge number of clone pairs [53]. As the returned information is insufficient, the visualization of clones is said to be difficult. Efficient visualization tools are required in order to provide a more intuitive visualization of clone detection information. Many tools came into existence for visualization of clone detection information, as explored in [22], [28], [8], [79] and [84]. There are many visual notions used: a dot indicates clones are similar, and scatter-plots are used to visualize clone difference. In [37] an enhanced scatter-plot was used.

Hasse diagrams are used in [46] for visualizing relationships between clones. The diagram contains edges and nodes: the source files are shown as nodes, while the relationships between them are shown as edges. The height of a node is used to denote the size. Later on, in [43] the hyperlinked web pages concept was introduced for navigating through clone classes. A metrics graph was proposed in [38] for visualization of similarities.

Polymetric views [70], [79], CLICS [53], the taxonomy of clone types [52], an Eclipse plug-in for visualizing the CloneDR tool, the AJDT visualizer, and the code clone exploration tool [1], [2] known as SoftGUESS are other tools available for visualizing clone detection details. Jiang et al. [42] adapted the cohesion and coupling concept to code cloning with an architectural-level visualization. In [41] a framework was proposed to understand clone information extracted from large systems. The framework also used a data mining technique for finding clone patterns. The tools apply various levels of information, including textual similarity, in order to judge clones. Then interactive visualization is provided so as to help end users explore the clones extracted from the system.

OPEN PROBLEMS IN SOFTWARE CLONING

There are many open issues pertaining to software cloning. Many were raised earlier in the literature as well. For instance, in the Dagstuhl seminar in 2006 many open problems were discussed. In an international workshop [85], 57 open issues were raised on software cloning. In the workshop two decisions were made for each question: first, a decision to determine whether the problem is already solved, solved partially, or remains unsolved; second, adding the problem to one of the classifications of clone research. From


the review of the recent literature we came to know that there has been considerable research on software cloning. However, it is still an open area where research can be continued, as software development platforms and software deployment models have changed largely due to cloud computing and mobile cloud computing technologies and other technology innovations.

ONTOLOGIES FOR CLONE DETECTION AND REMOVAL

Ontologies can be used to leverage knowledge representation with respect to clone detection and removal. Not only knowledge representation but also the clone detection process can be leveraged using ontologies. Software evolution analysis is also possible with ontologies [89], and ontologies can be used to monitor clones and visualize different relationships as well [90]. Ontologies and their success factors with respect to clone detection in software engineering are explored in [91]. Ontologies are a potential area in modern clone detection research, as they can help the researcher with knowledge representation and ease of navigation; the representation can provide details such as concepts and the relationships among them.

Ontology-based solutions in the healthcare domain were explored in [92]. Beyond knowledge representation, ontologies can help in navigation and in showing the relations among the different aspects involved in a research area. Rosa et al. [93] explored ontologies for the detection of approximate clones in business process model repositories in the software engineering domain. In [94] ontologies are explored for knowledge representation in the context of cloud computing.
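As a hedged illustration of how an ontology could represent clone-detection knowledge, the sketch below stores concepts as subject-predicate-object triples and supports simple pattern queries for navigation. The class and property names are invented for this example; they are not taken from [89]-[94].

```python
# Invented miniature ontology for clone-detection concepts, stored as
# subject-predicate-object triples (all names below are illustrative only).
TRIPLES = {
    ("Type1Clone", "subClassOf", "CodeClone"),
    ("Type2Clone", "subClassOf", "CodeClone"),
    ("TokenBasedDetector", "subClassOf", "CloneDetector"),
    ("CodeClone", "detectedBy", "CloneDetector"),
}

def query(s=None, p=None, o=None):
    # Return every triple matching the partially specified pattern.
    return [t for t in TRIPLES
            if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])]

# Navigation example: all concepts directly linked to CodeClone.
linked = query(o="CodeClone") + query(s="CodeClone")
```

A full system would use a standard vocabulary such as RDF/OWL, but even this toy shows the two benefits named above: explicit concepts and navigable relationships among them.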
 
 
 
   
   
   
   
   
   
   
 
   
   
   
   
 


 
Properties | Baker [7], [8] | Johnson [45], [102] | Ducasse et al. [28], [103] | Marcus & Maletic [104]
Transformation of code | White spaces and comments are removed | Removes white spaces and comments | Comment delimiters are removed | Applies syntactical token transformation after removing white spaces and comments
Representation of code | Parameterized token strings | Sub-strings are represented as fingerprints | Effective sequence of lines | Text appears like natural language
Technique for comparison | Employed suffix-trees for token matching | String matching based on Karp-Rabin fingerprinting | Employed a method named Dynamic Pattern Matching | A method based on graph theory
Complexity of the method | O(n+m), n denotes input lines and m denotes # of matches found | Not Available | O(n²), n represents input lines | Not Available
Granularity of comparison | Tokens in a line | Sub-strings in code | Line in code | Paragraphs, sentences and words
Granularity of code | Based on threshold with minimum 15 lines | Based on free threshold with minimum 50 lines | Based on free threshold and longest matches | Code segments and functions
Similarity of clone | Parameterized and exact matches | Exact matches that are repeated | Only exact matches | Near miss or exact ADT
Language independence | A lexer is required | Only source code text is considered | Does not need lexer/parser | A lexer is required
Types of Output | Clone class or clone pair in textual format | Clone pair in the form of text | Clone pair in the form of text | -
Refactoring of clone | Human intervention is required | Human intervention is required | Human intervention is required | Human intervention is required

Table 2 – String based clone detection techniques
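The string-based family summarized in Table 2 can be illustrated with a minimal sketch of our own (not any of the surveyed tools): lines are normalized by collapsing whitespace, and identical windows of consecutive lines are grouped as clone candidates, mirroring the free-threshold, line-granularity entries above.

```python
# Minimal string-based detector: normalize lines, then report every window of
# `window` consecutive normalized lines that occurs more than once.
from collections import defaultdict

def string_clones(lines, window=3):
    norm = [" ".join(l.split()) for l in lines]
    seen = defaultdict(list)
    for i in range(len(norm) - window + 1):
        seen[tuple(norm[i:i + window])].append(i)  # window -> start lines
    return {k: v for k, v in seen.items() if len(v) > 1}  # clone classes

code = ["a = 1", "b = 2", "c = 3", "x = 9", "a = 1", "b  =  2", "c = 3"]
clones = string_clones(code)  # the two 3-line fragments form one clone class
```

Real systems add comment stripping and use hashing or suffix structures to avoid quadratic work, but the starting-line lists per repeated window are exactly the "clone class in textual format" output style listed in the table.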


Properties | Baker [8] | Kamiya et al. [101] | Li et al. [71], [72]
Transformation of code | Comments and whitespaces are removed | Many parameter and transformation replacements besides removal of whitespaces and comments | Source is mapped to a collection of sequences; identifiers are mapped to similar identifiers
Representation of code | Parameterized token string | Normalized, parameterized and transformed sequence of tokens | Group of short sequences
Technique for comparison | Token matching based on suffix-tree | Token matching based on suffix-tree | Employed a technique named frequent subsequence mining
Complexity | O(n+m), n denotes input lines and m denotes # of matches | O(n), n denotes length of source | O(n²), n denotes lines of code
Granularity in comparison | A line's token sequences | Token | Set of tokens identified in a basic block
Granularity in cloning | Based on free threshold with minimum of 15 lines | Based on free threshold with minimum of 30 tokens | Based on free threshold pertaining to functions and basic blocks
Similarity in cloning | Parameterized and exact matches | Near miss or exact matches with possible gaps | Near miss or exact matches with possible gaps
Language independence | A lexer is required | Lexer with transformation rules is required | A full parser is required
Types of Output | Clone class and clone pair in textual format | Clone class and clone pair in textual format | Clone pair
Refactoring of clone | Human intervention is required | Human intervention is required | Human intervention is required

Table 3 – Token based clone detection techniques
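A minimal sketch of the token-based, parameterized matching summarized in Table 3 (our own toy, using Python's standard tokenize module rather than any of the surveyed tools): identifiers and literals are abstracted so that consistently renamed fragments compare equal.

```python
# Toy parameterized token matching: identifiers and numbers are abstracted so
# that consistently renamed fragments produce identical token sequences.
import io
import tokenize

def param_tokens(src):
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        if tok.type == tokenize.NAME:
            out.append("$id")        # abstract identifiers
        elif tok.type == tokenize.NUMBER:
            out.append("$num")       # abstract numeric literals
        elif tok.type == tokenize.OP:
            out.append(tok.string)   # keep structure-bearing operators
    return out

a = param_tokens("total = price * 3\n")
b = param_tokens("sum = cost * 7\n")
# a == b, so the two fragments are parameterized (type-2) clones.
```

Tools like those in the table then search the abstracted token stream for long repeated subsequences, typically with a suffix tree or suffix array rather than pairwise comparison.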


Properties | Baxter et al. [13] | Yang [98] | Wahler et al. [99] | Evans [100]
Transformation of code | Parsing is done to have AST | Parsing is done to have a variant's parse tree | Parsing is done to have AST and then XML | Parsing is done to have AST and then XML
Representation of code | Represented as AST | Represented as parse tree | Represented as AST in XML | Represented as AST in XML
Technique for comparison | Employed tree matching technique | Dynamic programming with tree matching | Comparison is made through frequent itemset | Employed graph theory
Complexity | O(N), N denotes # of AST nodes | O(S1·S2), S1 denotes # of nodes of the first tree, S2 denotes # of nodes of the second tree | O(k·n²), n denotes statements that are part of clones, k denotes clones of maximal size | Not Available
Granularity in comparison | AST node | Tree node or token | One line of code | AST node
Granularity in cloning | Threshold-based free tree similarity | Free threshold based (usually 5 statements) | Segment or gram or free | Threshold based or fixed
Similarity in cloning | Near miss and exact | Near miss and exact | Parameterized and exact | Near miss with gap and exact
Language independence | Parser is required | Lexer is required for parsing | Pretty-printer and parser are needed | Parser is required
Types of Output | Output is sent to a pretty-printer | Clone pair | - | Clone information in the form of an HTML document
Refactoring of clone | Employs mechanical refactoring | Refactoring is not supported | May be semi-automatic | Human intervention is required

Table 4 – Tree based clone detection techniques
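The tree-based approaches of Table 4 can be illustrated with a small sketch of our own built on Python's ast module (it is not CloneDR or any surveyed tool): every statement subtree is dumped with identifiers abstracted, and subtrees with identical dumps are reported as candidate clones.

```python
# Toy tree-based detection with Python's ast module: statement subtrees are
# dumped with identifiers abstracted; equal dumps mark candidate clones.
import ast

def subtree_clones(src):
    buckets = {}
    for node in ast.walk(ast.parse(src)):
        if isinstance(node, ast.stmt):
            key = ast.dump(node, annotate_fields=False)
            # Abstract variable names so structurally equal statements match.
            for name in sorted({n.id for n in ast.walk(node)
                                if isinstance(n, ast.Name)}):
                key = key.replace(f"'{name}'", "'$id'")
            buckets.setdefault(key, []).append(node.lineno)
    return [lines for lines in buckets.values() if len(lines) > 1]

clone_lines = subtree_clones("a = b + 1\nx = y + 1\n")  # lines 1 and 2 match
```

Production tree-based detectors hash subtrees into buckets exactly like this, but compare near-miss candidates within a bucket by tree similarity instead of requiring identical dumps.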


Properties | Kontogiannis [62] | Mayrand [95] | Di Lucca [96] | Lanubile [97]
Transformation of code | Feature vectors are constructed | To AST and then IRL | HTML tags and composite tags are extracted | eMetrics are used to identify function clones based on name similarity
Representation of code | Represented as AST | Represented as parse tree | Represented as AST in XML | Represented as AST in XML
Technique for comparison | Employed tree matching technique | Employed tree matching technique | Employed frequent item set concept | Graph theory is employed
Complexity | O(N), N denotes # of AST nodes | O(S1·S2), S1 denotes # of nodes of the first tree, S2 denotes # of nodes of the second tree | O(k·n²), n denotes statements part of clones, k denotes clones with maximal size | Not Available
Granularity in comparison | Granularity is at the AST node level | Tree node or token is the granularity | One line of content is the granularity | AST node is the granularity
Granularity in cloning | Threshold-based free tree similarity | Segment or gram or free | Threshold based or five statements or free | Threshold based or fixed
Similarity in cloning | Near miss or exact | Near miss and exact | Parameterized and exact | Near miss with gap and exact
Language independence | Lexer is required for parsing | Pretty-printer and parser are required | A parser is required | A parser is required
Types of Output | Clone pair is the output | Pretty-printer is used to present output | - | Clone information is presented in an HTML document
Refactoring of clone | Mechanical refactoring is supported | Refactoring is not supported | May be semi-automatic | Human intervention is required

Table 5 – Metrics based clone detection techniques
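A hedged sketch of the metrics-based idea behind Table 5 (the metric choices and threshold here are our own, not those of [62] or [95]): each function is summarized by a small vector of counts, and function pairs whose vectors are within a distance threshold are flagged as potential clones.

```python
# Toy metrics-based detection: each function is reduced to a vector of counts,
# and vector pairs within an L1 distance threshold are flagged as clones.
import ast

def metrics_vector(fn):
    nodes = list(ast.walk(fn))
    return (
        len(fn.body),                                                     # top-level statements
        sum(isinstance(n, (ast.If, ast.For, ast.While)) for n in nodes),  # branches and loops
        sum(isinstance(n, ast.Call) for n in nodes),                      # call sites
        len(fn.args.args),                                                # parameters
    )

def metric_clone_pairs(src, max_distance=0):
    fns = [n for n in ast.walk(ast.parse(src)) if isinstance(n, ast.FunctionDef)]
    pairs = []
    for i in range(len(fns)):
        for j in range(i + 1, len(fns)):
            vi, vj = metrics_vector(fns[i]), metrics_vector(fns[j])
            if sum(abs(x - y) for x, y in zip(vi, vj)) <= max_distance:
                pairs.append((fns[i].name, fns[j].name))
    return pairs

src = """
def f(a):
    if a:
        return g(a)

def h(b):
    if b:
        return k(b)
"""
```

Metric vectors are cheap to compute and compare, which is why the metrics-based tools in the table scale well, at the cost of false positives: different code can share the same metric fingerprint.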


CONCLUSIONS AND FUTURE WORK

Clone detection has been an active and ongoing research area, with plenty of methods already available in the literature for detecting clones and removing them. The relation between clones and the evolution life cycle of software systems has also been researched. The reason behind our research in this paper is that no single clone detection technique can be perfect in all aspects. Moreover, clone detection has become very important in the context of cloud computing.

Since software deployments are made in the cloud, clones can cause huge losses to legitimate parties that deploy software in the cloud in a pay-per-use fashion. We made a comprehensive survey of the clone detection research carried out so far. We explored whether clones are harmful, and the disadvantages and advantages of clones in software systems. The research revealed that clone detection and clone removal are still useful in contemporary software development. This research would be meaningful if extended further to consider clone issues in cloud computing and clone representations using modern knowledge representations such as ontologies.

REFERENCES

[1] Eytan Adar. GUESS: a language and interface for graph exploration. In Proceedings of the 2006 Conference on Human Factors in Computing Systems (CHI'06), pp. 791-800, Montréal, Québec, Canada, April 2006.

[2] Eytan Adar and Miryung Kim. SoftGUESS: Visualization and Exploration of Code Clones in Context. In Proceedings of the 29th International Conference on Software Engineering (ICSE'07), Tool Demo, pp. 762-766, Minneapolis, MN, USA, May 2007.

[3] Raihan Al-Ekram, Cory Kapser, Michael Godfrey. Cloning by Accident: An Empirical Study of Source Code Cloning Across Software Systems. International Symposium on Empirical Software Engineering (ISESE'05), pp. 376-385, Noosa Heads, Australia, November 2005.

[4] Giuliano Antoniol, Gerardo Casazza, Massimiliano Di Penta, Ettore Merlo. Modeling Clones Evolution through Time Series. In Proceedings of the 17th IEEE International Conference on Software Maintenance (ICSM'01), pp. 273-280, Florence, Italy, November 2001.

[5] G. Antoniol, U. Villano, E. Merlo, and M. D. Penta. Analyzing cloning evolution in the Linux kernel. Information and Software Technology, 44(13):755-765, 2002.

[6] Lerina Aversano, Luigi Cerulo, and Massimiliano Di Penta. How Clones are Maintained: An Empirical Study. In Proceedings of the 11th European Conference on Software Maintenance and Reengineering (CSMR'07), pp. 81-90, Amsterdam, the Netherlands, March 2007.

[7] Brenda S. Baker. A Program for Identifying Duplicated Code. In Proceedings of Computing Science and Statistics: 24th Symposium on the Interface, Vol. 24:49-57, March 1992.


[8] Brenda Baker. On Finding Duplication and Near-Duplication in Large Software Systems. In Proceedings of the Second Working Conference on Reverse Engineering (WCRE'95), pp. 86-95, Toronto, Ontario, Canada, July 1995.

[9] Magdalena Balazinska, Ettore Merlo, Michel Dagenais, Bruno Lague, Kostas Kontogiannis. Advanced Clone-analysis to Support Object-oriented System Refactoring. In Proceedings of the 7th Working Conference on Reverse Engineering (WCRE'00), pp. 98-107, Brisbane, Qld., Australia, November 2000.

[10] Magdalena Balazinska, Ettore Merlo, Michel Dagenais, Bruno Lague, Kostas Kontogiannis. Partial Redesign of Java Software Systems Based on Clone Analysis. In Proceedings of the 6th Working Conference on Reverse Engineering (WCRE'99), pp. 326-336, Atlanta, GA, USA, October 1999.

[11] Hamid Basit, Simon Pugliesi, William Smyth, Andrei Turpin, and Stan Jarzabek. Efficient Token Based Clone Detection with Flexible Tokenization. In Proceedings of the Joint Meeting of the European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE'07), pp. 513-515, Dubrovnik, Croatia, September 2007.

[12] Hamid Basit, Damith Rajapakse, Stan Jarzabek. An Investigation of Cloning in Web Applications. In Proceedings of the Special Interest Tracks and Posters of the 14th International Conference on World Wide Web (WWW'05), pp. 924-925, Chiba, Japan, May 2005.

[13] Ira Baxter, Andrew Yahin, Leonardo Moura, Marcelo Sant'Anna. Clone Detection Using Abstract Syntax Trees. In Proceedings of the 14th International Conference on Software Maintenance (ICSM'98), pp. 368-377, Bethesda, Maryland, November 1998.

[14] Stefan Bellon. Detection of Software Clones: Tool Comparison Experiment. Tool comparison experiment presented at the 1st IEEE International Workshop on Source Code Analysis and Manipulation, Montreal, Canada, October 2002.

[15] Stefan Bellon. Vergleich von Techniken zur Erkennung duplizierten Quellcodes. Diploma Thesis, No. 1998, University of Stuttgart (Germany), Institute for Software Technology, September 2002.

[16] Magiel Bruntink. Aspect Mining using Clone Class Metrics. In Proceedings of the 1st Workshop on Aspect Reverse Engineering, 2004.

[17] Magiel Bruntink, Arie van Deursen, Remco van Engelen, Tom Tourwe. On the Use of Clone Detection for Identifying Crosscutting Concern Code. Transactions on Software Engineering, Volume 31(10):804-818, October 2005.


[18] Elizabeth Burd and Malcolm Munro. Investigating the maintenance implications of the replication of code. In Proceedings of the 13th International Conference on Software Maintenance (ICSM'97), Bari, Italy, September 1997.

[19] Gerardo Casazza, Giuliano Antoniol, Umberto Villano, Ettore Merlo, Massimiliano Di Penta. Identifying Clones in the Linux Kernel. In Proceedings of the 1st IEEE International Workshop on Source Code Analysis and Manipulation (SCAM'01), pp. 90-97, Florence, Italy, November 2001.

[20] W-K. Chen, B. Li, and R. Gupta. Code Compaction of Matching Single-Entry Multiple-Exit Regions. In Proceedings of the 10th Annual International Static Analysis Symposium (SAS'03), pp. 401-417, San Diego, CA, USA, June 2003.

[21] A. Chou, J. Yang, B. Chelf, S. Hallem, and D. R. Engler. An empirical study of operating system errors. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP'01), pp. 73-88, Banff, Alberta, Canada, October 2001.

[22] K. W. Church and J. I. Helfman. Dotplot: A program for exploring self-similarity in millions of lines of text and code. Journal of the American Statistical Association, Institute for Mathematical Statistics and Interface Foundations of North America, 2(2):153-174, June 1993.

[23] J. R. Cordy. Comprehending reality: Practical challenges to software maintenance automation. In Proceedings of the 11th IEEE International Workshop on Program Comprehension (IWPC'03), pp. 196-206, Portland, Oregon, USA, May 2003.

[24] Neil Davey, Paul Barson, Simon Field, Ray J. Frank. The Development of a Software Clone Detector. International Journal of Applied Software Technology, Vol. 1(3/4):219-236, 1995.

[25] Saumya K. Debray, William Evans, Robert Muth, and Bjorn De Sutter. Compiler techniques for code compaction. ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 22(2):378-415, March 2000.

[26] S. Demeyer, S. Ducasse, and O. Nierstrasz. Object-Oriented Reengineering Patterns. Morgan Kaufmann, 2002.

[27] Ekwa Duala-Ekoko, Martin Robillard. Tracking Code Clones in Evolving Software. In Proceedings of the International Conference on Software Engineering (ICSE'07), pp. 158-167, Minneapolis, Minnesota, USA, May 2007.

[28] Stéphane Ducasse, Matthias Rieger, Serge Demeyer. A Language Independent Approach for Detecting Duplicated Code. In Proceedings of the 15th International Conference on Software Maintenance (ICSM'99), pp. 109-118, Oxford, England, September 1999.

[29] Richard Fanta, Václav Rajlich. Removing Clones from the Code. Journal of Software Maintenance:


Research and Practice, Volume 11(4):223-243, August 1999.

[30] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 2000.

[31] Simon Giesecke. Generic modelling of code clones. In Proceedings of Duplication, Redundancy, and Similarity in Software, ISSN 1682-4405, Dagstuhl, Germany, July 2006.

[32] Simon Giesecke. Clonebased Reengineering für Java auf der Eclipse-Plattform. Master's thesis, Carl von Ossietzky Universität Oldenburg, Germany, September 2003.

[33] Reto Geiger. Evolution Impact of Code Clones. Diploma Thesis, University of Zurich, October 2005.

[34] Reto Geiger, Beat Fluri, Harald C. Gall and Martin Pinzger. Relation of code clones and change couplings. In Proceedings of the 9th International Conference on Fundamental Approaches to Software Engineering (FASE'06), pp. 411-425, Vienna, Austria, March 2006.

[35] M. W. Godfrey, D. Svetinovic, and Q. Tu. Evolution, growth, and cloning in Linux: A case study. In CASCON workshop on Detecting duplicated and near duplicated structures in large software systems: Methods and applications, October 2000.

[36] Penny Grubb and Armstrong A. Takang. Software Maintenance Concepts and Practice. 2nd edn., World Scientific, 2003.

[37] Yoshiki Higo, Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue. Code Clone Analysis Methods for Efficient Software Maintenance. Graduate School of Information Science and Technology, Osaka University, 2006 (published as a report (PhD Thesis?) and paper version unpublished).

[38] Yoshiki Higo, Yasushi Ueda, Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue. On Software Maintenance Process Improvement Based on Code Clone Analysis. In Proceedings of the 4th International Conference on Product Focused Software Process Improvement (PROFES'02), pp. 185-197, Rovaniemi, Finland, November 2002.

[39] Yoshiki Higo, Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue. ARIES: Refactoring Support Environment based on Code Clone Analysis. In Proceedings of the 8th IASTED International Conference on Software Engineering and Applications, Cambridge, MA, USA, November 2004.

[40] Yoshiki Higo, Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue. Refactoring Support Based on Code Clone Analysis. In Proceedings of the 5th International Conference on Product Focused Software Process Improvement (PROFES'04), pp. 220-233, Kansai Science City, Japan, April 2004.

[41] Zhenming Jiang and Ahmed Hassan. A Framework for Studying Clones in Large Software Systems. In Proceedings of the Seventh IEEE International Working Conference on Source Code Analysis and Manipulation


(SCAM'07), Paris, France, October 2007.

[42] Zhen Ming Jiang, Ahmed E. Hassan, and Richard C. Holt. Visualizing Clone Cohesion and Coupling. In Proceedings of the 13th Asia Pacific Software Engineering Conference (APSEC'06), pp. 467-476, Bangalore, India, December 2006.

[43] Lingxiao Jiang, Ghassan Misherghi, Zhendong Su, and Stephane Glondu. DECKARD: Scalable and Accurate Tree-based Detection of Code Clones. In Proceedings of the 29th International Conference on Software Engineering (ICSE'07), pp. 96-105, Minnesota, USA, May 2007.

[44] J. Howard Johnson. Navigating the textual redundancy Web in legacy source. In Proceedings of the 1996 Conference of the Centre for Advanced Studies on Collaborative Research (CASCON'96), pp. 7-16, Toronto, Canada, October 1996.

[45] J. Howard Johnson. Identifying Redundancy in Source Code Using Fingerprints. In Proceedings of the 1993 Conference of the Centre for Advanced Studies (CASCON'93), pp. 171-183, Toronto, Canada, October 1993.

[46] J. Howard Johnson. Visualizing textual redundancy in legacy source. In Proceedings of the 1994 Conference of the Centre for Advanced Studies on Collaborative Research (CASCON'94), pp. 171-183, Toronto, Canada, 1994.

[47] John Johnson. Substring Matching for Clone Detection and Change Tracking. In Proceedings of the 10th International Conference on Software Maintenance, pp. 120-126, Victoria, British Columbia, Canada, September 1994.

[48] Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue. CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code. Transactions on Software Engineering, Vol. 28(7):654-670, July 2002.

[49] Cory J. Kapser and Michael W. Godfrey. Supporting the Analysis of Clones in Software Systems: A Case Study. Journal of Software Maintenance and Evolution: Research and Practice, Vol. 18(2):61-82, March 2006.

[50] Cory Kapser and Michael Godfrey. A Taxonomy of Clones in Source Code: The Re-Engineers Most Wanted List. In Proceedings of the 2nd International Workshop on Detection of Software Clones (IWDSC'03), 2 pp., Victoria, BC, November 2003.

[51] Cory Kapser and Michael Godfrey. Toward a taxonomy of clones in source code: A case study. In Proceedings of the Conference on Evolution of Large Scale Industrial Software Architectures (ELISA'03), pp. 67-78, Amsterdam, The Netherlands, September 2003.

[52] Cory Kapser and Michael Godfrey. Aiding Comprehension of Cloning Through Categorization. In Proceedings of the 7th International


Workshop on Principles of Software Evolution (IWPSE'04), pp. 85-94, Kyoto, Japan, September 2004.

[53] Cory Kapser, Michael Godfrey. Improved Tool Support for the Investigation of Duplication in Software. In Proceedings of the 21st International Conference on Software Maintenance (ICSM'05), pp. 305-314, Budapest, Hungary, September 2005.

[54] Cory Kapser and Michael W. Godfrey. "Clones considered harmful" considered harmful. In Proceedings of the 13th Working Conference on Reverse Engineering (WCRE'06), pp. 19-28, Benevento, Italy, October 2006.

[55] Cory Kapser and Michael W. Godfrey. "Cloning Considered Harmful" Considered Harmful: A case study of the positive and negative effects. Empirical Software Engineering (invited for publication), 2007.

[56] Miryung Kim, Lawrence Bergman, Tessa Lau, David Notkin. An Ethnographic Study of Copy and Paste Programming Practices in OOPL. In Proceedings of the 3rd International ACM-IEEE Symposium on Empirical Software Engineering (ISESE'04), pp. 83-92, Redondo Beach, CA, USA, August 2004.

[57] Miryung Kim, Gail Murphy. An Empirical Study of Code Clone Genealogies. In Proceedings of the 10th European Software Engineering Conference held jointly with the 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE'05), pp. 187-196, Lisbon, Portugal, September 2005.

[58] Miryung Kim, David Notkin. Using a Clone Genealogy Extractor for Understanding and Supporting Evolution of Code Clones. In Proceedings of the 2nd International Workshop on Mining Software Repositories (MSR'05), pp. 1-5, Saint Louis, Missouri, USA, May 2005.

[59] Raghavan Komondoor and Susan Horwitz. Tool demonstration: Finding duplicated code using program dependences. In Proceedings of the European Symposium on Programming (ESOP'01), Vol. LNCS 2028, pp. 383-386, Genova, Italy, April 2001.

[60] Raghavan Komondoor and Susan Horwitz. Using Slicing to Identify Duplication in Source Code. In Proceedings of the 8th International Symposium on Static Analysis (SAS'01), Vol. LNCS 2126, pp. 40-56, Paris, France, July 2001.

[61] Raghavan Komondoor and Susan Horwitz. Effective, Automatic Procedure Extraction. In Proceedings of the 11th IEEE International Workshop on Program Comprehension (IWPC'03), pp. 33-42, Portland, Oregon, USA, May 2003.

[62] K. Kontogiannis, R. DeMori, E. Merlo, M. Galler, and M. Bernstein. Pattern Matching for Clone and Concept Detection. Automated Software Engineering, Vol. 3(1-2):77-108, June 1996.

[63] Kostas Kontogiannis. Evaluation Experiments on the Detection of


Programming Patterns using Software Metrics. In Proceedings of the 3rd Working Conference on Reverse Engineering (WCRE'97), pp. 44-54, Amsterdam, The Netherlands, October 1997.

[64] R. Koschke, E. Merlo, A. Walenstein (Eds.). Dagstuhl Seminar Proceedings 06301. In Proceedings of Duplication, Redundancy, and Similarity in Software, ISSN 1682-4405, Dagstuhl, Germany, July 2006.

[65] Rainer Koschke, Raimar Falke and Pierre Frenzel. Clone Detection Using Abstract Syntax Suffix Trees. In Proceedings of the 13th Working Conference on Reverse Engineering (WCRE'06), pp. 253-262, Benevento, Italy, October 2006.

[66] Rainer Koschke. Survey of Research on Software Clones. In Proceedings of Dagstuhl Seminar 06301: Duplication, Redundancy, and Similarity in Software, 24 pp., Dagstuhl, Germany, July 2006.

[67] Jens Krinke. A Study of Consistent and Inconsistent Changes to Code Clones. In Proceedings of the 14th Working Conference on Reverse Engineering (WCRE'07), 9 pp., Vancouver, Canada, October 2007.

[68] Jens Krinke. Identifying Similar Code with Program Dependence Graphs. In Proceedings of the 8th Working Conference on Reverse Engineering (WCRE'01), pp. 301-309, Stuttgart, Germany, October 2001.

[69] Bruno Laguë, Daniel Proulx, Jean Mayrand, Ettore M. Merlo and John Hudepohl. Assessing the Benefits of Incorporating Function Clone Detection in a Development Process. In Proceedings of the 13th International Conference on Software Maintenance (ICSM'97), pp. 314-321, Bari, Italy, October 1997.

[70] M. Lanza and S. Ducasse. Polymetric views - a lightweight visual approach to reverse engineering. IEEE Transactions on Software Engineering, Vol. 29(9):782-795, September 2003.

[71] Zhenmin Li, Shan Lu, Suvda Myagmar, Yuanyuan Zhou. CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code. In Proceedings of the 6th Symposium on Operating System Design and Implementation (OSDI'04), pp. 289-302, San Francisco, CA, USA, December 2004.

[72] Zhenmin Li, Shan Lu, Suvda Myagmar, and Yuanyuan Zhou. CP-Miner: Finding Copy-Paste and Related Bugs in Large-Scale Software Code. IEEE Transactions on Software Engineering, Vol. 32(3):176-192, March 2006.

[73] Angela Lozano, Michel Wermelinger, and Bashar Nuseibeh. Evaluating the Harmfulness of Cloning: A Change Based Experiment. In Proceedings of the 4th International Workshop on Mining Software Repositories (MSR'07), 4 pp., Minneapolis, USA, May 2007.


[74] Zoltan Mann. Three Public Enemies: Cut, Copy, and Paste. IEEE Computer, Vol. 39(7):31-35, July 2006.

[75] Jean Mayrand, Claude Leblanc, Ettore Merlo. Experiment on the Automatic Detection of Function Clones in a Software System Using Metrics. In Proceedings of the 12th International Conference on Software Maintenance (ICSM'96), pp. 244-253, Monterey, CA, USA, November 1996.

[76] E. Merlo, M. Dagenais, P. Bachand, J. S. Sormani, S. Gradara, and G. Antoniol. Investigating large software system evolution: the Linux kernel. In Proceedings of the 26th International Computer Software and Applications Conference (COMPSAC'02), pp. 421-426, Oxford, England, August 2002.

[77] Gilad Mishne and Maarten de Rijke. Source Code Retrieval Using Conceptual Similarity. In Proceedings of the 2004 Conference on Computer Assisted Information Retrieval (RIAO'04), pp. 539-554, Avignon (Vaucluse), France, April 2004.

[78] Akito Monden, Daikai Nakae, Toshihiro Kamiya, Shin-ichi Sato, Ken-ichi Matsumoto. Software quality analysis by code clones in industrial legacy software. In Proceedings of the 8th IEEE International Symposium on Software Metrics (METRICS'02), pp. 87-94, Ottawa, Canada, June 2002.

[79] Matthias Rieger, Stephane Ducasse, Michele Lanza. Insights into System-Wide Code Duplication. In Proceedings of the 11th IEEE Working Conference on Reverse Engineering (WCRE'04), pp. 100-109, Delft University of Technology, the Netherlands, November 2004.

[80] Matthias Rieger. Effective Clone Detection Without Language Barriers. Ph.D. Thesis, University of Bern, Switzerland, June 2005.

[81] Filip Van Rysselberghe, Serge Demeyer. Evaluating Clone Detection Techniques. In Proceedings of the International Workshop on Evolution of Large Scale Industrial Applications (ELISA'03), 12 pp., Amsterdam, The Netherlands, September 2003.

[82] Nikita Synytskyy, James R. Cordy, Thomas Dean. Resolution of Static Clones in Dynamic Web Pages. In Proceedings of the 5th IEEE International Workshop on Web Site Evolution (WSE'03), pp. 49-58, Amsterdam, September 2003.

[83] Robert Tairas, Jeff Gray and Ira Baxter. Visualization of clone detection results. In Proceedings of the 2006 OOPSLA Workshop on Eclipse Technology eXchange, pp. 50-54, Portland, Oregon, October 2006.

[84] Yasushi Ueda, Yoshiki Higo, Toshihiro Kamiya, Shinji Kusumoto, and Katsuro Inoue. Gemini: Code clone analysis tool. In International Symposium on Empirical Software Engineering (ISESE'02), Vol. 2, pp. 31-32, Nara, Japan, October 2002.

[85] A. Walenstein, A. Lakhotia and R. Koschke. The Second International Workshop on Detection of Software


Clones (IWDSC'03). Workshop report. In ACM SIGSOFT Software Engineering Notes 29(2), pp. 1-5, March 2004.

[86] Andrew Walenstein and Arun Lakhotia. The Software Similarity Problem in Malware Analysis. In Proceedings of Dagstuhl Seminar 06301: Duplication, Redundancy, and Similarity in Software, 10 pp., Dagstuhl, Germany, July 2006.

[87] Michael Toomim, Andrew Begel and Susan L. Graham. Managing Duplicated Code with Linked Editing. In Proceedings of the IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC'04), pp. 173-180, Rome, Italy, September 2004.

[88] Eric Nickell and Ian Smith. Extreme programming and software clones. In Proceedings of the 2nd International Workshop on Detection of Software Clones (IWDSC'03), 2 pp., Victoria, BC, November 2003.

[89] Sven Amann, Stefanie Beyer, Katja Kevic, and Harald Gall. Software Mining Studies: Goals, Approaches, Artifacts, and Replicability. Springer, pp. 233-242, 2015.

[90] Nawel Bayar, Saber Darmoul, Sonia Hajri-Gabouj, Henri Pierreval. Using immune designed ontologies to monitor disruptions in manufacturing systems. Elsevier, pp. 24-34, 2015.

[91] Christina Feilmayr, Wolfram Wöß. An analysis of ontologies and their success factors for application to business. Elsevier, 101, pp. 1-4, 2016.

[92] Sajid Iqbal, Wasif Altaf, Muhammad Aslam, Waqar Mahmood, Muhammad Usman Ghani Khan. Application of intelligent agents in health-care: review. Springer Science, pp. 1-13, 2016.

[93] Marcello La Rosa, Marlon Dumas, Chathura C. Ekanayake, Luciano García-Bañuelos, Jan Recker, Arthur H. M. ter Hofstede. Detecting approximate clones in business process model repositories. Elsevier, pp. 1-8, 2015.

[94] Vladimir Stantchev, Lisardo Prieto-González, Gerrit Tamm. Cloud computing service for knowledge assessment and studies recommendation in crowdsourcing and collaborative learning environments based on social network analysis. Elsevier, pp. 23-33, 2015.

[95] Jean Mayrand, Claude Leblanc, Ettore Merlo. Experiment on the Automatic Detection of Function Clones in a Software System Using Metrics. In Proceedings of the 12th International Conference on Software Maintenance (ICSM'96), pp. 244-253, Monterey, CA, USA, November 1996.

[96] G. A. Di Lucca, M. Di Penta, and A. R. Fasolino. An approach to identify duplicated web pages. In Proceedings of the 26th International Computer Software and Applications Conference (COMPSAC'02), pp. 481-486, Oxford, England, August 2002.

[97] Filippo Lanubile and Teresa Mallardo. Finding Function Clones in Web Applications. In Proceedings of the 7th European

534 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

Conference on Software Maintenance, pp. 120-126, Victoria,


Maintenance and Reengineering British Columbia, Canada, September
(CSMR'03), pp. 379-386, 1994.
Benevento, Italy, March 2003.
[103] St¶ephane Ducasse, Oscar
[98] Wuu Yang. Identifying Nierstrasz, Matthias Rieger. On the
syntactic di®erences between two E®ectiveness of Clone Detection by
programs. In SoftwarePrac-tice and String Matching. International
Experience, 21(7):739755, July Journal on Software Maintenance
1991. and Evolution: Research and
Practice, Volume 18(1): 37-58,
[99] V. Wahler, D. Seipel, Jurgen January 2006.
Wol® von Gudenberg, and G.
Fischer. Clone detection in source [104] Andrian Marcus and Jonathan
code by frequent itemset techniques. I. Maletic. Identi¯cation of high-
In Proceedings of the 4th IEEE level concept clones in source
Inter-national Workshop Source code.In Proceedings of the 16th
Code Analysis and Manipulation IEEE International Conference on
(SCAM'04), pp. 128135, Chicago, Automated Software Engineering
IL, USA, September 2004. (ASE'01), pp. 107-114, San Diego,
CA, USA, November 2001.
[100] Williams Evans, and
Christopher Fraser. Clone Detection
via Structural Abstraction. In
Proceedings of the 14th Conference
on Reverse Engineering
(WCRE'07), Vancouver, BC,
Canada, October 2007(to appear,
available as Technical Report since
August 2005).

[101] Toshihiro Kamiya, Shinji


Kusumoto, Katsuro Inoue.
CCFinder: A Multilinguistic Token-
Based Code Clone Detection
System for Large Scale Source
Code. Transactions on Software
Engineering, Vol. 28(7): 654- 670,
July 2002.

[102] John Johnson. Substring Matching


for Clone Detection and Change
Tracking.In Proceedings of the 10th
International Conference on Software

535 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 15, No. 2, February 2017

Performance Comparison of Adaptive OFDM Pre- and Post-FFT Beamforming Systems

Waleed Abdallah, Yousef Abuzir, Mohamad Khdair
Faculty of Technology and Applied Sciences
Al-Quds Open University, Jerusalem, Palestine
[email protected], [email protected], [email protected]

Abstract: There is a great need to improve the performance and capacity of mobile communication systems to fulfill the ever-increasing demand for bandwidth. The main problems affecting mobile communications and limiting their capacity and performance are multipath fading, co-channel interference (CCI) and inter-symbol interference (ISI). To overcome such problems, Orthogonal Frequency Division Multiplexing (OFDM) and Adaptive Antenna Arrays (AAA) are used to increase the overall performance. In this paper, simulation is used to investigate the behavior of the pre-FFT and post-FFT beamformers for an OFDM system; two algorithms were considered: Least Mean Squares (LMS) and Minimum Bit Error Rate (MBER). The simulation results show that binary phase shift keying (BPSK) signaling based on the MBER technique utilizes the antenna array elements more efficiently than the LMS technique in both the pre-FFT and post-FFT cases; the best performance was obtained in the case of MBER with post-FFT.

Index Terms: MBER, OFDM, Pre-FFT, Post-FFT, MMSE, Beamforming, Smart Antenna.

I. INTRODUCTION

Orthogonal Frequency Division Multiplexing (OFDM) is an efficient technique for high-speed communications. It is used over severe multipath fading channels, especially where the delay spread is larger than the symbol duration [1-15]. OFDM offers high-bit-rate transmission because it is less affected by ISI resulting from multipath fading channels.

Also, the use of AAA is efficient to suppress CCI by forming a beam pattern that keeps the desired signal using spatial processing techniques. Adaptive beamforming can separate transmitted signals on the same carrier frequency, provided that they are separated in the spatial domain. The beamforming processing combines the signals received by the different elements of an antenna array to form a single output. In addition, antenna arrays can mitigate the effect of ISI and relax the design of the channel equalizer [1].

In OFDM systems, it is possible to apply AAA beamforming in the time domain (Pre-FFT) or in the frequency domain (Post-FFT). Pre-FFT processing is done in the time domain, which results in fewer computations because only one FFT operation is required; however, there is a slight performance degradation. On the other hand, Post-FFT provides better performance; however, more complex computations are required because spatial processing of individual subcarriers is performed by applying the FFT operation to the received signal of each antenna [1]. In [3], pre-FFT least mean square (LMS) beamforming for OFDM systems was analyzed in an additive Gaussian noise channel. An adaptive MBER beamformer was analyzed in [5] for single-carrier modulation and in [6] for OFDM systems in an additive Gaussian noise channel. A class of MBER algorithms was studied in [5] and combined with space-time coding in [7].

In earlier papers, a lot of work was done on Pre-FFT, as it is computationally efficient compared to Post-FFT. In some papers, Post-FFT was investigated [9], but it resulted in more computations due to the adaptation algorithms selected to calculate the weight vector. In [10] and [11] the MMSE and MBER beamformers for Pre-FFT and Post-FFT OFDM are studied, respectively, without investigating several factors affecting performance, such as the antenna array and the angle spread. Pre-FFT and Post-FFT methods were investigated with LMS in a selective fading channel in [8]. Since wireless standards such as IEEE 802.11 and 802.16 use pilot subcarriers in their structures, our focus in this paper will be on suppressing CCI and mitigating multipath interference in pilot-assisted OFDM systems. In this paper, we considered the following factors affecting the performance of both Pre-FFT and Post-FFT beamformers: the number of antennas, the power of the noise and interferences, and the presence of frequency-selective channels in addition to directional interferences and angle spread, all of which affect the array performance.

In addition, we analyzed the MBER algorithm in a practical channel model for both the Pre-FFT and Post-FFT methods and compared it with the LMS algorithm; our main focus was on Post-FFT using the MBER algorithm, which is less complex than LMS and thus results in a less complex overall system with good performance.

The remainder of this paper is organized as follows: Section II describes the system model and beamforming schemes. Section III describes the adaptive algorithms. Sections IV and V discuss the computational complexity and the convergence rate for the simulated system, respectively. In Section VI simulation results are provided. Finally, conclusions and possible directions for future work are presented in Section VII.


II. SYSTEM MODEL AND BEAMFORMING SCHEMES

A. OFDM System

An OFDM system is shown in Fig. 1, with an adaptive array at the receiver. In this figure, data bits at the transmitter are converted into a constellation map using BPSK modulation. This data is interpreted as frequency-domain data in an OFDM system and is subsequently converted to a time-domain signal by an IFFT operation [8]. The output of the IFFT is transmitted to the channel after the addition of a cyclic prefix (CP). This process can be written as

    y_m = (1/K) F^H x_m,   1 ≤ m ≤ M        (1)

where

    y_m = [y_m(1), y_m(2), ..., y_m(K)]^T        (2)

    F = [ 1    1                    ...   1
          1    e^(−j2π(1)(1)/K)     ...   e^(−j2π(1)(K−1)/K)
          ...
          1    e^(−j2π(K−1)(1)/K)   ...   e^(−j2π(K−1)(K−1)/K) ]        (3)

is the FFT operation matrix,

    x_m = [x_m(1), x_m(2), ..., x_m(K)]^T        (4)

and H denotes the Hermitian transpose of a matrix. In order to add the CP, y_m is cyclically extended to ỹ_m by inserting the last v elements of y_m at its beginning, i.e.

    ỹ_m = [ J_v ; I_K ] y_m        (5)

where J_v contains the last v rows of the size-K identity matrix I_K.

Fig. 1: The system model of the OFDM transmitter.

Finally, the OFDM time signal is transformed to analog form through a D/A converter before transmission over the wireless channel. A multipath channel model (frequency-selective fading) is assumed, including a maximum of L paths, and the channel between the m-th source (desired or interference) and the receiving antenna array is of the form

    h_m(k) = Σ_{l=0}^{L−1} α_{m,l} δ(k − l),   m = 1, ..., M        (6)

where α_{m,l} denotes a complex random number representing the l-th channel coefficient of the m-th source and δ(·) is the delta function.

The signal at the receiver of an OFDM system is described by equation (7), where the CP is assumed to be longer than the channel length (v > L); thus, the received signal on the p-th antenna of a Uniform Linear Array (ULA) for one OFDM symbol can be written as:

    r_p(k) = Σ_{m=1}^{M} Σ_{l=0}^{L−1} α_{m,l} ỹ_m(k + v − l) e^(−j2π(p−1)(d/λ)cos(θ_{m,l})) + η_p(k),
             1 ≤ p ≤ P,   1 ≤ k ≤ K        (7)

where η_p(k) represents the channel noise entering the p-th antenna and θ_{m,l} denotes the direction of arrival (DOA) of the l-th path of the m-th source. Without loss of generality, we have assumed that the channels of all sources have the same length L.

B. Pre-FFT Beamforming

MMSE beamforming is implemented by sending known pilot symbols and comparing them at the receiver with their known values to generate an error signal; this error signal is used to correct errors in the data bits received on the same channel. If there are a total of Q pilot symbols in every OFDM symbol, we define two K × 1 vectors d_q and Z_q such that the k-th element of d_q is zero if k is a data subcarrier and is the known pilot value if k is a pilot subcarrier [8]. Similarly, the k-th element of Z_q is zero if k is a data subcarrier and is the received pilot value if k is a pilot subcarrier. Therefore, the error signal in the frequency domain is given by:

    E_q = d_q − Z_q        (8)

This error signal must be converted to the time domain for the weight adjustment algorithm. Therefore,

    e = (1/K) F^H E_q        (9)

where e is the vector of error samples in the time domain:

    e = [e(1) e(2) ... e(K)]^T        (10)

Consequently, the Pre-FFT weights are updated as shown later in the Adaptive Algorithms section of this paper.
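As an illustration of Eqs. (1), (5) and (7), the transmitter chain and the multipath ULA reception can be sketched in NumPy as below. This is a simplified sketch, not the authors' simulation code: the half-wavelength element spacing, the SNR-based noise scaling and the circular treatment of path delays are illustrative assumptions.

```python
import numpy as np

def ofdm_transmit(x, v):
    """Map one frequency-domain OFDM symbol x (length K) to the time
    domain and prepend a cyclic prefix of length v (Eqs. (1) and (5))."""
    y = np.fft.ifft(x)  # y_m = (1/K) F^H x_m; numpy's ifft includes the 1/K factor
    return np.concatenate([y[-v:], y])  # cyclic prefix: last v samples first

def ula_receive(y_cp, alphas, thetas, P, d_over_lambda=0.5, snr_db=20, rng=None):
    """Received signal on a P-element ULA for one source (Eq. (7)): path l
    has complex gain alphas[l], DOA thetas[l] (radians) and delay l samples."""
    rng = rng or np.random.default_rng(0)
    r = np.zeros((P, len(y_cp)), dtype=complex)
    for l, (a, th) in enumerate(zip(alphas, thetas)):
        delayed = np.roll(y_cp, l)  # path delay (circular here; the CP absorbs it)
        steer = np.exp(-2j * np.pi * np.arange(P) * d_over_lambda * np.cos(th))
        r += a * np.outer(steer, delayed)
    sigma = 10 ** (-snr_db / 20) / np.sqrt(2)  # per-dimension noise std; unit signal power assumed
    r += sigma * (rng.standard_normal(r.shape) + 1j * rng.standard_normal(r.shape))
    return r
```

With K = 16 subcarriers and v = 4, ofdm_transmit returns 20 samples whose first four equal the last four (the CP), and ula_receive stacks one steered, delayed copy of that signal per path before adding noise.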


C. Post-FFT Beamforming

It should be noted that the number of subcarriers is 64 in post-FFT compared to 16 carriers in pre-FFT, thus resulting in more computations in post-FFT. As shown in Fig. 2 (block diagram of the Post-FFT beamforming), the received time-domain signal of each antenna is first converted to the frequency domain, and beamforming is then performed on each subcarrier [1],[8]. If R_{m,k} denotes the m-th subcarrier of the k-th antenna, then the (frequency-domain) output signal of the m-th subcarrier is given by:

    Y(m) = Σ_{k=1}^{K} w_{m,k} R_{m,k},   1 ≤ m ≤ N        (11)

In Eq. (11), w_{m,k} represents the weight associated with R_{m,k}. In Fig. 2 one weight is applied to every subcarrier, because we assume that all subcarriers are pilots. Since there exist only a few pilots in each OFDM block, every group of adjacent data subcarriers is clustered under one pilot symbol, and the weight of that pilot symbol is applied to all data subcarriers in the cluster [1],[8].

Fig. 2: Block diagram of frequency-domain (post-FFT) beamforming.

III. ADAPTIVE ALGORITHMS

The adaptive beamforming algorithms are used to update the weight vectors periodically to track the signal source in a time-varying environment by adaptively modifying the system's antenna pattern so that nulls are generated in the directions of the interference sources.

A. Least Mean Square (LMS) Algorithm

The LMS algorithm is a method of stochastically implementing the steepest-descent algorithm. Successive corrections to the weight vector in the direction of the negative of the gradient vector eventually lead to the Minimum Mean Square Error (MMSE), at which point the weight vector assumes its optimum value. The equations employed are:

    W(k) = W(k−1) + 2μ r(k) e*(k),   1 ≤ k ≤ K        (12)

where μ is the step-size parameter and * represents the complex conjugate. The last update at the end of each OFDM block, W(K), is used as the initial value of the next block. The mean square error increases as the step size increases and decreases as the step size decreases.

B. Minimum Bit Error Rate (MBER) Algorithm

The block diagram of the Pre-FFT beamforming is shown in Fig. 3; the estimate of the transmitted bit b_1(k) is given by

    b̂_1(k) = +1  if Re(ẑ(k)) ≥ 0
    b̂_1(k) = −1  if Re(ẑ(k)) < 0        (13)

where Re(ẑ(k)) denotes the real part of ẑ(k).

Fig. 3: Block diagram of the Pre-FFT OFDM adaptive receiver.

In this section, Pre-FFT adaptive beamforming based on the MBER criterion is introduced to obtain the optimum weight set. The theoretical MBER solution for the Pre-FFT OFDM beamformer is obtained in [4-7], where the channel is assumed to be non-dispersive with additive Gaussian noise. The error probability (BER cost function) of the frequency-domain signal of the beamformer is given by:

    P_E(W) = Prob{ sgn(b_1(k)) Re(ẑ(k)) < 0 }        (14)

where sgn(·) is the sign function. The weight vector that minimizes the BER is then defined as

    W = arg min_W P_E(W)        (15)

From equation (14), define the signed decision variable

    ẑ_s(k) = sgn(b_1(k)) Re(ẑ(k)) = sgn(b_1(k)) Re(z̄(k)) + η̄(k)        (16)

where

    z̄(k) = W^H [r(k) − η(k)] F(k)        (17)

and

    η̄(k) = sgn(b_1(k)) Re(W^H η(k) F(k))        (18)

ẑ_s(k) is the error indicator for the binary decision: when it is positive the decision is correct; otherwise an error occurred. F(k) is the k-th column of F. Notice that F is a unitary matrix, so η̄(k) is still Gaussian with zero mean and variance σ_n² W^H W.
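The LMS block update of Eq. (12) and the hard decision of Eq. (13) above can be sketched in NumPy as follows. This is a minimal sketch, not the authors' code; the time-domain error vector e of Eq. (9) is assumed to have been computed already.

```python
import numpy as np

def lms_prefft(r, e, w, mu=0.01):
    """One OFDM block of pre-FFT LMS updates (Eq. (12)).
    r: P x K matrix of received snapshots, e: length-K time-domain error
    vector (Eq. (9)), w: length-P weight vector. Returns the final W(K),
    which becomes the initial value for the next block."""
    P, K = r.shape
    for k in range(K):
        w = w + 2 * mu * r[:, k] * np.conj(e[k])
    return w

def bpsk_decision(z_hat):
    """Hard decision of Eq. (13): +1 if Re(z) >= 0, else -1."""
    return np.where(np.real(z_hat) >= 0, 1, -1)
```

The step size mu trades convergence speed against steady-state mean square error, matching the remark under Eq. (12) that a larger step size gives a larger mean square error.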


The conditional probability density function (pdf) of the error indicator ẑ_s(k), given the channel coefficients α_{m,l}, is a mixed sum of Gaussian distributions [5], i.e.,

    p_z(ẑ_s) = 1/(K √(2π σ_n² W^H W)) Σ_{k=1}^{K} exp( −(ẑ_s − sgn(b_1(k)) Re(z̄(k)))² / (2σ_n² W^H W) )        (19)

This is a good indicator of the beamformer's BER performance, because deriving a closed form for the average error probability is not an easy process. Therefore, we use the gradient of the conditional error probability to update the weight vector. The conditional error probability of the beamformer, given the channel coefficients α_{m,l}, is given by [4-7]:

    P_E(W) = (1/K) Σ_{k=1}^{K} (1/√(2π)) ∫_{q_k(W)}^{∞} exp(−u²/2) du = (1/K) Σ_{k=1}^{K} Q(q_k(W))        (20)

where

    u = (ẑ_s − sgn(b_1(k)) Re(z̄(k))) / (σ_n √(W^H W))        (21)

Q(·) is the Gaussian error function, and

    q_k(W) = sgn(b_1(k)) Re(z̄(k)) / (σ_n √(W^H W))        (22)

In an OFDM system, it is assumed that there are pilot signals in every symbol for channel estimation [1]-[3]. The pilot signals are also used to adaptively update the weight vector of the beamformer. The transmitted pilot signal vector of the desired user, x_{1p}, and the received pilot signal vector ẑ_p in the frequency domain can be written as follows:

    x_{1p} = [x_1(1), 0, ..., x_1(Δp+1), 0, ..., x_1((K_p−1)Δp+1), 0, ...]        (23)

    ẑ_p = [ẑ(1), 0, ..., ẑ(Δp+1), 0, ..., ẑ((K_p−1)Δp+1), 0, ...] = W^H R F_p

where

    F_p = [ F(1), 0, ..., F(Δp+1), 0, ..., F((K_p−1)Δp+1), 0, ... ]        (24)

is the K × K matrix whose (kΔp+1)-th columns, k = 0, ..., K_p−1, are the corresponding columns of the FFT matrix F and whose remaining columns are zero. R = [r(1) r(2) ... r(K)], and Δp and K_p represent the frequency spacing between consecutive pilot symbols and the number of pilot symbols inserted in an OFDM symbol, respectively. We assume that the first pilot symbol is positioned at the first sub-channel.

The method of approximating a conditional pdf, known as a kernel density or Parzen window-based estimate [5-7], is used to estimate the conditional error probability, given the channel coefficients α_{m,l}, in OFDM systems. Given a block of K_p training samples {r(k), b_1(k)}, a kernel density estimate of the conditional pdf at the pilot locations is given by

    p̂(ẑ) = 1/(K_p √(2π ρ_n² W^H W)) Σ_{k=0}^{K_p−1} exp( −(ẑ − ẑ(kΔp+1))² / (2ρ_n² W^H W) )        (25)

where the kernel width ρ_n is related to the noise standard deviation σ_n. From this estimated pdf, the estimated BER is given by:

    P̂_E(W) = (1/K_p) Σ_{k=0}^{K_p−1} Q(q̂_k(W))        (26)

where

    q̂_k(W) = sgn(b_1(kΔp+1)) Re(W^H R F_p(kΔp+1)) / (ρ_n √(W^H W))        (27)

and F_p(kΔp+1) is the (kΔp+1)-th column of F_p. From this estimated conditional pdf, given the channel coefficients α_{m,l}, the gradient of the estimated BER is given by [4-7]:

    ∇P̂_E(W) = −1/(2K_p √(2π) σ_n √(W^H W)) Σ_{k=0}^{K_p−1} exp( −Re(ẑ(kΔp+1))² / (2σ_n² W^H W) ) sgn(b_1(kΔp+1)) R F_p(kΔp+1)        (28)

Now a block-data adaptive MBER algorithm is obtained from the gradient of P̂_E(W). For each OFDM symbol, we can find the optimum weight vector W by the steepest-descent gradient algorithm [5], using the per-pilot gradient

    ∇P_E(W) = −1/(2√(2π) σ_n) exp( −(Re(ẑ(kΔp+1)))² / (2σ_n²) ) sgn(b_1(kΔp+1)) R F_p(kΔp+1),   1 ≤ k ≤ K_p        (29)

That is to say, the weight vector W can be updated K_p times in one OFDM symbol. Complexity is thus reduced and, consequently, the update equation is given by

    W(k+1) = W(k) − μ ∇P_E(W)
           = W(k) + (μ/(2√(2π) σ_n)) exp( −(Re(ẑ(kΔp+1)))² / (2σ_n²) ) sgn(b_1(kΔp+1)) R F_p(kΔp+1),   1 ≤ k ≤ K_p        (30)

where μ is the step size.
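The per-pilot MBER update of Eqs. (29)-(30), together with the weight normalization listed in the algorithm summary of Table I, can be sketched as follows. This is a single-step illustration only; the pilot spacing Δp and the values of σ_n and μ are placeholders.

```python
import numpy as np

def mber_update(W, R, Fp, b1, k, mu=0.01, sigma_n=0.1, dp=4):
    """One MBER steepest-descent step (Eqs. (29)-(30)) at the k-th pilot.
    W: length-P weight vector, R: P x K snapshot matrix, Fp: K x K pilot
    DFT matrix, b1: known pilot bits (+/-1), dp: pilot spacing (Delta p)."""
    col = k * dp                  # 0-based index of the (k*dp + 1)-th column
    f = Fp[:, col]
    z = np.vdot(W, R @ f)         # z_hat = W^H R F_p(k*dp + 1)
    grad = (-1.0 / (2 * np.sqrt(2 * np.pi) * sigma_n)
            * np.exp(-np.real(z) ** 2 / (2 * sigma_n ** 2))
            * np.sign(b1[col]) * (R @ f))   # gradient of Eq. (29)
    W = W - mu * grad                        # update of Eq. (30)
    return W / np.linalg.norm(W)             # normalization step from Table I
```

Repeating this step for k = 0, ..., K_p − 1 within one OFDM symbol reproduces the "K_p updates per symbol" behavior described above.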


Table I. MBER algorithm summary.

Initialization:
- i = 1, μ = 0.01, block size K = 64
- Calculate the variance of the noise, σ_n
- Initialize the weight vector W = 0.01 · ones(N, 1)

Outer loop (1 : floor(total bits / block size)):
- Form a block of data from the received signals.
- Inner loop (while k ≤ K_p):
  - Calculate the gradient matrix over the block from equation (29).
  - Update the weight matrix as W(k) = W(k−1) − μ(k) ∇P̂_E from equation (30).
  - Normalize the solution: W(k+1) = W(k+1) / ‖W(k+1)‖
- End of inner loop.
- Determine the detected signals in order to be used for calculating the BER.
- Increment the block number: i = i + 1
- End of outer loop.

IV. COMPUTATIONAL COMPLEXITY

In this section, we compare the two algorithms in terms of computational complexity [5]. Table II illustrates the computational complexity of the per-weight update to complete a single iteration, i.e., detecting one bit. The proposed MBER maintains linearity in complexity.

Table II: Comparison of computational complexity per weight update

    Algorithm | Multiplications | Additions | exp(·) evaluations
    MBER      | 4P + 4          | 4P + 1    | 1
    MMSE      | 8P + 2          | 8P + 1    | —

V. CONVERGENCE RATE

In this section, we run the MBER algorithm for 400 samples. The results are shown in Fig. 10 and Fig. 11, where we can see that the proposed algorithm converges very fast to the optimal solution. Fig. 12 illustrates the convergence performance of the MMSE-based LMS Pre-FFT beamformer. The figures resulting from the simulation show that MBER (for both Pre-FFT and Post-FFT) converges much faster than MMSE.

VI. SIMULATION RESULTS

In this section, simulations are conducted to evaluate the performance of the proposed adaptive beamforming for the LMS and MBER algorithms in a variety of channel conditions. We assumed a perfectly synchronized OFDM system, with a CP length larger than the channel length, 64 subcarriers (pilot + data), a BPSK modulation scheme, one desired source and two interferers with equal powers. The desired and interference sources were placed at 70°, 20°, and 120°, respectively. We further assumed normalized channels with different lengths and real coefficients of 0.864, 0.435, 0.253 and 0 for all sources, and an angle spread of ±15° [1,8]. Pilots were assumed to be distributed uniformly in the OFDM block, and the first subcarrier in every cluster was taken as a pilot.

Fig. 4: Comparison of the bit error performance when using 6 antenna elements and 3 users with SIR = 0 dB.

Fig. 4 shows the BER plots of the pre-FFT and post-FFT methods as a function of the input SNR with SIR = 0 dB. It is shown that the performance of the post-FFT scheme is better than that of the pre-FFT method. Also, it is observed that the BER of the post-FFT MBER beamformer is superior to that of LMS under moderate SNR.

Fig. 5: Comparison of the bit error performance when using 6 antenna elements and 3 users with SIR = −3 dB.

Fig. 5 shows a similar comparison to that shown in Fig. 4, but for SIR = −3 dB. Also in this case, the performance of the post-FFT is better than that of the pre-FFT. Also, it is observed that the


BER of the post-FFT MBER beamformer is superior to that of LMS under moderate SNR.

Fig. 6: Beampattern of the LMS and MBER beamformers (pre-FFT) using 6 antenna elements and 3 users with SIR = −3 dB.

Fig. 6 illustrates the beam patterns of the MBER and LMS beamformers for the Pre-FFT OFDM adaptive antenna array. It shows that the MBER pre-FFT beamformer has lower sidelobe levels.

Fig. 7: Beampattern of the LMS and MBER beamformers (post-FFT) using 6 antenna elements and 3 users with SIR = −3 dB.

In the post-FFT method, as shown in Fig. 7, the different paths of an interference source are weighted such that their combination is canceled, so a null is not necessarily required [5]. Other subcarriers also demonstrated similar behavior. When the antenna spacing is small, the different paths are mostly correlated. It is shown that MBER also has lower sidelobes compared to the LMS method.

Effect of Number of Antennas:

The use of more antennas provides better control for desired-source separation and interference rejection. In the presence of delayed paths, the pre-FFT scheme requires more antennas to put nulls at their angles. In an ideal situation, the number of antennas should be at least equal to the number of all paths of all sources. If this condition is not met, the performance of the pre-FFT will degrade [1,8]. Fig. 8 shows the effect of the number of antennas on the BER. Note that the MBER algorithm results in a lower BER.

Fig. 8: The effect of the number of antennas on the Post-FFT and Pre-FFT performance.

On the other hand, post-FFT remains more robust than the pre-FFT method when the number of antennas is reduced. Although the performance of both schemes degrades with fewer antennas, the curve of the pre-FFT shows a more rapid change than that of the post-FFT method. However, post-FFT performance for MBER slightly degrades with a large number of antennas, as shown in Fig. 8.

Effect of Angle Spread:

Fig. 9 shows that the pre-FFT scheme has better results with a wider angle spread, while post-FFT exhibits better performance with a narrower angle spread. This is shown for a system with 8 antennas and SNR = 10 dB; the performance of MBER was better than that of LMS [1,8].

Fig. 9: The effect of the angular separation between signals on the Pre-FFT and Post-FFT performance at 6 antennas.

Fig. 10A and Fig. 10B illustrate the convergence performance of the MBER Pre-FFT and Post-FFT beamformers for different


values of SNR. The post-FFT had better BER performance and faster convergence compared to the pre-FFT.

Fig. (10.A, 10.B): Convergence performance at different SNR.

Fig. 11: Convergence of the MBER beamforming to obtain the optimum weights on the Pre-FFT performance.

Fig. 11 shows that the optimum weights and the steady-state BER performance, respectively, require about 5 OFDM symbols under the condition of SNR = 10 dB and step size μ = 0.01.

Fig. 12: MSE plots of the two schemes; the post-FFT performance is better than the pre-FFT.

Fig. 12 shows the MSE curves of the pre-FFT and post-FFT with SNR = 25 dB. Interestingly, while the MSE curve of the post-FFT scheme is better than that of the pre-FFT scheme, their convergence rates are almost similar.

VII. CONCLUSION

In this paper, we studied the MBER beamformer for Pre-FFT and Post-FFT OFDM adaptive antenna arrays. A multipath (frequency-selective fading) channel model was considered, and the MBER and LMS algorithms were compared. The MBER beamformer has advantageous characteristics such as better BER performance, less computational complexity and shorter training symbols. We compared the Post-FFT and Pre-FFT performance; our results show that post-FFT gives better results in terms of BER, but it requires more computations, resulting in a more complex system. We considered different cases regarding the number of antennas, angle separation and channel paths.

Future work will cover a combined system of Post-FFT and Pre-FFT; the complexity and performance of such a system can be compared to existing schemes, and it is expected to use the advantages of both schemes to provide a better-performing system.

REFERENCES

[1] S. Seydnejad and S. Akhzari, "A combined time-frequency domain beamforming method for OFDM systems", 2010 International ITG Workshop on Smart Antennas, IEEE, 23-24 Feb. 2010.
[2] L. Fan, H. Zhang and C. He, "Minimum bit error beamforming for Pre-FFT OFDM adaptive antenna array", IEEE International Conference, 25-28 Sept. 2005.
[3] C. K. Kim, K. Lee, and Y. S. Cho, "Adaptive Beamforming Algorithm for OFDM Systems with Antenna Arrays", IEEE Trans. on Consumer Electronics, Vol. 64, No. 4, pp. 1052-1058, 2000.
[4] Waleed Abdallah, Mohamad Khdair and Mos'ab Ayyash, "Minimum Bit Error Rate Assisted QPSK for Pre-FFT Beamforming in LTE OFDM Communication Systems", International Jordanian Journal of Computers and Information Technology (JJCIT), Vol. 2, No. 3, December 2016.
[5] S. Chen, N. N. Ahmad, and L. Hanzo, "Adaptive minimum bit-error rate beamforming", IEEE Trans. on Wireless Commun., Vol. 4, No. 2, March 2005, pp. 341-348.


[6] Lingyan Fan, Haibin Zhang and Chen He, "Minimum bit error beamforming for Pre-FFT OFDM adaptive antenna array", IEEE International Conference, 25-28 Sept. 2005.
[7] Said Elnoubi, Waleed Abdallah, Mohamed M. M. Omar, "Minimum bit error rate beamforming combined with space-time block coding", The International Conf. on Communications and Information Tech. (ICCIT 2011), March 2011, Aqaba, Jordan.
[8] S. R. Seydnejad and S. Akhzari, "Performance Evaluation of Pre-FFT Beamforming Methods in Pilot-Assisted SIMO-OFDM Systems", Telecommunication Systems, Springer Science and Business Media, March 2015.
[9] S. Hara, M. Budsabathon, and Y. Hara, "A Pre-FFT OFDM adaptive antenna array with eigenvector combining", 2004 IEEE International Conference on Communications, Vol. 4, pp. 2412-2416, June 2004.
[10] M. S. Heakle, M. A. Mangoud, and S. Elnoubi, "LMS Beamforming Using Pre and Post-FFT Processing for OFDM Communication Systems", Proc. of the 24th National Radio & Science Conference (NRSC 2007), Mar. 2007.
[11] A. M. Mahros, I. Elzahaby, M. M. Tharwat, S. Elnoubi, "Beamforming processing for OFDM communication systems", Proc. of INCT 2012, Istanbul, Turkey, Oct. 2012.
[12] 3GPP, Release 8 V0.0.3, "Overview of 3GPP Release 8: Summary of all Release 8 Features", November 2008.
[13] M. Hsieh and C. We, "Channel Estimation for OFDM Systems based on Comb-Type Pilot Arrangement in Frequency Selective Fading Channels", IEEE Transactions on Wireless Communications, Vol. 2, No. 1, pp. 217-225, May 2009.
[14] 3GPP TS 36.211, Physical Channels and Modulation, 3GPP Technical Specification, Rev. 8.9.0, 2009.
[15] M. Morelli and U. Mengali, "A Comparison of Pilot-aided Channel Estimation Methods for OFDM Systems", IEEE Transactions on Signal Processing, Vol. 49, pp. 3065-3073, December 2001.


Designing of cloud storage using the Python language

Shipra Goel

Assistant Professor

Email: [email protected]

Abstract: -

Cloud computing is a metaphor for a network of connected elements rendering services over the
internet. The cloud can store big data as well. Data can be accessed anywhere at any time, and
data need never be lost. Python is an interpreted, object-oriented language. It is dynamic in nature
and widely accepted for rapid application development. Its syntax is very simple and its
readability is very high, so its maintenance cost is very low, and debugging Python code is very
easy. This research paper therefore uses Python to design cloud storage.

Keywords: -

Cloud computing, Clouds, Python

Introduction: -

Clouds provide unified object storage for organizations having large amounts of data, for data
analytics and for data archiving. Cloud storage is multi-regional in nature, with the highest
availability and high QPS (queries per second) for content. This research paper uses Python to
design cloud storage.

The step-by-step procedure is as follows:

Step 1: Install gsutil on your computer and initialize its configuration with the command: gsutil config -e

Step 2: Install the boto library and gcs-oauth2-boto-plugin

1. Install gcs-oauth2-boto-plugin, an auth plugin for the boto auth plugin framework. It provides OAuth 2.0 credentials that can be used with Cloud Storage.

How you set up the boto library and the oauth2 plugin depends on the system you are using; use the setup examples below as guidance. These commands install pip and then use pip to install the other packages. The last three commands test importing the two modules to verify the installation.

Debian and Ubuntu

wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo apt-get update

sudo apt-get upgrade


sudo apt-get install gcc python-dev python-setuptools libffi-dev libssl-dev
sudo pip install virtualenv
virtualenv venv
source ./venv/bin/activate
(venv)pip install gcs-oauth2-boto-plugin
(venv)python
>>>import boto
>>>import gcs_oauth2_boto_plugin

CentOS, RHEL, and Fedora

wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo yum install gcc openssl-devel python-devel python-setuptools libffi-devel
sudo pip install virtualenv
virtualenv venv
source ./venv/bin/activate
(venv)pip install gcs-oauth2-boto-plugin
(venv)python
>>>import boto
>>>import gcs_oauth2_boto_plugin

2. Set up your boto configuration file to use OAuth 2.0.

You can configure your boto configuration file to use service account or user account
credentials. Service account credentials are the preferred type of credential to use when
authenticating on behalf of a service or application. User account credentials are the
preferred type of credentials for authenticating requests on behalf of a specific user (i.e., a
human). For more information about these two credential types, see Supported Credential
Types.

Using service account credentials

1. Use an existing service account or create a new one, and download the associated
private key.

Creating a service account

1. Open the list of credentials in the Cloud Platform Console.

Open the list of credentials

2. Click Create credentials.


3. Select Service account key.


A Create service account key window opens.

4. Click the drop-down box below Service account, then click New service
account.
5. Enter a name for the service account in Name.
6. Use the default Service account ID or generate a different one.
7. Select the Key type: JSON or P12.
8. Click Create.

A Service account created window is displayed and the private key for
the Key type you selected is downloaded automatically. If you selected a
P12 key, the private key's password ("notasecret") is displayed.

9. Click Close.

You need the private key in PKCS12 (.p12) format. By default, when you create a new key, the JSON format of the private key is downloaded, so select P12 as the Key type if you plan to use gsutil config -e.

2. Configure the .boto file with the service account. You can do this with gsutil:

gsutil config -e

The command will prompt you for the service account email address and the
location of the service account private key (.p12). Be sure to have the private key
on the computer where you are running the gsutil command.
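After it runs, gsutil config -e writes the service-account entries into the .boto file. An illustrative fragment is shown below; the values are placeholders, and the exact key names may vary with your gsutil/boto version:

```ini
[Credentials]
gs_service_client_id = [email protected]
gs_service_key_file = /path/to/key.p12
gs_service_key_file_password = notasecret
```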

Using user account credentials

1. If you don't already have a .boto file, create one. You can do this with gsutil:

gsutil config

2. Use an existing client ID for an application or create a new one.

Creating a client ID for an application

1. Open the list of existing credentials.
2. Click New credentials and select OAuth client ID.
3. In the Create client ID window, set Application type to Other.
4. Click Create.

The credential's associated ID and secret are displayed. You can also view these later, after the key is created.

5. Click OK.

3. Edit the .boto file. In the [OAuth2] section, set the client_id and client_secret values to the ones you generated.
4. Run the gsutil config command again to generate a refresh token based on the client ID and secret you entered.

If you get an error message indicating that the .boto file cannot be backed up, remove or rename the backup configuration file .boto.bak.

5. Configure refresh-token fallback logic.

The gcs-oauth2-boto-plugin requires fallback logic for generating auth tokens when you are using application credentials. Fallback logic is not needed when you use a service account.

You have the following options for enabling fallback:

1. Set the client_id and the client_secret in the .boto config file. This is the recommended option, and it is required for using gsutil with your new .boto config file.
2. Set the environment variables OAUTH2_CLIENT_ID and OAUTH2_CLIENT_SECRET.
3. Use the SetFallbackClientIdAndSecret function as shown in the examples below.
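The environment-variable fallback (option 2) can be sketched as a minimal snippet; the values here are placeholders, not real credentials:

```python
import os

# Placeholder values for illustration; substitute the client ID and secret
# generated in the Cloud Platform Console.
os.environ['OAUTH2_CLIENT_ID'] = 'your-client-id'
os.environ['OAUTH2_CLIENT_SECRET'] = 'your-client-secret'

# gcs-oauth2-boto-plugin can now fall back to these variables when no
# client_id/client_secret pair is configured in the .boto file.
```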

Working with Cloud Storage

Setting up your Python source file

To start this tutorial, use your favorite text editor to create a new Python file. Then, add the
following directives, import statements, configuration, and constant assignments shown.

Note that in the code here, we use the SetFallbackClientIdAndSecret function as a fallback for
generating refresh tokens. See Using application credentials for other ways to specify a fallback.
If you are using a service account to authenticate, you do not need to include the fallback logic.

#!/usr/bin/python

import boto
import gcs_oauth2_boto_plugin
import os
import shutil
import StringIO
import tempfile
import time

# URI scheme for Cloud Storage.
GOOGLE_STORAGE = 'gs'
# URI scheme for accessing local files.
LOCAL_FILE = 'file'

# Fallback logic. In https://console.cloud.google.com/
# under Credentials, create a new client ID for an installed application.
# Required only if you have not configured client ID/secret in
# the .boto file or as environment variables.
CLIENT_ID = 'your client id'
CLIENT_SECRET = 'your client secret'
gcs_oauth2_boto_plugin.SetFallbackClientIdAndSecret(CLIENT_ID, CLIENT_SECRET)

Creating buckets

The following code creates two buckets. Because bucket names must be globally unique (see the naming guidelines), a timestamp is appended to each bucket name to help guarantee uniqueness.

If these bucket names are already in use, you'll need to modify the code to generate unique bucket names.

Note: Cloud Storage has kept the concept of a default project from earlier versions of the product. A default project exists for interoperability reasons. For more information, and to learn how to set a default project, see Setting a default project. The existence of a default project affects the way the code below is written.
now = time.time()
CATS_BUCKET = 'cats-%d' % now
DOGS_BUCKET = 'dogs-%d' % now

# Your project ID can be found at https://console.cloud.google.com/
# If there is no domain for your project, then project_id = 'YOUR_PROJECT'
project_id = 'YOUR_DOMAIN:YOUR_PROJECT'

for name in (CATS_BUCKET, DOGS_BUCKET):
    # Instantiate a BucketStorageUri object.
    uri = boto.storage_uri(name, GOOGLE_STORAGE)
    # Try to create the bucket.
    try:
        # If the default project is defined,
        # you do not need the headers.
        # Just call: uri.create_bucket()
        header_values = {"x-goog-project-id": project_id}
        uri.create_bucket(headers=header_values)
        print 'Successfully created bucket "%s"' % name
    except boto.exception.StorageCreateError, e:
        print 'Failed to create bucket:', e
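If the timestamp-based names above collide (for example, two runs in the same second), a small helper can make names more robust. This is an illustrative sketch, not part of boto; unique_bucket_name is a hypothetical helper:

```python
import time
import uuid

def unique_bucket_name(prefix):
    # Bucket names must be globally unique; a timestamp plus a random hex
    # suffix keeps even same-second runs distinct.
    return '%s-%d-%s' % (prefix, int(time.time()), uuid.uuid4().hex[:8])

name = unique_bucket_name('cats')
```

The result has the form cats-<timestamp>-<suffix>, which still satisfies the lowercase letters, digits, and dashes allowed in bucket names.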


Listing buckets

To retrieve a list of all buckets, call storage_uri() to instantiate a BucketStorageUri object, specifying the empty string as the URI. Then, call the get_all_buckets() instance method.

uri = boto.storage_uri('', GOOGLE_STORAGE)

# If the default project is defined, call get_all_buckets() without arguments.
for bucket in uri.get_all_buckets(headers=header_values):
    print bucket.name

Uploading objects

To upload objects, create a file object (opened for reading) that points to your local file and a storage URI object that points to the destination object on Cloud Storage. Then call the set_contents_from_file() instance method on a new key, passing the file handle as the argument.

# Make some temporary files.
temp_dir = tempfile.mkdtemp(prefix='googlestorage')
tempfiles = {
    'labrador.txt': 'Who wants to play fetch? Me!',
    'collie.txt': 'Timmy fell down the well!'}
for filename, contents in tempfiles.iteritems():
    with open(os.path.join(temp_dir, filename), 'w') as fh:
        fh.write(contents)

# Upload these files to DOGS_BUCKET.
for filename in tempfiles:
    with open(os.path.join(temp_dir, filename), 'r') as localfile:
        dst_uri = boto.storage_uri(
            DOGS_BUCKET + '/' + filename, GOOGLE_STORAGE)
        # The key-related functions are a consequence of boto's
        # interoperability with Amazon S3 (which employs the
        # concept of a key mapping to localfile).
        dst_uri.new_key().set_contents_from_file(localfile)
        print 'Successfully created "%s/%s"' % (
            dst_uri.bucket_name, dst_uri.object_name)

shutil.rmtree(temp_dir)  # Don't forget to clean up!

Listing objects

To list all objects in a bucket, call storage_uri() and specify the bucket's URI and the Cloud
Storage URI scheme as the arguments. Then, retrieve a list of objects using the get_bucket()
instance method.


uri = boto.storage_uri(DOGS_BUCKET, GOOGLE_STORAGE)

for obj in uri.get_bucket():
    print '%s://%s/%s' % (uri.scheme, uri.bucket_name, obj.name)
    print '  "%s"' % obj.get_contents_as_string()

Downloading and copying objects

The following code reads objects in DOGS_BUCKET and copies them to both your home
directory and CATS_BUCKET. It also demonstrates that you can use the boto library to operate
against both local files and Cloud Storage objects using the same interface.

dest_dir = os.getenv('HOME')
for filename in ('collie.txt', 'labrador.txt'):
    src_uri = boto.storage_uri(
        DOGS_BUCKET + '/' + filename, GOOGLE_STORAGE)

    # Create a file-like object for holding the object contents.
    object_contents = StringIO.StringIO()

    # The unintuitively-named get_file() doesn't return the object
    # contents; instead, it actually writes the contents to
    # object_contents.
    src_uri.get_key().get_file(object_contents)

    local_dst_uri = boto.storage_uri(
        os.path.join(dest_dir, filename), LOCAL_FILE)

    bucket_dst_uri = boto.storage_uri(
        CATS_BUCKET + '/' + filename, GOOGLE_STORAGE)

    for dst_uri in (local_dst_uri, bucket_dst_uri):
        object_contents.seek(0)
        dst_uri.new_key().set_contents_from_file(object_contents)

    object_contents.close()

Changing object ACLs

The following code grants the specified Google account FULL_CONTROL permissions for
labrador.txt. Remember to replace valid-email-address with a valid Google account email
address.

uri = boto.storage_uri(DOGS_BUCKET + '/labrador.txt', GOOGLE_STORAGE)

print str(uri.get_acl())
uri.add_email_grant('FULL_CONTROL', 'valid-email-address')
print str(uri.get_acl())


Reading bucket and object metadata

This code retrieves and prints the metadata associated with a bucket and an object.

# Print ACL entries for DOGS_BUCKET.
bucket_uri = boto.storage_uri(DOGS_BUCKET, GOOGLE_STORAGE)
for entry in bucket_uri.get_bucket().get_acl().entries.entry_list:
    entry_id = entry.scope.id
    if not entry_id:
        entry_id = entry.scope.email_address
    print 'SCOPE: %s' % entry_id
    print 'PERMISSION: %s\n' % entry.permission

# Print object metadata and ACL entries.
object_uri = boto.storage_uri(DOGS_BUCKET + '/labrador.txt', GOOGLE_STORAGE)
key = object_uri.get_key()
print ' Object size:\t%s' % key.size
print ' Last mod:\t%s' % key.last_modified
print ' MIME type:\t%s' % key.content_type
print ' MD5:\t%s' % key.etag.strip('"\'')  # Remove surrounding quotes
for entry in key.get_acl().entries.entry_list:
    entry_id = entry.scope.id
    if not entry_id:
        entry_id = entry.scope.email_address
    print 'SCOPE: %s' % entry_id
    print 'PERMISSION: %s\n' % entry.permission

Deleting objects and buckets

To conclude this tutorial, this code deletes the objects and buckets that you have created. A
bucket must be empty before it can be deleted, so its objects are first deleted.

for bucket in (CATS_BUCKET, DOGS_BUCKET):
    uri = boto.storage_uri(bucket, GOOGLE_STORAGE)
    for obj in uri.get_bucket():
        print 'Deleting object: %s...' % obj.name
        obj.delete()
    print 'Deleting bucket: %s...' % uri.bucket_name
    uri.delete_bucket()

(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 15 No. 2, February 2017

IJCSIS REVIEWERS’ LIST


Assist Prof (Dr.) M. Emre Celebi, Louisiana State University in Shreveport, USA
Dr. Lam Hong Lee, Universiti Tunku Abdul Rahman, Malaysia
Dr. Shimon K. Modi, Director of Research BSPA Labs, Purdue University, USA
Dr. Jianguo Ding, Norwegian University of Science and Technology (NTNU), Norway
Assoc. Prof. N. Jaisankar, VIT University, Vellore,Tamilnadu, India
Dr. Amogh Kavimandan, The Mathworks Inc., USA
Dr. Ramasamy Mariappan, Vinayaka Missions University, India
Dr. Yong Li, School of Electronic and Information Engineering, Beijing Jiaotong University, P.R. China
Assist. Prof. Sugam Sharma, NIET, India / Iowa State University, USA
Dr. Jorge A. Ruiz-Vanoye, Universidad Autónoma del Estado de Morelos, Mexico
Dr. Neeraj Kumar, SMVD University, Katra (J&K), India
Dr Genge Bela, "Petru Maior" University of Targu Mures, Romania
Dr. Junjie Peng, Shanghai University, P. R. China
Dr. Ilhem LENGLIZ, HANA Group - CRISTAL Laboratory, Tunisia
Prof. Dr. Durgesh Kumar Mishra, Acropolis Institute of Technology and Research, Indore, MP, India
Dr. Jorge L. Hernández-Ardieta, University Carlos III of Madrid, Spain
Prof. Dr.C.Suresh Gnana Dhas, Anna University, India
Dr Li Fang, Nanyang Technological University, Singapore
Prof. Pijush Biswas, RCC Institute of Information Technology, India
Dr. Siddhivinayak Kulkarni, University of Ballarat, Ballarat, Victoria, Australia
Dr. A. Arul Lawrence, Royal College of Engineering & Technology, India
Dr. Wongyos Keardsri, Chulalongkorn University, Bangkok, Thailand
Dr. Somesh Kumar Dewangan, CSVTU Bhilai (C.G.)/ Dimat Raipur, India
Dr. Hayder N. Jasem, University Putra Malaysia, Malaysia
Dr. A.V.Senthil Kumar, C. M. S. College of Science and Commerce, India
Dr. R. S. Karthik, C. M. S. College of Science and Commerce, India
Dr. P. Vasant, University Technology Petronas, Malaysia
Dr. Wong Kok Seng, Soongsil University, Seoul, South Korea
Dr. Praveen Ranjan Srivastava, BITS PILANI, India
Dr. Kong Sang Kelvin, Leong, The Hong Kong Polytechnic University, Hong Kong
Dr. Mohd Nazri Ismail, Universiti Kuala Lumpur, Malaysia
Dr. Rami J. Matarneh, Al-isra Private University, Amman, Jordan
Dr Ojesanmi Olusegun Ayodeji, Ajayi Crowther University, Oyo, Nigeria
Dr. Riktesh Srivastava, Skyline University, UAE
Dr. Oras F. Baker, UCSI University - Kuala Lumpur, Malaysia
Dr. Ahmed S. Ghiduk, Faculty of Science, Beni-Suef University, Egypt
and Department of Computer science, Taif University, Saudi Arabia
Dr. Tirthankar Gayen, IIT Kharagpur, India
Dr. Huei-Ru Tseng, National Chiao Tung University, Taiwan
Prof. Ning Xu, Wuhan University of Technology, China
Dr Mohammed Salem Binwahlan, Hadhramout University of Science and Technology, Yemen
& Universiti Teknologi Malaysia, Malaysia.
Dr. Aruna Ranganath, Bhoj Reddy Engineering College for Women, India
Dr. Hafeezullah Amin, Institute of Information Technology, KUST, Kohat, Pakistan

Prof. Syed S. Rizvi, University of Bridgeport, USA


Dr. Shahbaz Pervez Chattha, University of Engineering and Technology Taxila, Pakistan
Dr. Shishir Kumar, Jaypee University of Information Technology, Wakanaghat (HP), India
Dr. Shahid Mumtaz, Portugal Telecommunication, Instituto de Telecomunicações (IT) , Aveiro, Portugal
Dr. Rajesh K Shukla, Corporate Institute of Science & Technology Bhopal M P
Dr. Poonam Garg, Institute of Management Technology, India
Dr. S. Mehta, Inha University, Korea
Dr. Dilip Kumar S.M, Bangalore University, Bangalore
Prof. Malik Sikander Hayat Khiyal, Fatima Jinnah Women University, Rawalpindi, Pakistan
Dr. Virendra Gomase , Department of Bioinformatics, Padmashree Dr. D.Y. Patil University
Dr. Irraivan Elamvazuthi, University Technology PETRONAS, Malaysia
Dr. Saqib Saeed, University of Siegen, Germany
Dr. Pavan Kumar Gorakavi, IPMA-USA [YC]
Dr. Ahmed Nabih Zaki Rashed, Menoufia University, Egypt
Prof. Shishir K. Shandilya, Rukmani Devi Institute of Science & Technology, India
Dr. J. Komala Lakshmi, SNR Sons College, Computer Science, India
Dr. Muhammad Sohail, KUST, Pakistan
Dr. Manjaiah D.H, Mangalore University, India
Dr. S Santhosh Baboo, D.G.Vaishnav College, Chennai, India
Prof. Dr. Mokhtar Beldjehem, Sainte-Anne University, Halifax, NS, Canada
Dr. Deepak Laxmi Narasimha, University of Malaya, Malaysia
Prof. Dr. Arunkumar Thangavelu, Vellore Institute Of Technology, India
Dr. M. Azath, Anna University, India
Dr. Md. Rabiul Islam, Rajshahi University of Engineering & Technology (RUET), Bangladesh
Dr. Aos Alaa Zaidan Ansaef, Multimedia University, Malaysia
Dr Suresh Jain, Devi Ahilya University, Indore (MP) India,
Dr. Mohammed M. Kadhum, Universiti Utara Malaysia
Dr. Hanumanthappa. J. University of Mysore, India
Dr. Syed Ishtiaque Ahmed, Bangladesh University of Engineering and Technology (BUET)
Dr Akinola Solomon Olalekan, University of Ibadan, Ibadan, Nigeria
Dr. Santosh K. Pandey, The Institute of Chartered Accountants of India
Dr. P. Vasant, Power Control Optimization, Malaysia
Dr. Petr Ivankov, Automatika - S, Russian Federation
Dr. Utkarsh Seetha, Data Infosys Limited, India
Mrs. Priti Maheshwary, Maulana Azad National Institute of Technology, Bhopal
Dr. (Mrs) Padmavathi Ganapathi, Avinashilingam University for Women, Coimbatore
Assist. Prof. A. Neela madheswari, Anna university, India
Prof. Ganesan Ramachandra Rao, PSG College of Arts and Science, India
Mr. Kamanashis Biswas, Daffodil International University, Bangladesh
Dr. Atul Gonsai, Saurashtra University, Gujarat, India
Mr. Angkoon Phinyomark, Prince of Songkla University, Thailand
Mrs. G. Nalini Priya, Anna University, Chennai
Dr. P. Subashini, Avinashilingam University for Women, India
Assoc. Prof. Vijay Kumar Chakka, Dhirubhai Ambani IICT, Gandhinagar ,Gujarat
Mr Jitendra Agrawal, Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal
Mr. Vishal Goyal, Department of Computer Science, Punjabi University, India
Dr. R. Baskaran, Department of Computer Science and Engineering, Anna University, Chennai

Assist. Prof, Kanwalvir Singh Dhindsa, B.B.S.B.Engg.College, Fatehgarh Sahib (Punjab), India
Dr. Jamal Ahmad Dargham, School of Engineering and Information Technology, Universiti Malaysia Sabah
Mr. Nitin Bhatia, DAV College, India
Dr. Dhavachelvan Ponnurangam, Pondicherry Central University, India
Dr. Mohd Faizal Abdollah, University of Technical Malaysia, Malaysia
Assist. Prof. Sonal Chawla, Panjab University, India
Dr. Abdul Wahid, AKG Engg. College, Ghaziabad, India
Mr. Arash Habibi Lashkari, University of Malaya (UM), Malaysia
Mr. Md. Rajibul Islam, Ibnu Sina Institute, University Technology Malaysia
Professor Dr. Sabu M. Thampi, .B.S Institute of Technology for Women, Kerala University, India
Mr. Noor Muhammed Nayeem, Université Lumière Lyon 2, 69007 Lyon, France
Dr. Himanshu Aggarwal, Department of Computer Engineering, Punjabi University, India
Prof R. Naidoo, Dept of Mathematics/Center for Advanced Computer Modelling, Durban University of Technology,
Durban,South Africa
Prof. Mydhili K Nair, Visweswaraiah Technological University, Bangalore, India
M. Prabu, Adhiyamaan College of Engineering/Anna University, India
Mr. Swakkhar Shatabda, United International University, Bangladesh
Dr. Abdur Rashid Khan, ICIT, Gomal University, Dera Ismail Khan, Pakistan
Mr. H. Abdul Shabeer, I-Nautix Technologies,Chennai, India
Dr. M. Aramudhan, Perunthalaivar Kamarajar Institute of Engineering and Technology, India
Dr. M. P. Thapliyal, Department of Computer Science, HNB Garhwal University (Central University), India
Dr. Shahaboddin Shamshirband, Islamic Azad University, Iran
Mr. Zeashan Hameed Khan, Université de Grenoble, France
Prof. Anil K Ahlawat, Ajay Kumar Garg Engineering College, Ghaziabad, UP Technical University, Lucknow
Mr. Longe Olumide Babatope, University Of Ibadan, Nigeria
Associate Prof. Raman Maini, University College of Engineering, Punjabi University, India
Dr. Maslin Masrom, University Technology Malaysia, Malaysia
Sudipta Chattopadhyay, Jadavpur University, Kolkata, India
Dr. Dang Tuan NGUYEN, University of Information Technology, Vietnam National University - Ho Chi Minh City
Dr. Mary Lourde R., BITS-PILANI Dubai , UAE
Dr. Abdul Aziz, University of Central Punjab, Pakistan
Mr. Karan Singh, Gautam Budtha University, India
Mr. Avinash Pokhriyal, Uttar Pradesh Technical University, Lucknow, India
Associate Prof Dr Zuraini Ismail, University Technology Malaysia, Malaysia
Assistant Prof. Yasser M. Alginahi, Taibah University, Madinah Munawwarrah, KSA
Mr. Dakshina Ranjan Kisku, West Bengal University of Technology, India
Mr. Raman Kumar, Dr B R Ambedkar National Institute of Technology, Jalandhar, Punjab, India
Associate Prof. Samir B. Patel, Institute of Technology, Nirma University, India
Dr. M.Munir Ahamed Rabbani, B. S. Abdur Rahman University, India
Asst. Prof. Koushik Majumder, West Bengal University of Technology, India
Dr. Alex Pappachen James, Queensland Micro-nanotechnology center, Griffith University, Australia
Assistant Prof. S. Hariharan, B.S. Abdur Rahman University, India
Asst Prof. Jasmine. K. S, R.V.College of Engineering, India
Mr Naushad Ali Mamode Khan, Ministry of Education and Human Resources, Mauritius
Prof. Mahesh Goyani, G H Patel Collge of Engg. & Tech, V.V.N, Anand, Gujarat, India
Dr. Mana Mohammed, University of Tlemcen, Algeria
Prof. Jatinder Singh, Universal Institutiion of Engg. & Tech. CHD, India

Mrs. M. Anandhavalli Gauthaman, Sikkim Manipal Institute of Technology, Majitar, East Sikkim
Dr. Bin Guo, Institute Telecom SudParis, France
Mrs. Maleika Mehr Nigar Mohamed Heenaye-Mamode Khan, University of Mauritius
Prof. Pijush Biswas, RCC Institute of Information Technology, India
Mr. V. Bala Dhandayuthapani, Mekelle University, Ethiopia
Dr. Irfan Syamsuddin, State Polytechnic of Ujung Pandang, Indonesia
Mr. Kavi Kumar Khedo, University of Mauritius, Mauritius
Mr. Ravi Chandiran, Zagro Singapore Pte Ltd. Singapore
Mr. Milindkumar V. Sarode, Jawaharlal Darda Institute of Engineering and Technology, India
Dr. Shamimul Qamar, KSJ Institute of Engineering & Technology, India
Dr. C. Arun, Anna University, India
Assist. Prof. M.N.Birje, Basaveshwar Engineering College, India
Prof. Hamid Reza Naji, Department of Computer Enigneering, Shahid Beheshti University, Tehran, Iran
Assist. Prof. Debasis Giri, Department of Computer Science and Engineering, Haldia Institute of Technology
Subhabrata Barman, Haldia Institute of Technology, West Bengal
Mr. M. I. Lali, COMSATS Institute of Information Technology, Islamabad, Pakistan
Dr. Feroz Khan, Central Institute of Medicinal and Aromatic Plants, Lucknow, India
Mr. R. Nagendran, Institute of Technology, Coimbatore, Tamilnadu, India
Mr. Amnach Khawne, King Mongkut’s Institute of Technology Ladkrabang, Ladkrabang, Bangkok, Thailand
Dr. P. Chakrabarti, Sir Padampat Singhania University, Udaipur, India
Mr. Nafiz Imtiaz Bin Hamid, Islamic University of Technology (IUT), Bangladesh.
Shahab-A. Shamshirband, Islamic Azad University, Chalous, Iran
Prof. B. Priestly Shan, Anna Univeristy, Tamilnadu, India
Venkatramreddy Velma, Dept. of Bioinformatics, University of Mississippi Medical Center, Jackson MS USA
Akshi Kumar, Dept. of Computer Engineering, Delhi Technological University, India
Dr. Umesh Kumar Singh, Vikram University, Ujjain, India
Mr. Serguei A. Mokhov, Concordia University, Canada
Mr. Lai Khin Wee, Universiti Teknologi Malaysia, Malaysia
Dr. Awadhesh Kumar Sharma, Madan Mohan Malviya Engineering College, India
Mr. Syed R. Rizvi, Analytical Services & Materials, Inc., USA
Dr. S. Karthik, SNS Collegeof Technology, India
Mr. Syed Qasim Bukhari, CIMET (Universidad de Granada), Spain
Mr. A.D.Potgantwar, Pune University, India
Dr. Himanshu Aggarwal, Punjabi University, India
Mr. Rajesh Ramachandran, Naipunya Institute of Management and Information Technology, India
Dr. K.L. Shunmuganathan, R.M.K Engg College , Kavaraipettai ,Chennai
Dr. Prasant Kumar Pattnaik, KIST, India.
Dr. Ch. Aswani Kumar, VIT University, India
Mr. Ijaz Ali Shoukat, King Saud University, Riyadh KSA
Mr. Arun Kumar, Sir Padam Pat Singhania University, Udaipur, Rajasthan
Mr. Muhammad Imran Khan, Universiti Teknologi PETRONAS, Malaysia
Dr. Natarajan Meghanathan, Jackson State University, Jackson, MS, USA
Mr. Mohd Zaki Bin Mas'ud, Universiti Teknikal Malaysia Melaka (UTeM), Malaysia
Prof. Dr. R. Geetharamani, Dept. of Computer Science and Eng., Rajalakshmi Engineering College, India
Dr. Smita Rajpal, Institute of Technology and Management, Gurgaon, India
Dr. S. Abdul Khader Jilani, University of Tabuk, Tabuk, Saudi Arabia
Mr. Syed Jamal Haider Zaidi, Bahria University, Pakistan

Dr. N. Devarajan, Government College of Technology,Coimbatore, Tamilnadu, INDIA


Mr. R. Jagadeesh Kannan, RMK Engineering College, India
Mr. Deo Prakash, Shri Mata Vaishno Devi University, India
Mr. Mohammad Abu Naser, Dept. of EEE, IUT, Gazipur, Bangladesh
Assist. Prof. Prasun Ghosal, Bengal Engineering and Science University, India
Mr. Md. Golam Kaosar, School of Engineering and Science, Victoria University, Melbourne City, Australia
Mr. R. Mahammad Shafi, Madanapalle Institute of Technology & Science, India
Dr. F.Sagayaraj Francis, Pondicherry Engineering College,India
Dr. Ajay Goel, HIET , Kaithal, India
Mr. Nayak Sunil Kashibarao, Bahirji Smarak Mahavidyalaya, India
Mr. Suhas J Manangi, Microsoft India
Dr. Kalyankar N. V., Yeshwant Mahavidyalaya, Nanded , India
Dr. K.D. Verma, S.V. College of Post graduate studies & Research, India
Dr. Amjad Rehman, University Technology Malaysia, Malaysia
Mr. Rachit Garg, L K College, Jalandhar, Punjab
Mr. J. William, M.A.M college of Engineering, Trichy, Tamilnadu,India
Prof. Jue-Sam Chou, Nanhua University, College of Science and Technology, Taiwan
Dr. Thorat S.B., Institute of Technology and Management, India
Mr. Ajay Prasad, Sir Padampat Singhania University, Udaipur, India
Dr. Kamaljit I. Lakhtaria, Atmiya Institute of Technology & Science, India
Mr. Syed Rafiul Hussain, Ahsanullah University of Science and Technology, Bangladesh
Mrs Fazeela Tunnisa, Najran University, Kingdom of Saudi Arabia
Mrs Kavita Taneja, Maharishi Markandeshwar University, Haryana, India
Mr. Maniyar Shiraz Ahmed, Najran University, Najran, KSA
Mr. Anand Kumar, AMC Engineering College, Bangalore
Dr. Rakesh Chandra Gangwar, Beant College of Engg. & Tech., Gurdaspur (Punjab) India
Dr. V V Rama Prasad, Sree Vidyanikethan Engineering College, India
Assist. Prof. Neetesh Kumar Gupta, Technocrats Institute of Technology, Bhopal (M.P.), India
Mr. Ashish Seth, Uttar Pradesh Technical University, Lucknow ,UP India
Dr. V V S S S Balaram, Sreenidhi Institute of Science and Technology, India
Mr Rahul Bhatia, Lingaya's Institute of Management and Technology, India
Prof. Niranjan Reddy. P, KITS , Warangal, India
Prof. Rakesh. Lingappa, Vijetha Institute of Technology, Bangalore, India
Dr. Mohammed Ali Hussain, Nimra College of Engineering & Technology, Vijayawada, A.P., India
Dr. A.Srinivasan, MNM Jain Engineering College, Rajiv Gandhi Salai, Thorapakkam, Chennai
Mr. Rakesh Kumar, M.M. University, Mullana, Ambala, India
Dr. Lena Khaled, Zarqa Private University, Aman, Jordon
Ms. Supriya Kapoor, Patni/Lingaya's Institute of Management and Tech., India
Dr. Tossapon Boongoen , Aberystwyth University, UK
Dr . Bilal Alatas, Firat University, Turkey
Assist. Prof. Jyoti Praaksh Singh , Academy of Technology, India
Dr. Ritu Soni, GNG College, India
Dr . Mahendra Kumar , Sagar Institute of Research & Technology, Bhopal, India.
Dr. Binod Kumar, Lakshmi Narayan College of Tech.(LNCT)Bhopal India
Dr. Muzhir Shaban Al-Ani, Amman Arab University Amman – Jordan
Dr. T.C. Manjunath , ATRIA Institute of Tech, India
Mr. Muhammad Zakarya, COMSATS Institute of Information Technology (CIIT), Pakistan

Assist. Prof. Harmunish Taneja, M. M. University, India


Dr. Chitra Dhawale , SICSR, Model Colony, Pune, India
Mrs Sankari Muthukaruppan, Nehru Institute of Engineering and Technology, Anna University, India
Mr. Aaqif Afzaal Abbasi, National University Of Sciences And Technology, Islamabad
Prof. Ashutosh Kumar Dubey, Trinity Institute of Technology and Research Bhopal, India
Mr. G. Appasami, Dr. Pauls Engineering College, India
Mr. M. Yasin, National University of Science and Tech, Karachi (NUST), Pakistan
Mr. Yaser Miaji, University Utara Malaysia, Malaysia
Mr. Shah Ahsanul Haque, International Islamic University Chittagong (IIUC), Bangladesh
Prof. (Dr) Syed Abdul Sattar, Royal Institute of Technology & Science, India
Dr. S. Sasikumar, Roever Engineering College
Assist. Prof. Monit Kapoor, Maharishi Markandeshwar University, India
Mr. Nwaocha Vivian O, National Open University of Nigeria
Dr. M. S. Vijaya, GR Govindarajulu School of Applied Computer Technology, India
Assist. Prof. Chakresh Kumar, Manav Rachna International University, India
Mr. Kunal Chadha , R&D Software Engineer, Gemalto, Singapore
Mr. Mueen Uddin, Universiti Teknologi Malaysia, UTM , Malaysia
Dr. Dhuha Basheer abdullah, Mosul university, Iraq
Mr. S. Audithan, Annamalai University, India
Prof. Vijay K Chaudhari, Technocrats Institute of Technology , India
Associate Prof. Mohd Ilyas Khan, Technocrats Institute of Technology , India
Dr. Vu Thanh Nguyen, University of Information Technology, HoChiMinh City, VietNam
Assist. Prof. Anand Sharma, MITS, Lakshmangarh, Sikar, Rajasthan, India
Prof. T V Narayana Rao, HITAM Engineering college, Hyderabad
Mr. Deepak Gour, Sir Padampat Singhania University, India
Assist. Prof. Amutharaj Joyson, Kalasalingam University, India
Mr. Ali Balador, Islamic Azad University, Iran
Mr. Mohit Jain, Maharaja Surajmal Institute of Technology, India
Mr. Dilip Kumar Sharma, GLA Institute of Technology & Management, India
Dr. Debojyoti Mitra, Sir padampat Singhania University, India
Dr. Ali Dehghantanha, Asia-Pacific University College of Technology and Innovation, Malaysia
Mr. Zhao Zhang, City University of Hong Kong, China
Prof. S.P. Setty, A.U. College of Engineering, India
Prof. Patel Rakeshkumar Kantilal, Sankalchand Patel College of Engineering, India
Mr. Biswajit Bhowmik, Bengal College of Engineering & Technology, India
Mr. Manoj Gupta, Apex Institute of Engineering & Technology, India
Assist. Prof. Ajay Sharma, Raj Kumar Goel Institute Of Technology, India
Assist. Prof. Ramveer Singh, Raj Kumar Goel Institute of Technology, India
Dr. Hanan Elazhary, Electronics Research Institute, Egypt
Dr. Hosam I. Faiq, USM, Malaysia
Prof. Dipti D. Patil, MAEER’s MIT College of Engg. & Tech, Pune, India
Assist. Prof. Devendra Chack, BCT Kumaon Engineering College, Dwarahat, Almora, India
Prof. Manpreet Singh, M. M. Engg. College, M. M. University, India
Assist. Prof. M. Sadiq ali Khan, University of Karachi, Pakistan
Mr. Prasad S. Halgaonkar, MIT - College of Engineering, Pune, India
Dr. Imran Ghani, Universiti Teknologi Malaysia, Malaysia
Prof. Varun Kumar Kakar, Kumaon Engineering College, Dwarahat, India
Assist. Prof. Nisheeth Joshi, Apaji Institute, Banasthali University, Rajasthan, India
Associate Prof. Kunwar S. Vaisla, BCT Kumaon Engineering College, India
Prof. Anupam Choudhary, Bhilai School of Engg., Bhilai (C.G.), India
Mr. Divya Prakash Shrivastava, Al Jabal Al Gharbi University, Zawiya, Libya
Associate Prof. Dr. V. Radha, Avinashilingam Deemed university for women, Coimbatore.
Dr. Kasarapu Ramani, JNT University, Anantapur, India
Dr. Anuraag Awasthi, Jayoti Vidyapeeth Womens University, India
Dr. C G Ravichandran, R V S College of Engineering and Technology, India
Dr. Mohamed A. Deriche, King Fahd University of Petroleum and Minerals, Saudi Arabia
Mr. Abbas Karimi, Universiti Putra Malaysia, Malaysia
Mr. Amit Kumar, Jaypee University of Engg. and Tech., India
Dr. Nikolai Stoianov, Defense Institute, Bulgaria
Assist. Prof. S. Ranichandra, KSR College of Arts and Science, Tiruchengode
Mr. T.K.P. Rajagopal, Diamond Horse International Pvt Ltd, India
Dr. Md. Ekramul Hamid, Rajshahi University, Bangladesh
Mr. Hemanta Kumar Kalita, TATA Consultancy Services (TCS), India
Dr. Messaouda Azzouzi, Ziane Achour University of Djelfa, Algeria
Prof. (Dr.) Juan Jose Martinez Castillo, "Gran Mariscal de Ayacucho" University and Acantelys Research Group, Venezuela
Dr. Jatinderkumar R. Saini, Narmada College of Computer Application, India
Dr. Babak Bashari Rad, University Technology of Malaysia, Malaysia
Dr. Nighat Mir, Effat University, Saudi Arabia
Prof. (Dr.) G.M.Nasira, Sasurie College of Engineering, India
Mr. Varun Mittal, Gemalto Pte Ltd, Singapore
Assist. Prof. Mrs P. Banumathi, Kathir College Of Engineering, Coimbatore
Assist. Prof. Quan Yuan, University of Wisconsin-Stevens Point, US
Dr. Pranam Paul, Narula Institute of Technology, Agarpara, West Bengal, India
Assist. Prof. J. Ramkumar, V.L.B Janakiammal college of Arts & Science, India
Mr. P. Sivakumar, Anna university, Chennai, India
Mr. Md. Humayun Kabir Biswas, King Khalid University, Kingdom of Saudi Arabia
Mr. Mayank Singh, J.P. Institute of Engg & Technology, Meerut, India
HJ. Kamaruzaman Jusoff, Universiti Putra Malaysia
Mr. Nikhil Patrick Lobo, CADES, India
Dr. Amit Wason, Rayat-Bahra Institute of Engineering & Bio-Technology, India
Dr. Rajesh Shrivastava, Govt. Benazir Science & Commerce College, Bhopal, India
Assist. Prof. Vishal Bharti, DCE, Gurgaon
Mrs. Sunita Bansal, Birla Institute of Technology & Science, India
Dr. R. Sudhakar, Dr.Mahalingam college of Engineering and Technology, India
Dr. Amit Kumar Garg, Shri Mata Vaishno Devi University, Katra(J&K), India
Assist. Prof. Raj Gaurang Tiwari, AZAD Institute of Engineering and Technology, India
Mr. Hamed Taherdoost, Tehran, Iran
Mr. Amin Daneshmand Malayeri, YRC, IAU, Malayer Branch, Iran
Mr. Shantanu Pal, University of Calcutta, India
Dr. Terry H. Walcott, E-Promag Consultancy Group, United Kingdom
Dr. Ezekiel U OKIKE, University of Ibadan, Nigeria
Mr. P. Mahalingam, Caledonian College of Engineering, Oman
Dr. Mahmoud M. A. Abd Ellatif, Mansoura University, Egypt
Prof. Kunwar S. Vaisla, BCT Kumaon Engineering College, India
Prof. Mahesh H. Panchal, Kalol Institute of Technology & Research Centre, India
Mr. Muhammad Asad, Technical University of Munich, Germany
Mr. AliReza Shams Shafigh, Islamic Azad University, Iran
Prof. S. V. Nagaraj, RMK Engineering College, India
Mr. Ashikali M Hasan, Senior Researcher, CelNet security, India
Dr. Adnan Shahid Khan, University Technology Malaysia, Malaysia
Mr. Prakash Gajanan Burade, Nagpur University / ITM College of Engg., Nagpur, India
Dr. Jagdish B. Helonde, Nagpur University / ITM College of Engg., Nagpur, India
Prof. Dr. Mohammed Bouhorma, University Abdelmalek Essaadi, Morocco
Mr. K. Thirumalaivasan, Pondicherry Engg. College, India
Mr. Umbarkar Anantkumar Janardan, Walchand College of Engineering, India
Mr. Ashish Chaurasia, Gyan Ganga Institute of Technology & Sciences, India
Mr. Sunil Taneja, Kurukshetra University, India
Mr. Fauzi Adi Rafrastara, Dian Nuswantoro University, Indonesia
Dr. Yaduvir Singh, Thapar University, India
Dr. Ioannis V. Koskosas, University of Western Macedonia, Greece
Dr. Vasantha Kalyani David, Avinashilingam University for women, Coimbatore
Dr. Ahmed Mansour Manasrah, Universiti Sains Malaysia, Malaysia
Miss. Nazanin Sadat Kazazi, University Technology Malaysia, Malaysia
Mr. Saeed Rasouli Heikalabad, Islamic Azad University - Tabriz Branch, Iran
Assoc. Prof. Dhirendra Mishra, SVKM's NMIMS University, India
Prof. Shapoor Zarei, UAE Inventors Association, UAE
Prof. B.Raja Sarath Kumar, Lenora College of Engineering, India
Dr. Bashir Alam, Jamia millia Islamia, Delhi, India
Prof. Anant J Umbarkar, Walchand College of Engg., India
Assist. Prof. B. Bharathi, Sathyabama University, India
Dr. Fokrul Alom Mazarbhuiya, King Khalid University, Saudi Arabia
Prof. T.S.Jeyali Laseeth, Anna University of Technology, Tirunelveli, India
Dr. M. Balraju, Jawahar Lal Nehru Technological University Hyderabad, India
Dr. Vijayalakshmi M. N., R.V.College of Engineering, Bangalore
Prof. Walid Moudani, Lebanese University, Lebanon
Dr. Saurabh Pal, VBS Purvanchal University, Jaunpur, India
Associate Prof. Suneet Chaudhary, Dehradun Institute of Technology, India
Associate Prof. Dr. Manuj Darbari, BBD University, India
Ms. Prema Selvaraj, K.S.R College of Arts and Science, India
Assist. Prof. Ms. S. Sasikala, KSR College of Arts & Science, India
Mr. Sukhvinder Singh Deora, NC Institute of Computer Sciences, India
Dr. Abhay Bansal, Amity School of Engineering & Technology, India
Ms. Sumita Mishra, Amity School of Engineering and Technology, India
Professor S. Viswanadha Raju, JNT University Hyderabad, India
Mr. Asghar Shahrzad Khashandarag, Islamic Azad University - Tabriz Branch, Iran
Mr. Manoj Sharma, Panipat Institute of Engg. & Technology, India
Mr. Shakeel Ahmed, King Faisal University, Saudi Arabia
Dr. Mohamed Ali Mahjoub, Institute of Engineer of Monastir, Tunisia
Mr. Adri Jovin J.J., SriGuru Institute of Technology, India
Dr. Sukumar Senthilkumar, Universiti Sains Malaysia, Malaysia
Mr. Rakesh Bharati, Dehradun Institute of Technology Dehradun, India
Mr. Shervan Fekri Ershad, Shiraz International University, Iran
Mr. Md. Safiqul Islam, Daffodil International University, Bangladesh
Mr. Mahmudul Hasan, Daffodil International University, Bangladesh
Prof. Mandakini Tayade, UIT, RGTU, Bhopal, India
Ms. Sarla More, UIT, RGTU, Bhopal, India
Mr. Tushar Hrishikesh Jaware, R.C. Patel Institute of Technology, Shirpur, India
Ms. C. Divya, Dr G R Damodaran College of Science, Coimbatore, India
Mr. Fahimuddin Shaik, Annamacharya Institute of Technology & Sciences, India
Dr. M. N. Giri Prasad, JNTUCE,Pulivendula, A.P., India
Assist. Prof. Chintan M Bhatt, Charotar University of Science And Technology, India
Prof. Sahista Machchhar, Marwadi Education Foundation's Group of institutions, India
Assist. Prof. Navnish Goel, S. D. College of Engineering & Technology, India
Mr. Khaja Kamaluddin, Sirt University, Sirt, Libya
Mr. Mohammad Zaidul Karim, Daffodil International, Bangladesh
Mr. M. Vijayakumar, KSR College of Engineering, Tiruchengode, India
Mr. S. A. Ahsan Rajon, Khulna University, Bangladesh
Dr. Muhammad Mohsin Nazir, LCW University Lahore, Pakistan
Mr. Mohammad Asadul Hoque, University of Alabama, USA
Mr. P.V.Sarathchand, Indur Institute of Engineering and Technology, India
Mr. Durgesh Samadhiya, Chung Hua University, Taiwan
Dr. Venu Kuthadi, University of Johannesburg, Johannesburg, RSA
Dr. (Er) Jasvir Singh, Guru Nanak Dev University, Amritsar, Punjab, India
Mr. Jasmin Cosic, Min. of the Interior of Una-sana canton, B&H, Bosnia and Herzegovina
Dr. S. Rajalakshmi, Botho College, South Africa
Dr. Mohamed Sarrab, De Montfort University, UK
Mr. Basappa B. Kodada, Canara Engineering College, India
Assist. Prof. K. Ramana, Annamacharya Institute of Technology and Sciences, India
Dr. Ashu Gupta, Apeejay Institute of Management, Jalandhar, India
Assist. Prof. Shaik Rasool, Shadan College of Engineering & Technology, India
Assist. Prof. K. Suresh, Annamacharya Institute of Tech & Sci. Rajampet, AP, India
Dr. G. Singaravel, K.S.R. College of Engineering, India
Dr. B. G. Geetha, K.S.R. College of Engineering, India
Assist. Prof. Kavita Choudhary, ITM University, Gurgaon
Dr. Mehrdad Jalali, Azad University, Mashhad, Iran
Megha Goel, Shamli Institute of Engineering and Technology, Shamli, India
Mr. Chi-Hua Chen, Institute of Information Management, National Chiao-Tung University, Taiwan (R.O.C.)
Assoc. Prof. A. Rajendran, RVS College of Engineering and Technology, India
Assist. Prof. S. Jaganathan, RVS College of Engineering and Technology, India
Assoc. Prof. (Dr.) A S N Chakravarthy, JNTUK University College of Engineering Vizianagaram (State University)
Assist. Prof. Deepshikha Patel, Technocrat Institute of Technology, India
Assist. Prof. Maram Balajee, GMRIT, India
Assist. Prof. Monika Bhatnagar, TIT, India
Prof. Gaurang Panchal, Charotar University of Science & Technology, India
Prof. Anand K. Tripathi, Computer Society of India
Prof. Jyoti Chaudhary, High Performance Computing Research Lab, India
Assist. Prof. Supriya Raheja, ITM University, India
Dr. Pankaj Gupta, Microsoft Corporation, U.S.A.
Assist. Prof. Panchamukesh Chandaka, Hyderabad Institute of Tech. & Management, India
Prof. Mohan H.S, SJB Institute Of Technology, India
Mr. Hossein Malekinezhad, Islamic Azad University, Iran
Mr. Zatin Gupta, Universti Malaysia, Malaysia
Assist. Prof. Amit Chauhan, Phonics Group of Institutions, India
Assist. Prof. Ajal A. J., METS School Of Engineering, India
Mrs. Omowunmi Omobola Adeyemo, University of Ibadan, Nigeria
Dr. Bharat Bhushan Agarwal, I.F.T.M. University, India
Md. Nazrul Islam, University of Western Ontario, Canada
Tushar Kanti, L.N.C.T, Bhopal, India
Er. Aumreesh Kumar Saxena, SIRTs College Bhopal, India
Mr. Mohammad Monirul Islam, Daffodil International University, Bangladesh
Dr. Kashif Nisar, University Utara Malaysia, Malaysia
Dr. Wei Zheng, Rutgers Univ/ A10 Networks, USA
Associate Prof. Rituraj Jain, Vyas Institute of Engg & Tech, Jodhpur – Rajasthan
Assist. Prof. Apoorvi Sood, I.T.M. University, India
Dr. Kayhan Zrar Ghafoor, University Technology Malaysia, Malaysia
Mr. Swapnil Soner, Truba Institute College of Engineering & Technology, Indore, India
Ms. Yogita Gigras, I.T.M. University, India
Associate Prof. Neelima Sadineni, Pydha Engineering College, India
Assist. Prof. K. Deepika Rani, HITAM, Hyderabad
Ms. Shikha Maheshwari, Jaipur Engineering College & Research Centre, India
Prof. Dr. V. S. Giridhar Akula, Avanthi's Scientific Tech. & Research Academy, Hyderabad
Prof. Dr. S. Saravanan, Muthayammal Engineering College, India
Mr. Mehdi Golsorkhatabar Amiri, Islamic Azad University, Iran
Prof. Amit Sadanand Savyanavar, MITCOE, Pune, India
Assist. Prof. P.Oliver Jayaprakash, Anna University,Chennai
Assist. Prof. Ms. Sujata, ITM University, Gurgaon, India
Dr. Asoke Nath, St. Xavier's College, India
Mr. Masoud Rafighi, Islamic Azad University, Iran
Assist. Prof. RamBabu Pemula, NIMRA College of Engineering & Technology, India
Assist. Prof. Ms Rita Chhikara, ITM University, Gurgaon, India
Mr. Sandeep Maan, Government Post Graduate College, India
Prof. Dr. S. Muralidharan, Mepco Schlenk Engineering College, India
Associate Prof. T.V.Sai Krishna, QIS College of Engineering and Technology, India
Mr. R. Balu, Bharathiar University, Coimbatore, India
Assist. Prof. Shekhar. R, Dr.SM College of Engineering, India
Prof. P. Senthilkumar, Vivekanandha Institute of Engineering and Technology for Women, India
Mr. M. Kamarajan, PSNA College of Engineering & Technology, India
Dr. Angajala Srinivasa Rao, Jawaharlal Nehru Technical University, India
Assist. Prof. C. Venkatesh, A.I.T.S, Rajampet, India
Mr. Afshin Rezakhani Roozbahani, Ayatollah Boroujerdi University, Iran
Mr. Laxmi chand, SCTL, Noida, India
Dr. Abdul Hannan, Vivekanand College, Aurangabad
Prof. Mahesh Panchal, KITRC, Gujarat
Dr. A. Subramani, K.S.R. College of Engineering, Tiruchengode
Assist. Prof. Prakash M, Rajalakshmi Engineering College, Chennai, India
Assist. Prof. Akhilesh K Sharma, Sir Padampat Singhania University, India
Ms. Varsha Sahni, Guru Nanak Dev Engineering College, Ludhiana, India
Associate Prof. Trilochan Rout, NM Institute of Engineering and Technology, India
Mr. Srikanta Kumar Mohapatra, NMIET, Orissa, India
Mr. Waqas Haider Bangyal, Iqra University Islamabad, Pakistan
Dr. S. Vijayaragavan, Christ College of Engineering and Technology, Pondicherry, India
Prof. Elboukhari Mohamed, University Mohammed First, Oujda, Morocco
Dr. Muhammad Asif Khan, King Faisal University, Saudi Arabia
Dr. Nagy Ramadan Darwish Omran, Cairo University, Egypt.
Assistant Prof. Anand Nayyar, KCL Institute of Management and Technology, India
Mr. G. Premsankar, Ericcson, India
Assist. Prof. T. Hemalatha, VELS University, India
Prof. Tejaswini Apte, University of Pune, India
Dr. Edmund Ng Giap Weng, Universiti Malaysia Sarawak, Malaysia
Mr. Mahdi Nouri, Iran University of Science and Technology, Iran
Associate Prof. S. Asif Hussain, Annamacharya Institute of Technology & Sciences, India
Mrs. Kavita Pabreja, Maharaja Surajmal Institute (an affiliate of GGSIP University), India
Mr. Vorugunti Chandra Sekhar, DA-IICT, India
Mr. Muhammad Najmi Ahmad Zabidi, Universiti Teknologi Malaysia, Malaysia
Dr. Aderemi A. Atayero, Covenant University, Nigeria
Assist. Prof. Osama Sohaib, Balochistan University of Information Technology, Pakistan
Assist. Prof. K. Suresh, Annamacharya Institute of Technology and Sciences, India
Mr. Hassen Mohammed Abduallah Alsafi, International Islamic University Malaysia (IIUM) Malaysia
Mr. Robail Yasrab, Virtual University of Pakistan, Pakistan
Prof. Anand Nayyar, KCL Institute of Management and Technology, Jalandhar
Assoc. Prof. Vivek S Deshpande, MIT College of Engineering, India
Prof. K. Saravanan, Anna university Coimbatore, India
Dr. Ravendra Singh, MJP Rohilkhand University, Bareilly, India
Mr. V. Mathivanan, IBRA College of Technology, Sultanate of OMAN
Mr. Sami Ulhaq, SZABIST Islamabad, Pakistan
Dr. B. Justus Rabi, Institute of Science & Technology, India
Mr. Anuj Kumar Yadav, Dehradun Institute of technology, India
Mr. Alejandro Mosquera, University of Alicante, Spain
Assist. Prof. Arjun Singh, Sir Padampat Singhania University (SPSU), Udaipur, India
Dr. Smriti Agrawal, JB Institute of Engineering and Technology, Hyderabad
Assist. Prof. Swathi Sambangi, Visakha Institute of Engineering and Technology, India
Ms. Prabhjot Kaur, Guru Gobind Singh Indraprastha University, India
Mrs. Samaher AL-Hothali, Yanbu University College, Saudi Arabia
Prof. Rajneeshkaur Bedi, MIT College of Engineering, Pune, India
Dr. Wei Zhang, Amazon.com, Seattle, WA, USA
Mr. B. Santhosh Kumar, C S I College of Engineering, Tamil Nadu
Dr. K. Reji Kumar, N S S College, Pandalam, India
Assoc. Prof. K. Seshadri Sastry, EIILM University, India
Mr. Kai Pan, UNC Charlotte, USA
Mr. Ruikar Sachin, SGGSIET, India
Prof. (Dr.) Vinodani Katiyar, Sri Ramswaroop Memorial University, India
Assoc. Prof. M. Giri, Sreenivasa Institute of Technology and Management Studies, India
Assoc. Prof. Labib Francis Gergis, Misr Academy for Engineering and Technology (MET), Egypt
Assist. Prof. Amanpreet Kaur, ITM University, India
Assist. Prof. Anand Singh Rajawat, Shri Vaishnav Institute of Technology & Science, Indore
Mrs. Hadeel Saleh Haj Aliwi, Universiti Sains Malaysia (USM), Malaysia
Dr. Abhay Bansal, Amity University, India
Dr. Mohammad A. Mezher, Fahad Bin Sultan University, KSA
Assist. Prof. Nidhi Arora, M.C.A. Institute, India
Prof. Dr. P. Suresh, Karpagam College of Engineering, Coimbatore, India
Dr. Kannan Balasubramanian, Mepco Schlenk Engineering College, India
Dr. S. Sankara Gomathi, Panimalar Engineering college, India
Prof. Anil kumar Suthar, Gujarat Technological University, L.C. Institute of Technology, India
Assist. Prof. R. Hubert Rajan, NOORUL ISLAM UNIVERSITY, India
Assist. Prof. Dr. Jyoti Mahajan, College of Engineering & Technology
Assist. Prof. Homam Reda El-Taj, College of Network Engineering, Saudi Arabia & Malaysia
Mr. Bijan Paul, Shahjalal University of Science & Technology, Bangladesh
Assoc. Prof. Dr. Ch V Phani Krishna, KL University, India
Dr. Vishal Bhatnagar, Ambedkar Institute of Advanced Communication Technologies & Research, India
Dr. Lamri LAOUAMER, Al Qassim University, Dept. Info. Systems & European University of Brittany, Dept. Computer
Science, UBO, Brest, France
Prof. Ashish Babanrao Sasankar, G.H.Raisoni Institute Of Information Technology, India
Prof. Pawan Kumar Goel, Shamli Institute of Engineering and Technology, India
Mr. Ram Kumar Singh, S.V Subharti University, India
Assistant Prof. Sunish Kumar O S, Amaljyothi College of Engineering, India
Dr. Sanjay Bhargava, Banasthali University, India
Mr. Pankaj S. Kulkarni, AVEW's Shatabdi Institute of Technology, India
Mr. Roohollah Etemadi, Islamic Azad University, Iran
Mr. Oloruntoyin Sefiu Taiwo, Emmanuel Alayande College Of Education, Nigeria
Mr. Sumit Goyal, National Dairy Research Institute, India
Mr. Jaswinder Singh Dilawari, Geeta Engineering College, India
Prof. Raghuraj Singh, Harcourt Butler Technological Institute, Kanpur
Dr. S.K. Mahendran, Anna University, Chennai, India
Dr. Amit Wason, Hindustan Institute of Technology & Management, Punjab
Assist. Prof. D. Asir Antony Gnana Singh, M.I.E.T Engineering College, India
Mrs Mina Farmanbar, Eastern Mediterranean University, Famagusta, North Cyprus
Mr. Maram Balajee, GMR Institute of Technology, India
Mr. Moiz S. Ansari, Isra University, Hyderabad, Pakistan
Mr. Adebayo, Olawale Surajudeen, Federal University of Technology Minna, Nigeria
Mr. Jasvir Singh, University College Of Engg., India
Mr. Vivek Tiwari, MANIT, Bhopal, India
Assoc. Prof. R. Navaneethakrishnan, Bharathiyar College of Engineering and Technology, India
Mr. Somdip Dey, St. Xavier's College, Kolkata, India
Mr. Souleymane Balla-Arabé, Xi’an University of Electronic Science and Technology, China
Mr. Mahabub Alam, Rajshahi University of Engineering and Technology, Bangladesh
Mr. Sathyapraksh P., S.K.P Engineering College, India
Dr. N. Karthikeyan, SNS College of Engineering, Anna University, India
Dr. Binod Kumar, JSPM's, Jayawant Technical Campus, Pune, India
Assoc. Prof. Dinesh Goyal, Suresh Gyan Vihar University, India
Mr. Md. Abdul Ahad, K L University, India
Mr. Vikas Bajpai, The LNM IIT, India
Dr. Manish Kumar Anand, Salesforce (R & D Analytics), San Francisco, USA
Assist. Prof. Dheeraj Murari, Kumaon Engineering College, India
Assoc. Prof. Dr. A. Muthukumaravel, VELS University, Chennai
Mr. A. Siles Balasingh, St.Joseph University in Tanzania, Tanzania
Mr. Ravindra Daga Badgujar, R C Patel Institute of Technology, India
Dr. Preeti Khanna, SVKM’s NMIMS, School of Business Management, India
Mr. Kumar Dayanand, Cambridge Institute of Technology, India
Dr. Syed Asif Ali, SMI University Karachi, Pakistan
Prof. Pallvi Pandit, Himachal Pradesh University, India
Mr. Ricardo Verschueren, University of Gloucestershire, UK
Assist. Prof. Mamta Juneja, University Institute of Engineering and Technology, Panjab University, India
Assoc. Prof. P. Surendra Varma, NRI Institute of Technology, JNTU Kakinada, India
Assist. Prof. Gaurav Shrivastava, RGPV / SVITS Indore, India
Dr. S. Sumathi, Anna University, India
Assist. Prof. Ankita M. Kapadia, Charotar University of Science and Technology, India
Mr. Deepak Kumar, Indian Institute of Technology (BHU), India
Dr. Rajan Gupta, GGSIP University, New Delhi, India
Assist. Prof M. Anand Kumar, Karpagam University, Coimbatore, India
Mr. Arshad Mansoor, Pakistan Aeronautical Complex
Mr. Kapil Kumar Gupta, Ansal Institute of Technology and Management, India
Dr. Neeraj Tomer, SINE International Institute of Technology, Jaipur, India
Assist. Prof. Trunal J. Patel, C.G.Patel Institute of Technology, Uka Tarsadia University, Bardoli, Surat
Mr. Sivakumar, Codework solutions, India
Mr. Mohammad Sadegh Mirzaei, PGNR Company, Iran
Dr. Gerard G. Dumancas, Oklahoma Medical Research Foundation, USA
Mr. Varadala Sridhar, Vardhaman College of Engineering, affiliated to JNTU, Hyderabad
Assist. Prof. Manoj Dhawan, SVITS, Indore
Assoc. Prof. Chitreshh Banerjee, Suresh Gyan Vihar University, Jaipur, India
Dr. S. Santhi, SCSVMV University, India
Mr. Davood Mohammadi Souran, Ministry of Energy of Iran, Iran
Mr. Shamim Ahmed, Bangladesh University of Business and Technology, Bangladesh
Mr. Sandeep Reddivari, Mississippi State University, USA
Assoc. Prof. Ousmane Thiare, Gaston Berger University, Senegal
Dr. Hazra Imran, Athabasca University, Canada
Dr. Setu Kumar Chaturvedi, Technocrats Institute of Technology, Bhopal, India
Mr. Mohd Dilshad Ansari, Jaypee University of Information Technology, India
Ms. Jaspreet Kaur, Distance Education LPU, India
Dr. D. Nagarajan, Salalah College of Technology, Sultanate of Oman
Dr. K.V.N.R.Sai Krishna, S.V.R.M. College, India
Mr. Himanshu Pareek, Center for Development of Advanced Computing (CDAC), India
Mr. Khaldi Amine, Badji Mokhtar University, Algeria
Mr. Mohammad Sadegh Mirzaei, Scientific Applied University, Iran
Assist. Prof. Khyati Chaudhary, Ram-eesh Institute of Engg. & Technology, India
Mr. Sanjay Agal, Pacific College of Engineering Udaipur, India
Mr. Abdul Mateen Ansari, King Khalid University, Saudi Arabia
Dr. H.S. Behera, Veer Surendra Sai University of Technology (VSSUT), India
Dr. Shrikant Tiwari, Shri Shankaracharya Group of Institutions (SSGI), India
Prof. Ganesh B. Regulwar, Shri Shankarprasad Agnihotri College of Engg, India
Prof. Pinnamaneni Bhanu Prasad, Matrix vision GmbH, Germany
Dr. Shrikant Tiwari, Shri Shankaracharya Technical Campus (SSTC), India
Dr. Siddesh G. K., Dayananda Sagar College of Engineering, Bangalore, India
Dr. Nadir Bouchama, CERIST Research Center, Algeria
Dr. R. Sathishkumar, Sri Venkateswara College of Engineering, India
Assistant Prof (Dr.) Mohamed Moussaoui, Abdelmalek Essaadi University, Morocco
Dr. S. Malathi, Panimalar Engineering College, Chennai, India
Dr. V. Subedha, Panimalar Institute of Technology, Chennai, India
Dr. Prashant Panse, Swami Vivekanand College of Engineering, Indore, India
Dr. Hamza Aldabbas, Al-Balqa’a Applied University, Jordan
Dr. G. Rasitha Banu, Vel's University, Chennai
Dr. V. D. Ambeth Kumar, Panimalar Engineering College, Chennai
Prof. Anuranjan Misra, Bhagwant Institute of Technology, Ghaziabad, India
Ms. U. Sinthuja, PSG College of Arts & Science, India
Dr. Ehsan Saradar Torshizi, Urmia University, Iran
Dr. Shamneesh Sharma, APG Shimla University, Shimla (H.P.), India
Assistant Prof. A. S. Syed Navaz, Muthayammal College of Arts & Science, India
Assistant Prof. Ranjit Panigrahi, Sikkim Manipal Institute of Technology, Majitar, Sikkim
Dr. Khaled Eskaf, Arab Academy for Science, Technology & Maritime Transportation, Egypt
Dr. Nishant Gupta, University of Jammu, India
Assistant Prof. Nagarajan Sankaran, Annamalai University, Chidambaram, Tamilnadu, India
Assistant Prof.Tribikram Pradhan, Manipal Institute of Technology, India
Dr. Nasser Lotfi, Eastern Mediterranean University, Northern Cyprus
Dr. R. Manavalan, K S Rangasamy college of Arts and Science, Tamilnadu, India
Assistant Prof. P. Krishna Sankar, K S Rangasamy college of Arts and Science, Tamilnadu, India
Dr. Rahul Malik, Cisco Systems, USA
Dr. S. C. Lingareddy, ALPHA College of Engineering, India
Assistant Prof. Mohammed Shuaib, Integral University, Lucknow, India
Dr. Sachin Yele, Sanghvi Institute of Management & Science, India
Dr. T. Thambidurai, Sun Univercell, Singapore
Prof. Anandkumar Telang, BKIT, India
Assistant Prof. R. Poorvadevi, SCSVMV University, India
Dr. Uttam Mande, Gitam University, India
Dr. Poornima Girish Naik, Shahu Institute of Business Education and Research (SIBER), India
Prof. Md. Abu Kausar, Jaipur National University, Jaipur, India
Dr. Mohammed Zuber, AISECT University, India
Prof. Kalum Priyanath Udagepola, King Abdulaziz University, Saudi Arabia
Dr. K. R. Ananth, Velalar College of Engineering and Technology, India
Assistant Prof. Sanjay Sharma, Roorkee Engineering & Management Institute Shamli (U.P), India
Assistant Prof. Panem Charan Arur, Priyadarshini Institute of Technology, India
Dr. Ashwak Mahmood Muhsen Alabaichi, Karbala University / College of Science, Iraq
Dr. Urmila Shrawankar, G H Raisoni College of Engineering, Nagpur (MS), India
Dr. Krishan Kumar Paliwal, Panipat Institute of Engineering & Technology, India
Dr. Mukesh Negi, Tech Mahindra, India
Dr. Anuj Kumar Singh, Amity University Gurgaon, India
Dr. Babar Shah, Gyeongsang National University, South Korea
Assistant Prof. Jayprakash Upadhyay, SRI-TECH Jabalpur, India
Assistant Prof. Varadala Sridhar, Vidya Jyothi Institute of Technology, India
Assistant Prof. Parameshachari B D, KSIT, Bangalore, India
Assistant Prof. Ankit Garg, Amity University, Haryana, India
Assistant Prof. Rajashe Karappa, SDMCET, Karnataka, India
Assistant Prof. Varun Jasuja, GNIT, India
Assistant Prof. Sonal Honale, Abha Gaikwad Patil College of Engineering Nagpur, India
Dr. Pooja Choudhary, CT Group of Institutions, NIT Jalandhar, India
Dr. Faouzi Hidoussi, UHL Batna, Algeria
Dr. Naseer Ali Husieen, Wasit University, Iraq
Assistant Prof. Vinod Kumar Shukla, Amity University, Dubai
Dr. Ahmed Farouk Metwaly, K L University
Mr. Mohammed Noaman Murad, Cihan University, Iraq
Dr. Suxing Liu, Arkansas State University, USA
Dr. M. Gomathi, Velalar College of Engineering and Technology, India
Assistant Prof. Sumardiono, College PGRI Blitar, Indonesia
Dr. Latika Kharb, Jagan Institute of Management Studies (JIMS), Delhi, India
Associate Prof. S. Raja, Pauls College of Engineering and Technology, Tamilnadu, India
Assistant Prof. Seyed Reza Pakize, Shahid Sani High School, Iran
Dr. Thiyagu Nagaraj, University-INOU, India
Assistant Prof. Noreen Sarai, Harare Institute of Technology, Zimbabwe
Assistant Prof. Gajanand Sharma, Suresh Gyan Vihar University Jaipur, Rajasthan, India
Assistant Prof. Mapari Vikas Prakash, Siddhant COE, Sudumbare, Pune, India
Dr. Devesh Katiyar, Shri Ramswaroop Memorial University, India
Dr. Shenshen Liang, University of California, Santa Cruz, US
Assistant Prof. Mohammad Abu Omar, Limkokwing University of Creative Technology- Malaysia
Mr. Snehasis Banerjee, Tata Consultancy Services, India
Assistant Prof. Kibona Lusekelo, Ruaha Catholic University (RUCU), Tanzania
Assistant Prof. Adib Kabir Chowdhury, University College Technology Sarawak, Malaysia
Dr. Ying Yang, Computer Science Department, Yale University, USA
Dr. Vinay Shukla, Institute Of Technology & Management, India
Dr. Liviu Octavian Mafteiu-Scai, West University of Timisoara, Romania
Assistant Prof. Rana Khudhair Abbas Ahmed, Al-Rafidain University College, Iraq
Assistant Prof. Nitin A. Naik, S.R.T.M. University, India
Dr. Timothy Powers, University of Hertfordshire, UK
Dr. S. Prasath, Bharathiar University, Erode, India
Dr. Ritu Shrivastava, SIRTS Bhopal, India
Prof. Rohit Shrivastava, Mittal Institute of Technology, Bhopal, India
Dr. Gianina Mihai, "Dunarea de Jos" University of Galati, Romania
Assistant Prof. Ms. T. Kalai Selvi, Erode Sengunthar Engineering College, India
Assistant Prof. Ms. C. Kavitha, Erode Sengunthar Engineering College, India
Assistant Prof. K. Sinivasamoorthi, Erode Sengunthar Engineering College, India
Assistant Prof. Mallikarjun C. Sarsamba, Bheemanna Khandre Institute of Technology, Bhalki, India
Assistant Prof. Vishwanath Chikaraddi, Veermata Jijabai Technological Institute (Central Technological Institute), India
Assistant Prof. Dr. Ikvinderpal Singh, Trai Shatabdi GGS Khalsa College, India
Professor Yousef Farhaoui, Moulay Ismail University, Errachidia, Morocco
Dr. Parul Verma, Amity University, India
Assistant Prof. Madhavi Dhingra, Amity University, Madhya Pradesh, India
Assistant Prof. G. Selvavinayagam, SNS College of Technology, Coimbatore, India
Professor Kartheesan Log, Anna University, Chennai
Professor Vasudeva Acharya, Shri Madhwa vadiraja Institute of Technology, India
Dr. Asif Iqbal Hajamydeen, Management & Science University, Malaysia
Assistant Prof. Mahendra Singh Meena, Amity University Haryana
Assistant Professor Manjeet Kaur, Amity University Haryana
Dr. Mohamed Abd El-Basset Matwalli, Zagazig University, Egypt
Dr. Ramani Kannan, Universiti Teknologi PETRONAS, Malaysia
Assistant Prof. S. Jagadeesan Subramaniam, Anna University, India
Assistant Prof. Dharmendra Choudhary, Tripura University, India
Assistant Prof. Deepika Vodnala, SR Engineering College, India
Dr. Kai Cong, Intel Corporation & Computer Science Department, Portland State University, USA
Dr. Kailas R Patil, Vishwakarma Institute of Information Technology (VIIT), India
Dr. Omar A. Alzubi, Faculty of IT / Al-Balqa Applied University, Jordan
Assistant Prof. Kareemullah Shaik, Nimra Institute of Science and Technology, India
Assistant Prof. Chirag Modi, NIT Goa
Dr. R. Ramkumar, Nandha Arts And Science College, India
Dr. Priyadharshini Vydhialingam, Bharathiar University, India
Dr. P. S. Jagadeesh Kumar, DBIT, Bangalore, Karnataka
Dr. Vikas Thada, AMITY University, Pachgaon
Dr. T. A. Ashok Kumar, Institute of Management, Christ University, Bangalore
Dr. Shaheera Rashwan, Informatics Research Institute
Dr. S. Preetha Gunasekar, Bharathiyar University, India
Asst Professor Sameer Dev Sharma, Uttaranchal University, Dehradun
Dr. Zhihan Lv, Chinese Academy of Sciences, China
Dr. Ikvinderpal Singh, Trai Shatabdi GGS Khalsa College, Amritsar
Dr. Umar Ruhi, University of Ottawa, Canada
Dr. Jasmin Cosic, University of Bihac, Bosnia and Herzegovina
Dr. Homam Reda El-Taj, University of Tabuk, Kingdom of Saudi Arabia
Dr. Mostafa Ghobaei Arani, Islamic Azad University, Iran
Dr. Ayyasamy Ayyanar, Annamalai University, India
Dr. Selvakumar Manickam, Universiti Sains Malaysia, Malaysia
Dr. Murali Krishna Namana, GITAM University, India
Dr. Smriti Agrawal, Chaitanya Bharathi Institute of Technology, Hyderabad, India
Professor Vimalathithan Rathinasabapathy, Karpagam College Of Engineering, India
Dr. Sushil Chandra Dimri, Graphic Era University, India
Dr. Dinh-Sinh Mai, Le Quy Don Technical University, Vietnam
Dr. S. Rama Sree, Aditya Engg. College, India
Dr. Ehab T. Alnfrawy, Sadat Academy, Egypt
Dr. Patrick D. Cerna, Haramaya University, Ethiopia
Dr. Vishal Jain, Bharati Vidyapeeth's Institute of Computer Applications and Management (BVICAM), India
Associate Prof. Dr. Jiliang Zhang, North Eastern University, China
Dr. Sharefa Murad, Middle East University, Jordan
Dr. Ajeet Singh Poonia, Govt. College of Engineering & technology, Rajasthan, India
Dr. Vahid Esmaeelzadeh, University of Science and Technology, Iran
Dr. Jacek M. Czerniak, Casimir the Great University in Bydgoszcz, Institute of Technology, Poland
Associate Prof. Anisur Rehman Nasir, Jamia Millia Islamia University
Assistant Prof. Imran Ahmad, COMSATS Institute of Information Technology, Pakistan
Professor Ghulam Qasim, Preston University, Islamabad, Pakistan
Dr. Parameshachari B D, GSSS Institute of Engineering and Technology for Women
Dr. Wencan Luo, University of Pittsburgh, US
Dr. Musa PEKER, Faculty of Technology, Mugla Sitki Kocman University, Turkey
Dr. Gunasekaran Shanmugam, Anna University, India
Dr. Binh P. Nguyen, National University of Singapore, Singapore
Dr. Rajkumar Jain, Indian Institute of Technology Indore, India
Dr. Imtiaz Ali Halepoto, QUEST Nawabshah, Pakistan
Dr. Shaligram Prajapat, Devi Ahilya University Indore India
Dr. Sunita Singhal, Birla Institute of Technology and Science, Pilani, India
Dr. Ijaz Ali Shoukat, King Saud University, Saudi Arabia
Dr. Anuj Gupta, IKG Punjab Technical University, India
Dr. Sonali Saini, IES-IPS Academy, India
Dr. Krishan Kumar, MotiLal Nehru National Institute of Technology, Allahabad, India
Dr. Z. Faizal Khan, College of Engineering, Shaqra University, Kingdom of Saudi Arabia
Prof. M. Padmavathamma, S.V. University Tirupati, India
Prof. A. Velayudham, Cape Institute of Technology, India
Prof. Seifedine Kadry, American University of the Middle East
Dr. J. Durga Prasad Rao, Pt. Ravishankar Shukla University, Raipur
Assistant Prof. Najam Hasan, Dhofar University
Dr. G. Suseendran, Vels University, Pallavaram, Chennai
Prof. Ankit Faldu, Gujarat Technological University - Atmiya Institute of Technology and Science
Dr. Ali Habiboghli, Islamic Azad University
Dr. Deepak Dembla, JECRC University, Jaipur, India
Dr. Pankaj Rajan, Walmart Labs, USA
Assistant Prof. Radoslava Kraleva, South-West University "Neofit Rilski", Bulgaria
Assistant Prof. Medhavi Shriwas, Shri vaishnav institute of Technology, India
Associate Prof. Sedat Akleylek, Ondokuz Mayis University, Turkey
Dr. U.V. Arivazhagu, Kingston Engineering College Affiliated To Anna University, India
Dr. Touseef Ali, University of Engineering and Technology, Taxila, Pakistan
Assistant Prof. Naren Jeeva, SASTRA University, India
Dr. Riccardo Colella, University of Salento, Italy
Dr. Enache Maria Cristina, University of Galati, Romania
Dr. Senthil P, Kurinji College of Arts & Science, India
Dr. Hasan Ashrafi-rizi, Isfahan University of Medical Sciences, Isfahan, Iran
Dr. Mazhar Malik, Institute of Southern Punjab, Pakistan
Dr. Yajie Miao, Carnegie Mellon University, USA
Dr. Kamran Shaukat, University of the Punjab, Pakistan
Dr. Sasikaladevi N., SASTRA University, India
Dr. Ali Asghar Rahmani Hosseinabadi, Islamic Azad University Ayatollah Amoli Branch, Amol, Iran
Dr. Velin Kralev, South-West University "Neofit Rilski", Blagoevgrad, Bulgaria
Dr. Marius Iulian Mihailescu, LUMINA - The University of South-East Europe
Dr. Sriramula Nagaprasad, S.R.R. Govt. Arts & Science College, Karimnagar, India
Prof (Dr.) Namrata Dhanda, Dr. APJ Abdul Kalam Technical University, Lucknow, India
Dr. Javed Ahmed Mahar, Shah Abdul Latif University, Khairpur Mir’s, Pakistan
Dr. B. Narendra Kumar Rao, Sree Vidyanikethan Engineering College, India
Dr. Shahzad Anwar, University of Engineering & Technology Peshawar, Pakistan
Dr. Basit Shahzad, King Saud University, Riyadh - Saudi Arabia
Dr. Nilamadhab Mishra, Chang Gung University
Dr. Sachin Kumar, Indian Institute of Technology Roorkee
Dr. Santosh Nanda, Biju-Pattnaik University of Technology
Dr. Sherzod Turaev, International Islamic University Malaysia
Dr. Yilun Shang, Tongji University, Department of Mathematics, Shanghai, China
Dr. Nuzhat Shaikh, Modern Education society's College of Engineering, Pune, India
Dr. Parul Verma, Amity University, Lucknow campus, India
Dr. Rachid Alaoui, Agadir Ibn Zohr University, Agadir, Morocco
Dr. Dharmendra Patel, Charotar University of Science and Technology, India
Dr. Dong Zhang, University of Central Florida, USA
Dr. Kennedy Chinedu Okafor, Federal University of Technology Owerri, Nigeria
Prof. C Ram Kumar, Dr NGP Institute of Technology, India
Dr. Sandeep Gupta, GGS IP University, New Delhi, India
Dr. Shahanawaj Ahamad, University of Ha'il, Ha'il City, Ministry of Higher Education, Kingdom of Saudi Arabia
Dr. Najeed Ahmed Khan, NED University of Engineering & Technology, India
Dr. Sajid Ullah Khan, Universiti Malaysia Sarawak, Malaysia
Dr. Muhammad Asif, National Textile University Faisalabad, Pakistan
Dr. Yu BI, University of Central Florida, Orlando, FL, USA
Dr. Brijendra Kumar Joshi, Research Center, Military College of Telecommunication Engineering, India
Prof. Dr. Nak Eun Cho, Pukyong National University, Korea
Prof. Wasim Ul-Haq, Mathematics Department Faculty of Science, Majmaah University, Saudi Arabia
CALL FOR PAPERS
International Journal of Computer Science and Information Security
IJCSIS 2017-2018
ISSN: 1947-5500
http://sites.google.com/site/ijcsis/
The International Journal of Computer Science and Information Security (IJCSIS) is a premier
scholarly venue in the areas of computer science and information security. IJCSIS provides a high-profile,
leading-edge platform for researchers and engineers alike to publish state-of-the-art research in the
respective fields of information technology and communication security. The journal features a diverse
mixture of publication articles including core and applied computer science related topics.
Authors are solicited to contribute to the special issue by submitting articles that illustrate research results,
projects, survey works and industrial experiences describing significant advances in the following
areas (but not limited to them). Submissions may span a broad range of topics, e.g.:
Track A: Security
Access control, Anonymity, Audit and audit reduction & Authentication and authorization, Applied
cryptography, Cryptanalysis, Digital Signatures, Biometric security, Boundary control devices,
Certification and accreditation, Cross-layer design for security, Security & Network Management, Data and
system integrity, Database security, Defensive information warfare, Denial of service protection, Intrusion
Detection, Anti-malware, Distributed systems security, Electronic commerce, E-mail security, Spam,
Phishing, E-mail fraud, Virus, worms, Trojan Protection, Grid security, Information hiding and
watermarking & Information survivability, Insider threat protection, Integrity
Intellectual property protection, Internet/Intranet Security, Key management and key recovery, Language-
based security, Mobile and wireless security, Mobile, Ad Hoc and Sensor Network Security, Monitoring
and surveillance, Multimedia security ,Operating system security, Peer-to-peer security, Performance
Evaluations of Protocols & Security Application, Privacy and data protection, Product evaluation criteria
and compliance, Risk evaluation and security certification, Risk/vulnerability assessment, Security &
Network Management, Security Models & protocols, Security threats & countermeasures (DDoS, MiM,
Session Hijacking, Replay attack, etc.), Trusted computing, Ubiquitous Computing Security, Virtualization
security, VoIP security, Web 2.0 security, Submission Procedures, Active Defense Systems, Adaptive
Defense Systems, Benchmark, Analysis and Evaluation of Security Systems, Distributed Access Control
and Trust Management, Distributed Attack Systems and Mechanisms, Distributed Intrusion
Detection/Prevention Systems, Denial-of-Service Attacks and Countermeasures, High Performance
Security Systems, Identity Management and Authentication, Implementation, Deployment and
Management of Security Systems, Intelligent Defense Systems, Internet and Network Forensics, Large-
scale Attacks and Defense, RFID Security and Privacy, Security Architectures in Distributed Network
Systems, Security for Critical Infrastructures, Security for P2P systems and Grid Systems, Security in E-
Commerce, Security and Privacy in Wireless Networks, Secure Mobile Agents and Mobile Code, Security
Protocols, Security Simulation and Tools, Security Theory and Tools, Standards and Assurance Methods,
Trusted Computing, Viruses, Worms, and Other Malicious Code, World Wide Web Security, Novel and
emerging secure architecture, Study of attack strategies, attack modeling, Case studies and analysis of
actual attacks, Continuity of Operations during an attack, Key management, Trust management, Intrusion
detection techniques, Intrusion response, alarm management, and correlation analysis, Study of tradeoffs
between security and system performance, Intrusion tolerance systems, Secure protocols, Security in
wireless networks (e.g. mesh networks, sensor networks, etc.), Cryptography and Secure Communications,
Computer Forensics, Recovery and Healing, Security Visualization, Formal Methods in Security, Principles
for Designing a Secure Computing System, Autonomic Security, Internet Security, Security in Health Care
Systems, Security Solutions Using Reconfigurable Computing, Adaptive and Intelligent Defense Systems,
Authentication and Access control, Denial of service attacks and countermeasures, Identity, Route and
Location Anonymity schemes, Intrusion detection and prevention techniques, Cryptography, encryption
algorithms and Key management schemes, Secure routing schemes, Secure neighbor discovery and
localization, Trust establishment and maintenance, Confidentiality and data integrity, Security architectures,
deployments and solutions, Emerging threats to cloud-based services, Security model for new services,
Cloud-aware web service security, Information hiding in Cloud Computing, Securing distributed data
storage in cloud, Security, privacy and trust in mobile computing systems and applications, Middleware
security & Security features: middleware software is an asset on
its own and has to be protected, interaction between security-specific and other middleware features, e.g.,
context-awareness, Middleware-level security monitoring and measurement: metrics and mechanisms
for quantification and evaluation of security enforced by the middleware, Security co-design: trade-off and
co-design between application-based and middleware-based security, Policy-based management:
innovative support for policy-based definition and enforcement of security concerns, Identification and
authentication mechanisms: Means to capture application specific constraints in defining and enforcing
access control rules, Middleware-oriented security patterns: identification of patterns for sound, reusable
security, Security in aspect-based middleware: mechanisms for isolating and enforcing security aspects,
Security in agent-based platforms: protection for mobile code and platforms, Smart Devices: Biometrics,
National ID cards, Embedded Systems Security and TPMs, RFID Systems Security, Smart Card Security,
Pervasive Systems: Digital Rights Management (DRM) in pervasive environments, Intrusion Detection and
Information Filtering, Localization Systems Security (Tracking of People and Goods), Mobile Commerce
Security, Privacy Enhancing Technologies, Security Protocols (for Identification and Authentication,
Confidentiality and Privacy, and Integrity), Ubiquitous Networks: Ad Hoc Networks Security, Delay-
Tolerant Network Security, Domestic Network Security, Peer-to-Peer Networks Security, Security Issues
in Mobile and Ubiquitous Networks, Security of GSM/GPRS/UMTS Systems, Sensor Networks Security,
Vehicular Network Security, Wireless Communication Security: Bluetooth, NFC, WiFi, WiMAX,
WiMedia, others
This Track will emphasize the design, implementation, management and applications of computer
communications, networks and services. Topics of a mostly theoretical nature are also welcome, provided
there is clear practical potential in applying the results of such work.
Track B: Computer Science
Broadband wireless technologies: LTE, WiMAX, WiRAN, HSDPA, HSUPA, Resource allocation and
interference management, Quality of service and scheduling methods, Capacity planning and dimensioning,
Cross-layer design and Physical layer based issue, Interworking architecture and interoperability, Relay
assisted and cooperative communications, Location and provisioning and mobility management, Call
admission and flow/congestion control, Performance optimization, Channel capacity modeling and analysis,
Middleware Issues: Event-based, publish/subscribe, and message-oriented middleware, Reconfigurable,
adaptable, and reflective middleware approaches, Middleware solutions for reliability, fault tolerance, and
quality-of-service, Scalability of middleware, Context-aware middleware, Autonomic and self-managing
middleware, Evaluation techniques for middleware solutions, Formal methods and tools for designing,
verifying, and evaluating, middleware, Software engineering techniques for middleware, Service oriented
middleware, Agent-based middleware, Security middleware, Network Applications: Network-based
automation, Cloud applications, Ubiquitous and pervasive applications, Collaborative applications, RFID
and sensor network applications, Mobile applications, Smart home applications, Infrastructure monitoring
and control applications, Remote health monitoring, GPS and location-based applications, Networked
vehicles applications, Alert applications, Embedded Computer Systems, Advanced Control Systems, and
Intelligent Control : Advanced control and measurement, computer and microprocessor-based control,
signal processing, estimation and identification techniques, application specific IC’s, nonlinear and
adaptive control, optimal and robot control, intelligent control, evolutionary computing, and intelligent
systems, instrumentation subject to critical conditions, automotive, marine and aero-space control and all
other control applications, Intelligent Control System, Wiring/Wireless Sensor, Signal Control System.
Sensors, Actuators and Systems Integration : Intelligent sensors and actuators, multisensor fusion, sensor
array and multi-channel processing, micro/nano technology, microsensors and microactuators,
instrumentation electronics, MEMS and system integration, wireless sensor, Network Sensor, Hybrid
Sensor, Distributed Sensor Networks. Signal and Image Processing : Digital signal processing theory,
methods, DSP implementation, speech processing, image and multidimensional signal processing, Image
analysis and processing, Image and Multimedia applications, Real-time multimedia signal processing,
Computer vision, Emerging signal processing areas, Remote Sensing, Signal processing in education.
Industrial Informatics: Industrial applications of neural networks, fuzzy algorithms, Neuro-Fuzzy
application, bioInformatics, real-time computer control, real-time information systems, human-machine
interfaces, CAD/CAM/CAT/CIM, virtual reality, industrial communications, flexible manufacturing
systems, industrial automated process, Data Storage Management, Harddisk control, Supply Chain
Management, Logistics applications, Power plant automation, Drives automation. Information Technology,
Management of Information System : Management information systems, Information Management,
Nursing information management, Information System, Information Technology and their application, Data
retrieval, Data Base Management, Decision analysis methods, Information processing, Operations research,
E-Business, E-Commerce, E-Government, Computer Business, Security and risk management, Medical
imaging, Biotechnology, Bio-Medicine, Computer-based information systems in health care, Changing
Access to Patient Information, Healthcare Management Information Technology.
Communication/Computer Network, Transportation Application : On-board diagnostics, Active safety
systems, Communication systems, Wireless technology, Communication application, Navigation and
Guidance, Vision-based applications, Speech interface, Sensor fusion, Networking theory and technologies,
Transportation information, Autonomous vehicle, Vehicle application of affective computing, Advance
Computing technology and their application : Broadband and intelligent networks, Data Mining, Data
fusion, Computational intelligence, Information and data security, Information indexing and retrieval,
Information processing, Information systems and applications, Internet applications and performances,
Knowledge based systems, Knowledge management, Software Engineering, Decision making, Mobile
networks and services, Network management and services, Neural Network, Fuzzy logics, Neuro-Fuzzy,
Expert approaches, Innovation Technology and Management : Innovation and product development,
Emerging advances in business and its applications, Creativity in Internet management and retailing, B2B
and B2C management, Electronic transceiver device for Retail Marketing Industries, Facilities planning
and management, Innovative pervasive computing applications, Programming paradigms for pervasive
systems, Software evolution and maintenance in pervasive systems, Middleware services and agent
technologies, Adaptive, autonomic and context-aware computing, Mobile/Wireless computing systems and
services in pervasive computing, Energy-efficient and green pervasive computing, Communication
architectures for pervasive computing, Ad hoc networks for pervasive communications, Pervasive
opportunistic communications and applications, Enabling technologies for pervasive systems (e.g., wireless
BAN, PAN), Positioning and tracking technologies, Sensors and RFID in pervasive systems, Multimodal
sensing and context for pervasive applications, Pervasive sensing, perception and semantic interpretation,
Smart devices and intelligent environments, Trust, security and privacy issues in pervasive systems, User
interfaces and interaction models, Virtual immersive communications, Wearable computers, Standards and
interfaces for pervasive computing environments, Social and economic models for pervasive systems,
Active and Programmable Networks, Ad Hoc & Sensor Network, Congestion and/or Flow Control, Content
Distribution, Grid Networking, High-speed Network Architectures, Internet Services and Applications,
Optical Networks, Mobile and Wireless Networks, Network Modeling and Simulation, Multicast,
Multimedia Communications, Network Control and Management, Network Protocols, Network
Performance, Network Measurement, Peer to Peer and Overlay Networks, Quality of Service and Quality
of Experience, Ubiquitous Networks, Crosscutting Themes – Internet Technologies, Infrastructure,
Services and Applications; Open Source Tools, Open Models and Architectures; Security, Privacy and
Trust; Navigation Systems, Location Based Services; Social Networks and Online Communities; ICT
Convergence, Digital Economy and Digital Divide, Neural Networks, Pattern Recognition, Computer
Vision, Advanced Computing Architectures and New Programming Models, Visualization and Virtual
Reality as Applied to Computational Science, Computer Architecture and Embedded Systems, Technology
in Education, Theoretical Computer Science, Computing Ethics, Computing Practices & Applications
Authors are invited to submit papers through e-mail [email protected]. Submissions must be original
and should not have been published previously or be under consideration for publication while being
evaluated by IJCSIS. Before submission authors should carefully read over the journal's Author Guidelines,
which are located at http://sites.google.com/site/ijcsis/authors-notes .
© IJCSIS PUBLICATION 2017
ISSN 1947-5500
http://sites.google.com/site/ijcsis/