AlgorithmDesign Python-ERK2019
Abstract. Python is one of the most sought-after programming languages for cybersecurity thanks to its extensive library of powerful packages that support rapid application development, its clean syntax and modular design, and its automatic memory management and dynamic typing. A good understanding of algorithm design is very important, because when a flaw is found in the written code, a step back to the design phase is necessary and the algorithm has to be redesigned. In this paper, we describe how Python can be used to develop algorithms and implement its modules for use in the cybersecurity domain. With the suggested approaches to algorithm design, cybersecurity professionals extend their existing capabilities for tweaking, customizing, or outright developing their own tools.

1 Introduction

Programming is one of the most important parts of cybersecurity. Today, the need for cybersecurity professionals is more than obvious. Cybersecurity is not just about using a customised operating system with hundreds of tools to find vulnerabilities; it is something more than that. Programming skills are always welcome: if you want to be a top-notch professional, you have to think like a hacker and look for all the possible ways a hacker could exploit any system, and this includes development too. The field of cybersecurity is huge and cannot be constrained into a number of predefined fundamentals of forensic investigators, ethical hackers, security analysts etc. Programming knowledge proves essential for analyzing software for vulnerabilities, identifying malicious software, and other tasks required of cybersecurity analysts. What to draw from this advice is that programming knowledge gives you an edge over other security professionals without those skills. Cybersecurity professionals might use their coding skills to write tools that automate certain security tasks.

2 Areas of application of algorithms

Computer science is the study of problems, problem-solving, and the solutions that come out of the problem-solving process. Given a problem, a computer scientist’s goal is to develop an algorithm, a step-by-step list of instructions for solving any instance of the problem that might arise. Algorithms are finite processes that, if followed, will solve the problem. Algorithms should be taught using the native language of the students. Before the algorithmic implementation, it is necessary to choose the programming language.
The police apply statistical or machine learning algorithms to data from police records on the time, location, and nature of past crimes, looking for patterns that indicate where crime may occur in the future. Predictive policing does not replace conventional policing methods (e.g. problem-oriented policing, intelligence-led policing or hotspot policing) but enhances these traditional practices by applying advanced statistical models and algorithms (National Institute of Justice (NIJ), 2014). Perry (2013) claims that predictive algorithms [1] can be used to identify members of criminal groups that show an elevated risk of a violence outbreak between them [2].
A malicious pattern detection engine is a real-life example of the use of searching and sorting algorithms [3]. SQL injection is perhaps one of the most common application layer attacks. In their 2014 study [4], Liban and Hilles developed a SQL-injection vulnerability scanning tool named MYSQL Injector, which automatically creates SQL-injection attacks using a time-based attack with the Inference Binary Search Algorithm. Denial of Service (DoS) is a very important problem that needs to be dealt with seriously in the security domain. Khan and Traore propose a regression-analysis-based model [5] that can prevent algorithmic complexity attacks, and they demonstrate their model on the quick-sort algorithm in the case of DoS attacks.

3 Using traditional data structures and algorithms as part of the AI systems for cybersecurity

The strategy for learning an algorithm follows these steps: write pseudocode, implement it in any programming language following the pseudocode, test correctness by checking the output for different inputs, and analyse complexity. Development and usage of data structures and algorithms represent an initial step in forensic investigation, which further reveals holes in cyberspace, pointing out which places in the code and in the infrastructure need to be fixed [6]. It should be kept in mind that attackers have many tools for making attacks more and more sophisticated. This is where Python comes to the fore: its code reads mostly like plain English, and most activities can be automated using Python scripts. In the following paragraphs we point out which data structures and algorithms can be implemented in Python in the area of cybersecurity [7].
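As an illustration of the learning steps listed in Section 3 (pseudocode, implementation, testing with different inputs, complexity analysis), the following minimal sketch applies them to binary search on a sorted list, the algorithm described in Section 3.1.2. The function name binary_search and the sample port list are hypothetical and are not taken from the paper.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent.

    Assumes sorted_items is sorted in ascending order.
    """
    # Pseudocode: low = first index, high = last index
    low, high = 0, len(sorted_items) - 1
    # Pseudocode: while the range is not empty, compare with the middle element
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if target < sorted_items[mid]:
            high = mid - 1  # continue in the left half
        else:
            low = mid + 1   # continue in the right half
    return -1


# Testing step: check the output for different inputs
# (hypothetical sorted list of port numbers).
open_ports = [22, 80, 443, 3306, 8080]
assert binary_search(open_ports, 443) == 2
assert binary_search(open_ports, 22) == 0
assert binary_search(open_ports, 8080) == 4
assert binary_search(open_ports, 21) == -1
assert binary_search([], 80) == -1

# Complexity step: every iteration halves the remaining range,
# so the number of comparisons grows as O(log n).
```

The checks cover the first, middle and last positions, a missing value and the empty list, which are the inputs most likely to expose an off-by-one error.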
3.1.2 Binary search
In binary search, we check whether an element is present in a sorted list by repeatedly comparing it with the middle element. If the element to be found is equal to the middle element, then we have already found it; otherwise, if it is smaller, we know it lies in the left half, and if it is larger, in the right half. The idea is to keep comparing the element with the middle value of the part that remains. This way, with each comparison we eliminate one half of the list.

Table 3. The pseudocode for storing graphs algorithm
ADJACENCYMATRIX (n)
    for i in 1 to n
        for j in 1 to n
            A[i][j]=0 // create empty matrix-2D array

MAKEEDGE (to,from, undirected)
    if undirected ==true
        A[from][to]=1
        A[to][from]=1
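The pseudocode of Table 3 could be written in Python roughly as follows. This is a minimal sketch under two assumptions that are not in the table: indices start at 0 instead of 1, and a directed edge (a case the table does not show) sets only the entry for its own direction. The function names and the three-node example are hypothetical.

```python
def adjacency_matrix(n):
    """Return an n x n matrix of zeros, i.e. a graph with no edges yet."""
    return [[0] * n for _ in range(n)]


def make_edge(matrix, src, dst, undirected=True):
    """Record an edge from src to dst; mirror it if the graph is undirected."""
    matrix[src][dst] = 1
    if undirected:
        matrix[dst][src] = 1


# Hypothetical example: three hosts numbered 0..2.
graph = adjacency_matrix(3)
make_edge(graph, 0, 1)                    # undirected link between 0 and 1
make_edge(graph, 1, 2, undirected=False)  # assumed directed case: 1 -> 2 only

for row in graph:
    print(row)
```

Each row is built as a separate list, so changing one row does not affect the others, which would happen with `[[0] * n] * n`.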
Figure 3. Visual representation of the Binary Search algorithm with example

3.1.4 Eratosthenes - Prime Numbers
The sieve of Eratosthenes is one of the most efficient ways to find all primes smaller than n when n is smaller than 10 million or so. A prime number is a number greater than 1 that is divisible only by 1 and itself. To test a number, we have to find out whether it has any other divisors and is therefore composite. In this case, we would discard it as a prime number. The Greek mathematician Eratosthenes designed a quick way to find all the prime numbers up to a given limit. It is a process called the Sieve of Eratosthenes.
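The sieve can be sketched in Python as shown below. This is an illustrative version, not the implementation measured in the paper: the function name is hypothetical, and the inner loop starts crossing out at i * i, a common optimisation, whereas the pseudocode of Table 4 starts at i + i. Both variants mark the same composites.

```python
from math import isqrt


def sieve_of_eratosthenes(limit):
    """Return a list of all primes p with 2 <= p <= limit."""
    if limit < 2:
        return []
    # is_prime[k] stays True until k is crossed out as a multiple of a prime.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, isqrt(limit) + 1):
        if is_prime[i]:
            # Smaller multiples of i were already crossed out by smaller primes.
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [k for k, flag in enumerate(is_prime) if flag]


print(sieve_of_eratosthenes(30))
```

Only candidates up to the square root of the limit need to be used for crossing out, because every composite number up to the limit has a prime factor no larger than that square root.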
Table 4. The pseudocode for Eratosthenes/Prime algorithm with natural language description

logical_type A[101]
A[0]=false, A[1]=false
for i=2,..,100 repeat
    A[i]=true
for i=2,..,sqrt(100) repeat
    if A[i]==true
        j=i+i
        while j<=100 repeat
            A[j]=false
            j=j+i
for i=0,..,100 repeat
    if A[i]==true
        print i

3.2. Measuring Performance of Algorithms: Use case on Eratosthenes/Prime algorithm

Prime numbers are fundamental to the most common type of encryption used today: the RSA algorithm. The RSA and Elliptic Curve asymmetric algorithms are based on prime numbers. These numbers have interesting properties that make them well suited to cryptography. Cryptography relies heavily on number theory, and every integer greater than 1 is a product of primes, so you deal with primes a lot in number theory. Fermat's Little Theorem states that the following equation is true for any prime number m and any whole number a that is not a multiple of m:

$a^{m-1} \equiv 1 \pmod{m}$ (1)

According to the description given in the previous paragraphs, we present the results of experiments obtained by testing implementations of the Eratosthenes/Prime algorithm in different programming languages. The purpose of this experiment was to check which programming language gives the best-performing implementation. The experiment was performed via the competitive programming website CodeChef. This platform supports a large number of programming languages, without the need for a user to install support for them on a local computer.
Table 5 gives the execution time and the memory consumed by the algorithm. Regarding the Python programming language, in our experiments we used version 2 and the newer version 3 in order to check whether there is any progress in the speed of program execution. The obtained results suggest that Python 3 not only performs the algorithm faster, but also consumes less memory. Among all tested programming languages, Python uses the least memory. JavaScript, Java and C# showed the worst results, but this is not surprising keeping in mind the fact that these programming languages are executed on the client side. JavaScript achieved the worst execution time, while Java takes the most memory.
Libraries in programming languages always make our code easier to develop, so here we discuss some library functions in Python for working with prime numbers. SymPy is a Python module which contains some really practical prime-number-related library functions. The SymPy function primerange(a, b) generates all prime numbers in the range [a, b). Moreover, Python is a cross-platform programming language that can be used as a script or as an application. It comes with the pip package manager, which allows easy installation and exchange of packages, the useful library modules available in the Python Package Index (PyPI) repository [8]. By using Python, a cybersecurity professional can create a quick response to a cyber attack through its vast treasure of libraries.

4 Using sequential data structures in Python and Python extension modules

Data structures are complex types of data that include multiple elements of different names and different types. They are stored in a single block of memory within the main program and are accessed through a single cursor or an identifier. For example, if we have a simple data structure that has two elements, the title and the length of a song, the identifier of the structure is the title field.
Knowledge of Python sequential data structures is necessary for creating functional penetration testing
tools and applications in cybersecurity [9]. A fundamental understanding of the language’s data structures that are applicable to penetration testing and cybersecurity is a must for those who want to advance in this area. Some of the fundamental data structures that can be implemented in Python are the array, the queue and the linked list.
The Python programming language offers extension modules and libraries that add the possibility of using data structures. Some of these modules and libraries are [10]:

NumPy module: Provides efficient operations on arrays of homogeneous data. One of these operations is numpy.sort. Although Python has built-in sort and sorted functions to work with lists, numpy.sort operates on arrays and uses a quicksort algorithm by default.

Pandas: It includes sophisticated methods for data structure manipulation. By using hierarchical axis indexing it provides an intuitive way of working with high-dimensional data in lower-dimensional data structures.

Bisect module: Binary search is a fast algorithm for searching sorted sequences. It is available in a standard Python module for binary searches, named 'bisect'. The module bisect.py provides support for maintaining a list in sorted order without having to sort the list after each insertion.

NetworkX module: Python library for studying graphs and networks. The function from_numpy_matrix returns a graph built from a NumPy matrix. NetworkX uses an adjacency dictionary representation. The main emphasis of NetworkX is to avoid the whole issue of hairballs. The use of simple calls hides much of the complexity of working with graphs and adjacency matrices from view.

5 Conclusion

Python has grown to become one of the top programming languages in the world, with more developers than ever now using it in the IT sector. In cybersecurity, Python is used predominantly because of the powerful packages it has to offer. Traditional programming languages like C, C++ etc. force programmers to deal with details of data structures and supporting routines, rather than algorithm design. Python represents an algorithm-oriented language that has been sorely needed in cybersecurity education. Our research showed that implementation of the analyzed algorithms in Python enables development of flexible and functional applications that deal with most of today's cybersecurity issues.
Technology and threats are constantly evolving; if the skills of cybersecurity professionals don't evolve with them, they will become ineffective and irrelevant, unable to provide the vital defenses that organizations increasingly require.

Acknowledgment
This paper is part of the research projects no. TR 32023, TR 35026 and subproject 3 in project no. III 47016, supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia.

Literature
[1] W. L. Perry et al.: Predictive policing: the role of crime forecasting in law enforcement operations. Santa Monica, CA: RAND, 2013.
[2] L. Bennett Moses, & J. Chan: Algorithmic prediction in policing: assumptions, evaluation, and accountability. Policing and Society, 28(7), 806-822, 2018.
[3] N. Singh, C. B. Kaverappa, & J. D. Joshi: Data Mining for Prevention of Crimes. In: Yamamoto S., Mori H. (eds) Human Interface and the Management of Information. Interaction, Visualization, and Analytics. HIMI 2018. Lecture Notes in Computer Science, vol 10904. Springer, Cham.
[4] A. Liban, & S. M. Hilles: Enhancing Mysql Injector vulnerability checker tool (Mysql Injector) using inference binary search algorithm for blind timing-based attack. In 2014 IEEE 5th Control and System Graduate Research Colloquium (pp. 47-52). IEEE.
[5] S. Khan, & I. Traore: A prevention model for algorithmic complexity attacks. In 2005 International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (pp. 160-173). Springer, Berlin, Heidelberg.
[6] J. M. de Fuentes, L. González-Manzano, J. Tapiador, & P. Peris-Lopez: PRACIS: Privacy-preserving and aggregatable cybersecurity information sharing. Computers & Security, 69, 127-141, 2017.
[7] A. Epishkina, & S. Zapechnikov: A syllabus on data mining and machine learning with applications to cybersecurity. In 2016 Third International Conference on Digital Information Processing, Data Mining, and Wireless Communications (DIPDMWC) (pp. 194-199). IEEE.
[8] S. Šandi, T. Popović, & B. Krstajić: Python implementation of IEEE C37.118 communication protocol. ETF Journal of Electrical Engineering, 21(1), 108-117, 2015.
[9] K. J. Knapp, C. Maurer, & M. Plachkinova: Maintaining a Cybersecurity Curriculum: Professional Certifications as Valuable Guidance. Journal of Information Systems Education, 28(2), 101-114, 2017.
[10] https://scipy.org/, accessed July 2019.