Graph Mining Tools
Indian Journal of Science and Technology, Vol 7(S7), 188–190, November 2014 ISSN (Online) : 0974-5645
Abstract
Data mining has a long history and has attracted strong attention from researchers in many different fields, including
database design, statistics, pattern recognition, machine learning, and data visualization. Data mining is the process of
finding predictive models from the given data. In this paper we give an overview of graph mining tasks and of the tools
used for mining data represented as graphs.
1. Introduction
Data mining is the process of discovering knowledge from data. An emerging research field of computer science, it is defined as the non-trivial process of identifying valid, potentially useful, and ultimately understandable patterns in data. Graph mining is the process of gathering and analyzing data represented as graphs [4]. It has recently become an important research topic because of its numerous applications to a wide variety of data mining problems in biology, chemistry, management, business, and communication networking [3]. Graph mining is a relatively new area of research that nevertheless has a solid base in classic graph theory, computational cost considerations, and sociological concepts such as how individuals interrelate, group together, and follow one another. The paper is structured as follows. Section 2 discusses the various graph mining techniques and tools, Section 3 covers the issues of graph mining, and Section 4 gives the conclusion.
2. Graph Mining Techniques and Tools
2.1 Techniques
Because graph mining is a special case of data mining, many data mining approaches have been extended to graph mining as well. The main approaches followed for mining graph data are:
• Mining frequent sub graphs
• Classification
• Clustering
2.1.1 Mining Frequent Sub Graphs
Frequent sub graphs, as the name suggests, are sub graphs that occur frequently in data represented as graphs [1]. They are useful for characterizing graph sets, discriminating different groups of graphs, classifying and clustering graph sets, building graph indices, and facilitating similarity search in graph databases [2]. A substructure may take different structural forms, such as graphs, trees, or lattices, which may be combined with item sets or subsequences. If a substructure occurs frequently, it is called a (frequent) structured pattern. Graph mining may include mining frequent sub graph patterns, graph classification, clustering, and other analysis tasks.
2.1.2 Classification
Classification is the process of finding a model that describes and distinguishes data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown. The model is derived from the analysis of a set of training data. There are various methods for constructing classification models, such as naive Bayesian classification, support vector machines, and k-nearest neighbor classification.
2.1.3 Clustering
Clustering is under vigorous development. Contributing areas of research include data mining, statistics, machine learning, spatial database technology, biology, and marketing. Owing to the huge amounts of data collected in databases, cluster analysis has recently become a highly active topic in data mining research. Clustering of graphs involves a (possibly large) number of graphs that need to be clustered based on their underlying structural behavior. This problem is challenging because of the need to match the structures of the underlying graphs and to use these structures for clustering purposes.
2.2 Tools
There are several tools available for graph mining. Some of them are given here.
2.2.1 Cytoscape
Cytoscape is a graph mining tool developed in 2002 with funding from the National Institute of General Medical Sciences and the National Resource for Network Biology. The biomedical research community was the first to use it, and it is useful for understanding gene and protein interactions in biology.
2.2.2 Gephi
2.2.3 GraphInsight
Knowledge is distributed in social networks, and services are powered by cloud computing platforms. Humans are extremely good at identifying patterns and outliers. GraphInsight is useful because interacting visually with the data can give a better intuition and higher confidence in the field.
2.2.4 NetworkX
NetworkX is a Python language package for the exploration and analysis of networks and network algorithms. It provides data structures for representing many types of networks, or graphs (simple graphs, directed graphs, and graphs with parallel edges and self loops). Its flexibility makes it ideal for representing networks found in many different fields.
2.2.5 Social Networks Visualizer
Social Networks Visualizer, also known simply as SocNetV, is software that can compute almost every network property one might be interested in, including path lengths, clustering coefficients, and graph diameters. Users can also make use of the multitude of layout algorithms provided, as well as the random networks that can be quickly generated at the press of a button.
2.2.6 KNIME
KNIME is a modular platform for building and executing workflows using predefined components, called nodes. Core functionality is available for tasks such as standard data mining, analysis, and manipulation, and extra features and functionality are available in KNIME through extensions from various groups and vendors. It is written in Java and based on the Eclipse SDK platform.
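The network properties mentioned for the tools above (path lengths, clustering coefficients, and graph diameter) can be illustrated with a minimal pure-Python sketch. It is not taken from SocNetV, NetworkX, or any other tool listed here. The example graph `GRAPH` and all function names are assumptions made for illustration, and the graph is assumed to be undirected and connected.

```python
from collections import deque

# Hypothetical undirected example graph, stored as an adjacency dict.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter(graph):
    """Longest shortest path between any pair of nodes (connected graph assumed)."""
    return max(max(shortest_path_lengths(graph, n).values()) for n in graph)


def clustering_coefficient(graph, node):
    """Fraction of pairs of neighbors of node that are themselves linked."""
    neighbors = list(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to check
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))


if __name__ == "__main__":
    print(shortest_path_lengths(GRAPH, "a"))
    print(diameter(GRAPH))
    print(clustering_coefficient(GRAPH, "b"))
```

Breadth-first search is used here because the example graph is unweighted; a weighted graph would need a different shortest-path method, such as Dijkstra's algorithm.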
Graph Mining Techniques, Tools and Issues - A Study
upcoming research. This paper provides a researcher's perspective on overcoming the challenges in methods, data, and other issues of graph mining for social good.
5. Acknowledgement
The author would like to thank the team at AMET University for their full support.
6. References
1. Nettleton DF. Data mining of social networks represented as graphs. Elsevier. 2013; 7:1–34.
2. Du H. Data Mining Techniques and Applications: An Introduction, 1st Edition. Cengage Learning; 2010.
3. Han J, Kamber M. Data Mining: Concepts and Techniques, 2nd Edition. Morgan Kaufmann; 2006.
4. Chen MS, Han J, Yu PS. Data mining: an overview from a database perspective. IEEE Transactions on Knowledge and Data Engineering. 1996 Dec; 8(6):866–83.