2018 Joint 10th International Conference on Soft Computing and Intelligent Systems and 19th International Symposium
on Advanced Intelligent Systems
Autonomous Machine Learning Modeling using a Task Ontology
Kyoung Soon Hwang, Ki Sun Park, Sang Hyun Lee, Kwang il Kim, Keon Myung Lee*
Dept. Computer Science, Chungbuk National University, Cheongju, Korea
*corresponding

Abstract - Many researchers are actively investigating artificial intelligence technology that recognizes, learns from, reasons about, and acts on external information across a wide range of fields by combining computing technologies, big data, and machine learning algorithms. Artificial intelligence is now used in almost all industries, and many machine learning experts are working on integrating and standardizing machine learning tools so that non-experts can easily apply them to their own domains. Researchers are also studying autonomous machine learning as well as ontology construction for standardizing machine learning concepts. In this paper, we classify the typical problem-solving steps of autonomous machine learning as tasks and present a problem-solving process. We propose a modeling method for autonomous machine learning that uses the task execution processes of machine learning, such as workflows. The proposed task ontology-based machine learning model defines a task-based process grouping scheme of UML activities. It will automatically generate and extend machine learning models through transformation rules based on common elements and structures (relationships and processes between elements).

Keywords - Machine Learning; Deep Learning; Autonomous Machine Learning; Task Ontology; Machine Learning Ontology

I. INTRODUCTION
Artificial intelligence technology has recently become one of the most essential tools in research and business. Most machine learning frameworks are open source, which has lowered the barrier to entry into machine learning. Typical machine learning frameworks include TensorFlow[1], Keras[2], Caffe[3], Scikit-learn[4], and Theano[5], implemented in programming languages such as Python, Java, and R. In this respect, many machine learning experts are working on integrating and standardizing various tools so that machine learning non-experts can easily apply them to their domains.

Autonomous machine learning[6], on the other hand, is still in its infancy, although some techniques already reduce the unnecessary work involved in progressively refining a model to prepare it and improve its accuracy. Autonomous machine learning tools provide an optimal algorithm for a machine learning task and functions that determine the hyper-parameter settings through self-analysis. Typical tools include Auto-sklearn[7], Auto-WEKA[8], H2o Driverless AI[9], and Google's AutoML[10].

In this paper, we describe the typical problem-solving process of machine learning as tasks, present their procedure, and propose a modeling method for autonomous machine learning that uses task execution processes. The modeling method, based on the task ontology, defines a structure-based grouping method of UML (Unified Modeling Language)[11] activities and implements a function to automatically generate models based on common elements and structures.

The purpose of the proposed autonomous machine learning model is to model autonomous machine learning by reusing existing resources and producing new knowledge through relearning. This paper is structured as follows: Section II discusses related work. Section III presents the process and method for designing the proposed autonomous machine learning model. Finally, Section IV presents the conclusion and future research.

II. RELATED WORKS
Related work is discussed in two parts: task ontologies and machine learning ontologies.
978-1-5386-2633-7/18/$31.00 ©2018 IEEE 244
DOI 10.1109/SCIS-ISIS.2018.00051
Authorized licensed use limited to: Sri Sai Ram Engineering College. Downloaded on November 19,2024 at 08:38:40 UTC from IEEE Xplore. Restrictions apply.
A. Task ontologies
An ontology is defined differently depending on the field of application. In artificial intelligence, it is an explicit and formal specification of how the objects and concepts in a field of interest are described. In the Semantic Web, an ontology plays a very important role in processing, sharing, and reusing knowledge when exchanging information between different databases. An ontology is also defined as an explicit description of the concepts, attributes, constraints, and relationships between them in a domain.

A domain ontology, on the other hand, can be defined as an 'explicit protocol for conceptualization' of the problem. A task ontology is defined as 'extracting and organizing the concepts and relations existing in the problem solving process domain-independently'. In particular, a task ontology is a specification of the concept structure for the task execution process[12-15]. Its core concepts are therefore the subject that performs the processing and the processing procedure for solving a problem. In general, a person is the subject in a task ontology. In this paper, however, agents (programs) are the subjects that perform the tasks.

The Exposé ontology[12] is an ontology for machine learning experiments. It is used in OpenML as a data structuring and data sharing (API) method. Machine Learning (ML) Schema[16] is used to export all OpenML data as linked open data. The DMOP ontology is explicitly designed to support data mining and machine learning; it covers the structure and parameters of predictive models, associated cost functions, and optimization strategies. The OntoDM ontology provides a unique framework for data mining research.

B. Machine learning ontologies
ML Schema[16] is a top-level ontology, proposed by the W3C ML Schema community group, that provides classes, properties, and constraints for machine learning algorithms, datasets, and experiments. It can be easily extended and refined, and it can be mapped to other domain ontologies developed in the field of machine learning and data mining.

The MEX vocabulary[17,18] was designed to address the sharing of provenance information in a lightweight form. The extended PROV ontology provides a model for representing, capturing, and sharing provenance information on the Web. This enables the use of analytical data and code so that another person can reuse the results. The code and the markup language are written in a single file and processed to create a document.

Provenance meta-information[17] was proposed by the W3C as a standard model of data management. Provenance information is “information about entities, activities, and people involved in producing a piece of data which can be used to form assessments about its quality, reliability or trustworthiness”. SPARQL, the standard query language, is similar to SQL and is used to query data stored in the Resource Description Framework (RDF).

III. DESIGN OF AUTONOMOUS MACHINE LEARNING MODEL BASED ON TASK ONTOLOGY
The design of the autonomous machine learning model is based on the task ontology. We discuss three key aspects of the design: a method of acquiring machine learning vocabulary, knowledge representation by a concept graph, and a UML-based meta-model for implementation.

A. Method of Acquiring Vocabulary on Machine Learning
The vocabulary is collected by extracting words from a paper, a textbook, or a machine learning library (API) tutorial, and then selecting keywords from the index and title of the textbook. The frequency with which the extracted words coincide with each keyword is calculated, and the keyword is labeled with a category item. Based on the MEX vocabulary, the method generates metadata about the machine learning execution, the algorithm environment, and the execution result. Fig. 1 illustrates the vocabulary acquisition procedure for machine learning.

Fig. 1. A vocabulary acquisition method of the machine learning

The task ontology of machine learning is modeled with the following items.
1. Project includes user information and task descriptions for the entire experiment.
2. Experimental information describes 'what, when, who, and with what ratings' a task was performed. A task is a machine learning procedure that includes information such as data preparation, data preprocessing, preparation of learning data, machine learning model setting, parameter setting of the machine learning model, hyper-parameter setting, and code generation (implementation: Python code).
3. Data describes the characteristics (attributes), the type of data (training, validation, and test data sets), and the data format (text, csv, image, json, xml …).
4. The ML framework describes information such as its name and features. It describes the type of learning (machine learning or deep learning), the model name, parameters, and so on, as well as the model evaluation algorithm.
5. The algorithm describes information such as the learning type, algorithm name, and parameters.
6. The hyper-parameter describes information about epochs, batch size, learning rate, etc. for machine learning (deep learning).
7. The implementation manages systematic information about the users, the date of use, and the type of use of an applied model. It also stores and manages information about the generated models.
8. The generated machine learning code includes the prerequisites for computing information in the run configuration information for execution.
9. The execution environment describes information about the coding language, libraries, software, and so on.
10. Performance describes the evaluation methods and procedures for evaluating the applied model.

Fig. 2 illustrates an overview of the task ontology diagram for machine learning. The sub nodes of each concept node are hierarchically subdivided and represented in a markup language such as XML or RDF.

B. Representation - Concept graph
Machine learning knowledge is systematically described as terms, methods (function: executable script), execution environments, and procedures. This means that the next piece of work is performed only after confirming that the preceding work produced a normal output. Based on this, it is possible to define a structure-based grouping scheme of activities and to automatically generate a model from common rules (functions) and structures (relations and processes between elements) by applying conversion rules.

When we design a machine learning workflow, we gradually create a concept graph. The concept graph is a graph representation of the concept structure. The conceptual structure is expressed as a frame, and the reference object is expanded using the preorder expression of the keyword, that is, an expression in which the keyword precedes the relevant machine learning word in the sentence extracted from the document. Fig. 3 shows a part of the conceptual graph for machine learning.

Fig. 3. A part of conceptual graph

The conceptual graph consists of concept nodes and relationship nodes, which represent the relationships between concepts. A square is a concept node and an ellipse is a relationship node. The following is an example of the concept graph for Fig. 3.

[Run]-
  (Imple)-[Code:CNNCode_01.py]
  (Model)-[CNN:CNNmodel_01]
  (Data)-[Dataset:MNIST]

The concept graph specifies the sub nodes of a concept node found through the concept node search and expresses them in a linear form through materialized views. Fig. 4 illustrates the concept node of the algorithm. For example, when the sub node of the training model is extended, it expands into the parameter settings.

Fig. 4. A part of workflow concerning the deep learning modeling

Based on the machine learning vocabulary, the workflow is visually represented by hierarchically connecting only the relevant nodes.

C. Autonomous machine learning modeling
Autonomous machine learning modeling is the work of standardizing and abstracting the core components based on the meta-information of machine learning. The model consists of tasks and processes and is saved as a method library (API). It is defined in small units so that its components are modular and systematic. The defined components are redefined as a UML-based meta-model to ensure consistency, traceability, reusability, and implementation-readiness between the tasks and the results. The core classes of the UML-based meta-model therefore consist of tasks and processes.

The knowledge of autonomous machine learning is also described in small task units based on the MEX vocabulary. Fig. 5 depicts a part of the knowledge for object detection using the “YOLO” deep learning algorithm. The machine learning pipeline for object detection consists of data import, deciding on attribute selection or a schema, selection of the learning model, construction of the learning model, hyper-parameter setting, model training, measurement of model performance, and so on.

In this way, the knowledge representation of a project unit is written as ‘.json’ files using mapping rules based on the machine learning schema and the vocabulary, and is then converted into a UML-based meta-model. In this model, the objectives, optimizers, metrics, and layers of the Keras API form the meta-model for deep learning.
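As a rough illustration of how a ‘.json’ knowledge file for a project unit (Section III.C) might be mapped onto the two core classes of the meta-model, tasks and processes, consider the following Python sketch. The JSON field names (`project`, `tasks`, `name`, `params`) and the class names `Task` and `Process` are assumptions made for this example; the paper does not specify the file layout or the mapping rules.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Task:
    """One small unit of work, e.g. 'hyper-parameter setting'."""
    name: str
    params: dict = field(default_factory=dict)


@dataclass
class Process:
    """An ordered group of tasks belonging to one project."""
    project: str
    tasks: list = field(default_factory=list)


def load_process(text: str) -> Process:
    """Map a JSON knowledge description onto Task/Process objects."""
    raw = json.loads(text)
    tasks = [Task(t["name"], t.get("params", {})) for t in raw["tasks"]]
    return Process(project=raw["project"], tasks=tasks)


# Hypothetical project-unit description; all values are illustrative.
EXAMPLE_JSON = """
{
  "project": "object-detection-demo",
  "tasks": [
    {"name": "data import", "params": {"format": "image"}},
    {"name": "hyper-parameter setting",
     "params": {"epochs": 10, "batch_size": 32, "learning_rate": 0.001}},
    {"name": "model training"}
  ]
}
"""

process = load_process(EXAMPLE_JSON)
for step, task in enumerate(process.tasks, start=1):
    print(step, task.name, task.params)
```

Keeping tasks as plain data objects is one way to make the same description reusable for code generation and for drawing the UML diagram; other encodings are equally possible.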
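The linear form of the concept graph given in Section III.B can be produced mechanically from a small data structure. The sketch below is one possible encoding, with concept nodes written in square brackets and relationship nodes in parentheses; the `Concept` class and the `to_linear_form` function are hypothetical names introduced for illustration and are not part of the proposed system.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Concept:
    """A concept node: a type label plus an optional referent."""
    type_label: str
    referent: Optional[str] = None
    # Outgoing edges as (relationship name, target concept) pairs.
    relations: List[Tuple[str, "Concept"]] = field(default_factory=list)

    def label(self) -> str:
        if self.referent is None:
            return f"[{self.type_label}]"
        return f"[{self.type_label}:{self.referent}]"


def to_linear_form(root: Concept) -> str:
    """Render a root concept and its direct relations as text lines."""
    lines = [f"{root.label()}-"]
    for relation, target in root.relations:
        lines.append(f"  ({relation})-{target.label()}")
    return "\n".join(lines)


# Rebuild the [Run] example from the text.
run = Concept("Run")
run.relations.append(("Imple", Concept("Code", "CNNCode_01.py")))
run.relations.append(("Model", Concept("CNN", "CNNmodel_01")))
run.relations.append(("Data", Concept("Dataset", "MNIST")))

print(to_linear_form(run))
```

This sketch only renders one level of relations; expanding a sub node, as described for the training model, would require applying the same function recursively to the target concepts.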
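The vocabulary acquisition step of Section III.A counts how often extracted words coincide with selected keywords and labels them with a category item. A minimal sketch of that idea is given below. The keyword-to-category table, the tokenization by a simple regular expression, and the function name `label_vocabulary` are assumptions for this example and are not details taken from the paper.

```python
import re
from collections import Counter
from typing import Dict, Tuple

# Hypothetical keyword -> category table (e.g. built from a textbook index).
KEYWORD_CATEGORIES: Dict[str, str] = {
    "epoch": "hyper-parameter",
    "optimizer": "algorithm",
    "dataset": "data",
}


def label_vocabulary(text: str) -> Dict[str, Tuple[int, str]]:
    """Return {keyword: (frequency, category)} for keywords found in text."""
    # Lower-case the text and split it into simple alphabetic tokens.
    words = re.findall(r"[a-z][a-z\-]*", text.lower())
    counts = Counter(w for w in words if w in KEYWORD_CATEGORIES)
    return {w: (n, KEYWORD_CATEGORIES[w]) for w, n in counts.items()}


sample = (
    "Load the dataset, choose an optimizer, and train for one epoch. "
    "After each epoch, evaluate on the validation dataset."
)
for word, (freq, category) in sorted(label_vocabulary(sample).items()):
    print(word, freq, category)
```

Exact string matching is the simplest possible notion of "coincidence with the keyword"; a real system would likely need stemming or plural handling, which is left out here.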
Fig. 5. A part of the image detection knowledge

Fig. 6 shows a part of the UML-based meta-model for autonomous machine learning. The proposed model consists of an objective function, an optimization algorithm, and layers. The objective function is the loss function and fitness function of the machine learning model. Optimization algorithms include Stochastic Gradient Descent, Momentum, AdaGrad, RMSProp, Adam, and so on. The layers include fully-connected, convolution, pooling, dropout, and recurrent layers.

The proposed modeling method can gradually accumulate overall machine learning knowledge by adding a machine learning automation module for a specific work process, and an autonomous machine learning framework can easily be built on this model.

IV. CONCLUSIONS
In this paper, we extracted important keywords for constructing an ontology from papers and textbooks about machine learning. We also designed a task ontology based on the MEX vocabulary and studied workflows for the autonomous machine learning model. The proposed method is applicable to automatic workflows according to the designated autonomous level. Therefore, non-experts can perform complex tasks using the proposed method and can easily implement a machine learning model in a specific application.

ACKNOWLEDGMENT
This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (No.: NRF-2017M3C4A7069432).

REFERENCES
[1] A. Martín, B. Paul Barha, C. Jianmin, C. Zhifeng, D. Andy, D. Jeffrey, et al., “TensorFlow: A System for Large-Scale Machine Learning,” Operating Systems Design and Implementation, Vol. 16, pp. 265-283, 2016.
[2] Keras, https://keras.io/, accessed on April 05, 2018.
[3] J. Yangqing, S. Evan, D. Jeff, K. Sergey, L. Jonathan, G. Ross, et al., “Caffe: Convolutional Architecture for Fast Feature Embedding,” Proceedings of the 22nd ACM International Conference on Multimedia, pp. 675-678, 2014.
[4] P. Fabian, V. Gaël, G. Alexandre, M. Vincent, T. Bertrand, G. Olivier, et al., “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, pp. 2825-2830, 2011.
[5] Theano, http://deeplearning.net/software/theano/, accessed on April 05, 2018.
[6] K. M. Lee, K. Y. Kim, J. S. Yoo, “Autonomicity Levels and Requirements for Automated Machine Learning,” Proceedings of the International Conference on Research in Adaptive and Convergent Systems, ACM, pp. 46-48, Sept. 2017.
[7] Auto-sklearn, https://automl.github.io/auto-sklearn/stable/
[8] Auto-WEKA, https://www.cs.ubc.ca/labs/beta/Projects/autoweka/
[9] H2o Driverless AI, https://www.h2o.ai/driverless-ai/
[10] Google's AutoML, https://ai.googleblog.com/2017/11/automl-for-large-scale-image.html, 2017.
[11] C. Perez, The Deep Learning A.I. Playbook: Strategy for Disruptive Artificial Intelligence, I.M.: Intuition Machine, 2017.
[12] V. Joaquin, S. Larisa, “Exposé: An Ontology for Data Mining Experiments,” Third Generation Data Mining Workshop at ECML PKDD 2010.
[13] I. Mitsuru, S. Kazuhisa, K. Osamu, M. Riichiro, “Task ontology: Ontology for building conceptual problem solving models,” Proceedings of ECAI98 Workshop on Applications of Ontologies and Problem-Solving model, pp. 126-133, ECA, 1998.
[14] A. F. Martins, R. A. F. De, “Models for Representing Task Ontologies,” Proceedings of the 3rd Workshop on Ontologies and their Application, 2008.
[15] S. Kanjana, S. Maleerat, “Ontology Knowledge-Based Framework for Machine Learning Concept,” iiWAS '16 Proceedings of the 18th International Conference on Information Integration and Web-based Applications and Services, pp. 50-53, 2016.
[16] P. Gustavo Correa, E. Diego, L. Agnieszka, P. Panče, S. Larisa, S. Tommaso, V. Joaquin, Z. Hamid, “ML Schema: Exposing the Semantics of Machine Learning with Schemas and Ontologies,” ICML 2018 - RML Workshop.
[17] G. Eason, B. Noble, and I. N. Sneddon, “MEX Interfaces: Automating Machine Learning Metadata Generation,” https://www.researchgate.net/publication/305143958, 2016.
[18] E. Diego, N. M. Pablo, M. Diego, C. D. Julio, Z. Amrapali, L. Jens, “MEX Vocabulary: A Lightweight Interchange Format for Machine Learning Experiments,” SEMANTiCS 2016 Proceedings of the 12th International Conference on Semantic Systems, pp. 17-24, 2016.
Fig. 2. The overview of the task ontology on the machine learning
Fig. 6. A part of the deep learning model based on UML