ICT Innovations 2019 – Big Data Processing and Mining: 11th International Conference, ICT Innovations 2019, Ohrid, North Macedonia, October 17–19, 2019, Proceedings. Sonja Gievska (Ed.)

The ICT Innovations 2019 conference focused on Big Data Processing and Mining, held in Ohrid, North Macedonia from October 17-19, 2019. It brought together 184 authors from 24 countries to discuss advancements in various fields including social network analysis, natural language processing, and deep learning. The proceedings feature selected papers that contribute to the ongoing dialogue about challenges and innovations in big data mining.




Sonja Gievska
Gjorgji Madjarov (Eds.)

Communications in Computer and Information Science 1110

ICT Innovations 2019


Big Data Processing and Mining
11th International Conference, ICT Innovations 2019
Ohrid, North Macedonia, October 17–19, 2019
Proceedings
Communications in Computer and Information Science 1110
Commenced Publication in 2007
Founding and Former Series Editors:
Phoebe Chen, Alfredo Cuzzocrea, Xiaoyong Du, Orhun Kara, Ting Liu,
Krishna M. Sivalingam, Dominik Ślęzak, Takashi Washio, Xiaokang Yang,
and Junsong Yuan

Editorial Board Members


Simone Diniz Junqueira Barbosa
Pontifical Catholic University of Rio de Janeiro (PUC-Rio),
Rio de Janeiro, Brazil
Joaquim Filipe
Polytechnic Institute of Setúbal, Setúbal, Portugal
Ashish Ghosh
Indian Statistical Institute, Kolkata, India
Igor Kotenko
St. Petersburg Institute for Informatics and Automation of the Russian
Academy of Sciences, St. Petersburg, Russia
Lizhu Zhou
Tsinghua University, Beijing, China
More information about this series at http://www.springer.com/series/7899
Sonja Gievska Gjorgji Madjarov (Eds.)

ICT Innovations 2019


Big Data Processing and Mining
11th International Conference, ICT Innovations 2019
Ohrid, North Macedonia, October 17–19, 2019
Proceedings

Editors

Sonja Gievska
Saints Cyril and Methodius University of Skopje
Skopje, North Macedonia

Gjorgji Madjarov
Saints Cyril and Methodius University of Skopje
Skopje, North Macedonia

ISSN 1865-0929 ISSN 1865-0937 (electronic)


Communications in Computer and Information Science
ISBN 978-3-030-33109-2 ISBN 978-3-030-33110-8 (eBook)
https://doi.org/10.1007/978-3-030-33110-8

© Springer Nature Switzerland AG 2019


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, expressed or implied, with respect to the material contained herein or for any errors or
omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

The ICT Innovations conference series, organized by the Macedonian Society of
Information and Communication Technologies (ICT-ACT), is an international forum
for presenting scientific results related to innovative fundamental and applied research
in ICT. The 11th ICT Innovations 2019 conference, which brought together academics,
students, and industrial practitioners, was held in Ohrid, Republic of North Macedonia,
during October 17–19, 2019.
The focal point for this year’s conference was “Big Data Processing and Mining,”
with topics extending across several fields including social network analysis, natural
language processing, deep learning, sensor network analysis, bioinformatics, FinTech,
privacy, and security.
Big data is heralded as one of the most exciting challenges in data science, as well as
the next frontier of innovations. The spread of smart, ubiquitous computing and social
networking has brought to light more information to consider. Storage, integration,
processing, and analysis of massive quantities of data pose significant challenges that
have yet to be fully addressed. Extracting patterns from big data provides exciting new
fronts for behavioral analytics, predictive and prescriptive modeling, and knowledge
discovery. By leveraging the advances in deep learning, stream analytics, large-scale
graph analysis, and distributed data mining, a number of tasks in fields like biology,
games, robotics, commerce, transportation, and health care have been brought within
reach.
Some of these topics were brought to the forefront of the ICT Innovations 2019
conference. This book presents a selection of papers presented at the conference which
contributed to the discussions on various aspects of big data mining (including algo-
rithms, models, systems, and applications). The conference gathered 184 authors from
24 countries reporting their scientific work and solutions in ICT. Only 18 papers were
selected for this edition by the international Program Committee, consisting of 176
members from 43 countries, chosen for their scientific excellence in their specific fields.
We would like to express our sincere gratitude to the authors for sharing their most
recent research, practical solutions, and experiences, allowing us to contribute to the
discussion on the trends, opportunities, and challenges in the field of big data. We are
grateful to the reviewers for the dedicated support they provided to our thorough
reviewing process. Our work was made easier by following the procedures developed
and passed along by Prof. Slobodan Kalajdziski, the co-chair of the ICT Innovations
2018 conference. Special thanks to Ilinka Ivanoska, Bojana Koteska, and Monika
Simjanoska for their support in organizing the conference and for the technical
preparation of the conference proceedings.

October 2019

Sonja Gievska
Gjorgji Madjarov
Organization

Conference and Program Chairs


Sonja Gievska University Ss.Cyril and Methodius, North Macedonia
Gjorgji Madjarov University Ss.Cyril and Methodius, North Macedonia

Program Committee
Jugoslav Achkoski General Mihailo Apostolski Military Academy,
North Macedonia
Nevena Ackovska University Ss.Cyril and Methodius, North Macedonia
Syed Ahsan Technische Universität Graz, Austria
Marco Aiello University of Groningen, The Netherlands
Azir Aliu Southeastern European University of North Macedonia,
North Macedonia
Luis Alvarez Sabucedo Universidade de Vigo, Spain
Ljupcho Antovski University Ss.Cyril and Methodius, North Macedonia
Stulman Ariel The Jerusalem College of Technology, Israel
Goce Armenski University Ss.Cyril and Methodius, North Macedonia
Hrachya Astsatryan National Academy of Sciences of Armenia, Armenia
Tsonka Baicheva Bulgarian Academy of Science, Bulgaria
Verica Bakeva University Ss.Cyril and Methodius, North Macedonia
Antun Balaz Institute of Physics Belgrade, Serbia
Lasko Basnarkov University Ss.Cyril and Methodius, North Macedonia
Slobodan Bojanic Universidad Politécnica de Madrid, Spain
Erik Bongcam-Rudloff SLU-Global Bioinformatics Centre, Sweden
Singh Brajesh Kumar RBS College, India
Torsten Braun University of Berne, Switzerland
Andrej Brodnik University of Ljubljana, Slovenia
Francesc Burrull Universidad Politécnica de Cartagena, Spain
Neville Calleja University of Malta, Malta
Valderrama Carlos UMons University of Mons, Belgium
Ivan Chorbev University Ss.Cyril and Methodius, North Macedonia
Ioanna Chouvarda Aristotle University of Thessaloniki, Greece
Trefois Christophe University of Luxembourg, Luxembourg
Betim Cico Epoka University, Albania
Emmanuel Conchon Institut de Recherche en Informatique de Toulouse,
France
Robertas Damasevicius Kaunas University of Technology, Lithuania
Pasqua D’Ambra IAC, CNR, Italy
Danco Davcev University Ss.Cyril and Methodius, North Macedonia
Antonio De Nicola ENEA, Italy
Domenica D’Elia Institute for Biomedical Technologies, Italy


Vesna Dimitrievska Ristovska University Ss.Cyril and Methodius, North Macedonia
Vesna Dimitrova University Ss.Cyril and Methodius, North Macedonia
Ivica Dimitrovski University Ss.Cyril and Methodius, North Macedonia
Salvatore Distefano University of Messina, Italy
Milena Djukanovic University of Montenegro, Montenegro
Ciprian Dobre University Politehnica of Bucharest, Romania
Martin Drlik Constantine the Philosopher University in Nitra,
Slovakia
Sissy Efthimiadou ELGO Dimitra Agricultural Research Institute, Greece
Tome Eftimov Stanford University, USA, and Jožef Stefan Institute,
Slovenia
Stoimenova Eugenia Bulgarian Academy of Sciences, Bulgaria
Majlinda Fetaji Southeastern European University of North Macedonia,
North Macedonia
Sonja Filiposka University Ss.Cyril and Methodius, North Macedonia
Predrag Filipovikj Mälardalen University, Sweden
Ivan Ganchev University of Limerick, Ireland
Todor Ganchev Technical University Varna, Bulgaria
Nuno Garcia Universidade da Beira Interior, Portugal
Andrey Gavrilov Novosibirsk State Technical University, Russia
Ilche Georgievski University of Groningen, The Netherlands
John Gialelis University of Patras, Greece
Sonja Gievska University Ss.Cyril and Methodius, North Macedonia
Hristijan Gjoreski University Ss.Cyril and Methodius, North Macedonia
Dejan Gjorgjevikj University Ss.Cyril and Methodius, North Macedonia
Danilo Gligoroski Norwegian University of Science and Technology,
Norway
Rossitza Goleva Technical University of Sofia, Bulgaria
Andrej Grgurić Ericsson Nikola Tesla d.d., Croatia
David Guralnick International E-Learning Association (IELA), France
Marjan Gushev University Ss.Cyril and Methodius, North Macedonia
Elena Hadzieva University St. Paul the Apostle, North Macedonia
Violeta Holmes University of Huddersfield, UK
Ladislav Huraj University of SS. Cyril and Methodius, Slovakia
Sergio Ilarri University of Zaragoza, Spain
Natasha Ilievska University Ss.Cyril and Methodius, North Macedonia
Ilija Ilievski Graduate School for Integrative Sciences
and Engineering, Singapore
Mirjana Ivanovic University of Novi Sad, Serbia
Boro Jakimovski University Ss.Cyril and Methodius, North Macedonia
Smilka Janeska-Sarkanjac University Ss.Cyril and Methodius, North Macedonia
Mile Jovanov University Ss.Cyril and Methodius, North Macedonia
Milos Jovanovik University Ss.Cyril and Methodius, North Macedonia
Slobodan Kalajdziski University Ss.Cyril and Methodius, North Macedonia


Kalinka Kaloyanova FMI-University of Sofia, Bulgaria
Aneta Karaivanova Bulgarian Academy of Sciences, Bulgaria
Mirjana Kljajic Borstnar University of Maribor, Slovenia
Ljupcho Kocarev University Ss.Cyril and Methodius, North Macedonia
Dragi Kocev Jožef Stefan Institute, Slovenia
Margita Kon-Popovska University Ss.Cyril and Methodius, North Macedonia
Magdalena Kostoska University Ss.Cyril and Methodius, North Macedonia
Bojana Koteska University Ss.Cyril and Methodius, North Macedonia
Ivan Kraljevski VoiceINTERconnect GmbH, Germany
Andrea Kulakov University Ss.Cyril and Methodius, North Macedonia
Arianit Kurti Linnaeus University, Sweden
Xu Lai Bournemouth University, UK
Petre Lameski University Ss.Cyril and Methodius, North Macedonia
Suzana Loshkovska University Ss.Cyril and Methodius, North Macedonia
José Machado Da Silva University of Porto, Portugal
Ana Madevska Bogdanova University Ss.Cyril and Methodius, North Macedonia
Gjorgji Madjarov University Ss.Cyril and Methodius, North Macedonia
Tudruj Marek Polish Academy of Sciences, Poland
Ninoslav Marina University St. Paul the Apostle, North Macedonia
Smile Markovski University Ss.Cyril and Methodius, North Macedonia
Marcin Michalak Silesian University of Technology, Poland
Hristina Mihajlovska University Ss.Cyril and Methodius, North Macedonia
Marija Mihova University Ss.Cyril and Methodius, North Macedonia
Aleksandra Mileva University Goce Delcev, North Macedonia
Biljana Mileva Boshkoska Faculty of Information Studies in Novo Mesto,
Slovenia
Georgina Mirceva University Ss.Cyril and Methodius, North Macedonia
Miroslav Mirchev University Ss.Cyril and Methodius, North Macedonia
Igor Mishkovski University Ss.Cyril and Methodius, North Macedonia
Kosta Mitreski University Ss.Cyril and Methodius, North Macedonia
Pece Mitrevski University St. Kliment Ohridski, North Macedonia
Irina Mocanu University Politehnica of Bucharest, Romania
Ammar Mohammed Cairo University, Egypt
Andreja Naumoski University Ss.Cyril and Methodius, North Macedonia
Manuel Noguera University of Granada, Spain
Thiare Ousmane Gaston Berger University, Senegal
Eleni Papakonstantinou Genetics Lab, Greece
Marcin Paprzycki Polish Academy of Sciences, Poland
Dana Petcu West University of Timisoara, Romania
Antonio Pinheiro Universidade da Beira Interior, Portugal
Matus Pleva Technical University of Košice, Slovakia
Florin Pop University Politehnica of Bucharest, Romania
Zaneta Popeska University Ss.Cyril and Methodius, North Macedonia
Aleksandra Popovska-Mitrovikj University Ss.Cyril and Methodius, North Macedonia
Marco Porta University of Pavia, Italy


Ustijana Rechkoska Shikoska University of Information Science and Technology
St. Paul the Apostle, North Macedonia
Manjeet Rege University of St. Thomas, USA
Bernd Rinn ETH Zurich, Switzerland
Blagoj Ristevski University St. Kliment Ohridski, North Macedonia
Sasko Ristov University Ss.Cyril and Methodius, North Macedonia
Witold Rudnicki University of Białystok, Poland
Jelena Ruzic Mediteranean Institute for Life Sciences, Croatia
David Šafránek Masaryk University, Czech Republic
Simona Samardjiska University Ss.Cyril and Methodius, North Macedonia
Wibowo Santoso Central Queensland University, Australia
Snezana Savovska University St. Kliment Ohridski, North Macedonia
Loren Schwiebert Wayne State University, USA
Vladimir Siládi Matej Bel University, Slovakia
Josep Silva Universitat Politècnica de València, Spain
Ana Sokolova University of Salzburg, Austria
Michael Sonntag Johannes Kepler University Linz, Austria
Dejan Spasov University Ss.Cyril and Methodius, North Macedonia
Todorova Stela University of Agriculture, Bulgaria
Goran Stojanovski Elevate Global, North Macedonia
Biljana Stojkoska University Ss.Cyril and Methodius, North Macedonia
Ariel Stulman The Jerusalem College of Technology, Israel
Spinsante Susanna Università Politecnica delle Marche, Italy
Ousmane Thiare Gaston Berger University, Senegal
Biljana Tojtovska University Ss.Cyril and Methodius, North Macedonia
Yalcin Tolga NXP Labs, UK
Dimitar Trajanov University Ss.Cyril and Methodius, North Macedonia
Ljiljana Trajkovic Simon Fraser University, Canada
Vladimir Trajkovik University Ss.Cyril and Methodius, North Macedonia
Denis Trcek University of Ljubljana, Slovenia
Christophe Trefois University of Luxembourg, Luxembourg
Kire Trivodaliev University Ss.Cyril and Methodius, North Macedonia
Katarina Trojacanec University Ss.Cyril and Methodius, North Macedonia
Hieu Trung Huynh Industrial University of Ho Chi Minh City, Vietnam
Zlatko Varbanov Veliko Tarnovo University, Bulgaria
Goran Velinov University Ss.Cyril and Methodius, North Macedonia
Elena Vlahu-Gjorgievska University of Wollongong, Australia
Irena Vodenska Boston University, USA
Katarzyna Wac University of Geneva, Switzerland
Yue Wuyi Konan University, Japan
Zeng Xiangyan Fort Valley State University, USA
Shuxiang Xu University of Tasmania, Australia
Rita Yi Man Li Hong Kong Shue Yan University, Hong Kong, China
Malik Yousef Zefat Academic College, Israel
Massimiliano Zannin INNAXIS Foundation Research Institute, Spain


Zoran Zdravev University Goce Delcev, North Macedonia
Eftim Zdravevski University Ss.Cyril and Methodius, North Macedonia
Vladimir Zdravevski University Ss.Cyril and Methodius, North Macedonia
Katerina Zdravkova University Ss.Cyril and Methodius, North Macedonia
Jurica Zucko Faculty of Food Technology and Biotechnology,
Croatia
Chang Ai Sun University of Science and Technology Beijing, China
Yin Fu Huang University of Science and Technology, Taiwan
Suliman Mohamed Fati INTI International University, Malaysia
Hwee San Lim University Sains Malaysia, Malaysia
Fu Shiung Hsieh University of Technology, Taiwan
Dimitrios Vlachakis Genetics Lab, Greece
Boris Vrdoljak University of Zagreb, Croatia
Maja Zagorščak National Institute of Biology, Slovenia

Scientific Committee
Danco Davcev University Ss.Cyril and Methodius, North Macedonia
Dejan Gjorgjevikj University Ss.Cyril and Methodius, North Macedonia
Boro Jakimovski University Ss.Cyril and Methodius, North Macedonia
Aleksandra Popovska-Mitrovikj University Ss.Cyril and Methodius, North Macedonia
Sonja Gievska University Ss.Cyril and Methodius, North Macedonia
Gjorgji Madjarov University Ss.Cyril and Methodius, North Macedonia

Technical Committee
Ilinka Ivanoska University Ss.Cyril and Methodius, North Macedonia
Monika Simjanoska University Ss.Cyril and Methodius, North Macedonia
Bojana Koteska University Ss.Cyril and Methodius, North Macedonia
Martina Toshevska University Ss.Cyril and Methodius, North Macedonia
Frosina Stojanovska University Ss.Cyril and Methodius, North Macedonia

Additional Reviewers

Emanouil Atanassov
Emanuele Pio Barracchia
Abstracts of Keynotes
Machine Learning Optimization
and Modeling: Challenges and Solutions
to Data Deluge

Diego Klabjan

Department of Industrial Engineering and Management Sciences, Northwestern University
Master of Science in Analytics, Northwestern University
Center for Deep Learning, Northwestern University
[email protected]

Abstract. A single server can no longer handle all of the data of a machine
learning problem. Today’s data is fine granular, usually has the temporal
dimension, is often streamed, and thus distributed among several compute nodes
on premise or in the cloud. More hardware buys you only so much; in particular,
the underlying models and algorithms must be capable of exploiting it. We focus
on distributed optimization algorithms where samples and features are
distributed, and in a different setting where data is streamed by an infinite
pipeline. Algorithms and convergence analyses will be presented. Fine granular
data with a time dimension also offers opportunities to deep learning models that
outperform traditional machine learning models. To this end, we use churn
predictions to showcase how recurrent neural networks with several important
enhancements squeeze additional business value.

Keywords: Distributed optimization • Deep learning • Recurrent neural networks
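The data-parallel setting the abstract describes, where each compute node holds a shard of the samples and contributes a local gradient to an averaged update, can be illustrated with a small NumPy sketch. This simulates the workers locally; the least-squares problem, shard sizes, and learning rate are invented for illustration and are not from the talk.

```python
import numpy as np

# Toy data-parallel least squares: every worker holds a shard of the samples.
rng = np.random.default_rng(0)
n_workers, n_per, d = 4, 250, 10
X = rng.normal(size=(n_workers * n_per, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=len(X))
shards = [(X[i * n_per:(i + 1) * n_per], y[i * n_per:(i + 1) * n_per])
          for i in range(n_workers)]

w = np.zeros(d)
lr = 0.05
for step in range(200):
    # Each worker computes the gradient of its local least-squares loss...
    grads = [Xi.T @ (Xi @ w - yi) / len(yi) for Xi, yi in shards]
    # ...and an all-reduce-style step averages them into one global update.
    w -= lr * np.mean(grads, axis=0)

print(np.linalg.norm(w - w_true))  # near zero: the averaged updates converge
```

In a real cluster the `np.mean` over per-worker gradients would be an all-reduce across nodes; the convergence analyses mentioned in the abstract study exactly how such averaging behaves under communication delays and streaming data.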
Computing and Probing Cancer Immunity

Zlatko Trajanoski

Division of Bioinformatics, Medical University of Innsbruck


[email protected]

Abstract. Recent breakthroughs in cancer immunotherapy and decreasing costs
of high-throughput technologies sparked intensive research into tumour-immune
cell interactions using genomic tools. However, the wealth of the generated data
and the added complexity pose considerable challenges and require computa-
tional tools to process, analyse and visualise the data. Recently, a number of
tools have been developed and used to effectively mine tumour immunologic
and genomic data and provide novel mechanistic insights. In this talk I will first
review and discuss computational genomics tools for mining cancer genomic
data and extracting immunological parameters. I will focus on higher-level
analyses of NGS data including quantification of tumour-infiltrating
lymphocytes (TILs), identification of tumour antigens and T cell receptor
(TCR) profiling. Additionally, I will address the major challenges in the field
and ongoing efforts to tackle them.
In the second part I will show results generated using state-of-the-art
computational tools addressing several prevailing questions in cancer
immunology including: estimation of the TIL landscape, identification of
determinants of tumour immunogenicity, and the immunoediting that tumours
undergo during progression or as a consequence of targeting the PD-1/PD-L1
axis. Finally, I will propose a novel approach based on perturbation biology of
patient-derived organoids and mathematical modeling for the identification of a
mechanistic rationale for combination immunotherapies in colorectal cancer.

Keywords: Cancer immunotherapy • Tumour-infiltrating lymphocytes • Perturbation biology
Bioinformatics Approaches for Single Cell
Transcriptomics and Big Omics Data Analysis

Ming Chen

Department of Bioinformatics, College of Life Sciences, Zhejiang University


[email protected]

Abstract. We are in the big data era. Multi-omics data brings us a challenge to
develop appropriate bioinformatics approaches to model complex biological
systems at spatial and temporal scales. In this talk, we will describe multi-omics
data available for biological interactome modeling. Single cell transcriptomics
data is exploited and analyzed. An integrative interactome model of non-coding
RNAs is built. We investigate approaches to characterize coding and non-coding
RNAs, including microRNAs, siRNAs, lncRNAs, ceRNAs, and circRNAs.

Keywords: Big data • Multi-omics data • RNA


Crosslingual Document Embedding
as Reduced-Rank Ridge Regression

Robert West

Tenure Track Assistant Professor, Data Science Laboratory, EPFL


robert.west@epfl.ch

Abstract. There has recently been much interest in extending vector-based word
representations to multiple languages, such that words can be compared across
languages. In this paper, we shift the focus from words to documents and
introduce a method for embedding documents written in any language into a
single, language-independent vector space. For training, our approach leverages
a multilingual corpus where the same concept is covered in multiple languages
(but not necessarily via exact translations), such as Wikipedia. Our method,
Cr5 (Crosslingual reduced-rank ridge regression), starts by training a
ridge-regression-based classifier that uses language-specific bag-of-word fea-
tures in order to predict the concept that a given document is about. We show
that, when constraining the learned weight matrix to be of low rank, it can be
factored to obtain the desired mappings from language-specific bags-of-words to
language-independent embeddings. As opposed to most prior methods, which
use pretrained monolingual word vectors, postprocess them to make them
crosslingual, and finally average word vectors to obtain document vectors, Cr5
is trained end-to-end and is thus natively crosslingual as well as document-level.
Moreover, since our algorithm uses the singular value decomposition as its core
operation, it is highly scalable. Experiments show that our method achieves
state-of-the-art performance on a crosslingual document retrieval task. Finally,
although not trained for embedding sentences and words, it also achieves
competitive performance on crosslingual sentence and word retrieval tasks.

Keywords: Crosslingual • Reduced-rank • Ridge regression • Retrieval • Embeddings
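The low-rank factorization idea behind Cr5 can be sketched on toy data. This is a simplified illustration, not the authors' implementation: it truncates the SVD of the unconstrained ridge solution (the paper solves the rank-constrained problem directly), and all dimensions and data below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_docs, vocab, n_concepts, rank = 200, 50, 20, 5

# Stand-ins for language-specific bag-of-words features and concept labels.
X = rng.poisson(1.0, size=(n_docs, vocab)).astype(float)
Y = np.eye(n_concepts)[rng.integers(0, n_concepts, n_docs)]

# Ridge-regression classifier: W maps bags-of-words to concept scores.
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(vocab), X.T @ Y)

# Low-rank step: truncating the SVD of W factors it as W ≈ A @ B, where
# A projects documents into a shared low-dimensional space.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * s[:rank]   # vocab -> embedding map (vocab x rank)
B = Vt[:rank]                # concept embeddings (rank x n_concepts)

doc_embeddings = X @ A       # language-independent document vectors
print(doc_embeddings.shape)  # (200, 5)
```

With one such projection `A` per language, trained against shared concept labels, documents from different languages land in the same `rank`-dimensional space, which is what makes crosslingual retrieval by vector similarity possible.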
Contents

Automatic Text Generation in Macedonian Using Recurrent


Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Ivona Milanova, Ksenija Sarvanoska, Viktor Srbinoski,
and Hristijan Gjoreski

Detection of Toy Soldiers Taken from a Bird’s Perspective Using


Convolutional Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Saša Sambolek and Marina Ivašić-Kos

Prediction of Student Success Through Analysis of Moodle Logs:


Case Study. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Neslihan Ademi, Suzana Loshkovska, and Slobodan Kalajdziski

Multidimensional Sensor Data Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . 41


Stefan Popov and Biljana Risteska Stojkoska

Five Years Later: How Effective Is the MAC Randomization in Practice?


The No-at-All Attack. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Ivan Vasilevski, Dobre Blazhevski, Veno Pachovski,
and Irena Stojmenovska

Improvement of the Binary Varshamov Bound . . . . . . . . . . . . . . . . . . . . . . 65


Dejan Spasov

Electrical Energy Consumption Prediction Using Machine Learning . . . . . . . 72


Simon Stankoski, Ivana Kiprijanovska, Igor Ilievski,
Jovanovski Slobodan, and Hristijan Gjoreski

Diatom Ecological Modelling with Weighted Pattern Tree Algorithm


by Using Polygonal and Gaussian Membership Functions . . . . . . . . . . . . . . 83
Andreja Naumoski, Georgina Mirceva, and Kosta Mitreski

A Study of Different Models for Subreddit Recommendation


Based on User-Community Interaction. . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Andrej Janchevski and Sonja Gievska

Application of Hierarchical Bayesian Model in Ophtalmological Study . . . . . 109


Biljana Tojtovska, Panche Ribarski, and Antonela Ljubic

Friendship Paradox and Hashtag Embedding in the Instagram


Social Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
David Serafimov, Miroslav Mirchev, and Igor Mishkovski

An In-Depth Analysis of Personality Prediction . . . . . . . . . . . . . . . . . . . . . 134


Filip Despotovski and Sonja Gievska

Ski Injury Predictions with Explanations . . . . . . . . . . . . . . . . . . . . . . . . . . 148


Sandro Radovanović, Andrija Petrović, Boris Delibašić,
and Milija Suknović

Performance Evaluation of Word and Sentence Embeddings for Finance


Headlines Sentiment Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Kostadin Mishev, Ana Gjorgjevikj, Riste Stojanov, Igor Mishkovski,
Irena Vodenska, Ljubomir Chitkushev, and Dimitar Trajanov

A Hybrid Model for Financial Portfolio Optimization Based on LS-SVM


and a Clustering Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
Ivana P. Marković, Jelena Z. Stanković, Miloš B. Stojanović,
and Jovica M. Stanković

Protein Secondary Structure Graphs as Predictors for Protein Function. . . . . . 187


Frosina Stojanovska and Nevena Ackovska

Exploring the Attention Mechanism in Deep Models: A Case Study


on Sentiment Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Martina Toshevska and Slobodan Kalajdziski

Image Augmentation with Neural Style Transfer . . . . . . . . . . . . . . . . . . . . . 212


Borijan Georgievski

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225


Automatic Text Generation in Macedonian
Using Recurrent Neural Networks

Ivona Milanova, Ksenija Sarvanoska, Viktor Srbinoski,


and Hristijan Gjoreski

Faculty of Electrical Engineering and Information Technologies,


University of Ss. Cyril and Methodius in Skopje, Skopje, North Macedonia
[email protected],
[email protected],
[email protected],
[email protected]

Abstract. Neural text generation is the process of training a neural network to
generate human-understandable text (a poem, a story, an article). Recurrent Neural
Networks and Long Short-Term Memory are powerful sequence models that are
suitable for this kind of task. In this paper, we have developed two types of
language models, one generating news articles and the other generating poems
in Macedonian language. We developed and tested several different model
architectures, among which we also tried a transfer-learning model, since text
generation requires a lot of processing time. As an evaluation metric we used the
ROUGE-N metric (Recall-Oriented Understudy for Gisting Evaluation), where
the generated text was tested against a reference text written by an expert. The
results showed that even though the generated text had flaws, it was human
understandable and consistent throughout the sentences. To the best of
our knowledge, this is the first attempt at automatic text generation (poems and
articles) in the Macedonian language using Deep Learning.

Keywords: Text generation  Storytelling  Poems  RNN  Macedonian


language  NLP  Transfer learning  ROUGE-N

1 Introduction

As the presence of Artificial Intelligence (AI) and Deep Learning has become more
prominent in the past couple of years and these fields have acquired significant
popularity, more and more tasks from the domain of Natural Language Processing are
being tackled. One such task is automatic text generation, which can be addressed with
the help of deep neural networks, especially Recurrent Neural Networks [16]. Text
generation is the process of preparing text for developing a word-based language
model, then designing and fitting a neural language model so that it can predict the
likelihood of occurrence of a word based on the previous sequence of words in the
source text. The learned language model is then used to generate new text with
similar statistical properties as the source text.

© Springer Nature Switzerland AG 2019


S. Gievska and G. Madjarov (Eds.): ICT Innovations 2019, CCIS 1110, pp. 1–12, 2019.
https://doi.org/10.1007/978-3-030-33110-8_1
2 I. Milanova et al.

In our paper, we perform an experimental evaluation of two types of word-based
language models in order to create a text generation system for Macedonian. The first
model generates paragraphs of news articles and the other generates poems. For the
news articles, we also tried implementing transfer learning, but as there are no models
pre-trained on a Macedonian dataset, we used a model trained on an English dataset,
and the results were not satisfactory. The second model we created was used to
generate poetry and was trained on a dataset consisting of Macedonian folk poems. In
order to measure how closely the generated text resembles human-written text, we
used a metric called ROUGE-N (Recall-Oriented Understudy for Gisting Evaluation),
a set of metrics for evaluating automatically generated text as well as machine
translation. With this metric, we obtained an F1 score of 65%.

2 Related Work

Recent years have brought huge interest in language modeling tasks, many of them
involving automatic text generation from a corpus of text, as well as visual captioning
and video summarization. The burst of deep learning and the massive development of
hardware infrastructure have made such tasks much more feasible.
Some of the work in this field includes automatic text generation based on intuitive
models, using heuristics to look for the elements of the text that were proposed by
human feedback [1, 2]. Another approach has leaned towards character-based text
generation using Hessian-free optimization in order to overcome the difficulties
associated with training RNNs [3]. Text generation from independent short
descriptions has also been a topic of research, with the aim of describing a scene or
event; both Statistical Machine Translation and Deep Learning have been used to
approach text generation in these two different manners [4].
Another kind of text generation application has been designing text-based
interactive narratives. Some works have used an evolutionary algorithm with an
end-to-end system that models the components of the text generation pipeline
stochastically [5], and others have mined crowdsourced information from the
web [6, 7].
Many papers have also focused on visual text generation, image captioning and
video description. One recent approach to image captioning used CNN-LSTM
structures [8, 9]. Sequence-to-sequence models have been used to caption video or
movie contents, using an approach where the first sequence encodes the video and
the second decodes the description [10, 11].
The idea behind document summarization has been applied to video
summarization, where instead of extracting key sentences, key frames or shots are
selected [12].
Visual storytelling is the process of telling a coherent story about an image set.
Works covering this include storyline graph modeling [13] and unsupervised
mining [14].
Another state-of-the-art approach uses hierarchically structured reinforcement
learning to generate coherent multi-sentence stories for the visual storytelling
task [25].

However, all of these approaches address text generation in English, where the
amount of available data is enormous. Our paper focuses on generating stories in
Macedonian using data from Macedonian news portals and folk poetry, as well as
exploring different model architectures in order to get the best result regarding
comprehensibility and execution time.

3 Dataset and Preprocessing

The first dataset that we used was gathered from online news portals and consisted of
around 2.5 million words. The data was collected with the help of a scraper program
we wrote in .NET Core using the C# programming language. The program loads the
web page from a given Uniform Resource Locator (URL) and then inspects its HTML
tags. When it finds a match for the HTML tags we have specified, it takes their
content and writes it to a file.
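As a sketch of this tag-matching logic, here is a minimal Python analogue of the C# scraper, built on the standard-library HTMLParser; the tag and class names below are hypothetical placeholders for those given to the program:

```python
from html.parser import HTMLParser

class ArticleScraper(HTMLParser):
    """Collect the text inside tags matching a target (tag, class) pair."""

    def __init__(self, target_tag="p", target_class="article-body"):
        super().__init__()
        self.target_tag = target_tag      # hypothetical tag of interest
        self.target_class = target_class  # hypothetical class attribute
        self.inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == self.target_tag and ("class", self.target_class) in attrs:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == self.target_tag:
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data.strip())

page = '<div><p class="article-body">Some text.</p><p class="other">Skip.</p></div>'
scraper = ArticleScraper()
scraper.feed(page)
print(" ".join(scraper.chunks))  # -> Some text.
```

The real program additionally writes the collected content to a file, one article at a time.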
The second dataset consisted of a collection of Macedonian poetry written by
various Macedonian writers [17] and was made up of roughly 7 thousand words. The
data was not clean, so we had to do a fair amount of data preprocessing.
The collected datasets are publicly available at https://github.com/Ivona221/
MacedonianStoryTelling.
In order to prepare the collected data for the algorithm and to simplify its task
during learning, we had to do a considerable amount of data cleaning. For this
purpose, we created a pipeline in C#, which in the end gave us a clean dataset to work
with. The first step in the pipeline was to remove any special characters from the text
corpus, including HTML tags extracted from the websites along with the text,
JavaScript functions and so on. We also had to spell out some symbols, such as dollar
signs, degree signs and mathematical operators, so that the algorithm had the smallest
possible set of unique characters to work with. The next step was to translate all
English words where a translation existed, or to remove the sentence otherwise. Next,
we separated all punctuation marks from the words with a space so that the algorithm
would treat them as independent words. The last and one of the most important steps
in this pipeline was creating a custom word tokenizer. The existing word tokenizers
split on whitespace; however, they do not take into consideration the most common
word collocations, or hyphenated words, name initials and abbreviations. Our
tokenizer treats these as single tokens. The abbreviations and the most common
collocations were handled by using a look-up table of all the Macedonian
abbreviations, and the initials were handled by searching for specific patterns in the
text, such as capital letter-period-capital letter (Fig. 1).

Fig. 1. Data preprocessing flow
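The tokenization rules described above can be sketched in Python as follows. This is a hedged illustration: the abbreviation entries are samples, not the paper's full look-up table, and the exact rules of the original C# tokenizer may differ.

```python
import re

# Sample entries; the paper's look-up table covers all Macedonian
# abbreviations and the most common collocations.
ABBREVIATIONS = ["итн.", "т.е.", "св."]

def build_tokenizer(abbreviations):
    abbrev_alt = "|".join(re.escape(a) for a in abbreviations)
    # Alternation order matters: initials (capital letter, period, capital
    # letter, period) and known abbreviations are matched before plain words,
    # so they survive as single tokens; hyphenated words stay whole, and
    # punctuation marks become independent tokens.
    pattern = re.compile(
        rf"[A-ZЀ-Я]\.\s?[A-ZЀ-Я]\.|{abbrev_alt}|\w+(?:-\w+)+|\w+|[^\w\s]"
    )
    return lambda text: pattern.findall(text)

tokenize = build_tokenizer(ABBREVIATIONS)
print(tokenize("А.Б. Кирилов пристигна, т.е. дојде."))
# -> ['А.Б.', 'Кирилов', 'пристигна', ',', 'т.е.', 'дојде', '.']
```

The Cyrillic range Ѐ-Я covers the Macedonian uppercase letters (Ѓ, Ѕ, Ј, Љ, Њ, Ќ, Џ) as well as the shared Cyrillic ones.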



4 Language Model Architecture

We have trained two types of models that work on the principle of predicting the next
word in a sequence: one for news generation and one for poem generation. The
language models used are statistical and predict the probability of each word given an
input sequence of text. For the news generation model, we created several different
variations, including a transfer learning approach (Fig. 2).

Fig. 2. Language model architecture

4.1 News Article Generation


The first approach for news generation was trained on news articles and used a
sequence of a hundred words as input. It generates one word based on that sequence,
namely the word with the highest probability of appearing next. In the next time step it
appends the generated word to the sequence of a hundred words and cuts off the very
first word, so that it once again has a sequence of length one hundred. It then feeds
this new sequence to itself as input for the next time step and continues doing so until
it has generated the preset number of words.
We tried out multiple architectures and the best results were obtained with the
following one. The neural network was an LSTM (Long Short-Term Memory)
recurrent neural network with two LSTM layers and two dense layers; we also tried a
variation with a dropout layer in order to see how that affects performance. The first
LSTM layer consists of 100 hidden units and 100 timesteps and is configured to give
one hidden state output for each input time step for the single LSTM cell in the

layer. The second layer we added was a Dropout layer with a dropout rate of 0.2. They
key idea behind adding a dropout layer is to prevent overfitting. This technique works
by randomly dropping units (along with their connections) from the neural network
during training. This prevents units from co-adapting too much [18]. The next layer is
also an LSTM layer with 100 hidden units. Then we added a Dense layer which is a
fully connected layer. A dense layer represents a matrix vector multiplication. The
values in the matrix are the trainable parameters, which get updated during back-
propagation. Assuming we have an n-dimensional input vector u (in our case 100-
dimensional input vector) presented with the formula:

u ∈ R^(n×1)  (1)

We get an m-dimensional vector as output:

uᵀ · W ∈ R^(1×m), where W ∈ R^(n×m)  (2)

A dense layer is thus used to change the dimensions of the vector; mathematically
speaking, it applies a rotation, scaling and translation transform to the vector. The
activation function, i.e. the element-wise function applied on this layer, was ReLU
(Rectified Linear Units), an activation function introduced by [19]. In 2011, it was
demonstrated to improve the training of deep neural networks. It works by
thresholding values at zero, i.e. f(x) = max(0, x): it outputs zero when x < 0 and a
linear function when x ≥ 0. The last layer was also a Dense layer, however with a
different activation function, softmax. The softmax function calculates the probability
distribution of an event over 'n' different events: it calculates the probability of each
target class over all possible target classes, and these probabilities then help determine
the target class for the given inputs. As the loss function (the error function) we
the target class for the given inputs. As the loss function or the error function we
decided to use sparse categorical cross entropy. A loss function compares the predicted
label and true label and calculates the loss. With categorical cross entropy, the formula
to compute the loss is as follows:
XM
 y
c¼1 o;c
logðpo:c Þ ð3Þ

where:
• M – number of classes
• log – the natural log
• y – binary indicator (0 or 1) of whether class label c is the correct classification for
observation o
• p – predicted probability that observation o is of class c
The only difference between sparse categorical cross entropy and categorical cross
entropy is the format of the true labels. In a single-label, multi-class classification
problem the labels are mutually exclusive, meaning each data entry can belong to only
one class. With categorical cross entropy, y_true is represented using one-hot
embeddings, whereas sparse categorical cross entropy uses integer class indices,
which saves memory when the label space is sparse (the number of classes is very
large).
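To make the distinction concrete, here is a plain-Python sketch of softmax and the two cross-entropy variants; it illustrates Eq. (3), and is not the framework's built-in implementation used in training:

```python
import math

def softmax(logits):
    """Probability distribution over n classes (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot, probs):
    """-sum_c y_{o,c} * log(p_{o,c}) with a one-hot label vector."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs))

def sparse_categorical_cross_entropy(class_index, probs):
    """Same loss, but the true label is a single integer index."""
    return -math.log(probs[class_index])

probs = softmax([2.0, 1.0, 0.1])
# Both label formats give the same loss value for the same true class:
assert abs(categorical_cross_entropy([1, 0, 0], probs)
           - sparse_categorical_cross_entropy(0, probs)) < 1e-12
```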
6 I. Milanova et al.

As the optimizer we decided to use the Adam optimizer (Adaptive Moment
Estimation), a method that computes adaptive learning rates for each parameter. It is
an algorithm for first-order gradient-based optimization of stochastic objective
functions. In addition to storing an exponentially decaying average of past squared
gradients vt, like Adadelta and RMSprop, Adam also keeps an exponentially decaying
average of past gradients mt, similar to momentum [20]. As the evaluation metric we
used accuracy.
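The update rule can be sketched for a single scalar parameter as follows, using the default hyperparameters from [20]; this is an illustration of the mechanics, not the optimizer implementation used in our training:

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: m and v are the exponentially decaying averages of
    past gradients and past squared gradients; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Minimizing f(x) = x^2 (gradient 2x) drives the parameter toward 0:
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 3001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
```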
For the second approach, we tried a transfer learning method using the pre-trained
word2vec model, which after each epoch generated as many sentences as there were
starting words given at the beginning. Word2vec is a two-layer neural net that
processes text. On top of the pre-trained model, we added one LSTM layer and one
Dense layer. When using a pretrained model, the embedding (the first layer of our
model) is seeded with the word2vec word embedding weights. We trained this model
for 100 epochs using a batch size of 128. However, as the pre-trained models are
trained on English text, this model did not give us satisfactory results, which is why
this model's results are not discussed further in this paper [22].

4.2 Poem Generation


This model has only slight differences from the first LSTM network we described. It is
an LSTM neural network as well, with two LSTM layers and one dense layer. Having
experimented with several different combinations of parameters, we decided on the
following architecture. The first LSTM layer is made up of 150 units, followed by a
Dropout layer with a dropout rate of 0.2 to reduce overfitting. The second LSTM layer
is made up of 100 units and has another Dropout layer after it with a dropout rate of
0.2. Finally, we have a dense layer with a softmax activation function, which picks out
the most fitting class, or rather word, for the given input.
In this model we decided on categorical cross entropy as the loss function, and
once again we use the Adam optimizer.
Another thing that we do differently with the second model is the usage of the
EarlyStop callback [21]. A callback is a set of functions applied at certain stages of the
training procedure with the purpose of viewing the internal states and statistics of the
model during training. EarlyStop lessens the problem of deciding how long to train a
network, since too little training could lead to underfitting on the train and test sets,
while too much training leads to overfitting. We train the network on a training set
until the performance on a validation set starts to degrade. When the model starts
learning the statistical noise in the training set and stops generalizing, the
generalization error will increase and signal overfitting. With this approach, after
every epoch we evaluate the model on a holdout validation dataset, and if this
performance starts decaying, the training process is stopped. Because we are certain
that the network will stop at an appropriate point in time, we use a large number of
training epochs, more than normally required, so that the network is given an
opportunity to fit and then begin to overfit the training set. In our case, we use 100
epochs.
Early stopping is probably the oldest and most widely used form of neural network
regularization.
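The patience logic behind such a callback can be sketched as follows; this is a stand-in for the framework's EarlyStop callback, with `val_losses` abbreviating "train one epoch, then evaluate on the holdout set":

```python
def train_with_early_stopping(epochs, val_losses, patience=5):
    """Stop once validation loss has not improved for `patience` epochs."""
    best, best_epoch, wait = float("inf"), 0, 0
    for epoch in range(epochs):
        loss = val_losses[epoch]  # in practice: train one epoch, then evaluate
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                break  # validation performance degraded: stop training
    return best_epoch, best

# Validation loss improves, then degrades as overfitting sets in:
losses = [1.0, 0.8, 0.6, 0.55, 0.57, 0.6, 0.62, 0.65, 0.7, 0.75, 0.8, 0.9]
print(train_with_early_stopping(len(losses), losses))  # -> (3, 0.55)
```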

5 Text Generation

As mentioned before, the first language model is fed a sequence of a hundred words;
to make this possible, a few text preprocessing steps need to be taken. With the
tokenizer we first vectorize the data by turning the text into a sequence of integers,
each integer being the index of a token in a dictionary. We then construct a new file
which contains our input text with exactly one hundred words per line. In the text
generation process we randomly select one line from the previously created file as the
seed for generating a new word. We then encode this line of text to integers using the
same tokenizer that was used when training the model. The model then makes a
prediction of the next word and gives the index of the word with the highest
probability, which we look up in the tokenizer's mapping to retrieve the associated
word. We then append this new word to the seed text and repeat the process.
Since the sequence will eventually become too long, we truncate it to
the appropriate length after the input sequence has been encoded to integers.
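The loop just described can be sketched as follows, with `model_predict` standing in for the trained network's most-probable-next-word prediction and `word_index` for the tokenizer's word-to-integer mapping (both are hypothetical stand-ins):

```python
def generate(model_predict, word_index, seed_ids, n_words, window=100):
    """Sliding-window generation: predict the next word id from the last
    `window` ids, append it, and repeat until n_words are produced."""
    ids = list(seed_ids)
    for _ in range(n_words):
        context = ids[-window:]           # truncate to the model's input length
        next_id = model_predict(context)  # index of the most probable next word
        ids.append(next_id)
    index_word = {i: w for w, i in word_index.items()}  # reverse lookup
    return " ".join(index_word[i] for i in ids[len(seed_ids):])

# Toy run: a stub "model" that alternates between two word ids.
vocab = {"a": 0, "b": 1}
print(generate(lambda ctx: (ctx[-1] + 1) % 2, vocab, [0], 4))  # -> b a b a
```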

6 Results

As we mentioned before, we trained two main models: one for news article generation
and one for poems.
The accuracy and loss for this kind of task are calculated on the training set, since
one cannot measure the correctness of a story and therefore a test set cannot be
constructed. In order to evaluate the result, we compared it against a human-produced
equivalent of the generated story.
Regarding the news generation model, we tried two variations, one with and one
without dropout layers, and tested how that affected the training accuracy and loss.
Both variations were trained for 50 epochs, using a batch size of 64. Comparing the
results, adding dropout layers improved accuracy and required shorter training time
(Figs. 3 and 4).
The poem generation model was trained for 100 epochs using a batch size of 64.
This model also included dropout layers (Figs. 5 and 6).
In order to evaluate how closely the generated text resembles a human-written text,
we used the ROUGE-N metric. It works by comparing an automatically produced text
or translation against a set of reference texts, which are human-produced. The recall
(in the context of ROUGE) refers to how much of the reference summary the system
summary is recovering or capturing. It can be computed as:

    recall = number of overlapping words / total words in reference text  (4)

The precision measures how much of the system summary was in fact relevant or
needed. It is calculated as:

    precision = number of overlapping words / total words in system generated text  (5)

Fig. 3. Train accuracy, comparison with and without dropout layer

Fig. 4. Train loss, comparison with and without dropout layer

Fig. 5. Train accuracy



Fig. 6. Train loss.

Using the precision and recall, we can compute an F1 score with the following formula:

    F1 = 2 · (precision × recall) / (precision + recall)  (6)

In our case we used ROUGE-1, ROUGE-2 and ROUGE-L.
• ROUGE-1 refers to the overlap of unigrams between the system summary and the
reference summary
• ROUGE-2 refers to the overlap of bigrams between the system and reference
summaries
• ROUGE-L measures longest matching sequence of words using LCS (Longest
Common Subsequence). An advantage of using LCS is that it does not require
consecutive matches, but in-sequence matches that reflect sentence level word
order. Since it automatically includes longest in-sequence common n-grams, you do
not need a predefined n-gram length.
The reason one would use ROUGE-2 over, or in conjunction with, ROUGE-1 is to
also capture the fluency of the texts or translations. The intuition is that the more
closely the word ordering of the reference summary is followed, the more fluent the
generated summary is [24].
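A minimal plain-Python version of these overlap counts, using clipped n-gram counts (the example sentences are illustrative, not drawn from our dataset):

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rouge_n(candidate, reference, n=1):
    """ROUGE-N precision, recall and F1 from clipped n-gram overlap."""
    cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
    overlap = sum((cand & ref).values())  # overlapping n-grams, clipped
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

reference = "мачката седи на прагот".split()  # human-written reference
candidate = "мачката лежи на прагот".split()  # system-generated text
p, r, f1 = rouge_n(candidate, reference, n=1)
print(round(p, 2), round(r, 2), round(f1, 2))  # -> 0.75 0.75 0.75
```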
The results from the ROUGE-N metric are shown in Tables 1 and 2:

Table 1. Results for the news generation model

          Precision %   Recall %   F1-score %
ROUGE-1   66.67         76.92      71.43
ROUGE-2   35.85         57.58      44.19
ROUGE-L   66.67         76.92      70.71

Table 2. Results for the poem generation model

          Precision %   Recall %   F1-score %
ROUGE-1   47.89         46.58      47.22
ROUGE-2   21.05         21.28      21.16
ROUGE-L   40.85         39.73      40.26

In the end we present a sample generated from each of the models:


• News generation model:
…ќe ce cлyчи и дa ce кopиcтaт нa кopиcницитe, a и дa ce cлyчи, ce нaoѓa вo
игpaтa, a и дa ce cлyчи, ce yштe нe ce cлyчyвa вo Maкeдoниja. Bo мoмeнтoв, вo
тeкoт нa дpжaвaтa ce вo пpaшaњe, ce вo пpaшaњe, нo нe и дa ce cлyчи, нo и зa
дa ce cлyчи, ce нaвeдyвa вo cooпштeниeтo нa CAД. Bo тeкoт нa дpжaвaтa ce вo
пpaшaњe, ce вo пpaшaњe, и дa ce cлyчи, ce yштe нe и кopyпциcки cкaндaли, ce
нaoѓa вo вoдaтa и…
• Poem generation model:
Bo зaлyлa вpeмe тaмo - oд пpaнгитe sвeкoт, opo нa зeмниoт
зeмниoт poдeн кaт, - тpeндaфил вo oдaja, плиcнaлa вpeвa. мaглa
нa oчивe мoja! жeтвa нajбoгaтa цвeтa, opo нa кaj, шap.
jaзик љyбoвтa нaшa дeлaтa в тиx. paкa jaзик пaднa, -
cтиcнaт cитe љyѓe - нoќ, yбaвa, кpacнa, дишe poj, пиcнaт,
нajдe нa тyѓинa - чeмep вo oдaja, шap. jaзик идeш
пpaг, poдeн кaт, и cкитник co здpжaнa тyѓи opo cин
cтиcнaлa - ПOEMA - нoк, yбaвa, кpacнa, дишe нa poбja,
шap. издpжa cитe љyѓe - нoќ, yбaвa, кpacнa, дишe poj,
пиcнaт, мajкa, кaт, - paзвивa opa, вo oнaa вeчep, в
oчи oчивe нocaм, y ceкoa дoбa. opo ќe cpeтнaм, нajдe.

7 Conclusion

In this paper we present a solution to the problem of automatic text generation in
Macedonian. To the best of our knowledge this is the first attempt at automatic text
generation (poems and articles) in Macedonian using Deep Learning.
We made attempts with two types of models, the first for news generation and the
second for poem generation. We also tried a transfer learning model using word2vec;
however, the results were not satisfactory. Excluding the network where we used
transfer learning, we got promising results. The first model was able to generate a text
of a hundred words, including punctuation marks; it used syntax rules correctly and
put punctuation marks where needed. Even though the generated output of the poem
model was nonsensical, it was a clear indication that the network was able to learn the
style of the writers and compose a similar-looking piece of work.

There is, of course, a lot left to be desired, and a lot more can be done to improve
upon our work, such as using a more cohesive dataset. Because our dataset was
created by putting different news articles together, there is no logical connection from
one news article to the next, which resulted in our model's output not making much
sense. The same applies to our dataset of poems, where each poem was standalone
with no connection from one to the other. Another suggestion for achieving better
results is adding more layers and units to the networks, keeping in mind that as the
size of the neural network grows, so do the hardware requirements and the training
time needed [23].

Acknowledgment. We gratefully acknowledge the support of NVIDIA Corporation with the


donation of the Titan Xp GPU used for this research.

References
1. Bailey, P.: Searching for storiness: story-generation from a reader’s perspective. In: Working
Notes of the Narrative Intelligence Symposium (1999)
2. Pérez y Pérez, R., Sharples, M.: MEXICA: a computer model of a cognitive account of
creative writing. J. Exp. Theor. Artif. Intell. 13, 119–139 (2001)
3. Sutskever, I., Martens, J., Hinton, G.E.: Generating text with recurrent neural networks. In:
Proceedings of the 28th International Conference on Machine Learning (ICML-2011),
pp. 1017–1024 (2011)
4. Jain, P., Agrawal, P., Mishra, A., Sukhwani, M., Laha, A., Sankaranarayanan, K.: Story
generation from sequence of independent short descriptions. In: Proceedings of Workshop
on Machine Learning for Creativity, Halifax, Canada, August 2017 (SIGKDD 2017) (2017)
5. McIntyre, N., Lapata, M.: Learning to tell tales: a data-driven approach to story generation.
In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th
International Joint Conference on Natural Language Processing of the AFNLP, Association
for Computational Linguistics, vol. 1, pp. 217–225 (2009)
6. Li, B., Lee-Urban, S., Johnston, G., Riedl, M.: Story generation with crowdsourced plot
graphs. In: AAAI (2013)
7. Swanson, R., Gordon, A.: Say anything: a massively collaborative open domain story
writing companion. Interact. Storytelling 2008, 32–40 (2008)
8. Vinyals, O., Toshev, A., Bengio, S., Erhan, D.: Show and tell: a neural image caption
generator. In: CVPR (2015)
9. Xu, K., et al.: Show, attend and tell: neural image caption generation with visual attention.
In: ICML (2015)
10. Venugopalan, S., Rohrbach, M., Donahue, J., Mooney, R., Darrell, T., Saenko, K.: Sequence
to sequence-video to text. In: ICCV (2015)
11. Pan, Y., Mei, T., Yao, T., Li, H., Rui, Y.: Jointly modeling embedding and translation to
bridge video and language. In: CVPR (2016)
12. Rush, A.M., Chopra, S., Weston, J.: A neural attention model for abstractive sentence
summarization. In: EMNLP (2015)
13. Kim, G., Xing, E.P.: Reconstructing storyline graphs for image recommendation from web
community photos. In: CVPR (2014)
14. Sigurdsson, G.A., Chen, X., Gupta, A.: Learning visual storylines with skipping recurrent
neural networks. In: ECCV (2016)
things

use rest

of not

the his

Continental
to of depth

LApotre acre and

English produce are

and the

of wheat
martyred gives

are beingallowed as

oil all

of far amongst

once

never followed

begins

wanted to Atlantis

in
was

ice nothing

ot

not persons any

devising

phenomena 1000
retains the British

to

He English Tragedy

Smyrna

fully

provincial

provisions of fact

difficult superficial

that devoted

and government oil


and burnt feast

government

lined midst

wiser distance

the to

to
judgment to firm

and have

apply chapel

in

or

out its
appealed us treating

Milan re time

of

agricola sympathy

an serenity has

time seems it
that genuine

lodges of

still

may Notices the

its fields also


means

the In to

escape

each of mere

failed

had

that claimed India

we
literature his to

hastily the with

Sea But

the from

Five but have

like traditions in

which
The of

Ixxv

nuntiorum employ will

magically cash at

acts

it
and pure and

some recognized gift

activity S

three

abundantly

view
the was

the

in this

periodicals

assumption precincts a

the

the radical at
for are Given

the com carried

tracked were

omit they eternal

the he

reality

to

English their
back on would

with reading the

the

of landowners

leave if
with

remarks about

history sufficiently The

Fools

possible among mediaeval

The glorious

surface One nations

think MR
by parts

is everv

the number

proverb

direction exactly

Patrick

foreigners not

rebellion Trias Persian

the
would volume Travel

with

well for thankfully

so

of the idea

honour talent ipso

distinction attention

Ages My who

insisted he

Summit
and

Blessed or derrick

nation

mountaineer

irreverence Eiver

scene home
the

in countenance for

England But introduces

his passage in

has the tracing

a losing

had state aware

and that of

golem of penetrarant

lowest all been


ofienders familiar

Genesis

and superficial

do

dust from to

quoted one

States earth
basins

find of

the the

nor

attempting one

PCs solution Prench

Notices which

Club

they to
home of

be I

unsuccessfully Mosaic

our

wrote petty imply


p of

be comment on

in

2 sermon clergy

Encyclopaedias

does the his

of Stephani the

after

ground
where children from

to

truth Progress boyhood

the laid relating

Edward

like Spirestone order


the on

educate

S against

Apaturia of

the and she


sometimes Challenge be

local R

Timmy

or who creatures

a vault uae

France
by

a Eaters

of The will

this actors

of
to be accordingly

materials

Plon idea

1885 His

a on been

religion

it

in

regionibus in
good

extinguishing many this

over no

and ideas

prophetic coupled

this

is
prayers

lit

duration

naturally mistake of

especially

his Lingard veil


of receive

made

bottom Madonna

brilliancy as

hours of to

The and some

of Civil but

is

stories he first

with
materials particularly

of it

between

and

which that the

most

of Catholici

mind social
to

up rock

Spiders in

the

worked Buddhist

mud called
a Roman every

is

living

S first

Isle

woman furrowed they


what The Tankard

that

de

had Bishop

Leonard Fro

up has Big

have to Moran

the
In from like

are I

red conceals these

Spiritualit the pillow

be fellow

bedside was took

treasure of avoid

knotted out refineries

the if there

combined individual
an order but

intercourse is

them

produced deluge allaying

is Charitable

the

contemporary Keville
biologist other drawn

House

Side is unsettle

in do with

ad from endorse

lotus unico

and War and


let class a

Bright million

in other a

there seizure lives

are called

victory

not

RoleplayingTips din places

rationalizing the mists


the

them foundation annus

transcript areas in

domestic by of

The have
said Farrar il

Human Let

history has to

him meets 1875

punishment sands number

materalist original with

Hyperion

the relief
name Art

for matter of

in

have

enforcement

You might also like