dHealth 2019
Series Editors:
B. Blobel, O. Bodenreider, E. Borycki, M. Braunstein, C. Bühler, J.P. Christensen, R. Cooper,
R. Cornet, J. Dewen, O. Le Dour, P.C. Dykes, A. Famili, M. González-Sancho, E.J.S. Hovenga,
J.W. Jutai, Z. Kolitsi, C.U. Lehmann, J. Mantas, V. Maojo, A. Moen, J.F.M. Molenbroek,
G. de Moor, M.A. Musen, P.F. Niederer, C. Nøhr, A. Pedotti, N. Peek, O. Rienhoff, G. Riva,
W. Rouse, K. Saranto, M.J. Scherer, S. Schürer, E.R. Siegel, C. Safran, N. Sarkar,
T. Solomonides, E. Tam, J. Tenenbaum, B. Wiederhold, P. Wilson and L.H.W. van der Woude
Volume 260
Recently published in this series
Vol. 259. T. Bürkle, M. Lehmann, K. Denecke, M. Sariyar, S. Bignens, E. Zetz and J. Holm
(Eds.), Healthcare of the Future – Bridging the Information Gap – 5 April 2019,
Biel/Bienne, Switzerland
Vol. 258. A. Shabo (Shvo), I. Madsen, H.-U. Prokosch, K. Häyrinen, K.-H. Wolf, F. Martin-
Sanchez, M. Löbe and T.M. Deserno (Eds.), ICT for Health Science Research –
Proceedings of the EFMI 2019 Special Topic Conference
Vol. 257. F. Lau, J.A. Bartle-Clar, G. Bliss, E.M. Borycki, K.L. Courtney, A.M.-H. Kuo,
A. Kushniruk, H. Monkman and A.V. Roudsari (Eds.), Improving Usability, Safety
and Patient Outcomes with Health Information Technology – From Research to
Practice
Edited by
Dieter Hayn
Digital Health Information Systems, Center for Health & Bioresources, AIT
Austrian Institute of Technology GmbH, Graz, Austria
Alphons Eggerth
Digital Health Information Systems, Center for Health & Bioresources, AIT
Austrian Institute of Technology GmbH, Graz, Austria
and
Günter Schreier
Digital Health Information Systems, Center for Health & Bioresources, AIT
Austrian Institute of Technology GmbH, Graz, Austria
This book is published online with Open Access and distributed under the terms of the Creative
Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
Publisher
IOS Press BV
Nieuwe Hemweg 6B
1013 BG Amsterdam
Netherlands
fax: +31 20 687 0019
e-mail: [email protected]
LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.
Preface
Since its beginning in 2007, the dHealth conference series has been organized by the Austrian
Working Group of Health Informatics and eHealth. Each year, this event attracts around
300 participants from academia, industry, government and healthcare organizations.
In keeping with its interdisciplinary mission, the dHealth conference series provides
a platform for researchers, practitioners, decision makers and vendors to discuss innovative
health informatics and eHealth solutions that improve the quality and efficiency of
healthcare through digital technologies.
The special topic of dHealth 2019 was “from eHealth to dHealth”, stressing that
healthcare will become more and more data-driven in the future. While eHealth generally
concerns healthcare IT solutions for professional healthcare providers, dHealth addresses
broader fields of application in all areas of life, including sensors, networks, genomics
and bioinformatics, data-centered solutions, machine learning, etc.
The present proceedings give insights into the state of the art of different aspects of
dHealth, including the design and evaluation of user interfaces, patient-centered solutions,
electronic health/medical/patient records, machine learning in healthcare and biomedical
data analytics. These topics address the data path “from sensors to decisions”, providing
an interdisciplinary approach to digital health, including aspects of biomedical and sen-
sor informatics.
Dieter Hayn
Alphons Eggerth
Günter Schreier
Reviewers
We would like to thank all reviewers for their significant contribution to the proceed-
ings of the dHealth 2019:
Contents
Preface v
Dieter Hayn, Alphons Eggerth and Günter Schreier
Scientific Programme Committee vii
Reviewers viii
Abstract. Diabetes mellitus (DM) is a chronic disease that affects many people in
Switzerland and around the world. Once diagnosed, a patient has to continuously
monitor blood glucose, manage medications or inject insulin. Technical skills and
competencies as well as knowledge on disease management have to be acquired
right after being diagnosed. Diabetes consultants support patients in this process and
provide educational material. While the process of generating patient-tailored
material is currently complex and time-consuming, the eDiabetes platform can help in
the future. The platform, developed in cooperation with the consulting section of the
Swiss Diabetes Society, offers the opportunity to create individual patient
information and instructions to teach technical skills and knowledge on diabetes.
Further, an integrated forum allows exchanging information and discussing issues
regarding diabetes counselling on a secure platform. Usability tests showed that
eDiabetes is easy to use and provides benefits for diabetes consultants and patients.
1. Introduction
Diabetes mellitus (DM) is a chronic disease that affects many people in Switzerland and
around the world. DM causes blood glucose levels to increase (hyperglycemia). A recent
forecast of the International Diabetes Federation predicts that in 2045 more than 625
million people worldwide will suffer from diabetes [1]. There is even an increasing
prevalence of type 2 DM in children and adolescents around the world [2]. Reasons for
the overall increase are an ageing society, obesity and lack of physical activity in people
[2]. As in other diseases, education plays a key role in the treatment of DM and in the
management of the condition by the patients themselves. The treatment success relies
heavily on patient accountability and awareness of the restrictions imposed by the
condition, in addition to the need for patients to manage their glucose levels. To avoid
complications and comorbidities, it is crucial that the newly diagnosed patient learns
about the disease, about the events that can occur and the therapeutic management of the
disease. There are mobile applications available, such as “mySugr”, that aim at supporting
the individual disease management, but support for diabetes consultants in creating
individualized education material is still missing. Patient education for diabetes aims at
1 Corresponding Author: Kerstin Denecke, Bern University of Applied Sciences, Quellgasse 21, 2501 Biel, Switzerland, E-Mail: [email protected]
2. Related Work
Several educational interventions have been tested in patients with DM. Nevertheless, a
universally effective model for patients is still unavailable [5]. Health education is
recognized as an effective self-management capacity building tool, in which patients are
empowered to play an active role in the management of their conditions. The main four
pillars for health education are: 1) empowering individuals, 2) leadership, 3) motivation
and 4) education and information [6]. All patients with type 1 diabetes and 20-30% of
those with type 2 diabetes require insulin via daily subcutaneous injections. The technique
is not complex, but many patients forget steps or inject insulin incorrectly [7]. Education demands a lot from
health care providers and includes specific training, teaching skills and motivation of
patients [7]. Initiating education for patients newly diagnosed with diabetes and providing
information on self-management skills to help ensure safe post-discharge care are some
of the suggested strategies for DM patient education [8].
consultants from a local hospital. Given the restricted time for the project (February 2018
to June 2018), only a test with a limited number of participants could be realized.
The eDiabetes platform integrates a community platform. To select an appropriate
platform, we performed a value benefit analysis for integrating a blog, a wiki or a forum.
We formulated 10 criteria for required functionalities as collected in the requirement
analysis. They include functionalities such as exchanging images and information,
searching for postings, creating postings, and exchanging private messages with colleagues.
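The selection among blog, wiki and forum can be illustrated with a small weighted-scoring sketch. The criterion names, weights and scores below are hypothetical, since the paper only states that 10 weighted criteria from the requirement analysis were used.

```python
# Sketch of a value benefit analysis comparing blog, wiki and forum.
# Criteria, weights and scores are illustrative assumptions.

CRITERIA = {  # criterion -> weight
    "exchange images": 3,
    "search postings": 2,
    "create postings": 3,
    "private messages": 2,
}

# score per candidate tool and criterion (0 = not supported .. 3 = fully supported)
SCORES = {
    "blog":  {"exchange images": 2, "search postings": 2, "create postings": 3, "private messages": 0},
    "wiki":  {"exchange images": 2, "search postings": 3, "create postings": 2, "private messages": 0},
    "forum": {"exchange images": 3, "search postings": 3, "create postings": 3, "private messages": 3},
}

def value_benefit(scores: dict) -> int:
    """Weighted sum over all criteria."""
    return sum(CRITERIA[c] * s for c, s in scores.items())

best = max(SCORES, key=lambda tool: value_benefit(SCORES[tool]))
print(best)  # -> "forum", matching the choice reported in the results
```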
4. Results
4.1. Requirements
understandable manual when using blood glucose meters, insulin pens, and lancing
devices. It would also be helpful if the leaflets or instructions for use were more visual and
contained less text.
As the best-suited community tool to be integrated into eDiabetes, a forum was selected
based on the results of the value benefit analysis. Thus, as a second functionality of
the eDiabetes platform, the open source forum software MyBB (https://mybb.com/) was
integrated. The forum allows experts to discuss with each other, exchange information,
discuss diabetes issues or ask for a second opinion. The forum is open to all members of
the Swiss Diabetes Society.
Figure 2. eDiabetes platform: For instructions on devices, original image material from the manufacturer is
required. This can be uploaded to a database that can be accessed by the eDiabetes platform.
The platform is running on a web server, set up as a LAMP stack: Apache 2.4.18,
MySQL 5.7.2, PHP 7.0.22 and phpMyAdmin 4.5.4.1. For developing
the views of eDiabetes, JavaScript, CSS and PHP were used.
The underlying information material on devices originates from the manufacturers
of glucose meters and insulin pens. In the course of the project, 11 manufacturers were
contacted to get approval for using the original images in our prototype. An update of the
material is required when new devices are put on the market or existing devices are
modified. To realize this, our concept foresees that the manufacturer can upload
the image to a database from which the eDiabetes platform collects the image files (Fig. 2).
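A minimal sketch of the device-image store behind this concept could look as follows, using SQLite for brevity. The table layout and function names are hypothetical, as the paper does not describe the database schema.

```python
# Sketch of the device-image store: manufacturers upload image files to a
# database, and the eDiabetes platform retrieves them when a consultant
# assembles a leaflet. Schema and function names are hypothetical.
import sqlite3

con = sqlite3.connect("device_images.db")
con.execute("""CREATE TABLE IF NOT EXISTS device_image (
                   manufacturer TEXT, device TEXT, step INTEGER, image BLOB)""")

def upload_image(manufacturer: str, device: str, step: int, path: str) -> None:
    """Called on the manufacturer side to store one instruction image."""
    with open(path, "rb") as f:
        con.execute("INSERT INTO device_image VALUES (?, ?, ?, ?)",
                    (manufacturer, device, step, f.read()))
    con.commit()

def images_for_device(manufacturer: str, device: str) -> list[bytes]:
    """Called by the platform to collect all images for one device, in step order."""
    rows = con.execute("""SELECT image FROM device_image
                          WHERE manufacturer = ? AND device = ? ORDER BY step""",
                       (manufacturer, device))
    return [r[0] for r in rows]
```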
Figure 3. Screenshot of the drag and drop functionality to create a user manual specifically for a patient. The
images can be arranged in the order needed by the patient.
Usability in our context concerned, on the one hand, the interaction with the eDiabetes
platform and, on the other hand, the appropriateness of the individualized leaflets that are
generated with the platform. The feedback on the eDiabetes platform by the diabetes
consultants was very positive; they confirmed that the platform is simple, clear, and easy
to use. They felt comfortable in interacting with the platform. However, two test persons
had difficulties in interacting with the platform. Further assessment of this issue showed
that these two persons were older than the others and use the computer less
often during their work. Nevertheless, the survey showed that interacting with the
eDiabetes platform is easy to learn without external help or a user manual. Test persons
claimed that they liked the platform, because it provides many possibilities for extensions.
They even asked for more options to add individual descriptions.
Patients, on the other hand, confirmed that all relevant information is available on the
leaflets. The personalized leaflets help them, since the image-based process descriptions
provide a step-by-step presentation of the instructions which is clear, and all relevant
information is summarized briefly and concisely. They stated clearly that personalized
leaflets are better than the original instructions for use. In addition, information sheets
are more understandable for foreign-language patients than text-based leaflets. However,
advice on what to pay attention to when using one of the devices is still missing.
References
[1] D. Schillinger, K. Grumbach, J. Piette, et al. Association of Health Literacy With Diabetes Outcomes.
JAMA. 2002;288(4):475–482. doi:10.1001/jama.288.4.475
[2] T. Chomutare, L. Fernandez-Luque, E. Årsand, G. Hartvigsen: Features of Mobile Diabetes Applications:
Review of the Literature and Analysis of Current Applications Compared Against Evidence-Based
Guidelines. J Med Internet Res. 2011 Jul-Sep; 13(3): e65.
[3] Thomas Reinehr: Type 2 diabetes mellitus in children and adolescents. World J Diabetes. 2013 Dec 15;
4(6): 270–281.
[4] N.H. Cho, J.E. Shaw, S. Karuranga, Y. Huang, J.D. da Rocha Fernandes, A.W. Ohlrogge, B. Malanda: IDF
Diabetes Atlas: Global estimates of diabetes prevalence for 2017 and projections for 2045. Diabetes Res
Clin Pract. 2018 Apr;138:271-281. doi: 10.1016/j.diabres.2018.02.023. Epub 2018 Feb 26.
[5] L. Haas, M. Maryniuk, J. Beck, C.E. Cox, P. Duker, L. Edwards, et al.; Standards Revision Task Force.
National standards for diabetes self-management education and support. Diabetes Care 2014;37:S144-
53. DOI: http://dx.doi.org/10.2337/dc14-S144
[6] R.C.C. Iquize, F.C.E.T Theodoro, K.A. Carvalho, M.A. Oliveira, J.F. Barros, A.R.D. Silva : Educational
practices in diabetic patient and perspective of health professional: a systematic review. J Bras Nefrol.
2017 Apr-Jun;39(2):196-204. doi: 10.5935/0101-2800.20170034.
[7] A. Maldonato, D. Bloise, M. Ceci, E. Fraticelli, F. Fallucca . Diabetes mellitus: lessons from patient
education. Patient Educ Couns. 1995 Sep;26(1-3):57-66.
[8] A.T. Nettles: Patient education in the hospital. Diabetes Spectrum, 2005, 18(1), 44-48
[9] G.T. McMahon, H.E. Gomes, S. HicksonHohne, T.M. Hu, B.A. Levine, et al. Web-based care management
in patients with poorly controlled diabetes. Diabetes Care, 2005, 28: 1624-1629.
[10] F.A. Mersal, N.E. Mahday, N.A. Mersal. Efficiency of Web-Based Education versus Counselling on
Diabetic Patients' Outcomes. Lambert Academic Publishing, Saarbrücken, 2012
[11] R. Rachmani, Z. Levi, I. Slavachevski, M. Avin, M. Ravid. Teaching patients to monitor their risk factors
retards the progression of vascular complications in high-risk patients with Type 2 diabetes mellitus--a
randomized prospective study. Diabet Med. 2002;19(5):385–92
[12] S. Izahar, Q.Y. Lean, MA. Hameed, et al. Content Analysis of Mobile Health Applications on Diabetes
Mellitus. Front Endocrinol (Lausanne). 2017;8:318. Published 2017 Nov 27.
doi:10.3389/fendo.2017.00318
[13] J. Prümper, M. Anft: Die Evaluation von Software auf Grundlage des Entwurfs zur internationalen
Ergonomie-Norm ISO 9241 Teil 10 als Beitrag zur partizipativen Systemgestaltung - ein
Fallbeispiel. Software-Ergonomie '93, Stuttgart: Teubner, 1993
[14] J. Vandenbosch,, S.V. den Broucke, L. Schinckus, et al. The impact of health literacy on diabetes self-
management education. Health Education Journal, 77(3), 2018, 349–362.
https://doi.org/10.1177/0017896917751554
[15] K. Martin, L. Carter, D. Balciunas, F. Sotoudeh, D. Moore, J. Westerfield. The impact of verbal
communication on physician prescribing patterns in hospitalized patients with diabetes. Diabetes Educ.
2003 Sep-Oct;29(5):827-36
[16] SM. Gillani, A. Nevill, BM. Singh. Provision of structured diabetes information encourages activation
amongst people with diabetes as measured by diabetes care process attainment: the WICKED Project.
Diabet Med. 2015 Jul;32(7):865-71
dHealth 2019 – From eHealth to dHealth
D. Hayn et al. (Eds.)
© 2019 The authors, AIT Austrian Institute of Technology and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/978-1-61499-971-3-9
Abstract. Standard toilets in Western countries often do not meet the needs of
elderly and disabled people with physical limitations. While the existing concept
of barrier-free toilets and the emerging “changing places” concept offer more
space and support, the fixed height of the toilet seat still imposes a major problem
during all phases of toilet use and can limit the users’ autonomy by requiring
personal assistance. Thus, in the EU project iToilet an innovative ICT-based
modular height adjustable toilet system was designed to support the autonomy,
dignity and safety of older people living at home by digital technology
enhancements adapting the toilet to their needs and preferences. The main
requirements were: double foldable handrails, height and tilt adjustment,
emergency detection and call, and ease of use. The ICT component in this
approach serves a double purpose of enhancing usability of the base assistive
technology while at the same time providing safety for independent use. A field
test of a prototype system in real environments of a day care center and a
rehabilitation clinic has been successfully finished. The application of the iToilet
concept also in semi-public settings is currently studied in the Toilet4me project.
Keywords. AAL, toileting, autonomy, care, smart toilet, robotic toilet, barrier-free
toilet
1. Introduction
Standard toilets in Western countries often do not meet the needs of elderly and
disabled people [1, 2, 3]. Therefore, for individuals with physical disabilities, barrier-free
toilet concepts have been introduced, which provide more room, e.g. for wheelchairs or
assistants, and a raised toilet seat of fixed height with grab bars for support during transfer
and when sitting. Recently, further improvements, such as a changing bench and a hoist,
have been proposed by the “Changing Places” consortium [4], mainly in the UK. Still, such
concepts are difficult to implement in users’ homes, and they do not solve one of the
main challenges, the fixed toilet height during all phases of toilet use, which might be
unsuitable and prevent fully-autonomous toilet use or transfer. Therefore, in the EU
project iToilet an innovative ICT-based modular height adjustable toilet system was
designed to support the autonomy, dignity and safety of older people living at home by
digital technology enhancements adapting the toilet to their needs and preferences.
A main motivation of Assistive Technology design has always been to support
autonomy – at least in the sense that there are alternatives between personal assistance
and technological support to choose from. Such independence can be accomplished by
1 Corresponding Author: Paul Panek, Institute of Visual Computing and Human-Centered Technology, TU Wien, Favoritenstrasse 11/193-05b, A-1040 Vienna, Austria, E-Mail: [email protected]
be not “uphill”, i.e. the toilet height should be slightly higher than or equal to
the wheelchair seat height.
• A raised toilet seat and armrests support the users in standing upright and
again facilitate the process of dressing.
An assistive, height-adjustable toilet (in the form of an ICT-enhanced robotic toilet)
should therefore support three to four different individual heights in the range of
ca. 35 to 80 cm during the toileting process. The individual settings can be retrieved
from a database, thus making it possible to use one adaptive toilet for many users, e.g.
in institutional settings with IT infrastructure.
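Such a per-user settings lookup could, for instance, be sketched as follows. The profile fields and example values are illustrative assumptions, since the text only specifies the supported height range and the number of stored heights.

```python
# Sketch of retrieving individual toilet settings from a database so that one
# adaptive toilet can serve many users. Field names and values are hypothetical;
# only the ca. 35-80 cm range and the 3-4 stored heights come from the text.
from dataclasses import dataclass

@dataclass
class ToiletProfile:
    user_id: str
    transfer_height_cm: int   # height for wheelchair transfer ("not uphill")
    sitting_height_cm: int    # comfortable height while seated
    standup_height_cm: int    # raised height supporting standing up
    tilt_deg: int             # forward tilt of the seat

PROFILES = {  # in an institutional setting this would be a shared database
    "user-17": ToiletProfile("user-17", 52, 46, 68, 8),
}

def settings_for(user_id: str) -> ToiletProfile:
    """Look up the stored preferences for one user."""
    return PROFILES[user_id]
```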
In the remainder of the paper, chapter 2 outlines the iToilet project approach
covering user requirement gathering, prototype development and evaluation for home
and institutional use. Chapter 3 provides an overview of the Toilet4me study project on
a possible extension of the iToilet concept from use in home and institutional settings
to the area of semi-public spaces.
In the iToilet project, presumptive users (n=41, all with mobility restrictions but able to
walk with a technical aid) [6, 7] were first asked to rank the relevance of problems
related to toilet use, based on their personal experience, in order to corroborate the project
assumptions right at the start (see Table 1).
The same questions were also given to 21 secondary users (caregivers) and 12 tertiary
users (managers of health care organizations and insurances). While primary users
generally gave lower scores to problems than secondary and tertiary users, the highest
ranked problems all dealt with the height of the toilet and the transfer to and from the
seat, followed by hygiene issues.
Moreover, a set of questions regarding proposed support functions of an ICT-enhanced
toilet was ranked by the same users. Here, fall detection, emergency
recognition and custom settings were the most favored functions. Overall, the idea of
iToilet to provide ICT-enhanced physical support for independent and safe use was
found to be clearly consistent with existing problems and was appreciated by the users. The
users' ranking of iToilet functions is documented in more detail in [6, 8] and led to
prioritized features for prototyping.
While all of the high priority requirements (see Table 1) were implemented in the
iToilet prototypes, some of the medium priority items were either only tested in the
laboratory (automatic dispensing of toilet paper) or left for the design of a future
product (self-sanitizing seat and bowl, shelf/tray area, individually formed toilet seat,
etc.), for which commercial solutions are already available. This approach allowed the
consortium to focus on the most important user needs when designing, implementing
and testing prototypes.
Based on the requirement lists (Table 1), two different prototypes (PT) were developed,
both based on the existing sanitary products “Lift-WC” and “mobile toilet chair” of the
company Santis Kft. [9] and sharing the same interaction concept [10] and base functionality:
a chair-like prototype of a motorized stand-up support, “PT1” (see Fig. 1, left), which can
easily be placed over any existing toilet bowl (for single users at home, appropriate for
temporary use without complex installation, as only the seat is movable), and another
prototype, “PT2”, based on a wall-mounted base toilet system where the whole toilet
including the bowl is movable. PT2 requires more installation effort and might be more
suitable for use in multi-user settings of institutions (see Fig. 1, right).
Figure 1. Chair-like iToilet prototype PT1 (left) and wall mounted prototype PT2 (right) as installed for user
tests (see section 2.3 iToilet Field Tests).
Despite the taboo area of toilet and personal hygiene, considerable involvement of
users in participatory design activities could be accomplished [11, 12, 13]. In the
design of toilet paper dispensers, speech recognition, different buttons and armrests, the
users actively contributed their ideas to the development in the form of co-design
activities.
Base system:
• Adjustable motorized height and tilt of seat (approx. 40-75 cm height and 0-10 or 30 degrees tilt, respectively), position monitoring by sensors
• Armrests for support on both sides, foldable for wheelchair transfer
• Manual operation by a user interface with big buttons featuring clear symbols and tactile feedback (on both sides)
• Motor amperage draw monitoring and safety contacts for collision detection
• Wireless configuration via microcontroller, WiFi and MQTT (see next item)
As captured by this list, ICT together with the mechanical base modules not only
allows hands-free control of functions via speech recognition, but also enables the
system to perform actions like automatic changes of position or flushing based on user
preferences, while the emergency recognition component monitors safe use;
instructions can be given to users by the system or via the GSM voice link if needed.
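As an illustration of the wireless configuration path listed above, the following sketch publishes a position command via MQTT with the Python paho-mqtt client. The broker address, topic name and payload format are hypothetical, as the paper does not specify them.

```python
# Sketch of sending a control/configuration message to the toilet's
# microcontroller over Wi-Fi and MQTT. Broker, topic and payload are made up.
import json
import paho.mqtt.client as mqtt  # paho-mqtt 1.x style API

client = mqtt.Client()
client.connect("itoilet-broker.local", 1883)  # hypothetical broker on the LAN
client.loop_start()                           # background network loop

command = {"seat_height_cm": 68, "tilt_deg": 8}  # move to stand-up position
info = client.publish("itoilet/pt2/command/position", json.dumps(command), qos=1)
info.wait_for_publish()                       # block until the message went out

client.loop_stop()
client.disconnect()
```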
The PT2 system was tested autonomously under real daily-use conditions by a total of
50 users, 23 clients of a Multiple Sclerosis day care center and 27 patients of a
rehabilitation clinic [14], during a period of at least 4 weeks, after approval by the
appropriate research boards had been granted. Testing in institutions was chosen
because of easier access to the necessary number of users with a fitting profile and the
better support for users in case of problems.
After this four-week experience with the toilet, during the final interviews all
provided functions (cf. Section 2.2) were rated as useful by 75 to 100% of users, except
for the bidet function, which was not so well accepted at the day care center, and the
recognition of spoken commands (as an alternative to pressing buttons), which achieved
only around a 60% rating at both test sites. The reliability of the prototype was
rated high for the core support functions, but some additional features like the speech
recognition or the emergency monitoring were criticized because of too many false
positives (caused by the initially underestimated range of behaviors).
During the field trials, around 500 toilet visits were logged. The use patterns,
ranging from 5 minutes (on average) to sometimes 30 minutes, clearly showed that
individual, higher sit-down and stand-up heights were preferred by the majority of
users, which is in line with the good ratings from the interviews. The user identification by
RFID tag was technically reliable but not always easy to use, because the
participants were required to hold the personal tag against a reader. For plain home use
by single users this will not be necessary; for institutional use, additional, more
ergonomic methods of user identification should be investigated.
and their needs when using a toilet outside home in public or semi-public environments
(e.g. in community centers, shopping malls, theatres, hotels etc.).
The main idea of Toilet4me is simple but challenging: as iToilet already
demonstrated the benefits of supporting people during toilet use at home (or in a care
institution), we now want to proceed and explore the feasibility of this type of
supportive toilet in places outside one's own home. Offering the support in places which
people frequently visit, or would like to visit if appropriate toilet facilities were
available, should allow people to participate more closely in society, which should contribute
to their independence and quality of life. A service or technical solution which allows
users to always “take their own preference settings with them in their pocket”, in the
form of a digital personal use profile, can enable many new possibilities for several
user groups, inside and outside the home. Toilet4me, together with end users (older or
disabled people, their caregivers and managers of public places and hotels), will
elaborate the requirements for such a service.
It is expected that the principles of accessible toilets for home or institutional use
can also be applied in semi-public settings, but of course challenges like costs for
installation, maintenance and service, as well as suitable methods for the safe and easy
exchange of preference data, have to be solved. The Toilet4me project shall deliver
facts for informed estimations of the chances of a successful market introduction.
iToilet has demonstrated that ICT-enhanced physical support can assist people in using
a toilet without personal help while safety is preserved. Accessible and barrier-free
toilets are important for the autonomy of older or disabled users who otherwise would
be without choice and dependent on personal assistance in this taboo area. This affects
people both in their daily life at home and when going out and participating
in social life, as they are often guided by the availability of suitable toileting facilities.
While toilet rooms with traditional barrier-free design are a step in the right
direction and further improvements towards “changing places” lower the barriers for
many users, additional motorized support for an optimum seat height during all phases of
toilet use and for people of all body sizes can be an essential element for supporting
many old or disabled people. ICT enhancements can aid the operation, and enhanced
safety features can give users a feeling of safety even when no personal assistant is
within reach. Digital technologies can add a smart adaptive layer to the assistive base
technology which empowers users to customize the assistive service according to
their individual needs wherever they are and to select the level of safety they prefer, and
at the same time they help providers to integrate the technology-based support smoothly into
modern (health-)care services. Thereby a sound individual balance between desired
autonomy and personal assistance can be achieved in a location-independent way.
Well-designed, “really” barrier-free smart and adaptive care toilets, enabling
personalized assistive settings and integration with other health services, might also
open new market fields, from the economically underdeveloped core AAL market
towards accessible tourism, not only for public places like theaters or museums but also
for hotels and recreation or wellness sites. This is currently under investigation in the
Toilet4me project, as outlined above.
Acknowledgement
References
[1] J. F.M. Molenbroek, J. Mantas, R. De Bruin (eds.), A Friendly Rest Room: Developing toilets of the
future for disabled and elderly people, IOS press, Amsterdam, 2011.
[2] P. Chamberlain, H. Reed, M. Burton, G. Mountain, ‘Future Bathroom’, What to make? Or How to
Make? Challenges in meeting sustainable needs. In: Sustainable Intelligent manufacturing, IST Press,
Portugal, 2011, pp. 777-784.
[3] A. Kira, The Bathroom, Viking Press, New York, 1976.
[4] Changing Places Consortium, Changing Places: The Practical Guide, 2013, online: http://www.changing-
places.org/ [last access: 31 Jan 2019].
[5] F. Güldenpfennig, P. Mayer, P. Panek, G. Fitzpatrick, An Autonomy-Perspective on the Design of
Assistive Technology: Experiences of People with Multiple Sclerosis, ACM CHI Conf on Human
Factors in Computing Systems (CHI 2019), May 4-9, 2019, Glasgow, Scotland, UK (to appear).
[6] T. Pilissy, A. Tóth, G. Fazekas, A. Sobják, R. Rosenthal, T. Lüftenegger, P. Panek, P. Mayer, Towards a
situation-and-user-aware multi-modal motorized toilet system to assist older adults with disabilities: A
user requirements study, 15th IEEE Intern Conf. on Rehabilitation Robotics (ICORR), QEII Centre,
London, UK, July 17-20, 2017, pp. 959-964, DOI: 10.1109/ICORR.2017.8009373.
[7] P. Panek, G. Fazekas, T. Lueftenegger, P. Mayer, T. Pilissy, M. Raffaelli, A. Rist, R. Rosenthal, A.
Savanovic, A. Sobjak, F. Sonntag, A. Toth, B. Unger, On the Prototyping of an ICT-Enhanced Toilet
System for Assisting Older Persons Living Independently and Safely at Home, Studies Health
Technology Informatics, vol. 236, IOS press, DOI 10.3233/978-1-61499-759-7-176, 2017, pp. 176-183.
[8] A. Sobják, T. Pilissy, G. Fazekas, A. Tóth, R. Rosenthal, T. Lüftenegger, P. Mayer, P. Panek, iToilet
project deliverable D1.1 (public version). User Requirements Analysis showing three priority level,
2016, http://www.itoilet-project.eu, last access: 20.3.2019.
[9] Sanitary company Santis Kft., Debrecen, Hungary, http://www.santis.org/, last access: 20.3.2019.
[10] P. Panek, P. Mayer, Initial Interaction Concept for a Robotic Toilet System, Proc of the Companion of
the ACM/IEEE Intern Conf on Human-Robot Interaction (HRI 2017), March 6-9, 2017, Vienna,
Austria, doi: 10.1145/3029798.3038420, pp. 249-250.
[11] P. Mayer, P. Panek, Involving Older and Vulnerable Persons in the Design Process of an Enhanced
Toilet System, ACM CHI Conf on Human Factors in Computing Systems (CHI 2017), Denver,
Colorado, May 6-11, 2017 doi: 10.1145/3027063.3053178, pp. 2774 – 2780.
[12] R. Rosenthal, F. Sonntag, P. Mayer, P. Panek, Partizipation als Instrument zur Optimierung der
Selbstwirksamkeit für Menschen mit der Diagnose Multiple Sklerose im Rahmen des EU Projektes
iToilet, Poster, Pflegekongress, Austria Center Wien, 30 Nov – 1 Dec, 2017.
[13] P. Panek, P. Mayer, Ethics in a Taboo-Related AAL Project, in: F. Piazolo, St. Schlögl (eds.),
Innovative solutions for an ageing society, proc of Smarter Lives 18 conf, 20 Feb 2018, Innsbruck,
Pabst Science Publishers, Lengerich, ISBN: 978-3-95853-413-1, pp. 127-133.
[14] G. Fazekas, et al., Assistive technology in the toilet - Field test of an ICT-enhanced lift-WC,
accepted for 15th EFRR Congress 2019, April 15-17, 2019, Berlin, Germany (to appear).
[15] A. Manzeschke, K. Weber, E. Rother, H. Fangerau, Ergebnisse der Studie „Ethische Fragen im Bereich
Altersgerechter Assistenzsysteme“, Berlin, (VDI/VDE), 2013.
dHealth 2019 – From eHealth to dHealth
D. Hayn et al. (Eds.)
© 2019 The authors, AIT Austrian Institute of Technology and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/978-1-61499-971-3-17
1. Introduction
1 Corresponding Author: Michael Schmucker, Heilbronn University of Applied Sciences, Max-Planck-Str. 39, 74081 Heilbronn, Germany; E-Mail: [email protected].
resuscitated, for example, after five initial ventilations in a 15:2 rhythm (thoracic
compression to ventilation), while adults start in a 30:2 rhythm. Intubation is also more
difficult due to the anatomical characteristics [4]. However, the most frequent errors
occur in the dosage of medication. Children are particularly susceptible here because the
dose must be calculated or estimated individually, depending on their weight. As Young
and Korotzer have pointed out in their systematic analysis, parental estimation is the
most accurate method for determining weight, followed by the size-based method, where
weight is derived from the child's height using survey data (e.g. the German Health Interview
and Examination Survey for Children and Adolescents (KiGGS) [5]). Medical doctors'
weight estimates are not accurate [6]. But even if the weight is known, calculation errors
occur due to nervousness or haste. In a study by Hoyle et al., 125 out of 360
prescriptions were made with dosage errors. These errors happen frequently especially
in preclinical environments. The reason for this is probably the lack of experience of
emergency paramedics or emergency physicians with pediatric emergencies [7]. In a
further study with simulated resuscitation, doses incorrect by a factor of 10 (1000% of
the recommended dose) occurred in one of 32 prescriptions - these can pose a life-
threatening risk [8]. Young, inexperienced physicians, who make up the majority of
emergency physicians, are particularly susceptible [9]. Based on these facts, physicians
want electronic tools, such as a computer program or a calculator, because they can
demonstrably minimize calculation errors [8][10]. Such a computer program could
directly perform all necessary calculations, be it the dosage of drugs or the current
strength of the defibrillator. But the weight or height still has to be known first. Even
better, of course, would be the automatic recognition of the weight or size. This means
that calculation errors can be ruled out if the detection is error-free. At the moment, so-
called emergency rulers (e.g. Broselow Tape [11] or PediaTape [12]) are the most
important aids alongside the guidelines. An emergency ruler is placed next to the head
of a child. A color code can then be read off at the feet, which can be used to indicate
dosage recommendations or age-appropriate reactions, e.g. for the Glasgow Coma Scale.
This information is often stored in a brochure supplied with the ruler (Figure 1). The
dosage recommendations may also be printed directly on the tape.
Technically, size recognition is possible with the help of depth cameras, some of
which are built into smartphones (Asus Zenfone AR, Lenovo Phab2 Pro) or head-
mounted displays (Microsoft HoloLens). These cameras are designed for augmented
reality (AR) applications and are necessary for placing AR elements in a room as
accurately as possible. These cameras can be programmed with the Tango framework from
Google [13], among others. With the help of the depth camera and the Google Tango
framework it is possible to perform accurate measurements. The depth camera scans the
room and gets to know the user's environment. Thus, it is possible to carry out the
measurements within one to two seconds from any point and at any angle near the object
to be measured. But the question is, how accurate are these cameras in reality? Only
exact cameras are suitable for this type of application. This paper presents a study in
which 33 children aged between 3 and 6 years were measured with an emergency ruler
and an augmented reality app on a smartphone with a depth camera. The results were
then compared. The aim was to find out to what extent the size measurement works
reliably and thus whether further work in this research area is meaningful.
2. Methods
First, a systematic PubMed and Google Scholar search according to the PRISMA scheme
was conducted to investigate whether similar works or important preliminary works
already exist.
Subsequently, a study design was developed that allows comparing an augmented
reality app on a standard smartphone with a depth camera (Asus Zenfone AR) with an
emergency ruler (PediaTape). For this purpose, an augmented reality app was written
which uses libraries of the Google Measure App [14], an already existing app for
measuring lengths. Using the Google Measure App (Tango version) directly was
unfortunately not possible because it often rounds off the results.
Afterwards, 33 children aged between 3 and 6 years were measured one after the
other, lying on the floor in an interior room with daylight, first with the PediaTape, then with
the Zenfone AR. The two values were recorded. For further research and quality
assurance, the weight of each child (with clothing) was also noted. It is a within-group
design with one independent variable (measurement device) that adopts two values
(emergency ruler, augmented reality app). An exemplary measurement is shown in
Figure 2. For data protection reasons, the measurement for the illustration is simulated
with a doll.
Figure 2. Screenshot of the re-enactment of the measurement with PediaTape and Augmented Reality App.
S(x, y) = ((S1 + S2) / 2, (S1 - S2))    (1)
The upper and lower limits of agreement (LOA) are defined as d̄ ± 1.96s at a
significance level of α = 0.05, where d̄ represents the mean difference and s the standard
deviation of the pairwise differences. If 95% of the measurements lie within the LOA, both
methods can be considered interchangeable, i.e. both methods are equally appropriate
[17].
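A minimal sketch of this Bland-Altman computation, with made-up sample values rather than the study data, could look as follows:

```python
# Sketch of the Bland-Altman computation from Eq. (1): for each child, the mean
# of both measurements is plotted against their difference, and the limits of
# agreement are d_mean ± 1.96 s. Sample values are illustrative only.
import numpy as np

s1 = np.array([104.0, 98.5, 111.0, 95.0])  # PediaTape readings (cm), illustrative
s2 = np.array([103.0, 99.5, 109.5, 96.0])  # AR app readings (cm), illustrative

mean = (s1 + s2) / 2   # x-axis of the Bland-Altman plot
diff = s1 - s2         # y-axis of the Bland-Altman plot

d_mean = diff.mean()        # mean difference
s = diff.std(ddof=1)        # SD of the pairwise differences
loa_lower, loa_upper = d_mean - 1.96 * s, d_mean + 1.96 * s
print(f"LOA: [{loa_lower:.2f}, {loa_upper:.2f}] cm")
```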
To check whether a Bland-Altman plot might be applicable, a one-sample t-test comparing
the mean of the differences (S1-S2) to the reference value zero was executed as
preliminary work. In the optimal case, the difference of the individual measurements
would be zero - then both measurements would be identical. The following hypothesis
is tested:
H0: There is no difference between an augmented reality app running on the Asus
Zenfone AR and a PediaTape emergency ruler in the quality of
measuring.
If there are large deviations, this means that there is a significant difference between
the two measurement methods and a Bland-Altman plot would not be applicable.
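The preliminary check can be sketched as a one-sample t-test, e.g. using SciPy; the difference values below are illustrative, whereas the paper reports t = -1.022 and p = 0.314 for the real data (see Section 3).

```python
# Sketch of the preliminary one-sample t-test: the pairwise differences S1 - S2
# are tested against the reference value zero. Values are illustrative.
import numpy as np
from scipy import stats

diff = np.array([1.0, -1.0, 1.5, -1.0])  # S1 - S2 per child, illustrative
t_stat, p_value = stats.ttest_1samp(diff, popmean=0.0)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")  # p > 0.05 -> H0 is retained
```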
3. Results
Although there is some interesting work done with commercially available depth
cameras, such as hand gesture recognition [18], it has never been evaluated how accurate
these cameras actually are. But for this paper and other cases, an exact measurement
is the basis for further applied research. There are also various studies that attempt to
solve the relevant and known problem of dosage errors in pediatric emergencies with the
aid of an app [19][20][21]. But no attempt to automate the process of size recognition -
and thus the basis of all calculations - is available. Some of these apps also use the age-
based formula [19], which, according to the systematic review of Young and Korotzer,
is less accurate than the size-based estimate [6]. All dosage apps found have in common
that the values must be entered manually. However, these metrics must first be known
(age, weight or height). There are also apps that aim to digitally map the analog
emergency guidelines that are available [22].
Table 1 and Table 2 show the results of the one-sample t-test against the test value 0. There is
no significant difference (diff) between the two measurement methods in the quality of
measuring (t = -1.022; p = 0.314). The mean of the individual differences does
not deviate significantly from zero (x̅ = -0.364; s = 2.044). H0 can thus be retained.
Table 1. Difference (diff) between PediaTape and Augmented Reality App (in cm). One sample statistics
(SPSS Output).
N Mean Std. Deviation Std. Error Mean
diff 33 -0.364 2.044 0.356
Table 2. Difference (diff) between PediaTape and Augmented Reality App (in cm). One sample t-test to test
value 0 (SPSS Output).
95% Confidence Interval of the
Difference
t df Sig. (2-tailed) Mean Difference Lower Upper
diff -1.022 32 0.314 -0.364 -1.088 0.361
As can be seen graphically in Figure 3, at least 95% of the measurements are within the
limits of agreement (mean ± 2 standard deviations (SD)). Thus, it can be said that there
is no significant difference in the quality of the two measurement methods in terms of
size detection. In most cases (15 of 33; 45%) there was a deviation of 0 to 1 centimeter;
in 25 cases (75%) the deviation was two centimeters or less. Deviations greater than three
centimeters were rare (3 of 33; 9%). There was no deviation greater than 5 centimeters.
To exclude a proportional bias, a linear regression can be performed. For this purpose, the
mean (m_mean) is tested for hypothesis H:
As can be seen in Table 3, the t value is not significant (t = -1.389; p = 0.175). Thus,
H can be maintained. Within the available data, a proportional bias can be excluded.
Table 3. Linear regression to test for proportional bias for dependent variable difference (diff). SPSS Output.
Unstandardized Coefficients Standardized Coefficients
Model B Std. Error Beta t Sig.
(Constant) 6.783 5.158 1.315 0.198
mean -0.065 0.047 -0.242 -1.389 0.175
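The proportional-bias check can be sketched as a simple linear regression of the differences on the means, e.g. with SciPy; the data below are illustrative, whereas Table 3 reports t = -1.389 and p = 0.175 for the real data.

```python
# Sketch of the proportional-bias check: the difference is regressed on the
# mean of both measurements; a non-significant slope means no proportional
# bias. Values are illustrative, not the study data.
import numpy as np
from scipy import stats

mean = np.array([103.5, 99.0, 110.25, 95.5])  # (S1 + S2) / 2 per child
diff = np.array([1.0, -1.0, 1.5, -1.0])       # S1 - S2 per child

result = stats.linregress(mean, diff)
print(f"slope = {result.slope:.3f}, p = {result.pvalue:.3f}")
```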
4. Discussion
right time. Such intelligent assistance services have, for example, been researched in the
project A.L.I.N.A. [28] funded by the German Federal Ministry of Education and
Research (BMBF). It would also be helpful for operation during an emergency if the
assistance service ran not on a smartphone but on a head-mounted display. This
requires further research, especially in the area of usability.
References
[28] S. Blaschke et al., Intelligent Assistance Services and personalized Learning Environments for
Knowledge-and Action support in the Interdisciplinary Emergency Room, Medizinische Klinik-
Intensivmedizin und Notfallmedizin 2016;111:366-366.
dHealth 2019 – From eHealth to dHealth
D. Hayn et al. (Eds.)
© 2019 The authors, AIT Austrian Institute of Technology and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/978-1-61499-971-3-25
Keywords: data elements, minimum data set, electronic medical record systems,
mental disorders
1. Introduction
Today, healthcare systems have moved towards the use of electronic
technologies such as mobile health and electronic medical records (EMRs) [1]. The use
of these technologies can reduce medical errors, improve the quality of health
services, increase productivity, improve information quality, support clinical
decision-making, reduce healthcare costs, and improve patient-physician communication
and education [2-6].
1 Corresponding Author: Abbas Sheikhtaheri, Health Management and Economics Research Center,
School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran,
E-Mail: [email protected]
Despite the benefits of EMRs, their utilization in the mental health field is more
modest compared to other areas [7]. Today, the very rapid growth of medical
knowledge has led to an increase in the number of clinical specialties; hence, due to the
nature of mental illness, more than one expert is often involved in the treatment of patients
[7]. In such an environment, many medical records may be created by many specialists in
the course of the treatment process. Therefore, patients' data are scattered [6].
There are many barriers and challenges for developing and implementing
electronic systems such as EMRs [6, 8, 9]. In this regard, determining the appropriate
and consensus-based data elements for developing EMRs is important. Data elements
play an important role in collecting and documenting information about patients in
their EMRs [10]. The purpose of a Minimum Data Set, as a set of core health data elements, is to
standardize data items and their definitions [11]. On the other hand, for the purpose of
developing electronic medical record systems for patients with mental disorders and
customizing them according to the needs of psychiatric patients, one of the main steps
is to identify the data elements required by the electronic medical records in this area.
Many promising studies have been done on the design of data elements in different
medical fields [12-14]. Furthermore, there are many studies related to data elements of
EMRs in different fields [10, 15, 16]. In addition, some studies have been conducted on
the design of psychiatric assessment forms or mental illness registry [17, 18].
Organizations such as the UK National Health Service [19] have established the Mental
Health Minimum Data Set (MHMDS), and the Australian Institute of Health and Welfare
(AIHW) [20] has defined 27 such data elements for the medical records of patients with
mental disorders. However, these data elements have not been developed and customized
for electronic systems. Some promising studies have been conducted on the implementation
of EMRs in mental institutions [21-23], but they have not reported data elements for
EMRs. Therefore, there are few studies regarding the data elements required for EMRs
for mental disorders. The aim of our study was to determine the minimum data elements
required for developing EMRs for patients with mental disorders.
2. Methods
This descriptive cross-sectional study was carried out in 2018. To determine the data
elements for EMRs for patients with mental disorders, a literature review [17-20]
was conducted. In the next step, 50 medical records of psychiatric patients were
randomly selected from one of the specialized psychiatric hospitals in Tehran, Iran.
All patients' data were completely anonymized, and the study was approved by the
ethics committee as well. The medical records were selected based on the
mental disorder diagnosis codes of the International Classification of Diseases,
Chapter V (F00-F99). From each of the 10 blocks of this chapter, five records were
randomly selected. Using a checklist, the contents of these records were extracted and
qualitatively analyzed to identify the common data elements used by physicians.
Then, we aggregated and classified the data elements identified from the literature
review and medical records into data classes and subclasses. In the next phase, the data
elements were validated through a survey of psychologists and psychiatrists (Figure 1).
To this end, a questionnaire was designed in two parts. The first part contained the
demographic data of the participants, and the second part was related to the data elements
for EMRs, which were classified into seven data classes. The scale of this
questionnaire had two choices: necessary and unnecessary. The content validity of this
tool was confirmed by three relevant experts (in the fields of mental health, health
information management and medical informatics). We used the Kuder-Richardson
coefficient for its reliability. The paper-based questionnaires were handed to 45 mental
health specialists (psychologists and psychiatrists) with a minimum of 10 years of work
experience in the relevant field. These specialists were selected from three psychiatric
hospitals (15 participants from each hospital). Finally, 33 specialists participated.
Figure 1. The steps of the study (flowchart: literature review and extraction of data elements from medical records → preliminary data elements → classification of data elements → validation process → final data elements, with unnecessary data elements excluded during validation).
Data analysis was done using descriptive statistics and SPSS software, version 20.
After the survey of specialists, the agreement on each of the given data elements was
calculated as a percentage. All data elements with less than 75 percent agreement were
considered unnecessary and excluded [24, 25]. The other data elements
were suggested as the necessary data elements for the electronic medical records of
mental disorders. The ethics committee of Iran University of Medical Sciences, Tehran,
Iran approved this study, and the confidentiality of patients' data was observed.
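The exclusion rule can be expressed as a simple filter, sketched below; the element names and vote counts are hypothetical, and only the 75% threshold and the 33 participating specialists come from the text.

```python
# Sketch of the validation step: for every candidate data element, the share of
# specialists rating it "necessary" is computed, and elements below the 75%
# agreement threshold are excluded. Names and counts are hypothetical.
N_SPECIALISTS = 33
AGREEMENT_THRESHOLD = 0.75

votes_necessary = {            # data element -> number of "necessary" ratings
    "chief complaint": 33,
    "suicide risk assessment": 31,
    "handedness": 12,          # 36% agreement, would be excluded
}

necessary = [elem for elem, votes in votes_necessary.items()
             if votes / N_SPECIALISTS >= AGREEMENT_THRESHOLD]
excluded = sorted(set(votes_necessary) - set(necessary))
print(necessary, excluded)
```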
3. Results
In total, 155 data elements were identified and classified into seven data classes and
10 sub-classes for the EMRs. Of these, a total of 140 data elements were validated by the
participants as necessary data elements for EMRs, as presented in Table 2. Finally, the
data elements obtained for the EMRs are shown in Table 3. The excluded data
elements are shown in Figure 2.
4. Discussion
problems, results, social history and cross-border care have been defined [26]. However,
none of the data elements developed in these countries relate to EMRs for mental
disorders. Although they are valuable, they do not define specialized data elements
for EMRs for mental disorders. In our study, some categories of data and data elements
are similar to those projects; however, more specialized data elements in the field of
mental disorders were identified for EMRs.
As this study was conducted in three hospitals in one country, our results may not
be generalizable to other countries. Additionally, in this study, we considered
mental disorders in general. Therefore, this research provides fundamental findings for
further studies. Future studies may consider specific mental disorders to identify more
specialized data elements. Lastly, developing data elements is a preliminary step to
develop EMR systems. Further studies should focus on developing use cases and the
EMR system.
In conclusion, we are confident that the results of this study will be of great
assistance to mental health centers that want to implement electronic medical
records. Identifying these elements at least leads to an overview that can help
information system developers and EMR vendors to facilitate and accelerate the
development of such a system and reduce the likelihood of a system failure. In
addition, the results of this study can be useful for mental health managers who intend to
implement electronic medical record systems, in order to plan more accurately and
increase system effectiveness.
References
[1] N.A. Latha, B.R. Murthy, U. Sunitha, Electronic health record. International Journal of Research in
Engineering and Technology. 1(10) (2012), 1-9.
[2] A. Sheikhtaheri, N. Hashemi, N.A Hashemi, Benefits of using mobile technologies in education from the
viewpoints of medical and nursing students. Studies in Health Technology and Informatics 251 (2018),
289-292.
[3] C. Chao, H. Hu , C.O. Ung , Y. Cai. Benefits and challenges of electronic health record system on
stakeholders: a qualitative study of outpatient physicians. Journal of Medical Systems. 37(4) (2013),
9960.
[4] N. Menachemi, T. Collum, Benefits and drawbacks of electronic health record systems. Risk
Management and Healthcare Policy, 4 (2011). 47-55.
[5] J. King, V. Patel, E. W. Jamoom, M.F. Furukawa, Clinical benefits of electronic health record use:
national findings. Health Services Research, 49 (1 Pt 2) (2013). 392-404.
[6] S. Malekzadeh, N. Hashemi, A. Sheikhtaheri, N. Hashemi. Barriers for implementation and use of health
information systems from the physicians' perspectives. Studies in Health Technology and Informatics,
251 (2018), 269-272.
[7] R.F. Stewart, P.J. Kroth, M. Schuyler, R. Bailey, Do electronic health records affect the patient-
psychiatrist relationship? A before and after study of psychiatric outpatients. BMC Psychiatry 10 (1)
(2010), 3.
[8] M. Jebraeily, Z. Piri, B. Rahimi, N, Ghasemzade, M, Ghasemirad, A, Mahmodi, Barriers of electronic
health records implementation. Health Information Management.8 (6) (2012), 807-814.
[9] N. Mirani, H. Ayatollahi, H. Haghani, A survey on barriers to the development and adoption of electronic
health records in Iran. Journal of Health Administration. 15 (50) (2013), 65-75.
[10] W.S. Weintraub, R.P. Karlsberg, J.E Tcheng, et al, ACCF/AHA 2011 key data elements and definitions
of a base cardiovascular vocabulary for electronic health records: A report of the American College of
Cardiology Foundation/American Heart Association Task Force on clinical data standards. Journal of
the American College of Cardiology. 58 (2) (2011), 202-222.
[11] M. Abdelhak, M.A. Hanken, Health information: Managing a strategic resource. 5th edition, Elsevier.
period in Iran: a review article. Iranian Journal of Basic Medical Sciences. 5(57) (2014), 727-737.
[13] A. Mohammadi, M. Ahmadi, A. Bashiri, Z. Nazemi, Designing the minimum data set for orthopedic
injuries, Journal of Clinical Research in Paramedical Sciences. 3(2) (2014), 75-83.
[14] F. Rafii M. Ahmadi, F. Hoseini, M. Habibi Koolaee, Nursing minimum data set: an essential need for
Iranian health care system. Iran Journal of Nursing. 24(71) (2011), 19-27.
[15] M. Darabi, A. Delpisheh, E. Gholamiparizad, M. Nematollahi. R .Sharifian. Designing the minimum
data set for Iranian children’ health records. Scientific Journal of Ilam University of Medical Sciences.
24(1) (2016).
[16] M. Ahmadi, A. Bashiri, A minimum data set of radiology reporting system for exchanging with
electronic health record system in Iran. Payavard Salamat. 8(2) (2014).
[17] H. Lotfnezhadafshar, Z. Zareh Fazlollahi, M. Khoshkalam, Comparative study of mental health registry
system of United Kingdom, Malaysia and Iran. Health Information Management. 6 (11) (2009), 1-11.
[18] A. Rezaei Ardani, L. Ahmadian, K. Kimiyafar, F. Rohani, Z. Ebnehoseini, Comparative study of data
elements in psychiatric history and assessment forms in selected countries. Journal of Health and
Biomedical Informatics. 3(1) (2016), 57-64.
[19] T. Square, B. Lane, Mental Health Minimum Data Set (MHMDS). 2012 [Accessed 4th Nov. 2018].
Available from: http://content.digital.nhs.uk/article/4865/Mental-Health-Minimum-Data-Set-MHMDS.
[20] Australian Institute of Health and Welfare. Patient care national minimum data set: national health data
dictionary, version 12. National Health Data Dictionary. Cat. no. HWI 48. Canberra: AIHW. 2003
[Accessed 4th Nov. 2018]. Available from: www.aihw.gov.au/publication-detail/?id=6442467503.
[21] S.A. von Esenwein, B.G. Druss, Using electronic health records to improve the physical healthcare of
people with serious mental illnesses: a view from the front lines. International Review of Psychiatry.
26(6). (2014), 629-37.
[22] A. Takian, A. Sheikh, N. Barber. We are bitter, but we are better off: case study of the implementation
of an electronic health record system into a mental health hospital in England. BMC Health Services
Research, 12 (2012) 484.
[23] D. Robotham, M. Mayhew, D. Rose, T. Wykes. Electronic personal health records for people with
severe mental illness; a feasibility study. BMC Psychiatry. 15 (2015) 192.
[24] K. Kimiafar, M. Sarbaz, A. Sheikhtaheri, A. Azizi. The impact of management factors on the success
and failure of health information systems, Indian Journal of Science and Technology 8 (2015) 1-9.
[25] A. Sheikhtaheri, F. Sadoughi, M. Ahmadi, Developing Iranian patient safety indicators: an essential
approach for improving safety of healthcare, Asian Biomedicine 7 (3) (2013) 365-373.
[26] Health informatics – The Patient Summary for unscheduled, cross-border care. Draft European Standard
IPS, 2018. [Accessed 18 Mar 2019]. Available from: http://www.ehealth-standards.eu/en/documents/
dHealth 2019 – From eHealth to dHealth 33
D. Hayn et al. (Eds.)
© 2019 The authors, AIT Austrian Institute of Technology and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/978-1-61499-971-3-33
1. Introduction
1 Contributed equally.
2 Corresponding Author: Kerstin Denecke, Bern University of Applied Sciences, Quellgasse 21, Biel,
Switzerland, E-Mail: [email protected].
Our starting point was the development of a mobile patient navigator app [10], which
enables patients to look up their current appointments; these appointments may be altered
by their healthcare professionals.
In a second step, we examined exemplary inpatient [11] and outpatient [12]
information systems with regard to appointment data exchange with such a navigator app.
In addition, several online outpatient appointment booking tools were searched and
analyzed, namely Medicosearch.ch, Docbox.ch, Doctena.ch and Samedi.de. The objective
of this analysis was to identify the data types that these systems store for an appointment.
Based on the results, we identified common data types and derived an appointment data
structure.
Next, a Medline literature review was carried out with the following search terms:
Computerized appointment scheduling, Cross institution appointment scheduling, Cross
sector AND appointment, Cross sector AND scheduling, Web based appointment
scheduling. The aim of the retrieval was to identify existing solutions for cross-institutional
appointment communication. The results of the search were filtered for publications
dealing with the scheduling process. There were no restrictions on the publication date.
In a fourth step, the existing standards for scheduling in healthcare were surveyed to
check whether they are suitable for a comprehensive and cross-institutional appointment
scheduling solution. Thus, we analyzed the Appointment resource of FHIR [13], the
HL7 V2.x SIU messages [4], and IHE profiles dealing with appointment booking and the
EPD. The derived data structure was then compared with the exchange standards
mentioned above and supplemented where necessary.
3. Results
Differences among the analyzed systems were particularly evident in the area of possible
appointment types. The pre-defined values differ in their degree of detail among systems,
which can lead to difficulties in cross-institutional communication. Links between
appointments within or across institutions, as well as the aggregation of appointments
into a treatment episode, were missing.
For inpatient care, HL7 V2 supports Scheduling Information Unsolicited (SIU)
messages [4]. SIU supports 14 different trigger events to notify applications of appointment
changes. All events use a common message format. SIU-S12, for example, is the event
for notification of a new appointment. The SCH segment contains information regarding
the appointment, such as IDs, reason and duration. It also shows who booked the appointment
and its status. In the TQ1 segment, the times are specified in more detail. Thus, an
appointment can also have a repetition, e.g. it can be booked weekly.
Different segments specifying patients, services, devices, rooms and service providers
for an appointment may be added.
Fast Healthcare Interoperability Resources (FHIR) [13] is an emerging standard
hosted by HL7.org in which appointments are mapped with the Appointment resource. An
Appointment resource contains fields for start and end time, duration, location and the
participants of an appointment. Using additional FHIR resources such as the Slot and the
Schedule resources, the whole booking process can be addressed in FHIR. Furthermore,
through the Subscription resource, FHIR allows different participants to be notified
automatically about changes to a resource. The communication between FHIR endpoints
is realized through a REST API, and the transmitted data can be formatted as either XML
or JSON. Therefore, FHIR not only addresses the data structure itself but also the
communication through which the data is exchanged. However, FHIR is not a
document-based standard and is therefore not directly compliant with the EPD.
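As an illustration, the following minimal Python sketch assembles such an Appointment
resource and sends it to a FHIR endpoint via REST. The field names follow the published
FHIR Appointment resource; the endpoint URL and the participant references are
hypothetical.

import json
import urllib.request

# Minimal Appointment resource; participant references are hypothetical.
appointment = {
    "resourceType": "Appointment",
    "status": "proposed",
    "start": "2019-05-28T09:00:00+02:00",
    "end": "2019-05-28T09:30:00+02:00",
    "participant": [
        {"actor": {"reference": "Patient/example"}, "status": "needs-action"},
        {"actor": {"reference": "Practitioner/example"}, "status": "needs-action"},
    ],
}

# Exchange via the FHIR REST API, here with a JSON payload.
request = urllib.request.Request(
    "https://fhir.example-hospital.ch/Appointment",  # hypothetical endpoint
    data=json.dumps(appointment).encode("utf-8"),
    headers={"Content-Type": "application/fhir+json"},
)
response = urllib.request.urlopen(request)  # POST, since a payload is attached
print(response.status)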
The so-called CDA-CH standard, a Swiss adaptation of the CDA, is used for the
EPD. CDA is part of the HL7 V3 standard. In the current CDA-CH document v2.0.3,
the term "Appointment" is not mentioned at all [15]. CDA-CH together with the XDS.b
profile of IHE provides a document-based infrastructure for the Swiss EPD. An
appointment is currently not considered a document.
A specific use case for the communication of event data via the XDS.b profile has
not yet been defined by IHE. The Eye Care Appointment Scheduling (ECAS) profile
demonstrates the process for scheduling appointments [16]. This profile can serve as a
possible basis for the implementation of cross-institutional appointment data
communication. However, if this profile is to be implemented across institutions, the
transactions should be adapted or redefined. Various transactions of this profile originate
from the IHE Radiology Framework (RAD-1, RAD-12, ...) and use the HL7 V2.x
standard, which is not optimal for cross-institutional communication. Furthermore, this
profile would have to be integrated into an XDS environment without the loss of dynamic
communication. For cross-institutional communication, it is essential to detect the
appropriate service provider in order to book an appointment. This search process can be
supported by the IHE CSD profile [17], which provides a register of the available
service providers. A query could, for example, return all orthopedists in the area of Berne
in Switzerland.
The results of this analysis demonstrate that there is no off-the-shelf solution
available for cross-institutional appointment data exchange. However, existing standards
can provide some foundations.
Figure 1. Tree structure comprising several appointments, either individual appointments or appointments
in an inpatient case, aggregated into a treatment episode.
This section describes a possible process for the communication of appointment data. As
already considered in the mentioned FHIR resource, an appointment can adopt various
states during the scheduling process. These states serve as the basis for this conceptual
communication process (Figure 2).
Figure 2. Defined appointment booking process using the various states of an appointment
A new appointment starts in the state "proposed". If all participants confirm
the appointment, it will be changed to state "booked". If not, the appointment will be
"cancelled".
For "booked" appointments the date is fixed. If the patient checks in at the providing
institution, the state is altered to "showed up". Once the appointment is finished, the state
is set to "completed". If one of the participants is no longer able to attend, they should
"cancel" in advance. The state "no-show" indicates that the appointment was scheduled,
but the patient neither appeared nor cancelled the appointment in advance.
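A minimal Python sketch of these states as a simple state machine follows; the initial
"proposed" state stems from the gap-filled description above, and the transition table
merely encodes the states named in the text, as an illustration rather than part of the
proposal itself.

from enum import Enum

class AppointmentState(Enum):
    PROPOSED = "proposed"
    BOOKED = "booked"
    SHOWED_UP = "showed up"
    COMPLETED = "completed"
    CANCELLED = "cancelled"
    NO_SHOW = "no-show"

# Transitions as described above (Figure 2); terminal states have no entry.
ALLOWED = {
    AppointmentState.PROPOSED: {AppointmentState.BOOKED, AppointmentState.CANCELLED},
    AppointmentState.BOOKED: {AppointmentState.SHOWED_UP,
                              AppointmentState.CANCELLED,
                              AppointmentState.NO_SHOW},
    AppointmentState.SHOWED_UP: {AppointmentState.COMPLETED},
}

def transition(current, target):
    """Return the target state if the transition is allowed, else raise."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target

state = transition(AppointmentState.PROPOSED, AppointmentState.BOOKED)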
During the scheduling process, the defined conditions are monitored and influenced
by four different actors (table 2).
Table 2. Actors involved in the appointment booking process

Actor                     Description
Participant               All entities participating in an appointment. This can be a person
                          such as a patient or a health professional, but may also include
                          other entities such as an MRI or an operating room.
Requester of appointment  Participant who starts the initial appointment process by making a
                          suggestion of an appointment.
Healthcare institution    Institution where the appointment takes place.
EPD Community             Refers to the community to which the healthcare provider belongs.
Communication between the various actors is divided into individual steps. Each
step represents the exchange of data between two actors. Standards and IHE profiles are
proposed for realizing the various steps. The state of the appointment also changes in
some steps. Figure 3 shows the individual communication steps graphically; they are
explained in more detail in the following.
Figure 3. Communication among the various actors during an appointment booking process
(1) Through the IHE CSD profile, the requesting person (Service Finder) uses the
ITI-73 transaction against the EPD Community (Care Services InfoManager) to search
for an institution and receive corresponding information, including the FHIR endpoint. To
realize this, the EPD functionality would have to be extended by the IHE Care Services
Discovery (CSD) profile. (2) The availability schedule of the institution and, if
necessary, of other required participants is retrieved using the FHIR Schedule and
Slot resources. (3) When a free slot has been found, an appointment request is sent.
In this step, the actual FHIR Appointment resource is created and sent to the institution.
(4) All participants are informed about the appointment request. (5) The individual
participants can confirm or reject the appointment request using the FHIR
AppointmentResponse resource. (6) In the previous steps, the appointment
communication was carried out via a system of the corresponding institution. Once the
state is set to "booked", the FHIR Appointment resource can be converted to a CDA-CH
document and uploaded to the EPD community. (7) If a "booked" appointment can no
longer be attended by a participant, it should be "cancelled". A new version of the
appointment document with the status "cancelled" is created and uploaded to the
corresponding community. (8) In order to notify all participants of the cancellation, the
EPD must be extended with the IHE Document Metadata Subscription (DSUB) profile.
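A minimal sketch of steps (2) and (3) follows, assuming a hypothetical FHIR endpoint
obtained in step (1); the search parameters follow the FHIR REST conventions for the
Slot resource, and all identifiers are invented for illustration.

import json
import urllib.request

BASE = "https://fhir.example-hospital.ch"  # endpoint discovered via CSD (ITI-73)

def get_json(url):
    with urllib.request.urlopen(url) as response:
        return json.load(response)

# (2) Retrieve the free slots of the institution for a given day.
bundle = get_json(BASE + "/Slot?status=free&start=ge2019-05-28&start=lt2019-05-29")
slots = [entry["resource"] for entry in bundle.get("entry", [])]

# (3) Request an appointment in the first free slot found.
if slots:
    appointment = {
        "resourceType": "Appointment",
        "status": "proposed",
        "slot": [{"reference": "Slot/" + slots[0]["id"]}],
        "participant": [
            {"actor": {"reference": "Patient/example"}, "status": "needs-action"},
        ],
    }
    request = urllib.request.Request(
        BASE + "/Appointment",
        data=json.dumps(appointment).encode("utf-8"),
        headers={"Content-Type": "application/fhir+json"},
    )
    urllib.request.urlopen(request)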
4. Discussion
Our initial thinking started with the upload of scheduling documents within the EPD. We
quickly detected that this approach poses several challenges: Every change in the date
of an appointment, as well as every response of a participant, results in a new document
version. Moreover, every appointment change requires a complete download of the
whole document and the upload of the modified version. In the current state of the EPD,
patients can only read their documents and not actively manipulate them. Thus, a patient
would be unable to suggest or confirm a date for an appointment. Furthermore, within
the Swiss EPD architecture, it is impossible to upload documents that are not directly
patient-related. Therefore, healthcare providers and other possible participants could not
provide their availability in the form of their own schedules.
For this reason, we propose the (combined) use of FHIR as a potential alternative to
the direct document-based mirroring of the scheduling process within the EPD. We are
fully aware that the additional use of FHIR implies a significant additional effort. Every
healthcare institution would be required to provide its own FHIR endpoint with the
schedules of the bookable resources. Because of this additional effort, the whole national
scheduling process in the suggested form should be considered an optional extension
which institutions can freely choose to implement and use.
Additional effort is necessary to convert the FHIR Appointment resource to a
CDA document as soon as the appointment is booked. The scheduling process itself
could be performed without the conversion of the document and its upload to the EPD.
For this reason, it should be further analyzed if and when the appointment is
uploaded to the EPD during this process. Once the upload is completed, the patient is no
longer able to actively manipulate the appointment because of the mentioned limitations of
the EPD. Nevertheless, a possible benefit of persisting an appointment as a document in
a national patient record concerns the reuse of the appointment data. Documents resulting
from an appointment could be directly linked to this visit, providing an additional
opportunity to sort and search the documents of a patient. On the other hand, the total
number of EPD documents grows considerably.
The proposed inter-sectoral appointment scheduling process has been designed for
integration with the Swiss EPD. We did not explicitly examine the compatibility with
other national health record implementations. During the analysis of the Swiss EPD
infrastructure and the corresponding standards, it turned out that such an integration
requires additions such as the integration of the IHE CSD profile and the IHE DSUB
profile. Although the IHE CSD profile would also provide the possibility to check the
availability of a service by using the transaction ITI-75 based on CalDAV (RFC 4791), we
decided to use the IHE CSD profile to retrieve the FHIR endpoint. The main reason for this
decision lies in the possibility to include further FHIR resources in the process in future
steps. For example, the Questionnaire resource could be utilized to request further
information.
This concept provides the foundation for a possible solution to digitalize the cross-
institutional appointment booking process. FHIR is still a new and emerging standard,
and further work is required to demonstrate the practicability of our proposal. The next
step is the validation by experts and the implementation of a proof of concept, which
should include at least an implementation for one institution. This proof of concept can
provide further information that can be used to parametrize the FHIR resources and the
possible appointment document for a national standardization. It should also be evaluated
whether there is a need to upload the appointment document once the appointment is
booked. In parallel, the integration of the IHE CSD profile into a national patient record
infrastructure such as the Swiss EPD should be considered.
References
[1] Bundesgesetz über das elektronische Patientendossier. SR 816.1 Apr 15, 2017.
[2] IHE. IHE IT Infrastructure (ITI) Technical Framework Vol 1, Revision 15.0.
https://www.ihe.net/uploadedFiles/Documents/ITI/IHE_ITI_TF_Vol1.pdf. Last visited 11. February
2019.
[3] eHealth Suisse: Austauschformate eMedikation. Pre-Publication Review, 2018, Version 08.05.2018,
https://www.e-health-suisse.ch/fileadmin/user_upload/Dokumente/2018/D/180508_CDA-CH-
EMED_de.pdf
[4] Caristix. HL7 V2 Scheduling Information Unsolicited Messages. http://hl7-
definition.caristix.com:9010/HL7%20v2.3/triggerEvent/Default.aspx?version=HL7+v2.3&triggerEvent
=SIU_S14. Last visited 11. February 2019.
[5] NEMA. DICOM PS3.3 2019a - Information Object Definitions.
http://dicom.nema.org/medical/dicom/current/output/pdf/part03.pdf. Last visited 11. February 2019.
[6] P. Zhao, I. Yoo, J. Lavoie, B.J. Lavoie, E. Simoes. Web-Based Medical Appointment Systems: A
Systematic Review. Journal of Medical Internet Research. 2017;19(4):e134.
[7] E.A. Fradgley, C.L. Paul, J. Bryant, C. Oldmeadow. Getting right to the point: identifying Australian
outpatients' priorities and preferences for patient-centred quality improvement in chronic disease care.
Int J Qual Health Care. 1. September 2016;28(4):470–7.
[8] Z. Siddiqui, R. Rashid. Cancellations and patient access to physicians: ZocDoc and the evolution of e-
medicine. Dermatol Online J. 15. April 2013;19(4):14.
[9] K. Mohamed, A. Mustafa, S. Tahtamouni, et al. A Quality Improvement Project to Reduce the 'No
Show' Rate in a Paediatric Neurology Clinic. BMJ Open Quality 2016;5:u209266.w3789. doi:
10.1136/bmjquality.u209266.w3789
[10] K. Denecke, P. Kyburz, S. Gfeller, Y. Deng, T. Bürkle. A Concept for Improving Cross-Sector Care by
a Mobile Patient Navigator App. Stud Health Technol Inform. 2018;255:160–4.
[11] POLYPOINT – Hospital Information System [Internet]. POLYPOINT. [cited 7. February 2019].
Available from: https://polypoint.ch/
[12] Elexis – Physician Information System [Internet]. [cited 7. February 2019]. Available from:
https://elexis.info/
[13] HL7.org. Fast Healthcare Interoperability Resources (FHIR). https://www.hl7.org/fhir/.
Last visited 11. February 2019.
[14] C. Rinner, G. Duftschmid. Bridging the Gap between HL7 CDA and HL7 FHIR: A JSON Based
Mapping. Stud Health Technol Inform. 2016;223:100–6.
[15] T. Schaller. CDA-CH v2.0.3.pdf. HL7 Benutzergruppe Schweiz; 2018.
[16] Eye Care Appointment Scheduling - IHE Wiki [Internet]. [cited 20. October 2018]. Available online:
https://wiki.ihe.net/index.php/Eye_Care_Appointment_Scheduling
[17] Mobile Care Services Discovery (mCSD) - IHE Wiki [Internet]. [cited 27. October 2018]. Available
online: https://wiki.ihe.net/index.php/Mobile_Care_Services_Discovery_(mCSD)#Details
dHealth 2019 – From eHealth to dHealth 41
D. Hayn et al. (Eds.)
© 2019 The authors, AIT Austrian Institute of Technology and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/978-1-61499-971-3-41
1. Introduction
Most medical institutes use Electronic Medical Records (EMR) to record and store
information about their patients, including diagnoses, performed treatments and their
results. The EMR is a valuable information source for medical analysis; however, it is
usually incomplete or redundant, making data mining a difficult and challenging task. This
is especially true in the case of echocardiography reports. Generally, echocardiography
reports can be divided into two parts in terms of diagnostic content: in the first,
semi-structured part, diagnostic results are stored in the form of term-value pairs (e.g.
interventricular septum: 14 mm), and in the second part, results are recorded as free text
written in natural language (e.g. mild left ventricular hypertrophy). As there is no
consensus about how to store the results of echocardiography examinations and practice
varies across medical institutes, the processing of echocardiography reports is a nontrivial
task. The present paper focuses on how to process the first, semi-structured part of
echocardiography reports. As the processing of the free text part requires quite different
methods, including Natural Language Processing (NLP) techniques, we do not deal with
it in this paper.
1 Corresponding Author: Ágnes Vathy-Fogarassy, Department of Computer Science and Systems
Technology, University of Pannonia, 2. Egyetem Str., 8200 Veszprém, Hungary, E-Mail:
[email protected].
Generally, information extraction from medical texts focuses on the following two
tasks: named-entity recognition (NER, or term extraction), and relation extraction (RE).
Named-entity recognition refers to the process of identifying particular types of names,
terminologies or symbols in documents, while relation extraction identifies the relation
between them [1]. Successful term identification is key to getting access to the stored
information and the process of identification has been recognized as a bottleneck in text
mining [2]. The process of term identification is usually done in three steps: the first step
is term recognition; the second step is term classification; and the last step is term
mapping [2].
There are two possible approaches to solve this task. The first approach is to directly
search for specific terms (e.g. aortic root, ejection fraction) in documents. Direct term
search always relies on a specialized dictionary to recognize and classify medical
terminology, and the performance of this approach heavily depends on the coverage and
quality of the dictionary. The acquisition of such knowledge is a time-consuming task.
Direct search can also be extended by pattern search, which requires a priori knowledge
about the structure of the processed text (e.g. use of colon between terms and values,
order of terms, various expletives). With this extension, it becomes possible to recognize
terms and their measured value (e.g. aortic root: 27 mm) together.
Other term extraction methods also exist which utilize classical text mining
techniques. These text mining-based solutions do not need a predefined dictionary to
extract terms from the text, but simply collect every occurrence of word sequences that
are possibly valid terms. However, these methods require a text pre-processing phase
(including text cleaning), and term candidates must be identified and mapped onto a
dictionary after term extraction.
In the literature, several international studies have been published which deal with
echocardiography report processing [3-10]. They are mostly based on the direct search
approach, but some of them apply text-mining methods as well. In the published studies,
the aim is typically the extraction of only one specific parameter, such as the ejection
fraction (EF). Garvin et al., Kim et al., and Xie et al. all successfully extracted this parameter
from free text documents and described practical extraction techniques [3-5]. In [6], a
natural language-based method was presented which uses a predefined dictionary, expert
rules and predefined patterns to extract echocardiography measurements from
documents. In this study, a pattern-matching algorithm was created and tested to extract
term candidates from a large set of clinical notes. The presented method relies heavily
on pattern matching, but it can also identify possible misspellings and synonyms by
iterative extraction. Wells et al. also successfully extracted a set of predefined parameters,
including wall thicknesses, chamber dimensions or flow velocities [7]. They applied
NLP to parse the most frequently measured dimensions and used outlier analysis to filter
out unrealistic values. Toepfer et al. developed and evaluated an information extraction
component with fine-grained terminology that enabled them to recognize almost all
relevant information stated in German transthoracic echocardiography reports at the
University Hospital of Würzburg [8]. Jonnalagadda et al. described an information
extraction-based approach that automatically converts unstructured text into structured
data, which is cross-referenced against eligibility criteria using a rule-based system to
determine which patients qualify for a heart failure with preserved ejection fraction
(HFpEF) clinical trial [9]. In [10], Renganathan proposed text mining techniques that
enable the extraction of unknown knowledge from unstructured documents.
As we can see, all the suggested methods report successful medical text processing
but were implemented in different ways. However, until now, no analysis has been
published that compares the two basic approaches. The purpose of our research
was to examine how well a text mining-based solution fares against a direct term search-
based method in processing medical, especially echocardiography, documents, and
whether it is able to outperform it or not. For this purpose, we implemented both
approaches, processed the same corpus of echocardiography reports with them, and
compared the results. Our results are primarily valid for the analysis of echocardiography
reports, but we believe that they might also be valid for extracting information stored in
term-value pairs from other medical documents.
The structure of this document is as follows. In Section 2, we give a brief overview
of the challenges faced when processing echocardiography reports and present two
fundamentally different methods to extract, identify, and map terms. In Section 3, the
used dataset and the evaluation process are described and the result of the analysis is
presented. Finally, in Section 4, general experiences and future developments are
discussed.
2. Methods
The most vexing problem with echocardiography reports is that there is no unified
process for recording patient data. The form of the recorded information varies from
medical institute to medical institute. Furthermore, not only the location of data recording
is an influencing factor: medical assistants or doctors record the results according to
their own habits and, owing to the lack of a unified recording interface, the free text
contains many typos as well.
In our study, two methods have been implemented to extract terms from the first, semi-
structured part of echocardiography reports. The first method is a general regular
expression-based method which processes raw text, meaning that no pre-cleaning is
applied, and assumes that terms and their measurement results are separated by a colon.
The second method is based on traditional text mining methods. In this case, the raw text
is first cleaned and then the cleaned text is processed. This method searches for numerical
values and assumes that there is a term before, and a unit of measurement (if needed)
after, each numerical value.
The main difference between the two methods lies in the text preparation phase. The
regular expression-based method processes raw text and assumes, based on a priori
knowledge, that the term and their measurement result pairs follow a certain pattern, e.g.
they are separated by a colon, while the text mining-based method cleans and
manipulates the text in such a way that it becomes easier to process. Furthermore, the
text mining-based method does not rely on any a priori knowledge about the medical text
to process them. These two methods are introduced in detail in the following subsections.
The regular expression-based NER method uses regular expressions to extract terms
from echocardiography reports. A regular expression is a sequence of characters that
defines a search pattern. Usually this pattern is used by string searching algorithms for
"find" or "find and replace" operations on strings. The regular expression-based method
processes raw text, meaning that the data is processed as it is, no pre-cleaning methods
are applied. Furthermore, the regular expression-based processing method, based on a
priori knowledge, presumes that every term and its adherent value are separated by a
colon (in our case; any other predefined separator character could be used as well), and
the applied regular expressions are built upon this assumption and knowledge.
In our study, simpler regular expressions were defined first, on which more
complex expressions were then based. These rudimentary regular expressions include
expressions for terms, values, units and extended units. Sample expressions for terms and
values are the following:

terms: r'(?P<term>(?!\d)\w\D+)'  (1)
values: r'(?P<digits>\d[\d,.+\-x/*]*)'  (2)
The first expression defines that terms cannot start with a number and one or more non-
numeric characters follow a word character. The second expression defines the values in
such a way that they can be integers (e.g. 27), decimal numbers (e.g. 12.5), ranges of
values separated by a hyphen (e.g. 25-28, 12.4-12.7) or a multi-dimensional value
specified by an "x" character (e.g. 27x13).
Using these rudimentary expressions more complex expressions can also be
constructed. For example, the measurement result is a complex expression, which is a
concatenation of values, some separating whitespace characters and a measurement unit
with affixation taken into account (measurement_result = [values][whitespace
characters][unit][affixation]). An expression for a term–measurement_result pair is the
concatenated form of the term, whitespace characters, a colon, whitespace characters
and the measurement result expressions (term–measurement_result = [term][whitespace
characters]:[whitespace characters][measurement_result]). The flexibility of the
regular expression-based NER comes from its ability to find character sequences
matching the defined patterns regardless of their position in a longer sequence.
Using the previously defined regular expression set, the raw text can be processed.
Due to the non-deterministic nature of regular expression matching, the
echocardiography reports were processed from start to end. If a string matching an
expression was found, the string was processed, stored, and removed from the document.
These steps were executed iteratively until no processable string was left.
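A minimal Python sketch of this iterative extraction follows, combining the rudimentary
expressions (1) and (2) into a term–measurement_result pattern; the simplified unit
expression is an assumption added for illustration.

import re

# Rudimentary expressions (1) and (2) from above; the unit expression is a
# simplified assumption for illustration.
TERM = r'(?P<term>(?!\d)\w\D+)'
VALUE = r'(?P<digits>\d[\d,.+\-x/*]*)'
UNIT = r'(?P<unit>mmHg|mm|cm|ml|m/s|%)?'

# term-measurement_result = [term][whitespace]:[whitespace][values][whitespace][unit]
PAIR = re.compile(TERM + r'\s*:\s*' + VALUE + r'\s*' + UNIT)

def extract_pairs(raw_text):
    """Iteratively extract and remove term-value pairs from a raw report."""
    pairs, text = [], raw_text
    while True:
        match = PAIR.search(text)
        if match is None:  # no processable string left
            break
        pairs.append((match.group('term').strip(),
                      match.group('digits'),
                      match.group('unit') or ''))
        # Remove the processed string and continue with the remainder.
        text = text[:match.start()] + text[match.end():]
    return pairs

print(extract_pairs('aortic root: 27 mm interventricular septum: 14 mm'))
# [('aortic root', '27', 'mm'), ('interventricular septum', '14', 'mm')]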
The second method is a more straightforward approach which utilizes traditional data
mining and cleaning methods. It pre-processes the raw text without any a priori
knowledge about the contents of the documents. As part of the cleaning process, this
method unifies whitespaces and removes all colons, parentheses and unneeded characters.
It is important to note, however, that it does not modify commas or dots, as they can
be used as decimal comma or decimal point depending on the localization of the
recording software. It also unifies the units of measurement based on a predefined list.
The unified units found in the text are concatenated to the preceding numerical values
during the pre-processing phase.
The key assumption of this method is that all measured values are numerical, that
before every numerical value there is a term, and that after every numerical value a
unit of measurement may be present. To exclude numerical characters which do not
express measurement results, the algorithm modifies some of the units like mm2 or cm2
in the pre-processing phase to, respectively, sqrmm and sqrcm.
After the text pre-processing phase, the text mining-based NER splits the cleaned
documents into "words" (sequences of characters separated by whitespaces) and searches
for the first occurrence of a "word" starting with a numerical value. The preceding n
words, if n words are present, are considered term candidates. The candidates are then
checked, marked, and stored for later usage. In our case, n = 4 was chosen as the
threshold for the number of words in candidate terms.
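A minimal sketch of this candidate extraction in Python, assuming the text has already
been cleaned as described above; skipping numeric words inside the window is a
simplification added here:

import re

def extract_candidates(cleaned_text, n=4):
    """Collect up to n words preceding each numerical value as a term candidate."""
    words = cleaned_text.split()
    candidates = []
    for i, word in enumerate(words):
        if re.match(r'\d', word):  # a "word" starting with a numerical value
            window = words[max(0, i - n):i]
            # Simplification: ignore numeric words in the preceding window.
            preceding = [w for w in window if not re.match(r'\d', w)]
            if preceding:
                # Candidates are validated against the dictionary later on.
                candidates.append((' '.join(preceding), word))
    return candidates

print(extract_candidates('aortic root 27mm interventricular septum 14mm'))
# [('aortic root', '27mm'), ('interventricular septum', '14mm')]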
The recognized named entities from the processed documents were checked for whether
they are valid terms by using a dictionary of terms. This dictionary was created with the
help of a medical expert. It contains more than 30 terms from the field of cardiology that
are probably present in some form in echocardiography reports, and over 100 synonyms
have also been defined for these terms. The previously defined word length of n = 4
stems from the dictionary: the maximum length of the terms recorded in the dictionary
is four words.
The terms extracted in the previous phases were compared against the elements of
the dictionary. The Jaro-Winkler distance [11] was calculated for each comparison and
if the distance was lower than a specified distance threshold, the term was considered
valid and identified. This threshold parameter was defined as the lowest, non-zero intra-
distance of the terms stored in the dictionary. The Jaro-Winkler distance ($d_w$) can be
calculated in the following way:

$d_w(s_1, s_2) = 1 - sim_w(s_1, s_2)$   (3)

$sim_w(s_1, s_2) = sim_j(s_1, s_2) + l \cdot p \cdot (1 - sim_j(s_1, s_2))$   (4)

where $sim_j$ is the Jaro similarity of the strings $s_1$ and $s_2$, $l$ is the length of a
common prefix up to 4 characters, and $p$ is a constant scaling factor with a standard
value of 0.1. The Jaro similarity ($sim_j$) is calculated in the following way:

$sim_j(s_1, s_2) = \begin{cases} 0 & \text{if } m = 0 \\ \frac{1}{3}\left(\frac{m}{|s_1|} + \frac{m}{|s_2|} + \frac{m - t}{m}\right) & \text{otherwise} \end{cases}$   (5)

where $|s_i|$ is the length of $s_i$, $m$ is the number of matching characters and $t$ is
half of the number of transpositions. The concept of matching and transpositions is
detailed in [11].
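A minimal, self-contained Python implementation of equations (3)-(5), included for
illustration:

def jaro_similarity(s1, s2):
    """Jaro similarity sim_j as in equation (5)."""
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    m = 0
    for i, char in enumerate(s1):
        lo, hi = max(0, i - window), min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == char:
                matched1[i] = matched2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    transposed, j = 0, 0
    for i in range(len(s1)):
        if matched1[i]:
            while not matched2[j]:
                j += 1
            if s1[i] != s2[j]:
                transposed += 1
            j += 1
    t = transposed / 2  # t is half of the number of transpositions
    return (m / len(s1) + m / len(s2) + (m - t) / m) / 3

def jaro_winkler_distance(s1, s2, p=0.1):
    """Jaro-Winkler distance d_w as in equations (3) and (4)."""
    sim_j = jaro_similarity(s1, s2)
    l = 0  # length of the common prefix, up to 4 characters
    while l < min(4, len(s1), len(s2)) and s1[l] == s2[l]:
        l += 1
    sim_w = sim_j + l * p * (1 - sim_j)
    return 1 - sim_w

print(jaro_winkler_distance('aortic root', 'aortic rooot'))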
3. Results
Table 1. The number of the most common terms identified by the RE-NER and TM-NER methods.

Term                                       N        N_RE     N_e,RE   N_TM     N_e,TM
Left ventricular end-systolic diameter     19 598   19 464       42   19 549      116
Interventricular septum (end-diastolic)    19 562   19 498      109   19 491       43
Aortic root                                19 537   19 492       66   19 476       21
Posterior wall (end-diastolic)             19 496   15 696      116   19 386    3 800
Left ventricular end-diastolic diameter    19 240   19 096      102   19 147       81
Left atrium (M-mode)                       19 344   19 259      208   19 144       85
E                                          18 759   18 719       44   18 723       59
EF                                         18 768   18 640      636   18 135      131
A                                          18 458   18 421      977   17 483       41
Interventricular septum (end-systolic)     14 372        2        2   14 370   14 370
Posterior wall (end-systolic)              14 310       41        1   14 309   14 269
Right ventricle (M-mode)                   10 656   10 448      239   10 432       17
2D right atrial dimensions                 10 492   10 398      237   10 264       11
Most differences between N_RE and N_TM arise from typos, missing spaces or non-
numerical values. During testing, we found that there are cases where spaces are missing
between some named entities (terms). As TM-NER is based on the list of words, it is
unable to find the appropriate term in this case. To handle this kind of failure, it is
suggested to insert separating space characters into the text (for example after the
measurements) during text cleaning. Furthermore, there were occurrences of term–
measurement_result pairs where the measurement result part was a non-numeric value.
The TM-NER method is unable to identify these kinds of results, but RE-NER may be
able to find these occurrences based on the presumption that terms and values are
separated by a colon, regardless of the type of the value. The biggest difference occurred
during the exploration of the terms Interventricular septum (end-systolic) and Posterior
wall (end-systolic). These are composite terms: they follow the term1–
measurement_result1–subterm2–measurement_result2 pattern. RE-NER struggles to
find and to process these kinds of terms in a humanly consumable way.
Table 2. The relative occurrence of the most common terms identified by the RE-NER and TM-NER methods.

Term                                       N        P_RE     P_e,RE   P_TM     P_e,TM
Left ventricular end-systolic diameter     19 598   99.32%    0.21%   99.75%    0.59%
Interventricular septum (end-diastolic)    19 562   99.67%    0.56%   99.64%    0.22%
Aortic root                                19 537   99.77%    0.34%   99.69%    0.11%
Posterior wall (end-diastolic)             19 496   80.51%    0.59%   99.44%   19.49%
Left ventricular end-diastolic diameter    19 240   99.25%    0.53%   99.52%    0.42%
Left atrium (M-mode)                       19 344   99.56%    1.08%   98.97%    0.44%
E                                          18 759   99.79%    0.23%   99.81%    0.31%
EF                                         18 768   99.32%    3.39%   96.63%    0.70%
A                                          18 458   99.80%    5.29%   94.72%    0.22%
Interventricular septum (end-systolic)     14 372    0.01%    0.01%   99.99%   99.99%
Posterior wall (end-systolic)              14 310    0.29%    0.01%   99.99%   99.71%
Right ventricle (M-mode)                   10 656   98.05%    2.24%   97.90%    0.16%
2D right atrial dimensions                 10 492   99.10%    2.26%   97.83%    0.10%
4. Discussion
The results show that the text mining-based method is able to perform at a similar level in
finding and identifying terms as the regular expression-based method. Each method
has advantages over the other. The text mining-based algorithm has difficulty handling
missing space characters, and as it is based on the assumption that measured results are
stored as numerical values, it is unable to find non-numerical values. In the case of the
regular expression-based method, the formulation of the expression set is a difficult task,
and it is even harder to extend this regular expression set to recognize complex terms.
Furthermore, not all occurrences can be expressed with a general expression. These
special occurrences require more and more unique expressions to be added to the set,
which increases the processing time.
Our primary finding is that the text mining-based NER method performs at a similar
level in finding and identifying terms as the regular expression-based method, and in the
case of extracting complex terms and their measurement results it outperforms the
regular expression-based NER method. Information extraction could be further improved
by implementing a hybrid NER which merges the advantages and negates the
disadvantages of both methods. This hybrid NER is part of our future research.
Acknowledgment
References
[1] Wencheng Sun, Zhiping Cai, Yangyang Li, et al. Data Processing and Text Mining Technologies on
Electronic Medical Records: A Review. Journal of Healthcare Engineering, 2018; Article ID 4302425
[2] Krauthammer M, Nenadic G. Term identification in the biomedical literature. J Biomed Inform.
2004;37(6):512–526.
[3] Xie F, Zheng C, Yuh-Jer Shen A, Chen W. Extracting and analyzing ejection fraction values from
electronic echocardiography reports in a large health maintenance organization. Health Inform J.
2017;23(4):319–328.
[4] Garvin JH, DuVall SL, South BR, et al. Automated extraction of ejection fraction for quality measurement
using regular expressions in Unstructured Information Management Architecture (UIMA) for heart
failure. J Am Med Inform Assoc. 2012;19(5):859–866.
[5] Kim Y, Garvin JH, Goldstein MK, et al. Extraction of left ventricular ejection fraction information from
various types of clinical reports. J Biomed Inform. 2017;67:42–48.
[6] Patterson OV, Freiberg MS, Skanderson M, et al. Unlocking echocardiogram measurements for heart
disease research through natural language processing. BMC Cardiovasc Disord. 2017;17(1):151.
[7] Wells QS, Farber-Eger E, Crawford DC. Extraction of echocardiographic data from the electronic medical
record is a rapid and efficient method for study of cardiac structure and function. J Clin Bioinforma.
2014;4(1):12.
[8] Toepfer, M., Corovic, H., Fette, et al. Fine-grained information extraction from German transthoracic
echocardiography reports. BMC Medical Informatics and Decision Making, 2015; 15(1):91
[9] Jonnalagadda, S.R., Adupa, A.K., Garg, R.P. et al. Text Mining of the Electronic Health Record: An
Information Extraction Approach for Automated Identification and Subphenotyping of HFpEF Patients
for Clinical Trials. J. of Cardiovasc. Trans. Res. 2017; 10(3), 313–321.
[10] Renganathan, V. Text Mining in Biomedical Domain with Emphasis on Document Clustering.
Healthcare Informatics Research, 2017; 23(3), 141–146.
[11] Piskorski J., Sydow M. String Distance Metrics for Reference Matching and Search Query Correction.
In: Abramowicz W. (eds) Business Information Systems. BIS 2007. LNCS, Vol 4439. Springer, Berlin
dHealth 2019 – From eHealth to dHealth 49
D. Hayn et al. (Eds.)
© 2019 The authors, AIT Austrian Institute of Technology and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/978-1-61499-971-3-49
Abstract. A large amount of complex data exists concerning the Austrian Health
Care System. The goal was to process this data and present it to the general public
on an easily accessible information platform. The platform focuses on data about
the burden of disease of the Austrian population, the available medical care and the
services provided by physicians. Due to the vast differences in the underlying
source data, the methods used for data acquisition range from statistical linkage
over web scraping to aggregating data on the reimbursed services. The results are
published on a website and are mainly displayed as interactive graphics. Overall,
these dynamic and interactive websites provide a good overview of the situation of
the Austrian Health Care System and present the information in an intuitive and
comprehensible manner. Furthermore, the information given in the atlases can
contribute to health care planning by helping to identify distinctive service
provision in Austria.
1. Introduction
Issues concerning the health care system are mentioned by the Austrian media on a
regular basis. Recent keywords include the reform of the Social Security System
and the imminent retirement of a large number of general practitioners in rural areas.
However, media reports always focus on only a small aspect of the health care system
and fail to describe the complex overall situation.
Therefore, the Main Association of Austrian Social Security Institutions launched
the Health Care Atlases. The goal is to provide the public with a neutral, data driven
information platform which gives a good overview of the current situation of the health
care system and presents the complex data in a well comprehensible and intuitive manner.
The project was implemented by DEXHELPP, a research association developing
methods, models and technologies in order to support the analysis, planning and
controlling of the health care system.
1 Corresponding Author: Claire Rippinger, DEXHELPP, Neustiftgasse 57-59, 1070 Vienna, Austria,
E-Mail: [email protected].
The information in the atlases covers three main issues of the health care system: the
burden of disease of the Austrian Population, the medical care available in different
regions of Austria and the services provided by the physicians in their respective field.
Considering the big differences in the available source data, the Health Care Atlases have
not been implemented as one single project. Instead, they were split into three smaller
projects, which have been implemented independently: the Epidemiology Atlas, the Care
Atlas and the Services Atlas.
Similar projects have been implemented in other countries, e.g. the UK, the US and
Germany. The Environment and Health Atlas for England and Wales [1], [2] is a
collection of interactive maps depicting the relative risks for a total of 14 health
conditions, averaged over a 25-year period. The Dartmouth Atlas Project [3]–[7]
presents a variety of maps and charts displaying various information ranging from
surgical procedures to medical discharges, all based on data from Medicare, a national
health insurance program in the United States. Finally, the German Versorgungsatlas
[8]–[12] is a library of many smaller projects involving health care. These projects
include information about the number of resident physicians, vaccination rates, a
selection of health indicators, etc.
2. Methods
The Epidemiology Atlas aims to provide information on the burden of disease of the
Austrian population. This information can only be collected by indirect means, since
there is no standardized diagnostic coding in the outpatient sector in Austria. Diagnoses
are only available in relation to sick leaves or hospital stays. In this project, the following
three methods have been evaluated in order to derive disease information indirectly:
• ATHIS: Austrian Health Interview Survey 2006/2007
• ATC-ICD: predicting the ICD code (International Classification of Diseases)
from the ATC code (Anatomical Therapeutic Chemical Classification System)
• Methods of Diagnosis Assignment by experts
One of these methods (Methods of Diagnosis Assignment) only considers the
prevalence of diabetes; the other two methods (ATHIS and ATC-ICD) consider multiple
diseases. In the first instance, all three methods were applied to diabetes in order
to compare their performance. Additionally, in the finalized Epidemiology Atlas, ATC-
ICD has been used to compute the prevalence of multiple diseases.
The aim of the Care Atlas is to provide the general public with an overview of the
available medical supply in Austria. Furthermore, the Care Atlas can contribute to and
support the general planning process of medical services by identifying distinctive
service provision in the Austrian health care sector. Consequently, this might have an
impact on improving the overall structural health care quality in the future. The
underlying data is publicly available on the websites of the Austrian Provincial Chambers
of Physicians (Landesärztekammern). For every physician, the website contains an entry
providing information on the medical specialty, the type of social security contract, the
gender and the opening hours. Selenium, a framework which automates tasks performed
within a browser, was used to visit the individual websites and to collect this information.
Regular expressions were then used to identify the information on the opening hours
provided by the physicians. The collected opening hours were stored in a database.
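A minimal Python sketch of this scraping step follows; the URL, the page structure and
the German opening-hours pattern are illustrative assumptions, since the real directory
pages differ between the individual chambers.

import re
from selenium import webdriver

# Hypothetical URL; the real directory pages of the Landesärztekammern differ
# in structure, and the extraction logic would have to be adapted per chamber.
driver = webdriver.Firefox()
driver.get('https://www.arztsuche.example.at/eintrag/12345')
page_text = driver.find_element_by_tag_name('body').text  # Selenium 3 style API
driver.quit()

# Illustrative pattern for opening-hour entries such as "Mo 08:00-12:00".
HOURS = re.compile(r'(Mo|Di|Mi|Do|Fr|Sa|So)\s+(\d{1,2}:\d{2})\s*-\s*(\d{1,2}:\d{2})')
opening_hours = HOURS.findall(page_text)
print(opening_hours)  # e.g. [('Mo', '08:00', '12:00'), ('Di', '14:00', '18:00')]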
The collected data were subsequently processed to comply with the following
indicators:
• What is the number of available practices for a given federal state, medical
specialty, contract type and gender of the physician?
• What is the number of available weekly hours for a given federal state, medical
specialty, contract type and gender of the physician?
• What is the number of open practices by time and day of the week for a given
federal state, medical specialty, contract type and gender of the physician?
For the evaluation of the time dependent results, only those practices were included
in the data set for which the opening hours indicated a specific start and end time.
All results are expressed both in absolute values and in relation to the population.
The goal of the Services Atlas was to provide information on the services provided by
the physicians. The underlying data consists of the services billed to the social health
insurances by resident physicians in the years 2016 and 2017 and was provided by the
Association of Austrian Social Security Institutions. This database contains information
about the nature and number of services provided by each contracted physician, as well
as additional information on the contracted physician (i.e. insurance provider, medical
specialty, address). Each individual service is classified by an alphanumeric code, in
which the first two characters assign the service to a specific body region [19]. It is the
only data source providing a uniform encoding of the services provided in every federal
state.
It is important to note that neither services provided by non-contracted physicians
nor those provided by physicians employed in a hospital or a similar institution are
represented in this data set.
Considering the good quality of the underlying data, only the standard data
preprocessing had to be applied and the data could be aggregated to comply with five
different indicators:
• What is the spectrum of the services provided: How many different services are
provided by the individual medical specialties in a given year and federal state?
• What are the most billed services (grouped by the corresponding body region)
for a given medical specialty, federal state and year?
• What are the most billed individual services for a given medical specialty,
federal state and year?
• What is the distribution of the individual services: What percentage of the most
billed services is provided by which group of medical specialties in a given year
and federal state?
• How many individual services are billed in the different federal states in relation
to the population in a given year?
During the data processing, privacy laws have been taken into consideration; more
specifically, k-anonymity has been respected for k = 3 [20]. Hence, any information
concerning a federal state with fewer than three contracted physicians for a given medical
specialty is not displayed in the final results.
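A minimal sketch of this suppression rule with pandas; the column names and values are
hypothetical.

import pandas as pd

# Hypothetical aggregated billing data, one row per contracted physician.
df = pd.DataFrame({
    'federal_state': ['Vienna', 'Vienna', 'Vienna', 'Tyrol', 'Tyrol'],
    'specialty': ['dermatology'] * 5,
    'services_billed': [310, 280, 295, 150, 170],
})

# k-anonymity with k = 3: suppress every state/specialty group that contains
# fewer than three contracted physicians.
k = 3
released = df.groupby(['federal_state', 'specialty']).filter(lambda g: len(g) >= k)
print(released)  # the Tyrol group (2 physicians) is suppressed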
2.4. Visualization
Since the targeted audience of the Health Care Atlases is the general public, the
information is predominantly given in graphical form, allowing a quick and intuitive
comprehension of the presented data.
The type of chart is chosen depending on the underlying data: Regional information
is displayed in a choropleth map of Austria depicting the different federal states. A single
hue progression is used to illustrate the magnitude of the values represented on the map.
Categorical data which do not represent regional information are mostly displayed
using bar charts, allowing a quick comparison of the depicted values. The three-dimensional
information on the number of open practices by time and day of the week is represented
using heat maps. As in the choropleth maps, a single hue progression illustrates the
magnitude of the values.
All charts and maps have been implemented as interactive graphics. This
way, users can browse through the results on their own and apply several filters.
Furthermore, a number of tooltips and popovers display more detailed
information.
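A minimal matplotlib sketch of such a heat map with a single hue progression follows;
the data values are invented for illustration.

import numpy as np
import matplotlib.pyplot as plt

days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
hours = list(range(7, 19))  # 07:00 to 18:00

# Invented counts of open practices per 100 000 inhabitants.
rng = np.random.default_rng(0)
open_practices = rng.integers(0, 60, size=(len(days), len(hours)))

fig, ax = plt.subplots()
im = ax.imshow(open_practices, cmap='Blues')  # single hue progression
ax.set_xticks(range(len(hours)))
ax.set_xticklabels([f'{h}:00' for h in hours])
ax.set_yticks(range(len(days)))
ax.set_yticklabels(days)
fig.colorbar(im, label='open practices per 100 000 inhabitants')
plt.show()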
3. Results
The Health Care Atlases are publicly available on the DEXHELPP-website2. For each
atlas, the information is divided into several chapters and subchapters. Every chapter or
subchapter contains an explanation of the data presented, as well as some operating
instructions for the interactive features of the chart. Furthermore, every atlas also
provides detailed background information, additional links and the source of the
underlying data.
Since the goal of this project was to provide a neutral information platform, the
Health Care Atlases do not make an assessment of the current situation and do not draw
any conclusions from the data. However, they can be used to make some interesting
observations. The atlases provide a good overview of the Austrian health care
system and its service provision. By providing these data in an aggregated manner,
they can contribute to the identification of distinctive service provision in
various regions of Austria. For example, a comparison of two heat maps displaying the
number of open practices by time and day of the week is incorporated in the Care Atlas. Using
a dropdown menu, the user can filter the available data and compare the number
of open practices of general practitioners in an urban and a rural area. Figure 1 shows the
results in relation to 100 000 inhabitants. In the capital city of Austria (Vienna), the
number of practices which are open in the morning is almost equal to the number of
practices with opening hours in the afternoon. In contrast, in Carinthia, the
federal state with the lowest population density, there are considerably more open practices
in the period between 8:00 and 12:00.
2 http://www.dexhelpp.at
Figure 1. Two heatmaps comparing the number of open practices of general practitioners per 100 000
inhabitants. Vienna is displayed on the left and Carinthia on the right.
Easily accessible data like this may provide additional information for the
public debate on the health care system: Is there a general lack of physicians in a given
federal state? Is there only an apparent lack due to a temporal clustering of the opening
hours? How does the need for health care provision vary between different federal states
and regions?
4. Discussion
The three Health Care Atlases give a broad overview of the current situation of the
Austrian health care service. They give an insight into the burden of disease of the
population and the availability and type of treatment. Simply put, the three atlases answer
the following questions:
What diseases does the Austrian population suffer from?
Where and when is treatment available?
How is the Austrian population treated?
However, the atlases do not depict the complete Austrian health care system. The
information displayed in the Epidemiology Atlas depends on estimations, and the Care
Atlas and the Services Atlas only contain information on registered physicians in the
outpatient sector. The Services Atlas is further limited to registered physicians with
a contract with the Social Security Institutions.
5. Future Work
Being aware of the limitations mentioned in Section 4, additional projects and further
analyses regarding the Health Care Atlases have already been launched. For the
Epidemiology Atlas, a new approach to the ATC-ICD method is already under
investigation, and for the Care Atlas, it is planned to also include the opening hours of
outpatient departments. Currently, there are no plans to expand the information of the
Services Atlas to non-contracted physicians, since no data is available on this
subject matter.
Furthermore, it is planned to equip each atlas with a visualization of the temporal
change of the data. Currently, it is possible to change the considered year in the
Epidemiology Atlas and in the Services Atlas. However, it may also be interesting to see the
temporal change of the data over the whole investigated period in one single chart.
Finally, it is intended to monitor the anonymous usage behavior of the visitors of the
atlases. Thus, it can be investigated how often the atlases are visited and which one is the
most popular. This will give an insight into the public interest in the project.
References
[1] “Home page | The Environment and Health Atlas.” [Online]. Available:
http://www.envhealthatlas.co.uk/homepage/. [Accessed: 23-Jan-2019].
[2] A. L. Hansell et al., The Environment and Health Atlas for England and Wales, 1 edition. Oxford: OUP
Oxford, 2014.
[3] “Home,” Dartmouth Atlas of Health Care. [Online]. Available: https://www.dartmouthatlas.org/.
[Accessed: 12-Feb-2019].
[4] D. C. Goodman and A. A. Goodman, “Medical care epidemiology and unwarranted variation: the
Israeli case,” Isr. J. Health Policy Res., vol. 6, no. 1, p. 9, Feb. 2017.
[5] R. Panczak et al., “Regional Variation of Cost of Care in the Last 12 Months of Life in Switzerland:
Small-area Analysis Using Insurance Claims Data,” Med. Care, vol. 55, no. 2, p. 155, Feb. 2017.
[6] D. C. Goodman and G. A. Little, “Data Deficiency in an Era of Expanding Neonatal Intensive Care
Unit Care,” JAMA Pediatr., vol. 172, no. 1, pp. 11–12, Jan. 2018.
[7] G. P. Westert et al., “Medical practice variation: public reporting a first necessary step to spark change,”
Int. J. Qual. Health Care, vol. 30, no. 9, pp. 731–735, Nov. 2018.
[8] “versorgungsatlas.de - Der Versorgungsatlas.” [Online]. Available:
https://www.versorgungsatlas.de/der-versorgungsatlas/. [Accessed: 23-Jan-2019].
[9] C. Schmidt et al., “Integration von Sekundärdaten in die Nationale Diabetes-Surveillance: Hintergrund,
Ziele und Ergebnisse des Sekundärdaten-Workshops am Robert Koch-Institut,”
Bundesgesundheitsblatt - Gesundheitsforschung - Gesundheitsschutz, vol. 60, no. 6, pp. 656–661, Jun.
2017.
[10] M. K. Akmatov, A. Steffen, J. Holstiege, R. Hering, M. Schulz, and J. Bätzing, “Trends and regional
variations in the administrative prevalence of attention-deficit/hyperactivity disorder among children
and adolescents in Germany,” Sci. Rep., vol. 8, no. 1, Dec. 2018.
[11] H. Burchert, Ed., Fachbegriffe des Gesundheitsmanagements, 2., überarbeitete und erweiterte
Auflage. Herne: nwb STUDIUM, 2018.
[12] S. March et al., “Quo vadis Datenlinkage in Deutschland? Eine erste Bestandsaufnahme,”
Gesundheitswesen, vol. 57, no. 03, pp. e20–e31, Mar. 2018.
[13] Statistik Austria, “ATHIS.” [Online]. Available:
http://www.statistik.at/web_de/services/publikationen/4/index.html?includePage=detailedView§i
onName=Gesundheit&pubId=457. [Accessed: 23-Jan-2019].
[14] F. Endel, G. Endel, and N. Pfeffer, “PRM34 Routine Data in HTA: Record Linkage in Austrias GAP-
DRG Database,” Value Health, vol. 15, no. 7, p. A466, 2012.
[15] A. Weisser, G. Endel, P. Filzmoser, and M. Gyimesi, “ATC-> ICD–evaluating the reliability of
prognoses for ICD-10 diagnoses derived from the ATC-Code of prescriptions,” presented at the BMC
health services research, 2008, vol. 8, p. A10.
[16] F. Chini, P. Pezzotti, L. Orzella, P. Borgia, and G. Guasticchi, “Can we use the pharmacy data to
estimate the prevalence of chronic conditions? a comparison of multiple data sources,” BMC Public
Health, vol. 11, no. 1, p. 688, Sep. 2011.
[17] “ICD-10-GM.” [Online]. Available: https://www.dimdi.de/dynamic/de/klassifikationen/icd/icd-10-
gm/. [Accessed: 23-Jan-2019].
[18] Y. Dodge, Ed., The Oxford dictionary of statistical terms, First published in paperback 2006. Oxford:
Oxford Univ. Press, 2006.
[19] “Katalog ambulanter Leistungen (KAL): Entwicklung und Pilotprojekte bis inkl. 2013 |
Gesundheitssystem / Qualitätssicherung | Gesundheitssystem | Gesundheit | Sozialministerium.”
[Online]. Available:
https://www.sozialministerium.at/cms/site/gesundheit/dokument.html?channel=CH3958&doc=CMS1
240821423857. [Accessed: 23-Jan-2019].
[20] L. Sweeney, “k-anonymity: A model for protecting privacy,” Int. J. Uncertain. Fuzziness Knowl.-
Based Syst., vol. 10, no. 05, pp. 557–570, 2002.
dHealth 2019 – From eHealth to dHealth
D. Hayn et al. (Eds.)
© 2019 The authors, AIT Austrian Institute of Technology and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/978-1-61499-971-3-57
1. Introduction
Crowding in emergency departments (ED) is associated with poor patient care, higher
mortality and a negative impact on patient safety [1]. Crowding occurs when the need for
emergency services exceeds the available resources for patient care in the emergency
department, the hospital, or both [2]. As crowding is an issue of misaligned demand and
supply [3], it is vital to relieve overstrained EDs by curbing patient volume as well as by
adjusting ED resources to better meet demand during peak hours. With respect to patients,
there have been advances in understanding the motives and usage patterns of frequent
users that underlie increasing demand [4]. These insights can be used to counter improper
ED use and to raise awareness among patients. On the other hand, it is of equal importance
to tackle resource alignment on the hospital's side, since there is often a mismatch between
staffing rosters and patient demand [5]. The growing adoption of health IT, however,
offers the chance to integrate detailed forecasting models into ED processes and resource
management, as electronic health records become widely available [6].
1 Corresponding Author: Jens Rauch, Osnabrück University of Applied Sciences, Health Informatics Research Group, PO Box 1940, 49009 Osnabrück, Germany; E-Mail: [email protected].
Several recent studies have shown that relatively simple regression models already
give good predictions of forthcoming ED crowding measures [7,8,9]. So far, more
complex models provide only somewhat more accurate results, despite their higher
modelling capacities, which is foremost attributed to the lack of appropriate external
covariates [10]. While a large portion of emergency research is devoted to the study of
surges in patient volume caused by catastrophic events, pandemic outbreaks or seasonal
fluctuations, little has been published with regard to covariates of regular variations in
intra-day ED occupancy [3]. Indeed, only a small number of indicators for upcoming ED
service demand has been studied; so far, weather and calendar data are among the most
frequently used covariates. The use of weather data is justified by the fact that weather
affects a number of conditions that are likely to lead to medical emergencies; for example,
certain air masses significantly increase asthma hospital admissions [11]. There have been
few attempts to include other public data. A recent study investigated the predictive value
of website traffic on a public health portal [12]. Other studies revolve around the impact
of mass gatherings on the demand for emergency services; only recently, first results of a
systematic examination of the effect of mass gatherings on medical emergency prevalence
were presented [13].
Common to all these studies is the use of variables that are inherently connected
to the public activity level. The use of calendar data is usually intended to reflect the
influence that weekends, holidays and seasons have on public activity patterns. Aside
from its physiological effects [14], weather evidently affects public activities, too [15].
However, both kinds of data are only indirectly connected to overall activity levels.
It therefore seems reasonable to include more proximate measures of public activity in
forecasting models of ED crowding. For instance, daily commuters can make up a
considerable portion of the people in a city during specific times and working hours [16].
Accordingly, it can be hypothesized that they also contribute to medical incidents
requiring emergency care. Readily available data on population movement and presence
are provided by public recordings of traffic. In this paper, we propose hourly traffic flow
as a direct and openly available measure of public activity and investigate its effect on
the prediction of the overall hourly ED load.
2. Methods
We conducted a retrospective study and used historical ED data extracted from the
Electronic Health Record of Klinikum Osnabrück, an academic teaching hospital with
660 beds serving the town and region of Osnabrück, Lower Saxony, Germany. The ED
has about 40,000 cases per year and is operated 24 hours a day on 365 days a year. Since
exported data was anonymised, no ethical statement needed to be obtained. Data covered
the period from January 1 until December 15, 2017. Historical traffic data for the same
period was obtained from the German Federal Highway Research Institute
(Bundesanstalt für Straßenwesen), which collects hourly data about the direction and
number of vehicles passing by measuring stations on federal roads and motorways. There
is a total of six measuring stations in the area of Osnabrück, covering the major traffic
axes for motorised vehicles from and to Osnabrück. All data sources were mirrored in a
PostgreSQL 10 research database. Data analysis was carried out using the statistics
software R.
ED occupancy is a common measure of ED crowding [17]. Thus, the hourly total of
patients within the ED was the variable of interest, while traffic flow served as a measure
of public activity and as an external predictor.
Figure 1. Typical intraday variation in traffic flow and ED occupancy (data from January 20, 2017), normalised
by respective day mean (red: motorway A33 and green: federal road B51, black: ED occupancy)
3. Results
An exemplary intraday time series of ED occupancy and two selected traffic densities,
spanning a 24-hour period beginning at 4 a.m., is shown in Figure 1. Traffic flow took a
similar shape and preceded ED occupancy: increases in ED occupancy were preceded by
increases in traffic load. There was a mean of 12.7 patients (± 7.86 SD)
and a maximum of 37 patients in the emergency department during the period of
investigation. Maximum Pearson correlation coefficients from all traffic measuring
stations with their respective lags are given in Table 1 together with the approximate
distance to the city centre. We also observed that in almost all cases traffic flow towards
the city had a higher correlation than traffic flow in the opposite direction. The best
explanation of the variance (rmax) was given by traffic flow on motorway A33 at Hellern
and both stations on the federal road B51 with maximum correlation coefficients
of .73, .71 and .71 respectively, preceding ED occupancy by two hours in all three cases.
Accordingly, traffic values from these three roads were used as external predictors in the
SARIMA model.
Table 1. Maximum correlation coefficients from cross-correlation analysis of road traffic and ED occupancy.
Distance refers to driving distance from measuring station to Osnabrück centre. Traffic from roads included in
this study is bolded.
Measuring station Distance (km) rmax Lag (h)
B68 Lechtingen 6.0 .70 -3
B51 Ostercappeln 13.2 .71 -2
B51 Glandorf 20.9 .71 -2
A33 Hellern 5.6 .73 -2
A33 Fledder 6.5 .70 -3
A33 Handorf 10.2 .70 -3
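This lag analysis amounts to computing Pearson correlations between the ED occupancy series and time-shifted copies of each traffic series. The original analysis was carried out in R; purely as an illustrative sketch, the same computation in Python/pandas could look as follows, where ed_occupancy and traffic are assumed to be hourly pandas Series on a common time index:

  import pandas as pd

  def max_cross_correlation(ed_occupancy: pd.Series, traffic: pd.Series, max_lag: int = 6):
      # Returns (r_max, lag); a negative lag means that traffic precedes ED occupancy.
      best_r, best_lag = 0.0, 0
      for lag in range(-max_lag, max_lag + 1):
          # shift the traffic series so that traffic at hour t+lag is compared
          # with ED occupancy at hour t (pairs with missing values are dropped)
          r = ed_occupancy.corr(traffic.shift(-lag))
          if abs(r) > abs(best_r):
              best_r, best_lag = r, lag
      return best_r, best_lag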
Fitting the SARIMA model to the training data period (January 1 to September
30) yielded the optimal model parameters (1,0,0)(0,1,1)₂₄ from the analysis of ACF and
PACF plots. The time series was subjected to first-order seasonal differencing, since no
trend but a strong 24-hour seasonality was present. Inspecting the ACF of the differenced
time series showed a fair amount of decay, while the respective PACF showed a sharp
cut-off and positive autocorrelation at lag 1; thus, an AR term was added. Since the ACF
showed a negative correlation at lag 24, an SMA term was added. An iterative comparison
of the models by the Bayesian Information Criterion over the parameter space
(AR, MA, SMA, SAR) ∈ {0, …, 5}² × {0, 1, 2}² confirmed that this model was indeed optimal.
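The analyses were performed in R; as a minimal sketch only, the identified model class can be expressed with Python's statsmodels, assuming y is the hourly ED occupancy series and X holds the three traffic predictors (X_future the corresponding future values):

  from statsmodels.tsa.statespace.sarimax import SARIMAX

  # SARIMA(1,0,0)(0,1,1) with a 24-hour season, as identified from the ACF/PACF analysis;
  # passing exog=None instead of exog=X reproduces the baseline model without traffic
  model = SARIMAX(y, exog=X, order=(1, 0, 0), seasonal_order=(0, 1, 1, 24))
  result = model.fit(disp=False)
  print(result.bic)                                   # criterion used for the iterative comparison
  forecast = result.forecast(steps=6, exog=X_future)  # 1- to 6-hour-ahead predictions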
We used the same parameters for the SARIMA model with the external predictors
“road traffic on the federal road B51” and “road traffic on the motorway A33 Hellern”.
Overall cross-validated RMSE and MAE values for all six time horizons are given in
Table 2. The improvement was larger for shorter time horizons: for lag 1, the SARIMA
model with traffic had a roughly 20% lower RMSE (3.21 vs. 4.04 without traffic), and
for lag 2 the RMSE was about 10% lower (3.77 vs. 4.20). The inclusion of external
predictors improved the prediction for all lags (Table 2).
Table 2. Model comparison from cross validation over the period October 1 to December 15, 2017.
Measure  Model    Lag 1  Lag 2  Lag 3  Lag 4  Lag 5  Lag 6
RMSE     SARIMA   4.04   4.20   4.22   4.23   4.23   4.23
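The exact cross-validation scheme is not spelled out in this excerpt; one plausible reading is an hourly rolling-origin evaluation over the validation period, which could be sketched as follows (refit is an assumed helper that re-estimates the SARIMA model on all data up to hour t):

  import numpy as np

  horizons = range(1, 7)
  errors = {h: [] for h in horizons}
  for t in validation_hours:                 # hourly forecast origins, October 1 to December 15
      res = refit(y[:t], X[:t])              # assumed helper returning a fitted SARIMAX results object
      pred = np.asarray(res.forecast(steps=6, exog=X[t:t + 6]))
      for h in horizons:
          errors[h].append(pred[h - 1] - y[t + h - 1])
  rmse = {h: float(np.sqrt(np.mean(np.square(e)))) for h, e in errors.items()}
  mae = {h: float(np.mean(np.abs(e))) for h, e in errors.items()}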
4. Discussion
diabetes mellitus, or cigarette smoke [24]. Individual activity patterns also seem to play
a distinctive role. Notably, it was found that some medical emergencies amongst the
working population follow a pattern that differs from that of a nonworking subgroup [25].
Obviously, biological and behavioural circadian patterns are correlates of emergency
events on an individual level and thus of ED use. This study revealed circadian public
activity measured by road traffic to be a meaningful predictor of overall ED occupancy
on a regional level. These findings do not claim any causal relationship. Thus, more
research is needed to explain the underlying mechanisms.
However, indicators such as road traffic are powerful because they are available on a
regional level and are thus able to predict a regional phenomenon, i.e. ED occupancy. It is
therefore a natural next step to draw on datasets other than road traffic that are indicative
of public activity levels. Mobile cellular location data in particular might become accessible
for the prediction of upcoming ED demand and could further improve the results. If so, they
could corroborate the hypothesis of public activity as a correlate of ED occupancy. Apart
from exogenous factors like public activity, a comprehensive predictive model of ED
crowding evidently needs to incorporate factors that are endogenous to hospital
processes, e.g. staffing rosters and bed occupancy. The present study, however, focussed
on the effectiveness of road traffic data as an easily and publicly available predictor. Its
predictive power can be expected to generalize to all EDs in a given region.
This study is of course limited in that data from the six measuring stations is a
specific sample of traffic activity around the city, since neither the usage of public transport
nor side road traffic was captured. Also, we used ED occupancy as the only measure for ED
crowding. Occupancy belongs to the central throughput measures that inform ED
workload. Yet, there are several other aspects that should also be taken into consideration,
but which were not present in our data, e.g. ED capacity and hospital efficiency.
Furthermore, it remains to be examined in what way the present findings generalize to
other regions.
5. Conclusion
To the best of our knowledge, this is the first study to examine regional traffic data
as an indicator of urban critical health events that require emergency treatment. We
showed that road traffic as an external covariate can indeed contribute to a
substantial improvement in forecasting crowding in emergency departments.
Fundamentally, the effects might be explained by an inherent relation to human activity
levels, which were previously found to be related to medical emergencies.
6. Acknowledgements
We thank Tobias Sonnenberg and the Klinikum Osnabrück for the provision of emergency
data and their collaboration. This work is funded by the state of Lower Saxony,
project ROSE, the learning health care system (ZN 3103).
References
[1] B.C. Sun, R.Y. Hsia, R.E. Weiss, D. Zingmond, L.-J. Liang, W. Han, H. McCreath and S.M. Asch,
Effect of Emergency Department Crowding on Outcomes of Admitted Patients, Annals of emergency
medicine 61(6) (2013), 605–611.
[2] Crowding, Annals of Emergency Medicine 47(6) (2006), 585, ISSN 0196-0644, 1097-6760.
doi:10.1016/j.annemergmed.2006.02.025.
[3] N.R. Hoot and D. Aronsky, Systematic Review of Emergency Department Crowding: Causes, Effects,
and Solutions, Annals of Emergency Medicine 52(2) (2008), 126–136, ISSN 0196-0644.
doi:10.1016/j.annemergmed.2008.03.014.
[4] J. Rauch, J. Husers, B. Babitsch and U. Hübner, Understanding the Characteristics of Frequent Users of
Emergency Departments: What Role Do Medical Conditions Play?, Studies in health technology and
informatics 253 (2018), 175–179.
[5] R. Champion, L.D. Kinsman, G.A. Lee, K.A. Masman, E.A. May, T.M. Mills, M.D. Taylor, P.R.
Thomas and R.J. Williams, Forecasting Emergency Department Presentations, Australian Health
Review 31(1) (2007), 83–90, ISSN 1449-8944. doi:10.1071/ah070083.
[6] R.S. Evans, Electronic Health Records: Then, Now, and in the Future, Yearbook of medical informatics
Suppl 1 (2016), 48–61, ISSN 0943-4747. doi:10.15265/IYS-2016-s006.
[7] M. Hertzum, Forecasting Hourly Patient Visits in the Emergency Department to Counteract Crowding,
The Ergonomics Open Journal 10(1) (2017), 1–13.
[8] L. Zhou, P. Zhao, D. Wu, C. Cheng and H. Huang, Time Series Model for Forecasting the Number of
New Admission Inpatients, BMC medical informatics and decision making 18(1) (2018), 39.
[9] W. Whitt and X. Zhang, A Data-Driven Model of an Emergency Department, Operations Research for
Health Care 12 (2017), 1–15, ISSN 2211-6923. doi:10.1016/j.orhc.2016.11.001.
[10] R. Calegari, F.S. Fogliatto, F.R. Lucini, J. Neyeloff, R.S. Kuchenbecker and B.D. Schaan, Forecasting
Daily Volume and Acuity of Patients in the Emergency Department, Computational and Mathematical
Methods in Medicine 2016 (2016).
[11] P.F. Jamason, L.S. Kalkstein and P.J. Gergen, A Synoptic Evaluation of Asthma Hospital Admissions
in New York City, American Journal of Respiratory and Critical Care Medicine 156(6) (1997), 1781–
1788, ISSN 1073-449X. doi:10.1164/ajrccm.156.6.96-05028.
[12] A. Ekstrom, L. Kurland, N. Farrokhnia, M. Castren and M. Nordberg, Forecasting Emergency
Department Visits Using Internet Data, Annals of emergency medicine 65(4) (2015), 436–442.
[13] J. Ranse, A. Hutton, T. Keene, S. Lenson, M. Luther, N. Bost, A.N. Johnston, J. Crilly, M. Cannon and
N. Jones, Health Service Impact from Mass Gatherings: A Systematic Literature Review, Prehospital
and disaster medicine 32(1) (2017), 71–77.
[14] R. Manfredini, O. La Cecilia, B. Boari, J. Steliu, V. Michelini, P. Carli, C. Zanotti, M. Bigoni and M.
Gallerani, Circadian Pattern of Emergency Calls: Implications for ED Organization, The American
journal of emergency medicine 20(4) (2002), 282–286.
[15] T. Horanont, S. Phithakkitnukoon, T.W. Leong, Y. Sekimoto and R. Shibasaki, Weather Effects on the
Patterns of People’s Everyday Activities: A Study Using GPS Traces of Mobile Phone Users, PloS one
8(12) (2013), e81153.
[16] R. Patuelli, A. Reggiani, S.P. Gorman, P. Nijkamp and F.-J. Bade, Network Analysis of Commuting
Flows: A Comparative Static Approach to German Data, Networks and Spatial Economics 7(4) (2007),
315–331, ISSN 1572-9427. doi:10.1007/s11067-007-9027-6.
[17] L.I. Solberg, B.R. Asplin, R.M. Weinick and D.J. Magid, Emergency Department Crowding: Consensus
Development of Potential Measures, Annals of Emergency Medicine 42(6) (2003), 824–834, ISSN
01960644. doi:10.1016/S0196-0644(03)00816-3.
[18] R.H. Shumway and D.S. Stoffer, Time Series Regression and Exploratory Data Analysis, in: Time Series
Analysis and Its Applications, Springer, 2011, pp. 47–82.
[19] C. Bergmeir, R.J. Hyndman and B. Koo, A Note on the Validity of Cross-Validation for Evaluating
Autoregressive Time Series Prediction, Computational Statistics & Data Analysis 120 (2018), 70–83.
[20] M. Wargon, B. Guidet, T.D. Hoang and G. Hejblum, A Systematic Review of Models for Forecasting
the Number of Emergency Department Visits, Emergency Medicine Journal 26(6) (2009), 395–399.
[21] S. Jiang, K.-S. Chin and K.L. Tsui, A Universal Deep Learning Approach for Modeling the Flow of
Patients under Different Severities, Computer Methods and Programs in Biomedicine 154 (2018), 191–
203, ISSN 0169-2607. doi:10.1016/j.cmpb.2017.11.003.
[22] J. Ranse, S. Lenson, T. Keene, M. Luther, B. Burke, A. Hutton, A.N. Johnston and J. Crilly, Impacts on
In-Event, Ambulance and Emergency Department Services from Patients Presenting from a Mass
Gathering Event: A Retrospective Analysis, Emergency Medicine Australasia (2018).
[23] R. Manfredini, M. Gallerani, F. Portaluppi and C. Fersini, Relationships of the Circadian Rhythms of
Thrombotic, Ischemic, Hemorrhagic, and Arrhythmic Events to Blood Pressure Rhythms, Annals of the
New York Academy of Sciences 783(1) (1996), 141–158, ISSN 1749-6632.
doi:10.1111/j.1749-6632.1996.tb26713.x.
[24] R. Manfredini, F. Manfredini, B. Boari, R. Salmi and M. Gallerani, The Monday Peak in the Onset of
Ischemic Stroke Is Independent of Major Risk Factors, The American journal of emergency medicine
27(2) (2009), 244–246.
[25] C. Spielberg, D. Falkenhahn, S.N. Willich, K. Wegscheider and H. Voller, Circadian, Day-of-Week,
and Seasonal Variability in Myocardial Infarction: Comparison between Working and Retired Patients,
American Heart Journal 132(3) (1996), 579–585, ISSN 0002-8703.
dHealth 2019 – From eHealth to dHealth
D. Hayn et al. (Eds.)
© 2019 The authors, AIT Austrian Institute of Technology and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/978-1-61499-971-3-65
1. Introduction
In recent years, various classification models for machine learning (ML) have been
published, predicting events in health care such as hospital readmission, cancer survival
or cardiovascular diseases [1–5]. Although many of them achieve good prediction
results in test and validation data, there is a lack of follow-up studies on the
performance of such models in dynamic decision-making situations [6]. To integrate
prediction models into a clinical workflow, one has to overcome many obstacles. Such
obstacles may not only include official regulations, standards for implementation, and
privacy protection, but also negative attitudes of health care professionals, or
interoperability of health systems [7–9]. Due to such obstacles, only a small percentage
of developed ML models have made their way to clinical practice and there is limited
research on the behaviour of highly complex models in real-time prediction scenarios.
1 Corresponding Author: Stefanie Jauk, CBmed GmbH – Center for Biomarker Research in Medicine, Stiftingtalstraße 5, 8010 Graz, Austria; E-Mail: [email protected].
Risk prediction models in healthcare are often based on electronic health records
(EHR) of patients. EHR contain a large amount of information for a single patient and
represent longitudinal patient histories. KAGes (Steiermärkische
Krankenanstaltengesellschaft m.b.H.), the regional health care provider in Styria
(Austria), hosts longitudinal health records of around 90% of all Styrian inhabitants
resulting from clinical documentation. The records include inpatient and outpatient
visits and have been stored electronically in the hospital information system (HIS) of
KAGes for the last 15 years.
We recently developed a model based on a random forest that predicts the
occurrence of delirium for patients at the time of admission in a KAGes hospital [10].
Delirium is an acute confusional state that is common among
hospitalized elderly patients. It can cause adverse medical outcomes and is associated
with an increased mortality rate [11]. However, there is evidence that delirium can be
prevented in many cases using nonpharmacological interventions such as reorientation,
hydration, sleep strategies or hearing and vision adaptations [12].
In our study [10], we used the longitudinal EHR of more than 8,500 internal
medicine patients including demographic data, three-character categories of ICD-10
coded diagnoses, procedures, laboratory data, nursing assessment and transfer data for
modelling. The model achieved an area under the receiver-operating characteristic
curve (AUROC) of more than 0.90 in a separated test set.
An adaptation of the delirium model for a cohort of internal medicine and surgical
patients was implemented in May 2018 in the clinical workflow of a KAGes hospital.
For every patient admitted to the hospital, a risk of developing delirium is predicted at
(a) the point of hospitalization. A second prediction (b) takes place the morning after
admission. In some cases additional predictions (c) are made, e.g. due to a transfer to
another department. Within the HIS, prediction results are presented to the health care
personnel in three risk categories: low risk, high risk and very high risk. The model has
been approved for regular clinical use and is currently under evaluation.
Before implementation of the model, we already anticipated some limitations for real-
time prediction. One limitation is encountered when specific data is missing for the
respective patient. For instance, the determination of laboratory parameters at the
beginning of a hospitalization is a standard procedure, but technical problems may delay
the availability of these parameters in the system for some hours. Also, nursing
assessment information is recorded within the first 48 hours. Hence, the information for
prediction might not be complete at admission.
For building a training cohort of ML models, all data is retrospectively available.
However, when implementing a trained model in clinical workflow, data transmission
delays might influence the real-time prediction. The implementation proved that – for
some patients – a model previously trained with a well-defined and well-prepared data
set did not represent the data we came across in the real-time scenario.
In addition, data available at time of hospitalization (and hence at time of
prediction) is usually not missing-at-random, but there are systematically missing
feature groups. This problem may be explained by two examples from our data:
One way to determine the influence of single features on the prediction is the
measurement of variable importance. Variable importance helps to understand what
variables are most important for prediction results. Different approaches have been
studied, but recent studies show that many methods suffer from biases, especially if the
modelling variables vary in measurement scales [13]. Also, variable importance
methods focus on the entire prediction model, and the results cannot explain the
individual prediction result for a single patient. We believe that in addition to the analysis of
variable importance of the entire model, further evaluation on individual results is
necessary.
In one of our previous studies, we showed that a model trained with demographic
information and nursing assessment data only achieved a higher AUROC than models
trained with other feature groups [14]. This indicates that nursing assessment data
contain informative features for our prediction model for delirium.
However, applying the obtained results to a prediction model scenario in
clinical practice is not trivial. It is neither feasible nor efficient to train a prediction model
for every possible combination of missing informative features, as the number of such
combinations is too high. In addition, it remains unclear how a model trained with
information in all features performs for a patient with missing information in some
informative features, like nursing assessment, and how this might influence the
implementation of the model in clinical practice.
1.4. Objectives
Our first experiences with the real-time prediction of our delirium model confirmed
that recent nursing assessment data is missing at admission time for some patients, and
we are aware of the importance of nursing assessment in the prediction context. Also,
for some cases, laboratory data might be available only shortly after admission time.
Therefore, the aim of this study is to evaluate the possible benefit of a model that is
trained specifically for the case of missing laboratory and nursing assessment data at
admission. Depending on the information available in the HIS for a patient, such an
information-adapted model could be employed in addition to the prediction based on
the whole feature set.
Besides, we want to determine the feature groups that result in a risk prediction
closest to the one achieved by a complete data set when using the same prediction
model. This simulation helps to understand the effect of missing data in certain feature
groups when using a model trained on a complete data set.
2. Methods
The analysed data were extracted from the HIS of KAGes, openMEDOCS, which is
based on IS-H/i.s.h.med information systems, implemented on SAP platforms. After
extraction and anonymization, all analyses were computed in R using various packages.
For building the random forest models, the caret package [15] and associated
packages were used.
The study is part of the project that received approval from the Ethics Committee
of the Medical University of Graz (30-146 ex 17/18).
The random forest model implemented in May 2018 was an adapted version of the
previously published model [10]. We used the same inclusion and exclusion criteria as
in the previous model, but an extended admission period from 2011 to 2018.
During that period, 6,459 patients were coded with the ICD-10 three-character category
F05 (delirium due to known physiological condition). In addition to the delirium
patients we included 13,445 randomly selected controls from internal medicine and
surgical departments.
We selected variables based on literature and previous analyses. Examples of the
modelling features are shown in Table 1. As we cannot differentiate, e.g., between
diseases that were not coded and diseases that were not present, missing feature values
were set to zero (i.e. not present). This also applies to laboratory values that were not
analysed and to missing nursing assessments.
We split the cohort into training (75%) and test data set (25%). We trained a
random forest with up-sampling on the training data set including a 10-fold cross
validation, and tested the model on the separate test data. The model achieved an
AUROC of 0.85 for the cohort of surgical and internal medicine patients.
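The modelling was done in R with caret; a rough Python/scikit-learn sketch of the same design (75/25 split, up-sampling of the delirium class in the training data, 10-fold cross-validation, AUROC on the held-out test set) might look like this, with X and y as the assumed feature matrix and delirium labels (numpy arrays):

  import numpy as np
  from sklearn.ensemble import RandomForestClassifier
  from sklearn.metrics import roc_auc_score
  from sklearn.model_selection import cross_val_score, train_test_split
  from sklearn.utils import resample

  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.25, stratify=y, random_state=0)

  # up-sample the minority (delirium) class to the size of the majority class
  idx_min, idx_maj = np.where(y_train == 1)[0], np.where(y_train == 0)[0]
  idx_up = resample(idx_min, n_samples=len(idx_maj), replace=True, random_state=0)
  X_bal = np.vstack([X_train[idx_maj], X_train[idx_up]])
  y_bal = np.concatenate([y_train[idx_maj], y_train[idx_up]])

  rf = RandomForestClassifier(n_estimators=500, random_state=0)
  print(cross_val_score(rf, X_bal, y_bal, cv=10, scoring="roc_auc").mean())
  rf.fit(X_bal, y_bal)
  print(roc_auc_score(y_test, rf.predict_proba(X_test)[:, 1]))  # reported AUROC: 0.85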
Although the classification of the model was binary (delirium vs. non-delirium),
we later added a second threshold in the implemented version in order to obtain three
classes: low risk, high risk and very high risk. The chosen threshold was the result of
clinical considerations, namely to trigger an alert for patients whose delirium risk lies
above the 85th percentile.
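The mapping from the model's probability output to the three displayed categories can be sketched as below; the concrete threshold values are not reported here, so both cut-offs are placeholders:

  import numpy as np

  t_alert = np.percentile(train_scores, 85)  # assumed: alert cut-off at the 85th percentile of training risks
  t_binary = 0.5                             # assumed: original binary decision threshold

  def risk_category(p: float) -> str:
      if p >= t_alert:
          return "very high risk"
      if p >= t_binary:
          return "high risk"
      return "low risk"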
Table 1. Examples of modelling features extracted from electronic health records (n=556).
Feature group | Examples | n
Demographic data | Age, sex, mother tongue, additional private insurance | 30
Diagnosis codes | ICD-10 codes (e.g. F00, E11, E78, I10, N39, I49), groups of ICD-10 codes (e.g. F00_F09), total number of diagnoses | 275
Procedure codes | X-ray, MRI, physiotherapy, CT scan | 95
Laboratory data | CRP, ALT, AST, cholesterol, gamma-GT, haemoglobin, bilirubin, MCV, creatinine | 53
Nursing protocols | Hearing impaired, vision impaired, sleeping disorder, body mass index, catheter, communication possible, smoking | 96
Administrative data, indices | Number of transfers, number of hospital admissions, Charlson comorbidity index, number of procedures | 7
We applied the trained random forest model (Model A) with the complete data set to
predict the delirium risk for the cohort.
To examine the influence of five available feature groups, we simulated five
subsets. Every subset included informative features for demographic data (DEM) of a
patient. Additionally, each subset contained informative data in only one of the following
feature groups: coded diagnoses (ICD), applied procedures (PROC), laboratory
data (LAB), nursing assessment data (NURS), and transfer data (TRANS). The
remaining features of every subset were set to zero, i.e. non-informative.
We predicted the risk of delirium for every patient with Model A on each subset
and compared the results to the prediction with the complete data set. We calculated the
root mean squared error (RMSE) for all five subsets. The RMSE is the square root of
the average of the squared differences between the prediction with the complete data set
and the prediction with each subset; a high RMSE therefore represents a large difference
in prediction. Our aim was to evaluate the deviation of the predicted risk probabilities
from those of a complete data set, and therefore we did not evaluate the specificity or
sensitivity of the prediction.
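A condensed sketch of this simulation, assuming a fitted model_a, a complete feature DataFrame X, a list dem_cols of demographic columns, and a dict feature_groups mapping the group names (ICD, PROC, LAB, NURS, TRANS) to their column lists (all names here are hypothetical):

  import numpy as np

  p_full = model_a.predict_proba(X)[:, 1]                 # prediction with the complete data set
  for name, cols in feature_groups.items():
      X_sub = X.copy()
      informative = set(dem_cols) | set(cols)             # DEM features stay informative in every subset
      X_sub.loc[:, [c for c in X.columns if c not in informative]] = 0
      p_sub = model_a.predict_proba(X_sub)[:, 1]
      rmse = np.sqrt(np.mean((p_full - p_sub) ** 2))      # deviation from the complete-data prediction
      print(f"DEM + {name}: RMSE = {rmse:.3f}")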
3. Results
Table 2 shows the results of the prediction on a complete data set compared to five
subsets excluding informative features of certain feature groups. The total number of
informative features for every subset varied between 37 and 305. The RMSE of
information in demographic data (DEM) only was the highest with 0.54. The subset of
DEM was represented by 30 features and was part of each subset. The lowest RMSE,
and therefore the lowest deviation from the prediction with a complete data set, was
achieved by the subset including informative features of procedures (DEM + PROC). A
risk prediction with diagnoses (DEM + ICD) was furthest from the results of the
complete data set. Although in the subset of transfers (DEM + TRANS) information for
only seven further features was added to the 30 demographic features, the RMSE was
the second lowest.
Table 2. Root mean squared error (RMSE) for individual differences of 7,514 predicted risk probabilities
comparing the complete feature set with subsets. Every subset includes 30 features from the DEM feature set.
Feature group Number of features RMSE
DEM 30 0.540
DEM + ICD 305 0.491
DEM + LAB 83 0.361
DEM + NURS 126 0.235
DEM + PROC 125 0.180
DEM + TRANS 37 0.225
Figure 1 demonstrates the results of the prediction of two models for the same test data
set with missing information in laboratory and nursing assessment features. Model B,
trained without features of nursing assessment and laboratory, achieved better results in
delirium prediction than Model A. The AUROC for Model B was 0.830 [0.818, 0.841],
and 0.799 [0.787, 0.812] for Model A.
A DeLong test for correlated ROC curves showed a significant difference between
the curves of Model A and Model B (Z = 10.388, p < 0.001).
4. Discussion
Even though a prediction model performs well on the test data set, several obstacles
might occur during its implementation in a real clinical workflow. In May 2018, we
integrated a random forest model predicting the occurrence of delirium into the HIS of a
hospital in Austria. As one may not foresee all problems arising during implementation, a
constant evaluation of the implemented model is crucial. Our study raises awareness of
limitations emerging when a prediction model is implemented in a clinical workflow.
Also, we presented a way to overcome some of these limitations.
We applied our random forest model on data subsets with informative data missing
in different feature groups. The prediction using only demographic data and
information of procedures was closest to the prediction with the complete data set. A
subset with information of nursing assessment achieved the third closest prediction to
the complete data set. When showing the importance of nursing assessment in our
previous study [14], we trained separate models for combinations of feature groups.
This time, we used one model which was trained on the complete data set, and applied
it to data sets with missing information. We conclude that the performance of the same
model varies for patients with different information available, and that not all of this
variation can be explained by a global measure of variable importance.
At the time of admission, information used for prediction might be missing
not-at-random for some patients and is then set to zero for prediction. For patients with missing
laboratory data and nursing assessment, a model trained specifically for that scenario
(Model B) achieved better prediction results than the currently implemented model
trained with all features (Model A). This indicates that for the implementation scenario
in KAGes two models are needed: Depending on the availability of recent laboratory
data and nursing assessment data, Model A or Model B should be employed for
prediction at admission time. This conclusion is remarkable, as it shows that the choice
of model should depend on the information available at the time of prediction.
Figure 1. ROC curves for Model A (random forest trained with all features) and Model B (random forest
trained without nursing assessment and laboratory features). Prediction was computed on the same test data
with missing values of nursing assessment and laboratory data.
Acknowledgements
This work has been carried out with the K1 COMET Competence Centre CBmed,
which is funded by the Federal Ministry of Transport, Innovation and Technology
(BMVIT); the Federal Ministry of Science, Research and Economy (BMWFW); Land
Steiermark (Department 12, Business and Innovation); the Styrian Business Promotion
Agency (SFG); and the Vienna Business Agency. The COMET program is executed by
the FFG. KAGes and SAP provided significant resources, manpower and data as basis
for research and innovation.
References
[1] K. Kourou, T.P. Exarchos, K.P. Exarchos, M.V. Karamouzis, and D.I. Fotiadis, Machine learning
applications in cancer prognosis and prediction, Comput. Struct. Biotechnol. J. 13 (2015) 8–17.
doi:10.1016/j.csbj.2014.11.005.
[2] A. Wong, A.T. Young, A.S. Liang, R. Gonzales, V.C. Douglas, and D. Hadley, Development and
Validation of an Electronic Health Record–Based Machine Learning Model to Estimate Delirium Risk
in Newly Hospitalized Patients Without Known Cognitive Impairment, JAMA Netw. Open. 1 (2018)
e181018. doi:10.1001/jamanetworkopen.2018.1018.
[3] S. Hao, Y. Wang, B. Jin, A.Y. Shin, C. Zhu, M. Huang, L. Zheng, J. Luo, Z. Hu, C. Fu, D. Dai, Y.
Wang, D.S. Culver, S.T. Alfreds, T. Rogow, F. Stearns, K.G. Sylvester, E. Widen, and X.B. Ling,
Development, Validation and Deployment of a Real Time 30 Day Hospital Readmission Risk
Assessment Tool in the Maine Healthcare Information Exchange, PLOS ONE. 10 (2015) e0140271.
doi:10.1371/journal.pone.0140271.
[4] S.F. Weng, J. Reps, J. Kai, J.M. Garibaldi, and N. Qureshi, Can machine-learning improve
cardiovascular risk prediction using routine clinical data?, PLOS ONE. 12 (2017) e0174944.
doi:10.1371/journal.pone.0174944.
[5] A. Rajkomar, E. Oren, K. Chen, A.M. Dai, N. Hajaj, M. Hardt, P.J. Liu, X. Liu, J. Marcus, M. Sun, P.
Sundberg, H. Yee, K. Zhang, Y. Zhang, G. Flores, G.E. Duggan, J. Irvine, Q. Le, K. Litsch, A. Mossin,
J. Tansuwan, D. Wang, J. Wexler, J. Wilson, D. Ludwig, S.L. Volchenboum, K. Chou, M. Pearson, S.
Madabushi, N.H. Shah, A.J. Butte, M.D. Howell, C. Cui, G.S. Corrado, and J. Dean, Scalable and
accurate deep learning with electronic health records, Npj Digit. Med. 1 (2018). doi:10.1038/s41746-
018-0029-1.
[6] M. Islam, M. Hasan, X. Wang, H. Germack, and M. Noor-E-Alam, A Systematic Review on
Healthcare Analytics: Application and Theoretical Perspective of Data Mining, Healthcare. 6 (2018)
54. doi:10.3390/healthcare6020054.
[7] F. Jiang, Y. Jiang, H. Zhi, Y. Dong, H. Li, S. Ma, Y. Wang, Q. Dong, H. Shen, and Y. Wang, Artificial
intelligence in healthcare: past, present and future, Stroke Vasc. Neurol. 2 (2017) 230–243.
doi:10.1136/svn-2017-000101.
[8] E.G. Liberati, F. Ruggiero, L. Galuppo, M. Gorli, M. González-Lorenzo, M. Maraldi, P. Ruggieri, H.
Polo Friz, G. Scaratti, K.H. Kwag, R. Vespignani, and L. Moja, What hinders the uptake of
computerized decision support systems in hospitals? A qualitative study and framework for
implementation, Implement. Sci. 12 (2017). doi:10.1186/s13012-017-0644-2.
[9] R. Amarasingham, R.E. Patzer, M. Huesch, N.Q. Nguyen, and B. Xie, Implementing Electronic Health
Care Predictive Analytics: Considerations And Challenges, Health Aff. (Millwood). 33 (2014) 1148–
1154. doi:10.1377/hlthaff.2014.0352.
[10] D. Kramer, S. Veeranki, D. Hayn, F. Quehenberger, W. Leodolter, C. Jagsch, and G. Schreier,
Development and Validation of a Multivariable Prediction Model for the Occurrence of Delirium in
Hospitalized Gerontopsychiatry and Internal Medicine Patients., Stud. Health Technol. Inform. 236
(2017) 32–39.
[11] S.K. Inouye, R.G. Westendorp, and J.S. Saczynski, Delirium in elderly people, The Lancet. 383 (2014)
911–922. doi:10.1016/S0140-6736(13)60688-1.
[12] T.T. Hshieh, J. Yue, E. Oh, M. Puelle, S. Dowal, T. Travison, and S.K. Inouye, Effectiveness of
Multicomponent Nonpharmacological Delirium Interventions: A Meta-analysis, JAMA Intern. Med.
175 (2015) 512. doi:10.1001/jamainternmed.2014.7779.
[13] C. Strobl, A.-L. Boulesteix, A. Zeileis, and T. Hothorn, Bias in random forest variable importance
measures: Illustrations, sources and a solution, BMC Bioinformatics. 8 (2007). doi:10.1186/1471-2105-
8-25.
[14] S. Veeranki, D. Hayn, D. Kramer, S. Jauk, and G. Schreier, Effect of Nursing Assessment on
Predictive Delirium Models in Hospitalised Patients, Stud. Health Technol. Inform. (2018) 124–131.
doi:10.3233/978-1-61499-858-7-124.
[15] M. Kuhn, caret: Classification and Regression Training. R package version 6.0-78., 2017.
dHealth 2019 – From eHealth to dHealth
D. Hayn et al. (Eds.)
© 2019 The authors, AIT Austrian Institute of Technology and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/978-1-61499-971-3-73
Abstract. In medical education, Virtual Patients (VP) are often used to train
students in different scenarios, such as taking the patient's medical history or
deciding on a treatment option. Usually, such interactions are predefined by software
logic and databases following strict rules. At this point, Natural Language
Processing/Machine Learning (NLP/ML) algorithms could help to increase the
overall flexibility, since most of the rules can be derived directly from training data. This
would allow a more sophisticated and individual conversation between student and
VP. One type of technology that is heavily based on such algorithmic advances is
chatbots, or conversational agents. Therefore, a literature review is carried out to give
insight into existing educational ideas involving such agents. In addition, different
prototypes are implemented for the scenario of taking the patient's medical history,
responding with the classified intent of a generic anamnestic question. Although the
small number of questions (n=109) leads to a high SD during evaluation, all scores
(recall, precision, f1) already reach a level above 80% (micro-averaged). This is
a promising first step towards using these prototypes for taking the medical history of a VP.
1. Introduction
In the publication by Riemer and Abendroth [1], different approaches are presented for how
Virtual Patients (VP) can best be used in medical education; the systems listed there are
CASUS, CAMPUS and INMEDEA [1]. Typical task types in such case-based
learning systems are, for example, multiple-choice, long-menu, free-text, or assignment
questions [2]. In the current CAMPUS system, such an interaction element is used during
the anamnesis interview: students select an appropriate anamnestic question and receive
the VP's answer from CAMPUS. This dialog is based upon predefined anamnestic
questions, which are stored in a database. To build up additional competencies among
students, a more flexible and individual conversation should be considered. This requires
a change to the system, since storing an anamnestic question in every possible variation
isn't feasible. Hence, this research paper
examines various methods from the fields of Natural Language Processing/Machine
Learning (NLP/ML) to provide the capability of dealing with unknown questions. To
this end, different approaches (prototypes) are implemented using Python, evaluated by
leave-one-out cross-validation (LOO) and compared with each other using different
1 Corresponding Author: Andreas Reiswich, Heilbronn University, Max-Planck-Straße 39, 74081 Heilbronn, Germany, E-Mail: [email protected]
micro-averaged ML scores (recall, precision, f1). The overall aim is to determine the best
performing prototype given the small amount of available training data (n=109).
2. Methods
The literature databases MEDLINE1, IEEE Digital Library2 and ACM Digital Library3
were used to conduct a systematic review of chatbots that were utilized in an educational
environment. The underlying method was derived from the PRISMA flow diagram [3].
The year range was set to 2015 to 2018 for all databases. For IEEE, the options
Full Text & Metadata and My Subscribed Content were selected for the search; for ACM,
the option ACM Full-Text Collection with Any field for the search terms; and for PubMed,
the option All Fields. When performing the search on all three databases, the following
search string was applied (ACM result syntax output):
(+Chatbot* Training Education Apprenticeship Teaching)
In addition, the TeXMed [4] website was used to generate a BibTex file for PubMed.
All extracted results were then managed with Zotero4. In addition, an Excel file was
used to summarize relevant information for a set of predefined dimensions of
interest. These dimensions comprise, for example, the technical solution, such as the
usage of the Artificial Intelligence Markup Language (AIML), Machine Learning (ML)
algorithms and the implemented user interface (UI). Aspects related to an educational
concept were also considered. The final update of all literature entries was conducted on
26th January 2018.
All ML prototypes were developed either directly in Python (scikit-learn [5]) or with an
adapter class for Rasa NLU [6]. Solutions that weren't available in Python were
excluded. Python was used as it is the core language for many scientific fields, including
Artificial Intelligence (AI) & ML, while still offering highly readable code [7]. This
allowed a more efficient use of our own software fragments and created a uniform
approach for the overall evaluation, using scikit-learn's built-in method cross_val_score [8]
for the LOO cross-validation. In addition, Rasa NLU was selected as an open
source chatbot platform that allows execution on a private server. This can become a
prerequisite, for example, if chatbots build on sensitive data from patients or
students; therefore, external service providers such as Google or Facebook were
excluded from the evaluation. In order to use Rasa for classification, all anamnestic
questions and the corresponding intents were specified in a separate file using the
required markdown language. Each intent was described by a heading, followed by an
enumeration of all corresponding anamnestic questions in plain German as the
chatbot's knowledge base. An example for a question and its intent is: “What medical
complaints do you have?” (intent: complaints_identification). No entities and no sentence
modifications were applied in this step for Rasa.
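For illustration, such a training data file in the Rasa NLU markdown format might look as follows; the first intent name is taken from the example above, the smoking question from the data-set description in Section 3, and the second question under the first intent is an invented variation:

  ## intent:complaints_identification
  - What medical complaints do you have?
  - Which complaints brought you here today?

  ## intent:Smoking
  - How much do you currently smoke per day?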
2 https://ieeexplore.ieee.org/Xplore/home.jsp, last access: 29.01.2019.
3 https://dl.acm.org/, last access: 29.01.2019.
4 https://www.zotero.org/, last access: 29.01.2019.
3. Results
The literature review led to 332 papers on IEEE, 124 papers on ACM and 2 papers on
PubMed. They were then added to Zotero using the generated BibTex files, which also
included abstracts if available. In summary, 458 papers were considered from these three
databases using standard export and import of each database provider, TeXMed and
Zotero. Nonetheless, several files were removed before or during the process of sighting
each paper’s title and abstract. The reason was either duplicates (5), no valuable
information (24), e.g. referring only to a schedule [11] or having no access (1) [12].
Finally, 428 papers could be usefully sighted for title and abstract. At this stage, all
chatbots were considered, which were integrated in a more advanced educational concept,
e.g. in a concept of a Massive Open Online Course (MOOC). Therefore, results like
answering FAQs of a university [13], supporting the degree program choice of a student
[14] were not further reviewed. In addition, all results were excluded, which weren’t
enough focusing on a chatbot approach, e.g. only listing a chatbot as an example [15].
After sighting each title and abstract, 36 papers from IEEE, 7 from ACM and 1 from
PubMed were sighted on their full text information, leading to 21 accepted publications
(14 IEEE and 7 ACM papers). During this step, the paper quality itself wasn’t considered
as an additional criterion, only the chatbot context was decisive. In the following, a short
summary of the results related to the educational setting is presented.
Frequently, chatbots were integrated into a MOOC scenario [16][17][18][19].
Demetriadis et al. [16] focused on creating a more productive talk using transactive
questions and conceptual links to shape the relevant domain model of a task. Kloos et al.
[17] proposed a chatbot complementing a MOOC, which allows learning Java in several
interaction modes, such as review and gaming. Besides MOOCs, conversational agents
were also applied in Virtual Reality (VR) [20][21][22], creating an immersive educational
environment. Tsaramirsis et al. [21], for example, simulated the experiences of a student
in a classroom, including the communication with the lecturer. If the lecturer didn't
respond to a student's question, an inbuilt AIML chatbot was used to generate the answer.
5 https://exploratory.io/, last access: 29.01.2019.
6 https://www.scikit-yb.org/en/latest/, last access: 20.03.2019.
7 https://slack.com/, last access: 29.01.2019.
Other chatbot realizations covered the use case of language learning [20][23][24].
Troussas et al. [23] developed a mobile chatbot for learning vocabulary through text or
voice response. Furthermore, gamification elements were incorporated by [25][26][27].
Pereira [25], for example, created a quiz chatbot for students in different subjects;
a Telegram UI was used, since students were familiar with such instant
messaging services [25]. Besides these results, Webber [28] was the only fitting VP
approach referenced within the literature results. However, Webber's work builds on a
rule-based SQL approach [29] published in 2005 [28]. Therefore, the intention
of this paper is to revisit the concept of a VP chatbot, considering, next to classical ML
methods, a modern approach (Rasa NLU) and a mobile chat app (Slack), as
suggested by Io and Lee [30] in a recent bibliometric analysis of chatbots.
Several data sources were integrated to train and evaluate the different prototypes. These
included questions from the CAMPUS database and online resources [31][32][33][34],
which were used to generate data sets with a modified version of the textcorpus-generator8
project. Each anamnestic question was annotated with a single intent, based on a personal
assessment informed by these resources. A typical question from the data set is, for
example, “How much do you currently smoke per day?” (intent: Smoking) or the one
given in Section 2.2.
Table 1. Basic properties of the anamnestic questions corpora
The basic properties of the underlying question corpora are described by several
key features, as shown in Table 1. For each corpus, either the median
(MED), average (AVG), minimum (MIN) or maximum (MAX) was calculated.
8 https://github.com/pagesjaunes/textcorpus-generator, last access: 09.12.2018.
The distribution of questions across the intents represents a generally expected
imbalance but doesn't claim to reflect the future reality.
The underlying assumption is based on the opinion that certain intents won’t allow to
create the same amount of questions because they are either more specific (e.g. Person
weight) or more generally designed (e.g. Smoking). A future study must therefore show
how to define intents to avoid overlaps before ML classification.
Figure 3. Slack integration of the anamnesis chatbot. User chooses Rasa NLU as a classifier. Input sentence
contains an intentional spelling mistake and the chatbot returns the classified intent and its confidence value.
Finally, all created prototypes were bundled into a Slack application. Figure 3
illustrates the final Slack UI using the Rasa prototype for intent classification. Thereby,
each single prototype is selectable by using the keyword: set_mode_x where x stands for
the number of the individual prototype, e.g. x=2 for Rasa (Figure 3). All additional
mappings can be listed by using the chatbot’s help command. All in all, the use of Slack
creates a UI, which allows the communication between student and VP through a
modern, well-known messaging service.
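The mode switching can be thought of as a simple dispatch in the bot's message handler; a hypothetical sketch follows, where the prototype objects and their common classify interface are assumptions for illustration, not the actual implementation:

  # hypothetical dispatch behind the set_mode_x keyword
  prototypes = {1: lsvc_prototype, 2: rasa_prototype, 3: nb_prototype}  # assumed numbering
  current = prototypes[2]                                               # Rasa, as in Figure 3

  def handle_message(text: str) -> str:
      global current
      if text.startswith("set_mode_"):
          current = prototypes[int(text.rsplit("_", 1)[1])]
          return "classifier switched"
      intent, confidence = current.classify(text)  # assumed common adapter interface
      return f"{intent} ({confidence:.2f})"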
Table 2. Micro-averaged scores (mean ± SD, in %) from the LOO cross-validation: recall, precision, f1.
Scikit-learn
RF*  85.321 ± 35.390 | 84.404 ± 36.282 | 82.569 ± 37.938
NB   81.651 ± 38.706 | 81.651 ± 38.706 | 81.651 ± 38.706
LSVC 88.991 ± 31.300 | 88.991 ± 31.300 | 88.991 ± 31.300
LR   85.321 ± 35.390 | 85.321 ± 35.390 | 85.321 ± 35.390
Since the total amount of available data for a specific intent was small, the LOO
cross-validation method was selected as the test procedure. In the next step, each scikit-
learn prototype inherited the BaseEstimator9 class and implemented the adapter methods
fit(self, X, y) and predict(self, X). Subsequently, an object of this class was passed to the
cross_val_score method to perform the final evaluation. Thereby, the following scoring
methods (micro-averaged) were applied: recall, precision and f1. The results of this
approach are listed in Table 2, where each value represents an average over all
n measurements. All prototypes achieve a score of over 80% for each individual
metric, with Rasa and LSVC delivering the best results (RF excluded because of its
high fluctuation). However, since the SD of each score is very high due to LOO, all
current prototypes (Section 3.2) should remain selectable in the UI (Section 3.3) for a
future field study. This would allow recording not only new real data for cross-validation
but also gaining feedback on the individual perception of use by each medical student. Both
insights could then be analyzed to select the final prototype for use in a case-based
learning system.
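As a condensed sketch of this evaluation setup (questions and intents are assumed parallel lists of the 109 annotated questions; the TF-IDF/LinearSVC combination stands in for one of the actual prototypes):

  import numpy as np
  from sklearn.base import BaseEstimator
  from sklearn.feature_extraction.text import TfidfVectorizer
  from sklearn.model_selection import LeaveOneOut, cross_val_score
  from sklearn.svm import LinearSVC

  class PrototypeAdapter(BaseEstimator):
      # adapter exposing fit/predict so that cross_val_score can drive the prototype
      def fit(self, X, y):
          self.vec_ = TfidfVectorizer()
          self.clf_ = LinearSVC()
          self.clf_.fit(self.vec_.fit_transform(X), y)
          return self

      def predict(self, X):
          return self.clf_.predict(self.vec_.transform(X))

  X = np.array(questions, dtype=object)
  y = np.array(intents)
  scores = cross_val_score(PrototypeAdapter(), X, y, cv=LeaveOneOut(), scoring="f1_micro")
  print(scores.mean(), scores.std())  # mean and SD as reported in Table 2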
4. Discussion
Considering the given data that is shown in Table 1, both mean values indicate that the
total number of words and characters in the training data is low. In contrast, the proportion
of stop words, at about 29%, is quite high. Bearing all this in mind, the subsequent ML
algorithms had only little valuable information to work with and still performed well
(>80%, but with a high SD due to LOO). Further improvements could be made by ML
parameter optimizations or by increasing the data quality, e.g. by allowing medical students
or experts to ask questions and subsequently integrating their feedback for training purposes.
9 https://scikit-learn.org/stable/modules/generated/sklearn.base.BaseEstimator.html, last access: 12.02.2019.
The results of Table 2 also show that there are only minor differences in terms of
classification performance between the Rasa NLU platform and the assembled scikit-
learn implementations. It would be interesting to investigate whether lemmatizing or
additional feature extractions, e.g. from grammatical structures, could lead to further
performance improvements for the scikit-learn prototypes.
The current version of the Slack UI could allow an unstructured anamnesis survey
between chatbot and user by replacing the returned intent with a VP answer. It can be
used on a computer as well as within a mobile application. For future development, the
dialog system could be extended by storylines, e.g. Rasa stories, making the overall
conversation more sophisticated. At this point, additional concepts like voice commands
[20] or a VR avatar [21] can also be considered. If there are no further improvements in
data quality, the interaction between user and chatbot might be facilitated by additional
UI elements, as in Fadhil and Villafiorita [26]. This could help build a feedback
channel, e.g. displaying a small set of intents for user selection if the confidence of the
chatbot isn't sufficiently high. As a result, the user could indicate a suitable
intention or deny the given suggestion completely. These statements could then be
forwarded to an author’s Slack workspace for revision and re-added to the chatbot for
Reinforcement Learning.
References
[1] M. Riemer and M. Abendroth, Virtuelle Patienten: Wie werden sie aus Sicht von Medizinstudierenden am
besten eingesetzt?, Ger. Med. Sci. GMS E-J., (2013).
[2] M. R. Fischer et al., Virtuelle Patienten in der medizinischen Ausbildung: Vergleich verschiedener
Strategien zur curricularen Integration, Z. Für Evidenz Fortbild. Qual. Im Gesundheitswesen, 102(10),
(2008), 648–653.
[3] PRISMA, PRISMA Flow Diagram, http://prisma-statement.org/prismastatement/flowdiagram.aspx, last
access: 29.01.2019.
[4] TeXMed, TexMed – a BibTeX interface for PubMed, https://www.bioinformatics.org/texmed/, last access:
29.01.2019.
[5] F. Pedregosa et al., Scikit-learn: Machine Learning in Python, J. Mach. Learn. Res., 12, 2011, 2825–2830.
[6] Rasa NLU, Rasa NLU: Language Understanding for chatbots and AI assistants, https://rasa.com/docs/nlu/,
last access: 30.01.2019.
[7] G. Rashed and R. Ahsan, Python in Computational Science: Applications and Possibilities, Int. J. Comput.
Appl., 46(20), 2012, 26-30.
[8] Scikit-learn, API Reference, https://scikit-learn.org/stable/modules/classes.html#module-
sklearn.model_selection, last access: 29.01.2019.
[9] Scikit-learn, Text feature extraction, https://scikit-learn.org/stable/modules/feature_extraction.html#text-
feature-extraction, last access: 22.01.2019.
[10] Scikit-learn, Model evaluation: quantifying the quality of predictions, https://scikit-
learn.org/stable/modules/model_evaluation.html, last access: 29.01.2019.
[11] IEEE Xplore, Schedule, 2018 Zooming Innovation in Consumer Technologies Conference (ZINC),
(2018).
[12] S. Garg et al., Clinical Integration of Digital Solutions in Health Care: An Overview of the Current
Landscape of Digital Technologies in Cancer Care, JCO Clin Cancer Inf., 2(2), 2018, 1-9.
[13] B. R. Ranoliya et al., Chatbot for university related FAQs, 2017 International Conference on Advances
in Computing, Communications and Informatics (ICACCI), 2017, 1525–1530.
[14] S. Mirri et al., User-driven and open innovation as app design tools for high school students, in 2018
IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications
(PIMRC), 2018, 6–10.
[15] C. Kuo and J. Z. Shyu, An Innovative Syndicate Medium Ecosystem, in 2018 IEEE International
Symposium on Innovation and Entrepreneurship (TEMS-ISIE), 2018, 1–5.
[16] S. Demetriadis et al., Conversational Agents as Group-Teacher Interaction Mediators in MOOCs, in 2018 Learning With MOOCS (LWMOOCS), 2018, 43–46.
[17] C. D. Kloos et al., Design of a Conversational Agent as an Educational Tool, in 2018 Learning With
MOOCS (LWMOOCS), 2018, 27–30.
[18] H. Hsu and N. Huang, Xiao-Shih: The Educational Intelligent Question Answering Bot on Chinese-Based
MOOCs, in 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA),
2018, 1316–1321.
[19] A. Mitral et al., MOOC-O-Bot: Using Cognitive Technologies to Extend Knowledge Support in MOOCs,
in 2018 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE),
2018, 69–76.
[20] A. Berns et al., Exploring the Potential of a 360° Video Application for Foreign Language Learning, in
Proceedings of the Sixth International Conference on Technological Ecosystems for Enhancing
Multiculturality, New York, 2018, 776–780.
[21] G. Tsaramirsis et al., Towards simulation of the classroom learning experience: Virtual reality approach,
in 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom),
2016, 1343–1346.
[22] I. Stanica et al., VR Job Interview Simulator: Where Virtual Reality Meets Artificial Intelligence for
Education, in 2018 Zooming Innovation in Consumer Technologies Conference (ZINC), 2018, 9–12.
[23] C. Troussas et al., Integrating an Adjusted Conversational Agent into a Mobile-Assisted Language
Learning Application, in 2017 IEEE 29th International Conference on Tools with Artificial Intelligence
(ICTAI), 2017, 1153–1157.
[24] X. L. Pham et al., Chatbot As an Intelligent Personal Assistant for Mobile Language Learning, in
Proceedings of the 2018 2Nd International Conference on Education and E-Learning, New York, 2018,
16–21.
[25] J. Pereira, Leveraging Chatbots to Improve Self-guided Learning Through Conversational Quizzes, in
Proceedings of the Fourth International Conference on Technological Ecosystems for Enhancing
Multiculturality, New York, 2016, 911–918.
[26] A. Fadhil and A. Villafiorita, An Adaptive Learning with Gamification & Conversational UIs: The Rise
of CiboPoliBot, in Adjunct Publication of the 25th Conference on User Modeling, Adaptation and
Personalization, New York, 2017, 408–412.
[27] K. Katchapakirin and C. Anutariya, An Architectural Design of ScratchThAI: A Conversational Agent
for Computational Thinking Development Using Scratch, in Proceedings of the 10th International
Conference on Advances in Information Technology, New York, 2018, 7:1–7:7.
[28] G. M. Webber, Data Representation and Algorithms For Biomedical Informatics Applications, PhD
thesis, Harvard University, 2005.
[29] A. S. Lokman, J. M. Zain, F. S. Komputer and K. Perisian, Designing a Chatbot for diabetic patients,
International Conference on Software Engineering & Computer Systems, 2009.
[30] H. N. Io and C. B. Lee, Chatbots and Conversational agents: A bibliometric analysis, in IEEE
International Conference on Industrial Engineering and Engineering Management, 2017, 215-219.
[31] Jairvargas, Anamnese, https://www.slideshare.net/jairvargas/anamnese-44468748, last access:
29.01.2019.
[32] Alk-info.com, Alkoholtest mit 22 Fragen, Schnell-Test auf Alkoholgefährdung, https://www.alk-info.com/tests/print/439-alkoholtest-mit-22-fragen-schnell-test-auf-%20alkoholgefaehrdung, last access: 21.03.2019.
[33] U. Latza et al., Erhebung, Quantifizierung und Analyse der Rauchexposition in epidemiologischen
Studien, Robert Koch Institut, 2005.
[34] Robert Koch Institut, Journal of Health Monitoring – Fragebogen zur Studie „Gesundheit in Deutschland aktuell" (GEDA 2014/2015-EHIS), https://www.rki.de/DE/Content/Gesundheitsmonitoring/Gesundheitsberichterstattung/GBEDownloadsJ/Supplement/JoHM_2017_01_gesundheitliche_lage9.pdf?__blob=publicationFile, last access: 11.02.2019.
Evaluation of Deep Clustering for Diarization of Aphasic Speech
D. Klischies et al.
doi:10.3233/978-1-61499-971-3-81
Abstract. Speaker attribution and labeling of single channel, multi speaker audio files is an area of active research, since the underlying problems have not been solved satisfactorily yet. This especially holds true for non-standard voices and speech, such as that of children and impaired speakers. Being able to perform speaker labeling of pathological speech would potentially enable the development of computer assisted diagnosis and treatment systems and is thus a desirable research goal. In this manuscript we investigate the applicability of embeddings of audio signals, in the form of time and frequency-band based segments, into arbitrary vector spaces to the diarization of pathological speech. We focus on modifying an existing embedding estimator such that it can be used for diarization. This is mainly done via clustering the time and frequency band dependent vectors and subsequently performing a majority vote procedure on all frequency dependent vectors of the same time segment to assign a speaker label. The result is evaluated on recordings of interviews of aphasia patients and language therapists. We demonstrate general applicability, with error rates that are close to what has been previously achieved in diarizing children's speech. Additionally, we propose to enhance the processing pipelines with smoothing and a more sophisticated, energy-based voting scheme.
1. Introduction
Aphasia is a language disorder usually acquired from strokes or other causes of brain damage. It is usually not related to motor or sensory impairments, but to a loss of the brain's capability to formulate language. The gold standard for aphasia classification and
severity measurement in Germany is the Aachen aphasia test (AAT) [1]. It consists of
several procedures, testing the patient’s linguistic capabilities in scenarios like image
description, storytelling and spontaneous speech. Conducting and evaluating a complete
AAT takes up to eight hours of work by a professional speech and language therapist or
neurologist and requires that the patient is present at the clinic.
Corresponding Author: Daniel Klischies, Institute of Information Management in Mechanical
Engineering (IMA), RWTH Aachen University, Germany, E-Mail: [email protected].
Our long term goal is to develop a method to automatically estimate aphasia severity
and syndrome classification based on a preexisting recording of a patient interview. In
order to do this, we need to separate the therapist's speech segments from the patient's
speech segments. This process, also called diarization, has seen significant research
interest in the past. While originally conducted using clustering procedures based on
immediate features of the underlying audio signal, recent developments (such as [2, 3])
suggested that generating clustering features using neural networks yields better results.
The specific use case of aphasic speech also yields some additional challenges and characteristics with regard to diarization. Since symptoms of aphasia can include stuttering and extensive use of filler words, we require that these are included in the diarization output. We also cannot base the diarization on semantic or syntactic linguistic properties, since both capabilities might be severely reduced as an effect of aphasia. Lastly, recordings of aphasia patients are rare and hard to obtain, because the prevalence of aphasia within the population is relatively low and obtaining recordings usually includes adhering to strict data protection rules. This results in significant problems when training any machine learning based classifier, as training material is scarce. Additionally, virtually all recordings of aphasia patients are single source recordings, eliminating the possibility to use multi-source diarization procedures. This specifically holds true for a set of aphasic speech data we currently possess and are planning to analyze, and which was the original motivation to perform single source diarization.
In general, single source speaker diarization systems require a set of metrics that can
be used to locally cluster temporal segments of speech, such that each cluster represents
a segment of speech by the same speaker. This can be implemented either bottom up, by
first creating many small segments and subsequently merging those segments, or top
down, by splitting segments as long as they are suspected to comprise more than one speaker. One possibility to perform bottom up clustering is to implement the initial
splitting based on a sliding window over the audio signal and subsequently clustering
these segments. Such an approach has recently been investigated by Wang et al. [3].
Their diarization system uses a long short-term memory (LSTM) network, which is a
memory cell based recurrent neural network [4], to derive a vector space embedding of
the sliding window segments and clusters them using different procedures. The most
promising clustering procedures are k-means clustering and spectral clustering, with a
diarization error rate of roughly 12%, depending on the evaluation data set. They
compared their results to the diarization error rate of a similar system that uses Gaussian
mixture models (GMMs [5, pp. 40-42]) instead of LSTMs to derive the embeddings,
which performed worse by at least 8 percentage points.
A combination of both bottom up and top down clustering has been implemented by Bozonnet et al. in [6]. Their integrated approach led to an improvement in diarization error rate by 4 percentage points, although it generally performed worse than the system developed by Wang et al., albeit on different evaluation data. This is most probably because the latter uses LSTM based embeddings, while the system by Bozonnet et al. uses Gaussian mixture models.
In 2010, Meignier and Merlin published a paper describing the LIUM toolkit for
development of diarization systems [7]. Contrary to the other systems presented, this
system provides several building blocks for the implementation of diarization utilities,
based on agglomerative clustering. Due to its release date, this framework does not use
LSTMs but the older GMM approach.
Lastly, in 2016 Hershey et al. proposed a method to generate vector space
embeddings from speech data using LSTMs [2, 4]. These embeddings are generated
2. Methods
Applying deep clustering as introduced by Hershey et al. in [2] in practice raises several details that might influence the results dramatically. These details revolve around how much training data is required to train an estimator such that the embeddings are sufficiently discriminative to fulfill the aforementioned criteria for successful clusterings. Additionally, we are not interested in speech separation but in diarization, requiring a slight alteration of Hershey's proposed algorithm. In order to collapse the time frequency bins into time bins, we employ a majority voting scheme: For each set of time frequency bins representing the same time slot t, we count how many frequencies have been assigned to which speaker. In the next step we assign the time slot t of the input signal to the speaker to whom the most time frequency bins were assigned. Under the hypothesis that the acoustically dominant speaker of a time slot also dominates most of the time frequency bins of said time slot, this majority voting allows us to diarize an input file such that we always get the whole spectrum assigned to a single speaker. This saves us
from having to deal with a signal reconstruction problem that the original separation
procedure suffers from: If one does assign time-frequency bins of a signal to different
speakers, all those frequency bands that have not been assigned to a speaker would be
missing from the output signal. Isik et al. proposed some reconstruction methods for
these parts of the signal [8], but since we are dealing with pathological speech any
reconstruction methods based on assumptions of non-pathological speech could
introduce incorrect additions and reduce the quality of a diagnosis based on the
reconstructed signal.
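A minimal Python sketch of this majority vote, assuming the clustering step has already produced a speaker index for every time-frequency bin (an illustration, not the authors' actual implementation):

```python
import numpy as np

def majority_vote(bin_labels: np.ndarray, n_speakers: int) -> np.ndarray:
    """bin_labels[t, f]: speaker index assigned to time-frequency bin (t, f)."""
    n_frames = bin_labels.shape[0]
    frame_labels = np.empty(n_frames, dtype=int)
    for t in range(n_frames):
        # Count how many frequency bins of time slot t belong to each speaker
        counts = np.bincount(bin_labels[t], minlength=n_speakers)
        frame_labels[t] = counts.argmax()  # acoustically dominant speaker
    return frame_labels
```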
Our implementation of the deep clustering algorithm itself is based on an implementation by Haoran Zhou, who implemented deep clustering using TensorFlow (https://github.com/zhr1201/deep-clustering). We modified his work such that it supports Python 3, resolved some minor bugs and adopted the frequency band majority voting strategy presented above.
We preprocess the data by down-sampling the signal to 8 kHz and generate 129 Fourier transformation points per frame, using a window size of 256 samples per Fourier transformation, such that we get Hanning windows of length 256/8000 Hz = 0.032 s. The network itself consists of four bidirectional LSTM layers with 300 memory units per layer, followed by a layer with a hyperbolic tangent activation function to estimate the embedding. Finally, the embedding is normalized based on its L2 norm.
We use a dropout of 50% for the forward propagation and 20% for the recurrent propagation of errors; the estimated embedding space has 40 dimensions.
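A sketch of this preprocessing, under the assumption that SciPy's STFT is used (the manuscript does not name the tooling, and the exact features fed to the network are simplified here to log magnitudes):

```python
import numpy as np
from scipy import signal

def preprocess(samples_8khz: np.ndarray) -> np.ndarray:
    # 256-sample Hanning windows at 8 kHz: 256 / 8000 Hz = 0.032 s
    f, t, Zxx = signal.stft(samples_8khz, fs=8000, window="hann", nperseg=256)
    assert Zxx.shape[0] == 129  # 256 // 2 + 1 frequency bins per frame
    return np.log(np.abs(Zxx).T + 1e-8)  # time-major log-magnitude frames
```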
The classifier has been trained for a week using an Nvidia Titan X (Pascal architecture), which was sufficient for 352,000 training steps. For the training corpus, we mixed (non-aphasic) speech files from the 360-hour LibriSpeech audio book corpus [9], such that we get training files with two simultaneously speaking speakers per file. We
do this by combining 20 files per speaker with some other randomly chosen file
containing another speaker. The training is thus based on the original usage of the classifier, as proposed by Hershey et al., and the resulting classifier could also be used
for speech separation. Our majority voting scheme is only applied after the training is
completed and the classifier is being evaluated.
In order to evaluate the results, we use a slightly modified version of the diarization error
rate (DER), which was originally proposed by the National Institute of Standards and
Technology (NIST) (cf. [10]). Given a prediction and a ground truth set of speaker labels,
the DER quantifies the correctness of the prediction. The ultimate goal is to develop a
diarization procedure that yields predictions with a DER of 0. While the NIST definition
measures how much of the overall recording time was incorrectly attributed, we want to
measure how many of the potential speaker labels are incorrect. This penalizes classifiers
that do not detect overlapping speech properly more than the NIST definition: In the
NIST definition, a segment that actually contains two speakers but was classified as
silence increases the DER just as much as a segment that contains two speakers but was
classified to contain one speaker. In our definition, classifying this segment to contain
silence is twice as bad as classifying it to contain a single speaker.
For an audio recording of length $T_\Sigma$ with a frame rate $B$, for which we know that it contains $N$ speakers, we define the maximum amount of possibly incorrectly assigned labels as $E_{max} = T_\Sigma \cdot B \cdot N$. Furthermore, we define $L \in \mathbb{Z}_2^{(T_\Sigma \cdot B) \times N}$ to be our ground truth speaker label matrix, where $L_{i,j} = 1$ iff in the $i$-th frame the $j$-th speaker is active, and $P \in \mathbb{Z}_2^{(T_\Sigma \cdot B) \times N}$ to be the estimated speaker label matrix. Then we can decompose the DER of $P$ given $L$ into the following components:

$E_{fa} = \sum_{0 \le i < T_\Sigma \cdot B,\; \sum_j L_{i,j} = 0} \frac{\sum_{j=1}^{N} P_{i,j}}{E_{max}}$ (1)

$E_{miss} = \sum_{0 \le i < T_\Sigma \cdot B,\; \sum_j P_{i,j} = 0} \frac{\sum_{j=1}^{N} L_{i,j}}{E_{max}}$ (2)

$E_{err} = \sum_{0 \le i < T_\Sigma \cdot B,\; \sum_j L_{i,j} > 0 \,\wedge\, \sum_j P_{i,j} > 0} \frac{\sum_{j=1}^{N} |(P - L)_{i,j}|}{E_{max}}$ (3)
$E_{fa}$ is the false alarm rate, which we define as the percentage of possible speech label tags that were marked as non-silence but were actually silence. A high false alarm rate indicates that there is an issue with the voice activity detector (VAD) of the diarization procedure. Analogously, $E_{miss}$ is the percentage of speech labels that were classified as silence but were actually speech. If this value is high, then the VAD algorithm is too restrictive, as it misses some speech. Lastly, $E_{err}$ is the percentage of incorrectly assigned speech labels in regions where there was no silence according to the ground truth and the diarization algorithm's VAD.
Given these components, the DER can be calculated as described in equation 4:

$DER = E_{fa} + E_{miss} + E_{err}$ (4)
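A direct NumPy translation of equations (1)-(4), assuming binary label matrices L and P of shape (T_Σ·B, N) as defined above:

```python
import numpy as np

def der_components(L: np.ndarray, P: np.ndarray):
    e_max = L.shape[0] * L.shape[1]          # T_sigma * B * N
    gt_silence = L.sum(axis=1) == 0          # ground truth: nobody speaks
    pred_silence = P.sum(axis=1) == 0        # prediction: nobody speaks
    e_fa = P[gt_silence].sum() / e_max       # speech predicted in silence (1)
    e_miss = L[pred_silence].sum() / e_max   # speech classified as silence (2)
    both = ~gt_silence & ~pred_silence
    e_err = np.abs(P[both] - L[both]).sum() / e_max  # wrong speaker (3)
    return e_fa, e_miss, e_err, e_fa + e_miss + e_err  # DER (4)
```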
2.2. Benchmarking
3. Results
Our evaluation data set is based on the AphasiaBank data set [11]. AphasiaBank is a data set composed of transcribed video recordings of semi-standardized interview scenarios between aphasia patients and therapists. We downloaded all those recordings and automatically split them into utterances labeled with the information whether the therapist or the patient is speaking in that particular utterance, based on the timestamps and speaker labels of the transcripts. Since the therapist usually does not change between different recordings of the same data set, we store all therapist utterances of the same institution as if they came from one recording, while patient utterances are separated such that each recording leads to a separate set of patient utterances.
We recombined a subset of the AphasiaBank speaker files, such that we get audio files with a minimum length of 5 seconds and at most 3 seconds per utterance. The latter value roughly matches average speaker durations in common evaluation data sets for diarization of healthy speech [12]. For each speaker of the subset, we randomly chose at least 5 utterances and combined each of them with an utterance from another, randomly chosen speaker. If that combination was not at least 5 seconds long, we appended additional utterances from the same speakers until 5 seconds of file length were reached. This length requirement ensures that we get a balanced set of evaluation data. This is particularly relevant because, depending on the aphasia syndrome, patients tend to speak significantly longer or shorter than the therapist. Additionally, we did not only mix speech of patients with speech of therapists, but also with speech of other patients. This allows us to judge the quality of the classifier for diarization of aphasic speech in general, and not only in scenarios where exactly one speaker suffers from aphasia. This would not have been possible without recombining the files, as we do not possess recordings containing multiple patients.
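Schematically, the recombination can be sketched as follows; utterances are reduced to their durations and the pairing logic is an assumption based on the description above:

```python
import random

MIN_FILE_SECONDS, MAX_UTT_SECONDS = 5.0, 3.0

def mix_pair(utts_a: list, utts_b: list) -> list:
    """utts_a, utts_b: utterance durations (in seconds) of two speakers."""
    mixture, length = [], 0.0
    while length < MIN_FILE_SECONDS and utts_a and utts_b:
        for utts in (utts_a, utts_b):
            # Draw a random utterance, capped at 3 s, and append it
            dur = min(utts.pop(random.randrange(len(utts))), MAX_UTT_SECONDS)
            mixture.append(dur)
            length += dur
    return mixture  # alternating utterances, aiming for at least 5 s in total
```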
The result of this process was 125 separate audio files. Diarizing them with deep clustering led to a mean DER of 27.94% (minimum 13.9%, maximum 39.19%, standard deviation 0.0439). Since the way we compose the input files does not allow for overlap (no crossfade) or gaps, the "false alarm" and "miss" error rates do not play a role in this evaluation, and we only rely on the "error" part of the diarization metric.
References
[1] Walter Huber et al., Aachener Aphasie Test (AAT): Handanweisung, Verlag für Psychologie, Hogrefe,
1983.
[2] John R. Hershey et al., Deep clustering: Discriminative embeddings for segmentation and separation,
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2016), 31-35.
[3] Quan Wang et al., Speaker Diarization with LSTM, IEEE International Conference on Acoustics, Speech
and Signal Processing (ICASSP) (2018), 5239-5243.
[4] Sepp Hochreiter and Jürgen Schmidhuber, Long Short-Term Memory, Neural Computation 9.8 (1997),
1735-1780.
[5] Geoffrey McLachlan and David Peel, Finite Mixture Models, John Wiley & Sons, Hoboken, 2000.
[6] Simon Bozonnet et al., An integrated top-down/bottom-up approach to speaker diarization, Eleventh
Annual Conference of the International Speech Communication Association (INTERSPEECH) (2010),
2646-2649.
[7] Sylvain Meignier and Teva Merlin, LIUM SpkDiarization: an open source toolkit for diarization, CMU
SPUD Workshop, 2010.
[8] Yusuf Isik et al., Single-channel multi-speaker separation using deep clustering, 17th Annual Conference
of the International Speech Communication Association (INTERSPEECH) (2016), 545-549.
[9] Vassil Panayotov et al., Librispeech: an ASR corpus based on public domain audio books, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2015), 5206-5210.
[10] Jonathan G. Fiscus et al., The rich transcription 2005 spring meeting recognition evaluation, in:
International Workshop on Machine Learning for Multimodal Interaction, Springer, Heidelberg, 2005,
369-389.
[11] Brian MacWhinney et al., AphasiaBank: Methods for studying discourse, Aphasiology 25.11 (2011),
1286-1307.
[12] Xavier Anguera et al., Speaker diarization: A review of recent research, IEEE Transactions on Audio,
Speech and Language Processing 20.2 (2012), 356-370.
[13] Alejandrina Cristia et al., Talker diarization in the wild: The case of child-centered daylong audio-
recordings, 20th Annual Conference of the International Speech Communication Association
(INTERSPEECH) (2018), 2583-2587.
[14] Victoria M. Garlock et al., Age-of-acquisition, word frequency, and neighborhood density effects on
spoken word recognition by children and adults, Journal of Memory and language 45.3 (2001), 468-492.
[15] Yanna Ma and Akinori Nishihara, Efficient voice activity detection algorithm using long-term spectral
flatness measure, EURASIP Journal on Audio, Speech, and Music Processing 2013.1 (2013), 87.
Ensemble Based Approach for Time Series Classification in Metabolomics
M. Netzer et al.
doi:10.3233/978-1-61499-971-3-89
1. Introduction
Corresponding Author: Michael Netzer, UMIT Hall, Eduard-Wallnöfer-Zentrum 1, 6060 Hall in Tirol,
Austria, E-Mail: [email protected].
2.1. Dataset
Statistical analysis was performed using non-parametric tests for repeated measures data
[7]. The p-value was obtained using an ANOVA-type statistic. In contrast to parametric approaches, rank-based methods allow the analysis of categorical or heavily skewed data in a systematic way [8].
We use a k-nearest neighbor (kNN) and a naive Bayes (NB) classifier. kNN is a popular non-parametric method that determines the class of a new instance based on the majority class of its k nearest neighbors. The NB model assumes independence between variables. This assumption is not valid in general and may also be violated in our dataset; however, NB is a popular classifier that performs well in many classification tasks [9]. The
performance of models is estimated using 10-fold cross validation summarized by micro-
average. In particular, the dataset is divided into ten partitions using nine parts for
training and the remaining subset for testing. This procedure is repeated ten times. For
every iteration the accuracy is calculated and finally summarized by calculating the mean.
However, in particular for imbalanced datasets, the accuracy is inappropriate. Consequently, to overcome this problem, we also calculate the parameter $\kappa = \frac{O - E}{1 - E}$, where $O$ is the observed and $E$ the expected accuracy.
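A minimal sketch of this evaluation measure; observed and expected accuracy are estimated from the true and predicted labels of a test fold:

```python
import numpy as np

def kappa(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    observed = np.mean(y_true == y_pred)                 # O: observed accuracy
    expected = sum(np.mean(y_true == c) * np.mean(y_pred == c)
                   for c in np.unique(y_true))           # E: chance agreement
    return (observed - expected) / (1 - expected)
```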
$RSS(c) = \sum_t (\hat{y}_t - y_t)^2$ (1), where $\hat{y}_t$ represents the value for each time point from step 1 and $y_t$ is the actual value.
(b) Determine for each metabolite $m \in M$ the class $c \in C$ by selecting the class with the smallest RSS:

$c_m = \operatorname{argmin}_{c \in C} RSS(c)$ (2)
(c) Select the final class using class predictions of the previous steps (class
selection step).
For the class selection step, we consider three methods:
• Majority voting: Select the majority class based on the class predictions for all metabolites. For instance, having 5 metabolites where four metabolites select class 1 and one metabolite selects class 2, we use class 1 (a minimal sketch follows after this list).
• Majority voting and feature selection: Use only a subset of the n_top ranked features for the voting step. The ranking is calculated using the area between the time
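A minimal sketch of steps (b) and (c) with majority voting, assuming the per-class RSS values from equation (1) have already been computed:

```python
import numpy as np

def classify(rss: np.ndarray) -> int:
    """rss[m, c]: residual sum of squares of metabolite m under class c."""
    per_metabolite = rss.argmin(axis=1)   # step (b): c_m = argmin_c RSS(c)
    votes = np.bincount(per_metabolite)   # step (c): count votes per class
    return int(votes.argmax())            # majority class
```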
Figure 1. Stacking approach using the class prediction of each metabolite. The color represents the class
prediction considering each metabolite (green = class 1, red = class 2).
3. Results
Figure 2 visualizes metabolites and corresponding FDR adjusted p-values for comparing
groups (i.e., average vs. competitive athletic activity) and time (i.e., varying workload).
Table 1 depicts the p-values for change of time, group differences and interaction of both
of these variables.
Figure 2. Scaled p-values comparing groups and time (workload). The y-axis is plotted in log scale. Features
above the horizontal blue line significantly change over time (p < 0.05).
Table 1. FDR adjusted P-values for change of time, group differences (average vs. competitive athletic) and
interaction of both variables (group:time) calculated using non-parametric tests for repeated measures data [7].
The majority of metabolites significantly change over time (i.e., P-value for time < 0.05).
Figure 3. Boxplots of accuracy (left) and kappa (right) values for predicting average vs. competitive athletic
using kNN (first row) and NB (second row) classifier.
In this work, we analyzed metabolic changes by considering change over time (i.e.,
varying Watt levels) and group differences. In summary, a total of 25 metabolites
changed significantly over time (p < 0.05). Similar to our previous work [5], the smallest
p-values were observed for lactate, alanine, acetylcarnitine (C2) and related short-chain
acylcarnitines (C3, C5). Interestingly, no significant changes were observed comparing
average vs. competitive athletic groups. The smallest p-values were observed for serine,
tryptophan, and threonine. However, considering time charts, we identified a clear trend
by observing systematically higher metabolite levels for all time points for these
metabolites. The missing significance levels may be a result of the relatively high
standard deviation due to the small sample size and heterogeneities within the groups
(e.g., different individual maximum Watt levels and individual anaerobic thresholds).
Considering these three metabolites biochemically, the carbon skeletons of serine,
threonine and tryptophan are used to form pyruvate that is used as fuel in the
mitochondria by conversion to acetyl CoA (TCA cycle), converted to lactate or utilized
to produce glucose in the liver [10].
Even though the statistical approach revealed no significant metabolites when
comparing the classes, we obtain accuracy values of 75%. The highest mean accuracy of
76.83% was obtained by using our stacking approach using NB as classifier. The standard
deviations of the resulting performance values were also comparably low. Interestingly,
the proposed feature selection step did not improve the performance. Our assumption is
that the proposed feature ranking method is very prone to noise.
The degree of 9 used for polynomial fitting was based on our previous work; however, this value can be further optimized to increase accuracy.
In summary, we introduced a new ensemble-based classification method for time
series metabolite data. For each metabolite, a class prediction is produced using
polynomial fitting. The predictions are summarized by using an induced classifier to
obtain a final classification of a new unlabeled sample. Note that this approach can be
also applied to proteomic or genomic datasets.
5. Acknowledgements
Michael Netzer was supported by the Tiroler Wissenschaftsfond. The authors thank Prof. Dr. Elske Ammenwerth for her comments, which improved the paper.
References
[1] A. Holzinger, Interactive machine learning for health informatics: when do we need the human-in-the
loop?, Brain Informatics 3(2) (2016), 119–131.
[2] D. Ravı, C. Wong, F. Deligianni, M. Berthelot, J. Andreu-Perez, B. Lo and G.-Z. Yang, Deep learning
for health informatics, IEEE journal of biomedical and health informatics 21(1) (2017), 4–21.
[3] T.H. McCoy, A.M. Pellegrini and R.H. Perlis, Assessment of Time-Series Machine Learning Methods
for Forecasting Hospital Discharge Volume, JAMA network open 1(7) (2018), 184087–184087.
[4] Y. Saeys, I. Inza and P. Larrañaga, A review of feature selection techniques in bioinformatics, Bioinformatics 23(19) (2007), 2507–2517.
[5] M. Breit, M. Netzer, K.M. Weinberger and C. Baumgartner, Modeling and Classification of Kinetic Patterns of Dynamic Metabolic Biomarkers in Physical Activity, PLoS Comput Biol 11(8) (2015), e1004454. doi:10.1371/journal.pcbi.1004454.
[6] T. Hastie, R. Tibshirani, G. Sherlock, M. Eisen, P. Brown and D. Botstein, Imputing missing data for
gene expression arrays, Stanford University Statistics Department Technical report, 1999.
[7] K. Noguchi, Y.R. Gel, E. Brunner and F. Konietschke, nparLD: An R Software Package for the
Nonparametric Analysis of Longitudinal Data in Factorial Experiments, Journal of Statistical Software
50(12) (2012), 1–23. http://www.jstatsoft.org/v50/i12/.
[8] F. Konietschke, A.C. Bathke, L.A. Hothorn and E. Brunner, Testing and estimation of purely
nonparametric effects in repeated measures designs, Computational Statistics & Data Analysis 54(8)
(2010), 1895–1905.
[9] J. Wolfson, S. Bandyopadhyay, M. Elidrisi, G. Vazquez-Benitez, D.M. Vock, D. Musgrove, G.
Adomavicius, P.E. Johnson and P.J. O’Connor, A Naive Bayes machine learning approach to risk
prediction using censored, time-to-event data, Statistics in medicine 34(21) (2015), 2941–2957.
[10] D.M. Medeiros, R.E. Wildman et al., Advanced human nutrition, Jones & Bartlett Publishers, 2013.
Achieving an Interoperable Data Format for Neurophysiology
S. Winkler et al.
doi:10.3233/978-1-61499-971-3-97
1. Introduction
Corresponding Author: Silvia Winkler, Sigma Software Solutions OG, Markhofgasse 1-9/3/338, 1030
Vienna, Austria, E-Mail: [email protected]
specifications: HL7 CDA is the standardized document format for medical reports in
ELGA and DICOM is the required data format and communication protocol for
medical images and image-related data like signal data or evidence documents. The decision of the European Commission 2015/1302 of 28 July 2015 [1], which declares 27 IHE profiles as ‘eligible in public procurement’, emphasizes the legitimacy of requiring interoperable data formats in future public tenders.
A prior condition for interoperable systems is a standardized data format. In the past, different organizations provided normative standards or noncommittal format specifications for neurophysiological time signals like EEG; a comprehensive comparison of many of them was done by A. Schloegl [2]. Until recently, none of the formats found broader acceptance among manufacturers of EEG devices. We still see the situation that manufacturers use proprietary formats – in most cases with restricted access to the specifications. Stead et al. [3] recently discussed the urgent need
to establish a common format and proposed the use of the MEF3 format.
This paper shows that the existing, well-established DICOM standard is able to
achieve interoperability for neurophysiology data like EEG and gives an overview of
the necessary extensions to the existing DICOM Waveforms specification.
2. Methods
Both ASTM E1467 and its successor ACNS TS1 [4], as well as HL7 V2.x [6], are evaluated later in this document. The two CEN standards mentioned in the DICOM waveform specification existed only as drafts released by the CEN Technical Committee 251 (WG 5 – Medical Device Communication in Integrated Healthcare). SCP-ECG reached broader usage especially in cardiology use cases; today it is part of the CEN/ISO/IEEE 11073-9x specifications. CEN VITAL and IEEE MIB became the core standards in CEN/ISO/IEEE 11073 [7], [8].
3. Results
The American Clinical Neurophysiology Society released the first version of this standard (ASTM E1467) in 1992 as a result of a joint undertaking of clinical, academic, and vendor interests. Its successor, ACNS TS1 [4], was released in 2008. Both versions are based on NCCLS LIS 5-A [5], which was one of the fundamentals of HL7 v2.x, too.
ACNS TS1 claims support not only for EEG data but also for a broad range of digital electro-physiologic waveform data in clinical and research environments, such as electromyograms (EMG), polysomnography (PSG) and evoked potentials (EP).
The standard is strictly ASCII based; in the earlier version even the signal data had to be stored as 7-bit ASCII values. The current version still supports ASCII encoded sample data and still recommends it for short recordings. In addition, sample data can be provided in numeric form as well, but this requires additional data files.
Annotations are supported and stored, like the waveform data itself, in result segments. For annotations the segment category ANA is used.
Besides the waveform data and their acquisition parameters, the specification contains in-depth structured data segments, well defined data types, comprehensive lists of allowed values and codes, and guidelines for message exchange. The message format is similar to HL7 V2.x messages.
ACNS TS1 is a comprehensive standard for medical waveform data, which
includes administrative and acquisition context. It is notable for its broad nomenclature,
which contains defined terms for almost every single parameter.
The standard defines different levels of implementation (Level I – Waveforms only; Level II – Waveform or Procedure Annotations or Both; Level III – Coded Information) and suggests a “Description of Implementation”, which a system should provide in order to declare details of its compliance.
Although the message format itself is very close to HL7 V2.x, which is used worldwide, there is no known implementation of this standard.
Health Level Seven (HL7) is an international organization which provides standards for
healthcare interoperability.
HL7 V2 [6] is a message based standard with focus on administrative tasks like
patient and order management, and communication of results like laboratory measures.
It is broadly used all over the world and facilitates communication between hospital
information systems and departments like radiology or laboratory. It is used for
pharmacy tasks as well as for billing and – last but not least – in electronic health
records. It is one of the base standards used in IHE integration profiles.
The European Data Format EDF [9] and its successor EDF+ [10] are open, non-proprietary format specifications for signal data with ASCII-encoded metadata. There is a large set of freely available tools to handle this format.
EDF+ is widely compatible with EDF. With the more recent version,
discontinuous recordings and annotations are supported. EDF and EDF+ are frequently
used for sleep data (PSG), EEG, ECG, and EMG.
A standard EDF file consists of a header record with metadata for the recording
followed by data records. The filename extension has to be .edf or .EDF. A data file
contains recordings which were acquired with the same technique and with common
amplifier settings. Different settings result in different files.
For metadata provided in the file header, only a limited character set (ASCII 32-126) is allowed. A space-separated CHAR array contains the patient data. A space-separated ASCII array contains information on the recording. Per-channel properties define the physical conditions of the recording. There is no information about the recording geometry, i.e. the electrode positions.
EDF and EDF+ only support 16-bit sample values in strict chronological order. Values have to be stored as two's complement, in little-endian byte order (low byte first). Recorded data are split into chunks with a maximum size of 61440 bytes.
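A minimal sketch of parsing such a file in Python; the header byte offsets follow the EDF specification [9], and the per-signal header fields are skipped for brevity:

```python
import numpy as np

def read_edf(path: str):
    with open(path, "rb") as f:
        header = f.read(256).decode("ascii")   # fixed-size ASCII header
        n_records = int(header[236:244])       # number of data records
        n_signals = int(header[252:256])       # number of signals (channels)
        f.read(256 * n_signals)                # per-signal header fields
        # Two's-complement 16-bit samples, little-endian (low byte first):
        samples = np.fromfile(f, dtype="<i2")
    return n_signals, n_records, samples
```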
The necessity to store sample values in 16-bit values leads to a size overhead if the acquisition only produces 8-bit values. On the other hand, sample values with more than 16 bits are not supported and can only be stored after downsizing.
EDF does not support annotations. EDF+ supports text annotations, events and stimuli. They are stored together with the signal data as an additional channel with a defined channel label. The sample data for that channel then contains characters instead of 2-byte integers.
EDF and EDF+ are without doubt the most popular formats for biomedical signal
data. They are supported by a wide range of EEG manufacturers and software vendors.
In many cases interfaces for importing or exporting EDF are available.
The General Data Format GDF [11] was defined with the aim of overcoming some of the limitations of EDF+, for example the missing support for more than 16-bit sampling width or the missing electrode positions. For this purpose, some elements of other standards were incorporated in the specification. An open source implementation (C++ and MATLAB) including tools for conversion and a library for reading and writing GDF 2.x is available at [12]. In 2015, GDF became an Austrian Standard [13].
GDF provides a comprehensive set of metadata in a fixed file header containing patient and recording related information and a variable header reserved for channel-specific parameters.
Signal data is stored in the data section of the .gdf file. Data is organized in records, each containing a defined number of samples: first block for the first channel, second block for the second channel, and so on. The sampling rate, number of samples and data type may differ for each channel. The signal values are stored as numeric data. The data type for each channel is defined in the variable header; the recommended data format is 32-bit integer, but there are 13 different data types defined, ranging from 8-bit integer up to 128-bit float.
Annotations are stored in a table of events after the data section. Its start address
within the .gdf file is calculable. Events have to follow a well-defined structure
containing type, position in samples, channel, duration, and time stamps. For different
types of events a code table is provided within the specification.
The General Data Format GDF overcomes some limitations of EDF+ like the missing electrode localization, which is done via coordinate positions (XYZ) whose origin is in the center of the head. GDF supports any kind of signal data and is not restricted to EEG. The software is used in some research projects (listed at [12]); there is no known commercial use.
Support for different character sets is granted; the character set used is contained in the DICOM message itself. Unicode and multi-byte character sets are supported, too. Usage of a defined nomenclature is mandatory for many of the properties of the acquisition system; the use of different code systems is supported.
DICOM has contained Waveform objects since 2001. Object definitions exist for audio data, different types of ECG data, and hemodynamic and respiratory signal data as well. Waveform acquisition can happen in the context of an image acquisition or without. DICOM allows handling both situations: waveforms can be stored together with an imaging context or on their own, as separate information objects.
DICOM Waveform objects are structured like the well-known DICOM image objects, following a well-defined information model – see Figure 1. The metadata provide the full clinical context, ranging from patient data to data acquisition parameters.
The signal data itself can be stored in different formats, depending on the signal type and on the physical parameters of data acquisition (e.g. the bit depth of the A/D converter). Defined terms for waveform sample types are listed in Table 1.
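For illustration, 16-bit signed samples (sample interpretation SS) of such an object could be accessed with the pydicom library roughly as follows; the file name is a placeholder:

```python
import numpy as np
import pydicom

ds = pydicom.dcmread("waveform.dcm")            # hypothetical file
group = ds.WaveformSequence[0]                  # first multiplex group
raw = np.frombuffer(group.WaveformData, dtype="<i2")
channels = raw.reshape(-1, group.NumberOfWaveformChannels)
print(group.SamplingFrequency, channels.shape)  # sampling rate, (samples, channels)
```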
DICOM waveforms also support annotations. They are stored together with the
waveform in one information object. The annotation information can be free text or a
coded item. In case of a coded item this can contain a numeric measurement or a coded
concept.
Furthermore, the well-known and broadly supported DICOM Structured Report objects could be used to store evaluation results and neurology reports.
4. Discussion
In spite of various attempts to find a common data format, neurophysiology still lacks interoperability. Although EDF is supported by many device manufacturers, it is not supported by any healthcare platform. Extending DICOM Waveforms to neurophysiology data would accomplish this and would bring additional advantages.
DICOM's main objective is interoperability. The standard is – besides HL7 – one of the major components of the IHE integration framework in the radiology and also in the cardiology domain.
To add support for a General EEG Waveform Storage SOP Class, analogous to the already existing General ECG Waveform Storage SOP Class, only some domain-specific adaptations would be necessary. Most of them are easy to achieve, like deviating value ranges for recording properties such as sampling frequency or scaling factors. The main effort results from identifying adequate nomenclatures for electrode positions, coded acquisition context information and coded annotations.
For EEG recordings, the anatomical positions of the electrodes are important information. Clinical routine EEG uses electrode positions on the surface of the skull according to the International 10-20 or 10-10 system [16]. ISO/IEEE 11073-10101 [15] provides standardized terms for these locations.
To achieve interoperability, DICOM supports annotations with coded content. For this purpose, ISO/IEEE 11073-10101 [15] provides a nomenclature and codes for neurology comprising measurement, device, and patient related events.
Moreover, DICOM provides mechanisms to store and to preserve the spatial or temporal relationship of DICOM instances. Synchronization of DICOM objects is possible even if they were acquired on different devices. For clinical use this could be of interest especially for EEG synchronized to video, fMRI, PET or SPECT.
Video recordings in particular are important in neurology, for example in epilepsy monitoring or sleep studies. As DICOM supports different video formats like MPEG2, H.264 or H.265, the videos can be stored as (separate) DICOM files. These DICOM instances can be synchronized to the waveform object – i.e. the EEG – using the mechanisms described above.
Compression remains an open issue because the DICOM standard currently does not include any time-series-specific compression algorithms. Integration of compression algorithms is possible in principle and can be done if the DICOM committee accepts their usage. Algorithms working for different types of signal data (ECG, EEG, MEG, pressure waveforms, etc.) would be preferred; patent protection could be an obstacle. As long as there are no waveform compression algorithms supported by the DICOM standard, waveforms can make use of the deflated transfer syntax (RFC 1951, i.e. the deflate algorithm), which is applied to the data set as a whole.
Another open issue is the missing support for (almost) real-time online submission of waveforms. DICOM waveform objects are designed and intended to be persisted. The format does not permit continuous writing due to the structure of the data. Even though the DICOM standard defines a communication protocol (called a Transfer Syntax) to stream image data, which is based on the JPEG 2000 image format, no such mechanism is defined for waveform objects. The need for such a communication mechanism is well known. Efforts will have to be made in the future by the standardization bodies to define streaming protocols for DICOM waveforms.
References
[1] Commission Decision (EU) 2015/1302 of 28 July 2015 on the identification of ‘Integrating the
Healthcare Enterprise’ profiles for referencing in public procurement (Text with EEA relevance),
https://eur-lex.europa.eu/eli/dec/2015/1302/oj, last access: 21.3.2019
[2] A. Schloegl, An overview on data formats for biomedical signals, in: Image Processing, Biosignal Processing, Modelling and Simulation, Biomechanics, World Congress on Medical Physics and Biomedical Engineering, Munich, Germany, 2009, 1557-1560.
[3] M. Stead, J.J. Halford, Proposal for a Standard Format for Neurophysiology Data Recording and
Exchange. Clin. Neurophysiol. 33(5) (2016), 403-413
[4] ASTM E1467: American Clinical Neurophysiology Society Technical Standard 1 (ACNS 1) Standard
for Transferring Digital Neurophysiological Data Between Independent Computer Systems, 2008
[5] NCCLS. Standard Specification for Transferring Clinical Observations Between Independent Computer
Systems. NCCLS document LIS5-A [ISBN 1-56238-493-7]. NCCLS, 940 West Valley Road, Suite
1400, Wayne, Pennsylvania 19087-1898 USA, 2003
[6] HL7 Messaging Standard Version 2.6, 2007
[7] ISO/IEEE 11073-10201 Health informatics – Point-of-care medical device communication – Part 10201:2004 Domain information model
[8] ISO/IEEE 11073-30200 Health informatics – Point-of-care medical device communication – Part 30200:2004 Transport profile – Cable connected
[9] B. Kemp, A. Värri, A.C. Rosa, K.D. Nielsen and J. Gade, A simple format for exchange of digitized
polygraphic recordings. Electroencephalogr. Clin. Neurophysiol. 82(5) (1992), 391-393.
http://www.edfplus.info/specs/edf.html, last access: 8.2.2019
[10] B. Kemp, J. Olivan, European data format 'plus' (EDF+), an EDF alike standard format for the
exchange of physiological data. Clin. Neurophysiol. 114(9) (2003), 1755-61.
http://www.edfplus.info/specs/edfplus.html, last access: 8.2.2019
[11] A. Schloegl, GDF - A General Data Format for Biosignals, https://arxiv.org/abs/cs/0608052, last
access: 8.2.2019
[12] A. Schloegl, The BioSig Project, http://biosig.sourceforge.net/index.html, last access: 8.2.2019
[13] Austrian Standards Institute ÖNORM K 2204 General data format for biomedical signals, 2015
[14] NEMA PS3 / ISO 12052, Digital Imaging and Communications in Medicine (DICOM®) Standard,
National Electrical Manufacturers Association, Rosslyn, VA, USA
[15] ISO/IEEE 11073 – 10101 Health informatics – Point-of-care medical device communication Part
10101:2004 Nomenclature
[16] Jasper, H.H., The ten–twenty electrode system of the International Federation. Electroencephalogr.
Clin. Neurophysiol. 10 (1958), 371–375
[17] Klem, G.H., Luders, H.O., Jasper, H.H., Elger, C., The ten–twenty electrode system of the International
Federation. The International Federation of Clinical Neurophysiology. Electroencephalogr. Clin.
Neurophysiol. Suppl.(52) (1999), 3–6
A Comprehensive FXR Signaling Atlas Derived from Pooled ChIP-seq Data
E. Jungwirth et al.
doi:10.3233/978-1-61499-971-3-105
1. Introduction
Transcription factors (TF) bind to distinct recognition sites on the DNA and thereby
regulate gene transcription. Chromatin immunoprecipitation sequencing (ChIP-seq) is a
method to identify genome-wide binding sites of a specific TF and to gain information
about transcriptional regulation, affected genes and pathways. Nuclear receptors (NRs)
are a class of TFs, which are directly activated/inactivated by agonistic/antagonistic
ligands. The NR farnesoid X receptor (FXR) is activated by bile acids, thereby
controlling gene regulation of different metabolic pathways mainly in the liver (e.g.
bile acid-, lipid- and glucose metabolism). FXR recently attracted attention as a novel drug target for various metabolic liver diseases. Therefore, understanding precise genomic FXR binding and transactivation of genes is important to fully reconstruct FXR signaling, particularly when FXR is used as a therapeutic drug target.
Corresponding Author: Emilian Jungwirth, Medical University of Graz, A-8010 Graz,
Stiftingtalstrasse 24, E-Mail: [email protected]
Several FXR ChIP-seq data sets for different species, conditions and cell lines
have been reported, none so far for human liver tissue. Our aim was to re-analyze these
publicly available data sets with a standardized method and combine these data sets for
further extended downstream analysis of FXR signaling properties. In addition, we
compared the available public data sets to our own human biopsy material.
2. Methods
In public repositories, FXR-ChIP-seq data sets were available for mouse, rat and a cell
line of primary human hepatocytes. We also had access to our own FXR-ChIP-seq data
set from human liver tissue (Table 1). Raw reads were available for all data sets except
“Mouse-Guo” and “Mouse-Osborne”. For the “Mouse-Osborne” data set only mapped
read tracks were available. In case of the “Mouse-Guo” data sets only the called peak
tracks were available.
Table 1. Available data sets for this study. Naming of the data sets is based on the species and the last author
of the paper where the data was first published.
We created our own ChIP-seq analysis pipeline (Fig 1). The quality of the data samples
is assessed at relevant steps of the analysis. Most of the data processing was performed
using a locally available Galaxy [10] instance. The analysis comprises three major
steps:
Raw read handling: Most of the data sets were single-end (SE) Illumina reads. Trimmomatic (version 0.36.5) [11] was used to trim and filter overrepresented sequences such as Illumina adapters. Additional parameters to the ILLUMINACLIP step were a SLIDINGWINDOW of 4 bases with an average quality of 28 and a minimum length of 80% of the raw read length to ensure a high read quality. FastQC [12] was used to confirm the quality.
Mapping and peaks calling: Filtered reads were mapped to the human genome
version hg19, mouse genome version mm10 and rat genome version rn6 using Bowtie
2 (version 2.3.4.2) [13, 14] with default parameters.
To determine putative FXR binding sites, model-based analysis of ChIP-seq version 2 (MACS2, version 2.1.1) [15, 16] was used. Various parameter combinations were used to evaluate their effects on the outcome and to determine the most reliable parameter combination. The parameters were: a q-value of 0.01 or 0.05, using an input, IgG or no control sample, having a fixed or estimated fragment length, and the two different standard effective genome sizes for human (2.45 and 2.7 Gbp).
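Schematically, such a parameter grid can be driven from Python as follows; the file names are placeholders, and the grid shown is an assumption derived from the parameters listed above (a fixed fragment length would additionally use --nomodel --extsize):

```python
import itertools
import subprocess

qvalues = [0.01, 0.05]
genome_sizes = ["2.45e9", "2.7e9"]   # human; a single value for mouse/rat
controls = [None, "input.bam", "igg.bam"]

for q, g, c in itertools.product(qvalues, genome_sizes, controls):
    cmd = ["macs2", "callpeak", "-t", "chip.bam", "-f", "BAM",
           "-g", g, "-q", str(q), "-n", f"q{q}_g{g}_{c or 'nocontrol'}"]
    if c is not None:
        cmd += ["-c", c]          # input or IgG control sample
    subprocess.run(cmd, check=True)
```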
Downstream analyses: For the top 500 scoring peaks a de novo motif analysis
was performed using Multiple Em for Motif Elicitation MEME SUITE (version
4.12.0.0) [17]. The sequences flanking the peak summit by 100bp on either side were
examined. Apart from the number of motifs, which was set to 10, the default parameters were used. Additionally, a motif scan for the canonical IR-1 FXR motif (AGGTCAxTGACCT) [18] was performed using the tool FIMO from the MEME SUITE.
The scan was performed for the HOMER FXR motif across the narrow peaks and
wider peak regions. The wider peak region was defined as 1000bp up- and downstream
from the peaks summit.
Peaks were annotated to UCSC knownGenes using the R package ChIP-Seeker
[19]. Each gene was defined as potentially regulated by FXR if a peak summit is located in the promoter (defined as +/- 1 kbp around the TSS), intron or exon region of that
gene. Genes were subjected to a REACTOME [20] pathways analysis; a q-value of less
than 0.05 was considered statistically significant.
Figure 1. ChIP-seq analysis pipeline: The three major steps of a ChIP-seq analysis are (i) Read quality
control (QC), (ii) Mapping and peak calling, and (iii) Downstream-analyses such as a motif- and a pathway-
analysis.
A combined mouse data set “Mouse-pooled” was generated by pooling the filtered and
mapped reads of 13 individual mouse samples from 4 different mouse data sets to gain
higher sequencing depth. By pooling the samples on the read level, a summation of the
individual FXR-signals is achieved. This summation of the FXR-signals allows the
detections of weaker FXR binding sites, which could not be detected in single data sets.
Because all data sets are from different laboratories only limited summation of noise is
expected to occur. This analytic procedure combined with the strict filtering of the raw
reads is expected to lead to a high quality virtually deep sequenced FXR ChIP-seq data
set.
Subsamples were created to further investigate the saturation of FXR-related peaks/genes. The subsamples were created by randomly selecting reads from the entire combined data set. The subsample sizes ranged from 1/20 to 2/3 of the entire pooled reads. For each subsample size, five distinct subsamples were created.
2.4. Comparison
The comparison between the data sets on a read and peak level was based on the
quality metrics proposed in ENCODE- and other authoritative ChIP-seq guidelines [1,
2] (Table 2).
Table 2. Metrics used to assess the quality of the ChIP-seq samples. NSC/RSC were calculated using the
phantompeakqualtools package version 2 [21, 22].
Quality metric Abbreviation
Ratio of uniquely mapped reads to total number of reads UMR/TNR
Ratio of uniquely mapped reads to total number of mapped reads UMR/TMR
Non-Redundant Fraction NRF
PCR Bottleneck Coefficient 1 PBC1
PCR Bottleneck Coefficient 2 PBC2
Normalized Strand Cross-correlation coefficient NSC
Relative Strand Cross-correlation coefficient RSC
Fraction of reads, which are in peak regions FRiP
Percentage of peaks with foldchange greater than 5 %fc>5
Percentage of peaks, which are in Dnase I HS sites % Dnase I HS
The similarity between the various peak calling results and the corresponding genes was determined using the Jaccard distance [23]. The pairwise Jaccard distances were visualized with a heatmap. It was necessary to map the genes to their orthologues in the other species to correctly estimate the similarity between different species. Mouse and rat genes were mapped to their corresponding human genes.
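A minimal sketch of the pairwise Jaccard distance on gene sets (after orthologue mapping):

```python
import numpy as np

def jaccard_distance(a: set, b: set) -> float:
    return 1 - len(a & b) / len(a | b)

def distance_matrix(gene_sets: list) -> np.ndarray:
    n = len(gene_sets)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d[i, j] = jaccard_distance(gene_sets[i], gene_sets[j])
    return d  # visualized as a heatmap (Fig 2)
```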
A dotplot was used to illustrate the enrichment of pathways across samples. Some samples did not show any enriched pathways under the defined settings. Additional pathway trees were created for each sample with enriched pathways, to investigate the branch and subtree differences between the samples.
3. Results
In public repositories, FXR ChIP-seq data sets from three different species are
available: five for mouse, one for rat and one for human primary hepatocytes. Most data
sets include baseline FXR binding and binding events under pharmacological treatment
(i.e. FXR activation with different ligands) or diseased conditions (i.e. diet-induced
non-alcoholic fatty liver disease, bile duct ligation induced cholestasis). No public data
sets are available for human liver tissue (Table 1). Our analysis shows that these data
sets are heterogeneous concerning baseline quality criteria (Table 3).
Table 3. Evaluation of ChIP-seq quality for the available data sets. The number of samples/analysis results
which pass the quality metric relative to the total number of samples/analysis results is presented. Peak
calling was performed with multiple parameter combinations; thereby the number of peak calling results is a
multiple of the number of samples.
Quality metric | Threshold value | Data sets: Rat-Stevens, Mouse-Guo, PHH-Guo, Mouse-Lefebvre, Mouse-Osborne, Human-Wagner, Mouse-Kemper, Mouse-Kersten, Mouse-pooled
UMR/TNR 50% - - 4/4 4/4 4/4 - 4/6 2/2 2/2
UMR/TMR 50% - - 4/4 4/4 4/4 - 6/6 2/2 2/2
NRF 50% - - 4/4 4/4 4/4 1/1 6/6 2/2 1/2
PBC1 50% - 1/1 4/4 4/4 2/4 1/1 6/6 2/2 1/2
PBC2 1.00 - 1/1 4/4 4/4 4/4 1/1 6/6 2/2 2/2
NSC 1.05 - - 0/4 0/4 4/4 - 6/6 2/2 2/2
RSC 0.8 - - 0/4 0/4 0/4 - 6/6 2/2 2/2
FRiP 1% - - 16/16 15/32 27/32 4/4 24/24 8/16 22/24
%fc>5 50% - 8/8 16/16 32/32 32/32 0/4 24/24 16/16 24/24
%DNase I HS 80% 1/2 8/8 0/16 0/32 0/32 0/4 - 2/16 5/24
When analyzed with the various analysis parameters in a standardized manner, the
number of called FXR peaks in the single data sets ranges from 103 to 40,080 and the
number of associated genes from 6 to 12,873. For the combined data set, the number of
called peaks ranged from 24,747 to 59,319 and the number of associated genes from
10,038 to 13,826 across the different parameter combinations. The called peaks/genes
of the combined data set represent more than the simple addition of the binding
sites/genes from the single data sets, which can be explained by the enhancement of
weak signals after virtually increasing the sequencing depth.
The comparison of the public data sets to our human data set revealed that the
quality of the human data set (although derived from surgical tissue) is in many regards
at least as good as that of the published data sets. The human data set passed the RSC
quality criterion, which is crucial for the correct estimation of the fragment length by
MACS2. The human data set also included an input and an IgG control sample, which
was critical for analyzing the impact of different control samples in ChIP-seq
experiments.
The most prevalent motif identified by the de novo search within the top 500 peaks
was the canonical FXR IR-1 motif (AGGTCAxTGACCT). It was present in 2 to 54%
of narrow peaks and in 20 to 64% of wider peak regions, depending on the data set.
The similarity between the samples was determined using the Jaccard distance
based on the identified genes. Samples of the same data set cluster together rather than
samples with the same condition/treatment from different data sets (Fig. 2). Based on
the quality criteria and the pairwise Jaccard similarities, the following parameter
combination was considered the most reliable:
q-value 0.05, no control sample, a fixed fragment length (if the estimated fragment
length was unrealistic) and, for the human samples, an effective genome size of
2.7 Gbp. Only peak calling results from these parameters were used for all further
analyses.
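For orientation, a peak calling run with these settings might be wrapped as follows; this Python sketch is not the authors' pipeline, the file and run names are placeholders, and the extension size of 200 bp merely stands in for the fixed fragment length mentioned above.

    import subprocess

    subprocess.run([
        "macs2", "callpeak",
        "-t", "human_fxr.bam",           # ChIP reads; no control sample is passed
        "-n", "human_fxr_q05",
        "-q", "0.05",                    # q-value threshold
        "--nomodel", "--extsize", "200", # fixed fragment length (placeholder value)
        "-g", "2.7e9",                   # effective genome size for human samples
    ], check=True)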
Figure 2. Heatmap based on the pairwise Jaccard distance. The samples are colored by data set. The samples
tend to cluster by data set rather than by sample condition.
Peaks were assigned to the closest annotated genes. Based on the assigned genes,
enriched REACTOME pathways were identified (Fig. 3). The combined analysis
revealed additional significant pathways that are not present in any of the single
mouse data sets. Some of these additional pathways are also present in samples of other
species. This demonstrates both a conservation of the FXR dependency of these
pathways across multiple species and the validity of the additional pathways identified
by the combined data set.
Figure 3. Top enriched REACTOME pathways represented in a dotplot. The dot color relates to the q-value
and the size to the pathway coverage (number of pathway genes found / total number of genes in the
pathway). For some samples no enriched pathways were found under the defined settings.
3.1. Insights into FXR binding events revealed by the combined data set
The combined mouse data set shows many additional peaks, genes and pathways which
were not present in any of the individual samples (e.g. the ‘Translocation of SLC2A4
(GLUT4) to the plasma membrane’ pathway is one of 33 pathways which are only
present in the combined mouse data set). Similarly, some peaks, genes and pathways
present in one or more individual mouse samples are not present in the combined data
set (e.g. the ‘Tspy-ps’ gene is not present in the combined mouse data set although it is
present in 8 of the individual mouse samples). This indicates that the signal for those
peaks is not conserved across all samples. This could be explained either by a signal
that is only present under very specific conditions, which were only met in a single
sample, or by incorrectly called peaks due to noise. Peaks are more prevalent in the
vicinity of transcription start sites (TSSs), which is expected for a TF ChIP-seq
experiment.
Interestingly, over 96% of the liver FXR ChIP-seq genes from the “Mouse-Guo”
data set are present in the combined data set, although “Mouse-Guo” was not included
in the pool because only its peak tracks were available. Furthermore, 70% of the
“Mouse-Guo” genes which are not present in any other single mouse sample are
present in the pooled data set. This indicates that pooling the FXR signal allowed the
detection of weaker signals. Although the combined data set revealed many new
potential FXR-related binding sites, saturation appears not to be reached, as
demonstrated by subsampling the combined data set (Fig. 4).
Figure 4. The number of reads with respect to the number of peaks (A) and the number of genes (B) for the
“Mouse-pooled” data set and its subsamples. The blue points represent the number of peaks/genes for the
entire “Mouse-pooled” data set or for its subsamples. A linear (black) and an exponential (red) curve were
fitted to the data points; the exponential curve represents a much better fit.
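The fit comparison behind Figure 4 can be reproduced along the following lines; the (reads, peaks) pairs below are hypothetical stand-ins, and the saturating-exponential form is an assumption about the red curve.

    import numpy as np
    from scipy.optimize import curve_fit

    reads = np.array([5e6, 10e6, 20e6, 33e6, 50e6, 66e6, 100e6])   # hypothetical
    peaks = np.array([8000, 15000, 26000, 35000, 44000, 50000, 59000])

    def linear(x, a, b):               # black curve
        return a * x + b

    def sat_exp(x, a, b):              # red curve, rising towards the plateau a
        return a * (1.0 - np.exp(-b * x))

    p_lin, _ = curve_fit(linear, reads, peaks)
    p_exp, _ = curve_fit(sat_exp, reads, peaks, p0=(7e4, 1e-8))

    def rss(f, p):                     # residual sum of squares to compare fits
        return float(np.sum((peaks - f(reads, *p)) ** 2))

    print(rss(linear, p_lin), rss(sat_exp, p_exp))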
4. Discussion
Several FXR ChIP-seq data sets are publicly available for various species and
conditions. Standard ENCODE quality criteria are usually not reported for these data
sets. We observe that the analysis results are sensitive to the settings of certain analysis
parameters, such as the effective genome size, and most prominently to the choice of
the control sample, which is underappreciated in most studies. A low-quality control
sample can have a significant impact on the peak calling results even if the ChIP-seq
sample itself is of good quality. Influences of control samples on peak calling results
have also been reported in other studies [24]. Therefore, an analysis without a control
sample should be considered. Interestingly, the human in vivo samples were more
similar to rodent in vivo samples than to in vitro human primary hepatocytes.
Individual data sets often exhibit too low a sequencing depth to identify weak/rare
binding sites; therefore, we combined all available mouse reads to create an “FXR
super signaling atlas” for a profound downstream analysis of FXR signaling capacities.
This data set allowed the detection of more binding sites, genes and connected
pathways. However, even the combined data set did not reach the theoretically
determined saturation.
References
[1] Landt SG, Marinov GK, Kundaje A, Kheradpour P, Pauli F, Batzoglou S, et al. ChIP-seq guidelines and
practices of the ENCODE and modENCODE consortia. Genome Res 2012;22(9), 1813-31.
[2] Shin H, Liu T, Duan X, Zhang Y, Liu XS. Computational methodology for ChIP-seq analysis.
Quantitative Biology 2013;1(1), 54-70.
[3] Thomas AM, Hart SN, Kong B, Fang J, Zhong X, Guo GL. Genome-wide tissue-specific farnesoid X
receptor binding in mouse liver and intestine. Hepatology 2010;51(4), 1410-9.
[4] Chong HK, Infante AM, Seo Y, Jeon T, Zhang Y, Edwards PA, et al. Genome-wide interrogation of
hepatic FXR reveals an asymmetric IR-1 motif and synergy with LRH-1. Nucleic Acids Res
2010;38(18), 6007-17.
[5] Lien F, Berthier A, Bouchaert E, Gheeraert C, Alexandre J, Porez G, et al. Metformin interferes with
bile acid homeostasis through AMPK-FXR crosstalk. J Clin Invest 2014;124(3), 1037-51.
[6] Ijssennagger N, Janssen AW, Milona A, Pittol JMR, Hollman DA, Mokry M, et al. Gene expression
profiling in human precision cut liver slices in response to the FXR agonist obeticholic acid. J Hepatol
2016;64(5), 1158-66.
[7] Lee J, Seok S, Yu P, Kim K, Smith Z, Rivas-Astroza M, et al. Genomic analysis of hepatic farnesoid
X receptor binding sites reveals altered binding in obesity and direct gene repression by farnesoid X
receptor in mice. Hepatology 2012;56(1), 108-17.
[8] Sutherland J, Webster Y, Willy J, Searfoss G, Goldstein K, Irizarry A, et al. Toxicogenomic module
associations with pathogenesis: A network-based approach to understanding drug toxicity. The
Pharmacogenomics Journal 2017;18(3), 377-90.
[9] Zhan L, Liu H, Fang Y, Kong B, He Y, Zhong X, et al. Genome-wide binding and transcriptome
analysis of human farnesoid X receptor in primary human hepatocytes. PloS One 2014;9(9), e105930.
[10] Afgan E, Baker D, Van den Beek M, Blankenberg D, Bouvier D, Čech M, et al. The Galaxy platform
for accessible, reproducible and collaborative biomedical analyses: 2016 update. Nucleic Acids Res
2016;44(W1), W3-W10.
[11] Bolger AM, Lohse M, Usadel B. Trimmomatic: A flexible trimmer for Illumina sequence data.
Bioinformatics 2014;30(15), 2114-20.
[12] Andrews S. FastQC A quality control tool for high throughput sequence data.
<http://www.bioinformatics.babraham.ac.uk/projects/fastqc/>. Accessed 2018 10/10.
[13] Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nature Methods 2012;9(4), 357.
[14] Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA
sequences to the human genome. Genome Biol 2009;10(3), R25.
[15] Feng J, Liu T, Qin B, Zhang Y, Liu XS. Identifying ChIP-seq enrichment using MACS. Nature
Protocols 2012;7(9), 1728.
[16] Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, et al. Model-based analysis of
ChIP-seq (MACS). Genome Biol 2008;9(9), R137.
[17] Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in
biopolymers. Proc Int Conf Intell Syst Mol Biol. 1994;2, 28-36.
[18] Laffitte BA, Kast HR, Nguyen CM, Zavacki AM, Moore DD, Edwards PA. Identification of the DNA
binding specificity and potential target genes for the farnesoid X-activated receptor. J Biol Chem
2000;275(14), 10638-47.
[19] Yu G, Wang L, He Q. ChIPseeker: An R/bioconductor package for ChIP peak annotation, comparison
and visualization. Bioinformatics 2015;31(14), 2382-3.
[20] Fabregat A, Jupe S, Matthews L, Sidiropoulos K, Gillespie M, Garapati P, et al. The reactome pathway
knowledgebase. Nucleic Acids Res 2017;46(D1), D649-55.
[21] Kundaje A, Jung LY, Kharchenko P, et al. Assessment of ChIP-seq data quality using cross-correlation
analysis. <http://code.google.com/p/phantompeakqualtools>. Accessed 2018 08/23.
[22] Kharchenko PV, Tolstorukov MY, Park PJ. Design and analysis of ChIP-seq experiments for DNA-
binding proteins. Nat Biotechnol 2008;26(12), 1351.
[23] Jaccard P. Lois de distribution florale dans la zone alpine. Bull Soc Vaudoise Sci Nat 1902;38, 69-130.
[24] Marinov GK, Kundaje A, Park PJ, Wold BJ. Large-scale quality analysis of published ChIP-seq data.
G3 (Bethesda) 2014;4(2), 209-23.
1. Introduction
1 Corresponding Author: Ekaterina Kutafina, RWTH Aachen University, Germany, E-Mail:
[email protected]
to record event related potentials (ERP) and applied statistical analyses to quantify the
signal quality.
While ERPs are used in many medical contexts, others require the recording of
resting state EEG. For example, in epilepsy diagnostics, resting state EEG is recorded
and analyzed for the presence of epileptiform abnormalities. Therefore, mobile EEG
devices should also be validated for resting state EEG, which limits the usage of
statistical methods such as averaging. In this case, methods such as cross-correlation or
cross-spectrum analysis are applied to compare the performance of EEG devices [4].
These methods often fail to adequately reflect the matching of two EEG sequences due
to their sensitivity to strong disturbances like eye blinking, which are disregarded as
artifacts during a clinical interpretation of the EEG signal. While some of these
problems can be solved through artifact removal, it is highly desirable to compare core
features of the signals. Until now, little work has been reported on EEG data
comparison through mathematical modelling. One such approach was proposed by
Mikkelsen et al. [5]. The authors built linear models to investigate the mutual
information between scalp and ear EEG devices.
We therefore propose an approach that is based on labeling the EEG signal's
samples through clustering and comparing the label sequences. Specifically, we fit an
autoregressive (AR) model to the EEG signal using a Kalman filter to handle the low
signal-to-noise ratio and non-stationarity. The non-stationarity results in a sequence of
AR models for each EEG time series. Subsequently, the computed AR coefficients are
clustered with a Gaussian mixture model, and a distance between the cluster label
sequences of two simultaneously recorded EEG time series is defined, yielding a
“matching score”. We argue that such a score offers a promising alternative to the
currently used signal-based approaches and is characterised by resistance to noise and
artifacts.
2. Methods
Our approach is based on labeling the EEG signal's samples through clustering and
comparing the label sequences. The aim of labeling the signal's samples is to express
the matching quality in terms of the matching of label sequences, which removes any
interdependencies between the scoring and the model used for labeling. One promising
approach for this was presented by Penny et al. [6]. The authors fit mixtures of
Gaussians to the hidden state vectors of stationary phases in the signal in order to
distinguish whether the test person had performed a task with the left or the right hand.
In our approach, the hidden state vectors are the coefficients of an autoregressive
model, which is adapted using a Kalman filter. Arnold et al. [7] investigated the
properties of several physiological signals (including EEG) by adaptive filtering. They
showed that this approach is well suited for such signals due to its fast convergence to
new system states.
One important property of the EEG is the piecewise stationarity of its signals.
Stationarity is given when the mean and variance of a signal do not change over time.
In general, this condition does not hold for long EEG recordings, but for short periods
of time stationarity can be assumed [8]. If two EEG devices take simultaneous
measurements on the same patient and with similar electrode positions, these stationary
phases should be identical. For this purpose, we first try to identify stationary phases in
the EEG signal. Afterwards we compare the two EEG recordings regarding the matching
of these stationary phases, which gives a quantitative measure.
The stationary phases are identified via the coefficients of an autoregressive (AR)
model of the EEG signal. The piecewise stationarity of the signal implies that the
coefficients are time-variant in general, but constant during a stationary phase.
The adaptation of the AR model's coefficients is done with a Kalman filter, which
results in a time series of coefficient sets. These sets can be clustered by training a
Gaussian mixture model (GMM). Applying the GMM to a time series of coefficient sets
yields a sequence of labels for the stationary phases, which is used for the comparison of
the EEG signals. Since the GMM can be trained on the coefficient sets of either EEG
recording, we repeat the process vice versa and take the average of both comparisons.
Eq. 1 shows how the standard AR(p) model estimates the next value $\hat{y}_t$ of a
sequence from a linear combination of the $p$ past values $y_{t-i}$ and additive white
Gaussian noise $\varepsilon_t$. The parameters of an AR model are the stochastic
properties of the noise and the weighting factors $a_i$ of the linear combination.

$\hat{y}_t = \sum_{i=1}^{p} a_i \, y_{t-i} + \varepsilon_t$ (1)

With the state $X_t = (a_1, \dots, a_p)^T$ and the observation vector
$H_t = (y_{t-1}, \dots, y_{t-p})^T$, the coefficients can be tracked with a Kalman filter
(KF): the prediction step carries the previous state estimate over and increases its
uncertainty $P_t^-$ by the process noise covariance $Q_t$. The Kalman gain $K_t$
contains the ratio between the measurement uncertainty $R$ and $P_t^-$, which prevents
noisy measurements from having an excessive influence on the state estimation.
Furthermore, it contains the inverse of $H_t$, which transforms the output error $e_t$
into an error in the state estimate, so that it can be used for correction. The equations
for the correction step are shown in Eqs. 4-6.
$e_t = y_t - \hat{y}_t = y_t - H_t^T X_t^-$ (4)

$K_t = \dfrac{P_t^- H_t}{H_t^T P_t^- H_t + R}$ (5)

$X_t = X_t^- + K_t e_t$ (6)

The final operation of the correction step is shown in Eq. 7, which gives the
estimated uncertainty $P_t$. Compared to the predicted uncertainty of the state
estimate, $P_t^-$, this is lower due to the inclusion of the information from the
measurement.

$P_t = P_t^- - K_t H_t^T P_t^-$ (7)
After initialization of $X_0$, $P_0$, $Q_0$ and $R$, this filter will, however, not
accurately track the hidden states over time. In the event of sudden changes of the
signal's characteristics, which occur at transitions between two stationary phases, a KF
with constant prediction uncertainty $Q$ takes too long to adapt the hidden state. This
can be remedied by using the Jazwinski algorithm [9] to implement a time-variant
covariance matrix $Q_t$. This algorithm uses a prediction and update procedure similar
to the steps of the KF algorithm. The prediction step in Eq. 9 uses the measurement
variance $\sigma_t^2$ (Eq. 8), which is the denominator of the Kalman gain. Eq. 10
shows how the smoothing parameter $\alpha$ is used to update $Q_t$. Contrary to the
Kalman gain, $\alpha$ is a constant hyperparameter.

$\sigma_t^2 = H_t^T P_t^- H_t + R$ (8)

$q_t = \max\left\{0, \dfrac{e_t^2 - \sigma_t^2}{H_t^T H_t}\right\}$ (9)

$Q_t = \alpha Q_{t-1} + (1 - \alpha) q_t$ (10)
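A minimal Python sketch of one filter step, combining Eqs. 4-11, is given below. The random-walk prediction (state carried over unchanged, with a scalar process noise q inflating all coefficient uncertainties equally) is an assumption of this illustration, not a detail stated by the authors.

    import numpy as np

    def kalman_ar_step(x, P, q, R, h, y, alpha):
        # x: AR coefficients, P: their covariance, q: adaptive process noise,
        # R: measurement noise, h: regressor (y_{t-1}, ..., y_{t-p}), y: new sample.
        P = P + q * np.eye(len(x))                 # prediction (random-walk state)
        e = y - h @ x                              # Eq. 4: output error
        sigma2 = h @ P @ h + R                     # Eq. 8: measurement variance
        K = (P @ h) / sigma2                       # Eq. 5: Kalman gain
        x = x + K * e                              # Eq. 6: state correction
        P = P - np.outer(K, h) @ P                 # Eq. 7: corrected uncertainty
        q_inst = max(0.0, (e**2 - sigma2) / (h @ h))    # Eq. 9
        q = alpha * q + (1.0 - alpha) * q_inst          # Eq. 10: Jazwinski update
        evidence = np.exp(-0.5 * e**2 / sigma2) / np.sqrt(2 * np.pi * sigma2)  # Eq. 11
        return x, P, q, evidence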
We train a Gaussian mixture model (GMM) on the sequence of hidden states $X_t$ in
order to find clusters. The clusters represent the piecewise-constant AR coefficients
during the stationary phases, which we refer to as prototype states in the following.
The state sequence is centered and normalized with the $L_2$-norm per vector
component to account for the scale sensitivity of the GMM. Moreover, we use
individual diagonal covariance matrices for each mixture component.
During transitions between two stationary phases, the hidden state fluctuates in a
way that it might not match any of the prototype states. Including these transition
phases in the training of the GMM might therefore result in a biased model. Because of
this, the training is only done with states for which the evidence is high. The evidence
$p(y_t \mid \theta_t)$ describes the probability of observing the measurement $y_t$
given the current state of the KF, $\theta_t$. It is modeled according to Eq. 11 as a
conditional Gaussian probability density.

$p(y_t \mid \theta_t) = \mathcal{N}(y_t \mid \hat{y}_t, \sigma_t^2)$ (11)
During state transitions, in which the KF has to adapt, its current state does not
correspond well to the measured EEG signal. Therefore, the evidence decreases
significantly between two stationary phases. We use the mean evidence as the
threshold below which hidden states are excluded from the training of the GMM.
Let $L(X_t, \varphi)$ be the function that maps a hidden state $X_t$ to the most
probable prototype state of a trained GMM with model parameters $\varphi$. With this,
the state sequences can be transformed into sequences of prototype state labels.
In order to reduce fluctuations of the comparison results, the label sequences of the
EEG signals are segmented into non-overlapping chunks. The unilateral matching score
for each chunk is the number of samples for which both EEG signals have the same
label, normalized by the chunk size.
Since a GMM can be trained on the hidden state sequences of both EEG signals, the
matching score that is used to assess the similarity of the signals is computed as the
average of both unilateral matching scores.
If a single value is required for the comparison, the average of all segments’
matching scores is taken.
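A possible implementation of this scoring, given the label sequences produced by the two trained GMMs, is sketched below in Python; the function names and the argument layout are illustrative, and the chunk length of 1000 samples anticipates the segment size used later for the real recordings.

    import numpy as np

    def unilateral_scores(labels_a, labels_b, chunk=1000):
        # Per non-overlapping chunk: fraction of samples with identical labels.
        a, b = np.asarray(labels_a), np.asarray(labels_b)
        n = (min(a.size, b.size) // chunk) * chunk
        return (a[:n] == b[:n]).reshape(-1, chunk).mean(axis=1)

    def matching_score(pair_gmm_a, pair_gmm_b, chunk=1000):
        # pair_gmm_a: label sequences of both signals under the GMM trained on A;
        # pair_gmm_b: the same under the GMM trained on B. Average both directions.
        s_a = unilateral_scores(*pair_gmm_a, chunk=chunk).mean()
        s_b = unilateral_scores(*pair_gmm_b, chunk=chunk).mean()
        return 0.5 * (s_a + s_b)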
3. Results
3.1. Simulation
To verify our method, we generate a test set of piecewise-stationary signals with a
known hidden state sequence. Additionally, we add fixed-length Gaussian windows
with varying amplitude at random points to simulate disturbances of the signal. We use
a weighted sum of sine functions with different frequencies plus white noise; in order
to generate piecewise-stationary phases, the weighting coefficients of the individual
summands are switched. Each generated pair of signals consists of three state phases,
while either 0%, 33%, 66% or 100% of the phase sequences match. The white noise
variance was set to $\sigma^2 = 0.1$, which corresponds to 10% of the maximal
amplitude. For initialization of the model we first fit an AR(5) model, for which the
choice $p = 5$ was determined heuristically. The estimated coefficients are then used
as the initial state $X_0$, and we assume a low initial covariance $P_0 = 1$, while the
measurement noise covariance $R$ is set to the root-mean-square error (RMSE) of the
AR model's prediction. This reflects the assumption of an underlying AR process and
explains the error of the fitted AR model as measurement noise. Moreover, we assume
a slight smoothing in the Jazwinski algorithm of $\alpha = 0.05$ for stability.
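The following Python sketch generates one such piecewise-stationary signal; the phase length, the 128 Hz sampling rate and the concrete weight and frequency vectors are illustrative assumptions, and the Gaussian-window disturbances would be added on top in the same fashion.

    import numpy as np

    def piecewise_stationary_signal(weight_seq, freqs, phase_len=512,
                                    fs=128.0, noise_var=0.1, seed=0):
        # weight_seq: one weight vector per stationary phase; switching the
        # weights of the sine summands creates the stationary phases.
        rng = np.random.default_rng(seed)
        t = np.arange(phase_len * len(weight_seq)) / fs
        x = rng.normal(0.0, np.sqrt(noise_var), size=t.size)
        for i, weights in enumerate(weight_seq):
            sl = slice(i * phase_len, (i + 1) * phase_len)
            for amp, f in zip(weights, freqs):
                x[sl] += amp * np.sin(2.0 * np.pi * f * t[sl])
        return x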
The results are displayed in Fig. 1, which illustrates that our approach reliably gives
results closer to the true matching than correlation does. The average deviation of the
matching score from the ground truth is 0.131, while the correlation coefficient deviates
by 0.225 on average. Within groups of pairs with an identical true matching, the
matching score has an average deviation of 0.114, 0.102, 0.117 and 0.190 from the
ground truths of 1.00, 0.66, 0.33 and 0.00, respectively. The corresponding deviations of
the correlation coefficient from the true matching are 0.414, 0.341, 0.137 and 0.007.
Figure 1. Comparison of the true matching of artificial signals with the results of the proposed method and the
correlation coefficient.
In order to evaluate the proposed approach on real EEG data, we analyzed two
simultaneously recorded multi-channel EEG signals. One has 14 channels and was
recorded by a consumer-grade mobile device (mEEG); the other has 21 channels and
was recorded by a clinical device (cEEG), both in the 10-20 system. The differences in
electrode numbers, electrical characteristics and reference systems required initial
preprocessing of the data to obtain 28 pairs of EEG signals with a 128 Hz sampling
rate. Each pair contains one signal from the mEEG and one from the cEEG, recorded
simultaneously with spatially close electrodes. The preprocessing included a
normalization to avoid large scale differences. Despite this processing, the data is
referred to as raw in the remainder of the text, in order to distinguish it from the clean
data, in which outliers and artifacts were removed with the intention of computing more
meaningful correlation coefficients. As a reference, the Pearson correlation coefficient
was computed for both the raw and the clean data, while the proposed approach was
evaluated on the raw data only.
For the model we follow the same process as before, fitting an AR model and using
its coefficients as the initial state $X_0$. We use $p = 4$ for the AR model and take the
RMSE of its prediction as the initialization for $R$. We assume a higher initial
covariance of $P_0 = 1$ and a smoothing in the state noise covariance adaptation of
$\alpha = 0.05$. Due to the length of the EEG recordings, we split the estimated state
sequences into segments of 1000 samples for the cluster analysis. In order to determine
the number of Gaussian components, we use the estimate that EEG signals are
stationary for about half a second [10], which amounts to about 16 possible stationary
phases per segment. However, not all of these phases correspond to a distinct prototype
state, and the length of a stationary phase is variable. We therefore tested values of
three to nine components, where a choice of three resulted in the best generalization
performance. The results of our analysis for the 28 pairs of EEG recordings are
presented in Fig. 2, which includes the correlation coefficients for comparison. Each
pair was recorded simultaneously with spatially close combinations of electrodes, so a
high degree of similarity is expected.
Figure 2. Comparison of the proposed approach with the correlation coefficients for real EEG recordings.
4. Discussion
On the simulated data, the matching score of the proposed method is generally closer
to the ground truth than the correlation coefficient. The average deviation of the
matching score from the true matching is almost 42% lower than that of the correlation
coefficient.
On real EEG data, the proposed method's matching score is much more consistent
than the correlation coefficient. Fig. 2 shows that it has a lower variance and that there
are no cases with an exceptionally good or bad matching score. By contrast, the
correlation coefficient is unexpectedly low for certain pairs, such as 9 and 17. The fact
that this holds for both the raw and the clean data shows that the low correlation cannot
be explained by artifacts and outliers. Additionally, the correlation coefficients vary
strongly between different pairs, even when comparing only the clean data.
However, an interesting observation can be made when grouping the simulated data
by the true matching degree. For higher values of the ground truth matching, the
proposed method yields a lower matching score than expected, while the matching score
for pairs with little to no relationship in the data consistently exceeds the true matching.
The unexpectedly high matching score for unrelated pairs of signals can be
explained by the fact that a GMM always assigns a cluster label, even if the evidence
of that sample's hidden state estimate is low. This can cause random matches in the
label sequence even for completely mismatching sections and prevents matching scores
of zero for those segments. A similar behavior can be observed with the real EEG data,
in which pairs of signals from spatially remote electrodes were assigned a matching
score that was, on average, only slightly worse than that of highly related signal pairs.
Checking the overall posterior probabilities of all mixture components may help to
improve this behavior.
With perfectly matching pairs, the proposed method does not produce a matching
score of one, because the state label fluctuates even when no significant changes in the
hidden state sequence occur. In this case the hidden state is very close to a decision
boundary of a mixture component. In our model we assume independence between
stationary phases. However, for EEG signals there is a higher probability of staying in
one stationary phase than of transitioning to another [10]. Since the combination of a
KF and a GMM does not allow this aspect to be modeled explicitly, an improvement
might be achieved by using a first-order hidden Markov model.
Another point to mention is the selection of an optimal number of mixture
components for the GMM. The actual number of stationary phases varies between
different EEG recordings. A sub-optimal selection of this parameter will directly affect
the matching score, as the GMM might over- or underfit. For the EEG dataset, our
selection of the parameter was based on domain knowledge, and the performance was
tested on a small subset of the data. The best parameter was then used for all pairs of
EEG signals. However, the optimal choice for this parameter may vary strongly across
the different pairs, which would cause a large scoring error.
5. Conclusion
The problem at hand was to develop a method for comparing two simultaneous EEG
recordings from different devices. As standard techniques for this task, such as
correlation, suffer from typical characteristics of the EEG signal, we proposed an
approach that focuses on statistical methods. Instead of comparing single values of the
time series, we interpret the EEG sequence as a stochastic process with a time-variant
hidden state. The development of the hidden state serves as a proxy and allows for a
more accurate comparison. The proposed method was successfully tested on simulated
and real-world data with promising results. The method is robust against outliers and
generates more consistent matching scores for simultaneous recordings than the
correlation coefficient.
Some future technical improvements, possibly leading to an even more robust
model, were proposed. However, a test set of EEG signals with known matching levels
would be necessary in order to perform a proper comparison of different methods. Since
the definition of matching strongly depends on the specific application, the design of
such a data collection should be performed together with domain experts.
References
[1] B. Farnsworth, EEG headset prices: An overview of 15+ EEG devices, July 2017.
[2] N.A. Badcock, P. Mousikou, Y. Mahajan, P. de Lissa, J. Thie and G. McArthur, Validation of the
Emotiv EPOC EEG gaming system for measuring research quality auditory ERPs, PeerJ, February 2013.
[3] A. Melnik, P. Legkov, K. Izdebski, S.M. Kärcher, W.D. Hairston, D.P. Ferris and P. König, Systems,
Subjects, Sessions: To What Extent Do These Factors Influence EEG Data?, Frontiers in Human
Neuroscience, March 2017.
[4] M. Lopez-Gordo, D. Sanchez-Morillo and F. Valle, Dry EEG Electrodes, Sensors, July 2014.
[5] K.B. Mikkelsen, P. Kidmose and L.K. Hansen, On the Keyhole Hypothesis: High Mutual Information
between Ear and Scalp EEG, Frontiers in Human Neuroscience, June 2017.
[6] W.D. Penny and S.J. Roberts, Dynamic models for nonstationary signal segmentation, Computers and
Biomedical Research, 1999.
[7] M. Arnold, X.H.R. Milner, H. Witte, R. Bauer and C. Braun, Adaptive AR modeling of nonstationary
time series by means of Kalman filtering, IEEE Transactions on Biomedical Engineering, 1998.
[8] S. Sanei and J.A. Chambers, EEG signal processing, Wiley, 2007.
[9] A.H. Jazwinski, Adaptive filtering, Automatica, 1969.
[10] P.L. Nunez and S.J. Williamson, Neocortical dynamics and human EEG rhythms, Physics Today,
January 1996.
1. Introduction
1 Corresponding Author: Saeid Eslami, Mashhad University of Medical Sciences, Mashhad, Iran,
E-Mail: [email protected]
2. Methods
IFDA called upon the following bodies to introduce expert(s): 1. MOH, 2. Medical
Council Organization, 3. IFDA Rational Drug Use Committee, 4. Electronic
Prescribing Pilot Project Executives, 5. Main Health Insurance Companies, and 6.
Medical Informatics Experts (MIE) from Mashhad University of Medical Sciences.
In accordance with the guidelines of the workgroup, the Medical Informatics Experts
were asked to find previous research, analyze its results, summarize the most
important findings, and report the most significant results to the workgroup members.
We used the Arksey and O'Malley methodology [9] for scoping reviews, PRISMA-P
[10] for systematic reviews, and the HTA Core Model [11] for the health technology
assessment study [12-13].
MIE collected evidence from peer-reviewed scholarly journal publications by
searching major electronic databases (Medline/PubMed, Embase, Scopus and Google
Scholar). A comprehensive gray literature search was conducted to find further
national reports, recommendations, standards, and implementation guides.
The National Committee on Vital and Health Statistics (NCVHS)² of the United States
of America published a report with recommendations for electronic prescribing in 2005
[6]. To prepare that report, past evidence was gathered and reviewed, and the gaps in
knowledge in this field were identified. After reviewing the
2 The NCVHS serves as the statutory [42 U.S.C. 242k(k)] public advisory body to the Secretary of
Health and Human Services (HHS) for health data, statistics, privacy, and national health information policy
and the Health Insurance Portability and Accountability Act (HIPAA). Website: https://ncvhs.hhs.gov/
evidence, MIE suggested using this report as the cornerstone, so we focused on
evidence published after 2005; evidence was reviewed up to September 2016. The
second milestone was the epSOS Project³, an infrastructure for the cross-border
exchange of patient summaries and e-prescriptions in Europe. A comparative review of
electronic prescription systems in five countries (Denmark, Finland, Sweden, England,
and the United States of America) [7] was also used.
After five meetings, by comparing and contrasting the national and international
evidence, the recommendations were finalized in expert panels.
3. Results
The following recommendations are the result of the collaboration of the multi-
stakeholder workgroup; they constitute the actions needed in our proposed roadmap.
1. E-prescribing standards should be comprehensive and suitable for all
physicians and pharmacists, and they should provide information for
insurance companies.
2. The standards should be compatible with other MOH health information
interchange (HII) standards.
3. Information security and confidentiality should be guaranteed.
4. Backward compatibility of the standards should be considered.
5. E-prescribing implementations should support the national formulary.
National formulary data should be available via a web service.
6. Basic e-prescribing functionality should be implemented, including:
a. Creating a new prescription
b. Canceling a prescription
3 epSOS is an eHealth (electronic health) interoperability project funded by the European Commission.
It aims at improving the medical treatment of citizens while abroad by providing Healthcare Professionals
(HCP) with the necessary electronic and safe patient data. This initiative broke new ground and generated a lot of
interest in Europe: "When the project was initiated in 2008 it involved a few stakeholders, but it gradually
grew to encompass 25 countries and about 50 beneficiaries", project coordinator Fredrik Lindén (Sweden)
and his team write in their letter
(http://epsos.eu/fileadmin/content/pdf/deliverables/epSOS_letter_to_contributors_1July2014.pdf).
"The epSOS project achieved considerable results in a range of areas. Main technical deliverables
include development of a solid basis for the eprescription and patient summary services, considering:
governance, use cases, data content, semantics, specifications, architecture, testing mechanisms, etc.".
website: http://www.epsos.eu/
c. Refilling a prescription
d. Revising a prescription according to pharmacist consultation
e. Patient medication history should be accessible to the prescription
provider.
f. Supporting prior authorization (prior authorization is done by
pharmacies in Iran)
g. Providing medication delivery feedback to the physician
7. To guarantee the security standards, the following infrastructures are
mandatory: a secure health information interchange network, digital
signatures, and a PKI service.
8. Support of the health ID card should be considered.
9. Prescription delivery should be possible from a pharmacy which is not
connected to the e-prescribing network (e.g. by health card or printed
prescription).
10. The clinical workflow must be supported during network instability.
11. E-prescribing should support the clinical workflow in offices and pharmacies.
12. The e-prescribing implementation should support claims data.
13. Information processes and data analyses should be planned from the first
step.
14. A mapping between different coding standards should be provided.
15. Medication availability in the country or in the respective pharmacy should
be checked while prescribing.
16. Clear regulations should be in place for the delivery of alternative
medications by pharmacists.
17. The standard format of the prescription should be observed.
18. Evidence shows that decision support systems reduce medical errors;
therefore, it is recommended that the e-prescribing system be equipped with
decision support. The following DSSs for e-prescribing are recommended:
a. Access to clinical guidelines,
b. Notification of drug allergies,
c. Drug dose calculation,
d. Order set recommendation,
e. Providing feedback based on national average drug use,
f. Suggestion of cheaper alternative drugs
19. The knowledge base used in the decision support system should be
supervised and guaranteed to be kept up to date.
20. Patient, provider and pharmacist identification standards should be
implemented across the country.
21. E-prescribing should support the care of non-citizens; in this case, the
passport number can be used for patient identification.
22. Identification codes for offices, pharmacies, hospitals, clinics (health care
centers) and insurance plans should be provided.
23. A licensing process for e-prescribing solutions should be implemented.
24. Incentives should be considered for pharmacies and physicians using
e-prescribing.
25. E-prescribing should be integrated with EHR systems.
26. A free text field should be available for special cases.
Figure 2. Proposed roadmap chart, grouping the actions into seven focus areas:
- Functionality Focus: 6a. New prescription; 6b. Cancel prescription; 6c. Refilling prescription; 6d. Revising per pharmacist consultation; 6e. Patient history
- Infrastructure Focus: 7. Secure HII network; 7. Digital signature; 7. PKI service
- Implementation Focus: 8. Support health ID card; 9. Prescription delivery; 10. Clinical workflow
- Observation Focus: 14. Code mapping; 15. Medication availability
- DSS Focus: 18. DSSs for e-prescribing; 19. Supervised knowledge base
- Identification Focus: 20. Identification standards; 20. Administrative identification; 21. Non-citizen patients; 22. Identification codes
- Others: 23. Process of license issue; 24. Incentive considerations; 25. Integration with EHR; 26. Free text field; 27. Patient preference; 28. Alternative plan
According to the 7th step of the method (see Figure 1), these actions were
summarized and represented in the form of a roadmap chart. Our results are categorized
into 7 different focus areas. The transitions between the actions reveal a time-based
dependency or priority between some of the actions. Figure 2 shows the proposed
roadmap chart.
4. Discussion
E-prescribing has been implemented in developed countries such as Sweden since the
1980s [9]. Over the years, the reasons for the success and failure of e-prescribing have
been investigated. Although systematic reviews and meta-analyses have shown that
e-prescribing implementation can reduce medical errors and save costs, the context of
the national health model influences the development and adoption of electronic
prescribing.
Although most of the evidence is transferable and lessons should be learned from
the experiences of other countries, the recommendations from one country should not
be used in another country without customization.
Few national-level roadmaps for the digitalization of health care have been
published [18]. Most of them are based on expert meetings, whereas we collected
evidence through scoping and systematic reviews, interviews with a semi-structured
questionnaire, and a health technology assessment study to support the expert panel.
This method led us to some specific recommendations. Because of the considerable
number of tourists and immigrants in Iran who do not have a health ID, we
recommended using the passport number for the identification process. We noticed that
herbal and traditional medicines are important in Iran, so we recommended that
e-prescribing systems support them and that a free text field be available for special
cases. Catastrophic disasters have occurred in Iran; therefore, we recommended having
an alternative plan for a crisis. People speaking different languages live in Iran;
therefore, we recommended that e-prescribing systems support multilingual drug
orders.
We publish our method and results hoping that our experience will be useful for
other countries. We also hope to receive feedback from scholars in order to update the
recommendations. We plan to publish supplementary studies and explanations of the
recommendation items as soon as possible.
References
[1] Nguyen MR, Mosel C, Grzeskowiak LE. Interventions to reduce medication errors in neonatal care: a
systematic review. Therapeutic advances in drug safety. 2018;9(2):123-55.
[2] Deetjen U, European E-Prescriptions : Benefits and Success Factors 2016,
https://www.politics.ox.ac.uk/materials/publications/15224/workingpaperno5ulrikedeetjen.pdf, last
access: 20.3.2019.
[3] Page N, Baysari MT, Westbrook JI. A systematic review of the effectiveness of interruptive medication
prescribing alerts in hospital CPOE systems to change prescriber behavior and improve patient safety.
International journal of medical informatics. 2017;105:22-30.
[4] Stojkovic T, Marinkovic V, Manser T. Using Prospective Risk Analysis Tools to Improve Safety in
Pharmacy Settings: A Systematic Review and Critical Appraisal. Journal of patient safety. 2017.
[5] Hermanowski, T. R., Kowalczyk, M., Szafraniec-Burylo, S. I., Krancberg, A. N., & Pashos, C. L. (2013).
Current status and evidence of effects of e-prescribing implementation in United Kingdom, Italy,
Germany, Denmark, Poland and United States. Value in Health, 16(7), A462–A463.
[6] Camarinha-Matos, L. M., Afsarmanesh, H., Ferrada, F., Oliveira, A. I., & Rosas, J. (2013). A
comprehensive research roadmap for ICT and ageing. Studies in Informatics and Control, 22(2), 233–
254. http://doi.org/10.24846/v22i3y201301
[7] Afsarmanesh, H., Camarinha-Matos, L. M., & Msanjila, S. S. (2009). A well-conceived vision for
extending professional life of seniors. IFIP Advances in Information and Communication Technology,
307, 682–694. http://doi.org/10.1007/978-3-642-04568-4_70
[8] Camarinha-Matos, L. M., & Afsarmanesh, H. (2012). Collaborative networks in active ageing–a roadmap
contribution to demographic sustainability. Production Planning & Control, 23(4), 279–298.
[9] Arksey, H., & O’Malley, L. (2005). Scoping studies: towards a methodological framework. International
Journal of Social Research Methodology, 8(1), 19–32. http://doi.org/10.1080/1364557032000119616
[10] Moher, D., Shamseer, L., Clarke, M., Ghersi, D., Liberati, A., Petticrew, M., … Shekelle, P. (2015).
Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015
statement. Systematic Reviews, 4(1), 1. http://doi.org/10.1186/2046-4053-4-1
[11] Lampe, K., Mäkelä, M., Garrido, M. V., Anttila, H., Autti-Rämö, I., Hicks, N. J., … Kärki, P. (2009).
The HTA core model: a novel method for producing and reporting health technology assessments.
International Journal of Technology Assessment in Health Care, 25(S2), 9–20.
[12] Eslami, S., Dehghan, H., Namayandeh, M., Dehghani, A., Dashtaki, S. H., Gholampour, V., …
Ghasemian, S. (2018). Applied Criteria of Hospital Information Systems in Organizational Evaluation:
A Systematic Review Protocol. Internal Medicine and Medical Investigation Journal, 3(2), 52–56.
[13] Dehghan, H. R., Eslami, S., Namayandeh, M., Dehghani, A., Dashtaki, S. H., Gholampour, V., …
Barzegar, A. (2018). Criteria for Ethical Evaluation of Hospital Information Systems: A Protocol for
Systematic Review. Internal Medicine and Medical Investigation Journal, 3(4).
[14] National Committee on Vital and Health Statistics. Recommendations from Past Reports:
E-Prescribing Standards (2005). Retrieved from
http://endingthedocumentgame.gov/PDFs/ePrescribing.pdf
[15] Samadbeik, M., Ahmadi, M., Sadoughi, F., & Garavand, A. (2017). A comparative review of electronic
prescription systems: Lessons learned from developed countries. Journal of Research in Pharmacy
Practice, 6(1), 3. http://doi.org/10.4103/2279-042X.200993
[16] Ahmadi, M., Samadbeik, M., & Sadoughi, F. (2014). Modeling of the Outpatient Prescribing Process in
Iran: A Gateway toward Electronic Prescribing System. Iranian Journal of Pharmaceutical Research,
12(2), 725–738. Retrieved from
http://ijpr.sbmu.ac.ir/index.php/daru/article/view/?_action=articleInfo&article=1500
[17] Deetjen, U. (2016b). European e-prescriptions: benefits and success factors. Working Paper No. 5,
Working Paper Series of the Cyber Studies Programme, Department of International Relations,
University of Oxford.
[18] WHO. (2018). Towards a Roadmap for the Digitalization of National Health Systems in Europe, (June),
1–44. Retrieved from http://www.euro.who.int/__data/assets/pdf_file/0008/380897/DoHS-meeting-
report-eng.pdf?ua=1
1. Introduction
Medical errors are the third leading cause of death in the United States [1], and about
20% of these errors are related to medication errors [2, 3]. Electronic prescribing
(e-prescribing) has been offered as a solution to this problem and has been shown to
have many benefits [4-7]. On the other hand, if electronic prescribing is not
implemented properly, it can introduce new errors (e-iatrogenesis) [8-11].
Usability and User-Centered Design (UCD) [12] are critical elements in the design
and development of electronic prescription and, more generally, electronic health
record (EHR) systems; they can enhance patient safety, encourage physicians'
adoption and reduce their dissatisfaction with these systems [13].
1 Corresponding Author: Kobra Etminani, Department of Medical Informatics, Faculty of Medicine,
Mashhad University of Medical Sciences, Mashhad, Iran, E-Mail: [email protected]
Poor usability not only results in an increased level of clinician frustration but can
also lead to errors, posing serious threats to patient safety [14-16].
This study aims to enhance the efficiency of the e-prescribing system by reducing
the risk of inappropriate medication selection and by reducing physicians' prescribing
time. We propose a model that recommends the most probable medications at the top
of the drop-down list in the e-prescription application and show that it can improve
performance.
2. Methods
The main idea was based on the observation that pharmacists can read physicians'
handwritten prescriptions from just a few legible letters. It was obvious that they use
complementary information, such as the physician's specialty, to narrow down the
search space containing all possible alternatives for the correct medication, eventually
reaching a very probable result and confirming it with other clues on the prescription.
We decided to use this approach to design a smart system that “guesses” what might be
in the physician's mind when he/she has entered only the first few characters of a
drug's name, so that the application can display the search results ordered by
probability instead of alphabetically.
In the first phase of the study, in order to extract the pharmacists' tacit knowledge
and find out which information fields they use to reach a conclusion about a specific
drug, we conducted semi-structured in-depth interviews with pharmacists. They were
provided with an initial list of eight fields and asked to describe how they think, and to
talk about the information fields they may use when facing an illegible prescription.
These fields had been identified in two brainstorming sessions and include “physician
specialty”, “frequency of drug use in general”, “frequency of drug use among physicians
with the same specialty”, “other drugs in the prescription”, etc. The pharmacists were
allowed to modify the list, add new fields and/or remove useless fields. The viewpoints
of the interviewees were noted, and all sessions were audio-recorded with the
interviewees' permission. The recordings were rechecked by the interviewers to ensure
nothing was missing from the notes. Interviews were continued until information
saturation, i.e. until no new fields were added to the list in two consecutive interviews.
A total of sixteen pharmacists were interviewed.
A checklist containing all information fields found in the interviews was prepared
and sent to twenty pharmacists via email, asking them to rank the fields by their
relative importance for reading illegible prescriptions. The fields were then sorted by
their average rank, making the final list ready for the next phase.
Some of the identified fields were not available (for example, the “diagnosis” field
is not recorded in claims data), and some are not applicable in the context of electronic
prescribing, such as the quantity of the ordered drug, which is not available to the
system before the drug itself is known.
We used a drug claims database of over 16 million prescriptions containing 46
million drug items, prescribed over two consecutive years in a large province of Iran,
to build and train a model for predicting drug names in the context of an electronic
prescription system.
To build the model, we used the “lift” concept, which is commonly used in the
data-mining technique of association rule mining. It is defined as the ratio of two
probabilities: the ratio of the probability that an event occurs under a specific condition
to the probability that the event occurs in general. We used this value to rank the drugs
matching the user's input and to sort the search results by those ranks.
For example, when a drug name starts with the letters “Ac”, it can be “Acetaminophen”,
“Acetazolamide” or some other drug. In our database, there were 471,000 prescriptions
(out of 16 million) containing “ACETAMINOPHEN 325MG TAB”, so the general
probability of this drug being prescribed is 471,000/16,000,000, which equals 0.029.
When another piece of information is available, the probability may change. If we
know that the physician is an ophthalmologist, the database shows about 186,000
prescriptions from ophthalmologists, among which “ACETAMINOPHEN 325MG
TAB” is prescribed 700 times, so the probability changes to 700/186,000, which equals
0.0038. In this example, knowing the specialty of the physician changed the probability
to about 1/7 of its previous value. On the other hand, the same calculation for
“ACETAZOLAMIDE 250MG TAB” changes its general probability of 0.00097 to
0.03614, corresponding to a lift of 37.25. This means that ophthalmologists prescribe
this drug 37 times more often than physicians in general.
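In code, the lift computation reduces to a one-liner over the claims counts; the following Python snippet reproduces the worked example above.

    def lift(k_cond, n_cond, k_all, n_all):
        # Lift = P(event | condition) / P(event)
        return (k_cond / n_cond) / (k_all / n_all)

    # Acetaminophen 325 mg among ophthalmologists (counts from the text):
    print(lift(700, 186_000, 471_000, 16_000_000))   # ~0.13, i.e. about 1/7
    # Acetazolamide 250 mg: conditional probability 0.03614 vs. general
    # probability 0.00097 gives a lift of about 37.25.
    print(0.03614 / 0.00097)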
In the same way, other pieces of available information shift the probability
upward or downward. With the available data we could access 7 information fields:
- Doctor profile
- Patient
- Specialty
- Previous drug (i.e. other drugs in the same prescription)
- Previous drug in the same specialty as the doctor
- Drug simple name (i.e. all dosage forms and strengths of the drug)
- Drug simple name in the same specialty as the doctor
3. Results
The first phase resulted in an ordered list of 22 information fields. The most important
fields were the drug's form, asking the patient about his/her medication history,
considering other legible items that may exist in the prescription, the physician's
specialty, the drug's dosage,
Table 1. Evaluation steps for the model. At the matching-drug level, the model's output for each search
result and its concordance with the drug that was actually prescribed is the basis of the performance
measurement. At the selected-drug level, the rank of the desired drug within the result set determines the
performance. At the prescription level, measures of overall performance are recorded and compared between
groups.

Matching-drug level:
1. Select random prescriptions from the claims database.
2. For each drug item (“desired drug”) in each selected prescription:
2.1. Extract the first N letters of the drug's name => “user query”
2.2. Find all drug names starting with “user query” => “matched drugs”
2.3. For each “matched drug”:
2.3.1. Calculate the lifts in the actual prescription context.
2.3.2. Feed the lifts into the model and calculate the result => “model result”
2.3.3. Assign a class label: if “model result” >= threshold, class = “Yes”; otherwise class = “No”.
2.3.4. Assign the “classification result”: with class label “Yes”, a “matched drug” equal to the
“desired drug” is a true positive (TP) and one different from the “desired drug” is a false positive (FP).

Selected-drug level:
5.4. Assign a “Frequency-Top-N” label of “Yes” or “No” by checking whether the “desired drug” is
among the top N drugs of the list (N = 1, 3, 5, 10).
5.5. Sort the “matched drugs” by “model result” in descending order.
5.6. Assign a “Model-Top-N” label of “Yes” or “No” by checking whether the “desired drug” is among
the top N drugs of the list (N = 1, 3, 5, 10).
6. Count the number of “Yes”s and “No”s for each sorting method and each level of N.
7. Compare the differences between each pair of sorting methods for each level of N, using statistical
methods such as the Chi-square test.

Prescription level (user experience):
8. Implement the model in a laboratory e-prescribing application and define test scenarios.
9. Ask physicians to prescribe drugs for the test cases and record all user activity in the application. Set
the sorting method to “alphabet” or “model” in a cross-over design.
10. Measure the “time to find the desired drug”, the “number of entered characters” to reach the desired
drug and “the position of the selected drug”.
11. Compare the differences of the measures in the previous step between the two sorting methods, using
statistical methods such as multi-regression analysis.
asking the patient about his/her symptoms, usage instructions, and the patient's age,
respectively.
These fields were categorized into five groups: fields related to (1) the patient's
profile, (2) the physician's profile, (3) the physician's specialty, (4) the medication's
properties and (5) other medications in the prescription.
At the “matching drug” level, 45 ROC curve analyses were performed. Figure 1 shows a
sample of 3 ROC curves for configurations in which the previous drug and the physician’s
profile are used and the user has entered 2, 3 or 4 letters of the drug name. In this sample,
choosing a cut-off point of 0.77 for the first curve (2 letters) results in a sensitivity of
0.931 and a specificity of 0.921. Table 2 shows these performance measures.
At the “selected-drug” level, as shown in Figure 2, the desired drug is at the top of the
alphabetically sorted list in only 12% of cases, but when sorting by the results of the model
based on the physician’s profile, the first suggested drug is the desired one in about two-
thirds of cases. Sorting by general frequency yields better results than alphabetical sorting,
but its performance is lower than that of the model. The same difference is seen when
comparing whether the desired drug is among the top 3, top 5 or top 10 suggestions.
Figure 1. ROC Curves for model accuracy, when the physician's profile and previous drug in the prescription
are known, and the user has entered 2, 3 or 4 letters of the drug name. Note that for better illustration, the
horizontal axis range is changed to (0,0.3).
Table 2. Model performance measures in “matching drugs” level, for different combinations of known fields
and the number of entered characters. Minimum and maximum values for each measure are in bold.
SEN: Sensitivity, SPC: Specificity, spec: doctor’s specialty, prevdrug: previous drug in the same prescription.
2 Letters 3 Letters 4 Letters
Known Fields SEN SPC SEN SPC SEN SPC
patient 0.999 0.990 0.999 0.985 0.998 0.986
doctor 0.863 0.945 0.919 0.900 0.930 0.874
prevdrug 0.933 0.865 0.902 0.870 0.811 0.895
spec 0.902 0.876 0.860 0.853 0.903 0.821
doctor patient 0.995 0.992 0.997 0.988 0.996 0.982
prevdrug patient 0.997 0.992 0.998 0.980 0.998 0.987
prevdrug doctor 0.925 0.942 0.919 0.919 0.917 0.902
spec doctor 0.931 0.921 0.942 0.881 0.950 0.863
spec patient 0.996 0.992 0.997 0.986 0.997 0.986
spec prevdrug 0.926 0.901 0.930 0.860 0.905 0.860
prevdrug doctor patient 0.993 0.993 0.997 0.985 0.996 0.982
spec doctor patient 0.996 0.991 0.996 0.987 0.996 0.987
spec prevdrug doctor 0.950 0.927 0.933 0.910 0.941 0.882
spec prevdrug patient 0.996 0.992 0.997 0.988 0.998 0.984
spec prevdrug doctor patient 0.997 0.991 0.996 0.987 0.993 0.987
Sorting Method    Alphabetical    Frequency    Model – Doctor
Top Result        12.677%         53.344%      65.496%
Top 3             33.739%         82.161%      91.738%
Top 5             45.265%         91.177%      96.972%
Top 10            65.484%         98.030%      99.436%
Figure 2. Percentage of cases in which the desired drug is among the 1st rank, top 3, top 5 or top 10
suggestions, comparing alphabetical sorting with sorting by frequency and sorting by model results.
Chi-square tests showed that the differences observed between each pair of sorting
methods, at all four levels of N and in all model configurations, were statistically
significant (p < 0.001, df=1).
In the last evaluation step (user experience, prescription level), the multiple regression
analysis showed that sorting by the model is significantly better than alphabetical sorting
in terms of a shorter time to find the desired drug (p<0.001) and fewer entered characters
(p<0.01). The position of the selected drug was not significantly different between the
sorting methods (p>0.05).
4. Discussion
Recommender systems are widely used in commercial and e-commerce sites, and many
methods for implementing and evaluating these systems have been developed [17].
In this project, we used a collaborative filtering method [18] to enhance the usability of
an e-prescribing system.
In 2014, Syed-Abdul et al. proposed a smart model that recommends the most
commonly prescribed medications in the drop-down menu for a given disease. They used
the association between diagnoses and prescribed drugs to calculate the Mean Prescription
Rank (MPR) and the Coverage Rate (CR) of prescriptions and developed a model to
compute a proactive medication list using these concepts. They showed that this system
can shorten the length of the medication drop-down menu in the electronic prescription
application and concluded that this could improve safety and save time. They thereby
showed that the “diagnosis” field can be used in developing recommender systems.
Our study showed that the patient’s profile, the physician’s profile, the physician’s
specialty and other prescribed drugs can also be used, alone or in combination, to develop
recommender systems for electronic prescribing.
Future research may combine these fields with the diagnosis and achieve better results.
Although we could show that a recommender system can improve usability by reducing
the time and effort needed to find the desired drug, its efficacy in enhancing patient safety
should be studied in future research in physicians’ routine practice.
References
[1] Makary, M.A. and M. Daniel, Medical error-the third leading cause of death in the US. BMJ, 2016. 353:
p. i2139.
[2] Bates, D.W., et al., Incidence of Adverse Drug Events and Potential Adverse Drug Events: Implications
for Prevention. JAMA, 1995. 274(1): p. 29-34.
[3] Tamblyn, R., et al., The medical office of the 21st century (MOXXI): effectiveness of computerized
decision-making support in reducing inappropriate prescribing in primary care. CMAJ: Canadian
Medical Association journal = journal de l'Association medicale canadienne, 2003. 169(6): p. 549-
556.
[4] Eslami, S., N.F. de Keizer, and A. Abu-Hanna, The impact of computerized physician medication order
entry in hospitalized patients--a systematic review. Int J Med Inform, 2008. 77(6): p. 365-76.
[5] Meisenberg, B.R., R.R. Wright, and C.J. Brady-Copertino, Reduction in chemotherapy order errors with
computerized physician order entry. J Oncol Pract, 2014. 10(1): p. e5-9.
[6] Porterfield, A., K. Engelbert, and A. Coustasse, Electronic prescribing: improving the efficiency and
accuracy of prescribing in the ambulatory care setting. Perspectives in Health Information
Management, 2014. 11(Spring).
[7] Reckmann, M.H., et al., Does computerized provider order entry reduce prescribing errors for hospital
inpatients? A systematic review. J Am Med Inform Assoc, 2009. 16(5): p. 613-23.
[8] Weiner, J.P., et al., “e-Iatrogenesis”: the most critical unintended consequence of CPOE and other HIT.
J Am Med Inform Assoc, 2007. 14(3): p. 387-388.
[9] Nanji, K.C., et al., Errors associated with outpatient computerized prescribing systems. J Am Med Inform
Assoc, 2011. 18(6): p. 767-73.
[10] Campbell, E.M., et al., Types of unintended consequences related to computerized provider order entry.
J Am Med Inform Assoc, 2006. 13(5): p. 547-56.
[11] Koppel, R., et al., Role of computerized physician order entry systems in facilitating medication errors.
Jama, 2005. 293(10): p. 1197-203.
[12] Ratwani, R.M., et al., Electronic health record usability: analysis of the user-centered design processes
of eleven electronic health record vendors. Journal of the American Medical Informatics
Association, 2015. 22(6): p. 1179-1182.
[13] Cohen, J.F., J.-M. Bancilhon, and M. Jones, South African physicians' acceptance of e-prescribing
technology: An empirical test of a modified UTAUT model. South African Computer Journal, 2013.
50(1): p. 43-54.
[14] Tamblyn, R., et al., The development and evaluation of an integrated electronic prescribing and drug
management system for primary care. Journal of the American Medical Informatics Association:
JAMIA, 2006. 13(2): p. 148-159.
[15] Johnson, K.B., et al., Showing Your Work: Impact of annotating electronic prescriptions with decision
support results. J Biomed Inform, 2010. 43(2): p. 321-5.
[16] Halamka, J., et al., E-Prescribing collaboration in Massachusetts: early experiences from regional
prescribing projects. Journal of the American Medical Informatics Association: JAMIA, 2006.
13(3): p. 239-244.
[17] Portugal, I., P. Alencar, and D. Cowan, The use of machine learning algorithms in recommender
systems: a systematic review. Expert Systems with Applications, 2018. 97: p. 205-227.
[18] Ekstrand, M.D., et al., Collaborative filtering recommender systems. Foundations and Trends in
Human–Computer Interaction, 2011. 4(2): p. 81-173.
1. Introduction
Conventions on the designation and ordering of phenomena of the study area are present
in all sciences in order to make them accessible, communicable and comparable to
systematic research [1]. Classification systems are helpful and sometimes indispensable
from a clinical, scientific, administrative and economic point of view. The
documentation effort is critically questioned by the health service providers [2,4] and it
is also feared that bad coding generates financial disadvantages [3].
After the "countless symptoms and non-disease related conditions that occur in
primary care" [5] were inadequately classified by the ICD 10, the World Organization of
National Colleges, Academies and Academic Associations of General Practitioners /
Family Physicians (WONCA) developed, issued and continuously adapted a coding
system that specifically addresses the needs of general medical and primary care
documentation.
2. Methods
As a first step, selected studies from Germany, Switzerland, the Netherlands, Norway and
Australia [6,7] were analyzed with regard to the practical application of ICPC-2 coding.
On this basis, two studies were conducted in Austria in the period from February to
September 2018. The quantitative survey (n=28) was carried out in cooperation with the
"Austrian Forum for Primary Care in the Health Care System" [6, p. 44] in order to
ascertain which of the 28 institutions with a primary care character use ICPC-2 in
practice and to what extent it is implemented in the documentation software used. The
qualitative investigation in the form of expert interviews [7, p. 20] (n=4) had the goal of
developing a strengths-weaknesses analysis for the use of ICPC-2 coding in practice.
1 Corresponding Author: Karin Messer-Misak, FH Joanneum Gesellschaft mbH, Eckertstr. 30i, 8020
Graz, Austria, E-Mail: [email protected]
The results of the quantitative survey [6, pp. 41-48] showed that only four of the
surveyed institutions in Austria have practical experience with documentation by means
of ICPC-2. The qualitative analysis revealed some strengths of the use of ICPC-2 [7],
such as the documentation of counseling events, episodes and counseling results at the
symptom level, the specific coding of treatment episodes, clarity due to the small number
of codes, and the possibility of documenting non-medical content (e.g., social issues).
But there are also some significant weaknesses of the system [7, p. 68] which should not
be ignored: uncertainty about the correct application, a basic skepticism regarding the
benefits, no consistent specifications on how to code, a lack of diagnostic precision,
original diagnostic texts that do not correspond to common usage, and unclear
procedures.
In order to ensure an effective Austria-wide implementation, the following steps are
considered recommendable: In terms of content, an extension and supplementation of the
data will be required [5, p. 20-21; 7, p. 68]. At the federal level, the organizational and
legal preparation of cross-sector coded diagnostic documentation in the entire outpatient
area should be introduced by December 2021. And at the organizational level, specific
training is required, as well as a uniform guide on how to manage the integration into
common practice software [7, p. 68].
Due to the reorganization of primary care and other health-economic requirements,
unified documentation, which is already common in the intramural field, will be
essential.
References
[1] H.-U. Wittchen, Klinische Psychologie & Psychotherapie (Lehrbuch mit Online-Materialien). 2., überarb.
und erw. Auflage. Springer, ISBN 978-3-642-13017-5, Heidelberg 2011, S. 28-53.
[2] E. Gollner, F. Schnabel, Strukturevaluation der medizinischen Dokumentation bei unterschiedlichen
Krankenhausträgern, Innovation durch Evaluation: Impulse setzen durch Evaluationsprozesse im Social-
Profit- und Public Health-Sektor, Forschungsforum der österr. Fachhochschulen. 2/2017, p. 102.
[3] S. Stark, S. Hölzer, Dokumentations- und Kodierprozesse im Spital: Herausforderungen und Massnahmen.
https://saez.ch/de/resource/jf/journal/file/view/article/saez/de/saez.2005.11404/2005-33-1410.pdf/ , CH
Ärztezeitung, 2005; Nr. 32/33, p. 86.
[4] K. Blum, U. Müller, Dokumentationsaufwand im Ärztlichen Dienst der Krankenhäuser.
Repräsentativerhebung des Deutschen Krankenhausinstituts. In: Das Krankenhaus 7/2003, p. 544-548.
[5] WONCA International Classification Committee (Hrsg.). Internationale Klassifizierung der medizinischen
Primärversorgung ICPC-2. Ein Codierungssystem der Allgemeinmedizin. ISBN 3211835504, Springer,
2001, p. 11-12.
[6] T. Kraußler, Analyse der aktuellen Umsetzung einer einheitlichen Diagnose- und Leistungserfassung
mittels ICPC-2 in Österreich. Masterarbeit. Österreich, Graz, 2018.
[7] K. Kahr, Kritische Betrachtung des Einsatzes der ICPC-2 Codierung in der Primärversorgung. Masterarbeit.
Österreich, Graz, 2018.
1. Introduction
1 Corresponding Author: Amelie Altenbuchner, Institut für Sozialforschung und
Technikfolgenabschätzung (IST), Ostbayerische Technische Hochschule Regensburg (OTH), Seybothstraße 2,
D-93053 Regensburg, Germany, E-Mail: [email protected]
usage [3]. Exploring practical devices for measuring patients’ motion in everyday
conditions is, nevertheless, a current desideratum of gerontology [4].
Rather than developing a new technology system for geriatric patients in the
rehabilitation process after a hip fracture, this study intends to find out how to use an
existing customary device in a new context. This approach is called technology design
[5]. A study group measured gait with customary body-fixed sensors after hip fracture,
in the hospital, nine hours a day for one to two days and then again two weeks later. The
authors found a change in performance and suggest using the median for further analysis
when comparing groups, because of deviations [6]. For continuous step measurement, it
is useful to use a customary motion tracker similar to a wristwatch [2]. Patients are often
used to wearing their watches all the time and usually do not forget to put them back on.
Furthermore, the devices can be easily integrated into daily life. However, measuring
under everyday-life conditions entails the impossibility of controlling for any
confounding variables.
In this paper, we perform an exploratory data analysis (EDA) on the observed data
(steps taken). The aim is to detect clusters that show similarities in the individual
performance of patients and to fit a significant linear regression model for each patient,
with the aim of predicting steps over time. The analysis is part of an ongoing long-term
study that explores the possibilities of a conventional motion tracker during
rehabilitation.
2. Methods
2.1. Subjects
Subjects are patients on a geriatric trauma ward in a German hospital after surgery for
hip fracture. Patients with this condition or, where applicable, their legal guardians are
invited to join the study. Recruitment starts postoperatively by obtaining informed
consent, and data collection begins post-surgery. No medical intervention is part of the
study. Each patient has a study ID, and the online registration of the motion tracker
contains a pseudonym.
There are two study groups in this prospective observational study, with two different
types of motion trackers. Study group 1 (sg1) includes all hip fracture patients who
agreed to participate (n=10; 31% of all patients) until discharge from hospital (Med=9
days), from 6 November 2017 to 28 February 2018. The first group was primarily
examined to explore conditions in the research field and to test whether patients agree to
use a motion tracker (Fitbit Alta HR) at all [2]. Study group 2 (sg2), recruited with the
same sampling procedure, includes 14 patients at this point; data collection has been
ongoing during the stay in rehabilitation and in the domestic setting since 13 June 2018
(Med=45 days), with a different motion tracker (Garmin vívofit 3).
2.2. Instruments
Sg1 uses the Fitbit Alta HR motion tracker with a battery life of five days. Data can be
downloaded while charging the device by connecting it to a personal computer, laptop
or tablet, or without recharging via Bluetooth using a smartphone application. Sg2 uses
the Garmin vívofit 3 motion tracker with a battery life of up to one year.
The measurement validity of customary motion trackers and motion sensors is not
completely established for elderly patients [7], although some authors assume proven
validity of body-worn sensors [6] for elderly people living at home or in institutional care.
2.3. Analysis
In both study groups, the providers’ online tools allow data extraction into a .csv file.
After extraction, this file was imported into IBM SPSS Statistics 24.
Median tests compared sg1 and sg2 as sample groups. A k-means cluster analysis
revealed similarities in the step data of the patients over time. The cluster groups were
then used as factors in an ANOVA to determine whether there are significant mean
differences between patients belonging to different clusters.
To detect individuals within the clusters for whom time is a significant predictor of
steps, a linear regression analysis was run for every patient.
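For illustration, a hypothetical Python sketch of this pipeline (the authors worked in IBM SPSS Statistics 24; the file and column names steps.csv, patient_id, day and steps are assumptions):

```python
import pandas as pd
from scipy import stats
from sklearn.cluster import KMeans

df = pd.read_csv("steps.csv")  # assumed: one row per patient per day

# k-means on the average steps per day over the first five days (s5)
s5 = df[df["day"] <= 5].groupby("patient_id")["steps"].mean()
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit(s5.to_frame())
print(dict(zip(s5.index, clusters.labels_)))

# per-patient linear regression: steps predicted by day of measurement
for pid, grp in df.groupby("patient_id"):
    res = stats.linregress(grp["day"], grp["steps"])
    if res.pvalue < 0.05:  # report only significant models
        print(pid, f"steps = {res.intercept:.0f} + {res.slope:.1f} * day")
```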
3. Results
Table 1 shows the descriptive statistics of patients (N=24) in sg1 (n=10) and sg2 (n=14).
26% of the patients participating in the study are male. The patients are of geriatric
age, 86 ± 6.8 years old on average. The median test did not show a significant difference
between the age medians of sg1 (Med=85.5) and sg2 (Med=86.0) [z=.120, p=n.s., n=24].
Table 3 shows the distribution of steps for the patients during individual lengths of
data collection.
K-means cluster analysis suggests three clusters (Table 2) for the average steps per day
after five days of measurement (s5) (N=24) [F(2,21)=76.096, p≤.001].
For the average steps per day after 14 days of measurement (s14) (N=12)
[F(2,9)=43.449, p≤.001], after 21 days (s21) (N=12) [F(2,9)=54.462, p≤.001] and 28
days (s28) (N=12) [F(2,8)=34.360, p≤.001] only sg2 is considered for the analysis.
In s14, 88.5% of the variation is explained by the three-cluster solution
(R²adjusted=.885). Cluster 1 (M=902.2 steps) contains only one person; cluster 2 (M=481.2
steps ±71.3) and cluster 3 (M=188 steps ±236.5) show a significant difference from each
other.
In s21, 90.7% of the variation is explained by the three-cluster solution
(R²adjusted=.924). Games-Howell’s post-hoc test states that cluster 2 (M=219.3 steps
±107.8) and cluster 3 (M=677.6 steps ±57.7) are significantly different from each other,
while cluster 1 (M=1207.4 steps ±231.7) shows no significant difference from the other
clusters.
In s28, 89.6% of the variation is explained by the three-cluster solution
(R²adjusted=.896). Cluster 2 (M=1720 steps; the same patient as in s14) contains only one
person; cluster 1 (M=829.1 steps ±287.4) and cluster 3 (M=185 steps ±90.3) show a
significant difference from each other.
Linear regression analysis shows that in sg1 a linear regression model that predicts steps
from the day of measurement is significant for 30% of all patients [N=3; ID=4, 6, 9].
For the other patients in sg1 this is not the case [N=7; ID=1, 2, 3, 5, 7, 8, 10]. In sg2
the same regression model is significant for 71% of all patients [N=10; ID=11, 13, 15,
16, 17, 18, 20, 21, 22, 23]. For 29% of all patients [N=4; ID=12, 14, 19, 24] the model
cannot predict the steps.
The following rows show the significant regression models for the patients.
4. Discussion
The main goal of this article is an exploration of data a motion tracker can provide for
the sample of geriatric trauma patients. A target group for whom no general statistical
record about their average number of steps exists so far. Consequently, the first aim was
to detect similarities and patterns in the number of steps individuals take. The second
A. Altenbuchner et al. / Exploratory Analysis of Motion Tracking Data 143
aim was to find out if it could be possible to predict steps through time in the future, by
formulating linear regression models.
Overall, the results present the development of mobility after hip fracture surgery in 24
individuals. At the beginning of data collection, 75% of patients fell into the same cluster.
Through the rehabilitation process the distribution becomes wider, although the largest
number of patients continuously remain together in one cluster – the one with low
progress compared to the other clusters. This, and the fact that in s14 and s28 only one
patient defines a cluster of its own, may speak for a two-cluster solution. Still, the
difference between individuals is high, and 50% of patients in sg2 are continuing in the
long-term study. The cluster analysis therefore needs to be repeated with the available
long-term data. The goal is to find out whether a three-cluster model fits the data and
whether this result could yield a scientific hypothesis for future research on old-age
rehabilitation.
Information about the number of steps before the incident that led to the hip fracture is
unfortunately not available. Consequently, not the actual number of steps but the rates of
decline and increase could be included in further analyses.
During data collection only a few interruptions occurred. This reflects the acceptance
and usability of the motion trackers, as they are similar to a wristwatch. A great
challenge, however, is dropout and the resulting variation in the length of measurement,
due to health conditions or other events in the patients’ or caregivers’ lives. This is why
the regression model was not significant for 29% of patients in sg2. Who are these
patients? The patient with ID 12 died during the study; caregivers cancelled participation
for ID 14, a patient with a dementia diagnosis; the patient with ID 19 cancelled the study
after eight days; and the patient with ID 24 had just started the ongoing study. All other
patients show medium to high effect sizes in the regression models. For rehabilitants and
their caregivers, it could be interesting to monitor a trend of physical motion. A
suggestion for further research is to focus on the statistical connection between steps and
geriatric assessment scores and to detect predictors for the quality of rehabilitation. In
the ongoing project, our research will focus on validity aspects for the target group. We
plan to estimate the validity for walking with a rollator or a walking stick.
Additionally, this leads to another important issue for patients: the effect of motion
feedback on motion. We evaluate this feedback during the ongoing study.
The feedback issue implies the imperative of data literacy [8]: on the one hand, passive
data measured by an objective wearable provides an insight into daily activities free of
subjectivity. On the other hand, the visualisation of numbers representing activity is a
skewed reflection that might be hard to interpret. The WHO recommendations suggest
almost the same level of physical activity for people over 65 as for people between 18
and 65 years [9]. The differentiation between young, middle-old and elderly persons lies
in the individual cardiac frequency [10]. For people who have never been much into
sports their whole life, such as the majority of the German population [11], these
requirements may be hard to meet. Even for those suffering from physical limitations,
moderate activity at least three times a week is suggested for preventing falls [12], and
even mostly inactive elderly people benefit from integrating any activity into their daily
routine [12]. In addition, dementia patients benefit from physical activity in therapy
[13, 14]. Through self-tracking or caregiver-tracking, the database could provide
individual information [8] about the objective state of motion. This makes it possible to
interpret the data, even without medical expert knowledge, by asking and answering: Is
there a decline, an increase or consistency over a certain period? Does the data fit the
subjective impressions? Does the data fit the general state of health? What are the
rehabilitation goals?
Digital self-tracking could lead to a loss of control and autonomy [8] and is criticised
for its economic body-optimization aspect and for the fact that providers sell the data
[15]. At the same time, it is a relatively affordable tool for regaining power through
knowledge about one’s own body and self – in this case about motion and mobility. In
the lives of the elderly, these are associated with rehabilitation, health, participation,
quality of life and hence autonomy [16, 4]. This project will include the topics of quality
of life and motivation by feedback [2] in the future.
Finally, physical activity “mitigate[s] the mortality risks” [17]; any kind of physical
activity is superior to inactivity [12]. Motion-tracking data reflects motion and lays the
foundation for developing complex personalized interventions in the future [4].
References
[1] K. Weber, Demografie, Technik, Ethik: Methoden der Demografie, Technik, Ethik: Methoden der
normativen Gestaltung technisch gestützter Pflege, Pflege & Gesellschaft 22(4) (2017), 338–352.
[2] A. Altenbuchner, S. Haug, R. Kretschmer, K. Weber, How to measure physical motion and the impact of
individualized feedback in the field of rehabilitation of geriatric trauma patients, in: Health Informatics
Meets eHealth. G. Schreier and D. Hayn, open access IOS press, 2018. pp. 226–232.
[3] K. Gurley, FA. Norcio, A systematic review of technologies designed to improve and assist cognitive
decline for both the current and future aging populations, in: Internationalization, design and global
development: Third international conference, IDGD 2009, held as part of HCI International 2009, San
Diego, CA, USA, July 19–24, N. Aykin, Berlin, 2009. pp. 156–63.
[4] A. Barth, G. Doblhammer, Physische Mobilität und Gesundheit im Alter, in: Die transformative Macht der
Demografie. T. Mayer, Wiesbaden, 2017. pp. 207–244.
[5] G. Banse, R. Hauser, Technik und Kultur - ein Überblick, in: Technik und Kultur: Bedingungs- und
Beeinflussungsverhältnisse. A. Grunwald, G. Banse, Karlsruhe, 2010. pp. 17–39.
[6] P. Benzinger, U. Lindemann, C. Becker, K. Aminian, M. Jamour, S.E. Flick. Geriatric rehabilitation after
hip fracture. Role of body-fixed sensor measurements of physical activity. Z Gerontol Geriatr 47(3)
(2014), 236–42.
[7] B. Grimm, S. Bolink. Evaluating physical function and activity in the elderly patient using wearable motion
sensors. EFORT Open Rev 1(5) (2017), 112–120.
[8] S. Duttweiler, J.-H. Passoth, Self-Tracking als Optimierungsprojekt? in: Leben nach Zahlen – Self-
Tracking als Optimierungsprojekt? S. Duttweiler, R. Gugutzer, J.-H. Passoth, Bielefeld, 2016. pp. 9–42.
[9] World Health Organization, Global recommendations on physical activity for health, Geneva, 2010.
[10] A. Rütten, K. Abu-Omar, T. Lampert, T. Ziese, Körperliche Aktivität, Gesundheitsberichterstattung des
Bundes, Vol. 26, Berlin, 2005.
[11] World Health Organization, What is Moderate-intensity and Vigorous-intensity Physical Activity? -
Intensity of physical activity, http://www.who.int/dietphysicalactivity/physical_activity_intensity/en/,
last access: 22.01.2019.
[12] K. Pfeifer, W. Banzer, E. Füzéki, W. Geidl, C. Graf, V. Hartung, et al., Empfehlungen für Bewegung, in:
Nationale Empfehlungen für Bewegung und Bewegungsförderung. A. Rütten, K. Pfeiffer, Erlangen,
2016. pp. 17–64.
[13] L. Clare, Rehabilitation for People Living with Dementia: a Practical Framework of Positive Support,
PLoS medicine 14 (2017), e1002245.
[14] H. Bork, Rehabilitation nach hüft- und knieendoprothetischer Versorgung älterer Menschen, Orthopäde
46(1) (2017), 69–77.
[15] S. Schaupp, Wir nennen es flexible Selbstkontrolle. Self-Tracking als Selbsttechnologie des
kybernetischen Kapitalismus, in: Leben nach Zahlen – Self-Tracking als Optimierungsprojekt? S.
Duttweiler, R. Gugutzer, J.-H. Passoth, Bielefeld, 2016. pp. 63–86.
[16] S. Förch, R. Kretschmer, T. Haufe, J. Plath, E. Mayr, Orthogeriatric Combined Management of Elderly
Patients With Proximal Femoral Fracture: Results of a 1-Year Follow-Up. Geriatr Orthop Surg Rehabil
8(2) (2017), 109–114.
[17] K. M. Diaz, A.T. Duran, N. Colabianchi, S.E. Judd, V.J. Howard, S.P. Hooker, Effects on Mortality of
Replacing Sedentary Time With Short Sedentary Bouts or Physical Activity: A National Cohort Study,
American Journal of Epidemiology kwy271 (2019), 1-7. DOI:10.1093/aje/kwy271.
1. Introduction
1 Corresponding Author: Nils Reiss, Schüchtermann-Klinik Bad Rothenfelde, Institute for
Cardiovascular Research, Ulmenallee 5-11, 49214 Bad Rothenfelde, Germany, E-Mail:
[email protected]
If patients have questions or problems, they can contact staff at the implanting center by
telephone at any time. Ultimately, however, aftercare in the post-hospital phase is
insufficient and urgently in need of improvement.
Figure 1. LVAD patient with internal and external equipment [14] (1- pump, 2- batteries, 3- driveline, 4-
controller, with permission of Abbott®).
In the field of heart failure therapy (without LVAD), initial experience with
telemonitoring approaches has been gained in recent years [15–19]. The implementation
of telemedicine centers facilitates remote medical services. Telemedicine centers
improve the efficiency of treatment and provide patients with a greater sense of safety
by assuring permanent contact with qualified medical staff. However, the approaches
applied so far are not suitable for LVAD patients, and new strategies are indispensable
for this patient group [20, 21]. To date, there are no telemedicine centers available for
monitoring heart failure patients supported by an LVAD.
In the following paper, the structural, staffing and spatial requirements for a telemedicine
center to monitor the special group of LVAD patients are described, based on
comprehensive literature research and expert interviews.
2. Methods
For the systematic recording, organization and administration of all relevant literature
sources, a systematic literature search was carried out at the beginning of the study. In
order to select relevant studies, inclusion and exclusion criteria were established using
the PICO schema [22] and advanced elements. Based on the PICO schema, the
following parameters were considered in more detail in order to subsequently define
corresponding inclusion or exclusion criteria (Table 1).
A qualitative investigation based on guided interview and focus group techniques was
conducted with caregiver experts at three German heart centers. The expert interviews
were conducted as openly as possible, since the goal of this method was a comprehensive
survey of expert knowledge regarding the research topic. Guided interviews are
non-standardized interviews that work with given topics and a list of questions – known
as the guide.
In line with the research question of the present work, a guideline was developed
for the expert interviews with corresponding topics and questions. The guide was
subdivided into five subject blocks with a total of 14 principal questions and 9
subordinate questions, corresponding to 23 categories. The 5 blocks were:
The expert interviews were intended to provide a first insight into the field of
structural, personnel and spatial requirements in order to derive conclusions for the
practical implementation of a telemedicine center.
The selection of experts is an essential decision in the research design because it
determines the nature and quality of the information. Experts in this scenario are people
who, based on their work with LVAD patients, have the expertise to make statements
about the requirements of the planned centers. The selected group of experts thus includes
persons who work closely with LVAD patients, who have the required experience, and
who have a comprehensive overview of the overall care situation of this patient clientele.
These include, on the one hand, implanting physicians and physicians providing pre-
and aftercare. On the other hand, VAD coordinators and VAD nurses with appropriate
specialist training, who work intensively with LVAD patients, are also involved.
The interviews were conducted personally, face-to-face, by the same interviewer.
Face-to-face interviews are characterized by high information content and good
controllability of the conversation. The individual expert interviews were transcribed
promptly after they were conducted, using the transcription program f5, in order to fully
capture the information received. The transcripts of the expert interviews were used as
the base material for the data analysis and were evaluated using qualitative content
analysis according to Mayring [23].
3. Results
In summary, no published studies were available to answer the questions posed by the
research topic. Thus, the current state of research must be assessed as poor or
non-existent, underlining the crucial importance of research on this topic.
A reduced search strategy was therefore used, achieving 450 hits in total (Table 2).
By means of this reduced search strategy and supplementary unsystematic hand
searching, publications on individual elements of the search strategy were found. These
were helpful for the development of the interview guide and provided information
relevant to the topic.
Source: PubMed
Date: 30.04.2018
Filters: humans, English or German, adult

#   Term                                 Result
1   “left ventricular assist device*”    4,768
2   LVAD                                 3,460
3   #1 OR #2                             5,694
4   telemonitoring                       1,140
5   “remote monitoring”                  1,420
6   “telemedical monitoring”             20
7   #4 OR #5 OR #6                       2,470
8   “telemedical service cent*”          27
9   “telemedicine cent*”                 4,985
10  “telehealth cent*”                   5,286
11  #8 OR #9 OR #10                      5,290
12  #3 AND #7 AND #11 (filters applied)  0
13  #3 AND #11 (filters applied)         0
14  #3 AND #7 (filters applied)          450

(Filters for #12–#14: Humans; English; German; Adult: 19+ years.)
The multicenter interviews were carried out at three different German clinics in Lower
Saxony and North Rhine-Westphalia. All clinics had been implanting LVAD systems
for many years and also offered aftercare for discharged patients. From the 3 hospitals,
a total of 11 experts (6 physicians and 5 VAD coordinators or VAD nurses, 91% male)
took part in the interviews. The mean experience of the interviewed experts with
LVAD patients can be assessed as very good, spanning 12.9 ± 7.5 years (Table 3). All
interviews were conducted through to the end without interruption. Overall, the
interviews yielded a total of 04:29:09 hours of recorded audio; single interviews lasted
between 00:14:15 and 00:42:32 hours (mean 00:24:28).
The expertise and experience in the field of telemedicine varied among the LVAD
experts interviewed, ranging from no knowledge to expert knowledge. The median was
within the range of basic knowledge of telemedicine (Figure 2).
Figure 2. Telemedicine knowledge of the interviewed experts (number of experts per level: no knowledge,
theoretical knowledge, basic knowledge, good knowledge, expert knowledge).
Based on the findings from the five interview blocks mentioned, corresponding to 23
categories, 10 hypotheses were generated according to the answers given by the experts
(Table 4). The first hypothesis is that each LVAD-implanting clinic should monitor its
LVAD patients via its own telemedicine center. The second hypothesis is as follows: in
order to better interpret the transmitted data and parameters, patients should be known
to the telemedicine center. In addition, based on the findings of this thematic block, a
Acknowledgment
This project is funded by the German Federal Ministry of Education and Research
(BMBF) within the framework of the ITEA 3 Project Medolution (14003).
4. References
1. Gustafsson F, Rogers JG. Left ventricular assist device therapy in advanced heart failure: Patient
selection and outcomes. Eur J Heart Fail. 2017;19:595–602. doi:10.1002/ejhf.779.
2. Slaughter MS, Pagani FD, Rogers JG, Miller LW, Sun B, Russell SD, et al. Clinical management of
continuous-flow left ventricular assist devices in advanced heart failure. J Heart Lung Transplant.
2010;29:S1-39. doi:10.1016/j.healun.2010.01.011.
3. Pinney SP, Anyanwu AC, Lala A, Teuteberg JJ, Uriel N, Mehra MR. Left Ventricular Assist Devices
for Lifelong Support. J Am Coll Cardiol. 2017;69:2845–61. doi:10.1016/j.jacc.2017.04.031.
4. Hernandez RE, Singh SK, Hoang DT, Ali SW, Elayda MA, Mallidi HR, et al. Present-Day Hospital
Readmissions after Left Ventricular Assist Device Implantation: A Large Single-Center Study. Tex
Heart Inst J. 2015;42:419–29. doi:10.14503/THIJ-14-4971.
5. Kimura M, Nawata K, Kinoshita O, Yamauchi H, Hoshino Y, Hatano M, et al. Readmissions after
continuous flow left ventricular assist device implantation. J Artif Organs. 2017;20:311–7.
doi:10.1007/s10047-017-0975-4.
6. Smedira NG, Hoercher KJ, Lima B, Mountis MM, Starling RC, Thuita L, et al. Unplanned hospital
readmissions after HeartMate II implantation: Frequency, risk factors, and impact on resource use and
survival. JACC Heart Fail. 2013;1:31–9. doi:10.1016/j.jchf.2012.11.001.
7. Hasin T, Marmor Y, Kremers W, Topilsky Y, Severson CJ, Schirger JA, et al. Readmissions after
implantation of axial flow left ventricular assist device. J Am Coll Cardiol. 2013;61:153–63.
doi:10.1016/j.jacc.2012.09.041.
8. Akhter SA, Badami A, Murray M, Kohmoto T, Lozonschi L, Osaki S, Lushaj EB. Hospital
Readmissions After Continuous-Flow Left Ventricular Assist Device Implantation: Incidence, Causes,
and Cost Analysis. Ann Thorac Surg. 2015;100:884–9. doi:10.1016/j.athoracsur.2015.03.010.
9. Forest SJ, Bello R, Friedmann P, Casazza D, Nucci C, Shin JJ, et al. Readmissions after ventricular
assist device: Etiologies, patterns, and days out of hospital. Ann Thorac Surg. 2013;95:1276–81.
doi:10.1016/j.athoracsur.2012.12.039.
10. Haglund NA, Davis ME, Tricarico NM, Keebler ME, Maltais S. Readmissions After Continuous Flow
Left Ventricular Assist Device Implantation: Differences Observed Between Two Contemporary
Device Types. ASAIO J. 2015;61:410–6. doi:10.1097/MAT.0000000000000218.
11. Schmidt T, Reiss N, Hoffmann JD, Feldmann C, Deniz E, Roske K, et al. Post-Hospital Care in LVAD
Patients - Experiences of Two Large German Heart Centers. J Heart Lung Transplant. 2017;36:S436.
doi:10.1016/j.healun.2017.01.1248.
12. Jakovljevic DG, McDiarmid A, Hallsworth K, Seferovic PM, Ninkovic VM, Parry G, et al. Effect of
left ventricular assist device implantation and heart transplantation on habitual physical activity and
quality of life. Am J Cardiol. 2014;114:88–93. doi:10.1016/j.amjcard.2014.04.008.
13. Casida JM, Wu H-S, Abshire M, Ghosh B, Yang JJ. Cognition and adherence are self-management
factors predicting the quality of life of adults living with a left ventricular assist device. J Heart Lung
Transplant. 2017;36:325–30. doi:10.1016/j.healun.2016.08.023.
14. Abbott. HeartMate3 System. https://www.heartmate.com/app_themes/patient/images/img27.jpg.
Accessed 24 Jan 2019.
15. Hindricks G, Taborsky M, Glikson M, Heinrich U, Schumacher B, Katz A, et al. Implant-based
multiparameter telemonitoring of patients with heart failure (IN-TIME): A randomised controlled trial.
Lancet. 2014;384:583–90. doi:10.1016/S0140-6736(14)61176-4.
16. Koehler F, Koehler K, Deckwart O, Prescher S, Wegscheider K, Kirwan B-A, et al. Efficacy of
telemedical interventional management in patients with heart failure (TIM-HF2): A randomised,
controlled, parallel-group, unmasked trial. Lancet. 2018;392:1047–57. doi:10.1016/S0140-
6736(18)31880-4.
17. Böhm M, Drexler H, Oswald H, Rybak K, Bosch R, Butter C, et al. Fluid status telemedicine alerts for
heart failure: A randomized controlled trial. Eur Heart J. 2016;37:3154–63.
doi:10.1093/eurheartj/ehw099.
18. Abraham WT, Adamson PB, Bourge RC, Aaron MF, Costanzo MR, Stevenson LW, et al. Wireless
pulmonary artery haemodynamic monitoring in chronic heart failure: A randomised controlled trial.
Lancet. 2011;377:658–66. doi:10.1016/S0140-6736(11)60101-3.
19. Abraham WT, Adamson PB, Costanzo MR, Eigler N, Gold M, Klapholz M, et al. Hemodynamic
Monitoring in Advanced Heart Failure: Results from the LAPTOP-HF Trial. J Cardiac Fail.
2016;22:940. doi:10.1016/j.cardfail.2016.09.012.
20. Glitza JI, Müller-von Aschwege F, Eichelberg M, Reiss N, Schmidt T, Feldmann C, et al. Advanced
telemonitoring of Left Ventricular Assist Device patients for the early detection of thrombosis. Journal
of Network and Computer Applications. 2018;118:74–82. doi:10.1016/j.jnca.2018.04.011.
21. Reiss N, Schmidt T, Boeckelmann M, Schulte-Eistrup S, Hoffmann J-D, Feldmann C, Schmitto JD.
Telemonitoring of left-ventricular assist device patients-current status and future challenges. J Thorac
Dis. 2018;10:S1794-S1801. doi:10.21037/jtd.2018.01.158.
22. Schardt C, Adams MB, Owens T, Keitz S, Fontelo P. Utilization of the PICO framework to improve
searching PubMed for clinical questions. BMC Med Inform Decis Mak. 2007;7:16. doi:10.1186/1472-
6947-7-16.
23. Mayring P. Qualitative Inhaltsanalyse: Grundlagen und Techniken. 12th ed. Weinheim: Beltz; 2015.
1. Introduction
In 2015, diabetes affected 59.8 million people in Europe aged between 20 and 79 years.
According to the International Diabetes Federation (IDF) the number of people with
diabetes is increasing in every European country and it is estimated that the number of
people with diabetes in Europe will rise to 71.1 million in 2040 [1]. Diabetes is basically
a life-long disease and like all chronic diseases it cannot be cured. Nevertheless, there
are strategies for improving the patients’ health situation. One key aspect is empowering
patients so that they are in a position to take better care of their diabetes. Information and
Communication Technologies can play a key role in the better management of diabetes
and in patient empowerment. Patient empowerment [2] involves patients to a greater
extent in their own healthcare process, so that disease management becomes an integrated
part of their daily life.
1 Corresponding Author: Oliver Jung, Salzburg Research Forschungsgesellschaft mbH, Jakob-Haringer-
Straße 5/3, 5020 Salzburg, Austria, E-Mail: [email protected]
In this paper, we present the approach of the Action Plan Engine developed in the
POWER2DM project. POWER2DM started in February 2016, and its main aim is to
develop a personalised self-management support system for Type 1 and Type 2 diabetes
patients. It offers a guided action plan for self-management by combining decision
support based on personalised results of interlinked predictive computer models,
feedback functionalities based on Behavioural Change Techniques, and real-time
collection and interpretation of personal data and self-management activities. The Action
Plan Engine is a web-based module in POWER2DM and integrates personalized
behaviour change interventions to increase adherence of the patients to their care
program and improve their interaction with health professionals.
The Action Plan Engine in POWER2DM is an advancement of the self-management
support system developed in the EMPOWER project [3], achieved by refactoring and by
adding interventions, advanced exercises and a modular design approach.
The Action Plan Engine offers a guided workflow as an iterative cycle, typically on a
weekly basis. For every cycle, the patient is encouraged to specify the tasks and activities
he or she wants to take care of in this period. These planned activities help patients adhere
to medical treatment plans, e.g. measuring glucose values, but may also support the
accomplishment of personal goals such as planned exercises. If a patient specifies
activities on a weekly basis, the likelihood that these activities are realistic is higher than
when planning activities for a longer period. However, the Action Plan cycle can also be
bi-weekly, monthly or of another duration. Besides the planning of activities, the Action
Plan Engine supports keeping diaries on mood and stress.
The Action Plan Engine interacts with other POWER2DM components: (i) with the
component for the doctors (the Shared Decision Making application) for supporting the
appointment and for specifying treatment goals and activities and (ii) with the mobile
app for a convenient acquisition of patient data and integration of device data.
Basically, the Action Plan workflow comprises four main steps (see Figure 1). In the
first step, the patient can specify long-term self-management goals based on personalised
values and on the treatment plan and goals. Based on the treatment goals, the patient can
specify a treatment goal in more detail (e.g. the type of exercise he or she would like to
do), but can also add additional personal goals.
In the next step, and based on the self-management goals, the patient specifies
short-term (e.g. weekly) activities using a calendar. Relating an activity to a goal keeps
the user aware of why he or she is performing the activity.
Next, patient data are recorded by devices, but also manually through web and/or
mobile forms. This phase supports the self-monitoring of vital data and behaviour.
Currently, the following patient data can be recorded via web forms: blood glucose,
blood pressure, body weight, exercises, meals, problems, sleep and stress.
2 The POWER2DM project is funded by the European Union’s Horizon 2020 research and innovation
programme under grant agreement No 689444.
In the last step, the Action Plan Engine evaluates and gives feedback on how successfully
the patient has fulfilled the planned goals and activities. This includes feedback about the
overall performance and the performance for all concerned goals and activities.
Additionally, the Action Plan Engine provides hints and advice (i.e., interventions for
self-management) for all activities and goals. Interventions can have different contexts,
e.g. a tip for improving self-management activities, advice based on national guidelines
(e.g. the recommended duration of physical activities), a tip for coping with daily
problems (e.g. sleep problems or stress), or positive reinforcement [4] by means of a
motivational message (e.g. when the patient has successfully completed all activities for
a specific goal).
The Action Plan Engine furthermore includes some exercises, such as the Energy
Battery, a three-step metaphor for mood or energy problems (e.g. in case of low mood,
too much stress or sleeping problems), the Value Compass, a tool for reflecting on the
importance of personal values in different life areas (e.g. to support goal definition), and
the Information Material, a WordPress website with detailed articles about information
and problems relevant to diabetes patients.
3 Currently supported languages are English, Spanish, Dutch and German.
This API is also used by the Mobile Application developed within the POWER2DM
project. Figure 2 depicts the general architecture and the main components of the Action
Plan Engine. Towards the user, the Action Plan provides an HTML5/JS application. In
the backend, the Action Plan Engine is implemented as a Java Servlet running inside a
secure container providing controlled access via the APIs.
Separate APIs are provided for each service, such as the management of goals, the
planning of activities and observations, or accessing the review over a specified period.
To provide this information, the Action Plan Engine does not store any patient data
itself but relies on existing, secure patient data management infrastructures. In
POWER2DM, the FHIR-compatible 4 personal data store (PDS) [6] and the
POWER2DM identity service for authorization and authentication are used. The
Action Plan Engine only transforms patient data, calculates graphs and statistics, and
creates interventions based on the action plans and results stored in the PDS. Likewise,
all data entered via the Action Plan is stored only in the secure PDS.
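As an illustration of this interaction, a hypothetical Python sketch of querying self-monitoring data from a FHIR-compatible store is given below; the base URL, patient ID and token handling are placeholders, not the actual POWER2DM deployment:

```python
import requests

PDS_BASE = "https://pds.example.org/fhir"  # placeholder endpoint
token = "..."  # assumed to come from the identity service

# Fetch a patient's blood glucose observations for a review period.
resp = requests.get(
    f"{PDS_BASE}/Observation",
    params={"patient": "example-id", "code": "15074-8",  # LOINC: glucose
            "date": "ge2019-01-01"},
    headers={"Authorization": f"Bearer {token}"},
)
for entry in resp.json().get("entry", []):  # FHIR searchset Bundle
    obs = entry["resource"]
    print(obs["effectiveDateTime"], obs["valueQuantity"]["value"])
```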
Figure 3 shows the menu structure provided to patients after authentication. The
landing page is a “Dashboard”. From there, the patient can navigate to the “Treatment
Plan”, which contains goals defined jointly by the care provider and the patient during
a “Shared Decision Making” process. Goals defined in the treatment plan can be adopted
by the patient and further detailed in his or her own self-management “Action Plan”. This
menu item also contains links to further information such as the review, or exercises
like the Energy Battery and the Value Compass. Daily activities can be recorded either
through the mobile application or via the “Journal” pages. Finally, profile and settings
pages allow the patient to change personal preferences.
4 “Fast Healthcare Interoperability Resources” (http://hl7.org/fhir) is a medical standard created by the
standardisation organisation HL7.
Further technical details about the Action Plan Engine are described in a public
report on the prototype architecture of POWER2DM [7].
Most theories and determinants explain behaviour, but do not describe how to change
behaviour. In POWER2DM, the interventions of the Action Plan Engine are based on
the Behaviour Change Techniques (BCTs) of Abraham and Michie [8]. They describe
interventions to change a person’s lifestyle behaviour. A BCT is an “observable,
replicable, and irreducible component of an intervention designed to alter or redirect
causal processes that regulate behaviour”.
The Action Plan Engine provides interventions as part of the periodic review and by
suggesting interventions at the end of barrier decision trees. These interventions are
stored in an intervention table, which is based on a dual approach supporting both a
psychological and a technical perspective. The starting point is the compliance with the
planned goals and activities. Depending on the degree of fulfilment, different types of
interventions/purposes and BCTs can be specified, e.g. positive reinforcement when a
goal or activity is completely achieved, or a question to detect a barrier when a goal or
activity is only partly achieved or not achieved at all.
Currently, the intervention table of the Action Plan Engine includes about 170
different interventions. These interventions can be of different types. They can be plain
text (e.g. positive reinforcement), they can refer to an external website (e.g. about a
detailed description of diabetes and coping with emotions), they can recommend an
exercise (e.g. an exercise for coping with low mood or energy problems) and they can
refer to a more detailed explanation in the POWER2DM information material (e.g. an
article about fear of needles).
The periodic review collects all scheduled activities within the review period, where each
scheduled activity is marked as completed whenever a corresponding observation is
present. Otherwise, the scheduled event remains marked as planned in the review. As a
result, the number of completed activities compared to the number of planned activities
denotes the compliance, or performance, in completing planned activities. However,
since activities are of a particular type (monitoring glucose, doing exercises, etc.), the
review computation is also performed for each activity type and, of course, for all
activities overall. Furthermore, any activity may point to one or more related goals.
Hence, the performance computation is additionally performed for each goal and on an
overall basis. As a result, the review shows several review categories, such as overall
performance, activity performance and goal performance. Besides the planned activities
and goals, the review takes additional recordings such as sleeping problems, mood or
stress into account. For these data, we use Likert scales [9] as the basis for evaluation in
the periodic review (sleeping problem intensity, mood level and stress level).
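A minimal sketch of this review computation, under assumed in-memory data structures (the actual engine reads planned activities and observations from the PDS):

```python
from collections import defaultdict

planned = [  # assumed shape: (activity_id, activity_type, related_goal_ids)
    ("a1", "glucose", ["g1"]), ("a2", "exercise", ["g2"]), ("a3", "glucose", ["g1"]),
]
observed = {"a1"}  # activity ids for which an observation exists

def performance(items):
    """Share of planned activities with a matching observation."""
    return sum(1 for aid, _, _ in items if aid in observed) / len(items)

by_type, by_goal = defaultdict(list), defaultdict(list)
for item in planned:
    by_type[item[1]].append(item)
    for goal in item[2]:
        by_goal[goal].append(item)

print("overall:", performance(planned))
print("per type:", {t: performance(v) for t, v in by_type.items()})
print("per goal:", {g: performance(v) for g, v in by_goal.items()})
```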
From the technical point of view, the degree of the patient’s compliance in planning
activities and successfully completing them is the basis for interventions. Interventions
in this context are motivational or informative messages shown to the patient to foster
behaviour change. To support the selection of meaningful interventions, the selection is
based on several rules, such as the review category (e.g. activity or goal) and a
performance rule that is compared with the review result, so that only helpful messages
are shown to the patient.
For the review, the accurate evaluation of the performance against the intervention
table is a crucial task. For this, a performance rule consists of an expression constant,
a comparator and a target value. The expression constant points to the computed
review performance result (e.g. the performance of glucose monitoring activities) or to
the recorded problem or stress intensity, respectively. Since the review period covers
several days or even weeks, appropriate expressions for selecting the lowest, the highest
or the average intensity value are available. The comparator expression allows the
comparison of the resolved review value with the given target value.
The performance itself is expressed as the percentage of completed tasks relative to
the number of planned tasks. The resulting percentage is transformed into a
corresponding 4-step Likert scale outlining the degree of compliance. The values for
problem intensity and stress levels are aligned to a Likert scale as well; thus, specifying
the performance criteria for all kinds of interventions is simple and straightforward.
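A small illustrative sketch of such a rule evaluation; the field names and the exact Likert mapping are assumptions, not the project's actual schema:

```python
import operator

COMPARATORS = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
               ">=": operator.ge, ">": operator.gt}

def to_likert(ratio):
    """Map a completion ratio (0..1) onto a 4-step compliance scale (1..4)."""
    return min(4, int(ratio * 4) + 1)

def rule_matches(rule, review):
    value = review[rule["expression"]]  # resolve the expression constant
    return COMPARATORS[rule["comparator"]](value, rule["target"])

review = {"glucose_performance": to_likert(0.4)}  # 40% of planned tasks done
rule = {"expression": "glucose_performance", "comparator": "<=", "target": 2}
print(rule_matches(rule, review))  # True -> a barrier question is eligible
```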
Finally, when computing the review and selecting the proper interventions, eligible messages are identified by i) filtering for the review category and ii) applying the performance rule. Whenever all rules evaluate to true, the intervention is eligible to be shown to the patient. This ensures that the patient gets appropriate feedback based on his or her achievements.
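A sketch of this two-step selection, assuming a simple rule format with an expression constant, a comparator and a target value as described above:

```python
import operator

# comparator expressions resolved to Python operators (illustrative naming)
COMPARATORS = {"lt": operator.lt, "le": operator.le, "eq": operator.eq,
               "ge": operator.ge, "gt": operator.gt}

def eligible_interventions(interventions, review_values):
    """Select interventions whose category matches and whose rules all hold.

    `review_values` maps expression constants (e.g. 'activity.glucose')
    to the computed review results on the 4-step scale; the intervention
    dict layout is an assumption for this sketch.
    """
    eligible = []
    for intervention in interventions:
        value = review_values.get(intervention["category"])
        if value is None:
            continue  # i) category filter: no matching review result
        # ii) performance rules: all must evaluate to true
        if all(COMPARATORS[rule["cmp"]](value, rule["target"])
               for rule in intervention["rules"]):
            eligible.append(intervention)
    return eligible
```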
Decision Trees are essentially a special kind of intervention. They can be triggered by other interventions, by observations or by user interaction, based on defined rules. In contrast to other interventions, they incorporate direct user feedback.
Since the rules and workflows of the Decision Trees are highly based on expert knowledge and defined by practitioners, a dynamic "workflow tool" has been developed using vis.js (http://visjs.org/), a dynamic, browser-based visualization library, and Node-RED (https://nodered.org/), a flow-based programming tool for the Internet of Things. It allows non-technicians to define workflows, which can subsequently be exported to a structured JSON format for integration into the Action Plan Engine.
Figure 4 shows an example workflow that covers all definable node and edge types.
Nodes can be i) comments (grey) – for improving collaboration, ii) questions (dark blue)
– asked to the user for direct feedback, iii) content (purple) – redirects to new pages or
static content, iv) conditions (light blue) – for checking existing observations, v) triggers
(orange) – for manually or periodically executing the workflow and vi) actions (red) –
for activating existing triggers. Edges can be i) answers (light blue) – for proceeding
based on user input and ii) forwards (grey) – for directly linking nodes.
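To give an impression only, an exported workflow might be serialised along the following lines; the node and edge kinds mirror the types listed above, while the concrete schema and field names are hypothetical:

```python
# Hypothetical serialisation of a small Decision Tree workflow; the real
# POWER2DM JSON schema is not reproduced here.
workflow = {
    "id": "glucose-monitoring-barriers",
    "nodes": [
        {"id": "t1", "kind": "trigger", "label": "weekly review"},
        {"id": "q1", "kind": "question",
         "label": "Why do you monitor less often than planned?"},
        {"id": "c1", "kind": "content",
         "label": "Overcoming needle phobia", "target": "/info/needle-phobia"},
    ],
    "edges": [
        {"from": "t1", "to": "q1", "kind": "forward"},
        {"from": "q1", "to": "c1", "kind": "answer",
         "label": "I dislike needles"},
    ],
}
```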
The exported JSON file is used for creating more complex user interface elements or dialog-based select inputs that are dynamically generated on the front-end. Figure 5 shows an example dialog about glucose monitoring. The user is gradually guided through the previously defined questions and answers, while conditions are checked and content is loaded in the background. In this example, the user states that they monitor too little because they dislike needles, and they are provided with a link to an information page on how to overcome needle phobia.
In the end, each Decision Tree can result in an intervention or trigger another Decision Tree, allowing for complex and comprehensive rules and support by the use of easily definable components. In the POWER2DM project, we specified five decision trees for coping with barriers regarding glucose monitoring, exercise, carbohydrates, insulin and stress, consisting of up to 30 sub-trees.
4. Conclusions
In this paper, we presented the approach of the Action Plan Engine, developed as a component of the POWER2DM project. An early prototype of POWER2DM including the Action Plan Engine is currently being implemented and evaluated in a randomised trial with a 9-month follow-up involving 230 patients (115 with type 1 diabetes, 115 with type 2 diabetes) in pilot applications in the Netherlands and in Spain. The trial aims at evaluating the acceptance and effectiveness of the presented interventions as well as HbA1c levels in comparison to a reference group. Although continuous feedback from physicians, psychologists and patients helps to improve the prototype, it is currently too early to present reliable evaluation results. So far, patients are enthusiastic about the idea of a holistic self-management support system, receiving support and feedback on physiological, behavioural and psychological parameters.
The Action Plan Engine is currently part of the POWER2DM system, integrating with the FHIR-based storage and a provided single-sign-on solution. However, due to its modular design, the Action Plan Engine may be used in many environments and can easily be transferred to other eHealth projects focussing on different diseases. Connection to and integration with third-party health systems is also possible due to the use of standardised interfaces. This also allows for integrating more natural input methods, such as voice recognition, for some exercises (e.g. decision trees).
1. Introduction
Health care nowadays is unthinkable without the use of modern information and
communication technologies. In Austria, nurses and other health care professionals
routinely use IT-based tools such as electronic medical records, computerized physician
order entry systems, patient data management systems or mobile documentation tools in
their daily work. In the future, health information exchange between institutions will also have an increasing impact on nursing [1]. Managing these complex socio-technical information systems and the increasing volume of patient information is not a trivial task [2]. Studies show challenges related to the introduction and use of IT systems in nursing, such as inefficient workflow support [3], low usability [4] and limited evidence on the impact of nursing IT on quality of care and patient outcomes [5].
These problems at least partly originate in the fact that nurses and other health care professionals have insufficient competencies in dealing with these challenges; they seem to have difficulties, for example, to express their requirements, to contribute to system implementation and system testing, to prepare for IT-based workflow changes, and to establish an adequate change management and communication policy.
1 Corresponding Author: Elske Ammenwerth, Institute of Medical Informatics, UMIT – University for Health Sciences, Medical Informatics and Technology, EWZ 1, 6060 Hall in Tirol, Austria, E-Mail: [email protected]
Often, clinical IT projects in hospitals are therefore coordinated by other professional groups, such as health informaticians, who are adequately trained for this type of project. However, such projects have a higher chance to succeed with close end-user involvement. Clinical IT projects therefore seem hardly manageable without close cooperation between IT staff (both in the IT departments of the health care organizations and on the vendor side) and clinical staff (such as nurses and other health care professionals).
When we look at the situation in Austria, nurses are mostly not well equipped to
contribute to system analysis, system specification, system selection, system
implementation, and system evaluation, as informatics competencies are quite limited
among Austrian nurses. Nursing informatics competencies are typically not part of
nursing education, nor do adequate continuous education opportunities exist. We see the
same situation in Germany and Switzerland [6].
In other countries, nursing informatics seems much better integrated into nursing education and continuous education. For example, countries such as Australia [7] offer a dedicated career path for nursing informatics. This is not the case in Austria. However, in Austria too, more and more nurses contribute to IT projects and need additional IT-related competencies.
We were thus interested in better understanding whether nurses with IT responsibilities and nursing managers themselves see a need for further education in nursing informatics, and if so, in which topics. The objective of this study is thus to analyze the need for continuous education in health informatics among nurses in Austria. Our motivation was to use this information to design a tailored continuous education program in health informatics for nurses and other health care professionals.
2. Methods
Chief nursing managers of five of the largest Austrian health care organizations (AUVA,
GESPAG, SALK, KAGES, Tirol Kliniken) were contacted. All agreed to participate. A
survey was prepared, covering 5 major topics and 52 sub-topics of nursing informatics.
The list of topics and sub-topics was developed based on a literature review of
international recommendations on health informatics education of several institutions,
including: Australian Health Informatics Education Council [7], Global Academic
Curricula Competencies for Health Information Professionals [8], Technology
Informatics Guiding Education Reform [9], Canadian Association of Schools of Nursing
[10] and International Medical Informatics Association [11].
For each topic and sub-topic, we asked the following question: "Would you be interested in continuous education for nursing staff involved in IT projects?" For the five major topics, a yes/no answer was possible. For the 52 sub-topics, a 4-point scale was used to document the answer (1 = interesting, 4 = not interesting; see the caption of Figure 2).
The survey was distributed in all five participating health care institutions using a snowball system. The survey was conducted online; only in one health care institution were paper-based questionnaires used. Nursing managers on different hierarchical levels as well as nursing practitioners involved in IT projects were invited to participate. Participation was voluntary and fully anonymous. Survey results were analyzed using SPSS.
3. Results
Overall, 330 questionnaires were returned, with responses coming from nurses in a broad
range of professional positions. First results of this broader survey have already been
published [12]. For this paper, we focus on a more detailed analysis of the responses of nurses with additional IT responsibilities and of nurses in middle or higher management roles. These roles were chosen because the planned continuous education program would target nurses with IT responsibilities, so we were interested in their opinion, and because middle and top management have to approve such education, so we were also interested in their judgment. Overall, 280 respondents came from these groups and were thus included in the analysis. Table 1 shows the participants and their professional positions.
For all five main topics, the overall answers showed high interest in continuous education: IT in nursing (overall interest: 92% of all responses); IT project management (85%); eHealth technologies (85%); nursing terminologies (83%); computer science basics (81%).
Figure 1 shows the answers regarding the five main topics for each surveyed professional role. All three groups show comparable support for all five topics. Nurses with additional IT responsibilities show the least interest in nursing terminologies and computer science basics. Middle management shows the least interest in computer science basics and eHealth technologies. Top management shows the least interest in computer science basics and IT project management.
Figure 1. Interest in continuous education for five main topics in health informatics, dependent on the
professional role of survey participants (n = 280). To highlight the trends, the category “no answer” is not
presented. Only yes/no answers were possible for these five major topics.
Figure 2. Interest in continuous education for 52 sub-topics in health informatics, dependent on the
professional role of survey participants (n = 280). To highlight the trends, the category “no answer” is not
presented. Answers of 1 (interesting) and 2 (partly interesting) are combined and presented in green; answers
3 (partly uninteresting) and 4 (uninteresting) are combined and presented in red.
Figure 2 shows the answers of the 280 respondents for the 52 sub-topics. To allow better identification of the most interesting topics, answers were classified as "interesting" versus "not interesting", and "no answer" responses were omitted. The answers were comparable between the five participating health care institutions, thus a sub-group analysis is not presented.
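The analysis itself was done in SPSS; purely as an illustration, the dichotomisation used for Figure 2 could be reproduced in a few lines of pandas, where the file and column names are assumptions:

```python
import pandas as pd

# One row per respondent, one column rated 1..4 per sub-topic;
# the file and column names are hypothetical.
df = pd.read_csv("survey_subtopics.csv")
ratings = df.filter(like="subtopic_")

# 1-2 -> "interesting", 3-4 -> "not interesting"; NaN ("no answer") dropped
share_interested = (ratings <= 2).sum() / ratings.notna().sum() * 100
print(share_interested.sort_values(ascending=False).head(10))
```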
Results show some differences between the professional roles. For example, interest in nursing terminologies was 80% for nurses with IT duties, 92% for nurses in middle management and 100% for nurses in higher management. In general, as Figure 2 shows, top management considered most of the topics to be of higher importance than middle management did. Nurses with IT responsibilities showed greater interest in most topics than middle management, but somewhat less than top management.
Several sub-topics reached the highest interest among all participants. The following sub-topics had the highest support over all three groups, ordered by topic:
• IT project management: standardization and optimization of nursing workflows; process and change management in nursing; usability of IT systems.
• IT use in nursing: electronic patient records; electronic nursing documentation systems; electronic medication systems.
• eHealth technologies: importance of eHealth for nursing; importance of electronic health records; mobile IT tools in nursing.
• Nursing terminologies: legal basis for electronic patient records; development of nursing documentation systems; ensuring the quality of nursing documentation.
• Computer science basics: creating and using small databases; legal requirements for data privacy and security.
4. Discussion
Our survey showed a clear interest in continuous education in the indicated topics and sub-topics. We included only nursing practitioners with IT responsibilities as well as middle and top nursing managers in the analysis. We focused on these groups as we see them as the target audience and important stakeholders for a planned continuous education program in health informatics. A survey of nursing practitioners without IT responsibilities might well have yielded different results.
All groups showed high interest in most of the presented topics. Interest correlated with the professional position: top nursing managers mostly showed a stronger interest in most topics than middle nursing management. This may reflect a better understanding of the strategic benefits and challenges of eHealth technologies in nursing, as responding to these challenges demands a well-trained nursing workforce.
Nursing practitioners with IT responsibilities also showed high interest in most sub-topics. Their preferences, however, differed partly from those of top nursing managers. For example, while top management showed interest in project management and IT specifications, nursing practitioners showed much lower interest in these topics. In turn, nursing practitioners showed large interest in interfaces (such as HL7), which was not a topic of interest for top managers. This may reflect the different – operational versus strategic – perspectives of these groups.
Table 2. Some opportunities for part-time continuous education in nursing informatics in Austria.
• University for Health Sciences, Medical Informatics and Technology (UMIT): 3-day short introductory course "Applied Nursing Informatics". Offered since 2012; addresses nurses with an interest in IT. Source: www.umit.at/pflegeinformatik
• University for Health Sciences, Medical Informatics and Technology (UMIT): 2.5-year part-time master program "Health Information Management". Online-based program; targets nurses and other health care professionals with a bachelor degree, as well as graduates from technical studies. Source: www.umit.at/him
• University of Applied Sciences St. Pölten: 2-year part-time master program "Digital Health". Addresses "health experts", including nurses with a bachelor degree, as well as graduates from technical studies. Source: https://www.fhstp.ac.at/de/studium-weiterbildung/medien-digitale-technologien/digital-healthcare
• FH Joanneum Graz: 2-year part-time master program "eHealth". Addresses, among others, medical-technical or management graduates (with IT knowledge) and graduates from technical studies. Source: https://www.fh-joanneum.at/ehealth/master
5. Conclusion
Austrian nursing practitioners and nursing managers show a strong interest in continuous education in health informatics. This supports the findings of other international surveys. There is, however, a lack of suitable opportunities for continuous education in Austria.
The results of the survey have been used to design a new master program in Health Information Management at our university [16]. This master program is fully online and thus especially suited for the continuous education of health care professionals. We will carefully monitor the participants and their professional background in the future to determine whether this educational offer is accepted among nurses and other health care professionals.
References
[6] […] nursing informatics core competency areas in Austria, Germany, and Switzerland, Inform. Health Soc. Care. Aug (2018) 1–25. doi:10.1080/17538157.2018.1497635.
[7] AHIEC, Health Informatics - Scope, Careers and Competencies V1.9 (2011).
http://www.ahiec.org.au/docs/AHIEC_HI_Scope_Careers_and_Competencies_V1-9.pdf.
[8] Global Health Workforce Council, Global Academic Curricula Competencies for Health Information
Professionals (2015). http://www.ahima.org/about/~/media/AHIMA/Files/AHIMA-and-Our-
Work/AHIMA-GlobalCurricula_Final_6-30-15.ashx?la=en.
[9] TIGER, The TIGER Initiative - Technology Informatics Guiding Education Reform (2015).
http://thetigerinitiative.org.
[10] CASN, Nursing Informatics - Entry-to-Practice Competencies for Registered Nurses (2013).
http://www.casn.ca/2014/12/nursing-informatics-entry-practice-competencies-registered-nurses-2.
[11] J. Mantas, E. Ammenwerth, G. Demiris, A. Hasman, R. Haux, W. Hersh, E. Hovenga, K.C. Lun, H.
Marin, F. Martin-Sanchez, and G. Wright, Recommendations of the international medical
informatics association (IMIA) on education in biomedical and health informatics, Methods Inf. Med.
49 (2010). doi:10.3414/ME5119.
[12] W.O. Hackl, E. Ammenwerth, and R. Ranegger, Bedarf an Fort- und Weiterbildung in
Pflegeinformatik – Ergebnisse einer Umfrage, Zeitschrift Für Pflegewiss. (2016) 381–387.
doi:10.3936/1354.
[13] E. Hovenga, H. Sinnott, and J. Gogler, Operationalising the National Nursing Informatics Position
Statement, Stud Heal. Technol Inf. 250 (2018) 221–3.
[14] E. Shin, E. Cummings, and K. Ford, A qualitative study of new graduates’ readiness to use nursing
informatics in acute care settings: clinical nurse educators’ perspectives, Contemp Nurse. 51 (2018)
64–76. doi:10.1080/10376178.2017.1393317.
[15] S. Remus, and M. Kennedy, Innovation in transformative nursing leadership: nursing informatics
competencies and roles, Nurs Leadersh. 25 (2012) 14–26.
[16] E. Ammenwerth, W.O. Hackl, M. Felderer, and A. Hörbst, Developing and evaluating collaborative
online-based instructional designs in health information management, Stud Heal. Technol Inf. 243
(2017) 8–12.
1. Introduction
The health and wellbeing of visually impaired and blind people can be affected by their ability to navigate around the world [1]. The human cognitive system incorporates several aspects, one of which is spatial navigation. The layout and arrangement of the environment is encoded and mapped [2-5], and the cognitive processes underlying this mapping have attracted considerable research [6-10]. Mental models reflect the spatial layout and locations of objects in the world [11-12], and it is possible to generate these models by giving individuals physical maps of unknown environments, which act as pre-navigational tools.
Whilst people with visual impairments can navigate familiar places well, they can have difficulty in new environments [13]. The processes by which both sighted and visually impaired (including blind) people acquire and process mental imagery have been found to be similar [14], with some evidence to suggest that blind people compensate by having better tactile acuity [14-15], voice memory [16] and auditory localization [17].
1 Corresponding Author: Mark Scase, Division of Psychology, De Montfort University, The Gateway, Leicester, LE1 9BH, UK, E-Mail: [email protected]
Assistive technology can be used to enhance the wellbeing of blind and visually impaired people [18, 1]. Tactile maps, haptic navigation and global positioning systems can promote cognitive mapping and help blind and visually impaired individuals navigate [19-21]. Pre-navigational aids incorporating tactile components such as Braille can provide information for a map [22] and have been well received by visually impaired users [23]. Furthermore, representations of objects or features can come in tactile form, either via embossed paper or via devices with pins and haptic feedback [24] producing vibration when an area of interest is touched. The disadvantage of tactile maps including Braille is that tactile print requires more space than conventional text. Therefore, the amount of haptic information that can be presented for a certain map size is less than for a printed map.
An alternative to Braille is a tactile map with audio feedback or audio description [25]. A system with audio-tactile interaction can improve blind user satisfaction [26] and can help with non-visual navigation [27]. Combining tactile and audio stimuli can produce a flexible learning system. Individuals with visual impairments prefer navigation via route-like descriptions [28], particularly if they have the flexibility to construct their own route from a personal mental model of the cognitive map.
A paper tactile map was combined with a tablet computer by O'Sullivan et al. [22] to produce a prototype audio tactile map (ATM). The tactile map was made by printing onto swell paper, which was then heated to create ridges corresponding to the map. This paper was then overlain onto the tablet. When users touched the paper, the tablet detected the presses and could give audio feedback. Through a user-centered design process with visually impaired people, this system was able to give a multimodal experience and act as a pre-navigational tool for individuals with visual impairments. This paper is an extension of the research originally conducted by O'Sullivan et al., with the aim to assess the effectiveness of this ATM amongst visually impaired individuals.
The aim of the current study was to compare an ATM prototype providing a flexible learning environment with a conventional tactile map accompanied by a route-based audio description. Learning effectiveness was assessed by mixed methods, both quantitatively and qualitatively through interviews. Since ATMs are multimodal in nature, allowing for multiple learning styles, it was predicted that participants using an ATM would acquire and retain knowledge of a map better than those using a tactile map accompanied by an audio description.
2. Methods
2.1. Participants
Fourteen volunteers (eight male, six female) with congenital (n=6) or acquired (n=6) visual impairments (two undisclosed) took part in the study. All participants reported visual impairments ranging from mild (i.e. low vision assisted by lenses, but not entirely corrected) to complete blindness (i.e. no perception of light). Ages ranged from 30 to 65 years (M=48.8; SD=14.4). Participants were randomly assigned to either an experimental group (Condition 1) or a control group (Condition 2) using a matched pairs approach. Matching was based on the severity of participants' visual impairment (e.g. mild, moderate, severe, blind).
2.3. Materials
A fictitious map depicting a health club (Figure 1) was used for both conditions because of its distinctive rooms (e.g. Swimming Pool Room, Gym, Café, Sauna), each with a unique atmosphere, sounds and sensory stimuli. This map was incorporated into an ATM (Condition 1) and a conventional tactile map with an accompanying description (Condition 2). Both maps provided the same detail and information.
In the ATM condition, a tactile map was printed onto swell paper on which the internal and external walls and door spaces were embossed. The paper had a QR code printed in one corner which, when scanned by the tablet computer camera, loaded the map data into the tablet. This paper was attached to the tablet screen (9.5” x 7.3”), and the system included sound, audio description and acoustic-click feedback to reflect room size and acoustics. Interaction with the ATM could occur in three ways: i) moving the finger inside a room activated its corresponding background noises (bold text; see Figure 1); ii) tapping twice inside a room activated the playback of text-to-speech auditory information about the room (italic text; see Figure 1); iii) tapping three times inside a room activated the acoustic-click feedback. This option simulated how a finger-clicking noise would have sounded in the room. Audio production software was used to simulate the appropriate echo feedback for the room size (i.e. larger space = longer delay). Reverberation was applied to the sound based on room size and the materials typically present within the room. For example, the large swimming pool room generated an echo with a longer delay than the smaller sauna area. The pool room also had a higher level of reverberation than the other rooms, to simulate the effects of hard surfaces typically found in a swimming pool area. Conversely, the small shop area had a comparatively short delay and less reverberation due to containing items for sale.
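Conceptually, the interaction logic can be pictured as a small dispatcher keyed on the tap count, with the echo delay scaled to room size. The following is a sketch under assumed parameters and interfaces, not the authors' implementation:

```python
def on_room_touch(room, tap_count, player):
    """Dispatch ATM audio feedback for a touched room (illustrative only).

    `room` is assumed to expose its area and prerecorded assets; `player`
    wraps the tablet's audio output. Neither mirrors the actual prototype.
    """
    if tap_count == 1:    # finger inside a room: ambient background sound
        player.loop(room.ambient_sound)
    elif tap_count == 2:  # double tap: text-to-speech room description
        player.speak(room.description_text)
    elif tap_count == 3:  # triple tap: finger-click echo scaled to room size
        delay_ms = 20 + 4 * room.area_m2  # assumed scaling: larger = longer
        player.play_click(echo_delay_ms=delay_ms, reverb=room.reverb_level)
```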
The second condition, a tactile map with a verbal description, involved five minutes' exposure to a non-interactive paper tactile map of the same fictional health club. A verbal description of a journey through the map included a description of the shape and size of each room, the background sounds and the objects within the room (e.g. 'You take the door on your left and enter a large 10 x 8 metre room containing a rectangular swimming pool and a walkway running around its edge... You leave this room by the door in which you entered it'). The sequential journey took participants through all of the rooms. Participants were required to start at the elevator and trace their journey using the tactile map.
Figure 1. A representation of the map developed for this experiment including description of rooms and
sound effects.
Learning was assessed with 20 multiple choice questions. A number of questions examined orientation (e.g. 'You are in the swimming pool room facing East. In what direction is the Bar?' [In front, Behind, To your left, To your right]). Four questions examined map memory by asking participants about the fewest number of doors they would need to travel through to get from one room to another (e.g. 'You are in the Changing Facility and you want to get to the Sauna. What is the smallest number of doors that you would need to travel through?' [1, 2, 3, 4]). Four questions examined room size by asking participants to identify the larger or smaller of two rooms (e.g. 'Which room is the largest room?' [Gym or Jacuzzi Room]).
Phase 1: After five minutes' exposure to one of the map conditions, participants completed the 20 multiple choice questions, and scores were compared between the conditions. As the questions had the potential to contribute to participants' learning of the map, they were asked in the same order for all participants. After completing the questions and undergoing a de-briefing, participants were invited to explore the ATM prototype prior to Phase 2 of the study.
Phase 2: After completing Phase 1 of the study and spending time with the ATM prototype, participants were asked a series of questions on their experience of and views on the ATM. Data were analyzed thematically using node and tree-node functions.
The study was approved by the Faculty of Technology Research Ethics Committee,
De Montfort University, Leicester, UK (reference: 1415/297, chair Bernd Stahl).
3. Results
A Mann-Whitney U-test showed that the overall score on the 20 multiple-choice questions was significantly higher for Condition 1 (Md=15, n=7) than for Condition 2 (Md=13, n=7), U=11.50, z=-1.68, p=.042 (one-tailed), r=.45, indicating a medium to large effect size using Cohen's (1988) criteria (i.e. .3=medium, .5=large).
Figure 2. A Box plot showing the median overall scores on the multiple choice test for both conditions.
Therefore, participants performed better with the ATM than with a conventional
tactile map and verbal description.
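For reference, the same one-tailed Mann-Whitney comparison could be run with SciPy on the raw scores. The score vectors below are placeholders, since the per-participant scores are not reported in the paper:

```python
from scipy.stats import mannwhitneyu

# Placeholder score vectors for the two groups of seven participants;
# the actual per-participant scores are not reported in the paper.
atm_scores = [15, 16, 14, 15, 17, 13, 15]      # Condition 1 (ATM)
tactile_scores = [13, 12, 14, 13, 11, 13, 14]  # Condition 2 (tactile map)

u_stat, p_value = mannwhitneyu(atm_scores, tactile_scores,
                               alternative="greater")
print(f"U = {u_stat:.2f}, one-tailed p = {p_value:.3f}")
```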
The experience and views of participants regarding the ATM were explored in the second phase. Nearly all participants used language suggesting that the ATM would be useful and beneficial to them, with three themes emerging.
4. Discussion
Visually impaired and blind individuals can experience challenges when navigating unfamiliar environments [13], and tactile maps can help [23], particularly with verbal or audio feedback [25-26]. This study compared an ATM to a more conventional, verbally annotated tactile map [25] with a sequential journey format [28]. People using the ATM had a statistically significantly higher overall score on the assessment of their recollection and cognitive mapping of the fictitious environment. This result suggests that the ATM was a more effective system for spatial recall than the tactile map accompanied by a verbal description. A journey approach to learning an environment is more effective than survey-based approaches among visually impaired individuals [28]. Conventional tactile maps with an audio description (like Condition 2) offer such a method, but learning is linear and in a fixed sequence. The ATM system, however, offered participants a flexible way of learning an environment, allowing for both journey and survey strategies, and participants appeared to use both strategies when learning the environment. The map in Condition 2 contained the same information as the ATM condition, with the verbal description of the rooms instead of sound effects. The ATM condition thus provided a more multimodal approach to learning. Blind individuals can have better memory of some auditory information [16-17], and so the ATM might have contributed to the production of a more detailed cognitive map.
Qualitative feedback included recommendations for improving the system, which could be incorporated into further development. The multiple approaches to learning an environment offered by the ATM accommodated the diverse learning needs of individuals. The flexibility to learn an environment appeared important and might have been a factor in the better recall observed in Condition 1. Furthermore, the enjoyable aspects of the ATM may have increased motivation and engagement among users.
An ATM system allowing multimodal learning from both survey and route perspectives yields superior performance in the encoding and retrieval components of cognitive mapping, suggesting that this system is an effective pre-navigation tool for individuals with visual impairments. The provision of assistive technology has enabled people with disabilities to be less challenged by their environment. Mobility and navigation aids can improve the wellbeing of visually impaired people [33], and the flexible learning approach of the ATM may be valuable for future assistive technology development. The use of a QR code printed on the swell paper, linking to mapping data for the tablet, increases the flexibility and sustainability of this ATM. This project was linked with a local organization, Vista Blind (www.vistablind.org.uk), from which some participants were recruited. Dissemination of this ATM technique will initially be promoted locally and could lead to further enhancement of the ATM.
References
[1] R. Hewett, G. Douglas, S. Keil, Wellbeing of Young People with Visual Impairments. Visual Impairment
Centre for Teaching and Research, University of Birmingham, Birmingham 2015.
[2] S.M. Kosslyn, Image and Mind, Harvard University Press, Cambridge Massachusetts, 1980.
[3] B. Tversky, Spatial Mental Models, in: The Psychology of Learning and Motivation: Advances in
Research and Theory. Academic Press Inc, San Diego, 1991. pp. 109-145.
[4] B. Tversky, Distortions in memory for maps, Cognitive Psychology 13(3) (1991), 407-433.
[5] C. Campus, L. Brayda, F. De Carli, R. Chellali, F. Famà, C. Bruzzo, L. Lucagrossi, G. Rodriguez, Tactile exploration of virtual objects for blind and sighted people: the role of beta 1 EEG band in sensory substitution and supramodal mental mapping, Journal of Neurophysiology 107(10) (2012), 2713-2729.
[6] R. Kupers, D.R. Chebat, K.H. Madsen, O.B. Paulson, M. Ptito, Neural correlates of virtual route
recognition in congenital blindness. PNAS 107(28) (2010), 12716-12721.
[7] E.C. Tolman, Cognitive maps in rats and men, The Psychological Review 55(4) (1948), 189-208.
[8] C. Eden, Cognitive mapping, European Journal of Operational Research 36(1) (1988), 1-13.
[9] J. O’Keefe, L. Nadel, The Hippocampus as a Cognitive Map, Oxford University Press, Oxford, UK, 1978.
[10] N.J. Cohen, H. Eichenbaum, The theory that wouldn't die: A critical look at the spatial mapping theory
of hippocampal function, Hippocampus 1(3) (1991) 265-268.
[11] P.N. Johnson-Laird, Mental Models, in: Foundations of Cognitive Science. MIT Press, Cambridge,
Massachusetts, USA, 1989. pp. 467-499.
[12] Y. Bestgen, V. Dupont, The construction of spatial situation models during reading, Psychological
Research 67(3) (2003), 209-218.
[13] J.R. Marston, R.G. Golledge, The hidden demand for participation in activities and travel by persons
who are visually impaired, Journal of Visual Impairment & Blindness 97(8) (2003), 475-488.
[14] Z. Cattaneo, T. Vecchi, C. Cornolodi, I. Mammarella, D. Bonino, E. Ricciardi, P. Pietrini, Imagery and
spatial processes in blindness and visual impairment, Neuroscience and Biobehavioral Reviews 32(8)
(2008), 1346-1360.
[15] D. Goldreich, I.M. Kanics, Performance of blind and sighted humans on a tactile grating detection task,
Perception & Psychophysics 68(8) (2006), 1363-1371.
[16] B. Röder, H.J. Neville, Developmental functional plasticity, in: Handbook of Neuropsychology:
Plasticity and Rehabilitation. Elsevier Science, Amsterdam, 2003. pp. 231-270.
[17] B. Röder, W. Teder-Sälejärvi, A. Sterr, F. Rösler, S.A. Hillyard, H.J. Neville, Improved auditory spatial
tuning in blind humans, Nature 400 (1999), 162-166.
[18] M.A. Hersh, M.A. Johnson, Assistive Technology for Visually Impaired and Blind People, Springer-
Verlag London Ltd, London, 2008.
[19] U.R. Roentgen, G.J. Gelderblom, M. Soede, L.P. de Witte, Inventory of electronic mobility aids for
persons with visual impairments: a literature review, Journal of Visual Impairment & Blindness
102(11) (2008), 702-724.
[20] S. Ertan, C. Lee, A. Willets, H. Tan, A. Pentland, A wearable haptic navigational guidance system,
Digest of the 2nd International Symposium on Wearable Computers (1998), 164-165.
[21] R.D. Jacobson, R.M. Kitchin, GIS and people with visual impairments or blindness: Exploring the
potential for education, orientation, and navigation, Transactions in GIS, 2(4) (1997), 315-332.
[22] L. O’Sullivan, L. Picinali, A. Gerino, D. Cawthorne, A prototype audio-tactile map system with an
advanced auditory display, International Journal of Mobile Human Computer Interaction 7(4) (2015),
53-75.
[23] L. Zeng, G. Weber, ATMap: Annotated tactile maps for the visually impaired, in: Cognitive
Behavioural Systems. Lecture Notes in Computer Science, vol 7403. Springer Berlin, 2012. pp. 290-
298.
[24] C. Campus, L. Brayda, F. De Carli, R. Chellali, F. Famà, C. Bruzzo, L. Lucagrossi, G. Rodriguez,
Tactile exploration of virtual objects for blind and sighted people: the role of beta 1 EEG band in
sensory substitution and supramodal mental mapping, Journal of Neurophysiology 107 (2012), 2713-
2729.
[25] C. Graf, Verbally annotated tactile maps: Challenges and approaches, in: Spatial Cognition VII.
Lecture Notes in Computer Science, vol 6222. Springer Berlin, 2010. pp. 303-318.
[26] A. Brock, P. Truillet, B. Oriola, D. Picard, C. Jouffrais, Interactivity improves usability of geographic
maps for visually impaired people, Human-Computer Interaction 30(2) (2015), 156-194.
[27] M. Geronazzo, A. Bedin, L. Brayda, C. Campus, F. Avanzini, Interactive spatial sonification for non-
visual exploration of virtual maps, International Journal of Human-Computer Studies 85(C) (2016), 4-
15.
[28] M.L. Noordzij, S. Zuidhoek, A. Postma, The influence of visual experience on the ability to form
spatial mental models based on route and survey descriptions, Cognition 100(2) (2006) 321-342.
[29] J.W. Creswell, Research Design: Qualitative, Quantitative, and Mixed Methods Approaches, Sage, Thousand Oaks, California, 2013.
[30] C. Teddlie, A. Tashakkori, Foundations of Mixed Methods Research: Integrating Quantitative and
Qualitative Approaches in the Social and Behavioral Sciences, Sage, Thousand Oaks California, 2009.
[31] V. Braun, V. Clarke, Using thematic analysis in psychology, Qualitative Research in Psychology 3(1)
(2006), 77-101.
[32] V. Braun, V. Clarke, Teaching thematic analysis: Over-coming challenges and developing effective
strategies for effective learning, The Psychologist 26(2) (2013), 120-123.
[33] B. Andò, S. Baglio, V. Marletta, A. Valastro, A haptic solution to assist visually impaired in mobility tasks, IEEE Transactions on Human-Machine Systems 45(5) (2015), 641-646.
Abstract. In-patient care of the elderly is currently being put to the test in all developed industrial nations. The aim is to make resident-centered and nursing-related care more professional. In addition to organizational and interdisciplinary orientation, the use of socially assistive robot technologies and artificial intelligence is increasingly coming to the fore. By means of literature research, expert interviews and an online survey of Upper Austrian nursing home directors, current and future challenges for the use of socially assistive robots (SAR) in in-patient care for the elderly were identified and prioritized. It becomes clear that, from the point of view of nursing home management, the technological and application-oriented maturity of SAR as well as the modular adaptation of hybrid SAR services to existing structures and processes are in the foreground. In the future, it will be increasingly important to bring the process-related and technological support of human-machine interaction through SAR to a value-adding level.
1. Introduction
Nursing care for the elderly in Austria is, in addition to the informal care provided by relatives and mobile care for the elderly, characterized by in-patient care in retirement and nursing homes. In-patient care of the elderly includes long-term residency in a nursing home, where people's needs for care are met by specialist staff under constant supervision. In in-patient nursing homes, long-term accommodation is usually the case. The prerequisite for this is that out-patient care or other types of care can no longer adequately address a person's need for care. The range of services in the in-patient care of older people includes not only the provision of hotel services (for example, accommodation and meals) but also nursing, therapeutic and medical services.
1 Corresponding Author: Johannes Kriegel, University of Applied Sciences Upper Austria, Garnisonstraße 21, 4020 Linz, Austria, E-Mail: [email protected]
In order to improve the security of supply and the quality of care in the in-patient care of older people, it is necessary to improve the division of labor and the fragmented in-patient care for the elderly by means of more comprehensive and resident-centered provision of services. Such optimization is increasingly carried out by technologically supported solutions, which increasingly use socially assistive robot technologies and artificial intelligence. Socially assistive robots (SAR) are autonomously acting robots that interact and communicate with humans or other autonomous physical agents, following social behavior and defined rules tied to their roles and functions [3,4,5]. A key factor of influence and success is thereby the consideration of user and resident interests. In addition to the resident's perspective (e.g., wishes and possibilities of the residents), centering on residents also includes the service and process perspective (e.g., comprehensive and barrier-free design of the care processes) [6,7]. Important aspects in this regard are the interlinked design and provision of services in in-patient care across professions and specialties. Increasingly, supportive and digital technologies are used in the areas of nursing and therapy as well as in supporting services. What are the possible applications of socially assistive robots (SAR) to support and optimize the security of supply and the quality of in-patient care provided by nursing homes? Furthermore, it is necessary to identify the associated challenges regarding the use of SAR to support and optimize the security and quality of care in in-patient care provided by nursing homes for the elderly.
2. Methods
For this purpose, relevant national and international databases (e.g., ScienceDirect College Edition, Emerald Collections, PubMed, Cochrane Library, Thieme Connect, SpringerLink) were searched using targeted keywords or keyword combinations (e.g., residential geriatric care, socially assistive robots, use cases, service providers). The identified articles, studies and reports were reviewed and their contents interpreted in the context of the research question. Furthermore, the results of the literature search were incorporated into the development and design of the survey instruments used below (online questionnaire, expert interviews).
In order to identify possible influencing factors, requirements and applications for the use of socially assistive robots to support and optimize the security of supply and quality in in-patient care by nursing homes, the nursing home directors' perspectives were surveyed by means of an online survey. This focused on the current situation and the future applications of SAR-supported service provision in the in-patient care of older people. A distinction was made between function-related and resident-related use cases. For this purpose, a standardized online questionnaire was compiled based on the results of the literature search as well as six expert interviews (one nursing home director, two nurses, three graduate social workers for elderly work, one kitchen manager, one laundry assistant). Based on a pre-test (n=5), the number of application options and challenges to be selected and the formulation of the questions were adapted. The online survey took place from 12/18/2018 to 1/8/2019 using the Unipark survey tool [8]. To this end, 106 nursing home directors in Upper Austria were invited to participate by e-mail. The return was n=46, corresponding to a return rate of 43.4%.
3. Results
3.1. Factors influencing the use of SAR in in-patient care for the elderly
The use of socially assistive robot technologies in in-patient care for the elderly is confronted with a multitude of different influencing factors, ranging from legal regulations and access to utility services to the lack of mature services and technologies in the context of in-patient care for the elderly. In assessing them, the manifold challenges can be systematically classified into the external dimensions of environment and society, the caregiving system and technological developments, as well as the internal dimensions of organization and results, information and communication, and the caregiving professionals of in-patient care for the elderly (see Figure 1).
3.2. Optional use cases for SAR in in-patient care for the elderly
Socially assistive robotics aims to support in-patient care for the elderly through social interaction with human users (such as employees and residents) in order to add value to the care services and associated supportive processes. In addition to the technical, legal and economic aspects, it is also important to consider the psychological, social and ethical aspects.
Figure 1. Factors influencing the use of SAR in in-patient care for the elderly
According to the surveyed nursing home directors, the prioritized use cases for SAR in in-patient care for the elderly lie in the function-related support processes (e.g., transport of food, laundry, care supplies) as well as in the resident-related care processes (e.g., communication, entertainment, therapy support) (see Figure 2).
Figure 2. Prioritization of possible use cases for SAR in in-patient care for the elderly
3.3. Challenges for the use of SAR in in-patient care for the elderly
Given the intended benefits of robotic technologies in the in-patient care of older people with regard to time savings, focus on the core business, workload management, documentation and cost savings, it is important to establish the deployment of SAR conceptually and in an application-oriented way in the future. In addition to functioning and solution-oriented robotics technology, this also requires its incorporation into existing or future supply and support processes. Furthermore, it is necessary to provide the appropriate interfaces, standards and necessary infrastructures for embedding SAR in the complex supply system. Another key success factor is the acceptance of SAR on the part of employees, residents and relatives. Finally, a dedicated SAR services provider is required to enable and ensure hybrid SAR services in nursing homes.
From the point of view of the surveyed nursing home directors, the embedding of SAR technologies into existing infrastructures and service processes as well as the training and involvement of the employees and residents concerned are critical to success. The nursing home directors accordingly expect challenges in establishing SAR in the nursing home, especially with regard to the integration and maintenance of existing and required software and information technologies. In addition to the different software requirements, the interface management between the different software programs will be a particular challenge. Beyond the technological realization, according to the nursing home directors, SAR solutions must be adapted to the respective structural and process-related circumstances and preferences. The nursing home directors consider the publicly debated threats of data abuse, staff reduction and surveillance by SAR and artificial intelligence to be minor challenges.
Figure 3. Challenges associated with the use of SAR in in-patient care of the elderly
4. Discussion
4.1. Required business model for modular SAR services in the nursing home
In conjunction with the development of new SAR services and innovative value-added services, customer- and solution-oriented bundles of hardware, software and service elements have to be combined and integrated to form independent, customer-specific business solutions [40]. It is also important to involve an active service provider in the development and provision of SAR services. Without a service provider, there will be no SAR services in nursing homes! For this, the development of a SAR business model is recommended, which illustrates the core structure, the internal and external cooperations as well as the financial requirements of the organization [41]. Furthermore, the business model represents the current and future core products or services that the organization offers or wants to offer as well as the associated objectives. In the context of experimental research and SAR services development, the possible and identified use cases have to be considered in concrete terms along the twelve relevant dimensions of a SAR services business model (customer segments; customer relationships; communication and distribution channels; revenue streams; value propositions; emotions; key activities; key resources; key partnerships; cost structure; ethics; legal regulation) [42].
4.2. Functionality and added value for in-patient care of the elderly
The future use of socially assistive robots in tailored, comprehensive in-patient care for the elderly will be determined on the one hand by user requirements and on the other hand by the associated added value [43]. In the medium term, SAR services will play a major role in support and cooperative performance processes, not only in order to cope with the upcoming shortage of skilled workers, but also to foster qualitative and supportive human-machine interaction. At the same time, the delegation of non-professional, repetitive and stressful activities to SAR opens up the possibility of enhancing and expanding social human-human interactions as well as the nursing profession [44]. The use of SAR must result in a measurable benefit for the elderly and for health care. However, the development of this future scenario must actively involve caregiving professionals and residents with their corresponding requirements, possibilities and fears, and integrate the targeted SAR solutions into the actual care processes [45]. This requires appropriately aligned experimental research and development as well as targeted integration and project management.
References
[5] R. Bemelmans, GJ. Gelderblom, P. Jonker, L. de Witte, Socially assistive robots in elderly care: a
systematic review into effects and effectiveness. J Am Med Dir Assoc, 13 (2012), 114-120
[6] Y.H. Park, H.L. Bang, G.H. Kim, J.Y. Ha, Facilitators and barriers to self-management of nursing home
residents: perspectives of health-care professionals in Korean nursing homes. Clin Interv Aging, 10
(2015), 1617-1624
[7] T. Vandemeulebroucke, B.D. de Casterlé, C. Gastmans, How do older adults experience and perceive
socially assistive robots in aged care: a systematic review of qualitative evidence. Aging Ment Health,
22 (2018), 149-167
[8] QuestBack, Enterprise Feedback Suite EFS survey. QuestBack, Köln-Hürth, 2013
[9] M. Firgo, U. Famira-Mühlberger, Ausbau der stationären Pflege in den Bundesländern. WIFO, Wien,
2014
[10] G. Dewsbury, D. Dewsbury, Securing IT infrastructure in the care home. Nursing and Residential Care,
19 (2017), 672-674
[11] F. Kohlbacher, C. Herstatt, N. Levsen, Golden opportunities for silver innovation: How demographic
changes give rise to entrepreneurial opportunities to meet the needs of older people. Technovation, 39
(2015), 73-82
[12] G. Kojima, S. Iliffe, K. Walters, Frailty index as a predictor of mortality: a systematic review and meta-
analysis. Age and Ageing, 47 (2018), 193-200
[13] F. Hoffmann, H. Kaduszkiewicz, G. Glaeske, H. van den Bussche, D. Koller, Prevalence of dementia in
nursing home and community-dwelling older adults in Germany. Aging Clin Exp Res, 26 (2014), 555-
559
[14] E. Borowiak, J. Kostka, T. Kostka, Comparative analysis of the expected demands for nursing care
services among older people from urban, rural, and institutional environments. Clin Interv Aging, 10
(2015), 405-412
[15] U. Famira-Mühlberger, Die Bedeutung der 24-Stunden-Betreuung für die Altenbetreuung in Österreich.
WIFO, Wien, 2017
[16] S.C. Miller, J.M. Teno, V. Mor, Hospice and palliative care in nursing homes. Clin Geriatr Med, 20
(2004), 717-734
[17] T. Uhrhan, M. Schaefer, Drug supply and patient safety in long-term care facilities for the elderly.
Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz, 53 (2010), 451-459
[18] H. Cramer, H. Pohlabeln, M. Habermann, Factors causing or influencing nursing errors as perceived by
nurses: findings of a cross-sectional study in German nursing homes and hospitals. Journal of Public
Health, 21 (2013), 145-153
[19] P. Khosravi, A.H. Ghapanchi, Investigating the effectiveness of technologies applied to assist seniors:
A systematic literature review. International Journal of Medical Informatics, 85 (2016), 17-26
[20] J. Pineau, M. Montemerlo, M. Pollack, N. Roy, S. Thrun, Towards robotic assistants in nursing homes:
Challenges and results. Robotics and Autonomous Systems, 42 (2003), 271-281
[21] H.H. Tsai, Y.F. Tsai, H.H. Wang, Y.C. Chang, H.H. Chu, Videoconference program enhances social
support, loneliness, and depressive status of elderly nursing home residents. Aging Ment Health, 14
(2010), 947-954
[22] J.E. Morley, Telemedicine: Coming to Nursing Homes in the Near Future. J Am Med Dir Assoc, 17
(2016), 1-3
[23] J. van Hoof, A.M.C. Dooremalen, M.H. Wetzels, H.T.G. Weffers, Exploring Technological and
Architectural Solutions for Nursing Home Residents, Care Professionals and Technical Staff - Focus
Groups With Professional Stakeholders. International Journal for Innovative Research in Science &
Technology, 1 (2014), 90-105
[24] W.Y. Louie, D. McColl, G. Nejat, Acceptance and Attitudes Toward a Human-like Socially Assistive
Robot by Older Adults. Assist Technol, 26 (2014), 140-150
[25] E. Mariani, R. Chattat, M. Vernooij-Dassen, R. Koopmans, Y. Engels, Care Plan Improvement in
Nursing Homes: An Integrative Review. J Alzheimers Dis, 55 (2017), 1621-1638
[26] D.C. Grabowski, R.J. Town, Does Information Matter? Competition, Quality, and the Impact of
Nursing Home Report Cards. Health Service Research, 46 (2011), 1698-1719
[27] N.L. Crogan, B. Evans, B. Severtsen, J.A. Shultz, Improving nursing home food service: uncovering
the meaning of food through residents' stories. J Gerontol Nurs, 30 (2004), 29-36
[28] J. Adams, H. Verbeek, S.M. Zwakhalen, The Impact of Organizational Innovations in Nursing Homes
on Staff Perceptions: A Secondary Data Analysis. J Nurs Scholarsh, 49 (2017), 54-62
[29] R. Briggs, S. Robinson, F. Martin, D. O’Neill, Standards of medical care for nursing home residents in
Europe. European Geriatric Medicine, 3 (2012), 365-367
[30] N. Carrier, G.E. West, D.J. Ouellet, Dining experience, foodservices and staffing are associated with
quality of life in elderly nursing home residents. Nutr Health Aging, 13 (2009), 565-570
1. Introduction
1 Corresponding Author: Sai Pavan Kumar Veeranki, AIT Austrian Institute of Technology, Reininghausstraße 13, 8020 Graz, Austria, E-Mail: [email protected]
Soto et al. developed web-based services for deploying risk assessment models and decision support tools [5], which can already be used in clinical routine. However, most of these methods cannot be integrated with existing EMR data and require the manual input of model parameters. Translating and disseminating fully automated prediction algorithms from research to decision support at the point of care is a challenge.
In a previous paper, we adapted the cross-industry standard process for data mining (CRISP-DM). Compared to the original CRISP-DM standard, our approach consists of two cycles: one represents the complete project (outer cycle) and the other represents the predictive analytics cycle (inner cycle), as shown in Figure 1 [2].
Figure 1. Overview of our data-driven decision support for health and care [2].
The inner cycle in Figure 1 is a continuous process to improve the performance of the model with newly generated features and newly available data, even after deployment. The model is deployed into the clinical workflow to perform predictions with prospective data, which closes the outer cycle in Figure 1.
According to Brownson et al. [6], translating basic research into practice in the health business takes 17 to 20 years and millions of dollars to overcome several sequential hurdles. In [7], the authors described a method based on FHIR web services for deploying predictive models into healthcare routine. HIMSS Analytics developed the Adoption Model for Analytics Maturity (AMAM), which provides a score in seven stages that can be used to check the technology readiness of hospitals and health care providers to deliver personalised medicine. Five stages have to be passed before reaching stage 6, which denotes the readiness to provide clinical risk intervention and predictive analytics. According to HIMSS, only a few hospitals have reached stage 6 so far [8].
Apart from the above-mentioned hurdles and stages, and from social, legal, ethical, economic, political, usability-related and organisational factors, we have observed intrinsic issues of the models and the data themselves that complicate the deployment process. Some of these issues relate to system changes over time, such as variations in one or more of the following characteristics:
• availability of new diagnostic and therapeutic procedures
• major changes in financing and incentives
• changes in the health profiles of the patients / populations
• new processes for recording the data
• changes in data quality
1.2. Objective
In 2018, a delirium prediction model was deployed within a routine clinical workflow to support health care professionals [10]. In the current study, we analysed the stability of the predictive models over time. Our results provide a basis for deciding how often the deployed delirium model needs to be re-trained or re-built during operation.
2. Methods
2.1. Dataset
The data for this analysis were extracted from the Hospital Information Systems (HIS)
of “Steiermärkische Krankenanstaltengesellschaft m.b.H.” (KAGes) (i.e. EMR of
KAGes) which is the regional health care provider in Styria (one of the nine provinces
of Austria). KAGes has about 90% market share in terms of acute care hospital beds of
the region and has access to more than 2.1 million longitudinal health records. A retrospective dataset was extracted from the KAGes HIS for the years 2012 to 2017 to develop models for the use case of predicting, at the time of admission, the occurrence of delirium during hospitalisation.
For the analysis, we identified 4,596 delirium patients as the cohort from the HIS, based on the inclusion and exclusion criteria published in [1], and randomly selected 25,000 patients who had never been diagnosed with delirium at KAGes. For privacy reasons, patients with extremely rare diseases and patients with no previous records were excluded, leaving 24,972 patients in the control group. As reference date, we used a) the date of admission of one of the patient’s hospitalisations for the control group and b) the date of admission of the hospitalisation during which the patient was diagnosed with delirium for the cohort. Altogether, the feature set comprised 502 features.
In order to analyse the stability of the models over time, we divided the data into six
subsets based on the year of the reference date. We selected a Random Forest (RF) as a
modelling method, since RF had outperformed various other methods for predicting
delirium in one of our previous works [1].
We trained one RF model for each year from 2012 to 2016 with 10-fold cross-validation, which provided us with 10 models per year. To answer the question at stake, we tested all 5×10 = 50 models with the sixth subset, i.e. the 2017 data. Area under the receiver operating characteristic curve (AUROC) measures were calculated for each model on the test data to compare the performance of the models.
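The paper does not include the implementation itself; purely as a minimal sketch of this evaluation protocol, assuming hypothetical dictionaries X_by_year and y_by_year that hold one feature matrix and one binary delirium label vector per year, the procedure could look as follows in scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def yearly_auroc(X_by_year, y_by_year, test_year=2017, n_splits=10, seed=42):
    """Train one RF per cross-validation fold for every training year and
    evaluate all resulting models on the held-out test year."""
    X_test, y_test = X_by_year[test_year], y_by_year[test_year]
    results = {}
    for year in sorted(y for y in X_by_year if y != test_year):
        X, y = X_by_year[year], y_by_year[year]
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
        scores = []
        for train_idx, _ in cv.split(X, y):
            rf = RandomForestClassifier(n_estimators=500, random_state=seed)
            rf.fit(X[train_idx], y[train_idx])
            # AUROC of this fold's model on the most recent year's data
            scores.append(roc_auc_score(y_test, rf.predict_proba(X_test)[:, 1]))
        results[year] = scores  # ten AUROC values per training year
    return results
```

Hyperparameters such as the number of trees are illustrative assumptions, not the configuration used in the study.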
3. Results
Each boxplot in Figure 2 represents the distribution of the AUROC for the 10 models of
a single year, when tested with the data from the year 2017. We found no significant
variation or trend in the performance of the models with respect to the AUROC when
comparing the models that were trained with older data to models trained with more
recent data.
Figure 2. AUROC distribution of the models trained with data from each of the years 2012–2016 and tested with the data from the year 2017.
Table 1 summarises the median AUROC measures obtained when testing the models trained with data from one year against the data of the subsequent years.
Table 1. Median AUROC of the models trained on data from one year and tested with the data from the subsequent years.
Train \ Test    2013      2014      2015      2016      2017
2012           0.8820    0.8678    0.8740    0.8800    0.8857
2013              –      0.8678    0.8798    0.8812    0.8804
2014              –         –      0.8783    0.8847    0.8828
2015              –         –         –      0.8917    0.8845
2016              –         –         –         –      0.8836
4. Discussion
Our results indicate that, for delirium prediction models, re-training at yearly intervals would not have improved model performance compared to models trained up to five years earlier. From this finding, one might conclude that re-training will not be necessary in the future either, unless there are major data revisions. However, the deployment of a delirium model itself might represent such a major change. Therefore, re-training right after deployment might still be indicated.
Although our analysis shows that there is no need for frequent re-modelling in the case of delirium prediction, these analytical methods should be applied to other clinical questions with care. Additionally, the size of the dataset might play an important role in this context. Further research would be necessary to find out whether similar results would be obtained with smaller and/or larger datasets.
We applied 10-fold cross-validation when training each of our models. Since we tested our models with different test datasets (stemming from different years), cross-validation could even have been omitted. However, by using cross-validation, we could also compare the results of models trained with data from 2012–2016 with a model trained with data from 2017, although this has not been investigated yet. Additionally, cross-validation provides information on how slight changes in the training data of a single year affect the model performance.
We decided to look at the question at stake from a pragmatic (“top-down”) point of view: if the model performance does not change, then re-training might not be necessary. However, additional (“bottom-up”) analyses of the stability of the underlying features might give valuable insights within the inner cycle depicted in Figure 1. It may therefore be beneficial to monitor features during the operation of the model, in order to detect unexpected changes in a feature’s characteristics that might affect the model performance.
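One possible bottom-up monitor (our own illustration, not part of the study) is the population stability index (PSI), which compares a feature’s distribution at training time with its distribution during operation:

```python
import numpy as np

def population_stability_index(reference, current, bins=10):
    """PSI between the training-time distribution of a feature (reference)
    and its distribution during operation (current); large values signal drift."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref = np.histogram(reference, bins=edges)[0] / len(reference)
    cur = np.histogram(current, bins=edges)[0] / len(current)
    ref = np.clip(ref, 1e-6, None)  # avoid division by zero and log(0)
    cur = np.clip(cur, 1e-6, None)
    return float(np.sum((cur - ref) * np.log(cur / ref)))

# A common rule of thumb (an assumption, not a study result):
# PSI < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 major shift.
```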
5. Conclusions
The models trained with data from the years 2012 to 2016 performed comparably on the data from the year 2017, even the models trained with the oldest data (five years earlier), indicating that re-training of the model at regular intervals might not be critical.
However, the models’ performance needs to be observed further during their application to prospective data, as the characteristics of the clinical routine data might change due to the deployment of such a predictive model. Major organizational or legislative changes, or changes in, e.g., the HIS, should be scrutinized concerning their impact on a predictive model.
Acknowledgement
This work is part of the IICCAB project (Innovative Use of Information for Clinical Care
and Biomarker Research) within the K1 COMET Competence Centre CBmed
(http://cbmed.at), funded by the Federal Ministry of Transport, Innovation and
Technology (BMVIT); the Federal Ministry of Science, Research and Economy
(BMWFW); Land Steiermark (Department 12, Business and Innovation); the Styrian
Business Promotion Agency (SFG); and the Vienna Business Agency. The COMET
program is executed by the FFG. KAGes and SAP provided significant resources,
manpower and data as a basis for research and innovation.
References
Abstract. The steady increase in the number of patients equipped with mechanical heart support implants, such as left ventricular assist devices (LVAD), along with virtually ubiquitous 24/7 internet connectivity, is a motive to investigate and develop remote patient monitoring. In this study we explore machine learning approaches to recognising infection severity on driveline exit site images. We apply a U-net convolutional neural network (CNN) for driveline tube segmentation, resulting in a Dice coefficient of 0.95. A classification CNN is trained to assign photographs to one of three infection classes. The resulting overall accuracy of 67% is close to the measured expert-level performance, which indicates that there may not be enough information in the photographs for an accurate assessment, even for human experts. We suggest the inclusion of thermographic image data in order to better resolve mild and severe infections.
1. Introduction
An increasing number of patients with heart failure classified as severe according to the New York Heart Association (NYHA) Classification [1] are treated with a mechanical support implant. This may either bridge the period while waiting for a heart transplantation or serve as a permanent solution, the so-called destination therapy [7]. A left ventricular assist device (LVAD) is a pumping device implanted onto the heart, taking over the main pumping function of the left ventricle while the heart itself retains only a low residual function. The device relies on a permanent electrical connection, through a driveline tube, to a control module and battery pack situated outside the patient’s body. The control module collects device operation data, which provides a valuable opportunity for data exchange and thus for the early detection of problems.
The driveline exit site is a delicate location, requiring continuous wound treatment and wound dressing, the latter typically renewed once every five days.
1 Corresponding Author: Noël Lüneburg, Target Holding, Atoomweg 6B Groningen, The Netherlands, E-Mail: [email protected].
Driveline infections occur frequently because the driveline exit site creates a conduit for the entry
and proliferation of bacteria. This is one of the most severe adverse events for the patient,
leading to the necessity of surgical wound revision or even the replacement of the assist
device implant [8]. Driveline infection is defined as an infection affecting the soft tissues
around the driveline outlet, accompanied by redness, warmth, and purulent discharge.
Telemonitoring of driveline exit sites can provide early detection of these symptoms
and can aid in the remote diagnosis of relevant driveline infections. The majority of
LVAD patients have positive reactions towards telemonitoring [9]. Photographs of the
driveline exit site, taken by caregivers or patients themselves with their mobile devices
during renewal of the wound dressing, are sent through a mobile application to the
physician in charge in the patient’s clinic. The image will be reviewed in combination
with any available device data, clinical data and the accompanying patient-update on
their well-being or quality of life. The aim is to prevent patients from having to travel to their clinics for check-ups too often, or to consult their local general practitioner, but even more so not to miss the early detection of an upcoming adverse event. The current state of the art is that patients are seen by their clinics once every three months, without any visual monitoring in between.
Before deep learning was widely accepted as a machine learning method in the
image processing field, the support vector machine (SVM) and multi-layered perceptron
(MLP) were popular choices for computer aided image analysis in the domain of
photographic imaging [11] as well as non-photographic medical imaging [12]. Deep
learning has been applied to skin cancer classification supported by a large data set [10]
in which a deep CNN matched (and even outperformed in certain configurations)
dermatologists in classification accuracy.
In the following sections we describe three applications of deep learning which
support the diagnosis procedure by automatically predicting the presence and severity of
driveline infections based on patient photographic data. This can be executed ‘on the fly’
and will not add significant transit time to the images. The physician-in-charge then
receives the images with a severity indication, and in particular a warning sign in case of
a recognized severe infection.
2. Methods
The data set we worked on for this study consists of 745 general photographs from a total of 61 patients, taken and provided in pseudonymized format by Schüchtermann-Schiller’sche Kliniken and Hannover Medical School. The photographs had been taken and stored for documentation purposes and had not been further processed. Photographs were taken from various positions and lack consistency in lighting. In addition, photographs can be out of focus or show signs of camera motion, and part of the wound area can be obstructed by dressing. These conditions might apply to future images taken by patients as well, and our pipeline is designed to handle them automatically.
732 out of the 745 photographs are labelled as belonging to one of the following
three classes: no infection, mild infection, severe infection. In regular operations, labels
are assigned by clinical experts based on features such as presence of bacteria, odour and
warmth, in addition to visual features on the surface of the wound. We intentionally only
assessed the photographic data as this will be the data available from remote patient
monitoring.
194 N. Lüneburg et al. / Photographic LVAD Driveline Wound Infection Recognition
The data set is heavily imbalanced concerning the representation of the three classes,
specifically, the severe label is assigned to only 5.1% of all photographs. The distribution
for each class is listed in Table 1. The number of photographs per unique patient varies
between 1 and 38 with an average of 6.8 photographs. A severe infection case occurred
in 17 patients. For these patients on average 2.2 photographs were assigned the severe
label.
The processing steps used in the machine learning classification training procedure
were as follows.
1. Detection and filtering of out-of-focus photographs,
2. Driveline tube segmentation,
3. Prediction of region of interest,
4. Classification of wound infection class.
In the following sections each of the processing steps is explained in more detail. We filtered out highly out-of-focus samples from the training set to increase the quality of the training data. The aim was to automatically remove the subset of photographs without sufficient detail to determine the infection class.
Blur in a photograph can be quantified by computing the sum of the partial second derivatives of the image in both dimensions, known as the Laplacian operator, which is also applied in autofocusing for microscopes [2]. The amount of blur is reduced to a single number by taking the variance of the Laplacian across all pixels of the image.
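As a minimal sketch of this measure using OpenCV (the file name and threshold value are illustrative, not those used in the study):

```python
import cv2

def blur_score(image_path: str) -> float:
    """Variance of the Laplacian; low values indicate an out-of-focus photograph."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

# The threshold below is purely illustrative; in the study it was tuned on a
# manually labelled set to balance precision and recall.
if blur_score("driveline_photo.jpg") < 100.0:
    print("Photograph rejected: insufficient focus")
```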
Before the out-of-focus detection algorithm was developed, a set of 692 photographs was available which had been manually classified as either out-of-focus or clear. This allowed us to set a threshold on the variance of the Laplacian that ensures a balanced ratio between precision and recall for out-of-focus detection.
Whenever a device is used to take and send a photo, the out-of-focus detection could
trigger an immediate request for a repeated photograph, sent back to the patient’s LVAD
App while they are still busy with the wound dressing renewal.
Drivelines may have different visual features, ranging from an opaque white colour to transparent, granting a view of differently coloured internal cables, and sometimes reflecting flash lighting on their surface. They occur in all photographs, and their presence may increase the complexity of training an infection classification network if the network itself is not able to ignore the irrelevant tube features. This section focuses on two separate approaches for detecting the driveline tubes, which allowed us to mask out the features in the driveline tube area of the image during infection classification.
In the absence of annotated photographs, a first approach made use of the
Felzenszwalb unsupervised segmentation algorithm [3]. It is a greedy graph-based
algorithm which iteratively merges adjacent pixel regions based on local and global
contrast. The Felzenszwalb algorithm is sensitive to the variations within the
photographic data, requiring parameter tuning on a per sample basis for adequate
segmentation performance, which is inconvenient for practical applications.
A supervised deep learning method may be better suited to capturing the image complexity. In order to facilitate supervised learning, we set up a web-based annotation service. Anonymous images were offered to annotators in a random sequence, and images and annotation results were exchanged through a secure connection. LVAD experts were able to use this service to visually annotate driveline regions and other skin coverage (e.g. wound dressing) in photographs. A magnification tool allowed for the exact drawing of the segmentation map with standard point-and-click devices.
A specific convolutional neural network (CNN) architecture called U-net [4] was used for training on the annotated data. It is a type of semantic segmentation CNN which can be used to assign a class label (‘driveline tube’ or ‘background’ in this case) to each pixel in an image. Physicians used the annotation service to annotate 185 photographs, which we randomly split into 148 training and 37 validation samples. Data augmentation in the form of affine transformations was applied to the training images and their ground-truth annotations to artificially enrich the training set.
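A minimal sketch of such paired augmentation, assuming NumPy arrays for the photograph and its binary mask (rotation is shown as one representative affine transformation; the study's actual set of transformations is not specified here):

```python
import numpy as np
from scipy.ndimage import rotate

def augment_pair(image: np.ndarray, mask: np.ndarray, angle: float):
    """Apply the same affine transformation (here: a rotation) to a photograph
    and its ground-truth segmentation mask so that they stay aligned."""
    image_aug = rotate(image, angle, axes=(0, 1), reshape=False, order=1)
    # nearest-neighbour interpolation keeps the mask labels binary
    mask_aug = rotate(mask, angle, reshape=False, order=0)
    return image_aug, mask_aug
```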
While the first three steps above provide methods and tools for the preparation of the
photographs to be analysed, infection class recognition is the main contribution of the
research described in this paper. We set up a classification network that learns to identify
one of the three infection classes (none, mild, severe) based on an input image.
Experiments were set up using a variety of popular CNN classification architectures.
The best performing network on our data set was the VGG-16 architecture [5], pretrained
on ImageNet [6] and fine-tuned on the driveline photographic data. The training data was
augmented using affine transformations to indirectly increase the effectiveness of the
classifier [13].
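A minimal fine-tuning sketch in Keras; the classification head, input size and hyperparameters are illustrative assumptions, not the configuration used in the study:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# VGG-16 convolutional base pretrained on ImageNet, without its classifier head
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # train only the new head first; unfreeze later to fine-tune

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(3, activation="softmax"),  # no / mild / severe infection
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # integer class labels
              metrics=["accuracy"])
# model.fit(train_images, train_labels, validation_data=(val_images, val_labels))
```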
Since the labels of our training set were initially assigned using more information
than only the visual features observed in the photographs, we initiated a blind expert
evaluation. In such an evaluation we can not only compare the performance of the
classification CNN with respect to the original labels, but also to the performance of
human experts in an identical task. The blind evaluation set consisted of 100 photographs,
containing an even division of samples from both heart clinics. The chosen class
distribution reflects the class distribution of the full data set as much as possible, while
ensuring a minimum of 15 samples per class (see Table 3). The images were drawn
randomly from the respective class’s image pool. Physicians from both clinics were
asked to provide their classification of infection class for each of the evaluation
photographs. The classification CNN was trained using the leave-one-out method to
obtain a single infection class prediction for each of the 100 photographs.
A separate experiment was set up to analyse the effects of tube segmentation on
classification performance. Segmentation masks from Section 2.2 were applied to the
photographs before feeding these to the classification CNN.
3. Results
Figure 1. Visualisation of driveline tube segmentation masks. The blue region represents the predicted
driveline tube area. Left: Felzenszwalb segmentation method (note that the non-skin background is included in
the blue region). Right: U-net segmentation method.
Table 2. Infection classification accuracy and macro (unweighted) F1 score based on different types of region
of interest (RoI) extraction methods.
RoI type Accuracy (%) F1 score
None (full image) 66.7 0.472
Manual RoI 71.7 0.498
U-net RoI 69.8 0.496
Two LVAD experts, one from each of the two clinics involved in the study, have
assigned infection class labels to each of the 100 photographs in the blind evaluation.
For comparison, output predictions from the classification CNN have been obtained on
the same set. We compute prediction accuracy using the original labels, and the resulting
metrics are shown in Table 3. The total accuracy of all participants, humans and machine,
is between 66% and 69%. The mean accuracy is derived from the results of the three
classes, weighted by the class distribution, as shown in Table 3. The severe infection
class, which is least represented in the data and prone to under-skin processes, shows the
lowest accuracy for all participants. Since prediction performance in this class is at least
as important as in the other classes, the macro F1 score is reported for each participant
as well. We observed that due to the lower performance on the severe infection class the
macro F1 average score of the classification CNN is lower than that of the trained
physicians.
Multiple approaches for applying tube masks (generated by the U-net segmentation
CNN) to classification input photographs were explored, such as setting the driveline
tube to a solid colour and a combination of inpainting and blurring to attempt to hide the
tubes in the photographs. In every approach in which a tube segmentation mask was
applied to classification input images, the resulting classification accuracy ended up
lower than without applying the mask.
Table 3. Prediction accuracy and F1 score of each participant providing predictions on the blind evaluation set (n=100). The macro F1 score average is reported for each candidate, calculated by weighting each class equally. Numbers in bold indicate the highest scores per class.
4. Discussion
5. Acknowledgements
The authors of this paper would like to thank Dr. Ioannis Giotis for his valuable
knowledge on skin lesion segmentation during the start of the project. In addition, we
thank Dr. Rolf Neubert for the coordination between involved parties as well as helping
to improve the writing style of the paper.
References
[1] Specifications Manual for Joint Commission National Quality Measures, New York Heart Association
(NYHA) Classification, https://manual.jointcommission.org/releases/TJC2016A/DataElem0439.html,
last access: 12.02.2019.
[2] J. L. Pech-Pacheco, et al., Diatom autofocusing in brightfield microscopy: a comparative study, in:
Proceedings. 15th International Conference on Pattern Recognition. IEEE, 2000. pp. 314-317.
[3] P.F. Felzenszwalb, D. P. Huttenlocher, Efficient graph-based image segmentation, International journal
of computer vision, 59, 2004, 167-181.
N. Lüneburg et al. / Photographic LVAD Driveline Wound Infection Recognition 199
[4] O. Ronneberger, P. Fischer, T. Brox, U-net: Convolutional networks for biomedical image segmentation,
in: International Conference on Medical image computing and computer-assisted intervention. Springer,
Cham, 2015. pp. 234-241.
[5] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv
preprint arXiv:1409.1556, 2014.
[6] Stanford University, Princeton University, ImageNet, http://www.image-net.org/, last access: 22.1.2019.
[7] S.P. Pinney, A.C. Anyanwu, A. Lala, J.J. Teuteberg, N. Uriel, M.R. Mehra, Left ventricular assist devices for lifelong support, Journal of the American College of Cardiology, 69 (2017), 2845-2861.
[8] A. Zierer, S.J. Melby, R.K. Voeller, T.J. Guthrie, G.A. Ewald, K. Shelton, et al., Late-onset driveline infections: the Achilles’ heel of prolonged left ventricular assist device support, The Annals of Thoracic Surgery, 84 (2007), 515-520.
[9] E. Deniz, C. Feldmann, T. Schmidt, J.D. Hoffmann, J. Hanke, S.V. Rojas-Hernandez, et al., The Impact of Telemonitoring in Patients with Ventricular Assist Device, The Thoracic and Cardiovascular Surgeon, 65(S 01) (2017), ePP17.
[10] A. Esteva, B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, H.M. Blau, S. Thrun, Dermatologist-level classification of skin cancer with deep neural networks, Nature, 542 (2017), 115.
[11] A. Masood, A. Ali Al-Jumaily, Computer aided diagnostic support system for skin cancer: a review of techniques and algorithms, International Journal of Biomedical Imaging (2013).
[12] M.N. Wernick, Y. Yang, J.G. Brankov, G. Yourganov, S.C. Strother, Machine learning in medical imaging, IEEE Signal Processing Magazine, 27 (2010), 25-38.
[13] L. Perez, J. Wang, The effectiveness of data augmentation in image classification using deep learning, arXiv preprint arXiv:1712.04621, 2017.
Abstract. In Austria, there is no single source of truth holding information about all physicians and their medical practice. Therefore, different sources have to be combined to accumulate detailed information about doctors, identify data errors and increase overall data quality. The aim of this project is to link two datasets of vastly different origins using reproducible and largely automatic procedures, in contrast to the manually acquired links of the past. As there is no global identifier, the names and addresses of the doctors were used instead. Because of different spellings and typos in names and addresses within and between the datasets, direct comparison does not lead to satisfactory results. Therefore, probabilistic matching with string metrics was applied. The utilized methods significantly improve the linkage and allow about 80% of the private consultants to be matched in both datasets.
1. Introduction
In Austria, there is no single register of all outpatient physicians and their medical practices. Furthermore, the ambulatory outpatient healthcare system is highly fragmented. Some physicians have contracts with social health insurance (SHI) institutions and are reimbursed directly. Additionally, there are non-SHI-accredited doctors of a patient’s personal choice, for whom out-of-pocket payment is required and whose bills are partly reimbursed by the health insurance on request.
The "Handbuch der Sanitätsberufe" (HSB) of the publishing house Göschl contains
information of all doctors in Austria, but there is no information about doctors practice,
e.g. consultations, included. On the other hand, routinely collected administrative and
accounting data of the SHI institutions includes rich information about reimbursements
and services rendered, but hardly any information about the physicians themselves.
The aim of the project is mostly automatically link these two datasets to be able to
get a complete picture of the non-SHI physicians. The result consists of the accounting
data of the non-SHI physicians supplemented by additional information from the HSB.
2. Methods
The datasets had to be acquired, transferred securely and transformed into a usable
format. During the initial data exploration and quality assessment it became clear that
there is often more than one entry per doctor or office. Due to the lack of global identifiers, names and addresses had to be used for deduplication and matching. Because of differing spellings, abbreviations and typos in names and addresses even within the same dataset, string metrics, which provide a measure of the similarity or distance of two texts [1], had to be applied. Visualizations and automatically generated reports enabled a direct communication of results, of possible thresholds for the distance measure, and of errors in the deduplication and matching procedure.
1 Corresponding Author: Melanie Zechmeister, Verein DEXHELPP, Neustiftgasse 57-59, 1070 Vienna, Austria, E-Mail: [email protected]
As the datasets are updated periodically, the process had to be implemented in a way that allows it to be rerun on the updated data with little additional effort. Therefore, the process was implemented as a series of R files which call the matching algorithm for each variable. Thus, the process can also be re-executed with differing datasets; one must merely identify the linking variables.
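The project used the R package RecordLinkage [2]; purely as an illustration of threshold-based matching with a string metric, a Python sketch with a generic similarity measure from the standard library could look as follows (the field names are hypothetical):

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalised similarity in [0, 1] of two strings."""
    return SequenceMatcher(None, a.casefold().strip(), b.casefold().strip()).ratio()

def link(record: dict, candidates: list, threshold: float = 0.85) -> list:
    """Return all candidate entries whose combined name/address similarity
    exceeds the threshold; the threshold trades recall against false positives."""
    key = f"{record['name']} {record['address']}"
    return [c for c in candidates
            if similarity(key, f"{c['name']} {c['address']}") >= threshold]
```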
3. Results
Deduplication and linking of the two datasets using string metrics greatly increased the quality of the available information and data. While there are about 27,000 entries in the raw dataset, only 15,000 identified physicians (about 45%) are left after deduplication.
The string matching is essential for the linkage process as well. Exact linkage leads to about 5,800 matches, which is about 38% of the private consultants. After applying the text matching, 12,414 or about 82% can be linked. Furthermore, the threshold of the distance metric can be adapted to the requirements of a project, e.g. resulting in more links while accepting more false positives.
4. Discussion
It becomes apparent that the employment of text matching leads to a substantial improvement of the results. In particular, the automatization saves a lot of time compared to manual linkage, which had been the method of choice before.
Even so, there is potential for improvement. The applied string metric and matching procedure [2] were chosen because they are implemented in R. A systematic evaluation of different metrics and algorithms might improve the results.
Furthermore, the manual identification and selection of the threshold is still time-consuming and depends on human interaction. Automatic detection of fitting boundaries would accelerate the procedure.
In this project, all matchings above the boundaries were accepted and the others declined, which leads to occasional errors. Spot-check inspections showed that some errors could be identified manually.
References
[1] W. E. Winkler. String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of
record linkage. Proceedings of the Section on Survey Research Methods (1990), 354-369.
[2] A. Borg, M. Sariyar, RecordLinkage: Record Linkage in R. R package version 0.4-10.1 (2019). https://CRAN.R-project.org/package=RecordLinkage
1. Introduction
Since health data are sensitive and private, the need for confidentiality and security of these data is obvious [1-3]. Confidentiality refers to the protection of information against unauthorized access or disclosure; keeping information confidential requires controlling the access levels of individuals (authorized users) in organizations, as well as protecting information during data transmission [4-6].
Therefore, one of the most important responsibilities of the Health Information
Management Departments (HIMD) in hospitals is compliance with the principles of
confidentiality and information security [7]. The HIMDs should play a significant role
in monitoring and observing laws, adhering to professional standards, and conducting
appropriate procedures for keeping health information secure and confidential [5,8].
However, research conducted in different countries indicates that HIMDs deviate considerably from these principles [8]. Moreover, different countries have
different policies and procedures to protect the privacy and security of patient
information in hospitals [4]. For example, USA [9], Canada [10] and Australia [11]
have enacted regulations in this regard. Furthermore, the European Union enacted the
General Data Protection Regulation for protecting data and privacy for all individuals
within the European Union (EU) and the European Economic Area (EEA) in 2016
[12]. In Iran, there is no clear policy for maintaining the security and confidentiality of patient information. Some studies have shown that the confidentiality of medical records is not observed appropriately [6, 7]. Additionally, some previous studies indicated that patients are concerned in this regard [13]. Considering the importance of the observance of confidentiality rules, this study was carried out to determine the performance of HIMDs in teaching hospitals in Iran and to identify their similarities and differences as a basis for compiling policies on the confidentiality and security of health information.
1 Corresponding Author: Nasim Hashemi, Iranian Social Security Organization, Tehran, Iran, E-Mail: [email protected]
2. Method
3. Results
Most participants were women (89.3%), had a bachelor’s degree (85.7%) and were specialists in the field of HIM (96.4%). The mean age of the participants was 39.7 years and the average working experience was 16.5 years. In most hospitals (Table 1), patients do not have the right to review and request correction of their medical records (66%), and most hospitals do not release patients’ information to them (47.1%). Most hospitals obtain a commitment from the users not to disclose the contents of the medical records (43.4%). Regarding compliance with the principles of information disclosure consent (Table 2), we found that obtaining permission from the hospital administrators and authorities to disclose patient information without the patient’s consent (67.9%), and access of the hospital doctors to the medical records without the patients’ consent and only upon request from the doctor (56.6%), were considered adequate. Disclosure of any information requested by the users without the consent of the patients was also reported.
Regarding the external users (Table 3), responding to legal requests on the order of the hospital director (88.6%) and providing medical records to other hospitals and external doctors on the orders of the hospital managers without the patients’ consent (81.1%) were common. Only 64.3% of the participants stated that their hospitals had a policy for the use of medical records by researchers. Disclosure of medical record information to lawyers and the authorities only by judiciary order was reported by only 62.6%. Regarding the internal users (Table 4), we found that the heads and managers of hospitals have convenient and fast access to medical records without the patients’ consent (58.4%). Only 52.8% of the participants declared that they had a clear policy for the use of patients’ information in educational programs.
Table 3. Compliance with confidentiality principles in responding to external users in the HIMDs
4. Discussion
In general, the findings show that these hospitals, in some cases, use the same procedures, but in many cases, the current processes of the hospitals regarding the confidentiality and disclosure of information under different circumstances are not the same. Furthermore, in many cases, the hospitals’ procedures show that the confidentiality and security of health information is not a priority, and they provide access to patients’ information without their consent.
In the first axis, most hospitals obtain a commitment from the users not to disclose information. In addition, in most hospitals, patients have no right to review and request correction of their information, and if the information is deemed harmful, no information is given to patients. In cases where information was provided under legal conditions, few hospitals checked the patients’ consent and mostly relied on the doctor’s report or the order of the hospital’s director. Few hospitals stated that if a patient requests information from his/her records, he/she has access to this information with his/her identification card. Furthermore, in a small number of hospitals, keeping the confidentiality and privacy of patients was mentioned in the staff job descriptions. Some hospitals only provided the possibility to correct patients’ identification data.
Additionally, it was found that in most hospitals, the doctors’ access to patients’ information is possible without the permission of the patients and only upon a doctor’s request. For other users of the medical records, the permission of the hospital managers and authorities is considered sufficient. In most cases, the disclosure of information to the hospitals to which a patient is transferred is undertaken only with the physicians’
permission. In most hospitals, sending information to users did not require the patients’ consent. In addition, it was determined that hospital managers have access to patients’ information without their consent, whereas other staff have access to information only within their scope of tasks. The use of medical records to assess the quality of health care is also possible with the permission of the doctor. For sending information outside the hospital, most hospitals do not obtain consent from the patients, and only the permission of the hospital authorities is considered sufficient, except in legal cases, where the information is sent out merely upon a request from the judicial authorities.
According to the HIPAA Privacy Rule, a patient has the right to access and control his/her health information [9]. According to Canadian law, health care providers and centers are required to protect personal information and to account to patients for all information activities they carry out. Healthcare organizations should provide patients with access to their health information [10]. In Australia, laws have been developed to protect the privacy of health information and to enable patients to access their health information [11]. The General Data Protection Regulation has established a framework for data privacy, the rights of data subjects, and the transfer of data for European countries [12]. In Iran, however, hospitals do not seem to have a common framework to protect health information privacy and to give patients sufficient access to their information, and patients cannot control their information.
According to HIPAA, patients’ health information should not be released without their consent unless there is a clear reason for it, and users should also protect it. Patients should also be informed about what information is disclosed, to whom and why [9]. However, the findings showed that in Iran, a patient’s consent to the delivery of information is not taken into consideration, and therefore, patients do not have control over who accesses and uses their information. Moreover, the use of health information is allowed when it is permitted or required by law and the patient has expressed his/her consent to this disclosure [14].
The use of information to achieve the primary purposes of health information collection, and for other purposes such as planning, providing healthcare services, allocating resources, managing errors and risks, improving the quality of care, and training health care providers, is allowed without the patients’ consent, unless they have expressly announced their disagreement [15]. This is partly observed in Iranian hospitals, where the application of health information for research, the evaluation of healthcare quality and education does not depend on the patients’ consent. Although the educational use of information is permitted [16,17], the identity of patients should not be released [18], and students should be responsible for maintaining health information [19].
In the case of research, the written consent of patients is required, unless the ethics committee decides that written consent is not required or the researchers use a limited set of data without the patients’ identification data [20]. In other words, if the study needs identity information, the approval of the ethics committee should be available [21]. In some cases, such as health and medical research that benefits the community, where there is no possibility of obtaining consent from patients, appropriate mechanisms should be put in place to protect the privacy of health information instead of the consent form [22]. In Iranian hospitals, researchers are allowed to access the information with the permission of the hospital managers and need the permission of the ethics committee. Therefore, this issue is respected in our hospitals.
In summary, this study showed that in our country, there are no specific national frameworks and guidelines for the disclosure of health information and its privacy and security, and hospitals proceed in different ways in this regard. Also, in many cases, international principles are not respected. Therefore, a specific framework for the security and confidentiality of health information should be developed in order to protect the confidentiality and security of health information in both electronic and manual medical record systems.
References
[1] G.S. Poduri, Confidentiality and patient records. AP Journal of Psychological Medicine. 14(2) (2013), 110-113.
[2] N. Hajrahimi, S.M. Hejazi Dehaghani, A. Sheikhtaheri, Health information security: A Case study of
three selected medical centers in Iran. Acta Informatica Medica. 21(1) (2013), 42-45.
[3] J.R. Junges, M. Recktenwald, H.D Raymundo, et al. Confidentiality and privacy of information about
patients treated by primary health care teams: a review. Revista Bioética. 23(1) (2015), 200-206.
[4] T. NaseriBooriAbad, A. Sheikhtaheri. Information privacy and pervasive health: Frameworks at a glance.
Journal of Biomedical Physics and Engineering. (2019), In press.
[5] M. Langarizadeh, A. Orooji, A. Sheikhtaheri, Effectiveness of Anonymization methods in preserving
patients’ privacy: a systematic literature review. Studies in Health Technology and Informatics. 248
(2018), 80-87.
[6] E. Mehraeen, H. Ayatollahi, M. Ahmadi, A Study of information security in hospital information systems,
Health Information Management. 10(6) (2014), 779-788.
[7] A. Hajavi, M. Khoushgam, M. Hatami, A Comparative study on regarding rate of the privacy principles
in legal issues by WHO manual at teaching hospitals. Journal of Health Administration. 33(11) (2007),
7-16.
[8] M. Farzandipour, Policies for providing medical records at hospitals. Dissertation of Health Information
Management. (2002).
[9] K.A. Wager, F.W. Lee, J.P. Glaser, Health care information systems: a practical approach for health care
management. John Wiley & Sons (2017).
[10] A. Thorogood, Protecting the privacy of Canadians’ health information in the cloud. Can J Law
Technol. 14 (2016), 173-213.
[11] New South Wales information and privacy commission, health records and information privacy Act
2002.
[12] General Data Protection Regulation (GDPR). (2016) Available from: https://gdpr-info.eu/
[13] A. Sheikhtaheri, M.S Jabali, Z.H. Dehaghi. Nurses' knowledge and performance of the patients' bill of
rights. Nursing Ethics. 23(8) (2016), 866-876.
[14] Government of Newfoundland and Labrador: Department of Health and Community Services, The
personal health information act policy development manual. (2011).
[15] Personal Health Information Protection Act. (2004) Available from:
https://www.ontario.ca/laws/statute/04p03.
[16] Uconn Health. Policy: Use of protected health information (phi) in education (POLICY NUMBER
2014-07). (2014) Available from: https://health.uconn.edu/policies/wp-
content/uploads/sites/28/2015/07/policy_2014_07.pdf.
[17] UT Health HIPAA compliance program: Office of regulatory affairs and compliance: Using protected
health information (PHI) for education. (2014).
[18] M. Abdelhak, S. Grostick, M.A. Hanken, Health Information: Management of a Strategic resource.
Elsevier Health Sciences, (2014).
[19] University of Hawaii HIPAA training program, Appropriate uses of protected health information for
educational purposes. (2014).
[20] UCI Office of Research. Protected Health Information (HIPAA). (2015) Available from:
http://www.research.uci.edu/compliance/human-research-protections/researchers/protected-health-
information-hipaa.html.
[21] Office of the Information and Privacy Commissioner, The health information Act: use and disclosure of
health information for research.
[22] C. O'Keefe, D. Rubin, Individual privacy versus public good: Protecting confidentiality in health
research. Statistics in Medicine. 34(23) (2015), 3081-3103.
Keywords. medical record linkage, data analysis, ergometry, exercise test, cardiac
rehabilitation
1. Introduction
Clinical trials are commonly limited to a specific study population, which cannot represent the target population in full detail. Additionally, clinical trials have a limited time of observation and a relatively small number of subjects. To get a more comprehensive view, different strategies can be followed. On the one hand, researchers are trying to reuse available datasets from various clinical trials by combining them into bigger datasets (“record linkage”), e.g. using EUPID [1]. On the other hand, routine care increasingly relies on information and communications technology (ICT), which leads to huge amounts of patient data. As they document treatments and their outcomes under real-world conditions, routine data are a highly valuable resource for research studies (“secondary use”) [2].
1 Corresponding Author: Alphons Eggerth, AIT Austrian Institute of Technology GmbH, Reininghausstraße 13, 8020 Graz, Austria, E-Mail: [email protected]
Secondary use of healthcare data brings high responsibilities for the research team. The research environment needs to meet legal and ethical requirements, which limit the usage of the data. A very important aspect is the protection of the patients’ personal information (e.g. the GDPR for Europe [3], HIPAA for the US [4]). Along with providing secure data storage and computing environments, the data need to be cleaned of identifying elements. Thus, names, social security numbers, telephone numbers, etc. must be removed from the datasets [2].
Once all requirements are met, the question of the optimal record linkage algorithm arises. Usually, several datasets are given which need to be linked. In the optimal case, a universal identifier or common patient pseudonyms exist, which can be used to directly connect all the records. However, there are situations in which such a universal identifier (e.g. for privacy reasons) or common patient pseudonyms (e.g. due to the various origins of the datasets) are not present. To enable the linkage of de-identified datasets in such situations, the two datasets need to share some of their fields, i.e. some information needs to be present in both datasets (e.g. date of birth, ZIP code, sex) that contains enough information to obtain a unique combination of values for each patient (see the k-anonymity concept [5]). While deterministic approaches try to match records through rule-based algorithms, probabilistic approaches rely on statistical methods to calculate weights for the available parameters, which are then applied for estimating matching probabilities [6-9].
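As an illustration of such weights (a Fellegi-Sunter-style example of our own, not necessarily the method of [6-9]), a per-field weight can be derived from the probability that the field agrees among true matches (m) versus non-matches (u):

```python
import math

def agreement_weight(m: float, u: float) -> float:
    """Weight added when a field agrees: log-ratio of the agreement probability
    among true matches (m) to that among non-matches (u)."""
    return math.log2(m / u)

def disagreement_weight(m: float, u: float) -> float:
    """Weight added when a field disagrees."""
    return math.log2((1 - m) / (1 - u))

# Illustrative example: date of birth agrees in 98% of true matches but only
# 0.1% of non-matches, so agreement contributes about +9.9 and disagreement
# about -5.6 to the total match score, which is compared against a threshold.
w_agree = agreement_weight(0.98, 0.001)
w_disagree = disagreement_weight(0.98, 0.001)
```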
In our study, we obtained data from two different sources: a) patient record data from a manually entered database (exported as an Excel file) and b) raw data, metadata and manually entered free text annotations of ergometric performance tests recorded with the ergometers in use (exported as separate XML files). We tried to link these two data sources to validate the XML files’ free text annotations. However, there was neither a universal identifier nor were there common patient pseudonyms, nor was the data format suited for commonly used record linkage algorithms. Thus, we transformed the data and applied a distance-based time series record linkage approach, as proposed by Nin and Torra [10].
This paper is organised as follows: For the given datasets, we first present the
application of the time series record linkage algorithm. Second, we investigate the quality
of the XML files’ free text annotations.
2. Methods
Pseudonymised ergometry data from the rehabilitation centre ZARG Zentrum für ambulante Rehabilitation GmbH were obtained, comprising an Excel file containing manually entered database entries from 1,538 cardiac rehabilitation patients as well as 29,876 XML files that had been recorded with ergometers and contained the data of one ergometric performance test each. In this paper, we use “PAT file” as a notation for the Excel file and “ERGO files” as a notation for the XML files. However, the date ranges of the datasets were only partly overlapping. Thus, for most of our analyses, the PAT file entries after 13.06.2017 were not used. For a detailed description of the source datasets see Table 1. All analyses were conducted using Matlab (The MathWorks, Natick, US).
During pre-processing, two pseudonymised IDs of the ERGO files were dismissed, as they had more than 100 performance test entries, which is very unlikely for a single patient. Furthermore, minor typos (e.g. year = “2217” instead of “2017”), which became obvious due to unexpected outcomes during implementation, were corrected.
As an initial step, the information from the ERGO files was parsed and transformed into a table. After comparing the ERGO and the PAT dataset with each other, we extracted, for all available performance tests, the date and six parameters with identical entries in both datasets. We converted all dates and parameter values to integer values and created a separate table for the dates and for each of the parameters, both for PAT and ERGO: a) for the PAT file, we arranged the integers in the columns with one row for each patient (referred to as “PAT tables”) and b) for the ERGO files, we arranged the integers in the columns with one row for each ID (referred to as “ERGO tables”).
Table 1. Properties of the ergometry data received from the ZARG rehabilitation center.

PAT file (= Excel file):
– Content: Four sheets of manually entered patient information and results from ergometric performance tests. Every sheet represented one of four ergometric performance tests, which had been conducted during the cardiac rehabilitation program: start of phase 2, end of phase 2, start of phase 3, end of phase 3.
– Number of records: 1,538 patients
– Origin of the dataset: Cardiac rehabilitation
– First performance test: 11.02.2013
– Last performance test: 13.09.2018

ERGO files (= XML files):
– Content: Each ERGO file contained the data of one performance test. The data comprised raw data (e.g. heart rate curves, workload step profiles, ECGs), metadata and sometimes annotations (e.g. free text entries denoting the reason for the test). Files for the same patient were linked by a pseudonymised ID.
– Number of records: 29,876 ergometric performance tests
– Origin of the dataset: Cardiac rehabilitation; by order of a physician; as part of a training program
– First performance test: 22.01.2004
– Last performance test: 13.06.2017

Identical entries in both datasets: sex, age, height, weight, maximum workload value of the applied step profile, maximum heart rate value during the performance test, date of the performance test.
Figure 1. Depiction of one iteration step, which is run several times during an iteration and contains 5 iteration
sub-steps. For every iteration step, the date tables of PAT and ERGO are used along with the value tables of
PAT and ERGO for the currently evaluated parameter. In the PAT table up to four entries are available for
each patient. In sub-step 1, a patient is selected from the PAT tables and one of her/his four date-value pairs is
selected. In sub-step 2, the ERGO tables contain dates and values in their columns and each of their rows
represents a pseudonymised ID. The date-value pair selected in sub-step 1 is now subtracted from all the values
and dates of the ERGO tables. In this way, values that are identical will result in zeros. In sub-step 3, the
minimum of the differences from sub-step 2 is calculated for each row and stored (in the same column as in
the PAT tables and in the same row as in the ERGO tables). Sub-steps 1-3 are done for the up to four date-
value pairs of the selected patient. In sub-step 4, the entries equal to 0 (= exact matches) are counted for each
row of the table resulting from sub-step 3 and stored to another table. Now, for the patient selected in sub-step
1, this table shows in each row the number of exact matches between her/his date-value pairs and the date-
value pairs of this row from the ERGO tables. In sub-step 5, the IDs from the ERGO tables’ rows, which have
the maximum number of exact matches, are stored for the selected patient. Sub-steps 1-5 are done for each
parameter to finally choose the one, which results in the fewest IDs in sub-step 5.
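A simplified NumPy re-interpretation of sub-steps 1–5 for a single patient, assuming integer-coded date and value tables padded with a sentinel that never matches (the authors’ Matlab implementation may differ in detail):

```python
import numpy as np

def exact_match_counts(pat_pairs, ergo_dates, ergo_values):
    """Count, per pseudonymised ID (one row per ID in the ERGO tables), how
    many of one patient's date-value pairs occur exactly in that ID's tests."""
    counts = np.zeros(ergo_dates.shape[0], dtype=int)
    for date, value in pat_pairs:  # up to four date-value pairs per patient
        # zero differences in both tables mark an exact match (sub-steps 2-4)
        hit = ((ergo_dates - date) == 0) & ((ergo_values - value) == 0)
        counts += hit.any(axis=1)  # a row matches if any of its tests matches
    return counts

# Sub-step 5: the patient is linked to the ID(s) with the maximal count, e.g.
# candidate_ids = np.flatnonzero(counts == counts.max())
```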
3. Results
With our record linkage algorithm, we initially obtained 761 matches for the full PAT file, which equals 49.5%. After removing all patients who had at least one date value outside the overlapping date range of the PAT file and the ERGO files, the matching rate was 74.5%. Detailed results can be found in Table 2. For the further analyses, the matches of the overlapping date range were used.
Table 2. Results of applying our record linkage algorithm. A “matched patient” is a patient of the PAT file,
which can be unambiguously linked to a single pseudonymised ID of the ERGO files. For the column “PAT
file until 13.06.2017” all patients with values after 13.06.2017 (= date of the last performance test of the ERGO
files) were omitted.
Property                                              Full PAT file    PAT file until 13.06.2017
Number of patients                                    1,538            877
Matched patients                                      761 (49.5%)      653 (74.5%)
IDs matched to more than one patient (thus rejected)  206 (13.4%)      129 (14.7%)
Patients without a matching ID                        571 (37.1%)      95 (10.8%)
As described in Section 2.2, the free text entries of the ERGO files denoting the reason for the performance test could be obtained from the “ReasonForStudy” field. Thus, for the overlapping time range, we collected all these free text entries for each of the four reasons. Table 3 shows the obtained free text entries for each of the four performance test reasons together with the numbers of their occurrences.
For “start of phase 2”, 167 free text entries were obtained through the matching IDs. 78.4% of these entries contained the expected string “Erstuntersuchung Phase II”, and 15.6% of the records contained an empty string. Thus, only very few unrelated entries remained.
Table 3. Available free text entries from the ERGO files that could be unambiguously matched to PAT
entries for the respective performance test reasons. There can be four different reasons, relating to the current
stage of the cardiac rehabilitation program (“start of phase 2”, “end of phase 2”, “start of phase 3”, “end of
phase 3”). Only entries of the PAT file within the overlapping time range of both data sources
(11.02.2013 – 13.06.2017) were considered for the matching.
Start of phase 2 (167 free text entries obtained from the ERGO files)
Free text entry from the matched ERGO files Number of occurrences
“Erstuntersuchung Phase II” 131 (78.4%)
“ ” (empty string) 26 (15.6%)
“Erstuntersuchung Phase III” 3 (1.8%)
“Abschlußuntersuchung Phase II” 1 (0.6%)
“Abschlußuntersuchung Phase III” 1 (0.6%)
“Anfangsuntersuchung ProHeart” 1 (0.6%)
“CAVE!! Hr. [name] [birthdate] Erstuntersuchu” (sic!) 1 (0.6%)
“EU II” 1 (0.6%)
“Pro-Heart 3” 1 (0.6%)
“ZU Proheart” 1 (0.6%)
End of phase 2 (174 free text entries obtained from the ERGO files)
Free text entry from the matched ERGO files Number of occurrences
“Abschlußuntersuchung Phase II” 153 (87.9%)
“ ” (empty string) 17 (9.8%)
“AU II” 1 (0.6%)
“Erstuntersuchung Phase II” 1 (0.6%)
“Proheart ZU” 1 (0.6%)
“Zwischenuntersuchung Phase III” 1 (0.6%)
Start of phase 3 (184 free text entries obtained from the ERGO files)
Free text entry from the matched ERGO files Number of occurrences
“Erstuntersuchung Phase III” 111 (60.3%)
“ ” (empty string) 49 (26.6%)
“Zwischenuntersuchung Phase III” 6 (3.3%)
“Abschlußuntersuchung Phase III” 5 (2.7%)
“Erstuntersuchung Phase II” 4 (2.2%)
“Pro-Heart 2” 2 (1.1%)
“Abschlußuntersuchung Phase II” 1 (0.5%)
“Anfangsuntersuchung ProHeart” 1 (0.5%)
“EU Phase III” 1 (0.5%)
“Eingangsuntersuchung Phase III” 1 (0.5%)
“Pro Heart” 1 (0.5%)
“Pro Heart ZU” 1 (0.5%)
“Rehaabbruch” 1 (0.5%)
End of phase 3 (194 free text entries obtained from the ERGO files)
Free text entry from the matched ERGO files Number of occurrences
“Abschlußuntersuchung Phase III” 149 (76.8%)
“ ” (empty string) 32 (16.5%)
“Pro Heart” 3 (1.5%)
“Erstuntersuchung Phase III” 2 (1.0%)
“10/10/1min” 1 (0.5%)
“Abschlußuntersuchung Phase III Verl.” 1 (0.5%)
“Anschlussuntersuchung Phase III” 1 (0.5%)
“Kontrolluntersuchung” 1 (0.5%)
“Pro Heart / Herzverband” 1 (0.5%)
“Rehaabbruch Phase III” 1 (0.5%)
“Vorzeitiger Rehaabbruch/AU PH3” 1 (0.5%)
“Zwischenuntersuchung Phase III” 1 (0.5%)
For “end of phase 2”, 174 free text entries were obtained through the matching IDs.
87.9% contained the expected string “Abschlußuntersuchung Phase II”. Only 9.8% of
the entries contained an empty string, and the remaining unrelated values totaled 2.3%.
Thus, “end of phase 2” showed the highest rate of accurate entries along with the fewest
empty strings and the lowest number of unrelated entries. For “start of phase 3”, 184 free
text entries were obtained. Only 60.3% of the performance tests were tagged with the
expected entry “Erstuntersuchung Phase III”. With 26.6%, more than a quarter of the
performance tests were annotated with an empty string. The number of unrelated values
was also the highest in comparison to the other phases, totaling 13%. The final reason,
“end of phase 3”, showed characteristics similar to “start of phase 2”: of 194 obtained
free text entries, 76.8% contained the expected string “Abschlußuntersuchung Phase III”
and 16.5% were empty strings. The number of unrelated entries was 6.7%.
In summary, “start of phase 2” and “end of phase 3” had a similar rate of accurate entries,
the rate of “start of phase 3” was lower, and the rate of “end of phase 2” was higher than
for the other reasons of the performance tests.
4. Discussion
Looking at the outcome of this study, the applied time series record linkage algorithm
achieved a matching rate of 74.5%, and the observed free text entries were in accordance
with our expectations for up to 87.9% of the entries. However, at this time we had no
gold standard for evaluating the accuracy of our matches. For more reliable analyses of
the resulting combined dataset, the datasets should be linked by patient pseudonyms.
Another issue was the different date ranges of the two data sources. While the PAT
file contained records ranging from 2013 to 2018, the ERGO files contained records
ranging from 2004 to 2017 only. Obviously, no matches outside the overlapping date
range were possible, and considering the full date range, only half of the patients (49.5%)
from the PAT file could be unambiguously matched to their IDs in the ERGO files.
Looking at the overlapping date range, 74.5% of the patients could be matched.
For our matching approach, we assumed that no patient had more than one ID in the
ERGO files and allowed only a single linkage between PAT file patients and ERGO
file IDs. Thus, if a patient had had two IDs in the ERGO files, one “correct” linkage
would have been dismissed.
Even though only patients of the overlapping date range were considered for this study,
the gathered knowledge can still be applied to the full date range of the datasets. Up to
87.9% of the free text entries were entered correctly, which gives reassurance that these
entries are quite reliable.
The proposed record linkage algorithm can be used to combine de-identified datasets
into one comprehensive, de-identified dataset, which could be the basis for further insights.
However, it is not possible to identify single patients or to recreate personal information.
Formula 1 gives the criterion used for testing equality. It was chosen because our
implemented routine transformed all parameter time series to date-value pairs in integer
format for easier handling. For identical values at the same date, this criterion was
logically true, which allowed these entries to be counted as exact matches. To allow some
distance between two values instead of only counting exact matches, adaptations would
be needed: the values’ dates would need to be checked separately for an exact match,
while the values themselves would be allowed to diverge within some boundaries (e.g.
± 5 bpm for the maximum heart rate).
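A minimal sketch of the adapted criterion could look as follows; Formula 1 itself is not reproduced here, and the function name as well as the combination of an exact date match with a value tolerance are our assumptions based on the description above:

# Sketch: dates must match exactly, while values may diverge within a
# tolerance (e.g. +/- 5 bpm for the maximum heart rate); names are illustrative.
def is_tolerant_match(pat_date, pat_value, ergo_date, ergo_value, tolerance=5):
    return pat_date == ergo_date and abs(pat_value - ergo_value) <= tolerance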
The matching results show that, on the one hand, the proposed matching algorithm
was suitable for the given scenario, as unrelated free text entries were very rare, and
that, on the other hand, the free text entries proved to be very accurate.
5. Conclusion
For the given scenario with two data sources containing identical entries for some of their
parameters, our iterative, distance-based time series record linkage algorithm achieved a
matching rate of 74.5%. Furthermore, the free text annotation entries in the ERGO files
were in accordance with our expectations for up to 87.9% of the entries, which shows
that inclusion of the PAT file will be unnecessary for our future analyses of this dataset.
6. Conflict of Interest
7. Acknowledgement
This work was partly funded by the Austrian Research Promotion Agency (FFG) as part
of the project EPICURE under grant agreement 14270859.
eHealth Service for Integrated Care and Outpatient Rehabilitation
K. Reiter et al.
doi:10.3233/978-1-61499-971-3-218
Abstract. Background: Stroke is one of the three most common causes of death and the
main cause of permanent disabilities. The Tyrol Stroke Pathway covers all steps
from stroke onset to outpatient rehabilitation. Objectives: The main objective of this
paper is to describe how the paper-based documentation in the outpatient
rehabilitation can be implemented in an eHealth service for integrated care.
Methods: First, a state analysis followed by a requirement analysis was performed.
An interactive mock-up was designed for further discussion with the stakeholders.
After the implementation of the system, the evaluation was performed in two steps:
feedback from a virtual test phase and from a pilot operation was analyzed. Results: First
experiences during the virtual test phase with key stakeholders of the therapy
pathway showed a high level of acceptance. Users reported an improvement in the
communication and documentation processes. Conclusion: Initial results illustrate
how a shift from paper-based documentation to an integrated eHealth service can
improve communication and documentation in an independent therapy network.
Keywords. integrated care, stroke, eHealth, health service, health care delivery,
patient care management
1. Introduction
In Austria, as in most other countries, stroke is one of the three most common causes of
death and the main cause of permanent disabilities [1][2]. However, compared to other
countries, Austria has a low mortality rate among stroke patients [3]. Consequently,
the demand for post-stroke care is high. The efficiency and quality of interprofessional
interaction in rehabilitation teams of physiotherapy, occupational therapy and logopedics
could be improved through better patient-oriented interprofessional communication [4][5].
Tyrol, a state in western Austria, implemented the Tyrol Stroke Pathway, which
covers the care of stroke patients. The care of patients follows a structured rescue and
treatment chain, which can be divided into the following phases: the prehospital phase,
the hospital phase, inpatient rehabilitation and outpatient rehabilitation. Immediate
treatment in specialized centers (Stroke Units) and integrated care after the inpatient
treatment are recommended by several guidelines [6][7].
1 Corresponding Author: Kristina Reiter, AIT Austrian Institute of Technology GmbH,
Reininghausstraße 13, University of Applied Sciences FH Joanneum, Eckerstraße 30i, Graz, Austria, E-Mail:
[email protected].
“It was commenced, as a long-term routine-care program and aimed to include all
patients with stroke in the survey area. During the period of implementation of the
comprehensive stroke management program, thrombolysis administration increased and
clinical outcome significantly improved” [8]. The Tyrol Stroke Pathway is now
implemented as a routine standard in all counties, its outpatient rehabilitation in seven
out of nine counties of Tyrol, and it is generally accepted [9].
Tyrol is characterized by rural and urban areas surrounded by Central Europe’s main
mountain range, which leads to several challenges, especially in transportation and in
organizing outpatient rehabilitation.
The outpatient rehabilitation aims to provide rehabilitation in the proximity of the
patients’ homes. Therefore, regional, multidisciplinary stroke networks were
established, covering the hospitals and the disciplines physiotherapy, occupational
therapy and logopedics, as well as general practitioners, neurologists, nursing homes,
coordinating organizations (“Sozialsprengel”) and discharge managers of the acute care
hospital. The outpatient rehabilitation of the Tyrol Stroke Pathway is implemented in
already existing health care structures. The discharging hospital contacts the local
coordinator, who is a member of the coordinating organization, in order to organize
therapy early enough to ensure a good transfer for the patient. The treatment plan follows
scientific standards, which are defined in the Tyrol Stroke Pathway [8]. Outpatient
rehabilitation is based on the International Classification of Functioning, Disability and
Health (ICF). Therapy sessions take place at home so that patients learn to participate in
everyday life again and to handle activities of daily living.
The Stroke Pathway is well developed, but according to the 2017 project evaluation [9],
network members reported an administrative burden both in organizing the ambulant
stroke rehabilitation and in submitting invoices, as well as a lack of communication.
The goals are therefore to reduce administration time and to provide information
immediately and simultaneously to the patient’s regional network members.
The aim of this paper is to describe how an eHealth service for integrated care and
outpatient rehabilitation has to be designed and implemented to support the
characteristics of the interdisciplinary network of the ambulant stroke rehabilitation. The
paper further addresses how to enhance communication and information exchange and
reduce the administrative burden in the therapy network for ambulant stroke
rehabilitation, and evaluates user feedback from the virtual test phase and pilot operation.
Our hypotheses are that the system reduces the problems that came along with the
paper-based documentation, reduces the time between discharge and the beginning of
the treatment, and achieves acceptance throughout the treatment network.
2. Methods
Based on the annual report of the Tyrol Stroke Pathway program [9] and the results of
the gap analysis, it was decided to set up an electronic, virtual communication and
documentation platform and to fully replace the paper-based documentation [10].
The first step was to analyze the current procedures, followed by a requirement
analysis. The manual for network members [10] and the set of different paper-based
forms formed the basis for the analysis. Additional requirements were identified during
regular meetings with the pathway experts of the Governmental Institute of Integrated
Healthcare (“Landesinstitut für Integrierte Versorgung”) and members of the treatment
network.
After analyzing the most important requirements, an interactive mock-up was
designed with the software “Justinmind” (www.justinmind.com, San Francisco, USA)
and discussed with the stakeholders. After a few adjustments, the implementation was
realized by the AIT Austrian Institute of Technology with the existing eHealth platform
KIOLA.
Along with the electronic documentation tool, the aim was to provide information
to network members immediately and to save time in the administrative process. First,
the management of prescriptions was supported by implementing a standardized
workflow for the prescription in the outpatient rehabilitation, followed by the approval of
the health insurance. The first follow-up prescription is approved by a general practitioner
and the second by the neurologist, after determining the mRS (modified Rankin
Scale) in a so-called 3-month assessment.
A key indicator for quality in the treatment pathway is the time between discharge
and the beginning of the outpatient rehabilitation, indicated by the first therapy session.
Due to financial regulations, it is important that the geographically nearest therapist
attends the patient. To guarantee quick organization of an interdisciplinary team, a tool
based on Google Maps was implemented. It allows the nearest therapists to be identified
and contacted; the Google Maps Geometry Library [11] was used for the calculation of
the distances and the Places Library [12] for address search with geographic coordinates.
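The tool itself relies on the Google Maps JavaScript libraries; as a language-neutral illustration of the underlying idea, the following Python sketch ranks therapists by great-circle distance using the haversine formula. All names and the distance measure are our assumptions, not the production code:

from math import radians, sin, cos, asin, sqrt

# Illustrative only: the production tool uses the Google Maps Geometry and
# Places libraries; here the distance is approximated on a sphere.
def haversine_km(lat1, lon1, lat2, lon2):
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))  # mean Earth radius in km

def nearest_therapists(patient, therapists, n=3):
    # patient: (lat, lon); therapists: list of (name, lat, lon) tuples
    return sorted(therapists,
                  key=lambda t: haversine_km(patient[0], patient[1], t[1], t[2]))[:n]

# Example with approximate coordinates in western Tyrol
print(nearest_therapists((47.16, 10.59),
                         [("Therapist A", 47.19, 10.57), ("Therapist B", 47.26, 11.39)]))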
A central element of the treatment network in the outpatient care is the interdisciplinary
meeting between the therapists (physiotherapist, logopedist, occupational therapist).
This meeting serves to plan and coordinate the treatment within the interdisciplinary
team of therapists. The challenge in the interdisciplinary documentation process is to
ensure that every therapist is able to create treatment goals and to evaluate these goals
after the treatment process. In addition to the interdisciplinary meeting, an outcome
check is performed with SINGER (Scores of Independence for Neurologic and Geriatric
Rehabilitation, https://www.singer-assessment.de/). The results of the SINGER
assessment are uploaded and stored as PDF documents.
Evaluation of the novel IT service was planned in two steps. First, a virtual test
phase with ten dummy data sets representing patient cases in the outpatient rehabilitation
was performed. All relevant system roles were used by key stakeholders to test the
documentation and process quality. Second, a pilot operation of the system in real
use in the western part of Tyrol was set up for a period of three months. For the
evaluation of the pilot phase, the authors will use the Information System Success Model
Survey. This survey is based on the DeLone & McLean Information System Success
Model [13]. It consists of questions to assess the six dimensions: information quality,
system quality, service quality, intention to use, user satisfaction and net benefits. The
instrument also contains open questions on benefits and possibilities for improvement.
The questionnaire has been adapted for this evaluation but is not formally validated.
3. Results
Although the treatment path in the state of Tyrol had been in use for several years, the
status and requirements analysis showed that the paper-based documentation of the
ambulant rehabilitation needed to be replaced. The key element of the paper-based system
is a patient folder which includes different administrative, medical and therapeutic
information. The folder was used by all health care professionals to document the results
of therapy sessions together with administrative information.
The patient folder is kept by the patients so that it is readily available to all therapists
during home visits and to the physicians during office visits. During the requirements
analysis it became evident that the current paper-based system tends to cause errors
(e.g. the patient folder might get lost) and is time consuming, especially in the
organization of appointments and the evaluation of the processes. Potential for
useful and time-saving functionalities was identified:
• Communication and coordination between health care professionals
• Assignment of therapists in close vicinity to the patients
• Enabling an easier approval process by the responsible health insurance companies
• Enabling a correct billing process for the therapists with the health insurance companies
• Supporting a transparent execution and documentation of the SINGER assessment
Additionally, regional discrepancies in the processes were identified; for example, the
organization of the prescription is managed differently across regions.
The outpatient rehabilitation workflow is shown in Figure 1; it has not changed with
the implementation of the electronic documentation. Further details about the outpatient
rehabilitation workflow are described in [10].
The central elements in the electronic documentation tool are the patient list
combined with a list of tasks as well as a checklist for every individual patient. The
checklist is adapted for every role (discharge manager, local coordinator, therapist) in
the system.
On the left side of Figure 2 an extract of the checklist for the local coordinator is shown.
The comprehensive checklist is divided into eight sections:
• Registration
• First prescription
• Network Management
• Interdisciplinary Meeting
• Therapy goals
• Documents and Information
• First follow up prescription
• Second follow up prescription
Figure 2. Google Maps based organization tool for identifying and assigning the nearest therapist.
The checklist is linked with compulsory entry fields and shows green buttons
whenever a process step (e.g. entry of the discharge date) is fulfilled. Notifications via
e-mail, including a direct link to the required task, are triggered via the electronic
documentation tool in order to ensure fast process handling.
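As an illustration of the checklist mechanics described above, the following Python sketch marks a process step green once all of its compulsory fields are filled and triggers a notification for the next open task. All names, fields and the notification hook are hypothetical:

# Hypothetical sketch of the checklist/notification logic.
checklist = [
    ("Registration", {"discharge_date": "2019-02-01"}),
    ("First prescription", {"prescription_id": None}),  # still open
]

def step_fulfilled(fields):
    return all(value is not None for value in fields.values())

for step, fields in checklist:
    if step_fulfilled(fields):
        print(step, "-> green button")
    else:
        print(step, "-> open; e-mail with a direct link to the task would be sent")
        # send_notification(responsible_role, task_link(step))  # hypothetical hook
        break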
During the discharge process, the local coordinator is responsible for organizing the
therapeutic network close to the patient’s home. In Figure 2, the organization tool for the
therapeutic network is shown. Based on the patient’s main residence, the nearest
therapists are shown on the map and can be contacted.
In December 2018, we started the virtual test phase. Based on the evaluation of the test
phase, we identified potential for changes. The accessibility of documentation made by
therapists was changed to avoid parallel usage of other communication channels,
for example WhatsApp or e-mail. The second request concerned the return of the
approved prescription by the health insurance; a Fax2Mail solution is planned, since the
processes in the health insurance do not allow an integrated solution. High
acceptance was achieved in the test phase, which also became apparent in the high
participation rate in the training workshop for the pilot phase, with over 60
participants.
In January 2019, the training workshops started for all participating partners in the pilot
region of Landeck and Imst, which are located in the western part of Tyrol. All
participants signed the terms of use to take part in the pilot phase, which was planned for
a duration of 3 months with up to 30 patients. The first patient was registered on the 1st of
February at the hospital in Zams. As part of an accompanying evaluation, user surveys
are being carried out according to the Information System Success Model [13]. First
results of the evaluation will be ready to be presented at the dHealth conference at the
end of May 2019.
4. Discussion
To sum up, this paper has highlighted the implementation process of an electronic
communication and documentation platform in the outpatient rehabilitation of stroke
patients. Additionally, an evaluation through a virtual test phase and a pilot phase was
performed. Especially the interdisciplinary character of the outpatient stroke
rehabilitation workflow and the independent character of the network members were
challenging factors in the realization. The implemented eHealth service for integrated care
and outpatient rehabilitation was the first step in the shift from paper-based documentation
to an integrated eHealth service.
References
Abstract. Background: Reuse of EHR data for selecting patients who are eligible
for clinical research can substantially improve the recruitment process. ART-
DECOR is an open-source tool that is commonly used to design and publish HL7
V3 templates of national (e.g. ELGA) and international EHR initiatives. Objectives:
Extend ART-DECOR to allow the definition of criteria that may be used for patient
selection. Methods: Using the native ART-DECOR development framework we
extended existing ART-DECOR template associations by allowing conditions to be
formulated. Results: An editor for the specification of conditions was implemented.
The resulting criteria are internally translated to XPath expressions and can be
immediately applied to CDA documents. As a prototypical application of our
approach we implemented a “Trial Criteria Evaluator” tool that allows trial
eligibility criteria to be composed of our ART-DECOR criteria and have them
checked against a patient’s CDA documents. Conclusion: Referring to HL7
templates, our criteria can be applied to documents of national EHR systems such
as ELGA and hereby reach a broad patient cohort. Implementing our approach
within ART-DECOR facilitates its reuse and enhancement by other researchers.
1. Introduction
According to a recent WHO study, almost every second member state of the EU already
has an operative national electronic health record (EHR) system in place [1]. In Austria,
the ELGA system [2] was started in 2015 and aims to finalize the rollout phase in the
outpatient sector this year.
Documents stored in EHR systems are frequently formatted according to the HL7
Clinical Document Architecture (CDA) standard [3]. Since the CDA model is very
generic, HL7 V3 templates [4] are used to specify the structure and content of particular
document types. ART-DECOR (https://art-decor.org) is an open-source tool and
methodology that is commonly used to design and publish HL7 V3 templates of EHR
systems. Templates are available for the ELGA CDA document types (http://elga.art-decor.org/)
and for various other international EHR initiatives
(https://art-decor.org/decor/services/Statistics?list=bbrs).
1 Corresponding Author: Simon Ott, Section for Medical Information Management, Center for Medical
Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Spitalgasse 23, A-1090 Vienna,
Austria, E-Mail: [email protected]
Routine data recorded in EHRs have become an interesting source for clinical
research [5]. A typical step in a clinical research project is the identification of a patient
cohort with particular characteristics. These may, for example, be patients with a particular
diagnosis and age who would be eligible for a particular clinical trial, or patients who
received a particular treatment and should be interviewed in the course of an outcome
study. An automatic identification of patients based on their EHR data [6] could
substantially reduce the high effort and rate of missed patients that typically characterize
manual patient recruitment [7].
ART-DECOR allows the definition of high-level information needs, such as ‘age’
or ‘diagnosis’, and their mapping to a component of a CDA template that holds the
corresponding data. These so-called ART-DECOR “concepts” and “template
associations” can be stored in individual ART-DECOR project files. Currently, they are
not sufficient for the definition of fine-grained criteria to select patients, as they do not
support the specification of particular conditions that the mapped template components
would have to satisfy, such as “age ≥ 6” and “diagnosis = type 1 diabetes”.
In this paper, we therefore present an extension of ART-DECOR that allows the
definition of criteria that may be used to identify patients with particular characteristics
relevant for a clinical research project. The criteria are based on ART-DECOR template
associations and thus include a reference to those components of a CDA document that
hold the source data of the criteria. The criteria are stored in the ART-DECOR project
file and can then be applied to a patient’s CDA documents to check whether they satisfy
the criteria.
2. Methods
Figure 1. Definition of concept “white blood cell count” with associated LOINC code in ART-DECOR (top).
Concept “white blood cell count” is mapped to element “hl7:observation” of template “Laboratory Observation”
that is used in ELGA CDA laboratory reports for storing laboratory data (bottom).
We can further map concept “white blood cell count” to a suitable HL7 V3 template
that refers to a particular component of a CDA document, which would hold the white
blood cell count value of a patient (see fig. 1 bottom).
For the definition of our desired criterion “elevated white blood cell count” the
following points are missing:
• Specification of conditions (e.g., we consider the white blood cell count to be
elevated if the observation’s “hl7:interpretationCode” element holds code “H”
or “HH”)
• Detailing of template associations if a concept is mapped to a generic template
(e.g., in fig. 2 the generic template “Laboratory Observation” will only hold a
white blood cell count value if the observation’s “hl7:code” element holds the
LOINC code “26464-8” for Leucocytes)
• Executing the criterion against a CDA document to check whether the
contained data satisfy the criterion
For implementing the first two points, we decided to develop an editor that allows
the specification of conditions for an ART-DECOR template association. For the third
point, we added a testing component to ART-DECOR that allows the specified criteria
to be immediately executed against an uploaded CDA document. All extensions were to
be implemented within the native ART-DECOR development frameworks (i.e. Orbeon
Forms and eXist-db) so that they can easily be integrated into an existing ART-DECOR
environment using the ART-DECOR package manager.
In order to demonstrate a potential prototypical application of our extension, we
planned to develop a “Trial Criteria Evaluator” tool. It should allow users to evaluate
whether a patient may be a potential candidate for a clinical trial based on his/her CDA
documents.
3. Results
Originating from an existing template association, our extension allows the required
“implicit” and “explicit” conditions to be added to define a criterion (see fig. 2).
“hl7:templateId” can be used to create the implicit condition that concept “elevated white
blood cell count” should only refer to CDA observations that hold the OID
“1.3.6.1.4.1.19376.1.3.1.6” in their “hl7:templateId” element. Further, the terminology
association defined for concept “elevated white blood cell count” can be used to create
the implicit condition that this concept should only refer to observations that hold code
“26464-8” in their “hl7:code” element.
Explicit conditions are defined by the user by manually specifying one or more
statements and linking them by Boolean operators (see fig. 4). Each statement is
composed of an attribute of a template element, an operator and a comparison value. All
attributes, as predetermined by the template element’s datatype, and its child elements are
offered for selection. As an example, element “hl7:code” is of datatype CE (Coded with
Equivalents) and thus allows the attributes “code”, “codeSystem”, “codeSystemName”,
“codeSystemVersion”, “displayName”, and “nullFlavor” to be selected. Further, the
operators "=", "≠", "<", "≤", ">", "≥" and "IS NULL" (i.e., the value of the attribute is
empty) are available to formulate a statement.
All data concerning the defined conditions are stored as additional elements of the
corresponding <templateAssociation> component within the ART-DECOR project file.
In order to allow the specified criteria to be checked independently of our tool as well,
we automatically translate them to XPath expressions. Hereby, we logically link all implicit
and explicit conditions of a criterion with a Boolean AND operator and generate the
correct XML Schema Datatypes [8] (e.g., numerical comparison values are converted to
xs:double, values of HL7 datatype TS or any of its flavors are converted to xs:dateTime).
The expression derived from a criterion’s conditions is used as the predicate of the XPath.
The node-test of the XPath is the root element of the template or, in the case of multiple
root elements of a template, a wildcard for the parent of these elements. The generated
XPath (see fig. 7) uses the HL7-defined XML namespace for V3, “urn:hl7-org:v3” [9].
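The following Python sketch illustrates how such an XPath might be assembled; the function and the condition representation are our own simplification, not the actual ART-DECOR implementation:

# Sketch: implicit and explicit conditions are ANDed into the XPath predicate.
# The generated expression is evaluated in the HL7 V3 namespace "urn:hl7-org:v3".
def build_xpath(node_test, conditions):
    # conditions: (location, operator, value) triples, e.g.
    # ("hl7:interpretationCode/@code", "=", "H")
    predicate = " and ".join('%s %s "%s"' % (loc, op, val)
                             for loc, op, val in conditions)
    return "//%s[%s]" % (node_test, predicate)

xpath = build_xpath("hl7:observation", [
    ("hl7:templateId/@root", "=", "1.3.6.1.4.1.19376.1.3.1.6"),  # implicit
    ("hl7:code/@code", "=", "26464-8"),                          # implicit (LOINC Leucocytes)
    ("hl7:interpretationCode/@code", "=", "H"),                  # explicit
])
print(xpath)
# -> //hl7:observation[hl7:templateId/@root = "1.3.6.1.4.1.19376.1.3.1.6" and ...]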
For an immediate check of the criteria, we also implemented a testing component
within our tool. It allows the user to upload a CDA document and execute the generated
XPaths against the document. All components of the CDA document that are found by
means of the XPath, and thus satisfy the criteria, are displayed.
Figure 5. Trial Criteria Evaluator. Criteria defined in ART-DECOR project “Test” were used to compose four
trial criteria of a diabetes trial (https://clinicaltrials.gov/ct2/show/NCT01390480). Three CDA documents were
uploaded to be checked against the trial criteria.
The “Trial Criteria Evaluator” (see fig. 5) demonstrates in a prototypical way how our
extension could be applied to check a patient’s eligibility for a clinical trial.
Being an Orbeon Forms application, it can be added to an existing ART-DECOR
environment by installing the corresponding package in the eXist-db. As it is completely
independent of ART-DECOR itself, it may also be installed as a standalone version
without ART-DECOR. It allows the user to load the criteria that he/she defined within
ART-DECOR and combine them with Boolean operators to form complex trial inclusion
and exclusion criteria.
Figure 6. Result screen. At the top, the total result and an overview of the satisfied inclusion/exclusion criteria
are shown. Below, a detailed report is displayed that shows for each criterion and document whether the criterion
is satisfied or not satisfied by the document’s data, or whether the document does not contain the required data.
Inclusion criteria are depicted with a green background color, exclusion criteria with a red one.
The user can then upload a set of documents of a particular patient and evaluate
whether the patient could be a potential candidate for the trial. Hereby, all uploaded
documents are checked iteratively as to whether they satisfy one of the trial criteria. Our
evaluation is conservative insofar as we assume that the uploaded documents represent
only a subset of the patient’s complete medical history [10]. We thus have to expect that
there may be additional data that we are not aware of but that may nevertheless satisfy a
trial criterion. Consequently, the only safe assessment that our trial criteria evaluator is
capable of is to exclude a patient if the uploaded documents satisfy one or more
exclusion criteria. Otherwise, it concludes that the patient may be eligible and displays a
report of which criteria are satisfied by the uploaded documents and for which
criteria the uploaded documents do not contain the required data (see fig. 6).
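A minimal sketch of this conservative decision logic, under the assumption that the criteria have already been translated to XPath and that a helper matches(document, xpath) executes them (both the helper and all names are hypothetical):

# Sketch of the conservative assessment: a patient is only definitively
# excluded if some uploaded document satisfies an exclusion criterion.
def assess(documents, inclusion_xpaths, exclusion_xpaths, matches):
    for doc in documents:
        if any(matches(doc, xp) for xp in exclusion_xpaths):
            return "excluded"
    satisfied = [xp for xp in inclusion_xpaths
                 if any(matches(doc, xp) for doc in documents)]
    # Unsatisfied inclusion criteria may still hold in reality: the uploaded
    # documents are assumed to be only a subset of the medical history.
    return "may be eligible (%d of %d inclusion criteria satisfied)" % (
        len(satisfied), len(inclusion_xpaths))

# 'matches' could, for instance, wrap an XPath evaluation with the
# urn:hl7-org:v3 namespace (e.g. via lxml); this is left open here.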
For each criterion the corresponding generated XPath can be displayed (see fig. 7).
4. Discussion
The work presented in this paper is the result of an ongoing bachelor thesis of the first
author and is thus of a preliminary nature. The final results will be presented at the
conference. We further plan to make our extensions of ART-DECOR publicly available
as open-source code by then.
Various suggestions have been made to automate the identification of patient cohorts
based on EHR data [11]. These approaches typically refer to institutional EHR systems
with proprietary data models. This limits the number of patients that can be addressed
and requires individual mappings of the selection criteria to the data model of each single
EHR system. Fernandez-Breis et al. suggest mapping the selection criteria to standardized
EHR data defined by means of openEHR archetypes [12]. Our approach is similar but is
based on the more prevalent HL7 CDA standard and associated HL7 V3 templates. We
further implemented our method within the open-source tool ART-DECOR that is used
within several national EHR system initiatives. The EHR4CR platform [13] uses a
distributed architecture, where trial criteria can be defined at a central server and
transmitted to clinical data warehouses (CDW) of participating hospitals. It requires
individual mappings of the criteria expressed in the central ECLECTIC syntax [14] to
the data models of each single CDW. A similar approach is pursued by SHRINE [15],
which allows queries to be distributed to CDWs that are based on the i2b2 [16] model.
As a prerequisite the participating CDWs have to support common i2b2 ontologies.
Compared to earlier projects, the first main contribution of our work is that our
criteria directly refer to elements of HL7 CDA documents and hereby make use of the
knowledge of the CDA structure as specified within HL7 V3 templates. They may thus
be applied to the document types of national EHR systems such as ELGA and hereby
reach a broad patient cohort. Our second main contribution is that we implemented our
approach as an extension of the open-source tool ART-DECOR, which is widely used in
the course of (inter)national EHR initiatives. This facilitates reuse and further
enhancement of our work by other researchers.
Several alternatives exist for the expression and execution of criteria, such as Arden
Syntax or SNOMED CT. Applying ontology-based semantic knowledge [17] in the
processing of criteria would also be a useful extension of our work, e.g. when searching
for patients with different types of lung diseases. A comparison with these alternative
methods is beyond the scope of the present paper but is planned for a more elaborate
version of this work.
References
1. Introduction
1.1. Background
New technologies and increasing digitalization are leading to huge amounts of
data being available in the health and care sector. Affordable and portable measuring
devices also enable continuous monitoring of patients at home in telehealth settings. In
order to really benefit from all these data, computer-aided processing methods are
increasingly applied; in particular, machine learning approaches are currently being
considered in many healthcare areas [1][2][3].
While technological progress offers promising possibilities, fully automatic analysis
and modelling approaches are prone to be incomprehensible solutions. To date, human
involvement is still an essential part of the modelling process [4][5]. On the one hand,
specialists need to contribute their knowledge and skills during the development process
of predictive models. On the other hand, the presentation of modelling results needs to be
comprehensible and understandable for humans in order to achieve legitimacy and
acceptance of such systems. Therefore, interactive visual analytics tools are needed to
fuse human intelligence with computational processing power to achieve optimum
results [6][7][8].
1 Corresponding Author: Michael Sams, Institute of Neural Engineering, Graz University of Technology,
AIT Austrian Institute of Technology, Reininghausstraße 13/1, 8020 Graz, Austria, E-Mail:
[email protected]
Figure 1. Two-level process of data driven decision support in health and care [9].
normal / pathological), evaluation outcome (e.g. a false positive case), and the
assessment of the relevance of a given feature [11].
However, although the ECG viewer extension is highly suitable for ECG visualization
and classification, PATH lacks tools for supporting the predictive modelling process in
other domains, such as the telehealth domain.
1.3. Objectives
The aim of the present work was to develop and implement a tool that supports data
scientists in developing predictive models in the telehealth domain. This tool had to be
developed on the basis of the predictive toolset PATH, which was introduced in section
1.2. The tool had to be integrated into the process of model development, which
corresponds to the inner cycle of Figure 1. It had to be designed as an interactive
visualization tool, allowing users to gain better insights during the model development
process, which provides the basis for improvements and model optimization.
To this end, it was necessary to investigate which design and functionality
requirements such a tool should have in order to support the workflow of data scientists
and to cope with the special characteristics of telehealth data. Based on these
requirements, a viewer was designed and subsequently implemented in MATLAB.
2. Methods
2.1. Dataset
The development of the viewer was based on a test dataset consistent with the data model
of the heart failure disease management program 'HerzMobil Tirol' [12]. In addition to
demographic data of the patients and measurements taken by the patients at home (i.e.
body weight, heart rate and blood pressure), the test dataset also included clinical notes
of healthcare professionals, information about medication compliance and prescriptions,
as well as information on hospital admissions.
To suit the development of various functionalities, the test dataset was extended
by artificially adding ECG recordings to obtain a further data level. Thus, for selected
patients, a time series of ECG measurements was added to the existing data.
2.2. Requirements Analysis
In order to get a comprehensive basis for the requirements analysis with respect to the
special characteristics of telehealth data, first the general characteristics of health data
were examined in detail. Subsequently, this basis was extended and refined with
experiences from the concrete telehealth use case of the 'HerzMobil Tirol' program.
The procedure for gathering the viewer requirements regarding the development of
predictive models was similar. In addition to theoretical considerations, the existing
workflow of the AIT data scientists was analysed in detail, and their previous experiences
regarding the development of predictive models were considered. There was also an
in-depth exchange about expectations towards such a tool and its capabilities.
On the basis of these surveys, the essential elements and functionalities of such an
interactive visualization tool were identified.
2.3. Implementation
The implementation of the viewer was carried out in MATLAB R2018a (The
MathWorks, Natick, US). For the development of the graphical user interface,
MATLAB’s in-house program GUIDE was utilised [13].
As a basis for the implementation, the ‘ECG viewer’ framework of the CinC 2017
challenge, described in section 1.2, was used. In the course of this work, this
framework was generalized using the 'HerzMobil Tirol' telehealth data model. The
verification of the generalised tool was then carried out with a de-identified dataset from
the research project EPICURE, containing records of ergometric performance tests
conducted by a cohort of cardiac rehabilitation patients.
3. Results
3.1. Requirements
An important aspect of telehealth data is the variety of possible data types. The data can
include a variety of measurements, various medical events (e.g. hospitalization,
procedures, medications) as well as unstructured clinical notes. Commonly, the focus of
interest lies on the progression of a patient's health over time; thus, another essential
aspect is the temporal characteristic of the data, where temporal patterns are of particular
interest. Based on the gained knowledge concerning the characteristics of telehealth data
and the workflow of data scientists, the following essential elements and functionalities
of a visualization tool that supports the model development process were identified
(Figure 2).
Due to the temporal characteristic of telehealth data, an appropriate time series
visualization constitutes a core element of the requirements. Time series data typically
have to be pre-processed or transformed before they can be used in the modelling process.
Therefore, it was necessary to support interaction between the viewer and the signal
processing functionality of the software. Selection among the different available signal
processing algorithms should be supported, along with the possibility to launch them
directly from the user interface and to automatically reload all affected viewer elements
to keep the visualizations up to date. This enables a convenient environment for testing
different algorithms and for analysing different outcomes.
Another key aspect of the modelling process are the so-called “features”. These are
specific attributes or properties of the data and are essential to the predictive model (e.g.
classification and regression trees). To keep a good overview of the current modelling
process, all features should be at hand, which can, for example, be realised by presenting
them as a list.
When it comes to the visualization of the model results of a single patient, the pure
listing of numerical values is not sufficient. The model result should be presented in an
intuitive and comprehensive way in order to instantly provide the required information
for further model improvement. For example, a threshold violation could be indicated by
color coding, depending on whether it is a positive or negative outcome. Another
important aspect is that, if the modelling results relate to a certain time interval (e.g.
weekly intervals), the time-synchronous relation to the underlying raw data must be
preserved.
Typical datasets for predictive modelling consist of several data levels. Along with
raw data from measuring devices, derived data or any other kind of supplementary
information on a measurement can be available. For example, a time series of heart rate
values could have been derived from ECG recordings. Thus, while initially the actual
heart rate values are shown, there should be a functionality to go further into detail and
take a closer look at the underlying biosignal, i.e. the ECG. Other examples would be
various biosignals, lab reports or even imaging data. Such a functionality was
implemented as a so-called ‘drill-down’ functionality and allows the user to obtain a
higher level of detail.
Another important part of the predictive modelling process is the overall model
evaluation. However, the development of a predictive model does not follow a specific
path, e.g. from raw data to modelling result, but is an interactive, iterative process
of understanding all the aspects of a given scenario and the resulting modelling outcomes.
The main challenges are understanding why a given result has been obtained and
finding potential issues which might lead to errors or unsatisfactory effects. Therefore,
it should be possible to launch the viewer directly out of the evaluation process (e.g. by
choosing an interesting cell of a confusion matrix), presenting just the chosen
sub-selection of cases (e.g. the false positive cases only), in order to then allow a
closer look at the underlying data, measurements and the features that were used for the
model, thereby enabling and supporting the process of gaining insights.
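As a sketch of this launch-from-evaluation idea, selecting a confusion-matrix cell amounts to filtering the evaluated cases; the viewer entry point and all names are hypothetical, and the actual tool is implemented in MATLAB:

# Sketch: pick the cases belonging to one confusion-matrix cell, e.g. the
# false positives (actual = 0, predicted = 1), and hand them to the viewer.
y_true = [0, 1, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0]

def cell_cases(y_true, y_pred, actual, predicted):
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred))
            if t == actual and p == predicted]

false_positives = cell_cases(y_true, y_pred, actual=0, predicted=1)
print(false_positives)  # -> [0, 3]
# launch_viewer(case_ids=false_positives)  # hypothetical viewer call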
3.2. Implementation
Figure 3 shows the interface of the ‘PATHviewer’ with its main elements.
Figure 3. MATLAB-based PATHviewer. 1: feature table, 2: modelling result panel, 3: time series panel, 4: drill-down panel
In the feature table (1), all the features of the currently selected patient are listed.
The modelling result panel (2) presents the model output. In this example, the
prediction concerned whether an admission to a hospital will happen for the respective
patient or not. Time is plotted along the x-axis, in this case in 7-day intervals. For each of
these intervals, the predicted probability of a hospital admission is visualized by a bar of
corresponding height. The red dotted line represents a threshold: if the predicted
probability exceeds the threshold, the model indicates that there will be an admission. An
actual admission to a hospital is indicated by a grey background of the respective week.
The color of the bars indicates whether the prediction was true (green) or false (red).
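The color logic of the result panel can be summarized by the following sketch; the data and names are illustrative, and the real panel is rendered in MATLAB:

# Sketch: weekly admission probabilities vs. a threshold; bar color encodes
# whether the thresholded prediction agrees with the actual admission.
THRESHOLD = 0.5                            # red dotted line in the panel
probabilities = [0.12, 0.35, 0.81, 0.64]   # one value per 7-day interval
admissions = [False, False, True, False]   # grey background if True

for prob, admitted in zip(probabilities, admissions):
    predicted = prob > THRESHOLD
    color = "green" if predicted == admitted else "red"
    print("p=%.2f predicted=%s actual=%s bar=%s" % (prob, predicted, admitted, color))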
The time series panel (3) is dedicated to the visualization of time series data. This
panel is implemented in such a way that it is user-configurable. A configuration menu
allows the user to define the number of axes and their size. In a separate step, it is then
possible to decide via a context menu which parameters should be plotted on which axis.
A data cursor functionality can be used to display detailed information on the data points.
This data cursor function can also be used to display text information, such as clinical
notes.
The purpose of the drill-down panel (4) is to allow the user to take a deeper look
into the underlying data. If there is a time series of measurements or events for which an
underlying data level exists, the individual data points can be selected via mouse click.
The layout and content of the drill-down panel then adapt to the respective measurement
or event type. In the example shown in Figure 3, it is an ECG measurement
with the corresponding signal analysis.
There are two possibilities for the start-up procedure. On the one hand, the viewer
can be called via the MATLAB 'Command Window', where all existing data can then
be loaded interactively into the viewer. On the other hand, after a model run, it is possible
to load the viewer directly from the evaluation graphics (e.g. by clicking onto a cell of
the confusion matrix), with a specific subselection of cases (e.g. false positive cases).
4. Discussion
The concept development of a tool that supports data scientists in the development of
predictive models in the telehealth domain revealed many aspects that have to be
considered. A major challenge in terms of the amount and variety of available data is
finding the right balance. On the one hand, a comprehensive overall view of the situation
should be available, providing simultaneous visualization across several data levels. On
the other hand, care must also be taken to ensure that there is no overload of information
that overwhelms the user.
The developed tool takes the former aspect into account by simultaneously
presenting the model result, the underlying time series data as well as the features used
for modelling on a single interface. The color-coded representation of the modelling
results enables a quick overview of the model outcomes. Furthermore, the horizontal,
time-synchronous alignment of the time series directly underneath each other illustrates
the temporal relationship between the model output and the progression of the raw data.
This is of particular importance in order to gain a deeper understanding of the model
outcomes and to formulate new hypotheses.
The second aspect, prevention of information overload, is ensured by the
configurability of the viewer and the concept of 'details on demand'. The user has the
possibility to adapt the time series visualization according to his needs, to retrieve
detailed information by means of interactions, and to go one step deeper into the data
with the help of drill-down procedures.
Although the presented viewer was developed on the basis of data from a heart
failure disease management program, the basic concept and framework of the tool can
also be applied to a number of other use cases in the health and care sector. Proof of
principle for these capabilities has been obtained by applying the viewer to an analysis set
of rehabilitation ergometry data. It is a limitation, however, that a certain amount of
changes in the code of the viewer was required to adapt it to new application areas. For an
even more generic solution of the viewer, further developments will be necessary.
Nevertheless, even at the current stage of development, the concept and its concrete
implementation make an important contribution to gaining a deeper understanding of the
model development process. It enables insights that can only be achieved by
simultaneously viewing the various levels of data for all development steps. This is
essential for the optimization process of such models in order to improve their
performance. Furthermore, this insight and a deeper understanding are essential to
make the model outcome comprehensible and explainable, which is critical for the
acceptance of data-driven decision support systems in real world applications.
Acknowledgement
This work was partly funded by the Austrian Research Promotion Agency (FFG) as part
of the project EPICURE under grant agreement 14270859.
References
[1] T. B. Murdoch and A. S. Detsky, “The Inevitable Application of Big Data to Health Care,” JAMA,
vol. 309, no. 13, pp. 1351–1352, 2013.
[2] J. Hu, A. Perer, and F. Wang, “Data Driven Analytics for Personalized Healthcare,” in Healthcare
Information Management Systems: Cases, Strategies, and Solutions, C. A. Weaver, M. J. Ball, G. R.
Kim, and J. M. Kiel, Eds. Cham: Springer International Publishing, 2016, pp. 529–554.
[3] D. Gotz and D. Borland, “Data-Driven Healthcare: Challenges and Opportunities for Interactive
Visualization,” IEEE Comput. Graph. Appl., vol. 36, no. 3, pp. 90–96, 2016.
[4] D. A. Keim, F. Mansmann, J. Schneidewind, J. Thomas, and H. Ziegler, “Visual Analytics: Scope
and Challenges,” in Visual Data Mining: Theory, Techniques and Tools for Visual Analytics, S. J.
Simoff, M. H. Böhlen, and A. Mazeika, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008,
pp. 76–90.
[5] A. Holzinger, “Interactive machine learning for health informatics: when do we need the human-in-
the-loop?,” Brain Informatics, vol. 3, no. 2, pp. 119–131, Jun. 2016.
[6] D. Keim, J. Kohlhammer, G. Ellis, and F. Mansmann, Mastering the Information Age. 2010.
[7] D. Keim, G. Andrienko, J.-D. Fekete, C. Görg, J. Kohlhammer, and G. Melançon, Visual Analytics:
Definition, Process, and Challenges. 2008.
[8] S. Liu, X. Wang, M. Liu, and J. Zhu, “Towards better analysis of machine learning models: A visual
analytics perspective,” Vis. Informatics, vol. 1, no. 1, pp. 48–56, 2017.
[9] D. Hayn et al., “Predictive analytics for data driven decision support in health and care,” it -
Information Technology, vol. 60. pp. 183–194, 2018.
[10] G. D. Clifford et al., “AF Classification from a Short Single Lead ECG Recording: the
PhysioNet/Computing in Cardiology Challenge 2017,” Comput. Cardiol., vol. 44, Sep. 2017,
doi: 10.22489/CinC.2017.065-469.
[11] M. Kropf et al., “Cardiac anomaly detection based on time and frequency domain features using tree-
based classifiers,” Physiol. Meas., vol. 39, no. 11, p. 114001, 2018.
[12] A. von der Heidt et al., “HerzMobil Tirol network: rationale for and design of a collaborative heart failure
disease management program in Austria,” Wien. Klin. Wochenschr., vol. 126, no. 21, pp. 734–741,
Nov. 2014.
[13] The MathWorks, Inc. Matlab, https://www.mathworks.com/products/matlab.html, last accessed:
6.2.2019.
Subject Index
AAL 9
algorithms 73
appointment 33
atlas 49
Austria 178
autonomy 9
barrier-free toilet 9
biomarkers 89
blindness 170
business model 178
cardiac rehabilitation 210
care 9
child 17
ChIP-seq 105
classification 65, 89
clinical decision support systems 154
cognition 170
competency-based education 162
confidentiality 202
convolutional neural network 192
cross-institutional data exchange 33
data aggregation 49
data analysis 210
data analytics 200
data elements 25
data linkage 200
data mining 41
delirium 65
delivery of health care 49
diabetes mellitus 1
diarization 81
disease management 146
documentation 136
driveline infection 192
early medical intervention 154
echocardiography 41
eHealth 170, 218
electroencephalography 97, 113
electronic health records 65, 226
electronic medical record 41
electronic medical record systems 25
electronic prescribing 121, 128
emergencies 17
emergency hospital service 57
ENCODE 105
ergometry 210
exercise test 210
expressive language disorders 81
FHIR 33
forecasting 57
FXR 105
health care delivery 218
health information 202
health information interoperability 97
health monitoring 138
health service 218
heart assist device 192
heart failure 146
hip fracture 138
human-in-the-loop 234
human-machine interaction 178
ICPC-2 136
implementation 121
in-patient care of the elderly 178
infection classification 192
information provision 1
information system 1
integrated care 218
Iran 121
kinetics 89
latent class analysis 113
left-ventricular assist device 146
machine learning 65, 73, 81, 186
medical education 73
medical informatics 81
medical record linkage 210
mental disorders 25
minimum data set 25
mobile applications 17
model deployment 186
model stability 186
motion tracker 138
named entity recognition 41
national health programs 136
national roadmap 121
natural language processing 73
Author Index
Aghabagheri, M. 121, 128
Altenbuchner, A. 138
Ammenwerth, E. 162
Babitsch, B. 57
Bahaadinbeigy, K. 121
Baumgarten, D. 89
Baumgartner, C. 89
Boeken, U. 146
Breit, M. 89
Bürkle, T. 33
Dehghan, H. 121, 128
Denecke, K. 1, 33
Denter, M. 57
Duftschmid, G. 226
Eggerth, A. v, 186, 210, 234
Ehrenmüller, I. 178
Endel, F. 49, 200
Eslami, S. 121, 128
Etminani, K. 128
Feldmann, C. 192
Fogarassy, G. 41
Geley, T. 218
Gfeller, S. 33
Ghasemi, S.H. 121, 128
Glachs, D. 154
Grabner, V. 178
Griffin, E. 170
Güldenpfennig, F. 9
Haag, M. 17, 73
Hackl, W.O. 162
Hanser, F. 89
Hashemi, N. 25, 202
Hashemi, N.-s. 25, 202
Hasibian, M.R. 128
Haug, S. 138
Hayn, D. v, 65, 186, 210, 234
Hoffmann, J.-D. 146
Huber, M. 97
Hübner, U. 57
Huisman, S. 154
Igel, C. 17
Jahangiri, M. 121
Jansen, S. 192
Jauk, S. 65, 186
Jolo, P. 1
Jonas, S.M. 81, 113
Jung, O. 154
Jungwirth, E. 105
Kastner, P. 218
Kimiafar, K. 121
Klischies, D. 81
Kluge, T. 97
Kohlschein, C. 81
Kramer, D. 65, 186
Kreiner, K. 210
Kriegel, J. 178
Kutafina, E. 113
Kyburz, P. 33
Leodolter, W. 65, 186
Lüneburg, N. 192
Machalik, K. 41
Marschall, H.-U. 105
Mayer, P. 9
Messer-Misak, K. 136
Modre-Osprian, R. 210
Morshuis, M. 146
Mulrenin, R. 154
Namayandeh, S.M. 121, 128
Netzer, M. 89
Nüssli, S. 1
Ott, S. 226
Panek, P. 9
Panzitt, K. 105
Picinali, L. 170
Ploessnig, M. 154
Quehenberger, F. 65, 186
Rauch, J. 57
Rawassizadeh, R. 25
Reiss, N. 146, 192
Reiswich, A. 73
Reiter, K. 218
Rinner, C. 226
Rippinger, C. 49
Rissbacher, C. 218
Runge, J. 218
Saberi, M.R. 128