CIE - Paper - AICS - 2023 - FineTuneIt - BHartmann - Example Paper
Abstract—This paper presents a system to manage, train, and optimize Named Entity Recognition models using Cloud resources, named CRM4NER. This system explores features to perform model fine-tuning using automatic parameter searches and context-aware fine-tuning recommendations using a Large Language Model. A goal is to support domain experts in the difficult task of training and fine-tuning Machine Learning models, which often requires expertise that domain experts are lacking. The system also includes functionalities to manage models and training data for the entire usage life-cycle and is integrated in a knowledge management system. Through providing Cloud-based storage and training, domain users are further supported through scalable compute and storage for advanced Machine Learning workloads such as GPU-based Transformer models for improved performance.

Index Terms—Cloud Resource Management, Deep Learning, Named Entity Recognition, Transformer, Cloud Computing, Microservice Architecture

979-8-3503-6021-9/23/$31.00 ©2023 IEEE

I. INTRODUCTION AND MOTIVATION

Named Entity Recognition (NER) is a natural language processing technique that identifies and classifies entities, such as names of people, organizations, and locations, within a text [28]. This paper addresses the development of a microservice-based application to manage Amazon Web Services (AWS) Cloud resources for NER model training. It also involves integrating the application into a medical knowledge management system. Electronic Health Records (EHRs) contain textual data about patients such as treatment history, past conditions, and biographical information. The introduction of EHRs in the medical domain leads to positive developments in patient care [10]. EHRs can improve the productivity of operations through reduced administrative overhead and improve care quality through improving patient safety [10]. However, the rapid adoption of this technology leads to an increased volume of data [39]. This volume also leads to Information Overload (IO) [27]. In IO, the complexity and abundance of information and data overwhelm the user and can lead to lower decision speed and quality [27]. For example, a physician may have access to a large amount of EHR data and may not be able to review this information effectively [33]. The field of Information Retrieval (IR) attempts to address the challenges of IO. However, with traditional IR systems such as search engines, it gets increasingly difficult to solve IO challenges [44]. NER constitutes a vital component of Natural Language Processing (NLP) and offers support to IR systems [44]. Named Entities (NE) encompass textual elements that can be categorized into specific classes, such as individuals, locations, or organizations, with the particular categories depending on the application context [29]. In the domain of medicine, entities may include drugs, medical conditions, or frequency descriptors [18]. NER leverages Machine Learning (ML) methods [44] to process these entities. The data processed by NER systems can be used to address challenges such as IO. In the medical domain, decision support systems and data analytics [25] could use this data to improve treatment decisions and care quality. ML systems often require large amounts of compute resources and data [15], [43], [46]. Cloud Computing (CC) provides techniques to provision resources [26], [42]. It enables users to run scalable compute services without the need to manage infrastructure [13]. CC may be used to complement NER and can solve challenges such as storing and processing the large amounts of data required for NER [43]. Recent scientific developments such as Transformers [45] have led to exceptional results in NLP, including NER [28]. Furthermore, the recent release of the GPT-4 model demonstrates impressive performance on different language tasks and a basis for further applications using Large Language Models (LLMs) [36].

We will now motivate our work by introducing relevant research and development associated with the challenges of applying NER in the medical domain. Artificial Intelligence for Hospitals, Healthcare and Humanity (AI4H3) [22] presents an architectural proposal designed to support medical domain experts through the integration of Artificial Intelligence (AI) technology. The Content and Knowledge Management Ecosystem Portal (KM-EP), developed at the University of Hagen within the Chair of Multimedia and Internet Applications, adopts the platform architecture of AI4H3. KM-EP has previously been employed in projects like MetaPlat [48] and SenseCare [16]. This paper's objective is to seamlessly integrate and further develop a cloud-based NER application into KM-EP. The Framework Independent Toolkit for Named Entity Recognition (FIT4NER) [19] extends the AI4H3 architecture, focusing on supporting medical experts in NER tasks. Cloud-based Information Extraction (CIE) [43] extends FIT4NER by offering Cloud infrastructure for NER tasks. Our system is built upon the CIE architecture. The initial implementation documented in [21] covered only a subset of the use cases outlined in CIE and lacked integration with KM-EP. This earlier implementation revealed several functional deficiencies that should be solved. Integrating these systems would promote usability and collaboration on a shared platform. ML model Hyperparameters (HPs) play a crucial role in controlling the model learning process [12]. Novice users may face challenges in fine-tuning Transformer models due to the high technical and conceptual knowledge required for configuring and selecting appropriate HPs. They may also lack the knowledge to configure Cloud resources for NER models on a Cloud provider like AWS. This leads to the following Research Questions (RQs): RQ1: How can a system to manage AWS Cloud resources for NER be expanded and further improved? RQ2: How can users be better supported in fine-tuning Transformer models? RQ3: How can the system be integrated into a Knowledge Management System such as KM-EP?

To address these RQs, we employ the well-established Nunamaker methodology [35], a systematic approach to developing information systems. This methodology involves delineating specific Research Objectives (ROs) for each RQ, spanning observation, theory building, implementation, and evaluation stages. Our defined Research Objectives (ROs) are categorized as follows: a) Observation: Review key concepts of NER, Transformers, and Cloud technology, and investigate existing fine-tuning strategies and best practices. Analyze relevant KM-EP components for expansion. b) Theory Building: Develop models to address deficiencies identified in the initial implementation [21]. Enhance our previous model, including use cases, components, and overall architecture. c) Implementation: Realize NER training system functionality. Implement new GUI and backend components. Seamlessly integrate into KM-EP. d) Evaluation: Assess enhanced system performance. Evaluate new optimization components. Examine system integration within KM-EP.

In this section we motivated our work, introduced related research, and defined a structured approach to our work. In the next section we will review the relevant state of the art, including NER, fine-tuning strategies, and technologies associated with our system integration in KM-EP.

II. STATE OF THE ART IN SCIENCE AND TECHNOLOGY

NER Techniques: NER is a common method used in IR systems to address Information Extraction (IE) [11]. The term "Named Entity Recognition" was coined at the Sixth Message Understanding Conference (MUC) in 1996 [20]. NER is a subfield of Natural Language Processing (NLP) that focuses on identifying entities in unstructured text data. NER techniques have evolved from rule-based processing and gazetteers [28], [32] to unsupervised ML approaches like clustering [28], [32], [50]. Common supervised NER machine learning techniques include Hidden Markov Models, Decision Trees, Support Vector Machines, and Conditional Random Fields [28], [38]. Recent years have seen the dominance of Deep Learning-based NER models, particularly those using Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) architectures [28], [45]. Transformers, introduced by Vaswani et al. [45], have emerged as state-of-the-art models in NER. The NER on CoNLL 2003 (English) leaderboard ranks NER model performance by F1 score [30]. According to the May 2023 leaderboard, the top 10 models exclusively use Transformer architectures or hybrid Transformer approaches. Due to the rapid advancements in NER techniques, selecting the optimal method is challenging. Consequently, we have opted to implement Transformers in our improved system, given their strong performance in current research.

NER Frameworks & Technologies: NER is commonly implemented using software frameworks. In this paper, we have chosen the spaCy framework [3], which we successfully applied in our previous system [21]. spaCy, an open-source framework [4], was developed in 2015 by Honnibal and Montani at Explosion [23]. It is a popular NLP framework used in various NER studies [41] and production environments [6], [7]. Hugging Face (HF) [1] is a Software as a Service (SaaS) provider specialized in ML tasks. HF offers a public Transformer library [1] for creating, training, and using Transformer models. spaCy features an HF integration [17], allowing users to use HF models as regular pipeline components for NLP tasks.

Hyperparameter Optimization Strategies: When selecting model HPs, users face several challenges. First, the effective selection of HPs requires significant ML domain expertise [9]. Second, the definitive effect of HPs on model performance is difficult to predict, which may require trial and error [24]. Third, there is a large number of parameters and combinations, making it difficult to select HPs [9]. We introduce several strategies for optimizing model HPs that can support users in addressing these challenges:
Providing well-working default parameters (S1) [9]: Offering default parameters that have demonstrated strong performance in various applications and research studies, simplifying user choices.

Indicative scoping through useful context information (S2) [14]: Supporting users in fine-tuning hyperparameters by providing informative descriptions and interactive GUI elements, enhancing user understanding.

Meta-Learning (S3) [24]: Meta-Learning is a sub-field of ML in which the goal is to optimize the process of learning itself [24]. Hospedales et al. mention Bayesian Meta-Learning for the optimization of HPs through parameter search. Weights & Biases (W&B) is a Cloud service provider that provides utilities for parameter search through Parameter Sweeps. W&B offers Bayesian parameter search and analysis utilities for ML tasks.

Random Parameter Search (S4) [24]: Conducting random parameter searches followed by promotion of the best-performing parameters, offering a less structured but potentially effective approach. W&B provides random and grid search functionalities.

Leaving fine-tuning to dedicated experts (S5) [9]: Recognizing that hyperparameter fine-tuning may be best handled by experts with extensive experience in optimization.

LLM-supported parameter search (S6): Leveraging LLMs such as OpenAI GPT-4 [36] to provide context on hyperparameter values and model training performance to suggest improvements.

KM-EP Components: KM-EP is a web content and knowledge management ecosystem portal utilized in numerous research projects, such as RecomRatio [47] and SenseCare [16]. Its key components encompass the Content Manager, the Content Management System (CMS) for content customization, the Ingest Component for content imports, the Authentication Component for user management, and SNERC, an App for customizing and training NER models using Stanford CoreNLP. KM-EP was developed using the widely used Symfony framework [5] with PHP and MySQL [2]. Our initial application [21] leveraged AWS resources for Cloud-based NER model training with spaCy customization. It offered a GUI for configuring compute resources and consisted of two primary components: a standalone Model-View-Controller (MVC)-based application and an AWS system for ML model training. Both our initial system and KM-EP adhere to a microservices architecture, enabling seamless integration. KM-EP supports App integration, including SNERC [44], through isolated components, and both applications facilitate RESTful communication. Our objective is to achieve a seamless integration of our app into KM-EP, as detailed in the following section.

III. DESIGN AND MODELING

We employed User Centered System Design (UCSD) [34], a proven methodology for developing information systems, to guide our system's design and development. UCSD prioritizes understanding user needs and requirements to inform design decisions, ensuring the system meets user needs effectively. To apply UCSD, we defined the system's use context, considering insights from our previous implementation [21] and the CIE overall architecture [43]. Our system serves medical experts and NER experts, with medical experts having minimal background in NER and CC, while NER experts possess varying NER and CC knowledge. Both groups aim to train NER models, accessing the system through KM-EP. In addition to the CIE use cases [43], we introduced additional use cases to optimize the NER training process for users. The additional use cases are visible in Figure 2: "Model and Data Management": Models and corpus data should be persisted and available for further use. "Selection of compute profiles": The possibility of selecting predefined compute resources without the need to specify them in detail. "Perform NER": The ability to perform NER on text directly in the application. "Training metric viewing": Users can view detailed information about model training such as Precision, Recall, F1-Score, memory utilization, and system logs. Additionally, the following new use cases support users specifically in improving their NER models: "Automatic Parameter Searches": The ability to automatically search for hyperparameters using meta-learning optimization algorithms or random search to improve models easily. This is an extension of the CIE use case Train Model in Cloud [43]. "Hyperparameter recommendations via LLM": Users should receive fine-tuning recommendations for their model via an LLM to receive direct support in improving their models, specifically value suggestions for spaCy configuration parameters. Integrating W&B into the training system and configuring UI forms supports strategies S1-S4. S5, requiring user-initiated optimization parameter requests via the UI, is excluded. The system could compile context information, including the spaCy configuration, corpus metadata, compute resources, and previous training results, to improve recommendations following the LLM prompt guidelines by White et al. [49]. Users receive fine-tuning recommendations and apply them during model training. The system manages and persists models, training data, and training run information.

Our improved system primarily manages Cloud resources to provide functionalities around NER. We therefore name the system Cloud Resource Management for Named Entity Recognition (CRM4NER), pertaining to its core functionality. The CRM4NER system architecture includes: a) An additional KM-EP Component for Symfony, providing the GUI structure, persistence using the KM-EP Database, and request handling. The App should leverage the KM-EP CMS elements and Authentication and should follow a similar layout as SNERC. b) A Service Controller for API calls to remote services. c) A Metrics Component for collecting and processing training metrics using W&B. d) An LLM Provider component for gathering model context information and offering hyperparameter recommendations. e) An adaptation of the previous system [21], using an AWS Cloud Component and Training Container for model training in AWS, to support parameter searches with W&B and spaCy. f) Microservice interaction with Cloud and SaaS services via API calls. This design maintains established MVC and component patterns. The component diagram is shown in Figure 1.
Fig. 1. CRM4NER Component diagram.
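The Bayesian and random parameter searches (strategies S3 and S4) are realized through W&B Parameter Sweeps. The following Python sketch shows how such a sweep can be defined; the hyperparameter names, value ranges, project name, and the placeholder training function are illustrative assumptions, not the exact configuration used by CRM4NER, and launching the sweep requires the wandb package and a W&B account.

```python
# Sketch of a W&B Parameter Sweep definition (strategies S3/S4).
# All hyperparameter names and ranges below are illustrative assumptions.
sweep_config = {
    "method": "bayes",  # Bayesian parameter search (S3); "random" gives S4
    "metric": {"name": "ents_f", "goal": "maximize"},  # spaCy NER F-score
    "parameters": {
        "training.dropout": {"values": [0.1, 0.2, 0.3]},
        "training.optimizer.learn_rate": {"min": 1e-5, "max": 1e-3},
        "training.batcher.size": {"values": [64, 128, 256]},
    },
}


def train() -> None:
    """One sweep trial; W&B injects the sampled values via wandb.config."""
    import wandb  # requires `pip install wandb` and a W&B login

    with wandb.init() as run:
        # A real trial would merge run.config into the spaCy training
        # configuration, run the training, and report the scores, e.g.:
        # wandb.log({"ents_f": scores["ents_f"]})
        pass


def launch_sweep(project: str = "crm4ner-example") -> str:
    """Register the sweep and run its trials (network access required)."""
    import wandb

    sweep_id = wandb.sweep(sweep_config, project=project)
    wandb.agent(sweep_id, function=train, count=6)  # e.g. six trials
    return sweep_id
```

Switching "method" from "bayes" to "random" turns the same definition into the random parameter search of strategy S4; W&B records the metric of every trial, which is what the Metrics Component collects.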
Attempt | 3-Parameter bayes search (config I) | GPT-4 Limited context | GPT-4 Full Context
1 | 0.822 | 0.826 | 0.826
2 | 0.816 | 0.831 | 0.825
3 | 0.821 | 0.829 | 0.828
4 | 0.822 | 0.827 | 0.827
5 | 0.822 | 0.826 | 0.818
6 | 0.820 | 0.819 | 0.826

TABLE II
CPU FINE-TUNING METHOD PERFORMANCE DATA (PART II).

Attempt | 3-Parameter random search | 3-Parameter bayes search (config II) | 4-Parameter bayes search | 4-Parameter Random Search
1 | 0.821 | 0.185 | 0.234 | 0.816
2 | 0.811 | 0.443 | 0.766 | 0.820
3 | 0.811 | 0.813 | 0.812 | 0.810
4 | 0.821 | 0.811 | 0.650 | 0.823
5 | 0.816 | 0.823 | 0.648 | 0.821
6 | 0.814 | 0.819 | 0.459 | 0.819

Attempt | GPT-4 Limited Context I | GPT-4 Limited Context II | GPT-4 Full Context | 6-Parameter Random Search
1 | 0.813 | 0.813 | 0.813 | 0.823
2 | config error | config error | 0.828 | 0.848
3 | 0.851 | 0.845 | 0.613 | 0.831
4 | 0.848 | 0.812 | 0.509 | 0.855
5 | config error | 0.831 | 0.000 | 0.832
6 | 0.852 | config error | 0.251 | 0.818

[Figure: F-Score comparison plot; y-axis: F-Score, 0.81 to 0.84.]
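The "Limited Context" and "Full Context" variants in the tables above differ in how much training context is passed to GPT-4 (strategy S6). The sketch below illustrates how such a context-bearing recommendation prompt could be assembled; the field names, example values, and prompt wording are illustrative assumptions, not the exact prompt used by our system. Because suggested configurations can be invalid, as the "config error" entries above show, LLM responses should be validated before being applied to a training run.

```python
import json


def build_recommendation_prompt(spacy_config: dict, metrics: dict, corpus_meta: dict) -> str:
    """Assemble a fine-tuning prompt carrying model context (cf. White et al. [49])."""
    return (
        "You are assisting with fine-tuning a spaCy NER model.\n"
        "Current spaCy training configuration:\n"
        f"{json.dumps(spacy_config, indent=2)}\n"
        "Latest training metrics:\n"
        f"{json.dumps(metrics, indent=2)}\n"
        "Corpus metadata:\n"
        f"{json.dumps(corpus_meta, indent=2)}\n"
        "Suggest improved values for the configuration parameters above, "
        "answering with a JSON object that maps parameter names to values."
    )


# Illustrative "Full Context" example; a "Limited Context" prompt would
# omit parts of this context, e.g. the corpus metadata.
prompt = build_recommendation_prompt(
    spacy_config={"training.dropout": 0.1, "training.optimizer.learn_rate": 5e-5},
    metrics={"ents_p": 0.821, "ents_r": 0.806, "ents_f": 0.813},
    corpus_meta={"documents": 1200, "entity_types": ["DRUG", "CONDITION"]},
)
```

The resulting string would then be sent to the LLM API, and the returned JSON parsed and checked against the set of valid spaCy configuration parameters before any values are applied.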