0% found this document useful (0 votes)

214 views35 pages

BIM-GPT: Virtual Assistant for BIM IR

The document introduces BIM-GPT, a framework that uses GPT models and prompts to enable natural language information retrieval from building information models (BIMs). BIM-GPT generates prompts to help GPT models interpret users' natural language queries about BIMs, summarize retrieved information, and answer BIM-related questions. The framework was evaluated on a BIM dataset where it achieved high accuracy for query classification, including when no training data was provided in prompts. A prototype using BIM-GPT for a hospital building demonstrated its functionality for natural language-based BIM information retrieval.

Uploaded by

Bien Pham Huu

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

214 views35 pages

BIM-GPT: Virtual Assistant for BIM IR

Uploaded by

Bien Pham Huu

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 35

BIM-GPT: a Prompt-Based Virtual Assistant

Framework for BIM Information Retrieval

Junwen Zhenga, *, Martin Fischerb

a Stanford University, Department of Civil & Environmental Engineering, Stanford, CA, United States
b Kumagai Professor of Engineering and Professor in Civil & Environmental Engineering, Stanford University, Stanford, CA, United States

Abstract
Efficient information retrieval (IR) from building information models (BIMs) poses significant
challenges due to the necessity for deep BIM knowledge or extensive engineering efforts for
automation. We introduce BIM-GPT, a prompt-based virtual assistant (VA) framework
integrating BIM and generative pre-trained transformer (GPT) technologies to support NL-based
IR. A prompt manager and dynamic template generate prompts for GPT models, enabling
interpretation of users' NL queries, summarization of retrieved information, and answering BIM-
related questions. In tests on a BIM IR dataset, our approach achieved 83.5% and 99.5%
accuracy rates for classifying NL queries with no data and 2% data incorporated in prompts,
respectively. Additionally, we validated BIM-GPT’s functionality through a VA prototype for a
hospital building. This research contributes to the development of effective and versatile VAs for
BIM IR in the construction industry, significantly enhancing BIM accessibility and reducing
engineering efforts and training data requirements for processing NL queries.

*
Corresponding author.
E-mail address: [email protected] (J. Zheng), [email protected] (M. Fischer).
1. Introduction
Building information models (BIMs), one of the most important digital representations of a built
asset across its life cycle [1], have become increasingly popular as a tool in the construction
industry [2]. BIMs can integrate multi-disciplinary data, such as architectural, structural,
mechanical, electrical, and plumbing, to support construction and facility management activities
[3]. The integrated BIMs contribute to cost and time savings [4], reduce risks [5], and improve
efficiency of operation and maintenance [6].

However, as BIM is becoming more complex due to the aggregation of large amounts of data
from design to construction and operation, it is more challenging to efficiently retrieve
information [7], particularly for many non-tech savvy users in the industry. Existing information
retrieval (IR) strategies require users to have extensive knowledge of BIM technology [8].
Practitioners must master not only the complicated user interface, but also data structure,
terminology, and structured query language. The lack of BIM expertise has been identified as
one of the most prohibiting barriers to realizing the benefits of rich, up-to-date information in
construction and operation [9,10].

Virtual assistants (VAs) have been proposed as a potential solution to the challenges of BIM IR
by offering a natural language (NL) interface [11–13]. However, the development of VAs
requires extensive efforts, and their functionality is currently limited [14]. To interpret NL queries,
existing VAs rely heavily on either traditional natural language processing (NLP) methods
[11,12] or machine learning (ML) methods [13,15]. The traditional methods require substantial
engineering to customize syntactic and semantic analysis for different BIMs. Although ML
methods are more versatile than traditional NLP methods, they depend on large amounts of
labeled data and computationally expensive model training. Moreover, these VAs can only
respond to users’ queries within predefined syntactic patterns, and cannot answer more general
questions about building components and their properties in BIM that help practitioners retrieve
information [16]. Unfortunately, the technical resources and training data required are generally
not available in construction projects and for facility management, hindering the implementation
of VAs in practice.

The emergence of generative pre-trained transformer (GPT) models, a type of large language
model, presents new opportunities to improve IR. GPT models can significantly reduce the
engineering and data requirements for prototyping VAs compared to traditional NLP and ML
methods [17,18]. GPT models have been pre-trained on a large corpus of texts, such as GPT-3
[19] and InstructGPT [20], and have demonstrated remarkable in-context learning abilities with
the help of a textual “prompt” [21]. For example, ChatGPT can perform many NLP tasks, such
as text classification, information extraction, summarization, and question answering, by
providing prompts that contain task instructions and a few demonstration examples as context
[22]. Because these tasks are essential functions for efficient IR, the prompt-based approach is
now being explored for VA development in retail [23] and healthcare [24].

This paper introduces BIM-GPT, a prompt-based VA framework that integrates BIM and GPT to
enable efficient IR using NL. We developed a prompt library and a prompt manager, which

2
serve as the core module of the framework to dynamically generate prompts that provide
context for GPT models to interpret user queries, summarize retrieved results, and answer BIM-
related questions. Leveraging the advantage of GPT’s in-context ability, our approach not only
supports NL-based user interactions to improve accessibility of BIMs, but also significantly
reduces the required effort compared with existing approaches. Specifically, it eliminates the
need for the customization of syntactic and semantic analysis in traditional methods as well as
model training in ML methods. To validate BIM-GPT effectiveness and versatility, we evaluated
the framework on the augmented BINLQ [15], a BIM IR dataset that we further annotated for
improved coverage and granularity. The evaluation results demonstrate that BIM-GPT can
accurately interpret user queries, with high accuracy achieved when no data is incorporated in
prompts. Furthermore, we found that incorporating as little as 2% data in prompts can further
improve the accuracy of the system. The evaluation results demonstrate that BIM-GPT can
accurately interprets users’ queries with high accuracy achieved when no data is incorporated in
prompts. Furthermore, the results indicate that incorporating as little as 2% data in prompts can
further improve the accuracy of the system. Additionally, we developed a VA prototype based
on the framework for a hospital building and validated its functionality.

The paper is structured as follows: Section 2 reviews the related literature. Section 3 introduces
the VA framework, illustrating its architecture and components. Section 4 evaluates the
effectiveness and versatility of the framework based on a BIM IR dataset. Section 5 describes
the implementation details for the VA prototype of a hospital building. Finally, Section 6
concludes the paper, summarizing key findings and outlining directions for future research.

1. Literature Review
BIM technology has revolutionized the construction industry in recent decades. From mere
computer-aided design tools, BIMs have evolved into comprehensive systems that manage
building information and project data throughout a building’s life cycle [25]. As BIM adoption
continues to grow in the industry [26], efficient IR becomes increasingly important, playing a
critical role in enabling stakeholders to efficiently access and utilize the rich data available from
BIMs [27].

Therefore, this study examined three areas of related literature to provide the context of BIM IR,
identify gaps of existing retrieval strategies, and introduce new opportunities. Section 2.1
reviews the challenges of existing retrieval approaches for BIMs that require deep BIM
knowledge. Section 2.2 explores the NL-based approaches for BIM IR, including traditional NLP
methods and more recent ML methods. Section 2.3 introduces GPT models, particularly their in-
context learning and NLP capabilities. Lastly, Section 2.4 summarizes the gaps that remain in
existing NL-based approaches due to the need for extensive engineering and data
requirements.

2.1 Challenges of BIM Information Retrieval

Retrieving information from BIMs involves accessing and extracting relevant data from complex
digital models. BIMs capture a large amount of multi-disciplinary data throughout the design,

3
construction, and operation of a construction project. This data is essential for project
stakeholders’ decision-making [28]. It consists of not only building elements’ geometry, but also
their semantics, such as the material, manufacturer, location, or system [29]. However, existing
retrieval approaches, such as 3D interfaces and semantic search, require users to have deep
BIM knowledge, including proficiency in graphical user interfaces (GUIs), understanding of data
structures and their terminologies, and familiarity with formal query languages.

The 3D interface approach leverages intuitive GUIs of BIMs, enabling users to interact directly
with building elements and access relevant information linked to them [30]. The approach is
commonly used in Autodesk Revit [31] and other BIM software in practice and has also been
applied in many BIM-integrated systems. For example, [32] supported users in visual knowledge
management, while [33] facilitates operation and maintenance tasks. However, as BIMs grow
larger and more complex, users may encounter difficulties in navigating and manipulating
intricate 3D GUIs, hindering efficient IR.

The semantic search approach is also used for IR based on the object-oriented structure of
BIMs [34]. This approach utilizes well-defined data schemas, such as the Industry Foundation
Classes [35], and applies formal query languages to retrieve the desired information by filtering
specific semantic properties or characteristics of building elements. For instance, [36] enables
querying for building information of IFC-based BIMs with SPARQL, while [37] proposes a more
intuitive visual programming language for general filtering of IFC-based BIMs. However, the
semantic approach requires users to have a deep understanding of BIM data structures and
programming concepts, which is generally not the case for non-tech savvy practitioners,
especially in the construction and facility operation phrases of a project.

2.2 Natural Language-based Approaches

Over the last few decades, NLP technology has been applied to BIM IR, alleviating the
knowledge and expertise requirements for practitioners. NL-based approaches enable users to
retrieve information from BIMs using everyday expressions. Because of the massive information
contained in BIMs and a large variety of NL, accurately interpreting users’ queries is crucial for
the effectiveness of such an NL-based interface.

Early attempts generally applied traditional NLP methods, such as syntactic analysis [38] and
semantic analysis [39], for natural language understanding (NLU). Customized functions and
algorithms were developed to identify a user’s intent, extract key phrases and then map them to
a predefined representation used for BIM IR represented in a structured format [14]. These were
then applied to various use cases. [7,8] developed a process of keyword extraction and
International Framework for Dictionary (IFD)-based mapping, including tokenization, tagging,
parsing, classification and mapping steps, to convert an NL utterance into entities and
relationships in IFC schemas for querying BIM data. [40] developed a string-matching method
based on the Levenshtein distance and Burkhard-Keller tree structure to extract the spatial
geometric information of a BIM from users’ utterances for fire emergency response. [12]
developed an NLU algorithm to classify content words and an information extraction algorithm to
locate the queried structured IFC data from BIMs. Although these approaches could enable

4
construction practitioners to retrieve information from BIMs using NL, they require extensive
engineering efforts to customize NLP features and models.

To further improve the performance of NLU, previous research developed BIM-specific ontology
models for semantic understanding. For example, [41] constructed an IFC IR ontology and used
local context analysis to support a search engine for retrieving online BIM resources. [42] built a
domain ontology for architecture, engineering, and construction (AEC) and a processing
procedure for keyword extraction, expansion, and mapping to retrieve BIM objects. [11,43,44]
implemented a BIM ontology model and applied NLP syntactic analysis through word
segmentation, part-of-speech tagging, keyword matching extension and keyword mapping
steps, to convert NL queries into SQL for retrieving and manipulating BIM data. Although
ontology models can help further improve the accuracy of semantic analysis, additional
engineering efforts are needed to build those models, which is typically not practical in
construction.

Because traditional NLP-based methods require extensive effort to customize the syntactic and
semantic analysis, a more versatile approach is needed to effectively interpret NL queries for
BIM IR. In recent years, thanks to great advancements of ML in NLP, researchers have
developed ML-based approaches that can learn NLP features and models from training data
sets, reducing the need for engineering customized features and algorithms.

One notable example of ML methods is Bidirectional Encoder Representations from

Transformers (BERT) [45]. BERT is a powerful pre-training language representation model that
can be fine-tuned for specific NLP tasks, allowing it to learn from large amounts of text data and
achieve state-of-the-art performance on a range of NLP benchmarks. [15] has applied a robustly
optimized BERT pre-training approach (RoBERTa) fine-tuned on the building information-
related natural language queries (BINLQ) dataset for classifying NL queries of BIMs into
predefined IR categories. Moreover, technology companies have developed commercial
products that utilize ML methods for NLP, such as Amazon Alexa, Google Assistant, and IBM
Watson, which have also shown potential for BIM IR. [13] developed a proof-of-concept
prototype, using Amazon Alexa, to retrieve information from BIMs. However, while these ML-
based approaches can eliminate the need for customizing NLP functions and accurately
recognize NL utterances, they require a large amount of training data, which is generally not
available and too expensive to collect on typical construction projects [46].

In addition to NLU, natural language generation (NLG) has been applied to further process the
data retrieved from BIMs into an NL-based representation. For example, [47] extracted
semantic content from structured BIM data and then applied syntactic sentence structure, a
grammar rule to generate NL sentences, helping practitioners easily understand and utilize the
information. Recent research has integrated NLU and NLG to develop VA approaches that
support NL-based IR. [12,48,49] developed the intelligent building information spoken dialogue
system (iBISDS), a VA that provides IR for construction practitioners via spoken natural
language queries. However, the development of NLG modules that require additional
engineering efforts to build NL response templates is not practical and only covers limited
syntactic patterns.

5
Although the VA approaches enable searching for BIM elements and their properties through
NL, practitioners still need to learn general knowledge about BIM in order to better utilize the
technology [16]. With the growth of the Internet, such knowledge is more accessible with the
help of search engines. Semantic web technologies could facilitate retrieving the information
from online sources and integrate them with BIM for energy analysis [50] and cost estimation
[51]. In addition, to reduce search effort and enable NL queries, recent research has developed
an intelligent question answering (QA) system for general BIM knowledge using a BERT-based
ML method [16]. For example, the QA system can retrieve relevant documents from a dataset to
answer questions, such as “is BIM a process or a model?” and “What is BIM?”. Nevertheless,
the supplemental functionality of QA also requires extensive engineering efforts in building the
semantic web features or collecting data for ML methods.

2.3 New Opportunities: Generative Pre-trained Transformer Models

NLP has made significant progress in the last few years, largely due to the development of large
language models (LLMs). LLMs are a class of advanced ML models designed to understand
and generate human-like language by pre-training on massive amounts of text data [52,53].
One of the most prominent LLMs is the GPT series, including GPT-3 [19] with 175 billion
parameters, which have emerged as powerful tools in NLP due to their in-context learning
ability. Building upon GPT-3, its successor models such as InstructGPT [20] and ChatGPT [22]
further enhance in-context learning by following human instructions more effectively.

These GPT models are capable of interpreting and generating NL with minimal additional
training data [54]. Their in-context learning relies on “prompts” provided by users to guide the
model's responses. This prompt-based approach allows GPT models to excel in zero-shot and
few-shot learning scenarios [55]. They have achieved state-of-the-art performance in many NLP
tasks, including accurately categorizing text into predefined classes with limited training data
[19], extracting specific information using contextual prompts [56], generating concise and
coherent summaries of lengthy text passages [57], and comprehending and generating accurate
responses to user questions based on context [58]. Leveraging the NLP capabilities of GPT
models, the prompt-based approach is now being explored for VA development. For example,
[24] developed a GPT-based chatbot for triaging and patient note summarization for a
healthcare use case. [23] reviews GPT-based chatbots applied for real-world customer service
use cases. To date, however, there has been no published research on applying this approach
for BIM IR in the construction sector.

2.4 Summary of Gaps in Existing Natural Language-based Approaches to BIM IR

The existing literature highlights the ongoing challenges of BIM IR, specifically the need for
deep BIM knowledge, including proficiency in BIM GUIs, understanding of data structures, and
familiarity with formal query languages. To address the accessibility issue of BIMs, NL-based
approaches have been developed to allow practitioners to retrieve information using everyday
expression. However, the current approaches summarized in Table 1 have several limitations.

6
To accurately understand NL queries, early approaches applied traditional syntactic and
semantic analysis methods that required extensive engineering effort to customize functions for
different BIMs. Recent ML-based approaches, more versatile than the traditional ones, were
developed to learn the NLP functions and models from training data of NL queries. However,
the construction industry generally lacks such data, which is labor-intensive and expensive to
collect. Furthermore, all approaches use template-based methods to deliver retrieved
information from BIM, which requires additional technical efforts and limits the scope to
predefined templates. Lastly, none of them can answer general questions about building
components and their properties in BIM, which would require additional effort to construct the
knowledge base.

Table 1. Current NL-based Approaches for BIM IR with Limitations

Natural language Natural language Knowledge source for Limita

Approach understanding (NLU) generation (NLG) question answering (QA) tions

AI-based voice assistant for Machine learning:

A, C,
BIM [13] Amazon Alexa Skills Template-based BIM (RVT format)
D

Intelligent QA system for BIM Machine learning: Fine- Not available BIM-AIOT QA corpus
& AIOT [16] tuned BERT model A, D
(N/A) (text paragraphs)

QA system for BIM Traditional NLP: Syntactic

B, C,
information extraction [12] and semantic analysis Template-based BIM (IFC format)
D

BIM automatic speech Traditional NLP: syntactic

recognition [11] and semantic analysis + Not available
BIM (RVT format) B, D
ontology model (N/A)

Intelligent building information Traditional NLP: syntactic

B, C,
spoken dialogue system [48] and semantic analysis Template-based BIM (RVT format)
D

A NL-based approach to data Traditional NLP: syntactic Not available

BIM (IFC format) B, D
retrieval for cloud BIM [8] and semantic analysis (N/A)
A = requires a large amount of training data
B = requires extensive engineering effort
C = depends on predefined templates to deliver information
D = cannot answer general BIM-related questions

Although current NL-based approaches can alleviate the problem of practitioners in the
construction industry often lacking BIM expertise, construction projects typically do not have the
technical resources and training data. The emergence of GPT models offers new opportunities
to address these limitations by leveraging the prompt-based techniques, reducing engineering
requirements for traditional methods and data requirements for ML methods. Therefore, it is
necessary to investigate the potential of integrating GPT with BIM to develop a more effective,
versatile VA framework for IR, improving accessibility of BIMs without prohibitively increasing
engineering efforts and requiring massive training data.

7
2. Research Methodology
This paper presents BIM-GPT, a prompt-based Virtual Assistant (VA) framework for BIM
Information Retrieval (IR), designed to overcome limitations of existing approaches found in the
literature. The BIM-GPT framework offers a NL interface, coupled with BIM visualization, to
facilitate efficient IR and interaction with 3D building components for construction practitioners.
To harness the NLP capabilities of GPT models, our framework is designed to dynamically
generate prompts for various use cases and interpret NL queries accurately. Given that BIMs
encompass massive, multidisciplinary building data, the framework must effectively manage this
information to address user queries. Consequently, the BIM-GPT framework consists of three
modules: a web-based User Interface (UI) module, an NLP module, and a Data Management
(DM) module (as shown in Fig. 1).

The UI module enables users to engage with the VA by entering NL queries and receiving NL
responses through chat boxes while visualizing the retrieved results in a cloud-based BIM. Text
queries are relayed from the UI module to the NLP module, where responses are subsequently
generated. To interpret the user's query, the NLP module dynamically produces prompts and
acquires generated texts from an external GPT server. Within the DM module, a cloud-based
database hosts the building information and provides results in response to structured queries
from the NLP module.

The following subsections elaborate on the design of the NLP module, with its validation
discussed in Section 4. Section 5 addressed the implementation of the UI and DM modules, as
well as the validation of the entire framework.

Figure 1. Overview of BIM-GPT Framework

3.1 Natural Language Processing Module

The NLP module serves as the core of the BIM-GPT framework, responsible for interpreting NL
queries and generating NL answers based on retrieved results from BIMs. Our prompt-based

8
approach leverages the NLP capabilities of GPT models to reduce engineering and data
requirements. Trained on trillions of textual data to learn the patterns and structure of language,
GPT models are designed to predict the most likely subsequent word when given a prompt as
the starting point. The prompt is crucial in aligning the relevant data that the models have been
trained on with the concepts in BIM IR, as depicted in Fig. 2, so that the models can generate
relevant and coherent texts to meet users’ needs.

Fig. 2. Prompt: Aligning GPT Models with BIM IR

To support efficient IR and improve accessibility of BIMs, the NLP module must address three
critical tasks: NLU, NLG, and QA. The NLU task entails classifying user intent and identifying
the parameters as well as recognizing values of the queried building components, effectively
converting NL queries into structured queries for IR from BIMs. The NLG task focuses on
summarizing the retrieved results to deliver NL answers, while the QA task handles general
questions related to building components and their properties within BIMs. Consequently, we
designed a prompt library consisting of five types of prompts: Intent Prompt, Parameter Prompt,
Value Prompt for NLU, Summary Prompt for NLG, and General Question Prompt for QA.

Furthermore, we developed a shared template for these five prompts. The most important
feature of the template is that its components dynamically adapt based on the use case and
user’s query, which help GPT models effectively handle a wide variety of NL queries. Shown
with the use cases in Fig. 3, the dynamic template consists of five components: System,
Relevant Database Information, Task Instruction, Few-shot Examples, and User. This dynamic
characteristic is essential for ensuring the framework's effectiveness and versatility across
different NLP tasks.

Fig. 3. Dynamic Prompt Template and Its Use Cases

9
To effectively manage the dynamic generation of prompts, we devised a prompt manager to
control the process of converting NL queries into NL answers. The prompt manager is
responsible for monitoring current states of NLP tasks, obtaining relevant information from the
BIM database, updating prompts with retrieved results, and communicating with external GPT
servers to generate texts based on the prompts. The following subsections elucidate the design
of the dynamic prompt template in the prompt library and manager, supplemented by detailed
examples.

3.1.1 Prompt Library

The effectiveness of the NLP module in the BIM-GPT framework relies heavily on the quality of
prompts. To better align GPT models with BIM IR contexts, we designed the dynamic prompt
template based on consideration of characteristics of GPT models. Because the prompt serves
as the conditional probability for GPT models to predict the next word in a sequential order, the
components of the template are arranged in a specific order, with the more pertinent content
positioned closer to the end of the prompt. As such, the text towards the end of the prompt
tends to carry greater weight than the text at the beginning, which helps GPT models generate
relevant and coherent responses.

Following this sequence of relevance, we describe each component of the template and its role
in aligning GPT models with BIM IR, along with an intent prompt example, as shown in Fig. 4.
(1) The System component specifies the VA role of the GPT model, enabling it to understand
the background of the user’s query. For example, “You are a virtual assistant that helps users
retrieve building information from the BIM database, and answer general questions about BIM
technology, architecture, engineering, construction, and operation.” (2) The Relevant Database
Information component includes related data, such as schemas and records retrieved by the
prompt manager from the BIM database that may provide hints for the GPT model. For
example, “The BIM database consists of major building components of a hospital …” (3) The
Task Instruction component describes the task and the output requirements in detail, which
explicitly guide the GPT model to generate desired responses. For example, “Your first task is to
classify the ONE intent … among …; Your second task is to identify ONE category … from the
list …; This is a classification task; Answer as concisely as possible; The output should follow
this template …” (4) The Few-shot Examples component provides examples of the user input
and system output pairs, which demonstrate an output pattern for the GPT model. (5) The User
component incorporates the user’s query following the pattern, which serves as the starting
point for the GPT model to generate texts. See Appendix A1 for additional examples of prompts.

10
Figure 4. Intent Prompt Example of Dynamic Template

Even with the prompt template, it is still challenging for GPT models to directly parse an NL
query into a structured query for BIM IR, because they have not been trained on such semantic
parsing tasks nor seen the data structure of BIMs. To address this difficult NLU problem, we
decompose it into a chain of subproblems, including text classification and information
extraction and prediction, which can be handled by the models with appropriate prompting.
Specifically, the GPT model first classifies the user intent and the category of building objects
with the Intent Prompt, then identifies the parameters of the categorized building objects with
the Parameter Prompt, and lastly recognizes the values of the identified parameters with the
Value. In addition, the GPT model utilizes the Summary Prompt to summarize the retrieved
results for NLG and the General Question Prompt for answering general questions about BIM
technology, architecture, engineering, construction, and operation.

The dynamic nature of the prompt template is vital for the versatility and effectiveness of the
framework. Because GPT models have a maximum number of words allowed in a prompt,
supplying more pertinent information in prompts can assist models in interpreting NL queries
more accurately and generating NL answers more coherently. Therefore, our template
dynamically generates a prompt based on the use case and user query. The goal of the use
case determines the general description of components for each type of prompt, while the user

11
query further refines the exact details of relevant information to be incorporated into the
prompts. Utilizing the same template can reduce engineering effort for developers and minimize
learning costs for practitioners. The dynamically customized prompts can improve the
performance of NLU, NLG, and QA tasks for BIM IR.

Table 2 presents the five use cases of prompts, with descriptions emphasizing how their
components dynamically update according to the user query. Considering the large amount of
building information stored in the database and the need for concise prompts, we devised a
strategy to determine which pertinent information to include in the prompt components based on
a user query. Firstly, the Task Instruction component offers all possible solutions of the task by
employing known information. For example, the Intent Prompt includes the complete lists of
predefined user intent and the list of all categories of building objects from the database.
Secondly, the Relevant Database Information provides more detailed and granular information,
such as database schemas and value records, compared to the specific information in the Task
Instruction component. For instance, for each category (table) of building objects, the types and
parameters represent a more detailed level of information. For each parameter of the
categorized building objects, its distinct value records represent an even more specific level of
detail. Lastly, the Few-shot Example includes relevant examples based on the identified
category and parameter information. This dynamic design can enhance the performance of GPT
models by aligning them more effectively with BIM IR contexts.

Table 2. Goals and Dynamic Nature of Prompts

3.1.2 Prompt Manager

To dynamically generate prompts for each task, we have developed a PM to manage the entire
process. Fig. 5 illustrates how the PM converts an NL query into an NL answer. At each step
shown in yellow boxes, the PM builds the respective prompts, sends them, and receives

12
generated responses from the GPT model through application programming interfaces (APIs).
The PM uses the classification result of user intent to control whether to invoke either the BIM
database search or the GPT general question answering. In addition, the PM makes API calls to
retrieve relevant information from the BIM database (shown in the blue box), when constructing
the prompts for parameter and value identification as well as summarization.

Figure 5. Prompt Manager: Converting NL Query into NL Answer

Fig. 6 shows a step-by-step example of how the PM interprets an NL query and generates a
coherent NL answer, by incorporating pertinent information from the database into prompts and
leveraging NLP capabilities of GPT models. For the query “Who is the manufacturer of pump
14569?”, the process is listed below:

1. PM applies the Intent Prompt and identifies the intent as “[search in BIM]”, and the
category of the queried building objects as “Pumps”;
2. PM retrieves all parameters of pumps and samples up to 10 values for each parameter
from the database to build the Parameter Prompt, and identifies the filter parameter as
“component_id” and the projection parameter as ‘manufacturer’;
3. PM retrieves a list of all component_id values for pumps to build the Value Prompt, and
extracts the value from the user query as “14569” and predicted value as ‘14569’ from
the provided list of values;
4. PM retrieves the record of the queried pump from the database, constructs the Summary
Prompt, and then generates the NL response: “The manufacturer of pump 14569 is
PACO. It is located in room 06-470 on level 6 and is part of the hydronic return and
power systems.”

13
Figure 6. Example of How PM Processes NL Query

3. Validation of NLP for BIM-GPT

Accurately interpreting a user query is the prerequisite for VAs to effectively retrieve information
from BIMs. This section reports on the experiments we conducted on a BIM IR dataset to
validate the performance of NLP tasks for BIM-GPT. Our primary objective was to assess the
accuracy of the framework, not only for classifying the intent and categories of building objects
but also for identifying queried parameters and recognizing parameter values in NL queries. In
addition, we tested the data efficiency of BIM-GPT under the zero-shot and few-shot scenarios,
where no data was incorporated in prompts and 2% NL queries from the dataset were
incorporated in prompts, respectively. Compared with the existing state-of-the-art approach that
used 80% of the same dataset for training ML models and achieved 99.8% accuracy [15], BIM-
GPT achieved comparably high accuracy of 99.5% in the few-shot scenario. Furthermore, we
performed an ablation study for the dynamic template to understand the impact of prompt
components on system performance. Besides NLU, the other NLP capabilities of BIM-GPT,
such as NLG and QA, will be examined in Section 5.

4.1 Experiments

Our experiments utilized the BINLQ dataset [15], which we further annotated for improved
coverage and granularity. The existing dataset includes 2,065 NL queries for two architectural
BIMs, classified by the user intent and building object category. It includes 11 text classification
(TC) labels related to the basic model information, attributes and quantity information of building

14
objects such as windows, doors, rooms, units, and building storeys, as well as out-of-domain
queries from the SQuAD dataset [59]. However, to retrieve exact information from BIMs, VAs
must identify the parameters and their corresponding values in NL queries, in addition to the
intent and category of building objects. While intent classification is important, the subsequent
parameter identification and value recognition are crucial in obtaining specific information from
NL queries to construct structured queries. These tasks are more difficult as the number of
possible solutions increases from intent and categories to parameters and values.

To evaluate the effectiveness and versatility of our framework, we augmented the existing
dataset by manually annotating the parameters and values for NL queries. We provided both
the filter parameter (filter_para) and projection parameter (proj_para), along with the value
extracted from the user query (extr_value) and the predicted value (pred_value) matching the
database record. Since most databases only contain one distinct record for each value, the
predicted value can be considered as the normalized form for the extracted value that may be
its synonym or plural form. For example, for the query “What is the elevation of the second
floor?”; the filter_para is “storey_id”; the proj_para is “elevation”; the extr_value is “second floor”;
and the pred_value is “2”. Our additional annotation at the parameter level increases the
granularity of the dataset, extending beyond existing TC labels at the intent and category (table)
level. Table 3 displays examples of NL queries with additional annotations in columns 4 to 7.

Table 3. Examples of Additional Annotations for NL Queries in BINLQ [15]

With the augmented dataset, our framework was implemented using Python (version 3.10.8) for
the experiments. We used the ChatGPT model (gpt-3.5-turbo-0301) and set its temperature
parameter to 0 to minimize the randomness of text generation. The Python Data Analysis
(Pandas) library was used to handle this dataset. For each building object category in the
dataset, its schemas and distinct records of those schemas were retrieved and organized in a
Python object, along with NL queries and labels stored in a Pandas dataframe. We applied the
dynamic template to the intent, parameter and value prompts, which are updated according to

15
the user query. Because we decompose the problem of semantically parsing NL queries into a
chain of subproblems, only the NL queries accurately interpreted from the intent classification
and the parameter identification were proceeded for the parameter identification and the value
recognition, respectively.

Given a specific dataset, its categories (tables), schemas, and distinct values are known
information that can be included in the Relevant Database Information and Task Instruction
components of prompts. However, the NL queries and their labels require labor-intensive data
collection, which is generally not available for construction projects. To validate the data
efficiency of our approach, we established the zero-shot and few-shot scenarios for
incorporating NL queries and their ground truth labels in the Few-shot Example components of
prompts. In the zero-shot scenario, we included one or two hypothetical examples, such as NL
queries not included in the dataset and the queries failed in the preceding tasks, to demonstrate
the output pattern for the GPT model, which did not affect the task accuracy. Table 4 Illustrates
the example preparation for the few-shot scenario. For the Intent Prompt, the 2% data sampling
was independent of the user query, while for the Parameter Prompt and Value Prompt, the
sampling was based on the identified category and the identified parameter, respectively. It is
important to note that the few-shot learning was much more data efficient than most ML-based
methods. For example, [15] leveraged transfer learning techniques to reduce data requirements
and used 80% of the same dataset to train ML models for classifying TC labels. See Appendix
A1 for examples of prompts.

Table 4. Example Preparation for Few-shot Scenarios

Use Case Sampling Methods

Intent Randomly sampled 2% NL queries for each TC label in

Prompt the dataset (fixed set for ALL NL queries)

Randomly sampled 2% NL queries for each parameter of

Parameter
the categorized building objects (fixed set of NL queries
Prompt
for a specific object category)

Randomly sampled 2% NL queries for each distinct

Value
value record for the identified parameter (fixed set of NL
Prompt
queries for a specific parameter of categorized objects)

To investigate the impact of different prompt components on task performance, we conducted

an ablation study of the dynamic template. Specifically, we systematically removed one
component at a time, including the System, Relevant Database, and Task Instruction
components, and evaluated the accuracy of the prompts in intent classification, parameter
identification, and value recognition. Additionally, the ablation study was performed under both
zero-shot and few-shot scenarios to further comprehend their influence with respect to the data
availability in the Few-shot Example component. This study not only benefits VA developers but
also construction practitioners, as both parties can participate in editing and refining the NL-

16
based prompts. The results show that the Task Instruction component was the most critical
component and collecting labeled data for the Few-shot Examples was useful for accurately
interpreting NL queries for BIM IR. In the following section, we present the results and discuss
their implications.

4.2 Results

Table 5. presents the results of our experiments on the framework. In terms of the Intent
Prompt, BIM-GPT achieved accuracy rates of 83.5% and 99.5% for classifying TC labels under
the zero-shot and few-shot scenarios, respectively. For building object category classification, a
subclass of TC labels, the zero-shot accuracy increased to 98.6%, while the few-shot accuracy
remained almost unchanged. For identifying filter and projection parameters, the zero-shot
accuracy rates were 92.1% and 90.1%, respectively, while the few-shot accuracy rates were
97.8% and 98.2%, respectively.

Regarding the Value Prompt, BIM-GPT achieved accuracy rates of 78.3% and 88.9 % for
predicting the values of filtered parameters under the zero-shot and few-shot scenarios,
respectively. The framework also achieved accuracy rates of 62.0% and 81.7% for extracting
the values from NL queries under the zero-shot and few-shot scenarios, respectively. Notably,
the accuracy of "Pred. / Extr. values," which refers to cases where the GPT model correctly
predicted or extracted the value, was higher than each individual accuracy rate. This suggests
that the extracted values have the potential to further improve the value prediction. Overall, our
prompt-based approach performs well in NLU tasks, achieving high accuracy rates.

Table 5. Accuracy Rate of NLU Tasks

The confusion matrices for TC labels are presented in Fig. 7, which illustrates the classification
table between true and predicted labels. As shown on the left, the zero-shot prompt tended to
misclassify the attribute and quantity label while accurately predicting the category of building
objects. On the right, the clear diagonal concentration indicates that the few-shot prompt
correctly classified most TC labels. Error analysis of the results revealed a common mistake
where the model misclassified NL queries such as "how high is double glass K?" and “What is
the width of rinker double d” as the window category instead of the door category. However,
these ambiguous NL queries are also confusing for humans.

17
Figure 7. Confusion Matrices for TC Labels: Zero-shot (left) & Few-shot (right)

To provide insights into BIM-GPT’s performance in identifying and predicting parameters and
values, we present the categorized results in Fig. 8. Predicting the exact values that match
distinct database records is a more difficult task, and this is reflected in the decreased accuracy
rates for filtered values, which were 15% and 9% lower than the rates for filtered parameters
under the zero-shot and few-shot scenarios, respectively. As shown on the left of Fig. 8, the
accuracy rates for filtered parameters were on average over 90% for most building object
categories. For the zero-shot scenario, the unit category has the lowest accuracy rate of 87.5%,
but it only contained a small number of NL queries. The door category, with 89.7% accuracy,
contributes most to the misclassification, followed by the room category, with 89.3% accuracy.
The average few-shot accuracy was 6% higher than the zero-shot one, indicating that providing
labeled data was effective to improve the performance of parameter identification.

On the right of Fig. 8, the accuracy rates for filtered values were generally above 80% for most
categories, with the exception of the door category. In the zero-shot scenario, the accuracy
rates of value recognition for the door and window categories were the lowest, at 72.2% and
80.4%, respectively. However, the average few-shot accuracy for value recognition was 14%
higher in the few-shot scenario compared to the zero-shot scenario. This indicates that
incorporating a small amount of NL queries with annotations was useful for the GPT model.
Error analysis did not reveal any typical patterns, and one of the major challenges the model
faced was the large number of distinct records in the database for the "room" and "id"
parameters. For instance, the “id” and “room” parameters had over 50 and 150 different values,
respectively, making it difficult for the GPT model to accurately identify the filtered value.

18
Figure 8. Categorized Filter Parameters (left) & Value Prediction (right)

Table 6 presents the results of our ablation study, in which we evaluated the impact of the
System (SYS), Relevant Database Information (DB), and Task Instruction (TASK) components
of the dynamic prompt template on the performance of the NLU tasks. We measured the
accuracy of classifying TC labels, object categories, identifying filter and projection parameters,
and extracting and predicting filtered values for both zero-shot and few-shot scenarios, while
removing one component at a time. As expected, the effect of removed components increased
following their sequences in the template for the zero-shot scenario. Specifically, removing the
TASK component significantly degraded the accuracy rates for all tasks (shown in red). For
most tasks in the few-shot scenario, the DB and SYS components did not influence the
accuracy rates as much as the TASK one (shown in bold text). However, removing the TASK
component did not affect the accuracy for value prediction and extraction (shown in blue).

Table 6. Ablation Study Results

Figure 9 presents the results of the ablation study for the Intent Prompt. Removing the SYS
components had no effect on task accuracy rates for either scenario, as the SYS only describes
background and role information for VAs. However, such descriptions might be useful to stop
the GPT model from generating irrelevant texts. For example, in the query "Is there a louver 37
on 01 fl 01 t.o. Slab?", the intent prompt without the SYS component generated irrelevant user
queries and answers following the correct classification. Additionally, removing the DB
component resulted in a 9% and 15% reduction in accuracy for the TC label and object category
tasks, respectively, in the zero-shot scenario, while it had no impact in the few-shot scenario.
Although the type and attribute information of categorized building objects in the DB could
provide some context for the GPT model, this information was less useful compared to more
specific few-shot examples. Without the TASK component, the zero-shot prompt did not
generate any meaningful output, as the accuracy rates dropped to 0%. Nevertheless, by
providing only 2% of the data as few-shot examples, the accuracy rates only dropped to about
70%. This indicates that collecting NL queries, even just a small number of them, is useful in
classifying the user intent and object category.

19
Figure 9. Ablation Study Results for Intent Prompt

Figure 10 illustrates the ablation study results for the Parameter Prompt. In the zero-shot
scenario, removing the DB component did not affect the accuracy of identifying projection
parameters, as our annotation only included the projection parameters but not their value
records. However, removing the DB component decreased the accuracy of identifying filter
parameters by 11%. Furthermore, removing the TASK component greatly reduced the accuracy
rates for identifying projection and filter parameters to 59.1% and 25.2%, respectively. Despite
this, because we added leading texts like “proj_para for '' and “filter_para for” in front of NL
queries in the prompts, the GPT model may have had a rough idea of the task even without the
TASK component. In the few-shot scenario, only the TASK component affected the accuracy of
parameter identification, with around a 10% reduction. These results indicate that the TASK is
the most critical components for parameter identification. Thus, when construction practitioners,
with rich domain expertise participate in the prompt development process, they should ensure
the proper design of the TASK component. This can help achieve an acceptable accuracy rate,
which is 80% for our case, and the collection of few-shot data can further improve accuracy,
closing the remaining 20% gap.

20
Figure 10. Ablation Study Results for Parameter Prompt

Figure 11 displays the ablation study results for the Value Prompt. In the zero-shot scenario,
removing the SYS component did not affect the accuracy of filtered value prediction and
extraction, while removing the DB component slightly increased the accuracy. This suggests
that including the values of the same parameter for other categorized objects in the DB might
introduce noise to the GPT model. For the few-shot scenario, none of the SYS, DB, and TASK
components affected the accuracy of filtered value prediction and extraction. This may be
because the few-shot examples already provided adequate context and information for such
specific tasks. Additionally, the accuracy rates for prediction were higher than those for
extraction, which may be due to the fuzzy definition of the ground truth values. For example,
consider the query “What is the object type of the door in room 0220?” The annotation of the
extracted value was “0202”, while the GPT model extracted “room 0220”, resulting in an
incorrect classification.

21
Figure 11. Ablation Study Results for Value Prompt

4.3 Discussion

The accurate interpretation of user queries is critical for NL-based approaches for BIM IR. Our
experiment results demonstrate the effectiveness and versatility of the BIM-GPT framework for
a range of NLP tasks. By aligning the GPT model with BIM IR context, our prompt-based
approach successfully classified the intent and building object category, identified filter and
projection parameters, and extracted and predicted parameter values from NL queries. Our
dynamic template was successfully employed to generate the Intent, Parameter, and Value
Prompts, and achieved high accuracy rates, even in the zero-shot scenario. Additionally, we
further enhanced the accuracy by incorporating supplementary examples in the few-shot
scenario.

The experimental results also substantiated the data efficiency of our approach in
comprehending NL queries. In comparison to the existing state-of-the-art method [15], which
necessitated 80% of the same dataset for training purposes and attained 99.76% accuracy for
the remaining 20% data, BIM-GPT achieved accuracy rates of 83.5% and 99.5% with no data
and a mere 2% of data incorporated in prompts, respectively, when tested on the complete
dataset. This indicates that BIM-GPT can more efficiently utilize the collected data for testing,
ensuring its robust performance in covering a wide variety of NL queries.

The results of the ablation offer insights into the influence of the components of the dynamic
prompts on system performance. We discovered that the Task Instruction component was most
critical, providing the necessary context for the GPT model to generate accurate responses. The
Relevant Database Information component was particularly beneficial for identifying object
categories and parameters, especially in zero-shot scenarios. Although the System component

22
did not considerably impact the system's performance, it may aid in preventing irrelevant text
generation of the GPT model. This assessment of prompt components can not only help
construction practitioners prioritize those components to improve task performance, but also
guide the future development of more effective prompts.

In summary, our prompt-based approach exhibited excellent performance for text classification
tasks, such as intent classification and parameter identification, and reasonably good
performance for information extraction and prediction tasks, such as value recognition. By
leveraging the NLP capabilities of the GPT model, BIM-GPT achieved up to 98% accuracy for
classifying texts from a limited number of predefined labels (e.g., intent, categories,
parameters). However, attaining the same level of accuracy for predicting specific values from a
large number of possible solutions remained challenging for BIM-GPT. Furthermore, although
our prompt-based approach was much more data efficient than existing approaches,
incorporating labeled data in prompts still determines how accurately NLP tasks are performed,
especially difficult ones.

4.4 Limitations

Although our framework's results demonstrate promise, several limitations need to be

acknowledged. Firstly, the evaluation of the framework was conducted on a single BIM IR
dataset. This augmented dataset covered only NL queries for attribute and quantity information
of a few categorized building objects, such as windows doors, rooms, units and stories, and a
limited range of filter parameters and values. Geometric information about building objects, such
as width, length, and height, which are useful in construction projects, was not included. Future
research should collect more NL queries for different BIMs to assess the framework's
generalizability.

Moreover, the prompt-based approach is still a proof-of-concept method based on the recently
released GPT model (i.e., ChatGPT). It remains challenging for the approach to perform
accurately on difficult NLU tasks that involve numerous labels, particularly classification and
prediction. Future work should improve the design of prompts to better align GPT models with
the BIM IR context. Because the sequence and composition of few-shot examples and relevant
information incorporated in prompts can influence GPT model performance, future research
should also explore more effective sampling methods tailored to user queries. In addition, future
studies should investigate how different GPT models may affect the performance of the BIM-
GPT framework.

4. Validation of BIM-GPT Framework

In the previous section, we validated the effectiveness and data efficiency of the BIM-GPT
framework in interpreting NL queries for BIM IR, demonstrating its potential for practical
applications. Building upon these findings, Section 5 will present the validation of the complete
BIM-GPT framework through the implementation of a VA prototype for a hospital building,
showcasing its functionality and evaluating the integration of the DM, NLP, and UI modules.

23
5.1 Implementation

Based on the BIM-GPT framework, we developed a VA prototype for a sample hospital building.
The hospital BIM comprises over 42,000 building objects, including 675 facility assets with rich
and structured properties. The implementation process for BIM-GPT, illustrated in Fig. 12,
involved three main steps: data preprocessing for the DM module; building and testing the
prompt library and manager for the NLP module; and integrating the web-based interface for the
UI module.

Figure 12. Implementation Process for BIM-GPT

5.1.1 Data Management Module

The development of the DM module involved a series of preprocessing steps to prepare for high
quality data for BIM IR, as shown in Fig. 13. Firstly, the hospital BIM consisted of the
architectural, structural, and mechanical models in the format of Autodesk Revit. Our first step
was to use the Model Derivative API of the Autodesk Platform Service to extract building objects
and their properties, and to translate the Revit model into SVF2 format that can be rendered in a
browser. Next, we cleaned the extracted building information using Python to ensure data
quality. The data attributes of the building objects were categorized into five groups: basic
information (i.e., component_id, component_type, is_asset); location information (i.e.,
level_number, room_type, room_name); building system information (i.e., system_type,
system_name); equipment information (i.e., manufacturer, model_name, specification); and

24
OmniClass information (i.e., title, number). Finally, we imported the cleaned data into the cloud-
based MongoDB database. The data was stored as documents in BSON format, which allows
for flexible and efficient querying. Additionally, the translated model was used for the cloud-
based BIM with Autodesk Forge API, enabling users to access the BIM through a web browser.
Overall, the DM module effectively extracted and cleaned the necessary data from the hospital
BIM, making it readily accessible for the subsequent NLP module.

Figure 13. Data Preprocessing

5.1.2 Natural Language Processing Module

For the NLP module, we implemented the prompt-based approach using JavaScript and
Node.js (v18.12.1) along with Npm (8.19.2). We built upon the methodologies outlined in
Sections 3.1 and 3.2 to develop a comprehensive prompt library and a prompt manager that
connected with the ChatGPT model (gpt-3.5-turbo) via the OpenAI API. To evaluate the NLP
module, we tested it with a set of 100 NL queries. While Section 4 elaborates on the
implementation of prompts for identifying user intent, parameters, and values, this section
focuses on prompts for summarization and general question answering, and the NL-based
prompt development process.

For the Summary Prompt, we incorporated the retrieved results from the BIM database and
explicitly defined the task to answer user's queries The Task Instruction component stated:
"Your task is to provide a summarized response of the retrieved results from BIM to answer the
user's query. Your summary must be based on the retrieved results. You can use the first 1-2 as
examples, where many records are retrieved. You may provide the id and type, location, and
building system of the retrieved building components, which are generally useful for users." This
task description effectively guided the GPT model in summarizing the retrieved results for
various situations, thereby delivering the desired information to users.

For the General Question Prompt, we delineated the scope of general questions that the VA
would address and utilized the GPT model as an external knowledge base. Specifically, the
Task Instruction component stated: "The user may ask questions about the functions of building
components and the definition of the components' properties in the BIM database; users may
also search for general knowledge about buildings, BIM technology, architecture, engineering,
construction, and facility management. For other questions, your answer should be 'Sorry, I do
not know. Your question is out of my scope.'" This scope definition minimized the generation of
irrelevant text by the GPT model, preventing potential user confusion about its functionality.

25
The final aspect of the prompt development involves testing and refinement of the prompts to
ensure continuous improvement. BIM-GPT offers a significant advantage over existing
approaches because it allows for easy improvement of the system's NLP capabilities by refining
the prompts in the library, rather than necessitating extensive revisions for traditional NLP
methods, such as syntactic and semantic analysis, and the need for retraining ML models. For
such an NL-based development, even practitioners without programming backgrounds can
participate in improving the prompts. Specifically, we incorporated failed NL queries into the
Few-shot Example component of prompts to enhance the accuracy of NLU. The prototype was
tested with 90 NL queries for the BIM database search, which we collected and manually
annotated, and 10 general questions about BIM technology sourced from an existing BIM QA
dataset [16]. Examples of general BIM questions included "Is the BIM a process or a model?",
"What is the advantage of BIM?", and "How to define BIM?". Although we initially operated
under a zero-shot scenario, the performance of the VA prototype showed steady improvement
throughout the testing process.

5.1.3 User Interface Module

We developed a web-based UI for the VA prototype to support efficient BIM IR. To enable NL
interaction, we integrated with the open-source Genie Server [60] developed by Stanford Open
Virtual Assistant Lab (OVAL) as the front-end chat interface. The Genie Server was further
integrated with Autodesk Forge, which provided a library for visualizing the cloud-based BIM. To
enable seamless communication between the front-end interface and the prompt manager, we
used web sockets to send and receive NL queries and answers, as well as the IDs of retrieved
results. Based on these IDs, the cloud-based BIM rendered different 3D contextual scenes of
the retrieved building objects in response to the user's query. Additionally, our web-based
prototype also can be easily hosted by any cloud service provider (e.g., Google Cloud Platform)
and ported to a web browser. This web-based setting reduces the requirements for high-
performance workstations and allows for easy access from portable devices.

5.2 Discussion

Fig. 14 displays the developed VA prototype by implementing the BIM-GPT framework. Within
the user interface, users input NL queries (shown in green boxes) and receive VA’s NL answers
(shown in white boxes) on the left in the chat boxes, while they visualize retrieved results and
navigate the 3D model using the GUI of Autodesk Forge on the right. The screenshot
demonstrates that the VA retrieved accurate information from the BIM, because the attribute
values in the property window of the queried building component match the VA’s response.

26
Figure 14. User Interface of VA Prototype: Chatbox (left); 3D Rendering (right)

In Fig. 15, a dialogue between the user and the VA is presented. Upon initializing the prototype,
the BIM of the six-story hospital building was rendered. In this example, the user queried the
building components by filtering on the level_number, component_type and component_id, and
the corresponding retrieved results were visualized in the cloud-based BIM as shown in (a), (b),
and (c), respectively. The summarized results, along with their 3D contexts, helped the user
retrieve the required information and easily navigate the BIM. Finally, the user posed a general
question about the building system attribute of the pump, and the VA’s answer provided
additional information, facilitating the IR process.

27
Figure 15. Example Conversations and Visualization of VA Prototype

The developed VA prototype validated the functionality of our BIM-GPT framework. The UI
module enabled users to easily retrieve information through NL and navigate the cloud-based
BIM. The NLP module accurately interpreted the user’s queries, summarized retrieved
information, and coherently answered BIM-related questions. The DM module successfully
managed building information and provided retrieved results according to the user’s queries. As
such, our VA prototype reduced the BIM knowledge requirements for construction practitioners,
supporting more efficient BIM IR.

In addition, the prototype development process demonstrated the significant benefits of our
framework. By leveraging the capabilities of the GPT model, BIM-GPT substantially reduced the
engineering effort required to handle NLP tasks. It eliminated the need for customizing syntactic
and semantic analysis with traditional methods and the need for model training with ML
methods, while accurately interpreting all NL queries during the testing process. For NLG, it
eliminated the need to build predefined templates that only respond to queries within predefined
syntactic patterns. For QA, it leveraged the GPT model as an external knowledge source, which
required no additional effort for knowledge base construction. Furthermore, the VA’s
performance was easily improved by refining the prompts. This NL-based development enabled
construction practitioners to participate in the rapid testing and prototyping processes, unlike
existing approaches that heavily rely on technical experts to develop and maintain VAs.

5.3 Limitations

While we successfully implemented our novel framework, some limitations remain. Firstly, since
this study was primarily focused on developing the framework, a quantitative evaluation of the
framework was not included. In the future, we plan to conduct user studies to investigate the
impact of the NL-based interface on the efficiency of BIM IR and evaluate the usability of the VA

28
prototype. Secondly, the current framework required approximately 5 seconds to answer a BIM
database search query, because of the need to make 5 ChatGPT API calls. Future research
should reduce this latency while maintaining high accuracy in interpreting user queries. Thirdly,
the current VA only supports single-turn conversations. This prevents follow-up conversation,
which might be natural to users and useful for enhancing IR. Future research should investigate
ways to support multi-turn conversations. Lastly, because GPT models might generate
irrelevant or inaccurate information without proper prompting, future research should study how
to improve the control of text generation, especially for NLG and QA tasks.

5. Conclusion
Efficiently retrieving information from BIMs is challenging because they contain a massive
amount of multi-disciplinary data that are relevant across the building’s lifecycle. Current
retrieval strategies require users to have deep knowledge of BIM, which is a major impediment
to utilizing the rich, up-to-date information of BIMs in practice. Although VA solutions have been
explored to help address this accessibility issue, existing methods require extensive engineering
efforts and data collection, which are generally not available or practical in the construction
industry. This paper introduces BIM-GPT, a prompt-based virtual assistant (VA) framework that
integrates BIM and GPT to enable practitioners to effectively query building information, receive
summarized results with 3D visualization, and ask BIM-related questions using NL.

Our framework was evaluated on an augmented BIM IR dataset of 2065 NL queries. BIM-GPT
demonstrated remarkable NLP capabilities and high data efficiency, achieving accuracy rates of
83.5% and 99.5% for classifying user intent and building object categories in NL queries with no
data and 2% data incorporated in prompts, respectively. Additionally, BIM-GPT achieved
accuracy rates as high as 97.8% and 88.9% for identifying building object parameters and
recognizing parameter values in the NL queries, respectively, with similar prompt data levels.
The results demonstrate that our prompt-based approach is effective and versatile in performing
a wide range of NLP tasks.

To validate the functionality of the framework, we developed a VA prototype for a hospital

building. Our implementation included a web-based interface with Autodesk Forge for BIM
visualization, a prompt-based NLP module that connected with the ChatGPT API, and a
MongoDB cloud-based database. The VA prototype was tested with 100 NL queries, and the
results indicate that the VA effectively interacted with users through NL to support efficient BIM
IR. This successful implementation underscores the potential of our BIM-GPT framework to
support practical applications in the construction industry, improving the accessibility of BIMs for
IR.

As such, our research contributes to the body of construction automation theory and practice.
The BIM-GPT framework advances the theoretical understanding of how to leverage GPT
models to automate BIM IR and how to design effective prompts for NLP tasks, such as text
classification, information extraction, summarization, and question answering. By substantially
reducing engineering and data requirements, the framework facilitates the development of VA
for BIM to improve the speed, accuracy, and user experience of IR to support construction and

29
facility operation activities. Its effectiveness and versatility also open opportunities for more
advanced VAs and NL-based automation applications in the construction industry.

While the BIM-GPT framework shows promise, it is important to acknowledge its current
limitations. The evaluation of the framework was conducted on a single dataset of NL queries,
which may not represent all possible types of queries. The accuracy and effectiveness of the
framework may vary depending on the size, diversity, and coverage of the dataset used for
development and evaluation. Furthermore, this study mainly focused on the development of the
framework, and as such, excludes user studies with construction practitioners. Therefore,
further studies are needed to assess the impact of prompt-based VAs on the efficiency of BIM
IR compared to existing retrieval strategies.

Nevertheless, this study represents an important step towards bridging the gap between NL-
based IR and BIM through the development of the BIM-GPT framework. Future research should
aim to improve the accuracy, efficiency, and generalizability of the framework. Additionally, BIM-
GPT’s functionality can be extended from IR to comprehensive management and operation of
BIMs, allowing users to manipulate building information through NL. In the long-term, with
continued development and refinement, the BIM-GPT framework has the potential to accelerate
the digital transformation of the construction industry by supporting the development of prompt-
based VAs for BIMs and NL-related automation applications.

30
Reference
[1] A. Borrmann, M. König, C. Koch, J. Beetz, Building Information Modeling: Why? What?
How?, in: A. Borrmann, M. König, C. Koch, J. Beetz (Eds.), Building Information Modeling:
Technology Foundations and Industry Practice, Springer International Publishing, Cham,
2018: pp. 1–24. https://doi.org/10.1007/978-3-319-92862-3_1.
[2] X. Xu, L. Ma, L. Ding, A Framework for BIM-Enabled Life-Cycle Information Management of
Construction Project, Int. J. Adv. Rob. Syst. 11 (2014) 126. https://doi.org/10.5772/58445.
[3] J. Kunz, M. Fischer, Virtual design and construction, Constr. Manage. Econ. 38 (2020)
355–363. https://doi.org/10.1080/01446193.2020.1714068.
[4] D. Bryde, M. Broquetas, J.M. Volm, The project benefits of Building Information Modelling
(BIM), Int. J. Project Manage. 31 (2013) 971–980.
https://doi.org/10.1016/j.ijproman.2012.12.001.
[5] Y. Zou, A. Kiviniemi, S.W. Jones, A review of risk management through BIM and BIM-
related technologies, Saf. Sci. 97 (2017) 88–98. https://doi.org/10.1016/j.ssci.2015.12.027.
[6] P.E.D. Love, I. Simpson, A. Hill, C. Standing, From justification to evaluation: Building
information modeling for asset owners, Autom. Constr. 35 (2013) 208–216.
https://doi.org/10.1016/j.autcon.2013.05.008.
[7] Z. Hu, J. Zhang, BIM oriented intelligent data mining and representation, Proceedings of
the 30th CIB W78 International. (2013). https://www.researchgate.net/profile/Jia-Rui-
Lin/publication/306519679_BIM_Oriented_Intelligent_Data_Mining_and_Representation/lin
ks/5c0f27b592851c39ebe44687/BIM-Oriented-Intelligent-Data-Mining-and-
Representation.pdf.
[8] J.R. Lin, Z.Z. Hu, J.P. Zhang, F.Q. Yu, A natural-language-based approach to intelligent
data retrieval and representation for cloud BIM, Computer-Aided Civil And. (2016).
https://onlinelibrary.wiley.com/doi/abs/10.1111/mice.12151?casa_token=yyNrUN3JcAsAAA
AA:u0o7Q1Y1tnidRFxcO7h1R9hy7K1W4HcS7LtTYMLvk9wT7XlGSJ_Z0LD8Q4JDc7iXiJC
aFzmCIBW08Q.
[9] B. Ariono, M. Wasesa, W. Dhewanto, The Drivers, Barriers, and Enablers of Building
Information Modeling (BIM) Innovation in Developing Countries: Insights from Systematic
Literature Review and Comparative Analysis, Buildings. 12 (2022) 1912.
https://doi.org/10.3390/buildings12111912.
[10] S. Durdyev, M. Ashour, S. Connelly, A. Mahdiyar, Barriers to the implementation of Building
Information Modelling (BIM) for facility management, Journal of Building Engineering. 46
(2022) 103736. https://doi.org/10.1016/j.jobe.2021.103736.
[11] S. Shin, R.R.A. Issa, BIMASR: Framework for voice-based BIM information retrieval, J.
Constr. Eng. Manage. 147 (2021) 04021124. https://doi.org/10.1061/(asce)co.1943-
7862.0002138.
[12] N. Wang, R.R.A. Issa, C.J. Anumba, NLP-based query-answering system for information
extraction from building information models, J. Comput. Civ. Eng. 36 (2022).
https://doi.org/10.1061/(asce)cp.1943-5487.0001019.
[13] F. Elghaish, J.K. Chauhan, S. Matarneh, F. Pour Rahimian, M.R. Hosseini, Artificial
intelligence-based voice assistant for BIM data management, Autom. Constr. 140 (2022)
104320. https://doi.org/10.1016/j.autcon.2022.104320.
[14] A.B. Saka, L.O. Oyedele, L.A. Akanbi, S.A. Ganiyu, D.W.M. Chan, S.A. Bello,
Conversational artificial intelligence in the AEC industry: A review of present status,
challenges and opportunities, Advanced Engineering Informatics. 55 (2023) 101869.
https://doi.org/10.1016/j.aei.2022.101869.
[15] N. Wang, R.R.A. Issa, C.J. Anumba, Transfer learning-based query classification for
intelligent building information spoken dialogue, Autom. Constr. 141 (2022) 104403.
https://doi.org/10.1016/j.autcon.2022.104403.

31
[16] T.-H. Lin, Y.-H. Huang, A. Putranto, Intelligent question and answer system for building
information modeling and artificial intelligence of things based on the bidirectional encoder
representations from transformers model, Autom. Constr. 142 (2022) 104483.
https://doi.org/10.1016/j.autcon.2022.104483.
[17] T. Wu, E. Jiang, A. Donsbach, J. Gray, A. Molina, M. Terry, C.J. Cai, PromptChainer:
Chaining Large Language Model Prompts through Visual Programming, in: Extended
Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems,
Association for Computing Machinery, New York, NY, USA, 2022: pp. 1–10.
https://doi.org/10.1145/3491101.3519729.
[18] R. Bommasani, D.A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M.S. Bernstein, J.
Bohg, A. Bosselut, E. Brunskill, E. Brynjolfsson, S. Buch, D. Card, R. Castellon, N.
Chatterji, A. Chen, K. Creel, J.Q. Davis, D. Demszky, C. Donahue, M. Doumbouya, E.
Durmus, S. Ermon, J. Etchemendy, K. Ethayarajh, L. Fei-Fei, C. Finn, T. Gale, L. Gillespie,
K. Goel, N. Goodman, S. Grossman, N. Guha, T. Hashimoto, P. Henderson, J. Hewitt, D.E.
Ho, J. Hong, K. Hsu, J. Huang, T. Icard, S. Jain, D. Jurafsky, P. Kalluri, S. Karamcheti, G.
Keeling, F. Khani, O. Khattab, P.W. Koh, M. Krass, R. Krishna, R. Kuditipudi, A. Kumar, F.
Ladhak, M. Lee, T. Lee, J. Leskovec, I. Levent, X.L. Li, X. Li, T. Ma, A. Malik, C.D.
Manning, S. Mirchandani, E. Mitchell, Z. Munyikwa, S. Nair, A. Narayan, D. Narayanan, B.
Newman, A. Nie, J.C. Niebles, H. Nilforoshan, J. Nyarko, G. Ogut, L. Orr, I. Papadimitriou,
J.S. Park, C. Piech, E. Portelance, C. Potts, A. Raghunathan, R. Reich, H. Ren, F. Rong, Y.
Roohani, C. Ruiz, J. Ryan, C. Ré, D. Sadigh, S. Sagawa, K. Santhanam, A. Shih, K.
Srinivasan, A. Tamkin, R. Taori, A.W. Thomas, F. Tramèr, R.E. Wang, W. Wang, B. Wu, J.
Wu, Y. Wu, S.M. Xie, M. Yasunaga, J. You, M. Zaharia, M. Zhang, T. Zhang, X. Zhang, Y.
Zhang, L. Zheng, K. Zhou, P. Liang, On the Opportunities and Risks of Foundation Models,
ArXiv [Cs.LG]. (2021). http://arxiv.org/abs/2108.07258.
[19] T. Brown, B. Mann, N. Ryder, M. Subbiah, J.D. Kaplan, P. Dhariwal, A. Neelakantan, P.
Shyam, G. Sastry, A. Askell, Others, Language models are few-shot learners, Adv. Neural
Inf. Process. Syst. 33 (2020) 1877–1901.
https://proceedings.neurips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-
Abstract.html.
[20] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C.L. Wainwright, P. Mishkin, C. Zhang, S. Agarwal,
K. Slama, A. Ray, J. Schulman, J. Hilton, F. Kelton, L. Miller, M. Simens, A. Askell, P.
Welinder, P. Christiano, J. Leike, R. Lowe, Training language models to follow instructions
with human feedback, ArXiv [Cs.CL]. (2022). http://arxiv.org/abs/2203.02155.
[21] P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, G. Neubig, Pre-train, Prompt, and Predict: A
Systematic Survey of Prompting Methods in Natural Language Processing, ACM Comput.
Surv. 55 (2023) 1–35. https://doi.org/10.1145/3560815.
[22] Introducing ChatGPT, (n.d.). https://openai.com/blog/chatgpt (accessed March 10, 2023).
[23] M. Zong, B. Krishnamachari, a survey on GPT-3, ArXiv [Cs.CL]. (2022).
http://arxiv.org/abs/2212.00857.
[24] E. Sezgin, J. Sirrianni, S.L. Linwood, Operationalizing and implementing pretrained, large
artificial intelligence linguistic models in the US health care system: outlook of generative
pretrained transformer 3 (GPT-3) as a service model, JMIR Medical Informatics. 10 (2022)
e32875. https://medinform.jmir.org/2022/2/e32875.
[25] R. Sacks, C. Eastman, G. Lee, P. Teicholz, BIM Handbook: A Guide to Building Information
Modeling for Owners, Designers, Engineers, Contractors, and Facility Managers, John
Wiley & Sons, 2018. https://play.google.com/store/books/details?id=IU9mDwAAQBAJ.
[26] K. Ullah, I. Lill, E. Witt, An Overview of BIM Adoption in the Construction Industry: Benefits
and Barriers, in: L. Irene, W. Emlyn (Eds.), 10th Nordic Conference on Construction
Economics and Organization, Emerald Publishing Limited, 2019: pp. 297–303.
https://doi.org/10.1108/S2516-285320190000002052.

32
[27] M.R. Hosseini, M. Maghrebi, A. Akbarnezhad, Analysis of citation networks in building
information modeling research, Journal Of. (2018).
https://doi.org/10.1061/(ASCE)CO.1943-7862.0001492.
[28] B. Becerik-Gerber, F. Jazizadeh, N. Li, G. Calis, Application areas and data requirements
for BIM-enabled facilities management, J. Constr. Eng. Manage. 138 (2012) 431–442.
https://doi.org/10.1061/(asce)co.1943-7862.0000433.
[29] A. Borrmann, S. Schraufstetter, E. Rank, Implementing metric operators of a spatial query
language for 3D building models: Octree and B-rep approaches, J. Comput. Civ. Eng. 23
(2009) 34–46. https://doi.org/10.1061/(asce)0887-3801(2009)23:1(34).
[30] P. Demian, K. Ruikar, T. Sahu, A. Morris, Three-Dimensional Information Retrieval (3DIR):
Exploiting 3D Geometry and Model Topology in Information Retrieval from BIM
Environments, IJ3DIM. 5 (2016) 67–78. https://doi.org/10.4018/IJ3DIM.2016010105.
[31] Revit software, (2022). https://www.autodesk.com/products/revit/overview (accessed March
15, 2023).
[32] Y.-C. Lin, Construction 3D BIM-based knowledge management system: a case study,
Journal of Civil Engineering and Management. 20 (2014) 186–200.
https://doi.org/10.3846/13923730.2013.801887.
[33] W.-L. Lee, M.-H. Tsai, C.-H. Yang, J.-R. Juang, J.-Y. Su, V3DM+: BIM interactive
collaboration system for facility management, Visualization in Engineering. 4 (2016) 5.
https://doi.org/10.1186/s40327-016-0035-9.
[34] U. Isikdag, G. Aouad, J. Underwood, S. Wu, Building information models: a review on
storage and exchange mechanisms, Bringing ITC Knowledge to Work. (2007) 135–144.
https://itc.scix.net/pdfs/w78-2007-020-068b-Isikdag.pdf.
[35] ISO 16739-1:2018, ISO. (2021). https://www.iso.org/standard/70303.html (accessed March
16, 2023).
[36] J. Beetz, J. van Leeuwen, B. de Vries, IfcOWL: A case of transforming EXPRESS schemas
into ontologies, Artif. Intell. Eng. Des. Anal. Manuf. 23 (2009) 89–101.
https://doi.org/10.1017/S0890060409000122.
[37] C. Preidel, S. Daum, A. Borrmann, Data retrieval from building information models based
on visual programming, Visualization in Engineering. 5 (2017) 18.
https://doi.org/10.1186/s40327-017-0055-0.
[38] G.K. Pullum, G. Gazdar, Natural languages and context-free languages, Linguist. Philos. 4
(1982) 471–504. https://doi.org/10.1007/bf00360802.
[39] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, J. Dean, Distributed Representations of
Words and Phrases and their Compositionality, ArXiv [Cs.CL]. (2013).
https://proceedings.neurips.cc/paper/2013/hash/9aa42b31882ec039965f3c4923ce901b-
Abstract.html (accessed March 16, 2023).
[40] H. Zhoui, M.O. Wong, H. Ying, S.H. Lee, A framework of a multi-user voice-driven BIM-
based navigation system for fire emergency response, in: 26th International Workshop on
Intelligent Computing in Engineering (EG-ICE 2019), The European Group for Intelligent
Computing in Engineering (EG-ICE). The …, 2019.
https://hub.hku.hk/handle/10722/272474.
[41] G. Gao, Y.-S. Liu, M. Wang, M. Gu, J.-H. Yong, A query expansion method for retrieving
online BIM resources based on Industry Foundation Classes, Autom. Constr. 56 (2015) 14–
25. https://doi.org/10.1016/j.autcon.2015.04.006.
[42] S. Wu, Q. Shen, Y. Deng, J. Cheng, Natural-language-based intelligent retrieval engine for
BIM object database, Comput. Ind. 108 (2019) 73–88.
https://doi.org/10.1016/j.compind.2019.02.016.
[43] S. Shin, C. Lee, R.R.A. Issa, Advanced BIM Platform Based on the Spoken Dialogue for
End-User, in: Proceedings of the 18th International Conference on Computing in Civil and
Building Engineering, Springer International Publishing, 2021: pp. 123–132.

33
https://doi.org/10.1007/978-3-030-51295-8_11.
[44] Shin Sangyun, Lee Chankyu, Issa Raja R. A., Framework for Automatic Speech
Recognition-Based Building Information Retrieval from BIM Software, Construction
Research Congress 2020. (2020) 992–1000. https://doi.org/10.1061/9780784482865.105.
[45] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of Deep Bidirectional
Transformers for Language Understanding, ArXiv [Cs.CL]. (2018).
http://arxiv.org/abs/1810.04805.
[46] Y. Ding, J. Ma, X. Luo, Applications of natural language processing in construction, Autom.
Constr. 136 (2022) 104169. https://doi.org/10.1016/j.autcon.2022.104169.
[47] N. Wang, R.R.A. Issa, Natural language generation from building information models for
intelligent NLP-based information extraction, in: EG-ICE 2020 Workshop on Intelligent
Computing in Engineering, Universitätsverlag der TU Berlin, Berlin, 2020: pp. 275–284.
https://www.researchgate.net/profile/Ning-Wang-
77/publication/342574758_Natural_Language_Generation_from_Building_Information_Mod
els_for_Intelligent_NLP-
based_Information_Extraction/links/5efbd96d92851c52d60c82d5/Natural-Language-
Generation-from-Building-Information-Models-for-Intelligent-NLP-based-Information-
Extraction.pdf.
[48] N. Wang, R.R. Issa, C.J. Anumba, A Framework for Intelligent Building Information Spoken
Dialogue System (iBISDS), EG-ICE 2021 Workshop On. (2021).
https://books.google.com/books?hl=en&lr=&id=5mJMEAAAQBAJ&oi=fnd&pg=PA228&dq=
Wang,+N.,+R.R.+Issa,+and+C.J.+Anumba,+A+Framework+for+Intelligent+Building+Inform
ation+Spoken+Dialogue+System+,+in+EG-
ICE+2021+Workshop+on+Intelligent+Computing+in+Engineering.+2021.&ots=WjtLVLdDA
W&sig=8R52tmexJ95AF3jlXihlKhdNDe8.
[49] Wang Ning, Issa Raja R. A., Anumba Chimay J., Query Answering System for Building
Information Modeling Using BERT NN Algorithm and NLG, J. Comput. Civ. Eng. (2021)
425–432. https://doi.org/10.1061/9780784483893.053.
[50] M. Niknam, S. Karshenas, Sustainable Design of Buildings using Semantic BIM and
Semantic Web Services, Procedia Engineering. 118 (2015) 909–917.
https://doi.org/10.1016/j.proeng.2015.08.530.
[51] M. Niknam, S. Karshenas, Integrating distributed sources of information for construction
cost estimating using Semantic Web and Semantic Web Service technologies, Autom.
Constr. 57 (2015) 222–238. https://doi.org/10.1016/j.autcon.2015.04.003.
[52] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, Language Models are
Unsupervised Multitask Learners, (2019). https://life-
extension.github.io/2020/05/27/GPT%E6%8A%80%E6%9C%AF%E5%88%9D%E6%8E%
A2/language-models.pdf (accessed March 18, 2023).
[53] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P.J. Liu,
Exploring the limits of transfer learning with a unified text-to-text transformer, J. Mach.
Learn. Res. 21 (2020) 5485–5551. https://dl.acm.org/doi/abs/10.5555/3455716.3455856.
[54] S. Min, X. Lyu, A. Holtzman, M. Artetxe, M. Lewis, H. Hajishirzi, L. Zettlemoyer, Rethinking
the Role of Demonstrations: What Makes In-Context Learning Work?, ArXiv [Cs.CL].
(2022). http://arxiv.org/abs/2202.12837.
[55] Q. Dong, L. Li, D. Dai, C. Zheng, Z. Wu, B. Chang, X. Sun, J. Xu, L. Li, Z. Sui, A Survey on
In-context Learning, ArXiv [Cs.CL]. (2022). http://arxiv.org/abs/2301.00234.
[56] X. Wei, X. Cui, N. Cheng, X. Wang, X. Zhang, S. Huang, P. Xie, J. Xu, Y. Chen, M. Zhang,
Y. Jiang, W. Han, Zero-Shot Information Extraction via Chatting with ChatGPT, ArXiv
[Cs.CL]. (2023). http://arxiv.org/abs/2302.10205.
[57] X. Yang, Y. Li, X. Zhang, H. Chen, W. Cheng, Exploring the Limits of ChatGPT for Query or
Aspect-based Text Summarization, ArXiv [Cs.CL]. (2023). http://arxiv.org/abs/2302.08081.

34
[58] Y. Bang, S. Cahyawijaya, N. Lee, W. Dai, D. Su, B. Wilie, H. Lovenia, Z. Ji, T. Yu, W.
Chung, Q.V. Do, Y. Xu, P. Fung, A Multitask, Multilingual, Multimodal Evaluation of
ChatGPT on Reasoning, Hallucination, and Interactivity, ArXiv [Cs.CL]. (2023).
http://arxiv.org/abs/2302.04023.
[59] P. Rajpurkar, J. Zhang, K. Lopyrev, P. Liang, SQuAD: 100,000+ Questions for Machine
Comprehension of Text, ArXiv [Cs.CL]. (2016). http://arxiv.org/abs/1606.05250.
[60] genie-server: The home server version of Almond, Github, n.d. https://github.com/stanford-
oval/genie-server (accessed March 26, 2023).

1-S2.0-S0926580523003278-Mainvebdoi Iq Eh
No ratings yet
1-S2.0-S0926580523003278-Mainvebdoi Iq Eh
24 pages
Buildings 14 02499
No ratings yet
Buildings 14 02499
30 pages
2023 13 ITcon Nabavi
No ratings yet
2023 13 ITcon Nabavi
20 pages
GPT Assistant for BIM Interaction
No ratings yet
GPT Assistant for BIM Interaction
22 pages
DAVE: GPT Assistant for BIM Interaction
No ratings yet
DAVE: GPT Assistant for BIM Interaction
21 pages
NLP-Based Query-Answering System For Information Extraction From Building Information Models
No ratings yet
NLP-Based Query-Answering System For Information Extraction From Building Information Models
11 pages
AI Voice Assistant for BIM Users
No ratings yet
AI Voice Assistant for BIM Users
15 pages
Text2BIM Generating Building Models Using A Large Language Model-Based Multi-Agent
No ratings yet
Text2BIM Generating Building Models Using A Large Language Model-Based Multi-Agent
42 pages
Smart Fridge: Conversational AI Insights
No ratings yet
Smart Fridge: Conversational AI Insights
14 pages
Deep Learning Architectures Enabling Sophisticated Feature Extraction and Representation For Complex Data Analysis
No ratings yet
Deep Learning Architectures Enabling Sophisticated Feature Extraction and Representation For Complex Data Analysis
11 pages
BIMBOT-Artificial Intelligence Applied To BIM Desi
No ratings yet
BIMBOT-Artificial Intelligence Applied To BIM Desi
16 pages
Computer Vision in Construction Management
No ratings yet
Computer Vision in Construction Management
11 pages
Recognizing and Classifying Unknown Object in BIM Using 2D CNN
No ratings yet
Recognizing and Classifying Unknown Object in BIM Using 2D CNN
11 pages
BIM Integration with AI in AEC Industry
No ratings yet
BIM Integration with AI in AEC Industry
23 pages
A Fuzzy Extension of The Semantic Building Inform - 2015 - Automation in Constru
No ratings yet
A Fuzzy Extension of The Semantic Building Inform - 2015 - Automation in Constru
11 pages
Ślusarczyk Strug 2023 Machine Learning Methods in Bim Based Applications A Review
No ratings yet
Ślusarczyk Strug 2023 Machine Learning Methods in Bim Based Applications A Review
22 pages
Artificial Intelligence For BIM Content Management and Delivery: Case Study of Association Rule Mining For Construction Detailing
No ratings yet
Artificial Intelligence For BIM Content Management and Delivery: Case Study of Association Rule Mining For Construction Detailing
43 pages
Fin Irjmets1730175122
No ratings yet
Fin Irjmets1730175122
6 pages
Using ChatGPT in Construction Projects Unveiling Its Cybersecurity Risks Through A Bibliometric Analysis
No ratings yet
Using ChatGPT in Construction Projects Unveiling Its Cybersecurity Risks Through A Bibliometric Analysis
10 pages
I3CE Abstract
No ratings yet
I3CE Abstract
1 page
AI Chatbot for Institute Queries
No ratings yet
AI Chatbot for Institute Queries
8 pages
Mohamed Yousra
No ratings yet
Mohamed Yousra
54 pages
Deep Learningbased Software Engineering Progress Challenges and OpportunitiesScience China Information Sciences
No ratings yet
Deep Learningbased Software Engineering Progress Challenges and OpportunitiesScience China Information Sciences
88 pages
Exploring Potentials of Using Bim Data For Formwork Design Through Api Development
No ratings yet
Exploring Potentials of Using Bim Data For Formwork Design Through Api Development
9 pages
Whitepaper ML
No ratings yet
Whitepaper ML
4 pages
Artificial Intelligence Based Mobile Robot
No ratings yet
Artificial Intelligence Based Mobile Robot
19 pages
Phase 4
No ratings yet
Phase 4
27 pages
Abstract References
No ratings yet
Abstract References
3 pages
Smart Door Lock System Using Face Recognition
No ratings yet
Smart Door Lock System Using Face Recognition
5 pages
Buildings 13 00857
No ratings yet
Buildings 13 00857
16 pages
GANs in Construction: A Review
No ratings yet
GANs in Construction: A Review
16 pages
A Systematic Literature Review of Deep Learning Approaches For Sketch-Based Image Retrieval Datasets Metrics and Future Directions
No ratings yet
A Systematic Literature Review of Deep Learning Approaches For Sketch-Based Image Retrieval Datasets Metrics and Future Directions
23 pages
Question Answering Models For Human-Machine Interaction in The
No ratings yet
Question Answering Models For Human-Machine Interaction in The
12 pages
Infrastructures 05 00037 With Cover
No ratings yet
Infrastructures 05 00037 With Cover
15 pages
IJARP2024225304GR16 (P 54-81)
No ratings yet
IJARP2024225304GR16 (P 54-81)
28 pages
2506 13811v1
No ratings yet
2506 13811v1
32 pages
1-Advancing Civil Engineering With AI and Machine Learning From Structural Health
No ratings yet
1-Advancing Civil Engineering With AI and Machine Learning From Structural Health
36 pages
A Survey of Model Architectures in Information Retrieval
No ratings yet
A Survey of Model Architectures in Information Retrieval
26 pages
Automation in Construction
No ratings yet
Automation in Construction
26 pages
Leveraging Computer Vision and Natural Language Processing For Object Detection and Localization
No ratings yet
Leveraging Computer Vision and Natural Language Processing For Object Detection and Localization
11 pages
CONVR 2012 Proceedings
100% (1)
CONVR 2012 Proceedings
604 pages
V1N5
No ratings yet
V1N5
271 pages
J Autcon 2021 103558
No ratings yet
J Autcon 2021 103558
15 pages
Ref 5
No ratings yet
Ref 5
7 pages
Research Paper
No ratings yet
Research Paper
32 pages
Automatic Classification of Mechanical Components of Engines Using Deep Learning Techniques
No ratings yet
Automatic Classification of Mechanical Components of Engines Using Deep Learning Techniques
10 pages
Intelligent Question Answering System
No ratings yet
Intelligent Question Answering System
9 pages
Applsci 12 06382
No ratings yet
Applsci 12 06382
21 pages
1 s2.0 S0957417423031688 Main
No ratings yet
1 s2.0 S0957417423031688 Main
48 pages
Evolving Deep Architecture Generation With Residua 2
No ratings yet
Evolving Deep Architecture Generation With Residua 2
23 pages
AI Integration in Construction Safety
No ratings yet
AI Integration in Construction Safety
4 pages
Garyaev 2019 J. Phys. Conf. Ser. 1425 012121
No ratings yet
Garyaev 2019 J. Phys. Conf. Ser. 1425 012121
9 pages
Sustainability 14 01869 v2
No ratings yet
Sustainability 14 01869 v2
29 pages
Apex The Construction and Renovation Company
No ratings yet
Apex The Construction and Renovation Company
8 pages
Industrial Robots and Research Projects
No ratings yet
Industrial Robots and Research Projects
4 pages
Kor 2021 Deep Learning and Digital Twins
No ratings yet
Kor 2021 Deep Learning and Digital Twins
27 pages
Spoken Dialogue BIM Systems
No ratings yet
Spoken Dialogue BIM Systems
14 pages
英文文献原文2
No ratings yet
英文文献原文2
74 pages
DF-Cartex-SN Fermenter EN
No ratings yet
DF-Cartex-SN Fermenter EN
2 pages
BCC8002 BC8002 Advanced Mainboard: Building Technologies
No ratings yet
BCC8002 BC8002 Advanced Mainboard: Building Technologies
4 pages
AR for Human-Robot Collaboration
No ratings yet
AR for Human-Robot Collaboration
19 pages
Mechanical System Design Overview
No ratings yet
Mechanical System Design Overview
46 pages
Magic Quadrant for Content Services Platforms
No ratings yet
Magic Quadrant for Content Services Platforms
27 pages
Rotork
0% (1)
Rotork
20 pages
CN Lab Project
No ratings yet
CN Lab Project
17 pages
DDoS Attack and Mitigation Strategies Report
No ratings yet
DDoS Attack and Mitigation Strategies Report
2 pages
Sensorless Motor Control Insights
No ratings yet
Sensorless Motor Control Insights
15 pages
3 Final Session G12: Nano-Science
No ratings yet
3 Final Session G12: Nano-Science
57 pages
Data Analytics Career Profile
No ratings yet
Data Analytics Career Profile
2 pages
Setting Up Purchase Order Release Strategy
No ratings yet
Setting Up Purchase Order Release Strategy
18 pages
Learning Activity
No ratings yet
Learning Activity
2 pages
MPFI
No ratings yet
MPFI
19 pages
SmartLogger3000A EU Specs
No ratings yet
SmartLogger3000A EU Specs
1 page
Dok Fly Mra4e Mra4
No ratings yet
Dok Fly Mra4e Mra4
4 pages
India Road Logistics Industry
No ratings yet
India Road Logistics Industry
12 pages
Abebe M.mis Asigmntt
No ratings yet
Abebe M.mis Asigmntt
11 pages
Tesla Social Media Engagement Review
No ratings yet
Tesla Social Media Engagement Review
10 pages
6-Month IT Internship at Xpert Infotech
No ratings yet
6-Month IT Internship at Xpert Infotech
15 pages
Can We Teach Robots Ethics
No ratings yet
Can We Teach Robots Ethics
3 pages
Terms Every SOC - Cybersecurity Analyst Should Know
No ratings yet
Terms Every SOC - Cybersecurity Analyst Should Know
10 pages
Uae CV
No ratings yet
Uae CV
2 pages
Telecom Giants: China Mobile vs Vodafone
No ratings yet
Telecom Giants: China Mobile vs Vodafone
25 pages
VLSI Physical Design and
100% (1)
VLSI Physical Design and
23 pages
Electrical Wiring Methods Guide
100% (1)
Electrical Wiring Methods Guide
11 pages
Firetrol
No ratings yet
Firetrol
9 pages
Rama 2
No ratings yet
Rama 2
3 pages
Ug1165 Zynq Embedded Design Tutorial 1
No ratings yet
Ug1165 Zynq Embedded Design Tutorial 1
136 pages
Sonu Kumar Final Updated Form
No ratings yet
Sonu Kumar Final Updated Form
12 pages

BIM-GPT: Virtual Assistant for BIM IR

Uploaded by

BIM-GPT: Virtual Assistant for BIM IR

Uploaded by

BIM-GPT: a Prompt-Based Virtual Assistant

Framework for BIM Information Retrieval

Junwen Zhenga, *, Martin Fischerb

2.1 Challenges of BIM Information Retrieval

2.2 Natural Language-based Approaches

One notable example of ML methods is Bidirectional Encoder Representations from

2.3 New Opportunities: Generative Pre-trained Transformer Models

2.4 Summary of Gaps in Existing Natural Language-based Approaches to BIM IR

Table 1. Current NL-based Approaches for BIM IR with Limitations

Natural language Natural language Knowledge source for Limita

AI-based voice assistant for Machine learning:

QA system for BIM Traditional NLP: Syntactic

BIM automatic speech Traditional NLP: syntactic

Intelligent building information Traditional NLP: syntactic

A NL-based approach to data Traditional NLP: syntactic Not available

Figure 1. Overview of BIM-GPT Framework

3.1 Natural Language Processing Module

Fig. 2. Prompt: Aligning GPT Models with BIM IR

Fig. 3. Dynamic Prompt Template and Its Use Cases

3.1.1 Prompt Library

Table 2. Goals and Dynamic Nature of Prompts

3.1.2 Prompt Manager

Figure 5. Prompt Manager: Converting NL Query into NL Answer

3. Validation of NLP for BIM-GPT

Table 3. Examples of Additional Annotations for NL Queries in BINLQ [15]

Table 4. Example Preparation for Few-shot Scenarios

Use Case Sampling Methods

Intent Randomly sampled 2% NL queries for each TC label in

Randomly sampled 2% NL queries for each parameter of

Randomly sampled 2% NL queries for each distinct

To investigate the impact of different prompt components on task performance, we conducted

Table 5. Accuracy Rate of NLU Tasks

Table 6. Ablation Study Results

Although our framework's results demonstrate promise, several limitations need to be

4. Validation of BIM-GPT Framework

Figure 12. Implementation Process for BIM-GPT

5.1.1 Data Management Module

Figure 13. Data Preprocessing

5.1.2 Natural Language Processing Module

5.1.3 User Interface Module

To validate the functionality of the framework, we developed a VA prototype for a hospital

You might also like