
Privacy First RAG: Closed-Loop LLMs for Industrial Data Security

Dr. Bhawna Aggarwal¹, Chaitanya Batra¹, Vaishali Sharma¹ and Priyanshu Khanda¹
¹Netaji Subhas University of Technology

Abstract. Large Language Models (LLMs) have proven their exceptional
capabilities in natural language processing tasks such as document
understanding and question answering. However, their answers may lack
relevant, domain-specific information, limiting their ability to handle
context-specific industrial data [1]. While pre-trained neural models serve as
indirect knowledge bases, they face challenges in expanding their memory,
ensuring interpretability, and generating factually accurate responses. Hybrid
models such as Retrieval-Augmented Generation (RAG) mitigate these issues by
integrating retrieval-based components, allowing knowledge to be updated with
the most relevant information and producing more interpretable outputs [2].
Despite these advantages, standard RAG systems typically rely on cloud-based
infrastructure, threatening the data security and privacy of organizations that
handle confidential and sensitive data [3]. To reduce the risk of data leakage,
this article proposes a closed-loop RAG architecture designed specifically for
small and medium-sized businesses (SMEs). It optimizes document generation and
retrieval within a secure, on-premises infrastructure. Our system targets
industrial documents in particular, improving operational efficiency while
protecting data confidentiality. For enterprises that value data sovereignty, a
private RAG infrastructure is a strong substitute for cloud-based LLMs,
successfully striking a balance between data security and sophisticated NLP
capabilities.
Keywords: Artificial Intelligence, Retrieval Augmented Generation, Text
Analysis

1. Introduction
The amount of technical documentation in industries such as manufacturing, engineering, and
compliance has grown exponentially as a result of the rapid digitization of industrial processes.
Advanced information retrieval systems have therefore become essential for businesses to preserve
data integrity and guarantee easy access to vital information [4]. Even though traditional Large
Language Models have shown impressive natural language processing capabilities, they rely heavily
on the cloud, which presents serious privacy and data security issues in industrial settings.
Retrieval-Augmented Generation (RAG), which combines accurate data retrieval with large language
models to produce precise, contextually apt responses, has emerged as a promising method to
address these issues [4]. Existing RAG systems, however, frequently depend on external servers,
which puts data confidentiality at risk [5]. In this paper, we propose a RAG model specifically
tailored for small and medium enterprises with domain-specific knowledge bases. It allows for the
safe on-premises processing of industrial documents while preserving the efficiency of retrieval-
augmented language generation.
1.1 Retrieval-Augmented Generation (RAG): An Overview
Retrieval-Augmented Generation (RAG) is an AI framework that combines generative language
models with neural retrieval techniques to overcome the drawbacks of conventional Large Language
Models (LLMs). Unlike traditional LLMs, which continue to rely on the dataset they were trained on
as a static knowledge base, RAG dynamically incorporates external information to produce responses
that are more precise, contextually relevant, and up to date. This architecture works efficiently when
highly specialized, real-time, or domain-specific information is needed.

1.2 Core Components of RAG


RAG consists of a neural retriever, a generator, and a knowledge base. Together, these elements
improve the precision, consistency, and contextual relevance of the generated outputs.
1.2.1 Retriever: When the user provides an input query, the retriever acts as an intelligent
search engine, embedding the query into a query vector and then using advanced information
retrieval techniques, including semantic search and dense vector representations, to find the
most contextually appropriate data. Techniques such as Maximum Inner Product Search (MIPS)
and k-nearest neighbors (k-NN) are employed to match high-dimensional vectors efficiently.
The retriever’s primary functions include [1]:
○ Searching for Relevant Data: Scans the knowledge base to find data related to the
input query, including private documents, reports, or FAQs.
○ Filtering Irrelevant Information: Ensures that only the most pertinent data (say,
the top K relevant document embeddings) is sent to the generator, reducing noise
and enhancing output relevance.
○ Providing Context: The retrieved information is forwarded to the generator,
allowing it to create responses grounded in factual data rather than purely model-
generated content.
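As an illustration of the retrieval step, the following sketch performs exact cosine k-NN over bag-of-words vectors in pure Python. The `embed` function and the toy corpus are illustrative stand-ins only: a deployed retriever would use learned sentence-transformer embeddings and an approximate-search index such as FAISS, as described later in the paper.

```python
import math

def embed(text, vocab):
    # Bag-of-words vector over a fixed vocabulary: a deterministic
    # stand-in for a learned dense encoder such as a sentence-transformer.
    tokens = text.lower().split()
    vec = [float(tokens.count(term)) for term in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def retrieve_top_k(query, documents, k=2):
    # Brute-force maximum inner product search over normalized vectors,
    # i.e. exact cosine k-NN; libraries such as FAISS accelerate this step.
    vocab = sorted({t for d in documents for t in d.lower().split()})
    q = embed(query, vocab)
    scored = [(sum(a * b for a, b in zip(q, embed(d, vocab))), d)
              for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

docs = [
    "hardware assets include servers and industrial controllers",
    "the cafeteria menu changes every monday",
    "asset registers track hardware ownership and location",
]
print(retrieve_top_k("what are hardware assets", docs, k=2))
```

The two documents mentioning hardware assets are ranked above the unrelated one, which is exactly the filtering behavior described in the bullets above.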
1.2.2 Generator: The generator acts as the response composer, leveraging the retrieved data
to craft a coherent and accurate answer. It takes the input query and the selected documents to
produce contextually enriched responses, often using techniques like attention mechanisms,
content selection, and copying to combine retrieved knowledge with the model's inherent
language generation capabilities.[6]
○ Utilizing Retrieved Data: Incorporates the filtered information from the retriever to
answer the query accurately.
○ Producing Clear and Contextual Outputs: Merges the retrieved information with
pre-trained language patterns to generate a well-structured response.
○ Maintaining Focus: The generator ensures that the output stays relevant to the user’s
question by adhering strictly to the retrieved context.
1.2.3 Knowledge Base: The knowledge base serves as the repository from which relevant
documents are retrieved. It can contain structured or unstructured data, including books,
articles, web pages, databases, or proprietary documents. Efficient indexing techniques, such
as FAISS, enable fast similarity searches, while advanced processing methods (like using
PDF parsers for multi-column layouts and tables) ensure comprehensive data coverage.[7]

1.3 RAG Workflow


During inference, RAG follows these steps:
● Query Handling: The user’s input is passed to the retriever, which searches the knowledge
base and returns a ranked list of relevant documents.
● Document Selection: The top-k documents are selected and combined with the original
query.
● Response Generation: The generator uses this context to formulate a detailed answer,
conditioned on the retrieved knowledge.
● Final Output: The generated response, often accompanied by citations, is presented to the
user, indicating the source of the information used.
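The four steps above can be sketched as a minimal pipeline. The term-overlap `retrieve` and the stubbed `generate` are deliberate simplifications standing in for the dense retriever and the LLM call; only the orchestration mirrors the workflow described in the text.

```python
def retrieve(query, knowledge_base, k=2):
    # Rank documents by naive term overlap with the query; a real
    # system would use the dense retriever of Section 1.2.1.
    q_terms = set(query.lower().split())
    ranked = sorted(knowledge_base,
                    key=lambda d: len(q_terms & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def generate(prompt):
    # Stub generator: a deployed system would call an LLM here.
    return "Answer conditioned on: " + prompt

def rag_answer(query, knowledge_base, k=2):
    # Steps 1-4: query handling, top-k selection, generation, cited output.
    context = retrieve(query, knowledge_base, k)
    prompt = "Context: " + " | ".join(context) + "\nQuestion: " + query
    answer = generate(prompt)
    citations = ["[{}] {}".format(i + 1, doc) for i, doc in enumerate(context)]
    return answer, citations

kb = ["pumps require monthly inspection",
      "holiday schedule published in january",
      "inspection reports are archived for five years"]
answer, cites = rag_answer("when are pumps due for inspection", kb)
print(answer)
print(cites)
```

Note how the final output carries one numbered citation per retrieved document, matching the "Final Output" step above.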

1.4 Advantages of RAG


RAG offers several distinct advantages over traditional LLMs:
➢ Up-to-date Information: Unlike static LLMs, RAG incorporates external data sources,
guaranteeing that answers are founded on the most recent data.
➢ Reduced Hallucinations: By grounding responses in retrieved content, RAG minimizes the
risk of generating unsupported or incorrect information.
➢ No Re-Training Cost: Whenever an LLM’s knowledge must be updated, the model
needs to be retrained on a recent dataset. In RAG, however, the new data is simply
added to the current knowledge base, and the model retrieves updated information without
being retrained.
➢ Specialized Knowledge: Any domain-specific database, such as one for healthcare, law
firms, or the automotive, chemical, or mechanical industries, can be provided to the
retriever, making RAG easy to integrate across industries with different specializations.
➢ Handling Large Documents: Efficiently extracts relevant information from lengthy texts,
focusing on essential content.
➢ Web Access: Can be configured to query external APIs or databases, maintaining access to
the most current data.[8], [9]

1.5 Implementation Example: Enhanced RAG Architecture


The Langchain framework serves as the basis for an example of a sophisticated RAG architecture
with features such as:
➢ Enhanced PDF Parsing: Uses libraries like Tabula and PDFMiner for parsing complex
documents.
➢ Advanced Retrieval Models: Integrates BM25 and BGE reranker models to improve
retrieval accuracy.
➢ AgenticRAG: Incorporates tokens for self-reflection, follow-up questions, and function-
calling logic for iterative answer refinement.
➢ Evaluation: Tested on proprietary automotive datasets and public benchmarks (QReCC,
CoQA) using the RAGAS metric suite.[10], [11]

2. Difficulties in Applying Current RAG Models in Industrial Settings
Besides the technical difficulty of building precise systems, using RAG in organizational settings
poses particular challenges, especially in compliance-regulated sectors such as healthcare, banking,
and the automotive industry. Strict legal frameworks like GDPR, HIPAA, and FINRA require these
businesses to adhere to stringent data privacy and security measures. To prevent unintentional data
exposure, these industries must integrate strong access restrictions, data anonymization methods,
and thorough auditing procedures [12], [13]. For example, to avoid serious legal and financial
consequences, healthcare organizations that handle patient records must ensure that RAG systems
cannot unintentionally retrieve or generate sensitive personal data.
The automotive industry handles extremely sensitive private data, such as design schematics,
manufacturing procedures, and engineering specifications, despite not being subject to the same
legal frameworks as the healthcare or financial sectors. Document complexity is another significant
issue in this field, since technical publications can have complicated layouts, multi-column
formats, and specialist terminology that make it difficult to retrieve information effectively. [14]
RAG implementation is further limited by the requirement for quick, on-premises processing in
resource-constrained industrial environments, which calls for systems that are both computationally
efficient and able to provide precise, contextually relevant answers instantly.
Ensuring openness and interpretability in RAG outcomes is crucial in legally sensitive fields.
Enterprise RAG systems are required to give explicit citations and attribution for generated material,
describing the precise documents or text passages that influenced each response, in contrast to
consumer-facing applications where approximations may be acceptable [3], [14], and [15]. This level
of traceability not only enhances trust in the system’s outputs but also aligns with compliance
requirements, enabling organizations to validate the integrity of AI-generated responses.

3. Related Works
3.1 Optimizing RAG Techniques for Automotive Industry PDF Chatbots
The authors propose an enhanced RAG architecture using Langchain, featuring advanced PDF parsing
with PDFMiner and Tabula, and BM25 and BGE reranker models for improved retrieval. They
introduce AgenticRAG, a self-reflecting agent with function-calling logic for real-time answer
refinement. The system is evaluated on automotive and public datasets (QReCC, CoQA) using the
RAGAS metric suite.
However, the paper lacks performance benchmarks for deployment on low-power hardware like
industrial PCs or tablets, and it doesn’t support processing technical diagrams, wiring schematics, or
CAD blueprints essential for automotive documentation. Future research could explore multimodal
RAG systems that handle both text and image data.
3.2 Integration of LLM for Real-Time Troubleshooting in Industrial
Environments based on RAG
This paper presents a RAG-based framework that integrates LLMs with real-time sensor data to
provide automated, domain-specific troubleshooting in complex industrial environments. It
addresses challenges such as outdated documentation, data overload, and reliance on expert
knowledge by combining a retrieval engine with generative models to produce context-aware,
actionable solutions.
However, there is minimal analysis of how technicians interact with the RAG system in real
operational settings. Integration with real-time data is described only in general terms, without
depth on how streaming-data variability and quality are handled.
3.3 Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial
Applications
This study explores how incorporating images and text into Retrieval-Augmented Generation (RAG)
systems impacts performance in industrial settings. The authors evaluate multimodal RAG
pipelines using two LLMs (GPT-4V, LLaVA) and two image-processing methods: multimodal
embeddings and image-to-text summaries. Results show multimodal RAG outperforms text-only,
with image summaries proving more effective than direct embeddings.
Image-only RAG remains weaker than text-based or multimodal RAG, indicating a bottleneck in
image relevance and retrieval quality.
3.4 RAG based Question-Answering for Contextual Response Prediction System
This paper introduces FAIR, a factuality-aware, instruction-tuned RAG framework that enhances
both the factual accuracy and response faithfulness of LLMs. It does so by rephrasing instructions,
generating high-quality demonstrations, and using factuality-aware reward models for alignment.
FAIR achieves state-of-the-art results on five datasets, particularly improving factual QA
performance while maintaining language fluency.
3.5 Different Retrieval Techniques Performance Evaluation
This paper provides a structured review of performance evaluation methods for Information
Retrieval (IR) systems, categorizing them into non-graphical techniques (Precision, Recall, F1-
score, MAP) and graphical techniques (PR Curve, ROC Curve, AUC, nDCG). It discusses both
binary judgment measures and graded relevance measures, with illustrative examples and
formulas for calculating key metrics.
Focuses on classical IR metrics without covering modern neural IR methods or deep learning-
based RAG systems. No practical evaluation across real-world systems like web search, enterprise
search, or domain-specific IR (e.g., medical, legal).

4. Meeting Enterprise Requirements through RAG Implementation
To effectively address the stringent requirements of enterprise environments, our proposed RAG
infrastructure is designed with the following key components and functionalities:
● Data Security and Compliance:
○ Fully on-premises deployment to mitigate risks associated with cloud-based data
storage.
○ Comprehensive access controls to regulate data access based on user roles and
permissions.
○ Implementation of data anonymization techniques to prevent exposure of sensitive
information, aligning with compliance standards such as HIPAA, GDPR, and SOC2.
○ Detailed audit logs that document all data access and modification activities, ensuring
transparency and accountability.
● Accurate and Explainable RAG:
○ Use of sophisticated semantic search engines, with FAISS integrated for quick and
effective similarity searches.
○ Sentence-transformer embeddings provide precise document retrieval by offering
strong, context-aware representations.
○ Generation outputs include clear citations that link each response to specific
document sections or data points, improving interpretability and lowering the
risk of misinformation.
○ A hybrid query technique combines keyword-based and semantic-based search to
maximize both recall and precision in document retrieval.
● Seamless Integration and Scalability:
○ Databases, ERP systems, and CRM platforms are just a few examples of the
corporate systems that can be easily integrated using a modular architecture and
clearly defined APIs.
○ Scalable infrastructure guarantees prompt answers in real-time industrial situations by
managing big, diverse document corpora with negligible latency.
○ Popular corporate systems have pre-built interfaces that speed up deployment and
lower integration costs.
● Optimized Resource Utilization:
○ Model architectures that are computationally efficient and designed to function
effectively with common industrial hardware lessen reliance on expensive GPUs or
cloud-based resources.
○ Adaptive query processing dynamically modifies the breadth and depth of the
search according to the query’s complexity and the available processing power.
● Transparency and Interpretability:
○ All the generated outputs have inline citations connecting responses to the appropriate
text parts and are properly attributed to the relevant data sources.
○ Users can verify generated information and examine retrieved data points, increasing
their confidence in the system's results.
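The hybrid query technique mentioned in the bullets above can be sketched as a weighted score fusion of a keyword channel and a semantic channel. The bag-of-words channels below and the fusion weight `alpha` are illustrative assumptions, not the paper's tuned configuration: in practice the keyword channel would be BM25 and the semantic channel a sentence-transformer similarity.

```python
import math

def keyword_score(query, doc):
    # Keyword channel: fraction of query terms present in the document
    # (a simplified stand-in for BM25).
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / (len(q) or 1)

def semantic_score(query, doc, vocab):
    # Semantic channel: cosine similarity of bag-of-words vectors,
    # standing in for dense embedding similarity.
    def vec(text):
        toks = text.lower().split()
        return [toks.count(t) for t in vocab]
    a, b = vec(query), vec(doc)
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def hybrid_rank(query, documents, alpha=0.5):
    # alpha is an assumed fusion weight; a real deployment would tune it
    # on held-out queries to balance recall against precision.
    vocab = sorted({t for d in documents for t in d.lower().split()})
    scored = [(alpha * keyword_score(query, d)
               + (1 - alpha) * semantic_score(query, d, vocab), d)
              for d in documents]
    return [d for _, d in sorted(scored, reverse=True)]

docs = ["gearbox maintenance interval is six months",
        "vacation policy for new employees"]
print(hybrid_rank("gearbox maintenance schedule", docs)[0])
```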

4.1 Code Implementation and System Architecture


Our RAG infrastructure combines the following essential elements to offer robust, domain-specific
information retrieval and query-answering capabilities especially suited to industrial documents:
● PDF Text Extraction:
○ The extract_text_from_pdf function uses the PyPDF2 library to extract text
content from PDF files. After each page is processed, the content is combined into
a single text string for further analysis, making it easy to convert
industrial documentation into searchable textual information.
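A minimal sketch of the described function, assuming PyPDF2's `PdfReader` API; the import is guarded so the sketch still loads where the library is absent. Layout-aware parsers (PDFMiner, Tabula) would replace this step for multi-column documents and tables.

```python
try:
    from PyPDF2 import PdfReader   # dependency named in the text
except ImportError:                # let the sketch load without it
    PdfReader = None

def extract_text_from_pdf(path):
    """Concatenate the text of every page into one searchable string,
    as described above."""
    if PdfReader is None:
        raise RuntimeError("PyPDF2 is not installed")
    reader = PdfReader(path)
    # extract_text() can return None for image-only pages; fall back to "".
    pages = [page.extract_text() or "" for page in reader.pages]
    return "\n".join(pages)
```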
● Text Embedding and Similarity Search:
○ The system uses the SentenceTransformer model (all-mpnet-base-v2) to transform the
user queries and extracted text into embeddings, which are high-dimensional vectors.
FAISS is used to index these embeddings, which makes it easier to conduct effective
similarity searches to find pertinent document segments.
● Context-Based Question Answering:
○ The main question-answering interface is provided by the Custom() method. After
using FAISS to obtain the top-k relevant document segments, it creates a contextual
prompt and sends it to the Hugging Face API to generate a response. Keeping the
generated replies closely tied to the retrieved context reduces the possibility of
hallucinations and improves output accuracy.
● Citations for Enhanced Interpretability:
○ Each obtained document segment is linked to a citation index by the model, which
then produces inline citations. These citations provide transparency by enabling users
to track each generated response's origin back to the relevant document segment.
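The prompt-assembly step behind this citation mechanism might look as follows. The prompt wording and the `build_cited_prompt` helper are hypothetical illustrations, not the paper's exact implementation, and the Hugging Face API call that would consume the prompt is omitted.

```python
def build_cited_prompt(query, segments):
    # Number each retrieved segment so the model can emit inline
    # citations like [1], [2] that map back to source passages.
    context = "\n".join("[{}] {}".format(i + 1, seg)
                        for i, seg in enumerate(segments))
    return ("Answer the question using ONLY the numbered context below, "
            "and cite the segment number after each claim.\n\n"
            "Context:\n" + context + "\n\nQuestion: " + query + "\nAnswer:")

segments = ["Hardware assets are physical IT components such as servers.",
            "Asset registers record ownership and location."]
prompt = build_cited_prompt("What are hardware assets?", segments)
print(prompt)
```

Because each segment carries a stable index, any `[n]` marker in the generated answer can be resolved back to the exact source passage, which is what enables the traceability described above.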
● Flask API and ngrok Deployment:
○ A Flask web application is used to deploy the system, with endpoints that
respond to user inquiries and return generated answers with citations. Ngrok
exposes the application for external access, enabling safe and controlled
testing.
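A minimal sketch of such an endpoint, assuming Flask; `answer_query` is a hypothetical placeholder for the full retrieval pipeline, and the import is guarded so the sketch loads even without Flask installed.

```python
try:
    from flask import Flask, jsonify, request
except ImportError:          # keep the sketch importable without Flask
    Flask = None

if Flask is not None:
    app = Flask(__name__)

    @app.route("/ask", methods=["POST"])
    def ask():
        # Accept a JSON query and return an answer with citations.
        query = request.get_json().get("query", "")
        answer, citations = answer_query(query)
        return jsonify({"answer": answer, "citations": citations})

def answer_query(query):
    # Placeholder standing in for the RAG pipeline described above.
    return "stub answer for: " + query, []
```

In the deployed system this app would be exposed through an ngrok tunnel for controlled external testing.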
● Compliance and Security Considerations:
○ The system is configured to run entirely on-premises, ensuring that sensitive
industrial data remains within the organization’s infrastructure. Furthermore, access
restrictions and logging systems are implemented to track data interactions and
maintain compliance with data protection laws such as GDPR and HIPAA.
Through the implementation of these elements, our RAG system satisfies the fundamental
document-processing needs of small and medium-sized enterprises (SMEs), offering precise,
context-driven replies while abiding by strict security and privacy guidelines.

5. Results
5.1 Data Preparation: Constructing a Comprehensive Dataset for RAG Evaluation
A well-structured assessment dataset is necessary in order to evaluate the RAG architecture's
performance. Key components of the dataset include the following:
Industry-Specific Queries:
○ A wide range of queries that arise frequently in real-world settings within
specific automotive-industry domains.
Validated Responses:
○ Accurate, contextually relevant answers to each query, which serve as a benchmark to
the RAG system’s generated responses.
Knowledge Base Documents:
○ A curated set of enterprise documents, categorized into:
■ Structured PDFs containing well-organized data such as tables, graphs, and
formatted text to assess extraction accuracy.
■ Unstructured or noisy PDFs with irrelevant or disorganized content to
evaluate the system’s resilience in filtering and retrieving relevant
information.
5.2 Accuracy on Legitimate Data
The proposed RAG system produced precise, context-aware replies when tested on
domain-specific PDF documents containing authentic, structured data. For example, user inquiries
such as "What are hardware assets?" yielded definitions that were factually accurate and included
citations to appropriate sources. This demonstrated:
➢ High retrieval precision via FAISS-based similarity search.
➢ Contextually grounded generation using Hugging Face’s inference API.
➢ Clear and verifiable citations that improve trust and traceability.
Observation: The system showed consistent performance when documents were structured, with
minimal hallucinations and high semantic relevance.

5.3 Robustness to Noisy or Custom Inputs


To test generalizability, the system was evaluated on PDFs containing gibberish or loosely structured
content. Despite data inconsistencies:
➢ The retriever isolated semantically useful fragments.
➢ The generator maintained answer coherence using contextual reinforcement.
➢ The system minimized hallucinations, only responding when a minimal signal was detected in
the noise.
Observation: While accuracy slightly declined, the model maintained robustness and avoided
nonsensical completions, suggesting resilience to unstructured enterprise data.
5.4 Behavior with Absent or Missing Context
In cases where no relevant information existed in the provided document (e.g., a query unrelated to
any indexed content), the system:
➢ Avoided fabrication.
➢ Responded with fallback reasoning based on its prior training (e.g., linking hardware assets to
IT asset management).
➢ Indicated the absence of evidence implicitly, demonstrating responsible generalization.
Observation: The model did not hallucinate specifics and performed acceptably under zero-context
conditions by offering generic but plausible interpretations.

5.5 Performance and Efficiency


1. Setup Time:
➢ Initial system setup time was measured at ~40.4 seconds, independent of document length.
➢ Load time scaled linearly with sentence volume, indicating stable preprocessing efficiency.
The corresponding graph (omitted in this copy) illustrates the relationship between system load
time (in seconds) and the number of processed sentences, assuming ten sentences per page. The
experimental data points highlight the system's setup time and the time required to process
varying numbers of sentences. The setup time of 40.4 seconds is the baseline system
initialization time, independent of the number of sentences. As the number of sentences
increases, the load time grows linearly, indicating a proportional relationship between load time
and input size. This linearity is modeled by equation (1), giving an accurate representation of
the computational cost of scaling the input size and insight into the system's performance and
scalability when handling large document volumes.

(1)
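The explicit form of equation (1) is not given here; from the surrounding description (a fixed 40.4 s setup baseline plus a term linear in the sentence count) it plausibly has the form below, where the per-sentence cost c is not recoverable from the text and is left symbolic:

```latex
T_{\text{load}}(n) \;=\; 40.4 \;+\; c \, n \quad \text{(seconds)},
```

with n the number of processed sentences and c the constant per-sentence preprocessing cost implied by the reported linear trend.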
2. Runtime:
Two linear performance models were derived:
➢ Equation (2) (Upper Bound): Captures peak load conditions.
➢ Equation (3) (Baseline): Reflects typical runtime efficiency.

The corresponding graph (omitted in this copy) demonstrates the relationship between processing
time (in milliseconds) and the number of sentences processed by the system. Two linear equations
model distinct scenarios for performance evaluation:
1. Equation (2)
2. Equation (3)

These models revealed that processing time scaled predictably with input size, confirming the
system's suitability for large-scale enterprise document processing.

6. Conclusion
In this study, we addressed the creation of a reliable Retrieval-Augmented Generation (RAG) system
designed for enterprise settings (SMEs), with a focus on domain-specific information retrieval, data
security, and regulatory compliance. Our approach solves the main issues that companies
encounter when handling complex, unstructured data by combining PDF processing, semantic search
with FAISS, and context-based question answering with open-source Hugging Face LLMs.
The implementation of inline citations enhances transparency and interpretability, critical in legally
sensitive domains. Additionally, our comprehensive dataset, comprising structured and unstructured
PDFs, serves as a realistic benchmark for evaluating the system’s accuracy and robustness in real-
world industrial applications. Future work will focus on extending the system to handle multimodal
data, such as technical diagrams and CAD files, and optimizing performance on low-power hardware
to further align with enterprise deployment requirements. This approach not only ensures accurate and
contextually relevant responses but also establishes a framework for secure, scalable RAG systems
capable of meeting the stringent demands of modern enterprises.

References
[1] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W. Yih, T.
Rocktäschel, S. Riedel, and D. Kiela, "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,"
*Advances in Neural Information Processing Systems (NeurIPS)*, vol. 33, 2020. [Online]. Available:
https://arxiv.org/abs/2005.11401

[2] H. Qian, Y. Fan, R. Zhang, and J. Guo, "On the Capacity of Citation Generation by Large Language
Models," *arXiv preprint*, arXiv:2402.06183, 2024. [Online]. Available: https://arxiv.org/abs/2402.06183

[3] J. Wang, J. Fu, R. Wang, L. Song, and J. Bian, "PIKE-RAG: sPecIalized KnowledgE and Rationale
Augmented Generation," *arXiv preprint*, arXiv:2312.08413, 2023. [Online]. Available:
https://arxiv.org/abs/2312.08413

[4] S. Soman and S. Roychowdhury, "Observations on Building RAG Systems for Technical Documents,"
*arXiv preprint*, arXiv:2311.15264, 2023. [Online]. Available: https://arxiv.org/abs/2311.15264

[5] S. Zeng, J. Zhang, P. He, Y. Xing, Y. Liu, H. Xu, J. Ren, S. Wang, D. Yin, Y. Chang, and J. Tang, "The
Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)," in *Proceedings of
the 2024 ACM Web Conference (WWW)*, Singapore, May 2024. [Online]. Available:
https://arxiv.org/abs/2402.16985
[6] A. Asai, Z. Wu, Y. Wang, A. Sil, and H. Hajishirzi, "SELF-RAG: Learning to Retrieve, Generate, and
Critique through Self-Reflection," *arXiv preprint*, arXiv:2310.11511, 2023. [Online]. Available:
https://arxiv.org/abs/2310.11511

[7] R. Gupta and C. V. Jawahar, "Information Retrieval from the Digitized Books," in *Proceedings of the 12th
International Conference on Document Analysis and Recognition (ICDAR)*, Washington, DC, USA, 2013, pp.
1005–1009. [Online]. Available: https://ieeexplore.ieee.org/document/6628859

[8] C. Niu, Y. Wu, J. Zhu, S. Xu, K. Shum, R. Zhong, J. Song, and T. Zhang, "RAGTruth: A Hallucination
Corpus for Developing Trustworthy Retrieval-Augmented Language Models," *arXiv preprint*,
arXiv:2401.00396, 2024. [Online]. Available: https://arxiv.org/abs/2401.00396

[9] C. D. Manning, P. Raghavan, and H. Schütze, *Introduction to Information Retrieval*. Cambridge
University Press, 2008.

[10] C.-Y. Chang, Z. Jiang, V. Rakesh, M. Pan, C.-C. M. Yeh, G. Wang, M. Hu, Z. Xu, Y. Zheng, M. Das, and
N. Zou, "MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation," *arXiv preprint*,
arXiv:2403.08039, 2024. [Online]. Available: https://arxiv.org/abs/2403.08039

[11] F. Liu, Z. Kang, and X. Han, "Optimizing RAG Techniques for Automotive Industry PDF Chatbots: A
Case Study with Locally Deployed Ollama Models," *arXiv preprint*, arXiv:2408.05933, 2024. [Online].
Available: https://arxiv.org/abs/2408.05933

[12] A. Narimani and S. Klarmann, "Integration of Large Language Models for Real-Time Troubleshooting in
Industrial Environments Based on Retrieval-Augmented Generation," *Energies*, vol. 17, no. 1, pp. 1–19,
2024. [Online]. Available: https://doi.org/10.46254/EU07.20240085

[13] F. Liu, Z. Kang, and X. Han, "Optimizing RAG Techniques for Automotive Industry PDF Chatbots: A
Case Study with Locally Deployed Ollama Models," *China Automotive Technology & Research Center*,
Tech. Rep., 2024.

[14] T. Bruckhaus, "RAG Does Not Work for Enterprises," *Strative.ai*, White Paper, 2024. [Online].
Available: https://www.strative.ai/blog/rag-does-not-work-for-enterprises

[15] G. Byun, S. Lee, N. Choi, and J. D. Choi, "Secure Multifaceted-RAG for Enterprise: Hybrid Knowledge
Retrieval with Security Filtering," *arXiv preprint*, arXiv:2404.08957, 2024. [Online]. Available:
https://arxiv.org/abs/2404.08957
