WenyanGPT: Classical Chinese NLP Model

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI-25)

Abstract

Classical Chinese, as the core carrier of Chinese culture, plays a crucial role in the inheritance and study of ancient literature. However, existing natural language processing models primarily optimize for Modern Chinese, resulting in inadequate performance on Classical Chinese. This paper presents a comprehensive solution for Classical Chinese language processing. By continuing pre-training and instruction fine-tuning on the LLaMA3-8B-Chinese model, we construct a large language model, WenyanGPT, which is specifically designed for Classical Chinese tasks. Additionally, we develop an evaluation benchmark dataset, WenyanBENCH. Experimental results on WenyanBENCH demonstrate that WenyanGPT significantly outperforms current advanced LLMs in various Classical Chinese tasks. We make the model's training data, instruction fine-tuning data, and evaluation benchmark dataset publicly available to promote further research and development in the field of Classical Chinese processing.

Punctuation. Question: Please add punctuation marks to this ancient Chinese text: 汝忆否四五年前某夕吾尝语曰与使吾先死也无宁汝先吾而死
Answer: 汝忆否?四五年前某夕,吾尝语曰:“与使吾先死也,无宁汝先吾而死。”

NER. Question: Identify entities such as characters, place, time, and official positions in ancient Chinese texts: 二十三年,罢宣慰司,立辽阳行省,以亦力撒合为参知政事。
Answer: 'characters': ['亦力撒合'], 'place': [], 'time': ['二十三年'], 'official positions': ['宣慰司', '参知政事']

Translation. Question: Translate this ancient Chinese text into modern Chinese: 古之学者必有师。
Answer: 古代求学的人必定有老师。

Figure 1: Examples of tasks from WenyanGPT. The model demonstrates advanced knowledge in Classical Chinese and shows strong performance in both Classical Chinese understanding and generation tasks.
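The NER answers in Figure 1 use a quasi-dictionary format with single-quoted keys. As a small illustrative sketch (our own code, not tooling released with the paper), such an answer string can be parsed into a structured dict for downstream scoring:

```python
import ast

def parse_ner_answer(answer: str) -> dict:
    """Parse an NER answer such as "'characters': ['亦力撒合'], 'place': []"
    into a dict by wrapping it in braces and evaluating it as a Python literal."""
    return ast.literal_eval("{" + answer + "}")

# The answer string from the NER example in Figure 1.
answer = ("'characters': ['亦力撒合'], 'place': [], "
          "'time': ['二十三年'], 'official positions': ['宣慰司', '参知政事']")
entities = parse_ner_answer(answer)
print(entities["official positions"])  # ['宣慰司', '参知政事']
```

`ast.literal_eval` accepts only Python literals, so it safely handles this quasi-JSON with single quotes, which `json.loads` would reject.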
1 Introduction

Classical Chinese is an important component of Chinese culture, with a long history and profound cultural heritage. It is essential for understanding traditional Chinese culture. With the rapid development of artificial intelligence technology, intelligent processing of ancient Chinese texts offers a new solution for the preservation and inheritance of Classical Chinese. Modern technologies, such as digitalization and natural language processing (NLP), can efficiently preserve and spread traditional culture while fostering deeper and more innovative academic research. Maximizing the potential of AI in processing Classical Chinese texts has become an urgent need for cultural inheritance and academic development.

Early research in Classical Chinese language processing focused on tasks such as punctuation, word segmentation, part-of-speech tagging (POS), named entity recognition (NER), and translation. These tasks initially relied on traditional machine learning methods, such as Hidden Markov Models (HMMs) for POS [Huang et al., 2002] and Conditional Random Fields (CRFs) for punctuation [Huang et al., 2010] and NER [Yuan et al., 2019; Li, 2018]. In deep learning, RNNs, LSTMs, GRUs, and attention mechanisms have been applied to various tasks, including couplet generation and classical poetry generation [Yan et al., 2016; Yi et al., 2017], as well as punctuation and POS using BiLSTM-CRF models [Wang et al., 2019; Cheng et al., 2020; Zhang et al., 2023b; Chang et al., 2024]. As Transformer architectures emerged [Vaswani et al., 2017], studies began using large-scale parallel corpora to train models for translating Classical Chinese into Modern Chinese [Liu et al., 2018] and for generating classical poetry [Huang et al., 2020]. The introduction of pre-trained models including BERT [Kenton and Toutanova, 2019] and GPT [Radford and Narasimhan, 2018] provided new opportunities for intelligent Classical Chinese processing. Some research integrated ancient Chinese texts into the training data of general pre-trained models, improving the processing performance of Classical Chinese compared to typical pre-trained models [Tian et al., 2020; Wang et al., 2022; Wang et al., 2023a; Liu et al., 2023a]. Other studies used Classical Chinese corpora to continue pre-training and fine-tuning large language models, aiming to build conversational models for Classical Chinese [Zhang et al., 2024; Yang et al., 2024b; Cao et al., 2023; Cao et al., 2024].

However, challenges remain in Classical Chinese processing. Different tasks require training specialized models, and no effective universal model exists. Additionally, there is a lack of standardized evaluation benchmarks in this field; existing evaluation tasks, datasets, and metrics are inconsistent, making it difficult to perform cross-task comparisons and systematic assessments of model performance.

To address these issues, we propose WenyanGPT, a large language model for Classical Chinese. Some examples of WenyanGPT are shown in Figure 1. We also construct the largest pre-training corpus for continued pre-training, enhancing the model's domain adaptability. Additionally, we propose a framework for generating domain-specific instruction data for supervised fine-tuning in the development of WenyanGPT. To promote research in the intelligent processing of Classical Chinese, we build the WenyanBENCH evaluation dataset and conduct extensive experiments for detailed analysis. The main contributions are as follows:

• We propose WenyanGPT, a large language model focused on Classical Chinese. It demonstrates superior performance and wide applicability in tasks such as punctuation, POS, and translation.

• We release pre-training and instruction fine-tuning datasets, along with a novel method for constructing domain-specific fine-tuning data, providing valuable resources for future research.

• We introduce WenyanBENCH, an evaluation benchmark for Classical Chinese tasks, with extensive experiments verifying WenyanGPT's leading performance across multiple tasks.

2 Related Work

2.1 PLMs

In 2017, Google introduced a new neural network architecture, the Transformer. It utilizes self-attention mechanisms to better handle long-distance dependencies and significantly improves training efficiency through parallel computation. Based on the Transformer, various LLMs have been proposed. BERT employs an encoder-only Transformer architecture and is pre-trained using masked language modeling and next-sentence prediction tasks. The GPT series uses a decoder-only Transformer architecture and an autoregressive language model (ALM). Over the course of the GPT series, the model size has steadily increased, from the original GPT to subsequent iterations, including GPT-2 [Radford et al., 2019], GPT-3 [Brown et al., 2020], and GPT-4 [Achiam et al., 2023], with continuous improvements in performance. PaLM [Chowdhery et al., 2023] uses the standard Transformer architecture in a decoder-only model with a modified SwiGLU activation function. This model, with 540 billion parameters, achieved human-level performance in 1-shot learning on the BIG-bench [Srivastava et al., 2023] dataset. In 2023, Meta AI released the LLaMA model [Touvron et al., 2023]. This model also follows a decoder-only Transformer architecture and excels in various NLP tasks after large-scale training. In 2024, LLaMA 3 [Dubey et al., 2024] was released, including a pre-trained version with 405 billion parameters and a post-training version, alongside the LLaMA Guard 3 model for input-output safety. Pre-trained language models have seen rapid development, with Transformer-based models becoming the mainstream technology in NLP.

2.2 Classical Chinese PLMs

Pre-trained language models have achieved widespread success in the field of NLP. However, studies show that general-domain models often lack specialized knowledge for tasks in specific domains. Models pre-trained with domain-specific data tend to perform better on specialized tasks [Ke et al., 2023; Gupta et al., 2023; Ibrahim et al., 2024; Taylor et al., 2022; Lehman et al., 2023; Liu et al., 2020]. In the field of Classical Chinese, several studies have extended models such as BERT, RoBERTa, and GPT by incorporating Classical Chinese corpora for pre-training, resulting in specialized models like AnchiBERT [Tian et al., 2020], SikuBERT and SikuRoBERTa [Wang et al., 2022], the GujiBERT and GujiGPT series [Wang et al., 2023a], and SikuGPT [Liu et al., 2023a]. These models show improved performance over general pre-trained models on Classical Chinese tasks. Instruction fine-tuning is another effective strategy: Supervised Fine-Tuning (SFT) can activate LLMs' ability to understand and answer questions in a specific domain [Liu et al., 2023b; Xiong et al., 2023; Wang et al., 2023b; Yue et al., 2023; Huang et al., 2023; Cui et al., 2023; Yang et al., 2023b; Zhang et al., 2023a; Dan et al., 2023]. Classical Chinese LLMs [Zhang et al., 2024; Yang et al., 2024b; Cao et al., 2023; Cao et al., 2024] are in the early stages of development. For example, the "Xunzi"¹ Classical Chinese large language model was trained on Classical Chinese-related corpora based on general models such as Qwen2.5 [Yang et al., 2024a], Baichuan2 [Yang et al., 2023a], and GLM-4 [Zeng et al., 2024]. It has shown excellent performance in intelligent tagging, information extraction, and other tasks. TongGu [Cao et al., 2024], through two-stage instruction fine-tuning, is capable of Classical Chinese punctuation, translation, and appreciation tasks. In this paper, WenyanGPT is fine-tuned on higher-quality pre-training data and a larger, more diverse instruction dataset, showing superior and more comprehensive task handling capabilities.

3 WenyanGPT

In order to obtain the WenyanGPT Classical Chinese model, we first construct a Classical Chinese pre-training corpus and continue pre-training based on LLaMA3-8B-Chinese (Section 3.1). Then, we propose a method to construct domain instruction data (Section 3.2). In our framework, instruction generation is manually constructed, guided by LLMs, and filtered to ensure quality.

∗Corresponding author.
¹XunziALLM ([Link]).
[Figure: Overview of the instruction-data construction framework. Question-answer pairs are extracted from parallel corpora (e.g., 古之学者必有师。 / 古代求学的人必定有老师。) and tagged corpora (e.g., 十三年 / 十三年/t ,/w) and filtered into high-quality (input, output) pairs. Initial instructions are expanded, merged, reverse-reasoned, and filtered into seed instructions, which are then tested and filtered into high-quality instructions. Combining high-quality instructions with the pairs yields (instruction, input, output) data; outputs generated by larger models and by artificial design are likewise filtered, and the resulting parts of high-quality SFT data are merged into the final high-quality SFT data.]
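As a rough illustration of the triple-construction step in the framework above (the function names, prompt wording, and filter thresholds are our own assumptions, not details from the paper), the sketch below turns parallel-corpus pairs plus an instruction pool into (instruction, input, output) SFT records with a simple length-based quality filter:

```python
import json
import random

# Hypothetical instruction pool for the translation task; the paper's
# actual instruction wording is not specified in this excerpt.
TRANSLATION_INSTRUCTIONS = [
    "将这段古汉语文本翻译成现代汉语:",  # "Translate this Classical Chinese text into Modern Chinese:"
    "请把下面的文言文译为白话文:",      # a paraphrased variant
]

def quality_filter(source: str, target: str) -> bool:
    """Toy filter: drop empty or wildly length-mismatched pairs."""
    if not source.strip() or not target.strip():
        return False
    ratio = len(target) / len(source)
    return 0.5 <= ratio <= 4.0  # thresholds are illustrative assumptions

def build_sft_records(parallel_pairs, seed=0):
    """Turn (classical, modern) pairs into (instruction, input, output) records."""
    rng = random.Random(seed)
    records = []
    for classical, modern in parallel_pairs:
        if not quality_filter(classical, modern):
            continue
        records.append({
            "instruction": rng.choice(TRANSLATION_INSTRUCTIONS),
            "input": classical,
            "output": modern,
        })
    return records

pairs = [("古之学者必有师。", "古代求学的人必定有老师。"), ("", "空")]
records = build_sft_records(pairs)
print(json.dumps(records, ensure_ascii=False, indent=2))
```

The second, empty-source pair is discarded by the filter, leaving one well-formed SFT record.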
Table 6: Results for understanding tasks (Punctuation, POS, NER) on WenyanBENCH. Underlined results represent the second-best model's F1 score.

[Figure 4: The F1 scores of WenyanGPT for the subcategories of understanding tasks on WenyanBENCH, covering punctuation marks (,。;:“”《》()!<>?), POS tags (p, c, nr, n, r, v, w, d, ns, t, y, u, m, a, f, j, q), and NER entity types (characters, places, times, official positions).]
...understanding tasks and generation tasks). For understanding tasks, evaluation primarily relies on Precision, Recall, and F1-Score. For generation tasks, BLEU and BERT-Score are used as evaluation metrics. BLEU measures the N-gram overlap between generated content and reference answers, while BERT-Score better captures the semantic similarity between the generated content and reference answers.

Evaluation Method. To efficiently assess model performance, we design a set of scripted tools to automatically compute BLEU, BERT-Score, and other metrics. These tools quickly and accurately quantify model outputs, providing clear feedback for model optimization. This automated evaluation approach improves evaluation efficiency and ensures the consistency and comparability of the results.

5 Experiments

The experiments evaluate WenyanGPT's performance on understanding and generation tasks in Classical Chinese.

5.1 Experimental Setup

Baselines. The baselines include general-domain and Classical Chinese domain LLMs. The general-domain LLMs are Qwen2.5-7B-Instruct, Baichuan2-7B-Chat, GLM-4-9B-Chat, Meta-Llama-3-8B-Instruct, Llama3-8B-Chinese-Chat, GPT-4o [Hurst et al., 2024], and Deepseek-V3 [Liu et al., 2024]. The Classical Chinese domain LLM is Xunzi-Qwen1.5-7B-Chat.

Data and Evaluation. We use the WenyanBENCH benchmark for testing. The understanding tasks, including punctuation, POS, and NER, are evaluated by Precision, Recall, and F1-Score. The generation tasks include word explanation, translation, and reverse dictionary, where BLEU is used for word explanation and translation, and BERT-Score is used for reverse dictionary.

5.2 Experimental Analysis

WenyanGPT demonstrates a significant lead in language understanding tasks. The experimental results for understanding tasks are presented in Table 6. In the NER task, WenyanGPT's precision, recall, and F1 score all exceed 90%, while the second-best model, GPT-4o, fails to surpass 77% in any of these metrics. In the punctuation task, WenyanGPT's F1 score is 16.65% higher than that of the second-best model, Deepseek-V3, reaching 75.66%. Additionally, in the POS task, WenyanGPT's F1 score is 7.35% higher than that of the second-best model, GPT-4o. These results highlight WenyanGPT's overwhelming superiority in understanding tasks, particularly in NER and punctuation. This performance is attributed to the model's extensive pre-training on Classical Chinese data, enabling it to handle complex linguistic phenomena and ensuring higher accuracy and stability in fundamental language understanding tasks, such as POS and NER.

WenyanGPT demonstrates excellent capabilities and high efficiency in subcategories of Classical Chinese comprehension.
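The Precision/Recall/F1 and BLEU metrics used above can be sketched as follows. This is our own illustrative code, not the paper's released evaluation scripts: it parses the `word/tag` format from the POS examples, scores (word, tag) pairs as multisets, and uses a simplified character-level unigram precision as a stand-in for BLEU-1:

```python
from collections import Counter

def parse_tagged(s: str):
    """Parse 'word/tag' tokens, e.g. '四年/t 春/n ,/w' -> [('四年', 't'), ...]."""
    return [tuple(tok.rsplit("/", 1)) for tok in s.split() if "/" in tok]

def prf1(pred, gold):
    """Micro precision/recall/F1 over (word, tag) pairs, counted as multisets."""
    pc, gc = Counter(pred), Counter(gold)
    tp = sum((pc & gc).values())  # pairs correct in both word and tag
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def bleu1(candidate: str, reference: str) -> float:
    """Character-level unigram precision (a simplified stand-in for BLEU-1)."""
    cand, ref = Counter(candidate), Counter(reference)
    return sum((cand & ref).values()) / max(len(candidate), 1)

# Ground truth vs. GPT-4o output from the POS case study (Table 8).
gold = parse_tagged("四年/t 春/n ,/w 衞州吁/nr 弑/v 桓公/nr 而/c 立/v 。/w")
pred = parse_tagged("四年/n 春/n ,/w 衞州/n 吁/v 弑/v 桓公/n 而/c 立/v 。/w")
p, r, f1 = prf1(pred, gold)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

A real evaluation would use a standard BLEU implementation with brevity penalty and higher-order n-grams; the multiset F1 here already reflects how tagging errors such as 四年/n lower both precision and recall.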
Table 7: Results for generation tasks (Translation, Word explanation, Reverse dictionary) on WenyanBENCH. Underlined results represent the second-best model's BLEU-1 score and BERT-Score F1 score.
Case 1 (POS). Prompt: Please segment the following Classical Chinese content and accurately tag the parts of speech: 四年春,衞州吁弑桓公而立。
Ground Truth: 四年/t 春/n ,/w 衞州吁/nr 弑/v 桓公/nr 而/c 立/v 。/w
WenyanGPT: 四年/t 春/n ,/w 衞州吁/nr 弑/v 桓公/nr 而/c 立/v 。/w
Deepseek-V3: 四/m 年/t 春/t ,/w 衞/ns 州吁/nr 弑/v 桓公/nr 而/c 立/v 。/w
GPT-4o: 四年/n 春/n ,/w 衞州/n 吁/v 弑/v 桓公/n 而/c 立/v 。/w
Qwen2.5-7B-Instruct: 四年/t 春/w ,/w 卫州吁/nr 射/v 桓公/nr 而/c 立/v 。/w
Xunzi-Qwen1.5-7B-Chat: 四年春/t ,/w 衞州吁/nr 杀/v 桓公/nr 而/c 立/v 。/w

Table 8: Response examples for the POS task from different LLMs. Errors in POS are marked in red, while text errors are highlighted in blue.

Case 2 (Word explanation). Prompt: Explain the meaning of "晕红" in the line "一片晕红才著雨,几丝柔绿乍和烟" from "浣溪沙·谁道飘零不可怜".
Ground Truth: 形容海棠花的色泽。(Describes the color of begonia flowers.)
WenyanGPT: 形容海棠花色泽鲜艳而浓丽。(Describes the bright and rich color of the begonia flower.)
Deepseek-V3: 指雨后花朵泛起的朦胧红晕。(It refers to the hazy red glow that appears on flowers after the rain.)
GPT-4o: 指花瓣在雨水滋润下呈现出的一种柔和而略带湿润光泽的红色。(Refers to the soft, slightly moist red color that the petals take on when nourished by rain.)
Qwen2.5-7B-Instruct: 指花瓣上晕染开的淡红色,适合颜色柔和、渐变,类似于涂抹或被雨水滋润后散开的红晕。(Refers to the light red color that spreads across the petals; it suits soft, gradual colors, similar to the blush that spreads after being smeared or moistened by rain.)
Xunzi-Qwen1.5-7B-Chat: 指花瓣上的红色晕染,因为花瓣在雨中被淋湿,所以红色晕染更加明显。(Refers to the red smudge on the petals; because the petals are wet from the rain, the red smudge is more obvious.)

Table 9: Response examples for the word explanation task from different LLMs.

GPT-4o mistakenly tags the time expression "四年" (the fourth year) as a common noun, and it also struggles with recognizing the proper noun "衞州吁" (Wei Zhou Xu). Qwen2.5-7B-Instruct errs in tagging the part of speech of "春" (spring) and sometimes replaces characters in the original text with inappropriate alternatives. Xunzi-Qwen1.5-7B-Chat primarily fails to distinguish between time words and nouns, and mistakenly uses the simplified character "杀" instead of the correct "弑". These errors highlight the models' difficulty in making precise distinctions between similar words.

For the word explanation task, WenyanGPT's response is the closest to the Ground Truth. Although WenyanGPT adds some modifiers, it maintains simplicity and refinement in describing the color, which matches the tone of the poem. Deepseek-V3, GPT-4o, Qwen2.5-7B-Instruct, and Xunzi-Qwen1.5-7B-Chat do not specify the subject as "海棠花" (begonia flowers) and instead over-explain the moistness of the petals and the process of spreading color, deviating from the standard, concise description of "晕红" (blushing red). In particular, the responses from Qwen2.5-7B-Instruct and Xunzi-Qwen1.5-7B-Chat dwell on the influence of the external environment, which does not fully align with the original text's expression. This over-expansion shows that these LLMs fail to capture the intended simplicity and depth of the original description.

WenyanGPT demonstrates strong semantic understanding and retention capabilities in handling Classical Chinese tasks. Its ability to accurately tag parts of speech and generate concise, poetic word explanations reflects a deep grasp of the nuances of Classical Chinese. In contrast, other LLMs struggle both with understanding the subtle distinctions of Classical Chinese and with producing responses that remain faithful to the content and meaning of the original text. This positions WenyanGPT as a powerful tool for handling complex tasks in the domain of Classical Chinese, far surpassing the other LLMs in terms of both accuracy and literariness.

6 Conclusion

We propose a comprehensive solution for Classical Chinese language processing challenges, including the development of WenyanGPT, a large language model focused on the Classical Chinese domain, and WenyanBENCH, an evaluation benchmark dataset for Classical Chinese tasks. We release pre-training and instruction fine-tuning datasets and describe the method for constructing the instruction fine-tuning dataset. Through systematic experiments and analysis, we demonstrate the significant impact of domain-specific pre-training and multi-task instruction fine-tuning on improving Classical Chinese processing capabilities. Our model outperforms existing mainstream LLMs in various downstream tasks. In the future, we intend to explore the potential of multimodal models by combining Classical Chinese texts with image data (such as inscriptions and manuscripts) to enhance processing capabilities.
... large language models for mental health support. ArXiv, abs/2309.15461, 2023.

[Liu et al., 2024] Aixin Liu, Bei Feng, Bing Xue, et al. Deepseek-V3 technical report. arXiv preprint arXiv:2412.19437, 2024.

[Radford and Narasimhan, 2018] Alec Radford and Karthik Narasimhan. Improving language understanding by generative pre-training. 2018.

[Radford et al., 2019] Alec Radford, Jeff Wu, Rewon Child, David Luan, et al. Language models are unsupervised multitask learners. 2019.

[Srivastava et al., 2023] Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, et al. Beyond the imitation game: Quantifying and extrapolating the capabilities of language models. Trans. Mach. Learn. Res., 2023.

[Taylor et al., 2022] Ross Taylor, Marcin Kardas, Guillem Cucurull, Thomas Scialom, et al. Galactica: A large language model for science. ArXiv, abs/2211.09085, 2022.

[Tian et al., 2020] Huishuang Tian, Kexin Yang, Dayiheng Liu, and Jiancheng Lv. AnchiBERT: A pre-trained model for ancient Chinese language understanding and generation. 2021 International Joint Conference on Neural Networks (IJCNN), pages 1–8, 2020.

[Touvron et al., 2023] Hugo Touvron, Thibaut Lavril, Gautier Izacard, et al. LLaMA: Open and efficient foundation language models. ArXiv, abs/2302.13971, 2023.

[Vaswani et al., 2017] Ashish Vaswani, Noam M. Shazeer, Niki Parmar, et al. Attention is all you need. In Neural Information Processing Systems, 2017.

[Wang et al., 2019] Hongbin Wang, Haibing Wei, Jianyi Guo, and Liang Cheng. Ancient Chinese sentence segmentation based on a bidirectional LSTM+CRF model. Journal of Advanced Computational Intelligence and Intelligent Informatics, 23(4):719–725, 2019.

[Wang et al., 2022] Dongbo Wang, Chang Liu, Zihe Zhu, Jiangfeng Liu, et al. SikuBERT and SikuRoBERTa: Construction and application of pre-trained models for the Siku Quanshu in the field of digital humanities. Library Tribune, 42(06):31–43, 2022.

[Wang et al., 2023a] Dongbo Wang, Chang Liu, Zhixiao Zhao, et al. GujiBERT and GujiGPT: Construction of intelligent information processing foundation language models for ancient texts. arXiv preprint arXiv:2307.05354, 2023.

[Wang et al., 2023b] Hao Wang, Chi-Liang Liu, Nuwa Xi, Zewen Qiang, Sendong Zhao, Bing Qin, and Ting Liu. HuaTuo: Tuning LLaMA model with Chinese medical knowledge. ArXiv, abs/2304.06975, 2023.

[Xiong et al., 2023] Honglin Xiong, Sheng Wang, Yitao Zhu, Zihao Zhao, Yuxiao Liu, Linlin Huang, Qian Wang, and Dinggang Shen. DoctorGLM: Fine-tuning your Chinese doctor is not a herculean task. ArXiv, abs/2304.01097, 2023.

[Yan et al., 2016] Rui Yan, Cheng-Te Li, Xiaohua Hu, and Ming Zhang. Chinese couplet generation with neural network structures. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2347–2357, 2016.

[Yang et al., 2023a] Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, et al. Baichuan 2: Open large-scale language models. arXiv preprint arXiv:2309.10305, 2023.

[Yang et al., 2023b] Yi Yang, Yixuan Tang, and Kar Yan Tam. InvestLM: A large language model for investment using financial domain instruction tuning, 2023.

[Yang et al., 2024a] An Yang, Baosong Yang, Beichen Zhang, et al. Qwen2.5 technical report. arXiv preprint arXiv:2412.15115, 2024.

[Yang et al., 2024b] Songhua Yang, Hanjie Zhao, Senbin Zhu, et al. Zhongjing: Enhancing the Chinese medical capabilities of large language model through expert feedback and real-world multi-turn dialogue. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 19368–19376, 2024.

[Yi et al., 2017] Xiaoyuan Yi, Ruoyu Li, and Maosong Sun. Generating Chinese classical poems with RNN encoder-decoder. In Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data: 16th China National Conference, CCL 2017, and 5th International Symposium, NLP-NABD 2017, Nanjing, China, October 13–15, 2017, Proceedings 16, pages 211–223. Springer, 2017.

[Yuan et al., 2019] Y. Yuan, D. Wang, S. Huang, and B. Li. The comparative study of different tagging sets on entity extraction of classical books. Data Analysis and Knowledge Discovery, 3(03):57–65, 2019.

[Yue et al., 2023] Shengbin Yue, Wei Chen, Siyuan Wang, Bingxuan Li, et al. DISC-LawLLM: Fine-tuning large language models for intelligent legal services. ArXiv, abs/2309.11325, 2023.

[Zeng et al., 2024] Team GLM: Aohan Zeng, Bin Xu, Bowen Wang, et al. ChatGLM: A family of large language models from GLM-130B to GLM-4 All Tools. ArXiv, abs/2406.12793, 2024.

[Zhang et al., 2023a] Xuanyu Zhang, Qing Yang, and Dongliang Xu. XuanYuan 2.0: A large Chinese financial chat model with hundreds of billions of parameters. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023.

[Zhang et al., 2023b] Yiqin Zhang, Sanhong Deng, Qi Zhang, Dongbo Wang, and Hongcun Gong. Comparative analysis of language models for linguistic examination of ancient Chinese classics: A case study of Zuozhuan corpus. In 2023 International Conference on Asian Language Processing (IALP), pages 154–161. IEEE, 2023.

[Zhang et al., 2024] Jundong Zhang, Songhua Yang, Jiangfeng Liu, and Qi Huang. AIGC empowering the revitalization of ancient books on traditional Chinese medicine: Building the Huang-Di large language model. Library Tribune, 44(10):103–112, 2024.