Computer Science and Information Systems 2025 Volume 22, Issue 4, Pages: 1707-1756
https://doi.org/10.2298/CSIS241130068L


Implementing persona in the business sector by a universal explainable AI framework based on byte-pair encoding

Liu Zhenyao (School of Economics and Management, Taizhou University, Taizhou, Jiangsu Province, China), zyliu@tzu.edu.cn
Liu Yu-Lun (Integration and Collaboration Laboratory, Department of Industrial Engineering and Engineering Management, National Tsing Hua University, Hsinchu, Taiwan), morris.cy0910@gmail.com, yeh@ieee.org
Yeh Wei-Chang (Integration and Collaboration Laboratory, Department of Industrial Engineering and Engineering Management, National Tsing Hua University, Hsinchu, Taiwan), morris.cy0910@gmail.com, yeh@ieee.org
Huang Chia-Ling (Department of International Logistics and Transportation Management, Kainan University, Taoyuan, Taiwan), clhuang@mail.knu.edu.tw

In the commercial realm, particularly for business-to-consumer (B2C) companies, acquiring and retaining valuable potential customers is a paramount challenge. As chip technology continues to advance rapidly in line with Moore’s Law, a variety of innovative AI technologies have emerged, but their growth has also highlighted the well-known “black-box” problem, which in turn has driven the rise of Explainable AI (XAI) and explainable machine learning. In response, this study proposes a universal explainability framework that addresses both the black-box problem and the limited size of customer lists. The framework uses Byte-Pair Encoding (BPE), the basic tokenization algorithm of large language models, to tokenize natural language data and integrates the resulting tokens into customer data as feature columns, thereby constructing comprehensive personas. Crucially, domain experts take part in the model-building process by selecting and recommending features. The experts then use depth-first search to identify additional, similar feature columns, which serve as target categories for machine learning models. The final step consists of classification tasks and prediction evaluation. Validation on public datasets demonstrates the framework's effectiveness and generalizability: it increases the number of potential customers by 7.5 times compared to traditional modeling approaches. In case studies, the framework outperforms customer lists generated by experts based on past experience, yielding 2.4 times more customers, 3.8 times higher response rates, and 9 times more total respondents. More importantly, both the model-building process and the predictive outcomes can be interpreted through domain knowledge, enabling businesses to transfer experience and expertise and laying a solid foundation for large language models within the industry.
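The tokenization step described in the abstract can be illustrated with a minimal sketch. It assumes a plain character-level BPE learned from whitespace-split words and a simple 0/1 encoding of token presence per customer; the function names (`learn_bpe`, `tokenize`, `token_features`), the `min_len` filter, and the sample notes are hypothetical and are not taken from the paper, whose actual pipeline may differ.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(texts, num_merges):
    """Learn up to `num_merges` BPE merge rules (illustrative, not the paper's code)."""
    # Each distinct word starts as a tuple of single characters, weighted by frequency.
    vocab = Counter(tuple(word) for text in texts for word in text.split())
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def tokenize(text, merges):
    """Split `text` into BPE tokens by replaying the learned merges in order."""
    tokens = []
    for word in text.split():
        symbols = tuple(word)
        for pair in merges:
            symbols = merge_pair(symbols, pair)
        tokens.extend(symbols)
    return tokens


def token_features(texts, merges, min_len=2):
    """Turn tokens into 0/1 feature columns, one row per text (assumed encoding)."""
    tokenized = [set(tokenize(t, merges)) for t in texts]
    columns = sorted({tok for toks in tokenized for tok in toks if len(tok) >= min_len})
    rows = [{col: int(col in toks) for col in columns} for toks in tokenized]
    return columns, rows


# Hypothetical free-text notes attached to three customer records.
notes = ["online banking loan", "online loan inquiry", "branch banking visit"]
merges = learn_bpe(notes, num_merges=30)
columns, rows = token_features(notes, merges)
```

Each entry in `rows` could then be joined to the corresponding customer record as extra feature columns. Because every column is a readable token, a domain expert can inspect and select columns directly, which is the property the framework relies on for interpretability.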

Keywords: Natural Language Processing, Byte-Pair Encoding, Persona, Explainable Machine Learning, Business Sector

This research was supported in part by the National Science and Technology Council, R.O.C., under grants MOST 110-2221-E-007-107-MY3, NSTC 112-2221-E-007-086, and NSTC 113-2221-E-007-117-MY3.

