Computer Science and Information Systems 2025 Volume 22, Issue 4, Pages: 1707-1756
https://doi.org/10.2298/CSIS241130068L
Implementing persona in the business sector by a universal explainable AI framework based on byte-pair encoding
Liu Zhenyao (School of Economics and Management, Taizhou University Taizhou, Jiangsu Province, China), zyliu@tzu.edu.cn
Liu Yu-Lun (Integration and Collaboration Laboratory, Department of Industrial Engineering and Engineering Management, National Tsing Hua University Hsinchu, Taiwan), morris.cy0910@gmail.com, yeh@ieee.org
Yeh Wei-Chang (Integration and Collaboration Laboratory, Department of Industrial Engineering and Engineering Management, National Tsing Hua University Hsinchu, Taiwan), morris.cy0910@gmail.com, yeh@ieee.org
Huang Chia-Ling (Department of International Logistics and Transportation Management, Kainan University Taoyuan, Taiwan), clhuang@mail.knu.edu.tw
In the commercial realm, particularly for business-to-consumer (B2C) companies, acquiring and retaining valuable potential customers is a paramount challenge. As chip technology continues to advance rapidly in line with Moore’s Law, various innovative AI technologies have emerged, yet this progress also highlights the well-known “black-box” problem, which has in turn driven the rise of Explainable AI (XAI) and explainable machine learning. In response, this study proposes a universal explainability framework that addresses both the black-box problem and the limited size of customer lists. The framework uses Byte-Pair Encoding (BPE), a fundamental tokenization algorithm in large language models, to tokenize natural language data and integrates the resulting tokens into customer data as feature columns, thereby constructing comprehensive personas. Crucially, domain experts are involved in the model-building process, selecting and recommending features. These experts use depth-first search to identify additional, similar feature columns, which then serve as target categories for machine learning models. The final step comprises classification tasks and prediction evaluation. Validation on public datasets demonstrates the framework’s effectiveness and generalizability: it increases the number of potential customers by 7.5 times compared to traditional modeling approaches. In case studies, the framework outperforms customer lists compiled by experts from past experience, yielding 2.4 times more customers, 3.8 times higher response rates, and 9 times more total respondents. More importantly, both the model-building process and the predictive outcomes are interpretable through domain knowledge, enabling businesses to transfer experience and expertise and laying a solid foundation for large language models within the industry.
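The tokenization step described in the abstract can be illustrated with a minimal sketch of classic BPE merge learning, followed by one possible way of turning tokens into binary feature columns. This is not the authors' implementation: the function names (`learn_bpe`, `tokenize`, `token_features`), the `</w>` end-of-word marker, the lowercasing, the whitespace word split, and the `min_len` filter are all assumptions made for illustration.

```python
from collections import Counter

END = "</w>"  # end-of-word marker (an assumption of this sketch)


def merge_pair(symbols, pair):
    """Replace each left-to-right occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe(texts, num_merges):
    """Learn up to `num_merges` merge rules from whitespace-split, lowercased words."""
    vocab = Counter()
    for text in texts:
        for word in text.lower().split():
            vocab[tuple(word) + (END,)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair; ties broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(symbols, best))] += freq
        vocab = new_vocab
    return merges


def tokenize(text, merges):
    """Apply the learned merges, in order, to each word of `text`."""
    tokens = []
    for word in text.lower().split():
        symbols = list(word) + [END]
        for pair in merges:
            symbols = merge_pair(symbols, pair)
        tokens.extend(symbols)
    return tokens


def token_features(texts, merges, min_len=3):
    """Build binary token-presence columns (hypothetical persona features)."""
    tokenized = [set(tokenize(t, merges)) for t in texts]
    columns = sorted(
        {tok for toks in tokenized for tok in toks
         if len(tok.replace(END, "")) >= min_len}
    )
    rows = [{col: int(col in toks) for col in columns} for toks in tokenized]
    return columns, rows


# Illustrative product descriptions (made up for this sketch, not from the paper).
descriptions = ["wireless mouse", "wireless keyboard", "wired mouse"]
merges = learn_bpe(descriptions, num_merges=100)
columns, rows = token_features(descriptions, merges)
for text, row in zip(descriptions, rows):
    print(text, row)
```

In a setting like the one the abstract describes, each resulting column could be joined to a customer record so that a domain expert can inspect and select token-level features by name, which is what keeps the features interpretable. The number of merges controls granularity: fewer merges yield sub-word fragments, more merges yield whole words.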
Keywords: Natural Language Processing, Byte-Pair Encoding, Persona, Explainable Machine Learning, Business Sector
This research was supported in part by the National Science and Technology Council, R.O.C., under grants MOST 110-2221-E-007-107-MY3, NSTC 112-2221-E-007-086, and NSTC 113-2221-E-007-117-MY3.