TraderTalk: An LLM Behavioural ABM Applied to Simulating Human Bilateral Trading Interactions
Abstract: We introduce a novel hybrid approach that augments Agent-Based Models (ABMs) with behaviours generated by Large Language Models (LLMs) to simulate human trading

and GPT-o1 in 2024, poses challenges for systematic testing, as updates can render studies obsolete. To address this, we propose a flexible framework using the most current, widely
arXiv:2410.21280v1 [[Link]] 10 Oct 2024
Trading Premise: “You are a market maker for UK gilts responsible for providing liquidity in the UK government bonds. You are supposed to at all times hold 0 bonds. Today, you actually have 0 bonds, which means your holding is actually flat”.

Results and Conclusion

Across 300 simulations, the LLM correctly chose not to trade in 180 instances, demonstrating that the LLM can reason about and follow the agent-based intention 60% of the time. We observe no significant differences with smaller sample sizes. This test isolates the LLM’s ability to reason given a natural language trading intention. However, there is no direct comparison of how frequently human traders perfectly follow such intentions. Furthermore, existing research on human decision-making highlights ambiguity and the distinction between following rules and intentions, suggesting that achieving a 100% success rate is unrealistic, although this is not quantifiable. We use this result as a baseline for future tests in more complex scenarios and believe that it achieves the goal of including human-like attributes, though quantifying them is beyond the scope of this work.

The remaining 40% of simulations, where the LLM did not follow the correct intention, split into two groups. In 23.6% of all simulations, the LLM attempted to “flatten” a trading book that was already flat, suggesting a misunderstanding of, or failure to adhere to, the instructions and market parlance. The other 16.4% of responses reflected active trading positions that directly contradicted the premise, with the LLM expressing a desire to “buy” in 10% of cases and to “sell” in 6.3% of cases. We interpret this variation in behaviour introduced by the LLM as analogous to the emergence of unexpected properties within traditional agent-based models.

5) Analyse and Conclude: We then analyse the conversation for trade occurrences, quantities, and dialogue content.

Project Context: “You are a market maker for UK gilts responsible for providing liquidity in the UK government bond. Your job is to answer incoming queries from other market makers to buy and sell UK government bonds by considering if you wish to do so. UK government bonds trade at mid price. You aim to make a trading decision in every conversation, either buy, sell or decline to trade. You must act professionally in your conversations, and any decision you take is clearly communicated to the other party and you repeat what is agreed.”

Trading Roles:

David: “You are a market maker for UK gilts responsible for providing liquidity in the UK government bond, you are supposed to at all times hold 0 bonds. Today, you actually have negative 10 million worth of bonds, your role is to buy the bonds if you have a negative holding”

Josephine: “You are a market maker for UK gilts responsible for providing liquidity in the UK government bond, you are supposed to at all times hold 0 bonds. Today you have 10 million worth of bonds, your role is to sell bonds if you are a holder, you need to call another market maker to trade away your bonds”

Unlike in RQ1, where the LLM operated independently, this setup integrates the ABM into decision-making to evaluate how often the LLM produces simulated conversations with correct reasoning regarding trade intentions and executions. Agents directly inform action direction (identifying buyers and sellers).
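To make the link between an agent's holding and its intended trade direction concrete, the following is a minimal sketch and not the authors' implementation. It assumes a signed holding per trader (negative means short) and a target of 0 bonds, as stated in the role prompts. The names `Trader`, `intended_action` and `adherence_rate` are hypothetical, and the decision labels stand in for whatever a reader extracts from each simulated conversation.

```python
from dataclasses import dataclass

TARGET_HOLDING = 0  # every market maker is told to hold 0 bonds at all times


@dataclass(frozen=True)
class Trader:
    """Hypothetical container for one agent's state (not from the paper)."""
    name: str
    holding: int  # signed position; negative means a short holding


def intended_action(trader: Trader) -> str:
    """Direction implied by the role text: buy when short, sell when long."""
    if trader.holding < TARGET_HOLDING:
        return "buy"
    if trader.holding > TARGET_HOLDING:
        return "sell"
    return "no trade"


def adherence_rate(intended: str, llm_decisions: list[str]) -> float:
    """Share of simulated conversations whose decision matches the intention."""
    if not llm_decisions:
        raise ValueError("need at least one decision")
    matches = sum(1 for decision in llm_decisions if decision == intended)
    return matches / len(llm_decisions)


david = Trader("David", -10_000_000)
josephine = Trader("Josephine", 10_000_000)
flat_trader = Trader("flat market maker", 0)

print(intended_action(david))        # buy
print(intended_action(josephine))    # sell
print(intended_action(flat_trader))  # no trade

# RQ1 headline figure: 180 of 300 runs kept the flat book flat.
# The other 120 runs are lumped under a placeholder label here.
rq1_decisions = ["no trade"] * 180 + ["other"] * 120
print(adherence_rate(intended_action(flat_trader), rq1_decisions))  # 0.6
```

Keeping the intention as a deterministic function of the holding means any deviation in a conversation can be attributed to the LLM rather than to the agent-based rule, which is the comparison the RQ1 and RQ2 results rely on.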
IV. CONCLUSION

We present TraderTalk, a novel LLM behavioural agent-based model that simulates realistic human bilateral trading interactions without extensive model tuning. Utilising a state-of-the-art, non-domain-specific, non-fine-tuned LLM (GPT-4o-mini) within the Concordia framework, we demonstrate limited yet realistic trade negotiations (RQ1), interpretation, and trade execution decisions (RQ2) at frequencies approximating those in U.S. equity markets. By addressing key challenges like coordinating agent turn-taking, our simulation achieves a trade-to-order ratio similar to real markets. Discrepancies between trading intentions and execution in agent outputs enhance the realism of stylised human traders, capturing decision-making processes in bilateral trading environments where much interaction occurs outside formal exchanges. This proof-of-concept indicates that LLMs can meaningfully enhance the realism of behavioural simulations in ABMs for financial market modelling and provides a foundation for future research into more complex multi-agent and multi-market simulations. Future work should enhance GABM’s understanding of implicit trading rules and dynamic market conditions; by refining their ability to capture human decision-making, LLMs could offer more robust simulations for policymakers, regulators, and market participants alike.

ACKNOWLEDGEMENT

We thank Dr Arnau Quera-Bofarull and Dr Nick Bishop (Oxford University) for their guidance on the research topic. We thank an anonymous market maker for their input. This work is funded in part by an ARC Laureate grant FL200100204.