TraderTalk: An LLM Behavioural ABM applied to Simulating Human Bilateral Trading Interactions

Alicia Vidler, UNSW, Email: [Link]@[Link]
Toby Walsh, UNSW, Email: [Link]@[Link]

arXiv:2410.21280v1 [[Link]] 10 Oct 2024

Abstract—We introduce a novel hybrid approach that augments Agent-Based Models (ABMs) with behaviours generated by Large Language Models (LLMs) to simulate human trading interactions. We call our model TraderTalk. Leveraging LLMs trained on extensive human-authored text, we capture detailed and nuanced representations of bilateral conversations in financial trading. Applying this Generative Agent-Based Model (GABM) to government bond markets, we replicate trading decisions between two stylised virtual humans. Our method addresses both structural challenges—such as coordinating turn-taking between realistic LLM-based agents—and design challenges, including the interpretation of LLM outputs by the agent model. By exploring prompt design opportunistically rather than systematically, we enhance the realism of agent interactions without exhaustive overfitting or model reliance. Our approach successfully replicates trade-to-order volume ratios observed in related asset markets, demonstrating the potential of LLM-augmented ABMs in financial simulations.

I. INTRODUCTION

Large Language Models (LLMs) have garnered much attention since 2022 and continue to evolve rapidly, demonstrating capabilities in understanding and generating human-like text across various domains. Integrating LLMs into multi-agent frameworks is seen as a key future design method for AI [1], aiming to replicate complex human interactions and decision-making processes [2]. Behaviours like risk inertia, aversion, and ambiguity avoidance significantly impact financial markets [3], and can be so extreme as to dissuade traders from transacting altogether [4]. In this paper we investigate whether an LLM, managed by an agent, can simulate human behaviours in bilateral asset trading.

ABMs can effectively simulate interdependent, adaptive complex systems [5], [6]; however, their design involves significant parameterisation of agent behaviours. Logic-based methods like "belief, desire, intention" [7], [8], [9] are common, though agent feature specification, especially of logic and decision-making, is challenging [10], and no consensus exists on calibration methods [11].

ABMs are well established in financial market simulations, with the necessity of heterogeneous agents recognised [12], [13]. The rapid evolution of LLMs—such as OpenAI's GPT-4o and GPT-o1 in 2024—poses challenges for systematic testing, as updates can render studies obsolete. To address this, we propose a flexible framework using the most current, widely accessible LLM (GPT-4o-mini) without relying on specific model versions or fine-tuned models. Due to rapid advances and the inherent lack of transparency in models like ChatGPT, we present limited results as a proof of concept.

We apply our methods to bilateral trading in government bond markets, such as UK Gilt bonds. This market involves participants such as market makers (MMs), clients, and inter-dealer brokers [14]. Systemically important to the countries they serve, in markets such as Australia and the UK, MMs facilitate most government bond transactions, which occur over-the-counter (OTC) with limited publicly available data [15], [16]. Thus, modelling of these markets requires novel methods of ABM design and enhancements [17]. By incorporating LLMs and focusing on negotiation and decision-making, our framework offers new opportunities to simulate realistic human interactions in these markets, aiming to enhance methodologies and provide more nuanced, realistic market simulations.

We introduce TraderTalk, a bespoke generative agent-based model (GABM) that integrates a general-purpose LLM into ABMs using the open-source software Concordia [18] and LLM prompting methods. By injecting human-like behaviours and uncertainties into logic-based ABMs without domain-specific tuning, we aim to enhance simulation realism in bilateral financial trading. This paper is structured as follows: we discuss recent research on LLMs and ABMs and related concerns, present our architecture for integrating LLMs in ABMs, show test case results for financial trading scenarios, and conclude with future research directions.

II. RECENT RESEARCH AND RELEVANT CONCERNS

The potential of LLMs is well acknowledged; however, limitations on numerical reasoning and prompting methods persist [19], [20], in particular on mathematical reasoning [21]. Prompt design remains an active research area with many challenges and opportunities; even simple prompts to "re-read" the input are found to significantly improve performance [22]. Complicating things further is the finding that non-AI experts often adopt "opportunistic rather than systematic approaches" to prompt design [23]. Thought-eliciting methods, like Chain-of-Thought (CoT), are popular as they aim to "elicit the reasoning process in the output" [24], relying on giving an LLM "worked examples", drawing inspiration from human learning theories. We build upon these concepts, incorporating an agent into this bi-directional process and CoT, within a simulated conversation.

©2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. October 10th, 2024.
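These prompting ideas can be made concrete with a small sketch. The helper and prompt wording below are our own illustrative assumptions, not the paper's actual prompts: they simply combine a "re-read" instruction [22] with a CoT-style worked example [24] for a trading agent.

```python
# Illustrative sketch of the prompting techniques discussed above:
# a "re-read" instruction plus a Chain-of-Thought worked example.
# The wording and helper name are hypothetical, not the paper's prompts.

def build_cot_prompt(role_description: str, query: str) -> str:
    """Assemble a CoT-style prompt for a trading agent."""
    worked_example = (
        "Example: Q: You hold 0 bonds and your target is 0 bonds. "
        "Should you trade? A: My holding already matches my target, "
        "so the reasoned decision is: no trade."
    )
    return (
        f"{role_description}\n"
        f"{worked_example}\n"
        f"Question: {query}\n"
        "Re-read the question above before answering.\n"  # re-read trick [22]
        "Think step by step, then state your final decision."
    )

prompt = build_cot_prompt(
    "You are a market maker for UK gilts.",
    "You hold 0 bonds today. Do you wish to trade?",
)
print(prompt)
```

A prompt assembled this way would then be sent to the external LLM; the worked example anchors the expected reasoning format without domain-specific fine-tuning.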

A. Application and Financial asset trading

Current financial market simulations using LLMs prioritise price dynamics over trading activity [25]. In bond markets, prices are largely known due to interest rate assumptions and are heavily influenced by monetary policy [26]; moreover, MMs are legally required to maintain a minimum market share of government bonds [27]. Thus, liquidity—the movement of assets between parties—is a key concern. Modelling these asset flows and transaction intentions remains an active, though limited, research area [17]. LLMs could be particularly useful in markets with limited data and dominant bilateral trading interactions. Despite their potential, the application of LLMs in simulating human-to-human interactions in financial markets, especially for bilateral trading, remains under-explored. Our research addresses this gap.

B. Generative Agent Based Models

Using Concordia [18] we integrate LLMs with ABMs to create Generative Agent-Based Models (GABMs), enabling agents to "apply common sense" and "act reasonably" within simulated environments [2]. A key feature is the Game Master agent, which translates natural language requests into executable actions. However, recent studies [20], [28] reveal that current LLMs underperform in negotiation tasks and strategic reasoning within agent-based systems. To address these challenges, we focus on enhancing agents' negotiation and decision-making capabilities by integrating LLMs beyond traditional rule-based ABMs [7]. Unlike prior work [29] that designs generic agents using LLMs, we concentrate on design features essential for decision-making through negotiation. By augmenting agents with LLMs, we aim to create more flexible and adaptable decision-making processes in complex environments. Our work demonstrates that even in simple settings, combining LLMs with agents can enhance realism in ABM models, benefiting future research.

C. Order to Trade Ratio (OTR): Uncertainty in Human-Directed Trading

Financial trading involves significant uncertainty, often requiring multiple attempts before a trade is executed—even in transparent equity markets. In 2024, the average OTR for major U.S. equity exchanges was approximately 4.61% [30]. Thus, up to 96% of daily trading requests do not result in trades, complicating the modelling of human behaviour [30]. Work by [31] explores possible theories and impacts, including market spoofing [32], and theories of ambiguity aversion in human trading are discussed in [4] and [3]. The causes of high OTRs remain an open research question, though we aim to leverage LLMs to include this aspect for added realism.

III. TRADERTALK: ARCHITECTURE AND RESULTS

TraderTalk is an ABM featuring two market-making agents, "Josephine" and "David," each with initial characteristics such as bond holdings and explicit trading intentions. Our model passes information between the agents and an external LLM, consistently using GPT-4o-mini throughout our experiments.

Fig. 1: Model Architectures. (a) RQ1 Baseline; (b) RQ2: GABM with Concordia acting as agent handler.

We address two research questions:

Research Question 1: Can an LLM realistically and appropriately respond in a bilateral trading interaction? (RQ1: Baseline)

We initially implement our model with agents functioning as messengers passing basic information to the LLM. The simulated scenario involves the first MM initiating contact with another MM, who does not wish to trade because they are not a buyer. Using a CoT framework, agents are guided through a sequence where they:

• Summarise new information.
• Clarify their roles and objectives.
• Assess their current bond holdings.
• Decide whether to trade or not.

If a trade is decided, they determine the appropriate action to meet their obligations, such as buying or selling bonds, flattening their trading book, or maintaining their current position. The agents are initialised with prompts derived from the CoT, which in turn drives a simulated conversation, each agent responding (and concluding) based on LLM reasoning. If they choose to trade, the LLM is asked to select from four possible options (buying bonds, selling bonds, flattening their trading book, or no trade). After both agents have contributed and made a selection, the conversation text is separately analysed to determine if a trading decision was reached. In this setup, the ABM provides only a premise to the LLM, which independently makes decisions.

Our goal is to evaluate how often the LLM correctly reasons to produce a simulated conversation resulting in "no trade". We selected the "no trade" decision to avoid complexities related to numerical reasoning [21], focusing on a scenario where a trader holding no bonds (a flat position) is expected to follow the straightforward, implicit prompt of not trading. We isolate the LLM's ability to interpret and apply a specified trading intention. The CoT prompt is thus structured so that the correct outcome is for the agent to choose "no trade" from the final multiple-choice options provided to it.
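The RQ1 baseline pipeline can be sketched as follows. The paper does not publish its code, so `ask_llm` is a stand-in for the real GPT-4o-mini call and `classify` is our own keyword-based simplification of the separate post-hoc analysis of the conversation text; the response weights are invented purely so the sketch runs end to end (they loosely echo the observed outcome split, not a model of it).

```python
import random

# Sketch of the RQ1 baseline: the ABM supplies only a premise, the LLM
# decides independently, and the resulting text is analysed afterwards.
# `ask_llm` is a placeholder for an external GPT-4o-mini call; its
# response distribution is invented so the sketch is runnable offline.

PREMISE = (
    "You are a market maker for UK gilts. You are supposed to at all "
    "times hold 0 bonds. Today you actually have 0 bonds (a flat book)."
)
OPTIONS = ["buy bonds", "sell bonds", "flatten the trading book", "no trade"]

def ask_llm(prompt: str, rng: random.Random) -> str:
    # Placeholder for the external LLM; weights loosely echo the paper's
    # reported 10% / 6.3% / 23.6% / 60% split, purely for illustration.
    return rng.choices(OPTIONS, weights=[0.10, 0.063, 0.236, 0.601], k=1)[0]

def classify(response: str) -> str:
    # Post-hoc analysis of the conversation text for the final decision.
    return next((opt for opt in OPTIONS if opt in response), "unclear")

rng = random.Random(0)
decisions = [classify(ask_llm(PREMISE, rng)) for _ in range(300)]
no_trade_rate = decisions.count("no trade") / len(decisions)
print(f"'no trade' chosen in {no_trade_rate:.0%} of 300 simulations")
```

The key design point the sketch preserves is that the decision is never made by the ABM: the agents only ferry the premise to the LLM and tally what comes back.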
Trading Premise: "You are a market maker for UK gilts responsible for providing liquidity in the UK government bonds. You are supposed to at all times hold 0 bonds. Today, you actually have 0 bonds, which means your holding is actually flat".

Results and Conclusion

Across 300 simulations, the LLM correctly selected not to trade in 180 instances, demonstrating that the LLM can reason and follow the agent-based intention 60% of the time. We observe no significant differences with smaller sample sizes. This test isolates the LLM's ability to reason given a natural language trading intention. There is, however, an absence of any direct comparison of how frequently human traders perfectly follow such intentions. Furthermore, existing research on human decision-making highlights ambiguity and the distinction between following rules and intentions, suggesting that achieving a 100% success rate is unrealistic, although this is not quantifiable. We use this result as a baseline for future tests in more complex scenarios and believe that this achieves the goal of including human-like attributes, though the quantum of such is beyond the scope of this work.

In the remaining 40% of simulations where the LLM did not follow the correct intention, 23.6% involved the LLM attempting to "flatten" a trading book that was already flat, suggesting a misunderstanding of, or failure to adhere to, the instructions and market parlance. The other 16.4% of responses reflected active trading positions that directly contradicted the premise, with the LLM expressing a desire to "buy" in 10% of cases and to "sell" in 6.3% of cases. We interpret this variation in behaviour introduced by the LLM as analogous to the emergence of unexpected properties within traditional agent-based models.

Research Question 2 (RQ2): TraderTalk—Can a GABM Make a Trading Decision in a Realistic Manner?

In this test, we enhance the framework from RQ1 by passing specific agent information to the LLM using Concordia's agent-handling mechanism [18]. Each agent is initialised with distinct roles: this time David holds a negative bond position and needs to buy bonds, while Josephine holds a positive bond position and needs to sell bonds. Concordia's Game Master design facilitates these interactions, functioning as a meta-agent manager that supervises exchanges and ensures smooth decision-making. Consequently, the model design is augmented from Figure 1a to produce Figure 1b.

The new process is as follows:
1) Define Chain of Thought (CoT): use the Project Context.
2) Initialise Agents: assign specific roles and initial conditions to the agents (see Trading Roles below).
3) Generate Initial Prompts: agents, via the Game Master, generate their own responses to the CoT questions from RQ1, stored and passed to the LLM.
4) Simulate Conversation: managed by the Game Master, the LLM simulates the dialogue between the agents, with each responding in turn based on previous interactions and their trading objectives. The conversation continues until the Game Master determines it has concluded.
5) Analyse and Conclude: we then analyse the conversation for trade occurrences, quantities, and dialogue content.

Project Context: "You are a market maker for UK gilts responsible for providing liquidity in the UK government bond. Your job is to answer incoming queries from other market makers to buy and sell UK government bonds by considering if you wish to do so. UK government bonds trade at mid price. You aim to make a trading decision in every conversation, either buy, sell or decline to trade. You must act professionally in your conversations, and any decision you take is clearly communicated to the other party and you repeat what is agreed."

Trading Roles:

David: "You are a market maker for UK gilts responsible for providing liquidity in the UK government bond, you are supposed to at all times hold 0 bonds. Today, you actually have negative 10 million worth of bonds, your role is to buy bonds if you have a negative holding."

Josephine: "You are a market maker for UK gilts responsible for providing liquidity in the UK government bond, you are supposed to at all times hold 0 bonds. Today you have 10 million worth of bonds, your role is to sell bonds if you are a holder, you need to call another market maker to trade away your bonds."

Unlike in RQ1, where the LLM operated independently, this setup integrates the ABM into decision-making to evaluate how often the LLM produces simulated conversations with correct reasoning regarding trade intentions and executions. Agents directly inform action direction (identifying buyers and sellers).

Results and Conclusion

Again, we conducted 300 simulations using GPT-4o-mini; we see agents intended to trade in 58% of cases, and at least one party was willing to trade in 98% of instances. Agent "Josephine" closely aligned with her role, intending to trade 97.3% of the time, while "David"'s intention was lower at 58.7%; he explicitly declined to trade in 22.3% of responses, and 19% were unclear. The 58% rate at which both parties intended to trade is close to the 60% correct response rate in RQ1, suggesting consistent reasoning abilities of the LLM across the different model designs of RQ1 and RQ2.

Despite the high intention to trade, actual trades occurred in only 5.7% of cases, highlighting a significant gap between intentions and execution. While LLM-driven agents often desire to trade, the LLM dialogue needed to finalise a trade seems less frequently generated, producing low successful trading rates, aligning with real-world observed OTR levels [30]. TraderTalk identified the correct initial bond holdings for both parties in only 2.34% of cases, with 32% of responses omitting starting values altogether, reflecting difficulties in recalling initial numerical conditions, in line with [21]. Overall, our ABM augmented with LLM behaviours (TraderTalk) appears capable of producing interactions consistent with sparse real-world data and of making trading decisions in a realistic manner.
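The five-step RQ2 process can be sketched as a minimal turn-taking loop. Concordia's actual Game Master API differs; the classes below are a hypothetical illustration of the supervisory role it plays (alternating turns and deciding when the conversation has concluded), with canned stand-ins for the LLM-generated dialogue.

```python
# Minimal sketch of Game-Master-style turn-taking for RQ2. This is an
# illustration of the coordination pattern, not the Concordia API: the
# agent responses are canned stand-ins for LLM-generated dialogue.

class Agent:
    def __init__(self, name: str, holding: int):
        self.name = name
        self.holding = holding  # bond position in millions; target is 0

    def speak(self, history: list) -> str:
        # Stand-in for an LLM call conditioned on role and conversation.
        if self.holding > 0:
            return f"{self.name}: I hold {self.holding}m, I wish to sell."
        if self.holding < 0:
            return f"{self.name}: I am short {-self.holding}m, I wish to buy."
        return f"{self.name}: My book is flat, no trade."

class GameMaster:
    """Supervises turn-taking and decides when the conversation ends."""

    def run(self, agents: list, max_turns: int = 6) -> list:
        history = []
        for turn in range(max_turns):
            speaker = agents[turn % len(agents)]  # alternate turns
            history.append(speaker.speak(history))
            # Conversation concludes once both sides have stated intent.
            if turn >= 1 and all(a.name in " ".join(history) for a in agents):
                break
        return history

transcript = GameMaster().run([Agent("Josephine", 10), Agent("David", -10)])
for line in transcript:
    print(line)
```

In this sketch, as in RQ2, the Game Master owns the structural problem of turn coordination while the agents own the content of each utterance; the step-5 analysis would then run over `transcript`.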
IV. CONCLUSION

We present TraderTalk, a novel LLM behavioural agent-based model that simulates realistic human bilateral trading interactions without extensive model tuning. Utilising a state-of-the-art, non-domain-specific, non-fine-tuned LLM (GPT-4o-mini) within the Concordia framework, we demonstrate limited yet realistic trade negotiations (RQ1), and interpretation and trade execution decisions (RQ2), at frequencies approximating those in U.S. equity markets. By addressing key challenges like coordinating agent turn-taking, our simulation achieves a trade-to-order ratio similar to real markets. Discrepancies between trading intentions and execution in agent outputs enhance the realism of stylised human traders, capturing decision-making processes in bilateral trading environments where much interaction occurs outside formal exchanges. This proof-of-concept indicates that LLMs can meaningfully enhance the realism of behavioural simulations in ABMs for financial market modelling, and provides a foundation for future research into more complex multi-agent and multi-market simulations. Future work should enhance GABMs' understanding of implicit trading rules and dynamic market conditions; by refining their ability to capture human decision-making, LLMs could offer more robust simulations for policymakers, regulators, and market participants alike.

ACKNOWLEDGEMENT

We thank Dr Arnau Quera-Bofarull and Dr Nick Bishop (Oxford University) for their guidance on the research topic. We thank an anonymous market maker for their input. This work is funded in part by ARC Laureate grant FL200100204.

REFERENCES

[1] A. Ng, "Agentic design patterns part 5: Multi-agent collaboration," 2024. [Online]. Available: [Link] issue-245/
[2] J. Park, J. O'Brien, C. Cai, M. R. Morris, P. Liang, and M. Bernstein, "Generative agents: Interactive simulacra of human behavior," arXiv, 2023.
[3] C. L. Ilut and M. Schneider, "Modeling uncertainty as ambiguity: a review," National Bureau of Economic Research, Working Paper 29915, April 2022. [Online]. Available: [Link]
[4] P. Bossaerts, P. Ghirardato, S. Guarnaschelli, and W. R. Zame, "Ambiguity in Asset Markets: Theory and Experiment," The Review of Financial Studies, vol. 23, no. 4, pp. 1325–1359, 2010. [Online]. Available: [Link]
[5] S. Bai, W. Raskob, and T. Müller, "Agent based model," Radioprotection, 2020.
[6] N. Gilbert, "Agent-based models," The Centre for Research in Social Simulation, 2007.
[7] M. Wooldridge, An Introduction to MultiAgent Systems, Second Edition. John Wiley and Sons, 2009.
[8] A. A. Kirilenko, A. S. Kyle, M. Samadi, and T. Tuzun, "The flash crash: High-frequency trading in an electronic market," Journal of Finance, 2017.
[9] J. Paulin, A. Calinescu, and M. Wooldridge, "Understanding flash crash contagion and systemic risk: A micro–macro agent-based approach," Journal of Economic Dynamics and Control, 2019.
[10] M. Fehler, F. Klügl, and F. Puppe, "Approaches for resolving the dilemma between model structure refinement and parameter calibration in agent-based simulations," Adaptive Agents and Multi-Agent Systems, 2006.
[11] P. Avegliano and J. S. Sichman, "Using surrogate models to calibrate agent-based model parameters under data scarcity," in Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019, pp. 1781–1783.
[12] R. Hayes, A. Todd, N. Chaidarun, S. Tepsuporn, P. Beling, and W. Scherer, "An agent-based financial simulation for use by researchers," 2014.
[13] M. E. Paddrik, R. Hayes, A. Todd, S. Y. Yang, P. A. Beling, and W. T. Scherer, "An agent based model of the E-mini S&P 500 applied to flash crash analysis," IEEE Conference on Computational Intelligence for Financial Engineering and Economics, 2012.
[14] Bank of England, "GEMM guidebook: A guide to the roles of the DMO and primary dealers in the UK government bond market," December 2004. [Online]. Available: [Link] 22bbjndz/[Link]
[15] J. Cheshire, "Market making in bond markets," RBA Bulletin, March Quarter 2015, pp. 63–74, 2015. [Online]. Available: [Link]
[16] G. Pinter, "An anatomy of the 2022 gilt market crisis," SSRN Electronic Journal, 2023.
[17] A. Vidler and T. Walsh, "Modelling opaque bilateral market dynamics in financial trading: Insights from a multi-agent simulation study," 2024. [Online]. Available: [Link]
[18] A. S. Vezhnevets, J. P. Agapiou, A. Aharon, R. Ziv, J. Matyas, E. A. Duéñez-Guzmán, W. A. Cunningham, S. Osindero, D. Karmon, and J. Z. Leibo, "Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia," arXiv preprint arXiv:2312.03664, 2023.
[19] A. Srivastava, A. Rastogi, A. Rao, A. A. M. Shoeb, A. Abid et al., "Beyond the imitation game: Quantifying and extrapolating the capabilities of language models," 2023. [Online]. Available: [Link]
[20] Y. Zhang, S. Mao, T. Ge, X. Wang, A. de Wynter, Y. Xia, W. Wu, T. Song, M. Lan, and F. Wei, "LLM as a mastermind: A survey of strategic reasoning with large language models," 2024. [Online]. Available: [Link]
[21] J. Ahn, R. Verma, R. Lou, D. Liu, R. Zhang, and W. Yin, "Large language models for mathematical reasoning: Progresses and challenges," in Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, St. Julian's, Malta: Association for Computational Linguistics, Mar. 2024.
[22] X. Xu, C. Tao, T. Shen, C. Xu, H. Xu, G. Long, and J.-G. Lou, "Re-reading improves reasoning in large language models," 2024. [Online]. Available: [Link]
[23] J. D. Zamfirescu-Pereira, R. Y. Wong, B. Hartmann, and Q. Yang, "Why Johnny Can't Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts," Conference on Human Factors in Computing Systems - Proceedings, 2023.
[24] J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. H. Chi, Q. V. Le, and D. Zhou, "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models," Advances in Neural Information Processing Systems, vol. 35, pp. 1–43, 2022.
[25] "Re-Reading Improves Reasoning in Large Language Models," 2023. [Online]. Available: [Link]
[26] F. J. Fabozzi and S. V. Mann, The Handbook of Fixed Income Securities, Seventh Edition. McGraw-Hill, 2005.
[27] Bank of England, "Official Operations in the Gilt Market: An Operational Notice," December 2023.
[28] "Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation," 2024.
[29] "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation," 2023.
[30] U.S. Securities and Exchange Commission, "Market structure: Exchange trade volume," 2024, accessed 2024-09-10. [Online]. Available: [Link]
[31] J. D. Farmer and S. Skouras, "Minimum resting times and transaction-to-order ratios: review of amendment 2.3.f and question 20," European Commission Public Consultation: Review of the Markets in Financial Instruments Directive (MiFID), 2012.
[32] V. Dalko and M. H. Wang, "How Effective are the Order-to-Trade Ratio and Resting Time Regulations?" Journal of Financial Regulation, vol. 4, no. 2, pp. 321–325, 2018. [Online]. Available: [Link]
