Eduardo Ordax’s Post

🚀 Bye bye JSON… a new standard called TOON is in town, and it's about to change how LLMs read your data. If you're still sending JSON to your LLMs, you're burning tokens, burning money, and burning accuracy. Welcome TOON (Token-Oriented Object Notation), built specifically for AI systems.

🔥 𝗪𝗵𝘆 𝗧𝗢𝗢𝗡 𝗶𝘀 𝗿𝗲𝘄𝗿𝗶𝘁𝗶𝗻𝗴 𝘁𝗵𝗲 𝗿𝘂𝗹𝗲𝘀
JSON was never designed for LLMs. It's chatty, redundant, and bloated with repeated keys. TOON fixes this with one goal: cut token usage by 30–60% while improving model accuracy.

𝗝𝗦𝗢𝗡 (𝗹𝗼𝗻𝗴, 𝘃𝗲𝗿𝗯𝗼𝘀𝗲):
{
  "products": [
    { "product_id": "301", "name": "Wireless Mouse", "price": "29.99", "stock": "in_stock", "rating": "4.5" },
    { "product_id": "302", "name": "Mechanical Keyboard", "price": "89.00", "stock": "low_stock", "rating": "4.8" },
    { "product_id": "303", "name": "USB-C Hub", "price": "45.50", "stock": "out_of_stock", "rating": "4.1" }
  ]
}

𝗧𝗢𝗢𝗡 (𝗰𝗼𝗺𝗽𝗮𝗰𝘁, 𝘁𝗼𝗸𝗲𝗻-𝗼𝗽𝘁𝗶𝗺𝗶𝘇𝗲𝗱):
products[3]{product_id,name,price,stock,rating}:
  301,Wireless Mouse,29.99,in_stock,4.5
  302,Mechanical Keyboard,89.00,low_stock,4.8
  303,USB-C Hub,45.50,out_of_stock,4.1

💡 Same information, up to 60% fewer tokens, and better comprehension by the model. TOON = Tokens Saved.

📉 𝗧𝗵𝗲 𝗡𝘂𝗺𝗯𝗲𝗿𝘀 𝗔𝗿𝗲 𝗪𝗶𝗹𝗱
🔸 Up to 64.7% token reduction for tabular data
🔸 73.9% accuracy vs JSON's 69.7% in structured retrieval tests
🔸 76% higher cost-efficiency, measured as accuracy per 1,000 tokens
For teams deploying agents, RAG systems, or long-context workflows, the savings add up fast.

📌 𝗪𝗵𝗲𝗿𝗲 𝗧𝗢𝗢𝗡 𝗦𝗵𝗶𝗻𝗲𝘀
🔸 Product catalogs
🔸 Logs & events
🔸 Time-series data
🔸 RAG-ready structured data
🔸 Agent communication
🔸 Multi-model pipelines
🔸 Any uniform list of objects (the sweet spot)

⚠️ 𝗡𝗼𝘁 𝗵𝗲𝗿𝗲 𝘁𝗼 𝗿𝗲𝗽𝗹𝗮𝗰𝗲 𝗝𝗦𝗢𝗡
TOON isn't a new API standard. It's a translation layer at the LLM boundary, as sketched below:
➡️ App uses JSON
➡️ Convert to TOON
➡️ Send TOON to the LLM
➡️ Convert back to JSON if needed
Simple. Efficient. Purpose-built. This isn't "just another format." It's part of the new LLM cost-efficiency stack.

#ai #genai #llm #json
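To make the translation-layer idea concrete, here is a minimal Python sketch of the round trip for the tabular case shown in the post. The helpers `encode_toon` and `decode_toon` are hypothetical names invented for this illustration, not the official TOON library, and they only handle a flat, uniform list of objects; the real spec covers nesting, quoting, escaping, and alternative delimiters that this sketch skips.

import json
import re

def encode_toon(name: str, rows: list[dict]) -> str:
    """Render a uniform list of flat dicts as a TOON-style table block.

    Illustrative only: assumes every row has the same keys and no value
    contains a comma or newline.
    """
    fields = list(rows[0].keys())
    lines = [f"{name}[{len(rows)}]{{{','.join(fields)}}}:"]
    for row in rows:
        lines.append("  " + ",".join(str(row[f]) for f in fields))
    return "\n".join(lines)

def decode_toon(block: str) -> dict:
    """Parse the block produced by encode_toon back into a JSON-ready dict."""
    header, *body = [ln for ln in block.splitlines() if ln.strip()]
    m = re.match(r"(\w+)\[(\d+)\]\{([^}]*)\}:", header)
    name, count, fields = m.group(1), int(m.group(2)), m.group(3).split(",")
    rows = [dict(zip(fields, ln.strip().split(","))) for ln in body[:count]]
    return {name: rows}

# App-side JSON, exactly as in the post's example.
payload = {
    "products": [
        {"product_id": "301", "name": "Wireless Mouse", "price": "29.99", "stock": "in_stock", "rating": "4.5"},
        {"product_id": "302", "name": "Mechanical Keyboard", "price": "89.00", "stock": "low_stock", "rating": "4.8"},
        {"product_id": "303", "name": "USB-C Hub", "price": "45.50", "stock": "out_of_stock", "rating": "4.1"},
    ]
}

toon = encode_toon("products", payload["products"])
print(toon)                            # compact block that goes into the prompt
assert decode_toon(toon) == payload    # round trip back to the JSON the app expects

The point of the sketch is that the application keeps speaking JSON end to end; only the text placed in the prompt (and, if needed, the model's structured reply) crosses through the TOON representation.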


I like the new name they gave to CSV ;]

TOON saving 60% tokens is great until you realize your entire stack speaks JSON and now you need bidirectional converters everywhere 🫠

There's nothing wrong with JSON: it allows nested objects, and you can stringify JSON anyhow... The main issue seems to be the duplicated field names, and that is solvable without inventing a whole new system.

We have our own alternative to JSON for optimizing LLM input: nested hashes. It's similar to JSON, but a lot more efficient for retrieval. See articles on the topic at https://mltechniques.com/?s=nested

2 months from now, nobody will remember this ever existed.

Do not use it yet: benchmarks show a non-negligible drop in accuracy with TOON vs JSON, probably because JSON is more prevalent in model training data.


