





















































DeployCon is a free, no-fluff, engineer-first summit for builders on the edge of production AI, and you’re on the guest list. On June 25, Predibase is taking over the AWS Loft in San Francisco and streaming online for a day of candid technical talks and war stories from the teams that ship large-scale AI.
Live Stream – Wherever You Are
Can’t make it to SF? Join virtually and get the same expert content, live.
June 25, 10:30AM–1:30PM PT
The event is free, but space is limited so register now. Hope to see you there!
Yay!!!
Welcome to a landmark issue! This week marks our 100th newsletter, a significant milestone in our journey together exploring the dynamic world of AI, and it's all thanks to you, our valued reader!
To mark this special milestone, we've packed this 100th edition with an insightful graph data modeling post by our authors Ravi and Sid and the latest developments this week in the field of AI. Dive in for exclusive perspectives and updates that will inspire and inform your AI journey!
LLM Expert Insights,
Packt
Here’s your go-to calendar for this month’s midsummer AI meetups—perfect for networking, learning, and getting hands-on with the latest in generative models, agent frameworks, LLM tooling, and GPU hacking.
1. “Hype → Habit” Panel
Date: July 15, 2025
Location: Manchester – UK AI Meetup
Cost: Free
Focus: AI commercialisation
Website: Meetup.com
2. Mindstone London AI (August Edition)
Date: August 19, 2025
Location: London – Mindstone London AI
Cost: Free
Focus: Practical AI demos
Website: Meetup.com
3. Mindstone London AI (September Edition)
Date: September 16, 2025
Location: London – Mindstone London AI
Cost: Free
Focus: Agent-build case studies
Website: Meetup.com
What’s stopping you? Choose your city, RSVP early, and step into a room where AI conversations spark, and the future unfolds one meetup at a time.
Graph data modeling challenges traditional data modeling by encouraging different perspectives based on the problem context. Instead of modeling the data based on how it is stored, graphs help us model the data based on how it is consumed. Unlike rigid RDBMS approaches, which evolved from older, storage-limited technologies, graph databases like Neo4j enable flexible modeling using multiple labels. Because they are shaped by how data is actually consumed, graph models better reflect dynamic, interconnected data and offer more intuitive and efficient retrieval.
We will demonstrate a simple scenario in which we model the same data using both a relational database (RDBMS) and a graph-based approach. The dataset represents the following information:
A Person, described by their firstName, lastName, and the five most recent rental addresses where they have lived
Each address, in the following format: Address line 1, City, State, zipCode, fromTime, and tillTime
Following are some of the queries we could answer using this data:
First, let’s take a look at how this data can be modeled in an RDBMS.
There are three tables in this data model with relevant details: Person, Person_Address, and Address. The Person_Address (join) table contains the rental details along with references to the Person and Address tables. We use this join table to represent the rental details, to avoid duplicating the data within the Person or Address entities.
Let’s see how we fulfil Query 3 (Get the third address) from the RDBMS using the preceding model:
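-- Search by name, sort the rentals by start date, then skip two rows
SELECT a.line1, a.city, a.state, a.zip
FROM person p
JOIN person_address pa ON pa.person_id = p.id
JOIN address a ON pa.address_id = a.id
WHERE p.name = 'John Doe'
ORDER BY pa.start ASC
-- MySQL syntax: offset 2, row count 1
LIMIT 2, 1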
As you can see, in this query, we are relying on the search-sort-filter pattern to retrieve the data we want. We will now look at how this data can be modeled with graphs.
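Before we do, here is a minimal sketch of the relational version that you can run end to end, using Python's standard-library sqlite3 module. The column types, the sample rows, and the helper names build_db and third_address are assumptions made for illustration. The query uses LIMIT 1 OFFSET 2, which is equivalent to the MySQL-style LIMIT 2, 1 shown above.

```python
import sqlite3

def build_db():
    """Create the three-table model in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE address (
            id INTEGER PRIMARY KEY, line1 TEXT, city TEXT, state TEXT, zip TEXT
        );
        -- Join table: one row per rental, pointing at a person and an address.
        CREATE TABLE person_address (
            person_id INTEGER REFERENCES person(id),
            address_id INTEGER REFERENCES address(id),
            start TEXT,
            "end" TEXT
        );
    """)
    conn.execute("INSERT INTO person VALUES (1, 'John Doe')")
    # Hypothetical sample data, inserted out of chronological order on purpose.
    addresses = [
        (1, "1 first ln", "Edison", "NJ", "11111"),
        (2, "13 second ln", "Edison", "NJ", "11111"),
        (3, "25 third ln", "Edison", "NJ", "11111"),
        (4, "37 fourth ln", "Edison", "NJ", "11111"),
    ]
    rentals = [
        (1, 3, "2009-01-01", "2012-12-31"),
        (1, 1, "2001-01-01", "2003-12-31"),
        (1, 4, "2013-01-01", "2015-12-31"),
        (1, 2, "2004-01-01", "2008-12-31"),
    ]
    conn.executemany("INSERT INTO address VALUES (?, ?, ?, ?, ?)", addresses)
    conn.executemany("INSERT INTO person_address VALUES (?, ?, ?, ?)", rentals)
    return conn

def third_address(conn, name):
    """Search by name, sort rentals by start date, then skip two rows."""
    return conn.execute(
        """
        SELECT a.line1, a.city, a.state, a.zip
        FROM person p
        JOIN person_address pa ON pa.person_id = p.id
        JOIN address a ON a.id = pa.address_id
        WHERE p.name = ?
        ORDER BY pa.start ASC
        LIMIT 1 OFFSET 2
        """,
        (name,),
    ).fetchone()

conn = build_db()
print(third_address(conn, "John Doe"))
```

Note that the database has to find every rental for the person and sort them all before it can discard the first two rows, even though we only want one address.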
Graph data models use nodes (Person or Address) and relationships (HAS_ADDRESS) instead of join tables, thus reducing index lookup costs and enhancing retrieval efficiency. Take a look at how our data can be modeled using a basic graph data model:
You can use a Neo4j Cypher script to set up the indexes for faster data loading and retrieval:
CREATE CONSTRAINT person_id_idx FOR (n:Person) REQUIRE n.id IS UNIQUE ;
CREATE CONSTRAINT address_id_idx FOR (n:Address) REQUIRE n.id IS UNIQUE ;
CREATE INDEX person_name_idx FOR (n:Person) ON n.name ;
Once the schema is set up, we can use this Cypher script to load the data into Neo4j:
CREATE (p:Person {id:1, name:'John Doe', gender:'Male'})
CREATE (a1:Address {id:1, line1:'1 first ln', city:'Edison', state:'NJ', zip:'11111'})
CREATE (a2:Address {id:2, line1:'13 second ln', city:'Edison', state:'NJ', zip:'11111'})
…
CREATE (p)-[:HAS_ADDRESS {start:'2001-01-01', end:'2003-12-31'}]->(a1)
Now let’s see how we fulfil Query 3 (Get the third address) using graph data modeling:
MATCH (p:Person {name:'John Doe'})-[r:HAS_ADDRESS]->(a)
WITH r, a
ORDER BY r.start ASC
WITH r,a
RETURN a
SKIP 2
LIMIT 1
This query too relies on the search-sort-filter pattern and is not very efficient (in terms of retrieval time). Let’s take a more nuanced approach to graph data modeling to see if we can make retrieval more efficient.
Graph data modeling – Advanced approach
Here, let’s look at the same data differently and build a data model that reflects the manner in which we consume the data:
At first glance, this bears a close resemblance to the RDBMS ER diagram; however, this model contains nodes (Person, Rental, Address) and relationships (FIRST, LATEST, NEXT).
Let’s set up indexes:
CREATE CONSTRAINT person_id_idx FOR (n:Person) REQUIRE n.id IS UNIQUE ;
CREATE CONSTRAINT address_id_idx FOR (n:Address) REQUIRE n.id IS UNIQUE ;
CREATE INDEX person_name_idx FOR (n:Person) ON n.name ;
Then, you can load the data using Neo4j Cypher:
CREATE (p:Person {id:1, name:'John Doe', gender:'Male'})
CREATE (a1:Address {id:1, line1:'1 first ln', city:'Edison', state:'NJ', zip:'11111'})
…
CREATE (p)-[:FIRST]->(r1:Rental {start:'2001-01-01', end:'2003-12-31'})-[:HAS_ADDRESS]->(a1)
CREATE (r1)-[:NEXT]->(r2:Rental {start:'2004-01-01', end:'2008-12-31'})-[:HAS_ADDRESS]->(a2)
…
CREATE (p)-[:LATEST]->(r5)
Here is how your graph looks upon loading the data:
Let’s fulfil Query 3 (Get the third address) using this advanced graph data modeling approach:
MATCH (p:Person {name:'John Doe'})-[:FIRST]->()-[:NEXT*2..2]->()-[:HAS_ADDRESS]->(a)
RETURN a
We can see that the query traverses to the first rental and skips over the second rental to get to the third rental (refer to the preceding figure). This is how we naturally think about the data, and the query expresses the retrieval in the same way. We are no longer relying on the search-sort-filter pattern.
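One way to picture this traversal is as following pointers in a linked list. The sketch below mimics the FIRST, NEXT, LATEST, and HAS_ADDRESS relationships with plain Python objects. The class names, the add_rental and nth_address helpers, and the third sample address are illustrative assumptions, not Neo4j APIs.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Address:
    line1: str
    city: str

@dataclass
class Rental:
    start: str
    end: str
    has_address: Address                 # (:Rental)-[:HAS_ADDRESS]->(:Address)
    next: Optional["Rental"] = None      # (:Rental)-[:NEXT]->(:Rental)

@dataclass
class Person:
    name: str
    first: Optional[Rental] = None       # (:Person)-[:FIRST]->(:Rental)
    latest: Optional[Rental] = None      # (:Person)-[:LATEST]->(:Rental)

def add_rental(person: Person, rental: Rental) -> None:
    """Append a rental to the end of the person's chain."""
    if person.first is None:
        person.first = rental
    else:
        person.latest.next = rental
    person.latest = rental

def nth_address(person: Person, n: int) -> Optional[Address]:
    """Follow FIRST, then n - 1 NEXT hops; no sorting or filtering needed."""
    rental = person.first
    for _ in range(n - 1):
        if rental is None:
            return None
        rental = rental.next
    return rental.has_address if rental else None

john = Person("John Doe")
# Hypothetical sample rentals, added in chronological order.
add_rental(john, Rental("2001-01-01", "2003-12-31", Address("1 first ln", "Edison")))
add_rental(john, Rental("2004-01-01", "2008-12-31", Address("13 second ln", "Edison")))
add_rental(john, Rental("2009-01-01", "2012-12-31", Address("25 third ln", "Edison")))

print(nth_address(john, 3).line1)  # prints "25 third ln"
```

The ordering is stored in the structure itself, so reaching the third address costs two hops no matter how many rentals the person has.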
If you run and view the query profiles, you will see that the initial graph data model took 19 db hits and consumed 1,028 bytes to perform the operation, whereas the advanced graph data model took 16 db hits and consumed 336 bytes, roughly a third of the memory. Moving away from the traditional RDBMS modeling approach can therefore have a significant impact on performance and cost.
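To put those profile numbers in proportion, the short calculation below works out the relative reduction for each metric. It uses only the figures quoted above; the helper name reduction_pct is an assumption for illustration.

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100

db_hits = reduction_pct(19, 16)      # basic vs. advanced graph model
memory = reduction_pct(1028, 336)    # bytes consumed by each query

print(f"db hits reduced by {db_hits:.1f}%")  # 15.8%
print(f"memory reduced by {memory:.1f}%")    # 67.3%
```

On this tiny dataset the saving in db hits is modest, while the memory saving is much larger because the advanced model has nothing to sort.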
Another advantage of this advanced data model is that if we also want to track the sequence of rentals at each address, we can simply add another relationship, say, NEXT_RENTAL, between the rentals for the same address. Representing data like this in an RDBMS would be difficult. This is where Neo4j offers greater flexibility by persisting relationships and avoiding the join index cost, making it suitable for building knowledge graphs.
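As a rough illustration of the NEXT_RENTAL idea, the sketch below chains the rentals for a single address using plain Python objects. The first_rental and last_rental entry points, the helper names, and the second tenant are hypothetical; the text above only names the NEXT_RENTAL relationship itself.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Rental:
    tenant: str
    start: str
    next_rental: Optional["Rental"] = None   # (:Rental)-[:NEXT_RENTAL]->(:Rental)

@dataclass
class Address:
    line1: str
    first_rental: Optional[Rental] = None    # assumed entry point into the chain
    last_rental: Optional[Rental] = None     # assumed pointer to the newest rental

def record_rental(address: Address, rental: Rental) -> None:
    """Link a new rental after the most recent one for this address."""
    if address.first_rental is None:
        address.first_rental = rental
    else:
        address.last_rental.next_rental = rental
    address.last_rental = rental

def tenants_in_order(address: Address) -> List[str]:
    """Walk NEXT_RENTAL hops from the first rental to the last."""
    names, rental = [], address.first_rental
    while rental is not None:
        names.append(rental.tenant)
        rental = rental.next_rental
    return names

home = Address("1 first ln")
# Hypothetical tenants, recorded in chronological order.
record_rental(home, Rental("John Doe", "2001-01-01"))
record_rental(home, Rental("Jane Roe", "2004-01-01"))
print(tenants_in_order(home))
```

The same Rental nodes can sit in a person's NEXT chain and an address's NEXT_RENTAL chain at once, which is what makes this kind of extension cheap in a graph.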
Create LLM-driven search and recommendations applications with Haystack, LangChain4j, and Spring AI
Here is the news of the week.
MiniMax Releases Groundbreaking M1 AI Model with a 1-Million-Token Context Window
Shanghai’s MiniMax has launched MiniMax-M1, which it describes as the first open-source, hybrid-attention reasoning model, supporting context windows of up to 1 million tokens and powered by lightning attention and an MoE architecture. MiniMax claims that M1, which is trained with a new RL algorithm called CISPO, matches or exceeds open-weight rivals like DeepSeek R1 on reasoning, code, and long-context benchmarks.
Baidu Unveils AI Avatar in E-commerce Livestream
Luo Yonghao’s AI-powered avatar debuted on Baidu’s livestream, showcasing two synchronized digital hosts powered by the ERNIE foundation model. The duo interacted with each other, communicated with viewers, and introduced 133 products in 6 hours. The broadcast attracted over 13 million viewers, signaling China’s prowess in AI-driven innovation.
Google Introduces Live AI Search and Expands Gemini 2.5
Google has enhanced its search experience with Search Live in AI Mode, offering real-time voice interactions with multimodal responses directly within the Google app.
Additionally, Google expanded its Gemini 2.5 family with the introduction of Gemini 2.5 Flash-Lite, an efficient model designed for rapid, cost-effective tasks such as translation and summarization. Gemini 2.5 also introduced Deep Think, a developer-oriented feature improving step-by-step reasoning. This capability significantly boosts performance across coding, STEM, and multimodal tasks.
📢 If your company is interested in reaching an audience of developers, technical professionals, and decision makers, you may want to advertise with us.
If you have any comments or feedback, just reply to this email.
Thanks for reading and have a great day!
That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️
We would love to know what you thought—your feedback helps us keep leveling up.
Thanks for reading,
The AI_Distilled Team
(Curated by humans. Powered by curiosity.)