Real Time Fraud Detection Using Apache Flink - Part 2 - by Yugen - Ai - Yugen - Ai Technology Blog - Medium
Real Time Fraud Detection Using Apache Flink — Part 2 | by Yugen.ai | Yugen.ai Technology... https://medium.com/yugen-ai-technology-blog/real-time-fraud-detection-using-apache-flink-pa...
By Dharmateja Yarlagadda
Introduction
In a previous blog post we showcased how to get started with building a
fraud detection application by creating fraud detection rules using the Java
API.
In this blog, we’ll explore the pattern API to see how we can create fraud
rules based on patterns. Although the pattern API can be used in multiple
ways, we’ll use SQL, as it lets us understand the nuances of the pattern API
with little to no programming knowledge.
Individual patterns
Here’s another example: C D? E. It is similar to the example above, with the
difference that the ? denotes an optional event, so D may or may not occur
between C and E.
Simple patterns can be combined to create complex ones. For example, consider
the pattern ((A B+) C D? E), which combines the two individual patterns we saw
above.
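The quantifiers used here behave much like their regular-expression counterparts. As a rough illustration only (plain Python, not Flink), the sketch below encodes each event type as a single letter and checks whole sequences with the standard `re` module. The pattern `A B+` is taken from the combined pattern quoted above; the `matches` helper and the sample sequences are assumptions made for this example.

```python
import re

# Each letter stands for one event type; a string is an ordered event sequence.
# Assumption: one character per event, so "ABB" means A, then B, then B.
PATTERNS = {
    "A B+": re.compile(r"AB+"),                   # one A followed by one or more B
    "C D? E": re.compile(r"CD?E"),                # C, an optional D, then E
    "((A B+) C D? E)": re.compile(r"(AB+)CD?E"),  # the two patterns combined
}


def matches(pattern, events):
    """Return True when the whole event sequence fits the named pattern."""
    return PATTERNS[pattern].fullmatch(events) is not None


if __name__ == "__main__":
    samples = [
        ("A B+", "ABBB"),
        ("C D? E", "CE"),
        ("C D? E", "CDDE"),
        ("((A B+) C D? E)", "ABBCDE"),
    ]
    for name, seq in samples:
        print(name, seq, matches(name, seq))
```

The analogy only covers the shape of a pattern. In Flink each letter is additionally tied to a condition on the row, which a character class cannot express.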
Groups of Patterns
Groups of patterns take combination a step further by structuring multiple
patterns under higher-level logic. For example, we can restructure the pattern
from the previous section into two groups of patterns as follows:
• SKIP_TO_NEXT : Discards every partial match that started with the same
event as the emitted match.
• SKIP_TO_FIRST : Discards every partial match that started after the match
started but before the first event of PatternName occurred.
• SKIP_TO_LAST : Discards every partial match that started after the match
started but before the last event of PatternName occurred.
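To get a feel for how a skip strategy changes the set of emitted matches, here is a deliberately simplified Python sketch. It is not Flink's matching engine: it models only where the search resumes after a match, using the two SQL clauses AFTER MATCH SKIP PAST LAST ROW and AFTER MATCH SKIP TO NEXT ROW, and it reuses the one-letter-per-event encoding. The `find_matches` function and the strategy names passed to it are assumptions for illustration.

```python
import re


def find_matches(events, pattern, skip):
    """Return (start, end) index pairs of matches found in an event string.

    skip="past_last_row": resume after the last event of the match.
    skip="next_row":      resume one event after the start of the match.
    """
    rx = re.compile(pattern)
    pos, found = 0, []
    while pos < len(events):
        m = rx.match(events, pos)  # try to match starting exactly at pos
        if m is None or m.end() == m.start():
            pos += 1
            continue
        found.append((m.start(), m.end()))
        pos = m.end() if skip == "past_last_row" else m.start() + 1
    return found


if __name__ == "__main__":
    events = "AABAB"  # A = small transaction, B = large transaction
    print(find_matches(events, r"A+B", "past_last_row"))
    print(find_matches(events, r"A+B", "next_row"))
```

With SKIP PAST LAST ROW the matches never overlap, whereas SKIP TO NEXT ROW can report a second match that starts inside the first one.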
Pattern matching is available through several Flink APIs, including Java,
Scala, and SQL. For the purpose of this blog we are going to use Flink’s SQL
API.
By using Flink SQL for pattern matching, you can define fraud detection
logic in a declarative, concise, and readable manner. The enhanced
readability makes it easier for engineers and analysts alike to understand
the rules and iterate on them if required.
System Overview
Let’s take a quick look at the high-level architecture and flow of data before
we dig deep.
Figure: System Overview
3. High incoming Low retention pattern: The account received a high credit
amount (above threshold) but 90%+ was transferred out within 24h.
Setting Up
In the first blog we set up a fraud detection application which does the
following, in order:
For the pattern API we will reuse most of that setup. We assume that a data
generator is already running, producing transaction data and publishing it to
the transactions topic in Kafka.
First, let’s open the SQL client. It can be opened by running the
sql-client.sh command in the job manager container.
Inputs
Similar to the previous blog, we assume that there is a Kafka topic called
transactions which has a continuous stream of transaction data flowing into
it. Each message in the topic is a JSON string which contains
transaction-related fields. An example transaction would be as follows:
{
"transaction_id": "2b792051-509c-4472-95af-598a94f612fa",
"user_id": 1003,
"recipient_id": 9004,
"amount": 1005,
"transaction_type": "credit",
"ts": "2025-02-03 17:01:12"
}
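For readers who want to produce messages of this shape themselves, the following is a minimal sketch of a message builder in Python. It only builds and serializes the payload; publishing to Kafka is left out. The function names `make_transaction` and `to_message` are hypothetical and are not taken from the data generator used in the first blog.

```python
import json
import uuid
from datetime import datetime


def make_transaction(user_id, recipient_id, amount, transaction_type, now=None):
    """Build one transaction dict with the same fields as the example message."""
    now = now or datetime.now()
    return {
        "transaction_id": str(uuid.uuid4()),
        "user_id": user_id,
        "recipient_id": recipient_id,
        "amount": amount,
        "transaction_type": transaction_type,  # "credit" or "debit"
        "ts": now.strftime("%Y-%m-%d %H:%M:%S"),
    }


def to_message(txn):
    """Serialize a transaction to the JSON string that would go on the topic."""
    return json.dumps(txn)


if __name__ == "__main__":
    print(to_message(make_transaction(1003, 9004, 1005, "credit")))
```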
    amount FLOAT,
    transaction_type STRING,
    ts TIMESTAMP(3),
    message_ts TIMESTAMP_LTZ(3) METADATA FROM 'timestamp',
    proctime AS PROCTIME(),
    WATERMARK FOR message_ts AS message_ts - INTERVAL '5' SECOND
) WITH (
    'connector' = 'kafka',
    'topic' = 'transactions',
    'scan.startup.mode' = 'earliest-offset',
    'properties.bootstrap.servers' = 'kafka:9094',
    'format' = 'json'
);
1. Make sure you have the Kafka connector JAR corresponding to your Flink version
in the lib folder.
2. Make sure that the Kafka broker address matches your Kafka setup. In this
case the Kafka broker is reachable at kafka:9094, the address used in the table
definition. This may vary according to your setup.
You can run a simple SELECT query to verify the table creation
SELECT
    firstSmallTxnTime,
    lastSmallTxnTime,
    largeTxnTime,
    largeTxnAmount
FROM transactions
MATCH_RECOGNIZE (
    PARTITION BY user_id, recipient_id
    ORDER BY message_ts
    MEASURES
        -- A can map to several rows, so FIRST/LAST pick the row explicitly
        FIRST(A.ts) AS firstSmallTxnTime,
        LAST(A.ts) AS lastSmallTxnTime,
        B.ts AS largeTxnTime,
        B.amount AS largeTxnAmount
    ONE ROW PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A+ B)
    DEFINE
        A AS A.amount < 1,     -- small transaction
        B AS B.amount > 1000   -- large transaction
);
Note that the first ORDER BY field has to be a time attribute. For an
event-time column such as message_ts, that means the column must have a
watermark. The watermark helps to handle messages which arrive late. In this
case the watermark was set to 5 seconds when we created the transactions table,
so we accept messages that arrive up to 5 seconds late.
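The effect of the 5 second watermark can be sketched in a few lines of Python. This is a simplified model rather than Flink's implementation: it recomputes the watermark after every event as the largest timestamp seen so far minus the delay, whereas Flink emits watermarks periodically, so the exact cut-off can differ. The `classify` function and its labels are assumptions for illustration.

```python
from datetime import datetime, timedelta

DELAY = timedelta(seconds=5)  # mirrors INTERVAL '5' SECOND in the table definition


def classify(timestamps, delay=DELAY):
    """Label each event 'on_time' or 'late' against a running watermark."""
    max_seen = None
    labels = []
    for ts in timestamps:
        # Watermark before this event: largest timestamp seen so far minus the delay
        watermark = None if max_seen is None else max_seen - delay
        is_late = watermark is not None and ts <= watermark
        labels.append("late" if is_late else "on_time")
        if max_seen is None or ts > max_seen:
            max_seen = ts
    return labels


if __name__ == "__main__":
    base = datetime(2025, 2, 3, 17, 1, 12)
    arrivals = [base + timedelta(seconds=s) for s in (0, 10, 7, 4)]
    print(classify(arrivals))
```

In this run the event stamped 7 seconds after the base arrives out of order but is still within the 5 second bound, while the one stamped 4 seconds after the base falls behind the watermark.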
The image below illustrates how to interpret the pattern in the code above
Running the above query starts a Flink job which keeps emitting one record
per pattern match. If the query runs successfully, you should see output
similar to this:
SELECT
    user_id,
    smallTxnCount,
    firstSmallTxnTime,
    lastSmallTxnTime
FROM transactions
MATCH_RECOGNIZE (
    PARTITION BY user_id
    ORDER BY message_ts
    MEASURES
        COUNT(A.recipient_id) AS smallTxnCount,
        FIRST(A.ts) AS firstSmallTxnTime,
        LAST(A.ts) AS lastSmallTxnTime
SELECT
    depositSum,
    depositCount,
    withdrawalSum,
    withdrawalCount,
    depositStartTime,
    depositEndTime,
    withdrawalStartTime,
    withdrawalEndTime
FROM transactions
MATCH_RECOGNIZE (
    PARTITION BY user_id
    ORDER BY message_ts
    MEASURES
        SUM(A.amount) AS depositSum,
        COUNT(A.amount) AS depositCount,
        FIRST(A.ts) AS depositStartTime,
        LAST(A.ts) AS depositEndTime,
        SUM(B.amount) AS withdrawalSum,
        COUNT(B.amount) AS withdrawalCount,
        FIRST(B.ts) AS withdrawalStartTime,
        LAST(B.ts) AS withdrawalEndTime
    ONE ROW PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A+ B+?) WITHIN INTERVAL '1' HOUR
    DEFINE
        A AS A.transaction_type = 'credit',
        B AS B.transaction_type = 'debit'
            AND SUM(A.amount) > 500
            AND SUM(B.amount) > 0.8 * SUM(A.amount)
);
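The two thresholds in the DEFINE clause are easier to reason about with concrete numbers. The Python sketch below checks only the arithmetic, with the credit total above 500 and the debit total above 0.8 times the credit total. It does not reproduce MATCH_RECOGNIZE, which evaluates the condition row by row and also enforces the one hour window. The function name and its parameters are assumptions for illustration.

```python
def is_high_incoming_low_retention(credits, debits,
                                   min_credit_total=500, min_debit_ratio=0.8):
    """Check the two thresholds from the DEFINE clause on plain amounts.

    credits: amounts of the credit rows (variable A)
    debits:  amounts of the debit rows (variable B)
    """
    credit_total = sum(credits)
    debit_total = sum(debits)
    return (credit_total > min_credit_total
            and debit_total > min_debit_ratio * credit_total)


if __name__ == "__main__":
    # 600 credited and 500 debited; 500 exceeds 0.8 * 600
    print(is_high_incoming_low_retention([400, 200], [500]))
    # Only 400 credited, below the 500 threshold
    print(is_high_incoming_low_retention([300, 100], [390]))
```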
Outputs
The next step is to save our outputs to a new topic called
fraudulent_transactions. We can do this by first creating a table in the Flink
SQL client that describes what the fraudulent transactions table should look
like.
    transaction_id STRING,
    ts TIMESTAMP(3),
    fraud_type STRING
) WITH (
    'connector' = 'kafka',
    'topic' = 'fraudulent_transactions',
    'properties.bootstrap.servers' = 'kafka:9094',
    'properties.group.id' = 'flink-fraudulent-consumer-group',
    'properties.auto.offset.reset' = 'earliest',
    'format' = 'json'
);
We can now insert the detected fraudulent transactions into the output table
by simply running an insert query and changing our fields to match the table
format. The updated insert query becomes
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A+ B)
    DEFINE
        A AS A.amount < 1,
        B AS B.amount > 1000
);
This query creates a continuously running job which runs in the background.
We’ll create the rules corresponding to the other patterns as well.
    transaction_id,
    ts,
    'high_incoming_low_retention_pattern' AS fraud_type
FROM transactions
MATCH_RECOGNIZE (
    PARTITION BY user_id
    ORDER BY message_ts
    MEASURES
        LAST(B.transaction_id) AS transaction_id,
        LAST(B.ts) AS ts
    ONE ROW PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A+ B+?) WITHIN INTERVAL '1' HOUR
    DEFINE
        A AS A.transaction_type = 'credit',
        B AS B.transaction_type = 'debit'
            AND SUM(A.amount) > 500
            AND SUM(B.amount) > 0.8 * SUM(A.amount)
);
We can verify the results of the job by running a simple select on the
fraudulent_transactions table.
Possible Improvements
Conclusion
Apache Flink provides a powerful, flexible foundation for real-time fraud
detection. By combining robust stream processing with pattern recognition,
teams can quickly flag suspicious activity in a continuous flow of
transactions. Whether you’re capturing simple sequences of small and large
transactions or building sophisticated, ML-driven rules across multiple data
streams, Flink’s event-time processing and stateful analytics enable a
scalable, low-latency solution. As fraudulent behaviors continue to evolve,
leveraging Flink’s pattern recognition capabilities allows you to stay ahead of
threats and maintain trust in your platform.
Hope you enjoyed the read. Our team keeps publishing such deep dives on
the Yugen Tech Blog, so do consider giving our publication a follow if you’re
interested.