Building A Production-Ready Data Pipeline With Azure - Complete Guide To Medallion Architecture - by Yasar Kocyigit - Jun, 2025 - Medium
Building a Production-Ready Data Pipeline with Azure: Complete Guide to Medallion Archite... https://medium.com/@kocyigityasar/building-a-production-ready-data-pipeline-with-azure-c...
Today, I’ll walk you through building a complete Medallion Architecture data
pipeline using Azure services. By the end of this article, you’ll have a
production-ready solution that can process terabytes of data with minimal
maintenance.
Architecture Overview
• Orchestration: Azure Data Factory (pipeline management and data movement)
• Transformation: Azure Databricks (data processing with Apache Spark)
• Storage: Azure Data Lake Gen2 (scalable data storage)
• Metadata: Azure SQL Database (control tables and processing logs)
• Security: Azure Key Vault (secrets management)
Azure Databricks
• Apache Spark: Distributed processing for large datasets
ACID Transactions
Time Travel
Schema Evolution
Upsert Operations
Prerequisites
• Azure subscription with appropriate permissions
Repository Structure
azure-data-pipeline/
├── sql/ # Database setup scripts
│ ├── 01-create-control-tables.sql
│ ├── 02-create-stored-procedures.sql
│ └── 03-sample-data-setup.sql
├── databricks/
│ ├── notebooks/
│ │ └── bronze_to_silver.py # Main transformation logic
│ └── cluster-config.json
├── deployment/ # Infrastructure as Code
│ ├── 01-deploy-integration-runtime.ps1
│ ├── 02-deploy-datasets.ps1
│ └── 03-deploy-pipelines.ps1
└── config/
└── config-template.json
-- Tables Configuration
CREATE TABLE ctl.Tables (
TableId INT IDENTITY(1,1) PRIMARY KEY,
SourceSystemId INT NOT NULL,
SchemaName NVARCHAR(50) NOT NULL,
TableName NVARCHAR(100) NOT NULL,
LoadType NVARCHAR(20) CHECK (LoadType IN ('Full', 'Incremental')),
PrimaryKeyColumns NVARCHAR(500),
BronzePath NVARCHAR(500) NOT NULL,
SilverPath NVARCHAR(500) NOT NULL,
IsActive BIT DEFAULT 1,
Priority INT DEFAULT 100
);
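To make the control-table contract easier to reason about outside SQL, here is a minimal Python sketch of one ctl.Tables row. The `TableConfig` class and its field names are hypothetical (they are not part of the repository); the sketch only mirrors the columns, defaults, and the LoadType CHECK constraint shown above. As a simplification it rejects a missing LoadType, which the SQL constraint would allow as NULL.

```python
from dataclasses import dataclass
from typing import List, Optional

# Mirrors the CHECK constraint on ctl.Tables.LoadType
VALID_LOAD_TYPES = {"Full", "Incremental"}


@dataclass
class TableConfig:
    """Hypothetical in-memory mirror of one ctl.Tables row."""
    source_system_id: int
    schema_name: str
    table_name: str
    load_type: str
    bronze_path: str
    silver_path: str
    primary_key_columns: Optional[str] = None  # comma-separated, nullable in SQL
    is_active: bool = True                     # IsActive BIT DEFAULT 1
    priority: int = 100                        # Priority INT DEFAULT 100

    def __post_init__(self) -> None:
        # Fail early on values the database would reject
        if self.load_type not in VALID_LOAD_TYPES:
            raise ValueError(f"LoadType must be one of {sorted(VALID_LOAD_TYPES)}")

    def primary_keys(self) -> List[str]:
        """Split PrimaryKeyColumns into a clean list of column names."""
        if not self.primary_key_columns:
            return []
        return [c.strip() for c in self.primary_key_columns.split(",") if c.strip()]
```

A row built this way can be checked in unit tests before it is inserted into the control database.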
PowerShell scripts.
# deployment/01-deploy-integration-runtime.ps1
param(
[Parameter(Mandatory=$true)]
[string]$SubscriptionId,
[Parameter(Mandatory=$true)]
[string]$ResourceGroupName,
[Parameter(Mandatory=$true)]
[string]$DataFactoryName,
[Parameter(Mandatory=$true)]
[string]$KeyVaultName,
[Parameter(Mandatory=$true)]
[string]$StorageAccountName,
[Parameter(Mandatory=$true)]
[string]$DatabricksWorkspaceUrl
)
try {
    # NOTE: the lookup that populates $existingIR is not shown in this excerpt.
    if (-not $existingIR) {
        Set-AzDataFactoryV2IntegrationRuntime -ResourceGroupName $ResourceGroupName `
            -DataFactoryName $DataFactoryName -Name "SelfHostedIR" -Type SelfHosted -Force
        Write-Host "Integration Runtime created successfully" -ForegroundColor Green

        $irKeys = Get-AzDataFactoryV2IntegrationRuntimeKey -ResourceGroupName $ResourceGroupName `
            -DataFactoryName $DataFactoryName -Name "SelfHostedIR"
        # The authentication keys are secrets: avoid leaving them in shared logs or transcripts.
        Write-Host "Authentication Keys:" -ForegroundColor Cyan
        Write-Host "Key1: $($irKeys.AuthKey1)"
        Write-Host "Key2: $($irKeys.AuthKey2)"
    }
} catch {
    Write-Host "Error creating Integration Runtime: $($_.Exception.Message)" -ForegroundColor Red
    exit 1
}
}
}
} | ConvertTo-Json -Depth 10
# deployment/02-deploy-datasets.ps1
param(
[Parameter(Mandatory=$true)]
[string]$SubscriptionId,
[Parameter(Mandatory=$true)]
[string]$ResourceGroupName,
[Parameter(Mandatory=$true)]
[string]$DataFactoryName
)
}
}
} | ConvertTo-Json -Depth 15
value = "@dataset().TableName"
type = "Expression"
}
}
}
} | ConvertTo-Json -Depth 15
# deployment/03-deploy-pipelines.ps1
param(
[Parameter(Mandatory=$true)]
[string]$SubscriptionId,
[Parameter(Mandatory=$true)]
[string]$ResourceGroupName,
[Parameter(Mandatory=$true)]
[string]$DataFactoryName
)
type = "SetVariable"
typeProperties = @{
variableName = "SourceQuery"
value = @{
value =
"@{if(equals(pipeline().parameters.TableConfig.LoadType,
'Incremental'), concat('SELECT * FROM ',
pipeline().parameters.TableConfig.SchemaName, '.',
pipeline().parameters.TableConfig.TableName, ' WHERE ',
pipeline().parameters.TableConfig.IncrementalColumn, ' > ''',
if(empty(pipeline().parameters.TableConfig.WatermarkValue),
'1900-01-01', pipeline().parameters.TableConfig.WatermarkValue),
''''), concat('SELECT * FROM ',
pipeline().parameters.TableConfig.SchemaName, '.',
pipeline().parameters.TableConfig.TableName))}"
type = "Expression"
}
}
}
@{
name = "Copy to Bronze"
type = "Copy"
dependsOn = @(@{ activity = "Build Source Query";
dependencyConditions = @("Succeeded") })
typeProperties = @{
source = @{
type = "SqlServerSource"
sqlReaderQuery = @{ value =
"@variables('SourceQuery')"; type = "Expression" }
}
sink = @{
type = "ParquetSink"
storeSettings = @{ type =
"AzureBlobFSWriteSettings" }
}
}
inputs = @(@{
referenceName = "DS_SQL_Generic"
type = "DatasetReference"
parameters = @{
SchemaName =
"@pipeline().parameters.TableConfig.SchemaName"
TableName =
"@pipeline().parameters.TableConfig.TableName"
}
})
outputs = @(@{
referenceName = "DS_Bronze_Parquet"
type = "DatasetReference"
parameters = @{
FilePath = "@{concat('bronze/',
toLower(pipeline().parameters.TableConfig.SourceSystemName), '/',
toLower(pipeline().parameters.TableConfig.SchemaName), '/',
pipeline().parameters.TableConfig.TableName, '/',
pipeline().parameters.ProcessingDate)}"
FileName =
"@{concat(pipeline().parameters.TableConfig.TableName, '_',
pipeline().parameters.ProcessingDate, '.parquet')}"
}
})
}
)
parameters = @{
TableConfig = @{ type = "object" }
ProcessingDate = @{ type = "string" }
}
variables = @{
SourceQuery = @{ type = "String" }
}
}
} | ConvertTo-Json -Depth 20
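The two expressions in this pipeline (the source query and the Bronze file location) are dense, so here is a plain-Python sketch of what they evaluate to. The function names `build_source_query` and `bronze_target` are mine, not part of the repository, and the dictionary keys simply reuse the TableConfig property names from the pipeline. Like the original expression, the query is built by string concatenation, so it assumes the control-table values are trusted configuration.

```python
from typing import Dict, Optional, Tuple

DEFAULT_WATERMARK = "1900-01-01"  # same fallback as the pipeline expression


def build_source_query(cfg: Dict[str, Optional[str]]) -> str:
    """Mirror of the 'Build Source Query' expression."""
    base = f"SELECT * FROM {cfg['SchemaName']}.{cfg['TableName']}"
    if cfg.get("LoadType") != "Incremental":
        return base  # full load: no filter
    # Empty or missing watermark falls back to the default date
    watermark = cfg.get("WatermarkValue") or DEFAULT_WATERMARK
    return f"{base} WHERE {cfg['IncrementalColumn']} > '{watermark}'"


def bronze_target(cfg: Dict[str, Optional[str]], processing_date: str) -> Tuple[str, str]:
    """Mirror of the FilePath and FileName dataset parameters."""
    folder = "/".join([
        "bronze",
        cfg["SourceSystemName"].lower(),  # toLower(...) in the expression
        cfg["SchemaName"].lower(),
        cfg["TableName"],                 # table name keeps its original case
        processing_date,
    ])
    file_name = f"{cfg['TableName']}_{processing_date}.parquet"
    return folder, file_name
```

Running these against a few control-table rows is a cheap way to sanity-check the generated SQL and paths before deploying the pipeline.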
activities = @(
@{
name = "Get Tables for Silver"
type = "Lookup"
typeProperties = @{
source = @{
type = "AzureSqlSource"
sqlReaderQuery = @{
value = "SELECT DISTINCT t.TableId, t.SchemaName, t.TableName, t.PrimaryKeyColumns, t.LoadType,
CONCAT('/mnt/bronze/', LOWER(ss.SourceSystemName), '/', LOWER(t.SchemaName), '/', t.TableName) AS BronzePath,
CONCAT('/mnt/silver/', LOWER(ss.SourceSystemName), '/', LOWER(t.SchemaName), '/', t.TableName) AS SilverPath
FROM ctl.Tables t
INNER JOIN ctl.SourceSystems ss ON t.SourceSystemId = ss.SourceSystemId
INNER JOIN ctl.ProcessingStatus ps ON t.TableId = ps.TableId
WHERE ps.ProcessingDate = '@{pipeline().parameters.ProcessingDate}'
AND ps.Layer = 'Bronze' AND ps.Status = 'Success' AND t.IsActive = 1"
type = "Expression"
}
}
dataset = @{ referenceName =
"DS_Control_Database"; type = "DatasetReference" }
firstRowOnly = $false
}
}
@{
name = "ForEach Silver Table"
type = "ForEach"
dependsOn = @(@{ activity = "Get Tables for Silver";
dependencyConditions = @("Succeeded") })
typeProperties = @{
items = "@activity('Get Tables for Silver').output.value"
isSequential = $false
batchCount = 4
activities = @(
@{
name = "Process Bronze to Silver"
type = "DatabricksNotebook"
typeProperties = @{
notebookPath = "/Shared/bronze_to_silver"
baseParameters = @{
table_name = "@{item().TableName}"
silver_path = "@{item().SilverPath}"
load_type = "@{item().LoadType}"
schema_name = "@{item().SchemaName}"
processing_date = "@pipeline().parameters.ProcessingDate"
bronze_path = "@{item().BronzePath}"
table_id = "@{item().TableId}"
primary_keys = "@{item().PrimaryKeyColumns}"
}
}
linkedServiceName = @{ referenceName = "LS_Databricks"; type = "LinkedServiceReference" }
}
)
}
}
)
parameters = @{
ProcessingDate = @{ type = "string" }
}
}
} | ConvertTo-Json -Depth 20
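The ForEach activity above runs up to four notebook calls at a time (isSequential = $false, batchCount = 4) and passes each Lookup row to the notebook as string parameters. A local Python analogue can help when testing that mapping. This is only a sketch: `to_notebook_params` and `run_for_each` are hypothetical helper names, a thread pool merely approximates ADF's scheduling, and rendering a NULL PrimaryKeyColumns as an empty string is my assumption about how the interpolation behaves.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def to_notebook_params(item: Dict[str, Any], processing_date: str) -> Dict[str, str]:
    """Map one Lookup row to the notebook's baseParameters (all values are strings)."""
    return {
        "table_id": str(item["TableId"]),
        "schema_name": str(item["SchemaName"]),
        "table_name": str(item["TableName"]),
        "bronze_path": str(item["BronzePath"]),
        "silver_path": str(item["SilverPath"]),
        "primary_keys": str(item.get("PrimaryKeyColumns") or ""),  # assumed: NULL becomes ""
        "load_type": str(item["LoadType"]),
        "processing_date": processing_date,
    }


def run_for_each(items: List[Dict[str, Any]],
                 worker: Callable[[Dict[str, Any]], Any],
                 batch_count: int = 4) -> List[Any]:
    """Run worker over items with at most batch_count concurrent calls."""
    with ThreadPoolExecutor(max_workers=batch_count) as pool:
        # map() returns results in input order, regardless of completion order
        return list(pool.map(worker, items))
```

In a test, the worker can simply return the parameter dictionary so the full set of notebook inputs for a processing date can be inspected without starting a cluster.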
activities = @(
@{
name = "Set Processing Date"
type = "SetVariable"
typeProperties = @{
variableName = "ProcessingDate"
value = "@{if(equals(pipeline().parameters.ProcessingDate, ''), formatDateTime(utcnow(), 'yyyy-MM-dd'), pipeline().parameters.ProcessingDate)}"
}
}
@{
name = "Get Active Tables"
type = "Lookup"
dependsOn = @(@{ activity = "Set Processing Date";
dependencyConditions = @("Succeeded") })
typeProperties = @{
source = @{
type = "AzureSqlSource"
sqlReaderQuery = "SELECT t.*,
ss.ConnectionString, ss.SourceSystemName, w.WatermarkValue FROM
ctl.Tables t INNER JOIN ctl.SourceSystems ss ON t.SourceSystemId =
ss.SourceSystemId LEFT JOIN ctl.Watermarks w ON t.TableId = w.TableId
WHERE t.IsActive = 1 AND ss.IsActive = 1 ORDER BY t.Priority,
t.TableId"
}
dataset = @{ referenceName =
"DS_Control_Database"; type = "DatasetReference" }
firstRowOnly = $false
}
}
@{
name = "ForEach Table"
type = "ForEach"
dependsOn = @(@{ activity = "Get Active Tables";
dependencyConditions = @("Succeeded") })
typeProperties = @{
parameters = @{
ProcessingDate = @{ type = "string"; defaultValue = "" }
SourceSystemName = @{ type = "string"; defaultValue = ""
}
TableName = @{ type = "string"; defaultValue = "" }
}
variables = @{
ProcessingDate = @{ type = "String"; defaultValue = "" }
}
}
} | ConvertTo-Json -Depth 20
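The "Set Processing Date" step in this pipeline uses the supplied ProcessingDate when one is given and otherwise falls back to today's UTC date. A small Python equivalent makes that rule explicit. The function name `resolve_processing_date` and the injectable `now` argument are my additions, there only to make the behaviour testable.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_processing_date(param: str, now: Optional[datetime] = None) -> str:
    """Return param unless it is empty, in which case use the current UTC date."""
    if param != "":
        return param  # an explicit date wins, which enables reruns and backfills
    current = now or datetime.now(timezone.utc)
    return current.strftime("%Y-%m-%d")  # same shape as 'yyyy-MM-dd' in the expression
```

Because the default is an empty string, a scheduled trigger can omit the parameter while a manual rerun can pin a specific date.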
Pipeline Architecture:
# databricks/notebooks/bronze_to_silver.py
# Parameter Configuration
dbutils.widgets.text("table_id", "")
dbutils.widgets.text("schema_name", "")
dbutils.widgets.text("table_name", "")
dbutils.widgets.text("bronze_path", "")
dbutils.widgets.text("silver_path", "")
dbutils.widgets.text("processing_date", "")
dbutils.widgets.text("primary_keys", "")
dbutils.widgets.text("load_type", "Full")
# Extract Parameters
table_id = int(dbutils.widgets.get("table_id"))
schema_name = dbutils.widgets.get("schema_name")
table_name = dbutils.widgets.get("table_name")
bronze_path = dbutils.widgets.get("bronze_path")
silver_path = dbutils.widgets.get("silver_path")
processing_date = dbutils.widgets.get("processing_date")
raw_primary_keys = dbutils.widgets.get("primary_keys")
# Split the comma-separated key list, dropping blanks and stray whitespace
primary_keys = [k.strip() for k in raw_primary_keys.split(",") if k.strip()]
load_type = dbutils.widgets.get("load_type")
print(f"Processing: {schema_name}.{table_name}")
print(f"Load Type: {load_type}")
# Import Libraries
from pyspark.sql.functions import *
from delta.tables import *
import json
# bronze_file_path is defined earlier in the notebook (not shown in this excerpt)
try:
    bronze_df = spark.read.parquet(bronze_file_path)
    bronze_count = bronze_df.count()
    print(f"Successfully read {bronze_count} records")
except Exception as e:
    # Fallback to wildcard pattern
    print(f"Direct read failed ({e}); retrying with wildcard path")
    bronze_wildcard_path = f"{bronze_path}/{processing_date}/*.parquet"
    try:
        bronze_df = spark.read.parquet(bronze_wildcard_path)
        bronze_count = bronze_df.count()
        print(f"Read {bronze_count} records from wildcard path")
    except Exception as e2:
        # notebook.exit ends the run normally, so the caller must inspect the returned status
        error_result = {"status": "failed", "error": str(e2)}
        dbutils.notebook.exit(json.dumps(error_result))
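The nested try/except above is a "try the exact file, then a wildcard" pattern. If you end up with more than two candidate locations, it may be easier to express it as a loop. The sketch below is one possible shape, not the notebook's actual code: `read_with_fallback` is a hypothetical helper, and the reader is passed in as a function so the logic can be tested without Spark.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def read_with_fallback(read: Callable[[str], T],
                       candidate_paths: Sequence[str]) -> Tuple[T, str]:
    """Try each path in order and return (data, path) for the first that works."""
    errors: List[str] = []
    for path in candidate_paths:
        try:
            return read(path), path
        except Exception as exc:  # broad on purpose: any read failure triggers the fallback
            errors.append(f"{path}: {exc}")
    # Nothing worked: surface every attempt so the failure is easy to diagnose
    raise RuntimeError("All candidate paths failed: " + "; ".join(errors))
```

In the notebook this could be called with `spark.read.parquet` as the reader and the exact file path followed by the `{bronze_path}/{processing_date}/*.parquet` pattern as candidates.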
.save(silver_path)
print(f"Full load completed: {silver_df.count()} records")
    silver_table.alias("target").merge(
        silver_df.alias("source"),
        merge_condition
    ).whenMatchedUpdateAll() \
     .whenNotMatchedInsertAll() \
     .execute()
    print("Incremental merge completed")
else:
    silver_df.write \
        .format("delta") \
        .mode("overwrite") \
        .partitionBy("_processing_date") \
        .save(silver_path)
    print("Initial delta table created")
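The merge above relies on a `merge_condition` string whose construction is not shown in this excerpt. One plausible way to derive it from the `primary_keys` parameter is sketched below. Treat it as an assumption rather than the notebook's actual code: the helper names are hypothetical, and the `target` and `source` prefixes simply match the aliases used in the merge call.

```python
from typing import List


def parse_primary_keys(raw: str) -> List[str]:
    """Turn 'OrderId, LineNo' into ['OrderId', 'LineNo'], ignoring blanks."""
    return [c.strip() for c in raw.split(",") if c.strip()]


def build_merge_condition(primary_keys: List[str],
                          target: str = "target",
                          source: str = "source") -> str:
    """Build the join predicate for a Delta merge on one or more key columns."""
    if not primary_keys:
        # Without keys every row would look new, so refuse to merge
        raise ValueError("An incremental merge needs at least one primary key column")
    # Backticks keep column names with spaces or reserved words valid in Spark SQL
    return " AND ".join(f"{target}.`{k}` = {source}.`{k}`" for k in primary_keys)
```

Raising on an empty key list is a deliberate choice here: a table configured as Incremental without PrimaryKeyColumns is most likely a configuration mistake and should fail loudly.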
Configuration Template
{
"environment": "production",
"azure": {
"subscriptionId": "YOUR_SUBSCRIPTION_ID",
"resourceGroupName": "data-pipeline-prod-rg",
"dataFactoryName": "adf-data-pipeline-prod",
"keyVaultName": "kv-data-pipeline-prod",
"storageAccountName": "datapipelineprodstore",
"databricksWorkspaceUrl": "https://adb-WORKSPACE_ID.azuredatabricks.net"
},
"database": {
"server": "sql-data-pipeline-prod.database.windows.net",
"database": "DataPipelineControl"
},
"processing": {
"defaultBatchSize": 8,
"maxRetryAttempts": 3,
"processingTimeout": "04:00:00"
}
}
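Since this file ships as a template, it is easy to deploy with a placeholder still in place. A small pre-flight check can catch that. The sketch below is an illustration only: the function names are hypothetical, the list of required keys is taken from the template above, and treating any value containing `YOUR_` or `WORKSPACE_ID` as a placeholder is my own convention.

```python
import json
from datetime import timedelta
from typing import Any, Dict, List

# Keys from the "azure" section of the template
REQUIRED_AZURE_KEYS = [
    "subscriptionId", "resourceGroupName", "dataFactoryName",
    "keyVaultName", "storageAccountName", "databricksWorkspaceUrl",
]


def load_config(text: str) -> Dict[str, Any]:
    """Parse the JSON configuration text."""
    return json.loads(text)


def parse_timeout(value: str) -> timedelta:
    """Convert an 'HH:MM:SS' string such as '04:00:00' into a timedelta."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def find_config_problems(config: Dict[str, Any]) -> List[str]:
    """List missing values and unreplaced placeholders in the azure section."""
    problems: List[str] = []
    azure = config.get("azure", {})
    for key in REQUIRED_AZURE_KEYS:
        value = azure.get(key, "")
        if not value:
            problems.append(f"azure.{key} is missing")
        elif "YOUR_" in value or "WORKSPACE_ID" in value:
            problems.append(f"azure.{key} still contains a placeholder")
    return problems
```

Running `find_config_problems` before the deployment scripts gives a readable list of what still needs filling in, instead of a failure halfway through provisioning.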
# 2. Deploy infrastructure
.\deployment\01-deploy-integration-runtime.ps1 `
    -SubscriptionId "YOUR_SUB_ID" `
    -ResourceGroupName "YOUR_RG" `
    -DataFactoryName "YOUR_ADF" `
    -KeyVaultName "YOUR_KV" `
    -StorageAccountName "YOUR_STORAGE" `
    -DatabricksWorkspaceUrl "YOUR_DATABRICKS_URL"
# 3. Setup database
# Execute SQL scripts in your Azure SQL Database:
# .\sql\01-create-control-tables.sql
# .\sql\02-create-stored-procedures.sql
# .\sql\03-sample-data-setup.sql
# 4. Configure Databricks
# Upload bronze_to_silver.py to /Shared/bronze_to_silver
# Configure storage mount points
# Update cluster configuration
);
{
"cluster_name": "data-pipeline-cluster",
"spark_version": "13.3.x-scala2.12",
"node_type_id": "Standard_DS3_v2",
"autoscale": {
"min_workers": 2,
"max_workers": 8
},
"auto_termination_minutes": 60,
"spark_conf": {
"spark.databricks.delta.preview.enabled": "true",
"spark.sql.adaptive.enabled": "true",
"spark.sql.adaptive.coalescePartitions.enabled": "true"
}
}
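A cluster definition like this one is worth linting before it is uploaded, because small mistakes (inverted autoscale bounds, no auto-termination) tend to show up as cost rather than as errors. The checks below are a minimal sketch under my own assumptions: `check_cluster_config` is a hypothetical helper, and the rule that `spark_conf` values should be strings reflects how they are written in the configuration above.

```python
from typing import Any, Dict, List


def check_cluster_config(cfg: Dict[str, Any]) -> List[str]:
    """Return human-readable warnings for a cluster configuration dictionary."""
    warnings: List[str] = []

    autoscale = cfg.get("autoscale") or {}
    low = autoscale.get("min_workers")
    high = autoscale.get("max_workers")
    if low is None or high is None:
        warnings.append("autoscale bounds are not fully specified")
    elif low > high:
        warnings.append("min_workers is greater than max_workers")

    # A missing or zero value means the cluster is never shut down automatically
    if not cfg.get("auto_termination_minutes"):
        warnings.append("auto-termination is off, so an idle cluster keeps running")

    # Values are written as strings ("true"), not JSON booleans
    for key, value in (cfg.get("spark_conf") or {}).items():
        if not isinstance(value, str):
            warnings.append(f"spark_conf['{key}'] is not a string")
    return warnings
```

An empty list means none of these particular checks fired; it does not validate node types or runtime versions.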
4. Cluster Sizing: Start with 2–4 workers, scale based on data volume
Real-World Results
After implementing this architecture for several enterprise clients, here are
the typical results:
Performance Metrics
• Cost Reduction: 60–70% lower cost compared to traditional ETL tools
Business Impact
• Time to Market: New data sources onboarded in hours instead of weeks
Implementation Approach
.groupBy("customer_id") \
.agg(
sum("order_amount").alias("total_spent"),
count("order_id").alias("total_orders"),
max("last_interaction_date").alias("last_activity")
)
return customer_360
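To show what this aggregation produces without a Spark session, here is a plain-Python analogue of the groupBy and agg step. It is a sketch on made-up rows, not the Gold layer implementation: the function name and sample values are hypothetical, and it starts totals at zero, whereas Spark's sum returns null when every input is null.

```python
from typing import Any, Dict, Iterable


def customer_360(rows: Iterable[Dict[str, Any]]) -> Dict[Any, Dict[str, Any]]:
    """Aggregate order rows per customer_id: total spent, order count, last activity."""
    summary: Dict[Any, Dict[str, Any]] = {}
    for row in rows:
        entry = summary.setdefault(
            row["customer_id"],
            {"total_spent": 0.0, "total_orders": 0, "last_activity": None},
        )
        if row.get("order_amount") is not None:   # sum() skips nulls
            entry["total_spent"] += row["order_amount"]
        if row.get("order_id") is not None:       # count(col) counts non-null values
            entry["total_orders"] += 1
        seen = row.get("last_interaction_date")
        # ISO 'YYYY-MM-DD' strings compare correctly as text
        if seen is not None and (entry["last_activity"] is None or seen > entry["last_activity"]):
            entry["last_activity"] = seen
    return summary


# Hypothetical sample rows, for illustration only
sample = [
    {"customer_id": "C1", "order_id": 1, "order_amount": 10.0, "last_interaction_date": "2025-06-01"},
    {"customer_id": "C1", "order_id": 2, "order_amount": 5.5, "last_interaction_date": "2025-06-10"},
    {"customer_id": "C2", "order_id": 3, "order_amount": 7.0, "last_interaction_date": "2025-05-20"},
]
result = customer_360(sample)
```

Each key in `result` is a customer_id, and each value carries the same three columns the Spark version aliases as total_spent, total_orders, and last_activity.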
Conclusion
We’ve built a production-ready data pipeline that:
Key Takeaways:
1. Medallion Architecture provides a clear separation of concerns and data
quality layers
maintenance overhead
Complete Repository
You can find the complete implementation with all scripts, notebooks, and
documentation at:
• Comprehensive documentation
Next Steps
1. Share the Gold Layer:
I will provide details and implementation for the gold layer.
Have questions about implementing this architecture? Found this helpful? Leave a
comment below or connect with me on LinkedIn. I’d love to hear about your data
pipeline challenges and successes!
Senior Data Engineer | Cloud & Data Enthusiast | Designing the Future of
Scalable Analytics