D25L2507 - Final Draft - Analyst Roadmap To Databricks - From SQL To End-to-End BI - 1747454926520001ZrON

The document outlines a roadmap for analysts transitioning from SQL to end-to-end business intelligence using Databricks. It covers key areas such as data discovery, governance, transformation, and modeling, along with practical guidance on using the Unity Catalog and connecting Power BI with Databricks. The presentation emphasizes the importance of understanding data lineage, freshness, and volume while creating dashboards and reports.

Analyst Roadmap to Databricks
From SQL to End-to-End BI

Jake Duckers
June 10, 2025
Forward-looking Statement
This presentation has been prepared for informational purposes only. The information set forth herein
does not purport to be complete or contain all relevant information. Statements contained herein are
made as of the date of this presentation unless stated otherwise.
This presentation and the accompanying oral commentary may contain forward-looking statements. In
some cases, forward-looking statements can be identified by terms such as “may”, “will”, “should”,
“expects”, “plans”, “anticipates”, “could”, “intends”, “projects”, “believes”, “estimates”, “predicts”, or
“continue”, or the negative of these words or other similar terms or expressions that concern Databricks’
expectations, strategy, plans, or intentions.
Forward-looking statements are based on information available at the time those statements are made
and are inherently subject to risks and uncertainties that could cause actual results to differ materially
from those expressed in or suggested by the forward-looking statements.
Forward-looking statements should not be read as a guarantee of future performance or outcomes.
Except as required by law, Databricks does not undertake any obligation to publicly update or revise any
forward-looking statement, whether as a result of new information, future developments or otherwise.

1
Who am I?

2
Who am I?

Where's My Data?

3
Analyst Domains

4
Three Areas

Data Discovery & Governance
Data Transformation & Modeling
Analytics & BI

5
Unity Catalog for An Analyst
Understanding Medallion Structure for the Business

Data Assets located in Catalog Pane

Catalog

Gold Schema:
Key Reports & Dashboard Datasets

Silver Schema:
Used to Build New Data Models

6
Data Discovery &
Data Governance

7
Unity Catalog

- Freshness
- Volume
- Lineage
- Schema

Establish Rules

Data Discovery & Governance

Data Transformation & Modeling

Analytics & BI

8
Unity Catalog for An Analyst
Pillars 1 & 2: Freshness & Volume Using “Overview” Tab

When was the Table updated?

What is the size and structure of the table?
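
The same freshness and volume questions can be asked in SQL. This is a sketch, assuming the table is a Delta table and using the hypothetical name `main.gold.daily_sales`.

```sql
-- DESCRIBE DETAIL returns one row of table metadata for a Delta table.
-- Columns of interest here:
--   lastModified : when the table was last changed (freshness)
--   sizeInBytes  : storage size of the current version (volume)
--   numFiles     : number of data files in the current version
DESCRIBE DETAIL main.gold.daily_sales;

-- Row count as a second view of volume.
SELECT COUNT(*) AS row_count
FROM main.gold.daily_sales;
```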

9
Unity Catalog for An Analyst
Pillars 1 & 2: Freshness & Volume Using “Details” Tab

Who or What Created and Updates this Data Object?

10
Unity Catalog for An Analyst
Pillar 1: Freshness Using “History” Tab

What Operations are being done and when?

What Notebook & Job Update the Table?
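
The History tab has a SQL counterpart for Delta tables. This is a sketch against the hypothetical table `main.gold.daily_sales`.

```sql
-- Show the 20 most recent operations on the table, newest first.
-- Columns of interest in the result:
--   timestamp : when the operation happened
--   operation : what was done (for example WRITE, MERGE, CREATE OR REPLACE TABLE AS SELECT)
--   userName  : who ran it
--   job       : details of the job that ran it, if any
--   notebook  : details of the notebook that ran it, if any
DESCRIBE HISTORY main.gold.daily_sales LIMIT 20;
```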

11
Unity Catalog for An Analyst
Pillar 3: Lineage Using “Lineage” Tab

What are the Upstream and Downstream Dependencies?

Use Lineage Graph to Visually Represent Dependencies
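
Where the lineage system tables are enabled and readable in your workspace (an assumption, since the presentation only shows the Lineage tab), upstream and downstream dependencies can also be listed with a query. The table name `main.gold.daily_sales` is hypothetical.

```sql
-- Upstream: tables that were read to produce the table of interest.
SELECT DISTINCT source_table_full_name
FROM system.access.table_lineage
WHERE target_table_full_name = 'main.gold.daily_sales'
  AND source_table_full_name IS NOT NULL;

-- Downstream: tables that were written using the table of interest.
SELECT DISTINCT target_table_full_name
FROM system.access.table_lineage
WHERE source_table_full_name = 'main.gold.daily_sales'
  AND target_table_full_name IS NOT NULL;
```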

12
Unity Catalog for An Analyst
Pillar 4: Schema Using “Overview” Tab

May be able to see code that generated the Data Object

Identify Column Types and Comments

Identify Primary and Foreign Keys
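
Column types, comments, and key constraints are also available from SQL. This is a sketch using the hypothetical table `main.silver.orders`.

```sql
-- Column names, data types, and comments, followed by table-level
-- details such as owner, creation time, and declared constraints.
DESCRIBE TABLE EXTENDED main.silver.orders;

-- The CREATE statement that would reproduce the table definition,
-- including any declared PRIMARY KEY and FOREIGN KEY constraints.
SHOW CREATE TABLE main.silver.orders;
```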

13
Unity Catalog for An Analyst
Pillar 4: Visualizing Relationships

Composite PK

Columns can have multiple FKs
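
Composite keys and columns with several foreign keys can be listed from the catalog's information schema. The query below is a sketch, assuming a hypothetical catalog `main` and schema `silver`. A composite primary key shows up as several rows sharing one `constraint_name`, and a column with multiple foreign keys appears once per constraint.

```sql
SELECT
  tc.table_name,
  tc.constraint_name,
  tc.constraint_type,     -- 'PRIMARY KEY' or 'FOREIGN KEY'
  kcu.column_name,
  kcu.ordinal_position    -- position of the column within the key
FROM main.information_schema.table_constraints AS tc
JOIN main.information_schema.key_column_usage AS kcu
  ON  tc.constraint_catalog = kcu.constraint_catalog
  AND tc.constraint_schema  = kcu.constraint_schema
  AND tc.constraint_name    = kcu.constraint_name
WHERE tc.table_schema = 'silver'
  AND tc.constraint_type IN ('PRIMARY KEY', 'FOREIGN KEY')
ORDER BY tc.table_name, tc.constraint_name, kcu.ordinal_position;
```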

14
Data Transformation
& Modeling

15
Data Discovery & Governance

SQL Editor, Notebooks, Compute

- EDA
- Data Mapping
- Code Environment
- Code Execution
- Data Review

Create Data Objects

Data Transformation & Modeling

Analytics & BI

16
Workspace Crash Course
Find and Organize Your Queries, Notebooks, and Dashboards

Create a Hierarchical Structure to Organize Data Objects

Create Folders or Git Folders

17
SQL Editor

18
SQL Editor Crash Course

19
Notebooks

20
Notebooks Crash Course

21
Data Model Workflow
Notebook Structure for Creating Tables

1. CREATE OR REPLACE TABLE …
   - Column Names, Data Types
   - Enforced Constraints, Comments
   - PK/FK Constraints
   - (CLUSTER BY)
2. CREATE OR REPLACE TEMP VIEW …
3. INSERT INTO …

22
Data Model Workflow

Use CREATE OR REPLACE command

Set Enforced Constraint

Define PK with PRIMARY KEY and RELY (only if unique)

Define FKs with FOREIGN KEY

INSERT INTO using TEMP VIEWs
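
The workflow above can be sketched as a three-cell notebook. This is an illustration under assumptions, not the presenter's code. All table and column names are hypothetical, and a parent table `main.silver.customers` is assumed to exist already with a primary key on `customer_id`.

```sql
-- Step 1: define the target table up front.
-- NOT NULL is enforced on write. PRIMARY KEY and FOREIGN KEY are
-- informational in Databricks and are not enforced, so declare RELY
-- only when the key is known to be unique.
CREATE OR REPLACE TABLE main.silver.orders (
  order_id    BIGINT        NOT NULL COMMENT 'Unique order identifier',
  customer_id BIGINT        NOT NULL COMMENT 'Customer who placed the order',
  order_date  DATE          NOT NULL COMMENT 'Date the order was placed',
  amount      DECIMAL(12,2)          COMMENT 'Order total',
  CONSTRAINT pk_orders PRIMARY KEY (order_id) RELY,
  CONSTRAINT fk_orders_customers FOREIGN KEY (customer_id)
    REFERENCES main.silver.customers (customer_id)
)
CLUSTER BY (order_date)
COMMENT 'One row per order';

-- A CHECK constraint is a second kind of enforced constraint.
ALTER TABLE main.silver.orders
  ADD CONSTRAINT amount_non_negative CHECK (amount >= 0);

-- Step 2: stage the transformation in a temp view
-- (the source table and its columns are assumptions).
CREATE OR REPLACE TEMP VIEW orders_stage AS
SELECT
  CAST(id AS BIGINT)            AS order_id,
  CAST(cust_id AS BIGINT)       AS customer_id,
  CAST(order_ts AS DATE)        AS order_date,
  CAST(amount AS DECIMAL(12,2)) AS amount
FROM main.bronze.raw_orders
WHERE id IS NOT NULL;

-- Step 3: load the target table from the temp view.
INSERT INTO main.silver.orders (order_id, customer_id, order_date, amount)
SELECT order_id, customer_id, order_date, amount
FROM orders_stage;
```

Keeping the table definition separate from the load means types, comments, and constraints are declared once and stay put, while the temp view can be inspected on its own before anything is written.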


23
Orchestration

24
Business Reporting Workflow

IF/ELSE Task

Job Parameter
(Pushed Down to Tasks)

25
Workflow Overview
Review Business Reporting Workflow Runs

26
Analytics & BI

27
Data Discovery & Governance

Data Transformation & Modeling

Dashboards

- Semantic Model
- Filtering
- Visualizations
- Genie

Deliver Insights

Analytics & BI

28
Connect Power BI & Databricks
Step 1: Grab SQL Warehouse Connection Details

Grab SQL Warehouse Server Hostname

Grab SQL Warehouse HTTP Path

29
Connect Power BI & Databricks
Step 2: Use “Get Data” & Input Connection Details

Paste SQL Warehouse Server Hostname

Paste SQL Warehouse HTTP Path

30
Connect Power BI & Databricks
Step 3: Load or Transform Data

Select Data Objects

Load or Transform Data

31
Connect Power BI & Databricks
Step 4: Automatic Semantic Model Through PK/FK

32
Databricks
Dashboards

33
Dashboard & Genie Example

34
Setting Up Datasets
Separate Datasets for Visualizations & Filters

Datasets that Feed Visualizations

Datasets for Filter Selections
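
The two kinds of dataset might look like the following sketch. The table `main.gold.daily_sales` and its columns `state`, `order_date`, and `amount` are hypothetical.

```sql
-- Dataset 1: feeds the visualizations.
SELECT
  state,
  order_date,
  SUM(amount) AS total_amount
FROM main.gold.daily_sales
GROUP BY state, order_date;

-- Dataset 2: feeds the filter widget with its list of selectable values.
SELECT DISTINCT state
FROM main.gold.daily_sales
ORDER BY state;
```

A small dedicated dataset for the filter keeps the list of choices independent of whatever the visualization datasets currently return.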

35
Dashboard Parameters
Creating Interactive Filters

Conditional array_contains()

Use Date Periods or Date Range Filter

Parameter Details Section
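
A parameterized dataset query along these lines could back the filters. This is a sketch, assuming a multi-value parameter named `state`, a date range parameter named `date_range`, and an 'All' option in the state list; the names and the 'All' convention are illustrative and not taken from the presentation.

```sql
SELECT
  state,
  order_date,
  SUM(amount) AS total_amount
FROM main.gold.daily_sales
WHERE
  -- Conditional array_contains(): keep every row when 'All' is selected,
  -- otherwise keep only rows whose state is among the selected values.
  (array_contains(:state, 'All') OR array_contains(:state, state))
  -- Date range parameter: .min and .max are the two ends of the range.
  AND order_date BETWEEN :date_range.min AND :date_range.max
GROUP BY state, order_date;
```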

36
Dashboard Parameters
Filtering on Parameters

Set the Field to the State Dataset

Set Datasets to be Filtered by State

Set the Default Value

37
Calculated Dimensions & Measures
Improving Dashboard performance

Calculated Dimension

Calculated Measures

38
Publish Dashboard with Genie Space
Publish a Dashboard with Auto-Generate or Existing Genie Space

Choose Permission Type for Viewers

Enable Genie with Dashboard

39
Resources

40
Complete Your Surveys
Your feedback has a direct impact on Data + AI Summit content

• You will receive a survey for each session attended
• Open the Databricks Events app and select “My Surveys” from the menu
• Surveys can also be submitted in the Attendee Portal

41
