Relational Entities on Databricks
Learning Objectives
- Databases
- Tables
- The impact of the LOCATION keyword
Derar Alhussein © Udemy | Databricks Certified Data Engineer Associate - Preparation
Database
- Databases = schemas in the Hive metastore
- CREATE DATABASE db_name
- CREATE SCHEMA db_name
Hive metastore
- Repository of metadata
- Databases
- Tables
- …
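The metadata listed above can be inspected directly from a SQL cell. This is a minimal sketch, assuming a Databricks workspace backed by the Hive metastore; the rows returned depend on what already exists in your workspace.

```sql
-- List every database (schema) registered in the metastore.
SHOW SCHEMAS;

-- List the tables registered in the built-in default database.
SHOW TABLES IN default;

-- Show the metadata stored for a database, including its storage location.
DESCRIBE SCHEMA default;
```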
CREATE TABLE table1;
CREATE TABLE table2;
…

Workspace: central Hive metastore
  default
    - table1
    - table2
    - …

Storage: dbfs:/user/hive/warehouse
  table1
  table2
  …
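To check where a table in the default database is stored, its metadata can be queried. The sketch below assumes the Hive metastore and the default warehouse path shown in the diagram; the column list (id, name) is a hypothetical placeholder, since the slide omits columns.

```sql
USE default;

-- Hypothetical columns, added only so the statement is complete.
CREATE TABLE IF NOT EXISTS table1 (id INT, name STRING);

-- The detailed section of the output should report Type = MANAGED
-- and a Location under dbfs:/user/hive/warehouse.
DESCRIBE EXTENDED table1;
```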
CREATE SCHEMA db_x;
USE db_x;
CREATE TABLE table1;
CREATE TABLE table2;
…

Workspace: central Hive metastore
  default
    - table1
    - table2
    - …
  db_x
    - table1
    - table2
    - …

Storage: dbfs:/user/hive/warehouse
  table1
  table2
  …
  db_x.db
    table1
    table2
    …
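The db_x.db directory in the diagram can be confirmed from the schema metadata. A minimal sketch, assuming the Hive metastore with its default warehouse directory:

```sql
CREATE SCHEMA IF NOT EXISTS db_x;

-- The Location row should point to a db_x.db directory
-- under dbfs:/user/hive/warehouse.
DESCRIBE SCHEMA EXTENDED db_x;
```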
CREATE SCHEMA db_y
LOCATION 'dbfs:/custom/path/db_y.db';
USE db_y;
CREATE TABLE table1;
CREATE TABLE table2;
…

Workspace: central Hive metastore
  default
    - table1
    - table2
    - …
  db_x
    - table1
    - table2
    - …
  db_y
    - table1
    - table2
    - …

Storage: dbfs:/user/hive/warehouse
  table1
  table2
  …
  db_x.db
    table1
    table2
    …

Storage: dbfs:/custom/path
  db_y.db
    table1
    table2
    …
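A custom schema location changes where the database directory lives, but tables created in it without their own LOCATION are still managed. The sketch below illustrates this under the assumption of a Hive metastore; the column list is a hypothetical placeholder.

```sql
CREATE SCHEMA IF NOT EXISTS db_y
LOCATION 'dbfs:/custom/path/db_y.db';

USE db_y;

-- Hypothetical columns; no LOCATION clause on the table itself.
CREATE TABLE IF NOT EXISTS table1 (id INT, name STRING);

-- Location should be the custom path given above.
DESCRIBE SCHEMA EXTENDED db_y;

-- Type should still be MANAGED, with a Location inside db_y.db.
DESCRIBE EXTENDED table1;
```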
Tables
Managed tables
- Created under the database directory
- CREATE TABLE table_name
- Dropping the table deletes the underlying data files

External tables
- Created outside the database directory
- CREATE TABLE table_name LOCATION 'path'
- Dropping the table does not delete the underlying data files
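The difference in DROP TABLE behavior can be tried out as follows. This is a sketch under stated assumptions: the table names (managed_demo, external_demo), the columns, and the path dbfs:/tmp/demo/external_demo are all hypothetical, and the tables are assumed to be Delta tables in the Hive metastore.

```sql
-- Managed table: data files are stored under the database directory.
CREATE TABLE IF NOT EXISTS managed_demo (id INT, name STRING) USING DELTA;

-- External table: data files are stored at the explicit (hypothetical) path.
CREATE TABLE IF NOT EXISTS external_demo (id INT, name STRING) USING DELTA
LOCATION 'dbfs:/tmp/demo/external_demo';

INSERT INTO managed_demo VALUES (1, 'a');
INSERT INTO external_demo VALUES (1, 'a');

-- Removes the metastore entry and the underlying data files.
DROP TABLE managed_demo;

-- Removes only the metastore entry; the files at the path remain.
DROP TABLE external_demo;

-- Registering a table over the same path makes the existing data
-- queryable again, because the Delta files were never deleted.
CREATE TABLE external_demo USING DELTA
LOCATION 'dbfs:/tmp/demo/external_demo';

SELECT * FROM external_demo;
```

Re-registering the table at the end is one way to see that the external data outlived the DROP; for the managed table there is nothing left to recover.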
CREATE TABLE table3
LOCATION 'dbfs:/some/path_1/table3';

Workspace: central Hive metastore
  default
    - table1
    - table2
    - table3
  db_x
    - table1
    - table2
  db_y
    - table1
    - table2

Storage: dbfs:/some/path_1
  table3

Storage: dbfs:/user/hive/warehouse
  table1
  table2
  …
  db_x.db
    table1
    table2
    …

Storage: dbfs:/custom/path
  db_y.db
    table1
    table2
    …
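Whether a table ended up as external can be verified from its metadata. A minimal sketch, assuming the Hive metastore; the columns are hypothetical, and the path is the one used on the slide.

```sql
USE default;

-- Hypothetical columns; the LOCATION clause makes the table external.
CREATE TABLE IF NOT EXISTS table3 (id INT, name STRING)
LOCATION 'dbfs:/some/path_1/table3';

-- The detailed section should report Type = EXTERNAL and the
-- Location given above, outside dbfs:/user/hive/warehouse.
DESCRIBE EXTENDED table3;
```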
USE db_x;
CREATE TABLE table3
LOCATION 'dbfs:/some/path_2/x_table3';

Workspace: central Hive metastore
  default
    - table1
    - table2
    - table3
  db_x
    - table1
    - table2
    - table3
  db_y
    - table1
    - table2

Storage: dbfs:/some/path_1
  table3

Storage: dbfs:/some/path_2
  x_table3

Storage: dbfs:/user/hive/warehouse
  table1
  table2
  …
  db_x.db
    table1
    table2
    …

Storage: dbfs:/custom/path
  db_y.db
    table1
    table2
    …
USE db_y;
CREATE TABLE table3
LOCATION 'dbfs:/some/path_2/y_table3';

Workspace: central Hive metastore
  default
    - table1
    - table2
    - table3
  db_x
    - table1
    - table2
    - table3
  db_y
    - table1
    - table2
    - table3

Storage: dbfs:/some/path_1
  table3

Storage: dbfs:/some/path_2
  x_table3
  y_table3

Storage: dbfs:/user/hive/warehouse
  table1
  table2
  …
  db_x.db
    table1
    table2
    …

Storage: dbfs:/custom/path
  db_y.db
    table1
    table2
    …
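As an alternative to switching databases with USE, the table name can be qualified with the database name. The sketch below assumes db_y already exists and uses hypothetical columns; the path is the one from the slide.

```sql
-- Qualified name: the table is registered in db_y
-- regardless of the current database.
CREATE TABLE IF NOT EXISTS db_y.table3 (id INT, name STRING)
LOCATION 'dbfs:/some/path_2/y_table3';

-- Should report Type = EXTERNAL and the path given above.
DESCRIBE EXTENDED db_y.table3;

-- Lists all tables registered in db_y, managed and external alike.
SHOW TABLES IN db_y;
```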