Adv SQL and Functions
SQL - Basics
Adv. SQL - Window Functions, CTEs, LATERAL
JSONB and SP-GIST
Functions - Overview
Function Basics
Functions - By Example
Stephen Frost
Joe Conway
Queries
Syntax Overview
http://www.postgresql.org/docs/9.4/interactive/sql-select.html
Queries
Syntax Overview- from item
[ ONLY ] table_name [ * ]
[ [ AS ] alias [ ( column_alias [, ...] ) ] ]
[ LATERAL ] ( select )
[ AS ] alias [ ( column_alias [, ...] ) ]
[ LATERAL ] function_name ( [ argument [, ...] ] )
[ AS ] alias
[ ( column_alias [, ...] | column_definition [, ...] ) ]
[ LATERAL ] function_name ( [ argument [, ...] ] )
AS ( column_definition [, ...] )
with_query_name [ [ AS ] alias [ ( col_alias [, ...] ) ] ]
from_item [ NATURAL ] join_type
from_item [ ON join_condition | USING ( column [, ...] ) ]
Queries
Syntax Overview- VALUES, TABLE
Last, but not least, the most complicated ones of all. VALUES returns a table, after
evaluating all expressions:
TABLE table_name
[ ORDER BY expression [ ASC | DESC | USING op ], ... ]
[ LIMIT num ] [ OFFSET num ]
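A minimal sketch of both commands. The table `author_demo` and its rows are assumptions made for illustration, not data from the slides.

```sql
-- VALUES evaluates each expression and returns the rows as a table.
VALUES (1, 'one'), (2, 'two'), (1 + 2, 'three');

-- VALUES can also be used as a FROM item with column aliases.
SELECT num, word
FROM (VALUES (1, 'one'), (2, 'two')) AS v(num, word)
ORDER BY num DESC;

-- author_demo is a hypothetical table used only for this example.
CREATE TEMP TABLE author_demo (id int, name text);
INSERT INTO author_demo VALUES (1, 'Author A'), (2, 'Author B');

-- TABLE name is shorthand for SELECT * FROM name.
TABLE author_demo;
```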
Queries
Examples
TABLE author;
Queries
Examples
Join Types
cross join
inner join
outer join
left
right
full
Cross Joins
Joins each row from the first table with each row from the second table
SELECT * FROM tab1 CROSS JOIN tab2;
is equivalent to
SELECT * FROM tab1, tab2;
Inner Joins
Joins each row of the first table with each row from the second table for which the
condition matches
SELECT ... FROM tab1 [ INNER ] JOIN tab2 USING (column list);
Inner Joins
Examples
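A small sketch of the ON and USING forms of an inner join. The tables `ij_author` and `ij_book` and their rows are hypothetical.

```sql
-- Hypothetical demo tables; names, columns and rows are assumptions.
CREATE TEMP TABLE ij_author (author_id int, name text);
CREATE TEMP TABLE ij_book (title text, author_id int);
INSERT INTO ij_author VALUES (1, 'Author A'), (2, 'Author B'), (3, 'Author C');
INSERT INTO ij_book VALUES ('Book 1', 1), ('Book 2', 1), ('Book 3', 2);

-- ON form: any boolean join condition.
SELECT a.name, b.title
FROM ij_author a
INNER JOIN ij_book b ON b.author_id = a.author_id;

-- USING form: joins on the equally named column, which appears once in the output.
SELECT author_id, name, title
FROM ij_author
JOIN ij_book USING (author_id);

-- Author C has no matching book and appears in neither result.
```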
Outer Joins
Joins each row from the first table with each row from the second table for which the
condition matches. Furthermore, nonmatching rows are added to the result.
left join all rows from the left table
right join all rows from the right table
full join all rows from both tables
Rows without a join partner are filled up with null values.
Outer Joins
Syntax
Outer Joins
Examples
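The three outer join variants can be sketched as follows. The tables `oj_author` and `oj_book` are assumptions; row order in the results is not guaranteed.

```sql
-- Hypothetical demo tables: Author B has no book, Book 3 has no author row.
CREATE TEMP TABLE oj_author (author_id int, name text);
CREATE TEMP TABLE oj_book (title text, author_id int);
INSERT INTO oj_author VALUES (1, 'Author A'), (2, 'Author B');
INSERT INTO oj_book VALUES ('Book 1', 1), ('Book 3', 3);

-- LEFT JOIN keeps every author: (Author A, Book 1), (Author B, NULL)
SELECT a.name, b.title
FROM oj_author a LEFT JOIN oj_book b ON b.author_id = a.author_id;

-- RIGHT JOIN keeps every book: (Author A, Book 1), (NULL, Book 3)
SELECT a.name, b.title
FROM oj_author a RIGHT JOIN oj_book b ON b.author_id = a.author_id;

-- FULL JOIN keeps both sides:
-- (Author A, Book 1), (Author B, NULL), (NULL, Book 3)
SELECT a.name, b.title
FROM oj_author a FULL JOIN oj_book b ON b.author_id = a.author_id;
```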
Set Operations
Example Data
Set Operations
UNION
title
------------------
Joe Conway
Kostenlos Laufen
Angst Lauf
Wildlauf
Running Free
Running Scared
Running Wild
Stephen Frost
(8 rows)
Set Operations
UNION ALL
Set Operations
INTERSECT
title
-------
(0 rows)
Set Operations
EXCEPT
title
------------------
Running Free
Running Scared
Wildlauf
Running Wild
Angst Lauf
Kostenlos Laufen
(6 rows)
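Since the example data did not survive, here is a self-contained sketch of the four set operations on two hypothetical single-column tables (`set_a`, `set_b`). Row order in the results is not guaranteed.

```sql
-- Hypothetical input sets; set_a contains 'y' twice.
CREATE TEMP TABLE set_a (v text);
CREATE TEMP TABLE set_b (v text);
INSERT INTO set_a VALUES ('x'), ('y'), ('y');
INSERT INTO set_b VALUES ('y'), ('z');

SELECT v FROM set_a UNION     SELECT v FROM set_b;  -- x, y, z (duplicates removed)
SELECT v FROM set_a UNION ALL SELECT v FROM set_b;  -- x, y, y, y, z (all rows kept)
SELECT v FROM set_a INTERSECT SELECT v FROM set_b;  -- y (rows present in both)
SELECT v FROM set_a EXCEPT    SELECT v FROM set_b;  -- x (rows only in set_a)
```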
Subqueries
Uncorrelated
Uncorrelated subquery:
Subquery calculates a constant result set for the upper query
Executed only once
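A minimal uncorrelated subquery, assuming a hypothetical table `usq_book`. The inner query does not refer to the outer query, so its result is a constant for the whole statement.

```sql
-- Hypothetical demo data.
CREATE TEMP TABLE usq_book (title text, price numeric);
INSERT INTO usq_book VALUES ('Book 1', 10), ('Book 2', 20), ('Book 3', 30);

-- The subquery yields a single value (the average, 20) for the outer query.
SELECT title, price
FROM usq_book
WHERE price > (SELECT avg(price) FROM usq_book);
-- returns Book 3
```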
Subqueries
Correlated
Correlated subquery:
Subquery references variables from the upper query
Subquery has to be repeated for each row of the upper query
Could be rewritten as a join
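A sketch of a correlated subquery and one possible join rewrite, assuming a hypothetical table `csq_book` whose `language` column contains no NULLs.

```sql
-- Hypothetical demo data.
CREATE TEMP TABLE csq_book (title text, language text, price numeric);
INSERT INTO csq_book VALUES
    ('Book 1', 'english', 10),
    ('Book 2', 'english', 20),
    ('Book 3', 'german', 30);

-- Correlated: the inner query references b.language from the outer query.
SELECT b.title, b.language, b.price
FROM csq_book b
WHERE b.price > (SELECT avg(i.price)
                 FROM csq_book i
                 WHERE i.language = b.language);

-- Join rewrite: compute the per-language average once, then join.
SELECT b.title, language, b.price
FROM csq_book b
JOIN (SELECT language, avg(price) AS avg_price
      FROM csq_book
      GROUP BY language) a USING (language)
WHERE b.price > a.avg_price;
-- both queries return Book 2
```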
Subqueries
Correlated
Results:
Window functions are like ordinary aggregates, but they operate on only a portion of
the tuples (the window) and return a value for every row instead of collapsing rows.
SELECT
title, language, price,
AVG(price) OVER(PARTITION BY language) FROM book;
SELECT
title, language, price,
ROUND(AVG(price) OVER(PARTITION BY language),2) FROM book;
SELECT
title, language, price,
AVG(price) OVER(ORDER BY language RANGE UNBOUNDED PRECEDING)
FROM book;
SELECT
title, language, price,
AVG(price) OVER(ORDER BY language ROWS UNBOUNDED PRECEDING)
FROM book;
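The difference between the three window definitions above can be seen side by side in this sketch. The table `win_book` and its rows are assumptions.

```sql
-- Hypothetical demo data.
CREATE TEMP TABLE win_book (title text, language text, price numeric);
INSERT INTO win_book VALUES
    ('Book 1', 'english', 10),
    ('Book 2', 'english', 20),
    ('Book 3', 'german', 30);

SELECT title, language, price,
       -- Average over all rows sharing the current row's language:
       -- 15 for both english rows, 30 for the german row.
       AVG(price) OVER (PARTITION BY language) AS avg_partition,
       -- RANGE frame: start of the ordering through the last peer of the
       -- current row (peers have the same language):
       -- 15 for both english rows, 20 for the german row.
       AVG(price) OVER (ORDER BY language RANGE UNBOUNDED PRECEDING) AS avg_range,
       -- ROWS frame: start of the ordering through the current row only, so
       -- the two english rows get different values depending on their order.
       AVG(price) OVER (ORDER BY language ROWS UNBOUNDED PRECEDING) AS avg_rows
FROM win_book;
```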
Select all books and compare each book's price against the average price and total
price of all books in the same language:
SELECT
title, language, price,
AVG(price) OVER mywindow,
SUM(price) OVER mywindow
FROM book
WINDOW mywindow AS (PARTITION BY language);
http://www.postgresql.org/docs/9.4/interactive/tutorial-window.html
Results:
A query can contain multiple window definitions, and can mix functions that use a
named window clause with functions that define their window inline.
SELECT
row_number() OVER () as row, title, language, price,
AVG(price) OVER mywindow,
SUM(price) OVER mywindow
FROM book
WINDOW mywindow AS (PARTITION BY language);
Results:
SELECT
rank() OVER (ORDER BY title), title, language, price,
AVG(price) OVER mywindow,
SUM(price) OVER mywindow
FROM book
WINDOW mywindow AS (PARTITION BY language);
An ORDER BY in a window clause may re-order the rows, but an explicit overall
ORDER BY can still be used to achieve the desired result ordering.
SELECT
rank() OVER (ORDER BY title), title, language, price,
AVG(price) OVER mywindow,
SUM(price) OVER mywindow
FROM book
WINDOW mywindow AS (PARTITION BY language) ORDER BY price;
Note that the rank value remains correct even though the final ordering is changed.
Results:
SELECT
rank() OVER (ORDER BY language), title, language, price,
AVG(price) OVER mywindow,
SUM(price) OVER mywindow
FROM book
WINDOW mywindow AS (PARTITION BY language);
Syntax
Using a self-reference within a RECURSIVE query needs the following syntax in the
inner WITH definition:
WITH with_1(prename)
AS ( SELECT 'Stephen'::text ),
with_2(fullname)
AS ( SELECT with_1.prename || ' ' || 'Frost' FROM with_1 )
SELECT fullname FROM with_2;
Use WITH clauses to calculate the average by language, then another to pull the sum
by language, and finally join them with the original table.
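One way this could look, as a sketch. The table `cte_book` is a hypothetical stand-in for the `book` table used in the window function examples.

```sql
-- Hypothetical demo data.
CREATE TEMP TABLE cte_book (title text, language text, price numeric);
INSERT INTO cte_book VALUES
    ('Book 1', 'english', 10),
    ('Book 2', 'english', 20),
    ('Book 3', 'german', 30);

WITH avg_by_lang AS (
    SELECT language, AVG(price) AS avg_price
    FROM cte_book
    GROUP BY language
), sum_by_lang AS (
    SELECT language, SUM(price) AS sum_price
    FROM cte_book
    GROUP BY language
)
-- Join both CTEs back to the original table on language.
SELECT b.title, language, b.price, a.avg_price, s.sum_price
FROM cte_book b
JOIN avg_by_lang a USING (language)
JOIN sum_by_lang s USING (language);
```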
Results:
1 Initialize
IT (intermediate table) is initialized as an empty set
Execute the non-recursive query
Assign results to both RT (result table) and WT (working table)
2 Execute recursive query
Replace recursive self-reference with WT
Assign results during execution to IT
Append IT to RT
Replace WT with current IT
Truncate IT
3 Check recursion
Repeat 2) until WT is an empty set
Return RT
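The steps above can be traced on a small query. This is an illustrative sketch; the CTE name `doubling` is an assumption, and the comments show the table contents per pass as described by the algorithm.

```sql
WITH RECURSIVE doubling(n) AS (
    VALUES (1)             -- non-recursive query: RT = {1}, WT = {1}
  UNION ALL
    SELECT n * 2           -- recursive query: the self-reference reads WT
    FROM doubling
    WHERE n < 16
)
SELECT n FROM doubling;
-- pass 1: WT = {1}  -> IT = {2},  RT = {1,2}
-- pass 2: WT = {2}  -> IT = {4},  RT = {1,2,4}
-- pass 3: WT = {4}  -> IT = {8},  RT = {1,2,4,8}
-- pass 4: WT = {8}  -> IT = {16}, RT = {1,2,4,8,16}
-- pass 5: WT = {16} -> IT = {} (16 < 16 is false), WT becomes empty, stop
-- returns 1, 2, 4, 8, 16
```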
Stephen Frost, Joe Conway Postgres Open 2014
WITH RECURSIVE foo_with(n) AS (
VALUES(1)
UNION
SELECT n+1
FROM foo_with
WHERE n < 100
)
SELECT * FROM foo_with;
A parts list is a self-referencing table whose contents cannot easily be retrieved with plain (non-recursive) SQL.
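A sketch of such a traversal. The table `parts_demo`, its columns other than `whole`, and its rows are assumptions chosen for illustration.

```sql
-- Hypothetical parts hierarchy: each row says "whole contains qty of part".
CREATE TEMP TABLE parts_demo (whole text, part text, qty int);
INSERT INTO parts_demo VALUES
    ('car', 'engine', 1),
    ('car', 'wheel', 4),
    ('engine', 'bolt', 8),
    ('wheel', 'bolt', 5);

WITH RECURSIVE contents(part, qty) AS (
    -- Non-recursive term: direct components of the car.
    SELECT part, qty FROM parts_demo WHERE whole = 'car'
  UNION ALL
    -- Recursive term: components of components, multiplying quantities.
    SELECT p.part, p.qty * c.qty
    FROM contents c
    JOIN parts_demo p ON p.whole = c.part
)
SELECT SUM(qty) AS bolts
FROM contents
WHERE part = 'bolt';
-- 1 engine * 8 bolts + 4 wheels * 5 bolts = 28
```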
Result: 34
Caveats
WITH archive_rows AS (
    DELETE FROM parts_list
    WHERE whole = 'car'
    RETURNING *
)
INSERT INTO parts_list_archive
SELECT * FROM archive_rows;
LATERAL
LATERAL is a new JOIN method (aka 'LATERAL JOIN') which allows a subquery in
one part of the FROM clause to reference columns from earlier items in the FROM
clause.
Refer to earlier table
Refer to earlier subquery
Refer to earlier set-returning function
LATERAL is implied when a set-returning function (SRF) refers to an earlier item in the FROM clause
SELECT *
FROM numbers, LATERAL generate_series(1,max_num);
SELECT *
FROM numbers, generate_series(1,max_num);
max_num | generate_series
---------+-----------------
1 | 1
2 | 1
2 | 2
3 | 1
3 | 2
3 | 3
[...]
(55 rows)
SELECT *
FROM (SELECT generate_series as max_num
FROM generate_series(1,10)) as numbers,
LATERAL generate_series(1,max_num);
SELECT *
FROM (SELECT generate_series as max_num
FROM generate_series(1,10)) as numbers,
generate_series(1,max_num);
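LATERAL is also useful with subqueries, where the keyword is required. The sketch below picks the two most expensive books per language; the table `lat_book` and its rows are assumptions.

```sql
-- Hypothetical demo data.
CREATE TEMP TABLE lat_book (title text, language text, price numeric);
INSERT INTO lat_book VALUES
    ('Book 1', 'english', 10),
    ('Book 2', 'english', 20),
    ('Book 3', 'english', 30),
    ('Book 4', 'german', 15);

-- The subquery refers to l.language, an earlier FROM item, so it is
-- evaluated once per language.
SELECT l.language, best.title, best.price
FROM (SELECT DISTINCT language FROM lat_book) AS l,
     LATERAL (SELECT b.title, b.price
              FROM lat_book b
              WHERE b.language = l.language
              ORDER BY b.price DESC
              LIMIT 2) AS best;
-- english: Book 3, Book 2; german: Book 4
```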
JSONB is a new data type in 9.4 which is nearly identical to the JSON data type.
There are a few specific differences which are important to note:
JSON is stored as a regular 'text' blob that must be reparsed on every use, making it slow to utilize
JSONB is stored much more efficiently in a binary data format
JSONB is very slightly slower to input
JSONB normalizes input: it discards insignificant whitespace and does not preserve
key order or duplicate keys
JSON can only be sensibly indexed through functional indexes
JSONB can be directly indexed
JSONB number output depends on the PostgreSQL numeric data type
JSONB has containment and existence operators
JSONB Example
As mentioned, JSONB does not preserve whitespace (or lack of it), for example:
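A short sketch of the normalization, using made-up input values:

```sql
-- json keeps the input text exactly; jsonb prints a normalized form.
SELECT '{"a":1,    "b":  2}'::json  AS json_value,   -- {"a":1,    "b":  2}
       '{"a":1,    "b":  2}'::jsonb AS jsonb_value;  -- {"a": 1, "b": 2}

-- jsonb also drops duplicate keys (the last value wins) and reorders keys.
SELECT '{"bb":1, "a":2, "a":3}'::jsonb;              -- {"a": 3, "bb": 1}
```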
JSONB Example
JSONB uses the numeric data type’s output format, see these two identical inputs:
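For instance, following the example in the PostgreSQL documentation for the JSON types:

```sql
-- The same input text; json echoes it, jsonb prints it the way numeric does.
SELECT '{"reading": 1.230e-5}'::json  AS json_value,   -- {"reading": 1.230e-5}
       '{"reading": 1.230e-5}'::jsonb AS jsonb_value;  -- {"reading": 0.00001230}
```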
Array on the right side is contained within the one on the left.
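A sketch of the containment (`@>`) and existence (`?`) operators, plus a GIN index that can serve them. The table `jdoc_demo` is a hypothetical example.

```sql
-- Containment: is the right-hand value contained in the left-hand one?
SELECT '[1, 2, 3]'::jsonb @> '[1, 3]'::jsonb;   -- true
SELECT '[1, 2, 3]'::jsonb @> '[1, 4]'::jsonb;   -- false

-- Existence: does the key exist at the top level?
SELECT '{"a": 1, "b": 2}'::jsonb ? 'b';         -- true

-- jsonb columns can be indexed directly; jdoc_demo is a hypothetical table.
CREATE TEMP TABLE jdoc_demo (doc jsonb);
INSERT INTO jdoc_demo VALUES ('{"a": 1, "tags": ["x", "y"]}'), ('{"a": 2}');
CREATE INDEX jdoc_demo_gin ON jdoc_demo USING gin (doc);

SELECT doc FROM jdoc_demo WHERE doc @> '{"a": 1}';  -- returns the first row
```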
SP-GIST
SP-GIST differs from other index types by decomposing the given space into disjoint
partitions.
SP-GIST index creation is generally faster than GIST
SP-GIST index size is comparable to GIST
SP-GIST query time is much faster than GIST
SP-GIST Example
Performance depends on the amount of data and the size of the overall space of the
data which is indexed. A simple 1,000,000 point example shows improved
performance, where smaller data sets showed little difference:
postgres=# explain analyze select point from geo where point ~= '(-29.549120804
[...]
Execution time: 0.245 ms
postgres=# create index pt_spgist_idx on geo using spgist(point);
CREATE INDEX
postgres=# explain analyze select point from geo where point ~= '(-29.549120804
[...]
Execution time: 0.158 ms
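To try a comparison like this yourself, something along these lines could be used. The table `geo_demo`, its random contents and the index names are assumptions, not the data behind the timings above, and the measured times will vary.

```sql
-- Hypothetical test table filled with random points.
CREATE TEMP TABLE geo_demo AS
SELECT point(random() * 180 - 90, random() * 360 - 180) AS pt
FROM generate_series(1, 100000);

-- Baseline: GiST index.
CREATE INDEX geo_demo_gist ON geo_demo USING gist (pt);
ANALYZE geo_demo;
EXPLAIN ANALYZE SELECT pt FROM geo_demo WHERE pt ~= point(0, 0);

-- Swap in an SP-GiST index and repeat the same query.
DROP INDEX geo_demo_gist;
CREATE INDEX geo_demo_spgist ON geo_demo USING spgist (pt);
EXPLAIN ANALYZE SELECT pt FROM geo_demo WHERE pt ~= point(0, 0);
```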
Functions
Operators
Data types
Index methods
Casts
Triggers
Aggregates
Ordered-set Aggregates
Window Functions
SQL Functions
Behavior
Executes an arbitrary list of SQL statements separated by
semicolons
Last statement may be INSERT, UPDATE, or DELETE with
RETURNING clause
Arguments
Referenced by function body using name or $n: $1 is first arg, etc.
If composite type, dot notation ($1.name) is used to access fields
Only used as data values, not as identifiers
Return
If singleton, first row of last query result returned, NULL on no
result
If SETOF, all rows of last query result returned, empty set on
no result
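The return rules can be sketched as follows. The table `fn_demo` and both function names are hypothetical.

```sql
-- Hypothetical demo data.
CREATE TABLE fn_demo (id int, label text);
INSERT INTO fn_demo VALUES (1, 'one'), (2, 'two'), (3, 'three');

-- Singleton return: first row of the last query. Argument referenced by name.
CREATE FUNCTION first_label(min_id int) RETURNS text AS $$
    SELECT label FROM fn_demo WHERE id >= min_id ORDER BY id;
$$ LANGUAGE sql;

-- SETOF return: all rows of the last query. Argument referenced as $1.
CREATE FUNCTION all_labels(int) RETURNS SETOF text AS $$
    SELECT label FROM fn_demo WHERE id >= $1 ORDER BY id;
$$ LANGUAGE sql;

SELECT first_label(2);          -- two
SELECT first_label(99);         -- NULL (no result row)
SELECT * FROM all_labels(2);    -- two, three
SELECT * FROM all_labels(99);   -- empty set (0 rows)
```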
http://www.postgresql.org/docs/9.4/static/xfunc-sql.html
Procedural Languages
User-defined functions
Written in languages besides SQL and C
Task is passed to a special handler that knows the details of
the language
Dynamically loaded
Could be self-contained (e.g. PL/pgSQL)
Might be externally linked (e.g. PL/Perl)
http://www.postgresql.org/docs/9.4/static/xplang.html
Internal Functions
http://www.postgresql.org/docs/9.4/static/xfunc-internal.html
C Language Functions
Language Availability
PostgreSQL includes the following server-side procedural
languages:
http://www.postgresql.org/docs/9.4/static/xplang.html
PL/pgSQL
Perl
Python
Tcl
Other languages available:
http://pgfoundry.org/softwaremap/trove_list.php?form_cat=311
Java
V8 (Javascript)
Ruby
R
Shell
others . . .
http://www.postgresql.org/docs/9.4/static/sql-createfunction.html
Dollar Quoting
Works for all character strings
Particularly useful for function bodies
Consists of a dollar sign ($), a "tag" of zero or more characters, and
another dollar sign
Start and End tag must match
Nest dollar-quoted string literals by choosing different tags at
each nesting level
CREATE OR REPLACE FUNCTION dummy () RETURNS text AS
$_$
BEGIN
    RETURN $$Say 'hello'$$;
END;
$_$
LANGUAGE plpgsql;
http://www.postgresql.org/docs/9.4/static/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING
Anonymous Functions
http://www.postgresql.org/docs/9.4/static/sql-do.html
DO $_$
DECLARE r record;
BEGIN
    FOR r IN SELECT u.rolname
             FROM pg_authid u
             JOIN pg_auth_members m ON m.member = u.oid
             JOIN pg_authid g ON g.oid = m.roleid
             WHERE g.rolname = 'admin'
    LOOP
        -- quote_ident() protects against role names that need quoting
        EXECUTE $$ ALTER ROLE $$ || quote_ident(r.rolname) ||
                $$ SET work_mem = '512MB' $$;
    END LOOP;
END$_$;
Arguments
argname (optional):
Most, but not all, languages can use the name in the function body
Use named notation to improve readability and allow reordering
Defines the OUT column name in the result row type
CREATE FUNCTION testfoo (IN a int, INOUT mult int = 2, OUT a int)
RETURNS RECORD AS $$
VALUES (mult, a * mult);
$$ language sql;
SELECT * FROM testfoo(mult := 3, a := 14);
mult | a
------+----
3 | 42
(1 row)
Function Overloading
Input argument (IN/INOUT/VARIADIC) signature used
Avoid ambiguities:
Type (e.g. REAL vs. DOUBLE PRECISION)
Function name same as IN composite field name
VARIADIC vs same type scalar
CREATE OR REPLACE FUNCTION foo (text) RETURNS text AS $$
    SELECT 'Hello ' || $1
$$ LANGUAGE sql;
CREATE OR REPLACE FUNCTION foo (int) RETURNS text AS $$
    SELECT ($1 / 2)::text || ' was here'
$$ LANGUAGE sql;
[ RETURNS rettype
| RETURNS TABLE ( column_name column_type [, ...] ) ]
SELECT testbar2();
testbar2
------------
(42,hello)
(64,world)
(2 rows)
SELECT (testbar2()).*;
f1 | f2
----+-------
42 | hello
64 | world
(2 rows)
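The definition of `testbar2` is not shown above. A function producing output of this shape could look like the following sketch, where the name `pairs_demo` is hypothetical.

```sql
-- RETURNS TABLE declares the result columns and implies a set result.
CREATE FUNCTION pairs_demo() RETURNS TABLE (f1 int, f2 text) AS $$
    VALUES (42, 'hello'), (64, 'world');
$$ LANGUAGE sql;

SELECT pairs_demo();          -- one composite value per row: (42,hello), (64,world)
SELECT (pairs_demo()).*;      -- composite expanded into columns f1 and f2
SELECT * FROM pairs_demo();   -- the function used as a FROM item
```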
LANGUAGE
LANGUAGE lang_name
WINDOW
WINDOW
Window Functions
Indicates function is a window function rather than ”normal”
function
Provides ability to calculate across sets of rows related to
current row
Similar to aggregate functions, but does not cause rows to
become grouped
Able to access more than just the current row of the query
result
Window functions can be written in C, PL/R, PL/V8, others?
WINDOW
Several window functions are built in
select distinct proname from pg_proc where proiswindow order by 1;
proname
--------------
cume_dist
dense_rank
first_value
lag
last_value
lead
nth_value
ntile
percent_rank
rank
row_number
(11 rows)
Volatility
VOLATILE (default)
Each call can return a different result
Example: random() or timeofday()
Functions modifying table contents must be declared volatile
STABLE
Returns same result for same arguments within single query
Example: now()
Consider configuration settings that affect output
IMMUTABLE
Always returns the same result for the same arguments
Example: lower(’ABC’)
Unaffected by configuration settings
Not dependent on table contents
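A sketch of how the three categories are declared. All function and table names here are hypothetical.

```sql
-- VOLATILE (the default): may return a different result on every call.
CREATE FUNCTION vol_demo_roll() RETURNS int AS $$
    SELECT (floor(random() * 6) + 1)::int;
$$ LANGUAGE sql VOLATILE;

-- STABLE: same result within a single query; here it depends on the
-- transaction start time and the TimeZone setting.
CREATE FUNCTION vol_demo_today() RETURNS date AS $$
    SELECT now()::date;
$$ LANGUAGE sql STABLE;

-- IMMUTABLE: depends only on its arguments.
CREATE FUNCTION vol_demo_double(int) RETURNS int AS $$
    SELECT $1 * 2;
$$ LANGUAGE sql IMMUTABLE;

-- Index expressions must use only IMMUTABLE functions.
CREATE TABLE vol_demo (n int);
CREATE INDEX vol_demo_idx ON vol_demo (vol_demo_double(n));
```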
LEAKPROOF requirements
No side effects
Reveals no info about args other than by return value
Planner may push leakproof functions into views created with
the security_barrier option
Can only be set by the superuser
\c - joe
EXPLAIN ANALYZE SELECT * FROM user_books
WHERE leak_info(luser, bookname) = 0;
NOTICE: tom:book-1
NOTICE: tom:book-3
NOTICE: tom:book-5
NOTICE: tom:book-7
QUERY PLAN
------------------------------------------------------------------
Seq Scan on all_books (cost=0.00..1.18 rows=1 width=72) (actual ...
Filter: ((leak_info(luser, bookname) = 0) AND
(luser = ("current_user"())::text))
Rows Removed by Filter: 4
Planning time: 0.674 ms
Execution time: 2.044 ms
(5 rows)
\c - joe
EXPLAIN ANALYZE SELECT * FROM user_books
WHERE leak_info(luser, bookname) = 0;
QUERY PLAN
------------------------------------------------------------------------
Subquery Scan on user_books (cost=0.00..1.16 rows=1 width=72) (actual ...
Filter: (leak_info(user_books.luser, user_books.bookname) = 0)
-> Seq Scan on all_books (cost=0.00..1.14 rows=1 width=72) (actual ...
Filter: (luser = ("current_user"())::text)
Rows Removed by Filter: 4
Planning time: 0.648 ms
Execution time: 1.903 ms
(7 rows)
\c - joe
EXPLAIN ANALYZE SELECT * FROM user_books
WHERE leak_info(luser, bookname) = 0;
NOTICE: tom:book-1
NOTICE: tom:book-3
NOTICE: tom:book-5
NOTICE: tom:book-7
QUERY PLAN
------------------------------------------------------------------
Seq Scan on all_books (cost=0.00..1.18 rows=1 width=72) (actual ...
Filter: ((leak_info(luser, bookname) = 0) AND
(luser = ("current_user"())::text))
Rows Removed by Filter: 4
Planning time: 0.646 ms
Execution time: 2.145 ms
(5 rows)
Lesson
Be sure a function really is leakproof before marking it
LEAKPROOF
Why use LEAKPROOF at all?
Performance (predicate push down)
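The setup behind the three plans above is not shown. The following is a guess at a setup that produces this kind of behavior: the object names come from the plans, while the column types, rows and function body are assumptions.

```sql
-- Assumed table layout and rows.
CREATE TABLE all_books (luser text, bookname text);
INSERT INTO all_books VALUES
    ('joe', 'book-0'), ('tom', 'book-1'), ('tom', 'book-3');

-- A very cheap function that leaks its arguments through NOTICE messages.
CREATE FUNCTION leak_info(text, text) RETURNS int AS $$
BEGIN
    RAISE NOTICE '%:%', $1, $2;
    RETURN 0;
END;
$$ LANGUAGE plpgsql COST 0.0000001;

-- Plain view: the planner may evaluate leak_info() before the view's own
-- filter, exposing other users' rows.
CREATE VIEW user_books AS
    SELECT * FROM all_books WHERE luser = current_user;

-- Security barrier: the view's filter is applied first.
ALTER VIEW user_books SET (security_barrier = true);

-- Superuser only: declaring the function LEAKPROOF allows push down again,
-- which is unsafe here because the function does leak.
ALTER FUNCTION leak_info(text, text) LEAKPROOF;
```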
Optimizer Hints
COST execution_cost
ROWS result_rows
execution cost
Estimated execution cost for the function
Positive floating point number
Units are cpu_operator_cost
For set-returning functions, cost is per returned row
Default: 1 unit for C-language/internal, 100 units for all others
result rows
Estimated number of rows returned
Positive floating point number
Only allowed when declared to return set
Default: 1000
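A sketch of both hints on a set-returning function. The function name `hint_demo_ids` is hypothetical.

```sql
-- Tell the planner this function is cheap and returns about 10 rows,
-- instead of the default estimate of 1000.
CREATE FUNCTION hint_demo_ids() RETURNS SETOF int AS $$
    SELECT generate_series(1, 10);
$$ LANGUAGE sql COST 10 ROWS 10;

-- The row estimate of the function scan should reflect the ROWS setting.
EXPLAIN SELECT * FROM hint_demo_ids();
```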
Optimizer Hints
SET clause
Specified config set to value for duration of function
SET FROM CURRENT uses session’s current value
CREATE FUNCTION testbar9 ()
RETURNS SETOF int AS $$
    VALUES (42), (64);
$$ LANGUAGE sql SET work_mem = '512MB';
Function Body
AS definition
| AS obj_file, link_symbol
definition
String literal
Parsed by the language's parser
Can be an internal function name
Can be the path to an object file, if the C-language function name matches the SQL function name
Dollar quote, or escape single quotes and backslashes
Simple
Custom Operator
CREATE OPERATOR + (
procedure = sum,
leftarg = text,
rightarg = text
);
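The function behind the operator is not shown above. As a self-contained sketch, assuming a hypothetical two-argument function `concat_demo` in place of `sum`:

```sql
-- Hypothetical implementation function for the operator.
CREATE FUNCTION concat_demo(text, text) RETURNS text AS $$
    SELECT $1 || $2;
$$ LANGUAGE sql IMMUTABLE;

CREATE OPERATOR + (
    procedure = concat_demo,
    leftarg = text,
    rightarg = text
);

-- Explicit casts make sure the text + text operator is chosen.
SELECT 'abc'::text + 'def'::text;   -- abcdef
```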
Custom Aggregate
INSERT RETURNING
Composite Argument
SELECT name,
double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
FROM emp;
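The table and function used by this query are not shown. A sketch that mirrors the corresponding example in the PostgreSQL documentation (the row data is an assumption):

```sql
CREATE TABLE emp (
    name    text,
    salary  numeric,
    age     integer,
    cubicle point
);
INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');

-- The argument is a whole row of type emp; fields are read with dot notation.
CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
    SELECT $1.salary * 2 AS salary;
$$ LANGUAGE sql;

-- Pass the current row as the composite argument.
SELECT name, double_salary(emp.*) AS dream
FROM emp;
-- Bill | 8400
```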
Polymorphic
SELECT (new_emp()).name;
name
------
None
VARIADIC
DEFAULT Arguments
PL/pgSQL
http://www.postgresql.org/docs/9.4/static/plpgsql.html
Simple
Parameter ALIAS
Named Parameters
Recursive
CREATE OR REPLACE FUNCTION factorial (i numeric)
RETURNS numeric AS $$
BEGIN
    IF i = 0 THEN
        RETURN 1;
    ELSIF i = 1 THEN
        RETURN 1;
    ELSE
        RETURN i * factorial(i - 1);
    END IF;
END;
$$ LANGUAGE plpgsql;
SELECT factorial(42::numeric);
factorial
------------------------------------------------------
1405006117752879898543142606244511569936384000000000
(1 row)
Record types
select format();
format
--------------
a = 2; b = 4
(1 row)
PERFORM
CREATE OR REPLACE FUNCTION func_w_side_fx() RETURNS void AS
$$ INSERT INTO foo VALUES (41),(42) $$ LANGUAGE sql;
SELECT dummy();
SELECT * FROM foo;
f1
----
41
42
(2 rows)
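The PL/pgSQL function that calls `func_w_side_fx()` is not shown above. A self-contained sketch of the PERFORM pattern, with hypothetical names throughout:

```sql
CREATE TABLE perform_demo (f1 int);

-- A function called only for its side effect.
CREATE FUNCTION perform_demo_insert() RETURNS void AS $$
    INSERT INTO perform_demo VALUES (41), (42)
$$ LANGUAGE sql;

CREATE FUNCTION perform_demo_caller() RETURNS text AS $$
BEGIN
    -- PERFORM runs the query and discards its result; a bare SELECT
    -- without INTO is an error in PL/pgSQL.
    PERFORM perform_demo_insert();
    RETURN 'done';
END;
$$ LANGUAGE plpgsql;

SELECT perform_demo_caller();
SELECT * FROM perform_demo;   -- 41, 42
```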
Dynamic SQL
Cursors
CREATE OR REPLACE FUNCTION totalbalance()
RETURNS numeric AS $$
DECLARE
    tmp RECORD;
    result numeric;
BEGIN
    result := 0.00;
    FOR tmp IN SELECT * FROM foo LOOP
        result := result + tmp.f1;
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql;
SELECT totalbalance();
totalbalance
--------------
83.00
(1 row)
Error Handling
http://www.postgresql.org/docs/9.4/static/errcodes-appendix.html
Window Function
CREATE TABLE mydata (
    pk int primary key,
    mydate date NOT NULL,
    gender text NOT NULL CHECK(gender IN ('M','F')),
    mygroup text NOT NULL,
    id int NOT NULL
);
Window Function
SELECT id, gender, obs_days, sum(chgd) AS num_changes
FROM (
    SELECT id, gender,
           CASE WHEN row_number() OVER w > 1
                 AND mygroup <> lag(mygroup) OVER w THEN 1
                ELSE 0 END AS chgd,
           last_value(mydate) OVER w - first_value(mydate) OVER w AS obs_days
    FROM mydata
    WINDOW w AS
        (PARTITION BY id, gender ORDER BY id, gender, mydate
         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
) AS ss
GROUP BY id, gender, obs_days
ORDER BY id, gender;
id | gender | obs_days | num_changes
----+--------+----------+-------------
1 | F | 0 | 0
2 | F | 2126 | 5
3 | M | 770 | 1
4 | M | 0 | 0
(4 rows)
Lateral
Thank You
Questions?