Article about PostgreSQL 18
I wrote an article about PostgreSQL 18. It is written in Czech, but translators from Czech to English, German, and other languages work relatively well today.
Some notes about PostgreSQL
Yesterday I found a significant growth of sequentially read tuples. After some investigation I found a query with a strange predicate:
WHERE 1 = case when pb1_0.parent_id is not null then 0 end
It is really strange, and I had to ask who wrote it.
The answer is: Hibernate. It is a transformation of the predicate parent_id = (?)
when the list of ids is empty.
Unfortunately, PostgreSQL is not able to detect that this predicate is always false, and so the repeated execution of this query ended in repeated full scans.
Fortunately, Postgres has a simple workaround - a partial (conditional) index:
CREATE INDEX ON pb (id) WHERE 1 = CASE WHEN parent_id IS NOT NULL THEN 0 END;
This index is always empty, so an index scan over it is very fast.
This issue should be fixed in more recent versions of Hibernate, where the predicate 1=0 is generated instead.
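Whether the planner actually uses the empty partial index can be verified with EXPLAIN. A sketch, assuming the table is named pb as in the index definition above (the alias pb1_0 mimics the Hibernate-generated query):

```sql
-- The query predicate is textually the same as the index predicate
-- (the alias resolves to the same column), so the planner can prove
-- the implication and use the always-empty partial index.
EXPLAIN SELECT *
  FROM pb pb1_0
 WHERE 1 = CASE WHEN pb1_0.parent_id IS NOT NULL THEN 0 END;
-- the plan should show an index scan on the partial index
-- instead of a Seq Scan on pb
```

This needs a running PostgreSQL server, so it is only a sketch of the verification step, not verified output.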
Originally, PL/pgSQL was a very simple language, and the integration of the procedural language with SQL was very primitive. A very specific feature of PL/pgSQL is the translation of every expression to SQL. Thanks to this translation, PL/pgSQL is very simple and really fully integrated with the SQL engine. The bad side is the possibility of writing dirty or broken code that is not detected.
Every expression is translated to a query, so the expression `10` is translated to `SELECT 10`. PL/pgSQL also allows dirty expressions like `a FROM foo WHERE x = 10`. Although this syntax is ugly, it was very popular, and it is supported and will be supported forever.
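A minimal sketch of this "dirty" form, assuming a hypothetical table foo with columns a and x (the table and values are illustrative only):

```sql
CREATE TABLE foo(a int, x int);
INSERT INTO foo VALUES (1, 10);

DO $$
DECLARE v int;
BEGIN
  -- PL/pgSQL sends the right-hand side to the SQL engine as:
  --   SELECT a FROM foo WHERE x = 10
  v := a FROM foo WHERE x = 10;
  RAISE NOTICE 'v = %', v;
END;
$$;
```

The assignment works, even though `a FROM foo WHERE x = 10` is not a standalone SQL expression - it only becomes a valid query after PL/pgSQL prepends `SELECT`.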
Unfortunately, the support of this syntax, combined with the removal of some limits in PostgreSQL 14, allows a new kind of bug (see https://www.postgresql.org/message-id/CALyvM2bp_CXMH_Gyq87pmHJRuZDEhV40u9VP8rX=CAnEt2wUXg@mail.gmail.com).
When somebody writes code like:
DECLARE var int;
BEGIN
var := 10
DELETE FROM tab WHERE varx = 20;
END;
this code fails on PostgreSQL 13 and older, but on PostgreSQL 14 and higher it just quietly does nothing. Why?
The problem is the missing semicolon after `10`. PL/pgSQL translates the code to the query
`SELECT 10 DELETE FROM tab WHERE varx = 20`. This query fails on older Postgres, because `DELETE` is a keyword and using it as a column alias requires the `AS` keyword. But PostgreSQL 14 doesn't need it - it allows using keywords as column aliases without the `AS` keyword.
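The difference can be checked directly in psql, using the table and column from the snippet above:

```sql
-- PostgreSQL 14+: DELETE is accepted as a bare column label, so this is
-- parsed as  SELECT 10 AS "delete" FROM tab WHERE varx = 20
-- i.e. a harmless SELECT; no row is ever deleted.
SELECT 10 DELETE FROM tab WHERE varx = 20;

-- PostgreSQL 13 and older reject the same statement:
-- ERROR:  syntax error at or near "DELETE"
```

That is why the broken function body silently "does nothing" on newer releases - the DELETE statement was swallowed into the assignment's SELECT.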
I wrote a patch for Postgres that implements a new extra check that can detect this issue: https://commitfest.postgresql.org/52/5044/. But this patch has been waiting for review for half a year. I am not sure how many PL/pgSQL developers use extra checks.
Today I found a way to detect this issue without the necessity of modifying the SQL parser, and I wrote a new check for plpgsql_check (it is merged in the master branch):
(2025-02-05 22:17:04) postgres=# create or replace function foofoo()
returns void as $$
declare x int;
begin
x := 1 delete from foo where x = 10 ;
end;
$$ language plpgsql;
CREATE FUNCTION
(2025-02-05 22:17:07) postgres=# select plpgsql_check_function('foofoo()');
┌────────────────────────────────────────────────────────────────────┐
│ plpgsql_check_function │
╞════════════════════════════════════════════════════════════════════╡
│ warning extra:00000:4:assignment:expression is not pure expression │
│ Query: x := 1 delete from foo where x = 10 │
│ -- ^ │
│ Detail: there is a possibility of unwanted behave │
│ Context: at assignment to variable "x" declared on line 2 │
│ warning extra:00000:2:DECLARE:never read variable "x" │
└────────────────────────────────────────────────────────────────────┘
(6 rows)
I compiled and uploaded zip files with the latest orafce and plpgsql_check for PostgreSQL 16 and PostgreSQL 17 - I used Microsoft Visual C++ 2022.
Setup:
This was the first time I used the meson build system, and I was able to run the regress tests. Fortunately, meson removes a lot of monkey work; unfortunately, not all of it - I didn't find a way to use ninja install with UAC.
One of my customers asked me: "What tables hold references to the table pg_roles?"
The answer is simple - none. pg_roles is a view. But this view uses the very important table pg_authid. For custom tables we can use this query:
SELECT conname, conrelid::pg_catalog.regclass AS ontable,
       pg_catalog.pg_get_constraintdef(oid, true) AS condef
  FROM pg_catalog.pg_constraint c
 WHERE confrelid = 'a'::pg_catalog.regclass AND contype = 'f'
 ORDER BY conname;

┌─────────────┬─────────┬─────────────────────────────────────┐
│   conname   │ ontable │               condef                │
╞═════════════╪═════════╪═════════════════════════════════════╡
│ b_a_id_fkey │ b       │ FOREIGN KEY (a_id) REFERENCES a(id) │
└─────────────┴─────────┴─────────────────────────────────────┘
(1 row)

But this method doesn't work for system tables - they don't use explicitly defined foreign keys. We have to use a different method. The system function pg_get_catalog_foreign_keys returns all references between system catalogs, and we can filter its result:
SELECT *
  FROM pg_get_catalog_foreign_keys()
 WHERE pktable = 'pg_authid'::regclass;

┌─────────────────────────┬──────────────┬───────────┬────────┬──────────┬────────┐
│ fktable                 │ fkcols       │ pktable   │ pkcols │ is_array │ is_opt │
╞═════════════════════════╪══════════════╪═══════════╪════════╪══════════╪════════╡
│ pg_proc                 │ {proowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_type                 │ {typowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_class                │ {relowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_operator             │ {oprowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_opfamily             │ {opfowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_opclass              │ {opcowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_language             │ {lanowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_largeobject_metadata │ {lomowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_statistic_ext        │ {stxowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_event_trigger        │ {evtowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_namespace            │ {nspowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_conversion           │ {conowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_database             │ {datdba}     │ pg_authid │ {oid}  │ f        │ f      │
│ pg_db_role_setting      │ {setrole}    │ pg_authid │ {oid}  │ f        │ t      │
│ pg_tablespace           │ {spcowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_auth_members         │ {roleid}     │ pg_authid │ {oid}  │ f        │ f      │
│ pg_auth_members         │ {member}     │ pg_authid │ {oid}  │ f        │ f      │
│ pg_auth_members         │ {grantor}    │ pg_authid │ {oid}  │ f        │ f      │
│ pg_ts_config            │ {cfgowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_ts_dict              │ {dictowner}  │ pg_authid │ {oid}  │ f        │ f      │
│ pg_extension            │ {extowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_foreign_data_wrapper │ {fdwowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_foreign_server       │ {srvowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_user_mapping         │ {umuser}     │ pg_authid │ {oid}  │ f        │ t      │
│ pg_policy               │ {polroles}   │ pg_authid │ {oid}  │ t        │ t      │
│ pg_default_acl          │ {defaclrole} │ pg_authid │ {oid}  │ f        │ f      │
│ pg_collation            │ {collowner}  │ pg_authid │ {oid}  │ f        │ f      │
│ pg_publication          │ {pubowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_subscription         │ {subowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_variable             │ {varowner}   │ pg_authid │ {oid}  │ f        │ f      │
└─────────────────────────┴──────────────┴───────────┴────────┴──────────┴────────┘
(30 rows)
I needed to summarize the number of bugs per week and per user.
A JSON log entry looks like:

{"timestamp":"2024-06-07 00:10:56.525 CEST","user":"backend","dbname":"prd","pid":3557301,"remote_host":"199.16.203.8", "remote_port":31315,"session_id":"666229de.3647b5","line_num":3,"ps":"INSERT","session_start":"2024-06-06 23:27:58 CEST","vxid":"67/48252790", "txid":2033150293,"error_severity":"ERROR","state_code":"23505","message":"duplicate key value violates unique constraint \"payment_from_account_trid_key\"", "detail":"Key (trid)=(440607272834519) already exists.","statement":"insert into ... values ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10)\nRETURNING *","application_name":"PostgreSQL JDBC Driver", "backend_type":"client backend","query_id":8536052635720301574}
cat postgresql-*.json | \
  jq -r 'select(.error_severity=="ERROR")
         | select(.application_name | (startswith("pgAdmin") or startswith("IntelliJ")))' | \
  jq -rs 'group_by(.user, .message)
          | map({user: .[0].user, message: .[0].message, count: length})
          | .[] | [.count, .user, .message] | @tsv'

I had to learn that grouping inside jq is allowed only on a JSON array (and the result is an array again), so I needed to merge the input records into one array by using the -s, --slurp option. For the final processing, I needed to translate the result array back into a set of records with the syntax .[]. The transformation to TSV (tab separated values) is used for better readability than CSV.
The result in TSV format can look like:

1	lukas	relation "item_image" does not exist
1	lukas	relation "item_image_share" does not exist
3	petr	UNION types "char" and text cannot be matched
2	petr	canceling statement due to user request
6	petr	current transaction is aborted, commands ignored until end of transaction block
1	petr	duplicate key value violates unique constraint "idx_op"
1	prd_jenkins_flyway	relation "item_shipping" does not exist
The jq language is powerful and strange. It is strange to use the | pipe symbol inside an expression - the filter on application_name looks like:

.application_name | startswith("xxx") or startswith("yyyy")

Unfortunately, I didn't find any documentation named "a gentle introduction to jq for people who know SQL and C", so the jq language looks scary, but it works pretty well.
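The slurp-then-group pattern can be tried on a tiny synthetic input. The user names and messages below are made up for illustration; the jq program is the same shape as in the pipeline above:

```shell
# Three synthetic log records, one JSON object per line,
# the way PostgreSQL's jsonlog format writes them:
printf '%s\n' \
  '{"user":"petr","error_severity":"ERROR","message":"division by zero"}' \
  '{"user":"petr","error_severity":"ERROR","message":"division by zero"}' \
  '{"user":"lukas","error_severity":"ERROR","message":"value too long"}' |
jq -rs 'group_by(.user, .message)
        | map({user: .[0].user, message: .[0].message, count: length})
        | .[]
        | [.count, .user, .message] | @tsv'
# output (tab separated; group_by also sorts by the grouping keys):
# 1	lukas	value too long
# 2	petr	division by zero
```

Note that -s (--slurp) is what turns the stream of objects into a single array, which is required because group_by only works on arrays; .[] then unwraps the mapped array back into a stream of TSV rows.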