Postgres Performance
In this blog, we will see how to analyze the workload, that is, the queries
that are running, and then review some basic configuration parameters to
improve the performance of our PostgreSQL database. The list of PostgreSQL
parameters is extensive, so we will only touch on some of the key ones;
however, you can always consult the official documentation to delve into the
parameters and settings that seem most important or useful in your
environment.
EXPLAIN
One of the first steps we can take to understand how to improve the
performance of our database is to analyze the queries that are made.
PostgreSQL devises a query plan for each query it receives. To see this plan,
we will use EXPLAIN.
The structure of a query plan is a tree of plan nodes. The nodes in the lower
level of the tree are scan nodes. They return raw rows from a table. There are
different types of scan nodes for different methods of accessing the table. The
EXPLAIN output has a line for each node in the plan tree.
world=# EXPLAIN SELECT * FROM city t1, country t2 WHERE id > 100 AND t1.population > 700000 AND t2.population < 7000000;
                                QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=0.00..734.81 rows=50662 width=144)
   ->  Seq Scan on city t1  (cost=0.00..93.19 rows=347 width=31)
         Filter: ((id > 100) AND (population > 700000))
   ->  Materialize  (cost=0.00..8.72 rows=146 width=113)
         ->  Seq Scan on country t2  (cost=0.00..7.99 rows=146 width=113)
               Filter: (population < 7000000)
(6 rows)
This command shows how the tables in our query will be scanned. Let's see
what the values we can observe in our EXPLAIN output correspond to.
The first part of each line shows the operation that the engine performs on
the data in that step.
Estimated start-up cost: the time spent before the output phase can begin.
Estimated total cost: stated on the assumption that the plan node is run to
completion. In practice, a node's parent node might stop short of reading all
available rows.
Estimated number of rows output by this plan node, again assuming the node
is run to completion.
Estimated average width (in bytes) of the rows output by this plan node.
The most critical part of the display is the estimated statement execution cost,
which is the planner's guess at how long it will take to run the statement.
When comparing how effective one query is against another, in practice we
will be comparing their cost values.
It's important to understand that the cost of an upper-level node includes the
cost of all its child nodes. It's also important to realize that the cost only
reflects things that the planner cares about. In particular, the cost does not
consider the time spent transmitting result rows to the client, which could be
an important factor in the real elapsed time; but the planner ignores it because
it cannot change it by altering the plan.
The costs are measured in arbitrary units determined by the planner's cost
parameters. Traditional practice is to measure the costs in units of disk page
fetches; that is, seq_page_cost is conventionally set to 1.0 and the other cost
parameters are set relative to that.
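If you are curious about the units in use, the planner's cost parameters can be inspected from psql; the defaults noted in the comments below are the standard ones and may have been changed in your installation.
world=# SHOW seq_page_cost;       -- cost of a sequential page fetch, 1 by default
world=# SHOW random_page_cost;    -- cost of a non-sequential page fetch, 4 by default
world=# SHOW cpu_tuple_cost;      -- cost of processing one row, 0.01 by default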
EXPLAIN ANALYZE
With this option, EXPLAIN executes the query, and then displays the true row
counts and true run time accumulated within each plan node, along with the
same estimates that a plain EXPLAIN shows.
Let's see an example of the use of this tool.
world=# EXPLAIN ANALYZE SELECT * FROM city t1, country t2 WHERE id > 100 AND t1.population > 700000 AND t2.population < 7000000;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..734.81 rows=50662 width=144) (actual time=0.081..22.066 ...)
   ->  Seq Scan on city t1  (cost=0.00..93.19 rows=347 width=31) (actual time=0.06...)
         Filter: ((id > 100) AND (population > 700000))
         Rows Removed by Filter: 3729
   ->  Materialize  (cost=0.00..8.72 rows=146 width=113) (actual time=0.000..0.011 ...)
         ->  Seq Scan on country t2  (cost=0.00..7.99 rows=146 width=113) (actual ... loops=1)
               Filter: (population < 7000000)
               Rows Removed by Filter: 93
 Planning time: 0.136 ms
 Execution time: 24.627 ms
(10 rows)
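Keep in mind that EXPLAIN ANALYZE actually executes the statement. To analyze a statement that modifies data without keeping its effects, it can be wrapped in a transaction that is then rolled back; a minimal sketch (the UPDATE is just a hypothetical example):
world=# BEGIN;
world=# EXPLAIN ANALYZE UPDATE city SET population = population + 1 WHERE id > 100;
world=# ROLLBACK;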
VACUUM
The VACUUM process is responsible for several maintenance tasks within the
database, one of them recovering storage occupied by dead tuples. In the
normal operation of PostgreSQL, tuples that are deleted or obsoleted by an
update are not physically removed from their table; they remain present until a
VACUUM is performed. Therefore, it is necessary to do the VACUUM
periodically, especially in frequently updated tables.
If VACUUM is taking too much time or too many resources, it usually means
that we should run it more frequently, so that each run has less to clean up.
In some cases you may need to disable it, for example when loading data in
large quantities.
The VACUUM simply recovers space and makes it available for reuse. This
form of the command can operate in parallel with the normal reading and
writing of the table, since an exclusive lock is not obtained. However, the
additional space is not returned to the operating system (in most cases); it is
only available for reuse within the same table.
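A plain VACUUM can also be run manually on a single table; adding VERBOSE prints a report of the activity. The city table is just an example taken from the world database used above.
world=# VACUUM VERBOSE city;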
VACUUM FULL rewrites the entire contents of the table into a new disk file
with no wasted space, which allows the unused space to be returned to the
operating system. This form is much slower and requires an exclusive lock on
each table while it is being processed.
ANALYZE collects statistics on the contents of the tables in the database and
stores the results in pg_statistic. Subsequently, the query planner uses these
statistics to help determine the most efficient execution plans for queries.
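VACUUM and ANALYZE can be combined in a single command, and the pg_stat_user_tables view records when each table was last vacuumed or analyzed; a short sketch, again using the city table as an example:
world=# VACUUM ANALYZE city;
world=# SELECT relname, last_vacuum, last_autovacuum, last_analyze
        FROM pg_stat_user_tables
        WHERE relname = 'city';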
Configuration parameters
max_connections
Determines the maximum number of concurrent connections to the database server.
superuser_reserved_connections
Determines the number of connection slots reserved for connections by PostgreSQL superusers, so that an administrator can still connect when max_connections has been reached.
shared_buffers
Sets the amount of memory that the database server uses for shared memory
buffers. If you have a dedicated database server with 1 GB or more of RAM, a
reasonable initial value for shared_buffers is 25% of your system's memory.
Larger configurations for shared_buffers generally require a corresponding
increase in max_wal_size, to extend the process of writing large amounts of
new or modified data over a longer period of time.
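On recent versions, shared_buffers can be changed with ALTER SYSTEM, but it only takes effect after a server restart; the 2GB value below is just an illustration for a machine with around 8 GB of RAM.
world=# ALTER SYSTEM SET shared_buffers = '2GB';
world=# SHOW shared_buffers;   -- still shows the old value until the server is restarted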
temp_buffers
Sets the maximum number of temporary buffers used by each session. These are
session-local buffers used only to access temporary tables. A session
allocates temporary buffers as needed, up to the limit given by temp_buffers.
work_mem
Specifies the amount of memory to be used by internal operations such as
ORDER BY, DISTINCT, JOIN, and hash tables before writing to temporary files
on disk. When setting this value we must take into account that several
sessions may be executing these operations at the same time, and each
operation will be allowed to use this much memory before it starts writing
data to temporary files.
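work_mem can also be set per session, which is handy for a single heavy report, and EXPLAIN ANALYZE shows whether a sort ran in memory or spilled to disk. A sketch, using the city table as an example:
world=# SET work_mem = '64MB';
world=# EXPLAIN ANALYZE SELECT * FROM city ORDER BY population;
-- look for "Sort Method: quicksort" (in memory) versus "Sort Method: external merge" (on disk)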
maintenance_work_mem
Specifies the maximum amount of memory used by maintenance operations such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. Since usually only one of these operations runs at a time in a session, it can safely be set larger than work_mem.
fsync
If fsync is enabled, PostgreSQL will try to make sure that the updates are
physically written to the disk. This ensures that the database cluster can be
recovered to a consistent state after an operating system or hardware crash.
While disabling fsync generally improves performance, it can cause data loss
in the event of a power failure or a system crash. Therefore, it is only
advisable to deactivate fsync if you can easily recreate your entire database
from external data.
max_wal_size
Maximum size the WAL is allowed to grow to between checkpoints. The size of
the WAL can exceed max_wal_size in special circumstances. Increasing this
parameter can increase the amount of time needed for crash recovery.
min_wal_size
As long as WAL disk usage stays below this value, WAL files are recycled for
future use at a checkpoint instead of being deleted. This can be used to
ensure that enough WAL space is reserved to handle spikes in WAL usage, for
example when executing large batch jobs.
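Both limits can be changed without restarting the server; a configuration reload is enough. The values below are only illustrative.
world=# ALTER SYSTEM SET max_wal_size = '2GB';
world=# ALTER SYSTEM SET min_wal_size = '512MB';
world=# SELECT pg_reload_conf();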
wal_sync_method
Method used to force WAL updates to the disk. If fsync is disabled, this setting
has no effect.
wal_buffers
The amount of shared memory used for WAL data that has not yet been
written to disk. The default setting is about 3% of shared_buffers, not less
than 64KB or more than the size of a WAL segment (usually 16MB). Setting
this value to at least a few MB can improve write performance on a server with
many concurrent transactions.
effective_cache_size
This value gives the query planner an estimate of the memory available for
caching data, so it can weigh plans that may or may not fit in memory. It is
taken into account in the cost estimates of using an index; a high value
makes it more likely that index scans will be used, and a low value makes it
more likely that sequential scans will be used. A reasonable value would be
about 50% of the RAM.
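Since effective_cache_size does not allocate any memory and is only a hint to the planner, it can be changed with a simple reload; for example, on a server with 8 GB of RAM (illustrative value):
world=# ALTER SYSTEM SET effective_cache_size = '4GB';
world=# SELECT pg_reload_conf();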
default_statistics_target
Sets the default statistics target for table columns, used by ANALYZE. Larger
values increase the time needed for ANALYZE but can improve the quality of
the planner's estimates.
synchronous_commit
Specifies whether the transaction commit will wait for the WAL records to be
written to disk before the command returns a "success" indication to the
client. The possible values are "on", "remote_apply", "remote_write", "local",
and "off". The default setting is "on". When it is disabled, there may be a
delay between the moment success is reported to the client and the moment the
transaction is guaranteed to be safe against a server crash. Unlike fsync,
disabling this parameter does not create any risk of database inconsistency:
a crash of the operating system or of the database may result in the loss of
some recent, supposedly committed transactions, but the state of the database
will be exactly the same as if those transactions had been cancelled cleanly.
Therefore, disabling synchronous_commit can be a useful alternative when
performance is more important than exact certainty about the durability of a
transaction.
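Because synchronous_commit can be changed per session, a common pattern is to disable it only for work whose loss would be acceptable, such as a bulk load; a minimal sketch (the COPY path is hypothetical):
world=# SET synchronous_commit = off;
world=# COPY city FROM '/tmp/city.csv' WITH (FORMAT csv);   -- hypothetical bulk load
world=# SET synchronous_commit = on;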
Logging
There are several kinds of information that can be logged, some more useful
than others, such as slow queries, checkpoints, and connections.
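For illustration, these are some commonly used logging settings; the threshold for slow statements is just an example value.
world=# ALTER SYSTEM SET log_min_duration_statement = '500ms';   -- log statements slower than 500 ms
world=# ALTER SYSTEM SET log_checkpoints = on;
world=# ALTER SYSTEM SET log_connections = on;
world=# SELECT pg_reload_conf();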
Design
In many cases, the design of our database can affect performance. We must
be careful in our design, normalizing our schema and avoiding redundant
data. In many cases it is convenient to have several small tables instead of
one huge table. But as we said before, everything depends on our system, and
there is no single solution that fits every case.
Hardware
CPU: It may seem obvious to say, but the more CPU we have, the better. It is
not the most important factor in terms of hardware, but a good CPU improves
our processing capacity, and that directly impacts our database.
Hard disk: There are several types of disks we can use: SCSI, SATA, SAS, and
IDE, as well as solid-state drives. We must weigh price against performance
when choosing. The type of disk is not the only thing to consider; we must
also decide how to configure the disks. If we want good performance, we can
use RAID 10, keeping the WAL on a separate disk outside the RAID. RAID 5 is
not recommended, since the performance of this type of RAID for databases is
not good.