Commit 607b276

MasahikoSawada authored and Commitfest Bot committed
Support parallelism for collecting dead items during lazy vacuum.
This feature allows vacuum to use multiple CPUs to collect dead items (i.e., the first pass over the heap table) with parallel workers. The parallel degree for parallel heap vacuuming is determined by the number of blocks to vacuum unless the PARALLEL option of the VACUUM command is specified, and is further limited by max_parallel_maintenance_workers.

For the parallel heap scan that collects dead items, we use a parallel block table scan, controlled by ParallelBlockTableScanDesc, in conjunction with the read stream. The workers' parallel scan descriptors are stored in the DSM space, so that different parallel workers can resume the heap scan (phase 1) from their previous state after a cycle of heap vacuuming and index vacuuming (phases 2 and 3).

However, because the read stream's look-ahead mechanism may already have loaded and pinned buffers, we cannot abruptly stop phase 1 even when the space used by dead_items TIDs exceeds the limit. Therefore, once that limit is exceeded, we process the remaining pages without retrieving additional blocks through the look-ahead mechanism until the read stream is exhausted, even though the memory limit has been surpassed. While this approach may increase memory usage, it typically isn't a significant problem, because processing a few tens to hundreds of buffers doesn't substantially increase the size of dead_items TIDs.

When the parallel heap scan to collect dead items is enabled, we disable eager scanning. This is because parallel vacuum is available only through the VACUUM command and is not expected to run frequently, which doesn't align with the purpose of eager scanning.

Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Hayato Kuroda <[email protected]>
Reviewed-by: Peter Smith <[email protected]>
Reviewed-by: Tomas Vondra <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>
Reviewed-by: Melanie Plageman <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/CAD21AoAEfCNv-GgaDheDJ+s-p_Lv1H24AiJeNoPGCmZNSwL1YA@mail.gmail.com
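
The memory-limit behavior described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyStream`, `toy_stream_fill`, `toy_stream_next`, and `toy_collect_cycle` are hypothetical names, not PostgreSQL functions, and the real read stream is considerably more involved. The model shows two things: once the dead-item budget is exceeded, look-ahead is switched off and only already-pinned blocks are drained, and the next cycle resumes from the saved scan position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names come from PostgreSQL. */
typedef struct ToyStream
{
    unsigned next_block;    /* next block the stream would read ahead */
    unsigned nblocks;       /* total number of blocks in the relation */
    unsigned pinned;        /* blocks already read ahead and still pinned */
    unsigned lookahead;     /* maximum look-ahead distance */
    int      lookahead_on;  /* cleared once the memory limit is exceeded */
} ToyStream;

/* Top up the look-ahead queue, unless look-ahead has been switched off. */
static void
toy_stream_fill(ToyStream *s)
{
    while (s->lookahead_on && s->pinned < s->lookahead &&
           s->next_block < s->nblocks)
    {
        s->next_block++;
        s->pinned++;
    }
}

/* Hand out one pinned block; returns 0 once the stream is exhausted. */
static int
toy_stream_next(ToyStream *s)
{
    toy_stream_fill(s);
    if (s->pinned == 0)
        return 0;
    s->pinned--;
    return 1;
}

/*
 * Run one cycle of phase 1 and return the number of blocks processed.
 * Each block is assumed to add bytes_per_block of dead-item TIDs.
 */
static unsigned
toy_collect_cycle(ToyStream *s, size_t bytes_per_block, size_t limit,
                  size_t *dead_bytes)
{
    unsigned processed = 0;

    assert(s->lookahead > 0);
    s->lookahead_on = 1;        /* a new cycle may read ahead again */

    while (toy_stream_next(s))
    {
        *dead_bytes += bytes_per_block;
        processed++;

        /* Over budget: stop reading ahead, but keep draining pinned blocks. */
        if (*dead_bytes > limit)
            s->lookahead_on = 0;
    }
    return processed;
}
```

In this model the overshoot past the limit is bounded by the look-ahead distance, which matches the commit message's argument that the extra memory is small.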
1 parent 6f87345 commit 607b276

9 files changed (+989 -121 lines changed)

doc/src/sgml/ref/vacuum.sgml

+35 -19
@@ -280,25 +280,41 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
 <term><literal>PARALLEL</literal></term>
 <listitem>
 <para>
-Perform index vacuum and index cleanup phases of <command>VACUUM</command>
-in parallel using <replaceable class="parameter">integer</replaceable>
-background workers (for the details of each vacuum phase, please
-refer to <xref linkend="vacuum-phases"/>). The number of workers used
-to perform the operation is equal to the number of indexes on the
-relation that support parallel vacuum which is limited by the number of
-workers specified with <literal>PARALLEL</literal> option if any which is
-further limited by <xref linkend="guc-max-parallel-maintenance-workers"/>.
-An index can participate in parallel vacuum if and only if the size of the
-index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
-Please note that it is not guaranteed that the number of parallel workers
-specified in <replaceable class="parameter">integer</replaceable> will be
-used during execution. It is possible for a vacuum to run with fewer
-workers than specified, or even with no workers at all. Only one worker
-can be used per index. So parallel workers are launched only when there
-are at least <literal>2</literal> indexes in the table. Workers for
-vacuum are launched before the start of each phase and exit at the end of
-the phase. These behaviors might change in a future release. This
-option can't be used with the <literal>FULL</literal> option.
+Perform scanning heap, index vacuum, and index cleanup phases of
+<command>VACUUM</command> in parallel using
+<replaceable class="parameter">integer</replaceable> background workers
+(for the details of each vacuum phase, please refer to
+<xref linkend="vacuum-phases"/>).
+</para>
+<para>
+For heap tables, the number of workers used to perform the scanning
+heap is determined based on the size of table. A table can participate in
+parallel scanning heap if and only if the size of the table is more than
+<xref linkend="guc-min-parallel-table-scan-size"/>. During scanning heap,
+the heap table's blocks will be divided into ranges and shared among the
+cooperating processes. Each worker process will complete the scanning of
+its given range of blocks before requesting an additional range of blocks.
+</para>
+<para>
+The number of workers used to perform parallel index vacuum and index
+cleanup is equal to the number of indexes on the relation that support
+parallel vacuum. An index can participate in parallel vacuum if and only
+if the size of the index is more than <xref linkend="guc-min-parallel-index-scan-size"/>.
+Only one worker can be used per index. So parallel workers for index vacuum
+and index cleanup are launched only when there are at least <literal>2</literal>
+indexes in the table.
+</para>
+<para>
+Workers for vacuum are launched before the start of each phase and exit
+at the end of the phase. The number of workers for each phase is limited by
+the number of workers specified with <literal>PARALLEL</literal> option if
+any which is further limited by <xref linkend="guc-max-parallel-maintenance-workers"/>.
+Please note that in any parallel vacuum phase, it is not guaranteed that the
+number of parallel workers specified in <replaceable class="parameter">integer</replaceable>
+will be used during execution. It is possible for a vacuum to run with fewer
+workers than specified, or even with no workers at all. These behaviors might
+change in a future release. This option can't be used with the <literal>FULL</literal>
+option.
 </para>
 </listitem>
 </varlistentry>
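
The worker-count rules in the new documentation text can be sketched as plain arithmetic. This is a minimal illustration that follows only the wording above; `toy_cap_workers`, `toy_index_vacuum_workers`, and `toy_heap_scan_is_parallel` are hypothetical names, a negative `parallel_option` is an assumed convention for "PARALLEL not given", and the sketch does not model how the server picks the heap-scan worker count from the table size or whether the leader process takes part.

```c
#include <assert.h>

/*
 * Hypothetical helper: limit a desired worker count by the PARALLEL option
 * (negative means the option was not given, an assumption of this sketch)
 * and then by max_parallel_maintenance_workers.
 */
static int
toy_cap_workers(int wanted, int parallel_option, int max_maintenance_workers)
{
    assert(wanted >= 0 && max_maintenance_workers >= 0);

    if (parallel_option >= 0 && wanted > parallel_option)
        wanted = parallel_option;
    if (wanted > max_maintenance_workers)
        wanted = max_maintenance_workers;
    return wanted;
}

/*
 * Index vacuum and index cleanup: one worker per index whose size exceeds
 * min_parallel_index_scan_size, and none unless the table has at least
 * two indexes.
 */
static int
toy_index_vacuum_workers(const long *index_sizes, int nindexes,
                         long min_index_scan_size,
                         int parallel_option, int max_maintenance_workers)
{
    int eligible = 0;

    if (nindexes < 2)
        return 0;
    for (int i = 0; i < nindexes; i++)
    {
        if (index_sizes[i] > min_index_scan_size)
            eligible++;
    }
    return toy_cap_workers(eligible, parallel_option, max_maintenance_workers);
}

/* Heap scan: the table must be larger than min_parallel_table_scan_size. */
static int
toy_heap_scan_is_parallel(long table_size, long min_table_scan_size)
{
    return table_size > min_table_scan_size;
}
```

For example, with index sizes `{100, 50, 200}` and a threshold of 64, two indexes are eligible; `PARALLEL 1` would reduce that to one worker, and a `max_parallel_maintenance_workers` of 0 would reduce it to none.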

src/backend/access/heap/heapam_handler.c

+4
@@ -2671,6 +2671,10 @@ static const TableAmRoutine heapam_methods = {
 	.scan_sample_next_tuple = heapam_scan_sample_next_tuple,
 
 	.parallel_vacuum_compute_workers = heap_parallel_vacuum_compute_workers,
+	.parallel_vacuum_estimate = heap_parallel_vacuum_estimate,
+	.parallel_vacuum_initialize = heap_parallel_vacuum_initialize,
+	.parallel_vacuum_initialize_worker = heap_parallel_vacuum_initialize_worker,
+	.parallel_vacuum_collect_dead_items = heap_parallel_vacuum_collect_dead_items,
 };
 
 
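
The four callbacks registered above suggest an estimate, initialize, attach, and collect sequence. The sketch below is a toy illustration of that pattern, not the real interface: the actual callback signatures are not shown in this diff, so every type, name, and signature here (`ToyShared`, `ToyWorker`, `ToyVacuumRoutine`, and the `toy_*` functions) is hypothetical, and the call order is an assumption inferred from the callback names. It also shows the block-range sharing from the documentation text: each worker finishes its range before asking for another one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy shared state; stands in for whatever the real callbacks keep in DSM. */
typedef struct ToyShared
{
    unsigned next_block;    /* first block that has not been handed out */
    unsigned nblocks;       /* total number of blocks in the table */
    unsigned chunk;         /* blocks handed out per request */
} ToyShared;

typedef struct ToyWorker
{
    ToyShared *shared;
    unsigned   blocks_scanned;
} ToyWorker;

/* Hypothetical, simplified signatures; not the real TableAmRoutine ones. */
typedef struct ToyVacuumRoutine
{
    size_t   (*estimate) (void);
    void     (*initialize) (void *space, unsigned nblocks, unsigned chunk);
    void     (*initialize_worker) (ToyWorker *w, void *space);
    unsigned (*collect_dead_items) (ToyWorker *w);
} ToyVacuumRoutine;

/* Report how much shared memory the scan state needs. */
static size_t
toy_estimate(void)
{
    return sizeof(ToyShared);
}

/* Leader side: set up the shared scan state in the reserved space. */
static void
toy_initialize(void *space, unsigned nblocks, unsigned chunk)
{
    ToyShared *sh = space;

    assert(chunk > 0);
    sh->next_block = 0;
    sh->nblocks = nblocks;
    sh->chunk = chunk;
}

/* Worker side: attach to the shared scan state. */
static void
toy_initialize_worker(ToyWorker *w, void *space)
{
    w->shared = space;
    w->blocks_scanned = 0;
}

/*
 * Claim one range of blocks and "scan" it; returns the number of blocks
 * scanned, or 0 when no range is left. Real workers run concurrently and
 * would need an atomic or locked update of next_block.
 */
static unsigned
toy_collect_dead_items(ToyWorker *w)
{
    ToyShared *sh = w->shared;
    unsigned   start = sh->next_block;
    unsigned   end = start + sh->chunk;

    if (start >= sh->nblocks)
        return 0;
    if (end > sh->nblocks)
        end = sh->nblocks;
    sh->next_block = end;
    w->blocks_scanned += end - start;
    return end - start;
}

static const ToyVacuumRoutine toy_methods = {
    .estimate = toy_estimate,
    .initialize = toy_initialize,
    .initialize_worker = toy_initialize_worker,
    .collect_dead_items = toy_collect_dead_items,
};
```

A driver would call `estimate` to size the shared area, `initialize` once in the leader, `initialize_worker` in each worker, and then `collect_dead_items` repeatedly until it returns 0.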