Re: Vacuum statistics

Lists: pgsql-hackers
From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Vacuum statistics
Date: 2024-05-30 17:33:51
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello, everyone!

I think we don't have enough information to analyze vacuum functionality.

Needless to say that the vacuum is the most important process for a
database system. It prevents problems like table and index bloating and
emergency freezing if we have a wraparound problem. Furthermore, it
keeps the visibility map up to date. On the other hand, because of
incorrectly adjusted aggressive settings of autovacuum it can consume a
lot of computing resources that lead to all queries to the system
running longer.

Nowadays the vacuum gathers statistical information about tables, but it
is important not for optimizer only.

Because the vacuum is an automation process, there are a lot of settings
that determine their aggressive functionality to other objects of the
database. Besides, sometimes it is important to set a correct parameter
for the specified table, because of its dynamic changes.

An administrator of a database needs to set the settings of autovacuum
to have a balance between the vacuum's useful action in the database
system on the one hand, and the overhead of its workload on the other.
However, it is not enough for him to decide on vacuum functionality
through statistical information about the number of vacuum passes
through tables and operational data from progress_vacuum, because it is
available only during vacuum operation and does not provide a strategic
overview over the considered period.

To sum up, an automation vacuum has a strategic behavior because the
frequency of its functionality and resource consumption depends on the
workload of the database. Its workload on the database is minimal for an
append-only table and it is a maximum for the table with a
high-frequency updating. Furthermore, there is a high dependence of the
vacuum load on the number and volume of indexes. Because of the absence
of the visibility map for indexes, the vacuum scans the index
completely, and the worst situation when it needs to do it during a
bloating index situation in a small table.

I suggest gathering information about vacuum resource consumption for
processing indexes and tables and storing it in the table and index
relationships (for example, PgStat_StatTabEntry structure like it has
realized for usual statistics). It will allow us to determine how well
the vacuum is configured and evaluate the effect of overhead on the
system at the strategic level, the vacuum has gathered this information
already, but this valuable information doesn't store it.

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Andrei Zubkov <a(dot)zubkov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-05-30 18:26:38
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 30.05.2024 10:33, Alena Rybakina wrote:
>
> I suggest gathering information about vacuum resource consumption for
> processing indexes and tables and storing it in the table and index
> relationships (for example, PgStat_StatTabEntry structure like it has
> realized for usual statistics). It will allow us to determine how well
> the vacuum is configured and evaluate the effect of overhead on the
> system at the strategic level, the vacuum has gathered this
> information already, but this valuable information doesn't store it.
>
My colleagues and I have prepared a patch that can help to solve this
problem.

We are open to feedback.

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 85.4 KB

From: Andrei Zubkov <zubkov(at)moonset(dot)ru>
To: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-05-30 19:19:26
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Th, 30/05/2024 at 10:33 -0700, Alena Rybakina wrote:
> I suggest gathering information about vacuum resource consumption for
> processing indexes and tables and storing it in the table and index
> relationships (for example, PgStat_StatTabEntry structure like it has
> realized for usual statistics). It will allow us to determine how
> well
> the vacuum is configured and evaluate the effect of overhead on the
> system at the strategic level, the vacuum has gathered this
> information
> already, but this valuable information doesn't store it.
>
It seems a little bit unclear to me, so let me explain a little the
point of a proposition.

As the vacuum process is a backend it has a workload instrumentation.
We have all the basic counters available such as a number of blocks
read, hit and written, time spent on I/O, WAL stats and so on.. Also,
we can easily get some statistics specific to vacuum activity i.e.
number of tuples removed, number of blocks removed, number of VM marks
set and, of course the most important metric - time spent on vacuum
operation.

All those statistics must be stored by the Cumulative Statistics System
on per-relation basis. I mean individual cumulative counters for every
table and every index in the database.

Such counters will provide us a clear view about vacuum workload on
individual objects of the database, providing means to measure the
efficiency of performed vacuum fine tuning.
--
Andrei Zubkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Andrei Zubkov <a(dot)zubkov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-06-07 06:46:30
Message-ID: CAFiTN-s3PG_v6RuozMHtsOeaWuQueGOQeZ3x-ZfPE1jt8hzYrA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 30, 2024 at 11:57 PM Alena Rybakina
<lena(dot)ribackina(at)yandex(dot)ru> wrote:
>
> On 30.05.2024 10:33, Alena Rybakina wrote:
> >
> > I suggest gathering information about vacuum resource consumption for
> > processing indexes and tables and storing it in the table and index
> > relationships (for example, PgStat_StatTabEntry structure like it has
> > realized for usual statistics). It will allow us to determine how well
> > the vacuum is configured and evaluate the effect of overhead on the
> > system at the strategic level, the vacuum has gathered this
> > information already, but this valuable information doesn't store it.
> >
> My colleagues and I have prepared a patch that can help to solve this
> problem.
>
> We are open to feedback.

I was reading through the patch here are some initial comments.

--
+typedef struct LVExtStatCounters
+{
+ TimestampTz time;
+ PGRUsage ru;
+ WalUsage walusage;
+ BufferUsage bufusage;
+ int64 VacuumPageMiss;
+ int64 VacuumPageHit;
+ int64 VacuumPageDirty;
+ double VacuumDelayTime;
+ PgStat_Counter blocks_fetched;
+ PgStat_Counter blocks_hit;
+} LVExtStatCounters;

I noticed that you are storing both pgBufferUsage and
VacuumPage(Hit/Miss/Dirty) stats. Aren't these essentially the same?
It seems they both exist in the system because some code, like
heap_vacuum_rel(), uses pgBufferUsage, while do_analyze_rel() still
relies on the old counters. And there is already a patch to remove
those old counters.

--
+static Datum
+pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type, int ncolumns)
+{

I don't think you need this last parameter (ncolumns) we can anyway
fetch that from tupledesc, so adding an additional parameter
just for checking doesn't look good to me.

--
+ /* Tricky turn here: enforce pgstat to think that our database us dbid */
+
+ MyDatabaseId = dbid;

typo
/think that our database us dbid/think that our database has dbid

Also, remove the blank line between the comment and the next code
block that is related to that comment.

--
VacuumPageDirty = 0;
+ VacuumDelayTime = 0.;

There is an extra "." after 0

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Andrei Zubkov <a(dot)zubkov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-06-08 06:30:47
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! Thank you for your interest in this topic!

On 07.06.2024 09:46, Dilip Kumar wrote:
> On Thu, May 30, 2024 at 11:57 PM Alena Rybakina
> <lena(dot)ribackina(at)yandex(dot)ru> wrote:
>> On 30.05.2024 10:33, Alena Rybakina wrote:
>>> I suggest gathering information about vacuum resource consumption for
>>> processing indexes and tables and storing it in the table and index
>>> relationships (for example, PgStat_StatTabEntry structure like it has
>>> realized for usual statistics). It will allow us to determine how well
>>> the vacuum is configured and evaluate the effect of overhead on the
>>> system at the strategic level, the vacuum has gathered this
>>> information already, but this valuable information doesn't store it.
>>>
>> My colleagues and I have prepared a patch that can help to solve this
>> problem.
>>
>> We are open to feedback.
> I was reading through the patch here are some initial comments.
>
> --
> +typedef struct LVExtStatCounters
> +{
> + TimestampTz time;
> + PGRUsage ru;
> + WalUsage walusage;
> + BufferUsage bufusage;
> + int64 VacuumPageMiss;
> + int64 VacuumPageHit;
> + int64 VacuumPageDirty;
> + double VacuumDelayTime;
> + PgStat_Counter blocks_fetched;
> + PgStat_Counter blocks_hit;
> +} LVExtStatCounters;
>
>
> I noticed that you are storing both pgBufferUsage and
> VacuumPage(Hit/Miss/Dirty) stats. Aren't these essentially the same?
> It seems they both exist in the system because some code, like
> heap_vacuum_rel(), uses pgBufferUsage, while do_analyze_rel() still
> relies on the old counters. And there is already a patch to remove
> those old counters.
I agree with you and I have fixed it.
>
> --
> +static Datum
> +pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type, int ncolumns)
> +{
>
> I don't think you need this last parameter (ncolumns) we can anyway
> fetch that from tupledesc, so adding an additional parameter
> just for checking doesn't look good to me.
To be honest,I'm notsureifncolumns shouldbe deletedat
all,becausethepg_stats_vacuum
functionisusedtodisplaythreedifferenttypesof
statistics:fortables,indexes,
anddatabases.Weusethisparametertopassinformationaboutthe numberof
parameters(orhowmany statisticsweexpect)dependingonthe typeof
statistics.For example,table
vacuumstatisticscontain27parameters,whileindexesanddatabasescontain19and15parameters,
respectively.Youcanseethatthe pg_stats_vacuum functioncontainsan
Assertthatchecksthatthe expectednumberof tupledesc
parametersmatchestheactualnumber.

Assert(tupdesc->natts == ncolumns);

PerhapsIcanconvertitto alocalparameteranddetermineitsvaluealreadyinthe
function,for example:

pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type, int
ncolumns)
{

int columns = 0;

switch (type)

{

case PGSTAT_EXTVAC_HEAP:

ncolumns = EXTVACHEAPSTAT_COLUMNS;

break;

case PGSTAT_EXTVAC_INDEX:

ncolumns = EXTVACINDEXSTAT_COLUMNS;

break;

case PGSTAT_EXTVAC_DB:

ncolumns = EXTVACDBSTAT_COLUMNS;

break;

}

...

}

What do you think?

> --
> + /* Tricky turn here: enforce pgstat to think that our database us dbid */
> +
> + MyDatabaseId = dbid;
>
> typo
> /think that our database us dbid/think that our database has dbid
>
> Also, remove the blank line between the comment and the next code
> block that is related to that comment.
>
>
> --
> VacuumPageDirty = 0;
> + VacuumDelayTime = 0.;
>
> There is an extra "." after 0
>
>
Thank you, I fixed it.

In additionto thesechanges,Ifixedthe
problemwithdisplayingvacuumstatisticsfordatabases:Ifoundan
errorindefiningthe pg_stats_vacuum_database systemview.In
addition,Irewrotethe testsandconvertedthemintoa regressiontest.In
addition,Ihave dividedthe testtotestthe functionalityof the outputof
vacuumstatisticsintotwotests:oneofthemchecksthe functionalityof
tablesanddatabases,andthe other-indexes.Thisis causedby aproblemwiththe
vacuumfunctionalitywhenthe tablecontainsan
index.Youcanfindmoreinformationaboutthishere:[0]and[1].

I attached the diff to this letter.

[0]
https://www.postgresql.org/message-id/d1ca3a1d-7ead-41a7-bfd0-5b66ad97b1cd%40yandex.ru

[1]
https://www.postgresql.org/message-id/CAH2-Wznv94Q_Td8OS8bAN7fYLpfU6CGgjn6Xau5eJ_sDxEGeBA%40mail.gmail.com

Iam currentlyworkingondividingthispatchintothreepartstosimplifythe
reviewprocess:oneofthemwillcontaincodeforcollectingvacuumstatisticsontables,the
secondonindexesandthe lastondatabases.I alsowritethe documentation.

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v2-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 103.3 KB
vacuum_file.diff.no-cfbot text/plain 65.9 KB

From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Andrei Zubkov <a(dot)zubkov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-06-12 06:37:35
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 11.06.2024 16:09, Alena Rybakina wrote:
> On 08.06.2024 09:30, Alena Rybakina wrote:
>>
>> Iam currentlyworkingondividingthispatchintothreepartstosimplifythe
>> reviewprocess:oneofthemwillcontaincodeforcollectingvacuumstatisticsontables,the
>> secondonindexesandthe lastondatabases.
>>
I have divided the patch into three: the first patch containscodeforthe
functionalityof collecting and storage for tables, the second one for
indexes and the last one for databases.

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Andrei Zubkov <a(dot)zubkov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-06-12 06:38:30
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 11.06.2024 16:09, Alena Rybakina wrote:
> On 08.06.2024 09:30, Alena Rybakina wrote:
>>
>> Iam currentlyworkingondividingthispatchintothreepartstosimplifythe
>> reviewprocess:oneofthemwillcontaincodeforcollectingvacuumstatisticsontables,the
>> secondonindexesandthe lastondatabases.
>>
I have divided the patch into three: the first patch containscodeforthe
functionalityof collecting and storage for tables, the second one for
indexes and the last one for databases.

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v3-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 62.5 KB
v3-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 45.7 KB
v3-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 21.8 KB

From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Andrei Zubkov <a(dot)zubkov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-06-17 10:09:51
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I have written the documentary and attached the patch.

On 08.06.2024 09:30, Alena Rybakina wrote:
>
> Iam currentlyworkingondividingthispatchintothreepartstosimplifythe
> reviewprocess:oneofthemwillcontaincodeforcollectingvacuumstatisticsontables,the
> secondonindexesandthe lastondatabases.I alsowritethe documentation.
>
--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v3-0004-Add-documentation-about-the-system-views-that.patch text/x-patch 24.2 KB

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andrei Zubkov <zubkov(at)moonset(dot)ru>
Cc: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-06-27 01:39:45
Message-ID: CAD21AoDQ3bc+ebbhMvDy9eZ=QBnN-mHTAc0569cbnWXjozx0dw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, May 31, 2024 at 4:19 AM Andrei Zubkov <zubkov(at)moonset(dot)ru> wrote:
>
> Hi,
>
> Th, 30/05/2024 at 10:33 -0700, Alena Rybakina wrote:
> > I suggest gathering information about vacuum resource consumption for
> > processing indexes and tables and storing it in the table and index
> > relationships (for example, PgStat_StatTabEntry structure like it has
> > realized for usual statistics). It will allow us to determine how
> > well
> > the vacuum is configured and evaluate the effect of overhead on the
> > system at the strategic level, the vacuum has gathered this
> > information
> > already, but this valuable information doesn't store it.
> >
> It seems a little bit unclear to me, so let me explain a little the
> point of a proposition.
>
> As the vacuum process is a backend it has a workload instrumentation.
> We have all the basic counters available such as a number of blocks
> read, hit and written, time spent on I/O, WAL stats and so on.. Also,
> we can easily get some statistics specific to vacuum activity i.e.
> number of tuples removed, number of blocks removed, number of VM marks
> set and, of course the most important metric - time spent on vacuum
> operation.

I've not reviewed the patch closely but it sounds helpful for users. I
would like to add a statistic, the high-water mark of memory usage of
dead tuple TIDs. Since the amount of memory used by TidStore is hard
to predict, I think showing the high-water mark would help users to
predict how much memory they set to maintenance_work_mem.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


From: Andrei Zubkov <zubkov(at)moonset(dot)ru>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-06-27 08:52:01
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello!

On Thu, 27/06/2024 at 10:39 +0900, Masahiko Sawada:
> On Fri, May 31, 2024 at 4:19 AM Andrei Zubkov <zubkov(at)moonset(dot)ru>
> wrote:
> > As the vacuum process is a backend it has a workload
> > instrumentation.
> > We have all the basic counters available such as a number of blocks
> > read, hit and written, time spent on I/O, WAL stats and so on..
> > Also,
> > we can easily get some statistics specific to vacuum activity i.e.
> > number of tuples removed, number of blocks removed, number of VM
> > marks
> > set and, of course the most important metric - time spent on vacuum
> > operation.
>
> I've not reviewed the patch closely but it sounds helpful for users.
> I
> would like to add a statistic, the high-water mark of memory usage of
> dead tuple TIDs. Since the amount of memory used by TidStore is hard
> to predict, I think showing the high-water mark would help users to
> predict how much memory they set to maintenance_work_mem.
>
Thank you for your interest on this patch. I've understand your idea.
The obvious goal of it is to avoid expensive index multi processing
during vacuum of the heap. Provided statistics in the patch contain the
index_vacuum_count counter for each table which can be compared to the
pg_stat_all_tables.vacuum_count to detect specific relation index
multi-passes. Previous setting of maintenance_work_mem is known. Usage
of TidStore should be proportional to the amount of dead-tuples vacuum
workload on the table, so as the first evaluation we can take the
number of index passes per one heap pass as a maintenance_work_mem
multiplier.

But there is a better way. Once we detected the index multiprocessing
we can lower the vacuum workload for the heap pass making vacuum a
little bit more aggressive for this particular relation. I mean, in
such case increasing maintenance_work_mem is not only decision.

Suggested high-water mark statistic can't be used as cumulative
statistic - any high-water mark statistic as maximim-like statistic is
valid for certain time period thus should be reset on some kind of
schedule. Without resets it should reach 100% once under the heavy load
and stay there forever.

Said that such high-water mark seems a little bit unclear and
complicated for the DBA. It seems redundant to me right now. I can see
the main value of such statistic is to avoid too large
maintenance_work_mem setting. But I can't see really dramatic
consequences of that. Maybe I've miss something..

--
Andrei Zubkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, zubkov(at)moonset(dot)ru, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-10 18:14:27
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello, everyone!

Thank you for your interesting patch with extended information
statistics about autovacuum.

Do you consider not to create new table in pg_catalog but to save
statistics in existing table? I mean pg_class or
pg_stat_progress_analyze, pg_stat_progress_vacuum?

P.S. If I sent this mail twice, I'm sorry :)

Regards

Ilia Evdokimov,

Tantor Labs.


From: Andrei Zubkov <zubkov(at)moonset(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-10 19:37:25
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Hi, Ilia!

> Do you consider not to create new table in pg_catalog but to save
> statistics in existing table? I mean pg_class or
> pg_stat_progress_analyze, pg_stat_progress_vacuum?
>
Thank you for your interest on our patch!

*_progress views is not our case. They hold online statistics while
vacuum is in progress. Once work is done on a table the entry is gone
from those views. Idea of this patch is the opposite - it doesn't
provide online statistics but it accumulates statistics about rosources
consumed by all vacuum passes over all relations. It's much closer to
the pg_stat_all_tables than pg_stat_progress_vacuum.

It seems pg_class is not the right place because it is not a statistic
view - it holds the current relation state and haven't anything about
the relation workload.

Maybe the pg_stat_all_tables is the right place but I have several
thoughts about why it is not:
- Some statistics provided by this patch is really vacuum specific. I
don't think we want them in the relation statistics view.
- Postgres is extreamly extensible. I'm sure someday there will be
table AMs that does not need the vacuum at all.

Right now vacuum specific workload views seems optimal choice to me.

Regards,

--
Andrei Zubkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Andrei Zubkov <a(dot)zubkov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-08-10 20:57:35
Message-ID: CALdSSPgNAVexRd+v1Si356UJyTXWtSHOHsqVDRNhSQ_tDjTseg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 12 Jun 2024 at 11:38, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru> wrote:
>
> Hi!
>
> On 11.06.2024 16:09, Alena Rybakina wrote:
>
> On 08.06.2024 09:30, Alena Rybakina wrote:
>
> I am currently working on dividing this patch into three parts to simplify the review process: one of them will contain code for collecting vacuum statistics on tables, the second on indexes and the last on databases.
>
> I have divided the patch into three: the first patch contains code for the functionality of collecting and storage for tables, the second one for indexes and the last one for databases.
>
> --
> Regards,
> Alena Rybakina
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
Hi!
Few suggestions on this patch-set

1)
> +{ oid => '4701',
> + descr => 'pg_stats_vacuum_tables return stats values',
> + proname => 'pg_stats_vacuum_tables', provolatile => 's', prorettype => 'record',proisstrict => 'f',
> + proretset => 't',
> + proargtypes => 'oid oid',
> + proallargtypes =>

During development, OIDs should be picked up from range 8000-9999.
Same for pg_stats_vacuum_database & pg_stats_vacuum_indexes

Also, why are these function naming schemes like
pg_stats_vacuum_*something*, not pg_stat_vacuum_*something*, like
pg_stat_replication etc?

2) In 0003:
> + proargnames => '{dboid,dboid,db_blks_read,db_blks_hit,total_blks_dirtied,total_blks_written,wal_records,wal_fpi,wal_bytes,blk_read_time,blk_write_time,delay_time,system_time,user_time,total_time,interrupts}',

Repeated dboid arg name is strange. Is it done this way to make
pg_stats_vacuum function call in more unified fashion? I don't see any
other place within postgresql core with similar approach, so I doubt
it is correct.

3) 0001 patch vacuum_tables_statistics test creates
statistic_vacuum_database1, but does not use it. 0003 do.
Also I'm not sure if these additional checks on the second database
adds much value. Can you justify this please?

Other places look more or less fine to me.
However, I'll maybe post some additional nit-picky comments on the
next patch version.

--
Best regards,
Kirill Reshke


From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Andrei Zubkov <a(dot)zubkov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-08-11 13:58:54
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! Thank you for inquiring about our topic!

On 10.08.2024 23:57, Kirill Reshke wrote:
> Hi!
> Few suggestions on this patch-set
>
> 1)
>> +{ oid => '4701',
>> + descr => 'pg_stats_vacuum_tables return stats values',
>> + proname => 'pg_stats_vacuum_tables', provolatile => 's', prorettype => 'record',proisstrict => 'f',
>> + proretset => 't',
>> + proargtypes => 'oid oid',
>> + proallargtypes =>
> During development, OIDs should be picked up from range 8000-9999.
> Same for pg_stats_vacuum_database & pg_stats_vacuum_indexes
>
> Also, why are these function naming schemes like
> pg_stats_vacuum_*something*, not pg_stat_vacuum_*something*, like
> pg_stat_replication etc?
To be honest, when I named it, I missed this aspect. I thought about the
plural vacuum statistics we show, so I named them. I fixed it.
> 2) In 0003:
>> + proargnames => '{dboid,dboid,db_blks_read,db_blks_hit,total_blks_dirtied,total_blks_written,wal_records,wal_fpi,wal_bytes,blk_read_time,blk_write_time,delay_time,system_time,user_time,total_time,interrupts}',
> Repeated dboid arg name is strange. Is it done this way to make
> pg_stats_vacuum function call in more unified fashion? I don't see any
> other place within postgresql core with similar approach, so I doubt
> it is correct.
Both parameters are required for input and output. We are trying to find
statistics for a specific database if the database oid was specified by
the user or display statistics for all objects, but we need to display
which database these statistics are for. I corrected the name of the
first parameter.
> 3) 0001 patch vacuum_tables_statistics test creates
> statistic_vacuum_database1, but does not use it. 0003 do.
> Also I'm not sure if these additional checks on the second database
> adds much value. Can you justify this please?

The statistic_vacuum_database1 needs us to check the visible of
statistics from another database (statistic_vacuum_database) as they are
after the manipulation with tables in another database, and after
deleting the vestat table . In the latter case, we need to be sure that
all the table statistics are not visible to us.

So, I agree that it should be added only in the latest version of the
patch, where we add vacuum statistics for databases. I fixed it.

> Other places look more or less fine to me.
> However, I'll maybe post some additional nit-picky comments on the
> next patch version.
We are glad any feedback and review, so feel free to do it)

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v4-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 66.1 KB
v4-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 41.4 KB
v4-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 19.2 KB
v4-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.2 KB

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-13 13:18:48
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10.8.24 22:37, Andrei Zubkov wrote:

> Hi, Ilia!
>
>> Do you consider not to create new table in pg_catalog but to save
>> statistics in existing table? I mean pg_class or
>> pg_stat_progress_analyze, pg_stat_progress_vacuum?
>>
> Thank you for your interest on our patch!
>
> *_progress views is not our case. They hold online statistics while
> vacuum is in progress. Once work is done on a table the entry is gone
> from those views. Idea of this patch is the opposite - it doesn't
> provide online statistics but it accumulates statistics about rosources
> consumed by all vacuum passes over all relations. It's much closer to
> the pg_stat_all_tables than pg_stat_progress_vacuum.
>
> It seems pg_class is not the right place because it is not a statistic
> view - it holds the current relation state and haven't anything about
> the relation workload.
>
> Maybe the pg_stat_all_tables is the right place but I have several
> thoughts about why it is not:
> - Some statistics provided by this patch is really vacuum specific. I
> don't think we want them in the relation statistics view.
> - Postgres is extreamly extensible. I'm sure someday there will be
> table AMs that does not need the vacuum at all.
>
> Right now vacuum specific workload views seems optimal choice to me.
>
> Regards,

Agreed. They are not god places to store such statistics.

I have some suggestions:

1. pgstatfuncs.c in functions tuplestore_put_for_database() and
tuplestore_put_for_relation you can remove 'nulls' array if you're
sure that columns cannot be NULL.
2. These functions are almost the same and I would think of writing one
function depending of type 'ExtVacReportType'


From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-13 13:37:41
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

And I have one suggestion for pg_stat_vacuum_database: I suppose we
should add database's name column after 'dboid' column because it is
difficult to read statistics without database's name. We could call it
'datname' just like in 'pg_stat_database' view.

Regards,

Ilia Evdokimov,
Tantor Labs LCC.


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-15 08:49:36
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 13.08.2024 16:18, Ilia Evdokimov wrote:
>
> On 10.8.24 22:37, Andrei Zubkov wrote:
>
>> Hi, Ilia!
>>
>>> Do you consider not to create new table in pg_catalog but to save
>>> statistics in existing table? I mean pg_class or
>>> pg_stat_progress_analyze, pg_stat_progress_vacuum?
>>>
>> Thank you for your interest on our patch!
>>
>> *_progress views is not our case. They hold online statistics while
>> vacuum is in progress. Once work is done on a table the entry is gone
>> from those views. Idea of this patch is the opposite - it doesn't
>> provide online statistics but it accumulates statistics about rosources
>> consumed by all vacuum passes over all relations. It's much closer to
>> the pg_stat_all_tables than pg_stat_progress_vacuum.
>>
>> It seems pg_class is not the right place because it is not a statistic
>> view - it holds the current relation state and haven't anything about
>> the relation workload.
>>
>> Maybe the pg_stat_all_tables is the right place but I have several
>> thoughts about why it is not:
>> - Some statistics provided by this patch is really vacuum specific. I
>> don't think we want them in the relation statistics view.
>> - Postgres is extreamly extensible. I'm sure someday there will be
>> table AMs that does not need the vacuum at all.
>>
>> Right now vacuum specific workload views seems optimal choice to me.
>>
>> Regards,
>
>
> Agreed. They are not god places to store such statistics.
>
>
> I have some suggestions:
>
> 1. pgstatfuncs.c in functions tuplestore_put_for_database() and
> tuplestore_put_for_relation you can remove 'nulls' array if you're
> sure that columns cannot be NULL.
>
We need to use this for tuplestore_putvalues function. With this
function, we fill the table with the values of the statistics.
>
> 1.
>
>
> 2. These functions are almost the same and I would think of writing
> one function depending of type 'ExtVacReportType'
>
I'm not sure that I fully understand what you mean. Can you explain it
more clearly, please?

On 13.08.2024 16:37, Ilia Evdokimov wrote:
> And I have one suggestion for pg_stat_vacuum_database: I suppose we
> should add database's name column after 'dboid' column because it is
> difficult to read statistics without database's name. We could call it
> 'datname' just like in 'pg_stat_database' view.
>
Thank you. Fixed.

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v5-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 66.1 KB
v5-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 41.4 KB
v5-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 18.9 KB
v5-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.2 KB

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Andrei Zubkov <zubkov(at)moonset(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-15 09:50:58
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 15.8.24 11:49, Alena Rybakina wrote:
>>
>> I have some suggestions:
>>
>> 1. pgstatfuncs.c in functions tuplestore_put_for_database() and
>> tuplestore_put_for_relation you can remove 'nulls' array if
>> you're sure that columns cannot be NULL.
>>
> We need to use this for tuplestore_putvalues function. With this
> function, we fill the table with the values of the statistics.

Ah, right! I'm sorry.

>> 1.
>>
>>
>>
>> 2. These functions are almost the same and I would think of writing
>> one function depending of type 'ExtVacReportType'
>>
> I'm not sure that I fully understand what you mean. Can you explain it
> more clearly, please?

Ah, I didn't notice that the size of all three tables is different.
Therefore, it won't be possible to write one function instead of two to
avoid code duplication. My mistake.


From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-16 11:12:00
Message-ID: CACJufxHb_YGCp=pVH6DZcpk9yML+SueffPeaRbX2LzXZVahd_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 15, 2024 at 4:49 PM Alena Rybakina
<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> Hi!

I've applied all the v5 patches.
0002 and 0003 have white space errors.

+ <para>
+ Number of times blocks of this index were already found
+ in the buffer cache by vacuum operations, so that a read was
not necessary
+ (this only includes hits in the
+ &project; buffer cache, not the operating system's file system cache)
+ </para></entry>

+ Number of times blocks of this table were already found
+ in the buffer cache by vacuum operations, so that a read was
not necessary
+ (this only includes hits in the
+ &project; buffer cache, not the operating system's file system cache)
+ </para></entry>

"&project;"
represents a sgml file placeholder name as "project" and puts all the
content of "project.sgml" to system-views.sgml.
but you don't have "project.sgml". you may check
doc/src/sgml/filelist.sgml or doc/src/sgml/ref/allfiles.sgml
for usage of "&place_holder;".
so you can change it to "project", otherwise doc cannot build.

src/backend/commands/dbcommands.c
we have:
/*
* If built with appropriate switch, whine when regression-testing
* conventions for database names are violated. But don't complain during
* initdb.
*/
#ifdef ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
if (IsUnderPostmaster && strstr(dbname, "regression") == NULL)
elog(WARNING, "databases created by regression test cases
should have names including \"regression\"");
#endif
so in src/test/regress/sql/vacuum_tables_and_db_statistics.sql you
need to change dbname:
CREATE DATABASE statistic_vacuum_database;
CREATE DATABASE statistic_vacuum_database1;

+ <para>
+ The view <structname>pg_stat_vacuum_indexes</structname> will contain
+ one row for each index in the current database (including TOAST
+ table indexes), showing statistics about vacuuming that specific index.
+ </para>
TOAST should
<acronym>TOAST</acronym>

+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
maybe change to
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("return type must be a row type")));
Later I found out "InitMaterializedSRF(fcinfo, 0);" already did all
the work. much of the code can be gotten rid of.
please check attached.

>>>>
#define EXTVACHEAPSTAT_COLUMNS 27
#define EXTVACIDXSTAT_COLUMNS 19
#define EXTVACDBSTAT_COLUMNS 15
#define EXTVACSTAT_COLUMNS Max(EXTVACHEAPSTAT_COLUMNS, EXTVACIDXSTAT_COLUMNS)

static Oid CurrentDatabaseId = InvalidOid;
>>>>
we already defined MyDatabaseId in src/include/miscadmin.h,
Why do we need "static Oid CurrentDatabaseId = InvalidOid;"?
also src/backend/utils/adt/pgstatfuncs.c already included "miscadmin.h".

the following code one function has 2 return statements?
------------------------------------------------------------------------
/*
* Get the vacuum statistics for the heap tables.
*/
Datum
pg_stat_vacuum_tables(PG_FUNCTION_ARGS)
{
return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_HEAP, EXTVACHEAPSTAT_COLUMNS);

PG_RETURN_NULL();
}

/*
* Get the vacuum statistics for the indexes.
*/
Datum
pg_stat_vacuum_indexes(PG_FUNCTION_ARGS)
{
return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_INDEX, EXTVACIDXSTAT_COLUMNS);

PG_RETURN_NULL();
}

/*
* Get the vacuum statistics for the database.
*/
Datum
pg_stat_vacuum_database(PG_FUNCTION_ARGS)
{
return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_DB, EXTVACDBSTAT_COLUMNS);

PG_RETURN_NULL();
}
------------------------------------------------------------------------
in pg_stats_vacuum:
if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
{
Oid relid = PG_GETARG_OID(1);

/* Load table statistics for specified database. */
if (OidIsValid(relid))
{
tabentry = fetch_dbstat_tabentry(dbid, relid);
if (tabentry == NULL || tabentry->vacuum_ext.type != type)
/* Table don't exists or isn't an heap relation. */
PG_RETURN_NULL();

tuplestore_put_for_relation(relid, tupstore, tupdesc,
tabentry, ncolumns);
}
else
{
...
}
}
I don't understand the ELSE branch. the IF branch means the input
dboid, heap/index oid is correct.
the ELSE branch means table reloid is invalid = 0.
I am not sure 100% what the ELSE Branch means.
Also there are no comments explaining why.
experiments seem to show that when reloid is 0, it will print out all
the vacuum statistics
for all the tables in the current database. If so, then only super
users can call pg_stats_vacuum?
but the table owner should be able to call pg_stats_vacuum for that
specific table.

/* Type of ExtVacReport */
typedef enum ExtVacReportType
{
PGSTAT_EXTVAC_INVALID = 0,
PGSTAT_EXTVAC_HEAP = 1,
PGSTAT_EXTVAC_INDEX = 2,
PGSTAT_EXTVAC_DB = 3,
} ExtVacReportType;
generally "HEAP" means table and index, maybe "PGSTAT_EXTVAC_HEAP" would be term

Attachment Content-Type Size
v5-0001-minor-refactor-pg_stats_vacuum-and-sub-routine.no-cfbot application/octet-stream 4.5 KB

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-19 09:32:48
Message-ID: CACJufxE6yAP+jUm4_GyKp7gUCkzbuJ9HGB-rR=92_hcLLZ9KTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

in pg_stats_vacuum
if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
{
Oid relid = PG_GETARG_OID(1);

/* Load table statistics for specified database. */
if (OidIsValid(relid))
{
tabentry = fetch_dbstat_tabentry(dbid, relid);
if (tabentry == NULL || tabentry->vacuum_ext.type != type)
/* Table don't exists or isn't an heap relation. */
PG_RETURN_NULL();

tuplestore_put_for_relation(relid, rsinfo, tabentry);
}
else
{
}

So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
it seems you didn't check "relid" 's relkind,
you may need to use get_rel_relkind.


From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-08-19 16:28:00
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Are you certain that all tables are included in `pg_stat_vacuum_tables`?
I'm asking because of the following:

SELECT count(*) FROM pg_stat_all_tables ;
 count
-------
   108
(1 row)

SELECT count(*) FROM pg_stat_vacuum_tables ;
 count
-------
    20
(1 row)

--
Regards,
Ilia Evdokimov,
Tantor Labs LCC.


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-20 22:35:00
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! Thank you very much for your review! Sorry for my late response I
was overwhelmed by tasks.

On 16.08.2024 14:12, jian he wrote:
> On Thu, Aug 15, 2024 at 4:49 PM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>> Hi!
>
> I've applied all the v5 patches.
> 0002 and 0003 have white space errors.
>
> + <para>
> + Number of times blocks of this index were already found
> + in the buffer cache by vacuum operations, so that a read was
> not necessary
> + (this only includes hits in the
> + &project; buffer cache, not the operating system's file system cache)
> + </para></entry>
>
> + Number of times blocks of this table were already found
> + in the buffer cache by vacuum operations, so that a read was
> not necessary
> + (this only includes hits in the
> + &project; buffer cache, not the operating system's file system cache)
> + </para></entry>
>
> "&project;"
> represents a sgml file placeholder name as "project" and puts all the
> content of "project.sgml" to system-views.sgml.
> but you don't have "project.sgml". you may check
> doc/src/sgml/filelist.sgml or doc/src/sgml/ref/allfiles.sgml
> for usage of "&place_holder;".
> so you can change it to "project", otherwise doc cannot build.
>
>
> src/backend/commands/dbcommands.c
> we have:
> /*
> * If built with appropriate switch, whine when regression-testing
> * conventions for database names are violated. But don't complain during
> * initdb.
> */
> #ifdef ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
> if (IsUnderPostmaster && strstr(dbname, "regression") == NULL)
> elog(WARNING, "databases created by regression test cases
> should have names including \"regression\"");
> #endif
> so in src/test/regress/sql/vacuum_tables_and_db_statistics.sql you
> need to change dbname:
> CREATE DATABASE statistic_vacuum_database;
> CREATE DATABASE statistic_vacuum_database1;
>
>
> + <para>
> + The view <structname>pg_stat_vacuum_indexes</structname> will contain
> + one row for each index in the current database (including TOAST
> + table indexes), showing statistics about vacuuming that specific index.
> + </para>
> TOAST should
> <acronym>TOAST</acronym>
>
>
>
> + /* Build a tuple descriptor for our result type */
> + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
> + elog(ERROR, "return type must be a row type");
> maybe change to
> ereport(ERROR,
> (errcode(ERRCODE_DATATYPE_MISMATCH),
> errmsg("return type must be a row type")));
> Later I found out "InitMaterializedSRF(fcinfo, 0);" already did all
> the work. much of the code can be gotten rid of.
> please check attached.
I agree with your suggestions for improving the code. I will add this in
the next version of the patch.
>
> #define EXTVACHEAPSTAT_COLUMNS 27
> #define EXTVACIDXSTAT_COLUMNS 19
> #define EXTVACDBSTAT_COLUMNS 15
> #define EXTVACSTAT_COLUMNS Max(EXTVACHEAPSTAT_COLUMNS, EXTVACIDXSTAT_COLUMNS)
>
> static Oid CurrentDatabaseId = InvalidOid;
> we already defined MyDatabaseId in src/include/miscadmin.h,
> Why do we need "static Oid CurrentDatabaseId = InvalidOid;"?
> also src/backend/utils/adt/pgstatfuncs.c already included "miscadmin.h".
Hmm, Tom Lane added "misc admin.h", or I didn't notice something. Could
you point this out, please?

We used the Current Database Id to output statistics on tables from
another database, so we need to replace it with a different default
value. But I want to rewrite this patch to display table statistics only
for the current database, that is, this part will be removed in the
future. In my opinion, it would be more correct.
> the following code one function has 2 return statements?
> ------------------------------------------------------------------------
> /*
> * Get the vacuum statistics for the heap tables.
> */
> Datum
> pg_stat_vacuum_tables(PG_FUNCTION_ARGS)
> {
> return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_HEAP, EXTVACHEAPSTAT_COLUMNS);
>
> PG_RETURN_NULL();
> }
>
> /*
> * Get the vacuum statistics for the indexes.
> */
> Datum
> pg_stat_vacuum_indexes(PG_FUNCTION_ARGS)
> {
> return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_INDEX, EXTVACIDXSTAT_COLUMNS);
>
> PG_RETURN_NULL();
> }
>
> /*
> * Get the vacuum statistics for the database.
> */
> Datum
> pg_stat_vacuum_database(PG_FUNCTION_ARGS)
> {
> return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_DB, EXTVACDBSTAT_COLUMNS);
>
> PG_RETURN_NULL();
> }
You are right - the second return is superfluous. I'll fix it.
> ------------------------------------------------------------------------
> in pg_stats_vacuum:
> if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
> {
> Oid relid = PG_GETARG_OID(1);
>
> /* Load table statistics for specified database. */
> if (OidIsValid(relid))
> {
> tabentry = fetch_dbstat_tabentry(dbid, relid);
> if (tabentry == NULL || tabentry->vacuum_ext.type != type)
> /* Table don't exists or isn't an heap relation. */
> PG_RETURN_NULL();
>
> tuplestore_put_for_relation(relid, tupstore, tupdesc,
> tabentry, ncolumns);
> }
> else
> {
> ...
> }
> }
> I don't understand the ELSE branch. the IF branch means the input
> dboid, heap/index oid is correct.
> the ELSE branch means table reloid is invalid = 0.
> I am not sure 100% what the ELSE Branch means.
> Also there are no comments explaining why.
> experiments seem to show that when reloid is 0, it will print out all
> the vacuum statistics
> for all the tables in the current database. If so, then only super
> users can call pg_stats_vacuum?
> but the table owner should be able to call pg_stats_vacuum for that
> specific table.
If any reloid has not been set by the user, we output statistics for all
objects - tables or indexes.In this part of the code, we find all the
suitable objects from the snapshot, if they belong to the index or table
type of objects.
> /* Type of ExtVacReport */
> typedef enum ExtVacReportType
> {
> PGSTAT_EXTVAC_INVALID = 0,
> PGSTAT_EXTVAC_HEAP = 1,
> PGSTAT_EXTVAC_INDEX = 2,
> PGSTAT_EXTVAC_DB = 3,
> } ExtVacReportType;
> generally "HEAP" means table and index, maybe "PGSTAT_EXTVAC_HEAP" would be term

No, Heap means something like a table in a relationship database, or its
alternative name is Heap.

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-20 22:37:16
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

We check it there: "tabentry->vacuum_ext.type != type". Or were you
talking about something else?

On 19.08.2024 12:32, jian he wrote:
> in pg_stats_vacuum
> if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
> {
> Oid relid = PG_GETARG_OID(1);
>
> /* Load table statistics for specified database. */
> if (OidIsValid(relid))
> {
> tabentry = fetch_dbstat_tabentry(dbid, relid);
> if (tabentry == NULL || tabentry->vacuum_ext.type != type)
> /* Table don't exists or isn't an heap relation. */
> PG_RETURN_NULL();
>
> tuplestore_put_for_relation(relid, rsinfo, tabentry);
> }
> else
> {
> }
>
>
> So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
> it seems you didn't check "relid" 's relkind,
> you may need to use get_rel_relkind.

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-08-20 22:39:15
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I think you've counted the above system tables from the database, but
I'll double-check it. Thank you for your review!

On 19.08.2024 19:28, Ilia Evdokimov wrote:
> Are you certain that all tables are included in
> `pg_stat_vacuum_tables`? I'm asking because of the following:
>
>
> SELECT count(*) FROM pg_stat_all_tables ;
>  count
> -------
>    108
> (1 row)
>
> SELECT count(*) FROM pg_stat_vacuum_tables ;
>  count
> -------
>     20
> (1 row)
>
--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-22 02:47:00
Message-ID: CACJufxGkvk7bVeidQoEtUt=PsReQTMKFD1NeeM9u3cJ0TStm6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Aug 21, 2024 at 6:37 AM Alena Rybakina
<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> We check it there: "tabentry->vacuum_ext.type != type". Or were you talking about something else?
>
> On 19.08.2024 12:32, jian he wrote:
>
> in pg_stats_vacuum
> if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
> {
> Oid relid = PG_GETARG_OID(1);
>
> /* Load table statistics for specified database. */
> if (OidIsValid(relid))
> {
> tabentry = fetch_dbstat_tabentry(dbid, relid);
> if (tabentry == NULL || tabentry->vacuum_ext.type != type)
> /* Table don't exists or isn't an heap relation. */
> PG_RETURN_NULL();
>
> tuplestore_put_for_relation(relid, rsinfo, tabentry);
> }
> else
> {
> }
>
>
> So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
> it seems you didn't check "relid" 's relkind,
> you may need to use get_rel_relkind.
>
> --

hi.
I mentioned some points at [1],
Please check the attached patchset to address these issues.

there are four occurrences of "CurrentDatabaseId", i am still confused
with usage of CurrentDatabaseId.

also please don't top-post, otherwise the archive, like [2] is not
easier to read for future readers.
generally you quote first, then reply.

[1] https://postgr.es/m/CACJufxHb_YGCp=pVH6DZcpk9yML+SueffPeaRbX2LzXZVahd_w@mail.gmail.com
[2] https://postgr.es/m/[email protected]

Attachment Content-Type Size
v6-0002-minor-doc-change-to-make-build-successfuly.no-cfbot application/octet-stream 1.5 KB
v6-0001-minor-refactor-pg_stats_vacuum-and-sub-routine.no-cfbot application/octet-stream 4.5 KB
v6-0004-ensure-pg_stats_vacuum-object-is-either-relati.no-cfbot application/octet-stream 2.1 KB
v6-0003-refactor-regression-test.no-cfbot application/octet-stream 5.9 KB

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-22 04:29:58
Message-ID: CALdSSPgEqTy85VPTJ9k=9t4iajkjSZevcAYAYCpRTR6iZ34PhQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 22 Aug 2024 at 07:48, jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> On Wed, Aug 21, 2024 at 6:37 AM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> >
> > We check it there: "tabentry->vacuum_ext.type != type". Or were you talking about something else?
> >
> > On 19.08.2024 12:32, jian he wrote:
> >
> > in pg_stats_vacuum
> > if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
> > {
> > Oid relid = PG_GETARG_OID(1);
> >
> > /* Load table statistics for specified database. */
> > if (OidIsValid(relid))
> > {
> > tabentry = fetch_dbstat_tabentry(dbid, relid);
> > if (tabentry == NULL || tabentry->vacuum_ext.type != type)
> > /* Table don't exists or isn't an heap relation. */
> > PG_RETURN_NULL();
> >
> > tuplestore_put_for_relation(relid, rsinfo, tabentry);
> > }
> > else
> > {
> > }
> >
> >
> > So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
> > it seems you didn't check "relid" 's relkind,
> > you may need to use get_rel_relkind.
> >
> > --
>
> hi.
> I mentioned some points at [1],
> Please check the attached patchset to address these issues.
>
> there are four occurrences of "CurrentDatabaseId", i am still confused
> with usage of CurrentDatabaseId.
>
> also please don't top-post, otherwise the archive, like [2] is not
> easier to read for future readers.
> generally you quote first, then reply.
>
> [1] https://postgr.es/m/CACJufxHb_YGCp=pVH6DZcpk9yML+SueffPeaRbX2LzXZVahd_w@mail.gmail.com
> [2] https://postgr.es/m/[email protected]

Hi, your points are valid.
Regarding 0003, I also wanted to object database naming in a
regression test during my review but for some reason didn't.Now, as
soon as we already need to change it, I suggest we also change
regression_statistic_vacuum_db1 to something less generic. Maybe
regression_statistic_vacuum_db_unaffected.

--
Best regards,
Kirill Reshke


From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-08-23 01:07:37
Message-ID: CAPpHfdug0s2MD7bBf-5nDQGn1WBxCKiTmZyGfxHz_7P0CDOjbg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Aug 21, 2024 at 1:39 AM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
wrote:
>
> I think you've counted the above system tables from the database, but
> I'll double-check it. Thank you for your review!
>
> On 19.08.2024 19:28, Ilia Evdokimov wrote:
> > Are you certain that all tables are included in
> > `pg_stat_vacuum_tables`? I'm asking because of the following:
> >
> >
> > SELECT count(*) FROM pg_stat_all_tables ;
> > count
> > -------
> > 108
> > (1 row)
> >
> > SELECT count(*) FROM pg_stat_vacuum_tables ;
> > count
> > -------
> > 20
> > (1 row)
> >

I'd like to do some review a well.

+ MyDatabaseId = dbid;
+
+ PG_TRY();
+ {
+ tabentry = pgstat_fetch_stat_tabentry(relid);
+ MyDatabaseId = storedMyDatabaseId;
+ }
+ PG_CATCH();
+ {
+ MyDatabaseId = storedMyDatabaseId;
+ }
+ PG_END_TRY();

I think this is generally wrong to change MyDatabaseId, especially if you
have to wrap it with PG_TRY()/PG_CATCH(). I think, instead we need proper
API changes, i.e. make pgstat_fetch_stat_tabentry() and others take dboid
as an argument.

+/*
+ * Get the vacuum statistics for the heap tables.
+ */
+Datum
+pg_stat_vacuum_tables(PG_FUNCTION_ARGS)
+{
+ return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_HEAP,
EXTVACHEAPSTAT_COLUMNS);
+
+ PG_RETURN_NULL();
+}

The PG_RETURN_NULL() is unneeded after another return statement. However,
does pg_stats_vacuum() need to return anything? What about making its
return type void?

@@ -874,4 +874,38 @@ pgstat_get_custom_snapshot_data(PgStat_Kind kind)
return pgStatLocal.snapshot.custom_data[idx];
}

+/* hash table for statistics snapshots entry */
+typedef struct PgStat_SnapshotEntry
+{
+ PgStat_HashKey key;
+ char status; /* for simplehash use */
+ void *data; /* the stats data itself */
+} PgStat_SnapshotEntry;

It would be nice to preserve encapsulation and don't expose pgstat_snapshot
hash in the headers. I see there is only one usage of it outside of
pgstat.c: pg_stats_vacuum().

+ Oid storedMyDatabaseId = MyDatabaseId;
+
+ pgstat_update_snapshot(PGSTAT_KIND_RELATION);
+ MyDatabaseId = storedMyDatabaseId;

This manipulation with storedMyDatabaseId looks pretty useless. It seems
to be intended to change MyDatabaseId, while I'm not fan of this as I
mentioned above.

+static PgStat_StatTabEntry *
+fetch_dbstat_tabentry(Oid dbid, Oid relid)
+{
+ Oid storedMyDatabaseId = MyDatabaseId;
+ PgStat_StatTabEntry *tabentry = NULL;
+
+ if (OidIsValid(CurrentDatabaseId) && CurrentDatabaseId == dbid)
+ /* Quick path when we read data from the same database */
+ return pgstat_fetch_stat_tabentry(relid);
+
+ pgstat_clear_snapshot();

It looks scary to reset the whole snapshot each time we access another
database. Need to also mention that the CurrentDatabaseId machinery isn't
implemented.

New functions
pg_stat_vacuum_tables(), pg_stat_vacuum_indexes(), pg_stat_vacuum_database()
are SRFs. When zero Oid is passed they report all the objects. However,
it seems they aren't intended to be used directly. Instead, there are
views with the same names. These views always call them with particular
Oids, therefore SRFs always return one row. Then why bother with SRF?
They could return plain records instead.

Also, as I mentioned above patchset makes a lot of trouble accessing
statistics of relations of another database. But that seems to be useless
given corresponding views allow to see only relations of the current
database. Even if you call functions directly, what is the value of this
information given that you don't know the relation oids in another
database? So, I think if we will give up and limit access to the relations
of the current database patch will become simpler and clearer.

------
Regards,
Alexander Korotkov
Supabase


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-08-25 15:59:46
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 23.08.2024 04:07, Alexander Korotkov wrote:
> On Wed, Aug 21, 2024 at 1:39 AM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> >
> > I think you've counted the above system tables from the database, but
> > I'll double-check it. Thank you for your review!
> >
> > On 19.08.2024 19:28, Ilia Evdokimov wrote:
> > > Are you certain that all tables are included in
> > > `pg_stat_vacuum_tables`? I'm asking because of the following:
> > >
> > >
> > > SELECT count(*) FROM pg_stat_all_tables ;
> > >  count
> > > -------
> > >    108
> > > (1 row)
> > >
> > > SELECT count(*) FROM pg_stat_vacuum_tables ;
> > >  count
> > > -------
> > >     20
> > > (1 row)
> > >
>
> I'd like to do some review a well.
Thank you very much for your review and contribution to this thread!
>
> +   MyDatabaseId = dbid;
> +
> +   PG_TRY();
> +   {
> +       tabentry = pgstat_fetch_stat_tabentry(relid);
> +       MyDatabaseId = storedMyDatabaseId;
> +   }
> +   PG_CATCH();
> +   {
> +       MyDatabaseId = storedMyDatabaseId;
> +   }
> +   PG_END_TRY();
>
> I think this is generally wrong to change MyDatabaseId, especially if
> you have to wrap it with PG_TRY()/PG_CATCH().  I think, instead we
> need proper API changes, i.e. make pgstat_fetch_stat_tabentry() and
> others take dboid as an argument.
I fixed it by deleting this part pf the code. We can display statistics
only for current database.
>
> +/*
> + * Get the vacuum statistics for the heap tables.
> + */
> +Datum
> +pg_stat_vacuum_tables(PG_FUNCTION_ARGS)
> +{
> +   return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_HEAP,
> EXTVACHEAPSTAT_COLUMNS);
> +
> +   PG_RETURN_NULL();
> +}
>
> The PG_RETURN_NULL() is unneeded after another return statement. 
> However, does pg_stats_vacuum() need to return anything?  What about
> making its return type void?
I think you are right, we can not return anything. Fixed.
>
> @@ -874,4 +874,38 @@ pgstat_get_custom_snapshot_data(PgStat_Kind kind)
>    return pgStatLocal.snapshot.custom_data[idx];
>  }
>
> +/* hash table for statistics snapshots entry */
> +typedef struct PgStat_SnapshotEntry
> +{
> +  PgStat_HashKey key;
> +  char     status;        /* for simplehash use */
> +  void     *data;         /* the stats data itself */
> +} PgStat_SnapshotEntry;
>
> It would be nice to preserve encapsulation and don't
> expose pgstat_snapshot hash in the headers.  I see there is only one
> usage of it outside of pgstat.c: pg_stats_vacuum().
Fixed.
>
> +        Oid  storedMyDatabaseId = MyDatabaseId;
> +
> +        pgstat_update_snapshot(PGSTAT_KIND_RELATION);
> +        MyDatabaseId = storedMyDatabaseId;
>
> This manipulation with storedMyDatabaseId looks pretty useless. It
> seems to be intended to change MyDatabaseId, while I'm not fan of this
> as I mentioned above.
Fixed.
>
> +static PgStat_StatTabEntry *
> +fetch_dbstat_tabentry(Oid dbid, Oid relid)
> +{
> +  Oid                  storedMyDatabaseId = MyDatabaseId;
> +  PgStat_StatTabEntry  *tabentry = NULL;
> +
> +  if (OidIsValid(CurrentDatabaseId) && CurrentDatabaseId == dbid)
> +     /* Quick path when we read data from the same database */
> +     return pgstat_fetch_stat_tabentry(relid);
> +
> +  pgstat_clear_snapshot();
>
> It looks scary to reset the whole snapshot each time we access another
> database.  Need to also mention that the CurrentDatabaseId machinery
> isn't implemented.
Fixed.
>
> New functions
> pg_stat_vacuum_tables(), pg_stat_vacuum_indexes(), pg_stat_vacuum_database()
> are SRFs.  When zero Oid is passed they report all the objects. 
> However, it seems they aren't intended to be used directly.  Instead,
> there are views with the same names. These views always call them with
> particular Oids, therefore SRFs always return one row.  Then why
> bother with SRF?  They could return plain records instead.

I didn't understand correctly - did you mean that we don't need SRF if
we need to display statistics for a specific object?

Otherwise, we need this when we display information on all database
objects (tables or indexes):

while ((entry = ScanStatSnapshot(pgStatLocal.snapshot.stats, &hashiter))
!= NULL)
{
    CHECK_FOR_INTERRUPTS();

    tabentry = (PgStat_StatTabEntry *) entry->data;

    if (tabentry != NULL && tabentry->vacuum_ext.type == type)
        tuplestore_put_for_relation(relid, rsinfo, tabentry);
}

I know we can construct a HeapTuple object containing a TupleDesc,
values, and nulls for a particular object, but I'm not sure we can
augment it while looping through multiple objects.

/* Initialise attributes information in the tuple descriptor */

 tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_STATS_COLS);

...

PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));

If I missed something or misunderstood, can you explain in more detail?

>
> Also, as I mentioned above patchset makes a lot of trouble accessing
> statistics of relations of another database.  But that seems to be
> useless given corresponding views allow to see only relations of the
> current database.  Even if you call functions directly, what is the
> value of this information given that you don't know the relation oids
> in another database?  So, I think if we will give up and limit access
> to the relations of the current database patch will become simpler and
> clearer.
>
I agree with that and have fixed it already.

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v6-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 63.5 KB
v6-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 40.7 KB
v6-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 19.9 KB
v6-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.2 KB

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-25 16:06:51
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 22.08.2024 05:47, jian he wrote:
> On Wed, Aug 21, 2024 at 6:37 AM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>> We check it there: "tabentry->vacuum_ext.type != type". Or were you talking about something else?
>>
>> On 19.08.2024 12:32, jian he wrote:
>>
>> in pg_stats_vacuum
>> if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
>> {
>> Oid relid = PG_GETARG_OID(1);
>>
>> /* Load table statistics for specified database. */
>> if (OidIsValid(relid))
>> {
>> tabentry = fetch_dbstat_tabentry(dbid, relid);
>> if (tabentry == NULL || tabentry->vacuum_ext.type != type)
>> /* Table don't exists or isn't an heap relation. */
>> PG_RETURN_NULL();
>>
>> tuplestore_put_for_relation(relid, rsinfo, tabentry);
>> }
>> else
>> {
>> }
>>
>>
>> So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
>> it seems you didn't check "relid" 's relkind,
>> you may need to use get_rel_relkind.
>>
>> --
> hi.
> I mentioned some points at [1],
> Please check the attached patchset to address these issues.

Thank you for your work! I checked the patches and added your suggested
changes to the new version of the patch here [0]. In my opinion, nothing
was missing, but please take a look.

[0]
https://www.postgresql.org/message-id/c4e4e305-7119-4183-b49a-d7092f4efba3%40postgrespro.ru

>
> there are four occurrences of "CurrentDatabaseId", i am still confused
> with usage of CurrentDatabaseId.

It needed to be used because of scanning objects from the other
database, so we change the id of dbid temporary to achieve it.

You should snow that every part of this code was deleted.Now we can
check information about tables and indexes from the current database.

> also please don't top-post, otherwise the archive, like [2] is not
> easier to read for future readers.
> generally you quote first, then reply.
>
> [1]https://postgr.es/m/CACJufxHb_YGCp=pVH6DZcpk9yML+SueffPeaRbX2LzXZVahd_w@mail.gmail.com
> [2]https://postgr.es/m/[email protected]
Ok, no problem.

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-25 16:12:40
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 22.08.2024 07:29, Kirill Reshke wrote:
> On Thu, 22 Aug 2024 at 07:48, jian he<jian(dot)universality(at)gmail(dot)com> wrote:
>> On Wed, Aug 21, 2024 at 6:37 AM Alena Rybakina
>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>> We check it there: "tabentry->vacuum_ext.type != type". Or were you talking about something else?
>>>
>>> On 19.08.2024 12:32, jian he wrote:
>>>
>>> in pg_stats_vacuum
>>> if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
>>> {
>>> Oid relid = PG_GETARG_OID(1);
>>>
>>> /* Load table statistics for specified database. */
>>> if (OidIsValid(relid))
>>> {
>>> tabentry = fetch_dbstat_tabentry(dbid, relid);
>>> if (tabentry == NULL || tabentry->vacuum_ext.type != type)
>>> /* Table don't exists or isn't an heap relation. */
>>> PG_RETURN_NULL();
>>>
>>> tuplestore_put_for_relation(relid, rsinfo, tabentry);
>>> }
>>> else
>>> {
>>> }
>>>
>>>
>>> So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
>>> it seems you didn't check "relid" 's relkind,
>>> you may need to use get_rel_relkind.
>>>
>>> --
>> hi.
>> I mentioned some points at [1],
>> Please check the attached patchset to address these issues.
>>
>> there are four occurrences of "CurrentDatabaseId", i am still confused
>> with usage of CurrentDatabaseId.
>>
>> also please don't top-post, otherwise the archive, like [2] is not
>> easier to read for future readers.
>> generally you quote first, then reply.
>>
>> [1]https://postgr.es/m/CACJufxHb_YGCp=pVH6DZcpk9yML+SueffPeaRbX2LzXZVahd_w@mail.gmail.com
>> [2]https://postgr.es/m/[email protected]
> Hi, your points are valid.
> Regarding 0003, I also wanted to object database naming in a
> regression test during my review but for some reason didn't.Now, as
> soon as we already need to change it, I suggest we also change
> regression_statistic_vacuum_db1 to something less generic. Maybe
> regression_statistic_vacuum_db_unaffected.
>
Hi! I fixed it in the new version of the patch [0]. Feel free to review it!

To be honest, I still doubt that the current database names
(regression_statistic_vacuum_db and regression_statistic_vacuum_db1) can
be easily generated, but if you insist on renaming, I will do it.

[0]
https://www.postgresql.org/message-id/c4e4e305-7119-4183-b49a-d7092f4efba3%40postgrespro.ru

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-08-26 11:55:13
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Just in case, I have attached a diff file to show the changes for the
latest version attached here [0] to make the review process easier.

[0]
https://www.postgresql.org/message-id/c4e4e305-7119-4183-b49a-d7092f4efba3%40postgrespro.ru

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
diff_vacuum.diff.no-cfbot text/plain 24.4 KB

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-09-04 17:23:00
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi, all!

I noticed that the pgstat_accumulate_extvac_stats function may be
declared as static in the pgstat_relation.c file rather than in the
pgstat.h file.

I fixed part of the code with interrupt counters. I believe that it is
not worth taking into account the number of interrupts if its level is
greater than ERROR, for example PANIC. Our server will no longer be
available to us and statistics data will not help us.

I have attached the new version of the code and the diff files
(minor-vacuum.no-cbot).

Attachment Content-Type Size
v7-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 64.1 KB
v7-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 40.6 KB
v7-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 19.9 KB
v7-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.2 KB
minor-vacuum.no-cbot text/plain 7.9 KB

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-09-05 12:47:00
Message-ID: CACJufxFU4ej3iGtGg3GDqmGzRmTVq3d9RGq+ibLfQoS8E3hJEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> Hi, all!
>
> I have attached the new version of the code and the diff files
> (minor-vacuum.no-cbot).
>

hi.

still have white space issue when using "git apply",
you may need to use "git diff --check" to find out where.

/* ----------
diff --git a/src/test/regress/expected/opr_sanity.out
b/src/test/regress/expected/opr_sanity.out
index 5d72b970b03..7026de157e4 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -32,11 +32,12 @@ WHERE p1.prolang = 0 OR p1.prorettype = 0 OR
prokind NOT IN ('f', 'a', 'w', 'p') OR
provolatile NOT IN ('i', 's', 'v') OR
proparallel NOT IN ('s', 'r', 'u');
- oid | proname
-------+------------------------
+ oid | proname
+------+-------------------------
8001 | pg_stat_vacuum_tables
8002 | pg_stat_vacuum_indexes
-(2 rows)
+ 8003 | pg_stat_vacuum_database
+(3 rows)

looking at src/test/regress/sql/opr_sanity.sql:

-- **************** pg_proc ****************
-- Look for illegal values in pg_proc fields.

SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE p1.prolang = 0 OR p1.prorettype = 0 OR
p1.pronargs < 0 OR
p1.pronargdefaults < 0 OR
p1.pronargdefaults > p1.pronargs OR
array_lower(p1.proargtypes, 1) != 0 OR
array_upper(p1.proargtypes, 1) != p1.pronargs-1 OR
0::oid = ANY (p1.proargtypes) OR
procost <= 0 OR
CASE WHEN proretset THEN prorows <= 0 ELSE prorows != 0 END OR
prokind NOT IN ('f', 'a', 'w', 'p') OR
provolatile NOT IN ('i', 's', 'v') OR
proparallel NOT IN ('s', 'r', 'u');

that means
oid | proname
------+-------------------------
8001 | pg_stat_vacuum_tables
8002 | pg_stat_vacuum_indexes
8003 | pg_stat_vacuum_database

These above functions, pg_proc.prorows should > 0 when
pg_proc.proretset is true.
I think that's the opr_sanity test's intention.
so you may need to change pg_proc.dat.

BTW the doc says:
prorows float4, Estimated number of result rows (zero if not proretset)

segmentation fault cases:
select * from pg_stat_vacuum_indexes(0);
select * from pg_stat_vacuum_tables(0);

+ else if (type == PGSTAT_EXTVAC_DB)
+ {
+ PgStatShared_Database *dbentry;
+ PgStat_EntryRef *entry_ref;
+ Oid dbid = PG_GETARG_OID(0);
+
+ if (OidIsValid(dbid))
+ {
+ entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_DATABASE,
+ dbid, InvalidOid, false);
+ dbentry = (PgStatShared_Database *) entry_ref->shared_stats;
+
+ if (dbentry == NULL)
+ /* Table doesn't exist or isn't a heap relation */
+ return;
+
+ tuplestore_put_for_database(dbid, rsinfo, dbentry);
+ pgstat_unlock_entry(entry_ref);
+ }
+ }
didn't error out when dbid is invalid?

pg_stat_vacuum_tables
pg_stat_vacuum_indexes
pg_stat_vacuum_database
these functions didn't verify the only input argument oid's kind.
for example:

create table s(a int primary key) with (autovacuum_enabled = off);
create view sv as select * from s;
vacuum s;
select * from pg_stat_vacuum_tables('sv'::regclass::oid);
select * from pg_stat_vacuum_indexes('sv'::regclass::oid);
select * from pg_stat_vacuum_database('sv'::regclass::oid);

above all these 3 examples should error out? because sv is view.

in src/backend/catalog/system_views.sql
for view creation of pg_stat_vacuum_indexes
you can change to

WHERE
db.datname = current_database() AND
rel.oid = stats.relid AND
ns.oid = rel.relnamespace
AND rel.relkind = 'i':

pg_stat_vacuum_tables in in src/backend/catalog/system_views.sql
you can change to

WHERE
db.datname = current_database() AND
rel.oid = stats.relid AND
ns.oid = rel.relnamespace
AND rel.relkind = 'r':


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-09-05 21:00:27
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! Thank you for your review!

On 05.09.2024 15:47, jian he wrote:
> On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>> Hi, all!
>>
>> I have attached the new version of the code and the diff files
>> (minor-vacuum.no-cbot).
>>
> hi.
>
> still have white space issue when using "git apply",
> you may need to use "git diff --check" to find out where.
>
>
> /* ----------
> diff --git a/src/test/regress/expected/opr_sanity.out
> b/src/test/regress/expected/opr_sanity.out
> index 5d72b970b03..7026de157e4 100644
> --- a/src/test/regress/expected/opr_sanity.out
> +++ b/src/test/regress/expected/opr_sanity.out
> @@ -32,11 +32,12 @@ WHERE p1.prolang = 0 OR p1.prorettype = 0 OR
> prokind NOT IN ('f', 'a', 'w', 'p') OR
> provolatile NOT IN ('i', 's', 'v') OR
> proparallel NOT IN ('s', 'r', 'u');
> - oid | proname
> -------+------------------------
> + oid | proname
> +------+-------------------------
> 8001 | pg_stat_vacuum_tables
> 8002 | pg_stat_vacuum_indexes
> -(2 rows)
> + 8003 | pg_stat_vacuum_database
> +(3 rows)
>
>
> looking at src/test/regress/sql/opr_sanity.sql:
>
> -- **************** pg_proc ****************
> -- Look for illegal values in pg_proc fields.
>
> SELECT p1.oid, p1.proname
> FROM pg_proc as p1
> WHERE p1.prolang = 0 OR p1.prorettype = 0 OR
> p1.pronargs < 0 OR
> p1.pronargdefaults < 0 OR
> p1.pronargdefaults > p1.pronargs OR
> array_lower(p1.proargtypes, 1) != 0 OR
> array_upper(p1.proargtypes, 1) != p1.pronargs-1 OR
> 0::oid = ANY (p1.proargtypes) OR
> procost <= 0 OR
> CASE WHEN proretset THEN prorows <= 0 ELSE prorows != 0 END OR
> prokind NOT IN ('f', 'a', 'w', 'p') OR
> provolatile NOT IN ('i', 's', 'v') OR
> proparallel NOT IN ('s', 'r', 'u');
>
> that means
> oid | proname
> ------+-------------------------
> 8001 | pg_stat_vacuum_tables
> 8002 | pg_stat_vacuum_indexes
> 8003 | pg_stat_vacuum_database
>
>
> These above functions, pg_proc.prorows should > 0 when
> pg_proc.proretset is true.
> I think that's the opr_sanity test's intention.
> so you may need to change pg_proc.dat.
>
> BTW the doc says:
> prorows float4, Estimated number of result rows (zero if not proretset)
>
I agree with you and have fixed it.
> segmentation fault cases:
> select * from pg_stat_vacuum_indexes(0);
> select * from pg_stat_vacuum_tables(0);
>
>
> + else if (type == PGSTAT_EXTVAC_DB)
> + {
> + PgStatShared_Database *dbentry;
> + PgStat_EntryRef *entry_ref;
> + Oid dbid = PG_GETARG_OID(0);
> +
> + if (OidIsValid(dbid))
> + {
> + entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_DATABASE,
> + dbid, InvalidOid, false);
> + dbentry = (PgStatShared_Database *) entry_ref->shared_stats;
> +
> + if (dbentry == NULL)
> + /* Table doesn't exist or isn't a heap relation */
> + return;
> +
> + tuplestore_put_for_database(dbid, rsinfo, dbentry);
> + pgstat_unlock_entry(entry_ref);
> + }
> + }
> didn't error out when dbid is invalid?
>
It is caused by the empty statistic snapshot. I have fixed that by
updating the snapshot (pgstat_update_snapshot(PGSTAT_KIND_RELATION)
function).

I also added the test to check it.

> pg_stat_vacuum_tables
> pg_stat_vacuum_indexes
> pg_stat_vacuum_database
> these functions didn't verify the only input argument oid's kind.
> for example:
>
> create table s(a int primary key) with (autovacuum_enabled = off);
> create view sv as select * from s;
> vacuum s;
> select * from pg_stat_vacuum_tables('sv'::regclass::oid);
> select * from pg_stat_vacuum_indexes('sv'::regclass::oid);
> select * from pg_stat_vacuum_database('sv'::regclass::oid);
>
> above all these 3 examples should error out? because sv is view.

I don't think so. I noticed that if we try to find the object from the
system table with the different type the Postgres returns empty rows. I
think we should do the same.

> in src/backend/catalog/system_views.sql
> for view creation of pg_stat_vacuum_indexes
> you can change to
>
> WHERE
> db.datname = current_database() AND
> rel.oid = stats.relid AND
> ns.oid = rel.relnamespace
> AND rel.relkind = 'i':
>
>
>
> pg_stat_vacuum_tables in in src/backend/catalog/system_views.sql
> you can change to
>
> WHERE
> db.datname = current_database() AND
> rel.oid = stats.relid AND
> ns.oid = rel.relnamespace
> AND rel.relkind = 'r':
>
I agree with your proposal and fixed it like that.

Attachment Content-Type Size
v8-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 63.6 KB
v8-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 40.3 KB
v8-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 18.0 KB
v8-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.2 KB
minor-vacuum.no-cbot text/plain 7.9 KB

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-09-27 18:15:39
Message-ID: CAD21AoAVK7DwTZLfhwuRhTGgR=_ASu5YshEg_Cmpojk5ZdZ3tA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Thu, Sep 5, 2024 at 2:01 PM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> Hi! Thank you for your review!
>
> On 05.09.2024 15:47, jian he wrote:
>
> On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> Hi, all!
>
> I have attached the new version of the code and the diff files
> (minor-vacuum.no-cbot).

Thank you for updating the patches. I've reviewed the 0001 patch and
have two comments.

I think we can split the 0001 patch into two parts: adding
pg_stat_vacuum_tables system views that shows the vacuum statistics
that we are currently collecting such as scanned_pages and
removed_pages, and another one is to add new statistics to collect
such as vacrel->set_all_visible_pages and visibility map updates.

I'm concerned that a pg_stat_vacuum_tables view has some duplicated
statistics that we already collect in different ways. For instance,
total_blks_{read,hit,dirtied,written} are already tracked at
system-level by pg_stat_io, and per-relation block I/O statistics can
be collected using pg_stat_statements. Having duplicated statistics
consumes more memory for pgstat and could confuse users if these
statistics are not consistent. I think it would be better to avoid
collecting duplicated statistics in different places.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-09-27 19:19:31
Message-ID: CAAKRu_auu=xt4w3Mm_jW-voJunZgno6XDDohH6hhRc4Z9dGdYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Sep 27, 2024 at 2:16 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Thu, Sep 5, 2024 at 2:01 PM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> >
> > Hi! Thank you for your review!
> >
> > On 05.09.2024 15:47, jian he wrote:
> >
> > On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> >
> > Hi, all!
> >
> > I have attached the new version of the code and the diff files
> > (minor-vacuum.no-cbot).
>
> Thank you for updating the patches. I've reviewed the 0001 patch and
> have two comments.

I took a very brief look at this and was wondering if it was worth
having a way to make the per-table vacuum statistics opt-in (like a
table storage parameter) in order to decrease the shared memory
footprint of storing the stats.

- Melanie


From: Andrei Zubkov <zubkov(at)moonset(dot)ru>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-09-27 19:25:14
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Fri, 2024-09-27 at 11:15 -0700, Masahiko Sawada wrote:
> I'm concerned that a pg_stat_vacuum_tables view has some duplicated
> statistics that we already collect in different ways. For instance,
> total_blks_{read,hit,dirtied,written} are already tracked at
> system-level by pg_stat_io,

pg_stat_vacuum_tables.total_blks_{read,hit,dirtied,written} tracks
blocks used by vacuum in different ways while vacuuming this particular
table while pg_stat_io tracks blocks used by vacuum on the cluster
level.

> and per-relation block I/O statistics can
> be collected using pg_stat_statements.

This is impossible. pg_stat_statements tracks block statistics on a
statement level. One statement could touch many tables and many
indexes, and all used database blocks will be counted by the
pg_stat_statements counters on a statement-level. Autovacuum statistics
won't be accounted by the pg_stat_statements. After all,
pg_stat_statements won't hold the statements statistics forever. Under
pressure of new statements the statement eviction can happen and
statistics will be lost.

All of the above is addressed by relation-level vacuum statistics held
in the Cumulative Statistics System proposed by this patch.
--
regards, Andrei Zubkov
Postgres Professional


From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-09-27 20:13:15
Message-ID: CAD21AoD66b3u28n=73kudgMp5wiGiyYUN9LuC9z2ka6YTru5Gw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Sep 27, 2024 at 12:19 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Fri, Sep 27, 2024 at 2:16 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > On Thu, Sep 5, 2024 at 2:01 PM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> > >
> > > Hi! Thank you for your review!
> > >
> > > On 05.09.2024 15:47, jian he wrote:
> > >
> > > On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> > >
> > > Hi, all!
> > >
> > > I have attached the new version of the code and the diff files
> > > (minor-vacuum.no-cbot).
> >
> > Thank you for updating the patches. I've reviewed the 0001 patch and
> > have two comments.
>
> I took a very brief look at this and was wondering if it was worth
> having a way to make the per-table vacuum statistics opt-in (like a
> table storage parameter) in order to decrease the shared memory
> footprint of storing the stats.

I'm not sure how users can select tables that enable vacuum statistics
as I think they basically want to have statistics for all tables, but
I see your point. Since the size of PgStat_TableCounts approximately
tripled by this patch (112 bytes to 320 bytes), it might be worth
considering ways to reduce the number of entries or reducing the size
of vacuum statistics.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-09-28 21:22:28
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! Thank you for your interesting for this patch!
>> I took a very brief look at this and was wondering if it was worth
>> having a way to make the per-table vacuum statistics opt-in (like a
>> table storage parameter) in order to decrease the shared memory
>> footprint of storing the stats.
> I'm not sure how users can select tables that enable vacuum statistics
> as I think they basically want to have statistics for all tables, but
> I see your point. Since the size of PgStat_TableCounts approximately
> tripled by this patch (112 bytes to 320 bytes), it might be worth
> considering ways to reduce the number of entries or reducing the size
> of vacuum statistics.

The main purpose of these statistics is to see abnormal behavior of
vacuum in relation to a table or the database as a whole.

For example, there may be a situation where vacuum has started to run
more often and spends a lot of resources on processing a certain index,
but the size of the index does not change significantly. Moreover, the
table in which this index is located can be much smaller in size. This
may be because the index is bloated and needs to be reindexed.

This is exactly what vacuum statistics can show - we will see that
compared to other objects, vacuum processed more blocks and spent more
time on this index.

Perhaps the vacuum parameters for the index should be set more
aggressively to avoid this in the future.

I suppose that if we turn off statistics collection for a certain
object, we can miss it. In addition, the user may not enable the
parameter for the object in time, because he will forget about it.

As for the second option, now I cannot say which statistics can be
removed, to be honest. So far, they all seem necessary.

--
Regards,
Alena Rybakina
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-10-08 16:18:46
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Made a rebase on a fresh master branch.

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v9-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 63.6 KB
v9-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 39.7 KB
v9-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 18.5 KB
v9-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.2 KB

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-10-16 10:31:43
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 08.10.2024 19:18, Alena Rybakina wrote:
> Made a rebase on a fresh master branch.
> --
> Regards,
> Alena Rybakina
> Postgres Professional

Thank you for rebasing.

I have noticed that when I create a table or an index on this table,
there is no information about the table or index in
pg_stat_vacuum_tables and pg_stat_vacuum_indexes until we perform a VACUUM.

Example:

CREATE TABLE t (i INT, j INT);
INSERT INTO t SELECT i/10, i/100 FROM  GENERATE_SERIES(1,1000000) i;
SELECT * FROM pg_stat_vacuum_tables WHERE relname = 't';
....
(0 rows)
CREATE INDEX ON t (i);
SELECT * FROM pg_stat_vacuum_indexes WHERE relname = 't_i_idx';
...
(0 rows)

I can see the entries after running VACUUM or executing autovacuum. or
when autovacuum is executed. I would suggest adding a line about the
relation even if it has not yet been processed by vacuum. Interestingly,
this issue does not occur with pg_stat_vacuum_database:

CREATE DATABASE example_db;
SELECT * FROM pg_stat_vacuum_database WHERE dbname = 'example_db';
dboid |       dbname | ...
 ...      | example_db | ...
(1 row)

BTW, I recommend renaming the view pg_stat_vacuum_database to
pg_stat_vacuum_database_S_  for consistency with pg_stat_vacuum_tables
and pg_stat_vacuum_indexes

--
Regards,
Ilia Evdokimov,
Tantor Labs LLC.


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-10-16 11:01:55
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 16.10.2024 13:31, Ilia Evdokimov wrote:
>
>
> On 08.10.2024 19:18, Alena Rybakina wrote:
>> Made a rebase on a fresh master branch.
>> --
>> Regards,
>> Alena Rybakina
>> Postgres Professional
>
> Thank you for rebasing.
>
> I have noticed that when I create a table or an index on this table,
> there is no information about the table or index in
> pg_stat_vacuum_tables and pg_stat_vacuum_indexes until we perform a
> VACUUM.
>
> Example:
>
> CREATE TABLE t (i INT, j INT);
> INSERT INTO t SELECT i/10, i/100 FROM GENERATE_SERIES(1,1000000) i;
> SELECT * FROM pg_stat_vacuum_tables WHERE relname = 't';
> ....
> (0 rows)
> CREATE INDEX ON t (i);
> SELECT * FROM pg_stat_vacuum_indexes WHERE relname = 't_i_idx';
> ...
> (0 rows)
>
> I can see the entries after running VACUUM or executing autovacuum. or
> when autovacuum is executed. I would suggest adding a line about the
> relation even if it has not yet been processed by
> vacuum. Interestingly, this issue does not occur with
> pg_stat_vacuum_database:
>
> CREATE DATABASE example_db;
> SELECT * FROM pg_stat_vacuum_database WHERE dbname = 'example_db';
> dboid |       dbname | ...
>  ...      | example_db | ...
> (1 row)
>
> BTW, I recommend renaming the view pg_stat_vacuum_database to
> pg_stat_vacuum_database_S_  for consistency with pg_stat_vacuum_tables
> and pg_stat_vacuum_indexes
>
Thanks for the review. I'm investigating this. I agree with the
renaming, I will do it in the next version of the patch.

--
Regards,
Alena Rybakina
Postgres Professional


From: Andrei Zubkov <zubkov(at)moonset(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-10-16 11:17:38
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Ilia,

On Wed, 2024-10-16 at 13:31 +0300, Ilia Evdokimov wrote:
> BTW, I recommend renaming the view pg_stat_vacuum_database to
> pg_stat_vacuum_databaseS  for consistency with pg_stat_vacuum_tables
> and pg_stat_vacuum_indexes

Such renaming doesn't seems correct to me because
pg_stat_vacuum_database is consistent with pg_stat_database view, while
pg_stat_vacuum_tables is consistent with pg_stat_all_tables.

This inconsistency is in Postgres views, so it should be changed
synchronously.
--
regards, Andrei Zubkov


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-10-22 19:30:10
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 16.10.2024 14:01, Alena Rybakina wrote:
>>
>> Thank you for rebasing.
>>
>> I have noticed that when I create a table or an index on this table,
>> there is no information about the table or index in
>> pg_stat_vacuum_tables and pg_stat_vacuum_indexes until we perform a
>> VACUUM.
>>
>> Example:
>>
>> CREATE TABLE t (i INT, j INT);
>> INSERT INTO t SELECT i/10, i/100 FROM GENERATE_SERIES(1,1000000) i;
>> SELECT * FROM pg_stat_vacuum_tables WHERE relname = 't';
>> ....
>> (0 rows)
>> CREATE INDEX ON t (i);
>> SELECT * FROM pg_stat_vacuum_indexes WHERE relname = 't_i_idx';
>> ...
>> (0 rows)
>>
>> I can see the entries after running VACUUM or executing
>> autovacuum. or when autovacuum is executed. I would suggest adding a
>> line about the relation even if it has not yet been processed by
>> vacuum. Interestingly, this issue does not occur with
>> pg_stat_vacuum_database:
>>
>> CREATE DATABASE example_db;
>> SELECT * FROM pg_stat_vacuum_database WHERE dbname = 'example_db';
>> dboid |       dbname | ...
>>  ...      | example_db | ...
>> (1 row)
>>
>> BTW, I recommend renaming the view pg_stat_vacuum_database to
>> pg_stat_vacuum_database_S_  for consistency with
>> pg_stat_vacuum_tables and pg_stat_vacuum_indexes
>>
> Thanks for the review. I'm investigating this. I agree with the
> renaming, I will do it in the next version of the patch.
>
I fixed it. I added the left outer join to the vacuum views and for
converting the coalesce function from NULL to null values.

I also fixed the code in getting database statistics - we can get it
through the existing pgstat_fetch_stat_dbentry function and fixed couple
of comments.

I attached a diff file, as well as new versions of patches.

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v10-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 65.3 KB
v10-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 41.6 KB
v10-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 20.9 KB
v10-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.2 KB
vacuum_stats_diff.no-cfbot text/plain 23.4 KB

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-10-28 13:40:04
Message-ID: CAPpHfdvSo3mfH=2m4ADCHAuN=22SnBY3TrPaPbGKTw3r_Jaw7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Aug 25, 2024 at 6:59 PM Alena Rybakina
<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> I didn't understand correctly - did you mean that we don't need SRF if
> we need to display statistics for a specific object?
>
> Otherwise, we need this when we display information on all database
> objects (tables or indexes):
>
> while ((entry = ScanStatSnapshot(pgStatLocal.snapshot.stats, &hashiter))
> != NULL)
> {
> CHECK_FOR_INTERRUPTS();
>
> tabentry = (PgStat_StatTabEntry *) entry->data;
>
> if (tabentry != NULL && tabentry->vacuum_ext.type == type)
> tuplestore_put_for_relation(relid, rsinfo, tabentry);
> }
>
> I know we can construct a HeapTuple object containing a TupleDesc,
> values, and nulls for a particular object, but I'm not sure we can
> augment it while looping through multiple objects.
>
> /* Initialise attributes information in the tuple descriptor */
>
> tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_STATS_COLS);
>
> ...
>
> PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
>
>
> If I missed something or misunderstood, can you explain in more detail?

Actually, I mean why do we need a possibility to return statistics for
all tables/indexes in one function call? User anyway is supposed to
use pg_stat_vacuum_indexes/pg_stat_vacuum_tables view, which do
function calls one per relation. I suppose we can get rid of
possibility to get all the objects in one function call and just
return a tuple from the functions like other pgstatfuncs.c functions
do.

------
Regards,
Alexander Korotkov
Supabase


From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, jian he <jian(dot)universality(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-10-28 19:07:02
Message-ID: CAPpHfduwY8-fp34CuO9O57ouCs1K=Gn1rTnuG4AaWYhEo6nXyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Sep 29, 2024 at 12:22 AM Alena Rybakina
<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> Hi! Thank you for your interesting for this patch!
>
> I took a very brief look at this and was wondering if it was worth
> having a way to make the per-table vacuum statistics opt-in (like a
> table storage parameter) in order to decrease the shared memory
> footprint of storing the stats.
>
> I'm not sure how users can select tables that enable vacuum statistics
> as I think they basically want to have statistics for all tables, but
> I see your point. Since the size of PgStat_TableCounts approximately
> tripled by this patch (112 bytes to 320 bytes), it might be worth
> considering ways to reduce the number of entries or reducing the size
> of vacuum statistics.
>
> The main purpose of these statistics is to see abnormal behavior of vacuum in relation to a table or the database as a whole.
>
> For example, there may be a situation where vacuum has started to run more often and spends a lot of resources on processing a certain index, but the size of the index does not change significantly. Moreover, the table in which this index is located can be much smaller in size. This may be because the index is bloated and needs to be reindexed.
>
> This is exactly what vacuum statistics can show - we will see that compared to other objects, vacuum processed more blocks and spent more time on this index.
>
> Perhaps the vacuum parameters for the index should be set more aggressively to avoid this in the future.
>
> I suppose that if we turn off statistics collection for a certain object, we can miss it. In addition, the user may not enable the parameter for the object in time, because he will forget about it.

I agree with this point. Additionally, in order to benefit from
gatherting vacuum statistics only for some relations in terms of
space, we need to handle variable-size stat entries. That would
greatly increase the complexity.

> As for the second option, now I cannot say which statistics can be removed, to be honest. So far, they all seem necessary.

Yes, but as Masahiko-san pointed out, PgStat_TableCounts is almost
tripled in space. That a huge change from having no statistics on
vacuum to have it in much more detail than everything else we
currently have. I think the feasible way might be to introduce some
most demanded statistics first then see how it goes.

------
Regards,
Alexander Korotkov
Supabase


From: Jim Nasby <jnasby(at)upgrade(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, jian he <jian(dot)universality(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-10-28 21:03:30
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On Oct 28, 2024, at 2:07 PM, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>
>> I suppose that if we turn off statistics collection for a certain object, we can miss it. In addition, the user may not enable the parameter for the object in time, because he will forget about it.
>
> I agree with this point. Additionally, in order to benefit from
> gatherting vacuum statistics only for some relations in terms of
> space, we need to handle variable-size stat entries. That would
> greatly increase the complexity.

Could vacuum stats be treated as a separate category instead of adding it to PgStat_TableCounts?

>> As for the second option, now I cannot say which statistics can be removed, to be honest. So far, they all seem necessary.
>
> Yes, but as Masahiko-san pointed out, PgStat_TableCounts is almost
> tripled in space. That a huge change from having no statistics on
> vacuum to have it in much more detail than everything else we
> currently have. I think the feasible way might be to introduce some
> most demanded statistics first then see how it goes.

Looking at the stats I do think the WAL stats are probably not helpful. First, there’s nothing users can do to tune how much WAL is generated by vacuum. Second, this introduces the risk of users saying “Wow, vacuum is creating a lot of WAL! I’m going to turn it down!”, which is most likely to make matters worse. There’s already a lot of stuff that goes into WAL without any detailed logging; if we ever wanted to provide a comprehensive view of what data is in WAL that should be handled separately.

The rest of the stats all look important. In fact, I think there’s even more stats that could be included (such as all frozen/visible pages skipped) - even more reason to look at having separate controls for tracking vacuum stats. There’s also an argument to be made for tracking autovac separately from manual vacuum. So long-term we might want to look at other ways to handle these stats, not only because of the large number of stats, but because they would be updated very infrequently compared to other stats counters. Ironically, the old stats system would probably have been more than sufficient for these stats. Tracking them in a real table might also be an option.

Is there a reason some fields are omitted from pg_stat_vacuum_database? While some stats are certainly more interesting at the per-relation level, I can’t really think of any that don’t make sense at the database level as well.

Looking at the per table/index stats, I strongly dislike the use of the term “delete” - it is a recipe for confusion with row deletion.. A much better term is “remove” or “removed”. I realize the term “delete” is used in places in vacuum logging, but IMO we should fix that as well instead of doubling-down on it.

I think “interrupts” is also a very confusing name - those fields should just be called “errors”.

I realize “relname” is being used for consistency with pg_stat_all_(tables|indexes), but I’m not sure it makes sense to double-down on that. Especially in pg_stat_vacuum_indexes, where it’s not completely clear whether relname is referring to the table or the index. I’m also inclined to say that the name of the table should be included in pg_stat_vacuum_indexes.

For all the views the docs should clarify that total_blks_written means blocks written by vacuum, as opposed to the background writer. Similarly they should clarify the difference between rel_blks_(read|hit) and total_blks_(read|hit). In the case of pg_stat_vacuum_indexes it’d be better if rel_blks_(read|hit) were called index_blks_(read|hit). Although… if total_blks_* is actually the count across the table and all the indexes I don’t know that we even need that counter. I realize that not ever vacuum even looks at the indexes, but if we’re going to go into that level of detail then we would (at minimum) need to count the number of times a vacuum completely skipped scanning the indexes.

Having rev_all_(frozen|visible)_pages in the same view as vacuum stats will confuse users into thinking that vacuum is clearing the bits. Those fields really belong in pg_stat_all_tables.

Sadly index_vacuum_count is may not useful at all at present. At minimum you’d need to know the number of times vacuum had run in total. I realize that’s in pg_stat_all_tables, but that doesn’t help if vacuum stats are tracked or reset separately. At minimum the docs should mention them. They also need to clarify if index_vacuum_count is incremented per-index or per-pass (hopefully the later). Assuming it’s per-pass, a better name for the field would be index_vacuum_passes, index_passes, index_pass_count, or similar. But even with that we still need a counter for the number of vacuums where index processing was skipped.

Other items
First, thanks to everyone that’s put work into this patch - it’s a big step forward. I certainly don’t want the perfect to be the enemy of the good, but since the size of these stats entries has already come up as a concern I want to consider use cases that would still not be covered by this patch. I’m not suggesting these need to be added now, but IMHO they’re logical next steps (that would also mean more counters). The cases below would probably mean at least doubling the number of vacuum-related counters, at least at the table level.

First, there’s still gaps in trying to track HOT; most notably a counter for how many updates would never be HOT eligible because they modify indexes. pg_stat_all_tables.n_tup_newpage_upd is really limited without that info.

There should also be stats about unused line pointers - in degenerate cases the lp array can consume a significant portion of heap storage.

Monitoring bloat would be a lot more accurate if vacuum reported total tuple length for each run along with the total number of tuples it looked at. Having that info would make it trivial to calculate average tuple size, which could then be applied to reltuples and relpages to calculate how much space would being lost to bloat.

Autovacuum will self-terminate if it would block another process (unless it’s an aggressive vacuum) - that’s definitely something that should be tracked. Not just the number of times that happens, but also stats about how much work was lost because of this.

Shrinking a relation (what vacuum calls truncation, which is very confusing with the truncate command) is a rather complex process that currently has no visibility.

Tuning vacuum_freeze_min_age (and the MXID variant) is rather complicated. We maybe have enough stats on whether it could be set lower, but there’s no visibility on how the settings affect how often vacuum decides to be aggressive. At minimum, we should have stats on when vacuum is aggressive, especially since it significantly changes the behavior of autovac.

I saw someone else already mentioned tuning vacuum memory usage, but I’ll mention it again. Even if the issues with index_vacuum_count are fixed that still only tells you if you have a problem; it doesn’t give you a great idea of how much more memory you need. The best you can do is assuming you need (number of passes - 1) * current memory.

Speaking of which… there should be stats on any time vacuum decided on it’s own to skip index processing due to wraparound proximity.

I’m sure there’s some other use cases that I’m not thinking of.


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-10-29 11:02:13
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28.10.2024 16:40, Alexander Korotkov wrote:
> On Sun, Aug 25, 2024 at 6:59 PM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>> I didn't understand correctly - did you mean that we don't need SRF if
>> we need to display statistics for a specific object?
>>
>> Otherwise, we need this when we display information on all database
>> objects (tables or indexes):
>>
>> while ((entry = ScanStatSnapshot(pgStatLocal.snapshot.stats, &hashiter))
>> != NULL)
>> {
>> CHECK_FOR_INTERRUPTS();
>>
>> tabentry = (PgStat_StatTabEntry *) entry->data;
>>
>> if (tabentry != NULL && tabentry->vacuum_ext.type == type)
>> tuplestore_put_for_relation(relid, rsinfo, tabentry);
>> }
>>
>> I know we can construct a HeapTuple object containing a TupleDesc,
>> values, and nulls for a particular object, but I'm not sure we can
>> augment it while looping through multiple objects.
>>
>> /* Initialise attributes information in the tuple descriptor */
>>
>> tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_STATS_COLS);
>>
>> ...
>>
>> PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
>>
>>
>> If I missed something or misunderstood, can you explain in more detail?
> Actually, I mean why do we need a possibility to return statistics for
> all tables/indexes in one function call? User anyway is supposed to
> use pg_stat_vacuum_indexes/pg_stat_vacuum_tables view, which do
> function calls one per relation. I suppose we can get rid of
> possibility to get all the objects in one function call and just
> return a tuple from the functions like other pgstatfuncs.c functions
> do.
>
I haven’t thought about this before and agree with you. Thanks for the
clarification! I'll fix the patch this evening and release the updated
version.

--
Regards,
Alena Rybakina
Postgres Professional


From: Andrei Zubkov <zubkov(at)moonset(dot)ru>
To: Jim Nasby <jnasby(at)upgrade(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-10-29 12:40:09
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Thanks for your attention to our patch!

On Mon, 2024-10-28 at 16:03 -0500, Jim Nasby wrote:
> > Yes, but as Masahiko-san pointed out, PgStat_TableCounts is almost
> > tripled in space.  That a huge change from having no statistics on
> > vacuum to have it in much more detail than everything else we
> > currently have.  I think the feasible way might be to introduce
> > some
> > most demanded statistics first then see how it goes.
>
> Looking at the stats I do think the WAL stats are probably not
> helpful. First, there’s nothing users can do to tune how much WAL is
> generated by vacuum. Second, this introduces the risk of users saying
> “Wow, vacuum is creating a lot of WAL! I’m going to turn it down!”,
> which is most likely to make matters worse. There’s already a lot of
> stuff that goes into WAL without any detailed logging; if we ever
> wanted to provide a comprehensive view of what data is in WAL that
> should be handled separately.

Yes, there is nothing we can directly do with WAL generated by vacuum,
but WAL generation is the part of vacuum work, and it will indirectly
affected by the changes of vacuum settings. So, WAL statistics is one
more dimension of vacuum workload. Also WAL stat is universal metric
which is measured cluster-wide and on the statement-level with
pg_stat_statements. Vacuum WAL counters will explain the part of
difference between those metrics. Besides vacuum WAL counters can be
used to locate abnormal vacuum behavior caused by a bug or the data
corruption. I think if the DBA is smart enough to look at vacuum WAL
generated stats and to understand what it means, the decision to
disable the autovacuum due to its WAL generation is unlikely.

Anyway I think some stats can be excluded to save some memory. The
first candidates are the system_time and user_time fields. Those are
very valuable, but are measured by the rusage stats, which won't be
available on all platforms. I think total_time and delay_time would be
sufficient.
The second is the interrupts field. It is needed for monitoring to know
do we have them or not, so tracking them on the database level will do
the trick. Interrupt is quite rare event, so once the monitoring system
will catch one the DBA can go to the server log for the details.

It seems there is another way. If the vacuum stats doesn't seems to be
mandatory in all systems, maybe we should add some hooks to the vacuum
so that vacuum statistics tracking can be done in an extension. I don't
think it is a good idea, because vacuum stats seems to me as mandatory
as the vacuum process itself.

> Is there a reason some fields are omitted
> from pg_stat_vacuum_database? While some stats are certainly more
> interesting at the per-relation level, I can’t really think of any
> that don’t make sense at the database level as well.

Some of the metrics are table-specific, some index-specific, so we
moved to the database level metrics more or less specific to the whole
database. Can you tell what stats you want to see at the database
level?

> Looking at the per table/index stats, I strongly dislike the use of
> the term “delete” - it is a recipe for confusion with row deletion..
> A much better term is “remove” or “removed”. I realize the term
> “delete” is used in places in vacuum logging, but IMO we should fix
> that as well instead of doubling-down on it.

Yes, this point was discussed in our team, and it seems confusing to me
too. We decided to name it as it is named in the code and to get
feedback from the community. Now we get one. Thank you. Now we should
discuss it and choose the best one. My personal choice is "removed".

> I realize “relname” is being used for consistency with
> pg_stat_all_(tables|indexes), but I’m not sure it makes sense to
> double-down on that. Especially in pg_stat_vacuum_indexes, where it’s
> not completely clear whether relname is referring to the table or the
> index. I’m also inclined to say that the name of the table should be
> included in pg_stat_vacuum_indexes.

Agreed. Table name is needed in the index view.

> For all the views the docs should clarify that total_blks_written
> means blocks written by vacuum, as opposed to the background Ywriter.

We have the "Number of database blocks written by vacuum operations
performed on this table" in the docs now. Do you mean we should
specifically note the vacuum process here?

> Similarly they should clarify the difference between
> rel_blks_(read|hit) and total_blks_(read|hit). In the case of
> pg_stat_vacuum_indexes it’d be better if rel_blks_(read|hit) were
> called index_blks_(read|hit). Although… if total_blks_* is actually
> the count across the table and all the indexes I don’t know that we
> even need that counter. I realize that not ever vacuum even looks at
> the indexes, but if we’re going to go into that level of detail then
> we would (at minimum) need to count the number of times a vacuum
> completely skipped scanning the indexes.

It is not clear to me enough. The stats described just as it is -
rel_blocks_* tracks blocks of the current heap, and total_* is for the
whole database blocks - not just tables and indexes, vacuum do some
work (quite a little) in the catalog and this work is counted here too.
Usually this stat won't be helpful, but maybe we can catch unusual
vacuum behavior using this stat.

> Having rev_all_(frozen|visible)_pages in the same view as vacuum
> stats will confuse users into thinking that vacuum is clearing the
> bits. Those fields really belong in pg_stat_all_tables.

Agreed.

> Sadly index_vacuum_count is may not useful at all at present. At
> minimum you’d need to know the number of times vacuum had run in
> total. I realize that’s in pg_stat_all_tables, but that doesn’t help
> if vacuum stats are tracked or reset separately.

I'm in doubt - is it really possible to reset the vacuum stats
independent of pg_stat_all_tables?

> At minimum the docs should mention them. They also need to clarify
> if index_vacuum_count is incremented per-index or per-pass (hopefully
> the later). Assuming it’s per-pass, a better name for the field would
> be index_vacuum_passes, index_passes, index_pass_count, or similar.
> But even with that we still need a counter for the number of vacuums
> where index processing was skipped.

Agreed, the "index_passes" looks good to me, and index processing skip
counter looks good.

> First, there’s still gaps in trying to track HOT; most notably a
> counter for how many updates would never be HOT eligible because they
> modify indexes. pg_stat_all_tables.n_tup_newpage_upd is really
> limited without that info.

Nice catch, I'll think about it. Those are not directly connected to
the vacuum workload but those are important.

> There should also be stats about unused line pointers - in degenerate
> cases the lp array can consume a significant portion of heap storage.
>
> Monitoring bloat would be a lot more accurate if vacuum reported
> total tuple length for each run along with the total number of tuples
> it looked at. Having that info would make it trivial to calculate
> average tuple size, which could then be applied to reltuples and
> relpages to calculate how much space would being lost to bloat.

Yes, bloat tracking is in our plans. Right now it is not clear enough
how to do it in the most reliable and convenient way.

> Autovacuum will self-terminate if it would block another process
> (unless it’s an aggressive vacuum) - that’s definitely something that
> should be tracked. Not just the number of times that happens, but
> also stats about how much work was lost because of this.

Agreed.

> Shrinking a relation (what vacuum calls truncation, which is very
> confusing with the truncate command) is a rather complex process that
> currently has no visibility.

In this patch table truncation can be seen in the "pages_removed" field
of "pg_stat_vacuum_tables" at least as the cumulative number of removed
pages. It is not clear enough, but it is visible.

> Tuning vacuum_freeze_min_age (and the MXID variant) is rather
> complicated. We maybe have enough stats on whether it could be set
> lower, but there’s no visibility on how the settings affect how often
> vacuum decides to be aggressive. At minimum, we should have stats on
> when vacuum is aggressive, especially since it significantly changes
> the behavior of autovac.

When you say "agressive" do you mean the number of times when the
vacuum was processing the table with the FREEZE intention? I think this
is needed too.

> I saw someone else already mentioned tuning vacuum memory usage, but
> I’ll mention it again. Even if the issues with index_vacuum_count are
> fixed that still only tells you if you have a problem; it doesn’t
> give you a great idea of how much more memory you need. The best you
> can do is assuming you need (number of passes - 1) * current memory.

Do you think such approach is insufficient? It seems we do not need
byte-to-byte accuracy here.

> Speaking of which… there should be stats on any time vacuum decided
> on it’s own to skip index processing due to wraparound proximity.

Maybe we should just count the number of times when the vacuum was
started to prevent wraparound?

Jim, thank you for such detailed review of our patch!

--
regards, Andrei Zubkov
Postgres Professional


From: Jim Nasby <jnasby(at)upgrade(dot)com>
To: Andrei Zubkov <zubkov(at)moonset(dot)ru>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-10-29 22:23:55
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Oct 29, 2024, at 7:40 AM, Andrei Zubkov <zubkov(at)moonset(dot)ru> wrote:
>
> Hi,
>
> Thanks for your attention to our patch!
>
> On Mon, 2024-10-28 at 16:03 -0500, Jim Nasby wrote:
>>> Yes, but as Masahiko-san pointed out, PgStat_TableCounts is almost
>>> tripled in space. That a huge change from having no statistics on
>>> vacuum to have it in much more detail than everything else we
>>> currently have. I think the feasible way might be to introduce
>>> some
>>> most demanded statistics first then see how it goes.
>>
>> Looking at the stats I do think the WAL stats are probably not
>> helpful. First, there’s nothing users can do to tune how much WAL is
>> generated by vacuum. Second, this introduces the risk of users saying
>> “Wow, vacuum is creating a lot of WAL! I’m going to turn it down!”,
>> which is most likely to make matters worse. There’s already a lot of
>> stuff that goes into WAL without any detailed logging; if we ever
>> wanted to provide a comprehensive view of what data is in WAL that
>> should be handled separately.
>
> Yes, there is nothing we can directly do with WAL generated by vacuum,
> but WAL generation is the part of vacuum work, and it will indirectly
> affected by the changes of vacuum settings. So, WAL statistics is one
> more dimension of vacuum workload. Also WAL stat is universal metric
> which is measured cluster-wide and on the statement-level with
> pg_stat_statements. Vacuum WAL counters will explain the part of
> difference between those metrics. Besides vacuum WAL counters can be
> used to locate abnormal vacuum behavior caused by a bug or the data
> corruption. I think if the DBA is smart enough to look at vacuum WAL
> generated stats and to understand what it means, the decision to
> disable the autovacuum due to its WAL generation is unlikely.

I’m generally for more stats rather than less - really just a question of how much we’re worried about stats overhead.

> Anyway I think some stats can be excluded to save some memory. The
> first candidates are the system_time and user_time fields. Those are
> very valuable, but are measured by the rusage stats, which won't be
> available on all platforms. I think total_time and delay_time would be
> sufficient.

Yeah, I considered throwing those under the bus. I agree they’re only marginally useful.

> The second is the interrupts field. It is needed for monitoring to know
> do we have them or not, so tracking them on the database level will do
> the trick. Interrupt is quite rare event, so once the monitoring system
> will catch one the DBA can go to the server log for the details.
Just to confirm… by “interrupt” you mean vacuum encountered an error?

> It seems there is another way. If the vacuum stats doesn't seems to be
> mandatory in all systems, maybe we should add some hooks to the vacuum
> so that vacuum statistics tracking can be done in an extension. I don't
> think it is a good idea, because vacuum stats seems to me as mandatory
> as the vacuum process itself.
I’d actually like hooks for all stats, so people can develop different ways of storing/aggregating them. But I agree that’s a separate discussion.

>> Is there a reason some fields are omitted
>> from pg_stat_vacuum_database? While some stats are certainly more
>> interesting at the per-relation level, I can’t really think of any
>> that don’t make sense at the database level as well.
>
> Some of the metrics are table-specific, some index-specific, so we
> moved to the database level metrics more or less specific to the whole
> database. Can you tell what stats you want to see at the database
> level?

Here’s the thing with pg_stat_vacuum_database; it’s the only way to see everything in the whole cluster. So I think the better question is what metrics simply don’t make sense at that level? And I don’t really see any that don’t.

>> For all the views the docs should clarify that total_blks_written
>> means blocks written by vacuum, as opposed to the background Ywriter.
>
> We have the "Number of database blocks written by vacuum operations
> performed on this table" in the docs now. Do you mean we should
> specifically note the vacuum process here?

The reason the stat is confusing is because it doesn’t have the meaning that the name implies. Most people that see this will think it’s actually measuring blocks dirtied, or at least something closer to that. It definitely hides the fact that many of the dirtied blocks could actually be written by the bgwriter. So an improvement to the docs would be “Number of blocks written directly by vacuum or auto vacuum. Blocks that are dirtied by a vacuum process can be written out by another process.”

Which makes me realize… I think vacuum only counts a block as dirtied if it was previously clean? If so the docs for that metric need to clarify that vacuum might modify a block but not count it as having been dirtied.

>> Similarly they should clarify the difference between
>> rel_blks_(read|hit) and total_blks_(read|hit). In the case of
>> pg_stat_vacuum_indexes it’d be better if rel_blks_(read|hit) were
>> called index_blks_(read|hit). Although… if total_blks_* is actually
>> the count across the table and all the indexes I don’t know that we
>> even need that counter. I realize that not ever vacuum even looks at
>> the indexes, but if we’re going to go into that level of detail then
>> we would (at minimum) need to count the number of times a vacuum
>> completely skipped scanning the indexes.
>
> It is not clear to me enough. The stats described just as it is -
> rel_blocks_* tracks blocks of the current heap, and total_* is for the
> whole database blocks - not just tables and indexes, vacuum do some
> work (quite a little) in the catalog and this work is counted here too.
> Usually this stat won't be helpful, but maybe we can catch unusual
> vacuum behavior using this stat.

Ok, so this just needs to be clarified in the docs by explicitly stating what is and isn’t part of the metric. It would also be better not to use the term “rel” since most people don’t immediately know what that means. So, table_blks_(read|hit) or index_blks_(read|hit).

Also, “total” is still not clear to me, at least in the context of pg_stat_vacuum_indexes. Is that different from pg_stat_vacuum_tables.total_blks_*? If so, how? If it’s the same then IMO it should just be removed from pg_stat_vacuum_indexes.

>> Sadly index_vacuum_count is may not useful at all at present. At
>> minimum you’d need to know the number of times vacuum had run in
>> total. I realize that’s in pg_stat_all_tables, but that doesn’t help
>> if vacuum stats are tracked or reset separately.
>
> I'm in doubt - is it really possible to reset the vacuum stats
> independent of pg_stat_all_tables?

Most stats can be independently reset, so I was thinking these wouldn’t be an exception. If that’s not the case then I think the docs need to mention pg_stat_all_tables.(auto)vacuum_count, since it’s in a completely different view. Or better yet, include the vacuum/analyze related stats that are in pg_stat_all_tables in pg_stat_vacuum_tables.

BTW, have you thought about what stats should be added for ANALYZE? That’s obviously not as critical as vacuum, but maybe worth considering as part of this...

>> First, there’s still gaps in trying to track HOT; most notably a
>> counter for how many updates would never be HOT eligible because they
>> modify indexes. pg_stat_all_tables.n_tup_newpage_upd is really
>> limited without that info.
>
> Nice catch, I'll think about it. Those are not directly connected to
> the vacuum workload but those are important.

Just to re-iterate: I don’t think this patch has to boil the ocean and try to handle all these extra use cases.

>> There should also be stats about unused line pointers - in degenerate
>> cases the lp array can consume a significant portion of heap storage.
>>
>> Monitoring bloat would be a lot more accurate if vacuum reported
>> total tuple length for each run along with the total number of tuples
>> it looked at. Having that info would make it trivial to calculate
>> average tuple size, which could then be applied to reltuples and
>> relpages to calculate how much space would being lost to bloat.
>
> Yes, bloat tracking is in our plans. Right now it is not clear enough
> how to do it in the most reliable and convenient way.
>
>> Autovacuum will self-terminate if it would block another process
>> (unless it’s an aggressive vacuum) - that’s definitely something that
>> should be tracked. Not just the number of times that happens, but
>> also stats about how much work was lost because of this.
>
> Agreed.
>
>> Shrinking a relation (what vacuum calls truncation, which is very
>> confusing with the truncate command) is a rather complex process that
>> currently has no visibility.
>
> In this patch table truncation can be seen in the "pages_removed" field
> of "pg_stat_vacuum_tables" at least as the cumulative number of removed
> pages. It is not clear enough, but it is visible.

Ahh, good point. I think it’s probably worth adding a counter (to this patch) for how many times vacuum actually decided to do page removal, because it’s (presumably) a pretty rare event. Without that counter it’s very hard to make any sense of the number of pages removed (other than being able to see some were removed, at least once).

>> Tuning vacuum_freeze_min_age (and the MXID variant) is rather
>> complicated. We maybe have enough stats on whether it could be set
>> lower, but there’s no visibility on how the settings affect how often
>> vacuum decides to be aggressive. At minimum, we should have stats on
>> when vacuum is aggressive, especially since it significantly changes
>> the behavior of autovac.
>
> When you say "agressive" do you mean the number of times when the
> vacuum was processing the table with the FREEZE intention? I think this
> is needed too.

Yes. I intentionally use the term “aggressive” (as the code does) to avoid confusion with the FREEZE option (which as I’m sure you know simply forces some GUCs to 0). Further complicating this is that auto vac will report this as “to prevent wraparound”…

In any case… I’m actually leaning towards there should be a complete second set of counters for aggressive vacuums, because of how differently they work. :(

>> I saw someone else already mentioned tuning vacuum memory usage, but
>> I’ll mention it again. Even if the issues with index_vacuum_count are
>> fixed that still only tells you if you have a problem; it doesn’t
>> give you a great idea of how much more memory you need. The best you
>> can do is assuming you need (number of passes - 1) * current memory.
>
> Do you think such approach is insufficient? It seems we do not need
> byte-to-byte accuracy here.

Byte-for-byte, no. But I do wonder if there’s any way to do better than some multiple of what *_work_mem was set to.

And setting that aside, another significant problem is that you can’t actually do anything here without actually knowing what memory setting was used, which is definitely not a given. Off-hand I don’t see anyway this can actually be tuned (at all) with nothing but counters. :(

Definitely out of scope for this patch though :)

>> Speaking of which… there should be stats on any time vacuum decided
>> on it’s own to skip index processing due to wraparound proximity.
>
> Maybe we should just count the number of times when the vacuum was
> started to prevent wraparound?

Unfortunately even that isn’t simple… auto vac and manual vac have different GUCs, and of course there’s the FREEZE option. And then there’s the issue that MXIDs are handled completely separately.

Even ignoring all of that… by default an aggressive vacuum won’t skip indexes. That only happens when you hit vacuum_(multixact_)failsafe_age.

BTW, something I’ve been mulling over is what stats related to cleanup might be tracked at a system level. I’m thinking along the lines of how often heap_prune_page or the index marking code come across a dead tuple they can’t do anything about yet because it’s still visible. While you could track that per-relation, I’m not sure how helpful that actually is since it’s really a long-running transaction problem.

Similarly, it’d be nice if we had stats about how often all of the auto vac workers were occupied; something that’s also global in nature.


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-02 12:22:41
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 30.10.2024 01:23, Jim Nasby wrote:
> On Oct 29, 2024, at 7:40 AM, Andrei Zubkov<zubkov(at)moonset(dot)ru> wrote:
>> Hi,
>>
>> Thanks for your attention to our patch!
>>
>> On Mon, 2024-10-28 at 16:03 -0500, Jim Nasby wrote:
>>>> Yes, but as Masahiko-san pointed out, PgStat_TableCounts is almost
>>>> tripled in space. That a huge change from having no statistics on
>>>> vacuum to have it in much more detail than everything else we
>>>> currently have. I think the feasible way might be to introduce
>>>> some
>>>> most demanded statistics first then see how it goes.
>>> Looking at the stats I do think the WAL stats are probably not
>>> helpful. First, there’s nothing users can do to tune how much WAL is
>>> generated by vacuum. Second, this introduces the risk of users saying
>>> “Wow, vacuum is creating a lot of WAL! I’m going to turn it down!”,
>>> which is most likely to make matters worse. There’s already a lot of
>>> stuff that goes into WAL without any detailed logging; if we ever
>>> wanted to provide a comprehensive view of what data is in WAL that
>>> should be handled separately.
>> Yes, there is nothing we can directly do with WAL generated by vacuum,
>> but WAL generation is the part of vacuum work, and it will indirectly
>> affected by the changes of vacuum settings. So, WAL statistics is one
>> more dimension of vacuum workload. Also WAL stat is universal metric
>> which is measured cluster-wide and on the statement-level with
>> pg_stat_statements. Vacuum WAL counters will explain the part of
>> difference between those metrics. Besides vacuum WAL counters can be
>> used to locate abnormal vacuum behavior caused by a bug or the data
>> corruption. I think if the DBA is smart enough to look at vacuum WAL
>> generated stats and to understand what it means, the decision to
>> disable the autovacuum due to its WAL generation is unlikely.
> I’m generally for more stats rather than less - really just a question of how much we’re worried about stats overhead.
>
>> Anyway I think some stats can be excluded to save some memory. The
>> first candidates are the system_time and user_time fields. Those are
>> very valuable, but are measured by the rusage stats, which won't be
>> available on all platforms. I think total_time and delay_time would be
>> sufficient.
> Yeah, I considered throwing those under the bus. I agree they’re only marginally useful.
>
>> The second is the interrupts field. It is needed for monitoring to know
>> do we have them or not, so tracking them on the database level will do
>> the trick. Interrupt is quite rare event, so once the monitoring system
>> will catch one the DBA can go to the server log for the details.
> Just to confirm… by “interrupt” you mean vacuum encountered an error?
Yes it is.

I updated patches. I excluded system and user time statistics and save
number of interrupts only for database.I removed the ability to get
statistics for all tables, now they can only be obtained for an oid
table [0], as suggested here. I also renamed the statistics from
pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and similarly for
indexes and databases. I noticed that that’s what they’re mostly called.
Ready for discussion.

>>> For all the views the docs should clarify that total_blks_written
>>> means blocks written by vacuum, as opposed to the background Ywriter.
>> We have the "Number of database blocks written by vacuum operations
>> performed on this table" in the docs now. Do you mean we should
>> specifically note the vacuum process here?
> The reason the stat is confusing is because it doesn’t have the meaning that the name implies. Most people that see this will think it’s actually measuring blocks dirtied, or at least something closer to that. It definitely hides the fact that many of the dirtied blocks could actually be written by the bgwriter. So an improvement to the docs would be “Number of blocks written directly by vacuum or auto vacuum. Blocks that are dirtied by a vacuum process can be written out by another process.”
>
> Which makes me realize… I think vacuum only counts a block as dirtied if it was previously clean? If so the docs for that metric need to clarify that vacuum might modify a block but not count it as having been dirtied.
I think this makes sense, but I haven't fixed it in the documentation
yet. I need time to learn this, to be honest. I'll answer later.
>>> Sadly index_vacuum_count is may not useful at all at present. At
>>> minimum you’d need to know the number of times vacuum had run in
>>> total. I realize that’s in pg_stat_all_tables, but that doesn’t help
>>> if vacuum stats are tracked or reset separately.
>> I'm in doubt - is it really possible to reset the vacuum stats
>> independent of pg_stat_all_tables?
> Most stats can be independently reset, so I was thinking these wouldn’t be an exception. If that’s not the case then I think the docs need to mention pg_stat_all_tables.(auto)vacuum_count, since it’s in a completely different view. Or better yet, include the vacuum/analyze related stats that are in pg_stat_all_tables in pg_stat_vacuum_tables.
To be honest, it was obvious to me, but we can mention it.
>>> Autovacuum will self-terminate if it would block another process
>>> (unless it’s an aggressive vacuum) - that’s definitely something that
>>> should be tracked. Not just the number of times that happens, but
>>> also stats about how much work was lost because of this.
>> Agreed.

>>> Tuning vacuum_freeze_min_age (and the MXID variant) is rather
>>> complicated. We maybe have enough stats on whether it could be set
>>> lower, but there’s no visibility on how the settings affect how often
>>> vacuum decides to be aggressive. At minimum, we should have stats on
>>> when vacuum is aggressive, especially since it significantly changes
>>> the behavior of autovac.
>> When you say "agressive" do you mean the number of times when the
>> vacuum was processing the table with the FREEZE intention? I think this
>> is needed too.
> Yes. I intentionally use the term “aggressive” (as the code does) to avoid confusion with the FREEZE option (which as I’m sure you know simply forces some GUCs to 0). Further complicating this is that auto vac will report this as “to prevent wraparound”…
>
> In any case… I’m actually leaning towards there should be a complete second set of counters for aggressive vacuums, because of how differently they work. :(

>>> Speaking of which… there should be stats on any time vacuum decided
>>> on it’s own to skip index processing due to wraparound proximity.
>> Maybe we should just count the number of times when the vacuum was
>> started to prevent wraparound?
> Unfortunately even that isn’t simple… auto vac and manual vac have different GUCs, and of course there’s the FREEZE option. And then there’s the issue that MXIDs are handled completely separately.
>
> Even ignoring all of that… by default an aggressive vacuum won’t skip indexes. That only happens when you hit vacuum_(multixact_)failsafe_age.
>
> BTW, something I’ve been mulling over is what stats related to cleanup might be tracked at a system level. I’m thinking along the lines of how often heap_prune_page or the index marking code come across a dead tuple they can’t do anything about yet because it’s still visible. While you could track that per-relation, I’m not sure how helpful that actually is since it’s really a long-running transaction problem.
>
> Similarly, it’d be nice if we had stats about how often all of the auto vac workers were occupied; something that’s also global in nature.
>
>
>
I'll see how these statistics can be calculatedand will add in the patch.

[0]
https://www.postgresql.org/message-id/CAPpHfdvSo3mfH%3D2m4ADCHAuN%3D22SnBY3TrPaPbGKTw3r_Jaw7Q%40mail.gmail.com

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v11-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 19.9 KB
v11-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 42.0 KB
v11-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 67.7 KB
v11-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 27.1 KB

From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-11-02 12:24:31
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 29.10.2024 14:02, Alena Rybakina wrote:
> On 28.10.2024 16:40, Alexander Korotkov wrote:
>> On Sun, Aug 25, 2024 at 6:59 PM Alena Rybakina
>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>> I didn't understand correctly - did you mean that we don't need SRF if
>>> we need to display statistics for a specific object?
>>>
>>> Otherwise, we need this when we display information on all database
>>> objects (tables or indexes):
>>>
>>> while ((entry = ScanStatSnapshot(pgStatLocal.snapshot.stats, &hashiter))
>>> != NULL)
>>> {
>>> CHECK_FOR_INTERRUPTS();
>>>
>>> tabentry = (PgStat_StatTabEntry *) entry->data;
>>>
>>> if (tabentry != NULL && tabentry->vacuum_ext.type == type)
>>> tuplestore_put_for_relation(relid, rsinfo, tabentry);
>>> }
>>>
>>> I know we can construct a HeapTuple object containing a TupleDesc,
>>> values, and nulls for a particular object, but I'm not sure we can
>>> augment it while looping through multiple objects.
>>>
>>> /* Initialise attributes information in the tuple descriptor */
>>>
>>> tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_STATS_COLS);
>>>
>>> ...
>>>
>>> PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
>>>
>>>
>>> If I missed something or misunderstood, can you explain in more detail?
>> Actually, I mean why do we need a possibility to return statistics for
>> all tables/indexes in one function call? User anyway is supposed to
>> use pg_stat_vacuum_indexes/pg_stat_vacuum_tables view, which do
>> function calls one per relation. I suppose we can get rid of
>> possibility to get all the objects in one function call and just
>> return a tuple from the functions like other pgstatfuncs.c functions
>> do.
>>
> I haven’t thought about this before and agree with you. Thanks for the
> clarification! I'll fix the patch this evening and release the updated
> version.

I updated the patches as per your suggestion. You can see it here [0].

[0]
https://www.postgresql.org/message-id/85b963fe-5977-43aa-9241-75b862abcc69%40postgrespro.ru


From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-07 14:49:23
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 22.10.2024 22:30, Alena Rybakina wrote:
>
> Hi!
>
> On 16.10.2024 14:01, Alena Rybakina wrote:
>>>
>>> Thank you for rebasing.
>>>
>>> I have noticed that when I create a table or an index on this table,
>>> there is no information about the table or index in
>>> pg_stat_vacuum_tables and pg_stat_vacuum_indexes until we perform a
>>> VACUUM.
>>>
>>> Example:
>>>
>>> CREATE TABLE t (i INT, j INT);
>>> INSERT INTO t SELECT i/10, i/100 FROM GENERATE_SERIES(1,1000000) i;
>>> SELECT * FROM pg_stat_vacuum_tables WHERE relname = 't';
>>> ....
>>> (0 rows)
>>> CREATE INDEX ON t (i);
>>> SELECT * FROM pg_stat_vacuum_indexes WHERE relname = 't_i_idx';
>>> ...
>>> (0 rows)
>>>
>>> I can see the entries after running VACUUM or executing
>>> autovacuum. or when autovacuum is executed. I would suggest adding a
>>> line about the relation even if it has not yet been processed by
>>> vacuum. Interestingly, this issue does not occur with
>>> pg_stat_vacuum_database:
>>>
>>> CREATE DATABASE example_db;
>>> SELECT * FROM pg_stat_vacuum_database WHERE dbname = 'example_db';
>>> dboid |       dbname | ...
>>>  ...      | example_db | ...
>>> (1 row)
>>>
>>> BTW, I recommend renaming the view pg_stat_vacuum_database to
>>> pg_stat_vacuum_database_S_  for consistency with
>>> pg_stat_vacuum_tables and pg_stat_vacuum_indexes
>>>
>> Thanks for the review. I'm investigating this. I agree with the
>> renaming, I will do it in the next version of the patch.
>>
> I fixed it. I added the left outer join to the vacuum views and for
> converting the coalesce function from NULL to null values.
>
> I also fixed the code in getting database statistics - we can get it
> through the existing pgstat_fetch_stat_dbentry function and fixed
> couple of comments.
>
> I attached a diff file, as well as new versions of patches.
>
> --
> Regards,
> Alena Rybakina
> Postgres Professional

Thank you for fixing it.

1) I have found some typos in the test output files (out-files) when
running 'make check' and 'make check-world'. These typos might cause
minor discrepancies in test results. You may already be aware of them,
but I wanted to bring them to your attention in case they haven't been
noticed. I believe these can be fixed quickly.

2) Additionally, I observed that when we create a table and insert some
rows, executing the VACUUM FULL command does not update the information
in the 'pg_stat_get_vacuum_tables' However, running the VACUUM command
does update this information as expected. This seems inconsistent, and
it might be a bug.

Example:
CREATE TABLE t (i INT, j INT) WITH (autovacuum_enabled = false);
INSERT INTO t SELECT i/10, i/100 FROM  GENERATE_SERIES(1,1000000) i;
SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
schema | relname |    relid | total_blks_read | .........
-----------+------------+---------+----------------------+---------
   public | t            | 21416 |                       0 | ......
(1 row)

VACUUM FULL;
SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
schema | relname |    relid | total_blks_read | .........
-----------+------------+---------+----------------------+---------
   public | t            | 21416 |                       0 | ......
(1 row)

VACUUM;
SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
schema | relname |    relid | total_blks_read | .........
-----------+------------+---------+----------------------+---------
   public | t            | 21416 |                 4425 | ......
(1 row)

Regards,
Ilia Evdokimov,
Tantor Labs LLC.


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-08 19:23:42
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! Thank you for review!

On 07.11.2024 17:49, Ilia Evdokimov wrote:
>
> Thank you for fixing it.
>
> 1) I have found some typos in the test output files (out-files) when
> running 'make check' and 'make check-world'. These typos might cause
> minor discrepancies in test results. You may already be aware of them,
> but I wanted to bring them to your attention in case they haven't been
> noticed. I believe these can be fixed quickly.
>
Yes, I'll fix it)
>
> 2) Additionally, I observed that when we create a table and insert
> some rows, executing the VACUUM FULL command does not update the
> information in the 'pg_stat_get_vacuum_tables' However, running the
> VACUUM command does update this information as expected. This seems
> inconsistent, and it might be a bug.
>
> Example:
> CREATE TABLE t (i INT, j INT) WITH (autovacuum_enabled = false);
> INSERT INTO t SELECT i/10, i/100 FROM GENERATE_SERIES(1,1000000) i;
> SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
> schema | relname |    relid | total_blks_read | .........
> -----------+------------+---------+----------------------+---------
>    public | t            | 21416 |                       0 | ......
> (1 row)
>
> VACUUM FULL;
> SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
> schema | relname |    relid | total_blks_read | .........
> -----------+------------+---------+----------------------+---------
>    public | t            | 21416 |                       0 | ......
> (1 row)
>
> VACUUM;
> SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
> schema | relname |    relid | total_blks_read | .........
> -----------+------------+---------+----------------------+---------
>    public | t            | 21416 |                 4425 | ......
> (1 row)
>
>
vacuum full operation doesn't call a vacuum operation, so we can't
collect statistics for it. Furthermore, this is a different operation
than vacuum because it completely rebuilds the table and indexes, so it
looks like your previous table and its indexes were completely removed.
To sum up, I think it isn't a bug that the statistics aren't showing here.

--
Regards,
Alena Rybakina
Postgres Professional


From: Jim Nasby <jnasby(at)upgrade(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-08 19:34:56
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> On Nov 2, 2024, at 7:22 AM, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
>>> The second is the interrupts field. It is needed for monitoring to know
>>> do we have them or not, so tracking them on the database level will do
>>> the trick. Interrupt is quite rare event, so once the monitoring system
>>> will catch one the DBA can go to the server log for the details.
>> Just to confirm… by “interrupt” you mean vacuum encountered an error?
> Yes it is.
In that case I feel rather strongly that we should label that as “errors”. “Interrupt” could mean a few different things, but “error” is very clear.
> I updated patches. I excluded system and user time statistics and save number of interrupts only for database.
> I removed the ability to get statistics for all tables, now they can only be obtained for an oid table [0], as suggested here. I also renamed the statistics from pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and similarly for indexes and databases. I noticed that that’s what they’re mostly called. Ready for discussion.
>
I think it’s better that the views follow the existing naming conventions (which don’t include “_get_”; only the functions have that in their names). Assuming that, the only question becomes pg_stat_vacuum_* vs pg_stat_*_vacuum. Given the existing precedent of pg_statio_*, I’m inclined to go with pg_stat_vacuum_*.


From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-10 16:51:09
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 08.11.2024 22:23, Alena Rybakina wrote:
> Hi! Thank you for review!
>
> On 07.11.2024 17:49, Ilia Evdokimov wrote:
>>
>> Thank you for fixing it.
>>
>> 1) I have found some typos in the test output files (out-files) when
>> running 'make check' and 'make check-world'. These typos might cause
>> minor discrepancies in test results. You may already be aware of
>> them, but I wanted to bring them to your attention in case they
>> haven't been noticed. I believe these can be fixed quickly.
>>
> Yes, I'll fix it)
>>
>> 2) Additionally, I observed that when we create a table and insert
>> some rows, executing the VACUUM FULL command does not update the
>> information in the 'pg_stat_get_vacuum_tables' However, running the
>> VACUUM command does update this information as expected. This seems
>> inconsistent, and it might be a bug.
>>
>> Example:
>> CREATE TABLE t (i INT, j INT) WITH (autovacuum_enabled = false);
>> INSERT INTO t SELECT i/10, i/100 FROM GENERATE_SERIES(1,1000000) i;
>> SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
>> schema | relname |    relid | total_blks_read | .........
>> -----------+------------+---------+----------------------+---------
>>    public | t            | 21416 |                       0 | ......
>> (1 row)
>>
>> VACUUM FULL;
>> SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
>> schema | relname |    relid | total_blks_read | .........
>> -----------+------------+---------+----------------------+---------
>>    public | t            | 21416 |                       0 | ......
>> (1 row)
>>
>> VACUUM;
>> SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
>> schema | relname |    relid | total_blks_read | .........
>> -----------+------------+---------+----------------------+---------
>>    public | t            | 21416 |                 4425 | ......
>> (1 row)
>>
>>
> vacuum full operation doesn't call a vacuum operation, so we can't
> collect statistics for it. Furthermore, this is a different operation
> than vacuum because it completely rebuilds the table and indexes, so
> it looks like your previous table and its indexes were completely
> removed. To sum up, I think it isn't a bug that the statistics aren't
> showing here.
>
Ah, you're right. This table does contain the _statistics_ for it.
Everything is okay then. Sorry for the confusion.

Regards,
Ilia Evdokimov,
Tantor Labs LLC.


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Jim Nasby <jnasby(at)upgrade(dot)com>
Cc: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-10 20:09:41
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08.11.2024 22:34, Jim Nasby wrote:
>
>> On Nov 2, 2024, at 7:22 AM, Alena Rybakina
>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>
>>>> The second is the interrupts field. It is needed for monitoring to know
>>>> do we have them or not, so tracking them on the database level will do
>>>> the trick. Interrupt is quite rare event, so once the monitoring system
>>>> will catch one the DBA can go to the server log for the details.
>>> Just to confirm… by “interrupt” you mean vacuum encountered an error?
>> Yes it is.
> In that case I feel rather strongly that we should label that as
> “errors”. “Interrupt” could mean a few different things, but “error”
> is very clear.
>>
>> I updated patches. I excluded system and user time statistics and
>> save number of interrupts only for database.I removed the ability to
>> get statistics for all tables, now they can only be obtained for an
>> oid table [0], as suggested here. I also renamed the statistics from
>> pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and similarly for
>> indexes and databases. I noticed that that’s what they’re mostly
>> called. Ready for discussion.
>>
> I think it’s better that the views follow the existing naming
> conventions (which don’t include “_get_”; only the functions have that
> in their names). Assuming that, the only question becomes
> pg_stat_vacuum_* vs pg_stat_*_vacuum. Given the existing precedent of
> pg_statio_*, I’m inclined to go with pg_stat_vacuum_*.
I have fixed it.

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v12-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 19.8 KB
v12-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 42.0 KB
v12-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 67.7 KB
v12-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 27.1 KB

From: Jim Nasby <jnasby(at)upgrade(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-13 00:24:03
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Nov 10, 2024, at 2:09 PM, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> On 08.11.2024 22:34, Jim Nasby wrote:
>>
>>> On Nov 2, 2024, at 7:22 AM, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> <mailto:a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>>
>>>>> The second is the interrupts field. It is needed for monitoring to know
>>>>> do we have them or not, so tracking them on the database level will do
>>>>> the trick. Interrupt is quite rare event, so once the monitoring system
>>>>> will catch one the DBA can go to the server log for the details.
>>>> Just to confirm… by “interrupt” you mean vacuum encountered an error?
>>> Yes it is.
>> In that case I feel rather strongly that we should label that as “errors”. “Interrupt” could mean a few different things, but “error” is very clear.
>>> I updated patches. I excluded system and user time statistics and save number of interrupts only for database.
>>> I removed the ability to get statistics for all tables, now they can only be obtained for an oid table [0], as suggested here. I also renamed the statistics from pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and similarly for indexes and databases. I noticed that that’s what they’re mostly called. Ready for discussion.
>>>
>> I think it’s better that the views follow the existing naming conventions (which don’t include “_get_”; only the functions have that in their names). Assuming that, the only question becomes pg_stat_vacuum_* vs pg_stat_*_vacuum. Given the existing precedent of pg_statio_*, I’m inclined to go with pg_stat_vacuum_*.
> I have fixed it.

I’ve reviewed and made some cosmetic changes to patch 1, though of note it looks like an effort has been made to keep stat_reset_timestamp at the end of PgStat_StatDBEntry, so I re-arranged that. I also removed some obviously dead code. It appears that pgstat_update_snapshot(), InitSnapshotIterator() and ScanStatSnapshot() are also dead, but I’ve left it in incase I’m missing something. The tests are also failing for me because a number of psql variables aren’t set.

I do think we should separate out the counts for deleted but still visible tuples vs tuples where we couldn’t get a cleanup lock (in other words, recently_dead_tuples and missed_dead_tuples from LVRelState). I realize that’s a departure from how some of the existing reporting works, but IMO combining them together isn’t a pattern we should be repeating since they mean completely different things. Towards that end I did remove missed_dead_tuples from the reporting, and renamed ExtVacReport.dead_tuples to recently_dead_tuples, but I stopped short of creating a separate entry for missed_dead_tuples. Note that while recently_dead_tuples is really a global thing (so only needs to be reported at a global (or at most per-database) level, but missed_dead_tuples should really be at a per-table level.

Updated 0001-v13 attached, as well as the diff between v12 and v13.




From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Jim Nasby <jnasby(at)upgrade(dot)com>
Cc: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-13 16:21:25
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! Thank you for your contribution to this thread!

On 13.11.2024 03:24, Jim Nasby wrote:
> On Nov 10, 2024, at 2:09 PM, Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>
>> On 08.11.2024 22:34, Jim Nasby wrote:
>>>
>>>> On Nov 2, 2024, at 7:22 AM, Alena Rybakina
>>>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>>>
>>>>>> The second is the interrupts field. It is needed for monitoring to know
>>>>>> do we have them or not, so tracking them on the database level will do
>>>>>> the trick. Interrupt is quite rare event, so once the monitoring system
>>>>>> will catch one the DBA can go to the server log for the details.
>>>>> Just to confirm… by “interrupt” you mean vacuum encountered an error?
>>>> Yes it is.
>>> In that case I feel rather strongly that we should label that as
>>> “errors”. “Interrupt” could mean a few different things, but “error”
>>> is very clear.
>>>>
>>>> I updated patches. I excluded system and user time statistics and
>>>> save number of interrupts only for database.I removed the ability
>>>> to get statistics for all tables, now they can only be obtained for
>>>> an oid table [0], as suggested here. I also renamed the statistics
>>>> from pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and
>>>> similarly for indexes and databases. I noticed that that’s what
>>>> they’re mostly called. Ready for discussion.
>>>>
>>> I think it’s better that the views follow the existing naming
>>> conventions (which don’t include “_get_”; only the functions have
>>> that in their names). Assuming that, the only question becomes
>>> pg_stat_vacuum_* vs pg_stat_*_vacuum. Given the existing precedent
>>> of pg_statio_*, I’m inclined to go with pg_stat_vacuum_*.
>> I have fixed it.
>
> I’ve reviewed and made some cosmetic changes to patch 1, though of
> note it looks like an effort has been made to keep
> stat_reset_timestamp at the end of PgStat_StatDBEntry, so I
> re-arranged that. I also removed some obviously dead code. It appears
> that pgstat_update_snapshot(), InitSnapshotIterator() and
> ScanStatSnapshot() are also dead, but I’ve left it in incase I’m
> missing something. The tests are also failing for me because a number
> of psql variables aren’t set.
Thank you! Yes, I have deleted them.
>
> I do think we should separate out the counts for deleted but still
> visible tuples vs tuples where we couldn’t get a cleanup lock (in
> other words, recently_dead_tuples and missed_dead_tuples
> from LVRelState). I realize that’s a departure from how some of the
> existing reporting works, but IMO combining them together isn’t a
> pattern we should be repeating since they mean completely different
> things. Towards that end I did remove missed_dead_tuples from the
> reporting, and renamed ExtVacReport.dead_tuples to
> recently_dead_tuples, but I stopped short of creating a separate entry
> for missed_dead_tuples. Note that while recently_dead_tuples is really
> a global thing (so only needs to be reported at a global (or at most
> per-database) level, but missed_dead_tuples should really be at a
> per-table level.
I am willing to agree with your idea. But we need to think about how
clearly describe them in the documentation.
>
> Updated 0001-v13 attached, as well as the diff between v12 and v13.
Thank you)

And I agree with your changes. And included them in patches.

---
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v13-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 65.0 KB
v13-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 43.2 KB
v13-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 22.2 KB
v13-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 27.1 KB

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-30 04:48:22
Message-ID: CALdSSPhv1MAd9GPyWcnMxxP2ZgXFpCksqe7onbkVzQwLaySqgw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 13 Nov 2024 at 21:21, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> Hi! Thank you for your contribution to this thread!
>
> On 13.11.2024 03:24, Jim Nasby wrote:
>
> On Nov 10, 2024, at 2:09 PM, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
>
> On 08.11.2024 22:34, Jim Nasby wrote:
>
>
> On Nov 2, 2024, at 7:22 AM, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> The second is the interrupts field. It is needed for monitoring to know
> do we have them or not, so tracking them on the database level will do
> the trick. Interrupt is quite rare event, so once the monitoring system
> will catch one the DBA can go to the server log for the details.
>
> Just to confirm… by “interrupt” you mean vacuum encountered an error?
>
> Yes it is.
>
> In that case I feel rather strongly that we should label that as “errors”. “Interrupt” could mean a few different things, but “error” is very clear.
>
> I updated patches. I excluded system and user time statistics and save number of interrupts only for database. I removed the ability to get statistics for all tables, now they can only be obtained for an oid table [0], as suggested here. I also renamed the statistics from pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and similarly for indexes and databases. I noticed that that’s what they’re mostly called. Ready for discussion.
>
> I think it’s better that the views follow the existing naming conventions (which don’t include “_get_”; only the functions have that in their names). Assuming that, the only question becomes pg_stat_vacuum_* vs pg_stat_*_vacuum. Given the existing precedent of pg_statio_*, I’m inclined to go with pg_stat_vacuum_*.
>
> I have fixed it.
>
>
> I’ve reviewed and made some cosmetic changes to patch 1, though of note it looks like an effort has been made to keep stat_reset_timestamp at the end of PgStat_StatDBEntry, so I re-arranged that. I also removed some obviously dead code. It appears that pgstat_update_snapshot(), InitSnapshotIterator() and ScanStatSnapshot() are also dead, but I’ve left it in incase I’m missing something. The tests are also failing for me because a number of psql variables aren’t set.
>
> Thank you! Yes, I have deleted them.
>
>
> I do think we should separate out the counts for deleted but still visible tuples vs tuples where we couldn’t get a cleanup lock (in other words, recently_dead_tuples and missed_dead_tuples from LVRelState). I realize that’s a departure from how some of the existing reporting works, but IMO combining them together isn’t a pattern we should be repeating since they mean completely different things. Towards that end I did remove missed_dead_tuples from the reporting, and renamed ExtVacReport.dead_tuples to recently_dead_tuples, but I stopped short of creating a separate entry for missed_dead_tuples. Note that while recently_dead_tuples is really a global thing (so only needs to be reported at a global (or at most per-database) level, but missed_dead_tuples should really be at a per-table level.
>
> I am willing to agree with your idea. But we need to think about how clearly describe them in the documentation.
>
>
> Updated 0001-v13 attached, as well as the diff between v12 and v13.
>
> Thank you)
>
> And I agree with your changes. And included them in patches.
>
> ---
> Regards,
> Alena Rybakina
> Postgres Professional

Hello!
After a brief glance, I think this patch set is good.
But there isn't any more time in the current CF to commit this :(.
So I moved to the next CF.

I also like the 0001 commit message. This commit message is quite
large and easy to understand. Actually, it might be too big. Perhaps
rather of being a commit message, the final paragraph (pages_frozen -
number of pages that..) need to be a part of the document. Perhaps
delete the explanation on pages_frozen that we have in 0004?

--
Best regards,
Kirill Reshke


From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-12-02 08:27:18
Message-ID: CAPpHfds=woPcB9nPtMmu=g=U9q6-FHFh7fF_x=uhU3k2Oi03sA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi, Alena!

On Wed, Nov 13, 2024 at 6:21 PM Alena Rybakina
<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> Updated 0001-v13 attached, as well as the diff between v12 and v13.
>
> Thank you)
>
> And I agree with your changes. And included them in patches.

Thank you for the updated patchset. Some points from me.

* I've read the previous discussion on how important to keep all these
fields regarding vacuum statistics including points by Andrei and Jim.
It still worrying me that statistics volume is going to burst in about
3 times, but I don't have a particular proposal on how to make more
granular approach. I wonder if you could propose something.
* Previously PGSTAT_FILE_FORMAT_ID got increased by 1. Your 0001 patch
increases it by 2. It's minor note, but I'd like to keep the
tradition.
* Commit message for 0001 looks nice, but commit messages of 0002,
0003, and 0004 look messy. Could you please, rearrange them.
* The distinction between 0001 and 0002 is not clear. The first line
of 0001 is "Machinery for grabbing an extended vacuum statistics on
heap relations", the first line of 0002 is "Machinery for grabbing an
extended vacuum statistics on heap and index relations." I guess 0001
should be about heap relations while 0002 should be about just index
relations. Is this correct?
* I guess this statistics should work for any table AM, based on what
has been done in relation_vacuum() interface method. If that's
correct, we need to get rid of "heap" terminology and use "table"
instead.
* 0004 should be pure documentation patch, but it seems containing
changes to isolation tests. Please, move them into a more appropriate
place.

------
Regards,
Alexander Korotkov
Supabase


From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-12-02 14:46:14
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

In my opinion, the patches are semantically correct. However, not all
dead code has been removed - I'm referring to pgstat_update_snapshot().
Also, the tests need to be fixed.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-12-02 20:12:05
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02.12.2024 11:27, Alexander Korotkov wrote:
> Hi, Alena!
>
> On Wed, Nov 13, 2024 at 6:21 PM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>> Updated 0001-v13 attached, as well as the diff between v12 and v13.
>>
>> Thank you)
>>
>> And I agree with your changes. And included them in patches.
> Thank you for the updated patchset. Some points from me.
>
> * I've read the previous discussion on how important to keep all these
> fields regarding vacuum statistics including points by Andrei and Jim.
> It still worrying me that statistics volume is going to burst in about
> 3 times, but I don't have a particular proposal on how to make more
> granular approach. I wonder if you could propose something.
> * Previously PGSTAT_FILE_FORMAT_ID got increased by 1. Your 0001 patch
> increases it by 2. It's minor note, but I'd like to keep the
> tradition.
> * Commit message for 0001 looks nice, but commit messages of 0002,
> 0003, and 0004 look messy. Could you please, rearrange them.
> * The distinction between 0001 and 0002 is not clear. The first line
> of 0001 is "Machinery for grabbing an extended vacuum statistics on
> heap relations", the first line of 0002 is "Machinery for grabbing an
> extended vacuum statistics on heap and index relations." I guess 0001
> should be about heap relations while 0002 should be about just index
> relations. Is this correct?
> * I guess this statistics should work for any table AM, based on what
> has been done in relation_vacuum() interface method. If that's
> correct, we need to get rid of "heap" terminology and use "table"
> instead.
> * 0004 should be pure documentation patch, but it seems containing
> changes to isolation tests. Please, move them into a more appropriate
> place.
>
Thank you for your valuable feedback, I am already carefully processing
your comments and will update the patches soon.

I will think about what can be done to address the problem of increasing
the volume of statistics; perhaps it will be possible to implement a guc
that, when enabled, will accumulate additional information on vacuum
statistics. For example, this way you can group statistics by buffers
and vacuum statistics.

--
Regards,
Alena Rybakina
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-12-02 20:59:58
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! Thank you for your review!

On 30.11.2024 07:48, Kirill Reshke wrote:
> Hello!
> After a brief glance, I think this patch set is good.
> But there isn't any more time in the current CF to commit this :(.
> So I moved to the next CF.
I agree with you. Thank you!)
> I also like the 0001 commit message. This commit message is quite
> large and easy to understand. Actually, it might be too big. Perhaps
> rather of being a commit message, the final paragraph (pages_frozen -
> number of pages that..) need to be a part of the document. Perhaps
> delete the explanation on pages_frozen that we have in 0004?
To be honest, I don't quite understand what you're suggesting. Are you
suggesting moving the explanation about the pages_frozen from the commit
message to the documentation or fixing something in the documentation
about the pages_frozen? Can you please explain?

--
Regards,
Alena Rybakina
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-12-02 21:00:37
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02.12.2024 17:46, Ilia Evdokimov wrote:
> In my opinion, the patches are semantically correct. However, not all
> dead code has been removed - I'm referring to
> pgstat_update_snapshot(). Also, the tests need to be fixed.
>
Thank you, I'll fix it

--
Regards,
Alena Rybakina
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-12-19 10:37:29
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! I updated patch.

On 02.12.2024 23:12, Alena Rybakina wrote:
> On 02.12.2024 11:27, Alexander Korotkov wrote:
>> Hi, Alena!
>>
>> On Wed, Nov 13, 2024 at 6:21 PM Alena Rybakina
>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>> Updated 0001-v13 attached, as well as the diff between v12 and v13.
>>>
>>> Thank you)
>>>
>>> And I agree with your changes. And included them in patches.
>> Thank you for the updated patchset. Some points from me.
>>
>> * I've read the previous discussion on how important to keep all these
>> fields regarding vacuum statistics including points by Andrei and Jim.
>> It still worrying me that statistics volume is going to burst in about
>> 3 times, but I don't have a particular proposal on how to make more
>> granular approach. I wonder if you could propose something.

I propose to solve this with a guc (track_vacuum_statistics) that will
allow us to disable statistics collection or enable it back when needed.
This statistics is useful if you collect it over a certain period of
time and watch the dynamics of the change, so I think a hook will not
hurt here.

I also added new statistics vm_new_visible_frozen_pages due to the
dc6acfd commit and renamed some of my statistics from frozen_pages to
vm_new_frozen_pages and all-visible_pages to vm_new_visible_pages. I
also added statistics missed_tupes, missed_pages. Both are necessary to
take into account how many tuples were not cleared by vacuum due to
failure to acquire a cleanup lock on a heap page. The second statistics
missed_pages will allow you to track whether this is one particular page
or not.

This information can make it clear that perhaps the data is broken
somewhere or there is an error in the operation of the database, for
example.

>> * Previously PGSTAT_FILE_FORMAT_ID got increased by 1. Your 0001 patch
>> increases it by 2. It's minor note, but I'd like to keep the
>> tradition.
Fixed
>> * Commit message for 0001 looks nice, but commit messages of 0002,
>> 0003, and 0004 look messy. Could you please, rearrange them.
Fixed
>> * The distinction between 0001 and 0002 is not clear. The first line
>> of 0001 is "Machinery for grabbing an extended vacuum statistics on
>> heap relations", the first line of 0002 is "Machinery for grabbing an
>> extended vacuum statistics on heap and index relations." I guess 0001
>> should be about heap relations while 0002 should be about just index
>> relations. Is this correct?
Fixed
>> * I guess this statistics should work for any table AM, based on what
>> has been done in relation_vacuum() interface method. If that's
>> correct, we need to get rid of "heap" terminology and use "table"
>> instead.
Fixed
>> * 0004 should be pure documentation patch, but it seems containing
>> changes to isolation tests. Please, move them into a more appropriate
>> place.
>>
Fixed

Thanks for your review, it was very helpful)

I also noticed that my stats for indexes were not being collected while
parallel vacuum was running. I fixed it by adding some extra code that
basically captured changes to the parallel_vacuum_process_all_indexes
function. I used a script like this to check if everything was correct.

pgbench -d postgres -i -s 10

my/inst/bin/pg_basebackup -D ~/backup

#psql

--check parallel vacuum statistics

create index accounts_idx1 on pgbench_accounts(bid);
create index accounts_idx2 on pgbench_accounts(aid, bid);

delete from pgbench_accounts where aid >5;
set max_parallel_maintenance_workers = 8;
VACUUM (PARALLEL 3) pgbench_accounts;

create index accounts_idx1 on pgbench_accounts(bid);
create index accounts_idx2 on pgbench_accounts(aid, bid);

delete from pgbench_accounts where aid >5;
set max_parallel_maintenance_workers = 8;
VACUUM (PARALLEL 3) pgbench_accounts;

pg_ctl -D ../postgres_data11 -l logfile stop

rm -rf ../postgres_data/*

cp -r ~/backup/* ~/postgres_data/

pg_ctl -D ../postgres_data11 -l logfile start

--check vacuum statistics processed by postmaster only

create index accounts_idx1 on pgbench_accounts(bid);
create index accounts_idx2 on pgbench_accounts(aid, bid);

delete from pgbench_accounts where aid >5;
set max_parallel_maintenance_workers = 8;
VACUUM (PARALLEL 0) pgbench_accounts;

To view statistics:

select vt.relname, total_blks_read AS total_blks_read,
  total_blks_hit AS total_blks_hit,
  total_blks_dirtied AS total_blks_dirtied,
  total_blks_written AS total_blks_written,

  rel_blks_read AS rel_blks_read,
  rel_blks_hit AS rel_blks_hit,

  pages_deleted AS pages_deleted,
  tuples_deleted AS tuples_deleted,

  wal_records AS wal_records,
  wal_fpi AS wal_fpi,
  wal_bytes AS wal_bytes,

  blk_read_time AS blk_read_time,
  blk_write_time AS blk_write_time,

  delay_time AS delay_time,
  total_time AS total_time
  FROM pg_stat_get_vacuum_indexes vt, pg_class c
WHERE (vt.relname='accounts_idx1' or vt.relname='accounts_idx2' or
vt.relname = 'pgbench_accounts_pkey') AND vt.relid = c.oid;

select stats.relname,stats.total_blks_read AS total_blks_read,
  stats.total_blks_hit AS total_blks_hit,
  stats.total_blks_dirtied AS total_blks_dirtied,
  stats.total_blks_written AS total_blks_written,

  stats.rel_blks_read AS rel_blks_read,
  stats.rel_blks_hit AS rel_blks_hit,

  stats.pages_scanned AS pages_scanned,
  stats.pages_removed AS pages_removed,
  stats.pages_frozen AS pages_frozen,
  stats.pages_all_visible AS pages_all_visible,
  stats.tuples_deleted AS tuples_deleted,
  stats.tuples_frozen AS tuples_frozen,
  stats.dead_tuples AS dead_tuples,

  stats.index_vacuum_count AS index_vacuum_count,
  stats.wal_records AS wal_records,
  stats.wal_fpi AS wal_fpi,
  stats.wal_bytes AS wal_bytes,

  stats.blk_read_time AS blk_read_time,
  stats.blk_write_time AS blk_write_time,

  stats.delay_time AS delay_time,
  stats.total_time AS total_time from pg_stat_vacuum_tables stats,
pg_stat_all_tables  WHERE stats.relname = 'pgbench_accounts' and
stats.relid = pg_stat_all_tables.relid;

output_single_19I got the following results and stored them in
output_single_19 and output_parallel_19 files.
I noticed that rel_blks_read and rel_blks_hit are too small compared to
the vacuum statistics when the vacuum is not parallel. I suspect that
this problem is related to the fact that the relationship statistics
have not reached that time. You can see that they are calculated in my
patch like this:

report->blks_fetched =
    rel->pgstat_info->counts.blocks_fetched - counters->blocks_fetched;
  report->blks_hit =
    rel->pgstat_info->counts.blocks_hit - counters->blocks_hit;

The second thing that bothered me was that some table stats differ in
the fields total_blks_read, rel_blks_read, pages_removed. If with the
buffer this could be related to the fact that in a single run we rely on
the stats of the global buffer and shaft statistics and this could
explain why there are more of them, then with pages_removed I have no
explanation yet as to what could have happened. I am still studying this.

When you have time, take a look at the patches, I will be glad to
receive any feedback.

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
output_single_19 text/plain 2.0 KB
output_parallel_19 text/plain 2.0 KB
v14-0001-Machinery-for-grabbing-an-extended-vacuu.patch text/x-patch 70.1 KB
v14-0002-Machinery-for-grabbing-an-extended-vacuu.patch text/x-patch 60.5 KB
v14-0003-Machinery-for-grabbing-an-extended-vacuu.patch text/x-patch 21.5 KB
v14-0004-Add-documentation-about-the-system-views.patch text/x-patch 24.0 KB

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2024-12-19 10:40:54
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 02.12.2024 17:46, Ilia Evdokimov wrote:
> In my opinion, the patches are semantically correct. However, not all
> dead code has been removed - I'm referring to
> pgstat_update_snapshot(). Also, the tests need to be fixed.
>
>
I fixed it [0]. Thank you!

[0]
https://www.postgresql.org/message-id/86f76aa5-1ab5-4e2e-9b15-405051852a2a%40postgrespro.ru

--
Regards,
Alena Rybakina
Postgres Professional


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-02 20:12:08
Message-ID: CAA5RZ0sjDx6QFRJJ+e=orwvLboLWO42f2qF0_HbDq-sMq=tW4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Thanks for the work you have done here. Exposing cumulative
metrics at this level of detail for vacuum is surely useful to find
vacuum bottlenecks and to determine the effectiveness of
vacuum tuning.

I am yet to look very closely, but I think some additional columns that
will be useful is the number of failsafe autovacuums occurred. Also
the counter for number of index_cleanup skipped, truncate phase
skipped and toast vacuuming skipped ( the latter will only be relevant
for the main relation ).

I also wonder if if makes sense to break down timing by phase. I surely
would like to know how much of my vacuum time was spent in index
cleanup vs heap scan, etc.

A nit: I noticed in v14, the column is "schema". It should be "schemaname"
for consistency.

Also, instead of pg_stat_vacuum_tables, what about pg_stat_vacuum?

Now, I became aware of this discussion after starting a new thread
to track total time spent in vacuum/analyze in pg_stat_all_tables [1].
But this begs the question of what should be done with the current
counters in pg_stat_all_tables? I see it mentioned above that (auto)vacuum_count
should be added to this new view, but it's also already in pg_stat_all_tables.
I don't think we should be duplicating the same columns across views.

I think total_time should be removed from your current patch and added
as is being suggested in [1]. This way high level metrics such as counts
and total time spent remain in pg_stat_all_tables, while the new view
you are proposing will contain more details. I don't think we will have
consistency issues between the views because a reset using pg_stat_reset()
will act on all the stats and pg_stat_reset_single_table_counters() will act on
all the stats related to that table. There should be no way to reset the vacuum
stats independently, AFAICT.

Alternatively, we can remove the vacuum related stats from pg_stat_all_tables,
but that will break monitoring tools and will leave us with the (auto)analyze
metrics alone in pg_stat_all_tables. This sounds very ugly.

What do you think?

Regards,

Sami Imseih
Amazon Web Services (AWS)

[1] https://commitfest.postgresql.org/52/5485/


From: Jim Nasby <jnasby(at)upgrade(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-02 21:14:13
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> On Jan 2, 2025, at 2:12 PM, Sami Imseih <samimseih(at)gmail(dot)com> wrote:
>
> Alternatively, we can remove the vacuum related stats from pg_stat_all_tables,
> but that will break monitoring tools and will leave us with the (auto)analyze
> metrics alone in pg_stat_all_tables. This sounds very ugly.

While backwards compatibility is important, there’s definitely precedent for changing what shows up in the catalog. IMHO it’s better to bite the bullet and move those fields instead of having vacuum stats spread across two different views.


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Jim Nasby <jnasby(at)upgrade(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-02 22:33:33
Message-ID: CAA5RZ0vFkGX8Hks3GCg=BPqVEcUHFTXBTmiheKgJ1-0C_14JFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> While backwards compatibility is important, there’s definitely precedent for changing
> what shows up in the catalog. IMHO it’s better to bite the bullet and move those fields
> instead of having vacuum stats spread across two different views.

Correct, the most recent one that I could think of is pg_stat_checkpointer,
which pulled the checkpoint related columns from pg_stat_bgwriter.
In that case though, these are distinct background processes and
it's a clear distinction.

In this case, I am not so sure about this, particularly because
we will then have the autoanalyze and autovacuum fields in different
views, which could be more confusing to users than saying pg_stat_all_tables
has high level metrics about vacuum and analyze and for more details on
vacuum, refer to pg_stat_vacuum_tables ( or whatever name we settle on ).

Regards,

Sami


From: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
To: Jim Nasby <jnasby(at)upgrade(dot)com>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-03 14:15:30
Message-ID: CAKAnmmL-1s7x2BFMyMpmyoxf+DV4chN8T+FYu4zjms95g_ZAkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
> While backwards compatibility is important, there’s definitely precedent
> for changing what shows up in the catalog. IMHO it’s better to bite the
> bullet and move those fields instead of having vacuum stats spread across
> two different views.
>

-1. That's a huge change, and pg_stat_all_tables is used way, way more than
things like pg_stat_bgwriter.

Cheers,
Greg


From: Jim Nasby <jnasby(at)upgrade(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-03 19:08:02
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan 2, 2025, at 4:33 PM, Sami Imseih <samimseih(at)gmail(dot)com> wrote:
>
>> While backwards compatibility is important, there’s definitely precedent for changing
>> what shows up in the catalog. IMHO it’s better to bite the bullet and move those fields
>> instead of having vacuum stats spread across two different views.
>
> Correct, the most recent one that I could think of is pg_stat_checkpointer,
> which pulled the checkpoint related columns from pg_stat_bgwriter.
> In that case though, these are distinct background processes and
> it's a clear distinction.
>
> In this case, I am not so sure about this, particularly because
> we will then have the autoanalyze and autovacuum fields in different
> views, which could be more confusing to users than saying pg_stat_all_tables
> has high level metrics about vacuum and analyze and for more details on
> vacuum, refer to pg_stat_vacuum_tables ( or whatever name we settle on ).

I guess one question is how realistic it is to try and put everything about (auto)vacuum in a single view. Given the complexity, the answer to that might just be “no”. In that case leaving existing fields in pg_stat_all_tables is a lot more reasonable.

Related to this… it’d be nice if we had a view that gave insight to users about auto vacuum scheduling. I know there’s one floating around the internet, but given the number of systems I’ve seen where autovac can’t keep up it’d be good to raise user awareness.


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Jim Nasby <jnasby(at)upgrade(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-03 19:46:25
Message-ID: CAA5RZ0sJLo9jYw4CsrWNrgB3ff52jEK-kg1r8tL_+nEhw1C+Jw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I guess one question is how realistic it is to try and put everything about (auto)vacuum in a single view.
> Given the complexity, the answer to that might just be “no”. In that case leaving existing fields in pg_stat_all_tables
> is a lot more reasonable.

Agree. I also think the total_time should be in pg_stat_all_tables.
total_time is a high level metric that along with vacuum_count
can calculate average run time of vacuums on a specific table.

Everything else in the new view are more granular details.

Regards,

Sami


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-04 20:37:51
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi, thank you for your attention to this patch.

On 02.01.2025 23:12, Sami Imseih wrote:
> Hi,
>
> Thanks for the work you have done here. Exposing cumulative
> metrics at this level of detail for vacuum is surely useful to find
> vacuum bottlenecks and to determine the effectiveness of
> vacuum tuning.
Yes, we hope that this will help provide more detailed information about
the current efficiency of the vacuum and also suggest how to best
configure it for the relationship.
> I am yet to look very closely, but I think some additional columns that
> will be useful is the number of failsafe autovacuums occurred.
Do you mean when the autovacuum started to prevent workaround?
> Also
> the counter for number of index_cleanup skipped, truncate phase
> skipped and toast vacuuming skipped ( the latter will only be relevant
> for the main relation ).

I can add, but concerns have already been expressed about the large
amount of vacuum statistics and, as a consequence, this leads to the
allocation of additional memory (3 times).

Of course, now we are saved by the guc I added for statistics... I
understand that this information can better show the efficiency of the
vacuum, but how does it help in setting it up for heap relations?

regarding the skipped truncate phase, the statistics are already
collected in vacrel->nonempty_pages, it's easy to put them outside. I
think the current statistics only show the number of deleted tuples and
pages (both deleted and those visited by vacuum during tuple deletion),
so the opposite view won't hurt.

index_cleanup skipped can be obtained based on information from a small
number of vacuum buffer statistics and the number of pages of indexes
that belong to heap relations. I think you can notice the behavior
through current statistics: if the index's buffer values have increased
very slightly, then the vacuum does not go there probably because of the
impossibility of taking a clean-up lock on the index. The same
information can be obtained based on the number of missed_tuples in heap
relations. I wrote earlier how these values ​​​​are related.

toast vacuuming skipped to be honest I haven't found a place where
vacuum skips it in the code yet, so I can't say anything about them yet.

> I also wonder if if makes sense to break down timing by phase. I surely
> would like to know how much of my vacuum time was spent in index
> cleanup vs heap scan, etc.
At the moment, this information has already been added to the statistics
as a total time for heap relations and their indexes.
>
> A nit: I noticed in v14, the column is "schema". It should be "schemaname"
> for consistency.
Thank you, I'll fix it in the next version of the patch.
> Also, instead of pg_stat_vacuum_tables, what about pg_stat_vacuum?
>
> Now, I became aware of this discussion after starting a new thread
> to track total time spent in vacuum/analyze in pg_stat_all_tables [1].
> But this begs the question of what should be done with the current
> counters in pg_stat_all_tables? I see it mentioned above that (auto)vacuum_count
> should be added to this new view, but it's also already in pg_stat_all_tables.
> I don't think we should be duplicating the same columns across views.
>
> Alternatively, we can remove the vacuum related stats from pg_stat_all_tables,
> but that will break monitoring tools and will leave us with the (auto)analyze
> metrics alone in pg_stat_all_tables. This sounds very ugly.
>
> What do you think?
>
> Regards,
>
> Sami Imseih
> Amazon Web Services (AWS)
>
> [1]https://commitfest.postgresql.org/52/5485/

I don't think they interfere with my more detailed views of how the
vacuum works. I don't think there's anything worth removing.
>
> I think total_time should be removed from your current patch and added
> as is being suggested in [1]. This way high level metrics such as counts
> and total time spent remain in pg_stat_all_tables, while the new view
> you are proposing will contain more details. I don't think we will have
> consistency issues between the views because a reset using pg_stat_reset()
> will act on all the stats and pg_stat_reset_single_table_counters() will act on
> all the stats related to that table. There should be no way to reset the vacuum
> stats independently, AFAICT.

I think it is not quite correct to do so.

Firstly, the total time of vacuum operation does not give you a complete
idea of ​​when vacuum did not work delay time. I have seen many reports
where vacuum spends very little time on cleaning relations and most of
the time just sleeping.

Secondly, where to put the total time of vacuum for indexes and
databases? It would be incorrect not to take them into account at all.
What if we remove the total time from the heap statistics and add it to
pg_stat_tables and only leave the vacuum statistics total time of vacuum
operation of indexes and databases? It seems strange to me that they
will have to be viewed from different views.

I think it is necessary to look at the total time for tables into
perspective of how much time vacuum spent in total on processing
indexes, since indexes can be bloated, for example. I think it is better
to leave these statistics here.

--
Regards,
Alena Rybakina
Postgres Professional


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-06 02:00:16
Message-ID: CAA5RZ0v4YZA5uSDh+MU60cEpPvinuAaYPTvO+aZ7H7Vj3whi0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I am yet to look very closely, but I think some additional columns that
> will be useful is the number of failsafe autovacuums occurred.
>
> Do you mean when the autovacuum started to prevent workaround?
>

Specifically vacuum_failsafe_age [1] when autovacuum automatically
performs a vacuum without index cleanup, without truncate, bypassing
the vacuum ring buffer and disabling the cost limits. The purpose of this
is a last ditch effort to avoid wraparound and is triggered at 1.6 billion
transactions by default. When this state occurs, there is a single log
written for every table that is vacuumed with these options [2], and
my thoughts is to also track in the view as the use of these options
will overtime make the indexes bloat over time and less space is
given back to the OS due to skipped truncations. For most workloads,
this should not be common, but I am thinking of the extreme cases
or if someone potentially misconfigured the vacuum_failsafe_age.

As I thought about this more, failsafe autovacuum could be tracked on
the database level, pg_stat_database, since this guc can't be set
on a relation level.

> Also
> the counter for number of index_cleanup skipped, truncate phase
> skipped and toast vacuuming skipped ( the latter will only be relevant
> for the main relation ).
>
> I can add, but concerns have already been expressed about the large amount of
> vacuum statistics and, as a consequence, this leads
> to the allocation of additional memory (3 times).
> Of course, now we are saved by the guc I added for statistics...
> I understand that this information can better show the efficiency of the vacuum,
> but how does it help in setting it up for heap relations?

An administrator will find this information to be useful especially
if for some reason most vacuums are being run with these
options being off either via a manual vacuum or someone
turning off index_cleanup in the tables storage parameter.

postgres=# alter table t set (vacuum_index_cleanup = off,
vacuum_truncate = off );
ALTER TABLE

> regarding the skipped truncate phase, the statistics are already collected in vacrel->nonempty_pages,
> it's easy to put them outside. I think the current statistics only show the number of deleted tuples and pages
> (both deleted and those visited by vacuum during tuple deletion), so the opposite view won't hurt.

Can you clarify what you mean by "so the opposite view won't hurt." ?

> index_cleanup skipped can be obtained based on information from a small number of
> vacuum buffer statistics and the number of pages of indexes that belong to heap relations.
> I think you can notice the behavior through current statistics:

I don't think there is a view that provides cumulative vacuum buffer
stats currently.
pg_stat_io could be helpful for this purpose, but that is a cluster
wide view. As it
stands now, I think it's quite difficult for a user to determine for a
fact if indexes or
truncate is being skipped

> Secondly, where to put the total time of vacuum for indexes and databases?
> It would be incorrect not to take them into account at all. What if we remove the total time from
> the heap statistics and add it to pg_stat_tables and only leave the vacuum statistics total time of
> vacuum operation of indexes and databases?
> It seems strange to me that they will have to be viewed from different views.
>
> I think it is necessary to look at the total time for tables into perspective of how much
> time vacuum spent in total on processing indexes, since indexes can be bloated, for example.
> I think it is better to leave these statistics here.

You make valid points. I now think because track_vacuum_statistics is
optional, we should track total_time in 2 places. First place in the new
view being proposed here and the second place is in pg_stat_all_tables
as being proposed here [3]. This way if track_vacuum_statistics is off, the
total_time of vacuum could still be tracked by pg_stat_all_tables.

By the way, the current patch does not track materialized view,
but it should as materialized views can also be vacuumed.

Regards,

Sami

[1] https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-VACUUM-FAILSAFE-AGE
[2] https://github.com/postgres/postgres/blob/master/src/backend/access/heap/vacuumlazy.c#L2437-L2444
[3] https://commitfest.postgresql.org/52/5485/


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2025-01-10 12:04:51
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi, I have updated the patch. Fix minor mistakes in the document, added
the wraparound_failsafe_count statistics - it accounts the number of
times when the vacuum operates heap relation to prevent workaround
problem, fixed "shemaname".

On 06.01.2025 05:00, Sami Imseih wrote:
>> I am yet to look very closely, but I think some additional columns that
>> will be useful is the number of failsafe autovacuums occurred.
>>
>> Do you mean when the autovacuum started to prevent workaround?
>>
> Specifically vacuum_failsafe_age [1] when autovacuum automatically
> performs a vacuum without index cleanup, without truncate, bypassing
> the vacuum ring buffer and disabling the cost limits. The purpose of this
> is a last ditch effort to avoid wraparound and is triggered at 1.6 billion
> transactions by default. When this state occurs, there is a single log
> written for every table that is vacuumed with these options [2], and
> my thoughts is to also track in the view as the use of these options
> will overtime make the indexes bloat over time and less space is
> given back to the OS due to skipped truncations. For most workloads,
> this should not be common, but I am thinking of the extreme cases
> or if someone potentially misconfigured the vacuum_failsafe_age.
>
> As I thought about this more, failsafe autovacuum could be tracked on
> the database level, pg_stat_database, since this guc can't be set
> on a relation level.

I thought again about adding a statistic to account for skipping
truncation or index scans. In my opinion, we have statistics like
removed_pages on the heap relations. They are "the count number pages
removed by relation truncation". So if truncation was disabled on the
heap relation, their count will not increase.

As for skipped_indexes, we added an index_vacuum_count statistic that
counts the number of indexes on the heap relation that were vacuumed
during the vacuum procedure. If their count does not increase, then
vacuum will likely skip them.

>> Also
>> the counter for number of index_cleanup skipped, truncate phase
>> skipped and toast vacuuming skipped ( the latter will only be relevant
>> for the main relation ).
>>
>> I can add, but concerns have already been expressed about the large amount of
>> vacuum statistics and, as a consequence, this leads
>> to the allocation of additional memory (3 times).
>> Of course, now we are saved by the guc I added for statistics...
>> I understand that this information can better show the efficiency of the vacuum,
>> but how does it help in setting it up for heap relations?
> An administrator will find this information to be useful especially
> if for some reason most vacuums are being run with these
> options being off either via a manual vacuum or someone
> turning off index_cleanup in the tables storage parameter.
>
> postgres=# alter table t set (vacuum_index_cleanup = off,
> vacuum_truncate = off );
> ALTER TABLE
You can take these parameters into account when analyzing vacuum
statistics, right? Display them side by side.
>> regarding the skipped truncate phase, the statistics are already collected in vacrel->nonempty_pages,
>> it's easy to put them outside. I think the current statistics only show the number of deleted tuples and pages
>> (both deleted and those visited by vacuum during tuple deletion), so the opposite view won't hurt.
> Can you clarify what you mean by "so the opposite view won't hurt." ?
I meant that it wouldn't be excessive, but at the moment I think
otherwise. We already have removed_pages and it will be enough.
>> index_cleanup skipped can be obtained based on information from a small number of
>> vacuum buffer statistics and the number of pages of indexes that belong to heap relations.
>> I think you can notice the behavior through current statistics:
> I don't think there is a view that provides cumulative vacuum buffer
> stats currently.
We show it now in the views for heap relations, index relations,
databases (pg_stat_vacuum_tables, pg_stat_vacuum_indexes,
pg_stat_vacuum_databases) or you meant something else?
> pg_stat_io could be helpful for this purpose, but that is a cluster
> wide view. As it
> stands now, I think it's quite difficult for a user to determine for a
> fact if indexes or
> truncate is being skipped

I think so, it is difficult to get a clear picture of what is happening
by analyzing only this information.

We collect other statistics on vacuumed relation pages that can help
give a full picture: the number of pages missed due to failure to get a
clean-up lock on an index (missed_tuples), the number of vacuumed tuples
(tuples_deleted), and recently deleted tuples (recently_dead_tuples). I
think that's enough.

>> Secondly, where to put the total time of vacuum for indexes and databases?
>> It would be incorrect not to take them into account at all. What if we remove the total time from
>> the heap statistics and add it to pg_stat_tables and only leave the vacuum statistics total time of
>> vacuum operation of indexes and databases?
>> It seems strange to me that they will have to be viewed from different views.
>>
>> I think it is necessary to look at the total time for tables into perspective of how much
>> time vacuum spent in total on processing indexes, since indexes can be bloated, for example.
>> I think it is better to leave these statistics here.
> You make valid points. I now think because track_vacuum_statistics is
> optional, we should track total_time in 2 places. First place in the new
> view being proposed here and the second place is in pg_stat_all_tables
> as being proposed here [3]. This way if track_vacuum_statistics is off, the
> total_time of vacuum could still be tracked by pg_stat_all_tables.
>
> By the way, the current patch does not track materialized view,
> but it should as materialized views can also be vacuumed.
>
> Regards,
>
> Sami
>
> [1]https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-VACUUM-FAILSAFE-AGE
> [2]https://github.com/postgres/postgres/blob/master/src/backend/access/heap/vacuumlazy.c#L2437-L2444
> [3]https://commitfest.postgresql.org/52/5485/
>
>
I don't agree with this.

Firstly, the hook is enabled by default, that is, it must be specially
disabled so that the vacuum statistics are not collected.

Secondly, it will cause confusion. First, the hook was disabled and
statistics were collected in one place - pg_stat_all_tables, and then it
was enabled and the user notices that the statistics there stopped
accumulating,
he is in a panic, "suddenly the vacuum does not work, what to do?". The
second point here bothers me, how to take into account this statistics
with the current detailed vacuum statistics? After all, adding these
values ​​​​is wrong -
they do not show the correct statistics regarding the same pages
processed by vacuum, ignoring it later means that they will be
redundant. I think it is better to save it here, since this will save us
from possible confusion.

Secondly, it will immediately show other important parameters regarding
this statistics - how long the vacuum was sleep (delay_time in my
patches), how much time the vacuum spent on processing indexes during
its processing.
Without this information, this assessment will not be voluminous and
indicative enough.

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v15-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 71.5 KB
v15-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 61.1 KB
v15-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 21.9 KB
v15-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.2 KB

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2025-01-10 14:51:10
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! I thought about this problem again and I think I have a solution.

On 02.12.2024 23:12, Alena Rybakina wrote:
>> * I've read the previous discussion on how important to keep all these
>> fields regarding vacuum statistics including points by Andrei and Jim.
>> It still worrying me that statistics volume is going to burst in about
>> 3 times, but I don't have a particular proposal on how to make more
>> granular approach. I wonder if you could propose something.
We can collect statistics on databases at all times - there are less
compared to vacuum statistics of relations, but they can give enough
information that can hint that something is going wrong.
With the track_vacuum_statistics guc we can cover cases of collecting
extended and complete information: when it is enabled, we will collect
vacuum statistics on relations both: heaps and indexes.
This will not lead to a synchronicity between constant database
statistics and temporary statistics of relations, since our vacuum
statistics are cumulative and it is assumed that we will look at changes
in statistics over a certain period.
What do you think?

--
Regards,
Alena Rybakina
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2025-01-10 15:31:59
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sorry, I made a typo due to lack of sleep, I've marked below where
exactly just in case.

On 10.01.2025 15:04, Alena Rybakina wrote:
>
> Hi, I have updated the patch. Fix minor mistakes in the document,
> added the wraparound_failsafe_count statistics - it accounts the
> number of times when the vacuum operates heap relation to prevent
> workaround problem, fixed "shemaname".
>
I didn't mean workaround problem but wraparound problem.
>>> Secondly, where to put the total time of vacuum for indexes and databases?
>>> It would be incorrect not to take them into account at all. What if we remove the total time from
>>> the heap statistics and add it to pg_stat_tables and only leave the vacuum statistics total time of
>>> vacuum operation of indexes and databases?
>>> It seems strange to me that they will have to be viewed from different views.
>>>
>>> I think it is necessary to look at the total time for tables into perspective of how much
>>> time vacuum spent in total on processing indexes, since indexes can be bloated, for example.
>>> I think it is better to leave these statistics here.
>> You make valid points. I now think because track_vacuum_statistics is
>> optional, we should track total_time in 2 places. First place in the new
>> view being proposed here and the second place is in pg_stat_all_tables
>> as being proposed here [3]. This way if track_vacuum_statistics is off, the
>> total_time of vacuum could still be tracked by pg_stat_all_tables.
>>
>> By the way, the current patch does not track materialized view,
>> but it should as materialized views can also be vacuumed.
>>
>> Regards,
>>
>> Sami
>>
>> [1]https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-VACUUM-FAILSAFE-AGE
>> [2]https://github.com/postgres/postgres/blob/master/src/backend/access/heap/vacuumlazy.c#L2437-L2444
>> [3]https://commitfest.postgresql.org/52/5485/
>>
>>
> I don't agree with this.
>
> Firstly, the hook is enabled by default, that is, it must be specially
> disabled so that the vacuum statistics are not collected.
>
> Secondly, it will cause confusion. First, the hook was disabled and
> statistics were collected in one place - pg_stat_all_tables, and then
> it was enabled and the user notices that the statistics there stopped
> accumulating,
> he is in a panic, "suddenly the vacuum does not work, what to do?".
> The second point here bothers me, how to take into account this
> statistics with the current detailed vacuum statistics? After all,
> adding these values ​​​​is wrong -
> they do not show the correct statistics regarding the same pages
> processed by vacuum, ignoring it later means that they will be
> redundant. I think it is better to save it here, since this will save
> us from possible confusion.
>
> Secondly, it will immediately show other important parameters
> regarding this statistics - how long the vacuum was sleep (delay_time
> in my patches), how much time the vacuum spent on processing indexes
> during its processing.
> Without this information, this assessment will not be voluminous and
> indicative enough.
>
I didn't mean hook but guc here.

--
Regards,
Alena Rybakina
Postgres Professional


From: Andrei Zubkov <zubkov(at)moonset(dot)ru>
To: Sami Imseih <samimseih(at)gmail(dot)com>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-13 08:54:36
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Sami,

Thank you for your attention to our patch and for your own work.

On Sun, 2025-01-05 at 20:00 -0600, Sami Imseih wrote:
>
> You make valid points. I now think because track_vacuum_statistics is
> optional, we should track total_time in 2 places. First place in the
> new
> view being proposed here and the second place is in
> pg_stat_all_tables
> as being proposed here [3]. This way if track_vacuum_statistics is
> off, the
> total_time of vacuum could still be tracked by pg_stat_all_tables.

I think that field total_time in pg_stat_all_tables is redundant at
least if it will be the only field we want to add there. Yes, we have
vacuum counts in pg_stat_all_tables, but those are not related to the
vacuum workload actually. When we think we see unusual numbers there,
we can answer the question "why" - we know the conditions causing
autovacuum to launch a vacuum on every particular table, we have tuple
statistics on this table, and we can detect anomalies here. For
example, when vacuum process should be launched 5 times, but was
launched only twice.

The total_time field is workload metric. Yes, we can calculate the
mean time of vacuum operation on every particular table but there is
nothing we can do with it. We don't know what this time should be for
this table now. We only can compare this metric to its values in the
past. But once we see this time raising we will immediately face the
question "why?". And we have nothing to say about it. Where the time
was spent: vacuuming heap, vacuuming indexes, sleeping in the delay
point or performing IO operations, is there actual workload performed
by vacuum increased with total_time, or now we are spending more time
for the same workload? I think if we are adding workload statistics to
the Cumulative Statistics System we should do it as complete as
possible.

--
Regards, Andrei Zubkov
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2025-01-13 11:19:46
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi, all! I updated the patches and I solved the problems with parallel
vacuum.

On 19.12.2024 13:37, Alena Rybakina wrote:
>
> Hi! I updated patch.
>
> On 02.12.2024 23:12, Alena Rybakina wrote:
>> On 02.12.2024 11:27, Alexander Korotkov wrote:
>>> Hi, Alena!
>>>
>>> On Wed, Nov 13, 2024 at 6:21 PM Alena Rybakina
>>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>>> Updated 0001-v13 attached, as well as the diff between v12 and v13.
>>>>
>>>> Thank you)
>>>>
>>>> And I agree with your changes. And included them in patches.
>>> Thank you for the updated patchset. Some points from me.
>>>
>>> * I've read the previous discussion on how important to keep all these
>>> fields regarding vacuum statistics including points by Andrei and Jim.
>>> It still worrying me that statistics volume is going to burst in about
>>> 3 times, but I don't have a particular proposal on how to make more
>>> granular approach. I wonder if you could propose something.
>
> I propose to solve this with a guc (track_vacuum_statistics) that will
> allow us to disable statistics collection or enable it back when
> needed. This statistics is useful if you collect it over a certain
> period of time and watch the dynamics of the change, so I think a hook
> will not hurt here.
>
> I also added new statistics vm_new_visible_frozen_pages due to the
> dc6acfd commit and renamed some of my statistics from frozen_pages to
> vm_new_frozen_pages and all-visible_pages to vm_new_visible_pages. I
> also added statistics missed_tupes, missed_pages. Both are necessary
> to take into account how many tuples were not cleared by vacuum due to
> failure to acquire a cleanup lock on a heap page. The second
> statistics missed_pages will allow you to track whether this is one
> particular page or not.
>
> This information can make it clear that perhaps the data is broken
> somewhere or there is an error in the operation of the database, for
> example.
>
I changed the purpose of the guc. By default, vacuum database statistics
are always collected, and the track_vacuum_statistics hook enables the
ability to collect extended statistics for a given database's relations.
This is done to achieve a balance between the allocated memory for
storing statistics and having the necessary monitoring at hand to track
the state of vacuum operation.
In addition to the fact that there are far fewer databases in a system
than relations, vacuum statistics for a database contain fewer
statistics than relations, but
they are enough to indicate that something may be wrong in the system
and prompt the administrator to enable extended monitoring for relations.

By default, the guc is disabled.

> I also noticed that my stats for indexes were not being collected
> while parallel vacuum was running. I fixed it by adding some extra
> code that basically captured changes to the
> parallel_vacuum_process_all_indexes function. I used a script like
> this to check if everything was correct.
>
> pgbench -d postgres -i -s 10
>
>
> my/inst/bin/pg_basebackup -D ~/backup
>
>
> #psql
>
> --check parallel vacuum statistics
>
> create index accounts_idx1 on pgbench_accounts(bid);
> create index accounts_idx2 on pgbench_accounts(aid, bid);
>
> delete from pgbench_accounts where aid >5;
> set max_parallel_maintenance_workers = 8;
> VACUUM (PARALLEL 3) pgbench_accounts;
>
>
> create index accounts_idx1 on pgbench_accounts(bid);
> create index accounts_idx2 on pgbench_accounts(aid, bid);
>
> delete from pgbench_accounts where aid >5;
> set max_parallel_maintenance_workers = 8;
> VACUUM (PARALLEL 3) pgbench_accounts;
>
>
> pg_ctl -D ../postgres_data11 -l logfile stop
>
> rm -rf ../postgres_data/*
>
> cp -r ~/backup/* ~/postgres_data/
>
> pg_ctl -D ../postgres_data11 -l logfile start
>
>
> --check vacuum statistics processed by postmaster only
>
> create index accounts_idx1 on pgbench_accounts(bid);
> create index accounts_idx2 on pgbench_accounts(aid, bid);
>
> delete from pgbench_accounts where aid >5;
> set max_parallel_maintenance_workers = 8;
> VACUUM (PARALLEL 0) pgbench_accounts;
>
> I noticed that rel_blks_read and rel_blks_hit are too small compared
> to the vacuum statistics when the vacuum is not parallel. I suspect
> that this problem is related to the fact that the relationship
> statistics have not reached that time. You can see that they are
> calculated in my patch like this:
>
> report->blks_fetched =
>     rel->pgstat_info->counts.blocks_fetched - counters->blocks_fetched;
>   report->blks_hit =
>     rel->pgstat_info->counts.blocks_hit - counters->blocks_hit;
>
>
> The second thing that bothered me was that some table stats differ in
> the fields total_blks_read, rel_blks_read, pages_removed. If with the
> buffer this could be related to the fact that in a single run we rely
> on the stats of the global buffer and shaft statistics and this could
> explain why there are more of them, then with pages_removed I have no
> explanation yet as to what could have happened. I am still studying this.
>
> When you have time, take a look at the patches, I will be glad to
> receive any feedback.
>
I fixed this problem. Now the statistics of parallel and non-parallel
vacuums are almost equal. See files output_vacuum_0_workers and
output_vacuum_3_workers. The results were obtained from the test that I
described above.
In fact, we need to add tracking statistics in the
parallel_vacuum_one_index. But I noticed another problem that I have
already fixed.

Vacuum statistics for indexes were accumulated in heap vacuum statistics
because of the complexity of vacuum operations,
namely, vacuum statistics for index cleaning were included in heap
relationship vacuum statistics.
Vacuum switches from cleaning the table to its indexes and back during
its operation, and we need to take this into account.
So, before cleaning indexes, we now save the collected vacuum statistics
for the heap, but we send it only after the processing is complete.

> To view statistics:
>
> select vt.relname, total_blks_read AS total_blks_read,
>   total_blks_hit AS total_blks_hit,
>   total_blks_dirtied AS total_blks_dirtied,
>   total_blks_written AS total_blks_written,
>
>   rel_blks_read AS rel_blks_read,
>   rel_blks_hit AS rel_blks_hit,
>
>   pages_deleted AS pages_deleted,
>   tuples_deleted AS tuples_deleted,
>
>   wal_records AS wal_records,
>   wal_fpi AS wal_fpi,
>   wal_bytes AS wal_bytes,
>
>   blk_read_time AS blk_read_time,
>   blk_write_time AS blk_write_time,
>
>   delay_time AS delay_time,
>   total_time AS total_time
>   FROM pg_stat_get_vacuum_indexes vt, pg_class c
> WHERE (vt.relname='accounts_idx1' or vt.relname='accounts_idx2' or
> vt.relname = 'pgbench_accounts_pkey') AND vt.relid = c.oid;
>
>
> select stats.relname,stats.total_blks_read AS total_blks_read,
>   stats.total_blks_hit AS total_blks_hit,
>   stats.total_blks_dirtied AS total_blks_dirtied,
>   stats.total_blks_written AS total_blks_written,
>
>   stats.rel_blks_read AS rel_blks_read,
>   stats.rel_blks_hit AS rel_blks_hit,
>
>   stats.pages_scanned AS pages_scanned,
>   stats.pages_removed AS pages_removed,
>   stats.pages_frozen AS pages_frozen,
>   stats.pages_all_visible AS pages_all_visible,
>   stats.tuples_deleted AS tuples_deleted,
>   stats.tuples_frozen AS tuples_frozen,
>   stats.dead_tuples AS dead_tuples,
>
>   stats.index_vacuum_count AS index_vacuum_count,
>   stats.wal_records AS wal_records,
>   stats.wal_fpi AS wal_fpi,
>   stats.wal_bytes AS wal_bytes,
>
>   stats.blk_read_time AS blk_read_time,
>   stats.blk_write_time AS blk_write_time,
>
>   stats.delay_time AS delay_time,
>   stats.total_time AS total_time from pg_stat_vacuum_tables stats,
> pg_stat_all_tables  WHERE stats.relname = 'pgbench_accounts' and
> stats.relid = pg_stat_all_tables.relid;
>
>
Just in case, I'll write that during the test I used simpler queries:

select * from pg_stat_vacuum_tables where relname like '%accounts%';

select * from pg_stat_vacuum_indexes where relname like '%accounts_%';

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
output_vacuum_0_workers text/plain 2.9 KB
output_parallel_3_workers text/plain 2.9 KB
v16-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 71.2 KB
v16-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 64.1 KB
v16-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 36.4 KB
v16-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.5 KB

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2025-01-13 11:29:31
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10.01.2025 17:51, Alena Rybakina wrote:
>
> Hi! I thought about this problem again and I think I have a solution.
>
> On 02.12.2024 23:12, Alena Rybakina wrote:
>>> * I've read the previous discussion on how important to keep all these
>>> fields regarding vacuum statistics including points by Andrei and Jim.
>>> It still worrying me that statistics volume is going to burst in about
>>> 3 times, but I don't have a particular proposal on how to make more
>>> granular approach. I wonder if you could propose something.
> We can collect statistics on databases at all times - there are less
> compared to vacuum statistics of relations, but they can give enough
> information that can hint that something is going wrong.
> With the track_vacuum_statistics guc we can cover cases of collecting
> extended and complete information: when it is enabled, we will collect
> vacuum statistics on relations both: heaps and indexes.
> This will not lead to a synchronicity between constant database
> statistics and temporary statistics of relations, since our vacuum
> statistics are cumulative and it is assumed that we will look at
> changes in statistics over a certain period.
> What do you think?

I implemented this in my latest patch version [0].

[0]
https://www.postgresql.org/message-id/1e81a0a1-a63b-48fb-905a-d6495f89ab73%40postgrespro.ru

--
Regards,
Alena Rybakina
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-13 13:26:57
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I noticed that the cfbot is bad, the reason seems to be related to the
lack of a parameter in src/backend/utils/misc/postgresql.conf.sample. I
added it, it should help.

On 13.01.2025 14:19, Alena Rybakina wrote:
>
> Hi, all! I updated the patches and I solved the problems with parallel
> vacuum.
>
> On 19.12.2024 13:37, Alena Rybakina wrote:
>>
>> Hi! I updated patch.
>>
>> On 02.12.2024 23:12, Alena Rybakina wrote:
>>> On 02.12.2024 11:27, Alexander Korotkov wrote:
>>>> Hi, Alena!
>>>>
>>>> On Wed, Nov 13, 2024 at 6:21 PM Alena Rybakina
>>>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>>>> Updated 0001-v13 attached, as well as the diff between v12 and v13.
>>>>>
>>>>> Thank you)
>>>>>
>>>>> And I agree with your changes. And included them in patches.
>>>> Thank you for the updated patchset. Some points from me.
>>>>
>>>> * I've read the previous discussion on how important to keep all these
>>>> fields regarding vacuum statistics including points by Andrei and Jim.
>>>> It still worrying me that statistics volume is going to burst in about
>>>> 3 times, but I don't have a particular proposal on how to make more
>>>> granular approach. I wonder if you could propose something.
>>
>> I propose to solve this with a guc (track_vacuum_statistics) that
>> will allow us to disable statistics collection or enable it back when
>> needed. This statistics is useful if you collect it over a certain
>> period of time and watch the dynamics of the change, so I think a
>> hook will not hurt here.
>>
>> I also added new statistics vm_new_visible_frozen_pages due to the
>> dc6acfd commit and renamed some of my statistics from frozen_pages to
>> vm_new_frozen_pages and all-visible_pages to vm_new_visible_pages. I
>> also added statistics missed_tupes, missed_pages. Both are necessary
>> to take into account how many tuples were not cleared by vacuum due
>> to failure to acquire a cleanup lock on a heap page. The second
>> statistics missed_pages will allow you to track whether this is one
>> particular page or not.
>>
>> This information can make it clear that perhaps the data is broken
>> somewhere or there is an error in the operation of the database, for
>> example.
>>
> I changed the purpose of the guc. By default, vacuum database
> statistics are always collected, and the track_vacuum_statistics hook
> enables the ability to collect extended statistics for a given
> database's relations.
> This is done to achieve a balance between the allocated memory for
> storing statistics and having the necessary monitoring at hand to
> track the state of vacuum operation.
> In addition to the fact that there are far fewer databases in a system
> than relations, vacuum statistics for a database contain fewer
> statistics than relations, but
> they are enough to indicate that something may be wrong in the system
> and prompt the administrator to enable extended monitoring for relations.
>
> By default, the guc is disabled.
>
>> I also noticed that my stats for indexes were not being collected
>> while parallel vacuum was running. I fixed it by adding some extra
>> code that basically captured changes to the
>> parallel_vacuum_process_all_indexes function. I used a script like
>> this to check if everything was correct.
>>
>> pgbench -d postgres -i -s 10
>>
>>
>> my/inst/bin/pg_basebackup -D ~/backup
>>
>>
>> #psql
>>
>> --check parallel vacuum statistics
>>
>> create index accounts_idx1 on pgbench_accounts(bid);
>> create index accounts_idx2 on pgbench_accounts(aid, bid);
>>
>> delete from pgbench_accounts where aid >5;
>> set max_parallel_maintenance_workers = 8;
>> VACUUM (PARALLEL 3) pgbench_accounts;
>>
>>
>> create index accounts_idx1 on pgbench_accounts(bid);
>> create index accounts_idx2 on pgbench_accounts(aid, bid);
>>
>> delete from pgbench_accounts where aid >5;
>> set max_parallel_maintenance_workers = 8;
>> VACUUM (PARALLEL 3) pgbench_accounts;
>>
>>
>> pg_ctl -D ../postgres_data11 -l logfile stop
>>
>> rm -rf ../postgres_data/*
>>
>> cp -r ~/backup/* ~/postgres_data/
>>
>> pg_ctl -D ../postgres_data11 -l logfile start
>>
>>
>> --check vacuum statistics processed by postmaster only
>>
>> create index accounts_idx1 on pgbench_accounts(bid);
>> create index accounts_idx2 on pgbench_accounts(aid, bid);
>>
>> delete from pgbench_accounts where aid >5;
>> set max_parallel_maintenance_workers = 8;
>> VACUUM (PARALLEL 0) pgbench_accounts;
>>
>> I noticed that rel_blks_read and rel_blks_hit are too small compared
>> to the vacuum statistics when the vacuum is not parallel. I suspect
>> that this problem is related to the fact that the relationship
>> statistics have not reached that time. You can see that they are
>> calculated in my patch like this:
>>
>> report->blks_fetched =
>>     rel->pgstat_info->counts.blocks_fetched - counters->blocks_fetched;
>>   report->blks_hit =
>>     rel->pgstat_info->counts.blocks_hit - counters->blocks_hit;
>>
>>
>> The second thing that bothered me was that some table stats differ in
>> the fields total_blks_read, rel_blks_read, pages_removed. If with the
>> buffer this could be related to the fact that in a single run we rely
>> on the stats of the global buffer and shaft statistics and this could
>> explain why there are more of them, then with pages_removed I have no
>> explanation yet as to what could have happened. I am still studying this.
>>
>> When you have time, take a look at the patches, I will be glad to
>> receive any feedback.
>>
> I fixed this problem. Now the statistics of parallel and non-parallel
> vacuums are almost equal. See files output_vacuum_0_workers and
> output_vacuum_3_workers. The results were obtained from the test that
> I described above.
> In fact, we need to add tracking statistics in the
> parallel_vacuum_one_index. But I noticed another problem that I have
> already fixed.
>
> Vacuum statistics for indexes were accumulated in heap vacuum
> statistics because of the complexity of vacuum operations,
> namely, vacuum statistics for index cleaning were included in heap
> relationship vacuum statistics.
> Vacuum switches from cleaning the table to its indexes and back during
> its operation, and we need to take this into account.
> So, before cleaning indexes, we now save the collected vacuum
> statistics for the heap, but we send it only after the processing is
> complete.
>
>> To view statistics:
>>
>> select vt.relname, total_blks_read AS total_blks_read,
>>   total_blks_hit AS total_blks_hit,
>>   total_blks_dirtied AS total_blks_dirtied,
>>   total_blks_written AS total_blks_written,
>>
>>   rel_blks_read AS rel_blks_read,
>>   rel_blks_hit AS rel_blks_hit,
>>
>>   pages_deleted AS pages_deleted,
>>   tuples_deleted AS tuples_deleted,
>>
>>   wal_records AS wal_records,
>>   wal_fpi AS wal_fpi,
>>   wal_bytes AS wal_bytes,
>>
>>   blk_read_time AS blk_read_time,
>>   blk_write_time AS blk_write_time,
>>
>>   delay_time AS delay_time,
>>   total_time AS total_time
>>   FROM pg_stat_get_vacuum_indexes vt, pg_class c
>> WHERE (vt.relname='accounts_idx1' or vt.relname='accounts_idx2' or
>> vt.relname = 'pgbench_accounts_pkey') AND vt.relid = c.oid;
>>
>>
>> select stats.relname,stats.total_blks_read AS total_blks_read,
>>   stats.total_blks_hit AS total_blks_hit,
>>   stats.total_blks_dirtied AS total_blks_dirtied,
>>   stats.total_blks_written AS total_blks_written,
>>
>>   stats.rel_blks_read AS rel_blks_read,
>>   stats.rel_blks_hit AS rel_blks_hit,
>>
>>   stats.pages_scanned AS pages_scanned,
>>   stats.pages_removed AS pages_removed,
>>   stats.pages_frozen AS pages_frozen,
>>   stats.pages_all_visible AS pages_all_visible,
>>   stats.tuples_deleted AS tuples_deleted,
>>   stats.tuples_frozen AS tuples_frozen,
>>   stats.dead_tuples AS dead_tuples,
>>
>>   stats.index_vacuum_count AS index_vacuum_count,
>>   stats.wal_records AS wal_records,
>>   stats.wal_fpi AS wal_fpi,
>>   stats.wal_bytes AS wal_bytes,
>>
>>   stats.blk_read_time AS blk_read_time,
>>   stats.blk_write_time AS blk_write_time,
>>
>>   stats.delay_time AS delay_time,
>>   stats.total_time AS total_time from pg_stat_vacuum_tables stats,
>> pg_stat_all_tables  WHERE stats.relname = 'pgbench_accounts' and
>> stats.relid = pg_stat_all_tables.relid;
>>
>>
> Just in case, I'll write that during the test I used simpler queries:
>
> select * from pg_stat_vacuum_tables where relname like '%accounts%';
>
> select * from pg_stat_vacuum_indexes where relname like '%accounts_%';
>
> --
> Regards,
> Alena Rybakina
> Postgres Professional

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v17-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 71.7 KB
v17-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 64.1 KB
v17-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 36.4 KB
v17-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.5 KB

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-02-02 20:43:19
Message-ID: CAPpHfduoJEuoixPTTg2tjhnXqrdobuMaQGxriqxJ9TjN1uxOuA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 13, 2025 at 3:26 PM Alena Rybakina
<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> I noticed that the cfbot is bad, the reason seems to be related to the lack of a parameter in src/backend/utils/misc/postgresql.conf.sample. I added it, it should help.

The patch doesn't apply cleanly. Please rebase.

I see you introduced new GUC variable pgstat_track_vacuum_statistics,
which should address the increased size of statistics. However, I
don't see how it could affect the size of PgStat_StatTabEntry struct.
It seems that when pgstat_track_vacuum_statistics == 0, extended
vacuum statistics is not collected but the size of hash table entries
is the same. Also, should pgstat_track_vacuum_statistics also affect
per database statistics?

The name of 0001 is "... on heap relations". Should we say "on table
relations", because new machinery should work with alternative table
AMs as well.

There are deletions of empty lines in
src/include/utils/pgstat_internal.h and src/include/pgstat.h. Please,
remote them as it's not purpose of this patchset.

------
Regards,
Alexander Korotkov
Supabase


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-02-04 15:22:52
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! Thank you for your review!

On 02.02.2025 23:43, Alexander Korotkov wrote:
> On Mon, Jan 13, 2025 at 3:26 PM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>> I noticed that the cfbot is bad, the reason seems to be related to the lack of a parameter in src/backend/utils/misc/postgresql.conf.sample. I added it, it should help.
> The patch doesn't apply cleanly. Please rebase.
I rebased them.
>
> I see you introduced new GUC variable pgstat_track_vacuum_statistics,
> which should address the increased size of statistics. However, I
> don't see how it could affect the size of PgStat_StatTabEntry struct.
> It seems that when pgstat_track_vacuum_statistics == 0, extended
> vacuum statistics is not collected but the size of hash table entries
> is the same.

Yes, hash table entries will be the same but vacuum_ext structure stored
in PgStat_StatTabEntry will not be filled with statistics, although
vacuum_ext structure stored in PgStat_StatDBEntry will be fill be.

> Also, should pgstat_track_vacuum_statistics also affect
> per database statistics?

According to my original idea, I thought that we could collect extended
statistics on relationships depending on whether the hook is enabled,
and always on databases. This will help us to constantly collect
statistics on the vacuum and notice when something is wrong and at the
same time not very expensive: there are much fewer databases compared to
the same relationships and there are much fewer statistics there. You
can introduce an additional hook that disables all collection of vacuum
statistics. This patch can be seen here for the 17th version of the
patch [0], I have not yet started adding it, since we did not come to an
agreement.

At the moment, I have made a patch for vacuum statistics for databases
that disables all vacuum statistics, the hook is disabled by default.

[0]
https://www.postgresql.org/message-id/attachment/170462/v17-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch

>
> The name of 0001 is "... on heap relations". Should we say "on table
> relations", because new machinery should work with alternative table
> AMs as well.
Agree. Fixed
>
> There are deletions of empty lines in
> src/include/utils/pgstat_internal.h and src/include/pgstat.h. Please,
> remote them as it's not purpose of this patchset.
>
fixed

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v18-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 71.2 KB
v18-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 58.0 KB
v18-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 31.1 KB
v18-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.5 KB

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-02-04 15:49:43
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04.02.2025 18:22, Alena Rybakina wrote:
>
>>   Also, should pgstat_track_vacuum_statistics also affect
>> per database statistics?
>
> According to my original idea, I thought that we could collect
> extended statistics on relationships depending on whether the hook is
> enabled, and always on databases. This will help us to constantly
> collect statistics on the vacuum and notice when something is wrong
> and at the same time not very expensive: there are much fewer
> databases compared to the same relationships and there are much fewer
> statistics there. You can introduce an additional hook that disables
> all collection of vacuum statistics. This patch can be seen here for
> the 17th version of the patch [0], I have not yet started adding it,
> since we did not come to an agreement.
I added this version as
"v19-0003.2-extended-vacuum-statistics.patch.no-cbot"

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v19-0003.2-extended-vacuum-statistics.patch.no-cfbot text/plain 37.0 KB

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-02-05 06:59:40
Message-ID: CAPpHfdvchPY9=gW2tL9awfh3kfFB0Ktt8K5F4kJgYRAWOrebrA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 4, 2025 at 5:22 PM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> Hi! Thank you for your review!
>
> On 02.02.2025 23:43, Alexander Korotkov wrote:
> > On Mon, Jan 13, 2025 at 3:26 PM Alena Rybakina
> > <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> >> I noticed that the cfbot is bad, the reason seems to be related to the lack of a parameter in src/backend/utils/misc/postgresql.conf.sample. I added it, it should help.
> > The patch doesn't apply cleanly. Please rebase.
> I rebased them.
> >
> > I see you introduced new GUC variable pgstat_track_vacuum_statistics,
> > which should address the increased size of statistics. However, I
> > don't see how it could affect the size of PgStat_StatTabEntry struct.
> > It seems that when pgstat_track_vacuum_statistics == 0, extended
> > vacuum statistics is not collected but the size of hash table entries
> > is the same.
>
> Yes, hash table entries will be the same but vacuum_ext structure stored
> in PgStat_StatTabEntry will not be filled with statistics, although
> vacuum_ext structure stored in PgStat_StatDBEntry will be fill be.

What is the point for disabling pgstat_track_vacuum_statistics then?
I don't see it saves any valuable resources. The original point by
Masahiko Sawada was growth of data structures in times [1] (and
corresponding memory consumption especially with large number of
tables). Now, disabling pgstat_track_vacuum_statistics only saves
some cycles of pgstat_accumulate_extvac_stats(), and that seems
insignificant.

I see that we use hash tables with static element size. So, we can't
save space by dynamically changing entries size on the base of GUC.
But could we move vacuum statistics to separate hash tables? When GUC
is disabled, new hash tables could be just empty.

Links
1. https://www.postgresql.org/message-id/CAD21AoD66b3u28n%3D73kudgMp5wiGiyYUN9LuC9z2ka6YTru5Gw%40mail.gmail.com

------
Regards,
Alexander Korotkov
Supabase


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-02-05 12:03:24
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 05.02.2025 09:59, Alexander Korotkov wrote:
> What is the point for disabling pgstat_track_vacuum_statistics then?
> I don't see it saves any valuable resources. The original point by
> Masahiko Sawada was growth of data structures in times [1] (and
> corresponding memory consumption especially with large number of
> tables). Now, disabling pgstat_track_vacuum_statistics only saves
> some cycles of pgstat_accumulate_extvac_stats(), and that seems
> insignificant.
>
> I see that we use hash tables with static element size. So, we can't
> save space by dynamically changing entries size on the base of GUC.
> But could we move vacuum statistics to separate hash tables? When GUC
> is disabled, new hash tables could be just empty.
>
> Links
> 1.https://www.postgresql.org/message-id/CAD21AoD66b3u28n%3D73kudgMp5wiGiyYUN9LuC9z2ka6YTru5Gw%40mail.gmail.com
>
I understand what you're talking about. I'm looking at the
pgstat_assoc_relation function and I think that's where I need to decide
whether we need to allocate memory in the hash table for vacuum
statistics for them or not.
The same thing happens there depending on the installed
pgstat_track_counts guc and pgstat_enabled value consequently. Like here:

Specifically, there is an example that for partitions, for example,
statistics are not accumulated and the condition used like that, like here:

if(!pgstat_track_counts)
{
if(rel->pgstat_info)
pgstat_unlink_relation(rel);
/* We're not counting at all */
rel->pgstat_enabled= false;
rel->pgstat_info= NULL;
return;
}
I think I can try yo add an external parameter in the relation like
ext_vacuum_pgstat_info and determine its values depending on the guc's
pgstat_track_vacuum_statisticsvalue.

--
Regards,
Alena Rybakina
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-02-17 14:46:16
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04.02.2025 18:22, Alena Rybakina wrote:
> Hi! Thank you for your review!
>
> On 02.02.2025 23:43, Alexander Korotkov wrote:
>> On Mon, Jan 13, 2025 at 3:26 PM Alena Rybakina
>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>> I noticed that the cfbot is bad, the reason seems to be related to
>>> the lack of a parameter in
>>> src/backend/utils/misc/postgresql.conf.sample. I added it, it should
>>> help.
>> The patch doesn't apply cleanly.  Please rebase.
> I rebased them.
The patch needed a rebase again. There is nothing new since version 18,
only a rebase.

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v19-0001-Implement-Self-Join-Elimination.patch text/x-patch 131.4 KB
v19-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 71.1 KB
v19-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 55.8 KB
v19-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.5 KB

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-02-27 18:00:29
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!
On 17.02.2025 17:46, Alena Rybakina wrote:
> On 04.02.2025 18:22, Alena Rybakina wrote:
>> Hi! Thank you for your review!
>>
>> On 02.02.2025 23:43, Alexander Korotkov wrote:
>>> On Mon, Jan 13, 2025 at 3:26 PM Alena Rybakina
>>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>>> I noticed that the cfbot is bad, the reason seems to be related to
>>>> the lack of a parameter in
>>>> src/backend/utils/misc/postgresql.conf.sample. I added it, it
>>>> should help.
>>> The patch doesn't apply cleanly.  Please rebase.
>> I rebased them.
> The patch needed a rebase again. There is nothing new since version
> 18, only a rebase.

The patch needed a new rebase.

Sorry, but due to feeling unwell I picked up a patch from another thread
and squashed the patch for vacuum statistics for indexes and heaps in
the previous version. Now I fixed everything together with the rebase.

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v20-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 71.1 KB
v20-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 58.0 KB
v20-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 31.1 KB
v20-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.5 KB

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-10 09:13:08
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

After commit eaf5027 we should add information about wal_buffers_full.

Any thoughts?

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.


From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-10 13:33:15
Message-ID: CALdSSPiw_-0_L3YV=Qn7oopPqY2XVrXwDSGLdSXS69QvMdXisQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 27 Feb 2025 at 23:00, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> Hi!
> On 17.02.2025 17:46, Alena Rybakina wrote:
> > On 04.02.2025 18:22, Alena Rybakina wrote:
> >> Hi! Thank you for your review!
> >>
> >> On 02.02.2025 23:43, Alexander Korotkov wrote:
> >>> On Mon, Jan 13, 2025 at 3:26 PM Alena Rybakina
> >>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> >>>> I noticed that the cfbot is bad, the reason seems to be related to
> >>>> the lack of a parameter in
> >>>> src/backend/utils/misc/postgresql.conf.sample. I added it, it
> >>>> should help.
> >>> The patch doesn't apply cleanly. Please rebase.
> >> I rebased them.
> > The patch needed a rebase again. There is nothing new since version
> > 18, only a rebase.
>
> The patch needed a new rebase.
>
> Sorry, but due to feeling unwell I picked up a patch from another thread
> and squashed the patch for vacuum statistics for indexes and heaps in
> the previous version. Now I fixed everything together with the rebase.
>
> --
> Regards,
> Alena Rybakina
> Postgres Professional

Hi!
CI fails on this one[0]

Is it a flap or a real problem caused by v20?

```

SELECT relpages AS irp
diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/vacuum_tables_and_db_statistics.out
/tmp/cirrus-ci-build/build-32/testrun/recovery/027_stream_regress/data/results/vacuum_tables_and_db_statistics.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/vacuum_tables_and_db_statistics.out
2025-03-10 09:36:10.274799176 +0000
+++ /tmp/cirrus-ci-build/build-32/testrun/recovery/027_stream_regress/data/results/vacuum_tables_and_db_statistics.out
2025-03-10 09:49:35.796596462 +0000
@@ -65,7 +65,7 @@
WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
relname | vm_new_frozen_pages | tuples_deleted | relpages |
pages_scanned | pages_removed
---------+---------------------+----------------+----------+---------------+---------------
- vestat | 0 | 0 | 455 |
0 | 0
+ vestat | 0 | 0 | 417 |
0 | 0
(1 row)

SELECT relpages AS rp
=== EOF ===

```

[0] https://api.cirrus-ci.com/v1/artifact/task/5336493629112320/testrun/build-32/testrun/recovery/027_stream_regress/log/regress_log_027_stream_regress

--
Best regards,
Kirill Reshke


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-12 19:36:39
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 10.03.2025 16:33, Kirill Reshke wrote:
> On Thu, 27 Feb 2025 at 23:00, Alena Rybakina<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>> Hi!
>> On 17.02.2025 17:46, Alena Rybakina wrote:
>>> On 04.02.2025 18:22, Alena Rybakina wrote:
>>>> Hi! Thank you for your review!
>>>>
>>>> On 02.02.2025 23:43, Alexander Korotkov wrote:
>>>>> On Mon, Jan 13, 2025 at 3:26 PM Alena Rybakina
>>>>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>>>>> I noticed that the cfbot is bad, the reason seems to be related to
>>>>>> the lack of a parameter in
>>>>>> src/backend/utils/misc/postgresql.conf.sample. I added it, it
>>>>>> should help.
>>>>> The patch doesn't apply cleanly. Please rebase.
>>>> I rebased them.
>>> The patch needed a rebase again. There is nothing new since version
>>> 18, only a rebase.
>> The patch needed a new rebase.
>>
>> Sorry, but due to feeling unwell I picked up a patch from another thread
>> and squashed the patch for vacuum statistics for indexes and heaps in
>> the previous version. Now I fixed everything together with the rebase.
>>
>> --
>> Regards,
>> Alena Rybakina
>> Postgres Professional
> Hi!
> CI fails on this one[0]
>
> Is it a flap or a real problem caused by v20?
>
> ```
>
> SELECT relpages AS irp
> diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/vacuum_tables_and_db_statistics.out
> /tmp/cirrus-ci-build/build-32/testrun/recovery/027_stream_regress/data/results/vacuum_tables_and_db_statistics.out
> --- /tmp/cirrus-ci-build/src/test/regress/expected/vacuum_tables_and_db_statistics.out
> 2025-03-10 09:36:10.274799176 +0000
> +++ /tmp/cirrus-ci-build/build-32/testrun/recovery/027_stream_regress/data/results/vacuum_tables_and_db_statistics.out
> 2025-03-10 09:49:35.796596462 +0000
> @@ -65,7 +65,7 @@
> WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
> relname | vm_new_frozen_pages | tuples_deleted | relpages |
> pages_scanned | pages_removed
> ---------+---------------------+----------------+----------+---------------+---------------
> - vestat | 0 | 0 | 455 |
> 0 | 0
> + vestat | 0 | 0 | 417 |
> 0 | 0
> (1 row)
>
> SELECT relpages AS rp
> === EOF ===
>
>
> ```
>
> [0]https://api.cirrus-ci.com/v1/artifact/task/5336493629112320/testrun/build-32/testrun/recovery/027_stream_regress/log/regress_log_027_stream_regress
Thank you for your help, I'll fix it soon.

--
Regards,
Alena Rybakina
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-12 19:41:33
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 10.03.2025 12:13, Ilia Evdokimov wrote:
> Hi,
>
> After commit eaf5027 we should add information about wal_buffers_full.
>
> Any thoughts?
>
> --
> Best regards,
> Ilia Evdokimov,
> Tantor Labs LLC.
>
I think I can add it. To be honest, I haven't gotten to know this
statistic yet, haven't had time to get to know this commit yet. How will
this statistic help us analyze the work of the vacuum for relations?

--
Regards,
Alena Rybakina
Postgres Professional


From: Jim Nasby <jnasby(at)upgrade(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-12 22:15:53
Message-ID: CAMFBP2oXkhX_k9FTqtW-LdTBepVq0PDuBEGO8-LpNGbyHTBrNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 12, 2025 at 2:41 PM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
wrote:

> Hi!
>
> On 10.03.2025 12:13, Ilia Evdokimov wrote:
> > Hi,
> >
> > After commit eaf5027 we should add information about wal_buffers_full.
> >
> > Any thoughts?
> >
> > --
> > Best regards,
> > Ilia Evdokimov,
> > Tantor Labs LLC.
> >
> I think I can add it. To be honest, I haven't gotten to know this
> statistic yet, haven't had time to get to know this commit yet. How will
> this statistic help us analyze the work of the vacuum for relations?
>

The usecase I can see here is that we don't want autovac creating so much
WAL traffic that it starts forcing other backends to have to write WAL out.
But tracking how many times autovac writes WAL buffers won't help with that
(though we also don't want any WAL buffers written by autovac to be counted
in the system-wide wal_buffers_full: autovac is a BG process and it's fine
if it's spending time writing WAL out). I think the same is true of a
manual vacuum as well.

What would be helpful would be a way to determine if autovac was causing
enough traffic to force other backends to write WAL. Offhand I'm not sure
how practical that actually is though.

BTW, there's also an argument to be made that autovac should throttle
itself if we're close to running out of available WAL buffers...


From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Jim Nasby <jnasby(at)upgrade(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-13 06:42:57
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Wed, Mar 12, 2025 at 05:15:53PM -0500, Jim Nasby wrote:
> The usecase I can see here is that we don't want autovac creating so much
> WAL traffic that it starts forcing other backends to have to write WAL out.
> But tracking how many times autovac writes WAL buffers won't help with that

Right, because the one that increments the wal_buffers_full metric could "just"
be a victim (i.e the one that happens to trigger the WAL buffers disk flush,
even though other backends contributed most of the buffer usage).

> (though we also don't want any WAL buffers written by autovac to be counted
> in the system-wide wal_buffers_full:

why? Or do you mean that it would be good to have 2 kinds of metrics: one
generated by "maintenance" activity and one by "regular" backends?

> What would be helpful would be a way to determine if autovac was causing
> enough traffic to force other backends to write WAL. Offhand I'm not sure
> how practical that actually is though.

a051e71e28a could help to see how much WAL has by written by the autovac workers.

> BTW, there's also an argument to be made that autovac should throttle
> itself if we're close to running out of available WAL buffers...

hmm, yeah I think that's an interesting idea OTOH that would mean to "delegate"
the WAL buffers flush to another backend.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


From: vignesh C <vignesh21(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-17 06:42:48
Message-ID: CALDaNm2pSezCNsuH0C8ktR7XN7U_Vk=tqN7xexZsJtSmdWnHDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 27 Feb 2025 at 23:30, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> Hi!
> On 17.02.2025 17:46, Alena Rybakina wrote:
> > On 04.02.2025 18:22, Alena Rybakina wrote:
> >> Hi! Thank you for your review!
> >>
> >> On 02.02.2025 23:43, Alexander Korotkov wrote:
> >>> On Mon, Jan 13, 2025 at 3:26 PM Alena Rybakina
> >>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> >>>> I noticed that the cfbot is bad, the reason seems to be related to
> >>>> the lack of a parameter in
> >>>> src/backend/utils/misc/postgresql.conf.sample. I added it, it
> >>>> should help.
> >>> The patch doesn't apply cleanly. Please rebase.
> >> I rebased them.
> > The patch needed a rebase again. There is nothing new since version
> > 18, only a rebase.
>
> The patch needed a new rebase.

I noticed that the CI failure reported at [1], Ilia's comment from
[2], changed the status to Waiting on Author, please address them and
update it to Needs review.
[1] - https://www.postgresql.org/message-id/CALdSSPiw_-0_L3YV%3DQn7oopPqY2XVrXwDSGLdSXS69QvMdXisQ%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/47a7b784-5218-43f2-96e3-65f9a729c5a5%40tantorlabs.com

Regards,
Vignesh


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-18 05:57:40
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 17.03.2025 09:42, vignesh C wrote:
> On Thu, 27 Feb 2025 at 23:30, Alena Rybakina<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>> Hi!
>> On 17.02.2025 17:46, Alena Rybakina wrote:
>>> On 04.02.2025 18:22, Alena Rybakina wrote:
>>>> Hi! Thank you for your review!
>>>>
>>>> On 02.02.2025 23:43, Alexander Korotkov wrote:
>>>>> On Mon, Jan 13, 2025 at 3:26 PM Alena Rybakina
>>>>> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>>>>> I noticed that the cfbot is bad, the reason seems to be related to
>>>>>> the lack of a parameter in
>>>>>> src/backend/utils/misc/postgresql.conf.sample. I added it, it
>>>>>> should help.
>>>>> The patch doesn't apply cleanly. Please rebase.
>>>> I rebased them.
>>> The patch needed a rebase again. There is nothing new since version
>>> 18, only a rebase.
>> The patch needed a new rebase.
> I noticed that the CI failure reported at [1], Ilia's comment from
> [2], changed the status to Waiting on Author, please address them and
> update it to Needs review.
> [1] -https://www.postgresql.org/message-id/CALdSSPiw_-0_L3YV%3DQn7oopPqY2XVrXwDSGLdSXS69QvMdXisQ%40mail.gmail.com
> [2] -https://www.postgresql.org/message-id/47a7b784-5218-43f2-96e3-65f9a729c5a5%40tantorlabs.com
Okay, thank you!

--
Regards,
Alena Rybakina
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-21 19:42:17
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13.03.2025 09:42, Bertrand Drouvot wrote:
> Hi,
>
> On Wed, Mar 12, 2025 at 05:15:53PM -0500, Jim Nasby wrote:
>> The usecase I can see here is that we don't want autovac creating so much
>> WAL traffic that it starts forcing other backends to have to write WAL out.
>> But tracking how many times autovac writes WAL buffers won't help with that
> Right, because the one that increments the wal_buffers_full metric could "just"
> be a victim (i.e the one that happens to trigger the WAL buffers disk flush,
> even though other backends contributed most of the buffer usage).
>
>> (though we also don't want any WAL buffers written by autovac to be counted
>> in the system-wide wal_buffers_full:
> why? Or do you mean that it would be good to have 2 kinds of metrics: one
> generated by "maintenance" activity and one by "regular" backends?
>
>> What would be helpful would be a way to determine if autovac was causing
>> enough traffic to force other backends to write WAL. Offhand I'm not sure
>> how practical that actually is though.
> a051e71e28a could help to see how much WAL has by written by the autovac workers.
>
>> BTW, there's also an argument to be made that autovac should throttle
>> itself if we're close to running out of available WAL buffers...
> hmm, yeah I think that's an interesting idea OTOH that would mean to "delegate"
> the WAL buffers flush to another backend.
>
> Regards,
>

I will add it and fix the tests but later and I'll explain why.

I'm working on this issue [0] and try have already created new
statistics in Statistics Collector to store database and relation vacuum
statistics: PGSTAT_KIND_VACUUM_DB and PGSTAT_KIND_VACUUM_RELATION.

Vacuum statistics are saved there instead of relation's and database's
statistic structure, but for some reason it is not possible to find them
in the hash table when building a snapshot and display them accordingly.
I have not yet figured out where the error is.

Without solving this problem, committing vacuum statistics is not yet
possible. An alternative way for us was to refuse some statistics for
now for relations,
but we could not agree on which statistics should not be displayed yet
and for now we are only adding them :).

I understand why this is important to display more vacuum information
about vacuum statistics - it will allow us to better understand the
problems of incorrect vacuum settings or, for example, notice a bug in
its operation.

In order to reduce the memory consumption for storing them for those who
are not going to use them, I just realized that we need to create a
separate space for storing the statistics
I mentioned above (PGSTAT_KIND_VACUUM_DB and
PGSTAT_KIND_VACUUM_RELATION), there is no other way to do this and I am
still trying to complete this functionality.

I doubt that I will have time for this by code freeze date and even if I
do, I will hardly have time for a normal review. There's really a lot
more to learn related to the stat collector, so
I'm postponing it to the next commitfest.

Sorry. I'll fix the tests as soon as I finish this part, since they'll
most likely either break the same way or in some new way.

Tomorrow or the day after tomorrow I will send a diff patch with what I
have already managed to demonstrate the problem, since I need to bring
the code to a normal form.
Maybe someone who worked with the stat collector will suddenly tell me
where and what I have implemented incorrectly.

--
Regards,
Alena Rybakina
Postgres Professional


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-21 19:46:41
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sorry, I forgot to provide a link to the problem [0], actually. So I
provided it below.

[0]
https://www.postgresql.org/message-id/CAPpHfduoJEuoixPTTg2tjhnXqrdobuMaQGxriqxJ9TjN1uxOuA%40mail.gmail.com

On 21.03.2025 22:42, Alena Rybakina wrote:
> On 13.03.2025 09:42, Bertrand Drouvot wrote:
>> Hi,
>>
>> On Wed, Mar 12, 2025 at 05:15:53PM -0500, Jim Nasby wrote:
>>> The usecase I can see here is that we don't want autovac creating so much
>>> WAL traffic that it starts forcing other backends to have to write WAL out.
>>> But tracking how many times autovac writes WAL buffers won't help with that
>> Right, because the one that increments the wal_buffers_full metric could "just"
>> be a victim (i.e the one that happens to trigger the WAL buffers disk flush,
>> even though other backends contributed most of the buffer usage).
>>
>>> (though we also don't want any WAL buffers written by autovac to be counted
>>> in the system-wide wal_buffers_full:
>> why? Or do you mean that it would be good to have 2 kinds of metrics: one
>> generated by "maintenance" activity and one by "regular" backends?
>>
>>> What would be helpful would be a way to determine if autovac was causing
>>> enough traffic to force other backends to write WAL. Offhand I'm not sure
>>> how practical that actually is though.
>> a051e71e28a could help to see how much WAL has by written by the autovac workers.
>>
>>> BTW, there's also an argument to be made that autovac should throttle
>>> itself if we're close to running out of available WAL buffers...
>> hmm, yeah I think that's an interesting idea OTOH that would mean to "delegate"
>> the WAL buffers flush to another backend.
>>
>> Regards,
>>
>
> I will add it and fix the tests but later and I'll explain why.
>
> I'm working on this issue [0] and try have already created new
> statistics in Statistics Collector to store database and relation
> vacuum statistics: PGSTAT_KIND_VACUUM_DB and PGSTAT_KIND_VACUUM_RELATION.
>
> Vacuum statistics are saved there instead of relation's and database's
> statistic structure, but for some reason it is not possible to find
> them in the hash table when building a snapshot and display them
> accordingly.
> I have not yet figured out where the error is.
>
> Without solving this problem, committing vacuum statistics is not yet
> possible. An alternative way for us was to refuse some statistics for
> now for relations,
> but we could not agree on which statistics should not be displayed yet
> and for now we are only adding them :).
>
> I understand why this is important to display more vacuum information
> about vacuum statistics - it will allow us to better understand the
> problems of incorrect vacuum settings or, for example, notice a bug in
> its operation.
>
> In order to reduce the memory consumption for storing them for those
> who are not going to use them, I just realized that we need to create
> a separate space for storing the statistics
> I mentioned above (PGSTAT_KIND_VACUUM_DB and
> PGSTAT_KIND_VACUUM_RELATION), there is no other way to do this and I
> am still trying to complete this functionality.
>
> I doubt that I will have time for this by code freeze date and even if
> I do, I will hardly have time for a normal review. There's really a
> lot more to learn related to the stat collector, so
> I'm postponing it to the next commitfest.
>
> Sorry. I'll fix the tests as soon as I finish this part, since they'll
> most likely either break the same way or in some new way.
>
> Tomorrow or the day after tomorrow I will send a diff patch with what
> I have already managed to demonstrate the problem, since I need to
> bring the code to a normal form.
> Maybe someone who worked with the stat collector will suddenly tell me
> where and what I have implemented incorrectly.
>

--
Regards,
Alena Rybakina
Postgres Professional


From: Jim Nasby <jnasby(at)upgrade(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-24 23:02:31
Message-ID: CAMFBP2o=K-+EBiOb70F8pHmW9f0O3iDLtyV5MjgeCgBvWUQsdg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 21, 2025 at 2:42 PM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
wrote:

> On 13.03.2025 09:42, Bertrand Drouvot wrote:
>
> Hi,
>
> On Wed, Mar 12, 2025 at 05:15:53PM -0500, Jim Nasby wrote:
>
> The usecase I can see here is that we don't want autovac creating so much
> WAL traffic that it starts forcing other backends to have to write WAL out.
> But tracking how many times autovac writes WAL buffers won't help with that
>
> Right, because the one that increments the wal_buffers_full metric could "just"
> be a victim (i.e the one that happens to trigger the WAL buffers disk flush,
> even though other backends contributed most of the buffer usage).
>
>
> (though we also don't want any WAL buffers written by autovac to be counted
> in the system-wide wal_buffers_full:
>
> why? Or do you mean that it would be good to have 2 kinds of metrics: one
> generated by "maintenance" activity and one by "regular" backends?
>
> See below...

> What would be helpful would be a way to determine if autovac was causing
> enough traffic to force other backends to write WAL. Offhand I'm not sure
> how practical that actually is though.
>
> a051e71e28a could help to see how much WAL has by written by the autovac workers.
>
> I still don't think that helps (see below)

> BTW, there's also an argument to be made that autovac should throttle
> itself if we're close to running out of available WAL buffers...
>
> hmm, yeah I think that's an interesting idea OTOH that would mean to "delegate"
> the WAL buffers flush to another backend.
>
> Maybe it does, maybe it doesn't... but now I think you're getting why I'm
complaining about the proposed WAL flush metrics: who *flushes* WAL tells
you absolutely nothing about who generated the WAL. Not only that, but
flushing WAL isn't necessarily even bad: a user backend can't COMMIT
without flushing some amount of WAL (ignoring async-commit of course). That
really casts the whole idea of having stats on who's flushing how much WAL
in a new light: you can NOT use any such metric without a bunch of other
context; including who else was flushing how much WAL, whether WAL had to
absolutely be flushed anyway (ie, at bare minimum a COMMIT must flush
enough WAL to cover the commit record), and even where all the WAL is
coming from in the first place.

Though now that I think about it... if we're reporting how much WAL is
being generated by vacuum, then *maybe* it's helpful to also report how
much WAL is being flushed by vacuum. My emphasis on *maybe* is because it's
fine if autovac is writing more than it flushes, so long as the remainder
is being flushed by the checkpointer and not user backends... but you could
also determine that just by looking at how much WAL backends are flushing.

Basically, I'm leaning towards it would be best to rethink the whole
purpose of reporting WAL flush metrics before we further muddy the waters
by adding vacuum stats about it. At minimum we should have a metric that
shows how much WAL backends flushed because they *had* to due to
synchronous commit settings (which does affect more than just COMMIT).

> I will add it and fix the tests but later and I'll explain why.
>
> I'm working on this issue [0] and try have already created new statistics
> in Statistics Collector to store database and relation vacuum statistics:
> PGSTAT_KIND_VACUUM_DB and PGSTAT_KIND_VACUUM_RELATION.
>
> Vacuum statistics are saved there instead of relation's and database's
> statistic structure, but for some reason it is not possible to find them in
> the hash table when building a snapshot and display them accordingly.
> I have not yet figured out where the error is.
>
> Without solving this problem, committing vacuum statistics is not yet
> possible. An alternative way for us was to refuse some statistics for now
> for relations,
> but we could not agree on which statistics should not be displayed yet and
> for now we are only adding them :).
>
> I understand why this is important to display more vacuum information
> about vacuum statistics - it will allow us to better understand the
> problems of incorrect vacuum settings or, for example, notice a bug in its
> operation.
>
> In order to reduce the memory consumption for storing them for those who
> are not going to use them, I just realized that we need to create a
> separate space for storing the statistics
> I mentioned above (PGSTAT_KIND_VACUUM_DB and PGSTAT_KIND_VACUUM_RELATION),
> there is no other way to do this and I am still trying to complete this
> functionality.
>
> I doubt that I will have time for this by code freeze date and even if I
> do, I will hardly have time for a normal review. There's really a lot more
> to learn related to the stat collector, so
> I'm postponing it to the next commitfest.
>
> Sorry. I'll fix the tests as soon as I finish this part, since they'll
> most likely either break the same way or in some new way.
>
> Tomorrow or the day after tomorrow I will send a diff patch with what I
> have already managed to demonstrate the problem, since I need to bring the
> code to a normal form.
> Maybe someone who worked with the stat collector will suddenly tell me
> where and what I have implemented incorrectly.
>
> --
> Regards,
> Alena Rybakina
> Postgres Professional
>
>


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-03-25 06:12:57
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi! I rebased the patches again - PGSTAT_FILE_FORMAT_ID needed to be fixed.

On 21.03.2025 22:42, Alena Rybakina wrote:
>
> I will add it and fix the tests but later and I'll explain why.
>
> I'm working on this issue [0] and try have already created new
> statistics in Statistics Collector to store database and relation
> vacuum statistics: PGSTAT_KIND_VACUUM_DB and PGSTAT_KIND_VACUUM_RELATION.
>
> Vacuum statistics are saved there instead of relation's and database's
> statistic structure, but for some reason it is not possible to find
> them in the hash table when building a snapshot and display them
> accordingly.
> I have not yet figured out where the error is.
>
> Without solving this problem, committing vacuum statistics is not yet
> possible. An alternative way for us was to refuse some statistics for
> now for relations,
> but we could not agree on which statistics should not be displayed yet
> and for now we are only adding them :).
>
> I understand why this is important to display more vacuum information
> about vacuum statistics - it will allow us to better understand the
> problems of incorrect vacuum settings or, for example, notice a bug in
> its operation.
>
> In order to reduce the memory consumption for storing them for those
> who are not going to use them, I just realized that we need to create
> a separate space for storing the statistics
> I mentioned above (PGSTAT_KIND_VACUUM_DB and
> PGSTAT_KIND_VACUUM_RELATION), there is no other way to do this and I
> am still trying to complete this functionality.
>
> I doubt that I will have time for this by code freeze date and even if
> I do, I will hardly have time for a normal review. There's really a
> lot more to learn related to the stat collector, so
> I'm postponing it to the next commitfest.
>
> Sorry. I'll fix the tests as soon as I finish this part, since they'll
> most likely either break the same way or in some new way.
>
> Tomorrow or the day after tomorrow I will send a diff patch with what
> I have already managed to demonstrate the problem, since I need to
> bring the code to a normal form.
> Maybe someone who worked with the stat collector will suddenly tell me
> where and what I have implemented incorrectly.
>
I attached also diff version that contains what I was talking about. The
test case:

create table t (x int);

insert into t select id from generate_series(1,1000) id;

delete from t where id > 900;

vacuum;

select * from pg_stat_vacuum_tables where relname = 't';

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v21-0004-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.5 KB
v21-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 31.1 KB
v21-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 58.0 KB
v21-0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 71.1 KB
vacuum_moved_new_entry.diff.no-cfbot text/plain 29.1 KB

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-04-22 18:23:57
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/28/24 14:40, Alexander Korotkov wrote:
> On Sun, Aug 25, 2024 at 6:59 PM Alena Rybakina
>> If I missed something or misunderstood, can you explain in more detail?
>
> Actually, I mean why do we need a possibility to return statistics for
> all tables/indexes in one function call? User anyway is supposed to
> use pg_stat_vacuum_indexes/pg_stat_vacuum_tables view, which do
> function calls one per relation. I suppose we can get rid of
> possibility to get all the objects in one function call and just
> return a tuple from the functions like other pgstatfuncs.c functions
> do.
I suppose it was designed this way because databases may contain
thousands of tables and indexes - remember, at least, partitions. But it
may be okay to use the SRF_FIRSTCALL_INIT / SRF_RETURN_NEXT API. I think
by registering a prosupport routine predicting cost and rows of these
calls, we may let the planner build adequate plans for queries involving
those stats - people will definitely join it with something else in the
database.

--
regards, Andrei Lepikhov