Return DSA area for hash table from GetNamedDSHash()

Lists: pgsql-hackers
From: Sami Imseih <samimseih(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Return DSA area for hash table from GetNamedDSHash()
Date: 2026-04-06 22:56:21
Message-ID: CAA5RZ0tKfCVqFnMZtavM42H63ha2Haf_C4mbJNWqkaW30cPW1w@mail.gmail.com

Hi,

While working on extending tests for dshash.c [1], I realized that a
user that creates a hash table with GetNamedDSHash() has no way
to cap the size of the dsa area underpinning the table by using
dsa_set_size_limit(). This is because the dsa_area created using
this API is not exposed to the user.

This is a gap for users of the GetNamedDSHash() API,
because it's very likely that the callers don't want runaway growth of
these hash tables.

Attached is a new API, dshash_get_dsa_area(), that takes in a dshash_table
and returns the area. The caller can then use dsa_set_size_limit() to limit
the size.
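
For illustration only, the intended call pattern can be modelled in a
self-contained way. This is a minimal sketch, not the attached patch: the
struct definitions and the size_limit field are hypothetical stand-ins (the
real dsa_area and dshash_table are opaque PostgreSQL types), the body of
dsa_set_size_limit() is a stub with the same shape as the real function, and
cap_hash_table() is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types; the fields are hypothetical and only model the idea. */
typedef struct dsa_area
{
	size_t		size_limit;		/* hypothetical: cap on total area size */
} dsa_area;

typedef struct dshash_table
{
	dsa_area   *area;			/* backing area the table allocates from */
} dshash_table;

/* Stub with the same shape as the real dsa_set_size_limit(). */
static void
dsa_set_size_limit(dsa_area *area, size_t limit)
{
	area->size_limit = limit;
}

/* Models the proposed accessor: hand back the table's backing area. */
static dsa_area *
dshash_get_dsa_area(dshash_table *hash_table)
{
	assert(hash_table->area != NULL);
	return hash_table->area;
}

/*
 * Hypothetical caller-side helper: cap the area behind a hash table
 * (for example one obtained from GetNamedDSHash()) and return the
 * limit now in effect.
 */
static size_t
cap_hash_table(dshash_table *hash_table, size_t limit)
{
	dsa_area   *area = dshash_get_dsa_area(hash_table);

	dsa_set_size_limit(area, limit);
	return area->size_limit;
}
```

Because the caller holds the area pointer, the limit can be changed again
later without any change to how the table was created.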

We could change the GetNamedDSHash() API to take in a size, but that
will not be ideal since a caller may want to change the size dynamically after
the hash table is created.

I don't have a patch for this yet, but I think it would also make sense for
pg_dsm_registry_allocations to show the max_size:

postgres=# select * from pg_dsm_registry_allocations;
          name          |  type   |  size
------------------------+---------+---------
 test_dsm_registry_dsa  | area    | 1048576
 test_dsm_registry_hash | hash    | 1048576
 test_dsm_registry_dsm  | segment |      20
(3 rows)

Thoughts?

[1] https://www.postgresql.org/message-id/acXCJODjsCytdpwT%40paquier.xyz

--
Sami Imseih
Amazon Web Services (AWS)

Attachment Content-Type Size
v1-0001-Add-function-to-return-DSA-area-for-a-dshash-tabl.patch application/octet-stream 1.6 KB

From: jie wang <jugierwang(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Return DSA area for hash table from GetNamedDSHash()
Date: 2026-04-07 03:59:14
Message-ID: CAJnZyeCTvcAQshs8BHSDTAZ6JJgo619sFgWmdKHKf_UawgdeYA@mail.gmail.com

On Tue, Apr 7, 2026 at 06:56, Sami Imseih <samimseih(at)gmail(dot)com> wrote:

> Hi,
>
> While working on extending tests for dshash.c [1], I realized that a
> user that creates a hash table with GetNamedDSHash() has no way
> to cap the size of the dsa area underpinning the table by using
> dsa_set_size_limit(). This is because the dsa_area created using
> this API is not exposed to the user.
>
> This is a gap for users of the GetNamedDSHash() API,
> because it's very likely that the callers don't want runaway growth of
> these hash tables.
>
> Attached is a new API, dshash_get_dsa_area() that takes in a dshash_table
> and returns the area. The caller can then use dsa_set_size_limit() to limit
> the size.
>
> We could change the GetNamedDSHash() API to take in a size, but that
> will not be ideal since a caller may want to change the size dynamically
> after the hash table is created.
>
> I don't have a patch for this yet, but I also think it will make sense for
> pg_dsm_registry_allocations to also show the max_size
>
> postgres=# select * from pg_dsm_registry_allocations;
>           name          |  type   |  size
> ------------------------+---------+---------
>  test_dsm_registry_dsa  | area    | 1048576
>  test_dsm_registry_hash | hash    | 1048576
>  test_dsm_registry_dsm  | segment |      20
> (3 rows)
>
> Thoughts?
>
>
> [1] https://www.postgresql.org/message-id/acXCJODjsCytdpwT%40paquier.xyz
>
> --
> Sami Imseih
> Amazon Web Services (AWS)
>

Hi,

I think an assert check could be added in this patch for better safety.
Assert(hash_table != NULL);

Best regards,
--
wang jie


From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Return DSA area for hash table from GetNamedDSHash()
Date: 2026-04-08 01:08:46
Message-ID: adWqnsjOIAWUCLLz@paquier.xyz

On Mon, Apr 06, 2026 at 05:56:21PM -0500, Sami Imseih wrote:
> Attached is a new API, dshash_get_dsa_area() that takes in a dshash_table
> and returns the area. The caller can then use dsa_set_size_limit() to limit
> the size.

+dsa_area *
+dshash_get_dsa_area(dshash_table *hash_table)
+{
+	Assert(hash_table->control->magic == DSHASH_MAGIC);
+
+	return hash_table->area;

Rather than an API that returns the DSA area, perhaps it would be more
natural to have a wrapper that calls dsa_set_size_limit(), using an
existing dshash_table in input?
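
As a rough illustration of that alternative, such a wrapper could look like
the sketch below. The name dshash_set_size_limit() is hypothetical and does
not come from the patch, and the struct definitions and the stub body of
dsa_set_size_limit() are stand-ins so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types; the fields are hypothetical and only model the idea. */
typedef struct dsa_area
{
	size_t		size_limit;		/* hypothetical: cap on total area size */
} dsa_area;

typedef struct dshash_table
{
	dsa_area   *area;			/* backing area the table allocates from */
} dshash_table;

/* Stub with the same shape as the real dsa_set_size_limit(). */
static void
dsa_set_size_limit(dsa_area *area, size_t limit)
{
	area->size_limit = limit;
}

/*
 * Hypothetical wrapper: set the limit through the table, so the
 * dsa_area pointer itself is never handed out to the caller.
 */
static void
dshash_set_size_limit(dshash_table *hash_table, size_t limit)
{
	assert(hash_table->area != NULL);
	dsa_set_size_limit(hash_table->area, limit);
}
```

The trade-off is that the area stays encapsulated, but every other dsa_*
operation a caller might want would need its own wrapper.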
--
Michael


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, jugierwang(at)gmail(dot)com
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Return DSA area for hash table from GetNamedDSHash()
Date: 2026-04-09 21:11:48
Message-ID: CAA5RZ0se3N-uQHi1L_32C1HM-ZW4T_D3vw5eCf0vDO2tHwB=sg@mail.gmail.com

Thanks for the replies!

> I think an assert check could be added in this patch for better safety.
> Assert(hash_table != NULL);
>

I followed the same approach we take for dshash_destroy() and
dshash_get_hash_table_handle(). The caller is responsible for
not passing in a NULL hash table; otherwise the Assert itself will
segfault on the dereference.

> +dsa_area *
> +dshash_get_dsa_area(dshash_table *hash_table)
> +{
> +	Assert(hash_table->control->magic == DSHASH_MAGIC);
> +
> +	return hash_table->area;
>
> Rather than an API that returns the DSA area, perhaps it would be more
> natural to have a wrapper that calls dsa_set_size_limit(), using an
> existing dshash_table in input?

Hmm, having GetNamedDSA() return a dsa_area for direct use while requiring
a special wrapper for the dshash case would create an inconsistent API in
dsm_registry.h. With dshash_get_dsa_area(), however the dsa_area is
obtained, dsa_set_size_limit() can be used to set the size limit.

--
Sami