path: root/src/backend/utils/hash
2013-05-29  Bruce Momjian: pgindent run for release 9.3
This is the first run of the Perl-based pgindent script. Also update pgindent instructions.
2013-01-15  Tom Lane: Fix hash_update_hash_key() to handle same-bucket case correctly.
Original coding would corrupt the hashtable if the item being updated was at the end of its bucket chain and the new hash key hashed to that same bucket. Diagnosis and fix by Heikki Linnakangas.
2013-01-14  Tom Lane: Prevent very-low-probability PANIC during PREPARE TRANSACTION.
The code in PostPrepare_Locks supposed that it could reassign locks to the prepared transaction's dummy PGPROC by deleting the PROCLOCK table entries and immediately creating new ones. This was safe when that code was written, but since we invented partitioning of the shared lock table, it's not safe --- another process could steal away the PROCLOCK entry in the short interval when it's on the freelist. Then, if we were otherwise out of shared memory, PostPrepare_Locks would have to PANIC, since it's too late to back out of the PREPARE at that point. Fix by inventing a dynahash.c function to atomically update a hashtable entry's key. (This might possibly have other uses in future.) This is an ancient bug that in principle we ought to back-patch, but the odds of someone hitting it in the field seem really tiny, because (a) the risk window is small, and (b) nobody runs servers with maxed-out lock tables for long, because they'll be getting non-PANIC out-of-memory errors anyway. So fixing it in HEAD seems sufficient, at least until the new code has gotten some testing.
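As a rough illustration of how such a primitive can be used, here is a minimal sketch; the dynahash call is real, but the table, key, and entry types below are invented for the example and are not the actual PROCLOCK structures.

    #include "postgres.h"
    #include "utils/hsearch.h"

    /* Hypothetical key/entry layout, for illustration only. */
    typedef struct DemoKey
    {
        Oid         owner;          /* e.g. identifier of the owning transaction */
        uint32      objectid;
    } DemoKey;

    typedef struct DemoEntry
    {
        DemoKey     key;            /* hash key; must be the first field */
        int         refcount;
    } DemoEntry;

    /*
     * Re-key an existing entry in place.  Because the entry is never unlinked
     * and re-allocated, a concurrent process cannot snatch it off the freelist,
     * and there is no allocation that could fail partway through.
     */
    static void
    reassign_entry(HTAB *demo_hash, DemoEntry *entry, Oid new_owner)
    {
        DemoKey     newkey = entry->key;

        newkey.owner = new_owner;

        if (!hash_update_hash_key(demo_hash, (void *) entry, (void *) &newkey))
            elog(ERROR, "duplicate hash key while re-keying entry");
    }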
2013-01-01  Bruce Momjian: Update copyrights for 2013
Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
2012-12-12  Tom Lane: Add defenses against integer overflow in dynahash numbuckets calculations.
The dynahash code requires the number of buckets in a hash table to fit in an int; but since we calculate the desired hash table size dynamically, there are various scenarios where we might calculate too large a value. The resulting overflow can lead to infinite loops, division-by-zero crashes, etc. I (tgl) had previously installed some defenses against that in commit 299d1716525c659f0e02840e31fbe4dea3, but that covered only one call path. Moreover it worked by limiting the request size to work_mem, but in a 64-bit machine it's possible to set work_mem high enough that the problem appears anyway. So let's fix the problem at the root by installing limits in the dynahash.c functions themselves. Trouble report and patch by Jeff Davis.
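A loose sketch of the kind of guard this implies is shown below; the helper is hypothetical, not the code that was actually committed.

    #include "postgres.h"

    /*
     * Hypothetical illustration: cap a requested element count so that the
     * bucket count derived from it (which dynahash rounds up to a power of 2)
     * still fits in a signed int.
     */
    static long
    clamp_hash_table_size(long nelem)
    {
        const long  max_nbuckets = 1L << 30;    /* largest power of 2 below INT_MAX */

        return Min(nelem, max_nbuckets);
    }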
2012-10-19  Tom Lane: Fix hash_search to avoid corruption of the hash table on out-of-memory.
An out-of-memory error during expand_table() on a palloc-based hash table would leave a partially-initialized entry in the table. This would not be harmful for transient hash tables, since they'd get thrown away anyway at transaction abort. But for long-lived hash tables, such as the relcache hash, this would effectively corrupt the table, leading to crash or other misbehavior later. To fix, rearrange the order of operations so that table enlargement is attempted before we insert a new entry, rather than after adding it to the hash table. Problem discovered by Hitoshi Harada, though this is a bit different from his proposed patch.
2012-02-29  Tom Lane: Move CRC tables to libpgport, and provide them in a separate include file.
This makes it much more convenient to build tools for Postgres that are separately compiled and require a matching CRC implementation. To prevent multiple copies of the CRC polynomial tables being introduced into the postgres binaries, they are now included in the static library libpgport that is mainly meant for replacement system functions. That seems like a bit of a kludge, but there's no better place. This cleans up building of the tools pg_controldata and pg_resetxlog, which previously had to build their own copies of pg_crc.o. In the future, external programs that need access to the CRC tables can include the tables directly from the new header file pg_crc_tables.h. Daniel Farina, reviewed by Abhijit Menon-Sen and Tom Lane
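A sketch of what such an external program might look like follows; the header paths and the pre-9.5 CRC macro names below are assumptions based on that era's tree, so treat the details as illustrative.

    /*
     * Sketch of an external (frontend) program computing a PostgreSQL-compatible
     * CRC.  Header names and macro spellings are assumptions for the 9.2-era
     * tree: including pg_crc_tables.h is what supplies the polynomial table.
     */
    #include "postgres_fe.h"
    #include "utils/pg_crc.h"
    #include "utils/pg_crc_tables.h"

    static pg_crc32
    compute_crc(const char *data, size_t len)
    {
        pg_crc32    crc;

        INIT_CRC32(crc);
        COMP_CRC32(crc, data, len);
        FIN_CRC32(crc);

        return crc;
    }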
2012-01-30  Tom Lane: Assorted comment fixes, mostly just typos, but some obsolete statements.
YAMAMOTO Takashi
2012-01-01  Bruce Momjian: Update copyright notices for year 2012.
2011-09-04  Tom Lane: Clean up the #include mess a little.
walsender.h should depend on xlog.h, not vice versa. (Actually, the inclusion was circular until a couple hours ago, which was even sillier; but Bruce broke it in the expedient rather than logically correct direction.) Because of that poor decision, plus blind application of pgrminclude, we had a situation where half the system was depending on xlog.h to include such unrelated stuff as array.h and guc.h. Clean up the header inclusion, and manually revert a lot of what pgrminclude had done so things build again. This episode reinforces my feeling that pgrminclude should not be run without adult supervision. Inclusion changes in header files in particular need to be reviewed with great care. More generally, it'd be good if we had a clearer notion of module layering to dictate which headers can sanely include which others ... but that's a big task for another day.
2011-09-01  Bruce Momjian: Remove unnecessary #include references, per pgrminclude script.
2011-04-11  Heikki Linnakangas: Fix the size of predicate lock manager's shared memory hash tables at creation.
This way they don't compete with the regular lock manager for the slack shared memory, making the behavior more predictable.
2011-01-01  Bruce Momjian: Stamp copyrights for year 2011.
2010-09-20  Magnus Hagander: Remove cvs keywords from all files.
2010-02-26  Bruce Momjian: pgindent run for 9.0
2010-01-07  Tom Lane: Remove all the special-case code for INT64_IS_BUSTED, per decision that
we're not going to support that anymore. I did keep the 64-bit-CRC-with-32-bit-arithmetic code, since it has a performance excuse to live. It's a bit moot since that's all ifdef'd out, of course.
2010-01-02  Bruce Momjian: Update copyright for the year 2010.
2009-01-01  Bruce Momjian: Update copyright for 2009.
2008-08-25  Alvaro Herrera: Update URL to Ross Williams' paper.
Devrim Gunduz.
2008-03-27  Tom Lane: Reduce the need for frontend programs to include "postgres.h" by refactoring
inclusions in src/include/catalog/*.h files. The main idea here is to push function declarations for src/backend/catalog/*.c files into separate headers, rather than sticking them into the corresponding catalog definition file as has been done in the past. This commit only carries out that idea fully for pg_proc, pg_type and pg_conversion, but that's enough for the moment --- if pg_list.h ever becomes unsafe for frontend code to include, we'll need to work a bit more. Zdenek Kotala
2008-02-19  Peter Eisentraut: Refactor backend makefiles to remove lots of duplicate code
2008-01-01  Bruce Momjian: Update copyrights in source tree to 2008.
2007-11-15  Bruce Momjian: pgindent run for 8.3.
2007-09-11  Tom Lane: Include hash table name in all the internal-error elog messages in
dynahash.c. Sergey Koposov's current open problem shows the possible usefulness of this, and it doesn't add much code.
2007-06-01  Tom Lane: Fix several hash functions that were taking chintzy shortcuts instead of
delivering a well-randomized hash value. I got religion on this after observing that performance of multi-batch hash join degrades terribly if the higher-order bits of hash values aren't random, as indeed was true for say hashes of small integer values. It's now expected and documented that hash functions should use hash_any or some comparable method to ensure that all bits of their output are about equally random. initdb forced because this change invalidates existing hash indexes. For the same reason, this isn't back-patchable; the hash join performance problem will get a band-aid fix in the back branches.
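The documented pattern looks roughly like the sketch below; my_int2_hash is an invented name, not a catalog function.

    #include "postgres.h"
    #include "fmgr.h"
    #include "access/hash.h"

    /*
     * Sketch of the recommended pattern: feed the raw key bytes through
     * hash_any() instead of handing back a value derived trivially from the
     * key, so the high-order bits of the result are as random as the
     * low-order ones.  my_int2_hash is illustrative only.
     */
    Datum
    my_int2_hash(PG_FUNCTION_ARGS)
    {
        int16       key = PG_GETARG_INT16(0);

        /*
         * A chintzy version would just return something like
         * Int32GetDatum((int32) key), leaving the high-order bits nearly
         * constant for small integers.
         */
        return hash_any((const unsigned char *) &key, sizeof(key));
    }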
2007-04-26  Tom Lane: Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan
is in progress on the same hashtable. This seems the least invasive way to fix the recently-recognized problem that a split could cause the scan to visit entries twice or (with much lower probability) miss them entirely. The only field-reported problem caused by this is the "failed to re-find shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky, which was caused by multiply visited entries. However, it seems certain that mdsync() is vulnerable to missing required fsync's due to missed entries, and I am fearful that RelationCacheInitializePhase2() might be at risk as well. Because of that and the generalized hazard presented by this bug, back-patch all the supported branches. Along the way, fix pg_prepared_statement() and pg_cursor() to not assume that the hashtables they are examining will stay static between calls. This is risky regardless of the newly noted dynahash problem, because hash_seq_search() has never promised to cope with deletion of table entries other than the just-returned one. There may be no bug here because the only supported way to call these functions is via ExecMakeTableFunctionResult() which will cycle them to completion before doing anything very interesting, but it seems best to get rid of the assumption. This affects 8.2 and HEAD only, since those functions weren't there earlier.
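For reference, the supported scan-and-delete pattern looks like the sketch below, where DemoEntry, demo_hash, and prune_obsolete are placeholder names; deleting any entry other than the one just returned remains unsupported.

    #include "postgres.h"
    #include "utils/hsearch.h"

    /* Placeholder entry type for illustration. */
    typedef struct DemoEntry
    {
        Oid         key;            /* hash key; must be first */
        bool        obsolete;
    } DemoEntry;

    /*
     * The supported scan pattern: hash_seq_search() copes with deletion of the
     * entry it just returned, and (per this fix) bucket splits are suppressed
     * while the scan is open, but removing other entries mid-scan is still
     * not promised to work.
     */
    static void
    prune_obsolete(HTAB *demo_hash)
    {
        HASH_SEQ_STATUS status;
        DemoEntry  *entry;

        hash_seq_init(&status, demo_hash);
        while ((entry = (DemoEntry *) hash_seq_search(&status)) != NULL)
        {
            if (entry->obsolete)
            {
                if (hash_search(demo_hash, (void *) &entry->key,
                                HASH_REMOVE, NULL) == NULL)
                    elog(ERROR, "hash table corrupted");
            }
        }
        /* scan ran to completion, so no hash_seq_term() is needed */
    }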
2007-01-20  Peter Eisentraut: Remove remains of old depend target.
2007-01-05  Bruce Momjian: Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2006-10-04  Bruce Momjian: pgindent run for 8.2.
2006-09-27  Tom Lane: Replace strncpy with strlcpy in selected places that seem possibly relevant
to performance. (A wholesale effort to get rid of strncpy should be undertaken sometime, but not during beta.) This commit also fixes dynahash.c to correctly truncate overlength string keys for hashtables, so that its callers don't have to anymore.
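A small stand-alone comparison of the two functions, for illustration only; strlcpy() is a BSD extension, supplied by src/port on platforms whose libc lacks it.

    #include <stdio.h>
    #include <string.h>

    /* Declare strlcpy in case the platform's libc does not (PostgreSQL's
     * fallback implementation lives in src/port/strlcpy.c). */
    extern size_t strlcpy(char *dst, const char *src, size_t siz);

    int
    main(void)
    {
        char        a[8];
        char        b[8];
        const char *src = "dynahash";        /* 8 chars + NUL: too long for 8 */

        strncpy(a, src, sizeof(a));          /* a is NOT NUL-terminated here */
        a[sizeof(a) - 1] = '\0';             /* manual fix-up required */

        /* strlcpy always terminates and reports the length it tried to copy */
        if (strlcpy(b, src, sizeof(b)) >= sizeof(b))
            printf("strlcpy reported truncation\n");

        printf("a=\"%s\" b=\"%s\"\n", a, b);
        return 0;
    }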
2006-08-14  Tom Lane: Remove hash_destroy calls in hash_create's failure paths. As noted by
a Coverity warning, these are risky since the hashtable isn't necessarily fully set up yet. They're unnecessary anyway: a deletable hashtable should be in a memory context that will be cleared following elog(ERROR). Per report from Martijn van Oosterhout.
2006-07-22  Tom Lane: Add support to dynahash.c for partitioning shared hashtables according
to the low-order bits of the entry hash value. Also make some incidental cleanups in the dynahash API, such as not exporting the hash header structs to the world.
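The usage pattern this enables looks roughly like the sketch below; the dynahash calls are real, while the lock array, table, tag type, and function name are hypothetical, and the LWLockId-based locking reflects that era's API.

    #include "postgres.h"
    #include "storage/lwlock.h"
    #include "utils/hsearch.h"

    #define NUM_DEMO_PARTITIONS  16          /* must be a power of 2 */

    /* Hypothetical names for illustration; not actual backend symbols. */
    extern HTAB *DemoSharedHash;             /* created with HASH_PARTITION */
    extern LWLockId DemoPartitionLocks[NUM_DEMO_PARTITIONS];

    typedef struct DemoTag
    {
        Oid         dbid;
        Oid         relid;
    } DemoTag;

    /*
     * Look up (or create) an entry while holding only the lock covering the
     * entry's partition, which is selected by the low-order bits of the hash
     * value.  Computing the hash once and reusing it through
     * hash_search_with_hash_value avoids hashing the key twice.
     */
    static void *
    demo_lookup(const DemoTag *tag, bool *found)
    {
        uint32      hashcode = get_hash_value(DemoSharedHash, (const void *) tag);
        int         partition = hashcode % NUM_DEMO_PARTITIONS;
        void       *entry;

        LWLockAcquire(DemoPartitionLocks[partition], LW_EXCLUSIVE);
        entry = hash_search_with_hash_value(DemoSharedHash, (const void *) tag,
                                            hashcode, HASH_ENTER_NULL, found);
        LWLockRelease(DemoPartitionLocks[partition]);

        return entry;               /* NULL means out of shared memory */
    }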
2006-07-14  Bruce Momjian: Remove 576 references of include files that were not needed.
2006-06-25  Tom Lane: Tweak dynahash.c to avoid wasting memory space in non-shared hash tables.
palloc() will normally round allocation requests up to the next power of 2, so make dynahash choose allocation sizes that are as close to a power of 2 as possible. Back-patch to 8.1 --- the problem exists further back, but a much larger patch would be needed and it doesn't seem worth taking any risks.
2006-03-05  Bruce Momjian: Update copyright for 2006. Update scripts.
2005-11-22  Bruce Momjian: Re-run pgindent, fixing a problem where comment lines after a blank
comment line were output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.
2005-10-15  Bruce Momjian: Standard pgindent run for 8.1.
2005-08-20  Tom Lane: Convert the arithmetic for shared memory size calculation from 'int'
to 'Size' (that is, size_t), and install overflow detection checks in it. This allows us to remove the former arbitrary restrictions on NBuffers etc. It won't make any difference in a 32-bit machine, but in a 64-bit machine you could theoretically have terabytes of shared buffers. (How efficiently we could manage 'em remains to be seen.) Similarly, num_temp_buffers, work_mem, and maintenance_work_mem can be set above 2Gb on a 64-bit machine. Original patch from Koichi Suzuki, additional work by moi.
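The overflow-checked idiom looks roughly like the sketch below; add_size, mul_size, and hash_estimate_size are real, while DemoEntry and DemoShmemSize are invented for the example.

    #include "postgres.h"
    #include "storage/shmem.h"
    #include "utils/hsearch.h"

    /* Illustrative entry type; the element count is supplied by the caller. */
    typedef struct DemoEntry
    {
        Oid         key;
        int64       counter;
    } DemoEntry;

    /*
     * Illustrative shared-memory sizing routine in the style of the usual
     * *ShmemSize() functions.  add_size()/mul_size() operate on Size and
     * error out on overflow instead of silently wrapping, which is what lets
     * the sizing GUCs exceed 2 GB safely on 64-bit builds.
     */
    static Size
    DemoShmemSize(long num_entries)
    {
        Size        size = 0;

        size = add_size(size, mul_size(num_entries, sizeof(DemoEntry)));
        size = add_size(size, hash_estimate_size(num_entries, sizeof(DemoEntry)));

        return size;
    }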
2005-06-26  Tom Lane: Tweak dynahash.c to not allocate so many entries at once when dealing
with a table that has a small predicted size. Avoids wasting several hundred K on the timezone hash table, which is likely to have only one or a few entries, but the entries use up 10Kb apiece ...
2005-06-18  Tom Lane: When using C-string lookup keys in a dynahash.c hash table, use strncpy()
not memcpy() to copy the offered key into the hash table during HASH_ENTER. This avoids possible core dump if the passed key is located very near the end of memory. Per report from Stefan Kaltenbrunner.
2005-06-08  Tom Lane: Marginal hack to avoid spending a lot of time in find_join_rel during
large planning problems: when the list of join rels gets too long, make an auxiliary hash table that hashes on the identifying Bitmapset.
2005-06-02  Tom Lane: Change CRCs in WAL records from 64bit to 32bit for performance reasons.
Instead of a separate CRC on each backup block, include backup blocks in their parent WAL record's CRC; this is important to ensure that the backup block really goes with the WAL record, ie there was not a page tear right at the start of the backup block. Implement a simple form of compression of backup blocks: drop any run of zeroes starting at pd_lower, so as not to store the unused 'hole' that commonly exists in PG heap and index pages. Tweak PageRepairFragmentation and related routines to ensure they keep the unused space zeroed, so that the above compression method remains effective. All per recent discussions.
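A simplified sketch of the hole-compression idea follows; the header struct here is an invented stand-in for the real backup-block header, and the function is not the actual XLOG code.

    #include "postgres.h"
    #include "storage/bufpage.h"

    /* Hypothetical stand-in for the backup-block header, illustration only. */
    typedef struct BkpBlockSketch
    {
        uint16      hole_offset;    /* number of bytes before the hole */
        uint16      hole_length;    /* number of bytes in the hole */
    } BkpBlockSketch;

    /*
     * Copy a page into dest while omitting the unused gap between pd_lower
     * and pd_upper, recording where the gap was so it can be re-zeroed on
     * restore.  If the header looks untrustworthy, store the whole page.
     */
    static void
    compress_backup_block(Page page, BkpBlockSketch *bkpb,
                          char *dest, uint32 *len)
    {
        PageHeader  hdr = (PageHeader) page;
        uint16      lower = hdr->pd_lower;
        uint16      upper = hdr->pd_upper;

        if (lower >= SizeOfPageHeaderData && upper > lower && upper <= BLCKSZ)
        {
            bkpb->hole_offset = lower;
            bkpb->hole_length = upper - lower;
            memcpy(dest, page, lower);
            memcpy(dest + lower, page + upper, BLCKSZ - upper);
            *len = BLCKSZ - bkpb->hole_length;
        }
        else
        {
            bkpb->hole_offset = 0;
            bkpb->hole_length = 0;
            memcpy(dest, page, BLCKSZ);
            *len = BLCKSZ;
        }
    }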
2005-05-29  Tom Lane: Modify hash_search() API to prevent future occurrences of the error
spotted by Qingqing Zhou. The HASH_ENTER action now automatically fails with elog(ERROR) on out-of-memory --- which incidentally lets us eliminate duplicate error checks in quite a bunch of places. If you really need the old return-NULL-on-out-of-memory behavior, you can ask for HASH_ENTER_NULL. But there is now an Assert in that path checking that you aren't hoping to get that behavior in a palloc-based hash table. Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions, which were not being used anywhere anymore, and were surely too ugly and unsafe to want to see revived again.
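In practice, callers can now be written like the sketch below; DemoEntry, demo_hash, and bump_count are placeholder names.

    #include "postgres.h"
    #include "utils/hsearch.h"

    typedef struct DemoEntry
    {
        Oid         key;            /* hash key; must be first */
        int         count;
    } DemoEntry;

    /*
     * With HASH_ENTER, out-of-memory is reported by elog(ERROR) inside
     * hash_search(), so the duplicated NULL checks that used to follow every
     * insertion are gone.  Code that genuinely wants to handle the failure
     * itself (e.g. for a shared-memory table) asks for HASH_ENTER_NULL.
     */
    static void
    bump_count(HTAB *demo_hash, Oid key)
    {
        bool        found;
        DemoEntry  *entry;

        entry = (DemoEntry *) hash_search(demo_hash, (void *) &key,
                                          HASH_ENTER, &found);
        if (!found)
            entry->count = 0;       /* initialize the freshly created entry */
        entry->count++;
    }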
2005-05-16  Tom Lane: Adjust out-of-date comment.
2005-05-06  Tom Lane: Marginal performance improvements in dynahash: make sure that everything
associated with a hashtable is allocated in that hashtable's private context, so that hash_destroy only has to destroy the context and not do any retail pfree's; and tighten the inner loop of hash_seq_search.
2005-04-14  Tom Lane: Marginal hack to use a specialized hash function for dynahash hashtables
whose keys are OIDs. The only one that looks particularly performance critical is the relcache hashtable, but as long as we've got the function we may as well use it wherever it's applicable.
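Creating such a table would look roughly like the sketch below; the table name and entry type are invented, and in later releases this idiom was folded into the HASH_BLOBS flag.

    #include "postgres.h"
    #include "utils/hsearch.h"

    typedef struct DemoRelEntry
    {
        Oid         reloid;         /* hash key; must be first */
        void       *payload;
    } DemoRelEntry;

    /*
     * Create a hashtable keyed by OID using the specialized oid_hash function
     * provided by this era's hsearch.h.  Names are illustrative.
     */
    static HTAB *
    create_demo_oid_hash(void)
    {
        HASHCTL     ctl;

        MemSet(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(Oid);
        ctl.entrysize = sizeof(DemoRelEntry);
        ctl.hash = oid_hash;

        return hash_create("Demo OID lookup table", 128, &ctl,
                           HASH_ELEM | HASH_FUNCTION);
    }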
2004-12-31  PostgreSQL Daemon: Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...
2004-11-21  Tom Lane: Fix rounding problem in dynahash.c's decision about when the target
fill factor has been exceeded. We usually run with ffactor == 1, but the way the test was coded, it wouldn't split a bucket until the actual fill factor reached 2.0, because of use of integer division. Change from > to >= so that it will split more aggressively when the table starts to get full.
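A tiny stand-alone demonstration of the rounding effect, with made-up numbers rather than the actual dynahash fields:

    #include <stdbool.h>
    #include <stdio.h>

    /*
     * With integer division, nentries / nbuckets stays at 1 until the table
     * holds twice as many entries as buckets, so the old ">" test delayed
     * splits far past the intended fill factor of 1.
     */
    int
    main(void)
    {
        long    nentries = 1999;    /* entries currently in the table */
        long    nbuckets = 1000;    /* number of buckets */
        long    ffactor = 1;        /* target fill factor */

        bool    old_split = (nentries / nbuckets) > ffactor;    /* 1 > 1  -> false */
        bool    new_split = (nentries / nbuckets) >= ffactor;   /* 1 >= 1 -> true  */

        printf("old test splits: %d, new test splits: %d\n", old_split, new_split);
        return 0;
    }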
2004-10-25  Neil Conway: Modify hash_create() to elog(ERROR) if an error occurs, rather than
returning a NULL pointer (some callers remembered to check the return value, but some did not -- it is safer to just bail out). Also, cleanup pgstat.c to use elog(ERROR) rather than elog(LOG) followed by exit().
2004-10-22  Neil Conway: Minor code cleanup: hdefault() only ever returned "true", so it may as
well be declared to return "void" to save callers the trouble of checking for errors.