summaryrefslogtreecommitdiff
path: root/src/test/isolation/isolationtester.c
AgeCommit message (Collapse)Author
2018-12-11Raise some timeouts to 180s, in test code.Noah Misch
Slow runs of buildfarm members chipmunk, hornet and mandrill saw the shorter timeouts expire. The 180s timeout in poll_query_until has been trouble-free since 2a0f89cd717ce6d49cdc47850577823682167e87 introduced it two years ago, so use 180s more widely. Back-patch to 9.6, where the first of these timeouts was introduced. Reviewed by Michael Paquier. Discussion: https://postgr.es/m/[email protected]
2018-12-03Add some missing schema qualificationsMichael Paquier
This does not improve the security and reliability of the touched areas, but it makes the style more consistent. Author: Michael Paquier Reviewed-by- Noah Misch Discussion: https://postgr.es/m/[email protected]
2018-11-09Indicate session name in isolationtester noticesAlvaro Herrera
When a session under isolationtester produces printable notices (NOTICE, WARNING) we were just printing them unadorned, which can be confusing when debugging. Prefix them with the session name, which makes things clearer. Author: Álvaro Herrera Reviewed-by: Hari Babu Kommi Discussion: https://postgr.es/m/[email protected]
2018-10-17Fix minor bug in isolationtester.Tom Lane
If the lock wait query failed, isolationtester would report the PQerrorMessage from some other connection, meaning there would be no message or an unrelated one. This seems like a pretty unlikely occurrence, but if it did happen, this bug could make it really difficult/confusing to figure out what happened. That seems to justify patching all the way back. In passing, clean up another place where the "wrong" conn was used for an error report. That one's not actually buggy because it's a different alias for the same connection, but it's still confusing to the reader.
2017-11-28Mark some more functions as pg_attribute_noreturn().Tom Lane
Doing this suppresses Coverity warnings and might allow improved code in some cases. The prospects of that are not so bright as to warrant back-patching, though. Michael Paquier, per Coverity
2017-06-21Phase 3 of pgindent updates.Tom Lane
Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2017-04-10Move isolationtester's is-blocked query into C code for speed.Tom Lane
Commit 4deb41381 modified isolationtester's query to see whether a session is blocked to also check for waits occurring in GetSafeSnapshot. However, it did that in a way that enormously increased the query's runtime under CLOBBER_CACHE_ALWAYS, causing the buildfarm members that use that to run about four times slower than before, and in some cases fail entirely. To fix, push the entire logic into a dedicated backend function. This should actually reduce the CLOBBER_CACHE_ALWAYS runtime from what it was previously, though I've not checked that. In passing, expose a SQL function to check for safe-snapshot blockage, comparable to pg_blocking_pids. This is more or less free given the infrastructure built to solve the other problem, so we might as well. Thomas Munro Discussion: https://postgr.es/m/[email protected]
2017-04-05Add isolation test for SERIALIZABLE READ ONLY DEFERRABLE.Kevin Grittner
This improves code coverage and lays a foundation for testing similar issues in a distributed environment. Author: Thomas Munro <[email protected]> Reviewed-by: Michael Paquier <[email protected]>
2017-02-25Remove useless duplicate inclusions of system header files.Tom Lane
c.h #includes a number of core libc header files, such as <stdio.h>. There's no point in re-including these after having read postgres.h, postgres_fe.h, or c.h; so remove code that did so. While at it, also fix some places that were ignoring our standard pattern of "include postgres[_fe].h, then system header files, then other Postgres header files". While there's not any great magic in doing it that way rather than system headers last, it's silly to have just a few files deviating from the general pattern. (But I didn't attempt to enforce this globally, only in files I was touching anyway.) I'd be the first to say that this is mostly compulsive neatnik-ism, but over time it might save enough compile cycles to be useful.
2016-08-30Fix a bunch of places that called malloc and friends with no NULL check.Tom Lane
Where possible, use palloc or pg_malloc instead; otherwise, insert explicit NULL checks. Generally speaking, these are places where an actual OOM is quite unlikely, either because they're in client programs that don't allocate all that much, or they're very early in process startup so that we'd likely have had a fork() failure instead. Hence, no back-patch, even though this is nominally a bug fix. Michael Paquier, with some adjustments by me Discussion: <CAB7nPqRu07Ot6iht9i9KRfYLpDaF2ZuUv5y_+72uP23ZAGysRg@mail.gmail.com>
2016-04-27Use memmove() not memcpy() to slide some pointers down.Tom Lane
The previous coding here was formally undefined, though it seems to accidentally work on most platforms in the buildfarm. Caught by some OpenBSD platforms in which libc contains an assertion check for overlapping areas passed to memcpy(). Thomas Munro
2016-03-09Handle invalid libpq sockets in more placesPeter Eisentraut
Also, make error messages consistent. From: Michael Paquier <[email protected]>
2016-02-22Create a function to reliably identify which sessions block which others.Tom Lane
This patch introduces "pg_blocking_pids(int) returns int[]", which returns the PIDs of any sessions that are blocking the session with the given PID. Historically people have obtained such information using a self-join on the pg_locks view, but it's unreasonably tedious to do it that way with any modicum of correctness, and the addition of parallel queries has pretty much broken that approach altogether. (Given some more columns in the view than there are today, you could imagine handling parallel-query cases with a 4-way join; but ugh.) The new function has the following behaviors that are painful or impossible to get right via pg_locks: 1. Correctly understands which lock modes block which other ones. 2. In soft-block situations (two processes both waiting for conflicting lock modes), only the one that's in front in the wait queue is reported to block the other. 3. In parallel-query cases, reports all sessions blocking any member of the given PID's lock group, and reports a session by naming its leader process's PID, which will be the pg_backend_pid() value visible to clients. The motivation for doing this right now is mostly to fix the isolation tests. Commit 38f8bdcac4982215beb9f65a19debecaf22fd470 lobotomized isolationtester's is-it-waiting query by removing its ability to recognize nonconflicting lock modes, as a crude workaround for the inability to handle soft-block situations properly. But even without the lock mode tests, the old query was excessively slow, particularly in CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new deadlock-hard test because the deadlock timeout elapses before they can probe the waiting status of all eight sessions. Replacing the pg_locks self-join with use of pg_blocking_pids() is not only much more correct, but a lot faster: I measure it at about 9X faster in a typical dev build with Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds. That should provide enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the test, without having to lengthen deadlock_timeout yet more and thus slow down the test for everyone else.
2016-02-12Revert "isolationtester: don't repeat the is-it-waiting query when retrying ↵Tom Lane
a step." This mostly reverts commit 9c9782f066e0ce5424b8706df2cce147cb78170f. I left in the parts that rearranged removal of completed waiting steps; but the idea of not rechecking a step's blocked-ness isn't working.
2016-02-12isolationtester: don't repeat the is-it-waiting query when retrying a step.Tom Lane
If we're retrying a step, then we already decided it was blocked on a lock, and there's no need to recheck that. The original coding of commit 38f8bdcac4982215beb9f65a19debecaf22fd470 resulted in a large number of is-it-waiting queries when dealing with multiple concurrently-blocked sessions, which is fairly pointless and also results in test failures in CLOBBER_CACHE_ALWAYS builds, where the is-it-waiting query is quite slow. This definition also permits appending pg_sleep() calls to steps where it's needed to control the order of finish of concurrent steps. Before, that did not work nicely because we'd decide that a step performing a sleep was not blocked and hang up waiting for it to finish, rather than noticing the completion of the concurrent step we're supposed to notice first. In passing, revise handling of removal of completed waiting steps to make it a bit less messy.
2016-02-12Re-pgindent isolationtester.c.Tom Lane
Need to do some more hacking on this, and got annoyed that it's not indent clean.
2016-02-12Fix whitespacePeter Eisentraut
2016-02-11Code review for isolationtester changes.Tom Lane
Fix a few oversights in 38f8bdcac4982215beb9f65a19debecaf22fd470: don't leak memory in run_permutation(), remember when we've issued a cancel rather than issuing another one every 10ms, fix some typos in comments.
2016-02-11Modify the isolation tester so that multiple sessions can wait.Robert Haas
This allows testing of deadlock scenarios. Scenarios that would previously have been considered invalid are now simply taken as a scenario in which more than one backend will wait.
2015-08-15Reject isolation test specifications with duplicate step names.Robert Haas
alter-table-1.spec has such a case, so change one instance of step rx1 to rx3 instead.
2014-05-06pgindent run for 9.4Bruce Momjian
This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
2014-02-15Centralize getopt-related declarations in a new header file pg_getopt.h.Tom Lane
We used to have externs for getopt() and its API variables scattered all over the place. Now that we find we're going to need to tweak the variable declarations for Cygwin, it seems like a good idea to have just one place to tweak. In this commit, the variables are declared "#ifndef HAVE_GETOPT_H". That may or may not work everywhere, but we'll soon find out. Andres Freund
2013-12-20isolationtester: Ensure stderr is unbuffered, tooAlvaro Herrera
2013-12-19Make stdout unbufferedAlvaro Herrera
This ensures that all stdout output is flushed immediately, to match stderr. This eliminates the need for fflush(stdout) calls sprinkled all over the place. Per Daniel Wood in message [email protected]
2013-11-18Replace appendPQExpBuffer(..., <constant>) with appendPQExpBufferStrHeikki Linnakangas
Arguably makes the code a bit more readable, and might give a small performance gain. David Rowley
2013-11-08Fix pg_isolation_regress to work outside its build directory.Robert Haas
This makes it possible to, for example, use the isolation tester to test a contrib module. Andres Freund
2013-10-22Replace pg_asprintf() with psprintf().Tom Lane
This eliminates an awkward coding pattern that's also unnecessarily inconsistent with backend coding. psprintf() is now the thing to use everywhere.
2013-10-13Add use of asprintf()Peter Eisentraut
Add asprintf(), pg_asprintf(), and psprintf() to simplify string allocation and composition. Replacement implementations taken from NetBSD. Reviewed-by: Álvaro Herrera <[email protected]> Reviewed-by: Asif Naeem <[email protected]>
2013-10-04isolationtester: Allow tuples to be returned in more placesAlvaro Herrera
Previously, isolationtester would forbid returning tuples in session-specific teardown (but not global teardown), as well as in global setup. Allow these places to return tuples, too.
2013-04-07In isolationtester, retry after EINTR return from select(2).Tom Lane
Per report from Jaime Casanova. Very curious that no one else has seen this failure ... but the code is clearly wrong as-is.
2013-04-03Minor robustness improvements for isolationtester.Tom Lane
Notice and complain about PQcancel() failures. Also, don't dump core if an error PGresult doesn't contain severity and message subfields, as it might not if it was generated by libpq itself. (We have a longstanding TODO item to improve that, but in the meantime isolationtester had better cope.) I tripped across the latter item while investigating a trouble report on buildfarm member spoonbill. As for the former, there's no evidence that PQcancel failure is actually involved in spoonbill's problem, but it still seems like a bad idea to ignore an error return code.
2013-02-28Flush stderr and stdout in isolation tester.Andrew Dunstan
This is a possibly vain attempt to fix a buffering issue observed for some MSVC builds.
2013-01-23isolationtester: add a few fflush(stderr) callsAlvaro Herrera
The lack of them is causing failures in some BF members. Per Andrew Dunstan.
2013-01-23Improve concurrency of foreign key lockingAlvaro Herrera
This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected]
2012-09-05Allow isolation tests to specify multiple setup blocks.Kevin Grittner
Each setup block is run as a single PQexec submission, and some statements such as VACUUM cannot be combined with others in such a block. Backpatch to 9.2. Kevin Grittner and Tom Lane
2012-06-10Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian
commit-fest.
2012-01-28Add simple tests of EvalPlanQual using the isolationtester infrastructure.Tom Lane
Much more could be done here, but at least now we have *some* automated test coverage of that mechanism. In particular this tests the writable-CTE case reported by Phil Sorber. In passing, remove isolationtester's arbitrary restriction on the number of steps in a permutation list. I used this so that a single spec file could be used to run several related test scenarios, but there are other possible reasons to want a step series that's not exactly a permutation. Improve documentation and fix a couple other nits as well.
2012-01-14Detect invalid permutations in isolationtesterAlvaro Herrera
isolationtester is now able to continue running other permutations when it detects that one of them is invalid, which is useful during initial development of spec files. Author: Alexander Shulgin
2012-01-14Avoid NULL pointer dereference in isolationtesterAlvaro Herrera
2012-01-11Validate number of steps specified in permutationAlvaro Herrera
A permutation that specifies more steps than defined causes isolationtester to crash, so avoid that. Using less steps than defined should probably not be a problem, but no spec currently does that.
2011-11-04Unbreak isolationtester on Win32Alvaro Herrera
I broke it in a previous commit because I neglected to install the necessary incantations to have getopt() work on Windows. Per red blots in buildfarm.
2011-11-03Implement a dry-run mode for isolationtesterAlvaro Herrera
This mode prints out the permutations that would be run by the given spec file, in the same format used by the permutation lines in spec files. This helps in building new spec files. Author: Alexander Shulgin, with some tweaks by me
2011-10-25Add debugging aid in isolationtesterAlvaro Herrera
2011-09-27Remove dependency on error ordering in isolation testsAlvaro Herrera
We now report errors reported by the just-unblocked and unblocking transactions identically; this should fix relatively common buildfarm failures reported by animals that are failing the "wrong" session.
2011-09-27Fix typoAlvaro Herrera
2011-08-18Add an SSI regression test that tests all interesting permutations in theHeikki Linnakangas
order of begin, prepare, and commit of three concurrent transactions that have conflicts between them. The test runs for a quite long time, and the expected output file is huge, but this test caught some serious bugs during development, so seems worthwhile to keep. The test uses prepared transactions, so it fails if the server has max_prepared_transactions=0. Because of that, it's marked as "ignore" in the schedule file. Dan Ports
2011-07-19Make isolationtester more robust on locked commandsAlvaro Herrera
Noah Misch diagnosed the buildfarm problems in the isolation tests partly as failure to differentiate backends properly; the old code was using backend IDs, which is not good enough because a new backend might use an already used ID. Use PIDs instead. Also, the code was purposely careless about other concurrent activity, because it isn't expected; and in fact, it doesn't affect the vast majority of the time. However, it can be observed that autovacuum can block tables for long enough to cause sporadic failures. The new code accounts for that by ignoring locks held by processes not explicitly declared in our spec file. Author: Noah Misch
2011-07-12Add support for blocked commands in isolationtesterAlvaro Herrera
This enables us to test that blocking commands (such as foreign keys checks that conflict with some other lock) act as intended. The set of tests that this adds is pretty minimal, but can easily be extended by adding new specs. The intention is that this will serve as a basis for ensuring that further tweaks of locking implementation preserve (or improve) existing behavior. Author: Noah Misch
2011-05-08Fix some portability issues in isolation regression test driver.Tom Lane
Remove random system #includes in favor of using postgres_fe.h. (The alternative to that is letting this module grow its own configuration testing ability...) Also fix the "make clean" target to actually clean things up. Per local testing.
2011-04-10pgindent run before PG 9.1 beta 1.Bruce Momjian