postgresql.git - This is the main PostgreSQL git repository.

Age	Commit message	Author
2017-01-18	Change some test macros to return true booleans	Alvaro Herrera
	These macros work fine when they are used directly in an "if" test or similar, but as soon as the return values are assigned to boolean variables (or passed as boolean arguments to some function), they become bugs, hopefully caught by compiler warnings. To avoid future problems, fix the definitions so that they return actual booleans. To further minimize the risk that somebody uses them in back-patched fixes that only work correctly in branches starting from the current master and not in old ones, back-patch the change to supported branches as appropriate. See also commit af4472bcb88ab36b9abbe7fd5858e570a65a2d1a, and the long discussion (and larger patch) in the thread mentioned in its commit message. Discussion: https://postgr.es/m/[email protected]
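The hazard described above can be sketched with a toy model in Python rather than the actual C macros. It assumes a one-byte `bool` type; `FLAG_BIT`, `as_char_bool`, `flag_raw`, and `flag_bool` are hypothetical names, not PostgreSQL symbols.

```python
# Toy model of a bit-test macro whose result is stored in a one-byte bool.
# FLAG_BIT and all function names are hypothetical, not PostgreSQL symbols.

FLAG_BIT = 0x0100  # a flag bit that lives above the low byte


def as_char_bool(value):
    """Simulate assignment to a char-wide bool: only the low 8 bits survive."""
    return int(value) & 0xFF


def flag_raw(mask):
    """Old-style macro: returns the masked bits, not 0 or 1."""
    return mask & FLAG_BIT


def flag_bool(mask):
    """Fixed macro: compares against zero, so the result is a true boolean."""
    return (mask & FLAG_BIT) != 0
```

Used directly in a condition, `flag_raw(0x0100)` is truthy, but once it passes through `as_char_bool` the set bit is lost; `flag_bool` survives the same assignment.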
2017-01-18	Implement array version of jsonb_delete and operator	Magnus Hagander
	This makes it possible to delete multiple keys from a jsonb value by passing in an array of text values, which makes the operation much faster than individually deleting the keys (which would require copying the jsonb structure over and over again). Reviewed by Dmitry Dolgov and Michael Paquier.
2017-01-18	Improve RLS planning by marking individual quals with security levels.	Tom Lane
	In an RLS query, we must ensure that security filter quals are evaluated before ordinary query quals, in case the latter contain "leaky" functions that could expose the contents of sensitive rows. The original implementation of RLS planning ensured this by pushing the scan of a secured table into a sub-query that it marked as a security-barrier view. Unfortunately this results in very inefficient plans in many cases, because the sub-query cannot be flattened and gets planned independently of the rest of the query. To fix, drop the use of sub-queries to enforce RLS qual order, and instead mark each qual (RestrictInfo) with a security_level field establishing its priority for evaluation. Quals must be evaluated in security_level order, except that "leakproof" quals can be allowed to go ahead of quals of lower security_level, if it's helpful to do so. This has to be enforced within the ordering of any one list of quals to be evaluated at a table scan node, and we also have to ensure that quals are not chosen for early evaluation (i.e., use as an index qual or TID scan qual) if they're not allowed to go ahead of other quals at the scan node. This is sufficient to fix the problem for RLS quals, since we only support RLS policies on simple tables and thus RLS quals will always exist at the table scan level only. Eventually these qual ordering rules should be enforced for join quals as well, which would permit improving planning for explicit security-barrier views; but that's a task for another patch. Note that FDWs would need to be aware of these rules --- and not, for example, send an insecure qual for remote execution --- but since we do not yet allow RLS policies on foreign tables, the case doesn't arise. This will need to be addressed before we can allow such policies. Patch by me, reviewed by Stephen Frost and Dean Rasheed. Discussion: https://postgr.es/m/[email protected]
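The ordering rule in this message can be sketched as a toy model in Python; this is not the planner's code. The `Qual` class and `order_quals` function are hypothetical, and the model makes one simplifying assumption: any leakproof qual is treated as if it had security level 0, which is one way to let it run ahead of quals with a lower nominal level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Qual:
    name: str
    security_level: int  # lower values must be evaluated first
    leakproof: bool
    cost: float


def order_quals(quals):
    """Order quals by security level, then by cost.

    Simplifying assumption (not taken from the commit): a leakproof qual is
    treated as level 0, so it may precede quals of lower nominal level.
    """
    def sort_key(q):
        level = 0 if q.leakproof else q.security_level
        return (level, q.cost)

    return sorted(quals, key=sort_key)
```

With a cheap leakproof user qual, an RLS filter at level 0, and a leaky user function at level 1, the leakproof qual may run first, while the leaky function always runs after the RLS filter.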
2017-01-18	Add function to import operating system collations	Peter Eisentraut
	Move this logic out of initdb into a user-callable function. This simplifies the code and makes it possible to update the standard collations later on if additional operating system collations appear. Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Euler Taveira <[email protected]>
2017-01-17	Correct include file path	Peter Eisentraut
	Mistake in 352a24a1f9d6f7d4abb1175bfd22acc358f43140, not clear why it worked for some before.
2017-01-17	Generate fmgr prototypes automatically	Peter Eisentraut
	Gen_fmgrtab.pl creates a new file fmgrprotos.h, which contains prototypes for all functions registered in pg_proc.h. This avoids having to manually maintain these prototypes across a random variety of header files. It also automatically enforces a correct function signature, and since there are warnings about missing prototypes, it will detect functions that are defined but not registered in pg_proc.h (or otherwise used). Reviewed-by: Pavel Stehule <[email protected]>
2017-01-17	Register missing money operators in system catalogs	Peter Eisentraut
	The operators money*int8, int8*money, and money/int8 were implemented in code but not registered in pg_operator or pg_proc. Reviewed-by: Pavel Stehule <[email protected]>
2017-01-17	Rename C symbols for backend lo_ functions	Peter Eisentraut
	Rename the C symbols for lo_* to be_lo_*, so they don't conflict with libpq prototypes. Reviewed-by: Pavel Stehule <[email protected]>
2017-01-17	Remove unnecessary include	Peter Eisentraut
	Between 6eeb95f0f56bb5e8a0a9328aeec04c9e6de87272 and 7b1c2a0f2066672b24f6257ec9b8d78a1754f494, builtins.h contained additional prototypes that have now been moved elsewhere, so we don't need to include nodes/parsenodes.h anymore. Fix some files that were relying on builtins.h implicitly pulling in some unrelated stuff they needed. Reviewed-by: Pavel Stehule <[email protected]>
2017-01-16	Fix check_srf_call_placement() to handle VALUES cases correctly.	Tom Lane
	INSERT ... VALUES with a single VALUES row is implemented quite differently from the general VALUES case. A user-visible implication of that is that we accept SRFs in the single-row case, but not in the multi-row case. That's a historical artifact no doubt, but in view of the lack of field complaints, I'm not excited about fixing it right now. However, check_srf_call_placement() needs to know about this, first because it should throw an error in the unsupported case, and second because it should set p_hasTargetSRFs in the single-row case (because we treat that like a SELECT tlist). That's an oversight in commit a4c35ea1c. To fix, split EXPR_KIND_VALUES into two values. So far as I can see, this is the only place where we need to distinguish the two cases at present; but there might be more later. Patch by me, per report from Andres Freund. Discussion: https://postgr.es/m/[email protected]
2017-01-15	Fix matching of boolean index columns to sort ordering.	Tom Lane
	Normally, if we have a WHERE clause like "indexcol = constant", the planner will figure out that that index column can be ignored when determining whether the index has a desired sort ordering. But this failed to work for boolean index columns, because a condition like "boolcol = true" is canonicalized to just "boolcol" which does not give rise to an EquivalenceClass. Add a check to allow the same type of deduction to be made in this case too. Per a complaint from Dima Pavlov. Arguably this is a bug, but given the limited impact and the small number of complaints so far, I won't risk destabilizing plans in stable branches by back-patching. Patch by me, reviewed by Michael Paquier. Discussion: https://postgr.es/m/[email protected]
2017-01-14	Change representation of statement lists, and add statement location info.	Tom Lane
	This patch makes several changes that improve the consistency of representation of lists of statements. It's always been the case that the output of parse analysis is a list of Query nodes, whatever the types of the individual statements in the list. This patch brings similar consistency to the outputs of raw parsing and planning steps: * The output of raw parsing is now always a list of RawStmt nodes; the statement-type-dependent nodes are one level down from that. * The output of pg_plan_queries() is now always a list of PlannedStmt nodes, even for utility statements. In the case of a utility statement, "planning" just consists of wrapping a CMD_UTILITY PlannedStmt around the utility node. This list representation is now used in Portal and CachedPlan plan lists, replacing the former convention of intermixing PlannedStmts with bare utility-statement nodes. Now, every list of statements has a consistent head-node type depending on how far along it is in processing. This allows changing many places that formerly used generic "Node *" pointers to use a more specific pointer type, thus reducing the number of IsA() tests and casts needed, as well as improving code clarity. Also, the post-parse-analysis representation of DECLARE CURSOR is changed so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained SELECT remains a child of the DeclareCursorStmt rather than getting flipped around to be the other way. It's now true for both Query and PlannedStmt that utilityStmt is non-null if and only if commandType is CMD_UTILITY. That allows simplifying a lot of places that were testing both fields. (I think some of those were just defensive programming, but in many places, it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.) 
Because PlannedStmt carries a canSetTag field, we're also able to get rid of some ad-hoc rules about how to reconstruct canSetTag for a bare utility statement; specifically, the assumption that a utility is canSetTag if and only if it's the only one in its list. While I see no near-term need for relaxing that restriction, it's nice to get rid of the ad-hocery. The API of ProcessUtility() is changed so that what it's passed is the wrapper PlannedStmt not just the bare utility statement. This will affect all users of ProcessUtility_hook, but the changes are pretty trivial; see the affected contrib modules for examples of the minimum change needed. (Most compilers should give pointer-type-mismatch warnings for uncorrected code.) There's also a change in the API of ExplainOneQuery_hook, to pass through cursorOptions instead of expecting hook functions to know what to pick. This is needed because of the DECLARE CURSOR changes, but really should have been done in 9.6; it's unlikely that any extant hook functions know about using CURSOR_OPT_PARALLEL_OK. Finally, teach gram.y to save statement boundary locations in RawStmt nodes, and pass those through to Query and PlannedStmt nodes. This allows more intelligent handling of cases where a source query string contains multiple statements. This patch doesn't actually do anything with the information, but a follow-on patch will. (Passing this information through cleanly is the true motivation for these changes; while I think this is all good cleanup, it's unlikely we'd have bothered without this end goal.) catversion bump because addition of location fields to struct Query affects stored rules. This patch is by me, but it owes a good deal to Fabien Coelho who did a lot of preliminary work on the problem, and also reviewed the patch. Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-13	Fix a bug in how we generate partition constraints.	Robert Haas
	Move the code for doing parent attnos to child attnos mapping for Vars in partition constraint expressions to a separate function map_partition_varattnos() and call it from the appropriate places. Doing it in get_qual_from_partbound(), as is now, would produce a wrong result in certain multi-level partitioning cases, because it only considers the current pair of parent-child relations. In certain multi-level partitioning cases, attnums for the same key attribute(s) might differ between various levels, causing the same attribute to be numbered differently in different instances of the Var corresponding to a given attribute. With this commit, in generate_partition_qual(), we first generate the whole partition constraint (considering all levels of partitioning) and then do the mapping, so that Vars in the final expression are numbered according to the leaf relation (to which it is supposed to apply). Amit Langote, reviewed by me.
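The attribute-number problem can be sketched with a toy model in Python that matches columns by name. It is not the PostgreSQL implementation: `build_attno_map` and `map_varattnos` here are hypothetical helpers, and a Var is reduced to a bare 1-based attribute number.

```python
def build_attno_map(parent_cols, child_cols):
    """Map 1-based attribute numbers in the parent to the child's, by column name."""
    child_pos = {name: i + 1 for i, name in enumerate(child_cols)}
    return {i + 1: child_pos[name] for i, name in enumerate(parent_cols)}


def map_varattnos(var_attnos, parent_cols, child_cols):
    """Rewrite a list of parent attnos (standing in for Vars) into child attnos."""
    attno_map = build_attno_map(parent_cols, child_cols)
    return [attno_map[attno] for attno in var_attnos]


# Hypothetical three-level hierarchy where column order differs per level.
root = ["a", "b"]
mid = ["b", "a"]
leaf = ["a", "b"]
```

A constraint on root column `a` is attno 1 in `root`. Mapping it with the `root` to `leaf` pair gives `[1]`, which is correct. Mapping the same attno as if it belonged to `mid` gives `[2]`, which points at `b`: the kind of error that arises when only one parent-child pair is considered.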
2017-01-12	Fix field order in struct catcache.	Tom Lane
	Somebody failed to grasp the point of having the #ifdef CATCACHE_STATS fields at the end of the struct. Put that back the way it should be, and add a comment making it more explicit why it should be that way.
2017-01-09	Fix ALTER TABLE / SET TYPE for irregular inheritance	Alvaro Herrera
	If inherited tables don't have exactly the same schema, the USING clause in an ALTER TABLE / SET DATA TYPE misbehaves when applied to the child tables since commit 9550e8348b79. Starting with that commit, the attribute numbers in the USING expression are fixed during parse analysis. This can lead to bogus errors being reported during execution, such as: ERROR: attribute 2 has wrong type DETAIL: Table has type smallint, but query expects integer. Since it wouldn't do to revert to the original coding, we now apply a transformation to map the attribute numbers to the correct ones for each child. Reported by Justin Pryzby. Analysis by Tom Lane; patch by me. Discussion: https://postgr.es/m/[email protected]
2017-01-07	Get rid of ParseState.p_value_substitute; use a columnref hook instead.	Tom Lane
	I noticed that p_value_substitute, which is a single-purpose kluge I added in 2002 (commit b0422b215), could be replaced by having domainAddConstraint install a parser hook that looks for the name "value". The parser hook code only dates back to 2009, so it's not surprising that we had to kluge this in 2002, but we can do it more cleanly now.
2017-01-07	Improve documentation of struct ParseState.	Tom Lane
	I got annoyed about how some fields of ParseState were documented in the struct's block comment and some weren't; not all of the latter are trivial. Fix that. Also reorder a couple of fields that seem to have been placed rather randomly, or maybe with an idea of avoiding padding space; but there are never so many ParseStates in existence at one time that we ought to value pad space over readability.
2017-01-05	Fix possible crash reading pg_stat_activity.	Robert Haas
	With the old code, a backend that read pg_stat_activity without ever having executed a parallel query might see a backend in the midst of executing one waiting on a DSA LWLock, resulting in a crash. The solution is for backends to register the tranche at startup time, not the first time a parallel query is executed. Report by Andreas Seltenreich. Patch by me, reviewed by Thomas Munro.
2017-01-04	Remove unnecessary arguments from partitioning functions.	Robert Haas
	RelationGetPartitionQual() and generate_partition_qual() are always called with recurse = true, so we don't need an argument for that. Extracted by me from a larger patch by Amit Langote.
2017-01-04	Fix reporting of constraint violations for table partitioning.	Robert Haas
	After a tuple is routed to a partition, it has been converted from the root table's row type to the partition's row type. ExecConstraints needs to report the failure using the original tuple and the parent's tuple descriptor rather than the ones for the selected partition. Amit Langote
2017-01-04	Prefer int-wide pg_atomic_flag over char-wide when using gcc intrinsics.	Tom Lane
	configure can only probe the existence of gcc intrinsics, not how well they're implemented, and unfortunately the answer is sometimes "badly". In particular we've found that multiple compilers fail to implement char-width __sync_lock_test_and_set() correctly on PPC; and even a correct implementation would necessarily be pretty inefficient, since that hardware has only a word-wide primitive to work with. Given the knowledge we've accumulated in s_lock.h, it appears that it's best to rely on int-width TAS operations on most non-Intel architectures. Hence, pick int not char when both are nominally available to us in generic-gcc.h (note that that code is not used for x86[_64]). Back-patch to fix regression test failures on FreeBSD/PPC. Ordinarily back-patching a change like this would be verboten because of ABI breakage. But since pg_atomic_flag is not yet used in any Postgres data structure, there's no ABI to break. It seems safer to back-patch to avoid possible gotchas, if someday we do back-patch something that uses pg_atomic_flag. Discussion: https://postgr.es/m/[email protected]
2017-01-04	Move partition_tuple_slot out of EState.	Robert Haas
	Commit 2ac3ef7a01df859c62d0a02333b646d65eaec5ff added a TupleTableSlot for partition tuple slot to EState (es_partition_tuple_slot) but it's more logical to have it as part of ModifyTableState (mt_partition_tuple_slot) and CopyState (partition_tuple_slot). Discussion: http://postgr.es/m/[email protected] Amit Langote, per a gripe from me
2017-01-04	Re-allow SSL passphrase prompt at server start, but not thereafter.	Tom Lane
	Leave OpenSSL's default passphrase collection callback in place during the first call of secure_initialize() in server startup. Although that doesn't work terribly well in daemon contexts, some people feel we should not break it for anyone who was successfully using it before. We still block passphrase demands during SIGHUP, meaning that you can't adjust SSL configuration on-the-fly if you used a passphrase, but this is no worse than what it was before commit de41869b6. And we block passphrase demands during EXEC_BACKEND reloads; that behavior wasn't useful either, but at least now it's documented. Tweak some related log messages for more readability, and avoid issuing essentially duplicate messages about reload failure caused by a passphrase. Discussion: https://postgr.es/m/[email protected]
2017-01-04	Update obsolete comments in lwlock.h.	Robert Haas
	The typical size of an LWLock is now 16 bytes even on 64-bit platforms, and the size of slock_t is now irrelevant. But pg_atomic_uint32 can (perhaps surprisingly) still be larger than 4 bytes, so there's still some marginal point to allowing LWLOCK_MINIMAL_SIZE == 64. Commit 008608b9d51061b1f598c197477b3dc7be9c4a64 made the changes that led to the need for these updates.
2017-01-03	Update copyright via script for 2017	Bruce Momjian

2017-01-03	Allow SSL configuration to be updated at SIGHUP.	Tom Lane
	It is no longer necessary to restart the server to enable, disable, or reconfigure SSL. Instead, we just create a new SSL_CTX struct (by re-reading all relevant files) whenever we get SIGHUP. Testing shows that this is fast enough that it shouldn't be a problem. In conjunction with that, downgrade the logic that complains about pg_hba.conf "hostssl" lines when SSL isn't active: now that's just a warning condition not an error. An issue that still needs to be addressed is what shall we do with passphrase-protected server keys? As this stands, the server would demand the passphrase again on every SIGHUP, which is certainly impractical. But the case was only barely supported before, so that does not seem a sufficient reason to hold up committing this patch. Andreas Karlsson, reviewed by Michael Banck and Michael Paquier. Discussion: https://postgr.es/m/[email protected]
2017-01-02	Use clock_gettime(), if available, in instr_time measurements.	Tom Lane
	The advantage of clock_gettime() is that the API allows the result to be precise to nanoseconds, not just microseconds as in gettimeofday(). Now that it's routinely possible to do tens of plan node executions in 1us, we really need more precision than gettimeofday() can offer for EXPLAIN ANALYZE to accumulate statistics with. Some research shows that clock_gettime() is available on pretty nearly every modern Unix-ish platform, and as far as I have been able to test, it has about the same execution time as gettimeofday(), so there's no loss in switching over. (By the same token, this doesn't do anything to fix the fact that we really wish clock readings were faster. But there's enough win here to justify changing anyway.) A small side benefit is that on most platforms, we can use CLOCK_MONOTONIC instead of CLOCK_REALTIME and thereby render EXPLAIN impervious to concurrent resets of the system clock. (This means that code must not assume that the contents of struct instr_time have any well-defined interpretation as timestamps, but really that was true before.) Some platforms offer nonstandard clock IDs that might be of interest. This patch knows we should use CLOCK_MONOTONIC_RAW on macOS, because it provides more precision and is faster to read than their CLOCK_MONOTONIC. If there turn out to be many more cases where we need special rules, it might be appropriate to handle the selection of clock ID in configure, but for the moment that doesn't seem worth the trouble. Discussion: https://postgr.es/m/[email protected]
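The idea of timing with a monotonic, nanosecond-resolution clock can be sketched in Python, whose standard `time` module wraps clock_gettime() on platforms that provide it. The helper `elapsed_ns` is hypothetical and only illustrates the approach; it is not related to the instr_time code.

```python
import time


def elapsed_ns(func):
    """Run func() and return the elapsed time in integer nanoseconds.

    Reads CLOCK_MONOTONIC through clock_gettime where the platform offers it,
    and falls back to time.monotonic_ns() otherwise. The readings are only
    meaningful as differences, not as timestamps.
    """
    if hasattr(time, "clock_gettime_ns") and hasattr(time, "CLOCK_MONOTONIC"):
        def read_clock():
            return time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    else:
        read_clock = time.monotonic_ns

    start = read_clock()
    func()
    return read_clock() - start
```

Because the clock is monotonic, the difference cannot go negative if the system clock is reset while `func` runs.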
2016-12-29	Remove manual breaks in NodeTag assignments to fix duplicate tag numbers.	Tom Lane
	Commit f0e44751d added new node tags at a place in the tag numbering where there was no daylight left before the next hard-coded number, resulting in some duplicate tag assignments. This doesn't seem to have caused any big problem so far, but it's surely trouble waiting to happen. We could adjust the manually assigned breakpoints to make more room, but that just leaves the same hazard waiting to strike again in future. What seems like a better idea is to get rid of the manual assignments and leave NodeTags to be automatically assigned, consecutively from one on up. This means that any change in the tag list forces a backend-wide recompile, but realistically that's usually needed anyway. Discussion: https://postgr.es/m/[email protected]
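The duplicate-tag hazard can be sketched with Python enums as a stand-in for the C enum. The tag names and the break at 200 are hypothetical and chosen only to show the collision.

```python
from enum import IntEnum, auto


class ManualTag(IntEnum):
    # Hypothetical tags with a hard-coded break at 200 for the next group.
    T_PlanA = 198
    T_PlanB = 199
    T_NewNode = 200  # added later: no room left before the break
    T_ExprA = 200    # hard-coded start of the next group, now a duplicate


class AutoTag(IntEnum):
    # Same tags, numbered automatically and consecutively from one.
    T_PlanA = auto()
    T_PlanB = auto()
    T_NewNode = auto()
    T_ExprA = auto()
```

In `ManualTag`, `T_ExprA` silently becomes an alias of `T_NewNode`, so the two node types cannot be told apart; in `AutoTag` every tag gets a distinct value.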
2016-12-29	Expand ad-hoc unit abbreviations in function descriptions	Peter Eisentraut
	There is no need to use abbreviations here, so just write it out for consistency.
2016-12-29	Make more use of RoleSpec struct	Peter Eisentraut
	Most code was casting this through a generic Node. By declaring everything as RoleSpec appropriately, we can remove a bunch of casts and ad-hoc node type checking. Reviewed-by: Alvaro Herrera <[email protected]>
2016-12-23	Replace enum InhOption with simple boolean.	Tom Lane
	Now that it has only INH_NO and INH_YES values, it's just weird that it's not a plain bool, so make it that way. Also rename RangeVar.inhOpt to "inh", to be like RangeTblEntry.inh. My recollection is that we gave it a different name specifically because it had a different representation than the derived bool value, but it no longer does. And this is a good forcing function to be sure we catch any places that are affected by the change. Bump catversion because of possible effect on stored RangeVar nodes. I'm not exactly convinced that we ever store RangeVar on disk, but we have a readfuncs function for it, so be cautious. (If we do do so, then commit e13486eba was in error not to bump catversion.) Follow-on to commit e13486eba. Discussion: http://postgr.es/m/CA+TgmoYe+EG7LdYX6pkcNxr4ygkP4+A=jm9o-CPXyOvRiCNwaQ@mail.gmail.com
2016-12-23	Remove sql_inheritance GUC.	Robert Haas
	This backward-compatibility GUC is long overdue for removal. Discussion: http://postgr.es/m/CA+TgmoYe+EG7LdYX6pkcNxr4ygkP4+A=jm9o-CPXyOvRiCNwaQ@mail.gmail.com
2016-12-23	Remove _hash_chgbufaccess().	Robert Haas
	This is basically for the same reasons I got rid of _hash_wrtbuf() in commit 25216c98938495fd741bf585dcbef45b3a9ffd40: it's not convenient to have a function which encapsulates MarkBufferDirty(), especially as we move towards having hash indexes be WAL-logged. Patch by me, reviewed (but not entirely endorsed) by Amit Kapila.
2016-12-22	Fix tuple routing in cases where tuple descriptors don't match.	Robert Haas
	The previous coding failed to work correctly when we have a multi-level partitioned hierarchy where tables at successive levels have different attribute numbers for the partition key attributes. To fix, have each PartitionDispatch object store a standalone TupleTableSlot initialized with the TupleDesc of the corresponding partitioned table, along with a TupleConversionMap to map tuples from its parent's rowtype to its own rowtype. After tuple routing chooses a leaf partition, we must use the leaf partition's tuple descriptor, not the root table's. To that end, a dedicated TupleTableSlot for tuple routing is now allocated in EState. Amit Langote
2016-12-22	Fix handling of expanded objects in CoerceToDomain and CASE execution.	Tom Lane
	When the input value to a CoerceToDomain expression node is a read-write expanded datum, we should pass a read-only pointer to any domain CHECK expressions and then return the original read-write pointer as the expression result. Previously we were blindly passing the same pointer to all the consumers of the value, making it possible for a function in CHECK to modify or even delete the expanded value. (Since a plpgsql function will absorb a passed-in read-write expanded array as a local variable value, it will in fact delete the value on exit.) A similar hazard of passing the same read-write pointer to multiple consumers exists in domain_check() and in ExecEvalCase, so fix those too. The fix requires adding MakeExpandedObjectReadOnly calls at the appropriate places, which is simple enough except that we need to get the data type's typlen from somewhere. For the domain cases, solve this by redefining DomainConstraintRef.tcache as okay for callers to access; there wasn't any reason for the original convention against that, other than not wanting the API of typcache.c to be any wider than it had to be. For CASE, there's no good solution except to add a syscache lookup during executor start. Per bug #14472 from Marcos Castedo. Back-patch to 9.5 where expanded values were introduced. Discussion: https://postgr.es/m/[email protected]
2016-12-22	Skip checkpoints, archiving on idle systems.	Andres Freund
	Some background activity (like checkpoints, archive timeout, standby snapshots) is not supposed to happen on an idle system. Unfortunately so far it was not easy to determine when a system is idle, which defeated some of the attempts to avoid redundant activity on an idle system. To make that easier, allow marking individual WAL insertions as not being "important". By checking whether any important activity happened since the last time an activity was performed, it now is easy to check whether some action needs to be repeated. Use the new facility for checkpoints, archive timeout and standby snapshots. The lack of a facility causes some issues in older releases, but in my opinion the consequences (superfluous checkpoints / archived segments) aren't grave enough to warrant backpatching. Author: Michael Paquier, editorialized by Andres Freund Reviewed-By: Andres Freund, David Steele, Amit Kapila, Kyotaro HORIGUCHI Bug: #13685 Discussion: https://www.postgresql.org/message-id/[email protected] https://www.postgresql.org/message-id/CAB7nPqQcPqxEM3S735Bd2RzApNqSNJVietAC=6kfkYv_45dKwA@mail.gmail.com Backpatch: -
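The idle check described above can be sketched as a toy model in Python. It is loosely modeled on the description and is not the server's logic: the `WalTracker` class, its fields, and the choice to treat the checkpoint record itself as important are all assumptions made for illustration.

```python
class WalTracker:
    """Toy model: remember the position of the last 'important' WAL insertion."""

    def __init__(self):
        self.insert_pos = 0           # end of all WAL written so far
        self.last_important_pos = 0   # end of the last important record
        self.last_checkpoint_pos = 0  # position recorded by the last checkpoint

    def insert(self, size, important=True):
        """Append a record; only important records move last_important_pos."""
        self.insert_pos += size
        if important:
            self.last_important_pos = self.insert_pos

    def checkpoint(self):
        """Perform a checkpoint unless nothing important happened since the last one.

        Returns True if a checkpoint was written, False if it was skipped.
        """
        if self.last_important_pos == self.last_checkpoint_pos:
            return False
        self.insert(1, important=True)  # the checkpoint record itself
        self.last_checkpoint_pos = self.last_important_pos
        return True
```

After ordinary activity a checkpoint is written; if only unimportant records (for example periodic snapshots) follow, the next checkpoint is skipped.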
2016-12-22	Simplify tape block format.	Heikki Linnakangas
	No more indirect blocks. The blocks form a linked list instead. This saves some memory, because we don't need to have a buffer in memory to hold the indirect block (or blocks). To reflect that, TAPE_BUFFER_OVERHEAD is reduced from 3 to 1 buffer, which allows using more memory for building the initial runs. Reviewed by Peter Geoghegan and Robert Haas. Discussion: https://www.postgresql.org/message-id/34678beb-938e-646e-db9f-a7def5c44ada%40iki.fi
2016-12-21	Fix strange behavior (and possible crashes) in full text phrase search.	Tom Lane
	In an attempt to simplify the tsquery matching engine, the original phrase search patch invented rewrite rules that would rearrange a tsquery so that no AND/OR/NOT operator appeared below a PHRASE operator. But this approach had numerous problems. The rearrangement step was missed by ts_rewrite (and perhaps other places), allowing tsqueries to be created that would cause Assert failures or perhaps crashes at execution, as reported by Andreas Seltenreich. The rewrite rules effectively defined semantics for operators underneath PHRASE that were buggy, or at least unintuitive. And because rewriting was done in tsqueryin() rather than at execution, the rearrangement was user-visible, which is not very desirable --- for example, it might cause unexpected matches or failures to match in ts_rewrite. As a somewhat independent problem, the behavior of nested PHRASE operators was only sane for left-deep trees; queries like "x <-> (y <-> z)" did not behave intuitively at all. To fix, get rid of the rewrite logic altogether, and instead teach the tsquery execution engine to manage AND/OR/NOT below a PHRASE operator by explicitly computing the match location(s) and match widths for these operators. This requires introducing some additional fields into the publicly visible ExecPhraseData struct; but since there's no way for third-party code to pass such a struct to TS_phrase_execute, it shouldn't create an ABI problem as long as we don't move the offsets of the existing fields. Another related problem was that index searches supposed that "!x <-> y" could be lossily approximated as "!x & y", which isn't correct because the latter will reject, say, "x q y" which the query itself accepts. This required some tweaking in TS_execute_ternary along with the main tsquery engine. Back-patch to 9.6 where phrase operators were introduced. 
While this could be argued to change behavior more than we'd like in a stable branch, we have to do something about the crash hazards and index-vs-seqscan inconsistency, and it doesn't seem desirable to let the unintuitive behaviors induced by the rewriting implementation stand as precedent. Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2016-12-21	Refactor partition tuple routing code to reduce duplication.	Robert Haas
	Amit Langote
2016-12-21	Fix corner-case bug in WaitEventSetWaitBlock on Windows.	Robert Haas
	If we do not reset the FD_READ event, WaitForMultipleObjects won't return it again unless we've meanwhile read from the socket, which is generally true but not guaranteed. WaitEventSetWaitBlock itself may fail to return the event to the caller if the latch is also set, and even if we changed that, the caller isn't obliged to handle all returned events at once. On non-Windows systems, the socket-read event is purely level-triggered, so this issue does not exist. To fix, make Windows reset the event when needed. This bug was introduced by 98a64d0bd713cb89e61bef6432befc4b7b5da59e, and causes hangs when trying to use the pldebugger extension. Patch by Amit Kapila. Reported and tested by Ashutosh Sharma, who also provided some analysis. Further analysis by Michael Paquier.
2016-12-21	Reorder pg_sequence columns to avoid alignment issue	Peter Eisentraut
	On AIX, doubles are aligned at 4 bytes, but int64 is aligned at 8 bytes. Our code assumes that doubles have alignment that can also be applied to int64, but that fails in this case. One effect is that heap_form_tuple() writes tuples in a different layout than Form_pg_sequence expects. Rather than rewrite the whole alignment code, work around the issue by reordering the columns in pg_sequence so that the first int64 column naturally comes out at an 8-byte boundary.
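The layout mismatch can be sketched with a small offset calculator in Python. The `layout` helper, the column names, and the two alignment tables are hypothetical; they only model the situation described, where one side aligns int64 at 8 bytes and the other assumes it aligns like a 4-byte-aligned double.

```python
def layout(fields, alignment):
    """Return {name: offset} for fields laid out in order.

    fields: list of (name, type) pairs.
    alignment: {type: (size, align)} in bytes.
    """
    offsets, pos = {}, 0
    for name, typ in fields:
        size, align = alignment[typ]
        pos = (pos + align - 1) // align * align  # round up to the alignment
        offsets[name] = pos
        pos += size
    return offsets


# Hypothetical rules: the C struct aligns int64 at 8 bytes, while the
# tuple-forming code assumes int64 aligns like a 4-byte-aligned double.
STRUCT_RULES = {"int4": (4, 4), "int64": (8, 8)}
TUPLE_RULES = {"int4": (4, 4), "int64": (8, 4)}

before = [("id", "int4"), ("start", "int64")]
after = [("id", "int4"), ("typid", "int4"), ("start", "int64")]
```

With the `before` ordering the two rule sets place `start` at different offsets (8 versus 4). With the `after` ordering the first int64 column lands on an 8-byte boundary under both, so the layouts agree.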
2016-12-20	Add pg_sequence system catalog	Peter Eisentraut
	Move sequence metadata (start, increment, etc.) into a proper system catalog instead of storing it in the sequence heap object. This separates the metadata from the sequence data. Sequence metadata is now operated on transactionally by DDL commands, whereas previously rollbacks of sequence-related DDL commands would be ignored. Reviewed-by: Andreas Karlsson <[email protected]>
2016-12-20	Invalidate parent's relcache after CREATE TABLE .. PARTITION OF.	Robert Haas
	Otherwise, subsequent commands in the same transaction see the wrong partition descriptor. Amit Langote. Reported by Tomas Vondra and David Fetter. Reviewed by me. Discussion: http://postgr.es/m/22dd313b-d7fd-22b5-0787-654845c8f849%402ndquadrant.com Discussion: http://postgr.es/m/20161215090916.GB20659%40fetter.org
2016-12-19	Provide a DSA area for all parallel queries.	Robert Haas
	This will allow future parallel query code to dynamically allocate storage shared by all participants. Thomas Munro, with assorted changes by me.
2016-12-19	Fix locking problem in _hash_squeezebucket() / _hash_freeovflpage().	Robert Haas
	A bucket squeeze operation needs to lock each page of the bucket before releasing the prior page, but the previous coding fumbled the locking when freeing an overflow page during a bucket squeeze operation. Commit 6d46f4783efe457f74816a75173eb23ed8930020 introduced this bug. Amit Kapila, with help from Kuntal Ghosh and Dilip Kumar, after an initial trouble report by Jeff Janes. Reviewed by me. I also fixed a problem with a comment.
2016-12-19	Remove unused file.	Robert Haas
	This was added in 105409746499657acdffc109db9d343b464bda1f, but has never been used for anything as far as I can tell. There seems to be no reason to keep it.
2016-12-19	Support quorum-based synchronous replication.	Fujii Masao
	This feature is also known as "quorum commit", especially in discussion on pgsql-hackers. This commit adds the following new syntaxes to the synchronous_standby_names GUC. By using the FIRST and ANY keywords, users can specify the method to choose synchronous standbys from the listed servers. FIRST num_sync (standby_name [, ...]) ANY num_sync (standby_name [, ...]) The keyword FIRST specifies priority-based synchronous replication, which was also available in 9.6 and before. This method makes transaction commits wait until their WAL records are replicated to num_sync synchronous standbys chosen based on their priorities. The keyword ANY specifies quorum-based synchronous replication and makes transaction commits wait until their WAL records are replicated to at least num_sync listed standbys. In this method, the values of pg_stat_replication.sync_state for the listed standbys are reported as "quorum". The priority is still assigned to each standby, but not used in this method. The existing syntaxes having neither the FIRST nor the ANY keyword are still supported. They are the same as the new syntax with the FIRST keyword, i.e., priority-based synchronous replication. Author: Masahiko Sawada Reviewed-By: Michael Paquier, Amit Kapila and me Discussion: <CAD21AoAACi9NeC_ecm+Vahm+MMA6nYh=Kqs3KB3np+MBOS_gZg@mail.gmail.com> Many thanks to the various individuals who were involved in discussing and developing this feature.
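	The difference between FIRST and ANY can be sketched with a simplified model. This is not the server's code: the `released_lsn` function, the standby names, and the integer positions standing in for WAL locations are all hypothetical. It assumes every listed standby is connected and that the list is given in priority order.

```python
def released_lsn(method, num_sync, standbys):
    """Return the WAL position up to which waiting commits may proceed.

    standbys: list of (name, acknowledged_position) tuples, highest
    priority first. Returns None when fewer than num_sync standbys
    are listed, in which case commits would keep waiting.
    """
    if len(standbys) < num_sync:
        return None
    if method == "FIRST":
        # Priority-based: only the num_sync highest-priority standbys
        # count, and all of them must have acknowledged the position.
        chosen = standbys[:num_sync]
        return min(pos for _, pos in chosen)
    if method == "ANY":
        # Quorum-based: any num_sync standbys will do, so take the
        # num_sync-th most advanced acknowledgement.
        acked = sorted((pos for _, pos in standbys), reverse=True)
        return acked[num_sync - 1]
    raise ValueError(f"unknown method: {method}")


# Hypothetical standbys; s1 has the highest priority but lags behind.
standbys = [("s1", 100), ("s2", 300), ("s3", 250)]

first_result = released_lsn("FIRST", 2, standbys)
any_result = released_lsn("ANY", 2, standbys)
```

	In this model, FIRST 2 is held back by the lagging high-priority standby s1, while ANY 2 can release commits as soon as any two standbys (here s2 and s3) have acknowledged them.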
2016-12-16	Improve documentation around TS_execute().	Tom Lane
	I got frustrated by the lack of commentary in this area, so here is some reverse-engineered documentation, along with minor stylistic cleanup. No code changes more significant than removal of unused variables. Back-patch to 9.6, not because that's useful in itself, but because we have some bugs to fix in phrase search and this would cause merge failures if it's only in HEAD.
2016-12-16	Simplify LWLock tranche machinery by removing array_base/array_stride.	Robert Haas
	array_base and array_stride were added so that we could identify the offset of an LWLock within a tranche, but this facility is only very marginally used apart from the main tranche. So, give every lock in the main tranche its own tranche ID and get rid of array_base, array_stride, and all that's attached. For debugging facilities (Trace_lwlocks and LWLOCK_STATS) print the pointer address of the LWLock using %p instead of the offset. This is arguably more useful, and certainly a lot cheaper. Drop the offset-within-tranche from the information reported to dtrace and from one can't-happen message inside lwlock.c. The main user-visible impact of this change is that pg_stat_activity will now report all waits for LWLocks as "LWLock" rather than reporting some as "LWLockTranche" and others as "LWLockNamed". The main motivation for this change is that the need to specify an array_base and an array_stride is awkward for parallel query. There is only a very limited supply of tranche IDs so we can't just keep allocating new ones, and if we try to use the same tranche IDs every time then we run into trouble when multiple parallel contexts are in use simultaneously. So if we didn't get rid of this mechanism we'd have to make it even more complicated. By simplifying it in this way, we instead reduce the size of the generated code for lwlock.c by about 5%. Discussion: http://postgr.es/m/CA+TgmoYsFn6NUW1x0AZtupJGUAs1UDY4dJtCN47_Q6D0sP80PA@mail.gmail.com
2016-12-16	Unbreak Finalize HashAggregate over Partial HashAggregate.	Robert Haas
	Commit 5dfc198146b49ce7ecc8a1fc9d5e171fb75f6ba5 introduced the use of a new type of hash table with linear reprobing for hash aggregates. Such a hash table behaves very poorly if keys are inserted in hash order, which does in fact happen in the case where a query uses a Finalize HashAggregate node fed (via Gather) by a Partial HashAggregate node. In fact, queries with this type of plan tend to run effectively forever. Fix that by seeding the hash value differently in each worker (and in the leader, if it participates). Andres Freund and Robert Haas