summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2015-05-15TABLESAMPLE, SQL Standard and extensibleSimon Riggs
Add a TABLESAMPLE clause to SELECT statements that allows user to specify random BERNOULLI sampling or block level SYSTEM sampling. Implementation allows for extensible sampling functions to be written, using a standard API. Basic version follows SQLStandard exactly. Usable concrete use cases for the sampling API follow in later commits. Petr Jelinek Reviewed by Michael Paquier and Simon Riggs
2015-05-15Silence another create_index regression test failure.Heikki Linnakangas
More platform differences in the less-significant digits in output. Per buildfarm member rover_firefly, still.
2015-05-15Fix outdated src/test/mb/ tests, and add a GB18030 test.Tom Lane
The expected-output files for these tests were broken by the recent addition of a warning for hash indexes. Update them. Also add a test case for GB18030 encoding, similar to the other ones. This is a pretty weak test, but it's better than nothing.
2015-05-15Add archive_mode='always' option.Heikki Linnakangas
In 'always' mode, the standby independently archives all files it receives from the primary. Original patch by Fujii Masao, docs and review by me.
2015-05-15Silence create_index regression test failure.Heikki Linnakangas
The expected output contained some floating point values which might get rounded slightly differently on different platforms. The exact output isn't very interesting in this test, so just round it. Per buildfarm member rover_firefly.
2015-05-15Fix datatype confusion with the new lossy GiST distance functions.Heikki Linnakangas
We can only support a lossy distance function when the distance function's datatype is comparable with the original ordering operator's datatype. The distance function always returns a float8, so we are limited to float8, and float4 (by a hard-coded cast of the float8 to float4). In light of this limitation, it seems like a good idea to have a separate 'recheck' flag for the ORDER BY expressions, so that if you have a non-lossy distance function, it still works with lossy quals. There are cases like that with the build-in or contrib opclasses, but it's plausible. There was a hidden assumption that the ORDER BY values returned by GiST match the original ordering operator's return type, but there are plenty of examples where that's not true, e.g. in btree_gist and pg_trgm. As long as the distance function is not lossy, we can tolerate that and just not return the distance to the executor (or rather, always return NULL). The executor doesn't need the distances if there are no lossy results. There was another little bug: the recheck variable was not initialized before calling the distance function. That revealed the bigger issue, as the executor tried to reorder tuples that didn't need reordering, and that failed because of the datatype mismatch.
2015-05-15Fix insufficiently-paranoid GB18030 encoding verifier.Tom Lane
The previous coding effectively only verified that the second byte of a multibyte character was in the expected range; moreover, it wasn't careful to make sure that the second byte even exists in the buffer before touching it. The latter seems unlikely to cause any real problems in the field (in particular, it could never be a problem with null-terminated input), but it's still a bug. Since GB18030 is not a supported backend encoding, the only thing we'd really be doing with GB18030 text is converting it to UTF8 in LocalToUtf, which would fail anyway on any invalid character for lack of a match in its lookup table. So the only user-visible consequence of this change should be that you'll get "invalid byte sequence for encoding" rather than "character has no equivalent" for malformed GB18030 input. However, impending changes to the GB18030 conversion code will require these tighter up-front checks to avoid producing bogus results.
2015-05-15Support --verbose option in reindexdb.Fujii Masao
Sawada Masahiko, reviewed by Fabrízio Mello
2015-05-15Allow GiST distance function to return merely a lower-bound.Heikki Linnakangas
The distance function can now set *recheck = false, like index quals. The executor will then re-check the ORDER BY expressions, and use a queue to reorder the results on the fly. This makes it possible to do kNN-searches on polygons and circles, which don't store the exact value in the index, but just a bounding box. Alexander Korotkov and me
2015-05-15Support VERBOSE option in REINDEX command.Fujii Masao
When this option is specified, a progress report is printed as each index is reindexed. Per discussion, we agreed on the following syntax for the extensibility of the options. REINDEX (flexible options) { INDEX | ... } name Sawada Masahiko. Reviewed by Robert Haas, Fabrízio Mello, Alvaro Herrera, Kyotaro Horiguchi, Jim Nasby and me. Discussion: CAD21AoA0pK3YcOZAFzMae+2fcc3oGp5zoRggDyMNg5zoaWDhdQ@mail.gmail.com
2015-05-15Teach UtfToLocal/LocalToUtf to support algorithmic encoding conversions.Tom Lane
Until now, these functions have only supported encoding conversions using lookup tables, which is fine as long as there's not too many code points to convert. However, GB18030 expects all 1.1 million Unicode code points to be convertible, which would require a ridiculously-sized lookup table. Fortunately, a large fraction of those conversions can be expressed through arithmetic, ie the conversions are one-to-one in certain defined ranges. To support that, provide a callback function that is used after consulting the lookup tables. (This patch doesn't actually change anything about the GB18030 conversion behavior, just provide infrastructure for fixing it.) Since this requires changing the APIs of UtfToLocal/LocalToUtf anyway, take the opportunity to rearrange their argument lists into what seems to me a saner order. And beautify the call sites by using lengthof() instead of error-prone sizeof() arithmetic. In passing, also mark all the lookup tables used by these calls "const". This moves an impressive amount of stuff into the text segment, at least on my machine, and is safer anyhow.
2015-05-15Separate block sampling functionsSimon Riggs
Refactoring ahead of tablesample patch Requested and reviewed by Michael Paquier Petr Jelinek
2015-05-15pg_upgrade: make controldata checks more consistentBruce Momjian
Also add missing float8_pass_by_value check.
2015-05-15Add pg_settings.pending_restart columnPeter Eisentraut
with input from David G. Johnston, Robert Haas, Michael Paquier
2015-05-14Support "expanded" objects, particularly arrays, for better performance.Tom Lane
This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-13Fix comment.Robert Haas
Commit 78efd5c1edb59017f06ef96773e64e6539bfbc86 overlooked this. Report by Peter Geoghegan.
2015-05-13Extend abbreviated key infrastructure to datum tuplesorts.Robert Haas
Andrew Gierth, reviewed by Peter Geoghegan and by me.
2015-05-13Fix postgres_fdw to return the right ctid value in EvalPlanQual cases.Tom Lane
If a postgres_fdw foreign table is a non-locked source relation in an UPDATE, DELETE, or SELECT FOR UPDATE/SHARE, and the query selects its ctid column, the wrong value would be returned if an EvalPlanQual recheck occurred. This happened because the foreign table's result row was copied via the ROW_MARK_COPY code path, and EvalPlanQualFetchRowMarks just unconditionally set the reconstructed tuple's t_self to "invalid". To fix that, we can have EvalPlanQualFetchRowMarks copy the composite datum's t_ctid field, and be sure to initialize that along with t_self when postgres_fdw constructs a tuple to return. If we just did that much then EvalPlanQualFetchRowMarks would start returning "(0,0)" as ctid for all other ROW_MARK_COPY cases, which perhaps does not matter much, but then again maybe it might. The cause of that is that heap_form_tuple, which is the ultimate source of all composite datums, simply leaves t_ctid as zeroes in newly constructed tuples. That seems like a bad idea on general principles: a field that's really not been initialized shouldn't appear to have a valid value. So let's eat the trivial additional overhead of doing "ItemPointerSetInvalid(&(td->t_ctid))" in heap_form_tuple. This closes out our handling of Etsuro Fujita's report that tableoid and ctid weren't correctly set in postgres_fdw EvalPlanQual cases. Along the way we did a great deal of work to improve FDWs' ability to control row locking behavior; which was not wasted effort by any means, but it didn't end up being a fix for this problem because that feature would be too expensive for postgres_fdw to use all the time. Although the fix for the tableoid misbehavior was back-patched, I'm hesitant to do so here; it seems far less likely that people would care about remote ctid than tableoid, and even such a minor behavioral change as this in heap_form_tuple is perhaps best not back-patched. So commit to HEAD only, at least for the moment. Etsuro Fujita, with some adjustments by me
2015-05-13Fix jsonb replace and delete on scalars and empty structuresAndrew Dunstan
These operations now error out if attempted on scalars, and simply return the input if attempted on empty arrays or objects. Along the way we remove the unnecessary cloning of the input when it's known to be unchanged. Regression tests covering these cases are added.
2015-05-13Remove useless assertion.Robert Haas
Here, snapshot->xcnt is an unsigned type, so it will always be non-negative.
2015-05-13PL/Python: Remove procedure cache invalidationPeter Eisentraut
This was added to react to changes in the pg_transform catalog, but building with CLOBBER_CACHE_ALWAYS showed that PL/Python was not prepared for having its procedure cache cleared. Since this is a marginal use case, and we don't do this for other catalogs anyway, we can postpone this to another day.
2015-05-12Fix ON CONFLICT bugs that manifest when used in rules.Andres Freund
Specifically the tlist and rti of the pseudo "excluded" relation weren't properly treated by expression_tree_walker, which lead to errors when excluded was referenced inside a rule because the varnos where not properly adjusted. Similar omissions in OffsetVarNodes and expression_tree_mutator had less impact, but should obviously be fixed nonetheless. A couple tests of for ON CONFLICT UPDATE into INSERT rule bearing relations have been added. In passing I updated a couple comments.
2015-05-12Fix some errors from jsonb functions patch.Andrew Dunstan
The catalog version should have been bumped, and the alternative regression result file was not up to date with the name of jsonb_pretty.
2015-05-12Additional functions and operators for jsonbAndrew Dunstan
jsonb_pretty(jsonb) produces nicely indented json output. jsonb || jsonb concatenates two jsonb values. jsonb - text removes a key and its associated value from the json jsonb - int removes the designated array element jsonb - text[] removes a key and associated value or array element at the designated path jsonb_replace(jsonb,text[],jsonb) replaces the array element designated by the path or the value associated with the key designated by the path with the given value. Original work by Dmitry Dolgov, adapted and reworked for PostgreSQL core by Andrew Dunstan, reviewed and tidied up by Petr Jelinek.
2015-05-12Add support for doing late row locking in FDWs.Tom Lane
Previously, FDWs could only do "early row locking", that is lock a row as soon as it's fetched, even though local restriction/join conditions might discard the row later. This patch adds callbacks that allow FDWs to do late locking in the same way that it's done for regular tables. To make use of this feature, an FDW must support the "ctid" column as a unique row identifier. Currently, since ctid has to be of type TID, the feature is of limited use, though in principle it could be used by postgres_fdw. We may eventually allow FDWs to specify another data type for ctid, which would make it possible for more FDWs to use this feature. This commit does not modify postgres_fdw to use late locking. We've tested some prototype code for that, but it's not in committable shape, and besides it's quite unclear whether it actually makes sense to do late locking against a remote server. The extra round trips required are likely to outweigh any benefit from improved concurrency. Etsuro Fujita, reviewed by Ashutosh Bapat, and hacked up a lot by me
2015-05-12pgbench: Don't fail during startupStephen Frost
In pgbench, report, but ignore, any errors returned when attempting to vacuum/truncate the default tables during startup. If the tables are needed, we'll error out soon enough anyway. Per discussion with Tatsuo, David Rowley, Jim Nasby, Robert, Andres, Fujii, Fabrízio de Royes Mello, Tomas Vondra, Michael Paquier, Peter, based on a suggestion from Jeff Janes, patch from Robert, additional message wording from Tom.
2015-05-12pg_basebackup -F t now succeeds with a long symlink targetAndrew Dunstan
2015-05-12doc build: use unique Makefile variable to control temp installBruce Momjian
2015-05-12"Fix" test_ddl_deparse regress test scheduleAlvaro Herrera
MSVC is not smart enough to figure it out, so dumb down the Makefile and remove the schedule file. Also add a .gitignore file. Author: Michael Paquier
2015-05-12doc: prevent SGML 'make check' from building temp installBruce Momjian
Report by Alvaro Herrera
2015-05-12Map basebackup tablespaces using a tablespace_map fileAndrew Dunstan
Windows can't reliably restore symbolic links from a tar format, so instead during backup start we create a tablespace_map file, which is used by the restoring postgres to create the correct links in pg_tblspc. The backup protocol also now has an option to request this file to be included in the backup stream, and this is used by pg_basebackup when operating in tar mode. This is done on all platforms, not just Windows. This means that pg_basebackup will not not work in tar mode against 9.4 and older servers, as this protocol option isn't implemented there. Amit Kapila, reviewed by Dilip Kumar, with a little editing from me.
2015-05-12Replace some appendStringInfo* calls with more appropriate variantsPeter Eisentraut
Author: David Rowley <[email protected]>
2015-05-11Allow on-the-fly capture of DDL event detailsAlvaro Herrera
This feature lets user code inspect and take action on DDL events. Whenever a ddl_command_end event trigger is installed, DDL actions executed are saved to a list which can be inspected during execution of a function attached to ddl_command_end. The set-returning function pg_event_trigger_ddl_commands can be used to list actions so captured; it returns data about the type of command executed, as well as the affected object. This is sufficient for many uses of this feature. For the cases where it is not, we also provide a "command" column of a new pseudo-type pg_ddl_command, which is a pointer to a C structure that can be accessed by C code. The struct contains all the info necessary to completely inspect and even reconstruct the executed command. There is no actual deparse code here; that's expected to come later. What we have is enough infrastructure that the deparsing can be done in an external extension. The intention is that we will add some deparsing code in a later release, as an in-core extension. A new test module is included. It's probably insufficient as is, but it should be sufficient as a starting point for a more complete and future-proof approach. Authors: Álvaro Herrera, with some help from Andres Freund, Ian Barwick, Abhijit Menon-Sen. Reviews by Andres Freund, Robert Haas, Amit Kapila, Michael Paquier, Craig Ringer, David Steele. Additional input from Chris Browne, Dimitri Fontaine, Stephen Frost, Petr Jelínek, Tom Lane, Jim Nasby, Steven Singer, Pavel Stěhule. Based on original work by Dimitri Fontaine, though I didn't use his code. Discussion: https://www.postgresql.org/message-id/[email protected] https://www.postgresql.org/message-id/[email protected] https://www.postgresql.org/message-id/[email protected]
2015-05-11Allow LOCK TABLE .. ROW EXCLUSIVE MODE with INSERTStephen Frost
INSERT acquires RowExclusiveLock during normal operation and therefore it makes sense to allow LOCK TABLE .. ROW EXCLUSIVE MODE to be executed by users who have INSERT rights on a table (even if they don't have UPDATE or DELETE). Not back-patching this as it's a behavior change which, strictly speaking, loosens security restrictions. Per discussion with Tom and Robert (circa 2013).
2015-05-11pg_upgrade: use single or double-quotes in command-line stringsBruce Momjian
This is platform-dependent.
2015-05-11Fix incorrect checking of deferred exclusion constraint after a HOT update.Tom Lane
If a row that potentially violates a deferred exclusion constraint is HOT-updated later in the same transaction, the exclusion constraint would be reported as violated when the check finally occurs, even if the row(s) the new row originally conflicted with have since been removed. This happened because the wrong TID was passed to check_exclusion_constraint(), causing the live HOT-updated row to be seen as a conflicting row rather than recognized as the row-under-test. Per bug #13148 from Evan Martin. It's been broken since exclusion constraints were invented, so back-patch to all supported branches.
2015-05-11Increase threshold for multixact member emergency autovac to 50%.Robert Haas
Analysis by Noah Misch shows that the 25% threshold set by commit 53bb309d2d5a9432d2602c93ed18e58bd2924e15 is lower than any other, similar autovac threshold. While we don't know exactly what value will be optimal for all users, it is better to err a little on the high side than on the low side. A higher value increases the risk that users might exhaust the available space and start seeing errors before autovacuum can clean things up sufficiently, but a user who hits that problem can compensate for it by reducing autovacuum_multixact_freeze_max_age to a value dependent on their average multixact size. On the flip side, if the emergency cap imposed by that patch kicks in too early, the user will experience excessive wraparound scanning and will be unable to mitigate that problem by configuration. The new value will hopefully reduce the risk of such bad experiences while still providing enough headroom to avoid multixact member exhaustion for most users. Along the way, adjust the documentation to reflect the effects of commit 04e6d3b877e060d8445eb653b7ea26b1ee5cec6b, which taught autovacuum to run for multixact wraparound even when autovacuum is configured off.
2015-05-11initdb: only recommend pg_ctl to start the serverBruce Momjian
Previously we mentioned the 'postgres' binary method as well.
2015-05-11pg_dump: suppress "Tablespace:" comment for default tablespacesBruce Momjian
Report by Hans Ginzel
2015-05-11Even when autovacuum=off, force it for members as we do in other cases.Robert Haas
Thomas Munro, with some adjustments by me.
2015-05-11Advance the stop point for multixact offset creation only at checkpoint.Robert Haas
Commit b69bf30b9bfacafc733a9ba77c9587cf54d06c0c advanced the stop point at vacuum time, but this has subsequently been shown to be unsafe as a result of analysis by myself and Thomas Munro and testing by Thomas Munro. The crux of the problem is that the SLRU deletion logic may get confused about what to remove if, at exactly the right time during the checkpoint process, the head of the SLRU crosses what used to be the tail. This patch, by me, fixes the problem by advancing the stop point only following a checkpoint. This has the additional advantage of making the removal logic work during recovery more like the way it works during normal running, which is probably good. At least one of the calls to DetermineSafeOldestOffset which this patch removes was already dead, because MultiXactAdvanceOldest is called only during recovery and DetermineSafeOldestOffset was set up to do nothing during recovery. That, however, is inconsistent with the principle that recovery and normal running should work similarly, and was confusing to boot. Along the way, fix some comments that previous patches in this area neglected to update. It's not clear to me whether there's any concrete basis for the decision to use only half of the multixact ID space, but it's neither necessary nor sufficient to prevent multixact member wraparound, so the comments should not say otherwise.
2015-05-11Fix DetermineSafeOldestOffset for the case where there are no mxacts.Robert Haas
Commit b69bf30b9bfacafc733a9ba77c9587cf54d06c0c failed to take into account the possibility that there might be no multixacts in existence at all. Report by Thomas Munro; patch by me.
2015-05-10Code review for foreign/custom join pushdown patch.Tom Lane
Commit e7cb7ee14555cc9c5773e2c102efd6371f6f2005 included some design decisions that seem pretty questionable to me, and there was quite a lot of stuff not to like about the documentation and comments. Clean up as follows: * Consider foreign joins only between foreign tables on the same server, rather than between any two foreign tables with the same underlying FDW handler function. In most if not all cases, the FDW would simply have had to apply the same-server restriction itself (far more expensively, both for lack of caching and because it would be repeated for each combination of input sub-joins), or else risk nasty bugs. Anyone who's really intent on doing something outside this restriction can always use the set_join_pathlist_hook. * Rename fdw_ps_tlist/custom_ps_tlist to fdw_scan_tlist/custom_scan_tlist to better reflect what they're for, and allow these custom scan tlists to be used even for base relations. * Change make_foreignscan() API to include passing the fdw_scan_tlist value, since the FDW is required to set that. Backwards compatibility doesn't seem like an adequate reason to expect FDWs to set it in some ad-hoc extra step, and anyway existing FDWs can just pass NIL. * Change the API of path-generating subroutines of add_paths_to_joinrel, and in particular that of GetForeignJoinPaths and set_join_pathlist_hook, so that various less-used parameters are passed in a struct rather than as separate parameter-list entries. The objective here is to reduce the probability that future additions to those parameter lists will result in source-level API breaks for users of these hooks. It's possible that this is even a small win for the core code, since most CPU architectures can't pass more than half a dozen parameters efficiently anyway. I kept root, joinrel, outerrel, innerrel, and jointype as separate parameters to reduce code churn in joinpath.c --- in particular, putting jointype into the struct would have been problematic because of the subroutines' habit of changing their local copies of that variable. * Avoid ad-hocery in ExecAssignScanProjectionInfo. It was probably all right for it to know about IndexOnlyScan, but if the list is to grow we should refactor the knowledge out to the callers. * Restore nodeForeignscan.c's previous use of the relcache to avoid extra GetFdwRoutine lookups for base-relation scans. * Lots of cleanup of documentation and missed comments. Re-order some code additions into more logical places.
2015-05-10Add missing "static" marker.Tom Lane
Per buildfarm member pademelon.
2015-05-09Add new OID alias type regnamespaceAndrew Dunstan
Catalog version bumped Kyotaro HORIGUCHI
2015-05-09Add new OID alias type regroleAndrew Dunstan
The new type has the scope of whole the database cluster so it doesn't behave the same as the existing OID alias types which have database scope, concerning object dependency. To avoid confusion constants of the new type are prohibited from appearing where dependencies are made involving it. Also, add a note to the docs about possible MVCC violation and optimization issues, which are general over the all reg* types. Kyotaro Horiguchi
2015-05-09Improve ParseConfigFp comment wrt head/tailStephen Frost
The head_p and tail_p pointers passed to ParseConfigFp() are actually input/output parameters, not strictly output paramaters. This updates the function comment to reflect that. Per discussion with Tom.
2015-05-08Change default for include_realm to 1Stephen Frost
The default behavior for GSS and SSPI authentication methods has long been to strip the realm off of the principal, however, this is not a secure approach in multi-realm environments and the use-case for the parameter at all has been superseded by the regex-based mapping support available in pg_ident.conf. Change the default for include_realm to be '1', meaning that we do NOT remove the realm from the principal by default. Any installations which depend on the existing behavior will need to update their configurations (ideally by leaving include_realm set to 1 and adding a mapping in pg_ident.conf, but alternatively by explicitly setting include_realm=0 prior to upgrading). Note that the mapping capability exists in all currently supported versions of PostgreSQL and so this change can be done today. Barring that, existing users can update their configurations today to explicitly set include_realm=0 to ensure that the prior behavior is maintained when they upgrade. This needs to be noted in the release notes. Per discussion with Magnus and Peter.
2015-05-08Modify pg_stat_get_activity to build a tuplestoreStephen Frost
This updates pg_stat_get_activity() to build a tuplestore for its results instead of using the old-style multiple-call method. This simplifies the function, though that wasn't the primary motivation for the change, which is that we may turn it into a helper function which can filter the results (or not) much more easily.
2015-05-08Bump catversion for pg_file_settingsStephen Frost
Pointed out by Andres (thanks!) Apologies for not including it in the initial patch.