== PostgreSQL Weekly News - July 21, 2019 ==
From | David Fetter |
---|---|
Subject | == PostgreSQL Weekly News - July 21, 2019 == |
Date | |
Msg-id | 20190721201114.GA3885@fetter.org |
List | pgsql-announce |
2Q PGConf 2019 will be held December 4 & 5 in Chicago. The CFP is open through August 30, 2019.
https://www.2qpgconf.com/

== PostgreSQL Product News ==

postgres-checkup 1.2 "The Guardian's Voice", a tool that automates detailed health checks of PostgreSQL clusters, released.
https://gitlab.com/postgres-ai-team/postgres-checkup/tags/1.2

== PostgreSQL Jobs for July ==

http://archives.postgresql.org/pgsql-jobs/2019-07/

== PostgreSQL Local ==

PGConf.Brazil 2019 will take place August 1-3, 2019 in São Paulo.
http://pgconf.com.br

The first Austrian pgDay will take place September 6, 2019 at the Hilton Garden Inn in Wiener Neustadt.
https://pgday.at/en/

PostgresOpen will be September 11th - 13th, 2019 in Orlando, Florida at the Rosen Centre Hotel. The CfP is open at https://2019.postgresopen.org/callforpapers/
https://2019.postgresopen.org/

PostgresConf South Africa 2019 will take place in Johannesburg on October 8-9, 2019.
https://postgresconf.org/conferences/SouthAfrica2019

pgDay Paris 2020 will be held in Paris, France on March 26, 2020 at Espace Saint-Martin.
http://2020.pgday.paris/

Nordic PGDay 2020 will be held in Helsinki, Finland at the Hilton Helsinki Strand Hotel on March 24, 2020. The CfP is open through December 31, 2019 at https://2020.nordicpgday.org/cfp/

== PostgreSQL in the News ==

Planet PostgreSQL: http://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

== Applied Patches ==

Thomas Munro pushed:

- Provide XLogRecGetFullXid(). In order to be able to work with FullTransactionId values during replay without increasing the size of the WAL, infer the epoch. In general we can't do that safely, but during replay we can because we know that nextFullXid can't advance concurrently. Prevent frontend code from seeing this new function, due to the above restriction.
Perhaps in future it will be possible to extract the value entirely from independent WAL records, and then this restriction can be lifted. Author: Thomas Munro, based on earlier code from Andres Freund Discussion: https://postgr.es/m/CA%2BhUKG%2BmLmuDjMi6o1dxkKvGRL56Y2Rz%2BiXAcrZV03G9ZuFQ8Q%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/67b9b3ca328392f9afc4e66fe03564f5fc87feff - Report the time taken by pgbench initialization steps. Author: Fabien Coelho Reviewed-by: Ibrar Ahmed Discussion: https://postgr.es/m/alpine.DEB.2.21.1904061810510.3678%40lancre https://git.postgresql.org/pg/commitdiff/ce8f946764005e1bde5a538542205e55f81bb6cc - Provide pgbench --show-script to dump built-in scripts. Author: Fabien Coelho Reviewed-by: Ibrar Ahmed Discussion: https://postgr.es/m/alpine.DEB.2.21.1904081737390.5867%40lancre https://git.postgresql.org/pg/commitdiff/5823677acc567d7790cc68972de12f6718913d7d - Move some md.c-specific logic from smgr.c to md.c. Potential future SMGR implementations may not want to create tablespace directories when creating an SMGR relation. Move that logic to mdcreate(). Move the initialization of md-specific data structures from smgropen() to a new callback mdopen(). Author: Thomas Munro Reviewed-by: Shawn Debnath (as part of an earlier patch set) Discussion: https://postgr.es/m/CA%2BhUKG%2BOZqOiOuDm5tC5DyQZtJ3FH4%2BFSVMqtdC4P1atpJ%2Bqhg%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/dfd0121dc73aab491bcaad2d2b7a2a749389add8 Tom Lane pushed: - Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. 
That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. 
Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. 
* There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/1cff1b95ab6ddae32faa3efe0d95a820dbfdc164 - Remove dead code. These memory context switches are useless in the wake of commit 1cff1b95a. Noted by Jesper Pedersen. Discussion: https://postgr.es/m/f078ce63-9e04-0f3e-d200-d7ee66279abe@redhat.com https://git.postgresql.org/pg/commitdiff/4c3d05d875dd173a81a995c6e14d69496b467eec - Redesign the API for list sorting (list_qsort becomes list_sort). In the wake of commit 1cff1b95a, the obvious way to sort a List is to apply qsort() directly to the array of ListCells. list_qsort was building an intermediate array of pointers-to-ListCells, which we no longer need, but getting rid of it forces an API change: the comparator functions need to do one less level of indirection. Since we're having to touch the callers anyway, let's do two additional changes: sort the given list in-place rather than making a copy (as none of the existing callers have any use for the copying behavior), and rename list_qsort to list_sort. 
It was argued that the old name exposes more about the implementation than it should, which I find pretty questionable, but a better reason to rename it is to be sure we get the attention of any external callers about the need to fix their comparator functions. While we're at it, change four existing callers of qsort() to use list_sort instead; previously, they all had local reinventions of list_qsort, ie build-an-array-from-a-List-and-qsort-it. (There are some other places where changing to list_sort perhaps would be worthwhile, but they're less obviously wins.) Discussion: https://postgr.es/m/29361.1563220190@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/569ed7f48312c70ed4a79daec1d7688fda4e74ac - Clean up some ad-hoc code for sorting and de-duplicating Lists. heap.c and relcache.c contained nearly identical copies of logic to insert OIDs into an OID list while preserving the list's OID ordering (and rejecting duplicates, in one case but not the other). The comments argue that this is faster than qsort for small numbers of OIDs, which is at best unproven, and seems even less likely to be true now that lappend_cell_oid has to move data around. In any case it's ugly and hard-to-follow code, and if we do have a lot of OIDs to consider, it's O(N^2). Hence, replace with simply lappend'ing OIDs to a List, then list_sort the completed List, then remove adjacent duplicates if necessary. This is demonstrably O(N log N) and it's much simpler for the callers. It's possible that this would be somewhat inefficient if there were a very large number of duplicates, but that seems unlikely in the existing usage. This adds list_deduplicate_oid and list_oid_cmp infrastructure to list.c. I didn't bother with equivalent functionality for integer or pointer Lists, but such could always be added later if we find a use for it. 
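
As a rough illustration of the approach described above (append everything, sort once, then drop adjacent duplicates), here is a small Python sketch. It is not PostgreSQL's C code: the names `collect_sorted_unique_oids` and `dedupe_adjacent` are hypothetical stand-ins for the lappend, list_sort and list_deduplicate_oid sequence that the commit message describes, and OIDs are modelled as plain integers.

```python
from typing import Iterable, List


def dedupe_adjacent(sorted_oids: List[int]) -> List[int]:
    """Remove adjacent duplicates, in place, from an already sorted list."""
    if not sorted_oids:
        return sorted_oids
    write = 1  # next slot to fill with a distinct value
    for read in range(1, len(sorted_oids)):
        if sorted_oids[read] != sorted_oids[write - 1]:
            sorted_oids[write] = sorted_oids[read]
            write += 1
    del sorted_oids[write:]  # drop the leftover tail
    return sorted_oids


def collect_sorted_unique_oids(oids: Iterable[int]) -> List[int]:
    """Hypothetical helper: append, sort once, then de-duplicate."""
    result = list(oids)             # append in arrival order
    result.sort()                   # one O(N log N) sort
    return dedupe_adjacent(result)  # one O(N) pass over sorted data
```

Under these assumptions, `collect_sorted_unique_oids([5, 3, 5, 1, 3])` yields `[1, 3, 5]`. The point of the design is that no element is inserted into the middle of the list while it is being built, so no data has to be shifted until the single sort at the end.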
Discussion: https://postgr.es/m/26193.1563228600@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/2f5b8eb5a28b4e6de9d20cc7d2c6028c6c7a8aa8 - Remove lappend_cell...() family of List functions. It seems worth getting rid of these functions because they require the caller to retain a ListCell pointer into a List that it's modifying, which is a dangerous practice with the new List implementation. (The only other List-modifying function that takes a ListCell pointer as input is list_delete_cell, which nowadays is preferentially used via the constrained API foreach_delete_current.) There was only one remaining caller of these functions after commit 2f5b8eb5a, and that was some fairly ugly GEQO code that can be much more clearly expressed using a list-index variable and list_insert_nth. Hence, rewrite that code, and remove the functions. Discussion: https://postgr.es/m/26193.1563228600@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/c245776906b065fcd59831a25c3b24ad3ddcd849 - Fix thinko in construction of old_conpfeqop list. This should lappend the OIDs, not lcons them; the existing code produced a list in reversed order. This is harmless for single-key FKs or FKs where all the key columns are of the same type, which probably explains how it went unnoticed. But if those conditions are not met, ATAddForeignKeyConstraint would make the wrong decision about whether an existing FK needs to be revalidated. I think it would almost always err in the safe direction by revalidating a constraint that didn't need it. You could imagine scenarios where the pfeqop check was fooled by swapping the types of two FK columns in one ALTER TABLE, but that case would probably be rejected by other tests, so it might be impossible to get to the worst-case scenario where an FK should be revalidated and isn't. (And even then, it's likely to be fine, unless there are weird inconsistencies in the equality behavior of the replacement types.) However, this is a performance bug at least. 
Noted while poking around to see whether lcons calls could be converted to lappend. This bug is old, dating to commit cb3a7c2b9, so back-patch to all supported branches. https://git.postgresql.org/pg/commitdiff/3093eb2b83645a083a47ea62769ffd89e31f3664 - Avoid using lcons and list_delete_first where it's easy to do so. Formerly, lcons was about the same speed as lappend, but with the new List implementation, that's not so; with a long List, data movement imposes an O(N) cost on lcons and list_delete_first, but not lappend. Hence, invent list_delete_last with semantics parallel to list_delete_first (but O(1) cost), and change various places to use lappend and list_delete_last where this can be done without much violence to the code logic. There are quite a few places that construct result lists using lcons not lappend. Some have semantic rationales for that; I added comments about it to a couple that didn't have them already. In many such places though, I think the coding is that way only because back in the dark ages lcons was faster than lappend. Hence, switch to lappend where this can be done without causing semantic changes. In ExecInitExprRec(), this results in aggregates and window functions that are in the same plan node being executed in a different order than before. Generally, the executions of such functions ought to be independent of each other, so this shouldn't result in visibly different query results. But if you push it, as one regression test case does, you can show that the order is different. The new order seems saner; it's closer to the order of the functions in the query text. And we never documented or promised anything about this, anyway. Also, in gistfinishsplit(), don't bother building a reverse-order list; it's easy now to iterate backwards through the original list. It'd be possible to go further towards removing uses of lcons and list_delete_first, but it'd require more extensive logic changes, and I'm not convinced it's worth it. 
Most of the remaining uses deal with queues that probably never get long enough to be worth sweating over. (Actually, I doubt that any of the changes in this patch will have measurable performance effects either. But better to have good examples than bad ones in the code base.) Patch by me, thanks to David Rowley and Daniel Gustafsson for review. Discussion: https://postgr.es/m/21272.1563318411@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/d97b714a219959a50f9e7b37ded674f5132f93f3 - Fix sepgsql test results for commit d97b714a2. The aggregate-order difference explained in my previous commit turns out to also affect the order of log entries emitted in the contrib/sepgsql regression test. Per buildfarm. Discussion: https://postgr.es/m/21272.1563318411@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/82c8a3c52adfd993b72289bfa8739f97216a06df - Doc: explain where to find Makefile used to build sepgsql-regtest.pp. At least on Fedora and RHEL, it's not in the same RPM that's needed for building sepgsql itself. Today is the second or third time I've had to rediscover how to install that, so let's document it this time. https://git.postgresql.org/pg/commitdiff/860c095fd548cd25586e4273e9b489082b4ffa13 - Clarify the distinction between public and private SPITupleTable fields. The fields that we consider public are "tupdesc" and "vals", which historically are in the middle of the struct. Move them to the front (this should be perfectly safe to do in HEAD) and add comments to make it quite clear which fields are public or not. Also adjust spi.sgml's documentation of the struct to match. That doc had bit-rotted somewhat, as it was missing some fields. (Arguably we should just remove all the private fields from the docs, but for now I refrained.) 
Daniel Gustafsson, reviewed by Fabien Coelho Discussion: https://postgr.es/m/0D19F836-B743-4340-B6A2-F148CA3DD1F0@yesql.se https://git.postgresql.org/pg/commitdiff/fec0778c8098cebec2d5cb3674ac7151d8d95638 - Sync our copy of the timezone library with IANA release tzcode2019b. A large fraction of this diff is just due to upstream's somewhat random decision to rename a bunch of internal variables and struct fields. However, there is an interesting new feature in zic: it's grown a "-b slim" option that emits zone files without 32-bit data and other backwards-compatibility hacks. We should consider whether we wish to enable that. https://git.postgresql.org/pg/commitdiff/f285322f9cd3145ea2e5b870e6ba7e0c641422ac - Update time zone data files to tzdata release 2019b. Brazil no longer observes DST. Historical corrections for Palestine, Hong Kong, and Italy. https://git.postgresql.org/pg/commitdiff/93907478e15f5762800b6acbe2eff03167843874 - Further adjust SPITupleTable to provide a public row-count field. Now that commit fec0778c8 drew a clear line between public and private fields in SPITupleTable, it seems pretty silly that the count of valid tuples isn't on the public side of that line. The reason why not was that there wasn't such a count. For reasons lost in the mists of time, spi.c preferred to keep a count of remaining free entries in the array. But that seems pretty pointless: it's unlike the way we handle similar code everywhere else, and it involves extra subtractions that surely outweigh having to do a comparison rather than test-for-zero to check for array-full. Hence, rearrange so that this code does the expansible array logic the same as everywhere else, with a count of valid entries alongside the allocated array length. And document the count as public. I looked for core-code callers where it would make sense to start relying on tuptable->numvals rather than the separate SPI_processed variable. 
Right now there don't seem to be places where it'd be a win to do so without more code restructuring than I care to undertake today. In principle, though, having SPITupleTables be fully self-contained should be helpful down the line. Discussion: https://postgr.es/m/16852.1563395722@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/bc8393cf27731055467a83068c680c86f9c112ea - Silence compiler warning, hopefully. Absorb commit e5e04c962a5d12eebbf867ca25905b3ccc34cbe0 from upstream IANA code, in hopes of silencing warnings from MSVC about negating a bool value. Discussion: https://postgr.es/m/20190719035347.GJ1859@paquier.xyz https://git.postgresql.org/pg/commitdiff/421466863548de58199c7c6ececaae6b5f621b2f - Remove no-longer-helpful reliance on fixed-size local array. Coverity complained about this code, apparently because it uses a local array of size FUNC_MAX_ARGS without a guard that the input argument list is no longer than that. (Not sure why it complained today, since this code's been the same for a long time; possibly it re-analyzed everything the List API change touched?) Rather than add a guard, though, let's just get rid of the local array altogether. It was only there to avoid list_nth() calls, and those are no longer expensive. https://git.postgresql.org/pg/commitdiff/330cafdfaa11ebe53e3e59688acac1577ae0cb34 Peter Geoghegan pushed: - Fix pathological nbtree split point choice issue. Specific ever-decreasing insertion patterns could cause successive unbalanced nbtree page splits. Problem cases involve a large group of duplicates to the left, and ever-decreasing insertions to the right. To fix, detect the situation by considering the newitem offset before performing a split using nbtsplitloc.c's "many duplicates" strategy. If the new item was inserted just to the right of our provisional "many duplicates" split point, infer ever-decreasing insertions and fall back on a 50:50 (space delta optimal) split. 
This seems to barely affect cases that already had acceptable space utilization. An alternative fix also seems possible. Instead of changing nbtsplitloc.c split choice logic, we could instead teach _bt_truncate() to generate a new value for new high keys by interpolating from the lastleft and firstright key values. That would certainly be a more elegant fix, but it isn't suitable for backpatching. Discussion: https://postgr.es/m/CAH2-WznCNvhZpxa__GqAa1fgQ9uYdVc=_apArkW2nc-K3O7_NA@mail.gmail.com Backpatch: 12-, where the nbtree page split enhancements were introduced. https://git.postgresql.org/pg/commitdiff/e3899ffd8beafdaaa037b503163a9f572e9fc729 - Correct nbtsplitloc.c comment. The logic just added by commit e3899ffd falls back on a 50:50 page split in the event of a new item that's just to the right of our provisional "many duplicates" split point. Fix a comment that incorrectly claimed that the new item had to be just to the left of our provisional split point. Backpatch: 12-, just like commit e3899ffd. https://git.postgresql.org/pg/commitdiff/bfdbac2ab3ef1a12a7de231552b128ed83ad00bb - Fix nbtree metapage cache upgrade bug. Commit 857f9c36cda, which taught nbtree VACUUM to avoid unnecessary index scans, bumped the nbtree version number from 2 to 3, while adding the ability for nbtree indexes to be upgraded on-the-fly. Various assertions that assumed that an nbtree index was always on version 2 had to be changed to accept any supported version (version 2 or 3 on Postgres 11). However, a few assertions were missed in the initial commit, all of which were in code paths that cache a local copy of the metapage metadata, where the index had been expected to be on the current version (no longer version 2) as a generic sanity check. Rather than simply update the assertions, follow-up commit 0a64b45152b intentionally made the metapage caching code update the per-backend cached metadata version without changing the on-disk version at the same time. 
This could even happen when the planner needed to determine the height of a B-Tree for costing purposes. The assertions only fail on Postgres v12 when upgrading from v10, because they were adjusted to use the authoritative shared memory metapage by v12's commit dd299df8. To fix, remove the cache-only upgrade mechanism entirely, and update the assertions themselves to accept any supported version (go back to using the cached version in v12). The fix is almost a full revert of commit 0a64b45152b on the v11 branch. VACUUM only considers the authoritative metapage, and never bothers with a locally cached version, whereas everywhere else isn't interested in the metapage fields that were added by commit 857f9c36cda. It seems unlikely that this bug has affected any user on v11. Reported-By: Christoph Berg Bug: #15896 Discussion: https://postgr.es/m/15896-5b25e260fdb0b081%40postgresql.org Backpatch: 11-, where VACUUM was taught to avoid unnecessary index scans. https://git.postgresql.org/pg/commitdiff/d004147eb3ece6b5981dbdd3d918ffc3f23fc505 - Don't rely on estimates for amcheck Bloom filters. Solely relying on a relation's reltuples/relpages estimate to size the Bloom filters used by amcheck verification makes verification less effective when the estimates are very stale. In extreme cases, verification options that use Bloom filters internally could be totally ineffective, without users receiving any clear indication that certain types of corruption might easily be missed. To fix, use RelationGetNumberOfBlocks() instead of relpages to size the downlink block Bloom filter. Use the same RelationGetNumberOfBlocks() value to derive a minimum size for the heapallindexed Bloom filter, rather than completely trusting reltuples. Verification will still be reasonably effective when the projected/estimated number of Bloom filter elements is at least 1/5 of the final number of elements, which is assured by the new sizing logic. 
Reported-By: Alexander Korotkov Discussion: https://postgr.es/m/CAH2-Wzk0ke2J42KrNYBKu0Xovjy-sU5ub7PWjgpbsKdAQcL4OA@mail.gmail.com Backpatch: 11-, where downlink/heapallindexed verification were added. https://git.postgresql.org/pg/commitdiff/894af78f185afee221a6762a1a49057043b7bbf5 Bruce Momjian pushed: - doc: mention pg_reload_conf() for reloading the config file. Reported-by: Ian Barwick Discussion: https://postgr.es/m/538950ec-b86a-1650-6078-beb7091c09c2@2ndquadrant.com Backpatch-through: 9.4 https://git.postgresql.org/pg/commitdiff/c6bce6ebb668c7da03d01244d34cac0335561103 Michaël Paquier pushed: - Fix inconsistencies and typos in the tree. This is numbered take 7, and addresses a set of issues around: - Fixes for typos and incorrect reference names. - Removal of unneeded comments. - Removal of unreferenced functions and structures. - Fixes regarding variable name consistency. Author: Alexander Lakhin Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com https://git.postgresql.org/pg/commitdiff/0896ae561b6c799d45cb61d8a3b18fbb442130a7 - Simplify description of --data-checksums in documentation of initdb. The documentation mentioned that data checksums cannot be changed after initialization, which is not true as pg_checksums can do that with its --enable option introduced in v12. This simply removes the sentence telling so. Reported-by: Basil Bourque Author: Michael Paquier Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/15909-e9d74271f1647472@postgresql.org Backpatch-through: 12 https://git.postgresql.org/pg/commitdiff/1c1602b8b685a68796f8ba48e41f778c0c42ba43 - Fix typo in mvdistinct.c. Noticed while browsing the code. https://git.postgresql.org/pg/commitdiff/70a33b21099c046dc38f07ffb02b1e0cf2aff91d - Refactor parallelization processing code in src/bin/scripts/. 
The existing facility of vacuumdb to handle parallel connections into a given database with an authentication set is moved to a common file in src/bin/scripts/, named scripts_parallel.c. This introduces a set of routines to initialize, wait for and terminate a set of connections, simplifying the code of vacuumdb a bit along the way. More routines related to result handling and database connection are moved to common.c. The initial plan is to use that for reindexdb, but it could be applied to other tools like clusterdb. While at it, clean up a set of "progname" variables which were defined as routine arguments for error messages. Since most of the callers have switched to pg_log_error() and such, there is no need for this variable.
Author: Julien Rouhaud
Reviewed-by: Michael Paquier, Álvaro Herrera
Discussion: https://postgr.es/m/CAOBaU_YrnH_Jqo46NhaJ7uRBiWWEcS40VNRQxgFbqYo9kApUsg@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/5f3840370b63fdf17f704a285623ccc233fa8d4f

- Doc: clarify when table rewrites happen with column addition and DEFAULT. 16828d5 improved ALTER TABLE so that a column addition does not require a rewrite for a non-NULL default with constant expressions, but one spot in the documentation did not get updated consistently. The documentation also now clarifies that this does not apply if the expression is volatile, in which case a table rewrite is still required.
Reported-by: Daniel Westermann
Author: Ian Barwick
Reviewed-by: Michael Paquier, Daniel Westermann
Discussion: https://postgr.es/m/DB6PR0902MB2184C7D5645CF15D75EB7957D2CF0@DB6PR0902MB2184.eurprd09.prod.outlook.com
Backpatch-through: 11
https://git.postgresql.org/pg/commitdiff/1300fa66b2f3d0dcd2eed7a5eff9e3fc22807f7c

- Fix compilation warning of pg_basebackup with MinGW. Several buildfarm members using gcc, like jacana, have been complaining about that. Weirdly enough, Visual Studio's compilers do not find this issue.
Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20190719050830.GK1859@paquier.xyz
https://git.postgresql.org/pg/commitdiff/90317ab7e64bd2d855c73a6ba579de6d04a7b25c

Andres Freund pushed:

- tableam: comment improvements.
Author: Brad DeJong
Discussion: https://postgr.es/m/CAJnrtnxDYOQFsDfWz2iri0T_fFL2ZbbzgCOE=4yaMcszgcsf4A@mail.gmail.com
Backpatch: 12-
https://git.postgresql.org/pg/commitdiff/21039555cdec75836d246fcbcd4b44ee63dabfad

Tomáš Vondra pushed:

- Remove unnecessary TYPECACHE_GT_OPR lookup. The TYPECACHE_GT_OPR is not needed (it used to be needed in an older version of the MCV code), but the compiler failed to detect this because the result was used in a fmgr_info() call, populating a FmgrInfo entry. Backpatch to v12, where this code was introduced.
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12
https://git.postgresql.org/pg/commitdiff/a4303a078c661ebafe8c8c2167b2ad9bf16b32ce

- Fix handling of NULLs in MCV items and constants. There were two issues in how the extended statistics handled NULL values in opclauses. Firstly, the code was oblivious to the possibility that Const may be NULL (constisnull=true), in which case the constvalue is undefined. We need to treat this as a mismatch, and not call the proc. Secondly, the MCV item itself may contain NULL values too - the code already did check that, and updated the match bitmap accordingly, but failed to ensure we won't call the operator procedure anyway. It did work for AND-clauses, because in that case false in the bitmap stops evaluation of further clauses. But for OR-clauses it was easy to get incorrect estimates or even trigger a crash. This fixes both issues by extending the existing check so that it looks at constisnull too, and making sure it skips calling the procedure.
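
The NULL handling described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the extended-statistics C code: `update_match_bitmap` and `less_than` are hypothetical names, NULL is modelled as `None`, and the match bitmap is a plain list of booleans with one entry per MCV item.

```python
from typing import Any, Callable, List, Optional


def less_than(item_value: Any, const_value: Any) -> bool:
    """Hypothetical operator procedure; it must never see a NULL (None)."""
    return item_value < const_value


def update_match_bitmap(
    matches: List[bool],
    item_values: List[Optional[Any]],
    const_value: Optional[Any],
    op: Callable[[Any, Any], bool],
    is_or: bool,
) -> List[bool]:
    """Merge one (Var op Const) clause into the per-item match bitmap."""
    for i, item_value in enumerate(item_values):
        if const_value is None or item_value is None:
            # A NULL on either side is a mismatch; the operator is skipped.
            match = False
        else:
            match = op(item_value, const_value)
        if is_or:
            matches[i] = matches[i] or match
        else:
            matches[i] = matches[i] and match
    return matches
```

In this sketch the NULL check runs before the operator for both AND and OR clauses, so `less_than` is never called with `None`. Calling it that way would raise a `TypeError` in Python, which serves here as a loose analogue of the crash mentioned in the commit message.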
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu https://git.postgresql.org/pg/commitdiff/e4deae7396f2a5576c0c8289e2bfc005ed3d6989 - Simplify bitmap updates in multivariate MCV code. When evaluating clauses on a multivariate MCV list, we build a bitmap tracking how the clauses match each item of the MCV list. When updating the bitmap we need to consider the current value (tracking how the item matches preceding clauses), match for the current clause and whether the clauses are connected by AND or OR. Until now the logic was copied on every place updating the bitmap, which was not quite readable. So just move it to a separate function and call it where needed. Backpatch to 12, where the code was introduced. While not a bugfix, this should make maintenance and future backpatches easier. Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu https://git.postgresql.org/pg/commitdiff/7d24f6a49076f975ca87926b3cde8fdea3448ecb - Fix handling of opclauses in extended statistics. We expect opclauses to have exactly one Var and one Const, but the code was checking the Const by calling is_pseudo_constant_clause() which is incorrect - we need a proper constant. Fixed by using plain IsA(x,Const) to check type of the node. We need to do these checks in two places, so move it into a separate function that can be called in both places. Reported by Andreas Seltenreich, based on crash reported by sqlsmith. Backpatch to v12, where this code was introduced. Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu Backpatch-to: 12 https://git.postgresql.org/pg/commitdiff/e8b6ae2130e3a95bb776708a9a7c9cb21fe8ac87 - Rework examine_opclause_expression to use varonleft. The examine_opclause_expression function needs to return information on which side of the operator we found the Var, but the variable was called "isgt" which is rather misleading (it assumes the operator is either less-than or greater-than, but it may be equality or something else). 
Other places in the planner use a variable called "varonleft" for this purpose, so just adopt the same convention here. The code also assumed we don't care about this flag for equality, as (Var = Const) and (Const = Var) should be the same thing. But that does not work for cross-type operators, in which case we need to pass the parameters to the procedure in the right order. So just use the same code for all types of expressions. This means we don't need to care about the selectivity estimation function anymore, at least not in this code. We should only get the supported cases here (thanks to statext_is_compatible_clause). Reviewed-by: Tom Lane Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu Backpatch-to: 12
https://git.postgresql.org/pg/commitdiff/e38a55ba46bbd2510baccdbaa01298cbca972b88

- Use column collation for extended statistics. The current extended statistics code was a bit confused about which collation to use. When building the statistics, the collations defined as default for the data types were used (since commit 5e0928005). The MCV code was however using the column collations for MCV serialization, and then DEFAULT_COLLATION_OID when computing estimates. So overall the code was using all three possible options, inconsistently. This uses the column collation everywhere - this makes it consistent with what 5e0928005 did for regular stats. We however do not track the collations in a catalog, because we can derive them from column-level information. This may need to change in the future, e.g. after allowing statistics on expressions. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu Backpatch-to: 12
https://git.postgresql.org/pg/commitdiff/a63378a03ec0a53c7c579dfdb3abff57811d8ced

Jeff Davis pushed:

- Fix daterange canonicalization for +/- infinity. The values 'infinity' and '-infinity' are a part of the DATE type itself, so a bound of the date 'infinity' is not the same as an unbounded/infinite range.
However, it is still wrong to try to canonicalize such values, because adding or subtracting one has no effect. Fix by treating 'infinity' and '-infinity' the same as unbounded ranges for the purposes of canonicalization (but not other purposes). Backpatch to all versions because it is inconsistent with the documented behavior. Note that this could be an incompatibility for applications relying on the behavior contrary to the documentation. Author: Laurenz Albe Reviewed-by: Thomas Munro Discussion: https://postgr.es/m/77f24ea19ab802bc9bc60ddbb8977ee2d646aec1.camel%40cybertec.at Backpatch-through: 9.4
https://git.postgresql.org/pg/commitdiff/e6feef571a016c9dac52a01aebad484768eb5c68

- Fix error in commit e6feef57. I was careless passing a datum directly to DATE_NOT_FINITE without calling DatumGetDateADT() first. Backpatch-through: 9.4
https://git.postgresql.org/pg/commitdiff/b538c90b1bded5464787e2b8e4431302d24eb601

- pg_stat_statements: add missing check for pgss_enabled(). Make pgss_post_parse_analyze() more consistent with the other hooks, and avoid unnecessary overhead when pg_stat_statements.track=none. Author: Raymond Martin Reviewed-by: Fabien COELHO Discussion: https://postgr.es/m/BN8PR21MB1217B003C4F79DE230AA36B9B1580%40BN8PR21MB1217.namprd21.prod.outlook.com
https://git.postgresql.org/pg/commitdiff/6f40ee4f837ec1ac59c8ddc73b67a04978a184d5

David Rowley pushed:

- Speed up finding EquivalenceClasses for a given set of rels. Previously in order to determine which ECs a relation had members in, we had to loop over all ECs stored in PlannerInfo's eq_classes and check if ec_relids mentioned the relation. For the most part, this was fine, as generally, unless queries were fairly complex, the overhead of performing the lookup would not have been that significant. However, when queries contained large numbers of joins and ECs, the overhead to find the set of classes matching a given set of relations could become a significant portion of the overall planning effort.
Here we allow a much more efficient method to access the ECs which match a given relation or set of relations. A new Bitmapset field in RelOptInfo now exists to store the indexes into PlannerInfo's eq_classes list which each relation is mentioned in. This allows very fast lookups to find all ECs belonging to a single relation. When we need to look up ECs belonging to a given pair of relations, we can simply bitwise-AND the Bitmapsets from each relation and use the result to perform the lookup. We also take the opportunity to write a new implementation of generate_join_implied_equalities which makes use of the new indexes. generate_join_implied_equalities_for_ecs must remain as is as it can be given a custom list of ECs, which we can't easily determine the indexes of. This was originally intended to fix the performance penalty of looking up foreign keys matching a join condition which was introduced by 100340e2d. However, we're speeding up much more than just that here. Author: David Rowley, Tom Lane Reviewed-by: Tom Lane, Tomas Vondra Discussion: https://postgr.es/m/6970.1545327857@sss.pgh.pa.us
https://git.postgresql.org/pg/commitdiff/3373c7155350cf6fcd51dd090f29e1332901e329

== Pending Patches ==

Matheus de Oliveira sent in another revision of a patch to add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT.
Ian Barwick sent in a patch to improve the uuid_version() documentation.
Hubert Zhang sent in another revision of a patch to add hooks for disk quotas.
Pavel Stěhule sent in another revision of a patch for psql to allow sorting the results of backslash commands by size.
Paul Guo sent in another revision of a patch to auto-generate a recovery.conf and ensure a clean shutdown with pg_rewind.
Paul Guo sent in another revision of a patch to skip copydir() if either src directory or dst directory is missing due to re-redoing create database but the tablespace is dropped.
Luis Carril sent in another revision of a patch to add an option to dump foreign data in pg_dump.
Andrey V. Lepikhov sent in a patch to ensure more secure initialization of required_relids field.
Ian Barwick and Bruce Momjian traded patches to mention pg_reload_conf() in the pg_hba.conf documentation.
John Naylor sent in a patch to fix a typo in the list of languages with new stemmers.
Laurenz Albe sent in a patch to document some improvements in consistency between *n*x and Windows for syncing data.
Daniel Westermann sent in a patch to clarify the tip in the documentation on adding columns with defaults.
Fabien COELHO sent in another revision of a patch to share the strtoX64 functions between the front- and back-ends.
Edmund Horner and David Rowley traded patches to add a new plan type, Tid Range Scan, to support range quals over CTID.
Konstantin Knizhnik sent in four more revisions of a patch to add a built-in connection pooler.
Nikita Glukhov sent in another revision of a patch to add information about access methods and indexes to psql's meta-commands.
Peter Geoghegan sent in a patch to overwrite the lastright item with highkey in nbtsort.c.
Peter Geoghegan sent in a patch to simplify _bt_getstackbuf() by making it accept a child BlockNumber argument, rather than requiring that callers store the child block number in the parent stack item's bts_btentry field.
Surafel Temesgen sent in another revision of a patch to implement conflict handling in COPY ... FROM.
Fabien COELHO sent in another revision of a patch which extends the initialization phase controls.
Masahiko Sawada sent in a patch to add a RESUME option to VACUUM and autovacuum.
Pavel Stěhule sent in another revision of a patch to add schema variables.
Nikita Glukhov sent in another revision of a patch to implement SQL/JSON functions.
Nikita Glukhov sent in another revision of a patch to implement SQL/JSON's JSON_TABLE.
Justin Pryzby sent in three more revisions of a patch to psql to print the table associated with a given TOAST table, make \d pg_toast.foo show its indices, and show the children of partitioned indices.
Nikita Glukhov sent in another revision of a patch to fix max size checking for ltree and lquery.
Tom Lane and Sergei Kornilov traded patches to change the ereport level for QueuePartitionConstraintValidation.
Jesper Pedersen and Michaël Paquier traded patches to highlight the fact that pg_receivewal doesn't apply WAL, and as such synchronous-commit needs to be remote_write or lower.
Dilip Kumar sent in another revision of a patch to clean up orphaned files using undo logs.
Jesper Pedersen sent in another revision of a patch to implement a separate UniqueKey node.
Alexander Korotkov and Liudmila Mantrova traded patches to add support for the jsonpath .datetime() method.
Michaël Paquier sent in a patch to fix a bug where pg_stat_replication lag fields return non-NULL values even with NULL LSNs.
Ryo Matsumura sent in a patch to implement CREATE AS EXECUTE in ECPG.
Iwata Aya sent in another revision of a patch to implement a libpq debug log.
Amit Langote sent in another revision of a patch to use the root parent's permissions when reading a child table's stats.
Nikita Glukhov sent in another revision of a patch to improve the ltree syntax.
Antonin Houska sent in another revision of a patch to implement aggregate pushdown.
Ian Barwick sent in a patch to make the configuration file "include" directive handling more robust.
Jeff Davis sent in two revisions of a patch to enable simplehash to use already-calculated hash values.
Melanie Plageman sent in a patch to implement plan-time extraction of scan cols, execution-time extraction and comparison with the aforementioned, and stop adding returningList for invalid result_relation number.
Gareth Palmer sent in a patch to implement the INSERT SET syntax.
Daniel Westermann and Ian Barwick traded patches to fix the documentation for adding a column with a default value.
Michaël Paquier and Julien Rouhaud traded patches to add parallelism and glibc-dependent-only options to reindexdb.
Roman Zharkov sent in a patch to fix some intermittent pg_ctl failures on Windows.
Anastasia Lubennikova sent in two more revisions of a patch to store duplicates more efficiently in B-tree indexes.
Tom Lane sent in a patch to touch all the places where it seemed like an easy win to stop using lcons and/or list_delete_first.
Daniel Gustafsson sent in a patch to use a counter and list_copy_tail to avoid repeated list_delete_first calls.
Álvaro Herrera sent in a patch to fix a bug that manifested as getting ERROR "relation 16401 has no triggers" with partition foreign key alter.
Amit Langote sent in another revision of a patch to make some cosmetic improvements to partitionwise join code, fix partitionwise join to handle FULL JOINs correctly, and add multi-relation EC child members in a separate pass.
Anastasia Lubennikova sent in a patch to fix a bug that manifested as pg_upgrade failing with a non-standard ACL.
Jeff Davis sent in a patch to implement memory accounting at the block level.
Ashutosh Sharma sent in two revisions of a patch to implement CALL in ECPG.
David Rowley sent in a patch to speed up finding EquivalenceClasses for a given set of rels, and add a special purpose generate_join_implied_equalities implementation.
Amit Khandekar sent in another revision of a patch to implement minimal logical decoding on standbys.
Daniel Gustafsson sent in a patch to bring xpath/numvals in line with recent developments in the SPITupleTable struct.
Tom Lane sent in a patch to add dependencies for partitioning columns.
Mike Palmiotto sent in a patch to make the sepgsql-regtest policy module less error-prone, and add a sandboxed cluster for the sepgsql regression tests.
Amit Langote sent in a patch to use es_result_relation_info less, and rearrange the partition row movement update code.
Ian Barwick sent in a patch to ensure that primary_slot_name is escaped in pg_basebackup.
Peifeng Qiu sent in another revision of a patch to make it possible to compile from source using the latest Microsoft Windows SDK.
Tomáš Vondra sent in another revision of a patch to clean up ALTER STATISTICS.
Thomas Munro sent in a PoC patch to make Datum a strong type.
Tomáš Vondra and James Coleman traded patches to implement incremental sort.
David Rowley sent in another revision of a patch to shrink bloated locallocktable.
David Rowley sent in a patch to perform a bms_next_member() type loop then just fetch the list item with list_nth().
Tom Lane sent in a patch to rationalize list concat and copy operations.
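
As an illustration of the idea behind David Rowley's applied EquivalenceClass commit above (a per-relation set of indexes into the eq_classes list, intersected to find the classes shared by a pair of relations), here is a minimal sketch in Python. It is not PostgreSQL's C code: Python sets stand in for Bitmapsets, and the names build_ec_indexes, ecs_for_rel, and ecs_for_pair, as well as the sample relations, are made up for this example.

```python
from typing import Dict, FrozenSet, List, Set

def build_ec_indexes(eq_classes: List[FrozenSet[str]]) -> Dict[str, Set[int]]:
    """Map each relation to the positions in eq_classes that mention it."""
    indexes: Dict[str, Set[int]] = {}
    for position, relids in enumerate(eq_classes):
        for rel in relids:
            indexes.setdefault(rel, set()).add(position)
    return indexes

def ecs_for_rel(indexes: Dict[str, Set[int]], rel: str) -> List[int]:
    """Positions of all classes that mention a single relation."""
    return sorted(indexes.get(rel, set()))

def ecs_for_pair(indexes: Dict[str, Set[int]], rel_a: str, rel_b: str) -> List[int]:
    """Positions of classes mentioning both relations (set intersection,
    standing in for the bitwise-AND of two Bitmapsets)."""
    return sorted(indexes.get(rel_a, set()) & indexes.get(rel_b, set()))

def ecs_for_rel_linear(eq_classes: List[FrozenSet[str]], rel: str) -> List[int]:
    """The older approach for comparison: scan every class and test membership."""
    return [i for i, relids in enumerate(eq_classes) if rel in relids]

# Hypothetical sample: each frozenset plays the role of one class's ec_relids.
eq_classes = [
    frozenset({"a", "b"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c"}),
    frozenset({"d"}),
]
ec_indexes = build_ec_indexes(eq_classes)
```

The index is built once, after which each lookup touches only the classes that actually mention the relation instead of scanning the whole list.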