== PostgreSQL Weekly News - August 10 2008 ==
From | David Fetter |
---|---|
Subject | == PostgreSQL Weekly News - August 10 2008 == |
Date | |
Msg-id | 20080811050030.GA15314@fetter.org |
List | pgsql-announce |
== PostgreSQL Weekly News - August 10 2008 ==

== PostgreSQL Product News ==

BitNami LAPPStack 1.0 released.
http://bitnami.org/stack/lappstack

pgbouncer 1.2.3 released.
http://pgfoundry.org/projects/pgbouncer/

phpPgAdmin 4.2.1 released.
http://sourceforge.net/project/showfiles.php?group_id=37132

PyReplica 1.0.3 released.
http://pgfoundry.org/projects/pyreplica/

Another PostgreSQL Diff Tool 1.2 released.
http://pgfoundry.org/projects/apgdiff/

pgSphere 1.0.1 released.
http://pgfoundry.org/projects/pgsphere/

PostgreSQL Toolbox 1 released.
http://pgfoundry.org/projects/pg-toolbox/

== PostgreSQL Jobs for August ==

http://archives.postgresql.org/pgsql-jobs/2008-08/threads.php

== PostgreSQL Local ==

The Prato Linux User Group will be having PostgreSQL talks in September. The schedule in Italian is:
http://www.prato.linux.it/serate_a_tema_2008

PGCon Brazil 2008 will be on September 26-27 at Unicamp in Campinas.
http://pgcon.postgresql.org.br/index.en.html

PgDay.fr will be October 4 in Toulouse. The Call for Papers is open:
http://www.postgresqlfr.org/?q=node/1686
Registration:
http://www.pgday.fr/doku.php/inscription

Sponsor the European PGDay!
http://www.pgday.org/en/sponsors/campaign

The Call for Papers for European PGDay has begun.
http://www.pgday.org/en/call4papers

PGDay.(IT|EU) 2008 will be October 17 and 18 in Prato.
http://www.pgday.org/it/

== PostgreSQL in the News ==

Planet PostgreSQL: http://www.planetpostgresql.org/

General Bits, Archives and occasional new articles:
http://www.varlena.com/GeneralBits/

PostgreSQL Weekly News is brought to you this week by David Fetter and Devrim GUNDUZ.

Submit news and announcements by Sunday at 3:00pm Pacific time. Please send English language ones to david@fetter.org, German language to pwn@pgug.de, Italian language to pwn@itpug.org.

== Applied Patches ==

Tom Lane committed:

- Improve CREATE/DROP/RENAME DATABASE so that when failing because the source or target database is being accessed by other users, it tells you whether the "other users" are live sessions or uncommitted prepared transactions. (Indeed, it tells you exactly how many of each, but that's mostly just because it was easy to do so.) This should help forestall the gotcha of not realizing that a prepared transaction is what's blocking the command. Per discussion.

- Improve SELECT DISTINCT to consider hash aggregation, as well as sort/uniq, as methods for implementing the DISTINCT step. This eliminates the former performance gap between DISTINCT and GROUP BY, and also makes it possible to do SELECT DISTINCT on datatypes that only support hashing not sorting. SELECT DISTINCT ON is still always implemented by sorting; it would take executor changes to support hashing that, and it's not clear it's worth the trouble. This is a release-note-worthy incompatibility from previous PG versions, since SELECT DISTINCT can no longer be counted on to deliver sorted output without explicitly saying ORDER BY. (Anyone who can't cope with that can consider turning off enable_hashagg.) Several regression test queries needed to have ORDER BY added to preserve stable output order. I fixed the ones that manifested here, but there might be some other cases that show up on other platforms.

- In pgsql/src/test/regress/pg_regress.c, fix some message style guideline violations in pg_regress, as well as some failures to expose messages for translation.

- In pgsql/src/backend/storage/buffer/bufmgr.c, in ReadOrZeroBuffer (and related entry points), don't bother to call PageHeaderIsValid when we zero the buffer instead of reading the page in.
  The actual performance improvement is probably marginal since this function isn't very heavily used, but a cycle saved is a cycle earned. Zdenek Kotala

- Add an ORDER BY to one more SELECT DISTINCT test case, per buildfarm results.

- In pgsql/src/backend/optimizer/plan/planner.c, department of second thoughts: fix newly-added code in planner.c to make real sure that DISTINCT ON does what it's supposed to, i.e., sort by the full ORDER BY list before unique-ifying. The error seems masked in simple cases by the fact that query_planner won't return query pathkeys that only partially match the requested sort order, but I wouldn't want to bet that it couldn't be exposed in some way or other.

- Do not allow Unique nodes to be scanned backwards. The code claimed that it would work, but in fact it didn't return the same rows when moving backwards as when moving forwards. This would have no visible effect in a DISTINCT query (at least assuming the column datatypes use a strong definition of equality), but it gave entirely wrong answers for DISTINCT ON queries.

- Teach the system how to use hashing for UNION. (INTERSECT/EXCEPT will follow, but seem like a separate patch since most of the remaining work is on the executor side.) I took the opportunity to push selection of the grouping operators for set operations into the parser where it belongs. Otherwise this is just a small exercise in making prepunion.c consider both alternatives. As with the recent DISTINCT patch, this means we can UNION on datatypes that can hash but not sort, and it means that UNION without ORDER BY is no longer certain to produce sorted output.

- Support hashing for duplicate-elimination in INTERSECT and EXCEPT queries. This completes my project of improving usage of hashing for duplicate elimination (aggregate functions with DISTINCT remain undone, but that's for some other day).
  As with the previous patches, this means we can INTERSECT/EXCEPT on datatypes that can hash but not sort, and it means that INTERSECT/EXCEPT without ORDER BY are no longer certain to produce sorted output.

- Improve INTERSECT/EXCEPT hashing by realizing that we don't need to make any hashtable entries for tuples that are found only in the second input: they can never contribute to the output. Furthermore, this implies that the planner should endeavor to put first the smaller (in number of groups) input relation for an INTERSECT. Implement that, and upgrade prepunion's estimation of the number of rows returned by setops so that there's some amount of sanity in the estimate of which one is smaller.

- In pgsql/src/backend/executor/execMain.c, install checks in executor startup to ensure that the tuples produced by an INSERT or UPDATE will match the target table's current rowtype. In pre-8.3 releases inconsistency can arise with stale cached plans, as reported by Merlin Moncure. (We patched the equivalent hazard on the SELECT side in Feb 2007; I'm not sure why we thought there was no risk on the insertion side.) In 8.3 and HEAD this problem should be impossible due to plan cache invalidation management, but it seems prudent to make the check anyway. Back-patch as far as 8.0. 7.x versions lack ALTER COLUMN TYPE, so there seems no way to abuse a stale plan comparably.

- Fix corner-case bug introduced with HOT: if REINDEX TABLE pg_class (or a REINDEX DATABASE including same) is done before a session has done any other update on pg_class, the pg_class relcache entry was left with an incorrect setting of rd_indexattr, because the indexed-attributes set would be first demanded at a time when we'd forced a partial list of indexes into the pg_class entry, and it would remain cached after that. This could result in incorrect decisions about HOT-update safety later in the same session.
  In practice, since only pg_class_relname_nsp_index would be missed out, only ALTER TABLE RENAME and ALTER TABLE SET SCHEMA could trigger a problem. Per report and test case from Ondrej Jirman.

Magnus Hagander committed:

- Move pgstat.tmp into a temporary directory under $PGDATA named pg_stat_tmp. This allows the use of a ramdrive (either through mount or symlink) for the temporary file that's written every half second, which should reduce I/O. On server shutdown/startup, the file is written to the old location in the global directory, to preserve data across restarts. Bump catversion since the $PGDATA directory layout changed.

== Rejected Patches (for now) ==

No one was disappointed this week :-)

== Pending Patches ==

ITAGAKI Takahiro sent in a patch to use NDirectFileRead/Write counters to get I/O counts in the BufFile module. These counters are visible when log_statement_stats is on.
Pavel Stehule sent in a patch to implement GROUPING SETS.
Tom Lane sent in a patch to use hashes for set operations.
Martin Pihlak sent in a patch to make dropping and re-creating functions work more nicely with plan invalidation.
Simon Riggs sent in a patch which adds a hook for stats plugins.
Abhijit Menon-Sen sent in a patch which extends has_table_privilege() to include sequence information.
Robert Haas sent in a patch to implement CREATE OR REPLACE VIEW.
Simon Riggs sent two revisions of a patch to fix pg_stop_backup per suggestion by Fujii Masao. pg_stop_backup now tests XLogArchiveCheckDone() for both stopxlogfilename and history file and then stats the stop WAL.
Marko Kreen sent in a patch to fix a security issue in dblink.
Alvaro Herrera sent in two revisions of a patch to make autovacuum process TOAST tables separately from main tables.
Volkan YAZICI sent in three revisions of a patch to allow people to increase the verbosity of set-returning functions.
David Wheeler sent in some touch-ups for his citext patch.
Euler Taveira de Oliveira sent in a patch which allows symlinking statistics files at initdb time.
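
As a footnote to the hashing changes for DISTINCT, UNION, INTERSECT and EXCEPT listed under Applied Patches, here is a minimal Python sketch of the idea, not PostgreSQL's executor code. The function names are hypothetical, and a Python dict stands in for the hash table, so rows come back in insertion order; a real hash table gives no ordering guarantee at all. It shows why hashed duplicate elimination does not return sorted output, and why an INTERSECT only needs hash entries for rows of its first input.

```python
from itertools import groupby


def distinct_by_sort(rows):
    """Sort/uniq: sort the rows, then keep one row per run of equal rows."""
    return [key for key, _ in groupby(sorted(rows))]


def distinct_by_hash(rows):
    """Hashing: remember each row the first time it is seen; no sort happens."""
    seen = {}
    for row in rows:
        seen.setdefault(row, None)
    return list(seen)


def intersect_by_hash(first, second):
    """Hash only the first input; rows found only in the second are ignored."""
    matched = {row: False for row in first}  # row -> also seen in second input?
    for row in second:
        if row in matched:  # never add entries for second-only rows
            matched[row] = True
    return [row for row, hit in matched.items() if hit]


rows = [3, 1, 2, 3, 1]
print(distinct_by_sort(rows))   # [1, 2, 3]
print(distinct_by_hash(rows))   # [3, 1, 2]
print(intersect_by_hash([3, 1, 2], [2, 3, 4, 4]))  # [3, 2]
```

Both duplicate-elimination functions return the same set of rows and differ only in order, which is why a query that needs sorted output has to say ORDER BY. Since the table in intersect_by_hash is built from the first input alone, putting the input with fewer groups first keeps the table small.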