Re: WIP Patch for GROUPING SETS phase 1
From | Svenne Krap |
---|---|
Subject | Re: WIP Patch for GROUPING SETS phase 1 |
Date | |
Msg-id | 20150420083658.2543.70336.pgcf@coridan.postgresql.org |
In reply to | Re: WIP Patch for GROUPING SETS phase 1 (Svenne Krap <svenne.lists@krap.dk>) |
Responses | Re: WIP Patch for GROUPING SETS phase 1 (Svenne Krap <svenne.lists@krap.dk>); Re: WIP Patch for GROUPING SETS phase 1 (Andrew Gierth <andrew@tao11.riddles.org.uk>) |
List | pgsql-hackers |
The following review has been posted through the commitfest application:

make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed

Hi,

I have (finally) found time to review this.

The syntax follows the spec as far as I can see, and the queries I have tested have all produced the correct output. The documentation looks good and is clear.

I think it is spec compliant, but I am not familiar enough with the spec to be sure. I also have not understood the function of the <set quantifier> (DISTINCT, ALL) part of the GROUP BY clause, and hence have not tested it. That is why I have not marked the "Spec compliant" part.

The installcheck-world run fails, but in src/pl/tcl/results/pltcl_queries.out (a sorting problem, judging from the diff), which should be unrelated to GSP. I don't know enough about the check to know whether it had already run the GSP tests.

I have also been running a few tests on some real data, on my laptop with 32 GB of memory and a fast SSD.

The first dataset is a join between a data table of 472 MB (4.3 million rows) and a tiny multi-column lookup table. I am returning a count(*). Here the data is hierarchical, so CUBE does not make sense. GROUPING SETS and ROLLUP both work fine, and if work_mem is large enough the GS query slightly beats the handwritten "union all" equivalent (7.6 vs. 7.7 seconds). With work_mem at the default 4MB, the union-all equivalent (UAE) beats the GS query almost 2:1 due to disk spill (14.3 (GS) vs. 8.2 (UAE) seconds).

The other query is on the same data table as before, but with three "columns" (two calculated and one natural) for a CUBE. I am returning a count(*).

First column is "extract year from date column".
Second column is "divide a value by something and truncate" (i.e. make buckets).
Third column is a literal integer column.

Here the GS version is slightly slower than the UAE version (17.5 vs. 14.2 seconds). Nothing obvious in the EXPLAIN (ANALYZE, BUFFERS, COSTS, TIMING) output explains why.
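For readers unfamiliar with the "union all" equivalent compared in the timings above, here is a minimal sketch of how a ROLLUP or CUBE expands into grouping sets and how those sets map onto a handwritten UNION ALL of plain GROUP BY queries. It is not the reviewer's actual query: the table `sales`, the columns `region` and `city`, and the helper functions are all hypothetical, and SQLite is used only because it ships with Python (it has no GROUPING SETS support, so only the UNION ALL form is executed). In PostgreSQL with the patch, the single-query form would be `GROUP BY ROLLUP (region, city)`.

```python
import sqlite3
from itertools import combinations


def rollup_sets(cols):
    """ROLLUP (a, b) expands to the grouping sets (a, b), (a), ()."""
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]


def cube_sets(cols):
    """CUBE (a, b) expands to every subset: (a, b), (a), (b), ()."""
    return [s for n in range(len(cols), -1, -1) for s in combinations(cols, n)]


def union_all_sql(table, cols, sets):
    """One GROUP BY branch per grouping set; columns not grouped on become NULL."""
    branches = []
    for s in sets:
        select = ", ".join(c if c in s else "NULL" for c in cols)
        group_by = f" GROUP BY {', '.join(s)}" if s else ""
        branches.append(f"SELECT {select}, count(*) FROM {table}{group_by}")
    return "\nUNION ALL\n".join(branches)


# Hypothetical hierarchical data: every city belongs to exactly one region.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", "oslo"), ("north", "oslo"), ("north", "bergen"), ("south", "rome")],
)

cols = ["region", "city"]
query = union_all_sql("sales", cols, rollup_sets(cols))
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Passing `cube_sets(cols)` instead would add a city-only branch, which carries no extra information when cities nest inside regions; that is the sense in which CUBE does not fit hierarchical data. Note that the UNION ALL form scans the table once per branch, whereas a grouping-sets implementation can in principle share work between sets.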
I have the explains, but as the dataset is semi-private and I don't have any easy way to edit the names out of them, I will send them on request (non-disclosure from the recipient is of course a must) rather than post them on the list.

I think the feature is ready to be committed, but am unsure whether I am qualified to gauge that :)

/Svenne

The new status of this patch is: Ready for Committer