Обсуждение: BUG #4728: segfault with window function partition involving subquery

Поиск
Список
Период
Сортировка

BUG #4728: segfault with window function partition involving subquery

От
"Bruce Toll"
Дата:
The following bug has been logged online:

Bug reference:      4728
Logged by:          Bruce Toll
Email address:      btoll@dhsus.com
PostgreSQL version: 8.4devel
Operating system:   GNU/Linux (Ubuntu Hardy)
Description:        segfault with window function partition involving
subquery
Details:

Greetings,

Thanks for all of the work on the upcoming
8.4 release.  The windows functions are great.
I encountered a problem that will hopefully be
easy for you to reproduce using psql with the
snippet below.

NOTE: A workaround in this simple case is to simply
remove col1 from the PARTITION BY clause, as the
subselect insures that col1 never varies.

Regards,
Bruce Toll

----------------------------------------------------------------------------
--

CREATE TEMP TABLE test_table (
    col1    int,
    col2    int,
    col3    int
);

INSERT INTO test_table VALUES
    (1,1,2),
    (1,2,2);

SELECT count(col1) OVER (PARTITION BY col1, col2, col3)
FROM (
    SELECT *
    FROM test_table
    WHERE col1 = 1
) AS r;

----------------------------------------------------------------------------
--

The code above caused a segmentation fault for 8.4devel postgres on
GNU/Linux (Ubuntu Hardy), 32bit Intel Core 2 Duo CPU,
Linux kernel 2.6.24-23-generic.

Output of 'select version();':
----------------------------------------------------------------------------
--
PostgreSQL 8.4devel on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.2.4
(Ubuntu 4.2.4-1ubuntu3), 32-bit


The last commit included in the build was:
----------------------------------------------------------------------------
--
commit 93094a17be1e01fa969aa721b14d806d98157efe
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sat Mar 21 00:04:40 2009 +0000

    Optimize multi-batch hash joins when the outer relation has a
nonuniform
    distribution, by creating a special fast path for the (first few) most
common
    values of the outer relation.  Tuples having hashvalues matching the
MCVs
    are effectively forced to be in the first batch, so that we never write
    them out to the batch temp files.

    Bryce Cutt and Ramon Lawrence, with some editorialization by me.
----------------------------------------------------------------------------
--


psql output:
----------------------------------------------------------------------------
--
CREATE TABLE
INSERT 0 2
psql:crash_20.sql:34: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
psql:crash_20.sql:34: connection to server was lost
----------------------------------------------------------------------------
--

gdb bt after postgres backend SIGSEGV:
----------------------------------------------------------------------------
--
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb790b6d0 (LWP 28300)]
0x08218ff4 in grouping_planner (root=0x856561c, tuple_fraction=0) at
planner.c:2549
2549                    *partColIdx[*partNumCols] = sortColIdx[scidx++];
(gdb) bt
#0  0x08218ff4 in grouping_planner (root=0x856561c, tuple_fraction=0) at
planner.c:2549
#1  0x0821a814 in subquery_planner (glob=0x85a2804, parse=0x8564d1c,
parent_root=0x0,
    hasRecursion=0 '\0', tuple_fraction=0, subroot=0xbfc3e2b8) at
planner.c:480
#2  0x0821ad6e in standard_planner (parse=0x8564d1c, cursorOptions=0,
boundParams=0x0)
    at planner.c:189
#3  0x0826df0f in pg_plan_query (querytree=0x8564d1c, cursorOptions=0,
boundParams=0x0)
    at postgres.c:697
#4  0x0826e003 in pg_plan_queries (querytrees=0x85a27e8, cursorOptions=0,
boundParams=0x0)
    at postgres.c:756
#5  0x0826e8ca in exec_simple_query (
    query_string=0x8563d1c "SELECT count(col1) OVER (PARTITION BY col1,
col2, col3)\nFROM (\n    SELECT *\n    FROM test_table\n    WHERE col1 =
1\n) AS r;") at postgres.c:920
#6  0x0826f98d in PostgresMain (argc=4, argv=0x84ef928, username=0x84ef900
"bmt") at postgres.c:3606
#7  0x0823b468 in ServerLoop () at postmaster.c:3331
#8  0x0823c3ca in PostmasterMain (argc=3, argv=0x84ebbb0) at
postmaster.c:1054
#9  0x081e0f49 in main (argc=3, argv=0x84ebbb0) at main.c:188


The server log does not appear to provide much additional information:
----------------------------------------------------------------------------
--
LOG:  server process (PID 28300) was terminated by signal 11: Segmentation
fault
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing
----------------------------------------------------------------------------
--

Re: BUG #4728: segfault with window function partition involving subquery

От
Tom Lane
Дата:
"Bruce Toll" <btoll@dhsus.com> writes:
> I encountered a problem that will hopefully be
> easy for you to reproduce using psql with the
> snippet below.

Sigh ... I could've sworn I tested that code path, but evidently not,
'cause it's broken as can be.

*** src/backend/optimizer/plan/planner.c.orig    Fri Jan  9 10:46:10 2009
--- src/backend/optimizer/plan/planner.c    Tue Mar 24 17:03:03 2009
***************
*** 2546,2553 ****
              if (list_length(new_pathkeys) > list_length(pathkeys))
              {
                  /* this sort clause is actually significant */
!                 *partColIdx[*partNumCols] = sortColIdx[scidx++];
!                 *partOperators[*partNumCols] = sgc->eqop;
                  (*partNumCols)++;
                  pathkeys = new_pathkeys;
              }
--- 2546,2553 ----
              if (list_length(new_pathkeys) > list_length(pathkeys))
              {
                  /* this sort clause is actually significant */
!                 (*partColIdx)[*partNumCols] = sortColIdx[scidx++];
!                 (*partOperators)[*partNumCols] = sgc->eqop;
                  (*partNumCols)++;
                  pathkeys = new_pathkeys;
              }
***************
*** 2565,2572 ****
              if (list_length(new_pathkeys) > list_length(pathkeys))
              {
                  /* this sort clause is actually significant */
!                 *ordColIdx[*ordNumCols] = sortColIdx[scidx++];
!                 *ordOperators[*ordNumCols] = sgc->eqop;
                  (*ordNumCols)++;
                  pathkeys = new_pathkeys;
              }
--- 2565,2572 ----
              if (list_length(new_pathkeys) > list_length(pathkeys))
              {
                  /* this sort clause is actually significant */
!                 (*ordColIdx)[*ordNumCols] = sortColIdx[scidx++];
!                 (*ordOperators)[*ordNumCols] = sgc->eqop;
                  (*ordNumCols)++;
                  pathkeys = new_pathkeys;
              }


Thanks for the report!

            regards, tom lane

Re: BUG #4728: segfault with window function partition involving subquery

От
Bruce Toll
Дата:
Tom,

Thanks for the fix and the quick reply.  Your patch solved my original
issue, as well as the supplied test case.

Regards,
Bruce Toll

Re: BUG #4728: segfault with window function partition involving subquery

От
Guillaume Smet
Дата:
On Tue, Mar 24, 2009 at 10:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Sigh ... I could've sworn I tested that code path, but evidently not,
> 'cause it's broken as can be.

If it's a code path not exercised by any regression test, is it worth
it to add one or we don't have any chance to break this code again
later?

--
Guillaume