Thread: BUG #15053: SIGSEGV - While executing query with cube agregator

BUG #15053: SIGSEGV - While executing query with cube agregator

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15053
Logged by:          Orian Beltrame da Silva
Email address:      obeltrame@callixbrasil.com
PostgreSQL version: 9.6.3
Operating system:   Linux  4.9.0-4-amd64 #1 SMP Debian 4.9.65-3
Description:

Hi, 

the following postgres crash was generated during the execution of this
query:

 select
  freeswitch_bug_events.created_at::date,
  freeswitch_nodes."name",
  tenants."name",
  count(*) filter(where freeswitch_bug_events.event = 'call received'),
  count(*) filter(where freeswitch_bug_events.event = 'status changed')
from dev.freeswitch_bug_events
  join public.tenants on freeswitch_bug_events.tenant = tenants.id
  join public.freeswitch_nodes on freeswitch_bug_events.node = freeswitch_nodes.id
group by
  cube (freeswitch_bug_events.created_at::date, freeswitch_nodes."name", tenants."name")

The query returns about 500k rows. The developer was scrolling through the
result in the DBeaver client when the server process crashed.

The crash put PostgreSQL into recovery mode (a really nice feature!), and
after a few minutes the server had recovered and started accepting new
connections.
I was able to get a core dump of this process. If you want to proceed with
the investigation, I can send the full core dump and a backup of the tables
involved.

Core was generated by `postgres: postgres db 192.168.1.165(54804) BIND'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000559ce4a520e4 in pfree ()
(gdb) bt
#0  0x0000559ce4a520e4 in pfree ()
#1  0x0000559ce4811e69 in ExecClearTuple ()
#2  0x0000559ce4811ebc in ExecResetTupleTable ()
#3  0x0000559ce4806349 in standard_ExecutorEnd ()
#4  0x0000559ce47cf8d3 in PortalCleanup ()
#5  0x0000559ce4a52842 in PortalDrop ()
#6  0x0000559ce4a52a50 in CreatePortal ()
#7  0x0000559ce49256ec in PostgresMain ()
#8  0x0000559ce469f2e2 in ?? ()
#9  0x0000559ce48c2dd1 in PostmasterMain ()
#10 0x0000559ce46a0497 in main ()


Re: BUG #15053: SIGSEGV - While executing query with cube agregator

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> the following postgres crash was generated during the execution of this
> query:

>  select
>   freeswitch_bug_events.created_at::date,
>   freeswitch_nodes."name",
>   tenants."name",
>   count(*) filter(where freeswitch_bug_events.event = 'call received'),
>   count(*) filter(where freeswitch_bug_events.event = 'status changed')
> from dev.freeswitch_bug_events
>   join public.tenants on freeswitch_bug_events.tenant = tenants.id
>   join public.freeswitch_nodes on freeswitch_bug_events.node = freeswitch_nodes.id
> group by
>   cube (freeswitch_bug_events.created_at::date, freeswitch_nodes."name", tenants."name")

Hmm, well, first please update to something newer than 9.6.3, to see if
the problem is already resolved.  (9.6.7 is being released this week.)

If not, please see if you can extract a self-contained test case.  That
stack trace is not really enough information to fix it.

            regards, tom lane
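
A self-contained test case along the lines requested above might start from something like the following sketch. Everything about the schema is an assumption: the report does not give column types, row counts, or data distribution, so the temporary tables, the generated values, and the aliases `node_name` and `tenant_name` are purely illustrative. The cursor is meant to roughly imitate a client that reads part of the result and then drops the portal, which is where the backtrace ends up. There is no guarantee that this reproduces the crash.

```sql
-- Hypothetical schema: column types and sizes are guesses, not taken from the report.
CREATE TEMP TABLE tenants (id int PRIMARY KEY, name text);
CREATE TEMP TABLE freeswitch_nodes (id int PRIMARY KEY, name text);
CREATE TEMP TABLE freeswitch_bug_events (
    created_at timestamptz,
    tenant     int,
    node       int,
    event      text
);

-- Synthetic data; the real distribution is unknown.
INSERT INTO tenants
SELECT g, 'tenant ' || g FROM generate_series(1, 200) AS g;

INSERT INTO freeswitch_nodes
SELECT g, 'node ' || g FROM generate_series(1, 20) AS g;

INSERT INTO freeswitch_bug_events
SELECT timestamptz '2018-01-01' + (g % 365) * interval '1 day',
       1 + g % 200,
       1 + g % 20,
       CASE WHEN g % 2 = 0 THEN 'call received' ELSE 'status changed' END
FROM generate_series(1, 1000000) AS g;

ANALYZE;

-- Read only part of the result, then close the cursor, similar to a
-- client that scrolls through a large result set and abandons it.
BEGIN;
DECLARE c CURSOR FOR
    SELECT e.created_at::date,
           n.name AS node_name,
           t.name AS tenant_name,
           count(*) FILTER (WHERE e.event = 'call received'),
           count(*) FILTER (WHERE e.event = 'status changed')
    FROM freeswitch_bug_events e
      JOIN tenants t ON e.tenant = t.id
      JOIN freeswitch_nodes n ON e.node = n.id
    GROUP BY CUBE (e.created_at::date, n.name, t.name);
FETCH 200 FROM c;
CLOSE c;
COMMIT;
```

If the crash does show up, the same script can be trimmed down (fewer rows, fewer columns) until the smallest failing case remains.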


Re: BUG #15053: SIGSEGV - While executing query with cube agregator

From
Andreas Seltenreich
Date:
Tom Lane writes:
> PG Bug reporting form <noreply@postgresql.org> writes:
>> the following postgres crash was generated during the execution of this
>> query:
[...]
>> group by cube
>
> Hmm, well, first please update to something newer than 9.6.3, to see if
> the problem is already resolved.  (9.6.7 is being released this week.)

backtrace and query look similar to the issue we reported leading to the
following commitfest entry:

    https://commitfest.postgresql.org/16/1413/

Our customer hasn't seen any crashes with cube() statements since
setting

    replacement_sort_tuples = 0

I'm afraid this workaround might be needed here as well unless a fix
makes it into 9.6.7.

regards,
Andreas
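
As a sketch of how the workaround mentioned above could be applied: `replacement_sort_tuples` is a regular configuration parameter in the 9.6 series, so it can be changed for one session or for the whole cluster. Whether it actually avoids this particular crash is an assumption based on the similar report, not something confirmed in this thread.

```sql
-- Check the current value (the 9.6 default is 150000).
SHOW replacement_sort_tuples;

-- Per-session workaround: disable replacement selection for sorts
-- in the current connection only.
SET replacement_sort_tuples = 0;

-- Cluster-wide alternative (requires superuser, cannot run inside a
-- transaction block); the reload makes new sessions pick it up.
ALTER SYSTEM SET replacement_sort_tuples = 0;
SELECT pg_reload_conf();
```

The per-session form is the less invasive way to test whether the setting makes a difference before changing it for every connection.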


Re: BUG #15053: SIGSEGV - While executing query with cube agregator

From
Peter Geoghegan
Date:
On Tue, Feb 6, 2018 at 2:00 PM, Andreas Seltenreich
<andreas.seltenreich@credativ.de> wrote:
> backtrace and query look similar to the issue we reported leading to the
> following commitfest entry:
>
>     https://commitfest.postgresql.org/16/1413/

I suspect the same.

-- 
Peter Geoghegan


Re: BUG #15053: SIGSEGV - While executing query with cube agregator

From
Orian Beltrame da Silva
Date:
Thanks in advance!

I will try the suggestions and the test cases in the thread and report any new results.



Orian Beltrame da Silva
Product Developer | DevOps

Rua do Rócio, 220 - Cj. 72
São Paulo - SP - 04552-000
55 11 4063 4222

obeltrame@callix.com.br
www.callix.com.br


2018-02-06 20:14 GMT-02:00 Peter Geoghegan <pg@bowt.ie>:
> On Tue, Feb 6, 2018 at 2:00 PM, Andreas Seltenreich
> <andreas.seltenreich@credativ.de> wrote:
>> backtrace and query look similar to the issue we reported leading to the
>> following commitfest entry:
>>
>>     https://commitfest.postgresql.org/16/1413/
>
> I suspect the same.
>
> --
> Peter Geoghegan

Re: BUG #15053: SIGSEGV - While executing query with cube agregator

From
Andrew Gierth
Date:
>>>>> "Peter" == Peter Geoghegan <pg@bowt.ie> writes:

 > On Tue, Feb 6, 2018 at 2:00 PM, Andreas Seltenreich
 > <andreas.seltenreich@credativ.de> wrote:
 >> backtrace and query look similar to the issue we reported leading to the
 >> following commitfest entry:
 >> 
 >> https://commitfest.postgresql.org/16/1413/

 Peter> I suspect the same.

I just eyeballed that. The obvious question that comes to mind is, can't
this issue be fixed in nodeAgg rather than monkeying with the slot code?

-- 
Andrew (irc:RhodiumToad)


Re: BUG #15053: SIGSEGV - While executing query with cube agregator

From
Peter Geoghegan
Date:
On Wed, Feb 7, 2018 at 3:24 AM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
> I just eyeballed that. The obvious question that comes to mind is, can't
> this issue be fixed in nodeAgg rather than monkeying with the slot code?

You're looking at the wrong patch. Two patches were posted, each fixing
the issue in a different way, and one of them was really just for
illustrative purposes. The right place to fix this is in tuplesort.c.

-- 
Peter Geoghegan