Обсуждение: array_agg crash?

Поиск
Список
Период
Сортировка

array_agg crash?

От
"Chris Spotts"
Дата:

I had one simple query that kept crashing the connection. It crashes after several minutes.

Tried restarting, it still error’d at the same place.

Tried recreating the table it was selecting from, it still error’d at the same place.

I rewrote the query with an ARRAY subselect and it finished flawlessly in a few seconds.

There is about 4 million records in the table its selecting from.  No array ends up with more than 4 elements.

For some reason, the log indicates this is causing an issue with autovacuum…it says it was -9’d, but it wasn’t by or any other physical person.

 

Here is the problem query…

 

create table public.temptrips

as

select trip_id,array_agg(customer_upload_id)

from

trip_ids_to_customer_upload_ids

group by trip_id;

server closed the connection unexpectedly

        This probably means the server terminated abnormally

        before or while processing the request.

The connection to the server was lost. Attempting reset: Failed.

!>

 

 

LOG:  00000: autovacuum launcher process (PID 10264) was terminated by signal 9: Killed

2009-07-21 08:44:26 EDT -    LOCATION:  LogChildExit, postmaster.c:2673

2009-07-21 08:44:26 EDT -    LOG:  00000: terminating any other active server processes

2009-07-21 08:44:26 EDT -    LOCATION:  HandleChildCrash, postmaster.c:2500

2009-07-21 08:44:26 EDT -    WARNING:  57P02: terminating connection because of crash of another server process

2009-07-21 08:44:26 EDT -    DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2009-07-21 08:44:26 EDT -    HINT:  In a moment you should be able to reconnect to the database and repeat your command.

2009-07-21 08:44:26 EDT -    LOCATION:  quickdie, postgres.c:2495

2009-07-21 08:44:27 EDT - postgres postgres /usr/local/pgsql/bin/postmaster FATAL:  57P03: the database system is in recovery mode

2009-07-21 08:44:27 EDT - postgres postgres /usr/local/pgsql/bin/postmaster LOCATION:  ProcessStartupPacket, postmaster.c:1721

2009-07-21 08:44:26 EDT - postgres postgres startup WARNING:  57P02: terminating connection because of crash of another server process

2009-07-21 08:44:27 EDT - postgres postgres startup DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2009-07-21 08:44:27 EDT - postgres postgres startup HINT:  In a moment you should be able to reconnect to the database and repeat your command.

2009-07-21 08:44:27 EDT - postgres postgres startup LOCATION:  quickdie, postgres.c:2495

2009-07-21 08:44:28 EDT -    LOG:  00000: all server processes terminated; reinitializing

2009-07-21 08:44:28 EDT -    LOCATION:  PostmasterStateMachine, postmaster.c:2858

2009-07-21 08:44:28 EDT -    LOG:  00000: database system was interrupted; last known up at 2009-07-21 08:41:32 EDT

2009-07-21 08:44:28 EDT -    LOCATION:  StartupXLOG, xlog.c:5236

2009-07-21 08:44:28 EDT -    LOG:  00000: database system was not properly shut down; automatic recovery in progress

2009-07-21 08:44:28 EDT -    LOCATION:  StartupXLOG, xlog.c:5410

2009-07-21 08:44:28 EDT -    LOG:  00000: redo starts at 76/4380AC70

2009-07-21 08:44:28 EDT -    LOCATION:  StartupXLOG, xlog.c:5493

2009-07-21 08:44:29 EDT -    LOG:  00000: record with zero length at 76/438869D0

2009-07-21 08:44:29 EDT -    LOCATION:  ReadRecord, xlog.c:3532

2009-07-21 08:44:29 EDT -    LOG:  00000: redo done at 76/438869A0

2009-07-21 08:44:29 EDT -    LOCATION:  StartupXLOG, xlog.c:5625

2009-07-21 08:44:29 EDT -    LOG:  00000: last completed transaction was at log time 2009-07-21 08:41:49.707423-04

2009-07-21 08:44:29 EDT -    LOCATION:  StartupXLOG, xlog.c:5629

2009-07-21 08:44:30 EDT -    LOG:  00000: autovacuum launcher started

2009-07-21 08:44:30 EDT -    LOCATION:  AutoVacLauncherMain, autovacuum.c:529

2009-07-21 08:44:30 EDT -    LOG:  00000: database system is ready to accept connections

2009-07-21 08:44:30 EDT -    LOCATION:  reaper, postmaster.c:2272

 

 

It looks like this is causing the autovacuum to crash, what could cause this?

 

 

 

 

 

 

postgres=# select version();

                                                       version

----------------------------------------------------------------------------------------------------------------------

 PostgreSQL 8.4.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-8.0.1), 64-bit

(1 row)

 

 

Chris

 

Re: array_agg crash?

От
Tom Lane
Дата:
"Chris Spotts" <rfusca@gmail.com> writes:
> LOG:  00000: autovacuum launcher process (PID 10264) was terminated by
> signal 9: Killed

Looks like the OOM killer is loose on your system.  Disable memory
overcommit in the kernel and things will get better.
http://www.postgresql.org/docs/8.4/static/kernel-resources.html#AEN24152

            regards, tom lane

Re: array_agg crash?

От
"Chris Spotts"
Дата:
> "Chris Spotts" <rfusca@gmail.com> writes:
> > LOG:  00000: autovacuum launcher process (PID 10264) was terminated
> by
> > signal 9: Killed
>
> Looks like the OOM killer is loose on your system.  Disable memory
> overcommit in the kernel and things will get better.
> http://www.postgresql.org/docs/8.4/static/kernel-
> resources.html#AEN24152
>
>             regards, tom lane
[Spotts, Christopher]

That seems to have fixed it in the sense that it doesn't crash, but I killed
it after 15 minutes and no results...rather I tried to kill it, but it looks
like I'm going to have to -9 it...


Re: array_agg crash?

От
Tom Lane
Дата:
"Chris Spotts" <rfusca@gmail.com> writes:
> That seems to have fixed it in the sense that it doesn't crash, but I killed
> it after 15 minutes and no results...rather I tried to kill it, but it looks
> like I'm going to have to -9 it...

How big were the arrays you were trying to push around here?  I tried
interrupting a similar query and it came right back; but if you were
building some really enormous arrays I could see the final array
build loop taking awhile, and there's no CHECK_FOR_INTERRUPTS in it...

            regards, tom lane

Re: array_agg crash?

От
"Chris Spotts"
Дата:
> killed
> > it after 15 minutes and no results...rather I tried to kill it, but
> it looks
> > like I'm going to have to -9 it...
>
> How big were the arrays you were trying to push around here?  I tried
> interrupting a similar query and it came right back; but if you were
> building some really enormous arrays I could see the final array
> build loop taking awhile, and there's no CHECK_FOR_INTERRUPTS in it...
>
>             regards, tom lane
[Spotts, Christopher]
Like I'd said originally, there were no arrays that ended up being more 4
elements long -  all integers. The vast majority of them were 1 or 2 long.

If it was having trouble allocating the memory for arrays, why doesn't using
a ARRAY(subselect)  yield the same issue?


Re: array_agg crash?

От
Tom Lane
Дата:
"Chris Spotts" <rfusca@gmail.com> writes:
>>> killed
>>> it after 15 minutes and no results...rather I tried to kill it, but
>>> it looks like I'm going to have to -9 it...
>>
>> How big were the arrays you were trying to push around here?

> Like I'd said originally, there were no arrays that ended up being more 4
> elements long -  all integers. The vast majority of them were 1 or 2 long.

Hm, maybe the problem is not so much array size as too many arrays.  How
many groups are you expecting in that query?  Does the plan for the
array_agg query show hash or group aggregation?

            regards, tom lane

Re: array_agg crash?

От
"Chris Spotts"
Дата:
> >> How big were the arrays you were trying to push around here?
>
> > Like I'd said originally, there were no arrays that ended up being
> more 4
> > elements long -  all integers. The vast majority of them were 1 or 2
> long.
>
> Hm, maybe the problem is not so much array size as too many arrays.
> How
> many groups are you expecting in that query?  Does the plan for the
> array_agg query show hash or group aggregation?
>
>             regards, tom lane
[Spotts, Christopher]
GroupAggregate


Re: array_agg crash?

От
Tom Lane
Дата:
"Chris Spotts" <rfusca@gmail.com> writes:
>> many groups are you expecting in that query?  Does the plan for the
>> array_agg query show hash or group aggregation?

> GroupAggregate

Huh, there's no reason it should take much memory then.  Maybe you've
found a memory leak.  Can you put together a self-contained test case?

            regards, tom lane

Re: array_agg crash?

От
"Chris Spotts"
Дата:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Tuesday, July 21, 2009 12:16 PM
> To: Chris Spotts
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] array_agg crash?
>
> "Chris Spotts" <rfusca@gmail.com> writes:
> >> many groups are you expecting in that query?  Does the plan for the
> >> array_agg query show hash or group aggregation?
>
> > GroupAggregate
>
> Huh, there's no reason it should take much memory then.  Maybe you've
> found a memory leak.  Can you put together a self-contained test case?
>
>             regards, tom lane
[Spotts, Christopher]
I'll try to put something together tonight and let you know tomorrow.


Re: array_agg crash?

От
Chris Spotts
Дата:
Tom Lane wrote:
> "Chris Spotts" <rfusca@gmail.com> writes:
>
>>> many groups are you expecting in that query?  Does the plan for the
>>> array_agg query show hash or group aggregation?
>>>
>
>
>> GroupAggregate
>>
>
> Huh, there's no reason it should take much memory then.  Maybe you've
> found a memory leak.  Can you put together a self-contained test case?
>
>             regards, tom lane
>
What do you want specifically as far as details for the test case?  I
exported just the table that that was reading from.  Installed a new
clean virtual  machine ubuntu (jaunty) and then installed 8.4.0.
Imported the table and definition.  Ran the same query and the same
thing happened.

Table its selecting from is:
          Table "public.trip_ids_to_customer_upload_ids"
       Column       |  Type   | Modifiers | Storage | Description
--------------------+---------+-----------+---------+-------------
 trip_id            | bigint  |           | plain   |
 customer_upload_id | integer |           | plain   |
Indexes:
    "trips_customer_id" btree (trip_id, customer_upload_id)
Has OIDs: no

There is 3801347 rows in the table.  There are 3773039 distinct trip_id
values.  So you can see that the vast majority of rows here are just a
single  element array.


Re: array_agg crash?

От
Tom Lane
Дата:
Chris Spotts <rfusca@gmail.com> writes:
> What do you want specifically as far as details for the test case?  I
> exported just the table that that was reading from.  Installed a new
> clean virtual  machine ubuntu (jaunty) and then installed 8.4.0.
> Imported the table and definition.  Ran the same query and the same
> thing happened.

The table dump and the query would be enough then.  Can you send it
to me off-list?

            regards, tom lane