Обсуждение: crazy logging from PG 8.4

Поиск
Список
Период
Сортировка

crazy logging from PG 8.4

От
Jon Nelson
Дата:
After several days of running a simple query over a very large dataset
(more than a terabyte), I logged in to note that postgresql had
decided to log *billions* of log lines.

The query looks like this:

create table result_table as
select TEXTFIELD, count(distinct INTFIELD) from SOME_HUGE_TABLE group
by TEXTFIELD;

At some point, PG started logging stuff that looked like this (over
and over again):

SELECT 2704204951 user@dbname LOG:  00000: performsort starting: CPU 0.00....

The 2704204951 number corresponds to the "%l" (session line number)
in log_line_prefix.  Thus, by the time I saw this, it had logged 2.7
billion lines.


An strace of the process showed this, over and over again:

getrusage(....)
write(2, <logging info>)

and nothing else.  What happened here?  It seems like PG just went
crazy and got itself into some sort of loop. I had to kill the query.

This is PostgreSQL 8.4.13 on x86_64, Linux, using Scientific Linux 6.4.

--
Jon

Re: crazy logging from PG 8.4

От
"Tomas Vondra"
Дата:
On 23 Říjen 2013, 16:00, Jon Nelson wrote:
> After several days of running a simple query over a very large dataset
> (more than a terabyte), I logged in to note that postgresql had
> decided to log *billions* of log lines.
>
> The query looks like this:
>
> create table result_table as
> select TEXTFIELD, count(distinct INTFIELD) from SOME_HUGE_TABLE group
> by TEXTFIELD;
>
> At some point, PG started logging stuff that looked like this (over
> and over again):
>
> SELECT 2704204951 user@dbname LOG:  00000: performsort starting: CPU
> 0.00....
>
> The 2704204951 number corresponds to the "%l" (session line number)
> in log_line_prefix.  Thus, by the time I saw this, it had logged 2.7
> billion lines.
>
>
> An strace of the process showed this, over and over again:
>
> getrusage(....)
> write(2, <logging info>)
>
> and nothing else.  What happened here?  It seems like PG just went
> crazy and got itself into some sort of loop. I had to kill the query.

First guess is that you enabled detailed logging in config. See
log_*_stats in
http://www.postgresql.org/docs/8.4/static/runtime-config-statistics.html

I'd guess it's log_executor_stats, but as you've copied a single line (and
not even that one was complete), it's hard to say.

Those are the only lines referring to getrusage in the source code,
although I'm not sure any of those could that produce the number of
messages you've mentioned.

> This is PostgreSQL 8.4.13 on x86_64, Linux, using Scientific Linux 6.4.

Not really the most updated 8.4 install. Please consider updating to 8.4.18.

Tomas

Re: crazy logging from PG 8.4

От
Tom Lane
Дата:
"Tomas Vondra" <tv@fuzzy.cz> writes:
> On 23 Říjen 2013, 16:00, Jon Nelson wrote:
>> After several days of running a simple query over a very large dataset
>> (more than a terabyte), I logged in to note that postgresql had
>> decided to log *billions* of log lines.

> First guess is that you enabled detailed logging in config.

I think it's trace_sort.

            regards, tom lane

Re: crazy logging from PG 8.4

От
Jon Nelson
Дата:
On Wed, Oct 23, 2013 at 9:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Tomas Vondra" <tv@fuzzy.cz> writes:
>> On 23 =C5=98=C3=ADjen 2013, 16:00, Jon Nelson wrote:
>>> After several days of running a simple query over a very large dataset
>>> (more than a terabyte), I logged in to note that postgresql had
>>> decided to log *billions* of log lines.
>
>> First guess is that you enabled detailed logging in config.
>
> I think it's trace_sort.

I think you are right.

Let's say that there would end up being 2.7+ billion groups.  Does
that scenario help to explain? I guess I'd like to see trace_sort have
a new value in addition to just "on" and "off", perhaps "external".  I
don't really care about internal sorts but I do care about external
sorts. Right now, they're the only (minimal!) insight I have into how
the query is progressing.

--=20
Jon

Re: crazy logging from PG 8.4

От
Tomas Vondra
Дата:
On 23.10.2013 20:56, Jon Nelson wrote:
> On Wed, Oct 23, 2013 at 9:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> "Tomas Vondra" <tv@fuzzy.cz> writes:
>>> On 23 Říjen 2013, 16:00, Jon Nelson wrote:
>>>> After several days of running a simple query over a very large dataset
>>>> (more than a terabyte), I logged in to note that postgresql had
>>>> decided to log *billions* of log lines.
>>
>>> First guess is that you enabled detailed logging in config.
>>
>> I think it's trace_sort.
>
> I think you are right.
>
> Let's say that there would end up being 2.7+ billion groups.  Does
> that scenario help to explain? I guess I'd like to see trace_sort have
> a new value in addition to just "on" and "off", perhaps "external".  I
> don't really care about internal sorts but I do care about external
> sorts. Right now, they're the only (minimal!) insight I have into how
> the query is progressing.

Yes, that sounds plausible, because COUNT(DISTINCT) ends with a sort for
each group. At least that's my experience.

If by internal/external you mean in-memory/on-disk sorting, then you can
disable trace_sort and use log_temp_files instead. It will probably log
some other operations using temp files, but it's unlikely to happen as
frequently as trace_sort.

Tomas