Обсуждение: Re: [HACKERS] Out of memory problem (forwarded bug report)

Поиск
Список
Период
Сортировка

Re: [HACKERS] Out of memory problem (forwarded bug report)

От
"Vladimír Beneš"
Дата:
-----Původní zpráva-----
Od: Tom Lane <tgl@sss.pgh.pa.us>
Komu: Oliver Elphick <olly@lfix.co.uk>
Kopie: pgsql-hackers@postgreSQL.org <pgsql-hackers@postgreSQL.org>;
Vladimir.Benes@pvt.cz <Vladimir.Benes@pvt.cz>
Datum: 22. února 2000 18:06
Předmět: Re: [HACKERS] Out of memory problem (forwarded bug report)


>"Oliver Elphick" <olly@lfix.co.uk> writes:
>> Can someone advise, please, how to deal with this problem in 6.5.3?
>

>My guess is that the cause is memory leaks during expression evaluation;
>but without seeing the complete view definitions and underlying table
>definitions, it's impossible to know what processing is being invoked
>by this query...
>
> regards, tom lane


   Well, I will append views and underlying table definition:

1) Once again - failure query:
select comm_type,name,tot_bytes,tot_packets
from flow_sums_days_send_200002_view
where day='2000-02-21' and name not like '@%'
union all
select comm_type,name,tot_bytes,tot_packets
from flow_sums_days_receive_200002_view
where day='2000-02-21' and name not like '@%'

2) views definition:
create view flow_sums_days_send_200002_view as
select 'send'::varchar as comm_type, date_trunc('day',start) as day, src_name as name, sum(bytes) as tot_bytes,
sum(packets)as tot_packets
 
from flow_sums_200002
group by day, src_name

create view flow_sums_days_receive_200002_view as
select 'receive'::varchar as comm_type, date_trunc('day',start) as day, dst_name as name, sum(bytes) as tot_bytes,
sum(packets)as tot_packets
 
from flow_sums_200002
group by day, dst_name


I wanted create only one usefull view:

create view flow_sums_days_200002_view as
select 'send'::varchar as comm_type, date_trunc('day',start) as day, src_name as name, sum(bytes) as tot_bytes,
sum(packets)as tot_packets
 
from flow_sums_200002
group by day, src_name
UNION ALL
select 'receive'::varchar as comm_type, date_trunc('day',start) as day, dst_name as name, sum(bytes) as tot_bytes,
sum(packets)as tot_packets
 
from flow_sums_200002
group by day, dst_name

...but Postgres cann't use clause UNION ALL at view definition. So I created
two views mentioned above and I wanted use this ones with UNION ALL clause
only.

3) underlaying table definition:
create table flow_sums_200002 (   primary_collector varchar(50) not null,   start datetime not null,   end_period
datetimenot null,   dead_time_rel float4 not null,   src_name varchar(50) not null,   dst_name varchar(50) not null,
bytesint8 not null,   packets int4 not null
 
)
   Today this table has about 3 000 000 rows and the select command
mentioned above returns 190 + 255 rows.

   Now I don't use clause "UNION ALL" and the program executes two queryes
and then adds both result to new result. I reduced time increment of number
rows to flow_sums_200002 table (three times less). This table contains data
of February 2000 and the program will create table flow_sums_200003 with
relevant views next month.   Well, now this solution solve my problem but always depends on number of
rows - I only moved limit of rows count.

           Thank You, V. Benes

P.S.: I append part of top on my system while the query is running:

CPU states: 98.6% user,  1.3% system,  0.0% nice,  0.0% idle
Mem:  127256K av, 124316K used,   2940K free,  29812K shrd,   2620K buff
Swap: 128516K av,  51036K used,  77480K free                  7560K cached
 PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
2942 postgres  20   0  141M  99M 17348 R       0 99.0 80.4   1:22 postmaster

=> postmaster later took 80 - 95% of memory, free memory decressed to 2 MB,
CPU was overloaded (0% idle and 99% by user process of postmaster). Have You
ever seen something similar :-) ?




Re: [HACKERS] Out of memory problem (forwarded bug report)

От
Tom Lane
Дата:
Vladimir, Thanks for the details.  I think you are undoubtedly running into
expression evaluation memory leaks.  Basically, any expression that
yields a non-pass-by-value data type consumes memory that is not
reclaimed until end of statement --- so when you process a few million
rows, that memory starts to add up.  (Yes, I realize this is a horrible
misfeature.  It's on our TO-DO list to fix it, but it probably won't
happen until 7.1 or 7.2.)  In the meantime the best I can offer you
is workarounds.
 I think the major problems here are coming from the
"date_trunc('day',start)" calculation (because its datetime result is
pass-by-reference) and to a lesser extent from the sum(bytes)
calculation (because int8 is pass-by-reference).  You could easily
replace "date_trunc('day',start)" with "date(start)"; since date is
a pass-by-value type, that won't leak memory, and it should give
equivalent results.  The int8 sum is not quite so easy to fix.
I assume you can't get away with switching to int4 --- probably
your sum would overflow an int4?  It may be that just fixing the
inefficient date_trunc calc will reduce your memory requirements
enough to get by.  If not, the only good news I have is that release
7.0 does fix the memory-leak problem for internal calculations of
aggregate functions like sum().  You can get the first beta release
for 7.0 now.
        regards, tom lane


"Vladimír Beneš" <Vladimir.Benes@pvt.cz> writes:
> -----P�vodn� zpr�va-----
> Od: Tom Lane <tgl@sss.pgh.pa.us>
> Komu: Oliver Elphick <olly@lfix.co.uk>
> Kopie: pgsql-hackers@postgreSQL.org <pgsql-hackers@postgreSQL.org>;
> Vladimir.Benes@pvt.cz <Vladimir.Benes@pvt.cz>
> Datum: 22. �nora 2000 18:06
> P�edm�t: Re: [HACKERS] Out of memory problem (forwarded bug report)


>> "Oliver Elphick" <olly@lfix.co.uk> writes:
>>> Can someone advise, please, how to deal with this problem in 6.5.3?
>> 

>> My guess is that the cause is memory leaks during expression evaluation;
>> but without seeing the complete view definitions and underlying table
>> definitions, it's impossible to know what processing is being invoked
>> by this query...
>> 
>> regards, tom lane



>     Well, I will append views and underlying table definition:

> 1) Once again - failure query:
> select comm_type,name,tot_bytes,tot_packets
> from flow_sums_days_send_200002_view
> where day='2000-02-21' and name not like '@%'
> union all
> select comm_type,name,tot_bytes,tot_packets
> from flow_sums_days_receive_200002_view
> where day='2000-02-21' and name not like '@%'

> 2) views definition:
> create view flow_sums_days_send_200002_view as
> select
>   'send'::varchar as comm_type, date_trunc('day',start) as day,
>   src_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
> from flow_sums_200002
> group by day, src_name

> create view flow_sums_days_receive_200002_view as
> select
>   'receive'::varchar as comm_type, date_trunc('day',start) as day,
>   dst_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
> from flow_sums_200002
> group by day, dst_name


> I wanted create only one usefull view:

> create view flow_sums_days_200002_view as
> select
>   'send'::varchar as comm_type, date_trunc('day',start) as day,
>   src_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
> from flow_sums_200002
> group by day, src_name
> UNION ALL
> select
>   'receive'::varchar as comm_type, date_trunc('day',start) as day,
>   dst_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
> from flow_sums_200002
> group by day, dst_name

> ...but Postgres cann't use clause UNION ALL at view definition. So I created
> two views mentioned above and I wanted use this ones with UNION ALL clause
> only.

> 3) underlaying table definition:
> create table flow_sums_200002 (
>     primary_collector varchar(50) not null,
>     start datetime not null,
>     end_period datetime not null,
>     dead_time_rel float4 not null,
>     src_name varchar(50) not null,
>     dst_name varchar(50) not null,
>     bytes int8 not null,
>     packets int4 not null
> )

>     Today this table has about 3 000 000 rows and the select command
> mentioned above returns 190 + 255 rows.


>     Now I don't use clause "UNION ALL" and the program executes two queryes
> and then adds both result to new result. I reduced time increment of number
> rows to flow_sums_200002 table (three times less). This table contains data
> of February 2000 and the program will create table flow_sums_200003 with
> relevant views next month.
>     Well, now this solution solve my problem but always depends on number of
> rows - I only moved limit of rows count.


>             Thank You, V. Benes

> P.S.: I append part of top on my system while the query is running:

> CPU states: 98.6% user,  1.3% system,  0.0% nice,  0.0% idle
> Mem:  127256K av, 124316K used,   2940K free,  29812K shrd,   2620K buff
> Swap: 128516K av,  51036K used,  77480K free                  7560K cached

>   PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
> 2942 postgres  20   0  141M  99M 17348 R       0 99.0 80.4   1:22 postmaster

> => postmaster later took 80 - 95% of memory, free memory decressed to 2 MB,
> CPU was overloaded (0% idle and 99% by user process of postmaster). Have You
> ever seen something similar :-) ?



Re: [HACKERS] Out of memory problem (forwarded bug report)

От
"Oliver Elphick"
Дата:
Tom Lane wrote:
... >your sum would overflow an int4?  It may be that just fixing the >inefficient date_trunc calc will reduce your
memoryrequirements >enough to get by.  If not, the only good news I have is that release >7.0 does fix the memory-leak
problemfor internal calculations of >aggregate functions like sum().  You can get the first beta release >for 7.0 now.
 

I'm putting together a Debian release of the beta at the moment. 

-- 
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver              PGP key from public servers; key
ID32B8FAA1                ========================================    "Thy word is a lamp unto my feet, and a light
untomy      path."           Psalms 119:105