Re: [HACKERS] Out of memory problem (forwarded bug report)
От | Vladimír Beneš |
---|---|
Тема | Re: [HACKERS] Out of memory problem (forwarded bug report) |
Дата | |
Msg-id | 010101bf7ebf$888368f0$451c11ac@p53apk.chv.pvt.cz обсуждение исходный текст |
Ответы |
Re: [HACKERS] Out of memory problem (forwarded bug report)
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-hackers |
Hi, thank You very much, You helped me :-) Well, I will use function date() and not date_trunc() in my views. I tryed change type of "bytes" column from int8 to int4. Range 0 .. +2 147 483 647 will suffice here. But I must retype this type to int8 in sum function because overflow can occur: collector=> create table flow_sums ( collector-> primary_collector varchar(50) not null, collector-> start datetime not null, collector-> end_period datetime not null, collector-> dead_time_rel float4 not null, collector-> src_name varchar(50) not null, collector-> dst_name varchar(50) not null, collector-> bytes int4 not null, collector-> packets int4 not null); CREATE collector=> insert into flow_sums(primary_collector,start,end_period,dead_time_rel,src_name,dst_name ,bytes,packets) collector-> select primary_collector,start,end_period,dead_time_rel,src_name,dst_name,bytes,pac kets collector-> from flow_sums_200002; INSERT 0 3198588 collector=> create unique index flow_sums_pk on flow_sums(primary_collector,start, src_name, dst_name); CREATE collector=> collector=> select sum(bytes) from flow_sums; sum ----------- -1712976144 (1 row) collector=> select sum(int8(bytes)) from flow_sums; sum ------------ 603877412592 (1 row) collector=> So my views will retype attributes in sums to int8 and they will use function date to round datetime attribute. Thank You very much and please forward this mail to pgsql-hackers@postgreSQL.org becouse the robot will bounce my CC. Thanks, V. Benes -----Původní zpráva----- Od: Tom Lane <tgl@sss.pgh.pa.us> Komu: Vladimír Beneš <Vladimir.Benes@pvt.cz> Kopie: pgsql-hackers@postgreSQL.org <pgsql-hackers@postgreSQL.org>; Mühlpachr Michal <michalm@pvt.net> Datum: 24. února 2000 6:46 Předmět: Re: [HACKERS] Out of memory problem (forwarded bug report) >Vladimir, > Thanks for the details. I think you are undoubtedly running into >expression evaluation memory leaks. Basically, any expression that >yields a non-pass-by-value data type consumes memory that is not >reclaimed until end of statement --- so when you process a few million >rows, that memory starts to add up. (Yes, I realize this is a horrible >misfeature. It's on our TO-DO list to fix it, but it probably won't >happen until 7.1 or 7.2.) In the meantime the best I can offer you >is workarounds. > > I think the major problems here are coming from the >"date_trunc('day',start)" calculation (because its datetime result is >pass-by-reference) and to a lesser extent from the sum(bytes) >calculation (because int8 is pass-by-reference). You could easily >replace "date_trunc('day',start)" with "date(start)"; since date is >a pass-by-value type, that won't leak memory, and it should give >equivalent results. The int8 sum is not quite so easy to fix. >I assume you can't get away with switching to int4 --- probably >your sum would overflow an int4? It may be that just fixing the >inefficient date_trunc calc will reduce your memory requirements >enough to get by. If not, the only good news I have is that release >7.0 does fix the memory-leak problem for internal calculations of >aggregate functions like sum(). You can get the first beta release >for 7.0 now. > > regards, tom lane > > >"Vladimír Beneš" <Vladimir.Benes@pvt.cz> writes: >> -----Původní zpráva----- >> Od: Tom Lane <tgl@sss.pgh.pa.us> >> Komu: Oliver Elphick <olly@lfix.co.uk> >> Kopie: pgsql-hackers@postgreSQL.org <pgsql-hackers@postgreSQL.org>; >> Vladimir.Benes@pvt.cz <Vladimir.Benes@pvt.cz> >> Datum: 22. února 2000 18:06 >> Předmět: Re: [HACKERS] Out of memory problem (forwarded bug report) > > >>> "Oliver Elphick" <olly@lfix.co.uk> writes: >>>> Can someone advise, please, how to deal with this problem in 6.5.3? >>> > >>> My guess is that the cause is memory leaks during expression evaluation; >>> but without seeing the complete view definitions and underlying table >>> definitions, it's impossible to know what processing is being invoked >>> by this query... >>> >>> regards, tom lane > > > >> Well, I will append views and underlying table definition: > >> 1) Once again - failure query: >> select comm_type,name,tot_bytes,tot_packets >> from flow_sums_days_send_200002_view >> where day='2000-02-21' and name not like '@%' >> union all >> select comm_type,name,tot_bytes,tot_packets >> from flow_sums_days_receive_200002_view >> where day='2000-02-21' and name not like '@%' > >> 2) views definition: >> create view flow_sums_days_send_200002_view as >> select >> 'send'::varchar as comm_type, date_trunc('day',start) as day, >> src_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets >> from flow_sums_200002 >> group by day, src_name > >> create view flow_sums_days_receive_200002_view as >> select >> 'receive'::varchar as comm_type, date_trunc('day',start) as day, >> dst_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets >> from flow_sums_200002 >> group by day, dst_name > > >> I wanted create only one usefull view: > >> create view flow_sums_days_200002_view as >> select >> 'send'::varchar as comm_type, date_trunc('day',start) as day, >> src_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets >> from flow_sums_200002 >> group by day, src_name >> UNION ALL >> select >> 'receive'::varchar as comm_type, date_trunc('day',start) as day, >> dst_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets >> from flow_sums_200002 >> group by day, dst_name > >> ...but Postgres cann't use clause UNION ALL at view definition. So I created >> two views mentioned above and I wanted use this ones with UNION ALL clause >> only. > >> 3) underlaying table definition: >> create table flow_sums_200002 ( >> primary_collector varchar(50) not null, >> start datetime not null, >> end_period datetime not null, >> dead_time_rel float4 not null, >> src_name varchar(50) not null, >> dst_name varchar(50) not null, >> bytes int8 not null, >> packets int4 not null >> ) > >> Today this table has about 3 000 000 rows and the select command >> mentioned above returns 190 + 255 rows. > > >> Now I don't use clause "UNION ALL" and the program executes two queryes >> and then adds both result to new result. I reduced time increment of number >> rows to flow_sums_200002 table (three times less). This table contains data >> of February 2000 and the program will create table flow_sums_200003 with >> relevant views next month. >> Well, now this solution solve my problem but always depends on number of >> rows - I only moved limit of rows count. > > >> Thank You, V. Benes > >> P.S.: I append part of top on my system while the query is running: > >> CPU states: 98.6% user, 1.3% system, 0.0% nice, 0.0% idle >> Mem: 127256K av, 124316K used, 2940K free, 29812K shrd, 2620K buff >> Swap: 128516K av, 51036K used, 77480K free 7560K cached > >> PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND >> 2942 postgres 20 0 141M 99M 17348 R 0 99.0 80.4 1:22 postmaster > >> => postmaster later took 80 - 95% of memory, free memory decressed to 2 MB, >> CPU was overloaded (0% idle and 99% by user process of postmaster). Have You >> ever seen something similar :-) ?
В списке pgsql-hackers по дате отправления:
Предыдущее
От: "Oliver Elphick"Дата:
Сообщение: Re: [HACKERS] Out of memory problem (forwarded bug report)