Discussion: Postgres eats up memory when using cursors
Hello,

I would like to bring the subject to your attention again. The problem is the following: I need to have a cursor open for a long time. I declare a cursor on a table of approx. 1 million rows and start fetching data 1000 rows at a time. Data processing can take quite a long time (3-4 days). Theoretically the postgres process should remain the same size, but it grows... By the end of the 3rd day it became 256MB large!!!! And this is a REAL problem.

I would also like to mention that this problem was raised by another person as well. I would like to hear any comments on this issue.

--
Sincerely Yours,
Denis Perchine
----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------
Denis Perchine <dyp@perchine.com> writes:
> I declare a cursor on a table of approx. 1 million rows
> and start fetching data 1000 rows at a time.
> Data processing can take quite a long time (3-4 days).
> Theoretically the postgres process should remain the same size,
> but it grows... By the end of the 3rd day it became 256MB large!!!!
Query details please? You can't expect any results from such a
vague report.
regards, tom lane
On Thursday 01 March 2001 21:33, Tom Lane wrote:
> Denis Perchine <dyp@perchine.com> writes:
> > I declare a cursor on a table of approx. 1 million rows
> > and start fetching data 1000 rows at a time.
> > Data processing can take quite a long time (3-4 days).
> > Theoretically the postgres process should remain the same size,
> > but it grows... By the end of the 3rd day it became 256MB large!!!!
>
> Query details please? You can't expect any results from such a
> vague report.
:-)))
That's right.
declare senders_c cursor for select email, first_name, last_name from senders
order by email
fetch 1000 from senders_c
db=# explain declare senders_c cursor for select email, first_name, last_name
from senders order by email;
NOTICE: QUERY PLAN:
Index Scan using senders_email_key on senders (cost=0.00..197005.37
rows=928696 width=36)
db=# \d senders
Table "senders"
Attribute | Type | Modifier
------------+-----------+----------
email | text |
first_name | text |
last_name | text |
stamp | timestamp |
Index: senders_email_key
db=# \d senders_email_key
Index "senders_email_key"
Attribute | Type
-----------+------
email | text
unique btree
That's all. I could not imagine anything simpler...
Denis Perchine <dyp@perchine.com> writes:
> I declare a cursor on a table of approx. 1 million rows
> and start fetching data 1000 rows at a time.
> Data processing can take quite a long time (3-4 days).
> Theoretically the postgres process should remain the same size,
> but it grows... By the end of the 3rd day it became 256MB large!!!!
> declare senders_c cursor for select email, first_name, last_name from senders
> order by email
> fetch 1000 from senders_c
> db=# explain declare senders_c cursor for select email, first_name, last_name
> from senders order by email;
> NOTICE: QUERY PLAN:
> Index Scan using senders_email_key on senders (cost=0.00..197005.37
> rows=928696 width=36)
> db=# \d senders
> Table "senders"
> Attribute | Type | Modifier
> ------------+-----------+----------
> email | text |
> first_name | text |
> last_name | text |
> stamp | timestamp |
> Index: senders_email_key
> db=# \d senders_email_key
> Index "senders_email_key"
> Attribute | Type
> -----------+------
> email | text
> unique btree
> That's all. I could not imagine anything simpler...
Looks pretty simple to me too; I was expecting that you were doing
expression evaluations or some such. I cannot reproduce any backend
leakage in this query, either in 7.0.* or current sources. I did
create table senders (email text, first_name text, last_name text);
\copy senders from 'todo' -- some junk data
create index sendersi on senders(email);
insert into senders select * from senders; -- repeat a few times
begin;
explain
declare senders_c cursor for select email, first_name, last_name
from senders order by email;
NOTICE: QUERY PLAN:
Index Scan using sendersi on senders (cost=0.00..102.78 rows=2161 width=36)
end;
then made a script file
begin;
declare senders_c cursor for select email, first_name, last_name
from senders order by email;
fetch 10000 from senders_c
\g zzz
fetch backward 10000 from senders_c
\g zzz
fetch 10000 from senders_c
\g zzz
fetch backward 10000 from senders_c
\g zzz
fetch 10000 from senders_c
\g zzz
... repeat a few thousand times ...
end;
Backend memory usage is absolutely rock steady.
Curiously, there does seem to be a leak in psql, or possibly libpq,
when running this script. It's per-query not per-tuple, so it's
easier to see if you make the fetches shorter:
fetch 100 from senders_c
\g zzz
fetch backward 100 from senders_c
\g zzz
Repeating these like mad, psql grows about 1Kb/sec on my machine.
This occurs with current sources but NOT with 7.0.2 psql. Peter,
any thoughts about that?
regards, tom lane
> Looks pretty simple to me too; I was expecting that you were doing
> expression evaluations or some such. I cannot reproduce any backend
> leakage in this query, either in 7.0.* or current sources.
> [test case snipped]
> Backend memory usage is absolutely rock steady.
>
> Curiously, there does seem to be a leak in psql, or possibly libpq,
> when running this script. It's per-query not per-tuple, so it's
> easier to see if you make the fetches shorter:
>
> fetch 100 from senders_c
> \g zzz
> fetch backward 100 from senders_c
> \g zzz
>
> Repeating these like mad, psql grows about 1Kb/sec on my machine.
> This occurs with current sources but NOT with 7.0.2 psql. Peter,
> any thoughts about that?

Actually, you should also consider that there are really intensive inserts happening at the same time. The cursor is only closed after 3-4 days...
Tom Lane writes:
> Curiously, there does seem to be a leak in psql, or possibly libpq,
> when running this script. It's per-query not per-tuple, so it's
> easier to see if you make the fetches shorter:
>
> fetch 100 from senders_c
> \g zzz
> fetch backward 100 from senders_c
> \g zzz
>
> Repeating these like mad, psql grows about 1Kb/sec on my machine.
> This occurs with current sources but NOT with 7.0.2 psql. Peter,
> any thoughts about that?
Well, here's a memory leak:
Index: print.c
===================================================================
RCS file: /home/projects/pgsql/cvsroot/pgsql/src/bin/psql/print.c,v
retrieving revision 1.16
diff -c -r1.16 print.c
*** print.c 2001/02/27 08:13:27 1.16
--- print.c 2001/03/01 17:30:00
***************
*** 1116,1121 ****
--- 1116,1122 ----
free(footers[0]);
free(footers);
}
+ free(align);
}
===snip
This leaks (columns + 1) bytes per query, 4 bytes in your case. But it has
been there in 7.0 as well.
I couldn't find anything suspicious in the CVS diffs of psql or libpq
since 7.0.
So let's think: What if you use COPY? \copy? Plain SELECT? What's the
leakage per query cycle (not per second)? Is it related to the size of
the data?
--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> Repeating these like mad, psql grows about 1Kb/sec on my machine.
>> This occurs with current sources but NOT with 7.0.2 psql. Peter,
>> any thoughts about that?
> Well, here's a memory leak:
Good catch. I confirm this stops the leak in my test.
> This leaks (columns + 1) bytes per query, 4 bytes in your case. But it has
> been there in 7.0 as well.
Ooops, my bad --- repeating my test shows process growth in 7.0 as well.
I had missed it the first time because I'd previously done a plain
"select *", which ballooned the process footprint of psql. So the
leakage was happening within previously-grabbed memory and wasn't
obvious in "top".
This doesn't seem to actually be related to Denis' problem, but a
leak is a leak ...
regards, tom lane
> > This leaks (columns + 1) bytes per query, 4 bytes in your case. But it
> > has been there in 7.0 as well.
>
> Ooops, my bad --- repeating my test shows process growth in 7.0 as well.
> I had missed it the first time because I'd previously done a plain
> "select *", which ballooned the process footprint of psql. So the
> leakage was happening within previously-grabbed memory and wasn't
> obvious in "top".
>
> This doesn't seem to actually be related to Denis' problem, but a
> leak is a leak ...

Is this the same leak? I mean, I do not use psql at all...
Denis Perchine <dyp@perchine.com> writes:
>> This doesn't seem to actually be related to Denis' problem, but a
>> leak is a leak ...
> Is this the same leak? I mean, I do not use psql at all...
It isn't.
I'm now suspecting that your problem must be related to doing inserts
while a cursor is held open. Are the inserts done by the same backend,
or a different one? Could you experiment and see if you can devise a
complete example? I'm kinda busy right now looking at WAL...
regards, tom lane
> >> This doesn't seem to actually be related to Denis' problem, but a
> >> leak is a leak ...
> >
> > Is this the same leak? I mean, I do not use psql at all...
>
> It isn't.
>
> I'm now suspecting that your problem must be related to doing inserts
> while a cursor is held open. Are the inserts done by the same backend,
> or a different one?

By different ones. This backend just does fetches.

> Could you experiment and see if you can devise a complete example? I'm
> kinda busy right now looking at WAL...

Inserts are the only operations done besides fetching, so this already is a complete example. No other actions are taken on this table: only inserts made by the other backend, and fetches through the one cursor. I mean that this is the only backend doing fetches.
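For reference, a reproduction along the lines Tom asks for could be sketched as two concurrent psql sessions (a hypothetical script; the inserted values are made up, and the table is the senders table from earlier in the thread):

```sql
-- Session A: hold a cursor open for a long time, fetching in batches.
begin;
declare senders_c cursor for
    select email, first_name, last_name from senders order by email;
fetch 1000 from senders_c;   -- repeat, with long pauses, for days
-- ... cursor stays open while session B keeps inserting ...
end;

-- Session B (a different backend): continuous concurrent inserts.
insert into senders (email, first_name, last_name, stamp)
    values ('new@example.com', 'New', 'Sender', now());
-- repeat at a high rate while session A's transaction is open,
-- watching the size of session A's backend process in top/ps
```

The point of the sketch is that session A's backend never runs anything but FETCH, so any growth there would have to come from the long-open transaction seeing concurrent inserts, not from the query itself.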