Thread: Out of memory error on huge resultset


Out of memory error on huge resultset

From:
"Nick Fankhauser"
Date:
I'm selecting a huge ResultSet from our database - about one million rows,
with one of the fields being varchar(500). I get an out-of-memory error from
Java.

If the whole ResultSet gets stashed in memory, this isn't really surprising,
but I'm wondering why this happens (if it does), rather than a subset around
the current record being cached and other rows being retrieved as needed.

If it turns out that there are good reasons for it to all be in memory, then
my question is whether there is a better approach that people typically use
in this situation. For now, I'm simply breaking up the select into smaller
chunks, but that approach won't be satisfactory in the long run.

Thanks

-Nick

--------------------------------------------------------------------------
Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
Ray Ontko & Co.     Software Consulting Services     http://www.ontko.com/


Re: Out of memory error on huge resultset

From:
Dave Cramer
Date:
Nick,

Use a cursor. The current driver doesn't support caching; the backend
gives you everything you ask for, and you can't just say you want a limited
set.

So if you use cursors you can fetch a subset.

Dave
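
A minimal sketch of the explicit-cursor workaround Dave describes, driven
through plain JDBC statements; the table and column names (items, id,
payload) are invented for illustration:

import java.sql.*;

public class CursorRead {
    public static void main(String[] args) throws SQLException {
        Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "secret");
        con.setAutoCommit(false);   // a cursor only lives inside a transaction
        Statement st = con.createStatement();
        st.execute("DECLARE huge CURSOR FOR SELECT id, payload FROM items");
        int fetched;
        do {
            ResultSet rs = st.executeQuery("FETCH FORWARD 1000 FROM huge");
            fetched = 0;
            while (rs.next()) {
                fetched++;
                // process rs.getInt(1) and rs.getString(2) here
            }
            rs.close();
        } while (fetched > 0);      // an empty FETCH means the cursor is exhausted
        st.execute("CLOSE huge");
        con.commit();
        con.close();
    }
}
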
On Thu, 2002-10-10 at 11:24, Nick Fankhauser wrote:
>
> I'm selecting a huge ResultSet from our database- about one million rows,
> with one of the fields being varchar(500). I get an out of memory error from
> java.
>
> If the whole ResultSet gets stashed in memory, this isn't really surprising,
> but I'm wondering why this happens (if it does), rather than a subset around
> the current record being cached and other rows being retrieved as needed.
>
> If it turns out that there are good reasons for it to all be in memory, then
> my question is whether there is a better approach that people typically use
> in this situation. For now, I'm simply breaking up the select into smaller
> chunks, but that approach won't be satisfactory in the long run.
>
> Thanks
>
> -Nick
>
> --------------------------------------------------------------------------
> Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
> Ray Ontko & Co.     Software Consulting Services     http://www.ontko.com/
>
>




Re: Out of memory error on huge resultset

From:
Barry Lind
Date:
Nick,

This has been discussed before on this list many times.  But the short
answer is that that is how the postgres server handles queries.  If you
issue a query, the server will return the entire result (try the same
query in psql and you will have the same problem).  To work around this
you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
SQL commands for postgres).

thanks,
--Barry


Nick Fankhauser wrote:
> I'm selecting a huge ResultSet from our database- about one million rows,
> with one of the fields being varchar(500). I get an out of memory error from
> java.
>
> If the whole ResultSet gets stashed in memory, this isn't really surprising,
> but I'm wondering why this happens (if it does), rather than a subset around
> the current record being cached and other rows being retrieved as needed.
>
> If it turns out that there are good reasons for it to all be in memory, then
> my question is whether there is a better approach that people typically use
> in this situation. For now, I'm simply breaking up the select into smaller
> chunks, but that approach won't be satisfactory in the long run.
>
> Thanks
>
> -Nick
>
> --------------------------------------------------------------------------
> Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
> Ray Ontko & Co.     Software Consulting Services     http://www.ontko.com/
>
>




Re: Out of memory error on huge resultset

From:
snpe
Date:
Barry,
  Is it true?
I created a table with one varchar(500) column and entered 1 million rows
10-20 characters long. The JDBC query 'select * from a' gets an 'out of
memory' error, but psql does not.
I inserted 8 million rows and psql still works (slowly, but it works).

The C library has an 'execute query' without a fetch; in JDBC, execute
fetches all rows, and that is the problem. I think executeQuery should only
prepare the query, and the fetch (ResultSet.next or ...) should fetch only
fetchSize rows at a time.
I am not sure, but I think the problem is with JDBC, not PostgreSQL.
Hackers?
Does psql fetch all rows, and if not, how many?
Can I change the fetch size in psql?
CURSOR, FETCH and MOVE aren't a solution:
if I use JDBC from a third-party IDE, I can't force that approach.

regards

On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> Nick,
>
> This has been discussed before on this list many times.  But the short
> answer is that that is how the postgres server handles queries.  If you
> issue a query the server will return the entire result.  (try the same
> query in psql and you will have the same problem).  To work around this
> you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> sql commands for postgres).
>
> thanks,
> --Barry
>
> Nick Fankhauser wrote:
> > I'm selecting a huge ResultSet from our database- about one million rows,
> > with one of the fields being varchar(500). I get an out of memory error
> > from java.
> >
> > If the whole ResultSet gets stashed in memory, this isn't really
> > surprising, but I'm wondering why this happens (if it does), rather than
> > a subset around the current record being cached and other rows being
> > retrieved as needed.
> >
> > If it turns out that there are good reasons for it to all be in memory,
> > then my question is whether there is a better approach that people
> > typically use in this situation. For now, I'm simply breaking up the
> > select into smaller chunks, but that approach won't be satisfactory in
> > the long run.
> >
> > Thanks
> >
> > -Nick
> >
> > -------------------------------------------------------------------------
> >- Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax
> > 1.765.962.9788 Ray Ontko & Co.     Software Consulting Services
> > http://www.ontko.com/
> >
> >


Re: Out of memory error on huge resultset

From:
Aaron Mulder
Date:
    What would be the disadvantage of making the JDBC driver use a
cursor under the covers (always)?  Is it significantly slower or more
resource-intensive than fetching all the data at once?  Certainly it seems
like it would save memory in some cases.

Aaron

On 10 Oct 2002, Dave Cramer wrote:
> Nick,
>
> Use a cursor, the current driver doesn't support caching, the backend
> gives you everything you ask for, you can't just say you want a limited
> set.
>
> So if you use cursors you can fetch a subset
>
> Dave
> On Thu, 2002-10-10 at 11:24, Nick Fankhauser wrote:
> >
> > I'm selecting a huge ResultSet from our database- about one million rows,
> > with one of the fields being varchar(500). I get an out of memory error from
> > java.
> >
> > If the whole ResultSet gets stashed in memory, this isn't really surprising,
> > but I'm wondering why this happens (if it does), rather than a subset around
> > the current record being cached and other rows being retrieved as needed.
> >
> > If it turns out that there are good reasons for it to all be in memory, then
> > my question is whether there is a better approach that people typically use
> > in this situation. For now, I'm simply breaking up the select into smaller
> > chunks, but that approach won't be satisfactory in the long run.
> >
> > Thanks
> >
> > -Nick
> >
> > --------------------------------------------------------------------------
> > Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
> > Ray Ontko & Co.     Software Consulting Services     http://www.ontko.com/
> >
> >


Re: Out of memory error on huge resultset

From:
Dave Cramer
Date:
No disadvantage; in fact, that is what we would like to do.

setFetchSize(size) would turn on cursor support; otherwise we would fetch
normally.

Dave
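
For reference, later releases of the driver did implement the pattern Dave
sketches here; with autocommit off, a forward-only statement plus
setFetchSize streams rows in batches. A sketch, given an open Connection
con (the table name 'a' is taken from snpe's test above):

import java.sql.*;

static void streamAll(Connection con) throws SQLException {
    con.setAutoCommit(false);     // required: the underlying portal is transaction-scoped
    Statement st = con.createStatement();
    st.setFetchSize(50);          // ask the driver to pull ~50 rows per round trip
    ResultSet rs = st.executeQuery("SELECT * FROM a");
    while (rs.next()) {
        // only one batch of rows is held in memory at any moment
    }
    rs.close();
    st.close();
    con.commit();
}
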

On Fri, 2002-10-11 at 10:30, Aaron Mulder wrote:
>     What would be the disadvantage of making the JDBC driver use a
> cursor under the covers (always)?  Is it significantly slower or more
> resource-intensive than fetching all the data at once?  Certainly it seems
> like it would save memory in some cases.
>
> Aaron
>
> On 10 Oct 2002, Dave Cramer wrote:
> > Nick,
> >
> > Use a cursor, the current driver doesn't support caching, the backend
> > gives you everything you ask for, you can't just say you want a limited
> > set.
> >
> > So if you use cursors you can fetch a subset
> >
> > Dave
> > On Thu, 2002-10-10 at 11:24, Nick Fankhauser wrote:
> > >
> > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > with one of the fields being varchar(500). I get an out of memory error from
> > > java.
> > >
> > > If the whole ResultSet gets stashed in memory, this isn't really surprising,
> > > but I'm wondering why this happens (if it does), rather than a subset around
> > > the current record being cached and other rows being retrieved as needed.
> > >
> > > If it turns out that there are good reasons for it to all be in memory, then
> > > my question is whether there is a better approach that people typically use
> > > in this situation. For now, I'm simply breaking up the select into smaller
> > > chunks, but that approach won't be satisfactory in the long run.
> > >
> > > Thanks
> > >
> > > -Nick
> > >
> > > --------------------------------------------------------------------------
> > > Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
> > > Ray Ontko & Co.     Software Consulting Services     http://www.ontko.com/
> > >
> > >




Re: Out of memory error on huge resultset

From:
"Michael Paesold"
Date:
Dave Cramer <Dave@micro-automation.net> wrote:

> No disadvantage, in fact that is what we would like to do.
>
>
> setFetchSize(size)  turns on cursor support, otherwise fetch normally
>
> Dave
>
> On Fri, 2002-10-11 at 10:30, Aaron Mulder wrote:
> > What would be the disadvantage of making the JDBC driver use a
> > cursor under the covers (always)?  Is it significantly slower or more
> > resource-intensive than fetching all the data at once?  Certainly it
seems
> > like it would save memory in some cases.
> >
> > Aaron

Well, using a cursor-based result set *always* is not going to work. Cursors
will not be held over a commit, whereas a buffered result set will. So the
setFetchSize...

Regards,
Michael Paesold
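
A sketch of the failure mode Michael is pointing at, assuming a driver that
backs the ResultSet with a server-side cursor (big_table is an invented
name):

import java.sql.*;

static void commitKillsCursor(Connection con) throws SQLException {
    con.setAutoCommit(false);
    Statement st = con.createStatement();
    st.setFetchSize(100);        // imagine a cursor-backed ResultSet
    ResultSet rs = st.executeQuery("SELECT * FROM big_table");
    rs.next();       // fine: rows stream from the open cursor
    con.commit();    // commit closes the (non-holdable) cursor...
    rs.next();       // ...so refilling the row buffer here would fail
}
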


Re: Out of memory error on huge resultset

From:
nferrier@tapsellferrier.co.uk
Date:
Dave Cramer <Dave@micro-automation.net> writes:

> No disadvantage, in fact that is what we would like to do.
>
>
> setFetchSize(size)  turns on cursor support, otherwise fetch
> normally

I love PostgreSQL's default behaviour: it's great to know
that a server-side cursor resource is probably not hanging around for
simple queries.



Nic

Re: Out of memory error on huge resultset

From:
Dave Cramer
Date:
Michael,

You are correct: commit will effectively close the cursor.

However, this is the only way to deal with large result sets.

Dave
On Fri, 2002-10-11 at 10:44, Michael Paesold wrote:
> Dave Cramer <Dave@micro-automation.net> wrote:
>
> > No disadvantage, in fact that is what we would like to do.
> >
> >
> > setFetchSize(size)  turns on cursor support, otherwise fetch normally
> >
> > Dave
> >
> > On Fri, 2002-10-11 at 10:30, Aaron Mulder wrote:
> > > What would be the disadvantage of making the JDBC driver use a
> > > cursor under the covers (always)?  Is it significantly slower or more
> > > resource-intensive than fetching all the data at once?  Certainly it
> seems
> > > like it would save memory in some cases.
> > >
> > > Aaron
>
> Well, using a cursor based result set *always* is not going to work. Cursors
> will not be held over a commit, whereas a buffer result set will. So the
> setFetchSize..
>
> Regards,
> Michael Paesold
>
>



Re: Out of memory error on huge resultset

From:
Dave Cramer
Date:
Just so you know, this isn't implemented yet. My reference to
setFetchSize below was just a suggestion as to how it could be implemented.

Dave
On Fri, 2002-10-11 at 10:52, nferrier@tapsellferrier.co.uk wrote:
> Dave Cramer <Dave@micro-automation.net> writes:
>
> > No disadvantage, in fact that is what we would like to do.
> >
> >
> > setFetchSize(size)  turns on cursor support, otherwise fetch
> > normally
>
> I love PostgreSQL's default behaviour: it's great to know
> that a server side cursor resource is probably not hanging around for
> simple querys.
>
>
>
> Nic
>
>



Re: Out of memory error on huge resultset

From:
nferrier@tapsellferrier.co.uk
Date:
Dave Cramer <Dave@micro-automation.net> writes:

> Just so you know this isn't implemented yet. My reference to
> setFetchSize below was just a suggestion as to how to implement it

I'll do it if you like.


Nic

Re: Out of memory error on huge resultset

From:
Dave Cramer
Date:
Nic,

That would be great!

Dave
On Fri, 2002-10-11 at 10:57, nferrier@tapsellferrier.co.uk wrote:
> Dave Cramer <Dave@micro-automation.net> writes:
>
> > Just so you know this isn't implemented yet. My reference to
> > setFetchSize below was just a suggestion as to how to implement it
>
> I'll do it if you like.
>
>
> Nic
>
>



Re: Out of memory error on huge resultset

From:
Doug Fields
Date:
At 08:27 AM 10/11/2002, snpe wrote:
>Barry,
>   Is it true ?
>I create table with one column varchar(500) and enter 1 milion rows with
>length 10-20 character.JDBC query 'select * from a' get error 'out of
>memory', but psql not.
>I insert 8 milion rows and psql work fine yet (slow, but work)

The way the code works in JDBC is, in my opinion, a little poor but
possibly mandated by JDBC design specs.

It reads the entire result set from the database backend and caches it in a
horrible Vector (which should really be a List and which should at least
make an attempt to get the # of rows ahead of time to avoid all the
resizing problems).

Then, it doles it out from memory as you go through the ResultSet with the
next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
WHOLE THING - through the result set as each row is returned from the
backend, thus ensuring that you never use much more memory than one line.
EVEN IF you have to keep the connection locked.

The latter is what I expected it to do. The former is what it does. So, it
necessitates turning EVERY SELECT query which you think has more than
a few rows (or which you think COULD have more than a few rows, "few" being
defined by our VM memory limits) into a cursor-based query. Really kludgy. I
intend to write a class to do that for every SELECT query automatically.

Cheers,

Doug
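
Such a wrapper might look roughly like the sketch below; the class name
CursorQuery and its interface are invented here, not an existing driver
API:

import java.sql.*;

public class CursorQuery {
    private static int seq = 0;
    private final Statement st;
    private final String name;

    public CursorQuery(Connection con, String select) throws SQLException {
        con.setAutoCommit(false);              // the cursor must live in a transaction
        name = "cq_" + (seq++);
        st = con.createStatement();
        st.execute("DECLARE " + name + " CURSOR FOR " + select);
    }

    /** Fetch the next batch; an empty ResultSet means the query is exhausted. */
    public ResultSet nextBatch(int rows) throws SQLException {
        return st.executeQuery("FETCH FORWARD " + rows + " FROM " + name);
    }

    public void close() throws SQLException {
        st.execute("CLOSE " + name);
        st.close();
    }
}
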


>In C library is 'execute query' without fetch - in jdbc execute fetch all
>rows
>and this is problem - I think that executequery must prepare query and fetch
>(ResultSet.next or ...) must fetch only fetchSize rows.
>I am not sure, but I think that is problem with jdbc, not postgresql
>Hackers ?
>Does psql fetch all rows and if not how many  ?
>Can I change fetch size in psql ?
>CURSOR , FETCH and MOVE isn't solution.
>If I use jdbc in third-party IDE, I can't force this solution
>
>regards
>
>On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > Nick,
> >
> > This has been discussed before on this list many times.  But the short
> > answer is that that is how the postgres server handles queries.  If you
> > issue a query the server will return the entire result.  (try the same
> > query in psql and you will have the same problem).  To work around this
> > you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> > sql commands for postgres).
> >
> > thanks,
> > --Barry
> >
> > Nick Fankhauser wrote:
> > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > with one of the fields being varchar(500). I get an out of memory error
> > > from java.
> > >
> > > If the whole ResultSet gets stashed in memory, this isn't really
> > > surprising, but I'm wondering why this happens (if it does), rather than
> > > a subset around the current record being cached and other rows being
> > > retrieved as needed.
> > >
> > > If it turns out that there are good reasons for it to all be in memory,
> > > then my question is whether there is a better approach that people
> > > typically use in this situation. For now, I'm simply breaking up the
> > > select into smaller chunks, but that approach won't be satisfactory in
> > > the long run.
> > >
> > > Thanks
> > >
> > > -Nick
> > >
> > > -------------------------------------------------------------------------
> > >- Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax
> > > 1.765.962.9788 Ray Ontko & Co.     Software Consulting Services
> > > http://www.ontko.com/
> > >
> > >



Re: Out of memory error on huge resultset

From:
Dave Cramer
Date:
This really is an artifact of the way that postgres gives us the data.

When you query the backend you get *all* of the results in the query,
and there is no indication of how many results you are going to get. In
simple selects it would be possible to get some idea by using
count(field), but this wouldn't work nearly enough times to make it
useful. So that leaves us with using cursors, which still won't tell you
how many rows you are getting back, but at least you won't have the
memory problems.

This approach is far from trivial, which is why it hasn't been
implemented as of yet; keep in mind that result sets support things like
move(n), first(), last(), the last of which will be the trickiest. Not
to mention updateable result sets.

As it turns out there is a mechanism to get to the end (MOVE 0 IN
cursorname), which is currently considered a bug.

Dave
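
For what it's worth, the quirk Dave mentions could be scripted directly. A
hedged sketch, since the MOVE 0 behaviour is version-dependent and, as he
says, considered a bug (big_table is an invented name, autocommit assumed
off):

import java.sql.*;

static ResultSet lastRow(Connection con) throws SQLException {
    Statement st = con.createStatement();
    st.execute("DECLARE c CURSOR FOR SELECT * FROM big_table");
    st.execute("MOVE 0 IN c");                // the quirk: jumps past the last row
    return st.executeQuery("FETCH BACKWARD 1 FROM c");  // i.e. the final row
}
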

On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> At 08:27 AM 10/11/2002, snpe wrote:
> >Barry,
> >   Is it true ?
> >I create table with one column varchar(500) and enter 1 milion rows with
> >length 10-20 character.JDBC query 'select * from a' get error 'out of
> >memory', but psql not.
> >I insert 8 milion rows and psql work fine yet (slow, but work)
>
> The way the code works in JDBC is, in my opinion, a little poor but
> possibly mandated by JDBC design specs.
>
> It reads the entire result set from the database backend and caches it in a
> horrible Vector (which should really be a List and which should at least
> make an attempt to get the # of rows ahead of time to avoid all the
> resizing problems).
>
> Then, it doles it out from memory as you go through the ResultSet with the
> next() method.
>
> I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> WHOLE THING - through the result set as each row is returned from the
> backend, thus ensuring that you never use much more memory than one line.
> EVEN IF you have to keep the connection locked.
>
> The latter is what I expected it to do. The former is what it does. So, it
> necessitates you creating EVERY SELECT query which you think has more than
> a few rows (or which you think COULD have more than a few rows, "few" being
> defined by our VM memory limits) into a cursor based query. Really klugy. I
> intend to write a class to do that for every SELECT query for me automatically.
>
> Cheers,
>
> Doug
>
>
> >In C library is 'execute query' without fetch - in jdbc execute fetch all
> >rows
> >and this is problem - I think that executequery must prepare query and fetch
> >(ResultSet.next or ...) must fetch only fetchSize rows.
> >I am not sure, but I think that is problem with jdbc, not postgresql
> >Hackers ?
> >Does psql fetch all rows and if not how many  ?
> >Can I change fetch size in psql ?
> >CURSOR , FETCH and MOVE isn't solution.
> >If I use jdbc in third-party IDE, I can't force this solution
> >
> >regards
> >
> >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > Nick,
> > >
> > > This has been discussed before on this list many times.  But the short
> > > answer is that that is how the postgres server handles queries.  If you
> > > issue a query the server will return the entire result.  (try the same
> > > query in psql and you will have the same problem).  To work around this
> > > you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> > > sql commands for postgres).
> > >
> > > thanks,
> > > --Barry
> > >
> > > Nick Fankhauser wrote:
> > > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > > with one of the fields being varchar(500). I get an out of memory error
> > > > from java.
> > > >
> > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > surprising, but I'm wondering why this happens (if it does), rather than
> > > > a subset around the current record being cached and other rows being
> > > > retrieved as needed.
> > > >
> > > > If it turns out that there are good reasons for it to all be in memory,
> > > > then my question is whether there is a better approach that people
> > > > typically use in this situation. For now, I'm simply breaking up the
> > > > select into smaller chunks, but that approach won't be satisfactory in
> > > > the long run.
> > > >
> > > > Thanks
> > > >
> > > > -Nick
> > > >
> > > > -------------------------------------------------------------------------
> > > >- Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax
> > > > 1.765.962.9788 Ray Ontko & Co.     Software Consulting Services
> > > > http://www.ontko.com/
> > > >
> > > >




Re: Out of memory error on huge resultset

From:
snpe
Date:
Hello,
  Does this mean that psql uses cursors?

regards
Haris Peco
On Friday 11 October 2002 05:58 pm, Dave Cramer wrote:
> This really is an artifact of the way that postgres gives us the data.
>
> When you query the backend you get *all* of the results in the query,
> and there is no indication of how many results you are going to get. In
> simple selects it would be possible to get some idea by using
> count(field), but this wouldn't work nearly enough times to make it
> useful. So that leaves us with using cursors, which still won't tell you
> how many rows you are getting back, but at least you won't have the
> memory problems.
>
> This approach is far from trivial which is why it hasn't been
> implemented as of yet, keep in mind that result sets support things like
> move(n), first(), last(), the last of which will be the trickiest. Not
> to mention updateable result sets.
>
> As it turns out there is a mechanism to get to the end move 0 in
> 'cursor', which currently is being considered a bug.
>
> Dave
>
> On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > At 08:27 AM 10/11/2002, snpe wrote:
> > >Barry,
> > >   Is it true ?
> > >I create table with one column varchar(500) and enter 1 milion rows with
> > >length 10-20 character.JDBC query 'select * from a' get error 'out of
> > >memory', but psql not.
> > >I insert 8 milion rows and psql work fine yet (slow, but work)
> >
> > The way the code works in JDBC is, in my opinion, a little poor but
> > possibly mandated by JDBC design specs.
> >
> > It reads the entire result set from the database backend and caches it in
> > a horrible Vector (which should really be a List and which should at
> > least make an attempt to get the # of rows ahead of time to avoid all the
> > resizing problems).
> >
> > Then, it doles it out from memory as you go through the ResultSet with
> > the next() method.
> >
> > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> > WHOLE THING - through the result set as each row is returned from the
> > backend, thus ensuring that you never use much more memory than one line.
> > EVEN IF you have to keep the connection locked.
> >
> > The latter is what I expected it to do. The former is what it does. So,
> > it necessitates you creating EVERY SELECT query which you think has more
> > than a few rows (or which you think COULD have more than a few rows,
> > "few" being defined by our VM memory limits) into a cursor based query.
> > Really klugy. I intend to write a class to do that for every SELECT query
> > for me automatically.
> >
> > Cheers,
> >
> > Doug
> >
> > >In C library is 'execute query' without fetch - in jdbc execute fetch
> > > all rows
> > >and this is problem - I think that executequery must prepare query and
> > > fetch (ResultSet.next or ...) must fetch only fetchSize rows.
> > >I am not sure, but I think that is problem with jdbc, not postgresql
> > >Hackers ?
> > >Does psql fetch all rows and if not how many  ?
> > >Can I change fetch size in psql ?
> > >CURSOR , FETCH and MOVE isn't solution.
> > >If I use jdbc in third-party IDE, I can't force this solution
> > >
> > >regards
> > >
> > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > Nick,
> > > >
> > > > This has been discussed before on this list many times.  But the
> > > > short answer is that that is how the postgres server handles queries.
> > > >  If you issue a query the server will return the entire result.  (try
> > > > the same query in psql and you will have the same problem).  To work
> > > > around this you can use explicit cursors (see the DECLARE CURSOR,
> > > > FETCH, and MOVE sql commands for postgres).
> > > >
> > > > thanks,
> > > > --Barry
> > > >
> > > > Nick Fankhauser wrote:
> > > > > I'm selecting a huge ResultSet from our database- about one million
> > > > > rows, with one of the fields being varchar(500). I get an out of
> > > > > memory error from java.
> > > > >
> > > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > > surprising, but I'm wondering why this happens (if it does), rather
> > > > > than a subset around the current record being cached and other rows
> > > > > being retrieved as needed.
> > > > >
> > > > > If it turns out that there are good reasons for it to all be in
> > > > > memory, then my question is whether there is a better approach that
> > > > > people typically use in this situation. For now, I'm simply
> > > > > breaking up the select into smaller chunks, but that approach won't
> > > > > be satisfactory in the long run.
> > > > >
> > > > > Thanks
> > > > >
> > > > > -Nick
> > > > >
> > > > > -------------------------------------------------------------------
> > > > >------ - Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax
> > > > > 1.765.962.9788 Ray Ontko & Co.     Software Consulting Services
> > > > > http://www.ontko.com/


Re: Out of memory error on huge resultset

From:
Barry Lind
Date:

Doug Fields wrote:

>
> It reads the entire result set from the database backend and caches it
> in a horrible Vector (which should really be a List and which should at
> least make an attempt to get the # of rows ahead of time to avoid all
> the resizing problems).
>

The problem here is that we would then need two completely different
implementations for jdbc1 and jdbc2/3, since List is not part of jdk1.1.
We could build our own List implementation that works on jdk1.1, but I
am not sure the gain in performance is worth it.  If you could do some
testing and come back with some numbers of the differences in
performance between ResultSets implemented with Vectors and Lists, that
would probably give us enough information to gauge how to proceed on
this suggested improvement.

> Then, it doles it out from memory as you go through the ResultSet with
> the next() method.
>
> I would have hoped (but was wrong) that it streamed - WITHOUT LOADING
> THE WHOLE THING - through the result set as each row is returned from
> the backend, thus ensuring that you never use much more memory than one
> line. EVEN IF you have to keep the connection locked.
>

This had actually been tried in the past (just getting the records from
the server connection as requested), but this behavior violates the spec
and broke many people's applications.  The problem is that if you don't
use cursors, you end up tying up the connection until you finish
fetching all rows.  So code like the following no longer works:

ResultSet rs = stmt.executeQuery("SELECT ...");
while (rs.next()) {
    // get some values from the result
    ps.setObject(1, rs.getObject(1));
    // use them to update/insert some other table via the PreparedStatement
    ps.executeUpdate();
}

Since the connection is locked until all the results are fetched, you
can't use the connection to perform the update/insert you want to do for
each iteration of the loop.

> The latter is what I expected it to do. The former is what it does. So,
> it necessitates you creating EVERY SELECT query which you think has more
> than a few rows (or which you think COULD have more than a few rows,
> "few" being defined by our VM memory limits) into a cursor based query.
> Really klugy. I intend to write a class to do that for every SELECT
> query for me automatically.
>
> Cheers,
>
> Doug
>

--Barry
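
The usual way around that lock is a second connection for the writes while
the first one streams; a sketch with invented table names (src, dst):

import java.sql.*;

static void copyRows(Connection readCon, Connection writeCon)
        throws SQLException {
    Statement st = readCon.createStatement();
    ResultSet rs = st.executeQuery("SELECT id, val FROM src");
    PreparedStatement ps =
            writeCon.prepareStatement("INSERT INTO dst (id, val) VALUES (?, ?)");
    while (rs.next()) {
        ps.setInt(1, rs.getInt(1));
        ps.setString(2, rs.getString(2));
        ps.executeUpdate();   // runs on writeCon; readCon stays busy streaming
    }
    rs.close();
    ps.close();
}
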



Re: Out of memory error on huge resultset

From:
Aaron Mulder
Date:
    It wouldn't be bad to start with a naive implementation of
last()...  If the only problem we have is that last() doesn't perform
well, we're probably making good progress.  :)
    On the other hand, I would think the updateable result sets would
be the most challenging; does the server provide any analogous features
with its cursors?

Aaron

On 11 Oct 2002, Dave Cramer wrote:
> This really is an artifact of the way that postgres gives us the data.
>
> When you query the backend you get *all* of the results in the query,
> and there is no indication of how many results you are going to get. In
> simple selects it would be possible to get some idea by using
> count(field), but this wouldn't work nearly enough times to make it
> useful. So that leaves us with using cursors, which still won't tell you
> how many rows you are getting back, but at least you won't have the
> memory problems.
>
> This approach is far from trivial which is why it hasn't been
> implemented as of yet, keep in mind that result sets support things like
> move(n), first(), last(), the last of which will be the trickiest. Not
> to mention updateable result sets.
>
> As it turns out there is a mechanism to get to the end move 0 in
> 'cursor', which currently is being considered a bug.
>
> Dave
>
> On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > At 08:27 AM 10/11/2002, snpe wrote:
> > >Barry,
> > >   Is it true ?
> > >I create table with one column varchar(500) and enter 1 milion rows with
> > >length 10-20 character.JDBC query 'select * from a' get error 'out of
> > >memory', but psql not.
> > >I insert 8 milion rows and psql work fine yet (slow, but work)
> >
> > The way the code works in JDBC is, in my opinion, a little poor but
> > possibly mandated by JDBC design specs.
> >
> > It reads the entire result set from the database backend and caches it in a
> > horrible Vector (which should really be a List and which should at least
> > make an attempt to get the # of rows ahead of time to avoid all the
> > resizing problems).
> >
> > Then, it doles it out from memory as you go through the ResultSet with the
> > next() method.
> >
> > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> > WHOLE THING - through the result set as each row is returned from the
> > backend, thus ensuring that you never use much more memory than one line.
> > EVEN IF you have to keep the connection locked.
> >
> > The latter is what I expected it to do. The former is what it does. So, it
> > necessitates you creating EVERY SELECT query which you think has more than
> > a few rows (or which you think COULD have more than a few rows, "few" being
> > defined by our VM memory limits) into a cursor based query. Really klugy. I
> > intend to write a class to do that for every SELECT query for me automatically.
> >
> > Cheers,
> >
> > Doug
> >
> >
> > >In C library is 'execute query' without fetch - in jdbc execute fetch all
> > >rows
> > >and this is problem - I think that executequery must prepare query and fetch
> > >(ResultSet.next or ...) must fetch only fetchSize rows.
> > >I am not sure, but I think that is problem with jdbc, not postgresql
> > >Hackers ?
> > >Does psql fetch all rows and if not how many  ?
> > >Can I change fetch size in psql ?
> > >CURSOR , FETCH and MOVE isn't solution.
> > >If I use jdbc in third-party IDE, I can't force this solution
> > >
> > >regards
> > >
> > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > Nick,
> > > >
> > > > This has been discussed before on this list many times.  But the short
> > > > answer is that that is how the postgres server handles queries.  If you
> > > > issue a query the server will return the entire result.  (try the same
> > > > query in psql and you will have the same problem).  To work around this
> > > > you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> > > > sql commands for postgres).
> > > >
> > > > thanks,
> > > > --Barry
> > > >
> > > > Nick Fankhauser wrote:
> > > > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > > > with one of the fields being varchar(500). I get an out of memory error
> > > > > from java.
> > > > >
> > > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > > surprising, but I'm wondering why this happens (if it does), rather than
> > > > > a subset around the current record being cached and other rows being
> > > > > retrieved as needed.
> > > > >
> > > > > If it turns out that there are good reasons for it to all be in memory,
> > > > > then my question is whether there is a better approach that people
> > > > > typically use in this situation. For now, I'm simply breaking up the
> > > > > select into smaller chunks, but that approach won't be satisfactory in
> > > > > the long run.
> > > > >
> > > > > Thanks
> > > > >
> > > > > -Nick
> > > > >
> > > > > -------------------------------------------------------------------------
> > > > >- Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax
> > > > > 1.765.962.9788 Ray Ontko & Co.     Software Consulting Services
> > > > > http://www.ontko.com/
> > > > >
> > > > >


Re: Out of memory error on huge resultset

From:
Dror Matalon
Date:
Hi,

I'm jumping in late into this discussion, but ...

In my mind a lot of these features break the model. From an application
perspective, if I want to do last(), I do a count(*) and then I do a fetch
with LIMIT; not quite the same, but all these methods of fetching the
whole data locally and manipulating it to a large extent defeat the
purpose. Let the backend do the work, instead of trying to replicate the
functionality in JDBC.

That said, I do understand that some of these are required by the JDBC 2.0
spec.

Dror
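
Concretely, the count-then-limit idiom Dror describes might look like this
(items and id are invented names; an ORDER BY is needed for "last" to be
well defined, and the table can of course change between the two
statements):

import java.sql.*;

static ResultSet lastTen(Connection con) throws SQLException {
    Statement st = con.createStatement();
    ResultSet c = st.executeQuery("SELECT count(*) FROM items");
    c.next();
    long total = c.getLong(1);
    c.close();
    // fetch the tail directly instead of scrolling a ResultSet to last()
    return st.executeQuery("SELECT * FROM items ORDER BY id"
            + " LIMIT 10 OFFSET " + Math.max(0, total - 10));
}
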

On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:
>     It wouldn't be bad to start with a naive implementation of
> last()...  If the only problem we have is that last() doesn't perform
> well, we're probably making good progress.  :)
>     On the other hand, I would think the updateable result sets would
> be the most challenging; does the server provide any analogous features
> with its cursors?
>
> Aaron
>
> On 11 Oct 2002, Dave Cramer wrote:
> > This really is an artifact of the way that postgres gives us the data.
> >
> > When you query the backend you get *all* of the results in the query,
> > and there is no indication of how many results you are going to get. In
> > simple selects it would be possible to get some idea by using
> > count(field), but this wouldn't work nearly enough times to make it
> > useful. So that leaves us with using cursors, which still won't tell you
> > how many rows you are getting back, but at least you won't have the
> > memory problems.
> >
> > This approach is far from trivial which is why it hasn't been
> > implemented as of yet, keep in mind that result sets support things like
> > move(n), first(), last(), the last of which will be the trickiest. Not
> > to mention updateable result sets.
> >
> > As it turns out there is a mechanism to get to the end move 0 in
> > 'cursor', which currently is being considered a bug.
> >
> > Dave
> >
> > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > At 08:27 AM 10/11/2002, snpe wrote:
> > > >Barry,
> > > >   Is it true ?
> > > >I create table with one column varchar(500) and enter 1 milion rows with
> > > >length 10-20 character.JDBC query 'select * from a' get error 'out of
> > > >memory', but psql not.
> > > >I insert 8 milion rows and psql work fine yet (slow, but work)
> > >
> > > The way the code works in JDBC is, in my opinion, a little poor but
> > > possibly mandated by JDBC design specs.
> > >
> > > It reads the entire result set from the database backend and caches it in a
> > > horrible Vector (which should really be a List and which should at least
> > > make an attempt to get the # of rows ahead of time to avoid all the
> > > resizing problems).
> > >
> > > Then, it doles it out from memory as you go through the ResultSet with the
> > > next() method.
> > >
> > > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> > > WHOLE THING - through the result set as each row is returned from the
> > > backend, thus ensuring that you never use much more memory than one line.
> > > EVEN IF you have to keep the connection locked.
> > >
> > > The latter is what I expected it to do. The former is what it does. So, it
> > > necessitates you creating EVERY SELECT query which you think has more than
> > > a few rows (or which you think COULD have more than a few rows, "few" being
> > > defined by our VM memory limits) into a cursor based query. Really klugy. I
> > > intend to write a class to do that for every SELECT query for me automatically.
> > >
> > > Cheers,
> > >
> > > Doug
> > >
> > >
> > > >In C library is 'execute query' without fetch - in jdbc execute fetch all
> > > >rows
> > > >and this is problem - I think that executequery must prepare query and fetch
> > > >(ResultSet.next or ...) must fetch only fetchSize rows.
> > > >I am not sure, but I think that is problem with jdbc, not postgresql
> > > >Hackers ?
> > > >Does psql fetch all rows and if not how many  ?
> > > >Can I change fetch size in psql ?
> > > >CURSOR , FETCH and MOVE isn't solution.
> > > >If I use jdbc in third-party IDE, I can't force this solution
> > > >
> > > >regards
> > > >
> > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > Nick,
> > > > >
> > > > > This has been discussed before on this list many times.  But the short
> > > > > answer is that that is how the postgres server handles queries.  If you
> > > > > issue a query the server will return the entire result.  (try the same
> > > > > query in psql and you will have the same problem).  To work around this
> > > > > you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> > > > > sql commands for postgres).
> > > > >
> > > > > thanks,
> > > > > --Barry
> > > > >
> > > > > Nick Fankhauser wrote:
> > > > > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > > > > with one of the fields being varchar(500). I get an out of memory error
> > > > > > from java.
> > > > > >
> > > > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > > > surprising, but I'm wondering why this happens (if it does), rather than
> > > > > > a subset around the current record being cached and other rows being
> > > > > > retrieved as needed.
> > > > > >
> > > > > > If it turns out that there are good reasons for it to all be in memory,
> > > > > > then my question is whether there is a better approach that people
> > > > > > typically use in this situation. For now, I'm simply breaking up the
> > > > > > select into smaller chunks, but that approach won't be satisfactory in
> > > > > > the long run.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > -Nick
> > > > > >
> > > > > > -------------------------------------------------------------------------
> > > > > >- Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax
> > > > > > 1.765.962.9788 Ray Ontko & Co.     Software Consulting Services
> > > > > > http://www.ontko.com/
> > > > > >
> > > > > >

--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

Re: Out of memory error on huge resultset

From:
Dave Cramer
Date:
Agreed, but there are selects where count(*) won't work. Even so, what
we are talking about here is hiding the implementation of cursors behind
the result set. What I would envision is some sort of caching where,
when the user sets the fetch size to 10 for instance, we do the select,
and when they ask for next() we check to see if we have these rows in
the cache, and go get them if necessary 10 at a time, possibly keeping
one set of ten behind where we are and one set of ten ahead of where we
are. So, recalling that result sets have absolute positioning, as well as
first() and last() positioning, we need the ability to move with the
minimum number of trips to the backend.

As it turns out the MOVE command in postgres does support moving to the
end (MOVE 0); at the moment this is considered a bug, and is on the
todo list to be removed. I expect we can get some sort of implementation
which allows us to move to the end (MOVE END).

Dave
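
A rough sketch of the windowed cache Dave envisions, as driver-internal
code might express it; the class is hypothetical, and MOVE ABSOLUTE stands
in for whatever positioning command the backend ends up supporting:

import java.sql.*;
import java.util.ArrayList;
import java.util.List;

class RowWindow {
    private final Statement st;
    private final String cursor;
    private final int size;
    private final List<Object[]> rows = new ArrayList<Object[]>();
    private int start = 1;    // absolute number of the first cached row

    RowWindow(Statement st, String cursor, int size) {
        this.st = st;
        this.cursor = cursor;
        this.size = size;
    }

    /** Return absolute row 'pos', sliding the cache with one MOVE and one FETCH. */
    Object[] row(int pos) throws SQLException {
        if (pos < start || pos >= start + rows.size()) {
            st.execute("MOVE ABSOLUTE " + (pos - 1) + " IN " + cursor);
            ResultSet rs = st.executeQuery(
                    "FETCH FORWARD " + size + " FROM " + cursor);
            rows.clear();
            int cols = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                Object[] r = new Object[cols];
                for (int i = 0; i < cols; i++) r[i] = rs.getObject(i + 1);
                rows.add(r);
            }
            rs.close();
            start = pos;
        }
        int i = pos - start;
        return i < rows.size() ? rows.get(i) : null;   // null: past the end
    }
}
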

On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:
>
> Hi,
>
> I'm jumping in late into this discussion but ...
>
> In my mind a lot of these features break the model. From an application
> prespective, if I want to do last, I do a count(*) and then I do a fetch
> with limit; Not quite the same, but all these methods of fetching the
> whole data locally and manipulating it to a large exten defeat the
> purpose. Let the backend do the work, instead of trying to replicate the
> functionality in JDBC.
>
> That said I do understand that some of these are required by the JDBC 2.0
> spec.
>
> Dror
>
> On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:
> >     It wouldn't be bad to start with a naive implementation of
> > last()...  If the only problem we have is that last() doesn't perform
> > well, we're probably making good progress.  :)
> >     On the other hand, I would think the updateable result sets would
> > be the most challenging; does the server provide any analogous features
> > with its cursors?
> >
> > Aaron
> >
> > On 11 Oct 2002, Dave Cramer wrote:
> > > This really is an artifact of the way that postgres gives us the data.
> > >
> > > When you query the backend you get *all* of the results in the query,
> > > and there is no indication of how many results you are going to get. In
> > > simple selects it would be possible to get some idea by using
> > > count(field), but this wouldn't work nearly enough times to make it
> > > useful. So that leaves us with using cursors, which still won't tell you
> > > how many rows you are getting back, but at least you won't have the
> > > memory problems.
> > >
> > > This approach is far from trivial which is why it hasn't been
> > > implemented as of yet, keep in mind that result sets support things like
> > > move(n), first(), last(), the last of which will be the trickiest. Not
> > > to mention updateable result sets.
> > >
> > > As it turns out there is a mechanism to get to the end move 0 in
> > > 'cursor', which currently is being considered a bug.
> > >
> > > Dave
> > >
> > > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > > At 08:27 AM 10/11/2002, snpe wrote:
> > > > >Barry,
> > > > >   Is it true ?
> > > > >I create table with one column varchar(500) and enter 1 milion rows with
> > > > >length 10-20 character.JDBC query 'select * from a' get error 'out of
> > > > >memory', but psql not.
> > > > >I insert 8 milion rows and psql work fine yet (slow, but work)
> > > >
> > > > The way the code works in JDBC is, in my opinion, a little poor but
> > > > possibly mandated by JDBC design specs.
> > > >
> > > > It reads the entire result set from the database backend and caches it in a
> > > > horrible Vector (which should really be a List and which should at least
> > > > make an attempt to get the # of rows ahead of time to avoid all the
> > > > resizing problems).
> > > >
> > > > Then, it doles it out from memory as you go through the ResultSet with the
> > > > next() method.
> > > >
> > > > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> > > > WHOLE THING - through the result set as each row is returned from the
> > > > backend, thus ensuring that you never use much more memory than one line.
> > > > EVEN IF you have to keep the connection locked.
> > > >
> > > > The latter is what I expected it to do. The former is what it does. So, it
> > > > necessitates you creating EVERY SELECT query which you think has more than
> > > > a few rows (or which you think COULD have more than a few rows, "few" being
> > > > defined by our VM memory limits) into a cursor based query. Really klugy. I
> > > > intend to write a class to do that for every SELECT query for me automatically.
> > > >
> > > > Cheers,
> > > >
> > > > Doug
> > > >
> > > >
> > > > >In C library is 'execute query' without fetch - in jdbc execute fetch all
> > > > >rows
> > > > >and this is problem - I think that executequery must prepare query and fetch
> > > > >(ResultSet.next or ...) must fetch only fetchSize rows.
> > > > >I am not sure, but I think that is problem with jdbc, not postgresql
> > > > >Hackers ?
> > > > >Does psql fetch all rows and if not how many  ?
> > > > >Can I change fetch size in psql ?
> > > > >CURSOR , FETCH and MOVE isn't solution.
> > > > >If I use jdbc in third-party IDE, I can't force this solution
> > > > >
> > > > >regards
> > > > >
> > > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > > Nick,
> > > > > >
> > > > > > This has been discussed before on this list many times.  But the short
> > > > > > answer is that that is how the postgres server handles queries.  If you
> > > > > > issue a query the server will return the entire result.  (try the same
> > > > > > query in psql and you will have the same problem).  To work around this
> > > > > > you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> > > > > > sql commands for postgres).
> > > > > >
> > > > > > thanks,
> > > > > > --Barry
> > > > > >
> > > > > > Nick Fankhauser wrote:
> > > > > > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > > > > > with one of the fields being varchar(500). I get an out of memory error
> > > > > > > from java.
> > > > > > >
> > > > > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > > > > surprising, but I'm wondering why this happens (if it does), rather than
> > > > > > > a subset around the current record being cached and other rows being
> > > > > > > retrieved as needed.
> > > > > > >
> > > > > > > If it turns out that there are good reasons for it to all be in memory,
> > > > > > > then my question is whether there is a better approach that people
> > > > > > > typically use in this situation. For now, I'm simply breaking up the
> > > > > > > select into smaller chunks, but that approach won't be satisfactory in
> > > > > > > the long run.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > -Nick
> > > > > > >
> > > > > > > -------------------------------------------------------------------------
> > > > > > >- Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax
> > > > > > > 1.765.962.9788 Ray Ontko & Co.     Software Consulting Services
> > > > > > > http://www.ontko.com/
> > > > > > >
> > > > > > >
>
> --
> Dror Matalon
> Zapatec Inc
> 1700 MLK Way
> Berkeley, CA 94709
> http://www.zapatec.com
>




Re: Out of memory error on huge resultset

From:
Doug Fields
Date:
>The problem here is that we would then need two completely different
>implementations for jdbc1 and jdbc2/3 since List is not part of
>jdk1.1.  We could build our own List implementation that works on jdk1.1,
>but I am not sure the gain in performance is worth it.  If you could do
>some testing and come back with some numbers of the differences in
>performance between ResultSets implemented with Vectors and Lists that
>would probably give us enough information to guage how to proceed on this
>suggested improvement.

In the past, I have done this sort of thing.

The overhead of a "synchronized" method call is about 7 times that of a
regular method call. I ran many empirical tests of this on JDK 1.3 and 1.4
on Linux (2.2 and 2.4), due to the high performance demands of the software
my firm uses. Of course, the impact depends on how many times you invoke
those methods and how fast they are otherwise. I'm not willing to redo those
tests for PostgreSQL, but I have to imagine that scrapping JDK 1.1 support
would not be a bad thing, and may even be a good thing. Anyone still using
JDK 1.1 is probably also using it in conjunction with other products from
that era, so keeping a modern product compatible with a very out-of-date one
makes no sense in my estimation.
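
As a rough illustration only - Doug's actual harness was not posted, so the
class and method names below are assumptions, and the ~7x ratio is specific
to VMs of that era (a modern JIT may largely elide an uncontended lock) - a
micro-benchmark of this kind looks like:

public class SyncBench {
    private long n;

    // Plain method call.
    void plain() { n++; }

    // Same work behind an uncontended monitor.
    synchronized void locked() { n++; }

    public static void main(String[] args) {
        SyncBench b = new SyncBench();
        int iters = 10000000;

        long t0 = System.currentTimeMillis();
        for (int i = 0; i < iters; i++) b.plain();
        long plainMs = System.currentTimeMillis() - t0;

        t0 = System.currentTimeMillis();
        for (int i = 0; i < iters; i++) b.locked();
        long lockedMs = System.currentTimeMillis() - t0;

        System.out.println("plain: " + plainMs + " ms, synchronized: "
            + lockedMs + " ms");
    }
}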

I don't make policy, though - that seems to be your job generally, Barry. :)

>This had actually been tried in the past (just getting the records from
>the server connection as requested), but this behavior violates the spec
>and broke many people's applications.  The problem is that if you don't use
>cursors, you end up tying up the connection until you finish fetching all
>rows.  So code like the following no longer works:
>
>ResultSet rs = stmt.executeQuery(query);   // get result set
>while (rs.next()) {
>    // get some values from the result
>    // use them to update/insert some other table using a PreparedStatement
>}
>
>Since the connection is locked until all the results are fetched, you
>can't use the connection to perform the update/insert you want to do for
>each iteration of the loop.

Agreed on this point. Nonetheless, even though this may break the spec and
should not be the default behavior, it should be an option, because the way
the driver currently works is a horror for anyone who has to deal with large
result sets (as I do on a regular basis).

I don't mind keeping a connection locked for ages. I do mind running out of
VM space for a large result set which is streamed FROM THE DATABASE SERVER.
Result sets should have the ability to be streamed end to end, IMO - even if
it's a non-standard extension or an option set when the connection or the
statement is created.

Again, I don't make policy, and I'm insufficiently motivated to do it
myself. That doesn't invalidate my opinion, but I won't kvetch about it
either. I just designed a class that does the same thing by taking a query
and turning it into a cursor-based query.
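
Doug's class itself was never posted to the list; as a sketch of what such a
wrapper might look like - the class name, the cursor name "big_cursor", and
the per-row callback are illustrative assumptions, not his code:

import java.sql.*;

// Wraps an arbitrary SELECT in a server-side cursor so that at most
// fetchSize rows are ever held in memory at once.
public class CursorQuery {

    // Callback invoked once per row.
    public interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    private final Connection conn;
    private final int fetchSize;

    public CursorQuery(Connection conn, int fetchSize) {
        this.conn = conn;
        this.fetchSize = fetchSize;
    }

    public void run(String select, RowHandler handler) throws SQLException {
        conn.setAutoCommit(false);  // cursors only live inside a transaction
        Statement st = conn.createStatement();
        try {
            st.execute("DECLARE big_cursor CURSOR FOR " + select);
            while (true) {
                ResultSet rs = st.executeQuery(
                    "FETCH " + fetchSize + " FROM big_cursor");
                int rows = 0;
                while (rs.next()) {
                    rows++;
                    handler.handle(rs);  // caller sees one row at a time
                }
                rs.close();
                if (rows < fetchSize) break;  // short batch: cursor drained
            }
            st.execute("CLOSE big_cursor");
            conn.commit();
        } finally {
            st.close();
        }
    }
}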

Cheers,

Doug


Re: Out of memory error on huge resultset

From
Dave Cramer
Date:
No,

psql doesn't have to store the rows, only display them

Dave
On Fri, 2002-10-11 at 12:48, snpe wrote:
> Hello,
>   Does it mean that psql uses cursors?
>
> regards
> Haris Peco
> On Friday 11 October 2002 05:58 pm, Dave Cramer wrote:
> > This really is an artifact of the way that postgres gives us the data.
> >
> > When you query the backend you get *all* of the results in the query,
> > and there is no indication of how many results you are going to get. In
> > simple selects it would be possible to get some idea by using
> > count(field), but this wouldn't work nearly enough times to make it
> > useful. So that leaves us with using cursors, which still won't tell you
> > how many rows you are getting back, but at least you won't have the
> > memory problems.
> >
> > This approach is far from trivial, which is why it hasn't been
> > implemented as of yet; keep in mind that result sets support things like
> > move(n), first(), and last(), the last of which will be the trickiest.
> > Not to mention updatable result sets.
> >
> > As it turns out there is a mechanism to get to the end of a cursor
> > (MOVE 0), but that behavior is currently considered a bug.
> >
> > Dave
> >
> > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > At 08:27 AM 10/11/2002, snpe wrote:
> > > >Barry,
> > > >   Is it true?
> > > >I created a table with one column varchar(500) and entered 1 million rows
> > > >with length 10-20 characters. The JDBC query 'select * from a' gets an
> > > >'out of memory' error, but psql does not.
> > > >I inserted 8 million rows and psql still works fine (slow, but it works)
> > >
> > > The way the code works in JDBC is, in my opinion, a little poor but
> > > possibly mandated by JDBC design specs.
> > >
> > > It reads the entire result set from the database backend and caches it in
> > > a horrible Vector (which should really be a List and which should at
> > > least make an attempt to get the # of rows ahead of time to avoid all the
> > > resizing problems).
> > >
> > > Then, it doles it out from memory as you go through the ResultSet with
> > > the next() method.
> > >
> > > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> > > WHOLE THING - through the result set as each row is returned from the
> > > backend, thus ensuring that you never use much more memory than one line.
> > > EVEN IF you have to keep the connection locked.
> > >
> > > The latter is what I expected it to do. The former is what it does. So
> > > it necessitates turning EVERY SELECT query which you think has more
> > > than a few rows (or which you think COULD have more than a few rows,
> > > "few" being defined by our VM memory limits) into a cursor-based query.
> > > Really kludgy. I intend to write a class to do that for every SELECT
> > > query for me automatically.
> > >
> > > Cheers,
> > >
> > > Doug
> > >
> > > >In the C library there is 'execute query' without fetch - in JDBC,
> > > > executeQuery fetches all rows,
> > > >and this is the problem - I think executeQuery should prepare the query,
> > > > and fetch (ResultSet.next or ...) should fetch only fetchSize rows.
> > > >I am not sure, but I think this is a problem with JDBC, not PostgreSQL.
> > > >Hackers?
> > > >Does psql fetch all rows, and if not, how many?
> > > >Can I change the fetch size in psql?
> > > >CURSOR, FETCH and MOVE aren't a solution.
> > > >If I use JDBC in a third-party IDE, I can't force this solution
> > > >
> > > >regards




Re: Out of memory error on huge resultset

From
snpe
Date:
There is the jxdbcon PostgreSQL JDBC driver, which has a setFetchSize method.
The last version doesn't work with pgsql 7.3, and I haven't tested it further.
I will try again in the next day or so, when I download pgsql 7.2.
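
For context, setFetchSize is standard JDBC 2.0 API, so with a driver that
actually honors the hint, capping memory would look roughly like this - the
URL, credentials, and table name are placeholders, and the PostgreSQL driver
of this era simply ignored the hint:

import java.sql.*;

public class FetchSizeDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.postgresql.Driver");
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost/test", "user", "password");
        conn.setAutoCommit(false);  // needed if the hint is backed by a cursor
        Statement st = conn.createStatement();
        st.setFetchSize(100);       // hint: keep about 100 rows resident
        ResultSet rs = st.executeQuery("SELECT * FROM a");
        while (rs.next()) {
            String value = rs.getString(1);  // process one row at a time
        }
        rs.close();
        st.close();
        conn.close();
    }
}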

regards
Haris Peco
On Friday 11 October 2002 07:59 pm, Dave Cramer wrote:
> Agreed, but there are selects where count(*) won't work. Even so, what
> we are talking about here is hiding the implementation of cursors behind
> the result set. What I would envision is some sort of caching where,
> when the user sets the fetch size to 10 for instance, we do the select,
> and when they ask for next() we check to see if we have those rows in
> the cache, and go get them if necessary 10 at a time, possibly keeping
> one set of ten behind where we are and one set of ten ahead of where we
> are. Recalling that ResultSets have absolute positioning, as well as
> first() and last() positioning, we need the ability to move with the
> minimum number of trips to the backend.
>
> As it turns out the MOVE command in postgres does support moving to the
> end (MOVE 0); at the moment this is considered a bug, and is on the
> todo list to be removed. I expect we can get some sort of implementation
> which allows us to move to the end (MOVE END).
>
> Dave
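
A minimal sketch of the windowed, cursor-backed cache Dave describes above -
forward-only for brevity, with hypothetical names (RowWindow, the cursor
"win"); a real driver would also need the absolute, first() and last()
positioning he mentions:

import java.sql.*;
import java.util.Vector;

public class RowWindow {
    private final Statement stmt;
    private final int fetchSize;
    private Vector cache = new Vector();  // current window of rows
    private int pos = -1;                 // index into the cache
    private boolean exhausted = false;

    public RowWindow(Connection conn, String select, int fetchSize)
            throws SQLException {
        this.fetchSize = fetchSize;
        conn.setAutoCommit(false);  // the cursor lives inside a transaction
        stmt = conn.createStatement();
        stmt.execute("DECLARE win CURSOR FOR " + select);
    }

    // Moves to the next row, refilling the window from the cursor when the
    // cached batch is used up; returns false once the cursor is drained.
    public boolean next() throws SQLException {
        pos++;
        if (pos >= cache.size()) {
            if (exhausted) return false;
            refill();
            pos = 0;
        }
        return pos < cache.size();
    }

    public String getString(int column) {
        String[] row = (String[]) cache.elementAt(pos);
        return row[column - 1];  // JDBC columns are 1-based
    }

    private void refill() throws SQLException {
        cache = new Vector();
        ResultSet rs = stmt.executeQuery("FETCH " + fetchSize + " FROM win");
        int cols = rs.getMetaData().getColumnCount();
        while (rs.next()) {
            String[] row = new String[cols];
            for (int i = 1; i <= cols; i++) row[i - 1] = rs.getString(i);
            cache.addElement(row);
        }
        rs.close();
        if (cache.size() < fetchSize) exhausted = true;
    }
}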
>
> On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:
> > Hi,
> >
> > I'm jumping into this discussion late, but ...
> >
> > In my mind a lot of these features break the model. From an application
> > perspective, if I want to do last(), I do a count(*) and then I do a fetch
> > with limit; not quite the same, but all these methods of fetching the
> > whole data locally and manipulating it to a large extent defeat the
> > purpose. Let the backend do the work, instead of trying to replicate the
> > functionality in JDBC.
> >
> > That said, I do understand that some of these are required by the JDBC 2.0
> > spec.
> >
> > Dror
> >
> > On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:
> > >     It wouldn't be bad to start with a naive implementation of
> > > last()...  If the only problem we have is that last() doesn't perform
> > > well, we're probably making good progress.  :)
> > >     On the other hand, I would think the updateable result sets would
> > > be the most challenging; does the server provide any analogous features
> > > with its cursors?
> > >
> > > Aaron


Re: Out of memory error on huge resultset

From
Dave Cramer
Date:
AFAIK, it doesn't work unless the driver is running natively on Linux.
It used to use a native library, but I did see something that looks
like a net protocol?

dave
On Fri, 2002-10-11 at 16:33, snpe wrote:
> There is the jxdbcon PostgreSQL JDBC driver, which has a setFetchSize method.
> The last version doesn't work with pgsql 7.3, and I haven't tested it further.
> I will try again in the next day or so, when I download pgsql 7.2.
>
> regards
> Haris Peco




Re: Out of memory error on huge resultset

From
snpe
Date:
I tried jxdbcon - it doesn't work with a large table either.
The 'out of memory' error happens in executeQuery().

regards
Haris Peco
On Friday 11 October 2002 10:33 pm, snpe wrote:
> There is the jxdbcon PostgreSQL JDBC driver, which has a setFetchSize method.
> The last version doesn't work with pgsql 7.3, and I haven't tested it further.
> I will try again in the next day or so, when I download pgsql 7.2.
>
> regards
> Haris Peco


Re: Out of memory error on huge resultset

From
Dave Cramer
Date:
Looking at their code, the default fetch size is 1000?

Anyway, I think there is sufficient interest in this that we should
have something running here soon.

Dave
On Fri, 2002-10-11 at 17:02, snpe wrote:
> I tried jxdbcon - it doesn't work with a large table either.
> The 'out of memory' error happens in executeQuery().
>
> regards
> Haris Peco




Re: Out of memory error on huge resultset

From
snpe
Date:
I tested Oracle JDeveloper, and the JDBC driver for PostgreSQL works fine now.
Meanwhile, for production systems I have to have setFetchSize for large tables;
I think it is the same with any Java IDE.

The best solution would be to hold only n rows from the backend at a time, but
I don't know whether that is possible.
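
In terms of the RowWindow sketch earlier in this thread, that "only n rows"
behavior would look like this from application code (the connection details
and table name are placeholders):

import java.sql.*;

public class NRowsDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.postgresql.Driver");
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost/test", "user", "password");
        // However large the table, only about 100 rows are resident at once.
        RowWindow win = new RowWindow(conn, "SELECT * FROM a", 100);
        while (win.next()) {
            String value = win.getString(1);  // one row at a time
        }
        conn.close();
    }
}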
regards
Haris Peco

On Friday 11 October 2002 10:38 pm, Dave Cramer wrote:
> Looking at their code, the default fetch size is 1000?
>
> Anyway, I think there is sufficient interest in this that we should
> have something running here soon.
>
> Dave
>
> On Fri, 2002-10-11 at 17:02, snpe wrote:
> > I am tried with jxdbcon - it don't work with large table, too.
> > 'out of memory' is when executeQuery()
> >
> > regards
> > Haris Peco
> >
> > On Friday 11 October 2002 10:33 pm, snpe wrote:
> > > There is jxdbcon Postgresql jdbc driver with setFetchSize method.
> > > Last version don't wokr with pgsql 7.3 and I don't test more.
> > > I will try next day, when I download pgsql 7.2
> > >
> > > regards
> > > Haris Peco
> > >
> > > On Friday 11 October 2002 07:59 pm, Dave Cramer wrote:
> > > > Agreed, but there are selects where count(*) won't work. Even so,
> > > > what we are talking about here is hiding the implementation of
> > > > cursors behind the result set. What I would envision is some sort of
> > > > cacheing where when the user set's the fetchsize to 10 for instance
> > > > we do the select, and when they ask for next() we check to see if we
> > > > have these rows in the cache, and go get them if necessary 10 at a
> > > > time, possibly keeping one set of ten behind where we are and one set
> > > > of 10 ahead of where we are. So recalling that resultSets have
> > > > absolute positioning, as well as first(), and last() positioning we
> > > > need the ability to move with the minimum number of trips to the
> > > > backend.
> > > >
> > > > As it turns out the move command in postgres does support moving to
> > > > the end (move 0 ); at the moment this is considered a bug, and is on
> > > > the todo list to be removed. I expect we can get some sort of
> > > > implementation which allows us to move to the end ( move end )
> > > >
> > > > Dave
> > > >
> > > > On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:
> > > > > Hi,
> > > > >
> > > > > I'm jumping in late into this discussion but ...
> > > > >
> > > > > In my mind a lot of these features break the model. From an
> > > > > application prespective, if I want to do last, I do a count(*) and
> > > > > then I do a fetch with limit; Not quite the same, but all these
> > > > > methods of fetching the whole data locally and manipulating it to a
> > > > > large exten defeat the purpose. Let the backend do the work,
> > > > > instead of trying to replicate the functionality in JDBC.
> > > > >
> > > > > That said I do understand that some of these are required by the
> > > > > JDBC 2.0 spec.
> > > > >
> > > > > Dror
> > > > >
> > > > > On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:
> > > > > >     It wouldn't be bad to start with a naive implementation of
> > > > > > last()...  If the only problem we have is that last() doesn't
> > > > > > perform well, we're probably making good progress.  :)
> > > > > >     On the other hand, I would think the updateable result sets
> > > > > > would be the most challenging; does the server provide any
> > > > > > analogous features with its cursors?
> > > > > >
> > > > > > Aaron
> > > > >
> > > > > --
> > > > > Dror Matalon
> > > > > Zapatec Inc
> > > > > 1700 MLK Way
> > > > > Berkeley, CA 94709
> > > > > http://www.zapatec.com


Re: Out of memory error on huge resultset

From
snpe
Date:
Can you do this:
   We save the first 1000 (or fetchSize) rows from the beginning.
   If the table has < 1000 rows we save all rows, but if the table has more
rows and the user requests row 1001, we fetch 1000 again (from the beginning,
skipping the first 1000 rows, or maybe continue fetching, if that is possible).
  When the user requests the last row we fetch all rows, but save only the
last 1000, etc.

  We save only fetchSize rows and seek from the beginning when the user
requests backward (or maybe seek always when the user requests outside our
'fetchSize' window).

  This is slow for large tables, but it is a solution until the developers
get us a better one from the backend. If the table has < fetchSize rows this
is the same as the current solution, and we can fix a minimal fetchSize for
better performance with small tables.
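
A rough sketch of that window idea, with invented names throughout (this
only illustrates the bookkeeping; it is not driver code):

   import java.sql.SQLException;
   import java.util.ArrayList;
   import java.util.List;

   // Keeps at most fetchSize rows; moving backward past the start of the
   // window forces a re-read of the query from row 0.
   abstract class RowWindow<Row> {
       private final int fetchSize;
       private final List<Row> window = new ArrayList<>();
       private int windowStart = 0; // absolute index of window.get(0)

       RowWindow(int fetchSize) { this.fetchSize = fetchSize; }

       // re-execute the query so reading starts again at row 0
       protected abstract void reopenQuery() throws SQLException;

       // next row from the backend, or null after the last row
       protected abstract Row fetchNextFromBackend() throws SQLException;

       Row absolute(int target) throws SQLException {
           if (target < windowStart) { // backward, out of the window:
               reopenQuery();          // seek from the beginning again
               window.clear();
               windowStart = 0;
           }
           while (target >= windowStart + window.size()) {
               Row next = fetchNextFromBackend();
               if (next == null) return null;    // ran off the end
               if (window.size() == fetchSize) { // slide the window forward
                   window.remove(0);
                   windowStart++;
               }
               window.add(next);
           }
           return window.get(target - windowStart);
       }
   }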

regards
Haris Peco
On Friday 11 October 2002 08:13 pm, Dave Cramer wrote:
> No,
>
> It doesn't have to store them, only display them
>
> Dave
>
> On Fri, 2002-10-11 at 12:48, snpe wrote:
> > Hello,
> >   Does it mean that psql uses cursors ?
> >
> > regards
> > Haris Peco
> >
> > On Friday 11 October 2002 05:58 pm, Dave Cramer wrote:
> > > This really is an artifact of the way that postgres gives us the data.
> > >
> > > When you query the backend you get *all* of the results in the query,
> > > and there is no indication of how many results you are going to get. In
> > > simple selects it would be possible to get some idea by using
> > > count(field), but this wouldn't work nearly enough times to make it
> > > useful. So that leaves us with using cursors, which still won't tell
> > > you how many rows you are getting back, but at least you won't have the
> > > memory problems.
> > >
> > > This approach is far from trivial, which is why it hasn't been
> > > implemented as of yet; keep in mind that result sets support things
> > > like move(n), first(), last(), the last of which will be the trickiest.
> > > Not to mention updateable result sets.
> > >
> > > As it turns out there is a mechanism to get to the end (MOVE 0 in a
> > > cursor), which is currently being considered a bug.
> > >
> > > Dave
> > >
> > > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > > At 08:27 AM 10/11/2002, snpe wrote:
> > > > >Barry,
> > > > >   Is it true ?
> > > > > >I created a table with one column varchar(500) and entered 1 million
> > > > > > rows of length 10-20 characters. The JDBC query 'select * from a'
> > > > > > gets the error 'out of memory', but psql does not.
> > > > > >I inserted 8 million rows and psql still works fine (slow, but it works)
> > > >
> > > > The way the code works in JDBC is, in my opinion, a little poor but
> > > > possibly mandated by JDBC design specs.
> > > >
> > > > It reads the entire result set from the database backend and caches
> > > > it in a horrible Vector (which should really be a List and which
> > > > should at least make an attempt to get the # of rows ahead of time to
> > > > avoid all the resizing problems).
> > > >
> > > > Then, it doles it out from memory as you go through the ResultSet
> > > > with the next() method.
> > > >
> > > > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING
> > > > THE WHOLE THING - through the result set as each row is returned from
> > > > the backend, thus ensuring that you never use much more memory than
> > > > one line. EVEN IF you have to keep the connection locked.
> > > >
> > > > The latter is what I expected it to do. The former is what it does.
> > > > So, it necessitates you creating EVERY SELECT query which you think
> > > > has more than a few rows (or which you think COULD have more than a
> > > > few rows, "few" being defined by our VM memory limits) into a cursor
> > > > based query. Really klugy. I intend to write a class to do that for
> > > > every SELECT query for me automatically.
> > > >
> > > > Cheers,
> > > >
> > > > Doug
> > > >
> > > > > >In the C library there is 'execute query' without fetch - in jdbc
> > > > > > executeQuery fetches all rows
> > > > > >and this is the problem - I think that executeQuery must prepare the
> > > > > > query and fetch (ResultSet.next or ...) must fetch only fetchSize
> > > > > > rows. I am not sure, but I think this is a problem with jdbc, not
> > > > > > postgresql. Hackers ?
> > > > > >Does psql fetch all rows, and if not, how many ?
> > > > > >Can I change the fetch size in psql ?
> > > > > >CURSOR , FETCH and MOVE aren't a solution.
> > > > > >If I use jdbc in a third-party IDE, I can't force this solution
> > > > >
> > > > >regards
> > > > >
> > > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > > Nick,
> > > > > >
> > > > > > This has been discussed before on this list many times.  But the
> > > > > > short answer is that that is how the postgres server handles
> > > > > > queries. If you issue a query the server will return the entire
> > > > > > result.  (try the same query in psql and you will have the same
> > > > > > problem).  To work around this you can use explicit cursors (see
> > > > > > the DECLARE CURSOR, FETCH, and MOVE sql commands for postgres).
> > > > > >
> > > > > > thanks,
> > > > > > --Barry
> > > > > >
> > > > > > Nick Fankhauser wrote:
> > > > > > > I'm selecting a huge ResultSet from our database- about one
> > > > > > > million rows, with one of the fields being varchar(500). I get
> > > > > > > an out of memory error from java.
> > > > > > >
> > > > > > > If the whole ResultSet gets stashed in memory, this isn't
> > > > > > > really surprising, but I'm wondering why this happens (if it
> > > > > > > does), rather than a subset around the current record being
> > > > > > > cached and other rows being retrieved as needed.
> > > > > > >
> > > > > > > If it turns out that there are good reasons for it to all be in
> > > > > > > memory, then my question is whether there is a better approach
> > > > > > > that people typically use in this situation. For now, I'm
> > > > > > > simply breaking up the select into smaller chunks, but that
> > > > > > > approach won't be satisfactory in the long run.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > -Nick
> > > > > > >
> > > > > > > ---------------------------------------------------------------
> > > > > > >---- ------ - Nick Fankhauser  nickf@ontko.com  Phone
> > > > > > > 1.765.935.4283  Fax 1.765.962.9788 Ray Ontko & Co.     Software
> > > > > > > Consulting Services http://www.ontko.com/
> > > > > > >
> > > > > > >
> > > > > > > ---------------------------(end of
> > > > > > > broadcast)--------------------------- TIP 5: Have you checked
> > > > > > > our extensive FAQ?
> > > > > > >
> > > > > > > http://www.postgresql.org/users-lounge/docs/faq.html
> > > > > >
> > > > > > ---------------------------(end of
> > > > > > broadcast)--------------------------- TIP 6: Have you searched
> > > > > > our list archives?
> > > > > >
> > > > > > http://archives.postgresql.org
> > > > >
> > > > >---------------------------(end of
> > > > > broadcast)--------------------------- TIP 2: you can get off all
> > > > > lists at once with the unregister command (send "unregister
> > > > > YourEmailAddressHere" to
> > > > > majordomo@postgresql.org)
> > > >
> > > > ---------------------------(end of
> > > > broadcast)--------------------------- TIP 6: Have you searched our
> > > > list archives?
> > > >
> > > > http://archives.postgresql.org
> > >
> > > ---------------------------(end of
> > > broadcast)--------------------------- TIP 1: subscribe and unsubscribe
> > > commands go to majordomo@postgresql.org
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/users-lounge/docs/faq.html
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html


Re: Out of memory error on huge resultset

From
"Simpson, Mike W"
Date:
This definitely isn't my field, but in our code we're perfectly happy
setting LIMIT and OFFSET.  In combination with stored procedures,
performance is excellent.

If you really want a driver change, wouldn't it be simpler to just append
LIMIT and OFFSET?
Anytime setFetchSize is called, just tack on
    query+=" LIMIT "+getFetchSize()+" OFFSET "+rowPointer;
and have the rs.next()/cursor code increment the rowPointer.

Just my 2 cents...
Mike
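
For illustration, a sketch of that rewrite (the class and field names here
are invented, not anything that exists in the driver):

   // Builds one page of a query along the lines Mike suggests. Note the
   // follow-ups below: this breaks if the query already carries its own
   // LIMIT/OFFSET, and rows can shift if the data changes between pages.
   class PagedQuery {
       private final String baseQuery;
       private final int fetchSize;
       private int rowPointer = 0; // advanced as pages are consumed

       PagedQuery(String baseQuery, int fetchSize) {
           this.baseQuery = baseQuery;
           this.fetchSize = fetchSize;
       }

       String nextPage() {
           String sql = baseQuery + " LIMIT " + fetchSize
                      + " OFFSET " + rowPointer;
           rowPointer += fetchSize;
           return sql;
       }
   }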


Re: Out of memory error on huge resultset

From
Vincent Janelle
Date:
But what if your data changes?


Re: Out of memory error on huge resultset

From
snpe
Date:
What if the query already has a LIMIT or OFFSET, and what happens when the
user requests a row beyond that LIMIT or OFFSET?

regards
Haris Peco


Re: Out of memory error on huge resultset

From
Dave Cramer
Date:
This is more or less the same problem: you still need to cache rows and
know where you are in the result set. Cursors make it easier because you
can move absolute, or relative, in a cursor. Also, as it stands now there
is a "move end" which, as it turns out, gives you the number of rows in the
resultset.

Just for my elucidation, is there someone working on an implementation?

Nic, you mentioned you would have a shot at it?


Dave




Cursor based result set

From
Dave Cramer
Date:
For the first pass, I think it is perfectly acceptable not to support
updateable resultsets if the user chooses to use a cursor-based
resultset.

Dave


Re: Out of memory error on huge resultset

From
Nic Ferrier
Date:
Dave Cramer <Dave@micro-automation.net> writes:

> Just for my elucidation, is there someone working on an implementation?
>
> Nic, you mentioned you would have a shot at it?

Yes. I'll bash out an implementation over the w/e.

Basically, I like your solution: use setFetchSize to control when a
cursor is used.

I am not so concerned about JDBC 2.0 compliance: the main thing is
that one ought to be able to _not_ read in a million rows if that's
what's in the table you're selecting.

setFetchSize is perfect for that.
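
From the application's point of view it would look like this (assuming an
open Connection conn; the fetch size of 1000 is arbitrary, and whether 0
keeps today's load-everything behaviour is part of the proposal):

   Statement st = conn.createStatement();
   st.setFetchSize(1000); // non-zero: ask the driver to go through a cursor
   ResultSet rs = st.executeQuery("SELECT x FROM y WHERE z");
   while (rs.next()) {
       // only about one fetchSize window of rows is in memory at a time
   }
   rs.close();
   st.close();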

Once that's done then people can test it and maybe improve it so that
it's jdbc 2.0 compliant and maybe so that the use of caching could be
guessed (don't ask me how yet, that's not the point).

If people like it, maybe it will be accepted.


Nic

Re: Out of memory error on huge resultset

From
Nic Ferrier
Date:

I decided the quickest short term thing to do was make everything go
through a cursor and see what effect that has.

The default fetch size is 0 and guess what that means to the FETCH
command? That's right: get everything.

So I'm simply transforming queries from:

   SELECT x FROM y WHERE z;


into
   DECLARE jdbcXX CURSOR FOR $query ;
   FETCH FORWARD $fetchSize jdbcXX;


I'll then alter the code to deal with the necessary auto-fetching
when the cache gets low.

I think this is a pretty good trade off between what we've got and
the need not to trash the VM with a gazillion row query.
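
End to end, the dance looks roughly like this standalone sketch
(assumptions: a plain Connection, fetchSize > 0, and an invented class
name; the real change lives inside the driver's ResultSet code):

   import java.sql.*;

   class CursorStream {
       // Stream a SELECT through a cursor, fetchSize rows at a time.
       static void streamQuery(Connection conn, String select, int fetchSize)
               throws SQLException {
           conn.setAutoCommit(false); // a cursor only lives in a transaction
           Statement st = conn.createStatement();
           st.execute("DECLARE jdbcXX CURSOR FOR " + select);
           int fetched;
           do {
               fetched = 0;
               ResultSet rs =
                   st.executeQuery("FETCH FORWARD " + fetchSize + " jdbcXX");
               while (rs.next()) {
                   fetched++; // hand the row to the caller; only the
                              // current window is ever held in memory
               }
               rs.close();
           } while (fetched == fetchSize); // short window means end of data
           st.execute("CLOSE jdbcXX");
           st.close();
       }
   }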


I haven't quite got it working yet, but I've only been hacking for a
couple of hours (and I'm not very quick  /8-).


Nic



Re: Out of memory error on huge resultset

From
Barry Lind
Date:

Nic Ferrier wrote:

> So I'm simply transforming queries from:
>
>
>     SELECT x FROM y WHERE z;
>
>
> into
>
>     DECLARE jdbcXX CURSOR FOR $query ;
>     FETCH FORWARD $fetchSize jdbcXX;
>

And when $query is:  "insert into foo values (...); select * from bar;"

You will get:
DECLARE jdbcXX CURSOR FOR insert into foo values (...);
select * from bar;
FETCH FORWARD $fetchSize jdbcXX;

Which clearly isn't what you want.

So, like the discussion we had on when we should/could use server-side
prepared statements, either the default needs to be not to use cursors,
or the sql string needs to be parsed looking for ';' and cursors used
only if no ';' is found.

thanks,
--Barry



Re: Out of memory error on huge resultset

From
Nic Ferrier
Date:
Barry Lind <barry@xythos.com> writes:

> And when $query is:  "insert into foo values (...); select * from bar;"
>
> You will get:
> DECLARE jdbcXX CURSOR FOR insert into foo values (...);
> select * from bar;
> FETCH FORWARD $fetchSize jdbcXX;
>
> Which clearly isn't what you want.

That's a really good point, Barry. I never do that, so it's not
something I'd considered.

Isn't a solution to only do the to_cursor translation when the
statement given begins with "SELECT "?

I agree that it's far from perfect, but it will do for _most_ cases,
won't it?


Interestingly, I don't think what you've spec'ed above is even
possible with some databases, is it? Another advantage of using
PostgreSQL is that you must properly terminate your SQL statements,
unlike with some commercial databases I could mention.


Nic

Re: Out of memory error on huge resultset

From
Aaron Mulder
Date:
On 12 Oct 2002, Nic Ferrier wrote:
> That's a really good point Barry. I never do that so it's not
> something I'd considered.
>
> Isn't a solution to only do the to_cursor translation when the
> statement given begins with "SELECT "?

    Well, there's always the case of "select * from
really_small_table; insert into really_small_table; select * from
really_big_table;..."

    In practice, I've used some 100+ statement commands, because we
run updates to one of our internal databases in DBVisualizer, and it
passes the entire script as one command.  On the other hand, those are all
updates, but there's nothing preventing you from putting some selects in
there.

    I think it's best to do what Barry mentioned and disable this for
statements containing a ; (or at least containing a ; with any
non-whitespace thereafter).  Then just consider whether you want to
search for quoted/escaped ;s.
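
Something like this rough sketch would do for a first cut (purely
illustrative, and it only understands single-quoted strings):

   // True if sql holds a single statement: no ';' outside quotes, except
   // possibly a trailing one followed only by whitespace.
   static boolean isSingleStatement(String sql) {
       boolean inQuote = false;
       for (int i = 0; i < sql.length(); i++) {
           char c = sql.charAt(i);
           if (c == '\'') {
               inQuote = !inQuote; // naive: ignores escaped quotes
           } else if (c == ';' && !inQuote) {
               return sql.substring(i + 1).trim().length() == 0;
           }
       }
       return true;
   }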

Aaron

> I agree that it's far from perfect, but it will do for _most_ cases
> won't it?
>
>
> Interestingly, I don't think what you've speced above is even
> possible with some databases is it? Another advantage of using
> PostgreSQL is that you must properly terminate your SQL statements,
> unlike with some commercial databases I could mention.


Re: Out of memory error on huge resultset

From
Nic Ferrier
Date:

Yes, it's tricky, isn't it? Obviously using multiple SQL statements is
kinda popular. I've always used stored procs.


This is certainly more of a can of worms than I first
thought. However, I'll finish it off and we'll see how messy it is
then.


Nic

Re: Out of memory error on huge resultset

From
Dave Cramer
Date:
Nic,

if you just concentrate on the caching and moving logic, that will be
more than enough. Once we get that, we can work on when it should be
invoked.

Dave






cursor based result sets not updateable

From
Dave Cramer
Date:
Nic,

Looking at the docs, we don't have to worry about updateable result
sets: cursors are read only.

READ ONLY

SQL92 keyword indicating that the cursor will be used in a read only
mode. Since this is the only cursor access mode available in PostgreSQL
this keyword has no effect.

Dave


Re: cursor based result sets not updateable

From
Nic Ferrier
Date:

To be honest, Dave, I wasn't worrying about it   /8->


Nic

problem with the build file?

From
Nic Ferrier
Date:
Here's a patch I had to apply to the ant script to get the build to
work properly.

It's incomplete because it doesn't address jdbc3. But I thought I'd
drop it here because I might have it completely round the back of my
neck anyway.


Nic


*** build.xml    2002/09/25 07:01:30    1.29
--- build.xml    2002/10/12 19:28:35
***************
*** 104,110 ****
      <javac srcdir="${srcdir}" destdir="${builddir}" debug="${debug}">
        <include name="${package}/**" />

!       <exclude name="${package}/jdbc1/**" unless="jdbc1"/>
        <exclude name="${package}/jdbc2/**" unless="jdbc2"/>
        <exclude name="${package}/jdbc3/**" unless="jdbc3"/>

--- 104,110 ----
      <javac srcdir="${srcdir}" destdir="${builddir}" debug="${debug}">
        <include name="${package}/**" />

!       <exclude name="${package}/jdbc1/Jdbc1*" unless="jdbc1"/>
        <exclude name="${package}/jdbc2/**" unless="jdbc2"/>
        <exclude name="${package}/jdbc3/**" unless="jdbc3"/>


Re: problem with the build file?

From
Barry Lind
Date:
Nic,

What problem are you seeing that this patch is trying to address?  I
don't have any problems building without this patch.  So before I look
at applying it, I want to understand the problem you are seeing.

thanks,
--Barry







Re: problem with the build file?

From
Nic Ferrier
Date:

- I was building JDBC2.

- I made a change to AbstractJdbc1ResultSet.java

- I ran ANT

- No files were compiled.



I think this was because the <exclude> tag (for the jdbc2 build)
excluded the entire jdbc1 directory erroneously. It is not logical to
exclude the entire directory because some of the jdbc2 classes
actually extend the jdbc1 classes (notably the classes called
AbstractXXX are the base classes for all the other directories).

I understood from the mailing list that the new structure had
happened recently and I presume this build nit got lost in the
changes.


Nic

Re: problem with the build file?

From
Barry Lind
Date:
Nic,

Your change shouldn't be necessary.  Even though the directory is
excluded, the particular files in it that are referenced from the
included directories will automatically be compiled, because they are
referenced from the other classes that are being compiled.  That is the
way javac works when compiling classes: it automatically compiles any
dependent classes as well.

thanks,
--Barry





Re: problem with the build file?

From
Nic Ferrier
Date:

Ok, it's been a while since I studied javac.

But last time I looked javac couldn't find a source file if it had not been
passed to the compiler (on the command line, or as part of an @ file
or via the API).

The ANT task excludes the jdbc1 files from being passed to the
compiler.


I've just confirmed that this is broken. Here's what I did (with the
old ant script):

- added the line:

     System.out.println("one");

to file org/postgresql/jdbc1/AbstractJdbc1ResultSet.java

- make (to build jdbc2)

the 1 file gets compiled.


- change the line to:

     System.out.println("two");

- make (to build jdbc2)

No files get compiled.


Of course, the change to "two" should be compiled but isn't, because
ant has excluded the source file from the compiler.


So, I think it is necessary.


Nic

Re: problem with the build file?

From
Barry Lind
Date:
Nic,

But if you did a 'make clean; make' it would recompile the changed file.

It isn't being compiled the second time because a compiled .class file
is found.  If you do a make clean then it will force a recompile since
no .class file exists.

thanks,
--Barry

