Discussion: process large tables


process large tables

From:
Kristina Magwood
Date:

Hi,
I am trying to process a large table.  Unfortunately, using select * from table gives me a ResultSet that is too large: the Java process runs out of memory even if I boost the VM memory.
Is there any way I can programmatically (in Java) retrieve, say, 10,000 records at a time without knowing anything specific about the table?  Then, when I am done with those records, retrieve the next 10,000, etc.?

Thank you in advance for any help you can spare.
Kristina

Re: process large tables

From:
Nelson Arapé
Date:
From the documentation
(http://jdbc.postgresql.org/documentation/80/query.html#query-with-cursor)

"By default the driver collects all the results for the query at once. This
can be inconvenient for large data sets so the JDBC driver provides a means
of basing a ResultSet on a database cursor and only fetching a small number
of rows.

..."

With a cursor you fetch the rows in chunks.  It is explained well in the
documentation.

Bye
Nelson Arapé

On Thu 14 Apr 2005 at 16:37, Kristina Magwood wrote:
> Hi,
> I am trying to process a large table.  Unfortunately, using select * from
> table gives me a ResultSet that is too large.
> The java runs out of memory even if I boost the vm memory.
> Is there any way I can programmatically (in java) retrieve say 10,000
> records at a time without knowing anything specific about the table? Then,
> when I am done with those records, retrieve the next 10,000, etc?
>
> Thank you in advance for any help you can spare.
> Kristina

Re: process large tables

From:
Alan Stange
Date:
Kristina Magwood wrote:

>
> Hi,
> I am trying to process a large table.  Unfortunately, using select *
> from table gives me a ResultSet that is too large.
> The java runs out of memory even if I boost the vm memory.
> Is there any way I can programmatically (in java) retrieve say 10,000
> records at a time without knowing anything specific about the table?
>  Then, when I am done with those records, retrieve the next 10,000, etc?
>
> Thank you in advance for any help you can spare.
> Kristina


The following should work but was written mostly from memory, so it
might not compile.  Make sure you're only pulling the columns you need.

There are three important bits (the details of which might have changed
from pg7 to pg8):
1) the auto commit must be turned off.
2) the fetch direction must be FETCH_FORWARD
3) the query can be only a *single* statement.  So no trailing ";" is
allowed, as "select * from foo;" is actually three statements (assuming
I understand the discussion that happened some time back).

You can tell it's working if you see cursors being created on the
server, or by using truss/strace-type tools on the client... and if your
memory footprint is smaller.  This was all discussed on the list a few
months back, so you might look through the mailing list archives for more details.

Good luck.

-- Alan


String sql = null;
Connection conn = null;
Statement st = null;
ResultSet rs = null;
boolean commitState = true;
int nRows = 10000;            // number of rows to pull in one chunk
try {
    conn = getConnectionSomehow();
    commitState = conn.getAutoCommit();  // keep the commit state
    conn.setAutoCommit(false);           // cursors require auto commit off
    st = conn.createStatement();
    st.setFetchDirection(ResultSet.FETCH_FORWARD);
    st.setFetchSize(nRows);
    sql = "select * from someTable";     // single statement: no trailing ";"
    rs = st.executeQuery(sql);
    while (rs.next())
        processRow(rs);
} catch (Exception e) {
    System.err.println("error in processing statement: " + sql);
    e.printStackTrace();
} finally {
    if (rs != null) try { rs.close(); } catch (Exception e) {}
    if (st != null) try { st.close(); } catch (Exception e) {}
    if (conn != null) {
        // restore the commit state if the connection is going back to a pool...
        try { conn.setAutoCommit(commitState); } catch (Exception e) {}
        try { conn.close(); } catch (Exception e) {}
    }
}