Обсуждение: Large selects

Поиск
Список
Период
Сортировка

Large selects

От
Bendik Rognlien Johansen
Дата:
Hello!

We are having a lot of trouble trying to use JDBC with PostgreSQL.

This is what we want to achieve:

- Select all rows in a table (people)
- Select addresses belonging to each person. By subselect or join (addresses)
- Select phone numbers belonging to each person. By subselect or join (contacts)
- Loop through the results, building a Lucene index.


Our tables:
- people: ~ 8 million records (could be larger in the future, maybe 50 million)
- addresses ~8 million records
- contacts: ~10 million records


Hardware:
2x dual core 3GHz intel Xeon
1-2 Gb RAM



Problem:

It seems that the PostgreSQL driver fetches the whole result, storing it in memory, causing Java to run out of memory. java.lang.OutOfMemoryError
We have tried increasing memory for the program (-Xmn100M -Xms512M -Xmx512M), but we only have one gigabyte of memory to work with.

Does anyone have any suggestions on how to deal with this, other that throwing more hardware at the problem? Cursors?


Any help is much appreciated!
Thanks




Re: Large selects

От
Roland Walter
Дата:
Bendik Rognlien Johansen schrieb:
> Hello!
>
> We are having a lot of trouble trying to use JDBC with PostgreSQL.
>
> This is what we want to achieve:
>
> - Select all rows in a table (people)
> - Select addresses belonging to each person. By subselect or join
> (addresses)
> - Select phone numbers belonging to each person. By subselect or join
> (contacts)
> - Loop through the results, building a Lucene index.
>
>
> Our tables:
> - people: ~ 8 million records (could be larger in the future, maybe 50
> million)
> - addresses ~8 million records
> - contacts: ~10 million records
>
>
> Hardware:
> 2x dual core 3GHz intel Xeon
> 1-2 Gb RAM
>
>
>
> Problem:
>
> It seems that the PostgreSQL driver fetches the whole result, storing
> it in memory, causing Java to run out of
> memory. java.lang.OutOfMemoryError
> We have tried increasing memory for the program (-Xmn100M -Xms512M
> -Xmx512M), but we only have one gigabyte of memory to work with.
>
> Does anyone have any suggestions on how to deal with this, other that
> throwing more hardware at the problem? Cursors?
>
>
> Any help is much appreciated!
> Thanks
>
Look here:

http://archives.postgresql.org/pgsql-jdbc/2006-01/msg00004.php

--
Roland Walter                   mailto: rwa (at) mosaic-ag (dot) com
MOSAIC SOFTWARE AG               phone: +49 (0) 22 25 / 88 2-44 9
Am Pannacker 3                     fax: +49 (0) 22 25 / 88 2-20 1
D-53340 Meckenheim                http://www.mosaic-ag.com


Die in dieser E-Mail enthaltenen Nachrichten und Anhaenge sind ausschliesslich
fuer den bezeichneten Adressaten bestimmt. Sie koennen rechtlich geschuetzte,
vertrauliche Informationen enthalten. Falls Sie nicht der bezeichnete Empfaenger
oder zum Empfang dieser E-Mail nicht berechtigt sind, ist die Verwendung,
Vervielfaeltigung oder Weitergabe von Nachrichten und Anhaengen untersagt.
Falls Sie diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte
unverzueglich den Absender und vernichten Sie die E-Mail.

This e-mail message and any attachment are intended exclusively for the named
addressee. They may contain confidential information which may also be protected
by professional secrecy. Unless you are the named addressee (or authorised to
receive for the addressee) you may not copy or use this message or any attachment
or disclose the contents to anyone else. If this e-mail was sent to you by mistake
please notify the sender immediately and delete this e-mail.


Re: Large selects

От
"Koth, Christian (DWBI)"
Дата:
Bendik,

- Loop through the results, building a Lucene index.

I guess you are trying to search within the contact data and therefore using lucene? Why not using the database build
inpossibilities for searching using SQL? 


- Does anyone have any suggestions on how to deal with this, other that throwing more hardware at the problem? Cursors?

We are using cursors to access large result sets which works quite well. Also with postgres JDBC you can use scrollable
resultsets (only forward scrolling) and a limited fetch size. 

Christian


******************************************
The information contained in, or attached to, this e-mail, may contain confidential information and is intended solely
forthe use of the individual or entity to whom they are addressed and may be subject to legal privilege.  If you have
receivedthis e-mail in error you should notify the sender immediately by reply e-mail, delete the message from your
systemand notify your system manager.  Please do not copy it for any purpose, or disclose its contents to any other
person. The views or opinions presented in this e-mail are solely those of the author and do not necessarily represent
thoseof the company.  The recipient should check this e-mail and any attachments for the presence of viruses.  The
companyaccepts no liability for any damage caused, directly or indirectly, by any virus transmitted in this email. 
******************************************

Re: Large selects

От
Bendik Rognlien Johansen
Дата:
Thank you both!
I will have a closer look tomorrow.



On Aug 24, 2006, at 10:38 AM, Koth, Christian (DWBI) wrote:
>
> Bendik,
>
> - Loop through the results, building a Lucene index.
>
> I guess you are trying to search within the contact data and
> therefore using lucene? Why not using the database build in
> possibilities for searching using SQL?
>
>
> - Does anyone have any suggestions on how to deal with this, other
> that throwing more hardware at the problem? Cursors?
>
> We are using cursors to access large result sets which works quite
> well. Also with postgres JDBC you can use scrollable result sets
> (only forward scrolling) and a limited fetch size.
>
> Christian
>






On Aug 24, 2006, at 10:30 AM, Roland Walter wrote:

> Bendik Rognlien Johansen schrieb:
>> Hello!
>>
>> We are having a lot of trouble trying to use JDBC with PostgreSQL.
>>
>> This is what we want to achieve:
>>
>> - Select all rows in a table (people)
>> - Select addresses belonging to each person. By subselect or join
>> (addresses)
>> - Select phone numbers belonging to each person. By subselect or
>> join (contacts)
>> - Loop through the results, building a Lucene index.
>>
>>
>> Our tables:
>> - people: ~ 8 million records (could be larger in the future,
>> maybe 50 million)
>> - addresses ~8 million records
>> - contacts: ~10 million records
>>
>>
>> Hardware:
>> 2x dual core 3GHz intel Xeon
>> 1-2 Gb RAM
>>
>>
>>
>> Problem:
>>
>> It seems that the PostgreSQL driver fetches the whole result,
>> storing it in memory, causing Java to run out of memory.
>> java.lang.OutOfMemoryError
>> We have tried increasing memory for the program (-Xmn100M -Xms512M
>> -Xmx512M), but we only have one gigabyte of memory to work with.
>>
>> Does anyone have any suggestions on how to deal with this, other
>> that throwing more hardware at the problem? Cursors?
>>
>>
>> Any help is much appreciated!
>> Thanks
>>
> Look here:
>
> http://archives.postgresql.org/pgsql-jdbc/2006-01/msg00004.php
>
> --
> Roland Walter                   mailto: rwa (at) mosaic-ag (dot) com
> MOSAIC SOFTWARE AG               phone: +49 (0) 22 25 / 88 2-44 9
> Am Pannacker 3                     fax: +49 (0) 22 25 / 88 2-20 1
> D-53340 Meckenheim                http://www.mosaic-ag.com
>
>
> Die in dieser E-Mail enthaltenen Nachrichten und Anhaenge sind
> ausschliesslich fuer den bezeichneten Adressaten bestimmt. Sie
> koennen rechtlich geschuetzte, vertrauliche Informationen
> enthalten. Falls Sie nicht der bezeichnete Empfaenger
> oder zum Empfang dieser E-Mail nicht berechtigt sind, ist die
> Verwendung,
> Vervielfaeltigung oder Weitergabe von Nachrichten und Anhaengen
> untersagt.
> Falls Sie diese E-Mail irrtuemlich erhalten haben, informieren Sie
> bitte unverzueglich den Absender und vernichten Sie die E-Mail.
>
> This e-mail message and any attachment are intended exclusively for
> the named addressee. They may contain confidential information
> which may also be protected by professional secrecy. Unless you are
> the named addressee (or authorised to
> receive for the addressee) you may not copy or use this message or
> any attachment
> or disclose the contents to anyone else. If this e-mail was sent to
> you by mistake
> please notify the sender immediately and delete this e-mail.
>