Re: any solution for doing a data file import spawning it on multiple processes

From Edson Richter
Subject Re: any solution for doing a data file import spawning it on multiple processes
Date
Msg-id BLU0-SMTP148E7857F9C55502E604153CFFA0@phx.gbl
In reply to any solution for doing a data file import spawning it on multiple processes  ("hb@101-factory.eu" <hb@101-factory.eu>)
Responses Re: any solution for doing a data file import spawning it on multiple processes  ("hb@101-factory.eu" <hb@101-factory.eu>)
List pgsql-general
On 16/06/2012 12:04, hb@101-factory.eu wrote:
> hi there,
>
> I am trying to import large data files into pg.
> For now I use the Linux xargs command to spawn the import of the file line
> by line, set to use the maximum available connections.
>
> We use pgpool as the connection pool to the database, and so try to
> maximize the concurrency of the data import of the file.
>
> The problem for now is that it seems to work well, but we miss a line once
> in a while, and that is not acceptable. Also, it creates zombies ;(.
>
> does anybody have any other tricks that will do the job?
>
> thanks,
>
> Henk

I've used a custom Java application with connection pooling (limited to
1000 connections, which means up to 1000 concurrent file imports).

I'm able to import more than 64000 XML files (about 13 KB each) in 5
minutes, with no memory leaks, no zombies, and (of course) no
missing records.
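
For readers who want to see the general shape of this approach, here is a
minimal sketch in Python. It is not the Java application described above.
It only illustrates a bounded worker pool in which each task imports one
file and every failure is recorded rather than silently dropped. The names
import_files and count_lines are hypothetical, and count_lines is a
stand-in for the real work: an actual importer would borrow a pooled
database connection there and insert the file's contents.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Iterable


def import_files(paths: Iterable[Path],
                 import_one: Callable[[Path], int],
                 max_workers: int = 8) -> dict:
    """Run import_one(path) for every path on a bounded thread pool.

    Returns {"rows": total rows reported, "failed": [(path, error), ...]}.
    """
    paths = list(paths)
    rows = 0
    failed = []
    # The pool size plays the role of the connection limit: no more than
    # max_workers imports are in flight at any time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(import_one, p): p for p in paths}
        for fut in as_completed(futures):
            try:
                rows += fut.result()
            except Exception as exc:
                # Keep the failing path so it can be retried or reported.
                failed.append((futures[fut], exc))
    return {"rows": rows, "failed": failed}


def count_lines(path: Path) -> int:
    # Hypothetical stand-in for the real import step (assumption: the real
    # version would write to the database and return the rows inserted).
    with path.open(encoding="utf-8") as fh:
        return sum(1 for _ in fh)
```

Because the pool joins all workers before returning and every future is
inspected, a file is either counted or listed under "failed", which makes
a lost record visible instead of silent.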

Besides the case where each thread imports a separate file, I have another
situation where separate threads import different lines of the same file.
No problems at all. Do not forget to check your OS "open files" limits
(this was a big issue for me in the past because of the Lucene indexes
generated during import).
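
The two points above (several workers sharing the lines of one file, and
the OS open-files limit) can be sketched as follows, again in Python and
again only as an illustration under stated assumptions, not as the code
used on this server. All function names are hypothetical. The
fds_per_worker and reserve values are arbitrary guesses, and the resource
module is available on Unix-like systems only. For simplicity the sketch
reads all batches into memory first; a very large file would need a
streaming variant.

```python
import resource  # Unix-only standard library module
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def safe_worker_count(requested: int, fds_per_worker: int = 4,
                      reserve: int = 64) -> int:
    """Cap the worker count so it fits under the soft open-files limit."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return requested
    # Assumption: each worker needs a few descriptors (socket, files).
    budget = max(1, (soft - reserve) // fds_per_worker)
    return max(1, min(requested, budget))


def batches(lines: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` lines, dropping none."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def import_lines(path: str, import_batch: Callable[[List[str]], int],
                 batch_size: int = 1000, workers: int = 8) -> int:
    """Import one file in batches; raise if any line is unaccounted for."""
    with open(path, encoding="utf-8") as fh:
        all_batches = list(batches(fh, batch_size))
    expected = sum(len(b) for b in all_batches)
    with ThreadPoolExecutor(max_workers=safe_worker_count(workers)) as pool:
        # import_batch returns how many lines it actually imported.
        imported = sum(pool.map(import_batch, all_batches))
    if imported != expected:
        raise RuntimeError(f"imported {imported} of {expected} lines")
    return imported
```

Comparing the imported count with the number of lines read is a cheap way
to detect the "missing line once in a while" symptom from the original
question.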

Server: 8-core Xeon, 16 GB RAM, 15000 rpm SAS disks, PostgreSQL 9.1.3,
CentOS 5 Linux, Sun Java 1.6.27.

Regards,

Edson Richter

