Обсуждение: Newbie: PG8 and text file parsing

Поиск
Список
Период
Сортировка

Newbie: PG8 and text file parsing

От
Paul R
Дата:
I need some advice.. I am new to PG8 - I am working on the project
that needs to load data daily to PostgreSQL from the text file.. the
text file is 40MB comma delimited file with row consisting of 20-30
fields and couple thousands of rows..

so the data needs to be parsed and if the row exists it needs to be
updated, otherwise new record is going to be created. My problem is
with parsing:

1-what would be a good approach to do it? is plpgSQL suitable for
this? or should I use PHP or other language to to that and run as a
cron job/service?

2-how about committing transactions?  would there be a problem if it
is done at the end of all the updates?

Thank you
Paul

Re: Newbie: PG8 and text file parsing

От
"Joshua D. Drake"
Дата:
On Fri, 2005-02-18 at 20:34 -0500, Paul R wrote:
> I need some advice.. I am new to PG8 - I am working on the project
> that needs to load data daily to PostgreSQL from the text file.. the
> text file is 40MB comma delimited file with row consisting of 20-30
> fields and couple thousands of rows..

Hello,

We had to do something similar once. We used plPythonU to copy the file
from the filesystem to a temporary table. Then from the temporary table
insert into the permanent table.

Sincerely,

Joshua D. Drake


>
> so the data needs to be parsed and if the row exists it needs to be
> updated, otherwise new record is going to be created. My problem is
> with parsing:
>
> 1-what would be a good approach to do it? is plpgSQL suitable for
> this? or should I use PHP or other language to to that and run as a
> cron job/service?
>
> 2-how about committing transactions?  would there be a problem if it
> is done at the end of all the updates?
>
> Thank you
> Paul
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
--
Command Prompt, Inc., Your PostgreSQL solutions company. 503-667-4564
Custom programming, 24x7 support, managed services, and hosting
Open Source Authors: plPHP, pgManage, Co-Authors: plPerlNG
Reliable replication, Mammoth Replicator - http://www.commandprompt.com/


Re: Newbie: PG8 and text file parsing

От
Tatsuo Ishii
Дата:
> I need some advice.. I am new to PG8 - I am working on the project
> that needs to load data daily to PostgreSQL from the text file.. the
> text file is 40MB comma delimited file with row consisting of 20-30
> fields and couple thousands of rows..
>
> so the data needs to be parsed and if the row exists it needs to be
> updated, otherwise new record is going to be created. My problem is
> with parsing:
>
> 1-what would be a good approach to do it? is plpgSQL suitable for
> this? or should I use PHP or other language to to that and run as a
> cron job/service?

There are some tools which can read a text file and treat it as if
it's a real table. Using them would make your job much easier.

ftp://ftp.sra.co.jp/pub/cmd/postgres/exttable/exttable-0.5.tar.gz
http://pgfoundry.org/projects/dbi-link/

> 2-how about committing transactions?  would there be a problem if it
> is done at the end of all the updates?

Not at all, I think.

Re: Newbie: PG8 and text file parsing

От
Christopher Browne
Дата:
blueaxon@gmail.com (Paul R) wrote:
> I need some advice.. I am new to PG8 - I am working on the project
> that needs to load data daily to PostgreSQL from the text file.. the
> text file is 40MB comma delimited file with row consisting of 20-30
> fields and couple thousands of rows..
>
> so the data needs to be parsed and if the row exists it needs to be
> updated, otherwise new record is going to be created. My problem is
> with parsing:
>
> 1-what would be a good approach to do it? is plpgSQL suitable for
> this? or should I use PHP or other language to to that and run as a
> cron job/service?

When I have done this sort of thing, I have used Perl to turn the data
into a file in a format that can be readily read in using the
PostgreSQL-specific "COPY" command.

If you like PHP or Python or Icon/sed/awk/SNOBOL, that's your call;
using a language to "parse" the text that you are familiar with is
likely to be better than using a language that you aren't familiar
with...

> 2-how about committing transactions?  would there be a problem if it
> is done at the end of all the updates?

No, no problem at all.

When we did the data conversion for the .ORG registry, it involved
splitting the data into 50,000 record "chunks", each loaded in using
COPY.  The prime reason to split at 50,000 was that if problems were
encountered, it's a lot easier to edit a file with 50,000 records
using vi than it is to edit one with 5 million records.  I don't
recall needing to use that capability, but it gave everyone confidence
to know it was there...
--
select 'cbbrowne' || '@' || 'gmail.com';
http://cbbrowne.com/info/
 /"\
 \ /     ASCII RIBBON CAMPAIGN
  X        AGAINST HTML MAIL
 / \

Re: Newbie: PG8 and text file parsing

От
"Sim Zacks"
Дата:
You can always do an insert and add a trigger that checks the key for
existence. If it exists do the update instead.
That way you have one line of code doing both inserts and updates.

The only problem with it is that it looks confusing from an outsiders
perspective, so it would have to be documented very well.

Sim


"Paul R" <blueaxon@gmail.com> wrote in message
news:3eb42da105021817342feb0b2d@mail.gmail.com...
> I need some advice.. I am new to PG8 - I am working on the project
> that needs to load data daily to PostgreSQL from the text file.. the
> text file is 40MB comma delimited file with row consisting of 20-30
> fields and couple thousands of rows..
>
> so the data needs to be parsed and if the row exists it needs to be
> updated, otherwise new record is going to be created. My problem is
> with parsing:
>
> 1-what would be a good approach to do it? is plpgSQL suitable for
> this? or should I use PHP or other language to to that and run as a
> cron job/service?
>
> 2-how about committing transactions?  would there be a problem if it
> is done at the end of all the updates?
>
> Thank you
> Paul
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>