Обсуждение: ignore errors for COPY

Поиск
Список
Период
Сортировка

ignore errors for COPY

От
Vangelis Katsikaros
Дата:
Hello

I have postrges 8.3.12 and I have the following issue:

I have a table
create table test(
      table_id integer,
      datetime timestamp,
      MMSI integer,
      lat real,
      lng real,
);

and I bulk insert data to this table with COPY.

A tiny portion of the data in the file are wrong. For example one date
is "2009-93-29 05:27:08" which obviously has a wrong month (93). COPY
encounters this row and stop the insertion with
ERROR:  date/time field value out of range: "2009-93-29 05:27:08"

Is there a way I can "turn" this error into a warning (or suppress the
error) and make COPY simply to skip this row?

I have a big amount of data (~100G) so iterating through them to find
all the possible wrong timestamp, reals, and integers will be quite
tedious and time consuming.

Regards
Vangelis

Re: ignore errors for COPY

От
Guillaume Lelarge
Дата:
Le 11/11/2010 13:01, Vangelis Katsikaros a écrit :
> Hello
>
> I have postrges 8.3.12 and I have the following issue:
>
> I have a table
> create table test(
>      table_id integer,
>      datetime timestamp,
>      MMSI integer,
>      lat real,
>      lng real,
> );
>
> and I bulk insert data to this table with COPY.
>
> A tiny portion of the data in the file are wrong. For example one date
> is "2009-93-29 05:27:08" which obviously has a wrong month (93). COPY
> encounters this row and stop the insertion with
> ERROR:  date/time field value out of range: "2009-93-29 05:27:08"
>
> Is there a way I can "turn" this error into a warning (or suppress the
> error) and make COPY simply to skip this row?
>

Nope.

> I have a big amount of data (~100G) so iterating through them to find
> all the possible wrong timestamp, reals, and integers will be quite
> tedious and time consuming.
>

You should better look at pgloader which will use COPY to put your data
in your table and found the lines in error. Of course, it takes time to
detect lines in error. But at least, all "good" lines will be in your
table, and all "bad" lines will be in a file, so that you can modify
them to inject later.


--
Guillaume
 http://www.postgresql.fr
 http://dalibo.com

Re: ignore errors for COPY

От
"Rhys A.D. Stewart"
Дата:
On Thu, Nov 11, 2010 at 8:05 AM, Guillaume Lelarge
<guillaume@lelarge.info> wrote:
> Le 11/11/2010 13:01, Vangelis Katsikaros a écrit :
>> Hello
>>
>> I have postrges 8.3.12 and I have the following issue:
>>
>> I have a table
>> create table test(
>>      table_id integer,
>>      datetime timestamp,
>>      MMSI integer,
>>      lat real,
>>      lng real,
>> );
>>
>> and I bulk insert data to this table with COPY.
>>
>> A tiny portion of the data in the file are wrong. For example one date
>> is "2009-93-29 05:27:08" which obviously has a wrong month (93). COPY
>> encounters this row and stop the insertion with
>> ERROR:  date/time field value out of range: "2009-93-29 05:27:08"
>>
>> Is there a way I can "turn" this error into a warning (or suppress the
>> error) and make COPY simply to skip this row?
>>
>
> Nope.
>
>> I have a big amount of data (~100G) so iterating through them to find
>> all the possible wrong timestamp, reals, and integers will be quite
>> tedious and time consuming.
>>
>
> You should better look at pgloader which will use COPY to put your data
> in your table and found the lines in error. Of course, it takes time to
> detect lines in error. But at least, all "good" lines will be in your
> table, and all "bad" lines will be in a file, so that you can modify
> them to inject later.
>
>
> --
> Guillaume
>  http://www.postgresql.fr
>  http://dalibo.com
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

olé!!!

Maybe you could import it as a text column and then deal with the
conversion in the DB....i do that sometimes....never had 100GB of data
to work with though....


Rhys

Re: ignore errors for COPY [solved]

От
Vangelis Katsikaros
Дата:
On 11/11/2010 03:05 PM, Guillaume Lelarge wrote:
>
> You should better look at pgloader which will use COPY to put your data
> in your table and found the lines in error. Of course, it takes time to
> detect lines in error. But at least, all "good" lines will be in your
> table, and all "bad" lines will be in a file, so that you can modify
> them to inject later.
>
>

Hi

Thanks for the answer, indeed it solves the issue.

Now, I would like to ask a second question (sorry for using the same
thread :)

I would like to create a specific database on a another location (not in
"data_directory = '/var/lib/postgresql/8.3/main'" ). I was wondering if
I have to run
initdb -D /other/path/

of if there is another way

(for example in mysql I can do the same thing by creating a symlink to
the other location inside "datadir = /var/lib/mysql" )

Regards
Vangelis


Re: ignore errors for COPY [solved]

От
Guillaume Lelarge
Дата:
Le 11/11/2010 17:46, Vangelis Katsikaros a écrit :
> On 11/11/2010 03:05 PM, Guillaume Lelarge wrote:
>>
>> You should better look at pgloader which will use COPY to put your data
>> in your table and found the lines in error. Of course, it takes time to
>> detect lines in error. But at least, all "good" lines will be in your
>> table, and all "bad" lines will be in a file, so that you can modify
>> them to inject later.
>>
>>
>
> Hi
>
> Thanks for the answer, indeed it solves the issue.
>

Great.

> Now, I would like to ask a second question (sorry for using the same
> thread :)
>
> I would like to create a specific database on a another location (not in
> "data_directory = '/var/lib/postgresql/8.3/main'" ). I was wondering if
> I have to run
> initdb -D /other/path/
>
> of if there is another way
>

To create another database in another location, you first need to create
a tablespace, and then create a database in that tablespace. For
example, in psql:

CREATE TABLESPACE otherlocation LOCATION '/some/specific/directory';
CREATE DATABASE newdb TABLESPACE otherlocation;

> (for example in mysql I can do the same thing by creating a symlink to
> the other location inside "datadir = /var/lib/mysql" )
>

Don't know MySQL, so can't say :)


--
Guillaume
 http://www.postgresql.fr
 http://dalibo.com

Re: ignore errors for COPY [solved]

От
Vangelis Katsikaros
Дата:
On 11/11/2010 08:15 PM, Guillaume Lelarge wrote:
>>
>> I would like to create a specific database on a another location (not in
>> "data_directory = '/var/lib/postgresql/8.3/main'" ). I was wondering if
>> I have to run
>> initdb -D /other/path/
>>
>> of if there is another way
>>
>
> To create another database in another location, you first need to create
> a tablespace, and then create a database in that tablespace. For
> example, in psql:
>
> CREATE TABLESPACE otherlocation LOCATION '/some/specific/directory';
> CREATE DATABASE newdb TABLESPACE otherlocation;

Hey thanks again, that did the job :)