Обсуждение: PostgreSQL: Copy from File missing data error

Поиск
Список
Период
Сортировка

PostgreSQL: Copy from File missing data error

От
Holly Gibons
Дата:
I'm using PostgreSQL 9.0 via pgAdmin III

I'm trying to build a PostgreSQL/PostGIS database using Entire country files dataset but I'm getting missing data error

I'm wondering if the copy command is affected by diacritics or I've not set the database up properly
Created a new database with UTF8 encoding

I built the table schema based on the given format (but using type text for NM_MODIFY_DATE not varchar,  having these as dates didn't make a difference)

I used large text viewer to open the uncompressed countries.txt file and copied the top 5 rows into a test file

Using   PostgreSQL Copy this test file imported correctly so I know my schema is correct
copy my_table(List of columns ) from '\\Path\\To\\TestFile.txt' WITH delimiter E'\t' csv HEADER;
However when I tried to ingest the larger  countries.txt (2.9GB) file I get an error "missing data" for column xyz at line 12345 (Last column with  data in that row, NM_MODIFY_DATE)

Using large text viewer again I located this line and copied together with proceeding and following lines into my test file and tried the copy again but get the same error

I opened the test file in excel to  see if there is columns  missing although not every row has data in each column they do  all match
 
the problem row has UFI -3373955 & UNI 6329083 

I don't know if this is relevant but looking at the database properties , in pgAdmin, the 'collection' &  'Character type' are both set as "English_United Kingdom, 1252 " I  didn't set this and creating a  new DB the options are "C",  "English_United Kingdom, 1252 " or "POSIX"

Could someone suggest what I'm doing wrong?
Thank you

Re: PostgreSQL: Copy from File missing data error

От
Ron
Дата:
You might want to try pg_bulkload, and have it kick out malformed rows.

It's packaged for RHEL6 and above, plus various other distros.

On 09/04/2018 01:13 PM, Holly Gibons wrote:
I'm using PostgreSQL 9.0 via pgAdmin III

I'm trying to build a PostgreSQL/PostGIS database using Entire country files dataset but I'm getting missing data error

I'm wondering if the copy command is affected by diacritics or I've not set the database up properly
Created a new database with UTF8 encoding

I built the table schema based on the given format (but using type text for NM_MODIFY_DATE not varchar,  having these as dates didn't make a difference)

I used large text viewer to open the uncompressed countries.txt file and copied the top 5 rows into a test file

Using   PostgreSQL Copy this test file imported correctly so I know my schema is correct
copy my_table(List of columns ) from '\\Path\\To\\TestFile.txt' WITH delimiter E'\t' csv HEADER;
However when I tried to ingest the larger  countries.txt (2.9GB) file I get an error "missing data" for column xyz at line 12345 (Last column with  data in that row, NM_MODIFY_DATE)

Using large text viewer again I located this line and copied together with proceeding and following lines into my test file and tried the copy again but get the same error

I opened the test file in excel to  see if there is columns  missing although not every row has data in each column they do  all match
 
the problem row has UFI -3373955 & UNI 6329083 

I don't know if this is relevant but looking at the database properties , in pgAdmin, the 'collection' &  'Character type' are both set as "English_United Kingdom, 1252 " I  didn't set this and creating a  new DB the options are "C",  "English_United Kingdom, 1252 " or "POSIX"

Could someone suggest what I'm doing wrong?
Thank you

--
Angular momentum makes the world go 'round.

Re: PostgreSQL: Copy from File missing data error

От
Dimitri Maziuk
Дата:
On 09/04/2018 01:13 PM, Holly Gibons wrote:

> Could someone suggest what I'm doing wrong?

There's probably a magic character that is not escaped properly
somewhere before the error location. Exactly how to find them all in a
2GB+ file is another question: e.g. if you know python you could play
with csv.Reader and unicode/bytes encode/decode...

--
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu


Вложения