Discussion: Importing data from csv
Hi Folks,

Sorry if this is a duplicate post; I've been trying to find a solution for importing data into Postgres from a CSV file. The problem is that I have a database whose columns contain newline characters (Mac and Unix). When I export the data to CSV format, there are line breaks (mixed Unix and Mac) in the data, which break the COPY procedure.

I also tried using the script posted in one of the previous posts:

#! /usr/bin/perl
$inquotes = 0;
while (<>){
    # Chop the crlf
    chop ($_);
    chop ($_);

    # this first bit goes through and replaces
    # all the commas that are not in quotes with tildes
    for ($i=0 ; $i < length($_) ; $i++){
        $char=substr($_,$i,1);
        if ($char eq '"' ){
            $inquotes = not($inquotes);
        }else{
            if ( (!$inquotes) && ($char eq ",") ){
                substr($_,$i,1)="~";
            }
        }
    }
    # this replaces any quotes
    s/"//g;
    print "$_\n";
}

cat data_file | perl scriptname.pl > outputfile.dat

When I run the COPY command I get messages like "data missing for xyz column".
Any possible hints?

--
Thanks,
Sumeet
I recently did this by running the data through a VB program that inserted a “\” in front of any Char(10) and/or Char(13) characters, which I believe tells Postgres to accept the next character as a literal part of the column value. It must, because it worked! I also quoted the whole column as part of the VB program.
It worked for me, but I’m not sure of the exact science behind it, so someone else might be able to give more detailed help.
Cheers,
-p
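The backslash approach described above matches COPY's default text format, in which a backslash introduces an escape. The following is only a minimal sketch in Python (the VB program itself was not posted, so this is not that program): it reads CSV records and emits tab-delimited text-format lines, writing embedded line feeds and carriage returns as the `\n` and `\r` escape sequences. The name `csv_to_copy_text` is made up for illustration, and the sketch assumes no NULL handling is needed.

```python
import csv

# Characters that COPY's text format treats specially, mapped to their escapes.
COPY_ESCAPES = {"\\": "\\\\", "\n": "\\n", "\r": "\\r", "\t": "\\t"}

def csv_to_copy_text(lines):
    """Yield tab-delimited COPY text-format lines from an iterable of CSV lines.

    `lines` can be a file opened with newline="" so that line breaks inside
    quoted fields reach the csv module unchanged.
    """
    for row in csv.reader(lines):
        # Escape each field character by character, then join fields with tabs.
        fields = ("".join(COPY_ESCAPES.get(ch, ch) for ch in field) for field in row)
        yield "\t".join(fields) + "\n"
```

A typical call would be `sys.stdout.writelines(csv_to_copy_text(open(path, newline="")))`, with the result fed to a plain `COPY ... FROM` in text format.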
-----Original Message-----
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Sumeet
Sent: Friday, 25 August 2006 00:48
To: pgsql-sql@postgresql.org
Subject: [SQL] Importing data from csv
*******************Confidentiality and Privilege Notice*******************
The material contained in this message is privileged and confidential to the addressee. If you are not the addressee indicated in this message or responsible for delivery of the message to such person, you may not copy or deliver this message to anyone, and you should destroy it and kindly notify the sender by reply email.
Information in this message that does not relate to the official business of Weatherbeeta must be treated as neither given nor endorsed by Weatherbeeta. Weatherbeeta, its employees, contractors or associates shall not be liable for direct, indirect or consequential loss arising from transmission of this message or any attachments
-- Scot P. Floess 27 Lake Royale Louisburg, NC 27549 252-478-8087 (Home) 919-754-4592 (Work) Chief Architect JPlate http://sourceforge.net/projects/jplate Chief Architect JavaPIM http://sourceforge.net/projects/javapim
Scot P. Floess wrote:
A newline in CSV parlance denotes the end of a record... unless that newline is contained within quotes...
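That rule is also why a filter that reads the file one line at a time loses track of records: the "in quotes" state has to carry across line breaks. A rough Python sketch of the rule follows (a hand-written illustration, not any particular library's parser; `split_records` is a hypothetical name). It treats LF, CR, and CRLF as record terminators only when they occur outside double quotes.

```python
def split_records(text):
    """Split CSV text into records, keeping quoted line breaks inside the record."""
    records, current, in_quotes = [], [], False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # A doubled quote ("") toggles twice, so the state comes out right.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch in "\r\n" and not in_quotes:
            # Treat CRLF as a single record terminator.
            if ch == "\r" and text[i + 1:i + 2] == "\n":
                i += 1
            records.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    if current:
        records.append("".join(current))
    return records
```

In practice a CSV library does this bookkeeping already; the sketch only shows where the quote state matters.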
There you go: it was the quotes that did it, not the backslashes. I knew someone else would shed some better light! :)
Cheers,
-p
-----Original Message-----
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Scot P. Floess
Sent: Friday, 25 August 2006 10:00
To: floess@mindspring.com
Cc: Phillip Smith; pgsql-sql@postgresql.org
Subject: Re: [SQL] Importing data from csv
And if it's contained within quotes... it's considered part of the field.
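Quoting every column, as mentioned earlier in the thread, can be sketched with Python's standard `csv` module. This is an illustration under assumptions, not the VB program from the thread: `write_quoted_csv` is a made-up helper, and the table name and path in the comment are placeholders.

```python
import csv
import io

def write_quoted_csv(rows, out):
    """Write rows as CSV with every field enclosed in double quotes."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)

rows = [["1", "first line\nsecond line"], ["2", "mac\rbreak"]]
buf = io.StringIO(newline="")
write_quoted_csv(rows, buf)

# Reading the text back, the quoted line breaks stay inside their fields,
# so two rows go in and two rows come out.
parsed = list(csv.reader(buf.getvalue().splitlines(keepends=True)))

# A file written this way could then be loaded in CSV mode, for example:
#   COPY mytable FROM '/path/to/file.csv' WITH CSV;
# (mytable and the path are placeholders)
```

Because each embedded line break sits inside quotes, a CSV-aware reader sees it as data rather than as the end of a record.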
On Thu, Aug 24, 2006 at 08:19:58PM -0400, Scot P. Floess wrote:
> Well, being that there isn't a RFC for CSV...other than "defacto"
> definitions...I am pretty sure that is widely agreed upon ;)

RFC 4180
Common Format and MIME Type for Comma-Separated Values (CSV) Files
ftp://ftp.rfc-editor.org/in-notes/rfc4180.txt

"While there are various specifications and implementations for the CSV format (for ex. [4], [5], [6] and [7]), there is no formal specification in existence, which allows for a wide variety of interpretations of CSV files. This section documents the format that seems to be followed by most implementations:"

--
Michael Fuhr
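For reference, the quoting rules RFC 4180 documents are short: fields containing line breaks, double quotes, or commas are enclosed in double quotes, a double quote inside a quoted field is doubled, and records end with CRLF. A small Python sketch of just those rules follows; the function names are invented here, and quoting on a bare CR or LF (the RFC speaks of CRLF) is a cautious assumption of this sketch.

```python
def rfc4180_field(value):
    """Quote a field if it contains a comma, a double quote, or a line break."""
    if any(ch in value for ch in ',"\r\n'):
        # Double any embedded quotes, then wrap the whole field in quotes.
        return '"' + value.replace('"', '""') + '"'
    return value

def rfc4180_record(fields):
    """Join fields with commas and terminate the record with CRLF."""
    return ",".join(rfc4180_field(f) for f in fields) + "\r\n"
```

For example, `rfc4180_record(["1", "two\nlines"])` keeps the line feed inside a quoted second field.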
Hi Folks,
Sorry if this is a duplicate post; I've been trying to find a solution for importing data into Postgres from a CSV file. The problem is that I have a database whose columns contain newline characters (Mac and Unix). When I export the data to CSV format, there are line breaks (mixed Unix and Mac) in the data, which break the COPY procedure.
==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
http://codeelixir.com
==================================================================