RE: Off topic? - Solution for a delimiter problem
From | Andrew Gould
Subject | RE: Off topic? - Solution for a delimiter problem
Date |
Msg-id | 20001221154635.21510.qmail@web124.yahoomail.com
In response to | Off topic? - Solution for a delimiter problem (Andrew Gould <andrewgould@yahoo.com>)
List | pgsql-general
I think (I could be misreading the sed line) you're assuming that every field is a text field. I receive many files with both text and numeric fields. The numeric fields are not double quoted, so automating the pairing of double quotes as you have done becomes more complicated.

The algorithm I used reads the file one character at a time. The detection of a double quotation mark changes the value of a variable by multiplying the variable's value by -1. The variable's value acts as an on/off switch that determines whether or not a space will be replaced by a tab. This algorithm allows for any combination of text and numeric fields.

Thank you for the suggestion, though. I am trying to learn more about regular expressions.

Andrew Gould

--- Jeff Eckermann <jeckermann@verio.net> wrote:
> Isn't this overkill?
> Something like:
> sed -e 's/\" \"/^I/g' -e 's/\"//g' <file>
> would do the job nicely. I'm sure people on this list can suggest even
> simpler ways (note: the "^I", or tab character, in the example is created
> by typing Ctrl-V then Ctrl-I).
> Sed has been (and is) my friend for many such cases.
>
> > -----Original Message-----
> > From: Andrew Gould [SMTP:andrewgould@yahoo.com]
> > Sent: Wednesday, December 20, 2000 5:58 PM
> > To: Postgres Mailing List
> > Subject: Re: [GENERAL] Off topic? - Solution for a delimiter problem
> >
> > If space was used as the delimiter without accounting
> > for the quotation marks, text fields might be broken
> > apart whenever a space occurred. For example, the
> > street address "123 Sesame Street" might be split into
> > 3 fields containing '"123', 'Sesame' and 'Street"'
> > respectively. In the past, I imported the file into
> > MySQL, then exported the data into a tab delimited
> > text file for importing into PostgreSQL.
> >
> > The Python script saves many steps, so I was offering
> > it up for use and improvement by others.
> > I didn't know if I should post the script to the mailing
> > list or send it directly to people who are interested.
> > (It's only 1k.)
> >
> > Andrew Gould
> >
> > --- "Poul L. Christiansen" <poulc@cs.auc.dk> wrote:
> > > Maybe you can specify the delimiter to be space.
> > >
> > > radius=# \h copy
> > > Command: COPY
> > > Description: Copies data between files and tables
> > > Syntax:
> > > COPY [ BINARY ] table [ WITH OIDS ]
> > >     FROM { 'filename' | stdin }
> > >     [ [USING] DELIMITERS 'delimiter' ]
> > >     [ WITH NULL AS 'null string' ]
> > > COPY [ BINARY ] table [ WITH OIDS ]
> > >     TO { 'filename' | stdout }
> > >     [ [USING] DELIMITERS 'delimiter' ]
> > >     [ WITH NULL AS 'null string' ]
> > >
> > > That would probably be COPY table FROM filename
> > > USING DELIMITERS ' ';
> > >
> > > HTH,
> > > Poul L. Christiansen
> > >
> > > Andrew Gould wrote:
> > >
> > > > I receive space delimited files that have spaces
> > > > within double quoted text fields.
> > > >
> > > > Since PostgreSQL's COPY doesn't deal with multiple
> > > > delimiters, I've written a crude Python script to
> > > > convert the files to a tab delimited format without
> > > > the double quotes and without disturbing the spaces
> > > > within text fields. I'm sure someone could make it
> > > > more efficient. (Probably by rewriting it in Perl.)
> > > >
> > > > Does anyone else need the script? Is the script
> > > > something I can/should post?
> > > >
> > > > Andrew Gould
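[Editor's note: the toggle algorithm described above can be sketched in a few lines of Python. This is not Andrew's actual script, which was not posted to the list; it is a minimal reconstruction of the approach he describes, with a hypothetical function name. The quote character flips a sign variable, and spaces are converted to tabs only while the sign says we are outside a quoted field.]

```python
def space_delimited_to_tabs(line):
    """Sketch of the toggle approach described in the post (hypothetical
    name): convert a space-delimited line with double-quoted text fields
    to tab-delimited output, dropping the quotes and leaving spaces
    inside quoted fields untouched."""
    inside = -1          # multiplied by -1 on each quote: -1 = outside, 1 = inside
    out = []
    for ch in line:
        if ch == '"':
            inside *= -1         # the on/off switch; quote itself is dropped
        elif ch == ' ' and inside == -1:
            out.append('\t')     # delimiter space outside quotes -> tab
        else:
            out.append(ch)       # ordinary character, or space inside a field
    return ''.join(out)
```

For the street-address example from the thread, `space_delimited_to_tabs('"123 Sesame Street" 42')` yields `123 Sesame Street\t42`: the quoted field stays one field, and only the delimiter space becomes a tab. The resulting file loads with `COPY table FROM 'filename'` using the default tab delimiter.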