Re: exposing COPY API
From | Andrew Dunstan |
---|---|
Subject | Re: exposing COPY API |
Date | |
Msg-id | 4D4C0645.2010109@dunslane.net |
In response to | Re: exposing COPY API (Itagaki Takahiro <itagaki.takahiro@gmail.com>) |
List | pgsql-hackers |
On 02/04/2011 05:49 AM, Itagaki Takahiro wrote:
> Here is a demonstration to support jagged input files. It's a patch
> on the latest patch. The new added API is:
>
>    bool NextLineCopyFrom(
>          [IN] CopyState cstate,
>          [OUT] char ***fields, [OUT] int *nfields, [OUT] Oid *tupleOid)
>
> It just returns separated fields in the next line. Fortunately, I need
> no extra code for it because it is just extracted from NextCopyFrom().

Thanks, I'll have a look at it, after an emergency job I need to attend to.

But the API looks weird. Why are fields and nfields OUT params?

The issue isn't decomposing the line into raw fields. The code for doing
that works fine as is, including on jagged files. See commit
af1a614ec6d074fdea46de2e1c462f23fc7ddc6f, which was done for exactly this
purpose. The issue is taking those fields and composing them into the
expected tuple.

> I'm willing to include the change into copy APIs,
> but we still have a few issues. See below.
>
> On Fri, Feb 4, 2011 at 16:53, Andrew Dunstan<andrew@dunslane.net> wrote:
>> The problem with COPY FROM is that nobody's come up with a good syntax for
>> allowing it as a FROM target. Doing what I want via FDW neatly gets us
>> around that problem. But I'm quite OK with doing the hard work inside the
>> COPY code - that's what my working prototype does in fact.
> I think it is not only syntax issue. I found an issue that we hard to
> support FORCE_NOT_NULL option for extra fields. See FIXME in the patch.
> It is a fundamental problem to support jagged fields.

It's not a problem at all if you turn the line into a text array. That's
exactly why we've been proposing it for this. The array has however many
elements are on the line.

>> One thing I'd like is to have file_fdw do something we can't do another
>> way. currently it doesn't, so it's nice but uninteresting.
> BTW, how do you determine which field is shifted in your broken CSV file?
> For example, the case you find "AB,CD,EF" for 2 columns tables.
> I could provide a raw CSV reader for jagged files, but you still have to
> cook the returned fields into a proper tuple...
>

See above. My client, who deals with this situation and has been doing so
for years, treats underflowing fields as null and ignores overflowing
fields. They would do the same if the data were delivered as a text array.
It works very well for them.

See <https://github.com/adunstan/postgresql-dev/tree/sqlmed2> for my dev
branch on this.

cheers

andrew
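
The policy described above (underflowing fields become null, overflowing fields are ignored) can be sketched in plain C. This is only an illustration under stated assumptions: fit_fields() is a hypothetical helper, not a PostgreSQL function and not code from the patch or the sqlmed2 branch, and a C NULL pointer stands in for SQL NULL.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of PostgreSQL or of the patch under
 * discussion): fit the raw fields of one jagged line into a row that
 * has exactly ncols columns.
 *
 *   - columns with no matching field (underflow) are set to NULL
 *   - fields beyond ncols (overflow) are silently ignored
 */
void
fit_fields(char **fields, int nfields, const char **row, int ncols)
{
    for (int i = 0; i < ncols; i++)
        row[i] = (i < nfields) ? fields[i] : NULL;
}
```

With this policy, the line "AB,CD,EF" read into a two-column table yields the row ("AB", "CD") and drops "EF", while a line containing only "AB" yields ("AB", NULL).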