Re: Add header support to text format and matching feature

Поиск
Список
Период
Сортировка
От Julien Rouhaud
Тема Re: Add header support to text format and matching feature
Дата
Msg-id 20220607154744.vvmitnqhyxrne5ms@jrouhaud
обсуждение исходный текст
Ответ на Re: Add header support to text format and matching feature  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Ответы Re: Add header support to text format and matching feature  (Andrew Dunstan <andrew@dunslane.net>)
Список pgsql-hackers
Hi,

On Wed, Mar 30, 2022 at 09:11:09AM +0200, Peter Eisentraut wrote:
>
> Committed, after some further refinements as discussed.

While working on nearby code, I found some problems with this feature.

First, probably nitpicking, the HEADER MATCH is allowed for COPY TO, is that
expected?  The documentation isn't really explicit about it, but there's
nothing to match when exporting data it's a bit surprising.  I'm not opposed to
have HEADER MATCH means HEADER ON for COPY TO, as as-is one can easily reuse
the commands history, but maybe it should be clearly documented?

Then, apparently HEADER MATCH doesn't let you do sanity checks against a custom
column list.  This one looks like a clear oversight, as something like that
should be entirely valid IMHO:

CREATE TABLE tbl(col1 int, col2 int);
COPY tbl (col2, col1) TO '/path/to/file' WITH (HEADER MATCH);
COPY tbl (col2, col1) FROM '/path/to/file' WITH (HEADER MATCH);

but right now it errors out with:

ERROR:  column name mismatch in header line field 1: got "col1", expected "col2"

Note that the error message is bogus if you specify attributes in a
different order from the relation, as the code is mixing access to the tuple
desc and access to the raw fields with the same offset.

This also means that it will actually fail to detect a mismatch in the provided
column list and let you import data in the wrong position as long as the
datatypes are compatible and the column header in the file are in the correct
order.  For instance:

CREATE TABLE abc (a text, b text, c text);
INSERT INTO abc SELECT 'a', 'b', 'c';
COPY abc TO '/path/to/file' WITH (HEADER MATCH);

You can then import the data with any of those:
COPY abc(c, b, a) TO '/path/to/file' WITH (HEADER MATCH);
COPY abc(c, a, b) TO '/path/to/file' WITH (HEADER MATCH);
[...]
SELECT * FROM abc;

Even worse, if you try to do a COPY ... FROM ... WITH (HEADER ON) on a table
that has some dropped attribute(s).  The current code will access random memory
as there's no exact attnum / raw field mapping anymore.

I can work on a fix if needed (with some additional regression test to cover
those cases), but I'm still not sure that having a user provided column list is
supposed to be accepted or not for the HEADER MATCH.  In the meantime I will
add an open item.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Rod Taylor
Дата:
Сообщение: Re: Collation version tracking for macOS
Следующее
От: Robert Haas
Дата:
Сообщение: Re: pgcon unconference / impact of block size on performance