Robert Haas wrote:
> Part of my hesitancy, I suppose, is that I don't
> understand why we even have this strange convention of making \.
> terminate the input in the first place -- I mean, why wouldn't that be
> done in some kind of out-of-band way, rather than including a special
> marker in the data?
The v3 protocol added the out-of-band method, but the v2 protocol
did not have it, and as far as I understand, this is the reason why
CopyReadLineText() must interpret \. as an end-of-data marker.
The v2 protocol was removed in PostgreSQL 14:
https://www.postgresql.org/docs/release/14.0/
<quote>
Remove server and libpq support for the version 2 wire protocol (Heikki
Linnakangas)
This was last used as the default in PostgreSQL 7.3 (released in 2002).
</quote>
Also, I hadn't noticed this before, but the current documentation has a
passage that is relevant to this patch:
https://www.postgresql.org/docs/current/protocol-changes.html
"Summary of Changes since Protocol 2.0"
<quote>
COPY data is now encapsulated into CopyData and CopyDone
messages. There is a well-defined way to recover from errors during
COPY. The special “\.” last line is not needed anymore, and is not
sent during COPY OUT. (It is still recognized as a terminator during
COPY IN, but its use is deprecated and will eventually be removed.)
</quote>
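To make the out-of-band termination concrete, here is a minimal sketch
in Python of how a v3 client frames COPY IN data. It is only an
illustration, not code from libpq or the server; the helper names
(copy_data, copy_done, frame_copy_in) are made up for this example.
The end of data is signalled by the CopyDone message, so a "\." line
inside a CopyData payload does not need to mean anything.

```python
import struct


def copy_data(payload: bytes) -> bytes:
    # CopyData: type byte 'd', then an Int32 length that counts itself
    # (4 bytes) plus the payload, but not the type byte.
    return b"d" + struct.pack("!I", len(payload) + 4) + payload


def copy_done() -> bytes:
    # CopyDone: type byte 'c' and a length of 4 (no payload).
    return b"c" + struct.pack("!I", 4)


def frame_copy_in(rows):
    # Hypothetical helper: one CopyData message per row, then CopyDone.
    # The terminator lives in the framing, not in the data.
    return b"".join(copy_data(row) for row in rows) + copy_done()


# The middle row is a literal backslash-dot line; it is just payload.
stream = frame_copy_in([b"1,foo\n", b"\\.\n", b"2,bar\n"])
```

One message per row is a choice made for readability here; the
protocol does not require CopyData boundaries to match row boundaries.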
What the present patch does, for the server-side part, is essentially
stop recognizing "\." as a terminator, as this paragraph says, but
it does that for CSV only, not for TEXT.
> Hmm. Looking at the rest of the patch, it seems like you're removing
> the logic that prevents us from interpreting
>
> \. lksdghksdhgjskdghjs
>
> as an end-of-file while in CSV mode. But I would have thought based on
> what problem you're trying to fix that you would have wanted to keep
> that logic and further restrict it so that it only applies when not
> within a quoted string.
>
> Maybe I'm misunderstanding what bug you're trying to fix?
The fix is that \. is no longer recognized as special in CSV, whether
alone on a line or not, and whether in a quoted section or not.
It's always interpreted as data, like it would have been in
the first place, I imagine, if the v2 protocol could have handled
it. This is why the patch consists mostly of removing code and
simplifying comments.
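As an illustration of the intended CSV behavior, here is a small
sketch using Python's standard csv module. It is not PostgreSQL's
parser and says nothing about CopyReadLineText() internals; it only
shows a CSV reader that treats \. as ordinary data, both alone on a
line and inside a quoted field. The function name read_csv_rows and
the sample input are made up for this example.

```python
import csv
import io


def read_csv_rows(text: str):
    # Python's default CSV dialect has no escape character, so a
    # backslash is an ordinary character and "\." is ordinary data.
    return list(csv.reader(io.StringIO(text)))


# Line 2 is a bare backslash-dot line; lines 3 to 5 form one row whose
# first field is quoted and spans three lines, the middle one being "\.".
sample = 'a,b\n\\.\n"x\n\\.\ny",z\n'
rows = read_csv_rows(sample)

for row in rows:
    print(row)
```

All three rows are returned, and nothing is cut off at the first \.
line.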
Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite