Re: Fixing backslash dot for COPY FROM...CSV

Поиск
Список
Период
Сортировка
От Daniel Verite
Тема Re: Fixing backslash dot for COPY FROM...CSV
Дата
Msg-id 918e339c-c253-4366-847b-96f926ba46a0@manitou-mail.org
обсуждение исходный текст
Ответ на Re: Fixing backslash dot for COPY FROM...CSV  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Fixing backslash dot for COPY FROM...CSV
Список pgsql-hackers
    Tom Lane wrote:

> I've looked over this patch and I generally agree that this is a
> reasonable solution.

Thanks for reviewing this!

> I'm also wondering why the patch adds a test for
> "PQprotocolVersion(conn) >= 3" in handleCopyIn.

I've removed this in the attached update.

> I concur with Robert's doubts about some of the doc changes though.
> In particular, since we're not changing the behavior for non-CSV
> mode, we shouldn't remove any statements that apply to non-CSV mode.

I don't quite understand the issues with the doc changes. Let me
detail the changes.

The first change is under
  <refsect2>
   <title>CSV Format</title>
so it does no concern non-CSV modes.

--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -761,11 +761,7 @@ COPY <replaceable class="parameter">count</replaceable>
    format, <literal>\.</literal>, the end-of-data marker, could also appear
    as a data value.  To avoid any misinterpretation, a <literal>\.</literal>
    data value appearing as a lone entry on a line is automatically
-    quoted on output, and on input, if quoted, is not interpreted as the
-    end-of-data marker.  If you are loading a file created by another
-    application that has a single unquoted column and might have a
-    value of <literal>\.</literal>, you might need to quote that value in
the
-    input file.
+    quoted on output.
   </para>


The part about quoting output is kept because the code still does that.

About this bit:
 "and on input, if quoted, is not interpreted as the end-of-data marker."

So the patch removes that part. The reason is \. is now not interpreted as
EOD in both cases, quoted or unquoted, conforming to spec.
Previously, what the reader should have understood by "if quoted, ..."
is that it implies "if not quoted, then .\ will be interpreted as EOD
even though that behavior does not conform to the CSV spec".
If we documented the new behavior, it would be something like:
when quoted, it works as expected, and when unquoted, it works as
expected too. Isn't it simpler not to say anything?

About the next sentence:
  "If you are loading a file created by another application
  that has a single unquoted column and might have a value of \., you
  might need to quote that value in the input file."

It's removed as well because it's not longer necessary
to do that.

The other hunk is in psql doc:

--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1119,10 +1119,6 @@ INSERT INTO tbl1 VALUES ($1, $2) \bind 'first value'
'second value' \g
     destination, because all data must pass through the client/server
     connection.  For large amounts of data the <acronym>SQL</acronym>
     command might be preferable.
-     Also, because of this pass-through method, <literal>\copy
-     ... from</literal> in <acronym>CSV</acronym> mode will erroneously
-     treat a <literal>\.</literal> data value alone on a line as an
-     end-of-input marker.

That behavior no longer happens, so this gets removed as well.


Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dagfinn Ilmari Mannsåker
Дата:
Сообщение: Re: IPC::Run::time[r|out] vs our TAP tests
Следующее
От: Nathan Bossart
Дата:
Сообщение: Re: Popcount optimization using AVX512