Re: [PATCH] Reject ENCODING option for COPY TO FORMAT JSON

From Andrew Dunstan
Subject Re: [PATCH] Reject ENCODING option for COPY TO FORMAT JSON
Date
Msg-id 297f1c95-63dd-4180-824e-3448e2e25fa3@dunslane.net
In reply to Re: [PATCH] Reject ENCODING option for COPY TO FORMAT JSON  (Ayush Tiwari <ayushtiwari.slg01@gmail.com>)
Responses Re: [PATCH] Reject ENCODING option for COPY TO FORMAT JSON
List pgsql-hackers


On 2026-04-29 We 12:49 PM, Ayush Tiwari wrote:
Hi,

On Mon, 20 Apr 2026 at 20:31, Ayush Tiwari <ayushtiwari.slg01@gmail.com> wrote:
Hi,

On Mon, 20 Apr 2026 at 19:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
 
Seems to me the correct thing here is to make it work like the other
cases, ie perform pg_server_to_any().  I have exactly no sympathy for
the argument about the RFC saying it must be UTF-8, not least because
that's not in fact what is implemented (what if the server encoding
isn't UTF-8?).

Agreed. I initially thought rejecting the option was the safer route 
given the RFC, but as you pointed out, we aren't enforcing 
UTF-8 strictly on the server side anyway. 


Rejecting this option altogether doesn't improve anything, not
functionally, not specs-compliance-wise, nor according to the
principle of least surprise.
 
Makes sense. Implementing the conversion properly 
keeps JSON format consistent with how the text and CSV formats behave.

No, you don't get to punt this till later.  Once we ship v19 there's
going to be a strong expectation of backwards compatibility.

The idea of sending UTF-8 to a client that's set client_encoding to
something else would be risible, if it weren't a security hazard.

I agree that sending unconverted bytes to a client with a mismatched
client encoding is clearly a security hazard that needs addressing. I did
not consider the backward-compatibility aspect, my bad.

I was trying out applying pg_server_to_any() to json_buf after
composite_to_json() returns, which should cover both explicit ENCODING
option specifications and implicit client_encoding mismatches.
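
To illustrate the idea (a rough Python sketch, not the patch itself): the row is rendered to JSON in the server encoding, and the resulting buffer is then converted to the target encoding, whether that comes from an explicit ENCODING option or from client_encoding. The function name server_to_any and the sample row are hypothetical stand-ins; the real code is C and calls pg_server_to_any().

```python
import json


def server_to_any(buf: bytes, server_encoding: str, target_encoding: str) -> bytes:
    """Hypothetical stand-in for the conversion step: re-encode a buffer
    from the server encoding to the target encoding."""
    if server_encoding == target_encoding:
        # No conversion needed; hand back the buffer unchanged.
        return buf
    return buf.decode(server_encoding).encode(target_encoding)


# Assumed sample row. ensure_ascii=False keeps the accented character as a
# raw character instead of a \u escape, so the output encoding matters.
row = {"name": "caf\u00e9"}
json_buf = json.dumps(row, ensure_ascii=False).encode("utf-8")

# Target encoding, taken from ENCODING or from client_encoding.
latin1_buf = server_to_any(json_buf, "utf-8", "latin-1")
```

In this sketch the same conversion path serves both cases, which is the point of doing the conversion after the JSON text has been built.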

Let me send a patch with code and associated test cases.

 
Attached patch with round trip test case. Please review and let me 
know if it's in the right direction.

I have registered this patch set in the CommitFest for tracking: 

Please let me know if the patch looks good, and if I need to add it
to the open items list for PG 19.



Basically good, I think. I have modified your test a bit so that it tests more directly for the presence of the LATIN-1 encoded character and the absence of the UTF-8 encoded character, by reading in the file with pg_read_binary_file. I also added a test for implicit encoding conversion by setting client_encoding.
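
The byte-level check can be sketched roughly as follows (in Python rather than the actual SQL regression test, which is not shown here; the sample character "é", the file contents, and the helper name are assumptions). In LATIN1, "é" is the single byte 0xE9, while in UTF-8 it is the two-byte sequence 0xC3 0xA9, so reading the file as raw bytes shows which encoding was actually written.

```python
import os
import tempfile

LATIN1_E_ACUTE = b"\xe9"      # "é" encoded in LATIN1
UTF8_E_ACUTE = b"\xc3\xa9"    # "é" encoded in UTF-8


def looks_latin1_encoded(path: str) -> bool:
    """Hypothetical check: the LATIN1 byte is present and the UTF-8
    byte sequence is absent in the raw file contents."""
    with open(path, "rb") as f:
        data = f.read()
    return LATIN1_E_ACUTE in data and UTF8_E_ACUTE not in data


# Write an assumed sample of COPY output in LATIN1, then inspect the bytes.
with tempfile.NamedTemporaryFile(delete=False, suffix=".json") as f:
    f.write('{"name": "caf\u00e9"}\n'.encode("latin-1"))
    path = f.name

result = looks_latin1_encoded(path)
os.remove(path)
```

Checking raw bytes avoids any re-conversion on the way back in, which is why a binary read is more direct than a text round trip.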


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com