Re: [PATCH] Reject ENCODING option for COPY TO FORMAT JSON
| From | Andrew Dunstan |
|---|---|
| Subject | Re: [PATCH] Reject ENCODING option for COPY TO FORMAT JSON |
| Date | |
| Msg-id | 297f1c95-63dd-4180-824e-3448e2e25fa3@dunslane.net |
| In response to | Re: [PATCH] Reject ENCODING option for COPY TO FORMAT JSON (Ayush Tiwari <ayushtiwari.slg01@gmail.com>) |
| Responses | Re: [PATCH] Reject ENCODING option for COPY TO FORMAT JSON |
| List | pgsql-hackers |
On 2026-04-29 We 12:49 PM, Ayush Tiwari wrote:
Hi,

On Mon, 20 Apr 2026 at 20:31, Ayush Tiwari <ayushtiwari.slg01@gmail.com> wrote:

Hi,

On Mon, 20 Apr 2026 at 19:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Seems to me the correct thing here is to make it work like the other cases, ie perform pg_server_to_any(). I have exactly no sympathy for the argument about the RFC saying it must be UTF-8, not least because that's not in fact what is implemented (what if the server encoding isn't UTF-8?).

Agreed. I initially thought rejecting the option was the safer route given the RFC, but as you pointed out, we aren't enforcing UTF-8 strictly on the server side anyway.

Rejecting this option altogether doesn't improve anything, not functionally, not specs-compliance-wise, nor according to the principle of least surprise.

Makes sense. Implementing the conversion properly keeps JSON format consistent with how the text and CSV formats behave.

No, you don't get to punt this till later. Once we ship v19 there's going to be a strong expectation of backwards compatibility. The idea of sending UTF-8 to a client that's set client_encoding to something else would be risible, if it weren't a security hazard.

I agree sending unconverted bytes to a mismatched client encoding is clearly a security hazard that needs addressing. Did not consider the backward compatibility part, my bad.

Was trying out adding pg_server_to_any() to the json_buf after composite_to_json() returns, correctly covering both explicit ENCODING option specifications and implicit client_encoding mismatches.

Let me send a patch with code and associated test cases.

Attached patch with round trip test case. Please review and let me know if it's in the right direction.

I have registered this patch set in the CommitFest for tracking:

Please let me know if the patch looks good, and if I need to add it in the open items list for PG 19.

Basically good, I think. I have modified your test a bit, testing more directly for the presence of the LATIN-1 encoded character and the absence of the UTF-8 encoded character, by reading in the file with pg_read_binary_file, and adding a test for implicit encoding by setting client_encoding.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
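
The byte-level check described in the reply above can be illustrated outside PostgreSQL. The following is only a rough Python analogy, not the patch or its regression test: `convert_json_line` is a hypothetical helper that stands in for "build the JSON text, then transcode it to the target encoding", and the sample row is made up. The actual test, per the message, reads the output file with pg_read_binary_file.

```python
import json


def convert_json_line(row: dict, target_encoding: str) -> bytes:
    """Hypothetical stand-in: serialize a row as JSON, then transcode it.

    This mirrors the idea of converting the finished JSON text to the
    requested encoding; it is not PostgreSQL's pg_server_to_any().
    """
    # ensure_ascii=False keeps non-ASCII characters as literal characters
    # instead of \uXXXX escapes, so the target encoding actually matters.
    text = json.dumps(row, ensure_ascii=False)
    return text.encode(target_encoding)


# Made-up sample row containing U+00E9 (e with acute accent).
row = {"name": "caf\u00e9"}

latin1_line = convert_json_line(row, "latin-1")
utf8_line = convert_json_line(row, "utf-8")

# U+00E9 is the single byte 0xE9 in LATIN-1 and the bytes 0xC3 0xA9 in UTF-8.
# Checking for one and against the other shows a conversion really happened.
assert b"\xe9" in latin1_line
assert b"\xc3\xa9" not in latin1_line
assert b"\xc3\xa9" in utf8_line
```

Checking raw bytes rather than decoded text matters here: a comparison done after decoding would pass whether or not the output had been converted.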