Обсуждение: COPY JSON: use trailing commas in FORCE_ARRAY output
Hi,
Another issue I found with COPY TO (FORMAT json, FORCE_ARRAY) is that it places the delimiter comma at the beginning of
thenext line, like this:
```
evantest=# copy test_json_copy to stdout with (format json, force_array);
[
{"id":1,"name":"Alice","is_active":true,"tags":["dev","db"],"created_at":"2026-05-01"}
,{"id":2,"name":"Bob","is_active":false,"tags":["manager"],"created_at":"2026-05-02"}
,{"id":3,"name":"Charlie","is_active":null,"tags":null,"created_at":null}
,{"id":4,"name":"Special Case: \"Quotes\"","is_active":true,"tags":["a","b"],"created_at":"2026-05-04"}
]
```
I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon.
For comparison, the existing json_agg() places commas at the end of the line:
```
evantest=# select json_agg(t) from (select id, name from test_json_copy) t;
json_agg
----------------------------------------------
[{"id":1,"name":"Alice"}, +
{"id":2,"name":"Bob"}, +
{"id":3,"name":"Charlie"}, +
{"id":4,"name":"Special Case: \"Quotes\""}]
(1 row)
```
If this feature had already been released, I would not think it worth changing just for formatting. But since "FORMAT
json"is a new PG19 feature and has not been released yet, I think it is better to make the output to follow the more
commonstyle.
This patch changes the output to place the comma at the end of the previous line instead. The fix only adjusts how
commasand newlines are emitted. It does not buffer the whole result, so it should not have any performance impact.
See the attached patch for details.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Вложения
Hi,
On Wed, 6 May 2026 at 12:11, Chao Li <li.evan.chao@gmail.com> wrote:
Hi,
Another issue I found with COPY TO (FORMAT json, FORCE_ARRAY) is that it places the delimiter comma at the beginning of the next line, like this:
```
evantest=# copy test_json_copy to stdout with (format json, force_array);
[
{"id":1,"name":"Alice","is_active":true,"tags":["dev","db"],"created_at":"2026-05-01"}
,{"id":2,"name":"Bob","is_active":false,"tags":["manager"],"created_at":"2026-05-02"}
,{"id":3,"name":"Charlie","is_active":null,"tags":null,"created_at":null}
,{"id":4,"name":"Special Case: \"Quotes\"","is_active":true,"tags":["a","b"],"created_at":"2026-05-04"}
]
```
I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon.
For comparison, the existing json_agg() places commas at the end of the line:
```
evantest=# select json_agg(t) from (select id, name from test_json_copy) t;
json_agg
----------------------------------------------
[{"id":1,"name":"Alice"}, +
{"id":2,"name":"Bob"}, +
{"id":3,"name":"Charlie"}, +
{"id":4,"name":"Special Case: \"Quotes\""}]
(1 row)
```
If this feature had already been released, I would not think it worth changing just for formatting. But since "FORMAT json" is a new PG19 feature and has not been released yet, I think it is better to make the output to follow the more common style.
This patch changes the output to place the comma at the end of the previous line instead. The fix only adjusts how commas and newlines are emitted. It does not buffer the whole result, so it should not have any performance impact.
regression test too passed.
I agree that since FORMAT json is new for PG19, it is reasonable to adjust
the formatting before release. The implementation still streams the output
I agree that since FORMAT json is new for PG19, it is reasonable to adjust
the formatting before release. The implementation still streams the output
and only changes where the separator/newline are emitted.
Regards,
Ayush
Ayush
> On 6 May 2026, at 08:40, Chao Li <li.evan.chao@gmail.com> wrote: > I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon. It might look uncommon, but for very wide lines it's IMHO preferrable to not have to scroll all the way to the end of the line to know that the line is part of an array. > For comparison, the existing json_agg() places commas at the end of the line: That's true, but json_agg() and COPY TO in ndjson format have different use cases. > ..it should not have any performance impact. It does add branches though, and in one branch use a non-inlined function where previously it would unconditionally use an inline function. ISTM it would still be valuable to do performance testing given that COPY is commonly used in performance sensitive settings. -- Daniel Gustafsson
On 5/6/26 2:40 PM, Chao Li wrote:
> Hi,
>
> Another issue I found with COPY TO (FORMAT json, FORCE_ARRAY) is that it places the delimiter comma at the beginning
ofthe next line, like this:
> ```
> evantest=# copy test_json_copy to stdout with (format json, force_array);
> [
> {"id":1,"name":"Alice","is_active":true,"tags":["dev","db"],"created_at":"2026-05-01"}
> ,{"id":2,"name":"Bob","is_active":false,"tags":["manager"],"created_at":"2026-05-02"}
> ,{"id":3,"name":"Charlie","is_active":null,"tags":null,"created_at":null}
> ,{"id":4,"name":"Special Case: \"Quotes\"","is_active":true,"tags":["a","b"],"created_at":"2026-05-04"}
> ]
> ```
>
> I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon.
>
> For comparison, the existing json_agg() places commas at the end of the line:
> ```
> evantest=# select json_agg(t) from (select id, name from test_json_copy) t;
> json_agg
> ----------------------------------------------
> [{"id":1,"name":"Alice"}, +
> {"id":2,"name":"Bob"}, +
> {"id":3,"name":"Charlie"}, +
> {"id":4,"name":"Special Case: \"Quotes\""}]
> (1 row)
> ```
>
> If this feature had already been released, I would not think it worth changing just for formatting. But since "FORMAT
json"is a new PG19 feature and has not been released yet, I think it is better to make the output to follow the more
commonstyle.
>
> This patch changes the output to place the comma at the end of the previous line instead. The fix only adjusts how
commasand newlines are emitted. It does not buffer the whole result, so it should not have any performance impact.
>
> See the attached patch for details.
>
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>
>
>
Thanks for the patch, I like it as I feel better with placing commas at the end of lines.
I have a small suggestion. The function name CopySendTextLikeEOL reads very similar to the existing
CopySendTextLikeEndOfRow.Would it better to rename it to CopySendTextLikeLineTerminator?
Other than that, the patch looks good to me.
Regards,
Alex Guo
> On May 6, 2026, at 16:23, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 6 May 2026, at 08:40, Chao Li <li.evan.chao@gmail.com> wrote: > >> I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon. > > It might look uncommon, but for very wide lines it's IMHO preferrable to not > have to scroll all the way to the end of the line to know that the line is part > of an array. > >> For comparison, the existing json_agg() places commas at the end of the line: > > That's true, but json_agg() and COPY TO in ndjson format have different use > cases. > >> ..it should not have any performance impact. > > It does add branches though, and in one branch use a non-inlined function where > previously it would unconditionally use an inline function. ISTM it would > still be valuable to do performance testing given that COPY is commonly used in > performance sensitive settings. > Make sense. I just did a test to compare the performance between master and the patch: For the data setup, since the patch only changes where the comma is emitted, I intentionally used a table with only one column,to minimize the cost of formatting each row: ``` DROP TABLE IF EXISTS copy_json_force_array_perf; CREATE UNLOGGED TABLE copy_json_force_array_perf(id int); INSERT INTO copy_json_force_array_perf SELECT g FROM generate_series(1, 10000000) AS g; VACUUM ANALYZE copy_json_force_array_perf; \timing on ``` On master: ``` evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1208.694 ms (00:01.209) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1200.203 ms (00:01.200) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1238.639 ms (00:01.239) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1211.344 ms (00:01.211) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1252.197 ms (00:01.252) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1223.510 ms (00:01.224) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1212.378 ms (00:01.212) ``` Average: ~1221 ms With the patch: ``` evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1218.580 ms (00:01.219) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1212.913 ms (00:01.213) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1204.350 ms (00:01.204) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1205.276 ms (00:01.205) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1202.088 ms (00:01.202) evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY); COPY 10000000 Time: 1222.390 ms (00:01.222) ``` Average: ~1211 ms The difference doesn't look quite meaningful from this test. I built with debug and asserts disabled, and compiled with -O2.The output was written to /dev/null to avoid client/network overhead. The tests ran on my MacBook M4. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
> On May 6, 2026, at 16:46, Alex Guo <guo.alex.hengchen@gmail.com> wrote:
>
>
> On 5/6/26 2:40 PM, Chao Li wrote:
>> Hi,
>>
>> Another issue I found with COPY TO (FORMAT json, FORCE_ARRAY) is that it places the delimiter comma at the beginning
ofthe next line, like this:
>> ```
>> evantest=# copy test_json_copy to stdout with (format json, force_array);
>> [
>> {"id":1,"name":"Alice","is_active":true,"tags":["dev","db"],"created_at":"2026-05-01"}
>> ,{"id":2,"name":"Bob","is_active":false,"tags":["manager"],"created_at":"2026-05-02"}
>> ,{"id":3,"name":"Charlie","is_active":null,"tags":null,"created_at":null}
>> ,{"id":4,"name":"Special Case: \"Quotes\"","is_active":true,"tags":["a","b"],"created_at":"2026-05-04"}
>> ]
>> ```
>>
>> I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon.
>>
>> For comparison, the existing json_agg() places commas at the end of the line:
>> ```
>> evantest=# select json_agg(t) from (select id, name from test_json_copy) t;
>> json_agg
>> ----------------------------------------------
>> [{"id":1,"name":"Alice"}, +
>> {"id":2,"name":"Bob"}, +
>> {"id":3,"name":"Charlie"}, +
>> {"id":4,"name":"Special Case: \"Quotes\""}]
>> (1 row)
>> ```
>>
>> If this feature had already been released, I would not think it worth changing just for formatting. But since
"FORMATjson" is a new PG19 feature and has not been released yet, I think it is better to make the output to follow the
morecommon style.
>>
>> This patch changes the output to place the comma at the end of the previous line instead. The fix only adjusts how
commasand newlines are emitted. It does not buffer the whole result, so it should not have any performance impact.
>>
>> See the attached patch for details.
>>
>> Best regards,
>> --
>> Chao Li (Evan)
>> HighGo Software Co., Ltd.
>> https://www.highgo.com/
>>
>>
>>
> Thanks for the patch, I like it as I feel better with placing commas at the end of lines.
>
> I have a small suggestion. The function name CopySendTextLikeEOL reads very similar to the existing
CopySendTextLikeEndOfRow.Would it better to rename it to CopySendTextLikeLineTerminator?
>
> Other than that, the patch looks good to me.
>
> Regards,
> Alex Guo
Thanks for the suggestion, I take it.
PFA v2 - Renamed CopySendTextLikeEOL to CopySendTextLikeLineTerminator.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/