Re: [PATCH] Add error hints for invalid COPY options

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: [PATCH] Add error hints for invalid COPY options
Дата
Msg-id CAD21AoBC9O80OJchfyjnnxCZV-YfR-fDXyCKLNxdXdTNFYVk1Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Add error hints for invalid COPY options  (Sugamoto Shinya <shinya34892@gmail.com>)
Список pgsql-hackers
On Tue, Dec 2, 2025 at 3:42 AM Sugamoto Shinya <shinya34892@gmail.com> wrote:
>
>
>
> On Sat, Nov 29, 2025 at 9:36 PM Sugamoto Shinya <shinya34892@gmail.com> wrote:
>>
>>
>>
>> On Fri, Nov 28, 2025 at 2:59 AM Kirill Reshke <reshkekirill@gmail.com> wrote:
>>>
>>> On Wed, 26 Nov 2025 at 11:55, Sugamoto Shinya <shinya34892@gmail.com> wrote:
>>> >
>>> >
>>> >
>>> > 2025年11月25日(火) 6:50 Nathan Bossart <nathandbossart@gmail.com>:
>>> >>
>>> >> On Mon, Nov 24, 2025 at 11:56:34AM -0800, Masahiko Sawada wrote:
>>> >> > On Sat, Nov 22, 2025 at 8:33 PM Sugamoto Shinya <shinya34892@gmail.com> wrote:
>>> >> >> This follows the pattern already used elsewhere in PostgreSQL for providing
>>> >> >> helpful error hints to users.
>>> >> >
>>> >> > Given we have 15 COPY options now, it sounds like a reasonable idea.
>>> >> >
>>> >> > One concern about the patch is that when adding a new COPY option, we
>>> >> > could miss updating valid_copy_options list, resulting in providing a
>>> >> > wrong suggestion. I think we can consider refactoring the COPY option
>>> >> > handling so that we check the given option is a valid name or not by
>>> >> > checking valid_copy_options array and then process the option value.
>>> >>
>>> >> +1.  Ideally, folks wouldn't need to update a separate list when adding new
>>> >> options.
>>> >>
>>> >> >> Additionally, this patch corrects a misleading comment for the
>>> >> >> convert_selectively option. The comment stated it was "not-accessible-from-SQL",
>>> >> >> but actualy it has been accessible from SQL due to PostgreSQL's generic option parser.
>>> >> >> The updated comment clarifies that while technically accessible, it's intended for
>>> >> >> internal use and not recommended for end-user use due to potential data loss.
>>> >> >
>>> >> > Hmm, I'm not sure the proposed comment improves the clarification.
>>> >> > It's essentially non-accessible from SQL since we cannot provide a
>>> >> > valid value for convert_selectively from SQL commands.
>>> >>
>>> >> Yeah, I'd leave it alone, at least for this patch.
>>> >>
>>> >> --
>>> >> nathan
>>> >>
>>> >>
>>> >
>>> >
>>> > Thanks for checking my proposal.
>>> >
>>> >
>>> > For the refactoring of the COPY options, it sounds reasonable to me. Let me take that changes in my patch.
>>>
>>>
>>> Also one little thing:
>>>
>>>
>>> >+{
>>> >+ {"default", copy_opt_default, true},
>>> >+ {"delimiter", copy_opt_delimiter, true},
>>> >+ {"encoding", copy_opt_encoding, true},
>>> >+ {"escape", copy_opt_escape, true},
>>> >+ {"force_not_null", copy_opt_force_not_null, true},
>>> >+ {"force_null", copy_opt_force_null, true},
>>> >+ {"force_quote", copy_opt_force_quote, true},
>>> >+ {"format", copy_opt_format, true},
>>> >+ {"freeze", copy_opt_freeze, true},
>>> >+ {"header", copy_opt_header, true},
>>> >+ {"log_verbosity", copy_opt_log_verbosity, true},
>>> >+ {"null", copy_opt_null, true},
>>> >+ {"on_error", copy_opt_on_error, true},
>>> >+ {"quote", copy_opt_quote, true},
>>> >+ {"reject_limit", copy_opt_reject_limit, true},
>>> >+ {"convert_selectively", copy_opt_convert_selectively, false},
>>> >+ {NULL, NULL, false}
>>> >+};
>>>
>>> Maybe we need one more struct member here, to indicate which options
>>> are valid to be specified by user?
>>>
>>> Also, pattern
>>>
>>> static const struct {..} array_name[] = ... is not used in PostgreSQL
>>> sources. At least, I do not see any use of such .
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Kirill Reshke
>>
>>
>>
>> Thanks for checking my proposal.
>>
>>
>> > Maybe we need one more struct member here, to indicate which options
>> > are valid to be specified by user?
>>
>> If you don't mind, I would like to make a separate patch for fixing the "convert_selectively"
>> option and focus on refactoring error handling here because I tend to feel we should
>> separate refactoring changes and non-backward compatible changes into different commits.
>> After this patch gets merged, I'll make another thread to discuss whether we should block
>> unexpected "convert_selectively" use or not.
>>
>>
>> > static const struct {..} array_name[] = ... is not used in PostgreSQL
>> > sources. At least, I do not see any use of such .
>>
>> I saw several places that use that sort of style, for example src/backend/utils/adt/encode.c:836
>> and src/backend/access/heap/heapam.c:122, but you seems to be more or less correct since
>> usually we define types explicitly like src/backend/foreign/foreign.c:576 and src/backend/backup/basebackup.c:191.
>> I updated my patch by defining a new type `CopyCoptionDef`.
>>
>>
>> Also, I added improvements to helper functions like defGet**. I just removed and unified those
>> into corresponding proceeOption** functions.
>>
>> Regards,
>
>
>
> Hi,
>
>
> Just a friendly ping on this thread.
>
>
> In the latest version of the patch, I refactored COPY option handling so that:
>
> All the COPY options and their validation functions are defined in a single table (CopyOptionDef), and
>
> Error hints for invalid option names/values are generated based on that table.
>
>
> The goal was to make it harder to forget updating the error-hinting logic when adding new options, and to keep
validationlogic in one place. 
>
>
> But on the other hand, I can also simplify this if you feel the current approach is too heavy. For example, one
alternativewould be to keep the existing per-option handling and just add a minimal option check like
validate_copy_option()near the top of the main options loop in order to keep our implementations simple and small, even
ifthat does not completely eliminate the chance of someone missing an update. 
>
> This is an alternative approche what I mentioned here.
>
> ```
> list valid_options = ["format", "force_null", ...]
>
> foreach(opt, options)
> {
>    validate_copy_option(opt, valid_options)   <--- THIS
>
>    if (opt.name == "format") ...
>    if (opt.name == "force_null") ...
>    ...
> }
> ```

The proposed patch requires us to create one function per option. I'm
concerned that it could cause bloating functions and be overkill just
to save changing a separate list. I would suggest starting with
putting a validation check for options at the top of foreach() loop.
When adding a new COPY option in the future, it wouldn't work if we
miss either changing the valid-options list or handling the option,
which seems a good prevention.

>
> Regarding convert_selectively, I have kept behavior and comments unchanged in this patch. As I said, I plan to
proposea separate patch to address the possibility of users specifying convert_selectively from SQL (e.g., by rejecting
itin the parser), once we agree on the direction for this refactoring. 

+1 for a separate discussion and patch. Thank you for pointing out
that it actually can be accessible via SQL command.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



В списке pgsql-hackers по дате отправления: