Re: Add header support to text format and matching feature

Поиск
Список
Период
Сортировка
От vignesh C
Тема Re: Add header support to text format and matching feature
Дата
Msg-id CALDaNm07bcwPY2W68Qghvw2QAV9X8NkV6pihGicko5om=txF+Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Add header support to text format and matching feature  ("Daniel Verite" <daniel@manitou-mail.org>)
Ответы Re: Add header support to text format and matching feature  (Rémi Lapeyre <remi.lapeyre@lenstra.fr>)
Список pgsql-hackers
Thanks for your comments, Please find my thoughts inline.

> In my tests it works fine except for one crash that I can reproduce
> on a fresh build and default configuration with:
>
> $ cat >file.txt
> i
> 1
>
> $ psql
> postgres=# create table x(i int);
> CREATE TABLE
> postgres=# \copy x(i) from file.txt with (header match)
> COPY 1
> postgres=# \copy x(i) from file.txt with (header match)
> COPY 1
> postgres=# \copy x(i) from file.txt with (header match)
> COPY 1
> postgres=# \copy x(i) from file.txt with (header match)
> COPY 1
> postgres=# \copy x(i) from file.txt with (header match)
> COPY 1
> postgres=# \copy x(i) from file.txt with (header match)
> PANIC:  ERRORDATA_STACK_SIZE exceeded
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>

Fixed, replaced PG_TRY/PG_CATCH with strcmp logic to get the header option.

>
> Code comments:
>
>
> +/*
> + * Represents whether the header must be absent, present or present and
> match.
> + */
> +typedef enum CopyHeader
> +{
> +       COPY_HEADER_ABSENT,
> +       COPY_HEADER_PRESENT,
> +       COPY_HEADER_MATCH
> +} CopyHeader;
> +
>  /*
>   * This struct contains all the state variables used throughout a COPY
>   * operation. For simplicity, we use the same struct for all variants of
> COPY,
> @@ -136,7 +146,7 @@ typedef struct CopyStateData
>         bool            binary;                 /* binary format? */
>         bool            freeze;                 /* freeze rows on loading? */
>         bool            csv_mode;               /* Comma Separated Value
> format? */
> -       bool            header_line;    /* CSV or text header line? */
> +       CopyHeader  header_line;        /* CSV or text header line? */
>
>
> After the redefinition into this enum type, there are still a
> bunch of references to header_line that treat it like a boolean:
>
> 1190:                   if (cstate->header_line)
> 1398:   if (cstate->binary && cstate->header_line)
> 2119:           if (cstate->header_line)
> 3635:   if (cstate->cur_lineno == 0 && cstate->header_line)
>
> It works fine since COPY_HEADER_ABSENT is 0 as the first value of the enum,
> but maybe it's not good style to count on that.

Fixed. Changed it to cstate->header_line != COPY_HEADER_ABSENT.

>
>
>
> +                       PG_TRY();
> +                       {
> +                               if (defGetBoolean(defel))
> +                                       cstate->header_line =
> COPY_HEADER_PRESENT;
> +                               else
> +                                       cstate->header_line =
> COPY_HEADER_ABSENT;
> +                       }
> +                       PG_CATCH();
> +                       {
> +                               char    *sval = defGetString(defel);
> +
> +                               if (!cstate->is_copy_from)
> +                                       PG_RE_THROW();
> +
> +                               if (pg_strcasecmp(sval, "match") == 0)
> +                                       cstate->header_line =
> COPY_HEADER_MATCH;
> +                               else
> +                                       ereport(ERROR,
> +
> (errcode(ERRCODE_SYNTAX_ERROR),
> +                                                errmsg("header requires a
> boolean or \"match\"")));
> +                       }
> +                       PG_END_TRY();
>
> It seems wrong to use a PG_CATCH block for this. I understand that
> it's because defGetBoolean() calls ereport() on non-booleans, but then
> it should be split into an error-throwing function and a
> non-error-throwing lexical analysis of the boolean, the above code
> calling the latter.
> Besides the comments in elog.h above PG_TRY say that
>  "the error recovery code
>   can either do PG_RE_THROW to propagate the error outwards, or do a
>   (sub)transaction abort. Failure to do so may leave the system in an
>   inconsistent state for further processing."
> Maybe this is what happens with the repeated uses of "match"
> eventually failing with ERRORDATA_STACK_SIZE exceeded.
>

Fixed, replaced PG_TRY/PG_CATCH with strcmp logic to get the header option.

>
> -    HEADER [ <replaceable class="parameter">boolean</replaceable> ]
> +    HEADER { <literal>match</literal> | <literal>true</literal> |
> <literal>false</literal> }
>
> This should be enclosed in square brackets because HEADER
> with no argument is still accepted.
>

Fixed.

>
>
>
> +      names from the table. On input, the first line is discarded when set
> +      to <literal>true</literal> or required to match the column names if
> set
>
> The elision of "header" as the subject might be misinterpreted as if
> it's the first line that is true.  I'd suggest
> "when <literal>header>/literal> is set to ..."  to avoid any confusion.
>

Fixed.

Attached v5 patch with the fixes of above comments.
Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: track_planning causing performance regression
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions