Re: Parallel copy

Поиск
Список
Период
Сортировка
От Bharath Rupireddy
Тема Re: Parallel copy
Дата
Msg-id CALj2ACUj1303Pvhhnp_J6+BG0d9Fp1bRDCKL=pQThJ4+8HAGNw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel copy  (vignesh C <vignesh21@gmail.com>)
Ответы Re: Parallel copy  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
Hi,

It looks like the parsing of newly introduced "PARALLEL" option for
COPY FROM command has an issue(in the
0002-Framework-for-leader-worker-in-parallel-copy.patch),
Mentioning ....PARALLEL '4ar2eteid'); would pass with 4 workers since
atoi() is being used for converting string to integer which just
returns 4, ignoring other strings.

I used strtol(), added error checks and introduced the error "
improper use of argument to option "parallel"" for the above cases.

        parallel '4ar2eteid');
ERROR:  improper use of argument to option "parallel"
LINE 5:         parallel '1\');

Along with the updated patch
0002-Framework-for-leader-worker-in-parallel-copy.patch, also
attaching all the latest patches from [1].

[1] - https://www.postgresql.org/message-id/CALj2ACW94icER3WrWapon7JkcX8j0TGRue5ycWMTEvgA3X7fOg%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

On Tue, Jun 23, 2020 at 12:22 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Jun 23, 2020 at 8:07 AM vignesh C <vignesh21@gmail.com> wrote:
> > I have attached the patch for the same with the fixes.
>
> The patches were not applying on the head, attached the patches that can be applied on head.
> I have added a commitfest entry[1] for this feature.
>
> [1] - https://commitfest.postgresql.org/28/2610/
>
>
> On Tue, Jun 23, 2020 at 8:07 AM vignesh C <vignesh21@gmail.com> wrote:
>>
>> Thanks Ashutosh For your review, my comments are inline.
>> On Fri, Jun 19, 2020 at 5:41 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I just got some time to review the first patch in the list i.e.
0001-Copy-code-readjustment-to-support-parallel-copy.patch.As the patch name suggests, it is just trying to reshuffle
theexisting code for COPY command here and there. There is no extra changes added in the patch as such, but still I do
havesome review comments, please have a look: 
>> >
>> > 1) Can you please add some comments atop the new function PopulateAttributes() describing its functionality in
detail.Further, this new function contains the code from BeginCopy() to set attribute level options used with COPY FROM
suchas FORCE_QUOTE, FORCE_NOT_NULL, FORCE_NULL etc. in cstate and along with that it also copies the code from
BeginCopy()to set other infos such as client encoding type, encoding conversion etc. Hence, I think it would be good to
giveit some better name, basically something that matches with what actually it is doing. 
>> >
>>
>> There is no new code added in this function, some part of code from
>> BeginCopy was made in to a new function as this part of code will also
>> be required for the parallel copy workers before the workers start the
>> actual copy operation. This code was made into a function to avoid
>> duplication. Changed the function name to PopulateGlobalsForCopyFrom &
>> added few comments.
>>
>> > 2) Again, the name for the new function CheckCopyFromValidity() doesn't look good to me. From the function name it
appearsas if it does the sanity check of the entire COPY FROM command, but actually it is just doing the sanity check
forthe target relation specified with COPY FROM. So, probably something like CheckTargetRelValidity would look more
sensible,I think? TBH, I am not good at naming the functions so you can always ignore my suggestions about function and
variablenames :) 
>> >
>>
>> Changed as suggested.
>> > 3) Any reason for not making CheckCopyFromValidity as a macro instead of a new function. It is just doing the
sanitycheck for the target relation. 
>> >
>>
>> I felt there is reasonable number of lines in the function & it is not
>> in performance intensive path, so I preferred function over macro.
>> Your thoughts?
>>
>> > 4) Earlier in CopyReadLine() function while trying to clear the EOL marker from cstate->line_buf.data (copied
data),we were not checking if the line read by CopyReadLineText() function is a header line or not, but I can see that
yourpatch checks that before clearing the EOL marker. Any reason for this extra check? 
>> >
>>
>> If you see the caller of CopyReadLine, i.e. NextCopyFromRawFields does
>> nothing for the header line, server basically calls CopyReadLine
>> again, it is a kind of small optimization. Anyway server is not going
>> to do anything with header line, I felt no need to clear EOL marker
>> for header lines.
>> /* on input just throw the header line away */
>> if (cstate->cur_lineno == 0 && cstate->header_line)
>> {
>> cstate->cur_lineno++;
>> if (CopyReadLine(cstate))
>> return false; /* done */
>> }
>>
>> cstate->cur_lineno++;
>>
>> /* Actually read the line into memory here */
>> done = CopyReadLine(cstate);
>> I think no need to make a fix for this. Your thoughts?
>>
>> > 5) I noticed the below spurious line removal in the patch.
>> >
>> > @@ -3839,7 +3953,6 @@ static bool
>> >  CopyReadLine(CopyState cstate)
>> >  {
>> >     bool        result;
>> > -
>> >
>>
>> Fixed.
>> I have attached the patch for the same with the fixes.
>> Thoughts?
>>
>> Regards,
>> Vignesh
>> EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Magnus Hagander
Дата:
Сообщение: Re: should libpq also require TLSv1.2 by default?
Следующее
От: "movead.li@highgo.ca"
Дата:
Сообщение: Re: pg_resetwal --next-transaction-id may cause database failed to restart.