Re: Parallel copy

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Parallel copy
Дата
Msg-id CAA4eK1+==1q3mk5Lpsyzw5SGDPfv+RLuZ7w47Oo0St6jpbcs+Q@mail.gmail.com
обсуждение исходный текст
Ответ на RE: Parallel copy  ("Hou, Zhijie" <houzj.fnst@cn.fujitsu.com>)
Ответы Re: Parallel copy  (vignesh C <vignesh21@gmail.com>)
Список pgsql-hackers
On Sun, Oct 18, 2020 at 7:47 AM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
>
> Hi Vignesh,
>
> After having a look over the patch,
> I have some suggestions for
> 0003-Allow-copy-from-command-to-process-data-from-file.patch.
>
> 1.
>
> +static uint32
> +EstimateCstateSize(ParallelContext *pcxt, CopyState cstate, List *attnamelist,
> +                                  char **whereClauseStr, char **rangeTableStr,
> +                                  char **attnameListStr, char **notnullListStr,
> +                                  char **nullListStr, char **convertListStr)
> +{
> +       uint32          strsize = MAXALIGN(sizeof(SerializedParallelCopyState));
> +
> +       strsize += EstimateStringSize(cstate->null_print);
> +       strsize += EstimateStringSize(cstate->delim);
> +       strsize += EstimateStringSize(cstate->quote);
> +       strsize += EstimateStringSize(cstate->escape);
>
>
> It use function EstimateStringSize to get the strlen of null_print, delim, quote and escape.
> But the length of null_print seems has been stored in null_print_len.
> And delim/quote/escape must be 1 byte, so I think call strlen again seems unnecessary.
>
> How about  " strsize += sizeof(uint32) + cstate->null_print_len + 1"
>

+1. This seems like a good suggestion but add comments for
delim/quote/escape to indicate that we are considering one-byte for
each. I think this will obviate the need of function
EstimateStringSize. Another thing in this regard is that we normally
use add_size function to compute the size but I don't see that being
used in this and nearby computation. That helps us to detect overflow
of addition if any.

EstimateCstateSize()
{
..
+
+ strsize++;
..
}

Why do we need this additional one-byte increment? Does it make sense
to add a small comment for the same?

> 2.
> +       strsize += EstimateNodeSize(cstate->whereClause, whereClauseStr);
>
> +       copiedsize += CopyStringToSharedMemory(cstate, whereClauseStr,
> +                                                                                  shmptr + copiedsize);
>
> Some string length is counted for two times.
> The ' whereClauseStr ' has call strlen in EstimateNodeSize once and call strlen in CopyStringToSharedMemory again.
> I don't know wheather it's worth to refacor the code to avoid duplicate strlen . what do you think ?
>

It doesn't seem worth to me. We probably need to use additional
variables to save those lengths. I think it will add more
code/complexity than we will save. See EstimateParamListSpace and
SerializeParamList where we get the typeLen each time, that way code
looks neat to me and we are don't going to save much by not following
a similar thing here.

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "tsunakawa.takay@fujitsu.com"
Дата:
Сообщение: RE: Transactions involving multiple postgres foreign servers, take 2
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Make procedure OUT parameters work with JDBC