Re: [HACKERS] asynchronous execution

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: [HACKERS] asynchronous execution
Дата
Msg-id 20170718.162452.221576658.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: [HACKERS] asynchronous execution  (Antonin Houska <ah@cybertec.at>)
Ответы Re: [HACKERS] asynchronous execution  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Список pgsql-hackers
Hello,

At Tue, 11 Jul 2017 10:28:51 +0200, Antonin Houska <ah@cybertec.at> wrote in <6448.1499761731@localhost>
> Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > Effectively it is a waiting-queue followed by a
> > completed-list. The point of the compaction is keeping the order
> > of waiting or not-yet-completed requests, which is crucial to
> > avoid kind-a precedence inversion. We cannot keep the order by
> > using bitmapset in such way.
> 
> > The current code waits all waiters at once and processes all
> > fired events at once. The order in the waiting-queue is
> > inessential in the case. On the other hand I suppoese waiting on
> > several-tens to near-hundred remote hosts is in a realistic
> > target range. Keeping the order could be crucial if we process a
> > part of the queue at once in the case.
> > 
> > Putting siginificance on the deviation of response time of
> > remotes, process-all-at-once is effective. In turn we should
> > consider the effectiveness of the lifecycle of the larger wait
> > event set.
> 
> ok, I missed the fact that the order of es_pending_async entries is
> important. I think this is worth adding a comment.

I'll put an upper limit to the number of waiters processed at
once. Then add a comment like that.

> Actually the reason I thought of simplification was that I noticed small
> inefficiency in the way you do the compaction. In particular, I think it's not
> always necessary to swap the tail and head entries. Would something like this
> make sense?

I'm not sure, but I suppose that it is rare that all of the first
many elements in the array are not COMPLETE. In most cases the
first element gets a response first.

> 
>         /* If any node completed, compact the array. */
>         if (any_node_done)
>         {
...
>             for (tidx = 0; tidx < estate->es_num_pending_async; ++tidx)
>             {
...
>                 if (tail->state == ASYNCREQ_COMPLETE)
>                     continue;
> 
>                 /*
>                  * If the array starts with one or more incomplete requests,
>                  * both head and tail point at the same item, so there's no
>                  * point in swapping.
>                  */
>                 if (tidx > hidx)
>                 {

This works to skip the first several elements when all of them
are ASYNCREQ_COMPLETE. I think it makes sense as long as it
doesn't harm the loop. The optimization is more effective by
putting out of the loop like this.

|  for (tidx = 0; tidx < estate->es_num_pending_async &&                 estate->es_pending_async[tidx] ==
ASYNCREQ_COMPLETE;++tidx);
 
|  for (; tidx < estate->es_num_pending_async; ++tidx)
...


> And besides that, I think it'd be more intuitive if the meaning of "head" and
> "tail" was reversed: if the array is iterated from lower to higher positions,
> then I'd consider head to be at higher position, not tail.

Yeah, but maybe the "head" is still confusing even if reversed
because it is still not a head of something.  It might be less
confusing by rewriting it in more verbose-but-straightforwad way.


|      int npending = 0;
| 
|      /* Skip over not-completed items at the beginning */
|      while (npending < estate->es_num_pending_async &&
|             estate->es_pending_async[npending] != ASYNCREQ_COMPLETE)
|        npending++;
| 
|      /* Scan over the rest for not-completed items */
|      for (i = npending + 1 ; i < estate->es_num_pending_async; ++i)
|      {
|        PendingAsyncRequest *tmp;
|        PendingAsyncRequest *curr = estate->es_pending_async[i];
|
|        if (curr->state == ASYNCREQ_COMPLETE)
|          continue;
|
|     /* Move the not-completed item to the tail of the first chunk */
|        tmp = estate->es_pending_async[i];
|        estate->es_pending_async[nepending] = tmp;
|        estate->es_pending_async[i] = tmp;
|        ++npending;
|      }


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Etsuro Fujita
Дата:
Сообщение: [HACKERS] Re: Oddity in error handling of constraint violation inExecConstraints for partitioned tables
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: [HACKERS] PgFDW connection invalidation by ALTER SERVER/ALTERUSER MAPPING