Re: Speed up Clog Access by increasing CLOG buffers

From: Andres Freund
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date:
Msg-id: 20160322125957.GH3790@awork2.anarazel.de
In reply to: Re: Speed up Clog Access by increasing CLOG buffers  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Speed up Clog Access by increasing CLOG buffers  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On 2016-03-22 18:19:48 +0530, Amit Kapila wrote:
> > I'm actually rather unconvinced that it's all that common that all
> > subtransactions are on one page. If you have concurrency - otherwise
> > there'd be not much point in this patch - they'll usually be heavily
> > interleaved, no?  You can argue that you don't care about subxacts,
> > because they're more often used in less concurrent scenarios, but if
> > that's the argument, it should actually be made.
> >
> 
> Note that we are doing it only when a transaction has less than or equal
> to 64 subtransactions.

So?

> > This code is a bit arcane. I think it should be restructured to
> > a) Directly go for LWLockAcquire if !all_xact_same_page || nsubxids >
> >    PGPROC_MAX_CACHED_SUBXIDS || IsGXactActive(). Going for a conditional
> >    lock acquire first can be rather expensive.
> 
> The previous version (v5 - [1]) had the code that way, but that adds a few
> extra instructions for the single-client case, and I was seeing a minor
> performance regression there, which is why it was changed to the current
> code.

I don't believe that changing conditions here is likely to cause a
measurable regression.
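
To make that concrete, here's a minimal sketch of the restructuring I'm
suggesting (assuming the names from the patch, e.g.
TransactionGroupUpdateXidStatus and TransactionIdSetPageStatusInternal, and
the exact argument lists; treat it as an illustration, not the final shape):

    if (!all_xact_same_page ||
        nsubxids > PGPROC_MAX_CACHED_SUBXIDS ||
        IsGXactActive())
    {
        /* group update not applicable: take the lock directly */
        LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
        TransactionIdSetPageStatusInternal(xid, nsubxids, subxids,
                                           status, lsn, pageno);
        LWLockRelease(CLogControlLock);
    }
    else
    {
        /* otherwise try to piggyback on (or become) a group leader */
        TransactionGroupUpdateXidStatus(xid, status, lsn, pageno);
    }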


> > So, we enqueue ourselves as the *head* of the wait list if there are
> > other waiters. Seems like it could lead to the first element after the
> > leader being delayed longer than the others.
> >
> 
> It will not matter because we wake the queued processes only once we are
> done with the xid status update.

If there are only N cores, process N+1 won't be run immediately. But yeah,
the effect is probably not large.
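
For context, the wakeup side (patterned after the existing
ProcArrayGroupClearXid() code; field names like clogGroupNext and
clogGroupMember are from the patch, so take this as a sketch) walks the
detached list from head to tail, which is why the earliest-enqueued waiter,
sitting at the tail, is woken last:

    while (wakeidx != INVALID_PGPROCNO)
    {
        PGPROC   *proc = &ProcGlobal->allProcs[wakeidx];

        wakeidx = pg_atomic_read_u32(&proc->clogGroupNext);
        pg_atomic_write_u32(&proc->clogGroupNext, INVALID_PGPROCNO);

        /* ensure all previous writes are visible before follower continues */
        pg_write_barrier();

        proc->clogGroupMember = false;

        if (proc != MyProc)
            PGSemaphoreUnlock(&proc->sem);
    }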


> > FWIW, you can move the nextidx = part out of the loop;
> > pg_atomic_compare_exchange will update the nextidx value from memory, no
> > need for another load afterwards.
> >
> 
> Not sure if I understood which statement you are referring to here (are you
> referring to the atomic read operation?) and how we can save the load
> operation.

Yes, to the atomic read. And we can save it in the loop, because
compare_exchange returns the current value if it fails.
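
I.e. (assuming the usual pg_atomic_* semantics and the clogGroupFirst /
clogGroupNext names from the patch) something like this should be enough,
with a single read before the loop:

    nextidx = pg_atomic_read_u32(&procglobal->clogGroupFirst);

    while (true)
    {
        pg_atomic_write_u32(&MyProc->clogGroupNext, nextidx);

        if (pg_atomic_compare_exchange_u32(&procglobal->clogGroupFirst,
                                           &nextidx,
                                           (uint32) MyProc->pgprocno))
            break;

        /* on failure nextidx already contains the value seen by the CAS */
    }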


> > > +      * Now that we've got the lock, clear the list of processes waiting for
> > > +      * group XID status update, saving a pointer to the head of the list.
> > > +      * Trying to pop elements one at a time could lead to an ABA problem.
> > > +      */
> > > +     while (true)
> > > +     {
> > > +             nextidx = pg_atomic_read_u32(&procglobal->clogGroupFirst);
> > > +             if (pg_atomic_compare_exchange_u32(&procglobal->clogGroupFirst,
> > > +                                                &nextidx,
> > > +                                                INVALID_PGPROCNO))
> > > +                     break;
> > > +     }
> >
> > Hm. It seems like you should simply use pg_atomic_exchange_u32(), rather
> > than compare_exchange?
> >
> 
> We need to remember the head of the list to wake up the processes, which is
> why I think the above loop is required.

exchange returns the old value? There's no need for a compare here.
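
I.e. the whole loop quoted above could (assuming the usual
pg_atomic_exchange_u32() semantics) collapse to a single call that detaches
the list and still hands back its head for the later wakeups:

    /* detach the list atomically and remember its head */
    nextidx = pg_atomic_exchange_u32(&procglobal->clogGroupFirst,
                                     INVALID_PGPROCNO);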


> > I think it's worthwhile to create a benchmark that does something like
> > BEGIN;SELECT ... FOR UPDATE; SELECT pg_sleep(random_time);
> > INSERT;COMMIT; you'd find that if random is a bit larger (say 20-200ms,
> > completely realistic values for network RTT + application computation),
> > the success rate of group updates shrinks noticeably.
> >
> 
> I think it will happen that way, but what do we want to see with that
> benchmark? I think the results will be that for such a workload there is
> either no benefit, or much less benefit than for short transactions.

Because we want our performance improvements to matter in reality, not
just in unrealistic benchmarks where the benchmarking tool is running on
the same machine as the database and uses unix sockets. That's not actually
an all that realistic workload.


Andres


