Re: Speed up Clog Access by increasing CLOG buffers

From: Andres Freund
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date:
Msg-id: 20160324001055.GD4686@awork2.anarazel.de
In reply to: Re: Speed up Clog Access by increasing CLOG buffers  (Andres Freund <andres@anarazel.de>)
Responses: Re: Speed up Clog Access by increasing CLOG buffers  (Andres Freund <andres@anarazel.de>)
           Re: Speed up Clog Access by increasing CLOG buffers  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On 2016-03-23 21:43:41 +0100, Andres Freund wrote:
> I'm playing around with SELECT txid_current(); right now - that should
> be about the most specific load for setting clog bits.

Or so I thought.
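For reference, that kind of load can be driven with a trivial custom pgbench script, roughly like this (file location and the pgbench options are just examples):

```shell
# Write a one-statement script that does nothing but allocate an xid
# and set its clog status on commit.
cat > /tmp/txid.sql <<'EOF'
SELECT txid_current();
EOF

# Then run it against the cluster under test, e.g.:
#   pgbench -n -f /tmp/txid.sql -c 32 -j 32 -T 60 postgres
```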

In my testing that showed just about zero performance difference between
the patch and master. More surprisingly, profiling showed very little
contention on the control lock. Even hacking
TransactionIdSetPageStatus() to return without doing anything showed
only minor performance benefits.

[there's also the fact that txid_current() indirectly acquires two
lwlocks, which showed up more prominently than the control lock, but
that I could easily hack around by adding an xid_current().]

Similar with an INSERT only workload. And a small scale pgbench.


Looking through the thread showed that the positive results you'd posted
were all with relatively big scale factors. Which made me think. Running
a bigger pgbench showed that most of the interesting (i.e. long) lock
waits went through both TransactionIdSetPageStatus() *and*
TransactionIdGetStatus().


So I think what happens is that once you have a big enough table, the
UPDATEs standard pgbench issues start to hit *old* xids (in unhinted
rows) fairly often. Thus old clog pages have to be read in, potentially
displacing SLRU content that's needed again very shortly after.
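For some intuition on why the working set outgrows the buffers: with the default BLCKSZ of 8192, each transaction takes 2 status bits in the clog, so a single clog page covers a fixed range of xids, and only a handful of such pages are cached in the SLRU at a time. A quick back-of-the-envelope:

```shell
# 2 status bits per transaction -> 4 transaction statuses per byte;
# an 8192-byte clog page therefore covers:
echo "$(( 8192 * 4 )) xids per clog page"   # -> 32768 xids per clog page
```

So an update stream touching rows whose xids are spread over many millions of transactions keeps evicting and re-reading clog pages, regardless of how cheap the individual status lookups are.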


Have you, in your evaluation of the performance of this patch, done
profiles over time? I.e. are the performance benefits there immediately,
or only after a significant amount of test time? Comparing TPS over
time, for both patched and unpatched, looks relevant.
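One low-effort way to get that comparison, assuming a reasonably recent pgbench, is its periodic progress reporting; client counts and duration below are just placeholders:

```shell
# -P makes pgbench print a TPS line every N seconds, so the patched and
# unpatched runs can be compared over time instead of as one aggregate
# number at the end.  Echoed here rather than run, since it needs a
# live cluster with an initialized pgbench database.
PGBENCH_CMD="pgbench -n -P 5 -c 64 -j 64 -T 600 bench"
echo "$PGBENCH_CMD"
```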


Even after changing to scale 500, the performance benefits on this
older 2-socket machine were minor, even though contention on the
ClogControlLock was the second most severe (after ProcArrayLock).

Afaics that squares with Jesper's result, which basically also didn't
show a difference either way?


I'm afraid that this patch might be putting a bandaid on some of the
absolute worst cases, without actually addressing the core
problem. Simon's patch in [1] seems to come closer to addressing that
(though I don't believe it's safe without doing every status
manipulation atomically, as individual status bits are smaller than 4
bytes).  Now it's possible to argue that the bandaid might slow the
bleeding to a survivable level, but I have to admit I'm doubtful.

Here's the stats for a -s 500 run btw:

Performance counter stats for 'system wide':

        18,747  probe_postgres:TransactionIdSetTreeStatus
        68,884  probe_postgres:TransactionIdGetStatus
         9,718  probe_postgres:PGSemaphoreLock
 
(the PGSemaphoreLock is over 50% ProcArrayLock, followed by ~15%
SimpleLruReadPage_ReadOnly)
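FWIW, counts like the above can be gathered with userspace probes on the postgres binary; a rough sketch (the binary path and the 60s window are assumptions, and the commands are echoed rather than executed since they need root):

```shell
BINARY=/usr/local/pgsql/bin/postgres    # adjust to the installation measured

# Register one probe per function of interest, then count hits
# system-wide for a while.
for fn in TransactionIdSetTreeStatus TransactionIdGetStatus PGSemaphoreLock; do
    echo perf probe -x "$BINARY" "$fn"
done
echo perf stat -a \
    -e probe_postgres:TransactionIdSetTreeStatus \
    -e probe_postgres:TransactionIdGetStatus \
    -e probe_postgres:PGSemaphoreLock \
    -- sleep 60
```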


My suspicion is that a better approach for now would be to take Simon's
patch, but add a (per-page?) 'ClogModificationLock', to avoid the need
for doing something fancier in TransactionIdSetStatusBit().

Andres

[1]:
http://archives.postgresql.org/message-id/CANP8%2Bj%2BimQfHxkChFyfnXDyi6k-arAzRV%2BZG-V_OFxEtJjOL2Q%40mail.gmail.com


