Re: some longer, larger pgbench tests with various performance-related patches

From: Jeff Janes
Subject: Re: some longer, larger pgbench tests with various performance-related patches
Msg-id: CAMkU=1zApd5pP3RX3moxvBm8PAEOmGjmrHLstcfABXjECNm1yw@mail.gmail.com
In reply to: Re: some longer, larger pgbench tests with various performance-related patches  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Mon, Feb 6, 2012 at 6:38 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Feb 4, 2012 at 2:13 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> We really need to nail that down.  Could you post the scripts (on the
>> wiki) you use for running the benchmark and making the graph?  I'd
>> like to see how much work it would be for me to change it to detect
>> checkpoints and do something like color the markers blue during
>> checkpoints and red elsewhen.
>
> They're pretty crude - I've attached them here.

Thanks.  But given Greg's comments about pgbench-tools, I'll probably
work on learning that instead.
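
For the record, the marker coloring I had in mind is roughly this -- a
rough sketch (untested), assuming per-second TPS samples in tps.csv and
checkpoint start/end windows in checkpoints.csv pulled from the server
log with log_checkpoints = on; both file names and column layouts are
made up for illustration:

# Sketch: color per-second TPS markers by whether a checkpoint was in
# progress.  tps.csv has (epoch_seconds, tps) rows; checkpoints.csv
# has (start_epoch, end_epoch) rows.  Both files are hypothetical.
import csv
import matplotlib.pyplot as plt

with open('tps.csv') as f:
    samples = [(float(t), float(tps)) for t, tps in csv.reader(f)]
with open('checkpoints.csv') as f:
    ckpts = [(float(s), float(e)) for s, e in csv.reader(f)]

def in_checkpoint(t):
    return any(s <= t <= e for s, e in ckpts)

xs = [t for t, _ in samples]
ys = [tps for _, tps in samples]
colors = ['blue' if in_checkpoint(t) else 'red' for t in xs]

plt.scatter(xs, ys, c=colors, s=4)
plt.xlabel('time (seconds)')
plt.ylabel('TPS')
plt.savefig('tps_checkpoints.png')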

>
>> Also, I'm not sure how bad that graph really is.  The overall
>> throughput is more variable, and there are a few latency spikes but
>> they are few.  The dominant feature is simply that the long-term
>> average is less than the initial burst.  Of course the goal is to have
>> a high level of throughput with a smooth latency under sustained
>> conditions.  But to expect that that long-sustained smooth level of
>> throughput be identical to the "initial burst throughput" sounds like
>> more of a fantasy than a goal.
>
> That's probably true, but the drop-off is currently quite extreme.
> The fact that disabling full_page_writes causes throughput to increase
> by >4x is dismaying, at least to me.

I just meant that the latest graph, with full page writes off, is not that bad.

The ones with full_page_writes on are definitely bad, more because of
the latency spikes than the throughput.


>
>> If we want to accept the lowered
>> throughput and work on whatever variability/spikes remain, I think
>> a good approach would be to take the long term TPS average, and dial
>> the number of clients back until the initial burst TPS matches that
>> long term average.  Then see if the spikes still exist over the long
>> term using that dialed back number of clients.
>
> Hmm, I might be able to do that.
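
For what it's worth, the dial-back could be automated with something
like this -- a sketch only (untested); it assumes pgbench is on the
PATH, a database named "bench" is already initialized, and it scrapes
the "tps = " line from pgbench's output:

# Sketch: step the client count down until a short "initial burst"
# run roughly matches the long-term average from one long run.  The
# durations, step size, and 5% tolerance are all arbitrary.
import re
import subprocess

def run_pgbench(clients, seconds):
    out = subprocess.run(
        ['pgbench', '-c', str(clients), '-j', str(clients),
         '-T', str(seconds), 'bench'],
        capture_output=True, text=True, check=True).stdout
    return float(re.search(r'tps = ([\d.]+)', out).group(1))

long_term = run_pgbench(32, 3600)     # one long run at the full load
clients = 32
while clients > 1:
    burst = run_pgbench(clients, 60)  # short run ~ "initial burst"
    if burst <= long_term * 1.05:
        break
    clients -= 4

print('try %d clients for the long-term spike test' % clients)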
>
>> I don't think full-page writes are leading to WALInsert
>> contention, for example, because that would probably lead to a
>> smooth throughput decline, but not latency spikes in which entire
>> seconds go by without transactions.
>
> Right.
>
>> I doubt it is leading to general
>> IO congestion, as the IO at that point should be pretty much
>> sequential (the checkpoint has not yet reached the sync stage, and
>> the WAL is sequential).  So I bet it is caused by fsyncs occurring
>> at xlog segment switches, and the locking that entails.
>
> That's definitely possible.
>
>> If I recall correctly, we can have one segment which is completely
>> written to the OS and in the process of being fsynced, and another
>> segment which is partly in wal_buffers and partly written out to
>> the OS cache; but we can't start reusing the wal_buffers already
>> written to the OS for that second segment (and therefore
>> theoretically available for reuse by the upcoming 3rd segment)
>> until the previous segment's fsync has completed.  So all WAL
>> inserters freeze.  Or something like that.  This code has changed a
>> bit since I last studied it.
>
> Yeah, I need to better-characterize where the pauses are coming from,
> but I'm reluctant to invest too much effort in it until Heikki's xlog
> scaling patch goes in, because I think that's going to change things
> enough that any work done now will mostly be wasted.
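
Just to illustrate the freeze-up I was hypothesizing above, here is a
toy model -- pure illustration, nothing like the real xlog.c, and all
the numbers are arbitrary:

# Toy model: the free list is the wal_buffers ring; pages written for
# a finished segment are not recycled while that segment's (slow)
# fsync runs, so the inserter drains the free list and then blocks.
import queue
import threading
import time

PAGES = 64             # "wal_buffers" pages in the ring
PAGES_PER_SEG = 256    # pages per "segment"
FSYNC_SECS = 1.0       # simulated fsync at each segment switch

free = queue.Queue()
for _ in range(PAGES):
    free.put(None)
dirty = queue.Queue()

def walwriter():
    written = 0
    while True:
        dirty.get()                 # "write" one page to the OS
        written += 1
        if written % PAGES_PER_SEG == 0:
            time.sleep(FSYNC_SECS)  # fsync of the finished segment;
                                    # nothing gets recycled meanwhile
        free.put(None)              # page reusable only after this

threading.Thread(target=walwriter, daemon=True).start()

inserted = 0
start = time.time()
next_tick = start + 1
while time.time() - start < 10:
    free.get()                      # blocks while the "fsync" runs
    dirty.put(None)
    inserted += 1
    if time.time() >= next_tick:
        print('t=%ds inserts so far: %d'
              % (time.time() - start, inserted))
        next_tick += 1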

OK, you've scared me off from digging into the locking at WAL switch
for now.  So instead I've spent some time today trying to unbreak the
xlog scaling patch, and haven't had any luck.  Does anyone know
whether any combination of that patch's history + git master history
has been tested and verified to produce a recoverable WAL stream?  It
is a shame that "make check" doesn't test for that, but I don't know
how to make it do so.
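
For the record, the sort of recoverability smoke test I have in mind
is roughly this -- a sketch only: the data directory path is made up,
and it assumes initdb/pg_ctl/createdb/pgbench/psql from the patched
build are on the PATH:

# Sketch: run pgbench against a fresh cluster, kill the server hard
# mid-run, restart it (forcing WAL replay), and see whether it still
# answers queries.  Scale, durations, and paths are arbitrary.
import subprocess
import time

DATADIR = '/tmp/walcheck'

def run(*cmd):
    subprocess.run(cmd, check=True)

run('initdb', '-D', DATADIR)
run('pg_ctl', 'start', '-D', DATADIR, '-w')
run('createdb', 'bench')
run('pgbench', '-i', '-s', '10', 'bench')

load = subprocess.Popen(['pgbench', '-c', '8', '-T', '60', 'bench'])
time.sleep(30)
run('pg_ctl', 'stop', '-D', DATADIR, '-m', 'immediate')  # "crash"
load.wait()

# restarting forces WAL replay; a broken WAL stream should fail here
run('pg_ctl', 'start', '-D', DATADIR, '-w')
run('psql', '-d', 'bench', '-c',
    'SELECT count(*), sum(abalance) FROM pgbench_accounts;')
run('pg_ctl', 'stop', '-D', DATADIR)

That only proves the server comes back up and answers queries after a
hard kill mid-run, not that the replayed state is fully consistent,
but it would at least catch a WAL stream that won't replay at all.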

Cheers,

Jeff

