Re: Turning off HOT/Cleanup sometimes

From: Robert Haas
Subject: Re: Turning off HOT/Cleanup sometimes
Msg-id: CA+TgmoYcccT_x4x=1ZtYmzwmauuq1ZKpOMBunQ4c__waRPZ6pA@mail.gmail.com
In response to: Re: Turning off HOT/Cleanup sometimes  (Bruce Momjian <bruce@momjian.us>)
List: pgsql-hackers
On Tue, Apr 21, 2015 at 11:04 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Yes, it might be too much optimization to try to get the checkpoint to
> flush all those pages sequentially, but I was thinking of our current
> behavior where, after an update of all rows, we effectively write out
> the entire table because we have dirtied every page.  I guess with later
> prune-based writes, we aren't really writing all the pages as we have
> the pattern where pages with prunable content are kind of random.  I guess
> I was just wondering what value there is to your write-then-skip idea,
> vs just writing the first X% of pages we find?  Your idea certainly
> spreads out the pruning, and doesn't require knowing the size of the
> table, though I thought that information was easily determined.
>
> One thing to consider is how we handle pruning of index scans that hit
> multiple heap pages.  Do we still write X% of the pages in the table, or
> X% of the heap pages we actually access via SELECT?  With the
> write-then-skip approach, we would do X% of the pages we access, while
> with the first-X% approach, we would probably prune all of them as we
> would not be accessing most of the table.  I don't think we can do the
> first-X% of pages and have the percentage based on the number of
> pages accessed as we have no way to know how many heap pages we will
> access from the index.  (We would know for bitmap scans, but that
> complexity doesn't seem worth it.)  That would argue, for consistency
> with sequential and index-based heap access, that your approach is best.

I actually implemented something like this for setting hint bits a few
years ago:

http://www.postgresql.org/message-id/AANLkTik5QzR8wTs0MqCWwmNp-qHGrdKY5Av5aOB7W4Dp@mail.gmail.com
http://www.postgresql.org/message-id/AANLkTimGKaG7wdu-x77GNV2Gh6_Qo5Ss1u5b6Q1MsPUy@mail.gmail.com

At least in later versions, the patch writes a certain number of
hinted pages, then skips writing a run of pages, then writes another
run of hinted pages.  The basic problem here is that, after the fsync
queue compaction patch went in, the benefits on my tests were pretty
modest.  Yeah, it costs something to write out lots of dirty pages,
but before the fsync queue compaction stuff, the initial scan of an
unhinted table took like 6x the time on the machine I tested on, but
after that, it was like 1.5x the time.  Blunting that spike just
wasn't exciting enough.
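
To make the write-then-skip pattern concrete, here is a minimal toy sketch. It is not the code from the patches linked above. The names WriteThenSkip, may_dirty and scan are hypothetical, and it assumes the run counter advances only for pages that would otherwise be dirtied.

```python
from dataclasses import dataclass


@dataclass
class WriteThenSkip:
    """Hypothetical policy: allow dirtying `write_run` pages, then
    decline to dirty the next `skip_run` pages, and repeat."""
    write_run: int
    skip_run: int
    _pos: int = 0  # position inside the current write+skip cycle

    def may_dirty(self) -> bool:
        allowed = self._pos < self.write_run
        self._pos = (self._pos + 1) % (self.write_run + self.skip_run)
        return allowed


def scan(page_needs_cleanup, policy):
    """Return the page numbers a scan would dirty under `policy`.

    `page_needs_cleanup[i]` is True when page i has hint bits to set
    or prunable content (an assumption of this toy model).
    """
    dirtied = []
    for pageno, needs in enumerate(page_needs_cleanup):
        if not needs:
            continue  # nothing to do; the cycle position is not advanced
        if policy.may_dirty():
            dirtied.append(pageno)
    return dirtied
```

With write_run=2 and skip_run=3, a scan over ten pages that all need cleanup dirties pages 0, 1, 5 and 6. The fraction dirtied tracks the pages the scan actually visits, so the policy needs no knowledge of the table size.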

It strikes me that it would be better to have an integrated strategy
for this problem.  It doesn't make sense to have one strategy for
deciding whether to set hint bits and a separate strategy for deciding
whether to HOT-prune.  And if we decide to set hint bits and
HOT-prune, it might be smart to try to mark the page all-visible, too,
if it is and we're not about to update it.  I believe we're losing a
lot of performance on OLTP workloads by re-dirtying the same pages
over and over again.  We've probably all hit cases where there is an
obvious loss of performance because of this sort of thing, but I'm
starting to think it's hurting us in a lot of less-obvious ways.
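
As a rough illustration of what "integrated" could mean, the toy model below makes one decision about dirtying a page and then does all the cleanup together. It is only a sketch under stated assumptions: Page, cleanup_on_read and every field name are hypothetical, and nothing here mirrors actual PostgreSQL data structures.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Toy stand-in for a heap page (all fields are hypothetical)."""
    unhinted_tuples: int          # tuples whose hint bits are not yet set
    prunable_tuples: int          # dead tuples that HOT pruning could remove
    visible_after_cleanup: bool   # would every remaining tuple be visible to all?
    all_visible_flag: bool = False
    dirty: bool = False


def cleanup_on_read(page, may_dirty, about_to_update=False):
    """Make ONE decision about dirtying the page, then do all cleanup.

    Returns True if the page was cleaned (and is therefore dirty).
    """
    has_work = page.unhinted_tuples > 0 or page.prunable_tuples > 0
    if not has_work:
        return False
    if not (page.dirty or may_dirty):
        return False  # leave the page alone: no hinting and no pruning
    # We are paying for a write anyway, so do everything in one pass.
    page.unhinted_tuples = 0
    page.prunable_tuples = 0
    if page.visible_after_cleanup and not about_to_update:
        page.all_visible_flag = True
    page.dirty = True
    return True
```

The point of the single entry point is that a page is either left clean or dirtied once for hint bits, pruning and the all-visible flag together, rather than being re-dirtied by each mechanism separately.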

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


