Re: Turning off HOT/Cleanup sometimes

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Turning off HOT/Cleanup sometimes
Дата
Msg-id CA+TgmoZ7T+KQyGYwYqFEGkwK3SnYMM3bqQzj1+=0CiKAS8YF4A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Turning off HOT/Cleanup sometimes  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Turning off HOT/Cleanup sometimes  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
On Thu, Jan 9, 2014 at 12:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Jan 8, 2014 at 3:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> We also make SELECT clean up blocks as it goes. That is useful in OLTP
>>> workloads, but it means that large SQL queries and pg_dump effectively
>>> do much the same work as VACUUM, generating huge amounts of I/O and
>>> WAL on the master, the cost and annoyance of which is experienced
>>> directly by the user. That is avoided on standbys.
>
>> On a pgbench workload, though, essentially all page cleanup happens as
>> a result of HOT cleanups, like >99.9%.  It might be OK to have that
>> happen for write operations, but it would be a performance disaster if
>> updates didn't try to HOT-prune.  Our usual argument for doing HOT
>> pruning even on SELECT cleanups is that not doing so pessimizes
>> repeated scans, but there are clearly cases that end up worse off as a
>> result of that decision.
>
> My recollection of the discussion when HOT was developed is that it works
> that way not because anyone thought it was beneficial, but simply because
> we didn't see an easy way to know when first fetching a page whether we're
> going to try to UPDATE some tuple on the page.  (And we can't postpone the
> pruning, because the query will have tuple pointers into the page later.)
> Maybe we should work a little harder on passing that information down.
> It seems reasonable to me that SELECTs shouldn't be tasked with doing
> HOT pruning.
>
>> I'm not entirely wild about adding a parameter in this area because it
>> seems that we're increasingly choosing to further expose what arguably
>> ought to be internal implementation details.
>
> I'm -1 for a parameter as well, but I think that just stopping SELECTs
> from doing pruning at all might well be a win.  It's at least worthy
> of some investigation.

Unfortunately, there's no categorical answer.  You can come up with
workloads where HOT pruning on selects is a win; just create a bunch
of junk and then read the same pages lots of times in a row.  And you
can also come up with workloads where it's a loss; create a bunch of
junk and then read them just once.  I don't know how easy it's going
to be to set that parameter in a useful way for some particular
environment, and I think that's possibly an argument against having
it.  But the argument that we don't need a parameter because one
behavior is best for everyone is not going to fly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
Следующее
От: Robert Haas
Дата:
Сообщение: Re: newlines at end of generated SQL