On 8 January 2014 08:33, Simon Riggs <simon@2ndquadrant.com> wrote:
> VACUUM cleans up blocks, which is nice because it happens offline in a
> lazy manner.
>
> We also make SELECT clean up blocks as it goes. That is useful in OLTP
> workloads, but it means that large SQL queries and pg_dump effectively
> do much the same work as VACUUM, generating huge amounts of I/O and
> WAL on the master, the cost and annoyance of which is experienced
> directly by the user. That is avoided on standbys.
>
> Effects of that are that long running statements often run much longer
> than we want, increasing bloat as a result. It also produces wildly
> varying response times, depending upon extent of cleanup required.
>
> It is a simple task to make that behaviour optional on the master.
>
> I propose a USERSET parameter, prune_cost_limit (<---insert better name here)
> which will make the behaviour optional, default -1, in normal user
> processes. VACUUM will ignore this parameter and so its actions will
> never be deferred.
>
> In detail, this parameter would disable pruning for any scan larger
> than the cost limit. So large scans will disable the behaviour. The
> default, -1, means never disable pruning, which is the current
> behavour.
>
> We track the number of pages dirtied by the current statement. When
> this reaches prune_cost_limit, we will apply these behaviours to all
> shared_buffer block accesses...
>
> (1) avoid running heap_page_prune_opt()
>
> (2) avoid dirtying the buffer for hints. (This is safe because the
> hinted changes will either be lost or will be part of the full page
> image when we make a logged-change).
>
> (i.e. doesn't apply to temp tables)
>
> For example, if we set prune_cost_limit = 4 this behaviour allows
> small index lookups via bitmapheapscan to continue to cleanup, while
> larger index and seq scans will avoid cleanup.
>
>
>
> There would be a postgresql.conf parameter prune_cost_limit, as well
> as a table level parameter that would prevent pruning except via
> VACUUM.
>
> This will help in these ways
> * Reduce write I/O from SELECTs and pg_dump - improving backups and BI queries
> * Allow finer grained control over Hot Standby conflicts
> * Potentially allow diagnostic inspection of older data via SeqScan
>
> Prototype patch shows this is possible and simple enough for 9.4.
> Major objections? Or should I polish up and submit?
Patch attached, implemented to reduce writes by SELECTs only.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services