Any plans to expose publicly (via stored proc) relation_needs_vacanalyze, or some flavour of do_autovacuum?

Поиск
Список
Период
Сортировка
От Vlad Romascanu
Тема Any plans to expose publicly (via stored proc) relation_needs_vacanalyze, or some flavour of do_autovacuum?
Дата
Msg-id AANLkTik7FjH-nOC3VMEL4HispjdKLxjJBc5Jv57p06Ht@mail.gmail.com
обсуждение исходный текст
Ответы Re: Any plans to expose publicly (via stored proc) relation_needs_vacanalyze, or some flavour of do_autovacuum?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
Imagine the following sequence of events:
1. a "writer" session begins a transaction, growing the number of live
tuples in several tables (e.g. via COPY) from mere tens (or hundreds)
to tens of thousands of tuples, then COMMITs
2. one or more "reader" sessions perform a SELECT ... JOIN on the very
tables grown by the "writer" session

Currently, unless the autovacuum daemon happens to run after the
"writer" session COMMITs and complete before the "reader" session(s)
SELECT, then the "reader" session(s) may complete execution in minutes
instead of milliseconds because of an incorrect execution plan based
on stale statistics.  Have seen this happen in reality.

In my specific case, all write operations are serialized via an
application-level mutex, i.e. there will only be one "writer" session
and multiple "reader" sessions at any given time.  In such a setup,
the most obvious workaround is to explicitly ANALYZE the affected
tables, prior to COMMITting, in the "writer" session.  New data would
thus be committed along with up-to-date statistics.

However, let's tweak the above scenario and assume that, later on, the
"writer" session only INSERTs one row in a table which now counts
several million tuples.-- indiscriminately running ANALYZE after such
a measly INSERTion is overkill.

The logical, amended solution would then be to have the "writer"
session perform, after INSERTion but before COMMITTing, the same
calculation that the autovacuum daemon currently performs inside
relation_needs_vacanalyze, based on the same configuration parameters
that the autovacuum daemon uses, before deciding whether to explicitly
ANALYZE or not the affected application tables.  Ideally one would not
want to duplicate the relation_needs_vacanalyze logic (not to mention
having to guess the value of last_anl_tuples, which is not exposed via
any storedproc -- one would have to assume that it is more or less
equivalent to pg_class.reltuples.)

So my question is: does it sound reasonable to (and/or are there any
existing plans to) expose either relation_needs_vacanalyze, or a
per-table flavour of do_autoanalyze, to the public via a stored proc
for those in my situation or who need more granular control over
autovacuuming than the autovacuum daemon does?

Thank you,
Vlad.

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Schema search path
Следующее
От: Yaroslav Tykhiy
Дата:
Сообщение: Re: Schema search path