Re: pg_upgrade: Pass -j down to vacuumdb

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: pg_upgrade: Pass -j down to vacuumdb
Дата
Msg-id CAMkU=1zcOfThON4p00VTWEO1xdY6R8Xbh9KSt9pZkLHq0gdE=A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: pg_upgrade: Pass -j down to vacuumdb  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Ответы Re: pg_upgrade: Pass -j down to vacuumdb
Список pgsql-hackers
On Tue, Mar 26, 2019 at 7:28 AM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 2019-03-25 22:57, Tom Lane wrote:
> +     fprintf(script, "echo %sYou may wish to add --jobs=N for parallel analyzing.%s\n",
> +                     ECHO_QUOTE, ECHO_QUOTE);

But then you get that information after you have already started the script.

True, but the same goes for all the other information there, and it sleeps to let you break out of it.  And I make it a habit to glance through any scripts someone suggests that I run, so would notice the embedded advice without running it at all.

I don't find any information about this analyze business on the
pg_upgrade reference page.  Maybe a discussion there could explain the
different paths better than making the output script extra complicated.

Essentially: If you want a slow and gentle analyze, use the supplied
script.  If you want a fast analyze, use vacuumdb, perhaps with an
appropriate --jobs option.  Note that pg_upgrade --jobs and vacuumdb
--jobs are resource-bound in different ways, so the same value might not
be appropriate for both.


To me, analyze-in-stages is not about gentleness at all.  For example, it does nothing to move vacuum_cost_delay away from its default of 0.  Rather, it is about slamming the bare minimum statistics in there as fast as possible, so your database doesn't keel over from horrible query plans on even simple queries as soon as you reopen.  I want the database to survive long enough for the more complete statistics to be gathered.  If you have quickly accumulated max_connection processes all running horrible query plans that never finish, your database might as well still be closed for all the good it does the users. And all the load generated by those is going to make the critical ANALYZE all that much slower.

At first blush I thought it was obvious that you would not want to run analyze-in-stages in parallel.  But after thinking about it some more and reflecting on experience doing some troublesome upgrades, I would reverse that and say it is now obvious you do want at least the first stage of analyze-in-stages, and probably the first two, to run in parallel.  That is not currently an option it supports, so we can't really recommend it in the script or the docs.

But we could at least adopt the more straightforward patch, suggest that if they don't want analyze-in-stages they should consider doing the big-bang analyze in parallel.

Cheers,

Jeff

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Langote
Дата:
Сообщение: Re: Ordered Partitioned Table Scans
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: Log a sample of transactions