Re: BUG #13895: Hot Standby locked replaying auto vacuum against pg_catalog.pg_statistics

Поиск
Список
Период
Сортировка
От Maxim Boguk
Тема Re: BUG #13895: Hot Standby locked replaying auto vacuum against pg_catalog.pg_statistics
Дата
Msg-id CAK-MWwTCBMK=wst5mHeZ_eVS9rX1xaDvxBDZbAzcO-a2eeJ=zA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #13895: Hot Standby locked replaying auto vacuum against pg_catalog.pg_statistics  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-bugs
On Thu, Jan 28, 2016 at 11:53 PM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

> dimitri@2ndQuadrant.fr wrote:
>
> >
> > I witnessed a case where I had no time to extract information about it,
> so
> > it's all from memory, sorry about that.
> >
> > We had several Hot Standby nodes in 9.3.x used for load balancing read
> only
> > queries. All queries were stuck, and pg_locks showed they were refused =
a
> > lock against pg_catalog.pg_statistics.
>
> My WAG is that this is related to the standby replaying the
> AccessExclusive lock obtained to truncate pg_statistics.  That seems
> consistent with the presented symptoms.
>
> I would have assumed that the actual truncation shouldn't take a
> particularly long time, so user processes on the standby wouldn't be
> locked for long enough to cause visible trouble.  Maybe the table was
> originally very bloated and a lot of pages got removed, leading the
> replay process to take some time.  It is strange that it would be locked
> for long enough to allow for manually querying pg_locks and
> pg_stat_activity.
>
> --
> =C3=81lvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

=E2=80=8BIf pg_statistics had been seriously bloated (like few GB of empty =
space at
the end),
and in the same time mechanical (and busy) HDD used for the database, the
vacuum truncation could require really long time on replicas.
I seen it few times with a different application tables during
pg_compactable runs.
Reason simple: this empty space at the end of table not used anywhere so it
stay cold (outside of the shared buffers and file system cache).
I not sure (didn't checked actual code) but I have feeling that the replay
process (or OS) going to read every empty page before truncation all time
keeping AccessExclusive, and it might use a of lot time.
I think it might be useful to investigate what exact disk operations
performed during truncation scan, and try move some of the work outside of
AccessExclusive block.
(I not sure again but it seems that truncated (and cold) part of relation
will be read backward page by page from the end of file, such reading mode
have an abysmal performance on mechanical storage).


--=20
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.com/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: BUG #13896: Current docs says 'worker_spi' is a contrib
Следующее
От: "Sinha, Amitava"
Дата:
Сообщение: Re: Is Windows Posready 2009 XP operating system supported?