Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby

From Alvaro Herrera
Subject Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
Msg-id 20140604174659.GP5146@eldon.alvh.no-ip.org
In response to Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby  (Serge Negodyuck <petr@petrovich.kiev.ua>)
Responses Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-bugs
Serge Negodyuck wrote:
> 2014-06-02 17:10 GMT+03:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
>
> > Serge Negodyuck wrote:
> > > Hello,
> > >
> > > I've upgraded postgresql to version 9.3.4 and did fresh initdb and
> > > restored database from sql backup.
> > > According to 9.3.4 changelog issue with multixact wraparound was fixed.
> >
> > Ouch.  This is rather strange.  First I see the failing multixact has
> > 8684 members, which is totally unusual.  My guess is that you have code
> > that creates lots of subtransactions, and perhaps does something to one
> > tuple in a different subtransaction; doing something like that would be,
> > I think, the only way to get subxacts that large.  Does that sound
> > right?
> >
> It sounds like you are right. I've found a lot of inserts in logs. Each
> insert causes a trigger to fire. This trigger updates a counter in
> another table.
> It is very possible this trigger tries to update the same counter for
> different inserts.

I wasn't able to reproduce it that way, but I eventually figured out
that if I altered the plpython function to grab a FOR NO KEY
UPDATE lock first, insertion would grow the multixact beyond reasonable
limits; see the attachment.  If you then INSERT many tuples in "product"
in a single transaction, the resulting xmax is a multixact that has as
many members as there are inserts, plus one.
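
For readers without the attachment, here is a rough sketch of the kind of
function body described above.  It is not the attached function: the
"counters" table, its "id" and "value" columns, and the bump_counter name
are all hypothetical, and plpy is passed in as a parameter only so the
sketch can be read outside the server (inside PL/Python it is a global).

```python
def bump_counter(plpy, counter_id):
    """Lock a counter row, then update it, inside one subtransaction.

    Hypothetical sketch: table and column names are assumptions, not
    taken from the attachment.
    """
    counter_id = int(counter_id)  # assumed integer key; keeps the SQL literal safe
    with plpy.subtransaction():
        # Take the row lock first, as described in the text above.
        plpy.execute(
            "SELECT 1 FROM counters WHERE id = %d FOR NO KEY UPDATE" % counter_id
        )
        # Then update the counter within the same subtransaction.
        plpy.execute(
            "UPDATE counters SET value = value + 1 WHERE id = %d" % counter_id
        )
```

Called once per inserted row from a trigger, each call runs in its own
subtransaction, which is the pattern the paragraph above describes.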

(One variation that causes even more bizarre results is dispensing with
the plpy.subtransaction() in the function and instead setting a
savepoint before each insert.  In fact, given the multixact members
shown in your log snippet I think that's more similar to what your code
does.)
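
A minimal sketch of how the savepoint variation could be scripted, for
anyone who wants to try it.  It only generates SQL text to feed to psql;
the column list for "product" and the savepoint names are hypothetical
and need adjusting to the real schema.

```python
def savepoint_insert_script(n_inserts):
    """Build one transaction that sets a SAVEPOINT before each INSERT."""
    lines = ["BEGIN;"]
    for i in range(1, n_inserts + 1):
        lines.append("SAVEPOINT s%d;" % i)
        # Hypothetical column list; adjust to the real "product" table.
        lines.append("INSERT INTO product (name) VALUES ('item %d');" % i)
    lines.append("COMMIT;")
    return "\n".join(lines)


if __name__ == "__main__":
    # Print a small script; raise the count to grow the multixact further.
    print(savepoint_insert_script(3))
```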

> > > Then, did pg_basebackup to slave database. It does not help
> > > 2014-06-02 09:58:49 EEST 172.18.10.17 db2 DETAIL: Could not open file
> > > "pg_multixact/members/1112D": No such file or directory.
> > > 2014-06-02 09:58:49 EEST 172.18.10.18 db2 DETAIL: Could not open file
> > > "pg_multixact/members/11130": No such file or directory.
> > > 2014-06-02 09:58:51 EEST 172.18.10.34 db2 DETAIL: Could not open file
> > > "pg_multixact/members/11145": No such file or directory.
> > > 2014-06-02 09:58:51 EEST 172.18.10.38 db2 DETAIL: Could not open file
> > > "pg_multixact/members/13F76": No such file or directory
> >
> > Are these the only files missing?  Are intermediate files there?
>
> Only 0000 - 001E files were present on slave server.

I don't understand how files can be missing in the replica.
pg_basebackup simply copies all files it can find in the master to the
replica, so if the 111xx files are present in the master they should
certainly be present in the replica as well.  I gave the pg_basebackup
code a look just to be sure there is no 4-char pattern matching or
something like that, and it doesn't look like it attempts to do that at
all.  I also asked Magnus just to be sure and he confirms this.
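
To put the file names from the log in perspective, here is a small sketch
that decodes a members segment name into a segment number and a starting
member offset, and flags which names are longer than four characters.
The layout constants are assumptions for a default 9.3 build (8 kB blocks,
20-byte member groups of four members, 32 pages per segment); they are not
stated anywhere in this thread.

```python
# Assumed layout constants for a default build; not taken from this thread.
BLCKSZ = 8192
MEMBER_GROUP_SIZE = 4 * 4 + 4            # four 4-byte xids plus 4 flag bytes
MEMBERS_PER_PAGE = (BLCKSZ // MEMBER_GROUP_SIZE) * 4
PAGES_PER_SEGMENT = 32
MEMBERS_PER_SEGMENT = MEMBERS_PER_PAGE * PAGES_PER_SEGMENT


def segment_number(name):
    """Segment file names are hexadecimal segment numbers."""
    return int(name, 16)


def first_member_offset(name):
    """First member offset stored in the given segment file."""
    return segment_number(name) * MEMBERS_PER_SEGMENT


def has_four_char_name(name):
    """True if the file name fits the usual four-hex-digit form."""
    return len(name) == 4


if __name__ == "__main__":
    # 001E is the last file present on the slave; the rest are from the log.
    for name in ["001E", "1112D", "11130", "11145", "13F76"]:
        print(name, segment_number(name), first_member_offset(name),
              has_four_char_name(name))
```

Under these assumptions all of the missing files have five-character
names, while the files that are present (0000 - 001E) have four.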

I'm playing a bit more with this test case, I'll let you know where it
leads.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
