Re: Incomplete freezing when truncating a relation during vacuum

From Andres Freund
Subject Re: Incomplete freezing when truncating a relation during vacuum
Date
Msg-id 20131130160058.GB31100@awork2.anarazel.de
In response to Re: Incomplete freezing when truncating a relation during vacuum  (Noah Misch <noah@leadboat.com>)
Responses Re: Incomplete freezing when truncating a relation during vacuum  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Incomplete freezing when truncating a relation during vacuum  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
Hi Noah,

On 2013-11-30 00:40:06 -0500, Noah Misch wrote:
> > > On Wed, Nov 27, 2013 at 02:14:53PM +0100, Andres Freund wrote:
> > > > With regard to fixing things up, ISTM the best bet is heap_prune_chain()
> > > > so far. That's executed by vacuum and by opportunistic pruning and we
> > > > know we have the appropriate locks there. Looks relatively easy to fix
> > > > up things there. Not sure if there are any possible routes to WAL log
> > > > this but using log_newpage()?
> > > > I am really not sure what the best course of action is :(
> 
> Based on subsequent thread discussion, the plan you outline sounds reasonable.
> Here is a sketch of the specific semantics of that fixup.  If a HEAPTUPLE_LIVE
> tuple has XIDs older than the current relfrozenxid/relminmxid of its relation
> or newer than ReadNewTransactionId()/ReadNextMultiXactId(), freeze those XIDs.
> Do likewise for HEAPTUPLE_DELETE_IN_PROGRESS, ensuring a proper xmin if the
> in-progress deleter aborts.  Using log_newpage_buffer() seems fine; there's no
> need to optimize performance there.

We'd need to decide what to do with xmax values; they'd likely need to
be treated differently.
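
For illustration only, the xmin side of the check sketched in the quoted
text might be modelled roughly as below. This is a Python model, not
PostgreSQL code: xid_precedes() imitates the circular comparison done by
TransactionIdPrecedes(), while xid_needs_fixup() and its parameters are
hypothetical names. xmax and multixacts are left out, since they'd likely
need different treatment.

```python
FIRST_NORMAL_XID = 3  # XIDs 0..2 are reserved (invalid, bootstrap, frozen)


def xid_precedes(a: int, b: int) -> bool:
    """Circular 32-bit comparison, modelled on TransactionIdPrecedes()."""
    if a < FIRST_NORMAL_XID or b < FIRST_NORMAL_XID:
        # Special XIDs compare as plain unsigned integers.
        return a < b
    diff = (a - b) & 0xFFFFFFFF
    # Equivalent to (int32) (a - b) < 0 in C.
    return diff >= 0x80000000


def xid_needs_fixup(xid: int, relfrozenxid: int, next_xid: int) -> bool:
    """Hypothetical helper: True if xid lies outside the plausible range.

    relfrozenxid stands for the relation's current relfrozenxid and
    next_xid for the value ReadNewTransactionId() would return.
    """
    if xid < FIRST_NORMAL_XID:
        return False  # already frozen or otherwise special
    too_old = xid_precedes(xid, relfrozenxid)
    too_new = xid_precedes(next_xid, xid)
    return too_old or too_new
```

With relfrozenxid = 200 and a next XID of 1000, an xmin of 100 or 2000
would be flagged and an xmin of 500 would not.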

The problem with log_newpage_buffer() is that we'd quite possibly issue
one such call per item on a page. And that might become quite
expensive. Logging ~1.5MB per 8k page in the worst case sounds a bit
scary.
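
A back-of-the-envelope version of that estimate, as a rough sketch: it
assumes every log_newpage_buffer() call writes one full 8k page image and
ignores record headers. The item count of 192 is an assumption picked
because it reproduces the ~1.5MB figure; the count actually used for the
estimate isn't stated. newpage_wal_bytes() is a made-up name.

```python
BLCKSZ = 8192  # default PostgreSQL block size in bytes


def newpage_wal_bytes(items_fixed: int, page_size: int = BLCKSZ) -> int:
    """WAL volume if each fixed item triggers one full-page image.

    Ignores per-record overhead, so this slightly underestimates.
    """
    return items_fixed * page_size


# Assumed count: 192 items on the page, each logged separately.
worst_case = newpage_wal_bytes(192)
print(worst_case, worst_case / 2**20)  # bytes, and the same in MiB
```

Logging the page once after all items on it have been fixed would bring
that down to a single 8k image, at the cost of more bookkeeping.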

> (All the better if we can, with minimal
> hacks, convince heap_freeze_tuple() itself to log the right changes.)

That likely comes too late - we've already pruned the page and might have
made wrong decisions there. Also, heap_freeze_tuple() is run on both the
primary and standbys.
I think our xl_heap_freeze format, which relies on running
heap_freeze_tuple() during recovery, is a terrible idea, but we can't
change that right now.

> Time is tight to finalize this, but it would be best to get this into next
> week's release.  That way, the announcement, fix, and mitigating code
> pertaining to this data loss bug all land in the same release.  If necessary,
> I think it would be worth delaying the release, or issuing a new release a
> week or two later, to closely align those events.  That being said, I'm
> prepared to review a patch in this area over the weekend.

I don't think I currently have the energy/brainpower/time to develop
such a fix in suitable quality by Monday. I've worked pretty hard on
trying to fix the host of multixact data corruption bugs over the last
few days, and developing a solution that I'd be happy to put into such
critical paths is certainly several days' worth of work.

I am not sure it's a good idea to delay the release because of this;
there are so many other critical issues that it seems like a bad
tradeoff.

That said, if somebody else is taking the lead I am certainly willing to
help in detail with review and testing.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


