Re: pgsql: Fix a couple of bugs in MultiXactId freezing
From: Alvaro Herrera
Msg-id: 20131211170005.GA5777@eldon.alvh.no-ip.org
In reply to: Re: pgsql: Fix a couple of bugs in MultiXactId freezing (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: pgsql: Fix a couple of bugs in MultiXactId freezing
List: pgsql-hackers
Andres Freund wrote:
> On 2013-12-09 19:14:58 -0300, Alvaro Herrera wrote:
> > Here's a revamped version of this patch.  One thing I didn't do here is
> > revert the exporting of CreateMultiXactId, but I don't see any way to
> > avoid that.
>
> I don't so much have a problem with exporting CreateMultiXactId(), just
> with exporting it under its current name.  It's already quirky to have
> both MultiXactIdCreate and CreateMultiXactId() in multixact.c, but
> exporting it imo goes too far.

MultiXactIdCreateFromMembers(int, MultiXactMembers *) ?

> > Andres mentioned the idea of sharing some code between
> > heap_prepare_freeze_tuple and heap_tuple_needs_freeze, but I haven't
> > explored that.
>
> My idea would just be to have heap_tuple_needs_freeze() call
> heap_prepare_freeze_tuple() and check whether it returns true.  Yes,
> that's slightly more expensive than the current
> heap_tuple_needs_freeze(), but it's only called when we couldn't get a
> cleanup lock on a page, so that seems ok.

Doesn't seem a completely bad idea, but let's leave it for a separate
patch.  This should be changed in master only IMV anyway, while the rest
of this patch is to be backpatched to 9.3.

> > !	if (!MultiXactIdIsValid(multi))
> > !	{
> > !		/* Ensure infomask bits are appropriately set/reset */
> > !		*flags |= FRM_INVALIDATE_XMAX;
> > !		return InvalidTransactionId;
> > !	}
>
> Maybe comment that we're sure to be only called when IS_MULTI is set,
> which will be unset by FRM_INVALIDATE_XMAX?  I wondered twice whether we
> wouldn't just continually mark the buffer dirty this way.

Done.

> > !	else if (MultiXactIdPrecedes(multi, cutoff_multi))
> > !	{
> > !		/*
> > !		 * This old multi cannot possibly have members still running.  If it
>
> Can you place an Assert(!MultiXactIdIsRunning(multi)) here?

Done.

> > !		if (ISUPDATE_from_mxstatus(members[i].status) &&
> > !			!TransactionIdDidAbort(members[i].xid))
>
> It makes me wary to see a DidAbort() without a previous InProgress()
> call.
> Also, after we crashed, doesn't DidAbort() possibly return
> false for transactions that were in progress before we crashed?  At
> least that's how I always understood it, and how tqual.c is written.

Yes, that's correct.  But note that here we're not doing a tuple
liveness test, which is what tqual.c is doing.  What we do with this
info is keep the Xid as part of the multi if it's still running or
committed.  We also keep it if the xact crashed, which is fine because
the Xid will be removed by some later step.  If we know for certain
that the update Xid is aborted, then we can ignore it, but this is just
an optimization and not needed for correctness.

That loop had a bug, so I restructured it.  (If the update xact had
aborted, we wouldn't get to the "continue" and thus would treat it as a
locker only.  I don't think that behavior would cause any visible
misbehavior, but it's wrong nonetheless.)

One interesting bit is that we might end up creating singleton
MultiXactIds when freezing, if there's no updater and there's a running
locker.  We could avoid this (i.e. mark the tuple as locked by a single
Xid) but it would complicate FreezeMultiXactId's API, and it's unlikely
to occur with any frequency anyway.

> > !	/* We only keep lockers if they are still running */
> > !	if (TransactionIdIsCurrentTransactionId(members[i].xid) ||
>
> I don't think there's a need to check for
> TransactionIdIsCurrentTransactionId() - vacuum can explicitly *not* be
> run inside a transaction.

Keep in mind that freezing can also happen for tuples handled during a
table-rewrite operation such as CLUSTER.  I wouldn't place a bet that
you can't have a multi created by a transaction and later run CLUSTER
on the same table in the same transaction.  Maybe this is fine because
at that point we're holding an exclusive lock on the table, but it
seems fragile.  And the test is cheap anyway.

> > --- 5571,5586 ----
> >  		 * xvac transaction succeeded.
> >  		 */
> >  		if (tuple->t_infomask & HEAP_MOVED_OFF)
> > !			frz->frzflags |= XLH_FREEZE_XVAC;
> >  		else
> > !			frz->frzflags |= XLH_INVALID_XVAC;
> >
>
> Hm.  Isn't this case inverted?  I.e. shouldn't you set XLH_FREEZE_XVAC
> and XLH_INVALID_XVAC exactly the other way round?  I really don't
> understand the moved in/off stuff, since the code has been gone longer
> than I've followed the code...

Yep, fixed.

> > --- b/src/backend/access/rmgrdesc/mxactdesc.c
> > ***************
> > *** 41,47 **** out_member(StringInfo buf, MultiXactMember *member)
> >  			appendStringInfoString(buf, "(upd) ");
> >  			break;
> >  		default:
> > !			appendStringInfoString(buf, "(unk) ");
> >  			break;
> >  	}
> >  }
> > --- 41,47 ----
> >  			appendStringInfoString(buf, "(upd) ");
> >  			break;
> >  		default:
> > !			appendStringInfo(buf, "(unk) ", member->status);
> >  			break;
> >  	}
> >  }
>
> That change doesn't look like it will do anything?

Meh.  That was a leftover --- removed.  (I was toying with the "desc"
code because it misbehaves when applied to records as they are created,
as opposed to records as they are replayed.  I'm pretty sure everyone
already knows about this, and it's the reason why everybody has skimped
on examining arrays of things stored in followup data records.  I was
naive enough to write code that tries to decode the followup record
that contains the members of the multixact we're creating, which works
fine during replay but gets them completely wrong during regular
operation.  This is the third time I'm surprised by this misbehavior;
blame my bad memory for not remembering that it's not supposed to work
in the first place.)

Right now there is one case in this code that returns
FRM_INVALIDATE_XMAX when it's not strictly necessary, i.e. it would
also work to keep the multi as is and return FRM_NOOP instead; and it
also returns FRM_NOOP in one case when we could return
FRM_INVALIDATE_XMAX instead.
Neither does any great damage, but there is a consideration that future
examiners of the tuple would have to resolve the MultiXact by
themselves (==> performance hit).  On the other hand, returning
INVALIDATE causes the block to be dirtied, which is undesirable if it
isn't already dirty.  Maybe this can be optimized so that we return a
separate flag from FreezeMultiXactId when Xmax invalidation is
optional, so that we execute all such operations if and only if the
block is already dirty or being dirtied for other reasons.  That would
provide the cleanup for later onlookers, while not causing unnecessary
dirtying.

Attached are patches for this, for both 9.3 and master.  The 9.3 patch
keeps the original FREEZE record; I have tested that an unpatched
replica dies with:

PANIC:  heap2_redo: unknown op code 32
CONTEXT:  xlog redo UNKNOWN
LOG:  startup process (PID 316) was terminated by signal 6: Aborted

when the master is running the new code.  The message is ugly, but I
don't see any way to fix that.

For the master branch, I have removed the original FREEZE record
definition completely and bumped XLOG_PAGE_MAGIC.  This doesn't pose a
problem given that we have no replication between different major
versions.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services