Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
От | Peter Geoghegan |
---|---|
Тема | Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE |
Дата | |
Msg-id | CAM3SWZQ4O244fj_RwJUiZ=CiGOSjaYAc2_NwQYwKrvNAT1ExnA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE (Andres Freund <andres@2ndquadrant.com>) |
Ответы |
Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
(Andres Freund <andres@2ndquadrant.com>)
|
Список | pgsql-hackers |
On Sun, Sep 22, 2013 at 2:10 AM, Andres Freund <andres@2ndquadrant.com> wrote: > I can't follow here. Why does e.g. the promise tuple approach bloat more > than the subxact example? > The protocol is roughly: > 1) Insert index pointer containing an xid to be waiting upon instead of > the target tid into all indexes > 2) Insert heap tuple, we can be sure there's no conflict now > 3) Go through the indexes and repoint the item to point to the tid of the > heaptuple instead of the xid. > > There's zero heap or index bloat in the uncontended case. In the > contended case it's just the promise tuples from 1) that are inserted > before the conflict is detected. Those can be marked as dead when the > conflict happened. It depends on your definition of the contended case. You're assuming that insertion is the most probable outcome, when in fact much of the time updating is just as likely or even more likely. Many promise tuples may be inserted before actually seeing a conflict and deciding to update/lock for update. In order for the example in the docs to bloat at all, both the UPDATE and the INSERT need to fail within a tiny temporal window - that's what I mean by uncontended (it is usually tiny because if the UPDATE blocks, that often means it will succeed anyway, but if not the INSERT will very probably succeed). This is because the UPDATE won't bloat when no existing row is seen, because its subplan will return no rows. The INSERT will only bloat if it fails, which is generally very unlikely because of the fact that the UPDATE just did nothing. Contrast that with bloating almost every time an UPDATE is necessary (I think that bloat that is generally cleaned up synchronously is still bloat). That's before we even talk about the additional overhead. Making the locks expensive to release/clean-up could really hurt, since it appears they'll *have* to be unlocked before row locking, and during that time concurrent activity affecting the row to be locked can necessitate a full restart - that's a window we want to keep as small as possible. I think reviewer time would for now be much better spent discussing the patch at a higher level (along the lines of my recent mail to Stephen and Robert). I've been at least as guilty as anyone else in getting mired in these details. We'll be much better equipped to have this discussion afterwards, because it isn't clear to us if we really need or would find it at all useful to have long-lasting value locks, how frequently we'll need to retry and for what reasons, and so on. My immediate concern as the patch author is to come up with a better answer to the problem that Robert described [1], because "hey, I locked the row -- you take it from here user that might not have any version of it visible to you" is not good enough. I hope that there isn't a tension between solving that problem and offering the flexibility and composability of the proposed syntax. [1] http://www.postgresql.org/message-id/AANLkTineR-rDFWENeddLg=GrkT+epMHk2j9X0YqpiTY8@mail.gmail.com -- Peter Geoghegan
В списке pgsql-hackers по дате отправления: