Re: Problems with approach #2 to value locking (INSERT ... ON CONFLICT UPDATE/IGNORE patch)
From        | Peter Geoghegan
Subject     | Re: Problems with approach #2 to value locking (INSERT ... ON CONFLICT UPDATE/IGNORE patch)
Date        |
Msg-id      | CAM3SWZSd3MU=c9HcP-SxCbUcLvFWT8t9MZCotOJxuE3RbTvd0A@mail.gmail.com
In reply to | Problems with approach #2 to value locking (INSERT ... ON CONFLICT UPDATE/IGNORE patch)  (Peter Geoghegan <pg@heroku.com>)
Responses   | Re: Problems with approach #2 to value locking (INSERT ... ON CONFLICT UPDATE/IGNORE patch)
List        | pgsql-hackers
On Thu, Jan 1, 2015 at 11:17 PM, Peter Geoghegan <pg@heroku.com> wrote:
> The attached patch fixes Jeff's test case, as far as it goes. But, as
> I mentioned, it's not intended as a real fix. For one thing, I have
> made multiple changes to HeapTupleSatisfies*() routines, but haven't
> fully considered the consequences for, say, ri_triggers.c.
> HeapTupleSatisfiesSelf() and HeapTupleSatisfiesHistoricMVCC() were not
> modified. I've modified HeapTupleSatisfiesVacuum() to see
> InvalidTransactionId (not invalid xid infomask bits) as visible for
> garbage collection (alongside HeapTupleSatisfiesDirty() and
> HeapTupleSatisfiesMVCC(), which I changed first), and I wouldn't be
> surprised if that created new problems in other areas. In short, this
> patch is just for illustrative purposes.

I think that I see another race condition, which requires custom instrumentation of the server to demonstrate. Basically, even with the fix above, there are still similar races. This time I'm not calling ExecLockUpdateTuple() against "super deleted" NULL tuples, because with the fix described above those are no longer visible to any relevant snapshot type. However, I can still see the tuple header xmin change between the dirty index scan and the attempt to lock that row for update.

If I'm not mistaken, the race window is small enough that something like Jeff's tool won't catch it directly in a reasonable amount of time, because the substituted tuple happens to have the same index key value. It's not all that easy to recreate, even with my custom instrumentation. With fsync=off, it occurs somewhat infrequently with a custom instrumented Postgres build:

pg@hamster:~/jjanes_upsert$ perl count_upsert.pl 8 100000
[Mon Jan  5 17:31:57 2015] init done at count_upsert.pl line 101.
[Mon Jan  5 17:32:17 2015] child abnormal exit
[Mon Jan  5 17:32:17 2015] DBD::Pg::db selectall_arrayref failed: ERROR:  mismatch in xmin for (322,264). Initially 4324145, then 4324429 at count_upsert.pl line 210.\n at count_upsert.pl line 227.
[Mon Jan  5 17:32:49 2015] sum is -800
[Mon Jan  5 17:32:49 2015] count is 9515
[Mon Jan  5 17:32:49 2015] normal exit at 1420507969 after 733912 items processed at count_upsert.pl line 182.

I think that "super deleting" broken promise tuples undermines how vacuum interlocking is supposed to work [1]. We end up at the wrong tuple, one that happens to occupy the same slot as the old one (and happens to have the same index key value, because the race window is so small and there are relatively few distinct values in play; that's what makes it hard to recreate). The attached instrumentation patch was used with Jeff Janes' tool.

If we weren't super deleting in the patch, then I think that our backend's xmin horizon would be enough of an interlock (I haven't tested that theory against the value locking approach #1 implementation yet, but I worried about the scenario quite a lot before, and things seemed okay). But we are "super deleting" with implementation #2, and we're also not holding a pin on the heap buffer or B-Tree leaf page buffer throughout, so this happens (if I'm not mistaken).

[1] https://github.com/postgres/postgres/blob/REL9_3_STABLE/src/backend/access/nbtree/README#L142

--
Peter Geoghegan
Attachments