Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfiana bit)
От | Sokolov Yura |
---|---|
Тема | Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfiana bit) |
Дата | |
Msg-id | 20170813180524.41a8bdd7@falcon-work обсуждение исходный текст |
Ответ на | Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit) (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)
|
Список | pgsql-hackers |
В Fri, 11 Aug 2017 14:05:08 -0400 Robert Haas <robertmhaas@gmail.com> пишет: > On Thu, Aug 10, 2017 at 11:12 AM, Alexander Korotkov > <a.korotkov@postgrespro.ru> wrote: > > These results look very cool! > > I think CSN is eventually inevitable, but it's a long distance > > feature. Thus, this optimization could make us a good serve before > > we would have CSN. Do you expect there are cases when this patch > > causes slowdown? What about case when we scan each xip array only > > once (not sure how to simulate that using pgbench)? > > Just a random thought here: > > The statements pgbench executes are pretty simple: they touch one row > in one table. You wouldn't expect them to scan the xip array very > many times. If even those statements touch the array enough for this > to win, it seems like it might be hard to construct an even worse > case. I might be missing something, though. With zipfian distribution, many concurrent transactions tries to update the same row. Every transaction eventually updates this row (may be after waiting), so there are a lot of concurrent row version, and transaction scans through its snapshot many times. > > We're not going to accept code like this, though: > > + d = xip[i] >> 6; > + j = k = xip[i] & mask; > + while (xiphash[j] != InvalidTransactionId) > + { > + j = (k + d) & mask; > + d = d*5 + 1; > + } > > Single-character variable names, hard-coded constants, and no > comments! I totally agree. First version (which you've cited) was clear POC. Second version (I've sent in Thursday) looks a bit better, but I agree there is room for improvement. I'll try to make it prettier. BTW, I have a question: there is CopySnapshot function, so snapshot is clearly changed after copy. But I could not follow: is xip and subxip array changes in a copy, or it remains the same to original, but only other snapshot fields could be changed? > I kind of doubt as a general point that we really want another > open-coded hash table -- I wonder if this could be made to use > simplehash. I like simplehash very much (although I'm missing couple of features). Unfortunately, siplehash is not simple enough for this use case: - it will use at least twice more memory for hash table itself (cause it have to have 'entry->status' field), - it has large header (ad-hoc hash-table consumes 1 byte in SnapshotData structure), - its allocation will be trickier (custom hash-table is co-allocated after xip-array itself), - its insert algorithm looks to be much more expensive (at least, it is more complex in instructions count). I thing, using simplehash here will lead to much more invasive patch, than this ad-hoc table. And I believe it will be slower (and this place is very performance critical), though I will not bet on. BTW, ad-hoc hash tables already exist: at least recourse owner uses one. -- With regards, Sokolov Yura aka funny_falcon Postgres Professional: https://postgrespro.ru The Russian Postgres Company
В списке pgsql-hackers по дате отправления: