Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian abit)
From:        Yura Sokolov
Subject:     Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian abit)
Date:
Msg-id:      a7b61e78-8c92-6f06-5fbb-09f51f017421@gmail.com
In reply to: Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian abit) (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List:        pgsql-hackers
17.03.2018 03:36, Tomas Vondra wrote:
>
> On 03/17/2018 12:03 AM, Yura Sokolov wrote:
>> 16.03.2018 04:23, Tomas Vondra wrote:
>>>
>>> ...
>>>
>>> OK, a few more comments.
>>>
>>> 1) The code in ExtendXipSizeForHash seems somewhat redundant with
>>> my_log2 (that is, we could just call the existing function).
>>
>> Yes, I could call my_log2 from ExtendXipSizeForHash. But wouldn't one
>> more call be more expensive than the loop itself?
>>
>
> I very much doubt there would be a measurable difference. Firstly,
> function calls are not that expensive, otherwise we'd be running around
> and removing function calls from the hot paths. Secondly, the call only
> happens with many XIDs, and in that case the cost should be outweighed
> by faster lookups. And finally, the function does stuff that seems far
> more expensive than a single function call (for example allocation,
> which may easily trigger a malloc).
>
> In fact, in the interesting cases it's pretty much guaranteed to hit a
> malloc, because the number of backend processes needs to be high enough
> (say, 256 or more), which means
>
>     GetMaxSnapshotSubxidCount()
>
> will translate to something like
>
>     ((PGPROC_MAX_CACHED_SUBXIDS + 1) * PROCARRAY_MAXPROCS)
>     = (64 + 1) * 256
>     = 16640
>
> and because XIDs are 4B each, that's ~65kB of memory (even ignoring the
> ExtendXipSizeForHash business). And aset.c treats chunks larger than 8kB
> as separate blocks, which are always malloc-ed independently.
>
> But I may be missing something, and the extra call actually makes a
> difference. But I think the onus of proving that is on you, and the
> default should be not to duplicate code.
>
>>> 2) Apparently xhlog/subxhlog fields are used for two purposes - to
>>> store log2 of the two XID counts, and to remember if the hash table
>>> is built. That's a bit confusing (at least it should be explained
>>> in a comment) but most importantly it seems a bit unnecessary.
>>
>> Ok, I'll add a comment.
>>
>>> I assume it was done to save space, but I very much doubt that
>>> makes any difference. So we could easily keep a separate flag. I'm
>>> pretty sure there are holes in the SnapshotData struct, so we could
>>> squeeze the flags into one of those.
>>
>> There's a hole right after xhlog. But it would be a bit strange to
>> move fields around.
>>
>
> Is SnapshotData really that sensitive to size increases? I have my doubts
> about that, TBH. The default stance should be to make the code easy to
> understand. That is, we should only move fields around if it actually
> makes a difference.
>
>>> But do we even need a separate flag? We could just as easily keep
>>> the log2 fields set to 0 and only set them after actually building
>>> the hash table.
>>
>> There is a need to signal that there is space for the hash. It is not
>> always allocated. IIRC, I didn't cover the case where the snapshot was
>> restored from a file, and some other place or two.
>> Only if all places where the snapshot is allocated were properly modified
>> to allocate space for the hash could the flag be omitted, and the log2
>> itself used as a flag.
>>
>
> Hmmm, that makes it a bit inconsistent, I guess ... why not do the
> same thing in all those places?
>
>>> Or even better, why not store the mask so that XidInXip does not
>>> need to compute it over and over (granted, that's uint32 instead
>>> of just uint8, but I don't think SnapshotData is particularly
>>> sensitive to this, especially considering how much larger the xid
>>> hash table is).
>>
>> I don't like unnecessary struct size increases. And I doubt that the
>> computation matters. I could be mistaken though, because it is a hot
>> path. Do you think it will significantly improve readability?
>>
>
> IMHO the primary goal is to make the code easy to read and understand,
> and only optimize struct size if it actually makes a difference.
We have
> no such proof here, and I very much doubt you'll be able to show any
> difference, because even with separate flags pahole says this:
>
> struct SnapshotData {
>         SnapshotSatisfiesFunc      satisfies;            /*     0     8 */
>         TransactionId              xmin;                 /*     8     4 */
>         TransactionId              xmax;                 /*    12     4 */
>         TransactionId *            xip;                  /*    16     8 */
>         uint32                     xcnt;                 /*    24     4 */
>         uint8                      xhlog;                /*    28     1 */
>
>         /* XXX 3 bytes hole, try to pack */
>
>         TransactionId *            subxip;               /*    32     8 */
>         int32                      subxcnt;              /*    40     4 */
>         uint8                      subxhlog;             /*    44     1 */
>         bool                       suboverflowed;        /*    45     1 */
>         bool                       takenDuringRecovery;  /*    46     1 */
>         bool                       copied;               /*    47     1 */
>         CommandId                  curcid;               /*    48     4 */
>         uint32                     speculativeToken;     /*    52     4 */
>         uint32                     active_count;         /*    56     4 */
>         uint32                     regd_count;           /*    60     4 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         pairingheap_node           ph_node;              /*    64    24 */
>         TimestampTz                whenTaken;            /*    88     8 */
>         XLogRecPtr                 lsn;                  /*    96     8 */
>
>         /* size: 104, cachelines: 2, members: 19 */
>         /* sum members: 101, holes: 1, sum holes: 3 */
>         /* last cacheline: 40 bytes */
> };
>
> That is, the whole structure is only 104B - had it gotten over 128B, it
> might have made a difference due to an extra cacheline. In fact, even if
> you make xhlog and subxhlog uint32 (which would be enough to store
> the mask, which is what simplehash does), it'd be only 120B.
>
> Based on this I'd dare to say neither of those changes would have a
> negative impact.
>
> So I suggest splitting each of the "combined" fields into a simple bool
> flag ("hash table built") and a uint32 mask, and seeing if it simplifies
> the code (I believe it will) and what impact it has (I believe there
> will be none).
>
> Sorry if this seems like bikeshedding ...
>
>
> regards
>

I've made a version "without bit magic" as you suggested (attached). I
can't reliably say whether there is any performance difference from the
previous version.

With regards,
Yura.
Attachments