Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian abit)

From: Yura Sokolov
Subject: Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian abit)
Date:
Msg-id: a7b61e78-8c92-6f06-5fbb-09f51f017421@gmail.com
In reply to: Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian abit)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
On 17.03.2018 03:36, Tomas Vondra wrote:
> 
> On 03/17/2018 12:03 AM, Yura Sokolov wrote:
>> On 16.03.2018 04:23, Tomas Vondra wrote:
>>>
>>> ...
>>>
>>> OK, a few more comments.
>>>
>>> 1) The code in ExtendXipSizeForHash seems somewhat redundant with
>>> my_log2 (that is, we could just call the existing function).
>>
>> Yes, I could call my_log2 from ExtendXipSizeForHash. But wouldn't one
>> more call be more expensive than the loop itself?
>>
> 
> I very much doubt there would be a measurable difference. Firstly,
> function calls are not that expensive, otherwise we'd be running around
> and removing function calls from the hot paths. Secondly, the call only
> happens with many XIDs, and in that case the cost should be outweighed
> by faster lookups. And finally, the function does stuff that seems far
> more expensive than a single function call (for example allocation,
> which may easily trigger a malloc).
> 
> In fact, in the interesting cases it's pretty much guaranteed to hit a
> malloc, because the number of backend processes needs to be high enough
> (say, 256 or more), which means
> 
>     GetMaxSnapshotSubxidCount()
> 
> will translate to something like
> 
>     ((PGPROC_MAX_CACHED_SUBXIDS + 1) * PROCARRAY_MAXPROCS)
>     = (64 + 1) * 256
>     = 16640
> 
> and because XIDs are 4B each, that's ~65kB of memory (even ignoring the
> ExtendXipSizeForHash business). And aset.c treats chunks larger than 8kB
> as separate blocks, that are always malloc-ed independently.
> 
> But I may be missing something, and the extra call actually makes a
> difference. But I think the onus of proving that is on you, and the
> default should be not to duplicate code.
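For reference, PostgreSQL's my_log2 (in dynahash.c) amounts to roughly the loop below, so a private copy of it in ExtendXipSizeForHash would save at most one function call:

```c
/* Sketch of what my_log2 computes: the smallest k such that
 * (1 << k) >= num, i.e. ceil(log2(num)).  Duplicating this loop
 * in ExtendXipSizeForHash buys nothing over simply calling it. */
static int
ceil_log2(long num)
{
    int   i;
    long  limit;

    for (i = 0, limit = 1; limit < num; i++, limit <<= 1)
        ;
    return i;
}
```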
> 
>>> 2) Apparently xhlog/subxhlog fields are used for two purposes - to
>>> store log2 of the two XID counts, and to remember if the hash table
>>> is built. That's a bit confusing (at least it should be explained
>>> in a comment) but most importantly it seems a bit unnecessary.
>>
>> Ok, I'll add comment.
>>
>>> I assume it was done to save space, but I very much doubt that
>>> makes any difference. So we could easily keep a separate flag. I'm
>>> pretty sure there are holes in the SnapshotData struct, so we could
>>> squeeze the flags into one of those.
>>
>> There's a hole right after xhlog. But it would be a bit strange to
>> move fields around.
>>
> 
> Is SnapshotData really that sensitive to size increase? I have my doubts
> about that, TBH. The default stance should be to make the code easy to
> understand. That is, we should only move fields around if it actually
> makes a difference.
> 
>>> But do we even need a separate flag? We could just as easily keep 
>>> the log2 fields set to 0 and only set them after actually building 
>>> the hash table.
>>
>> There is a need to signal that there is space for the hash; it is not
>> always allocated. IIRC, I didn't cover the case where the snapshot is
>> restored from a file, and one or two other places. Only if all the
>> places where a snapshot is allocated were properly modified to
>> allocate space for the hash could the flag be omitted and the log2
>> itself used as a flag.
>>
> 
> Hmmm, that makes it a bit inconsistent, I guess ... why not do the
> same thing in all those places?
> 
>>> Or even better, why not store the mask so that XidInXip does not
>>> need to compute it over and over (granted, that's uint32 instead
>>> of just uint8, but I don't think SnapshotData is particularly
>>> sensitive to this, especially considering how much larger the xid
>>> hash table is).
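To make the trade-off concrete (names here are illustrative, not from the patch): with only the log2 stored, every probe derives the mask; with the mask stored, as simplehash does, the probe is a single AND.

```c
#include <stdint.h>

/* Variant 1: only log2 of the table size is kept (a uint8 xhlog
 * field); the mask must be recomputed on every lookup. */
static inline uint32_t
slot_from_log2(uint32_t hash, uint8_t xhlog)
{
    return hash & ((1u << xhlog) - 1);
}

/* Variant 2: the mask itself is kept (a uint32 field), so the
 * lookup skips the shift and subtract. */
static inline uint32_t
slot_from_mask(uint32_t hash, uint32_t xmask)
{
    return hash & xmask;
}
```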
>>
>> I don't like an unnecessary increase in struct size. And I doubt the
>> computation matters, though I could be mistaken because it is a hot
>> path. Do you think it would significantly improve readability?
>>
> 
> IMHO the primary goal is to make the code easy to read and understand,
> and only optimize struct size if it actually makes a difference. We have
> no such proof here, and I very much doubt you'll be able to show any
> difference because even with separate flags pahole says this:
> 
> struct SnapshotData {
>     SnapshotSatisfiesFunc      satisfies;            /*     0     8 */
>     TransactionId              xmin;                 /*     8     4 */
>     TransactionId              xmax;                 /*    12     4 */
>     TransactionId *            xip;                  /*    16     8 */
>     uint32                     xcnt;                 /*    24     4 */
>     uint8                      xhlog;                /*    28     1 */
> 
>     /* XXX 3 bytes hole, try to pack */
> 
>     TransactionId *            subxip;               /*    32     8 */
>     int32                      subxcnt;              /*    40     4 */
>     uint8                      subxhlog;             /*    44     1 */
>     bool                       suboverflowed;        /*    45     1 */
>     bool                       takenDuringRecovery;  /*    46     1 */
>     bool                       copied;               /*    47     1 */
>     CommandId                  curcid;               /*    48     4 */
>     uint32                     speculativeToken;     /*    52     4 */
>     uint32                     active_count;         /*    56     4 */
>     uint32                     regd_count;           /*    60     4 */
>     /* --- cacheline 1 boundary (64 bytes) --- */
>     pairingheap_node           ph_node;              /*    64    24 */
>     TimestampTz                whenTaken;            /*    88     8 */
>     XLogRecPtr                 lsn;                  /*    96     8 */
> 
>     /* size: 104, cachelines: 2, members: 19 */
>     /* sum members: 101, holes: 1, sum holes: 3 */
>     /* last cacheline: 40 bytes */
> };
> 
> That is, the whole structure is only 104B - had it gone over 128B, it
> might have made a difference due to the extra cacheline. In fact, even
> if you make the xhlog and subxhlog uint32 (which would be enough to
> store the mask, which is what simplehash does), it'd be only 120B.
> 
> Based on this I'd dare to say neither of those changes would have
> negative impact.
> 
> So I suggest splitting each of the "combined" fields into a simple bool
> flag ("hash table built") and a uint32 mask, and see if it simplifies
> the code (I believe it will) and what impact it has (I believe there
> will be none).
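A minimal sketch of that suggested split, with hypothetical field and function names (not from SnapshotData or any actual patch):

```c
#include <stdbool.h>
#include <stdint.h>

/* An explicit "built" flag plus a ready-to-use mask, instead of
 * encoding both facts in one uint8 log2 field. */
typedef struct XidHashState
{
    bool     built;   /* has the hash table been constructed? */
    uint32_t mask;    /* table size - 1; valid only when built */
} XidHashState;

static void
xid_hash_mark_built(XidHashState *st, uint32_t table_size)
{
    /* table_size is assumed to be a power of two */
    st->mask = table_size - 1;
    st->built = true;
}
```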
> 
> Sorry if this seems like bikeshedding ...
> 
> 
> regards
> 

I've made a version "without bit magic" as you suggested (attached).
I can't reliably tell whether there is any performance difference from
the previous version.

With regards,
Yura.

Attachments
