Re: Something's been bugging me
From | Tom Lane |
---|---|
Subject | Re: Something's been bugging me |
Date | |
Msg-id | 5790.1191164665@sss.pgh.pa.us |
In response to | Re: Something's been bugging me (Gregory Stark <stark@enterprisedb.com>) |
List | pgsql-hackers |
Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> I'd be inclined to make the second byte be the length and have
>> VARSIZE_1B_E depend on that --- any objection?

> On one hand it offends me since it's hard coding an assumption that the size
> of a pointer decides what it contains and vice versa. There's nothing saying
> we won't have two possible special meanings for a one-byte datum.

Well, what you're proposing is to treat the second byte of a 1B_E datum
as an enum value, requiring every piece of code that examines it to know
exactly what all the possible values are. That doesn't sound like a
good idea to me. If it does turn out to be a good idea, we could still
redefine it that way --- we'd just have assigned 18 not 0 as the
enumeration value for basic TOAST pointers.

The key point in my mind is that there is lots of performance-critical
code (the inner loops of heap_deform_tuple and friends) that needs to
determine the physical size of a datum quickly. Interpreting the
content of a toasted datum is a completely separate and much less
performance-critical task. If it turns out that the size is not
sufficient to tell the difference between two types of 1B_E values,
we can go over to examining the contents instead. I note that basic
TOAST pointers start with va_rawsize which can't exceed 1G, so there
are two free bits that could be exploited in exactly the same way as
TOAST and now varvarlena have done with 4-byte datum length words.

> And it forecloses any possibility of having a type whose size is at all
> variable.

Au contraire, I think it makes it easier, at least for sizes up to 255
bytes --- you need not introduce any more complexity into VARSIZE_ANY
to have that.

> On the other hand I suppose you're concerned about the time to do a few
> comparisons before knowing which length to skip over? I'm not entirely sure
> cycle-counting at that level leads to the correct conclusions.

I am. I have spent many many hours examining PG profiling results, and
stuff in and around the tuple-decoding loops is almost always
interesting from a performance standpoint. Cycles spent in interpreting
a toast pointer never are (not least because you probably have to go off
and do I/O after you interpret the pointer). I'm willing to push almost
any amount of work onto toast_fetch_datum if it'll save cycles in
VARSIZE_ANY. But in this case you haven't even demonstrated a reason to
think that any complexity will be added there. The likely uses for
this, in my mind, are toast pointers with wider valueid fields and toast
pointers with indicators of different compression methods, and those
seem like they'd naturally be different sizes anyway.

			regards, tom lane
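
For readers following the thread, here is a minimal C sketch of the idea argued above: if the second byte of a 1B_E datum holds its physical length, the size lookup used when walking a tuple never has to interpret what the datum contains. All names with the sk_ / SK_ prefix are hypothetical, and the bit layout is an assumption chosen for illustration (low bit of the first byte marks a one-byte header, a first byte of exactly 0x01 marks the 1B_E case, lengths include the header bytes, and the 4-byte length word is read byte by byte with alignment ignored). It is not the actual PostgreSQL macro set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header tests (assumed layout, not PostgreSQL's macros). */
#define SK_IS_1B(p)   (((p)[0] & 0x01) == 0x01)  /* low bit set: 1-byte header */
#define SK_IS_1B_E(p) ((p)[0] == 0x01)           /* 1-byte header, "external" form */

/* Short datum: total size lives in the upper 7 bits of the header byte. */
static size_t sk_size_1b(const uint8_t *p)
{
    return (size_t) ((p[0] >> 1) & 0x7F);
}

/* 1B_E datum: the second byte is the total size, so no knowledge of the
 * payload (plain TOAST pointer or some future variant) is needed. */
static size_t sk_size_1b_e(const uint8_t *p)
{
    return (size_t) p[1];
}

/* Long datum: 4-byte length word, assembled least significant byte first;
 * the two low bits are reserved as flags in this sketch. */
static size_t sk_size_4b(const uint8_t *p)
{
    uint32_t w = (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
                 ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
    return (size_t) ((w >> 2) & 0x3FFFFFFF);
}

/* Analogue of a VARSIZE_ANY-style lookup: at most two cheap tests. */
static size_t sk_size_any(const uint8_t *p)
{
    if (SK_IS_1B_E(p))
        return sk_size_1b_e(p);
    if (SK_IS_1B(p))
        return sk_size_1b(p);
    return sk_size_4b(p);
}

/* Count the datums packed back to back in buf, the way a tuple-deforming
 * loop would skip over them. Stops at a zero or overlong size. */
static size_t sk_count_datums(const uint8_t *buf, size_t len)
{
    size_t off = 0, n = 0;

    assert(buf != NULL);
    while (off < len)
    {
        size_t sz = sk_size_any(buf + off);

        if (sz == 0 || sz > len - off)
            break;              /* malformed or truncated input */
        off += sz;
        n++;
    }
    return n;
}
```

In this sketch a new kind of 1B_E datum with a different size (up to 255 bytes) needs no change to sk_size_any; only the code that later interprets the payload would have to tell the variants apart, which is the less performance-critical path described in the message.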