Re: Abbreviated keys for Numeric

Поиск
Список
Период
Сортировка
От Andrew Gierth
Тема Re: Abbreviated keys for Numeric
Дата
Msg-id 87a8z7gokj.fsf@news-spur.riddles.org.uk
обсуждение исходный текст
Ответ на Re: Abbreviated keys for Numeric  (Peter Geoghegan <pg@heroku.com>)
Ответы Re: Abbreviated keys for Numeric  (Peter Geoghegan <pg@heroku.com>)
Список pgsql-hackers
>>>>> "Peter" == Peter Geoghegan <pg@heroku.com> writes:
Peter> Attached is a revision of this patch, that I'm calling v2. WhatPeter> do you think, Andrew?

"No." is I think the best summary of my response.

I strongly suggest whichever committer ends up looking at this consider
my original version unchanged in preference to this. The cost/benefit
decision of supporting abbreviation on 32bit platforms is a point that
can be debated (I strongly support retaining the 32bit code, obviously),
but the substantive changes here are actively wrong.
Peter> Other than that, I've tried to keep things closer to the textPeter> opclass.  For example, the cost model now
hasa few debuggingPeter> traces (disabled by default). I have altered the ad-hoc costPeter> model so that it no longer
concernsitself with NULL inputs,Peter> which seemed questionable (not least since the abbreviationPeter> conversion
functionisn't actually called for NULL inputs. AnyPeter> attempt to track the full count within numeric code
thereforePeter>cannot work.).
 

This is simply wrong. The reason why the cost model (in my version)
tracks non-null values by having its own counter is precisely BECAUSE
the passed-in memtupcount includes nulls, and therefore the code will
UNDERESTIMATE the fraction of successfully abbreviated values if the
comparison is based on memtupcount.  In your version, if there are 9999
null values at the start of the input, then on the first non-null value
after that, memtupcount will be 10000 and there will be only 1 distinct
abbreviated value, causing abbreviation to spuriously abort.

The test to clamp the estimate to 1.0 is just nonsensical and serves no
purpose whatever, and the comment for it is wrong.

You should fix the text abbreviation code, not propagate your mistakes
further.

(BTW, there's an outright typo in your code, ';;' for ';' at the end of
a line. Sloppy.)
Peter> I also now allocate a buffer of scratch memory separately fromPeter> the main sortsupport object - doing one
allocationfor allPeter> sortsupport state, bunched together as a buffer seemed like aPeter> questionable
micro-optimization.

It's yet another cache line... I admit I did not benchmark that choice,
but then neither did you.
Peter> It seemed unwise to silently disable abbreviation when someonePeter> happened to build with DEC_DIGITS != 4. A
staticassertion nowPeter> gives these unusual cases the opportunity to make an informedPeter> decision about either
disablingabbreviation or not changingPeter> DEC_DIGITS in light of the performance penalty, in aPeter> self-documenting
way.

A) Nobody in their right minds is ever going to do that anyway

B) Anybody who does that is either not concerned about performance or is
concerned only about performance of the low-level numeric ops, and
abbreviation is the last thing they're going to be worried about in
either case.

-- 
Andrew (irc:RhodiumToad)



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: proposal: doc: simplify examples of dynamic SQL
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: pg_recvlogical description