Re: Eliminating CREATE INDEX comparator TID tie-breaker overhead

From: Peter Geoghegan
Subject: Re: Eliminating CREATE INDEX comparator TID tie-breaker overhead
Msg-id: CAM3SWZRCUKw2mkUZDh8xajBSBN_mBS+6v32=vqLjw-oApq7Yqw@mail.gmail.com
In reply to: Re: Eliminating CREATE INDEX comparator TID tie-breaker overhead  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Eliminating CREATE INDEX comparator TID tie-breaker overhead  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Thu, Jul 23, 2015 at 8:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> My priorities are different from yours. Your conclusion is basically
> that it's OK to burden everyone who comes along and does future
> development that may use the sorting code differently from the way
> it's used now with dealing with this issue somehow, or deciding not to
> deal with it.  I have a really tough time agreeing with that;
> tuplesort.c is, and should be, an abstraction layer, and people using
> it from the outside should not need to worry about what happens on the
> inside.

I don't know where this idea came from, because I'm already supporting
the requirement for the external sorting case, where doing this is too
much of a burden. That is entirely unchanged, and so such differences
are clearly already respected. Any new patch likely to care about this
(e.g. parallel internal sorts) will probably naturally not be affected
by it anyway, just because they'll add a new tuplesort state, or
because multiple "memtuples" arrays will be filled in parallel, but
still in sequential order, and so everything works out just the same.
Aside from that, the author might have to spend 5 minutes thinking
about it. I don't see the problem.

Adding this single new requirement for exactly 2 extant callers does
not make tuplesort any less well encapsulated. Please don't write
wildly inaccurate summaries of what I've said, like "Your conclusion
is basically that it's OK to burden everyone who comes along and does
future development that may use the sorting code differently from the
way it's used now". That is patently untrue.

> Your original post lays out two rationales for the TID comparisons,
> and says that one of them is obsolete, but the other is "probably"
> still valid.  I think what you should do is go find out whether the
> second rationale is valid or not.  If it's not, we can get rid of that
> code.  If it is valid, then we can't.  I'm not going to endorse the
> notion that tuplesort.c will only DTRT if it receives tuples in TID
> order; it cannot be the responsibility of the caller of the sort code
> to ensure that the tuples are sorted.  Even if it shaves a few
> percentage points off the runtime now, the complexity it imposes on
> future patch authors is, IMO, not worth it.

More than a few - sometimes more than 10%.

The second rationale was, as far as I can tell, a theoretical one that
was never experimentally validated. I'm pretty sure you could come up
with a case where not having it hurt, if you were sufficiently
creative. I'm not sure that I have the stomach for another protracted
debate about these fuzzy costs, which this patch was supposed to avoid.
However, you don't like this patch for reasons that I cannot fathom. I
think that I will have to withdraw it, and forget about cutting this
unnecessary cost from B-Tree builds.

Our priorities are different, but mine are changing; I simply don't
want to spend a lot of time arguing with you about things like this.

-- 
Peter Geoghegan


