Re: [HACKERS] Solution for LIMIT cost estimation

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [HACKERS] Solution for LIMIT cost estimation
Дата
Msg-id 13438.950502275@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [HACKERS] Solution for LIMIT cost estimation  (Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>)
Ответы Re: [HACKERS] Solution for LIMIT cost estimation  (Don Baccus <dhogaza@pacifier.com>)
Re: [HACKERS] Solution for LIMIT cost estimation  (Philip Warner <pjw@rhyme.com.au>)
Список pgsql-hackers
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> Tom Lane wrote:
>> I have currently got it working (I think; not too well tested yet)
>> using the proposal I offered before of "pay attention to the size
>> of LIMIT, but ignore OFFSET", so that the same query plan will be
>> derived from similar queries with different OFFSETs.  Does anyone
>> have a substantial gripe with that compromise?

> Would offset be any use if you did make use of it?

Yes, because the number of tuples that will *actually* get fetched
is offset+limit.  If you had a large offset so that the tuples
getting returned were from somewhere near the end of the query,
then choosing a fast-start algorithm would be a Bad Idea; you'd
really want a plan that optimizes on the basis of total cost
rather than startup cost.

Hmm, I'm on the verge of talking myself out of the compromise ;-).
I'm not sure how many people will really use large offsets, but
anyone who does might be a pretty unhappy camper.  If you're asking
for OFFSET 1000000 LIMIT 1, the thing might pick a nested loop
which is exceedingly fast-start ... but also exceedingly expensive
when you go ahead and fetch many tuples anyway.

Perhaps we should stick to two alternatives:

1. If LIMIT is present, optimize on an assumption that X% of the
tuples are fetched, where X does *not* depend on the specific
values given for OFFSET or LIMIT.  (But we could make X a settable
parameter...)

2. Optimize using OFFSET+LIMIT as the expected number of tuples to
fetch.  Document that varying OFFSET or LIMIT will not necessarily
generate consistent results unless you specify ORDER BY to force a
consistent tuple order.

I don't really like #1, but I can see where #2 might cause some
unhappiness as well.  Comments, opinions?
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: The Hermit Hacker
Дата:
Сообщение: pgsql-support 'distribution' ...
Следующее
От: Chris Bitmead
Дата:
Сообщение: Re: [HACKERS] Solution for LIMIT cost estimation