Re: FETCH FIRST clause PERCENT option

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: FETCH FIRST clause PERCENT option
Дата
Msg-id 20190301.103145.233922788.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: FETCH FIRST clause PERCENT option  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: FETCH FIRST clause PERCENT option
Re: FETCH FIRST clause PERCENT option
Список pgsql-hackers
Hello.

At Thu, 28 Feb 2019 21:16:25 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<fbd08ad3-5dd8-3169-6cba-38d610d7be7f@2ndquadrant.com>
> > One biggest issue seems to be we don't know the total number of

# One *of* the biggest *issues*?

> > outer tuples before actually reading a null tuple. I doubt of
> > general shortcut for that. It also seems preventing limit node
> > from just using materialized outer.
> > 
> 
> Sure, if you actually want all tuples, you'll have to execute the outer
> plan till completion. But that's not what I'm talking about - what if we
> only ever need to read one row from the limit?

We have no choice than once reading all tuples just to find we
are to return just one row, since estimator is not guaranteed to
be exact as required for this purpose.

> To give you a (admittedly, somewhat contrived and artificial example):
> 
>     SELECT * FROM t1 WHERE id IN (
>       SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
>     );
> 
> Maybe this example is bogus and/or does not really matter in practice. I
> don't know, but I've been unable to convince myself that's the case.

I see such kind of idiom common. Even in the quite simple example
above, *we* cannot tell how many tuples the inner should return
unless we actually fetch all tuples in t2. This is the same
problem with count(*).

The query is equivalent to the folloing one.

 SELECT * FROM t1 WHERE id IN (
   SELECT id FROM t2 ORDER BY x
     FETCH FIRST (SELECT ceil(count(*) * 0.1) FROM t2) ROWS ONLY
 );

This scans t2 twice, but this patch does only one full scan
moving another partial scan to tuplestore. We would win if the
outer is complex enough.

Anyway, even excluding the count(*) issue, it seems that we are
not alone in that point. (That is, at least Oracle shows
different E-Rows and A-Rows for PERCENT).

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Chapman Flack
Дата:
Сообщение: Re: XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated withwrong context
Следующее
От: "Tsunakawa, Takayuki"
Дата:
Сообщение: RE: Protect syscache from bloating with negative cache entries