Re: FETCH FIRST clause PERCENT option

From: Tomas Vondra
Subject: Re: FETCH FIRST clause PERCENT option
Date:
Msg-id: b6a3bdfa-d5c8-534c-c1e8-fdabee061b04@2ndquadrant.com
In response to: Re: FETCH FIRST clause PERCENT option  (Surafel Temesgen <surafel3000@gmail.com>)
Responses: Re: FETCH FIRST clause PERCENT option  (Surafel Temesgen <surafel3000@gmail.com>)
List: pgsql-hackers

On 1/4/19 7:40 AM, Surafel Temesgen wrote:
> 
> 
> On Thu, Jan 3, 2019 at 4:51 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>
>     On 1/3/19 1:00 PM, Surafel Temesgen wrote:
>     > Hi
>     >
>     > On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra
>     > <tomas.vondra@2ndquadrant.com> wrote:
>
>     >     The execution part of the patch seems to be working correctly, but I
>     >     think there's an improvement - we don't need to execute the outer plan
>     >     to completion before emitting the first row. For example, let's say the
>     >     outer plan produces 10000 rows in total and we're supposed to return the
>     >     first 1% of those rows. We can emit the first row after fetching the
>     >     first 100 rows, we don't have to wait for fetching all 10k rows.
>     >
>     >
>     > but the total row count is not known, so how can we determine when it
>     > is safe to return a row?
>     >
> 
>     But you know how many rows were fetched from the outer plan, and this
>     number only grows. So the number of rows returned by FETCH FIRST
>     ... PERCENT also only grows. For example with 10% of rows, you know that
>     once you reach 100 rows you should emit ~10 rows, with 200 rows you know
>     you should emit ~20 rows, etc. So you may track how many rows we're
>     supposed to return / returned so far, and emit them early.
>
> Yes, that is clear, but I don't find it easy to put that into a formula.
> Maybe someone good at mathematics can help.
> 

What formula? All the math remains exactly the same, you just need to
update the number of rows to return and track how many rows are already
returned.

I haven't tried doing that, but AFAICS you'd need to tweak how/when
node->count is computed - instead of computing it only once it needs to
be updated after fetching each row from the subplan.
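
A minimal sketch of that per-row recomputation, written as standalone C
rather than actual executor code. The helper name percent_count and the
round-down behaviour are assumptions made for illustration; the patch
may well round differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): how many rows may be
 * returned once `fetched` rows have been read from the subplan.
 * Rounding down is an assumption; the cast truncates, which equals
 * floor() for the non-negative values allowed here.
 */
static int64_t
percent_count(int64_t fetched, double percent)
{
    assert(fetched >= 0 && percent >= 0.0 && percent <= 100.0);
    return (int64_t) ((double) fetched * percent / 100.0);
}
```

Because the result never decreases as `fetched` grows, calling it after
every subplan row gives a target that can only move forward, e.g. 10
rows due at 100 fetched and 20 rows due at 200 fetched for 10%.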

Furthermore, you'll need to stash the subplan rows somewhere (into a
tuplestore probably), and whenever the node->count value increments,
you'll need to grab a row from the tuplestore and return that (i.e.
tweak node->position and set node->subSlot).
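
The stash-and-emit loop described above could look roughly like the
sketch below. Everything in it is hypothetical: the PercentLimit struct
and its functions are stand-ins, an int array plays the subplan, and a
plain array plays the tuplestore. The `position` field mirrors the role
of node->position, and the locally recomputed `count` mirrors
node->count. Rounding down is again an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the Limit node state; not PostgreSQL code. */
typedef struct PercentLimit
{
    const int  *input;       /* simulated subplan output */
    size_t      ninput;      /* total rows the subplan will produce */
    size_t      next_input;  /* next subplan row to fetch */
    int        *buf;         /* array standing in for a tuplestore;
                              * must have room for ninput rows */
    size_t      nbuf;        /* rows stashed so far (= rows fetched) */
    size_t      position;    /* rows already returned */
    double      percent;     /* requested percentage, 0..100 */
} PercentLimit;

static void
percent_limit_init(PercentLimit *st, const int *input, size_t ninput,
                   int *buf, double percent)
{
    assert(percent >= 0.0 && percent <= 100.0);
    st->input = input;
    st->ninput = ninput;
    st->next_input = 0;
    st->buf = buf;
    st->nbuf = 0;
    st->position = 0;
    st->percent = percent;
}

/* Return the next output row in *out, or false when no more are due. */
static bool
percent_limit_next(PercentLimit *st, int *out)
{
    for (;;)
    {
        /* Recompute the target after every fetch (rounds down). */
        size_t      count = (size_t) ((double) st->nbuf * st->percent / 100.0);

        if (st->position < count)
        {
            /* A stashed row has become due: return it right away. */
            *out = st->buf[st->position++];
            return true;
        }

        if (st->next_input >= st->ninput)
            return false;       /* subplan exhausted, nothing more is due */

        /* Otherwise fetch one more subplan row and stash it. */
        st->buf[st->nbuf++] = st->input[st->next_input++];
    }
}
```

With 100 input rows and percent = 10, the first call returns after only
10 rows have been fetched, and the tenth and last row comes out once all
100 have been read. The array buffer sidesteps the question of whether a
real tuplestore supports interleaved writes and reads.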

The one thing I'm not quite sure about is whether tuplestore allows
adding and getting rows at the same time.

Does that make sense?

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

