Re: FETCH FIRST clause PERCENT option

Поиск

Список

Период

Сортировка

От	Surafel Temesgen
Тема	Re: FETCH FIRST clause PERCENT option
Дата	22 августа 2019 г. 10:40:30
Msg-id	CALAY4q-P3g+U_Kk=qJ5DVqdT5UX8w+4azVv0OuWwhPdw+rTgeQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: FETCH FIRST clause PERCENT option (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Ответы	Re: FETCH FIRST clause PERCENT option (Ryan Lambert <ryan@rustprooflabs.com>) Re: FETCH FIRST clause PERCENT option (Tom Lane <tgl@sss.pgh.pa.us>)
Список	pgsql-hackers

Дерево обсуждения

On Tue, Aug 20, 2019 at 9:10 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

Hi,

At Wed, 7 Aug 2019 10:20:09 +0300, Surafel Temesgen <surafel3000@gmail.com> wrote in <CALAY4q98xbVHtZ4yj9DcCMG2-s1_JuRr7fYaNfW+bKmr22OiVQ@mail.gmail.com>
> Hi
> On Wed, Aug 7, 2019 at 6:11 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> wrote:
>
> >
> > I have some comments.
> >
> > This patch uses distinct parameters for exact number and
> > percentage. On the othe hand planner has a notion of
> > tuple_fraction covering the both. The same handling is also used
> > for tuple number estimation. Using the same scheme will make
> > things far simpler. See the comment of grouping_planner().
> >
> >
> Its because of data type difference .In planner the data type is the same

I meant that, with the usage of tuple_fraction, one variable can
represent both the absolute limit and relative limit. This
simplifies parse structs.

In grouping_planner the patch set tuple bound to -1 in create_ordered_paths because it iterate until the end in percentage. Did that answer your question?

> > In executor part, addition to LimiteState.position, this patch
> > uses LimiteState.backwardPosition to count how many tuples we're
> > back from the end of the current tuplestore. But things will get
> > simpler by just asking the tuplestore for the number of holding
> > tuples.
> >
> >
> backwardPosition hold how many tuple returned in backward scan

I meant that we don't need to hold the number in estate.

I did it this way because I didn't find an API in tuplestore to do this

> > + slot = node->subSlot;
> >
> > Why is this needed? The variable is properly set before use and
> > the assignment is bogus in the first place.
> >
> >
> its because Tuplestore needs initialized slot.

I meant that the initilized slot is overwritten before first use.

> > The new code block in LIMIT_RESCAN in ExecLimit is useless since
> > it is exatctly the same with existing code block. Why didn't you
> > use the existing if block?
> >
>
> But they test different scenario

I meant that the two different scenario can share the code block.

Sorry for not clarifying will .One have to be check before offsetting and the other is after offsetting

> > > if (node->limitOption == PERCENTAGE)
> > + {
> > + node->perExactCount = ceil(node->percent *
> > node->position / 100.0);
> > +
> > +
> >
> > node->position is the number of tuples returned to upper node so
> > far (not the number of tuples this node has read from the lower
> > node so far). I don't understand what the expression means.
> >
>
> node->position hold the number of tuples this node has read from the lower
> node so far. see LIMIT_RESCAN state

Reallly? node->position is incremented when
tuplestore_gettupleslot_heaptuple() succeeded and reutnrs the
tuple to the caller immediately...

> > + if (node->perExactCount == node->perExactCount + 1)
> > + node->perExactCount++;
> >
> > What? The condition never be true. As the result, the following
> > if block below won't run.
> >
>
> it became true according to the number of tuple returned in from the lower
> node so far
> and percentage specification.

Mmm. How do you think X can be equal to (X + 1)?

Oops my bad .The attached patch remove the need of doing that

> > > /*
> > + * Return the tuple up to the number of exact count in
> > OFFSET
> > + * clause without percentage value consideration.
> > + */
> > + if (node->perExactCount > 0)
> > + {
> > +
> >
> >
> >
> >
> > + /*
> > + * We may needed this tuple in backward scan so put it into
> > + * tuplestore.
> > + */
> > + if (node->limitOption == PERCENTAGE)
> > + {
> > + tuplestore_puttupleslot(node->tupleStore, slot);
> > + tuplestore_advance(node->tupleStore, true);
> > + }
> >
> > "needed"->"need" ? The comment says that this is needed for
> > backward scan, but it is actually required even in forward
> > scan. More to the point, tuplestore_advance lacks comment.
>
>
> ok
>
>
> >
> > Anyway, the code in LIMIT_RESCAN is broken in some ways. For
> > example, if it is used as the inner scan of a NestLoop, the
> > tuplestore accumulates tuples by every scan. You will see that
> > the tuplestore stores up to 1000 tuples (10 times of the inner)
> > by the following query.
> >
>
> It this because in percentage we scan the whole table

It's useless and rather harmful that the tuplestore holds
indefinite number of duplicate set of the whole tuples from the
lower node. We must reuse tuples already stored in the tuplestore
or clear it before the next round.

i agree with this optimization but it don't have to be in first version

regards

Surafel

Вложения

percent-incremental-v8.patch

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Konstantin Knizhnik
Дата: 22 августа 2019 г., 10:06:56
Сообщение: Re: Optimization of vacuum for logical replication

Следующее

От: Michael Paquier
Дата: 22 августа 2019 г., 10:45:58
Сообщение: Refactoring of connection with password prompt loop for frontends

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: FETCH FIRST clause PERCENT option

Вложения

Предыдущее

Следующее