Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments

Поиск
Список
Период
Сортировка
От Thomas Munro
Тема Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments
Дата
Msg-id CAEepm=3mUYhQq4yoiYa=WjFvvUcCVA3H0ZNjR_QVMTkn0Camwg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments  (Noah Misch <noah@leadboat.com>)
Список pgsql-hackers
On Tue, Jun 13, 2017 at 6:40 PM, Noah Misch <noah@leadboat.com> wrote:
> On Mon, Jun 12, 2017 at 04:04:23PM +1200, Thomas Munro wrote:
>> On Sun, Jun 11, 2017 at 11:11 PM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>> > On Sun, Jun 11, 2017 at 6:25 PM, Noah Misch <noah@leadboat.com> wrote:
>> >> This fourth point is not necessarily a defect: I wonder if RangeTblEntry is
>> >> the right place for enrtuples.  It's a concept regularly seen in planner data
>> >> structures but not otherwise seen at parse tree level.
>> >
>> > I agree that this is strange.  Perhaps
>> > set_namedtuplestore_size_estimates should instead look up the
>> > EphemeralNamedRelation by rte->enrname to find its way to
>> > enr->md.enrtuples, but I'm not sure off the top of my head how it
>> > should get its hands on the QueryEnvironment required to do that.  I
>> > will look into this on Monday, but other ideas/clues welcome...
>>
>> Here's some background:  If you look at the interface changes
>> introduced by 18ce3a4 you will see that it is now possible for a
>> QueryEnvironment object to be injected into the parser and executor.
>> Currently the only use for it is to inject named tuplestores into the
>> system via SPI_register_relation or SPI_register_trigger_data.  That's
>> to support SQL standard transition tables, but anyone can now use SPI
>> to expose data to SQL via an ephemeral named relation in this way.
>> (In future we could imagine other kinds of objects like table
>> variables, anonymous functions or streams).
>>
>> The QueryEnvironment is used by the parser to resolve names and build
>> RangeTblEntry objects.  The planner doesn't currently need it, because
>> all the information it needs is in the RangeTblEntry, including the
>> offending row estimate.  Then the executor needs it to get its hands
>> on the tuplestores.  So the question is: how can we get it into
>> costsize.c, so that it can look up the EphermalNamedRelationMetaData
>> object by name, instead of trafficking statistics through parser data
>> structures?
>>
>> Here are a couple of ways forward that I can see:
>>
>> 1.  Figure out how to get the QueryEnvironment through more of these
>> stack frames (possibly inside other objects), so that
>> set_namedtuplestore_size_estimates can look up enrtuples by enrname:
>>
>>   set_namedtuplestore_size_estimates <-- would need QueryEnvironment
>>   set_namedtuplestore_pathlist
>>   set_rel_size
>>   set_base_rel_sizes
>>   make_one_rel
>>   query_planner
>>   grouping_planner
>>   subquery_planner
>>   standard_planner
>>   planner
>>   pg_plan_query
>>   pg_plan_queries <-- doesn't receive QueryEnvironment
>>   BuildCachedPlan <-- receives QueryEnvironment
>>   GetCachedPlan
>>   _SPI_execute_plan
>>   SPI_execute_plan_with_paramlist
>>
>> 2.  Rip the row estimation out for now, use a bogus hard coded
>> estimate like we do in some other cases, and revisit later.  See
>> attached (including changes from my previous message).
>> Unsurprisingly, a query plan changes.
>>
>> Thoughts?
>
> I'm not sufficiently familiar with the relevant code to judge this one.  Let's
> see if planner experts voice an opinion.  Absent more opinions, the current
> design stands.

Here's another thought I had about this during review[1] and didn't
follow up:  Query plans are cached by plpgsql and reused.  If you're
unlucky, on the first firing of a given trigger a transition
tuplestore could have 0 or 1 rows and on the second firing it could
have a zillion rows, I guess potentially resulting in pathologically
bad performance.  I do think it's a good idea for SPI client code to
be able to provide an estimate via SPI_register_relation (if the
coding passes muster or can be suitably refactored), but maybe "oh,
about 1,000" is actually better for transition tables anyway in the
absence of some kind of adaptive replanning or user declarations.

[1] https://www.postgresql.org/message-id/CAEepm%3D02A5LTfGtHo8xWpEmW4AYY%2BES-uPr02bWb0OzGF4Ej-Q%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Craig Ringer
Дата:
Сообщение: Re: [HACKERS] pg_receivewal and messages printed in non-verbose mode
Следующее
От: Petr Jelinek
Дата:
Сообщение: Re: [HACKERS] Refreshing subscription relation state inside atransaction block