Re: extended stats on partitioned tables

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: extended stats on partitioned tables
Дата
Msg-id 56dceeb0-7d30-fc4d-6a37-8a0f1a76228c@enterprisedb.com
обсуждение исходный текст
Ответ на Re: extended stats on partitioned tables  (Justin Pryzby <pryzby@telsasoft.com>)
Список pgsql-hackers
On 9/25/21 11:46 PM, Justin Pryzby wrote:
> On Sat, Sep 25, 2021 at 11:01:21PM +0200, Tomas Vondra wrote:
>>> Do you think it's possible to backpatch a fix to handle partitioned tables
>>> specifically ?
>>>
>>> The "tuple already updated" error which I reported and which was fixed by
>>> 859b3003 involved inheritence children.  Since partitioned tables have no data
>>> themselves, the !inh check could be relaxed.  It's not totally clear to me if
>>> the correct statistics would be used in that case.  I suppose the wrong
>>> (inherited) stats would be wrongly applied affect queries FROM ONLY a
>>> partitioned table, which seems pointless to write and also hard for the
>>> estimates to be far off :)
>>
>> Hmmm, maybe. To prevent the "tuple concurrently updated" we must ensure we
>> never build stats with and without inheritance at the same time (for the
>> same rel). The 859b3003de ensures that by only building extended stats in
>> the (!inh) case, but we might tweak that based on relkind. See the attached
>> patch. But I wonder if there are cases that might be hurt by this - that'd
>> be a regression too, of course.
> 
> I think we should leave the inheritance case alone, since it hasn't changed in
> 2 years, and building stats on the table ONLY is a legitimate interpretation,
> and it's as good as we can do without the catalog change.
> 
> But the partitioned case used to work, and there's no utility in selecting FROM
> ONLY a partitioned table, so we might as well build the stats including its
> partitions.
> 
> I don't think anything would get worse for the partitioned case.
> Obviously building inherited ext stats could change plans - that's the point.
> It's weird that the stats objects which existed for 18 months before being
> "built" after the patch was applied, but no so weird that the release notes
> wouldn't be ample documentation.
> 

Agreed.

> If building statistics caused the plan to change undesirably, the solution
> would be to drop the stats object, of course.
> 
> +               build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
                                                                                                       
 
> 
> It's weird to build inherited extended stats for partitioned tables but not for
> inheritence parents.  We could be clever and say "build inherited ext stats for
> inheritence parents only if we didn't insert any stats for the table itself
> (because it's empty)".  But I think that's fragile: a single tuple in the
> parent table could cause stats to be built there instead of on its heirarchy,
> and the extended stats would be used for *both* FROM and FROM ONLY, which is an
> awful combination.
> 

I don't think there's a good way to check if there are any rows in the
parent relation. And even then, a single row might cause huge changes to
query plans (essentially switching to very different stats).

> Since do_analyze_rel is only called once for partitioned tables, I think you
> could write that as:
> 
> /* Do not build inherited stats (since the catalog cannot support it) except
>  * for partitioned tables, for which numrows==0 and have no non-inherited stats */
> build_ext_stats = !inh || onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE;
> 

Good point.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Daniel Gustafsson
Дата:
Сообщение: Re: can we add some file(msvc) to gitignore
Следующее
От: Artur Zakirov
Дата:
Сообщение: Re: Empty string in lexeme for tsvector