Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From Andrei Lepikhov
Subject Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2
Date
Msg-id c9225640-e2dc-4d9c-a598-0d1d1673b11f@gmail.com
In response to Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2
List pgsql-bugs
On 4/12/25 00:30, Alexander Korotkov wrote:
> On Thu, Apr 10, 2025 at 3:28 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> ...
>> Here, I attempt to use this routine in the hash join bucket size
>> estimation. I transformed it a little, made it more general. Not sure it
>> is the best design, but it is debatable.
> ...
> SELECT FROM sb_1 LEFT JOIN sb_2 ON (sb_2.x=sb_1.x) AND (sb_1.x=sb_2.x)
> AND (sb_1.y=sb_2.y);
> 
> When you use add_unique_group_var() which keeps varinfos unique then
> you can no longer expect that varinfos have the same order as
> origin_rinfos.
OK, here is a patch that addresses this issue. GroupVarInfo now tracks its
source RestrictInfo. I'm not sure it is the ideal approach, but this way we
no longer need to keep the restrictions and the corresponding varinfos in
sync.
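
To illustrate the idea (this is only a rough sketch, not the code from the
patch): once duplicates are dropped, position i in the varinfo list no longer
corresponds to position i in the clause list, so each entry carries a pointer
to the clause it came from. SketchRestrictInfo, SketchVarInfo and both
functions are hypothetical stand-ins invented for this example; they are not
PostgreSQL's RestrictInfo, GroupVarInfo or add_unique_group_var(), and
expressions are compared as plain strings purely for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a join clause; not PostgreSQL's RestrictInfo. */
typedef struct SketchRestrictInfo
{
    const char *clause;
} SketchRestrictInfo;

/* Hypothetical stand-in for GroupVarInfo, extended with a back-pointer. */
typedef struct SketchVarInfo
{
    const char *var;                    /* expression being grouped on */
    const SketchRestrictInfo *rinfo;    /* clause this entry came from */
} SketchVarInfo;

/*
 * Append var unless an equal expression is already present, mimicking the
 * de-duplicating behaviour described for add_unique_group_var().
 * Returns the new number of entries.
 */
static size_t
sketch_add_unique_var(SketchVarInfo *infos, size_t n, size_t cap,
                      const char *var, const SketchRestrictInfo *rinfo)
{
    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(infos[i].var, var) == 0)
            return n;           /* duplicate: keep the first entry only */
    }
    assert(n < cap);
    infos[n].var = var;
    infos[n].rinfo = rinfo;
    return n + 1;
}

/* Build entries for the three clauses of the query quoted above. */
static size_t
sketch_build_example(SketchVarInfo *infos, size_t cap)
{
    static const SketchRestrictInfo clauses[] = {
        {"sb_2.x = sb_1.x"},
        {"sb_1.x = sb_2.x"},
        {"sb_1.y = sb_2.y"},
    };
    /* sb_2 side of each clause, in clause order (chosen for illustration) */
    static const char *const vars[] = {"sb_2.x", "sb_2.x", "sb_2.y"};
    size_t      n = 0;

    for (size_t i = 0; i < 3; i++)
        n = sketch_add_unique_var(infos, n, cap, vars[i], &clauses[i]);
    return n;
}
```

With the three clauses from the query, the list ends up with two entries, and
the second one (sb_2.y) sits at index 1 while its clause is at index 2.
Looking the clause up through the rinfo pointer avoids relying on the two
lists having the same order.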

On Fri, 11 Apr 2025 at 01:31, Tomas Vondra <tomas@vondra.me> wrote:
 > I think estimate_multivariate_bucketsize() needs to be more careful
 > about building the GroupVarInfo list - in particular, it needs to do
 > dance with examine_variable + add_unique_group_var + pull_var_clause,
 > similar to estimate_num_groups() at line ~3532.
Yeah, estimate_num_groups() and bucket size estimation have a lot in
common. It would be better to invent some common GroupVarInfo
preparation/estimation code for them, but the specifics of HashJoin bucket
estimation (it needs mcv_freq and result caching) limit how much these
estimators can share.

-- 
regards, Andrei Lepikhov