BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first

Поиск
Список
Период
Сортировка
От PG Bug reporting form
Тема BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first
Дата
Msg-id 16966-f3ebf098ec2889b7@postgresql.org
обсуждение исходный текст
Ответы Re: BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
The following bug has been logged on the website:

Bug reference:      16966
Logged by:          Zeb Burke-Conte
Email address:      zebburkeconte@gmail.com
PostgreSQL version: 13.2
Operating system:   Ubuntu 20.04 LTS
Description:

I'm seeing a performance issue when joining across two tables on columns
that require a cast from varchar to bpchar. No matter how selective the
condition is on the bpchar table, the outer scan will be on the table with
the varchar column. It's possible that the issue case is more specific than
that but you can see for yourselves with the example below (which is pretty
minimal). It ends with two queries that should be planned nearly
identically, but aren't; as a result, one is 100x slower.

Note: This is not an issue of inaccurate statistics/selectivity estimates.
The estimates are spot on, but the more selective condition is not being
used as the outer scan.

Example SQL:

drop table if exists public.a;
drop table if exists public.b;

create table public.b (
    id bpchar(16) not null,
    constraint b_pk primary key (id)
);

create table public.a (
    id varchar not null,
    constraint a_pk primary key (id)
);

insert into a
    (id)
    select (ones.n + 10*tens.n + 100*hundreds.n + 1000*thousands.n +
10000*ten_thousands.n + 100000*hundred_thousands.n)::varchar
    from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) ones(n),
         (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) tens(n),
         (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) hundreds(n),
         (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) thousands(n),
         (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) ten_thousands(n),
         (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) hundred_thousands(n)
    order by 1;

insert into b
    (id)
    select (ones.n + 10*tens.n + 100*hundreds.n + 1000*thousands.n +
10000*ten_thousands.n + 100000*hundred_thousands.n)::varchar
    from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) ones(n),
         (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) tens(n),
         (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) hundreds(n),
         (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) thousands(n),
         (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) ten_thousands(n),
         (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) hundred_thousands(n)
    order by 1;

analyze a;
analyze b;

set enable_hashjoin = off;
set enable_mergejoin = off;
set enable_seqscan = off;

explain analyze select * from a join b on a.id = b.id where a.id in
('109244', '721345', '8911');
explain analyze select * from a join b on a.id = b.id where b.id in
('109244', '721345', '8911');


В списке pgsql-bugs по дате отправления:

Предыдущее
От: Maxim Boguk
Дата:
Сообщение: Re: BUG #16964: Quadratic performance degradation of INSERT and ADD CONSTRAINT for intarray/GIST EXCLUDE CONSTRAINT
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: BUG #16965: Select query fails with ERROR: XX000: could not find pathkey item to sort