Wrong estimations and NL Anti join poor performance

Поиск
Список
Период
Сортировка
От Andrew Zakharov
Тема Wrong estimations and NL Anti join poor performance
Дата
Msg-id 044a01d59e36$920e7020$b62b5060$@mail.ru
обсуждение исходный текст
Список pgsql-performance

Hello colleagues –

 

The problem description:

We're moving from 9.6 to 11.5. There is a SQL code that never ends in 11.5 but works fine in 9.6. The main cause is the optimizer considers of using NL Anti join instead of Merge in 9.6. And the root cause - wrong estimation while self-joining.

 

System environment:

CentOS Linux 3.10.0-1062.4.1.el7.x86_64 x86_64

MemTotal:       16266644 kB

Intel(R) Xeon(R) CPU E7-8867 v3 @ 2.50GHz

HDD - unknown

 

PostgreSQL:

PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit

shared_buffers = 1GB

huge_pages = on

temp_buffers = 1GB

max_prepared_transactions = 128

max_connections = 500

work_mem = 256MB

maintenance_work_mem = 1024MB

autovacuum_work_mem = 512MB

max_worker_processes = 100

max_parallel_workers_per_gather = 0 # changing this value to any others takes no effect for issue resolving

max_parallel_workers = 8

checkpoint_timeout = 30min

max_wal_size = 32GB

min_wal_size = 8GB

checkpoint_completion_target = 0.9

enable_nestloop = on                # off value fixes the issue but this is wrong way

random_page_cost = 4.0

effective_cache_size = 4GB

default_statistics_target = 2000

 

Main script:

 

  -- preparation

  -- this table is reverted tree with tree keys position_uuid, parent_position_uuid

  create temporary table tmp_nsi_klp on commit drop as

  select

    k1.gid,

    k1.smnn_gid,

    k1.position_uuid,

    p.parent_position_uuid,

    k1.order_number,

    k1.date_start,

    k1.date_end,

    k1.is_active,

    coalesce(p.is_fake_series, false) as is_fake_series

  from

    nsi_klp k1

      left join (select gid, unnest(parent_position_uuid) as parent_position_uuid, coalesce(array_length(parent_position_uuid, 1),0) > 1 as is_fake_series from nsi_klp where version_esklp = '2.0') p using (gid)

  where

    k1.version_esklp = '2.0'

  ;

 

  create unique index tmp_nsi_klp_ui on tmp_nsi_klp(gid, parent_position_uuid);

 

  analyze tmp_nsi_klp;

 

  -- working set (!!This SQL never ends in 11.5 now)

  create temporary table tmp_klp_replace on commit drop as

    select distinct on (klp_gid)

      *

    from (

      select

        k2.gid                  as klp_gid,

        k2.smnn_gid             as klp_smnn_gid,

        k2.position_uuid        as klp_position_uuid,

        k2.order_number         as klp_order_number,

        k2.is_active            as klp_is_active,

        k1.gid                  as klp_child_gid,

        k1.smnn_gid             as klp_child_smnn_gid,

        k1.position_uuid        as klp_child_position_uuid,

        k1.order_number         as klp_child_order_number,

        k1.is_active            as klp_child_is_active

      from

        tmp_nsi_klp k1

          join tmp_nsi_klp k2 on (k2.position_uuid = k1.parent_position_uuid)

      union all

      select

        k1.gid                  as klp_gid,

        k1.smnn_gid             as klp_smnn_gid,

        k1.position_uuid        as klp_position_uuid,

        k1.order_number         as klp_order_number,

        k1.is_active            as klp_is_active,

        null                    as klp_child_gid,

        null                    as klp_child_smnn_gid,

        null                    as klp_child_position_uuid,

        null                    as klp_child_order_number,

        null                    as klp_child_is_active

      from

        tmp_nsi_klp k1

          left join tmp_nsi_klp k2 on (k1.position_uuid = k2.parent_position_uuid)

          left join (select position_uuid from tmp_nsi_klp where not is_fake_series group by position_uuid having count(1) > 1) klp_series on (klp_series.position_uuid = k1.position_uuid)

      where

        -- not exists(select 1 from tmp_nsi_klp k2 where k1.position_uuid = k2.parent_position_uuid)

        k2.gid is null                        -- none referenced

        and klp_series.position_uuid is null  -- klp series with the same position_uuid

    ) a

    order by

      klp_gid,

      klp_order_number desc

  ;

 

Characteristics of source table - tmp_nsi_klp:

 

create table tmp_nsi_klp (

     gid uuid NULL,                  -- not null by the fact

     smnn_gid uuid NULL,             -- not null by the fact

     position_uuid uuid NULL,        -- not null by the fact

     parent_position_uuid uuid NULL,

     order_number int8 NULL,

     date_start timestamp NULL,      -- not null by the fact

     date_end timestamp NULL,

     is_active bool NULL,            -- not null by the fact

     is_fake_series bool NULL        -- not null by the fact

);

 

Rows: 237279

 

Cols stats:

https://docs.google.com/spreadsheets/d/1Ocbult13kZ64vK9nHt-_BV3EENK_ZSHFTAmRZLISUIE/edit?usp=sharing

 

Execution plans for problematic query - working set "create temporary table tmp_klp_replace on commit drop as":

 

On 11.5 (option nestloop enabled):

https://explain.depesz.com/s/pIzd

Exec time: never finished

 

On 9.6 (PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit):

https://explain.depesz.com/s/sO0G

Exec time: ~1 sec

 

On 11.5 (option nestloop disabled):

https://explain.depesz.com/s/eYzk

Exec time: ~1,5 sec

 

Construction "not exists(select 1 from tmp_nsi_klp k2 where k1.position_uuid = k2.parent_position_uuid)" works perfectly but there are lots of similar constructions in a code made for checking inclusion of data. Thus no chances to change existing code to another using not exists construction. Are there any options to bring initial statement to life and keep the server option nestloop enable?

Give me a clue, pls.

Thanks in advance.

Andrew.

 

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Out of memory error on automatic vacuum
Следующее
От: "Sterpu Victor"
Дата:
Сообщение: Postgresql planning time too high