Increasing work_mem slows down query, why?

Поиск
Список
Период
Сортировка
От Silvio Moioli
Тема Increasing work_mem slows down query, why?
Дата
Msg-id 79dd683d-3296-1b21-ab4a-28fdc2d98807@suse.de
обсуждение исходный текст
Ответы Re: Increasing work_mem slows down query, why?  (Pavel Stehule <pavel.stehule@gmail.com>)
Список pgsql-performance
Dear list,

here is a pretty contrived case where increasing work_mem produces a worse plan, with much worse overall query time. I
wonderwhy that is the case.
 


Problem: INSERTing a thousand new rows in a table which can easily have one million rows. PK is "id", which comes from
atable, and we have two columns (called "name" and "version") which do not admit duplicates.
 

Schema here:
https://github.com/uyuni-project/uyuni/blob/Uyuni-2020.03/schema/spacewalk/common/tables/rhnPackageCapability.sql
Indices here:
https://github.com/uyuni-project/uyuni/blob/Uyuni-2020.03/schema/spacewalk/postgres/tables/rhnPackageCapability_index.sql


We want one command that returns IDs given (name, version) couples. If they are already in the table, they should be
SELECTed,if they are not, they should be INSERTed.
 

Version is NULLable and NULL should be treated as a value.

We use:

WITH wanted_capability(ordering, name, version) AS (
  VALUES (1, 'first_name', '1.0.0'), (2, 'first_name', '1.0.1'), (1, 'second_name', '1.0.0'), ...998 more...
)
missing_capability AS (
  SELECT wanted_capability.*
    FROM wanted_capability LEFT JOIN rhnPackageCapability
      ON wanted_capability.name = rhnPackageCapability.name
        AND wanted_capability.version IS NOT DISTINCT FROM rhnPackageCapability.version
    WHERE rhnPackageCapability.id IS NULL
),
inserted_capability AS (
  INSERT INTO rhnPackageCapability(id, name, version)
    SELECT nextval('rhn_pkg_capability_id_seq'), name, version FROM missing_capability ON CONFLICT DO NOTHING
    RETURNING id, name, version
)
SELECT wanted_capability.ordering, inserted_capability.id
  FROM wanted_capability JOIN inserted_capability
    ON wanted_capability.name = inserted_capability.name
      AND wanted_capability.version IS NOT DISTINCT FROM inserted_capability.version
    UNION (
      SELECT wanted_capability.ordering, rhnPackageCapability.id
        FROM wanted_capability JOIN rhnPackageCapability
          ON wanted_capability.name = rhnPackageCapability.name
            AND wanted_capability.version IS NOT DISTINCT FROM rhnPackageCapability.version
    )
  ORDER BY ordering
;


Behavior at work_mem = 5 MB is pretty good, query finishes in 200ms. Plan: https://explain.dalibo.com/plan/4u

Behavior at work_mem = 80 MB seems not equally good, query takes more than 13s. Two expensive SORTs and MERGE JOINs are
doneinstead of HASH JOINs. Plan: thttps://explain.dalibo.com/plan/ORd
 

Adding one more INDEX on rhnCapability.name fixes the issue.

My question is: why are SORTs chosen if more work_mem is available, and why can't the planner predict query will be
slowerthat way?
 

All of the above is reproducible on openSUSE Leap and PostgreSQL 10.12.

Ideas welcome, and thanks in advance!

Regards,
--
Silvio Moioli
SUSE Manager Development Team



В списке pgsql-performance по дате отправления:

Предыдущее
От: Laurenz Albe
Дата:
Сообщение: Re: Best way to delete big amount of records from big table
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: Increasing work_mem slows down query, why?