Performance issues with composite types (partitioned table)

Поиск
Список
Период
Сортировка
От Sebastijan Wieser
Тема Performance issues with composite types (partitioned table)
Дата
Msg-id CALiGJY0XsjSG3MRdgd4oN2=6yDhpUgtq-9=7vy+fNpf-70BMtg@mail.gmail.com
обсуждение исходный текст
Список pgsql-performance
Hello,

We are having performance issues with a table partitioned by date, using composite type columns.
I have attached the table definition, full reproducible of the issue and execution plans to this email.

Ultimately, we want to sum certain fields contained in those composite types,
using a simple status filter across one month (31 partition) and group by date.
But, we fail to get a satisfactory performance even on a single partition.

I have tried multiple indexing options, but none work for us, as I will explain.
I will refer to these indexes as: ix1, ix2, ix3, ix4, ix5, ix6 and for the second part ix7.
ix1-ix6 are defined in the attached repro.sql file and the performance of the query with each of them is shown in exec_plans file.
ix7 is defined in the repro_part_vs_parent.sql and performance of relevant queries in exec_plans_part_vs_parent file.

This is the query targeting single partition:

SELECT
SUM(COALESCE((col1).a + (col1).b + (col1).c + (col1).d, 0)) AS val1,
SUM(COALESCE((col2).y, 0)) AS val2
FROM
public."mytable:2020-12-09" --single partition of public.mytable
WHERE status IN (1,2,3,4);

We get the best performance using ix2, while I would expect to get better performance using ix3, and perhaps ix5. 

Questions:
1. Why cannot Postgres plan for index-only scan with ix3?
2. Why is the query cost so high when using ix3?
3. Is it possible to define an index such as ix3, that is, with a drastically reduced size and listing only expressions we project?
4. Are there any other indexing or query rewrite options that are worth trying here?
5. Judging by execution time, it seems that Postgres can leverage defined expressions in ix2, so why not in ix3? Why must it fetch col1 and col2 from the table when I force ix3 usage?
6. As ix3 is only 53MB in size (see repro.sql) as opposed to ix1 and ix2 which are 266MB and 280MB respectively, I would expect Postgres to use it instead?


In addition to this, please look at the attached repro_part_vs_parent.sql file and its related execution plans file.
There, I tried running a similar query on a partitioned table targeting a single partition, and afterwards on the partition itself.
The results confuse me. I would expect to get similar performance in both situations, but the query runs much slower through the parent table.
By looking at the output of the seq scan node (parent query), it seems that running the query on the parent table prepends partition name as an alias to projected columns.
Does that make Postgres unable to recognize the expression in the index, or is there something else happening here?

These are the queries:

--partitioned (parent) table, targeting single partition
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, SETTINGS)
SELECT
dt,
SUM(COALESCE((col1).a + (col1).b + (col1).c + (col1).d, 0)) AS expected,
SUM(COALESCE((col2).y, 0)) AS repayments
FROM
public.mytable
WHERE
dt = '2020-12-09'
AND status IN (1,2,3,4)
GROUP BY
dt;

--querying the partition directly instead:
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, SETTINGS)
SELECT
dt,
SUM(COALESCE((col1).a + (col1).b + (col1).c + (col1).d, 0)) AS expected,
SUM(COALESCE((col2).y, 0)) AS repayments
FROM
public."mytable:2020-12-09"
WHERE
dt = '2020-12-09'
AND status IN (1,2,3,4)
GROUP BY
dt;

Relevant setup information:
pg version/OS (1): PostgreSQL 12.5, compiled by Visual C++ build 1914, 64-bit / Windows 10
pg version/OS (2): PostgreSQL 12.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit / CentOS Linux release 7.8.2003 (Core)
total number of table partitions: 31
single partition size (with PK, no other indexes): 4GB
single partition number of rows: 2M
Postgres configuration settings can be observed in the provided execution plans


depesz links:
no index: https://explain.depesz.com/s/8H93
ix1: https://explain.depesz.com/s/kEYi
ix2: https://explain.depesz.com/s/yydX
ix3: https://explain.depesz.com/s/gAFm
ix4: https://explain.depesz.com/s/8lbh
ix5: https://explain.depesz.com/s/WIqwK
ix6: https://explain.depesz.com/s/BNUc
ix7 (parent): https://explain.depesz.com/s/DqUf
ix7 (child): https://explain.depesz.com/s/ejmP

Attached files:
1. repro.sql: contains the code which will reproduce my issue
2. exec_plans: lists execution plans for repro.sql I got on my machine with each of the mentioned indexes in place
3. repro_part_vs_parent.sql: contains queries showing the unexpected performance difference for the identical query ran on parent table vs. single partition
4. exec_plans_part_vs_parent: lists relevant execution plans for repro_part_vs_parent.sql

Thank you very much in advance.
Please let me know if something is unclear or if I can provide any other relevant info.

Best regards,
Sebastijan Wieser
Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: Christophe Pettus
Дата:
Сообщение: Re: "Required checkpoints occurs too frequently"
Следующее
От: M Tarkeshwar Rao
Дата:
Сообщение: RE: Autovacuum not functioning for large tables but it is working for few other small tables.