On 10/20/20 11:37 AM, Mats Julian Olsen wrote:
Dear Postgres community,
I'm looking for some help to manage queries against two large tables.
Context:
We run a relatively large PostgreSQL instance (5TB, 32 vCPU, 120GB RAM) with a hybrid transactional/analytical workload. Data is written in batches every 15 seconds or so, and all queryable tables are append-only (we never update or delete). Our users can run analytical queries on top of these tables.
We recently came across a series of troublesome queries, one of which I'll dive into here.
The tables in question are:
- `ethereum.transactions`: 833M rows, partitioned, 171M rows after WHERE
- `uniswap_v2."Pair_evt_Swap"`: 12M rows, not partitioned, 12M rows after WHERE
The query plan I submitted was for a query against the table `uniswap_v2."Pair_evt_Mint"`, which has 560k rows both before and after WHERE, and is also not partitioned. Apologies for the inconsistency, but as I mentioned, the same performance problem holds when using `uniswap_v2."Pair_evt_Swap"` (and is even worse there due to its size).
The crux of our issue is that the query planner chooses a nested loop join for this query, which makes this query (and others like it) take a very long time to complete. In contrast, by toggling `enable_nestloop` and `enable_seqscan` off, we can bring the total runtime down from 16 minutes to 2 minutes.
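For reference, this is roughly how we reproduce the speedup in a session (a sketch; the actual query is elided here, and these settings only discourage the planner rather than forbid those node types outright):

```sql
-- Session-local planner toggles; they do not affect other connections.
SET enable_nestloop = off;
SET enable_seqscan = off;

-- Re-run the problem query under the changed settings to compare plans:
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;  -- the join against uniswap_v2."Pair_evt_Swap" / ethereum.transactions

-- Restore defaults afterwards:
RESET enable_nestloop;
RESET enable_seqscan;
```

We'd obviously prefer not to ship these toggles to production, since they apply to every query in the session rather than just the misestimated one.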
How can I get Postgres not to loop over 12M rows?
Let me know if there is anything I left out here that would be useful for further debugging.