Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches

From: Dimitrios Apostolou
Subject: Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches
Msg-id: 8504ea0b-584f-773b-44c1-233f4ac8e630@gmx.net
In reply to: Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-general
Thanks for the insight on the internals. Regarding your questions:

On Tue, 31 Jan 2023, Tom Lane wrote:

> Do you get the same 10 rows when you repeat the command?

Yes. Just tested with both cold and hot caches. The first 10 rows are
exactly the same, whether they return slowly or immediately.

> If turning synchronize_seqscans off changes the behavior, that'd
> be a good clue that this is the right theory.

Turning it off makes the query slow no matter how many times I re-run it.
The system does lots of read I/O with both hot and cold caches. Here is
the EXPLAIN output from the hot-cache run (the one that previously showed
only 14 hits and no reads):

  Limit  (cost=0.00..0.29 rows=10 width=42) (actual time=620510.813..620510.821 rows=10 loops=1)
    Output: run_n, test_name_n, workitem_n, started_on, duration_ms, test_result_n, test_executable_n, test_function_n, test_datatag_n
    Buffers: shared hit=64 read=2334462
    I/O Timings: shared/local read=567846.559
    ->  Seq Scan on public.test_runs_raw  (cost=0.00..9250235.80 rows=317603680 width=42) (actual time=620510.800..620510.804 rows=10 loops=1)
          Output: run_n, test_name_n, workitem_n, started_on, duration_ms, test_result_n, test_executable_n, test_function_n, test_datatag_n
          Buffers: shared hit=64 read=2334462
          I/O Timings: shared/local read=567846.559
  Settings: effective_cache_size = '2GB', max_parallel_workers_per_gather = '0', work_mem = '64MB'
  Planning Time: 0.099 ms
  Execution Time: 620510.855 ms
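For scale, those read=2334462 buffers amount to roughly 18 GB pulled from
disk for 10 rows (assuming the default 8 kB block size):

```sql
-- Buffer count from the plan above, times the default 8 kB block size.
-- pg_size_pretty() formats the byte count; comes out at about 18 GB.
SELECT pg_size_pretty(2334462 * 8192::bigint);
```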


After reading the docs, I'm surprised this setting shows up in my case,
given that there is no parallelism in my setup.
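For the record, the plan above was captured with roughly the following
session (session-level SET; the EXPLAIN options match what the output
shows, SETTINGS included):

```sql
-- Disable synchronized sequential scans for this session only,
-- then capture the plan with buffer and timing detail.
SET synchronize_seqscans = off;
EXPLAIN (ANALYZE, VERBOSE, BUFFERS, SETTINGS)
SELECT * FROM test_runs_raw LIMIT 10;
```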


> As for a real fix,
> it might be time for a VACUUM FULL or CLUSTER on that table.

Regarding CLUSTER, would it help with a seqscan on a bloated table?
Furthermore, given that the table is expected to grow by a few million
rows every day, do you suggest running CLUSTER every night? Will Postgres
remember that the rows up to N are clustered, even after more rows are
appended?
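To make the question concrete, what I have in mind for a nightly job
would be something like the following (the index name is a placeholder,
not one from my actual schema):

```sql
-- Hypothetical nightly maintenance sketch; CLUSTER takes an
-- ACCESS EXCLUSIVE lock and rewrites the table, so it would have
-- to run in a quiet window. ANALYZE refreshes planner statistics
-- afterwards.
CLUSTER test_runs_raw USING test_runs_raw_pkey;
ANALYZE test_runs_raw;
```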


Dimitris


