Re: Zero throughput on a query on a very large table.

From Tom Lane
Subject Re: Zero throughput on a query on a very large table.
Date
Msg-id 29285.1548397485@sss.pgh.pa.us
In reply to Zero throughput on a query on a very large table.  ("ldh@laurent-hasson.com" <ldh@laurent-hasson.com>)
Responses Re: Zero throughput on a query on a very large table.  (David Rowley <david.rowley@2ndquadrant.com>)
Re: Zero throughput on a query on a very large table.  ("ldh@laurent-hasson.com" <ldh@laurent-hasson.com>)
List pgsql-performance
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> Query:
>      select * from tmp_outpatient_rev order by desy_sort_key, claim_no

> Plan:
> [ seqscan-and-sort ... parallelized, but still seqscan-and-sort ]

>     - I have tried to hack the planner to force an index scan (which would
> avoid the sort/gather steps and should start streaming data right away), in
> particular, enable_seqscan=false or seq_page_cost=2. This had ZERO impact on
> the plan to my surprise.

If you can't get an indexscan plan despite setting enable_seqscan=false,
that typically means that the planner thinks the index's sort order
does not match what the query is asking for.  I wonder whether you
created the index with nondefault collation, or asc/desc ordering,
or something like that.  There's not enough detail here to diagnose
that.
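A quick way to check that, assuming nothing about the index beyond the table
and columns shown in your query, is to compare the index definition against
the ORDER BY:

    -- Column order, ASC/DESC, NULLS FIRST/LAST, and any COLLATE clause
    -- in the index definition must match what the ORDER BY asks for.
    SELECT indexname, indexdef
      FROM pg_indexes
     WHERE tablename = 'tmp_outpatient_rev';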

It should also be noted that what enable_seqscan=false actually does
is to add a cost penalty of 1e10 to seqscan plans.  It's possible
that your table is so large and badly ordered that the estimated
cost differential between seqscan and indexscan is more than 1e10,
so that the planner goes for the seqscan anyway.  You could probably
overcome that by aggressively decreasing random_page_cost (and by
"aggressive" I don't mean 2, I mean 0.2, or maybe 0.00002, whatever
it takes).  However, if that's what's happening, I'm worried that
getting what you asked for may not really be the outcome you wanted.
Just because you start to see some data streaming to your app right
away doesn't mean the process is going to complete in less time than
it would if you waited for the sort to happen.
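If you want to see whether that's what's happening, a throwaway session
experiment along these lines (the numbers are purely illustrative, not
recommendations) would show whether the plan flips:

    SET enable_seqscan = off;
    SET random_page_cost = 0.2;   -- push lower if the seqscan still wins
    EXPLAIN
    SELECT * FROM tmp_outpatient_rev ORDER BY desy_sort_key, claim_no;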

You didn't mention what you have work_mem set to, but a small value
of that would handicap the sort-based plan a lot.  I wonder whether
jacking up work_mem to help the sorts run faster won't end up being
the better idea in the end.
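If you want to try that, a session-local test along these lines (the 1GB
figure is just a placeholder, size it to the RAM you can spare) would let you
compare the two approaches end to end, since EXPLAIN ANALYZE runs the query
to completion without shipping rows to the client:

    SET work_mem = '1GB';   -- placeholder value; applies to this session only
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM tmp_outpatient_rev ORDER BY desy_sort_key, claim_no;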

            regards, tom lane

PS: On the third hand, you mention having created new indexes on this
table with apparently not a lot of pain, which is a tad surprising
if you don't have the patience to wait for a sort to finish.  How
long did those index builds take?

