One query run twice in parallel results in huge performance decrease

Поиск
Список
Период
Сортировка
От Jan Michel
Тема One query run twice in parallel results in huge performance decrease
Дата
Msg-id 5299103F.1040307@mueschelsoft.de
обсуждение исходный текст
Ответы Re: One query run twice in parallel results in huge performance decrease  (Jeff Janes <jeff.janes@gmail.com>)
Re: One query run twice in parallel results in huge performance decrease  (Scott Marlowe <scott.marlowe@gmail.com>)
Список pgsql-performance
Dear all,
I have a quite strange problem running an extensive query on geo data
checking for crossing ways. I don't know if this is a postgres or postgis
problem, but I hope you can help. Running one thread is no problem, it
finishes within 10-15 minutes. Run two of those queries in parallel and
they will not finish within 24 hours. It is definitely not a caching or I/O
problem.

First, the environment:
Running on a large server (32 cores, 128 GB RAM, fast RAID disks)
I tested psql 8.1 / 9.1 / 9.3 and postgis 1.5 and 2.1.0 on Debian 6 and
OpenSuse 12.3. All behave similar. The pgsql server settings were
optimized using pgtune, wal logging and autovacuum is off.

I'm working on a set of databases, each 5-10 GB big filled with OSM
geo data. I run many different queries, and I know the server can handle
up to 8 parallel tasks without a decrease in performance compared to a
single thread. Most data is kept in the cache and almost no read access
to the disk needs to be done.
Everything works well, despite one query, that runs on a table with ~ 1M
entries. It searches for ways crossing each other:
http://etherpad.netluchs.de/pgquery
(The definition of the source table is included as well)

Here is the explain analyze of the query:
http://explain.depesz.com/s/fAcV
As you can see, the row estimate is far off, but the runtime of 11 minutes
is acceptable, I think.

When I run a second instance of this query in a unrelated database on the
same server, they take 100% CPU, no iowait and they do not finish even
after more than a day.
An explain done directly before executing the query shows a huge cost
estimate and varying different plans:
http://explain.depesz.com/s/XDR
http://explain.depesz.com/s/SeG

How can two queries have such a strong influence on each other? Especially
when the host server could handle even ten queries without problems?
And most important: What can I do?

Thank you all in advance for your help!
Jan


В списке pgsql-performance по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Postgresql in a Virtual Machine
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: One query run twice in parallel results in huge performance decrease