Hanging query on a fresh restart

Поиск
Список
Период
Сортировка
От Jason Jho
Тема Hanging query on a fresh restart
Дата
Msg-id CACLReKuBuHU85p38fYoF_h_jMKeeFewOXvS7zSGV+fHWz4ENrQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Hanging query on a fresh restart  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Список pgsql-performance
Hi,

We using Postgres 9.3.10 on Amazon RDS and running into some strange behavior that has been tough to track down and debug (partially due to the limited admin access from RDS).

We're running a read-only query that normally takes ~10-15 min., but also runs concurrently with several other intensive queries (these queries themselves, finish).  

On one particular day, this query hung for many hours and even while we killed pids for running queries and any locks granted, the query would never return.  Also no hints of blocking processes.  After some digging through some I/O metrics, we didn't see any memory issues or unusual spikes that would lead us to believe that we're running low on resources.

There is 1 caveat, however:  there was a different schema that contained a day-old copy of data that isn't normally present when the hang started to occur.  However, since these are completely different schema namespaces with no crossovers in the queries themselves, I don't see how this is relevant.


 1) We ended up doing a full reboot of the RDS instance and ran the query again, this time, no other queries are running off of a fresh boot-up (no competing locks or transactions).  The query continued to hang.

 2) We then ran pg_dump to snapshot the current data and did a full pg_restore (after dropping all schemas) of an older dataset where we knew this query would run successfully.  As expected, the query ran fine.

 3) We then dropped all schemas again and pg_restored the previous dataset that was causing the query to hang, and then to my surprise, the query ran just fine.  No hangs.  

We thought this might be possibly due to some internal vacuuming, but this is unlikely since there are no real concurrent reads or updates happening.  Auto-vacuum is also on with default settings.

What is the most confusing part in all of this is why a DROP SCHEMA CASCADE and a fresh pg_restore would somehow fix the problem.  Even a fresh reboot didn't fix it.

Any ideas??

В списке pgsql-performance по дате отправления:

Предыдущее
От: Igor Neyman
Дата:
Сообщение: Re: Slow 3 Table Join with v bad row estimate
Следующее
От: David Osborne
Дата:
Сообщение: Re: Slow 3 Table Join with v bad row estimate