Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.

From: MichaelDBA
Subject: Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.
Date:
Msg-id: 5C745912.1080200@sqlexec.com
In reply to: Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.  (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-performance
I was wondering when that would come up: taking the queuing logic outside the database. Architecting queuing logic in relational databases can be overly painful, IMHO.

Regards,
Michael Vitale

Monday, February 25, 2019 3:30 PM
On Sat, Feb 23, 2019 at 4:06 PM Gunther <raj@gusw.net> wrote:

Hi,

I am using an SQL queue for distributing work to massively parallel workers.

You should look into specialized queueing software.

...

I figured I might just pause all workers briefly to schedule the REINDEX Queue command, but the problem with this is that while the transaction volume is large, some jobs may take minutes to process. In that case we would need to wait minutes to quiet the database, with 47 workers sitting idle while they wait for the 48th to finish so that the index can be rebuilt!

The jobs that take minutes are themselves the problem.  They prevent tuples from being cleaned up, meaning all the other jobs need to grovel through the detritus every time they need to claim a new row.  If you got those long-running jobs to end, you probably wouldn't even need to reindex--the problem would go away on its own as the dead-to-all tuples get cleaned up.
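To confirm that this is what's happening, a couple of diagnostic queries can help (a minimal sketch; the table name "queue" here is illustrative, not from the original schema):

-- How much dead-tuple bloat has the queue table accumulated?
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'queue';

-- Which long-running transactions are holding back cleanup?
SELECT pid, state, now() - xact_start AS xact_age
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 5;

If n_dead_tup dwarfs n_live_tup and the oldest transaction is minutes old, the index bloat is a symptom of blocked cleanup rather than something REINDEX can durably fix.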

Locking a tuple and leaving the transaction open for minutes is going to cause no end of trouble on a highly active system.  You should look at a three-state method where the tuple can be pending/claimed/finished, rather than pending/locked/finished.  That way the process commits immediately after claiming the tuple, and then records the outcome in another transaction once it is done processing.  You will need a way to detect processes that failed after claiming a row but before finishing, but implementing that is going to be easier than all of this re-indexing stuff you are trying to do now.  You would claim the row by updating a field in it to have something distinctive about the process, like its hostname and pid, so you can figure out if it is still running when it comes time to clean up apparently forgotten entries.
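One common way to implement that claim pattern in SQL (a minimal sketch; the table and column names, queue/status/claimed_by/claimed_at, are illustrative rather than from the original schema, and FOR UPDATE SKIP LOCKED requires PostgreSQL 9.5 or later):

-- Claim one pending row and commit immediately, recording which
-- worker took it.  SKIP LOCKED keeps concurrent workers from
-- blocking on each other's candidate rows.
BEGIN;
UPDATE queue
SET status = 'claimed',
    claimed_by = 'host42:12345',   -- worker's hostname:pid
    claimed_at = now()
WHERE id = (SELECT id
            FROM queue
            WHERE status = 'pending'
            ORDER BY id
            LIMIT 1
            FOR UPDATE SKIP LOCKED)
RETURNING id;
COMMIT;  -- row lock released here; the actual work runs outside any
         -- open transaction

-- When the job finishes, record the outcome in a second transaction.
UPDATE queue SET status = 'finished'
WHERE id = 42;  -- the id returned by the claim above

-- Periodic reaper: rows claimed by workers that apparently died
-- (check claimed_by against live processes) go back to 'pending'.
UPDATE queue
SET status = 'pending', claimed_by = NULL
WHERE status = 'claimed'
  AND claimed_at < now() - interval '10 minutes';

The key design point is that no transaction stays open while the job runs, so vacuum can keep up and the index never accumulates the dead-tuple bloat in the first place.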

Cheers,

Jeff
