Standby recovery conflicts: add information when the cancellation occurs

From: Drouvot, Bertrand
Subject: Standby recovery conflicts: add information when the cancellation occurs
Msg-id: 5a11fa42-f275-8610-4b69-76f52d11d8ab@amazon.com
List: pgsql-hackers

Hi hackers,

As suggested by Masao, I am starting a new thread to follow up on 
standby recovery conflicts.

The initial patch proposed in [1] has been split into three parts:

- Add block information in error context of WAL REDO apply: committed 
(9d0bd95fa90a7243047a74e29f265296a9fc556d)
- Add information when the startup process is waiting for recovery 
conflicts: committed (0650ff23038bc3eb8d8fd851744db837d921e285)
- Add information when the cancellation occurs: subject of this new thread

As you can see, the initial idea was also to dump information about the 
blocking backends (should they reach the cancellation stage).

The main idea is to log information such as:

2020-06-15 06:48:54.778 UTC [7037] LOG: about to interrupt pid: 7037, 
backend_type: client backend, state: active, wait_event_type: Timeout, 
wait_event: PgSleep, query_start: 2020-06-15 06:48:13.008427+00

Some examples of how this could be useful:

     - A query that usually runs in 1 second is canceled, and the log 
shows that it started 1 minute earlier: that could indicate a plan 
change.
     - A lot of queries have been canceled and all of them were waiting 
on “DataFileRead”: that could indicate bad I/O response time at that 
moment.
     - Seeing the state as “idle in transaction” could indicate 
unexpected application behavior (say the application issues BEGIN; SET 
TRANSACTION ISOLATION LEVEL REPEATABLE READ; then a SELECT, and then 
stays idle in transaction, which could lead to a recovery conflict).
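
To make the use cases above a bit more concrete, here is a minimal 
Python sketch of how such lines could be post-processed once they are in 
the server log. It assumes the exact line format of the sample shown 
earlier in this mail, which is only a proposal at this point and may 
well change. The names parse_line and summarize_wait_events are 
hypothetical helpers, not part of any existing tool, and the sketch 
assumes both timestamps are in UTC.

```python
import re
from collections import Counter
from datetime import datetime

# Regex for the log line format proposed in this thread.
# Assumption: the format is exactly the one in the sample; it is not committed.
LINE_RE = re.compile(
    r"^(?P<logged>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) UTC \[\d+\] LOG: "
    r"about to interrupt pid: (?P<pid>\d+), "
    r"backend_type: (?P<backend_type>[^,]+), "
    r"state: (?P<state>[^,]+), "
    r"wait_event_type: (?P<wait_event_type>[^,]+), "
    r"wait_event: (?P<wait_event>[^,]+), "
    r"query_start: (?P<query_start>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\+00$"
)

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_line(line):
    """Parse one proposed 'about to interrupt' line into a dict, or return None."""
    # Collapse whitespace so that a line wrapped by a mail client still matches.
    match = LINE_RE.match(" ".join(line.split()))
    if match is None:
        return None
    record = match.groupdict()
    logged = datetime.strptime(record["logged"], TS_FORMAT)
    started = datetime.strptime(record["query_start"], TS_FORMAT)
    record["pid"] = int(record["pid"])
    # How long the query had been running when it was about to be canceled.
    record["runtime_s"] = (logged - started).total_seconds()
    return record


def summarize_wait_events(lines):
    """Count wait events across all lines that match the proposed format."""
    records = (parse_line(line) for line in lines)
    return Counter(r["wait_event"] for r in records if r is not None)


if __name__ == "__main__":
    sample = (
        "2020-06-15 06:48:54.778 UTC [7037] LOG: about to interrupt pid: 7037, "
        "backend_type: client backend, state: active, wait_event_type: Timeout, "
        "wait_event: PgSleep, query_start: 2020-06-15 06:48:13.008427+00"
    )
    record = parse_line(sample)
    print(record["state"], record["wait_event"], record["runtime_s"])
    print(summarize_wait_events([sample]))
```

For the sample line the computed runtime is a bit under 42 seconds, 
which is the kind of number one would compare against the usual runtime 
of the query. A wait-event count dominated by “DataFileRead” would 
correspond to the second example above.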

The main purpose is to dump this information just before the 
cancellation occurs, to get some clue about what was going on and some 
data to work with (to avoid future conflicts and cancellations).

If you think this information would be useful, I can submit a patch in 
this area.

Bertrand

[1]: 
https://www.postgresql.org/message-id/9a60178c-a853-1440-2cdc-c3af916cff59%40amazon.com







