Re: BUG #14891: Old cancel request presented by pgbouncer honored after skipping a query.

From: Skarsol
Subject: Re: BUG #14891: Old cancel request presented by pgbouncer honored after skipping a query.
Msg-id: ca81c513b1af95a4aef1b70e09ae7f6b@wurm.cx
In reply to: Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop  ("Todd A. Cook" <tcook@blackducksoftware.com>)
List: pgsql-bugs
As a workaround for this issue, we've dropped the pgbouncer connection
lifetime to 60 seconds, and that seems to have alleviated it for the
most part. There has been no response from the pgbouncer project about
this (either to the recently created issue or to the mailing list
message last year when I initially reported it).
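
For readers who want to try the same workaround, a minimal sketch of the
relevant pgbouncer.ini fragment follows. It assumes that "connection
lifetime" above refers to pgbouncer's `server_lifetime` setting; the
message does not name the option, so treat that mapping as an assumption.

```ini
[pgbouncer]
; Assumption: "connection lifetime" means server_lifetime.
; Close server connections that have been open longer than 60 seconds.
server_lifetime = 60
```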

Based on the comments in past discussions of the postgres cancel
protocol, it seems that this is not viewed as a big issue because there
are no real reports of it causing problems. Are other people just not
using pgbouncer in transaction mode with the default settings (or not
running two instances of pgbouncer between client and server)? Or do
they typically not send many cancel requests? Or is there just
something silly I'm missing?

In one of our medium-sized databases we see 5-15 cancel requests per
day, and with pgbouncer on the hard-coded default setting (3600 second
connection lifetime) we would get around 1 or 2 relevant erroneous
cancels (ones that cause an insert to fail, typically failing a larger
transaction) per week. This was up from about 1 a month using the config
default of 1800 seconds that we lived with for a long time.

Is there something better than pgbouncer that we should be using for
connection pooling?

On 2017-11-08 22:55, Skarsol wrote:

The following bug has been logged on the website:

Bug reference:      14891
Logged by:          Skarsol
Email address:      postgresql(at)skarsol(dot)com
PostgreSQL version: 9.6.3
Operating system:   Linux 4.4.8-hardened-r1 #4 SMP Mon Jun 12
Description:

This might be a symptom of the issue discussed in the ToDo "Changes to
make cancellations more reliable and more secure" but as it is related
to the pgbouncer bug I've opened at
https://github.com/pgbouncer/pgbouncer/issues/245 I figured I'd post it
over here just to make sure.

As the last step of this bug, pgbouncer 1.7.2 presents a cancel request
to postgres 9.6.3. This request targets pid 29330 which is connected to
pgbouncer on port 33024. That pid then accepts a new query, returns a
result set, accepts another new query, and then cancels that one out.

Expected behavior would have been for either no cancel (as that pid was
between queries at the time) or to cancel the first query. Cancelling
the 2nd query is just weird (to me).
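
For context, a CancelRequest in the PostgreSQL v3 wire protocol carries
only the target backend's pid and secret key, with no identifier for the
query it was meant to stop. The Python sketch below builds and parses
such a packet and then replays the sequence described above against a
toy backend. `ToyBackend`, `replay_late_cancel` and the secret key value
are hypothetical names and values made up for illustration; the model is
not how PostgreSQL is implemented and does not try to explain where the
delay in the capture came from.

```python
import struct
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Cancel request code from the v3 protocol: 1234 in the high 16 bits,
# 5678 in the low 16 bits.
CANCEL_REQUEST_CODE = (1234 << 16) | 5678


def build_cancel_request(backend_pid: int, secret_key: int) -> bytes:
    """Pack length (16), cancel code, pid and key as big-endian 32-bit ints."""
    return struct.pack("!IIII", 16, CANCEL_REQUEST_CODE, backend_pid, secret_key)


def parse_cancel_request(packet: bytes) -> Tuple[int, int]:
    """Return (backend_pid, secret_key); note there is no query identifier."""
    if len(packet) != 16:
        raise ValueError("CancelRequest must be exactly 16 bytes")
    length, code, pid, key = struct.unpack("!IIII", packet)
    if length != 16 or code != CANCEL_REQUEST_CODE:
        raise ValueError("not a CancelRequest packet")
    return pid, key


@dataclass
class ToyBackend:
    """Hypothetical stand-in for a backend: it only knows what runs *now*."""
    pid: int
    running: Optional[str] = None
    events: List[str] = field(default_factory=list)

    def start(self, query: str) -> None:
        self.running = query
        self.events.append(f"start {query}")

    def finish(self) -> None:
        self.events.append(f"done {self.running}")
        self.running = None

    def cancel(self) -> None:
        # The cancel names a pid, not a query, so it hits whatever is running.
        if self.running is None:
            self.events.append("cancel ignored (idle)")
        else:
            self.events.append(f"cancelled {self.running}")
            self.running = None


def replay_late_cancel() -> List[str]:
    """Replay the order of events described in the report."""
    backend = ToyBackend(pid=29330)
    backend.start("query 1")
    backend.finish()
    backend.start("query 2")
    backend.cancel()  # the old cancel takes effect only now
    return backend.events


if __name__ == "__main__":
    packet = build_cancel_request(29330, 0x1234ABCD)  # key is made up
    print(parse_cancel_request(packet))
    print(replay_late_cancel())
```

In this model, a cancel that takes effect after the first query has
finished can only hit the second query or nothing at all, because the
request itself gives the backend no way to tell the two apart.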

I have no idea how much of this is related to whatever pgbouncer is
doing to delay the cancel in the first place before presenting it to
postgres.

I'm aware that we're 2 minor versions behind, but I don't see anything
that seems relevant to this in the changelogs.

Image of the relevant wireshark display at
https://user-images.githubusercontent.com/1915152/32578433-d5d4a71c-c4a2-11e7-9d25-f59d5afbb06b.jpg
