Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

From Alexander Lakhin
Subject Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
Msg-id f92ce13f-cbad-769d-72df-f5b87717f375@gmail.com
In response to Re: [EXTERNAL] Re: Add non-blocking version of PQcancel  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
List pgsql-hackers
Hello Thomas,

17.07.2024 03:05, Thomas Munro wrote:
> On Wed, Jul 17, 2024 at 3:08 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>> Ugh.  I tried to follow what's going on in that cygwin code, but I gave
>> up pretty quickly.  It depends on a mutex, but I didn't see the mutex
>> being defined or initialized anywhere.
> https://github.com/cygwin/cygwin/blob/cygwin-3.5.3/winsup/cygwin/fhandler/socket_inet.cc#L217C1-L217C77
>
> Not obvious how it'd be deadlocking (?), though...  it's hard to see
> how anything between LOCK_EVENTS and UNLOCK_EVENTS could escape/return
> early.  (Something weird going on with signal handlers?  I can't
> imagine where one would call poll() though).

I've simplified the repro to the following:
echo "
-- setup foreign server "loopback" --

CREATE TABLE t1(i int);
CREATE FOREIGN TABLE ft1 (i int) SERVER loopback OPTIONS (table_name 't1');
CREATE FOREIGN TABLE ft2 (i int) SERVER loopback OPTIONS (table_name 't1');

INSERT INTO t1 SELECT i FROM generate_series(1, 100000) g(i);
" | psql

cat << 'EOF' | psql
Select pg_sleep(10);
SET statement_timeout = '10ms';
SELECT 'SELECT count(*) FROM ft1 CROSS JOIN ft2;' FROM generate_series(1, 100)
\gexec
EOF

I've attached strace (with --mask=0x251, per [1]) to the query-cancelling
backend and got strace.log (see the attachment), while observing:
ERROR:  canceling statement due to statement timeout
...
ERROR:  canceling statement due to statement timeout
-- total 14 lines, then the process hung --
-- I interrupted it several seconds later --

As far as I can see (having analyzed a number of runs), the hang occurs
when some itimer-related activity happens before "peek_socket" in this
event sequence:
[main] postgres {pid} select_stuff::wait: res after verify 0
[main] postgres {pid} select_stuff::wait: returning 0
[main] postgres {pid} select: sel.wait returns 0
[main] postgres {pid} peek_socket: read_ready: 0, write_ready: 1, except_ready: 0

(See the last occurrence of the sequence in the log.)
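
To locate that last occurrence quickly, something like the following sketch may help. It prints the final "peek_socket" entry of a log together with the lines just before it, so that any itimer-related entries preceding it are easy to spot. The function name last_peek_context and the default of 20 context lines are assumptions made for illustration; it only relies on the "peek_socket:" text shown in the sequence above, not on any particular strace line prefix.

```shell
# Hypothetical helper: show the last "peek_socket" entry in a strace log
# plus the N lines that precede it (default 20).
last_peek_context() {
    log="$1"
    n="${2:-20}"
    # Line number of the last peek_socket entry; empty if there is none.
    last=$(grep -n 'peek_socket:' "$log" | tail -n 1 | cut -d: -f1)
    [ -n "$last" ] || return 1
    start=$((last - n))
    [ "$start" -ge 1 ] || start=1
    # Print the context window ending at the last peek_socket line.
    sed -n "${start},${last}p" "$log"
}
```

For example, `last_peek_context strace.log 30` would show the 30 lines leading up to the final peek_socket entry.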

[1] https://cygwin.com/cygwin-ug-net/strace.html

Best regards,
Alexander