Re: Query running for very long time (server hanged) with parallel append

From: Robert Haas
Subject: Re: Query running for very long time (server hanged) with parallel append
Msg-id: CA+TgmoZj-gXqbQD4E1Bg7yVB3bvcoSC5HWSv0DBwLNxBO5uFBQ@mail.gmail.com
In reply to: Re: Query running for very long time (server hanged) with parallel append  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses: Re: Query running for very long time (server hanged) with parallel append
Re: Query running for very long time (server hanged) with parallel append
List: pgsql-hackers
On Fri, Feb 2, 2018 at 1:43 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> On 2 February 2018 at 03:50, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>> Whatever logic bug might be causing the query to hang, it's not good
>> that we're unable to SIGINT/SIGTERM our way out of this state.  See
>> also this other bug report for a known problem (already fixed but not
>> yet released), but which came with an extra complaint, as yet
>> unexplained, that the query couldn't be interrupted:
>>
>> https://www.postgresql.org/message-id/flat/151724453314.1238.409882538067070269%40wrigleys.postgresql.org
>
> Yeah, it is not good that there is no response to the SIGINT.
>
> The query is actually hanging because one of the workers is in a small
> loop where it iterates over the subplans searching for unfinished
> plans, and it never comes out of the loop (it's a bug which I am yet
> to fix). And it does not make sense to keep CHECK_FOR_INTERRUPTS in
> each iteration; it's a small loop that does not pass control to any
> other functions.

Uh, sounds like we'd better fix that bug.
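
To make the failure mode concrete, here is a minimal standalone sketch of a loop that scans subplans circularly for an unfinished one. It is not the actual Parallel Append code; the function name, the `finished` array, and the `NO_PLAN` sentinel are all hypothetical. The point it illustrates is that such a scan needs an explicit "wrapped around to where I started" exit; without one, a worker that finds every subplan finished spins forever, and a loop that never calls anything else never reaches a CHECK_FOR_INTERRUPTS either.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_PLAN (-1)            /* hypothetical sentinel: nothing left to run */

/*
 * Return the index of the next unfinished subplan, scanning circularly
 * from 'start', or NO_PLAN if every subplan is already finished.
 *
 * Illustrative model only; names and data layout are assumptions.
 */
int
next_unfinished_plan(const bool *finished, int nplans, int start)
{
    int         i = start;

    assert(nplans > 0 && start >= 0 && start < nplans);

    do
    {
        if (!finished[i])
            return i;           /* found work to do */
        i = (i + 1) % nplans;   /* advance, wrapping at the end */
    } while (i != start);       /* full circle: give up instead of spinning */

    return NO_PLAN;
}
```

With the `while (i != start)` condition the scan visits each subplan at most once, so it is bounded by `nplans` iterations regardless of what the other workers have done.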

> But I am not sure about this: while the workers are at it, why does the
> backend that is waiting for the workers not come out of the wait
> state on a SIGINT? I guess the same issue has been discussed in the
> mail thread that you pointed to.

Is it getting stuck here?

    /*
     * We can't finish transaction commit or abort until all of the workers
     * have exited.  This means, in particular, that we can't respond to
     * interrupts at this stage.
     */
    HOLD_INTERRUPTS();
    WaitForParallelWorkersToExit(pcxt);
    RESUME_INTERRUPTS();
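
For readers unfamiliar with why a SIGINT would go unanswered inside that section, the following is a simplified standalone model of a holdoff counter, assuming the usual scheme where the signal handler only sets a flag and the flag is serviced later at a check point. All names below are hypothetical lowercase stand-ins, not the real PostgreSQL macros or variables.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; not the real backend globals. */
static volatile bool interrupt_pending = false; /* set when a signal arrives */
static int  holdoff_count = 0;                  /* >0 means "do not service" */
static int  interrupts_serviced = 0;            /* for demonstration only */

void hold_interrupts(void)   { holdoff_count++; }

void resume_interrupts(void)
{
    assert(holdoff_count > 0);  /* must pair with a prior hold */
    holdoff_count--;
}

/* Stand-in for what a signal handler would do: just record the request. */
void raise_interrupt(void)   { interrupt_pending = true; }

/* Service a pending interrupt only when no holdoff is in effect. */
void check_for_interrupts(void)
{
    if (!interrupt_pending || holdoff_count > 0)
        return;                 /* nothing pending, or deferred by a hold */
    interrupt_pending = false;
    interrupts_serviced++;
}

int serviced_count(void)     { return interrupts_serviced; }
```

In this model, a request raised while the hold is in effect stays pending and is serviced only by the first check after the matching resume. If the wait between hold and resume never returns, because a worker never exits, the request is never serviced.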

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

