Re: [PATCH] lock_timeout and common SIGALRM framework

Поиск
Список
Период
Сортировка
От Boszormenyi Zoltan
Тема Re: [PATCH] lock_timeout and common SIGALRM framework
Дата
Msg-id 500085DA.5050908@cybertec.at
обсуждение исходный текст
Ответ на Re: [PATCH] lock_timeout and common SIGALRM framework  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [PATCH] lock_timeout and common SIGALRM framework  (Boszormenyi Zoltan <zb@cybertec.at>)
Re: [PATCH] lock_timeout and common SIGALRM framework  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
2012-07-12 19:05 keltezéssel, Tom Lane írta:
> Here is a revised version of the timeout-infrastructure patch.
> I whacked it around quite a bit, notably:
>
> * I decided that the most convenient way to handle the initialization
> issue was to combine establishment of the signal handler with resetting
> of the per-process variables.  So handle_sig_alarm is no longer global,
> and InitializeTimeouts is called at the places where we used to do
> "pqsignal(SIGALRM, handle_sig_alarm);".  I believe this is correct
> because any subprocess that was intending to use SIGALRM must have
> called that before establishing any timeouts.

OK.

> * BTW, doing that exposed the fact that walsender processes were failing
> to establish a SIGALRM signal handler, which is a pre-existing bug,
> because they run a normal authentication transaction during InitPostgres
> and hence need to be able to cope with deadlock and statement timeouts.
> I will do something about back-patching a fix for that.

Wow, my work uncovered a pre-existing bug in PostgreSQL. :-)

> * I ended up putting the RegisterTimeout calls for DEADLOCK_TIMEOUT
> and STATEMENT_TIMEOUT into InitPostgres, ensuring that they'd get
> done in walsender and autovacuum processes.  I'm not totally satisfied
> with that, but for sure it didn't work to only establish them in
> regular backends.
>
> * I didn't like the "TimeoutName" nomenclature, because to me "name"
> suggests that the value is a string, not just an enum.  So I renamed
> that to TimeoutId.

OK.

> * I whacked around the logic in timeout.c a fair amount, because it
> had race conditions if SIGALRM happened while enabling or disabling
> a timeout.  I believe the attached coding is safe, but I'm not totally
> happy with it from a performance standpoint, because it will do two
> setitimer calls (a disable and then a re-enable) in many cases where
> the old code did only one.

Disabling deadlock timeout, a.k.a. disable_sig_alarm(false) in
the original code called setitimer() twice if statement timeout
was still in effect, it was done in CheckStatementTimeout().
Considering this, I think there is no performance problem with
the new code you came up with.

> I think what we ought to do is go ahead and apply this, so that we
> can have the API nailed down, and then we can revisit the internals
> of timeout.c to see if we can't get the performance back up.
> It's clearly a much cleaner design than the old spaghetti logic in
> proc.c, so I think we ought to go ahead with this independently of
> whether the second patch gets accepted.

There is one tiny bit you might have broken. You wrote previously:

> I am also underwhelmed by the "timeout_start" callback function concept.
> In the first place, that's broken enable_timeout, which incorrectly
> assumes that the value it gets must be "now" (see its schedule_alarm
> call).  In the second place, it seems fairly likely that callers of
> get_timeout_start would likewise want the clock time at which the
> timeout was enabled, not the timeout_start reference time.  (If they
> did want the latter, why couldn't they get it from wherever the callback
> function had gotten it?)  I'm inclined to propose that we drop the
> timeout_start concept and instead provide two functions for scheduling
> interrupts:
>
>     enable_timeout_after(TimeoutName tn, int delay_ms);
>     enable_timeout_at(TimeoutName tn, TimestampTz fin_time);
>
> where you use the former if you want the standard GetCurrentTimestamp +
> n msec calculation, but if you want the stop time calculated in some
> other way, you calculate it yourself and use the second function.

It's okay, but you haven't really followed it with STATEMENT_TIMEOUT:

-----8<----------8<----------8<----------8<----------8<-----
*** 2396,2404 ****                /* Set statement timeout running, if any */                /* NB: this mustn't be
enableduntil we are within an xact */                if (StatementTimeout > 0) 
!                       enable_sig_alarm(StatementTimeout, true);                else
!                       cancel_from_timeout = false;
                xact_started = true;        }
--- 2397,2405 ----                /* Set statement timeout running, if any */                /* NB: this mustn't be
enableduntil we are within an xact */                if (StatementTimeout > 0) 
!                       enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout);                else
!                       disable_timeout(STATEMENT_TIMEOUT, false);
                xact_started = true;        }
-----8<----------8<----------8<----------8<----------8<-----

It means that StatementTimeout losts its precision. It would trigger
in the future counting from "now" instead of counting from
GetCurrentStatementStartTimestamp(). It should be:

enable_timeout_at(STATEMENT_TIMEOUT,
TimestampTzPlusMilliseconds(GetCurrentStatementStartTimestamp(), StatementTimeout));

> I haven't really looked at the second patch yet, but at minimum that
> will need some rebasing to match the API tweaks here.

Yes, I will do that.

Thanks for your review and work.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: isolation check takes a long time
Следующее
От: Tom Lane
Дата:
Сообщение: Re: initdb and fsync