Re: snapshot too old, configured by time

From: Kevin Grittner
Subject: Re: snapshot too old, configured by time
Date:
Msg-id: CACjxUsPK=tXm9NY-YmwquXX6B__EN8o6bL5h68z_X-51nsmiTA@mail.gmail.com
In reply to: Re: snapshot too old, configured by time  (Michael Paquier <michael.paquier@gmail.com>)
Responses: Re: snapshot too old, configured by time  (Michael Paquier <michael.paquier@gmail.com>)
           Re: snapshot too old, configured by time  (Andres Freund <andres@anarazel.de>)
           Re: snapshot too old, configured by time  (Alvaro Herrera <alvherre@2ndquadrant.com>)
           Re: snapshot too old, configured by time  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-hackers
On Wed, Dec 2, 2015 at 12:39 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Nov 9, 2015 at 8:07 AM, Steve Singer <steve@ssinger.info> wrote:

>> In snapmgr.c
>>
>>
>> + * XXX: If we can trust a read of an int64 value to be atomic, we can skip the
>> + * spinlock here.
>> + */
>> +int64
>> +GetOldSnapshotThresholdTimestamp(void)
>>
>>
>> Was your intent with the XXX for it to be a TODO to only acquire the lock on
>> platforms without atomic 64-bit operations?

I'm not sure whether we can safely assume a read of an int64 to be
atomic on any platform; if we actually can on some platforms, and
we have a #define to tell us whether we are in such an environment,
we could condition the spinlock calls on that.  Are we there yet?
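Just to make the shape of that concrete, something like the sketch below is
what I have in mind -- not the patch's actual code; the struct and field
names are placeholders for the shared state kept in snapmgr.c, and whether
PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY (which port/atomics defines on some
platforms) is really a sufficient guarantee here is exactly the part I'm
not prepared to assume:

/*
 * Rough sketch only -- not the patch's code.  The struct and field names
 * are placeholders for the shared state the patch keeps in snapmgr.c.
 */
#include "postgres.h"

#include "port/atomics.h"       /* architecture capability macros */
#include "storage/spin.h"

typedef struct OldSnapshotControlData
{
    slock_t     mutex_threshold;        /* protects threshold_timestamp */
    int64       threshold_timestamp;    /* value guarded by mutex_threshold */
    /* ... whatever else the patch tracks ... */
} OldSnapshotControlData;

static volatile OldSnapshotControlData *oldSnapshotControl;

int64
GetOldSnapshotThresholdTimestamp(void)
{
    int64       threshold_timestamp;

#ifdef PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY
    /* An aligned 8-byte load cannot be torn here, so skip the spinlock. */
    threshold_timestamp = oldSnapshotControl->threshold_timestamp;
#else
    /* No such guarantee; take the spinlock to avoid reading a torn value. */
    SpinLockAcquire(&oldSnapshotControl->mutex_threshold);
    threshold_timestamp = oldSnapshotControl->threshold_timestamp;
    SpinLockRelease(&oldSnapshotControl->mutex_threshold);
#endif

    return threshold_timestamp;
}

Even then, dropping the lock only helps if the field is suitably aligned
and the compiler won't split the access, which is why I'd want that
settled first.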

>> On a more general note:
>>
>> I've tried various manual tests of this feature and it sometimes works as
>> expected and sometimes doesn't.
>> I'm getting the feeling that how I expect it to work isn't quite in sync
>> with how it does work.
>>
>> I'd expect the following to be sufficient to generate the test
>>
>> T1: Obtains a snapshot that can see some rows
>> T2: Waits 60 seconds and performs an update on those rows
>> T2: Performs a vacuum
>> T1: Waits 60 seconds, tries to select from the table.  The snapshot should
>> be too old
>>
>>
>> For example, it seems that in test 002 the select_ok on conn1 following the
>> vacuum, but right before the final sleep, is critical to the "snapshot too
>> old" error showing up (i.e., if I remove that select_ok but leave in the
>> sleep, I don't get the error).
>>
>> Is this intended, and if so, is there a better way we can explain how
>> things work?

At every phase I took a conservative approach toward deferring
pruning of tuples still visible to any snapshot -- often reducing
the overhead of tracking by letting things go to the next minute
boundary.  The idea is that an extra minute or two probably isn't
going to be a big deal in terms of bloat, so if we can save any
effort on the bookkeeping by letting things go a little longer, it
is a worthwhile trade-off.  That does make it hard to give a
precise statement of exactly when a transaction *will* be subject
to cancellation based on this feature, so I have emphasized the
minimum guaranteed time that a transaction will be *safe*.  In
reviewing what you describe, I think I still don't have it as
aggressive as it could (and probably should) be.  My biggest concern is
that a long-running transaction which gets a transaction ID
matching the xmin on a snapshot it will hold for a long time may
not be subject to cancellation.  That's probably not too hard to
fix, but the bigger problem is the testing.
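To make the minute-boundary part concrete, the rounding is nothing more
exotic than the arithmetic below (standalone C for illustration, not code
from the patch): anything that happens within the same minute maps to the
same boundary, so the bookkeeping moves at most once per minute and a
snapshot can survive a minute or so longer than a strict reading of the
setting would suggest.

/*
 * Standalone illustration of rounding a timestamp in microseconds down to
 * its containing minute; compiles with any C99 compiler.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define USECS_PER_MINUTE    (INT64_C(60) * 1000 * 1000)

static int64_t
align_to_minute(int64_t ts)
{
    return ts - (ts % USECS_PER_MINUTE);
}

int
main(void)
{
    int64_t     now = INT64_C(1449100800000000);    /* arbitrary timestamp */

    /* Both calls print the same boundary: 59 seconds later is still
     * within the same minute, so the tracked threshold does not move. */
    printf("%" PRId64 "\n", align_to_minute(now));
    printf("%" PRId64 "\n", align_to_minute(now + INT64_C(59) * 1000 * 1000));
    return 0;
}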

People have said that issuing SQL commands directly from a TAP test
via DBD::Pg is not acceptable for a core feature, and (despite
assertions to the contrary) I see no way to test this feature with
existing testing mechanisms.  The bigger set of work here, if we
don't want this feature to go in without any testing scripts (which
is not acceptable IMO), is to enhance the isolation tester or
hybridize TAP testing with the isolation tester.

>> Also is 0 intended to be an acceptable value for old_snapshot_threshold and
>> if so what should we expect to see then?

The docs in the patch say this:

+        <para>
+         A value of <literal>-1</> disables this feature, and is the default.
+         Useful values for production work probably range from a small number
+         of hours to a few days.  The setting will be coerced to a granularity
+         of minutes, and small numbers (such as <literal>0</> or
+         <literal>1min</>) are only allowed because they may sometimes be
+         useful for testing.  While a setting as high as <literal>60d</> is
+         allowed, please note that in many workloads extreme bloat or
+         transaction ID wraparound may occur in much shorter time frames.
+        </para>
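So for testing purposes a cluster can be pointed at something like the
following in postgresql.conf (values taken straight from that paragraph;
-1, the default, turns the feature off entirely):

# Allow "snapshot too old" errors for snapshots older than this.
# -1 (the default) disables the feature; 0 and 1min are only
# intended for testing, and the upper limit is 60d.
old_snapshot_threshold = 1min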

Once we can agree on a testing methodology I expect that I will be
adding a number of tests based on a cluster started with
old_snapshot_threshold = 0, but as I said in my initial post of the
patch I was keeping the tests in the patch thin until it was
confirmed whether this testing methodology was acceptable.  Since
it isn't, that was just as well.  The time put into learning enough
about perl and TAP tests to create those tests already exceeds the
time to develop the actual patch, and it looks like even more will
be needed for a test methodology that doesn't require adding a
package or downloading a CPAN module.  C'est la vie.  I did expand
my perl and TAP knowledge considerably, for what that's worth.

> There has been a review but no replies for more than 1 month. Returned
> with feedback?

I do intend to post another version of the patch to tweak the
calculations again, after I can get a patch in to expand the
testing capabilities to allow an acceptable way to test the patch
-- so I put it into the next CF instead.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


