Re: Read Uncommitted

From Mark Dilger
Subject Re: Read Uncommitted
Date
Msg-id 685ab71a-8299-7294-44cf-e5981cd54233@gmail.com
In response to Re: Read Uncommitted  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Read Uncommitted  (Mark Dilger <hornschnorter@gmail.com>)
Re: Read Uncommitted  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

On 12/19/19 1:50 AM, Simon Riggs wrote:
> It seems possible that catalog access would be the thing that makes this 
> difficult. Cache invalidations wouldn't yet have been fired, so that 
> would lead to rather weird errors, and as you say, potential issues from 
> data type changes which would not be acceptable in a facility available 
> to non-superusers.
> 
> We could limit that to xacts that don't do DDL, which is a very small % 
> of xacts, but then those xacts are more likely to be ones you'd want to 
> recover or investigate.
> 
> So I now withdraw this patch as submitted and won't be resubmitting.

Oh, I'm sorry to hear that.  I thought this feature sounded useful, and
we were working out what its limitations were.  What I gathered from
the discussion so far was:

   - It should be called something other than READ UNCOMMITTED
   - It should only be available to superusers, at least for the initial
     implementation
   - Extra care might be required to lock catalogs to avoid unsafe
     operations that could lead to backends crashing or security
     vulnerabilities
   - TOAST tables need to be handled with care

For the record, in case we revisit this idea in the future, which were
the obstacles that killed this patch?

Tom's point on that third item:

 > But I am quite afraid that we'd introduce security holes by future
 > reductions of required lock levels --- or else that this feature would be
 > the sole reason why we couldn't reduce the lock level for some DDL
 > operation.  I'm doubtful that its use-case is worth that.

Anybody running SET TRANSACTION ISOLATION LEVEL RECOVERY might
have to get ExclusiveLock on most of the catalog tables.  But that
would only be done if somebody starts a transaction using this
isolation level, which is not normal, so it shouldn't be a problem
under normal circumstances.  If the lock level reduction that Tom
mentions were implemented, there would be no problem, as long as the
lock level you reduce to still blocks against ExclusiveLock, which
surely it must.  If the transaction running in RECOVERY level isolation
can't get the locks, then it blocks and doesn't help the administrator
who is trying to use this feature, but that is no worse than the
present situation where the feature is entirely absent.  When no
catalog changes are in flight, the administrator gets the locks and
can continue inspecting the in-process changes of other transactions.
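
To make the "still blocks against ExclusiveLock" condition concrete, here is
a rough sketch in Python.  It encodes the table-level lock conflict matrix as
I understand it from the PostgreSQL documentation and lists which modes a DDL
operation could hold without conflicting with a reader holding EXCLUSIVE.
The helper names (blocks_recovery_reader, modes_not_blocking) are made up for
illustration, and RECOVERY isolation itself is only the proposal under
discussion, not an existing feature.

```python
# Table-level lock conflict matrix, transcribed from the "Conflicting Lock
# Modes" table in the PostgreSQL documentation. Each key maps to the set of
# modes it conflicts with.
MODES = [
    "ACCESS SHARE", "ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
    "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE",
]

CONFLICTS = {
    "ACCESS SHARE": {"ACCESS EXCLUSIVE"},
    "ROW SHARE": {"EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "ROW EXCLUSIVE": {"SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                      "ACCESS EXCLUSIVE"},
    "SHARE UPDATE EXCLUSIVE": {"SHARE UPDATE EXCLUSIVE", "SHARE",
                               "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                               "ACCESS EXCLUSIVE"},
    "SHARE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
              "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "SHARE ROW EXCLUSIVE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
                            "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                            "ACCESS EXCLUSIVE"},
    "EXCLUSIVE": {"ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
                  "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                  "ACCESS EXCLUSIVE"},
    "ACCESS EXCLUSIVE": set(MODES),
}


def blocks_recovery_reader(ddl_mode, reader_mode="EXCLUSIVE"):
    """True if a DDL lock in ddl_mode conflicts with the reader's lock.

    reader_mode defaults to EXCLUSIVE, the level suggested above for the
    hypothetical RECOVERY isolation level.
    """
    return reader_mode in CONFLICTS[ddl_mode]


def modes_not_blocking(reader_mode="EXCLUSIVE"):
    """Lock modes a DDL could be reduced to that would NOT conflict."""
    return [m for m in MODES if reader_mode not in CONFLICTS[m]]


if __name__ == "__main__":
    print(modes_not_blocking())
```

By this matrix the only mode that does not conflict with EXCLUSIVE is
ACCESS SHARE, so the argument holds for any DDL that keeps a lock stronger
than the one a plain SELECT takes.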

Robert's point on that fourth item:

 > As soon as a transaction aborts, the TOAST rows can be vacuumed
 > away, but the READ UNCOMMITTED transaction might've already seen the
 > main tuple. This is not even a particularly tight race, necessarily,
 > since for example the table might be scanned, feeding tuples into a
 > tuplesort, and then the detoasting might happen further up in the query
 > tree after the sort has completed.

I don't know if this could be fixed without adding overhead to toast
processing for non-RECOVERY transactions, but perhaps it doesn't need
to be fixed at all.  Perhaps you just accept that in RECOVERY mode you
can't see toast data, and instead get NULLs for all such rows.  Now,
that could have security implications if somebody defines a policy
where NULL in a toast column means "allow" rather than "deny" for
some issue, but if this RECOVERY mode is limited to superusers, that
isn't such a big objection.
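
As a toy model of that idea (not PostgreSQL code), the sketch below shows a
detoast step that simply returns None for any out-of-line value when the
reader is in the hypothetical RECOVERY mode, and a policy check that
illustrates the NULL-means-allow hazard.  ToastPointer, detoast, and
policy_allows are invented names, and the dict stands in for a TOAST table.

```python
from dataclasses import dataclass


@dataclass
class ToastPointer:
    value_id: int  # stand-in for the identifier of an out-of-line value


class ToastValueMissing(Exception):
    """Raised when a normal reader finds the TOAST chunks gone."""


def detoast(pointer, toast_store, recovery_mode=False):
    """Fetch an out-of-line value from toast_store (a dict of chunk lists).

    In the hypothetical RECOVERY mode the value is never fetched: the reader
    always gets None (NULL), so it cannot race with vacuum removing chunks.
    """
    if recovery_mode:
        return None
    chunks = toast_store.get(pointer.value_id)
    if chunks is None:
        raise ToastValueMissing(pointer.value_id)
    return b"".join(chunks)


def policy_allows(acl_value):
    """A deliberately risky policy: a NULL ACL column means "allow"."""
    return acl_value is None or acl_value == b"allow"


if __name__ == "__main__":
    store = {1: [b"de", b"ny"]}
    ptr = ToastPointer(1)
    print(policy_allows(detoast(ptr, store)))
    print(policy_allows(detoast(ptr, store, recovery_mode=True)))
```

In this model the same row is denied for a normal reader and allowed for the
RECOVERY reader, which is exactly the kind of policy that would matter if
the mode were ever opened up beyond superusers.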

There may be a number of other gotchas still to be resolved, but
abandoning the patch at this stage strikes me as premature.

-- 
Mark Dilger


