Re: [HACKERS] Shaky coding for vacuuming partitioned relations

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: [HACKERS] Shaky coding for vacuuming partitioned relations
Дата
Msg-id c6f082e8-76e3-e726-e719-e59eef8cd557@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: [HACKERS] Shaky coding for vacuuming partitioned relations  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: [HACKERS] Shaky coding for vacuuming partitioned relations  (Michael Paquier <michael.paquier@gmail.com>)
Re: [HACKERS] Shaky coding for vacuuming partitioned relations  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On 2017/09/25 12:10, Michael Paquier wrote:
> On Sat, Sep 23, 2017 at 4:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Somebody inserted this into vacuum.c's get_rel_oids():
>>
>>         tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
>>         if (!HeapTupleIsValid(tuple))
>>             elog(ERROR, "cache lookup failed for relation %u", relid);
>>
>> apparently without having read the very verbose comment two lines above,
>> which points out that we're not taking any lock on the target relation.
>> So, if that relation is concurrently being dropped, you're likely to
>> get "cache lookup failed for relation NNNN" rather than anything more
>> user-friendly.
> 
> This has been overlooked during the lookups of 3c3bb99, and by
> multiple people including me. elog() should never be things users can
> face, as well as cache lookups.

Agreed, that was an oversight.

>> A minimum-change fix would be to replace the elog() with an ereport
>> that produces the same "relation does not exist" error you'd have
>> gotten from RangeVarGetRelid, had the concurrent DROP TABLE committed
>> a few microseconds earlier.  But that feels like its's band-aiding
>> around the problem.
> 
> Yeah, that's not right. This is a cache lookup error at the end.

Also agreed that fixing the locking here would be a better solution.

>> What I'm wondering about is changing the RangeVarGetRelid call to take
>> ShareUpdateExclusiveLock rather than no lock.  That would protect the
>> syscache lookup, and it would also make the find_all_inheritors call
>> a lot more meaningful.
>>
>> If we're doing a VACUUM, the ShareUpdateExclusiveLock would be dropped
>> as soon as we close the caller's transaction, and then we'd acquire
>> the same or stronger lock inside vacuum_rel().  So that seems fine.
>> If we're doing an ANALYZE, then the lock would continue to be held
>> and analyze_rel would merely be acquiring it an extra time, so we'd
>> actually be removing a race-condition failure scenario for ANALYZE.
>> This would mean a few more cycles in lock management, but since this
>> only applies to a manual VACUUM or ANALYZE that specifies a table
>> name, I'm not too concerned about that.
> 
> I think that I am +1 on that. Testing such a thing I am not seeing
> anything wrong either. The call to find_all_inheritors should also use
> ShareUpdateExclusiveLock, and the lock on top of RangeVarGetRelid()
> needs to be reworked.
> 
> Attached is a proposal of patch.

Hmm, I'm not sure if we need to lock the partitions, too.  Locks taken by
find_all_inheritors() will be gone once the already-running transaction is
ended by the caller (vacuum()).  get_rel_oids() should just lock the table
mentioned in the command (the parent table), so that find_all_inheritors()
returns the correct partition list, as Tom perhaps alluded to when he said
"it would also make the find_all_inheritors call a lot more meaningful.",
but...

When vacuum() iterates that list and calls vacuum_rel() for each relation
in the list, it might be the case that a relation in that list is no
longer a partition of the parent.  So, one might say that it would get
vacuumed unnecessarily.  Perhaps that's fine?

>> Thoughts?
> 
> As long as I don't forget... Another thing currently on HEAD and
> REL_10_STABLE is that OIDs of partitioned tables are used, but the
> RangeVar of the parent is used for error reports. This leads to
> incorrect reports if a partition gets away in the middle of autovacuum
> as only information about the parent is reported to the user.

Oh, you're right.  The original RangeVar (corresponding to the table
mentioned in the command) refers to the parent table.

> I am not
> saying that this needs to be fixed in REL_10_STABLE at this stage
> though as this would require some refactoring similar to what the
> patch adding support for VACUUM with multiple relations does.

I think it would be better to fix that independently somehow.  VACUUM on
partitioned tables is handled by calling vacuum_rel() on individual
partitions.  It would be nice if vacuum_rel() didn't have to refer to the
RangeVar.  Could we not use get_rel_name(relid) in place of
relation->relname?  I haven't seen the other patch yet though, so maybe
I'm missing something.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Haribabu Kommi
Дата:
Сообщение: Re: [HACKERS] SERIALIZABLE with parallel query
Следующее
От: Noah Misch
Дата:
Сообщение: [HACKERS] Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags.Should it?