Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Дата
Msg-id 20180403.144512.77690028.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Ответы Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Список pgsql-hackers
Hello.

At Mon, 2 Apr 2018 16:11:12 -0300, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in
<20180402191112.wneiyj4v5upnfjst@alvherre.pgsql>
> Why do we need AccessExclusiveLock on all children of a relation that we
> want to scan to search for rows not satisfying the constraint?  I think
> it should be enough to get ShareLock, which prevents INSERT, no?  I have
> a feeling I'm missing something here, but I don't know what, and all
> tests pass with that change.

Mmm. I'm not sure if there's a lock-upgrade case but the
following sentense is left at the last of one of the modified
comments. ATREwriteTables is called with AEL after that (that has
finally no effect in this case).

|                                   But we cannot risk a deadlock by taking
| * a weaker lock now and the stronger one only when needed.

I don't actually find places where the children's lock can be
raised but this seems just following the lock parent first
principle.

By the way check_default_allows_bound() (CREATE TABLE case)
contains a similar usage of find_all_inheritors(default_rel,
AEL).

> Also: the change proposed to remove the get_default_oid_from_partdesc()
> call actually fixes the bug, but to me it was not at all obvious why.

CommandCounterIncrement updates the content of a relcache entry
via invalidation. It can be surprising for callers of a function
like StorePartitionBound.

CommandCounterIncrement
 <skip inval mechanism>
   RelationCacheInvalidateEntry
     RelationFlushRelation
       RelationClearRelation

> To figure out why, I first had to realize that
> ValidatePartitionConstraints was lying to me, both in name and in
> comments: it doesn't actually validate anything, it merely queues
> entries so that alter table's phase 3 would do the validation.  I found
> this extremely confusing, so I fixed the comments to match reality, and
> later decided to rename the function also.

It is reasonable. Removing exccessive extension of lower-level
partitions is also reasonable. The previous code extended
inheritances in different manner for levels at odd and even
depth.

> At that point I was able to understand what the problem was: when
> attaching the default partition, we were setting the constraint to be
> validated for that partition to the correctly computed partition
> constraint; and later in the same function we would set the constraint
> to "the immature constraint" because we were now seeing that the default
> partition OID was not invalid.  So it was rather a bug in
> ValidatePartitionConstraints, in that it was accepting to set the
> expression to validate when another expression had already been set!  I
> added an assert to protect against this.  And then changed the decision
> of whether or not to run this block based on the attached partition
> being the default one or not; because as the "if" test was, it was just
> a recipe for confusion.  (Now, if you look carefully you realize that
> the new test for bound->is_default I added is kinda redundant: it can
> only be set if there was a default partition OID at start of the
> function, and therefore we know we're not adding a default partition.
> And for the case where we *are* adding the default partition, it is
> still Invalid because we don't re-read its own OID.  But relying on that
> seems brittle because it breaks if we happen to update the default
> partition OID in the middle of that function, which is what we were
> doing.  Hence the new is_default test.)

Seems reasonable. But even if we assume so, rereading
defaultPartOid is still breaking the assumption that
defaultPartOid is that at the time of entering to this function
and the added condition just conceals the fact.

So I think it should be an assertion.

| if (OidIsValid(defaultPartOid))
| {
|      /*
|       *  The command cannot be adding default partition if the
|       *  defaultPartOid is valid.
|       */
|      Assert(!cmd->bound->is_default);


> I looked at the implementation of ValidatePartitionConstraints and
> didn't like it, so I rewrote it also.
> 
> This comment is mistaken, too:
> -       /*
> -        * Skip if the partition is itself a partitioned table.  We can only
> -        * ever scan RELKIND_RELATION relations.
> -        */
> ... because it ignores the possibility of a partition being a foreign
> table.  The code seems to work, but I think there is no test to cover
> the case that a foreign table containing data that doesn't satisfy the
> constraint is attached, because when I set that case to return doing
> nothing (ie. ATGetQueueEntry is not called for a foreign partition), no
> test failed.

Foreign tables are intentionally not verified on attaching (in my
understanding). ATRewriteTables ignores foreign tables so it
contradicts to what ATRewriteTables thinks. As my understanding
foreign tables are assumed to be unrestrictable by local
constraint after attaching so the users are responsible to the
consistency.

> Generally speaking, I didn't like ATExecAttachPartition; it's doing too
> much that should have been done in ATPrepAttachPartition.  The only
> reason that's not broken is because we don't allow ATTACH PARTITION to
> appear together with other commands in alter table, which seems
> disappointing ... for example, when attaching multiple tables and a
> default partition exist, you have to scan the default one multiple
> times, which is lame.

Indeed, currently only partition commands are isolated from other
alter table subcommands (except all in tablespace cases). We can
improve that in the next step?

> (Speaking of lame: I noticed that RelationGetPartitionQual obtains lock
> on the parent relation ... I wonder if this can be used to cause a
> deadlock during InitResultRelationInfo.)

Mmm. That seems strange since the RealtionGetPartitionQual
doesn't return anything (specifically NIL) there, because we
don't allow a partition to be attached to partitioned tables. It
seems totally useless.

Addition to that the code tries to add the partition qual (which
is always NIL) destructively and assign to partConstraint..

> BTW, I think this is already broken for the case where the default
> partition is partitioned and you attach a partition colliding with a row
> that's being concurrently inserted in a partition of the default
> partition, though I didn't bother to write a test for that.

How is it broken? Every attaching partitions are checked for the
specified partition bound and every partitions of the default
partition are also checked against the new default part bound. We
already hold required locks on all the participants.


> Anyway, I'm just an onlooker fixing a CommandCounterIncrement change.

It's reassuring. Thanks.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Optimizing nested ConvertRowtypeExpr execution
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Rewriting the test of pg_upgrade as a TAP test - take two