On Fri, Jun 15, 2012 at 1:58 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Do we need the retry flag (applies to two places)? If oldblkno is
> still InvalidBlockNumber then it can't equal blkno.
I was a bit concerned that blkno = BUCKET_TO_BLKNO(metap, bucket)
might set blkno to InvalidBlockNumber, thus possibly messing up the
algorithm. This way seemed better in terms of not requiring any
assumption in that area.
> In the README, the psuedo codes probably needs to mention re-locking
> the meta page in the loop.
Good point. Fixed.
> Also, "page" is used to mean either the disk page or the shared buffer
> currently holding that page, depending on context. This is confusing.
> Maybe we should clarify "Lock the meta page buffer". Of course this
> gripe precedes your patch and applies to other parts of the code as
> well, but since we mingle LW locks (on buffers) and heavy locks (on
> names of disk pages) it might make sense to be more meticulous here.
I attempted to improve this somewhat in the README, but I think it may
be more than I can do completely.
> "exclusive-lock page 0 (assert the right to begin a split)" is no
> longer true, nor is "release X-lock on page 0"
I think this is fixed.
> Also in the README, section "To prevent deadlock we enforce these
> coding rules:" would need to be changed as those rules are being
> changed. But, should we change them at all?
>
> In _hash_expandtable, the claim "But since we are only trylocking here
> it should be OK" doesn't completely satisfy me. Even a conditional
> heavy-lock acquire still takes a transient non-conditional LW Lock on
> the lock manager partition. Unless there is a global rule that no one
> can take a buffer content lock while holding a lock manager partition
> lock, this seems dangerous. Could this be redone to follow the
> pattern of heavy locking with no content lock, then reacquiring the
> buffer content lock to check to make sure we locked the correct
> things? I don't know if it would be better to loop, or just give up,
> if the meta page changed underneath us.
Hmm. That was actually a gloss I added on existing code to try to
convince myself that it was safe; I don't think that the changes I
made make that any more or less safe than it was before. But the
question of whether or not it is safe is an interesting one. I do
believe that your statement is true, though: lock manager locks - or
backend locks, for the fast-path - should be the last thing we lock.
In some cases we lock more than one lock manager partition lock, but
we never lock anything else afterwards, AFAIK.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company