Re: [COMMITTERS] pgsql: Upgrade to Autoconf 2.69

Peter Eisentraut <peter_e@gmx.net> writes:
> On Fri, 2013-12-20 at 10:54 -0300, Alvaro Herrera wrote:
>> Evidently something is not going well in ReadRecord.  It should have
>> reported the read failure, but didn't.  That seems a separate bug that
>> needs fixed.

> This is enabling large-file support on OS X, so that seems kind of
> important.  It's not failing with newer versions of OS X, so that leaves
> the following possibilities, I think:

[ gets out ancient PPC laptop ... ]

[ man, this thing is slow ... ]

[ hours later ... ]

There are three or four different bugs here, but the key points are:

1. pg_resetxlog is diligently trashing every single WAL file in pg_xlog/,
and then failing (by virtue of some ancient OS X bug in readdir()), so
that it doesn't get to the point of recreating a WAL file with a
checkpoint record.

2. pg_resetxlog already rewrote pg_control, which might not be such a hot
idea; but whether it had or not, pg_control's last-checkpoint pointer
would be pointing to a WAL file that is not in pg_xlog/ now.

3. pg_upgrade ignores the fact that pg_resetxlog failed, and keeps going.

4. The server tries to start, and fails because it can't find a WAL file
containing the last checkpoint record.  This is pretty unsurprising given
the facts above.  The reason you don't see any "no such file" report is
that XLogFileRead() will report any BasicOpenFile() failure *other than*
ENOENT, and nothing further up the call stack reports the missing file
either.

Re point 4: the logic, if you can call it that, in xlog.c and xlogreader.c
is making my head spin.  There are about four levels of overcomplicated
and undercommented code before you ever get down to XLogFileRead, so I
have no idea which level to blame for the lack of error reporting in this
specific case.  But there are pretty clearly some cases in which ignoring
ENOENT in XLogFileRead isn't such a good idea, and XLogFileRead isn't
being told when to do that or not.
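
To make the point about ENOENT concrete, here is a minimal sketch of an
open routine that is told by its caller whether a missing file is
acceptable.  This is not the xlog.c code: open_segment() and its
missing_ok parameter are hypothetical names invented for illustration, and
plain open() stands in for BasicOpenFile().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): open a file read-only.
 *
 * When missing_ok is true, ENOENT is treated as an expected outcome and
 * stays silent.  When it is false, a missing file is reported like any
 * other open failure.  Returns the file descriptor, or -1 with errno set.
 */
static int
open_segment(const char *path, bool missing_ok)
{
    int         fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd >= 0)
        return fd;

    if (errno != ENOENT || !missing_ok)
    {
        int         save_errno = errno;

        fprintf(stderr, "could not open file \"%s\": %s\n",
                path, strerror(save_errno));
        errno = save_errno;     /* fprintf may have clobbered errno */
    }
    return -1;
}
```

A caller probing several possible locations would pass missing_ok = true;
a caller that needs the file holding the last checkpoint record would pass
false, so the failure cannot vanish silently.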

Re point 3: I already bitched about that.
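
For illustration only, this is roughly what "not ignoring the failure"
looks like for a child command run through system().  It is a sketch, not
pg_upgrade's actual code; run_step() is a hypothetical name, and it
assumes a POSIX shell is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not pg_upgrade code): run a shell command and
 * return true only if it exited normally with status 0.
 */
static bool
run_step(const char *cmd)
{
    int         status;

    assert(cmd != NULL);

    status = system(cmd);
    if (status == -1)
    {
        fprintf(stderr, "could not run \"%s\"\n", cmd);
        return false;
    }
    if (!WIFEXITED(status))
    {
        fprintf(stderr, "\"%s\" was terminated abnormally\n", cmd);
        return false;
    }
    if (WEXITSTATUS(status) != 0)
    {
        fprintf(stderr, "\"%s\" failed with exit code %d\n",
                cmd, WEXITSTATUS(status));
        return false;
    }
    return true;
}
```

The caller would stop the whole procedure as soon as run_step() returns
false, rather than going on to start the server.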

Re point 2: I wonder whether pg_resetxlog shouldn't do things in a
different order.  Rewriting pg_control to point to a file that doesn't
exist yet doesn't sound great.  I'm not sure though if there's any
great improvement to be had here; basically, if we fail partway
through, we're probably screwed no matter what.

Re point 1: there are threads in our archives suggesting that EINVAL
from readdir() after unlinking a directory entry is a known problem
on some versions of OS X, e.g.
http://www.postgresql.org/message-id/flat/47C45B07-8459-48D8-8FBE-291864019CC2@me.com
and I also find stuff on the intertubes suggesting that renaming an entry
can cause it too, e.g.
http://www.dovecot.org/list/dovecot/2009-July/041009.html

We thought at the time that the readdir bug was new in OS X 10.6, but I'll
bet it was there in 10.5's 64-bit-inode support and was only exposed to
default builds when 10.6 turned on 64-bit-inodes in userland by default.
So Apple fixed it in 10.6.2 but never back-patched the fix into 10.5.x ---
shame on them.

I'm disinclined to contort our stuff to any great extent to fix it,
because all the OS X versions mentioned are several years obsolete.
Perhaps though we should override Autoconf's setting of
_DARWIN_USE_64_BIT_INODE, if we can do that easily?  It's clearly
not nearly as problem-free on 10.5 as the Autoconf boys believe,
and it's already enabled by default on the release series where it
does work.
        regards, tom lane


