Re: [BUGS] Breakage with VACUUM ANALYSE + partitions

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: [BUGS] Breakage with VACUUM ANALYSE + partitions
Дата
Msg-id 20160422170736.bjrsusb4ubdhdtl4@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: [BUGS] Breakage with VACUUM ANALYSE + partitions  (Abhijit Menon-Sen <ams@2ndQuadrant.com>)
Ответы Re: [BUGS] Breakage with VACUUM ANALYSE + partitions  (Robert Haas <robertmhaas@gmail.com>)
Re: [BUGS] Breakage with VACUUM ANALYSE + partitions  (Thom Brown <thom@linux.com>)
Список pgsql-hackers
On 2016-04-14 11:09:29 +0530, Abhijit Menon-Sen wrote:
> At 2016-04-12 09:00:57 -0400, robertmhaas@gmail.com wrote:
> >
> > On Mon, Apr 11, 2016 at 1:17 PM, Andres Freund <andres@anarazel.de> wrote:
> > >
> > > 3) Actually handle the case of the last open segment not being
> > >    RELSEG_SIZE properly in _mdfd_getseg() - mdnblocks() does so.
> >
> > #3 seems like it's probably about 15 years overdue, so let's do that
> > anyway.
>
> I don't have anything useful to contribute, I'm just trying to
> understand this fix.
>
> _mdfd_getseg() looks like this:
>
>     targetseg = blkno / ((BlockNumber) RELSEG_SIZE);
>     for (nextsegno = 1; nextsegno <= targetseg; nextsegno++)
>     {
>         Assert(nextsegno == v->mdfd_segno + 1);
>
>         if (v->mdfd_chain == NULL)
>         {
>             /*
>              * Normally we will create new segments only if authorized by the
>              * caller (i.e., we are doing mdextend()). […]
>              */
>             if (behavior == EXTENSION_CREATE || InRecovery)
>             {
>                 if (_mdnblocks(reln, forknum, v) < RELSEG_SIZE)
>                                     /* mdextend */
>                 v->mdfd_chain = _mdfd_openseg(reln, forknum, +nextsegno, O_CREAT);
>             }
>             else
>             {
>                 /* We won't create segment if not existent */
>                 v->mdfd_chain = _mdfd_openseg(reln, forknum, nextsegno, 0);
>             }
>             if (v->mdfd_chain == NULL)
>                             /* return NULL or ERROR */
>         }
>         v = v->mdfd_chain;
>     }
>
> Do I understand correctly that the case of the "last open segment"
> (i.e., the one for which mdfd_chain == NULL) not being RELSEG_SIZE
> (i.e., _mdnblocks(reln, forknum, v) < RELSEG_SIZE) should not call
> _mfdf_openseg on nextsegno if behaviour is not EXTENSION_CREATE or
> InRecovery?
>
> And that "We won't create segment if not existent" should happen, but
> doesn't only because the next segment file wasn't removed earlier, so
> we have to add an extra check for that case?
>
> In other words, is something like the following what's needed here, or
> is there more to it?

It's a bit more complicated than that, but that's the principle. Thanks
for the patch, and sorry for my late response. See my attached version
for a more fleshed out version of this.

Looking at the code around this I found:
* _mdfd_getseg(), in contrast to mdnblocks() doesn't check segment size,
  neither whether to continue with a too small, nor to error out with a
  too big segment
* For, imo pretty bad, reasons in mdsync() we currently rely on
  EXTENSION_RETURN_NULL to extend the file in recovery, and to set errno
  to ENOENT. Brrr.
* we currently FATAL if a segment is too big - I've copied that
  behaviour, but why not just ERROR out?

The attached patch basically adds the segment size checks to
_mdfd_getseg(), and doesn't perform extension, even in recovery, if
EXTENSION_REALLY_RETURN_NULL is passed.

This fixes the issue for me, both in the originally reported variant,
and in some more rudely setup version (i.e. rm'ing files).


I'm seing some weird behaviour (even with writeback flushing disabled)
with smgr invals and recovery/standbys, but I don't want to dive into it
before http://www.postgresql.org/message-id/CAD21AoDpZ6Xjg=gFrGPnSn4oTRRcwK1EBrWCq9OqOHuAcMMC=w@mail.gmail.com
is addressed, it's too likely to be related.

Greetings,

Andres Freund

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: kqueue
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: GIN data corruption bug(s) in 9.6devel