Обсуждение: Strange coding in _mdfd_openseg()

Поиск
Список
Период
Сортировка

Strange coding in _mdfd_openseg()

От
Thomas Munro
Дата:
Hello,

I think the following conditional code is misleading, and I wonder if
it would be better like so:

--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -1787,8 +1787,13 @@ _mdfd_openseg(SMgrRelation reln, ForkNumber
forknum, BlockNumber segno,
        if (fd < 0)
                return NULL;

-       if (segno <= reln->md_num_open_segs[forknum])
-               _fdvec_resize(reln, forknum, segno + 1);
+       /*
+        * Segments are always opened in order from lowest to highest,
so we must
+        * be adding a new one at the end.
+        */
+       Assert(segno == reln->md_num_open_segs[forknum]);
+
+       _fdvec_resize(reln, forknum, segno + 1);

        /* fill the entry */
        v = &reln->md_seg_fds[forknum][segno];

I think the condition is always true, and with == it would also always
be true.  If that weren't the case, the call to _fdvec_resize() code
would effectively leak vfds.

-- 
Thomas Munro
https://enterprisedb.com



Re: Strange coding in _mdfd_openseg()

От
Kyotaro HORIGUCHI
Дата:
At Wed, 3 Apr 2019 17:14:36 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in
<CA+hUKG+NBw+uSzxF1os-SO6gUuw=cqO5DAybk6KnHKzgGvxhxA@mail.gmail.com>
> Hello,
> 
> I think the following conditional code is misleading, and I wonder if
> it would be better like so:
> 
> --- a/src/backend/storage/smgr/md.c
> +++ b/src/backend/storage/smgr/md.c
> @@ -1787,8 +1787,13 @@ _mdfd_openseg(SMgrRelation reln, ForkNumber
> forknum, BlockNumber segno,
>         if (fd < 0)
>                 return NULL;
> 
> -       if (segno <= reln->md_num_open_segs[forknum])
> -               _fdvec_resize(reln, forknum, segno + 1);
> +       /*
> +        * Segments are always opened in order from lowest to highest,
> so we must
> +        * be adding a new one at the end.
> +        */
> +       Assert(segno == reln->md_num_open_segs[forknum]);
> +
> +       _fdvec_resize(reln, forknum, segno + 1);
> 
>         /* fill the entry */
>         v = &reln->md_seg_fds[forknum][segno];
> 
> I think the condition is always true, and with == it would also always
> be true.  If that weren't the case, the call to _fdvec_resize() code
> would effectively leak vfds.

I may be missing something, but it seems possible that
_mdfd_getseg calls it with segno > opensegs.

|     for (nextsegno = reln->md_num_open_segs[forknum];
|          nextsegno <= targetseg; nextsegno++)
| ...
|         v = _mdfd_openseg(reln, forknum, nextsegno, flags);


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Strange coding in _mdfd_openseg()

От
Thomas Munro
Дата:
On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> I may be missing something, but it seems possible that
> _mdfd_getseg calls it with segno > opensegs.
>
> |     for (nextsegno = reln->md_num_open_segs[forknum];

Here nextsegno starts out equal to opensegs.

> |          nextsegno <= targetseg; nextsegno++)

Here we add one to nextsegno...

> | ...
> |         v = _mdfd_openseg(reln, forknum, nextsegno, flags);

... after adding one to opensegs.  So they're always equal.  This fits
a general programming pattern when appending to an array, the next
element's index is the same as the number of elements.  But I claim
the coding is weird, because _mdfd_openseg's *looks* like it can
handle opening segments in any order, except that the author
accidentally wrote "<=" instead of ">=".  In fact it can't open them
in any order, because we don't support "holes" in the array.  So I
think it should really be "==", and it should be an assertion, not a
condition.

-- 
Thomas Munro
https://enterprisedb.com



Re: Strange coding in _mdfd_openseg()

От
Andres Freund
Дата:
Hi,

On 2019-04-04 09:24:49 +1300, Thomas Munro wrote:
> On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > I may be missing something, but it seems possible that
> > _mdfd_getseg calls it with segno > opensegs.
> >
> > |     for (nextsegno = reln->md_num_open_segs[forknum];
> 
> Here nextsegno starts out equal to opensegs.
> 
> > |          nextsegno <= targetseg; nextsegno++)
> 
> Here we add one to nextsegno...
> 
> > | ...
> > |         v = _mdfd_openseg(reln, forknum, nextsegno, flags);
> 
> ... after adding one to opensegs.  So they're always equal.  This fits
> a general programming pattern when appending to an array, the next
> element's index is the same as the number of elements.  But I claim
> the coding is weird, because _mdfd_openseg's *looks* like it can
> handle opening segments in any order, except that the author
> accidentally wrote "<=" instead of ">=".  In fact it can't open them
> in any order, because we don't support "holes" in the array.  So I
> think it should really be "==", and it should be an assertion, not a
> condition.

Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion
of equality, or just invert the >= (which I agree I probably just
screwed up and didn't notice when reviewing the patch because it looked
close enough to correct and it didn't have a measurable effect).

Greetings,

Andres Freund



Re: Strange coding in _mdfd_openseg()

От
Kyotaro HORIGUCHI
Дата:
Hello.

At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund <andres@anarazel.de> wrote in
<20190403204746.2yumq7c2mirmodsg@alap3.anarazel.de>
> Hi,
> 
> On 2019-04-04 09:24:49 +1300, Thomas Munro wrote:
> > On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI
> > <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > > I may be missing something, but it seems possible that
> > > _mdfd_getseg calls it with segno > opensegs.
> > >
> > > |     for (nextsegno = reln->md_num_open_segs[forknum];
> > 
> > Here nextsegno starts out equal to opensegs.
> > 
> > > |          nextsegno <= targetseg; nextsegno++)
> > 
> > Here we add one to nextsegno...
> > 
> > > | ...
> > > |         v = _mdfd_openseg(reln, forknum, nextsegno, flags);
> > 
> > ... after adding one to opensegs.  So they're always equal.  This fits
> > a general programming pattern when appending to an array, the next
> > element's index is the same as the number of elements.  But I claim
> > the coding is weird, because _mdfd_openseg's *looks* like it can
> > handle opening segments in any order, except that the author
> > accidentally wrote "<=" instead of ">=".  In fact it can't open them
> > in any order, because we don't support "holes" in the array.  So I
> > think it should really be "==", and it should be an assertion, not a
> > condition.
> 
> Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion
> of equality, or just invert the >= (which I agree I probably just
> screwed up and didn't notice when reviewing the patch because it looked
> close enough to correct and it didn't have a measurable effect).

I looked there and agreed. _mdfd_openseg is always called just to
add one new segment after the last opened segment by the two
callers. So I think == is better.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: Strange coding in _mdfd_openseg()

От
Thomas Munro
Дата:
On Thu, Apr 4, 2019 at 4:16 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund <andres@anarazel.de> wrote in
<20190403204746.2yumq7c2mirmodsg@alap3.anarazel.de>
> > Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion
> > of equality, or just invert the >= (which I agree I probably just
> > screwed up and didn't notice when reviewing the patch because it looked
> > close enough to correct and it didn't have a measurable effect).
>
> I looked there and agreed. _mdfd_openseg is always called just to
> add one new segment after the last opened segment by the two
> callers. So I think == is better.

Thanks.  Some other little things I noticed while poking around in this area:

Callers of _mdgetseg(EXTENSION_RETURN_NULL) expect errno to be set if
it returns NULL, and it expects the same of
mdopen(EXTERNSION_RETURN_NULL), and yet the latter does:

    fd = PathNameOpenFile(path, O_RDWR | PG_BINARY);

    if (fd < 0)
    {
        if ((behavior & EXTENSION_RETURN_NULL) &&
            FILE_POSSIBLY_DELETED(errno))
        {
            pfree(path);
            return NULL;
        }

1.  I guess that needs save_errno treatment to protect it from being
clobbered by pfree()?
2.  It'd be nice if function documentation explicitly said which
functions set errno on error (and perhaps which syscalls).
3.  Why does some code use "file < 0" and other code "file <= 0" to
detect errors from fd.c functions that return File?

-- 
Thomas Munro
https://enterprisedb.com



Re: Strange coding in _mdfd_openseg()

От
Kyotaro HORIGUCHI
Дата:
At Fri, 5 Apr 2019 18:44:15 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in
<CA+hUKGKa-OKiNEsWOs+SWugpSE-C7MebejK-dDipaoS17BkRNw@mail.gmail.com>
> On Thu, Apr 4, 2019 at 4:16 PM Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund <andres@anarazel.de> wrote in
<20190403204746.2yumq7c2mirmodsg@alap3.anarazel.de>
> > > Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion
> > > of equality, or just invert the >= (which I agree I probably just
> > > screwed up and didn't notice when reviewing the patch because it looked
> > > close enough to correct and it didn't have a measurable effect).
> >
> > I looked there and agreed. _mdfd_openseg is always called just to
> > add one new segment after the last opened segment by the two
> > callers. So I think == is better.
> 
> Thanks.  Some other little things I noticed while poking around in this area:
> 
> Callers of _mdgetseg(EXTENSION_RETURN_NULL) expect errno to be set if
> it returns NULL, and it expects the same of

Only mdsyncfiletag seems expecting that and it is documented. But
_mdfd_getseg is not documented as the same. mdopen also is not.

> mdopen(EXTERNSION_RETURN_NULL), and yet the latter does:
> 
>     fd = PathNameOpenFile(path, O_RDWR | PG_BINARY);
> 
>     if (fd < 0)
>     {
>         if ((behavior & EXTENSION_RETURN_NULL) &&
>             FILE_POSSIBLY_DELETED(errno))
>         {
>             pfree(path);
>             return NULL;
>         }

> 1.  I guess that needs save_errno treatment to protect it from being
> clobbered by pfree()?

If both elog() and free() don't change errno, we don't need to do
that at least for AllocSetFree, and is seems to be the same for
other allocators. I think it is better to guarantee (and
document) that errno does not change by pfree(), rather than to
protect in the caller side.a

> 2.  It'd be nice if function documentation explicitly said which
> functions set errno on error (and perhaps which syscalls).

I agree about errno. I'm not sure about syscall (names?).

> 3.  Why does some code use "file < 0" and other code "file <= 0" to
> detect errors from fd.c functions that return File?

That seems just a thinko, or difference of how to think about
invalid (or impossible) values. Vfd == 0 is invalid and
impossible, so file <=0 and < 0 are effectively the
equivalents. I think we should treat 0 as error rather than
sucess. I don't think it worth to do Assert(file != 0).

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Strange coding in _mdfd_openseg()

От
Noah Misch
Дата:
On Thu, Apr 04, 2019 at 12:15:52PM +0900, Kyotaro HORIGUCHI wrote:
> At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund <andres@anarazel.de> wrote in
<20190403204746.2yumq7c2mirmodsg@alap3.anarazel.de>
> > On 2019-04-04 09:24:49 +1300, Thomas Munro wrote:
> > > On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > > > I may be missing something, but it seems possible that
> > > > _mdfd_getseg calls it with segno > opensegs.
> > > >
> > > > |     for (nextsegno = reln->md_num_open_segs[forknum];
> > > 
> > > Here nextsegno starts out equal to opensegs.
> > > 
> > > > |          nextsegno <= targetseg; nextsegno++)
> > > 
> > > Here we add one to nextsegno...
> > > 
> > > > | ...
> > > > |         v = _mdfd_openseg(reln, forknum, nextsegno, flags);
> > > 
> > > ... after adding one to opensegs.  So they're always equal.  This fits
> > > a general programming pattern when appending to an array, the next
> > > element's index is the same as the number of elements.  But I claim
> > > the coding is weird, because _mdfd_openseg's *looks* like it can
> > > handle opening segments in any order, except that the author
> > > accidentally wrote "<=" instead of ">=".  In fact it can't open them
> > > in any order, because we don't support "holes" in the array.  So I
> > > think it should really be "==", and it should be an assertion, not a
> > > condition.
> > 
> > Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion
> > of equality, or just invert the >= (which I agree I probably just
> > screwed up and didn't notice when reviewing the patch because it looked
> > close enough to correct and it didn't have a measurable effect).
> 
> I looked there and agreed. _mdfd_openseg is always called just to
> add one new segment after the last opened segment by the two
> callers. So I think == is better.

Agreed.  The rest of md.c won't cope with a hole in this array, so allowing
less-than-or-equal here is futile.  The patch in the original post looks fine.



Re: Strange coding in _mdfd_openseg()

От
Thomas Munro
Дата:
On Sun, Jan 26, 2020 at 8:23 AM Noah Misch <noah@leadboat.com> wrote:
> Agreed.  The rest of md.c won't cope with a hole in this array, so allowing
> less-than-or-equal here is futile.  The patch in the original post looks fine.

Thanks.  Pushed.