Re: [HACKERS] Open 6.4 items

Поиск
Список
Период
Сортировка
От Paul A Vixie
Тема Re: [HACKERS] Open 6.4 items
Дата
Msg-id 199810130737.AAA01665@bb.rc.vix.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Open 6.4 items  (Tom Ivar Helbekkmo <tih@nhh.no>)
Список pgsql-hackers
Wow.  Tough room!  I've got to admit that I don't have any test cases
with hex strings in them, and had the hex portion of inet_net_pton_ipv4()
ever been tested here it would have shown every error you folks found:

> From: Tom Ivar Helbekkmo <tih@nhh.no>
> Date: 09 Oct 1998 22:38:37 +0200
>
> As far as I can tell, the only actual error in Paul Vixie's code is that
> the two lines you quote above should be:
>
>             else if (--size > 0)
>                 *++dst = 0, dirty = 0;
>
> That is, size should be predecremented instead of postdecremented.
> I'm cc-ing Paul on this, as I assume he wants to get the fix into
> BIND, which is where inet_net_ntop() and inet_net_pton() came from.

As later bespoken by someone else, this is not yet the right answer.

> > BTW, shouldn't this routine clear out all "size" bytes of the
> > destination, even if the given data doesn't fill it all?  A memset
> > at the top might be worthwhile.

No.  "size" is the maximum I'm allowed to fill -- but I only touch the
bits I need, and I return the number of bits I touched.

> It does: there is code further down that handles that.  Another
> example of the obscure programming style: this function really shows
> what a low-level language C is!  :-)

C is not a language, it is a very smart macro assembler.  Modula-3 is a
language.  I'm told that Eiffel is a language.  Scheme is definitely a
language.  But C is not a language.  It's just what we all use :-).

More seriously, this function is called in the inner loop and speed was
important enough to trade off some simplicity for.  The amazing thing is,
because I was blasting the same byte of *dst twice per loop iteration, it
was actually sped up considerably by the patches suggested here.  I've got
egg on my face this time fershure man.

And as I said, if you think this routine is obscure, that just shows that
you've not looked at the rest of BIND.  This routine is at least self
contained and its bugs don't lean on other bugs.

> Date: Fri, 09 Oct 1998 17:20:22 -0400
> From: Tom Lane <tgl@sss.pgh.pa.us>
>
> > As far as I can tell, the only actual error in Paul Vixie's code is
> > that the two lines you quote above should be:
> >
> >             else if (--size > 0)
> >                 *++dst = 0, dirty = 0;
>
> No, that's still wrong, because it will error out (jump to emsgsize)
> one byte sooner than it should.  The loop is fundamentally broken
> because it wants to grab and zero a byte before it knows whether there
> are any digits for the byte.

Yes.

> From: Tom Ivar Helbekkmo <tih@nhh.no>
> Date: 10 Oct 1998 20:14:51 +0200
>
> > I think its behavior for an odd number of digits is wrong too, or at
> > least pretty nonintuitive.  I like my code a lot better ;-)
>
> Well, I like your code better too on closer inspection.  I changed it
> a bit, though, because I don't feel that an odd number of digits in
> the hex string is wrong at all -- it's just a representation of a bit
> string, with the representation padded to an even multiple of four
> bits.

I agree with this interpretation.  I'm now wracking my brain trying to
remember what other source file I got this code from, in what other
package, in what year, done for what project, because it's possible that
the bug was inherited.

> Anyway, here's what I ended up with for the code in question:
>
>         if (ch == '0'&& (src[0] == 'x'|| src[0] == 'X')
>                 && isascii(src[1]) && isxdigit(src[1]))
>         {
>                 /* Hexadecimal: Eat nybble string. */
>                 if (size <= 0)
>                         goto emsgsize;
>                 tmp = 0;
>                 dirty = 0;
>                 src++;                                  /* skip x or X. */
>                 while ((ch = *src++) != '\0'&&
>                            isascii(ch) && isxdigit(ch))
>                 {
>                         if (isupper(ch))
>                                 ch = tolower(ch);
>                         n = strchr(xdigits, ch) - xdigits;
>                         assert(n >= 0 && n <= 15);
>                         tmp = (tmp << 4) | n;
>                         if (++dirty == 2) {
>                                 if (size-- <= 0)
>                                         goto emsgsize;
>                                 *dst++ = (u_char) tmp;
>                                 tmp = 0, dirty = 0;
>                         }
>                 }
>                 if (dirty) {
>                         if (size-- <= 0)
>                                 goto emsgsize;
>                         tmp <<= 4;
>                         *dst++ = (u_char) tmp;
>                 }
>         }

Since that made some stylistic changes that had nothing to do with the
algorythm, it caught my eye (that's a hint.)  In keeping with its spirit,
I came up with the following.  Comments please?

        if (ch == '0' && (src[0] == 'x' || src[0] == 'X')
            && isascii(src[1]) && isxdigit(src[1])) {
                /* Hexadecimal: Eat nybble string. */
                if (size <= 0)
                        goto emsgsize;
                dirty = 0;
                src++;  /* skip x or X. */
                while ((ch = *src++) != '\0' && isascii(ch) && isxdigit(ch)) {
                        if (isupper(ch))
                                ch = tolower(ch);
                        n = strchr(xdigits, ch) - xdigits;
                        assert(n >= 0 && n <= 15);
                        if (dirty == 0)
                                tmp = n;
                        else
                                tmp = (tmp << 4) | n;
                        if (++dirty == 2) {
                                if (size-- <= 0)
                                        goto emsgsize;
                                *dst++ = (u_char) tmp;
                                dirty = 0;
                        }
                }
                if (dirty) {  /* Odd trailing nybble? */
                        if (size-- <= 0)
                                goto emsgsize;
                        *dst++ = (u_char) (tmp << 4);
                }
        }

For the record, I found my original code completely impenetrable, and I'd
like to both thank those of you who slogged through it, and apologize for
not testing every code path before I released it.  I oughta know better,
and you find folks deserve better.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: jwieck@debis.com (Jan Wieck)
Дата:
Сообщение: Re: [HACKERS] Dumping of views -- done!
Следующее
От: jwieck@debis.com (Jan Wieck)
Дата:
Сообщение: Re: [HACKERS] dynamic libraries