Обсуждение: [PATCH] btree_gist: fix union implementation for variable length columns

Поиск
Список
Период
Сортировка

[PATCH] btree_gist: fix union implementation for variable length columns

От
Pavel Raiskup
Дата:
Hi hackers,

while I tried to debug 'gcc -fstack-protector -O3' problems in [1], I noticed
that gbt_var_union() mistreats the first vector element.  Patch is attached.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1544349
Pavel

Вложения

Re: [PATCH] btree_gist: fix union implementation for variable length columns

От
Tom Lane
Дата:
Pavel Raiskup <praiskup@redhat.com> writes:
> while I tried to debug 'gcc -fstack-protector -O3' problems in [1], I noticed
> that gbt_var_union() mistreats the first vector element.  Patch is attached.

Hi Pavel!  For patches that purport to resolve bugs, we usually like to
add a regression test case that demonstrates the bug in unpatched code.
Can you provide a small test case that does so?  (The BZ you pointed to
doesn't seem to address this...)

            regards, tom lane


Re: [PATCH] btree_gist: fix union implementation for variable length columns

От
Pavel Raiskup
Дата:
Hi Tom,

On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote:
> Pavel Raiskup <praiskup@redhat.com> writes:
> > while I tried to debug 'gcc -fstack-protector -O3' problems in [1], I noticed
> > that gbt_var_union() mistreats the first vector element.  Patch is attached.
> 
> Hi Pavel!  For patches that purport to resolve bugs, we usually like to
> add a regression test case that demonstrates the bug in unpatched code.
> Can you provide a small test case that does so?  (The BZ you pointed to
> doesn't seem to address this...)

I've been looking at the reproducer for a while now...  and I probably
need a hint if that's necessary.

Turns out the problem is only related to bit/bit varying type (so the
patch comments need to be reworded properly, at least) since those are the
only types which have implemented the f_l2n() callback.

Considering testcase 'bit.sql' (bit(33) type), it's pretty clear that on
little endian boxes the problematic strings would be '00100001*' (int32
value for '33', because there's varbit header).  Big endian boxes would
have problems with char[] like {0, 0, 0, 33, ...}.  But I fail to
construct right order of correctly picked elements into 'bit.data'.
The problem probably is (a) that picksplit reorders the elements very
often, and probably that (b) random() brings non-determinism into the
final tree shape (if the reproducer was to be based on many equal keys in
the index).  It's amazing to see how the index fixes itself :-) for this
bug.

Can you help me with the reproducer idea, resp. can we live without the
reproducer in this particular case?

Pavel





Re: [PATCH] btree_gist: fix union implementation for variable length columns

От
Tom Lane
Дата:
Pavel Raiskup <praiskup@redhat.com> writes:
> On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote:
>> Hi Pavel!  For patches that purport to resolve bugs, we usually like to
>> add a regression test case that demonstrates the bug in unpatched code.
>> Can you provide a small test case that does so?  (The BZ you pointed to
>> doesn't seem to address this...)

> Turns out the problem is only related to bit/bit varying type (so the
> patch comments need to be reworded properly, at least) since those are the
> only types which have implemented the f_l2n() callback.

What I'm failing to wrap my head around is why this code exists at all.
AFAICS, gbt_bit_xfrm just forces the bitstring to be zero-padded out to
an INTALIGN boundary, which it wouldn't necessarily be otherwise.  But
why bother?  That should have no effect on the behavior of bit_cmp().
So I'm speculating that the reason nobody has noticed a problem is that
there is no problem because this code is useless and could be ripped out.

It would be easier to figure this out if the btree_gist code weren't
so desperately undocumented.  Teodor, do you remember why it's like
this?

            regards, tom lane


Re: [PATCH] btree_gist: fix union implementation for variable length columns

От
Pavel Raiskup
Дата:
On Wednesday, July 11, 2018 7:26:40 PM CEST Tom Lane wrote:
> Pavel Raiskup <praiskup@redhat.com> writes:
> > On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote:
> >> Hi Pavel!  For patches that purport to resolve bugs, we usually like to
> >> add a regression test case that demonstrates the bug in unpatched code.
> >> Can you provide a small test case that does so?  (The BZ you pointed to
> >> doesn't seem to address this...)
> 
> > Turns out the problem is only related to bit/bit varying type (so the
> > patch comments need to be reworded properly, at least) since those are the
> > only types which have implemented the f_l2n() callback.
> 
> What I'm failing to wrap my head around is why this code exists at all.
> AFAICS, gbt_bit_xfrm just forces the bitstring to be zero-padded out to
> an INTALIGN boundary, which it wouldn't necessarily be otherwise.
> But why bother?  That should have no effect on the behavior of bit_cmp().

The gbt_bit_xfrm() method actually also skips the varbit header (number of
bits, stored in VarBit.bit_len), the magic is done by VARBITS macro.  If
the header is dropped, it's OK to just use existing byteacmp() for bit
array comparison.

Pavel





Re: [PATCH] btree_gist: fix union implementation for variable lengthcolumns

От
Teodor Sigaev
Дата:
> It would be easier to figure this out if the btree_gist code weren't
> so desperately undocumented.  Teodor, do you remember why it's like
> this?

Will look.

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: [PATCH] btree_gist: fix union implementation for variable length columns

От
Pavel Raiskup
Дата:
On Thursday, July 12, 2018 5:53:18 PM CET Teodor Sigaev wrote:
> > It would be easier to figure this out if the btree_gist code weren't
> > so desperately undocumented.  Teodor, do you remember why it's like
> > this?
> 
> Will look.

Ping on this issue.  I guess the patch I proposed isn't wrong in this
case, I'm just not sure about the commit message.

Pavel