Re: Inconsistent results with libc sorting on Windows

From Noah Misch
Subject Re: Inconsistent results with libc sorting on Windows
Date
Msg-id 20230811052944.GB3559080@rfd.leadboat.com
In response to Re: Inconsistent results with libc sorting on Windows  (Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>)
Responses Re: Inconsistent results with libc sorting on Windows
List pgsql-hackers
On Wed, Jun 14, 2023 at 12:50:28PM +0200, Juan José Santamaría Flecha wrote:
> On Wed, Jun 14, 2023 at 4:13 AM Peter Geoghegan <pg@bowt.ie> wrote:
> 
> > On Tue, Jun 13, 2023 at 5:59 PM Thomas Munro <thomas.munro@gmail.com>
> > wrote:
> > > Trying to follow along here... you're doing the moral equivalent of
> > > strxfrm(), so sort keys have the transitive property but direct string
> > > comparisons don't?  Or is this because LCIDs reach a different
> > > algorithm somehow (or otherwise why do you need to use LCIDs for this,
> > > when there is a non-LCID version of that function, with a warning not
> > > to use the older LCID version[1]?)
> >
> > I'm reminded of the fact that the abbreviated keys strxfrm() debacle
> > (back when 9.5 was released) was caused by a bug in strcoll() -- not a
> > bug in strxfrm() itself. From our point of view the problem was that
> > strxfrm() failed to be bug compatible with strcoll() due to a buggy
> > strcoll() optimization.
> >
> > I believe that strxfrm() is generally less likely to have bugs than
> > strcoll(). There are far fewer opportunities to dodge unnecessary work
> > in the case of strxfrm()-like algorithms (offering something like
> > ICU's pg_strnxfrm_prefix_icu() prefix optimization is the only one).
> > On the other hand, collation library implementers are likely to
> > heavily optimize strcoll() for typical use-cases such as sorting and
> > binary search. Using strxfrm() for everything is discouraged [1].
> >
> 
> Yes, I think the situation is quite similar to what you describe, with its
> WIN32 peculiarities. Take for example the attached program, it'll output:
> 
> s1 = s2
> s2 = s3
> s1 > s3
> c1 > c2
> c2 > c3
> c1 > c3
> 
> As you can see the test for CompareStringEx() is broken, but we get a sane
> answer with LCMapStringEx().

The LCMapStringEx() solution is elegant.  I do see
https://learn.microsoft.com/en-us/windows/win32/intl/handling-sorting-in-your-applications
says, "If an application calls the function to create a sort key for a string
containing an Arabic kashida, the function creates no sort key value."  That's
aggravating.
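
For anyone who wants to check the same property away from Windows, here is a
minimal sketch using the portable strcoll()/strxfrm() pair in place of
CompareStringEx()/LCMapStringEx().  It is not the attached program from
upthread; the helper names (cmp_direct, cmp_by_sort_key, triple_is_consistent)
are made up for illustration, and it only tests one ordering of a triple.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: a comparator returning -1, 0, or 1. */
typedef int (*cmp_fn)(const char *, const char *);

/* Reduce any integer to its sign: -1, 0, or 1. */
int sign(int v) { return (v > 0) - (v < 0); }

/* Direct comparison, the strcoll() analogue of CompareStringEx(). */
int cmp_direct(const char *a, const char *b)
{
    return sign(strcoll(a, b));
}

/* Comparison through sort keys, the strxfrm() analogue of LCMapStringEx(). */
int cmp_by_sort_key(const char *a, const char *b)
{
    /* strxfrm() with a zero-length buffer reports the key length needed. */
    size_t la = strxfrm(NULL, a, 0) + 1;
    size_t lb = strxfrm(NULL, b, 0) + 1;
    char *ka = malloc(la);
    char *kb = malloc(lb);
    int r;

    if (ka == NULL || kb == NULL)
        abort();
    strxfrm(ka, a, la);
    strxfrm(kb, b, lb);
    r = sign(strcmp(ka, kb));   /* sort keys compare bytewise */
    free(ka);
    free(kb);
    return r;
}

/*
 * Return 1 if cmp is transitive on (a, b, c) in this order, 0 otherwise.
 * The failure upthread (s1 = s2, s2 = s3, s1 > s3) makes this return 0.
 */
int triple_is_consistent(cmp_fn cmp, const char *a, const char *b,
                         const char *c)
{
    assert(cmp != NULL);
    int ab = cmp(a, b), bc = cmp(b, c), ac = cmp(a, c);

    if (ab <= 0 && bc <= 0)
        return ac == ((ab == 0 && bc == 0) ? 0 : -1);
    if (ab >= 0 && bc >= 0)
        return ac == 1;         /* at least one of ab, bc is strictly 1 here */
    return 1;                   /* mixed signs imply nothing about a vs. c */
}
```

Calling setlocale(LC_COLLATE, "") first and then running
triple_is_consistent() with each comparator over all orderings of the
suspect strings would show whether only the direct path misbehaves.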


