Discussion: Bug #610: collation fails sorting because of strcoll() bug

Bug #610: collation fails sorting because of strcoll() bug

From
pgsql-bugs@postgresql.org
Date:
Mathias August Gruber (mgruber) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
collation fails sorting because of strcoll() bug

Long Description
Hi there,

About two years ago I tried to migrate an MS SQL Server database to PostgreSQL and could not make things
work because I needed collation.
 
Although the documentation states that collation will work, this is not true for strings that contain blanks.
What happens is that the strings are sorted as if they had no spaces.
This was really bad.
I have now taken up this project again and noticed the problem is still there. So I started to read all the docs and the
source code and ran lots of tests.
 
Your regression tests also fall short on this topic: they only sort single-word strings.

Now I have a verdict: the problem is in the GNU C library's strcoll()
function.

I have attached a little C program that reproduces this behavior. Just
compile it (and don't forget to set LC_ALL to any Western-language locale; I've
tested with pt_BR, but the problem occurs with almost any other
configuration).

Hope I could help you with this superb project.

Very Best Regards


Sample Code

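#include <stdio.h>
#include <stdlib.h>   /* qsort */
#include <string.h>   /* strcmp, strcoll, memcpy */
#include <strings.h>  /* strcasecmp */
#include <locale.h>

/* qsort() expects int (*)(const void *, const void *), so each string
   comparison function is wrapped instead of being passed directly. */
static int cmp_strcmp(const void *a, const void *b)
{
    return strcmp(a, b);
}

static int cmp_strcasecmp(const void *a, const void *b)
{
    return strcasecmp(a, b);
}

static int cmp_strcoll(const void *a, const void *b)
{
    return strcoll(a, b);
}

int main(void)
{
    int i;
    char src[4][32] =
    {
        "Joseval Almeida",
        "Jose Valter",
        "JOSE CAMARGO",
        "Jose Americo",
    };
    char arr[4][32];

    memcpy(arr, src, sizeof(src));

    /* Use current locale settings (in my case LC_ALL=pt_BR), that uses
    conventional LATIN 1 collation settings. */
    setlocale(LC_ALL, "");

    /* Print current array */
    puts("The input array is:\n");
    for(i = 0; i < 4; i++)
        puts(arr[i]);

    /* Sort the array by byte value */
    qsort(arr, 4, sizeof(arr[0]), cmp_strcmp);

    /* Print the output */
    puts("\nThe strcmp sorted array is:\n");
    for(i = 0; i < 4; i++)
        puts(arr[i]);

    /* Restore the input and sort it ignoring case */
    memcpy(arr, src, sizeof(src));
    qsort(arr, 4, sizeof(arr[0]), cmp_strcasecmp);

    /* Print the output */
    puts("\nThe strcasecmp sorted array is:\n");
    for(i = 0; i < 4; i++)
        puts(arr[i]);

    /* Restore the input and sort it with the locale's collation rules */
    memcpy(arr, src, sizeof(src));
    qsort(arr, 4, sizeof(arr[0]), cmp_strcoll);

    /* Print the output */
    puts("\nThe strcoll sorted array is:\n");
    for(i = 0; i < 4; i++)
        puts(arr[i]);

    return 0;
}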


No file was uploaded with this report
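
The symptom described in the report, multi-word strings ordered as if the blanks were absent, can be imitated without any locale support. The following is only a minimal sketch and not glibc's actual collation algorithm: the helper names cmp_ignoring_blanks and sort_names_ignoring_blanks are hypothetical, and the comparison simply skips spaces and folds case with tolower().

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of any library): compare two strings
   while skipping blanks and folding case. This is a rough model of a
   collation that ignores spaces, NOT the real strcoll() algorithm. */
static int cmp_ignoring_blanks(const void *a, const void *b)
{
    const unsigned char *s = a;
    const unsigned char *t = b;

    for (;;) {
        int cs, ct;

        /* Skip any blanks on both sides before comparing. */
        while (*s == ' ')
            s++;
        while (*t == ' ')
            t++;

        cs = tolower(*s);
        ct = tolower(*t);
        if (cs != ct)
            return cs - ct;
        if (cs == '\0')
            return 0;   /* both strings ended at the same time */
        s++;
        t++;
    }
}

/* Sort an array of fixed-width 32-byte name buffers in place. */
static void sort_names_ignoring_blanks(char (*names)[32], size_t count)
{
    qsort(names, count, sizeof names[0], cmp_ignoring_blanks);
}
```

Applied to the four names from the sample program, this model yields Jose Americo, JOSE CAMARGO, Joseval Almeida, Jose Valter: "Joseval Almeida" lands between the "Jose ..." entries because the blank carries no weight in the comparison.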

Re: Bug #610: collation fails sorting because of strcoll() bug

From
Tom Lane
Date:
pgsql-bugs@postgresql.org writes:
> collation fails sorting because of strcoll() bug

Sorry, strcoll() is behaving as defined.  If you don't like it, use
a different LOCALE.  Or at the very least, complain to the strcoll
authors, not us ;-)

            regards, tom lane
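
One way to read the suggestion to use a different locale is to select the collation locale explicitly before calling strcoll(). The sketch below assumes the same fixed-width 32-byte name buffers as the sample program; sort_names_in_locale and cmp_strcoll are hypothetical helper names, and only setlocale(), strcoll() and qsort() are standard calls. In the "C" locale strcoll() is expected to order strings by byte value, as strcmp() does, so a blank sorts before any letter.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort()-compatible wrapper around strcoll(). */
static int cmp_strcoll(const void *a, const void *b)
{
    return strcoll((const char *)a, (const char *)b);
}

/* Hypothetical helper: sort fixed-width 32-byte name buffers with
   strcoll() under the given LC_COLLATE locale.
   Returns 0 on success, -1 if the locale is not available. */
static int sort_names_in_locale(char (*names)[32], size_t count,
                                const char *locale)
{
    if (setlocale(LC_COLLATE, locale) == NULL)
        return -1;
    qsort(names, count, sizeof names[0], cmp_strcoll);
    return 0;
}
```

With locale set to "C", the four sample names come out as JOSE CAMARGO, Jose Americo, Jose Valter, Joseval Almeida. Passing a name such as "pt_BR" instead depends on that locale being installed on the system, which is why the helper reports failure rather than sorting silently under the previous locale.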