Re: varchar as primary key

From Tom Lane
Subject Re: varchar as primary key
Date
Msg-id 6171.1178307765@sss.pgh.pa.us
In reply to Re: varchar as primary key  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: varchar as primary key  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
Jeff Davis <pgsql@j-davis.com> writes:
> $ ./cmp
> locale set to: en_US.UTF-8
> strcmp time elapsed:  2034183 us
> strcoll time elapsed: 2019880 us

It's hardly credible that you could do either strcmp or strcoll in 2 nsec
on any run-of-the-mill hardware.  What I think is happening is that the
compiler is aware that these are side-effect-free functions and is
removing the calls entirely, or at least moving them out of the loops;
these times would be credible for loops consisting only of an increment,
test, and branch.
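
One way to make the comparison harder for the optimizer to discard is to route the string pointers and the result through volatile objects. The following is a minimal sketch, not part of the attached program; the helper name count_mismatches is hypothetical, and whether a given compiler still simplifies the call is best checked in the generated assembly.

```c
#include <string.h>

/* Hypothetical helper, not from the original program.
 * Counts how many of `iterations` strcmp() calls report a difference. */
long count_mismatches(const char *a, const char *b, long iterations)
{
    /* The pointer objects are volatile, so they must be reloaded on every
     * iteration and the compiler cannot assume what they point to. */
    const char *volatile va = a;
    const char *volatile vb = b;
    /* Each update of a volatile object is an observable side effect,
     * so the loop body cannot be dropped as dead code. */
    volatile long sink = 0;
    long i;

    for (i = 0; i < iterations; i++) {
        if (strcmp(va, vb) != 0)
            sink++;
    }
    return sink;
}
```

Calling count_mismatches(str1, str2, ITERATIONS) between two gettimeofday() calls would then time the real comparisons, at the cost of a little extra load/store work per iteration.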

Integer overflow in your elapsed-time calculation is probably a risk
as well --- do the reports add up to something like the actual elapsed
time?
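
How the original program computed its elapsed time is not shown here, so this is only a sketch of the overflow risk, assuming the arithmetic was done in a 32-bit int. A product such as tv_sec * 1000000 cannot fit in 32 bits for any current epoch time, and even a plain difference in microseconds wraps after about 2147 seconds. The helper name elapsed_us is hypothetical; it does the subtraction in 64-bit integers, as an alternative to the double arithmetic used in the attached program.

```c
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical helper: microseconds from *start to *end.
 * Subtract the seconds first, then scale, all in 64-bit arithmetic,
 * so neither the product nor the sum can wrap a 32-bit int. */
int64_t elapsed_us(const struct timeval *start, const struct timeval *end)
{
    int64_t sec = (int64_t) end->tv_sec - (int64_t) start->tv_sec;
    int64_t usec = (int64_t) end->tv_usec - (int64_t) start->tv_usec;

    /* usec may be negative when end->tv_usec < start->tv_usec;
     * the sum is still the correct elapsed time. */
    return sec * 1000000 + usec;
}
```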

I tried a modified form of your program (attached) on an FC6 machine
and found that at any optimization level above -O0, that compiler
optimizes the strcmp() case into oblivion, even with code added as below
to try to make it look like a real operation.  The strcoll() call without
any following test, as you had, draws a warning about "statement with
no effect" which is pretty suspicious too.  With the modified program
I get

$ gcc -O1 -Wall cmptest.c
$ time ./a.out
locale set to: en_US.UTF-8
strcmp time elapsed:  0 us
strcoll time elapsed: 67756363 us

real    1m7.758s
user    1m7.746s
sys     0m0.006s

$ gcc -O0 -Wall cmptest.c
$ time ./a.out
locale set to: en_US.UTF-8
strcmp time elapsed:  4825504 us
strcoll time elapsed: 68864890 us

real    1m13.692s
user    1m13.676s
sys     0m0.010s

So as best I can tell, strcoll() is pretty dang expensive on Linux too.
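
Dividing the reported times by the 100000000 iterations gives a per-call cost. The helper below is a hypothetical illustration of that arithmetic, not part of the attached program. With the -O0 figures above it works out to roughly 48 ns per strcmp() and roughly 689 ns per strcoll(), a ratio of about 14.

```c
/* Hypothetical helper: average cost of one call, in nanoseconds,
 * given the total elapsed time in microseconds and the loop count. */
double ns_per_call(double elapsed_us, double iterations)
{
    return elapsed_us * 1000.0 / iterations;
}
```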

            regards, tom lane

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <locale.h>
#include <sys/time.h>

#define ITERATIONS 100000000
#define THE_LOCALE "en_US.UTF-8"

int main(int argc, char *argv[]) {
    int i;
    char *str1 = "abcdefghijklmnop1";
    char *str2 = "abcdefghijklmnop2";
    char *newlocale;
    struct timeval t1,t2,t3;
    double elapsed_strcmp,elapsed_strcoll;

    if( (newlocale = setlocale(LC_ALL,THE_LOCALE)) == NULL ) {
        printf("error setting locale!\n");
        exit(1);
    }
    else {
        printf("locale set to: %s\n",newlocale);
    }

    gettimeofday(&t1,NULL);
    /* Test each result so the call is not a statement with no effect. */
    for(i=0; i < ITERATIONS; i++) {
        if (strcmp(str1,str2) == 0)
            printf("unexpected equality\n");
    }
    gettimeofday(&t2,NULL);
    /* Same loop using the locale-aware comparison. */
    for(i=0; i < ITERATIONS; i++) {
        if (strcoll(str1,str2) == 0)
            printf("unexpected equality\n");
    }
    gettimeofday(&t3,NULL);
    /* The 1000000.0 constant makes this double arithmetic, not integer. */
    elapsed_strcmp = (t2.tv_sec * 1000000.0 + t2.tv_usec) - (t1.tv_sec * 1000000.0 + t1.tv_usec);
    elapsed_strcoll = (t3.tv_sec * 1000000.0 + t3.tv_usec) - (t2.tv_sec * 1000000.0 + t2.tv_usec);
    printf("strcmp time elapsed:  %.0f us\n",elapsed_strcmp);
    printf("strcoll time elapsed: %.0f us\n",elapsed_strcoll);

    return 0;
}
