Re: Bug in UTF8-Validation Code?

From	Andrew Dunstan
Subject	Re: Bug in UTF8-Validation Code?
Date	March 17, 2007, 20:09:15
Msg-id	45FC7513.8040206@dunslane.net
In response to	Re: Bug in UTF8-Validation Code? (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers


Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> Here are some timing tests in 1m rows of random utf8 encoded 100 char 
>> data. It doesn't look to me like the saving you're suggesting is worth 
>> the trouble.
>>     
>
> Hmm ... not sure I believe your numbers.  Using a test file of 1m lines
> of 100 random latin1 characters converted to utf8 (thus, about half and
> half 7-bit ASCII and 2-byte utf8 characters), I get this in SQL_ASCII
> encoding:
>
> regression=# \timing
> Timing is on.
> regression=# create temp table test(f1 text);
> CREATE TABLE
> Time: 5.047 ms
> regression=# copy test from '/home/tgl/zzz1m';
> COPY 1000000
> Time: 4337.089 ms
>
> and this in UTF8 encoding:
>
> utf8=# \timing
> Timing is on.
> utf8=# create temp table test(f1 text);
> CREATE TABLE
> Time: 5.108 ms
> utf8=# copy test from '/home/tgl/zzz1m';
> COPY 1000000
> Time: 7776.583 ms
>
> The numbers aren't super repeatable, but it sure looks to me like the
> encoding check adds at least 50% to the runtime in this example; so
> doing it twice seems unpleasant.
>  
> (This is CVS HEAD, compiled without assert checking, on an x86_64
> Fedora Core 6 box.)
>
>   

Are you comparing apples with apples? The db is utf8 in both of my cases.

cheers

andrew
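
For anyone who wants to rerun the comparison on identical input, here is a rough sketch of how a test file like the one Tom describes (1m lines of 100 random latin1 characters, written out as utf8) might be generated. The function names, the fixed seed, the 50/50 split between ASCII and high-half characters, and the choice to skip backslash (so COPY's text format does not treat anything as an escape) are all assumptions for illustration, not what either poster actually used.

```python
import random

# Printable 7-bit ASCII, minus backslash (assumed exclusion: COPY's text
# format treats backslash as an escape character).
ASCII_POOL = [chr(c) for c in range(0x20, 0x7F) if chr(c) != "\\"]
# Upper half of Latin-1; each of these code points encodes to 2 bytes in UTF-8.
HIGH_POOL = [chr(c) for c in range(0xA0, 0x100)]


def random_line(rng, length=100):
    """Return `length` random characters, roughly half ASCII, half 2-byte."""
    return "".join(
        rng.choice(ASCII_POOL if rng.random() < 0.5 else HIGH_POOL)
        for _ in range(length)
    )


def write_test_file(path, rows=1_000_000, length=100, seed=0):
    """Write `rows` lines of random data to `path`, encoded as UTF-8."""
    rng = random.Random(seed)  # fixed seed so every run sees the same data
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for _ in range(rows):
            f.write(random_line(rng, length) + "\n")
```

Calling `write_test_file("zzz1m")` (hypothetical file name) and loading the result with COPY into both an SQL_ASCII and a UTF8 database would at least make sure the two timings are taken over the same bytes.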
