Re: A speed comparison with sqlite

Поиск

Список

Период

Сортировка

От	drh@hwaci.com
Тема	Re: A speed comparison with sqlite
Дата	22 января 2008 г. 14:29:25
Msg-id	2454484.483031911258676100307556033850405637744713689@hwaci.com обсуждение исходный текст
Ответ на	A speed comparison with sqlite (Shane Ambler <pgsql@Sheeky.Biz>)
Список	pgsql-advocacy

Дерево обсуждения

> A guy using RealBasic (GUI development IDE) which uses sqlite as it's
> inbuilt db engine wanted to generate a series of unique codes and
> decided sql will help (wanting 30 million codes starts to rule out ram
> based solutions)
>
> This is a 7 character alphanumeric code.
>
> His program generated 30M codes in about 15 minutes into the sqlite db
> file without any duplicate checks.
>
> Select distinct(pincode) from codes; returned the results (that is to
> his client not out to file) after 22 hours and he was after a faster
> solution.
>
> Using a unique index to check as he went, inserts dropped from 18,000
> per second to about 200 a second after 8 hours (without completing)
>
> (the following times are taken from psql's timing option)
>
> With a P4 3Ghz - 80GB IDE drive - running XP pro - pg 8.2.6 -
> ...
> The same hardware running FreeBSD 7.0RC1 GENERIC Kernel -
>
> create the table and insert 30M codes in 5.6 mins
>
> select count(distinct(pincode)) from codes; took 5.9 minutes
> (29,993,279 unique codes of the 30M created)
>
> setting output to a text file - select distinct(pincode) from codes took
> 9.7 minutes to send them all out to file.
>
> It would appear that the method of locating distinct results is the key
> here - postgresql sorts the distinct column allowing it to identify a
> duplicate or new value quicker - sqlite returns the distinct columns in
> the order they are inserted - so it must be scanning all previous
> entries in the output to ensure distinct.
>

Come on, guys.  SQLite does have performance issues with sorting
large tables.  It is something we are working on.  But it isn't
*that* bad.  I reran the test with no special tricks and SQLite
took 15 minutes.  Still 3x slower than PostgreSQL, I admit.  But
it is within a factor of 3....

Here is the output I get when I create 2^25 random 7-character
strings, then did a CREATE TABLE t2 AS SELECT DISTINCT * FROM t1:

   2008-01-18 23:29:22: i=16
   2008-01-18 23:29:22: i=32
   2008-01-18 23:29:22: i=64
   2008-01-18 23:29:22: i=128
   2008-01-18 23:29:22: i=256
   2008-01-18 23:29:22: i=512
   2008-01-18 23:29:22: i=1024
   2008-01-18 23:29:22: i=2048
   2008-01-18 23:29:22: i=4096
   2008-01-18 23:29:22: i=8192
   2008-01-18 23:29:22: i=16384
   2008-01-18 23:29:22: i=32768
   2008-01-18 23:29:23: i=65536
   2008-01-18 23:29:23: i=131072
   2008-01-18 23:29:24: i=262144
   2008-01-18 23:29:25: i=524288
   2008-01-18 23:29:27: i=1048576
   2008-01-18 23:29:31: i=2097152
   2008-01-18 23:29:40: i=4194304
   2008-01-18 23:29:59: i=8388608
   2008-01-18 23:30:37: i=16777216
   2008-01-18 23:31:54: start the distinct...
   2008-01-18 23:46:46: finish the distinct...

So you can see, the SELECT distinct took about 15 minutes.
The TCL script used to run this test is:

   sqlite db test.db
   db eval {
     CREATE TABLE t1(pincode TEXT);
     INSERT INTO t1 VALUES(randstr(7,7));
   }
   proc msg {txt} {
     set now [db one {SELECT datetime('now')}]
     puts "$now: $txt"
     flush stdout
   }
   for {set i 1} {$i<30000000} {incr i $i} {
     msg "i=$i"
     flush stdout
     db eval {
       INSERT INTO t1 SELECT randstr(7,7) FROM t1;
     }
   }
   msg "start the distinct..."
   db eval {
     CREATE TABLE t2 AS SELECT distinct(pincode) FROM t1;
   }
   msg "finish the distinct..."

--
D. Richard Hipp <drh@hwaci.com>

В списке pgsql-advocacy по дате отправления:

Предыдущее

От: "Dave Page"
Дата: 22 января 2008 г., 13:35:29
Сообщение: Re: Merchandise FOSDEM 2008

Следующее

От: Luke Lonergan
Дата: 22 января 2008 г., 14:32:40
Сообщение: Sun and SAP invest in Greenplum

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: A speed comparison with sqlite

Предыдущее

Следующее