Re: EXTERNAL storage and substring on long strings

Поиск

Список

Период

Сортировка

От	Joe Conway
Тема	Re: EXTERNAL storage and substring on long strings
Дата	6 августа 2003 г. 17:56:36
Msg-id	3F316A41.6020403@joeconway.com обсуждение исходный текст
Ответ на	Re: EXTERNAL storage and substring on long strings (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы	Re: EXTERNAL storage and substring on long strings
Список	pgsql-performance

Дерево обсуждения

Tom Lane wrote:
> Scott Cain <cain@cshl.org> writes:
>
>>A few days ago, I asked for advice on speeding up substring queries on
>>the GENERAL mailing list.  Joe Conway helpfully pointed out the ALTER
>>TABLE STORAGE EXTERNAL documentation.  After doing the alter,
>>the queries got slower!  Here is the background:
>
> Ah-hah, I've sussed it ... you didn't actually change the storage
> representation.  You wrote:

Yeah, I came to the same conclusion this morning (update longdna set dna
= dna || '';), but it still seems that the chunked table is very
slightly faster than the substring on the  externally stored column:

dna=# explain analyze select pdna from dna where foffset > 6000000 and
foffset < 6024000;
                                                     QUERY PLAN
------------------------------------------------------------------------------------------------------------------
  Index Scan using foffset_idx on dna  (cost=0.00..4.22 rows=14
width=32) (actual time=0.06..0.16 rows=11 loops=1)
    Index Cond: ((foffset > 6000000) AND (foffset < 6024000))
  Total runtime: 0.27 msec
(3 rows)

dna=# explain analyze select pdna from dna where foffset > 6000000 and
foffset < 6024000;
                                                     QUERY PLAN
------------------------------------------------------------------------------------------------------------------
  Index Scan using foffset_idx on dna  (cost=0.00..4.22 rows=14
width=32) (actual time=0.07..0.16 rows=11 loops=1)
    Index Cond: ((foffset > 6000000) AND (foffset < 6024000))
  Total runtime: 0.25 msec
(3 rows)

dna=# explain analyze select substr(dna,6002000,20000) from longdna;
                                            QUERY PLAN
------------------------------------------------------------------------------------------------
  Seq Scan on longdna  (cost=0.00..1.01 rows=1 width=32) (actual
time=0.46..0.47 rows=1 loops=1)
  Total runtime: 0.58 msec
(2 rows)

dna=# explain analyze select substr(dna,6002000,20000) from longdna;
                                            QUERY PLAN
------------------------------------------------------------------------------------------------
  Seq Scan on longdna  (cost=0.00..1.01 rows=1 width=32) (actual
time=0.23..0.24 rows=1 loops=1)
  Total runtime: 0.29 msec
(2 rows)

I ran each command twice after starting psql to observe the effects of
caching.

However with the provided sample data, longdna has only one row, and dna
has 122,540 rows, all of which are chunks of the one longdna row. I
would tend to think that if you had 1000 or so longdna records indexed
on some id column, versus 122,540,000 dna chunks indexed on both an id
and segment column, the substring from longdna would win.

Joe

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: EXTERNAL storage and substring on long strings