Re: EXTERNAL storage and substring on long strings

От: Tom Lane
Тема: Re: EXTERNAL storage and substring on long strings
Дата: ,
Msg-id: 7896.1060202191@sss.pgh.pa.us
(см: обсуждение, исходный текст)
Ответ на: EXTERNAL storage and substring on long strings  (Scott Cain)
Ответы: Re: EXTERNAL storage and substring on long strings  (Joe Conway)
Список: pgsql-performance

Скрыть дерево обсуждения

EXTERNAL storage and substring on long strings  (Scott Cain, )
 Re: EXTERNAL storage and substring on long strings  (Scott Cain, )
  Re: EXTERNAL storage and substring on long strings  (Tom Lane, )
  Re: EXTERNAL storage and substring on long strings  (Joe Conway, )
 Re: EXTERNAL storage and substring on long strings  (Scott Cain, )
  Re: EXTERNAL storage and substring on long strings  (Tom Lane, )
 Re: EXTERNAL storage and substring on long strings  (Joe Conway, )
  Re: EXTERNAL storage and substring on long strings  (Ron Johnson, )
   Re: EXTERNAL storage and substring on long strings  (Scott Cain, )
    Re: [SQL] EXTERNAL storage and substring on long strings  (Tom Lane, )
     Re: [SQL] EXTERNAL storage and substring on long strings  (Scott Cain, )
      Re: [SQL] EXTERNAL storage and substring on long strings  (Joe Conway, )
       Re: [SQL] EXTERNAL storage and substring on long strings  (Scott Cain, )
        Re: [SQL] EXTERNAL storage and substring on long strings  (Scott Cain, )
         Re: [SQL] EXTERNAL storage and substring on long strings  (Joe Conway, )
     Re: [SQL] EXTERNAL storage and substring on long strings  (Jan Wieck, )
      Re: [SQL] EXTERNAL storage and substring on long strings  (Tom Lane, )
       Re: [SQL] EXTERNAL storage and substring on long strings  (Scott Cain, )
       Re: [SQL] EXTERNAL storage and substring on long strings  (Jan Wieck, )
    Re: [SQL] EXTERNAL storage and substring on long strings  (Richard Huxton, )
     Re: [SQL] EXTERNAL storage and substring on long strings  (Scott Cain, )
      Re: [SQL] EXTERNAL storage and substring on long strings  ("Shridhar Daithankar", )
      Re: [SQL] EXTERNAL storage and substring on long strings  ("Matt Clark", )
  Re: EXTERNAL storage and substring on long strings  (Scott Cain, )
 Re: EXTERNAL storage and substring on long strings  (Tom Lane, )
 Re: EXTERNAL storage and substring on long strings  (Joe Conway, )
  Re: EXTERNAL storage and substring on long strings  (Scott Cain, )
 Re: EXTERNAL storage and substring on long strings  (Tom Lane, )
  Re: EXTERNAL storage and substring on long strings  (Joe Conway, )
   Re: EXTERNAL storage and substring on long strings  (Tom Lane, )
    Re: EXTERNAL storage and substring on long strings  (Scott Cain, )

Scott Cain <> writes:
> A few days ago, I asked for advice on speeding up substring queries on
> the GENERAL mailing list.  Joe Conway helpfully pointed out the ALTER
> TABLE STORAGE EXTERNAL documentation.  After doing the alter,
> the queries got slower!  Here is the background:

Ah-hah, I've sussed it ... you didn't actually change the storage
representation.  You wrote:

> Now, I'll change the storage:
>    alter table feature alter column residues set storage external;
> To make sure that really happens, I run an update on feature:
>    update feature set residues = residues where feature_id<8;
> and then VACUUM ANALYZE again.

This sounds good --- in fact, I think we all just accepted it when we
read it --- but in fact *that update didn't decompress the toasted data*.
The tuple toaster sees that the same toasted value is being stored back
into the row, and so it just re-uses the existing toasted data; it does
not stop to notice that the column storage preference has changed.

To actually get the storage to change, you need to feed the value
through some function or operator that will decompress it.  Then it
won't get recompressed when it's stored.  One easy way (since this
is a text column) is

    update feature set residues = residues || '' where feature_id<8;

To verify that something really happened, try doing VACUUM VERBOSE on
the table before and after.  The quoted number of tuples in the toast
table should rise substantially.

I did the following comparisons on the test data you made available,
using two tables in which one has default storage and one has "external"
(not compressed) storage:

scott=# \timing
Timing is on.
scott=# select length (dna) from edna;
  length
-----------
 245203899
(1 row)

Time: 1.05 ms
scott=# select length (dna) from ddna;
  length
-----------
 245203899
(1 row)

Time: 1.11 ms
scott=# select length(substring(dna from 1000000 for 20000)) from edna;
 length
--------
  20000
(1 row)

Time: 30.43 ms
scott=# select length(substring(dna from 1000000 for 20000)) from ddna;
 length
--------
  20000
(1 row)

Time: 37383.02 ms
scott=#

So it looks like the external-storage optimization for substring() does
work as expected, once you get the data into the right format ...

            regards, tom lane


В списке pgsql-performance по дате сообщения:

От: Tom Lane
Дата:
Сообщение: Re: How to efficiently duplicate a whole schema?
От: Yaroslav Mazurak
Дата:
Сообщение: Re: PostgreSQL performance problem -> tuning