Re: Applying TOAST to CURRENT

Поиск
Список
Период
Сортировка
От Hannu Krosing
Тема Re: Applying TOAST to CURRENT
Дата
Msg-id 3934AB49.5209569E@tm.ee
обсуждение исходный текст
Ответ на Re: Applying TOAST to CURRENT  (JanWieck@t-online.de (Jan Wieck))
Ответы Re: Applying TOAST to CURRENT  (JanWieck@t-online.de (Jan Wieck))
Список pgsql-hackers
Jan Wieck wrote:
> 
> The Hermit Hacker wrote:
> >
> > have to third this one ... I think it should be totally transparent to the
> > admin/user ... just create it when the table is created, what's the worst
> > case scenario?  it never gets used and you waste 16k of disk space?
> >
> 
>     Not exactly.
> 
>     I've  made  some  good  experiences  with  having the toaster
>     trying to keep the main tuple size below 1/4 of  MaxTupleSize
>     (BLKSIZE  -  block  header). 

Can't _that_ behaviour be made modifyable by some setting ?

>     Remember  that external stored
>     attributes are only fetched from the  secondary  relation  if
>     really  needed  (when the result set is sent to the client or
>     if explicitly used in the query). So in a usual case, where a
>     relatively  small  amount of the entire data is retrieved and
>     key attributes are small, it's a win. With this  config  more
>     main tuples fit into one block, and if the attributes used in
>     the WHERE clause  aren't  stored  external,  the  result  set
>     (including  sort  and  group  actions)  can be collected with
>     fewer block reads. Only those big  values,  that  the  client
>     really wanted, have to be fetched at send time.

What is the priority of checks on indexed fetch?

I mean if we do "SELECT * FROM ttable WHERE toasted LIKE 'ab%' "

DO we first scan by index to 'ab%', then check if tuple is live and 
after that to the LIKE comparison ?

Would it not be faster in toast case to use the already retrieved 
index data and check that first, before going to main table (not to 
mention the TOAST table)

>     If  no  external  table  exists, the toaster will try the <2K
>     thing by compression only. If the resulting tuple  fits  into
>     the  8K  limit,  it's OK. 

Would it not be faster/cleaner to check some configuration variable 
than the existance of toest table ?

>     But if a secondary relation exists,
>     it'll store external to make the tuple <2K.  Thus, a 4K or 6K
>     tuple,  that  actually  fits  and would be stored in the main
>     table, will cause the toaster to jump in if we allways create
>     the secondary table.

Do our current (btree/hash) indexes support toast ?

If not, will they ?

> 
>     Hmmm - thinking about that it doesn't sound bad if we allways
>     create a secondary relation at CREATE TABLE time, but NOT the
>     index  for  it.  And at VACUUM time we create the index if it
>     doesn't exist AND there is external stored data.

Is there a plan to migrate to some combined index/database table for 
at least toast tables later ? 

For at least toast tables it seems feasible to start using the 
originally planned tuple-spanning mechanisms, unless we plan 
migrating LOs to toast table at some point which would make index-less 
tuple chaining a bad idea as it would make seeking on really large 
LOs slow. 

>     The table is prepared for external  storage  allways  and  we
>     avoid  the  risks  from  creating  tables  in  possibly later
>     aborting transactions or due to concurrency  issues.  But  we
>     don't  waste  the  index space for really allways-small-tuple
>     tables.

That could perhaps be done for other tables too, ie CREATE INDEX 
would not actually create index until VACUUM notices that table is 
big enough to make use of that index ?

On second thought that seems not a good idea to me ;(

> 
>     Another benefit would  be,  that  reloads  should  be  faster
>     because  with  this  technique,  the  toaster doesn't need to
>     insert index tuples during the load. The indices are  created
>     later at VACUUM after reload.

AFAIK reloads (from pg_dump at least) create indexes after LOAD'ing data

>     The  toaster  needs  to  use sequential scans on the external
>     table until the next vacuum  run,  but  index  usage  allways
>     depends on vacuum so that's not a real issue from my PoV.
> 
>     At least a transparent compromise - isn't it?

But do we need it ?

I suspect there are other issues that need your attention more than 
complicating table creation to save a few kb ;)

Creating toast tables still wastes only 1MB per 64 tables _that have 
toastable columns_, which seems real cheap considering today's HD
prices.

You would need 6400 toast tables to consume 1% of the smallest currently 
available (10GB) disk.

If that is a concern this can probably be cured by good docs that say 
in detail which datatypes cause toast tables an which don't.

-----------
Hannu


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: SET FSYNC command?
Следующее
От: Zeugswetter Andreas SB
Дата:
Сообщение: AW: Applying TOAST to CURRENT