Re: TOAST usage setting

From: Bruce Momjian
Subject: Re: TOAST usage setting
Msg-id: 200705300220.l4U2KR225227@momjian.us
In reply to: TOAST usage setting  (Bruce Momjian <bruce@momjian.us>)
Responses: Re: TOAST usage setting  (Tom Lane <tgl@sss.pgh.pa.us>)
           Re: TOAST usage setting  ("Zeugswetter Andreas ADI SD" <ZeugswetterA@spardat.at>)
           Re: TOAST usage setting  (Heikki Linnakangas <heikki@enterprisedb.com>)
List: pgsql-hackers

Bruce Momjian wrote:
> Gregory Stark wrote:
> > "Bruce Momjian" <bruce@momjian.us> writes:
> > 
> > >> No, we did substring() too :)
> > >
> > > Uh, I looked at text_substring(), and while there is an optimization to
> > > do character counting for encoding length == 1, it is still accessing
> > > the data.
> > 
> > Sure but it'll only access the first chunk. There are two chunks in your test.
> > It might be interesting to run tests accessing 0 (length()), 1 (substr()), and
> > 2 chunks (hashtext()).
> > 
> > Or if you're concerned with the cpu cost of hashtext you could calculate the
> > precise two bytes you need to access with substr to force it to load both
> > chunks. But I think the real cost of unnecessary toasting is the random disk
> > i/o so the cpu cost is of secondary interest.
> 
> OK, will run a test with hashtext().  What I am seeing now is a 10-20x
> slowdown to access the TOAST data, and a 0-1x speedup for accessing the
> non-TOAST data when the rows are long:

I reran the tests with hashtext(), and created a SUMMARY.HTML chart:
http://momjian.us/expire/TOAST/
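
To make the 0-, 1-, and 2-chunk access cases discussed above concrete, here is a rough Python sketch of which TOAST chunks a byte range would touch. It is only a model, not PostgreSQL code: `CHUNK_SIZE` is an assumed round number (the real chunk limit depends on the PostgreSQL version and block size), the function names are hypothetical, and it assumes the value is stored out of line and uncompressed, so that a slice can be fetched chunk by chunk.

```python
# Hypothetical model of TOAST chunking; not taken from the PostgreSQL source.
# CHUNK_SIZE is an ASSUMED value; the real limit varies by version and block size.
CHUNK_SIZE = 2000


def chunk_count(value_len, chunk_size=CHUNK_SIZE):
    """Number of chunks needed to store value_len bytes."""
    return -(-value_len // chunk_size)  # ceiling division


def chunks_touched(start, length, chunk_size=CHUNK_SIZE):
    """0-based chunk indexes covered by bytes [start, start + length).

    start is a 0-based byte offset (SQL substr() positions are 1-based).
    """
    if length <= 0:
        return []  # e.g. a length()-style access that reads no data bytes
    first = start // chunk_size
    last = (start + length - 1) // chunk_size
    return list(range(first, last + 1))


def straddling_slice(chunk_size=CHUNK_SIZE):
    """Smallest (start, length) slice that spans chunks 0 and 1."""
    return (chunk_size - 1, 2)


if __name__ == "__main__":
    print(chunk_count(3000))                    # chunks for a 3000-byte value
    print(chunks_touched(0, 10))                # a short prefix read
    print(chunks_touched(*straddling_slice()))  # two bytes across the boundary
```

Under these assumptions, a short prefix read stays in the first chunk, while a two-byte slice placed on the chunk boundary forces both chunks to be loaded without the CPU cost of hashing the whole value.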

What you will see is that pushing TEXT values out to a TOAST table allows
quick access to non-TOAST values and single-row TOAST values, but
accessing all of the TOASTed values is slower than accessing them in the
heap, by a factor of 3-18x.

Looking at the chart, 512 bytes seems to be the proper breakpoint for
TOAST: it gives us a 2x improvement in accessing non-TOAST values and
single-row TOAST values, and accessing all TOAST values is only 2x
slower than it is now.

Of course, these tests had all the data in the cache; if the cache is
limited, pushing more to TOAST is going to be a bigger win.  In general,
I would guess that accesses that read all of the >512-byte values are
much less common than accesses that would be sped up by pushing those
>512-byte values to TOAST.
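
As a back-of-the-envelope check on that guess, the sketch below computes the break-even point for the 512-byte figures quoted above. It is a deliberately crude model, not a measurement: it assumes every access costs 1.0 without the lower threshold, that accesses reading all TOAST values become 2x slower, and that all other accesses become 2x faster. The function names and the equal-baseline-cost assumption are illustrative only.

```python
# Crude cost model; the 2x factors come from the 512-byte figures in this
# message, everything else (equal baseline cost, names) is an ASSUMPTION.


def relative_cost(f_all_toast, speedup_other=2.0, slowdown_all_toast=2.0):
    """Average cost per access relative to the current baseline (1.0).

    f_all_toast is the fraction of accesses that read all TOAST values.
    """
    return (1.0 - f_all_toast) / speedup_other + f_all_toast * slowdown_all_toast


def break_even_fraction(speedup_other=2.0, slowdown_all_toast=2.0):
    """Fraction of all-TOAST accesses at which the change is cost-neutral."""
    # Solve (1 - f) / g + f * s = 1 for f.
    g, s = speedup_other, slowdown_all_toast
    return (1.0 - 1.0 / g) / (s - 1.0 / g)


if __name__ == "__main__":
    f = break_even_fraction()
    print(f"break-even fraction: {f:.3f}")
    print(f"relative cost at 10% all-TOAST accesses: {relative_cost(0.10):.2f}")
```

With both factors at 2x, the model breaks even when one third of accesses read all TOAST values; below that share, the lower threshold comes out ahead, and a limited cache would presumably shift the balance further.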

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

