Re: Column storage positions

Поиск
Список
Период
Сортировка
От Florian G. Pflug
Тема Re: Column storage positions
Дата
Msg-id 45DCC441.5000709@phlo.org
обсуждение исходный текст
Ответ на Re: Column storage positions  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
Ответы Re: Column storage positions  (Andrew Dunstan <andrew@dunslane.net>)
Список pgsql-hackers
Stephan Szabo wrote:
> On Wed, 21 Feb 2007, Alvaro Herrera wrote:
>> Did I miss something in what you were trying to say?  I assume you must
>> already know this.
> 
> I think so. What I was mentioning was that I was pretty sure that there
> was a message with someone saying that they actually tried something that
> did this and that they found left-most varchar access was slightly slower
> after the reordering although general access was faster. I believe the
> table case was alternating smallint and varchar columns, but I don't know
> what was tested for the retrieval. If that turns out to be able to be
> supported by other tests, then for some access patterns, the rearranged
> version might be slower.

Here is the original quote:
----------------------------------------------------------------
The results were encouraging: on a table
with 20 columns of alternating smallint and varchar(10) datatypes,
selecting the max() of one of the rightmost int columns across 1
million rows ran around 3 times faster.  The same query on the
leftmost varchar column (which should suffer the most from this
change) predictably got a little slower (about 10%);
----------------------------------------------------------------

What the OP doesn't mention is how the exact layouts looked before
and after the reordering - maybe a nullable field fixed-length field
got moved before the varchar column in question, which would disable
offset caching I guess.

Let's say the reodering algorithm is changed to only move non-nullable
fixed-width columns to the left - can anyone see an access pattern that 
would run slower after the reodering? I certainly can't - because the set
of columns for which offset caching works after the reodering would
be a superset of the one for which it works before the reordering.

BTW, this is a good case for why the storage order should - directly or
indirectly - be tweakable. You can either optimize for space, and _then_
for speed - which is what the OP did I think - or first for speed, and then 
for space. If the dba cannot choose the strategy, there will always be 
workloads where the engine does it the wrong way around.

greetings, Florian Pflug


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: Column storage positions
Следующее
От: "Andrew Hammond"
Дата:
Сообщение: Re: Plan invalidation design