Re: estimating # of distinct values

From: Jim Nasby
Subject: Re: estimating # of distinct values
Message-ID: 31BC2A43-8CCE-4358-B188-0F930CC0E2E5@nasby.net
In reply to: Re: estimating # of distinct values  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: estimating # of distinct values  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Jan 18, 2011, at 11:24 AM, Robert Haas wrote:
> On Tue, Jan 18, 2011 at 12:23 PM, Jim Nasby <jim@nasby.net> wrote:
>> On Jan 17, 2011, at 8:11 PM, Robert Haas wrote:
>>> On Mon, Jan 17, 2011 at 7:56 PM, Jim Nasby <jim@nasby.net> wrote:
>>>> - Forks are very possibly a more efficient way to deal with TOAST than having separate tables. There's a fair amount of overhead we pay for the current setup.
>>>
>>> That seems like an interesting idea, but I actually don't see why it
>>> would be any more efficient, and it seems like you'd end up
>>> reinventing things like vacuum and free space map management.
>>
>> The FSM would take some effort, but I don't think vacuum would be that hard to deal with; you'd just have to free up the space in any referenced toast forks at the same time that you vacuumed the heap.
>
> How's that different from what vacuum does on a TOAST table now?

TOAST vacuum is currently an entirely separate vacuum. It might run at the same time as the main table vacuum, but it
still has all the work that would be associated with vacuuming a table with the definition of a toast table. In fact, at
one point vacuuming toast took two passes: the first deleted the toast rows that were no longer needed, then you had to
go back and vacuum those deleted rows.

>>>> - Dynamic forks would make it possible to do a column-store database, or at least something approximating one.
>>>
>>> I've been wondering whether we could do something like this by
>>> treating a table t with columns pk, a1, a2, a3, b1, b2, b3 as two
>>> tables t1 and t2, one with columns pk, a1, a2, a3 and the other with
>>> columns pk, b1, b2, b3.  SELECT * FROM t would be translated into
>>> SELECT * FROM t1, t2 WHERE t1.pk = t2.pk.
>>
>> Possibly, but you'd be paying tuple overhead twice, which is what I was looking to avoid with forks.
>
> What exactly do you mean by "tuple overhead"?

typedef struct HeapTupleHeaderData. With only two tables it might not be that bad, depending on the fields. Beyond two
tables it's almost certainly a loser.
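
To put rough numbers on that, here is a back-of-the-envelope sketch, not a measurement. The sizes are assumptions for a typical 64-bit build: a 23-byte fixed HeapTupleHeaderData padded to 24 bytes, plus a 4-byte line pointer per tuple. The function names and the 4-byte key are hypothetical, and null bitmaps and data alignment padding are ignored.

```python
# Rough per-row overhead of splitting one table into k vertical partitions.
# Assumed sizes (typical 64-bit build; not taken from the thread itself):
HEAP_TUPLE_HEADER = 23   # fixed part of HeapTupleHeaderData, in bytes
MAXALIGN = 8             # the header is padded up to a multiple of this
LINE_POINTER = 4         # one line pointer entry per tuple on the page


def aligned(n, boundary=MAXALIGN):
    """Round n up to the next multiple of boundary."""
    return (n + boundary - 1) // boundary * boundary


def overhead_per_row(partitions, pk_bytes=4):
    """Bytes of header, line pointer, and repeated key per logical row.

    Every partition stores its own tuple header and line pointer, and every
    partition after the first repeats the primary key so rows can be joined.
    Null bitmaps and data alignment padding are ignored.
    """
    per_tuple = aligned(HEAP_TUPLE_HEADER) + LINE_POINTER
    return partitions * per_tuple + (partitions - 1) * pk_bytes


if __name__ == "__main__":
    for k in (1, 2, 4, 7):
        print(f"{k} partition(s): {overhead_per_row(k)} bytes of overhead per row")
```

Under these assumptions the fixed cost grows linearly with the number of partitions, so whether a split pays off depends on how wide the columns are compared with roughly 30 bytes of extra overhead per additional partition.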
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



