Re: estimating # of distinct values

From	Jim Nasby
Subject	Re: estimating # of distinct values
Date	January 18, 2011, 13:33:25
Msg-id	31BC2A43-8CCE-4358-B188-0F930CC0E2E5@nasby.net
In reply to	Re: estimating # of distinct values (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: estimating # of distinct values
List	pgsql-hackers

On Jan 18, 2011, at 11:24 AM, Robert Haas wrote:
> On Tue, Jan 18, 2011 at 12:23 PM, Jim Nasby <jim@nasby.net> wrote:
>> On Jan 17, 2011, at 8:11 PM, Robert Haas wrote:
>>> On Mon, Jan 17, 2011 at 7:56 PM, Jim Nasby <jim@nasby.net> wrote:
>>>> - Forks are very possibly a more efficient way to deal with TOAST
>>>> than having separate tables. There's a fair amount of overhead we
>>>> pay for the current setup.
>>>
>>> That seems like an interesting idea, but I actually don't see why it
>>> would be any more efficient, and it seems like you'd end up
>>> reinventing things like vacuum and free space map management.
>>
>> The FSM would take some effort, but I don't think vacuum would be
>> that hard to deal with; you'd just have to free up the space in any
>> referenced toast forks at the same time that you vacuumed the heap.
>
> How's that different from what vacuum does on a TOAST table now?

TOAST vacuum is currently an entirely separate vacuum. It might run at the same time as the main table vacuum, but it
still has all the work that would be associated with vacuuming a table with the definition of a toast table. In fact, at
one point vacuuming toast took two passes: the first deleted the toast rows that were no longer needed, then you had to
go back and vacuum those deleted rows.

>>>> - Dynamic forks would make it possible to do a column-store database, or at least something approximating one.
>>>
>>> I've been wondering whether we could do something like this by
>>> treating a table t with columns pk, a1, a2, a3, b1, b2, b3 as two
>>> tables t1 and t2, one with columns pk, a1, a2, a3 and the other with
>>> columns pk, b1, b2, b3.  SELECT * FROM t would be translated into
>>> SELECT * FROM t1, t2 WHERE t1.pk = t2.pk.
>>
>> Possibly, but you'd be paying tuple overhead twice, which is what I was looking to avoid with forks.
>
> What exactly do you mean by "tuple overhead"?

typedef struct HeapTupleHeaderData. With only two tables it might not be that bad, depending on the fields. Beyond two
tables it's almost certainly a loser.
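
To put rough numbers on that, here is a back-of-the-envelope sketch in C. It is not taken from the PostgreSQL source. The constants are assumptions for a typical 64-bit build: a 23-byte fixed tuple header padded to 8-byte alignment, plus a 4-byte line pointer per tuple. The helper names (max_align, split_row_overhead) are made up for illustration. The sketch ignores the null bitmap and page headers, and it assumes each split table repeats the pk column, as in the t1/t2 proposal quoted above.

```c
#include <stddef.h>

/* Assumed values for a typical 64-bit build; illustrative only. */
#define TUPLE_HEADER_BYTES 23  /* fixed part of HeapTupleHeaderData */
#define LINE_POINTER_BYTES 4   /* one line pointer per tuple in the page */
#define MAX_ALIGN 8            /* assumed maximum alignment */

/* Round n up to the next multiple of MAX_ALIGN. */
static size_t max_align(size_t n)
{
    return (n + MAX_ALIGN - 1) & ~((size_t) MAX_ALIGN - 1);
}

/*
 * Approximate fixed bytes spent per logical row when that row is split
 * across nparts tables, each of which stores its own copy of a pk that
 * is pk_bytes wide. Counts one header and one line pointer per part,
 * plus the extra pk copies beyond the first.
 */
static size_t split_row_overhead(size_t nparts, size_t pk_bytes)
{
    size_t per_tuple = max_align(TUPLE_HEADER_BYTES) + LINE_POINTER_BYTES;
    size_t extra_pk = (nparts > 0) ? (nparts - 1) * pk_bytes : 0;

    return nparts * per_tuple + extra_pk;
}
```

Under those assumptions, with an 8-byte pk, split_row_overhead(1, 8) is 28 bytes, split_row_overhead(2, 8) is 64 and split_row_overhead(3, 8) is 100, so the fixed cost more than doubles at two tables and keeps climbing with each additional one.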
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net
