Performance and doing USENET style threaded messages

Поиск
Список
Период
Сортировка
От Lincoln Yeoh
Тема Performance and doing USENET style threaded messages
Дата
Msg-id 3.0.5.32.20001223180911.00960e80@192.228.128.13
обсуждение исходный текст
Ответ на Re: allowing users access to a trusted C function  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Performance and doing USENET style threaded messages  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
Hi (Merry Christmas and Happy New Millennium everyone!),

I've been wondering what would be the best way to do things like threaded
messages on Postgresql- e.g. where you have the usual parent-child
relationships between entities- each entity has a unique id and a parent id.

So far I'm considering two reasonable options.

<options>
Option 1 - do it the obvious "proper way" - have the application do selects
recursively.
e.g. To select child items
<pseudocode>
traverse ( id, orderspec) {
 select id,subdata from tablename where parentid=id order by orderspec
 for each id
       display(id,subdata)
    traverse(id, orderspec)

}
</pseudocode>

Option 2 - do it the kludgy limited way.
Keep the same structure as Option 1 but add a new text column I'll call
geneaology (or gene for short).

For a structure like
1
|_
| |
2 3
  |_
  | |
  4 5

Item 4 will have gene=00000001,00000003,00000004,

Where the ids are all zeropadded _hexadecimal_ and delimited by commas (for
easier human processing if necessary).

So to select all items with parent id=3 (including item id=3) one would use
a query like:

select * from tablename where gene like '00000001,00000003,%' order by
orderspec;

Option 2.1 - use a combination of the two, e.g. use option 1 when the depth
gets crazy.

Option X - any ideas?
</options>

The trouble with option 1 is it seems it will be quite slow when you have
lots of items at significant depths.

The trouble with option 2 is the geneaology column can get quite huge (9 X
depth bytes) if the depth increases (e.g. there is a flame war ;) ). What
is the recommended max indexable text length in Postgresql 7.0? If it won't
index well then option 1 may actually be better.

If option 2 indexes well, I'm thinking of using it and limiting the depth
to say maybe a hundred. I figure in practice it's time for a new thread
once the depth is > 100 - it's mutated enough to warrant a new geneaology ;).

Oracle has a "start with" feature, but even so I'm not sure if the
performance is better that way - it may just be for application level
convenience.

Any comments or suggestions welcome.

Thanks,
Link.


В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: allowing users access to a trusted C function
Следующее
От: Lincoln Yeoh
Дата:
Сообщение: Database I/O and other performance questions.