Threaded Records in SQL: Advice Needed
От | Ingram, Bryan |
---|---|
Тема | Threaded Records in SQL: Advice Needed |
Дата | |
Msg-id | 9B7D4396307CD311809A00500415EB405D1F99@BKMAIL обсуждение исходный текст |
Список | pgsql-sql |
Hi Everyone, I'm planning to implement a project using a php/postgres 6.5.3 combination. Part of what I need to do is develop a model in SQL to keep track of "threaded records." What I mean by this, is that I need to keep track of parent-child relationships in exactly the same way that a threaded-discussion group does. The design should be able to support approximately 100 users, mostly readers, and be able to handle approximately 1 insert per minute. So far, I've found two methods/algorithms for implenting this in SQL, but there's a problem with each. The first one is extremely efficient and inexpensive for selecting and ordering all of the records in a thread (taking only one select to get all of the records and in order), but requires that ALL records be updated on insert, which is very expensive. Because I'm expecting about 120000 records a year, or ca. 450 new records every business day, the prospect of each insert updating each record in the database is not appealing. At first it would be ok, but as the record count increased, the inserts would become extremely slow. This method works by modeling the records as "nested sets" and was explained by Joe Celko in one of his articles. The second method is simply adding fields to keep track of an id, parent_id, and root_id. Its benefit is that it is inexpensive on inserts, requiring only one insert and no updates, but it is costly to select and order all the records in a thread. Essentially, the only way I have found to do this is to implement a recursive function that selects one record at a time, checks for children, ad infinitum. Not very efficient. My concern with method 1 is that the great expense of inserting new records will slow the database down to unacceptable levels. This is based on my past experience with postgres when updating a database of 1.5 million rows (which took an hour and a half with syncing turned off.) Method 2 is problematic because it uses recursion first of all (which I understand to be heavy in resource usage) and because it makes a select for each and every record in a thread to be displayed. What I'd like to find is either a better way that I'm not aware of, or some kind of middle ground. Also, what do you make of these methods. Are they as inefficient as I want to believe? Thanks! Bryan Ingram Database Programmer S!XTYFOOTSP!DER 972-492-5676 bingram@sixtyfootspider.com <mailto:bingram@sixtyfootspider.com> <http://www.sixtyfootspider.com/> www.sixtyfootspider.com
В списке pgsql-sql по дате отправления: