Table Design for Hierarchical Data

Поиск
Список
Период
Сортировка
От Lee Hachadoorian
Тема Table Design for Hierarchical Data
Дата
Msg-id n2i5ab13581004061033j4ed25734xd0504cb95a1fc092@mail.gmail.com
обсуждение исходный текст
Ответы Re: Table Design for Hierarchical Data  (Michael Glaesemann <grzm@seespotcode.net>)
Re: Table Design for Hierarchical Data  (Steve Crawford <scrawford@pinpointresearch.com>)
Re: Table Design for Hierarchical Data  (silly sad <sad@bankir.ru>)
Re: Table Design for Hierarchical Data  (silly sad <sad@bankir.ru>)
Re: Table Design for Hierarchical Data  (Achilleas Mantzios <achill@matrix.gatewaynet.com>)
Re: Table Design for Hierarchical Data  (Sergey Konoplev <gray.ru@gmail.com>)
Список pgsql-sql
Please point me to another listserv or forum if this question is more appropriately addressed elsewhere.<br /><br />I
amtrying to come up with a structure to store employment data by NAICS (North American Industrial Classification
System).The data uses a hierarchical encoding scheme ranging between 2 and 5 digits. That is, each 2-digit code
includesall industries beginning with the same two digits. 61 includes 611 which includes 6111, 6112, 6113, etc. A
portionof the hierarchy is shown after the sig.<br /><br />A standard way to store hierarchical data is the adjacency
listmodel, where each node's parent appears as an attribute (table column). So 6111 would list 611 as its parent. Since
NAICSuses a hierarchical encoding scheme, the node's name is the same as the node's id, and the parent can always be
derivedfrom the node's id. Storing the parent id separately would seem to violate a normal form (because of the
redundancy).<br/><br />One way to store this data would be to store at the most granular level (5-digit NAICS) and then
aggregateup if I wanted employment at the 4-, 3-, or 2-digit level. The problem is that because of nondisclosure rules,
thedata is sometimes censored at the more specific level. I might, for example, have data for 6114, but not 61141,
61142,61143. For a different branch of the tree, I might have data at the 5-digit level while for yet another branch I
mighthave data only to the 3-digit level (not 4 or 5). I think that means I have to store all data at multiple levels,
evenif some of the higher-level data could be reconstructed from other, lower-level data.<br /><br />Specifically I'd
liketo know if this should be a single table or should there be a separate table for each level of the hierarchy (four
inall)? If one table, should the digits be broken into separate columns? Should parent ids be stored in each node?<br
/><br/>More generally, what questions should I be asking to help decide what structure makes the most sense? Are there
anywebsites, forums, or books that cover this kind of problem?<br /><br />Regards,<br />--Lee<br /><br />-- <br />Lee
Hachadoorian<br/> PhD Student, Geography<br />Program in Earth & Environmental Sciences<br />CUNY Graduate
Center<br/><br />A Portion of the NAICS scheme<br /><br />61    Educational Services<br /> 611    Educational
Services<br/> 6111    Elementary and Secondary Schools<br /> 61111    Elementary and Secondary Schools<br /> 6112   
JuniorColleges<br /> 61121    Junior Colleges<br /> 6113    Colleges, Universities, and Professional Schools<br />
61131   Colleges, Universities, and Professional Schools<br /> 6114    Business Schools and Computer and Management
Training<br/> 61141    Business and Secretarial Schools<br /> 61142    Computer Training<br /> 61143    Professional
andManagement Development Training<br /> etc…<br /><br /> 

В списке pgsql-sql по дате отправления:

Предыдущее
От: Cliff Wells
Дата:
Сообщение: INSERT INTO...RETURNING vs SELECT
Следующее
От: Thomas Kellerer
Дата:
Сообщение: Re: count function alternative in postgres