Re: Re: From TODO, XML?
From | mlw |
---|---|
Subject | Re: Re: From TODO, XML? |
Date | |
Msg-id | 3B6437A4.6228C63D@mohawksoft.com |
In reply to | Re: From TODO, XML? (mlw <markw@mohawksoft.com>) |
Responses | Re: Re: From TODO, XML? ("Ross J. Reedstrom" <reedstrm@rice.edu>) |
List | pgsql-hackers |
Ken Hirsch wrote:
>
> mlw <markw@mohawksoft.com> wrote:
> > > "Frank Ch. Eigler" wrote:
> > > : So a parser that can scan a DTD and make a usable create table (...)
> > > : line would be very helpful. [...]
> > >
> > > Hmm, but hierarchically structured documents such as XML don't map
> > > well to a relational model. The former tend to be recursive (e.g.,
> > > have more levels of containment than the one or two that might be
> > > mappable to tables and columns.)
> >
> > Yes!!! Exactly, being able to understand the recursive nature of XML and create
> > relations on the fly would be a very cool feature.
>
> I think there is a pretty straight forward mapping, except for one possible
> ambiguity.
>
> If an element, say <address>, is contained within another element, say
> <employee>, it could either be a column (or group of columns) in an Employee
> table, or it could be a table Address which references Employee.
>
> When you say "create relations on the fly", what exactly do you mean? I can
> see it would be handy to have CREATE TABLE statements written for you, but
> it seems likely that a human would want to edit them before the tables are
> actually created. You cannot infer much type information from the DTD. I
> don't think there's a way to infer a primary key from a DTD, so you would
> want to either specify one or add a serial column (or perhaps that would
> always be done automatically). An XML schema would have more information,
> of course.

I have been thinking about this. A lot of guessing would have to be done, of course. But, unless some extra information is specified, when you have an XML record, contained within another, the parser would have to generate its own primary key and a sequence for each table. Obviously, the user should be able to specify the primary key for each table, but lacking that input, the XML parser/importer should do it automatically.
So this:

    <employee>
      <name>Bill</name>
      <position>Programmer</position>
      <address>
        <number>1290</number>
        <street><name>Canton Ave</name></street>
        <town><name>Milton</name></town>
      </address>
    </employee>

The above is almost impossible to convert to a relational format without additional information or a good set of rules. However, we can determine which XML tags are "containers" and which are "data." "employee" is a container because it has sub tags. "position" is "data" because it has no sub tags.

We can recursively scan this hierarchy and decide which tags are containers and which are data. Data gets assigned an appropriate SQL type. Each container gets separated from its parent container, and an integer index is put in its place. For each container, a primary key is either specified or created on the fly. We insert sub containers first and pop back the primary key value, until we have the whole record. The primary key could even be the OID.

A second strategy is to concatenate the hierarchy into the field name, as street_name, town_name, and so on.

What do you think?

--
5-4-3-2-1 Thunderbirds are GO!
------------------------
http://www.mohawksoft.com
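To make the two strategies in this message concrete, here is a minimal Python sketch using the standard xml.etree.ElementTree module. It is only an illustration, not a proposed importer: the function names (is_container, split_into_rows, flatten), the "id" and "<tag>_id" column names, and the in-memory dict standing in for a per-table sequence are all assumptions made for this example. SQL type assignment is left out, and a data tag that is itself called "id" would collide with the generated key.

```python
import xml.etree.ElementTree as ET

# The sample record from the message, with the closing tag spelled correctly.
SAMPLE = """<employee>
  <name>Bill</name>
  <position>Programmer</position>
  <address>
    <number>1290</number>
    <street><name>Canton Ave</name></street>
    <town><name>Milton</name></town>
  </address>
</employee>"""


def is_container(elem):
    """A tag is a "container" if it has sub tags, otherwise it is "data"."""
    return len(elem) > 0


def split_into_rows(elem, rows, next_id):
    """Strategy 1: one table per container, linked by generated integer keys.

    rows maps a table name to a list of row dicts. next_id maps a table
    name to the next key value (a hypothetical stand-in for a sequence).
    Sub containers are inserted first and their key is put in the parent.
    Returns the generated key of the row inserted for elem.
    """
    row = {}
    for child in elem:
        if is_container(child):
            row[child.tag + "_id"] = split_into_rows(child, rows, next_id)
        else:
            row[child.tag] = (child.text or "").strip()
    key = next_id.get(elem.tag, 1)
    next_id[elem.tag] = key + 1
    row["id"] = key
    rows.setdefault(elem.tag, []).append(row)
    return key


def flatten(elem, prefix=""):
    """Strategy 2: concatenate the hierarchy into the field name."""
    cols = {}
    for child in elem:
        name = prefix + child.tag
        if is_container(child):
            cols.update(flatten(child, name + "_"))
        else:
            cols[name] = (child.text or "").strip()
    return cols


root = ET.fromstring(SAMPLE)

rows, next_id = {}, {}
split_into_rows(root, rows, next_id)
for table, table_rows in rows.items():
    print(table, table_rows)

flat = flatten(root)
print(flat)
```

Flattening from the root carries the full path into the name (address_street_name, address_town_name). Calling flatten on the address element alone yields the shorter street_name and town_name used in the message.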