Re: Native XML
From | Anton |
---|---|
Subject | Re: Native XML |
Date | |
Msg-id | 4D780860.2060000@gmail.com |
In reply to | Re: Native XML (Yeb Havinga <yebhavinga@gmail.com>) |
List | pgsql-hackers |
On 03/09/2011 08:21 PM, Yeb Havinga wrote:
> On 2011-03-09 19:30, Robert Haas wrote:
>> On Wed, Mar 9, 2011 at 1:11 PM, Bruce Momjian <bruce@momjian.us> wrote:
>>> Robert Haas wrote:
>>>> On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>> Well, in principle we could allow them to work on both, just the same way that (for instance) "+" is a standardized operator but works on more than one datatype. But I agree that the prospect of two parallel types with essentially duplicate functionality isn't pleasing at all.
>>>>
>>>> The real issue here is whether we want to store XML as text (as we do now) or as some predigested form which would make "output the whole thing" slower but speed up things like xpath lookups. We had the same issue with JSON, and due to the uncertainty about which way to go with it we ended up integrating nothing into core at all. It's really not clear that there is one way of doing this that is right for all use cases. If you are storing xml in an xml column just to get it validated, and doing no processing in the DB, then you'd probably prefer our current representation. If you want to build functional indexes on xpath expressions, and then run queries that extract data using other xpath expressions, you would probably prefer the other representation.
>>>
>>> Someone should measure how much overhead the indexing of xml values might have. If it is minor, we might be OK with only an indexed xml type.
>>
>> I think the relevant thing to measure would be how fast the predigested representation speeds up the evaluation of xpath expressions.
>
> About a predigested representation, I hope I'm not insulting anyone's education here, but a lot of XML database 'accelerators' seem to be using the pre and post orders (see http://en.wikipedia.org/wiki/Tree_traversal) of the document nodes. The following two pdfs show how these orders can be used to query for e.g. all ancestors of a node: second pdf slide 10: for nodes x, y: x is an ancestor of y when x.pre < y.pre AND x.post > y.post.
>
> www.cse.unsw.edu.au/~cs4317/09s1/tutorials/tutor4.pdf about the format
> www.cse.unsw.edu.au/~cs4317/09s1/tutorials/tutor10.pdf about querying the format
>
> regards,
> Yeb Havinga

This looks rather like a special kind of XML shredding, and that is listed at http://wiki.postgresql.org/wiki/Todo

About the predigested / plain storage and the evaluation: I haven't yet fully given up the idea of playing with it, though on a purely experimental basis (i.e. with little or no ambition to contribute to the core product). If I do, it might also be interesting to use TOAST slicing during the xpath evaluation, so that large documents are not fetched as a whole right away if the xpath is rather 'short'.

But first I should let all the thoughts 'settle down'. There may well be other areas of Postgres where it's worth spending some time, whether writing something or just reading.
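
For anyone who wants to see the pre/post rule from Yeb's mail in action, here is a minimal sketch in Python. It is only an illustration of the ancestor test (x.pre < y.pre AND x.post > y.post), not a proposal for how Postgres would store or index anything. The helper names `number_nodes`, `is_ancestor` and `ancestors` are made up for this example, and the recursive walk assumes a small document.

```python
import xml.etree.ElementTree as ET


def number_nodes(root):
    """Return {element: (pre, post)} from a single depth-first walk.

    pre  = rank at which the element is first entered (preorder)
    post = rank at which the element is left (postorder)
    Hypothetical helper; recursion is fine for small documents only.
    """
    ranks = {}
    counters = {"pre": 0, "post": 0}

    def visit(node):
        pre = counters["pre"]
        counters["pre"] += 1
        for child in node:
            visit(child)
        # The postorder rank is assigned once all children are done.
        ranks[node] = (pre, counters["post"])
        counters["post"] += 1

    visit(root)
    return ranks


def is_ancestor(ranks, x, y):
    """x is an ancestor of y when x.pre < y.pre AND x.post > y.post."""
    x_pre, x_post = ranks[x]
    y_pre, y_post = ranks[y]
    return x_pre < y_pre and x_post > y_post


def ancestors(ranks, y):
    """All ancestors of y, outermost first (sorted by preorder rank)."""
    found = [x for x in ranks if is_ancestor(ranks, x, y)]
    return sorted(found, key=lambda x: ranks[x][0])


if __name__ == "__main__":
    doc = ET.fromstring("<a><b><c/><d/></b><e/></a>")
    ranks = number_nodes(doc)
    d = doc.find("./b/d")
    print([el.tag for el in ancestors(ranks, d)])
```

The point of the numbering is that the ancestor check needs only two integer comparisons per candidate node, which is the kind of condition a range scan over stored (pre, post) pairs could answer without parsing the document text.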