Обсуждение: pgxml & xpath_table
Hi, I'm playing with the contrib/pgxml library under PG 8.1.4, and I'm not sure if what I found with pgxml is a feature of abug: I've got the following table: -------------------- CREATE TABLE test ( id int4 NOT NULL, xml varchar(200), CONSTRAINT pk PRIMARY KEY (id) ) WITHOUT OIDS; INSERT INTO test VALUES (1, '<doc num="C1"><line num="L1"><a>1</a><b>2</b><c>3</c></line><line num="L2"><a>11</a><b>22</b><c>33</c></line></doc>'); INSERT INTO test VALUES (2, '<doc num="C2"><line num="L1"><a>111</a><b>222</b><c>333</c></line><line num="L2"><a>111</a><b>222</b><c>333</c></line></doc>'); -------------------- If I launch this query: -------------------- select * from xpath_table('id','xml','test', '/doc/@num|/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c','1=1') AS t(id int4, doc_numvarchar(10), line_num varchar(10), val1 int4, val2 int4, val3 int4) where id = 1 order by doc_num, line_num -------------------- I get: -------------------- id doc_num line_num val1 val2 val3 1 C1 L1 1 2 3 1 L2 11 22 33 -------------------- I was expecting doc_num would receive twice the C1 value, just like with a normal sql join. Regards, ---------------------------------- Philippe Lang, Ing. Dipl. EPFL Attik System rte de la Fonderie 2 1700 Fribourg Switzerland http://www.attiksystem.ch Tel: +41 (26) 422 13 75 Fax: +41 (26) 422 13 76
Вложения
Hi, On Fri, 09 Jun 2006 08:43:51 +0200, Philippe Lang wrote: > I'm playing with the contrib/pgxml library under PG 8.1.4, and I'm not sure if what I found with pgxml is a feature ofa bug: > [snip] > I get: > > -------------------- > id doc_num line_num val1 val2 val3 > 1 C1 L1 1 2 3 > 1 L2 11 22 33 > -------------------- > > I was expecting doc_num would receive twice the C1 value, just like with a normal sql join. > The results from the XPath expressions should be seen as a plain list representation of a multivalued answer rather than a join expression. i.e. This is intended to be a feature. In order to deal with multivalued results, the xpath_table function as many rows as the largest number of result values from any of the XPath expressions. There is no sound way to fill in the other columns if the result sets are of different lengths, so they are left as null. The assumption was that the XPath expressions would be used together - the code doesn't know that /doc/@num only occurs once and that it is equally applicable for all the rows. This is the reason why xpath_table allows you to specify an identifying field (usually a primary key but doesn't have to be)- the solution to your question is to join an xpath_table that just fetches the document number against the primary key, e.g.: SELECT t.*,i.doc_num FROM xpath_table('id','xml','test', '/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c','1=1') AS t(id int4, line_num varchar(10), val1 int4, val2 int4, val3 int4), xpath_table('id','xml','test','/doc/@num','1=1') AS i(id int4, doc_num varchar(10)) WHERE i.id=t.id and i.id=1 ORDER BY doc_num, line_num; Giving id | line_num | val1 | val2 | val3 | doc_num ----+----------+------+------+------+--------- 1 | L1 | 1 | 2 | 3 | C1 1 | L2 | 11 | 22 | 33 | C1 (2 rows) Hope this helps. Regards John
John Gray <jgray@azuli.co.uk> writes: > This is the reason why xpath_table allows you to specify an > identifying field (usually a primary key but doesn't have to be)- the > solution to your question is to join an xpath_table that just fetches the > document number against the primary key, e.g.: John, do you think anything could be done in the pgxml documentation to make this usage pattern clearer? regards, tom lane
On Sat, 2006-06-10 at 14:06 -0400, Tom Lane wrote: > John Gray <jgray@azuli.co.uk> writes: > > This is the reason why xpath_table allows you to specify an > > identifying field (usually a primary key but doesn't have to be)- the > > solution to your question is to join an xpath_table that just fetches the > > document number against the primary key, e.g.: > > John, do you think anything could be done in the pgxml documentation to > make this usage pattern clearer? > Yes - I'll write a doc patch for this and include an example like this by way of demonstration - I notice now that this particular feature (which I went to some lengths to incorporate in the code!) is completely undocumented. I'll also consider whether another function with different behaviour in this situation is possible or helpful. Give me a few days... Regards John > regards, tom lane
John Gray wrote: > On Sat, 2006-06-10 at 14:06 -0400, Tom Lane wrote: > > John Gray <jgray@azuli.co.uk> writes: > > > This is the reason why xpath_table allows you to specify an > > > identifying field (usually a primary key but doesn't have to be)- the > > > solution to your question is to join an xpath_table that just fetches the > > > document number against the primary key, e.g.: > > > > John, do you think anything could be done in the pgxml documentation to > > make this usage pattern clearer? > > > > Yes - I'll write a doc patch for this and include an example like this > by way of demonstration - I notice now that this particular feature > (which I went to some lengths to incorporate in the code!) is completely > undocumented. I'll also consider whether another function with different > behaviour in this situation is possible or helpful. > > Give me a few days... FYI, I have not seen a patch for this yet. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +