Обсуждение: xpath question

Поиск
Список
Период
Сортировка

xpath question

От
"Sean Davis"
Дата:
Just a quick question, not being a big xpath or xml user in the past.

Why does this:

select xpath('//Abstract/AbstractText[1]',content) from medlinexml;

return an array for each entry rather than a single xml entry?  I
thought that appending the [1] should not return an array, but a
single xml value instead.

Thanks,
Sean

example output below

 {"<Abstract>
 <AbstractText>Strains of Pseudomonas aeruginosa, isolated from the
sputum of relatively fit patients with cystic fibrosis (CF) who had
been recently colonized by the organism, showed typical cultural and
serologic characteristics. The majority of strains of P. aeruginosa
isolated from CF patients with chronic bronchopulmonary infection had
3 distinctive features, loss of 0 serotype reaction, expression of a
new somatic antigen, and sensitivity to normal human serum. Patients
with organisms with one or two of these features were more severely
affected by the disease. The appearance of these variants may
represent a critical stage in the progression of CF.</AbstractText>
 </Abstract>"}
 {"<Abstract>
 <AbstractText>The changes in airway caliber and plasma cyclic-AMP
levels after intravenously administered aminophylline, and the effect
of DL- and D-propranolol on these responses have been investigated in
a double-blind manner in normal subjects. Aminophylline 5.6 mg/kg was
given intravenously over a 10-min period and the airway response was
measured as change in specific airway conductance (delta SGaw) in the
body plethysmograph. In the initial study in 6 subjects, orally
administered placebo or propranolol was followed 2 h later by
intravenously administered aminophylline. Neither placebo nor
propranolol alone caused any change in SGaw at 2 h. After placebo,
intravenously injected aminophylline produced a 30% increase in SGaw,
reaching a peak 5 min after injection. This response was equivalent to
77% of the maximal response to 400 micrograms inhaled albuterol in the
same subjects. After propranolol, the airway response to aminophylline
was attenuated, with a 53% reduction in delta SGaw at the time of peak
response. In a further study on 6 subjects, intravenously given
aminophylline produced a 25% increase in SGaw and a 51% increase in
plasma cyclic-AMP levels after placebo tablets. Pretreatment with 40
and 80 mg DL-propranolol caused a dose-dependent reduction of both the
airway and plasma cyclic-AMP response to aminophylline. The airway
response to aminophylline was not attenuated by D-propranolol so the
effect of DL-propranolol is thought to be due to beta-adrenoceptor
blockade. The absence of any detectable change in SGaw after
DL-propranolol suggests there is little resting sympathetic tone to
the airways in normal subjects. In the absence of sympathetic
stimulation, the rapid response to aminophylline is unlikely to be due
to phosphodiesterase inhibition. The attenuation of the airway and
cyclic-AMP response by propranolol suggests that part of the action of
aminophylline may be due to beta-agonist activity.</AbstractText>
 </Abstract>"}

Fwd: xpath question

От
"Sean Davis"
Дата:
>
> Sean Davis wrote:
> > Just a quick question, not being a big xpath or xml user in the past.
> >
> > Why does this:
> >
> > select xpath('//Abstract/AbstractText[1]',content) from medlinexml;
> >
> > return an array for each entry rather than a single xml entry?  I
> > thought that appending the [1] should not return an array, but a
> > single xml value instead.
> >
> > Thanks,
> > Sean
> >
> > example output below
> >
> >  {"<Abstract>
> >  <AbstractText>Strains of Pseudomonas aeruginosa, isolated from the
> > sputum of relatively fit patients with cystic fibrosis (CF) who had
> > been recently colonized by the organism, showed typical cultural and
> > serologic characteristics. The majority of strains of P. aeruginosa
> > isolated from CF patients with chronic bronchopulmonary infection had
> > 3 distinctive features, loss of 0 serotype reaction, expression of a
> > new somatic antigen, and sensitivity to normal human serum. Patients
> > with organisms with one or two of these features were more severely
> > affected by the disease. The appearance of these variants may
> > represent a critical stage in the progression of CF.</AbstractText>
> >  </Abstract>"}
> >  {"<Abstract>
> >  <AbstractText>The changes in airway caliber and plasma cyclic-AMP
> > levels after intravenously administered aminophylline, and the effect
> > of DL- and D-propranolol on these responses have been investigated in
> > a double-blind manner in normal subjects. Aminophylline 5.6 mg/kg was
> > given intravenously over a 10-min period and the airway response was
> > measured as change in specific airway conductance (delta SGaw) in the
> > body plethysmograph. In the initial study in 6 subjects, orally
> > administered placebo or propranolol was followed 2 h later by
> > intravenously administered aminophylline. Neither placebo nor
> > propranolol alone caused any change in SGaw at 2 h. After placebo,
> > intravenously injected aminophylline produced a 30% increase in SGaw,
> > reaching a peak 5 min after injection. This response was equivalent to
> > 77% of the maximal response to 400 micrograms inhaled albuterol in the
> > same subjects. After propranolol, the airway response to aminophylline
> > was attenuated, with a 53% reduction in delta SGaw at the time of peak
> > response. In a further study on 6 subjects, intravenously given
> > aminophylline produced a 25% increase in SGaw and a 51% increase in
> > plasma cyclic-AMP levels after placebo tablets. Pretreatment with 40
> > and 80 mg DL-propranolol caused a dose-dependent reduction of both the
> > airway and plasma cyclic-AMP response to aminophylline. The airway
> > response to aminophylline was not attenuated by D-propranolol so the
> > effect of DL-propranolol is thought to be due to beta-adrenoceptor
> > blockade. The absence of any detectable change in SGaw after
> > DL-propranolol suggests there is little resting sympathetic tone to
> > the airways in normal subjects. In the absence of sympathetic
> > stimulation, the rapid response to aminophylline is unlikely to be due
> > to phosphodiesterase inhibition. The attenuation of the airway and
> > cyclic-AMP response by propranolol suggests that part of the action of
> > aminophylline may be due to beta-agonist activity.</AbstractText>
> >  </Abstract>"}
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: Have you checked our extensive FAQ?
> >
> >                http://www.postgresql.org/docs/faq
> >
> >
> hi your xpath is working correct. You have two results as you have to
> parent nodes 'Abstract'
>
> so if you want only first text from all Abstract node, you should give
> one more condition as
>
> /Abstract[1]/AbstractText[1]

Thanks, Pavel for the reply.  However, I looked a bit more and it
appears that xpath always returns an xml array in 8.3b2.

annodb=# \df xpath;
                      List of functions
   Schema   | Name  | Result data type | Argument data types
------------+-------+------------------+---------------------
 pg_catalog | xpath | xml[]            | text, xml
 pg_catalog | xpath | xml[]            | text, xml, text[]
(2 rows)

So, there is not a way to force a single element to be returned as far
as I can see.  I did an equivalent example to the one you suggested:


annodb=# select xpath('/MedlineCitation/PMID/text()',content) from
medline.medlinexml limit 10;
   xpath
------------
 {10111733}
 {10145466}
 {10111734}
 {10111735}
 {1830312}
 {1830313}
 {1830314}
 {1830315}
 {1830316}
 {1713217}
(10 rows)

annodb=# select xpath('/MedlineCitation[1]/PMID[1]/text()',content)
from medline.medlinexml limit 10;
   xpath
-----------
 {1859432}
 {1859433}
 {1859434}
 {1650203}
 {1859435}
 {1859436}
 {1907139}
 {1677568}
 {1859437}
 {1830481}
(10 rows)

One can select the first element of the array like this, if necessary:

annodb=# select
(xpath('/MedlineCitation[1]/PMID[1]/text()',content))[1] from
medline.medlinexml limit 10;
  xpath
---------
 1678397
 1869692
 1869697
 1869698
 1869693
 1869694
 1869695
 1869696
 1869699
 1869700
(10 rows)

Sean

Re: xpath question

От
"Nikolay Samokhvalov"
Дата:
On Nov 21, 2007 8:42 PM, Sean Davis <sdavis2@mail.nih.gov> wrote:
> Thanks, Pavel for the reply.  However, I looked a bit more and it
> appears that xpath always returns an xml array in 8.3b2.
>
> annodb=# \df xpath;
>                       List of functions
>    Schema   | Name  | Result data type | Argument data types
> ------------+-------+------------------+---------------------
>  pg_catalog | xpath | xml[]            | text, xml
>  pg_catalog | xpath | xml[]            | text, xml, text[]
> (2 rows)
>
> So, there is not a way to force a single element to be returned as far
> as I can see.  I did an equivalent example to the one you suggested:

The xpath() function added to 8.3 is generic function, so it really
returns xml[] always. That was the main aim -- to add the generic
function.

You can easily create any wrapper to meet your needs. E.g., smth like
xpath_first() that always returns only the first xml chunk, or
xpath_single() that performs concatenation of all xml chunks and
returns single xml.

Anyway, in both cases you break significantly from the general XML
semantics -- that's why such stuff is not implemented by default. BTW,
maybe some convenient wrappers will be added to Postgres in the
future, but surely this should be done only after good volume of
practical experience is collected.

--
Nikolay Samokhvalov  <nikolay@samokhvalov.com>
http://nikolay.samokhvalov.com

Postgresmen http://postgresmen.ru
OpenWebTechnologies http://openwebtech.ru

Re: xpath question

От
"Sean Davis"
Дата:
On Nov 22, 2007 5:42 AM, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:
> On Nov 21, 2007 8:42 PM, Sean Davis <sdavis2@mail.nih.gov> wrote:
> > Thanks, Pavel for the reply.  However, I looked a bit more and it
> > appears that xpath always returns an xml array in 8.3b2.
> >
> > annodb=# \df xpath;
> >                       List of functions
> >    Schema   | Name  | Result data type | Argument data types
> > ------------+-------+------------------+---------------------
> >  pg_catalog | xpath | xml[]            | text, xml
> >  pg_catalog | xpath | xml[]            | text, xml, text[]
> > (2 rows)
> >
> > So, there is not a way to force a single element to be returned as far
> > as I can see.  I did an equivalent example to the one you suggested:
>
> The xpath() function added to 8.3 is generic function, so it really
> returns xml[] always. That was the main aim -- to add the generic
> function.
>
> You can easily create any wrapper to meet your needs. E.g., smth like
> xpath_first() that always returns only the first xml chunk, or
> xpath_single() that performs concatenation of all xml chunks and
> returns single xml.
>
> Anyway, in both cases you break significantly from the general XML
> semantics -- that's why such stuff is not implemented by default. BTW,
> maybe some convenient wrappers will be added to Postgres in the
> future, but surely this should be done only after good volume of
> practical experience is collected.

Nikolay,

I agree almost fully with these thoughts.  I have not used xpath much,
so I don't know what the default behavior is (in terms of a spec).  I
would suggest that the default behavior in postgres should match
whatever a spec is.  That said, the utility of the current
implementation is fantastic, especially when combined with the ability
to produce custom functions when necessary.

Thanks again,
Sean