Discussion: xpath question
Just a quick question, not being a big xpath or xml user in the past.

Why does this:

select xpath('//Abstract/AbstractText[1]',content) from medlinexml;

return an array for each entry rather than a single xml entry? I thought that appending the [1] should not return an array, but a single xml value instead.

Thanks,
Sean

example output below

{"<Abstract>
<AbstractText>Strains of Pseudomonas aeruginosa, isolated from the sputum of relatively fit patients with cystic fibrosis (CF) who had been recently colonized by the organism, showed typical cultural and serologic characteristics. The majority of strains of P. aeruginosa isolated from CF patients with chronic bronchopulmonary infection had 3 distinctive features, loss of 0 serotype reaction, expression of a new somatic antigen, and sensitivity to normal human serum. Patients with organisms with one or two of these features were more severely affected by the disease. The appearance of these variants may represent a critical stage in the progression of CF.</AbstractText>
</Abstract>"}
{"<Abstract>
<AbstractText>The changes in airway caliber and plasma cyclic-AMP levels after intravenously administered aminophylline, and the effect of DL- and D-propranolol on these responses have been investigated in a double-blind manner in normal subjects. Aminophylline 5.6 mg/kg was given intravenously over a 10-min period and the airway response was measured as change in specific airway conductance (delta SGaw) in the body plethysmograph. In the initial study in 6 subjects, orally administered placebo or propranolol was followed 2 h later by intravenously administered aminophylline. Neither placebo nor propranolol alone caused any change in SGaw at 2 h. After placebo, intravenously injected aminophylline produced a 30% increase in SGaw, reaching a peak 5 min after injection. This response was equivalent to 77% of the maximal response to 400 micrograms inhaled albuterol in the same subjects. After propranolol, the airway response to aminophylline was attenuated, with a 53% reduction in delta SGaw at the time of peak response. In a further study on 6 subjects, intravenously given aminophylline produced a 25% increase in SGaw and a 51% increase in plasma cyclic-AMP levels after placebo tablets. Pretreatment with 40 and 80 mg DL-propranolol caused a dose-dependent reduction of both the airway and plasma cyclic-AMP response to aminophylline. The airway response to aminophylline was not attenuated by D-propranolol so the effect of DL-propranolol is thought to be due to beta-adrenoceptor blockade. The absence of any detectable change in SGaw after DL-propranolol suggests there is little resting sympathetic tone to the airways in normal subjects. In the absence of sympathetic stimulation, the rapid response to aminophylline is unlikely to be due to phosphodiesterase inhibition. The attenuation of the airway and cyclic-AMP response by propranolol suggests that part of the action of aminophylline may be due to beta-agonist activity.</AbstractText>
</Abstract>"}

Sean Davis wrote:
> Just a quick question, not being a big xpath or xml user in the past.
>
> Why does this:
>
> select xpath('//Abstract/AbstractText[1]',content) from medlinexml;
>
> return an array for each entry rather than a single xml entry? I
> thought that appending the [1] should not return an array, but a
> single xml value instead.

Hi,

Your xpath is working correctly. You have two results because you have two parent 'Abstract' nodes.

So if you want only the first text from all the Abstract nodes, you should add one more condition:

/Abstract[1]/AbstractText[1]

Thanks, Pavel, for the reply. However, I looked a bit more, and it appears that xpath always returns an xml array in 8.3b2.

annodb=# \df xpath;
                        List of functions
   Schema   | Name  | Result data type | Argument data types
------------+-------+------------------+---------------------
 pg_catalog | xpath | xml[]            | text, xml
 pg_catalog | xpath | xml[]            | text, xml, text[]
(2 rows)

So, there is not a way to force a single element to be returned as far as I can see. I did an equivalent example to the one you suggested:

annodb=# select xpath('/MedlineCitation/PMID/text()',content) from medline.medlinexml limit 10;
   xpath
------------
 {10111733}
 {10145466}
 {10111734}
 {10111735}
 {1830312}
 {1830313}
 {1830314}
 {1830315}
 {1830316}
 {1713217}
(10 rows)

annodb=# select xpath('/MedlineCitation[1]/PMID[1]/text()',content) from medline.medlinexml limit 10;
   xpath
-----------
 {1859432}
 {1859433}
 {1859434}
 {1650203}
 {1859435}
 {1859436}
 {1907139}
 {1677568}
 {1859437}
 {1830481}
(10 rows)

One can select the first element of the array like this, if necessary:

annodb=# select (xpath('/MedlineCitation[1]/PMID[1]/text()',content))[1] from medline.medlinexml limit 10;
  xpath
---------
 1678397
 1869692
 1869697
 1869698
 1869693
 1869694
 1869695
 1869696
 1869699
 1869700
(10 rows)

Sean
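
The array behaviour described in this message can be reproduced without the medline tables. The following is a minimal sketch, assuming a small inline document whose element names (doc, item) are purely illustrative and not taken from the thread. It shows that xpath() hands back xml[] even for a single match, that subscripting produces a scalar, and that an unmatched path gives NULL after subscripting.

```sql
-- Hypothetical inline document; the element names are illustrative only.
-- xpath() yields xml[] even when the path matches exactly one node.
SELECT xpath('/doc/item[1]/text()',
             '<doc><item>a</item><item>b</item></doc>'::xml) AS as_array;

-- Subscripting the array gives a scalar xml value; the cast to text
-- makes it usable in ordinary string comparisons.
SELECT (xpath('/doc/item[1]/text()',
              '<doc><item>a</item><item>b</item></doc>'::xml))[1]::text AS as_text;

-- When nothing matches, the array is empty and the subscript yields NULL.
SELECT (xpath('/doc/missing/text()',
              '<doc><item>a</item></doc>'::xml))[1] IS NULL AS no_match;
```

The extra parentheses around the xpath() call are required before a subscript can be applied to a function result.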

On Nov 21, 2007 8:42 PM, Sean Davis <sdavis2@mail.nih.gov> wrote:
> Thanks, Pavel for the reply. However, I looked a bit more and it
> appears that xpath always returns an xml array in 8.3b2.
>
> annodb=# \df xpath;
>                         List of functions
>    Schema   | Name  | Result data type | Argument data types
> ------------+-------+------------------+---------------------
>  pg_catalog | xpath | xml[]            | text, xml
>  pg_catalog | xpath | xml[]            | text, xml, text[]
> (2 rows)
>
> So, there is not a way to force a single element to be returned as far
> as I can see. I did an equivalent example to the one you suggested:

The xpath() function added in 8.3 is a generic function, so it really does always return xml[]. That was the main aim -- to add the generic function.

You can easily create any wrapper to meet your needs. For example, something like xpath_first() that always returns only the first xml chunk, or xpath_single() that concatenates all xml chunks and returns a single xml value.

Anyway, in both cases you break significantly from the general XML semantics -- that's why such things are not implemented by default. BTW, maybe some convenient wrappers will be added to Postgres in the future, but surely this should be done only after a good volume of practical experience has been collected.

--
Nikolay Samokhvalov <nikolay@samokhvalov.com>
http://nikolay.samokhvalov.com
Postgresmen http://postgresmen.ru
OpenWebTechnologies http://openwebtech.ru
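
The two wrappers named in this message could look roughly like the sketch below. Neither function ships with PostgreSQL; the names come from the message, and the bodies are one possible interpretation, not the author's implementation. The xpath_single() body assumes unnest() is available, which means PostgreSQL 8.4 or later rather than the 8.3 beta discussed in the thread.

```sql
-- Sketch only: these wrappers are not part of PostgreSQL.
-- Positional parameters ($1, $2) keep the bodies usable on older releases.

-- First match only, or NULL when the path matches nothing.
CREATE FUNCTION xpath_first(text, xml) RETURNS xml AS $$
    SELECT (xpath($1, $2))[1];
$$ LANGUAGE sql;

-- All matches concatenated into one xml value (NULL when there are none).
-- Assumes unnest() exists, i.e. PostgreSQL 8.4 or later.
CREATE FUNCTION xpath_single(text, xml) RETURNS xml AS $$
    SELECT xmlagg(x) FROM unnest(xpath($1, $2)) AS t(x);
$$ LANGUAGE sql;

-- Usage against the table from the thread:
SELECT xpath_first('/MedlineCitation/PMID/text()', content)
FROM medline.medlinexml
LIMIT 10;
```

Keeping the wrappers in user space leaves xpath() itself generic, which matches the reasoning given in the message for not building them in.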

On Nov 22, 2007 5:42 AM, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:
> The xpath() function added in 8.3 is a generic function, so it really
> does always return xml[]. That was the main aim -- to add the generic
> function.
>
> You can easily create any wrapper to meet your needs. For example,
> something like xpath_first() that always returns only the first xml
> chunk, or xpath_single() that concatenates all xml chunks and
> returns a single xml value.
>
> Anyway, in both cases you break significantly from the general XML
> semantics -- that's why such things are not implemented by default. BTW,
> maybe some convenient wrappers will be added to Postgres in the
> future, but surely this should be done only after a good volume of
> practical experience has been collected.

Nikolay,

I agree almost fully with these thoughts. I have not used xpath much, so I don't know what the default behavior is (in terms of a spec). I would suggest that the default behavior in postgres should match whatever the spec says. That said, the utility of the current implementation is fantastic, especially when combined with the ability to write custom functions when necessary.

Thanks again,
Sean