Обсуждение: Using Lateral
In the past I could use this in a query:
SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END country,
No longer. The error message suggests I should use a lateral query.
But I could not figure out in the documentation how to get the same
result using a "lateral" construct.
Just selecting "unnest(...)" gives the wrong result because if the
xpath does not exist all the other parts of the query (like 'ut' in
this case) also are empty.
Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
On 03/27/2018 03:22 AM, Johann Spies wrote:
> In the past I could use this in a query:
>
> SELECT
> DISTINCT ut,
> CASE
> WHEN xpath_exists ('//t:address_spec/t:country/text()',
> q.address_spec,
> p.ns)
> THEN unnest (xpath ('//t:address_spec/t:country/text()',
> q.address_spec,
> p.ns))::citext
> ELSE NULL
> END country,
>
> No longer. The error message suggests I should use a lateral query.
> But I could not figure out in the documentation how to get the same
> result using a "lateral" construct.
>
> Just selecting "unnest(...)" gives the wrong result because if the
> xpath does not exist all the other parts of the query (like 'ut' in
> this case) also are empty.
It is hard to suggest something without seeing your whole query (e.g.
how are you joining q & p?). But it sounds like you basically want a
left join to the unnested xpath result. It could be a lateral join or not.
It is common to use UNNEST with an implicit lateral join, like this:
SELECT ...
FROM q, UNNEST(xpath('...', q.address_spec))
But that gives you an inner join. To get an outer join you need to be
more explicit. Something like this:
SELECT ...
FROM q, p
LEFT OUTER JOIN LATERAL (
SELECT *
FROM unnest(xpath('//t:address_spec/t:country/text()',
q.address_spec, p.ns))::citext
) x(country)
ON true
(Presumably you would do something more restrictive to connect q & p
though.)
Yours,
--
Paul ~{:-)
pj@illuminatedcomputing.com
Thanks Paul.
I was hesitant to post my full query. It is a long and complicated
query. But here it is now:
WITH p AS (
SELECT
ARRAY [ ARRAY [ 't',
'http://scientific.thomsonreuters.com/schema/wok5.4/public/FullRecord'
] ] AS ns),
uts AS (
SELECT
s.ut
FROM
wos_source.core_2015 s
WHERE
s.id BETWEEN 999900
AND 100000
),
utsb AS (
SELECT
b.ut
FROM
wos_2017_1.belongs2 b,
uts
WHERE
b.ut = uts.ut), q AS (
SELECT
s.ut,
unnest (xpath
('//t:static_data/t:fullrecord_metadata/t:addresses/t:address_name/t:address_spec',
xml,
ns)) AS address_spec
FROM
p,
uts a
LEFT JOIN utsb b ON b.ut = a.ut
LEFT JOIN wos_source.core_2015 s ON s.ut = a.ut
WHERE
b.ut IS NULL), r AS (
SELECT
s.ut,
unnest (xpath
('//t:static_data/t:item/t:reprint_contact/t:address_spec',
xml,
ns)) AS raddress_spec
FROM
p,
wos_2017_1.publication l,
uts a
LEFT JOIN utsb b ON b.ut = a.ut
LEFT JOIN wos_source.core_2015 s ON s.ut = a.ut
WHERE
b.ut IS NULL
AND xpath_exists
('//t:static_data/t:item/t:reprint_contact/t:address_spec', xml, ns)
AND s.ut = l.ut
AND l.pubyear < 1998), qd AS (
SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END country,
CASE
WHEN xpath_exists ('//t:address_spec/t:city/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:city/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END city,
CASE
WHEN xpath_exists ('//t:organizations/t:organization/text()',
q.address_spec,
ns)
THEN unnest (xpath ('//t:organizations/t:organization/text()',
q.address_spec,
ns))::citext
ELSE NULL
END organisation,
CASE
WHEN xpath_exists
('//t:organizations/t:organization[@pref="Y"]/text()',
q.address_spec,
ns)
THEN unnest (xpath
('//t:organizations/t:organization[@pref="Y"]/text()',
q.address_spec,
ns))::citext
ELSE NULL
END AS prefname,
CASE
WHEN xpath_exists ('//t:suborganizations/t:suborganization/text()',
q.address_spec,
ns)
THEN unnest (xpath ('//t:suborganizations/t:suborganization/text()',
q.address_spec,
ns))::citext
ELSE NULL
END suborgname,
CASE
WHEN xpath_exists ('/t:address_spec/@addr_no',
q.address_spec,
ns)
THEN (xpath ('/t:address_spec/@addr_no',
q.address_spec,
ns))
[ 1 ]::text::INTEGER
ELSE NULL
END addr_no
FROM
p,
q),
rd AS (
SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
r.raddress_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
r.raddress_spec,
p.ns))::citext
ELSE NULL
END country,
CASE
WHEN xpath_exists ('//t:address_spec/t:city/text()',
r.raddress_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:city/text()',
r.raddress_spec,
p.ns))::citext
ELSE NULL
END city,
CASE
WHEN xpath_exists ('//t:organizations/t:organization/text()',
r.raddress_spec,
ns)
THEN unnest (xpath ('//t:organizations/t:organization/text()',
r.raddress_spec,
ns))::citext
ELSE NULL
END organisation,
CASE
WHEN xpath_exists
('//t:organizations/t:organization[@pref="Y"]/text()',
r.raddress_spec,
ns)
THEN unnest (xpath
('//t:organizations/t:organization[@pref="Y"]/text()',
r.raddress_spec,
ns))::citext
ELSE NULL
END AS prefname,
CASE
WHEN xpath_exists ('//t:suborganizations/t:suborganization/text()',
r.raddress_spec,
ns)
THEN unnest (xpath ('//t:suborganizations/t:suborganization/text()',
r.raddress_spec,
ns))::citext
ELSE NULL
END suborgname,
CASE
WHEN xpath_exists ('/t:address_spec/@addr_no',
r.raddress_spec,
ns)
THEN (xpath ('/t:address_spec/@addr_no',
r.raddress_spec,
ns))
[ 1 ]::text::INTEGER
ELSE NULL
END reprint_addr_no
FROM
p,
r
WHERE
r.raddress_spec IS NOT NULL), uq AS (
SELECT
DISTINCT qd.ut,
qd.addr_no::INTEGER,
0 AS reprint_addr_no,
c.uuid city_id,
y.uuid country_id,
o.uuid organisation_id,
u.uuid suborg_id,
p.uuid pref_name_id
FROM
qd
LEFT JOIN wos_2017_1.city c ON (c.city = qd.city)
LEFT JOIN wos_2017_1.country_alias y ON (y.country = qd.country)
LEFT JOIN wos_2017_1.organisation o ON (o.organisation = regexp_replace (
regexp_replace (
regexp_replace (qd.organisation, '<', '<', 'g'),
'&', '&', 'g'),
'>', '>', 'g'))
LEFT JOIN wos_2017_1.org_pref_name p ON (p.name = qd.prefname)
LEFT JOIN wos_2017_1.suborg u ON (u.suborgname = regexp_replace (
regexp_replace (
regexp_replace (qd.suborgname, '<', '<', 'g'),
'&', '&', 'g'),
'>', '>', 'g'))),
ur AS (
SELECT
DISTINCT rd.ut,
0 AS addr_no,
rd.reprint_addr_no::INTEGER,
c.uuid city_id,
y.uuid country_id,
o.uuid organisation_id,
u.uuid suborg_id,
p.uuid pref_name_id
FROM
r,
rd
LEFT JOIN wos_2017_1.city c ON (c.city = rd.city)
LEFT JOIN wos_2017_1.country_alias y ON (y.country = rd.country)
LEFT JOIN wos_2017_1.organisation o ON (o.organisation = regexp_replace (
regexp_replace (
regexp_replace (rd.organisation, '<', '<', 'g'),
'&', '&', 'g'),
'>', '>', 'g'))
LEFT JOIN wos_2017_1.org_pref_name p ON (p.name = rd.prefname)
LEFT JOIN wos_2017_1.suborg u ON (u.suborgname = regexp_replace (
regexp_replace (
regexp_replace (rd.suborgname, '<', '<', 'g'),
'&', '&', 'g'),
'>', '>', 'g'))
WHERE
r.raddress_spec IS NOT NULL), qr AS (
SELECT
*
FROM
uq
UNION
SELECT
ur.*
FROM
ur)
SELECT
DISTINCT ON (qr.ut,
qr.addr_no,
qr.reprint_addr_no,
a.uuid,
qr.organisation_id,
qr.suborg_id,
qr.pref_name_id)
qr.ut,
qr.addr_no,
qr.reprint_addr_no,
a.uuid AS address_id,
qr.organisation_id,
qr.suborg_id,
qr.pref_name_id,
uuid_generate_v1 ()
uuid
FROM
qr
LEFT JOIN wos_2017_1.address a ON (a.city_id = qr.city_id
AND a.country_id = qr.country_id)
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
On 03/28/2018 05:22 AM, Johann Spies wrote:
> Thanks Paul.
>
> I was hesitant to post my full query. It is a long and complicated
> query.
Ha ha, you aren't joking. :-)
With something that long I don't think I'd want to split every xpath
call into a new join. I guess you could try it and see though.
Stepping back, your original query is prepared for xpath to give 0
results or 1 result, but not 2+, and the problem is that you're getting
a 2+ result from the address_spec element. What is the right behavior
there? To get two rows in the end result? Just use the first/last
address_spec?
If you want to force 0/1 results, you can keep your structure and do this:
SELECT DISTINCT ut,
(xpath('//t:address_spec/t:country/text()', q.address_spec,
p.ns))[1]::citext AS country,
(xpath('//t:address_spec/t:city/text()', q.address_spec,
p.ns))[1]::citext AS city,
(xpath('//t:organizations/t:organization/text()',
q.address_spec, p.ns))[1]::citext AS organisation,
(xpath('//t:organizations/t:organization[@pref="Y"]/text()',
q.address_spec, p.ns))[1]::citext AS prefname,
(xpath ('//t:suborganizations/t:suborganization/text()',
q.address_spec, p.ns))[1]::citext AS suborgname,
(xpath ('/t:address_spec/@addr_no', q.address_spec,
p.ns))[1]::text::integer AS addr_no
(Actually I see you are already doing that for addr_no. And an aside:
did you intend `/t:address_spec` instead of `//t:address_spec` there?)
If you would prefer to get multiple rows back, then you'll probably need
a subquery to give one row per `xpath('//t:address_spec')`, so that you
can keep the cities/countries/addr_nos together.
Anyway, I hope that's enough to get you started on the right path! The
root cause is that UNNEST is giving you two rows where it only makes
sense to have one, so you need to restrict that or pull it into a
context where several rows are allowed (like a join).
--
Paul ~{:-)
pj@illuminatedcomputing.com
On Tuesday, March 27, 2018, Johann Spies <johann.spies@gmail.com> wrote:
In the past I could use this in a query:
SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END country,
[...]
Just selecting "unnest(...)" gives the wrong result because if the
xpath does not exist all the other parts of the query (like 'ut' in
this case) also are empty.
You should be able to solve the empty-set-in-target-list problem via a scalar subquery instead of a case construct.
Select distinct ut, (select unnest(...)) as country from ...
The subselect wil return null if fed zero rows. Though you will still have to solve an issue if the unrest returns 1+ rows.
In lieu of the inline scalar subquery I would advise writing a function and just calling it directly in the target-list. But it should not return setof, I'd return an array if you need to accept the possibility of 1+ matches, and return an empty array for zero matches.
David J.
Thanks David and Paul,
You have helped me a lot.
Regards
Johann.
On 28 March 2018 at 20:49, David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Tuesday, March 27, 2018, Johann Spies <johann.spies@gmail.com> wrote:
>>
>> In the past I could use this in a query:
>>
>> SELECT
>> DISTINCT ut,
>> CASE
>> WHEN xpath_exists ('//t:address_spec/t:country/text()',
>> q.address_spec,
>> p.ns)
>> THEN unnest (xpath ('//t:address_spec/t:country/text()',
>> q.address_spec,
>> p.ns))::citext
>> ELSE NULL
>> END country,
>> [...]
>> Just selecting "unnest(...)" gives the wrong result because if the
>> xpath does not exist all the other parts of the query (like 'ut' in
>> this case) also are empty.
>
>
> You should be able to solve the empty-set-in-target-list problem via a
> scalar subquery instead of a case construct.
>
> Select distinct ut, (select unnest(...)) as country from ...
>
> The subselect wil return null if fed zero rows. Though you will still have
> to solve an issue if the unrest returns 1+ rows.
>
> In lieu of the inline scalar subquery I would advise writing a function and
> just calling it directly in the target-list. But it should not return
> setof, I'd return an array if you need to accept the possibility of 1+
> matches, and return an empty array for zero matches.
>
> David J.
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)