Обсуждение: Why does this array query fail?

Поиск
Список
Период
Сортировка

Why does this array query fail?

От
Ken Tanzer
Дата:
Hi.  Can someone explain to me why the last query below is failing, or what exactly the error message means?  I'm sure there's a simple reason, but I'm totally not seeing it.  I boiled this down from a more complicated example, but I think the problem is the same.  Thanks in advance.

Ken

ets_reach=> SELECT ARRAY['test','pass','fail'];
      array       
------------------
 {test,pass,fail}
(1 row)

ets_reach=> SELECT 'found' WHERE 'test' = ANY( ARRAY['test','pass','fail'] );
 ?column? 
----------
 found
(1 row)

ets_reach=> SELECT 'found' WHERE ARRAY['test','pass','fail'] = (SELECT ARRAY['test','pass','fail']);
 ?column? 
----------
 found
(1 row)

ets_reach=> SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pass','fail']) );

ERROR:  array value must start with "{" or dimension information
LINE 1: SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pas...
                             ^


--
AGENCY Software  
A data system that puts you in control
100% Free Software
(253) 245-3801

learn more about AGENCY or
follow the discussion.

Re: Why does this array query fail?

От
bricklen
Дата:
On Mon, Sep 16, 2013 at 5:32 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:
SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pass','fail']) );


It works if you drop the inner SELECT.

SELECT 'found' WHERE 'test' = ANY( ARRAY['test','pass','fail'] );

Re: Why does this array query fail?

От
David Johnston
Дата:
Ken Tanzer wrote
> ets_reach=> SELECT 'found' WHERE 'test' = ANY( (SELECT
> ARRAY['test','pass','fail']) );
>
> ERROR:  array value must start with "{" or dimension information
> LINE 1: SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pas...
>                              ^

Per documentation of "ANY" it accepts either an array or a subquery.  This
is the subquery form.  PostgreSQL is trying to convert 'test' into an array
in order to match the array column returned by the subquery.

Remove the "SELECT" to make it work the way you expect - i.e., the
ANY(array) form of the expression.

David J.





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771170.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Why does this array query fail?

От
Ken Tanzer
Дата:
Thanks for the explanation.  I think I at least understand what it's doing now.  I'm either surprised or confused though, as I was under the impression that you could substitute a subquery for a value pretty much anywhere, but I guess that's not the case?

Cheers,
Ken


On Mon, Sep 16, 2013 at 6:16 PM, David Johnston <polobo@yahoo.com> wrote:
Ken Tanzer wrote
> ets_reach=> SELECT 'found' WHERE 'test' = ANY( (SELECT
> ARRAY['test','pass','fail']) );
>
> ERROR:  array value must start with "{" or dimension information
> LINE 1: SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pas...
>                              ^

Per documentation of "ANY" it accepts either an array or a subquery.  This
is the subquery form.  PostgreSQL is trying to convert 'test' into an array
in order to match the array column returned by the subquery.

Remove the "SELECT" to make it work the way you expect - i.e., the
ANY(array) form of the expression.

David J.





--
View this message in context: http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771170.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
AGENCY Software  
A data system that puts you in control
100% Free Software
(253) 245-3801

learn more about AGENCY or
follow the discussion.

Re: Why does this array query fail?

От
bricklen
Дата:

On Mon, Sep 16, 2013 at 6:36 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:
Thanks for the explanation.  I think I at least understand what it's doing now.  I'm either surprised or confused though, as I was under the impression that you could substitute a subquery for a value pretty much anywhere, but I guess that's not the case?

Cheers,
Ken


Your subquery can also be explicitly casted to make it work. Note the "::TEXT[]"

SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pass','fail'])::TEXT[] );

Re: Why does this array query fail?

От
Ken Tanzer
Дата:
OK I tried that and see it works with the cast.  But now I'm confused about both what exactly is failing without the cast, and about the resulting error message.

Is the query failing because PG doesn't understand the subquery is yielding an array?  Seems unlikely.  But if the problem is a type mismatch between 'test' (on the left) and my subquery, I'd expect the same error message as if I try to compare an int to a text array:

SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','fail']));
ERROR:  operator does not exist: integer = text[]
LINE 1: SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','f...
                               ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

Instead of the error message I actually got:

ERROR:  array value must start with "{" or dimension information

Thanks.

Ken





On Mon, Sep 16, 2013 at 6:42 PM, bricklen <bricklen@gmail.com> wrote:

On Mon, Sep 16, 2013 at 6:36 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:
Thanks for the explanation.  I think I at least understand what it's doing now.  I'm either surprised or confused though, as I was under the impression that you could substitute a subquery for a value pretty much anywhere, but I guess that's not the case?

Cheers,
Ken


Your subquery can also be explicitly casted to make it work. Note the "::TEXT[]"

SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pass','fail'])::TEXT[] );



--
AGENCY Software  
A data system that puts you in control
100% Free Software
(253) 245-3801

learn more about AGENCY or
follow the discussion.

Re: Why does this array query fail?

От
Ken Tanzer
Дата:
Well I partially take back my last question.  In the error message, I missed the non-array / array part of "integer = text[]"

But I'm still confused.  My subselect returns an array.  If I cast it to a text array, ANY is happy.  But if I don't do so, what exactly does Postgres think my subquery has yielded?  And the error message still doesn't seem to make sense...



On Mon, Sep 16, 2013 at 6:59 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:
OK I tried that and see it works with the cast.  But now I'm confused about both what exactly is failing without the cast, and about the resulting error message.

Is the query failing because PG doesn't understand the subquery is yielding an array?  Seems unlikely.  But if the problem is a type mismatch between 'test' (on the left) and my subquery, I'd expect the same error message as if I try to compare an int to a text array:

SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','fail']));
ERROR:  operator does not exist: integer = text[]
LINE 1: SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','f...
                               ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

Instead of the error message I actually got:

ERROR:  array value must start with "{" or dimension information

Thanks.

Ken





On Mon, Sep 16, 2013 at 6:42 PM, bricklen <bricklen@gmail.com> wrote:

On Mon, Sep 16, 2013 at 6:36 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:
Thanks for the explanation.  I think I at least understand what it's doing now.  I'm either surprised or confused though, as I was under the impression that you could substitute a subquery for a value pretty much anywhere, but I guess that's not the case?

Cheers,
Ken


Your subquery can also be explicitly casted to make it work. Note the "::TEXT[]"

SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pass','fail'])::TEXT[] );



--
AGENCY Software  
A data system that puts you in control
100% Free Software

learn more about AGENCY or
follow the discussion.



--
AGENCY Software  
A data system that puts you in control
100% Free Software
(253) 245-3801

learn more about AGENCY or
follow the discussion.

Re: Why does this array query fail?

От
David Johnston
Дата:
Ken Tanzer wrote
> Well I partially take back my last question.  In the error message, I
> missed the non-array / array part of "integer = text[]"
>
> But I'm still confused.  My subselect returns an array.  If I cast it to a
> text array, ANY is happy.  But if I don't do so, what exactly does
> Postgres
> think my subquery has yielded?  And the error message still doesn't seem
> to
> make sense...
>
>
>
> On Mon, Sep 16, 2013 at 6:59 PM, Ken Tanzer <

> ken.tanzer@

> > wrote:
>
>> OK I tried that and see it works with the cast.  But now I'm confused
>> about both what exactly is failing without the cast, and about the
>> resulting error message.
>>
>> Is the query failing because PG doesn't understand the subquery is
>> yielding an array?  Seems unlikely.  But if the problem is a type
>> mismatch
>> between 'test' (on the left) and my subquery, I'd expect the same error
>> message as if I try to compare an int to a text array:
>>
>> SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','fail']));
>> ERROR:  operator does not exist: integer = text[]
>> LINE 1: SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','f...
>>                                ^
>> HINT:  No operator matches the given name and argument type(s). You might
>> need to add explicit type casts.
>>
>> Instead of the error message I actually got:
>>
>> ERROR:  array value must start with "{" or dimension information
>>
>> Thanks.
>>
>> Ken
>>
>>
>>>>
>>> Your subquery can also be explicitly casted to make it work. Note the
>>> "::TEXT[]"
>>>
>>> SELECT 'found' WHERE 'test' = ANY( (SELECT
>>> ARRAY['test','pass','fail'])::TEXT[] );
>>>

Can we please follow list norms (or at least my example since I was the
first to respond) and bottom-post.

Sub-queries come in a few different flavors:

This:

ANY( (SELECT ARRAY[])::text[] )

works for the same reason this:

SELECT (SELECT col1 FROM (VALUES (ARRAY[4,5,6])) tbl (col1))::integer[]

works;

but this:

SELECT (SELECT col1 FROM (VALUES (ARRAY[1,2,3]),(ARRAY[4,5,6])) tbl
(col1))::integer[]

fails.

For this explanation I will simplify and state that there are two kinds of
sub-queries:

Scalar
Table-like (i.e., non-scalar)

A scalar sub-query must return, at most, one row and only a single column.
A scalar sub-query can be used wherever a literal value is needed.

A table-like sub-query can return as many rows as desired and either one or
an unlimited number of columns - context depending.  In the case
"ANY(sub-query)" context it may only return a single column (but still
multiple rows).  In this context PostgreSQL goes looking for an operator -
e.g., equals(=) - with the right-side argument being of the type of the
sub-query column.  Since equals needs to have a matching type on the
left-side PostgreSQL presumes that whatever is on the left side must be of
the same type.  In this example you supplied an unadorned literal ('test')
which has no type information.  Thus PostgreSQL attempts to cast the unknown
literal to the type it requires (text[]) and fails since an array literal
must begin with '{'.  The attempt to cast 'test' failed.

WHERE 1 = ANY(SELECT ARRAY[]:text[])

This gives a different error because PostgreSQL knows that the number "1" is
an integer and thus has a known type - no cast is necessary.  However, since
the type "integer" does not match the needed type "text[]" a type mis-match
error is thrown or in this case no operator equals(integer, text[]) was
located.

It is the context of the use of the sub-query; not the form of the query
itself, that determines whether a particular sub-query will be treated as
scalar or table-like.  It is because "ANY(...)" can accept either a literal
or a table-like sub-query that this ambiguity arises.

The example query just happens to only have a single row but nothing is
explicitly stopping it from generating more.  The work-around of casting the
sub-query to "text[]" works because the system knows that it must either get
a scalar result or the sub-query will throw an exception (as in my failing
example above).  The system cannot use the actual number of rows returned to
make the decision and so if you know that only one row (at most) can be
returned and you want a scalar interpretation you have to explicitly
indicate that in the query.

There is quite a bit more to this that can be gleaned by reading the
documentation for sub-queries.

Note that "(SELECT ARRAY[...])::text[]" is only a proper solution if you
cannot directly invoke the ARRAY[] syntax.  Where it comes in handy would be
something like:

WITH array_to_check (atc) AS ( VALUES (ARRAY[1,2,3]::integer[]) )
SELECT ...
FROM ...
WHERE 2 = ANY((SELECT atc FROM array_to_check)::integer[])

For the most part, however, forcing a non-scalar sub-query to become a
scalar sub-query is an indication that you are doing something wrong.

It may help to think of:

ANY(subquery) as meaning:

ANY(setof "whatever column type the subquery returns as its only column")

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771183.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Why does this array query fail?

От
Ken Tanzer
Дата:
Can we please follow list norms (or at least my example since I was the
first to respond) and bottom-post.

Absolutely.  Gmail did it without my realizing, but my bad and I'm all for following the list conventions.

Thanks for taking the time to explain this stuff, which I appreciate.  Mostly it makes sense, but a couple of things remain puzzling to me.

1)  On what exactly does PG base its decision to interpret the ANY as scalar or not?  Or are you saying a sub-query will always be treated as non-scalar, unless it is explicitly cast to an array?

2) Regarding: 
In this context PostgreSQL goes looking for an operator -
e.g., equals(=) - with the right-side argument being of the type of the
sub-query column. 

Why?  In this case you have ANY (varchar[]), which as I understand it "=" needs to compare with a varchar.  So why is it looking for an array?  If it was just varchar = varchar[], I'd get that, but in this case it's ANY(varchar[]), so does PG extract the type of the argument to ANY without paying attention to the ANY itself?

There is quite a bit more to this that can be gleaned by reading the
documentation for sub-queries.

 I'm not sure if there's something specific you're referring to.  I had looked at the page on subquery expressions (http://www.postgresql.org/docs/9.0/static/functions-subquery.html), as well as the following page on "row and array comparisons" to see the two forms of ANY, but don't see anything that covers these nuances in greater depth.  Is there another page I should be looking at?

Note that "(SELECT ARRAY[...])::text[]" is only a proper solution if...

Yeah, I tried to boil down my example, but this is closer to what I was really trying to do:

CREATE TEMP TABLE mytable ( codes varchar[] );
INSERT INTO mytable VALUES ( array[ 'pass','fail'] );
INSERT INTO mytable VALUES ( array[ 'found'] );
SELECT 'found' WHERE 'found' =ANY(
        (SELECT array_agg(code) FROM (SELECT unnest(codes) AS code FROM mytable) foo
)
);

And for immediate purposes, found this worked just as well (as a non-scalar subquery, I guess):

SELECT 'found' WHERE 'found' =ANY(
        (SELECT unnest(codes) AS code FROM mytable)
);

Thanks again for your help and explanations!

Ken

Re: Why does this array query fail?

От
David Johnston
Дата:
Ken Tanzer wrote
> 1)  On what exactly does PG base its decision to interpret the ANY as
> scalar or not?  Or are you saying a sub-query will always be treated as
> non-scalar, unless it is explicitly cast to an array?

Correct.  With respect to a sub-query inside ANY(...) it will be treated as
non-scalar.  You can explicitly make it scalar by casting it to an array -
understanding that the query will fail if the sub-query does not actually
conform.


> 2) Regarding:
>
>> In this context PostgreSQL goes looking for an operator -
>> e.g., equals(=) - with the right-side argument being of the type of the
>> sub-query column.
>
>
> Why?  In this case you have ANY (varchar[]), which as I understand it "="
> needs to compare with a varchar.  So why is it looking for an array?  If
> it
> was just varchar = varchar[], I'd get that, but in this case it's
> ANY(varchar[]), so does PG extract the type of the argument to ANY without
> paying attention to the ANY itself?

No.  The sub-query version is basically:

varchar = ANY (setof varchar[]) ... which is wrong

varchar[] = ANY (setof varchar[]) ... is correct

The "setof" is the difference; its not trying to look inside the array but
rather looking for an entire array that matches one of the arrays the
sub-query generates.

ARRAY[1,2,3] = ANY ( SELECT col1 FROM (VALUES (ARRAY[2,3,4]::int[]),
(ARRAY[1,2,3])) src (col1) )

In the above ANY has to decide whether {2,3,4} or {1,2,3} is equal to the
input; which must be an array.  It does not mean "does the number 1 exist in
any of the supplied arrays".  Again, it becomes more clear if you understand
ANY(subquery) can return more than one row.



>  Is there another page I should be looking at?

Not that I can think of offhand.


> Note that "(SELECT ARRAY[...])::text[]" is only a proper solution if...
>
>
> Yeah, I tried to boil down my example, but this is closer to what I was
> really trying to do:
>
> CREATE TEMP TABLE mytable ( codes varchar[] );
> INSERT INTO mytable VALUES ( array[ 'pass','fail'] );
> INSERT INTO mytable VALUES ( array[ 'found'] );
> SELECT 'found' WHERE 'found' =ANY(
>         (SELECT array_agg(code) FROM (SELECT unnest(codes) AS code FROM
> mytable) foo
>  )
> );
>
>
> And for immediate purposes, found this worked just as well (as a
> non-scalar
> subquery, I guess):
>
> SELECT 'found' WHERE 'found' =ANY(
>         (SELECT unnest(codes) AS code FROM mytable)
> );

Yes, un-nesting can make the problem go away though it too is unusual.  For
the most part either use relations/sets or use arrays (for a specific
component of the schema).  Your example mixes the two which makes using that
part of the schema difficult.

David J.







--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771343.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Why does this array query fail?

От
Ken Tanzer
Дата:
Thanks again David.  I think that's all making sense to me now, except I want to follow up on your last point:

Yes, un-nesting can make the problem go away though it too is unusual.  For the most part either use relations/sets or use arrays (for a specific
component of the schema).  Your example mixes the two which makes using that part of the schema difficult.

I'm not sure exactly what you're saying here, but it's important to me because I've recently had to do a lot with arrays, and continue to have to do so.  What I'm working with is similar to the example I gave you, but let me be more concrete.

I'm working with a social service agency.  Every time they see a client they fill out an "encounter" record.  Part of what is tracked is what kind of services were provided, which is stored as an array of "service codes" within the encounter.  The encounter also has a date.

So I frequently have to provide information like "what were all the types of services this client received during the last quarter?" or "show me all the clients who received service X last year."  I've learned enough to use ANY, array_agg and unnest to get through these queries, but if I'm going about this wrong or there's a better way to do it I'd love to know about it!

Ken

Re: Why does this array query fail?

От
David Johnston
Дата:
Ken Tanzer wrote
> So I frequently have to provide information like "what were all the types
> of services this client received during the last quarter?" or "show me all
> the clients who received service X last year."  I've learned enough to use
> ANY, array_agg and unnest to get through these queries, but if I'm going
> about this wrong or there's a better way to do it I'd love to know about
> it!

Your example query does not ask those questions.

SELECT DISTINCT service_code
FROM (SELECT unnest(services_rendered_array) AS service_code FROM
services_tables WHERE ...) svcs;

SELECT DISTINCT client_id FROM (
SELECT * FROM services_table WHERE 'X' = ANY(services_rendered_array)
) svcs;

In neither case do you need to use a sub-query answer the question.  Namely,
what you describe makes use of arrays only, and not relations (though the
allowed array item values could be defined on a table somewhere).

Option A:
A. T1: session_id, client_id, service_codes[], date

Note that A is the basic structured assumed for the two example queries
above.

Option B:
B. T1: session_id, session_date, client_id
B. T2: session_id (FK-many), service_code


B. T2 would have a single record for each service performed within a given
session while A. T1 models the multiple service aspect of a session by using
an array.

Incorrect Option C:
C. T1: session_id, session_date, client_id
C. T2: session_id, service_codes[]

This makes use of a one-to-many relationship but also embeds yet another
"many" aspect within C. T2  This is generally going to be a bad idea as you
are now mixing the models together.  And note that I do qualify this as
generally since you may very well decide that C is an elegant and/or the
most correct way to model your domain.


David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771359.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Why does this array query fail?

От
Ken Tanzer
Дата:
Based on what you described, I think I've generally gone with option A.  Conceptually I like B better, but it's generally more complicated and seems like overkill for simple checkbox-type options.  (But as an aside,  I am looking forward to the time when ELEMENT FKs overcome their performance issues and become part of Postgres!)

The trouble seems to be that even with Option A (services_codes in an array within an encounter record), you still kind of end up with option C on a client level:

client {service_codes}
client {service_codes}

There may be no way around it, but it seems like you end up needing to write rather cumbersome queries to get at your data.  OTOH there's always room for improvement; since I'm relatively new to working extensively with arrays, I'm hoping they become more intuitive and less painful as one gets used to them. :)

SELECT client_id,
COALESCE(
  (SELECT array_agg(code) FROM (
    SELECT distinct
      client_id,unnest(accessed_health_care_non_urgent_codes) AS code
    FROM service_reach
    WHERE client_id=client.client_id
    AND service_date BETWEEN '2013-08-01' AND '2013-08-31'
    ) foo
  ),array['(none)'])
AS accessed_health_care_non_urgent_codes
FROM client;

It's probably way more detail than you want, but I've attached the table structure and pasted in a quarterly report that the query above was taken from in case you have any pointers or are simply curious.  

Thanks again!

Ken

/*
CREATE OR REPLACE VIEW hch_quarterly AS
*/

SELECT *
FROM (

SELECT


export_id,
UPPER(SUBSTRING(name_last,1,2) || SUBSTRING(name_first,1,2) || COALESCE(to_char(dob,'MMDDYY'),'')) AS hch_id,
name_last,
name_first,
dob,
'2013-01-01' AS quarter_start_date,
'2013-03-31' AS quarter_end_date,
referral_source_code || COALESCE(' (' || referral_source_other || ')','') AS referral_source,
facility_code AS living_situation_end,

/*
unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(other_living_situation_codes) AS code

FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31') foo
),array['(none)'])) AS other_sleeping_codes,
*/

unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(other_living_situation_codes) AS code
FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31'

UNION SELECT distinct client_id,moved_from_code AS code
FROM residence_other
WHERE client_id=client.client_id
AND residence_date BETWEEN '2013-01-01' AND '2013-03-31' ) foo
),array['(none)'])) AS other_sleeping_codes,

unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(accessed_health_care_non_urgent_codes) AS code
FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31') foo
),array['(none)'])) AS accessed_health_care_non_urgent_codes,

unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(accessed_services_cd_codes) AS code

FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31') foo
),array['(none)'])) AS accessed_services_cd_codes,

unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(completed_services_cd_codes) AS code

FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31') foo
),array['(none)'])) AS completed_services_cd_codes,
unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(accessed_services_mh_codes) AS code

FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31') foo
),array['(none)'])) AS accessed_services_mh_codes,
CASE WHEN client_id IN (SELECT DISTINCT ON (client_id,staff_assign_date) client_id
FROM staff_assign
WHERE staff_assign_type_code='PAYEE' AND staff_assign_date<='2013-03-31' AND COALESCE(staff_assign_date_end,'2013-03-31') >= '2013-01-01'
ORDER BY client_id,staff_assign_date DESC) THEN 'Yes' ELSE 'No' END AS has_payee,
CASE WHEN client_id IN (SELECT client_id
FROM staff_assign
WHERE staff_assign_type_code='OUTREACH' AND staff_assign_date<='2013-03-31' AND COALESCE(staff_assign_date_end,'2013-03-31') >= '2013-01-01'
ORDER BY client_id,staff_assign_date DESC) THEN 'Yes' ELSE 'No' END AS outreach_client,
CASE WHEN client_id IN (SELECT client_id
FROM staff_assign
WHERE staff_assign_type_code='CM' AND staff_assign_date<='2013-03-31' AND COALESCE(staff_assign_date_end,'2013-03-31') >= '2013-01-01'
ORDER BY client_id,staff_assign_date DESC) THEN 'Yes' ELSE 'No' END AS cm_client,

service_plan_status_code

FROM client
LEFT JOIN (SELECT DISTINCT ON (client_id) *
FROM intake_reach
WHERE intake_reach_date <= '2013-03-31' 
ORDER BY client_id,intake_reach_date DESC ) AS ir USING (client_id)
LEFT JOIN (SELECT DISTINCT ON (client_id) *
FROM residence_other
WHERE residence_date <= '2013-03-31' AND COALESCE(residence_date_end,'2013-03-31')>='2013-01-01'
ORDER BY client_id,residence_date DESC) AS ro USING (client_id)
LEFT JOIN (SELECT DISTINCT ON (client_id) *
FROM service_reach
WHERE service_date BETWEEN '2013-01-01' AND '2013-03-31'
ORDER BY client_id,service_date DESC) AS sr USING (client_id)
LEFT JOIN (SELECT client_id,export_id
FROM client_export_id
WHERE export_organization_code='HCH') exp USING (client_id)
WHERE client_id IN (SELECT client_id
FROM staff_assign
WHERE staff_assign_date <= '2013-03-31' AND staff_assign_type_code IN ('CM','OUTREACH') AND COALESCE(staff_assign_date_end,'2013-03-31') >= '2013-01-01' AND staff_project(staff_id) IN ('OUTREACH','REACH'))

) AS whole_shebang
--ORDER BY client_name(client_id)


On Tue, Sep 17, 2013 at 5:02 PM, David Johnston <polobo@yahoo.com> wrote:
Ken Tanzer wrote
> So I frequently have to provide information like "what were all the types
> of services this client received during the last quarter?" or "show me all
> the clients who received service X last year."  I've learned enough to use
> ANY, array_agg and unnest to get through these queries, but if I'm going
> about this wrong or there's a better way to do it I'd love to know about
> it!

Your example query does not ask those questions.

SELECT DISTINCT service_code
FROM (SELECT unnest(services_rendered_array) AS service_code FROM
services_tables WHERE ...) svcs;

SELECT DISTINCT client_id FROM (
SELECT * FROM services_table WHERE 'X' = ANY(services_rendered_array)
) svcs;

In neither case do you need to use a sub-query answer the question.  Namely,
what you describe makes use of arrays only, and not relations (though the
allowed array item values could be defined on a table somewhere).

Option A:
A. T1: session_id, client_id, service_codes[], date

Note that A is the basic structured assumed for the two example queries
above.

Option B:
B. T1: session_id, session_date, client_id
B. T2: session_id (FK-many), service_code


B. T2 would have a single record for each service performed within a given
session while A. T1 models the multiple service aspect of a session by using
an array.

Incorrect Option C:
C. T1: session_id, session_date, client_id
C. T2: session_id, service_codes[]

This makes use of a one-to-many relationship but also embeds yet another
"many" aspect within C. T2  This is generally going to be a bad idea as you
are now mixing the models together.  And note that I do qualify this as
generally since you may very well decide that C is an elegant and/or the
most correct way to model your domain.


David J.




--
View this message in context: http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771359.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
AGENCY Software  
A data system that puts you in control
100% Free Software
(253) 245-3801

learn more about AGENCY or
follow the discussion.
Вложения

Re: Why does this array query fail?

От
David Johnston
Дата:
Ken Tanzer wrote
>
> SELECT client_id,
> COALESCE(
>   (SELECT array_agg(code) FROM (
>     SELECT distinct
>       client_id,unnest(accessed_health_care_non_urgent_codes) AS code
>     FROM service_reach
>     WHERE client_id=client.client_id
>     AND service_date BETWEEN '2013-08-01' AND '2013-08-31'
>     ) foo
>   ),array['(none)'])
> AS accessed_health_care_non_urgent_codes
> FROM client;

Equivalent semantics:

WITH clients_with_codes AS (
SELECT client_id, array_agg(code) AS client_codes FROM (SELECT client_id,
unnest(accessed...) AS code FROM service_reach) foo GROUP BY client_id
)
SELECT client_id, COALESCE(client_codes, ARRAY['(none)']) AS client_codes
FROM client LEFT JOIN client_with_codes USING (client_id)

Should (recommend testing) perform better due to the simple fact that you
avoid the correlated sub-query (i.e., a sub-query that references the outer
query to obtain some parameter - in this case the client_id of the current
row).  The goal is to create an uncorrelated sub-query/relation that
contains all the data you require then JOIN it with the original outer
relation using the same equality you were using in the correlated version.

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771366.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Why does this array query fail?

От
David Johnston
Дата:
Ken Tanzer wrote
> It's probably way more detail than you want, but I've attached the table
> structure and pasted in a quarterly report that the query above was taken
> from in case you have any pointers or are simply curious.

Is this a quarterly report because that is how long it takes to run?

On a smaller scale I've written queries like this.  I enrolled in a
university database design course shortly thereafter...

I would suggest considering how to use functions to encapsulate some of the
"medical code collecting" logic.  And consider WITH/CTE constructs as well,
like I used in my last message, to effectively create temporary named tables
for different parts of the query.

Might want to move the whole thing into function and pass in the various
parameters - namely the date range - instead of hard-coding the values into
the view.

Those thoughts aside I've done my own share of "write-once, read never"
queries and if the performance meets your needs and the maintenance burden
is acceptable then sometimes you just leave this in place until someone more
knowledgeable (like a future you probably) comes back and decides to toss
and rewrite it.

David J.





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771367.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Why does this array query fail?

От
Ken Tanzer
Дата:
Is this a quarterly report because that is how long it takes to run?

It takes about 7 seconds to run.  I suppose if I optimized it I could save a minute every couple of years.

I usually get concerned about performance issues when they're actually causing problems.  I'm generally more concerned about how long it takes to write queries, and how cumbersome the SQL involved is.  And since arrays are relatively new to me, I've been trying to understand generally the best ways to query information out of them, or when their behavior just doesn't make sense to me.  I'll say I answer 99.99% of my own questions before they ever make it to the list, and by the time they do I invariably have read the documentation as best as I can.  This has been my favorite list ever to read, as people are invariably helpful, patient and polite to each other.

I would suggest considering how to use functions to encapsulate some of the "medical code collecting" logic.  And consider WITH/CTE constructs as well, like I used in my last message, to effectively create temporary named tables for different parts of the query.
 
Might want to move the whole thing into function and pass in the various
parameters - namely the date range - instead of hard-coding the values into the view.

Thanks for these constructive suggestions.  I see benefits both ways.  And the dates are actually parsed in by an app at run-time.  (I stripped that part out to avoid confusion--I find it hard to know when submitting a list item how much to just dump a full real example, and how much to simplify down to a test case that illustrates the specific issue.)

On a smaller scale I've written queries like this.  I enrolled in a
university database design course shortly thereafter...

until someone more knowledgeable (like a future you probably) comes back and

I'm not sure what the point of either of these comments were, but perhaps they made you feel better.  Either way, thanks for taking the time to look my stuff over and for the other comments and explanations you made.

Ken

Re: Why does this array query fail?

От
David Johnston
Дата:
Ken Tanzer wrote
>>
>> Is this a quarterly report because that is how long it takes to run?
>
>
> It takes about 7 seconds to run.  I suppose if I optimized it I could save
> a minute every couple of years.

Was meant to be tongue-in-cheek...though I was curious on the real
answer...agree on optimize when necessary.


> On a smaller scale I've written queries like this.  I enrolled in a
>> university database design course shortly thereafter...
>
> until someone more knowledgeable (like a future you probably) comes back
> and
>
> I'm not sure what the point of either of these comments were, but perhaps
> they made you feel better.  Either way, thanks for taking the time to look
> my stuff over and for the other comments and explanations you made.
>
> Ken

Mostly that SQL, and regular expressions - which I use a lot too, tend to
result in stuff that makes perfect sense as you build it up from scratch but
if you have to come back and modify it later its very difficult to makes
changes as opposed to building up a new query with the new needs and
experience.

David J.





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771691.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.