Обсуждение: Query pattern tha Postgres doesn't handle well

Поиск
Список
Период
Сортировка

Query pattern tha Postgres doesn't handle well

От
Greg Stark
Дата:
At my day job I've been doing a fair amount of routine query and
schema optimization and I've noticed on particular query shape that
has repeatedly caused problems, and it's one we've talked about
before.

select * from table where simple-restriction 0 OR (complex-subquery)

For example something like:

SELECT * FROM projects WHERE ispublic OR project_id IN (SELECT
project_id FROM project_members WHERE userid = ?)

Either half of this clause can easily be executed using indexes but
the combination forces Postgres to do a full sequential table scan.

The solution has been to rewrite each of these cases into:

SELECT * FROM projects WHERE ispublic
UNION ALL
SELECT * FROM projects WHERE NOT ispublic AND project_id IN (SELECT
project_id FROM project_members WHERE userid = ?)

But there are several problems with this.

1) It's often difficult to get the conditions to be exactly disjoint
such that you can use UNION ALL, and if you can't then UNION can be
slow and possibly even incorrect.

2) The resulting query is difficult to combine with other clauses.
They must either be copied into both sides of the UNION or wrapped
around the outside and hope that Postgres pushes them down into the
union branches where necessary. More complex conditions can't be
pushed down, and in particular combining two conditions that have been
rewritten to use union is very difficult.

3) It's extremely difficult to generate this kind of query in an ORM
such as ActiveRecord without breaking the abstractions and causing
surprising interactions.

I'm not sure I have much to offer here. I definitely don't know where
to start implementing it. But I know it's come up before on the list
and just thought I would mention that I've noticed it being a
recurring problem for at least this user.

-- 
greg


Re: Query pattern tha Postgres doesn't handle well

От
Tom Lane
Дата:
Greg Stark <stark@mit.edu> writes:
> At my day job I've been doing a fair amount of routine query and
> schema optimization and I've noticed on particular query shape that
> has repeatedly caused problems, and it's one we've talked about
> before.

> select * from table where simple-restriction 0 OR (complex-subquery)

> For example something like:

> SELECT * FROM projects WHERE ispublic OR project_id IN (SELECT
> project_id FROM project_members WHERE userid = ?)

> Either half of this clause can easily be executed using indexes but
> the combination forces Postgres to do a full sequential table scan.

Yeah.  This is at least related to, if not the exact same as,
what I was fooling with a year ago:

https://www.postgresql.org/message-id/flat/7f70bd5a-5d16-e05c-f0b4-2fdfc8873489@BlueTreble.com

The single-relation-scan case is possibly a bit easier to deal with
than what we were looking at there, in that it's clear that you
can use the rel's CTID to de-duplicate, and that that will give
the right answer.

            regards, tom lane