Re: Slow query with backwards index scan

От: Nis Jørgensen
Тема: Re: Slow query with backwards index scan
Дата: ,
Msg-id: f8ddhg$3fh$1@sea.gmane.org
(см: обсуждение, исходный текст)
Ответ на: Slow query with backwards index scan  (Tilmann Singer)
Ответы: Re: Slow query with backwards index scan  (Tilmann Singer)
Список: pgsql-performance

Скрыть дерево обсуждения

Slow query with backwards index scan  (Tilmann Singer, )
 Re: Slow query with backwards index scan  (Nis Jørgensen, )
  Re: Slow query with backwards index scan  (Tilmann Singer, )
 Re: Slow query with backwards index scan  (, )
  Re: Slow query with backwards index scan  (Tilmann Singer, )
   Re: Slow query with backwards index scan  (Craig James, )
    Re: Slow query with backwards index scan  (Tilmann Singer, )
     Re: Slow query with backwards index scan  (Nis Jørgensen, )
      Re: Slow query with backwards index scan  (Tilmann Singer, )
       Re: Slow query with backwards index scan  (Nis Jørgensen, )
   Re: Slow query with backwards index scan  (Jeremy Harris, )
 Re: Slow query with backwards index scan  (, )

Tilmann Singer skrev:

> The query works fine for the common cases when matching rows are found
> early in the sorted large table, like this:
>
> testdb=# EXPLAIN ANALYZE
> SELECT * FROM large_table lt
> LEFT JOIN relationships r ON lt.user_id=r.contact_id
> WHERE r.user_id = 55555 OR lt.user_id = 55555
> ORDER BY lt.created_at DESC LIMIT 10;
>                                                                                     QUERY PLAN
                                     
> but for the following user_id there are 3M rows in the large table
> which are more recent then the 10th matching one. The query then does
> not perform so well:
>
>
> testdb=# EXPLAIN ANALYZE
> SELECT * FROM large_table lt
> LEFT JOIN relationships r ON lt.user_id=r.contact_id
> WHERE r.user_id = 12345 OR lt.user_id = 12345
> ORDER BY lt.created_at DESC LIMIT 10;
>                                                                                         QUERY PLAN
                                     
> When split it up into the two following queries it performs much
> better for that user_id. Since the results of the two could be
> combined into the desired result, I'm assuming it could also be done
> efficiently within one query, if only a better plan would be used.
>
>
> testdb=# EXPLAIN ANALYZE
> SELECT * FROM large_table lt
> WHERE lt.user_id = 12345
> ORDER BY lt.created_at DESC LIMIT 10;
>                                                                                    QUERY PLAN
                                     
> testdb=# EXPLAIN ANALYZE
> SELECT * FROM large_table lt
> WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345)
> ORDER BY created_at DESC LIMIT 10;
>                                                                                    QUERY PLAN
                                     
> I'm not very experienced reading query plans and don't know how to go
> about this from here - is it theoretically possible to have a query
> that performs well with the given data in both cases or is there a
> conceptual problem?

How does the "obvious" UNION query do - ie:

SELECT * FROM (
SELECT * FROM large_table lt
WHERE lt.user_id = 12345

UNION

SELECT * FROM large_table lt
WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345)
) q

ORDER BY created_at DESC LIMIT 10;

?

How about

SELECT * FROM large_table lt
WHERE lt.user_id = 12345 OR user_id IN (SELECT contact_id FROM
relationships WHERE user_id=12345)
ORDER BY created_at DESC LIMIT 10;

?

I am missing a unique constraint on (user_id, contact_id) - otherwise
the subselect is not equivalent to the join.

Probably you also should have foreign key constraints on
relationships.user_id and relationships.contact_id. These are unlikely
to affect performance though, in my experience.

It might be good to know whether contact_id = user_id is possible -
since this would rule out the possibility of a row satisfying both
branches of the union.

Nis


В списке pgsql-performance по дате сообщения:

От: "Steven Flatt"
Дата:
Сообщение: Vacuum looping?
От: Tilmann Singer
Дата:
Сообщение: Re: Slow query with backwards index scan