Disjunctions and sequential scans

Поиск
Список
Период
Сортировка
От Ronuk Raval
Тема Disjunctions and sequential scans
Дата
Msg-id CAPhHnhpc6bdGbRBa9hG7FQiKByVqR3s37VoY64DSMUxjeJGOjQ@mail.gmail.com
обсуждение исходный текст
Список pgsql-performance
Hi there,

This is likely me not understanding something, but I have a query that
I would expect to be fast but PG insists on using a sequential scan.
I've attached a minimized test case but I'll walk through the steps as
well.

I'm running PostgreSQL 10.1 using the standard ArchLinux packages, but
I've been able to reproduce this issue on our production systems
running 9.5 as well.

I have the following 2 tables in a standard users/addresses
configuration with an extra index on addresses to make lookups on the
referring side faster:

    CREATE TABLE users (
        id integer PRIMARY KEY
    );

    CREATE TABLE addresses (
        id integer PRIMARY KEY,
        user_id integer REFERENCES users(id)
    );

    CREATE INDEX ix_addresses_user_id ON addresses (user_id);

Also, I turn off sequential scanning to force the database to consider
any other plan first:

    SET enable_seqscan TO OFF;

Then, I would expect the following query to have a query plan without
any sequential scans:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT addresses.id
    FROM addresses
    WHERE (
        addresses.id = 1 OR
        EXISTS (
            SELECT 1 FROM users
            WHERE (
                users.id = addresses.user_id AND
                users.id = 1
            )
        )
    );

    -[ RECORD 1

]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Seq Scan on addresses
(cost=10000000000.00..10000018508.10 rows=1130 width=4) (actual
time=0.001..0.001 rows=0 loops=1)
    -[ RECORD 2

]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   Filter: ((id = 1) OR (alternatives: SubPlan 1 or
hashed SubPlan 2))
    -[ RECORD 3

]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   SubPlan 1
    -[ RECORD 4

]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |     ->  Result  (cost=0.15..8.17 rows=1 width=0)
(never executed)
    -[ RECORD 5

]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |           One-Time Filter: (addresses.user_id = 1)
    -[ RECORD 6

]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |           ->  Index Only Scan using users_pkey on
users  (cost=0.15..8.17 rows=1 width=0) (never executed)
    -[ RECORD 7

]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |                 Index Cond: (id = 1)
    -[ RECORD 8

]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |                 Heap Fetches: 0
    -[ RECORD 9

]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   SubPlan 2
    -[ RECORD 10

]-----------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |     ->  Index Only Scan using users_pkey on users
users_1  (cost=0.15..8.17 rows=1 width=4) (never executed)
    -[ RECORD 11

]-----------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |           Index Cond: (id = 1)
    -[ RECORD 12

]-----------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |           Heap Fetches: 0
    -[ RECORD 13

]-----------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Planning time: 0.082 ms
    -[ RECORD 14

]-----------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Execution time: 0.032 ms

Given the `Seq Scan on addresses` above, the database clearly
disagrees. What am I missing here?

Strangely, breaking down the query to its components does as I expect.
This is the primary key lookup:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT addresses.id
    FROM addresses
    WHERE addresses.id = 1;

    -[ RECORD 1

]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Index Only Scan using addresses_pkey on addresses
(cost=0.15..8.17 rows=1 width=4) (actual time=0.008..0.008 rows=0
loops=1)
    -[ RECORD 2

]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   Index Cond: (id = 1)
    -[ RECORD 3

]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   Heap Fetches: 0
    -[ RECORD 4

]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   Buffers: shared hit=1
    -[ RECORD 5

]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Planning time: 0.206 ms
    -[ RECORD 6

]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Execution time: 0.031 ms

And this is the semi-join:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT addresses.id
    FROM addresses
    WHERE EXISTS (
        SELECT 1 FROM users
        WHERE (
            users.id = addresses.user_id AND
            users.id = 1
        )
    );

    -[ RECORD 1

]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Nested Loop  (cost=4.40..23.19 rows=11 width=4)
(actual time=0.007..0.007 rows=0 loops=1)
    -[ RECORD 2

]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   Buffers: shared hit=1
    -[ RECORD 3

]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   ->  Index Only Scan using users_pkey on users
(cost=0.15..8.17 rows=1 width=4) (actual time=0.007..0.007 rows=0
loops=1)
    -[ RECORD 4

]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |         Index Cond: (id = 1)
    -[ RECORD 5

]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |         Heap Fetches: 0
    -[ RECORD 6

]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |         Buffers: shared hit=1
    -[ RECORD 7

]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   ->  Bitmap Heap Scan on addresses
(cost=4.24..14.91 rows=11 width=8) (never executed)
    -[ RECORD 8

]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |         Recheck Cond: (user_id = 1)
    -[ RECORD 9

]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |         ->  Bitmap Index Scan on ix_addresses_user_id
 (cost=0.00..4.24 rows=11 width=0) (never executed)
    -[ RECORD 10

]--------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |               Index Cond: (user_id = 1)
    -[ RECORD 11

]--------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Planning time: 0.145 ms
    -[ RECORD 12

]--------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Execution time: 0.038 ms

I could break the `OR` into 2 separate queries with a `UNION` but that
seems like a rather strange contortion that I would expect the
database to handle for me.

Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: Robert Zenz
Дата:
Сообщение: Re: Slow queries after Windows startup
Следующее
От: Éric Fontaine
Дата:
Сообщение: RE: Slow queries after Windows startup