Re: Efficiently searching for the most recent rows where a columnmatches any result from a different query

Поиск
Список
Период
Сортировка
От
Тема Re: Efficiently searching for the most recent rows where a columnmatches any result from a different query
Дата
Msg-id L5OKvxB--3-0@keemail.me
обсуждение исходный текст
Ответ на Efficiently searching for the most recent rows where a columnmatches any result from a different query  (<mkslaf@keemail.me>)
Ответы Re: Efficiently searching for the most recent rows where a columnmatches any result from a different query  (Nandakumar M <m.nanda92@gmail.com>)
Список pgsql-performance

Hello Hellmuth,

Thank you for your response.

I've uploaded the query plan for the first query (user_id=2) here: https://gist.github.com/anonymous/6d251b277ef71f8977b03cab91fedccd
The query plan for the second query (user_id=1) can be found here: https://gist.github.com/anonymous/32ed485b40cce2651ddc52661f3e7f7b

Just like in the original queries, posts_user_id_id_index is not used.

Kind regards,
Milo

13. Feb 2018 22:13 by hivs77@gmail.com:

Hello:


EXPLAIN (ANALYZE, BUFFERS)
select * from (
SELECT posts.id, users.name, posts.content
FROM posts JOIN users ON posts.user_id = users.id
WHERE posts.user_id IN (SELECT friend_user_id FROM friends WHERE user_id = 1)

ORDER BY posts.id DESC 
) as a
ORDER BY a.id DESC 
LIMIT 10;

------


EXPLAIN (ANALYZE, BUFFERS)
select * from (
SELECT posts.id, users.name, posts.content
FROM posts JOIN users ON posts.user_id = users.id
WHERE posts.user_id IN (SELECT friend_user_id FROM friends WHERE user_id = 2)

ORDER BY posts.id DESC 
) as a
ORDER BY a.id DESC 
LIMIT 10;

2018-02-13 8:28 GMT-05:00 <mkslaf@keemail.me>:
Hello,

I have the following schema:

    CREATE TABLE users (
        id   BIGSERIAL PRIMARY KEY,
        name TEXT      NOT NULL UNIQUE
    );
    
    CREATE TABLE friends (
        user_id        BIGINT NOT NULL REFERENCES users,
        friend_user_id BIGINT NOT NULL REFERENCES users,
        UNIQUE (user_id, friend_user_id)
    );
    
    CREATE TABLE posts (
        id      BIGSERIAL PRIMARY KEY,
        user_id BIGINT    NOT NULL REFERENCES users,
        content TEXT      NOT NULL
    );
    CREATE INDEX posts_user_id_id_index ON posts(user_id, id);

Each user can unilaterally follow any number of friends. The posts table has a large number of rows and is rapidly growing.

My goal is to retrieve the 10 most recent posts of a user's friends. This query gives the correct result, but is inefficient:

    SELECT posts.id, users.name, posts.content
    FROM posts JOIN users ON posts.user_id = users.id
    WHERE posts.user_id IN (SELECT friend_user_id FROM friends WHERE user_id = 1)
    ORDER BY posts.id DESC LIMIT 10;

If the user's friends have recently posted, the query is still reasonably fast (https://explain.depesz.com/s/6ykR). But if the user's friends haven't recently posted or the user has no friends, it quickly deteriorates (https://explain.depesz.com/s/OnoG).

If I match only a single post author (e.g. WHERE posts.user_id = 5), Postgres uses the index posts_user_id_id_index. But if I use IN, the index doesn't appear to be used at all.

How can I get these results more efficiently?

I've uploaded the schema and the queries I've tried to dbfiddle at http://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=cf1489b7f6d53c3fe0b55ed7ccbad1f0. The output of "SELECT version()" is "PostgreSQL 9.6.5 on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit" for me.

Thank you in advance for any insights, pointers or suggestions you are able to give me.

Regards,
Milo



--
Cordialmente,

Ing. Hellmuth I. Vargas S.
Esp. Telemática y Negocios por Internet 
Oracle Database 10g Administrator Certified Associate
EnterpriseDB Certified PostgreSQL 9.3 Associate

В списке pgsql-performance по дате отправления:

Предыдущее
От: Mark Kirkwood
Дата:
Сообщение: Re: OT: Performance of VM
Следующее
От: Mariel Cherkassky
Дата:
Сообщение: pgpool 2 rotate logs