Efficiently searching for the most recent rows where a columnmatches any result from a different query

Поиск

Список

Период

Сортировка

От
Тема	Efficiently searching for the most recent rows where a columnmatches any result from a different query
Дата	13 февраля 2018 г. 16:28:07
Msg-id	L5EKys7--3-0@keemail.me обсуждение исходный текст
Ответы	Re: Efficiently searching for the most recent rows where a columnmatches any result from a different query
Список	pgsql-performance

Дерево обсуждения

Hello,

I have the following schema:

CREATE TABLE users (

id BIGSERIAL PRIMARY KEY,

name TEXT NOT NULL UNIQUE

);

CREATE TABLE friends (

user_id BIGINT NOT NULL REFERENCES users,

friend_user_id BIGINT NOT NULL REFERENCES users,

UNIQUE (user_id, friend_user_id)

);

CREATE TABLE posts (

id BIGSERIAL PRIMARY KEY,

user_id BIGINT NOT NULL REFERENCES users,

content TEXT NOT NULL

);

CREATE INDEX posts_user_id_id_index ON posts(user_id, id);

Each user can unilaterally follow any number of friends. The posts table has a large number of rows and is rapidly growing.

My goal is to retrieve the 10 most recent posts of a user's friends. This query gives the correct result, but is inefficient:

SELECT posts.id, users.name, posts.content

FROM posts JOIN users ON posts.user_id = users.id

WHERE posts.user_id IN (SELECT friend_user_id FROM friends WHERE user_id = 1)

ORDER BY posts.id DESC LIMIT 10;

If the user's friends have recently posted, the query is still reasonably fast (https://explain.depesz.com/s/6ykR). But if the user's friends haven't recently posted or the user has no friends, it quickly deteriorates (https://explain.depesz.com/s/OnoG).

If I match only a single post author (e.g. WHERE posts.user_id = 5), Postgres uses the index posts_user_id_id_index. But if I use IN, the index doesn't appear to be used at all.

How can I get these results more efficiently?

I've uploaded the schema and the queries I've tried to dbfiddle at http://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=cf1489b7f6d53c3fe0b55ed7ccbad1f0. The output of "SELECT version()" is "PostgreSQL 9.6.5 on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit" for me.

Thank you in advance for any insights, pointers or suggestions you are able to give me.

Regards,

Milo

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Efficiently searching for the most recent rows where a columnmatches any result from a different query