Re: where in (select array)

From: Richard Huxton
Subject: Re: where in (select array)
Msg-id: 49253CFA.7080202@archonet.com
In reply to: where in (select array)  (Marcus Engene <mengpg2@engene.se>)
Responses: Re: where in (select array)  (Marcus Engene <mengpg2@engene.se>)
List: pgsql-general
Marcus Engene wrote:
> Hi List,
>
> I have the might_like table that contains products a user might like if
> he likes the present one (item).
>
> CREATE TABLE might_like
> (
> item                       INTEGER NOT NULL
> ,created_at                 TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
> ,child                      INTEGER NOT NULL
> )
> WITHOUT OIDS;
>
> CREATE INDEX might_like_x1 ON might_like(item);
>
> Since there are (will be) houndreds of thousands of items, and 20+ might
> like items, i thought it would be nice to reduce the set to 1/20th by
> using a vector.
>
> CREATE TABLE might_like_vector
> (
> item                       INTEGER NOT NULL
> ,created_at                 TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
> ,child_arr                  INTEGER[]
> )
> WITHOUT OIDS;

You haven't reduced the set at all, you've just turned part of it
sideways. You might gain something on your search, but I'm guessing
you've not tested it.

Hmm - the attached script generates 100,000 items and 10 liked ones for
each (well, for the first 99,990 it says you like the next 10 items).
They're all given different timestamps at day intervals, which means
you'll end up with six or seven matches for your sample query.
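
A quick way to check that estimate, as a sketch only: the temp table
ml_demo below is a hypothetical stand-in that repeats the script's
timestamp pattern for a single item. Because now() is fixed for the
whole transaction, the row that is exactly one week old ties with the
cutoff and the strict ">" leaves it out, so run like this the count
should come out as six.

```sql
BEGIN;

-- Hypothetical stand-in: item 125 with ten children, one per day back.
CREATE TEMP TABLE ml_demo AS
SELECT 125 AS item,
       now() - j * '1 day'::interval AS created_at,
       125 + j AS child
FROM generate_series(1, 10) AS j;

-- The row with j = 7 sits exactly on the one-week boundary,
-- so "created_at > cutoff" does not match it.
SELECT count(*) AS matches
FROM ml_demo
WHERE item = 125
  AND created_at > now() - '1 week'::interval;

ROLLBACK;
```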

> But then this don't work:
>
> select
>    ...
> from
>    item pic
> where
>    pic.objectid in (
>        select mlv.child_arr
>        from might_like_vector mlv
>        where mlv.item = 125 AND
>              mlv.created_at > now() - interval '1 week'
>    )
> limit 16
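
For what it's worth, the quoted query fails because IN compares
pic.objectid (an integer) with each row of the subquery, and each row
here is a whole integer[]. If you do stay with the array, one way to
write it is a join on "= ANY". This is a minimal sketch: the temp
tables and sample rows are stand-ins for yours, and objname is an
assumed column since the original select list is elided.

```sql
BEGIN;

-- Hypothetical stand-ins for the tables in the quoted message.
CREATE TEMP TABLE item (
    objectid integer PRIMARY KEY,
    objname  text NOT NULL          -- assumed column
);
CREATE TEMP TABLE might_like_vector (
    item       integer NOT NULL,
    created_at timestamp DEFAULT CURRENT_TIMESTAMP NOT NULL,
    child_arr  integer[]
);

INSERT INTO item
SELECT i, 'item number ' || i FROM generate_series(120, 140) AS i;

INSERT INTO might_like_vector (item, child_arr)
VALUES (125, ARRAY[126, 127, 128]);

-- "= ANY (array)" tests membership element by element.
SELECT pic.objectid, pic.objname
FROM item pic
     JOIN might_like_vector mlv ON pic.objectid = ANY (mlv.child_arr)
WHERE mlv.item = 125
  AND mlv.created_at > now() - interval '1 week'
ORDER BY pic.objectid
LIMIT 16;

ROLLBACK;
```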

Without messing around with arrays you get this query (which seems
readable enough to me):

SELECT
    objectid, objname
FROM
    items i
    JOIN might_like m ON (i.objectid = m.child)
WHERE
    m.created_at > (now() - '1 week'::interval)
    AND m.item = 125
ORDER BY
    objectid
LIMIT
    16
;

I'm getting times less than a millisecond for this - are you sure it's
worth fiddling with arrays?
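
If you want to measure it yourself, something along these lines would
do. It is only a sketch under stated assumptions: ml and ml_vector are
hypothetical temp tables holding a smaller copy of the test data, the
vector row keeps just the newest created_at, and it needs PostgreSQL
9.0 or later for array_agg(... ORDER BY ...) and unnest().

```sql
BEGIN;

-- Hypothetical, smaller copy of the test data (1,000 items).
CREATE TEMP TABLE ml AS
SELECT i AS item,
       now() - j * '1 day'::interval AS created_at,
       i + j AS child
FROM generate_series(1, 1000) AS i, generate_series(1, 10) AS j;

-- One row per item. Keeping only the newest created_at is an assumption:
-- per-child timestamps no longer fit once the children share a row.
CREATE TEMP TABLE ml_vector AS
SELECT item,
       max(created_at) AS created_at,
       array_agg(child ORDER BY child) AS child_arr
FROM ml
GROUP BY item;

CREATE INDEX ml_idx ON ml (item, created_at);
CREATE INDEX ml_vector_idx ON ml_vector (item);
ANALYZE ml;
ANALYZE ml_vector;

-- Row-per-child lookup.
EXPLAIN ANALYZE
SELECT child
FROM ml
WHERE item = 125
  AND created_at > now() - '1 week'::interval;

-- Array lookup, expanded back into rows with unnest().
EXPLAIN ANALYZE
SELECT unnest(child_arr) AS child
FROM ml_vector
WHERE item = 125
  AND created_at > now() - '1 week'::interval;

ROLLBACK;
```

Note that the two lookups don't return the same rows: with a single
timestamp per array the date filter applies to the whole vector, not to
each child.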

--
  Richard Huxton
  Archonet Ltd

BEGIN;

CREATE SCHEMA mightlike;

SET search_path = mightlike;

CREATE TABLE items (
    objectid  integer NOT NULL,
    objname   text NOT NULL
);

CREATE TABLE might_like (
    item       integer NOT NULL,
    created_at timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
    child      integer NOT NULL
);

-- 100,000 items, numbered 1..100000.
INSERT INTO items SELECT i, 'item number ' || i
FROM generate_series(1, 100000) i;

-- Each of the first 99,990 items "might like" the next 10 items;
-- created_at steps back one day per child.
INSERT INTO might_like SELECT i, (now() - j * '1 day'::interval), i+j
FROM generate_series(1, 99990) i, generate_series(1, 10) j;

-- Keys and indexes are added after the bulk load.
ALTER TABLE items ADD PRIMARY KEY (objectid);
ALTER TABLE might_like ADD PRIMARY KEY (item, child);
ALTER TABLE might_like ADD CONSTRAINT valid_child FOREIGN KEY (child) REFERENCES items;
CREATE INDEX might_like_idx1 ON might_like (item, created_at);

-- To see the plan and timings, uncomment the next line:
-- EXPLAIN ANALYSE
SELECT
    objectid, objname
FROM
    items i
    JOIN might_like m ON (i.objectid = m.child)
WHERE
    m.created_at > (now() - '1 week'::interval)
    AND m.item = 125
ORDER BY
    objectid
LIMIT
    16
;

ROLLBACK;
