Tim Dudgeon <tdudgeon.ml@gmail.com> writes:
> I'm considering using arrays to handle managing "lists" of rows (I know
> this may not be the best approach, but bear with me).
> I create a table for my lists like this:**
> create table lists (
> id SERIAL PRIMARY KEY,
> hits INTEGER[] NOT NULL
> );
> Then I can insert the results of a query into that table as a new list
> of hits
> INSERT INTO lists (hits)
> SELECT array_agg(id)
> FROM some_table
> WHERE ...;
> Now the problem part. How to best use that array of primary key values
> to restore the data at a later stage. Conceptually I'm wanting this:
> SELECT * from some_table
> WHERE id <is in the list of ids in the array in the lists table>;
> These both work by are really slow:
> SELECT t1.*
> FROM some_table t1
> WHERE t1.id IN (SELECT unnest(hits) from lists WHERE id = 2);
> SELECT t1.*
> FROM some_table t1
> JOIN lists l ON t1.id = any(l.hits)
> WHERE l.id = 2;
> Is there an efficient way to do this, or is this a dead end?
You could create a GIN index on lists.hits and then do
SELECT t1.*
FROM some_table t1
JOIN lists l ON array[t1.id] <@ l.hits
WHERE l.id = 2;
How efficient that will be remains to be determined though;
if the l.id condition will eliminate a lot of matches it
could still be kind of slow.
(ISTR some talk of teaching the planner to convert =ANY(array)
conditions to this form automatically when there's a suitable
index, but for now you'd have to write it out like this.)
regards, tom lane