Join with an array
От | Markus Schiltknecht |
---|---|
Тема | Join with an array |
Дата | |
Msg-id | 1140694595.11865.8.camel@fotomarburg обсуждение исходный текст |
Ответы |
Re: Join with an array
Re: Join with an array Re: Join with an array |
Список | pgsql-hackers |
Hi, I'm trying to speed up a query with a lookup table. This lookup table gets very big and should still fit into memory. It does not change very often. Given these facts I decided to use an array, as follows: CREATE TABLE lookup_table (id INT PRIMARY KEY, items INT[] NOT NULL); I know this is not considered good database design, but it saves a lot of overhead for tuple visibility compared to a 1:1 table. To fetch an item via the lookup_table I tried to use the following query: SELECT i.id, i.title FROM item iJOIN lookup_table lut ON i.id = ANY(lut.items)WHERE lut.id = $LOOKUP_ID; Unfortunately that one seems to always use a sequential scan over items. As the items array in the lookup table often has only 3 - 10 entries (compared to about 1 mio rows in the item table) this is a very expensive operation. I tried to circumvent the problem with generate_series: SELECT i.id, i.title FROM generate_series(0, $MAX) sJOIN lookup_table lut ON s = ANY(lut.items)JOIN item i ON s = i.idWHERElut.id = $LOOKUP_ID; That query uses the index to lookup the item, but as soon as $MAX gets bigger than 10000 generate_series takes too long and too many comparisons s = ANY(lut.items) need to be done. I think it would be possible to write a function generate_series(INT[]) which returns all the elements of the array. The query would then look something like: SELECT i.id, i.titleFROM generate_series(SELECT lut.items FROM lookup_table lut WHERE lut.id = $LOOKUP_ID) sJOIN item i ON s = i.id; Do you see any problem in implementing such function? Does something similar already existt? Why does the first query use a seqscan instead of the index on items? Do I miss anything? What problems do I face if I want to teach the planner to use the index in the first query [1]? Regards Markus [1]: generally in most cases like "JOIN .. ON x IN ANY($ARRAY)" where $ARRAY is reasonably small.
В списке pgsql-hackers по дате отправления: