Re: [PERFORM] Big IN() clauses etc : feature proposal
От | Dawid Kuroczko |
---|---|
Тема | Re: [PERFORM] Big IN() clauses etc : feature proposal |
Дата | |
Msg-id | 758d5e7f0605090943p1bbfdae2p4cb28ba128288316@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PERFORM] Big IN() clauses etc : feature proposal (PFC <lists@peufeu.com>) |
Ответы |
Re: [PERFORM] Big IN() clauses etc : feature proposal
|
Список | pgsql-hackers |
On 5/9/06, PFC <lists@peufeu.com> wrote: > > You might consider just selecting your primary key or a set of > > primary keys to involved relations in your search query. If you > > currently use "select *" this can make your result set very large. > > > > Copying all the result set to the temp. costs you additional IO > > that you propably dont need. > > It is a bit of a catch : I need this information, because the purpose of > the query is to retrieve these objects. I can first store the ids, then > retrieve the objects, but it's one more query. > > > Also you might try: > > SELECT * FROM somewhere JOIN result USING (id) > > Instead of: > > SELECT * FROM somewhere WHERE id IN (SELECT id FROM result) > > Yes you're right in this case ; however the query to retrieve the owners > needs to eliminate duplicates, which IN() does. Well, you can either SELECT * FROM somewhere JOIN (SELECT id FROM result GROUP BY id) AS a USING (id); or even, for large number of ids: CREATE TEMPORARY TABLE result_ids AS SELECT id FROM RESULT GROUP BY id; SELECT * FROM somewhere JOIN result_ids USING (id); > > On the other hand if your search query runs in 10ms it seems to be fast > > enough for you to run it multiple times. Theres propably no point in > > optimizing anything in such case. > > I don't think so : > - 10 ms is a mean time, sometimes it can take much more time, sometimes > it's faster. > - Repeating the query might yield different results if records were added > or deleted in the meantime. You may SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; though locking might bite you. :) > - Complex search queries have imprecise rowcount estimates ; hence the > joins that I would add to them will get suboptimal plans. > > Using a temp table is really the cleanest solution now ; but it's too > slow so I reverted to generating big IN() clauses in the application. A thought, haven't checked it though, but... You might want to use PL to store values, say PLperl, or even C, say: create or replace function perl_store(name text, val int) returns void as $$ my $name = shift; push @{$foo{$name}}, shift; return $$ LANGUAGE plperl; select perl_store('someids', id) from something group by id; (you may need to warp it inside count()) Then use it: create or replace function perl_retr(name text) returns setof int as $$ my $name = shift; return $foo{$name} $$ LANGUAGE plperl; select * from someother join perl_retr('someids') AS a(id) using (id); All is in the memory. Of course, you need to do some cleanup, test it, etc, etc, etc. :) Should work faster than a in-application solution :) Regards, Dawid
В списке pgsql-hackers по дате отправления: