Re: Alternative to a temporary table
От | Steve Atkins |
---|---|
Тема | Re: Alternative to a temporary table |
Дата | |
Msg-id | 20051003162010.GB4839@gp.word-to-the-wise.com обсуждение исходный текст |
Ответ на | Alternative to a temporary table (Steven Rosenstein <srosenst@us.ibm.com>) |
Список | pgsql-performance |
On Mon, Oct 03, 2005 at 11:47:52AM -0400, Steven Rosenstein wrote: > I currently create a temporary table to hold the selected server_id's and > characteristics. I then join this temp table with other data tables to > produce my reports. My reason for using the temporary table method is that > the SQL for the initial server selection is dynamically created based on > the user's selections, and is complex enough that it does not lend itself > to being easily incorporated into any of the other data extraction queries > (which may also contain dynamic filtering). > > Unfortunately, the PHP connection to the database does not survive from > webscreen to webscreen, so I have to re-generate the temporary server_id > table each time it is needed for a report screen. An idea I had to make > this process more efficient was instead of re-creating the temporary table > over and over each time it is needed, do a one-time extraction of the list > of user-selected server_id's, store the list in a PHP global variable, and > then use the list in a dynamically-created WHERE clause in the rest of the > queries. The resulting query would look something like > > SELECT * > FROM some_data_table > WHERE server_id IN (sid1,sid5,sid6,sid17,sid24...) > > Simple enough, however in rare cases the list of server_id's can range > between 6,000 and 10,000. > > My question to the group is, can having so many values in a WHERE/IN clause > effect query performance? Probably, yes. As always, benchmark a test case, but last time I checked (in 7.4) you'd be better creating a new temporary table for every query than use an IN list that long. It's a lot better in 8.0, I believe, so you should benchmark it there. > Am I being naive about this and is there a > different, better way? The server_id field is of course indexed, but it is > possible that the list of selected sid's can contain almost all of the > values in the some_data_table server_id index (in the situation where _all_ > of the records are requested I wouldn't use the WHERE clause in the query). > The some_data_table can contain millions of records for thousands of > servers, so every bit of efficiency helps. Don't use a temporary table. Instead use a permanent table that contains the server ids you need and the PHP session token. Then you can create your list of server_ids once and insert it into that table associated with your sessionid. Then future queries can be simple joins against that table. SELECT some_data_table.* FROM some_data_table, session_table WHERE some_data_table.server_id = session_table.server_id AND session_table.session_id = 'foobar' You'd need a reaper process to delete old data from that table to prevent it from growing without limit, and probably a table associating session start time with sessionid to make reaping easier. Cheers, Steve
В списке pgsql-performance по дате отправления: