Re: Alternative to a temporary table

Поиск
Список
Период
Сортировка
От Steve Atkins
Тема Re: Alternative to a temporary table
Дата
Msg-id 20051003162010.GB4839@gp.word-to-the-wise.com
обсуждение исходный текст
Ответ на Alternative to a temporary table  (Steven Rosenstein <srosenst@us.ibm.com>)
Список pgsql-performance
On Mon, Oct 03, 2005 at 11:47:52AM -0400, Steven Rosenstein wrote:

> I currently create a temporary table to hold the selected server_id's and
> characteristics.  I then join this temp table with other data tables to
> produce my reports.  My reason for using the temporary table method is that
> the SQL for the initial server selection is dynamically created based on
> the user's selections, and is complex enough that it does not lend itself
> to being easily incorporated into any of the other data extraction queries
> (which may also contain dynamic filtering).
>
> Unfortunately, the PHP connection to the database does not survive from
> webscreen to webscreen, so I have to re-generate the temporary server_id
> table each time it is needed for a report screen.  An idea I had to make
> this process more efficient was instead of re-creating the temporary table
> over and over each time it is needed, do a one-time extraction of the list
> of user-selected server_id's, store the list in a PHP global variable, and
> then use the list in a dynamically-created WHERE clause in the rest of the
> queries.  The resulting query would look something like
>
>      SELECT *
>      FROM some_data_table
>      WHERE server_id IN (sid1,sid5,sid6,sid17,sid24...)
>
> Simple enough, however in rare cases the list of server_id's can range
> between 6,000 and 10,000.
>
> My question to the group is, can having so many values in a WHERE/IN clause
> effect query performance?

Probably, yes. As always, benchmark a test case, but last time I
checked (in 7.4) you'd be better creating a new temporary table for
every query than use an IN list that long. It's a lot better in 8.0, I
believe, so you should benchmark it there.

> Am I being naive about this and is there a
> different, better way?  The server_id field is of course indexed, but it is
> possible that the list of selected sid's can contain almost all of the
> values in the some_data_table server_id index (in the situation where _all_
> of the records are requested I wouldn't use the WHERE clause in the query).
> The some_data_table can contain millions of records for thousands of
> servers, so every bit of efficiency helps.

Don't use a temporary table. Instead use a permanent table that
contains the server ids you need and the PHP session token. Then you
can create your list of server_ids once and insert it into that table
associated with your sessionid. Then future queries can be simple
joins against that table.

    SELECT some_data_table.*
    FROM some_data_table, session_table
    WHERE some_data_table.server_id = session_table.server_id
      AND session_table.session_id = 'foobar'

You'd need a reaper process to delete old data from that table to
prevent it from growing without limit, and probably a table associating
session start time with sessionid to make reaping easier.

Cheers,
  Steve

В списке pgsql-performance по дате отправления:

Предыдущее
От: Steven Rosenstein
Дата:
Сообщение: Alternative to a temporary table
Следующее
От: Dan Harris
Дата:
Сообщение: Re: Ultra-cheap NVRAM device