Re: Poor performance on a simple join

From: Shaun Thomas
Subject: Re: Poor performance on a simple join
Date:
Msg-id: 4EB2A50F.8080009@peak6.com
In reply to: Re: Poor performance on a simple join (Scott Marlowe <scott.marlowe@gmail.com>)
List: pgsql-performance
On 11/02/2011 09:04 PM, Scott Marlowe wrote:

> Take a look here:
> http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views

Not sure materialized views are the approach I would take here. We
actually see a lot of these kinds of queries with giant result sets
here. If they actually need all 12k rows for every execution (not
likely, but possible) and 300ms is just too darn slow for that, there's
always client-side caching.

We have a couple queries that we need to keep cached at all times. Stock
quotes and positions, for example, have to be available in sub-ms time
thanks to the level of parallelism involved. One query in particular
effectively grabs the entire set of current positions and every
optimization in the book brings its execution time down to about two
seconds. We can't have thousands of clients executing that all the time,
so it gets shoved into a local memcached on each webserver.
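
As a rough illustration of that pattern, here's a minimal Python sketch
using psycopg2 and pymemcache; the DSN, cache key, and query are
hypothetical stand-ins, not our actual schema:

import json

import psycopg2
from pymemcache.client.base import Client

CACHE = Client(('localhost', 11211))  # local memcached on each webserver
TTL = 30                              # seconds; tune to staleness tolerance

def current_positions():
    """Run the expensive query once, then serve it from memcached."""
    cached = CACHE.get('current_positions')
    if cached is not None:
        return json.loads(cached)

    conn = psycopg2.connect('dbname=trading')  # hypothetical DSN
    try:
        cur = conn.cursor()
        # Stand-in for the ~2s positions query.
        cur.execute("SELECT account_id, symbol, qty FROM positions")
        rows = cur.fetchall()
    finally:
        conn.close()

    CACHE.set('current_positions', json.dumps(rows), expire=TTL)
    return rows

The TTL bounds staleness; in practice you'd also want something to stop
a thundering herd of clients all re-running the query at the moment the
key expires.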

But if he's getting back 12k rows even *after* specifying a contact ID,
a materialized view is still going to return 12k rows, and still has to
perform at least an index scan unless he creates an MV for each contact
ID (eww). This doesn't really look like fact-table territory either.

I think the real question is: Why isn't 300ms fast enough? Is it because
the client executes this repeatedly? If so, what changes often enough it
must fetch all 12k rows every single time? Would implementing a
timestamp and only grabbing newer rows work better? Is it because of
several connections each running it in parallel? Why not cache a local
copy and refresh it periodically? Do they actually need all 12k rows
every time? Maybe some LIMIT and OFFSET clauses are in order.
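
To make the timestamp idea concrete, here's a minimal sketch in the
same Python/psycopg2 vein, assuming a hypothetical contact_items table
with a modified_at column (neither is from the original post):

def fetch_changes(conn, contact_id, last_seen):
    # Fetch only the rows for this contact that changed since last_seen.
    cur = conn.cursor()
    cur.execute(
        """
        SELECT id, payload, modified_at
          FROM contact_items            -- hypothetical table
         WHERE contact_id = %s
           AND modified_at > %s         -- assumes a modified_at column
         ORDER BY modified_at
        """,
        (contact_id, last_seen),
    )
    rows = cur.fetchall()
    if rows:
        last_seen = rows[-1][2]  # advance the high-water mark
    return rows, last_seen

With an index on (contact_id, modified_at), each poll touches only the
handful of rows that actually changed instead of dragging back all 12k.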

There's very little a human can do with 12k results. An automated tool
shouldn't be grabbing them either, unless they're actually changing with
every execution. If they're not, the tool really wants items since the
last change, or it's doing too much work. If this were a report, 300ms
would be nothing compared to most reporting queries, which can run for
several minutes.

I think we're missing a lot of details here.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@peak6.com

______________________________________________

See http://www.peak6.com/email-disclaimer/ for terms and conditions related to this email
