Re: difficulties with time based queries

От: PFC
Тема: Re: difficulties with time based queries
Дата: ,
Msg-id: op.usddg5sycigqcu@soyouz
(см: обсуждение, исходный текст)
Ответ на: Re: difficulties with time based queries  ("Rainer Mager")
Ответы: Re: difficulties with time based queries  (Nikolas Everett)
Список: pgsql-performance

Скрыть дерево обсуждения

difficulties with time based queries  ("Rainer Mager", )
 Re: difficulties with time based queries  (David Wilson, )
  Re: difficulties with time based queries  ("Rainer Mager", )
   Re: difficulties with time based queries  (Tom Lane, )
    Re: difficulties with time based queries  ("Rainer Mager", )
     Re: difficulties with time based queries  (Robert Haas, )
   Re: difficulties with time based queries  (Matthew Wakeling, )
 Re: difficulties with time based queries  (PFC, )
 Re: difficulties with time based queries  (Tom Lane, )
 Re: difficulties with time based queries  ("Rainer Mager", )
  Re: difficulties with time based queries  (PFC, )
   Re: difficulties with time based queries  (Nikolas Everett, )
 Re: difficulties with time based queries  ("Rainer Mager", )
  Re: difficulties with time based queries  (Tom Lane, )

> What can I do to prevent the index from getting bloated, or in whatever
> state it was in?
>
>
> What else can I do to further improve queries on this table? Someone
> suggested posting details of my conf file. Which settings are most
> likely to
> be useful for this?

    If you often do range queries on date, consider partitioning your table
by date (something like 1 partition per month).
    Of course, if you also often do range queries on something other than
date, and uncorrelated, forget it.

    If you make a lot of big aggregate queries, consider materialized views :

    Like "how many games player X won this week", etc

    - create "helper" tables which contain the query results
    - every night, recompute the results taking into account the most recent
data
    - don't recompute results based on old data that never changes

    This is only interesting if the aggregation reduces the data volume by
"an appreciable amount". For instance, if you run a supermarket with 1000
distinct products in stock and you sell 100.000 items a day, keeping a
cache of "count of product X sold each day" will reduce your data load by
about 100 on the query "count of product X sold this month".

    The two suggestion above are not mutually exclusive.

    You could try bizgres also. Or even MySQL !... MySQL's query engine is
slower than pg but the tables take much less space than Postgres, and it
can do index-only queries. So you can fit more in the cache. This is only
valid for MyISAM (InnoDB is a bloated hog). Of course, noone would want to
use MyISAM for the "safe" storage, but it's pretty good as a read-only
storage. You can even use the Archive format for even more compactness and
use of cache. Of course you'd have to devise a way to dump from pg and
load into MySQL but that's not hard. MySQL can be good if you target a
table with lots of small rows with a few ints, all of them in a
multicolumn index, so it doesn't need to hit the table itself.

    Note that one in his right mind would never run aggregate queries on a
live R/W MyISAM table since the long queries will block all writes and
blow up the reaction time. But for a read-only cache updated at night, or
replication slave, it's okay.


В списке pgsql-performance по дате сообщения:

От: Merlin Moncure
Дата:
Сообщение: Re: Nested query performance issue
От: Stephen Frost
Дата:
Сообщение: Re: INSERT times - same storage space but more fields -> much slower inserts