How can this be optimized, if possible?

From: Net Virtual Mailing Lists
Subject: How can this be optimized, if possible?
Date:
Msg-id: 20050705044702.28859@mail.net-virtual.com
Replies: Re: How can this be optimized, if possible?  (Greg Stark <gsstark@mit.edu>)
List: pgsql-general
Hello,


My database has grown far faster than expected, and a query which used to
run acceptably now does not.  I'm trying to find a way to make it run
faster and scale better.  I'm very open to the idea that this does not
need to be done with a SQL query at all - right now I really just need
some conceptual/architectural help on this one.


So I have two tables:

         Table "category"
    Column    |       Type        | Modifiers
--------------+-------------------+-----------
 head_title   | character varying |
 cat_title    | character varying |
 subcat_title | character varying |
 category     | ltree             |



                           Table "test"
   Column   |  Type   |                      Modifiers
------------+---------+------------------------------------------------------
 id         | integer | not null default nextval('master.test_id_seq'::text)
 category   | ltree[] |


... there are other fields in the test table, but these are really the
only two relevant to this.


The query I want to run against these two tables is something like this:

SELECT
 count(*) AS count,
 category.category,
 nlevel(category.category) AS level,
 subpath(category.category, 0, nlevel(category.category)-1) AS parent,
 category.head_title,
 category.cat_title,
 category.subcat_title
FROM
 test,
 category
WHERE
 test.category <@ category.category
GROUP BY
 category.category, category.head_title, category.cat_title,
 category.subcat_title;


Many times the "WHERE" clause will contain additional search criteria on
the 'test' table.  What I am trying to get is a count of how many rows
from the test table fall into each category, limited by the search
criteria.
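
For instance, a filtered variant might look like this (illustrative only;
"somefield" is a made-up column standing in for whatever search criteria
actually apply):

```sql
-- Hypothetical example: the per-category counts restricted by an
-- extra predicate on "test" ("somefield" is not a real column here).
SELECT count(*) AS count,
       category.category
FROM test, category
WHERE test.category <@ category.category
  AND test.somefield = 'some value'   -- additional search criteria
GROUP BY category.category;
```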

This query is starting to take an enormous amount of time (30+ seconds)
and I really need the results in a couple of seconds, tops.  I can do a
"SELECT category FROM test" and it completes in about 0.5 seconds.  The
category table currently has only 225 rows, the test table approximately
30,000.  "SELECT count(category), category FROM test GROUP BY category"
is quite slow, and I thought of making a materialized view of it, but
then of course I don't see any way to make that result limited by my
search criteria.
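
The materialized-view idea above could be sketched roughly as a
pre-aggregated summary table, refreshed periodically (the table name is an
assumption); it also shows why the approach can't absorb per-query search
criteria - the counts are frozen at refresh time:

```sql
-- Sketch of the precomputed-counts idea ("category_counts" is a
-- hypothetical name).  The counts are fixed when this runs, so
-- per-query filters on "test" cannot be applied afterwards.
CREATE TABLE category_counts AS
SELECT category.category, count(*) AS cnt
FROM test, category
WHERE test.category <@ category.category
GROUP BY category.category;
```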

I am completely open to re-architecting this entirely; the performance of
this query is critical to my application - I really just am not sure
where to start.  It seems like everything I try is worse than what I
started with.

... It *almost* seems as if I need to build some sort of "search
engine-like" tool which performs all the queries against the database,
has tons of memory, and caches the category attributes for each record in
memory.  That sure seems like a lot of work, though - I sincerely hope
there is an easier way.....

Thanks for your help, as always!

- Greg

