Re: PG 7.0 is 2.5 times slower running a big report

Поиск
Список
Период
Сортировка
От Bryan White
Тема Re: PG 7.0 is 2.5 times slower running a big report
Дата
Msg-id 007701bfc65e$53093100$2dd260d1@arcamax.com
обсуждение исходный текст
Ответ на PG 7.0 is 2.5 times slower running a big report  ("Bryan White" <bryan@arcamax.com>)
Ответы Re: PG 7.0 is 2.5 times slower running a big report  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
> Well, when you have 2.7 million records in a database, the code might be
as good
> as it can be.

I have recoverd the performance lost when I moved to Postgres 7.0 by
executing
SET enable_indexscan = OFF before creating my cursors and turning it back on
for the inner loop query.  It may even be faster then before so I am happy.

> So each run of the log takes 2.7 million queries.
>
> Is there no way to use a where clause to limit the scope of your queries?

No, the point is the resulting report covers the entire database.

> An explanation of the purpose of these queries would help me think about
what
> you are after.  Maybe it isn't a code problem, may it a business rules
problem.
> At the moment, I understand why you need to go through all the items in
the db
> on every day.  Can't you just go through the items that were updated on a
given
> day?  So, in general, what is the goal of this report?

The report summarizes a lot of different aspects of our database.  It breaks
down totals in many by a variety of variables.  For example orders are
broken down by source, date, status, and sku (most orders have only one line
item).  I could come up with quicker methods to do any one of the breakdowns
but to do them all it is more efficient to make a single pass over all the
data.

> I'm new to the cursor method (and to PostgreSQL, though I've done a lot of
work
> with MySQL), but it is interesting, so I'm certainly going to look into
it.

Part of the advantage of the cursor is in memory management.  Is a
non-cursor select the database builds and transfers the entire result set to
the front end application.  For large queries this can be quite a chunk of
memory.

With a sequential scan and sort of the database I think it is still
buffering the result set on the back end but one copy is better then two.
Keep in mind I am running this report on a backup database server so I don't
have to worry much about other processes being hurt by the load.

It seems that with index scans the cursors start producing data right away
(but the overall rate is slower).  With sequential scan and sort the report
gets no data for the first 30 minutes and then runs at about 4 times the
rate of the index scan.

> What difference do you want to capture in distinguishing the customer
table from
> the custlist table?

I am not certain I understand the question.  The custlist table contains
subscriptions to various services.  It is a very simple table containing a
listid and a custid.



В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: problem with NOTICE: _outNode: don't know how to print type
Следующее
От: "Bryan White"
Дата:
Сообщение: Re: PG 7.0 is 2.5 times slower running a big report