Discussion: Cost based SELECT/UPDATE

Cost based SELECT/UPDATE

From:
Leonid Safronie
Date:
Hi, ppl

Is there any way to do SELECTs with different priorities?

Once a month I need to run some complex reports on a table with over 7
billion rows, which involves several nested SELECTs and grouping (the
query runs for over 20 minutes on a P4/2.4GHz). Concurrently, there are
over 50 processes updating tables in the same database, including the
table being SELECTed for the monthly report. The issue is that response
time for these 50 processes is very important, unlike for report
generation, and the slowdown these processes suffer while the report is
running is unacceptable for my production environment (response time
grows from 1-3 seconds to 1-2 minutes).

Is there any way to give different priorities to different
queries/transactions, as is done for VACUUMing (the vacuum_cost_*
options in the config file)?
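
For reference, the vacuum_cost_* settings I mean are the ones below; a rough
sketch of how they appear in postgresql.conf (the values shown are only an
illustration, not a recommendation):

    # postgresql.conf -- cost-based delay used by VACUUM
    vacuum_cost_delay      = 10    # ms to sleep once the cost limit is reached
    vacuum_cost_limit      = 200   # accumulated cost that triggers the sleep
    vacuum_cost_page_hit   = 1     # cost for a buffer found in shared buffers
    vacuum_cost_page_miss  = 10    # cost for a buffer read from disk
    vacuum_cost_page_dirty = 20    # cost for dirtying a previously clean buffer

I am looking for something equivalent that could be applied to an ordinary
SELECT.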

--
Leonid Safronie
DIAS-RIPE

Re: Cost based SELECT/UPDATE

From:
Richard Huxton
Date:
Leonid Safronie wrote:
> Hi, ppl
>
> Is there any way to do SELECTs with different priorities?

> The issue is that response time for these 50 processes is very important,
> unlike for report generation, and the slowdown these processes suffer while
> the report is running is unacceptable for my production environment
> (response time grows from 1-3 seconds to 1-2 minutes).

The most important question is why response time degrades so much. Does it
look like disk I/O is the problem?
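
A quick way to check would be something like the following while the report
is running (this assumes a Linux box with sysstat installed; adjust for your
OS):

    # per-device utilisation -- watch %util on the data disk vs. the pg_xlog disk
    iostat -x 1

    # system-wide view -- a large "wa" (I/O wait) column points at the disks
    vmstat 1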

--
   Richard Huxton
   Archonet Ltd

Re: Cost based SELECT/UPDATE

From:
Leonid Safronie
Date:
> >>>Is there any way to do SELECTs with different priorities?
> >>
> >>>The issue is that response time for these 50 processes is very important,
> >>>unlike for report generation, and the slowdown these processes suffer while
> >>>the report is running is unacceptable for my production environment
> >>>(response time grows from 1-3 seconds to 1-2 minutes).
> >>
> >>The most important question is why response time degrades so much. Does it
> >>look like disk I/O is the problem?
> >>
> >
> > Yes, I/O grows to as much as 300-700 tps (100% load) according to
> > systat -vmstat. I have 2 x 160GB HDDs, with data on one of them and
> > pg_xlog on the other.
>
> Hmm - with your pg_xlog on a separate disk, updates should be relatively
> unaffected by a large SELECT going through. Of these 50 other processes,
> are most going through fairly quickly (less than 10 seconds), with some
> taking longer and a few taking 2 minutes, or do they all take 1-2 minutes?
Results differ, but they are in the range of 30-120 seconds.
I'm now looking into whether some kind of RAID setup can improve the
situation. (The workaround I'm currently using is to COPY the data to
another server and run the SELECT there, but that doesn't work if the
report period includes, e.g., the current day.)
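
For what it's worth, that workaround is roughly the following (host and table
names here are placeholders, not my real ones):

    # stream the table from production to the reporting box in one pass
    psql -h prod-db -c "COPY big_table TO STDOUT" \
        | psql -h report-db -c "COPY big_table FROM STDIN"

    # then run the heavy report against the reporting box instead
    psql -h report-db -f monthly_report.sql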

--
Leonid Safronie
DIAS-RIPE

Re: Cost based SELECT/UPDATE

From:
"Jim C. Nasby"
Date:
On Thu, Sep 08, 2005 at 06:45:33AM +0400, Leonid Safronie wrote:
> Hi, ppl
>
> Is there any way to do SELECTs with different priorities?
>
> Once a month I need to run some complex reports on a table with over 7
> billion rows, which involves several nested SELECTs and grouping (the
> query runs for over 20 minutes on a P4/2.4GHz). Concurrently, there are
> over 50 processes updating tables in the same database, including the
> table being SELECTed for the monthly report. The issue is that response
> time for these 50 processes is very important, unlike for report
> generation, and the slowdown these processes suffer while the report is
> running is unacceptable for my production environment (response time
> grows from 1-3 seconds to 1-2 minutes).
>
> Is there any way to give different priorities to different
> queries/transactions, as is done for VACUUMing (the vacuum_cost_*
> options in the config file)?

You can try running the SELECT from a process that's niced via the OS;
some OSes will take nice into account when scheduling I/O. But there is
currently no mechanism to provide this capability from within
PostgreSQL.

There should probably be a TODO for this, since it's something that's
asked about fairly often.

* Provide a means for individual queries to be run at a lower priority

  While nice allows this for CPU-bound queries, it generally doesn't
  work for I/O-bound queries.
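
As a rough sketch of the nice approach (database and file names below are
made up; note that nicing the psql client does not renice the server backend
that actually does the work, so you'd renice that process separately):

    # run the report client at the lowest CPU priority
    nice -n 19 psql -d proddb -f monthly_report.sql

    # renice the backend serving that session (PID from pg_stat_activity or
    # the server log); this affects CPU only, it does not throttle disk I/O
    renice +19 -p <backend_pid>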
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Cost based SELECT/UPDATE

From:
Jan Wieck
Date:
On 9/7/2005 10:45 PM, Leonid Safronie wrote:

> Hi, ppl
>
> Is there any way to do SELECTs with different priorities?
>
> Once a month I need to run some complex reports on a table with over 7
> billion rows, which involves several nested SELECTs and grouping (the
> query runs for over 20 minutes on a P4/2.4GHz). Concurrently, there are
> over 50 processes updating tables in the same database, including the
> table being SELECTed for the monthly report. The issue is that response
> time for these 50 processes is very important, unlike for report
> generation, and the slowdown these processes suffer while the report is
> running is unacceptable for my production environment (response time
> grows from 1-3 seconds to 1-2 minutes).
>
> Is there any way to give different priorities to different
> queries/transactions, as is done for VACUUMing (the vacuum_cost_*
> options in the config file)?
>

Fiddling with Postgres backend priorities (nice) has been beaten to death
before, and has always been rejected for good reasons.

Which Postgres version is this? Everything pre-8.0 will suffer from
complete cache eviction on sequential scans of huge tables.

Have you thought about replicating the database to a "reporting slave"?


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #