Re: Operator performance question

Поиск
Список
Период
Сортировка
От Brandon Aiken
Тема Re: Operator performance question
Дата
Msg-id F8E84F0F56445B4CB39E019EF67DACBA4020FF@exchsrvr.winemantech.com
обсуждение исходный текст
Ответ на Operator performance question  (Alban Hertroys <alban@magproductions.nl>)
Ответы Re: Operator performance question  (Alban Hertroys <alban@magproductions.nl>)
Список pgsql-general
Shouldn't these be using HAVING?

SELECT COUNT(max_persons) ...
GROUP BY NULL
HAVING max_persons >= 5 AND max_persons <= 8;

--
Brandon Aiken
CS/IT Systems Engineer

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alban Hertroys
Sent: Tuesday, January 09, 2007 11:07 AM
To: Postgres General
Subject: [GENERAL] Operator performance question

Hi all,

I need your help on a small performance problem.

I have a table of which I have to do a bunch of counts of various
conditions. The worst case scenario where I have to iterate over every
record in the table performs just a little bit too slow (800ms). That
particular query will be hit a lot (it will be on the index of our web
app).

PostgreSQL uses a sequential scan (it should IMO) - I think my
bottleneck is in the operators on the various columns.

My queries look like this:

SELECT COUNT(NULLIF(max_persons BETWEEN 5 AND 8, false)) AS "persons
5-8",
-- And other variations

COUNT(NULLIF(country_id = 74, false)) AS "LOCATION_NETHERLANDS",
-- Basically for every country in Europe

COUNT(NULLIF(specifications & '00000000000000000000000001000000',
0::bit(32))) AS "washing machine",
-- And a bunch more of these; the bit mask is almost fully covered

COUNT(*) AS all
FROM table;

The plan is:
                                                            QUERY PLAN

------------------------------------------------------------------------
-----------------------------------------------------------
 Aggregate  (cost=7371.23..7371.55 rows=1 width=18) (actual
time=803.374..803.376 rows=1 loops=1)
   ->  Seq Scan on fewo_property_location  (cost=0.00..828.84 rows=41538
width=18) (actual time=0.036..147.211 rows=41492 loops=1)
         Filter: ((location_id IS NOT NULL) AND (property_state_id = 3))
 Total runtime: 804.398 ms
(4 rows)

The table definition is like:
      Column       |   Type   |      Modifiers
-------------------+----------+----------------------
 property_id       | integer  | not null
 property_state_id | integer  | not null
 location_id       | integer  |
 min_persons       | smallint | not null
 max_persons       | smallint | not null
 specifications    | bit(32)  | default (0)::bit(32)
 country_id        | integer  |
Indexes:
    "fewo_property_location_pkey" PRIMARY KEY, btree (property_id)
    "fewo_property_location_country_idx" btree (country_id) WHERE
location_id IS NOT NULL
    "fewo_property_location_country_location_idx" btree (country_id,
location_id) CLUSTER
    "fewo_property_location_location_online_idx" btree (location_id)
WHERE location_id IS NOT NULL AND property_state_id = 3
    "fewo_property_location_property_location_idx" btree (property_id,
location_id) WHERE location_id IS NOT NULL AND property_state_id = 3
    "fewo_property_location_specifications_idx" btree (specifications)
Foreign-key constraints:
    "fewo_property_location_location_id_fkey" FOREIGN KEY (location_id)
REFERENCES fewo_location(location_id) MATCH FULL
    "fewo_property_location_property_state_id_fkey" FOREIGN KEY
(property_state_id) REFERENCES fewo_property_state(property_state_id)
MATCH FULL

My conclusion is that this query time is mostly limited to the somewhat
complex COUNT expressions. Is there any way to do this more efficiently?

For the record, if I constrain this query to specific countries it
performs in about 80ms (10x as fast).

The hardware is a dual Opteron64x2, 4G RAM and some kind of RAID setup
(software, don't know what type) running in a Xen host - it's our
development DB-server.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

В списке pgsql-general по дате отправления:

Предыдущее
От: "Jeanna Geier"
Дата:
Сообщение: Re: SELECT INTO using Views?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Operator performance question