Pam Ozer
Discussion: Using Between
I have a query that
Select Distinct VehicleId
From Vehicle
Where VehicleMileage between 0 and 15000.
I have an index on VehicleMileage. Is there another way to put an index on a between? The index is not being picked up. It does get picked up when I run
Select Distinct VehicleId
From Vehicle
Where VehicleMileage = 15000.
I just want to make sure that there is not a special index I should be using.
Thanks
On 8/27/10 5:21 PM, Ozer, Pam wrote:
> I just want to make sure that there is not a special index I should be using.

You need to post EXPLAIN ANALYZE of your query. It could be that an index scan is actually not a good plan (for example, a sequential scan might be faster if most of your vehicles have low mileage). Without the EXPLAIN ANALYZE, there's no way to say what's going on.

Did you ANALYZE your database after you loaded the data?

Craig
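For reference, getting the plan is just a matter of prefixing the query; a minimal sketch using the table and column names from the original post:

    EXPLAIN ANALYZE
    SELECT DISTINCT VehicleId
    FROM Vehicle
    WHERE VehicleMileage BETWEEN 0 AND 15000;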
Yes, ANALYZE was run after we loaded the data. Thanks for your assistance.

Here is the full query:
select distinct VehicleUsed.VehicleUsedId as VehicleUsedId ,
VehicleUsed.VehicleUsedDisplayPriority as VehicleUsedDisplayPriority ,
VehicleUsed.HasVehicleUsedThumbnail as HasVehicleUsedThumbnail ,
VehicleUsed.HasVehicleUsedPrice as HasVehicleUsedPrice ,
VehicleUsed.VehicleUsedPrice as VehicleUsedPrice ,
VehicleUsed.HasVehicleUsedMileage as HasVehicleUsedMileage ,
VehicleUsed.VehicleUsedMileage as VehicleUsedMileage ,
VehicleUsed.IsCPO as IsCPO ,
VehicleUsed.IsMTCA as IsMTCA
from VehicleUsed
inner join PostalCodeRegionCountyCity
    on ( lower(VehicleUsed.PostalCode) = lower(PostalCodeRegionCountyCity.PostalCode) )
where
( VehicleUsed.VehicleUsedPriceRangeFloor between 0 and 15000)
and
( PostalCodeRegionCountyCity.RegionId = 26 )
order by VehicleUsed.VehicleUsedDisplayPriority ,
VehicleUsed.HasVehicleUsedThumbnail desc ,
VehicleUsed.HasVehicleUsedPrice desc ,
VehicleUsed.VehicleUsedPrice ,
VehicleUsed.HasVehicleUsedMileage desc ,
VehicleUsed.VehicleUsedMileage ,
VehicleUsed.IsCPO desc ,
VehicleUsed.IsMTCA desc
limit 500000
Here is the EXPLAIN ANALYZE:
Limit (cost=59732.41..60849.24 rows=44673 width=39) (actual time=1940.274..1944.312 rows=2363 loops=1)
  Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail, vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage, vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
  ->  Unique (cost=59732.41..60849.24 rows=44673 width=39) (actual time=1940.272..1943.011 rows=2363 loops=1)
        Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail, vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage, vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
        ->  Sort (cost=59732.41..59844.10 rows=44673 width=39) (actual time=1940.270..1941.101 rows=2363 loops=1)
              Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail, vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage, vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
              Sort Key: vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail, vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage, vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca, vehicleused.vehicleusedid
              Sort Method: quicksort Memory: 231kB
              ->  Hash Join (cost=289.85..55057.07 rows=44673 width=39) (actual time=3.799..1923.958 rows=2363 loops=1)
                    Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail, vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage, vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
                    Hash Cond: (lower((vehicleused.postalcode)::text) = lower((postalcoderegioncountycity.postalcode)::text))
                    ->  Seq Scan on vehicleused (cost=0.00..51807.63 rows=402058 width=45) (actual time=0.016..1454.616 rows=398495 loops=1)
                          Output: vehicleused.vehicleusedid, vehicleused.datasetid, vehicleused.vehicleusedproductid, vehicleused.sellernodeid, vehicleused.vehicleyear, vehicleused.vehiclemakeid, vehicleused.vehiclemodelid, vehicleused.vehiclesubmodelid, vehicleused.vehiclebodystyleid, vehicleused.vehicledoors, vehicleused.vehicleenginetypeid, vehicleused.vehicledrivetrainid, vehicleused.vehicleexteriorcolorid, vehicleused.hasvehicleusedthumbnail, vehicleused.postalcode, vehicleused.vehicleusedprice, vehicleused.vehicleusedmileage, vehicleused.buyerguid, vehicleused.vehicletransmissiontypeid, vehicleused.vehicleusedintid, vehicleused.vehicleuseddisplaypriority, vehicleused.vehicleusedsearchradius, vehicleused.vehiclejatoimagepath, vehicleused.vehiclebodystylegroupid, vehicleused.productid, vehicleused.productgroupid, vehicleused.vehiclevin, vehicleused.vehicleclassgroupid, vehicleused.vehiclegenericexteriorcolorid, vehicleused.highlight, vehicleused.buyerid, vehicleused.dealerid, vehicleused.hasvehicleusedprice, vehicleused.dealerstockid, vehicleused.datesold, vehicleused.hasthumbnailimagepath, vehicleused.vehicleinteriorcolorid, vehicleused.vehicleconditionid, vehicleused.vehicletitletypeid, vehicleused.warranty, vehicleused.thumbnailimagepath, vehicleused.fullsizeimagepath, vehicleused.description, vehicleused.inserteddate, vehicleused.feeddealerid, vehicleused.vehicleusedpricerangefloor, vehicleused.vehicleusedmileagerangefloor, vehicleused.hasvehicleusedmileage, vehicleused.VehicleUsedIntId.distinct_count, vehicleused.VehicleUsedPrice.average, vehicleused.VehicleUsedId.distinct_count, vehicleused.iscpo, vehicleused.ismtca, vehicleused.cpoprogramoemid, vehicleused.cpoprogram3rdpartyid
                          Filter: ((vehicleusedpricerangefloor >= 0) AND (vehicleusedpricerangefloor <= 15000))
                    ->  Hash (cost=283.32..283.32 rows=522 width=6) (actual time=1.084..1.084 rows=532 loops=1)
                          Output: postalcoderegioncountycity.postalcode
                          ->  Bitmap Heap Scan on postalcoderegioncountycity (cost=12.30..283.32 rows=522 width=6) (actual time=0.092..0.361 rows=532 loops=1)
                                Output: postalcoderegioncountycity.postalcode
                                Recheck Cond: (regionid = 26)
                                ->  Bitmap Index Scan on postalcoderegioncountycity_i05 (cost=0.00..12.17 rows=522 width=0) (actual time=0.082..0.082 rows=532 loops=1)
                                      Index Cond: (regionid = 26)
Total runtime: 1945.244 ms
-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Craig James
Sent: Friday, August 27, 2010 5:42 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
On Mon, Aug 30, 2010 at 12:51 PM, Ozer, Pam <pozer@automotive.com> wrote:
> Yes, ANALYZE was run after we loaded the data.

How many rows are in the vehicleused table in total?

Is your database small enough to fit in memory?

Do you have any non-default settings in postgresql.conf?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
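A quick way to answer that last question from psql is to query pg_settings; this sketch simply filters out everything still at its built-in default:

    SELECT name, setting, source
    FROM pg_settings
    WHERE source <> 'default';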
There are 850,000 records in vehicleused. And the database is too big to be kept in memory.
Here are our config settings:
listen_addresses = '*' # what IP address(es) to listen on;
# comma-separated list of addresses;
# defaults to 'localhost', '*' = all
# (change requires restart)
port = 5432 # (change requires restart)
max_connections = 100 # (change requires restart)
# (change requires restart)
bonjour_name = 'colapcnt1d' # defaults to the computer name
# (change requires restart)
shared_buffers = 500MB # min 128kB
effective_cache_size = 1000MB
log_destination = 'stderr' # Valid values are combinations of
logging_collector = on # Enable capturing of stderr and csvlog
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8' # locale for system error message
# strings
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
lc_numeric = 'en_US.UTF-8' # locale for number formatting
lc_time = 'en_US.UTF-8' # locale for time formatting
# default configuration for text search
default_text_search_config = 'pg_catalog.english'
max_connections = 100
temp_buffers = 100MB
work_mem = 100MB
maintenance_work_mem = 500MB
max_files_per_process = 10000
seq_page_cost = 1.0
random_page_cost = 1.1
cpu_tuple_cost = 0.1
cpu_index_tuple_cost = 0.05
cpu_operator_cost = 0.01
default_statistics_target = 1000
autovacuum_max_workers = 1
#log_min_messages = DEBUG1
#log_min_duration_statement = 1000
#log_statement = all
#log_temp_files = 128
#log_lock_waits = on
#log_line_prefix = '%m %u %d %h %p %i %c %l %s'
#log_duration = on
#debug_print_plan = on
-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Tuesday, September 21, 2010 12:35 PM
To: Ozer, Pam
Cc: Craig James; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
On Tue, Sep 21, 2010 at 4:04 PM, Ozer, Pam <pozer@automotive.com> wrote:
> There are 850,000 records in vehicleused. And the database is too big to be kept in memory.

Ah. So in other words, you are retrieving about half the rows in that table. For those kinds of queries, using the index tends to actually be slower, because (1) you read the index in addition to reading the table, which has CPU and I/O cost, and (2) instead of reading the table sequentially, you end up jumping around and reading it out of order, which tends to result in more disk seeks and defeats the OS prefetch logic.

The query planner is usually pretty smart about making good decisions about this kind of thing. As a debugging aid (but never in production), you can try disabling enable_seqscan and see what plan you get that way. If it's slower, well, then the query planner did the right thing. If it's faster, then probably you need to adjust seq_page_cost and random_page_cost a bit. But my guess is that it will be somewhere between a lot slower and only very slightly faster.

A whole different line of inquiry is to ask the more general question "how can I make this query faster?", but I'm not sure whether you're unhappy with how the query is running or just curious about why the index isn't being used.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
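A minimal sketch of that debugging step, using a trimmed-down version of the query from earlier in the thread (the extra output columns, ORDER BY, and LIMIT dropped for brevity):

    -- Debugging aid only; never leave this off in production.
    SET enable_seqscan = off;

    EXPLAIN ANALYZE
    SELECT DISTINCT VehicleUsed.VehicleUsedId
    FROM VehicleUsed
    INNER JOIN PostalCodeRegionCountyCity
            ON lower(VehicleUsed.PostalCode) = lower(PostalCodeRegionCountyCity.PostalCode)
    WHERE VehicleUsed.VehicleUsedPriceRangeFloor BETWEEN 0 AND 15000
      AND PostalCodeRegionCountyCity.RegionId = 26;

    RESET enable_seqscan;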
The question is how can we make it faster.

-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Wednesday, September 22, 2010 3:52 AM
To: Ozer, Pam
Cc: Craig James; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
On Wed, Sep 22, 2010 at 11:18 AM, Ozer, Pam <pozer@automotive.com> wrote:
> The question is how can we make it faster.

If there's just one region ID for any given postal code, you might try adding a column to vehicleused and storing the region IDs there. You could possibly populate that column using a trigger; probably it doesn't change unless the postalcode changes. Then you could index that column and query against it directly, rather than joining to PostalCodeRegionCountyCity.

Short of that, I don't see any obvious way to avoid reading most of the vehicleused table. There may or may not be an index that can speed that up slightly, and of course you can always throw hardware at the problem, but fundamentally reading half a million or more rows isn't going to be instantaneous.

Incidentally, it would probably simplify things to store postal codes in the same case throughout the system. If you can avoid the need to write lower(x) = lower(y) and just write x = y you may get better plans. I'm not sure that's the case in this particular example but it's something to think about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
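A sketch of that denormalization, assuming each postal code really does map to exactly one region; the function, trigger, and index names here are made up for illustration:

    ALTER TABLE vehicleused ADD COLUMN regionid integer;

    -- One-time backfill from the lookup table.
    UPDATE vehicleused v
       SET regionid = p.regionid
      FROM postalcoderegioncountycity p
     WHERE lower(v.postalcode) = lower(p.postalcode);

    -- Keep the column current. This fires on every insert/update for
    -- portability; PostgreSQL 9.0+ can narrow it with UPDATE OF postalcode.
    CREATE OR REPLACE FUNCTION vehicleused_set_regionid() RETURNS trigger AS $$
    BEGIN
        SELECT p.regionid INTO NEW.regionid
          FROM postalcoderegioncountycity p
         WHERE lower(p.postalcode) = lower(NEW.postalcode);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER vehicleused_regionid_trg
        BEFORE INSERT OR UPDATE ON vehicleused
        FOR EACH ROW EXECUTE PROCEDURE vehicleused_set_regionid();

    CREATE INDEX vehicleused_regionid_idx ON vehicleused (regionid);

The query can then filter on vehicleused.regionid = 26 directly and skip the join entirely. If normalizing the stored case of the postal codes is not practical, expression indexes on lower(postalcode) on both tables are another option worth testing.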
Thank you. I will take a look at those suggestions.

-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Wednesday, September 22, 2010 9:27 AM
To: Ozer, Pam
Cc: Craig James; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
Something else you might test is bumping the read-ahead value. Most Linux installs have this at 256; you might try bumping the value to ~8 MB and tuning from there. This may help you slightly with seq scan performance. As always: YMMV.

It's not going to magically fix low-performing I/O subsystems, and it won't help many applications of PG, but there are a few outlying instances where this change can help a little bit.

I am sure someone will step in and tell you it is a bad idea - AND they will probably have perfectly valid reasons for why it is, so you will need to consider the ramifications. If at all possible, test and tune to see.

..: Mark
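On Linux, the read-ahead Mark is describing can be inspected and changed per block device with blockdev; /dev/sda below is just a placeholder for whatever device holds the database:

    # Current read-ahead, in 512-byte sectors (the common default of 256 = 128 kB).
    blockdev --getra /dev/sda

    # Roughly 8 MB of read-ahead: 16384 sectors * 512 bytes.
    blockdev --setra 16384 /dev/sda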