Discussion: sub select performance due to seq scans


sub select performance due to seq scans

From:
H Hale
Date:
I am testing a query that has a sub-select. The query performance is very poor, as shown below, due to the use of sequential scans.  The actual row count of both tables is also shown; the row count shown by explain analyze does not match the actual count. Columns dstobj, srcobj & objectid are all indexed, yet postgres insists on using seq scans. Vacuum analyze makes no difference. I am using 8.1.3 on Linux.

This is a very simple query with a relatively small amount of data, and the query is taking 101482 ms.  Queries with sub-selects on each table individually are very fast (8 ms).

How do I prevent the use of seq scans? 





capsa=# explain analyze select name from capsa.flatomfilesysentry where objectid in ( select dstobj from capsa.flatommemberrelation where srcobj = 'c1c7304a-1fe1-11db-8af7-001143214409');


                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Nested Loop IN Join  (cost=0.00..1386.45 rows=5809 width=14) (actual time=2.933..101467.463 rows=5841 loops=1)
   Join Filter: ("outer".objectid = "inner".dstobj)
   ->  Seq Scan on flatomfilesysentry  (cost=0.00..368.09 rows=5809 width=30) (actual time=0.007..23.451 rows=5844 loops=1)
   ->  Seq Scan on flatommemberrelation  (cost=0.00..439.05 rows=5842 width=16) (actual time=0.007..11.790 rows=2922 loops=5844)
         Filter: (srcobj = 'c1c7304a-1fe1-11db-8af7-001143214409'::capsa_sys.uuid)
 Total runtime: 101482.256 ms
(6 rows)

capsa=# select count(*) from capsa.flatommemberrelation ;
 count
-------
 11932
(1 row)

capsa=# select count(*) from capsa.flatomfilesysentry ;
 count
-------
  5977




Re: sub select performance due to seq scans

From:
Richard Huxton
Date:
H Hale wrote:
> I am testing a query that has a sub-select. The query performance is very poor, as shown below, due to the
> use of sequential scans.  The actual row count of both tables is also shown; the row count shown by explain
> analyze does not match the actual count. Columns dstobj, srcobj & objectid are all indexed, yet postgres insists on
> using seq scans. Vacuum analyze makes no difference. I am using 8.1.3 on Linux.
>
>  This is a very simple query with a relatively small amount of data, and the query is taking 101482 ms.  Queries with
> sub-selects on each table individually are very fast (8 ms).
>
>  How do I prevent the use of seq scans?

Hmm - something strange here.

>  capsa=# explain analyze select name from capsa.flatomfilesysentry where objectid in ( select dstobj from
>  capsa.flatommemberrelation where srcobj = 'c1c7304a-1fe1-11db-8af7-001143214409');
>
>
>                                                             QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------
>   Nested Loop IN Join  (cost=0.00..1386.45 rows=5809 width=14) (actual time=2.933..101467.463 rows=5841 loops=1)
>     Join Filter: ("outer".objectid = "inner".dstobj)
>     ->  Seq Scan on flatomfilesysentry  (cost=0.00..368.09 rows=5809 width=30) (actual time=0.007..23.451 rows=5844 loops=1)
>     ->  Seq Scan on flatommemberrelation  (cost=0.00..439.05 rows=5842 width=16) (actual time=0.007..11.790 rows=2922 loops=5844)
>           Filter: (srcobj = 'c1c7304a-1fe1-11db-8af7-001143214409'::capsa_sys.uuid)
>   Total runtime: 101482.256 ms

Look at that second seq-scan (on flatommemberrelation) - it's looping
5844 times (once for each row in flatomfilesysentry). I'd expect PG to
materialise the seq-scan once and then join (unless I'm missing
something, the subselect just involves the one test against a constant).

I'm guessing something in your configuration is pushing your cost
estimates far away from reality. Could you try issuing "set
enable_seqscan=off" and then running explain analyze again? That will
show us the alternatives.

Also, what performance-related configuration values have you changed?
Could you post them with a brief description of your hardware?
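
Something like the following, as a minimal sketch (session-local, so nothing
else on the server is affected; the query is the one from your original post):

        set enable_seqscan = off;
        explain analyze
            select name
            from capsa.flatomfilesysentry
            where objectid in (select dstobj
                               from capsa.flatommemberrelation
                               where srcobj = 'c1c7304a-1fe1-11db-8af7-001143214409');
        reset enable_seqscan;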

--
   Richard Huxton
   Archonet Ltd

Re: sub select performance due to seq scans

From:
Rod Taylor
Date:
> capsa=# explain analyze select name from capsa.flatomfilesysentry
> where objectid in ( select dstobj from capsa.flatommemberrelation
> where srcobj = 'c1c7304a-1fe1-11db-8af7-001143214409');
>
>
>                                                            QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop IN Join  (cost=0.00..1386.45 rows=5809 width=14) (actual
> time=2.933..101467.463 rows=5841 loops=1)
>    Join Filter: ("outer".objectid = "inner".dstobj)
>    ->  Seq Scan on flatomfilesysentry  (cost=0.00..368.09 rows=5809
> width=30) (actual time=0.007..23.451 rows=5844 loops=1)
>    ->  Seq Scan on flatommemberrelation  (cost=0.00..439.05 rows=5842
> width=16) (actual time=0.007..11.790 rows=2922 loops=5844)

A loop for an IN indicates that you are using a very old version of
PostgreSQL (7.2 or earlier).  Please double check that the server is
8.1.3 as you indicated and not just the client.

From psql:
        select version();

Hmm... Perhaps it is an 8.1.3 server with mergejoin and hashjoin
disabled?
        show enable_mergejoin;
        show enable_hashjoin;

You can try this query syntax:

        select name from capsa.flatomfilesysentry join
        capsa.flatommemberrelation on (objectid = dstobj) where srcobj =
        'c1c7304a-1fe1-11db-8af7-001143214409';
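
One caveat with the plain-join form: if flatommemberrelation can ever hold
the same dstobj more than once for that srcobj, the join will return
duplicate rows, which the IN form would not.  An EXISTS sub-select keeps the
IN semantics and gives the planner another option to consider; a sketch
(the f/m aliases are just for readability):

        select f.name
        from capsa.flatomfilesysentry f
        where exists (select 1
                      from capsa.flatommemberrelation m
                      where m.srcobj = 'c1c7304a-1fe1-11db-8af7-001143214409'
                        and m.dstobj = f.objectid);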


>          Filter: (srcobj =
> 'c1c7304a-1fe1-11db-8af7-001143214409'::capsa_sys.uuid)
>  Total runtime: 101482.256 ms
> (6 rows)
>
> capsa=# select count(*) from capsa.flatommemberrelation ;
>  count
> -------
>  11932
> (1 row)
>
> capsa=# select count(*) from capsa.flatomfilesysentry ;
>  count
> -------
>   5977
>
>
>
>


Re: sub select performance due to seq scans

From:
Tom Lane
Date:
Rod Taylor <rbt@rbt.ca> writes:
>> Nested Loop IN Join  (cost=0.00..1386.45 rows=5809 width=14) (actual
>> time=2.933..101467.463 rows=5841 loops=1)
>> Join Filter: ("outer".objectid = "inner".dstobj)
>> ->  Seq Scan on flatomfilesysentry  (cost=0.00..368.09 rows=5809
>> width=30) (actual time=0.007..23.451 rows=5844 loops=1)
>> ->  Seq Scan on flatommemberrelation  (cost=0.00..439.05 rows=5842
>> width=16) (actual time=0.007..11.790 rows=2922 loops=5844)

> A loop for an IN indicates that you are using a very old version of
> PostgreSQL (7.2 or earlier).

No, it's not that, because 7.2 certainly had no idea of "IN Join"s.
But there's something mighty fishy about this plan anyway.  The
planner was predicting 5809 rows out from flatomfilesysentry (not
too far off), so why didn't it predict something north of
368.09 + 5809 * 439.05 as the total join cost?  There's a special case
in cost_nestloop for IN joins, but it sure shouldn't have reduced the
estimate by a factor of 1800+ ...
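
For reference, working that arithmetic out: 368.09 + 5809 * 439.05 comes to
roughly 2,550,810, against a reported total cost of 1386.45, a ratio of
about 1,840.  Checked from psql, purely as an illustration:

        select 368.09 + 5809 * 439.05 as naive_nestloop_cost,
               round((368.09 + 5809 * 439.05) / 1386.45) as ratio_vs_reported;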

            regards, tom lane

Re: sub select performance due to seq scans

From:
H Hale
Date:


The hardware is a 3GHz P4 Xeon with 2GB memory and an 80GB SATA drive.
Kernel.SHMMAX=128MB

The following config changes have been made from the defaults...

shared_buffers = 8000            # min 16 or max_connections*2, 8KB each
max_fsm_pages = 50000            # min max_fsm_relations*16, 6 bytes each
vacuum_cost_delay = 10            # 0-1000 milliseconds
stats_start_collector = on
stats_row_level = on
autovacuum = on                # enable autovacuum subprocess?
autovacuum_naptime = 20        # time between autovacuum runs, in secs
autovacuum_vacuum_threshold = 500    # min # of tuple updates before vacuum
autovacuum_analyze_threshold = 250    # min # of tuple updates before analyze

Here is the query plan...

capsa=# set enable_seqscan=off;
SET
Time: 0.478 ms
capsa=# explain analyze select name from capsa.flatomfilesysentry where objectid in ( select dstobj from capsa.flatommemberrelation where srcobj = 'c1c7304a-1fe1-11db-8af7-001143214409');
                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=873.32..1017581.78 rows=6476 width=14) (actual time=80.402..241.881 rows=6473 loops=1)
   ->  Unique  (cost=871.32..903.68 rows=3229 width=16) (actual time=80.315..113.282 rows=6473 loops=1)
         ->  Sort  (cost=871.32..887.50 rows=6473 width=16) (actual time=80.310..94.279 rows=6473 loops=1)
               Sort Key: flatommemberrelation.dstobj
               ->  Bitmap Heap Scan on flatommemberrelation  (cost=56.66..461.57 rows=6473 width=16) (actual time=2.613..14.229 rows=6473 loops=1)
                     Recheck Cond: (srcobj = 'c1c7304a-1fe1-11db-8af7-001143214409'::capsa_sys.uuid)
                     ->  Bitmap Index Scan on capsa_flatommemberrelation_srcobj_idx  (cost=0.00..56.66 rows=6473 width=0) (actual time=2.344..2.344 rows=6473 loops=1)
                           Index Cond: (srcobj = 'c1c7304a-1fe1-11db-8af7-001143214409'::capsa_sys.uuid)
   ->  Bitmap Heap Scan on flatomfilesysentry  (cost=2.00..274.38 rows=3238 width=30) (actual time=0.011..0.013 rows=1 loops=6473)
         Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
         ->  Bitmap Index Scan on flatomfilesysentry_pkey  (cost=0.00..2.00 rows=3238 width=0) (actual time=0.007..0.007 rows=1 loops=6473)
               Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)
 Total runtime: 251.611 ms
(13 rows)

Time: 252.825 ms

I went back to the stock conf settings, did a vacuum full analyze, and still get the same results.

Background...

We have spikes of activity where both tables get rows inserted and have many updates. During this time performance drops.
I have been experimenting with autovacuum settings, as vacuuming was helping, although query performance
did not return to normal until after the activity spike.
In this case (and I am not sure why yet) vacuuming made no difference.






Re: sub select performance due to seq scans

From:
Tom Lane
Date:
H Hale <hhale21@rogers.com> writes:
>     ->  Bitmap Heap Scan on flatomfilesysentry  (cost=2.00..274.38 rows=3238 width=30) (actual time=0.011..0.013 rows=1 loops=6473)
>           Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
>           ->  Bitmap Index Scan on flatomfilesysentry_pkey  (cost=0.00..2.00 rows=3238 width=0) (actual time=0.007..0.007 rows=1 loops=6473)
>                 Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)

Well, there's our estimation failure: 3238 rows expected, one row
actual.

What is the data distribution of flatomfilesysentry.objectid?
It looks from this example like it is unique or nearly so,
but the planner evidently does not think that.
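
One way to check both the real distribution and what the planner believes
about it (just a sketch; pg_stats.n_distinct comes out as -1 when ANALYZE
decided the column is unique):

        select count(*), count(distinct objectid)
        from capsa.flatomfilesysentry;

        select n_distinct, null_frac
        from pg_stats
        where schemaname = 'capsa'
          and tablename  = 'flatomfilesysentry'
          and attname    = 'objectid';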

            regards, tom lane

Re: sub select performance due to seq scans

From:
H Hale
Date:
Tom,

It is unique.

Indexes:
    "flatomfilesysentry_pkey" PRIMARY KEY, btree (objectid)
    "capsa_flatomfilesysentry_name_idx" btree (name)
Foreign-key constraints:
    "objectid" FOREIGN KEY (objectid) REFERENCES capsa_sys.master(objectid) ON DELETE CASCADE


Tom Lane <tgl@sss.pgh.pa.us> wrote:
H Hale writes:
> -> Bitmap Heap Scan on flatomfilesysentry (cost=2.00..274.38 rows=3238 width=30) (actual time=0.011..0.013 rows=1 loops=6473)
> Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
> -> Bitmap Index Scan on flatomfilesysentry_pkey (cost=0.00..2.00 rows=3238 width=0) (actual time=0.007..0.007 rows=1 loops=6473)
> Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)

Well, there's our estimation failure: 3238 rows expected, one row
actual.

What is the data distribution of flatomfilesysentry.objectid?
It looks from this example like it is unique or nearly so,
but the planner evidently does not think that.

regards, tom lane


Re: sub select performance due to seq scans

From:
H Hale
Date:
Not sure if this helps solve the problem but...
(see below)

As new records are added, indexes are used for a while and then at some point postgres switches to seq scans. It is repeatable.

Any suggestions/comments to try and solve this are welcome.  Thanks

Data is as follows:
capsa.flatommemberrelation 1458 records
capsa.flatommemberrelation(srcobj)  3 distinct
capsa.flatommemberrelation(dstobj)  730 distinct
capsa.flatomfilesysentry 732 records
capsa.flatommemberrelation(objectid)  732 distinct

capsa=# set enable_seqscan=on;
SET
Time: 0.599 ms
capsa=# explain analyze select count(*) from capsa.flatomfilesysentry where objectid in (select dstobj from capsa.flatommemberrelation where srcobj='9e5943e0-219f-11db-8504-001143214409');
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=196.01..196.02 rows=1 width=0) (actual time=965.420..965.422 rows=1 loops=1)
   ->  Nested Loop IN Join  (cost=0.00..194.19 rows=728 width=0) (actual time=3.373..964.371 rows=729 loops=1)
         Join Filter: ("outer".objectid = "inner".dstobj)
         ->  Seq Scan on flatomfilesysentry  (cost=0.00..65.28 rows=728 width=16) (actual time=0.007..1.505 rows=732 loops=1)
         ->  Seq Scan on flatommemberrelation  (cost=0.00..55.12 rows=725 width=16) (actual time=0.004..0.848 rows=366 loops=732)
               Filter: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
 Total runtime: 965.492 ms
(7 rows)

Time: 966.806 ms

-----------------------------------------------------------------------------------------------------------
capsa=# set enable_seqscan=off;
SET
Time: 0.419 ms
capsa=# explain analyze select count(*) from capsa.flatomfilesysentry where objectid in (select dstobj from capsa.flatommemberrelation where srcobj='9e5943e0-219f-11db-8504-001143214409');
                                                                                QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=24847.73..24847.74 rows=1 width=0) (actual time=24.859..24.860 rows=1 loops=1)
   ->  Nested Loop  (cost=90.05..24845.91 rows=728 width=0) (actual time=2.946..23.640 rows=729 loops=1)
         ->  Unique  (cost=88.04..91.67 rows=363 width=16) (actual time=2.917..6.671 rows=729 loops=1)
               ->  Sort  (cost=88.04..89.86 rows=725 width=16) (actual time=2.914..3.998 rows=729 loops=1)
                     Sort Key: flatommemberrelation.dstobj
                     ->  Bitmap Heap Scan on flatommemberrelation  (cost=7.54..53.60 rows=725 width=16) (actual time=0.260..1.411 rows=729 loops=1)
                           Recheck Cond: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
                           ->  Bitmap Index Scan on capsa_flatommemberrelation_srcobj_idx  (cost=0.00..7.54 rows=725 width=0) (actual time=0.244..0.244 rows=729 loops=1)
                                 Index Cond: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
         ->  Bitmap Heap Scan on flatomfilesysentry  (cost=2.00..63.64 rows=364 width=16) (actual time=0.014..0.015 rows=1 loops=729)
               Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
               ->  Bitmap Index Scan on flatomfilesysentry_pkey  (cost=0.00..2.00 rows=364 width=0) (actual time=0.009..0.009 rows=1 loops=729)
                     Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)
 Total runtime: 25.101 ms
(14 rows)

Time: 26.878 ms








Re: sub select performance due to seq scans

From:
H Hale
Date:

Initial testing was with data that essentially looks like a single collection with many items.
I then changed this to have 60 collections of 50 items.
The result: much better (but not optimal) use of indexes, but a seq scan is still
used.

Turning seq scan off, all indexes were used.
The query was much faster (1.5 ms vs 300 ms).

I have tried to increase stats collection...

alter table capsa.flatommemberrelation alter column srcobj set statistics 1000;
alter table capsa.flatommemberrelation alter column dstobj set statistics 1000;
alter table capsa.flatommemberrelation alter column objectid set statistics 1000;
alter table capsa.flatomfilesysentry alter column objectid set statistics 1000;
vacuum full analyze;
Experimented with many postgres memory parameters.
No difference.

Is seq scan off the solution here?
My tests are with a relatively small number of records.
My concern here is what happens with 100,000's of records and seq scan off?
I will find out shortly...
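
One way to limit the scope, should it come to that, is to disable seq scans
only around this query instead of server-wide; a rough sketch (SET LOCAL
reverts automatically at commit or rollback):

        begin;
        set local enable_seqscan = off;
        select count(*)
        from capsa.flatomfilesysentry
        where objectid in (select dstobj
                           from capsa.flatommemberrelation
                           where srcobj = '5bdef74c-21d3-11db-9a20-001143214409');
        commit;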

Does anyone know of any known issues with the query planner?

Explain analyze results below.

capsa=# explain analyze select count(*) from capsa.flatomfilesysentry where objectid in (select dstobj from capsa.flatommemberrelation where srcobj='5bdef74c-21d3-11db-9a20-001143214409');
                                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=742380.16..742380.17 rows=1 width=0) (actual time=1520.269..1520.270 rows=1 loops=1)
   ->  Nested Loop  (cost=878.91..742355.41 rows=9899 width=0) (actual time=41.516..1520.076 rows=56 loops=1)
         Join Filter: ("inner".objectid = "outer".dstobj)
         ->  Unique  (cost=437.03..453.67 rows=3329 width=16) (actual time=0.241..0.624 rows=56 loops=1)
               ->  Sort  (cost=437.03..445.35 rows=3329 width=16) (actual time=0.237..0.346 rows=56 loops=1)
                     Sort Key: flatommemberrelation.dstobj
                     ->  Bitmap Heap Scan on flatommemberrelation  (cost=30.65..242.26 rows=3329 width=16) (actual time=0.053..0.135 rows=56 loops=1)
                           Recheck Cond: (srcobj = '5bdef74c-21d3-11db-9a20-001143214409'::capsa_sys.uuid)
                           ->  Bitmap Index Scan on capsa_flatommemberrelation_srcobj_idx  (cost=0.00..30.65 rows=3329 width=0) (actual time=0.044..0.044 rows=56 loops=1)
                                 Index Cond: (srcobj = '5bdef74c-21d3-11db-9a20-001143214409'::capsa_sys.uuid)
         ->  Materialize  (cost=441.89..540.88 rows=9899 width=16) (actual time=0.011..14.918 rows=9899 loops=56)
               ->  Seq Scan on flatomfilesysentry  (cost=0.00..431.99 rows=9899 width=16) (actual time=0.005..19.601 rows=9899 loops=1)
 Total runtime: 1521.040 ms
(13 rows)

capsa=# explain analyze select count(*) from capsa.flatomfilesysentry where objectid in (select dstobj from capsa.flatommemberrelation where srcobj='5bdef74c-21d3-11db-9a20-001143214409');
                                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1486472.45..1486472.46 rows=1 width=0) (actual time=2.112..2.113 rows=1 loops=1)
   ->  Nested Loop  (cost=439.03..1486447.70 rows=9899 width=0) (actual time=0.307..2.019 rows=56 loops=1)
         ->  Unique  (cost=437.03..453.67 rows=3329 width=16) (actual time=0.236..0.482 rows=56 loops=1)
               ->  Sort  (cost=437.03..445.35 rows=3329 width=16) (actual time=0.233..0.306 rows=56 loops=1)
                     Sort Key: flatommemberrelation.dstobj
                     ->  Bitmap Heap Scan on flatommemberrelation  (cost=30.65..242.26 rows=3329 width=16) (actual time=0.047..0.132 rows=56 loops=1)
                           Recheck Cond: (srcobj = '5bdef74c-21d3-11db-9a20-001143214409'::capsa_sys.uuid)
                           ->  Bitmap Index Scan on capsa_flatommemberrelation_srcobj_idx  (cost=0.00..30.65 rows=3329 width=0) (actual time=0.038..0.038 rows=56 loops=1)
                                 Index Cond: (srcobj = '5bdef74c-21d3-11db-9a20-001143214409'::capsa_sys.uuid)
         ->  Bitmap Heap Scan on flatomfilesysentry  (cost=2.00..384.50 rows=4950 width=16) (actual time=0.019..0.020 rows=1 loops=56)
               Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
               ->  Bitmap Index Scan on flatomfilesysentry_pkey  (cost=0.00..2.00 rows=4950 width=0) (actual time=0.014..0.014 rows=1 loops=56)
                     Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)
 Total runtime: 2.258 ms
(14 rows)






Re: sub select performance due to seq scans

From:
Scott Marlowe
Date:
On Wed, 2006-08-02 at 07:17, H Hale wrote:
> Initial testing was with data that essentially looks like a single collection with many items.
> I then changed this to have 60 collections of 50 items.
> The result: much better (but not optimal) use of indexes, but a seq scan is still
> used.
>
> Turning seq scan off, all indexes were used.
> The query was much faster (1.5 ms vs 300 ms).
>
> I have tried to increase stats collection...
>
> alter table capsa.flatommemberrelation alter column srcobj set statistics 1000;
> alter table capsa.flatommemberrelation alter column dstobj set statistics 1000;
> alter table capsa.flatommemberrelation alter column objectid set statistics 1000;
> alter table capsa.flatomfilesysentry alter column objectid set statistics 1000;
> vacuum full analyze;
> Experimented with many postgres memory parameters.
> No difference.
>
> Is seq scan off the solution here?

It almost never is the right answer.

> My tests are with a relatively small number of records.
> My concern here is what happens with 100,000's
>  of records and seq scan off?

What you need to do is tune PostgreSQL to match your predicted usage
patterns.

Will most or all of your dataset always fit in RAM?  Then you can tune
random_page_cost down near 1.0; normally for large-memory / small-data-set
servers, 1.2 to 1.4 is about optimal.  There will still be times
when a seq scan is a win.  You can build a test data set of about the size
you expect to run in the future, take a handful of the queries
you'll be running, and use larger and smaller variants of those queries with
explain analyze to get an idea of roughly where random_page_cost should
be.  Make sure analyze has been run and that the statistics are fairly
accurate.

effective_cache_size should be set to some reasonable size based on the
steady state size of your machine's kernel cache + disk buffers,
preferably before you tune random_page_cost too much.
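
As a purely illustrative sketch of that experiment (the numbers below are
placeholders, not recommendations; effective_cache_size in 8.1 is counted in
8kB pages, so roughly 1.5GB of cache is about 196608 pages):

        set effective_cache_size = 196608;
        set random_page_cost = 1.4;
        explain analyze
            select count(*)
            from capsa.flatomfilesysentry
            where objectid in (select dstobj
                               from capsa.flatommemberrelation
                               where srcobj = '5bdef74c-21d3-11db-9a20-001143214409');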

There are other numbers you can tune as well (the cpu cost ones in
particular).  If you find yourself needing values of random_page_cost at
1.0 or below to get the planner to make the right choices, then you've
got issues.  Otherwise, if a number between 1.2 and 2.0 makes it work
right, you're likely set for a while.

Re: sub select performance due to seq scans

From:
Markus Schaber
Date:
Hi, Scott and Hale,

Scott Marlowe wrote:
> Make sure analyze has been run and that the statistics are fairly
> accurate.

It might also help to increase the statistics_target on the column in
question.

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org