Обсуждение: Why primary key index are not using in joining?

Поиск
Список
Период
Сортировка

Why primary key index are not using in joining?

От
AI Rumman
Дата:


Please have a look at the following explain plan:


explain analyze
select *
from vtiger_crmentity
inner JOIN vtiger_users
         ON vtiger_users.id = vtiger_crmentity.smownerid
where vtiger_crmentity.deleted = 0  ;
                                                                         QUERY PLAN                                                                        
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=3665.17..40019.25 rows=640439 width=1603) (actual time=115.613..3288.436 rows=638081 loops=1)
   Hash Cond: ("outer".smownerid = "inner".id)
   ->  Bitmap Heap Scan on vtiger_crmentity  (cost=3646.54..30394.02 rows=640439 width=258) (actual time=114.763..986.504 rows=638318 loops=1)
         Recheck Cond: (deleted = 0)
         ->  Bitmap Index Scan on vtiger_crmentity_deleted_idx  (cost=0.00..3646.54 rows=640439 width=0) (actual time=107.851..107.851 rows=638318 loops=1)
               Index Cond: (deleted = 0)
   ->  Hash  (cost=18.11..18.11 rows=211 width=1345) (actual time=0.823..0.823 rows=211 loops=1)
         ->  Seq Scan on vtiger_users  (cost=0.00..18.11 rows=211 width=1345) (actual time=0.005..0.496 rows=211 loops=1)
 Total runtime: 3869.022 ms


Sequential index is occuring on vtiger_users table while it has primary key index on id.
Could anyone please tell me why?

 

 \d vtiger_users
                                                           Table "public.vtiger_users"
       Column        |            Type             |                                          Modifiers                                          
---------------------+-----------------------------+----------------------------------------------------------------------------------------------
 id                  | integer                     | not null default nextval('vtiger_users_seq'::regclass)
 user_name           | character varying(255)      |
 user_password       | character varying(30)       |
 user_hash           | character varying(32)       |
 ...

Indexes:
    "vtiger_users_pkey" PRIMARY KEY, btree (id)
    "user_user_name_idx" btree (user_name)
    "user_user_password_idx" btree (user_password)
    "vtiger_users_user_name_lo_idx" btree (lower(user_name::text) varchar_pattern_ops)


 \d vtiger_crmentity
                 Table "public.vtiger_crmentity"
    Column    |            Type             |     Modifiers     
--------------+-----------------------------+--------------------
 crmid        | integer                     | not null
 smcreatorid  | integer                     | not null default 0
 smownerid    | integer                     | not null default 0
 modifiedby   | integer                     | not null default 0
 setype       | character varying(30)       | not null
 description  | text                        |
 createdtime  | timestamp without time zone | not null
 modifiedtime | timestamp without time zone | not null
 viewedtime   | timestamp without time zone |
 status       | character varying(50)       |
 version      | integer                     | not null default 0
 presence     | integer                     | default 1
 deleted      | integer                     | not null default 0
Indexes:
    "vtiger_crmentity_pkey" PRIMARY KEY, btree (crmid)
    "crmentity_deleted_smownerid_idx" btree (deleted, smownerid)
    "crmentity_modifiedby_idx" btree (modifiedby)
    "crmentity_smcreatorid_idx" btree (smcreatorid)
    "crmentity_smownerid_deleted_idx" btree (smownerid, deleted)
    "crmentity_smownerid_idx" btree (smownerid)
    "vtiger_crmentity_deleted_idx" btree (deleted)

Re: Why primary key index are not using in joining?

От
Scott Marlowe
Дата:
On Mon, Feb 15, 2010 at 2:35 AM, AI Rumman <rummandba@gmail.com> wrote:
>
> Please have a look at the following explain plan:
>
> explain analyze
> select *
> from vtiger_crmentity
> inner JOIN vtiger_users
>          ON vtiger_users.id = vtiger_crmentity.smownerid
> where vtiger_crmentity.deleted = 0  ;
>
> QUERY
> PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=3665.17..40019.25 rows=640439 width=1603) (actual
> time=115.613..3288.436 rows=638081 loops=1)
>    Hash Cond: ("outer".smownerid = "inner".id)
>    ->  Bitmap Heap Scan on vtiger_crmentity  (cost=3646.54..30394.02
> rows=640439 width=258) (actual time=114.763..986.504 rows=638318 loops=1)
>          Recheck Cond: (deleted = 0)
>          ->  Bitmap Index Scan on vtiger_crmentity_deleted_idx
> (cost=0.00..3646.54 rows=640439 width=0) (actual time=107.851..107.851
> rows=638318 loops=1)
>                Index Cond: (deleted = 0)
>    ->  Hash  (cost=18.11..18.11 rows=211 width=1345) (actual
> time=0.823..0.823 rows=211 loops=1)
>          ->  Seq Scan on vtiger_users  (cost=0.00..18.11 rows=211
> width=1345) (actual time=0.005..0.496 rows=211 loops=1)
>  Total runtime: 3869.022 ms
>
> Sequential index is occuring on vtiger_users table while it has primary key
> index on id.
> Could anyone please tell me why?

Cause it's only 211 rows and only takes 0.5 milliseconds to scan?

Re: Why primary key index are not using in joining?

От
Yeb Havinga
Дата:
AI Rumman wrote:
>
> explain analyze
> select *
> from vtiger_crmentity
> inner JOIN vtiger_users
>          ON vtiger_users.id <http://vtiger_users.id> =
> vtiger_crmentity.smownerid
> where vtiger_crmentity.deleted = 0  ;
>
> QUERY
> PLAN
>
>
------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=3665.17..40019.25 rows=640439 width=1603) (actual
> time=115.613..3288.436 rows=638081 loops=1)
>    Hash Cond: ("outer".smownerid = "inner".id)
>    ->  Bitmap Heap Scan on vtiger_crmentity  (cost=3646.54..30394.02
> rows=640439 width=258) (actual time=114.763..986.504 rows=638318 loops=1)
>          Recheck Cond: (deleted = 0)
>          ->  Bitmap Index Scan on vtiger_crmentity_deleted_idx
> (cost=0.00..3646.54 rows=640439 width=0) (actual time=107.851..107.851
> rows=638318 loops=1)
>                Index Cond: (deleted = 0)
>    ->  Hash  (cost=18.11..18.11 rows=211 width=1345) (actual
> time=0.823..0.823 rows=211 loops=1)
>          ->  Seq Scan on vtiger_users  (cost=0.00..18.11 rows=211
> width=1345) (actual time=0.005..0.496 rows=211 loops=1)
>  Total runtime: 3869.022 ms
>
> Sequential index is occuring on vtiger_users table while it has
> primary key index on id.
> Could anyone please tell me why?
>
 From the list of indexes you also supplied it seems to me you very much
want index scanning, the reason being that 4secs is too slow? The
seqscan is not the reason for that - the main reason is that you process
almost all rows of the crmentity table. I bet that if you add a LIMIT,
or adding a clause that selects only for a specific vtiger_user, the
plan looks different on the access to the crmentity table as well as the
kind of join, however if your application really needs to process the
600k rows, I'm not sure if it can get any faster than that. Perhaps it
would help a bit to shorten the SELECT * to only the attributes you
really need.

Regards,
Yeb Havinga


Re: Why primary key index are not using in joining?

От
Tom Lane
Дата:
Scott Marlowe <scott.marlowe@gmail.com> writes:
> On Mon, Feb 15, 2010 at 2:35 AM, AI Rumman <rummandba@gmail.com> wrote:
>> Please have a look at the following explain plan:

>>  Hash Join  (cost=3665.17..40019.25 rows=640439 width=1603) (actual
>> time=115.613..3288.436 rows=638081 loops=1)
>>    Hash Cond: ("outer".smownerid = "inner".id)
>>    ->  Bitmap Heap Scan on vtiger_crmentity  (cost=3646.54..30394.02
>> rows=640439 width=258) (actual time=114.763..986.504 rows=638318 loops=1)
>>          Recheck Cond: (deleted = 0)
>>          ->  Bitmap Index Scan on vtiger_crmentity_deleted_idx
>> (cost=0.00..3646.54 rows=640439 width=0) (actual time=107.851..107.851
>> rows=638318 loops=1)
>>                Index Cond: (deleted = 0)
>>    ->  Hash  (cost=18.11..18.11 rows=211 width=1345) (actual
>> time=0.823..0.823 rows=211 loops=1)
>>          ->  Seq Scan on vtiger_users  (cost=0.00..18.11 rows=211
>> width=1345) (actual time=0.005..0.496 rows=211 loops=1)
>>  Total runtime: 3869.022 ms
>>
>> Sequential index is occuring on vtiger_users table while it has primary key
>> index on id.

> Cause it's only 211 rows and only takes 0.5 milliseconds to scan?

Or, even more to the point, because a nestloop-with-inner-index-scan
plan would require 638318 repetitions of the inner index scan.  There's
no way that is going to be a better plan than this one.  Given the
rowcounts --- in particular, the fact that each vtiger_users row seems
to have a lot of join partners --- I don't think there *is* any better
plan than this one.  A nestloop with vtiger_crmentity on the inside
is the only alternative worth considering, and it doesn't look like
that could be any cheaper.

            regards, tom lane