Обсуждение: [GENERAL] Matching indexe for timestamp

Поиск
Список
Период
Сортировка

[GENERAL] Matching indexe for timestamp

От
Job
Дата:
Hello,
 
on this table:
 
                                          Table "public.gruorari_tmp"
  Column   |          Type          |                                Modifiers
-----------+------------------------+--------------------------------------------------------------------------
 id        | numeric(1000,1)        | not null default function_get_next_sequence('gruorari_tmp_id_seq'::text)
 idgrucate | numeric(1000,1)        |
 dalle     | time without time zone |
 alle      | time without time zone |
 gg_sett   | integer                |
 azione    | character varying      |
Indexes:
    "keygruorari_tmp" PRIMARY KEY, btree (id)
    "gruorari_tmp_alle_idx" btree (alle)
    "gruorari_tmp_dalle_alle_idx" btree (dalle, alle)
    "gruorari_tmp_dalle_idx" btree (dalle)
    "gruorari_tmp_gg_sett_idx" btree (gg_sett)
    "gruorari_tmp_idgrucate_idx" btree (idgrucate)
 
i have a specific condition (i report example value):
"and ( gruorari_tmp.id is null or ( 2 = gg_sett and '16:00:00'::time between gruorari_tmp.dalle and gruorari_tmp.alle ) )"
 
But in the query planner, at that point, Postgresql 9.6.1 seems not to use any index (single on dalle / alle field and combindex index on dalle+alle) but it use seqscan:
 
Seq Scan on gruorari_tmp  (cost=0.00..5.90 rows=290 width=68) (actual time=0.014..0.062 rows=290 loops=1)
                                 ->  Hash  (cost=164.06..164.06 rows=1 width=29) (actual time=0.770..0.770 rows=1 loops=1)
And it seems to be the main cost for the whole query.
Which kind of index should i use for that condition/fields?

Thank you!

/F

Re: [GENERAL] Matching indexe for timestamp

От
Adrian Klaver
Дата:
On 01/09/2017 01:58 PM, Job wrote:
> Hello,
>
> on this table:
>
>                                           Table "public.gruorari_tmp"
>   Column   |          Type          |
> Modifiers
> -----------+------------------------+--------------------------------------------------------------------------
>  id        | numeric(1000,1)        | not null default
> function_get_next_sequence('gruorari_tmp_id_seq'::text)
>  idgrucate | numeric(1000,1)        |
>  dalle     | time without time zone |
>  alle      | time without time zone |
>  gg_sett   | integer                |
>  azione    | character varying      |
> Indexes:
>     "keygruorari_tmp" PRIMARY KEY, btree (id)
>     "gruorari_tmp_alle_idx" btree (alle)
>     "gruorari_tmp_dalle_alle_idx" btree (dalle, alle)
>     "gruorari_tmp_dalle_idx" btree (dalle)
>     "gruorari_tmp_gg_sett_idx" btree (gg_sett)
>     "gruorari_tmp_idgrucate_idx" btree (idgrucate)
>


> i have a specific condition (i report example value):
> "and ( gruorari_tmp.id is null or ( 2 = gg_sett and '16:00:00'::time
> between gruorari_tmp.dalle and gruorari_tmp.alle ) )"
>
> But in the query planner, at that point, Postgresql 9.6.1 seems not to
> use any index (single on dalle / alle field and combindex index on
> dalle+alle) but it use seqscan:
>
> Seq Scan on gruorari_tmp  (cost=0.00..5.90 rows=290 width=68) (actual
> time=0.014..0.062 rows=290 loops=1)
>                                  ->  Hash  (cost=164.06..164.06 rows=1
> width=29) (actual time=0.770..0.770 rows=1 loops=1)
> And it seems to be the main cost for the whole query.
> Which kind of index should i use for that condition/fields?

As before, giving just snippets of the information is going to prolong
or prevent arriving at an answer. So:

1) What is the complete query?

2) What is the complete EXPLAIN ANALYZE?


>
> Thank you!
>
> /F


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: R: [GENERAL] Matching indexe for timestamp

От
Adrian Klaver
Дата:
On 01/09/2017 03:38 PM, Job wrote:

Please also reply to list.  I do not have time at the moment to go
through this, someone else on the list might.

> Hi Adrian,
>
> You are right; here is the query and the planner.
> I think indexes are not used at all!
>
> /F
>
> EXPLAIN ANALYZE select
>       webrecord.dominio
> from webrecord
>       left join grucategorie on grucategorie.codcategoria=webrecord.categoria and grucategorie.codgruppo='f50147_01'
>       left join grulist on grulist.nome=webrecord.dominio and grulist.codgruppo='f50147_01' and grulist.stato in (1)
>       left join firewall_geo_reject on firewall_geo_reject.country=webrecord.country and
firewall_geo_reject.codgruppo='f50147_01'
>       left join gruorari_tmp on gruorari_tmp.idgrucate=grucategorie.id
> where dominio='PATTERN'
>       and ( grulist.stato=1 OR grucategorie.codcategoria is not null OR firewall_geo_reject.country is not null )
>       and ( gruorari_tmp.id is null or ( 1 = gg_sett and '17:23:00'::time between gruorari_tmp.dalle and
gruorari_tmp.alle) ) 
>       and NOT EXISTS (select 1 from grulist where stato=2 and codgruppo='f50147_01' and nome='PATTERN')
> limit 1;
>                                                                                                          QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=130.51..172.16 rows=1 width=14) (actual time=436.537..436.538 rows=1 loops=1)
>    InitPlan 1 (returns $0)
>      ->  Index Only Scan using aaa_idx on grulist grulist_1  (cost=0.29..80.31 rows=1 width=0) (actual
time=0.062..0.062rows=0 loops=1) 
>            Index Cond: ((stato = '2'::numeric) AND (codgruppo = 'f50147_01'::text) AND (nome = 'PATTERN'::text))
>            Heap Fetches: 0
>    ->  Result  (cost=50.21..303115.67 rows=7277 width=14) (actual time=436.534..436.534 rows=1 loops=1)
>          One-Time Filter: (NOT $0)
>          ->  Nested Loop Left Join  (cost=50.21..303115.67 rows=7277 width=14) (actual time=436.463..436.463 rows=1
loops=1)
>                Join Filter: ((grulist.nome)::text = (webrecord.dominio)::text)
>                Filter: ((grulist.stato = '1'::numeric) OR (grucategorie.codcategoria IS NOT NULL) OR
(firewall_geo_reject.countryIS NOT NULL)) 
>                ->  Nested Loop Left Join  (cost=49.92..302908.01 rows=7277 width=25) (actual time=436.347..436.347
rows=1loops=1) 
>                      Join Filter: ((firewall_geo_reject.country)::text = (webrecord.country)::text)
>                      Rows Removed by Join Filter: 13
>                      ->  Nested Loop Left Join  (cost=49.92..300318.08 rows=7277 width=46) (actual
time=431.407..431.407rows=1 loops=1) 
>                            Join Filter: ((grucategorie.codcategoria)::text = (webrecord.categoria)::text)
>                            Rows Removed by Join Filter: 18
>                            Filter: ((gruorari_tmp.id IS NULL) OR ((1 = gruorari_tmp.gg_sett) AND ('17:23:00'::time
withouttime zone >= gruorari_tmp.dalle) AND ('17:23:00'::time without time zone <= gruorari_tmp.alle))) 
>                            ->  Seq Scan on webrecord  (cost=0.00..249584.12 rows=159614 width=70) (actual
time=430.696..430.696rows=1 loops=1) 
>                                  Filter: ((dominio)::text = 'PATTERN'::text)
>                                  Rows Removed by Filter: 596858
>                            ->  Materialize  (cost=49.92..455.58 rows=14 width=35) (actual time=0.663..0.689 rows=19
loops=1)
>                                  ->  Hash Left Join  (cost=49.92..455.51 rows=14 width=35) (actual time=0.639..0.656
rows=19loops=1) 
>                                        Hash Cond: (grucategorie.id = gruorari_tmp.idgrucate)
>                                        ->  Bitmap Heap Scan on grucategorie  (cost=40.40..445.70 rows=14 width=17)
(actualtime=0.142..0.143 rows=19 loops=1) 
>                                              Recheck Cond: ((codgruppo)::text = 'f50147_01'::text)
>                                              Heap Blocks: exact=5
>                                              ->  Bitmap Index Scan on grucategorie_codgruppo_idx  (cost=0.00..40.39
rows=14width=0) (actual time=0.084..0.084 rows=83 loops=1) 
>                                                    Index Cond: ((codgruppo)::text = 'f50147_01'::text)
>                                        ->  Hash  (cost=5.90..5.90 rows=290 width=36) (actual time=0.381..0.381
rows=290loops=1) 
>                                              Buckets: 1024  Batches: 1  Memory Usage: 29kB
>                                              ->  Seq Scan on gruorari_tmp  (cost=0.00..5.90 rows=290 width=36)
(actualtime=0.023..0.176 rows=290 loops=1) 
>                      ->  Materialize  (cost=0.00..297.73 rows=21 width=3) (actual time=0.151..4.928 rows=13 loops=1)
>                            ->  Seq Scan on firewall_geo_reject  (cost=0.00..297.62 rows=21 width=3) (actual
time=0.141..4.912rows=13 loops=1) 
>                                  Filter: ((codgruppo)::text = 'f50147_01'::text)
>                                  Rows Removed by Filter: 15717
>                ->  Materialize  (cost=0.29..80.31 rows=1 width=19) (actual time=0.109..0.109 rows=0 loops=1)
>                      ->  Index Scan using bbb_idx on grulist  (cost=0.29..80.31 rows=1 width=19) (actual
time=0.095..0.095rows=0 loops=1) 
>                            Index Cond: (((codgruppo)::text = 'f50147_01'::text) AND (stato = '1'::numeric))
>                            Filter: ((nome)::text = 'PATTERN'::text)
>                            Rows Removed by Filter: 1
>  Planning time: 14.996 ms
>  Execution time: 436.840 ms
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [GENERAL] Matching indexe for timestamp

От
"David G. Johnston"
Дата:
On Mon, Jan 9, 2017 at 2:58 PM, Job <Job@colliniconsulting.it> wrote:

But in the query planner, at that point, Postgresql 9.6.1 seems not to use any index (single on dalle / alle field and combindex index on dalle+alle) but it use seqscan:
 
Seq Scan on gruorari_tmp  (cost=0.00..5.90 rows=290 width=68) (actual time=0.014..0.062 rows=290 loops=1)
                                 ->  Hash  (cost=164.06..164.06 rows=1 width=29) (actual time=0.770..0.770 rows=1 loops=1)


​Given a query with expected "rows=290" I am not surprised that it would simply scan the entire relation.  Especially since you have four columns in your where clause and so any one index would be insufficient.

You sound as if you believe that any query that doesn't use an index is flawed.  That is not the case.

David J.

Re: [GENERAL] Matching indexe for timestamp

От
Vitaly Burovoy
Дата:
On 1/9/17, Job <Job@colliniconsulting.it> wrote:
> Hello,
>
> on this table:
>
>                                           Table "public.gruorari_tmp"
>   Column   |          Type          |
> Modifiers
> -----------+------------------------+--------------------------------------------------------------------------
>  id        | numeric(1000,1)        | not null default
> function_get_next_sequence('gruorari_tmp_id_seq'::text)
>  idgrucate | numeric(1000,1)        |
>  dalle     | time without time zone |
>  alle      | time without time zone |
>  gg_sett   | integer                |
>  azione    | character varying      |
> Indexes:
>     "keygruorari_tmp" PRIMARY KEY, btree (id)
>     "gruorari_tmp_alle_idx" btree (alle)
>     "gruorari_tmp_dalle_alle_idx" btree (dalle, alle)
>     "gruorari_tmp_dalle_idx" btree (dalle)
>     "gruorari_tmp_gg_sett_idx" btree (gg_sett)
>     "gruorari_tmp_idgrucate_idx" btree (idgrucate)
>
> i have a specific condition (i report example value):
> "and ( gruorari_tmp.id is null or ( 2 = gg_sett and '16:00:00'::time between
> gruorari_tmp.dalle and gruorari_tmp.alle ) )"
>
> But in the query planner, at that point, Postgresql 9.6.1 seems not to use
> any index (single on dalle / alle field and combindex index on dalle+alle)

Of course. There is no reason to use any index because the condition
"gruorari_tmp.id is null" is not covered by any of them. To find such
rows you have to scan all table (because there can be with any
"alle".."dalle" values), that's why Postgres uses SeqScan.

> but it use seqscan:
>
> Seq Scan on gruorari_tmp  (cost=0.00..5.90 rows=290 width=68) (actual
> time=0.014..0.062 rows=290 loops=1)
>                                  ->  Hash  (cost=164.06..164.06 rows=1
> width=29) (actual time=0.770..0.770 rows=1 loops=1)
> And it seems to be the main cost for the whole query.

> Which kind of index should i use for that condition/fields?

for _big_ tables Postgres can use "bitmap OR" node if there are two
indexes which can be used.
So create index for "gruorari_tmp.id is null" and gin/gist "(gg_sett,
timerange(gruorari_tmp.dalle, gruorari_tmp.alle))" (see below).

If you use a condition like "<value> between colA and colB", some sort
of a "timerange" is the best case. Unfortunately there is no such
type, but it is easy to create it by an example[1].
Then you can use btree_gin or btree_gist (depending on a base index
type) extension to use an ordinary type column(s) with range type
column(s).

P.S.: Postgres can not to use indexes even if they are right because
according to a statistics SeqScan will take similar access time.

[1]https://www.postgresql.org/docs/9.6/static/rangetypes.html#RANGETYPES-DEFINING

--
Best regards,
Vitaly Burovoy