Обсуждение: Index only select count(*)

Поиск
Список
Период
Сортировка

Index only select count(*)

От
Toni Helenius
Дата:

Hello,

 

I have a table with 961 columns. 1 of the columns is included in a primary key. The table has 52 760 rows. The table size is 137 MB and indexes size is 1176 kB (just the primary key). When I try:

SELECT count(*)

on it, it uses Seq Scan instead of Index scan. The query takes ~3 seconds. I have almost 500 similar tables. No index scans, why?

 

I’m using PostgreSQL 9.2.4 server. And I’ve read https://wiki.postgresql.org/wiki/Index-only_scans , apparently without understanding it. This is quite a performance issue to us. I’ve manually vacuumed and recreated indexes and all. No help. The output of analyze:

"Aggregate  (cost=18240.50..18240.51 rows=1 width=0)"

"  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760 width=0)"

 

--

Toni Helenius

Re: Index only select count(*)

От
Alban Hertroys
Дата:
On Oct 7, 2013, at 11:23, Toni Helenius <Toni.Helenius@syncrontech.com> wrote:

> The output of analyze:
> "Aggregate  (cost=18240.50..18240.51 rows=1 width=0)"
> "  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760 width=0)"

That's the output of Explain, not of Explain Analyze. The latter has actual measurements to go with the estimated
costs,which gives a lot more insight. 

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.



Re: Index only select count(*)

От
Toni Helenius
Дата:
Hi,

yes, I'm cheating by using GUI :) (PgAdmin)

Here:

"Aggregate  (cost=18240.50..18240.51 rows=1 width=0) (actual time=2911.117..2911.119 rows=1 loops=1)"
"  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760 width=0) (actual time=5.390..2816.274 rows=52760 loops=1)"
"Total runtime: 2912.211 ms"

-----Original Message-----
From: Alban Hertroys [mailto:haramrae@gmail.com]
Sent: 7. lokakuuta 2013 12:31
To: Toni Helenius
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Index only select count(*)

On Oct 7, 2013, at 11:23, Toni Helenius <Toni.Helenius@syncrontech.com> wrote:

> The output of analyze:
> "Aggregate  (cost=18240.50..18240.51 rows=1 width=0)"
> "  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760 width=0)"

That's the output of Explain, not of Explain Analyze. The latter has actual measurements to go with the estimated
costs,which gives a lot more insight. 

Alban Hertroys
--
If you can't see the forest for the trees, cut the trees and you'll find there is no forest.



Re: Index only select count(*)

От
Alban Hertroys
Дата:
On Oct 7, 2013, at 11:34, Toni Helenius <Toni.Helenius@syncrontech.com> wrote:

> Hi,
>
> yes, I'm cheating by using GUI :) (PgAdmin)

Please do not top-post.

Analyze is an entirely different command than Explain analyze. Analyze updates the statistics of tables, while Explain
analyzetells how those statistics affect the query plan. 

> Here:
>
> "Aggregate  (cost=18240.50..18240.51 rows=1 width=0) (actual time=2911.117..2911.119 rows=1 loops=1)"
> "  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760 width=0) (actual time=5.390..2816.274 rows=52760
loops=1)"
> "Total runtime: 2912.211 ms"

That article you referenced mentions pg_class.relallvisible - what value does that have for your table?
Is it possible that a relatively large amount of the data in that table is not visible to other sessions, or was that
perhapsthe case when you last (vacuum) analyzed the table? 

What is the plan if you set enable_seqscan = off; in your session? Does that give any more insight?

> -----Original Message-----
> From: Alban Hertroys [mailto:haramrae@gmail.com]
> Sent: 7. lokakuuta 2013 12:31
> To: Toni Helenius
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Index only select count(*)
>
> On Oct 7, 2013, at 11:23, Toni Helenius <Toni.Helenius@syncrontech.com> wrote:
>
>> The output of analyze:
>> "Aggregate  (cost=18240.50..18240.51 rows=1 width=0)"
>> "  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760 width=0)"
>
> That's the output of Explain, not of Explain Analyze. The latter has actual measurements to go with the estimated
costs,which gives a lot more insight. 
>
> Alban Hertroys
> --
> If you can't see the forest for the trees, cut the trees and you'll find there is no forest.
>

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.



Re: Index only select count(*)

От
Toni Helenius
Дата:
> On Oct 7, 2013, at 11:34, Toni Helenius <Toni.Helenius@syncrontech.com>
> wrote:
>
> > Hi,
> >
> > yes, I'm cheating by using GUI :) (PgAdmin)
>
> Please do not top-post.
>
> Analyze is an entirely different command than Explain analyze. Analyze
> updates the statistics of tables, while Explain analyze tells how those
> statistics affect the query plan.
>
> > Here:
> >
> > "Aggregate  (cost=18240.50..18240.51 rows=1 width=0) (actual
> time=2911.117..2911.119 rows=1 loops=1)"
> > "  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760 width=0)
> (actual time=5.390..2816.274 rows=52760 loops=1)"
> > "Total runtime: 2912.211 ms"
>
> That article you referenced mentions pg_class.relallvisible - what
> value does that have for your table?
> Is it possible that a relatively large amount of the data in that table
> is not visible to other sessions, or was that perhaps the case when you
> last (vacuum) analyzed the table?

Hmm, yes I didn't check these before.
SELECT pg_class.relname, pg_class.relallvisible, pg_class.relpages FROM pg_class where pg_class.relname = 'min1_009';
"min1_009";0;17581

So no pages are visible? How come? They should be? The schema I'm on has GRANT ALL ON SCHEMA "SERIES" TO public;

> What is the plan if you set enable_seqscan = off; in your session? Does
> that give any more insight?
>
> > -----Original Message-----
> > From: Alban Hertroys [mailto:haramrae@gmail.com]
> > Sent: 7. lokakuuta 2013 12:31
> > To: Toni Helenius
> > Cc: pgsql-general@postgresql.org
> > Subject: Re: [GENERAL] Index only select count(*)
> >
> > On Oct 7, 2013, at 11:23, Toni Helenius
> <Toni.Helenius@syncrontech.com> wrote:
> >
> >> The output of analyze:
> >> "Aggregate  (cost=18240.50..18240.51 rows=1 width=0)"
> >> "  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760
> width=0)"
> >
> > That's the output of Explain, not of Explain Analyze. The latter has
> actual measurements to go with the estimated costs, which gives a lot
> more insight.
> >
> > Alban Hertroys
> > --
> > If you can't see the forest for the trees, cut the trees and you'll
> find there is no forest.
> >
>
> Alban Hertroys
> --
> If you can't see the forest for the trees, cut the trees and you'll
> find there is no forest.



Re: Index only select count(*)

От
Toni Helenius
Дата:
> > On Oct 7, 2013, at 11:34, Toni Helenius
> > <Toni.Helenius@syncrontech.com>
> > wrote:
> >
> > > Hi,
> > >
> > > yes, I'm cheating by using GUI :) (PgAdmin)
> >
> > Please do not top-post.
> >
> > Analyze is an entirely different command than Explain analyze.
> Analyze
> > updates the statistics of tables, while Explain analyze tells how
> > those statistics affect the query plan.
> >
> > > Here:
> > >
> > > "Aggregate  (cost=18240.50..18240.51 rows=1 width=0) (actual
> > time=2911.117..2911.119 rows=1 loops=1)"
> > > "  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760
> > > width=0)
> > (actual time=5.390..2816.274 rows=52760 loops=1)"
> > > "Total runtime: 2912.211 ms"
> >
> > That article you referenced mentions pg_class.relallvisible - what
> > value does that have for your table?
> > Is it possible that a relatively large amount of the data in that
> > table is not visible to other sessions, or was that perhaps the case
> > when you last (vacuum) analyzed the table?
>
> Hmm, yes I didn't check these before.
> SELECT pg_class.relname, pg_class.relallvisible, pg_class.relpages FROM
> pg_class where pg_class.relname = 'min1_009';
> "min1_009";0;17581
>
> So no pages are visible? How come? They should be? The schema I'm on
> has GRANT ALL ON SCHEMA "SERIES" TO public;

Hmm, I was able to get an index-only scan and get all pages to be visible by doing the VACUUM in the same session.
RunningVACUUM from PgAdmin GUI didn't have any effect. 

This did the trick:
VACUUM "SERIES".min1_009;
EXPLAIN ANALYZE SELECT count(*)
  FROM "SERIES".min1_009;

But I'm still confused about this, do I need to log in on the database with the login I use and manually do the vacuum?
Becauseauto-vacuum doesn't update the visibility maps for that user or session or something? Should auto-vacuum do
this?

> > What is the plan if you set enable_seqscan = off; in your session?
> > Does that give any more insight?
> >
> > > -----Original Message-----
> > > From: Alban Hertroys [mailto:haramrae@gmail.com]
> > > Sent: 7. lokakuuta 2013 12:31
> > > To: Toni Helenius
> > > Cc: pgsql-general@postgresql.org
> > > Subject: Re: [GENERAL] Index only select count(*)
> > >
> > > On Oct 7, 2013, at 11:23, Toni Helenius
> > <Toni.Helenius@syncrontech.com> wrote:
> > >
> > >> The output of analyze:
> > >> "Aggregate  (cost=18240.50..18240.51 rows=1 width=0)"
> > >> "  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760
> > width=0)"
> > >
> > > That's the output of Explain, not of Explain Analyze. The latter
> has
> > actual measurements to go with the estimated costs, which gives a lot
> > more insight.
> > >
> > > Alban Hertroys
> > > --
> > > If you can't see the forest for the trees, cut the trees and you'll
> > find there is no forest.
> > >
> >
> > Alban Hertroys
> > --
> > If you can't see the forest for the trees, cut the trees and you'll
> > find there is no forest.
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To
> make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


Re: Index only select count(*)

От
Toni Helenius
Дата:
> > > On Oct 7, 2013, at 11:34, Toni Helenius
> > > <Toni.Helenius@syncrontech.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > yes, I'm cheating by using GUI :) (PgAdmin)
> > >
> > > Please do not top-post.
> > >
> > > Analyze is an entirely different command than Explain analyze.
> > Analyze
> > > updates the statistics of tables, while Explain analyze tells how
> > > those statistics affect the query plan.
> > >
> > > > Here:
> > > >
> > > > "Aggregate  (cost=18240.50..18240.51 rows=1 width=0) (actual
> > > time=2911.117..2911.119 rows=1 loops=1)"
> > > > "  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760
> > > > width=0)
> > > (actual time=5.390..2816.274 rows=52760 loops=1)"
> > > > "Total runtime: 2912.211 ms"
> > >
> > > That article you referenced mentions pg_class.relallvisible - what
> > > value does that have for your table?
> > > Is it possible that a relatively large amount of the data in that
> > > table is not visible to other sessions, or was that perhaps the
> case
> > > when you last (vacuum) analyzed the table?
> >
> > Hmm, yes I didn't check these before.
> > SELECT pg_class.relname, pg_class.relallvisible, pg_class.relpages
> > FROM pg_class where pg_class.relname = 'min1_009';
> > "min1_009";0;17581
> >
> > So no pages are visible? How come? They should be? The schema I'm on
> > has GRANT ALL ON SCHEMA "SERIES" TO public;
>
> Hmm, I was able to get an index-only scan and get all pages to be
> visible by doing the VACUUM in the same session. Running VACUUM from
> PgAdmin GUI didn't have any effect.
>
> This did the trick:
> VACUUM "SERIES".min1_009;
> EXPLAIN ANALYZE SELECT count(*)
>   FROM "SERIES".min1_009;
>
> But I'm still confused about this, do I need to log in on the database
> with the login I use and manually do the vacuum? Because auto-vacuum
> doesn't update the visibility maps for that user or session or
> something? Should auto-vacuum do this?

Ok, from the GUI I checked FULL, FREEZE & ANALYZE. That combination doesn't seem to set the last vacuum date either. So
itwas wrong for me to use it like that. And auto-vacuum seems to update visibility maps as planned. My mistakes. Sorry
aboutthat. Everything seems to work. 

> > > What is the plan if you set enable_seqscan = off; in your session?
> > > Does that give any more insight?
> > >
> > > > -----Original Message-----
> > > > From: Alban Hertroys [mailto:haramrae@gmail.com]
> > > > Sent: 7. lokakuuta 2013 12:31
> > > > To: Toni Helenius
> > > > Cc: pgsql-general@postgresql.org
> > > > Subject: Re: [GENERAL] Index only select count(*)
> > > >
> > > > On Oct 7, 2013, at 11:23, Toni Helenius
> > > <Toni.Helenius@syncrontech.com> wrote:
> > > >
> > > >> The output of analyze:
> > > >> "Aggregate  (cost=18240.50..18240.51 rows=1 width=0)"
> > > >> "  ->  Seq Scan on min1_009  (cost=0.00..18108.60 rows=52760
> > > width=0)"
> > > >
> > > > That's the output of Explain, not of Explain Analyze. The latter
> > has
> > > actual measurements to go with the estimated costs, which gives a
> > > lot more insight.
> > > >
> > > > Alban Hertroys
> > > > --
> > > > If you can't see the forest for the trees, cut the trees and
> > > > you'll
> > > find there is no forest.
> > > >
> > >
> > > Alban Hertroys
> > > --
> > > If you can't see the forest for the trees, cut the trees and you'll
> > > find there is no forest.
> >
> >
> >
> > --
> > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To
> > make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-general