Обсуждение: Slow counting still true?

Поиск
Список
Период
Сортировка

Slow counting still true?

От
Thomas Guettler
Дата:
Release 9.2 should increase count(*) performance. Is this wiki page still valid?

http://wiki.postgresql.org/wiki/Slow_Counting

Please update the content.

Thank you,
   Thomas Güttler


--
Thomas Guettler, http://www.thomas-guettler.de/
E-Mail: guettli (*) thomas-guettler + de


Re: Slow counting still true?

От
Chris Travers
Дата:


On Mon, Sep 17, 2012 at 1:07 AM, Thomas Guettler <hv@tbz-pariv.de> wrote:
Release 9.2 should increase count(*) performance. Is this wiki page still valid?

http://wiki.postgresql.org/wiki/Slow_Counting

Please update the content.

As I understand it, covering indexes don't currently help with count(*) because indexes can't be traversed in physical order, so it is a matter of trading random disk I/O for a much larger amount of sequential disk I/O.

Best Wishes,
Chris Travers

Re: Slow counting still true?

От
Jeff Janes
Дата:
On Mon, Sep 17, 2012 at 1:07 AM, Thomas Guettler <hv@tbz-pariv.de> wrote:
> Release 9.2 should increase count(*) performance. Is this wiki page still
> valid?

Even with index only scans, it still has to walk through every tuple,
even if they are only index tuples not table tuples.  So nothing has
fundamentally changed.  So it is still true that "PostgreSQL must walk
through all rows, in some sense".

I guess this sentence is partly out of date: "PostgreSQL will still
need to read the resulting rows to verify that they exist; other
database systems may only need to reference the index in this
situation."  Maybe just deleting it would be appropriate.

Cheers,

Jeff


Re: Slow counting still true?

От
Edson Richter
Дата:
Em 17/09/2012 06:13, Chris Travers escreveu:


On Mon, Sep 17, 2012 at 1:07 AM, Thomas Guettler <hv@tbz-pariv.de> wrote:
Release 9.2 should increase count(*) performance. Is this wiki page still valid?

http://wiki.postgresql.org/wiki/Slow_Counting

Please update the content.

As I understand it, covering indexes don't currently help with count(*) because indexes can't be traversed in physical order, so it is a matter of trading random disk I/O for a much larger amount of sequential disk I/O.

Best Wishes,
Chris Travers

I'm just a little bit curious, and since count(*) affects a lot my applications (every web system has a paginating feature that depends on count(*) to calculate number of pages without loading everything), I'm also interested in this topic.

The wiki page in question has been updated today, and I see the alert in top of page "Note that the following article only applies to versions of PostgreSQL prior to 9.2. Index-only scans are now implemented."

So seems that traversing indexes for count(*) would be faster on 9.2, right?

AFAIK, for count(*) doesn't matter the order data is stored - just need to load index leaf pages and count from there, right?


Edson

Re: Slow counting still true?

От
Dann Corbit
Дата:

If the numbers do not have to be exact the web applications could use the cardinality estimates stored in the system tables.

 

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Edson Richter
Sent: Monday, September 17, 2012 9:14 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Slow counting still true?

 

Em 17/09/2012 06:13, Chris Travers escreveu:

 

On Mon, Sep 17, 2012 at 1:07 AM, Thomas Guettler <hv@tbz-pariv.de> wrote:

Release 9.2 should increase count(*) performance. Is this wiki page still valid?

http://wiki.postgresql.org/wiki/Slow_Counting

Please update the content.

As I understand it, covering indexes don't currently help with count(*) because indexes can't be traversed in physical order, so it is a matter of trading random disk I/O for a much larger amount of sequential disk I/O.

 

Best Wishes,

Chris Travers


I'm just a little bit curious, and since count(*) affects a lot my applications (every web system has a paginating feature that depends on count(*) to calculate number of pages without loading everything), I'm also interested in this topic.

The wiki page in question has been updated today, and I see the alert in top of page "Note that the following article only applies to versions of PostgreSQL prior to 9.2. Index-only scans are now implemented."

So seems that traversing indexes for count(*) would be faster on 9.2, right?

AFAIK, for count(*) doesn't matter the order data is stored - just need to load index leaf pages and count from there, right?


Edson

Re: Slow counting still true?

От
Jeff Janes
Дата:
On Mon, Sep 17, 2012 at 9:14 AM, Edson Richter <edsonrichter@hotmail.com> wrote:

> The wiki page in question has been updated today, and I see the alert in top
> of page "Note that the following article only applies to versions of
> PostgreSQL prior to 9.2. Index-only scans are now implemented."
>
> So seems that traversing indexes for count(*) would be faster on 9.2, right?

Not really, as it still needs to visit some representation of every
tuple.  Now, if the entire index in is RAM while the table would not
be, it could be a lot faster.  But that is more of a special case than
a general one.

> AFAIK, for count(*) doesn't matter the order data is stored - just need to
> load index leaf pages and count from there, right?

That would only work if there was no concurrent activity.  If someone
else splits on index page, some of the entries on that page could move
to a location where they would get visited either zero times or two
times.

Cheers,

Jeff


Re: Slow counting still true?

От
Edson Richter
Дата:
Em 18/09/2012 15:24, Jeff Janes escreveu:
> On Mon, Sep 17, 2012 at 9:14 AM, Edson Richter <edsonrichter@hotmail.com> wrote:
>
>> The wiki page in question has been updated today, and I see the alert in top
>> of page "Note that the following article only applies to versions of
>> PostgreSQL prior to 9.2. Index-only scans are now implemented."
>>
>> So seems that traversing indexes for count(*) would be faster on 9.2, right?
> Not really, as it still needs to visit some representation of every
> tuple.  Now, if the entire index in is RAM while the table would not
> be, it could be a lot faster.  But that is more of a special case than
> a general one.
>
>> AFAIK, for count(*) doesn't matter the order data is stored - just need to
>> load index leaf pages and count from there, right?
> That would only work if there was no concurrent activity.  If someone
> else splits on index page, some of the entries on that page could move
> to a location where they would get visited either zero times or two
> times.
I see. This is were MS SQL Server escalates row locks into page locks,
and get rid of the concurrency (at very expensive cost, IMHO).

Regards,

Edson

>
> Cheers,
>
> Jeff
>
>