Discussion: Performance of count(*)

From:
Andreas Tille
Date:

Hi,

I'm just trying to find out why a simple count(*) takes so long.
At first I tried EXPLAIN, which rather quickly knows how many rows
it would have to check, but the actual count is two orders of magnitude
slower.

My colleague, who uses MS SQL Server, can't believe that.

$ psql InfluenzaWeb -c 'explain SELECT count(*) from agiraw ;'
                               QUERY PLAN
-----------------------------------------------------------------------
  Aggregate  (cost=196969.77..196969.77 rows=1 width=0)
    ->  Seq Scan on agiraw  (cost=0.00..185197.41 rows=4708941 width=0)
(2 rows)

real    0m0.066s
user    0m0.024s
sys     0m0.008s

$ psql InfluenzaWeb -c 'SELECT count(*) from agiraw ;'
   count
---------
  4708941
(1 row)

real    0m4.474s
user    0m0.036s
sys     0m0.004s


Any explanation?

Kind regards

          Andreas.

--
http://fam-tille.de

From:
Andreas Kostyrka
Date:

* Andreas Tille <> [070322 12:07]:
> Hi,
>
> I'm just trying to find out why a simple count(*) takes so long.
> At first I tried EXPLAIN, which rather quickly knows how many rows
> it would have to check, but the actual count is two orders of magnitude slower.

Which version of PG?

The basic problem is that EXPLAIN answers quickly because it only has
to look at the planner's statistics.

The SELECT proper, OTOH, has to go through the whole table to determine
which rows are visible to your transaction.

That's why PG does a SeqScan for SELECT count(*) FROM table (check the
newest releases; I seem to remember there have been some aggregate
optimizations there).  BTW, depending upon your data, a SELECT count(*)
FROM table WHERE user=X may use an index, but it will still need to
fetch the rows themselves to validate them.
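To illustrate that last point, a sketch against the table from the original post (the column and index names here are invented for the example):

```sql
-- With a selective predicate the planner can use an index to find
-- candidate rows, but each one must still be fetched from the heap
-- to check that it is visible to the current transaction.
CREATE INDEX agiraw_user_idx ON agiraw (usercol);

EXPLAIN SELECT count(*) FROM agiraw WHERE usercol = 'X';
-- for a selective predicate, expect an Aggregate over an Index Scan
-- rather than a Seq Scan; the heap fetches are still not free
```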

Andreas


From:
Albert Cervera Areny
Date:

As you can see, PostgreSQL needs to do a sequential scan to count
because of its MVCC nature: indexes don't carry transaction-visibility
information.  It's a known drawback inherent to the way PostgreSQL
works, one that gives very good results in other areas.  Adding some
kind of approximate count that wouldn't need a full table scan has been
discussed, but I don't think there's anything available right now.

On Thursday 22 March 2007 11:53, Andreas Tille wrote:
> I'm just trying to find out why a simple count(*) takes so long.
> At first I tried EXPLAIN, which rather quickly knows how many rows
> it would have to check, but the actual count is two orders of magnitude slower.

--
Albert Cervera Areny
Dept. Informàtica Sedifa, S.L.

Av. Can Bordoll, 149
08202 - Sabadell (Barcelona)
Tel. 93 715 51 11
Fax. 93 715 51 12

====================================================================
........................  AVISO LEGAL  ............................
La   presente  comunicación  y sus anexos tiene como destinatario la
persona a  la  que  va  dirigida, por  lo  que  si  usted lo  recibe
por error  debe  notificarlo  al  remitente  y   eliminarlo   de  su
sistema,  no  pudiendo  utilizarlo,  total  o   parcialmente,   para
ningún  fin.  Su  contenido  puede  tener información confidencial o
protegida legalmente   y   únicamente   expresa  la  opinión     del
remitente.  El   uso   del   correo   electrónico   vía Internet  no
permite   asegurar    ni  la   confidencialidad   de   los  mensajes
ni    su    correcta     recepción.   En    el  caso   de   que   el
destinatario no consintiera la utilización  del correo  electrónico,
deberá ponerlo en nuestro conocimiento inmediatamente.
====================================================================
........................... DISCLAIMER .............................
This message and its  attachments are  intended  exclusively for the
named addressee. If you  receive  this  message  in   error,  please
immediately delete it from  your  system  and notify the sender. You
may  not  use  this message  or  any  part  of it  for any  purpose.
The   message   may  contain  information  that  is  confidential or
protected  by  law,  and  any  opinions  expressed  are those of the
individual    sender.  Internet  e-mail   guarantees   neither   the
confidentiality   nor  the  proper  receipt  of  the  message  sent.
If  the  addressee  of  this  message  does  not  consent to the use
of   internet    e-mail,    please    inform     us    inmmediately.
====================================================================




From:
ismo.tuononen@solenovo.fi
Date:

EXPLAIN is just "guessing" how many rows are in the table.  Sometimes
the guess is right, sometimes it's just a rough estimate.

sailabdb=# explain SELECT count(*) from sl_tuote;
                              QUERY PLAN
----------------------------------------------------------------------
 Aggregate  (cost=10187.10..10187.11 rows=1 width=0)
   ->  Seq Scan on sl_tuote  (cost=0.00..9806.08 rows=152408 width=0)
(2 rows)

sailabdb=# SELECT count(*) from sl_tuote;
 count
-------
 62073
(1 row)


So in that case EXPLAIN estimates that the sl_tuote table has 152408
rows, but there are only 62073 rows.

After ANALYZE, the estimates are better:

sailabdb=# vacuum analyze sl_tuote;
VACUUM
sailabdb=# explain SELECT count(*) from sl_tuote;
                             QUERY PLAN
---------------------------------------------------------------------
 Aggregate  (cost=9057.91..9057.92 rows=1 width=0)
   ->  Seq Scan on sl_tuote  (cost=0.00..8902.73 rows=62073 width=0)
(2 rows)

You can never trust that estimate; if you need the exact number, you must always count.

Ismo


From:
ismo.tuononen@solenovo.fi
Date:

approximated count?????

Why? Who would need it? Where could you use it?

Calculating costs and deciding how to execute a query needs an
approximate count, but it's totally worthless information for any user,
IMO.

Ismo


From:
Andreas Tille
Date:

On Thu, 22 Mar 2007, Andreas Kostyrka wrote:

> Which version of PG?

Ahh, sorry, I forgot that.  The issue occurs with the Debian (Etch)
packaged version, 7.4.16.  I plan to switch to 8.1.8 soon.

> That's the reason why PG (check the newest releases, I seem to
> remember that there has been some aggregate optimizations there),

I'll verify this once I've moved to the new version.

> does
> a SeqScan for select count(*) from table. btw, depending upon your
> data, doing a select count(*) from table where user=X will use an
> Index, but will still need to fetch the rows proper to validate them.

I have an index on three of the seven columns of this table.  Is there
any chance to optimize indexing in this regard?

Well, to be honest, I'm not really interested in the performance of
count(*).  I was just discussing general performance issues on the
phone, and when my colleague asked me about the size of the database
he wondered why this takes so long when his MS SQL server does the
same job much faster.  So in principle I was just asking a first
question that was easy to ask.  Perhaps I'll come up with more
difficult optimisation questions later.

Kind regards

         Andreas.

--
http://fam-tille.de

From:
Andreas Kostyrka
Date:

* Andreas Tille <> [070322 13:24]:
> On Thu, 22 Mar 2007, Andreas Kostyrka wrote:
>
> >Which version of PG?
>
> Ahh, sorry, forgot that.  The issue occurs in Debian (Etch) packaged
> version 7.4.16.  I plan to switch soon to 8.1.8.
I'd recommend 8.2 if at all possible :)
>
> >That's the reason why PG (check the newest releases, I seem to
> >remember that there has been some aggregate optimizations there),
>
> I'll verify this once I moved to the new version.
8.1 won't help you I'd guess. ;)

>
> >does
> >a SeqScan for select count(*) from table. btw, depending upon your
> >data, doing a select count(*) from table where user=X will use an
> >Index, but will still need to fetch the rows proper to validate them.
>
> I have an index on three of the seven columns of this table.  Is there
> any chance to optimize indexing in this regard?
Well, that depends upon your query pattern. It's an art and a science
at the same time ;)
>
> Well, to be honest, I'm not really interested in the performance of
> count(*).  I was just discussing general performance issues on the
> phone, and when my colleague asked me about the size of the database
> he wondered why this takes so long when his MS SQL server does the
> same job much faster.  So in principle I was just asking a first
> question that was easy to ask.  Perhaps I'll come up with more
> difficult optimisation questions later.

Simple. MSSQL is optimized for this case, and uses "older"
data structures. PG uses MVCC storage, which is not optimized for
this use case.  It's quite fast for different kinds of queries.

The basic trouble here is that MVCC makes it a little harder to decide
what is valid for your transaction, plus the indexes seem to be
designed for lookups, not for data fetching.  (Basically, PG can use
indexes only to locate potential data, but cannot return data directly
out of an index.)

Andreas

From:
Bill Moran
Date:

In response to :
>
> approximated count?????
>
> why? who would need it? where you can use it?
>
> calculating costs and deciding how to execute a query needs an
> approximate count, but it's totally worthless information for any user
> IMO.

I don't think so.

We have some AJAX stuff where users enter search criteria on a web form,
and the # of results updates in "real time" as they change their criteria.

Right now, this works fine with small tables using count(*) -- it's fast
enough not to be an issue, but we're aware that we can't use it on large
tables.

An estimate_count(*) or similar that would allow us to put an estimate of
how many results will be returned (not guaranteed accurate) would be very
nice to have in these cases.

We're dealing with complex sets of criteria.  It's very useful for the users
to know in "real time" how much their search criteria are affecting the
result pool.  Once they feel they've limited as much as they can without
reducing the pool too much, they can hit submit and get the actual result.

As I said, we do this with small data sets, but it's not terribly useful
there.  Where it will be useful is searches of large data sets, where
constantly submitting and then retrying is overly time-consuming.

Of course, this is count(*)ing the results of a complex query, possibly
with a bunch of joins and many limitations in the WHERE clause, so I'm
not sure what could be done overall to improve the response time.



--
Bill Moran
Collaborative Fusion Inc.


Phone: 412-422-3463x4023

****************************************************************
IMPORTANT: This message contains confidential information and is
intended only for the individual named. If the reader of this
message is not an intended recipient (or the individual
responsible for the delivery of this message to an intended
recipient), please be advised that any re-use, dissemination,
distribution or copying of this message is prohibited. Please
notify the sender immediately by e-mail if you have received
this e-mail by mistake and delete this e-mail from your system.
E-mail transmission cannot be guaranteed to be secure or
error-free as information could be intercepted, corrupted, lost,
destroyed, arrive late or incomplete, or contain viruses. The
sender therefore does not accept liability for any errors or
omissions in the contents of this message, which arise as a
result of e-mail transmission.
****************************************************************

From:
Mario Weilguni
Date:

On Thursday, 22 March 2007 12:30, <> wrote:
> approximated count?????
>
> why? who would need it? where you can use it?
>
> calculating costs and deciding how to execute a query needs an
> approximate count, but it's totally worthless information for any user
> IMO.

No, it is not useless. Try:
http://www.google.com/search?hl=de&q=test&btnG=Google-Suche&meta=

Do you really think Google counted each of those individual 895 million
results?  It didn't.  In fact, Google's estimate can be off by an order
of magnitude, and still nobody complains...


From:
"Merlin Moncure"
Date:

On 3/22/07, Andreas Tille <> wrote:
> I just try to find out why a simple count(*) might last that long.
> At first I tried explain, which rather quickly knows how many rows
> to check, but the final count is two orders of magnitude slower.

You can get the approximate count by selecting reltuples from
pg_class.  It is valid as of last analyze.
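For instance, against the table from the original post (reltuples is a planner estimate maintained by VACUUM/ANALYZE, so treat it as approximate):

```sql
-- Cheap catalog lookup instead of a full scan; accuracy depends on
-- how recently the table was vacuumed/analyzed.
SELECT reltuples FROM pg_class WHERE relname = 'agiraw';
```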

As others have suggested, SELECT count(*) FROM table is a very special
case which non-MVCC databases can optimize for.  There are many reasons
why this is so, and why it says nothing about the relative performance
of the two databases.  This is probably the #1 most frequently asked
question on -performance; there is a wealth of information in the
archives.

merlin

From:
Michael Stone
Date:

On Thu, Mar 22, 2007 at 01:30:35PM +0200,  wrote:
>approximated count?????
>
>why? who would need it? where you can use it?

Do a google query. Look at the top of the page, where it says
"results N to M of about O". For user interfaces (which is where a lot
of this count(*) stuff comes from) you quite likely don't care about the
exact count, because the user doesn't really care about the exact count.

IIRC, that's basically what you get with the mysql count anyway, since
there are corner cases for results in a transaction. Avoiding those
cases is why the postgres count takes so long; sometimes that's what's
desired and sometimes it is not.

Mike Stone

From:
"Jonah H. Harris"
Date:

On 3/22/07, Merlin Moncure <> wrote:
> As others suggest select count(*) from table is very special case
> which non-mvcc databases can optimize for.

Well, other MVCC databases still do it faster than we do.  However, I
think we'll be able to use the dead space map to speed this up a bit,
won't we?

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 3rd Floor            | 
Iselin, New Jersey 08830            | http://www.enterprisedb.com/

From:
Mario Weilguni
Date:

On Thursday, 22 March 2007 15:33, Jonah H. Harris wrote:
> On 3/22/07, Merlin Moncure <> wrote:
> > As others suggest select count(*) from table is very special case
> > which non-mvcc databases can optimize for.
>
> Well, other MVCC database still do it faster than we do.  However, I
> think we'll be able to use the dead space map for speeding this up a
> bit wouldn't we?

Which MVCC DB do you mean? Just curious...

From:
Michael Fuhr
Date:

On Thu, Mar 22, 2007 at 01:29:46PM +0100, Andreas Kostyrka wrote:
> * Andreas Tille <> [070322 13:24]:
> > Well, to be honest, I'm not really interested in the performance of
> > count(*).  I was just discussing general performance issues on the
> > phone, and when my colleague asked me about the size of the database
> > he wondered why this takes so long when his MS SQL server does the
> > same job much faster.  So in principle I was just asking a first
> > question that was easy to ask.  Perhaps I'll come up with more
> > difficult optimisation questions later.
>
> Simple. MSSQL is optimized for this case, and uses "older"
> datastructures. PG uses a MVCC storage, which is not optimized for
> this usecase. It's quite fast for different kinds of queries.

Ask about performing concurrent selects, inserts, updates, and
deletes in SQL Server and about the implications on ACID of locking
hints such as NOLOCK.  Then consider how MVCC handles concurrency
without blocking or the need for dirty reads.

--
Michael Fuhr

From:
Michael Stone
Date:

On Thu, Mar 22, 2007 at 09:39:18AM -0400, Merlin Moncure wrote:
>You can get the approximate count by selecting reltuples from
>pg_class.  It is valid as of last analyze.

Of course, that only works if you're not using any WHERE clause.
Here's a (somewhat ugly) example of getting an approximate count based
off the statistics info, which will work for more complicated queries:
http://archives.postgresql.org/pgsql-sql/2005-08/msg00046.php
The ugliness is that you have to provide the whole query as a
parameter to the function, instead of using it as a drop-in replacement
for count. I assume that the TODO item is to provide the latter, but for
now this method can be useful for UI type stuff where you just want to
know whether there's "a little" or "a lot".
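The approach in that archive link looks roughly like this: a sketch, not the exact posted code (it assumes PostgreSQL 8.0+ for dollar quoting, and the table/column in the usage line are made up). It runs EXPLAIN on the supplied query text and pulls the planner's top-level row estimate out of the plan:

```sql
-- Run EXPLAIN on the supplied query and extract the planner's
-- top-level rows= estimate instead of counting for real.
CREATE FUNCTION count_estimate(query text) RETURNS integer AS $$
DECLARE
    rec  record;
    rows integer;
BEGIN
    FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
        -- the first plan line carrying rows=N is the top node
        rows := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
        EXIT WHEN rows IS NOT NULL;
    END LOOP;
    RETURN rows;
END;
$$ LANGUAGE plpgsql;

-- SELECT count_estimate('SELECT * FROM agiraw WHERE somecol = 42');
```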

Mike Stone

From:
"Luke Lonergan"
Date:

Andreas,

On 3/22/07 4:48 AM, "Andreas Tille" <> wrote:

> Well, to be honest, I'm not really interested in the performance of
> count(*).  I was just discussing general performance issues on the
> phone, and when my colleague asked me about the size of the database
> he wondered why this takes so long when his MS SQL server does the
> same job much faster.  So in principle I was just asking a first
> question that was easy to ask.  Perhaps I'll come up with more
> difficult optimisation questions later.

This may be the clue you needed - in Postgres SELECT COUNT(*) is an
approximate way to measure the speed of your disk setup (up to about
1,200MB/s).  Given that you are having performance problems, it may be that
your disk layout is either:
- slow by design
- malfunctioning

If this is the case, then any of your queries that require a full table scan
will be affected.

You should check your sequential disk performance using the following:

time bash -c "dd if=/dev/zero of=/your_file_system/bigfile bs=8k \
  count=$((your_memory_size_in_KB * 2 / 8)) && sync"
time dd if=/your_file_system/bigfile of=/dev/null bs=8k

Report those times here and we can help you with it.

- Luke



From:
mark@mark.mielke.cc
Date:

On Thu, Mar 22, 2007 at 10:18:10AM -0400, Michael Stone wrote:
> IIRC, that's basically what you get with the mysql count anyway, since
> there are corner cases for results in a transaction. Avoiding those
> cases is why the postgres count takes so long; sometimes that's what's
> desired and sometimes it is not.

Adding to this point:

In any production system, the count presented to the user is usually
wrong very shortly after it is displayed anyways. Transactions in the
background or from other users are adding or removing items, perhaps
even before the count reaches the user's display.

The idea of transaction-safety for counts doesn't apply in this case.
Both the transaction and the number are complete before the value is
displayed.

In my own systems, I rarely use count(*) for anything except
user-visible results.  For the PostgreSQL system I use, I keep a table
of counts, and lock the row for update when adding or removing items.
This turns out to be best in this system anyway, as I need my new
rows to be ordered, and locking the 'count' row lets me assign a
new sequence number for the row.  (I don't want to use SEQUENCE
objects, as the rows are [key, sequence, data], with thousands
or more keys.)
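A minimal sketch of that bookkeeping scheme (table and column names are invented; an items table with a [key, seq, data] layout is assumed to exist):

```sql
-- One counter row per key.  SELECT ... FOR UPDATE serializes writers
-- on the same key; the counter doubles as the next sequence number,
-- and reading a count becomes a single-row lookup.
CREATE TABLE item_counts (
    key text PRIMARY KEY,
    n   bigint NOT NULL DEFAULT 0
);

BEGIN;
SELECT n FROM item_counts WHERE key = 'foo' FOR UPDATE;  -- lock this key
UPDATE item_counts SET n = n + 1 WHERE key = 'foo';
INSERT INTO items (key, seq, data)
    SELECT 'foo', n, 'payload' FROM item_counts WHERE key = 'foo';
COMMIT;

-- The user-visible count is then:
--   SELECT n FROM item_counts WHERE key = 'foo';
```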

Cheers,
mark

--
 /  /      __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   |
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


From:
Andreas Kostyrka
Date:

* Mario Weilguni <> [070322 15:59]:
> On Thursday, 22 March 2007 15:33, Jonah H. Harris wrote:
> > On 3/22/07, Merlin Moncure <> wrote:
> > > As others suggest select count(*) from table is very special case
> > > which non-mvcc databases can optimize for.
> >
> > Well, other MVCC database still do it faster than we do.  However, I
> > think we'll be able to use the dead space map for speeding this up a
> > bit wouldn't we?
>
> Which MVCC DB do you mean? Just curious...
Well, mysql claims InnoDB to be mvcc ;)

Andreas

From:
"Craig A. James"
Date:

Michael Stone wrote:
> On Thu, Mar 22, 2007 at 01:30:35PM +0200,  wrote:
>> approximated count?????
>>
>> why? who would need it? where you can use it?
>
> Do a google query. Look at the top of the page, where it says "results N
> to M of about O". For user interfaces (which is where a lot of this
> count(*) stuff comes from) you quite likely don't care about the exact
> count...

Right on, Michael.

One of our biggest single problems is this very thing.  It's not a Postgres problem specifically, but more embedded in
the idea of a relational database: there are no "job status" or "rough estimate of results" or "give me part of the
answer" features that are critical to many real applications.

In our case (for a variety of reasons, but this one is critical), we actually can't use Postgres indexing at all -- we
wrote an entirely separate indexing system for our data, one that has the following properties:

  1. It can give out "pages" of information (i.e. "rows 50-60") without
     rescanning the skipped pages the way "limit/offset" would.
  2. It can give accurate estimates of the total rows that will be returned.
  3. It can accurately estimate the time it will take.

For our primary business-critical data, Postgres is merely a storage system, not a search system, because we have to do
the "heavy lifting" in our own code.  (To be fair, there is no relational database that can handle our data.)

Many or most web-based search engines face these exact problems.

Craig

From:
Tino Wildenhain
Date:

Craig A. James wrote:
...
> In our case (for a variety of reasons, but this one is critical), we
> actually can't use Postgres indexing at all -- we wrote an entirely
> separate indexing system for our data, one that has the following
> properties:
>
>  1. It can give out "pages" of information (i.e. "rows 50-60") without
>     rescanning the skipped pages the way "limit/offset" would.
>  2. It can give accurate estimates of the total rows that will be returned.
>  3. It can accurately estimate the time it will take.
>

That's certainly not entirely correct. There is no need to store or
maintain this information alongside Postgres when you can store
and maintain it directly in Postgres as well. When you have some
outside application, I think I can safely assume you are doing
fewer updates than reads, for this to actually pay off.

So why not store this information in separate "index" and "statistic"
tables? You would just have to join with your real data for retrieval.

On top of that, Postgres has a very flexible and extensible index
system. This would mean you save on database round trips and on
storing the information twice (and the sync problems you certainly
get from it).

Regards
Tino


From:
Carlos Moreno
Date:

>>count(*).  I was just discussing general performance issues on the
>>phone line and when my colleague asked me about the size of the
>>database he just wonderd why this takes so long for a job his
>>MS-SQL server is much faster.  [...].
>>
>>
>
>Simple. MSSQL is optimized for this case, and uses "older"
>datastructures. PG uses a MVCC storage,
>

Which version of MSSQL?  Wikipedia says that SQL Server 2005 uses the
MVCC model.

Carlos
--


From:
Brian Hurt
Date:

Craig A. James wrote:

>
> One of our biggest single problems is this very thing.  It's not a
> Postgres problem specifically, but more embedded in the idea of a
> relational database: There are no "job status" or "rough estimate of
> results" or "give me part of the answer" features that are critical to
> many real applications.
>
For the "give me part of the answer", I'm wondering if cursors wouldn't
work (and if not, why not)?

Brian


From:
Mario Weilguni
Date:

On Thursday, 22 March 2007 16:17, Andreas Kostyrka wrote:
> * Mario Weilguni <> [070322 15:59]:
> > On Thursday, 22 March 2007 15:33, Jonah H. Harris wrote:
> > > On 3/22/07, Merlin Moncure <> wrote:
> > > > As others suggest select count(*) from table is very special case
> > > > which non-mvcc databases can optimize for.
> > >
> > > Well, other MVCC database still do it faster than we do.  However, I
> > > think we'll be able to use the dead space map for speeding this up a
> > > bit wouldn't we?
> >
> > Which MVCC DB do you mean? Just curious...
>
> Well, mysql claims InnoDB to be mvcc ;)

OK, but last time I tried, count(*) on InnoDB tables took roughly(*) the
same time -- because InnoDB has the same problem as Postgres and has to
do a seqscan too (I think it's mentioned somewhere in their docs).

(*) in fact, Postgres was faster, but the values were comparable: 40 seconds
vs. 48 seconds.

Maybe InnoDB has made some progress here; I tested it with MySQL 5.0.18.


From:
"Craig A. James"
Date:

Brian Hurt wrote:
>> One of our biggest single problems is this very thing.  It's not a
>> Postgres problem specifically, but more embedded in the idea of a
>> relational database: There are no "job status" or "rough estimate of
>> results" or "give me part of the answer" features that are critical to
>> many real applications.
>>
> For the "give me part of the answer", I'm wondering if cursors wouldn't
> work (and if not, why not)?

There is no mechanism in Postgres (or any RDB that I know of) to say, "Give me rows 1000 through 1010", that doesn't also execute the query on rows 1-1000.  In other words, the RDBMS does the work for 1010 rows, when only 10 are needed -- 100 times more work than is necessary.

Limit/Offset will return the correct 10 rows, but at the cost of doing the previous 1000 rows and discarding them.

Web applications are stateless.  To use a cursor, you'd have to keep it around for hours or days, and create complex "server affinity" code to direct a user back to the same server of your server farm (where that cursor is being held), on the chance that the user will come back and ask for rows 1000 through 1010.  A cursor isn't up to that task.

Craig

From:
"Craig A. James"
Date:

Tino Wildenhain wrote:
> Craig A. James wrote:
> ...
>> In our case (for a variety of reasons, but this one is critical), we
>> actually can't use Postgres indexing at all -- we wrote an entirely
>> separate indexing system for our data...
>
> ...There is no need to store or
> maintain this information along with postgres when you can store
> and maintain it directly in postgres as well.

Whether we store our data inside or outside Postgres misses the point (in fact, most of our data is stored IN Postgres).  It's the code that actually performs the index operation that has to be external to Postgres.

> On top of that, postgres has a very flexible and extensible index
> system.

You guys can correct me if I'm wrong, but the key feature that's missing from Postgres's flexible indexing is the ability to maintain state across queries.  Something like this:

  select a, b, my_index_state() from foo where ...
    offset 100 limit 10 using my_index(prev_my_index_state);

The my_index_state() function would issue something like a "cookie", an opaque text or binary object that would record information about how it got from row 1 through row 99.  When you issue the query above, it could start looking for row 100 WITHOUT reexamining rows 1-99.

This could be tricky in an OLTP environment, where the "cookie" could be invalidated by changes to the database.  But in warehouse read-mostly or read-only environments, it could yield vastly improved performance for database web applications.

If I'm not mistaken, neither Postgres nor Oracle, MySQL, or any other RDBMS can do this.  I would love to be corrected.

The problem is that relational databases were invented before the web and its stateless applications.  In the "good old days", you could connect to a database and work for hours, and in that environment cursors and such work well -- the RDBMS maintains the internal state of the indexing system.  But in a web environment, state information is very difficult to maintain.  There are all sorts of systems that try (Enterprise Java Beans, for example), but they're very complex.

Craig


From:
Tino Wildenhain
Date:

Craig A. James wrote:
> Tino Wildenhain wrote:
>> Craig A. James wrote:
>> ...
>>> In our case (for a variety of reasons, but this one is critical), we
>>> actually can't use Postgres indexing at all -- we wrote an entirely
>>> separate indexing system for our data...
>>
>> ...There is no need to store or
>> maintain this information along with postgres when you can store
>> and maintain it directly in postgres as well.
>
> Whether we store our data inside or outside Postgres misses the point
> (in fact, most of our data is stored IN Postgres).  It's the code that
> actually performs the index operation that has to be external to Postgres.
>
>> On top of that, postgres has a very flexible and extensible index
>> system.
>
> You guys can correct me if I'm wrong, but the key feature that's missing
> from Postgres's flexible indexing is the ability to maintain state
> across queries.  Something like this:
>
>  select a, b, my_index_state() from foo where ...
>    offset 100 limit 10 using my_index(prev_my_index_state);
>

Yes, you are wrong :-) The technique is called a "CURSOR",
if you maintain a persistent connection per session
(e.g. a stand-alone application or a clever pooling web application).

If it's a naive web application, you just store your session
in tables where you can easily maintain the scroll state
as well.

Regards
Tino
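Sketched out, the cursor technique looks roughly like this (table and cursor names are illustrative; note that MOVE avoids shipping the skipped rows to the client, although the executor still walks through them, and that a cursor declared without WITH HOLD disappears at the end of its transaction):

```sql
BEGIN;
DECLARE page_cur CURSOR FOR
    SELECT a, b FROM foo ORDER BY a;
MOVE FORWARD 100 IN page_cur;   -- skip rows 1-100 without returning them
FETCH 10 FROM page_cur;         -- returns rows 101-110
COMMIT;                         -- page_cur is gone now unless declared WITH HOLD
```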

From:
Michael Stone
Date:

On Thu, Mar 22, 2007 at 06:27:32PM +0100, Tino Wildenhain wrote:
>Craig A. James wrote:
>>You guys can correct me if I'm wrong, but the key feature that's missing
>>from Postgres's flexible indexing is the ability to maintain state
>>across queries.  Something like this:
>>
>>  select a, b, my_index_state() from foo where ...
>>    offset 100 limit 10 using my_index(prev_my_index_state);
>>
>
>Yes, you are wrong :-) The technique is called "CURSOR"
>if you maintain persistent connection per session
>(e.g. a stand-alone application or a clever pooling web application)

Did you read the email before correcting it? From the part you trimmed
out:

>The problem is that relational databases were invented before the web
>and its stateless applications.  In the "good old days", you could
>connect to a database and work for hours, and in that environment
>cursors and such work well -- the RDBMS maintains the internal state of
>the indexing system.  But in a web environment, state information is
>very difficult to maintain.  There are all sorts of systems that try
>(Enterprise Java Beans, for example), but they're very complex.

It sounds like they wrote their own middleware to handle the problem,
which is basically what you suggested (a "clever pooling web
application") after saying "wrong".

Mike Stone

From:
"Craig A. James"
Date:

Tino Wildenhain wrote:
>> You guys can correct me if I'm wrong, but the key feature that's
>> missing from Postgres's flexible indexing is the ability to maintain
>> state across queries.  Something like this:
>>
>>  select a, b, my_index_state() from foo where ...
>>    offset 100 limit 10 using my_index(prev_my_index_state);
>>
>
> Yes, you are wrong :-) The technique is called "CURSOR"
> if you maintain persistent connection per session
> (e.g. a stand-alone application or a clever pooling web application)

That's my whole point: If relational databases had a simple mechanism for storing their internal state in an external application, the need for cursors, connection pools, and all those other tricks would be eliminated.

As I said earlier, relational technology was invented in an earlier era, and hasn't caught up with the reality of modern web apps.

> If it's a naive web application you just store your session
> in tables where you can easily maintain the scroll state
> as well.

One thing I've learned in 25 years of software development is that people who use my software have problems I never imagined.  I've been the one who was naive when I said similar things about my customers, and was later embarrassed to learn that their problems were more complex than I ever imagined.

Craig

From:
Steve Atkins
Date:

On Mar 22, 2007, at 10:21 AM, Craig A. James wrote:

> Tino Wildenhain wrote:
>> Craig A. James wrote:
>> ...
>>> In our case (for a variety of reasons, but this one is critical),
>>> we actually can't use Postgres indexing at all -- we wrote an
>>> entirely separate indexing system for our data...
>>
>> ...There is no need to store or
>> maintain this information along with postgres when you can store
>> and maintain it directly in postgres as well.
>
> Whether we store our data inside or outside Postgres misses the
> point (in fact, most of our data is stored IN Postgres).  It's the
> code that actually performs the index operation that has to be
> external to Postgres.
>
>> On top of that, postgres has a very flexible and extensible index
>> system.
>
> You guys can correct me if I'm wrong, but the key feature that's
> missing from Postgres's flexible indexing is the ability to
> maintain state across queries.  Something like this:
>
>  select a, b, my_index_state() from foo where ...
>    offset 100 limit 10 using my_index(prev_my_index_state);
>
> The my_index_state() function would issue something like a
> "cookie", an opaque text or binary object that would record
> information about how it got from row 1 through row 99.  When you
> issue the query above, it could start looking for row 100 WITHOUT
> reexamining rows 1-99.
>
> This could be tricky in a OLTP environment, where the "cookie"
> could be invalidated by changes to the database.  But in warehouse
> read-mostly or read-only environments, it could yield vastly
> improved performance for database web applications.
>
> If I'm not mistaken, Postgres (nor Oracle, MySQL or other RDBMS)
> can't do this.  I would love to be corrected.

As long as you're ordering by some row in the table then you can do
that in
straight SQL.

select a, b, ts from foo where (stuff) and foo > X order by foo limit 10

Then, record the last value of foo you read, and plug it in as X the
next
time around.

This has the advantage over a simple offset approach of actually
displaying all the data as a user pages through it too. (Consider
the case where the user is viewing offsets 91-100, and you delete
the record at offset 15. The user goes to the next page and will
miss the record that used to be at offset 101 and is now at offset
100).

> The problem is that relational databases were invented before the
> web and its stateless applications.  In the "good old days", you
> could connect to a database and work for hours, and in that
> environment cursors and such work well -- the RDBMS maintains the
> internal state of the indexing system.  But in a web environment,
> state information is very difficult to maintain.  There are all
> sorts of systems that try (Enterprise Java Beans, for example), but
> they're very complex.

I think the problem is more that most web developers aren't very good
at using the database, and tend to fall back on simplistic, wrong,
approaches
to displaying the data. There's a lot of monkey-see, monkey-do in web
UI design too, which doesn't help.

Cheers,
   Steve


From:
"Craig A. James"
Date:

Steve Atkins wrote:
> As long as you're ordering by some row in the table then you can do that in
> straight SQL.
>
> select a, b, ts from foo where (stuff) and foo > X order by foo limit 10
>
> Then, record the last value of foo you read, and plug it in as X the next
> time around.

We've been over this before in this forum: It doesn't work as advertised.  Look for postings by me regarding the fact that there is no way to tell the optimizer the cost of executing a function.  There's one, for example, on Oct 18, 2006.

> I think the problem is more that most web developers aren't very good
> at using the database, and tend to fall back on simplistic, wrong,
> approaches
> to displaying the data. There's a lot of monkey-see, monkey-do in web
> UI design too, which doesn't help.

Thanks, I'm sure your thoughtful comments will help me solve my problem.  Somehow. ;-)

Craig

From:
"Merlin Moncure"
Date:

On 3/22/07, Michael Stone <> wrote:
> On Thu, Mar 22, 2007 at 06:27:32PM +0100, Tino Wildenhain wrote:
> >Craig A. James wrote:
> >>You guys can correct me if I'm wrong, but the key feature that's missing
> >>from Postgres's flexible indexing is the ability to maintain state
> >>across queries.  Something like this:
> >>
> >>  select a, b, my_index_state() from foo where ...
> >>    offset 100 limit 10 using my_index(prev_my_index_state);
> >>
> >
> >Yes, you are wrong :-) The technique is called "CURSOR"
> >if you maintain persistent connection per session
> >(e.g. a stand-alone application or a clever pooling web application)
>
> Did you read the email before correcting it? From the part you trimmed
> out:
>
> >The problem is that relational databases were invented before the web
> >and its stateless applications.  In the "good old days", you could
> >connect to a database and work for hours, and in that environment
> >cursors and such work well -- the RDBMS maintains the internal state of
> >the indexing system.  But in a web environment, state information is
> >very difficult to maintain.  There are all sorts of systems that try
> >(Enterprise Java Beans, for example), but they're very complex.
>
> It sounds like they wrote their own middleware to handle the problem,
> which is basically what you suggested (a "clever pooling web
> application") after saying "wrong".

Tino was saying that rather than build a complete indexing storage
management solution that lives outside the database, it is better to
do intelligent session management, so that you get the simplicity of a
two-tier client/server system but the scalability of a web app.

Web apps are not necessarily stateless, you just have to be a little
clever about how database connections are opened and closed.  Then you
get all the database stuff that comes along with a persistent
connection (advisory locks, cursors, prepared statements, etc) without
building all kinds of data management into the middleware.

merlin

From:
Guido Neitzer
Date:

On 22.03.2007, at 11:53, Steve Atkins wrote:

> As long as you're ordering by some row in the table then you can do
> that in
> straight SQL.
>
> select a, b, ts from foo where (stuff) and foo > X order by foo
> limit 10
>
> Then, record the last value of foo you read, and plug it in as X
> the next
> time around.

This only works if you have unique values in foo. You might have
"batch breaks" inside a list of rows with equal values for foo.

But: a web application that needs state and doesn't maintain it by
itself (or inside the dev toolkit) is imho broken by design. How
should the database store a "state" for a web app? It's only possible
on the web app side, because either the app is stateless, and so are
its queries to the database - they have to be re-evaluated for every
request, as the request might come from totally different sources
(users, ...) - or it is stateful and has to maintain the state itself,
because only the app developer knows what information is needed for
the "current state".

This is why all web application toolkits have a "session" concept.

> I think the problem is more that most web developers aren't very good
> at using the database, and tend to fall back on simplistic, wrong,
> approaches
> to displaying the data. There's a lot of monkey-see, monkey-do in web
> UI design too, which doesn't help.

Sure. That is the other problem ... ;-) But, and I think this is much
more important: most toolkits today free you from using the database
directly and from writing lots and lots of lines of SQL code that
instantly break when you switch the storage backend. It just depends
on where you look at it from.

cug

From:
Steve Atkins
Date:

On Mar 22, 2007, at 11:26 AM, Guido Neitzer wrote:

> On 22.03.2007, at 11:53, Steve Atkins wrote:
>
>> As long as you're ordering by some row in the table then you can
>> do that in
>> straight SQL.
>>
>> select a, b, ts from foo where (stuff) and foo > X order by foo
>> limit 10
>>
>> Then, record the last value of foo you read, and plug it in as X
>> the next
>> time around.
>
> This only works if you have unique values in foo. You might
> have "batch breaks" inside a list of rows with equal values for foo.

If I don't have unique values in foo, I certainly have unique values
in (foo, pk).
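Spelled out, that tiebreaker variant might look like the following (assuming a hypothetical table t with a non-unique column foo and primary key pk; the row-value comparison keeps the paging consistent even across duplicate foo values):

```sql
SELECT a, b, foo, pk
FROM t
WHERE (foo, pk) > (:last_foo, :last_pk)  -- values remembered from the previous page
ORDER BY foo, pk
LIMIT 10;
```

An index on (foo, pk) lets this start at the right place rather than rescanning from the top.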

>
> But: a web application that needs state and doesn't maintain it by
> itself (or inside the dev toolkit) is imho broken by design. How
> should the database store a "state" for a web app? It's only
> possible on the web app part, because the app is either stateless
> and so are the queries to the database - they have to be re-
> evaluated for every request as the request might come from totally
> different sources (users, ...) or it is stateful and has to
> maintain the state because only the app developer knows, what
> information is needed for the "current state".
>
> This is why all web application toolkits have a "session" concept.

Yes. HTTP is not very stateful. Web applications are stateful. There
are some really obvious approaches to maintaining state cleanly that
work well with databases and let you do some quite complex stuff
(tying a persistent database connection to a single user, for
instance). But they don't scale at all well.

What Craig was suggesting is, basically, to assign a persistent
database connection to each user. But rather than maintain that
connection as a running process, to serialise all the state out of
the database connection and store that in the webapp, then when the
next action from that user comes in take a database connection and
stuff all that state into it again.

It's a lovely idea, but strikes me as completely infeasible in the
general case. There's just too much state there. Doing it in the
specific case is certainly possible, but rapidly devolves to the
standard approach of "On the first page of results, run the query and
record the first 5000 results. Store those in a scratch table,
indexed by session-id, or in external storage. On displaying later
pages of results to the same user, pull directly from the already
calculated results."

>
>> I think the problem is more that most web developers aren't very good
>> at using the database, and tend to fall back on simplistic, wrong,
>> approaches
>> to displaying the data. There's a lot of monkey-see, monkey-do in web
>> UI design too, which doesn't help.
>
> Sure. That is the other problem ... ;-) But, and I think this is
> much more important: most toolkits today free you from using the
> database directly and writing lots and lots of lines of sql code
> which instantly breaks when you switch the storage backend. It's
> just the thing from where you look at something.

The real problem is the user-interface is designed around what is
easy to implement in elderly cgi scripts, rather than what's
appropriate to the data being displayed or useful to the user.
Displaying tables of results, ten at a time, is just one of the more
egregious examples of that.

Cheers,
   Steve


From:
Tom Lane
Date:

"Craig A. James" <> writes:
> Steve Atkins wrote:
>> As long as you're ordering by some row in the table then you can do that in
>> straight SQL.
>>
>> select a, b, ts from foo where (stuff) and foo > X order by foo limit 10
>>
>> Then, record the last value of foo you read, and plug it in as X the next
>> time around.

> We've been over this before in this forum: It doesn't work as advertised.  Look for postings by me regarding the fact
> that there is no way to tell the optimizer the cost of executing a function.  There's one, for example, on Oct 18, 2006.

You mean
http://archives.postgresql.org/pgsql-performance/2006-10/msg00283.php
?  I don't see anything there that bears on Steve's suggestion.
(The complaint is obsolete as of CVS HEAD anyway.)

            regards, tom lane

From:
Michael Stone
Date:

On Thu, Mar 22, 2007 at 02:24:39PM -0400, Merlin Moncure wrote:
>Tino was saying that rather that build a complete indexing storage
>management solution that lives outside the database, it is better to
>do intelligent session management so that you get the simplicity of a
>two tier client server system but the scalability of a web app.

No, what he was saying was "there's this thing called a cursor". I
thought there was enough information in the original message to indicate
that the author knew about cursors. There are certainly pros and cons
and nuances to different approaches, but Tino's message didn't touch on
anything that specific.

And even if you do use some kind of "intelligent session management",
how many simultaneous cursors can postgres sanely keep track of?
Thousands? Millions? Tens of Millions? I suspect there's a scalability
limit in there somewhere. Luckily I don't spend much time in the web
application space, so I don't need to know.  :)

Mike Stone
From:
"Craig A. James"
Date:

Tom Lane wrote:
> "Craig A. James" <> writes:
>> Steve Atkins wrote:
>>> As long as you're ordering by some row in the table then you can do that in
>>> straight SQL.
>>>
>>> select a, b, ts from foo where (stuff) and foo > X order by foo limit 10
>>>
>>> Then, record the last value of foo you read, and plug it in as X the next
>>> time around.
>
>> We've been over this before in this forum: It doesn't work as advertised.
>> Look for postings by me regarding the fact that there is no way to tell
>> the optimizer the cost of executing a function.  There's one, for example,
>> on Oct 18, 2006.
>
> You mean
> http://archives.postgresql.org/pgsql-performance/2006-10/msg00283.php
> ?  I don't see anything there that bears on Steve's suggestion.
> (The complaint is obsolete as of CVS HEAD anyway.)

Mea culpa, it's October 8, not October 18:

   http://archives.postgresql.org/pgsql-performance/2006-10/msg00143.php

The relevant part is this:

"My example, discussed previously in this forum, is a classic.  I have a VERY expensive function (it's in the class of NP-complete problems, so there is no faster way to do it).  There is no circumstance when my function should be used as a filter, and no circumstance when it should be done before a join.  But PG has no way of knowing the cost of a function, and so the optimizer assigns the same cost to every function.  Big disaster.

"The result?  I can't use my function in any WHERE clause that involves any other conditions or joins.  Only by itself. PG will occasionally decide to use my function as a filter instead of doing the join or the other WHERE conditions first, and I'm dead.

"The interesting thing is that PG works pretty well for me on big tables -- it does the join first, then applies my expensive functions.  But with a SMALL (like 50K rows) table, it applies my function first, then does the join.  A search that completes in 1 second on a 5,000,000 row database can take a minute or more on a 50,000 row database."

Craig

From:
Tom Lane
Date:

"Craig A. James" <> writes:
> Tom Lane wrote:
>> You mean
>> http://archives.postgresql.org/pgsql-performance/2006-10/msg00283.php
>> ?  I don't see anything there that bears on Steve's suggestion.

> Mea culpa, it's October 8, not October 18:
>    http://archives.postgresql.org/pgsql-performance/2006-10/msg00143.php

I still don't see the relevance to Steve's suggestion.

            regards, tom lane

From:
Tino Wildenhain
Date:

Michael Stone wrote:
> On Thu, Mar 22, 2007 at 02:24:39PM -0400, Merlin Moncure wrote:
>> Tino was saying that rather that build a complete indexing storage
>> management solution that lives outside the database, it is better to
>> do intelligent session management so that you get the simplicity of a
>> two tier client server system but the scalability of a web app.
>
> No, what he was saying was "there's this thing called a cursor". I
> thought there was enough information in the original message to indicate
> that the author knew about cursors. There are certainly pros and cons
> and nuances to different approaches, but Tino's message didn't touch on
> anything that specific.

Sure, the message thread sometimes loses history, so I wasn't 100% sure
what the framework really is -- although I assumed it could be a web
solution. With stand-alone applications you usually have a limited
number of users connecting, and they stay connected during the session,
so you can easily use cursors there.

> And even if you do use some kind of "intelligent session management",
> how many simultaneous cursors can postgres sanely keep track of?
> Thousands? Millions? Tens of Millions? I suspect there's a scalability
> limit in there somewhere. Luckily I don't spend much time in the web
> application space, so I don't need to know.  :)

Depending on the application, you can even simulate the above situation
with a web framework if you manage sessions in the web framework,
with persistent connections for a limited number of users working
at the same time (certainly not feasible for a public web shop, but
fine for in-house data management systems). In this case, cursors
would be perfect too.

In any other case I fail to see the advantage of storing "index
data" outside the database, with all the round-tripping involved.

If the query is complex and rerunning it for every batch is
expensive, fetching the whole result into the application, in
case users really traverse the complete batch (how often is that
really done? I mean, who browses to the end of a huge result
set?), is costly as well, without much benefit.

It would be much easier and cleaner, imho, in this case
to fetch the data into a scratch table linked to the
session and batch.

If the query is fast, or you can prepare an indexed batch
helper table, you can just select the portion of the result
equivalent to the batch.

You don't need extensive session management in the
web application to scroll through result sets this
way. This can all be encoded in forms or links.
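A sketch of such a session-and-batch-linked scratch table (all names are illustrative; row_number() assumes a server version with window functions, otherwise a temporary sequence does the same job):

```sql
CREATE TABLE batch_cache (
    session_id text    NOT NULL,
    row_num    integer NOT NULL,
    a          integer,
    b          text,
    PRIMARY KEY (session_id, row_num)
);

-- Run the expensive query once per session, numbering the rows:
INSERT INTO batch_cache (session_id, row_num, a, b)
SELECT 'session42', row_number() OVER (ORDER BY foo), a, b
FROM foo;

-- Every later page is then a cheap indexed range scan,
-- addressable from a plain form or link parameter:
SELECT a, b
FROM batch_cache
WHERE session_id = 'session42' AND row_num BETWEEN 101 AND 110
ORDER BY row_num;
```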

Regards
Tino



From:
Tino Wildenhain
Date:

Craig A. James wrote:
> Tino Wildenhain wrote:
>>> You guys can correct me if I'm wrong, but the key feature that's
>>> missing from Postgres's flexible indexing is the ability to maintain
>>> state across queries.  Something like this:
>>>
>>>  select a, b, my_index_state() from foo where ...
>>>    offset 100 limit 10 using my_index(prev_my_index_state);
>>>
>>
>> Yes, you are wrong :-) The technique is called "CURSOR"
>> if you maintain persistent connection per session
>> (e.g. a stand-alone application or a clever pooling web application)
>
> That's my whole point: If relational databases had a simple mechanism
> for storing their internal state in an external application, the need
> for cursors, connection pools, and all those other tricks would be
> eliminated.

Well, the cursor is exactly the simple handle to the internal
state of the relational DB that you are looking for.
Do you really think transferring the whole query tree, open
indexes, and data files to the client over the network would
really improve the situation?

> As I said earlier, relational technology was invented in an earlier era,
> and hasn't caught up with the reality of modern web apps.

There is nothing modern about today's web apps.

>> If it's a naive web application you just store your session
>> in tables where you can easily maintain the scroll state
>> as well.
>
> One thing I've learned in 25 years of software development is that
> people who use my software have problems I never imagined.  I've been
> the one who was naive when I said similar things about my customers, and
> was later embarrassed to learn that their problems were more complex
> than I ever imagined.

Sure, what the best solution looks like really depends on the
application, but I'm quite certain that counterfeiting internals
of the underlying relational database in the application creates
more problems than it solves. If you can't handle SQL, don't
use SQL; you can build web applications without any relational
database if you want to.

Regards
Tino Wildenhain

From:
Tino Wildenhain
Date:

Michael Stone schrieb:
> On Thu, Mar 22, 2007 at 06:27:32PM +0100, Tino Wildenhain wrote:
>> Craig A. James schrieb:
>>> You guys can correct me if I'm wrong, but the key feature that's
>>> missing from Postgres's flexible indexing is the ability to maintain
>>> state across queries.  Something like this:
>>>
>>>  select a, b, my_index_state() from foo where ...
>>>    offset 100 limit 10 using my_index(prev_my_index_state);
>>>
>>
>> Yes, you are wrong :-) The technique is called "CURSOR",
>> if you maintain a persistent connection per session
>> (e.g. a stand-alone application or a clever pooling web application)
>
> Did you read the email before correcting it? From the part you trimmed out:
>
>> The problem is that relational databases were invented before the web
>> and its stateless applications.  In the "good old days", you could
>> connect to a database and work for hours, and in that environment
>> cursors and such work well -- the RDBMS maintains the internal state
>> of the indexing system.  But in a web environment, state information
>> is very difficult to maintain.  There are all sorts of systems that
>> try (Enterprise Java Beans, for example), but they're very complex.

Yes, but actually this is not true. They are not so complex in this
regard. All you have to do is look in the pg_cursors view to see if
your cursor is there, and if not, create it in your session.
All you need to maintain is the cursor name, which maps to your
session plus the specific query you run. This should be easy
in any web application.
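The lookup described above can be sketched roughly as follows. This is a
minimal illustration, not code from the thread: it assumes a psycopg2-style
connection object, and the helper names (`cursor_name`, `ensure_cursor`) and
the SHA-1 naming scheme are my own invention; only the `pg_cursors` view and
`DECLARE ... CURSOR WITH HOLD` are actual PostgreSQL features.

```python
import hashlib

def cursor_name(session_id: str, query: str) -> str:
    # Derive a stable, SQL-identifier-safe cursor name from the
    # session id plus the query text, so the same session/query
    # pair always maps to the same cursor.
    digest = hashlib.sha1(
        (session_id + "\x00" + query).encode("utf-8")
    ).hexdigest()[:16]
    return "cur_" + digest

def ensure_cursor(conn, session_id: str, query: str) -> str:
    # Look in pg_cursors; declare the cursor only if it is not
    # already open in this session, then return its name.
    name = cursor_name(session_id, query)
    with conn.cursor() as cur:
        cur.execute("SELECT 1 FROM pg_cursors WHERE name = %s", (name,))
        if cur.fetchone() is None:
            cur.execute("DECLARE {} CURSOR WITH HOLD FOR {}".format(name, query))
    return name
```

A page of results would then be a `MOVE ABSOLUTE`/`FETCH` against the
returned cursor name rather than a fresh `OFFSET`/`LIMIT` query.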

> It sounds like they wrote their own middleware to handle the problem,
> which is basically what you suggested (a "clever pooling web
> application") after saying "wrong".

I read about "building index data outside postgres", which is still
the wrong approach imho.

This discussion is a bit theoretical until we see the actual problem
and the proposed solution here.

Regards
Tino

From:
Michael Stone
Date:

On Fri, Mar 23, 2007 at 01:01:02PM +0100, Tino Wildenhain wrote:
>This discussion is a bit theoretical until we see the actual problem
>and the proposed solution here.

It's good to see you back off a bit from your previous stance of
assuming that someone doesn't know what they're doing and that their
solution is absolutely wrong without actually knowing anything about
what they are trying to do.

Mike Stone

From:
Tino Wildenhain
Date:

Michael Stone schrieb:
> On Fri, Mar 23, 2007 at 01:01:02PM +0100, Tino Wildenhain wrote:
>> This discussion is a bit theoretical until we see the actual problem
>> and the proposed solution here.
>
> It's good to see you back off a bit from your previous stance of
> assuming that someone doesn't know what they're doing and that their
> solution is absolutely wrong without actually knowing anything about
> what they are trying to do.

Well, I'm sure it's very likely wrong :-) At least the core part of
it, the statement about "keeping index data outside postgres".

What I meant by my comment about this being theoretical: we cannot
make educated suggestions about alternative solutions until we know
the problem, and perhaps the current solution, in detail.

Regards
Tino