Discussion: Re: [HACKERS] []performance issues
On Fri, Aug 02, 2002 at 03:48:39PM +0400, Yaroslav Dmitriev wrote:
>
> So I am still interested in PostgreSQL's ability to deal with
> multimillion-record tables.
[x-posted and Reply-To: to -general; this isn't a development
problem.]
We have tables with millions of records, and they are fast. But not
fast to count(). The MVCC design of PostgreSQL will give you very
few concurrency problems, but you pay for that in the response time
of certain kinds of aggregates, which cannot use an index.
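As a rough sketch (the table name is invented and the exact plan text
varies by version), an unqualified count(*) has to aggregate over a
sequential scan of the whole table, because no index can say which rows
are visible to your transaction:

EXPLAIN SELECT count(*) FROM bigtable;
-- Aggregate
--   ->  Seq Scan on bigtable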
A
--
----
Andrew Sullivan 87 Mowat Avenue
Liberty RMS Toronto, Ontario Canada
<andrew@libertyrms.info> M6K 3E3
+1 416 646 3304 x110
Count() is slow even on your Sun server with 16 GB of RAM? How big is the
database?
David Blood
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Andrew Sullivan
Sent: Friday, August 02, 2002 9:40 AM
To: PostgreSQL-development
Cc: PostgreSQL general list
Subject: Re: [GENERAL] [HACKERS] []performance issues
On Fri, Aug 02, 2002 at 03:48:39PM +0400, Yaroslav Dmitriev wrote:
>
> So I am still interested in PostgreSQL's ability to deal with
> multimillion-record tables.
[x-posted and Reply-To: to -general; this isn't a development
problem.]
We have tables with millions of records, and they are fast. But not
fast to count(). The MVCC design of PostgreSQL will give you very
few concurrency problems, but you pay for that in the response time
of certain kinds of aggregates, which cannot use an index.
A
--
----
Andrew Sullivan 87 Mowat Avenue
Liberty RMS Toronto, Ontario Canada
<andrew@libertyrms.info> M6K 3E3
+1 416 646 3304 x110
On Fri, Aug 02, 2002 at 09:57:16AM -0600, David Blood wrote:
>
> Count() is slow even on your Sun server with 16gb ram? How big is the
> database?
Well, just relatively slow! It's always going to be relatively slow
to seqscan a few million records. We have some tables which have
maybe 4 or 4.5 million records in them. (I don't spend a lot of time
count()ing them ;-)
A
--
----
Andrew Sullivan 87 Mowat Avenue
Liberty RMS Toronto, Ontario Canada
<andrew@libertyrms.info> M6K 3E3
+1 416 646 3304 x110
We have tables of over 3.1 million records. Performance is fine for most
things as long as access hits an index. As already stated, count(*) takes a
long time. It just took over a minute for me to check the record count. Our
DB is primarily in a data warehouse role.

Creating an index on a char(43) field on that table from scratch takes a
while, but I think that's expected. Under normal loads we have well under
1 second "LIKE" queries on the indexed char(43) field in that table, with a
join on a table of 1.1 million records using a char(12) primary key.

Server is a Dell PowerEdge 2400, dual PIII 667s with a gig of memory,
800-something megs allocated to postgres shared buffers.

-Pete

Andrew Sullivan wrote:

>On Fri, Aug 02, 2002 at 03:48:39PM +0400, Yaroslav Dmitriev wrote:
>
>>So I am still interested in PostgreSQL's ability to deal with
>>multimillion-record tables.
>
>[x-posted and Reply-To: to -general; this isn't a development
>problem.]
>
>We have tables with millions of records, and they are fast. But not
>fast to count(). The MVCC design of PostgreSQL will give you very
>few concurrency problems, but you pay for that in the response time
>of certain kinds of aggregates, which cannot use an index.
>
>A
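A rough sketch of the kind of schema and query Pete describes (every
table, column, and index name below is invented; whether the LIKE can
actually use the index also depends on the pattern being left-anchored
and on the database's locale or operator class):

-- Hypothetical names; Pete's actual schema is not shown in the thread.
CREATE INDEX warehouse_code_idx ON warehouse_rows (code);    -- code is char(43)

SELECT w.code, r.description
FROM warehouse_rows w
JOIN reference_rows r ON r.ref_id = w.ref_id    -- ref_id is the char(12) primary key
WHERE w.code LIKE 'ABC-2002-%';                 -- left-anchored pattern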
On Fri, 2002-08-02 at 11:39, Andrew Sullivan wrote:
> On Fri, Aug 02, 2002 at 03:48:39PM +0400, Yaroslav Dmitriev wrote:
> >
> > So I am still interested in PostgreSQL's ability to deal with
> > multimillion-record tables.
>
> [x-posted and Reply-To: to -general; this isn't a development
> problem.]
>
> We have tables with millions of records, and they are fast. But not
> fast to count(). The MVCC design of PostgreSQL will give you very
> few concurrency problems, but you pay for that in the response time
> of certain kinds of aggregates, which cannot use an index.

Of course, as suggested this is easily overcome by keeping your own
counter.

begin;
insert into bigtable values (...);          -- the new row
update counttable set count = count + 1;    -- maintain the running count
commit;

Now you get all the fun concurrency issues -- but fetching the
information will be quick. Which happens more, the counts or the
inserts? :)
On Fri, Aug 02, 2002 at 02:08:02PM -0400, Rod Taylor wrote:
>
> Of course, as suggested this is easily overcome by keeping your own
> counter.
>
> begin;
> insert into bigtable values (...);
> update counttable set count = count + 1;
> commit;
>
> Now you get all the fun concurrency issues -- but fetching the
> information will be quick. Which happens more, the counts or the
> inserts? :)
You could get around this with a trigger that just inserts 1 into one
table (call it counter_unposted), and then an external process that
takes those units, adds them to the value in counter_posted, and
deletes them from counter_unposted. You'd always be a few minutes
behind, but you'd get a counter that's pretty close without too much
overhead. Of course, this raises the obvious question: why use
count() at all?
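A minimal sketch of that arrangement, assuming a PL/pgSQL trigger
(counter_unposted and counter_posted are the names from the paragraph
above; bigtable and the function name are invented):

-- One row per insert goes into counter_unposted; counter_posted holds
-- the running total.
CREATE TABLE counter_unposted (delta integer NOT NULL);
CREATE TABLE counter_posted (total bigint NOT NULL);
INSERT INTO counter_posted VALUES (0);          -- seed the running total

CREATE FUNCTION bigtable_count_trig() RETURNS trigger AS '
BEGIN
    INSERT INTO counter_unposted (delta) VALUES (1);
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER bigtable_count
    AFTER INSERT ON bigtable
    FOR EACH ROW EXECUTE PROCEDURE bigtable_count_trig();

-- The external posting process, run every few minutes.  Serializable
-- isolation keeps the UPDATE and the DELETE looking at the same rows.
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
UPDATE counter_posted
   SET total = total + (SELECT count(*) FROM counter_unposted);
DELETE FROM counter_unposted;
COMMIT;

Reading the (slightly stale) count is then a one-row select from
counter_posted.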
A
--
----
Andrew Sullivan 87 Mowat Avenue
Liberty RMS Toronto, Ontario Canada
<andrew@libertyrms.info> M6K 3E3
+1 416 646 3304 x110