Re: Slow count(*) again...

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: Slow count(*) again...
Дата
Msg-id 4CB1DC01.4000409@postnewspapers.com.au
обсуждение исходный текст
Ответ на Re: Slow count(*) again...  (Neil Whelchel <neil.whelchel@gmail.com>)
Список pgsql-performance
On 10/10/2010 6:29 PM, Neil Whelchel wrote:
> On the other hand, I copied a table out of one of my production servers that
> has about 60,000 rows with 6 columns (numeric, numeric, bool, bool, timestamp,
> text). The first numeric column has numbers evenly spread between 0 and 100
> and it is indexed. I put the table in a pair of database servers both running
> on the same physical hardware. One server is Postgres, the other is a popular
> server (I am not mentioning names here).

Please do. Your comment is pretty meaningless otherwise.

If you're talking about MySQL: Were you using InnoDB or MyISAM table
storage? Of course it's fast with MyISAM, it relies on locks to do
updates and has bugger all capability for write concurrency, or to
permit readers while writing is going on.

If you're using InnoDB, then I'd like to know how they've managed that.

If you're talking about some *other* database, please name it and
provide any useful details, because the hand waving is not helpful.

 > I don't think that any amount of settings
> tweaking will bring them even in the same ball park.

If you are, in fact, comparing MySQL+MyISAM and PostgreSQL, then you're
quite right. Pg will never have such a fast count() as MyISAM does or
the same insanely fast read performance, and MyISAM will never be as
reliable, robust or concurrency-friendly as Pg is. Take your pick, you
can't have both.

> There has been discussion
> about the other server returning an incorrect count because all of the indexed
> rows may not be live at the time. This is not a problem for the intended use,
> that is why I suggested another function like estimate(*). It's name suggests
> that the result will be close, not 100% correct, which is plenty good enough
> for generating a list of results pages in most cases.

Do you have any practical suggestions for generating such an estimate,
though? I find it hard to think of any way the server can do that
doesn't involve executing the query. The table stats are WAY too general
and a bit hit-and-miss, and there isn't really any other way to do it.

If all you want is a way to retrieve both a subset of results AND a
count of how many results would've been generated, it sounds like all
you really need is a way to get the total number of results returned by
a cursor query, which isn't a big engineering challenge. I expect that
in current Pg versions a trivial PL/PgSQL function could be used to
slurp and discard unwanted results, but a better in-server option to
count the results from a cursor query would certainly be nice.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

В списке pgsql-performance по дате отправления:

Предыдущее
От: Mladen Gogala
Дата:
Сообщение: Re: large dataset with write vs read clients
Следующее
От: Reid Thompson
Дата:
Сообщение: Re: Slow count(*) again...