Обсуждение: Database I/O and other performance questions.

Поиск
Список
Период
Сортировка

Database I/O and other performance questions.

От
Lincoln Yeoh
Дата:
Hi everyone,

What typical/max database I/O throughputs are people getting for simple
selects of datasets  cached in memory but not explicitly clustered?
100KB/sec? 1MB/sec? 10MB/sec? (assuming << 8KB rows).

Does a "select count(*) from tablename where ..." actually retrieve all
columns from all the selected rows or does it only count the rows?

That is to say:
would/should select count(*) be slower than select
count(averysmallcolumnmaybeboolean)

I'll test stuff out in practice, but it'll be good to know what it should
be in theory.

Thanks!
Link.


Re: Database I/O and other performance questions.

От
Bruce Momjian
Дата:
> Hi everyone,
>
> What typical/max database I/O throughputs are people getting for simple
> selects of datasets  cached in memory but not explicitly clustered?
> 100KB/sec? 1MB/sec? 10MB/sec? (assuming << 8KB rows).

I see transfer rates that match the I/O speed of the disk.

>
> Does a "select count(*) from tablename where ..." actually retrieve all
> columns from all the selected rows or does it only count the rows?

Yes, it reads all rows.

>
> That is to say:
> would/should select count(*) be slower than select
> count(averysmallcolumnmaybeboolean)

Same speed.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Database I/O and other performance questions.

От
Tom Lane
Дата:
Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> would/should select count(*) be slower than select
> count(averysmallcolumnmaybeboolean)

There is actually a semantic difference there, see
http://www.postgresql.org/devel-corner/docs/postgres/sql-expressions.htm#SYNTAX-AGGREGATES
count(*) means the count of selected rows, but count(foo) means the
number of selected rows where the variable or expression is not NULL.

count(*) should be a fraction faster, because it doesn't expend any
cycles to check whether the specific column is NULL.  The "*" in this
syntax doesn't mean "all columns" the way it does in a select list,
it's just a dummy.

            regards, tom lane