Re: Slow count(*) again...

From Scott Carey
Subject Re: Slow count(*) again...
Date
Msg-id CF6DEAD9-50AC-4EE4-A4CC-FA0B31D82D2C@richrelevance.com
In reply to Re: Slow count(*) again...  (Jesper Krogh <jesper@krogh.cc>)
List pgsql-performance
On Oct 21, 2010, at 11:13 AM, Jesper Krogh wrote:

> On 2010-10-21 06:47, Scott Carey wrote:
> > On a wimpy disk, I/O bound for sure.  But my disks go 1000MB/sec.
> > No query can go fast enough for them.  The best I've gotten is
> > 800MB/sec, on a wide row (average 800 bytes).  Most tables go
> > 300MB/sec or so.  And with 72GB of RAM, many scans are in-memory
> > anyway.
>
> Is it cpu or io bound while doing it?
I/O bound with the fio benchmark tool if 16K blocks or greater, CPU bound with 8K blocks or smaller.  CentOS 5.5.
CPU bound with postgres.
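For reference, the kind of fio run behind those numbers could look something like the sketch below; the file path, size, queue depth, and ioengine are illustrative assumptions, not the actual benchmark settings used:

```shell
# Sequential-read fio runs at the block sizes discussed above:
# 16K and larger were I/O bound on this box, 8K and smaller CPU bound.
for bs in 8k 16k 64k; do
  fio --name=seqread-$bs --filename=/data/fio.testfile --size=4g \
      --rw=read --bs=$bs --direct=1 --ioengine=libaio --iodepth=16 \
      --runtime=30 --time_based
done
```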


> Can you scan it faster using time cat relation-oid.* > /dev/null

I'm not sure what you mean.  In psql, select * piped to /dev/null is VERY CPU bound because of all the formatting.  I
haven't toyed with COPY.  Do you mean the actual files?  'dd' tests from actual files are similar to fio, but not as
consistent, and it is hard to add concurrency.  That is faster than postgres.
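A dd-style check of the underlying relation files might look like this; the data-directory path and relfilenode are placeholders for whatever the table actually maps to:

```shell
# Sequential read of one relation segment, bypassing the page cache
# (iflag=direct), so we measure the disks rather than RAM.
dd if=/var/lib/pgsql/data/base/16384/16423 of=/dev/null bs=1M iflag=direct

# Crude concurrency: read all segments of the relation in parallel.
for f in /var/lib/pgsql/data/base/16384/16423*; do
  dd if="$f" of=/dev/null bs=1M iflag=direct &
done
wait
```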

>
> > A single SSD with supercapacitor will go about 500MB/sec by itself
> > next spring.  I will easily be able to build a system with 2GB/sec
> > I/O for under $10k.
>
> What filesystem are you using? Readahead?
> Can you try to check the filesystem fragmentation of the table using filefrag?
>
XFS, defragmented once a day.  Readahead 40960 (20MB, 1MB per spindle).  Two RAID 10 arrays of 10 disks each (2 hot
spares), with software raid-0 tying those together (md, 1MB blocks).  Two Adaptec 5805 (or 5085, the external SAS one).  A
third raid card for the OS/xlog with 4x 10krpm SAS drives internal.
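The readahead number is in 512-byte sectors, so the 20MB figure falls out of the sector math below; the blockdev and filefrag commands are shown for context, and the md device name and relation path are assumptions:

```shell
# blockdev --setra 40960 /dev/md0     # set readahead in sectors (device assumed)
# filefrag -v /var/lib/pgsql/data/base/16384/16423   # extent count, per Jesper's question
ra_sectors=40960
ra_mb=$((ra_sectors * 512 / 1024 / 1024))
echo "$ra_mb"   # → 20 (MB), i.e. 1MB per spindle across the 20 data disks
```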

Fragmentation quickly takes this down a lot, as do small files and concurrent activity, since it's only enough spindles
for ~2000 iops.  But it's almost all large reporting queries on partitioned tables (500,000 partitions).  A few smaller
tables are starting to cause too many seeks, so those might end up on a smaller, high-iops tablespace later.

Over time the disks have filled up, and there is a significant slowdown in sequential transfer at the end of the
partition -- 600MB/sec max.  That is still CPU bound on most scans, but postgres can go that fast on some scans.

Off topic:
Another interesting feature of this setup is how it causes the system tables to bloat by factors of 2x to 8x each week, and
requires frequent vacuum full + reindex on several of them, else they reach 1.5GB in size.  Nothing like lots of temp
table work + hour-long concurrent transactions to make the system catalog bloat.  I suppose with 8.4 many temp tables
could be replaced using WITH queries, but in other cases analyzing a temp table is the only way to get a sane query
plan.
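That weekly maintenance amounts to something like the commands below; the database name is a placeholder, and which catalogs bloat worst is my assumption (pg_attribute and pg_class are the usual victims of heavy temp-table churn):

```shell
# VACUUM FULL rewrites the bloated catalog heaps; REINDEX rebuilds their indexes.
psql -d reporting -c "VACUUM FULL pg_catalog.pg_attribute;"
psql -d reporting -c "REINDEX TABLE pg_catalog.pg_attribute;"
psql -d reporting -c "VACUUM FULL pg_catalog.pg_class;"
psql -d reporting -c "REINDEX TABLE pg_catalog.pg_class;"
```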


> --
> Jesper
