Re: [PERFORM] A Better External Sort?

Поиск
Список
Период
Сортировка
От Ron Peacetree
Тема Re: [PERFORM] A Better External Sort?
Дата
Msg-id 18329659.1128188552696.JavaMail.root@elwamui-polski.atl.sa.earthlink.net
обсуждение исходный текст
Ответ на [PERFORM] A Better External Sort?  (Ron Peacetree <rjpeace@earthlink.net>)
Ответы Re: [PERFORM] A Better External Sort?  (Martijn van Oosterhout <kleptog@svana.org>)
Список pgsql-hackers
You have not said anything about what HW, OS version, and pg version
used here, but even at that can't you see that something Smells Wrong?

The most common CPUs currently shipping have clock rates of ~2-3GHz
and have 8B-16B internal pathways.  SPARCs and other like CPUs are
clocked slower but have 16B-32B internal pathways.  In short, these
CPU's have an internal bandwidth of 16+ GBps.

The most common currently shipping mainboards have 6.4GBps RAM
subsystems.  ITRW, their peak is ~80% of that, or ~5.1GBps.

In contrast, the absolute peak bandwidth of a 133MHx 8B PCI-X bus is
1GBps, and ITRW it peaks at ~800-850MBps.  Should anyone ever build
a RAID system that can saturate a PCI-Ex16 bus, that system will be
maxing ITRW at ~3.2GBps.

CPUs should NEVER be 100% utilized during copy IO.  They should be
idling impatiently waiting for the next piece of data to finish being
processed even when the RAM IO subsystem is pegged; and they
definitely should be IO starved rather than CPU bound when doing
HD IO.

Those IO rates are also alarming in all but possibly the first case.  A
single ~50MBps HD doing 21MBps isn't bad, but for even a single
~80MBps HD it starts to be of concern.  If any these IO rates came
from any reasonable 300+MBps RAID array, then they are BAD.

What your simple experiment really does is prove We Have A
Problem (tm) with our IO code at either or both of the OS or the pg
level(s).

Ron


-----Original Message-----
From: Martijn van Oosterhout <kleptog@svana.org>
Sent: Oct 1, 2005 12:19 PM
Subject: Re: [HACKERS] [PERFORM] A Better External Sort?

On Sat, Oct 01, 2005 at 10:22:40AM -0400, Ron Peacetree wrote:
> Assuming we get the abyssmal physical IO performance fixed...
> (because until we do, _nothing_ is going to help us as much)

I'm still not convinced this is the major problem. For example, in my
totally unscientific tests on an oldish machine I have here:

Direct filesystem copy to /dev/null
21MB/s    10% user 50% system  (dual cpu, so the system is using a whole CPU)

COPY TO /dev/null WITH binary
13MB/s    55% user 45% system  (ergo, CPU bound)

COPY TO /dev/null
4.4MB/s   60% user 40% system

\copy to /dev/null in psql
6.5MB/s   60% user 40% system

This machine is a bit strange setup, not sure why fs copy is so slow.
As to why \copy is faster than COPY, I have no idea, but it is
repeatable. And actually turning the tuples into a printable format is
the most expensive. But it does point out that the whole process is
probably CPU bound more than anything else.

So, I don't think physical I/O is the problem. It's something further
up the call tree. I wouldn't be surprised at all it it had to do with
the creation and destruction of tuples. The cost of comparing tuples
should not be underestimated.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Expression index ignores column statistics target
Следующее
От: Tom Lane
Дата:
Сообщение: Re: \d on database with a lot of tables is slow