Re: Scaling SELECT:s with the number of disks on a stripe

Поиск
Список
Период
Сортировка
От Dave Cramer
Тема Re: Scaling SELECT:s with the number of disks on a stripe
Дата
Msg-id 16C6C659-1C0D-4C1A-AEB7-E92FBAADC045@fastcrypt.com
обсуждение исходный текст
Ответ на Re: Scaling SELECT:s with the number of disks on a stripe  (Peter Schuller <peter.schuller@infidyne.com>)
Список pgsql-performance
On 4-Apr-07, at 2:01 AM, Peter Schuller wrote:

> Hello,
>
>> The next question then is whether anything in your postgres
>> configuration
>> is preventing it getting useful performance from the OS. What
>> settings
>> have you changed in postgresql.conf?
>
> The only options not commented out are the following (it's not even
> tweaked for buffer sizes and such, since in this case I am not
> interested in things like sort performance and cache locality other
> than as an afterthought):
>
> hba_file = '/etc/postgresql/8.1/main/pg_hba.conf'
> ident_file = '/etc/postgresql/8.1/main/pg_ident.conf'
> external_pid_file = '/var/run/postgresql/8.1-main.pid'
> listen_addresses = '*'
> port = 5432
> max_connections = 100
> unix_socket_directory = '/var/run/postgresql'
> ssl = true
> shared_buffers = 1000
This is way too low, if this 8.x then set it to 25% of available
memory, and effective cache should be 3x that
> log_line_prefix = '%t '
> stats_command_string = on
> stats_row_level = on
> autovacuum = on
> lc_messages = 'C'
> lc_monetary = 'C'
> lc_numeric = 'C'
> lc_time = 'C'
>
>> Are you using any unusual settings within the OS itself?
>
> No. It's a pretty standard kernel. The only local tweaking done is
> enabling/disabling various things; there are no special patches used
> or attempts to create a minimalistic kernel or anything like that.
>
>> You're forgetting the LIMIT clause. For the straight index scan, the
>> query aborts when the LIMIT is reached having scanned only the
>> specified
>> number of index rows (plus any index entries that turned out to be
>> dead
>> in the heap). For the bitmap scan case, the limit can be applied
>> only after
>> the heap scan is under way, therefore the index scan to build the
>> bitmap
>> will need to scan ~50k rows, not the 10k specified in the limit,
>> so the
>> amount of time spent scanning the index is 50 times larger than in
>> the
>> straight index scan case.
>
> Ok - makes sense that it has to scan the entire subset of the index
> for the value in question. I will have to tweak the CPU/disk costs
> settings (which I have, on purpose, not yet done).
>
>> However, I do suspect you have a problem here somewhere, because
>> in my
>> tests the time taken to do the bitmap index scan on 50k rows, with
>> the
>> index in cache, is on the order of 30ms (where the data is cached in
>> shared_buffers) to 60ms (where the data is cached by the OS).
>> That's on
>> a 2.8GHz xeon.
>
> This is on a machine with 2.33GHz xeons and I wasn't trying to
> exaggerate. I timed it and it is CPU bound (in userspace; next to no
> system CPU usage at all) for about 15 seconds for the case of
> selecting with a limit of 10000.
>
> Given that there is no disk activity I can't imagine any buffer sizes
> or such affecting this other than userspace vs. kernelspace CPU
> concerns (since obviously the data being worked on is in RAM). Or am I
> missing something?
>
> It is worth noting that the SELECT of fewer entries is entirely disk
> bound; there is almost no CPU usage whatsoever. Even taking the
> cumulative CPU usage into account (gut feeling calculation, nothing
> scientific) and multiplying by 50 you are nowhere near 15 seconds of
> CPU boundness. So it is indeed strange.
>
> --
> / Peter Schuller
>
> PGP userID: 0xE9758B7D or 'Peter Schuller
> <peter.schuller@infidyne.com>'
> Key retrieval: Send an E-Mail to getpgpkey@scode.org
> E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org
>


В списке pgsql-performance по дате отправления:

Предыдущее
От: Arnau
Дата:
Сообщение: Re: Equivalents in PostgreSQL of MySQL's "ENGINE=MEMORY" "MAX_ROWS=1000"
Следующее
От: Andrew - Supernews
Дата:
Сообщение: Re: Scaling SELECT:s with the number of disks on a stripe