Thread: Asynchronous I/O in Postgres


Asynchronous I/O in Postgres

From
Mladen Gogala
Date:
Postgres 8.4 and 9.0 have a parameter named
"effective_io_concurrency". The manual page is very short, it says the
following:

Sets the number of concurrent disk I/O operations that PostgreSQL
expects can be executed simultaneously. Raising this value will increase
the number of I/O operations that any individual PostgreSQL session
attempts to initiate in parallel. The allowed range is 1 to 1000, or
zero to disable issuance of asynchronous I/O requests.

http://www.postgresql.org/docs/current/static/runtime-config-resource.html

My initial understanding was that this was the size of a table
containing aiocb pointers, so that PgSQL could launch up to 1000
simultaneous aio_read or aio_write calls per process. While monitoring the
system, I noticed that there is no asynchronous I/O at all! Nothing,
nada, zilch! Then I noticed that the "postgres" binary is not even
linked with libaio, so aio_read was out of the question:

-bash-3.2$ ldd postgres|grep libaio
-bash-3.2$

The platform is Postgres 9.0.1 on RH EL 5.5 x86-64. My understanding of
"effective_io_concurrency" was apparently very wrong. What is the
"effective concurrency", and what are those "simultaneous I/O requests"
that the manual page is talking about? Can somebody please define in precise
terms what this parameter controls? What kind of "concurrent
I/O" is Postgres doing without asynchronous I/O calls? If this parameter
is just a stub for future use, I'd like to know. Will Postgres
use asynchronous I/O? Is that planned?



--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com


Re: Asynchronous I/O in Postgres

From
Mladen Gogala
Date:
Mladen Gogala wrote:
>
> The platform is Postgres 9.0.1 on RH EL 5.5 x86-64. My understanding of
> the "effective_io_concurrency" was apparently very wrong. What is the
> "effective concurrency" and what are those "simultaneous I/O requests"
> that man page is talking about. Can somebody please define in precise
> terms what is it that this parameter defines? What kind of "concurrent
> I/O" is Postgres doing without asynchronous I/O calls? If this parameter
> is just a stub for the future reference, I'd like to know. Will Postgres
> use asynchronous I/O? Is that planned?
>
>
>
The mystery deepens. I thought that this might be the size of the I/O
vector for the readv and writev routines, but not so. I ran
"ltrace -e readv -p <PID>" on a PID that was doing a large sequential
scan, and not a single "readv" library call was encountered. All calls
were just plain and simple "read" calls. Where is the concurrency? I am
really curious now. The LWN article pompously announced that PostgreSQL
9.0 will use asynchronous I/O, with aio_read and aio_write. What does
effective_io_concurrency define? What kind of "concurrent I/O" is
PostgreSQL doing? This doesn't look very "concurrent":

read(65, "\16\0\0\0\210\254\333\240\1\0\4\0L\0P\0\0 \4
\0\0\0\0000\231\240\r\370\227p\2"..., 8192) = 8192
read(65, "\16\0\0\0000\1\334\240\1\0\4\0008\0 \1\0 \4
\0\0\0\0(\234\260\7`\231\220\5"..., 8192) = 8192
read(65, "\16\0\0\0\20;\334\240\1\0\4\0<\0(\1\0 \4
\0\0\0\0\360\233\36\10\320\232@\2"..., 8192) = 8192
read(65, "\16\0\0\0Pk\334\240\1\0\4\0004\0\300\0\0 \4 \0\0\0\0H\232p\v
\224P\f"..., 8192) = 8192
read(65, "\16\0\0\0\220\273C\241\1\0\4\0D\0p\0\0 \4
\0\0\0\0\230\234\320\6\220\233\16\2"..., 8192) = 8192
read(65, "\16\0\0\0P\311\335\240\1\0\4\0<\0008\1\0 \4
\0\0\0\0\240\231\300\fp\230`\2"..., 8192) = 8192
read(65, "\16\0\0\0\260*\335\240\1\0\4\0008\0\350\0\0 \4
\0\0\0\0\20\230\340\0178\224\256\7"..., 8192) = 8192
read(65, "\16\0\0\0\20\10\337\240\1\0\4\0004\0h\0\0 \4
\0\0\0\0\210\231\356\f\230\225\340\7"..., 8192) = 8192
read(65, "\16\0\0\0\220\310C\241\1\0\4\0@\0\260\0\0 \4
\0\0\0\0H\231p\r0\227.\4"..., 8192) = 8192
read(65, "\16\0\0\0\350-\301\241\1\0\4\0<\0X\0\0 \4
\0\0\0\0H\232p\v\0\226\216\10"..., 8192) = 8192

Descriptor 65 is a DB file:
[root@lpo-postgres-01 ~]# cd /proc/16663/fd
[root@lpo-postgres-01 fd]# ls -l 65
lrwx------ 1 postgres postgres 64 Oct  7 23:26 65 ->
/software/pgsql/m-over/PG_9.0_201008051/16417/1572186.7
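
For readers decoding the path above, here is a small sketch of how such a file name breaks down. It assumes the default build settings (8 kB pages and 1 GB segment files); the function name parse_relation_path and the dictionary keys are made up for illustration and are not part of PostgreSQL.

```python
BLOCK_SIZE = 8192             # assumed default page size (matches the reads above)
BLOCKS_PER_SEGMENT = 131072   # assumed default: 1 GB segment / 8 kB pages


def parse_relation_path(path):
    """Split '<...>/<database oid>/<relfilenode>[.<segment>]' into its parts.

    Hypothetical helper for illustration only.
    """
    db_oid, filename = path.rstrip("/").split("/")[-2:]
    relfilenode, _, segment = filename.partition(".")
    segment_no = int(segment) if segment else 0  # no suffix means segment 0
    return {
        "database_oid": int(db_oid),
        "relfilenode": int(relfilenode),
        "segment": segment_no,
        # First block of the relation that lives in this segment file.
        "first_block": segment_no * BLOCKS_PER_SEGMENT,
    }
```

Under those assumptions, descriptor 65 points at the eighth 1 GB segment (suffix .7) of one relation in database 16417, which fits a backend partway through a large scan.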

So, essentially, the process is reading block by block, in a sequence.
What, exactly, does "effective_io_concurrency" mean?


--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com


Re: Asynchronous I/O in Postgres

From
Mladen Gogala
Date:
Mladen Gogala wrote:
> So, essentially, the process is reading block by block, in a sequence.
> What, exactly, does "effective_io_concurrency" mean?
>
To rephrase my question, can anybody tell me where in the code is it used?


--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com


Re: Asynchronous I/O in Postgres

From
Josh Kupershmidt
Date:
On Fri, Oct 8, 2010 at 8:14 AM, Mladen Gogala <mladen.gogala@vmsinfo.com> wrote:
> Mladen Gogala wrote:
>>
>> So, essentially, the process is reading block by block, in a sequence.
>> What, exactly, does "effective_io_concurrency" mean?
>>
>
> To rephrase my question, can anybody tell me where in the code is it used?

The docs are a bit sparse here :-(

But it looks to me like effective_io_concurrency only affects bitmap
heap scans. The setting from effective_io_concurrency gets put into
"target_prefetch_pages" in ./src/backend/utils/misc/guc.c . But the
only place which uses that variable is
./src/backend/executor/nodeBitmapHeapscan.c.
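
To make the mechanism concrete: the prefetch in that code path is issued through posix_fadvise() with POSIX_FADV_WILLNEED, a plain libc call, which is why neither libaio nor aio_read shows up. The sketch below is a simplified Python model of the idea, not the executor's actual logic; the function name, its parameters and the way the window advances are assumptions made for illustration.

```python
import os

BLOCK_SIZE = 8192  # PostgreSQL's default page size


def read_blocks_with_prefetch(fd, block_numbers, prefetch_target):
    """Read blocks in the given order with ordinary synchronous reads,
    keeping up to `prefetch_target` advisory hints ahead of the reads."""
    blocks = []
    next_to_prefetch = 1
    for i, block_no in enumerate(block_numbers):
        # Never hint a block we are about to read anyway.
        next_to_prefetch = max(next_to_prefetch, i + 1)
        limit = min(i + prefetch_target, len(block_numbers) - 1)
        while next_to_prefetch <= limit:
            if hasattr(os, "posix_fadvise"):  # Unix only
                try:
                    # Returns immediately; the kernel may start fetching
                    # the block in the background.
                    os.posix_fadvise(fd,
                                     block_numbers[next_to_prefetch] * BLOCK_SIZE,
                                     BLOCK_SIZE,
                                     os.POSIX_FADV_WILLNEED)
                except OSError:
                    pass  # the hint is advisory, so failures are ignored
            next_to_prefetch += 1
        # The read itself is still a plain blocking call.
        blocks.append(os.pread(fd, BLOCK_SIZE, block_no * BLOCK_SIZE))
    return blocks
```

With prefetch_target set to 0 no hints are issued at all, which corresponds to the documented meaning of setting the parameter to zero. In a trace, the reads would still look like ordinary 8192-byte read calls; the concurrency, if any, happens inside the kernel.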

The EnterpriseDB docs
<http://www.enterprisedb.com/docs/en/8.3R2/perf/Postgres_Plus_Advanced_Server_Performance_Guide-17.htm>
mention:
"effective_io_concurrency is only used for Bitmap Heap Scans. For
normal sequential scans the operating system should handle read-ahead
internally (On Linux, see the blockdev command, in particular --setra
and --setfra)."
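
As a rough aid for reading those blockdev values: --getra and --setra work in 512-byte sectors, so it can help to convert a setting into KiB and into 8 kB Postgres pages. This is only an arithmetic sketch; the function name and the returned keys are invented for illustration, and the 8 kB page size is the default build setting.

```python
SECTOR_BYTES = 512     # unit used by blockdev --getra / --setra
PG_PAGE_BYTES = 8192   # assumed default PostgreSQL page size


def readahead_summary(sectors):
    """Convert a read-ahead value in 512-byte sectors to KiB and 8 kB pages."""
    total_bytes = sectors * SECTOR_BYTES
    return {
        "kib": total_bytes // 1024,
        "pg_pages": total_bytes // PG_PAGE_BYTES,
    }
```

For example, a read-ahead of 256 sectors is 128 KiB, or 16 pages per read-ahead window during a sequential scan.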

Josh

Re: Asynchronous I/O in Postgres

From
Bruce Momjian
Date:
Josh Kupershmidt wrote:
> On Fri, Oct 8, 2010 at 8:14 AM, Mladen Gogala <mladen.gogala@vmsinfo.com> wrote:
> > Mladen Gogala wrote:
> >>
> >> So, essentially, the process is reading block by block, in a sequence.
> >> What, exactly, does "effective_io_concurrency" mean?
> >>
> >
> > To rephrase my question, can anybody tell me where in the code is it used?
>
> The docs are a bit sparse here :-(
>
> But it looks to me like effective_io_concurrency only affects bitmap
> heap scans. The setting from effective_io_concurrency gets put into
> "target_prefetch_pages" in ./src/backend/utils/misc/guc.c . But the
> only place which uses that variable is
> ./src/backend/executor/nodeBitmapHeapscan.c.
>
> The EnterpriseDB docs
> <http://www.enterprisedb.com/docs/en/8.3R2/perf/Postgres_Plus_Advanced_Server_Performance_Guide-17.htm>
> mention:
> "effective_io_concurrency is only used for Bitmap Heap Scans. For
> normal sequential scans the operating system should handle read-ahead
> internally (On Linux, see the blockdev command, in particular --setra
> and --setfra)."

So, is this also true for community Postgres?  Can someone suggest
updated docs?


--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +