Обсуждение: Weird disk write load caused by PostgreSQL?

Поиск
Список
Период
Сортировка

Weird disk write load caused by PostgreSQL?

От
Alexander Staubo
Дата:
I have a production PostgreSQL instance (8.1 on Linux 2.6.15) that
seems to be writing data to disk at rates that I think are
disproportional to the update load imposed on the database. I am
looking for ways to determine the cause of this I/O.

As an example, here is a typical graph produced by Munin:

   http://purefiction.net/paste/pg-iostat.png

Running an hourly iostat produces this output:

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda              43.50         0.21         0.04     427915      72736
sda              43.62         0.21         0.04     428183      88904
sda              43.74         0.21         0.05     428440     104877
sda              43.90         0.21         0.06     428808     124681
sda              44.06         0.21         0.07     429111     145447
sda              44.27         0.21         0.08     429532     170317
sda              44.46         0.21         0.09     429985     193594

In other words, it's reading about 400MB/hour and writing around
15-20GB/hour, or exactly 118GB during the last six hours. To
determine how well this correlates to the actual inserts and updates
being performed on the database, I ran a loop alongside iostat that
executed "select sum(n_tup_upd), sum(n_tup_ins) from
pg_stat_all_tables" against PostgreSQL every hour and output the
difference. Here are a few samples:

| delta_upd | delta_ins |
+-----------+-----------+
|      7111 |      2343 |
|      7956 |      2302 |
|      7876 |      2181 |
|      9269 |      2477 |
|      8553 |      2205 |

For the write numbers to match the tuple numbers, each updated/
inserted tuple would have to average at least 1.5MB (15 GB divided by
10,000 tuples), which is not the case; the total size of the raw
tuples updated/inserted during the above session probably does not
exceed a couple of megabytes. Even considering overhead, page size,
MVCC, etc., this does not compute.

I have not narrowed this explicitly down to PostgreSQL, but since the
stock Linux kernel we are running does not offer per-process I/O
statistics, I cannot determine this for sure. However, except for the
PostgreSQL database, everything else on the box should be identical
to what we are running on other boxes, which are not exhibiting the
same kind of load. Oh, and I have PostgreSQL logging turned off.

Note that PostgreSQL's performance in itself seems fine, and
according to top/ps it's only very rarely in iowait.

Alexander.

Re: Weird disk write load caused by PostgreSQL?

От
Tom Lane
Дата:
Alexander Staubo <alex@purefiction.net> writes:
> I have a production PostgreSQL instance (8.1 on Linux 2.6.15) that
> seems to be writing data to disk at rates that I think are
> disproportional to the update load imposed on the database. I am
> looking for ways to determine the cause of this I/O.

Are you sure that iostat is to be trusted?  The read numbers in
particular look suspiciously uniform ... it would be a strange
query load that would create a read demand changing less than 1%
from hour to hour, unless perhaps that represented the disk's
saturation point, which is not the case if you're not seeing
obvious performance problems.

            regards, tom lane

Re: Weird disk write load caused by PostgreSQL?

От
Alexander Staubo
Дата:
On Oct 2, 2006, at 17:50 , Tom Lane wrote:

> Alexander Staubo <alex@purefiction.net> writes:
>> I have a production PostgreSQL instance (8.1 on Linux 2.6.15) that
>> seems to be writing data to disk at rates that I think are
>> disproportional to the update load imposed on the database. I am
>> looking for ways to determine the cause of this I/O.
>
> Are you sure that iostat is to be trusted?

No. :) But iostat reads directly from /dev/diskstats, which should be
reliable. Of course, it still doesn't say anything about which
process is doing the writing; for that I would need to install the
atop kernel patches or similar.

...
> The read numbers in
> particular look suspiciously uniform ... it would be a strange
> query load that would create a read demand changing less than 1%
> from hour to hour, unless perhaps that represented the disk's
> saturation point, which is not the case if you're not seeing
> obvious performance problems.

They are not uniform at all -- they correlate perfectly with the web
traffic; it just so happens that the samples I quoted were from peak
hours. Take a look at the Munin graph. (The spikes correspond to
scheduled maintenance tasks like backups.)

Alexander.


Re: Weird disk write load caused by PostgreSQL?

От
Alexander Staubo
Дата:
I'm surprised that I have to reply to myself, since in hindsight this
should be bloody obvious: It's the pgsql_tmp directory. I just
monitored the file creation in that directory, and found PostgreSQL
to be creating huge temporary, extremely short-lived files ranging
from 1MB to 20MB in size. I increased work_mem to 25MB -- should be
perfectly safe on a box with 4GB of RAM, I hope - and the iostat
traffic seems to have dropped to near zero.

I'm a bit baffled as to why PostgreSQL would ever be sorting 20MB of
data in the first place, even with ~12 connections running queries
concurrently, but I suppose I will have to look more closely at our
query patterns.

Alexander.

On Oct 2, 2006, at 19:25 , Alexander Staubo wrote:

> On Oct 2, 2006, at 17:50 , Tom Lane wrote:
>
>> Alexander Staubo <alex@purefiction.net> writes:
>>> I have a production PostgreSQL instance (8.1 on Linux 2.6.15) that
>>> seems to be writing data to disk at rates that I think are
>>> disproportional to the update load imposed on the database. I am
>>> looking for ways to determine the cause of this I/O.
>>
>> Are you sure that iostat is to be trusted?
>
> No. :) But iostat reads directly from /dev/diskstats, which should
> be reliable. Of course, it still doesn't say anything about which
> process is doing the writing; for that I would need to install the
> atop kernel patches or similar.
>
> ...
>> The read numbers in
>> particular look suspiciously uniform ... it would be a strange
>> query load that would create a read demand changing less than 1%
>> from hour to hour, unless perhaps that represented the disk's
>> saturation point, which is not the case if you're not seeing
>> obvious performance problems.
>
> They are not uniform at all -- they correlate perfectly with the
> web traffic; it just so happens that the samples I quoted were from
> peak hours. Take a look at the Munin graph. (The spikes correspond
> to scheduled maintenance tasks like backups.)
>
> Alexander.
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match