On Wed, May 15, 2013 at 4:34 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
>
> On 2013-05-15 16:26:15 -0500, Jon Nelson wrote:
>> >> I have written up a patch to use posix_fallocate in new WAL file
>> >> creation, including configuration by way of a GUC variable, but I've
>> >> not contributed to the PostgreSQL project before. Therefore, I'm
>> >> fairly certain the patch is not formatted properly and does not conform to the
>> >> appropriate style guides. Currently, the patch is based on 9.2, and is
>> >> quite small in size - 3.6KiB.
>>
>> I have rebased and reformatted the code, and basic testing shows a
>> fairly significant reduction in WAL-file creation time.
>> I ran 'make test' and did additional local testing without issue.
>> Therefore, I am attaching the patch. I will try to add it to the
>> commitfest page.
>
> Some quick comments, without thinking about this:
Thank you for the kind feedback.
> * needs a configure check for posix_fallocate. The current version will
> e.g. fail to compile on Windows or many other non-Linux systems. Check
> how it's done for posix_fadvise.
I will address this as soon as I am able.
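For illustration, here is a minimal sketch of the kind of compile-time guard
being asked for. It assumes the configure check defines a macro named
HAVE_POSIX_FALLOCATE (the name autoconf's AC_CHECK_FUNCS would generate);
the helper name preallocate_file and the 8 kB buffer are hypothetical and not
taken from the patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): reserve "size" bytes for fd.
 * HAVE_POSIX_FALLOCATE is assumed to come from a configure check.
 * Returns 0 on success or an errno-style code on failure.
 */
int
preallocate_file(int fd, off_t size)
{
    assert(size > 0);

#ifdef HAVE_POSIX_FALLOCATE
    /* posix_fallocate() already returns the error code directly. */
    return posix_fallocate(fd, 0, size);
#else
    /* Portable fallback: fill the file with zeros, one block at a time. */
    char        zbuf[8192];
    off_t       written = 0;

    memset(zbuf, 0, sizeof(zbuf));
    if (lseek(fd, 0, SEEK_SET) < 0)
        return errno;

    while (written < size)
    {
        size_t      chunk = sizeof(zbuf);
        ssize_t     n;

        if ((off_t) chunk > size - written)
            chunk = (size_t) (size - written);

        n = write(fd, zbuf, chunk);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted, retry the same chunk */
        if (n <= 0)
            return (n < 0) ? errno : ENOSPC;    /* treat a 0-byte write as out of space */
        written += n;
    }
    return 0;
#endif
}
```

On platforms where the macro is not defined, only the zero-fill branch is
compiled, so the build does not depend on posix_fallocate being present.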
> * Is WAL file creation performance actually relevant? Is the performance
> of a system running on fallocate()d WAL files any different?
In my limited testing, I noticed a drop of approx. 100ms per WAL file.
I do not have a good idea for how to really stress the WAL-file
creation area without calling pg_start_backup and pg_stop_backup over
and over (with archiving enabled).
However, a file allocated with fallocate is (supposed to be) less
fragmented than one created by the traditional means.
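One way to measure the creation cost in isolation, without going through
pg_start_backup and pg_stop_backup, would be a small standalone timing
harness along these lines. This is only a sketch: the function names are
hypothetical, the 16 MiB size is assumed to match the default WAL segment
size, and the zero-fill loop is a simplified stand-in for the existing
creation path, not a copy of it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define SEGMENT_SIZE (16 * 1024 * 1024)     /* assumed WAL segment size */

/* Create "path" by writing zeros, then fsync. Returns 0 on success. */
int
create_zero_filled(const char *path)
{
    char        buf[8192];
    long        off;
    int         fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    memset(buf, 0, sizeof(buf));
    for (off = 0; off < SEGMENT_SIZE; off += (long) sizeof(buf))
    {
        if (write(fd, buf, sizeof(buf)) != (ssize_t) sizeof(buf))
        {
            close(fd);
            return -1;
        }
    }
    if (fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Create "path" with posix_fallocate, then fsync. Returns 0 on success. */
int
create_fallocated(const char *path)
{
    int         fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (posix_fallocate(fd, 0, SEGMENT_SIZE) != 0 || fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Elapsed milliseconds for one call of create(path); negative on failure. */
double
time_creation_ms(int (*create) (const char *), const char *path)
{
    struct timespec t0, t1;

    assert(create != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (create(path) != 0)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1000.0 +
        (t1.tv_nsec - t0.tv_nsec) / 1e6;
}
```

A single run is noisy, so each variant would need to be repeated many times
on the target filesystem before drawing conclusions. Note that this only
measures creation time; it says nothing about the second question, whether
a system running on fallocate()d files behaves differently afterwards.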
> * According to the man page posix_fallocate doesn't set errno but rather
> returns the error code.
That's true. I originally wrote the patch using fallocate(2). What
would be appropriate here? Should I switch on the return value and the
six (6) or so relevant error codes?
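One possible alternative to a switch, sketched below under stated
assumptions: since posix_fallocate returns the error number directly, the
return value can be copied into errno so that existing errno-based
reporting (strerror and friends) keeps working unchanged. The names
fallocate_set_errno and create_preallocated are hypothetical, and the
fprintf calls merely stand in for whatever error reporting the real code
would use.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: posix_fallocate() reports failure through its
 * return value and is not required to set errno. Copy the code into errno
 * so callers can use the usual "-1 and errno" convention.
 */
int
fallocate_set_errno(int fd, off_t len)
{
    int         rc;

    assert(len > 0);            /* a zero length would just yield EINVAL */

    rc = posix_fallocate(fd, 0, len);
    if (rc != 0)
    {
        errno = rc;
        return -1;
    }
    return 0;
}

/*
 * Illustrative caller: create a new file of "len" bytes and return its
 * descriptor, or -1 with errno set. The partial file is removed on failure.
 */
int
create_preallocated(const char *path, off_t len)
{
    int         fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
    {
        fprintf(stderr, "could not create \"%s\": %s\n", path, strerror(errno));
        return -1;
    }

    if (fallocate_set_errno(fd, len) != 0)
    {
        int         save_errno = errno;

        fprintf(stderr, "could not allocate \"%s\": %s\n",
                path, strerror(save_errno));
        close(fd);
        unlink(path);
        errno = save_errno;     /* close/unlink may have clobbered it */
        return -1;
    }
    return fd;
}
```

With this shape, individual codes such as ENOSPC or EBADF would not need
their own branches unless one of them calls for special handling.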
> * I wonder whether we ever want to actually disable this? AFAIR the libc
> contains emulation for posix_fallocate if the filesystem doesn't support
> it.
I know that glibc does, but I don't know about other libc implementations.
--
Jon