Re: odd postgresql performance (excessive lseek)

From Jon Nelson
Subject Re: odd postgresql performance (excessive lseek)
Msg-id AANLkTi=RTJMHXN_zMcYdDCRnd4zDDMW=-PEB1egpCoRS@mail.gmail.com
In reply to Re: odd postgresql performance (excessive lseek)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: odd postgresql performance (excessive lseek)
List pgsql-performance
On Tue, Oct 19, 2010 at 9:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>> This is another situation where using pread would have saved a lot of
>> time and sped things up a bit, but failing that, keeping track of the
>> file position ourselves and only lseek'ing when necessary would also
>> help.
>
> No, it wouldn't; you don't have the slightest idea what's going on
> there.  Those lseeks are for the purpose of detecting the current EOF
> location, ie, finding out whether some other backend has extended the
> file recently.  We could get rid of them, but only at the cost of
> putting in some other communication mechanism instead.

That's a little harsh (it's not untrue, though).

It's true I don't know how postgresql works WRT how it manages files,
but now I've been educated (some). I'm guessing, then, that because
each backend may extend files without the other backends knowing of
it, using fallocate or some such is also likely a non-starter. I
ask because, especially when allocating files 8KB at a time, file
fragmentation on a busy system is potentially high. I recently saw an
ext3 filesystem (dedicated to postgresql) with 38% file fragmentation
and, yes, it does make a huge performance difference in some cases.
After manually defragmenting some files (with pg offline) I saw read
speed increase from single-digit MB per second to
high-double-digit MB per second.  However, after asking pg to rewrite
some of the worst files (by way of CLUSTER or ALTER TABLE) I saw no
improvement - I'm guessing due to the 8KB-at-a-time allocation
mechanism.
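
To make the lseek point from the quoted reply concrete, here is a minimal
sketch (not PostgreSQL's actual code) of probing the current EOF with
lseek(SEEK_END) so that one process notices a file extended through a
different descriptor. The 8192-byte block size matches the 8KB figure
above; the function names and the temporary file are hypothetical and
exist only for illustration. It also shows that a pread-style read takes
an explicit offset, so it does not replace the EOF probe.

```python
import os
import tempfile

BLCKSZ = 8192  # assumed block size, matching the 8KB extension unit discussed above


def nblocks(fd):
    """Return the number of whole blocks in the file by seeking to its end."""
    return os.lseek(fd, 0, os.SEEK_END) // BLCKSZ


def extend_by_one_block(fd):
    """Append one zero-filled block at the current end of the file."""
    end = os.lseek(fd, 0, os.SEEK_END)
    os.pwrite(fd, bytes(BLCKSZ), end)


def demo():
    """Extend a file through one descriptor and observe it through another."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "relation_segment")  # hypothetical file name
        reader = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        writer = os.open(path, os.O_RDWR)
        try:
            before = nblocks(reader)
            extend_by_one_block(writer)  # stands in for "some other backend"
            after = nblocks(reader)      # the lseek probe sees the new EOF
            # pread reads at an explicit offset; it says nothing about where EOF is.
            block = os.pread(reader, BLCKSZ, 0)
            return before, after, len(block)
        finally:
            os.close(reader)
            os.close(writer)
```

Calling demo() returns the block count seen by the reader before and
after the other descriptor extends the file, plus the length of the block
read back with pread.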

Has any work been done on making use of shared memory for file stats
or using fallocate (or posix_fallocate) to allocate files in larger
chunks?
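
As a rough illustration of what "larger chunks" could mean, the sketch
below reserves space with posix_fallocate in 1 MB steps instead of one
8KB block at a time. This is only an assumption-laden example, not
anything PostgreSQL does: the chunk size, the reserve_chunk helper, and
the file name are all hypothetical.

```python
import os
import tempfile

BLCKSZ = 8192        # assumed block size (the 8KB unit mentioned above)
CHUNK_BLOCKS = 128   # hypothetical chunk size: 128 blocks = 1 MB


def reserve_chunk(fd, used_bytes, chunk_blocks=CHUNK_BLOCKS):
    """Make sure the file is allocated up to the chunk boundary after used_bytes.

    Returns the allocated size in bytes after the call.
    """
    chunk = chunk_blocks * BLCKSZ
    target = (used_bytes // chunk + 1) * chunk
    current = os.fstat(fd).st_size
    if target > current:
        # Ask the filesystem for the whole missing range in one request.
        os.posix_fallocate(fd, current, target - current)
    return max(target, current)


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "relation_segment")  # hypothetical file name
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            first = reserve_chunk(fd, 0)              # empty file: reserve 1 MB
            same = reserve_chunk(fd, 10 * BLCKSZ)     # still inside the first chunk
            grown = reserve_chunk(fd, 128 * BLCKSZ)   # first chunk full: reserve another
            return first, same, grown, os.fstat(fd).st_size
        finally:
            os.close(fd)
```

One consequence worth noting: posix_fallocate also grows the reported
file size, so an lseek(SEEK_END) probe would count the reserved blocks as
well. The logical end of the data would then have to be tracked some
other way, which is the kind of extra communication mechanism mentioned
in the quoted reply.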

--
Jon
