Re: Relation extension scalability

From: Amit Kapila
Subject: Re: Relation extension scalability
Date:
Msg-id: CAA4eK1J-fWbZZNJKnrB60gh9T_gEjddcPePnb+TiFehoFeuGdg@mail.gmail.com
In response to: Relation extension scalability  (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: Relation extension scalability  (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-hackers
On Mon, Mar 30, 2015 at 12:26 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>
> Hello,
>
> Currently bigger shared_buffers settings don't combine well with
> relations being extended frequently. Especially if many/most pages have
> a high usagecount and/or are dirty and the system is IO constrained.
>
> As a quick recap, relation extension basically works like:
> 1) We lock the relation for extension
> 2) ReadBuffer*(P_NEW) is being called, to extend the relation
> 3) smgrnblocks() is used to find the new target block
> 4) We search for a victim buffer (via BufferAlloc()) to put the new
>    block into
> 5) If dirty the victim buffer is cleaned
> 6) The relation is extended using smgrextend()
> 7) The page is initialized
>
>
> The problems come from 4) and 5) potentially each taking a fair
> while. If the working set mostly fits into shared_buffers 4) can
> require iterating over all shared buffers several times to find a
> victim buffer. If the IO subsystem is busy and/or we've hit the kernel's
> dirty limits 5) can take a couple seconds.
>

In the past, I have observed in one of the write-oriented tests that
backends have to flush pages themselves many times, so in the above
situation that can lead to a more severe bottleneck.

> I've prototyped solving this for heap relations moving the smgrnblocks()
> + smgrextend() calls to RelationGetBufferForTuple(). With some care
> (including a retry loop) it's possible to only do those two under the
> extension lock. That indeed fixes problems in some of my tests.
>

So does this mean that the problem is caused by contention on the
extension lock?
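
To make the question concrete, here is a toy model of how long one
extension holds the extension lock under the two orderings. It is only
a sketch: it assumes steps 3) to 6) of the recap all run under the lock
today, and that the prototype keeps only smgrnblocks() and smgrextend()
under it. StepCosts, lock_hold_ms and all numbers are made up for
illustration and are not measurements or PostgreSQL code.

```python
from dataclasses import dataclass


@dataclass
class StepCosts:
    """Illustrative per-step costs in milliseconds (made-up numbers)."""
    find_target_block: float  # step 3, smgrnblocks()
    find_victim: float        # step 4, victim search via BufferAlloc()
    clean_victim: float       # step 5, write-out if the victim is dirty
    extend_file: float        # step 6, smgrextend()


def lock_hold_ms(costs: StepCosts, narrow_lock: bool) -> float:
    """Time one extension spends holding the extension lock.

    narrow_lock=False: steps 3-6 all run under the lock (assumed current order).
    narrow_lock=True: only steps 3 and 6 run under the lock (the prototype).
    """
    under_lock = costs.find_target_block + costs.extend_file
    if not narrow_lock:
        under_lock += costs.find_victim + costs.clean_victim
    return under_lock


if __name__ == "__main__":
    # An IO-constrained case: cleaning the dirty victim dominates.
    busy = StepCosts(find_target_block=0.01, find_victim=5.0,
                     clean_victim=2000.0, extend_file=0.1)
    print("wide lock  :", lock_hold_ms(busy, narrow_lock=False), "ms")
    print("narrow lock:", lock_hold_ms(busy, narrow_lock=True), "ms")
```

In this model every other backend that wants to extend the same
relation waits for the whole hold time, so the cost of 4) and 5) is
paid serially by all waiters when the lock is wide.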

> I'm not sure whether the above is the best solution however. 

Another thing to note here is that during extension we extend the
relation by just one block. Wouldn't it make sense to extend it by a
larger number of blocks (we could even take input from the user, who
could specify how much to autoextend a relation when it doesn't have
any empty space)? During mdextend() we might still add just one block,
but we could register a request for a background process to increase
the size, similar to what is done for fsync.
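
As a rough illustration of why a larger increment could help, the
sketch below counts how many times the extension lock would have to be
taken to add a given number of blocks. The extend_by parameter is
hypothetical (no such setting is implied to exist), and the model
ignores the extra cost of each larger extension and any space left
unused.

```python
def lock_acquisitions(new_blocks: int, extend_by: int) -> int:
    """Extension-lock acquisitions needed to add new_blocks blocks
    when every extension adds extend_by blocks at once (hypothetical knob)."""
    if new_blocks < 0 or extend_by < 1:
        raise ValueError("new_blocks must be >= 0 and extend_by must be >= 1")
    # Integer ceiling division: a partial final chunk still needs one extension.
    return (new_blocks + extend_by - 1) // extend_by


if __name__ == "__main__":
    for step in (1, 8, 64):
        print(f"extend_by={step}: {lock_acquisitions(10000, step)} acquisitions")
```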

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
