Re: PATCH: optimized DROP of multiple tables within a transaction

From Robert Haas
Subject Re: PATCH: optimized DROP of multiple tables within a transaction
Date
Msg-id CA+TgmoYG93bukBAvt8xO5kC=NwgjrfFfu7QHsc=PmOGGk4PM0A@mail.gmail.com
In response to Re: PATCH: optimized DROP of multiple tables within a transaction  ("Tomas Vondra" <tv@fuzzy.cz>)
List pgsql-hackers
On Thu, Aug 30, 2012 at 3:17 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> On 30 August 2012, 17:53, Robert Haas wrote:
>> On Fri, Aug 24, 2012 at 6:36 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>> attached is a patch that improves performance when dropping multiple
>>> tables within a transaction. Instead of scanning the shared buffers for
>>> each table separately, the patch removes this and evicts all the tables
>>> in a single pass through shared buffers.
>>>
>>> Our system creates a lot of "working tables" (even 100,000) and we need
>>> to perform garbage collection (dropping obsolete tables) regularly. This
>>> often took ~ 1 hour, because we're using big AWS instances with lots of
>>> RAM (which tends to be slower than RAM on bare hw). After applying this
>>> patch and dropping tables in groups of 100, the gc runs in less than 4
>>> minutes (i.e. a 15x speed-up).
>>>
>>> This is not likely to improve usual performance, but for systems like
>>> ours, this patch is a significant improvement.
>>
>> Seems pretty reasonable.  But instead of duplicating so much code,
>> couldn't we find a way to replace DropRelFileNodeAllBuffers with
>> DropRelFileNodeAllBuffersList?  Surely anyone who was planning to call
>> the first one could instead call the second one with a count of one
>> and a pointer to the address of the data they were planning to pass.
>> I'd probably swap the order of arguments to
>> DropRelFileNodeAllBuffersList as well.  We could do something similar
>> with smgrdounlink/smgrdounlinkall so that, again, only one copy of the
>> code is needed.
>
> Yeah, I was thinking about that too, but I simply wasn't sure which is the
> best choice so I've sent the raw patch. OTOH these functions are called in
> a very limited number of places, so a refactoring like this seems fine.

If there are enough call sites then DropRelFileNodeAllBuffers could
become a one-line function that simply calls
DropRelFileNodeAllBuffersList(1, &arg).  But we should avoid
duplicating the code.
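
To make the shape of that refactoring concrete, here is a rough,
self-contained sketch. It is not the patch and not the real bufmgr
code: RelId, Buffer, buffer_pool, NBUFFERS and both function names are
hypothetical stand-ins, and locking, dirty-buffer handling and
everything else the real functions must do is left out. It only shows
a list variant that evicts in a single pass, plus a single-relation
entry point reduced to a one-line wrapper around it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical relation identifier; NOT PostgreSQL's RelFileNode. */
typedef struct RelId
{
    unsigned spc;
    unsigned db;
    unsigned rel;
} RelId;

/* Hypothetical, heavily simplified buffer header. */
typedef struct Buffer
{
    RelId owner;
    bool  valid;
} Buffer;

#define NBUFFERS 8              /* toy pool size, assumption for the sketch */
Buffer buffer_pool[NBUFFERS];   /* zero-initialized: all buffers invalid */

static bool
rel_equal(const RelId *a, const RelId *b)
{
    return a->spc == b->spc && a->db == b->db && a->rel == b->rel;
}

/*
 * List variant: walk the pool once and invalidate every buffer that
 * belongs to any of the nnodes relations. Returns the number evicted.
 */
int
drop_rel_buffers_list(int nnodes, const RelId *nodes)
{
    int evicted = 0;

    assert(nnodes >= 0 && (nnodes == 0 || nodes != NULL));

    for (int i = 0; i < NBUFFERS; i++)
    {
        if (!buffer_pool[i].valid)
            continue;

        for (int j = 0; j < nnodes; j++)
        {
            if (rel_equal(&buffer_pool[i].owner, &nodes[j]))
            {
                buffer_pool[i].valid = false;
                evicted++;
                break;          /* buffer is gone, stop checking the list */
            }
        }
    }
    return evicted;
}

/* Single-relation entry point: a one-line wrapper, no duplicated loop. */
int
drop_rel_buffers(RelId node)
{
    return drop_rel_buffers_list(1, &node);
}
```

In this sketch the pool is scanned once per call no matter how many
relations are passed in, so existing single-relation callers keep
working unchanged while batch callers pay for one scan per batch.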

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


