Обсуждение: Double linking MemoryContext children

Поиск
Список
Период
Сортировка

Double linking MemoryContext children

От
Jan Wieck
Дата:
The attached script demonstrates an O(N^2) problem we recently became
aware of. The script simply executes a large anonymous code block that
doesn't do anything useful. Usage is

     ./t1.py [NUM_STMTS] | psql [DBNAME]

NUM_STMTS defaults to 20,000. The code block executes rather fast, but
the entire script runs for over a minute (at 20,000 statements) on a
2.66 GHz Xeon.

The time is spent when all the prepared SPI statements are freed at the
end of execution. All prepared SPI statements are children of the Cache
context. MemoryContext children are a single linked list where new
members are inserted at the head. This works best when children are
created and destroyed in a last-in-first-out pattern. SPI however
destroys the SPI statements in the order they were created, forcing
MemoryContextSetParent() to traverse the entire linked list for each child.

The attached patch makes MemoryContext children a double linked list
that no longer needs any list traversal no find the position of the
child within the list.


Regards, Jan


--
Jan Wieck
Senior Software Engineer
http://slony.info

Вложения

Re: Double linking MemoryContext children

От
Tom Lane
Дата:
Jan Wieck <jan@wi3ck.info> writes:
> The attached script demonstrates an O(N^2) problem we recently became 
> aware of. The script simply executes a large anonymous code block that 
> doesn't do anything useful. Usage is

>      ./t1.py [NUM_STMTS] | psql [DBNAME]

> NUM_STMTS defaults to 20,000. The code block executes rather fast, but 
> the entire script runs for over a minute (at 20,000 statements) on a 
> 2.66 GHz Xeon.

> The time is spent when all the prepared SPI statements are freed at the 
> end of execution. All prepared SPI statements are children of the Cache 
> context. MemoryContext children are a single linked list where new 
> members are inserted at the head. This works best when children are 
> created and destroyed in a last-in-first-out pattern. SPI however 
> destroys the SPI statements in the order they were created, forcing 
> MemoryContextSetParent() to traverse the entire linked list for each child.

> The attached patch makes MemoryContext children a double linked list 
> that no longer needs any list traversal no find the position of the 
> child within the list.

Seems less invasive to fix SPI to delete in the opposite order?
        regards, tom lane



Re: Double linking MemoryContext children

От
Jan Wieck
Дата:
On 09/11/2015 09:38 AM, Tom Lane wrote:
> Jan Wieck <jan@wi3ck.info> writes:
>> The attached script demonstrates an O(N^2) problem we recently became
>> aware of. The script simply executes a large anonymous code block that
>> doesn't do anything useful. Usage is
>
>>      ./t1.py [NUM_STMTS] | psql [DBNAME]
>
>> NUM_STMTS defaults to 20,000. The code block executes rather fast, but
>> the entire script runs for over a minute (at 20,000 statements) on a
>> 2.66 GHz Xeon.
>
>> The time is spent when all the prepared SPI statements are freed at the
>> end of execution. All prepared SPI statements are children of the Cache
>> context. MemoryContext children are a single linked list where new
>> members are inserted at the head. This works best when children are
>> created and destroyed in a last-in-first-out pattern. SPI however
>> destroys the SPI statements in the order they were created, forcing
>> MemoryContextSetParent() to traverse the entire linked list for each child.
>
>> The attached patch makes MemoryContext children a double linked list
>> that no longer needs any list traversal no find the position of the
>> child within the list.
>
> Seems less invasive to fix SPI to delete in the opposite order?

That would assume that there are no other code paths that destroy memory 
contexts out of LIFO order.


Regards, Jan

-- 
Jan Wieck
Senior Software Engineer
http://slony.info



Re: Double linking MemoryContext children

От
Tom Lane
Дата:
Jan Wieck <jan@wi3ck.info> writes:
> On 09/11/2015 09:38 AM, Tom Lane wrote:
>> Seems less invasive to fix SPI to delete in the opposite order?

> That would assume that there are no other code paths that destroy memory 
> contexts out of LIFO order.

Given that we've not seen this sort of problem reported in the dozen or
more years the code has been like that, I'm a bit dubious that we need
to worry about that.  It's possible you're right that we need to expend
more space in the MemoryContext lists --- but I'd like to see more than
one example.
        regards, tom lane



Re: Double linking MemoryContext children

От
Jan Wieck
Дата:
On 09/11/2015 09:58 AM, Tom Lane wrote:
> Jan Wieck <jan@wi3ck.info> writes:
>> On 09/11/2015 09:38 AM, Tom Lane wrote:
>>> Seems less invasive to fix SPI to delete in the opposite order?
>
>> That would assume that there are no other code paths that destroy memory
>> contexts out of LIFO order.
>
> Given that we've not seen this sort of problem reported in the dozen or
> more years the code has been like that, I'm a bit dubious that we need
> to worry about that.  It's possible you're right that we need to expend
> more space in the MemoryContext lists --- but I'd like to see more than
> one example.

I added a bit of debug code to MemoryContextSetParent(), tracking loop 
iterations per context name. This is what happens during "make check" of 
current master:
                 name                 | count  | not_first | iterations
--------------------------------------+--------+-----------+------------ CacheMemoryContext                   |   6271
|     2634 |     104846 ExecutorState                        | 222290 |      2951 |       7176 TopMemoryContext
           |  28464 |       620 |       2716 SPI Proc                             |   8424 |       225 |        381
PortalMemory                        |  24616 |        31 |        291 TopTransactionContext                |  19312 |
   114 |        141 ExprContext                          |   3317 |         1 |          1
 

There was a total of 6,271 calls to MemoryContextSetParent() where the 
old parent is the CacheMemoryContext. For 2,634 of those the context is 
not parent->firstchild and this lead to 104,846 loop iterations total.

The remaining numbers indicate that other contexts are mostly used in 
the intended fashion, but not strictly. This means there is definitely 
potential for more edge cases, similar to the SPI example above.


Regards, Jan

-- 
Jan Wieck
Senior Software Engineer
http://slony.info



Re: Double linking MemoryContext children

От
Kevin Grittner
Дата:
Jan Wieck <jan@wi3ck.info> wrote:

>>> On 09/11/2015 09:38 AM, Tom Lane wrote:
>>>> Seems less invasive to fix SPI to delete in the opposite order?

> The remaining numbers indicate that other contexts are mostly used in
> the intended fashion, but not strictly. This means there is definitely
> potential for more edge cases, similar to the SPI example above.

I guess the question is whether to add a pointer for each memory
context to give the MemoryContextSetParent() function O(1)
performance characteristics or add comments in front of this
function to document how callers should organize their code to
avoid O(N^2) performance.  I generally prefer that callers of such
a function need not be that aware of implementation details, so I
would prefer the former.

On the other hand, a grep indicates that there are two places that
MemoryContextData.nextchild is set (and we therefore probably need
to also set the new field), and Jan's proposed patch only changes
one of them.  If we do this, I think we need to change both places
that are affected, so ResourceOwnerCreate() in resowner.c would
need a line or two added.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Double linking MemoryContext children

От
Tom Lane
Дата:
Kevin Grittner <kgrittn@ymail.com> writes:
> I guess the question is whether to add a pointer for each memory
> context to give the MemoryContextSetParent() function O(1)
> performance characteristics or add comments in front of this
> function to document how callers should organize their code to
> avoid O(N^2) performance.  I generally prefer that callers of such
> a function need not be that aware of implementation details, so I
> would prefer the former.

On reflection, it's a bit silly to complain about one extra pointer per
MemoryContext, when the memory represented by the context is going to be
at least one kilobyte and usually a lot more.  So I withdraw my objection
to the concept.  I concur that we'd just as soon not worry about what
order things are done in.

> On the other hand, a grep indicates that there are two places that
> MemoryContextData.nextchild is set (and we therefore probably need
> to also set the new field), and Jan's proposed patch only changes
> one of them.  If we do this, I think we need to change both places
> that are affected, so ResourceOwnerCreate() in resowner.c would
> need a line or two added.

Um.  Sounds like it needs some actual code review then ...
        regards, tom lane



Re: Double linking MemoryContext children

От
Jan Wieck
Дата:
On 09/12/2015 11:35 AM, Kevin Grittner wrote:

> On the other hand, a grep indicates that there are two places that
> MemoryContextData.nextchild is set (and we therefore probably need
> to also set the new field), and Jan's proposed patch only changes
> one of them.  If we do this, I think we need to change both places
> that are affected, so ResourceOwnerCreate() in resowner.c would
> need a line or two added.

ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not 
MemoryContextData.nextchild.


Regards, Jan

-- 
Jan Wieck
Senior Software Engineer
http://slony.info



Re: Double linking MemoryContext children

От
Jim Nasby
Дата:
On 9/14/15 7:24 AM, Jan Wieck wrote:
> On 09/12/2015 11:35 AM, Kevin Grittner wrote:
>
>> On the other hand, a grep indicates that there are two places that
>> MemoryContextData.nextchild is set (and we therefore probably need
>> to also set the new field), and Jan's proposed patch only changes
>> one of them.  If we do this, I think we need to change both places
>> that are affected, so ResourceOwnerCreate() in resowner.c would
>> need a line or two added.
>
> ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not
> MemoryContextData.nextchild.

Anything ever happen with this? </Momjian-Mode>
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Double linking MemoryContext children

От
Thom Brown
Дата:
On 7 December 2015 at 01:30, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 9/14/15 7:24 AM, Jan Wieck wrote:
>>
>> On 09/12/2015 11:35 AM, Kevin Grittner wrote:
>>
>>> On the other hand, a grep indicates that there are two places that
>>> MemoryContextData.nextchild is set (and we therefore probably need
>>> to also set the new field), and Jan's proposed patch only changes
>>> one of them.  If we do this, I think we need to change both places
>>> that are affected, so ResourceOwnerCreate() in resowner.c would
>>> need a line or two added.
>>
>>
>> ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not
>> MemoryContextData.nextchild.
>
>
> Anything ever happen with this? </Momjian-Mode>

Yeah, it was committed... a few mins ago.

Thom



Re: Double linking MemoryContext children

От
Kevin Grittner
Дата:
On Sun, Dec 6, 2015 at 7:30 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 9/14/15 7:24 AM, Jan Wieck wrote:
>> On 09/12/2015 11:35 AM, Kevin Grittner wrote:
>>
>>> On the other hand, a grep indicates that there are two places that
>>> MemoryContextData.nextchild is set (and we therefore probably need
>>> to also set the new field), and Jan's proposed patch only changes
>>> one of them.  If we do this, I think we need to change both places
>>> that are affected, so ResourceOwnerCreate() in resowner.c would
>>> need a line or two added.
>>
>> ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not
>> MemoryContextData.nextchild.
>
> Anything ever happen with this? </Momjian-Mode>

Jan was right; the code for operating on resource owners was
similar enough that I mistook it for memory context code in a quick
review of grep results looking for any places that Jan might have
missed.  I went over it all again and couldn't resist adding an
Assert() at one point, but otherwise it looks good.

An optimized build without assertions runs his 20000 statement DO
test in 25646.811 ms without the patch and 2933.754 ms with the
patch.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company