IA64 versus effective stack limit
From | Tom Lane |
---|---|
Subject | IA64 versus effective stack limit |
Date | |
Msg-id | 21563.1289064886@sss.pgh.pa.us |
Responses | Re: IA64 versus effective stack limit (Greg Stark <gsstark@mit.edu>); Re: IA64 versus effective stack limit (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Sergey was kind enough to lend me use of buildfarm member dugong (IA64, Debian Etch) so I could poke into why its behavior in the recursion-related regression tests was so odd. I had previously tried and failed to reproduce the behavior on a Red Hat IA64 test machine (running RHEL of course) so I was feeling a bit baffled. Here's what I found out:

1. Debian Etch has the make-resets-the-stack-rlimit bug that I reported yesterday, whereas the RHEL version I was testing had the fix for that. So that's why I couldn't reproduce max_stack_depth getting set to 100kB.

2. IA64 is a very weird architecture: it has two separate hardware stacks. One is reserved for saving registers, which IA64 has got a lot of, and the other "normal" stack holds everything else. The method we use in check_stack_depth (i.e., measure the difference in addresses of local variables) effectively measures the depth of the normal stack. I don't know of any simple way to find out the depth of the register stack. You can get gdb to tell you about both stacks, though.

I found out that with PG HEAD, the recursion distance for the "infinite_recurse()" regression test is 160 bytes of normal stack and 928 bytes of register stack per fmgr_sql call level. This is with gcc (I got identical numbers on dugong and the RHEL machine). But if you build PG with icc, as the buildfarm critter is doing, that bloats to 3232 bytes of normal stack and 2832 bytes of register stack. For comparison, my x86_64 Fedora 13 box uses 704 bytes of stack per recursion level.

I don't know why icc is so much worse than gcc on this measure of stack depth consumption, but clearly the combination of that and the 100kB max_stack_depth explains why dugong is failing to do very many levels of recursion before erroring out. Fixing get_stack_depth_rlimit as I proposed yesterday should give it a reasonable stack depth.

However, we're not out of the woods yet.
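
For readers unfamiliar with the technique, the address-difference measurement described above can be sketched in C roughly as follows. This is only an illustration and not PostgreSQL's check_stack_depth: the names stack_depth_from and measure_depth_after are hypothetical, and the sketch assumes a single contiguous memory stack. On IA64 it would see only the normal stack, which is exactly the blind spot discussed in this message.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: estimate how far the memory stack has grown
 * since "base" (the address of a local variable in an outer frame)
 * by comparing it with the address of a local variable here.
 * The absolute difference is used so the direction of stack growth
 * does not matter.
 */
static size_t
stack_depth_from(const char *base)
{
    char        here;
    uintptr_t   a = (uintptr_t) base;
    uintptr_t   b = (uintptr_t) &here;

    return (size_t) (a > b ? a - b : b - a);
}

/*
 * Hypothetical driver: recurse "levels" times, then report the depth
 * measured at the innermost call.  The volatile pad is read after the
 * recursive call so the compiler cannot turn it into a tail call.
 */
static size_t
measure_depth_after(int levels, const char *base)
{
    volatile char pad[64];

    pad[0] = (char) levels;
    if (levels == 0)
        return stack_depth_from(base);
    return measure_depth_after(levels - 1, base) + (size_t) (pad[0] & 0);
}
```

Called with the address of a local variable in an outer function, measure_depth_after reports a larger value for deeper recursion. Nothing in it can observe a second, separately growing stack.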
Because check_stack_depth is only checking the normal stack depth, and the two stacks don't grow at the same rate, it's possible for a crash to occur due to running out of register stack space. We haven't seen that happen on dugong because, as shown above, with icc the register stack grows more slowly than the normal stack (at least for the specific functions we care about here). But with gcc, the same code eats register stack a lot faster than normal stack --- and in fact I observed a crash in the infinite_recurse() test when building with gcc and testing in a manually-started postmaster. The manually-started postmaster was under ulimit -s 8MB, which apparently Debian interprets as "8MB for normal stack and another 8MB for register stack". Even though check_stack_depth was trying to constrain the normal stack to just 2MB, the register stack grew 5.8 times faster and so blew through 8MB before check_stack_depth thought there was a problem. Raising ulimit -s allowed it to work. (Curiously, I did *not* see the same type of crash on the RHEL machine. I surmise that Red Hat has tweaked the kernel to allow the register stack to grow more than the normal stack, but I haven't tried to verify that.)

So this means we have a problem. To some extent it's new in HEAD: before the changes I made last week to not keep a local FunctionCallInfoData in ExecMakeFunctionResult, there would have been at least another 900 bytes of normal stack per recursion level, so even with gcc the register stack would grow slower than normal stack in this test, and you wouldn't have seen any crash in the regression tests. But I'm sure there are lots of other potentially recursive routines in PG where register stack could grow faster than normal stack, so we shouldn't suppose that this fmgr_sql recursion is the only trouble spot.

As I said above, I don't know of any good way to measure register stack depth directly.
It's probably possible to find out by asking the kernel or something like that, but we surely do not want to introduce a kernel call into check_stack_depth(). So a good solution for this is hard to see. The best idea I have at the moment is to reduce the reported stack limit by some arbitrary factor, i.e. do something like

    #ifdef __IA64__
        val /= 8;
    #endif

in get_stack_depth_rlimit(). Anyone have a better idea?

BTW, this also suggests to me that it'd be a real good idea to have a buildfarm critter for IA64+gcc --- the differences between gcc and icc are clearly pretty significant on this hardware.

regards, tom lane
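
The proposal above, together with the arithmetic behind it, can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual get_stack_depth_rlimit(): the function names, the handling of an unlimited rlimit, and the IA64_STACK_DIVISOR macro are all hypothetical, and the `__IA64__` test is copied from the message rather than checked against any particular compiler.

```c
#include <limits.h>
#include <sys/resource.h>

/* Hypothetical name for the arbitrary factor of 8 suggested in the message. */
#define IA64_STACK_DIVISOR 8

/* Reduce a stack limit when a separate register stack must also fit. */
static long
reduce_for_register_stack(long limit_bytes, int has_register_stack)
{
    if (has_register_stack && limit_bytes > 0)
        return limit_bytes / IA64_STACK_DIVISOR;
    return limit_bytes;
}

/*
 * Hypothetical stand-in for a get_stack_depth_rlimit()-style function:
 * returns the soft stack limit in bytes, or -1 if it cannot be read.
 * An unlimited or oversized limit is clamped to LONG_MAX (an assumption
 * made for this sketch).
 */
static long
stack_rlimit_bytes(void)
{
    struct rlimit rlim;
    long        val;

    if (getrlimit(RLIMIT_STACK, &rlim) < 0)
        return -1;
    if (rlim.rlim_cur == RLIM_INFINITY || rlim.rlim_cur >= (rlim_t) LONG_MAX)
        val = LONG_MAX;
    else
        val = (long) rlim.rlim_cur;
#ifdef __IA64__                 /* macro name as written in the message */
    val = reduce_for_register_stack(val, 1);
#endif
    return val;
}

/*
 * Register-stack bytes consumed by the time the normal stack reaches
 * normal_limit, given the per-recursion-level growth of each stack.
 */
static long
register_stack_at(long normal_limit, long normal_per_level,
                  long register_per_level)
{
    long        levels = normal_limit / normal_per_level;

    return levels * register_per_level;
}
```

With the gcc figures quoted earlier (160 bytes of normal stack and 928 bytes of register stack per level), register_stack_at(2MB, 160, 928) comes to roughly 11.6MB, well past an 8MB register stack. After dividing an 8MB limit by 8, the same call with a 1MB cap gives roughly 5.8MB, which fits.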