Re: logical decoding : exceeded maxAllocatedDescs for .spill files

From Noah Misch
Subject Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Msg-id 20200109053704.GA2502006@rfd.leadboat.com
In response to Re: logical decoding : exceeded maxAllocatedDescs for .spill files  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: logical decoding : exceeded maxAllocatedDescs for .spill files
TestLib condition for deleting temporary directories
List pgsql-hackers
On Wed, Jan 08, 2020 at 02:50:53PM +0530, Amit Khandekar wrote:
> On Sun, 5 Jan 2020 at 00:21, Noah Misch <noah@leadboat.com> wrote:
> > The buildfarm client can capture stack traces, but it currently doesn't do so
> > for TAP test suites (search the client code for get_stack_trace).  If someone
> > feels like writing a fix for that, it would be a nice improvement.  Perhaps,
> > rather than having the client code know all the locations where core files
> > might appear, failed runs should walk the test directory tree for core files?
> 
> I think this might end up having the same code to walk the directory
> spread out on multiple files. Instead, I think in the build script, in
> get_stack_trace(), we can do an equivalent of "find <inputdir> -name
> "*core*" , as against the current way in which it looks for core files
> only in the specific data directory.

Agreed.
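
As a rough illustration of the tree walk discussed above, the sketch below wraps the quoted find invocation in a shell function. The function name find_core_files is hypothetical and is not part of the buildfarm client; the real get_stack_trace() is Perl and may need a different approach.

```shell
# find_core_files DIR
# Print every regular file under DIR whose name contains "core".
# Hypothetical helper for illustration only; not buildfarm client code.
find_core_files() {
    find "$1" -type f -name '*core*' -print
}
```

Note that the '*core*' pattern is broad and would also match unrelated files such as "score.txt", so a real implementation would probably want a tighter pattern or a check on the file type.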

> Noah, is it possible to run a patch'ed build script once I submit a
> patch, so that we can quickly get the stack trace ? I mean, can we do
> this before getting the patch committed ? I guess, we can run the
> build script with a single branch specified, right ?

Yes to all questions, but it would not have helped in this case.  First, v10
deletes PostgresNode base directories at the end of this test file, despite
the failure[1].  Second, the stack trace was minimal:

  (gdb) bt       
  #0  0xd011119c in extend_brk () from /usr/lib/libc.a(shr.o)

Even so, a web search for "extend_brk" led to the answer.  By default, 32-bit
AIX binaries get only 256M of RAM for stack and sbrk.  The new regression test
used more than that, hence this crash.  Setting LDR_CNTRL=MAXDATA=0x80000000
in the environment cured the crash.  I've put that in the buildfarm member
configuration and started a new run.
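
For reference, a minimal sketch of setting that variable in a POSIX shell before launching a run. Where exactly it belongs (a login profile, a wrapper script, or the buildfarm member configuration) is an assumption left to the reader; the arithmetic only shows how the value compares with the 256M default mentioned above.

```shell
# Assumption: executed in the shell that starts the build or test run
# on 32-bit AIX, so that child processes inherit the setting.
LDR_CNTRL=MAXDATA=0x80000000
export LDR_CNTRL

# 0x80000000 bytes is 2 GiB; divide by the 256 MiB default for comparison.
segments=$(( 0x80000000 / (256 * 1024 * 1024) ))
echo "MAXDATA is $segments times the 256M default"
```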

(PostgreSQL documentation actually covers this problem:
https://www.postgresql.org/docs/devel/installation-platform-notes.html#INSTALLATION-NOTES-AIX)


[1] It has the all_tests_passing() logic in an attempt to stop this.  I'm
guessing it didn't help because the file failed by calling die "connection
error: ...", not by reporting a failure to Test::More via ok(0) or similar.


