Weird irreproducible behaviors in plpython tests

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Weird irreproducible behaviors in plpython tests
Дата
Msg-id 10578.1460322657@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: Weird irreproducible behaviors in plpython tests  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Buildfarm member tick failed today with what appears to be a bug
introduced by (or at least exposed by) the recent plpython ereport
patch.  Presumably the fact that it's failing and no other critters are
has to do with its use of -DCLOBBER_CACHE_ALWAYS and/or
-DRANDOMIZE_ALLOCATED_MEMORY.  However, I cannot reproduce the failure
by using those switches, even though tick's CentOS platform should be
pretty much identical to my RHEL6 machine.  Can anyone else reproduce it?

After failing at that, it occurred to me to try it under valgrind,
and kaboom!  I found a *different* bug, which has apparently been
there a long time.  (I say different because I don't see how this
one could produce tick's symptoms; it's a reference to already-freed
strings, but not an attempt to pfree one.)  I'll be fixing this one
shortly, but now we have another puzzle: why isn't buildfarm member
skink seeing the same failure?  It is running the plpython tests.
Can anyone else reproduce a failure by valgrind'ing the plpython
tests?  It looks here like

==00:00:00:29.670 24996== Invalid read of size 1
==00:00:00:29.670 24996==    at 0x4A07F92: strlen (mc_replace_strmem.c:403)
==00:00:00:29.670 24996==    by 0x826860: MemoryContextStrdup (mcxt.c:1158)
==00:00:00:29.670 24996==    by 0x800441: set_errdata_field (elog.c:1230)
==00:00:00:29.670 24996==    by 0x803EE8: err_generic_string (elog.c:1210)
==00:00:00:29.670 24996==    by 0xDFEF382: PLy_elog (plpy_elog.c:117)
==00:00:00:29.670 24996==    by 0xDFF049D: PLy_procedure_call (plpy_exec.c:1084)
==00:00:00:29.670 24996==    by 0xDFF1BBF: PLy_exec_function (plpy_exec.c:105)
==00:00:00:29.670 24996==    by 0xDFF25C6: plpython_inline_handler (plpy_main.c:345)
==00:00:00:29.670 24996==    by 0x809737: OidFunctionCall1Coll (fmgr.c:1596)
==00:00:00:29.670 24996==    by 0x70E97F: standard_ProcessUtility (utility.c:515)
==00:00:00:29.670 24996==    by 0x70A856: PortalRunUtility (pquery.c:1175)
==00:00:00:29.670 24996==    by 0x70B8FF: PortalRunMulti (pquery.c:1306)
==00:00:00:29.670 24996==  Address 0xefe2894 is 36 bytes inside a block of size 49 free'd
==00:00:00:29.670 24996==    at 0x4A06430: free (vg_replace_malloc.c:446)
==00:00:00:29.670 24996==    by 0x398AE90D5C: ??? (in /usr/lib64/libpython2.6.so.1.0)
==00:00:00:29.670 24996==    by 0x398AE79E3A: ??? (in /usr/lib64/libpython2.6.so.1.0)
==00:00:00:29.670 24996==    by 0x398AE5C586: ??? (in /usr/lib64/libpython2.6.so.1.0)
==00:00:00:29.670 24996==    by 0x398AE5C5BF: ??? (in /usr/lib64/libpython2.6.so.1.0)
==00:00:00:29.670 24996==    by 0x398AE9A704: ??? (in /usr/lib64/libpython2.6.so.1.0)
==00:00:00:29.670 24996==    by 0xDFEECFE: PLy_traceback (plpy_elog.c:358)
==00:00:00:29.670 24996==    by 0xDFEF21E: PLy_elog (plpy_elog.c:83)
==00:00:00:29.670 24996==    by 0xDFF049D: PLy_procedure_call (plpy_exec.c:1084)
==00:00:00:29.670 24996==    by 0xDFF1BBF: PLy_exec_function (plpy_exec.c:105)
==00:00:00:29.670 24996==    by 0xDFF25C6: plpython_inline_handler (plpy_main.c:345)
==00:00:00:29.670 24996==    by 0x809737: OidFunctionCall1Coll (fmgr.c:1596)

the core of the problem being that PLy_get_spi_error_data returns a
pointer to a string that is pointing into the "val" (a/k/a "v") Python
object, which PLy_traceback has carefully released the last refcount on.
So this coding should have failed immediately under any valgrind
testing.  The ereport patch perhaps gave us more scope for the error,
but for me, valgrind'ing the plpython tests fails similarly in 9.5.
So we should have noticed this long since.

I dislike bugs that are platform-dependent with no obvious
reason for it :-(
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Petr Jelinek
Дата:
Сообщение: Re: VS 2015 support in src/tools/msvc
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Weird irreproducible behaviors in plpython tests