Re: BUG #15367: Crash in pg_fe_scram_free when using foreign tables

From: Tom Lane
Subject: Re: BUG #15367: Crash in pg_fe_scram_free when using foreign tables
Message-ID: 18398.1536410055@sss.pgh.pa.us
In response to: Re: BUG #15367: Crash in pg_fe_scram_free when using foreign tables  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: BUG #15367: Crash in pg_fe_scram_free when using foreign tables
List: pgsql-bugs

I wrote:
> We're still no closer to an explanation of Jeremy's failure, though
> I'm now pretty sure that pg_saslprep itself isn't the issue.

I had an idea about that --- it's probably all wet, but the code as
written seems bulletproof enough that I'm forced to postulate something
very strange is happening.

Observe that there are two copies of pg_saslprep() in play: there is one
in the backend, which is compiled to allocate its result with palloc,
and there is one in libpq, which is compiled to allocate its result
with malloc.  Could it be that somehow, when libpq is loaded into the
backend address space as it is here, libpq winds up calling the backend's
copy of pg_saslprep rather than its own?  That would work just fine,
until libpq tried to free the returned string using free(), and then
we'd get exactly the reported error.
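
To make the suspected mismatch concrete, here is a minimal sketch of why a palloc-style pointer cannot safely be handed to free().  Everything in it is a hypothetical stand-in: toy_palloc, toy_pfree, toy_malloc_base and the two toy_saslprep_* functions are invented for illustration and are not the real PostgreSQL code (the real palloc carves chunks out of larger memory-context blocks rather than calling malloc per chunk).  The only point is that the pointer a header-prefixed allocator returns is not the pointer malloc returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the bookkeeping a palloc-style allocator keeps
 * in front of every chunk it hands out.
 */
typedef struct ToyChunkHeader
{
    size_t      size;
    void       *context;    /* would identify the owning memory context */
} ToyChunkHeader;

/* Toy "backend" allocator: the caller gets a pointer just past the header. */
void *
toy_palloc(size_t size)
{
    ToyChunkHeader *hdr = malloc(sizeof(ToyChunkHeader) + size);

    assert(hdr != NULL);
    hdr->size = size;
    hdr->context = NULL;
    return hdr + 1;
}

/* Recover the address malloc originally returned for a toy_palloc chunk. */
void *
toy_malloc_base(void *chunk)
{
    return (ToyChunkHeader *) chunk - 1;
}

/* The only correct way to release a toy_palloc chunk. */
void
toy_pfree(void *chunk)
{
    free(toy_malloc_base(chunk));
}

/* "Backend build" of the same routine: result comes from toy_palloc. */
char *
toy_saslprep_backend(const char *input)
{
    char       *result = toy_palloc(strlen(input) + 1);

    strcpy(result, input);
    return result;
}

/* "Frontend build" of the same routine: result comes from plain malloc. */
char *
toy_saslprep_frontend(const char *input)
{
    char       *result = malloc(strlen(input) + 1);

    assert(result != NULL);
    strcpy(result, input);
    return result;
}
```

Both variants hand back an equally usable string, which is why a caller would not notice which copy it reached.  The difference only matters at release time: the frontend result can go to free(), while the backend result has to go to toy_pfree(), because passing it to free() would give the C library an address it never allocated, which is undefined behavior.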

The main weakness in this theory is that it suggests that Jeremy's
postgres_fdw connections ought to be falling over more easily than
they are.  However, I think that postgres_fdw will never explicitly
close a PGconn unless it's forced to; during a normal backend session
exit, the process just dies without going through PQfinish, so that
the problem would be masked.  The only way to get to the PQfinish
call shown in the backtrace is for pgfdw_inval_callback to mark the
connection invalid, which'd require either an update on the relevant
foreign server object, an update on the user mapping in use, or an
SI cache reset.  That explains how heavy DDL activity in an apparently
unrelated database can trigger the problem: it eventually results in
an SI message queue overrun and ensuing cache reset.  In the absence
of SI cache resets, maybe indeed the problem is rare even if the
pg_saslprep result is misallocated every time.

Not sure about a good way to test this theory.  Need more caffeine.

            regards, tom lane

