On 5/8/19 14:58, Tom Lane wrote:
> neeraj kumar <neeru.cse@gmail.com> writes:
>> Yes we use SSL to connect to DB.
>
> Hm. I'm suspicious that one of the functions that fetch data for
> an SSL connection threw an error. In particular, it doesn't look
> to be hard at all to make X509_NAME_to_cstring fall over --- an
> encoding conversion failure would do it, even without any stretchy
> assumptions about OOM this early in backend start. Have you got
> any SSL certificates floating around with non-ASCII subject name
> or issuer name?

Crazy timing. We just had a report come in from a database in the RDS
fleet that's hitting this same issue. It was one of the Aurora systems,
but there wasn't anything Aurora-specific that I could see in the
relevant bits of code.

Seems to me that, at a minimum, this loop shouldn't go on forever. Even
an arbitrary, absurdly high, hard-coded cap on the number of attempts
(like a million) before failing would be better than spinning on the CPU
forever, which is what we are seeing.
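
To make the idea concrete, here is a rough sketch of a bounded retry
loop. It is only an illustration, not the actual pgstat code: DemoSlot,
demo_read_slot and DEMO_MAX_ATTEMPTS are made-up names, the slot layout
is a simplified stand-in, and the memory barriers a real shared-memory
reader would need are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a shared status slot (not the real struct). */
typedef struct DemoSlot
{
    volatile int changecount;   /* odd while a writer is mid-update */
    int          payload;       /* the data a reader wants to copy */
} DemoSlot;

/* Arbitrary cap on retries; the value is an assumption for illustration. */
#define DEMO_MAX_ATTEMPTS 1000000

/*
 * Copy the slot's payload into *out.
 *
 * Returns true if a consistent snapshot was obtained, i.e. the change
 * counter was even and identical before and after the copy.  Returns
 * false if the slot never looked stable within DEMO_MAX_ATTEMPTS tries,
 * so the caller can report or skip the slot instead of spinning forever.
 */
static bool
demo_read_slot(const DemoSlot *slot, DemoSlot *out)
{
    for (long attempt = 0; attempt < DEMO_MAX_ATTEMPTS; attempt++)
    {
        int before = slot->changecount;

        out->payload = slot->payload;

        int after = slot->changecount;

        if (before == after && (before & 1) == 0)
        {
            out->changecount = after;
            return true;
        }
    }

    return false;               /* slot looks permanently mid-update */
}
```

A slot whose counter was left odd (say, because the writer errored out
partway through an update) makes this return false after the cap is hit,
rather than looping indefinitely.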

Would be even cooler to detect and correct a broken slot in
PgBackendStatus... if I have a good idea I'll post/try it. :)

-Jeremy
--
Jeremy Schneider
Database Engineer
Amazon Web Services