Re: emergency outage requiring database restart

Поиск
Список
Период
Сортировка
От Merlin Moncure
Тема Re: emergency outage requiring database restart
Дата
Msg-id CAHyXU0yUTcvLs5Hk+_p2iFNz0EXf3otbLDUA56BnEj0tN7zztA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: emergency outage requiring database restart  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Ответы Re: emergency outage requiring database restart  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
On Mon, Oct 24, 2016 at 9:18 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Merlin Moncure wrote:
>> On Mon, Oct 24, 2016 at 6:01 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>
>> > Corruption struck again.
>> > This time got another case of view busted -- attempting to create
>> > gives missing 'type' error.
>>
>> Call it a hunch -- I think the problem is in pl/sh.
>
> I've heard that before.

well, yeah, previously I had an issue where the database crashed
during a heavy concurrent pl/sh based load.   However the problems
went away when I refactored the code.   Anyways, I looked at the code
and couldn't see anything obviously wrong so who knows?  All I know is
my production database is exploding continuously and I'm looking for
answers.  The only other extension in heavy use on this servers is
postgres_fdw.

The other database on the cluster is fine, which kind of suggests we
are not facing clog or WAL type problems.

After last night, I rebuilt the cluster, turning on checksums, turning
on synchronous commit (it was off) and added a standby replica.  This
should help narrow the problem down should it re-occur; if storage is
bad (note, other database on same machine is doing 10x write activity
and is fine) or something is scribbling on shared memory (my guess
here)  then checksums should be popped, right?

merlin



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: Wraparound warning
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: emergency outage requiring database restart