Re: BUG #12785: server process (PID 2872) was terminated by exception 0xC0000005

From Francisco Olarte
Subject Re: BUG #12785: server process (PID 2872) was terminated by exception 0xC0000005
Date
Msg-id CA+bJJbyF9Pz6cTv5=qik5VL5yvCaEz+-6pEuAFrB-DQhkMu3yA@mail.gmail.com
In reply to BUG #12785: server process (PID 2872) was terminated by exception 0xC0000005  (daniele.posenato@smartec.ch)
Responses Re: BUG #12785: server process (PID 2872) was terminated by exception 0xC0000005
List pgsql-bugs
Hi Daniele:

On Mon, Feb 23, 2015 at 6:09 PM, Daniele Posenato <
daniele.posenato@smartec.ch> wrote:

>  Thank you a lot for the answer, I really appreciate it. I will try to do
> what you have suggested and then I will let you know.
>

That's ok, but I doubt I can help you more ( I abandoned Windows more than
a dozen years ago, haven't looked back, although I still remember how that
code appeared when I did something wrong in my programs ).



>
> Just for information the problem has occurred again since the last email
> and always on the same query. I could understand a crash of the service on
> performing an update or a delete, but I have some difficulties to
> understand this on a select. If it was an hardware problem I would expect
> the service to crash also on other actions and not randomly (about once per
> week) only on a specific select (that is executed every 10 seconds).
>

Is that query consuming a lot of your resources? ( It may be due to it
being lengthy or just frequent ) because in that case it makes sense.

In many of my applications 99.9% of the work / RAM usage is selects, so a
random crash is normally going to hit me in one of these.

On the crashing on select stuff. Suppose you have a faulty sector or RAM
location. When you write to it ( upd or del ) nothing happens, it just
stores the bad value. When you read it ( select, part 1, reading from
disk/RAM ) nothing happens, you just get bad data, say a null pointer. Then
when you use it ( select, part 2 ) you get the fault. In fact, if a RAM
location loses the data written to it, you do not notice it on writing it,
or on reading it ( unless you get a parity error ), but on using what you
read from it.
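
A toy sketch of that sequence, in Python rather than anything PostgreSQL actually does. The FaultyMemory class, the bad cell number and the "row data" value are all made up for illustration; the only point is that the write and the read both look fine and the failure shows up on use.

```python
# Toy model only: a "memory" with one hypothetical faulty cell that
# silently loses whatever is written to it.
class FaultyMemory:
    def __init__(self, size, bad_cell):
        self.cells = [None] * size
        self.bad_cell = bad_cell  # assumed faulty location

    def write(self, addr, value):
        # The write "succeeds" even on the bad cell; the value is just lost.
        if addr != self.bad_cell:
            self.cells[addr] = value

    def read(self, addr):
        # Reading never fails either; the bad cell hands back garbage
        # (None here, standing in for a null pointer).
        return self.cells[addr]


def run_demo():
    events = []
    mem = FaultyMemory(size=8, bad_cell=3)

    mem.write(3, "row data")      # update / delete: no error reported
    events.append("write ok")

    value = mem.read(3)           # select, part 1: no error, but bad data
    events.append("read ok")

    try:
        value.upper()             # select, part 2: using the data faults
        events.append("use ok")
    except AttributeError:
        events.append("use failed")
    return events


if __name__ == "__main__":
    print(run_demo())
```

In the real server the last step would be an access violation instead of a Python exception, but the order of events is the same.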

This is a normal pattern in programming bugs too. You have an error in
some code and it stores something in a random ( or not so random ) RAM
location. That code seems to work ok. But then an unrelated piece of code
reads the corrupted data and crashes ( it is one of the ways buffer
overflows work: the guilty code overflows a buffer, but works, and another
chunk of code gets its data overwritten and crashes ).


>
> Is there a way to write a select that is able to crash the service?
>

With a good database, on good hardware, with adequate memory and disk
( infinite, as you can crash any service by just joining enough copies of a
table to exhaust what is available ) there shouldn't be, but if you read
corrupted data or get hit by a bit flip in the middle of processing, it may
happen. Are you able to do a full database dump ( pg_dump, not base backup )
of your database? If you are, then you are able to read all the tables, and
I would suggest trying to reindex every table if you have quiescent periods
( pg_dump does not touch indexes, so if you have good data but corrupted
indexes, the reindex should fix it ).
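
A minimal sketch of those two steps as a small Python wrapper around the standard pg_dump and reindexdb command line tools. The database name "mydb" and the dump file name are placeholders, connection settings are assumed to come from the environment ( PGHOST, PGUSER and so on ), and by default it only prints the commands instead of running them.

```python
import subprocess


def build_check_commands(dbname, dump_file):
    # Step 1: a full logical dump forces a read of every table's data.
    dump_cmd = ["pg_dump", "--format=custom", "--file", dump_file, dbname]
    # Step 2: rebuild every index in the database from the table data.
    reindex_cmd = ["reindexdb", dbname]
    return [dump_cmd, reindex_cmd]


def run_checks(dbname, dump_file, dry_run=True):
    for cmd in build_check_commands(dbname, dump_file):
        if dry_run:
            # Only show what would be executed.
            print(" ".join(cmd))
        else:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "mydb" and "mydb.dump" are placeholder names.
    run_checks("mydb", "mydb.dump", dry_run=True)
```

If the dump fails partway through, the table it stopped on is a good place to start looking. The reindex takes locks that block writes, so it is better run in a quiet period.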


>
> I will let you know the results of the hardware check after the planned
> restart.
>

I do not know ( or remember ) what your DB sizes and uptime requirements
are. But I've had that kind of problem caused by corrupted disk
structures, and have been able to recover from it by rewriting the database,
which means dump, drop, restore. This depends on the system, so I cannot
recommend doing it, but as I said before, if I had the same application on 4
machines crashing randomly on only one of them, I would try to triple test
the machine and dump / restore it.

Best regards.
    Francisco Olarte.
