Re: Sync Rep v17

From Daniel Farina
Subject Re: Sync Rep v17
Date
Msg-id AANLkTinFcM494Vn+Fj2nqctAVo4zZMv6zKn30TMWR18N@mail.gmail.com
In response to Re: Sync Rep v17  (Jaime Casanova <jaime@2ndquadrant.com>)
Responses Re: Sync Rep v17  (Daniel Farina <daniel@heroku.com>)
Re: Sync Rep v17  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Tue, Feb 22, 2011 at 11:43 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
> On Sat, Feb 19, 2011 at 11:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> DEBUG:  write 0/3027BC8 flush 0/3014690 apply 0/3014690
>> DEBUG:  released 0 procs up to 0/3014690
>> DEBUG:  write 0/3027BC8 flush 0/3027BC8 apply 0/3014690
>> DEBUG:  released 2 procs up to 0/3027BC8
>> WARNING:  could not locate ourselves on wait queue
>> server closed the connection unexpectedly
>>        This probably means the server terminated abnormally
>>        before or while processing the request.
>> The connection to the server was lost. Attempting reset: DEBUG:
>
> you can make this happen more easily, i just run "pgbench -n -c10 -j10
> test" and got that warning and sometimes a segmentation fault and
> sometimes a failed assertion

I have also reproduced this. Notably, things seem fine as long as
pgbench is confined to one backend, but as soon as two backends
exercise the feature (-c 2) I can get segfaults.

In the UI department, I am finding it somewhat difficult to affirm
that I am, in fact, synchronously replicating anything in the HA
scenario (where I never want to block). However, by running the patch
at DEBUG3 and comparing what I think to be syncrepped and
non-syncrepped runs, I believe that I am not committing user error (I
see syncrep waiting in one and not in the other). This is in part hard
to confirm because the single-backend performance of syncrep is
actually very good (if DEBUG3 is to be believed; I will write a real
torture test later). I was expecting a more perceptible performance
dropoff. But then again, I imagine the real kicker will happen when we
can run concurrent backends. Also, Amazon EBS doesn't have the fastest
disks, and within an availability zone network latency is awfully low.

I can't quite explain what I was seeing before w.r.t. memory usage,
and more pressingly, a very slow recovery. I turned off hot standby and
was messing around and, before I knew it, the server was caught up. I
do not know if that was just coincidence (probably) or overhead
imposed by HS. The very high RES number was Linux fooling me, as most
of it was SHR, i.e. shared memory (SHMEM).
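
As an illustration of the RES-versus-SHR point above, here is a minimal
Python sketch of separating a backend's private resident memory from
the shared part. It assumes Linux and the /proc/<pid>/statm layout
(second field resident pages, third field shared pages). The helper
names parse_statm and backend_memory are hypothetical, made up for this
example; they are not part of PostgreSQL or of the patch.

```python
import os


def parse_statm(text, page_size):
    """Return (resident, shared, private) in bytes from one statm line.

    Assumes the Linux /proc/<pid>/statm field order:
    size resident shared text lib data dt (all counted in pages).
    """
    fields = text.split()
    resident = int(fields[1]) * page_size
    shared = int(fields[2]) * page_size
    # What is left after subtracting shared pages is memory that only
    # this process has resident.
    return resident, shared, resident - shared


def backend_memory(pid):
    """Read /proc/<pid>/statm for a live process (Linux only)."""
    with open(f"/proc/{pid}/statm") as f:
        return parse_statm(f.read(), os.sysconf("SC_PAGE_SIZE"))
```

Calling backend_memory() with the PID of a backend that has touched
much of shared_buffers would be expected to show a large resident
figure, most of it counted as shared, which matches the RES/SHR split
that top reports.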

--
fdr

