Обсуждение: FATAL: terminating connection due to conflict with recovery

Поиск
Список
Период
Сортировка

FATAL: terminating connection due to conflict with recovery

От
Jeff Ross
Дата:
Hi,

I have a set of servers in the rack running 9.0.3.  The production
server is doing streaming replication and that is working fine.  I have
some quarterly reports that are select only so I've been running them
against the replica.

I have one part of that report that consistently dies with the error
message

"FATAL:  terminating connection due to conflict with recovery".

Following that in the logs is:

"2011-08-30 13:33:02.336353500 <_postgresql%wykids> DETAIL:  User query
might have needed to see row versions that must be removed.
2011-08-30 13:33:02.336359500 <_postgresql%wykids> HINT:  In a moment
you should be able to reconnect to the database and repeat your command."

The failure point is not always at the same point of the query.  The
query is a long cross-tab so I've posted it to

    http://www.wykids.org/query.html

On my workstation using psql this query runs in about 1.5 minutes. I can
choose the quarter the query uses and I'm virtually positive that no
rows in that set will be updated or deleted so the error message to me
seems wrong.

It does successfully complete when run against the master and in less
than a minute.

Here are my server settings from the replica:

jross@dukkha:/home/jross $ cat /var/postgresql/my_postgresql.conf
listen_addresses = '*'          # what IP address(es) to listen on;
unix_socket_directory = '/var/postgresql/' # (change requires restart)
log_line_prefix = '<%u%%%d> '                   # special values:
log_statement = 'all'        # none, ddl, mod, all
archive_mode = on
wal_level = hot_standby   # minimal, archive, or hot_standby
archive_command = 'cd .'
default_statistics_target = 50 # pgtune wizard 2009-12-19
maintenance_work_mem = 120MB # pgtune wizard 2009-12-19
constraint_exclusion = on # pgtune wizard 2009-12-19
checkpoint_completion_target = 0.9 # pgtune wizard 2009-12-19
effective_cache_size = 1408MB # pgtune wizard 2009-12-19
work_mem = 12MB # pgtune wizard 2009-12-19
wal_buffers = 8MB # pgtune wizard 2009-12-19
checkpoint_segments = 16 # pgtune wizard 2009-12-19
shared_buffers = 480MB # pgtune wizard 2009-12-19
max_connections = 80 # pgtune wizard 2009-12-19
max_wal_senders = 3     # max number of walsender processes
wal_keep_segments = 30    # in logfile segments, 16MB each; 0 disables
#recovery_connections = on
hot_standby = on              # "on" allows queries during recovery
log_timezone = 'America/Denver'
timezone = 'America/Denver'

Is there a setting in this or something else that I should tweak so this
query can complete against the replica?  Google turned up some threads
on the error code associated with the error but I didn't find much else
that seems applicable.

Thanks,

Jeff Ross
Wyoming Children's Action Alliance
Cheyenne, Wyoming



Re: FATAL: terminating connection due to conflict with recovery

От
Fujii Masao
Дата:
On Wed, Aug 31, 2011 at 5:51 AM, Jeff Ross <jross@wykids.org> wrote:
> Is there a setting in this or something else that I should tweak so this
> query can complete against the replica?  Google turned up some threads on
> the error code associated with the error but I didn't find much else that
> seems applicable.

Increasing max_standby_archive_delay and max_standby_streaming_delay
would be helpful to make the query complete. Please see the following manual
for details.
http://www.postgresql.org/docs/9.0/interactive/hot-standby.html#HOT-STANDBY-CONFLICT

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: FATAL: terminating connection due to conflict with recovery

От
Craig Ringer
Дата:
On 31/08/2011 4:51 AM, Jeff Ross wrote:

> On my workstation using psql this query runs in about 1.5 minutes. I can
> choose the quarter the query uses and I'm virtually positive that no
> rows in that set will be updated or deleted so the error message to me
> seems wrong.

AFAIK: There may be other data on the same page that is being written
to, or it might be VACUUM activity replicated from the master. Either
way, the standby is saying it has to get rid of that page in order to
stay within sync delay limits with the master.

See the other reply post you got for the tuning parameters you can use
to tweak that permissible sync delay.

--
Craig Ringer


Re: FATAL: terminating connection due to conflict with recovery

От
Jeff Ross
Дата:
On 08/30/11 18:03, Fujii Masao wrote:
> On Wed, Aug 31, 2011 at 5:51 AM, Jeff Ross<jross@wykids.org>  wrote:
>> Is there a setting in this or something else that I should tweak so this
>> query can complete against the replica?  Google turned up some threads on
>> the error code associated with the error but I didn't find much else that
>> seems applicable.
>
> Increasing max_standby_archive_delay and max_standby_streaming_delay
> would be helpful to make the query complete. Please see the following manual
> for details.
> http://www.postgresql.org/docs/9.0/interactive/hot-standby.html#HOT-STANDBY-CONFLICT
>
> Regards,
>

Thank you!

And now the error message is linked to the solution for the next person.

Jeff