Re: 9.3 to 9.5 upgrade problems

Поиск
Список
Период
Сортировка
От Andy Colson
Тема Re: 9.3 to 9.5 upgrade problems
Дата
Msg-id f0699290-2c89-cb4e-0318-84603b629e3e@squeakycode.net
обсуждение исходный текст
Ответ на Re: 9.3 to 9.5 upgrade problems  (Adrian Klaver <adrian.klaver@aklaver.com>)
Ответы Re: 9.3 to 9.5 upgrade problems  (Andy Colson <andy@squeakycode.net>)
Список pgsql-general
On 07/03/2016 11:04 AM, Adrian Klaver wrote:
> On 07/03/2016 08:49 AM, Andy Colson wrote:
>> On 07/03/2016 10:35 AM, Adrian Klaver wrote:
>>> On 07/03/2016 08:06 AM, Andy Colson wrote:
>>>> Hi all,
>>>>
>>>> I have a master (web1) and two slaves (web2, webserv), one slave is
>>>> quite far from the master, the db is 112 Gig, so pg_basebackup is my
>>>> last resort.
>>>>
>>>> I followed the page here:
>>>> https://www.postgresql.org/docs/9.5/static/pgupgrade.html
>>>>
>>>> including the rsync stuff.  I practiced it _twice_, once in PG 9.5 beta,
>>>> and again a week ago, on two VM's I created locally.  Both practice
>>>> sessions worked perfect.
>>>>
>>>> I just ran it on the live databases.  The master seems ok, its running
>>>> PG 9.5 now, I can login to it, and no errors in the log.
>>>>
>>>> Neither slave works.  After I'd gotten done with the pgupgrade steps,
>>>> both slaves gave me this error:
>>>>
>>>> FATAL:  database system identifier differs between the primary and
>>>> standby
>>>>
>>>> Sure enough pg_controldata show'd their database system id different
>>>> (all three web1, web2, webserv were different.  no matches at all), so
>>>> I'm assuming the rsync didnt rsync right, or I missed a step and ran it
>>>> to early, or something ... I'm not quite sure.
>>>>
>>>> I needed to get the live website back up and running again, so I let the
>>>> master go, ran analyze, and when it was finished, used the steps here to
>>>> try and resync:
>>>>
>>>> https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial
>>>>
>>>> on Master:
>>>> select pg_start_backup('clone',true);
>>>> rsync -av --exclude pg_xlog --exclude postgresql.conf /pub/pg95/*
>>>> web2:/pub/pg95/
>>>> select pg_stop_backup();
>>>> rsync -av /pub/pg95/pg_xlog web2:/pub/pg95/
>>>
>>> Not sure about above rsync, that seems to undo what you did previously.
>>>
>>> Also was the remote directory empty when you did this?
>>>
>>
>> Not sure what you mean by undo.  pgupgrade.html page, step 10, has you
>> rsync the master to the slave, so the pg95 directory is hard linked to
>> the pg93, which save's a ton to bandwidth when your servers are cross
>> county.
>
> I understand I am just trying to figure out what mixing methods (pg-upgrade, pg_start_backup) is doing?
>
> In particular the section on pg_start_backup:
>
> https://www.postgresql.org/docs/9.5/static/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP
>
> starts with:
>
> "Ensure that WAL archiving is enabled and working."
>
> and from I gather that is not the case.
>

The slave log says it reached a consistent state, and is accepting connections, which tells me it should run ok, maybe
notwith the newest data, but at least not get: 

FATAL:  cache lookup failed for database 16401

>
> Also was the remote directory empty when you did this?

Now that I think about this more, I think you're on to something.  I'm trying to get an _exact_ copy of the master db
ontothe slave.  Checking rsync man, it matches only on size and modified time, and I didn't include deletes. 

I'm going to re-try with this:

select pg_start_backup('clone',true);
rsync -av --delete --checksum --exclude pg_xlog --exclude postgresql.conf /pub/pg95/* web2:/pub/pg95/
select pg_stop_backup();
rsync -av --delete --checksum /pub/pg95/pg_xlog web2:/pub/pg95/

That should make sure the copies are exact.


-Andy



В списке pgsql-general по дате отправления:

Предыдущее
От: James Keener
Дата:
Сообщение: Re: GRANTable Row Permissions
Следующее
От: Adrian Klaver
Дата:
Сообщение: Re: deadlock between "WITH agg_tmp AS ({sel_stmt}), upd AS ({upd_stmt}) {ins_stmt}" and pure UPDATE statements