On 06/07/16 17:41, Marco Nenciarini wrote:
> On 06/07/16 17:37, Marco Nenciarini wrote:
>> Hi,
>>
>> On 06/07/16 17:07, francesco.canovai@2ndquadrant.it wrote:
>>> The following bug has been logged on the website:
>>>
>>> Bug reference: 14230
>>> Logged by: Francesco Canovai
>>> Email address: francesco.canovai@2ndquadrant.it
>>> PostgreSQL version: 9.6beta2
>>> Operating system: Linux
>>> Description:
>>>
>>> I'm taking a concurrent backup from a standby in PostgreSQL 9.6beta2 and I
>>> get the wrong timeline from pg_stop_backup(false).
>>>
>>> This is what I'm doing:
>>>
>>> 1) I set up an environment with a primary server and a replica in streaming
>>> replication.
>>>
>>> 2) On the replica, I run
>>>
>>> postgres=# SELECT pg_start_backup('test_backup', true, false);
>>> pg_start_backup
>>> -----------------
>>> 0/3000A00
>>> (1 row)
>>>
>>> 3) When I run pg_stop_backup, the START WAL LOCATION it returns refers to
>>> a file name with timeline 0.
>>>
>>> postgres=# SELECT pg_stop_backup(false);
>>> pg_stop_backup
>>> ---------------------------------------------------------------------------
>>> (0/3000AE0,"START WAL LOCATION: 0/3000A00 (file 000000000000000000000003)+
>>> CHECKPOINT LOCATION: 0/3000A38 +
>>> BACKUP METHOD: streamed +
>>> BACKUP FROM: standby +
>>> START TIME: 2016-07-06 16:44:31 CEST +
>>> LABEL: test_backup +
>>> ","")
>>> (1 row)
>>>
>>> The timeline returned is correct (it is 1) when running the same commands
>>> on the master.
>>>
>>> An incorrect backup label doesn't prevent PostgreSQL from starting up, but
>>> it affects the tools using that information.
>>>
>>>
>>
>> The issue here is that the do_pg_stop_backup function uses the
>> ThisTimeLineID variable, which is not valid on standbys.
>>
>> I think that it should read it from
>> ControlFile->checkPointCopy.ThisTimeLineID as we do in do_pg_start_backup.
>>
>
> No, that's not the solution.
>
> The backup_label is generated during the do_pg_start_backup call, so
> the copy in ControlFile->checkPointCopy.ThisTimeLineID is
> uninitialized as well.
>
After further analysis, the issue is that we retrieve starttli from
the ControlFile structure, but the code was still using ThisTimeLineID
when writing the backup label.
I've attached a very simple patch that fixes it.
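
To make the symptom easier to see, here is a minimal sketch (in Python, not the PostgreSQL code itself) of how the WAL file name in the label follows from the timeline and the start location. It assumes the default 16MB segment size; the function name wal_file_name is made up for this illustration.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16MB WAL segments


def wal_file_name(tli: int, lsn: str, seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Build the 24-hex-digit WAL segment file name for a timeline and an LSN.

    Hypothetical helper for illustration; the LSN is given in the usual
    "hi/lo" hexadecimal text form, e.g. "0/3000A00".
    """
    hi, lo = (int(part, 16) for part in lsn.split("/"))
    pos = (hi << 32) | lo                      # absolute byte position in the WAL
    segno = pos // seg_size                    # segment containing that position
    segs_per_xlogid = 0x100000000 // seg_size  # segments per 4GB "log id"
    return "%08X%08X%08X" % (
        tli,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )


# Timeline 0 reproduces the file name shown in the report.
print(wal_file_name(0, "0/3000A00"))
# Timeline 1 gives the name one would expect in the label.
print(wal_file_name(1, "0/3000A00"))
```

Only the first eight hex digits differ between the two names, which is why the start location itself looks right while the file name does not.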

Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it