Re: Attempt to consolidate reading of XLOG page
| От | Antonin Houska | 
|---|---|
| Тема | Re: Attempt to consolidate reading of XLOG page | 
| Дата | |
| Msg-id | 75115.1569499713@antos обсуждение исходный текст | 
| Ответ на | Re: Attempt to consolidate reading of XLOG page (Alvaro Herrera <alvherre@2ndquadrant.com>) | 
| Ответы | Re: Attempt to consolidate reading of XLOG page | 
| Список | pgsql-hackers | 
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On 2019-Sep-24, Antonin Houska wrote:
> 
> > Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> 
> > > If you don't have any strong dislikes for these changes, I'll push this
> > > part and let you rebase the remains on top.
> > 
> > No objections here.
> 
> oK, pushed.  Please rebase the other parts.
Thanks!
> I made one small adjustment: in read_local_xlog_page() there was one
> *readTLI output parameter that was being changed to a local variable
> plus later assigment to the output struct member; I changed the code to
> continue to assign directly to the output variable instead.  There was
> an error case in which the TLI was not assigned to; I suppose this
> doesn't really change things (we don't examine the TLI in that case, do
> we?), but it seemed dangerous to leave like that.
I used the local variable to make some expressions simpler, but missed the
fact that this way I can leave the ws_tli field unassigned if the function
returns prematurely. Now that I look closer, I see that it can be a problem -
in the case of ERROR, XLogReadRecord() does reset the state, but it does not
reset the TLI:
err:
    /*
     * Invalidate the read state. We might read from a different source after
     * failure.
     */
    XLogReaderInvalReadState(state);
Thus the TLI appears to be important even on ERROR, and what you've done is
correct. Thanks for fixing that.
One comment on the remaining part of the series:
Before this refactoring, the walsender.c:XLogRead() function contained these
lines
       /*
        * After reading into the buffer, check that what we read was valid. We do
        * this after reading, because even though the segment was present when we
        * opened it, it might get recycled or removed while we read it. The
        * read() succeeds in that case, but the data we tried to read might
        * already have been overwritten with new WAL records.
        */
       XLByteToSeg(startptr, segno, segcxt->ws_segsize);
       CheckXLogRemoved(segno, ThisTimeLineID);
but they don't fit into the new, generic implementation, so I copied these
lines to the two places right after the call of the new XLogRead(). However I
was not sure if ThisTimeLineID was ever correct here. It seems the original
walsender.c:XLogRead() implementation did not update ThisTimeLineID (and
therefore neither the new callback WalSndSegmentOpen() does), so both
logical_read_xlog_page() and XLogSendPhysical() could read the data from
another (historic) timeline. I think we should check the segment we really
read data from:
    CheckXLogRemoved(segno, sendSeg->ws_tli);
The rebased code is attached.
-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com
		
	Вложения
В списке pgsql-hackers по дате отправления: