Re: logical changeset generation v5
From | Robert Haas |
---|---|
Subject | Re: logical changeset generation v5 |
Date | |
Msg-id | CA+TgmoaHPnVBfyjcKrbWdgGMMtyftM5y1+zm+Od=w_+NNED4pw@mail.gmail.com |
In response to | Re: logical changeset generation v5 (Andres Freund <andres@2ndquadrant.com>) |
Responses | Re: logical changeset generation v5 (Andres Freund <andres@2ndquadrant.com>) |
List | pgsql-hackers |
On Tue, Sep 3, 2013 at 12:57 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> To my way of thinking, it seems as though we ought to always begin
>> replay at a checkpoint, so the standby ought always to see one of
>> these records immediately. Obviously that's not good enough, but why
>> not?
>
> We always see one after the checkpoint (well, actually before the
> checkpoint record, but ...), correct. The problem is just that reading a
> single xact_running record doesn't automatically make you consistent. If
> there's a single suboverflowed transaction running on the primary when
> the xl_running_xacts is logged we won't be able to switch to
> consistent. Check procarray.c:ProcArrayApplyRecoveryInfo() for some fun
> and some optimizations.
> Since the only place where we currently have the information to
> potentially become consistent is ProcArrayApplyRecoveryInfo() we will
> have to wait checkpoint_timeout time till we get consistent. Which
> sucks as there are good arguments to set that to 1h.
> That especially sucks as you lose consistency every time you restart the
> standby...

Right, OK.

>> And why is every 15 seconds good enough?
>
> Waiting 15s to become consistent instead of checkpoint_timeout seems to
> be ok to me and to be a good tradeoff between overhead and waiting. We
> can certainly discuss other values or making it configurable. The latter
> seemed to be unnecessary to me, but I don't have a problem
> implementing it. I just don't want to document it :P

I don't think it particularly needs to be configurable, but I wonder
if we can't be a bit smarter about when we do it. For example,
suppose we logged it every 15 s but only until we log a non-overflowed
snapshot. I realize that the overhead of a WAL record every 15
seconds is fairly small, but the load on some systems is all but
nonexistent. It would be nice not to wake up the HD unnecessarily.

>> The WAL writer is supposed to call XLogBackgroundFlush() every time
>> WalWriterDelay expires. Yeah, it can hibernate, but if it's
>> hibernating, then we should respect that decision for this WAL record
>> type also.
>
> Why should we respect it?

Because I don't see any reason to believe that this WAL record is any
more important or urgent than any other WAL record that might get
logged.

>> >> I understand why logical replication needs to connect to a database,
>> >> but I don't understand why any other walsender would need to connect
>> >> to a database.
>> >
>> > Well, logical replication actually streams out data using the walsender,
>> > so that's the reason why I want to add it there. But there have been
>> > cases in the past where we wanted to do stuff in the walsender that need
>> > database access, but we couldn't do so because you cannot connect to
>> > one.
>
>> Could you be more specific?
>
> I only remember 3959.1349384333@sss.pgh.pa.us but I think it has come up
> before.

It seems we need some more design there. Perhaps entering replication
mode could be triggered by writing either dbname=replication or
replication=yes. But then, do the replication commands simply become
SQL commands? I've certainly seen hackers use them that way. And I
can imagine that being a sensible approach, but this patch seems like
it's only covering a fairly small fraction of what really ought to be
a single commit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
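
The suggestion in the message to log the snapshot "every 15 s but only until we log a non-overflowed snapshot" could be sketched roughly as below. This is only an illustration of the idea, not code from the patch or from PostgreSQL: `SnapshotLogState`, `should_log_snapshot`, `record_snapshot_logged` and `count_timer_snapshots` are hypothetical names, time is a plain seconds counter, and it is an assumption here that every logged snapshot (from a checkpoint or from the timer) re-arms or disarms the timer depending on whether it was suboverflowed.

```c
#include <stdbool.h>

/* Hypothetical sketch; none of these names exist in PostgreSQL. */
#define SNAPSHOT_LOG_INTERVAL_SECS 15

typedef struct SnapshotLogState
{
    long last_logged_at;      /* time of the last logged snapshot, in seconds */
    bool need_clean_snapshot; /* true while the last snapshot was suboverflowed */
} SnapshotLogState;

/* Should the periodic timer log another running-xacts snapshot now? */
static bool
should_log_snapshot(const SnapshotLogState *state, long now)
{
    /* A non-overflowed snapshot is already in WAL: stay quiet. */
    if (!state->need_clean_snapshot)
        return false;
    return now - state->last_logged_at >= SNAPSHOT_LOG_INTERVAL_SECS;
}

/* Bookkeeping after a snapshot was logged (assumed: by checkpoint or timer). */
static void
record_snapshot_logged(SnapshotLogState *state, long now, bool suboverflowed)
{
    state->last_logged_at = now;
    state->need_clean_snapshot = suboverflowed;
}

/*
 * Count timer-driven snapshots in a window of window_secs seconds that
 * starts with an overflowed snapshot logged at t = 0. Snapshots taken
 * before overflow_clears_at are treated as suboverflowed.
 */
static int
count_timer_snapshots(long window_secs, long overflow_clears_at)
{
    SnapshotLogState state = { 0, true };
    int logged = 0;

    for (long now = 1; now <= window_secs; now++)
    {
        if (should_log_snapshot(&state, now))
        {
            record_snapshot_logged(&state, now, now < overflow_clears_at);
            logged++;
        }
    }
    return logged;
}
```

Under these assumptions, `count_timer_snapshots(3600, 40)` logs three snapshots (at 15 s, 30 s and 45 s) and then goes quiet for the rest of the hour, whereas logging unconditionally every 15 s would write 240 records over the same window.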