Re: logical changeset generation v6.8

From: Robert Haas
Subject: Re: logical changeset generation v6.8
Date:
Msg-id: CA+TgmobYM+5=bnoAOe4aTu2z1YOJrqenzv+S1sf9J_3G6jpRKg@mail.gmail.com
In response to: Re: logical changeset generation v6.8 (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: logical changeset generation v6.8
List: pgsql-hackers
On Mon, Dec 16, 2013 at 6:01 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> There's no hard and
>> fast rule here, because some cases are distinguished, but my gut
>> feeling is that all of the errors your patch introduces are
>> sufficiently obscure cases that separate messages with separate
>> translations are not warranted.
>
> Perhaps we should just introduce a marker that some such strings are not
> to be translated if they are of the unexpected kind. That would probably
> make debugging easier too ;)

Well, we have that: it's called elog.  But that doesn't seem like the
right thing here.
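
For reference, the distinction as it stands today -- the message texts
and variable names below are made up for illustration, not taken from
the patch:

/* internal "can't happen" case: elog() strings are never extracted
 * for translation */
elog(ERROR, "unexpected sub-record type %d", subtype);

/* user-facing case: ereport() + errmsg() is marked for translation */
ereport(ERROR,
        (errcode(ERRCODE_OBJECT_IN_USE),
         errmsg("replication slot \"%s\" is already active", slotname)));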

>> > b) make hot_standby_feedback work across disconnections of the walsender
>> >    connection (i.e. peg xmin, not just for catalogs though)
>>
>> Check; might need to be optional.
>
> Yea, I am pretty sure it will. It'd probably be pretty nasty to set
> min_recovery_apply_delay=7d and force xmin to be kept back that long...

Yes, that would be... unfortunate.

>> > c) Make sure we can transport those across cascading
>> >    replication.
>>
>> Not sure I follow.
>
> Consider a replication scenario like primary <-> standby-1 <->
> standby-2. The primary must not remove data that standby-1 requires,
> nor data that standby-2 needs. That's not really necessary for WAL,
> since that will also reside on standby-1, but it definitely is for the
> xmin horizon.
> So standby-1 will need to signal not only its own needs, but also those
> of the nodes below it.

True.
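
To make that concrete, here is a rough sketch of the horizon a cascaded
standby would have to report upstream -- the function and variable names
are invented here, not taken from the patch:

#include "postgres.h"
#include "access/transam.h"

/*
 * Report the older of our own xmin horizon and whatever our cascaded
 * standbys have fed back to us; either may be invalid if unused.
 */
static TransactionId
FeedbackXminToSend(TransactionId local_xmin, TransactionId downstream_xmin)
{
    if (!TransactionIdIsValid(downstream_xmin))
        return local_xmin;
    if (!TransactionIdIsValid(local_xmin))
        return downstream_xmin;

    /* the smaller XID, modulo wraparound, is the older horizon */
    return TransactionIdPrecedes(downstream_xmin, local_xmin) ?
        downstream_xmin : local_xmin;
}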

>> > The hard questions that I see are like:
>> > * How do we manage standby registration? Does the admin have to do that
>> >   manually? Or does a standby register itself automatically if some config
>> >   parameter is set?
>> > * If automatically, how do we deal with the situation where the registrant
>> >   dies before noting its own identifier somewhere persistent? My best idea
>> >   is a two-phase registration process where registrations from phase 1 are
>> >   thrown away after a restart, but yuck.
>>
>> If you don't know the answers to these questions for the kind of
>> replication that we have now, then how do you know the answers for
>> logical replication?  Conversely, what makes the answers that you've
>> selected for logical replication unsuitable for our existing
>> replication?
>
> There's a pretty fundamental difference imo - with the logical decoding
> stuff we only supply support for change-producing nodes; with physical
> rep we supply both.

I'm not sure I follow this.  "Both" what and what?

> There's no need to decide how node IDs are stored in in-core logical
> rep consumers, since there are no in-core ones. Yet.

I don't know that we have or need to make any judgements about how to
store node IDs.  You have decided that slots have names, and I see no
problem there.

> Also, physical rep
> by now is a pretty established thing; we need to be much more careful
> about compatibility there.

I don't think we should change anything in a backward-incompatible
fashion.  If we add any new behavior, it'd surely be optional.

> I think we need to improve the monitoring facilities a bit, and that
> should be it. Like
> * expose xmin in pg_stat_activity, pg_prepared_xacts,
>   pg_replication_slots (or whatever it's going to be called)
> * expose the last restartpoint's redo pointer in pg_stat_replication, pg_replication_slots

+1.

> That said, the consequences can be a bit harsher than a full disk - the
> anti-wraparound security measures might kick in, requiring a restart into
> single-user mode. That's way more confusing than cleaning up a bit of
> space on the disk.

Yes, true.  I'm not sure what approach to that problem is best.  It's
long seemed to me that well before we get to the point of shutting
down the whole cluster we ought to just start killing sessions with
old xmins.  But that doesn't generalize well to prepared transactions,
which can't just be rolled back or committed without guidance; and
killing slots seems a bit dicey too.
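
The rough shape of that idea, for what it's worth -- entirely
hypothetical, BackendWithOldestXmin() does not exist in the tree:

#include "postgres.h"

#include <signal.h>

#include "storage/proc.h"

/* hypothetical helper: the backend currently holding back the oldest xmin */
extern PGPROC *BackendWithOldestXmin(void);

static void
ReclaimOldestXmin(void)
{
    PGPROC *victim = BackendWithOldestXmin();

    /*
     * A regular backend can simply be terminated, releasing its
     * snapshot.  Prepared transactions have no backend to signal, and
     * slots have no PGPROC at all -- exactly the cases that don't
     * generalize.
     */
    if (victim != NULL)
        kill(victim->pid, SIGTERM);
}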

> Consider what happens though, if you promote a node for physical rep. As
> soon as it's promoted, it will accept writes and then start a
> checkpoint. Unless other standbys have started to follow that node
> before either that checkpoint happens (removing WAL) or
> autovacuuming/hot-pruning is performed (creating recovery conflicts),
> we'll possibly lose the data required to let the standbys follow the
> promotion. Note that wal_keep_segments and vacuum_defer_cleanup_age both
> sorta work for that...

True.

> Could somebody please deliver me a time dilation device?

Upon reflection, I am less concerned with actually having physical
slots in this release than I am with making sure we're not boxing
ourselves into a corner that will make them hard to add later.  If
we've got a clear design that can be generalized to that case, but the
SMOP required exceeds what can be done in the time available, I am OK
to punt it.  But I am not sure we're at that point yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


