Re: logical changeset generation v6.8
From: Robert Haas
Subject: Re: logical changeset generation v6.8
Date:
Msg-id: CA+TgmobBmuAAgy8JV=veCtDK9Nj-5Bjo_4VF+=TtfpXS+JfKgQ@mail.gmail.com
In reply to: Re: logical changeset generation v6.8 (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: logical changeset generation v6.8
           Re: logical changeset generation v6.8
List: pgsql-hackers
On Wed, Dec 11, 2013 at 7:08 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> I don't think there's much point in including "remapping" in all of
>> the error messages. It adds burden for translators and users won't
>> know what a remapping file is anyway.
>
> It helps in locating which part of the code caused a problem. I utterly
> hate to get reports with error messages that I can't correlate to a
> sourcefile. Yes, I know that verbose error output exists, but usually
> you don't get it that way... That said, I'll try to make the messages
> simpler.

Well, we could adopt a policy of not making messages originating from different locations in the code the same. However, it looks to me like that's NOT the current policy, because some care has been taken to reuse messages rather than distinguish them. There's no hard and fast rule here, because some cases are distinguished, but my gut feeling is that all of the errors your patch introduces are sufficiently obscure cases that separate messages with separate translations are not warranted. The time when this is likely to fail is when someone borks the permissions on the data directory, and the user probably won't need to care exactly which file we can't write.

>> +       if (!TransactionIdIsNormal(xmax))
>> +       {
>> +               /*
>> +                * no xmax is set, can't have any permanent ones, so
>> +                * this check is sufficient
>> +                */
>> +       }
>> +       else if (HEAP_XMAX_IS_LOCKED_ONLY(new_tuple->t_data->t_infomask))
>> +       {
>> +               /* only locked, we don't care */
>> +       }
>> +       else if (!TransactionIdPrecedes(xmax, cutoff))
>> +       {
>> +               /* tuple has been deleted recently, log */
>> +               do_log_xmax = true;
>> +       }
>>
>> Obfuscated. Rewrite without empty blocks.
>
> I don't understand why having an empty block is less clear than
> including a condition in several branches? Especially if the individual
> conditions might need to be documented?

It's not a coding style we typically use, to my knowledge.
Of course, what is actually clearest is a matter of opinion, but my own experience is that such blocks typically end up seeming clearer to me when the individual comments are joined together into a paragraph that explains, in full sentences, what's going on.

>> +       /*
>> +        * Write out buffer every time we've too many in-memory entries.
>> +        */
>> +       if (state->rs_num_rewrite_mappings >= 1000 /* arbitrary number */)
>> +               logical_heap_rewrite_flush_mappings(state);
>>
>> What happens if the number of items per hash table entry is small but
>> the number of entries is large?
>
> rs_num_rewrite_mappings is the overall number of in-memory mappings, not
> the number of per-entry mappings. That means we flush, even if all
> entries have only a couple of mappings, as soon as 1000 in-memory
> entries have been collected. A bit simplistic, but seems okay enough?

Possibly; I'm not sure yet. I need to examine that logic in more detail, but I had trouble following it and was hoping the next version would be better-commented.

>> I'm still unhappy that we're introducing logical decoding slots but no
>> analogue for physical replication. If we had the latter, would it
>> share meaningful amounts of structure with this?
>
> Yes, I think we could mostly reuse it; we'd probably want to add a field
> or two more (application_name, sync_prio?). I have been wondering
> whether some of the code in replication/logical/logical.c shouldn't be
> in replication/slot.c or similar. So far I've opted for leaving it in
> its current place since it would have to change a bit for a more general
> role.

I strongly favor moving the slot-related code to someplace with "slot" in the name, and replication/slot.c seems about right. Even if we don't extend slots to cover non-logical replication in this release, we'll probably do it eventually, and it'd be better if that didn't require moving large amounts of code between files.
> I still think that the technical parts of properly supporting persistent
> slots for physical rep aren't that hard, but that the behavioural
> decisions are. I think there are primarily two things for SR that we
> want to prevent using slots:
>
> a) removal of still-used WAL (i.e. maintain knowledge about a standby's
>    last required REDO location)

Check.

> b) make hot_standby_feedback work across disconnections of the walsender
>    connection (i.e. peg xmin, not just for catalogs though)

Check; might need to be optional.

> c) Make sure we can transport those across cascading replication.

Not sure I follow.

> Once those are there it's also useful to keep a bit more information
> about the state of replicas:
> * write, flush, apply lsn

Check.

> The hard questions that I see are:
> * How do we manage standby registration? Does the admin have to do that
>   manually? Or does a standby register itself automatically if some
>   config parameter is set?
> * If automatic, how do we deal with the situation that the registrant
>   dies before noting its own identifier somewhere persistent? My best
>   idea is a two-phase registration process where registrations in phase 1
>   are thrown away after a restart, but yuck.

If you don't know the answers to these questions for the kind of replication that we have now, then how do you know the answers for logical replication? Conversely, what makes the answers that you've selected for logical replication unsuitable for our existing replication?

I have to admit that before I saw your design for the logical replication slots, the problem of making this work for our existing replication stuff seemed almost intractable to me; I had no idea how that was going to work. GUCs didn't seem suitable because I thought we might need some data that is tabular in nature - i.e. configuration specific to each slot. And GUC has problems with that sort of thing.
And a new system catalog didn't seem good either, because you are highly likely to want different configuration on the standby vs. on the master. But after I saw your design for the logical slots I said, dude, get me some of that. Keeping the data in shared memory, persisting it across shutdowns, and managing it via either function calls or the replication command language seems perfect.

Now, in terms of how registration works, whether for physical replication or logical, it seems to me that the DBA will have to be responsible for telling each client the name of the slot to which that client should connect (omitting it at said DBA's option if, as for physical replication, the slot is not mandatory). Assuming such a design, clients could elect for one of two possible behaviors when the anticipated slot doesn't exist: either they error out, or they create it. Either is reasonable; we could even support both.

A third alternative is to make each client responsible for generating a name, but I think that's probably not as good. If we went that route, the name would presumably be some kind of random string, which will probably be a lot less usable than a DBA-assigned name. The client would first generate it, second save it somewhere persistent (like a file in the data directory), and third create a slot by that name. If the client crashes before creating the slot, it will find the name in its persistent store after restart and, upon discovering that no such slot exists, try again to create it.

But note that no matter which of those three options we pick, the server support really need not work any differently. I can imagine any of them being useful, and I don't care all that much which one we end up with. My personal preference is probably for manual registration: if the DBA wants to use slots, said DBA will need to set them up. This also mitigates the issue you raise in your next point: what is manually created will naturally also need to be manually destroyed.
> * How do we deal with the usability wart that people *will* forget to
>   delete a slot when removing a node?

Aside from the point mentioned in the previous paragraph, we don't. The fact that a standby which gets too far behind can't recover is a usability wart, too. I think this is a bit like asking what happens if a user keeps inserting data of only ephemeral value into a table and never deletes any of it as they ought to have done. Well, they'll be wasting disk space; potentially, they will fill the disk. Sucks to be them. The solution is not to prohibit inserting data.

> * What requirements do we have for transporting knowledge about this
>   across a failover?
>
> I have little hope that we can resolve all that for 9.4.

I wouldn't consider this a requirement for a useful feature, and if it is a requirement, then how are you solving it for logical replication? For physical replication, there's basically only one event that needs to be considered: master dead, promote standby. That scenario can also happen in logical replication... but there's a whole host of other things that can happen, too. An individual replication solution may be single-writer but may, like Slony, allow that write authority to be moved around. Or it may have multiple writers.

Suppose A, B, and C replicate between themselves using multi-master replication for write availability across geographies. Within each geo, A physically replicates to A', B to B', and C to C'. There is also a reporting server D which replicates selected tables from each of A, B, and C. On a particularly bad day, D is switched to replicate a particular table from B rather than A while at the same time B is failed over to B' in the midst of a network outage that renders B and B' only intermittently network-accessible, leading to a high rate of multi-master conflicts with C that require manual resolution, which doesn't happen until the following week.
In the meantime, B' is failed back to B, and C is temporarily removed from the multi-master cluster and allowed to run standalone. If you can develop a replication solution that leaves D with the correct data at the end of all of that, my hat is off to you ... and the problem of getting all this to work when only physical replication is in use, and the number of possible scenarios is much smaller, ought to seem simple by comparison. But whether it does or not, I don't see why we have to solve it in this release. Following standby promotions, etc. is a whole feature, or several, unto itself.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company