Обсуждение: [HACKERS] Logical replication and inheritance
I see that if the table is a inheritance parent, and ONLY is not specified, the child tables are also added to the publication. create table parent (a int); create table child () inherits (parent); create publication parent_pub for table parent; \d parent Table "public.parent"Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+---------a | integer | | | Publications: "parent_pub" Number of child tables: 1 (Use \d+ to list them.) \d child Table "public.child"Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+---------a | integer | | | Publications: "parent_pub" Inherits: parent If the child table is later removed from the inheritance hierarchy, it continues to be a part of the publication. alter table child no inherit parent; ALTER TABLE \d child Table "public.child"Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+---------a | integer | | | Publications: "parent_pub" Perhaps that's intentional? Thanks, Amit
On 2/27/17 01:57, Amit Langote wrote: > I see that if the table is a inheritance parent, and ONLY is not > specified, the child tables are also added to the publication. > If the child table is later removed from the inheritance hierarchy, it > continues to be a part of the publication. > Perhaps that's intentional? I came across this the other day as well. It's not clear what the best way for this to behave would be. Another option would be to check the then-current inheritance relationships in the output plugin. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017/03/04 4:24, Peter Eisentraut wrote: > On 2/27/17 01:57, Amit Langote wrote: >> I see that if the table is a inheritance parent, and ONLY is not >> specified, the child tables are also added to the publication. > >> If the child table is later removed from the inheritance hierarchy, it >> continues to be a part of the publication. > >> Perhaps that's intentional? > > I came across this the other day as well. It's not clear what the best > way for this to behave would be. Another option would be to check the > then-current inheritance relationships in the output plugin. I thought removal of a table from inheritance hierarchy would delete it from publications that its parents are part of. One more option is for OpenTableList() called by CreatePublication() and AlterPublicationTables() to not disregard inheritance, as if ONLY was specified. Related: I noticed that if you dump a database containing a publication and an inheritance parent is published by it, loading that into another database causes the following error for each child. ERROR: relation "bar" is already member of publication "foo_pub" Because when foo, the parent, is added to foo_pub publication, bar (a child of foo) is added implicitly. Thanks, Amit
On 2017/03/06 18:04, Amit Langote wrote: > One more option is for OpenTableList() called by CreatePublication() and > AlterPublicationTables() to not disregard inheritance, as if ONLY was > specified. Oops, meant to say: One more option is for OpenTableList to disregard inheritance... Thanks, Amit
After thinking about it some more, I think the behavior we want would be that changes to inheritance would reflect in the publication membership.So if you have a partitioned table, adding more partitionsover time would automatically add them to the publication. We could implement this either by updating pg_publication_rel as inheritance changes, or perhaps more easily by checking and expanding inheritance in the output plugin/walsender. There might be subtle behavioral differences between those two approaches that we should think through. But I think one of these two should be done. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Apr 05, 2017 at 08:25:56AM -0400, Peter Eisentraut wrote: > After thinking about it some more, I think the behavior we want would be > that changes to inheritance would reflect in the publication membership. > So if you have a partitioned table, adding more partitions over time > would automatically add them to the publication. > > We could implement this either by updating pg_publication_rel as > inheritance changes, or perhaps more easily by checking and expanding > inheritance in the output plugin/walsender. There might be subtle > behavioral differences between those two approaches that we should think > through. But I think one of these two should be done. [Action required within three days. This is a generic notification.] The above-described topic is currently a PostgreSQL 10 open item. Peter, since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a v10 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within three calendar days of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping v10. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
On 4/9/17 22:16, Noah Misch wrote: > On Wed, Apr 05, 2017 at 08:25:56AM -0400, Peter Eisentraut wrote: >> After thinking about it some more, I think the behavior we want would be >> that changes to inheritance would reflect in the publication membership. >> So if you have a partitioned table, adding more partitions over time >> would automatically add them to the publication. >> >> We could implement this either by updating pg_publication_rel as >> inheritance changes, or perhaps more easily by checking and expanding >> inheritance in the output plugin/walsender. There might be subtle >> behavioral differences between those two approaches that we should think >> through. But I think one of these two should be done. > > [Action required within three days. This is a generic notification.] Relative to some of the other open items I'm involved in, I consider this a low priority, so I'm not working on it right now. I would also appreciate some guidance from those who are more involved in inheritance and partitioning to determine the best behavior. It could be argued that the current behavior is correct, and also that there are several other correct behaviors. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Apr 5, 2017 at 8:25 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > After thinking about it some more, I think the behavior we want would be > that changes to inheritance would reflect in the publication membership. > So if you have a partitioned table, adding more partitions over time > would automatically add them to the publication. I think the threshold question here is: is the partitioned table a member of the publication, or are the partitions members of the publication? Probably both should be allowed. If you add a partition to the publication by name, then we should publish exactly that partition, full stop. If you add the partitioned table, then there is a choice of behaviors, including: (1) That's an error; the user should publish the partitions instead. (2) That's a shorthand for all partitions, resolved at the time the table is added to the publication. Later changes affect nothing. (3) All partitions are published, and changes in the set of partitions change what is published. In my view, (3) is more desirable than (1) which is more desirable than (2). > We could implement this either by updating pg_publication_rel as > inheritance changes, or perhaps more easily by checking and expanding > inheritance in the output plugin/walsender. There might be subtle > behavioral differences between those two approaches that we should think > through. But I think one of these two should be done. The latter approach sounds better to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Apr 12, 2017 at 11:02:44AM -0400, Peter Eisentraut wrote: > On 4/9/17 22:16, Noah Misch wrote: > > On Wed, Apr 05, 2017 at 08:25:56AM -0400, Peter Eisentraut wrote: > >> After thinking about it some more, I think the behavior we want would be > >> that changes to inheritance would reflect in the publication membership. > >> So if you have a partitioned table, adding more partitions over time > >> would automatically add them to the publication. > >> > >> We could implement this either by updating pg_publication_rel as > >> inheritance changes, or perhaps more easily by checking and expanding > >> inheritance in the output plugin/walsender. There might be subtle > >> behavioral differences between those two approaches that we should think > >> through. But I think one of these two should be done. > > > > [Action required within three days. This is a generic notification.] > > Relative to some of the other open items I'm involved in, I consider > this a low priority, so I'm not working on it right now. By what day should the community look for your next update?
On 2017/04/13 0:10, Robert Haas wrote: > On Wed, Apr 5, 2017 at 8:25 AM, Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> After thinking about it some more, I think the behavior we want would be >> that changes to inheritance would reflect in the publication membership. >> So if you have a partitioned table, adding more partitions over time >> would automatically add them to the publication. > > I think the threshold question here is: is the partitioned table a > member of the publication, or are the partitions members of the > publication? One more question could be - are we going to handle this specially only for the partitioned tables? Of course, if we use an approach with general inheritance in mind, it will cover the partitioned table case, but we might have to reconsider if the generality makes a solution prohibitive. > Probably both should be allowed. If you add a partition > to the publication by name, then we should publish exactly that > partition, full stop. Later if someone adds the parent, current code would try to add the partition again. create table bar (a int); create publication mypub for table bar; create table foo (a int); create table baz () inherits (foo); alter table bar inherit foo; alter publication mypub add table foo; ERROR: relation "bar" is already member of publication "mypub" One way to avoid the error, is to add ONLY foo: alter publication mypub add table only foo; ALTER PUBLICATION But then other children (baz) is not added. There should be some way to avoid the error if the parent is added to the same publication as the one which a child is already part of. I think it should be a no-op. Sent a patch for that in a separate thread. > If you add the partitioned table, then there is > a choice of behaviors, including: > > (1) That's an error; the user should publish the partitions instead. > (2) That's a shorthand for all partitions, resolved at the time the > table is added to the publication. Later changes affect nothing. > (3) All partitions are published, and changes in the set of partitions > change what is published. > > In my view, (3) is more desirable than (1) which is more desirable than (2). Agree with that order. >> We could implement this either by updating pg_publication_rel as >> inheritance changes, or perhaps more easily by checking and expanding >> inheritance in the output plugin/walsender. There might be subtle >> behavioral differences between those two approaches that we should think >> through. But I think one of these two should be done. > > The latter approach sounds better to me. Which, IIUC, means OpenTableList() called by both CreatePublication() and AlterPublicationTables() does not expand inheritance and hence pg_publication_rel entries are created only for the specified relation. That is an important consideration because of pg_dump. See below: create table foo (a int); create table bar () inherits (foo); create publication mypub for table foo; -- bar is added too. $ pg_dump -s | psql -e test <snip> ALTER PUBLICATION mypub ADD TABLE foo; ERROR: relation "bar" is already member of publication "mypub" This however ain't a problem if we consider my above suggestion to make duplicate addition to publication a no-op. Thanks, Amit
I don't think inheritance and partitioning should behave the same in terms of logical replication. For me the current behavior with inherited tables is correct. What I would like partitioned tables support to look like is that if we add partitioned table, the data decoded from any of the partitions would be sent as if the change was for that partitioned table so that the partitioning scheme on subscriber does not need to be same as on publisher. That's nontrivial to do though probably. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2017/04/14 10:57, Petr Jelinek wrote: > I don't think inheritance and partitioning should behave the same in > terms of logical replication. I see. > > For me the current behavior with inherited tables is correct. OK. By the way, what do you think about the pg_dump example/issue I mentioned?Is that a pg_dump problem or backend? To reiterate,if you add an inheritance parent to a publication, dump the database, and restore it into another, an error occurs. Why? Because every child table is added *twice* because of the way publication tables are dumped. Once by itself and again via inheritance expansion when the parent is added. Adding a table again to the same publication is currently an error, which I was wondering if it could be made a no-op instead. > What I would like partitioned tables support to look like is that if we > add partitioned table, the data decoded from any of the partitions would > be sent as if the change was for that partitioned table so that the > partitioning scheme on subscriber does not need to be same as on > publisher. That's nontrivial to do though probably. I agree that it'd be nontrivial. I'm not sure if you're also implying that a row decoded from a partition is *never* published as having been inserted into the partition itself. A row can end up in a partition via both the inserts into the partitioned table and the partition itself. Also, AFAICT, obviously the output pluggin would have to have a dedicated logic to determine which table to publish a given row as coming from (possibly the root partitioned table), since nothing decode-able from WAL is going to have that information. Also, with the current state of partitioning, if a row decoded and published as coming from the partitioned table had no corresponding partition defined on the destination server, an error will occur in the subscription worker I'd guess. Or may be we don't assume anything about whether the table on the subscription end is partitioned or not. Anyway, that perhaps also means that for time being, we might need to go with the following option that Robert mentioned (I suppose strictly in the context of partitioned tables, not general inheritance): (1) That's an error; the user should publish the partitions instead. That is, we should make adding a partitioned table to a publication a user error (feature not supported). Thanks, Amit
On 14/04/17 06:14, Amit Langote wrote: > On 2017/04/14 10:57, Petr Jelinek wrote: >> I don't think inheritance and partitioning should behave the same in >> terms of logical replication. > > I see. > >> >> For me the current behavior with inherited tables is correct. > > OK. > > By the way, what do you think about the pg_dump example/issue I mentioned? > Is that a pg_dump problem or backend? To reiterate, if you add an > inheritance parent to a publication, dump the database, and restore it > into another, an error occurs. Why? Because every child table is added > *twice* because of the way publication tables are dumped. Once by itself > and again via inheritance expansion when the parent is added. Adding a > table again to the same publication is currently an error, which I was > wondering if it could be made a no-op instead. > That's good question. I think it's fair to treat membership of table in publication is "soft object" or "property" rather than real object where we would enforce error on ADD of something that's already there. So I am not against changing it to no-op (like doing alter sequence owned by to column which is the current owner already). >> What I would like partitioned tables support to look like is that if we >> add partitioned table, the data decoded from any of the partitions would >> be sent as if the change was for that partitioned table so that the >> partitioning scheme on subscriber does not need to be same as on >> publisher. That's nontrivial to do though probably. > > I agree that it'd be nontrivial. I'm not sure if you're also implying > that a row decoded from a partition is *never* published as having been > inserted into the partition itself. A row can end up in a partition via > both the inserts into the partitioned table and the partition itself. > Also, AFAICT, obviously the output pluggin would have to have a dedicated > logic to determine which table to publish a given row as coming from > (possibly the root partitioned table), since nothing decode-able from WAL > is going to have that information. > Well, yes that what I mean by nontrivial, IMHO the hard part is defining behavior (the coding is the easy part here). I think there are more or less 3 options, a) either partitions can't be added to publication individually or b) they will always publish their data as their main partitioned table (which for output plugin means same thing, ie we'll only see the rows as changes to partitioned tables) or alternatively c) if partition is added and partitioned table is added we publish changes twice, but that seems like quite bad option to me. This was BTW the reason why I was saying in the original partitioning thread that it's unclear to me from documentation what is general guiding principle in terms of threating partitions as individual objects or not. Currently it's mixed bag, structure is treated as global for whole partitioned tree, but things like indexes can be defined separately on individual partitions. Also existing tables retain some of their differences when they are being added as partitions but other differences are strictly checked and will result in error. I don't quite understand if this is current implementation limitation and we'll eventually "lock down" the differences (when we have global indexes and such) or if it's intended long term to allow differences between partitions and what will be the rules for what's allowed and what not. > Also, with the current state of partitioning, if a row decoded and > published as coming from the partitioned table had no corresponding > partition defined on the destination server, an error will occur in the > subscription worker I'd guess. Or may be we don't assume anything about > whether the table on the subscription end is partitioned or not. > > Anyway, that perhaps also means that for time being, we might need to go > with the following option that Robert mentioned (I suppose strictly in the > context of partitioned tables, not general inheritance): > > (1) That's an error; the user should publish the partitions instead. > Yes I agree with Robert's statement and that's how it should behave already. > That is, we should make adding a partitioned table to a publication a user > error (feature not supported). > It already should be error no? (Unless some change was committed that I missed, I definitely wrote it as error in original patch). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 4/13/17 06:48, Amit Langote wrote: > That is an important consideration because of pg_dump. See below: > > create table foo (a int); > create table bar () inherits (foo); > create publication mypub for table foo; -- bar is added too. > > $ pg_dump -s | psql -e test > <snip> > ALTER PUBLICATION mypub ADD TABLE foo; > ERROR: relation "bar" is already member of publication "mypub" To fix this, pg_dump should emit ADD TABLE ONLY foo. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 4/13/17 01:00, Noah Misch wrote: >> Relative to some of the other open items I'm involved in, I consider >> this a low priority, so I'm not working on it right now. > > By what day should the community look for your next update? Tuesday -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017/04/15 3:53, Peter Eisentraut wrote: > On 4/13/17 06:48, Amit Langote wrote: >> That is an important consideration because of pg_dump. See below: >> >> create table foo (a int); >> create table bar () inherits (foo); >> create publication mypub for table foo; -- bar is added too. >> >> $ pg_dump -s | psql -e test >> <snip> >> ALTER PUBLICATION mypub ADD TABLE foo; >> ERROR: relation "bar" is already member of publication "mypub" > > To fix this, pg_dump should emit ADD TABLE ONLY foo. Yeah, that's one way. Attached is a tiny patch for that. By the way, I noticed that although grammar accepts ONLY and * against a table name to affect whether descendant tables are included, the same is not mentioned in the CREATE PUBLICATION and ALTER PUBLICATION reference pages. I suspect it was probably not intentional, so attached a doc patch for that too. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
On 4/16/17 23:00, Amit Langote wrote: >> To fix this, pg_dump should emit ADD TABLE ONLY foo. > > Yeah, that's one way. Attached is a tiny patch for that. > > By the way, I noticed that although grammar accepts ONLY and * against a > table name to affect whether descendant tables are included, the same is > not mentioned in the CREATE PUBLICATION and ALTER PUBLICATION reference > pages. I suspect it was probably not intentional, so attached a doc patch > for that too. Committed those, with some extra tests. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017/04/17 23:08, Peter Eisentraut wrote: > On 4/16/17 23:00, Amit Langote wrote: >>> To fix this, pg_dump should emit ADD TABLE ONLY foo. >> >> Yeah, that's one way. Attached is a tiny patch for that. >> >> By the way, I noticed that although grammar accepts ONLY and * against a >> table name to affect whether descendant tables are included, the same is >> not mentioned in the CREATE PUBLICATION and ALTER PUBLICATION reference >> pages. I suspect it was probably not intentional, so attached a doc patch >> for that too. > > Committed those, with some extra tests. Thanks. Regards, Amit
On 2017/04/14 21:44, Petr Jelinek wrote: > On 14/04/17 06:14, Amit Langote wrote: >> On 2017/04/14 10:57, Petr Jelinek wrote: >>> For me the current behavior with inherited tables is correct. >> >> By the way, what do you think about the pg_dump example/issue I mentioned? >> Is that a pg_dump problem or backend? To reiterate, if you add an >> inheritance parent to a publication, dump the database, and restore it >> into another, an error occurs. Why? Because every child table is added >> *twice* because of the way publication tables are dumped. Once by itself >> and again via inheritance expansion when the parent is added. Adding a >> table again to the same publication is currently an error, which I was >> wondering if it could be made a no-op instead. >> > > That's good question. I think it's fair to treat membership of table in > publication is "soft object" or "property" rather than real object where > we would enforce error on ADD of something that's already there. So I am > not against changing it to no-op (like doing alter sequence owned by to > column which is the current owner already). The patch committed by Peter should be enough for the pg_dump issue I was talking about (a pg_dump fix after all). It seems that at least Tom and Robert are not too excited about making duplicate membership addition a no-op, so I don't intend to push for it. >>> What I would like partitioned tables support to look like is that if we >>> add partitioned table, the data decoded from any of the partitions would >>> be sent as if the change was for that partitioned table so that the >>> partitioning scheme on subscriber does not need to be same as on >>> publisher. That's nontrivial to do though probably. >> >> I agree that it'd be nontrivial. I'm not sure if you're also implying >> that a row decoded from a partition is *never* published as having been >> inserted into the partition itself. A row can end up in a partition via >> both the inserts into the partitioned table and the partition itself. >> Also, AFAICT, obviously the output pluggin would have to have a dedicated >> logic to determine which table to publish a given row as coming from >> (possibly the root partitioned table), since nothing decode-able from WAL >> is going to have that information. >> > > Well, yes that what I mean by nontrivial, IMHO the hard part is defining > behavior (the coding is the easy part here). I think there are more or > less 3 options, a) either partitions can't be added to publication > individually or b) they will always publish their data as their main > partitioned table (which for output plugin means same thing, ie we'll > only see the rows as changes to partitioned tables) or alternatively c) > if partition is added and partitioned table is added we publish changes > twice, but that seems like quite bad option to me. Note that (a) will amount to reversing what v10 will support, that is, can publish individual leaf partitions, but not the partitioned tables. About (b): sounds good, but not sure of the interface. About (c): agreed that a bad option > This was BTW the reason why I was saying in the original partitioning > thread that it's unclear to me from documentation what is general > guiding principle in terms of threating partitions as individual objects > or not. Currently it's mixed bag, structure is treated as global for > whole partitioned tree, but things like indexes can be defined > separately on individual partitions. Also existing tables retain some of > their differences when they are being added as partitions but other > differences are strictly checked and will result in error. I don't quite > understand if this is current implementation limitation and we'll > eventually "lock down" the differences (when we have global indexes and > such) or if it's intended long term to allow differences between > partitions and what will be the rules for what's allowed and what not. > >> Also, with the current state of partitioning, if a row decoded and >> published as coming from the partitioned table had no corresponding >> partition defined on the destination server, an error will occur in the >> subscription worker I'd guess. Or may be we don't assume anything about >> whether the table on the subscription end is partitioned or not. >> >> Anyway, that perhaps also means that for time being, we might need to go >> with the following option that Robert mentioned (I suppose strictly in the >> context of partitioned tables, not general inheritance): >> >> (1) That's an error; the user should publish the partitions instead. >> > > Yes I agree with Robert's statement and that's how it should behave already. Yes, it does. > >> That is, we should make adding a partitioned table to a publication a user >> error (feature not supported). >> > > It already should be error no? (Unless some change was committed that I > missed, I definitely wrote it as error in original patch). Oh, you're right. Although, the error message could be spelled a bit differently, IMHO, which currently is: create table p (a int, b char) partition by list (a); create publication mypub for table p; ERROR: "p" is not a table DETAIL: Only tables can be added to publications. We could be more explicit here and say the following instead: create publication mypub for table p; ERROR: "p" is a partitioned table DETAIL: Adding partitioned tables to publications is not supported. Thoughts? (a patch is attached for consideration) I think it could be argued that quite a few instances "%s is not a table" could be preceded by such explicit check for if partitioned, but that seems like a topic for a different thread. Only a subset of those sites are such that a table's *being partitioned* prevents it from being processed, for which it might make sense to add the explicit check. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
On 4/18/17 01:58, Amit Langote wrote: > We could be more explicit here and say the following instead: > > create publication mypub for table p; > ERROR: "p" is a partitioned table > DETAIL: Adding partitioned tables to publications is not supported. > > Thoughts? (a patch is attached for consideration) Committed. I added an errhint, moved the tests around a bit to match the existing structure, and added some documentation. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017/04/21 3:22, Peter Eisentraut wrote: > On 4/18/17 01:58, Amit Langote wrote: >> We could be more explicit here and say the following instead: >> >> create publication mypub for table p; >> ERROR: "p" is a partitioned table >> DETAIL: Adding partitioned tables to publications is not supported. >> >> Thoughts? (a patch is attached for consideration) > > Committed. I added an errhint, moved the tests around a bit to match > the existing structure, and added some documentation. Thanks. Regards, Amit