Re: matview niceties: pick any two of these three
От | Kevin Grittner |
---|---|
Тема | Re: matview niceties: pick any two of these three |
Дата | |
Msg-id | 1367611294.85081.YahooMailNeo@web162902.mail.bf1.yahoo.com обсуждение исходный текст |
Ответ на | Re: matview niceties: pick any two of these three (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: matview niceties: pick any two of these three
|
Список | pgsql-hackers |
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Kevin Grittner <kgrittn@ymail.com> writes: >> Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> The current matview design gets around this problem by >>> requiring that transition between scannable and unscannable >>> states involve a complete table rewrite, and thus the >>> transactionality issue can be hidden behind a transactional >>> update of the matview's pg_class.relfilenode field. IMO, that >>> is obviously a dead-end design, because we are going to want >>> scannability status updates associated with partial updates of >>> the matview's contents. > >> I don't think the change between populated and non-populated >> needs to change without a new heap being swapped in. > > I think you are using the distinction between "populated" and > "scannable" to dodge the issue. My point is that whenever we add > some better control over whether matviews are considered > too-stale-to-use, it is going to be important that the staleness > state can be changed, in either direction, without a full rewrite > of the matview's contents. Whether you call that "populated", or > "scannable", or "stale", or "warm fuzzy" doesn't matter to me. > If we can't do that then the feature is not actually going to be > of any great use. I suspect that we will never support scanning of a non-populated matview, but that other considerations for what is scannable will be added. That's not a word game, it's an important semantic difference. > And I want such updates to be transactional, because they'll very > likely be connected to transactional updates of the matview's > contents. Yep, that is definitely my intent. > So I continue to maintain that the current design is a dead end. Not if you don't muddle multiple concepts as one thing. >> We'll get into all kinds of design trouble if we conflate these >> two separate concepts. > > Perhaps I still don't understand what useful distinction is > there. If there is a critical distinction, why is the current > patch exposing only one state value to users? We don't have any other criteria for what is scannable implemented yet in this release, but perhaps you're right that we should have a pg_relation_is_populated() function available to user in addition to the pg_relation_is_scannable() function. For the moment, the latter would just return the result of calling the former. I hadn't thought the former was useful at the SQL level at this point, but thinking about it now, I wonder whether "populated" wouldn't be the more appropriate test for pg_dump to use. I'm not sure just yet, but it bears some thinking. More on that tomorrow after I sleep on it. >> I could probably add a hundred other desirable properties for a >> materialized view implementation; I was trying to list the ones >> which were still up-for-grabs for the initial 9.3 >> implementation. If we want to make progress on catching up to >> other major databases on this, it will take progress in many >> releases; and every one of those will lack something that >> someone wants. OK, a lot of things that a lot of people want. >> But if we can't commit something that is less than complete or >> less than perfect to make incremental change, we're not going to >> be getting anywhere very fast. > > Yes, exactly. I am not sure why you are so desperate to have > unlogged matviews in 9.3 that you are willing to risk finding > yourself boxed in and unable to make future improvements that > everybody agrees are useful. ISTM that plain matviews are a > perfectly good first-cut feature and we can add unlogged ones > whenever we can think of a better way to do it. I think matviews are a useful feature without an unlogged option; but I think they will be significantly more useful with the option. I agree it's a judgment call whether the hacks needed to support it for now are worth the benefit of the option. Based on my reading of the user needs and the code, I think it is worth the trade-off, but I can understand arguments to the contrary. The lack of specifics on where problems would be has left me unconvinced. The most specific arguments so far have been with the "extend from zero page length during incremental maintenance" (which I hope I have made clear why I don't see a problem there, but maybe I need to say more...), and the argument that we might want to stuff metadata at the front of the heap to track this, which is out of the question for 9.3 and would require the same response later whether we use the zero-length hack for now or not. >> If I thought that the current hack for tracking the populated >> state would be permanent, or that it would be so hard to replace >> in a later release that it would impede further development, I >> would be the first one to rip it out. I haven't been able to >> see such a risk, and nobody has adequately explained where they >> think the risk lies. > > You can't update that state transactionally, and you can't update > it without a complete rewrite. Right, the *populated* state, indicating whether the query has been run to initially fill the matview with data, blocks other access. > The only solutions you've offered to that are devices Rube > Goldberg would be proud of (not that the code wouldn't fill his > heart with joy as it is). This is not a good place to be for > a first-cut implementation, at least not if we don't want to find > ourselves needing to make a representation change that pg_upgrade > can't readily cope with. I've already explained why I don't think pg_upgrade is an issue, short of the metadata in the first heap page change which was suggested, but I'll give it another go. pg_dump uses CREATE MATERIALIZED VIEW ... WITH NO DATA in the same phase as views are created, because each can reference the other. If we come up with a new mechanism for indicating non-populated matviews in a later release, the server from such a release which is digesting those commands will use its new mechanism. pg_upgrade need not know or care what the server is doing, or even that these commands are in the schema dump. So no problem there. For the pg_dump upgrade path, the REFRESH commands which may follow for some or all matviews will likewise do the right thing for the version of the server processing them. Still no problem. At the point where we move information about whether a matview is populate to the catalogs, we will probably, like at least one other database product, add an ALTER MATERIALIZED VIEW option to flag a matview as scannable, and the pg_dump of that new release could be modified to emit those where appropriate in the schema dump phase when the option used by pg_upgrade is set. That's some coding, but hardly a big deal. So where is the Rube Goldberg solution in that? > The long and the short of it is this: having unlogged matviews in > 9.3 is not worth taking that risk for. IMO anyway. We differ on that, but maybe you see a risk I'm missing, so I'm trying to understand what you see. >> I will certainly go along with the consensus, but from my >> perspective, living with the current hack for determining >> whether a matview has been populated is worth the benefit of >> having unlogged matviews in 9.3. I don't expect that hack to be >> viable for more that one or two releases; I just think that it's >> a practical matter of working with the tools we have *now* for >> unlogged relations to try to create a new type of unlogged >> relation. > > If you yourself admit that this hack has got a very short life > expectancy, why are you so willing to buy into the assumption > that you can get rid of it readily? I do not grant that > assumption, and I'm not sure how come you think that hacking > pg_upgrade to the point of being able to get rid of it will be > zero-cost, or even acceptable to the community at all. I have not heard a single suggestion of why pg_upgrade would need a single line changed. I'm believe that pg_dump will need some ALTER statements added when we add the catalog-based flagging of this, but that's true whether or not we use this technique in the meantime. The reason I think it can be readily abandoned is that the heap remains a valid heap according to the historical rules, and the rules which by all appearances would follow. So it is a question of whether the populated state can be transferred from a version using this technique to a version which is not with little or no pain (which I've explained above seems quite feasible to me) or whether we have to go so far as to tell people to REFRESH their unlogged materialized views in the release notes for the new release. Unlikely as that seems ot me, I think people would accept that. What do you see that I'm missing? -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления: