counting algorithm for incremental matview maintenance

Поиск
Список
Период
Сортировка
<div style="color:#000; background-color:#fff; font-family:times new roman, new york, times, serif;font-size:12pt">In
surveyingthe literature on $subject, I find that most of the<br />theoretical work related to how to incrementally
update<br/>materialized views based on the matview declaration was published<br />between 1988 and 1993.  The best
paperI have been able to find on<br />the topic was published in ACM SIGMOD in 1993[1], and covers two<br />algorithms:
countingand DRed.  The former should be very fast for<br />non-recursive views, but not very efficient for recursive
views.<br/>The latter algorithm is the other way around -- it looks like it<br />will do well with recursive views but
generallybe slower for<br />non-recursive views.<br /><br />It does not seem feasible to me to implement both
techniquesin a<br />single one-year PostgreSQL release.  In fact, if we have trouble<br />getting everyone onto the
samepage early, we might have to settle<br />for trying to just get some infrastructure into place, without<br
/>anythingto actually make use of it.  That would be unfortunate,<br />since Oracle added incremental maintenance of
matviewsto their<br />existing feature in 1999, and have been improving it regularly<br />since then.  Many other
productsalso have mature implementations<br />of this, and there seems to be a lot of demand for it in<br
/>PostgreSQL. In the best of circumstances it will take years for us<br />to catch up on this front.<br /><br />What I
proposefor 9.4 is this:<br /><br />During the first CF I would like to have a discussion to settle on<br />how best to
dosome things that the counting algorithm requires, so<br />that following the first CF I can try to implement that<br
/>infrastructureand get at least simple matviews working for<br />incremental maintenance, and have something ready for
thesecond<br />CF.  I can hack up a rough patch to suggest a possible approach for<br />purposes of discussion, but I'm
notsure whether either that's<br />necessary or desirable.<br /><br />----------<br />Needed infrastructure changes
follow.<br/>----------<br /><br />First, the algorithm requires a new system column called<br />count_t (or similar). 
Itwould be optional (like oid is) would<br />only be used by tuples in matviews maintained by the counting<br
/>algorithm,and by delta relations used in the matview incremental<br />maintenance process. Tables and matviews not
createdor altered to<br />specify incremental maintenance would not carry this new system<br />column.  Existing
matviewsfrom 9.3 would continue to work as<br />before until and unless they were ALTERed to specify incremental<br
/>maintenance;this form of ALTER would require rewrite of the<br />matview to add the count_t column.<br /><br />This
algorithmalso requires that we have new execution node types<br />(or conditional behavior in existing nodes) to
supportprocessing<br />counts.  It is worth noting that relations without a count_t column<br />will sometimes need to
beprocessed using the new logic (in which<br />case each tuple is treated as having an implied count of one).<br /><br
/>count_tin a matview must always be greater than zero.  In delta<br />relations it can be negative (to reflect tuples
beingdeleted or<br />the "before" image of an update) or positive (to reflect tuples<br />being inserted or the "after"
imageof an update).<br /><br />Examples of the new behavior in nodes used by incremental<br />maintenance are:<br /><br
/> *DISTINCT or GROUP BY: count_t in the result should represent the<br />number of tuples being combined.<br /><br
/> *JOIN: count_t in the result is the product of multiplying the<br />counts from the pair of joined tuples.<br /><br
/> *UNION (without ALL): count_t in the result should be the value<br />from an unmatched tuple or the sum of the
countsof matched tuples,<br />with any result row with a zero count_t being omitted from the<br />result.<br /><br
/>Notethat for nodes other than UNION, the logic is identical except<br />for calculation of the count for the
output.<br/><br />The resulting matviews can be used in the traditional set logic of<br />PostgreSQL, without knowing
aboutor caring about the count_t<br />column, exactly as before; the new system column is only used for<br
/>maintenance.<br/><br />We could drive the triggering of incremental maintenance off of the<br />dependency
informationwhich is already stored, but for performance<br />we probably want to add a new pg_class flag to indicate
thatthe<br />relation is referenced by a matview definition which specifies<br />incremental update.  That would allow
afast path for skipping<br />other tests for DML on non-referenced relations, at the expense of<br />some additional
catalogupdates on some DDL.<br /><br />----------<br />The above is everything that is a significant infrastructure
change<br/>needed for a first cut at incremental maintenance of materialized<br />views, driven by the view
definition. Below is additional detail<br />about what will happen when incremental maintenance for a view is<br
/>specified,for those who wonder why these infrastructure changes<br />are needed.<br />----------<br /><br
/>Essentially,the technique is based on relational algebra.  This<br />allows previously existing theory to prove the
correctnessof the<br />technique and its optimizations.  The addition of the count_t<br />column allows accurate
identificationof exactly which rows need to<br />be touched during incremental maintenance, with minimal overhead.<br
/><br/>As the paper notes, these incremental maintenance techniques are a<br />heuristic; there will always be cases
whereit is faster to<br />regenerate the data from scratch than to use these incremental<br />techniques, so part of
thegoal of having the transactional REFRESH<br />(as described in a separate thread) is to allow some front-end<br
/>teststo attempt to recognize when an incremental approach will be<br />slower than a REFRESH and fall back onto the
transactionalREFRESH<br />technique seamlessly and fairly quietly.  (My gut feeling on the<br />right level for an
ereportof this is DEBUG1, although I can see<br />the argument for a LOG level entry.)<br /><br />The original and
modifiedversions of the relations (tables or<br />other matviews) which define a matview must be available to<br
/>calculatethe matview deltas, so the snapshot information for this<br />must be available and registered until the
matviewdelta has been<br />calculated.  They can be released once the delta has been<br />established and before it has
beenapplied to the matview.  Initial<br />milestones in this development will focus on "eager" maintenance,<br />so
thiswill not initially be a big issue, but will become more<br />important as we work on making more of the work
asynchronous,so<br />that the foreground query is not held up maintaining data which is<br />allowed to be a little
stale.This seems not entirely unrelated to<br />background worker processes and snapshot sharing.<br /><br />I don't
knowwhether we can get to this during 9.4 development, but<br />it might be worth at least considering as the other
workis done:<br />some aggregate columns with a small amount of data kept while the<br />aggregate is being computed
canbe further optimized under this<br />technique by storing the working data in the tuple as some sort of<br />hidden
orsystem column related to the aggregate.  A few of the<br />obvious candidates for such optimization include count(),
sum()and<br />avg().  Note that with this capability, especially if we ever<br />implement the query rewrite capability
formatviews, we would have<br />a nice answer for the count(*) speed complaints.<br /><br />Obviously, there will need
tobe some new syntax around CMV and AMV<br />to support specification of how the matview should be maintained.<br /><br
/>Also,we are moving into a phase where it makes sense to start the<br />bikeshedding on what timings we should
support,how information<br />about the desired timings should be stored, how "freshness" should<br />be tracked, and
whatsyntax we put around all that.  Frankly, I<br />think I will have my hands pretty full with the incremental<br
/>maintenancework, and I see the timing work as largely orthogonal,<br />so it would be great if anyone who was
interestedin that aspect of<br />things could take primary responsibility for moving that along.<br /><br />--<br
/>KevinGrittner<br />EnterpriseDB: http://www.enterprisedb.com<br />The Enterprise PostgreSQL Company<br /><br />[1]
http://dl.acm.org/citation.cfm?id=170066<br/><br /></div> 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Marti Raudsepp
Дата:
Сообщение: Re: PostgreSQL 9.3 beta breaks some extensions "make install"
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: Differential (transactional) REFRESH