Re: record identical operator - Review
От | Bruce Momjian |
---|---|
Тема | Re: record identical operator - Review |
Дата | |
Msg-id | 20130926215008.GB7113@momjian.us обсуждение исходный текст |
Ответ на | Re: record identical operator - Review (Steve Singer <steve@ssinger.info>) |
Ответы |
Re: record identical operator - Review
(Bruce Momjian <bruce@momjian.us>)
|
Список | pgsql-hackers |
On Thu, Sep 19, 2013 at 06:31:38PM -0400, Steve Singer wrote: > On 09/12/2013 06:27 PM, Kevin Grittner wrote: > >Attached is a patch for a bit of infrastructure I believe to be > >necessary for correct behavior of REFRESH MATERIALIZED VIEW > >CONCURRENTLY as well as incremental maintenance of matviews. > > Here is attempt at a review of the v1 patch. > > There has been a heated discussion on list about if we even want > this type of operator. I will try to summarize it here > > The arguments against it are > * that a record_image_identical call on two records that returns > false can still return true under the equality operator, and the > records might (or might not) appear to be the same to users > * Once we expose this as an operator via SQL someone will find it > and use it and then complain when we change things such as the > internal representation of a datatype which might > break there queries > * The differences between = and record_image_identical might not be > understood by everywhere who finds the operator leading to confusion > * Using a criteria other than equality (the = operator provided by > the datatype) might cause problems if we later provide 'on change' > triggers for materialized views > > The arguments for this patch are > * We want the materialized view to return the same value as would be > returned if the query were executed directly. This not only means > that the values should be the same according to a datatypes = > operator but that they should appear the same 'to the eyeball'. > * Supporting the materialized view refresh check via SQL makes a lot > of sense but doing that requires exposing something via SQL > * If we adequately document what we mean by record_image_identical > and the operator we pick for this then users shouldn't be surprised > at what they get if they use this This is a good summary. I think there are a few things that make this issue difficult to decide: 1. We have to use an operator to give the RMVC (REFRESH MATERIALIZED VIEW CONCURRENTLY) SPI query the ability to optimize this query. If we could do this with an SQL or C function, there would be less concern about the confusion caused by adding this capability. Question: If we are comparing based on some primary key, why do we need this to optimize? Certainly the primary key index will be used, no? 2. Agregates are already non-deterministic, so some feel that adding this feature doesn't improve much. The counter-argument is that without the binary row comparison ability, rows could be returned that could _never_ have been produced by the base data, which is more of a user surprise than non-deterministic rows. 3. Our type casting and operators are already complex, and adding another set of operators only compounds that. 4. There are legitimate cases where tool makers might want the ability to compare rows binarily, e.g. for replication, and adding these operators would help with that. I think we need to see a patch from Kevin that shows all the row comparisons documented and we can see the impact of the user-visibile part. One interesting approach would be to only allow the operator to be called from SPI queries. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
В списке pgsql-hackers по дате отправления:
Следующее
От: Bruce MomjianДата:
Сообщение: Re: [PATCH] pg_upgrade: Split off pg_fatal() from pg_log()