Re: Hot Standby b-tree delete records review
From | Heikki Linnakangas |
---|---|
Subject | Re: Hot Standby b-tree delete records review |
Date | |
Msg-id | 4CD931E0.3020607@enterprisedb.com |
In response to | Re: Hot Standby b-tree delete records review (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: Hot Standby b-tree delete records review
(Simon Riggs <simon@2ndQuadrant.com>)
|
List | pgsql-hackers |

(cleaning up my inbox, and bumped into this..)

On 22.04.2010 12:31, Simon Riggs wrote:
> On Thu, 2010-04-22 at 12:18 +0300, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote:
>>>
>>>>>>>> If none of the removed heap tuples were present anymore, we currently
>>>>>>>> return InvalidTransactionId, which kills/waits out all read-only
>>>>>>>> queries. But if none of the tuples were present anymore, the read-only
>>>>>>>> queries wouldn't have seen them anyway, so ISTM that we should treat
>>>>>>>> InvalidTransactionId return value as "we don't need to kill anyone".
>>>>>>> That's not the point. The tuples were not themselves the sole focus,
>>>>>> Yes, they were. We're replaying a b-tree deletion record, which removes
>>>>>> pointers to some heap tuples, making them unreachable to any read-only
>>>>>> queries. If any of them still need to be visible to read-only queries,
>>>>>> we have a conflict. But if all of the heap tuples are gone already,
>>>>>> removing the index pointers to them can't change the situation for any
>>>>>> query. If any of them should've been visible to a query, the damage was
>>>>>> done already by whoever pruned the heap tuples leaving just the
>>>>>> tombstone LP_DEAD item pointers (in the heap) behind.
>>>>> You're missing my point. Those tuples are indicators of what may lie
>>>>> elsewhere in the database, completely unreferenced by this WAL record.
>>>>> Just because these referenced tuples are gone doesn't imply that all
>>>>> tuple versions written by the as yet-unknown-xids are also gone. We
>>>>> can't infer anything about the whole database just from one small group
>>>>> of records.
>>>> Have you got an example of that?
>>>
>>> I don't need one, I have suggested the safe route. In order to infer
>>> anything, and thereby further optimise things, we would need proof that
>>> no cases can exist, which I don't have. Perhaps we can add "yet", not
>>> sure about that either.
>>
>> It's good to be safe rather than sorry, but I'd still like to know
>> because I'm quite surprised by that, and got me worried that I don't
>> understand how hot standby works as well as I thought I did. I thought
>> the point of stopping replay/killing queries at a b-tree deletion record
>> is precisely that it makes some heap tuples invisible to running
>> read-only queries. If it doesn't make any tuples invisible, why do any
>> queries need to be killed? And why was it OK for them to be running just
>> before replaying the b-tree deletion record?
>
> I'm sorry but I'm too busy to talk further on this today. Since we are
> discussing a further optimisation rather than a bug, I hope it is OK to
> come back to this again later.

Would now be a good time to revisit this? I still don't see why a b-tree
deletion record should conflict with anything, if all the removed index
tuples point to just LP_DEAD tombstones in the heap.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
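
For readers following the thread, the rule proposed above can be sketched in a few lines of C. This is only an illustration of the argument, not PostgreSQL source: the struct `SketchHeapItem` and both `sketch_*` functions are hypothetical names, each heap item is reduced to "only an LP_DEAD tombstone remains" versus "tuple still present with a deleting xid", and xids are compared with a plain `>` that ignores wraparound. Whether the rule is actually safe is exactly what the thread leaves open.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL definitions, so the sketch is self-contained. */
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/*
 * Hypothetical, simplified view of one heap item that a removed index
 * tuple pointed to. Not a real PostgreSQL structure.
 */
typedef struct SketchHeapItem
{
    bool          is_dead_stub; /* only an LP_DEAD tombstone is left */
    TransactionId xmax;         /* deleting xid, if the tuple is still present */
} SketchHeapItem;

/*
 * Return the newest deleting xid among heap tuples that are still present,
 * or InvalidTransactionId if every item is already an LP_DEAD tombstone.
 * Assumption: plain integer comparison, no xid wraparound handling.
 */
static TransactionId
sketch_latest_removed_xid(const SketchHeapItem *items, size_t nitems)
{
    TransactionId latest = InvalidTransactionId;

    for (size_t i = 0; i < nitems; i++)
    {
        if (items[i].is_dead_stub)
            continue;           /* tuple already pruned, nothing to learn */
        if (items[i].xmax > latest)
            latest = items[i].xmax;
    }
    return latest;
}

/*
 * The proposal in the thread: an InvalidTransactionId result means
 * "we don't need to kill anyone", so no conflict resolution is needed.
 */
static bool
sketch_needs_conflict_resolution(TransactionId latest_removed_xid)
{
    return latest_removed_xid != InvalidTransactionId;
}
```

Under the behaviour described as current in the thread, the all-tombstones case would instead be treated as a conflict with every read-only query; the sketch shows only the proposed alternative.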