Re: drop/truncate table sucks for large values of shared buffers

From: Amit Kapila
Subject: Re: drop/truncate table sucks for large values of shared buffers
Msg-id: CAA4eK1JyKYq2E8L3DeRE7LVUkEu5UTMFTz-ULMuv6NZyQkV0eg@mail.gmail.com
In reply to: Re: drop/truncate table sucks for large values of shared buffers  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Sun, Jun 28, 2015 at 9:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On 27 June 2015 at 15:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I don't like this too much because it will fail badly if the caller
> >> is wrong about the maximum possible page number for the table, which
> >> seems not exactly far-fetched.  (For instance, remember those kernel bugs
> >> we've seen that cause lseek to lie about the EOF position?)
>
> > If that is true, then our reliance on lseek elsewhere could also cause data
> > loss, for example by failing to scan data during a seq scan.
>
> The lseek point was a for-example, not the entire universe of possible
> problem sources for this patch.  (Also, underestimating the EOF point in
> a seqscan is normally not an issue since any rows in a just-added page
> are by definition not visible to the scan's snapshot.

How do we determine whether a just-added page was added before or after the
scan's snapshot was taken?  If it was added before, then the point Simon made
above is valid.  Does this mean that all other usages of
smgrnblocks()/mdnblocks() are safe with respect to this issue, or that the
consequences there will not be as bad as for this usage?
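
To make the question concrete, here is a minimal sketch of a scan that trusts
a block count obtained once at scan start.  It is a toy model, not PostgreSQL
code: ToyRow, ToyPage, ROWS_PER_PAGE, toy_row_visible() and
toy_seqscan_count() are hypothetical names, and the visibility rule (a row is
visible if its inserting xid is below the snapshot's upper bound) is a
deliberate simplification of the real MVCC checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ROWS_PER_PAGE 2   /* hypothetical, kept tiny for illustration */

/* Toy row: only the inserting transaction id matters here. */
typedef struct ToyRow {
    unsigned xmin;   /* xid that inserted the row */
    bool     used;   /* false means the slot is empty */
} ToyRow;

typedef struct ToyPage {
    ToyRow rows[ROWS_PER_PAGE];
} ToyPage;

/* Simplified visibility: visible if inserted before the snapshot's upper bound. */
static bool toy_row_visible(const ToyRow *row, unsigned snapshot_xmax)
{
    return row->used && row->xmin < snapshot_xmax;
}

/*
 * Count visible rows in pages [0, nblocks_seen).  nblocks_seen models the
 * relation size the scan was told at start; pages beyond it are never read.
 */
static size_t toy_seqscan_count(const ToyPage *pages, size_t nblocks_seen,
                                unsigned snapshot_xmax)
{
    assert(pages != NULL);

    size_t visible = 0;
    for (size_t b = 0; b < nblocks_seen; b++)
        for (size_t r = 0; r < ROWS_PER_PAGE; r++)
            if (toy_row_visible(&pages[b].rows[r], snapshot_xmax))
                visible++;
    return visible;
}
```

In this model, leaving out a page whose rows were all inserted after the
snapshot changes nothing, which is the case Tom describes.  Leaving out a page
that already held visible rows when the snapshot was taken makes the count too
small, which is the case I am asking about.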

>  But I digress.)
>
> > The consequences of failure of lseek in this case are nowhere near as dire,
> > since by definition the data is being destroyed by the user.
>
> I'm not sure what you consider "dire", but missing a dirty buffer
> belonging to the to-be-destroyed table would result in the system being
> permanently unable to checkpoint, because attempts to write out the buffer
> to the no-longer-extant file would fail.

So another idea here could be that, instead of failing, we just ignore the
error when the object to which that page belongs doesn't exist.  That would
make Drop free, because Drop/Truncate would no longer need to invalidate
anything in shared_buffers.  I think this might not be a sane idea, though,
as we would need a way to look up objects from the checkpoint, and we would
need to handle the case where the same Oid is assigned to a new object
(after wraparound?).
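
A rough sketch of that idea and of the Oid-reuse hazard, again as a toy model
rather than real buffer manager code.  ToyBuffer, toy_rel_exists() and
toy_checkpoint() are hypothetical names, and the "catalog" is assumed to be
just an array of live relation ids:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a shared buffer; not PostgreSQL's BufferDesc. */
typedef struct ToyBuffer {
    unsigned rel_id;   /* hypothetical relation identifier */
    unsigned block;    /* block number within the relation */
    bool     dirty;
} ToyBuffer;

/* Linear lookup in a toy list of live relation ids. */
static bool toy_rel_exists(const unsigned *live, size_t nlive, unsigned rel_id)
{
    for (size_t i = 0; i < nlive; i++)
        if (live[i] == rel_id)
            return true;
    return false;
}

/*
 * Tolerant checkpoint: a dirty buffer whose relation is gone is discarded
 * instead of raising an error.  Returns the number of buffers "written";
 * *skipped receives the number discarded.
 */
static size_t toy_checkpoint(ToyBuffer *pool, size_t n,
                             const unsigned *live, size_t nlive,
                             size_t *skipped)
{
    assert(pool != NULL && skipped != NULL);

    size_t written = 0;
    *skipped = 0;
    for (size_t i = 0; i < n; i++) {
        if (!pool[i].dirty)
            continue;
        if (toy_rel_exists(live, nlive, pool[i].rel_id))
            written++;      /* the real write would happen here */
        else
            (*skipped)++;   /* relation dropped: ignore rather than fail */
        pool[i].dirty = false;
    }
    return written;
}
```

As long as the dropped relation's id stays unused, the stale buffer is
skipped.  If the same id is handed to a new relation before the checkpoint
runs, toy_rel_exists() returns true and the stale page is "written" into the
new relation, which is why the lookup alone is not enough.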



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
