Re: Accidental removal of a file causing various problems

From Pavan Deolasee
Subject Re: Accidental removal of a file causing various problems
Date
Msg-id CABOikdPC=LCZ650F5ka8Bzx3NHaguwv6ZVQe6DByvGV0th83iw@mail.gmail.com
In response to Re: Accidental removal of a file causing various problems  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers


On Sat, Aug 25, 2018 at 1:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Actually, I think the main point is given that we've somehow got into
a situation like that, how do we get out again?

Alvaro and I discussed this off-list a bit and came up with a couple of ideas.

1. Reserve some buffers in shared buffers for system-critical functionality. As this case shows, failed block writes filled the entire shared buffer pool with bad (unwritable) blocks, making the database completely inaccessible, even for remedial actions. So the idea is to set aside, say, the first 100 (or some such number) buffers for system catalogs and allocate buffers for user tables from the remaining pool. This will at least ensure that one bad user table does not bring down the entire cluster. Of course, this may not help if the system catalogs themselves are unwritable, but that's probably a major issue anyway.
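
To make idea 1 concrete, here is a toy model of a partitioned buffer pool. It is only a sketch under stated assumptions, not PostgreSQL code: the names Buffer, BufferPool, RESERVED and POOL_SIZE are hypothetical, the pool is tiny, and victim selection is a plain linear scan rather than the real clock sweep.

```python
from dataclasses import dataclass
from typing import Optional

RESERVED = 4      # hypothetical: buffers set aside for system catalogs
POOL_SIZE = 16    # hypothetical total number of shared buffers


@dataclass
class Buffer:
    tag: Optional[tuple] = None   # (relation name, block number); None if free
    dirty: bool = False
    writable: bool = True         # False models a block whose file is gone


class BufferPool:
    def __init__(self, size=POOL_SIZE, reserved=RESERVED):
        self.buffers = [Buffer() for _ in range(size)]
        self.reserved = reserved

    def _partition(self, is_catalog):
        # Catalog pages use slots [0, reserved); user pages use the rest.
        if is_catalog:
            return range(0, self.reserved)
        return range(self.reserved, len(self.buffers))

    def allocate(self, tag, is_catalog, dirty=False, writable=True):
        """Return the slot index used for tag, or None if no victim exists."""
        for i in self._partition(is_catalog):
            buf = self.buffers[i]
            # A slot is reusable if it is free, clean, or can be flushed.
            if buf.tag is None or not buf.dirty or buf.writable:
                self.buffers[i] = Buffer(tag, dirty, writable)
                return i
        return None
```

In this model, once every user slot holds a dirty page that cannot be flushed, further user allocations fail, but a catalog page can still get a buffer from the reserved range.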

2. Provide either an automatic or a manual way to evict unwritable buffers to a spillover file or set of files. The buffer pool can then be rescued from the critical situation, and the DBA can manually inspect the spillover files and take corrective action, if needed and if feasible. My idea was to create a shadow relfilenode and write buffers to their logical location. Alvaro, though, thinks that writing one block per file (relfilenode/fork/block) is a better idea, since that gives the DBA an easy way to take action. Irrespective of whether we pick one file per block or one per relfilenode, a more interesting question is: should this be automatic or require administrative action?
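
The one-file-per-block variant of idea 2 could look roughly like the sketch below. This is an illustration only: spill_path and spill_block are hypothetical helpers, and the directory layout (spill directory, then relfilenode, then fork, then block number) is just one reading of "relfilenode/fork/block", not an agreed design.

```python
import os

BLCKSZ = 8192  # PostgreSQL's default block size


def spill_path(spill_dir, relfilenode, fork, block):
    # Hypothetical naming scheme: <spill_dir>/<relfilenode>/<fork>/<block>
    return os.path.join(spill_dir, str(relfilenode), fork, str(block))


def spill_block(spill_dir, relfilenode, fork, block, data):
    """Write one unwritable buffer to its own file and return the path."""
    if len(data) != BLCKSZ:
        raise ValueError("expected exactly one block of data")
    path = spill_path(spill_dir, relfilenode, fork, block)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the rescued block reaches disk
    return path
```

With a layout like this, the DBA can see which relation, fork and block each rescued page belongs to just by listing the spill directory.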

Does either of the ideas sound interesting enough for further work? 

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
