Re: proposal: CREATE DATABASE vs. (partial) CHECKPOINT

From: Atri Sharma
Subject: Re: proposal: CREATE DATABASE vs. (partial) CHECKPOINT
Date:
Msg-id: CAOeZVicHLJBD7nAJynMq8QaFfAjy-pyXcL9ZXAcuaKX=RrBVaA@mail.gmail.com
In response to: Re: proposal: CREATE DATABASE vs. (partial) CHECKPOINT  ("Tomas Vondra" <tv@fuzzy.cz>)
Responses: Re: proposal: CREATE DATABASE vs. (partial) CHECKPOINT
List: pgsql-hackers


> IMHO writing all the data into WAL would be the cleanest solution.

> Also, what is a small database? I don't think a static value will work,
> because the sweet spot between the current approach (forcing two
> checkpoints) and writing everything in WAL depends on the amount of dirty
> buffers that need to be checkpointed. Which is mostly driven by the size
> of shared buffers and write activity - for small shared buffers and/or
> mostly-read workloads, checkpoints are cheap, so the 'small database'
> threshold (when choosing the WAL approach) is much higher than for large
> shared buffers or write-heavy workloads.

So are you proposing a heuristic based on the amount of dirty data in shared buffers and on recent write activity? Do you have something in mind that would work for general workloads as well?
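
To make that concrete, the kind of decision I have in mind would look something like the sketch below. This is purely illustrative, not backend code: prefer_wal_logged_copy() is a made-up function, the dirty-buffer count would have to come from scanning the buffer descriptors, and the cost weights are placeholders in the spirit of seq_page_cost/random_page_cost.

/*
 * Illustrative sketch only.  Compare the estimated cost of forcing a
 * checkpoint (dirty buffers written as random I/O scattered over the
 * datafiles) against WAL-logging the template database's contents
 * (sequential I/O), and pick the cheaper strategy.
 */
static bool
prefer_wal_logged_copy(Size template_size, int dirty_buffers,
                       double seq_page_cost, double random_page_cost)
{
    double      checkpoint_cost = dirty_buffers * random_page_cost;
    double      wal_cost = ((double) template_size / BLCKSZ) * seq_page_cost;

    return wal_cost < checkpoint_cost;
}

The hard part, of course, is getting a cheap but reasonably accurate dirty-buffer estimate and keeping the weights honest across workloads.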


> So maybe if we could determine the amount of data to be checkpointed, and
> then base the decision on that, that'd work better? This would also have
> to take into account that writing into WAL is sequential, while
> checkpoints usually cause random writes all over the datafiles (which is
> more expensive).
>
> Another option might be forcing just a "spread" checkpoint, not the
> immediate one (which is what we do now). That would not fix the CREATE
> DATABASE duration (actually, it would make it longer), but it would lower
> the impact on other activity on the machine.



I believe this to be the cleanest way to reduce the amount of I/O generated. If I understand correctly, the original problem you mentioned was not the time CREATE DATABASE takes but rather the amount of I/O each one generates.
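
For reference, unless I'm misreading src/backend/commands/dbcommands.c, createdb() currently requests the checkpoint like this, so the "spread" variant would essentially mean dropping CHECKPOINT_IMMEDIATE and letting the checkpointer throttle the writes per checkpoint_completion_target:

    /* what createdb() does today: an immediate, unthrottled checkpoint */
    RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);

    /* the "spread" variant would be, roughly */
    RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT);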

This also leads me to wonder whether it makes sense to explore group commit around the creation of files for new databases (within the same backend, of course). This could be opt-in, for a user who knows he/she is going to create a lot of databases in the near future and is fine with one large spike of I/O in one go. Again, it might be even more broken than the current scenario, but it depends on what the user wants...
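
Just to illustrate the shape of that idea (everything below is hypothetical -- no such GUC or machinery exists today, and deferring the durability work has obvious crash-safety implications that would need much more thought):

/*
 * Hypothetical sketch: count the CREATE DATABASEs issued by this
 * backend and defer the forced checkpoint until a (made-up) batch size
 * is reached, so several creations share a single I/O spike.
 * createdb_batch_size is an invented GUC; recovery would have to cope
 * with the window where the copied files are not yet durable.
 */
static int  pending_createdbs = 0;

static void
createdb_batch_checkpoint(void)
{
    if (++pending_createdbs >= createdb_batch_size) /* hypothetical GUC */
    {
        RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE |
                          CHECKPOINT_WAIT);
        pending_createdbs = 0;
    }
}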

--
Regards,

Atri
l'apprenant
