Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Дата
Msg-id CA+Tgmoa9hzdKYEWci9+45X0Z81+BdL6H4Pkaqzsz=vLxWKHX9w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Ответы Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Mon, Dec 6, 2021 at 9:23 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> One last point - If we try to clone a huge database, as expected CREATE DATABASE emits a lot of WALs, causing a lot
ofintermediate checkpoints which seems to be affecting the performance slightly.
 

Yes, I think this needs to be characterized better. If you have a big
shared buffers setting and a lot of those buffers are dirty and the
template database is small, all of which is fairly normal, then this
new approach should be much quicker. On the other hand, what if the
situation is reversed? Perhaps you have a small shared buffers and not
much of it is dirty and the template database is gigantic. Then maybe
this new approach will be slower. But right now I think we don't know
where the crossover point is, and I think we should try to figure that
out.

So for example, imagine tests with 1GB of shard_buffers, 8GB, and
64GB. And template databases with sizes of whatever the default is,
1GB, 10GB, 100GB. Repeatedly make 75% of the pages dirty and then
create a new database from one of the templates. And then just measure
the performance. Maybe for large databases this approach is just
really the pits -- and if your max_wal_size is too small, it
definitely will be. But, I don't know, maybe with reasonable settings
it's not that bad. Writing everything to disk twice - once to WAL and
once to the target directory - has to be more expensive than doing it
once. But on the other hand, it's all sequential I/O and the data
pages don't need to be fsync'd, so perhaps the overhead is relatively
mild. I don't know.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Mark Dilger
Дата:
Сообщение: Re: Optionally automatically disable logical replication subscriptions on error
Следующее
От: Robert Haas
Дата:
Сообщение: Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)