Re: CREATE DATABASE with filesystem cloning

Поиск
Список
Период
Сортировка
От Ranier Vilela
Тема Re: CREATE DATABASE with filesystem cloning
Дата
Msg-id CAEudQApu=jn0QOeG9YFfXPMysywVAGCjQNY1GtTabMzBd2UOGw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: CREATE DATABASE with filesystem cloning  (Nazir Bilal Yavuz <byavuz81@gmail.com>)
Список pgsql-hackers


Em qua., 8 de mai. de 2024 às 10:06, Nazir Bilal Yavuz <byavuz81@gmail.com> escreveu:
Hi,

On Wed, 8 May 2024 at 15:23, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Em qua., 8 de mai. de 2024 às 08:42, Nazir Bilal Yavuz <byavuz81@gmail.com> escreveu:
>>
>> Hi,
>>
>> On Wed, 8 May 2024 at 14:16, Ranier Vilela <ranier.vf@gmail.com> wrote:
>> >
>> >
>> > Em qua., 8 de mai. de 2024 às 04:37, Nazir Bilal Yavuz <byavuz81@gmail.com> escreveu:
>> >>
>> >> Hi Ranier,
>> >>
>> >> Thanks for looking into this!
>> >>
>> >> I am not sure why but your reply does not show up in the thread, so I
>> >> copied your reply and answered it in the thread for visibility.
>> >>
>> >> On Tue, 7 May 2024 at 16:28, Ranier Vilela <ranier.vf@gmail.com> wrote:
>> >> >
>> >> > I know it's coming from copy-and-paste, but
>> >> > I believe the flags could be:
>> >> > - dstfd = OpenTransientFile(tofile, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
>> >> > + dstfd = OpenTransientFile(tofile, O_CREAT | O_WRONLY | O_TRUNC | O_EXCL | PG_BINARY);
>> >> >
>> >> > The flags:
>> >> > O_WRONLY | O_TRUNC
>> >> >
>> >> > Allow the OS to make some optimizations, if you deem it possible.
>> >> >
>> >> > The flag O_RDWR indicates that the file can be read, which is not true in this case.
>> >> > The destination file will just be written.
>> >>
>> >> You may be right about the O_WRONLY flag but I am not sure about the
>> >> O_TRUNC flag.
>> >>
>> >> O_TRUNC from the linux man page [1]: If the file already exists and is
>> >> a regular file and the access mode allows writing (i.e., is O_RDWR or
>> >> O_WRONLY) it will be truncated to length 0.  If the file is a FIFO  or
>> >> terminal device file, the O_TRUNC flag is ignored. Otherwise, the
>> >> effect of O_TRUNC is unspecified.
>> >
>> > O_TRUNC is usually used in conjunction with O_WRONLY.
>> > See the example at:
>> > open.html
>> >
>> > O_TRUNC signals the OS to forget the current contents of the file, if it happens to exist.
>> > In other words, there will be no seeks, only and exclusively writes.
>>
>> You are right; the O_TRUNC flag truncates the file, if it happens to
>> exist. But it should not exist in the first place because the code at
>> [2] should check this before the OpenTransientFile() call. Even if we
>> assume somehow the check at [2] does not work, I do not think the
>> correct operation is to truncate the contents of the existing file.
>
> I don't know if there is a communication problem here.
> But what I'm trying to suggest refers to the destination file,
> which doesn't matter if it exists or not?

I do not think there is a communication problem. Actually it matters
because the destination file should not exist, there is a code [2]
which already checks and confirms that it does not exist.
I got it.
 

>
> If the destination file does not exist, O_TRUNC is ignored.
> If the destination file exists, O_TRUNC truncates the current contents of the file.
> I don't see why you think it's a problem to truncate the current content if the destination file exists.
> Isn't he going to be replaced anyway?

'If the destination file does not exist' means the code at [2] works
well and we do not need the O_TRUNC flag.
True, the O_TRUNC is ignored in this case.
 

'If the destination file already exists' means the code at [2] is
broken somehow and there is a high chance that we could truncate
something that we do not want to. For example, there is a foo db and
we want to create bar db, Postgres chose the foo db's location as the
destination of the bar db (which should not happen but let's assume
somehow checks at [2] failed), then we could wrongly truncate the foo
db's contents.
Of course, truncating the wrong file would be pretty bad.
 

Hence, if Postgres works successfully I think the O_TRUNC flag is
unnecessary but if Postgres does not work successfully, the O_TRUNC
flag could cause harm.
The disaster will happen anyway, but of course we can help in some way.
Even without truncating, the wrong file will be destroyed anyway.


>
> Unless we want to preserve the current content (destination file), in case the copy/clone fails?

Like I said above, Postgres should not choose the existing file as the
destination file.

Also, we have O_CREAT | O_EXCL flags together, from the link [3] you
shared before: If O_CREAT and O_EXCL are set, open() shall fail if the
file exists. So, overwriting the already existing file is already
prevented.
That said, I agree that not using O_TRUNC helps in some way.

best regards,
Ranier Vilela

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Nazir Bilal Yavuz
Дата:
Сообщение: Re: CREATE DATABASE with filesystem cloning
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: AIX support