Re: [HACKERS] Parallel Hash take II

Поиск
Список
Период
Сортировка
От Thomas Munro
Тема Re: [HACKERS] Parallel Hash take II
Дата
Msg-id CAEepm=1fWpa0ZcHDB6JH-gBtH8EqSdAqJOMtb5uao9Rvp3Gp-g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Parallel Hash take II  (Thomas Munro <thomas.munro@enterprisedb.com>)
Ответы Re: [HACKERS] Parallel Hash take II  (Thomas Munro <thomas.munro@enterprisedb.com>)
Re: [HACKERS] Parallel Hash take II  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Sat, Dec 2, 2017 at 4:46 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Sat, Dec 2, 2017 at 3:54 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Sat, Dec 2, 2017 at 1:55 PM, Andres Freund <andres@anarazel.de> wrote:
>>> - As we don't clean temp files after crash-restarts it isn't impossible
>>>   to have a bunch of crash-restarts and end up with pids *and* per-pid
>>>   shared file set counters reused. Which'd lead to conflicts. Do we care?
>>
>> We don't care.  PathNameCreateTemporaryDir() on a path that already
>> exists will return silently.  PathNameCreateTemporaryFile() on a path
>> that already exists, as mentioned in a comment and following an
>> existing convention, will open and truncate it.  So either it was
>> really an orphan and that is a continuation of our traditional
>> recycling behaviour, or the calling code has a bug and used the same
>> path twice and it's not going to end well.
>
> On further reflection, there are problems with that higher up.  (1)
> Even though PathNameCreateTemporaryFile() will happily truncate and
> reuse an orphaned file when BufFileCreateShared() calls it,
> BufFileOpenShared() could get confused by the orphaned files.  It
> could believe that XXX.1 is a continuation of XXX.0, when in fact it
> is junk left over from a crash restart.  Perhaps BufFileCreateShared()
> needs to delete XXX.{N+1} if it exists, whenever it creates XXX.{N}.
> That will create a gap in the series of existing files that will cause
> BufFileOpenShared()'s search to terminate.  (2) BufFileOpenShared()
> thinks that the absence of an XXX.0 file means there is no BufFile by
> this name, when it could mistakenly open pre-crash junk due to a
> colliding name.  I use that condition as information, but I think I
> can fix that easily by using SharedTuplestoreParticipant::npage == 0
> to detect that there should be no file, rather than trying to open it,
> and then I can define this problem away by declaring that
> BufFileOpenShared() on a name that you didn't call
> BufFileCreateShared() on invokes undefined behaviour.  I will make it
> so.

Here is a patch to deal with that problem.  Thoughts?

I suppose if we wanted to make this type of problem go away, but still
keep files for forensic purposes, we could introduce a "restart
number", and stick it into the top level temporary directory's name.
That way you'd never be able to collide with files created before a
crash-restart, and we could add O_EXCL so it'd become an error to try
to create the same filename again.

I'll post a new SharedTuplestore and Parallel Hash patch set shortly.

-- 
Thomas Munro
http://www.enterprisedb.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: [HACKERS] Proposal: Local indexes for partitioned table
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: How to use set/reset role in contrib_regression test?