Re: DSM segment handle generation in background workers

Поиск
Список
Период
Сортировка
От Thomas Munro
Тема Re: DSM segment handle generation in background workers
Дата
Msg-id CAEepm=1COuNcj_va0Qt2KTEE4rDvhWCixZ-pyh8iUbrusEVBzQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: DSM segment handle generation in background workers  (Noah Misch <noah@leadboat.com>)
Ответы Re: DSM segment handle generation in background workers  (Noah Misch <noah@leadboat.com>)
Список pgsql-hackers
On Wed, Nov 14, 2018 at 3:24 PM Noah Misch <noah@leadboat.com> wrote:
> On Mon, Nov 12, 2018 at 09:39:01AM -0500, Tom Lane wrote:
> > Thomas Munro <thomas.munro@enterprisedb.com> writes:
> > > On Mon, Nov 12, 2018 at 9:34 PM Noah Misch <noah@leadboat.com> wrote:
> > >> Compared to the old code, the new code requires more wall time to visit every
> > >> possible seed value.  New code xor's the PID bits into the fastest-changing
> > >> timestamp bits, so only about twenty bits can vary within any given one-second
> > >> period.  (That assumes a PID space of twenty or fewer bits; fifteen bits is
> > >> the Linux default.)  Is that aspect of the change justified?
> >
> > > Hmm, right.  How about applying pg_bswap32() to one of the terms, as
> > > an easy approximation of reversing the bits?
>
> That seems adequate, yes.  But why not just restore the old algorithm in the
> new function?  (I'm not opposed to revising the algorithm, but I think a
> revision should have explicit goals.  The ease of pg_bswap32() is not itself a
> worthy goal, because the old BackendRun() algorithm was also quite easy.)

Ok.  I rewrote it only because in the new coding I had a TimestampTz
rather than the separate sec and usec components of the old code.  I
didn't intend to revise the algorithm fundamentally, but I missed the
significance of the bit shifting from commit 98c50656c that moved the
faster moving bits up.  I'll change it back.

> > I doubt that's a good idea; to a first approximation, it would mean that
> > half the seed depends only on the PID and the other half only on the
> > timestamp.  Maybe we could improve matters a little by left-shifting the
> > PID four bits or so, but I think we still want it to mix with some
> > rapidly-changing time bits.
> >
> > I'm not really sure that we need to do anything though.  Basically,
> > what we've got here is a tradeoff between how many bits change over
> > a given timespan and how unpredictable those bits are.  I don't see
> > that one of those is necessarily more important than the other.
>
> What counts is the ease of predicting a complete seed.  HEAD's algorithm has
> ~13 trivially-predictable bits, and the algorithm that stood in BackendRun()
> from 98c5065 until 197e4af had no such bits.  You're right that the other 19
> bits are harder to predict than any given 19 bits under the old algorithm, but
> the complete seed remains more predictable than it was before 197e4af.

However we mix them, given that the source code is well known, isn't
an attacker's job really to predict the time and pid, two not
especially well guarded secrets?  I don't see how the mixing matters
in that respect.  (You can also just clobber the seed from SQL, but
that'd be cheating.)

> Goal I recommend: if connections arrive in a Poisson distribution of unknown
> lambda, maximize the number of distinct seeds.

Yeah.  I think bit reversing one of them achieves the maximum number
of distinct seeds by putting the busy ends far apart, but the original
bit shifting is similar.

-- 
Thomas Munro
http://www.enterprisedb.com


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kuntal Ghosh
Дата:
Сообщение: In-place updates and serializable transactions
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Race condition in WaitForBackgroundWorkerStartup