Re: proposal: make NOTIFY list de-duplication optional

From: Filip Rembiałkowski
Subject: Re: proposal: make NOTIFY list de-duplication optional
Msg-id: CAP_rww=n3sPMGeKh4ERb4BpC46uDC-SMkFyMZMUQfb0TTwZwgw@mail.gmail.com
In reply to: Re: proposal: make NOTIFY list de-duplication optional (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

On Sat, Feb 6, 2016 at 5:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Brendan Jurd <direvus@gmail.com> writes:
>> On Sat, 6 Feb 2016 at 12:50 Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Yeah, I agree that a GUC for this is quite unappetizing.
>
>> How would you feel about a variant for calling NOTIFY?
>
> If we decide that this ought to be user-visible, then an extra NOTIFY
> parameter would be the way to do it.  I'd much rather it "just works"
> though.  In particular, if we do start advertising user control of
> de-duplication, we are likely to start getting bug reports about every
> case where it's inexact, eg the no-checks-across-subxact-boundaries
> business.

Is it not enough to say "the database server can decide to deliver a
single notification only", which is already said in the docs?

The ALL keyword would be a clearly separated "do-nothing" version.

>
>> Optimising the remove-duplicates path is still probably a worthwhile
>> endeavour, but if the user really doesn't care at all about duplication, it
>> seems silly to force them to pay any performance price for a behaviour they
>> didn't want, no?
>
> I would only be impressed with that argument if it could be shown that
> de-duplication was a significant fraction of the total cost of a typical
> NOTIFY cycle.

Even if a typical NOTIFY cycle excludes processing 10k or 100k
messages, why penalize users who have bigger transactions?

> Obviously, you can make the O(N^2) term dominate if you
> try, but I really doubt that it's significant for reasonable numbers of
> notify events per transaction.

Yes, it is hard to observe for fewer than a few thousand messages in one
transaction.
But big data happens. And then the numbers get really bad.
In my test with 40k messages, it is 400 ms without de-duplication versus
9 seconds with it: about 22 times slower. For 200k messages, it is
2 seconds versus 250 seconds: 125 times slower.
And I tested with very short payload strings, so strcmp() did not have
much to do.
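
To illustrate why the gap widens so quickly, here is a minimal Python
model of the two behaviours. It is only a sketch, not the PostgreSQL
code: it assumes the duplicate check is a linear scan over the list of
pending notifications (which is what the O(N^2) term quoted above
suggests), and the function names are hypothetical. It counts
comparisons instead of measuring time.

```python
def queue_with_dedup(events):
    """Model of de-duplicating NOTIFY: scan the pending list before each append.

    `events` is an iterable of (channel, payload) tuples. Returns the
    pending list and the number of comparisons performed.
    """
    pending = []
    comparisons = 0
    for ev in events:
        duplicate = False
        for old in pending:          # linear scan, one comparison per entry
            comparisons += 1
            if old == ev:
                duplicate = True
                break
        if not duplicate:
            pending.append(ev)
    return pending, comparisons


def queue_all(events):
    """Model of the proposed "do-nothing" variant: append without checking."""
    return list(events), 0


def distinct_events(n, channel="chan"):
    """Worst case for the scan: n notifications with n different payloads."""
    return [(channel, str(i)) for i in range(n)]


if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        _, cmps = queue_with_dedup(distinct_events(n))
        print(n, cmps)
```

With N distinct messages the scan performs N(N-1)/2 comparisons, so
5 times more messages means roughly 25 times more comparisons. That is
in line with going from 9 seconds at 40k messages to 250 seconds at
200k.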


