Обсуждение: notify duplicate elimination performance
I know that it is not a big problem for most users, but allowing a very
large number of notifications while using linear search is a bit dumb.
I can fix this with a very small modification to
struct Notification:
{char *channel ;char *payload ;uint32 hash ;struct Notification *left ;struct Notification *right ;
}
AsyncExistsPendingNotify does an iterative binary tree search.
The tree is insert-only, there is no need for rebalancing, and the code
is quite simple.
Any comments?
Hi,
On 2014-02-08 18:56:41 +0100, Hardy Falk wrote:
> I know that it is not a big problem for most users, but allowing a very
> large number of notifications while using linear search is a bit dumb.
> I can fix this with a very small modification to
> struct Notification:
> {
> char *channel ;
> char *payload ;
> uint32 hash ;
> struct Notification *left ;
> struct Notification *right ;
> }
> AsyncExistsPendingNotify does an iterative binary tree search.
> The tree is insert-only, there is no need for rebalancing, and the code
> is quite simple.
> Any comments?
Well, you didn't add any code, so it's hard to say... Simple ways of
doing what I think you describe will remove the queue's order. Do you
preserve the ordering guarantees?
Greetings,
Andres Freund
-- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services
> Well, you didn't add any code, so it's hard to say... Simple ways of
> doing what I think you describe will remove the queue's order. Do you
> preserve the ordering guarantees?
>
> Greetings,
>
> Andres Freund
>
Yes, the order is preserved.
I didn't remove the the original list code.
The tree is just an additional access path.
> oldcontext = MemoryContextSwitchTo(CurTransactionContext);
>
> n = (Notification *) palloc(sizeof(Notification));
> n->channel = pstrdup(channel);
> if (payload)
> n->payload = pstrdup(payload);
> else
> n->payload = "";
> n->hash = hash ;
> n->left = NULL ;
> n->right= NULL ;
> *tt = n ;
tt is a Notification** obtained by the search.
> static Notification **search(const char *channel, const char *payload )
> {
> Notification *t,**tt ;
> uint32 hash ;
> t = hashroot ;
> tt = &hashroot ;
> hash = hashf(channel,691) ;
> hash = hashf(payload,hash) ;
> while ( t )
> {
> if ( hash < t->hash )
> {
> tt = &t->left ;
> t = t->left ;
> }
> else if ( hash > t->hash )
> {
> tt = &t->right ;
> t = t->right ;
> }
> else
> {
> if (0==strcmp(t->channel,channel) && 0==strcmp(t->payload,payload))
> {
> return NULL
> }
> else
> {
> tt = &t->left ;
> t = t->left ;
> }
> }
> }
> return tt ;
> }
Hi, On 2014-02-08 19:28:56 +0100, Hardy Falk wrote: > > Well, you didn't add any code, so it's hard to say... Simple ways of > > doing what I think you describe will remove the queue's order. Do you > > preserve the ordering guarantees? > > > > Greetings, > > > > Andres Freund > > > Yes, the order is preserved. > I didn't remove the the original list code. > The tree is just an additional access path. How about actually attaching a patch? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hardy Falk <hardy.falk@blue-cable.de> writes:
>> Well, you didn't add any code, so it's hard to say... Simple ways of
>> doing what I think you describe will remove the queue's order. Do you
>> preserve the ordering guarantees?
> Yes, the order is preserved.
> I didn't remove the the original list code.
> The tree is just an additional access path.
It seems likely that this approach would be a net loss for small numbers
of notify events (which is surely the common case). Have you done any
measurements of the performance tradeoff?
regards, tom lane
Tom Lane schrieb:
> Hardy Falk <hardy.falk@blue-cable.de> writes:
>>> Well, you didn't add any code, so it's hard to say... Simple ways of
>>> doing what I think you describe will remove the queue's order. Do you
>>> preserve the ordering guarantees?
>
>> Yes, the order is preserved.
>> I didn't remove the the original list code.
>> The tree is just an additional access path.
>
> It seems likely that this approach would be a net loss for small numbers
> of notify events (which is surely the common case). Have you done any
> measurements of the performance tradeoff?
>
> regards, tom lane
>
>
I can easily keep the tail test. This avoids the hash computation in a
common case. The tree search compares only uint32 values, this should be
quite fast. Even a degenerate tree is no worse than the list. Im my
first tests I didn't use murmurhash, a select
pg_notify('alasys',format('Hello %s',x) from generate_series(1,1000000)
as x triggered this worst case. With murmurhash2 the average search
depth for 10^6 entries is 24.57.
I am about to prepare a patch, should I do some performance tests with
rtdsc first?