Thread: C based plugins, clocks, locks, and configuration variables

C based plugins, clocks, locks, and configuration variables

From
Clifford Hammerschmidt
Date:
Hi all,

Apologies in advance if this isn't the right place to be posting this.

I've started work on a plugin in C (https://github.com/tanglebones/pg_tuid) for generating generally monotonically ascending UUIDs (aka TUIDs), and after googling around I couldn't find any guidance on a few things. (It's hard to google for anything in the Postgres C API, as most results are about using Postgres itself, not developing plugins for it.)

I'm looking for the idiomatic (and portable) way of:

1) getting microseconds (or nanoseconds) from UTC epoch in a plugin
2) getting an exclusive lock for a user plugin to serialize access to its shared state (I'm assuming that plugins must be reentrant)
3) creating a configuration variable for a plugin and accessing its values in the plugin. (e.g. `set plugin.configuration_variable=1` or somesuch)

Thanks,

-- 
Clifford Hammerschmidt, P.Eng.

Re: C based plugins, clocks, locks, and configuration variables

From
Craig Ringer
Date:
On 4 Nov. 2016 06:05, "Clifford Hammerschmidt" <tanglebones@gmail.com> wrote:
>
> Hi all,
>
> Apologies in advance if this isn't the right place to be posting this.
>
> I've started work on a plugin in C (https://github.com/tanglebones/pg_tuid) for generating generally monotonically ascending UUIDs (aka TUIDs), and after googling around I couldn't find any guidance on a few things. (It's hard to google for anything in the postgres C api as most results coming back are for using postgres itself, not developing plugins for postgres.)
>
> I'm looking for the idiomatic (and portable) way of:
>
> 1) getting microseconds (or nanoseconds) from UTC epoch in a plugin

GetCurrentIntegerTimestamp()

> 2) getting an exclusive lock for a user plugin to serialize access to its shared state (I'm assuming that plugins must be reentrant)

Allocate an LWLock in your shared memory segment and use it to arbitrate access. Multiple examples in contrib. LWLock allocation info is in the developer docs.

> 3) creating a configuration variable for a plugin and accessing its values in the plugin. (e.g. `set plugin.configuration_variable=1` or somesuch)

DefineCustomIntegerVariable etc. (I think; the name may not be exactly right, I'm on my phone). See guc.h.
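One detail worth keeping in mind with the timestamp call suggested above: PostgreSQL's integer timestamps count microseconds from 2000-01-01 UTC, not from the Unix epoch, so a plugin that wants "microseconds from UTC epoch" in the Unix sense has to add a fixed offset. The standalone sketch below shows only that arithmetic. The `SKETCH_` constants are local copies of values PostgreSQL defines in `datatype/timestamp.h`, and `pg_usecs_to_unix_usecs` is a hypothetical helper name, not part of the server API. Whether `GetCurrentIntegerTimestamp()` is available depends on the server version, so check `utils/timestamp.h` for your target release.

```c
#include <stdint.h>

/* Local copies of values PostgreSQL defines in datatype/timestamp.h
 * (POSTGRES_EPOCH_JDATE, UNIX_EPOCH_JDATE, SECS_PER_DAY, USECS_PER_SEC). */
#define SKETCH_POSTGRES_EPOCH_JDATE 2451545  /* Julian day of 2000-01-01 */
#define SKETCH_UNIX_EPOCH_JDATE     2440588  /* Julian day of 1970-01-01 */
#define SKETCH_SECS_PER_DAY         86400
#define SKETCH_USECS_PER_SEC        INT64_C(1000000)

/* Hypothetical helper: shift a PostgreSQL integer timestamp
 * (microseconds since 2000-01-01 UTC) to microseconds since
 * 1970-01-01 UTC. */
int64_t
pg_usecs_to_unix_usecs(int64_t pg_usecs)
{
    const int64_t epoch_diff_secs =
        (int64_t) (SKETCH_POSTGRES_EPOCH_JDATE - SKETCH_UNIX_EPOCH_JDATE)
        * SKETCH_SECS_PER_DAY;

    return pg_usecs + epoch_diff_secs * SKETCH_USECS_PER_SEC;
}
```

Inside an extension, the argument would be the value returned by the server's timestamp function; here it is just a plain integer so the conversion can be checked in isolation.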

Re: C based plugins, clocks, locks, and configuration variables

From
Craig Ringer
Date:
On 8 November 2016 at 07:41, Clifford Hammerschmidt
<tanglebones@gmail.com> wrote:
> Hi Craig,
>
> Thanks for the pointers; I made a stab at it in:
> https://github.com/tanglebones/pg_tuid
>
> I've no idea if the shmem and lwlock code is correct, or how to test it. It
> seems to work (requires loading via the shared_preload_libraries) on osx in
> that the tuid_ calls work and produce the expected results on my lightly
> loaded development box (not really a good test of shmem or locks in that I
> doubt either are being exercised).

Since that's a public github I took the liberty of replying to the
list. Please reply to the list, not just to me.

Good on you for giving it a go.

For concurrency testing, the isolation tester tool in
src/test/isolation is quite handy. Custom pgbench scripts can also
help, though only if you can detect an anomalous situation and use
Assert to crash the backend in an --enable-cassert build when
there's a problem.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
 



Re: C based plugins, clocks, locks, and configuration variables

From
Jim Nasby
Date:
On 11/3/16 7:14 PM, Craig Ringer wrote:
>> 1) getting microseconds (or nanoseconds) from UTC epoch in a plugin
>
> GetCurrentIntegerTimestamp()

Since you're serializing generation anyway, you might want to just forgo 
the timestamp completely. It's not like the values you're generating are 
globally unique anymore, or hard to guess.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: C based plugins, clocks, locks, and configuration variables

From
Clifford Hammerschmidt
Date:
Hi Jim,

The values are still globally unique. The odds of a collision are very, very low. Two instances with the same node_id generating in the same millisecond (in their local view of time) have a 1:2^34 chance of collision. node_id only repeats every 256 machines in a cluster (assuming you're configured correctly), and the probability of the same millisecond being used on both machines is also low (it depends on generation rate and machine speed). The only real concern is clock replays (i.e. something sets the clock backwards, like an admin or a badly implemented time sync system), which do happen in rare instances and are why seq is there: it extends that space out and reduces the chance of a collision in that millisecond. (Time replays are a real problem with id systems like Snowflake.)

Also, the point of the timestamp isn't uniqueness; it's the generally monotonically ascending aspect I want. This causes inserts to append to the index (much faster than random inserts in large indexes because of cache locality), and causes data generated around the same time to occupy nearby nodes in the index (again, cache benefits, as related data tends to be generated bunched up in time).
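The ordering property described above can be illustrated with a small standalone sketch. It assumes the time value sits in the leading 8 bytes of the 16-byte id, most significant byte first, so that a bytewise comparison (as far as I know, how PostgreSQL compares `uuid` values) sorts ids by generation time. That byte layout and the `sketch_` names are assumptions made for illustration, not the confirmed pg_tuid format.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_ID_LEN 16

/* Write a 64-bit time value into the first 8 bytes of the id, most
 * significant byte first, so bytewise comparison orders ids by time.
 * (Assumed layout, for illustration only.) */
void
sketch_put_time_prefix(unsigned char id[SKETCH_ID_LEN], uint64_t t)
{
    for (int i = 0; i < 8; i++)
        id[i] = (unsigned char) (t >> (56 - 8 * i));
}

/* Bytewise comparison of two ids: negative if a sorts before b,
 * zero if equal, positive if a sorts after b. */
int
sketch_id_cmp(const unsigned char a[SKETCH_ID_LEN],
              const unsigned char b[SKETCH_ID_LEN])
{
    return memcmp(a, b, SKETCH_ID_LEN);
}
```

With this layout an id generated at a later time always compares greater than an earlier one, whatever the trailing node, seq, and random bytes contain, which is what lets new rows land at the right-hand edge of a btree index.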

Thanks,
-Cliff. 

-- 
Clifford Hammerschmidt, P.Eng.

On Tue, Nov 8, 2016 at 6:27 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 11/3/16 7:14 PM, Craig Ringer wrote:
>>> 1) getting microseconds (or nanoseconds) from UTC epoch in a plugin
>>
>> GetCurrentIntegerTimestamp()
>
> Since you're serializing generation anyway, you might want to just forgo the timestamp completely. It's not like the values you're generating are globally unique anymore, or hard to guess.

Re: C based plugins, clocks, locks, and configuration variables

From
Clifford Hammerschmidt
Date:
Looking closer at the bit math, I screwed it up. It should be 64 bits of time, 6 bits of UUID version, 8 bits of node, 8 bits of seq, and the rest random, which is 42 bits of randomness. I'll find the code in a bit.
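The corrected bit budget can be sanity-checked with a few lines of standalone C. The sketch below takes the field widths exactly as stated in this message and computes how many bits remain for randomness. It also includes a rough birthday-bound estimate for ids that share the same time, node, and seq. The names are hypothetical and the estimate is the usual small-probability approximation, not a figure taken from pg_tuid.

```c
#include <stdint.h>

/* Field widths as stated in the message above. */
enum
{
    TUID_TOTAL_BITS   = 128,
    TUID_TIME_BITS    = 64,
    TUID_VERSION_BITS = 6,
    TUID_NODE_BITS    = 8,
    TUID_SEQ_BITS     = 8
};

/* Bits left over for randomness once the fixed fields are placed. */
unsigned
tuid_random_bits(void)
{
    return TUID_TOTAL_BITS
        - (TUID_TIME_BITS + TUID_VERSION_BITS + TUID_NODE_BITS + TUID_SEQ_BITS);
}

/* Birthday-bound approximation, (k * (k - 1) / 2) / 2^bits, for the
 * chance that k ids sharing the same time, node and seq collide in
 * their random part. Only meaningful while the result is much smaller
 * than 1, and requires bits < 64. */
double
approx_collision_probability(uint64_t k, unsigned bits)
{
    double pairs = (double) k * (double) (k - 1) / 2.0;

    return pairs / (double) (UINT64_C(1) << bits);
}
```

For two ids in the same slot the estimate is simply 1 in 2^42; it grows roughly with the square of the number of ids sharing a slot.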

-- 
Clifford Hammerschmidt, P.Eng.

On Tue, Nov 8, 2016 at 9:42 AM, Clifford Hammerschmidt <tanglebones@gmail.com> wrote:
> Hi Jim,
>
> The values are still globally unique. The odds of a collision are very, very low. Two instances with the same node_id generating in the same millisecond (in their local view of time) have a 1:2^34 chance of collision. node_id only repeats every 256 machines in a cluster (assuming you're configured correctly), and the probability of the same millisecond being used on both machines is also low (it depends on generation rate and machine speed). The only real concern is clock replays (i.e. something sets the clock backwards, like an admin or a badly implemented time sync system), which do happen in rare instances and are why seq is there: it extends that space out and reduces the chance of a collision in that millisecond. (Time replays are a real problem with id systems like Snowflake.)
>
> Also, the point of the timestamp isn't uniqueness; it's the generally monotonically ascending aspect I want. This causes inserts to append to the index (much faster than random inserts in large indexes because of cache locality), and causes data generated around the same time to occupy nearby nodes in the index (again, cache benefits, as related data tends to be generated bunched up in time).
>
> Thanks,
> -Cliff.
>
> --
> Clifford Hammerschmidt, P.Eng.



Re: C based plugins, clocks, locks, and configuration variables

From
Craig Ringer
Date:
On 9 Nov. 2016 02:48, "Clifford Hammerschmidt" <tanglebones@gmail.com> wrote:
>
> Looking closer at the bit math, I screwed it up.... it should be 64 bits time, 6 bit uuid version, 8 node, 8 seq, and the rest random ... which is 42 bits of random. I'll find the code in a bit.

Huh, so that's what you are doing.

I just added the same thing to the 9.6 BDR development tree last week, though using 64-bit values, based on a draft Petr wrote. Feel free to take a look: bdr-plugin/dev-bdr96 branch in the 2ndQuadrant/bdr github repo. The main file is seq2.c.

Re: C based plugins, clocks, locks, and configuration variables

From
Clifford Hammerschmidt
Date:

On Tue, Nov 8, 2016 at 2:58 PM, Craig Ringer <craig.ringer@2ndquadrant.com> wrote:
> 2ndQuadrant/bdr

That is similar. I'm not clear on the usage of an OID for the sequence (`DirectFunctionCall1(nextval_oid, seqoid)`). Does that imply a lock around sequence generation? Another difference is that your sequence doesn't reset on a time basis; it ascends and wraps independently of the time.

(Also, you appear to take the modulo against the max (2^n-1), not the cardinality (2^n). Should that be an `&`? For example, with SEQUENCE_BITS of 1, MAX_SEQ_ID is ((1<<1)-1) = 1, so (seq % 1) = {0}, not {0,1} as expected, whereas (seq & 1) = {0,1} as expected.)
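The modulo-versus-mask point is easy to check in isolation. The sketch below reuses the `SEQUENCE_BITS` and `MAX_SEQ_ID` names from the example above with the same 1-bit setting. The three function names are hypothetical and none of this is copied from the BDR source.

```c
#include <stdint.h>

/* Illustrative values only, mirroring the SEQUENCE_BITS = 1 case above. */
#define SEQUENCE_BITS 1
#define MAX_SEQ_ID    ((UINT64_C(1) << SEQUENCE_BITS) - 1)   /* 2^n - 1 */

/* Modulo by the maximum value: can never produce MAX_SEQ_ID itself,
 * so one slot of the sequence space is lost. */
uint64_t
seq_by_modulo_max(uint64_t seq)
{
    return seq % MAX_SEQ_ID;
}

/* Bit mask with the maximum value: keeps the low SEQUENCE_BITS bits,
 * covering the full range 0 .. MAX_SEQ_ID. */
uint64_t
seq_by_mask(uint64_t seq)
{
    return seq & MAX_SEQ_ID;
}

/* Modulo by the cardinality (2^n): gives the same result as the mask. */
uint64_t
seq_by_modulo_cardinality(uint64_t seq)
{
    return seq % (MAX_SEQ_ID + 1);
}
```

With one sequence bit, `seq_by_modulo_max` returns 0 for every input, while the other two alternate between 0 and 1. With wider fields the modulo-by-max form still works but skips the top value.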

We tried 64-bit values for ids (based on Twitter's Snowflake), but found that time replay would cause collisions. We had a server have its time corrected backwards by an admin, which led to duplicate ids being generated, a fun day of debugging, and a hard lesson about our assumption that time always increases. Using node_id doesn't protect against this, since the same node creates both the original and the colliding ids. By extending the ids to include a significant amount of randomness, and requiring a restart of the db for the time value to move backwards (by latching onto the last seen time), we narrow the window for collisions to close enough to zero that winning the lottery is far more likely (http://preshing.com/20110504/hash-collision-probabilities/ has the exact math). We also increase the time scale for id wraparound to long past the likely life expectancy of the software we're building today.
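The "latching onto the last seen time" idea can be sketched as a small state machine. This is a minimal standalone illustration under stated assumptions: the struct, the function name, and the choice to reset seq whenever time advances are hypothetical, and in a real extension the state would live in shared memory and be updated while holding a lock.

```c
#include <stdint.h>

/* Hypothetical generator state; in a real extension this would live in
 * shared memory and be updated under a lock. */
typedef struct SketchTuidState
{
    uint64_t last_time;  /* highest time value handed out so far */
    uint8_t  seq;        /* counter within last_time, wraps at 256 */
} SketchTuidState;

/* Return the time value to embed in the next id and store the matching
 * sequence number in *seq_out. If the clock has not advanced (or has
 * gone backwards), keep the latched time and bump seq instead of
 * following the clock back. */
uint64_t
sketch_next_time(SketchTuidState *st, uint64_t now, uint8_t *seq_out)
{
    if (now > st->last_time)
    {
        st->last_time = now;
        st->seq = 0;
    }
    else
    {
        st->seq++;      /* uint8_t arithmetic wraps 255 -> 0 */
    }

    *seq_out = st->seq;
    return st->last_time;
}
```

Because the latched time only moves forward while the state survives, a clock that is set backwards cannot reproduce an earlier time value until the state is lost, which matches the restart requirement described above.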

-- 
Clifford Hammerschmidt, P.Eng.

Re: C based plugins, clocks, locks, and configuration variables

From
Craig Ringer
Date:
On 10 November 2016 at 07:18, Clifford Hammerschmidt
<tanglebones@gmail.com> wrote:
>
> On Tue, Nov 8, 2016 at 2:58 PM, Craig Ringer <craig.ringer@2ndquadrant.com>
> wrote:
>>
>> 2ndQuadrant/bdr
>
>
> That is similar. I'm not clear on the usage of OID for sequence
> (`DirectFunctionCall1(nextval_oid, seqoid)`) ... does that imply a lock
> around a sequence generation?

No.

> also different is that your sequence doesn't
> reset on the time basis, it ascends and wraps independently of the time.

Right.

> (also, you appear to modulo against the max (2^n-1), not the cardinality
> (2^n), ... should that be an & ... i.e. take SEQUENCE_BITS of 1 ->
> MAX_SEQ_ID of ((1<<1)-1) = 1 -> (seq % 1) = {0} ... not {0,1} as expected;
> (seq & 1) = {0,1} as expected)

Hm. I think you're right there.

> We tried 64-bit values for ids (based on twitter's snowflake), but found
> that time-replay would cause collisions. We had a server have its time
> corrected, going backwards, by an admin; leading to duplicate ids being
> generated, leading to a fun day of debugging and a hard lesson about our
> assumption that time always increases over time.

That's a good point, but it's just going to have to be a documented
limitation. BDR expects you to use NTP and slew time when needed
anyway.

> Using node_id doesn't
> protect against this, since it is the same node creating the colliding ids
> as the original ids. By extending the ids to include a significant amount of
> randomness, and requiring a restart of the db for the time value to move
> backwards (by latching onto the last seen time), we narrow the window for
> collisions to close enough to zero that winning the lottery is far more
> likely (http://preshing.com/20110504/hash-collision-probabilities/ has the
> exact math). We also increase the time scale for id wrap around to long past
> the likely life expectancy of the software we're building today.

It's a good idea. I like what you're doing. I've run into too many
sites that can't or won't use 128-bit generated values though.

 



Re: C based plugins, clocks, locks, and configuration variables

From
Craig Ringer
Date:
On 10 November 2016 at 07:18, Clifford Hammerschmidt
<tanglebones@gmail.com> wrote:
>
> On Tue, Nov 8, 2016 at 2:58 PM, Craig Ringer <craig.ringer@2ndquadrant.com>
> wrote:
>>
>> 2ndQuadrant/bdr
>
>
> That is similar. I'm not clear on the usage of OID for sequence
> (`DirectFunctionCall1(nextval_oid, seqoid)`) ... does that imply a lock
> around a sequence generation? also different is that your sequence doesn't
> reset on the time basis, it ascends and wraps independently of the time.

Meant to explain more here.

Most of the system identifies sequence relations by oid. All this does
is call nextval. By accepting and passing oid we reduce the number of
syscache/relcache lookups and memory allocations required to call
nextval vs calling it by name. That's about all, really.
