Discussion: NOTIFY with tuples
Hi,

I've used LISTEN/NOTIFY in a few apps with great success (both the new
and the old implementation) but I've found myself wondering why I
couldn't use a richer payload, and wondered if anyone already had plans
in this direction.

It seems there are a number of academic and commercial systems
(StreamSQL [1], CQL [2], ...) which provide powerful queryable streams
of tuples, including windowing, grouping, joining and pipelining
facilities, all of which are far beyond what I have been picturing.

I imagine a very simple system like this, somehow built on top of the
existing NOTIFY infrastructure:

  CREATE STREAM foo (sensor INTEGER, temperature NUMERIC);

In session A:

  INSERT INTO foo VALUES (42, 99.0);
  INSERT INTO foo VALUES (99, 100.0);
  COMMIT;

Meanwhile in session B:

  SELECT * FROM foo;

And perhaps even some simple filtering:

  SELECT * FROM foo WHERE sensor = 42;

I don't know how you would first signal your interest in foo before you
can start SELECTing from it... perhaps with LISTEN.

I suppose running the SELECT query on a stream would return only tuples
that are queued up and ready to fetch, without blocking to wait for
more, and a client could execute the query repeatedly, using select()
on the file descriptor to wait for data to be ready (the same way
people currently wait between calls to PQnotifies).

As for implementation details, I haven't done much research yet into
how something like this would be done and am very new to the source
tree, but I thought I'd present this idea and see if it's a duplicate
effort, or someone has a much better idea, or it is instantly shot down
in flames for technical or other reasons, before investing further in
it.

Thanks!

Thomas Munro

[1] http://en.wikipedia.org/wiki/StreamSQL
[2] http://de.wikipedia.org/wiki/Continuous_Query_Language
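The read-without-blocking semantics being proposed can be modeled with a toy in-memory stream. This is purely an illustration of the intended behavior, not Postgres code; the `Stream` class and its methods are made-up names:

```python
import queue

class Stream:
    """Toy model of the proposed stream semantics: reads drain whatever
    is queued and never block waiting for more data to arrive."""
    def __init__(self):
        self._q = queue.Queue()

    def write(self, row):
        self._q.put(row)

    def read(self, predicate=lambda row: True):
        # Return only tuples that are queued up and ready to fetch,
        # without blocking; a client would call this repeatedly,
        # sleeping on select() in between.  Note that rows rejected by
        # the filter are simply consumed and dropped in this model.
        rows = []
        while True:
            try:
                row = self._q.get_nowait()
            except queue.Empty:
                return rows
            if predicate(row):
                rows.append(row)

foo = Stream()
foo.write((42, 99.0))
foo.write((99, 100.0))
print(foo.read(lambda r: r[0] == 42))  # -> [(42, 99.0)]
print(foo.read())                      # -> [] (no new data, no blocking)
```

A real client would interleave these reads with a select() wait on the connection's file descriptor, just as one does today between calls to PQnotifies.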
On Tue, Dec 13, 2011 at 6:30 PM, Thomas Munro <munro@ip9.org> wrote:
> It seems there are a number of academic and commercial
> systems (StreamSQL [1], CQL [2], ...) which provide powerful
> queryable streams of tuples, including windowing, grouping,
> joining and pipelining facilities, all of which are far beyond
> what I have been picturing.
>
> I imagine a very simple system like this, somehow built on top of
> the existing NOTIFY infrastructure:
>
> CREATE STREAM foo (sensor INTEGER, temperature NUMERIC);
>
> In session A:
>
> INSERT INTO foo VALUES (42, 99.0);
> INSERT INTO foo VALUES (99, 100.0);
> COMMIT;
>
> Meanwhile in session B:
>
> SELECT * FROM foo;
>
> And perhaps even some simple filtering:
>
> SELECT * FROM foo WHERE sensor = 42;
>
> I don't know how you would first signal your interest in foo
> before you can start SELECTing from it... perhaps with LISTEN.
>
> I suppose running the SELECT query on a stream would return only
> tuples that are queued up and ready to fetch, without blocking to
> wait for more, and a client could execute the query repeatedly,
> using select() on the file descriptor to wait for data to be
> ready (the same way people currently wait between calls to
> PQnotifies).
>
> As for implementation details, I haven't done much research yet
> into how something like this would be done and am very new to the
> source tree, but I thought I'd present this idea and see if it's
> a duplicate effort, or someone has a much better idea, or it is
> instantly shot down in flames for technical or other reasons,
> before investing further in it.

I'm not sure whether we'd want something like this in core, so for a
first go-around, you might want to consider building it as an
extension.  It might work to just decree that each stream must be
built around a composite type.  Then you could do this:

pg_create_stream(regclass) - create a stream based on the given
composite type
pg_destroy_stream(regclass) - nuke the stream
pg_subscribe_stream(regclass) - current backend wants to read from
the stream
pg_unsubscribe_stream(regclass) - current backend no longer wants to
read from the stream

The function pg_create_stream() could create reader and writer
functions for the stream.  For example, if the composite type were
called "foo", then you'd end up with foo_read() returning SETOF foo
and foo_write(foo) returning void.  The C functions would look at the
argument types to figure out which stream they were operating on.

The writer function would store all the tuples written to the stream
into a temp file with a name based on the composite type OID.  The
reader function would return all tuples added to the temp file since
the last read.  You'd want the last-read locations for all the
subscribers stored in the file (or another file) somewhere, so that
when the furthest-back reader read the data, it could figure out which
data was no longer needed and arrange for it to be truncated away.
I'm not sure you need NOTIFY for anything anywhere in here.

All in all, this is probably a pretty complicated project, but I'm
sure there are people who would use it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
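The reader-position bookkeeping described here can be sketched with a small Python model, where a list stands in for the temp file. All names are hypothetical, and this ignores locking, crashes and on-disk layout entirely:

```python
class FileStream:
    """Model of the temp-file scheme: each subscriber has a last-read
    position, and data behind the furthest-back reader is truncated."""
    def __init__(self):
        self.rows = []     # un-truncated tail of the "temp file"
        self.base = 0      # absolute position of rows[0]
        self.readpos = {}  # backend -> absolute last-read position

    def subscribe(self, backend):
        # New subscribers see only rows written after they subscribe.
        self.readpos[backend] = self.base + len(self.rows)

    def write(self, row):
        self.rows.append(row)

    def read(self, backend):
        # Return all tuples added since this backend's last read.
        new = self.rows[self.readpos[backend] - self.base:]
        self.readpos[backend] = self.base + len(self.rows)
        # The furthest-back reader determines what can be truncated.
        low = min(self.readpos.values())
        self.rows = self.rows[low - self.base:]
        self.base = low
        return new

s = FileStream()
s.subscribe('backend_a')
s.subscribe('backend_b')
s.write((42, 99.0))
s.write((99, 100.0))
print(s.read('backend_a'))  # -> [(42, 99.0), (99, 100.0)]
print(len(s.rows))          # -> 2: backend_b hasn't read yet
print(s.read('backend_b'))  # -> [(42, 99.0), (99, 100.0)]
print(len(s.rows))          # -> 0: all subscribers caught up, truncated
```

The key design point is that truncation is driven by the minimum of the stored read positions, so a slow subscriber pins the data until it catches up (or unsubscribes).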
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Dec 13, 2011 at 6:30 PM, Thomas Munro <munro@ip9.org> wrote:
>> I imagine a very simple system like this, somehow built on top of
>> the existing NOTIFY infrastructure:
>
> I'm not sure whether we'd want something like this in core, so for a
> first go-around, you might want to consider building it as an
> extension. ... I'm not sure you
> need NOTIFY for anything anywhere in here.

Actually, what I'd suggest is just some code to serialize and
deserialize tuples and transmit 'em via the existing NOTIFY payload
facility.  I agree that presenting it as some functions would be a lot
less work than inventing bespoke syntax, but what you sketched still
involves writing a lot of communications infrastructure from scratch,
and I'm not sure it's worth doing that.

			regards, tom lane
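To make the serialize-into-the-payload idea concrete, here is one possible shape for it, sketched in Python rather than backend C. JSON is just one illustrative wire format (not anything committed to), the function names are made up, and note two real constraints: NOTIFY payloads must be shorter than 8000 bytes, and a float-based encoding like this would not round-trip NUMERIC losslessly:

```python
import json

PAYLOAD_LIMIT = 8000  # NOTIFY payloads must be shorter than 8000 bytes

def serialize(row):
    # Flatten a tuple into a text payload that could be handed to
    # NOTIFY.  A real implementation would need a lossless encoding
    # for types like NUMERIC; JSON floats are used here only for
    # illustration.
    payload = json.dumps(list(row))
    if len(payload.encode('utf-8')) >= PAYLOAD_LIMIT:
        raise ValueError("row too wide for a single NOTIFY payload")
    return payload

def deserialize(payload):
    return tuple(json.loads(payload))

p = serialize((42, 99.0))
print(p)               # -> [42, 99.0]
print(deserialize(p))  # -> (42, 99.0)
```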
On 14 December 2011 04:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Dec 13, 2011 at 6:30 PM, Thomas Munro <munro@ip9.org> wrote:
>>> I imagine a very simple system like this, somehow built on top of
>>> the existing NOTIFY infrastructure:
>
>> I'm not sure whether we'd want something like this in core, so for a
>> first go-around, you might want to consider building it as an
>> extension. ... I'm not sure you
>> need NOTIFY for anything anywhere in here.
>
> Actually, what I'd suggest is just some code to serialize and
> deserialize tuples and transmit 'em via the existing NOTIFY payload
> facility.  I agree that presenting it as some functions would be a lot
> less work than inventing bespoke syntax, but what you sketched still
> involves writing a lot of communications infrastructure from scratch,
> and I'm not sure it's worth doing that.

Thank you both for your feedback!

Looking at commands/async.c, it seems as though it would be difficult
for function code running in the backend to get its hands on the
payload containing the serialized tuple: the notification is
immediately passed to the client in NotifyMyFrontEnd, and since there
is only one queue for all notifications, you can't just put things
back or decline to consume some of them yet, IIUC.  Maybe the code
could be changed to handle payloads holding serialized tuples
differently, and stash them somewhere backend-local rather than
sending them to the client, so that a function returning SETOF (or a
new executor node type) could deserialize them when the user asks for
them.  Or did you mean that libpq could support deserializing tuples
on the client side?
Thinking about Robert's suggestion for an extension-only
implementation, maybe:

* pg_create_stream could create an unlogged table with a monotonically
  increasing primary key plus the columns from the composite type, and
  a high-water mark table to track subscribers,

* foo_write could NOTIFY foo to wake up subscribed clients only (ie
  not use the payload for the data, but clients would need to use
  regular LISTEN to know when to call foo_read), and

* foo_read could update the per-subscriber high-water mark and delete
  rows if the current session is the slowest reader.

That does sound hideously heavyweight... I guess that wouldn't be
anywhere near as fast as a circular buffer in a plain old file and/or
a bit of shared memory.  A later version could use files as suggested,
but I do want these streams to participate in transactions, and that
sounds incompatible to me (?).  I'm going to prototype that and see
how it goes.

I do like the idea of using composite types to declare the stream
structure, and having foo_read return a set of the composite type
seems good because it could be filtered and incorporated into
arbitrary queries with joins and so forth.
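The unlogged-table variant of the bookkeeping can be modeled the same way, with dicts standing in for the data table and the high-water mark table. Again this is only an illustration of the intended logic, with invented names:

```python
class TableStream:
    """Model of the unlogged-table prototype: rows carry a
    monotonically increasing id, a high-water mark records the highest
    id each subscriber has read, and reads by the slowest subscriber
    delete rows that everyone has now seen."""
    def __init__(self):
        self.next_id = 1
        self.rows = {}  # id -> row; stands in for the unlogged table
        self.hwm = {}   # session -> high-water mark (highest id read)

    def subscribe(self, session):
        self.hwm[session] = self.next_id - 1

    def write(self, row):
        # The real foo_write would INSERT here and then NOTIFY foo so
        # that LISTENing clients know to call foo_read.
        self.rows[self.next_id] = row
        self.next_id += 1

    def read(self, session):
        new = [r for i, r in sorted(self.rows.items())
               if i > self.hwm[session]]
        self.hwm[session] = self.next_id - 1
        # Delete rows at or below every subscriber's high-water mark,
        # i.e. rows the slowest reader has now consumed.
        floor = min(self.hwm.values())
        self.rows = {i: r for i, r in self.rows.items() if i > floor}
        return new
```

Structurally this is the same minimum-over-read-positions rule as the temp-file scheme, just expressed with row ids instead of file offsets, which is what lets the DELETE be an ordinary transactional statement.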
On Dec 13, 2011, at 8:21 PM, Tom Lane wrote:
>> I'm not sure whether we'd want something like this in core, so for a
>> first go-around, you might want to consider building it as an
>> extension. ... I'm not sure you
>> need NOTIFY for anything anywhere in here.
>
> Actually, what I'd suggest is just some code to serialize and
> deserialize tuples and transmit 'em via the existing NOTIFY payload
> facility.  I agree that presenting it as some functions would be a lot
> less work

The ability to cast RECORDs to JSON would be awesome for this.

Best,

David
On Tue, Dec 13, 2011 at 11:27 PM, Thomas Munro <munro@ip9.org> wrote:
>> Actually, what I'd suggest is just some code to serialize and
>> deserialize tuples and transmit 'em via the existing NOTIFY payload
>> facility.  I agree that presenting it as some functions would be a
>> lot less work than inventing bespoke syntax, but what you sketched
>> still involves writing a lot of communications infrastructure from
>> scratch, and I'm not sure it's worth doing that.
>
> Thank you both for your feedback!
>
> Looking at commands/async.c, it seems as though it would be difficult
> for function code running in the backend to get its hands on the
> payload containing the serialized tuple, since the notification is
> immediately passed to the client in NotifyMyFrontEnd and there is
> only one queue for all notifications; you can't just put things back
> or not consume some of them yet, IIUC.  Maybe the code could be
> changed to handle payloads holding serialized tuples differently, and
> stash them somewhere backend-local rather than sending them to the
> client, so that a function returning SETOF (or a new executor node
> type) could deserialize them when the user asks for them.  Or did you
> mean that libpq could support deserializing tuples on the client
> side?

One way of grabbing notifications in a backend function would be via
dblink -- you LISTEN on a sideband connection and grab notifications
via
http://www.postgresql.org/docs/9.1/interactive/contrib-dblink-get-notify.html.

As to the wider point, I'm wondering why you can't layer your API on
top of existing facilities (tables, notifications, etc).  PGQ (have
you seen that?) does this and it's an absolute marvel.  Meaning, I bet
you could do this with an 'all sql (or plpgsql)' implementation.
That's a good thing -- C code significantly raises the bar in terms of
putting your code in the hands of people who might be interested in
using it.

merlin
On 14 December 2011 15:10, Merlin Moncure <mmoncure@gmail.com> wrote:
> As to the wider point I'm wondering why you can't layer your API on
> top of existing facilities (tables, notifications, etc). PGQ (have
> you seen that?) does this and it's an absolute marvel. Meaning, I bet
> you could do this with an 'all sql (or plpgsql)' implementation.
> That's a good thing -- C code significantly raises the bar in terms
> of putting your code in the hands of people who might be interested
> in using it.

Well, I was interested in the idea of using the NOTIFY payload somehow
for high performance (it's not backed by a table that gets fsynced and
needs to be vacuumed etc, and it delivers data to clients without an
extra round trip), and I guess I also really like the idea of streams
being first class objects in a kind of StreamSQL-lite language
extension.

But I've been playing around with Robert's suggestion, and I realised
that I can dress up the foo_read and foo_write functions (probably
written in pure plpgsql) with a VIEW so that I can INSERT and SELECT
tuples, and join the stream against other tables.  Here's what I have
working so far:

https://github.com/macdice/pg_stream/blob/master/hack.sql

I guess at this point this becomes off topic for pgsql-hackers.
Thanks all for the pointers and ideas.  PGQ looks interesting, I'll
check it out.