Thread: Proposal: access control jails (and introduction as aspiring GSoC student)
Hello, I'm Joey Adams, and I'm interested in applying for Google Summer of Code to work on PostgreSQL. I'm a former GSoC student (I worked on CCAN last year), and a strong C programmer, though I am still new to working with large, established communities. I apologize if this is the wrong place to send GSoC student introductions.

My proposal is bold, though I believe it can be trimmed down and refined into something very useful, yet simple enough to implement in 3 months by a newcomer to the PostgreSQL code base.

I propose adding application-level access control to PostgreSQL via a jails concept. In a nutshell, a jail is created as part of the database definition (typically exposing a free variable for the current user). When a jail is activated for a session, the only accesses allowed are those indicated in the jail itself. A jail cannot be exited without closing the session. If used properly, jails make it possible to safely execute untrusted SQL code (though one may not want to, citing the principle of least privilege).

For example, suppose we have the following database definition for a trivial discussion board system:

    CREATE TABLE users (
        id SERIAL PRIMARY KEY,
        name TEXT,
        password TEXT,
        email TEXT,
        email_public BOOLEAN
    );

    CREATE TABLE posts (
        id SERIAL PRIMARY KEY,
        title TEXT,
        "user" INT REFERENCES users(id),
        content TEXT,
        published BOOLEAN
    );

    CREATE TABLE tags (
        post INT REFERENCES posts(id),
        tag TEXT
    );

Suppose our discussion board is written in PHP. It authenticates the user via cookies, after which it has an ID for the current user. One way to arbitrate access is to code it into the PHP using queries like this:

    pg_query_params(
        'SELECT users.email FROM users WHERE id=$1 AND (email_public=TRUE OR id=$2)',
        array($_GET['user'], $current_user));

Simple enough. However, this access control check has to be done for every query, and it can get rather complex as more tables are involved.

Views are a way to alleviate this complexity, but:

* Views only apply to SELECTs
* Queries still have to specify the current user (unless one uses a global variable, which requires a bit of a hack, if I'm not mistaken)

My proposal would make it possible to have a statement in the database definition much like this:

    CREATE JAIL jail (
        SELECT id, name FROM users;
        SELECT email FROM users WHERE id=current_user OR email_public=TRUE;
        SELECT * FROM posts WHERE "user"=current_user OR published=TRUE;
        UPDATE posts WHERE "user"=current_user SET title, content, published;
        INSERT INTO posts WHERE "user"=current_user;
        DELETE FROM posts WHERE "user"=current_user;
        SELECT * FROM tags, posts WHERE tags.post=posts.id;
        INSERT INTO tags WHERE post IN (
            SELECT id FROM posts WHERE "user"=current_user);
        DELETE FROM tags WHERE post IN (
            SELECT id FROM posts WHERE "user"=current_user);
    );

Inside of the jail definition is a series of pseudo-statements that indicate the space of queries the user can perform. Simply creating a jail does not make it go into effect. A jail is activated using another query, and it remains in effect for the remainder of the session. It cannot be deactivated through the protocol, as doing so would constitute a privilege escalation.

Example of a PHP script invoking a jail:

    <?php
    pg_connect('user=foo password=spiderman') or die('Database error occurred');

    /* authentication-fu */
    // $current_user now contains the id of the logged-in user.

    pg_query_params('USE JAIL jail WITH current_user=$1', array($current_user));

    /* Remaining queries in this session will only be able to SELECT,
       UPDATE, INSERT, and DELETE on users, posts, and tags, following
       the rules set forth by the jail. */
    ?>

I came up with this idea while working on a small software-as-a-service application using PostgreSQL. I needed to implement access control in the PHP frontend. I ended up creating a database abstraction class to protect the rest of the frontend code from SQL messiness. For every database query I needed, I wrote or extended a PHP function. My database access class for my "small" application exploded to over 1000 lines (granted, a lot of it was comments). Although having accessor functions can make the code easier to read and maintain, it can result in a lot of redundancy. Why not just use the database abstraction library PostgreSQL gives you: SQL?

Jails seem to me like a revolutionary addition to SQL, and implementing them "fully" could be quite involved. However, my guess is that by restricting what may be done in a jail simply to SELECT, UPDATE, INSERT and DELETE, an initial jail implementation capable of what I posted above would be relatively easy. Work would mainly consist of adding code to PostgreSQL's parser and the rewrite engine.

Comments?
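To make the intended semantics of the jail rules above concrete, here is a small sketch in Python (not PostgreSQL code) that models two of the rules as row filters over in-memory data. The function names and sample rows are invented for illustration; the proposal itself expects the real work to happen in the parser and the rewrite engine.

```python
# Toy model of two jail rules from the proposal, applied to in-memory rows.
# It only illustrates the intended visibility semantics; it is not how
# PostgreSQL would implement jails. All data below is made up.

USERS = [
    {"id": 1, "name": "alice", "email": "alice@example.org", "email_public": True},
    {"id": 2, "name": "bob", "email": "bob@example.org", "email_public": False},
]

POSTS = [
    {"id": 10, "user": 1, "title": "Hello", "published": True},
    {"id": 11, "user": 2, "title": "Draft", "published": False},
]


def visible_emails(current_user):
    """Rule: SELECT email FROM users WHERE id=current_user OR email_public=TRUE."""
    return [u["email"] for u in USERS
            if u["id"] == current_user or u["email_public"]]


def visible_posts(current_user):
    """Rule: SELECT * FROM posts WHERE "user"=current_user OR published=TRUE."""
    return [p for p in POSTS
            if p["user"] == current_user or p["published"]]
```

In this model, user 1 sees only the public address, while user 2 additionally sees their own private address and their own unpublished post.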
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Heikki Linnakangas
Joseph Adams wrote:
> Hello, I'm Joey Adams, and I'm interested in applying for Google
> Summer of Code to work on PostgreSQL. I'm a former GSoC student (I
> worked on CCAN last year), and a strong C programmer, though I am
> still new to working with large, established communities. I apologize
> if this is the wrong place to send GSoC student introductions.

Hi Joey, you're in the right place!

> * Views only apply to SELECTs

Views can be made updateable by defining suitable rules for them; that is usually the way people solve this problem. See chapters http://www.postgresql.org/docs/8.4/interactive/rules-views.html and http://www.postgresql.org/docs/8.4/interactive/rules-update.html in the user manual.

You should also take a look at Veil (http://veil.projects.postgresql.org/), which uses views to implement something that sounds much like what you're proposing.

You should also be aware of an issue with this approach: http://archives.postgresql.org/message-id/4AE02DF0.40101@enterprisedb.com. Some ideas on plugging that hole were discussed, but unfortunately there was no clear consensus and no progress since.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Stephen Frost
* Joseph Adams (joeyadams3.14159@gmail.com) wrote:
> I propose adding application-level access control to PostgreSQL via a
> jails concept. In a nutshell, a jail is created as part of the
> database definition (typically exposing a free variable for the
> current user). When a jail is activated for a session, the only
> accesses allowed are those indicated in the jail itself. A jail
> cannot be exited without closing the session. If used properly, jails
> make it possible to safely execute untrusted SQL code (though one may
> not want to, citing the principle of least privilege).

I guess my initial reaction to this is that you can use roles, views, and pl/pgsql (security definer) functions to achieve this. This does have an interesting intersection with row-level security concepts and that's definitely a project that I'd like to see happen at some point in PG.

Not sure if you've considered this, but you can do a 'set role' at the start of a session and then use CURRENT_ROLE in view definitions and in other places. You can also make it so that the user who is logging in (eg 'www-data') doesn't have any rights to anything, except the ability to 'set role' to other roles.

Note that, with any of this, you need to consider pooled database connections. Unfortunately, it's still pretty expensive to establish a new database connection to PG.

Thanks,
Stephen
On Mon, Mar 22, 2010 at 9:39 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Joseph Adams (joeyadams3.14159@gmail.com) wrote:
>> I propose adding application-level access control to PostgreSQL via a
>> jails concept. In a nutshell, a jail is created as part of the
>> database definition (typically exposing a free variable for the
>> current user). When a jail is activated for a session, the only
>> accesses allowed are those indicated in the jail itself. A jail
>> cannot be exited without closing the session. If used properly, jails
>> make it possible to safely execute untrusted SQL code (though one may
>> not want to, citing the principle of least privilege).
>
> I guess my initial reaction to this is that you can use roles, views,
> and pl/pgsql (security definer) functions to achieve this. This does
> have an interesting intersection with row-level security concepts and
> that's definitely a project that I'd like to see happen at some point in
> PG. Not sure if you've considered this, but you can do a 'set role' at
> the start of a session and then use CURRENT_ROLE in view definitions and
> in other places. You can also make it so that the user who is logging
> in (eg 'www-data') doesn't have any rights to anything, except the
> ability to 'set role' to other roles.

Sometimes it would be nice to conditionalize queries on a value other than the authenticated role. I really wish we had some kind of SQL variable support. Talking out of my rear end:

    CREATE VARIABLE name AS type [GLOBAL | LOCAL]; -- local variables are per-backend
    SET VARIABLE name = value;
    SELECT * FROM foo WHERE col = ${name};

We have had one previous request for the ability to limit the list of queries that the user can issue to a fixed set, but it's not clear to me that that's really all that useful. If that's what you want, you might be better off putting an intermediate layer between the client and the database - e.g. this kind of restriction could be implemented in the connection pooler...

...Robert
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Stephen Frost
* Robert Haas (robertmhaas@gmail.com) wrote:
> Sometimes it would be nice to conditionalize queries on a value other
> than the authenticated role. I really wish we had some kind of SQL
> variable support. Talking out of my rear end:

I certainly agree- having variable support in the backend would definitely be nice. I'd want it to be explicit and distinct from GUCs though, unlike the situation we have w/ psql right now.

All that said, I'm not really a huge fan of write-your-own-authorization-system in general. If the existing authorization system isn't sufficient for what you want, then let's improve it. There may be specific cases where what's needed is particularly complex, but that's what security definer functions are for..

Thanks,
Stephen
On Mon, Mar 22, 2010 at 10:03 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Sometimes it would be nice to conditionalize queries on a value other
>> than the authenticated role. I really wish we had some kind of SQL
>> variable support. Talking out of my rear end:
>
> I certainly agree- having variable support in the backend would
> definitely be nice. I'd want it to be explicit and distinct from GUCs
> though, unlike the situation we have w/ psql right now.

Agreed.

> All that said,
> I'm not really a huge fan of write-your-own-authorization-system in
> general. If the existing authorization system isn't sufficient for what
> you want, then let's improve it. There may be specific cases where
> what's needed is particularly complex, but that's what security definer
> functions are for..

Fortunately this functionality also has other uses, so I don't know that we really need to decide which of those uses we approve of more or less.

Does the SQL standard specify anything in this area?

...Robert
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Pavel Stehule
2010/3/22 Robert Haas <robertmhaas@gmail.com>:
> On Mon, Mar 22, 2010 at 9:39 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> * Joseph Adams (joeyadams3.14159@gmail.com) wrote:
>>> I propose adding application-level access control to PostgreSQL via a
>>> jails concept. In a nutshell, a jail is created as part of the
>>> database definition (typically exposing a free variable for the
>>> current user). When a jail is activated for a session, the only
>>> accesses allowed are those indicated in the jail itself. A jail
>>> cannot be exited without closing the session. If used properly, jails
>>> make it possible to safely execute untrusted SQL code (though one may
>>> not want to, citing the principle of least privilege).
>>
>> I guess my initial reaction to this is that you can use roles, views,
>> and pl/pgsql (security definer) functions to achieve this. This does
>> have an interesting intersection with row-level security concepts and
>> that's definitely a project that I'd like to see happen at some point in
>> PG. Not sure if you've considered this, but you can do a 'set role' at
>> the start of a session and then use CURRENT_ROLE in view definitions and
>> in other places. You can also make it so that the user who is logging
>> in (eg 'www-data') doesn't have any rights to anything, except the
>> ability to 'set role' to other roles.
>
> Sometimes it would be nice to conditionalize queries on a value other
> than the authenticated role. I really wish we had some kind of SQL
> variable support. Talking out of my rear end:
>
> CREATE VARIABLE name AS type [GLOBAL | LOCAL]; -- local variables are
> per-backend
> SET VARIABLE name = value;
> SELECT * FROM foo WHERE col = ${name};

I thought about this two or three years ago as a lighter solution without packages. Now we have almost everything we need. Variables are supported by the parser (with the latest changes related to integrating the main parser into plpgsql). You can just write

    select * from tab where col = var;

No special syntax is necessary, and it is available from all environments. Implementing it for scalar types should not be difficult (maybe).

Regards
Pavel Stehule

>
> We have had one previous request for the ability to limit the list of
> queries that the user can issue to a fixed set, but it's not clear to
> me that that's really all that useful. If that's what you want, you
> might be better off putting an intermediate layer between the client
> and the database - e.g. this kind of restriction could be implemented
> in the connection pooler...
>
> ...Robert
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Kevin Grittner
Robert Haas <robertmhaas@gmail.com> wrote:
> Does the SQL standard specify anything in this area?

The only thing that comes to mind for me is the SQL/PSM <SQL variable declaration>.

-Kevin
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Peter Eisentraut
On Mon, 2010-03-22 at 09:54 -0400, Robert Haas wrote:
> We have had one previous request for the ability to limit the list of
> queries that the user can issue to a fixed set, but it's not clear to
> me that that's really all that useful.

Well, sudo is pretty useful, and this would be quite similar.
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Dimitri Fontaine
Peter Eisentraut <peter_e@gmx.net> writes:
> Well, sudo is pretty useful, and this would be quite similar.

+1.

I guess one of the big difficulties would be matching a given arbitrary query against the list of queries we have in a jail, given that we put "generic" queries in there and want to allow "specific" ones.

But once we have that, it could turn out to be pretty useful for other ideas. I can't find it again in the archives, but one idea was to collect statistics on views rather than plain tables, so that you can have correlated stats on JOINs, on some columns, etc. The hard part there, too, looks to be telling at runtime that a given query is a specific form of an existing view.

Regards,
--
dim
On 3/21/10 9:36 PM, Joseph Adams wrote:
> Inside of the jail definition is a series of pseudo-statements that
> indicate the space of queries the user can perform. Simply creating a
> jail does not make it go into effect. A jail is activated using
> another query, and it remains in effect for the remainder of the
> session. It cannot be deactivated through the protocol, as doing so
> would constitute a privilege escalation.

This is an interesting approach and I don't think that most of the people commenting on this list have quite grasped it.

I see two major difficulties to solve with this approach: (1) developing a way of phrasing the query stubs which would allow common things like dynamic where clauses, order by, and limit, and (2) whether it's practical for the author of any real application to define all of those queries beforehand.

For (1), you might want to look at Meredith's libDejector, which takes a similar approach for SQL-injection protection: http://www.thesmartpolitenerd.com/code/dejector.html

I don't think that the idea of turning on the jail mode via a session-level switch works, given the realities of connection pooling. Also, I do not believe that we currently have any USERSET variable which can be turned on but not off, so that would require adding a whole new mode.

BTW, if you wanted something less ambitious, we have a longstanding request to implement "local superuser", that is, the ability to give one role the ability to edit other roles in one database only.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
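As a rough illustration of the "query stub" matching problem discussed above, the following Python sketch reduces an incoming query to its shape (literals replaced by `?`) and checks it against a fixed set of allowed shapes. The stub list and function names are hypothetical, and purely textual matching like this is only a toy: a serious implementation would presumably compare parse trees, and nothing here handles dynamic WHERE clauses or optional ORDER BY and LIMIT.

```python
import re

# Hypothetical "query stubs": the only statement shapes this toy jail allows.
ALLOWED_STUBS = {
    "select id, name from users where id = ?",
    "select * from posts where published = ? order by id limit ?",
}

_STRING = re.compile(r"'(?:[^']|'')*'")   # SQL string literal; '' is an escaped quote
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")
_BOOL = re.compile(r"\b(?:true|false)\b")
_SPACE = re.compile(r"\s+")


def normalize(query):
    """Reduce a query to its shape: literals become '?', case and spacing are folded."""
    shape = _STRING.sub("?", query)   # strings first, so their contents are not rescanned
    shape = shape.lower()
    shape = _BOOL.sub("?", shape)
    shape = _NUMBER.sub("?", shape)
    shape = _SPACE.sub(" ", shape).strip().rstrip(";").strip()
    return shape


def is_allowed(query):
    """True if the query's shape is one of the allowed stubs."""
    return normalize(query) in ALLOWED_STUBS
```

With these stubs, `SELECT id, name FROM users WHERE id = 42;` is accepted, while `SELECT id, name FROM users WHERE id = 1 OR 1 = 1` is rejected because its shape has an extra condition.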
On Tue, Mar 23, 2010 at 1:28 PM, Josh Berkus <josh@agliodbs.com> wrote:
> I don't think that the idea of turning on the jail mode via a
> session-level switch works, given the realities of connection pooling.
> Also, I do not believe that we currently have any USERSET variable which
> can be turned on but not off, so that would require adding a whole new mode.

I think this could be done with an assign hook.

> BTW, if you wanted something less ambitious, we have a longstanding
> request to implement "local superuser", that is, the ability to give one
> role the ability to edit other roles in one database only.

But roles aren't database-specific... they're globals.

...Robert
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Alvaro Herrera
Robert Haas wrote:
> On Tue, Mar 23, 2010 at 1:28 PM, Josh Berkus <josh@agliodbs.com> wrote:
> > BTW, if you wanted something less ambitious, we have a longstanding
> > request to implement "local superuser", that is, the ability to give one
> > role the ability to edit other roles in one database only.
>
> But roles aren't database-specific... they're globals.

Well, that's another longstanding request ;-) (See the db_user_namespace hack)

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Robert Haas wrote:
>> On Tue, Mar 23, 2010 at 1:28 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>> BTW, if you wanted something less ambitious, we have a longstanding
>>> request to implement "local superuser", that is, the ability to give one
>>> role the ability to edit other roles in one database only.
>>
>> But roles aren't database-specific... they're globals.

> Well, that's another longstanding request ;-) (See the
> db_user_namespace hack)

Yeah, you'd have to fix that first. The "ambitious" part of that is coming up with a spec that everybody will accept. Once you had that, coding it might not be very hard ...

BTW, "local superuser" is an oxymoron. If you're superuser you'd have no trouble whatsoever breaking into other databases. "Local CREATEROLE" privilege could be a sane concept, though, if we had local roles.

regards, tom lane
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Robert Haas wrote:
> >> But roles aren't database-specific... they're globals.
>
> > Well, that's another longstanding request ;-) (See the
> > db_user_namespace hack)
>
> Yeah, you'd have to fix that first. The "ambitious" part of that is
> coming up with a spec that everybody will accept. Once you had that,
> coding it might not be very hard ...

I wonder if this is simpler now that we got rid of the flat files stuff. We could validate the user once we've connected to a database and are thus able to poke at the local user catalog, not just the global one. I think that was a serious roadblock.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I wonder if this is simpler now that we got rid of the flat files stuff.
> We could validate the user once we've connected to a database and thus
> able to poke at the local user catalog, not just the global one. I
> think that was a serious roadblock.

I think it'd be a mistake to invent a separate catalog for local users; what had been nice clean foreign key relationships (eg, relowner -> pg_auth.oid) would suddenly become a swamp.

My first thought about a catalog representation would be to add a column to pg_auth which is a DB OID for local users or zero for global users. However, you'd probably want to prevent local users and global users from having the same names, and it's not very clear how to do that with this representation (though that'd be even worse with separate catalogs). I guess we could fall back on a creation-time check (ick).

regards, tom lane
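The "creation-time check" mentioned above can be pictured with a small Python model of a role catalog where each entry carries a database OID, with zero meaning a global role. This is only a sketch of one possible rule set: the class, its method, and the decision to let two different databases each have a local role of the same name are assumptions made for illustration, not anything stated in the thread.

```python
GLOBAL_ROLE = 0  # stand-in for "zero for global users" in the proposed column


class RoleCatalog:
    """In-memory stand-in for a role catalog with a per-role database OID."""

    def __init__(self):
        self._roles = set()  # entries are (name, db_oid)

    def create_role(self, name, db_oid=GLOBAL_ROLE):
        """Creation-time check: reject names that would be ambiguous."""
        for existing_name, existing_db in self._roles:
            if existing_name != name:
                continue
            # A global role conflicts with any role of the same name, and a
            # local role conflicts with a same-named role in its own database.
            if (existing_db == GLOBAL_ROLE or db_oid == GLOBAL_ROLE
                    or existing_db == db_oid):
                raise ValueError("role name already in use: %s" % name)
        self._roles.add((name, db_oid))
```

A check like this is not enforced by a unique index, so in a real system concurrent role creations would need extra locking to keep it reliable.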
On Tue, Mar 23, 2010 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> I wonder if this is simpler now that we got rid of the flat files stuff.
>> We could validate the user once we've connected to a database and thus
>> able to poke at the local user catalog, not just the global one. I
>> think that was a serious roadblock.
>
> I think it'd be a mistake to invent a separate catalog for local users;
> what had been nice clean foreign key relationships (eg, relowner ->
> pg_auth.oid) would suddenly become a swamp.
>
> My first thought about a catalog representation would be to add a column
> to pg_auth which is a DB OID for local users or zero for global users.
> However, you'd probably want to prevent local users and global users
> from having the same names, and it's not very clear how to do that
> with this representation (though that'd be even worse with separate
> catalogs). I guess we could fall back on a creation-time check (ick).

Could we use a suitably defined exclusion constraint?

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Mar 23, 2010 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My first thought about a catalog representation would be to add a column
>> to pg_auth which is a DB OID for local users or zero for global users.
>> However, you'd probably want to prevent local users and global users
>> from having the same names, and it's not very clear how to do that
>> with this representation (though that'd be even worse with separate
>> catalogs). I guess we could fall back on a creation-time check (ick).

> Could we use a suitably defined exclusion constraint?

Not unless you'd like to solve the issues with triggers on system catalogs first ...

regards, tom lane
On Tue, Mar 23, 2010 at 8:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Mar 23, 2010 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> My first thought about a catalog representation would be to add a column
>>> to pg_auth which is a DB OID for local users or zero for global users.
>>> However, you'd probably want to prevent local users and global users
>>> from having the same names, and it's not very clear how to do that
>>> with this representation (though that'd be even worse with separate
>>> catalogs). I guess we could fall back on a creation-time check (ick).
>
>> Could we use a suitably defined exclusion constraint?
>
> Not unless you'd like to solve the issues with triggers on system
> catalogs first ...

Urp. Not really, though I don't know what they are exactly. I didn't think exclusion constraints depended on triggers. UNIQUE constraints work on system catalogs, right?

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Mar 23, 2010 at 8:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Not unless you'd like to solve the issues with triggers on system
>> catalogs first ...

> Urp. Not really, though I don't know what they are exactly. I didn't
> think exclusion constraints depended on triggers. UNIQUE constraints
> work on system catalogs, right?

UNIQUE constraints depend on internal support in the index access method (see today's thread with Gokulakannan Somasundaram for some details of how btree does it). Exclusion constraints have a totally different implementation --- they don't require index AM support, but they do use triggers.

Now having said that, my recollection is that the worst issues surrounding triggers on catalogs had to do with BEFORE triggers. Exclusion constraint triggers would be AFTER triggers, so maybe it could be made to work. It'd still be significant work though, for not a lot of value as far as this particular issue goes.

regards, tom lane
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Joseph Adams
I apologize for my silence, as I've been busy reading up more on the internals of PostgreSQL.

From what I can tell, a big problem with my jails idea (as well as the variables Robert described) is that there really isn't a way to store context in the backend specifically for the end client (e.g. a PHP script) due to connection pooling. Also, I almost feel that storing such context would be a disadvantage, as it would harm some of the referential transparency that pooling and caching take advantage of, now and in the future. However, I'm not going to give up :)

Perhaps we could have some sort of LET statement that allows the client to pass data to the server, then have libpq automatically wrap queries with the LET statement (when necessary). Here's what it would look like to the PHP scripter:

    // New libpq function
    pg_set('current_user', 'bob');

    $result = pg_query_params(
        'SELECT answer FROM secrets WHERE user=current_user AND question=$1',
        array('Birth place'));

What this really does is something like:

    $result = pg_query_params(
        'LET current_user=$1 DO $2 $3',
        array(
            'bob',
            'SELECT answer FROM secrets WHERE user=current_user AND question=$1',
            'Birth place'));

Here, the hypothetical LET statement executes a query string, binding current_user to our desired value. The client library would wrap all future queries in this fashion.

Granted, it would be silly to pass the value itself to the server over and over, so a serious implementation would probably pass a context ID, and these variable assignments would live in the backend instead. Moreover, LET is a terrible keyword choice here, considering most PostgreSQL users won't need to use it explicitly thanks to additional libpq support.

Alternatively (this might require changing the client/server protocol), a context ID could be passed back and forth, thus providing a way to tell clients apart.

Implementing this idea requires adding to the backend and to libpq. The backend would need at least two new statements. One would set a variable of a session context, creating one if necessary and returning its ID. Another would execute a string as a parameter and bind both immediate arguments and session context to it. libpq would need a function to set a variable, and it would need to wrap queries it sends out with LET statements if necessary.

Note that these variables can't be used in pre-defined functions unless they are somehow declared in advance. One idea would be to first add global variable support, then make session-local contexts be able to temporarily reassign those variables. Another would be to provide an explicit declaration statement.

Would this make a good proposal for GSoC?: Implement the backend part of my proposal, and create a proof-of-concept wrapper demonstrating it. This way, I add the new statements, but don't mess around with existing functionality too much.
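The client-side wrapping described above can be sketched in a few lines of Python. This is only a model of what such a wrapper might build before handing the statement to the server: the `LetSession` class is hypothetical, the LET statement does not exist in PostgreSQL, and the comma-separated form for several variables is an assumption, since the proposal only shows a single binding.

```python
class LetSession:
    """Remembers client-side bindings and wraps each query in the hypothetical LET form."""

    def __init__(self):
        self.bindings = {}

    def set(self, name, value):
        # Counterpart of the proposed pg_set(). Only identifier-like names are
        # accepted, because the name is spliced into the statement text.
        if not name.isidentifier():
            raise ValueError("invalid variable name: %r" % name)
        self.bindings[name] = value

    def wrap(self, query, params=()):
        """Return (statement, parameters) as they would be sent to the server."""
        if not self.bindings:
            return query, list(params)
        names = sorted(self.bindings)
        # $1..$n carry the variable values, $n+1 carries the query text,
        # and the query's own arguments follow after that.
        assignments = ", ".join(
            "%s=$%d" % (name, i) for i, name in enumerate(names, start=1))
        n = len(names)
        arg_slots = " ".join("$%d" % (n + 2 + i) for i in range(len(params)))
        statement = "LET %s DO $%d %s" % (assignments, n + 1, arg_slots)
        values = [self.bindings[name] for name in names]
        return statement.strip(), values + [query] + list(params)
```

Calling `set('current_user', 'bob')` and then `wrap(...)` with the secrets query and `['Birth place']` reproduces the `LET current_user=$1 DO $2 $3` form from the message, with `'bob'`, the query text, and `'Birth place'` as the three parameters.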
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Dimitri Fontaine
Disclaimer: the following is only my view on the matter and is not meant as representative of the project views, which are reached through discussion and consensus. IOW, my 2¢.

Joseph Adams <joeyadams3.14159@gmail.com> writes:
> Perhaps we could have some sort of LET statement that allows the
> client to pass data to the server, then have libpq automatically wrap
> queries with the LET statement (when necessary). Here's what it would
> look like to the PHP scripter:

My first reaction reading this would be to keep the idea but extend the WITH syntax instead, so you'd have

    WITH BINDINGS bob AS (
        current_user = 'bob'
    )
    SELECT answer FROM secrets WHERE user=current_user AND question=$1;

That said, you can already (as of 8.4) do the following:

    WITH bob(name) AS (
        SELECT 'bob'
    )
    SELECT answer FROM secrets, bob WHERE user=bob.name AND question=$1;

The syntax WITH bob(current_user) is not possible because of the standard using current_user as a keyword (IIUC), but you get the idea.

> Granted, it would be silly to pass the value itself to the server over
> and over, so a serious implementation would probably pass a context
> ID, and these variable assignments would live in the backend instead.

I wonder if creating a temporary (I mean per-backend) new catalog in which to store the bindings (or whatever you name them), and then allowing another syntax like the following, would help here:

    WITH BINDINGS bob ()
    SELECT ...

The real problem though is that when using a transaction level pooling system you want to tie your bindings to a transaction, not to a session. So I'm not sure if storing the bindings in a local backend catalog is a must-have feature.

Regards,
--
dim
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Marko Tiikkaja
On 3/26/10 5:42 AM +0200, Joseph Adams wrote:
> // New libpq function
> pg_set('current_user', 'bob');
>
> $result = pg_query_params(
>     'SELECT answer FROM secrets WHERE user=current_user AND question=$1',
>     array('Birth place'));
>
>
> What this really does is something like:
>
> $result = pg_query_params(
>     'LET current_user=$1 DO $2 $3',
>     array(
>         'bob',
>         'SELECT answer FROM secrets WHERE user=current_user AND question=$1',
>         'Birth place')
>     );

Looks to me like this is already achievable with custom GUCs and views.

Regards,
Marko Tiikkaja
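One way to read the "custom GUCs and views" suggestion is sketched below in Python, as the statements a client might send. The setting name `myapp.current_user_id`, the view name, and the helper function are all made up for illustration; `set_config()` and `current_setting()` are real PostgreSQL functions, but whether this matches what was meant in the message is a guess. The `%s` placeholders assume a driver that uses that parameter style, and on 8.4-era servers a custom prefix such as `myapp` also had to be declared through `custom_variable_classes`.

```python
# Hypothetical custom setting; "myapp" is an invented prefix.
SETTING = "myapp.current_user_id"

# A view that filters on the setting instead of taking the user as a parameter.
# The secrets table and its columns follow the example quoted above.
VIEW_DDL = """
CREATE VIEW my_secrets AS
    SELECT question, answer
    FROM secrets
    WHERE "user" = current_setting('myapp.current_user_id');
""".strip()


def bind_user_statement(user_id, local=False):
    """Build a parameterized set_config() call.

    With local=True the value lasts only for the current transaction,
    which may suit transaction-level connection pooling better.
    """
    return ("SELECT set_config(%s, %s, %s)", (SETTING, str(user_id), local))
```

After running the statement returned by `bind_user_statement('bob')`, a plain `SELECT answer FROM my_secrets WHERE question = ...` would only see rows belonging to that user, without the application repeating the user in every query.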
On Thu, Mar 25, 2010 at 11:42 PM, Joseph Adams <joeyadams3.14159@gmail.com> wrote:
> From what I can tell, a big problem with my jails idea (as well as the
> variables Robert described) is that there really isn't a way to store
> context in the backend specifically for the end client (e.g. a PHP
> script) due to connection pooling. Also, I almost feel that storing
> such context would be a disadvantage, as it would harm some of the
> referential transparency that pooling and caching take advantage of,
> now and in the future. However, I'm not going to give up :)
>
> Perhaps we could have some sort of LET statement that allows the
> client to pass data to the server, then have libpq automatically wrap
> queries with the LET statement (when necessary). Here's what it would
> look like to the PHP scripter:
>
> // New libpq function
> pg_set('current_user', 'bob');
>
> $result = pg_query_params(
>     'SELECT answer FROM secrets WHERE user=current_user AND question=$1',
>     array('Birth place'));
>
>
> What this really does is something like:
>
> $result = pg_query_params(
>     'LET current_user=$1 DO $2 $3',
>     array(
>         'bob',
>         'SELECT answer FROM secrets WHERE user=current_user AND question=$1',
>         'Birth place')
>     );
>
>
> Here, the hypothetical LET statement executes a query string, binding
> current_user to our desired value. The client library would wrap all
> future queries in this fashion.
>
> Granted, it would be silly to pass the value itself to the server over
> and over, so a serious implementation would probably pass a context
> ID, and these variable assignments would live in the backend instead.
> Moreover, LET is a terrible keyword choice here, considering most
> PostgreSQL users won't need to use it explicitly thanks to additional
> libpq support.
>
> Alternatively (this might require changing the client/server
> protocol), a context ID could be passed back and forth, thus providing
> a way to tell clients apart.
>
> Implementing this idea requires adding to the backend and to libpq.
> The backend would need at least two new statements. One would set a
> variable of a session context, creating one if necessary and returning
> its ID. Another would execute a string as a parameter and bind both
> immediate arguments and session context to it. libpq would need a
> function to set a variable, and it would need to wrap queries it sends
> out with LET statements if necessary.
>
> Note that these variables can't be used in pre-defined functions
> unless they are somehow declared in advance. One idea would be to
> first add global variable support, then make session-local contexts be
> able to temporarily reassign those variables. Another would be to
> provide an explicit declaration statement.
>
> Would this make a good proposal for GSoC?: Implement the backend part
> of my proposal, and create a proof-of-concept wrapper demonstrating
> it. This way, I add the new statements, but don't mess around with
> existing functionality too much.

Hmm. I'm not sure exactly what problem you're trying to solve here. I don't think this is a particularly good design for supporting variables inside the server, since, well, it doesn't actually support variables inside the server. If we just want a crude hack for allowing the appearance of session-local server-side variables, that could be implemented entirely in client code - in fact it could be done as a thin wrapper around libpq that just does textual substitution of the variables actually referenced by a particular query. That wouldn't require any modifications to core PostgreSQL at all, and it would probably perform better too since you'd not send all the unnecessary variables with every query.

Of course, you're 100% correct that connection pooling won't necessarily play well with this feature, but that doesn't mean that we shouldn't implement it.
For one thing, not everybody uses connection pooling; for two things, I think global variables (that would behave sort of like a sequence - they'd act sort of like a single column single row relation) would also be useful, and those WOULD work in a connection-pooling environment. But, I think that implementing any kind of variable support in the backend is way too ambitious a project for a first-time hacker to get done in a couple of months. I would guess that's a two-year project for a first time hacker or a one-year project for an experienced hacker (or a three week project for Tom Lane). Here are some ideas from http://wiki.postgresql.org/wiki/Todo that I think MIGHT be closer to the right size for GSOC: Allow administrators to cancel multi-statement idle transactions Check for unreferenced table files created by transactions that were in-progress when the server terminated abruptly Add functions to check correctness of configuration files before they are loaded "live" Add JSON (JavaScript Object Notation) data type [tricky part will be getting community buy-in on which JSON library to use] Allow ALTER TABLE ... ALTER CONSTRAINT ... RENAME Allow ALTER TABLE to change constraint deferrability and actions Add missing object types for ALTER ... SET SCHEMA Add support for multiple pg_restore -t options, like pg_dump Allow triggers to be disabled in only the current session [without the necessity of modifying system tables] Allow single batch hash joins to preserve outer pathkeys [definitely harder than some of the above] Fix system views like pg_stat_all_tables to use set-returning functions, rather than views of per-column functions Other ideas: Allow per-tablespace effective_io_concurrency Add a GIST opclass for inet/cidr that can support an exclusion constraint for "cidr blocks do not overlap" ALTER VIEW ... 
DROP COLUMN (or alternatively/in addition RENAME COLUMN) (I now wait for the chorus of people telling me that these ideas are (a) too easy, (b) too hard, or (c) too biased toward my own priorities. I readily admit to (c) - I tried to list things here where I have some idea of what would be required to implement the feature, so it's therefore biased toward the parts of the system with which I'm familiar, which in turn are the ones I care about. Feel free to add your own ideas or critique these.) ...Robert
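The client-side "textual substitution" idea from the message above could look roughly like the following sketch. It is not libpq code: the `SessionVars` class and its `set()` and `rewrite()` methods are hypothetical names for this example. The wrapper replaces each bare identifier that names a known variable with a positional placeholder and appends the value to the parameter list, so only variables a query actually references are sent. Quoted literals and quoted identifiers are left alone. The tokenizer is deliberately naive: it ignores comments and dollar-quoted strings, and it would also rewrite a qualified name such as `t.current_user`.

```python
import re

# A single-quoted literal, a double-quoted identifier, or a bare identifier.
_TOKEN = re.compile(r"'(?:[^']|'')*'|\"(?:[^\"]|\"\")*\"|[A-Za-z_][A-Za-z0-9_]*")


class SessionVars:
    """Hypothetical client-side store of session variables."""

    def __init__(self):
        self._vars = {}

    def set(self, name, value):
        self._vars[name] = value

    def rewrite(self, sql, params=()):
        """Return (sql, params) with referenced variables turned into $n."""
        params = list(params)
        slots = {}  # variable name -> placeholder already assigned to it

        def substitute(match):
            token = match.group(0)
            # Leave quoted text and unknown identifiers untouched.
            if token[0] in "'\"" or token not in self._vars:
                return token
            if token not in slots:
                params.append(self._vars[token])
                slots[token] = "$%d" % len(params)
            return slots[token]

        return _TOKEN.sub(substitute, sql), params


session = SessionVars()
session.set("current_user", "bob")
session.set("locale", "en")  # never referenced below, so never sent

sql, params = session.rewrite(
    "SELECT answer FROM secrets WHERE user=current_user AND question=$1",
    ["Birth place"],
)
```

Because the rewrite only sees the query text, a variable referenced inside a view or function body on the server is invisible to it.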
Re: Proposal: access control jails (and introduction as aspiring GSoC student)
From: Joseph Adams
Date:
On Fri, Mar 26, 2010 at 12:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Hmm. I'm not sure exactly what problem you're trying to solve here.
> I don't think this is a particularly good design for supporting
> variables inside the server, since, well, it doesn't actually support
> variables inside the server. If we just want a crude hack for
> allowing the appearance of session-local server-side variables, that
> could be implemented entirely in client code - in fact it could be
> done as a thin wrapper around libpq that just does textual
> substitution of the variables actually referenced by a particular
> query. That wouldn't require any modifications to core PostgreSQL at
> all, and it would probably perform better too since you'd not send all
> the unnecessary variables with every query.

One problem with textual substitution is that implicit variable use (e.g. selecting from a view) can't be substituted, at least not trivially. As for "sending unnecessary variables with every query", my idea was to store those variables in a global table keyed by context ID, then just send that context ID with every query.

> But, I think that implementing any kind of variable support in the
> backend is way too ambitious a project for a first-time hacker to get
> done in a couple of months. I would guess that's a two-year project
> for a first-time hacker or a one-year project for an experienced
> hacker (or a three-week project for Tom Lane). Here are some ideas
> from http://wiki.postgresql.org/wiki/Todo that I think MIGHT be closer
> to the right size for GSoC:
>
> [...]
>
> Add JSON (JavaScript Object Notation) data type [tricky part will be
> getting community buy-in on which JSON library to use]

The JSON idea caught my eye. I guess the best approach here would be not to use an external library, but to implement it manually using flex/bison. Most of the work would probably revolve around converting things to/from PostgreSQL types, writing test cases, and getting it integrated; writing the parser itself should be a "piece of cake".

At first, I figured adding JSON support would be almost too trivial: just parse it, then you're done. After seeing that src/backend/utils/adt/xml.c is 3497 lines, I learned there's a bit more to it :)

I skimmed through some JSON implementations in C, and I didn't find any using bison/flex. From the looks of it, I do like JSON_parser ( http://fara.cs.uni-potsdam.de/~jsg/json_parser/ ) because it appears to be written for speed.

I think one benefit of adding JSON support is that it would provide a way to store EAV-type data with less overhead than XML (and no dependency on an external library). If this were the only goal, binary encoding would be even better. However, I suppose JSON is more popular and easier to work with in practice.
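The "global table keyed by context ID" mentioned in this message can be modelled in a few lines. The sketch below is an in-memory stand-in, not backend code: `ContextStore`, `set_variable()` and `execute()` are hypothetical names for this example. They mirror the two operations described earlier in the thread: one sets a variable and creates the context if needed, returning its ID, and the other binds both the immediate arguments and the stored context to a query.

```python
import itertools


class ContextStore:
    """In-memory stand-in for a backend-side table keyed by context ID."""

    def __init__(self):
        self._contexts = {}
        self._next_id = itertools.count(1)

    def set_variable(self, context_id, name, value):
        """Set a variable; create the context when context_id is None.

        Returns the context ID so the client can send it with later queries.
        """
        if context_id is None:
            context_id = next(self._next_id)
            self._contexts[context_id] = {}
        self._contexts[context_id][name] = value
        return context_id

    def execute(self, context_id, sql, args=()):
        """Return what would be bound: query, immediate args, context vars."""
        variables = dict(self._contexts[context_id])  # copy, not a live view
        return sql, tuple(args), variables


store = ContextStore()
bob = store.set_variable(None, "current_user", "bob")
alice = store.set_variable(None, "current_user", "alice")
bound = store.execute(
    bob,
    "SELECT answer FROM secrets WHERE user=current_user AND question=$1",
    ["Birth place"],
)
```

Only the small integer ID travels with each query. Two clients that share a pooled connection stay distinguishable as long as each sends its own ID.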
On Fri, Mar 26, 2010 at 10:19 PM, Joseph Adams <joeyadams3.14159@gmail.com> wrote:
> One problem with textual substitution is that implicit variable use
> (e.g. selecting from a view) can't be substituted, at least not
> trivially. As for "sending unnecessary variables with every query",
> my idea was to store those variables in a global table keyed by
> context ID, then just send that context ID with every query.

That could work, though I'm not totally sure I like the design.

>> But, I think that implementing any kind of variable support in the
>> backend is way too ambitious a project for a first-time hacker to get
>> done in a couple of months. I would guess that's a two-year project
>> for a first-time hacker or a one-year project for an experienced
>> hacker (or a three-week project for Tom Lane). Here are some ideas
>> from http://wiki.postgresql.org/wiki/Todo that I think MIGHT be closer
>> to the right size for GSoC:
>>
>> [...]
>>
>> Add JSON (JavaScript Object Notation) data type [tricky part will be
>> getting community buy-in on which JSON library to use]
>
> The JSON idea caught my eye. I guess the best approach here would be
> not to use an external library, but to implement it manually using
> flex/bison. Most of the work would probably revolve around converting
> things to/from PostgreSQL types, writing test cases, and getting it
> integrated; writing the parser itself should be a "piece of cake".

Well, that's one of the central questions: is that the best approach, or should we integrate with an external library, and if so, which one? That's part of why this is a research problem.

> At first, I figured adding JSON support would be almost too trivial:
> just parse it, then you're done. After seeing that
> src/backend/utils/adt/xml.c is 3497 lines, I learned there's a bit
> more to it :)
>
> I skimmed through some JSON implementations in C, and I didn't find
> any using bison/flex. From the looks of it, I do like JSON_parser (
> http://fara.cs.uni-potsdam.de/~jsg/json_parser/ ) because it appears
> to be written for speed.
>
> I think one benefit of adding JSON support is that it would provide a
> way to store EAV-type data with less overhead than XML (and no
> dependency on an external library). If this were the only goal,
> binary encoding would be even better. However, I suppose JSON is more
> popular and easier to work with in practice.

Yeah. And the row can always be toasted afterwards if appropriate.

...Robert
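To make the "less overhead than XML" remark concrete, the sketch below serializes the same small attribute set both ways using only the Python standard library. The attribute names and the XML layout (an `attrs` root with one `attr` element per attribute) are assumptions made for this example. Other XML layouts would give different sizes, so this is one data point, not a measurement of the proposed data type.

```python
import json
import xml.etree.ElementTree as ET


def eav_as_json(attrs):
    """Serialize attribute/value pairs as compact JSON text."""
    return json.dumps(attrs, separators=(",", ":"))


def eav_as_xml(attrs):
    """Serialize the same pairs as XML, one <attr name="..."> per pair."""
    root = ET.Element("attrs")
    for name, value in attrs.items():
        ET.SubElement(root, "attr", name=name).text = str(value)
    return ET.tostring(root, encoding="unicode")


sample = {"color": "red", "size": "XL", "weight": "1.2"}
json_text = eav_as_json(sample)
xml_text = eav_as_xml(sample)
sizes = {"json": len(json_text), "xml": len(xml_text)}
```

For this layout the XML text repeats an opening and closing tag for every attribute, which is where the extra bytes come from. A binary encoding, as the message notes, could be smaller still.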