Efficiency of stored procedure vs large join
От | Malcolm Hutty |
---|---|
Тема | Efficiency of stored procedure vs large join |
Дата | |
Msg-id | 3DCBFAD1.3030502@hutty.com обсуждение исходный текст |
Ответы |
Re: Efficiency of stored procedure vs large join
|
Список | pgsql-novice |
Which is more efficient, a stored procedure that performs a query followed by a second, where the choice of the second query depends on the truth/falsity of the first, or a single large join? My previous experience was with relatively simple applications built with Apache+PHP+MySQL. My current project is considerably more complex, which is why I've switched to PostgreSQL. The application I'm building has a complex roles-based permissions system, and it is necessary to check the users permissions on every application-level operation, which is to say on a very high proportion of database queries. Because this security check occurs so frequently, efficiency here is extremely important. Details of my particular situation are as follows: User, roles, permissions, role-permission relationships for particular records, and role-permission defaults for groups of records (those built from a particular "template") are all stored in relevent tables. The conceptual logic is essentially like this: For a given user, database record (in certain tables) and operation: * What are the roles this user has? * What template was this database record built from? (this is a field within the record that is a key to another record) * Are any of the user’s roles permitted to perform the operation by the template? * If so, are any of the user’s roles prohibited from performing the operation by the individual record? If prohibited, return FALSE; otherwise, return TRUE. * If not, any of the user’s roles permitted to perform the operation by the record? If permitted, return TRUE; otherwise, return NULL. Returning TRUE means the operation is permitted; FALSE or NULL means that it is not. My choices seem to be: (1) implement the above logic in PL/pgSQL, so all my database queries will have this form: SELECT foo FROM bar WHERE thingy='wibble' AND SecurityCheck(<userid>,bar.id,<operation>); (2) do it in SQL: SELECT foo FROM bar B WHERE thingy='wibble' AND ((B.template IN ( SELECT DISTINCT D.template FROM securityprivssets S, securitytemplatedefaults D, roledata R WHERE R.person=<userid> AND D.role=R.role AND D.privset=S.id AND D.template = B.template AND S.operation = <operation> AND S.value = TRUE ) ) OR ( B.id IN ( SELECT DISTINCT O.objectid FROM securityprivssets S, securityobjectACL O, roledata R WHERE R.person=$userid AND O.role=R.role AND O.privset=S.id AND O.objectid = Bid AND S.operation = <operation> AND S.value = TRUE ) )) AND ( B.id NOT IN ( SELECT DISTINCT O.objectid FROM securityprivssets S, securityobjectACL O, roledata R WHERE R.person=$userid AND O.role=R.role AND O.privset=S.id AND O.objectid = B.id AND S.operation = <operation> AND S.value = FALSE ) ); On the one hand (with pure SQL), I'm sending a much larger length of query text across the network from PHP to the database, and I might be constructing a very large joined resultset before it gets trimmed down (I don't know how the optimisation works). On the other hand (with a stored procedure) I'm performing several queries, taking the resultsets out of the Postgres optimiser and iteritively querying each result from PL/pgSQL. Is that a bad thing? Or is it no worse than the joined subselects in the big query? Does it make a performance difference at all? I don't really care about query legibility; the whole query is being constructed programmatically anyway, so appending the large fragment above as a string (with variable substitution) to each principal query fragment doesn't harm application-level legibility: it'll look like this in my PHP code: ... $querystring = "SELECT foo FROM bar WHERE thingy='wibble'"; $querystring .= getSecurityCheckString($userid,$operation); $query->Execute($querystring) ... Thanks in advance for your advice, Malcolm.
В списке pgsql-novice по дате отправления: