Re: Schema variables - new implementation for Postgres 15

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: Schema variables - new implementation for Postgres 15
Дата
Msg-id CAFj8pRCZ9fVAC6h-s8fHS9muXmPwvHcwAX3omRJm9vUaQTMUcA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Schema variables - new implementation for Postgres 15  (Julien Rouhaud <rjuju123@gmail.com>)
Ответы Re: Schema variables - new implementation for Postgres 15  (Julien Rouhaud <rjuju123@gmail.com>)
Список pgsql-hackers


st 26. 1. 2022 v 8:23 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Tue, Jan 25, 2022 at 10:53:00PM +0100, Pavel Stehule wrote:
>
> út 25. 1. 2022 v 6:18 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
> >
> > First, I don't think that acquiring the lock in
> > get_session_variable_type_typmod_collid() and
> > prepare_variable_for_reading() is
> > the correct approach.  In transformColumnRef() and transformLetStmt() you
> > first
> > call IdentifyVariable() to check if the given name is a variable without
> > locking it and later try to lock the variable if you get a valid Oid.
> > This is
> > bug prone as any other backend could drop the variable between the two
> > calls
> > and you would end up with a cache lookup failure.  I think the lock should
> > be
> > acquired during IdentifyVariable.  It should probably be optional as one
> > codepath only needs the information to raise a warning when a variable is
> > shadowed, so a concurrent drop isn't a problem there.
> >
>
> I moved lock to IdentifyVariable routine

+IdentifyVariable(List *names, char **attrname, bool lockit, bool *not_unique)
+{
[...]
+               return varoid_without_attr;
+           }
+           else
+           {
+               *attrname = c;
+               return varoid_with_attr;
[...]
+
+   if (OidIsValid(varid) && lockit)
+       LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
+
+   return varid;

There are still some code paths that may not lock the target variable when
required.

fixed
 

Also, the function comment doesn't say much about attrname handling, it should
be clarifed.  I think it should initially be set to NULL, to make sure that
it's always a valid pointer after the function returns.

done
 


> attached updated patch

Various comments on the patch:

No test for GRANT/REVOKE ... ALL VARIABLES IN SCHEMA, maybe it would be good to
have one?

done
 

Documentation:

catalogs.sgml:

You're still using the old-style 4 columns table, it should be a single column
like the rest of the file.

done
 

+  <para>
+   The <command>CREATE VARIABLE</command> command creates a session variable.
+   Session variables, like relations, exist within a schema and their access is
+   controlled via <command>GRANT</command> and <command>REVOKE</command>
+   commands.  Changing a session variable is non-transactional.
+  </para>

The "changing a session variable is non-transactional" is ambiguous.  I think
that only the value part isn't transactional, the variable metadata themselves
(ALTER VARIABLE and other DDL) are transactional right?  This should be
explicitly described here (although it's made less ambiguous in the next
paragraph).

sure, DDL of session variables are transactional. I removed this sentence.


+  <para>
+   Session variables are retrieved by the <command>SELECT</command> SQL
+   command.  Their value is set with the <command>LET</command> SQL command.
+   While session variables share properties with tables, their value cannot be
+   updated with an <command>UPDATE</command> command.
+  </para>

should this part mention that session variables can be shadowed?  For now the
only mention to that is in advanced.sgml.

good idea, I wrote note about it there
 

+      The <literal>DEFAULT</literal> clause can be used to assign a default
+      value to a session variable.

The expression is lazily evaluated during the session first use of the
variable.  This should be documented as any usage of volatile expression will
be impacted.

done

 

+      The <literal>ON TRANSACTION END RESET</literal>
+      clause causes the session variable to be reset to its default value when
+      the transaction is committed or rolled back.

As far as I can see this clauses doesn't play well with IMMUTABLE VARIABLE, as
you can reassign a value once the transaction ends.  Same for DISCARD [ ALL |
VARIABLES ], or LET var = NULL (or DEFAULT if no default value).  Is that
intended?

I think so it is expected. The life scope of assigned (immutable) value is limited to transaction (when ON TRANSACTION END RESET).
DISCARD is used for reset of session, and after it, you can write the value first time.

I enhanced doc in IMMUTABLE clause


+   <literal>LET</literal> extends the syntax defined in the SQL
+   standard. The <literal>SET</literal> command from the SQL standard
+   is used for different purposes in <productname>PostgreSQL</productname>.

I don't fully understand that.  Are (session) variables defined in the SQL
standard?  If yes, all the other documentation pages should clarify that as
they currently say that this is a postgres extension.  If not, this part should
made it clear what is defined in the standard.

I reread standard more carefully, and it looks so SQL/PSM doesn't define global variables ever. The modules defined by SQL/PSM can holds only temporal tables or routines. Unfortunately, this part of standard is almost dead, and there is not referential implementation. The most near to standard in this area is DB2, but global session variables are proprietary feature. The usage is very similar to our session variables with one significant difference - the global session variables can be modified by commands SELECT INTO, VALUES INTO, EXECUTE INTO and SET (Our session variables can be modified just by LET command.). I am sure, so if SQL/PSM supports global session variables, then it uses SET statement - like DB2, but I didn't find any note about support in standard.

I think so the best comment to compatibility is just

  <para>
   The <command>LET</command> is a <productname>PostgreSQL</productname>
   extension.
  </para>

 

In revoke.sgml:
+ REVOKE [ GRANT OPTION FOR ]
+     { { READ | WRITE } [, ...] | ALL [ PRIVILEGES ] }
+     ON VARIABLE <replaceable>variable_name</replaceable> [, ...]
+     FROM { [ GROUP ] <replaceable class="parameter">role_name</replaceable> | PUBLIC } [, ...]
+     [ CASCADE | RESTRICT ]

there's no extra documentation for that, and therefore no clarification on
variable_name.

This is same like function_name, domain_name, ...
 

VariableIsVisible():
+                * If it is in the path, it might still not be visible; it could be
+                * hidden by another relation of the same name earlier in the path. So
+                * we must do a slow check for conflicting relations.

should it be "another variable of the same name"?


yes, fixed

 

Tab completion: CREATE IMMUTABLE VARIABLE is not handled

fixed
 


pg_variable.c:
Do we really need both session_variable_get_name() and
get_session_variable_name()?

They are different - first returns possibly qualified name, second returns only name. Currently it is used just for error messages in transformAssignmentIndirection, and I think so it is good for consistency with other usage of this routine (transformAssignmentIndirection).
 

+/*
+ * Fetch all fields of session variable from the syscache.
+ */
+void
+initVariable(Variable *var, Oid varid, bool missing_ok, bool fast_only)

As least fast_only should be documented in the function comment, especially
regarding var->varname, since:

+   var->oid = varid;
+   var->name = pstrdup(NameStr(varform->varname));
[...]
+   if (!fast_only)
+   {
+       Datum       aclDatum;
+       bool        isnull;
+
+       /* name */
+       var->name = pstrdup(NameStr(varform->varname));A
[...]
+   else
+   {
+       var->name = NULL;

is the output value guaranteed or not?  In any case it shouldn't be set twice.

It was messed, fixed
 

Also, I don't see any caller for missing_ok == true, should we remove it?

removed
 

+/*
+ * Create entry in pg_variable table
+ */
+ObjectAddress
+VariableCreate(const char *varName,
[...]
+   /* dependency on any roles mentioned in ACL */
+   if (varacl != NULL)
+   {
+       int         nnewmembers;
+       Oid        *newmembers;
+
+       nnewmembers = aclmembers(varacl, &newmembers);
+       updateAclDependencies(VariableRelationId, varid, 0,
+                             varOwner,
+                             0, NULL,
+                             nnewmembers, newmembers);

Shouldn't you use recordDependencyOnNewAcl() instead?  Also, sn't it missing a
recordDependencyOnOwner()?

changed and fixed
 

sessionvariable.c:

+ * Although session variables are not transactional, we don't
+ * want (and we cannot) to run cleaning immediately (when we
+ * got sinval message). The value of session variables can
+ * be still used or the operation that emits cleaning can be
+ * reverted. Unfortunatelly, this check can be done only in
+ * when transaction is committed (the check against system
+ * catalog requires transaction state).

This was the original idea, but since there's now locking to make all DDL safe,
the metadata should be considered fully transactional and no session should
still be able to use a concurrently dropped variable.  Also, the invalidation
messages are not sent until the transaction is committed.  So is that approach
still needed (at least for things outside ON COMMIT DROP / ON TRANSACTION END
RESET)?

I enhanced comment
 

I'm also attaching a 3rd patch with some proposition for documentation
rewording (including consistent use of *session* variable), a few comments
rewording, copyright year bump and minor things like that.

Thank you very much for it. This patch is based on your changes.

Regards

Pavel
 

Note that I still didn't really review pg_variable.c or sessionvariable.c since
there might be significant changes there for either the sinval / immutable part
I mentioned.
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dilip Kumar
Дата:
Сообщение: Re: [BUG]Update Toast data failure in logical replication
Следующее
От: Bharath Rupireddy
Дата:
Сообщение: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)