Re: Report search_path value back to the client.

Поиск
Список
Период
Сортировка
От Alexander Kukushkin
Тема Re: Report search_path value back to the client.
Дата
Msg-id CAFh8B=kYfZekb7tVZZSNWQzXSYMKmtJeQLqbLZbhjdLujENRcQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Report search_path value back to the client.  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Hi,

2015-02-20 16:19 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
2. Semantics.  The existing GUC_REPORT variables are all things that
directly relate to client-visible behavior, eg how values of type
timestamp will be interpreted and printed.  search_path is no such thing,
so it's hard to make a principled argument for reporting it that doesn't
lead to the conclusion that you want *everything* reported.  (In
particular, I don't believe at all your argument that this would help
pgbouncer significantly.)
 
Well, I will try to explain in details why do we want to use pgbouncer in front of postgres and what stops us from doing this.
Usually our applications don't access data directly but via stored procedures. One database can be accessed by different applications in different versions. Different versions of applications may use different sets and versions of stored procedures. It achieved by keeping stored procedures in different schemas (we call them API schema). For example at the same time 'app-v1' is working with stored procedures located in schema 'app_api_v1', 'app-v2' with 'app_api_v2' and 'other-app-v3' is working with 'other_app_api_v3'. This is very convenient. Before deploying new version of application we create new 'api schema' for it and then create all stored procedures in this schema. Directly after connecting to database application executes 'set search_path to app_api_vX, public' and after this point it always call stored procedures without specifying schema name. This way we are able to keep different versions of the same stored procedure in database.
And here the problem come. Sometimes we have dozens of application instances. Each application instance might keep more than one connection to database. In total there can be hundreds connections but most of the time they are not in state 'active'. For such cases connection pools have been invented. It's cheep to keep even thousands open connections to event-based connection pool comparing with the real database. The first idea that comes into mind - lets put pgbouncer with pool_mode = transaction in front of database. But here our good old friend 'set search_path to app_api_vX, public' does weird things with pgbouncer. The same server connection might be reused by different applications in different versions and soon or later application will get: 'ERROR:  function foo() does not exist' because search_path for this connection between pgbouncer and postgres has been changed by another application.

My first idea was - it should be possible to fix pgbouncer and make it aware of current search_path value of connections between application <-> pgbouncer and pgbouncer <-> postgres. I started to look into it's source code and yea, this can be done relatively easy, because pgbouncer already does similar job for "client_encoding", "DateStyle", "TimeZone", "standard_conforming_strings" and "application_name", which doesn't relate to the 'application visible behaviour' more than the search_path. Pgbouncer always keeps current values of these variables for each connection and when it takes a connection from the pool, it compares the values of client_connection and server_connection parameters. If they don't match - it changes the value on the server side by calling 'set variable to client_value'.

Execution of statement 'set some_variable to some_value' from the client side is not the unique way to change value of variable. As you already mentioned such statement can be executed from stored procedure as well. So scanning of client statements is not an option. Only postgres is able to tell at any time current value of search_path and asking it every time for current value also is not an option, because this will double amount of queries executed by database. Luckily, postgres already has this wonderful mechanism as GUC_REPORT...

If the postgres will report the value of search_path on connect and on change - pgbouncer could be really easy patched:

diff --git a/include/varcache.h b/include/varcache.h
index 4984b01..916fa01 100644
--- a/include/varcache.h
+++ b/include/varcache.h
@@ -5,6 +5,7 @@ enum VarCacheIdx {
        VTimeZone,
        VStdStr,
        VAppName,
+       VSearchPath,
        NumVars
 };
 
diff --git a/lib b/lib
index fe4dd02..c0111d4 160000
--- a/lib
+++ b/lib
@@ -1 +1 @@
-Subproject commit fe4dd0255ea796e64599c0fc28e6900fbc07e6aa
+Subproject commit c0111d42712202df5ccb01c4e07e88ab5c8b94b8
diff --git a/src/varcache.c b/src/varcache.c
index 6321dc5..cb3f559 100644
--- a/src/varcache.c
+++ b/src/varcache.c
@@ -35,6 +35,7 @@ static const struct var_lookup lookup [] = {
  {"TimeZone",                    VTimeZone },
  {"standard_conforming_strings", VStdStr },
  {"application_name",            VAppName },
+ {"search_path",                 VSearchPath },
  {NULL},
 };

We could possibly alleviate problem #1 by changing the behavior of guc.c
so it doesn't report every single transition of flagged variables, but
only (say) once just before ReadyForQuery if the variable changed in
the just-finished command.  That's not exactly a one-line fix though.

Probably for some variables this really make sense. Inside stored procedures any of GUC_REPORT variables can be changed the same way as search_path (thousands of times), but not all of these variables directly relate to the application visible behaviour.

Regards,
Alexander Kukushkin

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thom Brown
Дата:
Сообщение: Re: Join push-down support for foreign tables
Следующее
От: Kouhei Kaigai
Дата:
Сообщение: Re: Join push-down support for foreign tables