Re: pgsql: Add putenv support for msvcrt from Visual Studio 2013

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Re: pgsql: Add putenv support for msvcrt from Visual Studio 2013
Дата
Msg-id 20161116072246.GA2297058@tornado.leadboat.com
обсуждение исходный текст
Ответ на Re: pgsql: Add putenv support for msvcrt from Visual Studio 2013  (Christian Ullrich <chris@chrullrich.net>)
Ответы Re: pgsql: Add putenv support for msvcrt from Visual Studio 2013
Список pgsql-hackers
On Tue, Apr 26, 2016 at 07:39:29PM +0200, Christian Ullrich wrote:
> * Christian Ullrich wrote:
> 
> >wrong even without considering the debug/release split. If we load a
> >compiled extension built with a CRT we have not seen yet, _after_ the
> >first call to pgwin32_putenv(), that module's CRT's view of its
> >environment will be frozen because we will never attempt to update it.
> 
> Four patches attached:
> 
> master --- 0001 --- 0002 --- 0003
>                          \
>                           `- 0004
> 
> 0001 is just some various fixes to set the stage.

Patch 1 looks good, except that it should be three patches:

- cosmetic parts: change whitespace and s/0/NULL/
- remove CloseHandle() call
- probe for debug CRT modules, not just release CRT modules

Please split along those boundaries.  I plan to back-patch all of that.  I've
seen some gettext builds mysteriously ignore "SET lc_messages = ..."; ignoring
debug CRTs could have been the cause.

> I tested this with a project
> (<https://bitbucket.org/chrullrich/pgputenv-demo>) that contains two DLLs:

That's a pithy test; thanks for assembling it.

> Even with patch 0004, there is still a race condition between the main
> thread and a theoretical additional thread created by some other component
> that unloads some CRT between GetProcAddress() and the _putenv() call, but
> that is hardly fixable.

I think you can fix it by abandoning GetModuleHandle() in favor of
GetModuleHandleEx() + GetProcessAddress() + FreeLibrary().  I recommend also
moving the SetEnvironmentVariable() call before the putenv calls.  That way,
if a CRT loads while pgwin32_putenv() is executing, the newly-loaded CRT will
always have the latest value.  (I'm assuming CRTs populate their environment
from GetEnvironmentStrings(), because I can't think of an alternative.)

As a separate patch, I am inclined to remove the "#ifdef _MSC_VER" directive,
activating its enclosed code under all compilers.  A MinGW-built postgres.exe
has the same need to update all CRTs.

> The fact that master looks like it does means that there have not been many
> (or any) complaints about missing cross-module environment variables. If
> nobody ever needs to see a variable set elsewhere, we have a very simple
> solution: Why don't we simply throw out the whole #ifdef _MSC_VER block?

pgwin32_putenv() originated, in commit 0154345, to make "SET lc_messages =
..." effective when gettext uses a different CRT from postgres.exe.  I suspect
it also makes krb_server_keyfile effective when GSS uses a different CRT.
Those are achievements worth keeping.  I'm not surprised about the lack of
complaints, because environment variables don't often change after backend
startup.  Here are some ways one could notice the difference between master
and patches 2+3 or 2+4:

- Use shared_preload_libraries to load a module that reads LC_CTYPE or LC_COLLATE.  CheckMyDatabase() sets those
variablessubsequent to process_shared_preload_libraries().
 

- Load, at any time, a module that reads LC_MESSAGES.  There's little reason to read that variable outside of gettext.
Amodule could use a gettext DLL other than the postgres.exe gettext DLL, but such a module would need to work around
pg_bindtextdomain()always using the postgres.exe gettext.
 

- Load, at any time, a module that itself changes environment variables, other than LC_MESSAGES, after backend startup.
I suspect PL/Python suffices.
 

Those are plausible scenarios, but they're sufficiently specialized that
problems could lie unnoticed or undiagnosed for years.  I lean against
back-patching anything from patches 2, 3 or 4.

> There is another possible fix, ugly as sin, if we really need to keep the
> whole environment update machinery *and* cannot do the full loop each time.
> Patch 0003 pins each CRT when we see it for the first time.
> GET_MODULE_HANDLE_EX_FLAG_PIN is documented as "The module stays loaded
> until the process is terminated, no matter how many times FreeLibrary is
> called", so the unload race cannot occur anymore.

I prefer the simplicity of abandoning the cache (patch 4), if it performs
decently.  Would you compare the performance of patch 1, patches 1+2+3, and
patches 1+2+4?  This should measure the right thing (after substituting
@libdir@ for your environment):

CREATE FUNCTION putenv(text)  RETURNS void  AS '@libdir@/regress.dll', 'regress_putenv'  LANGUAGE C STRICT;
\timing on
SELECT count(putenv('foo=' || n)) FROM generate_series(1,1000000) t(n);

(I'm interested for the sake of backend startup time.  I recall nine putenv()
in every backend startup, seven in main() and two in CheckMyDatabase().)

Thanks,
nm



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tatsuo Ishii
Дата:
Сообщение: Re: Patch: Implement failover on libpq connect level.
Следующее
От: "Okano, Naoki"
Дата:
Сообщение: Re: Adding the optional clause 'AS' in CREATE TRIGGER