Re: narwhal and PGDLLIMPORT
От | Noah Misch |
---|---|
Тема | Re: narwhal and PGDLLIMPORT |
Дата | |
Msg-id | 20141020050331.GA259313@tornado.leadboat.com обсуждение исходный текст |
Ответ на | Re: narwhal and PGDLLIMPORT (Noah Misch <noah@leadboat.com>) |
Ответы |
Re: narwhal and PGDLLIMPORT
(Tom Lane <tgl@sss.pgh.pa.us>)
Re: narwhal and PGDLLIMPORT (Andres Freund <andres@2ndquadrant.com>) Re: narwhal and PGDLLIMPORT (Noah Misch <noah@leadboat.com>) |
Список | pgsql-hackers |
On Wed, Oct 15, 2014 at 12:53:03AM -0400, Noah Misch wrote: > On Tue, Oct 14, 2014 at 07:07:17PM -0400, Tom Lane wrote: > > Dave Page <dpage@pgadmin.org> writes: > > > On Tue, Oct 14, 2014 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > >> I think we're hoping that somebody will step up and investigate how > > >> narwhal's problem might be fixed. > > I have planned to look at reproducing narwhal's problem once the dust settles > on orangutan, but I wouldn't mind if narwhal went away instead. > > No argument here. I would kind of like to have more than zero > > understanding of *why* it's failing, just in case there's more to it > > than "oh, probably a bug in this old toolchain". But finding that out > > might well take significant time, and in the end not tell us anything > > very useful. > > Agreed on all those points. I reproduced narwhal's problem using its toolchain on another 32-bit Windows Server 2003 system. The crash happens at the SHGetFolderPath() call in pqGetHomeDirectory(). A program can acquire that function via shfolder.dll or via shell32.dll; we've used the former method since commit 889f038, for better compatibility[1] with Windows NT 4.0. On this system, shfolder.dll's version loads and unloads shell32.dll. In PostgreSQL built using this older compiler, shfolder.dll:SHGetFolderPath() unloads libpq in addition to unloading shell32! That started with commit 846e91e. I don't expect to understand the mechanism behind it, but I recommend we switch back to linking libpq with shell32.dll. The MSVC build already does that in all supported branches, and it feels right for the MinGW build to follow suit in 9.4+. Windows versions that lack the symbol in shell32.dll are now ancient history. I happened to try the same contrib/dblink test suite on PostgreSQL built with modern MinGW-w64 (i686-4.9.1-release-win32-dwarf-rt_v3-rev1). That, too, gave a crash-like symptom starting with commit 846e91e. Specifically, a backend that LOADed any module linked to libpq (libpqwalreceiver, dblink, postgres_fdw) would suffer this after calling exit(0): === 3056 2014-10-20 00:40:15.163 GMT LOG: disconnection: session time: 0:00:00.515 user=cyg_server database=template1 host=127.0.0.1port=3936 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. 9300 2014-10-20 00:40:15.163 GMT LOG: server process (PID 3056) exited with exit code 3 === The mechanism turned out to be disjoint from the mechanism behind the ancient-compiler crash. Based on the functions called from exit(), my best guess is that exit() encountered recursion and used something like an abort() to escape. (I can send the gdb transcript if anyone is curious to see the gory details.) The proximate cause was commit 846e91e allowing modules to use shared libgcc. A 32-bit libpq acquires 64-bit integer division from libgcc. Passing -static-libgcc to the link restores the libgcc situation as it stood before commit 846e91e. The main beneficiary of shared libgcc is C++/Java exception handling, so PostgreSQL doesn't care. No doubt there's some deeper bug in libgcc or in PostgreSQL; loading a module that links with shared libgcc should not disrupt exit(). I'm content with this workaround. Patches attached. For readers inclined toward animal husbandry, we could benefit from more MinGW buildfarm coverage. There are three compiler strains (classic MinGW, 32-bit MinGW-w64, 64-bit MinGW-w64) and three supported Windows Server versions. [1] http://www.postgresql.org/message-id/flat/6BCB9D8A16AC4241919521715F4D8BCE476665@algol.sollentuna.se
Вложения
В списке pgsql-hackers по дате отправления: