Re: BUG #18735: Specific multibyte character in psql file path command parameter for Windows
From | Tatsuo Ishii |
---|---|
Subject | Re: BUG #18735: Specific multibyte character in psql file path command parameter for Windows |
Date | |
Msg-id | 20241207.081412.2050532354647835961.ishii@postgresql.org |
In response to | Re: BUG #18735: Specific multibyte character in psql file path command parameter for Windows (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: BUG #18735: Specific multibyte character in psql file path command parameter for Windows
|
List | pgsql-bugs |
> Tatsuo Ishii <ishii@postgresql.org> writes:
>> I have looked into canonicalize_path() and found this:
>
>>     if (*p == '\\')
>>         *p = '/';
>
> Right, that's where the trouble is. It'd be easy enough to make
> that loop (and the similar one in cleanup_path) encoding-aware,
> if we knew what encoding applies. Deciding that is the sticky part.
>
> After sleeping on it, I'm coming around to the opinion that
> client_encoding (pset.encoding) is what to use in psql, for
> two reasons:
> * we already do our best to set that correctly, and the user
> is able to change it if it's wrong;
> * as previously noted, psqlscan.l will do the wrong things
> if it's not set correctly, so you're probably already hosed
> if working in a non-server-safe encoding with the wrong
> setting of client_encoding.

I think the encoding we need to supply to canonicalize_path() is not
necessarily the same as client_encoding. For example, we could set
client_encoding to UTF-8 but use a file whose name is encoded in
Shift-JIS. I think what we really need to supply to
canonicalize_path() is the "file system encoding", not
client_encoding.

Among the file system encodings, the only problematic one is
Shift-JIS. As far as I know, currently there's no OS except Windows
which uses Shift-JIS as the file system encoding. So if the OS is
Japanese Windows, we can probably safely assume that the file system
encoding is Shift-JIS. If we know how to determine inside
canonicalize_path() that the OS is Japanese Windows, we don't need to
change its API.

Quick googling found this page (sorry, in Japanese)

https://tarenagashi.hatenablog.jp/entry/2023/07/17/160149

and it says:

- In Windows, the "system locale" represents the language/country used.
- The code for the system locale is called "LCID", and it's 1041
  (decimal) for Japanese/Japan.
- There are some APIs to obtain the LCID (GetSystemDefaultLocaleName etc.)

I am not familiar with Windows and I cannot test these. Can someone
confirm?
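To illustrate what "encoding-aware" could mean for that loop, here is a minimal sketch, assuming the path is known to be Shift-JIS. In Shift-JIS, 0x5C (backslash) can appear as the second byte of a two-byte character, so the loop has to step over such characters as a unit. The function names sjis_is_lead_byte() and backslash_to_slash_sjis() are hypothetical and are not existing PostgreSQL functions; this is not the actual code of canonicalize_path(), only a stand-in for the loop quoted above.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not a PostgreSQL function): true if c is the first
 * byte of a two-byte Shift-JIS character (0x81-0x9F or 0xE0-0xFC).
 */
static bool
sjis_is_lead_byte(unsigned char c)
{
	return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/*
 * Hypothetical stand-in for the backslash loop: convert '\' to '/' in
 * place, but skip the trail byte of each two-byte Shift-JIS character so
 * that a 0x5C trail byte is left untouched.
 */
static void
backslash_to_slash_sjis(char *path)
{
	char	   *p = path;

	while (*p != '\0')
	{
		/* Step over a complete two-byte character without inspecting it. */
		if (sjis_is_lead_byte((unsigned char) *p) && p[1] != '\0')
		{
			p += 2;
			continue;
		}

		/* Single-byte character: this backslash is a real separator. */
		if (*p == '\\')
			*p = '/';
		p++;
	}
}
```

For example, with the bytes 'C' ':' '\' 0x95 0x5C '\' 'a.sql' as input, the two separator backslashes become slashes while the 0x95 0x5C pair is preserved. How the caller decides that the path is Shift-JIS in the first place is the open question discussed above, and this sketch does not address it.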
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp