Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails
From | Thomas Munro |
---|---|
Subject | Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails |
Date | |
Msg-id | CA+hUKGKKNAc599Vp7kFAnLE1=V=ceYujz_YQoSNrvNFGaJ6i7w@mail.gmail.com |
In reply to | Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails
|
List | pgsql-bugs |
On Thu, Nov 28, 2024 at 5:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> There is nothing
> about our handling of non-ASCII characters in shared system catalogs
> that isn't squishy as heck, and yet there have been darn few field
> complaints over the many years it's been like that. Maybe trying to
> make this truncation issue better in isolation wasn't such a great
> plan.

I guess most people in Unix-land just use UTF-8 in every layer of
their software stack these days, so don't often see confused encodings
anymore? But I don't think that's true in the other place, where they
still routinely juggle multiple encodings and see garbled junk when it
goes wrong[1]. They might still generally prefer UTF-8 for database
encoding though, IDK.

> (If we recorded the encoding of names in shared catalogs then this
> particular issue would be far easier to solve, but then we have
> other problems to address --- particularly, what to do if a name
> in the catalog fails to convert to the encoding we are using.)

Here is a much dumber coarse-grained way I have wondered about for
making the encoding certain, without having to do any new conversions
at all: (1) single-encoding cluster mode, shared catalogues use same
encoding as all databases, (2) multi-encoding cluster mode with
ASCII-only shared catalogues, and (3) legacy squishy/raw mode you
normally only reach by pg_upgrade. Maybe you could switch between
them with an operation that validates names. Then I think you could
always know the shared cat encoding even with no database context, and
when you are connected to a database you could mostly just carry on
assuming it's database encoding (either it is, or it's the ASCII
subset). That can only be wrong in mode 3, all bets off just like
today, but that's your own fault for using mode 3. I guess serious
users of multi-encoding clusters already learn to stick to ASCII-only
role names and database names anyway, unless they like seeing garbage?
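
To make the three modes a bit more concrete, here is a minimal C sketch of
what the "operation that validates names" might look like. This is only an
illustration under stated assumptions: none of these identifiers
(ClusterEncodingMode, shared_name_ok, can_switch_mode,
placeholder_validator) exist in PostgreSQL, and the validator for mode 1 is
a stand-in for whatever real encoding verification routine would be used.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cluster-wide modes, numbered as in the text above. */
typedef enum ClusterEncodingMode
{
    CLUSTER_ENC_SINGLE,      /* (1) shared catalogues use the one encoding */
    CLUSTER_ENC_MULTI_ASCII, /* (2) shared catalogues are ASCII-only */
    CLUSTER_ENC_LEGACY_RAW   /* (3) raw bytes, no guarantee at all */
} ClusterEncodingMode;

/* Hypothetical callback: is this name valid in the cluster's encoding? */
typedef bool (*name_validator) (const char *name);

/* Stand-in validator for illustration only: accepts any non-empty name. */
bool
placeholder_validator(const char *name)
{
    return name != NULL && name[0] != '\0';
}

/* True if every byte of the name is in the 7-bit ASCII range. */
bool
name_is_ascii(const char *name)
{
    for (const unsigned char *p = (const unsigned char *) name; *p; p++)
    {
        if (*p > 0x7F)
            return false;
    }
    return true;
}

/* Would this shared-catalogue name be acceptable in the given mode? */
bool
shared_name_ok(ClusterEncodingMode mode, const char *name,
               name_validator validate)
{
    switch (mode)
    {
        case CLUSTER_ENC_SINGLE:
            return validate(name);
        case CLUSTER_ENC_MULTI_ASCII:
            return name_is_ascii(name);
        case CLUSTER_ENC_LEGACY_RAW:
            return true;        /* all bets off, as today */
    }
    return false;
}

/* A mode switch is allowed only if every existing name passes. */
bool
can_switch_mode(ClusterEncodingMode target, const char *const *names,
                size_t nnames, name_validator validate)
{
    for (size_t i = 0; i < nnames; i++)
    {
        if (!shared_name_ok(target, names[i], validate))
            return false;
    }
    return true;
}
```

The point of the sketch is that the check depends only on the cluster mode
and the name bytes, never on a database context, and that switching into
mode 3 can never fail while switching into mode 2 fails as soon as one
non-ASCII name exists.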
[1] https://www.postgresql.org/message-id/flat/00a601db3b20%24b00261e0%24100725a0%24%40gmx.net