Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails
From | Bruce Momjian |
---|---|
Subject | Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails |
Date | |
Msg-id | Zz9B3KQGXFCGVPXy@momjian.us |
In response to | Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails
|
List | pgsql-bugs |
On Thu, Nov 21, 2024 at 07:27:22AM +0000, Bertrand Drouvot wrote:
> +    /*
> +     * If the original name is too long and we see two consecutive bytes
> +     * with their high bits set at the truncation point, we might have
> +     * truncated in the middle of a multibyte character. In multibyte
> +     * encodings, every byte of a multibyte character has its high bit
> +     * set. So if IS_HIGHBIT_SET is true for both NAMEDATALEN-1 and
> +     * NAMEDATALEN-2, we know we're in the middle of a multibyte
> +     * character. We need to try truncating one more byte back to find the
> +     * start of the next character.
> +     */
...
> +        /*
> +         * If we've hit a byte with high bit clear (an ASCII byte), we
> +         * know we can't be in the middle of a multibyte character,
> +         * because all bytes of a multibyte character must have their
> +         * high bits set. Any following byte must therefore be the
> +         * start of a new character, so we can stop looking for
> +         * earlier truncation points.
> +         */

I don't understand this logic. Why are two bytes important? If we knew
it was UTF8 we could check for non-first bytes always starting with
bits 10, but we can't know that.

--
  Bruce Momjian <bruce@momjian.us> https://momjian.us
  EDB https://enterprisedb.com

  When a patient asks the doctor, "Am I going to die?", he means
  "Am I going to die soon?"
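
For illustration only, the UTF8 check mentioned above (non-first bytes always start with bits 10) could look roughly like the sketch below. It assumes the name is known to be valid UTF8, which is exactly what cannot be assumed at connection time. The helper names `is_utf8_continuation` and `utf8_clip_len` are hypothetical and are not existing PostgreSQL functions or part of the patch under discussion.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if the byte is a UTF-8 continuation byte,
 * i.e. its top two bits are 10 (binary 10xxxxxx).
 */
static int
is_utf8_continuation(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/*
 * Hypothetical helper: return the largest length <= limit at which the
 * string can be cut without splitting a UTF-8 character.
 * Assumption: s is valid UTF-8.
 */
static size_t
utf8_clip_len(const char *s, size_t limit)
{
    size_t      len = strlen(s);

    if (len <= limit)
        return len;             /* nothing to truncate */

    /*
     * s[limit] is the first byte that would be dropped.  If it is a
     * continuation byte, the cut falls inside a character, so back up
     * until the first dropped byte starts a character.
     */
    while (limit > 0 && is_utf8_continuation((unsigned char) s[limit]))
        limit--;

    return limit;
}
```

With the four-byte string "ab" followed by the two-byte UTF8 sequence 0xC3 0xA9, a limit of 3 would fall inside the last character, so the sketch backs up to 2. Encodings other than UTF8 do not mark non-first bytes this way, so the same test does not carry over to them.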