Обсуждение: psql completion for ids in multibyte string

Поиск
Список
Период
Сортировка

psql completion for ids in multibyte string

От
Kyotaro HORIGUCHI
Дата:
Hello. I don't know whether this is a bug fix or improvement,
anyway show you a patch for psql completion.


psql completes identifiers in many places but donesn't for
multibyte identifiers.

=> ALTER TABLE "[tab]
"いろは" "with space"

=> ALTER TABLE "い[tab]
<none>

Currently psql counts the length of the string to complete in
bytes but it is desirable to count in client encoding. The
attached patch does so and the uncompleted completion would
be completed.

=> ALTER TABLE "い[tab]
=> ALTER TABLE "いろは" _

During the investigation into this issue, I found a mistake in
the comment for PQmblen. It give the byte length of the character
at the point, not word. The attached patche also fixes it.

> /*
>  * returns the byte length of the word beginning s, using the
>  * specified encoding.
>  */
> int
> PQmblen(const char *s, int encoding)


What do you think about this?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: psql completion for ids in multibyte string

От
Amit Langote
Дата:
Horiguchi-san,

On 2015/11/05 18:10, Kyotaro HORIGUCHI wrote:
> Hello. I don't know whether this is a bug fix or improvement,

Would it be 50-50? :-)

...

> 
> During the investigation into this issue, I found a mistake in
> the comment for PQmblen. It give the byte length of the character
> at the point, not word. The attached patche also fixes it.
> 
>> /*
>>  * returns the byte length of the word beginning s, using the
>>  * specified encoding.
>>  */
>> int
>> PQmblen(const char *s, int encoding)
> 

In the following change,

- * returns the byte length of the word beginning s, using the
- * specified encoding.
+ * returns the byte length of the character beginning s, using the specified
+ * encoding.


Just a minor nitpick -

... character beginning *at* s ...?

If so, there would be one more instance to fix.

Thanks,
Amit




Re: psql completion for ids in multibyte string

От
Kyotaro HORIGUCHI
Дата:
Hi. Thank you for the comments.

The revised version is attaced.
- A typo is fixed in the comment for PQmblen().- Do the same fix to PQdsplen().



At Thu, 5 Nov 2015 18:32:59 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<563B224B.3020400@lab.ntt.co.jp>
> On 2015/11/05 18:10, Kyotaro HORIGUCHI wrote:
> > Hello. I don't know whether this is a bug fix or improvement,
> 
> Would it be 50-50? :-)

Yeah, honestly saying, I doubt that this is valuable but feel
uneasy to see some of the candidates vanish as completon proceeds
for no persuasive reason.

> In the following change,
> 
> - * returns the byte length of the word beginning s, using the
> - * specified encoding.
> + * returns the byte length of the character beginning s, using the specified
> + * encoding.
> 
> 
> Just a minor nitpick -
> 
> ... character beginning *at* s ...?
> 
> If so, there would be one more instance to fix.

I think so.  I overlooked both of them. And as you mention,
PQdsplen has the same kind of typo. It returns display length of
the *character* beginning *at* s, too.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Fixing wrong comment on PQmblen and PQdsplen.

От
Kyotaro HORIGUCHI
Дата:
Hello,

I divided the last patch into one typo-fix patch and one
improvement patch. This is the former one.

At Fri, 06 Nov 2015 11:47:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20151106.114717.170526268.horiguchi.kyotaro@lab.ntt.co.jp>
> > Just a minor nitpick -
> > 
> > ... character beginning *at* s ...?
> > 
> > If so, there would be one more instance to fix.
> 
> I think so.  I overlooked both of them. And as you mention,
> PQdsplen has the same kind of typo. It returns display length of
> the *character* beginning *at* s, too.

This patch fixes wrong comments of PQmblen() and PQdsplen().  The
comments say that "returns the length of the word beginning s"
but what it actually does is "returns the length of the
*character* beginning at s".

regards,

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: psql completion for ids in multibyte string

От
Kyotaro HORIGUCHI
Дата:
Hello, this is the second patch plitted out. This allows
multibyte names to be completed in psql.

At Fri, 06 Nov 2015 11:47:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20151106.114717.170526268.horiguchi.kyotaro@lab.ntt.co.jp>
> At Thu, 5 Nov 2015 18:32:59 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<563B224B.3020400@lab.ntt.co.jp>
> > On 2015/11/05 18:10, Kyotaro HORIGUCHI wrote:
> > > Hello. I don't know whether this is a bug fix or improvement,
> > 
> > Would it be 50-50? :-)
> 
> Yeah, honestly saying, I doubt that this is valuable but feel
> uneasy to see some of the candidates vanish as completon proceeds
> for no persuasive reason.

The current version of tab-completion failed to complete names
with multibyte characters.

=# create table いろは (あ int);
=# create table いこい (あ int);
=# drop table <tab>
"いろは"             hint_plan.           pg_toast.
"いこい"             information_schema.  pg_toast_temp_1.
ALL IN TABLESPACE    pg_catalog.          public.
dbms_stats.          pg_temp_1.           
postgres=# alter table "い
=# drop table "い<tab>
=# drop table "い           /* No candidate offered */

This is because _complet_from_query counts the string length in
bytes, instead of characters. With this patch the candidates will
appear.

=# drop table "い<tab>
"いこい"  "いろは"  
postgres=# drpo table "い

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: psql completion for ids in multibyte string

От
Thomas Munro
Дата:
On Fri, Feb 26, 2016 at 6:34 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello, this is the second patch plitted out. This allows
> multibyte names to be completed in psql.
>
> At Fri, 06 Nov 2015 11:47:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote
in<20151106.114717.170526268.horiguchi.kyotaro@lab.ntt.co.jp> 
>> At Thu, 5 Nov 2015 18:32:59 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<563B224B.3020400@lab.ntt.co.jp>
>> > On 2015/11/05 18:10, Kyotaro HORIGUCHI wrote:
>> > > Hello. I don't know whether this is a bug fix or improvement,
>> >
>> > Would it be 50-50? :-)
>>
>> Yeah, honestly saying, I doubt that this is valuable but feel
>> uneasy to see some of the candidates vanish as completon proceeds
>> for no persuasive reason.

+1 from me, it's entirely reasonable to want to name database objects
in any human language and use auto-completion.  It's not working today
as you showed.

> The current version of tab-completion failed to complete names
> with multibyte characters.
>
> =# create table いろは (あ int);
> =# create table いこい (あ int);
> =# drop table <tab>
> "いろは"             hint_plan.           pg_toast.
> "いこい"             information_schema.  pg_toast_temp_1.
> ALL IN TABLESPACE    pg_catalog.          public.
> dbms_stats.          pg_temp_1.
> postgres=# alter table "い
> =# drop table "い<tab>
> =# drop table "い           /* No candidate offered */
>
> This is because _complet_from_query counts the string length in
> bytes, instead of characters. With this patch the candidates will
> appear.
>
> =# drop table "い<tab>
> "いこい"  "いろは"
> postgres=# drpo table "い

The patch looks correct to me: it counts characters rather than bytes,
which is the right thing to do because the value is passed to substr()
which also works in characters rather than bytes.  I tested with
"éclair", and without the patch, tab completion doesn't work if you
press tab after 'é'.  With the patch it does.

--
Thomas Munro
http://www.enterprisedb.com



Re: Fixing wrong comment on PQmblen and PQdsplen.

От
Robert Haas
Дата:
On Fri, Feb 26, 2016 at 12:33 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> I divided the last patch into one typo-fix patch and one
> improvement patch. This is the former one.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: psql completion for ids in multibyte string

От
Robert Haas
Дата:
On Mon, Feb 29, 2016 at 6:32 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Fri, Feb 26, 2016 at 6:34 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> Hello, this is the second patch plitted out. This allows
>> multibyte names to be completed in psql.
>>
>> At Fri, 06 Nov 2015 11:47:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote
in<20151106.114717.170526268.horiguchi.kyotaro@lab.ntt.co.jp> 
>>> At Thu, 5 Nov 2015 18:32:59 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<563B224B.3020400@lab.ntt.co.jp>
>>> > On 2015/11/05 18:10, Kyotaro HORIGUCHI wrote:
>>> > > Hello. I don't know whether this is a bug fix or improvement,
>>> >
>>> > Would it be 50-50? :-)
>>>
>>> Yeah, honestly saying, I doubt that this is valuable but feel
>>> uneasy to see some of the candidates vanish as completon proceeds
>>> for no persuasive reason.
>
> +1 from me, it's entirely reasonable to want to name database objects
> in any human language and use auto-completion.  It's not working today
> as you showed.
>
>> The current version of tab-completion failed to complete names
>> with multibyte characters.
>>
>> =# create table いろは (あ int);
>> =# create table いこい (あ int);
>> =# drop table <tab>
>> "いろは"             hint_plan.           pg_toast.
>> "いこい"             information_schema.  pg_toast_temp_1.
>> ALL IN TABLESPACE    pg_catalog.          public.
>> dbms_stats.          pg_temp_1.
>> postgres=# alter table "い
>> =# drop table "い<tab>
>> =# drop table "い           /* No candidate offered */
>>
>> This is because _complet_from_query counts the string length in
>> bytes, instead of characters. With this patch the candidates will
>> appear.
>>
>> =# drop table "い<tab>
>> "いこい"  "いろは"
>> postgres=# drpo table "い
>
> The patch looks correct to me: it counts characters rather than bytes,
> which is the right thing to do because the value is passed to substr()
> which also works in characters rather than bytes.  I tested with
> "éclair", and without the patch, tab completion doesn't work if you
> press tab after 'é'.  With the patch it does.

OK, but I am a little concerned about this code:
   /* Find something that matches */   if (result && PQresultStatus(result) == PGRES_TUPLES_OK)   {       const char
*item;
       while (list_index < PQntuples(result) &&              (item = PQgetvalue(result, list_index++, 0)))           if
(pg_strncasecmp(text,item, string_length) == 0)               return pg_strdup(item);   } 

Every other use of string_length in this function is using it as the
argument to the SQL substring function, which cares about characters,
not bytes.  But this use seems to care about bytes, not characters.

Am I wrong?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: psql completion for ids in multibyte string

От
Thomas Munro
Дата:
On Wed, Mar 2, 2016 at 7:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 29, 2016 at 6:32 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Fri, Feb 26, 2016 at 6:34 PM, Kyotaro HORIGUCHI
>> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>>> Hello, this is the second patch plitted out. This allows
>>> multibyte names to be completed in psql.
>>>
>>> At Fri, 06 Nov 2015 11:47:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote
in<20151106.114717.170526268.horiguchi.kyotaro@lab.ntt.co.jp> 
>>>> At Thu, 5 Nov 2015 18:32:59 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<563B224B.3020400@lab.ntt.co.jp>
>>>> > On 2015/11/05 18:10, Kyotaro HORIGUCHI wrote:
>>>> > > Hello. I don't know whether this is a bug fix or improvement,
>>>> >
>>>> > Would it be 50-50? :-)
>>>>
>>>> Yeah, honestly saying, I doubt that this is valuable but feel
>>>> uneasy to see some of the candidates vanish as completon proceeds
>>>> for no persuasive reason.
>>
>> +1 from me, it's entirely reasonable to want to name database objects
>> in any human language and use auto-completion.  It's not working today
>> as you showed.
>>
>>> The current version of tab-completion failed to complete names
>>> with multibyte characters.
>>>
>>> =# create table いろは (あ int);
>>> =# create table いこい (あ int);
>>> =# drop table <tab>
>>> "いろは"             hint_plan.           pg_toast.
>>> "いこい"             information_schema.  pg_toast_temp_1.
>>> ALL IN TABLESPACE    pg_catalog.          public.
>>> dbms_stats.          pg_temp_1.
>>> postgres=# alter table "い
>>> =# drop table "い<tab>
>>> =# drop table "い           /* No candidate offered */
>>>
>>> This is because _complet_from_query counts the string length in
>>> bytes, instead of characters. With this patch the candidates will
>>> appear.
>>>
>>> =# drop table "い<tab>
>>> "いこい"  "いろは"
>>> postgres=# drpo table "い
>>
>> The patch looks correct to me: it counts characters rather than bytes,
>> which is the right thing to do because the value is passed to substr()
>> which also works in characters rather than bytes.  I tested with
>> "éclair", and without the patch, tab completion doesn't work if you
>> press tab after 'é'.  With the patch it does.
>
> OK, but I am a little concerned about this code:
>
>     /* Find something that matches */
>     if (result && PQresultStatus(result) == PGRES_TUPLES_OK)
>     {
>         const char *item;
>
>         while (list_index < PQntuples(result) &&
>                (item = PQgetvalue(result, list_index++, 0)))
>             if (pg_strncasecmp(text, item, string_length) == 0)
>                 return pg_strdup(item);
>     }
>
> Every other use of string_length in this function is using it as the
> argument to the SQL substring function, which cares about characters,
> not bytes.  But this use seems to care about bytes, not characters.
>
> Am I wrong?

Ugh, and the other problem is that string_length is always 0 there if
state isn't 0 (in master it is static so that the value is reused for
subsequent calls, but this patch made it automatic).

I think we should leave string_length as it is and use a new variable
for character-based length, as in the attached.

--
Thomas Munro
http://www.enterprisedb.com

Вложения

Re: Fixing wrong comment on PQmblen and PQdsplen.

От
Kyotaro HORIGUCHI
Дата:
Hello,

At Tue, 1 Mar 2016 13:33:02 -0500, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoYAQMFk3ir07HRoPczJdAXdWApNT173SRNSJgFBK2jnSA@mail.gmail.com>
> On Fri, Feb 26, 2016 at 12:33 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > I divided the last patch into one typo-fix patch and one
> > improvement patch. This is the former one.
> 
> Committed.

Thank you.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: psql completion for ids in multibyte string

От
Kyotaro HORIGUCHI
Дата:
Hello, thank you for the comments.

At Wed, 2 Mar 2016 10:10:55 +1300, Thomas Munro <thomas.munro@enterprisedb.com> wrote in
<CAEepm=2JjPY-v1JWWrJyBone-=t1A7TJRyKSfaseQLzh5sMQGg@mail.gmail.com>
> On Wed, Mar 2, 2016 at 7:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Mon, Feb 29, 2016 at 6:32 PM, Thomas Munro
> > <thomas.munro@enterprisedb.com> wrote:
> >> On Fri, Feb 26, 2016 at 6:34 PM, Kyotaro HORIGUCHI
> >> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> >>> Hello, this is the second patch plitted out. This allows
> >>> multibyte names to be completed in psql.
> >>>
> >>> At Fri, 06 Nov 2015 11:47:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
wrotein <20151106.114717.170526268.horiguchi.kyotaro@lab.ntt.co.jp>
 
> >>>> At Thu, 5 Nov 2015 18:32:59 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<563B224B.3020400@lab.ntt.co.jp>
> >>>> > On 2015/11/05 18:10, Kyotaro HORIGUCHI wrote:
> >>>> > > Hello. I don't know whether this is a bug fix or improvement,
> >>>> >
> >>>> > Would it be 50-50? :-)
> >>>>
> >>>> Yeah, honestly saying, I doubt that this is valuable but feel
> >>>> uneasy to see some of the candidates vanish as completon proceeds
> >>>> for no persuasive reason.
> >>
> >> +1 from me, it's entirely reasonable to want to name database objects
> >> in any human language and use auto-completion.  It's not working today
> >> as you showed.
> >>
> >>> The current version of tab-completion failed to complete names
> >>> with multibyte characters.
> >>>
> >>> =# create table いろは (あ int);
> >>> =# create table いこい (あ int);
> >>> =# drop table <tab>
> >>> "いろは"             hint_plan.           pg_toast.
> >>> "いこい"             information_schema.  pg_toast_temp_1.
> >>> ALL IN TABLESPACE    pg_catalog.          public.
> >>> dbms_stats.          pg_temp_1.
> >>> postgres=# alter table "い
> >>> =# drop table "い<tab>
> >>> =# drop table "い           /* No candidate offered */
> >>>
> >>> This is because _complet_from_query counts the string length in
> >>> bytes, instead of characters. With this patch the candidates will
> >>> appear.
> >>>
> >>> =# drop table "い<tab>
> >>> "いこい"  "いろは"
> >>> postgres=# drpo table "い
> >>
> >> The patch looks correct to me: it counts characters rather than bytes,
> >> which is the right thing to do because the value is passed to substr()
> >> which also works in characters rather than bytes.  I tested with
> >> "éclair", and without the patch, tab completion doesn't work if you
> >> press tab after 'é'.  With the patch it does.
> >
> > OK, but I am a little concerned about this code:
> >
> >     /* Find something that matches */
> >     if (result && PQresultStatus(result) == PGRES_TUPLES_OK)
> >     {
> >         const char *item;
> >
> >         while (list_index < PQntuples(result) &&
> >                (item = PQgetvalue(result, list_index++, 0)))
> >             if (pg_strncasecmp(text, item, string_length) == 0)
> >                 return pg_strdup(item);
> >     }
> >
> > Every other use of string_length in this function is using it as the
> > argument to the SQL substring function, which cares about characters,
> > not bytes.  But this use seems to care about bytes, not characters.
> >
> > Am I wrong?
> 
> Ugh, and the other problem is that string_length is always 0 there if
> state isn't 0 (in master it is static so that the value is reused for
> subsequent calls, but this patch made it automatic).

Thanks for pointing it out.

> I think we should leave string_length as it is and use a new variable
> for character-based length, as in the attached.

Basically agreed but I like byte_length for the previous
string_length and string_length for string_length_cars. Also
text_length is renamed in the attached patch.

What do you think about this?

# I named it as version 3 at my own decision.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5f27120..ed92912 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3198,9 +3198,8 @@ static char *_complete_from_query(int is_schema_query, const char *text, int state){    static
int   list_index,
 
-                string_length;
+                byte_length;    static PGresult *result = NULL;
-    /*     * If this is the first time for this completion, we fetch a list of our     * "things" from the backend.
@@ -3211,9 +3210,18 @@ _complete_from_query(int is_schema_query, const char *text, int state)        char
*e_text;       char       *e_info_charp;        char       *e_info_charp2;
 
+        const char *pstr = text;
+        int            string_length = 0;        list_index = 0;
-        string_length = strlen(text);
+        byte_length = strlen(text);
+
+        /* Count length as number of characters (not bytes), for passing to substring */
+        while (*pstr)
+        {
+            string_length++;
+            pstr += PQmblen(pstr, pset.encoding);
+        }        /* Free any prior result */        PQclear(result);
@@ -3353,7 +3361,7 @@ _complete_from_query(int is_schema_query, const char *text, int state)        while (list_index <
PQntuples(result)&&               (item = PQgetvalue(result, list_index++, 0)))
 
-            if (pg_strncasecmp(text, item, string_length) == 0)
+            if (pg_strncasecmp(text, item, byte_length) == 0)                return pg_strdup(item);    }
@@ -3372,7 +3380,7 @@ _complete_from_query(int is_schema_query, const char *text, int state)static char
*complete_from_list(constchar *text, int state){
 
-    static int    string_length,
+    static int    byte_length,                list_index,                matches;    static bool casesensitive;
@@ -3385,7 +3393,7 @@ complete_from_list(const char *text, int state)    if (state == 0)    {        list_index = 0;
-        string_length = strlen(text);
+        byte_length = strlen(text);        casesensitive = completion_case_sensitive;        matches = 0;    }
@@ -3393,14 +3401,14 @@ complete_from_list(const char *text, int state)    while ((item =
completion_charpp[list_index++]))   {        /* First pass is case sensitive */
 
-        if (casesensitive && strncmp(text, item, string_length) == 0)
+        if (casesensitive && strncmp(text, item, byte_length) == 0)        {            matches++;            return
pg_strdup(item);       }        /* Second pass is case insensitive, don't bother counting matches */
 
-        if (!casesensitive && pg_strncasecmp(text, item, string_length) == 0)
+        if (!casesensitive && pg_strncasecmp(text, item, byte_length) == 0)        {            if
(completion_case_sensitive)               return pg_strdup(item);
 
@@ -3627,13 +3635,13 @@ pg_strdup_keyword_case(const char *s, const char *ref)static char *escape_string(const char
*text){
-    size_t        text_length;
+    size_t        byte_length;    char       *result;
-    text_length = strlen(text);
+    byte_length = strlen(text);
-    result = pg_malloc(text_length * 2 + 1);
-    PQescapeStringConn(pset.db, result, text, text_length, NULL);
+    result = pg_malloc(byte_length * 2 + 1);
+    PQescapeStringConn(pset.db, result, text, byte_length, NULL);    return result;}

Re: psql completion for ids in multibyte string

От
Thomas Munro
Дата:
On Thu, Mar 3, 2016 at 2:07 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello, thank you for the comments.
>
> At Wed, 2 Mar 2016 10:10:55 +1300, Thomas Munro <thomas.munro@enterprisedb.com> wrote in
<CAEepm=2JjPY-v1JWWrJyBone-=t1A7TJRyKSfaseQLzh5sMQGg@mail.gmail.com>
>> On Wed, Mar 2, 2016 at 7:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> > On Mon, Feb 29, 2016 at 6:32 PM, Thomas Munro
>> > <thomas.munro@enterprisedb.com> wrote:
>> >> On Fri, Feb 26, 2016 at 6:34 PM, Kyotaro HORIGUCHI
>> >> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> >>> Hello, this is the second patch plitted out. This allows
>> >>> multibyte names to be completed in psql.
>> >>>
>> >>> At Fri, 06 Nov 2015 11:47:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
wrotein <20151106.114717.170526268.horiguchi.kyotaro@lab.ntt.co.jp> 
>> >>>> At Thu, 5 Nov 2015 18:32:59 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<563B224B.3020400@lab.ntt.co.jp>
>> >>>> > On 2015/11/05 18:10, Kyotaro HORIGUCHI wrote:
>> >>>> > > Hello. I don't know whether this is a bug fix or improvement,
>> >>>> >
>> >>>> > Would it be 50-50? :-)
>> >>>>
>> >>>> Yeah, honestly saying, I doubt that this is valuable but feel
>> >>>> uneasy to see some of the candidates vanish as completon proceeds
>> >>>> for no persuasive reason.
>> >>
>> >> +1 from me, it's entirely reasonable to want to name database objects
>> >> in any human language and use auto-completion.  It's not working today
>> >> as you showed.
>> >>
>> >>> The current version of tab-completion failed to complete names
>> >>> with multibyte characters.
>> >>>
>> >>> =# create table いろは (あ int);
>> >>> =# create table いこい (あ int);
>> >>> =# drop table <tab>
>> >>> "いろは"             hint_plan.           pg_toast.
>> >>> "いこい"             information_schema.  pg_toast_temp_1.
>> >>> ALL IN TABLESPACE    pg_catalog.          public.
>> >>> dbms_stats.          pg_temp_1.
>> >>> postgres=# alter table "い
>> >>> =# drop table "い<tab>
>> >>> =# drop table "い           /* No candidate offered */
>> >>>
>> >>> This is because _complet_from_query counts the string length in
>> >>> bytes, instead of characters. With this patch the candidates will
>> >>> appear.
>> >>>
>> >>> =# drop table "い<tab>
>> >>> "いこい"  "いろは"
>> >>> postgres=# drpo table "い
>> >>
>> >> The patch looks correct to me: it counts characters rather than bytes,
>> >> which is the right thing to do because the value is passed to substr()
>> >> which also works in characters rather than bytes.  I tested with
>> >> "éclair", and without the patch, tab completion doesn't work if you
>> >> press tab after 'é'.  With the patch it does.
>> >
>> > OK, but I am a little concerned about this code:
>> >
>> >     /* Find something that matches */
>> >     if (result && PQresultStatus(result) == PGRES_TUPLES_OK)
>> >     {
>> >         const char *item;
>> >
>> >         while (list_index < PQntuples(result) &&
>> >                (item = PQgetvalue(result, list_index++, 0)))
>> >             if (pg_strncasecmp(text, item, string_length) == 0)
>> >                 return pg_strdup(item);
>> >     }
>> >
>> > Every other use of string_length in this function is using it as the
>> > argument to the SQL substring function, which cares about characters,
>> > not bytes.  But this use seems to care about bytes, not characters.
>> >
>> > Am I wrong?
>>
>> Ugh, and the other problem is that string_length is always 0 there if
>> state isn't 0 (in master it is static so that the value is reused for
>> subsequent calls, but this patch made it automatic).
>
> Thanks for pointing it out.
>
>> I think we should leave string_length as it is and use a new variable
>> for character-based length, as in the attached.
>
> Basically agreed but I like byte_length for the previous
> string_length and string_length for string_length_cars. Also
> text_length is renamed in the attached patch.
>
> What do you think about this?

It seems fine this way too.  Maybe you should also rename
string_length to byte_length in create_or_drop_command_generator, just
for consistency.

One peculiarity I noticed when testing this on MacOSX 10.10.2 with
Apple-supplied libedit is that the following sequence does something
strange:

CREATE TABLE いこい ();
CREATE TABLE いこいこ ();
CREATE TABLE いろは ();
SELECT *  FROM "い<tab>
-> the line magically changes to:
SELECT *  FROM <cursor>

If you press tab at any other point in any of those Japanese strings,
it works correctly.  I couldn't make anything similar happen with
Latin multibyte characters, and it doesn't happen on Debian with
libreadline, or on MacOSX with libreadline (installed from MacPorts).
I suspect this is a problem with Apple libedit and not with psql or
your patch, but I didn't have time to investigate further.

--
Thomas Munro
http://www.enterprisedb.com



Re: psql completion for ids in multibyte string

От
Robert Haas
Дата:
On Wed, Mar 2, 2016 at 8:07 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello, thank you for the comments.
>> I think we should leave string_length as it is and use a new variable
>> for character-based length, as in the attached.
>
> Basically agreed but I like byte_length for the previous
> string_length and string_length for string_length_cars. Also
> text_length is renamed in the attached patch.

I committed this and back-patched this but (1) I avoided changing the
other functions for now and (2) I gave both the byte length and the
character length new names to avoid confusion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: psql completion for ids in multibyte string

От
Alvaro Herrera
Дата:
Robert Haas wrote:
> On Wed, Mar 2, 2016 at 8:07 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > Hello, thank you for the comments.
> >> I think we should leave string_length as it is and use a new variable
> >> for character-based length, as in the attached.
> >
> > Basically agreed but I like byte_length for the previous
> > string_length and string_length for string_length_cars. Also
> > text_length is renamed in the attached patch.
> 
> I committed this and back-patched this but (1) I avoided changing the
> other functions for now and (2) I gave both the byte length and the
> character length new names to avoid confusion.

These tweaks appear to have been universally disliked by buildfarm
members.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: psql completion for ids in multibyte string

От
Robert Haas
Дата:
On Fri, Mar 4, 2016 at 12:02 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> On Wed, Mar 2, 2016 at 8:07 PM, Kyotaro HORIGUCHI
>> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> > Hello, thank you for the comments.
>> >> I think we should leave string_length as it is and use a new variable
>> >> for character-based length, as in the attached.
>> >
>> > Basically agreed but I like byte_length for the previous
>> > string_length and string_length for string_length_cars. Also
>> > text_length is renamed in the attached patch.
>>
>> I committed this and back-patched this but (1) I avoided changing the
>> other functions for now and (2) I gave both the byte length and the
>> character length new names to avoid confusion.
>
> These tweaks appear to have been universally disliked by buildfarm
> members.

Crap.  Wasn't careful enough, sorry.  Will fix shortly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: psql completion for ids in multibyte string

От
Robert Haas
Дата:
On Fri, Mar 4, 2016 at 12:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Mar 4, 2016 at 12:02 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> Robert Haas wrote:
>>> On Wed, Mar 2, 2016 at 8:07 PM, Kyotaro HORIGUCHI
>>> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>>> > Hello, thank you for the comments.
>>> >> I think we should leave string_length as it is and use a new variable
>>> >> for character-based length, as in the attached.
>>> >
>>> > Basically agreed but I like byte_length for the previous
>>> > string_length and string_length for string_length_cars. Also
>>> > text_length is renamed in the attached patch.
>>>
>>> I committed this and back-patched this but (1) I avoided changing the
>>> other functions for now and (2) I gave both the byte length and the
>>> character length new names to avoid confusion.
>>
>> These tweaks appear to have been universally disliked by buildfarm
>> members.
>
> Crap.  Wasn't careful enough, sorry.  Will fix shortly.

Fix pushed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: psql completion for ids in multibyte string

От
Kyotaro HORIGUCHI
Дата:
At Fri, 4 Mar 2016 12:30:17 -0500, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoafGr2J90RgiF_iS8Kj_7bb5xaXnPwYcT2578cYDDUjjw@mail.gmail.com>
> >>> I committed this and back-patched this but (1) I avoided changing the
> >>> other functions for now and (2) I gave both the byte length and the
> >>> character length new names to avoid confusion.
> >>
> >> These tweaks appear to have been universally disliked by buildfarm
> >> members.
> >
> > Crap.  Wasn't careful enough, sorry.  Will fix shortly.
> 
> Fix pushed.

Thank you for committing this. I can see only one commit for this
in the repository but I believe it is the fixed one.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: psql completion for ids in multibyte string

От
Robert Haas
Дата:
On Sun, Mar 6, 2016 at 11:24 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> At Fri, 4 Mar 2016 12:30:17 -0500, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoafGr2J90RgiF_iS8Kj_7bb5xaXnPwYcT2578cYDDUjjw@mail.gmail.com>
>> >>> I committed this and back-patched this but (1) I avoided changing the
>> >>> other functions for now and (2) I gave both the byte length and the
>> >>> character length new names to avoid confusion.
>> >>
>> >> These tweaks appear to have been universally disliked by buildfarm
>> >> members.
>> >
>> > Crap.  Wasn't careful enough, sorry.  Will fix shortly.
>>
>> Fix pushed.
>
> Thank you for committing this. I can see only one commit for this
> in the repository but I believe it is the fixed one.

It was OK in master, but the back-branches had problems.  See
369c0b09080812943a2efcebe91cf4b271bc4f86.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: psql completion for ids in multibyte string

От
Kyotaro HORIGUCHI
Дата:
Hello,

At Mon, 7 Mar 2016 12:34:15 -0500, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoaQsvJnVhe05QXJM5PAniVAB3iRAEL-6DHTxUhX92GqPA@mail.gmail.com>
> >> Fix pushed.
> >
> > Thank you for committing this. I can see only one commit for this
> > in the repository but I believe it is the fixed one.
> 
> It was OK in master, but the back-branches had problems.  See
> 369c0b09080812943a2efcebe91cf4b271bc4f86.

I understand that. Thank you for replying.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center