Обсуждение: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

Поиск
Список
Период
Сортировка

BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
"Vitali Stupin"
Дата:
The following bug has been logged online:

Bug reference:      2683
Logged by:          Vitali Stupin
Email address:      Vitali.Stupin@ria.ee
PostgreSQL version: 8.1.4
Operating system:   sparc-sun-solaris2.10
Description:        spi_exec_query in plperl returns column names which are
not marked as UTF8
Details:

If database uses UTF8 encoding, then spi_exec_query in plperl should return
query results in UTF8 encoding. But unfortunately only data is marked as
UTF8, while column names are not.

The following test function demonstrates this bug:
CREATE OR REPLACE FUNCTION spi_test("varchar")
  RETURNS varchar AS
$BODY$
my $result = spi_exec_query($_[0]);
my @row_keys = keys %{$result->{rows}};
elog(WARNING, 'Column name: "' . $row_keys[0] . '" is UTF8: ' .
(utf8::is_utf8($row_keys[0]) ? 'y' : 'n'));
my $test_result = join(',', @row_keys) . '|';
for $row_key (@row_keys){
    $value = $result->{rows}[0]->{$row_key};
    elog(WARNING, 'Value: "' . $value . '" is UTF8: ' . (utf8::is_utf8($value)
? 'y' : 'n'));
    $test_result .= $value . '|';
}
elog(WARNING, 'Result: "' . $test_result . '" is UTF8: ' .
(utf8::is_utf8($test_result) ? 'y' : 'n'));
return $test_result;
$BODY$
  LANGUAGE 'plperlu' VOLATILE;

When it is called as:
select spi_test('select ''val_äü'' AS "col_äü"')

The following output is produced:
"col_äü|val_äü|"

And the generated warnings are:
WARNING:  Column name: "col_äü" is UTF8: n
WARNING:  Value: "val_äü" is UTF8: y
WARNING:  Result: "col_äü|val_äü|" is UTF8: y


Therefore it is possible to make a conclusion, that after execution of
query, column names contain a valid UTF8 string without UTF8 flag. When two
strings, one of which is marked as UTF8 and the other marked as non UTF8,
are concatenated, Perl automatically tries to convert column name into UTF8
as if it was in ISO-8859-1 encoding. As a result new string contains invalid
column name.

This bug can also be reproduced if spi_exec_query makes SELECT from the
actual table, when column names contain UTF8 characters.

This bug was also reproduced on:
PostgreSQL 8.1.3 on sparc-sun-solaris2.10;
PostgreSQL 8.1.3 on i686-pc-linux-gnu.

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
Tom Lane
Дата:
"Vitali Stupin" <Vitali.Stupin@ria.ee> writes:
> If database uses UTF8 encoding, then spi_exec_query in plperl should return
> query results in UTF8 encoding. But unfortunately only data is marked as
> UTF8, while column names are not.

It looks to me like basically everywhere in plperl.c that does newSVpv()
should follow it with

#if PERL_BCDVERSION >= 0x5006000L           if (GetDatabaseEncoding() == PG_UTF8)               SvUTF8_on(sv);
#endif

whereas currently there are only a couple of places that do that.

I'm tempted to consolidate this into a function on the order of
newSVstring(const char *) or some such.  Comments?
        regards, tom lane


Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
"Greg Sabino Mullane"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Tom Lane wrote:
> #if PERL_BCDVERSION >= 0x5006000L
...
> #endif
...
> I'm tempted to consolidate this into a function on the order of
> newSVstring(const char *) or some such.  Comments?

+1

I suggested at one point raising the minimum requirement for Perl to
5.6 (which came out way back in 2000, so we're unlikely to upset anyone).
If we haven't done that already, this would be a good chance as we
can get rid of that ugly #if block.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200610151328
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFFMnA9vJuQZxSWSsgRAoS8AKDxCVgCLggaDH+d1BbcEROZORqhEwCg6qe+
wrVsJMi+EKGvnmVGK4MroaM=
=Oi3J
-----END PGP SIGNATURE-----

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
Tom Lane
Дата:
I wrote:
> It looks to me like basically everywhere in plperl.c that does newSVpv()
> should follow it with
>
> #if PERL_BCDVERSION >= 0x5006000L
>             if (GetDatabaseEncoding() == PG_UTF8)
>                 SvUTF8_on(sv);
> #endif

Experimentation proved that this was insufficient to fix Vitali's
problem --- the string he's unhappy about is actually a hash key entry,
and there's no documented way to mark the second argument of hv_store()
as being a UTF-8 string.  Some digging in the Perl source code found
that since at least Perl 5.8.0, hv_fetch and hv_store recognize a
negative key length as meaning a UTF-8 key (ick!!), so I used that hack.
I am not sure there is any reasonable fix available in Perl 5.6.x.

Attached patch applied to HEAD, but I'm not going to risk back-patching
it without some field testing.

            regards, tom lane

*** src/pl/plperl/plperl.c.orig    Tue Oct  3 23:17:16 2006
--- src/pl/plperl/plperl.c    Sun Oct 15 14:47:27 2006
***************
*** 114,119 ****
--- 114,122 ----
  static SV  *plperl_hash_from_tuple(HeapTuple tuple, TupleDesc tupdesc);
  static void plperl_init_shared_libs(pTHX);
  static HV  *plperl_spi_execute_fetch_result(SPITupleTable *, int, int);
+ static SV  *newSVstring(const char *str);
+ static SV **hv_store_string(HV *hv, const char *key, SV *val);
+ static SV **hv_fetch_string(HV *hv, const char *key);

  /*
   * This routine is a crock, and so is everyplace that calls it.  The problem
***************
*** 471,531 ****
                                                  )
          );

!     hv_store(hv, "name", 4, newSVpv(tdata->tg_trigger->tgname, 0), 0);
!     hv_store(hv, "relid", 5, newSVpv(relid, 0), 0);

      if (TRIGGER_FIRED_BY_INSERT(tdata->tg_event))
      {
          event = "INSERT";
          if (TRIGGER_FIRED_FOR_ROW(tdata->tg_event))
!             hv_store(hv, "new", 3,
!                      plperl_hash_from_tuple(tdata->tg_trigtuple, tupdesc),
!                      0);
      }
      else if (TRIGGER_FIRED_BY_DELETE(tdata->tg_event))
      {
          event = "DELETE";
          if (TRIGGER_FIRED_FOR_ROW(tdata->tg_event))
!             hv_store(hv, "old", 3,
!                      plperl_hash_from_tuple(tdata->tg_trigtuple, tupdesc),
!                      0);
      }
      else if (TRIGGER_FIRED_BY_UPDATE(tdata->tg_event))
      {
          event = "UPDATE";
          if (TRIGGER_FIRED_FOR_ROW(tdata->tg_event))
          {
!             hv_store(hv, "old", 3,
!                      plperl_hash_from_tuple(tdata->tg_trigtuple, tupdesc),
!                      0);
!             hv_store(hv, "new", 3,
!                      plperl_hash_from_tuple(tdata->tg_newtuple, tupdesc),
!                      0);
          }
      }
      else
          event = "UNKNOWN";

!     hv_store(hv, "event", 5, newSVpv(event, 0), 0);
!     hv_store(hv, "argc", 4, newSViv(tdata->tg_trigger->tgnargs), 0);

      if (tdata->tg_trigger->tgnargs > 0)
      {
          AV           *av = newAV();

          for (i = 0; i < tdata->tg_trigger->tgnargs; i++)
!             av_push(av, newSVpv(tdata->tg_trigger->tgargs[i], 0));
!         hv_store(hv, "args", 4, newRV_noinc((SV *) av), 0);
      }

!     hv_store(hv, "relname", 7,
!              newSVpv(SPI_getrelname(tdata->tg_relation), 0), 0);

!     hv_store(hv, "table_name", 10,
!              newSVpv(SPI_getrelname(tdata->tg_relation), 0), 0);

!     hv_store(hv, "table_schema", 12,
!              newSVpv(SPI_getnspname(tdata->tg_relation), 0), 0);

      if (TRIGGER_FIRED_BEFORE(tdata->tg_event))
          when = "BEFORE";
--- 474,534 ----
                                                  )
          );

!     hv_store_string(hv, "name", newSVstring(tdata->tg_trigger->tgname));
!     hv_store_string(hv, "relid", newSVstring(relid));

      if (TRIGGER_FIRED_BY_INSERT(tdata->tg_event))
      {
          event = "INSERT";
          if (TRIGGER_FIRED_FOR_ROW(tdata->tg_event))
!             hv_store_string(hv, "new",
!                             plperl_hash_from_tuple(tdata->tg_trigtuple,
!                                                    tupdesc));
      }
      else if (TRIGGER_FIRED_BY_DELETE(tdata->tg_event))
      {
          event = "DELETE";
          if (TRIGGER_FIRED_FOR_ROW(tdata->tg_event))
!             hv_store_string(hv, "old",
!                             plperl_hash_from_tuple(tdata->tg_trigtuple,
!                                                    tupdesc));
      }
      else if (TRIGGER_FIRED_BY_UPDATE(tdata->tg_event))
      {
          event = "UPDATE";
          if (TRIGGER_FIRED_FOR_ROW(tdata->tg_event))
          {
!             hv_store_string(hv, "old",
!                             plperl_hash_from_tuple(tdata->tg_trigtuple,
!                                                    tupdesc));
!             hv_store_string(hv, "new",
!                             plperl_hash_from_tuple(tdata->tg_newtuple,
!                                                    tupdesc));
          }
      }
      else
          event = "UNKNOWN";

!     hv_store_string(hv, "event", newSVstring(event));
!     hv_store_string(hv, "argc", newSViv(tdata->tg_trigger->tgnargs));

      if (tdata->tg_trigger->tgnargs > 0)
      {
          AV           *av = newAV();

          for (i = 0; i < tdata->tg_trigger->tgnargs; i++)
!             av_push(av, newSVstring(tdata->tg_trigger->tgargs[i]));
!         hv_store_string(hv, "args", newRV_noinc((SV *) av));
      }

!     hv_store_string(hv, "relname",
!                     newSVstring(SPI_getrelname(tdata->tg_relation)));

!     hv_store_string(hv, "table_name",
!                     newSVstring(SPI_getrelname(tdata->tg_relation)));

!     hv_store_string(hv, "table_schema",
!                     newSVstring(SPI_getnspname(tdata->tg_relation)));

      if (TRIGGER_FIRED_BEFORE(tdata->tg_event))
          when = "BEFORE";
***************
*** 533,539 ****
          when = "AFTER";
      else
          when = "UNKNOWN";
!     hv_store(hv, "when", 4, newSVpv(when, 0), 0);

      if (TRIGGER_FIRED_FOR_ROW(tdata->tg_event))
          level = "ROW";
--- 536,542 ----
          when = "AFTER";
      else
          when = "UNKNOWN";
!     hv_store_string(hv, "when", newSVstring(when));

      if (TRIGGER_FIRED_FOR_ROW(tdata->tg_event))
          level = "ROW";
***************
*** 541,547 ****
          level = "STATEMENT";
      else
          level = "UNKNOWN";
!     hv_store(hv, "level", 5, newSVpv(level, 0), 0);

      return newRV_noinc((SV *) hv);
  }
--- 544,550 ----
          level = "STATEMENT";
      else
          level = "UNKNOWN";
!     hv_store_string(hv, "level", newSVstring(level));

      return newRV_noinc((SV *) hv);
  }
***************
*** 567,573 ****

      tupdesc = tdata->tg_relation->rd_att;

!     svp = hv_fetch(hvTD, "new", 3, FALSE);
      if (!svp)
          ereport(ERROR,
                  (errcode(ERRCODE_UNDEFINED_COLUMN),
--- 570,576 ----

      tupdesc = tdata->tg_relation->rd_att;

!     svp = hv_fetch_string(hvTD, "new");
      if (!svp)
          ereport(ERROR,
                  (errcode(ERRCODE_UNDEFINED_COLUMN),
***************
*** 741,749 ****
  }


! /* Uses mksafefunc/mkunsafefunc to create an anonymous sub whose text is
!  * supplied in s, and returns a reference to the closure. */
!
  static SV  *
  plperl_create_sub(char *s, bool trusted)
  {
--- 744,753 ----
  }


! /*
!  * Uses mksafefunc/mkunsafefunc to create an anonymous sub whose text is
!  * supplied in s, and returns a reference to the closure.
!  */
  static SV  *
  plperl_create_sub(char *s, bool trusted)
  {
***************
*** 761,768 ****
      ENTER;
      SAVETMPS;
      PUSHMARK(SP);
!     XPUSHs(sv_2mortal(newSVpv("our $_TD; local $_TD=$_[0]; shift;", 0)));
!     XPUSHs(sv_2mortal(newSVpv(s, 0)));
      PUTBACK;

      /*
--- 765,772 ----
      ENTER;
      SAVETMPS;
      PUSHMARK(SP);
!     XPUSHs(sv_2mortal(newSVstring("our $_TD; local $_TD=$_[0]; shift;")));
!     XPUSHs(sv_2mortal(newSVstring(s)));
      PUTBACK;

      /*
***************
*** 900,910 ****

              tmp = OutputFunctionCall(&(desc->arg_out_func[i]),
                                       fcinfo->arg[i]);
!             sv = newSVpv(tmp, 0);
! #if PERL_BCDVERSION >= 0x5006000L
!             if (GetDatabaseEncoding() == PG_UTF8)
!                 SvUTF8_on(sv);
! #endif
              XPUSHs(sv_2mortal(sv));
              pfree(tmp);
          }
--- 904,910 ----

              tmp = OutputFunctionCall(&(desc->arg_out_func[i]),
                                       fcinfo->arg[i]);
!             sv = newSVstring(tmp);
              XPUSHs(sv_2mortal(sv));
              pfree(tmp);
          }
***************
*** 965,971 ****

      tg_trigger = ((TriggerData *) fcinfo->context)->tg_trigger;
      for (i = 0; i < tg_trigger->tgnargs; i++)
!         XPUSHs(sv_2mortal(newSVpv(tg_trigger->tgargs[i], 0)));
      PUTBACK;

      /* Do NOT use G_KEEPERR here */
--- 965,971 ----

      tg_trigger = ((TriggerData *) fcinfo->context)->tg_trigger;
      for (i = 0; i < tg_trigger->tgnargs; i++)
!         XPUSHs(sv_2mortal(newSVstring(tg_trigger->tgargs[i])));
      PUTBACK;

      /* Do NOT use G_KEEPERR here */
***************
*** 1256,1262 ****
      HeapTuple    procTup;
      Form_pg_proc procStruct;
      char        internal_proname[64];
-     int            proname_len;
      plperl_proc_desc *prodesc = NULL;
      int            i;
      SV          **svp;
--- 1256,1261 ----
***************
*** 1277,1288 ****
      else
          sprintf(internal_proname, "__PLPerl_proc_%u_trigger", fn_oid);

-     proname_len = strlen(internal_proname);
-
      /************************************************************
       * Lookup the internal proc name in the hashtable
       ************************************************************/
!     svp = hv_fetch(plperl_proc_hash, internal_proname, proname_len, FALSE);
      if (svp)
      {
          bool        uptodate;
--- 1276,1285 ----
      else
          sprintf(internal_proname, "__PLPerl_proc_%u_trigger", fn_oid);

      /************************************************************
       * Lookup the internal proc name in the hashtable
       ************************************************************/
!     svp = hv_fetch_string(plperl_proc_hash, internal_proname);
      if (svp)
      {
          bool        uptodate;
***************
*** 1484,1491 ****
                   internal_proname);
          }

!         hv_store(plperl_proc_hash, internal_proname, proname_len,
!                  newSVuv(PTR2UV(prodesc)), 0);
      }

      ReleaseSysCache(procTup);
--- 1481,1488 ----
                   internal_proname);
          }

!         hv_store_string(plperl_proc_hash, internal_proname,
!                         newSVuv(PTR2UV(prodesc)));
      }

      ReleaseSysCache(procTup);
***************
*** 1512,1547 ****
          char       *outputstr;
          Oid            typoutput;
          bool        typisvarlena;
-         int            namelen;
-         SV           *sv;

          if (tupdesc->attrs[i]->attisdropped)
              continue;

          attname = NameStr(tupdesc->attrs[i]->attname);
-         namelen = strlen(attname);
          attr = heap_getattr(tuple, i + 1, tupdesc, &isnull);

          if (isnull)
          {
              /* Store (attname => undef) and move on. */
!             hv_store(hv, attname, namelen, newSV(0), 0);
              continue;
          }

          /* XXX should have a way to cache these lookups */
-
          getTypeOutputInfo(tupdesc->attrs[i]->atttypid,
                            &typoutput, &typisvarlena);

          outputstr = OidOutputFunctionCall(typoutput, attr);

!         sv = newSVpv(outputstr, 0);
! #if PERL_BCDVERSION >= 0x5006000L
!         if (GetDatabaseEncoding() == PG_UTF8)
!             SvUTF8_on(sv);
! #endif
!         hv_store(hv, attname, namelen, sv, 0);

          pfree(outputstr);
      }
--- 1509,1535 ----
          char       *outputstr;
          Oid            typoutput;
          bool        typisvarlena;

          if (tupdesc->attrs[i]->attisdropped)
              continue;

          attname = NameStr(tupdesc->attrs[i]->attname);
          attr = heap_getattr(tuple, i + 1, tupdesc, &isnull);

          if (isnull)
          {
              /* Store (attname => undef) and move on. */
!             hv_store_string(hv, attname, newSV(0));
              continue;
          }

          /* XXX should have a way to cache these lookups */
          getTypeOutputInfo(tupdesc->attrs[i]->atttypid,
                            &typoutput, &typisvarlena);

          outputstr = OidOutputFunctionCall(typoutput, attr);

!         hv_store_string(hv, attname, newSVstring(outputstr));

          pfree(outputstr);
      }
***************
*** 1627,1636 ****

      result = newHV();

!     hv_store(result, "status", strlen("status"),
!              newSVpv((char *) SPI_result_code_string(status), 0), 0);
!     hv_store(result, "processed", strlen("processed"),
!              newSViv(processed), 0);

      if (status > 0 && tuptable)
      {
--- 1615,1624 ----

      result = newHV();

!     hv_store_string(result, "status",
!                     newSVstring(SPI_result_code_string(status)));
!     hv_store_string(result, "processed",
!                     newSViv(processed));

      if (status > 0 && tuptable)
      {
***************
*** 1644,1651 ****
              row = plperl_hash_from_tuple(tuptable->vals[i], tuptable->tupdesc);
              av_push(rows, row);
          }
!         hv_store(result, "rows", strlen("rows"),
!                  newRV_noinc((SV *) rows), 0);
      }

      SPI_freetuptable(tuptable);
--- 1632,1639 ----
              row = plperl_hash_from_tuple(tuptable->vals[i], tuptable->tupdesc);
              av_push(rows, row);
          }
!         hv_store_string(result, "rows",
!                         newRV_noinc((SV *) rows));
      }

      SPI_freetuptable(tuptable);
***************
*** 1811,1817 ****
          if (portal == NULL)
              elog(ERROR, "SPI_cursor_open() failed:%s",
                   SPI_result_code_string(SPI_result));
!         cursor = newSVpv(portal->name, 0);

          /* Commit the inner transaction, return to outer xact context */
          ReleaseCurrentSubTransaction();
--- 1799,1805 ----
          if (portal == NULL)
              elog(ERROR, "SPI_cursor_open() failed:%s",
                   SPI_result_code_string(SPI_result));
!         cursor = newSVstring(portal->name);

          /* Commit the inner transaction, return to outer xact context */
          ReleaseCurrentSubTransaction();
***************
*** 2065,2073 ****
       * Insert a hashtable entry for the plan and return
       * the key to the caller.
       ************************************************************/
!     hv_store(plperl_query_hash, qdesc->qname, strlen(qdesc->qname), newSVuv(PTR2UV(qdesc)), 0);

!     return newSVpv(qdesc->qname, strlen(qdesc->qname));
  }

  HV *
--- 2053,2061 ----
       * Insert a hashtable entry for the plan and return
       * the key to the caller.
       ************************************************************/
!     hv_store_string(plperl_query_hash, qdesc->qname, newSVuv(PTR2UV(qdesc)));

!     return newSVstring(qdesc->qname);
  }

  HV *
***************
*** 2098,2104 ****
          /************************************************************
           * Fetch the saved plan descriptor, see if it's o.k.
           ************************************************************/
!         sv = hv_fetch(plperl_query_hash, query, strlen(query), 0);
          if (sv == NULL)
              elog(ERROR, "spi_exec_prepared: Invalid prepared query passed");
          if (*sv == NULL || !SvOK(*sv))
--- 2086,2092 ----
          /************************************************************
           * Fetch the saved plan descriptor, see if it's o.k.
           ************************************************************/
!         sv = hv_fetch_string(plperl_query_hash, query);
          if (sv == NULL)
              elog(ERROR, "spi_exec_prepared: Invalid prepared query passed");
          if (*sv == NULL || !SvOK(*sv))
***************
*** 2118,2124 ****
          limit = 0;
          if (attr != NULL)
          {
!             sv = hv_fetch(attr, "limit", 5, 0);
              if (*sv && SvIOK(*sv))
                  limit = SvIV(*sv);
          }
--- 2106,2112 ----
          limit = 0;
          if (attr != NULL)
          {
!             sv = hv_fetch_string(attr, "limit");
              if (*sv && SvIOK(*sv))
                  limit = SvIV(*sv);
          }
***************
*** 2239,2245 ****
          /************************************************************
           * Fetch the saved plan descriptor, see if it's o.k.
           ************************************************************/
!         sv = hv_fetch(plperl_query_hash, query, strlen(query), 0);
          if (sv == NULL)
              elog(ERROR, "spi_query_prepared: Invalid prepared query passed");
          if (*sv == NULL || !SvOK(*sv))
--- 2227,2233 ----
          /************************************************************
           * Fetch the saved plan descriptor, see if it's o.k.
           ************************************************************/
!         sv = hv_fetch_string(plperl_query_hash, query);
          if (sv == NULL)
              elog(ERROR, "spi_query_prepared: Invalid prepared query passed");
          if (*sv == NULL || !SvOK(*sv))
***************
*** 2301,2307 ****
              elog(ERROR, "SPI_cursor_open() failed:%s",
                   SPI_result_code_string(SPI_result));

!         cursor = newSVpv(portal->name, 0);

          /* Commit the inner transaction, return to outer xact context */
          ReleaseCurrentSubTransaction();
--- 2289,2295 ----
              elog(ERROR, "SPI_cursor_open() failed:%s",
                   SPI_result_code_string(SPI_result));

!         cursor = newSVstring(portal->name);

          /* Commit the inner transaction, return to outer xact context */
          ReleaseCurrentSubTransaction();
***************
*** 2353,2359 ****
      void       *plan;
      plperl_query_desc *qdesc;

!     sv = hv_fetch(plperl_query_hash, query, strlen(query), 0);
      if (sv == NULL)
          elog(ERROR, "spi_exec_freeplan: Invalid prepared query passed");
      if (*sv == NULL || !SvOK(*sv))
--- 2341,2347 ----
      void       *plan;
      plperl_query_desc *qdesc;

!     sv = hv_fetch_string(plperl_query_hash, query);
      if (sv == NULL)
          elog(ERROR, "spi_exec_freeplan: Invalid prepared query passed");
      if (*sv == NULL || !SvOK(*sv))
***************
*** 2375,2378 ****
--- 2363,2422 ----
      free(qdesc);

      SPI_freeplan(plan);
+ }
+
+ /*
+  * Create a new SV from a string assumed to be in the current database's
+  * encoding.
+  */
+ static SV *
+ newSVstring(const char *str)
+ {
+     SV           *sv;
+
+     sv = newSVpv(str, 0);
+ #if PERL_BCDVERSION >= 0x5006000L
+     if (GetDatabaseEncoding() == PG_UTF8)
+         SvUTF8_on(sv);
+ #endif
+     return sv;
+ }
+
+ /*
+  * Store an SV into a hash table under a key that is a string assumed to be
+  * in the current database's encoding.
+  */
+ static SV **
+ hv_store_string(HV *hv, const char *key, SV *val)
+ {
+     int32    klen = strlen(key);
+
+     /*
+      * This seems nowhere documented, but under Perl 5.8.0 and up,
+      * hv_store() recognizes a negative klen parameter as meaning
+      * a UTF-8 encoded key.  It does not appear that hashes track
+      * UTF-8-ness of keys at all in Perl 5.6.
+      */
+ #if PERL_BCDVERSION >= 0x5008000L
+     if (GetDatabaseEncoding() == PG_UTF8)
+         klen = -klen;
+ #endif
+     return hv_store(hv, key, klen, val, 0);
+ }
+
+ /*
+  * Fetch an SV from a hash table under a key that is a string assumed to be
+  * in the current database's encoding.
+  */
+ static SV **
+ hv_fetch_string(HV *hv, const char *key)
+ {
+     int32    klen = strlen(key);
+
+     /* See notes in hv_store_string */
+ #if PERL_BCDVERSION >= 0x5008000L
+     if (GetDatabaseEncoding() == PG_UTF8)
+         klen = -klen;
+ #endif
+     return hv_fetch(hv, key, klen, 0);
  }

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
David Fetter
Дата:
On Sun, Oct 15, 2006 at 05:34:21PM -0000, Greg Sabino Mullane wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> Tom Lane wrote:
> > #if PERL_BCDVERSION >= 0x5006000L
> ...
> > #endif
> ...
> > I'm tempted to consolidate this into a function on the order of
> > newSVstring(const char *) or some such.  Comments?
>
> +1
>
> I suggested at one point raising the minimum requirement for Perl to
> 5.6 (which came out way back in 2000, so we're unlikely to upset
> anyone).  If we haven't done that already, this would be a good
> chance as we can get rid of that ugly #if block.

I'd like to suggest that raise that minimum requirement for Perl to
5.8 for 8.3, as Perl 5.8 will be about five years old by then.  I
understand that some people are running versions of some operating
system or other so ancient that Perl 5.8 did not come pre-installed
with it, but we can't be supporting other projects backwards into
eternity.  Supporting our own is already plenty of work.

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
Tom Lane
Дата:
David Fetter <david@fetter.org> writes:
> I'd like to suggest that raise that minimum requirement for Perl to
> 5.8 for 8.3, as Perl 5.8 will be about five years old by then.

Well, we're still supporting some OS versions that are way over five
years old.  ISTM the real question is what do we buy if we make such
a restriction?  Getting rid of a few small ifdefs doesn't seem like
an adequate reason.  Is there some major improvement we could make?

            regards, tom lane

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
"Greg Sabino Mullane"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Tom Lane asked:
> Well, we're still supporting some OS versions that are way over five
> years old.  ISTM the real question is what do we buy if we make such
> a restriction?  Getting rid of a few small ifdefs doesn't seem like
> an adequate reason.  Is there some major improvement we could make?

Well, as you just pointed out in the last commit, Unicode-aware hashes.
Unicode in general was cleaned up and overhauled in 5.8, so if pl/perl
is going to make a serious attempt to support it, it probably should
require 5.8.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200610151657
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFFMqG1vJuQZxSWSsgRApmpAJ9B29AhaBGnEA6h7o5FgemlrIUgzgCgtTZu
QZkaGYy0iH0JnHoZGoE/KRE=
=hgIs
-----END PGP SIGNATURE-----

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
David Fetter
Дата:
On Sun, Oct 15, 2006 at 04:43:17PM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > I'd like to suggest that raise that minimum requirement for Perl
> > to 5.8 for 8.3, as Perl 5.8 will be about five years old by then.
>
> Well, we're still supporting some OS versions that are way over five
> years old.  ISTM the real question is what do we buy if we make such
> a restriction?  Getting rid of a few small ifdefs doesn't seem like
> an adequate reason.  Is there some major improvement we could make?

UTF-8 handling much improved in 5.8.  Just anecdotally, a former Very
Large Client is really suffering from having to maintain two branches
of every piece of Perl code in the house that touches anything UTF-8.

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
Tom Lane
Дата:
David Fetter <david@fetter.org> writes:
> On Sun, Oct 15, 2006 at 04:43:17PM -0400, Tom Lane wrote:
>> ISTM the real question is what do we buy if we make such
>> a restriction?  Getting rid of a few small ifdefs doesn't seem like
>> an adequate reason.  Is there some major improvement we could make?

> UTF-8 handling much improved in 5.8.

And?  Sure there are good reasons why someone might want to use 5.8
instead of 5.6, but how is that a reason for us to remove 5.6 support?

            regards, tom lane

Re: [HACKERS] BUG #2683: spi_exec_query in plperl returns

От
Tom Lane
Дата:
"Andrew Dunstan" <andrew@dunslane.net> writes:
> I am also wondering, now that it's been raised, if we need to issue a "use
> utf8;" in the startup code, so that literals in the code get the right
> encoding.

Good question.  I took care to ensure that the code strings passed to
Perl are marked as UTF8; perhaps that makes it happen implicitly?
If not, are there any downsides to issuing "use utf8"?
        regards, tom lane


Re: [HACKERS] BUG #2683: spi_exec_query in plperl returns

От
David Fetter
Дата:
On Sun, Oct 15, 2006 at 04:50:15PM -0500, Andrew Dunstan wrote:
> Tom Lane wrote:
> > I wrote:
> >> It looks to me like basically everywhere in plperl.c that does
> >> newSVpv() should follow it with
> >>
> >> #if PERL_BCDVERSION >= 0x5006000L
> >>             if (GetDatabaseEncoding() == PG_UTF8)
> >>                 SvUTF8_on(sv);
> >> #endif
> >
> > Experimentation proved that this was insufficient to fix Vitali's
> > problem --- the string he's unhappy about is actually a hash key
> > entry, and there's no documented way to mark the second argument
> > of hv_store() as being a UTF-8 string.  Some digging in the Perl
> > source code found that since at least Perl 5.8.0, hv_fetch and
> > hv_store recognize a negative key length as meaning a UTF-8 key
> > (ick!!), so I used that hack.  I am not sure there is any
> > reasonable fix available in Perl 5.6.x.
> >
> > Attached patch applied to HEAD, but I'm not going to risk
> > back-patching it without some field testing.
> 
> Hmm. That negative pointer hack is mighty ugly.
> 
> I am also wondering, now that it's been raised, if we need to issue
> a "use utf8;" in the startup code, so that literals in the code get
> the right encoding.

That would be a reason to go to 5.8, as 'use utf8;' is tricky at best
in 5.6.

Cheers,
D
-- 
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666                             Skype: davidfetter

Remember to vote!


Re: [HACKERS] BUG #2683: spi_exec_query in plperl returns

От
"Andrew Dunstan"
Дата:
Tom Lane wrote:
> I wrote:
>> It looks to me like basically everywhere in plperl.c that does newSVpv()
>> should follow it with
>>
>> #if PERL_BCDVERSION >= 0x5006000L
>>             if (GetDatabaseEncoding() == PG_UTF8)
>>                 SvUTF8_on(sv);
>> #endif
>
> Experimentation proved that this was insufficient to fix Vitali's
> problem --- the string he's unhappy about is actually a hash key entry,
> and there's no documented way to mark the second argument of hv_store()
> as being a UTF-8 string.  Some digging in the Perl source code found
> that since at least Perl 5.8.0, hv_fetch and hv_store recognize a
> negative key length as meaning a UTF-8 key (ick!!), so I used that hack.
> I am not sure there is any reasonable fix available in Perl 5.6.x.
>
> Attached patch applied to HEAD, but I'm not going to risk back-patching
> it without some field testing.
>

Hmm. That negative pointer hack is mighty ugly.

I am also wondering, now that it's been raised, if we need to issue a "use
utf8;" in the startup code, so that literals in the code get the right
encoding.

cheers

andrew




Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
David Fetter
Дата:
On Sun, Oct 15, 2006 at 06:12:43PM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > On Sun, Oct 15, 2006 at 04:43:17PM -0400, Tom Lane wrote:
> >> ISTM the real question is what do we buy if we make such a
> >> restriction?  Getting rid of a few small ifdefs doesn't seem like
> >> an adequate reason.  Is there some major improvement we could
> >> make?
>
> > UTF-8 handling much improved in 5.8.
>
> And?  Sure there are good reasons why someone might want to use 5.8
> instead of 5.6, but how is that a reason for us to remove 5.6
> support?

At some point, we will find something where we will have to duplicate
some large hunk of 5.8's functionality to support 5.6.  Why wait for
that to happen?  Murphy's Law says it will happen in a bug fix between
two minor releases.

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
Tom Lane
Дата:
David Fetter <david@fetter.org> writes:
> At some point, we will find something where we will have to duplicate
> some large hunk of 5.8's functionality to support 5.6.

No, we won't; we are not in the business of fixing Perl bugs.  You
haven't given any reason why someone who is using 5.6 and is happy with
it shouldn't be able to continue to use it with PG.

            regards, tom lane

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
David Fetter
Дата:
On Sun, Oct 15, 2006 at 07:07:18PM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > At some point, we will find something where we will have to duplicate
> > some large hunk of 5.8's functionality to support 5.6.
>
> No, we won't; we are not in the business of fixing Perl bugs.

My point exactly.

> You haven't given any reason why someone who is using 5.6 and is
> happy with it shouldn't be able to continue to use it with PG.

In my experience, people aren't "happy with" 5.6.  Instead, they've
got a mandate to maintain support for it, and this could be an
argument for upgrading.

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

Re: BUG #2683: spi_exec_query in plperl returns column

От
Bruce Momjian
Дата:
David Fetter wrote:
> On Sun, Oct 15, 2006 at 07:07:18PM -0400, Tom Lane wrote:
> > David Fetter <david@fetter.org> writes:
> > > At some point, we will find something where we will have to duplicate
> > > some large hunk of 5.8's functionality to support 5.6.
> >
> > No, we won't; we are not in the business of fixing Perl bugs.
>
> My point exactly.
>
> > You haven't given any reason why someone who is using 5.6 and is
> > happy with it shouldn't be able to continue to use it with PG.
>
> In my experience, people aren't "happy with" 5.6.  Instead, they've
> got a mandate to maintain support for it, and this could be an
> argument for upgrading.

Or an argument to dump PostgreSQL.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
David Fetter
Дата:
On Mon, Oct 16, 2006 at 12:10:42AM -0400, Bruce Momjian wrote:
> David Fetter wrote:
> > On Sun, Oct 15, 2006 at 07:07:18PM -0400, Tom Lane wrote:
> > > David Fetter <david@fetter.org> writes:
> > > > At some point, we will find something where we will have to
> > > > duplicate some large hunk of 5.8's functionality to support
> > > > 5.6.
> > >
> > > No, we won't; we are not in the business of fixing Perl bugs.
> >
> > My point exactly.
> >
> > > You haven't given any reason why someone who is using 5.6 and is
> > > happy with it shouldn't be able to continue to use it with PG.
> >
> > In my experience, people aren't "happy with" 5.6.  Instead,
> > they've got a mandate to maintain support for it, and this could
> > be an argument for upgrading.
>
> Or an argument to dump PostgreSQL.

I don't know about your experience, but in mine, outfits that hang on
that tightly to outdated software aren't the kind I find any profit in
supporting, either dollar-wise or in terms of personal satisfaction.

That aside, we have a choice here:

1.  We can continue to support 5.6 until we can't any more, and
statistically speaking that "can't any more" is quite likely to happen
between two minor releases.  In that event, we look flaky by changing
the requirements in mid-cycle.

2.  We can give people ample advance notice that we're phasing out
support for Perl 5.6 and earlier, simplify our code base, and look
professional.

Which do you prefer?

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
Tom Lane
Дата:
David Fetter <david@fetter.org> writes:
> 1.  We can continue to support 5.6 until we can't any more, and
> statistically speaking that "can't any more" is quite likely to happen
> between two minor releases.

That's a silly and unfounded assertion.  What sort of event do you
foresee that is going to make us conclude that we have to remove 5.6
support in a minor release?

            regards, tom lane

Re: [HACKERS] BUG #2683: spi_exec_query in plperl returns

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> It's clear whether you actually want to allow people to put utf8
> characters directly into their source (especially if the database is
> not in utf8 encoding anyway). There is always the \u{xxxx} escape.

Well, if the database encoding isn't utf8 then we'd not issue any such
command anyway.  But if it is, then AFAICS the text of pg_proc entries
could be expected to be utf8 too.
        regards, tom lane


Re: BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

От
David Fetter
Дата:
On Mon, Oct 16, 2006 at 10:00:13AM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > 1.  We can continue to support 5.6 until we can't any more, and
> > statistically speaking that "can't any more" is quite likely to
> > happen between two minor releases.
>
> That's a silly and unfounded assertion.  What sort of event do you
> foresee that is going to make us conclude that we have to remove 5.6
> support in a minor release?

I did mention this earlier: a Unicode misbehavior in 5.6 that causes a
data loss or crash, which is not present in 5.8.

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

Re: [HACKERS] BUG #2683: spi_exec_query in plperl returns

От
Martijn van Oosterhout
Дата:
On Sun, Oct 15, 2006 at 06:15:27PM -0400, Tom Lane wrote:
> "Andrew Dunstan" <andrew@dunslane.net> writes:
> > I am also wondering, now that it's been raised, if we need to issue a "use
> > utf8;" in the startup code, so that literals in the code get the right
> > encoding.
>
> Good question.  I took care to ensure that the code strings passed to
> Perl are marked as UTF8; perhaps that makes it happen implicitly?
> If not, are there any downsides to issuing "use utf8"?

What "use utf8" does is allow the *source* to be in utf8, thus affecting
what's a valid identifier and such. It doesn't affect the data, for
that you need "use encoding 'utf8'".

It's clear whether you actually want to allow people to put utf8
characters directly into their source (especially if the database is
not in utf8 encoding anyway). There is always the \u{xxxx} escape.

The perlunicode man page describe it better, though I only have
perl5.8. In know the perl5.6 model was different and somewhat more
awkward to use.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.