Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
Дата
Msg-id 29388.1309575706@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Ответы Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
Список pgsql-hackers
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> I couldn't resist looking into this, and came up with the attached
> patch. I tested this with:

> CREATE TABLE words (word text);
> COPY words FROM '/usr/share/dict/words';
> CREATE INDEX i_words ON words USING gist (word gist_trgm_ops);

> And then ran "REINDEX INDEX i_words" a few times with and without the
> patch. Without the patch, reindex takes about 4.7 seconds. With the
> patch, 3.7 seconds. That's a worthwhile gain on its own, but becomes
> even more important with Alexander's fast GiST build patch, which calls
> the penalty function more.

I tested this on HPPA (big-endian, picky about alignment) to verify that
you had that code path right.  It's right, but on that platform the
speedup in the "reindex dict/words" test is only about 6%.  I'm afraid
that the benefit is a lot more machine-specific than we could wish.

I tweaked the patch a bit further (don't pessimize the boundary case
where there's exactly 4n+1 trigrams, avoid forcing trg1 into memory,
improve the comment) and attach that version below.  This is a little
bit faster but I still wonder if it's worth the price of adding an
obscure dependency on the header size.

> I used the attached showsign-debug.patch to verify that the patched
> makesign function produces the same results as the existing code. I
> haven't tested the big-endian code, however.

That didn't look terribly convincing.  I added a direct validation that
old and new code give the same results, a la

      if (ISARRKEY(newval))
      {
          BITVEC        sign;
+         BITVEC        osign;

          makesign(sign, newval);
+         origmakesign(osign, newval);
+         Assert(memcmp(sign, osign, sizeof(BITVEC)) == 0);

          if (ISALLTRUE(origval))
              *penalty = ((float) (SIGLENBIT - sizebitvec(sign))) / (float) (SIGLENBIT + 1);

and ran the regression tests and the dict/words example with that.

            regards, tom lane

diff --git a/contrib/pg_trgm/trgm_gist.c b/contrib/pg_trgm/trgm_gist.c
index b328a09f41fee50beb96a28835e15ef835222cd6..7cea6d68fdcde71e0fb033682d77c21978c3406b 100644
*** a/contrib/pg_trgm/trgm_gist.c
--- b/contrib/pg_trgm/trgm_gist.c
*************** gtrgm_out(PG_FUNCTION_ARGS)
*** 84,100 ****
  static void
  makesign(BITVECP sign, TRGM *a)
  {
!     int4        k,
!                 len = ARRNELEM(a);
      trgm       *ptr = GETARR(a);
!     int4        tmp = 0;

      MemSet((void *) sign, 0, sizeof(BITVEC));
      SETBIT(sign, SIGLENBIT);    /* set last unused bit */
!     for (k = 0; k < len; k++)
      {
!         CPTRGM(((char *) &tmp), ptr + k);
!         HASH(sign, tmp);
      }
  }

--- 84,178 ----
  static void
  makesign(BITVECP sign, TRGM *a)
  {
!     int4        len = ARRNELEM(a);
      trgm       *ptr = GETARR(a);
!     char       *p;
!     char       *endptr;
!     uint32        w1,
!                 w2,
!                 w3;
!     uint32        trg0 = 0,
!                 trg1,
!                 trg2,
!                 trg3,
!                 trg4;
!     uint32       *p32;

      MemSet((void *) sign, 0, sizeof(BITVEC));
      SETBIT(sign, SIGLENBIT);    /* set last unused bit */
!
!     if (len <= 0)
!         return;
!
!     /*----------
!      * We have to extract each trigram into a uint32, and calculate the HASH.
!      * This would be a lot easier if the trigrams were aligned on 4-byte
!      * boundaries, but they're not.  The simple way would be to copy each
!      * trigram byte-by-byte, but that is quite slow, and this function is a
!      * hotspot in penalty calculations.
!      *
!      * The first trigram in the array doesn't begin at a 4-byte boundary, as
!      * the flags byte comes first; but the next one does.  So we fetch the
!      * first trigram as a special case, and after that each four trigrams fall
!      * onto 4-byte words like this:
!      *
!      *  w1   w2   w3
!      * AAAB BBCC CDDD
!      *
!      * As long as there's at least four trigrams left to process, we fetch
!      * the next three words and extract the trigrams from them with bit
!      * operations, per the above diagram.  The last few trigrams are handled
!      * one at a time with byte-by-byte fetching.
!      *
!      * Note that this code yields different results on big-endian and
!      * little-endian machines, because the bytes of each trigram are loaded
!      * into a uint32 in memory order and left-justified.  That's probably
!      * undesirable, but changing this behavior would break existing indexes.
!      *----------
!      */
!     endptr = (char *) (ptr + len);
!     p32 = (uint32 *) (((char *) ptr) - 1);
!
!     /* Fetch and extract the initial word */
!     w1 = *(p32++);
! #ifdef WORDS_BIGENDIAN
!     trg1 = w1 << 8;
! #else
!     trg1 = w1 >> 8;
! #endif
!     HASH(sign, trg1);
!
!     while ((char *) p32 <= endptr - 3 * sizeof(uint32))
      {
!         w1 = *(p32++);
!         w2 = *(p32++);
!         w3 = *(p32++);
!
! #ifdef WORDS_BIGENDIAN
!         trg1 = w1 & 0xFFFFFF00;
!         trg2 = (w1 << 24) | ((w2 & 0xFFFF0000) >> 8);
!         trg3 = ((w2 & 0x0000FFFF) << 16) | ((w3 & 0xFF000000) >> 16);
!         trg4 = w3 << 8;
! #else
!         trg1 = w1 & 0x00FFFFFF;
!         trg2 = (w1 >> 24) | ((w2 & 0x0000FFFF) << 8);
!         trg3 = ((w2 & 0xFFFF0000) >> 16) | ((w3 & 0x000000FF) << 16);
!         trg4 = w3 >> 8;
! #endif
!
!         HASH(sign, trg1);
!         HASH(sign, trg2);
!         HASH(sign, trg3);
!         HASH(sign, trg4);
!     }
!
!     /* Handle the remaining 0-3 trigrams the slow way */
!     p = (char *) p32;
!     while (p < endptr)
!     {
!         CPTRGM(((char *) &trg0), p);
!         HASH(sign, trg0);
!         p += 3;
      }
  }


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: add support for logging current role (what to review?)
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: pg_upgrade defaulting to port 25432