Re: [GENERAL] Creation of tsearch2 index is very slow

От: Martijn van Oosterhout
Тема: Re: [GENERAL] Creation of tsearch2 index is very slow
Дата: ,
Msg-id: 20060120223327.GH31908@svana.org
(см: обсуждение, исходный текст)
Ответ на: Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane)
Список: pgsql-performance

Скрыть дерево обсуждения

Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
 Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
  Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
   Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
    Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
     Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
      Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
      Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
        Re: [GENERAL] Creation of tsearch2 index is very slow  (Ron, )
         Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
       Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
      Re: [GENERAL] Creation of tsearch2 index is very slow  (Ron, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
       Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
        Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
         Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
          Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
           Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
            Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
            Re: [GENERAL] Creation of tsearch2 index is very slow  ("Craig A. James", )
            Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
             Re: [GENERAL] Creation of tsearch2 index is very  (Oleg Bartunov, )
              Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
             Re: [GENERAL] Creation of tsearch2 index is very  (Tom Lane, )
              Re: [GENERAL] Creation of tsearch2 index is very  (David Lang, )
               Re: [GENERAL] Creation of tsearch2 index is very  (Tom Lane, )
              Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
               Re: [GENERAL] Creation of tsearch2 index is very  (Alvaro Herrera <-ip.org>, )
                Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
                 Re: [GENERAL] Creation of tsearch2 index is very  ("Craig A. James", )
                  Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
        Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
         Re: [GENERAL] Creation of tsearch2 index is very slow  (Oleg Bartunov, )
          Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
           Re: [GENERAL] Creation of tsearch2 index is very slow  (Oleg Bartunov, )
            Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
           Re: [GENERAL] Creation of tsearch2 index is very slow  (Oleg Bartunov, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
      Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
 Re: [GENERAL] Creation of tsearch2 index is very  (Oleg Bartunov, )

On Fri, Jan 20, 2006 at 04:50:17PM -0500, Tom Lane wrote:
> (hemdistcache calls hemdistsign --- I think gprof is doing something
> funny with tail-calls here, and showing hemdistsign as directly called
> from gtsvector_picksplit when control really arrives through hemdistcache.)

It may be the compiler. All these functions are declared static, which
gives the compiler quite a bit of leeway to rearrange code.

> The bulk of the problem is clearly in this loop, which performs O(N^2)
> comparisons to find the two entries that are furthest apart in hemdist
> terms:

Ah. A while ago someone came onto the list asking about bit strings
indexing[1]. If I'd known tsearch worked like this I would have pointed
him to it. Anyway, before he went off to implement it he mentioned
"Jarvis-Patrick clustering", whatever that means.

Probably more relevent was this thread[2] on -hackers a while back with
pseudo-code[3]. How well it works, I don't know, it worked for him
evidently, he went away happy...

[1] http://archives.postgresql.org/pgsql-general/2005-11/msg00473.php
[2] http://archives.postgresql.org/pgsql-hackers/2005-11/msg01067.php
[3] http://archives.postgresql.org/pgsql-hackers/2005-11/msg01069.php

Hope this helps,
--
Martijn van Oosterhout   <>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Вложения

В списке pgsql-performance по дате сообщения:

От: "Craig A. James"
Дата:
Сообщение: Re: [GENERAL] Creation of tsearch2 index is very slow
От: Rikard Pavelic
Дата:
Сообщение: Re: [PERFORMANCE] Stored Procedures