Re: [GENERAL] Creation of tsearch2 index is very

От: Ron
Тема: Re: [GENERAL] Creation of tsearch2 index is very
Дата: ,
Msg-id: 7.0.1.0.2.20060121064930.03aba810@earthlink.net
(см: обсуждение, исходный текст)
Ответ на: Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane)
Ответы: Re: [GENERAL] Creation of tsearch2 index is very  (Oleg Bartunov)
Re: [GENERAL] Creation of tsearch2 index is very  (Tom Lane)
Список: pgsql-performance

Скрыть дерево обсуждения

Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
 Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
  Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
   Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
    Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
     Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
      Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
      Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
        Re: [GENERAL] Creation of tsearch2 index is very slow  (Ron, )
         Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
       Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
      Re: [GENERAL] Creation of tsearch2 index is very slow  (Ron, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
       Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
        Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
         Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
          Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
           Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
            Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
            Re: [GENERAL] Creation of tsearch2 index is very slow  ("Craig A. James", )
            Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
             Re: [GENERAL] Creation of tsearch2 index is very  (Oleg Bartunov, )
              Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
             Re: [GENERAL] Creation of tsearch2 index is very  (Tom Lane, )
              Re: [GENERAL] Creation of tsearch2 index is very  (David Lang, )
               Re: [GENERAL] Creation of tsearch2 index is very  (Tom Lane, )
              Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
               Re: [GENERAL] Creation of tsearch2 index is very  (Alvaro Herrera <-ip.org>, )
                Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
                 Re: [GENERAL] Creation of tsearch2 index is very  ("Craig A. James", )
                  Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
        Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
         Re: [GENERAL] Creation of tsearch2 index is very slow  (Oleg Bartunov, )
          Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
           Re: [GENERAL] Creation of tsearch2 index is very slow  (Oleg Bartunov, )
            Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
           Re: [GENERAL] Creation of tsearch2 index is very slow  (Oleg Bartunov, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
      Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
 Re: [GENERAL] Creation of tsearch2 index is very  (Oleg Bartunov, )

At 07:23 PM 1/20/2006, Tom Lane wrote:
>"Steinar H. Gunderson" <> writes:
> > On Fri, Jan 20, 2006 at 06:52:37PM -0500, Tom Lane wrote:
> >> It's also worth considering that the entire approach is a heuristic,
> >> really --- getting the furthest-apart pair of seeds doesn't guarantee
> >> an optimal split as far as I can see.  Maybe there's some totally
> >> different way to do it.
>
> > For those of us who don't know what tsearch2/gist is trying to accomplish
> > here, could you provide some pointers? :-)
>
>Well, we're trying to split an index page that's gotten full into
>two index pages, preferably with approximately equal numbers of items in
>each new page (this isn't a hard requirement though).

Maybe we are over thinking this.  What happens if we do the obvious
and just make a new page and move the "last" n/2 items on the full
page to the new page?

Various forms of "move the last n/2 items" can be tested here:
0= just split the table in half.  Sometimes KISS  works. O(1).
1= the one's with the highest (or lowest) "x" value.
2= the one's with the highest sum of coordinates (x+y+...= values in
the top/bottom n/2 of entries).
3= split the table so that each table has entries whose size_waste
values add up to approximately the same value.
4= I'm sure there are others.
1-5 can be done in O(n) time w/o auxiliary data.  They can be done in
O(1) if we've kept track of the appropriate metric as we've built the
current page.


>I think the true figure of merit for a split is how often will
>subsequent searches have to descend into *both* of the resulting
>pages as opposed to just one
>--- the less often that is true, the better.  I'm not very clear on
>what tsearch2 is doing with these bitmaps, but it looks like an
>upper page's downlink has the union (bitwise OR) of the one-bits in
>the values on the lower page, and you have to visit the lower page
>if this union has a nonempty intersection with the set you are
>looking for.  If that's correct, what you really want is to divide
>the values so that the unions of the two sets have minimal overlap
>... which seems to me to have little to do with what the code does at present.
I'm not sure what "upper page" and "lower page" mean here?


>Teodor, Oleg, can you clarify what's needed here?
Ditto.  Guys what is the real motivation and purpose for this code?

Ron




В списке pgsql-performance по дате сообщения:

От: Oleg Bartunov
Дата:
Сообщение: Re: [GENERAL] Creation of tsearch2 index is very slow
От: Tom Lane
Дата:
Сообщение: Re: [GENERAL] Creation of tsearch2 index is very