Re: [GENERAL] Creation of tsearch2 index is very

От: Oleg Bartunov
Тема: Re: [GENERAL] Creation of tsearch2 index is very
Дата: ,
Msg-id: Pine.GSO.4.63.0601211629510.14417@ra.sai.msu.su
(см: обсуждение, исходный текст)
Ответ на: Re: [GENERAL] Creation of tsearch2 index is very  (Ron)
Ответы: Re: [GENERAL] Creation of tsearch2 index is very  (Ron)
Список: pgsql-performance

Скрыть дерево обсуждения

Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
 Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
  Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
   Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
    Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
     Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
      Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
      Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
        Re: [GENERAL] Creation of tsearch2 index is very slow  (Ron, )
         Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
       Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
      Re: [GENERAL] Creation of tsearch2 index is very slow  (Ron, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
       Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
        Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
         Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
          Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
           Re: [GENERAL] Creation of tsearch2 index is very slow  (Tom Lane, )
            Re: [GENERAL] Creation of tsearch2 index is very slow  ("Steinar H. Gunderson", )
            Re: [GENERAL] Creation of tsearch2 index is very slow  ("Craig A. James", )
            Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
             Re: [GENERAL] Creation of tsearch2 index is very  (Oleg Bartunov, )
              Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
             Re: [GENERAL] Creation of tsearch2 index is very  (Tom Lane, )
              Re: [GENERAL] Creation of tsearch2 index is very  (David Lang, )
               Re: [GENERAL] Creation of tsearch2 index is very  (Tom Lane, )
              Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
               Re: [GENERAL] Creation of tsearch2 index is very  (Alvaro Herrera <-ip.org>, )
                Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
                 Re: [GENERAL] Creation of tsearch2 index is very  ("Craig A. James", )
                  Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
        Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
         Re: [GENERAL] Creation of tsearch2 index is very slow  (Oleg Bartunov, )
          Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
           Re: [GENERAL] Creation of tsearch2 index is very slow  (Oleg Bartunov, )
            Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
           Re: [GENERAL] Creation of tsearch2 index is very slow  (Oleg Bartunov, )
       Re: [GENERAL] Creation of tsearch2 index is very slow  (Martijn van Oosterhout, )
      Re: [GENERAL] Creation of tsearch2 index is very  (Ron, )
 Re: [GENERAL] Creation of tsearch2 index is very  (Oleg Bartunov, )

On Sat, 21 Jan 2006, Ron wrote:

> At 07:23 PM 1/20/2006, Tom Lane wrote:
>> "Steinar H. Gunderson" <> writes:
>> > On Fri, Jan 20, 2006 at 06:52:37PM -0500, Tom Lane wrote:
>> >> It's also worth considering that the entire approach is a heuristic,
>> >> really --- getting the furthest-apart pair of seeds doesn't guarantee
>> >> an optimal split as far as I can see.  Maybe there's some totally
>> >> different way to do it.
>>
>> > For those of us who don't know what tsearch2/gist is trying to accomplish
>> > here, could you provide some pointers? :-)
>>
>> Well, we're trying to split an index page that's gotten full into two index
>> pages, preferably with approximately equal numbers of items in
>> each new page (this isn't a hard requirement though).
>
> Maybe we are over thinking this.  What happens if we do the obvious and just
> make a new page and move the "last" n/2 items on the full page to the new
> page?
>
> Various forms of "move the last n/2 items" can be tested here:
> 0= just split the table in half.  Sometimes KISS  works. O(1).
> 1= the one's with the highest (or lowest) "x" value.
> 2= the one's with the highest sum of coordinates (x+y+...= values in the
> top/bottom n/2 of entries).
> 3= split the table so that each table has entries whose size_waste values add
> up to approximately the same value.
> 4= I'm sure there are others.
> 1-5 can be done in O(n) time w/o auxiliary data.  They can be done in O(1) if
> we've kept track of the appropriate metric as we've built the current page.
>
>
>> I think the true figure of merit for a split is how often will subsequent
>> searches have to descend into *both* of the resulting pages as opposed to
>> just one
>> --- the less often that is true, the better.  I'm not very clear on what
>> tsearch2 is doing with these bitmaps, but it looks like an upper page's
>> downlink has the union (bitwise OR) of the one-bits in the values on the
>> lower page, and you have to visit the lower page if this union has a
>> nonempty intersection with the set you are looking for.  If that's correct,
>> what you really want is to divide the values so that the unions of the two
>> sets have minimal overlap ... which seems to me to have little to do with
>> what the code does at present.
> I'm not sure what "upper page" and "lower page" mean here?
>
>
>> Teodor, Oleg, can you clarify what's needed here?
> Ditto.  Guys what is the real motivation and purpose for this code?

we want not just split the page by two very distinct parts, but to keep
resulted signatures which is ORed signature of all signatures in the page
as much 'sparse' as can.
some information available here
http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_internals

Unfortunately, we're rather busy right now and couldn't be very useful.

>
> Ron
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: , http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


В списке pgsql-performance по дате сообщения:

От: Tom Lane
Дата:
Сообщение: Re: [GENERAL] Creation of tsearch2 index is very
От: Martijn van Oosterhout
Дата:
Сообщение: Re: [GENERAL] Creation of tsearch2 index is very slow