Re: still gin index creation takes forever

Поиск
Список
Период
Сортировка
От Ivan Sergio Borgonovo
Тема Re: still gin index creation takes forever
Дата
Msg-id 20081113103332.361f315e@dawn.webthatworks.it
обсуждение исходный текст
Ответ на Re: still gin index creation takes forever  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: still gin index creation takes forever  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
On Wed, 12 Nov 2008 15:18:05 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> > So, in that case process can insert about 1000 ItemPointers per
> > one data tree lookup, in opposite case it does 1000 lookups in
> > data tree.

> I see.  So this could explain Ivan's issue if his table contains
> large numbers of repeated GIN keys.  Ivan, is that what your data
> looks like?

Well if by GIN keys you mean lexemes it could be. But I wouldn't say
this circumstance is uncommon among users of tsearch. I'd expect
other people had used tsearch2 to search through titles, authors and
publishers of books, so if that was the problem I'd expect the
problem to come up earlier.
Actually tsearch2 is not completely tuned up, since I still have to
"mix" Italian and English configuration to get rid of some more
stop words etc... that may increase the number of repetitions, but I
doubt this only put me in a corner case.

Anyway trying to answer in a more objective way to your question I
ran:

SELECT * FROM ts_stat('SELECT ft1idx FROM catalog_items') order by
nentry desc, ndoc desc limit 20;

It ran over 9h and I still wasn't able to get the answer.
I killed psql client that was running it and postgres continued to
eat 100% CPU for a while till I stopped it.

Considering that running:
SELECT * FROM ts_stat('SELECT ft1idx FROM catalog_items limit 50000')
order by nentry desc, ndoc desc limit 20;
returned in less than 2 minutes and catalog_items has a bit less
than 1M record... there is still something weird.

"springer";10824;10833
"e";7703;8754
"di";6815;7771
"il";5622;6168
"la";4989;5407
"hall";4357;4416
"prentic";4321;4369
"l";3920;4166
"del";3092;3281
"edizioni";2465;2465
"della";2292;2410
"m";2283;2398
"dell";2150;2281
"j";1967;2099
"d";1789;1864
"per";1685;1770
"longman";1671;1746
"le";1656;1736
"press";1687;1687
"de";1472;1564

examining 90K records took a bit more than 6min.

I'll try to move everything on another box and see what happens.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


В списке pgsql-general по дате отправления:

Предыдущее
От: Glyn Astill
Дата:
Сообщение: Re: [Slony1-general] ERROR: incompatible library
Следующее
От: Louis-David Mitterrand
Дата:
Сообщение: DBI error when changing views