Re: FTS performance with the Polish config
From | Oleg Bartunov |
---|---|
Subject | Re: FTS performance with the Polish config |
Date | |
Msg-id | Pine.LNX.4.64.0911151702010.6801@sn.sai.msu.ru |
In reply to | Re: FTS performance with the Polish config (Pavel Stehule <pavel.stehule@gmail.com>) |
List | pgsql-performance |
On Sun, 15 Nov 2009, Pavel Stehule wrote:

> czech stemmer doesn't exist :(

I'd try morfessor http://www.cis.hut.fi/projects/morpho/, which is an
unsupervised morphological dictionary. I think it wouldn't be very hard to
add a morfessor dictionary template to tsearch2, so people could create
their own stemmers.

>> Ispell dictionary (doesn't matter english, or other language) is slow for
>> the first load and then it caches, so there is no problem if use persistent
>> database connection, which is de facto standard for any serious projects.
>
> I agree so connection pooling should be a solution. But it is good?
> Cannot we share dictionary better?

We thought about this issue and have some ideas. Teodor can be clearer
here, since I don't remember all the details.

>>> Pavel
>>>
>>>> Oleg
>>>> On Sat, 14 Nov 2009, Pavel Stehule wrote:
>>>>
>>>>> 2009/11/14 Tom Lane <tgl@sss.pgh.pa.us>:
>>>>>>
>>>>>> Kenneth Marshall <ktm@rice.edu> writes:
>>>>>>>
>>>>>>> On Sat, Nov 14, 2009 at 12:25:05PM +0100, Wojciech Knapik wrote:
>>>>>>>>
>>>>>>>> I just finished implementing a "search engine" for my site and found
>>>>>>>> ts_headline extremely slow when used with a Polish tsearch
>>>>>>>> configuration, while fast with English.
>>>>>>>
>>>>>>> The documentation for ts_headline() states:
>>>>>>> ts_headline uses the original document, not a tsvector summary, so it
>>>>>>> can be slow and should be used with care.
>>>>>>
>>>>>> That's true but the argument in the docs would apply just as well to
>>>>>> english or any other config. So while Wojciech would be well advised
>>>>>> to try to avoid making a lot of calls to ts_headline, it's still
>>>>>> curious that it's so much slower in polish than english. Could we
>>>>>> see a self-contained test case?
>>>>>
>>>>> is it dictionary based or stem based?
>>>>>
>>>>> Dictionary based FTS is very slow (first load). Minimally czech FTS is
>>>>> slow.
>>>>>
>>>>> regards
>>>>> Pavel Stehule
>>>>>
>>>>>> regards, tom lane
>>>>>>
>>>>>> --
>>>>>> Sent via pgsql-performance mailing list
>>>>>> (pgsql-performance@postgresql.org)
>>>>>> To make changes to your subscription:
>>>>>> http://www.postgresql.org/mailpref/pgsql-performance

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
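The point made in the thread, that an Ispell dictionary is slow on first load and is then cached for the rest of the connection, can be illustrated with a toy model. This is only a sketch and does not touch PostgreSQL: `Session`, `load_dictionary`, and the two counting functions are hypothetical names invented for the illustration, and the two-word dictionary stands in for a large Ispell file. It counts how often the expensive load happens when every query opens a fresh connection versus when one persistent connection is reused.

```python
class Session:
    """Toy stand-in for one database connection (hypothetical, not a PostgreSQL API).

    Each session keeps its own dictionary cache, which is filled on first use.
    """

    def __init__(self, loader):
        self._loader = loader
        self._dictionary = None
        self.loads = 0  # how many times this session had to load the dictionary

    def lexize(self, word):
        # The first call in a session pays the load cost; later calls reuse the cache.
        if self._dictionary is None:
            self._dictionary = self._loader()
            self.loads += 1
        return self._dictionary.get(word, word)


def load_dictionary():
    # Assumption: stands in for parsing large Ispell dictionary and affix files.
    return {"koty": "kot", "domy": "dom"}


def count_loads_new_session_per_query(words):
    """Open a fresh session for every word, as a client without pooling would."""
    total = 0
    for word in words:
        session = Session(load_dictionary)
        session.lexize(word)
        total += session.loads
    return total


def count_loads_persistent_session(words):
    """Reuse one session for all words, as a persistent or pooled connection would."""
    session = Session(load_dictionary)
    for word in words:
        session.lexize(word)
    return session.loads


if __name__ == "__main__":
    words = ["koty", "domy", "koty"]
    print(count_loads_new_session_per_query(words))  # 3: one load per connection
    print(count_loads_persistent_session(words))     # 1: loaded once, then cached
```

In this model the load count grows with the number of connections, not the number of queries, which is why persistent connections or pooling hide the first-load cost. It does not address the follow-up question of sharing one loaded dictionary across connections.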