Re: Full text index not being used
From | Oleg Bartunov
---|---
Subject | Re: Full text index not being used
Date |
Msg-id | Pine.LNX.4.64.0902032016590.4158@sn.sai.msu.ru
In reply to | Re: Full text index not being used (Alex Neth <alex@liivid.com>)
List | pgsql-general
Alex, it looks like you need to read the documentation and tune postgresql.conf; currently it does not look good. I have no time to guide you, so search the archives for postgresql.conf tuning. This has been discussed many times.

Oleg

On Wed, 4 Feb 2009, Alex Neth wrote:

> I've also found other queries that were really fast with MySQL are really
> slow in Postgres. I'm hoping that is a matter of tuning. Overall I'm
> finding the query times to be extremely unpredictable.
>
> I added a slow query logger to my application that also does an explain.
> Check these out. The time in parens is the time for the initial execution
> (before it was cached). These are not under heavy load. Note that there are
> around 400 users - not a lot. 22 seconds is ridiculous.
>
> Slow query: (22.0198369026184) [0] SELECT * FROM "users" WHERE
> ("users"."remember_token" = E'26a2b54d2b2e18edc8aa2e1f0e0508b0d858d650')
> LIMIT 1
>
> Limit (cost=0.00..33.17 rows=1 width=784) (actual time=0.182..0.183 rows=1 loops=1)
>   ->  Seq Scan on users (cost=0.00..33.17 rows=1 width=784) (actual time=0.181..0.181 rows=1 loops=1)
>         Filter: ((remember_token)::text = '26a2b54d2b2e18edc8aa2e1f0e0508b0d858d650'::text)
> Total runtime: 0.223 ms
>
> Slow query: (59.8780090808868) [0] SELECT id FROM "source_listings" WHERE
> (post_time BETWEEN '2009-01-27 08:14:58.261978' AND '2009-02-05 08:14:58.262034'
> AND ((geo_lon BETWEEN 5751555 AND 5759369 AND geo_lat BETWEEN 12773379 AND 12776908)))
> ORDER BY post_time DESC LIMIT 60 OFFSET 0
>
> Limit (cost=89.38..89.38 rows=1 width=12) (actual time=1368.555..1368.644 rows=60 loops=1)
>   ->  Sort (cost=89.38..89.38 rows=1 width=12) (actual time=1368.552..1368.588 rows=60 loops=1)
>         Sort Key: post_time
>         Sort Method: top-N heapsort  Memory: 19kB
>         ->  Index Scan using x_sl_lat_lon_pt_br_lt_region on source_listings
>             (cost=0.00..89.37 rows=1 width=12) (actual time=0.097..1365.469 rows=2078 loops=1)
>               Index Cond: ((geo_lat >= 12773379) AND (geo_lat <= 12776908) AND
>               (geo_lon >= 5751555) AND (geo_lon <= 5759369) AND
>               (post_time >= '2009-01-27 08:14:58.261978'::timestamp without time zone) AND
>               (post_time <= '2009-02-05 08:14:58.262034'::timestamp without time zone))
> Total runtime: 1368.722 ms
>
>
> On Feb 3, 2009, at 8:15 PM, Oleg Bartunov wrote:
>
>> Alex, can you somehow identify the document which has the problem with
>> long-word errors? Also, if you have space on disk, I'd recommend trying
>>
>> select *, to_tsvector('english',full_listing) as flv from source_listings;
>
> This is equally slow.
>
>> I don't remember if you gave us information about
>> your setup (pg version, OS, memory, what did you change in
>> postgresql.conf...)
>
> Version is 8.3.5. Ubuntu 2.6.21. 2Gb RAM. postgresql.conf changes:
>
> shared_buffers = 24MB           # min 128kB or max_connections*16kB
> work_mem = 10MB                 # min 64kB
> max_fsm_pages = 153600          # min max_fsm_relations*16, 6 bytes each
>
>> Oleg
>> On Tue, 3 Feb 2009, Alex Neth wrote:
>>
>>> Based on suggestions from this list, I am trying to create a tsvector
>>> column and an index on that, since it is perhaps the recheck and rebuilding of
>>> all the vectors that is slowing things down. I don't understand why a
>>> recheck is necessary on a GIN index...
>>>
>>> My update statement has been running for 36 hours now and has not
>>> finished. The statement is: update source_listings set flv =
>>> to_tsvector('english', full_listing); I know that it is still working
>>> because it occasionally prints out one of those long-word errors.
>>>
>>> I have only 1.6M rows, and each entry in that column is a standard-size web
>>> page with just the text, maybe 3-5K.
>>>
>>> For sure I don't have meaningful long words. Perhaps that is because it
>>> is not handling the HTML well and I should be parsing down the web page
>>> first. Hopefully that doesn't mean I need to rebuild this column over the
>>> course of 3 days - I didn't expect it to take this long so I thought I'd
>>> just try it out.
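[As an aside for readers of the archive: shared_buffers = 24MB is the old 8.3-era default and is very small for a 2 GB machine. The values below are a hedged illustration of commonly cited rules of thumb for a dedicated server, not settings prescribed anywhere in this thread:]

```ini
# postgresql.conf - illustrative starting values for a dedicated 2 GB box
shared_buffers = 512MB          # ~25% of RAM is a common starting point
effective_cache_size = 1GB      # planner hint: shared_buffers + OS cache
work_mem = 10MB                 # per sort/hash node, per backend
maintenance_work_mem = 128MB    # speeds up CREATE INDEX and VACUUM
```

Changing shared_buffers requires a server restart; the others can be reloaded.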
>>>
>>> On Feb 2, 2009, at 1:49 AM, Oleg Bartunov wrote:
>>>
>>>> Alex,
>>>> what text are you indexing? I don't believe you have meaningful
>>>> very long words (> 2047 characters).
>>>> Do you really need a multicolumn index?
>>>> I'd recommend separating the problem: create a column fts for
>>>> to_tsvector('english', full_listing), create an index on it, and try a full-text
>>>> query. The way you're doing it implies calling to_tsvector every time you
>>>> search, which can be very costly.
>>>>
>>>> Oleg
>>>>
>>>> On Sun, 1 Feb 2009, Alex wrote:
>>>>
>>>>> So this seems to be because the result size is too big. I still don't
>>>>> know why it is looping through every record and printing a warning,
>>>>> but adding a LIMIT makes the queries complete in a reasonable time
>>>>> (although not all that fast).
>>>>> However, I need to sort and also have many other facets that may or may
>>>>> not be included in the query. Adding a sort makes it load every
>>>>> record again and take forever.
>>>>> I tried to create an index including all of the fields I query on to
>>>>> see if that would work, but I get an error that the index row is too
>>>>> large:
>>>>>
>>>>> => create index master_index on source_listings(geo_lat, geo_lon,
>>>>> price, bedrooms, region, city, listing_type, to_tsvector('english',
>>>>> full_listing), post_time);
>>>>> NOTICE: word is too long to be indexed
>>>>> DETAIL: Words longer than 2047 characters are ignored.
>>>>> NOTICE: word is too long to be indexed
>>>>> DETAIL: Words longer than 2047 characters are ignored.
>>>>> NOTICE: word is too long to be indexed
>>>>> DETAIL: Words longer than 2047 characters are ignored.
>>>>> NOTICE: word is too long to be indexed
>>>>> DETAIL: Words longer than 2047 characters are ignored.
>>>>> ERROR: index row requires 13356 bytes, maximum size is 8191
>>>>>
>>>>> Any ideas about how to resolve this?
>>>>
>>>> Regards,
>>>> Oleg
>>>> _____________________________________________________________
>>>> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
>>>> Sternberg Astronomical Institute, Moscow University, Russia
>>>> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
>>>> phone: +007(495)939-16-83, +007(495)939-23-83

Regards,
Oleg