Re: [HACKERS] Interesting index/LIKE/join slowness problems
От | Tom Lane |
---|---|
Тема | Re: [HACKERS] Interesting index/LIKE/join slowness problems |
Дата | |
Msg-id | 9517.932134448@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: [HACKERS] Interesting index/LIKE/join slowness problems (Ole Gjerde <gjerde@icebox.org>) |
Ответы |
Re: [HACKERS] Interesting index/LIKE/join slowness problems
(Ole Gjerde <gjerde@icebox.org>)
|
Список | pgsql-hackers |
Ole Gjerde <gjerde@icebox.org> writes: > Ok.. I get that.. But why does LIKE 'AN914' have the same problem? The % > doesn't have to be there as long as it's either LIKE or ~*(or ~ etc) > query. A pure "where field LIKE constant" doesn't have the problem; it's the OR that does it. More specifically it's an OR of ANDs that doesn't work very well. By the time the parser gets done with it, your query looks like select * from mcrl1 wherereference = 'AN914' OR(reference LIKE 'AN914-%' AND reference >= 'AN914-' AND reference <= 'AN914-\377'); (ugly, ain't it?) Those comparison clauses are what need to be pulled out and fed to the indexscan mechanism, so that only part of the table gets scanned, not the whole table. Indexscan doesn't know anything about LIKE, but it does grok >= and <=. Unfortunately the current optimizer doesn't do it right. I looked into a very similar bug report from Hiroshi Inoue (see his message of 3/19/99 and my response of 4/3 in the hackers archives), and what I found was that the cause is a fairly fundamental optimizer design choice. The ANDed conditions get split into separate top-level clauses and there's no easy way to put them back together. The optimizer ends up passing only one of them to the indexscan executor. That's better than nothing, but on average you still end up scanning half the table rather than just a small range of it. > I haven't been able to find a discussion on this topic last few months, I > found discussion about something similar in March, but that didn't explain > it very well.. I'll just have to look some more :) I was referring to the discussion around 4/15/99 about why LIKE needs a smarter way to generate the upper comparison clause. That's not directly your problem, but it is causing the same kind of slowdown for everyone who does use LOCALE... >> When you throw in the OR, the indexqual logic basically breaks down >> completely; I think you end up scanning the entire table. (This could >> be made smarter, perhaps, but right now I don't believe the system is >> able to figure out the union of indexqual conditions.) I was wrong about that --- the executor *does* handle OR'd indexqual conditions, basically by performing a new indexscan for each OR'd condition. (That's why EXPLAIN is listing the index multiple times.) The trouble with OR-of-ANDs is entirely the optimizer's fault; the executor would do them fine if the optimizer would only hand them over in that form. > What would be the outlook on fixing the problem and not the symptom? :) I plan to look into fixing this for 6.6, but don't hold your breath waiting... regards, tom lane
В списке pgsql-hackers по дате отправления: