Re: Performance query about large tables, lots of concurrent access
От | Karl Wright |
---|---|
Тема | Re: Performance query about large tables, lots of concurrent access |
Дата | |
Msg-id | 4677ECA9.3000402@metacarta.com обсуждение исходный текст |
Ответ на | Re: Performance query about large tables, lots of concurrent access (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Performance query about large tables, lots of concurrent access
Re: Performance query about large tables, lots of concurrent access |
Список | pgsql-performance |
Tom Lane wrote: > Karl Wright <kwright@metacarta.com> writes: >> Also, as I said before, I have done extensive query analysis and found >> that the plans for the queries that are taking a long time are in fact >> very reasonable. Here's an example from the application log of a query >> that took way more time than its plan would seem to indicate it should: > >> [2007-06-18 09:39:49,797]ERROR Plan: Index Scan using i1181764142395 on >> intrinsiclink (cost=0.00..14177.29 rows=5 width=253) >> [2007-06-18 09:39:49,797]ERROR Plan: Index Cond: ((jobid = $2) AND >> ((childidhash)::text = ($3)::text)) >> [2007-06-18 09:39:49,797]ERROR Plan: Filter: ((childid = ($4)::text) >> AND ((isnew = ($5)::bpchar) OR (isnew = ($6)::bpchar))) > > I see the discussion thread has moved on to consider lack-of-vacuuming > as the main problem, but I didn't want to let this pass without > comment. The above plan is not necessarily good at all --- it depends > on how many rows are selected by the index condition alone (ie, jobid > and childidhash) versus how many are selected by the index and filter > conditions. If the index retrieves many rows, most of which are > eliminated by the filter condition, it's still gonna take a long time. > > In this case it looks like the planner is afraid that that's exactly > what will happen --- a cost of 14177 suggests that several thousand row > fetches are expected to happen, and yet it's only predicting 5 rows out > after the filter. It's using this plan anyway because it has no better > alternative, but you should think about whether a different index > definition would help. > > regards, tom lane > Well, that's odd, because the hash in question that it is using is the SHA-1 hash of a URL. There's essentially one row per URL in this table. Even with a large table I would not expect more than a couple of collisions at most. How does it arrive at that estimate of 14,000? Karl
В списке pgsql-performance по дате отправления: