Обсуждение: FTS performance issue probably due to wrong planner estimate of detoasting
Hi,
I have problems with the performance of FTS in a query like this:
  SELECT * FROM FullTextSearch WHERE content_tsv_gin @@
plainto_tsquery('english', 'good');
It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped).
The planner obviously always chooses table scan: http://explain.depesz.com/s/EEE
I have to check again, if I'm doing something wrong but I'm pretty
sure it has to do with de-toasting and (wrong?) cost estimations.
I've seen some comments here saying that estimating detoasting costs
(especially with operator "@@" and GIN index) is an open issue (since
years?).
And I found a nice blog here [1] which uses 9.2/9.1 and proposes to
disable sequential table scan (SET enable_seqscan off;). But this is
no option for me since other queries still need seqscan.
Can anyone tell me if is on some agenda here (e.g. as an open item for >9.2)?
Yours, Stefan
[1] http://palominodb.com/blog/2012/03/06/considerations-about-text-searchs-big-fields-and-planner-costs
			
		Re: FTS performance issue probably due to wrong planner estimate of detoasting
От
 
		    	Pavel Stehule
		    Дата:
		        Hello you can try to wrap searching to immutable function and use following trick http://postgres.cz/wiki/PostgreSQL_SQL_Tricks#Using_IMMUTABLE_functions_as_hints_for_the_optimizer Regards Pavel Stehule 2013/2/8 Stefan Keller <sfkeller@gmail.com>: > Hi, > > I have problems with the performance of FTS in a query like this: > > SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ > plainto_tsquery('english', 'good'); > > It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped). > The planner obviously always chooses table scan: http://explain.depesz.com/s/EEE > I have to check again, if I'm doing something wrong but I'm pretty > sure it has to do with de-toasting and (wrong?) cost estimations. > > I've seen some comments here saying that estimating detoasting costs > (especially with operator "@@" and GIN index) is an open issue (since > years?). > And I found a nice blog here [1] which uses 9.2/9.1 and proposes to > disable sequential table scan (SET enable_seqscan off;). But this is > no option for me since other queries still need seqscan. > Can anyone tell me if is on some agenda here (e.g. as an open item for >9.2)? > > Yours, Stefan > > [1] http://palominodb.com/blog/2012/03/06/considerations-about-text-searchs-big-fields-and-planner-costs > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
Re: FTS performance issue probably due to wrong planner estimate of detoasting
От
 
		    	Jesper Krogh
		    Дата:
		        On 08/02/13 01:52, Stefan Keller wrote:
> Hi,
>
> I have problems with the performance of FTS in a query like this:
>
>    SELECT * FROM FullTextSearch WHERE content_tsv_gin @@
> plainto_tsquery('english', 'good');
>
> It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped).
> The planner obviously always chooses table scan: http://explain.depesz.com/s/EEE
> I have to check again, if I'm doing something wrong but I'm pretty
> sure it has to do with de-toasting and (wrong?) cost estimations.
If you havent done it .. bump up statistics target on the column and
re-analyze, see what that gives.
I have also been playing with the cost-numbers in order to get it to favour
an index-scan more often. That is lowering random_page_cost to be close to
seq_page_cost, dependent on your system, the amount of memory, etc, then
this can have negative side-effects on non-gin-queries.
--
Jesper
			
		Re: FTS performance issue probably due to wrong planner estimate of detoasting
От
 
		    	Stefan Keller
		    Дата:
		        Hi Jesper and Pavel
Thx for your hints.
I'm rather reluctant in tuning with unwanted side effects, We'll see.
I have to setup my system and db again before I can try out your tricks.
Yours, Stefan
2013/2/8 Jesper Krogh <jesper@krogh.cc>:
> On 08/02/13 01:52, Stefan Keller wrote:
>>
>> Hi,
>>
>> I have problems with the performance of FTS in a query like this:
>>
>>    SELECT * FROM FullTextSearch WHERE content_tsv_gin @@
>> plainto_tsquery('english', 'good');
>>
>> It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB
>> zipped).
>> The planner obviously always chooses table scan:
>> http://explain.depesz.com/s/EEE
>> I have to check again, if I'm doing something wrong but I'm pretty
>> sure it has to do with de-toasting and (wrong?) cost estimations.
>
> If you havent done it .. bump up statistics target on the column and
> re-analyze, see what that gives.
>
> I have also been playing with the cost-numbers in order to get it to favour
> an index-scan more often. That is lowering random_page_cost to be close to
> seq_page_cost, dependent on your system, the amount of memory, etc, then
> this can have negative side-effects on non-gin-queries.
>
> --
> Jesper