Discussion: [GENERAL] Slow query plan used


[GENERAL] Slow query plan used

From:
"Wetzel, Juergen (Juergen)"
Date:
I have a question concerning the query planner. I observe that the chosen query plan differs depending on the length and content of a LIKE search expression.
We have a view combining data from two tables, both containing the same number of rows (around 30000). The PostgreSQL version used is 9.3.15 on Windows.
The DDL of the tables and view is below.

The query is:
select ID, OWNERID, FOLDER, RCVSERVICE, STATUS, TASKSTATUS, CREATED
    from DOCUMENTDATA
    where  FOLDER in ('*INBOX','*DELAYED')  and
           ARCHIVED='0' and
           ( lower(subject) like lower('%Sehr%')  or  lower(mailContent) like lower('%Sehr%') ) and
           UMR_ACTUALTOPICID in
('f3fb345741000000','8048405641000000','4fc81b5541000000','d27d9c4d41000200','e9aba54d41000000','4aaf905441000a00','737e9c4d41000900',

'4aaf905441000800','3ecdec4d41000000','4aaf905441000e00','fc7e9c4d41000f00','4aaf905441000c00','11ffc54f41000000','4aaf905441000200')
                                 and
           OWNERID in ('5000239','5000238','5000234','5000113','5000237','5000236')
    order by CREATED desc, ID desc limit 150 offset 0

Only 130 rows out of the 30000 have ARCHIVED = 0.
The chosen query plan is the following:

Limit  (cost=22738.95..22739.00 rows=20 width=664) (actual time=13929.849..13929.869 rows=98 loops=1)
  Buffers: shared hit=221263 read=45723
  ->  Sort  (cost=22738.95..22739.00 rows=20 width=664) (actual time=13929.848..13929.863 rows=98 loops=1)
        Sort Key: c3k_document.created, c3k_document.id
        Sort Method: quicksort  Memory: 87kB
        Buffers: shared hit=221263 read=45723
        ->  Nested Loop  (cost=0.42..22738.52 rows=20 width=664) (actual time=95.508..13929.478 rows=98 loops=1)
              Buffers: shared hit=221263 read=45723
              ->  Seq Scan on c3k_document_index  (cost=0.00..15160.48 rows=1063 width=285) (actual time=0.206..13539.353 rows=33022 loops=1)
                    Filter: ((lower((searchfield1)::text) ~~ '%sehr%'::text) OR (lower(searchfield8) ~~
'%sehr%'::text))
                    Rows Removed by Filter: 33832
                    Buffers: shared hit=101130 read=33463
              ->  Index Scan using c3k_docume_6720023941 on c3k_document  (cost=0.42..7.12 rows=1 width=387) (actual time=0.011..0.011 rows=0 loops=33022)
                    Index Cond: (id = c3k_document_index.documentid)
                    Filter: (((folder)::text = ANY ('{*INBOX,*DELAYED}'::text[])) AND (archived = '0'::bpchar) AND ((umr_actualtopicid)::text = ANY ('{f3fb345741000000,8048405641000000,4fc81b5541000000,d27d9c4d41000200,e9aba54d41000000,4aaf905441000a00,737(...)
                    Rows Removed by Filter: 1
                    Buffers: shared hit=120133 read=12260
Total runtime: 13930.186 ms

If the LIKE expression is lower('%%Sehr%%') for the fields subject and mailContent, a much faster query plan is chosen:

Limit  (cost=24018.18..24018.42 rows=97 width=664) (actual time=61.110..61.130 rows=98 loops=1)
  Buffers: shared hit=1961 read=598
  ->  Sort  (cost=24018.18..24018.42 rows=97 width=664) (actual time=61.109..61.122 rows=98 loops=1)
        Sort Key: c3k_document.created, c3k_document.id
        Sort Method: quicksort  Memory: 87kB
        Buffers: shared hit=1961 read=598
        ->  Nested Loop  (cost=515.26..24014.98 rows=97 width=664) (actual time=5.193..60.851 rows=98 loops=1)
              Buffers: shared hit=1961 read=598
              ->  Bitmap Heap Scan on c3k_document  (cost=514.96..16137.34 rows=1232 width=387) (actual time=2.137..10.754 rows=282 loops=1)
                    Recheck Cond: (((folder)::text = ANY ('{*INBOX,*DELAYED}'::text[])) AND (archived = '0'::bpchar))
                    Filter: (((umr_actualtopicid)::text = ANY
('{f3fb345741000000,8048405641000000,4fc81b5541000000,d27d9c4d41000200,e9aba54d41000000,4aaf905441000a00,737e9c4d41000900,4aaf905441000800,3ecdec4d41000000,4aaf905441000e00,fc7e9c4d41000f00,11ff
(...)
                    Rows Removed by Filter: 23
                    Buffers: shared hit=828 read=356
                    ->  Bitmap Index Scan on c3k_document_folder_archived_umr_orgtopicid_idx  (cost=0.00..514.66 rows=6183 width=0) (actual time=1.946..1.946 rows=2847 loops=1)
                          Index Cond: (((folder)::text = ANY ('{*INBOX,*DELAYED}'::text[])) AND (archived =
'0'::bpchar))
                          Buffers: shared hit=7 read=113
              ->  Index Scan using c3k_docume_7965268402 on c3k_document_index  (cost=0.29..6.38 rows=1 width=285) (actual time=0.176..0.176 rows=0 loops=282)
                    Index Cond: (documentid = c3k_document.id)
                    Filter: ((lower((searchfield1)::text) ~~ '%%sehr%%'::text) OR (lower(searchfield8) ~~
'%%sehr%%'::text))
                    Rows Removed by Filter: 1
                    Buffers: shared hit=1133 read=242
Total runtime: 61.385 ms

But when the LIKE expression is only a few characters longer, the first - slower - query plan is used again. I thought, as there's such a strong restriction by the field ARCHIVED, an index containing that field would always be used. VACUUM ANALYZE and REINDEX had no influence on the query plan.
Can you please give me a hint why the behavior is that way, and whether there's a possibility to force the use of the faster query plan?

Thanks,
Juergen

CREATE OR REPLACE VIEW c3k_documentdata AS
 SELECT document.id,
    document.ownerid,
    document.folder,
    document.doccomment,
    document.rcvservice,
    document.status,
    document.taskstatus,
    document.created,
    document.archived,
    document.frominfo,
    document.ticketid,
    document.actualtopicid,
    document.orgtopicid,
    document_index.searchfield1 AS subject,
    document_index.searchfield2 AS fromaddress,
    document_index.searchfield3 AS toaddress,
    document_index.searchfield4 AS ccaddress,
    document_index.searchfield5 AS bccaddress,
    document_index.searchfield6 AS replytoaddress,
    document_index.searchfield7 AS hasattachment,
    document_index.searchfield8 AS mailcontent,
    document_index.searchfield9 AS mustnotautoreply,
    document_index.searchfield10 AS returnpath,
    document_index.searchfield11 AS stateweight,
    document_index.searchfield12 AS fromaddressextracted,
    document_index.searchfield13 AS sensitivity,
    document_index.searchfield14 AS priority,
    document.docfile
   FROM document
     LEFT JOIN document_index ON document.id = document_index.documentid;

CREATE TABLE document
(
  id bigserial NOT NULL,
  ownerid character(49),
  folder character varying(49),
  created timestamp without time zone NOT NULL,
  touched timestamp without time zone NOT NULL,
  status character varying(31),
  frominfo character varying(255),
  rcvservice character varying(15),
  doccomment character varying(255),
  docfile bytea,
  taskstatus character varying(31) DEFAULT 'INACTIVE'::character varying,
  archived character(1) NOT NULL DEFAULT '0'::bpchar,
  ticketid character varying(25) NOT NULL DEFAULT ''::character varying,
  orgtopicid character varying(25) NOT NULL DEFAULT ''::character varying,
  actualtopicid character varying(25) NOT NULL DEFAULT ''::character varying,
  CONSTRAINT c3k_docume_6720023941 PRIMARY KEY (id)
);

CREATE INDEX c3k_document_archived_ticketid_idx
  ON c3k_document
  USING btree
  (archived COLLATE pg_catalog."default", ticketid COLLATE pg_catalog."default");

CREATE INDEX c3k_document_folder_archived_orgtopicid_idx
  ON c3k_document
  USING btree
  (folder COLLATE pg_catalog."default", archived COLLATE pg_catalog."default", orgtopicid COLLATE pg_catalog."default");

CREATE INDEX c3k_document_ownerid_folder_created_idx
  ON c3k_document
  USING btree
  (ownerid COLLATE pg_catalog."default", folder COLLATE pg_catalog."default", created);

CREATE INDEX c3k_document_ownerid_folder_status_idx
  ON c3k_document
  USING btree
  (ownerid COLLATE pg_catalog."default", folder COLLATE pg_catalog."default", status COLLATE pg_catalog."default");



CREATE TABLE document_index
(
  documentid bigint NOT NULL,
  searchfield1 character varying(255),
  searchfield2 character varying(255),
  searchfield3 character varying(255),
  searchfield4 character varying(255),
  searchfield5 character varying(255),
  searchfield6 character varying(255),
  searchfield7 character varying(255),
  searchfield8 text,
  searchfield9 character varying(255),
  searchfield10 character varying(255),
  searchfield11 character varying(255),
  searchfield12 character varying(255),
  searchfield13 character varying(255) DEFAULT '0'::character varying,
  searchfield14 character varying(255) DEFAULT '1'::character varying,
  searchfield15 character varying(255),
  searchfield16 text,
  CONSTRAINT docume_7965268402 PRIMARY KEY (documentid),
  CONSTRAINT docindex_docid_fk FOREIGN KEY (documentid)
      REFERENCES document (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE CASCADE
);



Re: [GENERAL] Slow query plan used

From:
Andreas Kretschmer
Date:


On 30.05.2017 at 10:42, Wetzel, Juergen (Juergen) wrote:

> I have a question concerning the query planner. I observe that chosen query plan differs on length and content of a like
> search expression. We have a view combining data from two tables, both containing same number of rows (round about 30000).
> Used PostgreSQL version is 9.3.15 on Windows. DDL of tables and view see below.


The query is on DOCUMENTDATA, but the plan and DDL refer to c3k_documentdata and other table names.

> Only 130 rows out of the 30000 have ARCHIVED = 0

In this case I would suggest a partial index:

create index <indexname> on <tablename> (archived) where archived = 0;

You can also increase the statistics for the columns subject and mailContent:


ALTER TABLE <tablename> ALTER COLUMN <column_name> SET STATISTICS 1000;

(as an example; the default value is 100)
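
Applied to the tables from the original post, the two suggestions might be sketched as follows (the index name is invented; searchfield1 and searchfield8 back the subject and mailContent view columns in the posted DDL):

```sql
-- Partial index covering only the few non-archived rows
-- (archived is character(1), so the literal is quoted):
CREATE INDEX document_archived_partial_idx
    ON document (archived)
    WHERE archived = '0';

-- Raise the statistics target for the searched columns:
ALTER TABLE document_index ALTER COLUMN searchfield1 SET STATISTICS 1000;
ALTER TABLE document_index ALTER COLUMN searchfield8 SET STATISTICS 1000;

-- Re-collect statistics so the new target takes effect:
ANALYZE document_index;
```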


Regards, Andreas
-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com

Re: [GENERAL] Slow query plan used

From:
"Wetzel, Juergen (Juergen)"
Date:
>> Only 130 rows out of the 30000 have ARCHIVED = 0

> in this case i would suggest a partial index:
> create index <indexname> on <tablename> (archived) where archived = 0;

Thanks, Andreas.

Sorry for the confusion about the table names.
The hint with the partial index sounds as it could solve the problem. I will test it.

Regards
Jürgen


Re: [GENERAL] Slow query plan used

From:
Andreas Kretschmer
Date:

On 31.05.2017 at 13:27, Wetzel, Juergen (Juergen) wrote:
>>> Only 130 rows out of the 30000 have ARCHIVED = 0
>> in this case i would suggest a partial index:
>> create index <indexname> on <tablename> (archived) where archived = 0;
> Thanks, Andreas.
>
> Sorry for the confusion about the table names.
> The hint with the partial index sounds as it could solve the problem. I will test it.
>

you are welcome. please keep me informed if that helps you.

Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com



Re: [GENERAL] Slow query plan used

From:
"Wetzel, Juergen (Juergen)"
Date:

On 31.05.2017 at 13:27, Wetzel, Juergen (Juergen) wrote:
>>> Only 130 rows out of the 30000 have ARCHIVED = 0
>> in this case i would suggest a partial index:
>> create index <indexname> on <tablename> (archived) where archived = 0;
> Thanks, Andreas.
>
> Sorry for the confusion about the table names.
> The hint with the partial index sounds as it could solve the problem. I will test it.
>

Hi,

I have now created a partial index:
    create index on document (archived) where archived = '0';

But the result is the same as before: a short LIKE expression enclosed in doubled %-signs leads to a fast query plan, whereas a longer LIKE expression or the use of a single %-sign produces a much slower query. Please see the query plans below. Most surprising to me is the influence of the LIKE expression, especially of the doubled %-sign on short expressions. Any other ideas how to speed up that query, or what is going on here in general?

Thanks,
Jürgen

Limit  (cost=24327.12..24327.21 rows=38 width=662) (actual time=15373.542..15373.565 rows=136 loops=1)
  Buffers: shared hit=264747 read=51242 dirtied=6 written=16
  ->  Sort  (cost=24327.12..24327.21 rows=38 width=662) (actual time=15373.541..15373.557 rows=136 loops=1)
        Sort Key: document.created, document.id
        Sort Method: quicksort  Memory: 102kB
        Buffers: shared hit=264747 read=51242 dirtied=6 written=16
        ->  Nested Loop  (cost=0.42..24326.12 rows=38 width=662) (actual time=10.951..15372.914 rows=136 loops=1)
              Buffers: shared hit=264741 read=51242 dirtied=6 written=16
              ->  Seq Scan on document_index  (cost=0.00..15403.68 rows=1257 width=289) (actual time=0.205..14901.743 rows=38545 loops=1)
                    Filter: ((lower((searchfield1)::text) ~~ '%sehr%'::text) OR (lower(searchfield8) ~~
'%sehr%'::text))
                    Rows Removed by Filter: 40531
                    Buffers: shared hit=123181 read=38272 dirtied=5 written=12
              ->  Index Scan using document_6720023941 on document  (cost=0.42..7.09 rows=1 width=381) (actual time=0.011..0.011 rows=0 loops=38545)
                    Index Cond: (id = document_index.documentid)
                    Filter: (((folder)::text = ANY ('{*INBOX,*DELAYED}'::text[])) AND (archived = '0'::bpchar) AND ((umr_actualtopicid)::text = ANY ('{f3fb345741000000,8048405641000000,4fc81b5541000000,d27d9c4d41000200,e9aba54d41000000,4aaf905441000a00,737(...)
                    Rows Removed by Filter: 1
                    Buffers: shared hit=141560 read=12970 dirtied=1 written=4
Total runtime: 15373.763 ms


Limit  (cost=34194.37..34194.74 rows=150 width=662) (actual time=181.502..181.525 rows=134 loops=1)
  Buffers: shared hit=8022 read=277
  ->  Sort  (cost=34194.37..34194.83 rows=186 width=662) (actual time=181.501..181.516 rows=134 loops=1)
        Sort Key: document.created, c3k_document.id
        Sort Method: quicksort  Memory: 101kB
        Buffers: shared hit=8022 read=277
        ->  Nested Loop  (cost=3546.02..34187.36 rows=186 width=662) (actual time=32.660..181.064 rows=134 loops=1)
              Buffers: shared hit=8022 read=277
              ->  Bitmap Heap Scan on document  (cost=3545.61..19272.79 rows=2375 width=381) (actual time=22.771..96.683 rows=458 loops=1)
                    Recheck Cond: ((archived = '0'::bpchar) AND (ownerid = ANY
('{5000239,5000238,5000234,5000113,5000237,5000236,5000230,5000112,5000233,5000111,5000232,13,15,16,18,19,5000249,5000246,5000124,5000245,5000127,5000247,5000242,5000120,5000123
(...)
                    Rows Removed by Index Recheck: 15733
                    Filter: ((actualtopicid)::text = ANY
('{f3fb345741000000,8048405641000000,4fc81b5541000000,d27d9c4d41000200,e9aba54d41000000,4aaf905441000a00,737e9c4d41000900,3ecdec4d41000000,4aaf905441000800,4aaf905441000e00,fc7e9c4d41000f00,11ffc
(...)
                    Rows Removed by Filter: 27
                    Buffers: shared hit=5677
                    ->  BitmapAnd  (cost=3545.61..3545.61 rows=6228 width=0) (actual time=22.056..22.056 rows=0
loops=1)
                          Buffers: shared hit=2470
                          ->  Bitmap Index Scan on document_archived_idx  (cost=0.00..1131.17 rows=54784 width=0) (actual time=11.694..11.694 rows=60295 loops=1)
                                Index Cond: (archived = '0'::bpchar)
                                Buffers: shared hit=184
                          ->  Bitmap Index Scan on document_ownerid_folder_status_idx  (cost=0.00..2413.00 rows=8973 width=0) (actual time=8.718..8.718 rows=14962 loops=1)
                                Index Cond: ((ownerid = ANY
('{5000239,5000238,5000234,5000113,5000237,5000236,5000230,5000112,5000233,5000111,5000232,13,15,16,18,19,5000249,5000246,5000124,5000245,5000127,5000247,5000242,5000120,5000123,5000244,5000122,50
(...)
                                Buffers: shared hit=2286
              ->  Index Scan using document_7965268402 on document_index  (cost=0.42..6.27 rows=1 width=289) (actual time=0.182..0.183 rows=0 loops=458)
                    Index Cond: (documentid = document.id)
                    Filter: ((lower((searchfield1)::text) ~~ '%%sehr%%'::text) OR (lower(searchfield8) ~~
'%%sehr%%'::text))
                    Rows Removed by Filter: 1
                    Buffers: shared hit=2345 read=277
Total runtime: 184.053 ms


Re: [GENERAL] Slow query plan used

From:
Andreas Kretschmer
Date:
On 01.06.2017 at 14:07, Wetzel, Juergen (Juergen) wrote:
>
> On 31.05.2017 at 13:27, Wetzel, Juergen (Juergen) wrote:
>>>> Only 130 rows out of the 30000 have ARCHIVED = 0
>>> in this case i would suggest a partial index:
>>> create index <indexname> on <tablename> (archived) where archived = 0;
>> Thanks, Andreas.
>>
>> Sorry for the confusion about the table names.
>> The hint with the partial index sounds as it could solve the problem. I will test it.
>>
> Hi,
>
> I created now a partial index
>     create index on document (archived) where archived = '0';

Just to be sure: this syntax is wrong, the index name is missing. But it seems
the index is document_archived_idx ...


>
> But the result is the same as before: a short LIKE expression enclosed in doubled %-signs leads to a fast query plan, whereas a longer LIKE expression or the use of a single %-sign produces a much slower query. Please see the query plans below. Most surprising to me is the influence of the LIKE expression, especially of the doubled %-sign on short expressions. Any other ideas how to speed up that query, or what is going on here in general?
>

Please consider my plan B) and increase the stats. See my other mail.


Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com



Re: [GENERAL] Slow query plan used

From:
Bill Moran
Date:
On Thu, 1 Jun 2017 16:45:17 +0200
Andreas Kretschmer <andreas@a-kretschmer.de> wrote:
>
> On 01.06.2017 at 14:07, Wetzel, Juergen (Juergen) wrote:
> >
> > On 31.05.2017 at 13:27, Wetzel, Juergen (Juergen) wrote:
> >>>> Only 130 rows out of the 30000 have ARCHIVED = 0
> >>> in this case i would suggest a partial index:
> >>> create index <indexname> on <tablename> (archived) where archived = 0;
> >> Thanks, Andreas.
> >>
> >> Sorry for the confusion about the table names.
> >> The hint with the partial index sounds as it could solve the problem. I will test it.
> >>
> > Hi,
> >
> > I created now a partial index
> >     create index on document (archived) where archived = '0';
>
> just to be sure: this syntay is wrong, missing index-name. But it seems
> the index is document_archived_idx ...
>
> > But the result is the same as before: a short LIKE expression enclosed in doubled %-signs leads to a fast query plan, whereas a longer LIKE expression or the use of a single %-sign produces a much slower query. Please see the query plans below. Most surprising to me is the influence of the LIKE expression, especially of the doubled %-sign on short expressions. Any other ideas how to speed up that query, or what is going on here in general?

LIKE queries are probably challenging to plan, especially when they're not
left-anchored: how can the planner reasonably be expected to estimate how
many rows will be matched by a given LIKE expression?

Not having looked at the code, I would guess that the length of the LIKE
expression makes the planner assume that the match is more restrictive,
while many % and _ characters in the LIKE expression make the planner assume
that the match is less restrictive. Extrapolate that into guessing a number
of matched tuples and how that fits into the overall plan and you'll probably
give yourself a brain aneurysm. While having a detailed understanding of
exactly how the planner makes such decisions is certainly worthwhile, I would
recommend a more pragmatic approach: try things and see what works.

That in mind, let me throw pg_trgm into the mix of things to try:
https://www.postgresql.org/docs/current/static/pgtrgm.html
The trigram module allows you to create indexes that LIKE can use
to do index searches instead of always having to do sequential scans
or push the LIKE matching to another part of the plan tree. Based on
your described situation, I have a theory that it might improve things
quite a bit.
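
As a sketch, trigram indexes matching the lower(...) LIKE predicates from the query might look like this (index names invented):

```sql
-- pg_trgm ships with contrib; the extension must be installed first:
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- GIN trigram indexes on the lower-cased expressions, so they match
-- the lower(searchfieldN) LIKE '%...%' predicates in the query:
CREATE INDEX document_index_sf1_trgm_idx
    ON document_index USING gin (lower(searchfield1) gin_trgm_ops);
CREATE INDEX document_index_sf8_trgm_idx
    ON document_index USING gin (lower(searchfield8) gin_trgm_ops);
```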

--
Bill Moran <wmoran@potentialtech.com>


Re: [GENERAL] Slow query plan used

From:
Tom Lane
Date:
Bill Moran <wmoran@potentialtech.com> writes:
> LIKE queries are probably challenging to plan, especially when they're not
> left-anchored: how can the planner be reasonalbly expected to estimate how
> many rows will be matched by a given LIKE expression.

Yeah, especially without any statistics.  The core problem here appears
to be the poor rowcount estimate for the LIKE:

              ->  Seq Scan on c3k_document_index  (cost=0.00..15160.48 rows=1063 width=285) (actual time=0.206..13539.353 rows=33022 loops=1)
                    Filter: ((lower((searchfield1)::text) ~~ '%sehr%'::text) OR (lower(searchfield8) ~~
'%sehr%'::text))
                    Rows Removed by Filter: 33832

That's off by a factor of 30.

> Not having looked at the code, I would guess that the length of the LIKE
> expression will make the planner assume that the match is more restrictive,
> while many % and _ in the LIKE expression make the planner assume that the
> match is less restrictive.

Exactly.  It's not terribly bright, but the extra %'s make it think that
the pattern is less selective, producing a higher rowcount estimate that
-- by chance -- better matches reality.  Then you get better decisions
about how to shape the rest of the plan.

You might get some traction by creating indexes on lower(searchfield1)
etc.  This isn't even necessarily with an expectation that the planner
would use those indexes in the plan ... but what it would do is make
use of the statistics that ANALYZE will accumulate about the indexed
expressions.  I think that would give you better estimates about the
LIKE rowcounts.  You might have to crank up the statistics target for
those indexes if the default isn't enough to make the estimates
significantly better.  (Obviously, don't forget to re-ANALYZE before
checking results.)
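
Following that suggestion, the expression indexes could be created like this (index names invented):

```sql
-- Btree indexes on the expressions used in the WHERE clause; after
-- this, ANALYZE also gathers statistics for lower(searchfield1) and
-- lower(searchfield8), which the planner can use for the LIKE
-- selectivity estimates:
CREATE INDEX document_index_lower_sf1_idx
    ON document_index (lower(searchfield1));
CREATE INDEX document_index_lower_sf8_idx
    ON document_index (lower(searchfield8));

-- Re-ANALYZE before checking results:
ANALYZE document_index;
```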

> That in mind, let me throw pg_trgm into the mix of things to try:
> https://www.postgresql.org/docs/current/static/pgtrgm.html
> The trigram module allows you to create indexes that LIKE can use
> to do index searches instead of always having to do sequential scans
> or push the LIKE matching to another part of the plan tree.

For cases like the above, where half the table is getting selected,
I would not expect the planner to use an index anyway.  Any old index
on lower(searchfieldN) will work in terms of inducing ANALYZE to
collect stats.  But if you have other queries with more selective
LIKE patterns then maybe a trigram index in particular is worthwhile.

            regards, tom lane


Re: [GENERAL] Slow query plan used

From:
"Wetzel, Juergen (Juergen)"
Date:
Andreas Kretschmer <andreas@a-kretschmer.de> writes:
> please consider my plan B) and increase the stats. See my other mail.

I tried that also. Combined with the partial index. But still same result.

Bill Moran <wmoran@potentialtech.com> writes:
> LIKE queries are probably challenging to plan, especially when they're
> not
> left-anchored: how can the planner be reasonalbly expected to estimate
> how many rows will be matched by a given LIKE expression.

That's clear to me. And because of that I expected the planner to use the table document as the outer table in the nested loop join, especially as there is an index available which restricts it to only 130 rows out of the 30000.

Tom Lane <tgl@sss.pgh.pa.us> writes:
> You might get some traction by creating indexes on lower(searchfield1) etc.  This isn't even necessarily with an expectation that the planner would use
> those indexes in the plan ... but what it would do is make use of the statistics that ANALYZE will accumulate about the indexed expressions.  I think that
> would give you better estimates about the LIKE rowcounts.  You might have to crank up the statistics target for those indexes if the default isn't enough to
> make the estimates significantly better.  (Obviously, don't forget to re-ANALYZE before checking results.)

I will try that. Does that mean the column statistics will only be collected when there's an index on the table/column?

Thanks for all your hints. I will go on and try.



Re: [GENERAL] Slow query plan used

From:
Tom Lane
Date:
"Wetzel, Juergen (Juergen)" <wetzel@avaya.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> You might get some traction by creating indexes on lower(searchfield1) etc.

> I will try that. Does that mean the column statistics will only be collected when there's an index on the
table/column?

No; ANALYZE collects stats on plain columns automatically.  The point
is that your query is not on the plain column searchfield1, it is on
lower(searchfield1).  The planner cannot make a reasonable estimate
for "something LIKE constant-pattern" unless it can find statistics
that are specifically for the "something" value.

            regards, tom lane