Re: Problems with FTS

Поиск
Список
Период
Сортировка
От Rauan Maemirov
Тема Re: Problems with FTS
Дата
Msg-id CAFw_bqOdQqE0fUVntCX-aV+oZnzEgpmGqce=ODktJkNktFEEBQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Problems with FTS  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Problems with FTS
Список pgsql-performance
The problem has returned back, and here's the results, as you've said it's faster now:

SET enable_seqscan=off;
EXPLAIN ANALYZE SELECT "v"."id", "v"."title" FROM "video" AS "v"
WHERE (v.active) AND (v.fts @@ 'dexter:A|season:A|seri:A|декстер:A|качество:A|сезон:A|серия:A'::tsquery and v.id <> 500563 ) 
ORDER BY COALESCE(ts_rank_cd( '{0.1, 0.2, 0.7, 1.0}', v.fts, 'dexter:A|season:A|seri:A|декстер:A|качество:A|сезон:A|серия:A'::tsquery), 1) DESC, v.views DESC 
LIMIT 6

Limit  (cost=219631.83..219631.85 rows=6 width=287) (actual time=1850.567..1850.570 rows=6 loops=1)
  ->  Sort  (cost=219631.83..220059.05 rows=170886 width=287) (actual time=1850.565..1850.566 rows=6 loops=1)
        Sort Key: (COALESCE(ts_rank_cd('{0.1,0.2,0.7,1}'::real[], fts, '( ( ( ( ( ''dexter'':A | ''season'':A ) | ''seri'':A ) | ''декстер'':A ) | ''качество'':A ) | ''сезон'':A ) | ''серия'':A'::tsquery), 1::real)), views
        Sort Method:  top-N heapsort  Memory: 26kB
        ->  Bitmap Heap Scan on video v  (cost=41180.92..216568.73 rows=170886 width=287) (actual time=214.842..1778.830 rows=103087 loops=1)
              Recheck Cond: (fts @@ '( ( ( ( ( ''dexter'':A | ''season'':A ) | ''seri'':A ) | ''декстер'':A ) | ''качество'':A ) | ''сезон'':A ) | ''серия'':A'::tsquery)
              Filter: (active AND (id <> 500563))
              ->  Bitmap Index Scan on idx_video_fts  (cost=0.00..41138.20 rows=218543 width=0) (actual time=170.206..170.206 rows=171945 loops=1)
                    Index Cond: (fts @@ '( ( ( ( ( ''dexter'':A | ''season'':A ) | ''seri'':A ) | ''декстер'':A ) | ''качество'':A ) | ''сезон'':A ) | ''серия'':A'::tsquery)
Total runtime: 1850.632 ms


Should I use this instead?

2011/1/15 Robert Haas <robertmhaas@gmail.com>
On Tue, Jan 11, 2011 at 3:16 AM, Rauan Maemirov <rauan@maemirov.com> wrote:
> Hi, Kevin.
> Sorry for long delay.
> EXPLAIN ANALYZE SELECT "v"."id", "v"."title" FROM "video" AS "v"
> WHERE (v.active) AND (v.fts @@
> 'dexter:A|season:A|seri:A|декстер:A|качество:A|сезон:A|серия:A'::tsquery and
> v.id <> 500563 )
> ORDER BY COALESCE(ts_rank_cd( '{0.1, 0.2, 0.7, 1.0}', v.fts,
> 'dexter:A|season:A|seri:A|декстер:A|качество:A|сезон:A|серия:A'::tsquery),
> 1) DESC, v.views DESC
> LIMIT 6
> "Limit  (cost=103975.50..103975.52 rows=6 width=280) (actual
> time=2893.193..2893.199 rows=6 loops=1)"
> "  ->  Sort  (cost=103975.50..104206.07 rows=92228 width=280) (actual
> time=2893.189..2893.193 rows=6 loops=1)"
> "        Sort Key: (COALESCE(ts_rank_cd('{0.1,0.2,0.7,1}'::real[], fts, '( (
> ( ( ( ''dexter'':A | ''season'':A ) | ''seri'':A ) | ''декстер'':A ) |
> ''качество'':A ) | ''сезон'':A ) | ''серия'':A'::tsquery), 1::real)), views"
> "        Sort Method:  top-N heapsort  Memory: 25kB"
> "        ->  Seq Scan on video v  (cost=0.00..102322.34 rows=92228
> width=280) (actual time=0.100..2846.639 rows=54509 loops=1)"
> "              Filter: (active AND (fts @@ '( ( ( ( ( ''dexter'':A |
> ''season'':A ) | ''seri'':A ) | ''декстер'':A ) | ''качество'':A ) |
> ''сезон'':A ) | ''серия'':A'::tsquery) AND (id <> 500563))"
> "Total runtime: 2893.264 ms"
> Table scheme:
> CREATE TABLE video
> (
>   id bigserial NOT NULL,
>   hash character varying(12),
>   account_id bigint NOT NULL,
>   category_id smallint NOT NULL,
>   converted boolean NOT NULL DEFAULT false,
>   active boolean NOT NULL DEFAULT true,
>   title character varying(255),
>   description text,
>   tags character varying(1000),
>   authorized boolean NOT NULL DEFAULT false,
>   adult boolean NOT NULL DEFAULT false,
>   views bigint DEFAULT 0,
>   rating real NOT NULL DEFAULT 0,
>   screen smallint DEFAULT 2,
>   duration smallint,
>   "type" smallint DEFAULT 0,
>   mp4 smallint NOT NULL DEFAULT 0,
>   size bigint,
>   size_high bigint DEFAULT 0,
>   source character varying(255),
>   storage_id smallint NOT NULL DEFAULT 1,
>   rule_watching smallint,
>   rule_commenting smallint,
>   count_comments integer NOT NULL DEFAULT 0,
>   count_likes integer NOT NULL DEFAULT 0,
>   count_faves integer NOT NULL DEFAULT 0,
>   fts tsvector,
>   modified timestamp without time zone NOT NULL DEFAULT now(),
>   created timestamp without time zone DEFAULT now(),
>   CONSTRAINT video_pkey PRIMARY KEY (id),
>   CONSTRAINT video_hash_key UNIQUE (hash)
> )
> WITH (
>   OIDS=FALSE
> );
> Indexes:
> CREATE INDEX idx_video_account_id  ON video  USING btree  (account_id);
> CREATE INDEX idx_video_created  ON video  USING btree  (created);
> CREATE INDEX idx_video_fts  ON video  USING gin  (fts);
> CREATE INDEX idx_video_hash  ON video  USING hash  (hash);
> (here I tried both gist and gin indexes)
> I have 32Gb ram and 2 core quad E5520, 2.27GHz (8Mb cache).
> Pgsql conf:
> max_connections = 200
> shared_buffers = 7680MB
> work_mem = 128MB
> maintenance_work_mem = 1GB
> effective_cache_size = 22GB
> default_statistics_target = 100
> Anything else?

For returning that many rows, an index scan might actually be slower.
Maybe it's worth testing.  Try:

SET enable_seqscan=off;
EXPLAIN ANALYZE ...

and see what you get.  If it's slower, well, then be happy it didn't
use the index (maybe the question is... what index should you have
instead?).  If it's faster, post the results...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-performance по дате отправления:

Предыдущее
От: Jeremy Harris
Дата:
Сообщение: Re: external sort performance
Следующее
От: Christiaan Willemsen
Дата:
Сообщение: SSD endurance calculations