Re: [GENERAL] full text search on hstore or json with materialized view?

Поиск
Список
Период
Сортировка
От Rj Ewing
Тема Re: [GENERAL] full text search on hstore or json with materialized view?
Дата
Msg-id CAOSSsV0HXviU8aKUuBFa1yautVfsXZzL2NhihPeUF9+mR5Mp2w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [GENERAL] full text search on hstore or json with materialized view?  (George Neuner <gneuner2@comcast.net>)
Список pgsql-general


On Wed, Apr 19, 2017 at 6:44 PM, George Neuner <gneuner2@comcast.net> wrote:
On Wed, 19 Apr 2017 11:57:26 -0700, Rj Ewing <ewing.rj@gmail.com>
wrote:

>I did some testing using a secondary table with the key, value column.
>However I don't think this will provide the performance that we need.
>Queries we taking 60+ seconds just for a count.

SELECT count(*) or filtered?

Either way, your statistics may be way off.  Did you vacuum analyze
the table after the inserts (and the index creation if it was done
separately)?
 
​I think my statistics were off. I never ran vacuum analyze.

Without more detail re: your hardware, Postgresql version, what
indexes are/will be available, the types of queries you want to run,
etc., it's very hard to give really meaningful suggestions.

​postgresql 9.6, currently testing on 8gb ram, but have upto 64gb for production. 7-core 2.10GHz​.

mostly want to run ad-hoc queries, returning entire row, matching 1 - 3 k:v conditions.

The kind of query you have alluded to is pretty easily parallelized:
it can be spread over multiple sessions with result aggregation done
on the client side.

Or, if you you have 9.6, you might try using backend parallelism:
https://www.postgresql.org/docs/9.6/static/parallel-query.html
[I've not used this, but some people have done it successfully.]
 

If you can restrict the FTS query to certain keys:

  SELECT id FROM mytable
    WHERE tsquery( ... ) @@ to_tsvector(v)
    AND k IN ( ... )
    GROUP BY id

  [note: according to David Rowley, GROUP BY may be parallelized
         whereas  DISTINCT currently cannot be.]

then given an index on 'k' it may be much faster than just the FTS
query alone.  Subject to key variability, it also may be improved by
table partitioning to reduce the search space.

If the FTS query is key restricted, you can parallelize either on the
client or on the server.  If the FTS query is not key restricted, you
pretty much are limited to server side (and 9.6 or later).

​I'll look into parallelism if we can't get the performance we need.

​What do you mean if I can restrict the FTS query to certain keys? I'm not a sql expert, but it seems like the above query would match multiple keys to 1 tsquery value?

We need to be able to do AND conditions with separate k:v pairs. Our keys are know ahead of time, and would vary for a given table, but would be in the range of 30-60 keys per table.

В списке pgsql-general по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: [GENERAL] Unable to upload backups
Следующее
От: Scott Marlowe
Дата:
Сообщение: Re: [GENERAL] Large data and slow queries