Re: Postgres as key/value store

Поиск
Список
Период
Сортировка
От Jonathan Vanasco
Тема Re: Postgres as key/value store
Дата
Msg-id 47771C82-4FC1-4FDC-834E-E6A6AE2390B2@2xlp.com
обсуждение исходный текст
Ответ на Postgres as key/value store  (snacktime <snacktime@gmail.com>)
Список pgsql-general
On Sep 27, 2014, at 7:48 PM, snacktime wrote:

> The schema is that a key is a string, and the value is a string or binary.  I am actually storing protocol buffer
messages,but the library gives me the ability to serialize to native protobuf or to json.  Json is useful at times
especiallyfor debugging. 

I don't know if this will apply to you, but i received significant speed improvements on Postgres key searches by using
substringindexes. 

If your keys are just random hashes, this would probably work well for you.
if your keys are person-readable, it's probably not going to work as well as the distribution of prefix characters will
probablybe too uniform. 

But the general idea is twofold:

    1. create an additional partial index on the key field -- CREATE INDEX _entities_id__subst_7 ON
entities(substr(id,1,7));
    2. update your SELECTS to search for both the full string AND the substring

        - WHERE id = :id
        + WHERE (id = :id) AND (substr(id,1,7) = substr(:id, 1, 7))

By adding in the substring query, the planner will (usually) optimize the select by doing a first pass on the substring
index. then it searches that limited set for the rest of matching criteria. 

on a table with 4MM+ records , introducing a substring index/query improved my searches by a few orders of magnitude.
 

before trying this indexing strategy, we were actively looking to migrate this particular query service off of postgres
--it was such a bottleneck and was not scalable.   
now there is no reason to leave in the foreseeable future.


On Sep 27, 2014, at 8:33 PM, Gavin Flower wrote:

>>  This works well because keys are left prefixed with a scope, a delimiter, and then the actual key for the data.
> Then I noticed that your id is actually a compound key, and probably would be better modelled as:

if you're able to standardize the scope out, an index of "(scope, substring(key,1,7))" might work well.

i only used 1,7 as my key arguments, because that was an optimal speed/space mix on my dataset.  depending on yours, a
shorteror longer index might be more appropriate 



В списке pgsql-general по дате отправления:

Предыдущее
От: Andy Colson
Дата:
Сообщение: Re: PG 9.3 Switch streaming to wal shipping
Следующее
От: Willy-Bas Loos
Дата:
Сообщение: synchronize DTAP