Re: jsonb format is pessimal for toast compression

Поиск
Список
Период
Сортировка
От Arthur Silva
Тема Re: jsonb format is pessimal for toast compression
Дата
Msg-id CAO_YK0VzGrtndbQiUL67-aRxD0t2-YsC9KREgZmJCOJAz6AKhg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: jsonb format is pessimal for toast compression  (Josh Berkus <josh@agliodbs.com>)
Ответы Re: jsonb format is pessimal for toast compression  (Arthur Silva <arthurprs@gmail.com>)
Список pgsql-hackers

On Thu, Sep 11, 2014 at 10:01 PM, Josh Berkus <josh@agliodbs.com> wrote:
So, I finally got time to test Tom's latest patch on this.

TLDR: we want to go with Tom's latest patch and release beta3.

Figures:

So I tested HEAD against the latest lengths patch.  Per Arthur Silva, I
checked uncompressed times for JSONB against compressed times.  This
changed the picture considerably.

TABLE SIZES
-----------

HEAD

      ?column?       | pg_size_pretty
---------------------+----------------
 json text format    | 393 MB
 jsonb: compressed   | 1147 MB
 jsonb: uncompressed | 1221 MB

PATCHED

      ?column?       | pg_size_pretty
---------------------+----------------
 json text format    | 394 MB
 jsonb: compressed   | 525 MB
 jsonb: uncompressed | 1200 MB


EXTRACTION TIMES
----------------

HEAD

Q1 (search via GIN index followed by extracting 100,000 values from rows):

jsonb compressed: 4000
jsonb uncompressed: 3250


Q2 (seq scan and extract 200,000 values from rows):

json: 11700
jsonb compressed: 3150
jsonb uncompressed: 2700


PATCHED

Q1:

jsonb compressed: 6750
jsonb uncompressed: 3350

Q2:

json: 11796
jsonb compressed: 4700
jsonb uncompressed: 2650

----------------------

Conclusion: with Tom's patch, compressed JSONB is 55% smaller when
compressed (EXTENDED).  Extraction times are 50% to 70% slower, but this
appears to be almost entirely due to decompression overhead.  When not
compressing (EXTERNAL), extraction times for patch versions are
statistically the same as HEAD, and file sizes are similar to HEAD.

USER REACTION
-------------

I polled at both PDXpgDay and at FOSS4G, asking some ~~ 80 Postgres
users how they would feel about a compression vs. extraction time
tradeoff.  The audience was evenly split.

However, with the current patch, the user can choose.  Users who know
enough for performance tuning can set JSONB columns to EXTERNAL, and the
the same performance as the unpatched version.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


The compression ratio difference is exaggerated in this case but it does support that Tom's patch alleviates the extraction penalty.

In my testings with the github archive data the savings <-> performance-penalty was fine, but I'm not confident in those results since there were only 8 top level keys.
For comparison, some twitter api objects[1] have 30+ top level keys. If I have time in the next couple of days I'll conduct some testings with the public twitter fire-hose data.

[1] https://dev.twitter.com/rest/reference/get/statuses/home_timeline

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: vacuumdb --all --analyze-in-stages - wrong order?
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: pg_dump refactor patch to remove global variables