Re: [GENERAL] Insert performance and disk usage in JSON vs JSONB

Поиск
Список
Период
Сортировка
От Ronny Abraham
Тема Re: [GENERAL] Insert performance and disk usage in JSON vs JSONB
Дата
Msg-id a614214645ba471ea0d9b7120c87cfad@USEPRDEX1.corp.kns.com
обсуждение исходный текст
Ответ на [GENERAL] Insert performance and disk usage in JSON vs JSONB  (Ronny Abraham <rabraham@kns.com>)
Ответы Re: [GENERAL] Insert performance and disk usage in JSON vs JSONB
Re: [GENERAL] Insert performance and disk usage in JSON vs JSONB
Список pgsql-general

Here are the attachments.

 

 

From: Ronny Abraham
Sent: Monday, May 15, 2017 1:03 PM
To: 'pgsql-general@postgresql.org'
Subject: Insert performance and disk usage in JSON vs JSONB

 

Hello all,

 

I am trying to decide whether to use JSON or JSONB to store my application data.

 

From what I read so far about JSON vs JSONB:

--------------------------------------------------------

Performance -  JSON is faster for inserts since it only odes JSON format verification, vs JSONB which also converts the jSON input to a binary JSONB with its meta-data.

However, queries are much faster for JSONB especially when using indexes.

Disk Usage – JSONB uses more space vs JSON, I assume this is due to its meta data it stores in the binary.

 

 

But, I need to take my application use cases into consideration when making the decision of JSON vs JSONB:

------------------------------------------------------------------------------------------------------------------------------

My application has a data producer which sends data to a consumer which in turn inserts the data to the DB.

I wanted to make sure I don’t run into a consumer-producer problem where my producer generates data at a rate the consumer cannot handle.

Part of that is understanding insert times in JSON and JSONB.

In my application insert time is more critical than read time, since I do not have many clients on the reading side, and time is not critical. I also wanted to experiment with disk usage.

 

So I wrote 2 small test in python one for storage and one for insert performance.

In my test db I created 2 tables, each has only one column named ‘data’, in each I store a JSON\B with 10 fields.

 

The results of the first test (disk usage) are attached (excel sheet) – in it you can see a 26% overhead in JSONB over JSON.

The second test (insert performance) results are as following (python script attached):

 

1. Inserting 1 row to JSON table, result:

                                            QUERY PLAN                                            

---------------------------------------------------------------------------------------------------

Insert on table_json10  (cost=0.00..0.01 rows=1 width=32) (actual time=0.048..0.048 rows=0 loops=1)

  ->  Result  (cost=0.00..0.01 rows=1 width=32) (actual time=0.001..0.001 rows=1 loops=1)         

Planning time: 0.035 ms                                                                           

Execution time: 0.071 ms                                                                          

(4 rows)

 

 

2. Inserting 1 row to JSONB table, result:

                                             QUERY PLAN                                            

----------------------------------------------------------------------------------------------------

Insert on table_json10b  (cost=0.00..0.01 rows=1 width=32) (actual time=0.052..0.052 rows=0 loops=1)

  ->  Result  (cost=0.00..0.01 rows=1 width=32) (actual time=0.001..0.002 rows=1 loops=1)          

Planning time: 0.018 ms                                                                             

Execution time: 0.066 ms                                                                           

(4 rows)

 

 

3. Multi insert (10) to JSON table, result:

                                                  QUERY PLAN                                                  

--------------------------------------------------------------------------------------------------------------

Insert on table_json10  (cost=0.00..0.12 rows=10 width=32) (actual time=0.045..0.045 rows=0 loops=1)         

  ->  Values Scan on "*VALUES*"  (cost=0.00..0.12 rows=10 width=32) (actual time=0.006..0.010 rows=10 loops=1)

Planning time: 0.036 ms                                                                                      

Execution time: 0.072 ms                                                                                     

(4 rows)

 

 

4. Multi insert (10) to JSONB table, result:

                                                  QUERY PLAN                                                 

--------------------------------------------------------------------------------------------------------------

Insert on table_json10b  (cost=0.00..0.12 rows=10 width=32) (actual time=0.029..0.029 rows=0 loops=1)        

  ->  Values Scan on "*VALUES*"  (cost=0.00..0.12 rows=10 width=32) (actual time=0.002..0.005 rows=10 loops=1)

Planning time: 0.021 ms                                                                                      

Execution time: 0.043 ms                                                                                      

(4 rows)

 

 

4. Insert 10,000 rows to JSON, execution time (sec):

122.855001211

 

 

5. Insert 10,000 rows to JSONB, execution time (sec):

122.128999233

 

# END TEST

 

What’s interesting is that inserting to JSONB is slightly faster than inserting to JSON.

Maybe that’s because my JSON has a flat structure (no nesting), or maybe I am doing something else wrong?

 

I was just interested in some input regarding insert performance and disk usage in JSON vs JSONB.

 

Thanks!

Ronny

 




This email is non-binding, is subject to contract, and neither Kulicke and Soffa Industries, Inc. nor its subsidiaries (each and collectively “K&S”) shall have any obligation to you to consummate the transactions herein or to enter into any agreement, other than in accordance with the terms and conditions of a definitive agreement if and when negotiated, finalized and executed between the parties. This email and all its contents are protected by International and United States copyright laws. Any reproduction or use of all or any part of this email without the express written consent of K&S is prohibited.
Вложения

В списке pgsql-general по дате отправления:

Предыдущее
От: Ronny Abraham
Дата:
Сообщение: [GENERAL] Insert performance and disk usage in JSON vs JSONB
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: [GENERAL] Coditional join of query using PostgreSQL