Обсуждение: jsonb_agg performance

Поиск
Список
Период
Сортировка

jsonb_agg performance

От
jfleming@kispring.com
Дата:
The jsonb_agg function seems to have significantly worse performance than its json_agg counterpart:

=> explain analyze select pa.product_id, jsonb_agg(attributes) from product_attributes2 pa group by pa.product_id;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=1127.54..1231.62 rows=3046 width=380) (actual time=28.632..241.647 rows=3046 loops=1)
   Group Key: product_id
   ->  Sort  (cost=1127.54..1149.54 rows=8800 width=380) (actual time=28.526..32.826 rows=8800 loops=1)
         Sort Key: product_id
         Sort Method: external sort  Disk: 3360kB
         ->  Seq Scan on product_attributes2 pa  (cost=0.00..551.00 rows=8800 width=380) (actual time=0.010..7.231 rows=8800 loops=1)
 Planning time: 0.376 ms
 Execution time: 242.963 ms
(8 rows)

=> explain analyze select pa.product_id, json_agg(attributes) from product_attributes3 pa group by pa.product_id;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=1136.54..1240.62 rows=3046 width=387) (actual time=17.731..30.126 rows=3046 loops=1)
   Group Key: product_id
   ->  Sort  (cost=1136.54..1158.54 rows=8800 width=387) (actual time=17.707..20.705 rows=8800 loops=1)
         Sort Key: product_id
         Sort Method: external sort  Disk: 3416kB
         ->  Seq Scan on product_attributes3 pa  (cost=0.00..560.00 rows=8800 width=387) (actual time=0.006..5.568 rows=8800 loops=1)
 Planning time: 0.181 ms
 Execution time: 31.276 ms
(8 rows)

The only difference between the two tables is the type of the attributes column (jsonb vs json).  Each table contains the same 8800 rows.  Even running json_agg on the jsonb column seems to be faster:

=> explain analyze select pa.product_id, json_agg(attributes) from product_attributes2 pa group by pa.product_id;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=1127.54..1231.62 rows=3046 width=380) (actual time=30.626..62.943 rows=3046 loops=1)
   Group Key: product_id
   ->  Sort  (cost=1127.54..1149.54 rows=8800 width=380) (actual time=30.590..34.157 rows=8800 loops=1)
         Sort Key: product_id
         Sort Method: external sort  Disk: 3360kB
         ->  Seq Scan on product_attributes2 pa  (cost=000..551.00 rows=8800 width=380) (actual time=0.014..7.388 rows=8800 loops=1)
 Planning time: 0.142 ms
 Execution time: 64.504 ms
(8 rows)

Is it expected that jsonb_agg performance would be that much worse than json_agg?

Re: jsonb_agg performance

От
Andrew Dunstan
Дата:

On 01/29/2016 05:06 PM, jfleming@kispring.com wrote:
> The jsonb_agg function seems to have significantly worse performance
> than its json_agg counterpart:
>
> => explain analyze select pa.product_id, jsonb_agg(attributes) from
> product_attributes2 pa group by pa.product_id;
> QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------------------
>  GroupAggregate  (cost=1127.54..1231.62 rows=3046 width=380) (actual
> time=28.632..241.647 rows=3046 loops=1)
>    Group Key: product_id
>    ->  Sort  (cost=1127.54..1149.54 rows=8800 width=380) (actual
> time=28.526..32.826 rows=8800 loops=1)
>          Sort Key: product_id
>          Sort Method: external sort  Disk: 3360kB
>          ->  Seq Scan on product_attributes2 pa (cost=0.00..551.00
> rows=8800 width=380) (actual time=0.010..7.231 rows=8800 loops=1)
>  Planning time: 0.376 ms
>  Execution time: 242.963 ms
> (8 rows)
>
> => explain analyze select pa.product_id, json_agg(attributes) from
> product_attributes3 pa group by pa.product_id;
> QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------------------
>  GroupAggregate  (cost=1136.54..1240.62 rows=3046 width=387) (actual
> time=17.731..30.126 rows=3046 loops=1)
>    Group Key: product_id
>    ->  Sort  (cost=1136.54..1158.54 rows=8800 width=387) (actual
> time=17.707..20.705 rows=8800 loops=1)
>          Sort Key: product_id
>          Sort Method: external sort  Disk: 3416kB
>          ->  Seq Scan on product_attributes3 pa (cost=0.00..560.00
> rows=8800 width=387) (actual time=0.006..5.568 rows=8800 loops=1)
>  Planning time: 0.181 ms
>  Execution time: 31.276 ms
> (8 rows)
>
> The only difference between the two tables is the type of the
> attributes column (jsonb vs json).  Each table contains the same 8800
> rows.  Even running json_agg on the jsonb column seems to be faster:
>
> => explain analyze select pa.product_id, json_agg(attributes) from
> product_attributes2 pa group by pa.product_id;
> QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------------------
>  GroupAggregate  (cost=1127.54..1231.62 rows=3046 width=380) (actual
> time=30.626..62.943 rows=3046 loops=1)
>    Group Key: product_id
>    ->  Sort  (cost=1127.54..1149.54 rows=8800 width=380) (actual
> time=30.590..34.157 rows=8800 loops=1)
>          Sort Key: product_id
>          Sort Method: external sort  Disk: 3360kB
>          ->  Seq Scan on product_attributes2 pa (cost=000..551.00
> rows=8800 width=380) (actual time=0.014..7.388 rows=8800 loops=1)
>  Planning time: 0.142 ms
>  Execution time: 64.504 ms
> (8 rows)
>
> Is it expected that jsonb_agg performance would be that much worse
> than json_agg?


I do expect it to be significantly worse. Constructing jsonb is quite a
lot more expensive than constructing json, it's the later processing
that provides the performance benefit of jsonb. For 99 out of 100 uses
that I have seen there is no need to be using jsonb_agg, since the
output is almost always fed straight back to the client, not stored or
processed further in the database. Rendering json to the client is
extremely cheap, since it's already just text. Rendering jsonb as text
to the client involves a lot more processing.

cheers

andrew