Optimization of some jsonb functions

Поиск
Список
Период
Сортировка
От Nikita Glukhov
Тема Optimization of some jsonb functions
Дата
Msg-id 7c417f90-f95f-247e-ba63-d95e39c0ad14@postgrespro.ru
обсуждение исходный текст
Ответы Re: Optimization of some jsonb functions  (David Steele <david@pgmasters.net>)
Список pgsql-hackers
Attached set of patches with some jsonb optimizations that were made during
comparison of performance of ordinal jsonb operators and jsonpath operators.

1. Optimize JsonbExtractScalar():  It is better to use getIthJsonbValueFromContainer(cont, 0) instead of  JsonIterator to get 0th element of raw-scalar pseudoarray.  JsonbExtractScalar() is used in jsonb casts, so they speed up a bit.

2. Optimize operator #>>, jsonb_each_text(), jsonb_array_elements_text():  These functions have direct conversion (JsonbValue => text) only for  jbvString scalars, but indirect conversion of other types of scalars   (JsonbValue => jsonb => text) is obviously too slow.  Extracted common  subroutine JsonbValueAsText() and used in all suitable places.

3. Optimize JsonbContainer type recognition in get_jsonb_path_all():  Fetching of the first token from JsonbIterator is replaced with lightweight  JsonbContainerIsXxx() macros.

4. Extract findJsonbKeyInObject():  Extracted findJsonbKeyInObject() from findJsonbValueFromContainer(),  which is slightly easier to use (key string and its length is passed instead  of filled string JsonbValue).

5. Optimize resulting value allocation in findJsonbValueFromContainer() and  getIthJsonbValueFromContainer():  Added ability to pass stack-allocated JsonbValue that will be filled with  the result of operation instead of returning unconditionally palloc()ated  JsonbValue.

Patches #4 and #5 are mostly refactorings, but they can give small speedup
(up to 5% for upcoming jsonpath operators) due to elimination of unnecessary 
palloc()s.  The whole interface of findJsonbValueFromContainer() with JB_OBJECT
and JB_ARRAY flags always seemed a bit strange to me, so I think it is worth to
have separate functions for searching keys in objects and elements in arrays.


Performance tests:- Test data for {"x": {"y": {"z": i}}}:  CREATE TABLE t AS  SELECT jsonb_build_object('x',           jsonb_build_object('y',             jsonb_build_object('z', i))) js  FROM generate_series(1, 3000000) i;
- Sample query:  EXPLAIN (ANALYZE) SELECT js -> 'x' -> 'y' -> 'z' FROM t;
- Results:                                                   |   execution time, ms                        query                      |  master  |   optimized   
-------------------------------------------------------------------------------             {"x": {"y": {"z": i}}}js #> '{x,y,z}'                                    | 1148.632 | 1005.578 -10%js #>> '{x,y,z}'                                   | 1520.160 |  849.991 -40%(js #> '{x,y,z}')::numeric                         | 1310.881 | 1067.752 -20%(js #>> '{x,y,z}')::numeric                        | 1757.179 | 1109.495 -30%
js -> 'x' -> 'y' -> 'z'                            | 1030.211 |  977.267js -> 'x' -> 'y' ->> 'z'                           |  887.101 |  838.745(js -> 'x' -> 'y' -> 'z')::numeric                 | 1184.086 | 1050.462(js -> 'x' -> 'y' -> 'z')::int4                    | 1279.315 | 1133.032(js -> 'x' -> 'y' ->> 'z')::numeric                | 1134.003 | 1100.047(js -> 'x' -> 'y' ->> 'z')::int4                   | 1077.216 |  991.995
js ? 'x'                                           |  523.111 |  495.387js ?| '{x,y,z}'                                    |  612.880 |  607.455js ?& '{x,y,z}'                                    |  674.786 |  643.987js -> 'x' -> 'y' ? 'z'                             |  712.623 |  698.588js @> '{"x": {"y": {"z": 1}}}'                     | 1154.926 | 1149.069

jsonpath:js @@ '$.x.y.z == 123'                             |  973,444 |   912,08  -5%
             {"x": i, "y": i, "z": i}jsonb_each(js)                                     | 2281.577 | 2262.660jsonb_each_text(js)                                | 2603.539 | 2112.200 -20%                  [i, i, i]jsonb_array_elements(js)                           | 1255.210 | 1205.939jsonb_array_elements(js)::numeric                  | 1662.550 | 1576.227  -5%jsonb_array_elements_text(js)                      | 1555.021 | 1067.031 -30%
js @> '1'                                          |  798.858 |  768.664  -4%js <@ '[1,2,3]'                                    |  820.795 |  785.086  -5%js <@ '[0,1,2,3,4,5,6,7,8,9]'                      | 1214.170 | 1165.289  -5%


As it can be seen, #> operators are always slower than equivalent series of ->.
I think it is caused by array deconstruction in "jsonb #> text[]".

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: [PATCH v20] GSSAPI encryption support
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: boolean and bool in documentation