Обсуждение: BUG #13680: PostgreSQL backend process crashes on jsonb_object_agg() in plpgsql recursive function
BUG #13680: PostgreSQL backend process crashes on jsonb_object_agg() in plpgsql recursive function
От
b.yordanov2@gmail.com
Дата:
The following bug has been logged on the website:
Bug reference: 13680
Logged by: Boyko
Email address: b.yordanov2@gmail.com
PostgreSQL version: 9.5beta1
Operating system: CentOS release 6.4 (Final)
Description:
Hi,
I wrote the following function:
CREATE OR REPLACE FUNCTION public.recursive_stats_daily(d jsonb, i
interval)
RETURNS TABLE(day_time_col timestamp without time zone, data_col jsonb)
LANGUAGE plpgsql
AS $function$
begin
return query select stats.day_time_col,jsonb_object_agg(z,(d->z ||
stats.data_col->z)) as dd from stats,jsonb_object_keys(stats.data_col) z
where stats.day_time_col::date = current_date and stats.day_time_col::time =
'00:00:00'::time + i group by stats.day_time_col;
if day_time_col::time < '23:00:00'::time then
return query select * from
recursive_stats_daily(dd,i+'1h'::interval);
end if;
end $function$
The idea is to recursively concat nested json data, which is generated
hourly (24 jsons per day) into single "daily" json object.
Structure of stats:
CREATE TABLE stats (
id integer NOT NULL,
day_time_col timestamp without time zone DEFAULT now() NOT NULL,
at time without time zone DEFAULT '00:00:00'::time without time zone NOT
NULL,
data_col jsonb DEFAULT '{}'::jsonb NOT NULL
);
Example json data:
=> select * from stats where day_time_col::date = current_date;
-[ RECORD 1
]+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id | 209
day_time_col | 2015-10-14 00:00:00
at | 00:00:00
data_col | {"36": {"2285": {"c": 1}}, "61": {"4023": {"c": 1}}, "71":
{"3572": {"c": 1}}, "143": {"4916": {"c": 1}}, "166": {"1171": {"c": 1}},
"185": {"604": {"c": 1}}, "264": {"6535": {"c": 1}}, "321": {"7817": {"c":
1}}, "383": {"4031": {"c": 1}}, "462": {"8303": {"c": 1}}, "486": {"9795":
{"c": 1}}, "497": {"2745": {"c": 1}}}
-[ RECORD 2
]+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id | 210
day_time_col | 2015-10-14 01:00:00
at | 00:00:00
data_col | {"36": {"1135": {"c": 1}}, "47": {"1364": {"c": 1}}, "54":
{"5920": {"c": 1}, "8402": {"c": 1}}, "71": {"386": {"c": 1}}, "90":
{"7145": {"c": 1}}, "94": {"8625": {"c": 1}}, "168": {"3814": {"c": 1}},
"326": {"722": {"c": 1}}, "339": {"4888": {"c": 1}}, "420": {"6447": {"c":
1}}, "430": {"5701": {"c": 1}}, "460": {"861": {"c": 1}}, "466": {"1722":
{"c": 1}}, "496": {"6945": {"c": 1}}, "498": {"6929": {"c": 1}}}
Calling the function like this causes the crash:
sravni_hstore=> select * from
recursive_stats_daily('{}'::jsonb,'1h'::interval);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Log entry:
< 2015-10-14 19:01:54.640 EEST >LOG: statement: select * from
recursive_stats_daily('{}'::jsonb,'1h'::interval);
< 2015-10-14 19:01:54.645 EEST >LOG: server process (PID 15993) was
terminated by signal 11: Segmentation fault
< 2015-10-14 19:01:54.645 EEST >DETAIL: Failed process was running: select
* from recursive_stats_daily('{}'::jsonb,'1h'::interval);
< 2015-10-14 19:01:54.645 EEST >LOG: terminating any other active server
processes
< 2015-10-14 19:01:54.645 EEST >WARNING: terminating connection because of
crash of another server process
< 2015-10-14 19:01:54.645 EEST >DETAIL: The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
< 2015-10-14 19:01:54.645 EEST >HINT: In a moment you should be able to
reconnect to the database and repeat your command.
< 2015-10-14 19:01:54.645 EEST >FATAL: the database system is in recovery
mode
< 2015-10-14 19:01:54.645 EEST >LOG: all server processes terminated;
reinitializing
< 2015-10-14 19:01:54.654 EEST >LOG: database system was interrupted; last
known up at 2015-10-14 19:01:12 EEST
< 2015-10-14 19:01:54.671 EEST >LOG: database system was not properly shut
down; automatic recovery in progress
< 2015-10-14 19:01:54.672 EEST >LOG: invalid record length at 0/7561438
< 2015-10-14 19:01:54.672 EEST >LOG: redo is not required
< 2015-10-14 19:01:54.678 EEST >LOG: MultiXact member wraparound
protections are now enabled
< 2015-10-14 19:01:54.678 EEST >LOG: database system is ready to accept
connections
< 2015-10-14 19:01:54.678 EEST >LOG: autovacuum launcher started
On Thu, Oct 15, 2015 at 1:08 AM, <b.yordanov2@gmail.com> wrote:
> Calling the function like this causes the crash:
> sravni_hstore=> select * from
> recursive_stats_daily('{}'::jsonb,'1h'::interval);
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
Indeed. Reusing your example with the script attached I can reproduce
the crash easily:
* thread #1: tid = 0x0000, 0x0000000106674720
postgres`JsonbValueToJsonb(val=0x0000000000000000) + 16 at
jsonb_util.c:82, stop reason = signal SIGSTOP
* frame #0: 0x0000000106674720
postgres`JsonbValueToJsonb(val=0x0000000000000000) + 16 at
jsonb_util.c:82
frame #1: 0x0000000106670a05
postgres`jsonb_object_agg_transfn(fcinfo=0x00007f9fc8958710) + 1061 at
jsonb.c:1824
frame #2: 0x0000000106401bb2
postgres`advance_transition_function(aggstate=0x00007f9fc8950788,
pertrans=0x00007f9fc8958638, pergroupstate=0x00007f9fc89637b0) + 386
at nodeAgg.c:7
I am looking into it in more details, for now I have added an open item for 9.5.
Regards,
--
Michael
Вложения
On Thu, Oct 15, 2015 at 10:44 AM, Michael Paquier wrote:
> I am looking into it in more details, for now I have added an open item for 9.5.
> Regards,
This simple query reproduces the crash as well:
=# select json_object_agg(1, NULL::json);
json_object_agg
-----------------
{ "1" : null }
(1 row)
=# select jsonb_object_agg(1, NULL::jsonb);
server closed the connection unexpectedly
It happens that jsonb_object_agg_transfn is not able to manage
correctly NULL values in the context of a JSONB value, and it seems to
me that this is caused by an oversight in datum_to_jsonb regarding the
handling of NULL values.
Attached is a patch with some regression tests for master and
REL9_5_STABLE where the bug has been introduced.
Thoughts?
--
Michael
Вложения
On Thu, Oct 15, 2015 at 7:07 PM, boyko yordanov wrote:
> sravni_hstore=# select ('{"3": {"9703": {"c": 1}}}'::jsonb->'3') || ('{"3":
> {"8309": {"c": 1}}}'::jsonb->'3');
> -[ RECORD 1 ]----------------------------------
> ?column? | {"8309": {"c": 1}, "9703": {"c": 1}}
>
> sravni_hstore=# select '{"3": {"9703": {"c": 1}}}'::jsonb->'3' || '{"3":
> {"8309": {"c": 1}}}'::jsonb->'3';
> -[ RECORD 1 ]----------------
> ?column? | {"8309": {"c": 1}}
>
> What I expect is that there should be no difference in the output of the
> last two queries. Not sure if I should submit this as a separate bug?
The last one looks correct to me, the concat operator is using on the
right side '{"3": {"8309": {"c": 1}}}'::jsonb in your last query, and
not '{"3": {"8309": {"c": 1}}}'::jsonb->'3', so || takes precedence on
->. When applying the parenthesis, ('{"3": {"8309": {"c":
1}}}'::jsonb->'3') is completely taken into account.
--
Michael
Now I get it, its about operators precedence. Thanks once again!
2015-10-15 14:31 GMT+03:00 Michael Paquier <michael.paquier@gmail.com>:
> On Thu, Oct 15, 2015 at 7:07 PM, boyko yordanov wrote:
> > sravni_hstore=# select ('{"3": {"9703": {"c": 1}}}'::jsonb->'3') ||
> ('{"3":
> > {"8309": {"c": 1}}}'::jsonb->'3');
> > -[ RECORD 1 ]----------------------------------
> > ?column? | {"8309": {"c": 1}, "9703": {"c": 1}}
> >
> > sravni_hstore=# select '{"3": {"9703": {"c": 1}}}'::jsonb->'3' || '{"3":
> > {"8309": {"c": 1}}}'::jsonb->'3';
> > -[ RECORD 1 ]----------------
> > ?column? | {"8309": {"c": 1}}
> >
> > What I expect is that there should be no difference in the output of the
> > last two queries. Not sure if I should submit this as a separate bug?
>
> The last one looks correct to me, the concat operator is using on the
> right side '{"3": {"8309": {"c": 1}}}'::jsonb in your last query, and
> not '{"3": {"8309": {"c": 1}}}'::jsonb->'3', so || takes precedence on
> ->. When applying the parenthesis, ('{"3": {"8309": {"c":
> 1}}}'::jsonb->'3') is completely taken into account.
> --
> Michael
>
Applied the patch and the crash is gone, looks good, was able to debug my
function and it now works as expected, posting it for the record:
create or replace function recursive_stats_daily (d jsonb, i interval)
returns table (daily_json_for date, data_col jsonb)
language plpgsql
as $$
begin
if i::time < '23:00:00'::time then
return query
select * from recursive_stats_daily(
(select d || jsonb_object_agg(z,(coalesce(d->z,'{}'::jsonb)::jsonb
|| (stats.data_col->z)::jsonb)) from
stats,jsonb_object_keys(stats.data_col) z where stats.day_time_col::date =
current_date and stats.day_time_col::time = '00:00:00'::time + i group by
stats.day_time_col),
(i+'1h'::interval)
);
else
return query select current_date, d;
end if;
end $$;
As a side note - I notice this weird behavior of the concat operator -
below queries show what I mean:
sravni_hstore=# select ('{"3": {"9703": {"c": 1}}}'::jsonb->'3');
-[ RECORD 1 ]----------------
?column? | {"9703": {"c": 1}}
sravni_hstore=# select '{"3": {"9703": {"c": 1}}}'::jsonb->'3';
-[ RECORD 1 ]----------------
?column? | {"9703": {"c": 1}}
sravni_hstore=# select ('{"3": {"8309": {"c": 1}}}'::jsonb->'3');
-[ RECORD 1 ]----------------
?column? | {"8309": {"c": 1}}
sravni_hstore=# select '{"3": {"8309": {"c": 1}}}'::jsonb->'3';
-[ RECORD 1 ]----------------
?column? | {"8309": {"c": 1}}
sravni_hstore=# select ('{"3": {"9703": {"c": 1}}}'::jsonb->'3') || ('{"3":
{"8309": {"c": 1}}}'::jsonb->'3');
-[ RECORD 1 ]----------------------------------
?column? | {"8309": {"c": 1}, "9703": {"c": 1}}
sravni_hstore=# select '{"3": {"9703": {"c": 1}}}'::jsonb->'3' || '{"3":
{"8309": {"c": 1}}}'::jsonb->'3';
-[ RECORD 1 ]----------------
?column? | {"8309": {"c": 1}}
What I expect is that there should be no difference in the output of the
last two queries. Not sure if I should submit this as a separate bug?
Thanks for the effort! :)
Boyko
2015-10-15 8:41 GMT+03:00 Michael Paquier <michael.paquier@gmail.com>:
> On Thu, Oct 15, 2015 at 10:44 AM, Michael Paquier wrote:
> > I am looking into it in more details, for now I have added an open item
> for 9.5.
> > Regards,
>
> This simple query reproduces the crash as well:
> =# select json_object_agg(1, NULL::json);
> json_object_agg
> -----------------
> { "1" : null }
> (1 row)
> =# select jsonb_object_agg(1, NULL::jsonb);
> server closed the connection unexpectedly
>
> It happens that jsonb_object_agg_transfn is not able to manage
> correctly NULL values in the context of a JSONB value, and it seems to
> me that this is caused by an oversight in datum_to_jsonb regarding the
> handling of NULL values.
>
> Attached is a patch with some regression tests for master and
> REL9_5_STABLE where the bug has been introduced.
> Thoughts?
> --
> Michael
>
Michael Paquier <michael.paquier@gmail.com> writes:
> It happens that jsonb_object_agg_transfn is not able to manage
> correctly NULL values in the context of a JSONB value, and it seems to
> me that this is caused by an oversight in datum_to_jsonb regarding the
> handling of NULL values.
> Attached is a patch with some regression tests for master and
> REL9_5_STABLE where the bug has been introduced.
This looks sane to me, although since the function's header comment
specifies that tcategory is not to be relied on when is_null is true,
I think it would be cleaner to test is_null before looking at tcategory.
Will adjust and push.
regards, tom lane
On Fri, Oct 16, 2015 at 1:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> It happens that jsonb_object_agg_transfn is not able to manage >> correctly NULL values in the context of a JSONB value, and it seems to >> me that this is caused by an oversight in datum_to_jsonb regarding the >> handling of NULL values. > >> Attached is a patch with some regression tests for master and >> REL9_5_STABLE where the bug has been introduced. > > This looks sane to me, although since the function's header comment > specifies that tcategory is not to be relied on when is_null is true, > I think it would be cleaner to test is_null before looking at tcategory. > Will adjust and push. Thanks! -- Michael