Обсуждение: Re: BUG #16171: Potential malformed JSON in explain output

Поиск
Список
Период
Сортировка

Re: BUG #16171: Potential malformed JSON in explain output

От
Hamid Akhtar
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

I've reviewed and verified this patch and IMHO, this is ready to be committed.


So I have verified this patch against the tip of REL_12_STABLE branch (commit c4c76d19).
- git apply <patch> works without issues.

Although the patch description mentions that it fixes a malformed JSON in explain output by creating a "Plan" group,
thisalso fixes the same malformation issue for XML and YAML formats. It does not impact the text output though.
 

I'm sharing the problematic part of output for an unpatched version for JSON, XML, and YAML formats:
-- JSON
       "Plans": [
         "Subplans Removed": 1,
         {
           "Node Type": "Seq Scan",

-- XML
       <Plans>
         <Subplans-Removed>1</Subplans-Removed>
         <Plan>

-- YAML
     Plans:
       Subplans Removed: 1
       - Node Type: "Seq Scan"

The patched version gives the following and correct output:
-- JSON
       "Plans": [
         {
           "Subplans Removed": 1
         }, 
         {
           "Node Type": "Seq Scan",

-- XML
       <Plans>
         <Plan>
           <Subplans-Removed>1</Subplans-Removed>
         </Plan>
         <Plan>

-- YAML
     Plans:
       - Subplans Removed: 1
       - Node Type: "Seq Scan"

Following is the query that I used for validating the output. I picked it up (and simplified) from
"src/test/regress/sql/partition_prune.sql".You can probably come up with a simpler query, but this does the job. The
querybelow gives the output in JSON format:
 
----
create table ab (a int not null, b int not null) partition by list (a);
create table ab_a2 partition of ab for values in(2) partition by list (b);
create table ab_a2_b1 partition of ab_a2 for values in (1);
create table ab_a1 partition of ab for values in(1) partition by list (b);
create table ab_a1_b1 partition of ab_a1 for values in (2);

-- Disallow index only scans as concurrent transactions may stop visibility
-- bits being set causing "Heap Fetches" to be unstable in the EXPLAIN ANALYZE
-- output.
set enable_indexonlyscan = off;

prepare ab_q1 (int, int, int) as
select * from ab where a between $1 and $2 and b <= $3;

-- Execute query 5 times to allow choose_custom_plan
-- to start considering a generic plan.
execute ab_q1 (1, 8, 3);
execute ab_q1 (1, 8, 3);
execute ab_q1 (1, 8, 3);
execute ab_q1 (1, 8, 3);
execute ab_q1 (1, 8, 3);

explain (format json, analyze, costs off, summary off, timing off) execute ab_q1 (2, 2, 3);

deallocate ab_q1;
drop table ab;
----

The new status of this patch is: Ready for Committer

Re: BUG #16171: Potential malformed JSON in explain output

От
Tom Lane
Дата:
Hamid Akhtar <hamid.akhtar@gmail.com> writes:
> I've reviewed and verified this patch and IMHO, this is ready to be committed.

I took a look at this and I don't think it's really going in the right
direction.  ISTM the clear intent of this code was to attach the "Subplans
Removed" item as a field of the parent [Merge]Append node, but the author
forgot about the intermediate "Plans" array.  So I think that, rather than
doubling down on a mistake, we ought to move where the field is generated
so that it *is* a field of the parent node.

Another failure to follow the design conventions for EXPLAIN output is
that in non-text formats, the schema for each node type ought to be fixed;
that is, if a given field can appear for a particular node type and
EXPLAIN options, it should appear always, not be omitted just because it's
zero.

So that leads me to propose 0001 attached.  This does lead to some field
order rearrangement in text mode, as per the regression test changes,
but I think that's not a big deal.  (A change can only happen if there
are initplan(s) attached to the parent node.)

Also, I wondered whether we had any other violations of correct formatting
in this code, which led me to the idea of running auto_explain in JSON
mode and having it feed its result to json_in.  This isn't a complete
test because it won't whine about duplicate field names, but it did
correctly find the current bug --- and I couldn't find any others while
running the core regression tests with various auto_explain options.
0002 attached isn't committable, because nobody would want the overhead
in production, but it seems like a good trick to keep up our sleeves.

Thoughts?

            regards, tom lane

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index c367c75..d901dc4 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -119,8 +119,9 @@ static void ExplainModifyTarget(ModifyTable *plan, ExplainState *es);
 static void ExplainTargetRel(Plan *plan, Index rti, ExplainState *es);
 static void show_modifytable_info(ModifyTableState *mtstate, List *ancestors,
                                   ExplainState *es);
-static void ExplainMemberNodes(PlanState **planstates, int nsubnodes,
-                               int nplans, List *ancestors, ExplainState *es);
+static void ExplainMemberNodes(PlanState **planstates, int nplans,
+                               List *ancestors, ExplainState *es);
+static void ExplainMissingMembers(int nplans, int nchildren, ExplainState *es);
 static void ExplainSubPlans(List *plans, List *ancestors,
                             const char *relationship, ExplainState *es);
 static void ExplainCustomChildren(CustomScanState *css,
@@ -1967,6 +1968,30 @@ ExplainNode(PlanState *planstate, List *ancestors,
         ExplainFlushWorkersState(es);
     es->workers_state = save_workers_state;

+    /*
+     * If partition pruning was done during executor initialization, the
+     * number of child plans we'll display below will be less than the number
+     * of subplans that was specified in the plan.  To make this a bit less
+     * mysterious, emit an indication that this happened.  Note that this
+     * field is emitted now because we want it to be a property of the parent
+     * node; it *cannot* be emitted within the Plans sub-node we'll open next.
+     */
+    switch (nodeTag(plan))
+    {
+        case T_Append:
+            ExplainMissingMembers(((AppendState *) planstate)->as_nplans,
+                                  list_length(((Append *) plan)->appendplans),
+                                  es);
+            break;
+        case T_MergeAppend:
+            ExplainMissingMembers(((MergeAppendState *) planstate)->ms_nplans,
+                                  list_length(((MergeAppend *) plan)->mergeplans),
+                                  es);
+            break;
+        default:
+            break;
+    }
+
     /* Get ready to display the child plans */
     haschildren = planstate->initPlan ||
         outerPlanState(planstate) ||
@@ -2007,31 +2032,26 @@ ExplainNode(PlanState *planstate, List *ancestors,
         case T_ModifyTable:
             ExplainMemberNodes(((ModifyTableState *) planstate)->mt_plans,
                                ((ModifyTableState *) planstate)->mt_nplans,
-                               list_length(((ModifyTable *) plan)->plans),
                                ancestors, es);
             break;
         case T_Append:
             ExplainMemberNodes(((AppendState *) planstate)->appendplans,
                                ((AppendState *) planstate)->as_nplans,
-                               list_length(((Append *) plan)->appendplans),
                                ancestors, es);
             break;
         case T_MergeAppend:
             ExplainMemberNodes(((MergeAppendState *) planstate)->mergeplans,
                                ((MergeAppendState *) planstate)->ms_nplans,
-                               list_length(((MergeAppend *) plan)->mergeplans),
                                ancestors, es);
             break;
         case T_BitmapAnd:
             ExplainMemberNodes(((BitmapAndState *) planstate)->bitmapplans,
                                ((BitmapAndState *) planstate)->nplans,
-                               list_length(((BitmapAnd *) plan)->bitmapplans),
                                ancestors, es);
             break;
         case T_BitmapOr:
             ExplainMemberNodes(((BitmapOrState *) planstate)->bitmapplans,
                                ((BitmapOrState *) planstate)->nplans,
-                               list_length(((BitmapOr *) plan)->bitmapplans),
                                ancestors, es);
             break;
         case T_SubqueryScan:
@@ -3348,33 +3368,34 @@ show_modifytable_info(ModifyTableState *mtstate, List *ancestors,
  *
  * The ancestors list should already contain the immediate parent of these
  * plans.
-*
-* nsubnodes indicates the number of items in the planstates array.
-* nplans indicates the original number of subnodes in the Plan, some of these
-* may have been pruned by the run-time pruning code.
  */
 static void
-ExplainMemberNodes(PlanState **planstates, int nsubnodes, int nplans,
+ExplainMemberNodes(PlanState **planstates, int nplans,
                    List *ancestors, ExplainState *es)
 {
     int            j;

-    /*
-     * The number of subnodes being lower than the number of subplans that was
-     * specified in the plan means that some subnodes have been ignored per
-     * instruction for the partition pruning code during the executor
-     * initialization.  To make this a bit less mysterious, we'll indicate
-     * here that this has happened.
-     */
-    if (nsubnodes < nplans)
-        ExplainPropertyInteger("Subplans Removed", NULL, nplans - nsubnodes, es);
-
-    for (j = 0; j < nsubnodes; j++)
+    for (j = 0; j < nplans; j++)
         ExplainNode(planstates[j], ancestors,
                     "Member", NULL, es);
 }

 /*
+ * Report about any pruned subnodes of an Append or MergeAppend node.
+ *
+ * nplans indicates the number of live subplans.
+ * nchildren indicates the original number of subnodes in the Plan;
+ * some of these may have been pruned by the run-time pruning code.
+ */
+static void
+ExplainMissingMembers(int nplans, int nchildren, ExplainState *es)
+{
+    if (nplans < nchildren || es->format != EXPLAIN_FORMAT_TEXT)
+        ExplainPropertyInteger("Subplans Removed", NULL,
+                               nchildren - nplans, es);
+}
+
+/*
  * Explain a list of SubPlans (or initPlans, which also use SubPlan nodes).
  *
  * The ancestors list should already contain the immediate parent of these
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 424d51d..9c8f80d 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -1808,9 +1808,9 @@ explain (analyze, costs off, summary off, timing off) execute ab_q2 (2, 2);
                        QUERY PLAN
 ---------------------------------------------------------
  Append (actual rows=0 loops=1)
+   Subplans Removed: 6
    InitPlan 1 (returns $0)
      ->  Result (actual rows=1 loops=1)
-   Subplans Removed: 6
    ->  Seq Scan on ab_a2_b1 ab_1 (actual rows=0 loops=1)
          Filter: ((a >= $1) AND (a <= $2) AND (b < $0))
    ->  Seq Scan on ab_a2_b2 ab_2 (actual rows=0 loops=1)
@@ -1826,9 +1826,9 @@ explain (analyze, costs off, summary off, timing off) execute ab_q3 (2, 2);
                        QUERY PLAN
 ---------------------------------------------------------
  Append (actual rows=0 loops=1)
+   Subplans Removed: 6
    InitPlan 1 (returns $0)
      ->  Result (actual rows=1 loops=1)
-   Subplans Removed: 6
    ->  Seq Scan on ab_a1_b2 ab_1 (actual rows=0 loops=1)
          Filter: ((b >= $1) AND (b <= $2) AND (a < $0))
    ->  Seq Scan on ab_a2_b2 ab_2 (actual rows=0 loops=1)
@@ -2396,9 +2396,9 @@ explain (analyze, costs off, summary off, timing off) execute ab_q6(1);
                     QUERY PLAN
 --------------------------------------------------
  Append (actual rows=0 loops=1)
+   Subplans Removed: 12
    InitPlan 1 (returns $0)
      ->  Result (actual rows=1 loops=1)
-   Subplans Removed: 12
    ->  Seq Scan on ab_a1_b1 ab_1 (never executed)
          Filter: ((a = $1) AND (b = $0))
    ->  Seq Scan on ab_a1_b2 ab_2 (never executed)
@@ -3032,9 +3032,9 @@ execute ps1(1);
                        QUERY PLAN
 --------------------------------------------------------
  Append (actual rows=1 loops=1)
+   Subplans Removed: 2
    InitPlan 1 (returns $0)
      ->  Result (actual rows=1 loops=1)
-   Subplans Removed: 2
    ->  Seq Scan on mc3p1 mc3p_1 (actual rows=1 loops=1)
          Filter: ((a = $1) AND (abs(b) < $0))
 (6 rows)
@@ -3047,9 +3047,9 @@ execute ps2(1);
                        QUERY PLAN
 --------------------------------------------------------
  Append (actual rows=2 loops=1)
+   Subplans Removed: 1
    InitPlan 1 (returns $0)
      ->  Result (actual rows=1 loops=1)
-   Subplans Removed: 1
    ->  Seq Scan on mc3p0 mc3p_1 (actual rows=1 loops=1)
          Filter: ((a <= $1) AND (abs(b) < $0))
    ->  Seq Scan on mc3p1 mc3p_2 (actual rows=1 loops=1)
@@ -3594,9 +3594,9 @@ explain (costs off) execute q (1, 1);
                           QUERY PLAN
 ---------------------------------------------------------------
  Append
+   Subplans Removed: 1
    InitPlan 1 (returns $0)
      ->  Result
-   Subplans Removed: 1
    ->  Seq Scan on p1 p
          Filter: ((a = $1) AND (b = $2) AND (c = $0))
    ->  Seq Scan on q111 q1
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index f69dde8..71cb4fa 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -18,6 +18,7 @@
 #include "commands/explain.h"
 #include "executor/instrument.h"
 #include "jit/jit.h"
+#include "utils/builtins.h"
 #include "utils/guc.h"

 PG_MODULE_MAGIC;
@@ -397,6 +398,10 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
             {
                 es->str->data[0] = '{';
                 es->str->data[es->str->len - 1] = '}';
+
+                /* Verify that it's valid JSON by feeding to json_in */
+                (void) DirectFunctionCall1(json_in,
+                                           CStringGetDatum(es->str->data));
             }

             /*

Re: BUG #16171: Potential malformed JSON in explain output

От
Daniel Gustafsson
Дата:
> On 1 Feb 2020, at 20:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Hamid Akhtar <hamid.akhtar@gmail.com> writes:
>> I've reviewed and verified this patch and IMHO, this is ready to be committed.
>
> I took a look at this and I don't think it's really going in the right
> direction.  ISTM the clear intent of this code was to attach the "Subplans
> Removed" item as a field of the parent [Merge]Append node, but the author
> forgot about the intermediate "Plans" array.  So I think that, rather than
> doubling down on a mistake, we ought to move where the field is generated
> so that it *is* a field of the parent node.

Right, that makes sense; +1 on the attached 0001 patch.

> This does lead to some field
> order rearrangement in text mode, as per the regression test changes,
> but I think that's not a big deal.  (A change can only happen if there
> are initplan(s) attached to the parent node.)

Does that prevent backpatching this, or are we Ok with EXPLAIN text output not
being stable across minors?  AFAICT Pg::Explain still works fine with this
change, but mileage may vary for other parsers.

> 0002 attached isn't committable, because nobody would want the overhead
> in production, but it seems like a good trick to keep up our sleeves.

Thats a neat trick, I wonder if it would be worth maintaining a curated list of
these tricks in a README under src/test to help others avoid/reduce wheel
reinventing?

cheers ./daniel


Re: BUG #16171: Potential malformed JSON in explain output

От
Tom Lane
Дата:
[ cc'ing depesz to see what he thinks about this ]

Daniel Gustafsson <daniel@yesql.se> writes:
> On 1 Feb 2020, at 20:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This does lead to some field
>> order rearrangement in text mode, as per the regression test changes,
>> but I think that's not a big deal.  (A change can only happen if there
>> are initplan(s) attached to the parent node.)

> Does that prevent backpatching this, or are we Ok with EXPLAIN text output not
> being stable across minors?  AFAICT Pg::Explain still works fine with this
> change, but mileage may vary for other parsers.

I'm not sure about that either.  It should be a clear win for parsers
of the non-text formats, because now we're generating valid
JSON-or-whatever where we were not before.  But it's not too hard to
imagine that someone's ad-hoc parser of text output would break,
depending on how much it relies on field order rather than indentation
to make sense of things.

In the background of all this is that cases where it matters must be
pretty thin on the ground so far, else we'd have gotten complaints
sooner.  So we shouldn't really assume that everyone's parser handles
such cases at all yet.

I'm a little bit inclined to back-patch, on the grounds that JSON
output is hopelessly broken without this, and any text-mode parsers
that need work would need work by September anyway.  But I could
easily be argued into not back-patching.

Another approach we could consider is putting your patch in the
back branches and mine in HEAD.  I'm not sure that's a good idea;
it trades short-term stability of the text format for a long-term
mess in the non-text formats.  But maybe somebody will argue for it.

Thoughts?

            regards, tom lane



Re: BUG #16171: Potential malformed JSON in explain output

От
hubert depesz lubaczewski
Дата:
On Sun, Feb 02, 2020 at 11:48:32AM -0500, Tom Lane wrote:
> > Does that prevent backpatching this, or are we Ok with EXPLAIN text output not
> > being stable across minors?  AFAICT Pg::Explain still works fine with this
> > change, but mileage may vary for other parsers.
> I'm not sure about that either.  It should be a clear win for parsers
> of the non-text formats, because now we're generating valid
> JSON-or-whatever where we were not before.  But it's not too hard to
> imagine that someone's ad-hoc parser of text output would break,
> depending on how much it relies on field order rather than indentation
> to make sense of things.

Change looks reasonable to me.

Interestingly Pg::Explain doesn't handle either current JSON output in
this case (as it's not a valid JSON), nor the new one - but this can be
fixed easily.

Best regards,

depesz




Re: BUG #16171: Potential malformed JSON in explain output

От
Daniel Gustafsson
Дата:
> On 2 Feb 2020, at 17:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Thoughts?

Keeping TEXT explain stable across minor versions is very appealing, but more
so from a policy standpoint than a technical one.  The real-world implication
is probably quite small, but that's a very unscientific guess (searching Github
didn't turn up anything but thats far from conclusive).  Having broken JSON is
however a clear negative, and so is having a different JSON format in back-
branches for something which has never worked in the first place.

I guess I'm leaning towards backpatching, but it's not entirely clear-cut.

cheers ./daniel


Re: BUG #16171: Potential malformed JSON in explain output

От
Tom Lane
Дата:
Daniel Gustafsson <daniel@yesql.se> writes:
> I guess I'm leaning towards backpatching, but it's not entirely clear-cut.

That's where I stand too.  I'll wait a day or so to see if anyone
else comments; but if not, I'll back-patch.

            regards, tom lane



Re: BUG #16171: Potential malformed JSON in explain output

От
Tom Lane
Дата:
Daniel Gustafsson <daniel@yesql.se> writes:
> On 1 Feb 2020, at 20:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 0002 attached isn't committable, because nobody would want the overhead
>> in production, but it seems like a good trick to keep up our sleeves.

> Thats a neat trick, I wonder if it would be worth maintaining a curated list of
> these tricks in a README under src/test to help others avoid/reduce wheel
> reinventing?

It occurred to me that as long as this is an uncommittable hack anyway,
we could feed the EXPLAIN data to jsonb_in and then hot-wire the jsonb
code to whine about duplicate keys.  So attached, for the archives' sake,
is an improved version that does that.  I still don't find any problems
(other than the one we're fixing here); though no doubt if I reverted
100136849 it'd complain about that.

            regards, tom lane

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index f69dde8..1945a77 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -18,6 +18,7 @@
 #include "commands/explain.h"
 #include "executor/instrument.h"
 #include "jit/jit.h"
+#include "utils/builtins.h"
 #include "utils/guc.h"

 PG_MODULE_MAGIC;
@@ -397,6 +398,10 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
             {
                 es->str->data[0] = '{';
                 es->str->data[es->str->len - 1] = '}';
+
+                /* Verify that it's valid JSON by feeding to jsonb_in */
+                (void) DirectFunctionCall1(jsonb_in,
+                                           CStringGetDatum(es->str->data));
             }

             /*
diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index edec657..b9038fa 100644
--- a/src/backend/utils/adt/jsonb_util.c
+++ b/src/backend/utils/adt/jsonb_util.c
@@ -1915,6 +1915,9 @@ uniqueifyJsonbObject(JsonbValue *object)
                 if (ptr != res)
                     memcpy(res, ptr, sizeof(JsonbPair));
             }
+            else
+                elog(WARNING, "dropping duplicate jsonb key %s",
+                     ptr->key.val.string.val);
             ptr++;
         }

diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out
index a70cd0b..2db14e1 100644
--- a/src/test/regress/expected/jsonb.out
+++ b/src/test/regress/expected/jsonb.out
@@ -3488,6 +3488,9 @@ SELECT '{"ff":{"a":12,"b":16},"qq":123}'::jsonb;
 (1 row)

 SELECT '{"aa":["a","aaa"],"qq":{"a":12,"b":16,"c":["c1","c2"],"d":{"d1":"d1","d2":"d2","d1":"d3"}}}'::jsonb;
+WARNING:  dropping duplicate jsonb key d1
+LINE 1: SELECT '{"aa":["a","aaa"],"qq":{"a":12,"b":16,"c":["c1","c2"...
+               ^
                                               jsonb
 --------------------------------------------------------------------------------------------------
  {"aa": ["a", "aaa"], "qq": {"a": 12, "b": 16, "c": ["c1", "c2"], "d": {"d1": "d3", "d2": "d2"}}}
@@ -4021,6 +4024,8 @@ select jsonb_concat('{"d": "test", "a": [1, 2]}', '{"g": "test2", "c": {"c1":1,
 (1 row)

 select '{"aa":1 , "b":2, "cq":3}'::jsonb || '{"cq":"l", "b":"g", "fg":false}';
+WARNING:  dropping duplicate jsonb key baacq
+WARNING:  dropping duplicate jsonb key cq
                   ?column?
 ---------------------------------------------
  {"b": "g", "aa": 1, "cq": "l", "fg": false}
@@ -4033,6 +4038,7 @@ select '{"aa":1 , "b":2, "cq":3}'::jsonb || '{"aq":"l"}';
 (1 row)

 select '{"aa":1 , "b":2, "cq":3}'::jsonb || '{"aa":"l"}';
+WARNING:  dropping duplicate jsonb key aacq
            ?column?
 ------------------------------
  {"b": 2, "aa": "l", "cq": 3}

Re: BUG #16171: Potential malformed JSON in explain output

От
Tom Lane
Дата:
I wrote:
> Daniel Gustafsson <daniel@yesql.se> writes:
>> I guess I'm leaning towards backpatching, but it's not entirely clear-cut.

> That's where I stand too.  I'll wait a day or so to see if anyone
> else comments; but if not, I'll back-patch.

Hearing no objections, done that way.

            regards, tom lane