Re: SQL/JSON revisited

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: SQL/JSON revisited
Дата
Msg-id 20230220172456.q3oshnvfk3wyhm5l@awork3.anarazel.de
обсуждение исходный текст
Ответ на Re: SQL/JSON revisited  (Amit Langote <amitlangote09@gmail.com>)
Ответы Re: SQL/JSON revisited  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers
Hi,

On 2023-02-20 16:35:52 +0900, Amit Langote wrote:
> Subject: [PATCH v4 03/10] SQL/JSON query functions
> +/*
> + * Evaluate a JSON error/empty behavior result.
> + */
> +static Datum
> +ExecEvalJsonBehavior(JsonBehavior *behavior, bool *is_null)
> +{
> +    *is_null = false;
> +
> +    switch (behavior->btype)
> +    {
> +        case JSON_BEHAVIOR_EMPTY_ARRAY:
> +            return JsonbPGetDatum(JsonbMakeEmptyArray());
> +
> +        case JSON_BEHAVIOR_EMPTY_OBJECT:
> +            return JsonbPGetDatum(JsonbMakeEmptyObject());
> +
> +        case JSON_BEHAVIOR_TRUE:
> +            return BoolGetDatum(true);
> +
> +        case JSON_BEHAVIOR_FALSE:
> +            return BoolGetDatum(false);
> +
> +        case JSON_BEHAVIOR_NULL:
> +        case JSON_BEHAVIOR_UNKNOWN:
> +            *is_null = true;
> +            return (Datum) 0;
> +
> +        case JSON_BEHAVIOR_DEFAULT:
> +            /* Always handled in the caller. */
> +            Assert(false);
> +            return (Datum) 0;
> +
> +        default:
> +            elog(ERROR, "unrecognized SQL/JSON behavior %d", behavior->btype);
> +            return (Datum) 0;
> +    }
> +}

Does this actually need to be evaluated at expression eavluation time?
Couldn't we just emit the proper constants in execExpr.c?

> +/* ----------------------------------------------------------------
> + *        ExecEvalJson
> + * ----------------------------------------------------------------
> + */
> +void
> +ExecEvalJson(ExprState *state, ExprEvalStep *op, ExprContext *econtext)

Pointless comment.


> +{
> +    JsonExprState *jsestate = op->d.jsonexpr.jsestate;
> +    JsonExprPreEvalState *pre_eval = &jsestate->pre_eval;
> +    JsonExprPostEvalState *post_eval = &jsestate->post_eval;
> +    JsonExpr   *jexpr = jsestate->jsexpr;
> +    Datum        item;
> +    Datum        res = (Datum) 0;
> +    JsonPath   *path;
> +    bool        throwErrors = jexpr->on_error->btype == JSON_BEHAVIOR_ERROR;
> +
> +    *op->resnull = true;        /* until we get a result */
> +    *op->resvalue = (Datum) 0;
> +
> +    item = pre_eval->formatted_expr.value;
> +    path = DatumGetJsonPathP(pre_eval->pathspec.value);
> +
> +    /* Reset JsonExprPostEvalState for this evaluation. */
> +    memset(post_eval, 0, sizeof(*post_eval));
> +
> +    res = ExecEvalJsonExpr(op, econtext, path, item, op->resnull,
> +                           !throwErrors ? &post_eval->error : NULL);
> +
> +    *op->resvalue = res;
> +}

I really don't like having both ExecEvalJson() and ExecEvalJsonExpr(). There's
really no way to know what which version does, just based on the name.


> --- a/src/backend/parser/gram.y
> +++ b/src/backend/parser/gram.y

This stuff adds quite a bit of complexity to the parser. Do we realy need like
a dozen new rules here?


> +json_behavior_empty_array:
> +            EMPTY_P ARRAY    { $$ = makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); }
> +            /* non-standard, for Oracle compatibility only */
> +            | EMPTY_P        { $$ = makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); }
> +        ;

Do we really want to add random oracle compat crud here?


> +/*
> + * Evaluate a JSON path variable caching computed value.
> + */
> +int
> +EvalJsonPathVar(void *cxt, char *varName, int varNameLen,
> +                JsonbValue *val, JsonbValue *baseObject)

Missing static?

> +{
> +    JsonPathVariableEvalContext *var = NULL;
> +    List       *vars = cxt;
> +    ListCell   *lc;
> +    int            id = 1;
> +
> +    if (!varName)
> +        return list_length(vars);
> +
> +    foreach(lc, vars)
> +    {
> +        var = lfirst(lc);
> +
> +        if (!strncmp(var->name, varName, varNameLen))
> +            break;
> +
> +        var = NULL;
> +        id++;
> +    }
> +
> +    if (!var)
> +        return -1;
> +
> +    /*
> +     * When belonging to a JsonExpr, path variables are computed with the
> +     * JsonExpr's ExprState (var->estate is NULL), so don't need to be computed
> +     * here.  In some other cases, such as when the path variables belonging
> +     * to a JsonTable instead, those variables must be evaluated on their own,
> +     * without the enclosing JsonExpr itself needing to be evaluated, so must
> +     * be handled here.
> +     */
> +    if (var->estate && !var->evaluated)
> +    {
> +        Assert(var->econtext != NULL);
> +        var->value = ExecEvalExpr(var->estate, var->econtext, &var->isnull);
> +        var->evaluated = true;

Uh, so this continues to do recursive expression evaluation, as
ExecEvalJsonExpr()->JsonPathQuery()->executeJsonPath(EvalJsonPathVar)

I'm getting grumpy here. This is wrong, has been pointed out many times. The
only thing that changes is that the point of recursion is moved around.


> +
> +/*
> + * ExecEvalExprSafe
> + *
> + * Like ExecEvalExpr(), though this allows the caller to pass an
> + * ErrorSaveContext to declare its intenion to catch any errors that occur when
> + * executing the expression, such as when calling type input functions that may
> + * be present in it.
> + */
> +static inline Datum
> +ExecEvalExprSafe(ExprState *state,
> +                 ExprContext *econtext,
> +                 bool *isNull,
> +                 Node *escontext,
> +                 bool *error)

Afaict there's no caller of this?


>  
> +/*
> + * ExecInitExprWithCaseValue
> + *
> + * This is the same as ExecInitExpr, except the caller passes the Datum and
> + * bool pointers that it would like the ExprState.innermost_caseval
> + * and ExprState.innermost_casenull, respectively, to be set to.  That way,
> + * it can pass an input value to evaluate the expression via a CaseTestExpr.
> + */
> +ExprState *
> +ExecInitExprWithCaseValue(Expr *node, PlanState *parent,
> +                          Datum *caseval, bool *casenull)
> +{
> +    ExprState  *state;
> +    ExprEvalStep scratch = {0};
> +
> +    /* Special case: NULL expression produces a NULL ExprState pointer */
> +    if (node == NULL)
> +        return NULL;
> +
> +    /* Initialize ExprState with empty step list */
> +    state = makeNode(ExprState);
> +    state->expr = node;
> +    state->parent = parent;
> +    state->ext_params = NULL;
> +    state->innermost_caseval = caseval;
> +    state->innermost_casenull = casenull;
> +
> +    /* Insert EEOP_*_FETCHSOME steps as needed */
> +    ExecInitExprSlots(state, (Node *) node);
> +
> +    /* Compile the expression proper */
> +    ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
> +
> +    /* Finally, append a DONE step */
> +    scratch.opcode = EEOP_DONE;
> +    ExprEvalPushStep(state, &scratch);
> +
> +    ExecReadyExpr(state);
> +
> +    return state;

> +struct JsonTableJoinState
> +{
> +    union
> +    {
> +        struct
> +        {
> +            JsonTableJoinState *left;
> +            JsonTableJoinState *right;
> +            bool        advanceRight;
> +        }            join;
> +        JsonTableScanState scan;
> +    }            u;
> +    bool        is_join;
> +};

A join state that unions the join member with a scan, and has a is_join field?


> +/*
> + * JsonTableInitOpaque
> + *        Fill in TableFuncScanState->opaque for JsonTable processor
> + */
> +static void
> +JsonTableInitOpaque(TableFuncScanState *state, int natts)
> +{
> +    JsonTableContext *cxt;
> +    PlanState  *ps = &state->ss.ps;
> +    TableFuncScan *tfs = castNode(TableFuncScan, ps->plan);
> +    TableFunc  *tf = tfs->tablefunc;
> +    JsonExpr   *ci = castNode(JsonExpr, tf->docexpr);
> +    JsonTableParent *root = castNode(JsonTableParent, tf->plan);
> +    List       *args = NIL;
> +    ListCell   *lc;
> +    int            i;
> +
> +    cxt = palloc0(sizeof(JsonTableContext));
> +    cxt->magic = JSON_TABLE_CONTEXT_MAGIC;
> +
> +    if (ci->passing_values)
> +    {
> +        ListCell   *exprlc;
> +        ListCell   *namelc;
> +
> +        forboth(exprlc, ci->passing_values,
> +                namelc, ci->passing_names)
> +        {
> +            Expr       *expr = (Expr *) lfirst(exprlc);
> +            String       *name = lfirst_node(String, namelc);
> +            JsonPathVariableEvalContext *var = palloc(sizeof(*var));
> +
> +            var->name = pstrdup(name->sval);
> +            var->typid = exprType((Node *) expr);
> +            var->typmod = exprTypmod((Node *) expr);
> +            var->estate = ExecInitExpr(expr, ps);
> +            var->econtext = ps->ps_ExprContext;
> +            var->mcxt = CurrentMemoryContext;
> +            var->evaluated = false;
> +            var->value = (Datum) 0;
> +            var->isnull = true;
> +
> +            args = lappend(args, var);
> +        }
> +    }
> +
> +    cxt->colexprs = palloc(sizeof(*cxt->colexprs) *
> +                           list_length(tf->colvalexprs));
> +
> +    JsonTableInitScanState(cxt, &cxt->root, root, NULL, args,
> +                           CurrentMemoryContext);
> +
> +    i = 0;
> +
> +    foreach(lc, tf->colvalexprs)
> +    {
> +        Expr       *expr = lfirst(lc);
> +
> +        cxt->colexprs[i].expr =
> +            ExecInitExprWithCaseValue(expr, ps,
> +                                      &cxt->colexprs[i].scan->current,
> +                                      &cxt->colexprs[i].scan->currentIsNull);
> +
> +        i++;
> +    }
> +
> +    state->opaque = cxt;
> +}

Evaluating N expressions for a json table isn't a good approach, both memory
and CPU efficiency wise.

Why don't you just emit the proper expression directly, insted of the
CaseTestExpr stuff, that you then separately evaluate?



Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Nathan Bossart
Дата:
Сообщение: Re: add PROCESS_MAIN to VACUUM
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Killing off removed rels properly