Re: Problem with accessing TOAST data in stored procedures

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: Problem with accessing TOAST data in stored procedures
Дата
Msg-id CAFj8pRCpo4WsGtOVKXDmkSCMd95jMC=sHCgq2JofZd9Ndvpfug@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Problem with accessing TOAST data in stored procedures  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Список pgsql-hackers


pá 19. 2. 2021 v 16:19 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 19.02.2021 11:12, Pavel Stehule wrote:


pá 19. 2. 2021 v 9:08 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 19.02.2021 10:47, Pavel Stehule wrote:


pá 19. 2. 2021 v 8:39 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 19.02.2021 10:14, Pavel Stehule wrote:


pá 19. 2. 2021 v 7:51 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 18.02.2021 20:10, Pavel Stehule wrote:
This has a negative impact on performance - and a lot of users use procedures without transaction control. So it doesn't look like a good solution.

I am more concentrated on the Pg 14 release, where the work with SPI is redesigned, and I hope so this issue is fixed there. For older releases, I don't know. Is this issue related to Postgres or it is related to PgPro only? If it is related to community pg, then we should fix and we should accept not too good performance, because there is no better non invasive solution. If it is PgPro issue (because there are ATX support) you can fix it (or you can try backport the patch https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf ). You have more possibilities on PgPro code base.

Sorry, it is not PgPro specific problem and recent master suffers from this bug as well.
In the original bug report there was simple scenario of reproducing the problem:

CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':') FROM generate_series(1, 1000)));
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':') FROM generate_series(1, 1000)));
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted LOOP INSERT INTO toasted(data) VALUES(v_r.data);COMMIT;END LOOP;END;$$;

can you use new procedure_resowner?

Sorry, I do not understanf your suggestion.
How procedure_resowner can help to solve this problem?

This is just an idea - I think the most correct with zero performance impact is keeping snapshot, and this can be stored in procedure_resowner.

The fundamental question is if we want or allow more snapshots per query. The implementation is a secondary issue.

I wonder if it is correct from logical point of view.
If we commit transaction in stored procedure, then we actually implicitly start new transaction.
And new transaction should have new snapshot. Otherwise its behavior will change.

I have no problem with this. I have a problem with cycle implementation - when I iterate over some result, then this result should be consistent over all cycles.  In other cases, the behaviour is not deterministic.

I have investigated more the problem with toast data in  stored procedures and come to very strange conclusion:
to fix the problem it is enough to pass expand_external=false to expanded_record_set_tuple instead of !estate->atomic:

                                {
                                        /* Only need to assign a new tuple value */
                                        expanded_record_set_tuple(rec->erh, tuptab->vals[i],
-                                                                                         true, !estate->atomic);
+                                                                                         true, false);
                                }

Why it is correct?
Because in assign_simple_var we already forced detoasting for data:

    /*
     * In non-atomic contexts, we do not want to store TOAST pointers in
     * variables, because such pointers might become stale after a commit.
     * Forcibly detoast in such cases.  We don't want to detoast (flatten)
     * expanded objects, however; those should be OK across a transaction
     * boundary since they're just memory-resident objects.  (Elsewhere in
     * this module, operations on expanded records likewise need to request
     * detoasting of record fields when !estate->atomic.  Expanded arrays are
     * not a problem since all array entries are always detoasted.)
     */
    if (!estate->atomic && !isnull && var->datatype->typlen == -1 &&
        VARATT_IS_EXTERNAL_NON_EXPANDED(DatumGetPointer(newvalue)))
    {
        MemoryContext oldcxt;
        Datum        detoasted;

        /*
         * Do the detoasting in the eval_mcontext to avoid long-term leakage
         * of whatever memory toast fetching might leak.  Then we have to copy
         * the detoasted datum to the function's main context, which is a
         * pain, but there's little choice.
         */
        oldcxt = MemoryContextSwitchTo(get_eval_mcontext(estate));
        detoasted = PointerGetDatum(detoast_external_attr((struct varlena *) DatumGetPointer(newvalue)));


So, there is no need to initialize TOAST snapshot and "no known snapshots" error is false alarm.
What do you think about it?

This is Tom's code, so important is his opinion.

Regards

Pavel


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Some regular-expression performance hacking
Следующее
От: gkokolatos@pm.me
Дата:
Сообщение: Re: PATCH: Attempt to make dbsize a bit more consistent