Обсуждение: lazy detoasting

Поиск

Список

Период

Сортировка

lazy detoasting

От

Chapman Flack

Дата:

01 апреля 2018 г., 08:04:37

Hi,

If I copy an out-of-line, on-disk TOAST pointer into a memory context
with transaction lifetime, with an eye to detoasting it later in the
same transaction, are there circumstances where it wouldn't work?

Thanks,
-Chap

Re: lazy detoasting

От

Tom Lane

Дата:

01 апреля 2018 г., 08:19:47

Chapman Flack <chap@anastigmatix.net> writes:
> If I copy an out-of-line, on-disk TOAST pointer into a memory context
> with transaction lifetime, with an eye to detoasting it later in the
> same transaction, are there circumstances where it wouldn't work?

Should be safe *as long as you hold onto a snapshot that can see the
toast value's parent row*.  Otherwise, it's not.

For a counterexample, see the changes we had to make to avoid depending
on out-of-line toast values in the catcaches, commit 08e261cbc.

            regards, tom lane

Re: lazy detoasting

От

Chapman Flack

Дата:

05 апреля 2018 г., 05:34:04

On 04/01/18 01:19, Tom Lane wrote:
> Chapman Flack <chap@anastigmatix.net> writes:
>> If I copy an out-of-line, on-disk TOAST pointer into a memory context
>> with transaction lifetime, with an eye to detoasting it later in the
>> same transaction, are there circumstances where it wouldn't work?
> 
> Should be safe *as long as you hold onto a snapshot that can see the
> toast value's parent row*.  Otherwise, it's not.
> 
> For a counterexample, see the changes we had to make to avoid depending
> on out-of-line toast values in the catcaches, commit 08e261cbc.

Thanks. I think I see two approaches happening in that commit: retaining
a snapshot, if the tuples will be used within the transaction, or eagerly
detoasting, when persisting a holdable portal so tuples can be used after
the transaction.

Please bear with me as I check my understanding of snapshot management
by looking at PersistHoldablePortal(). There's a
PushActiveSnapshot(queryDesc->snapshot) in there. Is that because:

(a) that snapshot might not be on the active stack at this point, and it
    needs to be be there to keep it alive during this executor re-run, or

(b) it's known to be somewhere on the active stack already, but not
    necessarily on top, and needs to be pushed so it is topmost while
    re-running this executor (does the top snapshot on the active stack
    affect which tuples the executor sees? or is this stack's only purpose
    to keep snapshots alive?), or

(c) it's known to be somewhere on the stack already, but needs to be
    be pushed to make sure some stack-depth invariant is preserved
    (perhaps because of an implied pop in case of an error), or

(d) some other reason I haven't thought of ?

I *think* I'm smart enough to rule out choice (a), because it wouldn't
make sense to have queryDesc->snapshot refer to a snapshot that isn't
already on the stack (unless it's also been registered, in which case,
why would it need to be pushed on the stack too?), as then I would be
reckless to assume it's still alive to *be* pushed. Am I close?

I haven't yet gone to track down the code that assigns a snapshot to
queryDesc->snapshot.

Thanks,
-Chap

Re: lazy detoasting

От

Amit Kapila

Дата:

08 апреля 2018 г., 07:58:57

On Thu, Apr 5, 2018 at 8:04 AM, Chapman Flack <chap@anastigmatix.net> wrote:
> On 04/01/18 01:19, Tom Lane wrote:
>> Chapman Flack <chap@anastigmatix.net> writes:
>>> If I copy an out-of-line, on-disk TOAST pointer into a memory context
>>> with transaction lifetime, with an eye to detoasting it later in the
>>> same transaction, are there circumstances where it wouldn't work?
>>
>> Should be safe *as long as you hold onto a snapshot that can see the
>> toast value's parent row*.  Otherwise, it's not.
>>
>> For a counterexample, see the changes we had to make to avoid depending
>> on out-of-line toast values in the catcaches, commit 08e261cbc.
>
> Thanks. I think I see two approaches happening in that commit: retaining
> a snapshot, if the tuples will be used within the transaction, or eagerly
> detoasting, when persisting a holdable portal so tuples can be used after
> the transaction.
>
> Please bear with me as I check my understanding of snapshot management
> by looking at PersistHoldablePortal(). There's a
> PushActiveSnapshot(queryDesc->snapshot) in there. Is that because:
>
> (a) that snapshot might not be on the active stack at this point, and it
>     needs to be be there to keep it alive during this executor re-run, or
>
> (b) it's known to be somewhere on the active stack already, but not
>     necessarily on top, and needs to be pushed so it is topmost while
>     re-running this executor (does the top snapshot on the active stack
>     affect which tuples the executor sees? or is this stack's only purpose
>     to keep snapshots alive?), or
>
> (c) it's known to be somewhere on the stack already, but needs to be
>     be pushed to make sure some stack-depth invariant is preserved
>     (perhaps because of an implied pop in case of an error), or
>
> (d) some other reason I haven't thought of ?
>
> I *think* I'm smart enough to rule out choice (a), because it wouldn't
> make sense to have queryDesc->snapshot refer to a snapshot that isn't
> already on the stack (unless it's also been registered, in which case,
> why would it need to be pushed on the stack too?), as then I would be
> reckless to assume it's still alive to *be* pushed. Am I close?
>

I think it is somewhat close to what you have mentioned in (b).
Basically, it will help executor to use that snapshot for tuple
visibility.

> I haven't yet gone to track down the code that assigns a snapshot to
> queryDesc->snapshot.
>

See CreateQueryDesc().


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: lazy detoasting

От

Andrew Gierth

Дата:

08 апреля 2018 г., 09:01:26

>>>>> "Chapman" == Chapman Flack <chap@anastigmatix.net> writes:

 Chapman> Please bear with me as I check my understanding of snapshot
 Chapman> management by looking at PersistHoldablePortal(). There's a
 Chapman> PushActiveSnapshot(queryDesc->snapshot) in there. Is that
 Chapman> because:

 Chapman> (d) some other reason I haven't thought of ?

It has to be pushed as the active snapshot so that it's the snapshot
that the executor uses to run the query to populate the tuplestore which
becomes the "held" portal content.

I think you're confusing the stack of active snapshots with the registry
of snapshots. The snapshot of a portal that's not currently running in
the executor won't be on the stack, but it will be registered (which is
enough to maintain the session's reported xmin, which is what prevents
visible data from being vacuumed away).

-- 
Andrew (irc:RhodiumToad)

Re: lazy detoasting

От

Chapman Flack

Дата:

09 апреля 2018 г., 17:57:01

On 04/08/2018 02:01 AM, Andrew Gierth wrote:

>  Chapman> (d) some other reason I haven't thought of ?
> 
> It has to be pushed as the active snapshot so that it's the snapshot
> that the executor uses to run the query to populate the tuplestore which
> becomes the "held" portal content.

That seems closest to my (b) guess.

> I think you're confusing the stack of active snapshots with the registry
> of snapshots. The snapshot of a portal that's not currently running in
> the executor won't be on the stack, but it will be registered (which is
> enough to maintain the session's reported xmin, which is what prevents
> visible data from being vacuumed away).

That's why I was asking. The first paragraph in snapmgr.c seems to say
that the registry and the active stack both exist as ways to keep the
snapshot alive, with reachability from either one being sufficient:

* We keep track of snapshots in two ways: those "registered" by
* resowner.c, and the "active snapshot" stack.  All snapshots in
* either of them live in persistent memory.  When a snapshot is
* no longer in any of these lists (tracked by separate refcounts
* on each snapshot), its memory can be freed.

AFAICS, that is *all* that comment block has to say about why there's
an active snapshot stack. I believe you are saying it has another
important function, namely that its top element is what tells the
executor what can be seen. That makes sense, and to clarify that
was why I was asking.

I suppose the reason that's not mentioned in the comments is that it
was so obviously the ultimate purpose of the whole scheme that nobody
writing or reviewing the comments could imagine not knowing it. :)

-Chap

Re: lazy detoasting

От

Andrew Gierth

Дата:

10 апреля 2018 г., 07:30:56

>>>>> "Chapman" == Chapman Flack <chap@anastigmatix.net> writes:

 Chapman> AFAICS, that is *all* that comment block has to say about why
 Chapman> there's an active snapshot stack. I believe you are saying it
 Chapman> has another important function, namely that its top element is
 Chapman> what tells the executor what can be seen.

That's not precisely true - ultimately, the routines that do actual
scans take the snapshot to use as a parameter, and the executor mostly
references the snapshot from the EState; but a bunch of places do
require that ActiveSnapshot be set to the currently applicable snapshot
(e.g. for queries inside stable functions nested inside the main query).

-- 
Andrew (irc:RhodiumToad)

Re: lazy detoasting

От

Chapman Flack

Дата:

10 апреля 2018 г., 09:45:27

On 04/10/18 00:30, Andrew Gierth wrote:

> That's not precisely true - ultimately, the routines that do actual
> scans take the snapshot to use as a parameter, and the executor mostly
> references the snapshot from the EState; but a bunch of places do
> require that ActiveSnapshot be set to the currently applicable snapshot
> (e.g. for queries inside stable functions nested inside the main query).

I'm becoming increasingly glad I asked (or less embarrassed that I hadn't
figured it all out yet).  :)

Am I right in thinking that, for my original purpose of detoasting something
later in a transaction, all that matters is that I registered a snapshot
from the time at which I copied the toasted datum, and the resource owner
I registered it to has not been released yet, so rows referred to in the
snapshot haven't been vacuumed away? Is that a sufficient condition for
detoast to work?

Or would I need to do something more, like push and pop that snapshot
around the detoast call?

This would be in a PL function (or the handler for the PL function),
if the context matters.

-Chap

Re: lazy detoasting

От

Tom Lane

Дата:

10 апреля 2018 г., 17:06:24

Chapman Flack <chap@anastigmatix.net> writes:
> I'm becoming increasingly glad I asked (or less embarrassed that I hadn't
> figured it all out yet).  :)

> Am I right in thinking that, for my original purpose of detoasting something
> later in a transaction, all that matters is that I registered a snapshot
> from the time at which I copied the toasted datum, and the resource owner
> I registered it to has not been released yet, so rows referred to in the
> snapshot haven't been vacuumed away? Is that a sufficient condition for
> detoast to work?

I believe so.

> Or would I need to do something more, like push and pop that snapshot
> around the detoast call?

You shouldn't need that; toast reads intentionally use a non-MVCC-aware
"snapshot" to handle exactly this type of situation, where somebody is
trying to pull data out of a tuple that would no longer be visible to
the transaction's current snapshot.

Wouldn't be a bad idea to test this, of course ;-)

            regards, tom lane

Re: lazy detoasting

От

Chapman Flack

Дата:

10 апреля 2018 г., 18:26:19

On 04/10/2018 10:06 AM, Tom Lane wrote:
> Chapman Flack <chap@anastigmatix.net> writes:

>> Am I right in thinking that, for my original purpose of
>> detoasting something later in a transaction, all that matters
>> is that I registered a snapshot from the time at which I copied
>> the toasted datum, and the resource owner I registered it to
>> has not been released yet, ...
> 
> I believe so.

Ok.

I see I have about half a dozen choices for the snapshot that
I choose to squirrel away at the same time as the copied datum.
(That's not counting SnapshotToast, which I didn't know was a thing
until Pavan's thread this morning, but it's not a thing I can get,
just something constructed on the fly in the tuptoaster.)

Out of the six GetFooSnapshot()s, would I want to squirrel away
Active? Oldest? Transaction? This should be happening in a PL
function not doing anything arcane; the datum in question might
be passed to it as a parameter or retrieved from a query done
within the function.

GetOldestTransaction() is what the tuptoaster will eventually use
to construct SnapshotToast when looking for the data, but it's
probably overkill for me to save the oldest one in sight at the
time I save the datum. Or is it? Should I be confident that, at
the time I'm handed this datum, its toasted content must be
retrievable through the (Active? Transaction?) snapshot at that
moment, and it's enough to register that, while to register the
Oldest would be to pin something unnecessarily far back?

> Wouldn't be a bad idea to test this, of course ;-)

Mm-hmm. (Thunderbird has just corrected my spelling of mm-hmm.)
I'm still not sure I know enough to construct a meaningful test...

I assume that, while I'm doing this for a backend PL at the
moment, some of the same considerations will apply if a future
wire protocol is to support Craig's "v4 protocol TODO item -
Lazy fetch/stream of TOASTed valued" suggestion in
https://www.postgresql.org/message-id/53FF0EF8.100@2ndquadrant.com

-Chap

Re: lazy detoasting

От

Robert Haas

Дата:

10 апреля 2018 г., 23:03:11

On Tue, Apr 10, 2018 at 11:26 AM, Chapman Flack <chap@anastigmatix.net> wrote:
> Out of the six GetFooSnapshot()s, would I want to squirrel away
> Active? Oldest? Transaction?

I suspect you want, or maybe need, to use the same snapshot as the
scan that retrieved the tuple containing the toasted datum.

(This advice may be worth what you paid for it.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: lazy detoasting

От

Chapman Flack

Дата:

11 апреля 2018 г., 00:04:04

On 04/10/2018 04:03 PM, Robert Haas wrote:

> I suspect you want, or maybe need, to use the same snapshot as the
> scan that retrieved the tuple containing the toasted datum.

I'm sure it's worth more than that, but I don't know if it's
implementable.

If I'm a function, and the datum came to me as a parameter, I may
have no way to determine what snapshot the enclosing query used to
obtain the thing passed to me. Or, if I found it myself, say by an
SPI query within the function, usually that's at a level of abstraction
somewhere above what-snapshot-was-used-in-the-scan.

But in both cases, it's expected that I could successfully detoast
either datum if I did so right there on the spot, as that's the usual
convention, right? So at that moment, something in the set of
registered || active snapshots is protecting the tuples I need.

If it's impractical to determine which snapshot is needed (or just
enough work to obviate any benefit of lazy detoasting), I wonder if
there's at least a cheap way to check a particular snapshot
for suitability wrt a given toast pointer. Check a couple usual
suspects, find one most of the time, fall back to eager detoasting
otherwise?

Guess I need to go back for a deeper dive on just what constitutes
a toast pointer. I was skimming last time....

-Chap

Re: lazy detoasting

От

Chapman Flack

Дата:

11 апреля 2018 г., 00:19:36

On 04/10/2018 05:04 PM, Chapman Flack wrote:

> If I'm a function, and ... I found [the datum] myself, say by an
> SPI query within the function, usually that's at a level of abstraction
> somewhere above what-snapshot-was-used-in-the-scan.

It looks like for that case (since the commit 08e261cbc that Tom
mentioned earlier), there can sometimes be a portal->holdSnapshot
that will be the right one.

-Chap

Re: lazy detoasting

От

Chapman Flack

Дата:

11 апреля 2018 г., 01:24:55

On 04/10/2018 05:04 PM, Chapman Flack wrote:
> ... I wonder if
> there's at least a cheap way to check a particular snapshot
> for suitability wrt a given toast pointer. Check a couple usual
> suspects, find one most of the time, fall back to eager detoasting
> otherwise?
> 
> Guess I need to go back for a deeper dive on just what constitutes
> a toast pointer. I was skimming last time....

I see all I have in a toast pointer is a relation id and a value
oid, so probably no way to check that a given snapshot 'works' to
find it, at least no more cheaply than by opening the toast relation
and trying to find it.

Welp, what tuptoaster does to construct its SnapshotToast is to
grab GetOldestSnapshot():

*  since we don't know which one to use, just use the oldest one.
*  This is safe: at worst, we will get a "snapshot too old" error
*  that might have been avoided otherwise.

... which means, I take it, that a more recent snapshot
might work, but this conservative choice would only fail
if the oldest snapshot has really been around for many days,
under typical old_snapshot_threshold settings?

... and if I'm getting a value from a portal that happens to have
a non-null holdSnapshot, then that would be the one to use, in
preference to just conservatively grabbing the oldest?

-Chap

... am I right that the init_toast_snapshot construction is really
making only one change to the snapshot data, namely changing the
'satisfies' condition to HeapTupleSatisfiesToast ?

The lsn and whenTaken seem to be fetched from the original data
and stored right back without change.

Re: lazy detoasting

От

Jan Wieck

Дата:

11 апреля 2018 г., 05:17:57

Maybe I'm missing something here, but let me put $.02 in anyway.

TOAST reuses entries. If a toasted value is unchanged on UPDATE (i.e. the toast pointer didn't get replaced by a new value), the new tuple points to the same toast slices as the old. If it is changed, the current transaction DELETEs the old toast slices and INSERTs new ones (if needed).

If your session has a transaction snapshot that protects the old toast slices from being vacuumed away, you are fine. Even under READ COMMITTED (that is why it uses that other visibility, so they don't go away when the concurrent UPDATE commits and rather behave like REPEATABLE READ).

At least that is what the code was intended to do originally.

Regards, Jan

On Tue, Apr 10, 2018 at 6:24 PM, Chapman Flack <chap@anastigmatix.net> wrote:

On 04/10/2018 05:04 PM, Chapman Flack wrote:
> ... I wonder if
> there's at least a cheap way to check a particular snapshot
> for suitability wrt a given toast pointer. Check a couple usual
> suspects, find one most of the time, fall back to eager detoasting
> otherwise?
>
> Guess I need to go back for a deeper dive on just what constitutes
> a toast pointer. I was skimming last time....

I see all I have in a toast pointer is a relation id and a value
oid, so probably no way to check that a given snapshot 'works' to
find it, at least no more cheaply than by opening the toast relation
and trying to find it.

Welp, what tuptoaster does to construct its SnapshotToast is to
grab GetOldestSnapshot():

* since we don't know which one to use, just use the oldest one.
* This is safe: at worst, we will get a "snapshot too old" error
* that might have been avoided otherwise.

... which means, I take it, that a more recent snapshot
might work, but this conservative choice would only fail
if the oldest snapshot has really been around for many days,
under typical old_snapshot_threshold settings?

... and if I'm getting a value from a portal that happens to have
a non-null holdSnapshot, then that would be the one to use, in
preference to just conservatively grabbing the oldest?

-Chap

... am I right that the init_toast_snapshot construction is really
making only one change to the snapshot data, namely changing the
'satisfies' condition to HeapTupleSatisfiesToast ?

The lsn and whenTaken seem to be fetched from the original data
and stored right back without change.

Jan Wieck
Senior Postgres Architect

http://pgblog.wi3ck.info

Re: lazy detoasting

От

Amit Kapila

Дата:

11 апреля 2018 г., 16:28:18

On Wed, Apr 11, 2018 at 2:34 AM, Chapman Flack <chap@anastigmatix.net> wrote:
> On 04/10/2018 04:03 PM, Robert Haas wrote:
>
>> I suspect you want, or maybe need, to use the same snapshot as the
>> scan that retrieved the tuple containing the toasted datum.
>
> I'm sure it's worth more than that, but I don't know if it's
> implementable.
>
> If I'm a function, and the datum came to me as a parameter, I may
> have no way to determine what snapshot the enclosing query used to
> obtain the thing passed to me.
>

Before query execution, we push the active snapshot, so any time
during execution, if you get the active snapshot via
GetActiveSnapshot(), you can access it.  So, I think during a function
execution, if you use GetActiveSnapshot, you should get the snapshot
used by enclosing query.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: lazy detoasting

От

Chapman Flack

Дата:

11 апреля 2018 г., 17:28:38

On 04/10/2018 10:17 PM, Jan Wieck wrote:
> If your session has a transaction snapshot that protects the old toast
> slices from being vacuumed away, you are fine.

That harks back to my earlier (naïve?) thought that, as long as
my lazy datum doesn't have to outlive the transaction, lazy
detoasting might Just Work.

Tom seems to say otherwise, in
https://www.postgresql.org/message-id/23711.1522559987%40sss.pgh.pa.us

The message of the commit he mentions there includes:

  I believe this code was all right when written, because our
  management of a session's exposed xmin was such that the TOAST
  references were safe until end of transaction.  But that's
  no longer true now that we can advance or clear our PGXACT.xmin
  intra-transaction.

So perhaps my original thought really would have Just Worked,
in PG versions before that changed? When did that change, btw?

On 04/11/2018 09:28 AM, Amit Kapila wrote:
> Before query execution, we push the active snapshot, so any time
> during execution, if you get the active snapshot via
> GetActiveSnapshot(), you can access it.  So, I think during a function
> execution, if you use GetActiveSnapshot, you should get the snapshot
> used by enclosing query.

So perhaps GetActiveSnapshot is the right choice to accompany a datum
that came as a function parameter.

For something retrieved from an SPI query within the function, it
looks like I will have a portal->holdSnapshot in certain cases
(PORTAL_ONE_RETURNING, PORTAL_ONE_MOD_WITH, PORTAL_UTIL_SELECT).

The comments added to portal.h in that commit say:

* Snapshot under which tuples in the holdStore were read.  We must keep
* a reference to this snapshot if there is any possibility that the
* tuples contain TOAST references, because releasing the snapshot ...

Soo, is that telling me that the three cases named above, where
holdSnapshot gets set, are in fact the only cases where "there is any
possibility that the tuples contain TOAST references", and therefore
I could count on holdSnapshot to be nonnull whenever it matters?

-Chap

Re: lazy detoasting

От

Tom Lane

Дата:

11 апреля 2018 г., 17:41:38

Chapman Flack <chap@anastigmatix.net> writes:
> On 04/10/2018 10:17 PM, Jan Wieck wrote:
>> If your session has a transaction snapshot that protects the old toast
>> slices from being vacuumed away, you are fine.

> That harks back to my earlier (naïve?) thought that, as long as
> my lazy datum doesn't have to outlive the transaction, lazy
> detoasting might Just Work.

> Tom seems to say otherwise, in
> https://www.postgresql.org/message-id/23711.1522559987%40sss.pgh.pa.us

> The message of the commit he mentions there includes:

>   I believe this code was all right when written, because our
>   management of a session's exposed xmin was such that the TOAST
>   references were safe until end of transaction.  But that's
>   no longer true now that we can advance or clear our PGXACT.xmin
>   intra-transaction.

> So perhaps my original thought really would have Just Worked,
> in PG versions before that changed? When did that change, btw?

When snapmgr.c came in, which seems to have been 8.4.

The core of the problem now is that in a READ COMMITTED transaction, we
may not be holding any snapshot at all between statements.  So if you're
hanging onto a toast pointer across statements you're at risk.

On the other hand, it's also arguable that you shouldn't be interested
in dereferencing such a pointer later than the statement in which it
was read, precisely because it no longer represents accessible data.
So maybe we need to take two steps back and talk about the semantics
of what you're doing.

            regards, tom lane

Re: lazy detoasting

От

Chapman Flack

Дата:

11 апреля 2018 г., 18:27:19

On 04/11/2018 10:41 AM, Tom Lane wrote:
> The core of the problem now is that in a READ COMMITTED transaction, we
> may not be holding any snapshot at all between statements.  So if you're
> hanging onto a toast pointer across statements you're at risk.
> 
> On the other hand, it's also arguable that you shouldn't be interested
> in dereferencing such a pointer later than the statement in which it
> was read, precisely because it no longer represents accessible data.
> So maybe we need to take two steps back and talk about the semantics
> of what you're doing.

The mission is to implement java.sql.SQLXML, which has long been
missing from PL/Java.

https://docs.oracle.com/javase/6/docs/api/index.html?java/sql/SQLXML.html

The API spec includes this: "An SQLXML object is valid for
the duration of the transaction in which it was created."

This is the class of object your Java code can retrieve from a
ResultSet row from a query with an XML column type. (It's also
what can be passed to you as a function parameter, if your
function's SQL declaration has a parameter type XML.)

An SQLXML object represents the XML value. You get to read the
object (exactly once, or not at all) for the content it represents,
in your choice of a half-dozen different ways corresponding to
available Java XML-processing APIs. (The SQLXML implementation knows
what the underlying stored form is, so it encapsulates the knowledge
of how most efficiently to get it from there to the form the program
wants to work with.)

In most cases I can easily imagine, a function that gets an SQLXML
object is going to read it "pretty soon" ... surely some time during
the function call, if it was passed as a parameter, and probably
within the same row loop iteration, for one retrieved from a query.
However, the spec does explicitly provide that you could, for whatever
reason, sit on the thing for a while, then read it later in the same
transaction. You should get the same stuff you would have if you had
read it right away, whether or not stuff has changed in the database
since. Eager detoasting at the time of creating the object, into a
transaction-scoped memory context, would trivially have that behavior,
but on the chance that XML values might be large, and a function might
conceivably never read the thing at all on certain code paths, I'd
rather defer detoasting until the code holding the SQLXML object
actually wants to read it.

-Chap

Re: lazy detoasting

От

Tom Lane

Дата:

11 апреля 2018 г., 18:33:22

Chapman Flack <chap@anastigmatix.net> writes:
> On 04/11/2018 10:41 AM, Tom Lane wrote:
>> So maybe we need to take two steps back and talk about the semantics
>> of what you're doing.

> The mission is to implement java.sql.SQLXML, which has long been
> missing from PL/Java.
> This is the class of object your Java code can retrieve from a
> ResultSet row from a query with an XML column type. (It's also
> what can be passed to you as a function parameter, if your
> function's SQL declaration has a parameter type XML.)

OK, but if this object only lives within a single function call,
what's the problem?  The underlying row must be visible to the
calling query, and that condition won't change for the duration
of the call.

Things could get interesting if you consider a Java *procedure*,
but I think you could afford to detoast-on-entry for that case.

(Wanders away wondering what Peter has done about toasted parameter
values for procedures in general ...)

            regards, tom lane

Re: lazy detoasting

От

Chapman Flack

Дата:

11 апреля 2018 г., 20:21:48

On 04/11/2018 11:33 AM, Tom Lane wrote:
> Chapman Flack <chap@anastigmatix.net> writes:
>> The mission is to implement java.sql.SQLXML, which has long been
>> missing from PL/Java.
>> This is the class of object your Java code can retrieve from a
>> ResultSet row from a query with an XML column type. (It's also
>> what can be passed to you as a function parameter, if your
>> function's SQL declaration has a parameter type XML.)
> 
> OK, but if this object only lives within a single function call,
> what's the problem?  The underlying row must be visible to the
> calling query, and that condition won't change for the duration
> of the call.

Well, the devilsAdvocate() function would stash the object
in a static, then try to look at it some time in a later call
in the same transaction.

Mind you, I have no use case in mind for a function wanting to do that
(other than as a test case for spec compliance). But the API spec for
the SQLXML class expressly says "object is valid for the duration
of the transaction", and I try not to renege on spec guarantees
if I can find a practical way not to.

> Things could get interesting if you consider a Java *procedure*,

and, yes, that. Though so far, there still are no PL/Java *procedures*.
I haven't even been following that development closely enough yet to
think about what a plan for supporting those would require.

-Chap

Re: lazy detoasting

От

Tom Lane

Дата:

11 апреля 2018 г., 20:55:17

Chapman Flack <chap@anastigmatix.net> writes:
> On 04/11/2018 11:33 AM, Tom Lane wrote:
>> OK, but if this object only lives within a single function call,
>> what's the problem?  The underlying row must be visible to the
>> calling query, and that condition won't change for the duration
>> of the call.

> Well, the devilsAdvocate() function would stash the object
> in a static, then try to look at it some time in a later call
> in the same transaction.

If you're worried about that, you should also worry about what happens
if the function uses the static variable in some later transaction.
The spec grants you license to throw an error, but it still needs to
be a clean error (not something about "can't find toast value", IMO).

Can you detect that the value is being stored in a long-lived variable
and detoast at that point?

            regards, tom lane

Re: lazy detoasting

От

Chapman Flack

Дата:

11 апреля 2018 г., 21:38:46

On 04/11/2018 01:55 PM, Tom Lane wrote:
> Chapman Flack <chap@anastigmatix.net> writes:
>> Well, the devilsAdvocate() function would stash the object
>> in a static, then try to look at it some time in a later call
>> in the same transaction.
> 
> If you're worried about that, you should also worry about what happens
> if the function uses the static variable in some later transaction.
> The spec grants you license to throw an error, but it still needs to
> be a clean error (not something about "can't find toast value", IMO).

There's precedent for that kind of thing in PL/Java already ... objects
that Java considers alive as long as some code holds a reference
to them, but proxy for things in PG that may only have function-call
lifetime or cursor-row lifetime, etc. If they are closed by Java code
(or the Java GC finds them unreachable) first, they have to remember
to release their PG stuff; if the PG stuff goes first, they have to
update themselves to throw a suitable "you've kept me past my sell-by
date" exception if the Java code tries to use them again.

Thomas implemented most of those things ages ago; this is the first
I've added myself, with a little adjustment of technique because his
were for lifetimes shorter than transaction. I'm using the
TopTransactionResourceOwner to learn when the transaction is finished.
Resource owners have been around as long as any PG version PL/Java
supports, so that seems ok.

> Can you detect that the value is being stored in a long-lived variable
> and detoast at that point?

Not easily, I don't think. The question resembles "is this object
still reachable, or unreachable, now as I exit this function call?"
or at some other specific time. The Java garbage collector eventually
learns what's become unreachable, but it doesn't promise *when*.

But let me return to the earlier idea for a moment: are you saying
that it might *not* be sufficient to find an applicable snapshot at
the time of constructing the object, and register that snapshot
on TopTransactionResourceOwner?

It would obviously not be kept around any longer than the transaction,
and would be released earlier whenever the Java code reads/closes it,
or lets go of the last reference and GC finds it. In all the typical
cases I can imagine, it would be registered very briefly, with the
object being constructed/read/freed in quick succession.

-Chap

Re: lazy detoasting

От

Tom Lane

Дата:

11 апреля 2018 г., 22:04:13

Chapman Flack <chap@anastigmatix.net> writes:
> But let me return to the earlier idea for a moment: are you saying
> that it might *not* be sufficient to find an applicable snapshot at
> the time of constructing the object, and register that snapshot
> on TopTransactionResourceOwner?

The problem is to know which snapshot is applicable; if the transaction
has more than one, you don't know which was used to read the row of
interest.  I suppose you could be conservative and use the oldest one,
if snapmgr lets you find that.

            regards, tom lane

Re: lazy detoasting

От

Chapman Flack

Дата:

11 апреля 2018 г., 22:13:52

On 04/11/2018 03:04 PM, Tom Lane wrote:
> Chapman Flack <chap@anastigmatix.net> writes:
>> that it might *not* be sufficient to find an applicable snapshot at
>> the time of constructing the object, and register that snapshot
>> on TopTransactionResourceOwner?
> 
> The problem is to know which snapshot is applicable; if the transaction
> has more than one, you don't know which was used to read the row of
> interest.  I suppose you could be conservative and use the oldest one,
> if snapmgr lets you find that.

There does seem to be GetOldestSnapshot(), returning
older( oldest on active stack, oldest on registered heap ).

And it seems to be the very thing called by tuptoaster itself,
right after the comment "since we don't know which one to use,
just use the oldest one".

-Chap

Re: lazy detoasting

От

Andrew Gierth

Дата:

11 апреля 2018 г., 22:57:12

>>>>> "Chapman" == Chapman Flack <chap@anastigmatix.net> writes:

 Chapman> There's precedent for that kind of thing in PL/Java already
 Chapman> ... objects that Java considers alive as long as some code
 Chapman> holds a reference to them, but proxy for things in PG that may
 Chapman> only have function-call lifetime or cursor-row lifetime, etc.
 Chapman> If they are closed by Java code (or the Java GC finds them
 Chapman> unreachable) first, they have to remember to release their PG
 Chapman> stuff; if the PG stuff goes first, they have to update
 Chapman> themselves to throw a suitable "you've kept me past my sell-by
 Chapman> date" exception if the Java code tries to use them again.

How's the code doing this? I couldn't find it in a cursory scan.

-- 
Andrew (irc:RhodiumToad)

Re: lazy detoasting

От

Peter Eisentraut

Дата:

24 апреля 2018 г., 21:28:02

On 4/11/18 11:33, Tom Lane wrote:
> (Wanders away wondering what Peter has done about toasted parameter
> values for procedures in general ...)

I'm not sure.  How can a procedure have a toasted parameter value?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: lazy detoasting

От

Andrew Gierth

Дата:

25 апреля 2018 г., 14:35:24

>>>>> "Peter" == Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

 > On 4/11/18 11:33, Tom Lane wrote:
 >> (Wanders away wondering what Peter has done about toasted parameter
 >> values for procedures in general ...)

 Peter> I'm not sure.  How can a procedure have a toasted parameter value?

do $$ declare a text; begin select f1.a into a from f1; call p1(a); end; $$;

-- 
Andrew (irc:RhodiumToad)

Re: lazy detoasting

От

Andrew Gierth

Дата:

25 апреля 2018 г., 14:50:50

>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

>>>>> "Peter" == Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
 >> On 4/11/18 11:33, Tom Lane wrote:
 >>> (Wanders away wondering what Peter has done about toasted parameter
 >>> values for procedures in general ...)

 Peter> I'm not sure.  How can a procedure have a toasted parameter value?

 Andrew> do $$ declare a text; begin select f1.a into a from f1; call p1(a); end; $$;

And behold:

do $$
  declare a text;
  begin
    select f1.a into a from f1;
    delete from f1;
    commit;
    perform pg_sleep(10);  -- vacuum f1 in another session while it sleeps
    call p1(a);
  end; $$;
INFO:  a: (t,t,f,"missing chunk number 0",,)

(p1 in this case is using toast_item_detail() from the module I just put
up at https://github.com/RhodiumToad/pg-toastutils to examine the value)

-- 
Andrew (irc:RhodiumToad)

Re: lazy detoasting

От

Peter Eisentraut

Дата:

01 мая 2018 г., 17:04:38

On 4/25/18 07:50, Andrew Gierth wrote:
> do $$
>   declare a text;
>   begin
>     select f1.a into a from f1;
>     delete from f1;
>     commit;
>     perform pg_sleep(10);  -- vacuum f1 in another session while it sleeps
>     call p1(a);
>   end; $$;
> INFO:  a: (t,t,f,"missing chunk number 0",,)
> 
> (p1 in this case is using toast_item_detail() from the module I just put
> up at https://github.com/RhodiumToad/pg-toastutils to examine the value)

Is there a more self-contained way to test this?  I have been trying
with something like

create table test1 (a int, b text);

insert into test1 values (1, repeat('foo', 2000));

do $$
  declare
    x text;
  begin
    select test1.b into x from test1;
    delete from test1;
    commit;
    perform pg_sleep(10);  -- vacuum test1 in another session
    raise notice 'x = %', x;  -- should fail
  end;
$$;

But it doesn't fail.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: lazy detoasting

От

Andrew Gierth

Дата:

02 мая 2018 г., 02:56:38

>>>>> "Peter" == Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

 Peter> Is there a more self-contained way to test this? I have been
 Peter> trying with something like

 Peter> create table test1 (a int, b text);

 Peter> insert into test1 values (1, repeat('foo', 2000));

That value is no good because it's too compressible; it'll be left
inline in the main table rather than being externalized, so the value of
'x' in the DO-block is still self-contained (though it's still toasted
in the sense of being VARATT_IS_EXTENDED).

I tend to use something like this:

insert into test1
  values (1, (select string_agg(chr(32+floor(95*random())::integer),'')
                from generate_series(1,10000)));

If I do that, I get a different error from your test (whether or not the
vacuum is done):

ERROR:  no known snapshots
CONTEXT:  PL/pgSQL function inline_code_block line 1 at RAISE

This is another issue that was mentioned before in relation to
procedures.

-- 
Andrew (irc:RhodiumToad)

Re: lazy detoasting

От

Andrew Gierth

Дата:

02 мая 2018 г., 03:20:05

>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

 Andrew> ERROR:  no known snapshots
 Andrew> CONTEXT:  PL/pgSQL function inline_code_block line 1 at RAISE

 Andrew> This is another issue that was mentioned before in relation to
 Andrew> procedures.

See  https://www.postgresql.org/message-id/29608.1518533639@sss.pgh.pa.us

-- 
Andrew (irc:RhodiumToad)

Re: lazy detoasting

От

Peter Eisentraut

Дата:

03 мая 2018 г., 19:41:14

On 5/1/18 19:56, Andrew Gierth wrote:
>  Peter> insert into test1 values (1, repeat('foo', 2000));
> 
> That value is no good because it's too compressible; it'll be left
> inline in the main table rather than being externalized, so the value of
> 'x' in the DO-block is still self-contained (though it's still toasted
> in the sense of being VARATT_IS_EXTENDED).

Right.  I added

alter table test1 alter column b set storage external;

then I can see the error.

The attached test fixes this issue by flattening the toast values before
storing them into PL/pgSQL variables.  It could use another check to see
if there are other code paths that need similar adjustments, but I think
it's the right idea in general.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

v1-0001-PL-pgSQL-Flatten-TOAST-data-in-nonatomic-context.patch

Re: lazy detoasting

От

Andrew Gierth

Дата:

03 мая 2018 г., 20:03:05

>>>>> "Peter" == Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

 Peter> The attached test fixes this issue by flattening the toast
 Peter> values before storing them into PL/pgSQL variables. It could use
 Peter> another check to see if there are other code paths that need
 Peter> similar adjustments, but I think it's the right idea in general.

Uhh.

What about:

  somevar := (select blah);

or

  somevar := function_returning_toasted_val(blah);

or

  call someproc(function_returning_toasted_val(blah));

or or or ...

-- 
Andrew (irc:RhodiumToad)

Re: lazy detoasting

От

Peter Eisentraut

Дата:

12 мая 2018 г., 00:17:19

On 5/3/18 13:03, Andrew Gierth wrote:
>>>>>> "Peter" == Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> 
>  Peter> The attached test fixes this issue by flattening the toast
>  Peter> values before storing them into PL/pgSQL variables. It could use
>  Peter> another check to see if there are other code paths that need
>  Peter> similar adjustments, but I think it's the right idea in general.
> 
> Uhh.
> 
> What about:
> or or or ...

Here is a more complete patch.  I made a call graph to get to the
bottom, literally, of how variable assignments happen in PL/pgSQL.  (See
attached.)  There are four leaf functions to patch up.

Also, I wrote some isolation tests to hit each of these cases.  I wasn't
able to construct one for expanded_record_set_fields(), but the
principle there should be the same.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: lazy detoasting

От

Tom Lane

Дата:

13 мая 2018 г., 00:25:05

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> Here is a more complete patch.  I made a call graph to get to the
> bottom, literally, of how variable assignments happen in PL/pgSQL.  (See
> attached.)  There are four leaf functions to patch up.
> Also, I wrote some isolation tests to hit each of these cases.  I wasn't
> able to construct one for expanded_record_set_fields(), but the
> principle there should be the same.

I don't like this patch too much: it leaks memory in some places where
that's not a good idea, and in expanded_record_set_field_internal,
it's actually allocating the new field value in the wrong context.
And it pays no attention to updating comments, even ones that it's
flat out invalidated.

I also realized while poking at it that it'd broken much of the work
we did to optimize array and record variables in plpgsql by storing
them in "expanded" form, because in a non-atomic context it causes
assign_simple_var to forcibly flatten anything that heap_tuple_fetch_attr
knows how to flatten --- including "expanded" datums.

To make that last case work properly, we want assign_simple_var to only
flatten things that are actually ONDISK or INDIRECT external datums.
It's okay to leave "expanded" datums as-is, because they're just in memory
and as long as we haven't screwed up memory management they'll be fine
across a transaction boundary.

I made a new macro in postgres.h to recognize such cases, but I wonder
if anyone thinks we should do that differently.  It seems a bit odd
that we've not needed to make that distinction before.  OTOH, flattening
all external datums does seem like the right semantics for the
expandedrecord.c functions to have, so maybe it's fine.

Anyway, attached is a revised patch.  I found a test case for
expanded_record_set_fields(), too.

            regards, tom lane

diff --git a/src/backend/utils/adt/expandedrecord.c b/src/backend/utils/adt/expandedrecord.c
index 0bf5fe8..b1b6883 100644
--- a/src/backend/utils/adt/expandedrecord.c
+++ b/src/backend/utils/adt/expandedrecord.c
@@ -19,6 +19,7 @@
 #include "postgres.h"

 #include "access/htup_details.h"
+#include "access/tuptoaster.h"
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "utils/builtins.h"
@@ -41,7 +42,7 @@ static const ExpandedObjectMethods ER_methods =

 /* Other local functions */
 static void ER_mc_callback(void *arg);
-static MemoryContext get_domain_check_cxt(ExpandedRecordHeader *erh);
+static MemoryContext get_short_term_cxt(ExpandedRecordHeader *erh);
 static void build_dummy_expanded_header(ExpandedRecordHeader *main_erh);
 static pg_noinline void check_domain_for_new_field(ExpandedRecordHeader *erh,
                            int fnumber,
@@ -57,8 +58,9 @@ static pg_noinline void check_domain_for_new_tuple(ExpandedRecordHeader *erh,
  *
  * The expanded record is initially "empty", having a state logically
  * equivalent to a NULL composite value (not ROW(NULL, NULL, ...)).
- * Note that this might not be a valid state for a domain type; if the
- * caller needs to check that, call expanded_record_set_tuple(erh, NULL).
+ * Note that this might not be a valid state for a domain type;
+ * if the caller needs to check that, call
+ * expanded_record_set_tuple(erh, NULL, false, false).
  *
  * The expanded object will be a child of parentcontext.
  */
@@ -424,8 +426,11 @@ make_expanded_record_from_exprecord(ExpandedRecordHeader *olderh,
  *
  * The tuple is physically copied into the expanded record's local storage
  * if "copy" is true, otherwise it's caller's responsibility that the tuple
- * will live as long as the expanded record does.  In any case, out-of-line
- * fields in the tuple are not automatically inlined.
+ * will live as long as the expanded record does.
+ *
+ * Out-of-line field values in the tuple are automatically inlined if
+ * "expand_external" is true, otherwise not.  (The combination copy = false,
+ * expand_external = true is not sensible and not supported.)
  *
  * Alternatively, tuple can be NULL, in which case we just set the expanded
  * record to be empty.
@@ -433,7 +438,8 @@ make_expanded_record_from_exprecord(ExpandedRecordHeader *olderh,
 void
 expanded_record_set_tuple(ExpandedRecordHeader *erh,
                           HeapTuple tuple,
-                          bool copy)
+                          bool copy,
+                          bool expand_external)
 {
     int            oldflags;
     HeapTuple    oldtuple;
@@ -453,6 +459,25 @@ expanded_record_set_tuple(ExpandedRecordHeader *erh,
         check_domain_for_new_tuple(erh, tuple);

     /*
+     * If we need to get rid of out-of-line field values, do so, using the
+     * short-term context to avoid leaking whatever cruft the toast fetch
+     * might generate.
+     */
+    if (expand_external && tuple)
+    {
+        /* Assert caller didn't ask for unsupported case */
+        Assert(copy);
+        if (HeapTupleHasExternal(tuple))
+        {
+            oldcxt = MemoryContextSwitchTo(get_short_term_cxt(erh));
+            tuple = toast_flatten_tuple(tuple, erh->er_tupdesc);
+            MemoryContextSwitchTo(oldcxt);
+        }
+        else
+            expand_external = false;    /* need not clean up below */
+    }
+
+    /*
      * Initialize new flags, keeping only non-data status bits.
      */
     oldflags = erh->flags;
@@ -468,6 +493,10 @@ expanded_record_set_tuple(ExpandedRecordHeader *erh,
         newtuple = heap_copytuple(tuple);
         newflags |= ER_FLAG_FVALUE_ALLOCED;
         MemoryContextSwitchTo(oldcxt);
+
+        /* We can now flush anything that detoasting might have leaked. */
+        if (expand_external)
+            MemoryContextReset(erh->er_short_term_cxt);
     }
     else
         newtuple = tuple;
@@ -676,23 +705,13 @@ ER_get_flat_size(ExpandedObjectHeader *eohptr)
                 VARATT_IS_EXTERNAL(DatumGetPointer(erh->dvalues[i])))
             {
                 /*
-                 * It's an external toasted value, so we need to dereference
-                 * it so that the flat representation will be self-contained.
-                 * Do this step in the caller's context because the TOAST
-                 * fetch might leak memory.  That means making an extra copy,
-                 * which is a tad annoying, but repetitive leaks in the
-                 * record's context would be worse.
+                 * expanded_record_set_field_internal can do the actual work
+                 * of detoasting.  It needn't recheck domain constraints.
                  */
-                Datum        newValue;
-
-                newValue = PointerGetDatum(PG_DETOAST_DATUM(erh->dvalues[i]));
-                /* expanded_record_set_field can do the rest */
-                /* ... and we don't need it to recheck domain constraints */
                 expanded_record_set_field_internal(erh, i + 1,
-                                                   newValue, false,
+                                                   erh->dvalues[i], false,
+                                                   true,
                                                    false);
-                /* Might as well free the detoasted value */
-                pfree(DatumGetPointer(newValue));
             }
         }

@@ -1087,12 +1106,16 @@ expanded_record_fetch_field(ExpandedRecordHeader *erh, int fnumber,
  * (without changing the record's state) if the domain's constraints would
  * be violated.
  *
+ * If expand_external is true and newValue is an out-of-line value, we'll
+ * forcibly detoast it so that the record does not depend on external storage.
+ *
  * Internal callers can pass check_constraints = false to skip application
  * of domain constraints.  External callers should never do that.
  */
 void
 expanded_record_set_field_internal(ExpandedRecordHeader *erh, int fnumber,
                                    Datum newValue, bool isnull,
+                                   bool expand_external,
                                    bool check_constraints)
 {
     TupleDesc    tupdesc;
@@ -1124,23 +1147,46 @@ expanded_record_set_field_internal(ExpandedRecordHeader *erh, int fnumber,
         elog(ERROR, "cannot assign to field %d of expanded record", fnumber);

     /*
-     * Copy new field value into record's context, if needed.
+     * Copy new field value into record's context, and deal with detoasting,
+     * if needed.
      */
     attr = TupleDescAttr(tupdesc, fnumber - 1);
     if (!isnull && !attr->attbyval)
     {
         MemoryContext oldcxt;

+        /* If requested, detoast any external value */
+        if (expand_external)
+        {
+            if (attr->attlen == -1 &&
+                VARATT_IS_EXTERNAL(DatumGetPointer(newValue)))
+            {
+                /* Detoasting should be done in short-lived context. */
+                oldcxt = MemoryContextSwitchTo(get_short_term_cxt(erh));
+                newValue = PointerGetDatum(heap_tuple_fetch_attr((struct varlena *) DatumGetPointer(newValue)));
+                MemoryContextSwitchTo(oldcxt);
+            }
+            else
+                expand_external = false;    /* need not clean up below */
+        }
+
+        /* Copy value into record's context */
         oldcxt = MemoryContextSwitchTo(erh->hdr.eoh_context);
         newValue = datumCopy(newValue, false, attr->attlen);
         MemoryContextSwitchTo(oldcxt);

+        /* We can now flush anything that detoasting might have leaked */
+        if (expand_external)
+            MemoryContextReset(erh->er_short_term_cxt);
+
         /* Remember that we have field(s) that may need to be pfree'd */
         erh->flags |= ER_FLAG_DVALUES_ALLOCED;

         /*
          * While we're here, note whether it's an external toasted value,
-         * because that could mean we need to inline it later.
+         * because that could mean we need to inline it later.  (Think not to
+         * merge this into the previous expand_external logic: datumCopy could
+         * by itself have made the value non-external.)
          */
         if (attr->attlen == -1 &&
             VARATT_IS_EXTERNAL(DatumGetPointer(newValue)))
@@ -1193,14 +1239,20 @@ expanded_record_set_field_internal(ExpandedRecordHeader *erh, int fnumber,
  * Caller must ensure that the provided datums are of the right types
  * to match the record's previously assigned rowtype.
  *
+ * If expand_external is true, we'll forcibly detoast out-of-line field values
+ * so that the record does not depend on external storage.
+ *
  * Unlike repeated application of expanded_record_set_field(), this does not
  * guarantee to leave the expanded record in a non-corrupt state in event
  * of an error.  Typically it would only be used for initializing a new
- * expanded record.
+ * expanded record.  Also, because we expect this to be applied at most once
+ * in the lifespan of an expanded record, we do not worry about any cruft
+ * that detoasting might leak.
  */
 void
 expanded_record_set_fields(ExpandedRecordHeader *erh,
-                           const Datum *newValues, const bool *isnulls)
+                           const Datum *newValues, const bool *isnulls,
+                           bool expand_external)
 {
     TupleDesc    tupdesc;
     Datum       *dvalues;
@@ -1245,22 +1297,37 @@ expanded_record_set_fields(ExpandedRecordHeader *erh,
         if (!attr->attbyval)
         {
             /*
-             * Copy new field value into record's context, if needed.
+             * Copy new field value into record's context, and deal with
+             * detoasting, if needed.
              */
             if (!isnull)
             {
-                newValue = datumCopy(newValue, false, attr->attlen);
+                /* Is it an external toasted value? */
+                if (attr->attlen == -1 &&
+                    VARATT_IS_EXTERNAL(DatumGetPointer(newValue)))
+                {
+                    if (expand_external)
+                    {
+                        /* Detoast as requested while copying the value */
+                        newValue = PointerGetDatum(heap_tuple_fetch_attr((struct varlena *)
DatumGetPointer(newValue)));
+                    }
+                    else
+                    {
+                        /* Just copy the value */
+                        newValue = datumCopy(newValue, false, -1);
+                        /* If it's still external, remember that */
+                        if (VARATT_IS_EXTERNAL(DatumGetPointer(newValue)))
+                            erh->flags |= ER_FLAG_HAVE_EXTERNAL;
+                    }
+                }
+                else
+                {
+                    /* Not an external value, just copy it */
+                    newValue = datumCopy(newValue, false, attr->attlen);
+                }

                 /* Remember that we have field(s) that need to be pfree'd */
                 erh->flags |= ER_FLAG_DVALUES_ALLOCED;
-
-                /*
-                 * While we're here, note whether it's an external toasted
-                 * value, because that could mean we need to inline it later.
-                 */
-                if (attr->attlen == -1 &&
-                    VARATT_IS_EXTERNAL(DatumGetPointer(newValue)))
-                    erh->flags |= ER_FLAG_HAVE_EXTERNAL;
             }

             /*
@@ -1291,7 +1358,7 @@ expanded_record_set_fields(ExpandedRecordHeader *erh,
     if (erh->flags & ER_FLAG_IS_DOMAIN)
     {
         /* We run domain_check in a short-lived context to limit cruft */
-        MemoryContextSwitchTo(get_domain_check_cxt(erh));
+        MemoryContextSwitchTo(get_short_term_cxt(erh));

         domain_check(ExpandedRecordGetRODatum(erh), false,
                      erh->er_decltypeid,
@@ -1303,25 +1370,26 @@ expanded_record_set_fields(ExpandedRecordHeader *erh,
 }

 /*
- * Construct (or reset) working memory context for domain checks.
+ * Construct (or reset) working memory context for short-term operations.
+ *
+ * This context is used for domain check evaluation and for detoasting.
  *
- * If we don't have a working memory context for domain checking, make one;
- * if we have one, reset it to get rid of any leftover cruft.  (It is a tad
- * annoying to need a whole context for this, since it will often go unused
- * --- but it's hard to avoid memory leaks otherwise.  We can make the
- * context small, at least.)
+ * If we don't have a short-lived memory context, make one; if we have one,
+ * reset it to get rid of any leftover cruft.  (It is a tad annoying to need a
+ * whole context for this, since it will often go unused --- but it's hard to
+ * avoid memory leaks otherwise.  We can make the context small, at least.)
  */
 static MemoryContext
-get_domain_check_cxt(ExpandedRecordHeader *erh)
+get_short_term_cxt(ExpandedRecordHeader *erh)
 {
-    if (erh->er_domain_check_cxt == NULL)
-        erh->er_domain_check_cxt =
+    if (erh->er_short_term_cxt == NULL)
+        erh->er_short_term_cxt =
             AllocSetContextCreate(erh->hdr.eoh_context,
-                                  "expanded record domain checks",
+                                  "expanded record short-term context",
                                   ALLOCSET_SMALL_SIZES);
     else
-        MemoryContextReset(erh->er_domain_check_cxt);
-    return erh->er_domain_check_cxt;
+        MemoryContextReset(erh->er_short_term_cxt);
+    return erh->er_short_term_cxt;
 }

 /*
@@ -1340,8 +1408,8 @@ build_dummy_expanded_header(ExpandedRecordHeader *main_erh)
     ExpandedRecordHeader *erh;
     TupleDesc    tupdesc = expanded_record_get_tupdesc(main_erh);

-    /* Ensure we have a domain_check_cxt */
-    (void) get_domain_check_cxt(main_erh);
+    /* Ensure we have a short-lived context */
+    (void) get_short_term_cxt(main_erh);

     /*
      * Allocate dummy header on first time through, or in the unlikely event
@@ -1372,7 +1440,7 @@ build_dummy_expanded_header(ExpandedRecordHeader *main_erh)
          * nothing else is authorized to delete or transfer ownership of the
          * object's context, so it should be safe enough.
          */
-        EOH_init_header(&erh->hdr, &ER_methods, main_erh->er_domain_check_cxt);
+        EOH_init_header(&erh->hdr, &ER_methods, main_erh->er_short_term_cxt);
         erh->er_magic = ER_MAGIC;

         /* Set up dvalues/dnulls, with no valid contents as yet */
@@ -1488,7 +1556,7 @@ check_domain_for_new_field(ExpandedRecordHeader *erh, int fnumber,
      * We call domain_check in the short-lived context, so that any cruft
      * leaked by expression evaluation can be reclaimed.
      */
-    oldcxt = MemoryContextSwitchTo(erh->er_domain_check_cxt);
+    oldcxt = MemoryContextSwitchTo(erh->er_short_term_cxt);

     /*
      * And now we can apply the check.  Note we use main header's domain cache
@@ -1502,7 +1570,7 @@ check_domain_for_new_field(ExpandedRecordHeader *erh, int fnumber,
     MemoryContextSwitchTo(oldcxt);

     /* We might as well clean up cruft immediately. */
-    MemoryContextReset(erh->er_domain_check_cxt);
+    MemoryContextReset(erh->er_short_term_cxt);
 }

 /*
@@ -1518,7 +1586,7 @@ check_domain_for_new_tuple(ExpandedRecordHeader *erh, HeapTuple tuple)
     if (tuple == NULL)
     {
         /* We run domain_check in a short-lived context to limit cruft */
-        oldcxt = MemoryContextSwitchTo(get_domain_check_cxt(erh));
+        oldcxt = MemoryContextSwitchTo(get_short_term_cxt(erh));

         domain_check((Datum) 0, true,
                      erh->er_decltypeid,
@@ -1528,7 +1596,7 @@ check_domain_for_new_tuple(ExpandedRecordHeader *erh, HeapTuple tuple)
         MemoryContextSwitchTo(oldcxt);

         /* We might as well clean up cruft immediately. */
-        MemoryContextReset(erh->er_domain_check_cxt);
+        MemoryContextReset(erh->er_short_term_cxt);

         return;
     }
@@ -1551,7 +1619,7 @@ check_domain_for_new_tuple(ExpandedRecordHeader *erh, HeapTuple tuple)
      * We call domain_check in the short-lived context, so that any cruft
      * leaked by expression evaluation can be reclaimed.
      */
-    oldcxt = MemoryContextSwitchTo(erh->er_domain_check_cxt);
+    oldcxt = MemoryContextSwitchTo(erh->er_short_term_cxt);

     /*
      * And now we can apply the check.  Note we use main header's domain cache
@@ -1565,5 +1633,5 @@ check_domain_for_new_tuple(ExpandedRecordHeader *erh, HeapTuple tuple)
     MemoryContextSwitchTo(oldcxt);

     /* We might as well clean up cruft immediately. */
-    MemoryContextReset(erh->er_domain_check_cxt);
+    MemoryContextReset(erh->er_short_term_cxt);
 }
diff --git a/src/include/postgres.h b/src/include/postgres.h
index bbcb50e..b596fcb 100644
--- a/src/include/postgres.h
+++ b/src/include/postgres.h
@@ -321,6 +321,8 @@ typedef struct
     (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RW)
 #define VARATT_IS_EXTERNAL_EXPANDED(PTR) \
     (VARATT_IS_EXTERNAL(PTR) && VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR)))
+#define VARATT_IS_EXTERNAL_NON_EXPANDED(PTR) \
+    (VARATT_IS_EXTERNAL(PTR) && !VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR)))
 #define VARATT_IS_SHORT(PTR)                VARATT_IS_1B(PTR)
 #define VARATT_IS_EXTENDED(PTR)                (!VARATT_IS_4B_U(PTR))

diff --git a/src/include/utils/expandedrecord.h b/src/include/utils/expandedrecord.h
index a95c9cc..c999f44 100644
--- a/src/include/utils/expandedrecord.h
+++ b/src/include/utils/expandedrecord.h
@@ -127,8 +127,10 @@ typedef struct ExpandedRecordHeader
     char       *fstartptr;        /* start of its data area */
     char       *fendptr;        /* end+1 of its data area */

+    /* Some operations on the expanded record need a short-lived context */
+    MemoryContext er_short_term_cxt;    /* short-term memory context */
+
     /* Working state for domain checking, used if ER_FLAG_IS_DOMAIN is set */
-    MemoryContext er_domain_check_cxt;    /* short-term memory context */
     struct ExpandedRecordHeader *er_dummy_header;    /* dummy record header */
     void       *er_domaininfo;    /* cache space for domain_check() */

@@ -171,7 +173,7 @@ extern ExpandedRecordHeader *make_expanded_record_from_tupdesc(TupleDesc tupdesc
 extern ExpandedRecordHeader *make_expanded_record_from_exprecord(ExpandedRecordHeader *olderh,
                                     MemoryContext parentcontext);
 extern void expanded_record_set_tuple(ExpandedRecordHeader *erh,
-                          HeapTuple tuple, bool copy);
+                          HeapTuple tuple, bool copy, bool expand_external);
 extern Datum make_expanded_record_from_datum(Datum recorddatum,
                                 MemoryContext parentcontext);
 extern TupleDesc expanded_record_fetch_tupdesc(ExpandedRecordHeader *erh);
@@ -186,13 +188,15 @@ extern Datum expanded_record_fetch_field(ExpandedRecordHeader *erh, int fnumber,
 extern void expanded_record_set_field_internal(ExpandedRecordHeader *erh,
                                    int fnumber,
                                    Datum newValue, bool isnull,
+                                   bool expand_external,
                                    bool check_constraints);
 extern void expanded_record_set_fields(ExpandedRecordHeader *erh,
-                           const Datum *newValues, const bool *isnulls);
+                           const Datum *newValues, const bool *isnulls,
+                           bool expand_external);

 /* outside code should never call expanded_record_set_field_internal as such */
-#define expanded_record_set_field(erh, fnumber, newValue, isnull) \
-    expanded_record_set_field_internal(erh, fnumber, newValue, isnull, true)
+#define expanded_record_set_field(erh, fnumber, newValue, isnull, expand_external) \
+    expanded_record_set_field_internal(erh, fnumber, newValue, isnull, expand_external, true)

 /*
  * Inline-able fast cases.  The expanded_record_fetch_xxx functions above
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 228d1c0..fbf6c7b 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -20,6 +20,7 @@
 #include "access/htup_details.h"
 #include "access/transam.h"
 #include "access/tupconvert.h"
+#include "access/tuptoaster.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -912,16 +913,20 @@ plpgsql_exec_trigger(PLpgSQL_function *func,
     }
     else if (TRIGGER_FIRED_BY_INSERT(trigdata->tg_event))
     {
-        expanded_record_set_tuple(rec_new->erh, trigdata->tg_trigtuple, false);
+        expanded_record_set_tuple(rec_new->erh, trigdata->tg_trigtuple,
+                                  false, false);
     }
     else if (TRIGGER_FIRED_BY_UPDATE(trigdata->tg_event))
     {
-        expanded_record_set_tuple(rec_new->erh, trigdata->tg_newtuple, false);
-        expanded_record_set_tuple(rec_old->erh, trigdata->tg_trigtuple, false);
+        expanded_record_set_tuple(rec_new->erh, trigdata->tg_newtuple,
+                                  false, false);
+        expanded_record_set_tuple(rec_old->erh, trigdata->tg_trigtuple,
+                                  false, false);
     }
     else if (TRIGGER_FIRED_BY_DELETE(trigdata->tg_event))
     {
-        expanded_record_set_tuple(rec_old->erh, trigdata->tg_trigtuple, false);
+        expanded_record_set_tuple(rec_old->erh, trigdata->tg_trigtuple,
+                                  false, false);
     }
     else
         elog(ERROR, "unrecognized trigger action: not INSERT, DELETE, or UPDATE");
@@ -5061,7 +5066,7 @@ exec_assign_value(PLpgSQL_execstate *estate,

                 /* And assign it. */
                 expanded_record_set_field(erh, recfield->finfo.fnumber,
-                                          value, isNull);
+                                          value, isNull, !estate->atomic);
                 break;
             }

@@ -5875,7 +5880,8 @@ exec_for_query(PLpgSQL_execstate *estate, PLpgSQL_stmt_forq *stmt,
                     tupdescs_match)
                 {
                     /* Only need to assign a new tuple value */
-                    expanded_record_set_tuple(rec->erh, tuptab->vals[i], true);
+                    expanded_record_set_tuple(rec->erh, tuptab->vals[i],
+                                              true, !estate->atomic);
                 }
                 else
                 {
@@ -6647,7 +6653,7 @@ exec_move_row(PLpgSQL_execstate *estate,
                  */
                 newerh = make_expanded_record_for_rec(estate, rec,
                                                       NULL, rec->erh);
-                expanded_record_set_tuple(newerh, NULL, false);
+                expanded_record_set_tuple(newerh, NULL, false, false);
                 assign_record_var(estate, rec, newerh);
             }
             else
@@ -6689,7 +6695,7 @@ exec_move_row(PLpgSQL_execstate *estate,
             else
             {
                 /* No coercion is needed, so just assign the row value */
-                expanded_record_set_tuple(newerh, tup, true);
+                expanded_record_set_tuple(newerh, tup, true, !estate->atomic);
             }

             /* Complete the assignment */
@@ -6927,7 +6933,7 @@ exec_move_row_from_fields(PLpgSQL_execstate *estate,
         }

         /* Insert the coerced field values into the new expanded record */
-        expanded_record_set_fields(newerh, values, nulls);
+        expanded_record_set_fields(newerh, values, nulls, !estate->atomic);

         /* Complete the assignment */
         assign_record_var(estate, rec, newerh);
@@ -7194,7 +7200,8 @@ exec_move_row_from_datum(PLpgSQL_execstate *estate,
                  (erh->er_typmod == rec->erh->er_typmod &&
                   erh->er_typmod >= 0)))
             {
-                expanded_record_set_tuple(rec->erh, erh->fvalue, true);
+                expanded_record_set_tuple(rec->erh, erh->fvalue,
+                                          true, !estate->atomic);
                 return;
             }

@@ -7216,7 +7223,8 @@ exec_move_row_from_datum(PLpgSQL_execstate *estate,
                 (rec->rectypeid == RECORDOID ||
                  rec->rectypeid == erh->er_typeid))
             {
-                expanded_record_set_tuple(newerh, erh->fvalue, true);
+                expanded_record_set_tuple(newerh, erh->fvalue,
+                                          true, !estate->atomic);
                 assign_record_var(estate, rec, newerh);
                 return;
             }
@@ -7306,7 +7314,8 @@ exec_move_row_from_datum(PLpgSQL_execstate *estate,
                  (tupTypmod == rec->erh->er_typmod &&
                   tupTypmod >= 0)))
             {
-                expanded_record_set_tuple(rec->erh, &tmptup, true);
+                expanded_record_set_tuple(rec->erh, &tmptup,
+                                          true, !estate->atomic);
                 return;
             }

@@ -7323,7 +7332,8 @@ exec_move_row_from_datum(PLpgSQL_execstate *estate,

                 newerh = make_expanded_record_from_typeid(tupType, tupTypmod,
                                                           mcontext);
-                expanded_record_set_tuple(newerh, &tmptup, true);
+                expanded_record_set_tuple(newerh, &tmptup,
+                                          true, !estate->atomic);
                 assign_record_var(estate, rec, newerh);
                 return;
             }
@@ -8051,7 +8061,8 @@ plpgsql_subxact_cb(SubXactEvent event, SubTransactionId mySubid,
  * assign_simple_var --- assign a new value to any VAR datum.
  *
  * This should be the only mechanism for assignment to simple variables,
- * lest we do the release of the old value incorrectly.
+ * lest we do the release of the old value incorrectly (not to mention
+ * the detoasting business).
  */
 static void
 assign_simple_var(PLpgSQL_execstate *estate, PLpgSQL_var *var,
@@ -8059,6 +8070,36 @@ assign_simple_var(PLpgSQL_execstate *estate, PLpgSQL_var *var,
 {
     Assert(var->dtype == PLPGSQL_DTYPE_VAR ||
            var->dtype == PLPGSQL_DTYPE_PROMISE);
+
+    /*
+     * In non-atomic contexts, we do not want to store TOAST pointers in
+     * variables, because such pointers might become stale after a commit.
+     * Forcibly detoast in such cases.
+     */
+    if (!estate->atomic && !isnull && var->datatype->typlen == -1 &&
+        VARATT_IS_EXTERNAL_NON_EXPANDED(DatumGetPointer(newvalue)))
+    {
+        MemoryContext oldcxt;
+        Datum        detoasted;
+
+        /*
+         * Do the detoasting in the eval_mcontext to avoid long-term leakage
+         * of whatever memory toast fetching might leak.  Then we have to copy
+         * the detoasted datum to the function's main context, which is a
+         * pain, but there's little choice.
+         */
+        oldcxt = MemoryContextSwitchTo(get_eval_mcontext(estate));
+        detoasted = PointerGetDatum(heap_tuple_fetch_attr((struct varlena *) DatumGetPointer(newvalue)));
+        MemoryContextSwitchTo(oldcxt);
+        /* Now's a good time to not leak the input value if it's freeable */
+        if (freeable)
+            pfree(DatumGetPointer(newvalue));
+        /* Once we copy the value, it's definitely freeable */
+        newvalue = datumCopy(detoasted, false, -1);
+        freeable = true;
+        /* Can't clean up eval_mcontext here, but it'll happen before long */
+    }
+
     /* Free the old value if needed */
     if (var->freeval)
     {
diff --git a/src/test/isolation/expected/plpgsql-toast.out b/src/test/isolation/expected/plpgsql-toast.out
new file mode 100644
index 0000000..4341153
--- /dev/null
+++ b/src/test/isolation/expected/plpgsql-toast.out
@@ -0,0 +1,189 @@
+Parsed test spec with 2 sessions
+
+starting permutation: lock assign1 vacuum unlock
+pg_advisory_unlock_all
+
+
+pg_advisory_unlock_all
+
+
+step lock:
+    SELECT pg_advisory_lock(1);
+
+pg_advisory_lock
+
+
+step assign1:
+do $$
+  declare
+    x text;
+  begin
+    select test1.b into x from test1;
+    delete from test1;
+    commit;
+    perform pg_advisory_lock(1);
+    raise notice 'x = %', x;
+  end;
+$$;
+ <waiting ...>
+step vacuum:
+    VACUUM test1;
+
+step unlock:
+    SELECT pg_advisory_unlock(1);
+
+pg_advisory_unlock
+
+t
+step assign1: <... completed>
+
+starting permutation: lock assign2 vacuum unlock
+pg_advisory_unlock_all
+
+
+pg_advisory_unlock_all
+
+
+step lock:
+    SELECT pg_advisory_lock(1);
+
+pg_advisory_lock
+
+
+step assign2:
+do $$
+  declare
+    x text;
+  begin
+    x := (select test1.b from test1);
+    delete from test1;
+    commit;
+    perform pg_advisory_lock(1);
+    raise notice 'x = %', x;
+  end;
+$$;
+ <waiting ...>
+step vacuum:
+    VACUUM test1;
+
+step unlock:
+    SELECT pg_advisory_unlock(1);
+
+pg_advisory_unlock
+
+t
+step assign2: <... completed>
+
+starting permutation: lock assign3 vacuum unlock
+pg_advisory_unlock_all
+
+
+pg_advisory_unlock_all
+
+
+step lock:
+    SELECT pg_advisory_lock(1);
+
+pg_advisory_lock
+
+
+step assign3:
+do $$
+  declare
+    r record;
+  begin
+    select * into r from test1;
+    r.b := (select test1.b from test1);
+    delete from test1;
+    commit;
+    perform pg_advisory_lock(1);
+    raise notice 'r = %', r;
+  end;
+$$;
+ <waiting ...>
+step vacuum:
+    VACUUM test1;
+
+step unlock:
+    SELECT pg_advisory_unlock(1);
+
+pg_advisory_unlock
+
+t
+step assign3: <... completed>
+
+starting permutation: lock assign4 vacuum unlock
+pg_advisory_unlock_all
+
+
+pg_advisory_unlock_all
+
+
+step lock:
+    SELECT pg_advisory_lock(1);
+
+pg_advisory_lock
+
+
+step assign4:
+do $$
+  declare
+    r test2;
+  begin
+    select * into r from test1;
+    delete from test1;
+    commit;
+    perform pg_advisory_lock(1);
+    raise notice 'r = %', r;
+  end;
+$$;
+ <waiting ...>
+step vacuum:
+    VACUUM test1;
+
+step unlock:
+    SELECT pg_advisory_unlock(1);
+
+pg_advisory_unlock
+
+t
+step assign4: <... completed>
+
+starting permutation: lock assign5 vacuum unlock
+pg_advisory_unlock_all
+
+
+pg_advisory_unlock_all
+
+
+step lock:
+    SELECT pg_advisory_lock(1);
+
+pg_advisory_lock
+
+
+step assign5:
+do $$
+  declare
+    r record;
+  begin
+    for r in select test1.b from test1 loop
+      null;
+    end loop;
+    delete from test1;
+    commit;
+    perform pg_advisory_lock(1);
+    raise notice 'r = %', r;
+  end;
+$$;
+ <waiting ...>
+step vacuum:
+    VACUUM test1;
+
+step unlock:
+    SELECT pg_advisory_unlock(1);
+
+pg_advisory_unlock
+
+t
+step assign5: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index b650e46..0e99721 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -74,3 +74,4 @@ test: predicate-gin-nomatch
 test: partition-key-update-1
 test: partition-key-update-2
 test: partition-key-update-3
+test: plpgsql-toast
diff --git a/src/test/isolation/specs/plpgsql-toast.spec b/src/test/isolation/specs/plpgsql-toast.spec
new file mode 100644
index 0000000..e6228c9
--- /dev/null
+++ b/src/test/isolation/specs/plpgsql-toast.spec
@@ -0,0 +1,137 @@
+# Test TOAST behavior in PL/pgSQL procedures with transaction control.
+#
+# We need to ensure that values stored in PL/pgSQL variables are free
+# of external TOAST references, because those could disappear after a
+# transaction is committed (leading to errors "missing chunk number
+# ... for toast value ...").  The tests here do this by running VACUUM
+# in a second session.  Advisory locks are used to have the VACUUM
+# kick in at the right time.  The different "assign" steps test
+# different code paths for variable assignments in PL/pgSQL.
+
+setup
+{
+    CREATE TABLE test1 (a int, b text);
+    ALTER TABLE test1 ALTER COLUMN b SET STORAGE EXTERNAL;
+    INSERT INTO test1 VALUES (1, repeat('foo', 2000));
+    CREATE TYPE test2 AS (a bigint, b text);
+}
+
+teardown
+{
+    DROP TABLE test1;
+    DROP TYPE test2;
+}
+
+session "s1"
+
+setup
+{
+    SELECT pg_advisory_unlock_all();
+}
+
+# assign_simple_var()
+step "assign1"
+{
+do $$
+  declare
+    x text;
+  begin
+    select test1.b into x from test1;
+    delete from test1;
+    commit;
+    perform pg_advisory_lock(1);
+    raise notice 'x = %', x;
+  end;
+$$;
+}
+
+# assign_simple_var()
+step "assign2"
+{
+do $$
+  declare
+    x text;
+  begin
+    x := (select test1.b from test1);
+    delete from test1;
+    commit;
+    perform pg_advisory_lock(1);
+    raise notice 'x = %', x;
+  end;
+$$;
+}
+
+# expanded_record_set_field()
+step "assign3"
+{
+do $$
+  declare
+    r record;
+  begin
+    select * into r from test1;
+    r.b := (select test1.b from test1);
+    delete from test1;
+    commit;
+    perform pg_advisory_lock(1);
+    raise notice 'r = %', r;
+  end;
+$$;
+}
+
+# expanded_record_set_fields()
+step "assign4"
+{
+do $$
+  declare
+    r test2;
+  begin
+    select * into r from test1;
+    delete from test1;
+    commit;
+    perform pg_advisory_lock(1);
+    raise notice 'r = %', r;
+  end;
+$$;
+}
+
+# expanded_record_set_tuple()
+step "assign5"
+{
+do $$
+  declare
+    r record;
+  begin
+    for r in select test1.b from test1 loop
+      null;
+    end loop;
+    delete from test1;
+    commit;
+    perform pg_advisory_lock(1);
+    raise notice 'r = %', r;
+  end;
+$$;
+}
+
+session "s2"
+setup
+{
+    SELECT pg_advisory_unlock_all();
+}
+step "lock"
+{
+    SELECT pg_advisory_lock(1);
+}
+step "vacuum"
+{
+    VACUUM test1;
+}
+step "unlock"
+{
+    SELECT pg_advisory_unlock(1);
+}
+
+permutation "lock" "assign1" "vacuum" "unlock"
+permutation "lock" "assign2" "vacuum" "unlock"
+permutation "lock" "assign3" "vacuum" "unlock"
+permutation "lock" "assign4" "vacuum" "unlock"
+permutation "lock" "assign5" "vacuum" "unlock"

Re: lazy detoasting

От

Peter Eisentraut

Дата:

17 мая 2018 г., 16:12:05

On 5/12/18 17:25, Tom Lane wrote:
> Anyway, attached is a revised patch.  I found a test case for
> expanded_record_set_fields(), too.

Thank you for fixing this up.

In reviewing the committed patch, I noticed that in ER_get_flat_size()
you have removed the PG_DETOAST_DATUM() call and let
expanded_record_set_field_internal() do the de-toasting work.  I had
considered that too, but my impression is that the purpose of the
PG_DETOAST_DATUM() is to de-compress for the purpose of size
computation, whereas expanded_record_set_field_internal() only does the
inlining of externally stored values and doesn't do any explicit
decompression.  Is this correct?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: lazy detoasting

От

Tom Lane

Дата:

17 мая 2018 г., 17:10:41

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> In reviewing the committed patch, I noticed that in ER_get_flat_size()
> you have removed the PG_DETOAST_DATUM() call and let
> expanded_record_set_field_internal() do the de-toasting work.  I had
> considered that too, but my impression is that the purpose of the
> PG_DETOAST_DATUM() is to de-compress for the purpose of size
> computation, whereas expanded_record_set_field_internal() only does the
> inlining of externally stored values and doesn't do any explicit
> decompression.  Is this correct?

Hmm ... yeah, there is a semantics change there, but I think it's fine.
I don't recall that I'd thought specifically about the behavior for
inline-compressed/short-header Datums when I wrote that code to begin
with.  But for its purpose, which is that we're trying to flatten the
expanded record into a tuple, leaving an inline-compressed field in that
form seems OK, perhaps actually preferable.  And short-header is a no-op:
it'd end up with a short header in the emitted tuple regardless of our
choice here.

            regards, tom lane

Re: lazy detoasting

От

Chapman Flack

Дата:

12 августа 2018 г., 23:26:31

On 04/11/18 11:27, Chapman Flack wrote:
> In most cases I can easily imagine, a function that gets an SQLXML
> object is going to read it "pretty soon" ... 
> However, the spec does explicitly provide that you could, for whatever
> reason, sit on the thing for a while, then read it later in the same
> transaction. You should get the same stuff you would have if you had
> read it right away, ...
>        Eager detoasting at the time of creating the object, into a
> transaction-scoped memory context, would trivially have that behavior,
> but on the chance that XML values might be large, and a function might
> conceivably never read the thing at all on certain code paths, I'd
> rather defer detoasting until the code holding the SQLXML object
> actually wants to read it.

In the interest of closure, how this idea looks implemented in PL/Java
is here:

  https://github.com/tada/pljava/commit/1a5caf1

It uses GetOldestSnapshot() to choose the snapshot to retain, and it
passes the following test, inspired by Andrew Gierth's in this thread.
(It also fails the test if the snapshot retention isn't done, confirming
that it's needed.)

CREATE TABLE t(x xml);
BEGIN READ WRITE, ISOLATION LEVEL READ COMMITTED;
/*
 * In other session: INSERT INTO t(x)
 * SELECT table_to_xml('pg_operator', true, false, '');
 */
SELECT javatest.echoxmlparameter(x, 0, 5) FROM t; -- 0 => stash x
/*
 * In other session: DELETE FROM t;
 * VACUUM t;
 */
SELECT javatest.echoxmlparameter(null, 5, 5); -- null => unstash
COMMIT;

GetOldestSnapshot() appeared in 9.6. For older PG releases, instead
of a snapshot and toast pointer being retained to detoast lazily,
the ondisk content is eagerly fetched, then decompressed lazily.
XML compresses well, so that can still use 95% less memory while one
of these objects is being held but not read from. (The 9.5-and-earlier
support is added in a separate commit.)

-Chap

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: lazy detoasting

Вложения

Вложения