Обсуждение: Bug with plpgsql, temp tables and TOAST?
Hi,
When I run the SQL below, I get an error on the third call to foo() : "ERRO=
R: could not open relation with OID 884693". I'm quite sure this OID belon=
gs to the TOAST table corresponding to the temporary table created by foo()=
during the third call. The fourth call works fine again.
I suspect the third one fails because the data is too large (even when comp=
ressed) to be stored without using toast. I can't reproduce the issue if fo=
r example I use a long string of identical characters instead of "random" o=
nes. My guess would be that the TOAST table is still somehow referenced by =
the result value, even though the table itself has been dropped by the time=
the result value is used.
I have tested this myself on 8.4.4 and it has also been verified on a 9.1 b=
y someone on #postgresql.
Kind regards,
Matthijs Bomhoff
CREATE OR REPLACE FUNCTION foo(size_ INTEGER) RETURNS TEXT AS $EOF$
DECLARE
acc_ TEXT :=3D '';
cur_rec_ RECORD;
BEGIN
EXECUTE 'CREATE TEMPORARY TABLE foo_tab(blob TEXT NOT NULL)';
-- Construct a string with random characters to prevent compression (with=
high probability)
LOOP
EXIT WHEN length(acc_) >=3D size_;
acc_ :=3D acc_ || chr(ceil(random()*64)::integer + 32);
END LOOP;
EXECUTE 'INSERT INTO foo_tab(blob) values (' || quote_literal(acc_) || ')=
';
EXECUTE 'SELECT * FROM foo_tab LIMIT 1' INTO cur_rec_;
EXECUTE 'DROP TABLE foo_tab';
RETURN cur_rec_.blob;
END
$EOF$ LANGUAGE plpgsql;
SELECT md5(foo(10));
SELECT md5(foo(20));
SELECT md5(foo(40000)); -- This one breaks on my 8.4.4
SELECT md5(foo(30)); -- And this one works fine again
DROP FUNCTION foo(INTEGER);
Excerpts from Matthijs Bomhoff's message of mié jun 29 07:40:07 -0400 2011:
> CREATE OR REPLACE FUNCTION foo(size_ INTEGER) RETURNS TEXT AS $EOF$
> DECLARE
> acc_ TEXT := '';
> cur_rec_ RECORD;
> BEGIN
> EXECUTE 'CREATE TEMPORARY TABLE foo_tab(blob TEXT NOT NULL)';
>
> -- Construct a string with random characters to prevent compression (with high probability)
> LOOP
> EXIT WHEN length(acc_) >= size_;
> acc_ := acc_ || chr(ceil(random()*64)::integer + 32);
> END LOOP;
>
> EXECUTE 'INSERT INTO foo_tab(blob) values (' || quote_literal(acc_) || ')';
> EXECUTE 'SELECT * FROM foo_tab LIMIT 1' INTO cur_rec_;
> EXECUTE 'DROP TABLE foo_tab';
> RETURN cur_rec_.blob;
> END
> $EOF$ LANGUAGE plpgsql;
Hmm, so what's happening here, I think, is that the value is getting
assigned to the record variable without detoasting. I guess we should
detoast the value prior to assigning it, but it seems to me that that
would have a large performance penalty for other cases in which the
toast table is not dropped; in fact, you can even imagine some cases in
which the toasted value is not even accessed, so getting to the point of
detoasting it would be a severe penalization.
--
Ãlvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Jun 29, 2011, at 6:03 PM, Alvaro Herrera wrote:
> Excerpts from Matthijs Bomhoff's message of mi=E9 jun 29 07:40:07 -0400 2=
011:
>=20
>> CREATE OR REPLACE FUNCTION foo(size_ INTEGER) RETURNS TEXT AS $EOF$
>> DECLARE
>> acc_ TEXT :=3D '';
>> cur_rec_ RECORD;
>> BEGIN
>> EXECUTE 'CREATE TEMPORARY TABLE foo_tab(blob TEXT NOT NULL)';
>>=20
>> -- Construct a string with random characters to prevent compression (wi=
th high probability)
>> LOOP
>> EXIT WHEN length(acc_) >=3D size_;
>> acc_ :=3D acc_ || chr(ceil(random()*64)::integer + 32);
>> END LOOP;
>>=20
>> EXECUTE 'INSERT INTO foo_tab(blob) values (' || quote_literal(acc_) || =
')';
>> EXECUTE 'SELECT * FROM foo_tab LIMIT 1' INTO cur_rec_;
>> EXECUTE 'DROP TABLE foo_tab';
>> RETURN cur_rec_.blob;
>> END
>> $EOF$ LANGUAGE plpgsql;
>=20
> Hmm, so what's happening here, I think, is that the value is getting
> assigned to the record variable without detoasting. I guess we should
> detoast the value prior to assigning it, but it seems to me that that
> would have a large performance penalty for other cases in which the
> toast table is not dropped; in fact, you can even imagine some cases in
> which the toasted value is not even accessed, so getting to the point of
> detoasting it would be a severe penalization.
Possibly related: In some earlier attempts at reproducing this, I actually =
tried to use length() instead of md5(). It seemed that I could not get it t=
o trigger with that. Just selecting * triggers it of course, but caused a b=
it too much clutter in my psql for the required long random strings, so I h=
ad to find a function that would actually need the value itself. Hence the =
md5().
Regards,
Matthijs
Excerpts from Matthijs Bomhoff's message of mié jun 29 15:20:49 -0400 2011: > Possibly related: In some earlier attempts at reproducing this, I > actually tried to use length() instead of md5(). It seemed that I > could not get it to trigger with that. Just selecting * triggers it of > course, but caused a bit too much clutter in my psql for the required > long random strings, so I had to find a function that would actually > need the value itself. Hence the md5(). Yeah, the length can be obtained from the toast header AFAIR, so you don't need to do a full detoast of the value, which masks the problem. -- Ãlvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Hmm, so what's happening here, I think, is that the value is getting
> assigned to the record variable without detoasting. I guess we should
> detoast the value prior to assigning it, but it seems to me that that
> would have a large performance penalty for other cases in which the
> toast table is not dropped; in fact, you can even imagine some cases in
> which the toasted value is not even accessed, so getting to the point of
> detoasting it would be a severe penalization.
Yeah, we have seen similar reports before, and concluded that forcibly
detoasting on the remote chance that the toast table wouldn't stick
around was just untenable from a performance standpoint. It's annoying
but I don't see a good compromise solution ATM.
Pavel was working recently on a hack to prevent repeated detoastings,
which is at least somewhat related to this issue. I don't recall that
anyone liked his hack though ...
regards, tom lane