Обсуждение: XX000: iso-8859-1 type of jsonb container.

Поиск
Список
Период
Сортировка

XX000: iso-8859-1 type of jsonb container.

От
"Poot, Bas (B.J.)"
Дата:

Hi there,

First of all I must say that we're not only very thankful for the work you guys are doing. But also very impressed with the result.
We have various products using various databases, and coming from an oracle database I must say I'm overwhelmed by the performance and features postgres has.

Unfortunately we're experiencing very weird behavior upgrading to postgres 13, and possibly found a bug.


This is going to be a weird bug report, but please keep reading.

Recently we migrated our postgres test database. We had a postgres 12.5 on windows, and moved to a postgres 13.2.1 on ubuntu 20 (clustered).

99% of all queries worked just fine, but one particular query created an exception: XX000: unknown type of jsonb container.

We did some various testing with various versions of postgres on various operating systems (Ubuntu and windows server 2016)
We found that we can reproduce the bug in both ubuntu 20 and windows 2016 with postgres 13.2.1.

Also we didn't have the problem with postgres 12.5 or 12.6 on both windows and ubuntu.
So it seems to be a 13.x bug (we didn't try lower versions of 13 then 13.2.1)


Of course the first thing I did was run the query in a different tool (we use DBeaver) and I got the same result.
I tried to reduce the query, by stripping columns, rows etc. figure out what part of it created the error.  I got as far as this:

 
 
select 
display_name,
'' as rolename
,filtur
,jsonb_each_text(filtur) as x
,to_jsonb(jsonb_each_text(filtur)) as frows
,array(SELECT jsonb_object_keys(filtur)) as objectkeys
from tmp_bug2
order by logical_name
;
 

where our 'tmp_bug2' table is a select into from a lot of joins on other tables. The DDL is now this:

CREATE TABLE public.tmp_bug2 (
logical_name varchar NULL,
display_name varchar NULL,
filtur jsonb NULL
);



But now the weird stuff started happening. (tldr: testcase at the end of this mail)

I can do a lot of changes on this query that makes me get a different exception (or no exception at all). But when I wait a bit, or change some randomly other line I get the error again.

Weird behavior 1:
When the tmp_bug2 table has these rows:
- logical_name
- display_name
- operation
- filtur
And I remove the 'display_name' from the query, it works. I get no more errors and see data.
 

Then I figured that i don't use the column 'operation' at all, so i removed it from my tmp_bug2 table (dropped it, and redid the select into).
The table now contains these columns:
 
- logical_name
- display_name
- filtur

 
When I run the query it still fails. But now, when I remove the 'display_name' field from my select, it keeps failing. So the removing of the 'operation' column in my table somehow changes the behavior of my query.
Even when I don't use the entire column.



Weird behavior 2:
(based on the original query)
When I remove the line 
'' as rolename,

I get a perfectly fine result. No errors, it all works.
 
So removing a totally useless column makes my query work.



Weird behavior 3:
When I remove 'order by' clause my query works again like a charm. No errors whatsoever.
But that's weird, because my select into query also has an order by.
Therefor the table 'tmp_bug2' is already sorted by logical_name. 
I can imagine some operations only being done on the lines you return to your client, but the data is ordered in the same way, so I should return the same rows.
With or without that order by statement



Weird behavior 4
In our original query, we have the last 3 columns in the select that do stuff with json(b) functions. 
,jsonb_each_text(filtur) as x
,to_jsonb(jsonb_each_text(filtur)) as frows
,array(SELECT jsonb_object_keys(filtur)) as objectkeys

obviously this error has something to do with those statements.
So I tried to comment out the first line: still an error.
I tried to comment out the second one (instead of the first) still an error.
Then I commented out both of them. Problem solved, query works!
Then enabled them all, and commented out the 3th line. Problem solved again.

So:
when I have the first or second line active we get an error.
when I have the 3th line active we get an error
when we disable the 3th line it works (even with the first and second line active)



Weird behavior 5
Now I tried to figure out in what row triggers the error.

first I created a table that contained all the information in my original subquery (not in this email) and called it tmp_bug.
Then I did this:
select *
into tmp_bug2
from (
select logical_name
, display_name
, operation
, filtur
from tmp_bug
order by logical_name 
) t1
limit 300000
;

note that I order by logical_name, and limit my rows, so I only get 300k of my 390k rows.
So I only lose 90k rows.
When I run my query (top of this email) it works like a charm. Therefor I concluded, the error must be in the 90k rows I dropped.

So I dropped my tmp_bug2 table, and redid the above query, but then altering the order by clause using:
order by logical_name desc
this way I dropped the first 90K rows, instead of the last 90K. 
I reran my query, and (wtf) it again worked like a charm..
So somehow it has nothing to do with the specific rows I select....


Weird behavior 6
In an attempt to create a testcase with the same amount of rows I tried updating them all to the same value.
So I updated the columns 'display_name' and 'logical_name' to 'thing' and the the case still stands (errors as above)
I simplified my select into statement to generate the table, so it now always added 'thing' into those 2 columns.
Results as expected. Still errors. (but closer to a testcase for you guys)

Then I updated the column 'filtur' like so:

update tmp_bug2 
set filtur  = '{"unit": "Our special Unit", "some.place": "placename", "place_of_birth": "Over the rainbow", "How_secret_is_it": "Level 1", "Does.it.shoot": "YES!"}'::jsonb

And again the case still stands. Errors as above.
So I simplified my select into statement just as with the other to columns. It was now sort of like this:

select *

into tmp_bug2

from (

select logical_name
, 'thing' as display_name
, 'thing' as operation
, '{"unit": "Our special Unit", "some.place": "placename", "place_of_birth": "Over the rainbow", "How_secret_is_it": "Level 1", "Does.it.shoot": "YES!"}'::jsonb as filtur
from ( .. complex subquery here.. )
order by logical_name 

) t1


And suddenly all queries ran fine. No more errors.
I removed my simplification, and went back to

select *

into tmp_bug2

from (

select logical_name
, 'thing' as display_name
, 'thing' as operation
, filtur
from ( .. complex subquery here.. )
order by logical_name 

) t1

and the errors were back.
Then I updated the table using the above update statement, and again: errors remained..

I had the same problem updating the logical_name column, when I used the update (after creating the table) I still had the error. But when I created the table with the static value ('thing' in my case) the errors were suddenly gone

So creating a table with a value has a different result then creating a table with a value, and then updating that value..


Weird behavior 7

Finally! I have a testcase for you guys. This is my query to generate the data.
select *
into tmp_bug2
from (
select 'thing' as  logical_name
, 'thing' as display_name
, 'thing' as operation
, '{"unit": "Our special Unit", "some.place": "placename", "place_of_birth": "Over the rainbow", "How_secret_is_it": "Level 1", "Does.it.shoot": "YES!"}'::jsonb as filtur
from generate_series(1, 302443)
) t1;

And this is the query that generates the error.

select 
display_name
,'' as x
,filtur
,jsonb_each_text(filtur) as x
,to_jsonb(jsonb_each_text(filtur)) as frows
,array(SELECT jsonb_object_keys(filtur)) as objectkeys
from tmp_bug2
order by logical_name;

Notice that when you generate not 302443 but 302442 rows, it works perfectly fine.




I'm lost here.
And I sure as hell would like a hand

Bas

------------------------- Disclaimer ----------------------------
De informatie verzonden met dit e-mailbericht (en bijlagen) is uitsluitend bestemd voor de geadresseerde(n) en zij die van de geadresseerde(n) toestemming kregen dit bericht te lezen.
Kennisneming door anderen is niet toegestaan.
De informatie in dit e-mailbericht (en bijlagen) kan vertrouwelijk van aard zijn en binnen het bereik van een geheimhoudingsplicht en/of een verschoningsrecht vallen.
Indien dit e-mailbericht niet voor u bestemd is, wordt u verzocht de afzender daarover onmiddellijk te informeren en het e-mailbericht (en bijlagen) te vernietigen.
Conform het beveiligingsbeleid van de Politie wordt e-mail van en naar de politie gecontroleerd op virussen, spam en phishing en moet deze e-mail voldoen aan de voor de overheid verplichte mailbeveiligingsstandaarden die zijn vastgesteld door het Forum Standaardisatie.
Mail die niet voldoet aan het beveiligingsbeleid kan worden geblokkeerd waardoor deze de geadresseerde niet bereikt. De geadresseerde wordt hiervan niet in kennis gesteld.
---------------------------------------------------------------------
The information sent in this E-mail message (including any attachments) is exclusively intended for the individual(s) to whom it is addressed and for the individual(s) who has/have had permission from the recipient(s) to read this message.
Access by others is not permitted.
The information in this E-mail message (including any attachments) may be of a confidential nature and may form part of the duty of confidentiality and/or the right of non-disclosure.
If you have received this E-mail message in error, please notify the sender without delay and delete the E-mail message (including any attachments).
In conformity with the security policy of the Police, E-mails from and to the Police are checked for viruses, spam and phishing and this E-mail must meet the standards of the government-imposed E-mail security as set by the Standardization Forum.
Any E-mail failing to meet said security policy may be blocked as a result of which it will not reach the intended recipient. The recipient concerned will not be notified.
---------------------------------------------------------------------

Re: XX000: iso-8859-1 type of jsonb container.

От
Dmitry Dolgov
Дата:
> On Wed, Apr 07, 2021 at 10:59:31AM +0000, Poot, Bas (B.J.) wrote:
>
> Finally! I have a testcase for you guys. This is my query to generate the data.
> select *
> into tmp_bug2
> from (
> select 'thing' as  logical_name
> , 'thing' as display_name
> , 'thing' as operation
> , '{"unit": "Our special Unit", "some.place": "placename", "place_of_birth": "Over the rainbow", "How_secret_is_it":
"Level1", "Does.it.shoot": "YES!"}'::jsonb as filtur
 
> from generate_series(1, 302443)
> ) t1;
>
> And this is the query that generates the error.
>
> select
> display_name
> ,'' as x
> ,filtur
> ,jsonb_each_text(filtur) as x
> ,to_jsonb(jsonb_each_text(filtur)) as frows
> ,array(SELECT jsonb_object_keys(filtur)) as objectkeys
> from tmp_bug2
> order by logical_name;
>
> Notice that when you generate not 302443 but 302442 rows, it works perfectly fine.

Thanks for posting the test case, I can reproduce it on the master
branch as well (also without the second call to jsonb_each_text in line
with to_jsonb). Interesting, it looks like for unclear to me reasons the
argument, evaluated in ExecMakeFunctionResultSet for jsonb_each_text,
contains value from the previous attribute, not jsonb. This makes
iteratorFromContainer complain because both array & object flags are set
in the header. I'll try to investigate, unless someone else will be
faster.



Re: XX000: iso-8859-1 type of jsonb container.

От
Dmitry Dolgov
Дата:
> On Wed, Apr 07, 2021 at 08:11:05PM +0200, Dmitry Dolgov wrote:
> > On Wed, Apr 07, 2021 at 10:59:31AM +0000, Poot, Bas (B.J.) wrote:
> >
> > Finally! I have a testcase for you guys. This is my query to generate the data.
> > select *
> > into tmp_bug2
> > from (
> > select 'thing' as  logical_name
> > , 'thing' as display_name
> > , 'thing' as operation
> > , '{"unit": "Our special Unit", "some.place": "placename", "place_of_birth": "Over the rainbow",
"How_secret_is_it":"Level 1", "Does.it.shoot": "YES!"}'::jsonb as filtur
 
> > from generate_series(1, 302443)
> > ) t1;
> >
> > And this is the query that generates the error.
> >
> > select
> > display_name
> > ,'' as x
> > ,filtur
> > ,jsonb_each_text(filtur) as x
> > ,to_jsonb(jsonb_each_text(filtur)) as frows
> > ,array(SELECT jsonb_object_keys(filtur)) as objectkeys
> > from tmp_bug2
> > order by logical_name;
> >
> > Notice that when you generate not 302443 but 302442 rows, it works perfectly fine.
>
> Thanks for posting the test case, I can reproduce it on the master
> branch as well (also without the second call to jsonb_each_text in line
> with to_jsonb). Interesting, it looks like for unclear to me reasons the
> argument, evaluated in ExecMakeFunctionResultSet for jsonb_each_text,
> contains value from the previous attribute, not jsonb. This makes
> iteratorFromContainer complain because both array & object flags are set
> in the header. I'll try to investigate, unless someone else will be
> faster.

Here is what I've found so far:

* It seems that technically the regression was introduced in
  ea190ed14b, but not directly, via using gather paths in more
  situations.

* The direct problem is that JsonbIteratorInit is confused by the
  argument, because it contains both headers for array and object set.
  From what I understood this confusion stems from
  ExecMakeFunctionResultSet taking wrong value from the slot. Hacking it
  to use different attnum in this case makes it work.

* The explanation for this could be that target list from where attnum
  is taken and slot values have different order of elements. This in
  turn comes out of grouping planner trying to isolate SRF and SRF-free
  targets and as a result changing the order of elements in
  final_target. The final_target is then passed into
  create_ordered_paths and applied via apply_projection_to_path, but
  somehow goes in disagreement with what is used while creating the slot
  with values.

If this analysis is correct, I'm not sure yet what would be the best
course of action to address the problem, need to think a bit more. But
probably others have suggestions or comments?



Re: XX000: iso-8859-1 type of jsonb container.

От
Dmitry Dolgov
Дата:
> On Tue, Apr 13, 2021 at 06:15:00PM +0200, Dmitry Dolgov wrote:
> > On Wed, Apr 07, 2021 at 08:11:05PM +0200, Dmitry Dolgov wrote:
> > > On Wed, Apr 07, 2021 at 10:59:31AM +0000, Poot, Bas (B.J.) wrote:
> > >
> > > Finally! I have a testcase for you guys. This is my query to generate the data.
> > > select *
> > > into tmp_bug2
> > > from (
> > > select 'thing' as  logical_name
> > > , 'thing' as display_name
> > > , 'thing' as operation
> > > , '{"unit": "Our special Unit", "some.place": "placename", "place_of_birth": "Over the rainbow",
"How_secret_is_it":"Level 1", "Does.it.shoot": "YES!"}'::jsonb as filtur
 
> > > from generate_series(1, 302443)
> > > ) t1;
> > >
> > > And this is the query that generates the error.
> > >
> > > select
> > > display_name
> > > ,'' as x
> > > ,filtur
> > > ,jsonb_each_text(filtur) as x
> > > ,to_jsonb(jsonb_each_text(filtur)) as frows
> > > ,array(SELECT jsonb_object_keys(filtur)) as objectkeys
> > > from tmp_bug2
> > > order by logical_name;
> > >
> > > Notice that when you generate not 302443 but 302442 rows, it works perfectly fine.
> >
> > Thanks for posting the test case, I can reproduce it on the master
> > branch as well (also without the second call to jsonb_each_text in line
> > with to_jsonb). Interesting, it looks like for unclear to me reasons the
> > argument, evaluated in ExecMakeFunctionResultSet for jsonb_each_text,
> > contains value from the previous attribute, not jsonb. This makes
> > iteratorFromContainer complain because both array & object flags are set
> > in the header. I'll try to investigate, unless someone else will be
> > faster.
> 
> Here is what I've found so far:
> 
> * It seems that technically the regression was introduced in
>   ea190ed14b, but not directly, via using gather paths in more
>   situations.
> 
> * The direct problem is that JsonbIteratorInit is confused by the
>   argument, because it contains both headers for array and object set.
>   From what I understood this confusion stems from
>   ExecMakeFunctionResultSet taking wrong value from the slot. Hacking it
>   to use different attnum in this case makes it work.
> 
> * The explanation for this could be that target list from where attnum
>   is taken and slot values have different order of elements. This in
>   turn comes out of grouping planner trying to isolate SRF and SRF-free
>   targets and as a result changing the order of elements in
>   final_target. The final_target is then passed into
>   create_ordered_paths and applied via apply_projection_to_path, but
>   somehow goes in disagreement with what is used while creating the slot
>   with values.
> 
> If this analysis is correct, I'm not sure yet what would be the best
> course of action to address the problem, need to think a bit more. But
> probably others have suggestions or comments?

I couldn't find any other feasible explanations, and have come to a
conclusion that this happens when a projection is applied to a
GatherMerge path. As it's a projection capable path, no new projection
is created and target list is changed in place. In the subpath target
list ordering is different because of query ordering, and I don't see
where it all comes together during execution. Funny enough even explain
shows that the final plan passes a wrong values to jsonb_each_text.

If I make GatherMerge non projection capable it fixes this particular
case and changes only one test in select_parallel (seems like a minor
plan changes). But I have not enough experience with this code to say if
it's a good or bad idea.



Re: XX000: iso-8859-1 type of jsonb container.

От
Dmitry Dolgov
Дата:
> On Fri, Apr 23, 2021 at 05:15:05PM +0200, Dmitry Dolgov wrote:
> > On Tue, Apr 13, 2021 at 06:15:00PM +0200, Dmitry Dolgov wrote:
> > > On Wed, Apr 07, 2021 at 08:11:05PM +0200, Dmitry Dolgov wrote:
> > > > On Wed, Apr 07, 2021 at 10:59:31AM +0000, Poot, Bas (B.J.) wrote:
> > > >
> > > > Finally! I have a testcase for you guys. This is my query to generate the data.
> > > > select *
> > > > into tmp_bug2
> > > > from (
> > > > select 'thing' as  logical_name
> > > > , 'thing' as display_name
> > > > , 'thing' as operation
> > > > , '{"unit": "Our special Unit", "some.place": "placename", "place_of_birth": "Over the rainbow",
"How_secret_is_it":"Level 1", "Does.it.shoot": "YES!"}'::jsonb as filtur
 
> > > > from generate_series(1, 302443)
> > > > ) t1;
> > > >
> > > > And this is the query that generates the error.
> > > >
> > > > select
> > > > display_name
> > > > ,'' as x
> > > > ,filtur
> > > > ,jsonb_each_text(filtur) as x
> > > > ,to_jsonb(jsonb_each_text(filtur)) as frows
> > > > ,array(SELECT jsonb_object_keys(filtur)) as objectkeys
> > > > from tmp_bug2
> > > > order by logical_name;
> > > >
> > > > Notice that when you generate not 302443 but 302442 rows, it works perfectly fine.
> > >
> > > Thanks for posting the test case, I can reproduce it on the master
> > > branch as well (also without the second call to jsonb_each_text in line
> > > with to_jsonb). Interesting, it looks like for unclear to me reasons the
> > > argument, evaluated in ExecMakeFunctionResultSet for jsonb_each_text,
> > > contains value from the previous attribute, not jsonb. This makes
> > > iteratorFromContainer complain because both array & object flags are set
> > > in the header. I'll try to investigate, unless someone else will be
> > > faster.
> > 
> > Here is what I've found so far:
> > 
> > * It seems that technically the regression was introduced in
> >   ea190ed14b, but not directly, via using gather paths in more
> >   situations.
> > 
> > * The direct problem is that JsonbIteratorInit is confused by the
> >   argument, because it contains both headers for array and object set.
> >   From what I understood this confusion stems from
> >   ExecMakeFunctionResultSet taking wrong value from the slot. Hacking it
> >   to use different attnum in this case makes it work.
> > 
> > * The explanation for this could be that target list from where attnum
> >   is taken and slot values have different order of elements. This in
> >   turn comes out of grouping planner trying to isolate SRF and SRF-free
> >   targets and as a result changing the order of elements in
> >   final_target. The final_target is then passed into
> >   create_ordered_paths and applied via apply_projection_to_path, but
> >   somehow goes in disagreement with what is used while creating the slot
> >   with values.
> > 
> > If this analysis is correct, I'm not sure yet what would be the best
> > course of action to address the problem, need to think a bit more. But
> > probably others have suggestions or comments?
> 
> I couldn't find any other feasible explanations, and have come to a
> conclusion that this happens when a projection is applied to a
> GatherMerge path. As it's a projection capable path, no new projection
> is created and target list is changed in place. In the subpath target
> list ordering is different because of query ordering, and I don't see
> where it all comes together during execution. Funny enough even explain
> shows that the final plan passes a wrong values to jsonb_each_text.
> 
> If I make GatherMerge non projection capable it fixes this particular
> case and changes only one test in select_parallel (seems like a minor
> plan changes). But I have not enough experience with this code to say if
> it's a good or bad idea.

Almost forgot about this one. It seems that the issue still could be
reproduced on the latest master branch, so probably worth an open item.



Re: XX000: iso-8859-1 type of jsonb container.

От
"Poot, Bas (B.J.)"
Дата:

thank you for not forgetting 👍


I read about the postgres 14 beta and was afraid the bug was indeed still in there..


Is there anything else I can do to help you solve this?




Van: Dmitry Dolgov <9erthalion6@gmail.com>
Verzonden: woensdag 26 mei 2021 15:50
Aan: Poot, Bas (B.J.)
CC: pgsql-bugs@lists.postgresql.org
Onderwerp: Re: XX000: unknown type of jsonb container.
 
> On Fri, Apr 23, 2021 at 05:15:05PM +0200, Dmitry Dolgov wrote:
> > On Tue, Apr 13, 2021 at 06:15:00PM +0200, Dmitry Dolgov wrote:
> > > On Wed, Apr 07, 2021 at 08:11:05PM +0200, Dmitry Dolgov wrote:
> > > > On Wed, Apr 07, 2021 at 10:59:31AM +0000, Poot, Bas (B.J.) wrote:
> > > >
> > > > Finally! I have a testcase for you guys. This is my query to generate the data.
> > > > select *
> > > > into tmp_bug2
> > > > from (
> > > > select 'thing' as  logical_name
> > > > , 'thing' as display_name
> > > > , 'thing' as operation
> > > > , '{"unit": "Our special Unit", "some.place": "placename", "place_of_birth": "Over the rainbow", "How_secret_is_it": "Level 1", "Does.it.shoot": "YES!"}'::jsonb as filtur
> > > > from generate_series(1, 302443)
> > > > ) t1;
> > > >
> > > > And this is the query that generates the error.
> > > >
> > > > select
> > > > display_name
> > > > ,'' as x
> > > > ,filtur
> > > > ,jsonb_each_text(filtur) as x
> > > > ,to_jsonb(jsonb_each_text(filtur)) as frows
> > > > ,array(SELECT jsonb_object_keys(filtur)) as objectkeys
> > > > from tmp_bug2
> > > > order by logical_name;
> > > >
> > > > Notice that when you generate not 302443 but 302442 rows, it works perfectly fine.
> > >
> > > Thanks for posting the test case, I can reproduce it on the master
> > > branch as well (also without the second call to jsonb_each_text in line
> > > with to_jsonb). Interesting, it looks like for unclear to me reasons the
> > > argument, evaluated in ExecMakeFunctionResultSet for jsonb_each_text,
> > > contains value from the previous attribute, not jsonb. This makes
> > > iteratorFromContainer complain because both array & object flags are set
> > > in the header. I'll try to investigate, unless someone else will be
> > > faster.
> >
> > Here is what I've found so far:
> >
> > * It seems that technically the regression was introduced in
> >   ea190ed14b, but not directly, via using gather paths in more
> >   situations.
> >
> > * The direct problem is that JsonbIteratorInit is confused by the
> >   argument, because it contains both headers for array and object set.
> >   From what I understood this confusion stems from
> >   ExecMakeFunctionResultSet taking wrong value from the slot. Hacking it
> >   to use different attnum in this case makes it work.
> >
> > * The explanation for this could be that target list from where attnum
> >   is taken and slot values have different order of elements. This in
> >   turn comes out of grouping planner trying to isolate SRF and SRF-free
> >   targets and as a result changing the order of elements in
> >   final_target. The final_target is then passed into
> >   create_ordered_paths and applied via apply_projection_to_path, but
> >   somehow goes in disagreement with what is used while creating the slot
> >   with values.
> >
> > If this analysis is correct, I'm not sure yet what would be the best
> > course of action to address the problem, need to think a bit more. But
> > probably others have suggestions or comments?
>
> I couldn't find any other feasible explanations, and have come to a
> conclusion that this happens when a projection is applied to a
> GatherMerge path. As it's a projection capable path, no new projection
> is created and target list is changed in place. In the subpath target
> list ordering is different because of query ordering, and I don't see
> where it all comes together during execution. Funny enough even explain
> shows that the final plan passes a wrong values to jsonb_each_text.
>
> If I make GatherMerge non projection capable it fixes this particular
> case and changes only one test in select_parallel (seems like a minor
> plan changes). But I have not enough experience with this code to say if
> it's a good or bad idea.

Almost forgot about this one. It seems that the issue still could be
reproduced on the latest master branch, so probably worth an open item.
------------------------- Disclaimer ----------------------------
De informatie verzonden met dit e-mailbericht (en bijlagen) is uitsluitend bestemd voor de geadresseerde(n) en zij die van de geadresseerde(n) toestemming kregen dit bericht te lezen.
Kennisneming door anderen is niet toegestaan.
De informatie in dit e-mailbericht (en bijlagen) kan vertrouwelijk van aard zijn en binnen het bereik van een geheimhoudingsplicht en/of een verschoningsrecht vallen.
Indien dit e-mailbericht niet voor u bestemd is, wordt u verzocht de afzender daarover onmiddellijk te informeren en het e-mailbericht (en bijlagen) te vernietigen.
Conform het beveiligingsbeleid van de Politie wordt e-mail van en naar de politie gecontroleerd op virussen, spam en phishing en moet deze e-mail voldoen aan de voor de overheid verplichte mailbeveiligingsstandaarden die zijn vastgesteld door het Forum Standaardisatie.
Mail die niet voldoet aan het beveiligingsbeleid kan worden geblokkeerd waardoor deze de geadresseerde niet bereikt. De geadresseerde wordt hiervan niet in kennis gesteld.
---------------------------------------------------------------------
The information sent in this E-mail message (including any attachments) is exclusively intended for the individual(s) to whom it is addressed and for the individual(s) who has/have had permission from the recipient(s) to read this message.
Access by others is not permitted.
The information in this E-mail message (including any attachments) may be of a confidential nature and may form part of the duty of confidentiality and/or the right of non-disclosure.
If you have received this E-mail message in error, please notify the sender without delay and delete the E-mail message (including any attachments).
In conformity with the security policy of the Police, E-mails from and to the Police are checked for viruses, spam and phishing and this E-mail must meet the standards of the government-imposed E-mail security as set by the Standardization Forum.
Any E-mail failing to meet said security policy may be blocked as a result of which it will not reach the intended recipient. The recipient concerned will not be notified.
---------------------------------------------------------------------

Re: XX000: iso-8859-1 type of jsonb container.

От
Dmitry Dolgov
Дата:
> On Wed, May 26, 2021 at 01:52:07PM +0000, Poot, Bas (B.J.) wrote:
>
> thank you for not forgetting 👍
> 
> I read about the postgres 14 beta and was afraid the bug was indeed still in there..
> 
> Is there anything else I can do to help you solve this?

Theoretically nothing else, you've already done a lot by providing such
a good reproducing script :) I just need to find someone with better
understanding of GatherMerge part to confirm my conclusions or suggest
something better.



Re: XX000: iso-8859-1 type of jsonb container.

От
Tom Lane
Дата:
Dmitry Dolgov <9erthalion6@gmail.com> writes:
> I couldn't find any other feasible explanations, and have come to a
> conclusion that this happens when a projection is applied to a
> GatherMerge path. As it's a projection capable path, no new projection
> is created and target list is changed in place. In the subpath target
> list ordering is different because of query ordering, and I don't see
> where it all comes together during execution. Funny enough even explain
> shows that the final plan passes a wrong values to jsonb_each_text.

I dug into this, and found that the proximate problem is that we're
generating a Sort node in which the output column order is different
from the input.  If Sort could project, that'd be fine, but of course
it doesn't.  (I wonder if there should be some more assertions in
setrefs.c to try to catch plans that are broken like that.)

Of course, it's not like the planner has never heard of that restriction.
The problem occurs because what arrives at createplan.c looks like

    ProjectionPath
        ProjectionPath
            SortPath
                SeqScanPath

that is, we've got *two* redundant ProjectionPaths, and the logic in
createplan.c that is supposed to decide whether we can skip generating
a projecting Result node for a ProjectionPath gets confused and
decides we don't need any Result, rather than deciding we need one
Result.

Maybe we could work around that, but having two stacked ProjectionPaths
is just silly, so the attached patch deals with this problem by making
sure we don't do that.

The place where that happens in the example reported in this thread is
apply_projection_to_path, which somebody kluged up to the point where
it'd create this situation just below a Gather/GatherMerge node.
So I first tried to fix it there, and for safety added an Assert to
create_projection_path saying that its input wasn't a ProjectionPath.

The Assert blew up the regression tests.  The same thing is happening
in other places, and only by luck have we not noticed any bad effects.

So the attached fixes it by stripping any input ProjectionPath in
create_projection_path, instead.

I'm not sure about creating a test case.  The case reported here is
far too expensive to use in the regression tests.

            regards, tom lane

diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index b248b038e0..6c6bde04f6 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2632,7 +2632,22 @@ create_projection_path(PlannerInfo *root,
                        PathTarget *target)
 {
     ProjectionPath *pathnode = makeNode(ProjectionPath);
-    PathTarget *oldtarget = subpath->pathtarget;
+    PathTarget *oldtarget;
+
+    /*
+     * We mustn't put a ProjectionPath directly above another; it's useless
+     * and will confuse createplan.c.  Rather than making sure all callers
+     * know that, let's implement it here, by stripping off any ProjectionPath
+     * in what we're given.  Given this rule, there won't be more than one.
+     */
+    if (IsA(subpath, ProjectionPath))
+    {
+        ProjectionPath *subpp = (ProjectionPath *) subpath;
+
+        Assert(subpp->path.parent == rel);
+        subpath = subpp->subpath;
+        Assert(!IsA(subpath, ProjectionPath));
+    }

     pathnode->path.pathtype = T_Result;
     pathnode->path.parent = rel;
@@ -2658,6 +2673,7 @@ create_projection_path(PlannerInfo *root,
      * Note: in the latter case, create_projection_plan has to recheck our
      * conclusion; see comments therein.
      */
+    oldtarget = subpath->pathtarget;
     if (is_projection_capable_path(subpath) ||
         equal(oldtarget->exprs, target->exprs))
     {

Re: XX000: iso-8859-1 type of jsonb container.

От
Tom Lane
Дата:
I wrote:
> I'm not sure about creating a test case.  The case reported here is
> far too expensive to use in the regression tests.

Ah, we can do it in select_parallel, which reduces all the parallel
costs to zero to favor these sorts of plans.

This test case only fails back to v13, as does the original example.
I suspect we should back-patch the code change further though,
since create_projection_plan will be just as confused by such cases
in earlier branches.

            regards, tom lane

diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index b248b038e0..9ce5f95e3b 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2632,7 +2632,23 @@ create_projection_path(PlannerInfo *root,
                        PathTarget *target)
 {
     ProjectionPath *pathnode = makeNode(ProjectionPath);
-    PathTarget *oldtarget = subpath->pathtarget;
+    PathTarget *oldtarget;
+
+    /*
+     * We mustn't put a ProjectionPath directly above another; it's useless
+     * and will confuse create_projection_plan.  Rather than making sure all
+     * callers handle that, let's implement it here, by stripping off any
+     * ProjectionPath in what we're given.  Given this rule, there won't be
+     * more than one.
+     */
+    if (IsA(subpath, ProjectionPath))
+    {
+        ProjectionPath *subpp = (ProjectionPath *) subpath;
+
+        Assert(subpp->path.parent == rel);
+        subpath = subpp->subpath;
+        Assert(!IsA(subpath, ProjectionPath));
+    }

     pathnode->path.pathtype = T_Result;
     pathnode->path.parent = rel;
@@ -2658,6 +2674,7 @@ create_projection_path(PlannerInfo *root,
      * Note: in the latter case, create_projection_plan has to recheck our
      * conclusion; see comments therein.
      */
+    oldtarget = subpath->pathtarget;
     if (is_projection_capable_path(subpath) ||
         equal(oldtarget->exprs, target->exprs))
     {
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 05ebcb284a..4ea1aa7dfd 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -1126,6 +1126,29 @@ ORDER BY 1, 2, 3;
 ------------------------------+---------------------------+-------------+--------------
 (0 rows)

+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT generate_series(1, two), array(select generate_series(1, two))
+  FROM tenk1 ORDER BY tenthous;
+                              QUERY PLAN
+----------------------------------------------------------------------
+ ProjectSet
+   Output: generate_series(1, tenk1.two), (SubPlan 1), tenk1.tenthous
+   ->  Gather Merge
+         Output: tenk1.two, tenk1.tenthous
+         Workers Planned: 4
+         ->  Result
+               Output: tenk1.two, tenk1.tenthous
+               ->  Sort
+                     Output: tenk1.tenthous, tenk1.two
+                     Sort Key: tenk1.tenthous
+                     ->  Parallel Seq Scan on public.tenk1
+                           Output: tenk1.tenthous, tenk1.two
+   SubPlan 1
+     ->  ProjectSet
+           Output: generate_series(1, tenk1.two)
+           ->  Result
+(16 rows)
+
 -- test passing expanded-value representations to workers
 CREATE FUNCTION make_some_array(int,int) returns int[] as
 $$declare x int[];
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index d31e290ec2..f924731248 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -429,6 +429,10 @@ ORDER BY 1;
 SELECT * FROM information_schema.foreign_data_wrapper_options
 ORDER BY 1, 2, 3;

+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT generate_series(1, two), array(select generate_series(1, two))
+  FROM tenk1 ORDER BY tenthous;
+
 -- test passing expanded-value representations to workers
 CREATE FUNCTION make_some_array(int,int) returns int[] as
 $$declare x int[];

Re: XX000: iso-8859-1 type of jsonb container.

От
Dmitry Dolgov
Дата:
> On Sun, May 30, 2021 at 08:59:28PM -0400, Tom Lane wrote:
> 
> Of course, it's not like the planner has never heard of that restriction.
> The problem occurs because what arrives at createplan.c looks like
> 
>     ProjectionPath
>         ProjectionPath
>             SortPath
>                 SeqScanPath
> 
> that is, we've got *two* redundant ProjectionPaths, and the logic in
> createplan.c that is supposed to decide whether we can skip generating
> a projecting Result node for a ProjectionPath gets confused and
> decides we don't need any Result, rather than deciding we need one
> Result.
> 
> Maybe we could work around that, but having two stacked ProjectionPaths
> is just silly, so the attached patch deals with this problem by making
> sure we don't do that.
> 
> The place where that happens in the example reported in this thread is
> apply_projection_to_path, which somebody kluged up to the point where
> it'd create this situation just below a Gather/GatherMerge node.
> So I first tried to fix it there, and for safety added an Assert to
> create_projection_path saying that its input wasn't a ProjectionPath.
> 
> The Assert blew up the regression tests.  The same thing is happening
> in other places, and only by luck have we not noticed any bad effects.
> 
> So the attached fixes it by stripping any input ProjectionPath in
> create_projection_path, instead.
 
Thanks for looking into this. I don't see any problems with the
suggested patch, so +1.

> This test case only fails back to v13, as does the original example.
> I suspect we should back-patch the code change further though,
> since create_projection_plan will be just as confused by such cases
> in earlier branches.

That would be my assumption as well.



Re: XX000: iso-8859-1 type of jsonb container.

От
Tom Lane
Дата:
Dmitry Dolgov <9erthalion6@gmail.com> writes:
> On Sun, May 30, 2021 at 08:59:28PM -0400, Tom Lane wrote:
>> This test case only fails back to v13, as does the original example.
>> I suspect we should back-patch the code change further though,
>> since create_projection_plan will be just as confused by such cases
>> in earlier branches.

> That would be my assumption as well.

Done now.  I spent a little bit of time looking for a test case
that'd fail in the prior branches, without success.  I think
it's highly probable that there is one, but I didn't want to
spend any more time here.

            regards, tom lane