Bug in either collation docs or code

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Bug in either collation docs or code
Дата
Msg-id CAAKRu_ZDVaE48w+zNE2x2gcFEbRShtHyMg0ERF5rO-_UM_cCHQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Bug in either collation docs or code  (Heikki Linnakangas <hlinnaka@iki.fi>)
Список pgsql-hackers
Hi,
I noticed what seems like a bug in collation.

This query errors out:
SELECT (SELECT 'c' COLLATE "de_DE") > (SELECT 'ç' COLLATE "es_ES");  -- error

While this query does not:
SELECT 'c' COLLATE "de_DE" > (SELECT 'ç' COLLATE "es_ES");  -- no error

It seems like this is in conflict with what the documentation says:
"If any input expression has an explicit collation derivation, then all explicitly derived collations among the input expressions must be the same, otherwise an error is raised."

After digging into it a bit, I believe the reason for this is that when we are assigning collations in assign_collations_walker, we always set collation strength to IMPLICIT for the subquery and always set the collation strength to EXPLICIT for the collate node on the other side of the OpExpr. So, we don't hit an error later like the one in merge_collation_state when the collation of one expression is conflicting with that of its parent and the strength of both is EXPLICIT.
I think this still applies to our case because one of the two arguments to OpExpr would have set their parent's collation strength to either IMPLICIT or EXPLICIT and then we will process the other argument which would then have a different collation strength than the one we just set its parent to. So, we end up setting the inputcollid for the OpExpr to that of the explicit collation in the collate node.

Basically, it seems like our subquery will always have its collation strength set to IMPLICIT, so, if we have explicit collation in the first target entry of the subquery's target list, it looks like we will never truly treat that as explicit collation.

To test the theory that this is why we are not erroring out with explicit collation on either side of our expression, I added a hack to set the collation strength for a subquery to EXPLICIT if its first target entry is a collate node (see attached patch). With this hack, it throws what I think is the correct error in the case above.
However, I am sure that this is too specific a way of solving this. Just on first thought, it wouldn't handle SubPlans (where we have a param as the other argument to the OpExpr).

This query plans and executes with no error now and with the attached patch:
select 'c' COLLATE "de_DE" > ANY(select 'ç' COLLATE "es_ES");

I'm not sure if this behavior is considered a bug, but I also can't imagine how it would be expected given the current documentation. It seems to me one or the other should be updated.

--
Melanie Plageman
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: [PATCH] Improve geometric types
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: pg_replication_slot_advance to return NULL instead of 0/0 ifslot not advanced