Re: Separate the attribute physical order from logical order

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: Separate the attribute physical order from logical order
Дата
Msg-id 20220628182723.shy47wx74mt6ordh@alvherre.pgsql
обсуждение исходный текст
Ответ на Re: Separate the attribute physical order from logical order  (Julien Rouhaud <rjuju123@gmail.com>)
Ответы Re: Separate the attribute physical order from logical order  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On 2022-Jun-28, Julien Rouhaud wrote:


> On Tue, Jun 28, 2022 at 10:53:14AM +0200, Alvaro Herrera wrote:

> > My feeling is that every aspect of user interaction should show
> > columns ordered in logical order.
> 
> I'm not entirely sure of what you meant.  Assuming tables a(a, z) and b(b, z),
> what do you think those queries should return?
> 
> SELECT * FROM a JOIN b on a.z = b.z
> Currently it returns (a.a, a.z, b.b, b.z)
> 
> SELECT * FROM a JOIN b USING (z)
> Currently it returns a.z, a.a, b.b.
> 
> Should it now return (a.a, z, b.b) as long as the tables have that logical
> order, whether or not any other position (attnum / attphysnum) is different or
> stay the same as now?

For all user-visible intents and purposes, the column order is whatever
the logical order is (attlognum), regardless of attnum and attphysnum.
If the logical order is changed, then the order of the output columns of
a join will change to match.  The attnum and attphysnum are completely
irrelevant to all these purposes.  So, to answer your question, if the
join expands in this way at present, then it should continue to expand
that way if you define a table that has different attnum/attphysnum but
the same attlognum for those columns.


> I'm not following.  If we keep attnum as the official identity position and
> use attlognum as the position that should be used in any interactive command,
> wouldn't that risk to break every single client?

Yeah, it might break a lot of tools, but other things break tools too
and the world just moves on.

But if you don't want to break tools, I can think of two alternatives:

1. make the immutable column identity something like attidnum and
   keep attnum as the logical column order.
   This keeps tools happy, but if they try to match pg_attrdef by attnum
   bad things will happen.

2. in order to avoid possible silent breakage, remove attnum altogether
   and just have attidnum, attlognum, attphysnum; then every tool is
   forced to undergo an update.  Any cross-catalog relationships are now
   correct.

> Imagine you have some framework that automatically generates queries based on
> the catalog, if it sees table abc with:
> c: attnum 1, attphysnum 1, attlognum 3
> b: attnum 2, attphysnum 2, attlognum 2
> a: attnum 3, attphysnum 3, attlognum 1

Hopefully the framework will add a column list,
  INSERT INTO abc (c,b,a) VALUES ('c', 'b', 'a');
to avoid this problem.  But if it doesn't, then yeah it will misbehave,
and I don't think you should try to make it not misbehave.

> Also, about the default values evaluation (see [2]), should it be tied to your
> attnum, attphysnum or attlognum?

Default is tied to column identity.  If you change column order, the
defaults don't need to change at all.  Similarly, if the server decides
to repack the columns in a different way to save alignment padding, the
defaults don't need to change.

If you do not provide a column identity number or you use something else
(e.g. attlognum) to cross-references attributes from other catalogs,
then you'll have to edit pg_attrdef when a column moves; and any other
reference to a column number will have to change.  Or think about
pg_depend.  You don't want that.  This is why you need three columns,
not two.

> I think that supporting at least a way to specify the logical order
> during the table creation should be easy to implement

As long as it is really simple (just some stuff in CREATE TABLE, nothing
at all in ALTER TABLE) then that sounds good.  I just suggest not to
complicate things too much to avoid the risk of failing the project
altogether.

For testability, her's a crazy idea: have some test mode (maybe #ifdef
USE_ASSERT_CHECKING) that randomizes attlognum to start at some N >> 1,
and only attidnum starts at 1.  Then they never match and all tools need
to ensure they handle weird cases correctly.

> (there shouldn't be any question on whether it needs to invalidate any
> cache or what lock level to use), and could also be added in the
> initial submission without much extra efforts, which could help with
> the testing.

Famous last words :-)

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Nadie está tan esclavizado como el que se cree libre no siéndolo" (Goethe)



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Matthias van de Meent
Дата:
Сообщение: Re: making relfilenodes 56 bits
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Hardening PostgreSQL via (optional) ban on local file system access