Thread: preliminary: logical column order

preliminary: logical column order

From
Neil Conway
Date:
This patch provides a preliminary implementation of the logical column
ordering functionality recently proposed on -hackers. This patch has a
number of known issues, so please don't apply it.

Current functionality implemented:

        - Added attpos to the system catalogs & bumped the catalog
          version: initdb required
        - COPY TO/FROM obey logical column ordering
        - When expanding "SELECT * ...", we make sure to obey logical
          column order (see below, however)
        - If INSERT is given an empty column list, it will produce the
          necessary implicit column list sorted by attpos
        - (unrelated) improve a bunch of comments in pg_class.h,
          remove some dead code from pg_attribute.h, refactor some
          code in access/common/printtup.c

Remaining work to do:

        - Change the name of the pg_attribute attribute to something
          other than 'attpos', per the discussion on -hackers. I don't
          believe we've settled on the right name yet.
        - The code for actually sorting the columns in attpos-order is
          duplicated a few times -- this was just done for the sake of
          convenience, I'm going to clean this up and stick it in a
          single, shared location in the new patch.
        - The INSERT change causes the alter_table regression tests to
          fail (the regression diff is attached). This appears to be
          the logical column ordering interacting with inheritance,
          but I haven't yet taken an in-depth look at it.
        - When processing a "SELECT *", for example, the actual data
          columns are returned in the right order, but the
          RowDescription messages sent by libpq are not (i.e. they are
          sent in attnum-order, not attpos).

          In SendRowDescriptionMessage() and related printtup code, I
          took a look at trying to sort the attributes we're about to
          send RowDescription messages for by their 'attpos', but the
          TupleDesc that we have for the return type doesn't include
          a filled-in attpos.

          I spent a while trying to see how feasible it would be to
          fill in the result-type-TupleDesc with an attpos where
          available, but couldn't figure out an easy way to do
          this. Are there any suggestions on how this should be best
          done?
        - I haven't yet implemented ordering-by-attpos for the output
          columns of a join, or the alias->column matching in the FROM
          alias list (per Tom's email to -hackers). I saw a couple
          places where I might be able to do this, but I'm not quite sure
          what the correct spot is. Tom, do you have any suggestions?

Any comments would be gratefully appreciated.

-Neil

Attachments

Re: preliminary: logical column order

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
>         - The code for actually sorting the columns in attpos-order is
>           duplicated a few times -- this was just done for the sake of
>           convenience, I'm going to clean this up and stick it in a
>           single, shared location in the new patch.

Bruce and I were chatting about that on the phone today.  I think it
might be useful for TupleDescs to doubly index their contained attribute
rows --- that is, keep the existing array-indexed-by-attnum, but add
another pointer array indexed by attpos, containing only nondeleted
columns.  This would be easy to build, and it'd eliminate
searching/sorting for places that had access to a TupleDesc.
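
To illustrate the double-indexing idea, here is a minimal sketch. It is not code from the patch: `Attr`, `Desc`, `cmp_attpos` and `desc_build_bypos` are hypothetical, simplified stand-ins for Form_pg_attribute and TupleDesc, and plain malloc is used where the backend would use its own allocator. The existing array stays indexed by attnum, and a second pointer array lists only the nondeleted columns in attpos order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for Form_pg_attribute. */
typedef struct {
    int  attnum;        /* physical column number, 1-based */
    int  attpos;        /* logical position */
    bool attisdropped;
} Attr;

/* Hypothetical, simplified stand-in for TupleDesc. */
typedef struct {
    int    natts;       /* number of entries in attrs[] */
    Attr  *attrs;       /* existing index: attrs[attnum - 1] */
    int    nlive;       /* number of entries in bypos[] */
    Attr **bypos;       /* second index: live columns in attpos order */
} Desc;

/* qsort comparator over Attr pointers, ascending attpos. */
static int cmp_attpos(const void *a, const void *b)
{
    const Attr *x = *(Attr *const *) a;
    const Attr *y = *(Attr *const *) b;

    return (x->attpos > y->attpos) - (x->attpos < y->attpos);
}

/* Build the attpos index; returns false on allocation failure. */
static bool desc_build_bypos(Desc *d)
{
    assert(d != NULL && d->natts >= 0);

    d->nlive = 0;
    d->bypos = malloc(sizeof(Attr *) * (d->natts > 0 ? d->natts : 1));
    if (d->bypos == NULL)
        return false;

    /* Collect pointers to the nondeleted columns only. */
    for (int i = 0; i < d->natts; i++)
        if (!d->attrs[i].attisdropped)
            d->bypos[d->nlive++] = &d->attrs[i];

    qsort(d->bypos, d->nlive, sizeof(Attr *), cmp_attpos);
    return true;
}
```

Because bypos[] holds pointers into attrs[] rather than copies, the two indexes cannot drift apart as long as attrs[] itself is not reallocated.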

>         - When processing a "SELECT *", for example, the actual data
>           columns are returned in the right order, but the
>           RowDescription messages sent by libpq are not (i.e. they are
>           sent in attnum-order, not attpos).

Easy to fix given above proposal ... although actually I am not sure why
this would occur.  printtup and friends should always get a constructed
TupDesc that has no notion of deleted or renumbered columns.  This may
be a symptom of a more fundamental error somewhere.

            regards, tom lane

Re: preliminary: logical column order

From
Manfred Koizar
Date:
On Fri, 21 Nov 2003 03:09:02 -0500, Neil Conway <neilc@samurai.com> wrote:
>attachment; filename=weird_regression.diffs

This was caused by a small oversight in ALTER TABLE ... ADD COLUMN:

diff -ruN ../base/src/backend/commands/tablecmds.c src/backend/commands/tablecmds.c
--- ../base/src/backend/commands/tablecmds.c    2003-10-14 00:47:15.000000000 +0200
+++ src/backend/commands/tablecmds.c    2003-11-23 16:51:37.000000000 +0100
@@ -1786,6 +1786,7 @@
     attribute->attcacheoff = -1;
     attribute->atttypmod = colDef->typename->typmod;
     attribute->attnum = i;
+    attribute->attpos = i;
     attribute->attbyval = tform->typbyval;
     attribute->attndims = attndims;
     attribute->attisset = (bool) (tform->typtype == 'c');


Servus
 Manfred

Re: preliminary: logical column order

From
Neil Conway
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Bruce and I were chatting about that on the phone today.  I think it
> might be useful for TupleDescs to doubly index their contained
> attribute rows

Ok, I implemented this. I made it so that the properly sorted
attribute array is constructed lazily -- my reasoning was that (a)
relatively few locations in the code actually use the sorted attribute
array, so there is no point allocating and initializing it every time
a TupleDesc is used, and (b) the array could potentially be 1,500 pointers
long -- not huge, but not tiny either, so it is worth a little effort
to avoid unnecessarily alloc'ing and sorting it. Access to the sorted
array is (only) done via a new function, GetAttrByLogicalPosition().
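
For readers without the attachment, the lazy construction might look roughly like the sketch below. This is an assumption-laden illustration, not the patch's code: `Attr`, `Desc` and `get_attr_by_logical_position` are hypothetical simplified types and names, the `builds` counter exists only to show that sorting happens once, and the accessor treats its argument as a 1-based rank in logical order (which equals attlogpos only when the positions are dense).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for Form_pg_attribute. */
typedef struct {
    int attnum;         /* physical column number, 1-based */
    int attlogpos;      /* logical position */
} Attr;

/* Hypothetical, simplified stand-in for TupleDesc. */
typedef struct {
    int    natts;
    Attr  *attrs;       /* indexed by attnum - 1 */
    Attr **sorted;      /* NULL until the first logical lookup */
    int    builds;      /* instrumentation for this sketch only */
} Desc;

/* qsort comparator over Attr pointers, ascending attlogpos. */
static int cmp_logpos(const void *a, const void *b)
{
    const Attr *x = *(Attr *const *) a;
    const Attr *y = *(Attr *const *) b;

    return (x->attlogpos > y->attlogpos) - (x->attlogpos < y->attlogpos);
}

/* Return the attribute at 1-based logical rank pos, or NULL. */
static Attr *get_attr_by_logical_position(Desc *d, int pos)
{
    assert(d != NULL);

    if (pos < 1 || pos > d->natts)
        return NULL;

    if (d->sorted == NULL) {
        /* First use: allocate and sort the pointer array once. */
        d->sorted = malloc(sizeof(Attr *) * d->natts);
        if (d->sorted == NULL)
            return NULL;
        for (int i = 0; i < d->natts; i++)
            d->sorted[i] = &d->attrs[i];
        qsort(d->sorted, d->natts, sizeof(Attr *), cmp_logpos);
        d->builds++;
    }
    return d->sorted[pos - 1];
}
```

A descriptor that is never asked for a logical position never pays for the allocation or the sort. Any code that changes attlogpos would have to free and reset `sorted` so the array is rebuilt on the next lookup.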

> Easy to fix given above proposal ... although actually I am not sure
> why this would occur.  printtup and friends should always get a
> constructed TupDesc that has no notion of deleted or renumbered
> columns.

It seems to happen because:

   - SendRowDescription() in printtup.c notes that "the TupleDesc has
     been manufactured by ExecTypeFromTL() or some similar function;
     it does not contain a full set of fields."

   - ExecTypeFromTL() allocates an empty TupleDesc, and then fills in
     its individual entries with data from a TargetList -- the
     Form_pg_attribute for the attributes involved is never consulted,
     so the default attlogpos inserted by TupleDescInitEntry() is
     used: attlogpos == attnum

I'm unsure of the best way to fix this so that the TupleDesc that is
handed to printtup & friends contains the information we require. Any
suggestions?

A new version of the patch is attached. Changes:

   - replace sorting code with a lazily-constructed sorted array of
     pointers to attribute data in TupleDesc
   - include Manfred's suggested fix for the alter table regression
     test
   - (unrelated) add a comment to rewrite/rewriteDefine.c noting the
     intent of the code and known problems
   - (unrelated) merge my other patch for refactoring
     CreateTupleDescCopy() into this patch: I needed to modify this
     function in this patch, so I didn't want to deal with patching
     complexities.

-Neil

Attachments

Re: preliminary: logical column order

From
Andreas Pflug
Date:
I wonder if it wouldn't be easier to reorder the TupDesc->attrs[] array
according to an attphysid when filling the TupDesc structure, right
after a column was dropped/recreated (before any indexes/constraints are
recreated), so attnum remains, while storage changes.

Example:

before:
attnum   attphysid  attname  attisdropped
1        1          foo      f
2        2          bar      f

after drop/recreate col:
attnum   attphysid  attname  attisdropped
1        3          foo      f
2        2          bar      f
3        1          foo_del  t

resulting in an attrs array
attrs[0] describing physical col 3
attrs[1] describing physical col 2
attrs[2] describing physical col 1
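
One possible reading of this example, as a small self-contained sketch: `AttrRow`, `Table`, `table_add` and `drop_and_recreate` are hypothetical names, and the `_del` suffix simply mirrors the table above. The recreated column keeps its attnum and receives a fresh attphysid, while the old storage moves to a new dropped entry at the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_ATTS 16
#define NAME_LEN 32

/* Hypothetical catalog row; attphysid is the field proposed above. */
typedef struct {
    int  attnum;
    int  attphysid;
    char attname[NAME_LEN];
    bool attisdropped;
} AttrRow;

typedef struct {
    int     natts;
    AttrRow attrs[MAX_ATTS];    /* attrs[i] has attnum == i + 1 */
} Table;

/* Append a column; initially attphysid == attnum. */
static void table_add(Table *t, const char *name)
{
    assert(t->natts < MAX_ATTS);
    assert(strlen(name) < NAME_LEN);

    AttrRow *a = &t->attrs[t->natts];
    a->attnum = t->natts + 1;
    a->attphysid = t->natts + 1;
    snprintf(a->attname, NAME_LEN, "%s", name);
    a->attisdropped = false;
    t->natts++;
}

/* Drop and recreate the column with the given attnum. */
static void drop_and_recreate(Table *t, int attnum)
{
    assert(attnum >= 1 && attnum <= t->natts && t->natts < MAX_ATTS);

    AttrRow *live = &t->attrs[attnum - 1];
    AttrRow *dead = &t->attrs[t->natts];

    /* The old storage becomes a new, dropped entry at the end. */
    *dead = *live;
    dead->attnum = t->natts + 1;
    dead->attisdropped = true;
    /* Long names may be truncated here; acceptable for a sketch. */
    snprintf(dead->attname, NAME_LEN, "%s_del", live->attname);

    /* The recreated column keeps its attnum but gets new storage. */
    live->attphysid = t->natts + 1;
    t->natts++;
}
```

Starting from columns foo and bar, calling `drop_and_recreate(&t, 1)` yields the three rows of the "after" table, with attrs[0], attrs[1] and attrs[2] describing physical columns 3, 2 and 1.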



Regards,
Andreas