Обсуждение: preliminary: logical column order
This patch provides a preliminary implementation of the logical column
ordering functionality recently proposed on -hackers. This patch has a
number of known issues, so please don't apply it.
Current functionality implemented:
- Added attpos to the system catalogs & bumped the catalog
version: initdb required
- COPY TO/FROM obey logical column ordering
- When expanding "SELECT * ...", we make sure to obey logical
column order (see below, however)
- If INSERT is given an empty column list, it will produce the
necessary implicit column list in sorted by attpos
- (unrelated) improve a bunch of comments in pg_class.h,
remove some dead code from pg_attribute.h, refactor some
code in access/common/printtup.c
Remaining work to do:
- Change the name of the pg_attribute attribute to something
other than 'attpos', per the discussion on -hackers. I don't
believe we've settled on the right name yet.
- The code for actually sorting the columns in attpos-order is
duplicated a few times -- this was just done for the sake of
convenience, I'm going to clean this up and stick it in a
single, shared location in the new patch.
- The INSERT change causes the alter_table regression tests to
fail (the regression.diff) is attached. This appears to be
the logical column ordering interacting with inheritance,
but I haven't yet taken an in-depth look at it.
- When processing a "SELECT *", for example, the actual data
columns are returned in the right order, but the
RowDescription messages sent by libpq are not (i.e. they are
sent in attnum-order, not attpos).
In SendRowDescriptionMessage() and related printtup code, I
took a look at trying to sort the attributes we're about to
send RowDescription messages for by their 'attpos', but the
TupleDesc that we have for the return type doesn't included
a filled-in attpos.
I spent a while trying to see how feasible it would be to
fill in the result-type-TupleDesc with an attpos where
available, but couldn't figure out an easy way to do
this. Are there any suggestions on how this should be best
done?
- I haven't yet implemented ordering-by-attpos for the output
columns of a join, or the alias->column matching in the FROM
alias list (per Tom's email to -hackers). I saw a couple
places where I might be able to do this, but I'm quite sure
what the correct spot is. Tom, do you have any suggestions?
Any comments would be gratefully appreciated.
-Neil
Вложения
Neil Conway <neilc@samurai.com> writes:
> - The code for actually sorting the columns in attpos-order is
> duplicated a few times -- this was just done for the sake of
> convenience, I'm going to clean this up and stick it in a
> single, shared location in the new patch.
Bruce and I were chatting about that on the phone today. I think it
might be useful for TupleDescs to doubly index their contained attribute
rows --- that is, keep the existing array-indexed-by-attnum, but add
another pointer array indexed by attpos, containing only nondeleted
columns. This would be easy to build, and it'd eliminate
searching/sorting for places that had access to a TupleDesc.
> - When processing a "SELECT *", for example, the actual data
> columns are returned in the right order, but the
> RowDescription messages sent by libpq are not (i.e. they are
> sent in attnum-order, not attpos).
Easy to fix given above proposal ... although actually I am not sure why
this would occur. printtup and friends should always get a constructed
TupDesc that has no notion of deleted or renumbered columns. This may
be a symptom of a more fundamental error somewhere.
regards, tom lane
On Fri, 21 Nov 2003 03:09:02 -0500, Neil Conway <neilc@samurai.com> wrote:
>attachment; filename=weird_regression.diffs
This was caused by a small oversight in ALTER TABLE ... ADD COLUMN:
diff -ruN ../base/src/backend/commands/tablecmds.c src/backend/commands/tablecmds.c
--- ../base/src/backend/commands/tablecmds.c 2003-10-14 00:47:15.000000000 +0200
+++ src/backend/commands/tablecmds.c 2003-11-23 16:51:37.000000000 +0100
@@ -1786,6 +1786,7 @@
attribute->attcacheoff = -1;
attribute->atttypmod = colDef->typename->typmod;
attribute->attnum = i;
+ attribute->attpos = i;
attribute->attbyval = tform->typbyval;
attribute->attndims = attndims;
attribute->attisset = (bool) (tform->typtype == 'c');
Servus
Manfred
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Bruce and I were chatting about that on the phone today. I think it
> might be useful for TupleDescs to doubly index their contained
> attribute rows
Ok, I implemented this. I made it so that the properly sorted
attribute array is constructed lazily -- my reasoning was that (a)
relatively few locations in the code actually use the sorted attribute
array, so there is no point allocating and initializing it every time
a TupleDesc is used (b) the array could potentially be 1,500 pointers
long -- not huge, but not tiny either, so it is worth a little effort
to avoid unnecessarily alloc'ing and sorting it. Access to the sorted
array is (only) done via a new function, GetAttrByLogicalPosition()
> Easy to fix given above proposal ... although actually I am not sure
> why this would occur. printtup and friends should always get a
> constructed TupDesc that has no notion of deleted or renumbered
> columns.
It seems to happen because:
- SendRowDescription() in printtup.c notes that "the TupleDesc has
been manufactured by ExecTypeFromTL() or some similar function;
it does not contain a full set of fields."
- ExecTypeFromTL() allocates an empty TupleDesc, and then fills in
its individual entries with data from a TargetList -- the
Form_pg_attribute for the attributes involved is never consulted,
so the default attlogpos inserted by TupleDescInitEntry() is
used: attlogpos == attnum
I'm unsure of the best way to fix this so that the TupleDesc that is
handed to printtup & friends contains the information we require. Any
suggestions?
A new version of the patch is attached. Changes:
- replace sorting code with a lazily-constructed sorted array of
pointers to attribute data in TupleDesc
- include Manfred's suggested fix for the alter table regression
test
- (unrelated) add a comment to rewrite/rewriteDefine.c noting the
intent of the code and known problems
- (unrelated) merge my other patch for refactoring
CreateTupleDescCopy() into this patch: I needed to modify this
function in this patch, so I didn't want to deal with patching
complexities.
-Neil
Вложения
I wonder if it wouldn't be easier to reorder the TupDesc->attrs[] array according to an attphysid when filling the TupDesc structure, right after a column was dropped/recreated (before any indexes/constraints are recreated), so attnum remains, while storage changes. Example: before: attnum attphysid attname attisdropped 1 1 foo f 2 2 bar f after drop/recreate col: 1 3 foo f 2 2 bar f 3 1 foo_del t resulting in an attrs array attrs[0] describing physical col 3 attrs[1] describing physical col 2 attrs[2] describing physical col 1 Regards, Andreas