Обсуждение: TODO items (was Re: [COMMITTERS] pgsql/src/include/nodes (execnodes.h))

Поиск
Список
Период
Сортировка

TODO items (was Re: [COMMITTERS] pgsql/src/include/nodes (execnodes.h))

От
Tom Lane
Дата:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> Tom, I only see one entry in TODO for this:
>     * -Fix memory leak for aggregates?

>> -----------------------------  Log Message  -----------------------------
>> Modify nodeAgg.c so that no rows are returned for a GROUP BY
>> with no input rows, per pghackers discussions around 7/22/99.  Clean up
>> a bunch of ugly coding while at it; remove redundant re-lookup of
>> aggregate info at start of each new GROUP.  Arrange to pfree intermediate
>> values when they are pass-by-ref types, so that aggregates on pass-by-ref
>> types no longer eat memory.  This takes care of a couple of TODO items...

Hmm, you are right --- I thought that discussion about changing the
semantics of aggregates with GROUP BY had made it to the TODO list,
but apparently it never did.  It should have, however: * When using aggregates + GROUP BY, no rows in should yield no
rowsout
 


This motivated me to grovel through the TODO list, which I hadn't done
for a while, and I have some updates/comments.


PARSER
------

* Select a[1] FROM test fails, it needs test.a[1]

Fixed for 6.6 --- actually same thing as next item,
* -Array index references without table name cause problems [array]

* Update table SET table.value = 3 fails

AFAICS, the SQL92 syntax allows only a bare <column name> as the
target of a SET clause.  Not sure it's worth expending any effort
on this one...

ENHANCEMENTS
------------

COMMANDS

* Generate error on CREATE OPERATOR of ~~, ~ and and ~*

"Error" seems a little strong, maybe a "NOTICE" along the lines of
"We trust you know that ~~ defines the behavior of the LIKE keyword".

I believe the original motivation for this entry was that the parser
would do the wrong thing for arbitrary operators named ~~ etc, because
it would try to apply optimizations that were only suitable for the
standard ops of those names (textlike etc).  That's no longer a problem,
because those optimizations are now triggered off matching of the
operator OID; they will not cause a problem if Joe User invents an
operator named ~~ for his spiffy new datatype.  But perhaps Joe should
be reminded that he just made LIKE applicable to his datatype.  Or maybe
that's not worth worrying about...

* Move LIKE index optimization handling to the optimizer

This is basically done, although I have a couple of cleanup issues
to take care of.

CLIENTS

* PQrequestCancel() be able to terminate backend waiting for lock

There is an equivalent item under MISC, and it doesn't seem like it
belongs under CLIENTS --- the necessary code change is in the backend.

MISC

* Do autocommit so always in a transaction block(?)

Huh?  What is this supposed to mean?

PERFORMANCE
-----------

INDEXES

* Convert function(constant) into a constant for index use

Done as of now; see Frankpitt's constant-expression simplifier.
We might have some lingering bugs with simplifying things that
ought not be simplified, however...

* Allow SELECT * FROM tab WHERE int2col = 4 use int2col index, int8 too[optimizer]

I believe float4 columns have the same sort of problem, since a numeric
constant will be taken as float8 not float4 if not explicitly casted.
For that matter, numeric/decimal columns do too, or would if we had
indexing support for them...

* Allow optimizer to prefer plans that match ORDER BY

This is done, although we now have the opposite problem: the darn thing
is too eager to pick an indexscan plan :-(.  Need to make the cost
estimates for indexscan vs explicit sort more accurate.

MISC

* Update pg_statistic table to remove operator column

I do not believe we should do this.  It's true that right now we have
no use for the operator column, because only the default '<' ordering
will ever be used by VACUUM, but we should keep the column in the name
of datatype extensibility.  Someday VACUUM might compute stats with
respect to more than one ordering, for datatypes that have more than one.

* -Fix memory exhaustion when using many OR's [cnfify]

cnfify is still pretty slow with many subclauses --- the behavior
is now O(N^2) rather than O(2^N), but that just means it's 
slow rather than intolerable.  I'm not sure what to do about it.
We probably need to be using heuristics instead of an unconditional
convert-to-normal-form-or-bust algorithm, but what should the
heuristic conditions be?  Am thinking about it, could use suggestions.

* Process const = const parts of OR clause in separate pass

Done --- Frankpitt's const simplifier handles this.

* change VACUUM ANALYZE to use btree comparison functions, not <,=,> calls

Didn't we decide this probably wasn't worth doing?

SOURCE CODE
-----------

* Remove SET KSQO option if OR processing is improved

You can put my name on this one --- I'm not quite ready to pull KSQO
but I think we are close.
        regards, tom lane


Re: TODO items (was Re: [COMMITTERS] pgsql/src/include/nodes (execnodes.h))

От
Bruce Momjian
Дата:
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> > Tom, I only see one entry in TODO for this:
> >     * -Fix memory leak for aggregates?
> 
> >> -----------------------------  Log Message  -----------------------------
> >> Modify nodeAgg.c so that no rows are returned for a GROUP BY
> >> with no input rows, per pghackers discussions around 7/22/99.  Clean up
> >> a bunch of ugly coding while at it; remove redundant re-lookup of
> >> aggregate info at start of each new GROUP.  Arrange to pfree intermediate
> >> values when they are pass-by-ref types, so that aggregates on pass-by-ref
> >> types no longer eat memory.  This takes care of a couple of TODO items...
> 
> Hmm, you are right --- I thought that discussion about changing the
> semantics of aggregates with GROUP BY had made it to the TODO list,
> but apparently it never did.  It should have, however:
>   * When using aggregates + GROUP BY, no rows in should yield no rows out

Added to TODO, with a completion mark.

> This motivated me to grovel through the TODO list, which I hadn't done
> for a while, and I have some updates/comments.

Good.  It is a long list.

> PARSER
> ------
> 
> * Select a[1] FROM test fails, it needs test.a[1]
> 
> Fixed for 6.6 --- actually same thing as next item,
> * -Array index references without table name cause problems [array]

Done.

> 
> * Update table SET table.value = 3 fails
> 
> AFAICS, the SQL92 syntax allows only a bare <column name> as the
> target of a SET clause.  Not sure it's worth expending any effort
> on this one...

Marked now as:
* Update table SET table.value = 3 fails(SQL standard says this is OK)

> 
> ENHANCEMENTS
> ------------
> 
> COMMANDS
> 
> * Generate error on CREATE OPERATOR of ~~, ~ and and ~*
> 
> "Error" seems a little strong, maybe a "NOTICE" along the lines of
> "We trust you know that ~~ defines the behavior of the LIKE keyword".
> 
> I believe the original motivation for this entry was that the parser
> would do the wrong thing for arbitrary operators named ~~ etc, because
> it would try to apply optimizations that were only suitable for the
> standard ops of those names (textlike etc).  That's no longer a problem,
> because those optimizations are now triggered off matching of the
> operator OID; they will not cause a problem if Joe User invents an
> operator named ~~ for his spiffy new datatype.  But perhaps Joe should
> be reminded that he just made LIKE applicable to his datatype.  Or maybe
> that's not worth worrying about...

Removed.  You are correct that the message describes the old LIKE
optimization of user ~~ functions.  This item is removed.

> 
> * Move LIKE index optimization handling to the optimizer
> 
> This is basically done, although I have a couple of cleanup issues
> to take care of.

Marked as done.

> 
> CLIENTS
> 
> * PQrequestCancel() be able to terminate backend waiting for lock
> 
> There is an equivalent item under MISC, and it doesn't seem like it
> belongs under CLIENTS --- the necessary code change is in the backend.

Removed.  Already present, as you mentioned.

> 
> MISC
> 
> * Do autocommit so always in a transaction block(?)
> 
> Huh?  What is this supposed to mean?

Some people want the SQL session to start inside a transaction, and you
have to explicity use COMMIT, at which point you are in a new
transaction that lasts until the next commit.  Ingres SQL does this, and
it is a pain, I think.

> 
> PERFORMANCE
> -----------
> 
> INDEXES
> 
> * Convert function(constant) into a constant for index use
> 
> Done as of now; see Frankpitt's constant-expression simplifier.
> We might have some lingering bugs with simplifying things that
> ought not be simplified, however...

Marked as done.

> 
> * Allow SELECT * FROM tab WHERE int2col = 4 use int2col index, int8 too
>     [optimizer]
> 
> I believe float4 columns have the same sort of problem, since a numeric
> constant will be taken as float8 not float4 if not explicitly casted.
> For that matter, numeric/decimal columns do too, or would if we had
> indexing support for them...

Added new types to list.

> 
> * Allow optimizer to prefer plans that match ORDER BY
> 
> This is done, although we now have the opposite problem: the darn thing
> is too eager to pick an indexscan plan :-(.  Need to make the cost
> estimates for indexscan vs explicit sort more accurate.

That is amusing.  Marked as done.

> 
> MISC
> 
> * Update pg_statistic table to remove operator column
> 
> I do not believe we should do this.  It's true that right now we have
> no use for the operator column, because only the default '<' ordering
> will ever be used by VACUUM, but we should keep the column in the name
> of datatype extensibility.  Someday VACUUM might compute stats with
> respect to more than one ordering, for datatypes that have more than one.

Removed from the list.

> 
> * -Fix memory exhaustion when using many OR's [cnfify]
> 
> cnfify is still pretty slow with many subclauses --- the behavior
> is now O(N^2) rather than O(2^N), but that just means it's 
> slow rather than intolerable.  I'm not sure what to do about it.
> We probably need to be using heuristics instead of an unconditional
> convert-to-normal-form-or-bust algorithm, but what should the
> heuristic conditions be?  Am thinking about it, could use suggestions.

Marked as done.  Let's see if people complain.


> 
> * Process const = const parts of OR clause in separate pass
> 
> Done --- Frankpitt's const simplifier handles this.

Marked as done.

> 
> * change VACUUM ANALYZE to use btree comparison functions, not <,=,> calls
> 
> Didn't we decide this probably wasn't worth doing?
> 

Yes.  Removed.


> SOURCE CODE
> -----------
> 
> * Remove SET KSQO option if OR processing is improved
> 
> You can put my name on this one --- I'm not quite ready to pull KSQO
> but I think we are close.

Marked for you.  New TODO copy installed.


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Re: TODO items

От
Peter Eisentraut
Дата:
Just my 0.02 kronor . . .

On Sep 27, Bruce Momjian noted:

> > * Update table SET table.value = 3 fails
> > 
> > AFAICS, the SQL92 syntax allows only a bare <column name> as the
> > target of a SET clause.  Not sure it's worth expending any effort
> > on this one...
> 
> Marked now as:
> 
>     * Update table SET table.value = 3 fails(SQL standard says this is OK)

In my opinion this should definitely _not_ be allowed. Let's be glad the
UPDATE command is so conceptually simple (cf. SELECT). The next thing they
want is ALTER TABLE foo RENAME foo.colum [ TO bar.something ??? -- moving
columns between tables, why not :) ] and then CREATE TABLE foo (foo.a int,
...); and it won't stop :)


> > MISC
> > 
> > * Do autocommit so always in a transaction block(?)
> > 
> > Huh?  What is this supposed to mean?
> 
> Some people want the SQL session to start inside a transaction, and you
> have to explicity use COMMIT, at which point you are in a new
> transaction that lasts until the next commit.  Ingres SQL does this, and
> it is a pain, I think.

I have been wondering about this, too. Oracle does this as well. This is
also how they taught me SQL in university, so it is probably not out of
the blue. What do the standards say?

Then again, while I think that client programmers won't die if they type
an extra BEGIN here or there, this might be useful as a psql feature. Too
many times I've seen people type DELETE FROM <table>; by accident.

What do y'all think? (Besides the fact that this might be a pain to
implement.)


Peter

-- 
Peter Eisentraut - peter_e@gmx.net
http://yi.org/peter-e



Re: [HACKERS] Re: TODO items

От
Bruce Momjian
Дата:
> Just my 0.02 kronor . . .
>
> On Sep 27, Bruce Momjian noted:
>
> > > * Update table SET table.value = 3 fails
> > >
> > > AFAICS, the SQL92 syntax allows only a bare <column name> as the
> > > target of a SET clause.  Not sure it's worth expending any effort
> > > on this one...
> >
> > Marked now as:
> >
> >     * Update table SET table.value = 3 fails(SQL standard says this is OK)
>
> In my opinion this should definitely _not_ be allowed. Let's be glad the
> UPDATE command is so conceptually simple (cf. SELECT). The next thing they
> want is ALTER TABLE foo RENAME foo.colum [ TO bar.something ??? -- moving
> columns between tables, why not :) ] and then CREATE TABLE foo (foo.a int,
> ...); and it won't stop :)

OK, let's leave it in so people know it is not implemented.

> > Some people want the SQL session to start inside a transaction, and you
> > have to explicity use COMMIT, at which point you are in a new
> > transaction that lasts until the next commit.  Ingres SQL does this, and
> > it is a pain, I think.
>
> I have been wondering about this, too. Oracle does this as well. This is
> also how they taught me SQL in university, so it is probably not out of
> the blue. What do the standards say?
>
> Then again, while I think that client programmers won't die if they type
> an extra BEGIN here or there, this might be useful as a psql feature. Too
> many times I've seen people type DELETE FROM <table>; by accident.

No one has really been passionate about it either way.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026