Обсуждение: Status of 7.2

Поиск
Список
Период
Сортировка

Status of 7.2

От
Bruce Momjian
Дата:
OK, 7.2 is looking _very_ good.  We have very few open items.  They are:
Source Code Changes-------------------Compile in syslog feature by default? (Peter, Tom)AIX compile (Tatsuo)Libpq++
compileon Solaris (Peter)Documentation Changes---------------------
 

The always-updated list is at:
ftp://candle.pha.pa.us/pub/postgresql/open_items.

I also have created a post-7.2 list of items that are either patches
that need to be applied or discussed for 7.3.  That is at:
http://candle.pha.pa.us/cgi-bin/pgpatches2

This list is longer than usual.  Seems we have quite a number of things
in-progress that can be worked on as soon as 7.2 is complete.  If there
are things there than can be decided now, please dig in and send an
email to the hackers list.

Once we start 7.3, I will use that list to request patches to complete
these items.  Because we are done development on 7.2, people can start
working on patches now.  If you send them to the lists, I will load them
up on the page and apply them as soon as 7.3 starts. 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Status of 7.2

От
"Christopher Kings-Lynne"
Дата:
Thought you might like to know that I should be able to upload regression
test reports for:

IRIX 6.5
FreeBSD 4.4 on Intel
FreeBSD 4.4 on Alpha
VMS on Alpha

For 7.2b2 when it's available.  Is Postgres supported on all these
platforms?

Chris

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Bruce Momjian
Sent: Wednesday, 7 November 2001 11:45 AM
To: PostgreSQL-development
Subject: [HACKERS] Status of 7.2


OK, 7.2 is looking _very_ good.  We have very few open items.  They are:
Source Code Changes-------------------Compile in syslog feature by default? (Peter, Tom)AIX compile (Tatsuo)Libpq++
compileon Solaris (Peter)
 
Documentation Changes---------------------

The always-updated list is at:
ftp://candle.pha.pa.us/pub/postgresql/open_items.

I also have created a post-7.2 list of items that are either patches
that need to be applied or discussed for 7.3.  That is at:
http://candle.pha.pa.us/cgi-bin/pgpatches2

This list is longer than usual.  Seems we have quite a number of things
in-progress that can be worked on as soon as 7.2 is complete.  If there
are things there than can be decided now, please dig in and send an
email to the hackers list.

Once we start 7.3, I will use that list to request patches to complete
these items.  Because we are done development on 7.2, people can start
working on patches now.  If you send them to the lists, I will load them
up on the page and apply them as soon as 7.3 starts.

-- Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org



Re: Status of 7.2

От
"Marc G. Fournier"
Дата:
I'll be announcing v7.2b2 tomorrow afternoon ... its packaged and ready to
go, ifyou want to get a head start (ftp.postgresql.org), but am giving a
bit of time for mirrors to catch up ...


On Wed, 7 Nov 2001, Christopher Kings-Lynne wrote:

> Thought you might like to know that I should be able to upload regression
> test reports for:
>
> IRIX 6.5
> FreeBSD 4.4 on Intel
> FreeBSD 4.4 on Alpha
> VMS on Alpha
>
> For 7.2b2 when it's available.  Is Postgres supported on all these
> platforms?
>
> Chris
>
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Bruce Momjian
> Sent: Wednesday, 7 November 2001 11:45 AM
> To: PostgreSQL-development
> Subject: [HACKERS] Status of 7.2
>
>
> OK, 7.2 is looking _very_ good.  We have very few open items.  They are:
>
>     Source Code Changes
>     -------------------
>     Compile in syslog feature by default? (Peter, Tom)
>     AIX compile (Tatsuo)
>     Libpq++ compile on Solaris (Peter)
>
>     Documentation Changes
>     ---------------------
>
> The always-updated list is at:
>
>     ftp://candle.pha.pa.us/pub/postgresql/open_items.
>
> I also have created a post-7.2 list of items that are either patches
> that need to be applied or discussed for 7.3.  That is at:
>
>     http://candle.pha.pa.us/cgi-bin/pgpatches2
>
> This list is longer than usual.  Seems we have quite a number of things
> in-progress that can be worked on as soon as 7.2 is complete.  If there
> are things there than can be decided now, please dig in and send an
> email to the hackers list.
>
> Once we start 7.3, I will use that list to request patches to complete
> these items.  Because we are done development on 7.2, people can start
> working on patches now.  If you send them to the lists, I will load them
> up on the page and apply them as soon as 7.3 starts.
>
> --
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>



Re: Status of 7.2

От
Turbo Fredriksson
Дата:
Is there any list of changes between 7.1.3 and 7.2b2 available?

-- Turbo     __ _     Debian GNU     Unix _IS_ user friendly - it's just ^^^^^    / /(_)_ __  _   ___  __  selective
aboutwho its friends are         / / | | '_ \| | | \ \/ /   Debian Certified Linux Developer   _ /// / /__| | | | | |_|
|> <  Turbo Fredriksson   turbo@tripnet.se \\\/  \____/_|_| |_|\__,_/_/\_\ Stockholm/Sweden
 

security counter-intelligence [Hello to all my fans in domestic
surveillance] Soviet Legion of Doom South Africa SEAL Team 6 subway
iodine $400 million in gold bullion Ft. Meade Delta Force killed
attack Waco, Texas
[See http://www.aclu.org/echelonwatch/index.html for more about this]


Re: Status of 7.2

От
"Tille, Andreas"
Дата:
On Tue, 6 Nov 2001, Bruce Momjian wrote:

> I also have created a post-7.2 list of items that are either patches
> that need to be applied or discussed for 7.3.  That is at:
>
>     http://candle.pha.pa.us/cgi-bin/pgpatches2
>
> This list is longer than usual.  Seems we have quite a number of things
> in-progress that can be worked on as soon as 7.2 is complete.  If there
> are things there than can be decided now, please dig in and send an
> email to the hackers list.
>
> Once we start 7.3, I will use that list to request patches to complete
> these items.  Because we are done development on 7.2, people can start
> working on patches now.  If you send them to the lists, I will load them
> up on the page and apply them as soon as 7.3 starts.
Sorry, I´m really unable to send patches but I have a feature request
which was addressed in the thread "Serious performance problem" on this
list.  It mainly concerns the performance increase if there would be
an index scan method which doesn´t have to check the validity of data
in the table.  I´m just waiting for a statement from you guys if you
think it will be doable in 7.3 (while now started to optimize my
database as you suggested ;-).)  I think this would increase acceptance
of PostgreSQL for certain people here in Germany which have real influence
on decisions about database in medical diagnostics and care in Germany.

Kind regards
        Andreas.


Re: Status of 7.2

От
Christof Petig
Дата:
Bruce Momjian wrote:

> OK, 7.2 is looking _very_ good.  We have very few open items.  They are:
>
>         Source Code Changes
>         -------------------
>         Compile in syslog feature by default? (Peter, Tom)
>         AIX compile (Tatsuo)
>         Libpq++ compile on Solaris (Peter)
>
>         Documentation Changes
>         ---------------------
>
> The always-updated list is at:
>
>         ftp://candle.pha.pa.us/pub/postgresql/open_items.
>
> I also have created a post-7.2 list of items that are either patches
> that need to be applied or discussed for 7.3.  That is at:
>
>         http://candle.pha.pa.us/cgi-bin/pgpatches2
>
> This list is longer than usual.  Seems we have quite a number of things
> in-progress that can be worked on as soon as 7.2 is complete.  If there
> are things there than can be decided now, please dig in and send an
> email to the hackers list.

I would suggest to schedule my patch (the last on the list) for 7.2 since it
finishes the work I began for 7.2.
Since some patches (part of the work/redesign) are in but the last two are
yet unapplied (IIRC Michael is really busy at the moment), I'd vote for not
leaving this work half-done.

Christof




Re: Status of 7.2

От
Bruce Momjian
Дата:
> Is there any list of changes between 7.1.3 and 7.2b2 available?
> 

Sure see /HISTORY in the source tarball.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Status of 7.2

От
Bruce Momjian
Дата:
> > This list is longer than usual.  Seems we have quite a number of things
> > in-progress that can be worked on as soon as 7.2 is complete.  If there
> > are things there than can be decided now, please dig in and send an
> > email to the hackers list.
> 
> I would suggest to schedule my patch (the last on the list) for 7.2 since it
> finishes the work I began for 7.2.
> Since some patches (part of the work/redesign) are in but the last two are
> yet unapplied (IIRC Michael is really busy at the moment), I'd vote for not
> leaving this work half-done.

OK, this is for ecpg.  If you can get an OK from Michael, I will be glad
to apply them.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Status of 7.2

От
Tom Lane
Дата:
>> I would suggest to schedule my patch (the last on the list) for 7.2 since it
>> finishes the work I began for 7.2.
>> Since some patches (part of the work/redesign) are in but the last two are
>> yet unapplied (IIRC Michael is really busy at the moment), I'd vote for not
>> leaving this work half-done.

> OK, this is for ecpg.  If you can get an OK from Michael, I will be glad
> to apply them.

More to the point, I don't think it's core's business to overrule
Michael's technical decisions about ecpg.  If he thinks the patch
is okay, but hasn't time to apply it, then we can do that for him.
But we won't apply it without his review and okay.
        regards, tom lane


Further open item (Was: Status of 7.2)

От
"Tille, Andreas"
Дата:
On Wed, 7 Nov 2001, Tille, Andreas wrote:

> Sorry, I´m really unable to send patches but I have a feature request
> which was addressed in the thread "Serious performance problem" on this
> list.  It mainly concerns the performance increase if there would be
> an index scan method which doesn´t have to check the validity of data
> in the table.  I´m just waiting for a statement from you guys if you
> think it will be doable in 7.3 (while now started to optimize my
> database as you suggested ;-).)  I think this would increase acceptance
> of PostgreSQL for certain people here in Germany which have real influence
> on decisions about database in medical diagnostics and care in Germany.
Is it possible that hackers do any statement according this issue.
I want to repeat the problem.  It´s hard to argue for PostgreSQL (and
I would really like to advocate for PostgreSQL) against MS SQL if we
talk about an imaginary possible dataloss if my colleague has not ever
faced dataloss and certainly know that other power users of MS SQL are
using it.  It´s much more hard to argue if there are cases in which
MS SQL outperforms PostgreSQL in the order of magnitude.  It´s hard
to convince somebody if I tell him that the reason is his bad database
design.  He really isn´t sooo bad and he claims that MS SQL has transparent
transaction *and* fast index usage.  Don´t ask me how they do this.
I repeat that my colleague is in the position to decide about software
usage of several medicine related projects in Germany.

I just want to know now if this is an issue for PostgreSQL hackers:
  [ ] yes  [ ] no  [ ] we are discussing about that

In case of "no" I would be happy if you could provide me with some
technical reasons which could help me arguing.

Kind regards
       Andreas.


Re: Further open item (Was: Status of 7.2)

От
Thomas Lockhart
Дата:
> I just want to know now if this is an issue for PostgreSQL hackers:
>    [X] yes
>    [X] no
>    [X] we are discussing about that
> In case of "no" I would be happy if you could provide me with some
> technical reasons which could help me arguing.

The hacker community has a wide range of interests.

From my POV, the overall performance of PostgreSQL is more than
competitive with other database products, including M$SQL. There is not
much point in arguing a specific query case, though we are happy to talk
about specific overall applications and to offer suggestions on how to
build databases that are generally well designed and that will perform
well on more than one product.

If you have a colleague who firmly believes that M$SQL is the best
solution, it sounds like he is not listening to all of the facts. That
certainly can be frustrating, eh? Maybe after a few more years of
crashed machines and increasing costs he will be more open to
alternatives ;)
                     - Thomas


Re: Further open item (Was: Status of 7.2)

От
Stephan Szabo
Дата:
> I just want to know now if this is an issue for PostgreSQL hackers:
>
>    [ ] yes
>    [ ] no
>    [ ] we are discussing about that
>
> In case of "no" I would be happy if you could provide me with some
> technical reasons which could help me arguing.

My guess is that its likely to get discussed again when 7.3 development
starts if someone brings it up.  I think right now alot of discussion
is towards the 7.2betas and bugs and stuff that might possibly get put off
that was already talked about earlier in this cycle.



Re: Further open item (Was: Status of 7.2)

От
Peter Eisentraut
Дата:
Tille, Andreas writes:

> > Sorry, I´m really unable to send patches but I have a feature request
> > which was addressed in the thread "Serious performance problem" on this
> > list.  It mainly concerns the performance increase if there would be
> > an index scan method which doesn´t have to check the validity of data
> > in the table.

> I just want to know now if this is an issue for PostgreSQL hackers:
>
>    [ ] yes
>    [ ] no
>    [ ] we are discussing about that

We are always willing to discuss changes that improve performance,
reliability, standards compliance, etc.  However, "MS SQL does it, and MS
SQL is fast" is not sufficient proof that a feature would improve average
performance in PostgreSQL.  This issue has been brought up with similarly
unsatisfactory arguments in the past, so you should be able to find out
about the discussion in the archives.  Some of the arguments against this
change were bigger indexes, slower write operations, non-existent proof
that it's really faster, putting the index on a different disk will mostly
obsolete the issue.  Consequently, this is currently not something that
has got a chance to be implemented anytime soon.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Further open item (Was: Status of 7.2)

От
Bruce Momjian
Дата:
> We are always willing to discuss changes that improve performance,
> reliability, standards compliance, etc.  However, "MS SQL does it, and MS
> SQL is fast" is not sufficient proof that a feature would improve average
> performance in PostgreSQL.  This issue has been brought up with similarly
> unsatisfactory arguments in the past, so you should be able to find out
> about the discussion in the archives.  Some of the arguments against this
> change were bigger indexes, slower write operations, non-existent proof
> that it's really faster, putting the index on a different disk will mostly
> obsolete the issue.  Consequently, this is currently not something that
> has got a chance to be implemented anytime soon.

I personally would like to have index scans that look up heap rows
record the heap expired status into the index entry via one bit of
storage.  This will not _prevent_ checking the heap but it will prevent
heap lookups for index entries that have been exipred for a long time. 
However, with the new vacuum, and perhaps autovacuum coming soon, may be
little need for this optimization.

The underlying problem the user is seeing is how to _know_ an index
tuple is valid without checking the heap, and I don't see how to do that
unless we start storing the transaction id in the index tuple, and that
requires extra storage.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Further open item (Was: Status of 7.2)

От
"Tille, Andreas"
Дата:
On Fri, 16 Nov 2001, Thomas Lockhart wrote:

> The hacker community has a wide range of interests.
For sure, but there will be a raodmap with general consensus of the
hackers.

> From my POV, the overall performance of PostgreSQL is more than
> competitive with other database products, including M$SQL.
I never doubt you point of view, but it hardly counts as an
argument for my current problem.  There is a technical reason
why MS SQL server is faster here and they claim to do it safely.
(Well personally I do not give a cent for thigs that MS claims
about but this does not help here.)

> There is not much point in arguing a specific query case,
It is no specific query case.  It is the speed of an index scan which
goes like N if you do it with PostgreSQL and it goes like log N if
you do not have to look back into the table like MS SQL server does.

> though we are happy to talk
> about specific overall applications and to offer suggestions on how to
> build databases that are generally well designed and that will perform
> well on more than one product.
I doubt that you could care about any database designer who does
poor database design and just does a straigtforeward index scan.
If you think that PostgreSQL is only targeted to high professional
database designers which know how to avoid index scans I doubt that
PostgreSQL will get the user base it would deserve.
I could imagine several cases like my colleague who might think about
porting their application and get into the trap as me that the first
simple question they try performs that badly.  I really want to say
that we should address this issue in the documentation.  If there
exists such cases we should make it clear *why* PostgreSQL fails
this performance test (and perhaps include your text in your mail
as a base of this documentation).  If we ignore that we will not
attrakt users.

> If you have a colleague who firmly believes that M$SQL is the best
> solution, it sounds like he is not listening to all of the facts.
He is a little bit MS centric but in principle knows the advantage
of OpenSource.  On the other hand he is led by pragmatism and just
asks:  Which software gives the solution quickly.  And he found his
answer.
On the other hand we should also listen to things he presents as
"facts" ...

> That certainly can be frustrating, eh?
Yes.

> Maybe after a few more years of
> crashed machines and increasing costs he will be more open to
> alternatives ;)
This does not help currently.
I repeat:  We should at least upgrade PostgreSQL documentation to address
those issues.

Kind regards
        Andreas.

PS: I prefer not to be CCed if I do not explicite ask for this service.   It seems to be common habit on PostgreSQL
liststo CC users.  Does   this make any sense?  On many other lists such bahaviour is banned.
 


Re: Further open item (Was: Status of 7.2)

От
"Tille, Andreas"
Дата:
On Fri, 16 Nov 2001, Bruce Momjian wrote:

> I personally would like to have index scans that look up heap rows
> record the heap expired status into the index entry via one bit of
> storage.  This will not _prevent_ checking the heap but it will prevent
> heap lookups for index entries that have been exipred for a long time.
> However, with the new vacuum, and perhaps autovacuum coming soon, may be
> little need for this optimization.
>
> The underlying problem the user is seeing is how to _know_ an index
> tuple is valid without checking the heap, and I don't see how to do that
> unless we start storing the transaction id in the index tuple, and that
> requires extra storage.
For my special case I think doubling main memory is about the same
price as a MS SQL server license.  I can´t say which further problems
might occure.

Kind regards
       Andreas.


Re: Further open item (Was: Status of 7.2)

От
"Tille, Andreas"
Дата:
On Fri, 16 Nov 2001, Peter Eisentraut wrote:

> We are always willing to discuss changes that improve performance,
> reliability, standards compliance, etc.  However, "MS SQL does it, and MS
> SQL is fast" is not sufficient proof that a feature would improve average
> performance in PostgreSQL.  This issue has been brought up with similarly
> unsatisfactory arguments in the past, so you should be able to find out
> about the discussion in the archives.
Sorry, I do not see any favour for PostgreSQL if we want people who
consider switching to PostgreSQL to search the archive for useful information.
Just stating the issues and principles clearly could convince people.
If not PostgreSQL is faster removed from the list of available
alternatives of database servers than a web browser is fired up.

Kind regards
       Andreas.


Re: Further open item (Was: Status of 7.2)

От
Hannu Krosing
Дата:

Tille, Andreas wrote:

>On Fri, 16 Nov 2001, Bruce Momjian wrote:
>
>>I personally would like to have index scans that look up heap rows
>>record the heap expired status into the index entry via one bit of
>>storage.  This will not _prevent_ checking the heap but it will prevent
>>heap lookups for index entries that have been exipred for a long time.
>>However, with the new vacuum, and perhaps autovacuum coming soon, may be
>>little need for this optimization.
>>
>>The underlying problem the user is seeing is how to _know_ an index
>>tuple is valid without checking the heap,
>>
I'd propose a memory-only (or heavily cached) structure of tuple death 
transaction
ids for all transactions since the oldest live trx. And when that oldest 
finishes then
the tombstone marks for all tuples deleted between that and the new 
oldest are
moved to relevant indexes (or the index keys are deleted) by concurrent 
vacuum
or similar process.

We could even try to use the index itself as that structure by favoring 
changed index pages
when making caching decisions. It is much safer to cache indexes than it 
is to cache data
pages as for indexes we only need to detect (by keeping info in WAL for 
example) that it
is broken and not what it contained as it can always be rebuilt after 
computer crash.

The problem with using an ndex for this is _which_ index to use when 
there are many per table.
Perhaps a good choice would be the PRIMARY KEY.

OTOH, keeping this info in index and not in a dedicated structure makes 
the amount of
data needing to be cached well bigger and thus the whole operation more 
expensive.

>> and I don't see how to do that
>>unless we start storing the transaction id in the index tuple, and that
>>requires extra storage.
>>
>For my special case I think doubling main memory is about the same
>price as a MS SQL server license.  I can´t say which further problems
>might occure.
>
Then you must have really huge amounts of memory already ;)

------------------
Hannu




Re: Further open item (Was: Status of 7.2)

От
Tom Lane
Дата:
Hannu Krosing <hannu@tm.ee> writes:
> I'd propose a memory-only (or heavily cached) structure of tuple death 
> transaction
> ids for all transactions since the oldest live trx.

Seems like just a special-purpose reimplementation of disk pages sitting
in shared buffers.  If you've got the memory to keep track of tuples
you've killed recently, then you've probably got the memory to hold the
pages they're in, so a redundant separate caching structure is not
obviously a win.

The possible win of marking index entries dead (once their tuple is
known dead for all transactions) is that it saves visiting disk pages
that have *not* been visited recently, and thus that aren't likely to
be hanging around in buffers.

OTOH there are a lot of potential problems, most notably that
is-the-tuple-dead-for-ALL-transactions is not the normal tuple time
qual check, and so it'd represent extra overhead in indexscans.
I'm also concerned about how to do it without introducing lots of
ugly interactions (maybe even deadlocks) between the index access
methods and the heap access code.

If concurrent vacuuming turns out to be cheap enough, just running
vacuum frequently might be a better answer than trying to push the
maintenance work into the main line of execution.
        regards, tom lane


Re: Further open item (Was: Status of 7.2)

От
Stephan Szabo
Дата:
On Mon, 19 Nov 2001, Tille, Andreas wrote:

> On Fri, 16 Nov 2001, Thomas Lockhart wrote:

> > There is not much point in arguing a specific query case,
> It is no specific query case.  It is the speed of an index scan which
> goes like N if you do it with PostgreSQL and it goes like log N if
> you do not have to look back into the table like MS SQL server does.

But it is in some way.  It's dependant on the number of rows returned
by the query.  For a small enough number of rows returned, having the
additional information in the index could very well make the query
slower even if it avoids the reads from the heap. Keeping the information
in some other fashion where it doesn't directly do that may alleviate
that, but it's not a straightforward one is better than the other in
all cases.  It's not like postgres never uses an index on a large
table, it's just that after a certain amount of expected returns it
switches over.





Re: Further open item (Was: Status of 7.2)

От
Bruce Momjian
Дата:
> > There is not much point in arguing a specific query case,
> It is no specific query case.  It is the speed of an index scan which
> goes like N if you do it with PostgreSQL and it goes like log N if
> you do not have to look back into the table like MS SQL server does.

Have you tried using CLUSTER to match the heap order with the index
order.  That should help with index scans looking up heap rows.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Further open item (Was: Status of 7.2)

От
Hannu Krosing
Дата:

Tom Lane wrote:

>Hannu Krosing <hannu@tm.ee> writes:
>
>>I'd propose a memory-only (or heavily cached) structure of tuple death 
>>transaction
>>ids for all transactions since the oldest live trx.
>>
>
>Seems like just a special-purpose reimplementation of disk pages sitting
>in shared buffers.  If you've got the memory to keep track of tuples
>you've killed recently, then you've probably got the memory to hold the
>pages they're in, so a redundant separate caching structure is not
>obviously a win.
>
I suspect that for even the border case of a table containing just 1 
CHAR(1) field the
above structure will be a lot smaller than the page cache for the same 
tuples.

>The possible win of marking index entries dead (once their tuple is
>known dead for all transactions) is that it saves visiting disk pages
>that have *not* been visited recently, and thus that aren't likely to
>be hanging around in buffers
>
>
>
>OTOH there are a lot of potential problems, most notably that
>is-the-tuple-dead-for-ALL-transactions is not the normal tuple time
>qual check, and so it'd represent extra overhead in indexscans.
>I'm also concerned about how to do it without introducing lots of
>ugly interactions (maybe even deadlocks) between the index access
>methods and the heap access code.
>
>If concurrent vacuuming turns out to be cheap enough, just running
>vacuum frequently might be a better answer than trying to push the
>maintenance work into the main line of execution.
>
What I proposed would have been mostly the job of concurrent vacuum
(marking/removing dead index tuples)

Perhaps it would be an overall win for fast changing (vs. fast-growing) 
databases if
we kept the tuple metainfo (attnum < 0) on separate (cache) pages, as it 
saves writes of
tmax updates on both UPDATE and DELETE.

If we kept them in a separate table as well that could make the metainfo 
"table"
essentially a kind of index. That table/index could of course be 
concealed inside
the main table by using typed data pages.

---------------
Hannu



Re: Further open item (Was: Status of 7.2)

От
"Tille, Andreas"
Дата:
On Mon, 19 Nov 2001, Bruce Momjian wrote:

> > > There is not much point in arguing a specific query case,
> > It is no specific query case.  It is the speed of an index scan which
> > goes like N if you do it with PostgreSQL and it goes like log N if
> > you do not have to look back into the table like MS SQL server does.
>
> Have you tried using CLUSTER to match the heap order with the index
> order.  That should help with index scans looking up heap rows.
Yes, I´ve tried even that and it increase PostgreSQLs performance a little
bit for this special query but it did not get nearly the speed of the
same query on the MS SQL server.  Moreover there are tables with more than
one index and I guess it makes only sense to cluster one index per table.

Kind regards
         Andreas.


Re: Further open item (Was: Status of 7.2)

От
Bruce Momjian
Дата:
> On Mon, 19 Nov 2001, Bruce Momjian wrote:
> 
> > > > There is not much point in arguing a specific query case,
> > > It is no specific query case.  It is the speed of an index scan which
> > > goes like N if you do it with PostgreSQL and it goes like log N if
> > > you do not have to look back into the table like MS SQL server does.
> >
> > Have you tried using CLUSTER to match the heap order with the index
> > order.  That should help with index scans looking up heap rows.

> Yes, I?ve tried even that and it increase PostgreSQLs performance a little
> bit for this special query but it did not get nearly the speed of the
> same query on the MS SQL server.  Moreover there are tables with more than
> one index and I guess it makes only sense to cluster one index per table.

Yes, CLUSTER only matches one index.

Something I just realized, that other probably figured out, is that
while we have plans to backfill expired tuple status into the index
tuples, it is not easy to backfill enough information to know a tuple is
valid.

Setting aside the problem of different tuple visibilities for different
backends, one problem is that when we go to expire a tuple, we would
have to update all the index tuples that point to the heap tuple.  That
is an expensive operation because you have to use the keys in the heap
to find the index.

So, while we do have plans to mark some index tuples so we _know_ they
are expired, we don't know how to efficiently mark index tuples so we
_know_ they are valid.

This is what I believe you want, where we can scan the index without
checking the heap at all.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Further open item (Was: Status of 7.2)

От
"Tille, Andreas"
Дата:
On Tue, 20 Nov 2001, Bruce Momjian wrote:

> So, while we do have plans to mark some index tuples so we _know_ they
> are expired, we don't know how to efficiently mark index tuples so we
> _know_ they are valid.
>
> This is what I believe you want, where we can scan the index without
> checking the heap at all.
An new index type (say READONLY INDEX or some reasonable name) which is
valid all the time between two vacuum processes would suffice for my
application.  It would fit the needs of people who do a daily database
update and vacuum after this.

Of course it´s your descision if this makes sense and fits PostgreSQL
philosophy, but I think it would speed up some kind of applications.

Kind regards
        Andreas.


Re: Further open item (Was: Status of 7.2)

От
Hannu Krosing
Дата:

Tille, Andreas wrote:

>On Tue, 20 Nov 2001, Bruce Momjian wrote:
>
>>So, while we do have plans to mark some index tuples so we _know_ they
>>are expired, we don't know how to efficiently mark index tuples so we
>>_know_ they are valid.
>>
>>This is what I believe you want, where we can scan the index without
>>checking the heap at all.
>>
>An new index type (say READONLY INDEX or some reasonable name) which is
>valid all the time between two vacuum processes would suffice for my
>application.  It would fit the needs of people who do a daily database
>update and vacuum after this.
>
Or perhaps MAINTAINED INDEX, meaning that it has always both tmin and tmax
up-to-date.
Btw 7.2 still has broken behaviour of xmax which by definition should 
not have a
non-0 value for live tuples

pg72b2=# create table parent(pid int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 
'parent_pkey' for table 'parent'
CREATE
pg72b2=# create table child(cid int, pid int references parent);
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY 
check(s)
CREATE
pg72b2=# insert into parent values(1);
INSERT 16809 1
pg72b2=# insert into child values(1,1);
INSERT 16810 1
pg72b2=# update child set pid=2;
ERROR:  <unnamed> referential integrity violation - key referenced from 
child not found in parent
pg72b2=# select xmin,xmax,* from child;xmin | xmax | cid | pid
------+------+-----+----- 171 |  172 |   1 |   1
(1 row)

pg72b2=#

>
>
>Of course it´s your descision if this makes sense and fits PostgreSQL
>philosophy, but I think it would speed up some kind of applications.
>
>Kind regards
>
>         Andreas.
>
>---------------------------(end of broadcast)---------------------------
>TIP 3: if posting/reading through Usenet, please send an appropriate
>subscribe-nomail command to majordomo@postgresql.org so that your
>message can get through to the mailing list cleanly
>




Re: Further open item (Was: Status of 7.2)

От
Bruce Momjian
Дата:
Huh, a non-zero XMAX is fine.  You mark the XMAX when you _think_ you
are updating it.  It is only expired when the XMAX on the tuple is
committed.

> Or perhaps MAINTAINED INDEX, meaning that it has always both tmin and tmax
> up-to-date.
> Btw 7.2 still has broken behaviour of xmax which by definition should 
> not have a
> non-0 value for live tuples
> 
> pg72b2=# create table parent(pid int primary key);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 
> 'parent_pkey' for table 'parent'
> CREATE
> pg72b2=# create table child(cid int, pid int references parent);
> NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY 
> check(s)
> CREATE
> pg72b2=# insert into parent values(1);
> INSERT 16809 1
> pg72b2=# insert into child values(1,1);
> INSERT 16810 1
> pg72b2=# update child set pid=2;
> ERROR:  <unnamed> referential integrity violation - key referenced from 
> child not found in parent
> pg72b2=# select xmin,xmax,* from child;
>  xmin | xmax | cid | pid
> ------+------+-----+-----
>   171 |  172 |   1 |   1
> (1 row)

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Further open item (Was: Status of 7.2)

От
Hannu Krosing
Дата:
Bruce Momjian wrote:
> 
> Huh, a non-zero XMAX is fine.  You mark the XMAX when you _think_ you
> are updating it.  It is only expired when the XMAX on the tuple is
> committed.

But

http://www.postgresql.org/idocs/index.php?sql-syntax-columns.html

claims:

xmax    The identity (transaction ID) of the deleting transaction,       or zero for an undeleted tuple. In practice,
thisis        never nonzero for a visible tuple.
 

cmax    The command identifier within the deleting transaction,       or zero. Again, this is never nonzero for a
visibletuple.
 


Which is IMHO good and useful behaviour, for example for all kinds of
mirroring

I also think that this kas historically been the behaviour and that 
this was broken sometime in not too distant past (i.e after postgres95
;)
by foreign keys and/or somesuch.

Tom Lane once told me about a way to determine the visibility of a tuple 
by other means than [x|c][min|max] but I can't find/remember it anymore
;(

-----------------
Hannu


Re: Further open item (Was: Status of 7.2)

От
Tom Lane
Дата:
Hannu Krosing <hannu@tm.ee> writes:
> But
> http://www.postgresql.org/idocs/index.php?sql-syntax-columns.html

That documentation is in error (my fault).  Current docs say

xmax
    The identity (transaction ID) of the deleting transaction, or zero    for an undeleted tuple. It is possible for
thisfield to     be nonzero in a visible tuple: that usually indicates that the    deleting transaction hasn't
committedyet, or that an     attempted deletion was rolled back. 
 

> I also think that this kas historically been the behaviour 

No, it wasn't.
        regards, tom lane


Re: Further open item (Was: Status of 7.2)

От
Hannu Krosing
Дата:

Tom Lane wrote:

>Hannu Krosing <hannu@tm.ee> writes:
>
>>But
>>http://www.postgresql.org/idocs/index.php?sql-syntax-columns.html
>>
>
>That documentation is in error (my fault).  Current docs say
>
>xmax
>
>     The identity (transaction ID) of the deleting transaction, or zero
>     for an undeleted tuple. It is possible for this field to 
>     be nonzero in a visible tuple: that usually indicates that the
>     deleting transaction hasn't committed yet,
>
That seems reasonable

> or that an attempted deletion was rolled back. 
>
But could we not make it so that rollback will also reset xmax and cmax 
to 0.
It should be quite cheap to do so as it's on the same page with the 
commit bits ?

The meaning "last transaction that attempted to delete this tuple" seems 
somewhat weird

>>I also think that this kas historically been the behaviour 
>>
>No, it wasn't.
>
Are you sure that it was a bug not in code but in docs ?

---------------
Hannu




Re: Further open item (Was: Status of 7.2)

От
Tom Lane
Дата:
Hannu Krosing <hannu@tm.ee> writes:
> But could we not make it so that rollback will also reset xmax and cmax 
> to 0.

We never have done that and I don't see why we should start.
(And no, I'm not sure that it'd be entirely safe; there are
concurrency/atomicity issues involved, because we do not
insist on getting exclusive lock to set the it's-dead-Jim
flag bit.)

We could make the user readout of xmax/cmax be zeroes if the flag
bits show they are invalid.  But this really just begs the question
of what use they are to users in the first place.  I can't see any;
and if we make them read as zeroes then they for sure won't have any.
        regards, tom lane


Re: Further open item (Was: Status of 7.2)

От
Hannu Krosing
Дата:

Tom Lane wrote:

>Hannu Krosing <hannu@tm.ee> writes:
>
>>But could we not make it so that rollback will also reset xmax and cmax 
>>to 0.
>>
>
>We never have done that and I don't see why we should start.
>(And no, I'm not sure that it'd be entirely safe; there are
>concurrency/atomicity issues involved, because we do not
>insist on getting exclusive lock to set the it's-dead-Jim
>flag bit.)
>
>We could make the user readout of xmax/cmax be zeroes if the flag
>bits show they are invalid.
>
If there is a cheap way to get a list of pending transactions, then we 
could make them
read out as 0 if they are about to be deleted (ie xmax in 
pending_transactions()) and
else show the value of the transaction that is about to delete them.

>But this really just begs the question
>of what use they are to users in the first place.  I can't see any;
>and if we make them read as zeroes then they for sure won't have any.
>
I can see some use for xmax user-visible only while being deleted.

At least this would be more useful than themeaning
last-trx-that-was-about-to-delete.

Another way for getting equivalent functionality would be to make the
pending_transactions() function available to users.

---------------
Hannu