Discussion: Re[2]: Weird indices

Re[2]: Weird indices

From
Jean-Christophe Boggio
Date:
Joseph,

I think you're going a bit too far... Tom and Stephan have been very
patient explaining the basics of indices to you.

>> The name of the game here is to make a plan *without* actually going
>> out and expending large amounts of time to find out the true state of
>> affairs; by the time you know for sure, you've already done the query.

Believe this. All the best DB engines, including PostgreSQL, work that
way. This is based on measurements, on real life.

JS> Well I'd hope that extracting the count from the index should be very
JS> low cost. That is what indices are for.

No, indices are made for finding a record in one go or for isolating a
small range of values.
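
To illustrate (my own sketch, with invented table and index names),
these are the kinds of queries an index is good at:

    CREATE TABLE orders (id integer, placed date, total numeric);
    CREATE INDEX orders_id_idx ON orders (id);
    CREATE INDEX orders_placed_idx ON orders (placed);

    SELECT * FROM orders WHERE id = 42;        -- one record in one go
    SELECT * FROM orders                       -- a small range of values
     WHERE placed >= '2001-02-01' AND placed < '2001-02-08';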

JS> But certain things could be done.  Like planning for the case of there
JS> being a single not null value, and updating the indices not to point at
JS> expired rows.

And then you'll ask what happens when there are 2 not null values...?

JS> Isn't the point of a vacuum to get rid of old rows?  Then
JS> why doesn't it update the index as well?

It does. Look at vacuum verbose.
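
For example (table name invented), the per-index report it prints shows
how many index entries were cleaned out along with the dead rows:

    VACUUM VERBOSE mytable;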

JS> I mean the explain shows that getting the count(*) from the field that
JS> is indexed has to do a seq scan, presumably to determine if the rows are
JS> in fact valid.

count(*) means you want all the rows that have all the fields "not
null". Read carefully : ALL THE FIELDS.

JS> That is ridiculous.

Ahem. One solution to the problem is known as "optimizer hints" in
Oracle: you specify directly in the query HOW the optimizer should
execute the query. It's very useful in various situations. I have
asked Tom many times if that exists in PostgreSQL but didn't get any
answer. I guess it's on a TODO list somewhere ;-)
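
For the record, an Oracle hint is just a specially formatted comment
inside the query; roughly like this (table and index names invented):

    SELECT /*+ INDEX(o orders_placed_idx) */ *
      FROM orders o
     WHERE o.placed >= TO_DATE('2001-02-01', 'YYYY-MM-DD');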

--
Jean-Christophe Boggio
cat@thefreecat.org
Independent Consultant and Developer
Delphi, Linux, Perl, PostgreSQL



Re: Re[2]: Weird indices

From
Tom Lane
Date:
Jean-Christophe Boggio <cat@thefreecat.org> writes:
> JS> I mean the explain shows that getting the count(*) from the field that
> JS> is indexed has to do a seq scan, presumably to determine if the rows are
> JS> in fact valid.

> count(*) means you want all the rows that have all the fields "not
> null". Read carefully : ALL THE FIELDS.

No, actually it just means "count the rows".  count(f) for a field f
(or more generally any expression f) counts the number of non-null
values of f, but "*" just indicates count the rows.
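
A quick illustration of the difference (my own example):

    CREATE TABLE t (f integer);
    INSERT INTO t VALUES (1);
    INSERT INTO t VALUES (NULL);
    INSERT INTO t VALUES (3);

    SELECT count(*) FROM t;   -- 3: counts every row
    SELECT count(f) FROM t;   -- 2: counts only the non-null values of f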

Nonetheless, it's not that easy to keep a running count(*) total for a
table, even if we thought that select count(*) with no WHERE clause was
a sufficiently critical operation to justify slowing down every other
operation to keep the count(*) stats up to date.  Think about committed
vs not-committed transactions.  In the worst case, each active
transaction could have a different view of the table and thus a
different idea of what count(*) should yield; and each transaction might
have different pending updates that should affect the count(*) total
when and if it commits.
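
For instance (a sketch, using the little table above and the default
read-committed behavior), two sessions can legitimately see different
answers at the same moment:

    -- session A:
    BEGIN;
    DELETE FROM t WHERE f = 3;

    -- session B, while A is still open:
    SELECT count(*) FROM t;   -- still sees 3 rows

    -- session A:
    COMMIT;

    -- session B, after A commits:
    SELECT count(*) FROM t;   -- now sees 2 rows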

> Ahem. One solution to the problem is known as "optimizer hints" in
> Oracle: you specify directly in the query HOW the optimizer should
> execute the query. It's very useful in various situations. I have
> asked Tom many times if that exists in PostgreSQL but didn't get any
> answer. I guess it's on a TODO list somewhere ;-)

Not on mine ;-).  I don't believe in the idea, first because it's not
standard SQL, and second because I don't trust the user to know better
than the system what the best plan is in a particular context.  Hints
that you put in last year may have been the right thing at the time
(or not...) but they'll still be lying there forgotten in your code
when the table contents and the Postgres implementation have changed
beyond recognition.  Yes, the optimizer needs work, and it'll get that
work over time --- but a hint that's wrong is worse than no hint.
I'd rather have Postgres blamed for performance problems of its own
making than those of the user's making.

            regards, tom lane

Re: Weird indices

From
Joseph Shraibman
Date:
Jean-Christophe Boggio wrote:
>
> Joseph,
>
> I think you're going a bit too far... Tom and Stephan have been very
> patient explaining the basics of indices to you.
>
They are being patient.  I thank them.  I'm not trying to point fingers
at anybody and say 'you idiot! why aren't you thinking like me?'.  It's
just that this whole visibility meets index meets planner thing is still
eluding me.

> >> The name of the game here is to make a plan *without* actually going
> >> out and expending large amounts of time to find out the true state of
> >> affairs; by the time you know for sure, you've already done the query.
>
> Believe this. All the best DB engines, including PostgreSQL, work that
> way. This is based on measurements, on real life.
>
I believe it.  It just seemed to me that looking at the index should
have been before the cutoff point.  And it actually seemed to be in my
little experiment.

> JS> Well I'd hope that extracting the count from the index should be very
> JS> low cost. That is what indices are for.
>
> No, indices are made for finding a record in one go or for isolating a
> small range of values.
>

When I learned data structures in cs 102 I learned how to use an index
to quickly get a count.  I assume the postgres developers know this as
well.

> JS> But certain things could be done.  Like planning for the case of there
> JS> being a single not null value, and updating the indices not to point at
> JS> expired rows.
>
> And then you'll ask what happens when there are 2 not null values...?

I think I misspoke there.  I meant the case of there being a
single repeated value.  I can think offhand of a few ways to do it but I
don't know how postgres uses indices.
>
> JS> Isn't the point of a vacuum to get rid of old rows?  Then
> JS> why doesn't it update the index as well?
>
> It does. Look at vacuum verbose.
>
In my test case it didn't.  My database is vacuumed every night, and I
hadn't touched it that day before I did my test.  Therefore it should
have known there were only 16 matches and not 52.

> JS> I mean the explain shows that getting the count(*) from the field that
> JS> is indexed has to do a seq scan, presumably to determine if the rows are
> JS> in fact valid.
>
> count(*) means you want all the rows that have all the fields "not
> null". Read carefully : ALL THE FIELDS.

Uh, no.  Tom said so in a different reply.

I understand that keeping different views for different open
transactions can be difficult, but after a transaction  that updates a
row is over why isn't the row marked as 'universally visible' for all
new transactions until another update occurs?

Maybe I'm not making myself understood.  Another way of asking the same
thing:
Say there is a transaction that is looking at a non-current version of a
row.  'non-current' could be the value it was at the start of the
transaction (and was updated by another transaction) or was updated by
this transaction but not committed yet.  When this transaction is over
is it really that hard to get rid of the reference to the old version of
the row?  There should be a 1 bit field 'is old value and isn't being
used by any transaction'.  Is that really hard?

Maybe this is part of the whole 'vacuum later' vs. 'update now'
philosophy.  If the point of vacuuming later is to put off the
performance hit until later, but it is causing these performance hits
on queries because index scans aren't being used, then doesn't that
mean 'update now' is more likely to pay off in the short run?


--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio.  http://www.targabot.com

Re: Weird indices

From
Tom Lane
Date:
Joseph Shraibman <jks@selectacast.net> writes:
> Maybe I'm not making myself understood.  Another way of asking the same
> thing:
> Say there is a transaction that is looking at a non-current version of a
> row.  'non-current' could be the value it was at the start of the
> transaction (and was updated by another transaction) or was updated by
> this transaction but not committed yet.  When this transaction is over
> is it really that hard to get rid of the reference to the old version of
> the row?  There should be a 1 bit field 'is old value and isn't being
> used by any transaction'.  Is that really hard?

Sure, it's easy to do that sort of bookkeeping ... on a per-row basis.
And we do.  What's not so easy (an index helps not at all) is to
summarize N per-row status values into a single count(*) statistic that
you can maintain in a way significantly cheaper than just scanning the
rows when you need the count(*) value.  Especially when the per-row
status values interact with the state values of the observing process
to determine what it should think count(*) really is.

The issue is not really "could we make count(*) fast"?  Yeah, we
probably could, if that were the only measure of performance we cared
about.  The real issue is "can we do it at a price we're willing to pay,
considering the costs of slowdown of insert/update/delete operations,
extra storage space, and extra system complexity?"  So far the answer's
been "no".

You might want to look at the manual's discussion of MVCC and at the
Postgres internals talks that were given at OSDN (see slides at
http://www.postgresql.org/osdn/index.html) to learn more about how
things work.

            regards, tom lane

Re: Weird indices

From
Ian Lance Taylor
Date:
Joseph Shraibman <jks@selectacast.net> writes:

A caveat on this reply: I've been studying the Postgres internals, but
I have not mastered them.

> I understand that keeping different views for different open
> transactions can be difficult, but after a transaction  that updates a
> row is over why isn't the row marked as 'universally visible' for all
> new transactions until another update occurs?

It is.  This mark is on the tuple in the heap.  When a tuple is
current, and not locked for update, HEAP_XMAX_INVALID is set.  After
the tuple is removed, HEAP_XMAX_COMMITTED is set.

> Maybe I'm not making myself understood.  Another way of asking the same
> thing:
> Say there is a transaction that is looking at a non-current version of a
> row.  'non-current' could be the value it was at the start of the
> transaction (and was updated by another transaction) or was updated by
> this transaction but not committed yet.  When this transaction is over
> is it really that hard to get rid of the reference to the old version of
> the row?  There should be a 1 bit field 'is old value and isn't being
> used by any transaction'.  Is that really hard?

There is a 1 bit field indicating that a tuple is an old value.
Postgres can also determine whether any transaction can see the
tuple.  It does this by storing the transaction ID in the t_xmax
field.  If all current transactions are newer than that transaction
ID, then that tuple is no longer visible to any transaction.

In fact, I believe that is what the VACUUM command looks for.

> Maybe this is part of the whole 'vacuum later' vs. 'update now'
> philosophy.  If the point of vacuuming later is to put off the
> performance hit until later, but it is causing these performance hits
> on queries because index scans aren't being used, then doesn't that
> mean 'update now' is more likely to pay off in the short run?

I don't follow.  A simple VACUUM doesn't update the statistics.
VACUUM ANALYZE has to do more work.
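
In other words (table name invented):

    VACUUM mytable;           -- reclaims expired heap and index tuples
    VACUUM ANALYZE mytable;   -- does that and also refreshes the
                              -- statistics the planner uses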

Are you suggesting that the statistics should be updated continuously?
I guess that would be doable, but it would clearly slow down the
database.  For some applications, it would be an obviously bad idea.

Ian

Re: Weird indices

From
Joseph Shraibman
Date:
Ian Lance Taylor wrote:
>
> Joseph Shraibman <jks@selectacast.net> writes:
>
> A caveat on this reply: I've been studying the Postgres internals, but
> I have not mastered them.
>
> > I understand that keeping different views for different open
> > transactions can be difficult, but after a transaction  that updates a
> > row is over why isn't the row marked as 'universally visible' for all
> > new transactions until another update occurs?
>
> It is.  This mark is on the tuple in the heap.  When a tuple is
> current, and not locked for update, HEAP_XMAX_INVALID is set.  After
> the tuple is removed, HEAP_XMAX_COMMITTED is set.

On the heap, but when is the index updated?  Not until the next vacuum?
>
> > Maybe I'm not making myself understood.  Another way of asking the same
> > thing:
> > Say there is a transaction that is looking at a non-current version of a
> > row.  'non-current' could be the value it was at the start of the
> > transaction (and was updated by another transaction) or was updated by
> > this transaction but not committed yet.  When this transaction is over
> > is it really that hard to get rid of the reference to the old version of
> > the row?  There should be a 1 bit field 'is old value and isn't being
> > used by any transaction'.  Is that really hard?
>
> There is a 1 bit field indicating that a tuple is an old value.
> Postgres can also determine whether any transaction can see the
> tuple.  It does this by storing the transaction ID in the t_xmax
> field.  If all current transactions are newer than that transaction
> ID, then that tuple is no longer visible to any transaction.
>
> In fact, I believe that is what the VACUUM command looks for.
>
> > Maybe this is part of the whole 'vacuum later' vs. 'update now'
> > philosophy.  If the point of vacuuming later is to put off the
> > performance hit until later, but it is causing these performance hits
> > on queries because index scans aren't being used, then doesn't that
> > mean 'update now' is more likely to pay off in the short run?
>
> I don't follow.  A simple VACUUM doesn't update the statistics.
> VACUUM ANALYZE has to do more work.

I'm talking about indices.  The index should be updated to only point at
valid rows.  But if the index isn't used by the planner then the point
is moot.

>
> Are you suggesting that the statistics should be updated continuously?
> I guess that would be doable, but it would clearly slow down the
> database.  For some applications, it would be an obviously bad idea.

No, I'm suggesting that indices should be updated continuously so the
planner can use them without having a performance hit from checking if
tuples are valid or not.


BTW is there any way to tell postgres to do an update at every commit
without waiting for a vacuum?  I understand that the postgres core team
thinks it is a bad idea, but there are ways to (sort of) force using
index scans when the planner doesn't want to, so is there something
similar to force incremental vacuuming at the end of each query?

Java has this option:
-Xincgc           enable incremental garbage collection

that isn't recommended, but the java developers recognized that
sometimes a user might want it anyway for whatever reason.

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio.  http://www.targabot.com

Re: Weird indices

From
Joseph Shraibman
Date:
Err, I wasn't complaining about count(*) per se, I was just using that
as a simple example of something that should be done with an index.  If
the index doesn't have to worry about rows that aren't current (that is,
rows that aren't visible), then you don't even have to go into the heap,
because the index alone should have enough information to do that.

Tom Lane wrote:
>
> Joseph Shraibman <jks@selectacast.net> writes:
> > Maybe I'm not making myself understood.  Another way of asking the same
> > thing:
> > Say there is a transaction that is looking at a non-current version of a
> > row.  'non-current' could be the value it was at the start of the
> > transaction (and was updated by another transaction) or was updated by
> > this transaction but not committed yet.  When this transaction is over
> > is it really that hard to get rid of the reference to the old version of
> > the row?  There should be a 1 bit field 'is old value and isn't being
> > used by any transaction'.  Is that really hard?
>
> Sure, it's easy to do that sort of bookkeeping ... on a per-row basis.
> And we do.  What's not so easy (an index helps not at all) is to
> summarize N per-row status values into a single count(*) statistic that
> you can maintain in a way significantly cheaper than just scanning the
> rows when you need the count(*) value.  Especially when the per-row
> status values interact with the state values of the observing process
> to determine what it should think count(*) really is.
>
> The issue is not really "could we make count(*) fast"?  Yeah, we
> probably could, if that were the only measure of performance we cared
> about.  The real issue is "can we do it at a price we're willing to pay,
> considering the costs of slowdown of insert/update/delete operations,
> extra storage space, and extra system complexity?"  So far the answer's
> been "no".
>
> You might want to look at the manual's discussion of MVCC and at the
> Postgres internals talks that were given at OSDN (see slides at
> http://www.postgresql.org/osdn/index.html) to learn more about how
> things work.
>
Ugh, pdf format.  I'll have to remember to look at them next time I'm on
a windows box.

>                         regards, tom lane

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio.  http://www.targabot.com

Re: Weird indices

From
Ian Lance Taylor
Date:
Joseph Shraibman <jks@selectacast.net> writes:

> > > I understand that keeping different views for different open
> > > transactions can be difficult, but after a transaction  that updates a
> > > row is over why isn't the row marked as 'universally visible' for all
> > > new transactions until another update occurs?
> >
> > It is.  This mark is on the tuple in the heap.  When a tuple is
> > current, and not locked for update, HEAP_XMAX_INVALID is set.  After
> > the tuple is removed, HEAP_XMAX_COMMITTED is set.
>
> On the heap, but when is the index updated?  Not until the next vacuum?

The index is updated right away.  Otherwise it could never be used,
since it would be inaccurate.

When a new tuple is inserted in the heap, the index is updated to
point to the new tuple.  This involves inserting new tuples into the
index.  The old tuples in the index are left untouched, and continue
to point to the old tuple in the heap.

The old tuples in the index, and the heap, are removed by VACUUM.
VACUUM walks through the index and checks the heap tuple corresponding
to each index tuple.  If the heap tuple is gone, or can no longer be
seen by any transaction, then the index tuple can be removed.

Note that this all implies that when walking through the index to find
heap tuples, you must check the current validity of each heap tuple.
It is normal for an index tuple to point to a heap tuple which has
been deleted.
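
A concrete way to picture it (made-up table again):

    DELETE FROM orders WHERE placed < '2000-01-01';
    -- The index entries for those rows are still there; an index scan
    -- that reaches them has to visit the heap tuples to discover that
    -- they are no longer visible.
    VACUUM orders;   -- once no transaction can still see them, this
                     -- removes the dead heap tuples and the index
                     -- entries that pointed to them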

> > > Maybe this is part of the whole 'vacuum later' vs. 'update now'
> > > philosophy.  If the point of vacuuming later is to put off the
> > > performance hit until later, but it is causing these performance hits
> > > on queries because index scans aren't being used, then doesn't that
> > > mean 'update now' is more likely to pay off in the short run?
> >
> > I don't follow.  A simple VACUUM doesn't update the statistics.
> > VACUUM ANALYZE has to do more work.
>
> I'm talking about indices.  The index should be updated to only point at
> valid rows.

When should the index be updated to only point at valid rows?  That is
only possible when a heap tuple is finally and completely removed.
But currently that is only known at the time of a VACUUM.

Consider a transaction which sits around for a while and then aborts.
At the moment that the transaction aborts, Postgres may become able to
remove some heap tuples and some index tuples, and it might be invalid
for Postgres to remove those tuples before the transaction aborts.

But the transaction might never have looked at those tuples.  So,
given the Postgres data structures, the only way to keep an index
fully up to date would be to effectively run a VACUUM over the entire
database every time a transaction aborts.

OK, that's not quite true.  It would be possible to keep a list in
shared memory of tuples which were recently dropped.  Then as
transactions dropped out, it would be possible to see which ones could
be completely removed.  This list of tuples would presumably include
index tuples.

> But if the index isn't used by the planner then the point
> is moot.

As far as I know the index itself isn't used by the planner.

> > Are you suggesting that the statistics should be updated
> > continuously?  I guess that would be doable, but it would clearly
> > slow down the database.  For some applications, it would be an
> > obviously bad idea.
>
> No, I'm suggesting that indices should be updated continuously so the
> planner can use them without having a performance hit from checking if
> tuples are valid or not.

Well, as far as I know, the planner doesn't use the index.  It only
uses the statistics.

> BTW is there any way to tell postgres to do an update at every commit
> without waiting for a vacuum?  I understand that the postgres core team
> thinks it is a bad idea, but there are ways to (sort of) force using
> index scans when the planner doesn't want to, so is there something
> similar to force incremental vacuuming at the end of each query?
>
> Java has this option:
> -Xincgc           enable incremental garbage collection
>
> that isn't recommended, but the java developers recognized that
> sometimes a user might want it anyway for whatever reason.

I don't think there is any way to do that today.  It would be possible
to implement something along the lines I suggest above.  I have no
idea if the Postgres maintainers have any plans along these lines.

Ian

Re: Weird indices

From
Stephan Szabo
Date:
On Tue, 20 Feb 2001, Joseph Shraibman wrote:

> Err, I wasn't complaining about count(*) per se, I was just using that
> as a simple example of something that should be done with an index.  If
> the index doesn't have to worry about rows that aren't current (that is,
> rows that aren't visible), then you don't even have to go into the heap,
> because the index alone should have enough information to do that.

But the problem is how do you deal with concurrency?  At any given point
in time there are different sets of rows that are "current" for different
transactions.  They all need to be in the index so that index scans work
for those transactions (unless you were to do something hacky to get
around it), but not all of them are valid for each transaction, so you
still have to get that information somehow.  You can keep the transaction
information in the index, but I know Tom has talked this idea down in the
past (it's come up on hackers before); I don't really remember what the
full arguments were both ways.


Re: Weird indices

From
Joseph Shraibman
Date:
Ian Lance Taylor wrote:
>
> Joseph Shraibman <jks@selectacast.net> writes:
>
> > > > I understand that keeping different views for different open
> > > > transactions can be difficult, but after a transaction  that updates a
> > > > row is over why isn't the row marked as 'universally visible' for all
> > > > new transactions until another update occurs?
> > >
> > > It is.  This mark is on the tuple in the heap.  When a tuple is
> > > current, and not locked for update, HEAP_XMAX_INVALID is set.  After
> > > the tuple is removed, HEAP_XMAX_COMMITTED is set.
> >
> > On the heap, but when is the index updated?  Not until the next vacuum?
>
> The index is updated right away.  Otherwise it could never be used,
> since it would be inaccurate.

I meant cleaned up.  Which you answered: when vacuumed.

<snip>
>
> Note that this all implies that when walking through the index to find
> heap tuples, you must check the current validity of each heap tuple.
> It is normal for an index tuple to point to a heap tuple which has
> been deleted.

<snip>
> >
> > I'm talking about indices.  The index should be updated to only point at
> > valid rows.
>
> When should the index be updated to only point at valid rows?  That is
> only possible when a heap tuple is finally and completely removed.
> But currently that is only known at the time of a VACUUM.
>
You just said above 'It is normal for an index tuple to point to a heap
tuple which has been deleted.'


> Consider a transaction which sits around for a while and then aborts.
> At the moment that the transaction aborts, Postgres may become able to
> remove some heap tuples and some index tuples, and it might be invalid
> for Postgres to remove those tuples before the transaction aborts.
>
> But the transaction might never have looked at those tuples.  So,
> given the Postgres data structures, the only way to keep an index
> fully up to date would be to effectively run a VACUUM over the entire
> database every time a transaction aborts.

Why?  There is a mechanism for keeping track of which heap tuples are
valid, why not index tuples?  It is the nature of indices to be updated
on inserts, why not deletes?  I would think that the advantage of being
able to use the index in the planner would outweigh the immediate cost
of doing the update.

<snip>
>
> > But if the index isn't used by the planner then the point
> > is moot.
>
> As far as I know the index itself isn't used by the planner.
>
But could be.  As I understand it the reason the index isn't used by the
planner is because the index could point at non-visible rows (row = heap
tuple).  If the index could be used, many things that are now seq scans
could be converted to faster index scans.

<snip>

> I don't think there is any way to do that today.  It would be possible
> to implement something along the lines I suggest above.  I have no
> idea if the Postgres maintainers have any plans along these lines.
>
At the end of a transaction, when it sets the bit that this tuple isn't
valid, couldn't it at the same time also remove it if it was no longer
visible to any transaction?  It wouldn't remove the need for vacuum
because there may be another transaction that prevents it from being
removed right then and there.

But for index tuples we could use your list system because (as I see it)
the value of being able to use the index in the planner would outweigh
the cost of the list system.

> Ian

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio.  http://www.targabot.com

Re: Weird indices

From
Tom Lane
Date:
Joseph Shraibman <jks@selectacast.net> writes:
> Why?  There is a mechanism for keeping track of which heap tuples are
> valid, why not index tuples?  It is the nature of indices to be updated
> on inserts, why not deletes?

An index is a hint: these tuples *might* be of interest to your
transaction.  It's OK for an index to point to some irrelevant tuples,
but it's useless if it fails to point to all the possibly relevant
tuples.  Therefore, it's OK to insert an index entry at the earliest
possible instant (as soon as a yet-uncommitted heap tuple is inserted);
and contrariwise the index entry can't be deleted until the heap tuple
can be proven to be no longer of interest to any still-alive transaction.

Currently, proving that a heap tuple is globally no-longer-of-interest
and removing it and its associated index tuples is the task of VACUUM.

Intermediate state transitions (eg, this tuple has been deleted by a
not-yet-committed transaction) are recorded in the heap tuple, but we
don't try to look around and update all the associated index tuples at
the same time.  Maintaining that same state in all the index tuples
would be expensive and would bloat the indexes.  An index that's not
a lot smaller than the associated heap is of little value, so extra
bits in an index entry are to be feared.

These are very fundamental system design decisions.  If you'd like
to show us the error of our ways, step right up to the plate and swing
away; but unsubstantiated suggestions that these choices are wrong are
not going to be taken with any seriousness.  Postgres has come pretty
far on the basis of these design choices.

            regards, tom lane

Re: Weird indices

From
Ian Lance Taylor
Date:
Joseph Shraibman <jks@selectacast.net> writes:

> > Note that this all implies that when walking through the index to find
> > heap tuples, you must check the current validity of each heap tuple.
> > It is normal for an index tuple to point to a heap tuple which has
> > been deleted.
>
> <snip>
> > >
> > > I'm talking about indices.  The index should be updated to only point at
> > > valid rows.
> >
> > When should the index be updated to only point at valid rows?  That is
> > only possible when a heap tuple is finally and completely removed.
> > But currently that is only known at the time of a VACUUM.
> >
> You just said above 'It is normal for an index tuple to point to a heap
> tuple which has been deleted.'

I'm not sure what your point is here.

I can see that there is an ambiguity in what I said.  I said it is
normal for an index tuple to point to a heap tuple which has been
deleted.  However, although the heap tuple has been deleted in present
time, there may still be transactions which do not see that heap tuple
as having been deleted.  So the index tuple is still meaningful until
all those transactions have completed.

When all such transactions have completed is the case I meant when I
described the heap tuple as finally and completely removed.

> > Consider a transaction which sits around for a while and then aborts.
> > At the moment that the transaction aborts, Postgres may become able to
> > remove some heap tuples and some index tuples, and it might be invalid
> > for Postgres to remove those tuples before the transaction aborts.
> >
> > But the transaction might never have looked at those tuples.  So,
> > given the Postgres data structures, the only way to keep an index
> > fully up to date would be to effectively run a VACUUM over the entire
> > database every time a transaction aborts.
>
> Why?  There is a mechanism for keeping track of which heap tuples are
> valid, why not index tuples?  It is the nature of indices to be updated
> on inserts, why not deletes?  I would think that the advantage of being
> able to use the index in the planner would outweigh the immediate cost
> of doing the update.

You're right.  The mechanism used to preserve multiple versions of
heap tuples could be extended to index tuples as well.

Based on the heap tuple implementation, this would require adding two
transaction ID's and a few flags to each index tuple.  That's not
insignificant.  In a B-tree, right now, I think there are 8 bytes plus
the key for each item in the tree.  This would require adding another
10 bytes or so.  That's a lot.
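
To put rough numbers on that (my own back-of-the-envelope figures,
assuming a 4-byte integer key and ignoring page overhead):

    today:     8 bytes overhead + 4 bytes key          = ~12 bytes/entry
    proposed:  12 bytes + two 4-byte xids + flag bits  = ~22 bytes/entry

so an 8K index page drops from roughly 680 entries to roughly 370, and
the index nearly doubles in size.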

Also, more work would be required for every update.  Right now an
update requires a B-tree insert for each index.  With this change,
every update would require an additional B-tree lookup and write for
each index.  That would require on average a bit less than one
additional block write per index.  That's a lot.

In exchange, certain queries would become faster.  Specifically, any
query which only needed the information found in an index would become
faster.  Each such query would save on average a bit less than one
additional block read per value found in the index.  But since the
indices would be less efficient, some of the advantage would be lost
due to extra block reads of the index.

What you are suggesting seems possible, but it does not seem to be
obviously better.

If you feel strongly about this, the most reasonable thing would be
for you to implement it, and test the results.  Since as far as I can
see what you are suggesting is not clearly better, it's unlikely that
anybody else is going to go to the considerable effort of implementing it
on your behalf.

> > > But if the index isn't used by the planner then the point
> > > is moot.
> >
> > As far as I know the index itself isn't used by the planner.
> >
> But could be.  As I understand it the reason the index isn't used by the
> planner is because the index could point at non-visible rows (row = heap
> tuple).  If the index could be used, many things that are now seq scans
> could be converted to faster index scans.

Some things could, sure.  It's not obvious to me that many things
could.  The planner can't spend a lot of time looking at an index to
decide whether or not to use it.  If it's going to do that, it's
better off to just decide to use the index in the first place.  Index
examination is not free.  It requires disk reads just like everything
else.

> > I don't think there is any way to do that today.  It would be possible
> > to implement something along the lines I suggest above.  I have no
> > idea if the Postgres maintainers have any plans along these lines.
> >
> At the end of a transaction, when it sets the bit that this tuple isn't
> valid, couldn't it at the same time also remove it if it was no longer
> visible to any transaction?  It wouldn't remove the need for vacuum
> because there may be another transaction that prevents it from being
> removed right then and there.

Yes, this could be done.  It wouldn't speed things up, though.  In
fact, it would slow them down.  The only advantage would be that
VACUUM would be required less often--an advantage which is not
insignificant.

I would guess that in the average multi-user database less than half
of the tuples could be deleted at that point.  It would be easy to
instrument Postgres to test this--why don't you try that?

Ian

Re: Weird indices

From
Joseph Shraibman
Date:
Ian Lance Taylor wrote:
>
<snip>

> You're right.  The mechanism used to preserve multiple versions of
> heap tuples could be extended to index tuples as well.
>
> Based on the heap tuple implementation, this would require adding two
> transaction ID's and a few flags to each index tuple.  That's not
> insignificant.  In a B-tree, right now, I think there are 8 bytes plus
> the key for each item in the tree.  This would require adding another
> 10 bytes or so.  That's a lot.
>
OK now this is starting to make sense to me.  I think.  I guess I'll
really have to sift through the code again to figure out the rest.

> Also, more work would be required for every update.  Right now an
> update requires a B-tree insert for each index.  With this change,
> every update would require an additional B-tree lookup and write for
> each index.  That would require on average a bit less than one
> additional block write per index.  That's a lot.
>
> In exchange, certain queries would become faster.  Specifically, any
> query which only needed the information found in an index would become
> faster.  Each such query would save on average a bit less than one
> additional block read per value found in the index.  But since the
> indices would be less efficient, some of the advantage would be lost
> due to extra block reads of the index.
>
> What you are suggesting seems possible, but it does not seem to be
> obviously better.

It may not be as obvious as it first seemed to me, but I bet there are
certain databases out there that have just the right pattern of data
that would benefit from this.  I suppose this is something that
compilers have tried to balance all along.  Maybe there could be a
different type of index that could be manually added by admins who
wanted to fiddle around with their database.
>
> If you feel strongly about this, the most reasonable thing would be
> for you to implement it, and test the results.  Since as far as I can
> see what you are suggesting is not clearly better, it's unlikely that
> anybody else is going to go to the considerable effort of implementing it
> on your behalf.
>

<snip>
> Some things could, sure.  It's not obvious to me that many things
> could.  The planner can't spend a lot of time looking at an index to
> decide whether or not to use it.  If it's going to do that, it's
> better off to just decide to use the index in the first place.  Index
> examination is not free.  It requires disk reads just like everything
> else.
>
Not free, but possibly worth it if it saves a seq scan.

> > > I don't think there is any way to do that today.  It would be possible
> > > to implement something along the lines I suggest above.  I have no
> > > idea if the Postgres maintainers have any plans along these lines.
> > >
> > At the end of a transaction, when it sets the bit that this tuple isn't
> > valid, couldn't it at the same time also remove it if it was no longer
> > visible to any transaction?  It wouldn't remove the need for vacuum
> > because there may be another transaction that prevents it from being
> > removed right then and there.
>
> Yes, this could be done.  It wouldn't speed things up, though.  In
> fact, it would slow them down.  The only advantage would be that
> VACUUM would be required less often--an advantage which is not
> insignificant.
>
> I would guess that in the average multi-user database less than half
> of the tuples could be deleted at that point.  It would be easy to
> instrument Postgres to test this--why don't you try that?
>

I just might.  I've been thinking of hacking postgres, but for adding
XML support. That seems to be mostly a matter of parsing.



--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio.  http://www.targabot.com

Re: Weird indices

From
Bruce Momjian
Date:
> > Also, more work would be required for every update.  Right now an
> > update requires a B-tree insert for each index.  With this change,
> > every update would require an additional B-tree lookup and write for
> > each index.  That would require on average a bit less than one
> > additional block write per index.  That's a lot.
> >
> > In exchange, certain queries would become faster.  Specifically, any
> > query which only needed the information found in an index would become
> > faster.  Each such query would save on average a bit less than one
> > additional block read per value found in the index.  But since the
> > indices would be less efficient, some of the advantage would be lost
> > due to extra block reads of the index.
> >
> > What you are suggesting seems possible, but it does not seem to be
> > obviously better.
>
> It may not be as obvious as it first seemed to me, but I bet there are
> certain databases out there that have just the right pattern of data
> that would benefit from this.  I suppose this is something that
> compilers have tried to balance all along.  Maybe there could be a
> different type of index that could be manually added by admins who
> wanted to fiddle around with their database.

If you want performance options, MySQL is the champ.  We usually require
a clear reason to give users more options because too many options can
be quite confusing.  It is more a design philosophy.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026