Обсуждение: vacuum internals and performance affect

Поиск
Список
Период
Сортировка

vacuum internals and performance affect

От
MirrorX
Дата:
dear all,

i am trying to understand if i am missing something on how vacuum works. i
ve read the manual, and did some research on the web about that but i am
still not sure.

to my understanding, vacuum just marks the dead rows of a table so that from
that point on that space would be re-used for new inserts and new updates on
that specific table. however, if there is an open transaction, vacuum can
only do what is described above up to the point that the open transaction
was started. so if for example there is a query running for 1 day, no matter
how many times i will have vacuumed the table (manual or auto), the dead
rows wont be possible to be marked as re-usable space.
-is the above correct?
-is there something more about vacuum in that case i am describing? would
for example mark the rows as 'semi-dead' so that when a scan would be made
these rows wouldn't be checked and so the queries would be faster? is there
anything else for this specific case?
-would there be any effect from the vacuum on the indexes of the table?like
i said above for the table, would the entries of the index not be scanned
for a query, due to some reason?

if there is a something i could read to answer these questions plz point me
to that direction, otherwise i would really appreciate any information you
may have. thx in advance

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/vacuum-internals-and-performance-affect-tp5033043p5033043.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

Re: vacuum internals and performance affect

От
Josh Berkus
Дата:
MX,

> to my understanding, vacuum just marks the dead rows of a table so that from
> that point on that space would be re-used for new inserts and new updates on
> that specific table. however, if there is an open transaction, vacuum can
> only do what is described above up to the point that the open transaction
> was started. so if for example there is a query running for 1 day, no matter
> how many times i will have vacuumed the table (manual or auto), the dead
> rows wont be possible to be marked as re-usable space.
> -is the above correct?

More or less.  The transactionID isn't a timestamp, so the "stop point"
is based on snapshots rather than a point-in-time.  But that's a fine
distinction.

> -is there something more about vacuum in that case i am describing? would
> for example mark the rows as 'semi-dead' so that when a scan would be made
> these rows wouldn't be checked and so the queries would be faster? is there
> anything else for this specific case?

Well, vacuum does some other work, yes.

> -would there be any effect from the vacuum on the indexes of the table?like
> i said above for the table, would the entries of the index not be scanned
> for a query, due to some reason?

Vacuum also does some pruning dead index pointers.

Otherwise, I'm not sure what you're asking.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Re: vacuum internals and performance affect

От
MirrorX
Дата:
thx a lot for your answer :)

so when a transaction is still open from a while back (according to the
transactionID), no 'new dead' tuples can be marked as re-usable space for
new rows, right? by 'new dead' i mean that for example there is a
transaction running from 10.00am(with a specific transactionID). when i
delete rows at 11.00am these are the ones i am referring to.

the same thing happens with the index, right? the dead enties for the rows
that were deleted at 11.00am cannot be removed yet (this is not based on the
timestamp, i get it, i just want to point out that due to MVCC these rows
should be visible to the old transaction and by using timestamps this is
more obvious)

but, for these rows, the 'deleted' ones. does vacuum do anything at all at
that time? and if so, what is it? thx in advance

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/vacuum-internals-and-performance-affect-tp5033043p5036800.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

Re: vacuum internals and performance affect

От
Josh Berkus
Дата:
MirrorX,

> so when a transaction is still open from a while back (according to the
> transactionID), no 'new dead' tuples can be marked as re-usable space for
> new rows, right? by 'new dead' i mean that for example there is a
> transaction running from 10.00am(with a specific transactionID). when i
> delete rows at 11.00am these are the ones i am referring to.

With the understanding that what we're actually checking is snapshots
(which are not completely linear) and not timestamps, yes, that's a good
simplification for what happens.

> but, for these rows, the 'deleted' ones. does vacuum do anything at all at
> that time? and if so, what is it? thx in advance

No, it does nothing.  What would it do?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Re: vacuum internals and performance affect

От
MirrorX
Дата:
from what i ve read and have i ve seen in practice, i expected it to do
nothing at all. i just wanted to be absolutely sure and that's why i asked
here.
thank you very much for the clarification

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/vacuum-internals-and-performance-affect-tp5033043p5039677.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.