Discussion: pg_relation_size locking

pg_relation_size locking

From
Andreas Pflug
Date:
Until recently, pg_relation_size used SearchSysCache to locate the 
relation to examine, and calculated the file location from that 
information. Starting with dbsize.c V1.5 (committed after Beta2), 
relation_open(.., AccessShareLock) is used. This is very unfortunate 
because it makes it impossible to observe a table growing while it is 
being populated, e.g. by a lengthy COPY; pg_relation_size will be blocked. 
After reverting to 1.4, everything was fine again.

Can we have this reverted/fixed?

Regards,
Andreas


Re: pg_relation_size locking

From
Tom Lane
Date:
Andreas Pflug <pgadmin@pse-consulting.de> writes:
> Until recently, pg_relation_size used SearchSysCache to locate the 
> relation to examine, and calculated the file location from that 
> information. Starting with dbsize.c V1.5 (committed after Beta2), 
> relation_open(.., AccessShareLock) is used. This is very unfortunate 
> because it makes it impossible to observe a table growing while it is 
> being populated, e.g. by a lengthy COPY; pg_relation_size will be blocked. 

Nonsense.

> After reverting to 1.4, everything was fine again.
> Can we have this reverted/fixed?

Can we have the actual problem explained?
        regards, tom lane


Re: pg_relation_size locking

From
Andreas Pflug
Date:
Tom Lane wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> 
>>Until recently, pg_relation_size used SearchSysCache to locate the 
>>relation to examine, and calculated the file location from that 
>>information. Starting with dbsize.c V1.5 (committed after Beta2), 
>>relation_open(.., AccessShareLock) is used. This is very unfortunate 
>>because it makes it impossible to observe a table growing while it is 
>>being populated, e.g. by a lengthy COPY; pg_relation_size will be blocked. 
> 
> 
> Nonsense.

Ahem.

I'm running Slony against a big replication set. While slon runs COPY 
foo(colnamelist) FROM STDIN, I can't execute pg_relation_size(foo_oid). 
pg_locks will show that the AccessShareLock on foo is not granted.

The problem is gone with dbsize.c reverted.

Regards,
Andreas


Re: pg_relation_size locking

From
Tom Lane
Date:
Andreas Pflug <pgadmin@pse-consulting.de> writes:
> Tom Lane wrote:
>> Nonsense.

> Ahem.

> I'm running Slony against a big replication set. While slon runs COPY 
> foo(colnamelist) FROM STDIN, I can't execute pg_relation_size(foo_oid). 
> pg_locks will show that the AccessShareLock on foo is not granted.

That's only possible if Slony is taking AccessExclusive lock; if so,
your gripe is properly directed to the Slony folks, not to
pg_relation_size which is acting as a good database citizen should.
Certainly a plain COPY command does not take AccessExclusive.
        regards, tom lane
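
The argument above rests on PostgreSQL's table-level lock conflict rules: an AccessShareLock request waits only behind an AccessExclusiveLock, COPY FROM takes RowExclusiveLock, and TRUNCATE takes AccessExclusiveLock. A minimal sketch of that conflict table as a standalone C model follows. The enum names and the `lock_conflicts` helper are hypothetical and exist only for illustration; this is not PostgreSQL source code.

```c
#include <assert.h>
#include <stdbool.h>

/* Table-level lock modes, weakest to strongest (illustrative names). */
typedef enum {
    LM_ACCESS_SHARE = 0,        /* plain reads; relation_open(.., AccessShareLock) */
    LM_ROW_SHARE,
    LM_ROW_EXCLUSIVE,           /* INSERT, UPDATE, DELETE, COPY FROM */
    LM_SHARE_UPDATE_EXCLUSIVE,
    LM_SHARE,
    LM_SHARE_ROW_EXCLUSIVE,
    LM_EXCLUSIVE,
    LM_ACCESS_EXCLUSIVE,        /* TRUNCATE, DROP TABLE, ... */
    LM_NUM_MODES
} LockMode;

#define BIT(m) (1u << (m))

/* conflict_mask[m] has bit h set when a request for m must wait for a held h. */
static const unsigned conflict_mask[LM_NUM_MODES] = {
    [LM_ACCESS_SHARE] = BIT(LM_ACCESS_EXCLUSIVE),
    [LM_ROW_SHARE] = BIT(LM_EXCLUSIVE) | BIT(LM_ACCESS_EXCLUSIVE),
    [LM_ROW_EXCLUSIVE] = BIT(LM_SHARE) | BIT(LM_SHARE_ROW_EXCLUSIVE) |
        BIT(LM_EXCLUSIVE) | BIT(LM_ACCESS_EXCLUSIVE),
    [LM_SHARE_UPDATE_EXCLUSIVE] = BIT(LM_SHARE_UPDATE_EXCLUSIVE) |
        BIT(LM_SHARE) | BIT(LM_SHARE_ROW_EXCLUSIVE) |
        BIT(LM_EXCLUSIVE) | BIT(LM_ACCESS_EXCLUSIVE),
    [LM_SHARE] = BIT(LM_ROW_EXCLUSIVE) | BIT(LM_SHARE_UPDATE_EXCLUSIVE) |
        BIT(LM_SHARE_ROW_EXCLUSIVE) | BIT(LM_EXCLUSIVE) |
        BIT(LM_ACCESS_EXCLUSIVE),
    [LM_SHARE_ROW_EXCLUSIVE] = BIT(LM_ROW_EXCLUSIVE) |
        BIT(LM_SHARE_UPDATE_EXCLUSIVE) | BIT(LM_SHARE) |
        BIT(LM_SHARE_ROW_EXCLUSIVE) | BIT(LM_EXCLUSIVE) |
        BIT(LM_ACCESS_EXCLUSIVE),
    [LM_EXCLUSIVE] = BIT(LM_ROW_SHARE) | BIT(LM_ROW_EXCLUSIVE) |
        BIT(LM_SHARE_UPDATE_EXCLUSIVE) | BIT(LM_SHARE) |
        BIT(LM_SHARE_ROW_EXCLUSIVE) | BIT(LM_EXCLUSIVE) |
        BIT(LM_ACCESS_EXCLUSIVE),
    /* AccessExclusive conflicts with every mode, including itself. */
    [LM_ACCESS_EXCLUSIVE] = BIT(LM_NUM_MODES) - 1u,
};

/* Returns true when a request for `requested` must wait for a held `held`. */
static bool lock_conflicts(LockMode requested, LockMode held)
{
    assert((int) requested >= 0 && requested < LM_NUM_MODES);
    assert((int) held >= 0 && held < LM_NUM_MODES);
    return (conflict_mask[requested] & BIT(held)) != 0;
}
```

In this model `lock_conflicts(LM_ACCESS_SHARE, LM_ROW_EXCLUSIVE)` is false and `lock_conflicts(LM_ACCESS_SHARE, LM_ACCESS_EXCLUSIVE)` is true, so a blocked pg_relation_size points at something stronger than a plain COPY holding the table.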


Re: pg_relation_size locking

From
Andreas Pflug
Date:
Tom Lane wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> 
>>Tom Lane wrote:
>>
>>>Nonsense.
> 
> 
>>Ahem.
> 
> 
>>I'm running Slony against a big replication set. While slon runs COPY 
>>foo(colnamelist) FROM STDIN, I can't execute pg_relation_size(foo_oid). 
>>pg_locks will show that the AccessShareLock on foo is not granted.
> 
> 
> That's only possible if Slony is taking AccessExclusive lock; if so,
> your gripe is properly directed to the Slony folks, not to
> pg_relation_size which is acting as a good database citizen should.

More precisely, it executes TRUNCATE;COPY at the same time; there might 
be additional locks to prevent using the table. Still, I see no reason 
why pg_relation_size shouldn't continue to use SearchSysCache as it did 
for years now. There's no sense in using locking mechanisms on table foo 
while reading file system data; pg_class is sufficient to locate the 
table's files.

Regards,
Andreas
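
For context, the file-system half of such a size calculation can be sketched in a few lines of C. PostgreSQL stores a relation in a file named after its relfilenode and splits large relations into segments suffixed ".1", ".2", and so on. The function below is a hypothetical helper, not the dbsize.c code: it assumes the caller already knows the path of the first segment, and it takes no lock at all, which is exactly the point under dispute in this thread.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Sum the sizes of a relation's segment files: "<path>", "<path>.1", ...
 * Stops at the first missing segment. Returns -1 if the path is too long
 * or stat() fails for any reason other than "no such file".
 */
static long long relation_files_size(const char *first_segment_path)
{
    long long total = 0;
    char path[1024];
    struct stat st;

    for (unsigned segno = 0;; segno++) {
        int n;

        if (segno == 0)
            n = snprintf(path, sizeof(path), "%s", first_segment_path);
        else
            n = snprintf(path, sizeof(path), "%s.%u", first_segment_path, segno);
        if (n < 0 || (size_t) n >= sizeof(path))
            return -1;          /* path did not fit in the buffer */

        if (stat(path, &st) != 0) {
            if (errno == ENOENT)
                break;          /* no more segments */
            return -1;          /* unexpected error */
        }
        total += (long long) st.st_size;
    }
    return total;
}
```

Nothing here checks that the path still belongs to the relation the caller had in mind; if the relfilenode changes between the catalog lookup and the stat() calls, the function silently measures the wrong file.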


Re: pg_relation_size locking

From
Tom Lane
Date:
Andreas Pflug <pgadmin@pse-consulting.de> writes:
> Tom Lane wrote:
>> That's only possible if Slony is taking AccessExclusive lock; if so,
>> your gripe is properly directed to the Slony folks, not to
>> pg_relation_size which is acting as a good database citizen should.

> More precisely, it executes TRUNCATE;COPY at the same time; there might 
> be additional locks to prevent using the table. Still, I see no reason 
> why pg_relation_size shouldn't continue to use SearchSysCache as it did 
> for years now. There's no sense in using locking mechanisms on table foo 
> while reading file system data; pg_class is sufficient to locate the 
> table's files.

The fact that the contrib version did things incorrectly for years is
no justification for not fixing it at the time it's taken into the core.
You have to have a lock to ensure that the table even exists, let alone
that you are looking at the right set of disk files.

In the above example, the contrib code would have not done the right
thing at all --- if I'm not mistaken, it would have kept handing back
the size of the original, pre-TRUNCATE file, since the new pg_class
row with the new relfilenode isn't committed yet.  So it wouldn't have
done what you wish anyway.
        regards, tom lane
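
The scenario described above can be illustrated with a toy model. The sketch below is not PostgreSQL code: `ToyClassRow`, `toy_truncate`, `toy_copy`, `toy_commit`, and `toy_size_unlocked` are hypothetical names, and row visibility is reduced to "committed" versus "pending in one open transaction". It only shows why a lock-free lookup keeps reporting the pre-TRUNCATE file while the loading transaction is still open.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FILES 8

/* Toy "disk": size in bytes of each file, indexed by relfilenode. */
static long file_size[MAX_FILES];

/* Toy pg_class row: the committed version plus one uncommitted update. */
typedef struct {
    unsigned committed_relfilenode; /* what other sessions can see */
    unsigned pending_relfilenode;   /* written by the open transaction */
    bool     has_pending;
} ToyClassRow;

/* TRUNCATE in an open transaction: point the row at a fresh, empty file. */
static void toy_truncate(ToyClassRow *row, unsigned new_node)
{
    assert(new_node < MAX_FILES);
    file_size[new_node] = 0;
    row->pending_relfilenode = new_node;
    row->has_pending = true;
}

/* COPY in the same transaction appends to the file that transaction sees. */
static void toy_copy(ToyClassRow *row, long nbytes)
{
    unsigned node = row->has_pending ? row->pending_relfilenode
                                     : row->committed_relfilenode;
    file_size[node] += nbytes;
}

/* COMMIT makes the new relfilenode visible to everyone. */
static void toy_commit(ToyClassRow *row)
{
    if (row->has_pending) {
        row->committed_relfilenode = row->pending_relfilenode;
        row->has_pending = false;
    }
}

/* Lock-free size lookup from another session: sees only the committed row. */
static long toy_size_unlocked(const ToyClassRow *row)
{
    return file_size[row->committed_relfilenode];
}
```

In this model, an observer calling `toy_size_unlocked` between `toy_truncate` and `toy_commit` gets the old file's size no matter how much `toy_copy` has loaded, so the lock-free approach would not show the table growing either.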


Re: pg_relation_size locking

From
Alvaro Herrera
Date:
Tom Lane wrote:

> In the above example, the contrib code would have not done the right
> thing at all --- if I'm not mistaken, it would have kept handing back
> the size of the original, pre-TRUNCATE file, since the new pg_class
> row with the new relfilenode isn't committed yet.  So it wouldn't have
> done what you wish anyway.

It wouldn't have worked anyway because it used the Oid to look up the
file, not the relfilenode.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: pg_relation_size locking

From
Andreas Pflug
Date:
Alvaro Herrera wrote:
> 
> The problem with the original coding was that it used the table Oid to
> look up the file name, which is wrong.  (Test it with a table that has
> been clustered or an index that has been reindexed.)

Um, can't test at the moment. The old code used pg_class->relfilenode, 
which delivers "Name of the on-disk file of this relation" according to 
the docs. What's wrong with that?

regards,
Andreas


Re: pg_relation_size locking

From
Andreas Pflug
Date:
Tom Lane wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> 
>>Tom Lane wrote:
>>
>>>That's only possible if Slony is taking AccessExclusive lock; if so,
>>>your gripe is properly directed to the Slony folks, not to
>>>pg_relation_size which is acting as a good database citizen should.
> 
> 
>>More precisely, it executes TRUNCATE;COPY at the same time; there might 
>>be additional locks to prevent using the table. Still, I see no reason 
>>why pg_relation_size shouldn't continue to use SearchSysCache as it did 
>>for years now. There's no sense in using locking mechanisms on table foo 
>>while reading file system data; pg_class is sufficient to locate the 
>>table's files.
> 
> 
> The fact that the contrib version did things incorrectly for years is
> no justification for not fixing it at the time it's taken into the core.
> You have to have a lock to ensure that the table even exists, let alone
> that you are looking at the right set of disk files.

This would require a lock on pg_class, not table foo, no?

> In the above example, the contrib code would have not done the right
> thing at all --- if I'm not mistaken, it would have kept handing back
> the size of the original, pre-TRUNCATE file, since the new pg_class
> row with the new relfilenode isn't committed yet. 

Hm, I see the issue. Interestingly enough, I *do* see the size growing. 
OTOH, when I run BEGIN;TRUNCATE against a test table, pg_relation_size 
returns the previous relfilenode and size, as expected.

Regards,
Andreas


Re: pg_relation_size locking

From
Alvaro Herrera
Date:
Andreas Pflug wrote:
> Until recently, pg_relation_size used SearchSysCache to locate the 
> relation to examine, and calculated the file location from that 
> information. Starting with dbsize.c V1.5 (committed after Beta2), 
> relation_open(.., AccessShareLock) is used. This is very unfortunate 
> because it makes it impossible to observe a table growing while it is 
> being populated, e.g. by a lengthy COPY; pg_relation_size will be blocked. 
> After reverting to 1.4, everything was fine again.

The diff:
http://projects.commandprompt.com/projects/public/pgsql/changeset/23120

The problem with the original coding was that it used the table Oid to
look up the file name, which is wrong.  (Test it with a table that has
been clustered or an index that has been reindexed.)

We could use a SysCache on filenode, if there was one.  Unfortunately I
don't think we have it.

> Can we have this reverted/fixed?

If you can see a way without reintroducing the old bugs, let me know.


-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: pg_relation_size locking

From
Tom Lane
Date:
Andreas Pflug <pgadmin@pse-consulting.de> writes:
> Tom Lane wrote:
>> You have to have a lock to ensure that the table even exists, let alone
>> that you are looking at the right set of disk files.

> This would require a lock on pg_class, not table foo, no?

No, the convention is that you take a lock on the relation you're
interested in.  The fact that some of the information you care about is
in pg_class is incidental.  There is actually stuff going on behind
the scenes to make sure that you get up-to-date info when you do
LockRelation; looking at the pg_class row does *not* by itself guarantee
that.  That is, when you SearchSysCache you might get a row that was
good at the start of your transaction but no longer is; relation_open
with a lock guarantees that you get a relation descriptor that is
currently correct.

> Hm, I see the issue. Interestingly enough, I *do* see the size growing. 
> OTOH, when I run BEGIN;TRUNCATE against a test table, pg_relation_size 
> returns the previous relfilenode and size, as expected.

That's a bit curious.  If they just did TRUNCATE then COPY, the commit
of the TRUNCATE should have released the lock.  If the TRUNCATE wasn't
committed yet, then how are you able to pick up the correct relfilenode
to look at?
        regards, tom lane


Re: pg_relation_size locking

From
Andreas Pflug
Date:
Tom Lane wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> 
>>Tom Lane wrote:
>>
>>>You have to have a lock to ensure that the table even exists, let alone
>>>that you are looking at the right set of disk files.
> 
> 
>>This would require a lock on pg_class, not table foo, no?
> 
> 
> No, the convention is that you take a lock on the relation you're
> interested in. 

So pgAdmin violates the convention, because it doesn't hold a lock on a 
table when reengineering its attributes....
Since pg_relation_size is a pure metadata function, I don't think the 
convention applies here.

> 
>>Hm, I see the issue. Interestingly enough, I *do* see the size growing. 
>>OTOH, when I run BEGIN;TRUNCATE against a test table, pg_relation_size 
>>returns the previous relfilenode and size, as expected.
> 
> 
> That's a bit curious.  If they just did TRUNCATE then COPY, the commit
> of the TRUNCATE should have released the lock.  If the TRUNCATE wasn't
> committed yet, then how are you able to pick up the correct relfilenode
> to look at?

The TRUNCATE is buried in a function; I suspect that no truncate 
actually happened on an empty table.

Regards,
Andreas


Re: pg_relation_size locking

From
Alvaro Herrera
Date:
Andreas Pflug wrote:
> Alvaro Herrera wrote:
> >
> >The problem with the original coding was that it used the table Oid to
> >look up the file name, which is wrong.  (Test it with a table that has
> >been clustered or an index that has been reindexed.)
> 
> Um, can't test at the moment. The old code used pg_class->relfilenode, 
> which delivers "Name of the on-disk file of this relation" according to 
> the docs. What's wrong with that?

Hum, nothing that I can see, but I changed that code precisely because
somebody complained that it didn't work after truncating.  By "old
code", do you mean the contrib code or the code initially integrated
into the backend?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: pg_relation_size locking

From
Andreas Pflug
Date:
Alvaro Herrera wrote:
> Andreas Pflug wrote:
> 
>>Alvaro Herrera wrote:
>>
>>>The problem with the original coding was that it used the table Oid to
>>>look up the file name, which is wrong.  (Test it with a table that has
>>>been clustered or an index that has been reindexed.)
>>
>>Um, can't test at the moment. The old code used pg_class->relfilenode, 
>>which delivers "Name of the on-disk file of this relation" according to 
>>the docs. What's wrong with that?
> 
> 
> Hum, nothing that I can see, but I changed that code precisely because
> somebody complained that it didn't work after truncating.  By "old
> code", do you mean the contrib code or the code initially integrated
> into the backend?

Both, esp. backend/utils/adt/dbsize.c V1.4 and contrib/dbsize/dbsize.c 
from 8.0.5.

You might have been misled by the naming:

relnodeOid = pg_class->relfilenode;
(..)
PG_RETURN_INT64(calculate_relation_size(tblspcOid, relnodeOid));

Regards,
Andreas