Discussion: Select hangs and there are lots of files in table and index directories.
Have a problem where a stored procedure is taking a week to run. The stored procedure should take less than a second. In researching the hanging select, three possible causes were suggested: an autovacuum problem, a locked resource, or something wrong with the stored procedure itself.
· Autovacuum is running. A ‘ps -elf | grep postgres’ shows:
00:00:43 postgres: logger process
00:05:50 postgres: writer process
00:03:04 postgres: wal writer process
00:00:48 postgres: autovacuum launcher process
00:00:50 postgres: stats collector process
00:01:28 postgres: operations OPDB [local] idle
154:11:29 postgres: operations OPDB [local] SELECT
· The select comes from invoking the stored procedure from a C program using the libpq PQsendQuery function.
· postgresql.conf has both autovacuum and track_counts set to ‘on’. All other autovacuum settings are left at their defaults (commented out).
· A ‘select * from pg_stat_activity;’ shows no query is blocked.
· We have recently changed from Oracle 10g (running on Red Hat AS 4.5) to PostgreSQL 9.1.2 (running on CentOS 6.3). The only differences between the two implementations are:
o Syntax changes between Oracle and Postgres.
o In Oracle a commit was executed after each ‘chunk’ of work. A commit is no longer used in Postgres because the Postgres documentation indicates that a commit has no effect until the end of the transaction (i.e., the end of the stored procedure).
o The same stored procedure is running just fine on all of our test systems and at one of our two customer sites. All of the systems are configured the same (same operating system and software).
Lastly, in the directories used to store the tables and indexes, there are 918896 files in the tables directory and 921291 files in the indexes directory. All of the file names are just numbers (no extensions). About 60 files are added to each directory every second. On our test systems and at our other customer site, there are only about 50 files in each directory.
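A quick way to watch that growth is to count only the numeric-named files, since PostgreSQL relation files are named after pg_class.relfilenode (optionally followed by a segment suffix like ".1" or a fork suffix like "_fsm"/"_vm"). A small sketch, assuming a path to the tablespace directory:

```python
import os
import re

# Relation data files are named by relfilenode: a bare number. Segment
# files ("16384.1") and fork files ("16384_fsm", "16384_vm") are skipped
# so each relation is counted once.
RELFILE = re.compile(r"^\d+$")

def count_relation_files(path):
    """Count files in a table/index directory whose names are purely numeric."""
    return sum(1 for name in os.listdir(path) if RELFILE.match(name))
```

Running this twice, a minute apart, gives a direct estimate of the files-per-second growth rate being described here.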
Peter Blair <petertblair@gmail.com> writes:
> Have a problem where a stored procedure is taking a week to run. The
> stored procedure should take less than a second to run.

Is that "it's known to terminate if you give it a week", or "we've let it
run for a week and it shows no sign of ever terminating"?

> In researching a select hanging problem, three things are suggested; an
> autovacuum problem, a resource is locked, or there is something wrong
> with the stored procedure.

I'd bet on the last, given that you're apparently working with an immature
port from Oracle. The error recovery semantics, in particular, are enough
different in PL/SQL and PL/pgSQL that it's not too hard to credit having
accidentally written an infinite loop via careless translation.

> Lastly, in the directories used to store the tables and indexes, there are
> 918896 files in the tables directory and 921291 files in the indexes
> directory. All of the file names are just numbers (no extensions). About
> 60 files are added to each directory every second. On our test systems and
> at our other customer site, there are only about 50 files in each directory.
> Why are there so many files?

If the filenames are just numbers, then they must be actual tables or
indexes, not temp files. (You could cross-check that theory by noting
whether the system catalogs, such as pg_class, are bloating at a
proportional rate.) I'm guessing that there's some loop in your procedure
that's creating new temp tables, or maybe even non-temp tables. You would
not be able to see them via "select * from pg_class" in another session
because they're not committed yet, but they'd be taking up filesystem
entries. The loop might or might not be dropping the tables again; IIRC
the filesystem entries wouldn't get cleaned up till end of transaction
even if the tables are nominally dropped.

Not much to go on, but I'd look for a loop that includes a CREATE TABLE
and a BEGIN ... EXCEPTION block, and take a close look at the conditions
under which the EXCEPTION handler allows the loop to continue.

			regards, tom lane
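The pattern Tom suspects can be sketched as a Python analogue (the real culprit would be PL/pgSQL; `process_rows` and `bad_row` are invented names for illustration). The key point is an exception handler that swallows the error without advancing the loop variable:

```python
def process_rows(work, n_rows, max_iters=10_000):
    """Analogue of the suspected PL/pgSQL bug: an exception handler that
    swallows errors without advancing the loop variable spins forever on
    the first bad row (capped by max_iters here so the demo terminates)."""
    i = iters = 0
    while i < n_rows:
        iters += 1
        if iters > max_iters:
            return "runaway"   # the real procedure has no such guard
        try:
            work(i)            # e.g. CREATE TEMP TABLE + per-chunk processing
            i += 1             # advances only on success
        except ValueError:
            pass               # error swallowed; i never moves past the bad row

    return "done"

# A work function that always fails on row 1 never lets the loop finish:
def bad_row(i):
    if i == 1:
        raise ValueError("bad data")
```

In the PL/pgSQL version, each pass through such a loop could also execute a CREATE TABLE, which would match the file growth described above: new relation files every iteration, none visible or cleaned up until the transaction ends.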
Tom,

You are correct. There was an infinite loop, caused by the differences in
date math between Oracle and Postgres.

Thanks again for your help.

On Mon, Jan 27, 2014 at 7:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [...]
Peter Blair <petertblair@gmail.com> writes:
> One other problem with this case, those 900K worth of files in each of the
> table and index directories (1.8M total files) are still hanging around.

Hm ... if left to its own devices, I think the session that created them
should have deleted them, assuming you did a normal query cancel on it.
Maybe you did kill -9?

> Just a guess, but do I shutdown the database, and delete any file not
> listed in pg_class?

For starters, try just stopping and starting the database; I think there
might be logic to remove orphaned files during postmaster startup. If
that doesn't work, you can get rid of any numeric-named files that match
no value in pg_class.relfilenode of their database.

			regards, tom lane
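Tom's cleanup suggestion (match numeric file names against pg_class.relfilenode) can be sketched as a dry-run helper. This is an illustrative sketch, not a vetted tool: `find_orphans` and the suffix handling are assumptions, and the relfilenode list would come from `SELECT relfilenode FROM pg_class` in the affected database, with the server stopped and the output reviewed before deleting anything:

```python
def base_filenode(name):
    """Strip a segment suffix (".1") or fork suffix ("_fsm", "_vm") and
    return the numeric base, or None for non-relation files."""
    base = name.split(".")[0].split("_")[0]
    return int(base) if base.isdigit() else None

def find_orphans(dir_names, known_relfilenodes):
    """Dry run: list files whose base relfilenode does not appear in
    pg_class. Non-numeric names (e.g. pg_internal.init) are ignored."""
    known = set(known_relfilenodes)
    return [n for n in dir_names
            if (b := base_filenode(n)) is not None and b not in known]
```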