Обсуждение: Select hangs and there are lots of files in table and index directories.

Поиск
Список
Период
Сортировка

Select hangs and there are lots of files in table and index directories.

От
Peter Blair
Дата:

Have a problem where a stored procedure is taking a week to run.  The stored procedure should take less than a second to run.   In researching a select hanging problem, three things are suggested; an autovacuum problem, a resource is locked, or there is something wrong with the stored procedure.

·         Autovacuum is running.  A ‘ps –elf | grep postgres’ shows:

00:00:43 postgres: logger process

00:5:50 postgres: writer process

00:3:04 postgres: wal writer process

00:00:48 postgres: autovacuum launcher process

00:00:50 postgres: stats collector process

00:01:28 postgres: operstions OPDB [local] idle

154.11.29 postgres: operstions OPDB [local] select

·         The select is from running a select of a stored procedure from a ‘c’ program using the PqsendQuery function.

·         Postgres.conf has both autovacuum and track_counts set to ‘on’.  All other autovacuum values are left as delivered (commented out).

·         A ‘select * from pg_stats_activity;’ shows no query is blocked.

·         We have recently changed from using Oracle 10g (running on Red Hat AS 4.5) to PostgreSQL 9.1.2 (running on CentOS 6.3).  The only differences between the two versions are:

o   Syntax changes between Oracle and Postgres.

o   In Oracle a commit was executed after each ‘chuck’ of work was done.  A commit is no longer used in Postgres because the Postgres documentation indicates that a commit has no affect until the end of the transaction (i.e., the end of the stored procedure).

o   The same stored procedure is running just fine on all of our test systems and at one of our two customer sites.  All of the systems are configured the same (same operating system and software).

Lastly, in the directories used to store the tables and indexes, there are 918896 files in the tables directory and 921291 files in the indexes directory.  All of the file names are just numbers (no extensions).  About 60 files are added to each directory every second.  On our test systems and at our other customer site, there are only about 50 files in each directory.

Why are there so many files?
Thank you everyone for your time.
Peter Blair

 

Re: Select hangs and there are lots of files in table and index directories.

От
Tom Lane
Дата:
Peter Blair <petertblair@gmail.com> writes:
> Have a problem where a stored procedure is taking a week to run.  The
> stored procedure should take less than a second to run.

Is that "it's known to terminate if you give it a week", or "we've let
it run for a week and it shows no sign of ever terminating"?

> In researching a
> select hanging problem, three things are suggested; an autovacuum problem,
> a resource is locked, or there is something wrong with the stored procedure.

I'd bet on the last, given that you're apparently working with an immature
port from Oracle.  The error recovery semantics, in particular, are enough
different in PL/SQL and PL/pgSQL that it's not too hard to credit having
accidentally written an infinite loop via careless translation.

> Lastly, in the directories used to store the tables and indexes, there are
> 918896 files in the tables directory and 921291 files in the indexes
> directory.  All of the file names are just numbers (no extensions).  About
> 60 files are added to each directory every second.  On our test systems and
> at our other customer site, there are only about 50 files in each directory.
> Why are there so many files?

If the filenames are just numbers, then they must be actual tables or
indexes, not temp files.  (You could cross-check that theory by noting
whether the system catalogs, such as pg_class, are bloating at a
proportional rate.)  I'm guessing that there's some loop in your procedure
that's creating new temp tables, or maybe even non-temp tables.  You would
not be able to see them via "select * from pg_class" in another session
because they're not committed yet, but they'd be taking up filesystem
entries.  The loop might or might not be dropping the tables again; IIRC
the filesystem entries wouldn't get cleaned up till end of transaction
even if the tables are nominally dropped.

Not much to go on, but I'd look for a loop that includes a CREATE TABLE
and a BEGIN ... EXCEPT block, and take a close look at the conditions
under which the EXCEPT allows the loop to continue.

            regards, tom lane


Re: Select hangs and there are lots of files in table and index directories.

От
Peter Blair
Дата:
Tom,
 
You are correct.  The was an infinate loop created because of the differences in the date math between Oracle and Postgres.
 
Thank again for your help.
On Mon, Jan 27, 2014 at 7:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Blair <petertblair@gmail.com> writes:
> Have a problem where a stored procedure is taking a week to run.  The
> stored procedure should take less than a second to run.

Is that "it's known to terminate if you give it a week", or "we've let
it run for a week and it shows no sign of ever terminating"?

> In researching a
> select hanging problem, three things are suggested; an autovacuum problem,
> a resource is locked, or there is something wrong with the stored procedure.

I'd bet on the last, given that you're apparently working with an immature
port from Oracle.  The error recovery semantics, in particular, are enough
different in PL/SQL and PL/pgSQL that it's not too hard to credit having
accidentally written an infinite loop via careless translation.

> Lastly, in the directories used to store the tables and indexes, there are
> 918896 files in the tables directory and 921291 files in the indexes
> directory.  All of the file names are just numbers (no extensions).  About
> 60 files are added to each directory every second.  On our test systems and
> at our other customer site, there are only about 50 files in each directory.
> Why are there so many files?

If the filenames are just numbers, then they must be actual tables or
indexes, not temp files.  (You could cross-check that theory by noting
whether the system catalogs, such as pg_class, are bloating at a
proportional rate.)  I'm guessing that there's some loop in your procedure
that's creating new temp tables, or maybe even non-temp tables.  You would
not be able to see them via "select * from pg_class" in another session
because they're not committed yet, but they'd be taking up filesystem
entries.  The loop might or might not be dropping the tables again; IIRC
the filesystem entries wouldn't get cleaned up till end of transaction
even if the tables are nominally dropped.

Not much to go on, but I'd look for a loop that includes a CREATE TABLE
and a BEGIN ... EXCEPT block, and take a close look at the conditions
under which the EXCEPT allows the loop to continue.

                        regards, tom lane

Re: Select hangs and there are lots of files in table and index directories.

От
Peter Blair
Дата:
All,
 
One other problem with this case, those 900K worth of files in each of the table and index directories (1.8M total files) are still hanging around.  I have:
* fixed the and reloaded the stored procedure
* restarted the database
* ran the stored procedure
* there are only 378 rows in the pg_class table
 
How do I get rid of those other files?
 
Just a guess, but do I shutdown the database, and delete any file not listed in pg_class?  I do not see anything in the PostgreSQL documentation about this.
 
Thank again.

On Tue, Jan 28, 2014 at 9:47 AM, Peter Blair <petertblair@gmail.com> wrote:
Tom,
 
You are correct.  The was an infinate loop created because of the differences in the date math between Oracle and Postgres.
 
Thank again for your help.
On Mon, Jan 27, 2014 at 7:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Blair <petertblair@gmail.com> writes:
> Have a problem where a stored procedure is taking a week to run.  The
> stored procedure should take less than a second to run.

Is that "it's known to terminate if you give it a week", or "we've let
it run for a week and it shows no sign of ever terminating"?

> In researching a
> select hanging problem, three things are suggested; an autovacuum problem,
> a resource is locked, or there is something wrong with the stored procedure.

I'd bet on the last, given that you're apparently working with an immature
port from Oracle.  The error recovery semantics, in particular, are enough
different in PL/SQL and PL/pgSQL that it's not too hard to credit having
accidentally written an infinite loop via careless translation.

> Lastly, in the directories used to store the tables and indexes, there are
> 918896 files in the tables directory and 921291 files in the indexes
> directory.  All of the file names are just numbers (no extensions).  About
> 60 files are added to each directory every second.  On our test systems and
> at our other customer site, there are only about 50 files in each directory.
> Why are there so many files?

If the filenames are just numbers, then they must be actual tables or
indexes, not temp files.  (You could cross-check that theory by noting
whether the system catalogs, such as pg_class, are bloating at a
proportional rate.)  I'm guessing that there's some loop in your procedure
that's creating new temp tables, or maybe even non-temp tables.  You would
not be able to see them via "select * from pg_class" in another session
because they're not committed yet, but they'd be taking up filesystem
entries.  The loop might or might not be dropping the tables again; IIRC
the filesystem entries wouldn't get cleaned up till end of transaction
even if the tables are nominally dropped.

Not much to go on, but I'd look for a loop that includes a CREATE TABLE
and a BEGIN ... EXCEPT block, and take a close look at the conditions
under which the EXCEPT allows the loop to continue.

                        regards, tom lane


Re: Select hangs and there are lots of files in table and index directories.

От
Tom Lane
Дата:
Peter Blair <petertblair@gmail.com> writes:
> One other problem with this case, those 900K worth of files in each of the
> table and index directories (1.8M total files) are still hanging around.

Hm ... if left to its own devices, I think the session that created them
should have deleted them, assuming you did a normal query cancel on it.
Maybe you did kill -9?

> Just a guess, but do I shutdown the database, and delete any file not
> listed in pg_class?

For starters, try just stopping and starting the database; I think there
might be logic to remove orphaned files during postmaster startup.

If that doesn't work, you can get rid of any numeric-named files that
match no value in pg_class.relfilenode of their database.

            regards, tom lane