Обсуждение: unexplainable psql exit status 1

Поиск
Список
Период
Сортировка

unexplainable psql exit status 1

От
Tim Kane
Дата:
HI all,

Is anyone aware of cases where psql will occasionally return an exit status of 1, despite there being no obvious error condition?


I have a regular (daily) cron that executes a fairly simple script of the form:

psql --set ON_ERROR_STOP=1 -h $DB_HOST -U $DB_USER -a -v DATE="'${DATE}'" -v YYYYMM="${DATE_YYYYMM}" -f duplicate_removal.sql $DB_NAME >> $DB_LOG 2>&1


The execution of this duplicate_removal.sql script appears to run without error. It’s fairly simple, and never fails to execute in and of itself.  
However, on very rare occasion, the psql session will run to completion, without visible/logged error – but it returns an exit status 1.

The only thing close to an error coming from this script is a notice, resulting from a CREATE TABLE IF NOT EXISTS. This is no different from any other execution of the script.
NOTICE:  relation "discarded" already exists, skipping


I can see that the output is correctly redirected to $DB_LOG and I can see the successful COMMIT response returned at the end of the session. I can also verify that the behaviour of the script is as expected, to completion.


The psql man page suggests that an exit status of 1 is a fatal error of its own.

EXIT STATUS
       psql returns 0 to the shell if it finished normally, 1 if a fatal error of its own occurs (e.g. out of memory, file not found), 2 if the connection to the server went bad and the session was not
       interactive, and 3 if an error occurred in a script and the variable ON_ERROR_STOP was set.


There is no indication in the postgres log file of anything adverse.

At this point my only option is to set log_statement = all, enable shell debugging, and wait a few months for it to happen again (I’ve seen this occur only twice, over a 2-3 month period).  Has anyone encountered this behaviour before?

Thanks

Tim

Re: unexplainable psql exit status 1

От
Tim Kane
Дата:

I should have mentioned, psql client and server version 9.2.4


From: Tim Kane <tim.kane@gmail.com>
Date: Wednesday, 4 December 2013 10:59
To: pgsql-general General <pgsql-general@postgresql.org>
Subject: unexplainable psql exit status 1

HI all,

Is anyone aware of cases where psql will occasionally return an exit status of 1, despite there being no obvious error condition?


I have a regular (daily) cron that executes a fairly simple script of the form:

psql --set ON_ERROR_STOP=1 -h $DB_HOST -U $DB_USER -a -v DATE="'${DATE}'" -v YYYYMM="${DATE_YYYYMM}" -f duplicate_removal.sql $DB_NAME >> $DB_LOG 2>&1


The execution of this duplicate_removal.sql script appears to run without error. It’s fairly simple, and never fails to execute in and of itself.  
However, on very rare occasion, the psql session will run to completion, without visible/logged error – but it returns an exit status 1.

The only thing close to an error coming from this script is a notice, resulting from a CREATE TABLE IF NOT EXISTS. This is no different from any other execution of the script.
NOTICE:  relation "discarded" already exists, skipping


I can see that the output is correctly redirected to $DB_LOG and I can see the successful COMMIT response returned at the end of the session. I can also verify that the behaviour of the script is as expected, to completion.


The psql man page suggests that an exit status of 1 is a fatal error of its own.

EXIT STATUS
       psql returns 0 to the shell if it finished normally, 1 if a fatal error of its own occurs (e.g. out of memory, file not found), 2 if the connection to the server went bad and the session was not
       interactive, and 3 if an error occurred in a script and the variable ON_ERROR_STOP was set.


There is no indication in the postgres log file of anything adverse.

At this point my only option is to set log_statement = all, enable shell debugging, and wait a few months for it to happen again (I’ve seen this occur only twice, over a 2-3 month period).  Has anyone encountered this behaviour before?

Thanks

Tim

Re: unexplainable psql exit status 1

От
Albe Laurenz
Дата:
Tim Kane wrote:
> Is anyone aware of cases where psql will occasionally return an exit status of 1, despite there being
> no obvious error condition?
> 
> I have a regular (daily) cron that executes a fairly simple script of the form:
> 
>     psql --set ON_ERROR_STOP=1 -h $DB_HOST -U $DB_USER -a -v DATE="'${DATE}'" -v
> YYYYMM="${DATE_YYYYMM}" -f duplicate_removal.sql $DB_NAME >> $DB_LOG 2>&1
> 
> The execution of this duplicate_removal.sql script appears to run without error. It’s fairly simple,
> and never fails to execute in and of itself.
> However, on very rare occasion, the psql session will run to completion, without visible/logged error
> – but it returns an exit status 1.

> I can see that the output is correctly redirected to $DB_LOG and I can see the successful COMMIT
> response returned at the end of the session. I can also verify that the behaviour of the script is as
> expected, to completion.
> 
> The psql man page suggests that an exit status of 1 is a fatal error of its own.
> 
>     EXIT STATUS
>            psql returns 0 to the shell if it finished normally, 1 if a fatal error of its own occurs
>            (e.g. out of memory, file not found), 2 if the connection to the server went bad and
>            the session was not interactive, and 3 if an error occurred in a script and the variable
>            ON_ERROR_STOP was set.
> 
> There is no indication in the postgres log file of anything adverse.
> 
> At this point my only option is to set log_statement = all, enable shell debugging, and wait a few
> months for it to happen again (I’ve seen this occur only twice, over a 2-3 month period).  Has anyone
> encountered this behaviour before?

According to the documentation you quoted, it would be useless to set
log_statement=all, because the problem is on the client end.

Shell debugging sounds slightly more promising, but it will probably
not tell you much more than that psql returned with exit code 1.

A quick look at the code gave me the impression that psql will always
write an error message before exiting with 1, but I may have missed a case.

Are you sure that there is nothing on stdout or stderr in the cases where
psql returns 1?

What is the last psql command that is executed?

Yours,
Laurenz Albe

Re: unexplainable psql exit status 1

От
Tim Kane
Дата:

> According to the documentation you quoted, it would be useless to set
> log_statement=all, because the problem is on the client end.
> Shell debugging sounds slightly more promising, but it will probably
> not tell you much more than that psql returned with exit code 1.
> A quick look at the code gave me the impression that psql will always
> write an error message before exiting with 1, but I may have missed a case.
> Are you sure that there is nothing on stdout or stderr in the cases where
> psql returns 1?
> What is the last psql command that is executed?
> Yours,
> Laurenz Albe


Thanks Laurenz.
I’ve enabled shell debugging to be 100% sure that I’m not munging the return code anywhere. It’s entirely possible there is something going on at the shell side of things, though I fail to see how just at the minute :)
The output of the script is as follows, bearing in mind that all STDERR is being redirected to STDOUT – there should be no other output to be found.
I’ve verified stdout just in case, to no avail.


\set QUIET 1
\timing
\set HISTSIZE 2000
\unset QUIET
DISCARD ALL;
DISCARD ALL
Time: 0.321 ms
—-SELECT STATEMENT SNIPPED
SELECT 7713
Time: 22.081 ms
CREATE TABLE IF NOT EXISTS duplicates(like source_table);
psql:/path/to/duplicate_removal.sql:36: NOTICE:  relation "duplicates" already exists, skipping
CREATE TABLE
Time: 0.416 ms
BEGIN;
BEGIN
Time: 0.210 ms
—-INSERT STATEMENT SNIPPED
INSERT 0 7713
Time: 93.855 ms

-—DELETE STATEMENT SNIPPED
DELETE 7713
Time: 33.939 ms

COMMIT;
COMMIT
Time: 0.242 ms


Cheers,

Tim

Re: unexplainable psql exit status 1

От
Albe Laurenz
Дата:
Tim Kane wrote:
> I’ve enabled shell debugging to be 100% sure that I’m not munging the return code anywhere. It’s
> entirely possible there is something going on at the shell side of things, though I fail to see how
> just at the minute :)
> The output of the script is as follows, bearing in mind that all STDERR is being redirected to STDOUT
> – there should be no other output to be found.
> I’ve verified stdout just in case, to no avail.
> 
[...]
> COMMIT;
> COMMIT
> Time: 0.242 ms

Doesn't look like that could cause return code 1 ...

Just a random idea: are there any weird characters at
the end of your SQL script?
But that wouldn't explain why psql sometimes emits
return code 1 and sometimes not ...

Yours,
Laurenz Albe